Compare commits

..

827 Commits

Author SHA1 Message Date
Bagatur
63d30069e4 fmt 2024-03-31 14:49:43 -07:00
Bagatur
70c12f1343 fmt 2024-03-31 14:49:22 -07:00
Jamsheed Mistri
147dfc36f1 docs: Layerup Security docs 2024-03-29 22:44:56 -07:00
Jamsheed Mistri
2fcf736dbd feat: Layerup Security integration 2024-03-29 22:44:51 -07:00
LunarECL
b7d180a70d experimental[minor]: Create Closed Captioning Chain for .mp4 videos (#14059)
Description: Video imagery to text (Closed Captioning)
This pull request introduces the VideoCaptioningChain, a tool for
automated video captioning. It processes audio and video to generate
subtitles and closed captions, merging them into a single SRT output.

Issue: https://github.com/langchain-ai/langchain/issues/11770
Dependencies: opencv-python, ffmpeg-python, assemblyai, transformers,
pillow, torch, openai
Tag maintainer:
@baskaryan
@hwchase17


Hello!

We are a group of students from the University of Toronto
(@LunarECL, @TomSadan, @nicoledroi1, @A2113S) that want to make a
contribution to the LangChain community! We have ran make format, make
lint and make test locally before submitting the PR. To our knowledge,
our changes do not introduce any new errors.

Thank you for taking the time to review our PR!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-30 01:57:53 +00:00
Harrison Chase
56525f2ac1 dont mutate metadata/tags (#19742) 2024-03-29 17:55:27 -07:00
Kamal Zhang
368e35c3b1 community[patch]: introduce convert_to_secret() to bananadev llm (#14283)
- **Description:** Per #12165, this PR add to BananaLLM the function
convert_to_secret_str() during environment variable validation.
- **Issue:** #12165
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** @treewatcha75751

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-30 00:52:25 +00:00
DrKroll
c4da8d0813 langchain[patch]: load ReadFileTool (#14301)
---------

Co-authored-by: Dr. Simon Kroll <krolls@fida.de>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-30 00:46:24 +00:00
anshaneel
0884e5de7f community[minor]: Add Alpha Vantage API Tool (#14332)
### Description
This implementation adds functionality from the AlphaVantage API,
renowned for its comprehensive financial data. The class encapsulates
various methods, each dedicated to fetching specific types of financial
information from the API.

### Implemented Functions

- **`search_symbols`**: 
- Searches the AlphaVantage API for financial symbols using the provided
keywords.

- **`_get_market_news_sentiment`**: 
- Retrieves market news sentiment for a specified stock symbol from the
AlphaVantage API.

- **`_get_time_series_daily`**: 
- Fetches daily time series data for a specific symbol from the
AlphaVantage API.

- **`_get_quote_endpoint`**: 
- Obtains the latest price and volume information for a given symbol
from the AlphaVantage API.

- **`_get_time_series_weekly`**: 
- Gathers weekly time series data for a particular symbol from the
AlphaVantage API.

- **`_get_top_gainers_losers`**: 
- Provides details on top gainers, losers, and most actively traded
tickers in the US market from the AlphaVantage API.

  ### Issue: 
  - #11994 
  
### Dependencies: 
  - 'requests' library for HTTP requests. (import requests)
  - 'pytest' library for testing. (import pytest)

---------

Co-authored-by: Adam Badar <94140103+adam-badar@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-30 00:44:01 +00:00
Alex Sherstinsky
a9bc212bf2 community[minor]: fix failing Predibase integration (#19776)
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Langchain-Predibase integration was failing, because
it was not current with the Predibase SDK; in addition, Predibase
integration tests were instantiating the Langchain Community `Predibase`
class with one required argument (`model`) missing. This change updates
the Predibase SDK usage and fixes the integration tests.
    - **Twitter handle:** `@alexsherstinsky`


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-30 00:38:13 +00:00
ethynic
e9caa22d47 community[patch]: Update minimax.py (#14384)
MiniMaxChat class _generate method shoud return a ChatResult object not
str

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 23:57:06 +00:00
Ahmed Moubtahij
f5d4ce840f langchain[patch]: Simplify ensemble retriever (#14427)
- **Description:** code simplification to improve readability and remove
unnecessary memory allocations.
  - **Tag maintainer**: @baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 16:49:49 -07:00
Snehil Kumar
b36f4147b0 docs: Google Drive Loader always set the env var (#14791)
- **Description:** Code written by following, the official documentation
of [Google Drive
Loader](https://python.langchain.com/docs/integrations/document_loaders/google_drive),
gives errors. I have opened an issue regarding this. See #14725. This is
a pull request for modifying the documentation to use an approach that
makes the code work. Basically, the change is that we need to always set
the GOOGLE_APPLICATION_CREDENTIALS env var to an emtpy string, rather
than only in case of RefreshError. Also, rewrote 2 paragraphs to make
the instructions more clear.
- **Issue:** See this related [issue #
14725](https://github.com/langchain-ai/langchain/issues/14725)
  - **Dependencies:** NA
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** NA

Co-authored-by: Snehil <snehil@example.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 23:19:37 +00:00
M.Abdulrahman Alnaseer
ba54f1577f community[minor]: add support for llmsherpa (#19741)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: added support for llmsherpa library"

- [x] **Add tests and docs**: 
1. Integration test:
'docs/docs/integrations/document_loaders/test_llmsherpa.py'.
2. an example notebook:
`docs/docs/integrations/document_loaders/llmsherpa.ipynb`.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 16:04:57 -07:00
Naveenkhasyap
a99bd098ac docs: fix for #16702 and #16703 (#16705)
- **Description:** Quickstart Documentation updates for missing
dependency installation steps.
- **Issue:** the issue # it prompts users to install required
dependency.
  - **Dependencies:** no,
  - **Twitter handle:** @naveenkashyap_

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 15:57:51 -07:00
Brace Sproul
6d93a03bef docs[patch]: Fix or remove broken mdx links (#19777)
this pr also drops the community added action for checking broken links
in mdx. It does not work well for our use case, throwing errors for
local paths, plus the rest of the errors our in house solution had.
2024-03-29 15:25:08 -07:00
Bagatur
2f5606a318 mistralai[patch]: correct integration_test (#19774) 2024-03-29 21:47:35 +00:00
Pierre Véron
ace7b66261 mistralai[patch]: add missing _combine_llm_outputs implementation in ChatMistralAI (#18603)
# Description
Implementing `_combine_llm_outputs` to `ChatMistralAI` to override the
default implementation in `BaseChatModel` returning `{}`. The
implementation is inspired by the one in `ChatOpenAI` from package
`langchain-openai`.
# Issue
None
# Dependencies
None
# Twitter handle
None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 14:43:20 -07:00
lvliang-intel
0175906437 templates: add RAG template for Intel Xeon Scalable Processors (#18424)
**Description:**
This template utilizes Chroma and TGI (Text Generation Inference) to
execute RAG on the Intel Xeon Scalable Processors. It serves as a
demonstration for users, illustrating the deployment of the RAG service
on the Intel Xeon Scalable Processors and showcasing the resulting
performance enhancements.

**Issue:**
None

**Dependencies:**
The template contains the poetry project requirements to run this
template.
CPU TGI batching is WIP.

**Twitter handle:**
None

---------

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 14:37:32 -07:00
Nuno Campos
d4673a3507 openai[patch]: Update openai chat model to new base class interface (#19729) 2024-03-29 14:30:28 -07:00
harry-cohere
23fcc14650 cohere[patch]: support kwargs in with_structured_output (#19736)
**Description:** We'd like to support passing additional kwargs in
`with_structured_output`. I believe this is the accepted approach to
enable additional arguments on API calls.
2024-03-29 14:30:14 -07:00
Brace Sproul
ce0a588ae6 docs[minor]: Add chat model tabs to docs pages (#19589) 2024-03-29 14:23:55 -07:00
BeatrixCohere
bd02b83acd cohere[patch]: Allow overriding of the base URL in Cohere Client (#19766)
This PR adds the ability for a user to override the base API url for the
Cohere client for embeddings and chat llm.
2024-03-29 14:22:30 -07:00
Nisarg Trivedi
1252ccce6f text-splitters[minor]: Added Haskell support in langchain.text_splitter module (#16191)
- **Description:** Haskell language support added in text_splitter
module
  - **Dependencies:** No
  - **Twitter handle:** @nisargtr

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 20:17:50 +00:00
Hrvoje Milković
b7344e3347 community[minor]: Infobip tool integration (#16805)
**Description:** Adding Tool that wraps Infobip API for sending sms or
emails and email validation.
**Dependencies:** None,
**Twitter handle:** @hmilkovic

Implementation:
```
libs/community/langchain_community/utilities/infobip.py
```

Integration tests:
```
libs/community/tests/integration_tests/utilities/test_infobip.py
```

Example notebook:
```
docs/docs/integrations/tools/infobip.ipynb
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 19:01:27 +00:00
Luka Krapic
727a2ea9f1 community[patch]: history size support for DynamoDBChatMessageHistory (#16794)
**Description:** PR adds support for limiting number of messages
preserved in a session history for DynamoDBChatMessageHistory

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 18:56:21 +00:00
Dt22
6dbf1a2de0 community[patch]: fix redis input type for index_schema field (#16874)
### Subject: Fix Type Misdeclaration for index_schema in redis/base.py

I noticed a type misdeclaration for the index_schema column in the
redis/base.py file.

When following the instructions outlined in [Redis Custom Metadata
Indexing](https://python.langchain.com/docs/integrations/vectorstores/redis)
to create our own index_schema, it leads to a Pylance type error. <br/>
**The error message indicates that Dict[str, list[Dict[str, str]]] is
incompatible with the type Optional[Union[Dict[str, str], str,
os.PathLike]].**

```
index_schema = {
    "tag": [{"name": "credit_score"}],
    "text": [{"name": "user"}, {"name": "job"}],
    "numeric": [{"name": "age"}],
}

rds, keys = Redis.from_texts_return_keys(
    texts,
    embeddings,
    metadatas=metadata,
    redis_url="redis://localhost:6379",
    index_name="users_modified",
    index_schema=index_schema,  
)
```
Therefore, I have created this pull request to rectify the type
declaration problem.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 18:55:54 +00:00
morgana
074ad5095f community[patch]: mmr search for Rockset vectorstore integration (#16908)
- **Description:** Adding support for mmr search in the Rockset
vectorstore integration.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** `@_morgan_adams_`

---------

Co-authored-by: Rockset API Bot <admin@rockset.io>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 18:45:22 +00:00
shahrin014
f51e6a35ba community[patch]: OllamaEmbeddings - Pass headers to post request (#16880)
## Feature
- Set additional headers in constructor
- Headers will be sent in post request

This feature is useful if deploying Ollama on a cloud service such as
hugging face, which requires authentication tokens to be passed in the
request header.

## Tests
- Test if header is passed
- Test if header is not passed

Similar to https://github.com/langchain-ai/langchain/pull/15881

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 18:44:52 +00:00
Lance Martin
e0f137dbe0 docs: Agentic and Self-RAG w/ LangGraph (#16910)
To do:
[ ] Add streaming
[ ] Move to LangGraph
2024-03-29 11:11:35 -07:00
Jan Chorowski
b8b42ccbc5 community[minor]: Pathway vectorstore(#14859)
- **Description:** Integration with pathway.com data processing pipeline
acting as an always updated vectorstore
  - **Issue:** not applicable
- **Dependencies:** optional dependency on
[`pathway`](https://pypi.org/project/pathway/)
  - **Twitter handle:** pathway_com

The PR provides and integration with `pathway` to provide an easy to use
always updated vector store:

```python
import pathway as pw
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import PathwayVectorClient, PathwayVectorServer

data_sources = []
data_sources.append(
    pw.io.gdrive.read(object_id="17H4YpBOAKQzEJ93xmC2z170l0bP2npMy", service_user_credentials_file="credentials.json", with_metadata=True))

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
embeddings_model = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])
vector_server = PathwayVectorServer(
    *data_sources,
    embedder=embeddings_model,
    splitter=text_splitter,
)
vector_server.run_server(host="127.0.0.1", port="8765", threaded=True, with_cache=False)
client = PathwayVectorClient(
    host="127.0.0.1",
    port="8765",
)
query = "What is Pathway?"
docs = client.similarity_search(query)
```

The `PathwayVectorServer` builds a data processing pipeline which
continusly scans documents in a given source connector (google drive,
s3, ...) and builds a vector store. The `PathwayVectorClient` implements
LangChain's `VectorStore` interface and connects to the server to
retrieve documents.

---------

Co-authored-by: Mateusz Lewandowski <lewymati@users.noreply.github.com>
Co-authored-by: mlewandowski <mlewandowski@MacBook-Pro-mlewandowski.local>
Co-authored-by: Berke <berkecanrizai1@gmail.com>
Co-authored-by: Adrian Kosowski <adrian@pathway.com>
Co-authored-by: mlewandowski <mlewandowski@macbook-pro-mlewandowski.home>
Co-authored-by: berkecanrizai <63911408+berkecanrizai@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: mlewandowski <mlewandowski@MBPmlewandowski.ht.home>
Co-authored-by: Szymon Dudycz <szymond@pathway.com>
Co-authored-by: Szymon Dudycz <szymon.dudycz@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 10:50:39 -07:00
ccurme
0dbd5f5012 add script to check imports (#19611) 2024-03-29 13:30:20 -04:00
Arturs Konfino
2319212d54 community[patch]: avoid executing toolkit.get_context() when not necessary (#19762)
If `prompt` is passed into `create_sql_agent()`, then
`toolkit.get_context()` shouldn't be executed against the database
unless relevant prompt variables (`table_info` or `table_names`) are
present .
2024-03-29 16:42:21 +00:00
高璟琦
ec7a59c96c community[minor]: Add solar embedding (#19761)
Solar is a large language model developed by
[Upstage](https://upstage.ai/). It's a powerful and purpose-trained LLM.
You can visit the embedding service provided by Solar within this pr.

You may get **SOLAR_API_KEY** from
https://console.upstage.ai/services/embedding
You can refer to more details about accepted llm integration at
https://python.langchain.com/docs/integrations/llms/solar.
2024-03-29 09:36:05 -07:00
Tomaz Bratanic
dec00d3050 community[patch]: Add the ability to pass maps to neo4j retrieval query (#19758)
Makes it easier to flatten complex values to text, so you don't have to
use a lot of Cypher to do it.
2024-03-29 08:33:48 -07:00
Robby
f7e8a382cc community[minor]: add hugging face text-to-speech inference API (#18880)
Description: I implemented a tool to use Hugging Face text-to-speech
inference API.

Issue: n/a

Dependencies: n/a

Twitter handle: No Twitter, but do have
[LinkedIn](https://www.linkedin.com/in/robby-horvath/) lol.

---------

Co-authored-by: Robby <h0rv@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-29 15:02:29 +00:00
DasDingoCodes
73eb3f8fd9 community[minor]: Implement DirectoryLoader lazy_load function (#19537)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: Implement DirectoryLoader lazy_load
function"

- [x] **Description**: The `lazy_load` function of the `DirectoryLoader`
yields each document separately. If the given `loader_cls` of the
`DirectoryLoader` also implemented `lazy_load`, it will be used to yield
subdocuments of the file.

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access:
`libs/community/tests/unit_tests/document_loaders/test_directory_loader.py`
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory:
`docs/docs/integrations/document_loaders/directory.ipynb`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-29 14:46:52 +00:00
Christophe Bornet
6b2b511f68 core[minor]: Add aformat_messages to FewShotChatMessagePromptTemplate and ChatPromptTemplate (#19648)
Needed since the example selector may use a vector store.
2024-03-29 10:31:32 -04:00
Leonid Ganeline
5f814820f6 docs: providers pinecone fix (#19737)
Current providers page use link to the old package.
- Fixed installation instructions
- Added a reference to the Pinecone retriever
2024-03-29 08:30:30 -04:00
Bob Lin
53a74ad12b docs: use markdown cell instead of code block (#19740)
I found that the code of async and async batch was divided into two
blocks:

<img width="823" alt="Screenshot 2024-03-29 at 7 45 59 AM"
src="https://github.com/langchain-ai/langchain/assets/10000925/0fa59d29-a692-4309-afb8-2260f03242ec">


so I changed it to unified.
2024-03-29 08:27:48 -04:00
Ekaterina Aidova
4ce36af335 docs: fix link in openvino integration doc (#19749)
- **Description:** fix incorrect link in docs
 - **Dependencies:** None
2024-03-29 12:24:07 +00:00
Jialei
f7c903e24a community[minor]: add support for Moonshot llm and chat model (#17100) 2024-03-29 08:54:23 +00:00
Gustavo Isturiz
824dccf5e2 docs: fixed xml URL on sitemap docs exmaple, issue #17236 (#17304) 2024-03-29 01:36:54 -07:00
Ethan Yang
7164015135 community[minor]: Add Openvino embedding support (#19632)
This PR is used to support both HF and BGE embeddings with openvino

---------

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
2024-03-29 01:34:51 -07:00
Guangdong Liu
cd55d587c2 langchain[patch]: Upgrade openai's sdk and solve some interface adaptation problems. (#19548)
- **Issue:** close #19534
2024-03-29 01:25:17 -07:00
Kirushikesh DB
12861273e1 experimental[patch]: Removed 'SQLResults:' from the LLMResponse in SQLDatabaseChain (#17104)
**Description:** 
When using the SQLDatabaseChain with Llama2-70b LLM and, SQLite
database. I was getting `Warning: You can only execute one statement at
a time.`.

```
from langchain.sql_database import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain

sql_database_path = '/dccstor/mmdataretrieval/mm_dataset/swimming_record/rag_data/swimmingdataset.db'
sql_db = get_database(sql_database_path)
db_chain = SQLDatabaseChain.from_llm(mistral, sql_db, verbose=True, callbacks = [callback_obj])
db_chain.invoke({
    "query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
})
```
Error:
```
Warning                                   Traceback (most recent call last)
Cell In[31], line 3
      1 import langchain
      2 langchain.debug=False
----> 3 db_chain.invoke({
      4     "query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
      5 })

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:162, in Chain.invoke(self, input, config, **kwargs)
    160 except BaseException as e:
    161     run_manager.on_chain_error(e)
--> 162     raise e
    163 run_manager.on_chain_end(outputs)
    164 final_outputs: Dict[str, Any] = self.prep_outputs(
    165     inputs, outputs, return_only_outputs
    166 )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:156, in Chain.invoke(self, input, config, **kwargs)
    149 run_manager = callback_manager.on_chain_start(
    150     dumpd(self),
    151     inputs,
    152     name=run_name,
    153 )
    154 try:
    155     outputs = (
--> 156         self._call(inputs, run_manager=run_manager)
    157         if new_arg_supported
    158         else self._call(inputs)
    159     )
    160 except BaseException as e:
    161     run_manager.on_chain_error(e)

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:198, in SQLDatabaseChain._call(self, inputs, run_manager)
    194 except Exception as exc:
    195     # Append intermediate steps to exception, to aid in logging and later
    196     # improvement of few shot prompt seeds
    197     exc.intermediate_steps = intermediate_steps  # type: ignore
--> 198     raise exc

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:143, in SQLDatabaseChain._call(self, inputs, run_manager)
    139     intermediate_steps.append(
    140         sql_cmd
    141     )  # output: sql generation (no checker)
    142     intermediate_steps.append({"sql_cmd": sql_cmd})  # input: sql exec
--> 143     result = self.database.run(sql_cmd)
    144     intermediate_steps.append(str(result))  # output: sql exec
    145 else:

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:436, in SQLDatabase.run(self, command, fetch, include_columns)
    425 def run(
    426     self,
    427     command: str,
    428     fetch: Literal["all", "one"] = "all",
    429     include_columns: bool = False,
    430 ) -> str:
    431     """Execute a SQL command and return a string representing the results.
    432 
    433     If the statement returns rows, a string of the results is returned.
    434     If the statement returns no rows, an empty string is returned.
    435     """
--> 436     result = self._execute(command, fetch)
    438     res = [
    439         {
    440             column: truncate_word(value, length=self._max_string_length)
   (...)
    443         for r in result
    444     ]
    446     if not include_columns:

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:413, in SQLDatabase._execute(self, command, fetch)
    410     elif self.dialect == "postgresql":  # postgresql
    411         connection.exec_driver_sql("SET search_path TO %s", (self._schema,))
--> 413 cursor = connection.execute(text(command))
    414 if cursor.returns_rows:
    415     if fetch == "all":

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1416, in Connection.execute(self, statement, parameters, execution_options)
   1414     raise exc.ObjectNotExecutableError(statement) from err
   1415 else:
-> 1416     return meth(
   1417         self,
   1418         distilled_parameters,
   1419         execution_options or NO_OPTIONS,
   1420     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/sql/elements.py:516, in ClauseElement._execute_on_connection(self, connection, distilled_params, execution_options)
    514     if TYPE_CHECKING:
    515         assert isinstance(self, Executable)
--> 516     return connection._execute_clauseelement(
    517         self, distilled_params, execution_options
    518     )
    519 else:
    520     raise exc.ObjectNotExecutableError(self)

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1639, in Connection._execute_clauseelement(self, elem, distilled_parameters, execution_options)
   1627 compiled_cache: Optional[CompiledCacheType] = execution_options.get(
   1628     "compiled_cache", self.engine._compiled_cache
   1629 )
   1631 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
   1632     dialect=dialect,
   1633     compiled_cache=compiled_cache,
   (...)
   1637     linting=self.dialect.compiler_linting | compiler.WARN_LINTING,
   1638 )
-> 1639 ret = self._execute_context(
   1640     dialect,
   1641     dialect.execution_ctx_cls._init_compiled,
   1642     compiled_sql,
   1643     distilled_parameters,
   1644     execution_options,
   1645     compiled_sql,
   1646     distilled_parameters,
   1647     elem,
   1648     extracted_params,
   1649     cache_hit=cache_hit,
   1650 )
   1651 if has_events:
   1652     self.dispatch.after_execute(
   1653         self,
   1654         elem,
   (...)
   1658         ret,
   1659     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1848, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw)
   1843     return self._exec_insertmany_context(
   1844         dialect,
   1845         context,
   1846     )
   1847 else:
-> 1848     return self._exec_single_context(
   1849         dialect, context, statement, parameters
   1850     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1988, in Connection._exec_single_context(self, dialect, context, statement, parameters)
   1985     result = context._setup_result_proxy()
   1987 except BaseException as e:
-> 1988     self._handle_dbapi_exception(
   1989         e, str_statement, effective_parameters, cursor, context
   1990     )
   1992 return result

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:2346, in Connection._handle_dbapi_exception(self, e, statement, parameters, cursor, context, is_sub_exec)
   2344     else:
   2345         assert exc_info[1] is not None
-> 2346         raise exc_info[1].with_traceback(exc_info[2])
   2347 finally:
   2348     del self._reentrant_error

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1969, in Connection._exec_single_context(self, dialect, context, statement, parameters)
   1967                 break
   1968     if not evt_handled:
-> 1969         self.dialect.do_execute(
   1970             cursor, str_statement, effective_parameters, context
   1971         )
   1973 if self._has_events or self.engine._has_events:
   1974     self.dispatch.after_cursor_execute(
   1975         self,
   1976         cursor,
   (...)
   1980         context.executemany,
   1981     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/default.py:922, in DefaultDialect.do_execute(self, cursor, statement, parameters, context)
    921 def do_execute(self, cursor, statement, parameters, context=None):
--> 922     cursor.execute(statement, parameters)

Warning: You can only execute one statement at a time.
```
**Issue:** 
The Error occurs because when generating the SQLQuery, the llm_input
includes the stop character of "\nSQLResult:", so for this user query
the LLM generated response is **SELECT Time FROM men_butterfly_100m
WHERE Swimmer = 'Lance Larson';\nSQLResult:** it is required to remove
the SQLResult suffix on the llm response before executing it on the
database.

```
llm_inputs = {
            "input": input_text,
            "top_k": str(self.top_k),
            "dialect": self.database.dialect,
            "table_info": table_info,
            "stop": ["\nSQLResult:"],
        }

sql_cmd = self.llm_chain.predict(
                callbacks=_run_manager.get_child(),
                **llm_inputs,
            ).strip()

if SQL_RESULT in sql_cmd:
    sql_cmd = sql_cmd.split(SQL_RESULT)[0].strip()
result = self.database.run(sql_cmd)
```


<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 01:22:35 -07:00
T Cramer
540ebf35a9 community[patch]: Add explicit error message to Bedrock error output. (#17328)
- **Description:** Propagate Bedrock errors into Langchain explicitly.
Use-case: unset region error is hidden behind 'Could not load
credentials...' message
- **Issue:**
[17654](https://github.com/langchain-ai/langchain/issues/17654)
  - **Dependencies:** None

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 03:07:33 +00:00
Marcus Virginia
69bb96c80f community[patch]: surrealdb handle for empty metadata and allow collection names with complex characters (#17374)
- **Description:** Handle for empty metadata and allow collection names
with complex characters
  - **Issue:** #17057
  - **Dependencies:** `surrealdb`

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 01:04:27 +00:00
ale-delfino
0df76bee37 core[patch]:: XML parser to cover the case when the xml only contains the root level tag (#17456)
Description: Fix xml parser to handle strings that only contain the root
tag
Issue: N/A
Dependencies: None
Twitter handle: N/A

A valid xml text can contain only the root level tag. Example: <body>
  Some text here
</body>
The example above is a valid xml string. If parsed with the current
implementation the result is {"body": []}. This fix checks if the root
level text contains any non-whitespace character and if that's the case
it returns {root.tag: root.text}. The result is that the above text is
correctly parsed as {"body": "Some text here"}

@ale-delfino

Thank you for contributing to LangChain!

Checklist:

- [x] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
- [x] PR message: **Delete this entire template message** and replace it
with the following bulleted list
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @efriis, @eyurtsev, @hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-29 00:55:23 +00:00
kYLe
124ab79c23 community[minor]: Add Anyscale embedding support (#17605)
**Description:** Add embedding model support for Anyscale Endpoint
**Dependencies:** openai

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:53:53 +00:00
Lance Martin
12843f292f community[patch]: llama cpp embeddings reset default n_batch (#17594)
When testing Nomic embeddings --
```
from langchain_community.embeddings import LlamaCppEmbeddings
embd_model_path = "/Users/rlm/Desktop/Code/llama.cpp/models/nomic-embd/nomic-embed-text-v1.Q4_K_S.gguf"
embd_lc = LlamaCppEmbeddings(model_path=embd_model_path)
embedding_lc = embd_lc.embed_query(query)
```

We were seeing this error for strings > a certain size -- 
```
File ~/miniforge3/envs/llama2/lib/python3.9/site-packages/llama_cpp/llama.py:827, in Llama.embed(self, input, normalize, truncate, return_count)
    824     s_sizes = []
    826 # add to batch
--> 827 self._batch.add_sequence(tokens, len(s_sizes), False)
    828 t_batch += n_tokens
    829 s_sizes.append(n_tokens)

File ~/miniforge3/envs/llama2/lib/python3.9/site-packages/llama_cpp/_internals.py:542, in _LlamaBatch.add_sequence(self, batch, seq_id, logits_all)
    540 self.batch.token[j] = batch[i]
    541 self.batch.pos[j] = i
--> 542 self.batch.seq_id[j][0] = seq_id
    543 self.batch.n_seq_id[j] = 1
    544 self.batch.logits[j] = logits_all

ValueError: NULL pointer access
```

The default `n_batch` of llama-cpp-python's Llama is `512` but we were
explicitly setting it to `8`.
 
These need to be set to equal for embedding models. 
* The embedding.cpp example has an assertion to make sure these are
always equal.
* Apparently this is not being done properly in llama-cpp-python.

With `n_batch` set to 8, if more than 8 tokens are passed the batch runs
out of space and it crashes.

This also explains why the CPU compute buffer size was small:

raw client with default `n_batch=512`
```
llama_new_context_with_model:        CPU input buffer size   =     3.51 MiB
llama_new_context_with_model:        CPU compute buffer size =    21.00 MiB
```
langchain with `n_batch=8`
```
llama_new_context_with_model:        CPU input buffer size   =     0.04 MiB
llama_new_context_with_model:        CPU compute buffer size =     0.33 MiB
```

We can work around this by passing `n_batch=512`, but this will not be
obvious to some users:
```
    embedding = LlamaCppEmbeddings(model_path=embd_model_path,
                                   n_batch=512)
```

From discussion w/ @cebtenzzre. Related:

https://github.com/abetlen/llama-cpp-python/issues/1189

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:47:22 +00:00
Zijian Han
8e976545f3 community[patch]: support OpenAI whisper base url (#17695)
**Description:** The base URL for OpenAI is retrieved from the
environment variable "OPENAI_BASE_URL", whereas for langchain it is
obtained from "OPENAI_API_BASE". By adding `base_url =
os.environ.get("OPENAI_API_BASE")`, the OpenAI proxy can execute
correctly.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:35:27 +00:00
Paulo Nascimento
44a3484503 community[patch]: add NotebookLoader unit test (#17721)
Thank you for contributing to LangChain!

- **Description:** added unit tests for NotebookLoader. Linked PR:
https://github.com/langchain-ai/langchain/pull/17614
- **Issue:**
[#17614](https://github.com/langchain-ai/langchain/pull/17614)
    - **Twitter handle:** @paulodoestech
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: lachiewalker <lachiewalker1@hotmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:27:46 +00:00
Paulo Nascimento
4c3a67122f community[patch]: add Integration for OpenAI image gen with v1 sdk (#17771)
**Description:** Created a Langchain Tool for OpenAI DALLE Image
Generation.
**Issue:**
[#15901](https://github.com/langchain-ai/langchain/issues/15901)
**Dependencies:** n/a
**Twitter handle:** @paulodoestech

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:23:14 +00:00
Kaixin Yang
a8104ea8e9 openai[patch]: add checking codes for calling AI model get error (#17909)
**Description:**: adding checking codes for calling AI model get error
in chat_models/base.py and llms/base.py
**Issue**: Sometimes the AI Model calling will get error, we should
raise it.
Otherwise, the next code 'choices.extend(response["choices"])' will
throw a "TypeError: 'NoneType' object is not iterable" error to mask the
true error.
       Because 'response["choices"]' is None.
**Dependencies**: None

---------

Co-authored-by: yangkx <yangkx@asiainfo-int.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 00:17:32 +00:00
Vincent Chen
833d61adb3 docs: update Together README.md (#18004)
## PR message
**Description:** This PR adds a README file for the Together API in the
`libs/partners` folder of this repository. The README includes:
 - A brief description of the package
 - Installation instructions and class introductions
 - Simple usage examples

**Issue:** #17545 

This PR only contains document changes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:02:32 +00:00
Jiaming
3d3cc71287 community[patch]: fix bugs for bilibili Loader (#18036)
- **Description:** 
1. Fix the BiliBiliLoader that can receive cookie parameters, it
requires 3 other parameters to run. The change is backward compatible.
  2. Add test;      
  3. Add example in docs

- **Issue:** [#14213]

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 16:39:38 -07:00
Ethan Knights
1ef3fa0411 docs: improve readability of Langchain Expression Language get_started.ipynb (#18157)
**Description:** A few grammatical changes to improve readability of the
LCEL .ipynb and tidy some null characters.
**Issue:** N/A

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 23:38:30 +00:00
Sachin Paryani
25c9f3d1d1 community[patch]: Support Streaming in Azure Machine Learning (#18246)
- [x] **PR title**: "community: Support streaming in Azure ML and few
naming changes"

- [x] **PR message**:
- **Description:** Added support for streaming for azureml_endpoint.
Also, renamed and AzureMLEndpointApiType.realtime to
AzureMLEndpointApiType.dedicated. Also, added new classes
CustomOpenAIChatContentFormatter and CustomOpenAIContentFormatter and
updated the classes LlamaChatContentFormatter and LlamaContentFormatter
to now show a deprecated warning message when instantiated.

---------

Co-authored-by: Sachin Paryani <saparan@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 23:38:20 +00:00
xiaohuanshu
ecb11a4a32 langchain[patch]: fix BaseChatMemory get output data error with extra key (#18117)
**Description:** At times, BaseChatMemory._get_input_output may acquire
some extra keys such as 'intermediate_steps' (agent_executor with
return_intermediate_steps set to True) and 'messages'
(agent_executor.iter with memory). In these instances, _get_input_output
can raise an error due to the presence of multiple keys. The 'output'
field should be used as the default field in these cases.
**Issue:** #16791
2024-03-28 16:38:08 -07:00
Isaac Francisco
f5e84c8858 docs: fixing markdown for tips (#18199)
Previous markdown code was not working as intended, new code should add
green box around the tip so it is highlighted

Co-authored-by: Hershenson, Isaac (Extern) <isaac.hershenson.extern@bayer04.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 23:37:37 +00:00
Hayden Wolff
85deee521a docs: Nvidia Riva Runnables Documentation (#18237)
- **Description:** Documents how to use the Riva runnables to add
streamed automatic-speech-recognition (ASR) and text-to-speech (TTS) to
chains.
  - **Issue:** None
  - **Dependencies:** None
  - **Twitter handle:** @HaydenWolff1

---------

Co-authored-by: Hayden Wolff <hwolff@Haydens-Laptop.local>
Co-authored-by: Hayden Wolff <hwolff@MacBook-Pro.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 23:35:00 +00:00
Victor Adan
afa2d85405 community[patch]: Added missing from_documents method to KNNRetriever. (#18411)
- Description: Added missing `from_documents` method to `KNNRetriever`,
providing the ability to supply metadata to LangChain `Document`s, and
to give it parity to the other retrievers, which do have
`from_documents`.
- Issue: None
- Dependencies: None
- Twitter handle: None

Co-authored-by: Victor Adan <vadan@netroadshow.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 23:18:50 +00:00
Smit Parmar
dfc4177b50 community[patch]: mypy ignore fix (#18483)
Relates to #17048 
Description : Applied fix to dynamodb and elasticsearch file.

Error was : `Cannot override writeable attribute with read-only
property`
Suggestion:
instead of adding 
```
@messages.setter
def messages(self, messages: List[BaseMessage]) -> None:
    raise NotImplementedError("Use add_messages instead")
```

we can change base class property
`messages: List[BaseMessage]`
to
```
@property
def messages(self) -> List[BaseMessage]:...
```

then we don't need to add `@messages.setter` in all child classes.
2024-03-28 15:36:53 -07:00
aditya thomas
dc9e9a66db docs: update docstring of the ChatAnthropic and AnthropicLLM classes (#18649)
**Description:** Update docstring of the ChatAnthropic and AnthropicLLM
classes
**Issue:** Not applicable
**Dependencies:** None
2024-03-28 15:33:54 -07:00
Luca Dorigo
f19229c564 core[patch]: fix beta, deprecated typing (#18877)
**Description:** 

While not technically incorrect, the TypeVar used for the `@beta`
decorator prevented pyright (and thus most vscode users) from correctly
seeing the types of functions/classes decorated with `@beta`.

This is in part due to a small bug in pyright
(https://github.com/microsoft/pyright/issues/7448 ) - however, the
`Type` bound in the typevar `C = TypeVar("C", Type, Callable)` is not
doing anything - classes are `Callables` by default, so by my
understanding binding to `Type` does not actually provide any more
safety - the modified annotation still works correctly for both
functions, properties, and classes.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 22:33:43 +00:00
aditya thomas
263ee78886 core[runnables]: docstring for class RunnableSerializable, method configurable_fields (#19722)
**Description:** Update to the docstring for class RunnableSerializable,
method configurable_fields
**Issue:** [Add in code documentation to core Runnable methods
#18804](https://github.com/langchain-ai/langchain/issues/18804)
**Dependencies:** None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-28 18:15:18 -04:00
HuangZiy
e1f10a697e openai[patch]: perform judgment processing on chat model streaming delta (#18983)
**PR title:** partners: openai chat model
**PR message:** perform judgment processing on chat model streaming
delta
Closes #18977

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 14:46:27 -07:00
wulixuan
b7c8bc8268 community[patch]: fix yuan2 errors in LLMs (#19004)
1. fix yuan2 errors while invoke Yuan2.
2. update tests.
2024-03-28 14:37:44 -07:00
Bob Lin
aba4bd0d13 docs: Add async batch case (#19686) 2024-03-28 14:00:46 -07:00
aditya thomas
ec4dcfca7f core[runnables]: docstring of class RunnableSerializable, method configurable_alternatives (#19724)
**Description:** Update to the docstring for class RunnableSerializable,
method configurable_alternatives
**Issue:** [Add in code documentation to core Runnable methods
#18804](https://github.com/langchain-ai/langchain/issues/18804)
**Dependencies:** None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-28 17:00:08 -04:00
Davide Menini
824dbc49ee langchain[patch]: add template_tool_response arg to create_json_chat (#19696)
In this small PR I added the `template_tool_response` arg to the
`create_json_chat` function, so that users can customize this prompt in
case of need.
Thanks for your reviews!

---------

Co-authored-by: taamedag <Davide.Menini@swisscom.com>
2024-03-28 13:59:54 -07:00
高远
688ca48019 community[patch]: Adding validation when vector does not exist (#19698)
Adding validation when vector does not exist

Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
2024-03-28 13:58:23 -07:00
Erick Friis
f55b11fb73 infra: Revert run partner CI on core PRs (#19733)
Reverts parts of langchain-ai/langchain#19688
2024-03-28 20:45:59 +00:00
Alessandro Rossi
665f15bd48 docs: fix typos and make quickstart more readable (#19712)
Description: minor docs changes to make it more readable.
Issue: N/A
Dependencies: N/A
Twitter handle: _kubealex
2024-03-28 20:10:32 +00:00
standby24x7
36090c84f2 docs: Update function "run" to "invoke" in llm_symbolic_math.ipynb (#19713)
This patch updates multiple function "run" to "invoke" in
llm_symbolic_math.ipynb.

Without this patch, you see following message.
The function `run` was deprecated in LangChain 0.1.0
 and will be removed in 0.2.0. Use invoke instead.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-28 13:08:22 -07:00
Chaunte W. Lacewell
4a49fc5a95 community[patch]: Fix bug in vdms (#19728)
**Description:** Fix embedding check in vdms
**Contribution maintainer:** [@cwlacewe](https://github.com/cwlacewe)
2024-03-28 12:54:24 -07:00
高璟琦
75173d31db community[minor]: Add solar model chat model (#18556)
Add our solar chat models, available model choices:
* solar-1-mini-chat
* solar-1-mini-translate-enko
* solar-1-mini-translate-koen

More documents and pricing can be found at
https://console.upstage.ai/services/solar.

The references to our solar model can be found at
* https://arxiv.org/abs/2402.17032

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 12:31:11 -07:00
Erick Friis
e576d6c6b4 cohere[patch]: release 0.1.0rc1 (rc1-2 never released) (#19731) 2024-03-28 19:12:22 +00:00
harry-cohere
ea57050122 cohere: add with_structured_output to ChatCohere (#19730)
**Description:** Adds support for `with_structured_output` to Cohere,
which supports single function calling.

---------

Co-authored-by: BeatrixCohere <128378696+BeatrixCohere@users.noreply.github.com>
2024-03-28 12:09:25 -07:00
Guangdong Liu
0571f886d1 core[patch]: Fix jsonOutputParser fails if a json value contains ``` inside it. (#19717)
- **Issue:** fix #19646 
- @baskaryan, @eyurtsev PTAL
2024-03-28 12:01:09 -07:00
Davide Menini
f7042321f1 community[patch]: gather token usage info in BedrockChat during generation (#19127)
This PR allows to calculate token usage for prompts and completion
directly in the generation method of BedrockChat. The token usage
details are then returned together with the generations, so that other
downstream tasks can access them easily.

This allows to define a callback for tokens tracking and cost
calculation, similarly to what happens with OpenAI (see
[OpenAICallbackHandler](https://api.python.langchain.com/en/latest/_modules/langchain_community/callbacks/openai_info.html#OpenAICallbackHandler).
I plan on adding a BedrockCallbackHandler later.
Right now keeping track of tokens in the callback is already possible,
but it requires passing the llm, as done here:
https://how.wtf/how-to-count-amazon-bedrock-anthropic-tokens-with-langchain.html.
However, I find the approach of this PR cleaner.

Thanks for your reviews. FYI @baskaryan, @hwchase17

---------

Co-authored-by: taamedag <Davide.Menini@swisscom.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 18:58:46 +00:00
ligang-super
a662468dde community[patch]: Fix the error of Baidu Qianfan not passing the stop parameter (#18666)
- [x] **PR title**: "community: fix baidu qianfan missing stop
parameter"
- [x] **PR message**:
- **Description: Baidu Qianfan lost the stop parameter when requesting
service due to extracting it from kwargs. This bug can cause the agent
to receive incorrect results

---------

Co-authored-by: ligang33 <ligang33@baidu.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 18:21:49 +00:00
BeatrixCohere
d1a2e194c3 cohere[patch]: misc fixs tool use agent and cohere chat (#19705)
Bug fixes in this PR:
* allows for other params such as "message" not just the input param to
the prompt for the cohere tools agent
* fixes to documents kwarg from messages
* fixes to tool_calls API call

---------

Co-authored-by: Harry M <127103098+harry-cohere@users.noreply.github.com>
2024-03-28 10:19:38 -07:00
ccurme
b35e68c41f docs: update use_cases/question_answering/chat_history (#19349)
Update following https://github.com/langchain-ai/langchain/issues/19344
2024-03-28 12:51:01 -04:00
Erick Friis
8c2ed85a45 core[patch], infra: release 0.1.36, run partner CI on core PRs (#19688) 2024-03-28 08:55:10 -07:00
Erick Friis
5327bc9ec4 elasticsearch[patch]: move to repo (#19620) 2024-03-28 08:54:57 -07:00
Nilanjan De
239dd7c0c0 langchain[patch]: Use map() and avoid "ValueError: max() arg is an empty sequence" in MergerRetriever (#18679)
- **Issue:** When passing an empty list to MergerRetriever it fails with
error: ValueError: max() arg is an empty sequence

- **Description:** We have a use case where we dynamically select
retrievers and use MergerRetriever for merging the output of the
retrievers. We faced this issue when the retriever_docs list is empty.
Adding a default 0 for cases when retriever_docs is an empty list to
avoid "ValueError: max() arg is an empty sequence". Also, changed to use
map() which is more than twice as fast compared to the current
implementation.
```
import timeit
# Sample retriever_docs with varying lengths of sublists
retriever_docs = [[i for i in range(j)] for j in range(1, 1000)]
# First code snippet
code1 = '''
max_docs = max(len(docs) for docs in retriever_docs)
'''
# Second code snippet
code2 = '''
max_docs = max(map(len, retriever_docs), default=0)
'''
# Benchmarking
time1 = timeit.timeit(stmt=code1, globals=globals(), number=10000)
time2 = timeit.timeit(stmt=code2, globals=globals(), number=10000)
# Output
print(f"Execution time for code snippet 1: {time1} seconds")
print(f"Execution time for code snippet 2: {time2} seconds")
```

- **Dependencies:** none
2024-03-27 23:52:57 -07:00
aditya thomas
4cd38fe89f docs: update docstring of the ChatGroq class (#18645)
**Description:** Update docstring of the ChatGroq class
**Issue:** Not applicable
**Dependencies:** None
2024-03-27 23:46:52 -07:00
Jaid
e4d7b1a482 voyageai[patch]: top level reranker import (#19645)
The previous version didn't had  Voyage rerank in the init file


- [ ] **PR title**: langchain_voyageai reranker is not working
 


- [ ] **PR message**: 
    - **Description:** This fix let you run reranker from voyage
    - **Issue:** Was not able to run reranker from voyage
  






 @efriis
2024-03-28 06:37:55 +00:00
Xinwei Xiong
26eed70c11 infra: Optimize Makefile for Better Usability and Maintenance (#18859)
**Previous screenshots:**


![image](https://github.com/langchain-ai/langchain/assets/86140903/e2f326e3-4d97-4b22-aacb-e789a9d815e4)

**Current screenshot:**

![image](https://github.com/langchain-ai/langchain/assets/86140903/bd8a3ea7-1b8a-4803-9168-df45f6fa4893)
2024-03-27 23:37:39 -07:00
Juan Jose Miguel Ovalle Villamil
51baa1b5cf langchain[patch]: fix-cohere-reranker-rerank-method with cohere v5 (#19486)
#### Description
Fixed the following error with `rerank` method from `CohereRerank`:
```
---> [79](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:79) results = self.client.rerank(
     [80](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:80)     query, docs, model, top_n=top_n, max_chunks_per_doc=max_chunks_per_doc
     [81](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:81) )
     [82](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:82) result_dicts = []
     [83](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:83) for res in results.results:

TypeError: BaseCohere.rerank() takes 1 positional argument but 4 positional arguments (and 2 keyword-only arguments) were given
```
This was easily fixed going from this:
```
   def rerank(
        self,
        documents: Sequence[Union[str, Document, dict]],
        query: str,
        *,
        model: Optional[str] = None,
        top_n: Optional[int] = -1,
        max_chunks_per_doc: Optional[int] = None,
    ) -> List[Dict[str, Any]]:
         ...
        if len(documents) == 0:  # to avoid empty api call
            return []
        docs = [
            doc.page_content if isinstance(doc, Document) else doc for doc in documents
        ]
        model = model or self.model
        top_n = top_n if (top_n is None or top_n > 0) else self.top_n
        results = self.client.rerank(
            query, docs, model, top_n=top_n, max_chunks_per_doc=max_chunks_per_doc
        )
        result_dicts = []
        for res in results:
            result_dicts.append(
                {"index": res.index, "relevance_score": res.relevance_score}
            )
        return result_dicts
```
to this:
```
    def rerank(
        self,
        documents: Sequence[Union[str, Document, dict]],
        query: str,
        *,
        model: Optional[str] = None,
        top_n: Optional[int] = -1,
        max_chunks_per_doc: Optional[int] = None,
    ) -> List[Dict[str, Any]]:
         ...
        if len(documents) == 0:  # to avoid empty api call
            return []
        docs = [
            doc.page_content if isinstance(doc, Document) else doc for doc in documents
        ]
        model = model or self.model
        top_n = top_n if (top_n is None or top_n > 0) else self.top_n
        results = self.client.rerank(
            query=query, documents=docs, model=model, top_n=top_n, max_chunks_per_doc=max_chunks_per_doc <-------------
        )
        result_dicts = []
        for res in results.results:  <-------------
            result_dicts.append(
                {"index": res.index, "relevance_score": res.relevance_score}
            )
        return result_dicts
```
#### Unit & Integration tests
I added a unit test to check the behaviour of `rerank`. Also fixed the
original integration test which was failing.

#### Format & Linting
Everything worked properly with `make lint_diff`, `make format_diff` and
`make format`. However I noticed an error coming from other part of the
library when doing `make lint`:

```
(langchain-py3.9) ➜  langchain git:(master) make format
[ "." = "" ] || poetry run ruff format .
1636 files left unchanged
[ "." = "" ] || poetry run ruff --select I --fix .
(langchain-py3.9) ➜  langchain git:(master) make lint
./scripts/check_pydantic.sh .
./scripts/lint_imports.sh
poetry run ruff .
[ "." = "" ] || poetry run ruff format . --diff
1636 files already formatted
[ "." = "" ] || poetry run ruff --select I .
[ "." = "" ] || mkdir -p .mypy_cache && poetry run mypy . --cache-dir .mypy_cache
langchain/agents/openai_assistant/base.py:252: error: Argument "file_ids" to "create" of "Assistants" has incompatible type "Optional[Any]"; expected "Union[list[str], NotGiven]"  [arg-type]
langchain/agents/openai_assistant/base.py:374: error: Argument "file_ids" to "create" of "AsyncAssistants" has incompatible type "Optional[Any]"; expected "Union[list[str], NotGiven]"  [arg-type]
Found 2 errors in 1 file (checked 1634 source files)
make: *** [Makefile:65: lint] Error 1
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 06:32:03 +00:00
Shuqian
332996b4b2 openai[patch]: fix ChatOpenAI model's openai proxy (#19559)
Due to changes in the OpenAI SDK, the previous method of setting the
OpenAI proxy in ChatOpenAI no longer works. This PR fixes this issue,
making the previous way of setting the OpenAI proxy in ChatOpenAI
effective again.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 23:16:55 -07:00
Bagatur
b15c7fdde6 anthropic[patch]: fix response metadata type (#19683) 2024-03-27 23:16:26 -07:00
kaijietti
9c4b6dc979 community[patch]: fix bug in cohere that async for a coroutine in ChatCohere (#19381)
Without `await`, the `stream` returned from the `async_client` is
actually a coroutine, which could not be used in `async for`.
2024-03-27 21:34:46 -07:00
Christian Galo
1adaa3c662 community[minor]: Update Azure Cognitive Services to Azure AI Services (#19488)
This is a follow up to #18371. These are the changes:
- New **Azure AI Services** toolkit and tools to replace those of
**Azure Cognitive Services**.
- Updated documentation for Microsoft platform.
- The image analysis tool has been rewritten to use the new package
`azure-ai-vision-imageanalysis`, doing a proper replacement of
`azure-ai-vision`.

These changes:
- Update outdated naming from "Azure Cognitive Services" to "Azure AI
Services".
- Update documentation to use non-deprecated methods to create and use
agents.
- Removes need to depend on yanked python package (`azure-ai-vision`)

There is one new dependency that is needed as a replacement to
`azure-ai-vision`:
- `azure-ai-vision-imageanalysis`. This is optional and declared within
a function.

There is a new `azure_ai_services.ipynb` notebook showing usage; Changes
have been linted and formatted.

I am leaving the actions of adding deprecation notices and future
removal of Azure Cognitive Services up to the LangChain team, as I am
not sure what the current practice around this is.

---

If this PR makes it, my handle is  @galo@mastodon.social

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-03-28 03:19:02 +00:00
Shengsheng Huang
ac1dd8ad94 community[minor]: migrate bigdl-llm to ipex-llm (#19518)
- **Description**: `bigdl-llm` library has been renamed to
[`ipex-llm`](https://github.com/intel-analytics/ipex-llm). This PR
migrates the `bigdl-llm` integration to `ipex-llm` .
- **Issue**: N/A. The original PR of `bigdl-llm` is
https://github.com/langchain-ai/langchain/pull/17953
- **Dependencies**: `ipex-llm` library
- **Contribution maintainer**: @shane-huang

Updated doc:   docs/docs/integrations/llms/ipex_llm.ipynb
Updated test:
libs/community/tests/integration_tests/llms/test_ipex_llm.py
2024-03-27 20:12:59 -07:00
Chaunte W. Lacewell
a31f692f4e community[minor]: Add VDMS vectorstore (#19551)
- **Description:** Add support for Intel Lab's [Visual Data Management
System (VDMS)](https://github.com/IntelLabs/vdms) as a vector store
- **Dependencies:** `vdms` library which requires protobuf = "4.24.2".
There is a conflict with dashvector in `langchain` package but conflict
is resolved in `community`.
- **Contribution maintainer:** [@cwlacewe](https://github.com/cwlacewe)
- **Added tests:**
libs/community/tests/integration_tests/vectorstores/test_vdms.py
- **Added docs:** docs/docs/integrations/vectorstores/vdms.ipynb
- **Added cookbook:** cookbook/multi_modal_RAG_vdms.ipynb

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 03:12:11 +00:00
William FH
b7b62e29fb community[patch], mongodb[patch]: Stop spamming SIMD import warnings (#19531)
If you use an embedding dist function in an eval loop, you get warned
every time. Would prefer to just check once and forget about it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 03:11:02 +00:00
Tomaz Bratanic
b04e663426 experimental[patch]: Flatten relationships in LLM graph transformer (#19642) 2024-03-27 19:35:34 -07:00
billytrend-cohere
36abb5dd41 cohere[patch]: Fix positional argument (#19678)
cohere: Fix positional argument

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 02:26:08 +00:00
Nuno Campos
fdfb51ad8d core: Two updates to chat model interface (#19684)
- .stream() and .astream() call on_llm_new_token, removing the need for
subclasses to do so. Backwards compatible because now we don't pass
run_manager into ._stream and ._astream
- .generate() and .agenerate() now handle `stream: bool` kwarg for
_generate and _agenerate. Subclasses handle this arg by delegating to
._stream(), now one less thing they need to do. Backwards compat because
this is an optional arg that we now never pass to the subclasses
- .generate() and .agenerate() now inspect callback handlers to decide
on a default value for stream:bool if not passed in. This auto enables
streaming when using astream_events and astream_log
- as a result of these three changes any usage of .astream_events and
.astream_log should now yield chat model stream events
- In future PRs we can update all subclasses to reflect these two things
now handled by base class, but in meantime all will continue to work
2024-03-27 18:45:01 -07:00
harry-cohere
3685f8ceac cohere[patch]: Add cohere tools agent (#19602)
**Description**: Adds a cohere tools agent and related notebook.

---------

Co-authored-by: BeatrixCohere <128378696+BeatrixCohere@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-27 18:35:43 -07:00
William FH
5c41f4083e [Evals] Fix function calling support (#19658)
Current implementation is overzealous in validating chat datasets

Fixes
[#langsmith-sdk:557](https://github.com/langchain-ai/langsmith-sdk/issues/557)
2024-03-27 17:23:35 -07:00
yongheng.liu
7e29b6061f community[minor]: integrate China Mobile Ecloud vector search (#15298)
- **Description:** integrate China Mobile Ecloud vector search, 
  - **Dependencies:** elasticsearch==7.10.1

Co-authored-by: liuyongheng <liuyongheng@cmss.chinamobile.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 23:02:40 +00:00
Hyeongchan Kim
9b70131aed community[patch]: refactor the type hint of file_path in UnstructuredAPIFileLoader class (#18839)
* **Description**: add `None` type for `file_path` along with `str` and
`List[str]` types.
* `file_path`/`filename` arguments in `get_elements_from_api()` and
`partition()` can be `None`, however, there's no `None` type hint for
`file_path` in `UnstructuredAPIFileLoader` and `UnstructuredFileLoader`
currently.
* calling the function with `file_path=None` is no problem, but my IDE
annoys me lol.
* **Issue**: N/A
* **Dependencies**: N/A

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-27 22:31:54 +00:00
CaroFG
cf96060ab7 community[patch]: update for compatibility with latest Meilisearch version (#18970)
- **Description:** Updates Meilisearch vectorstore for compatibility
with v1.6 and above. Adds embedders settings and embedder_name which are
now required.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 22:08:27 +00:00
chyroc
be2adb1083 community[patch]: support unstructured_kwargs for s3 loader (#15473)
fix https://github.com/langchain-ai/langchain/issues/15472

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 22:03:48 +00:00
Bagatur
b901649032 docs: move extraction up (#19667) 2024-03-27 14:55:16 -07:00
Kahlil Wehmeyer
9c08cdea92 core[patch]: ToolException docs/exception message (#17590)
**Description:**
This PR adds a slightly more helpful message to a Tool Exception

```
# current state
langchain_core.tools.ToolException: Too many arguments to single-input tool

# proposed state
langchain_core.tools.ToolException: Too many arguments to single-input tool. Consider using a StructuredTool instead.
```
**Issue:** Somewhat discussed here 👉  #6197 
 **Dependencies:** None
**Twitter handle:** N/A

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-27 21:52:36 +00:00
Evgenii Zheltonozhskii
5b1f9c6d3a infra: Consistent lxml requirements (#19520)
Update the dependency for lxml to be consistent among different
packages; should fix
https://github.com/langchain-ai/langchain/issues/19040
2024-03-27 20:27:59 +00:00
Filip Michalsky
2fceec3771 docs: update cookbook example for SalesGPT - include Stripe Payment Link Generation (#19622)
Thank you for contributing to LangChain!

- [ ] **cookbook** - update example for SalesGPT - include Stripe
Payment Link Generation

- **Description:** We updated the Jupyter notebook example with the
ability of the AI Agent to negotiate with customers and then close the
deal by generating a custom Stripe payment link.
    - **Issue:** N/A
    - **Dependencies:** N/a
    - **Twitter handle:** @FilipMichalsky @0xtotaylor


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Filip Michalsky <filip_michalsky@g.harvard.edu>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 20:16:21 +00:00
Christophe Bornet
33fa8cfcd0 core[minor]: Add async methods to MaxMarginalRelevanceExampleSelector (#19639) 2024-03-27 16:03:18 -04:00
Taqi Jaffri
72c8b3127d cli[patch]: Fix typo in dev script name for the --chat-playground option on the cli (#19673)
Fixes typo

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2024-03-27 15:56:11 -04:00
Jan Nissen
2e0ddd6fb8 core[minor]: support pydantic v2 models in PydanticOutputParser (#18811)
As mentioned in #18322, the current PydanticOutputParser won't work for
anyone trying to parse to pydantic v2 models. This PR adds a separate
`PydanticV2OutputParser`, as well as a `langchain_core.pydantic_v2`
namespace that will fail on import to any projects using pydantic<2.
Happy to update the docs for output parsers if this is something we're
interesting in adding.

On a separate note, I also updated `check_pydantic.sh` to detect
pydantic imports with leading whitespace and excluded the internal
namespaces. That change can be separated into its own PR if needed.

---------

Co-authored-by: Jan Nissen <jan23@gmail.com>
2024-03-27 15:37:52 -04:00
Kangmoon Seo
d0accc3275 docs: fix error output in XMLOutputParser documentation (#19569)
- **Description:** I've made a fix to a ParseError call in the
XMLOutputParser documentation.
- **Issue:** None
- **Dependencies:** None

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-27 18:29:00 +00:00
Tomaz Bratanic
87d2a6b777 community[minor]: Add the option to omit schema refresh in Neo4jGraph (#19654) 2024-03-27 14:20:12 -04:00
Bagatur
5fc6531c74 docs: use first_tool_only instead of return_single (#19666) 2024-03-27 18:19:39 +00:00
jhicks2306
bcb8ab5216 docs: Improve docstring for Runnable bind method (#19659)
Added example to the docstring of the "bind" method of Runnable. This
makes it easier to understand the purpose of the method when reviewing
in code editors. E.g. VS Code below.

<img width="833" alt="Screenshot 2024-03-27 at 16 24 18"
src="https://github.com/langchain-ai/langchain/assets/45722942/ad022d4e-7bc0-4f4b-aa7a-838f1816cc52">

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-27 14:05:41 -04:00
ccurme
4e9b358ed8 docs: Fix broken imports in documentation (#19655)
Found via script in https://github.com/langchain-ai/langchain/pull/19611
2024-03-27 13:54:05 -04:00
Rajendra Kadam
0019d8a948 community[minor]: Add support for non-file-based Document Loaders in PebbloSafeLoader (#19574)
**Description:**
PebbloSafeLoader: Add support for non-file-based Document Loaders

This pull request enhances PebbloSafeLoader by introducing support for
several non-file-based Document Loaders. With this update,
PebbloSafeLoader now seamlessly integrates with the following loaders:
- GoogleDriveLoader
- SlackDirectoryLoader
- Unstructured EmailLoader

**Issue:** NA
**Dependencies:** - None
**Twitter handle:** @Raj__725

---------

Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-03-27 17:39:52 +00:00
Christophe Bornet
9954c6a38e langchain[minor]: Add async methods to EncoderBackedStore (#19597)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-27 17:36:36 +00:00
Erick Friis
929ed65554 cohere[patch]: release 0.1.0rc1 (#19663) 2024-03-27 17:14:56 +00:00
hulitaitai
dc2c9dd4d7 Update text2vec.py (#19657)
Add that URL of the embedding tool "text2vec".
Fix minor mistakes in the doc-string.
2024-03-27 13:13:30 -04:00
Erick Friis
7630e9529c Revert "community: added partners/package-name folders" (#19662)
Reverts langchain-ai/langchain#19290
2024-03-27 17:09:30 +00:00
Christophe Bornet
409c6eeb0b core: Add async methods to LengthBasedExampleSelector (#19640) 2024-03-27 13:05:58 -04:00
Bagatur
c7f1962f73 core[patch]: Release 0.1.35 (#19660) 2024-03-27 16:54:03 +00:00
Eugene Yurtsev
e8339b1d83 core[patch]: Patch XML vulnerability in XMLOutputParser (CVE-2024-1455) (#19653)
Patch potential XML vulnerability CVE-2024-1455

This patches a potential XML vulnerability in the XMLOutputParser in
langchain-core. The vulnerability in some situations could lead to a
denial of service attack.

At risk are users that:

1) Running older distributions of python that have older version of
libexpat
2) Are using XMLOutputParser with an agent
3) Accept inputs from untrusted sources with this agent (e.g., endpoint
on the web that allows an untrusted user to interact wiith the parser)
2024-03-27 12:41:52 -04:00
Guangdong Liu
7042934b5f community[patch]: Fix the bug that Chroma does not specify embedding_function (#19277)
- **Issue:** close #18291
- @baskaryan, @eyurtsev PTAL
2024-03-27 11:43:38 -04:00
billytrend-cohere
85f57ab4cd cohere[patch]: Fix cohere rerank (#19624)
Fix cohere rerank inspired by
https://github.com/langchain-ai/langchain/pull/19486
2024-03-27 08:41:53 -07:00
Eugene Yurtsev
8ab7bb3166 core[patch]: XMLOutputParser fix to handle changes to xml standard library (#19612)
Newest python micro releases broke streaming in the XMLOutputParser. This fixes the parsing code to work with trailing junk after the XML content.
2024-03-27 09:25:28 -04:00
yuwenzho
3a7d2cf443 community[minor]: Add ITREX optimized Embeddings (#18474)
Introduction
[Intel® Extension for
Transformers](https://github.com/intel/intel-extension-for-transformers)
is an innovative toolkit designed to accelerate GenAI/LLM everywhere
with the optimal performance of Transformer-based models on various
Intel platforms

Description

adding ITREX runtime embeddings using intel-extension-for-transformers.
added mdx documentation and example notebooks
added embedding import testing.

---------

Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 07:22:06 +00:00
Juan Jose Miguel Ovalle Villamil
1fe10a3e3d experimental[patch]: Enhance LLMGraphTransformer with async processing and improved readability (#19205)
- [x] **PR title**: "experimental: Enhance LLMGraphTransformer with
async processing and improved readability"


- [x] **PR message**: 
- **Description:** This pull request refactors the `process_response`
and `convert_to_graph_documents` methods in the LLMGraphTransformer
class to improve code readability and adds async versions of these
methods for concurrent processing.
    The main changes include:
- Simplifying list comprehensions and conditional logic in the
process_response method for better readability.
- Adding async versions aprocess_response and
aconvert_to_graph_documents to enable concurrent processing of
documents.
These enhancements aim to improve the overall efficiency and
maintainability of the `LLMGraphTransformer` class.
  - **Issue:** N/A
  - **Dependencies:** No additional dependencies required.
  - **Twitter handle:** @jjovalle99


- [x] **Add tests and docs**: N/A (This PR does not introduce a new
integration)


- [x] **Lint and test**: Ran make format, make lint, and make test from
the root of the modified package(s). All tests pass successfully.

Additional notes:

- The changes made in this PR are backwards compatible and do not
introduce any breaking changes.
- The PR touches only the `LLMGraphTransformer` class within the
experimental package.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 23:40:21 -07:00
Fabrizio Ruocco
f12cb0bea4 community[patch]: Microsoft Azure Document Intelligence updates (#16932)
- **Description:** Update Azure Document Intelligence implementation by
Microsoft team and RAG cookbook with Azure AI Search

---------

Co-authored-by: Lu Zhang (AI) <luzhan@microsoft.com>
Co-authored-by: Yateng Hong <yatengh@microsoft.com>
Co-authored-by: teethache <hongyateng2006@126.com>
Co-authored-by: Lu Zhang <44625949+luzhang06@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 23:36:59 -07:00
Guangdong Liu
cd79305eb9 openai[patch]: fix AzureChatOpenAI missing parameter problem (#19258)
- **Issue:** close #19255
- PTAL @baskaryan @eyurtsev
2024-03-26 22:31:36 -07:00
Leonid Ganeline
3a978a4bdc docs: output_parsers page fix (#19623)
Issue with this
[page](https://python.langchain.com/docs/modules/model_io/output_parsers/):
Table: "Input Type" columns: strings `str \| Message` (the escape char
"\" doesn't work inside backticked text).
2024-03-26 22:17:41 -07:00
Ethan Yang
28cd5522c2 docs: fix typo in openvino document (#19627) 2024-03-26 22:13:54 -07:00
xsai9101
1c27de6ce2 docs: Fix oracle doc loader format issue (#19628) 2024-03-26 22:13:36 -07:00
Timothy
ad77fa15ee community[patch]: Adding try-except block for GCSDirectoryLoader (#19591)
- **Description:** Implemented try-except block for
`GCSDirectoryLoader`. Reason: Users processing large number of
unstructured files in a folder may experience many different errors. A
try-exception block is added to capture these errors. A new argument
`use_try_except=True` is added to enable *silent failure* so that error
caused by processing one file does not break the whole function.
- **Issue:** N/A
- **Dependencies:** no new dependencies
- **Twitter handle:** timothywong731

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 00:12:24 +00:00
fzowl
aea2be5bf3 voyageai[patch]: VoyageAI rerank (#19521)
Adding VoyageAI reranking

---------

Co-authored-by: fodizoltan <zoltan@conway.expert>
Co-authored-by: Yujie Qian <thomasq0809@gmail.com>
2024-03-26 17:07:23 -07:00
Leonid Ganeline
4d85485e71 docs: PromptTemplate import from core (#19616)
Changed import of `PromptTemplate` from `langchain` to `langchain_core`
in all examples (notebooks)
2024-03-26 17:03:36 -07:00
Leonid Ganeline
3dc0f3c371 experimental[patch]: PromptTemplate import fix (#19617)
Changed import of `PromptTemplate` from `langchain` to `langchain_core`
in `langchain_experimental`
2024-03-26 17:03:13 -07:00
xsai9101
160a8eb178 community[minor]: add oracle autonomous database doc loader integration (#19536)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Adding oracle autonomous database document loader
integration. This will allow users to connect to oracle autonomous
database through connection string or TNS configuration.
    https://www.oracle.com/autonomous-database/
    - **Issue:** None
    - **Dependencies:** oracledb python package 
    https://pypi.org/project/oracledb/
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
  Unit test and doc are added.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 17:02:18 -07:00
Ethan Yang
5784dfed00 docs: update openvino documents (#19543)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 22:15:30 +00:00
Erick Friis
bf8ba00520 cli[patch]: release 0.0.22rc0, chat playground (#19614) 2024-03-26 15:08:56 -07:00
Leonid Ganeline
a3d24bc10b docs: release date fix (#19585)
Replaced the overdue release promise.
2024-03-26 14:51:09 -07:00
Raghav Rawat
b5640a0883 docs: Update apify.ipynb for Document class import (#19598)
- **Description:**
Update to correctly import Document class -
from langchain_core.documents import Document

- **Issue:**
Fixes the notebook and the hosted documentation
[here](https://python.langchain.com/docs/integrations/tools/apify)

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 21:46:29 +00:00
jhicks2306
087823aefa docs: Update docstring for MessagesPlaceholder (#19601)
Update to docstring for MessagesPlaceholder so that it shows helpful
information in code editors. E.g. VS Code as shown below.


<img width="587" alt="Screenshot 2024-03-26 at 17 18 58"
src="https://github.com/langchain-ai/langchain/assets/45722942/8f49d09f-ed8d-4f61-a9d4-3611dbe9c9c5">

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 14:34:00 -07:00
Christophe Bornet
7c2578bd55 langchain[patch]: Add async methods to EmbeddingRouterChain (#19603) 2024-03-26 14:33:36 -07:00
Christophe Bornet
b3d7b5a653 langchain[patch[: Add async methods to TimeWeightedVectorStoreRetriever (#19606) 2024-03-26 14:03:47 -07:00
Adam Law
aeb7b6b11d community[patch]: use semantic_configurations in AzureSearch (#19347)
- **Description:** Currently the semantic_configurations are not used
when creating an AzureSearch instance, instead creating a new one with
default values. This PR changes the behavior to use the passed
semantic_configurations if it is present, and the existing default
configuration if not.

---------

Co-authored-by: Adam Law <adamlaw@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 13:57:39 -07:00
Christophe Bornet
a7274f006e langchain[patch]: Add async methods to VectorstoreIndexCreator (#19582) 2024-03-26 13:57:13 -07:00
Bagatur
241774012a core[patch]: Release 0.1.34 (#19609)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-26 13:50:48 -07:00
Nuno Campos
c78eb55859 load: Optionally disable reading secrets from env (#19596)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-26 20:32:56 +00:00
Eugene Yurtsev
d3c9974da2 core[patch]: Temporarily disable test for streaming xml parser (#19610)
Test is failing due to micro version bump in python interpreter which
changed something about how std xml parser works
2024-03-26 20:24:20 +00:00
Eugene Yurtsev
8bc5cdccee core[patch]: Reverting changes with defusedXML (#19604)
DefusedXML is causing parsing errors on previously functional code with
the 0.7.x versions. These do not seem to support newer version of python
well. 0.8.x has only been released as rc, so we're not going to to use
it in the core package
2024-03-26 15:13:09 -04:00
Giannis
9ea2a9b0c1 cohere[patch]: Add additional kwargs support for Cohere SDK params (#19533)
* Adds support for `additional_kwargs` in `get_cohere_chat_request`
* This functionality passes in Cohere SDK specific parameters from
`BaseMessage` based classes to the API

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-26 18:30:37 +00:00
Adrian Valente
2763d8cbe5 community: add len() implementation to Chroma (#19419)
Thank you for contributing to LangChain!

- [x] **Add len() implementation to Chroma**: "package: community"


- [x] **PR message**: 
- **Description:** add an implementation of the __len__() method for the
Chroma vectostore, for convenience.
- **Issue:** no exposed method to know the size of a Chroma vectorstore
    - **Dependencies:** None
    - **Twitter handle:** lowrank_adrian


- [x] **Add tests and docs**

- [x] **Lint and test**

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 12:53:10 -04:00
Tom Aarsen
e0a1278d2b docs: HFEmbeddings: Add more information to model_kwargs/encode_kwargs (#19594)
- **Description:** Be more explicit with the `model_kwargs` and
`encode_kwargs` for `HuggingFaceEmbeddings`.
    - **Issue:** -
    - **Dependencies:** -

I received some reports by my users that they didn't realise that you
could change the default `batch_size` with `HuggingFaceEmbeddings`,
which may be attributed to how the `model_kwargs` and `encode_kwargs`
don't give much information about what you can specify.

I've added some parameter names & links to the Sentence Transformers
documentation to help clear it up. Let me know if you'd rather have
Markdown/Sphinx-style hyperlinks rather than a "bare URL".

- Tom Aarsen
2024-03-26 12:46:04 -04:00
Dobiichi-Origami
18e6f9376d community[Qianfan]: add function_call in additional_kwargs (#19550)
- **Description:** add lacked `function_call` field in
`additional_kwargs` in previous version
- **Dependencies:** None of new dependency
2024-03-26 12:20:19 -04:00
Eugene Yurtsev
9c7e860cf6 core[patch]: Remove anyio dependency (#19583)
The dependency isn't used anymore
2024-03-26 11:59:22 -04:00
mwmajewsk
f7a1fd91b8 community: better support of pathlib paths in document loaders (#18396)
So this arose from the
https://github.com/langchain-ai/langchain/pull/18397 problem of document
loaders not supporting `pathlib.Path`.

This pull request provides more uniform support for Path as an argument.
The core ideas for this upgrade: 
- if there is a local file path used as an argument, it should be
supported as `pathlib.Path`
- if there are some external calls that may or may not support Pathlib,
the argument is immidiately converted to `str`
- if there `self.file_path` is used in a way that it allows for it to
stay pathlib without conversion, is is only converted for the metadata.

Twitter handle: https://twitter.com/mwmajewsk
2024-03-26 11:51:52 -04:00
Guangdong Liu
94b869a974 github action: Add dead link check for .mdx files (#19492)
- **Description:** Add dead link check for .mdx files. I checked the
logs and found that files with .mdx suffix were not checked.

https://github.com/langchain-ai/langchain/actions/runs/8409525467/job/23026924465#logs
- @baskaryan, @efriis, @eyurtsev, @hwchase17.
2024-03-26 08:42:34 -07:00
Christophe Bornet
6f477e3cb6 docs: Remove chromadb from required dependency in examples with VectorstoreIndexCreator (#19578) 2024-03-26 11:12:21 -04:00
Yuki Watanabe
cfecbda48b community[minor]: Allow passing allow_dangerous_deserialization when loading LLM chain (#18894)
### Issue
Recently, the new `allow_dangerous_deserialization` flag was introduced
for preventing unsafe model deserialization that relies on pickle
without user's notice (#18696). Since then some LLMs like Databricks
requires passing in this flag with true to instantiate the model.

However, this breaks existing functionality to loading such LLMs within
a chain using `load_chain` method, because the underlying loader
function
[load_llm_from_config](f96dd57501/libs/langchain/langchain/chains/loading.py (L40))
 (and load_llm) ignores keyword arguments passed in. 

### Solution
This PR fixes this issue by propagating the
`allow_dangerous_deserialization` argument to the class loader iff the
LLM class has that field.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 11:07:55 -04:00
hulitaitai
d7c14cb6f9 community[minor]: Add embeddings integration for text2vec (#19267)
Create a Class which allows to use the "text2vec" open source embedding
model.

It should install the model by running 'pip install -U text2vec'.
Example to call the model through LangChain:

from langchain_community.embeddings.text2vec import Text2vecEmbeddings

            embedding = Text2vecEmbeddings()
            bookend.embed_documents([
                "This is a CoSENT(Cosine Sentence) model.",
"It maps sentences to a 768 dimensional dense vector space.",
            ])
            bookend.embed_query(
                "It can be used for text matching or semantic search."
            )

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-26 11:06:58 -04:00
Shotaro Sano
55c624a694 infra: Resolve the endless dependency resolution during the build of dev.Dockerfile by copying poetry.lock (#19465)
## Description
This PR proposes a modification to the `libs/langchain/dev.Dockerfile`
configuration to copy the `libs/langchain/poetry.lock` into the working
directory. The change aims to address the issue where the Poetry install
command, the last command in the `dev.Dockerfile`, takes excessively
long hours, and to ensure the reproducibility of the poetry environment
in the devcontainer.

## Problem
The `dev.Dockerfile`, prepared for development environments such as
`.devcontainer`, encounters an unending dependency resolution when
attempting the Poetry installation.

### Steps to Reproduce
Execute the following build command: 

```bash
docker build -f libs/langchain/dev.Dockerfile .
```

### Current Behavior
The Docker build process gets stuck at the following step, which, in my
experience, did not conclude even after an entire night:

```
 => [langchain-dev-dependencies 4/6] COPY libs/community/ ../community/                                                                                0.9s
 => [langchain-dev-dependencies 5/6] COPY libs/text-splitters/ ../text-splitters/                                                                      0.0s
 => [langchain-dev-dependencies 6/6] RUN poetry install --no-interaction --no-ansi --with dev,test,docs                                               12.3s
 => => # Updating dependencies                                                                                                                             
 => => # Resolving dependencies...  
```

### Expected Behavior
The Docker build completes in a realistic timeframe. By applying this
PR, the build finishes within a few minutes.

### Analysis
The complexity of LangChain's dependencies has reached a point where
Poetry is required to resolve dependencies akin to threading a needle.
Consequently, poetry install fails to complete in a practical timeframe.

## Solution
The solution for dependency resolution is already recorded in
`libs/langchain/poetry.lock`, so we can use it. When copying
`project.toml` and `poetry.toml`, the `poetry.lock` located in the same
directory should also be copied.

```diff
# Copy only the dependency files for installation
-COPY libs/langchain/pyproject.toml libs/langchain/poetry.toml ./
+COPY libs/langchain/pyproject.toml libs/langchain/poetry.toml libs/langchain/poetry.lock ./
```

## Note
I am not intimately familiar with the historical context of the
`dev.Dockerfile` and thus do not know why `poetry.lock` has not been
copied until now. It might have been an oversight, or perhaps dependency
resolution used to complete quickly even without the `poetry.lock` file
in the past. However, if there are deliberate reasons why copying
`poetry.lock` is not advisable, please just close this PR.
2024-03-26 10:54:53 -04:00
Kalyan Mudumby
d27600c6f7 community[patch]: GPTCache pydantic validation error on lookup (#19427)
Description:
this change fixes the pydantic validation error when looking up from
GPTCache, the `ChatOpenAI` class returns `ChatGeneration` as response
which is not handled.
use the existing `_loads_generations` and `_dumps_generations` functions
to handle it

Trace
```
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/development/scripts/chatbot-postgres-test.py", line 90, in <module>
    print(llm.invoke("tell me a joke"))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 166, in invoke
    self.generate_prompt(
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 544, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 408, in generate
    raise e
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 398, in generate
    self._generate_with_cache(
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 585, in _generate_with_cache
    cache_val = llm_cache.lookup(prompt, llm_string)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_community/cache.py", line 807, in lookup
    return [
           ^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_community/cache.py", line 808, in <listcomp>
    Generation(**generation_dict) for generation_dict in json.loads(res)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/load/serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for Generation
type
  unexpected value; permitted: 'Generation' (type=value_error.const; given=ChatGeneration; permitted=('Generation',))
```


Although I don't seem to find any issues here, here's an
[issue](https://github.com/zilliztech/GPTCache/issues/585) raised in
GPTCache. Please let me know if I need to do anything else

Thank you

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 10:52:30 -04:00
Leonid Ganeline
4159a4723c experimental[patch]: update module doc strings (#19539)
Added missed module descriptions. Fixed format.
2024-03-26 10:38:10 -04:00
Piyush Jain
72ba738bf5 community[minor]: Improvements for NeptuneRdfGraph, Improve discovery of graph schema using database statistics (#19546)
Fixes linting for PR
[19244](https://github.com/langchain-ai/langchain/pull/19244)

---------

Co-authored-by: mhavey <mchavey@gmail.com>
2024-03-26 10:36:51 -04:00
aditya thomas
fc6b92bb9a docs: add cohere to the list of partners (#19552)
**Description:** Add Cohere to the list of LangChain partners
**Issue:** The Cohere partner package was recently added
[#19049](https://github.com/langchain-ai/langchain/pull/19049)
**Dependencies:** None
2024-03-26 10:22:03 -04:00
Christophe Bornet
1f422318b7 core[minor]: Use BaseChatMessageHistory async methods in RunnableWithMessageHistory (#19565)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-26 14:13:58 +00:00
Christophe Bornet
8595c3ab59 community[minor]: Add InMemoryVectorStore to module level imports (#19576) 2024-03-26 14:07:44 +00:00
Christophe Bornet
a9457d269e core: Add async methods to BaseExampleSelector and SemanticSimilarityExampleSelector (#19399)
Few-Shot prompt template may use a `SemanticSimilarityExampleSelector`
that in turn uses a `VectorStore` that does I/O operations.
So to work correctly on the event loop, we need:
* async methods for the `VectorStore` (OK)
* async methods for the `SemanticSimilarityExampleSelector` (this PR)
* async methods for `BasePromptTemplate` and `BaseChatPromptTemplate`
(future work)
2024-03-26 10:06:43 -04:00
Christophe Bornet
29c58528c7 core[minor]: Add default implementations to amax_marginal_relevance_search_by_vector and adelete (#19269) 2024-03-26 10:03:22 -04:00
Christophe Bornet
999365186b langchain[major]: Use InMemoryVectorStore by default in VectorstoreIndexCreator (#19575)
This is a small breaking change but I think it should be done as:
* No external dependency needs to be installed anymore for the default
to work
* It is vendor-neutral
2024-03-26 10:01:23 -04:00
standby24x7
16e64d889a docs: Update function "run" to "invoke" in fake_llm.ipynb (#19570)
This patch updates function "run" to "invoke" in fake_llm.ipynb. Without
this patch, you see following warning.

LangChainDeprecationWarning: The function `run` was deprecated in
LangChain 0.1.0 and will be removed in 0.2.0. Use invoke instead.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-26 09:54:31 -04:00
Guangdong Liu
c93d4ea91c docs: Add in code documentation to core Runnable map methods (docs only) (#19517)
- **Issue:** #18804
- @baskaryan, @eyurtsev
2024-03-25 19:18:30 -07:00
Leonid Ganeline
0199b73188 docs: added partners/package-name folders (#19290)
Added references to new integration packages from Google, by adding
subfolders to `partners/`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 02:16:59 +00:00
Aayush Kataria
03c38005cb community[patch]: Fixing some caching issues for AzureCosmosDBSemanticCache (#18884)
Fixing some issues for AzureCosmosDBSemanticCache
- Added the entry for "AzureCosmosDBSemanticCache" which was missing in
langchain/cache.py
- Added application name when creating the MongoClient for the
AzureCosmosDBVectorSearch, for tracking purposes.

@baskaryan, can you please review this PR, we need this to go in asap.
These are just small fixes which we found today in our testing.
2024-03-25 19:06:17 -07:00
Clément Tamines
a6cbb755a7 community[patch]: fix semantic answer bug in AzureSearch vector store (#18938)
- **Description:** The `semantic_hybrid_search_with_score_and_rerank`
method of `AzureSearch` contains a hardcoded field name "metadata" for
the document metadata in the Azure AI Search Index. Adding such a field
is optional when creating an Azure AI Search Index, as other snippets
from `AzureSearch` test for the existence of this field before trying to
access it. Furthermore, the metadata field name shouldn't be hardcoded
as "metadata" and use the `FIELDS_METADATA` variable that defines this
field name instead. In the current implementation, any index without a
metadata field named "metadata" will yield an error if a semantic answer
is returned by the search in
`semantic_hybrid_search_with_score_and_rerank`.

- **Issue:** https://github.com/langchain-ai/langchain/issues/18731

- **Prior fix to this bug:** This bug was fixed in this PR
https://github.com/langchain-ai/langchain/pull/15642 by adding a check
for the existence of the metadata field named `FIELDS_METADATA` and
retrieving a value for the key called "key" in that metadata if it
exists. If the field named `FIELDS_METADATA` was not present, an empty
string was returned. This fix was removed in this PR
https://github.com/langchain-ai/langchain/pull/15659 (see
ed1ffca911#).
@lz-chen: could you confirm this wasn't intentional? 

- **New fix to this bug:** I believe there was an oversight in the logic
of the fix from
[#1564](https://github.com/langchain-ai/langchain/pull/15642) which I
explain below.
The `semantic_hybrid_search_with_score_and_rerank` method creates a
dictionary `semantic_answers_dict` with semantic answers returned by the
search as follows.

5c2f7e6b2b/libs/community/langchain_community/vectorstores/azuresearch.py (L574-L581)
The keys in this dictionary are the unique document ids in the index, if
I understand the [documentation of semantic
answers](https://learn.microsoft.com/en-us/azure/search/semantic-answers)
in Azure AI Search correctly. When the method transforms a search result
into a `Document` object, an "answer" key is added to the document's
metadata. The value for this "answer" key should be the semantic answer
returned by the search from this document, if such an answer is
returned. The match between a `Document` object and the semantic answers
returned by the search should be done through the unique document id,
which is used as a key for the `semantic_answers_dict` dictionary. This
id is defined in the search result's field named `FIELDS_ID`. I added a
check to avoid any error in case no field named `FIELDS_ID` exists in a
search result (which shouldn't happen in theory).
A benefit of this approach is that this fix should work whether or not
the Azure AI Search Index contains a metadata field.

@levalencia could you confirm my analysis and test the fix?
@raunakshrivastava7 do you agree with the fix?

Thanks for the help!
2024-03-25 18:51:54 -07:00
miri-bar
55db737302 ai21[minor]: AI21 Labs Semantic Text Splitter support (#19510)
Description: Added support for AI21 Labs model - Segmentation, as a Text
Splitter
Dependencies: ai21, langchain-text-splitter
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 01:39:37 +00:00
Anindyadeep
b2a11ce686 community[minor]: Prem AI langchain integration (#19113)
### Prem SDK integration in LangChain

This PR adds the integration with [PremAI's](https://www.premai.io/)
prem-sdk with langchain. User can now access to deployed models
(llms/embeddings) and use it with langchain's ecosystem. This PR adds
the following:

### This PR adds the following:

- [x]  Add chat support
- [X]  Adding embedding support
- [X]  writing integration tests
    - [X]  writing tests for chat 
    - [X]  writing tests for embedding
- [X]  writing unit tests
    - [X]  writing tests for chat 
    - [X]  writing tests for embedding
- [X]  Adding documentation
    - [X]  writing documentation for chat
    - [X]  writing documentation for embedding
- [X] run `make test`
- [X] run `make lint`, `make lint_diff` 
- [X]  Final checks (spell check, lint, format and overall testing)

---------

Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 01:37:19 +00:00
Alessandro D'Armiento
37eb3a4a9e docs: Some import nits (#19130)
- **Description:** fixes some minor issues in the documentation

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 01:25:44 +00:00
Souhail Hanfi
cbec43afa9 community[patch]: avoid creating extension PGvector while using readOnly Databases (#19268)
- **Description:** PgVector class always runs "create extension" on init
and this statement crashes on ReadOnly databases (read only replicas).
but wierdly the next create collection etc work even in readOnly
databases
- **Dependencies:** no new dependencies
- **Twitter handle:** @VenOmaX666

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 01:25:01 +00:00
Dixing (Dex) Xu
903541f439 docs: update dependecy for autogpt/marathon.ipynb (#19491)
fixes the import error from notebook based on the
[documentation](https://api.python.langchain.com/en/latest/agents/langchain_experimental.agents.agent_toolkits.pandas.base.create_pandas_dataframe_agent.html)

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 18:14:22 -07:00
Mauricio Cruz
fb9ce95184 cli[patch]: Fix Tuple typing problem when create new langchain app (#19141)
Thank you for contributing to LangChain!

When run command langchain app new my-app, i get this error:

File
"/home/mauricio/.local/lib/python3.8/site-packages/langchain_cli/utils/pyproject.py",
line 15, in <module>
pyproject_toml: Path, local_editable_dependencies: Iterable[tuple[str,
Path]]
TypeError: 'type' object is not subscriptable

This PR fix the error.
2024-03-26 01:09:51 +00:00
Anthony Shaw
6c9b0f96f3 docs: Add guidance for splitting Chinese, Japanese, and Thai (#19295)
The existing default list of separators for the `RecursiveTextSplitter`
assumes spaces are word boundaries. Some languages [don't use spaces
between
words](https://en.wikipedia.org/wiki/Category:Writing_systems_without_word_boundaries)
(Chinese, Japanese, Thai, Burmese).

This PR extends the documentation to explain how to cater for those
languages by adding additional punctuation to the separators and
zero-width spaces which are used by some typesetters and will assist the
splitter to not split in words.

Ideally, **these separators could be a constant in the module** but for
now, defining them in the documentation is a start.
2024-03-26 00:34:00 +00:00
Erick Friis
441a8012b3 mistralai[patch]: release 0.1.0 (#19540) 2024-03-25 17:29:40 -07:00
Barun Amalkumar Halder
9246ec6b36 community[patch] : [Fiddler] ensure dataset is not added if model is present (#19293)
**Description:**
- minor PR to speed up onboarding by not trying to add a dataset, if a
model is already present.
- replace batch publish API with streaming when single events are
published.

**Dependencies:** any dependencies required for this change
**Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-03-25 17:28:05 -07:00
JSDu
6e090280fd community[patch]: milvus will autoflush, manual flush is slowly (#19300)
reference:


https://milvus.io/docs/configure_quota_limits.md#quotaAndLimitsflushRateenabled

https://github.com/milvus-io/milvus/issues/31407

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 00:26:58 +00:00
mackong
e65dc4b95b community[patch]: clean warning when delete by ids (#19301)
* Description: rearrange to avoid variable overwrite, which cause
warning always.
* Issue: N/A
* Dependencies: N/A
2024-03-25 17:23:22 -07:00
Ian
d5415dbd68 docs: improve tidb integrations documents (#19321)
This PR aims to enhance the documentation for TiDB integration, driven
by feedback from our users. It provides detailed introductions to key
features, ensuring developers can fully leverage TiDB for AI application
development.
2024-03-25 17:08:23 -07:00
Stefano Mosconi
01fc69c191 community[patch]: expanding version in confluence loader (#19324)
**Description:**
Expanding version in all the Confluence API calls so to get when the
page was last modified/created in all cases.

**Issue:** #12812 
**Twitter handle:** zzste
2024-03-25 17:08:01 -07:00
Dmitry Tyumentsev
08b769d539 community[patch]: YandexGPT Use recent yandexcloud sdk version (#19341)
Fixed inability to work with [yandexcloud
SDK](https://pypi.org/project/yandexcloud/) version higher 0.265.0
2024-03-25 17:05:57 -07:00
Marlene
f1313339ac community[patch]: Fixing incorrect base URLs for Azure Cognitive Search Retriever (#19352)
This PR adds code to make sure that the correct base URL is being
created for the Azure Cognitive Search retriever. At the moment an
incorrect base URL is being generated. I think this is happening because
the original code was based on a depreciated API version. No
dependencies need to be added. I've also added more context to the test
doc strings.

I should also note that ACS is now Azure AI Search. I will open a
separate PR to make these changes as that would be a breaking change and
should potentially be discussed.

Twitter: @marlene_zw



- No new tests added, however the current ACS retriever tests are now
passing when I run them.
- Code was linted.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 00:04:59 +00:00
Tridib Roy Arjo
d667b1ea8f docs: Update async_chromium.ipynb (#19514)
In Jupyter, asyncio would throw an error before `.load()` unless
`nest_asyncio` is applied (Issue #8494 mentioned this)

+Minor typo fixes..
2024-03-26 00:02:50 +00:00
Bob Lin
5b6b1f9e1d docs: Fix several sample code errors (#19382) 2024-03-25 16:59:52 -07:00
FinTech秋田
03ba1d4731 community[patch]: Add Support for GPU Index Types in Milvus 2.4 (#19468)
- **Description:** This commit introduces support for the newly
available GPU index types introduced in Milvus 2.4 within the LangChain
project's `milvus.py`. With the release of Milvus 2.4, a range of
GPU-accelerated index types have been added, offering enhanced search
capabilities and performance optimizations for vector search operations.
This update ensures LangChain users can fully utilize the new
performance benefits for vector search operations.
    - Reference: https://milvus.io/docs/gpu_index.md

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 23:39:54 +00:00
Hamid Ali
c281ec8887 docs: Fix broken link in semantic-chunker.ipynb (#19464)
Corrected a broken link within the semantic-chunker.ipynb notebook,
ensuring that users can access the referenced resource.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 23:39:32 +00:00
Ash Vardanian
d01bad5169 core[patch]: Convert SimSIMD back to NumPy (#19473)
This patch fixes the #18022 issue, converting the SimSIMD internal
zero-copy outputs to NumPy.

I've also noticed, that oftentimes `dtype=np.float32` conversion is used
before passing to SimSIMD. Which numeric types do LangChain users
generally care about? We support `float64`, `float32`, `float16`, and
`int8` for cosine distances and `float16` seems reasonable for
practically any kind of embeddings and any modern piece of hardware, so
we can change that part as well 🤗
2024-03-25 16:36:26 -07:00
Ikko Eltociear Ashimine
980658cb47 docs: Update streaming.ipynb (#19500)
Fixed typo.

occuring -> occurring
2024-03-25 16:21:45 -07:00
Leonid Kuligin
91f4c80143 docs: fixed links (#19503)
- [ ] **PR title**: "docs: fixed broken links"


- [ ] **PR message**:
    - **Description:** fixed links in the documentation
2024-03-25 16:19:28 -07:00
Mikelarg
dac2e0165a community[minor]: Added GigaChat Embeddings support + updated previous GigaChat integration (#19516)
- **Description:** Added integration with
[GigaChat](https://developers.sber.ru/portal/products/gigachat)
embeddings. Also added support for extra fields in GigaChat LLM and
fixed docs.
2024-03-25 16:08:37 -07:00
Martin Kolb
e5bdb26f76 community[patch]: More flexible handling for entity names in vector store "HANA Cloud" (#19523)
- **Description:** Added support for lower-case and mixed-case names
The names for tables and columns previouly had to be UPPER_CASE.
With this enhancement, also lower_case and MixedCase are supported,


  - **Issue:** N/A
  - **Dependencies:** no new dependecies added
  - **Twitter handle:** @sapopensource
2024-03-25 15:52:45 -07:00
Erica Clark
a1ff21f90f docs: Update local llms article to use invoke instead of deprecated __call__ (#19528)
- **Description:** Since the implicit `__call__` has been deprecated in
favor of `invoke`, the local_llms article also needed to be updated.
This article was my introduction to Lanchain, and as it was helpful in
getting me setup with running LLMs locally, it is nice to not have any
warnings when running the example code. With this change, the warnings
go away when running the example code.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** clarkerican
2024-03-25 15:51:39 -07:00
Orest Xherija
0b1e09029f openai[patch]: increase max batch size for Azure OpenAI Embeddings API (#19532)
**Description:** Azure OpenAI has increased its maximum batch size from
16 to 2048 for the Embeddings API per this How-To
[page](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/embeddings?tabs=console#best-practices)

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 15:50:07 -07:00
Eugene Yurtsev
56f4c5459b core[patch]: fix xml output parser transform (#19530)
Previous PR passed _parser attribute which apparently is not meant to be
used by user code and causes non deterministic failures on CI when
testing the transform and a transform methods. Reverting this change
temporarily.
2024-03-25 21:34:45 +00:00
Erick Friis
e6952b04d5 cohere[patch]: fix release (#19529) 2024-03-25 13:46:29 -07:00
aditya thomas
aa68fd7e91 core[runnables]: docstring for class runnable, method with_listeners() (#19515)
**Description:** Docstring for method with_listerners() of class
Runnable
**Issue:** [Add in code documentation to core Runnable methods
#18804](https://github.com/langchain-ai/langchain/issues/18804)
**Dependencies:** None
2024-03-25 16:24:58 -04:00
billytrend-cohere
63343b4987 cohere[patch]: add cohere as a partner package (#19049)
Description: adds support for langchain_cohere

---------

Co-authored-by: Harry M <127103098+harry-cohere@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-25 20:23:47 +00:00
Eugene Yurtsev
727d5023ce core[patch]: Use defusedxml in XMLOutputParser (#19526)
This mitigates a security concern for users still using older versions of libexpat that causes an attacker to compromise the availability of the system if an attacker manages to surface malicious payload to this XMLParser.
2024-03-25 16:21:52 -04:00
Zachary Wilkins
e1a6341940 langchain: Passthrough batch_size on index()/aindex() calls (#19443)
**Description:** This change passes through `batch_size` to
`add_documents()`/`aadd_documents()` on calls to `index()` and
`aindex()` such that the documents are processed in the expected batch
size.
**Issue:** #19415
**Dependencies:** N/A
**Twitter handle:** N/A
2024-03-25 11:58:29 -04:00
ccurme
82de8fd6c9 add kwargs (#19519)
`HanaDB.add_texts` is missing **kwargs.
2024-03-25 11:56:01 -04:00
Nikhil Kumar
3d3b46a782 docs: Update docs for HuggingFacePipeline (#19306)
Updated `HuggingFacePipeline` docs to be in sync with list of supported
tasks, including translation.

- [x] **PR title**: "community: Update docs for `HuggingFacePipeline`"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**:
- **Description:** Update docs for `HuggingFacePipeline`, was earlier
missing `translation` as a valid task
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** None


- [x] **Add tests and docs**:


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-03-25 00:29:21 -07:00
Igor Muniz Soares
743f888580 community[minor]: Dappier chat model integration (#19370)
**Description:** 

This PR adds [Dappier](https://dappier.com/) for the chat model. It
supports generate, async generate, and batch functionalities. We added
unit and integration tests as well as a notebook with more details about
our chat model.


**Dependencies:** 
    No extra dependencies are needed.
2024-03-25 07:29:05 +00:00
Jacob Lezberg
64e1df3d3a infra: Update package version to apply CVE-related patch (#19490)
- **Description:** [CVE
2024-21503](https://www.cve.org/CVERecord?id=CVE-2024-21503) was
recently identified. The python linter "black" suffers from a potential
Regex-related denial of service attack. Updated version from the
vulnerable 24.2.0 to the patched 24.3.0.
- **Issue:** N/A
- **Dependencies:** The 'black' package in both `langchain` (top-level)
and `templates/python-lint`.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 07:11:23 +00:00
Hugoberry
96dc180883 community[minor]: Add DuckDB as a vectorstore (#18916)
DuckDB has a cosine similarity function along list and array data types,
which can be used as a vector store.
- **Description:** The latest version of DuckDB features a cosine
similarity function, which can be used with its support for list or
array column types. This PR surfaces this functionality to langchain.
    - **Dependencies:** duckdb 0.10.0
    - **Twitter handle:** @igocrite

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 07:02:35 +00:00
Ethan Yang
fa6397d76a docs: Add OpenVINO llms docs (#19489)
Add OpenVINOpipeline instructions in docs. OpenVINO users can find more
details in this page.
2024-03-24 23:57:30 -07:00
preak95
6ea3e57a63 community[minor]: S3FileLoader to use expose mode and post_processors arguments of unstructured loader (#19270)
**Description:** Update s3_file.py to use arguments **mode** and
**post_processors** from the base class **UnstructuredBaseLoader** to
include more metadata about the files from the S3 bucket such as
*'page_number', 'languages'* etc.

**Issue:** NA
**Dependencies:** None
**Twitter handle:** preak95

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 06:56:55 +00:00
Guangdong Liu
560e2182d8 docs: docstring Runnable pipe and pick methods (docs only) (#19395)
- **Issue:**  #18804
-  @eyurtsev @ccurme PTAL

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-24 23:50:04 -07:00
Christophe Bornet
63898dbda0 langchain[patch]: Use async memory in Chain when needed (#19429) 2024-03-24 23:49:00 -07:00
Lance Martin
db7403d667 docs: Remove non-rendering images & output spamming from doc ntbks (#19475)
Looking at tokens / page of our docs, we see a few outliers:
<img width="761" alt="image"
src="https://github.com/langchain-ai/langchain/assets/122662504/677aa2d6-0a29-45e4-882a-db2bbf46d02b">

It is due to non-rendering images in one case, and output spamming. 

Clean these, along with other cases of excessing output spamming in
docs.

All get sucked into chat-langchain for retrieval.
2024-03-24 23:47:38 -07:00
Erick Friis
b617085af0 mistralai[patch]: streaming tool calls (#19469) 2024-03-23 19:24:53 +00:00
aditya thomas
b43a9d5808 docs: adding voyageai to the list of partner packages (#19376)
**Description:** Adding VoyageAI to the list of partners
**Issue:** A standalone langchain-voyageai package has been added
**Dependencies:** None
2024-03-22 17:08:15 -07:00
Zeeland
2549df00cd docs: fix error bilibili url (#19375)
Thank you for contributing to LangChain!

bilibili-api-python use https://github.com/Nemo2011/bilibili-api repo.
Change to the correct address.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-22 17:06:17 -07:00
aditya thomas
375ab7bf59 docs: update module imports for fireworks documentation (#19377)
**Description:** Update module imports for Fireworks documentation
**Issue:** Module imports not present or in incorrect location
**Dependencies:** None
2024-03-22 17:05:27 -07:00
aditya thomas
0cc0467267 docs: update import paths and move to lcel for llama.cpp examples (#19391)
**Description:** Update import paths and move to lcel for llama.cpp
examples
**Issue:** Update import paths to reflect package refactoring and move
chains to LCEL in examples
**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-23 00:04:12 +00:00
fengjial
3b52ee05d1 community[patch]: fix bugs in baiduvectordb as vectorstore (#19380)
fix small bugs in vectorstore/baiduvectordb
2024-03-22 17:03:59 -07:00
Cailin Wang
5402aef32e docs: Add partition parameter to DashVector (#19385)
**Description**: Add `partition` parameter to DashVector
dashvector.ipynb
**Related PR**: https://github.com/langchain-ai/langchain/pull/19023
**Twitter handle**: @CailinWang_

---------

Co-authored-by: root <root@Bluedot-AI>
2024-03-22 17:00:29 -07:00
aditya thomas
515aab3312 community[patch]: invoke callback prior to yielding token (openai) (#19389)
**Description:** Invoke callback prior to yielding token for BaseOpenAI
& OpenAIChat
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)
**Dependencies:** None
2024-03-22 16:45:55 -07:00
aditya thomas
49e932cd24 community[patch]: invoke callback prior to yielding token (fireworks) (#19388)
**Description:** Invoke callback prior to yielding token for Fireworks
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)
**Dependencies:** None
2024-03-22 16:44:06 -07:00
aditya thomas
16ef88a87d docs: moving FireworksEmbeddings documentation to docs folder (#19398)
**Description:** Moving FireworksEmbeddings documentation to the
location docs/integration/text_embedding/ from langchain_fireworks/docs/
**Issue:** FireworksEmbeddings documentation was not in the correct
location
**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-22 23:24:22 +00:00
Leonid Ganeline
06190063e7 infra: makefile api_docs_clean fix (#19405)
Fixed a Makefile command that cleans up the api_docs
2024-03-22 15:45:55 -07:00
Christophe Bornet
1b813fe6fe langchain[patch]: Add async methods to VectorStoreRetrieverMemory (#19408) 2024-03-22 15:44:24 -07:00
Tarun Jain
ef6d3d66d6 community[patch]: docarray requires hnsw installation (#19416)
I have a small dataset, and I tried to use docarray:
``DocArrayHnswSearch ``. But when I execute, it returns:

```bash
    raise ImportError(
ImportError: Could not import docarray python package. Please install it with `pip install "langchain[docarray]"`.
```

Instead of docarray it needs to be 

```bash
docarray[hnswlib]
```

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-22 22:39:07 +00:00
German Swan
d4dc98a9f9 community[patch]: RecursiveUrlLoader: add base_url option (#19421)
RecursiveUrlLoader does not currently provide an option to set
`base_url` other than the `url`, though it uses a function with such an
option.
For example, this causes it unable to parse the
`https://python.langchain.com/docs`, as it returns the 404 page, and
`https://python.langchain.com/docs/get_started/introduction` has no
child routes to parse.
`base_url` allows setting the `https://python.langchain.com/docs` to
filter by, while the starting URL is anything inside, that contains
relevant links to continue crawling.
I understand that for this case, the docusaurus loader could be used,
but it's a common issue with many websites.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-22 15:34:31 -07:00
Erick Friis
e71daa7a03 openai[patch]: add test coverage to output (#19462) 2024-03-22 15:33:10 -07:00
igeni
4babefcb2f cli[patch]: Modified regular expression (#19449)
- **Description:** Modified regular expression to add support for
unicode chars and simplify pattern

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-22 15:24:08 -07:00
Ray Bell
7d36ee38b7 docs: point to titantic dataset on web (#19455)
Updated `pd.read_csv("titantic.csv")` to
`pd.read_csv("https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv")`
i.e. it will read it
https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv
and allow anyone to run the code.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-22 22:22:41 +00:00
Ray Bell
f959fad56e docs: use invoke instead of run (#19457)
Updated the deprecated run with invoke

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-22 15:08:26 -07:00
Bagatur
d93d49bc43 openai[patch]: tool use integration test (#19460) 2024-03-22 14:49:54 -07:00
Erick Friis
a99e644913 openai[patch]: integration test structured output (#19459) 2024-03-22 21:43:24 +00:00
Erick Friis
ac57123f40 openai[patch]: release 0.1.1 (#19458) 2024-03-22 21:36:21 +00:00
Luca Dorigo
47cfbe7522 openai[patch]: [URGENT REGRESSION FIX] Don't fail if tool message already doesn't contain name (#19435)
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-22 14:33:50 -07:00
aditya thomas
bc028294d0 docs: delete mistralai embeddings doc from incorrect location (#19432)
**Description:** Delete MistralAIEmbeddings usage document from folder
partners/mistralai/docs
**Issue:** The document is present in the folder docs/docs
**Dependencies:** None
2024-03-22 14:02:59 -07:00
Erick Friis
11e37943ed mistralai[patch]: fix core version (#19454) 2024-03-22 20:48:13 +00:00
Erick Friis
3b093160c4 mistralai[patch]: release 0.1.0rc1 (#19453) 2024-03-22 20:34:36 +00:00
aditya thomas
4856a87261 community[patch]: invoke callback prior to yielding token (llama.cpp) (#19392)
**Description:** Invoke callback prior to yielding token for llama.cpp
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)
**Dependencies:** None
2024-03-22 16:17:56 -04:00
ccurme
c4599444ee mistralai: update tool calling (#19451)
```python
from langchain.agents import tool
from langchain_mistralai import ChatMistralAI


llm = ChatMistralAI(model="mistral-large-latest", temperature=0)

@tool
def get_word_length(word: str) -> int:
    """Returns the length of a word."""
    return len(word)


tools = [get_word_length]
llm_with_tools = llm.bind_tools(tools)

llm_with_tools.invoke("how long is the word chrysanthemum")
```
currently raises
```
AttributeError: 'dict' object has no attribute 'model_dump'
```

Same with `.with_structured_output`
```python
from langchain_mistralai import ChatMistralAI
from langchain_core.pydantic_v1 import BaseModel

class AnswerWithJustification(BaseModel):
    """An answer to the user question along with justification for the answer."""
    answer: str
    justification: str

llm = ChatMistralAI(model="mistral-large-latest", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification)

structured_llm.invoke("What weighs more a pound of bricks or a pound of feathers")
```

This appears to fix.
2024-03-22 16:03:48 -04:00
Erick Friis
cceaca3e4f cookbook[patch]: add strip of quotes (#19452) 2024-03-22 19:10:39 +00:00
ccurme
8a2528c34a [langchain] fix OpenAIAssistantRunnable.create_assistant (#19081)
- **Description:** OpenAI assistants support some pre-built tools (e.g.,
`"retrieval"` and `"code_interpreter"`) and expect these as `{"type":
"code_interpreter"}`. This may have been upset by
https://github.com/langchain-ai/langchain/pull/18935
- **Issue:** https://github.com/langchain-ai/langchain/issues/19057
2024-03-22 13:23:19 -04:00
Harrison Chase
b40c80007f core[minor]: Add utility code to create tool examples (#18602)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-22 13:17:40 -04:00
Erick Friis
53ac1ebbbc mistralai[minor]: 0.1.0rc0, remove mistral sdk (#19420) 2024-03-22 01:24:58 +00:00
William FH
e980c14d6a core[patch]: allow "placeholder" type in from_messages tuples (#19152)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-21 22:09:24 +00:00
billytrend-cohere
f6bcd42421 community[patch]: Replace positional argument with text=text for cohere>=5 compatibility (#19407)
- **Description:** Replace positional argument with text=text for
cohere>=5 compatibility
2024-03-21 10:42:51 -07:00
enfeng
b20c2640da anthropic[patch]: update base_url of anthropic (#18634)
A small change ~

- [ ] **update base_url**: "package: langchain_anthropic"

---------

Co-authored-by: yangenfeng <yangenfeng@xiaoniangao.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-20 21:04:55 -07:00
Erick Friis
a9cda536ad openai[patch]: fix core min version (#19366) 2024-03-20 15:38:29 -07:00
Erick Friis
0b20c098df openai[patch]: fix name param (#19365) 2024-03-20 22:22:09 +00:00
Erick Friis
f6c8700326 openai[patch]: release 0.1.0, message id and name support (#19363) 2024-03-20 15:11:39 -07:00
Bagatur
3fa711dce0 experimental[patch]: Release 0.0.55 (#19353) 2024-03-20 13:06:39 -07:00
Erick Friis
2bcd760c46 robocorp[patch]: run integration tests on release (#19358) 2024-03-20 19:31:12 +00:00
Erick Friis
a031c183ae robocorp[patch]: release 0.0.4 (#19357) 2024-03-20 12:28:41 -07:00
Bagatur
d95ea3550e langchain[patch]: Release 0.1.13 (#19351) 2024-03-20 18:25:12 +00:00
Bagatur
b58b38769d community[patch]: Release 0.0.29 (#19350) 2024-03-20 18:09:48 +00:00
Bagatur
5d220975fc core[patch]: Release 0.1.33 (#19348) 2024-03-20 17:28:56 +00:00
Eugene Yurtsev
aa9ccca775 langchain[patch]: Add tests for indexing (#19342)
This PR adds tests for the indexing API
2024-03-20 13:00:22 -04:00
William FH
68298cdc82 [Feat] Accept non-dict if only 1 prompt input variable (#19156)
For prompt templates with only 1 variable (common in e.g.,
MessageGraph), it's convenient to wrap the incoming object in the
variable before formatting.


The downside of this, of course, would be that some number of
invocations will successfully format when the user may have intended to
format it properly before
2024-03-20 09:59:32 -07:00
mackong
d9396bdec1 langchain[patch]: add stop for various non-openai agents (#19333)
* Description: add stop for various non-openai agents.
* Issue: N/A
* Dependencies: N/A

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-20 11:34:10 -04:00
Yudhajit Sinha
7d216ad1e1 community[patch]: Invoke callback prior to yielding token (titan_takeoff_pro) (#18624)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/titan_takeoff_pro.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:58:18 -07:00
Yudhajit Sinha
455a74486b community[patch]: Invoke callback prior to yielding token (sparkllm) (#18625)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/sparkllm.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:57:53 -07:00
Yudhajit Sinha
5ac1860484 community[patch]: Invoke callback prior to yielding token (replicate) (#18626)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/replicate.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:57:27 -07:00
Yudhajit Sinha
9525e392de community[patch]: Invoke callback prior to yielding token (pai_eas_endpoint) (#18627)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/pai_eas_endpoint.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:56:58 -07:00
Yudhajit Sinha
140f06e59a community[patch]: Invoke callback prior to yielding token (openai) (#18628)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/openai.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:56:30 -07:00
Yudhajit Sinha
280a914920 community[patch]: Invoke callback prior to yielding token (ollama) (#18629)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_ &
_astream_ methods in llms/ollama.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:56:09 -07:00
老阿張
9dfce56b31 docs: Fix typo in infino.ipynb (#18640)
Description: "conquerer should be conqueror "? 🤔
Issue: Typo
Dependencies: Nope
Twitter handle: laoazhang
2024-03-20 07:51:58 -07:00
Christophe Bornet
00614f332a community[minor]: Add InMemoryVectorStore (#19326)
This is a basic VectorStore implementation using an in-memory dict to
store the documents.
It doesn't need any extra/optional dependency as it uses numpy which is
already a dependency of langchain.
This is useful for quick testing, demos, examples.
Also it allows to write vendor-neutral tutorials, guides, etc...
2024-03-20 10:21:07 -04:00
Devesh Rahatekar
3c4529ac69 core: Updated docstring for RunnablePick (#18832)
**Description:** : Updated the docstring for RunnablePick. Added
Overview and an Example for RunnablePick class.
   **Issue:** : #18803
2024-03-20 13:54:42 +00:00
aditya thomas
e46419c851 docs: contribute / integrations code examples update (#19319)
**Description:** Update to make the code examples consistent with the
actual use
**Issue:** Code examples were different from actual use in the LangChain
code
**Dependencies:** Changes on top of
https://github.com/langchain-ai/langchain/pull/19294

Note: If these changes are acceptable, please merge them after
https://github.com/langchain-ai/langchain/pull/19294.
2024-03-20 09:27:53 -04:00
Leonid Ganeline
8609afbd10 core[patch]: Update messages namespace to fix API reference docs (#19161)
Classes and functions defined in __init__.py are not parsed into the API
Reference.
For example:
- libs/core/langchain_core/messages/__init__.py : AnyMessage,
MessageLikeRepresentation, get_buffer_string(), messages_from_dict(),
...

Opinionated: __init__.py is not a typical place to define artifacts.

Moved artifacts from __init__ into utils.py. 
Added `MessageLikeRepresentation` to __all__ since it is used outside of
`messages`, for example, in
`libs/core/langchain_core/language_models/base.py`
Added `_message_from_dict` to __all__ since it is used outside of
`messages`(???) I would add `message_from_dict` (without underscore) as
an alias. Please, advise.
2024-03-20 09:25:09 -04:00
Christophe Bornet
4c2e887276 core: Simplify astream logic in BaseChatModel and BaseLLM (#19332)
Covered by tests in
`libs/core/tests/unit_tests/language_models/chat_models/test_base.py`,
`libs/core/tests/unit_tests/language_models/llms/test_base.py` and
`libs/core/tests/unit_tests/runnables/test_runnable_events.py`
2024-03-20 09:05:51 -04:00
Brace Sproul
40f846e65d docs[minor]: Add chat model selection tabs component (#19296)
<img width="1728" alt="image"
src="https://github.com/langchain-ai/langchain/assets/46789226/45e70a92-c2ee-48c8-9964-100eed22687b">
2024-03-19 18:12:46 -07:00
Erick Friis
69e9610f62 openai[patch]: pass message name (#17537) 2024-03-19 19:57:27 +00:00
Guangdong Liu
e5d7e455dc splitters: Add ensure_ascii parameter (#18485)
- **Description:** Add ensure_ascii parameter
2024-03-19 12:51:16 -07:00
Nithish Raghunandanan
7ad0a3f2a7 community: add Couchbase Vector Store (#18994)
- **Description:** Added support for Couchbase Vector Search to
LangChain.
- **Dependencies:** couchbase>=4.1.12
- **Twitter handle:** @nithishr

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
2024-03-19 12:39:51 -07:00
Chris Papademetrious
305d74c67a core: implement a batch_size parameter for CacheBackedEmbeddings (#18070)
**Description:**

Currently, `CacheBackedEmbeddings` computes vectors for *all* uncached
documents before updating the store. This pull request updates the
embedding computation loop to compute embeddings in batches, updating
the store after each batch.

I noticed this when I tried `CacheBackedEmbeddings` on our 30k document
set and the cache directory hadn't appeared on disk after 30 minutes.

The motivation is to minimize compute/data loss when problems occur:

* If there is a transient embedding failure (e.g. a network outage at
the embedding endpoint triggers an exception), at least the completed
vectors are written to the store instead of being discarded.
* If there is an issue with the store (e.g. no write permissions), the
condition is detected early without computing (and discarding!) all the
vectors.

**Issue:**
Implements enhancement #18026.

**Testing:**
I was unable to run unit tests; details in [this
post](https://github.com/langchain-ai/langchain/discussions/15019#discussioncomment-8576684).

---------

Signed-off-by: chrispy <chrispy@synopsys.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-19 18:55:43 +00:00
William FH
89af30807b Permit function eval on llm data type (#19287) 2024-03-19 11:53:50 -07:00
Jib
f8078e41e5 mongodb[patch]: Added scoring threshold to caching (#19286)
## Description
Semantic Cache can retrieve noisy information if the score threshold for
the value is too low. Adding the ability to set a `score_threshold` on
cache construction can allow for less noisy scores to appear.


- [x] **Add tests and docs**
  1. Added tests that confirm the `score_threshold` query is valid.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-19 11:30:02 -07:00
Christophe Bornet
30e4a35d7a community: Use langchain-astradb for AstraDB caches (#18419)
- [x] Needs https://github.com/langchain-ai/langchain-datastax/pull/4
- [x] Needs a new release of langchain-astradb
2024-03-19 14:04:36 -04:00
Brace Sproul
17c62e0f3a ci[minor]: Bump LC scripts package, add retry option (#19285)
The `retryFailed` option will retry all failed links, once at a time
with the goal of not triggering bot protection

`microsoft.com` is now hard coded into the whitelist
2024-03-19 10:42:59 -07:00
Erick Friis
7eb376d5fc docs: integration deprecation docs (#19283) 2024-03-19 17:11:15 +00:00
Guangdong Liu
2c835baae4 code[patch]: Add in code documentation to core Runnable with_retry method (docs only) (#19192)
- **Description:** Add in code documentation to core Runnable with_retry
method (docs only)
- **Issue:** #18804 
@baskaryan @eyurtsev PTAL

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-03-19 12:52:29 -04:00
Eugene Yurtsev
4b3dd34544 core[patch]: Pass sync run manager for sync stream fallback in astream (#19280)
This PR patches the fallback in chat models and language models to pass
in the appropriate version of the run manager (sync vs. async)
2024-03-19 16:32:33 +00:00
Leonid Ganeline
d314acb2d5 core[patch]: Move globals to a module instead of a package (non breaking change) (#19159)
Classes and functions defined in __init__.py are not parsed into the API
Reference.
For example: libs/core/langchain_core/globals/__init__.py :
`set_verbose` `get_llm_cache`, `set_llm_cache`, ...
And the whole `langchain_core.globals` namespace is not visible in the
API Reference. The refactoring is just file renaming.
2024-03-19 12:29:12 -04:00
Al-Ekram Elahee Hridoy
50f93d86ec core[minor]: Enhance cache flexibility in BaseChatModel (#17386)
- **Description:** Enhanced the `BaseChatModel` to support an
`Optional[Union[bool, BaseCache]]` type for the `cache` attribute,
allowing for both boolean flags and custom cache implementations.
Implemented logic within chat model methods to utilize the provided
custom cache implementation effectively. This change aims to provide
more flexibility in caching strategies for chat models.
  - **Issue:** Implements enhancement request #17242.
- **Dependencies:** No additional dependencies required for this change.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-19 11:26:58 -04:00
HatsuneMK00
4761c09e94 docs: update slack toolkit ipynb in integration (#19219)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- **PR message**:
- **Description:** Update the slack toolkit doc to use an agent that
support multiple inputs. Using ReAct agent will cause a ValidationError
when invoking the slack tools. This is because the agent return a string
like `'{"channel": "C05LDF54S21", "message": "Hello, world!"}'` but the
ReAct agent does not support multiple inputs.
- **Issue:** This is related to this
[Discussion#18083](https://github.com/langchain-ai/langchain/discussions/18083)
    - **Dependencies:** No dependencies required

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-19 10:39:09 -04:00
Zihong
ff31cc1648 experimental: update the notebook link of semantic chunk. (#19253)
update the notebook link of semantic chunk.
2024-03-19 07:24:51 -04:00
Frederico Wu
f36418a5b0 langchain: creating assistants with file_ids (#19199)
Changing OpenAIAssistantRunnable.create_assistant to send the `file_ids`
parameter to openai.beta.assistants.create

Co-authored-by: Frederico Wu <fred.diaswu@coxautoinc.com>
2024-03-18 21:34:03 -07:00
Vittorio Rigamonti
9b2f9ee952 community: VectorStore Infinispan, adding autoconfiguration (#18967)
**Description**:
this PR enable VectorStore autoconfiguration for Infinispan: if
metadatas are only of basic types, protobuf
config will be automatically generated for the user.
2024-03-18 21:33:45 -07:00
Max Jakob
6f544a6a25 elasticsearch: check for deployed models (#18973)
When creating a new index, if we use a retrieval strategy that expects a
model to be deployed in Elasticsearch, check if a model with this name
is indeed deployed before creating an index. This lowers the probability
to get into a state in which an index was created with a faulty model
ID, which cannot be overwritten any more (the index has to manually be
deleted).
2024-03-18 21:32:00 -07:00
gonvee
b82644078e community: Add keep_alive parameter to control how long the model w… (#19005)
Add `keep_alive` parameter to control how long the model will stay
loaded into memory with Ollama。

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-19 04:29:01 +00:00
Anthony Shaw
bb0dd8f82f docs: Embellish article on splitting by tokens with more examples and missing details (#18997)
**Description**

This PR adds some missing details from the "Split by tokens" page in the
documentation. Specifically:

- The `.from_tiktoken_encoder()` class methods for both the
`CharacterTextSplitter` and `RecursiveCharacterTextSplitter` default to
the old `gpt-2` encoding. I've added a comment to suggest specifying
`model_name` or `encoding`
- The docs didn't mention that the `from_tiktoken_encoder()` class
method passes additional kwargs down to the constructor of the splitter.
I only discovered this by reading the source code
- Added an example of using the `.from_tiktoken_encoder()` class method
with `RecursiveCharacterTextSplitter` which is the recommended approach
for most scenarios above `CharacterTextSplitter`
- Added a warning that `TokenTextSplitter` can split characters which
have multiple tokens (e.g. 猫 has 3 cl100k_base tokens) between multiple
chunks which creates malformed Unicode strings and should not be used in
these situations.

Side note: I think the default argument of `gpt2` for
`.from_tiktoken_encoder()` should be updated?

**Twitter handle** anthonypjshaw

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-18 21:28:17 -07:00
Roshan Santhosh
7afecec280 core: update _rm_titles to account for title argument name bug (#19036)
Issue : For functions which have an argument with the name 'title', the
convert_pydantic_to_openai_function generates an incorrect output and
omits the argument all together. This is because the _rm_titles function
removes all instances of the the key 'title' from the output.



Description : Updates the _rm_titles function to check the presence of
the 'type' key as well before removing the 'title' key. As the title key
that we wish to omit always has a type key along with it.

Potential gap if there is a function defined which has both title and
key as argument names, in which case this would fail. Maybe we could set
a filter on the function argument names and reject those with keyword
argument names.


No dependencies. Passed all tests. 


- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-18 21:25:06 -07:00
Harrison Chase
efcdf54edd Josha91 fix docstring (#19249)
Co-authored-by: Josha van Houdt <josha.van.houdt@sap.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-18 21:19:56 -07:00
Simon Stone
58c7687174 langchain: preserve document metadata in FlashrankRerank (#19148)
**Description:** Preserves document metadata in `FlashrankRerank`
    - **Issue:** #19142
    - **Dependencies:** None
    - **Twitter handle:** n/a

---------

Co-authored-by: Simon Stone <simon.stone@dartmouth.edu>
2024-03-19 04:15:18 +00:00
Aaron Jimenez
bc648f6cfc core: Updated docstring for Context class (#19079)
- **Description:** Improves the docstring for `class Context` by
providing an overview and an example.
- **Issue:** #18803
2024-03-18 21:15:14 -07:00
Taqi Jaffri
044bc22acc Community: Add mistral oss model support to azureml endpoints, plus configurable timeout (#19123)
- **Description:** There was no formatter for mistral models for Azure
ML endpoints. Adding that, plus a configurable timeout (it was hard
coded before)
- **Dependencies:** none
- **Twitter handle:** @tjaffri @docugami
2024-03-18 21:10:42 -07:00
Kangmoon Seo
07de4abe70 core: Fix Exception handling in XMLOutputParser (#19126)
- **Description:** 
  - Exception handling in `XMLOutputParser`
1. Add Exception handling at `root = ET.fromstring(text)` // raises
`ET.ParseError`
    2. Fix Exception class (commonly uses in `BaseOutputParser` class)
  - AS-IS: raise `ValueError`, `ET.ParserError` without handling
    ```python
    # langchain_core/output_parsers/xml.py

        text = text.strip()
        if (text.startswith("<") or text.startswith("\n<")) and (
            text.endswith(">") or text.endswith(">\n")
        ):
            root = ET.fromstring(text)
            return self._root_to_dict(root)
        else:
            raise ValueError(f"Could not parse output: {text}")
    ```
  - TO-BE: raise `OutputParserException`
    ```python
    # langchain_core/output_parsers/xml.py

        text = text.strip()
        if (text.startswith("<") or text.startswith("\n<")) and (
            text.endswith(">") or text.endswith(">\n")
        ):
            try:
                root = ET.fromstring(text)
                return self._root_to_dict(root)

            except ET.ParseError:
raise OutputParserException(f"Could not parse output: {text}")

        else:
raise OutputParserException(f"Could not parse output: {text}")

    ``` 
- **Issue:** #19107  
- **Dependencies:** None
2024-03-18 21:08:32 -07:00
Hamza Muhammad Farooqi
24a0a4472a Add docstrings for Clickhouse class methods (#19195)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-19 04:03:12 +00:00
Simon Stone
dc4ce82ddd docs: fix import path for FlashrankRerank example notebook (#19146)
**Description:** Fixes the import paths for the `FlashrankRerank`
example notebook.
 **Issue:** #19139 
 **Dependencies:** None
 **Twitter handle:** n/a

---------

Co-authored-by: Simon Stone <simon.stone@dartmouth.edu>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-18 21:03:00 -07:00
Saurav Kumar
bde199d128 Updating format of pip install (#19198)
Thank you for contributing to LangChain!

- [x] **PR title**: "Updating format of pip install in two files of
docs/cookbook"
- pip install is not reflecting properly in some of the files in
cookbook
- Example:
[docs/expression_language/cookbook/sql_db](https://python.langchain.com/docs/expression_language/cookbook/sql_db)


- [x] **PR message**: Updating format of pip install in two files of
docs/cookbook
    - **Description:** a description of the change
    - **Issue:** #19197 

- Note - let's do squash merge for the PR

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-19 04:01:24 +00:00
Rohit Gupta
785f8ab174 [langchain_community] milvus vectorstores upsert: add **kwargs to make it use for other argument also (#19193)
add **kwargs in add_documents for upsert, to make it use for other
argument also.
Lets use this, it was unused as of now.

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Rohit Gupta <rohit.gupta2@walmart.com>
2024-03-18 21:01:12 -07:00
Cycle
77868b1974 experimental: add buffer_size hyperparameter to SemanticChunker as in source video (#19208)
add buffer_size hyperparameter which used in combine_sentences function
2024-03-19 03:54:20 +00:00
HowardChan
ae3c7f702c docs:Make url as a markdown link (#19212)
**Description**: same as the title

Co-authored-by: ChenZhengHao <chenzhenghao@mail.teletraan.io>
2024-03-19 03:47:52 +00:00
Shotaro Sano
ca9c8c58ea text-splitters, infra: fix libs/langchain/dev.Dockerfile so that the text-splitter directory is copied before poetry installation (#19214)
## Description
This PR modifies the settings in `libs/langchain/dev.Dockerfile` to
ensure that the `text-splitters` directory is copied before the poetry
installation process begins.

Without this modification, the `docker build` command fails for
`dev.Dockerfile`, preventing the setup of some development environments,
including `.devcontainer`.

## Bug Details

### Repro
Run the following command:

```bash
docker build -f libs/langchain/dev.Dockerfile .
```

### Current Behavior
The docker build command fails, raising the following error:

```
...
 => [langchain-dev-dependencies 4/5] COPY libs/community/ ../community/                                                                                0.4s
 => ERROR [langchain-dev-dependencies 5/5] RUN poetry install --no-interaction --no-ansi --with dev,test,docs                                          1.1s
------                                                                                                                                                      
 > [langchain-dev-dependencies 5/5] RUN poetry install --no-interaction --no-ansi --with dev,test,docs:
#13 0.970 
#13 0.970 Directory ../text-splitters does not exist
------
executor failed running [/bin/sh -c poetry install --no-interaction --no-ansi --with dev,test,docs]: exit code: 1
```

### Expected Behavior
The `docker build` command successfully completes without the poetry
error.

### Analysis
The error occurs because the `text-splitters` directory is not copied
into the build environment, unlike the other packages under the `libs`
directory. I suspect that the `COPY` setting was overlooked since
`text-splitters` was separated in a recent PR.

## Fix
Add the following lines to the `libs/langchain/dev.Dockerfile`:

```dockerfile
# Copy the text-splitters library for installation
COPY libs/text-splitters/ ../text-splitters/
```
2024-03-18 20:45:35 -07:00
Guangdong Liu
c3310c5e7f community: Fix Milvus got multiple values for keyword argument 'timeout' (#19232)
- **Description:** Fix Milvus got multiple values for keyword argument
'timeout'
- **Issue:**  fix #18580
- @baskaryan @eyurtsev PTAL
2024-03-18 20:44:25 -07:00
Erick Friis
95904fe443 langchain[patch]: update base imports to core (#19248)
still deprecated, but was misleading before
2024-03-19 03:17:07 +00:00
Asaf Joseph Gardin
21c45475c5 ai21[patch]: AI21 Labs bump SDK version (#19114)
Description: Added support AI21 SDK version 2.1.2
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:47:08 -07:00
daniel ung
edf9d1c905 templates: Added template for JaguarDB (#16757)
- **Description:**: added langchain template for JaguarDB

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-19 02:36:24 +00:00
gustavo-yt
7c26ef88a1 templates: Add rag lantern template (#16523)
Replace this entire comment with:
  - **Description:** Added a template for lantern rag usage.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-19 02:34:46 +00:00
Jib
516cc44b3f langchain-mongodb: [test-fix] add explicit index_name setting on test vector creation (#19245)
- **Description:** Tests fail to do value lookup because it does not
specify the index name
  - **Issue:** the issue # Failing integration test
 

- [x] **Add tests and docs**: Tests now pass


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-03-18 15:52:28 -07:00
Estephania Calvo Carvajal
94e58dd827 docs:Fix links to LangSmith docs on Evaluation page (#19210) (#19216)
- **Description:** Same as the title
- **Issue:** #19210
2024-03-18 22:27:43 +00:00
William FH
780337488e [Enhancement] Add support for directly providing a run_id (#18990)
The root run id (~trace id's) is useful for assigning feedback, but the
current recommended approach is to use callbacks to retrieve it, which
has some drawbacks:
1. Doesn't work for streaming until after the first event
2. Doesn't let you call other endpoints with the same trace ID in
parallel (since you have to wait until the call is completed/started to
use

This PR lets you provide = "run_id" in the runnable config.

Couple considerations:

1. For batch calls, we split the trace up into separate trees (to permit
better rendering). We keep the provided run ID for the first one and
generate a unique one for other elements of the batch.
2. For nested calls, the provided ID is ONLY used on the top root/trace.



### Example Usage


```
chain.invoke("foo", {"run_id": uuid.uuid4()})
```
2024-03-18 15:03:04 -07:00
Jacob Lee
bd329e9aad core[patch]: Add LLM output to message response_metadata (#19158)
This will more easily expose token usage information.

CC @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-18 13:58:32 -07:00
Erick Friis
6fa1438334 mongodb[patch]: release 0.1.2 (#19243) 2024-03-18 13:35:45 -07:00
Leonid Ganeline
7de1d9acfd community: llms imports fixes (#18943)
Classes are missed in  __all__  and in different places of __init__.py
- BaichuanLLM 
- ChatDatabricks
- ChatMlflow
- Llamafile
- Mlflow
- Together
Added classes to __all__. I also sorted __all__ list.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 20:24:40 +00:00
Anush
aee5138930 templates: update qdrant self query (#19218)
## Description

This PR
- Updates the Qdrant self-query template to reflect the recent updates.
- Enables reading config values from `env` files as the README [mentions
it](https://github.com/Anush008/langchain/tree/self-query-qdrant/templates/self-query-qdrant#environment-setup).

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:59:08 +00:00
Kenzie Mihardja
21f75991d4 deprecate community docugami loader (#19230)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: deprecate DocugamiLoader"

- [x] **PR message**: Deprecate the langchain_community and use the
docugami_langchain DocugamiLoader

---------

Co-authored-by: Kenzie Mihardja <kenzie28@cs.washington.edu>
2024-03-18 12:56:47 -07:00
Jib
ec026004cb mongodb[patch]: Remove in-memory cache from cache abstractions (#18987)
## Description
* In memory cache easily gets out of sync with the server cache, so we
will remove it entirely to reduce the issues around invalidated caches.

## Dependencies
None

- [x]  If you're adding a new integration, please include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:44:34 +00:00
Jib
866d6408af mongodb[patch]: Remove embedding retrieval from mongodb payload (#19035)
## Description
Returning the embedding is not necessary in the vector search
functionality unless specified as a debugging step. This change defaults
the behavior such that the server _only_ returns the embedding key if
explicitly requested, such as in the case of
`max_marginal_relevance_search`.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
* Added `test_from_documents_no_embedding_return`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:43:50 +00:00
Leonid Kuligin
366ba77459 core[minor]: moved fake llms and embeddings to core (#19226)
- [ ] **PR title**: "core: moved fake llms and embeddings to core"


- [ ] **PR message**:
 - **Description:** moved fake llms and embeddings to core"
2024-03-18 10:01:26 -07:00
Pengfei Jiang
514fe80778 community[patch]: add stop parameter support to volcengine maas (#19052)
- **Description:** add stop parameter to volcengine maas model
- **Dependencies:** no

---------

Co-authored-by: 江鹏飞 <jiangpengfei.jiangpf@bytedance.com>
2024-03-17 01:58:50 +00:00
htaoruan
bcc771e37c docs: ChatTongyi example error (#19013) 2024-03-17 01:55:56 +00:00
Anubhav Madhav
9235dade90 docs: provided hyperlinks to text and fixed grammar (#19092)
1) Provided links to text in the prompt (Refer Page Link 1, Page Link 2
and Page Link 3)
2) Fixed Grammar in Considerations of Model I/O Concepts documentation
page - Update concepts.mdx (Page Link 4)

*Issues are on the following pages:*
Page Link 1:
https://python.langchain.com/docs/modules/model_io/concepts#prompttemplate
Page Link 2:
https://python.langchain.com/docs/modules/model_io/concepts#messageprompttemplate
Page Link 3:
https://python.langchain.com/docs/modules/model_io/concepts#chatprompttemplate
Page Link 4:
https://python.langchain.com/docs/modules/model_io/concepts#considerations


**Fix 1**:
Description: Fixed Grammar in Considerations of Model I/O Documentation
Page
Issue: "to work well with the model are you using" # "to work well with
the model you are using"
Dependencies: None
Twitter handle: @Anubhav_Madhav (https://twitter.com/Anubhav_Madhav)

**Fix 2**:
Description: Provided links to text in the prompt (Refer Page Link 1,
Page Link 2 and Page Link 3)
Issue: links not provided # links have been provided to the text
Dependencies: None
Twitter handle: @Anubhav_Madhav (https://twitter.com/Anubhav_Madhav)
baskaryan, efriis, eyurtsev, hwchase17.


*For Fix 1*
Refer to the first word 'This" word in the image attached with this PR.
PFA
<img width="839" alt="Screenshot 2024-03-15 at 3 04 17 AM"
src="https://github.com/langchain-ai/langchain/assets/42323737/94e8db16-249f-48c3-a1d1-dee8d36067fa">


If no one reviews your PR within a few days, please @-mention one of

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-17 01:37:42 +00:00
primate88
5aa68936e0 community: Fix import path for StreamingStdOutCallbackHandler example (#19170)
- Description:
- Updated the import path for `StreamingStdOutCallbackHandler` in the
streaming response example within `huggingface_endpoint.py`. This change
corrects the import statement to reflect the actual location of
`StreamingStdOutCallbackHandler` in
`langchain_core.callbacks.streaming_stdout`.
- Issue:
  - None
- Dependencies:
  - No additional dependencies are required for this change.
- Twitter handle:
  - None

## Note:
I have tested this change locally and confirmed that the
`StreamingStdOutCallbackHandler` works as expected with the updated
import path. This PR does not require the addition of new tests since it
is a correction to documentation/examples rather than functional code.
2024-03-17 00:50:37 +00:00
Bagatur
611d5a1618 openai[patch]: fix async http client (#19164)
Fix #19116
2024-03-16 17:50:22 -07:00
Nikhil Kumar
635b3372bd community[minor]: Add support for translation in HuggingFacePipeline (#19190)
- [x] **Support for translation**: "community: Add support for
translation in `HuggingFacePipeline`"


- [x] **Add support for translation in `HuggingFacePipeline`**:
- **Description:** Add support for translation in `HuggingFacePipeline`,
which earlier used to support only text summarization and generation.
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** None
2024-03-17 00:48:13 +00:00
Nikhil Kumar
a1b26dd9b6 docs: Add docs for RouterRunnable (#19191)
- [x] **Docs for `RouterRunnable`**: core: Add docs for `RouterRunnable`

- [x] **Add docs for `RouterRunnable`**:
- **Description:** Add docs for `RouterRunnable`, which was previously
missing documentation
    - **Issue:** #18803 
    - **Dependencies:** N/A
    - **Twitter handle:** None
2024-03-17 00:48:00 +00:00
k.muto
8d2c34e655 community: Fix all page numbers were the same for _BaseGoogleVertexAISearchRetriever (#19175)
- Description:
- This pull request is to fix a bug where page numbers were not set
correctly. In the current code, all chunks share the same metadata
object doc_metadata, so the page number is set with the same value for
all documents. To fix this, I changed to using separate metadata objects
for each chunk.
- Issue:
  - None
- Dependencies:
  - No additional dependencies are required for this change.
- Twitter handle:
  - @eycjur

- Test
- Even if it's not a bug, there are cases where everything ends up with
the same number of pages, so it's very difficult for me to write
integration tests.
2024-03-16 22:28:56 +00:00
Matt Frediani
160a7077b0 Update README.md (#19172)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-16 15:23:25 -07:00
inpyeong
7c092f479f docs: Update why.ipynb (#19173)
I think that cell type for pip command may be 'code'.
Please check, thank you :)

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-16 22:21:51 +00:00
Vitalii Korsakov
d96e0b2de7 docs: Remove duplicated line in Get Started section (#19182)
Line `from langchain_openai import ChatOpenAI` is put twice in Get
Started / Serving with LangServe section.
Imports on lines 559 and 566 are identical

Co-authored-by: Vitalii <vitalii@localhost>
2024-03-16 22:21:25 +00:00
Cailin Wang
7cd87d2f6a community: Add partition parameter to DashVector (#19023)
**Description**: DashVector Add partition parameter
**Twitter handle**: @CailinWang_

---------

Co-authored-by: root <root@Bluedot-AI>
2024-03-16 15:20:30 -07:00
Rodrigo Nogueira
e64cf1aba4 community: Add model argument for maritalk models and better error handling (#19187) 2024-03-16 15:18:56 -07:00
samanhappy
ff94f86ce1 docs: fix link to interface TextSplitter (#19177) 2024-03-16 15:16:34 -07:00
Sergey Kozlov
1a55e950aa community[patch]: support fastembed v1 and v2 (#19125)
**Description:**
#18040 forces `fastembed>2.0`, and this causes dependency conflicts with
the new `unstructured` package (different `onnxruntime`). There may be
other dependency conflicts.. The only way to use
`langchain-community>=0.0.28` is rollback to `unstructured 0.10.X`. But
new `unstructured` contains many fixes.

This PR allows to use both `fastembed` `v1` and `v2`.

How to reproduce:

`pyproject.toml`:
```toml
[tool.poetry]
name = "depstest"
version = "0.0.0"
description = "test"
authors = ["<dev@example.org>"]

[tool.poetry.dependencies]
python = ">=3.10,<3.12"
langchain-community = "^0.0.28"
fastembed = "^0.2.0"
unstructured = {extras = ["pdf"], version = "^0.12"}
```

```bash
$ poetry lock
```

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
2024-03-15 18:33:51 -07:00
six17
fd4f536c77 text-splitters[patch]: fix json split of RecursiveJsonSplitter (#19119)
- **Description:** This modification addresses the issue of mutable
default parameters in functions. In the original code, the `chunks`
parameter is defaulted to a list containing an empty dictionary, which
is mutable. Since default parameters in Python are evaluated only once
at function definition time, modifications to the parameter would
persist across future calls. By changing the default to `None` and
checking/initializing within the function, a new list is created for
each call, thus avoiding potential issues.

---------

Co-authored-by: sixiang <sixiang@lixiang.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-15 16:46:49 -07:00
aditya thomas
05008c4f94 docs: update stale links in Together AI documentation (#19011)
**Description:** Update stales link in Together AI documentation
**Issue:** Some links pointed to legacy webpages on the Together AI
website
**Dependencies:** None
**Lint and test**: `make format`, `make lint` were run
2024-03-15 16:38:04 -07:00
aditya thomas
80eb510a7b docs: update docstring of Together class (#19008)
**Description:** Update docstring of Together class to show example and
update API URL
**Issue:** Improves usability
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` were run
2024-03-15 16:30:45 -07:00
高远
ef9813dae6 docs: add vikingdb docstrings(#19016)
Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
2024-03-15 16:29:29 -07:00
wulixuan
0e0030f494 community[patch]: fix yuan2 chat model errors while invoke. (#19015)
1. fix yuan2 chat model errors while invoke.
2. update related tests.
3. fix some deprecationWarning.
2024-03-15 16:28:36 -07:00
Shuai Liu
c244e1a50b community[patch]: Fixed bug in merging generation_info during chunk concatenation in Tongyi and ChatTongyi (#19014)
- **Description:** 

In #16218 , during the `GenerationChunk` and `ChatGenerationChunk`
concatenation, the `generation_info` merging changed from simple keys &
values replacement to using the util method
[`merge_dicts`](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/utils/_merge.py):


![image](https://github.com/langchain-ai/langchain/assets/2098020/10f315bf-7fe0-43a7-a0ce-6a3834b99a15)

The `merge_dicts` method could not handle merging values of `int` or
some other types, and would raise a
[`TypeError`](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/utils/_merge.py#L55).

This PR fixes this issue in the **Tongyi and ChatTongyi Model** by
adopting the `generation_info` of the last chunk
and discarding the `generation_info` of the intermediate chunks,
ensuring that `stream` and `astream` function correctly.

- **Issue:**  
    - Related issues or PRs about Tongyi & ChatTongyi: #16605, #17105 
    - Other models or cases: #18441, #17376
- **Dependencies:** No new dependencies
2024-03-15 16:27:53 -07:00
wulixuan
f79d0cb9fb docs: update docs for yuan2 in LLMs and Chat models integration. (#19028)
update yuan2.0 notebook in LLMs and Chat models.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-15 16:03:18 -07:00
Taraka Nithin Vankala
eec023766e docs: Corrected error (#19030)
- [ ] **PR title**: "docs: correction in
"https://github.com/langchain-ai/langchain/blob/master/docs/docs/get_started/quickstart.mdx",
line 289".
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: 
    - Corrected the spelling mistake
    - #18981
2024-03-15 16:02:33 -07:00
Christophe Bornet
f2a7dda4bd community[patch]: Use langchain-astradb for AstraDB doc loader (#19071)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:57:25 +00:00
Leonid Ganeline
a49ac55964 docs: providers update 8 (#19053)
Added missed providers. Added missed integrations. Fixed format.
2024-03-15 15:49:14 -07:00
Holt Skinner
cee03630d9 community[patch]: Add Blended Search Support to GoogleVertexAISearchRetriever (#19082)
https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es#multi-data-stores

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:39:31 +00:00
Eugene Yurtsev
0ddfe7fc9d langchain[patch]: make hub work with older langchainhub versions (#19076)
Make it work with older clients
2024-03-15 15:37:52 -07:00
William W Wang
0a784074d1 docs: Update llm_caching.ipynb (#19085) 2024-03-15 22:35:48 +00:00
William W Wang
6327be9048 docsUpdate azure_cosmos_db.ipynb (#19087)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:33:26 +00:00
Anubhav Madhav
553a520ab6 docs: Fixed Grammar in Considerations of Model I/O Concepts (#19091)
Fixed Grammar in Considerations of Model I/O Concepts documentation page
- Update concepts.mdx

Page Link:
https://python.langchain.com/docs/modules/model_io/concepts#considerations

- **Description:** Fixed Grammar in Considerations of Model I/O
Documentation Page
- **Issue:** "to work well with the model are you using" # "to work well
with the model you are using"
- **Dependencies:** None
- **Twitter handle:** @Anubhav_Madhav
(https://twitter.com/Anubhav_Madhav)


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:31:39 +00:00
Shotaro Sano
d647ff1a9a docs: Fix execution results of docs/docs/modules/data_connection/indexing.ipynb (#19112)
## Description
This PR addresses a documentation issue in the
[Indexing](https://python.langchain.com/docs/modules/data_connection/indexing)
page. Specifically, it corrects the execution results of the Jupyter
notebook under the
[Source](https://python.langchain.com/docs/modules/data_connection/indexing#source)
section, which were broken as detailed below.

## Problem
The execution results following the statement, `This should delete the
old versions of documents associated with doggy.txt source and replace
them with the new versions.`, appear to be incorrect, as described
below.

### Current Behavior
- For some reason, the `index` function fails to add the new content of
`doggy.txt`. Although it deletes the document objects associated with
the `doggy.txt` source, it does not add the objects in
`changed_doggy_docs`. Consequently, the execution result displays
`num_added: 0`.
- This unexpected behavior also impacts the results of
`vectorstore.similarity_search("dog", k=30)`, showing only the contents
of `kitty.txt`. It appears as though the contents of `doggy.txt` have
been completely removed from the index:

```
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```

### Expected Behavior
- The `index` function should successfully add the objects in
`changed_doggy_docs` after removing the old content of `doggy.txt`. The
anticipated execution result is `num_added: 2`.
- Subsequently, the modified content of `doggy.txt` should appear in the
results of `vectorstore.similarity_search("dog", k=30)` as follows:

```
[Document(page_content='woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='woof woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```

## Fix
I reran `docs/docs/modules/data_connection/indexing.ipynb` and have
included the diff in this PR.
2024-03-15 22:27:15 +00:00
case-k
ebc4a64f9e docs: fix databricks document url (#19096)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:25:11 +00:00
Guangdong Liu
4468e5bdbe docs: Add in code documentation to core Runnable with_fallbacks method (docs only) (#19104)
- Description: [a description of the change] Add in code documentation
to core Runnable with_fallbacks method (docs only)
- Issue: the issue #18804 
@eyurtsev PTAL
2024-03-15 15:21:10 -07:00
Guangdong Liu
cced3eb9bc community[patch]: Fix sparkllm embeddings api bug. (#19122)
- **Description:** Fix sparkllm embeddings api bug.
@baskaryan PTAL
2024-03-15 15:08:49 -07:00
samanhappy
b9c62fb905 docs: fix API link for BaseLoader (#19128)
The link to the BaseLoader API requires an update as it has been moved
into the `langchain_core` package.
2024-03-15 14:46:05 -07:00
kaijietti
c20aeef79a community[patch]: implement qdrant _aembed_query and use it in other async funcs (#19155)
`amax_marginal_relevance_search ` and `asimilarity_search_with_score `
should use an async version of `_embed_query `.
2024-03-15 21:20:12 +00:00
Kostas Botsas
527676a753 docs: Fix source column xata.ipynb (#19137)
Docs fix: replace column name search with source.

The Xata integration expects metadata column named "source".

The docs suggest the name "search", which if used, yields the following
error:

```
File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/xata.py", line 95, in _add_vectors
    raise Exception(f"Error adding vectors to Xata: {r.status_code} {r}")
Exception: Error adding vectors to Xata: 400 {'errors': [{'status': 400, 'message': 'invalid record: column [source]: column not found'}]}
```
2024-03-15 14:06:18 -07:00
Barun Amalkumar Halder
34d6f0557d community[patch] : publishes duration as milliseconds to Fiddler (#19166)
**Description:** Many LLM steps complete in sub-second duration, which
can lead to non-collection of duration field for Fiddler. This PR
updates duration from seconds to milliseconds.
**Issue:** [INTERNAL] FDL-17568
**Dependencies:** NA
**Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-03-15 14:04:56 -07:00
Eugene Yurtsev
745d2476a2 langchain: upgrade mypy (#19163)
Update mypy in langchain
2024-03-15 16:37:09 -04:00
Maxime Perrin
aa785fa6ec core[minor]: allow LLMs async streaming to fallback on sync streaming (#18960)
- **Description:** Handling fallbacks when calling async streaming for a
LLM that doesn't support it.
- **Issue:** #18920 
- **Twitter handle:**@maximeperrin_

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-03-15 16:06:50 -04:00
Erick Friis
caf47ab666 infra: run min version ci before integration tests (#18945) 2024-03-15 12:14:44 -07:00
Barun Amalkumar Halder
b551d49cf5 community[patch] : adds feedback and status for Fiddler callback handler events (#19157)
**Description:** This PR adds updates the fiddler events schema to also
pass user feedback, and llm status to fiddler
   **Tickets:** [INTERNAL] FDL-17559 
   **Dependencies:**  NA
   **Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-03-15 12:03:49 -07:00
Juan Felipe Arias
f5b9aedc48 community[patch]: add args_schema to sql_database tools for langGraph integration (#18595)
- **Description:** This modification adds pydantic input definition for
sql_database tools. This helps for function calling capability in
LangGraph. Since actions nodes will usually check for the args_schema
attribute on tools, This update should make these tools compatible with
it (only implemented on the InfoSQLDatabaseTool)
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** juanfe8881
2024-03-15 19:03:36 +00:00
fengjial
c922ea36cb community[minor]: Add Baidu VectorDB as vector store (#17997)
Co-authored-by: fengjialin <fengjialin@MacBook-Pro.local>
2024-03-15 19:01:58 +00:00
aditya thomas
190887c5cd docs: update the list of providers (#19012)
**Description:** Update the list of LangChain providers
**Issue:** Make the list of LangChain providers current
**Dependencies:** None
2024-03-15 12:00:24 -07:00
Erick Friis
bbe164ad28 docs: voyageai as provider (#19154) 2024-03-15 10:12:37 -07:00
Erick Friis
781aee0068 community, langchain, infra: revert store extended test deps outside of poetry (#19153)
Reverts langchain-ai/langchain#18995

Because it makes installing dependencies in python 3.11 extended testing
take 80 minutes
2024-03-15 17:10:47 +00:00
Leonid Kuligin
e3ff107e4f docs: updated google integration related imports in the documentation (#19131)
updated imports in the documentation for google vertex
2024-03-15 09:30:50 -04:00
Erick Friis
9e569d85a4 community, langchain, infra: store extended test deps outside of poetry (#18995)
poetry can't reliably handle resolving the number of optional "extended
test" dependencies we have. If we instead just rely on pip to install
extended test deps in CI, this isn't an issue.
2024-03-15 05:55:30 +00:00
Bagatur
191ddbc77e core[patch]: rc release 0.1.33-rc.1 (#19103) 2024-03-14 20:21:54 -07:00
Nuno Campos
508f75853c core[patch]: Change structured prompt lc id to match js (#19099) 2024-03-14 20:02:52 -07:00
Erick Friis
7ce81eb6f4 voyageai[patch]: init package (#19098)
Co-authored-by: fodizoltan <zoltan@conway.expert>
Co-authored-by: Yujie Qian <thomasq0809@gmail.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
2024-03-15 00:56:10 +00:00
Brace Sproul
5157b15446 ci[patch]: Set root dir to ./docs (#19102) 2024-03-14 17:55:04 -07:00
Brace Sproul
98cd8f673b docs[minor]ci[minor]: Add script & CI to check recurring links daily (#19100) 2024-03-14 17:42:22 -07:00
Asaf Joseph Gardin
4d7f6fa968 ai21[patch]: AI21 Labs Batch Support in Embeddings (#18633)
Description: Added support for batching when using AI21 Embeddings model
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-14 23:10:23 +00:00
Tomaz Bratanic
321db89e87 templates: Switch neo4j generation template to LLMGraphTransformer (#19024) 2024-03-14 16:00:42 -07:00
Erick Friis
d5cf360329 ibm[patch]: release 0.1.3 (#19094) 2024-03-14 15:59:42 -07:00
Mateusz Szewczyk
b15d150d22 ibm[patch]: add async tests, add tokenize support (#18898)
- **Description:** add async tests, add tokenize support
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
  - **Tag maintainer:** 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally -> 
Please make sure integration_tests passing locally -> 

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-14 22:57:05 +00:00
billytrend-cohere
7253b816cc community: Add support for cohere SDK v5 (keeps v4 backwards compatibility) (#19084)
- **Description:** Add support for cohere SDK v5 (keeps v4 backwards
compatibility)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-14 15:53:24 -07:00
Eugene Yurtsev
06165efb5b core[patch]: RunnablePassthrough transform to autoupgrade to AddableDict (#19051)
Follow up on https://github.com/langchain-ai/langchain/pull/18743 which
missed RunnablePassthrough

Issues:

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
2024-03-14 16:59:46 -04:00
Eugene Yurtsev
41e2f60cd2 Updated security policy (#19089)
Updated security policy
2024-03-14 20:58:47 +00:00
Eugene Yurtsev
6cdca4355d community[minor]: Revamp PGVector Filtering (#18992)
This PR makes the following updates in the pgvector database:

1. Use JSONB field for metadata instead of JSON
2. Update operator syntax to include required `$` prefix before the
operators (otherwise there will be name collisions with fields)
3. The change is non-breaking, old functionality is still the default,
but it will emit a deprecation warning
4. Previous functionality has bugs associated with comparisons due to
casting to text (so lexical ordering is used incorrectly for numeric
fields)
5. Adds an a GIN index on the JSONB field for more efficient querying
2024-03-14 16:56:00 -04:00
Bagatur
e276817e1d docs: fix vercel build script (#19090)
amazon linux 2023 doesn't have `amazon-linux-extras` but shoudl have python3.9 by default
2024-03-14 20:53:43 +00:00
Guangdong Liu
d4b025c812 code[patch]: Add in code documentation to core Runnable assign method (docs only) (#18951)
**PR message**: ***Delete this entire checklist*** and replace with
- **Description:** [a description of the change](docs: Add in code
documentation to core Runnable assign method)
    - **Issue:** the issue  #18804
2024-03-14 15:41:19 -04:00
Anthony Yang
688a5bd106 docs:fixed typo in streaming document (#19045)
Fixed typo in line 661 - from 'mimimize' to 'minimize

- [ ] **PR message**: 
- **Description:** Fixed typo in streaming document - change 'mimimize'
to 'minimize

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-14 19:38:53 +00:00
Bagatur
573f48e34d core[patch]: Release 0.1.32 (#19088) 2024-03-14 12:01:58 -07:00
YHW
69a8ef2693 core: Runnable pass kwargs to _astream_log_implementation in astream_log (#19055)
- **Description:** When calling the `_stream_log_implementation` from
the `astream_log` method in the `Runnable` class, it is not handing over
the `kwargs` argument. Therefore, even if i want to customize APIHandler
and implement additional features with additional arguments, it is not
possible. Conversely, the `astream_events` method normally handing over
the `kwargs` argument.
- **Issue:** https://github.com/langchain-ai/langchain/issues/19054
- **Dependencies:**
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

Co-authored-by: hyungwookyang <hyungwookyang@worksmobile.com>
2024-03-14 14:39:46 -04:00
Nuno Campos
751fb7de20 Add new beta StructuredPrompt (#19080)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-14 10:40:34 -07:00
Bagatur
0ae39ab30e docs: make links internal (#19063)
So they can be properly link checked
2024-03-14 16:22:56 +00:00
Anton Parkhomenko
ae73b9d839 community[patch]: Fix NotionDBLoader 400 Error by conditionally adding filter parameter (#19075)
- **Description:** This change fixes a bug where attempts to load data
from Notion using the NotionDBLoader resulted in a 400 Bad Request
error. The issue was traced to the unconditional addition of an empty
'filter' object in the request payload, which Notion's API does not
accept. The modification ensures that the 'filter' object is only
included in the payload when it is explicitly provided and not empty,
thus preventing the 400 error from occurring.
- **Issue:** Fixes
[#18009](https://github.com/langchain-ai/langchain/issues/18009)
- **Dependencies:** None
- **Twitter handle:** @gunnzolder

Co-authored-by: Anton Parkhomenko <anton@merge.rocks>
2024-03-14 13:56:57 +00:00
Erick Friis
2999d06938 docs: deprecate old airbyte loader docs (#19048) 2024-03-13 23:18:30 +00:00
Prakul
4c53e31377 docs: Updated index definition and reference to LangChain-MongoDB (#19047)
**Description:** 
Updates to LangChain-MongoDB documentation: updates to the Atlas vector
search index definition

**Issue:** 
NA

**Dependencies:** 
NA

**Twitter handle:** 
iprakul
2024-03-13 15:44:13 -07:00
Erick Friis
5e0c58f9c2 infra: update upload-artifact and download-artifact to v4 (#19044) 2024-03-13 20:08:29 +00:00
Tomaz Bratanic
e5e15c8d59 docs: Add graph construction docs (#18904) 2024-03-13 12:27:58 -07:00
Nuno Campos
2b7c3c548d core[minor]: Add Runnable.batch_as_completed (#17603)
This PR adds `batch as completed` method to the standard Runnable
interface. It takes in a list of inputs and yields the corresponding
outputs as the inputs are completed.
2024-03-13 11:18:02 -07:00
Erick Friis
71d0981f18 templates: fix rag-lancedb dep (#19010) 2024-03-13 04:36:24 +00:00
Erick Friis
74b2c0aa01 templates, cli: more security deps (#19006) 2024-03-12 20:48:56 -07:00
Erick Friis
9052d05442 template: bump more lockfiles (#19003)
- templates: bump lockfile deps
- x
2024-03-13 01:43:33 +00:00
Erick Friis
49f3cc0f6b templates: bump lockfile deps (#19001) 2024-03-13 01:25:45 +00:00
Erick Friis
2ffb2144a6 experimental[patch]: release 0.0.54 (#19000) 2024-03-13 00:38:46 +00:00
Erick Friis
873d06c009 langchain[patch]: release 0.1.12 (#18999) 2024-03-13 00:22:21 +00:00
Leonid Ganeline
9c8523b529 community[patch]: flattening imports 3 (#18939)
@eyurtsev
2024-03-12 15:18:54 -07:00
Erick Friis
af50f21765 community[patch]: release 0.0.28 (#18993) 2024-03-12 21:55:29 +00:00
Erick Friis
4881bb669c core[patch]: release 0.1.31 (#18989) 2024-03-12 19:45:21 +00:00
Erick Friis
a29e8d8594 elasticsearch[patch]: fix integration tests for release (#18980) 2024-03-12 10:22:07 -07:00
Erick Friis
0d1f6c417c elasticsearch[patch]: release 0.1.1 (#18978) 2024-03-12 16:46:22 +00:00
Max Jakob
911ccf9aa6 docs: elasticsearch retriever (#18965)
Add documentation notebook for `ElasticsearchRetriever`.

## Dependencies
- [ ] Release new `langchain-elasticsearch` version 0.2.0 that includes
`ElasticsearchRetriever`
2024-03-12 09:42:36 -07:00
Dobiichi-Origami
471f2ed40a community[patch]: re-arrange the addtional_kwargs of returned qianfan structure to avoid _merge_dict issue (#18889)
fix issue: https://github.com/langchain-ai/langchain/issues/18441
PTAL, thanks
@baskaryan, @efriis, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-12 05:43:56 +00:00
Naman Jain
75122646b5 core[patch]: fixed circular dependency with json schema (#18657)
**Description:** Circular dependencies when parsing references leading
to `RecursionError: maximum recursion depth exceeded` issue. This PR
address the issue by handling previously seen refs as in any typical DFS
to avoid infinite depths.

**Issue:** https://github.com/langchain-ai/langchain/issues/12163

 **Twitter handle:** https://twitter.com/theBhulawat 


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-12 05:42:45 +00:00
Tymofii
0bec1f6877 commnity[patch]: refactor code for faiss vectorstore, update faiss vectorstore documentation (#18092)
**Description:** Refactor code of FAISS vectorcstore and update the
related documentation.
Details: 
 - replace `.format()` with f-strings for strings formatting;
- refactor definition of a filtering function to make code more readable
and more flexible;
- slightly improve efficiency of
`max_marginal_relevance_search_with_score_by_vector` method by removing
unnecessary looping over the same elements;
- slightly improve efficiency of `delete` method by using set data
structure for checking if the element was already deleted;

**Issue:** fix small inconsistency in the documentation (the old example
was incorrect and unappliable to faiss vectorstore)

**Dependencies:** basic langchain-community dependencies and `faiss`
(for CPU or for GPU)

**Twitter handle:** antonenkodev
2024-03-11 22:33:03 -07:00
Roshan Santhosh
acf1ecc081 langchain[patch]: update llm_router.py (#18865)
Issue : _call method of LLMRouterChain uses predict_and_parse, which is
slated for deprecation.

Description : Instead of using predict_and_parse, this replaces it with
individual predict and parse functions.
2024-03-11 22:30:07 -07:00
Bagatur
18de77cc8c core[minor]: add streaming support to OAI tool parsers (#18940)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-11 21:53:56 -07:00
Bagatur
e0e688a277 core[minor]: generation info on msg (#18592)
related to #16403 #17188
2024-03-12 04:43:17 +00:00
Tomaz Bratanic
cda43c5a11 experimental[patch]: Fix LLM graph transformer default prompt (#18856)
Some LLMs do not allow multiple user messages in sequence.
2024-03-11 20:11:52 -07:00
Bagatur
19721246f5 core[patch]: support labeled json schema as tools (#18935) 2024-03-11 19:51:35 -07:00
Jacob Lee
950ab056eb templates[patch]: Update pirate-speak deps, add messages placeholder (#18949)
CC @efriis
2024-03-11 19:20:30 -07:00
Leonid Ganeline
fad308a764 docs: providers update 2 (#18407)
Formatted pages into a consistent form. Added descriptions and links
when needed.
2024-03-11 18:35:37 -07:00
Erick Friis
239f0a615e templates: redis multi-modal multi-vector rag (#18946)
---------

Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>
2024-03-12 00:32:25 +00:00
Bagatur
915c1f8673 infra: rm api build CI (#18944) 2024-03-11 16:12:34 -07:00
Brace Sproul
578e67c017 docs[patch]: properly load/use env vars (#18942) 2024-03-11 15:38:05 -07:00
Erick Friis
0d888a65cb core[patch]: move some attr/methods to BaseLanguageModel (#18936)
Cleans up some shared code between `BaseLLM` and `BaseChatModel`. One
functional difference to make it more consistent (see comment)
2024-03-11 14:59:45 -07:00
Brace Sproul
4ff6aa5c78 docs[minor]: Swap gtag for supabase (#18937)
Added deps:
- `@supabase/supabase-js` - for sending inserts
- `supabase` - dev dep, for generating types via cli
- `dotenv` for loading env vars

Added script:
- `yarn gen` - will auto generate the database schema types using the
supabase CLI. Not necessary for development, but is useful. Requires
authing with the supabase CLI (will error out w/ instructions if you're
not authed).

Added functionality:
- pulls users IP address (using a free endpoint: `https://api.ipify.org`
so we can filter out abuse down the line)

TODO:
- [x] add env vars to vercel
2024-03-11 14:23:12 -07:00
aditya thomas
5c2f7e6b2b partners[openai]: update the docstring of OpenAI, OpenAIEmbeddings and ChatOpenAI classes (#18908)
**Description:** Update the docstring of OpenAI, OpenAIEmbeddings and
ChatOpenAI classes
**Issue:** Update import module paths to the current LangChain API
**Dependencies:** None
**Lint and test**: `make format` and `make lint` were run

This incorporates the review comments from langchain-ai/langchain#18637
which I closed due to an issue I had in updating that pr branch

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-11 20:48:54 +00:00
Leonid Ganeline
11195cfa42 community[patch]: speed up import times in the community package (#18928)
This PR speeds up import times in the community package
2024-03-11 16:37:36 -04:00
fjk
a7fc731720 docs: change sparkllm spark_app_url to spark_api_url (#18000)
community: fix - change sparkllm spark_app_url to spark_api_url

- **Description:** 
- Change the variable name from `sparkllm spark_app_url` to
`spark_api_url` in the community package.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-11 20:01:30 +00:00
Sevin F. Varoglu
8639624d40 docs: update OctoAI doc (#18913)
This PR updates the OctoAI LLM doc.
2024-03-11 13:01:10 -07:00
Alexander Kozlov
a7500ab0fb docs: Update huggingface pipelines notebook (#18801) 2024-03-11 20:00:31 +00:00
Conroy Whitney
96d7fe0f85 docs: Change saved/configured chain variable name (#18863)
**Description:**
Variable name was `openai_poem` but it didn't pass in the `"prompt":
"poem"` config, so the examples were showing a joke being returned from
a variable called `*_poem`.

We could have gone one of two ways:

1. Updating the config line and the output line, or
2. Updating the variable name

The latter seemed simpler, so that's what I went with. But I'd be glad
to re-do this PR if you prefer the former.

Thanks for everything, y'all. You rock 🤘

**Issue:** N/A

**Dependencies:** N/A

**Twitter handle:** `conroywhitney`
2024-03-11 12:59:24 -07:00
aditya thomas
8544f748f2 community[patch]: update AnthropicLLM deprecation message (#18869)
**Description:** Update AnthropicLLM deprecation message import path for
ChatAnthropic
**Issue:** Incorrect import path in deprecation message
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` were run
2024-03-11 12:59:10 -07:00
Virat Singh
cafffe8a21 community: Add PolygonAggregates tool (#18882)
**Description:**
In this PR, I am adding a `PolygonAggregates` tool, which can be used to
get historical stock price data (called aggregates by Polygon) for a
given ticker.

Polygon
[docs](https://polygon.io/docs/stocks/get_v2_aggs_ticker__stocksticker__range__multiplier___timespan___from___to)
for this endpoint.

**Twitter**: 
[@virattt](https://twitter.com/virattt)
2024-03-11 11:58:10 -07:00
Bagatur
2d172181e0 Revert "update api build script (#18930)" (#18931) 2024-03-11 11:47:18 -07:00
Bagatur
def329b5f2 update api build script (#18930) 2024-03-11 11:44:37 -07:00
Bagatur
c24c871d88 docs: update readme diagram (#18929) 2024-03-11 11:17:45 -07:00
Bagatur
34284c25d4 docs: turn on link check (#18924) 2024-03-11 10:50:39 -07:00
Erick Friis
93ef8ead0b mongodb[patch]: fix core dep (#18926) 2024-03-11 10:27:29 -07:00
Mohammad Mohtashim
43db4cd20e core[major]: On Tool End Observation Casting Fix (#18798)
This PR updates the on_tool_end handlers to return the raw output from the tool instead of casting it to a string. 

This is technically a breaking change, though it's impact is expected to be somewhat minimal. It will fix behavior in `astream_events` as well.

Fixes the following issue #18760 raised by @eyurtsev

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-11 10:59:04 -04:00
Prashanth Rao
a96a6e0f2c docs: Fix typo and add KùzuDB to graphs docs (#18915)
- **Description:** Adding Kùzu (an embedded graph DB that uses Cypher)
to the graph docs, and fixing a typo
 - **Issue:** docs update
2024-03-11 14:42:46 +00:00
aditya thomas
3d15498612 docs: Update callbacks documentation (#18899)
**Description:** Update callbacks documentation
**Issue:** Change some module imports and a method invocation to reflect
the current LangChainAPI
**Dependencies:** None
2024-03-11 10:40:11 -04:00
Massimiliano Pronesti
8113d612bb community[patch]: support modin document loader (#18866)
Langchain community document loaders support `pyspark`, `polars`, and
`pandas` dataframes but not `modin`'s. This PR addresses this point.
2024-03-10 18:40:04 -07:00
Leonid Ganeline
dee256ef5a docs: platforms/google fixed broken links (#18878)
Several links are broken. Fixed them.
2024-03-10 18:19:43 -07:00
Pol Ruiz Farre
a7f63d8cb4 community[patch]: Fix BasePDFLoader suffix for s3 presigned urls (#18844)
BasePDFLoader doesn't parse the suffix of the file correctly when
parsing S3 presigned urls. This fix enables the proper detection and
parsing of S3 presigned URLs to prevent errors such as `OSError: [Errno
36] File name too long`.
No additional dependencies required.
2024-03-11 00:58:51 +00:00
Joshua Carroll
ddaf9de169 community: Fix bug with StreamlitChatMessageHistory (#18834)
- **Description:** Fix Streamlit bug which was introduced by
https://github.com/langchain-ai/langchain/pull/18250, update integration
test
- **Issue:** https://github.com/langchain-ai/langchain/issues/18684
- **Dependencies:** None
2024-03-09 13:42:22 -08:00
Kushagra
5fcbe9dd2a community[patch]: documented the feature to filter documents in MongoDBloader (#18842)
"community[docs]: documented the feature to filter documents in
MongoDBloader"
- Description: documented the feature to filter documents in
MongoDBloader
- Feature: the feature
https://github.com/langchain-ai/langchain/discussions/18251
- Dependencies: No
- Twitter handle: https://twitter.com/im_Kushagra
2024-03-09 13:41:34 -08:00
Ikko Eltociear Ashimine
c3580d3c64 docs: fix typo in google_cloud_sql_mysql.ipynb (#18847)
arbitary -> arbitrary
2024-03-09 13:39:36 -08:00
Luan Fernandes
5a006f7264 docs: update typo in docs about agent tools (#18850)
fixes #18849
2024-03-09 13:39:18 -08:00
Leonid Ganeline
3dabd3f214 docs: platform pages update (#17836)
`Integrations` platform page ToC-s: sections there are placed without
order. For example, the
[google](https://python.langchain.com/docs/integrations/platforms/google)
page. The `LLM` section is not the first section, as it is in the
[Components](https://python.langchain.com/docs/integrations/components)
menu.
Updates:
* reorganized the page sections so they follow the Component menu order.
* fixed names for the section names: "Text Embedding Models" ->
"Embedding Models"
2024-03-09 13:34:33 -08:00
Leonid Ganeline
07c518ad3e docs: providers update 4 (#18540)
Created the `facebook` page from `facebook_faiss` and `facebook_chat`
pages. Added another Facebook integrations into this page.
Updated `discord` page.
2024-03-09 13:30:48 -08:00
Leonid Ganeline
9c0f84ae95 docs: providers update 6 (#18610)
Cleaned up the `Integrations/Components/Memory` navbar by shortening the
page titles. Updated page titles and file names to consistent formats.
2024-03-09 13:29:44 -08:00
Tomaz Bratanic
a28be31a96 Switch to md5 for deduplication in neo4j integrations (#18846)
Deduplicate documents using MD5 of the page_content. Also allows for
custom deduplication with graph ingestion method by providing metadata
id attribute

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-09 13:28:55 -08:00
Tomaz Bratanic
246724faab LLM graph transformer prompt engineering (#18843)
A bit of prompt engineering to improve results
2024-03-09 11:27:16 -08:00
Tomaz Bratanic
e778d60aec Fix broken link in graph docs (#18837) 2024-03-09 10:40:33 -08:00
Erick Friis
b48865bf94 langchain[patch]: attach hub metadata (#18830) 2024-03-08 18:40:49 -08:00
Ammar
34b31a8cc7 core: add in-code docs for RunnableAssign class (#18826)
**Description:** Improves the docstring for `RunnableAssign` by
providing a concise description and a self-contained code example.
  **Issue:**  #18803
2024-03-09 02:04:52 +00:00
Leonid Ganeline
5d65b47e41 docs: chat menu item as icon (#18806)
Update chat icon in docs
2024-03-08 21:00:21 -05:00
Leonid Ganeline
476d6dc596 community[patch]: Use getattr for toolkits imports (#18825)
This will preserve the namespace, without actually loading the underlying packages on init.
2024-03-08 20:54:28 -05:00
Erick Friis
bbb609ac9d core[patch]: fix arbitrary config keys (#18827) 2024-03-08 17:35:13 -08:00
Luis Antonio Vieira Junior
67c880af74 community[patch]: adding linearization config to AmazonTextractPDFLoader (#17489)
- **Description:** Adding an optional parameter `linearization_config`
to the `AmazonTextractPDFLoader` so the caller can define how the output
will be linearized, instead of forcing a predefined set of linearization
configs. It will still have a default configuration as this will be an
optional parameter.
- **Issue:** #17457
- **Dependencies:** The same ones that already exist for
`AmazonTextractPDFLoader`
- **Twitter handle:** [@lvieirajr19](https://twitter.com/lvieirajr19)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 17:25:22 -08:00
Anis ZAKARI
37e89ba5b1 community[patch]: Bedrock add support for mistral models (#18756)
*Description**: My previous
[PR](https://github.com/langchain-ai/langchain/pull/18521) was
mistakenly closed, so I am reopening this one. Context: AWS released two
Mistral models on Bedrock last Friday (March 1, 2024). This PR includes
some code adjustments to ensure their compatibility with the Bedrock
class.

---------

Co-authored-by: Anis ZAKARI <anis.zakari@hymaia.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-09 01:20:38 +00:00
Alexander Dicke
66576948e0 experimental[minor]: adds mixtral wrapper (#17423)
**Description:** Adds a chat wrapper for Mixtral models using the
[prompt
template](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1#instruction-format).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 17:14:23 -08:00
Erick Friis
4f4300723b docs: pinecone client version note (#17491) 2024-03-08 17:09:17 -08:00
Keith Chan
914af69b44 community[patch]: Update azuresearch vectorstore from_texts() method to include fields argument (#17661)
- **Description:** Update azuresearch vectorstore from_texts() method to
include fields argument, necessary for creating an Azure AI Search index
with custom fields.
- **Issue:** Currently index fields are fixed to default fields if Azure
Search index is created using from_texts() method
- **Dependencies:** None
- **Twitter handle:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 17:05:35 -08:00
al1p
46f0cea2b9 community[patch][: improved the suffix prompt to avoid loop (#17791)
Small improvement to the openapi prompt.
The agent was not finding the server base URL (looping through all
nodes). This small change narrows the search and enables finding the url
faster.

No dependency 

Twitter : @al1pra
2024-03-08 16:53:09 -08:00
Dmitry Kankalovich
f5117e907d openai[patch]: Proper example for AzureOpenAI usage in error message (#17798)
# Proper example for AzureOpenAI usage in error message

The original error message is wrong in part of a usage example it gives.
Corrected to the right one.

Co-authored-by: Dzmitry Kankalovich <dzmitry_kankalovich@epam.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:52:55 -08:00
Pranav Agarwal
bd9b5dc2f3 docs: Updating cookbook README for amazon personalize (#17854)
This PR is a successor to this PR -
https://github.com/langchain-ai/langchain/pull/17436
This PR updates the cookbook README with the notebook so that it is
available on langchain docs for discoverability.

cc: @baskaryan, @3coins

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:52:36 -08:00
AtomicVar
23e62f8f8d docs: fix lists display issue (#17911)
**Description:** Fix lists display issues in **Docs > Use Cases > Q&A
with RAG > Quickstart**.

In essence, this PR changes:

```markdown
Some paragraph.
- Item a.
- Item b.
```

to:

```markdown
Some paragraph.

- Item a.
- Item b.
```

There needs an extra empty line to make the list rendered properly.

FYI, the old version is displayed not properly as:

<img width="856" alt="image"
src="https://github.com/langchain-ai/langchain/assets/22856433/65202577-8ea2-47c6-b310-39bf42796fac">

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:52:16 -08:00
Théo LEBRUN
cf94091cd0 community[patch]: Skip nested directories when using S3DirectoryLoader (#17829)
- **Description:** `S3DirectoryLoader` is failing if prefix is a folder
(ex: `my_folder/`) because `S3FileLoader` will try to load that folder
and will fail. This PR skip nested directories so prefix can be set to
folder instead of `my_folder/files_prefix`.
- **Issue:**
  - #11917
  - #6535
  - #4326
- **Dependencies:** none
- **Twitter handle:** @Falydoor


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-03-08 16:50:58 -08:00
Venkatesan
7a18b63dbf community[patch]: Mongo index creation (#17748)
- [ ] Title: Mongodb: MongoDB connection performance improvement. 
- [ ] Message: 
- **Description:** I made collection index_creation as optional. Index
Creation is one time process.
- **Issue:** MongoDBChatMessageHistory class object is attempting to
create an index during connection, causing each request to take longer
than usual. This should be optional with a parameter.
    - **Dependencies:** N/A
    - **Branch to be checked:** origin/mongo_index_creation

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:43:17 -08:00
wt3639
5b5b37a999 community[patch]: Add embedding instruction to HuggingFaceBgeEmbeddings (#18017)
- **Description:** Add embedding instruction to
HuggingFaceBgeEmbeddings, so that it can be compatible with nomic and
other models that need embedding instruction.

---------

Co-authored-by: Tao Wu <tao.wu@rwth-aachen.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:39:29 -08:00
Brace Sproul
9c218d0154 docs[patch]: Update how GA4 is collected (#18821)
There's some issue/setting with the current python GA4 app. I created a
new one just for feedback.
2024-03-08 14:32:40 -08:00
Erick Friis
a8de6d1533 anthropic[patch]: integration test update (#18823) 2024-03-08 13:47:31 -08:00
wewebber-merlin
d1f5bc4906 anthropic[patch]: add kwargs to format_output base (#18715)
_generate() and _agenerate() both accept **kwargs, then pass them on to
_format_output; but _format_output doesn't accept **kwargs. Attempting
to pass, e.g.,

     timeout=50

to _generate (or invoke()) results in a TypeError.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-08 21:47:21 +00:00
Erick Friis
aa7bce6b13 anthropic[patch]: release 0.1.4 (#18822) 2024-03-08 21:34:47 +00:00
Erick Friis
a5bcddc738 anthropic[patch]: streaming param (#18819) 2024-03-08 13:32:57 -08:00
Erick Friis
8c0b215c02 anthropic[patch]: fix format output args (#18816) 2024-03-08 12:34:11 -08:00
Ishani Vyas
2b0cbd65ba community[patch]: Add Passio Nutrition AI Food Search Tool to Community Package (#18278)
## Add Passio Nutrition AI Food Search Tool to Community Package

### Description
We propose adding a new tool to the `community` package, enabling
integration with Passio Nutrition AI for food search functionality. This
tool will provide a simple interface for retrieving nutrition facts
through the Passio Nutrition AI API, simplifying user access to
nutrition data based on food search queries.

### Implementation Details
- **Class Structure:** Implement `NutritionAI`, extending `BaseTool`. It
includes an `_run` method that accepts a query string and, optionally, a
`CallbackManagerForToolRun`.
- **API Integration:** Use `NutritionAIAPI` for the API wrapper,
encapsulating all interactions with the Passio Nutrition AI and
providing a clean API interface.
- **Error Handling:** Implement comprehensive error handling for API
request failures.

### Expected Outcome
- **User Benefits:** Enable easy querying of nutrition facts from Passio
Nutrition AI, enhancing the utility of the `langchain_community` package
for nutrition-related projects.
- **Functionality:** Provide a straightforward method for integrating
nutrition information retrieval into users' applications.

### Dependencies
- `langchain_core` for base tooling support
- `pydantic` for data validation and settings management
- Consider `requests` or another HTTP client library if not covered by
`NutritionAIAPI`.

### Tests and Documentation
- **Unit Tests:** Include tests that mock network interactions to ensure
tool reliability without external API dependency.
- **Documentation:** Create an example notebook in
`docs/docs/integrations/tools/passio_nutrition_ai.ipynb` showing usage,
setup, and example queries.

### Contribution Guidelines Compliance
- Adhere to the project's linting and formatting standards (`make
format`, `make lint`, `make test`).
- Ensure compliance with LangChain's contribution guidelines,
particularly around dependency management and package modifications.

### Additional Notes
- Aim for the tool to be a lightweight, focused addition, not
introducing significant new dependencies or complexity.
- Potential future enhancements could include caching for common queries
to improve performance.

### Twitter Handle
- Here is our Passio AI [twitter handle](https://twitter.com/@passio_ai)
where we announce our products.


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-08 20:33:22 +00:00
Aaron Jimenez
bd9f98a20b docs: Fix typo in modules/chains.ipynb (#18808)
**Description:**  

Fix a minor typo in `modules/chains.ipynb`.
 
- **Issue:** 
    fixes #17851
2024-03-08 12:09:20 -08:00
Kushagra
b1f22bf76c community[minor]: added a feature to filter documents in Mongoloader (#18253)
"community: added a feature to filter documents in Mongoloader"
- **Description:** added a feature to filter documents in Mongoloader
    - **Feature:** the feature #18251
    - **Dependencies:** No
    - **Twitter handle:** https://twitter.com/im_Kushagra
2024-03-08 12:06:35 -08:00
Tomaz Bratanic
c0bdd4d45b docs: Add main graph documentation (#18021)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 20:03:03 +00:00
Leonid Ganeline
7c8c4e5743 docs: providers update 7 (#18620)
Added missed providers. Added missed integrations. Formatted to the
consistent form. Fixed outdated imports.
2024-03-08 12:00:27 -08:00
Eugene Yurtsev
1f50274df7 community[patch]: Add pgvector to docker compose and update settings used in integration test (#18815) 2024-03-08 14:39:28 -05:00
Erick Friis
ad29806255 nvidia-trt, nvidia-ai-endpoints: move to repo (#18814)
NVIDIA maintained in https://github.com/langchain-ai/langchain-nvidia
2024-03-08 19:30:50 +00:00
Christophe Bornet
e54a49b697 community[minor]: Add lazy_table_reflection param to SqlDatabase (#18742)
For some DBs with lots of tables, reflection of all the tables can take
very long. So this change will make the tables be reflected lazily when
get_table_info() is called and `lazy_table_reflection` is True.
2024-03-08 14:10:23 -05:00
Christophe Bornet
ead2a74806 community: Implement lazy_load() for JSONLoader (#18643)
Covered by `tests/unit_tests/document_loaders/test_json_loader.py`
2024-03-08 13:58:17 -05:00
Erick Friis
a88f62ec3c langchain[patch]: getattr import from langchain.chains (#18160) 2024-03-08 10:36:14 -08:00
kAIto47802
ff70cc4e80 docs: fix typo (#18810)
Fixed typo in docs
2024-03-08 13:28:17 -05:00
Eugene Yurtsev
cdfb5b4ca1 core[minor]: Chat Models to fallback astream to fallback on sync stream if available (#18748)
Allows all chat models that implement _stream, but not _astream to still have async streaming to work.

Amongst other things this should resolve issues with streaming community model implementations through langserve since langserve is exclusively async.
2024-03-08 13:27:29 -05:00
Leonid Ganeline
3624f56ccb docs: update imports of retrievers to use langchain_community (#18707)
Updated `langchain` imports to `langchain_community`.
2024-03-08 13:04:38 -05:00
Leonid Ganeline
48eed86931 docs: update imports of memory to use langchain_community (#18689)
Refactored imports from `langchain` to `langchain_community` whenever it
is applicable
2024-03-08 13:02:31 -05:00
aditya thomas
e00c1ff2b0 infra: ChatOpenAI unit tests for invoke() and ainvoke() (#18792)
**Description:** Replacing the deprecated predict() and apredict()
methods in the unit tests
**Issue:** Not applicable
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` have been
run
2024-03-08 09:48:38 -08:00
aditya thomas
a35203b164 docs: (minor) update to anthropic doc (#18794)
**Description:** Minor update to Anthropic documentation
**Issue:** Not applicable
**Dependencies:** None
**Lint and test**: `make format` and `make lint` was done
2024-03-08 09:48:04 -08:00
Bagatur
3e29c04213 core[minor]: add BaseMessage.response_metadata (#18699) 2024-03-08 09:35:56 -08:00
standby24x7
67d48ea600 docs:Update function "run" to "invoke" in llm_bash.ipynb (#18663)
This path updates function "run" to "invoke" in llm_bash.ipynb. 
Without this path, you see following warning.

LangChainDeprecationWarning: The function `run` was deprecated in
LangChain 0.1.0
and will be removed in 0.2.0. Use invoke instead.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-08 09:35:36 -08:00
Bagatur
bc6249c889 langchain[patch]: runnable agent streaming param (#18761)
Usage:

```python
agent = RunnableAgent(runnable=runnable, .., stream_runnable=False)
```
or for convenience
```python
agent_executor = AgentExecutor(agent=agent, ..., stream_runnable=False)
```
2024-03-07 20:53:53 -08:00
Tomaz Bratanic
c8c592d3f1 experimental[minor]: Add LLM graph transformer (#18733)
Add a class that constructs knowledge graphs based on text using an LLM.
2024-03-07 20:52:53 -08:00
Phat Vo
3ecb903d49 community[patch] : Tidy up and update Clarifai SDK functions (#18314)
Description :
* Tidy up, add missing docstring and fix unused params
* Enable using session token
2024-03-07 19:47:44 -08:00
Paul Sanders
93b87f2bfb docs: Fix typo (#18545)
Fixing a minor typo in the package name.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-07 19:40:42 -08:00
Aaron Jimenez
fcf6213c22 docs: Fix link to HF TEI in text_embeddings_inference.ipynb (#18682)
- [ ] **PR title:** docs: Fix link to HF TEI in
text_embeddings_inference.ipynb
 
- [ ] **PR message:**

- **Description:** Fix the link to [Hugging Face Text Embeddings
Inference
(TEI)](https://huggingface.co/docs/text-embeddings-inference/index) in
text_embeddings_inference.ipynb
   - **Issue:** Fix #18576
2024-03-07 19:38:39 -08:00
Max Jakob
61a2eba081 elasticsearch[patch]: add top-level import, remove obsolete dependency (#18644)
Make `ElasticsearchRetriever` available as top-level import.

The `langchain` package depends on `langchain-community` so we do not
need to depend on it explicitly.
2024-03-07 19:38:31 -08:00
Averi Kitsch
8accee57a9 docs: update Google Cloud database integration docs (#18711)
**Description:** update Google Cloud database integration docs
 **Issue:** NA
**Dependencies:** NA
2024-03-07 19:36:00 -08:00
Tomaz Bratanic
010a234f1e docs: Fix diffbot graph transformer description (#18736)
The previous docstring was invalid
2024-03-07 19:25:41 -08:00
Jan Nissen
b8922480ed core[patch]: improve PydanticOutputParser typing (#18740)
This PR adds generic typing to `PydanticOutputParser` so we get a typed
output from `.parse` instead of `Any`. It should provide a better DX by
way of Intellisense and for anyone strictly typing.

Pre-change:

![Screenshot 2024-03-07 at 10 22
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/fd22dde0-9fdc-4283-b283-4c98f0bc46e5)

Post-change:

![Screenshot 2024-03-07 at 10 26
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/7e23d2b7-8f8c-494f-80b3-187530a173ee)

I haven't dug too deep, but I think a similar change could probably be
added to `JsonOutputParser` so we don't have to pull up `.parse`.

Co-authored-by: Jan Nissen <jan23@gmail.com>
2024-03-07 19:25:24 -08:00
Massimiliano Pronesti
3b975c6ebe experimental[minor]: add support for modin in pandas agent (#18749)
Added support for Intel's
[modin](https://github.com/modin-project/modin) in
`create_pandas_dataframe_agent`.
2024-03-07 19:23:07 -08:00
Tomaz Bratanic
4bfe888717 comunity[patch]: Fix neo4j sanitizing values (#18750)
Fixing sanitization for when deeply nested lists appear
2024-03-07 19:21:52 -08:00
Ian
7f504c1f81 docs: Improve the tidb vector store notebook (#18773)
Remove redundant useless content, and fix some minor oversight
2024-03-07 19:15:55 -08:00
Eugene Yurtsev
6caceb5473 core[patch]: Automatic upgrade to AddableDict in transform and atransform (#18743)
Automatic upgrade to transform and atransform

Closes: 

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
2024-03-07 21:23:12 -05:00
Yunmo Koo
fee6f983ef community[minor]: Integration for Friendli LLM and ChatFriendli ChatModel. (#17913)
## Description
- Add [Friendli](https://friendli.ai/) integration for `Friendli` LLM
and `ChatFriendli` chat model.
- Unit tests and integration tests corresponding to this change are
added.
- Documentations corresponding to this change are added.

## Dependencies
- Optional dependency
[`friendli-client`](https://pypi.org/project/friendli-client/) package
is added only for those who use `Frienldi` or `ChatFriendli` model.

## Twitter handle
- https://twitter.com/friendliai
2024-03-08 02:20:47 +00:00
Smit Parmar
aed46cd6f2 community[patch]: Added support for filter out AWS Kendra search by score confidence (#12920)
**Description:** It will add support for filter out kendra search by
score confidence which will make result more accurate.
    For example
   ```
retriever = AmazonKendraRetriever(
        index_id=kendra_index_id, top_k=5, region_name=region,
        score_confidence="HIGH"
    )
```
Result will not include the records which has score confidence "LOW" or "MEDIUM". 
Relevant docs 
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/query.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/retrieve.html

 **Issue:** the issue # it resolve #11801 
**twitter:** [@SmitCode](https://twitter.com/SmitCode)
2024-03-07 17:28:09 -08:00
Ian
390ef6abe3 community[minor]: Add Initial Support for TiDB Vector Store (#15796)
This pull request introduces initial support for the TiDB vector store.
The current version is basic, laying the foundation for the vector store
integration. While this implementation provides the essential features,
we plan to expand and improve the TiDB vector store support with
additional enhancements in future updates.

Upcoming Enhancements:
* Support for Vector Index Creation: To enhance the efficiency and
performance of the vector store.
* Support for max marginal relevance search. 
* Customized Table Structure Support: Recognizing the need for
flexibility, we plan for more tailored and efficient data store
solutions.

Simple use case exmaple

```python
from typing import List, Tuple
from langchain.docstore.document import Document
from langchain_community.vectorstores import TiDBVectorStore
from langchain_openai import OpenAIEmbeddings

db = TiDBVectorStore.from_texts(
    embedding=embeddings,
    texts=['Andrew like eating oranges', 'Alexandra is from England', 'Ketanji Brown Jackson is a judge'],
    table_name="tidb_vector_langchain",
    connection_string=tidb_connection_url,
    distance_strategy="cosine",
)

query = "Can you tell me about Alexandra?"
docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)
```
2024-03-07 17:18:20 -08:00
Bagatur
3b1eb1f828 community[patch]: chat hf typing fix (#18693) 2024-03-07 17:06:38 -08:00
Eugene Yurtsev
1e1cac50d8 Docs: remove sales from security (#18762)
Remove sales from security
2024-03-07 17:35:46 -05:00
Jib
d60e93b6ae langchain-mongodb: Standardize mongodb collection/index names in tests (#18755)
## **Description:**
MongoDB integration tests link to a provided Atlas Cluster. We have very
stringent permissions set against the cluster provided. In order to make
it easier to track and isolate the collections each test gets run
against, we've updated the collection names to map the test file name.
i.e. `langchain_{filename}` => `langchain_test_vectorstores`

Fixes integration test results

![image](https://github.com/langchain-ai/langchain/assets/2887713/41f911b9-55f7-4fe4-9134-5514b82009f9)

## **Dependencies:** 
Provided MONGODB_ATLAS_URI

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

cc: @shaneharvey, @blink1073 , @NoahStapp , @caseyclements
2024-03-07 17:16:04 -05:00
Eugene Yurtsev
ca299a8e08 Docs: Add custom parsing documentation and extending langchain (#18331)
* Added extending langchain.mdx -- we'll need to add links as we add
more custom documentation
* Added partial documentation about parsers
2024-03-07 16:30:57 -05:00
Eugene Yurtsev
8c71f92cb2 core: upgrade mypy to recent mypy (#18753)
Testing this works per package on CI
2024-03-07 15:25:19 -05:00
Eugene Yurtsev
e188d4ecb0 Add dangerous parameter to requests tool (#18697)
The tools are already documented as dangerous. Not clear whether adding
an opt-in parameter is necessary or not
2024-03-07 15:10:56 -05:00
Leonid Ganeline
dad949eb99 docs: update imports of adapters to use langchain_community (#18751)
Updated imports from `langchain` to `langchain_community`
2024-03-07 15:04:25 -05:00
Erick Friis
fcaa9cf2f1 community[patch]: deprecate community anthropic (#18745) 2024-03-07 13:51:55 -05:00
Erick Friis
1beb84b061 community[patch]: move pdf text tests to integration (#18746) 2024-03-07 10:34:22 -08:00
Christophe Bornet
4a7d73b39d community: If load() has been overridden, use it in default lazy_load() (#18690) 2024-03-07 11:52:19 -05:00
Christophe Bornet
6cd7607816 community[patch]: Implement lazy_load() for MHTMLLoader (#18648)
Covered by `tests/unit_tests/document_loaders/test_mhtml.py`
2024-03-07 11:50:18 -05:00
axiangcoding
9745b5894d community[patch]: Chroma use uuid4 instead of uuid1 to generate random ids (#18723)
- **Description:** Chroma use uuid4 instead of uuid1 as random ids. Use
uuid1 may leak mac address, changing to uuid4 will not cause other
effects.
  - **Issue:** None
  - **Dependencies:** None
  - **Twitter handle:** None
2024-03-07 11:48:25 -05:00
Leonid Ganeline
1af2130ff7 docs: update imports of tools to use langchain_community (#18705)
Updated imports from `langchain` to `langchain_community`.
2024-03-07 11:46:09 -05:00
Guangdong Liu
ced5e7bae7 community[patch]: Fix sparkllm authentication problem. (#18651)
- **Description:** fix sparkllm authentication problem.The current
timestamp is in RFC1123 format. The time deviation must be controlled
within 300s. I changed to re-obtain the url every time I ask a question.
https://www.xfyun.cn/doc/spark/general_url_authentication.html#_1-2-%E9%89%B4%E6%9D%83%E5%8F%82%E6%95%B0
2024-03-06 18:43:16 -08:00
Erick Friis
89d32ffbbd community[patch]: release 0.0.27 (#18708) 2024-03-07 01:08:43 +00:00
Erick Friis
c09b520ce4 core[patch]: release 0.1.30 (#18706) 2024-03-06 16:12:18 -08:00
Piyush Jain
2b234a4d96 Support for claude v3 models. (#18630)
Fixes #18513.

## Description
This PR attempts to fix the support for Anthropic Claude v3 models in
BedrockChat LLM. The changes here has updated the payload to use the
`messages` format instead of the formatted text prompt for all models;
`messages` API is backwards compatible with all models in Anthropic, so
this should not break the experience for any models.


## Notes
The PR in the current form does not support the v3 models for the
non-chat Bedrock LLM. This means, that with these changes, users won't
be able to able to use the v3 models with the Bedrock LLM. I can open a
separate PR to tackle this use-case, the intent here was to get this out
quickly, so users can start using and test the chat LLM. The Bedrock LLM
classes have also grown complex with a lot of conditions to support
various providers and models, and is ripe for a refactor to make future
changes more palatable. This refactor is likely to take longer, and
requires more thorough testing from the community. Credit to PRs
[18579](https://github.com/langchain-ai/langchain/pull/18579) and
[18548](https://github.com/langchain-ai/langchain/pull/18548) for some
of the code here.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-06 15:46:18 -08:00
Sam Khano
1b4dcf22f3 community[minor]: Add DocumentDBVectorSearch VectorStore (#17757)
**Description:**
- Added Amazon DocumentDB Vector Search integration (HNSW index)
- Added integration tests
- Updated AWS documentation with DocumentDB Vector Search instructions
- Added notebook for DocumentDB integration with example usage

---------

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-95-226.ec2.internal>
2024-03-06 15:11:34 -08:00
Vittorio Rigamonti
51f3902bc4 community[minor]: Adding support for Infinispan as VectorStore (#17861)
**Description:**
This integrates Infinispan as a vectorstore.
Infinispan is an open-source key-value data grid, it can work as single
node as well as distributed.

Vector search is supported since release 15.x 

For more: [Infinispan Home](https://infinispan.org)

Integration tests are provided as well as a demo notebook
2024-03-06 15:11:02 -08:00
Max Jakob
cca0167917 elasticsearch[patch], community[patch]: update references, deprecate community classes (#18506)
Follow up on https://github.com/langchain-ai/langchain/pull/17467.

- Update all references to the Elasticsearch classes to use the partners
package.
- Deprecate community classes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-06 15:09:12 -08:00
José Luis Di Biase
6041ec3dd1 templates: rag-multi-modal typo, replace serch with search (#18519)
Thank you for contributing to LangChain!

- [x] **PR title**: "templates: rag-multi-modal typo, replace serch with
search "
- **Description:** Two little typos in multi modal templates (replace
serch string with search)

Signed-off-by: José Luis Di Biase <josx@interorganic.com.ar>
2024-03-06 15:08:55 -08:00
Djordje
12b4a4d860 community[patch]: Opensearch delete method added - indexing supported (#18522)
- **Description:** Added delete method for OpenSearchVectorSearch,
therefore indexing supported
    - **Issue:** No
    - **Dependencies:** No
    - **Twitter handle:** stkbmf
2024-03-06 15:08:47 -08:00
Erick Friis
687d27567d openai[patch]: unit test azure init (#18703) 2024-03-06 14:17:09 -08:00
Christophe Bornet
db8db6faae community: Implement lazy_load() for PlaywrightURLLoader (#18676)
Integration tests:
`tests/integration_tests/document_loaders/test_url_playwright.py`
2024-03-06 16:52:13 -05:00
Aaron Yi
c092db862e community[patch]: make metadata and text optional as expected in DocArray (#18678)
ValidationError: 2 validation errors for DocArrayDoc
text
Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.5/v/missing
metadata
Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.5/v/missing
```
In the `_get_doc_cls` method, the `DocArrayDoc` class is defined as
follows:

```python
class DocArrayDoc(BaseDoc):
    text: Optional[str]
    embedding: Optional[NdArray] = Field(**embeddings_params)
    metadata: Optional[dict]
```
2024-03-06 16:51:41 -05:00
Eugene Yurtsev
4c25b49229 community[major]: breaking change in some APIs to force users to opt-in for pickling (#18696)
This is a PR that adds a dangerous load parameter to force users to opt in to use pickle.

This is a PR that's meant to raise user awareness that the pickling module is involved.
2024-03-06 16:43:01 -05:00
Eugene Yurtsev
0e52961562 community[patch]: Patch tdidf retriever (CVE-2024-2057) (#18695)
This is a patch for `CVE-2024-2057`:
https://www.cve.org/CVERecord?id=CVE-2024-2057

This affects users that: 

* Use the  `TFIDFRetriever`
* Attempt to de-serialize it from an untrusted source that contains a
malicious payload
2024-03-06 15:49:04 -05:00
Leonid Ganeline
81cbf0f2fd docs: update import paths for callbacks to use langchain_community callbacks where applicable (#18691)
Refactored imports from `langchain` to `langchain_community` whenever it
is applicable
2024-03-06 14:49:06 -05:00
Erick Friis
2619420df1 mongodb[patch]: release 0.1.1 (#18692) 2024-03-06 19:44:14 +00:00
Leonid Ganeline
fb686333ac docs: fix streamlit provider (#18606)
There is a wrong python package import.
Fixed it.
2024-03-06 11:42:26 -08:00
Christophe Bornet
ea141511d8 core: Move document loader interfaces to core (#17723)
This is needed to be able to move document loaders to partner packages.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-06 13:59:00 -05:00
aditya thomas
97de498d39 docs: update to the streaming tutorial notebook in the lcel documentation (#18378)
**Description:** Update to the streaming tutorial notebook in the LCEL
documentation
**Issue:** Fixed an import and (minor) changes in documentation language
**Dependencies:** None
2024-03-06 10:47:22 -08:00
Guangdong Liu
32db9e74e4 docs: Fix some issues with sparkllm use cases (#17674) 2024-03-06 10:46:51 -08:00
Christophe Bornet
5985454269 Merge pull request #18539
* Implement lazy_load() for GitLoader
2024-03-06 13:25:14 -05:00
Christophe Bornet
9a6f7e213b Merge pull request #18423
* Implement lazy_load() for BSHTMLLoader
2024-03-06 13:25:01 -05:00
Christophe Bornet
b3a0c44838 Merge pull request #18673
* Implement lazy_load() for PDFMinerPDFasHTMLLoader and PyMuPDFLoader
2024-03-06 13:24:36 -05:00
Christophe Bornet
68fc0cf909 Merge pull request #18674
* Implement lazy_load() for TextLoader
2024-03-06 13:23:42 -05:00
Christophe Bornet
5b92f962f1 Merge pull request #18671
* Implement lazy_load() for MastodonTootsLoader
2024-03-06 13:23:14 -05:00
Christophe Bornet
15b1770326 Merge pull request #18421
* Implement lazy_load() for AssemblyAIAudioTranscriptLoader
2024-03-06 13:16:05 -05:00
Christophe Bornet
bb284eebe4 Merge pull request #18436
* Implement lazy_load() for ConfluenceLoader
2024-03-06 13:15:24 -05:00
Christophe Bornet
691480f491 Merge pull request #18647
* Implement lazy_load() for UnstructuredBaseLoader
2024-03-06 13:13:10 -05:00
Christophe Bornet
52ac67c5d8 Merge pull request #18654
* Implement lazy_load() for ObsidianLoader
2024-03-06 13:06:55 -05:00
Christophe Bornet
b9c0cf9025 Merge pull request #18656
* Implement lazy_load() for PsychicLoader
2024-03-06 13:05:04 -05:00
Christophe Bornet
aa7ac57b67 community: Implement lazy_load() for TrelloLoader (#18658)
Covered by `tests/unit_tests/document_loaders/test_trello.py`
2024-03-06 13:04:36 -05:00
Christophe Bornet
302985fea1 community: Implement lazy_load() for SlackDirectoryLoader (#18675)
Integration tests:
`tests/integration_tests/document_loaders/test_slack.py`
2024-03-06 13:04:13 -05:00
Christophe Bornet
ed36f9f604 community: Implement lazy_load() for WhatsAppChatLoader (#18677)
Integration test:
`tests/integration_tests/document_loaders/test_whatsapp_chat.py`
2024-03-06 13:03:46 -05:00
Christophe Bornet
f414f5cdb9 community[minor]: Implement lazy_load() for WikipediaLoader (#18680)
Integration test:
`tests/integration_tests/document_loaders/test_wikipedia.py`
2024-03-06 13:03:21 -05:00
Bagatur
4cbfeeb1c2 community[patch]: Release 0.0.26 (#18683) 2024-03-06 09:41:18 -08:00
Eugene Yurtsev
b9f3c7a0c9 Use Case: Extraction set temperature to 0, qualify a statement (#18672)
Minor changes:
1) Set temperature to 0 (important)
2) Better qualify one of the statements with confidence
2024-03-06 12:35:45 -05:00
Eugene Yurtsev
a4a6978224 Docs: Revamp Extraction Use Case (#18588)
Revamp the extraction use case documentation

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-06 09:18:25 -05:00
Christophe Bornet
1100f8de7a community[minor]: Implement lazy_load() for ArxivLoader (#18664)
Integration tests: `tests/integration_tests/utilities/test_arxiv.py` and
`tests/integration_tests/document_loaders/test_arxiv.py`
2024-03-06 09:16:49 -05:00
Christophe Bornet
2d96803ddd community[minor]: Implement lazy_load() for OutlookMessageLoader (#18668)
Integration test:
`tests/integration_tests/document_loaders/test_email.py`
2024-03-06 09:15:57 -05:00
Christophe Bornet
ae167fb5b2 community[minor]: Implement lazy_load() for SitemapLoader (#18667)
Integration tests: `test_sitemap.py` and `test_docusaurus.py`
2024-03-06 09:15:35 -05:00
Christophe Bornet
623dfcc55c community[minor]: Implement lazy_load() for FacebookChatLoader (#18669)
Integration test:
`tests/integration_tests/document_loaders/test_facebook_chat.py`
2024-03-06 09:15:00 -05:00
Christophe Bornet
20794bb889 community[minor]: Implement lazy_load() for GitbookLoader (#18670)
Integration test:
`tests/integration_tests/document_loaders/test_gitbook.py`
2024-03-06 09:14:36 -05:00
Liang Zhang
81985b31e6 community[patch]: Databricks SerDe uses cloudpickle instead of pickle (#18607)
- **Description:** Databricks SerDe uses cloudpickle instead of pickle
when serializing a user-defined function transform_input_fn since pickle
does not support functions defined in `__main__`, and cloudpickle
supports this.
- **Dependencies:** cloudpickle>=2.0.0

Added a unit test.
2024-03-05 18:04:45 -08:00
Erick Friis
f3e28289f6 infra: reorder api docs build steps (#18618) 2024-03-05 17:33:36 -08:00
Leonid Ganeline
114d64d4a7 docs: providers update (#18527)
Added missed pages. Added links and descriptions. Foratted to the
consistent form.
2024-03-05 17:32:59 -08:00
Christophe Bornet
7d6de96186 community[patch]: Implement lazy_load() for CubeSemanticLoader (#18535)
Covered by `test_cube_semantic.py`
2024-03-05 17:32:31 -08:00
Christophe Bornet
a6b5d45e31 community[patch]: Implement lazy_load() for EverNoteLoader (#18538)
Covered by `test_evernote_loader.py`
2024-03-05 17:29:52 -08:00
PSV
d7dd3cd248 docs: structured_output (#18608)
- **Description:** Fixed some typos and copy errors in the Beta
Structured Output docs
    - **Issue:** N/A
    - **Dependencies:** Docs only
    - **Twitter handle:** @psvann

Co-authored-by: P.S. Vann <psvann@yahoo.com>
2024-03-05 17:20:06 -08:00
Bagatur
29f1619d61 docs: why lcel nit (#18616) 2024-03-05 17:10:47 -08:00
Max Jakob
ee7a7954b9 elasticsearch: add ElasticsearchRetriever (#18587)
Implement
[Retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/)
interface for Elasticsearch.

I opted to only expose the `body`, which gives you full flexibility, and
none the other 68 arguments of the [search
method](https://elasticsearch-py.readthedocs.io/en/v8.12.1/api/elasticsearch.html#elasticsearch.Elasticsearch.search).

Added a user agent header for usage tracking in Elastic Cloud.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-06 00:42:50 +00:00
Jib
8bc347c5fc mongodb[patch]: include LLM caches in toplevel library import (#18601) 2024-03-05 16:35:13 -08:00
Bagatur
080904689c docs: text splitters install (#18589) 2024-03-05 16:19:37 -08:00
Sunchao Wang
dc81dba6cf community[patch]: Improve amadeus tool and doc (#18509)
Description:

This pull request addresses two key improvements to the langchain
repository:

**Fix for Crash in Flight Search Interface**:

Previously, the code would crash when encountering a failure scenario in
the flight ticket search interface. This PR resolves this issue by
implementing a fix to handle such scenarios gracefully. Now, the code
handles failures in the flight search interface without crashing,
ensuring smoother operation.

**Documentation Update for Amadeus Toolkit**:

Prior to this update, examples provided in the documentation for the
Amadeus Toolkit were unable to run correctly due to outdated
information. This PR includes an update to the documentation, ensuring
that all examples can now be executed successfully. With this update,
users can effectively utilize the Amadeus Toolkit with accurate and
functioning examples.
These changes aim to enhance the reliability and usability of the
langchain repository by addressing issues related to error handling and
ensuring that documentation remains up-to-date and actionable.

Issue: https://github.com/langchain-ai/langchain/issues/17375

Twitter Handle: SingletonYxx
2024-03-05 16:17:22 -08:00
Christophe Bornet
f77f7dc3ec community[patch]: Fix VectorStoreQATool (#18529)
Fix #18460
2024-03-05 15:56:58 -08:00
Utkarsh Kapil
539a13dbda docs: minor spelling errors (#18429)
Description: Noticed spelling errors. 'Colab' mispelt as 'Collab'.
https://python.langchain.com/docs/use_cases
Dependencies: n/a
2024-03-05 15:54:15 -08:00
Dounx
ad48f55357 community[minor]: add Yuque document loader (#17924)
This pull request support loading documents from Yuque with Langchain.

Yuque is a professional cloud-based knowledge base for team
collaboration in documentation.

Website: https://www.yuque.com
OpenAPI: https://www.yuque.com/yuque/developer/openapi
2024-03-05 15:54:07 -08:00
Kazuki Maeda
60c5d964a8 community[minor]: use jq schema for content_key in json_loader (#18003)
### Description
Changed the value specified for `content_key` in JSONLoader from a
single key to a value based on jq schema.
I created [similar
PR](https://github.com/langchain-ai/langchain/pull/11255) before, but it
has several conflicts because of the architectural change associated
stable version release, so I re-create this PR to fit new architecture.

### Why
For json data like the following, specify `.data[].attributes.message`
for page_content and `.data[].attributes.id` or
`.data[].attributes.attributes. tags`, etc., the `content_key` must also
parse the json structure.

<details>
<summary>sample json data</summary>

```json
{
  "data": [
    {
      "attributes": {
        "message": "message1",
        "tags": [
          "tag1"
        ]
      },
      "id": "1"
    },
    {
      "attributes": {
        "message": "message2",
        "tags": [
          "tag2"
        ]
      },
      "id": "2"
    }
  ]
}
```

</details>

<details>
<summary>sample code</summary>

```python
def metadata_func(record: dict, metadata: dict) -> dict:

    metadata["source"] = None
    metadata["id"] = record.get("id")
    metadata["tags"] = record["attributes"].get("tags")

    return metadata

sample_file = "sample1.json"
loader = JSONLoader(
    file_path=sample_file,
    jq_schema=".data[]",
    content_key=".attributes.message", ## content_key is parsable into jq schema
    is_content_key_jq_parsable=True, ## this is added parameter
    metadata_func=metadata_func
)

data = loader.load()
data
```

</details>

### Dependencies
none

### Twitter handle
[kzk_maeda](https://twitter.com/kzk_maeda)
2024-03-05 15:51:24 -08:00
Rodrigo Nogueira
f4bb33bbf3 docs: fix link and missing package (#18405)
**Issue:** fix broken links and missing package on colab example
2024-03-05 15:50:06 -08:00
Max Jakob
81e9ab6e3a docs: Update elasticsearch README (#18497)
Update Elasticsearch README with information on how to start a
deployment.

Also make some cosmetic changes to the [Elasticsearch
docs](https://python.langchain.com/docs/integrations/vectorstores/elasticsearch).

Follow up on https://github.com/langchain-ai/langchain/pull/17467
2024-03-05 15:49:16 -08:00
Hech
6a08134661 community[patch], langchain[minor]: Add retriever self_query and score_threshold in DingoDB (#18106) 2024-03-05 15:47:29 -08:00
Mikhail Khludnev
d039dcb6ba nvidia-trt[patch]: add TritonTensorRTLLM(verbose_client=False) (#16848)
- **Description:** adding verbose flag to TritonTensorRTLLM, 
  - **Issue:** nope,
  - **Dependencies:** not any,
  - **Twitter handle:**
2024-03-05 15:44:13 -08:00
Bagatur
1569b19191 docs: query analysis links (#18614) 2024-03-05 15:05:44 -08:00
Asaf Joseph Gardin
27441555d0 ai21[patch]: AI21 Labs Contextual Answers support (#18270)
Description: Added support for AI21 Labs model - Contextual Answers
Dependencies: ai21, ai21-tokenizer
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-05 22:42:04 +00:00
Erick Friis
e169ee8863 anthropic[patch]: handle lists in function calling (#18609) 2024-03-05 14:19:40 -08:00
Erick Friis
1831733c2e anthropic[patch]: fix argument integration test (#18605) 2024-03-05 13:05:25 -08:00
Leonid Ganeline
bd4993141d docs: providers update 5 (#18550)
Added missed sections. Added descriptions.
2024-03-05 12:55:13 -08:00
Yudhajit Sinha
4570b477b9 community[patch]: Invoke callback prior to yielding token (titan_takeoff) (#18560)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/titan_takeoff.
- Issue: #16913 
- Dependencies: None
2024-03-05 12:54:26 -08:00
Tomaz Bratanic
ea51cdaede Remove neo4j bloom labels from graph schema (#18564)
Neo4j tools use particular node labels and relationship types to store
metadata, but are irrelevant for text2cypher or graph generation, so we
want to ignore them in the schema representation.
2024-03-05 12:54:05 -08:00
standby24x7
a2779738aa docs:Update function "run" to "invoke" in smart_llm.ipynb (#18568)
This patch updates function "run" to "invoke" in smart_llm.ipynb.
Without this patch, you see following warning.

LangChainDeprecationWarning: The function `run` was deprecated in
LangChain 0.1.0 and will be removed in 0.2.0. Use invoke instead.

    Signed-off-by: Masanari Iida <standby24x7@gmail.com>

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-05 12:52:48 -08:00
Erick Friis
e1924b3e93 core[patch]: deprecate hwchase17/langchain-hub, address path traversal (#18600)
Deprecates the old langchain-hub repository. Does *not* deprecate the
new https://smith.langchain.com/hub

@PinkDraconian has correctly raised that in the event someone is loading
unsanitized user input into the `try_load_from_hub` function, they have
the ability to load files from other locations in github than the
hwchase17/langchain-hub repository.

This PR adds some more path checking to that function and deprecates the
functionality in favor of the hub built into LangSmith.
2024-03-05 12:49:38 -08:00
Reuben Zotz-Wilson
96cd50938a community:update telegram notebook (#18569)
**Description:** 
modified the user_name to username to conform with the expected inputs
to TelegramChatApiLoader

**Issue:**
Current code fails in langchain-community 0.0.24 
<loader = TelegramChatApiLoader(
    chat_entity="<CHAT_URL>",  # recommended to use Entity here
    api_hash="<API HASH >",
    api_id="<API_ID>",
    user_name="",  # needed only for caching the session.
)>
2024-03-05 11:47:17 -08:00
Jib
fc35262356 langchain-mongodb: add unit tests for MongoDBChatMessageHistory (#18599)
## Description
Adding in Unit Test variation for `MongoDBChatMessageHistory` package
Follow-up to #18590 

- [x] **Add tests and docs**: Unit test is what's being added
  

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-03-05 11:44:31 -08:00
Erick Friis
48e303ea10 airbyte[patch]: release 0.1.1, python 3.9 compat (#18597) 2024-03-05 19:22:08 +00:00
Jib
9da1e0cf34 mongodb[patch]: Migrate MongoDBChatMessageHistory (#18590)
## **Description** 
Migrate the `MongoDBChatMessageHistory` to the managed
`langchain-mongodb` partner-package
## **Dependencies**
None
## **Twitter handle**
@mongodb

## **tests and docs**
- [x] Migrate existing integration test
- [x ]~ Convert existing integration test to a unit test~ Creation is
out of scope for this ticket
- [x ] ~Considering delaying work until #17470 merges to leverage the
`MockCollection` object. ~
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-05 18:53:02 +00:00
Jib
f92f7d2e03 mongodb[minor]: Add MongoDB LLM Cache (#17470)
# Description

- **Description:** Adding MongoDB LLM Caching Layer abstraction
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** @mongodb

Checklist:

- [x] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
- [x] PR Message (above)
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [ ] Add tests and docs: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @efriis, @eyurtsev, @hwchase17.

---------

Co-authored-by: Jib <jib@byblack.us>
2024-03-05 10:38:39 -08:00
Tomaz Bratanic
449d8781ec Update link in neo4j semantic ollama templates (#18574) 2024-03-05 09:42:34 -08:00
Tomaz Bratanic
353248838d Add precedence for input params over env variables in neo4j integration (#18581)
input parameters take precedence over env variables
2024-03-05 09:36:56 -08:00
Christophe Bornet
c8a171a154 community: Implement lazy_load() for GithubFileLoader (#18584) 2024-03-05 09:35:50 -08:00
Leonid Kuligin
04d134df17 marked MatchingEngine as deprecated (#18585)
Thank you for contributing to LangChain!

- [ ] **PR title**: "community: deprecate vectorstores.MatchingEngine"


- [ ] **PR message**: 
- **Description:** announced a deprecation since this integration has
been moved to langchain_google_vertexai
2024-03-05 09:34:53 -08:00
Erick Friis
07f23c2d45 docs: anthropic multimodal (#18586) 2024-03-05 16:58:06 +00:00
Erick Friis
4ac2cb4adc anthropic[minor]: add tool calling (#18554) 2024-03-05 08:30:16 -08:00
Bagatur
5fc67ca2c7 langchain[patch]: Release 0.1.11 (#18558) 2024-03-04 23:58:34 -08:00
Erick Friis
68c1878380 anthropic[patch]: model type string (#18510) 2024-03-04 19:25:19 -08:00
Akash A Desai
eb0756f3ee templates: fix rag-lancedb template (#18551) 2024-03-04 18:56:16 -08:00
Erick Friis
25c7d52140 anthropic[patch]: multimodal (#18517)
- anthropic[minor]: claude 3
- x
- x

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2024-03-04 17:50:13 -08:00
Erick Friis
343438e872 community[patch]: deprecate community fireworks (#18544) 2024-03-05 01:04:26 +00:00
William FH
ca1d42785d Evals wording (#18542) 2024-03-04 16:32:33 -08:00
Brace Sproul
328a498a78 docs[minor]: Add thumbs up/down to all docs pages (#18526) 2024-03-04 15:14:28 -08:00
Erick Friis
10874d5002 docs: update stack graphic (#18532) 2024-03-04 23:07:28 +00:00
Bagatur
dd07eddf24 core[patch]: Release 0.1.29 (#18530) 2024-03-04 14:37:08 -08:00
William FH
30ccc009e6 [Evals] Support list examples by dataset version tag (#18534)
previously only supported by timestamp
2024-03-04 14:23:32 -08:00
Lance Martin
72ae744588 RAPTOR (#18467)
Cookbook for RAPTOR paper
2024-03-04 13:16:33 -08:00
aditya thomas
7803b973c7 docs: update documentation of stackexchange component (#18486)
**Description:** Update documentation of the StackExchange component
**Issue:** None
**Dependencies:** None
2024-03-04 10:45:29 -08:00
aditya thomas
5c387a173f docs: update to docstrings of ChatAnthropic class (#18493)
**Description:** Update docstrings of ChatAnthropic class
**Issue:** Change to ChatAnthropic from ChatAnthropicMessages
**Dependencies:** None
**Lint and test**:  `make format`, `make lint` and `make test` passed
2024-03-04 10:44:54 -08:00
Martin Kolb
63702a2044 docs: Improved notebook for vector store "HANA Cloud" (#18496)
- **Description:**
This PR fixes some issues in the Jupyter notebook for the VectorStore
"SAP HANA Cloud Vector Engine":
    * Slight textual adaptations
    * Fix of wrong column name VEC_META (was: VEC_METADATA)

  - **Issue:** N/A
  - **Dependencies:** no new dependecies added
  - **Twitter handle:** @sapopensource

path to notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`
2024-03-04 10:44:16 -08:00
standby24x7
8461700738 docs: Update function "run" to "invoke" (#18499)
Currently llm_checker.ipynb uses a function "run".
Update to "invoke" to avoid following warning.

LangChainDeprecationWarning: The function `run` was deprecated in
LangChain 0.1.0
and will be removed in 0.2.0. Use invoke instead.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-04 10:42:53 -08:00
standby24x7
6c9177681d docs: Update function "run" to "invoke" in llm_math.ipynb (#18505)
This patch updates function "run" to "invoke".
Without this patch you see following warning.

LangChainDeprecationWarning: The function `run` was deprecated in
LangChain 0.1.0 and will be removed in 0.2.0. Use invoke instead.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-04 10:42:36 -08:00
Bagatur
1c1a3a7415 docs: quickstart models (#18511) 2024-03-04 08:33:19 -08:00
aditya thomas
a727eec6ed docs: add groq to list of providers (#18503)
**Description:** Add Groq to the list of providers
**Issue:** None
**Dependencies:** None
2024-03-04 08:20:40 -08:00
Erick Friis
24f9c700f2 anthropic[minor]: claude 3 (#18508) 2024-03-04 15:03:51 +00:00
William De Vena
172499404a Docs: Updated callbacks/index.mdx adding example on invoke method (#18403)
## PR title
Docs: Updated callbacks/index.mdx adding example on runnable methods

## PR message
- **Description:** Updated callbacks/index.mdx adding an example on how
to pass callbacks to the runnable methods (invoke, batch, ...)
- **Issue:** #16379
- **Dependencies:** None
2024-03-04 09:11:48 -05:00
Jacob Lee
de2d9447c6 👥 Update LangChain people data (#18473)
👥 Update LangChain people data

Co-authored-by: github-actions <github-actions@github.com>
2024-03-03 19:58:58 -08:00
William FH
1cdb813196 Improve notebook wording (#18472) 2024-03-03 18:31:15 -08:00
William FH
1eec67e8fe Evaluate on Version (#18471) 2024-03-03 17:47:35 -08:00
William FH
55b69d5ad1 Update Notebook Image (#18470) 2024-03-03 17:22:59 -08:00
Harrison Chase
73d653324f [Evals] Session-level feedback (#18463)
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2024-03-03 17:18:29 -08:00
Scott Nath
b051bba1a9 community: Add you.com tool, add async to retriever, add async testing, add You tool doc (#18032)
- **Description:** finishes adding the you.com functionality including:
    - add async functions to utility and retriever
    - add the You.com Tool
    - add async testing for utility, retriever, and tool
    - add a tool integration notebook page
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** @scottnath
2024-03-03 14:30:05 -08:00
mackong
b89d9fc177 langchain[patch]: add tools renderer for various non-openai agents (#18307)
- **Description:** add tools_renderer for various non-openai agents,
make tools can be render in different ways for your LLM.
  - **Issue:** N/A
  - **Dependencies:** N/A

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-03 14:25:12 -08:00
Harrison Chase
7ce2f32c64 improve query analysis docs (#18426) 2024-03-03 14:24:33 -08:00
William De Vena
a63cee04ac nvidia-trt[patch]: Invoke callback prior to yielding token (#18446)
## PR title
nvidia-trt[patch]: Invoke callback prior to yielding

## PR message
- Description: Invoke on_llm_new_token callback prior to yielding token
in
_stream method.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 14:15:11 -08:00
William De Vena
275877980e community[patch]: Invoke callback prior to yielding token (#18447)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
Description: Invoke callback prior to yielding token in _stream method
in llms/vertexai.
Issue: https://github.com/langchain-ai/langchain/issues/16913
Dependencies: None
2024-03-03 14:14:40 -08:00
William De Vena
67375e96e0 community[patch]: Invoke callback prior to yielding token (#18448)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream method
in llms/tongyi.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 14:14:22 -08:00
William De Vena
2087cbae64 community[patch]: Invoke callback prior to yielding token (#18449)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream method
in chat_models/perplexity.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 14:14:00 -08:00
William De Vena
eb04d0d3e2 community[patch]: Invoke callback prior to yielding token (#18452)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream and
_astream methods in llms/anthropic.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 14:13:41 -08:00
William De Vena
371bec79bc community[patch]: Invoke callback prior to yielding token (#18454)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream and
_astream methods in llms/baidu_qianfan_endpoint.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
2024-03-03 14:13:22 -08:00
Aayush Kataria
7c2f3f6f95 community[minor]: Adding Azure Cosmos Mongo vCore Vector DB Cache (#16856)
Description:

This pull request introduces several enhancements for Azure Cosmos
Vector DB, primarily focused on improving caching and search
capabilities using Azure Cosmos MongoDB vCore Vector DB. Here's a
summary of the changes:

- **AzureCosmosDBSemanticCache**: Added a new cache implementation
called AzureCosmosDBSemanticCache, which utilizes Azure Cosmos MongoDB
vCore Vector DB for efficient caching of semantic data. Added
comprehensive test cases for AzureCosmosDBSemanticCache to ensure its
correctness and robustness. These tests cover various scenarios and edge
cases to validate the cache's behavior.
- **HNSW Vector Search**: Added HNSW vector search functionality in the
CosmosDB Vector Search module. This enhancement enables more efficient
and accurate vector searches by utilizing the HNSW (Hierarchical
Navigable Small World) algorithm. Added corresponding test cases to
validate the HNSW vector search functionality in both
AzureCosmosDBSemanticCache and AzureCosmosDBVectorSearch. These tests
ensure the correctness and performance of the HNSW search algorithm.
- **LLM Caching Notebook** - The notebook now includes a comprehensive
example showcasing the usage of the AzureCosmosDBSemanticCache. This
example highlights how the cache can be employed to efficiently store
and retrieve semantic data. Additionally, the example provides default
values for all parameters used within the AzureCosmosDBSemanticCache,
ensuring clarity and ease of understanding for users who are new to the
cache implementation.
 
 @hwchase17,@baskaryan, @eyurtsev,
2024-03-03 14:04:15 -08:00
Bagatur
db47b5deee docs: anthropic quickstart (#18440) 2024-03-03 13:59:28 -08:00
Bagatur
74f3908182 docs: anthropic qa quickstart (#18459) 2024-03-03 13:33:24 -08:00
Harrison Chase
bc768a12ed more query analysis docs (#18358) 2024-03-02 08:44:22 -08:00
Erick Friis
f96dd57501 langchain[patch]: release 0.1.10 (#18410) 2024-03-02 01:48:57 +00:00
Erick Friis
1fd1ac8e95 community[patch]: release 0.0.25 (#18408) 2024-03-02 00:56:04 +00:00
aditya thomas
44b33fcc76 infra: update to pathspec for 'git grep' in lint check (#18178)
**Description:** Update to the pathspec for 'git grep' in lint check in
the Makefile
**Issue:** The pathspec {docs/docs,templates,cookbook} is not handled
correctly leading to the error during 'make lint' -
"fatal: ambiguous argument '{docs/docs,templates,cookbook}': unknown
revision or path not in the working tree."
See changes made in https://github.com/langchain-ai/langchain/pull/18058

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-01 22:03:45 +00:00
standby24x7
57c733e560 docs: Fix spelling typos in apache_kafka notebook (#17998)
This patch fixes some spelling typos in
apache_kafka_message_handling.ipynb

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-01 13:58:04 -08:00
Erick Friis
9fda6ac7e6 docs: stop copying source (#18404) 2024-03-01 13:57:53 -08:00
Sourav Pradhan
50abeb7ed9 community[patch]: fix Chroma add_images (#17964)
###  Description

Fixed a small bug in chroma.py add_images(), previously whenever we are
not passing metadata the documents is containing the base64 of the uris
passed, but when we are passing the metadata the documents is containing
normal string uris which should not be the case.

### Issue

In add_images() method when we are calling upsert() we have to use
"b64_texts" instead of normal string "uris".

### Twitter handle

https://twitter.com/whitepegasus01
2024-03-01 21:55:58 +00:00
Sanjaypranav V M
d722525c70 templates: remove gemini_function_agent unused file (#18112)
- [X] Gemini Agent Executor imported `agent.py` has Gemini agent
executor which was not utilised in current template of gemini function
agent 🧑‍💻 instead openai_function_agent has been used


@sbusso  @jarib  please someone review it

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-01 21:55:20 +00:00
Kate Silverstein
b7c71e2e07 community[minor]: llamafile embeddings support (#17976)
* **Description:** adds `LlamafileEmbeddings` class implementation for
generating embeddings using
[llamafile](https://github.com/Mozilla-Ocho/llamafile)-based models.
Includes related unit tests and notebook showing example usage.
* **Issue:** N/A
* **Dependencies:** N/A
2024-03-01 13:49:18 -08:00
Massimiliano Pronesti
c3c987dd70 docs: update Azure OpenAI to v1 and langchain API to 0.1 (#18005)
**Description:** Updated Azure OpenAI docs to OpenAI API v1 and LLM
invocation to langchain 0.1
2024-03-01 13:47:00 -08:00
Mateusz Szewczyk
9298a0b941 langchain_ibm[patch] update docstring, dependencies, tests (#18386)
- **Description:** Update docstring, dependencies, tests, README
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
  - **Tag maintainer:** : 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally -> 
Please make sure integration_tests passing locally -> 

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-01 21:01:53 +00:00
Jib
c2b1abe91b mongodb[patch]: Set delete_many only if count_documents is not 0 (#18402)
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Remove the assert statement on the `count_documents`
in setup_class. It should just delete if there are documents present
    - **Issue:** the issue # Crashes on class setup
    - **Dependencies:** None
    - **Twitter handle:** @mongodb


- [x] **Add tests and docs**: If you're adding a new integration, please
include
  1. N/A


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Jib <jib@byblack.us>
2024-03-01 13:01:28 -08:00
Kate Silverstein
c9153a3fd4 docs: add llamafile info to 'Local LLMs' guides (#18049)
- **Description:** add information about
[llamafile](https://github.com/Mozilla-Ocho/llamafile) (setup, example
usage) to ['Run LLMs
locally'](https://python.langchain.com/docs/guides/local_llms) and
['Using local models for Q&A with
RAG'](https://python.langchain.com/docs/use_cases/question_answering/local_retrieval_qa)
guides.
- **Issue:** N/A
- **Dependencies:** N/A
2024-03-01 12:44:31 -08:00
Tomaz Bratanic
f6bfb969ba community[patch]: Add an option for indexed generic label when import neo4j graph documents (#18122)
Current implementation doesn't have an indexed property that would
optimize the import. I have added a `baseEntityLabel` parameter that
allows you to add a secondary node label, which has an indexed id
`property`. By default, the behaviour is identical to previous version.

Since multi-labeled nodes are terrible for text2cypher, I removed the
secondary label from schema representation object and string, which is
used in text2cypher.
2024-03-01 12:33:52 -08:00
aditya thomas
e6e60e2492 docs: ChatOpenAI update module import path and calling method (#18169)
**Description:**
(a) Update to the module import path to reflect the splitting up of
langchain into separate packages
(b) Update to the documentation to include the new calling method
(invoke)
2024-03-01 12:32:20 -08:00
Arun Sathiya
4adac20d7b community[patch]: Make cohere_api_key a SecretStr (#12188)
This PR makes `cohere_api_key` in `llms/cohere` a SecretStr, so that the
API Key is not leaked when `Cohere.cohere_api_key` is represented as a
string.

---------

Signed-off-by: Arun <arun@arun.blog>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-01 20:27:53 +00:00
Ryan Meinzer
d883fd4a37 docs: Correct WebBaseLoader URL: docs: python.langchain.com/docs/get_started/quickstartQuickstart (#17981)
**Description:** 
The URL of the data to index, specified to `WebBaseLoader` to import is
incorrect, causing the `langsmith_search` retriever to return a `404:
NOT_FOUND`.
Incorrect URL: https://docs.smith.langchain.com/overview
Correct URL: https://docs.smith.langchain.com

**Issue:** 
This commit corrects the URL and prevents the LangServe Playground from
returning an error from its inability to use the retriever when
inquiring, "how can langsmith help with testing?".

**Dependencies:** 
None.

**Twitter Handle:** 
@ryanmeinzer
2024-03-01 12:21:53 -08:00
Petteri Johansson
6c1989d292 community[minor], langchain[minor], docs: Gremlin Graph Store and QA Chain (#17683)
- **Description:** 
New feature: Gremlin graph-store and QA chain (including docs).
Compatible with Azure CosmosDB.
  - **Dependencies:** 
  no changes
2024-03-01 12:21:14 -08:00
Ather Fawaz
a5ccf5d33c community[minor]: Add support for Perplexity chat model(#17024)
- **Description:** This PR adds support for [Perplexity AI
APIs](https://blog.perplexity.ai/blog/introducing-pplx-api).
  - **Issues:** None
  - **Dependencies:** None
  - **Twitter handle:** [@atherfawaz](https://twitter.com/AtherFawaz)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-01 12:19:23 -08:00
Rodrigo Nogueira
3438d2cbcc community[minor]: add maritalk chat (#17675)
**Description:** Adds the MariTalk chat that is based on a LLM specially
trained for Portuguese.

**Twitter handle:** @MaritacaAI
2024-03-01 12:18:23 -08:00
sarahberenji
08fa38d56d community[patch]: the syntax error for Redis generated query (#17717)
To fix the reported error:
https://github.com/langchain-ai/langchain/discussions/17397
2024-03-01 12:18:10 -08:00
certified-dodo
43e3244573 community[patch]: Fix MongoDBAtlasVectorSearch max_marginal_relevance_search (#17971)
Description:
* `self._embedding_key` is accessed after deletion, breaking
`max_marginal_relevance_search` search
* Introduced in:
e135e5257c
* Updated but still persists in:
ce22e10c4b

Issue: https://github.com/langchain-ai/langchain/issues/17963

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-01 12:17:42 -08:00
Nikita Titov
9f2ab37162 community[patch]: don't try to parse json in case of errored response (#18317)
Related issue: #13896.

In case Ollama is behind a proxy, proxy error responses cannot be
viewed. You aren't even able to check response code.

For example, if your Ollama has basic access authentication and it's not
passed, `JSONDecodeError` will overwrite the truth response error.

<details>
<summary><b>Log now:</b></summary>

```
{
	"name": "JSONDecodeError",
	"message": "Expecting value: line 1 column 1 (char 0)",
	"stack": "---------------------------------------------------------------------------
JSONDecodeError                           Traceback (most recent call last)
File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/requests/models.py:971, in Response.json(self, **kwargs)
    970 try:
--> 971     return complexjson.loads(self.text, **kwargs)
    972 except JSONDecodeError as e:
    973     # Catch JSON-related errors and raise as requests.JSONDecodeError
    974     # This aliases json.JSONDecodeError and simplejson.JSONDecodeError

File /opt/miniforge3/envs/.gpt/lib/python3.10/json/__init__.py:346, in loads(s, cls, object_hook, parse_float, parse_int, parse_constant, object_pairs_hook, **kw)
    343 if (cls is None and object_hook is None and
    344         parse_int is None and parse_float is None and
    345         parse_constant is None and object_pairs_hook is None and not kw):
--> 346     return _default_decoder.decode(s)
    347 if cls is None:

File /opt/miniforge3/envs/.gpt/lib/python3.10/json/decoder.py:337, in JSONDecoder.decode(self, s, _w)
    333 \"\"\"Return the Python representation of ``s`` (a ``str`` instance
    334 containing a JSON document).
    335 
    336 \"\"\"
--> 337 obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    338 end = _w(s, end).end()

File /opt/miniforge3/envs/.gpt/lib/python3.10/json/decoder.py:355, in JSONDecoder.raw_decode(self, s, idx)
    354 except StopIteration as err:
--> 355     raise JSONDecodeError(\"Expecting value\", s, err.value) from None
    356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

JSONDecodeError                           Traceback (most recent call last)
Cell In[3], line 1
----> 1 print(translate_func().invoke('text'))

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/runnables/base.py:2053, in RunnableSequence.invoke(self, input, config)
   2051 try:
   2052     for i, step in enumerate(self.steps):
-> 2053         input = step.invoke(
   2054             input,
   2055             # mark each step as a child run
   2056             patch_config(
   2057                 config, callbacks=run_manager.get_child(f\"seq:step:{i+1}\")
   2058             ),
   2059         )
   2060 # finish the root run
   2061 except BaseException as e:

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:165, in BaseChatModel.invoke(self, input, config, stop, **kwargs)
    154 def invoke(
    155     self,
    156     input: LanguageModelInput,
   (...)
    160     **kwargs: Any,
    161 ) -> BaseMessage:
    162     config = ensure_config(config)
    163     return cast(
    164         ChatGeneration,
--> 165         self.generate_prompt(
    166             [self._convert_input(input)],
    167             stop=stop,
    168             callbacks=config.get(\"callbacks\"),
    169             tags=config.get(\"tags\"),
    170             metadata=config.get(\"metadata\"),
    171             run_name=config.get(\"run_name\"),
    172             **kwargs,
    173         ).generations[0][0],
    174     ).message

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:543, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
    535 def generate_prompt(
    536     self,
    537     prompts: List[PromptValue],
   (...)
    540     **kwargs: Any,
    541 ) -> LLMResult:
    542     prompt_messages = [p.to_messages() for p in prompts]
--> 543     return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:407, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, **kwargs)
    405         if run_managers:
    406             run_managers[i].on_llm_error(e, response=LLMResult(generations=[]))
--> 407         raise e
    408 flattened_outputs = [
    409     LLMResult(generations=[res.generations], llm_output=res.llm_output)
    410     for res in results
    411 ]
    412 llm_output = self._combine_llm_outputs([res.llm_output for res in results])

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:397, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, **kwargs)
    394 for i, m in enumerate(messages):
    395     try:
    396         results.append(
--> 397             self._generate_with_cache(
    398                 m,
    399                 stop=stop,
    400                 run_manager=run_managers[i] if run_managers else None,
    401                 **kwargs,
    402             )
    403         )
    404     except BaseException as e:
    405         if run_managers:

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:576, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
    572     raise ValueError(
    573         \"Asked to cache, but no cache found at `langchain.cache`.\"
    574     )
    575 if new_arg_supported:
--> 576     return self._generate(
    577         messages, stop=stop, run_manager=run_manager, **kwargs
    578     )
    579 else:
    580     return self._generate(messages, stop=stop, **kwargs)

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:250, in ChatOllama._generate(self, messages, stop, run_manager, **kwargs)
    226 def _generate(
    227     self,
    228     messages: List[BaseMessage],
   (...)
    231     **kwargs: Any,
    232 ) -> ChatResult:
    233     \"\"\"Call out to Ollama's generate endpoint.
    234 
    235     Args:
   (...)
    247             ])
    248     \"\"\"
--> 250     final_chunk = self._chat_stream_with_aggregation(
    251         messages,
    252         stop=stop,
    253         run_manager=run_manager,
    254         verbose=self.verbose,
    255         **kwargs,
    256     )
    257     chat_generation = ChatGeneration(
    258         message=AIMessage(content=final_chunk.text),
    259         generation_info=final_chunk.generation_info,
    260     )
    261     return ChatResult(generations=[chat_generation])

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:183, in ChatOllama._chat_stream_with_aggregation(self, messages, stop, run_manager, verbose, **kwargs)
    174 def _chat_stream_with_aggregation(
    175     self,
    176     messages: List[BaseMessage],
   (...)
    180     **kwargs: Any,
    181 ) -> ChatGenerationChunk:
    182     final_chunk: Optional[ChatGenerationChunk] = None
--> 183     for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
    184         if stream_resp:
    185             chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp)

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:156, in ChatOllama._create_chat_stream(self, messages, stop, **kwargs)
    147 def _create_chat_stream(
    148     self,
    149     messages: List[BaseMessage],
    150     stop: Optional[List[str]] = None,
    151     **kwargs: Any,
    152 ) -> Iterator[str]:
    153     payload = {
    154         \"messages\": self._convert_messages_to_ollama_messages(messages),
    155     }
--> 156     yield from self._create_stream(
    157         payload=payload, stop=stop, api_url=f\"{self.base_url}/api/chat/\", **kwargs
    158     )

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/llms/ollama.py:234, in _OllamaCommon._create_stream(self, api_url, payload, stop, **kwargs)
    228         raise OllamaEndpointNotFoundError(
    229             \"Ollama call failed with status code 404. \"
    230             \"Maybe your model is not found \"
    231             f\"and you should pull the model with `ollama pull {self.model}`.\"
    232         )
    233     else:
--> 234         optional_detail = response.json().get(\"error\")
    235         raise ValueError(
    236             f\"Ollama call failed with status code {response.status_code}.\"
    237             f\" Details: {optional_detail}\"
    238         )
    239 return response.iter_lines(decode_unicode=True)

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/requests/models.py:975, in Response.json(self, **kwargs)
    971     return complexjson.loads(self.text, **kwargs)
    972 except JSONDecodeError as e:
    973     # Catch JSON-related errors and raise as requests.JSONDecodeError
    974     # This aliases json.JSONDecodeError and simplejson.JSONDecodeError
--> 975     raise RequestsJSONDecodeError(e.msg, e.doc, e.pos)

JSONDecodeError: Expecting value: line 1 column 1 (char 0)"
}
```

</details>


<details>

<summary><b>Log after a fix:</b></summary>

```
{
	"name": "ValueError",
	"message": "Ollama call failed with status code 401. Details: <html>\r
<head><title>401 Authorization Required</title></head>\r
<body>\r
<center><h1>401 Authorization Required</h1></center>\r
<hr><center>nginx/1.18.0 (Ubuntu)</center>\r
</body>\r
</html>\r
",
	"stack": "---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 1
----> 1 print(translate_func().invoke('text'))

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/runnables/base.py:2053, in RunnableSequence.invoke(self, input, config)
   2051 try:
   2052     for i, step in enumerate(self.steps):
-> 2053         input = step.invoke(
   2054             input,
   2055             # mark each step as a child run
   2056             patch_config(
   2057                 config, callbacks=run_manager.get_child(f\"seq:step:{i+1}\")
   2058             ),
   2059         )
   2060 # finish the root run
   2061 except BaseException as e:

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:165, in BaseChatModel.invoke(self, input, config, stop, **kwargs)
    154 def invoke(
    155     self,
    156     input: LanguageModelInput,
   (...)
    160     **kwargs: Any,
    161 ) -> BaseMessage:
    162     config = ensure_config(config)
    163     return cast(
    164         ChatGeneration,
--> 165         self.generate_prompt(
    166             [self._convert_input(input)],
    167             stop=stop,
    168             callbacks=config.get(\"callbacks\"),
    169             tags=config.get(\"tags\"),
    170             metadata=config.get(\"metadata\"),
    171             run_name=config.get(\"run_name\"),
    172             **kwargs,
    173         ).generations[0][0],
    174     ).message

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:543, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
    535 def generate_prompt(
    536     self,
    537     prompts: List[PromptValue],
   (...)
    540     **kwargs: Any,
    541 ) -> LLMResult:
    542     prompt_messages = [p.to_messages() for p in prompts]
--> 543     return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:407, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, **kwargs)
    405         if run_managers:
    406             run_managers[i].on_llm_error(e, response=LLMResult(generations=[]))
--> 407         raise e
    408 flattened_outputs = [
    409     LLMResult(generations=[res.generations], llm_output=res.llm_output)
    410     for res in results
    411 ]
    412 llm_output = self._combine_llm_outputs([res.llm_output for res in results])

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:397, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, **kwargs)
    394 for i, m in enumerate(messages):
    395     try:
    396         results.append(
--> 397             self._generate_with_cache(
    398                 m,
    399                 stop=stop,
    400                 run_manager=run_managers[i] if run_managers else None,
    401                 **kwargs,
    402             )
    403         )
    404     except BaseException as e:
    405         if run_managers:

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_core/language_models/chat_models.py:576, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
    572     raise ValueError(
    573         \"Asked to cache, but no cache found at `langchain.cache`.\"
    574     )
    575 if new_arg_supported:
--> 576     return self._generate(
    577         messages, stop=stop, run_manager=run_manager, **kwargs
    578     )
    579 else:
    580     return self._generate(messages, stop=stop, **kwargs)

File /opt/miniforge3/envs/.gpt/lib/python3.10/site-packages/langchain_community/chat_models/ollama.py:250, in ChatOllama._generate(self, messages, stop, run_manager, **kwargs)
    226 def _generate(
    227     self,
    228     messages: List[BaseMessage],
   (...)
    231     **kwargs: Any,
    232 ) -> ChatResult:
    233     \"\"\"Call out to Ollama's generate endpoint.
    234 
    235     Args:
   (...)
    247             ])
    248     \"\"\"
--> 250     final_chunk = self._chat_stream_with_aggregation(
    251         messages,
    252         stop=stop,
    253         run_manager=run_manager,
    254         verbose=self.verbose,
    255         **kwargs,
    256     )
    257     chat_generation = ChatGeneration(
    258         message=AIMessage(content=final_chunk.text),
    259         generation_info=final_chunk.generation_info,
    260     )
    261     return ChatResult(generations=[chat_generation])

File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:328, in ChatOllamaCustom._chat_stream_with_aggregation(self, messages, stop, run_manager, verbose, **kwargs)
    319 def _chat_stream_with_aggregation(
    320     self,
    321     messages: List[BaseMessage],
   (...)
    325     **kwargs: Any,
    326 ) -> ChatGenerationChunk:
    327     final_chunk: Optional[ChatGenerationChunk] = None
--> 328     for stream_resp in self._create_chat_stream(messages, stop, **kwargs):
    329         if stream_resp:
    330             chunk = _chat_stream_response_to_chat_generation_chunk(stream_resp)

File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:301, in ChatOllamaCustom._create_chat_stream(self, messages, stop, **kwargs)
    292 def _create_chat_stream(
    293     self,
    294     messages: List[BaseMessage],
    295     stop: Optional[List[str]] = None,
    296     **kwargs: Any,
    297 ) -> Iterator[str]:
    298     payload = {
    299         \"messages\": self._convert_messages_to_ollama_messages(messages),
    300     }
--> 301     yield from self._create_stream(
    302         payload=payload, stop=stop, api_url=f\"{self.base_url}/api/chat\", **kwargs
    303     )

File /storage/gpt-project/Repos/repo_nikita/gpt_lib/langchain/ollama.py:134, in _OllamaCommonCustom._create_stream(self, api_url, payload, stop, **kwargs)
    132     else:
    133         optional_detail = response.text
--> 134         raise ValueError(
    135             f\"Ollama call failed with status code {response.status_code}.\"
    136             f\" Details: {optional_detail}\"
    137         )
    138 return response.iter_lines(decode_unicode=True)

ValueError: Ollama call failed with status code 401. Details: <html>\r
<head><title>401 Authorization Required</title></head>\r
<body>\r
<center><h1>401 Authorization Required</h1></center>\r
<hr><center>nginx/1.18.0 (Ubuntu)</center>\r
</body>\r
</html>\r
"
}
```

</details>

The same is true for timeout errors or when you simply mistyped in
`base_url` arg and get response from some other service, for instance.

Real Ollama errors are still clearly readable:

```
ValueError: Ollama call failed with status code 400. Details: {"error":"invalid options: unknown_option"}
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-01 12:17:29 -08:00
Yudhajit Sinha
e2b901c35b community[patch]: chat message histrory mypy fix (#18250)
Description: Fixed type: ignore's for mypy for
chat_message_histories(streamlit)
Adresses #17048 

Planning to add more based on reviews
2024-03-01 12:17:18 -08:00
Gabriel Altay
b9416dc96a docs: update pinecone README to use PineconeVectorStore (#18170) 2024-03-01 12:12:52 -08:00
老阿張
1701f7b8e9 docs: Fix typo in baidu_qianfan_endpoint.ipynb & baidu_qianfan_endpoint.ipynb (#18176)
Description: "sucessfully should be successfully "? 🤔
Issue: Typo
Dependencies: Nope
Twitter handle: laoazhang
2024-03-01 12:10:23 -08:00
Hemslo Wang
58a2abf089 community[patch]: fix RecursiveUrlLoader metadata_extractor return type (#18193)
**Description:** Fix `metadata_extractor` type for `RecursiveUrlLoader`,
the default `_metadata_extractor` returns `dict` instead of `str`.
**Issue:** N/A
**Dependencies:** N/A
**Twitter handle:** N/A

Signed-off-by: Hemslo Wang <hemslo.wang@gmail.com>
2024-03-01 12:08:20 -08:00
Maxime Perrin
98380cff9b community[patch]: removing "response_mode" parameter in llama_index retriever (#18180)
- **Description:** Removing this line 
```python
response = index.query(query, response_mode="no_text", **self.query_kwargs)
```
to 
```python
response = index.query(query, **self.query_kwargs)
```
Since llama index query does not support response_mode anymore : ``` |
TypeError: BaseQueryEngine.query() got an unexpected keyword argument
'response_mode'````
  - **Twitter handle:** @maximeperrin_

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-03-01 12:05:09 -08:00
Leonid Kuligin
e080281623 docs: cookbook on gemma integrations (#18213)
- [ ] **PR title**: "cookbook: using Gemma on LangChain"

- [ ] **PR message**: 
- **Description:** added a tutorial how to use Gemma with LangChain
(from VertexAI or locally from Kaggle or HF)
    - **Dependencies:** langchain-google-vertexai==0.0.7
    - **Twitter handle:** lkuligin
2024-03-01 11:50:55 -08:00
Christophe Bornet
177f51c7bd community: Use default load() implementation in doc loaders (#18385)
Following https://github.com/langchain-ai/langchain/pull/18289
2024-03-01 14:46:52 -05:00
William De Vena
42341bc787 infra: fake model invoke callback prior to yielding token (#18286)
## PR title
core[patch]: Invoke callback prior to yielding

## PR message
Description: Invoke on_llm_new_token callback prior to yielding token in
_stream and _astream methods.
Issue: https://github.com/langchain-ai/langchain/issues/16913
Dependencies: None
Twitter handle: None
2024-03-01 11:46:18 -08:00
Ikko Eltociear Ashimine
31b4e78174 docs: fix typo in milvus.ipynb (#18373)
retreival -> retrieval
2024-03-01 11:22:39 -08:00
Tabby
dd6f85caf1 docs: Update Google El Carro for Oracle Workload Documentation. (#18394)
In this commit we update the documentation for Google El Carro for Oracle Workloads. We amend the documentation in the Google Providers page to use the correct name which is El Carro for Oracle Workloads. We also add changes to the document_loaders and memory pages to reflect changes we made in our repo.
2024-03-01 11:21:35 -08:00
mwmajewsk
e192f6b6eb community[patch]: fix, better error message in deeplake vectoriser (#18397)
If the document loader recieves Pathlib path instead of str, it reads
the file correctly, but the problem begins when the document is added to
Deeplake.
This problem arises from casting the path to str in the metadata.

```python
deeplake = True
fname = Path('./lorem_ipsum.txt')
loader = TextLoader(fname, encoding="utf-8")
docs = loader.load_and_split()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks= text_splitter.split_documents(docs)
if deeplake:
    db = DeepLake(dataset_path=ds_path, embedding=embeddings, token=activeloop_token)
    db.add_documents(chunks)
else:
    db = Chroma.from_documents(docs, embeddings)
```

So using this snippet of code the error message for deeplake looks like
this:

```
[part of error message omitted]

Traceback (most recent call last):
  File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 53, in <module>
    db.add_documents(chunks)
  File "/home/mwm/repositories/sources/langchain/libs/core/langchain_core/vectorstores.py", line 139, in add_documents
    return self.add_texts(texts, metadatas, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/deeplake.py", line 258, in add_texts
    return self.vectorstore.add(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/deeplake_vectorstore.py", line 226, in add
    return self.dataset_handler.add(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/dataset_handlers/client_side_dataset_handler.py", line 139, in add
    dataset_utils.extend_or_ingest_dataset(
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 544, in extend_or_ingest_dataset
    extend(
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 505, in extend
    dataset.extend(batched_processed_tensors, progressbar=False)
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/dataset/dataset.py", line 3247, in extend
    raise SampleExtendError(str(e)) from e.__cause__
deeplake.util.exceptions.SampleExtendError: Failed to append a sample to the tensor 'metadata'. See more details in the traceback. If you wish to skip the samples that cause errors, please specify `ignore_errors=True`.
```

Which is does not explain the error well enough.
The same error for chroma looks like this 

```
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 56, in <module>
    db = Chroma.from_documents(docs, embeddings)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 778, in from_documents
    return cls.from_texts(
           ^^^^^^^^^^^^^^^
  File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 736, in from_texts
    chroma_collection.add_texts(
  File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 309, in add_texts
    raise ValueError(e.args[0] + "\n\n" + msg)
ValueError: Expected metadata value to be a str, int, float or bool, got lorem_ipsum.txt which is a <class 'pathlib.PosixPath'>

Try filtering complex metadata from the document using langchain_community.vectorstores.utils.filter_complex_metadata.
```

Which is way more user friendly, so I just added information about
possible mismatch of the type in the error message, the same way it is
covered in chroma
https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/chroma.py#L224
2024-03-01 11:21:21 -08:00
Daniel Chico
7d962278f6 community[patch]: type ignore fixes (#18395)
Related to #17048
2024-03-01 11:21:02 -08:00
Christophe Bornet
69be82c86d community[patch]: Implement lazy_load() for CSVLoader (#18391)
Covered by `test_csv_loader.py`
2024-03-01 11:17:08 -08:00
Bagatur
c54d6eb5da fireworks[patch]: support "any" tool_choice (#18343)
per https://readme.fireworks.ai/docs/function-calling
2024-03-01 11:12:28 -08:00
Leonid Ganeline
d937fa4f9c docs: Tutorials update (#18230)
A big update of the `Tutorials` page. Cleaned it up. Added several new
resources.
2024-03-01 11:07:39 -08:00
Erick Friis
6afb135baa astradb: move to langchain-datastax repo (#18354) 2024-03-01 19:04:43 +00:00
Akash A Desai
b641be2edf templates: Lanceb RAG template (#17809)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-01 18:52:50 +00:00
Guangdong Liu
760a16ff32 community[patch]: Fix ChatModel for sparkllm Bug. (#18375)
**PR message**: ***Delete this entire checklist*** and replace with
    - **Description:** fix sparkllm paramer error
    - **Issue:**   close #18370
- **Dependencies:** change `IFLYTEK_SPARK_APP_URL` to
`IFLYTEK_SPARK_API_URL`
    - **Twitter handle:** No
2024-03-01 10:49:30 -08:00
Yujie Qian
cbb65741a7 community[patch]: Voyage AI updates default model and batch size (#17655)
- **Description:** update the default model and batch size in
VoyageEmbeddings
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** N/A

---------

Co-authored-by: fodizoltan <zoltan@conway.expert>
2024-03-01 10:22:24 -08:00
Shengsheng Huang
ae471a7dcb community[minor]: add BigDL-LLM integrations (#17953)
- **Description**:
[`bigdl-llm`](https://github.com/intel-analytics/BigDL) is a library for
running LLM on Intel XPU (from Laptop to GPU to Cloud) using
INT4/FP4/INT8/FP8 with very low latency (for any PyTorch model). This PR
adds bigdl-llm integrations to langchain.
- **Issue**: NA
- **Dependencies**: `bigdl-llm` library
- **Contribution maintainer**: @shane-huang 
 
Examples added:
- docs/docs/integrations/llms/bigdl.ipynb
2024-03-01 10:04:53 -08:00
Ethan Yang
f61cb8d407 community[minor]: Add openvino backend support (#11591)
- **Description:** add openvino backend support by HuggingFace Optimum
Intel,
  - **Dependencies:** “optimum[openvino]”,

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-01 10:04:24 -08:00
Leonid Ganeline
a89f007947 docs: runnable module description (#17966)
Added a module description. Added `batch` description.
2024-03-01 10:01:32 -08:00
Leonid Ganeline
6d0af4e805 docs: nvidia: provider page update (#18054)
Nvidia provider page is missing a Triton Inference Server package
reference.
Changes:
- added the Triton Inference Server reference
- copied the example notebook from the package into the doc files.
- added the Triton Inference Server description and links, the link to
the above example notebook
- formatted page to the consistent format

NOTE:
It seems that the [example
notebook](https://github.com/langchain-ai/langchain/blob/master/libs/partners/nvidia-trt/docs/llms.ipynb)
was originally created in wrong place. It should be in the LangChain
docs
[here](https://github.com/langchain-ai/langchain/tree/master/docs/docs/integrations/llms).
So, I've created a copy of this example. The original example is still
in the nvidia-trt package.
2024-03-01 10:00:42 -08:00
RadhikaBansal97
8bafd2df5e community[patch]: Change github endpoint in GithubLoader (#17622)
Description- 
- Changed the GitHub endpoint as existing was not working and giving 404
not found error
- Also the existing function was failing if file_filter is not passed as
the tree api return all paths including directory as well, and when
get_file_content was iterating over these path, the function was failing
for directory as the api was returning list of files inside the
directory, so added a condition to ignore the paths if it a directory
- Fixes this issue -
https://github.com/langchain-ai/langchain/issues/17453

Co-authored-by: Radhika Bansal <Radhika.Bansal@veritas.com>
2024-03-01 09:36:31 -08:00
Yufei (Benny) Chen
2b93206f02 fireworks[patch]: Fix fireworks async stream (#18372)
- **Description:**  Fix the async stream issue with Fireworks
- **Dependencies:** fireworks >= 0.13.0

```
tests/integration_tests/test_chat_models.py ..........                                                                   [ 45%]
tests/integration_tests/test_compile.py .                                                                                [ 50%]
tests/integration_tests/test_embeddings.py ..                                                                            [ 59%]
tests/integration_tests/test_llms.py .........                                                                           [100%]
```
```
tests/unit_tests/test_embeddings.py .                                                                                    [ 16%]
tests/unit_tests/test_imports.py .                                                                                       [ 33%]
tests/unit_tests/test_llms.py ....                                                                                       [100%]
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-01 09:20:26 -08:00
William FH
1deb8cadd5 Add dataset version info (#18299) 2024-02-29 22:00:44 -08:00
Anush
9d663f31fa community[patch]: FastEmbed to latest (#18040)
## Description

Updates the `langchain_community.embeddings.fastembed` provider as per
the recent updates to [`FastEmbed`](https://github.com/qdrant/fastembed)
library.
2024-02-29 21:15:51 -08:00
Jacob Lee
590d47bff4 docs[patch]: Add Neo4j GraphAcademy to tutorials section (#18353) 2024-02-29 20:50:24 -07:00
Erick Friis
3c8a115e21 fireworks[patch]: remove custom async and stream implementations (#18363) 2024-03-01 03:20:02 +00:00
Bagatur
4730ee2766 docs: update api ref nav (#18362) 2024-02-29 19:04:56 -08:00
Bagatur
12f19b8a6a infra: update create_api_rst (#18361) 2024-02-29 19:04:44 -08:00
Erick Friis
1317578ad1 templates: use langchain-text-splitters (#18360)
- deps
- import
- import
2024-03-01 03:00:58 +00:00
Bagatur
f220af3dce docs: text splitters readme (#18359) 2024-03-01 03:00:42 +00:00
Bagatur
0d7fb5f60a langchain[patch]: langchain-text-splitters dep (#18357) 2024-02-29 18:48:55 -08:00
Eugene Yurtsev
51b661cfe8 community[patch]: BaseLoader load method should just delegate to lazy_load (#18289)
load() should just reference lazy_load()
2024-02-29 21:45:28 -05:00
Bagatur
5efb5c099f text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-02-29 18:33:21 -08:00
Nuno Campos
7891934173 Fix missing labels (#18356)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-02-29 18:11:18 -08:00
William FH
fdab931fd3 [Core] Patch: rm dumpd of outputs from runnables/base (#18295)
It obstructs evaluations when your return a pydantic object.
2024-02-29 18:04:53 -08:00
Erick Friis
c7d5ed6f5c infra: tolerate partner package move in ci (#18355) 2024-02-29 17:49:28 -08:00
William FH
f481cbb32d fireworks[patch]: Fix fireworks bind tools (#18352)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-01 01:18:15 +00:00
Erick Friis
eefb49680f multiple[patch]: fix deprecation versions (#18349) 2024-02-29 16:58:33 -08:00
Erick Friis
11cb42c2c1 core[patch]: deprecation docstring with lib (#18350) 2024-03-01 00:44:13 +00:00
Erick Friis
bce0684327 docs: airbyte deps note (#18243) 2024-02-29 16:02:13 -08:00
Erick Friis
7bbff98dc7 mongodb[patch]: core 0.1.5 dep (#18348) 2024-02-29 15:39:04 -08:00
Erick Friis
4e27e66938 infra: mongodb env vars (#18347) 2024-02-29 15:24:28 -08:00
Jib
72bfc1d3db mongodb[minor]: MongoDB Partner Package -- Porting MongoDBAtlasVectorSearch (#17652)
This PR migrates the existing MongoDBAtlasVectorSearch abstraction from
the `langchain_community` section to the partners package section of the
codebase.
- [x] Run the partner package script as advised in the partner-packages
documentation.
- [x] Add Unit Tests
- [x] Migrate Integration Tests
- [x] Refactor `MongoDBAtlasVectorStore` (autogenerated) to
`MongoDBAtlasVectorSearch`
- [x] ~Remove~ deprecate the old `langchain_community` VectorStore
references.

## Additional Callouts
- Implemented the `delete` method
- Included any missing async function implementations
  - `amax_marginal_relevance_search_by_vector`
  - `adelete` 
- Added new Unit Tests that test for functionality of
`MongoDBVectorSearch` methods
- Removed [`del
res[self._embedding_key]`](e0c81e1cb0/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L218))
in `_similarity_search_with_score` function as it would make the
`maximal_marginal_relevance` function fail otherwise. The `Document`
needs to store the embedding key in metadata to work.

Checklist:

- [x] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
- [x] PR message
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. Existing tests supplied in docs/docs do not change. Updated
docstrings for new functions like `delete`
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory. (This already exists)

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Steven Silvester <steven.silvester@ieee.org>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-29 23:09:48 +00:00
William De Vena
412148773c Updated partners/fireworks README (#18267)
## PR title
partners: changed the README file for the Fireworks integration in the
libs/partners/fireworks folder

## PR message
Description: Changed the README file of partners/fireworks following the
docs on https://python.langchain.com/docs/integrations/llms/Fireworks

The README includes:

- Brief description
- Installation
- Setting-up instructions (API key, model id, ...)
- Basic usage

Issue: https://github.com/langchain-ai/langchain/issues/17545

Dependencies: None

Twitter handle: None
2024-02-29 14:55:03 -08:00
Kai Kugler
df234fb171 community[patch]: Fixing embedchain document mapping (#18255)
- **Description:** The current embedchain implementation seems to handle
document metadata differently than done in the current implementation of
langchain and a KeyError is thrown. I would love for someone else to
test this...

---------

Co-authored-by: KKUGLER <kai.kugler@mercedes-benz.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Deshraj Yadav <deshraj@gatech.edu>
2024-02-29 14:54:37 -08:00
Erick Friis
040271f33a community[patch]: remove llmlingua extended tests (#18344) 2024-02-29 13:51:29 -08:00
William De Vena
87dca8e477 Updated partners/ibm README (#18268)
## PR title
partners: changed the README file for the IBM Watson AI integration in
the libs/partners/ibm folder.

## PR message
Description: Changed the README file of partners/ibm following the docs
on https://python.langchain.com/docs/integrations/llms/ibm_watsonx

The README includes:

- Brief description
- Installation
- Setting-up instructions (API key, project id, ...)
- Basic usage:
  - Loading the model
  - Direct inference
  - Chain invoking
  - Streaming the model output
  
Issue: https://github.com/langchain-ai/langchain/issues/17545

Dependencies: None

Twitter handle: None

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2024-02-29 13:29:28 -08:00
Erick Friis
dfd9787388 infra: ci dirs in wrong order (#18340) 2024-02-29 21:13:29 +00:00
Bagatur
9e46535ebc core[patch]: Release 0.1.28 (#18341) 2024-02-29 13:03:13 -08:00
Tomaz Bratanic
5999c4a240 Add support for parameters in neo4j retrieval query (#18310)
Sometimes, you want to use various parameters in the retrieval query of
Neo4j Vector to personalize/customize results. Before, when there were
only predefined chains, it didn't really make sense. Now that it's all
about custom chains and LCEL, it is worth adding since users can inject
any params they wish at query time. Isn't prone to SQL injection-type
attacks since we use parameters and not concatenating strings.
2024-02-29 13:00:54 -08:00
Hasan
15d1b73a00 Add optional output_parser param in create_react_agent (#18320)
**Description:** Add facility to pass the optional output parser to
customize the parsing logic

---------

Co-authored-by: hasan <hasan@m2sys.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-29 12:35:43 -08:00
Bagatur
a6f0506aaf docs: query analysis use case (#17766)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-02-29 12:33:49 -08:00
kkdamowang
6782dac420 docs: remove duplicate quote in AzureOpenAIEmbeddings doc (#18315)
- **Description:** Remove duplicate quote in AzureOpenAIEmbeddings doc,
remove trailing spaces.
- **Issue:** No
- **Dependencies:** No
2024-02-29 11:25:50 -08:00
Filip Schouwenaars
4c62362eab Add links to relevant DataCamp code alongs (#18332)
This PR adds links to some more free resources for people to get
acquainted with Langhchain without having to configure their system.

<!-- If no one reviews your PR within a few days, please @-mention one
of baskaryan, efriis, eyurtsev, hwchase17. -->

Co-authored-by: Filip Schouwenaars <filipsch@users.noreply.github.com>
2024-02-29 11:25:01 -08:00
Virat Singh
cd926ac3dd community: Add PolygonFinancials Tool (#18324)
**Description:**
In this PR, I am adding a `PolygonFinancials` tool, which can be used to
get financials data for a given ticker. The financials data is the
fundamental data that is found in income statements, balance sheets, and
cash flow statements of public US companies.

**Twitter**: 
[@virattt](https://twitter.com/virattt)
2024-02-29 10:56:05 -08:00
Leonid Ganeline
d43fa2eab1 docs providers update (#18336)
Formatted pages into a consistent form. Added descriptions and links
when needed.
2024-02-29 10:53:12 -08:00
Erick Friis
68be5a7658 infra: skip ibm api docs (#18335) 2024-02-29 10:16:57 -08:00
Erick Friis
43534a4c08 skip airbyte api docs (#18334) 2024-02-29 09:57:52 -08:00
Bagatur
6a5b084704 docs: update func calling doc (#18300) 2024-02-29 09:45:07 -08:00
Bagatur
68ad3414a2 experimental[patch]: Release 0.0.53 (#18330) 2024-02-29 09:13:21 -08:00
William FH
8af4425abd [Evaluation] Config Fix (#18231) 2024-02-29 00:06:46 -08:00
Averi Kitsch
1b63530274 docs: update Google documentation (#18297)
**Description:** update Google documentation
**Issue:** 
**Dependencies:**
2024-02-29 01:42:44 +00:00
Leonid Ganeline
1d865a7e86 docs: google provider page fixes (#18290)
Several URL-s were broken (in the yesterday PR). Like
[Integrations/platforms/google/Document
Loaders](https://python.langchain.com/docs/integrations/platforms/google#document-loaders)
page, Example link to "Document Loaders / Cloud SQL for PostgreSQL" and
most of the new example links in the Document Loaders, Vectorstores,
Memory sections.

- fixed URL-s (manually verified all example links)
- sorted sections in page to follow the "integrations/components" menu
item order.
- fixed several page titles to fix Navbar item order

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-29 00:45:03 +00:00
William De Vena
0486404a74 langchain_openai[patch]: Invoke callback prior to yielding token (#18269)
## PR title
langchain_openai[patch]: Invoke callback prior to yielding token

## PR message
Description: Invoke callback prior to yielding token in _stream and
_astream methods for langchain_openai.
Issue: https://github.com/langchain-ai/langchain/issues/16913
Dependencies: None
Twitter handle: None
2024-02-29 00:00:08 +00:00
William De Vena
5ee76fccd5 langchain_groq[patch]: Invoke callback prior to yielding token (#18272)
## PR title
langchain_groq[patch]: Invoke callback prior to yielding

## PR message
**Description:**Invoke callback prior to yielding token in _stream and
_astream methods for groq.
Issue: https://github.com/langchain-ai/langchain/issues/16913
Dependencies: None
Twitter handle: None
2024-02-28 23:43:16 +00:00
aditya thomas
eb0c178d75 docs: update to the list of partner packages in the list of providers (#18252)
**Description:** Update to the list of partner packages in the list of
providers
**Issue:** Google & Nvidia had two entries each, both pointing to the
same page
**Dependencies:** None
2024-02-28 15:40:14 -08:00
ccurme
9bf58ec7dd update extraction use-case docs (#17979)
Update extraction use-case docs to showcase and explain all modes of
`create_structured_output_runnable`.
2024-02-28 17:32:04 -05:00
Christophe Bornet
8a81fcd5d3 community: Fix deprecation version of AstraDB VectorStore (#17991) 2024-02-28 17:15:09 -05:00
Stefano Lottini
6d863bed51 partner[minor]: Astra DB clients identify themselves as coming through LangChain package (#18131)
**Description**

This PR sets the "caller identity" of the Astra DB clients used by the
integration plugins (`AstraDBChatMessageHistory`, `AstraDBStore`,
`AstraDBByteStore` and, pending #17767 , `AstraDBVectorStore`). In this
way, the requests to the Astra DB Data API coming from within LangChain
are identified as such (the purpose is anonymous usage stats to best
improve the Astra DB service).
2024-02-28 17:13:22 -05:00
kkdamowang
4899a72b56 docs: remove duplicate word in lcel/streaming (#18249)
- **Description:** Remove duplicate word in lcel/streaming.
- **Issue:** No.
- **Dependencies:**  No.
2024-02-28 21:50:26 +00:00
mackong
2c42f3a955 ollama[patch]: delete suffix slash to avoid redirect (#18260)
- **Description:** see
[ollama](https://github.com/ollama/ollama/blob/main/server/routes.go#L949)'s
route definitions
- **Issue:** N/A
- **Dependencies:** N/A
2024-02-28 16:44:48 -05:00
William De Vena
6b58943917 community[patch]: Invoke callback prior to yielding token (#18288)
## PR title
community[patch]: Invoke callback prior to yielding

PR message
Description: Invoke on_llm_new_token callback prior to yielding token in
_stream and _astream methods.
Issue: https://github.com/langchain-ai/langchain/issues/16913
Dependencies: None
Twitter handle: None
2024-02-28 21:40:53 +00:00
Brace Sproul
ca4f5e2408 ci: Update issue template required checks (#18283) 2024-02-28 13:27:39 -08:00
William De Vena
23722e3653 langchain[patch]: Invoke callback prior to yielding token (#18282)
## PR title
langchain[patch]: Invoke callback prior to yielding

## PR message
Description: Invoke on_llm_new_token callback prior to yielding token in
_stream and _astream methods in langchain/tests/fake_chat_model.
Issue: https://github.com/langchain-ai/langchain/issues/16913
Dependencies: None
Twitter handle: None
2024-02-28 16:15:02 -05:00
Eugene Yurtsev
cd52433ba0 community[minor]: Add SQLDatabaseLoader document loader (#18281)
- **Description:** A generic document loader adapter for SQLAlchemy on
top of LangChain's `SQLDatabaseLoader`.
  - **Needed by:** https://github.com/crate-workbench/langchain/pull/1
  - **Depends on:** GH-16655
  - **Addressed to:** @baskaryan, @cbornet, @eyurtsev

Hi from CrateDB again,

in the same spirit like GH-16243 and GH-16244, this patch breaks out
another commit from https://github.com/crate-workbench/langchain/pull/1,
in order to reduce the size of this patch before submitting it, and to
separate concerns.

To accompany the SQLAlchemy adapter implementation, the patch includes
integration tests for both SQLite and PostgreSQL. Let me know if
corresponding utility resources should be added at different spots.

With kind regards,
Andreas.


### Software Tests

```console
docker compose --file libs/community/tests/integration_tests/document_loaders/docker-compose/postgresql.yml up
```

```console
cd libs/community
pip install psycopg2-binary
pytest -vvv tests/integration_tests -k sqldatabase
```

```
14 passed
```



![image](https://github.com/langchain-ai/langchain/assets/453543/42be233c-eb37-4c76-a830-474276e01436)

---------

Co-authored-by: Andreas Motl <andreas.motl@crate.io>
2024-02-28 21:02:28 +00:00
William De Vena
a37dc83a9e langchain_anthropic[patch]: Invoke callback prior to yielding token (#18274)
## PR title
langchain_anthropic[patch]: Invoke callback prior to yielding

## PR message
- Description: Invoke callback prior to yielding token in _stream and
_astream methods for anthropic.
- Issue: https://github.com/langchain-ai/langchain/issues/16913
- Dependencies: None
- Twitter handle: None
2024-02-28 20:19:22 +00:00
David Ruan
af35e2525a community[minor]: add hugging_face_model document loader (#17323)
- **Description:** add hugging_face_model document loader,
  - **Issue:** NA,
  - **Dependencies:** NA,

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-28 20:05:35 +00:00
Sanjaypranav V M
b9a495e56e community[patch]: added latin-1 decoder to gmail search tool (#18116)
some mails from flipkart , amazon are encoded with other plain text
format so to handle UnicodeDecode error , added exception and latin
decoder

Thank you for contributing to LangChain!

@hwchase17
2024-02-28 19:28:29 +00:00
Nuno Campos
6da08d0f22 Add PNG drawer for Runnable.get_graph() (#18239)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-02-28 11:25:19 -08:00
Nuno Campos
d9fd1194f5 Remove check preventing passing non-declared config keys (#18276)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-02-28 18:28:53 +00:00
William De Vena
7ac74f291e langchain_nvidia_ai_endpoints[patch]: Invoke callback prior to yielding token (#18271)
## PR title
langchain_nvidia_ai_endpoints[patch]: Invoke callback prior to yielding

## PR message
**Description:** Invoke callback prior to yielding token in _stream and
_astream methods for nvidia_ai_endpoints.
**Issue:** https://github.com/langchain-ai/langchain/issues/16913
**Dependencies:** None
2024-02-28 18:10:57 +00:00
Erick Friis
b4f6066a57 docs: airbyte github cookbook (#18275) 2024-02-28 18:04:15 +00:00
Ashley Xu
e3211c2b3d community[patch]: BigQueryVectorSearch JSON type unsupported for metadatas (#18234) 2024-02-28 08:19:53 -08:00
Jack Wotherspoon
92c34d4803 docs: update documentation for Google Cloud database integrations (#18265)
**Description:** Fixing typos and rendering issues for Google Cloud
database integrations.
**Issue:** NA
**Dependencies:** NA
2024-02-28 15:32:43 +00:00
Erick Friis
2e31f1c2f8 infra: api docs folder move (#18223) 2024-02-28 07:10:27 -08:00
Mateusz Szewczyk
db643f6283 ibm[patch]: release 0.1.0 Add possibility to pass ModelInference or Model object to WatsonxLLM class (#18189)
- **Description:** Add possibility to pass ModelInference or Model
object to WatsonxLLM class
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
  - **Tag maintainer:** : 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally. 
2024-02-28 07:03:15 -08:00
Averi Kitsch
76eb553084 docs: add documentation for Google Cloud database integrations (#18225)
**Description:** add documentation for Google Cloud database
integrations
**Issue:** NA
**Dependencies:** NA
2024-02-27 21:17:30 -08:00
Erick Friis
d7a77054ed airbyte[patch]: core version 0.1.5 (#18244) 2024-02-27 19:54:43 -08:00
Erick Friis
be8d2ff5f7 airbyte[patch]: init pkg (#18236) 2024-02-27 19:37:53 -08:00
Ayo Ayibiowu
ac1d7d9de8 community[feat]: Adds LLMLingua as a document compressor (#17711)
**Description**: This PR adds support for using the [LLMLingua project
](https://github.com/microsoft/LLMLingua) especially the LongLLMLingua
(Enhancing Large Language Model Inference via Prompt Compression) as a
document compressor / transformer.

The LLMLingua project is an interesting project that can greatly improve
RAG system by compressing prompts and contexts while keeping their
semantic relevance.

**Issue**: https://github.com/microsoft/LLMLingua/issues/31
**Dependencies**: [llmlingua](https://pypi.org/project/llmlingua/)

@baskaryan

---------

Co-authored-by: Ayodeji Ayibiowu <ayodeji.ayibiowu@getinge.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-02-27 19:23:56 -08:00
Nuno Campos
a99eb3abf4 openai[patch]: Assign message id in ChatOpenAI (#17837) 2024-02-27 17:32:54 -08:00
Isaac Francisco
733367b795 docs: deprecation of OpenAI functions agent, astream_events docstring (#18164)
Co-authored-by: Hershenson, Isaac (Extern) <isaac.hershenson.extern@bayer04.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-27 09:14:53 -08:00
Harrison Chase
b0ccaf5917 Harrison/add structured output (#18165) 2024-02-27 08:25:09 -08:00
Bagatur
242af4b5a4 openai[patch], mistral[patch], fireworks[patch]: releases 0.0.8, 0.0.5, 0.0.2 (#18186) 2024-02-27 04:22:24 -08:00
Bagatur
7e66d964c6 core[patch]: Release 0.1.27 (#18159) 2024-02-26 17:27:38 -08:00
Harrison Chase
d7c607ca00 core[minor]: move document compressor base (#17910) 2024-02-26 17:20:50 -08:00
Bagatur
b3f4de38ae mistral[minor]: Function calling and with_structured_output (#18150)
![Screenshot 2024-02-26 at 2 07 06
PM](https://github.com/langchain-ai/langchain/assets/22008038/20cacb47-3b24-45b5-871b-dd169f1acd37)
2024-02-26 16:22:30 -08:00
Bagatur
c53aa5cd37 core[patch]: support JS message serial namespaces (#18151) 2024-02-26 16:19:46 -08:00
Harrison Chase
c673717c2b add optimization notebook (#18155) 2024-02-26 16:09:31 -08:00
Max Jakob
5ab69f907f partners: add Elasticsearch package (#17467)
### Description
This PR moves the Elasticsearch classes to a partners package.

Note that we will not move (and later remove) `ElasticKnnSearch`. It
were previously deprecated.
`ElasticVectorSearch` is going to stay in the community package since it
is used quite a lot still.

Also note that I left the `ElasticsearchTranslator` for self query
untouched because it resides in main `langchain` package.

### Dependencies
There will be another PR that updates the notebooks (potentially pulling
them into the partners package) and templates and removes the classes
from the community package, see
https://github.com/langchain-ai/langchain/pull/17468

#### Open question
How to make the transition smooth for users? Do we move the import
aliases and require people to install `langchain-elasticsearch`? Or do
we remove the import aliases from the `langchain` package all together?
What has worked well for other partner packages?

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-26 23:19:47 +00:00
matt haigh
a4896da2a0 Experimental: Add other threshold types to SemanticChunker (#16807)
**Description**
Adding different threshold types to the semantic chunker. I’ve had much
better and predictable performance when using standard deviations
instead of percentiles.


![image](https://github.com/langchain-ai/langchain/assets/44395485/066e84a8-460e-4da5-9fa1-4ff79a1941c5)

For all the documents I’ve tried, the distribution of distances look
similar to the above: positively skewed normal distribution. All skews
I’ve seen are less than 1 so that explains why standard deviations
perform well, but I’ve included IQR if anyone wants something more
robust.

Also, using the percentile method backwards, you can declare the number
of clusters and use semantic chunking to get an ‘optimal’ splitting.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-02-26 13:50:48 -08:00
Jaskirat Singh
ce682f5a09 community: vectorstores.kdbai - Added support for when no docs are present (#18103)
- **Description:** By default it expects a list but that's not the case
in corner scenarios when there is no document ingested(use case:
Bootstrap application).
\
Hence added as check, if the instance is panda Dataframe instead of list
then it will procced with return immediately.

- **Issue:** NA
- **Dependencies:** NA
- **Twitter handle:**  jaskiratsingh1

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-02-26 12:47:06 -08:00
am-kinetica
9b8f6455b1 Langchain vectorstore integration with Kinetica (#18102)
- **Description:** New vectorstore integration with the Kinetica
database
  - **Issue:** 
- **Dependencies:** the Kinetica Python API `pip install
gpudb==7.2.0.1`,
  - **Tag maintainer:** @baskaryan, @hwchase17 
  - **Twitter handle:**

---------

Co-authored-by: Chad Juliano <cjuliano@kinetica.com>
2024-02-26 12:46:48 -08:00
Bagatur
1e8ab83d7b langchain[patch], core[patch], openai[patch], fireworks[minor]: ChatFireworks.with_structured_output (#18078)
<img width="1192" alt="Screenshot 2024-02-24 at 3 39 39 PM"
src="https://github.com/langchain-ai/langchain/assets/22008038/1cf74774-a23f-4b06-9b9b-85dfa2f75b63">
2024-02-26 12:46:39 -08:00
GoodBai
3589a135ef community: make SET allow_experimental_[engine]_index configurabe in vectorstores.clickhouse (#18107)
## Description & Issue
While following the official doc to use clickhouse as a vectorstore, I
found only the default `annoy` index is properly supported. But I want
to try another engine `usearch` for `annoy` is not properly supported on
ARM platforms.
Here is the settings I prefer:

``` python
settings = ClickhouseSettings(
    table="wiki_Ethereum",
    index_type="usearch",  # annoy by default
    index_param=[],
)
```
The above settings do not work for the command `set
allow_experimental_annoy_index=1` is hard-coded.
This PR will make sure the experimental feature follow the `index_type`
which is also consistent with Clickhouse's naming conventions.
2024-02-26 12:39:17 -08:00
Dan Stambler
69344a0661 community: Add Laser Embedding Integration (#18111)
- **Description:** Added Integration with Meta AI's LASER
Language-Agnostic SEntence Representations embedding library, which
supports multilingual embedding for any of the languages listed here:
https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200,
including several low resource languages
- **Dependencies:** laser_encoders
2024-02-26 12:16:37 -08:00
Erick Friis
257879e98d infra: api docs setup action location (#18148) 2024-02-26 11:50:21 -08:00
Erick Friis
28cf3aab45 infra: api docs build commit dir (#18147) 2024-02-26 11:47:04 -08:00
Heidi Steen
166f3d8351 Docs: azuresearch.ipynb (in docs/docs/integrations/vectorstores) -- fixed headings and comments (#18135)
This PR updates azuresearch.ipynb with an edit to the introduction
sentence, consistent heading levels, and disambiguation in code
comments.
2024-02-26 11:46:55 -08:00
Luan Fernandes
e867557936 [docs] Update doc-string for buffer_as_messages method in ConversationBufferWindowMemory (#18136)
minor fix stated in #18080
2024-02-26 11:46:43 -08:00
Barun Amalkumar Halder
23fc7c8c90 docs [patch] : fix import to use community path for handler in fiddler notebook (#18140)
**Description:** Update the example fiddler notebook to use community
path, instead of langchain.callback
**Dependencies:** None
**Twitter handle:** @bhalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-02-26 11:41:07 -08:00
Bagatur
767523f364 core[patch], langchain[patch], templates: move openai functions parsers to core (#18060)
![Screenshot 2024-02-23 at 7 48 03
PM](https://github.com/langchain-ai/langchain/assets/22008038/e5540c4d-0020-4ece-869f-ae19db2a1f3f)
2024-02-26 11:12:53 -08:00
Bagatur
96bff0ed5d infra: create api rst for specific pkg (#18144)
Example: create rst for libs/core only
```bash
poetry run python docs/api_reference/create_api_rst.py core
```
2024-02-26 11:04:22 -08:00
Nuno Campos
cd3ab3703b Improve runnable generator error messages (#18142)
h/t @hinthornw 

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-02-26 18:54:25 +00:00
Nuno Campos
62a30efb12 Fix bug with using configurable_fields after configurable_alternatives (#18139)
Closes #17915
2024-02-26 10:27:07 -08:00
Erick Friis
f5cf6975ba docs: anthropic partner package docs (#18109) 2024-02-26 17:51:44 +00:00
Nuno Campos
b1d9ce541d Add BaseMessage.id (#17835)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-02-26 09:27:47 -08:00
Harrison Chase
935aefa8db add run name for query constructor (#18101)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-26 08:17:05 -08:00
Mohammad Mohtashim
719a1cde75 langchain[patch]: Update doc-string for a method in ConversationBufferWindowMemory (#18090)
A minor doc fix stated in #18080
2024-02-26 10:15:02 -05:00
Simon Schmidt
2716d58603 langchain: Import from langchain_core in langchain.smith to avoid deprecation warning (#18129)
Avoids deprecation warning that triggered at import time, e.g. with
`python -c 'import langchain.smith'`


/opt/venv/lib/python3.12/site-packages/langchain/callbacks/__init__.py:37:
LangChainDeprecationWarning: Importing this callback from langchain is
deprecated. Importing it from langchain will no longer be supported as
of langchain==0.2.0. Please import from langchain-community instead:

    `from langchain_community.callbacks import base`.

To install langchain-community run `pip install -U langchain-community`.
2024-02-26 10:14:10 -05:00
rongchenlin
9147a437f1 docs: Fix the bug in MongoDBChatMessageHistory notebook (#18128)
I tried to configure MongoDBChatMessageHistory using the code from the
original documentation to store messages based on the passed session_id
in MongoDB. However, this configuration did not take effect, and the
session id in the database remained as 'test_session'. To resolve this
issue, I found that when configuring MongoDBChatMessageHistory, it is
necessary to set session_id=session_id instead of
session_id=test_session.

Issue: DOC: Ineffective Configuration of MongoDBChatMessageHistory for
Custom session_id Storage

previous code:
```python
chain_with_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: MongoDBChatMessageHistory(
        session_id="test_session",
        connection_string="mongodb://root:Y181491117cLj@123.56.224.232:27017",
        database_name="my_db",
        collection_name="chat_histories",
    ),
    input_messages_key="question",
    history_messages_key="history",
)
config = {"configurable": {"session_id": "mmm"}}
chain_with_history.invoke({"question": "Hi! I'm bob"}, config)
```

![image](https://github.com/langchain-ai/langchain/assets/83388493/c372f785-1ec1-43f5-8d01-b7cc07b806b7)


Modified code:
```python
chain_with_history = RunnableWithMessageHistory(
    chain,
    lambda session_id: MongoDBChatMessageHistory(
        session_id=session_id,   # here is my modify code
        connection_string="mongodb://root:Y181491117cLj@123.56.224.232:27017",
        database_name="my_db",
        collection_name="chat_histories",
    ),
    input_messages_key="question",
    history_messages_key="history",
)
config = {"configurable": {"session_id": "mmm"}}
chain_with_history.invoke({"question": "Hi! I'm bob"}, config)
```

Effect after modification (it works):


![image](https://github.com/langchain-ai/langchain/assets/83388493/5776268c-9098-4da3-bf41-52825be5fafb)
2024-02-26 15:02:56 +00:00
Erick Friis
e3b7779926 docs: api docs for external repos (#17904)
Stacked on google removal PR. Will make google continue to show up in
API docs even from external repo
2024-02-26 06:19:09 +00:00
Erick Friis
248c5b84ee google-genai, google-vertexai: move to langchain-google (#17899)
These packages have moved to
https://github.com/langchain-ai/langchain-google

Left tombstone readmes incase anyone ends up at the "Source Code" link
from old pypi releases. Can keep these around for a few months.
2024-02-25 21:58:05 -08:00
Erick Friis
3b5bdbfee8 anthropic[minor]: package move (#17974) 2024-02-25 21:57:26 -08:00
Christophe Bornet
a2d5fa7649 community[patch]: Fix GenericRequestsWrapper _aget_resp_content must be async (#18065)
There are existing tests in
`libs/community/tests/unit_tests/tools/requests/test_tool.py`
2024-02-25 19:07:07 -08:00
Neli Hateva
a01e8473f8 community[patch]: Fix GraphSparqlQAChain so that it works with Ontotext GraphDB (#15009)
- **Description:** Introduce a new parameter `graph_kwargs` to
`RdfGraph` - parameters used to initialize the `rdflib.Graph` if
`query_endpoint` is set. Also, do not set
`rdflib.graph.DATASET_DEFAULT_GRAPH_ID` as default value for the
`rdflib.Graph` `identifier` if `query_endpoint` is set.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** N/A
2024-02-25 19:05:21 -08:00
Christophe Bornet
4d6cd5b46a astradb[patch]: Use astrapy's upsert_one method in AstraDBStore (#18063)
As `upsert` is deprecated
2024-02-25 19:04:18 -08:00
Danny McAteer
e42110f720 docs: Additional examples for partners/exa README (#18081)
**Description:** Add additional examples for other modules to
partners/exa README
**Issue:** #17545
**Dependencies:** None
**Twitter handle:** @DannyMcAteer8

---------

Co-authored-by: Daniel McAteer <danielmcateer@Daniels-MBP.attlocal.net>
Co-authored-by: Daniel McAteer <danielmcateer@Daniels-MacBook-Pro.local>
2024-02-25 18:53:47 -08:00
dokato
5afb242161 langchain[patch]: Make BooleanOutputParser more robust to non-binary responses (#17810)
- **Description:** I encountered this error when I tried to use
LLMChainFilter. Even if the message slightly differs, like `Not relevant
(NO)` this results in an error. It has been reported already here:
https://github.com/langchain-ai/langchain/issues/. This change hopefully
makes it more robust.
- **Issue:**  #11408 
- **Dependencies:** No
- **Twitter handle:** dokatox
2024-02-25 18:48:33 -08:00
Matt
3b08617a89 docs: update azure search langchain notebook (#18053)
**Description:** Update the azure search notebook to have more
descriptive comments, and an option to choose between OpenAI and
AzureOpenAI Embeddings

---------

Co-authored-by: Matt Gotteiner <[email protected]>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-25 18:48:13 -08:00
kYLe
17ecf6e119 community[patch]: Remove model limitation on Anyscale LLM (#17662)
**Description:** Llama Guard is deprecated from Anyscale public
endpoint.
**Issue:** Change the default model. and remove the limitation of only
use Llama Guard with Anyscale LLMs
Anyscale LLM can also works with all other Chat model hosted on
Anyscale.
Also added `async_client` for Anyscale LLM
2024-02-25 18:21:19 -08:00
Barun Amalkumar Halder
cc69976860 community[minor] : adds callback handler for Fiddler AI (#17708)
**Description:**  Callback handler to integrate fiddler with langchain. 
This PR adds the following -

1. `FiddlerCallbackHandler` implementation into langchain/community
2. Example notebook `fiddler.ipynb` for usage documentation

[Internal Tracker : FDL-14305]

**Issue:** 
NA

**Dependencies:** 
- Installation of langchain-community is unaffected.
- Usage of FiddlerCallbackHandler requires installation of latest
fiddler-client (2.5+)

**Twitter handle:** @fiddlerlabs @behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-02-25 18:17:03 -08:00
Christophe Bornet
b8b5ce0c8c astradb: Add AstraDBChatMessageHistory to langchain-astradb package (#17732)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-25 18:14:49 -08:00
Maxime Perrin
c06a8732aa community[patch]: fix llama index imports and fields access (#17870)
- **Description:** Fixing outdated imports after v0.10 llama index
update and updating metadata and source text access
  - **Issue:** #17860
  - **Twitter handle:** @maximeperrin_

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-02-25 18:14:23 -08:00
BeatrixCohere
5d2d80a9a8 docs: Add Cohere examples in documentation (#17794)
- Description: Add cohere examples to documentation 
- Issue:N/A
- Dependencies: N/A

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-25 18:10:09 -08:00
Jacob Lee
c9eac3287e docs[patch]: Remove redundant Pinecone import (#18079)
CC @efriis
2024-02-24 19:27:54 -08:00
2jimoo
7fc903464a community: Add document manager and mongo document manager (#17320)
- **Description:** 
    - Add DocumentManager class, which is a nosql record manager. 
- In order to use index and aindex in
libs/langchain/langchain/indexes/_api.py, DocumentManager inherits
RecordManager.
    - Also I added the MongoDB implementation of Document Manager too.
  - **Dependencies:** pymongo, motor
  
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
- **Description:** Add DocumentManager class, which is a no sql record
manager. To use index method and aindex method in indexes._api.py,
Document Manager inherits RecordManager.Add the MongoDB implementation
of Document Manager.
  - **Dependencies:** pymongo, motor

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-23 21:32:52 -05:00
Leonid Ganeline
3f6bf852ea experimental: docstrings update (#18048)
Added missed docstrings. Formatted docsctrings to the consistent format.
2024-02-23 21:24:16 -05:00
kYLe
56b955fc31 community[minor]: Add async_client for Anyscale Chat model (#18050)
Add `async_client` for Anyscale Chat_model
2024-02-23 21:22:54 -05:00
Eugene Yurtsev
68527b809d core[patch]: Runnable with message history to use add_messages (#17958)
This PR updates RunnableWithMessageHistory to use add_messages which
will save on round-trips for any chat
history abstractions that implement the optimization. If the
optimization isn't
implemented, add_messages automatically invokes add_message serially.
2024-02-23 21:19:38 -05:00
Bagatur
1c1bb1152e openai[patch]: refactor with_structured_output (#18052)
- make schema Optional with default val None, since in json_mode you
don't need it if not parsing to pydantic
- change return_type -> include_raw
- expand docstring examples
2024-02-23 17:02:11 -08:00
Erick Friis
e85948d46b docs: fireworks tool calling docs (#18057) 2024-02-24 00:49:11 +00:00
Erick Friis
e566a3077e infra: simplify and fix CI for docs-only changes (#18058)
Current success check will fail on docs-only changes
2024-02-23 16:39:08 -08:00
Erick Friis
1a3383fba1 docs: fireworks fixes (#18056) 2024-02-23 15:58:53 -08:00
Erick Friis
a05fb19f42 openai[patch]: remove numpy dep (#18034) 2024-02-23 21:12:05 +00:00
Danny McAteer
e8be34f8c7 exa[patch]: update readme (#18047) 2024-02-23 21:05:42 +00:00
Yufei (Benny) Chen
ee6a773456 fireworks[patch]: Add Fireworks partner packages (#17694)
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-23 20:45:47 +00:00
Erick Friis
11cf95e810 docs: recommend lambdas over runnablebranch (#18033) 2024-02-23 11:34:27 -08:00
Erick Friis
9ebbca3695 infra: CI success for partner packages 2 (#18043) 2024-02-23 11:10:39 -08:00
Erick Friis
b948f6da67 infra: CI success for partner packages (#18037)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-02-23 11:00:48 -08:00
Bagatur
22b964f802 community[patch]: Release 0.0.24 (#18038) 2024-02-23 10:49:29 -08:00
2029 changed files with 199134 additions and 97050 deletions

View File

@@ -3,18 +3,18 @@ body:
- type: markdown
attributes:
value: |
Thanks for your interest in 🦜️🔗 LangChain!
Thanks for your interest in LangChain 🦜️🔗!
Please follow these instructions, fill every question, and do every step. 🙏
We're asking for this because answering questions and solving problems in GitHub takes a lot of time --
this is time that we cannot spend on adding new features, fixing bugs, write documentation or reviewing pull requests.
this is time that we cannot spend on adding new features, fixing bugs, writing documentation or reviewing pull requests.
By asking questions in a structured way (following this) it will be much easier to help you.
By asking questions in a structured way (following this) it will be much easier for us to help you.
And there's a high chance that you will find the solution along the way and you won't even have to submit it and wait for an answer. 😎
There's a high chance that by following this process, you'll find the solution on your own, eliminating the need to submit a question and wait for an answer. 😎
As there are too many questions, we will **DISCARD** and close the incomplete ones.
As there are many questions submitted every day, we will **DISCARD** and close the incomplete ones.
That will allow us (and others) to focus on helping people like you that follow the whole process. 🤓

View File

@@ -35,6 +35,8 @@ body:
required: true
- label: I am sure that this is a bug in LangChain rather than my code.
required: true
- label: The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
required: true
- type: textarea
id: reproduction
validations:

View File

@@ -9,7 +9,7 @@ body:
If you are not a LangChain maintainer or were not asked directly by a maintainer to create an issue, then please start the conversation in a [Question in GitHub Discussions](https://github.com/langchain-ai/langchain/discussions/categories/q-a) instead.
You are a LangChain maintainer if you maintain any of the packages inside of the LangChain repository
or are a regular contributor to LangChain with previous merged merged pull requests.
or are a regular contributor to LangChain with previous merged pull requests.
- type: checkboxes
id: privileged
attributes:

View File

@@ -1,17 +1,24 @@
import json
import sys
import os
from typing import Dict
LANGCHAIN_DIRS = {
LANGCHAIN_DIRS = [
"libs/core",
"libs/text-splitters",
"libs/community",
"libs/langchain",
"libs/experimental",
"libs/community",
}
]
if __name__ == "__main__":
files = sys.argv[1:]
dirs_to_run = set()
dirs_to_run: Dict[str, set] = {
"lint": set(),
"test": set(),
"extended-test": set(),
}
if len(files) == 300:
# max diff length is 300 files - there are likely files missing
@@ -24,31 +31,49 @@ if __name__ == "__main__":
".github/workflows",
".github/tools",
".github/actions",
"libs/core",
".github/scripts/check_diff.py",
)
):
dirs_to_run.update(LANGCHAIN_DIRS)
elif "libs/community" in file:
dirs_to_run.update(
("libs/community", "libs/langchain", "libs/experimental")
)
elif "libs/partners" in file:
partner_dir = file.split("/")[2]
if os.path.isdir(f"libs/partners/{partner_dir}"):
dirs_to_run.add(f"libs/partners/{partner_dir}")
# Skip if the directory was deleted
elif "libs/langchain" in file:
dirs_to_run.update(("libs/langchain", "libs/experimental"))
elif "libs/experimental" in file:
dirs_to_run.add("libs/experimental")
elif file.startswith("libs/"):
dirs_to_run.update(LANGCHAIN_DIRS)
else:
pass
json_output = json.dumps(list(dirs_to_run))
print(f"dirs-to-run={json_output}") # noqa: T201
# add all LANGCHAIN_DIRS for infra changes
dirs_to_run["extended-test"].update(LANGCHAIN_DIRS)
dirs_to_run["lint"].add(".")
extended_test_dirs = [d for d in dirs_to_run if not d.startswith("libs/partners")]
json_output_extended = json.dumps(extended_test_dirs)
print(f"dirs-to-run-extended={json_output_extended}") # noqa: T201
if any(file.startswith(dir_) for dir_ in LANGCHAIN_DIRS):
# add that dir and all dirs after in LANGCHAIN_DIRS
# for extended testing
found = False
for dir_ in LANGCHAIN_DIRS:
if file.startswith(dir_):
found = True
if found:
dirs_to_run["extended-test"].add(dir_)
elif file.startswith("libs/cli"):
# todo: add cli makefile
pass
elif file.startswith("libs/partners"):
partner_dir = file.split("/")[2]
if os.path.isdir(f"libs/partners/{partner_dir}") and [
filename
for filename in os.listdir(f"libs/partners/{partner_dir}")
if not filename.startswith(".")
] != ["README.md"]:
dirs_to_run["test"].add(f"libs/partners/{partner_dir}")
# Skip if the directory was deleted or is just a tombstone readme
elif file.startswith("libs/"):
raise ValueError(
f"Unknown lib: {file}. check_diff.py likely needs "
"an update for this new library!"
)
elif any(file.startswith(p) for p in ["docs/", "templates/", "cookbook/"]):
dirs_to_run["lint"].add(".")
outputs = {
"dirs-to-lint": list(
dirs_to_run["lint"] | dirs_to_run["test"] | dirs_to_run["extended-test"]
),
"dirs-to-test": list(dirs_to_run["test"] | dirs_to_run["extended-test"]),
"dirs-to-extended-test": list(dirs_to_run["extended-test"]),
}
for key, value in outputs.items():
json_output = json.dumps(value)
print(f"{key}={json_output}") # noqa: T201

View File

@@ -4,7 +4,12 @@ import tomllib
from packaging.version import parse as parse_version
import re
MIN_VERSION_LIBS = ["langchain-core", "langchain-community", "langchain"]
MIN_VERSION_LIBS = [
"langchain-core",
"langchain-community",
"langchain",
"langchain-text-splitters",
]
def get_min_version(version: str) -> str:
@@ -56,12 +61,13 @@ def get_min_version_from_toml(toml_path: str):
return min_versions
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
if __name__ == "__main__":
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
# Call the function to get the minimum versions
min_versions = get_min_version_from_toml(toml_file)
# Call the function to get the minimum versions
min_versions = get_min_version_from_toml(toml_file)
print(
" ".join([f"{lib}=={version}" for lib, version in min_versions.items()])
) # noqa: T201
print(
" ".join([f"{lib}=={version}" for lib, version in min_versions.items()])
) # noqa: T201

View File

@@ -63,6 +63,8 @@ jobs:
- name: Install the opposite major version of pydantic
# If normal tests use pydantic v1, here we'll use v2, and vice versa.
shell: bash
# airbyte currently doesn't support pydantic v2
if: ${{ !startsWith(inputs.working-directory, 'libs/partners/airbyte') }}
run: |
# Determine the major part of pydantic version
REGULAR_VERSION=$(poetry run python -c "import pydantic; print(pydantic.__version__)" | cut -d. -f1)
@@ -97,6 +99,8 @@ jobs:
fi
echo "Found pydantic version ${CURRENT_VERSION}, as expected"
- name: Run pydantic compatibility tests
# airbyte currently doesn't support pydantic v2
if: ${{ !startsWith(inputs.working-directory, 'libs/partners/airbyte') }}
shell: bash
run: make test

View File

@@ -70,6 +70,13 @@ jobs:
ASTRA_DB_API_ENDPOINT: ${{ secrets.ASTRA_DB_API_ENDPOINT }}
ASTRA_DB_APPLICATION_TOKEN: ${{ secrets.ASTRA_DB_APPLICATION_TOKEN }}
ASTRA_DB_KEYSPACE: ${{ secrets.ASTRA_DB_KEYSPACE }}
ES_URL: ${{ secrets.ES_URL }}
ES_CLOUD_ID: ${{ secrets.ES_CLOUD_ID }}
ES_API_KEY: ${{ secrets.ES_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # for airbyte
MONGODB_ATLAS_URI: ${{ secrets.MONGODB_ATLAS_URI }}
VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
run: |
make integration_tests

View File

@@ -55,7 +55,7 @@ jobs:
working-directory: ${{ inputs.working-directory }}
- name: Upload build
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -157,6 +157,24 @@ jobs:
run: make tests
working-directory: ${{ inputs.working-directory }}
- name: Get minimum versions
working-directory: ${{ inputs.working-directory }}
id: min-version
run: |
poetry run pip install packaging
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
- name: Run unit tests with minimum dependency versions
if: ${{ steps.min-version.outputs.min-versions != '' }}
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
poetry run pip install $MIN_VERSIONS
make tests
working-directory: ${{ inputs.working-directory }}
- name: 'Authenticate to Google Cloud'
id: 'auth'
uses: google-github-actions/auth@v2
@@ -191,27 +209,16 @@ jobs:
ASTRA_DB_API_ENDPOINT: ${{ secrets.ASTRA_DB_API_ENDPOINT }}
ASTRA_DB_APPLICATION_TOKEN: ${{ secrets.ASTRA_DB_APPLICATION_TOKEN }}
ASTRA_DB_KEYSPACE: ${{ secrets.ASTRA_DB_KEYSPACE }}
ES_URL: ${{ secrets.ES_URL }}
ES_CLOUD_ID: ${{ secrets.ES_CLOUD_ID }}
ES_API_KEY: ${{ secrets.ES_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # for airbyte
MONGODB_ATLAS_URI: ${{ secrets.MONGODB_ATLAS_URI }}
VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
run: make integration_tests
working-directory: ${{ inputs.working-directory }}
- name: Get minimum versions
working-directory: ${{ inputs.working-directory }}
id: min-version
run: |
poetry run pip install packaging
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
- name: Run unit tests with minimum dependency versions
if: ${{ steps.min-version.outputs.min-versions != '' }}
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
poetry run pip install $MIN_VERSIONS
make tests
working-directory: ${{ inputs.working-directory }}
publish:
needs:
- build
@@ -241,7 +248,7 @@ jobs:
working-directory: ${{ inputs.working-directory }}
cache-key: release
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -280,7 +287,7 @@ jobs:
working-directory: ${{ inputs.working-directory }}
cache-key: release
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: dist
path: ${{ inputs.working-directory }}/dist/

50
.github/workflows/_test_doc_imports.yml vendored Normal file
View File

@@ -0,0 +1,50 @@
name: test_doc_imports
on:
workflow_call:
env:
POETRY_VERSION: "1.7.1"
jobs:
build:
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.11"
name: "check doc imports #${{ matrix.python-version }}"
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
cache-key: core
- name: Install dependencies
shell: bash
run: poetry install --with test
- name: Install langchain editable
run: |
poetry run pip install -e libs/core libs/langchain libs/community libs/experimental
- name: Check doc imports
shell: bash
run: |
poetry run python docs/scripts/check_imports.py
- name: Ensure the test did not create any additional files
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -48,7 +48,7 @@ jobs:
working-directory: ${{ inputs.working-directory }}
- name: Upload build
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: test-dist
path: ${{ inputs.working-directory }}/dist/
@@ -76,7 +76,7 @@ jobs:
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: test-dist
path: ${{ inputs.working-directory }}/dist/

View File

@@ -1,52 +0,0 @@
name: API docs build
on:
workflow_dispatch:
schedule:
- cron: '0 13 * * *'
env:
POETRY_VERSION: "1.7.1"
PYTHON_VERSION: "3.10"
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
ref: bagatur/api_docs_build
- name: Set Git config
run: |
git config --local user.email "actions@github.com"
git config --local user.name "Github Actions"
- name: Merge master
run: |
git fetch origin master
git merge origin/master -m "Merge master" --allow-unrelated-histories -X theirs
- name: Set up Python ${{ env.PYTHON_VERSION }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ env.PYTHON_VERSION }}
poetry-version: ${{ env.POETRY_VERSION }}
cache-key: api-docs
- name: Install dependencies
run: |
poetry run python -m pip install --upgrade --no-cache-dir pip setuptools
poetry run python -m pip install --upgrade --no-cache-dir sphinx readthedocs-sphinx-ext
poetry run python -m pip install ./libs/partners/*
poetry run python -m pip install --exists-action=w --no-cache-dir -r docs/api_reference/requirements.txt
- name: Build docs
run: |
poetry run python -m pip install --upgrade --no-cache-dir pip setuptools
poetry run python docs/api_reference/create_api_rst.py
poetry run python -m sphinx -T -E -b html -d _build/doctrees -c docs/api_reference docs/api_reference api_reference_build/html -j auto
# https://github.com/marketplace/actions/add-commit
- uses: EndBug/add-and-commit@v9
with:
message: 'Update API docs build'

View File

@@ -0,0 +1,24 @@
name: Check Broken Links
on:
workflow_dispatch:
schedule:
- cron: '0 13 * * *'
jobs:
check-links:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Use Node.js 18.x
uses: actions/setup-node@v3
with:
node-version: 18.x
cache: "yarn"
cache-dependency-path: ./docs/yarn.lock
- name: Install dependencies
run: yarn install --immutable --mode=skip-build
working-directory: ./docs
- name: Check broken links
run: yarn check-broken-links
working-directory: ./docs

View File

@@ -33,14 +33,16 @@ jobs:
run: |
python .github/scripts/check_diff.py ${{ steps.files.outputs.all }} >> $GITHUB_OUTPUT
outputs:
dirs-to-run: ${{ steps.set-matrix.outputs.dirs-to-run }}
dirs-to-run-extended: ${{ steps.set-matrix.outputs.dirs-to-run-extended }}
dirs-to-lint: ${{ steps.set-matrix.outputs.dirs-to-lint }}
dirs-to-test: ${{ steps.set-matrix.outputs.dirs-to-test }}
dirs-to-extended-test: ${{ steps.set-matrix.outputs.dirs-to-extended-test }}
lint:
name: cd ${{ matrix.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-lint != '[]' }}
strategy:
matrix:
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-run) }}
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-lint) }}
uses: ./.github/workflows/_lint.yml
with:
working-directory: ${{ matrix.working-directory }}
@@ -49,20 +51,28 @@ jobs:
test:
name: cd ${{ matrix.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-test != '[]' }}
strategy:
matrix:
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-run) }}
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-test) }}
uses: ./.github/workflows/_test.yml
with:
working-directory: ${{ matrix.working-directory }}
secrets: inherit
test_doc_imports:
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-test != '[]' }}
uses: ./.github/workflows/_test_doc_imports.yml
secrets: inherit
compile-integration-tests:
name: cd ${{ matrix.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-test != '[]' }}
strategy:
matrix:
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-run) }}
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-test) }}
uses: ./.github/workflows/_compile_integration_test.yml
with:
working-directory: ${{ matrix.working-directory }}
@@ -71,9 +81,10 @@ jobs:
dependencies:
name: cd ${{ matrix.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-test != '[]' }}
strategy:
matrix:
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-run) }}
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-test) }}
uses: ./.github/workflows/_dependencies.yml
with:
working-directory: ${{ matrix.working-directory }}
@@ -82,10 +93,11 @@ jobs:
extended-tests:
name: "cd ${{ matrix.working-directory }} / make extended_tests #${{ matrix.python-version }}"
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-extended-test != '[]' }}
strategy:
matrix:
# note different variable for extended test dirs
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-run-extended) }}
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-extended-test) }}
python-version:
- "3.8"
- "3.9"

View File

@@ -32,6 +32,6 @@ jobs:
- name: Codespell
uses: codespell-project/actions-codespell@v2
with:
skip: guide_imports.json,*.ambr
skip: guide_imports.json,*.ambr,./cookbook/data/imdb_top_1000.csv,*.lock
ignore_words_list: ${{ steps.extract_ignore_words.outputs.ignore_words_list }}
exclude_file: libs/community/langchain_community/llms/yuan2.py

View File

@@ -1,37 +0,0 @@
---
name: CI / cd .
on:
push:
branches: [ master ]
pull_request:
paths:
- 'docs/**'
- 'templates/**'
- 'cookbook/**'
- '.github/workflows/_lint.yml'
- '.github/workflows/doc_lint.yml'
workflow_dispatch:
jobs:
check:
name: Check for "from langchain import x" imports
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Run import check
run: |
# We should not encourage imports directly from main init file
# Expect for hub
git grep 'from langchain import' {docs/docs,templates,cookbook} | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
lint:
name: "-"
uses:
./.github/workflows/_lint.yml
with:
working-directory: "."
secrets: inherit

6
.gitignore vendored
View File

@@ -115,13 +115,11 @@ celerybeat.pid
# Environments
.env
.envrc
.venv
.venvs
.venv*
venv*
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject

View File

@@ -1,44 +1,56 @@
.PHONY: all clean docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck
.PHONY: all clean help docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck spell_check spell_fix lint lint_package lint_tests format format_diff
# Default target executed when no arguments are given to make.
## help: Show this help info.
help: Makefile
@printf "\n\033[1mUsage: make <TARGETS> ...\033[0m\n\n\033[1mTargets:\033[0m\n\n"
@sed -n 's/^##//p' $< | awk -F':' '{printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}' | sort | sed -e 's/^/ /'
## all: Default target, shows help.
all: help
## clean: Clean documentation and API documentation artifacts.
clean: docs_clean api_docs_clean
######################
# DOCUMENTATION
######################
clean: docs_clean api_docs_clean
## docs_build: Build the documentation.
docs_build:
docs/.local_build.sh
## docs_clean: Clean the documentation build artifacts.
docs_clean:
@if [ -d _dist ]; then \
rm -r _dist; \
echo "Directory _dist has been cleaned."; \
rm -r _dist; \
echo "Directory _dist has been cleaned."; \
else \
echo "Nothing to clean."; \
echo "Nothing to clean."; \
fi
## docs_linkcheck: Run linkchecker on the documentation.
docs_linkcheck:
poetry run linkchecker _dist/docs/ --ignore-url node_modules
## api_docs_build: Build the API Reference documentation.
api_docs_build:
poetry run python docs/api_reference/create_api_rst.py
cd docs/api_reference && poetry run make html
## api_docs_clean: Clean the API Reference documentation build artifacts.
api_docs_clean:
rm -f docs/api_reference/api_reference.rst
find ./docs/api_reference -name '*_api_reference.rst' -delete
cd docs/api_reference && poetry run make clean
## api_docs_linkcheck: Run linkchecker on the API Reference documentation.
api_docs_linkcheck:
poetry run linkchecker docs/api_reference/_build/html/index.html
## spell_check: Run codespell on the project.
spell_check:
poetry run codespell --toml pyproject.toml
## spell_fix: Run codespell on the project and fix the errors.
spell_fix:
poetry run codespell --toml pyproject.toml -w
@@ -46,29 +58,14 @@ spell_fix:
# LINTING AND FORMATTING
######################
## lint: Run linting on the project.
lint lint_package lint_tests:
poetry run ruff docs templates cookbook
poetry run ruff format docs templates cookbook --diff
poetry run ruff --select I docs templates cookbook
git grep 'from langchain import' docs/docs templates cookbook | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
## format: Format the project files.
format format_diff:
poetry run ruff format docs templates cookbook
poetry run ruff --select I --fix docs templates cookbook
######################
# HELP
######################
help:
@echo '===================='
@echo '-- DOCUMENTATION --'
@echo 'clean - run docs_clean and api_docs_clean'
@echo 'docs_build - build the documentation'
@echo 'docs_clean - clean the documentation build artifacts'
@echo 'docs_linkcheck - run linkchecker on the documentation'
@echo 'api_docs_build - build the API Reference documentation'
@echo 'api_docs_clean - clean the API Reference documentation build artifacts'
@echo 'api_docs_linkcheck - run linkchecker on the API Reference documentation'
@echo 'spell_check - run codespell on the project'
@echo 'spell_fix - run codespell on the project and fix the errors'
@echo '-- TEST and LINT tasks are within libs/*/ per-package --'

View File

@@ -50,7 +50,7 @@ The LangChain libraries themselves are made up of several different packages.
- **[`langchain-community`](libs/community)**: Third party integrations.
- **[`langchain`](libs/langchain)**: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
![Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.](docs/static/img/langchain_stack.png "LangChain Architecture Overview")
![Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.](docs/static/svg/langchain_stack.svg "LangChain Architecture Overview")
## 🧱 What can you build with LangChain?
**❓ Retrieval augmented generation**

View File

@@ -1,6 +1,61 @@
# Security Policy
## Reporting a Vulnerability
## Reporting OSS Vulnerabilities
Please report security vulnerabilities by email to `security@langchain.dev`.
This email is an alias to a subset of our maintainers, and will ensure the issue is promptly triaged and acted upon as needed.
LangChain is partnered with [huntr by Protect AI](https://huntr.com/) to provide
a bounty program for our open source projects.
Please report security vulnerabilities associated with the LangChain
open source projects by visiting the following link:
[https://huntr.com/bounties/disclose/](https://huntr.com/bounties/disclose/?target=https%3A%2F%2Fgithub.com%2Flangchain-ai%2Flangchain&validSearch=true)
Before reporting a vulnerability, please review:
1) In-Scope Targets and Out-of-Scope Targets below.
2) The [langchain-ai/langchain](https://python.langchain.com/docs/contributing/repo_structure) monorepo structure.
3) LangChain [security guidelines](https://python.langchain.com/docs/security) to
understand what we consider to be a security vulnerability vs. developer
responsibility.
### In-Scope Targets
The following packages and repositories are eligible for bug bounties:
- langchain-core
- langchain (see exceptions)
- langchain-community (see exceptions)
- langgraph
- langserve
### Out of Scope Targets
All out of scope targets defined by huntr as well as:
- **langchain-experimental**: This repository is for experimental code and is not
eligible for bug bounties, bug reports to it will be marked as interesting or waste of
time and published with no bounty attached.
- **tools**: Tools in either langchain or langchain-community are not eligible for bug
bounties. This includes the following directories
- langchain/tools
- langchain-community/tools
- Please review our [security guidelines](https://python.langchain.com/docs/security)
for more details, but generally tools interact with the real world. Developers are
expected to understand the security implications of their code and are responsible
for the security of their tools.
- Code documented with security notices. This will be decided done on a case by
case basis, but likely will not be eligible for a bounty as the code is already
documented with guidelines for developers that should be followed for making their
application secure.
- Any LangSmith related repositories or APIs see below.
## Reporting LangSmith Vulnerabilities
Please report security vulnerabilities associated with LangSmith by email to `security@langchain.dev`.
- LangSmith site: https://smith.langchain.com
- SDK client: https://github.com/langchain-ai/langsmith-sdk
### Other Security Concerns
For any other security concerns, please contact us at `security@langchain.dev`.

View File

@@ -0,0 +1,932 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "BYejgj8Zf-LG",
"tags": []
},
"source": [
"## Getting started with LangChain and Gemma, running locally or in the Cloud"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2IxjMb9-jIJ8"
},
"source": [
"### Installing dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 9436,
"status": "ok",
"timestamp": 1708975187360,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "XZaTsXfcheTF",
"outputId": "eb21d603-d824-46c5-f99f-087fb2f618b1",
"tags": []
},
"outputs": [],
"source": [
"!pip install --upgrade langchain langchain-google-vertexai"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "IXmAujvC3Kwp"
},
"source": [
"### Running the model"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CI8Elyc5gBQF"
},
"source": [
"Go to the VertexAI Model Garden on Google Cloud [console](https://pantheon.corp.google.com/vertex-ai/publishers/google/model-garden/335), and deploy the desired version of Gemma to VertexAI. It will take a few minutes, and after the endpoint it ready, you need to copy its number."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "gv1j8FrVftsC"
},
"outputs": [],
"source": [
"# @title Basic parameters\n",
"project: str = \"PUT_YOUR_PROJECT_ID_HERE\" # @param {type:\"string\"}\n",
"endpoint_id: str = \"PUT_YOUR_ENDPOINT_ID_HERE\" # @param {type:\"string\"}\n",
"location: str = \"PUT_YOUR_ENDPOINT_LOCAtION_HERE\" # @param {type:\"string\"}"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"executionInfo": {
"elapsed": 3,
"status": "ok",
"timestamp": 1708975440503,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "bhIHsFGYjtFt",
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-02-27 17:15:10.457149: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
"2024-02-27 17:15:10.508925: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n",
"2024-02-27 17:15:10.508957: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n",
"2024-02-27 17:15:10.510289: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
"2024-02-27 17:15:10.518898: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
"To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
]
}
],
"source": [
"from langchain_google_vertexai import (\n",
" GemmaChatVertexAIModelGarden,\n",
" GemmaVertexAIModelGarden,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"executionInfo": {
"elapsed": 351,
"status": "ok",
"timestamp": 1708975440852,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "WJv-UVWwh0lk",
"tags": []
},
"outputs": [],
"source": [
"llm = GemmaVertexAIModelGarden(\n",
" endpoint_id=endpoint_id,\n",
" project=project,\n",
" location=location,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 714,
"status": "ok",
"timestamp": 1708975441564,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "6kM7cEFdiN9h",
"outputId": "fb420c56-5614-4745-cda8-0ee450a3e539",
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Prompt:\n",
"What is the meaning of life?\n",
"Output:\n",
" Who am I? Why do I exist? These are questions I have struggled with\n"
]
}
],
"source": [
"output = llm.invoke(\"What is the meaning of life?\")\n",
"print(output)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "zzep9nfmuUcO"
},
"source": [
"We can also use Gemma as a multi-turn chat model:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"executionInfo": {
"elapsed": 964,
"status": "ok",
"timestamp": 1708976298189,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "8tPHoM5XiZOl",
"outputId": "7b8fb652-9aed-47b0-c096-aa1abfc3a2a9",
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content='Prompt:\\n<start_of_turn>user\\nHow much is 2+2?<end_of_turn>\\n<start_of_turn>model\\nOutput:\\n8-years old.<end_of_turn>\\n\\n<start_of'\n",
"content='Prompt:\\n<start_of_turn>user\\nHow much is 2+2?<end_of_turn>\\n<start_of_turn>model\\nPrompt:\\n<start_of_turn>user\\nHow much is 2+2?<end_of_turn>\\n<start_of_turn>model\\nOutput:\\n8-years old.<end_of_turn>\\n\\n<start_of<end_of_turn>\\n<start_of_turn>user\\nHow much is 3+3?<end_of_turn>\\n<start_of_turn>model\\nOutput:\\nOutput:\\n3-years old.<end_of_turn>\\n\\n<'\n"
]
}
],
"source": [
"from langchain_core.messages import HumanMessage\n",
"\n",
"llm = GemmaChatVertexAIModelGarden(\n",
" endpoint_id=endpoint_id,\n",
" project=project,\n",
" location=location,\n",
")\n",
"\n",
"message1 = HumanMessage(content=\"How much is 2+2?\")\n",
"answer1 = llm.invoke([message1])\n",
"print(answer1)\n",
"\n",
"message2 = HumanMessage(content=\"How much is 3+3?\")\n",
"answer2 = llm.invoke([message1, answer1, message2])\n",
"\n",
"print(answer2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can post-process response to avoid repetitions:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content='Output:\\n<<humming>>: 2+2 = 4.\\n<end'\n",
"content='Output:\\nOutput:\\n<<humming>>: 3+3 = 6.'\n"
]
}
],
"source": [
"answer1 = llm.invoke([message1], parse_response=True)\n",
"print(answer1)\n",
"\n",
"answer2 = llm.invoke([message1, answer1, message2], parse_response=True)\n",
"\n",
"print(answer2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "VEfjqo7fjARR"
},
"source": [
"## Running Gemma locally from Kaggle"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "gVW8QDzHu7TA"
},
"source": [
"In order to run Gemma locally, you can download it from Kaggle first. In order to do this, you'll need to login into the Kaggle platform, create a API key and download a `kaggle.json` Read more about Kaggle auth [here](https://www.kaggle.com/docs/api)."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "S1EsXQ3XvZkQ"
},
"source": [
"### Installation"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"executionInfo": {
"elapsed": 335,
"status": "ok",
"timestamp": 1708976305471,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "p8SMwpKRvbef",
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/lib/python3.10/pty.py:89: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.\n",
" pid, fd = os.forkpty()\n"
]
}
],
"source": [
"!mkdir -p ~/.kaggle && cp kaggle.json ~/.kaggle/kaggle.json"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"executionInfo": {
"elapsed": 7802,
"status": "ok",
"timestamp": 1708976363010,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "Yr679aePv9Fq",
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/opt/conda/lib/python3.10/pty.py:89: RuntimeWarning: os.fork() was called. os.fork() is incompatible with multithreaded code, and JAX is multithreaded, so this will likely lead to a deadlock.\n",
" pid, fd = os.forkpty()\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n",
"tensorstore 0.1.54 requires ml-dtypes>=0.3.1, but you have ml-dtypes 0.2.0 which is incompatible.\u001b[0m\u001b[31m\n",
"\u001b[0m"
]
}
],
"source": [
"!pip install keras>=3 keras_nlp"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "E9zn8nYpv3QZ"
},
"source": [
"### Usage"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"executionInfo": {
"elapsed": 8536,
"status": "ok",
"timestamp": 1708976601206,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "0LFRmY8TjCkI",
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-02-27 16:38:40.797559: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
"2024-02-27 16:38:40.848444: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n",
"2024-02-27 16:38:40.848478: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n",
"2024-02-27 16:38:40.849728: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
"2024-02-27 16:38:40.857936: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
"To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
]
}
],
"source": [
"from langchain_google_vertexai import GemmaLocalKaggle"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "v-o7oXVavdMQ"
},
"source": [
"You can specify the keras backend (by default it's `tensorflow`, but you can change it be `jax` or `torch`)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"executionInfo": {
"elapsed": 9,
"status": "ok",
"timestamp": 1708976601206,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "vvTUH8DNj5SF",
"tags": []
},
"outputs": [],
"source": [
"# @title Basic parameters\n",
"keras_backend: str = \"jax\" # @param {type:\"string\"}\n",
"model_name: str = \"gemma_2b_en\" # @param {type:\"string\"}"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"executionInfo": {
"elapsed": 40836,
"status": "ok",
"timestamp": 1708976761257,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "YOmrqxo5kHXK",
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-02-27 16:23:14.661164: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 20549 MB memory: -> device: 0, name: NVIDIA L4, pci bus id: 0000:00:03.0, compute capability: 8.9\n",
"normalizer.cc(51) LOG(INFO) precompiled_charsmap is empty. use identity normalization.\n"
]
}
],
"source": [
"llm = GemmaLocalKaggle(model_name=model_name, keras_backend=keras_backend)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"id": "Zu6yPDUgkQtQ",
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"W0000 00:00:1709051129.518076 774855 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"What is the meaning of life?\n",
"\n",
"The question is one of the most important questions in the world.\n",
"\n",
"Its the question that has\n"
]
}
],
"source": [
"output = llm.invoke(\"What is the meaning of life?\", max_tokens=30)\n",
"print(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### ChatModel"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "MSctpRE4u43N"
},
"source": [
"Same as above, using Gemma locally as a multi-turn chat model. You might need to re-start the notebook and clean your GPU memory in order to avoid OOM errors:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-02-27 16:58:22.331067: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
"2024-02-27 16:58:22.382948: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n",
"2024-02-27 16:58:22.382978: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n",
"2024-02-27 16:58:22.384312: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
"2024-02-27 16:58:22.392767: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
"To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
]
}
],
"source": [
"from langchain_google_vertexai import GemmaChatLocalKaggle"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# @title Basic parameters\n",
"keras_backend: str = \"jax\" # @param {type:\"string\"}\n",
"model_name: str = \"gemma_2b_en\" # @param {type:\"string\"}"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-02-27 16:58:29.001922: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1929] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 20549 MB memory: -> device: 0, name: NVIDIA L4, pci bus id: 0000:00:03.0, compute capability: 8.9\n",
"normalizer.cc(51) LOG(INFO) precompiled_charsmap is empty. use identity normalization.\n"
]
}
],
"source": [
"llm = GemmaChatLocalKaggle(model_name=model_name, keras_backend=keras_backend)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"executionInfo": {
"elapsed": 3,
"status": "aborted",
"timestamp": 1708976382957,
"user": {
"displayName": "",
"userId": ""
},
"user_tz": -60
},
"id": "JrJmvZqwwLqj"
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-02-27 16:58:49.848412: I external/local_xla/xla/service/service.cc:168] XLA service 0x55adc0cf2c10 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:\n",
"2024-02-27 16:58:49.848458: I external/local_xla/xla/service/service.cc:176] StreamExecutor device (0): NVIDIA L4, Compute Capability 8.9\n",
"2024-02-27 16:58:50.116614: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.\n",
"2024-02-27 16:58:54.389324: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:454] Loaded cuDNN version 8900\n",
"WARNING: All log messages before absl::InitializeLog() is called are written to STDERR\n",
"I0000 00:00:1709053145.225207 784891 device_compiler.h:186] Compiled cluster using XLA! This line is logged at most once for the lifetime of the process.\n",
"W0000 00:00:1709053145.284227 784891 graph_launch.cc:671] Fallback to op-by-op mode because memset node breaks graph update\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"content=\"<start_of_turn>user\\nHi! Who are you?<end_of_turn>\\n<start_of_turn>model\\nI'm a model.\\n Tampoco\\nI'm a model.\"\n"
]
}
],
"source": [
"from langchain_core.messages import HumanMessage\n",
"\n",
"message1 = HumanMessage(content=\"Hi! Who are you?\")\n",
"answer1 = llm.invoke([message1], max_tokens=30)\n",
"print(answer1)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content=\"<start_of_turn>user\\nHi! Who are you?<end_of_turn>\\n<start_of_turn>model\\n<start_of_turn>user\\nHi! Who are you?<end_of_turn>\\n<start_of_turn>model\\nI'm a model.\\n Tampoco\\nI'm a model.<end_of_turn>\\n<start_of_turn>user\\nWhat can you help me with?<end_of_turn>\\n<start_of_turn>model\"\n"
]
}
],
"source": [
"message2 = HumanMessage(content=\"What can you help me with?\")\n",
"answer2 = llm.invoke([message1, answer1, message2], max_tokens=60)\n",
"\n",
"print(answer2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can post-process the response if you want to avoid multi-turn statements:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content=\"I'm a model.\\n Tampoco\\nI'm a model.\"\n",
"content='I can help you with your modeling.\\n Tampoco\\nI can'\n"
]
}
],
"source": [
"answer1 = llm.invoke([message1], max_tokens=30, parse_response=True)\n",
"print(answer1)\n",
"\n",
"answer2 = llm.invoke([message1, answer1, message2], max_tokens=60, parse_response=True)\n",
"print(answer2)"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "EiZnztso7hyF"
},
"source": [
"## Running Gemma locally from HuggingFace"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "qqAqsz5R7nKf",
"tags": []
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"2024-02-27 17:02:21.832409: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.\n",
"2024-02-27 17:02:21.883625: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered\n",
"2024-02-27 17:02:21.883656: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered\n",
"2024-02-27 17:02:21.884987: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered\n",
"2024-02-27 17:02:21.893340: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.\n",
"To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.\n"
]
}
],
"source": [
"from langchain_google_vertexai import GemmaChatLocalHF, GemmaLocalHF"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "tsyntzI08cOr",
"tags": []
},
"outputs": [],
"source": [
"# @title Basic parameters\n",
"hf_access_token: str = \"PUT_YOUR_TOKEN_HERE\" # @param {type:\"string\"}\n",
"model_name: str = \"google/gemma-2b\" # @param {type:\"string\"}"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "JWrqEkOo8sm9",
"tags": []
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "a0d6de5542254ed1b6d3ba65465e050e",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"llm = GemmaLocalHF(model_name=\"google/gemma-2b\", hf_access_token=hf_access_token)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"id": "VX96Jf4Y84k-",
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"What is the meaning of life?\n",
"\n",
"The question is one of the most important questions in the world.\n",
"\n",
"Its the question that has been asked by philosophers, theologians, and scientists for centuries.\n",
"\n",
"And its the question that\n"
]
}
],
"source": [
"output = llm.invoke(\"What is the meaning of life?\", max_tokens=50)\n",
"print(output)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Same as above, using Gemma locally as a multi-turn chat model. You might need to re-start the notebook and clean your GPU memory in order to avoid OOM errors:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"id": "9x-jmEBg9Mk1"
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "c9a0b8e161d74a6faca83b1be96dee27",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"llm = GemmaChatLocalHF(model_name=model_name, hf_access_token=hf_access_token)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "qv_OSaMm9PVy"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content=\"<start_of_turn>user\\nHi! Who are you?<end_of_turn>\\n<start_of_turn>model\\nI'm a model.\\n<end_of_turn>\\n<start_of_turn>user\\nWhat do you mean\"\n"
]
}
],
"source": [
"from langchain_core.messages import HumanMessage\n",
"\n",
"message1 = HumanMessage(content=\"Hi! Who are you?\")\n",
"answer1 = llm.invoke([message1], max_tokens=60)\n",
"print(answer1)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content=\"<start_of_turn>user\\nHi! Who are you?<end_of_turn>\\n<start_of_turn>model\\n<start_of_turn>user\\nHi! Who are you?<end_of_turn>\\n<start_of_turn>model\\nI'm a model.\\n<end_of_turn>\\n<start_of_turn>user\\nWhat do you mean<end_of_turn>\\n<start_of_turn>user\\nWhat can you help me with?<end_of_turn>\\n<start_of_turn>model\\nI can help you with anything.\\n<\"\n"
]
}
],
"source": [
"message2 = HumanMessage(content=\"What can you help me with?\")\n",
"answer2 = llm.invoke([message1, answer1, message2], max_tokens=140)\n",
"\n",
"print(answer2)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And the same with posprocessing:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content=\"I'm a model.\\n<end_of_turn>\\n\"\n",
"content='I can help you with anything.\\n<end_of_turn>\\n<end_of_turn>\\n'\n"
]
}
],
"source": [
"answer1 = llm.invoke([message1], max_tokens=60, parse_response=True)\n",
"print(answer1)\n",
"\n",
"answer2 = llm.invoke([message1, answer1, message2], max_tokens=120, parse_response=True)\n",
"print(answer2)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"colab": {
"provenance": []
},
"environment": {
"kernel": "python3",
"name": ".m116",
"type": "gcloud",
"uri": "gcr.io/deeplearning-platform-release/:m116"
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -116,7 +116,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"from unstructured.partition.pdf import partition_pdf\n",
"\n",
"\n",

747
cookbook/RAPTOR.ipynb Normal file

File diff suppressed because one or more lines are too long

View File

@@ -8,6 +8,7 @@ Notebook | Description
[Semi_Structured_RAG.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_Structured_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data, including text and tables, using unstructured for parsing, multi-vector retriever for storing, and lcel for implementing chains.
[Semi_structured_and_multi_moda...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_and_multi_modal_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using unstructured for parsing, multi-vector retriever for storage and retrieval, and lcel for implementing chains.
[Semi_structured_multi_modal_RA...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using various tools and methods such as unstructured for parsing, multi-vector retriever for storing, lcel for implementing chains, and open source language models like llama2, llava, and gpt4all.
[amazon_personalize_how_to.ipynb](https://github.com/langchain-ai/langchain/blob/master/cookbook/amazon_personalize_how_to.ipynb) | Retrieving personalized recommendations from Amazon Personalize and use custom agents to build generative AI apps
[analyze_document.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/analyze_document.ipynb) | Analyze a single long document.
[autogpt/autogpt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/autogpt.ipynb) | Implement autogpt, a language model, with langchain primitives such as llms, prompttemplates, vectorstores, embeddings, and tools.
[autogpt/marathon_times.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/marathon_times.ipynb) | Implement autogpt for finding winning marathon times.

View File

@@ -68,7 +68,7 @@
"pdf_pages = loader.load()\n",
"\n",
"# Split\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n",
"all_splits_pypdf = text_splitter.split_documents(pdf_pages)\n",

View File

@@ -28,9 +28,9 @@
"outputs": [],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAI, OpenAIEmbeddings\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",
"llm = OpenAI(temperature=0)"
]

View File

@@ -0,0 +1,200 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install -qU langchain-airbyte"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"\n",
"GITHUB_TOKEN = getpass.getpass()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"from langchain_airbyte import AirbyteLoader\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"loader = AirbyteLoader(\n",
" source=\"source-github\",\n",
" stream=\"pull_requests\",\n",
" config={\n",
" \"credentials\": {\"personal_access_token\": GITHUB_TOKEN},\n",
" \"repositories\": [\"langchain-ai/langchain\"],\n",
" },\n",
" template=PromptTemplate.from_template(\n",
" \"\"\"# {title}\n",
"by {user[login]}\n",
"\n",
"{body}\"\"\"\n",
" ),\n",
" include_metadata=False,\n",
")\n",
"docs = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"# Updated partners/ibm README\n",
"by williamdevena\n",
"\n",
"## PR title\n",
"partners: changed the README file for the IBM Watson AI integration in the libs/partners/ibm folder.\n",
"\n",
"## PR message\n",
"Description: Changed the README file of partners/ibm following the docs on https://python.langchain.com/docs/integrations/llms/ibm_watsonx\n",
"\n",
"The README includes:\n",
"\n",
"- Brief description\n",
"- Installation\n",
"- Setting-up instructions (API key, project id, ...)\n",
"- Basic usage:\n",
" - Loading the model\n",
" - Direct inference\n",
" - Chain invoking\n",
" - Streaming the model output\n",
" \n",
"Issue: https://github.com/langchain-ai/langchain/issues/17545\n",
"\n",
"Dependencies: None\n",
"\n",
"Twitter handle: None\n"
]
}
],
"source": [
"print(docs[-2].page_content)"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"10283"
]
},
"execution_count": 39,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(docs)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"import tiktoken\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"enc = tiktoken.get_encoding(\"cl100k_base\")\n",
"\n",
"vectorstore = Chroma.from_documents(\n",
" docs,\n",
" embedding=OpenAIEmbeddings(\n",
" disallowed_special=(enc.special_tokens_set - {\"<|endofprompt|>\"})\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [],
"source": [
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='# Updated partners/ibm README\\nby williamdevena\\n\\n## PR title\\r\\npartners: changed the README file for the IBM Watson AI integration in the libs/partners/ibm folder.\\r\\n\\r\\n## PR message\\r\\nDescription: Changed the README file of partners/ibm following the docs on https://python.langchain.com/docs/integrations/llms/ibm_watsonx\\r\\n\\r\\nThe README includes:\\r\\n\\r\\n- Brief description\\r\\n- Installation\\r\\n- Setting-up instructions (API key, project id, ...)\\r\\n- Basic usage:\\r\\n - Loading the model\\r\\n - Direct inference\\r\\n - Chain invoking\\r\\n - Streaming the model output\\r\\n \\r\\nIssue: https://github.com/langchain-ai/langchain/issues/17545\\r\\n\\r\\nDependencies: None\\r\\n\\r\\nTwitter handle: None'),\n",
" Document(page_content='# Updated partners/ibm README\\nby williamdevena\\n\\n## PR title\\r\\npartners: changed the README file for the IBM Watson AI integration in the `libs/partners/ibm` folder. \\r\\n\\r\\n\\r\\n\\r\\n## PR message\\r\\n- **Description:** Changed the README file of partners/ibm following the docs on https://python.langchain.com/docs/integrations/llms/ibm_watsonx\\r\\n\\r\\n The README includes:\\r\\n - Brief description\\r\\n - Installation\\r\\n - Setting-up instructions (API key, project id, ...)\\r\\n - Basic usage:\\r\\n - Loading the model\\r\\n - Direct inference\\r\\n - Chain invoking\\r\\n - Streaming the model output\\r\\n\\r\\n\\r\\n- **Issue:** #17545\\r\\n- **Dependencies:** None\\r\\n- **Twitter handle:** None'),\n",
" Document(page_content='# IBM: added partners package `langchain_ibm`, added llm\\nby MateuszOssGit\\n\\n - **Description:** Added `langchain_ibm` as an langchain partners package of IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) LLM provider (`WatsonxLLM`)\\r\\n - **Dependencies:** [ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),\\r\\n - **Tag maintainer:** : \\r\\n\\r\\nPlease make sure your PR is passing linting and testing before submitting. Run `make format`, `make lint` and `make test` to check this locally. ✅'),\n",
" Document(page_content='# Add WatsonX support\\nby baptistebignaud\\n\\nIt is a connector to use a LLM from WatsonX.\\r\\nIt requires python SDK \"ibm-generative-ai\"\\r\\n\\r\\n(It might not be perfect since it is my first PR on a public repository 😄)')]"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"retriever.invoke(\"pull requests related to IBM\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -14,9 +14,9 @@
"\n",
"This notebook shows you how to use LangChain's standard chat features while passing the chat messages back and forth via Apache Kafka.\n",
"\n",
"This goal is to simulate an architecture where the chat front end and the LLM are running as separate services that need to communicate with one another over an internal nework.\n",
"This goal is to simulate an architecture where the chat front end and the LLM are running as separate services that need to communicate with one another over an internal network.\n",
"\n",
"It's an alternative to typical pattern of requesting a reponse from the model via a REST API (there's more info on why you would want to do this at the end of the notebook)."
"It's an alternative to typical pattern of requesting a response from the model via a REST API (there's more info on why you would want to do this at the end of the notebook)."
]
},
{
@@ -261,7 +261,7 @@
"\n",
"Load Llama 2 and set the conversation buffer to 300 tokens using `ConversationTokenBufferMemory`. This value was used for running Llama in a CPU only container, so you can raise it if running in Google Colab. It prevents the container that is hosting the model from running out of memory.\n",
"\n",
"Here, we're overiding the default system persona so that the chatbot has the personality of Marvin The Paranoid Android from the Hitchhiker's Guide to the Galaxy."
"Here, we're overriding the default system persona so that the chatbot has the personality of Marvin The Paranoid Android from the Hitchhiker's Guide to the Galaxy."
]
},
{
@@ -272,7 +272,7 @@
},
"outputs": [],
"source": [
"# Load the model with the apporiate parameters:\n",
"# Load the model with the appropriate parameters:\n",
"llm = LlamaCpp(\n",
" model_path=model_path,\n",
" max_tokens=250,\n",
@@ -551,7 +551,7 @@
"\n",
" * **Scalability**: Apache Kafka is designed with parallel processing in mind, so many teams prefer to use it to more effectively distribute work to available workers (in this case the \"worker\" is a container running an LLM).\n",
"\n",
" * **Durability**: Kafka is designed to allow services to pick up where another service left off in the case where that service experienced a memory issue or went offline. This prevents data loss in highly complex, distribuited architectures where multiple systems are communicating with one another (LLMs being just one of many interdependent systems that also include vector databases and traditional databases).\n",
" * **Durability**: Kafka is designed to allow services to pick up where another service left off in the case where that service experienced a memory issue or went offline. This prevents data loss in highly complex, distributed architectures where multiple systems are communicating with one another (LLMs being just one of many interdependent systems that also include vector databases and traditional databases).\n",
"\n",
"For more background on why event streaming is a good fit for Gen AI application architecture, see Kai Waehner's article [\"Apache Kafka + Vector Database + LLM = Real-Time GenAI\"](https://www.kai-waehner.de/blog/2023/11/08/apache-kafka-flink-vector-database-llm-real-time-genai/)."
]

View File

@@ -40,7 +40,9 @@
"import nest_asyncio\n",
"import pandas as pd\n",
"from langchain.docstore.document import Document\n",
"from langchain_community.agent_toolkits.pandas.base import create_pandas_dataframe_agent\n",
"from langchain_experimental.agents.agent_toolkits.pandas.base import (\n",
" create_pandas_dataframe_agent,\n",
")\n",
"from langchain_experimental.autonomous_agents import AutoGPT\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
@@ -227,8 +229,8 @@
" BaseCombineDocumentsChain,\n",
" load_qa_with_sources_chain,\n",
")\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain.tools import BaseTool, DuckDuckGoSearchRun\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"from pydantic import Field\n",
"\n",
"\n",

View File

@@ -24,7 +24,7 @@
"source": [
"1. Prepare data:\n",
" 1. Upload all python project files using the `langchain_community.document_loaders.TextLoader`. We will call these files the **documents**.\n",
" 2. Split all documents to chunks using the `langchain.text_splitter.CharacterTextSplitter`.\n",
" 2. Split all documents to chunks using the `langchain_text_splitters.CharacterTextSplitter`.\n",
" 3. Embed chunks and upload them into the DeepLake using `langchain.embeddings.openai.OpenAIEmbeddings` and `langchain_community.vectorstores.DeepLake`\n",
"2. Question-Answering:\n",
" 1. Build a chain from `langchain.chat_models.ChatOpenAI` and `langchain.chains.ConversationalRetrievalChain`\n",
@@ -621,7 +621,7 @@
}
],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_documents(docs)\n",

File diff suppressed because it is too large Load Diff

View File

@@ -52,12 +52,12 @@
"import os\n",
"\n",
"from langchain.chains import RetrievalQA\n",
"from langchain.text_splitter import (\n",
"from langchain_community.vectorstores import DeepLake\n",
"from langchain_openai import OpenAI, OpenAIEmbeddings\n",
"from langchain_text_splitters import (\n",
" CharacterTextSplitter,\n",
" RecursiveCharacterTextSplitter,\n",
")\n",
"from langchain_community.vectorstores import DeepLake\n",
"from langchain_openai import OpenAI, OpenAIEmbeddings\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\"OpenAI API Key:\")\n",
"activeloop_token = getpass.getpass(\"Activeloop Token:\")\n",

View File

@@ -100,7 +100,7 @@
}
],
"source": [
"agent.run(\"whats 2 + 2\")"
"agent.invoke(\"whats 2 + 2\")"
]
},
{

View File

@@ -0,0 +1,245 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0fc0309d-4d49-4bb5-bec0-bd92c6fddb28",
"metadata": {},
"source": [
"## Fireworks.AI + LangChain + RAG\n",
" \n",
"[Fireworks AI](https://python.langchain.com/docs/integrations/llms/fireworks) wants to provide the best experience when working with LangChain, and here is an example of Fireworks + LangChain doing RAG\n",
"\n",
"See [our models page](https://fireworks.ai/models) for the full list of models. We use `accounts/fireworks/models/mixtral-8x7b-instruct` for RAG In this tutorial.\n",
"\n",
"For the RAG target, we will use the Gemma technical report https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf "
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "d12fb75a-f707-48d5-82a5-efe2d041813c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.2.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.0\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Found existing installation: langchain-fireworks 0.0.1\n",
"Uninstalling langchain-fireworks-0.0.1:\n",
" Successfully uninstalled langchain-fireworks-0.0.1\n",
"Note: you may need to restart the kernel to use updated packages.\n",
"Obtaining file:///mnt/disks/data/langchain/libs/partners/fireworks\n",
" Installing build dependencies ... \u001b[?25ldone\n",
"\u001b[?25h Checking if build backend supports build_editable ... \u001b[?25ldone\n",
"\u001b[?25h Getting requirements to build editable ... \u001b[?25ldone\n",
"\u001b[?25h Preparing editable metadata (pyproject.toml) ... \u001b[?25ldone\n",
"\u001b[?25hRequirement already satisfied: aiohttp<4.0.0,>=3.9.1 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-fireworks==0.0.1) (3.9.3)\n",
"Requirement already satisfied: fireworks-ai<0.13.0,>=0.12.0 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-fireworks==0.0.1) (0.12.0)\n",
"Requirement already satisfied: langchain-core<0.2,>=0.1 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-fireworks==0.0.1) (0.1.23)\n",
"Requirement already satisfied: requests<3,>=2 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-fireworks==0.0.1) (2.31.0)\n",
"Requirement already satisfied: aiosignal>=1.1.2 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-fireworks==0.0.1) (1.3.1)\n",
"Requirement already satisfied: attrs>=17.3.0 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-fireworks==0.0.1) (23.1.0)\n",
"Requirement already satisfied: frozenlist>=1.1.1 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-fireworks==0.0.1) (1.4.0)\n",
"Requirement already satisfied: multidict<7.0,>=4.5 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-fireworks==0.0.1) (6.0.4)\n",
"Requirement already satisfied: yarl<2.0,>=1.0 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-fireworks==0.0.1) (1.9.2)\n",
"Requirement already satisfied: async-timeout<5.0,>=4.0 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from aiohttp<4.0.0,>=3.9.1->langchain-fireworks==0.0.1) (4.0.3)\n",
"Requirement already satisfied: httpx in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from fireworks-ai<0.13.0,>=0.12.0->langchain-fireworks==0.0.1) (0.26.0)\n",
"Requirement already satisfied: httpx-sse in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from fireworks-ai<0.13.0,>=0.12.0->langchain-fireworks==0.0.1) (0.4.0)\n",
"Requirement already satisfied: pydantic in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from fireworks-ai<0.13.0,>=0.12.0->langchain-fireworks==0.0.1) (2.4.2)\n",
"Requirement already satisfied: Pillow in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from fireworks-ai<0.13.0,>=0.12.0->langchain-fireworks==0.0.1) (10.2.0)\n",
"Requirement already satisfied: PyYAML>=5.3 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-core<0.2,>=0.1->langchain-fireworks==0.0.1) (6.0.1)\n",
"Requirement already satisfied: anyio<5,>=3 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-core<0.2,>=0.1->langchain-fireworks==0.0.1) (3.7.1)\n",
"Requirement already satisfied: jsonpatch<2.0,>=1.33 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-core<0.2,>=0.1->langchain-fireworks==0.0.1) (1.33)\n",
"Requirement already satisfied: langsmith<0.2.0,>=0.1.0 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-core<0.2,>=0.1->langchain-fireworks==0.0.1) (0.1.5)\n",
"Requirement already satisfied: packaging<24.0,>=23.2 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-core<0.2,>=0.1->langchain-fireworks==0.0.1) (23.2)\n",
"Requirement already satisfied: tenacity<9.0.0,>=8.1.0 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from langchain-core<0.2,>=0.1->langchain-fireworks==0.0.1) (8.2.3)\n",
"Requirement already satisfied: charset-normalizer<4,>=2 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from requests<3,>=2->langchain-fireworks==0.0.1) (3.3.0)\n",
"Requirement already satisfied: idna<4,>=2.5 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from requests<3,>=2->langchain-fireworks==0.0.1) (3.4)\n",
"Requirement already satisfied: urllib3<3,>=1.21.1 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from requests<3,>=2->langchain-fireworks==0.0.1) (2.0.6)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from requests<3,>=2->langchain-fireworks==0.0.1) (2023.7.22)\n",
"Requirement already satisfied: sniffio>=1.1 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from anyio<5,>=3->langchain-core<0.2,>=0.1->langchain-fireworks==0.0.1) (1.3.0)\n",
"Requirement already satisfied: exceptiongroup in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from anyio<5,>=3->langchain-core<0.2,>=0.1->langchain-fireworks==0.0.1) (1.1.3)\n",
"Requirement already satisfied: jsonpointer>=1.9 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from jsonpatch<2.0,>=1.33->langchain-core<0.2,>=0.1->langchain-fireworks==0.0.1) (2.4)\n",
"Requirement already satisfied: annotated-types>=0.4.0 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from pydantic->fireworks-ai<0.13.0,>=0.12.0->langchain-fireworks==0.0.1) (0.5.0)\n",
"Requirement already satisfied: pydantic-core==2.10.1 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from pydantic->fireworks-ai<0.13.0,>=0.12.0->langchain-fireworks==0.0.1) (2.10.1)\n",
"Requirement already satisfied: typing-extensions>=4.6.1 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from pydantic->fireworks-ai<0.13.0,>=0.12.0->langchain-fireworks==0.0.1) (4.8.0)\n",
"Requirement already satisfied: httpcore==1.* in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from httpx->fireworks-ai<0.13.0,>=0.12.0->langchain-fireworks==0.0.1) (1.0.2)\n",
"Requirement already satisfied: h11<0.15,>=0.13 in /mnt/disks/data/langchain/.venv/lib/python3.9/site-packages (from httpcore==1.*->httpx->fireworks-ai<0.13.0,>=0.12.0->langchain-fireworks==0.0.1) (0.14.0)\n",
"Building wheels for collected packages: langchain-fireworks\n",
" Building editable for langchain-fireworks (pyproject.toml) ... \u001b[?25ldone\n",
"\u001b[?25h Created wheel for langchain-fireworks: filename=langchain_fireworks-0.0.1-py3-none-any.whl size=2228 sha256=564071b120b09ec31f2dc737733448a33bbb26e40b49fcde0c129ad26045259d\n",
" Stored in directory: /tmp/pip-ephem-wheel-cache-oz368vdk/wheels/e0/ad/31/d7e76dd73d61905ff7f369f5b0d21a4b5e7af4d3cb7487aece\n",
"Successfully built langchain-fireworks\n",
"Installing collected packages: langchain-fireworks\n",
"Successfully installed langchain-fireworks-0.0.1\n",
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.2.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.0\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"%pip install --quiet pypdf chromadb tiktoken openai \n",
"%pip uninstall -y langchain-fireworks\n",
"%pip install --editable /mnt/disks/data/langchain/libs/partners/fireworks"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "cf719376",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<module 'fireworks' from '/mnt/disks/data/langchain/.venv/lib/python3.9/site-packages/fireworks/__init__.py'>\n"
]
}
],
"source": [
"import fireworks\n",
"\n",
"print(fireworks)\n",
"import fireworks.client"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9ab49327-0532-4480-804c-d066c302a322",
"metadata": {},
"outputs": [],
"source": [
"# Load\n",
"import requests\n",
"from langchain_community.document_loaders import PyPDFLoader\n",
"\n",
"# Download the PDF from a URL and save it to a temporary location\n",
"url = \"https://storage.googleapis.com/deepmind-media/gemma/gemma-report.pdf\"\n",
"response = requests.get(url, stream=True)\n",
"file_name = \"temp_file.pdf\"\n",
"with open(file_name, \"wb\") as pdf:\n",
" pdf.write(response.content)\n",
"\n",
"loader = PyPDFLoader(file_name)\n",
"data = loader.load()\n",
"\n",
"# Split\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=0)\n",
"all_splits = text_splitter.split_documents(data)\n",
"\n",
"# Add to vectorDB\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_fireworks.embeddings import FireworksEmbeddings\n",
"\n",
"vectorstore = Chroma.from_documents(\n",
" documents=all_splits,\n",
" collection_name=\"rag-chroma\",\n",
" embedding=FireworksEmbeddings(),\n",
")\n",
"\n",
"retriever = vectorstore.as_retriever()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "4efaddd9-3dbb-455c-ba54-0ad7f2d2ce0f",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.pydantic_v1 import BaseModel\n",
"from langchain_core.runnables import RunnableParallel, RunnablePassthrough\n",
"\n",
"# RAG prompt\n",
"template = \"\"\"Answer the question based only on the following context:\n",
"{context}\n",
"\n",
"Question: {question}\n",
"\"\"\"\n",
"prompt = ChatPromptTemplate.from_template(template)\n",
"\n",
"# LLM\n",
"from langchain_together import Together\n",
"\n",
"llm = Together(\n",
" model=\"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n",
" temperature=0.0,\n",
" max_tokens=2000,\n",
" top_k=1,\n",
")\n",
"\n",
"# RAG chain\n",
"chain = (\n",
" RunnableParallel({\"context\": retriever, \"question\": RunnablePassthrough()})\n",
" | prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "88b1ee51-1b0f-4ebf-bb32-e50e843f0eeb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\nAnswer: The architectural details of Mixtral are as follows:\\n- Dimension (dim): 4096\\n- Number of layers (n\\\\_layers): 32\\n- Dimension of each head (head\\\\_dim): 128\\n- Hidden dimension (hidden\\\\_dim): 14336\\n- Number of heads (n\\\\_heads): 32\\n- Number of kv heads (n\\\\_kv\\\\_heads): 8\\n- Context length (context\\\\_len): 32768\\n- Vocabulary size (vocab\\\\_size): 32000\\n- Number of experts (num\\\\_experts): 8\\n- Number of top k experts (top\\\\_k\\\\_experts): 2\\n\\nMixtral is based on a transformer architecture and uses the same modifications as described in [18], with the notable exceptions that Mixtral supports a fully dense context length of 32k tokens, and the feedforward block picks from a set of 8 distinct groups of parameters. At every layer, for every token, a router network chooses two of these groups (the “experts”) to process the token and combine their output additively. This technique increases the number of parameters of a model while controlling cost and latency, as the model only uses a fraction of the total set of parameters per token. Mixtral is pretrained with multilingual data using a context size of 32k tokens. It either matches or exceeds the performance of Llama 2 70B and GPT-3.5, over several benchmarks. In particular, Mixtral vastly outperforms Llama 2 70B on mathematics, code generation, and multilingual benchmarks.'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke(\"What are the Architectural details of Mixtral?\")"
]
},
{
"cell_type": "markdown",
"id": "755cf871-26b7-4e30-8b91-9ffd698470f4",
"metadata": {},
"source": [
"Trace: \n",
"\n",
"https://smith.langchain.com/public/935fd642-06a6-4b42-98e3-6074f93115cd/r"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -170,8 +170,8 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",
"with open(\"../../state_of_the_union.txt\") as f:\n",
" state_of_the_union = f.read()\n",

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -52,7 +52,7 @@
"\n",
"bash_chain = LLMBashChain.from_llm(llm, verbose=True)\n",
"\n",
"bash_chain.run(text)"
"bash_chain.invoke(text)"
]
},
{
@@ -135,7 +135,7 @@
"\n",
"text = \"Please write a bash script that prints 'Hello World' to the console.\"\n",
"\n",
"bash_chain.run(text)"
"bash_chain.invoke(text)"
]
},
{
@@ -190,7 +190,7 @@
"\n",
"text = \"List the current directory then move up a level.\"\n",
"\n",
"bash_chain.run(text)"
"bash_chain.invoke(text)"
]
},
{
@@ -231,7 +231,7 @@
],
"source": [
"# Run the same command again and see that the state is maintained between calls\n",
"bash_chain.run(text)"
"bash_chain.invoke(text)"
]
}
],

View File

@@ -50,7 +50,7 @@
"\n",
"checker_chain = LLMCheckerChain.from_llm(llm, verbose=True)\n",
"\n",
"checker_chain.run(text)"
"checker_chain.invoke(text)"
]
},
{

View File

@@ -51,7 +51,7 @@
"llm = OpenAI(temperature=0)\n",
"llm_math = LLMMathChain.from_llm(llm, verbose=True)\n",
"\n",
"llm_math.run(\"What is 13 raised to the .3432 power?\")"
"llm_math.invoke(\"What is 13 raised to the .3432 power?\")"
]
},
{

View File

@@ -45,7 +45,7 @@
}
],
"source": [
"llm_symbolic_math.run(\"What is the derivative of sin(x)*exp(x) with respect to x?\")"
"llm_symbolic_math.invoke(\"What is the derivative of sin(x)*exp(x) with respect to x?\")"
]
},
{
@@ -65,7 +65,7 @@
}
],
"source": [
"llm_symbolic_math.run(\n",
"llm_symbolic_math.invoke(\n",
" \"What is the integral of exp(x)*sin(x) + exp(x)*cos(x) with respect to x?\"\n",
")"
]
@@ -94,7 +94,7 @@
}
],
"source": [
"llm_symbolic_math.run('Solve the differential equation y\" - y = e^t')"
"llm_symbolic_math.invoke('Solve the differential equation y\" - y = e^t')"
]
},
{
@@ -114,7 +114,7 @@
}
],
"source": [
"llm_symbolic_math.run(\"What are the solutions to this equation y^3 + 1/3y?\")"
"llm_symbolic_math.invoke(\"What are the solutions to this equation y^3 + 1/3y?\")"
]
},
{
@@ -134,7 +134,7 @@
}
],
"source": [
"llm_symbolic_math.run(\"x = y + 5, y = z - 3, z = x * y. Solve for x, y, z\")"
"llm_symbolic_math.invoke(\"x = y + 5, y = z - 3, z = x * y. Solve for x, y, z\")"
]
}
],

File diff suppressed because one or more lines are too long

View File

@@ -124,7 +124,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",
"text_splitter = CharacterTextSplitter.from_tiktoken_encoder(\n",
" chunk_size=7500, chunk_overlap=100\n",

View File

@@ -20,10 +20,10 @@
"outputs": [],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain_community.document_loaders import TextLoader\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings"
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_text_splitters import CharacterTextSplitter"
]
},
{

648
cookbook/optimization.ipynb Normal file
View File

@@ -0,0 +1,648 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "c7fe38bc",
"metadata": {},
"source": [
"# Optimization\n",
"\n",
"This notebook goes over how to optimize chains using LangChain and [LangSmith](https://smith.langchain.com)."
]
},
{
"cell_type": "markdown",
"id": "2f87ccd5",
"metadata": {},
"source": [
"## Set up\n",
"\n",
"We will set an environment variable for LangSmith, and load the relevant data"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "236bedc5",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LANGCHAIN_PROJECT\"] = \"movie-qa\""
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "a3fed0dd",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "7cfff337",
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv(\"data/imdb_top_1000.csv\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "2d20fb9c",
"metadata": {},
"outputs": [],
"source": [
"df[\"Released_Year\"] = df[\"Released_Year\"].astype(int, errors=\"ignore\")"
]
},
{
"cell_type": "markdown",
"id": "09fc8fe2",
"metadata": {},
"source": [
"## Create the initial retrieval chain\n",
"\n",
"We will use a self-query retriever"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "f71e24e2",
"metadata": {},
"outputs": [],
"source": [
"from langchain.schema import Document\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"embeddings = OpenAIEmbeddings()"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "8881ea8e",
"metadata": {},
"outputs": [],
"source": [
"records = df.to_dict(\"records\")\n",
"documents = [Document(page_content=d[\"Overview\"], metadata=d) for d in records]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "8f495423",
"metadata": {},
"outputs": [],
"source": [
"vectorstore = Chroma.from_documents(documents, embeddings)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "31d33d62",
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains.query_constructor.base import AttributeInfo\n",
"from langchain.retrievers.self_query.base import SelfQueryRetriever\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"metadata_field_info = [\n",
" AttributeInfo(\n",
" name=\"Released_Year\",\n",
" description=\"The year the movie was released\",\n",
" type=\"int\",\n",
" ),\n",
" AttributeInfo(\n",
" name=\"Series_Title\",\n",
" description=\"The title of the movie\",\n",
" type=\"str\",\n",
" ),\n",
" AttributeInfo(\n",
" name=\"Genre\",\n",
" description=\"The genre of the movie\",\n",
" type=\"string\",\n",
" ),\n",
" AttributeInfo(\n",
" name=\"IMDB_Rating\", description=\"A 1-10 rating for the movie\", type=\"float\"\n",
" ),\n",
"]\n",
"document_content_description = \"Brief summary of a movie\"\n",
"llm = ChatOpenAI(temperature=0)\n",
"retriever = SelfQueryRetriever.from_llm(\n",
" llm, vectorstore, document_content_description, metadata_field_info, verbose=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "a731533b",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.runnables import RunnablePassthrough"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "05181849",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "feed4be6",
"metadata": {},
"outputs": [],
"source": [
"prompt = ChatPromptTemplate.from_template(\n",
" \"\"\"Answer the user's question based on the below information:\n",
"\n",
"Information:\n",
"\n",
"{info}\n",
"\n",
"Question: {question}\"\"\"\n",
")\n",
"generator = (prompt | ChatOpenAI() | StrOutputParser()).with_config(\n",
" run_name=\"generator\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "eb16cc9a",
"metadata": {},
"outputs": [],
"source": [
"chain = (\n",
" RunnablePassthrough.assign(info=(lambda x: x[\"question\"]) | retriever) | generator\n",
")"
]
},
{
"cell_type": "markdown",
"id": "c70911cc",
"metadata": {},
"source": [
"## Run examples\n",
"\n",
"Run examples through the chain. This can either be manually, or using a list of examples, or production traffic"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "19a88d13",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'One of the horror movies released in the early 2000s is \"The Ring\" (2002), directed by Gore Verbinski.'"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"question\": \"what is a horror movie released in early 2000s\"})"
]
},
{
"cell_type": "markdown",
"id": "17f9cdae",
"metadata": {},
"source": [
"## Annotate\n",
"\n",
"Now, go to LangSmitha and annotate those examples as correct or incorrect"
]
},
{
"cell_type": "markdown",
"id": "5e211da6",
"metadata": {},
"source": [
"## Create Dataset\n",
"\n",
"We can now create a dataset from those runs.\n",
"\n",
"What we will do is find the runs marked as correct, then grab the sub-chains from them. Specifically, the query generator sub chain and the final generation step"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "e4024267",
"metadata": {},
"outputs": [],
"source": [
"from langsmith import Client\n",
"\n",
"client = Client()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "3814efc5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"14"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"runs = list(\n",
" client.list_runs(\n",
" project_name=\"movie-qa\",\n",
" execution_order=1,\n",
" filter=\"and(eq(feedback_key, 'correctness'), eq(feedback_score, 1))\",\n",
" )\n",
")\n",
"\n",
"len(runs)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "3eb123e0",
"metadata": {},
"outputs": [],
"source": [
"gen_runs = []\n",
"query_runs = []\n",
"for r in runs:\n",
" gen_runs.extend(\n",
" list(\n",
" client.list_runs(\n",
" project_name=\"movie-qa\",\n",
" filter=\"eq(name, 'generator')\",\n",
" trace_id=r.trace_id,\n",
" )\n",
" )\n",
" )\n",
" query_runs.extend(\n",
" list(\n",
" client.list_runs(\n",
" project_name=\"movie-qa\",\n",
" filter=\"eq(name, 'query_constructor')\",\n",
" trace_id=r.trace_id,\n",
" )\n",
" )\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "a4397026",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'question': 'what is a high school comedy released in early 2000s'}"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"runs[0].inputs"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "3fa6ad2a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'output': 'One high school comedy released in the early 2000s is \"Mean Girls\" starring Lindsay Lohan, Rachel McAdams, and Tina Fey.'}"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"runs[0].outputs"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "1fda5b4b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'query': 'what is a high school comedy released in early 2000s'}"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_runs[0].inputs"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "1a1a51e6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'output': {'query': 'high school comedy',\n",
" 'filter': {'operator': 'and',\n",
" 'arguments': [{'comparator': 'eq', 'attribute': 'Genre', 'value': 'comedy'},\n",
" {'operator': 'and',\n",
" 'arguments': [{'comparator': 'gte',\n",
" 'attribute': 'Released_Year',\n",
" 'value': 2000},\n",
" {'comparator': 'lt', 'attribute': 'Released_Year', 'value': 2010}]}]}}}"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"query_runs[0].outputs"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "e9d9966b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'question': 'what is a high school comedy released in early 2000s',\n",
" 'info': []}"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"gen_runs[0].inputs"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "bc113f3d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'output': 'One high school comedy released in the early 2000s is \"Mean Girls\" starring Lindsay Lohan, Rachel McAdams, and Tina Fey.'}"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"gen_runs[0].outputs"
]
},
{
"cell_type": "markdown",
"id": "6cca74e5",
"metadata": {},
"source": [
"## Create datasets\n",
"\n",
"We can now create datasets for the query generation and final generation step.\n",
"We do this so that (1) we can inspect the datapoints, (2) we can edit them if needed, (3) we can add to them over time"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "69966f0e",
"metadata": {},
"outputs": [],
"source": [
"client.create_dataset(\"movie-query_constructor\")\n",
"\n",
"inputs = [r.inputs for r in query_runs]\n",
"outputs = [r.outputs for r in query_runs]\n",
"\n",
"client.create_examples(\n",
" inputs=inputs, outputs=outputs, dataset_name=\"movie-query_constructor\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "7e15770e",
"metadata": {},
"outputs": [],
"source": [
"client.create_dataset(\"movie-generator\")\n",
"\n",
"inputs = [r.inputs for r in gen_runs]\n",
"outputs = [r.outputs for r in gen_runs]\n",
"\n",
"client.create_examples(inputs=inputs, outputs=outputs, dataset_name=\"movie-generator\")"
]
},
{
"cell_type": "markdown",
"id": "61cf9bcd",
"metadata": {},
"source": [
"## Use as few shot examples\n",
"\n",
"We can now pull down a dataset and use them as few shot examples in a future chain"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "d9c79173",
"metadata": {},
"outputs": [],
"source": [
"examples = list(client.list_examples(dataset_name=\"movie-query_constructor\"))"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "a1771dd0",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"\n",
"def filter_to_string(_filter):\n",
" if \"operator\" in _filter:\n",
" args = [filter_to_string(f) for f in _filter[\"arguments\"]]\n",
" return f\"{_filter['operator']}({','.join(args)})\"\n",
" else:\n",
" comparator = _filter[\"comparator\"]\n",
" attribute = json.dumps(_filter[\"attribute\"])\n",
" value = json.dumps(_filter[\"value\"])\n",
" return f\"{comparator}({attribute}, {value})\""
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "e67a3530",
"metadata": {},
"outputs": [],
"source": [
"model_examples = []\n",
"\n",
"for e in examples:\n",
" if \"filter\" in e.outputs[\"output\"]:\n",
" string_filter = filter_to_string(e.outputs[\"output\"][\"filter\"])\n",
" else:\n",
" string_filter = \"NO_FILTER\"\n",
" model_examples.append(\n",
" (\n",
" e.inputs[\"query\"],\n",
" {\"query\": e.outputs[\"output\"][\"query\"], \"filter\": string_filter},\n",
" )\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "84593135",
"metadata": {},
"outputs": [],
"source": [
"retriever1 = SelfQueryRetriever.from_llm(\n",
" llm,\n",
" vectorstore,\n",
" document_content_description,\n",
" metadata_field_info,\n",
" verbose=True,\n",
" chain_kwargs={\"examples\": model_examples},\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "4ec9bb92",
"metadata": {},
"outputs": [],
"source": [
"chain1 = (\n",
" RunnablePassthrough.assign(info=(lambda x: x[\"question\"]) | retriever1) | generator\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "64eb88e2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'1. \"Saving Private Ryan\" (1998) - Directed by Steven Spielberg, this war film follows a group of soldiers during World War II as they search for a missing paratrooper.\\n\\n2. \"The Matrix\" (1999) - Directed by the Wachowskis, this science fiction action film follows a computer hacker who discovers the truth about the reality he lives in.\\n\\n3. \"Lethal Weapon 4\" (1998) - Directed by Richard Donner, this action-comedy film follows two mismatched detectives as they investigate a Chinese immigrant smuggling ring.\\n\\n4. \"The Fifth Element\" (1997) - Directed by Luc Besson, this science fiction action film follows a cab driver who must protect a mysterious woman who holds the key to saving the world.\\n\\n5. \"The Rock\" (1996) - Directed by Michael Bay, this action thriller follows a group of rogue military men who take over Alcatraz and threaten to launch missiles at San Francisco.'"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain1.invoke(\n",
" {\"question\": \"what are good action movies made before 2000 but after 1997?\"}\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e1ee8b55",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -59,13 +59,13 @@
"from baidubce.auth.bce_credentials import BceCredentials\n",
"from baidubce.bce_client_configuration import BceClientConfiguration\n",
"from langchain.chains.retrieval_qa import RetrievalQA\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.document_loaders.baiducloud_bos_directory import (\n",
" BaiduBOSDirectoryLoader,\n",
")\n",
"from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings\n",
"from langchain_community.llms.baidu_qianfan_endpoint import QianfanLLMEndpoint\n",
"from langchain_community.vectorstores import BESVectorStore"
"from langchain_community.vectorstores import BESVectorStore\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter"
]
},
{

File diff suppressed because one or more lines are too long

View File

@@ -36,9 +36,6 @@
"from bs4 import BeautifulSoup as Soup\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryByteStore, LocalFileStore\n",
"\n",
"# For our example, we'll load docs from the web\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter # noqa\n",
"from langchain_community.document_loaders.recursive_url_loader import (\n",
" RecursiveUrlLoader,\n",
")\n",
@@ -46,6 +43,9 @@
"# noqa\n",
"from langchain_community.vectorstores import Chroma\n",
"\n",
"# For our example, we'll load docs from the web\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter # noqa\n",
"\n",
"DOCSTORE_DIR = \".\"\n",
"DOCSTORE_ID_KEY = \"doc_id\""
]

View File

@@ -245,7 +245,7 @@
"\n",
"\n",
"def _parse(text):\n",
" return text.strip(\"**\")"
" return text.strip('\"').strip(\"**\")"
]
},
{

View File

@@ -1,28 +1,32 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# SalesGPT - Your Context-Aware AI Sales Assistant With Knowledge Base\n",
"# SalesGPT - Context-Aware AI Sales Assistant With Knowledge Base and Ability Generate Stripe Payment Links\n",
"\n",
"This notebook demonstrates an implementation of a **Context-Aware** AI Sales agent with a Product Knowledge Base. \n",
"This notebook demonstrates an implementation of a **Context-Aware** AI Sales agent with a Product Knowledge Base which can actually close sales. \n",
"\n",
"This notebook was originally published at [filipmichalsky/SalesGPT](https://github.com/filip-michalsky/SalesGPT) by [@FilipMichalsky](https://twitter.com/FilipMichalsky).\n",
"\n",
"SalesGPT is context-aware, which means it can understand what section of a sales conversation it is in and act accordingly.\n",
" \n",
"As such, this agent can have a natural sales conversation with a prospect and behaves based on the conversation stage. Hence, this notebook demonstrates how we can use AI to automate sales development representatives activities, such as outbound sales calls. \n",
"As such, this agent can have a natural sales conversation with a prospect and behaves based on the conversation stage. Hence, this notebook demonstrates how we can use AI to automate sales development representatives activites, such as outbound sales calls. \n",
"\n",
"Additionally, the AI Sales agent has access to tools, which allow it to interact with other systems.\n",
"\n",
"Here, we show how the AI Sales Agent can use a **Product Knowledge Base** to speak about a particular's company offerings,\n",
"hence increasing relevance and reducing hallucinations.\n",
"\n",
"We leverage the [`langchain`](https://github.com/langchain-ai/langchain) library in this implementation, specifically [Custom Agent Configuration](https://langchain-langchain.vercel.app/docs/modules/agents/how_to/custom_agent_with_tool_retrieval) and are inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) architecture ."
"Furthermore, we show how our AI Sales Agent can **generate sales** by integration with the AI Agent Highway called [Mindware](https://www.mindware.co/). In practice, this allows the agent to autonomously generate a payment link for your customers **to pay for your products via Stripe**.\n",
"\n",
"We leverage the [`langchain`](https://github.com/hwchase17/langchain) library in this implementation, specifically [Custom Agent Configuration](https://langchain-langchain.vercel.app/docs/modules/agents/how_to/custom_agent_with_tool_retrieval) and are inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) architecture ."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -38,9 +42,10 @@
"import os\n",
"import re\n",
"\n",
"# import your OpenAI key\n",
"OPENAI_API_KEY = \"sk-xx\"\n",
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY\n",
"# make sure you have .env file saved locally with your API keys\n",
"from dotenv import load_dotenv\n",
"\n",
"load_dotenv()\n",
"\n",
"from typing import Any, Callable, Dict, List, Union\n",
"\n",
@@ -49,27 +54,18 @@
"from langchain.agents.conversational.prompt import FORMAT_INSTRUCTIONS\n",
"from langchain.chains import LLMChain, RetrievalQA\n",
"from langchain.chains.base import Chain\n",
"from langchain.llms import BaseLLM\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.prompts.base import StringPromptTemplate\n",
"from langchain.schema import AgentAction, AgentFinish\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain_community.llms import BaseLLM\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.agents import AgentAction, AgentFinish\n",
"from langchain_openai import ChatOpenAI, OpenAI, OpenAIEmbeddings\n",
"from langchain.vectorstores import Chroma\n",
"from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n",
"from pydantic import BaseModel, Field"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# install additional dependencies\n",
"# ! pip install chromadb openai tiktoken"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -77,19 +73,21 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Seed the SalesGPT agent\n",
"2. Run Sales Agent to decide what to do:\n",
"\n",
" a) Use a tool, such as look up Product Information in a Knowledge Base\n",
" a) Use a tool, such as look up Product Information in a Knowledge Base or Generate a Payment Link\n",
" \n",
" b) Output a response to a user \n",
"3. Run Sales Stage Recognition Agent to recognize which stage is the sales agent at and adjust their behaviour accordingly."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -98,15 +96,17 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Architecture diagram\n",
"\n",
"<img src=\"https://singularity-assets-public.s3.amazonaws.com/new_flow.png\" width=\"800\" height=\"440\"/>\n"
"<img src=\"https://demo-bucket-45.s3.amazonaws.com/new_flow2.png\" width=\"800\" height=\"440\">\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -131,7 +131,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
@@ -149,7 +149,7 @@
" {conversation_history}\n",
" ===\n",
"\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting only from the following options:\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting ony from the following options:\n",
" 1. Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\n",
" 2. Qualification: Qualify the prospect by confirming if they are the right person to talk to regarding your product/service. Ensure that they have the authority to make purchasing decisions.\n",
" 3. Value proposition: Briefly explain how your product/service can benefit the prospect. Focus on the unique selling points and value proposition of your product/service that sets it apart from competitors.\n",
@@ -171,7 +171,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -223,7 +223,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
@@ -240,13 +240,17 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# test the intermediate chains\n",
"verbose = True\n",
"llm = ChatOpenAI(temperature=0.9)\n",
"llm = ChatOpenAI(\n",
" model=\"gpt-4-turbo-preview\",\n",
" temperature=0.9,\n",
" openai_api_key=os.getenv(\"OPENAI_API_KEY\"),\n",
")\n",
"\n",
"stage_analyzer_chain = StageAnalyzerChain.from_llm(llm, verbose=verbose)\n",
"\n",
@@ -257,7 +261,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"metadata": {},
"outputs": [
{
@@ -276,7 +280,7 @@
" \n",
" ===\n",
"\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting only from the following options:\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting ony from the following options:\n",
" 1. Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\n",
" 2. Qualification: Qualify the prospect by confirming if they are the right person to talk to regarding your product/service. Ensure that they have the authority to make purchasing decisions.\n",
" 3. Value proposition: Briefly explain how your product/service can benefit the prospect. Focus on the unique selling points and value proposition of your product/service that sets it apart from competitors.\n",
@@ -296,21 +300,21 @@
{
"data": {
"text/plain": [
"'1'"
"{'conversation_history': '', 'text': '1'}"
]
},
"execution_count": 7,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"stage_analyzer_chain.run(conversation_history=\"\")"
"stage_analyzer_chain.invoke({\"conversation_history\": \"\"})"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 7,
"metadata": {},
"outputs": [
{
@@ -352,32 +356,44 @@
{
"data": {
"text/plain": [
"\"I'm doing great, thank you for asking! As a Business Development Representative at Sleep Haven, I wanted to reach out to see if you are looking to achieve a better night's sleep. We provide premium mattresses that offer the most comfortable and supportive sleeping experience possible. Are you interested in exploring our sleep solutions? <END_OF_TURN>\""
"{'salesperson_name': 'Ted Lasso',\n",
" 'salesperson_role': 'Business Development Representative',\n",
" 'company_name': 'Sleep Haven',\n",
" 'company_business': 'Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers.',\n",
" 'company_values': \"Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.\",\n",
" 'conversation_purpose': 'find out whether they are looking to achieve better sleep via buying a premier mattress.',\n",
" 'conversation_history': 'Hello, this is Ted Lasso from Sleep Haven. How are you doing today? <END_OF_TURN>\\nUser: I am well, howe are you?<END_OF_TURN>',\n",
" 'conversation_type': 'call',\n",
" 'conversation_stage': 'Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional. Your greeting should be welcoming. Always clarify in your greeting the reason why you are contacting the prospect.',\n",
" 'text': \"I'm doing well, thank you for asking. The reason I'm calling is to discuss how Sleep Haven can help enhance your sleep quality with our premium mattresses. Are you currently looking for ways to achieve a better night's sleep? <END_OF_TURN>\"}"
]
},
"execution_count": 8,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sales_conversation_utterance_chain.run(\n",
" salesperson_name=\"Ted Lasso\",\n",
" salesperson_role=\"Business Development Representative\",\n",
" company_name=\"Sleep Haven\",\n",
" company_business=\"Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers.\",\n",
" company_values=\"Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.\",\n",
" conversation_purpose=\"find out whether they are looking to achieve better sleep via buying a premier mattress.\",\n",
" conversation_history=\"Hello, this is Ted Lasso from Sleep Haven. How are you doing today? <END_OF_TURN>\\nUser: I am well, howe are you?<END_OF_TURN>\",\n",
" conversation_type=\"call\",\n",
" conversation_stage=conversation_stages.get(\n",
" \"1\",\n",
" \"Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\",\n",
" ),\n",
"sales_conversation_utterance_chain.invoke(\n",
" {\n",
" \"salesperson_name\": \"Ted Lasso\",\n",
" \"salesperson_role\": \"Business Development Representative\",\n",
" \"company_name\": \"Sleep Haven\",\n",
" \"company_business\": \"Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers.\",\n",
" \"company_values\": \"Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.\",\n",
" \"conversation_purpose\": \"find out whether they are looking to achieve better sleep via buying a premier mattress.\",\n",
" \"conversation_history\": \"Hello, this is Ted Lasso from Sleep Haven. How are you doing today? <END_OF_TURN>\\nUser: I am well, howe are you?<END_OF_TURN>\",\n",
" \"conversation_type\": \"call\",\n",
" \"conversation_stage\": conversation_stages.get(\n",
" \"1\",\n",
" \"Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\",\n",
" ),\n",
" }\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -385,6 +401,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -395,7 +412,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
@@ -429,7 +446,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
@@ -445,7 +462,7 @@
" text_splitter = CharacterTextSplitter(chunk_size=10, chunk_overlap=0)\n",
" texts = text_splitter.split_text(product_catalog)\n",
"\n",
" llm = OpenAI(temperature=0)\n",
" llm = ChatOpenAI(temperature=0)\n",
" embeddings = OpenAIEmbeddings()\n",
" docsearch = Chroma.from_texts(\n",
" texts, embeddings, collection_name=\"product-knowledge-base\"\n",
@@ -454,29 +471,12 @@
" knowledge_base = RetrievalQA.from_chain_type(\n",
" llm=llm, chain_type=\"stuff\", retriever=docsearch.as_retriever()\n",
" )\n",
" return knowledge_base\n",
"\n",
"\n",
"def get_tools(product_catalog):\n",
" # query to get_tools can be used to be embedded and relevant tools found\n",
" # see here: https://langchain-langchain.vercel.app/docs/use_cases/agents/custom_agent_with_plugin_retrieval#tool-retriever\n",
"\n",
" # we only use one tool for now, but this is highly extensible!\n",
" knowledge_base = setup_knowledge_base(product_catalog)\n",
" tools = [\n",
" Tool(\n",
" name=\"ProductSearch\",\n",
" func=knowledge_base.run,\n",
" description=\"useful for when you need to answer questions about product information\",\n",
" )\n",
" ]\n",
"\n",
" return tools"
" return knowledge_base"
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 10,
"metadata": {},
"outputs": [
{
@@ -485,16 +485,18 @@
"text": [
"Created a chunk of size 940, which is longer than the specified 10\n",
"Created a chunk of size 844, which is longer than the specified 10\n",
"Created a chunk of size 837, which is longer than the specified 10\n"
"Created a chunk of size 837, which is longer than the specified 10\n",
"/Users/filipmichalsky/Odyssey/sales_bot/SalesGPT/env/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:117: LangChainDeprecationWarning: The function `run` was deprecated in LangChain 0.1.0 and will be removed in 0.2.0. Use invoke instead.\n",
" warn_deprecated(\n"
]
},
{
"data": {
"text/plain": [
"' We have four products available: the Classic Harmony Spring Mattress, the Plush Serenity Bamboo Mattress, the Luxury Cloud-Comfort Memory Foam Mattress, and the EcoGreen Hybrid Latex Mattress. Each product is available in different sizes, with the Classic Harmony Spring Mattress available in Queen and King sizes, the Plush Serenity Bamboo Mattress available in King size, the Luxury Cloud-Comfort Memory Foam Mattress available in Twin, Queen, and King sizes, and the EcoGreen Hybrid Latex Mattress available in Twin and Full sizes.'"
"'The Sleep Haven products available are:\\n\\n1. Luxury Cloud-Comfort Memory Foam Mattress\\n2. Classic Harmony Spring Mattress\\n3. EcoGreen Hybrid Latex Mattress\\n4. Plush Serenity Bamboo Mattress\\n\\nEach product has its unique features and price point.'"
]
},
"execution_count": 11,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -508,12 +510,199 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up the SalesGPT Controller with the Sales Agent and Stage Analyzer and a Knowledge Base"
"### Payment gateway"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to set up your AI agent to use a payment gateway to generate payment links for your users you need two things:\n",
"\n",
"1. Sign up for a Stripe account and obtain a STRIPE API KEY\n",
"2. Create products you would like to sell in the Stripe UI. Then follow out example of `example_product_price_id_mapping.json`\n",
"to feed the product name to price_id mapping which allows you to generate the payment links."
]
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"from litellm import completion\n",
"\n",
"# set GPT model env variable\n",
"os.environ[\"GPT_MODEL\"] = \"gpt-4-turbo-preview\"\n",
"\n",
"product_price_id_mapping = {\n",
" \"ai-consulting-services\": \"price_1Ow8ofB795AYY8p1goWGZi6m\",\n",
" \"Luxury Cloud-Comfort Memory Foam Mattress\": \"price_1Owv99B795AYY8p1mjtbKyxP\",\n",
" \"Classic Harmony Spring Mattress\": \"price_1Owv9qB795AYY8p1tPcxCM6T\",\n",
" \"EcoGreen Hybrid Latex Mattress\": \"price_1OwvLDB795AYY8p1YBAMBcbi\",\n",
" \"Plush Serenity Bamboo Mattress\": \"price_1OwvMQB795AYY8p1hJN2uS3S\",\n",
"}\n",
"with open(\"example_product_price_id_mapping.json\", \"w\") as f:\n",
" json.dump(product_price_id_mapping, f)\n",
"\n",
"\n",
"def get_product_id_from_query(query, product_price_id_mapping_path):\n",
" # Load product_price_id_mapping from a JSON file\n",
" with open(product_price_id_mapping_path, \"r\") as f:\n",
" product_price_id_mapping = json.load(f)\n",
"\n",
" # Serialize the product_price_id_mapping to a JSON string for inclusion in the prompt\n",
" product_price_id_mapping_json_str = json.dumps(product_price_id_mapping)\n",
"\n",
" # Dynamically create the enum list from product_price_id_mapping keys\n",
" enum_list = list(product_price_id_mapping.values()) + [\n",
" \"No relevant product id found\"\n",
" ]\n",
" enum_list_str = json.dumps(enum_list)\n",
"\n",
" prompt = f\"\"\"\n",
" You are an expert data scientist and you are working on a project to recommend products to customers based on their needs.\n",
" Given the following query:\n",
" {query}\n",
" and the following product price id mapping:\n",
" {product_price_id_mapping_json_str}\n",
" return the price id that is most relevant to the query.\n",
" ONLY return the price id, no other text. If no relevant price id is found, return 'No relevant price id found'.\n",
" Your output will follow this schema:\n",
" {{\n",
" \"$schema\": \"http://json-schema.org/draft-07/schema#\",\n",
" \"title\": \"Price ID Response\",\n",
" \"type\": \"object\",\n",
" \"properties\": {{\n",
" \"price_id\": {{\n",
" \"type\": \"string\",\n",
" \"enum\": {enum_list_str}\n",
" }}\n",
" }},\n",
" \"required\": [\"price_id\"]\n",
" }}\n",
" Return a valid directly parsable json, dont return in it within a code snippet or add any kind of explanation!!\n",
" \"\"\"\n",
" prompt += \"{\"\n",
" response = completion(\n",
" model=os.getenv(\"GPT_MODEL\", \"gpt-3.5-turbo-1106\"),\n",
" messages=[{\"content\": prompt, \"role\": \"user\"}],\n",
" max_tokens=1000,\n",
" temperature=0,\n",
" )\n",
"\n",
" product_id = response.choices[0].message.content.strip()\n",
" return product_id"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"import requests\n",
"\n",
"\n",
"def generate_stripe_payment_link(query: str) -> str:\n",
" \"\"\"Generate a stripe payment link for a customer based on a single query string.\"\"\"\n",
"\n",
" # example testing payment gateway url\n",
" PAYMENT_GATEWAY_URL = os.getenv(\n",
" \"PAYMENT_GATEWAY_URL\", \"https://agent-payments-gateway.vercel.app/payment\"\n",
" )\n",
" PRODUCT_PRICE_MAPPING = \"example_product_price_id_mapping.json\"\n",
"\n",
" # use LLM to get the price_id from query\n",
" price_id = get_product_id_from_query(query, PRODUCT_PRICE_MAPPING)\n",
" price_id = json.loads(price_id)\n",
" payload = json.dumps(\n",
" {\"prompt\": query, **price_id, \"stripe_key\": os.getenv(\"STRIPE_API_KEY\")}\n",
" )\n",
" headers = {\n",
" \"Content-Type\": \"application/json\",\n",
" }\n",
"\n",
" response = requests.request(\n",
" \"POST\", PAYMENT_GATEWAY_URL, headers=headers, data=payload\n",
" )\n",
" return response.text"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'{\"response\":\"https://buy.stripe.com/test_6oEbLS8JB1F9bv229d\"}'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"generate_stripe_payment_link(\n",
" query=\"Please generate a payment link for John Doe to buy two mattresses - the Classic Harmony Spring Mattress\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup agent tools"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"def get_tools(product_catalog):\n",
" # query to get_tools can be used to be embedded and relevant tools found\n",
" # see here: https://langchain-langchain.vercel.app/docs/use_cases/agents/custom_agent_with_plugin_retrieval#tool-retriever\n",
"\n",
" # we only use one tool for now, but this is highly extensible!\n",
" knowledge_base = setup_knowledge_base(product_catalog)\n",
" tools = [\n",
" Tool(\n",
" name=\"ProductSearch\",\n",
" func=knowledge_base.run,\n",
" description=\"useful for when you need to answer questions about product information or services offered, availability and their costs.\",\n",
" ),\n",
" Tool(\n",
" name=\"GeneratePaymentLink\",\n",
" func=generate_stripe_payment_link,\n",
" description=\"useful to close a transaction with a customer. You need to include product name and quantity and customer name in the query input.\",\n",
" ),\n",
" ]\n",
"\n",
" return tools"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up the SalesGPT Controller with the Sales Agent and Stage Analyzer\n",
"\n",
"#### The Agent has access to a Knowledge Base and can autonomously sell your products via Stripe"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
@@ -563,19 +752,11 @@
" print(\"TEXT\")\n",
" print(text)\n",
" print(\"-------\")\n",
" if f\"{self.ai_prefix}:\" in text:\n",
" return AgentFinish(\n",
" {\"output\": text.split(f\"{self.ai_prefix}:\")[-1].strip()}, text\n",
" )\n",
" regex = r\"Action: (.*?)[\\n]*Action Input: (.*)\"\n",
" match = re.search(regex, text)\n",
" if not match:\n",
" ## TODO - this is not entirely reliable, sometimes results in an error.\n",
" return AgentFinish(\n",
" {\n",
" \"output\": \"I apologize, I was unable to find the answer to your question. Is there anything else I can help with?\"\n",
" },\n",
" text,\n",
" {\"output\": text.split(f\"{self.ai_prefix}:\")[-1].strip()}, text\n",
" )\n",
" # raise OutputParserException(f\"Could not parse LLM output: `{text}`\")\n",
" action = match.group(1)\n",
@@ -589,7 +770,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
@@ -647,18 +828,18 @@
"Previous conversation history:\n",
"{conversation_history}\n",
"\n",
"{salesperson_name}:\n",
"Thought:\n",
"{agent_scratchpad}\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"class SalesGPT(Chain, BaseModel):\n",
"class SalesGPT(Chain):\n",
" \"\"\"Controller model for the Sales Agent.\"\"\"\n",
"\n",
" conversation_history: List[str] = []\n",
@@ -804,7 +985,9 @@
"\n",
" # WARNING: this output parser is NOT reliable yet\n",
" ## It makes assumptions about output from LLM which can break and throw an error\n",
" output_parser = SalesConvoOutputParser(ai_prefix=kwargs[\"salesperson_name\"])\n",
" output_parser = SalesConvoOutputParser(\n",
" ai_prefix=kwargs[\"salesperson_name\"], verbose=verbose\n",
" )\n",
"\n",
" sales_agent_with_tools = LLMSingleActionAgent(\n",
" llm_chain=llm_chain,\n",
@@ -828,6 +1011,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -835,6 +1019,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -843,7 +1028,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
@@ -880,6 +1065,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -888,7 +1074,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 22,
"metadata": {},
"outputs": [
{
@@ -897,7 +1083,9 @@
"text": [
"Created a chunk of size 940, which is longer than the specified 10\n",
"Created a chunk of size 844, which is longer than the specified 10\n",
"Created a chunk of size 837, which is longer than the specified 10\n"
"Created a chunk of size 837, which is longer than the specified 10\n",
"/Users/filipmichalsky/Odyssey/sales_bot/SalesGPT/env/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:117: LangChainDeprecationWarning: The class `langchain.agents.agent.LLMSingleActionAgent` was deprecated in langchain 0.1.0 and will be removed in 0.2.0. Use Use new agent constructor methods like create_react_agent, create_json_agent, create_structured_chat_agent, etc. instead.\n",
" warn_deprecated(\n"
]
}
],
@@ -907,7 +1095,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
@@ -917,7 +1105,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 22,
"metadata": {},
"outputs": [
{
@@ -934,14 +1122,14 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Hello, this is Ted Lasso from Sleep Haven. How are you doing today?\n"
"Ted Lasso: Good day! This is Ted Lasso from Sleep Haven. How are you doing today?\n"
]
}
],
@@ -951,18 +1139,18 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\n",
" \"I am well, how are you? I would like to learn more about your mattresses.\"\n",
" \"I am well, how are you? I would like to learn more about your services.\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 25,
"metadata": {},
"outputs": [
{
@@ -977,92 +1165,32 @@
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: I'm glad to hear that you're doing well! As for our mattresses, at Sleep Haven, we provide customers with the most comfortable and supportive sleeping experience possible. Our high-quality mattresses are designed to meet the unique needs of our customers. Can I ask what specifically you'd like to learn more about? \n"
]
}
],
"source": [
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\"Yes, what materials are you mattresses made from?\")"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Conversation Stage: Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.\n"
]
}
],
"source": [
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Our mattresses are made from a variety of materials, depending on the model. We have the EcoGreen Hybrid Latex Mattress, which is made from 100% natural latex harvested from eco-friendly plantations. The Plush Serenity Bamboo Mattress features a layer of plush, adaptive foam and a base of high-resilience support foam, with a bamboo-infused top layer. The Luxury Cloud-Comfort Memory Foam Mattress has an innovative, temperature-sensitive memory foam layer and a high-density foam base with cooling gel-infused particles. Finally, the Classic Harmony Spring Mattress has a robust inner spring construction and layers of plush padding, with a quilted top layer and a natural cotton cover. Is there anything specific you'd like to know about these materials?\n"
]
}
],
"source": [
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: I'm doing great, thank you for asking! I'm glad to hear you're interested. Sleep Haven is a premium mattress company, and we're all about offering the best sleep solutions, including top-notch mattresses, pillows, and bedding accessories. Our mission is to help you achieve a better night's sleep. May I know if you're looking to enhance your sleep experience with a new mattress or bedding accessories? \n"
]
}
],
"source": [
"sales_agent.human_step(\n",
" \"Yes, I am looking for a queen sized mattress. Do you have any mattresses in queen size?\"\n",
")"
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Conversation Stage: Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.\n"
]
}
],
"outputs": [],
"source": [
"sales_agent.determine_conversation_stage()"
"sales_agent.human_step(\n",
" \"Yes, I would like to improve my sleep. Can you tell me more about your products?\"\n",
")"
]
},
{
@@ -1074,7 +1202,24 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Yes, we do have queen-sized mattresses available. We offer the Luxury Cloud-Comfort Memory Foam Mattress and the Classic Harmony Spring Mattress in queen size. Both mattresses provide exceptional comfort and support. Is there anything specific you would like to know about these options?\n"
"Conversation Stage: Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.\n"
]
}
],
"source": [
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Absolutely, I'd be happy to share more about our products. At Sleep Haven, we offer a variety of high-quality mattresses designed to cater to different sleeping preferences and needs. Whether you're looking for memory foam's comfort, the support of hybrid mattresses, or the breathability of natural latex, we have options for everyone. Our pillows and bedding accessories are similarly curated to enhance your sleep quality. Every product is built with the aim of helping you achieve the restful night's sleep you deserve. What specific features are you looking for in a mattress? \n"
]
}
],
@@ -1084,16 +1229,16 @@
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\"Yea, compare and contrast those two options, please.\")"
"sales_agent.human_step(\"What mattresses do you have and how much do they cost?\")"
]
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 32,
"metadata": {},
"outputs": [
{
@@ -1110,14 +1255,14 @@
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: The Luxury Cloud-Comfort Memory Foam Mattress is priced at $999 and is available in Twin, Queen, and King sizes. It features an innovative, temperature-sensitive memory foam layer and a high-density foam base. On the other hand, the Classic Harmony Spring Mattress is priced at $1,299 and is available in Queen and King sizes. It features a robust inner spring construction and layers of plush padding. Both mattresses provide exceptional comfort and support, but the Classic Harmony Spring Mattress may be a better option if you prefer the traditional feel of an inner spring mattress. Do you have any other questions about these options?\n"
"Ted Lasso: We offer two primary types of mattresses at Sleep Haven. The first is our Luxury Cloud-Comfort Memory Foam Mattress, which is priced at $999 and comes in Twin, Queen, and King sizes. The second is our Classic Harmony Spring Mattress, priced at $1,299, available in Queen and King sizes. Both are designed to provide exceptional comfort and support for a better night's sleep. Which type of mattress would you be interested in learning more about? \n"
]
}
],
@@ -1127,14 +1272,66 @@
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\n",
" \"Great, thanks, that's it. I will talk to my wife and call back if she is onboard. Have a good day!\"\n",
" \"Okay.I would like to order two Memory Foam mattresses in Twin size please.\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Conversation Stage: Close: Ask for the sale by proposing a next step. This could be a demo, a trial or a meeting with decision-makers. Ensure to summarize what has been discussed and reiterate the benefits.\n"
]
}
],
"source": [
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Fantastic choice! You're on your way to a better night's sleep with our Luxury Cloud-Comfort Memory Foam Mattresses. I've generated a payment link for two Twin size mattresses for you. Here is the link to complete your purchase: https://buy.stripe.com/test_6oEg28e3V97BdDabJn. Is there anything else I can assist you with today? \n"
]
}
],
"source": [
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\n",
" \"Great, thanks! I will discuss with my wife and will buy it if she is onboard. Have a good day!\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -1153,9 +1350,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -31,7 +31,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install langchain lark openai elasticsearch pandas"
"!pip install langchain langchain-elasticsearch lark openai elasticsearch pandas"
]
},
{
@@ -1083,7 +1083,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.vectorstores import ElasticsearchStore\n",
"from langchain_elasticsearch import ElasticsearchStore\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"embeddings = OpenAIEmbeddings()"

View File

@@ -209,7 +209,7 @@
}
],
"source": [
"chain.run({})"
"chain.invoke({})"
]
},
{

View File

@@ -9,7 +9,7 @@
" \n",
"[Together AI](https://python.langchain.com/docs/integrations/llms/together) has a broad set of OSS LLMs via inference API.\n",
"\n",
"See [here](https://api.together.xyz/playground). We use `\"mistralai/Mixtral-8x7B-Instruct-v0.1` for RAG on the Mixtral paper.\n",
"See [here](https://docs.together.ai/docs/inference-models). We use `\"mistralai/Mixtral-8x7B-Instruct-v0.1` for RAG on the Mixtral paper.\n",
"\n",
"Download the paper:\n",
"https://arxiv.org/pdf/2401.04088.pdf"
@@ -39,7 +39,7 @@
"data = loader.load()\n",
"\n",
"# Split\n",
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"\n",
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=2000, chunk_overlap=0)\n",
"all_splits = text_splitter.split_documents(data)\n",
@@ -148,7 +148,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.9.6"
}
},
"nbformat": 4,

View File

@@ -2610,7 +2610,7 @@
}
],
"source": [
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",
"text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)\n",
"texts = text_splitter.split_documents(docs)"

View File

@@ -0,0 +1,174 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Video Captioning\n",
"This notebook shows how to use VideoCaptioningChain, which is implemented using Langchain's ImageCaptionLoader and AssemblyAI to produce .srt files.\n",
"\n",
"This system autogenerates both subtitles and closed captions from a video URL."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installing Dependencies"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# !pip install ffmpeg-python\n",
"# !pip install assemblyai\n",
"# !pip install opencv-python\n",
"# !pip install torch\n",
"# !pip install pillow\n",
"# !pip install transformers\n",
"# !pip install langchain"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"ExecuteTime": {
"end_time": "2023-11-30T03:39:14.078232Z",
"start_time": "2023-11-30T03:39:12.534410Z"
}
},
"outputs": [],
"source": [
"import getpass\n",
"\n",
"from langchain.chains.video_captioning import VideoCaptioningChain\n",
"from langchain.chat_models.openai import ChatOpenAI"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setting up API Keys"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2023-11-30T03:39:17.423806Z",
"start_time": "2023-11-30T03:39:17.417945Z"
}
},
"outputs": [],
"source": [
"OPENAI_API_KEY = getpass.getpass(\"OpenAI API Key:\")\n",
"\n",
"ASSEMBLYAI_API_KEY = getpass.getpass(\"AssemblyAI API Key:\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Required parameters:**\n",
"\n",
"* llm: The language model this chain will use to get suggestions on how to refine the closed-captions\n",
"* assemblyai_key: The API key for AssemblyAI, used to generate the subtitles\n",
"\n",
"**Optional Parameters:**\n",
"\n",
"* verbose (Default: True): Sets verbose mode for downstream chain calls\n",
"* use_logging (Default: True): Log the chain's processes in run manager\n",
"* frame_skip (Default: None): Choose how many video frames to skip during processing. Increasing it results in faster execution, but less accurate results. If None, frame skip is calculated manually based on the framerate Set this to 0 to sample all frames\n",
"* image_delta_threshold (Default: 3000000): Set the sensitivity for what the image processor considers a change in scenery in the video, used to delimit closed captions. Higher = less sensitive\n",
"* closed_caption_char_limit (Default: 20): Sets the character limit on closed captions\n",
"* closed_caption_similarity_threshold (Default: 80): Sets the percentage value to how similar two closed caption models should be in order to be clustered into one longer closed caption\n",
"* use_unclustered_video_models (Default: False): If true, closed captions that could not be clustered will be included. May result in spontaneous behaviour from closed captions such as very short lasting captions or fast-changing captions. Enabling this is experimental and not recommended"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example run"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# https://ia804703.us.archive.org/27/items/uh-oh-here-we-go-again/Uh-Oh%2C%20Here%20we%20go%20again.mp4\n",
"# https://ia601200.us.archive.org/9/items/f58703d4-61e6-4f8f-8c08-b42c7e16f7cb/f58703d4-61e6-4f8f-8c08-b42c7e16f7cb.mp4\n",
"\n",
"chain = VideoCaptioningChain(\n",
" llm=ChatOpenAI(model=\"gpt-4\", max_tokens=4000, openai_api_key=OPENAI_API_KEY),\n",
" assemblyai_key=ASSEMBLYAI_API_KEY,\n",
")\n",
"\n",
"srt_content = chain.run(\n",
" video_file_path=\"https://ia601200.us.archive.org/9/items/f58703d4-61e6-4f8f-8c08-b42c7e16f7cb/f58703d4-61e6-4f8f-8c08-b42c7e16f7cb.mp4\"\n",
")\n",
"\n",
"print(srt_content)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Writing output to .srt file"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"with open(\"output.srt\", \"w\") as file:\n",
" file.write(srt_content)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "myenv",
"language": "python",
"name": "myenv"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
},
"vscode": {
"interpreter": {
"hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -1,12 +1,17 @@
# docker-compose to make it easier to spin up integration tests.
# Services should use NON standard ports to avoid collision with
# any existing services that might be used for development.
# ATTENTION: When adding a service below use a non-standard port
# increment by one from the preceding port.
# For credentials always use `langchain` and `langchain` for the
# username and password.
version: "3"
name: langchain-tests
services:
redis:
image: redis/redis-stack-server:latest
# We use non standard ports since
# We use non standard ports since
# these instances are used for testing
# and users may already have existing
# redis instances set up locally
@@ -19,3 +24,61 @@ services:
image: graphdb
ports:
- "6021:7200"
mongo:
image: mongo:latest
container_name: mongo_container
ports:
- "6022:27017"
environment:
MONGO_INITDB_ROOT_USERNAME: langchain
MONGO_INITDB_ROOT_PASSWORD: langchain
postgres:
image: postgres:16
environment:
POSTGRES_DB: langchain
POSTGRES_USER: langchain
POSTGRES_PASSWORD: langchain
ports:
- "6023:5432"
command: |
postgres -c log_statement=all
healthcheck:
test:
[
"CMD-SHELL",
"psql postgresql://langchain:langchain@localhost/langchain --command 'SELECT 1;' || exit 1",
]
interval: 5s
retries: 60
volumes:
- postgres_data:/var/lib/postgresql/data
pgvector:
# postgres with the pgvector extension
image: ankane/pgvector
environment:
POSTGRES_DB: langchain
POSTGRES_USER: langchain
POSTGRES_PASSWORD: langchain
ports:
- "6024:5432"
command: |
postgres -c log_statement=all
healthcheck:
test:
[
"CMD-SHELL",
"psql postgresql://langchain:langchain@localhost/langchain --command 'SELECT 1;' || exit 1",
]
interval: 5s
retries: 60
volumes:
- postgres_data_pgvector:/var/lib/postgresql/data
vdms:
image: intellabs/vdms:latest
container_name: vdms_container
ports:
- "6025:55555"
volumes:
postgres_data:
postgres_data_pgvector:

2
docs/.gitignore vendored Normal file
View File

@@ -0,0 +1,2 @@
/.quarto/
src/supabase.d.ts

1
docs/.yarnrc.yml Normal file
View File

@@ -0,0 +1 @@
nodeLinker: node-modules

View File

@@ -1,4 +1,5 @@
"""Configuration file for the Sphinx documentation builder."""
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
@@ -174,3 +175,6 @@ myst_enable_extensions = ["colon_fence"]
# generate autosummary even if no references
autosummary_generate = True
html_copy_source = False
html_show_sourcelink = False

View File

@@ -3,6 +3,7 @@
import importlib
import inspect
import os
import sys
import typing
from enum import Enum
from pathlib import Path
@@ -306,7 +307,14 @@ def _package_namespace(package_name: str) -> str:
def _package_dir(package_name: str = "langchain") -> Path:
"""Return the path to the directory containing the documentation."""
if package_name in ("langchain", "experimental", "community", "core", "cli"):
if package_name in (
"langchain",
"experimental",
"community",
"core",
"cli",
"text-splitters",
):
return ROOT_DIR / "libs" / package_name / _package_namespace(package_name)
else:
return (
@@ -344,28 +352,29 @@ def _doc_first_line(package_name: str) -> str:
return f".. {package_name.replace('-', '_')}_api_reference:\n\n"
def main() -> None:
def main(dirs: Optional[list] = None) -> None:
"""Generate the api_reference.rst file for each package."""
print("Starting to build API reference files.")
for dir in os.listdir(ROOT_DIR / "libs"):
if not dirs:
dirs = [
dir_
for dir_ in os.listdir(ROOT_DIR / "libs")
if dir_ not in ("cli", "partners")
]
dirs += os.listdir(ROOT_DIR / "libs" / "partners")
for dir_ in dirs:
# Skip any hidden directories
# Some of these could be present by mistake in the code base
# e.g., .pytest_cache from running tests from the wrong location.
if dir.startswith("."):
print("Skipping dir:", dir)
continue
if dir in ("cli", "partners"):
if dir_.startswith("."):
print("Skipping dir:", dir_)
continue
else:
print("Building package:", dir)
_build_rst_file(package_name=dir)
partner_packages = os.listdir(ROOT_DIR / "libs" / "partners")
print("Building partner packages:", partner_packages)
for dir in partner_packages:
_build_rst_file(package_name=dir)
print("Building package:", dir_)
_build_rst_file(package_name=dir_)
print("API reference files built.")
if __name__ == "__main__":
main()
dirs = sys.argv[1:] or None
main(dirs=dirs)

File diff suppressed because one or more lines are too long

View File

@@ -43,6 +43,9 @@
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('experimental_api_reference') }}">Experimental</a>
</li>
<li class="nav-item">
<a class="sk-nav-link nav-link" href="{{ pathto('text_splitters_api_reference') }}">Text splitters</a>
</li>
{%- for title, pathname in partners %}
<li class="nav-item">
<a class="sk-nav-link nav-link nav-more-item-mobile-items" href="{{ pathto(pathname) }}">{{ title }}</a>

File diff suppressed because it is too large Load Diff

View File

@@ -241,7 +241,6 @@ Dependents stats for `langchain-ai/langchain`
|[alejandro-ao/langchain-ask-pdf](https://github.com/alejandro-ao/langchain-ask-pdf) | 514 |
|[sajjadium/ctf-archives](https://github.com/sajjadium/ctf-archives) | 507 |
|[continuum-llms/chatgpt-memory](https://github.com/continuum-llms/chatgpt-memory) | 502 |
|[llmOS/opencopilot](https://github.com/llmOS/opencopilot) | 495 |
|[steamship-core/steamship-langchain](https://github.com/steamship-core/steamship-langchain) | 494 |
|[mpaepper/content-chatbot](https://github.com/mpaepper/content-chatbot) | 493 |
|[langchain-ai/langchain-aiplugin](https://github.com/langchain-ai/langchain-aiplugin) | 492 |
@@ -455,7 +454,6 @@ Dependents stats for `langchain-ai/langchain`
|[Teahouse-Studios/akari-bot](https://github.com/Teahouse-Studios/akari-bot) | 149 |
|[realminchoi/babyagi-ui](https://github.com/realminchoi/babyagi-ui) | 148 |
|[ssheng/BentoChain](https://github.com/ssheng/BentoChain) | 148 |
|[lmstudio-ai/examples](https://github.com/lmstudio-ai/examples) | 147 |
|[solana-labs/chatgpt-plugin](https://github.com/solana-labs/chatgpt-plugin) | 147 |
|[aurelio-labs/arxiv-bot](https://github.com/aurelio-labs/arxiv-bot) | 147 |
|[Jaseci-Labs/jaseci](https://github.com/Jaseci-Labs/jaseci) | 146 |

View File

@@ -1,155 +1,50 @@
# Tutorials
Below are links to tutorials and courses on LangChain. For written guides on common use cases for LangChain, check out the [use cases guides](/docs/use_cases).
## Books and Handbooks
⛓ icon marks a new addition [last update 2024-02-06]
- [Generative AI with LangChain](https://www.amazon.com/Generative-AI-LangChain-language-ChatGPT/dp/1835083463/ref=sr_1_1?crid=1GMOMH0G7GLR&keywords=generative+ai+with+langchain&qid=1703247181&sprefix=%2Caps%2C298&sr=8-1) by [Ben Auffrath](https://www.amazon.com/stores/Ben-Auffarth/author/B08JQKSZ7D?ref=ap_rdr&store_ref=ap_rdr&isDramIntegrated=true&shoppingPortalEnabled=true), ©️ 2023 Packt Publishing
- [LangChain AI Handbook](https://www.pinecone.io/learn/langchain/) By **James Briggs** and **Francisco Ingham**
- [LangChain Cheatsheet](https://pub.towardsai.net/langchain-cheatsheet-all-secrets-on-a-single-page-8be26b721cde) by **Ivan Reznikov**
---------------------
### [LangChain](https://en.wikipedia.org/wiki/LangChain) on Wikipedia
### Books
#### [Generative AI with LangChain](https://www.amazon.com/Generative-AI-LangChain-language-ChatGPT/dp/1835083463/ref=sr_1_1?crid=1GMOMH0G7GLR&keywords=generative+ai+with+langchain&qid=1703247181&sprefix=%2Caps%2C298&sr=8-1) by [Ben Auffrath](https://www.amazon.com/stores/Ben-Auffarth/author/B08JQKSZ7D?ref=ap_rdr&store_ref=ap_rdr&isDramIntegrated=true&shoppingPortalEnabled=true), ©️ 2023 Packt Publishing
### DeepLearning.AI courses
by [Harrison Chase](https://en.wikipedia.org/wiki/LangChain) and [Andrew Ng](https://en.wikipedia.org/wiki/Andrew_Ng)
- [LangChain for LLM Application Development](https://learn.deeplearning.ai/langchain)
- [LangChain Chat with Your Data](https://learn.deeplearning.ai/langchain-chat-with-your-data)
- [Functions, Tools and Agents with LangChain](https://learn.deeplearning.ai/functions-tools-agents-langchain)
### Handbook
[LangChain AI Handbook](https://www.pinecone.io/learn/langchain/) By **James Briggs** and **Francisco Ingham**
⛓ [LangChain Cheatsheet](https://pub.towardsai.net/langchain-cheatsheet-all-secrets-on-a-single-page-8be26b721cde) by **Ivan Reznikov**
### Short Tutorials
[LangChain Explained in 13 Minutes | QuickStart Tutorial for Beginners](https://youtu.be/aywZrzNaKjs) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)
[LangChain Crash Course: Build an AutoGPT app in 25 minutes](https://youtu.be/MlK6SIjcjE8) by [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte)
[LangChain Crash Course - Build apps with language models](https://youtu.be/LbT1yp6quS8) by [Patrick Loeber](https://www.youtube.com/@patloeber)
⛓ [LangChain 101 Course](https://medium.com/@ivanreznikov/langchain-101-course-updated-668f7b41d6cb) by **Ivan Reznikov**
## Tutorials
### [LangChain for Gen AI and LLMs](https://www.youtube.com/playlist?list=PLIUOU7oqGTLieV9uTIFMm6_4PXg-hlN6F) by [James Briggs](https://www.youtube.com/@jamesbriggs)
- #1 [Getting Started with `GPT-3` vs. Open Source LLMs](https://youtu.be/nE2skSRWTTs)
- #2 [Prompt Templates for `GPT 3.5` and other LLMs](https://youtu.be/RflBcK0oDH0)
- #3 [LLM Chains using `GPT 3.5` and other LLMs](https://youtu.be/S8j9Tk0lZHU)
- [LangChain Data Loaders, Tokenizers, Chunking, and Datasets - Data Prep 101](https://youtu.be/eqOfr4AGLk8)
- #4 [Chatbot Memory for `Chat-GPT`, `Davinci` + other LLMs](https://youtu.be/X05uK0TZozM)
- #5 [Chat with OpenAI in LangChain](https://youtu.be/CnAgB3A5OlU)
- #6 [Fixing LLM Hallucinations with Retrieval Augmentation in LangChain](https://youtu.be/kvdVduIJsc8)
- #7 [LangChain Agents Deep Dive with `GPT 3.5`](https://youtu.be/jSP-gSEyVeI)
- #8 [Create Custom Tools for Chatbots in LangChain](https://youtu.be/q-HNphrWsDE)
- #9 [Build Conversational Agents with Vector DBs](https://youtu.be/H6bCqqw9xyI)
- [Using NEW `MPT-7B` in Hugging Face and LangChain](https://youtu.be/DXpk9K7DgMo)
- [`MPT-30B` Chatbot with LangChain](https://youtu.be/pnem-EhT6VI)
- [Fine-tuning OpenAI's `GPT 3.5` for LangChain Agents](https://youtu.be/boHXgQ5eQic?si=OOOfK-GhsgZGBqSr)
- [Chatbots with `RAG`: LangChain Full Walkthrough](https://youtu.be/LhnCsygAvzY?si=N7k6xy4RQksbWwsQ)
### [by Greg Kamradt](https://www.youtube.com/playlist?list=PLqZXAkvF1bPNQER9mLmDbntNfSpzdDIU5)
### [by Sam Witteveen](https://www.youtube.com/playlist?list=PL8motc6AQftk1Bs42EW45kwYbyJ4jOdiZ)
### [by James Briggs](https://www.youtube.com/playlist?list=PLIUOU7oqGTLieV9uTIFMm6_4PXg-hlN6F)
### [by Prompt Engineering](https://www.youtube.com/playlist?list=PLVEEucA9MYhOu89CX8H3MBZqayTbcCTMr)
### [by Mayo Oshin](https://www.youtube.com/@chatwithdata/search?query=langchain)
### [by 1 little Coder](https://www.youtube.com/playlist?list=PLpdmBGJ6ELUK-v0MK-t4wZmVEbxM5xk6L)
### [LangChain 101](https://www.youtube.com/playlist?list=PLqZXAkvF1bPNQER9mLmDbntNfSpzdDIU5) by [Greg Kamradt (Data Indy)](https://www.youtube.com/@DataIndependent)
- [What Is LangChain? - LangChain + `ChatGPT` Overview](https://youtu.be/_v_fgW2SkkQ)
- [Quickstart Guide](https://youtu.be/kYRB-vJFy38)
- [Beginner's Guide To 7 Essential Concepts](https://youtu.be/2xxziIWmaSA)
- [Beginner's Guide To 9 Use Cases](https://youtu.be/vGP4pQdCocw)
- [Agents Overview + Google Searches](https://youtu.be/Jq9Sf68ozk0)
- [`OpenAI` + `Wolfram Alpha`](https://youtu.be/UijbzCIJ99g)
- [Ask Questions On Your Custom (or Private) Files](https://youtu.be/EnT-ZTrcPrg)
- [Connect `Google Drive Files` To `OpenAI`](https://youtu.be/IqqHqDcXLww)
- [`YouTube Transcripts` + `OpenAI`](https://youtu.be/pNcQ5XXMgH4)
- [Question A 300 Page Book (w/ `OpenAI` + `Pinecone`)](https://youtu.be/h0DHDp1FbmQ)
- [Workaround `OpenAI's` Token Limit With Chain Types](https://youtu.be/f9_BWhCI4Zo)
- [Build Your Own OpenAI + LangChain Web App in 23 Minutes](https://youtu.be/U_eV8wfMkXU)
- [Working With The New `ChatGPT API`](https://youtu.be/e9P7FLi5Zy8)
- [OpenAI + LangChain Wrote Me 100 Custom Sales Emails](https://youtu.be/y1pyAQM-3Bo)
- [Structured Output From `OpenAI` (Clean Dirty Data)](https://youtu.be/KwAXfey-xQk)
- [Connect `OpenAI` To +5,000 Tools (LangChain + `Zapier`)](https://youtu.be/7tNm0yiDigU)
- [Use LLMs To Extract Data From Text (Expert Mode)](https://youtu.be/xZzvwR9jdPA)
- [Extract Insights From Interview Transcripts Using LLMs](https://youtu.be/shkMOHwJ4SM)
- [5 Levels Of LLM Summarizing: Novice to Expert](https://youtu.be/qaPMdcCqtWk)
- [Control Tone & Writing Style Of Your LLM Output](https://youtu.be/miBG-a3FuhU)
- [Build Your Own `AI Twitter Bot` Using LLMs](https://youtu.be/yLWLDjT01q8)
- [ChatGPT made my interview questions for me (`Streamlit` + LangChain)](https://youtu.be/zvoAMx0WKkw)
- [Function Calling via ChatGPT API - First Look With LangChain](https://youtu.be/0-zlUy7VUjg)
- [Extract Topics From Video/Audio With LLMs (Topic Modeling w/ LangChain)](https://youtu.be/pEkxRQFNAs4)
## Courses
### Featured courses on Deeplearning.AI
### [LangChain How to and guides](https://www.youtube.com/playlist?list=PL8motc6AQftk1Bs42EW45kwYbyJ4jOdiZ) by [Sam Witteveen](https://www.youtube.com/@samwitteveenai)
- [LangChain Basics - LLMs & PromptTemplates with Colab](https://youtu.be/J_0qvRt4LNk)
- [LangChain Basics - Tools and Chains](https://youtu.be/hI2BY7yl_Ac)
- [`ChatGPT API` Announcement & Code Walkthrough with LangChain](https://youtu.be/phHqvLHCwH4)
- [Conversations with Memory (explanation & code walkthrough)](https://youtu.be/X550Zbz_ROE)
- [Chat with `Flan20B`](https://youtu.be/VW5LBavIfY4)
- [Using `Hugging Face Models` locally (code walkthrough)](https://youtu.be/Kn7SX2Mx_Jk)
- [`PAL`: Program-aided Language Models with LangChain code](https://youtu.be/dy7-LvDu-3s)
- [Building a Summarization System with LangChain and `GPT-3` - Part 1](https://youtu.be/LNq_2s_H01Y)
- [Building a Summarization System with LangChain and `GPT-3` - Part 2](https://youtu.be/d-yeHDLgKHw)
- [Microsoft's `Visual ChatGPT` using LangChain](https://youtu.be/7YEiEyfPF5U)
- [LangChain Agents - Joining Tools and Chains with Decisions](https://youtu.be/ziu87EXZVUE)
- [Comparing LLMs with LangChain](https://youtu.be/rFNG0MIEuW0)
- [Using `Constitutional AI` in LangChain](https://youtu.be/uoVqNFDwpX4)
- [Talking to `Alpaca` with LangChain - Creating an Alpaca Chatbot](https://youtu.be/v6sF8Ed3nTE)
- [Talk to your `CSV` & `Excel` with LangChain](https://youtu.be/xQ3mZhw69bc)
- [`BabyAGI`: Discover the Power of Task-Driven Autonomous Agents!](https://youtu.be/QBcDLSE2ERA)
- [Improve your `BabyAGI` with LangChain](https://youtu.be/DRgPyOXZ-oE)
- [Master `PDF` Chat with LangChain - Your essential guide to queries on documents](https://youtu.be/ZzgUqFtxgXI)
- [Using LangChain with `DuckDuckGO`, `Wikipedia` & `PythonREPL` Tools](https://youtu.be/KerHlb8nuVc)
- [Building Custom Tools and Agents with LangChain (gpt-3.5-turbo)](https://youtu.be/biS8G8x8DdA)
- [LangChain Retrieval QA Over Multiple Files with `ChromaDB`](https://youtu.be/3yPBVii7Ct0)
- [LangChain Retrieval QA with Instructor Embeddings & `ChromaDB` for PDFs](https://youtu.be/cFCGUjc33aU)
- [LangChain + Retrieval Local LLMs for Retrieval QA - No OpenAI!!!](https://youtu.be/9ISVjh8mdlA)
- [`Camel` + LangChain for Synthetic Data & Market Research](https://youtu.be/GldMMK6-_-g)
- [Information Extraction with LangChain & `Kor`](https://youtu.be/SW1ZdqH0rRQ)
- [Converting a LangChain App from OpenAI to OpenSource](https://youtu.be/KUDn7bVyIfc)
- [Using LangChain `Output Parsers` to get what you want out of LLMs](https://youtu.be/UVn2NroKQCw)
- [Building a LangChain Custom Medical Agent with Memory](https://youtu.be/6UFtRwWnHws)
- [Understanding `ReACT` with LangChain](https://youtu.be/Eug2clsLtFs)
- [`OpenAI Functions` + LangChain : Building a Multi Tool Agent](https://youtu.be/4KXK6c6TVXQ)
- [What can you do with 16K tokens in LangChain?](https://youtu.be/z2aCZBAtWXs)
- [Tagging and Extraction - Classification using `OpenAI Functions`](https://youtu.be/a8hMgIcUEnE)
- [HOW to Make Conversational Form with LangChain](https://youtu.be/IT93On2LB5k)
- [`Claude-2` meets LangChain!](https://youtu.be/Hb_D3p0bK2U?si=j96Kc7oJoeRI5-iC)
- [`PaLM 2` Meets LangChain](https://youtu.be/orPwLibLqm4?si=KgJjpEbAD9YBPqT4)
- [`LLaMA2` with LangChain - Basics | LangChain TUTORIAL](https://youtu.be/cIRzwSXB4Rc?si=v3Hwxk1m3fksBIHN)
- [Serving `LLaMA2` with `Replicate`](https://youtu.be/JIF4nNi26DE?si=dSazFyC4UQmaR-rJ)
- [NEW LangChain Expression Language](https://youtu.be/ud7HJ2p3gp0?si=8pJ9O6hGbXrCX5G9)
- [Building a RCI Chain for Agents with LangChain Expression Language](https://youtu.be/QaKM5s0TnsY?si=0miEj-o17AHcGfLG)
- [How to Run `LLaMA-2-70B` on the `Together AI`](https://youtu.be/Tc2DHfzHeYE?si=Xku3S9dlBxWQukpe)
- [`RetrievalQA` with `LLaMA 2 70b` & `Chroma` DB](https://youtu.be/93yueQQnqpM?si=ZMwj-eS_CGLnNMXZ)
- [How to use `BGE Embeddings` for LangChain](https://youtu.be/sWRvSG7vL4g?si=85jnvnmTCF9YIWXI)
- [How to use Custom Prompts for `RetrievalQA` on `LLaMA-2 7B`](https://youtu.be/PDwUKves9GY?si=sMF99TWU0p4eiK80)
- [LangChain for LLM Application Development](https://learn.deeplearning.ai/langchain)
- [LangChain Chat with Your Data](https://learn.deeplearning.ai/langchain-chat-with-your-data)
- [Functions, Tools and Agents with LangChain](https://learn.deeplearning.ai/functions-tools-agents-langchain)
- [Build LLM Apps with LangChain.js](https://learn.deeplearning.ai/courses/build-llm-apps-with-langchain-js)
### Online courses
### [LangChain](https://www.youtube.com/playlist?list=PLVEEucA9MYhOu89CX8H3MBZqayTbcCTMr) by [Prompt Engineering](https://www.youtube.com/@engineerprompt)
- [LangChain Crash Course — All You Need to Know to Build Powerful Apps with LLMs](https://youtu.be/5-fc4Tlgmro)
- [Working with MULTIPLE `PDF` Files in LangChain: `ChatGPT` for your Data](https://youtu.be/s5LhRdh5fu4)
- [`ChatGPT` for YOUR OWN `PDF` files with LangChain](https://youtu.be/TLf90ipMzfE)
- [Talk to YOUR DATA without OpenAI APIs: LangChain](https://youtu.be/wrD-fZvT6UI)
- [LangChain: `PDF` Chat App (GUI) | `ChatGPT` for Your `PDF` FILES](https://youtu.be/RIWbalZ7sTo)
- [`LangFlow`: Build Chatbots without Writing Code](https://youtu.be/KJ-ux3hre4s)
- [LangChain: Giving Memory to LLMs](https://youtu.be/dxO6pzlgJiY)
- [BEST OPEN Alternative to `OPENAI's EMBEDDINGs` for Retrieval QA: LangChain](https://youtu.be/ogEalPMUCSY)
- [LangChain: Run Language Models Locally - `Hugging Face Models`](https://youtu.be/Xxxuw4_iCzw)
- [Slash API Costs: Mastering Caching for LLM Applications](https://youtu.be/EQOznhaJWR0?si=AXoI7f3-SVFRvQUl)
- [Avoid PROMPT INJECTION with `Constitutional AI` - LangChain](https://youtu.be/tyKSkPFHVX8?si=9mgcB5Y1kkotkBGB)
- [Udemy](https://www.udemy.com/courses/search/?q=langchain)
- [Pluralsight](https://www.pluralsight.com/search?q=langchain)
- [Coursera](https://www.coursera.org/search?query=langchain)
- [Maven](https://maven.com/courses?query=langchain)
- [Udacity](https://www.udacity.com/catalog/all/any-price/any-school/any-skill/any-difficulty/any-duration/any-type/relevance/page-1?searchValue=langchain)
- [LinkedIn Learning](https://www.linkedin.com/search/results/learning/?keywords=langchain)
- [edX](https://www.edx.org/search?q=langchain)
## Short Tutorials
### LangChain by [Chat with data](https://www.youtube.com/@chatwithdata)
- [LangChain Beginner's Tutorial for `Typescript`/`Javascript`](https://youtu.be/bH722QgRlhQ)
- [`GPT-4` Tutorial: How to Chat With Multiple `PDF` Files (~1000 pages of Tesla's 10-K Annual Reports)](https://youtu.be/Ix9WIZpArm0)
- [`GPT-4` & LangChain Tutorial: How to Chat With A 56-Page `PDF` Document (w/`Pinecone`)](https://youtu.be/ih9PBGVVOO4)
- [LangChain & `Supabase` Tutorial: How to Build a ChatGPT Chatbot For Your Website](https://youtu.be/R2FMzcsmQY8)
- [LangChain Agents: Build Personal Assistants For Your Data (Q&A with Harrison Chase and Mayo Oshin)](https://youtu.be/gVkF8cwfBLI)
### Codebase Analysis
- [Codebase Analysis: Langchain Agents](https://carbonated-yacht-2c5.notion.site/Codebase-Analysis-Langchain-Agents-0b0587acd50647ca88aaae7cff5df1f2)
- [by Nicholas Renotte](https://youtu.be/MlK6SIjcjE8)
- [by Patrick Loeber](https://youtu.be/LbT1yp6quS8)
- [by Rabbitmetrics](https://youtu.be/aywZrzNaKjs)
- [by Ivan Reznikov](https://medium.com/@ivanreznikov/langchain-101-course-updated-668f7b41d6cb)
## [Documentation: Use cases](/docs/use_cases)
---------------------
⛓ icon marks a new addition [last update 2024-02-061]

View File

@@ -7,7 +7,7 @@
### Introduction to LangChain with Harrison Chase, creator of LangChain
- [Building the Future with LLMs, `LangChain`, & `Pinecone`](https://youtu.be/nMniwlGyX-c) by [Pinecone](https://www.youtube.com/@pinecone-io)
- [LangChain and Weaviate with Harrison Chase and Bob van Luijt - Weaviate Podcast #36](https://youtu.be/lhby7Ql7hbk) by [Weaviate • Vector Database](https://www.youtube.com/@Weaviate)
- [LangChain Demo + Q&A with Harrison Chase](https://youtu.be/zaYTXQFR0_s?t=788) by [Full Stack Deep Learning](https://www.youtube.com/@FullStackDeepLearning)
- [LangChain Demo + Q&A with Harrison Chase](https://youtu.be/zaYTXQFR0_s?t=788) by [Full Stack Deep Learning](https://www.youtube.com/@The_Full_Stack)
- [LangChain Agents: Build Personal Assistants For Your Data (Q&A with Harrison Chase and Mayo Oshin)](https://youtu.be/gVkF8cwfBLI) by [Chat with data](https://www.youtube.com/@chatwithdata)
## Videos (sorted by views)
@@ -15,8 +15,8 @@
- [Using `ChatGPT` with YOUR OWN Data. This is magical. (LangChain OpenAI API)](https://youtu.be/9AXP7tCI9PI) by [TechLead](https://www.youtube.com/@TechLead)
- [First look - `ChatGPT` + `WolframAlpha` (`GPT-3.5` and Wolfram|Alpha via LangChain by James Weaver)](https://youtu.be/wYGbY811oMo) by [Dr Alan D. Thompson](https://www.youtube.com/@DrAlanDThompson)
- [LangChain explained - The hottest new Python framework](https://youtu.be/RoR4XJw8wIc) by [AssemblyAI](https://www.youtube.com/@AssemblyAI)
- [Chatbot with INFINITE MEMORY using `OpenAI` & `Pinecone` - `GPT-3`, `Embeddings`, `ADA`, `Vector DB`, `Semantic`](https://youtu.be/2xNzB7xq8nk) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)
- [LangChain for LLMs is... basically just an Ansible playbook](https://youtu.be/X51N9C-OhlE) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)
- [Chatbot with INFINITE MEMORY using `OpenAI` & `Pinecone` - `GPT-3`, `Embeddings`, `ADA`, `Vector DB`, `Semantic`](https://youtu.be/2xNzB7xq8nk) by [David Shapiro ~ AI](https://www.youtube.com/@DaveShap)
- [LangChain for LLMs is... basically just an Ansible playbook](https://youtu.be/X51N9C-OhlE) by [David Shapiro ~ AI](https://www.youtube.com/@DaveShap)
- [Build your own LLM Apps with LangChain & `GPT-Index`](https://youtu.be/-75p09zFUJY) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [`BabyAGI` - New System of Autonomous AI Agents with LangChain](https://youtu.be/lg3kJvf1kXo) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [Run `BabyAGI` with Langchain Agents (with Python Code)](https://youtu.be/WosPGHPObx8) by [1littlecoder](https://www.youtube.com/@1littlecoder)
@@ -37,15 +37,15 @@
- [Building AI LLM Apps with LangChain (and more?) - LIVE STREAM](https://www.youtube.com/live/M-2Cj_2fzWI?feature=share) by [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte)
- [`ChatGPT` with any `YouTube` video using langchain and `chromadb`](https://youtu.be/TQZfB2bzVwU) by [echohive](https://www.youtube.com/@echohive)
- [How to Talk to a `PDF` using LangChain and `ChatGPT`](https://youtu.be/v2i1YDtrIwk) by [Automata Learning Lab](https://www.youtube.com/@automatalearninglab)
- [Langchain Document Loaders Part 1: Unstructured Files](https://youtu.be/O5C0wfsen98) by [Merk](https://www.youtube.com/@merksworld)
- [LangChain - Prompt Templates (what all the best prompt engineers use)](https://youtu.be/1aRu8b0XNOQ) by [Nick Daigler](https://www.youtube.com/@nick_daigs)
- [Langchain Document Loaders Part 1: Unstructured Files](https://youtu.be/O5C0wfsen98) by [Merk](https://www.youtube.com/@heymichaeldaigler)
- [LangChain - Prompt Templates (what all the best prompt engineers use)](https://youtu.be/1aRu8b0XNOQ) by [Nick Daigler](https://www.youtube.com/@nickdaigler)
- [LangChain. Crear aplicaciones Python impulsadas por GPT](https://youtu.be/DkW_rDndts8) by [Jesús Conde](https://www.youtube.com/@0utKast)
- [Easiest Way to Use GPT In Your Products | LangChain Basics Tutorial](https://youtu.be/fLy0VenZyGc) by [Rachel Woods](https://www.youtube.com/@therachelwoods)
- [`BabyAGI` + `GPT-4` Langchain Agent with Internet Access](https://youtu.be/wx1z_hs5P6E) by [tylerwhatsgood](https://www.youtube.com/@tylerwhatsgood)
- [Learning LLM Agents. How does it actually work? LangChain, AutoGPT & OpenAI](https://youtu.be/mb_YAABSplk) by [Arnoldas Kemeklis](https://www.youtube.com/@processusAI)
- [Get Started with LangChain in `Node.js`](https://youtu.be/Wxx1KUWJFv4) by [Developers Digest](https://www.youtube.com/@DevelopersDigest)
- [LangChain + `OpenAI` tutorial: Building a Q&A system w/ own text data](https://youtu.be/DYOU_Z0hAwo) by [Samuel Chan](https://www.youtube.com/@SamuelChan)
- [Langchain + `Zapier` Agent](https://youtu.be/yribLAb-pxA) by [Merk](https://www.youtube.com/@merksworld)
- [Langchain + `Zapier` Agent](https://youtu.be/yribLAb-pxA) by [Merk](https://www.youtube.com/@heymichaeldaigler)
- [Connecting the Internet with `ChatGPT` (LLMs) using Langchain And Answers Your Questions](https://youtu.be/9Y0TBC63yZg) by [Kamalraj M M](https://www.youtube.com/@insightbuilder)
- [Build More Powerful LLM Applications for Businesss with LangChain (Beginners Guide)](https://youtu.be/sp3-WLKEcBg) by[ No Code Blackbox](https://www.youtube.com/@nocodeblackbox)
- [LangFlow LLM Agent Demo for 🦜🔗LangChain](https://youtu.be/zJxDHaWt-6o) by [Cobus Greyling](https://www.youtube.com/@CobusGreylingZA)
@@ -82,7 +82,7 @@
- [Build a LangChain-based Semantic PDF Search App with No-Code Tools Bubble and Flowise](https://youtu.be/s33v5cIeqA4) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)
- [LangChain Memory Tutorial | Building a ChatGPT Clone in Python](https://youtu.be/Cwq91cj2Pnc) by [Alejandro AO - Software & Ai](https://www.youtube.com/@alejandro_ao)
- [ChatGPT For Your DATA | Chat with Multiple Documents Using LangChain](https://youtu.be/TeDgIDqQmzs) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)
- [`Llama Index`: Chat with Documentation using URL Loader](https://youtu.be/XJRoDEctAwA) by [Merk](https://www.youtube.com/@merksworld)
- [`Llama Index`: Chat with Documentation using URL Loader](https://youtu.be/XJRoDEctAwA) by [Merk](https://www.youtube.com/@heymichaeldaigler)
- [Using OpenAI, LangChain, and `Gradio` to Build Custom GenAI Applications](https://youtu.be/1MsmqMg3yUc) by [David Hundley](https://www.youtube.com/@dkhundley)
- [LangChain, Chroma DB, OpenAI Beginner Guide | ChatGPT with your PDF](https://youtu.be/FuqdVNB_8c0)
- [Build AI chatbot with custom knowledge base using OpenAI API and GPT Index](https://youtu.be/vDZAZuaXf48) by [Irina Nik](https://www.youtube.com/@irina_nik)
@@ -93,7 +93,7 @@
- [Build a Custom Chatbot with OpenAI: `GPT-Index` & LangChain | Step-by-Step Tutorial](https://youtu.be/FIDv6nc4CgU) by [Fabrikod](https://www.youtube.com/@fabrikod)
- [`Flowise` is an open-source no-code UI visual tool to build 🦜🔗LangChain applications](https://youtu.be/CovAPtQPU0k) by [Cobus Greyling](https://www.youtube.com/@CobusGreylingZA)
- [LangChain & GPT 4 For Data Analysis: The `Pandas` Dataframe Agent](https://youtu.be/rFQ5Kmkd4jc) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)
- [`GirlfriendGPT` - AI girlfriend with LangChain](https://youtu.be/LiN3D1QZGQw) by [Toolfinder AI](https://www.youtube.com/@toolfinderai)
- [`GirlfriendGPT` - AI girlfriend with LangChain](https://youtu.be/LiN3D1QZGQw) by [Girlfriend GPT](https://www.youtube.com/@girlfriendGPT)
- [How to build with Langchain 10x easier | ⛓️ LangFlow & `Flowise`](https://youtu.be/Ya1oGL7ZTvU) by [AI Jason](https://www.youtube.com/@AIJasonZ)
- [Getting Started With LangChain In 20 Minutes- Build Celebrity Search Application](https://youtu.be/_FpT1cwcSLg) by [Krish Naik](https://www.youtube.com/@krishnaik06)
- ⛓ [Vector Embeddings Tutorial Code Your Own AI Assistant with `GPT-4 API` + LangChain + NLP](https://youtu.be/yfHHvmaMkcA?si=5uJhxoh2tvdnOXok) by [FreeCodeCamp.org](https://www.youtube.com/@freecodecamp)
@@ -109,7 +109,7 @@
- ⛓ [PyData Heidelberg #11 - TimeSeries Forecasting & LLM Langchain](https://www.youtube.com/live/Glbwb5Hxu18?si=PIEY8Raq_C9PCHuW) by [PyData](https://www.youtube.com/@PyDataTV)
- ⛓ [Prompt Engineering in Web Development | Using LangChain and Templates with OpenAI](https://youtu.be/pK6WzlTOlYw?si=fkcDQsBG2h-DM8uQ) by [Akamai Developer
](https://www.youtube.com/@AkamaiDeveloper)
- ⛓ [Retrieval-Augmented Generation (RAG) using LangChain and `Pinecone` - The RAG Special Episode](https://youtu.be/J_tCD_J6w3s?si=60Mnr5VD9UED9bGG) by [Generative AI and Data Science On AWS](https://www.youtube.com/@GenerativeAIDataScienceOnAWS)
- ⛓ [Retrieval-Augmented Generation (RAG) using LangChain and `Pinecone` - The RAG Special Episode](https://youtu.be/J_tCD_J6w3s?si=60Mnr5VD9UED9bGG) by [Generative AI and Data Science On AWS](https://www.youtube.com/@GenerativeAIOnAWS)
- ⛓ [`LLAMA2 70b-chat` Multiple Documents Chatbot with Langchain & Streamlit |All OPEN SOURCE|Replicate API](https://youtu.be/vhghB81vViM?si=dszzJnArMeac7lyc) by [DataInsightEdge](https://www.youtube.com/@DataInsightEdge01)
- ⛓ [Chatting with 44K Fashion Products: LangChain Opportunities and Pitfalls](https://youtu.be/Zudgske0F_s?si=8HSshHoEhh0PemJA) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)
- ⛓ [Structured Data Extraction from `ChatGPT` with LangChain](https://youtu.be/q1lYg8JISpQ?si=0HctzOHYZvq62sve) by [MG](https://www.youtube.com/@MG_cafe)

View File

@@ -14,19 +14,20 @@ For the most part, new integrations should be added to the Community package. Pa
In the following sections, we'll walk through how to contribute to each of these packages from a fake company, `Parrot Link AI`.
## Community Package
## Community package
The `langchain-community` package is in `libs/community` and contains most integrations.
It is installed by users with `pip install langchain-community`, and exported members can be imported with code like
It can be installed with `pip install langchain-community`, and exported members can be imported with code like
```python
from langchain_community.chat_models import ParrotLinkLLM
from langchain_community.llms import ChatParrotLink
from langchain_community.chat_models import ChatParrotLink
from langchain_community.llms import ParrotLinkLLM
from langchain_community.vectorstores import ParrotLinkVectorStore
```
The community package relies on manually-installed dependent packages, so you will see errors if you try to import a package that is not installed. In our fake example, if you tried to import `ParrotLinkLLM` without installing `parrot-link-sdk`, you will see an `ImportError` telling you to install it when trying to use it.
The `community` package relies on manually-installed dependent packages, so you will see errors
if you try to import a package that is not installed. In our fake example, if you tried to import `ParrotLinkLLM` without installing `parrot-link-sdk`, you will see an `ImportError` telling you to install it when trying to use it.
Let's say we wanted to implement a chat model for Parrot Link AI. We would create a new file in `libs/community/langchain_community/chat_models/parrot_link.py` with the following code:
@@ -39,7 +40,7 @@ class ChatParrotLink(BaseChatModel):
Example:
.. code-block:: python
from langchain_parrot_link import ChatParrotLink
from langchain_community.chat_models import ChatParrotLink
model = ChatParrotLink()
"""
@@ -56,9 +57,16 @@ And add documentation to:
- `docs/docs/integrations/chat/parrot_link.ipynb`
## Partner Packages
## Partner package in LangChain repo
Partner packages are in `libs/partners/*` and are installed by users with `pip install langchain-{partner}`, and exported members can be imported with code like
Partner packages can be hosted in the `LangChain` monorepo or in an external repo.
Partner package in the `LangChain` repo is placed in `libs/partners/{partner}`
and the package source code is in `libs/partners/{partner}/langchain_{partner}`.
A package is
installed by users with `pip install langchain-{partner}`, and the package members
can be imported with code like:
```python
from langchain_{partner} import X
@@ -123,13 +131,49 @@ By default, this will include stubs for a Chat Model, an LLM, and/or a Vector St
### Write Unit and Integration Tests
Some basic tests are generated in the tests/ directory. You should add more tests to cover your package's functionality.
Some basic tests are presented in the `tests/` directory. You should add more tests to cover your package's functionality.
For information on running and implementing tests, see the [Testing guide](./testing).
### Write documentation
Documentation is generated from Jupyter notebooks in the `docs/` directory. You should move the generated notebooks to the relevant `docs/docs/integrations` directory in the monorepo root.
Documentation is generated from Jupyter notebooks in the `docs/` directory. You should place the notebooks with examples
to the relevant `docs/docs/integrations` directory in the monorepo root.
### (If Necessary) Deprecate community integration
Note: this is only necessary if you're migrating an existing community integration into
a partner package. If the component you're integrating is net-new to LangChain (i.e.
not already in the `community` package), you can skip this step.
Let's pretend we migrated our `ChatParrotLink` chat model from the community package to
the partner package. We would need to deprecate the old model in the community package.
We would do that by adding a `@deprecated` decorator to the old model as follows, in
`libs/community/langchain_community/chat_models/parrot_link.py`.
Before our change, our chat model might look like this:
```python
class ChatParrotLink(BaseChatModel):
...
```
After our change, it would look like this:
```python
from langchain_core._api.deprecation import deprecated
@deprecated(
since="0.0.<next community version>",
removal="0.2.0",
alternative_import="langchain_parrot_link.ChatParrotLink"
)
class ChatParrotLink(BaseChatModel):
...
```
You should do this for *each* component that you're migrating to the partner package.
### Additional steps
@@ -143,3 +187,15 @@ Maintainer steps (Contributors should **not** do these):
- [ ] set up pypi and test pypi projects
- [ ] add credential secrets to Github Actions
- [ ] add package to conda-forge
## Partner package in external repo
If you are creating a partner package in an external repo, you should follow the same steps as above,
but you will need to set up your own CI/CD and package management.
Name your package as `langchain-{partner}-{integration}`.
Still, you have to create the `libs/partners/{partner}-{integration}` folder in the `LangChain` monorepo
and add a `README.md` file with a link to the external repo.
See this [example](https://github.com/langchain-ai/langchain/tree/master/libs/partners/google-genai).
This allows keeping track of all the partner packages in the `LangChain` documentation.

View File

@@ -20,9 +20,11 @@
]
},
{
"cell_type": "raw",
"cell_type": "code",
"id": "0f316b5c",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain langchain-openai"
]

View File

@@ -220,7 +220,7 @@
"id": "637f994a-5134-402a-bcf0-4de3911eaf49",
"metadata": {},
"source": [
":::tip\n",
":::{.callout-tip}\n",
"\n",
"[LangSmith trace](https://smith.langchain.com/public/60909eae-f4f1-43eb-9f96-354f5176f66f/r)\n",
"\n",
@@ -388,7 +388,7 @@
"id": "5a7e498b-dc68-4267-a35c-90ceffa91c46",
"metadata": {},
"source": [
":::tip\n",
":::{.callout-tip}\n",
"\n",
"[LangSmith trace](https://smith.langchain.com/public/3b27d47f-e4df-4afb-81b1-0f88b80ca97e/r)\n",
"\n",

View File

@@ -20,9 +20,11 @@
]
},
{
"cell_type": "raw",
"cell_type": "code",
"id": "b3121aa8",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain langchain-openai"
]

View File

@@ -31,13 +31,42 @@
]
},
{
"cell_type": "raw",
"cell_type": "code",
"execution_count": null,
"id": "278b0027",
"metadata": {},
"outputs": [],
"source": [
"%pip install --upgrade --quiet langchain-core langchain-community langchain-openai"
]
},
{
"cell_type": "markdown",
"id": "c3d54f72",
"metadata": {},
"source": [
"```{=mdx}\n",
"import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
"\n",
"<ChatModelTabs openaiParams={`model=\"gpt-4\"`} />\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f9eed8e8",
"metadata": {},
"outputs": [],
"source": [
"# | output: false\n",
"# | echo: false\n",
"\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"model = ChatOpenAI(model=\"gpt-4\")"
]
},
{
"cell_type": "code",
"execution_count": 1,
@@ -58,10 +87,8 @@
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"prompt = ChatPromptTemplate.from_template(\"tell me a short joke about {topic}\")\n",
"model = ChatOpenAI(model=\"gpt-4\")\n",
"output_parser = StrOutputParser()\n",
"\n",
"chain = prompt | model | output_parser\n",
@@ -74,15 +101,15 @@
"id": "81c502c5-85ee-4f36-aaf4-d6e350b7792f",
"metadata": {},
"source": [
"Notice this line of this code, where we piece together then different components into a single chain using LCEL:\n",
"Notice this line of the code, where we piece together these different components into a single chain using LCEL:\n",
"\n",
"```\n",
"chain = prompt | model | output_parser\n",
"```\n",
"\n",
"The `|` symbol is similar to a [unix pipe operator](https://en.wikipedia.org/wiki/Pipeline_(Unix)), which chains together the different components feeds the output from one component as input into the next component. \n",
"The `|` symbol is similar to a [unix pipe operator](https://en.wikipedia.org/wiki/Pipeline_(Unix)), which chains together the different components, feeding the output from one component as input into the next component. \n",
"\n",
"In this chain the user input is passed to the prompt template, then the prompt template output is passed to the model, then the model output is passed to the output parser. Let's take a look at each component individually to really understand what's going on. "
"In this chain the user input is passed to the prompt template, then the prompt template output is passed to the model, then the model output is passed to the output parser. Let's take a look at each component individually to really understand what's going on."
]
},
{
@@ -217,7 +244,7 @@
}
],
"source": [
"from langchain_openai.llms import OpenAI\n",
"from langchain_openai import OpenAI\n",
"\n",
"llm = OpenAI(model=\"gpt-3.5-turbo-instruct\")\n",
"llm.invoke(prompt_value)"
@@ -231,7 +258,7 @@
"### 3. Output parser\n",
"\n",
"And lastly we pass our `model` output to the `output_parser`, which is a `BaseOutputParser` meaning it takes either a string or a \n",
"`BaseMessage` as input. The `StrOutputParser` specifically simple converts any input into a string."
"`BaseMessage` as input. The specific `StrOutputParser` simply converts any input into a string."
]
},
{
@@ -291,7 +318,7 @@
"source": [
":::info\n",
"\n",
"Note that if youre curious about the output of any components, you can always test out a smaller version of the chain such as `prompt` or `prompt | model` to see the intermediate results:\n",
"Note that if youre curious about the output of any components, you can always test out a smaller version of the chain such as `prompt` or `prompt | model` to see the intermediate results:\n",
"\n",
":::"
]
@@ -319,7 +346,17 @@
"source": [
"## RAG Search Example\n",
"\n",
"For our next example, we want to run a retrieval-augmented generation chain to add some context when responding to questions. "
"For our next example, we want to run a retrieval-augmented generation chain to add some context when responding to questions."
]
},
{
"cell_type": "markdown",
"id": "b8fe8eb4",
"metadata": {},
"source": [
"```{=mdx}\n",
"<ChatModelTabs />\n",
"```"
]
},
{
@@ -336,8 +373,7 @@
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnableParallel, RunnablePassthrough\n",
"from langchain_openai.chat_models import ChatOpenAI\n",
"from langchain_openai.embeddings import OpenAIEmbeddings\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"vectorstore = DocArrayInMemorySearch.from_texts(\n",
" [\"harrison worked at kensho\", \"bears like to eat honey\"],\n",
@@ -351,7 +387,6 @@
"Question: {question}\n",
"\"\"\"\n",
"prompt = ChatPromptTemplate.from_template(template)\n",
"model = ChatOpenAI()\n",
"output_parser = StrOutputParser()\n",
"\n",
"setup_and_retrieval = RunnableParallel(\n",
@@ -449,7 +484,7 @@
"With the flow being:\n",
"\n",
"1. The first steps create a `RunnableParallel` object with two entries. The first entry, `context` will include the document results fetched by the retriever. The second entry, `question` will contain the users original question. To pass on the question, we use `RunnablePassthrough` to copy this entry. \n",
"2. Feed the dictionary from the step above to the `prompt` component. It then takes the user input which is `question` as well as the retrieved document which is `context` to construct a prompt and output a PromptValue. \n",
"2. Feed the dictionary from the step above to the `prompt` component. It then takes the user input which is `question` as well as the retrieved document which is `context` to construct a prompt and output a PromptValue. \n",
"3. The `model` component takes the generated prompt, and passes into the OpenAI LLM model for evaluation. The generated output from the model is a `ChatMessage` object. \n",
"4. Finally, the `output_parser` component takes in a `ChatMessage`, and transforms this into a Python string, which is returned from the invoke method.\n",
"\n",
@@ -494,7 +529,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.4"
"version": "3.11.0"
}
},
"nbformat": 4,

View File

@@ -557,7 +557,7 @@
"metadata": {},
"outputs": [],
"source": [
"openai_poem = chain.with_config(configurable={\"llm\": \"openai\"})"
"openai_joke = chain.with_config(configurable={\"llm\": \"openai\"})"
]
},
{
@@ -578,7 +578,7 @@
}
],
"source": [
"openai_poem.invoke({\"topic\": \"bears\"})"
"openai_joke.invoke({\"topic\": \"bears\"})"
]
},
{

View File

@@ -552,7 +552,7 @@
"id": "da3d1feb-b4bb-4624-961c-7db2e1180df7",
"metadata": {},
"source": [
":::tip\n",
":::{.callout-tip}\n",
"\n",
"[Langsmith trace](https://smith.langchain.com/public/bd73e122-6ec1-48b2-82df-e6483dc9cb63/r)\n",
"\n",

View File

@@ -1,7 +1,7 @@
{
"cells": [
{
"cell_type": "markdown",
"cell_type": "raw",
"id": "9e45e81c-e16e-4c6c-b6a3-2362e5193827",
"metadata": {},
"source": [
@@ -25,53 +25,42 @@
"\n",
"There are two ways to perform routing:\n",
"\n",
"1. Using a `RunnableBranch`.\n",
"2. Writing custom factory function that takes the input of a previous step and returns a **runnable**. Importantly, this should return a **runnable** and NOT actually execute.\n",
"1. Conditionally return runnables from a [`RunnableLambda`](./functions) (recommended)\n",
"2. Using a `RunnableBranch`.\n",
"\n",
"We'll illustrate both methods using a two step sequence where the first step classifies an input question as being about `LangChain`, `Anthropic`, or `Other`, then routes to a corresponding prompt chain."
]
},
{
"cell_type": "markdown",
"id": "f885113d",
"metadata": {},
"source": [
"## Using a RunnableBranch\n",
"\n",
"A `RunnableBranch` is initialized with a list of (condition, runnable) pairs and a default runnable. It selects which branch by passing each condition the input it's invoked with. It selects the first condition to evaluate to True, and runs the corresponding runnable to that condition with the input. \n",
"\n",
"If no provided conditions match, it runs the default runnable.\n",
"\n",
"Here's an example of what it looks like in action:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "1aa13c1d",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain_community.chat_models import ChatAnthropic\n",
"from langchain_core.output_parsers import StrOutputParser"
]
},
{
"cell_type": "markdown",
"id": "ed84c59a",
"id": "c1c6edac",
"metadata": {},
"source": [
"## Example Setup\n",
"First, let's create a chain that will identify incoming questions as being about `LangChain`, `Anthropic`, or `Other`:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "3ec03886",
"execution_count": null,
"id": "8a8a1967",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"' Anthropic'"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from langchain_community.chat_models import ChatAnthropic\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"chain = (\n",
" PromptTemplate.from_template(\n",
" \"\"\"Given the user question below, classify it as either being about `LangChain`, `Anthropic`, or `Other`.\n",
@@ -86,33 +75,14 @@
" )\n",
" | ChatAnthropic()\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "87ae7c1c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"' Anthropic'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
")\n",
"\n",
"chain.invoke({\"question\": \"how do I call Anthropic?\"})"
]
},
{
"cell_type": "markdown",
"id": "8aa0a365",
"id": "7655555f",
"metadata": {},
"source": [
"Now, let's create three sub chains:"
@@ -120,8 +90,8 @@
},
{
"cell_type": "code",
"execution_count": 4,
"id": "d479962a",
"execution_count": null,
"id": "89d7722d",
"metadata": {},
"outputs": [],
"source": [
@@ -158,101 +128,12 @@
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "593eab06",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.runnables import RunnableBranch\n",
"\n",
"branch = RunnableBranch(\n",
" (lambda x: \"anthropic\" in x[\"topic\"].lower(), anthropic_chain),\n",
" (lambda x: \"langchain\" in x[\"topic\"].lower(), langchain_chain),\n",
" general_chain,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "752c732e",
"metadata": {},
"outputs": [],
"source": [
"full_chain = {\"topic\": chain, \"question\": lambda x: x[\"question\"]} | branch"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "29231bb8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\" As Dario Amodei told me, here are some ways to use Anthropic:\\n\\n- Sign up for an account on Anthropic's website to access tools like Claude, Constitutional AI, and Writer. \\n\\n- Use Claude for tasks like email generation, customer service chat, and QA. Claude can understand natural language prompts and provide helpful responses.\\n\\n- Use Constitutional AI if you need an AI assistant that is harmless, honest, and helpful. It is designed to be safe and aligned with human values.\\n\\n- Use Writer to generate natural language content for things like marketing copy, stories, reports, and more. Give it a topic and prompt and it will create high-quality written content.\\n\\n- Check out Anthropic's documentation and blog for tips, tutorials, examples, and announcements about new capabilities as they continue to develop their AI technology.\\n\\n- Follow Anthropic on social media or subscribe to their newsletter to stay up to date on new features and releases.\\n\\n- For most people, the easiest way to leverage Anthropic's technology is through their website - just create an account to get started!\", additional_kwargs={}, example=False)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"full_chain.invoke({\"question\": \"how do I use Anthropic?\"})"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "c67d8733",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' As Harrison Chase told me, here is how you use LangChain:\\n\\nLangChain is an AI assistant that can have conversations, answer questions, and generate text. To use LangChain, you simply type or speak your input and LangChain will respond. \\n\\nYou can ask LangChain questions, have discussions, get summaries or explanations about topics, and request it to generate text on a subject. Some examples of interactions:\\n\\n- Ask general knowledge questions and LangChain will try to answer factually. For example \"What is the capital of France?\"\\n\\n- Have conversations on topics by taking turns speaking. You can prompt the start of a conversation by saying something like \"Let\\'s discuss machine learning\"\\n\\n- Ask for summaries or high-level explanations on subjects. For example \"Can you summarize the main themes in Shakespeare\\'s Hamlet?\" \\n\\n- Give creative writing prompts or requests to have LangChain generate text in different styles. For example \"Write a short children\\'s story about a mouse\" or \"Generate a poem in the style of Robert Frost about nature\"\\n\\n- Correct LangChain if it makes an inaccurate statement and provide the right information. This helps train it.\\n\\nThe key is interacting naturally and giving it clear prompts and requests', additional_kwargs={}, example=False)"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"full_chain.invoke({\"question\": \"how do I use LangChain?\"})"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "935ad949",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' 2 + 2 = 4', additional_kwargs={}, example=False)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"full_chain.invoke({\"question\": \"whats 2 + 2\"})"
]
},
{
"cell_type": "markdown",
"id": "6d8d042c",
"metadata": {},
"source": [
"## Using a custom function\n",
"## Using a custom function (Recommended)\n",
"\n",
"You can also use a custom function to route between different outputs. Here's an example:"
]
@@ -350,13 +231,89 @@
"full_chain.invoke({\"question\": \"whats 2 + 2\"})"
]
},
{
"cell_type": "markdown",
"id": "5147b827",
"metadata": {},
"source": [
"## Using a RunnableBranch\n",
"\n",
"A `RunnableBranch` is a special type of runnable that allows you to define a set of conditions and runnables to execute based on the input. It does **not** offer anything that you can't achieve in a custom function as described above, so we recommend using a custom function instead.\n",
"\n",
"A `RunnableBranch` is initialized with a list of (condition, runnable) pairs and a default runnable. It selects which branch by passing each condition the input it's invoked with. It selects the first condition to evaluate to True, and runs the corresponding runnable to that condition with the input. \n",
"\n",
"If no provided conditions match, it runs the default runnable.\n",
"\n",
"Here's an example of what it looks like in action:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "46802d04",
"id": "2a101418",
"metadata": {},
"outputs": [],
"source": []
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\" As Dario Amodei told me, here are some ways to use Anthropic:\\n\\n- Sign up for an account on Anthropic's website to access tools like Claude, Constitutional AI, and Writer. \\n\\n- Use Claude for tasks like email generation, customer service chat, and QA. Claude can understand natural language prompts and provide helpful responses.\\n\\n- Use Constitutional AI if you need an AI assistant that is harmless, honest, and helpful. It is designed to be safe and aligned with human values.\\n\\n- Use Writer to generate natural language content for things like marketing copy, stories, reports, and more. Give it a topic and prompt and it will create high-quality written content.\\n\\n- Check out Anthropic's documentation and blog for tips, tutorials, examples, and announcements about new capabilities as they continue to develop their AI technology.\\n\\n- Follow Anthropic on social media or subscribe to their newsletter to stay up to date on new features and releases.\\n\\n- For most people, the easiest way to leverage Anthropic's technology is through their website - just create an account to get started!\", additional_kwargs={}, example=False)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from langchain_core.runnables import RunnableBranch\n",
"\n",
"branch = RunnableBranch(\n",
" (lambda x: \"anthropic\" in x[\"topic\"].lower(), anthropic_chain),\n",
" (lambda x: \"langchain\" in x[\"topic\"].lower(), langchain_chain),\n",
" general_chain,\n",
")\n",
"full_chain = {\"topic\": chain, \"question\": lambda x: x[\"question\"]} | branch\n",
"full_chain.invoke({\"question\": \"how do I use Anthropic?\"})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8d8caf9b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' As Harrison Chase told me, here is how you use LangChain:\\n\\nLangChain is an AI assistant that can have conversations, answer questions, and generate text. To use LangChain, you simply type or speak your input and LangChain will respond. \\n\\nYou can ask LangChain questions, have discussions, get summaries or explanations about topics, and request it to generate text on a subject. Some examples of interactions:\\n\\n- Ask general knowledge questions and LangChain will try to answer factually. For example \"What is the capital of France?\"\\n\\n- Have conversations on topics by taking turns speaking. You can prompt the start of a conversation by saying something like \"Let\\'s discuss machine learning\"\\n\\n- Ask for summaries or high-level explanations on subjects. For example \"Can you summarize the main themes in Shakespeare\\'s Hamlet?\" \\n\\n- Give creative writing prompts or requests to have LangChain generate text in different styles. For example \"Write a short children\\'s story about a mouse\" or \"Generate a poem in the style of Robert Frost about nature\"\\n\\n- Correct LangChain if it makes an inaccurate statement and provide the right information. This helps train it.\\n\\nThe key is interacting naturally and giving it clear prompts and requests', additional_kwargs={}, example=False)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"full_chain.invoke({\"question\": \"how do I use LangChain?\"})"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "26159af7",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' 2 + 2 = 4', additional_kwargs={}, example=False)"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"full_chain.invoke({\"question\": \"whats 2 + 2\"})"
]
}
],
"metadata": {

View File

@@ -12,7 +12,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "bb7d49db-04d3-4399-bfe1-09f82bbe6015",
"metadata": {},
@@ -28,7 +27,7 @@
"1. sync `stream` and async `astream`: a **default implementation** of streaming that streams the **final output** from the chain.\n",
"2. async `astream_events` and async `astream_log`: these provide a way to stream both **intermediate steps** and **final output** from the chain.\n",
"\n",
"Let's take a look at both approaches, and try to understand a how to use them. 🥷\n",
"Let's take a look at both approaches, and try to understand how to use them. 🥷\n",
"\n",
"## Using Stream\n",
"\n",
@@ -48,7 +47,25 @@
"\n",
"Large language models can take **several seconds** to generate a complete response to a query. This is far slower than the **~200-300 ms** threshold at which an application feels responsive to an end user.\n",
"\n",
"The key strategy to make the application feel more responsive is to show intermediate progress; e.g., to stream the output from the model **token by token**."
"The key strategy to make the application feel more responsive is to show intermediate progress; viz., to stream the output from the model **token by token**."
]
},
{
"cell_type": "markdown",
"id": "9eb73e8b",
"metadata": {},
"source": [
"We will show examples of streaming using the chat model from [Anthropic](/docs/integrations/platforms/anthropic). To use the model, you will need to install the `langchain-anthropic` package. You can do this with the following command:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cd351cf4",
"metadata": {},
"outputs": [],
"source": [
"pip install -qU langchain-anthropic"
]
},
{
@@ -68,7 +85,7 @@
"source": [
"# Showing the example using anthropic, but you can use\n",
"# your favorite chat model!\n",
"from langchain_community.chat_models import ChatAnthropic\n",
"from langchain_anthropic import ChatAnthropic\n",
"\n",
"model = ChatAnthropic()\n",
"\n",
@@ -152,7 +169,7 @@
"We will use `StrOutputParser` to parse the output from the model. This is a simple parser that extracts the `content` field from an `AIMessageChunk`, giving us the `token` returned by the model.\n",
"\n",
":::{.callout-tip}\n",
"LCEL is a *declarative* way to specify a \"program\" by chainining together different LangChain primitives. Chains created using LCEL benefit from an automatic implementation of `stream`, and `astream` allowing streaming of the final output. In fact, chains created with LCEL implement the entire standard Runnable interface.\n",
"LCEL is a *declarative* way to specify a \"program\" by chainining together different LangChain primitives. Chains created using LCEL benefit from an automatic implementation of `stream` and `astream` allowing streaming of the final output. In fact, chains created with LCEL implement the entire standard Runnable interface.\n",
":::"
]
},
@@ -330,7 +347,6 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "cab6dca2-2027-414d-a196-2db6e3ebb8a5",
"metadata": {},
@@ -464,12 +480,12 @@
"id": "6fd3e71b-439e-418f-8a8a-5232fba3d9fd",
"metadata": {},
"source": [
"Stream just yielded the final result from that component. \n",
"Stream just yielded the final result from that component.\n",
"\n",
"This is OK 🥹! Not all components have to implement streaming -- in some cases streaming is either unnecessary, difficult or just doesn't make sense.\n",
"\n",
":::{.callout-tip}\n",
"An LCEL chain constructed using using non-streaming components, will still be able to stream in a lot of cases, with streaming of partial output starting after the last non-streaming step in the chain.\n",
"An LCEL chain constructed using non-streaming components, will still be able to stream in a lot of cases, with streaming of partial output starting after the last non-streaming step in the chain.\n",
":::"
]
},
@@ -642,7 +658,7 @@
"\n",
"This is a **beta API**, and we're almost certainly going to make some changes to it.\n",
"\n",
"This version parameter will allow us to mimimize such breaking changes to your code. \n",
"This version parameter will allow us to minimize such breaking changes to your code. \n",
"\n",
"In short, we are annoying you now, so we don't have to annoy you later.\n",
":::"
@@ -1397,7 +1413,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.9.6"
}
},
"nbformat": 4,

File diff suppressed because it is too large Load Diff

View File

@@ -14,7 +14,16 @@ This framework consists of several parts.
- **[LangServe](/docs/langserve)**: A library for deploying LangChain chains as a REST API.
- **[LangSmith](/docs/langsmith)**: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.
![Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.](/svg/langchain_stack.svg "LangChain Framework Overview")
import ThemedImage from '@theme/ThemedImage';
<ThemedImage
alt="Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers."
sources={{
light: '/svg/langchain_stack.svg',
dark: '/svg/langchain_stack_dark.svg',
}}
title="LangChain Framework Overview"
/>
Together, these products simplify the entire application lifecycle:
- **Develop**: Write your applications in LangChain/LangChain.js. Hit the ground running using Templates for reference.

View File

@@ -14,7 +14,7 @@ That's a fair amount to cover! Let's dive in.
### Jupyter Notebook
This guide (and most of the other guides in the documentation) use [Jupyter notebooks](https://jupyter.org/) and assume the reader is as well. Jupyter notebooks are perfect for learning how to work with LLM systems because often times things can go wrong (unexpected output, API down, etc) and going through guides in an interactive environment is a great way to better understand them.
This guide (and most of the other guides in the documentation) uses [Jupyter notebooks](https://jupyter.org/) and assumes the reader is as well. Jupyter notebooks are perfect for learning how to work with LLM systems because oftentimes things can go wrong (unexpected output, API down, etc) and going through guides in an interactive environment is a great way to better understand them.
You do not NEED to go through the guide in a Jupyter Notebook, but it is recommended. See [here](https://jupyter.org/install) for instructions on how to install.
@@ -65,7 +65,7 @@ We will link to relevant docs.
## LLM Chain
For this getting started guide, we will provide two options: using OpenAI (a popular model available via API) or using a local open source model.
We'll show how to use models available via API, like OpenAI, and local open source models, using integrations like Ollama.
<Tabs>
<TabItem value="openai" label="OpenAI" default>
@@ -99,7 +99,7 @@ llm = ChatOpenAI(openai_api_key="...")
```
</TabItem>
<TabItem value="local" label="Local">
<TabItem value="local" label="Local (using Ollama)">
[Ollama](https://ollama.ai/) allows you to run open-source large language models, such as Llama 2, locally.
@@ -112,6 +112,66 @@ Then, make sure the Ollama server is running. After that, you can do:
```python
from langchain_community.llms import Ollama
llm = Ollama(model="llama2")
```
</TabItem>
<TabItem value="anthropic" label="Anthropic">
First we'll need to import the LangChain x Anthropic package.
```shell
pip install langchain-anthropic
```
Accessing the API requires an API key, which you can get by creating an account [here](https://claude.ai/login). Once we have a key we'll want to set it as an environment variable by running:
```shell
export ANTHROPIC_API_KEY="..."
```
We can then initialize the model:
```python
from langchain_anthropic import ChatAnthropic
llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0.2, max_tokens=1024)
```
If you'd prefer not to set an environment variable you can pass the key in directly via the `anthropic_api_key` named parameter when initiating the Anthropic Chat Model class:
```python
llm = ChatAnthropic(anthropic_api_key="...")
```
</TabItem>
<TabItem value="cohere" label="Cohere">
First we'll need to import the Cohere SDK package.
```shell
pip install cohere
```
Accessing the API requires an API key, which you can get by creating an account and heading [here](https://dashboard.cohere.com/api-keys). Once we have a key we'll want to set it as an environment variable by running:
```shell
export COHERE_API_KEY="..."
```
We can then initialize the model:
```python
from langchain_community.chat_models import ChatCohere
llm = ChatCohere()
```
If you'd prefer not to set an environment variable you can pass the key in directly via the `cohere_api_key` named parameter when initiating the Cohere LLM class:
```python
from langchain_community.chat_models import ChatCohere
llm = ChatCohere(cohere_api_key="...")
```
</TabItem>
@@ -124,8 +184,8 @@ Let's ask it what LangSmith is - this is something that wasn't present in the tr
llm.invoke("how can langsmith help with testing?")
```
We can also guide it's response with a prompt template.
Prompt templates are used to convert raw user input to a better input to the LLM.
We can also guide its response with a prompt template.
Prompt templates convert raw user input to better input to the LLM.
```python
from langchain_core.prompts import ChatPromptTemplate
@@ -174,7 +234,7 @@ We've now successfully set up a basic LLM chain. We only touched on the basics o
## Retrieval Chain
In order to properly answer the original question ("how can langsmith help with testing?"), we need to provide additional context to the LLM.
To properly answer the original question ("how can langsmith help with testing?"), we need to provide additional context to the LLM.
We can do this via *retrieval*.
Retrieval is useful when you have **too much data** to pass to the LLM directly.
You can then use a retriever to fetch only the most relevant pieces and pass those in.
@@ -182,7 +242,7 @@ You can then use a retriever to fetch only the most relevant pieces and pass tho
In this process, we will look up relevant documents from a *Retriever* and then pass them into the prompt.
A Retriever can be backed by anything - a SQL table, the internet, etc - but in this instance we will populate a vector store and use that as a retriever. For more information on vectorstores, see [this documentation](/docs/modules/data_connection/vectorstores).
First, we need to load the data that we want to index. In order to do this, we will use the WebBaseLoader. This requires installing [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/):
First, we need to load the data that we want to index. To do this, we will use the WebBaseLoader. This requires installing [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/):
```shell
pip install beautifulsoup4
@@ -193,17 +253,17 @@ After that, we can import and use WebBaseLoader.
```python
from langchain_community.document_loaders import WebBaseLoader
loader = WebBaseLoader("https://docs.smith.langchain.com")
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")
docs = loader.load()
```
Next, we need to index it into a vectorstore. This requires a few components, namely an [embedding model](/docs/modules/data_connection/text_embedding) and a [vectorstore](/docs/modules/data_connection/vectorstores).
For embedding models, we once again provide examples for accessing via OpenAI or via local models.
For embedding models, we once again provide examples for accessing via API or by running local models.
<Tabs>
<TabItem value="openai" label="OpenAI" default>
<TabItem value="openai" label="OpenAI (API)" default>
Make sure you have the `langchain_openai` package installed an the appropriate environment variables set (these are the same as needed for the LLM).
@@ -214,7 +274,7 @@ embeddings = OpenAIEmbeddings()
```
</TabItem>
<TabItem value="local" label="Local">
<TabItem value="local" label="Local (using Ollama)">
Make sure you have Ollama running (same set up as with the LLM).
@@ -224,6 +284,17 @@ from langchain_community.embeddings import OllamaEmbeddings
embeddings = OllamaEmbeddings()
```
</TabItem>
<TabItem value="cohere" label="Cohere (API)" default>
Make sure you have the `cohere` package installed and the appropriate environment variables set (these are the same as needed for the LLM).
```python
from langchain_community.embeddings import CohereEmbeddings
embeddings = CohereEmbeddings()
```
</TabItem>
</Tabs>
Now, we can use this embedding model to ingest documents into a vectorstore.
@@ -239,7 +310,7 @@ Then we can build our index:
```python
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_text_splitters import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter()
@@ -278,7 +349,7 @@ document_chain.invoke({
```
However, we want the documents to first come from the retriever we just set up.
That way, for a given question we can use the retriever to dynamically select the most relevant documents and pass those in.
That way, we can use the retriever to dynamically select the most relevant documents and pass those in for a given question.
```python
from langchain.chains import create_retrieval_chain
@@ -324,12 +395,12 @@ from langchain_core.prompts import MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages([
MessagesPlaceholder(variable_name="chat_history"),
("user", "{input}"),
("user", "Given the above conversation, generate a search query to look up in order to get information relevant to the conversation")
("user", "Given the above conversation, generate a search query to look up to get information relevant to the conversation")
])
retriever_chain = create_history_aware_retriever(llm, retriever, prompt)
```
We can test this out by passing in an instance where the user is asking a follow up question.
We can test this out by passing in an instance where the user asks a follow-up question.
```python
from langchain_core.messages import HumanMessage, AIMessage
@@ -340,7 +411,7 @@ retriever_chain.invoke({
"input": "Tell me how"
})
```
You should see that this returns documents about testing in LangSmith. This is because the LLM generated a new query, combining the chat history with the follow up question.
You should see that this returns documents about testing in LangSmith. This is because the LLM generated a new query, combining the chat history with the follow-up question.
Now that we have this new retriever, we can create a new chain to continue the conversation with these retrieved documents in mind.
@@ -368,7 +439,7 @@ We can see that this gives a coherent answer - we've successfully turned our ret
## Agent
We've so far create examples of chains - where each step is known ahead of time.
We've so far created examples of chains - where each step is known ahead of time.
The final thing we will create is an agent - where the LLM decides what steps to take.
**NOTE: for this example we will only show how to create an agent using OpenAI models, as local models are not reliable enough yet.**
@@ -377,7 +448,7 @@ One of the first things to do when building an agent is to decide what tools it
For this example, we will give the agent access to two tools:
1. The retriever we just created. This will let it easily answer questions about LangSmith
2. A search tool. This will let it easily answer questions that require up to date information.
2. A search tool. This will let it easily answer questions that require up-to-date information.
First, let's set up a tool for the retriever we just created:
@@ -417,6 +488,11 @@ Install langchain hub first
```bash
pip install langchainhub
```
Install the langchain-openai package
To interact with OpenAI we need to use langchain-openai which connects with OpenAI SDK[https://github.com/langchain-ai/langchain/tree/master/libs/partners/openai].
```bash
pip install langchain-openai
```
Now we can use it to get a predefined prompt
@@ -428,6 +504,8 @@ from langchain.agents import AgentExecutor
# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
# You need to set OPENAI_API_KEY environment variable or pass it as argument `openai_api_key`.
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
@@ -489,10 +567,9 @@ from langchain_openai import ChatOpenAI
from langchain_community.document_loaders import WebBaseLoader
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.tools.retriever import create_retriever_tool
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain.agents import AgentExecutor
@@ -501,7 +578,7 @@ from langchain_core.messages import BaseMessage
from langserve import add_routes
# 1. Load Retriever
loader = WebBaseLoader("https://docs.smith.langchain.com/overview")
loader = WebBaseLoader("https://docs.smith.langchain.com/user_guide")
docs = loader.load()
text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)

View File

@@ -17,7 +17,7 @@ Here's a summary of the key methods and properties of a comparison evaluator:
- `requires_reference`: This property specifies whether this evaluator requires a reference label.
:::note LangSmith Support
The [run_on_dataset](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.smith) evaluation method is designed to evaluate only a single model at a time, and thus, doesn't support these evaluators.
The [run_on_dataset](https://api.python.langchain.com/en/latest/langchain_api_reference.html#module-langchain.smith) evaluation method is designed to evaluate only a single model at a time, and thus, doesn't support these evaluators.
:::
Detailed information about creating custom evaluators and the available built-in comparison evaluators is provided in the following sections.

View File

@@ -294,7 +294,7 @@
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"prompt_template = PromptTemplate.from_template(\n",
" \"\"\"Given the input context, which do you prefer: A or B?\n",

View File

@@ -23,7 +23,7 @@ We also are working to share guides and cookbooks that demonstrate how to use th
## LangSmith Evaluation
LangSmith provides an integrated evaluation and tracing framework that allows you to check for regressions, compare systems, and easily identify and fix any sources of errors and performance issues. Check out the docs on [LangSmith Evaluation](https://docs.smith.langchain.com/category/testing--evaluation) and additional [cookbooks](https://docs.smith.langchain.com/category/langsmith-cookbook) for more detailed information on evaluating your applications.
LangSmith provides an integrated evaluation and tracing framework that allows you to check for regressions, compare systems, and easily identify and fix any sources of errors and performance issues. Check out the docs on [LangSmith Evaluation](https://docs.smith.langchain.com/evaluation) and additional [cookbooks](https://docs.smith.langchain.com/cookbook) for more detailed information on evaluating your applications.
## LangChain benchmarks
@@ -37,6 +37,6 @@ Check out the docs for examples and leaderboard information.
## Reference Docs
For detailed information on the available evaluators, including how to instantiate, configure, and customize them, check out the [reference documentation](https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.evaluation) directly.
For detailed information on the available evaluators, including how to instantiate, configure, and customize them, check out the [reference documentation](https://api.python.langchain.com/en/latest/langchain_api_reference.html#module-langchain.evaluation) directly.
<DocCardList />

View File

@@ -380,7 +380,7 @@
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"fstring = \"\"\"Respond Y or N based on how well the following response follows the specified rubric. Grade only based on the rubric and expected response:\n",
"\n",

View File

@@ -7,7 +7,7 @@
"source": [
"# JSON Evaluators\n",
"\n",
"Evaluating [extraction](https://python.langchain.com/docs/use_cases/extraction) and function calling applications often comes down to validation that the LLM's string output can be parsed correctly and how it compares to a reference object. The following `JSON` validators provide functionality to check your model's output consistently.\n",
"Evaluating [extraction](/docs/use_cases/extraction) and function calling applications often comes down to validation that the LLM's string output can be parsed correctly and how it compares to a reference object. The following `JSON` validators provide functionality to check your model's output consistently.\n",
"\n",
"## JsonValidityEvaluator\n",
"\n",

View File

@@ -0,0 +1,13 @@
---
hide_table_of_contents: true
---
# Extending LangChain
Extending LangChain's base abstractions, whether you're planning to contribute back to the open-source repo or build a bespoke internal integration, is encouraged.
Check out these guides for building your own custom classes for the following modules:
- [Chat models](/docs/modules/model_io/chat/custom_chat_model) for interfacing with chat-tuned language models.
- [LLMs](/docs/modules/model_io/llms/custom_llm) for interfacing with text language models.
- [Output parsers](/docs/modules/model_io/output_parsers/custom) for handling language model outputs.

View File

@@ -216,7 +216,7 @@
"outputs": [],
"source": [
"# Now lets create a chain with the normal OpenAI model\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_openai import OpenAI\n",
"\n",
"prompt_template = \"\"\"Instructions: You should always include a compliment in your response.\n",

View File

@@ -9,7 +9,7 @@
"\n",
"## Use case\n",
"\n",
"The popularity of projects like [PrivateGPT](https://github.com/imartinez/privateGPT), [llama.cpp](https://github.com/ggerganov/llama.cpp), and [GPT4All](https://github.com/nomic-ai/gpt4all) underscore the demand to run LLMs locally (on your own device).\n",
"The popularity of projects like [PrivateGPT](https://github.com/imartinez/privateGPT), [llama.cpp](https://github.com/ggerganov/llama.cpp), [GPT4All](https://github.com/nomic-ai/gpt4all), and [llamafile](https://github.com/Mozilla-Ocho/llamafile) underscore the demand to run LLMs locally (on your own device).\n",
"\n",
"This has at least two important benefits:\n",
"\n",
@@ -46,7 +46,8 @@
"\n",
"1. [`llama.cpp`](https://github.com/ggerganov/llama.cpp): C++ implementation of llama inference code with [weight optimization / quantization](https://finbarr.ca/how-is-llama-cpp-possible/)\n",
"2. [`gpt4all`](https://docs.gpt4all.io/index.html): Optimized C backend for inference\n",
"3. [`Ollama`](https://ollama.ai/): Bundles model weights and environment into an app that runs on device and serves the LLM \n",
"3. [`Ollama`](https://ollama.ai/): Bundles model weights and environment into an app that runs on device and serves the LLM\n",
"4. [`llamafile`](https://github.com/Mozilla-Ocho/llamafile): Bundles model weights and everything needed to run the model in a single file, allowing you to run the LLM locally from this file without any additional installation steps\n",
"\n",
"In general, these frameworks will do a few things:\n",
"\n",
@@ -97,7 +98,7 @@
"from langchain_community.llms import Ollama\n",
"\n",
"llm = Ollama(model=\"llama2\")\n",
"llm(\"The first man on the moon was ...\")"
"llm.invoke(\"The first man on the moon was ...\")"
]
},
{
@@ -139,7 +140,7 @@
"llm = Ollama(\n",
" model=\"llama2\", callback_manager=CallbackManager([StreamingStdOutCallbackHandler()])\n",
")\n",
"llm(\"The first man on the moon was ...\")"
"llm.invoke(\"The first man on the moon was ...\")"
]
},
{
@@ -157,7 +158,7 @@
"\n",
"### Running Apple silicon GPU\n",
"\n",
"`Ollama` will automatically utilize the GPU on Apple devices.\n",
"`Ollama` and [`llamafile`](https://github.com/Mozilla-Ocho/llamafile?tab=readme-ov-file#gpu-support) will automatically utilize the GPU on Apple devices.\n",
" \n",
"Other frameworks require the user to set up the environment to utilize the Apple GPU.\n",
"\n",
@@ -191,7 +192,7 @@
"\n",
"There are various ways to gain access to quantized model weights.\n",
"\n",
"1. [`HuggingFace`](https://huggingface.co/TheBloke) - Many quantized model are available for download and can be run with framework such as [`llama.cpp`](https://github.com/ggerganov/llama.cpp)\n",
"1. [`HuggingFace`](https://huggingface.co/TheBloke) - Many quantized model are available for download and can be run with framework such as [`llama.cpp`](https://github.com/ggerganov/llama.cpp). You can also download models in [`llamafile` format](https://huggingface.co/models?other=llamafile) from HuggingFace.\n",
"2. [`gpt4all`](https://gpt4all.io/index.html) - The model explorer offers a leaderboard of metrics and associated quantized models available for download \n",
"3. [`Ollama`](https://github.com/jmorganca/ollama) - Several models can be accessed directly via `pull`\n",
"\n",
@@ -225,7 +226,7 @@
"from langchain_community.llms import Ollama\n",
"\n",
"llm = Ollama(model=\"llama2:13b\")\n",
"llm(\"The first man on the moon was ... think step by step\")"
"llm.invoke(\"The first man on the moon was ... think step by step\")"
]
},
{
@@ -368,7 +369,7 @@
}
],
"source": [
"llm(\"The first man on the moon was ... Let's think step by step\")"
"llm.invoke(\"The first man on the moon was ... Let's think step by step\")"
]
},
{
@@ -425,7 +426,63 @@
}
],
"source": [
"llm(\"The first man on the moon was ... Let's think step by step\")"
"llm.invoke(\"The first man on the moon was ... Let's think step by step\")"
]
},
{
"cell_type": "markdown",
"id": "056854e2-5e4b-4a03-be7e-03192e5c4e1e",
"metadata": {},
"source": [
"### llamafile\n",
"\n",
"One of the simplest ways to run an LLM locally is using a [llamafile](https://github.com/Mozilla-Ocho/llamafile). All you need to do is:\n",
"\n",
"1) Download a llamafile from [HuggingFace](https://huggingface.co/models?other=llamafile)\n",
"2) Make the file executable\n",
"3) Run the file\n",
"\n",
"llamafiles bundle model weights and a [specially-compiled](https://github.com/Mozilla-Ocho/llamafile?tab=readme-ov-file#technical-details) version of [`llama.cpp`](https://github.com/ggerganov/llama.cpp) into a single file that can run on most computers any additional dependencies. They also come with an embedded inference server that provides an [API](https://github.com/Mozilla-Ocho/llamafile/blob/main/llama.cpp/server/README.md#api-endpoints) for interacting with your model. \n",
"\n",
"Here's a simple bash script that shows all 3 setup steps:\n",
"\n",
"```bash\n",
"# Download a llamafile from HuggingFace\n",
"wget https://huggingface.co/jartine/TinyLlama-1.1B-Chat-v1.0-GGUF/resolve/main/TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile\n",
"\n",
"# Make the file executable. On Windows, instead just rename the file to end in \".exe\".\n",
"chmod +x TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile\n",
"\n",
"# Start the model server. Listens at http://localhost:8080 by default.\n",
"./TinyLlama-1.1B-Chat-v1.0.Q5_K_M.llamafile --server --nobrowser\n",
"```\n",
"\n",
"After you run the above setup steps, you can use LangChain to interact with your model:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "002e655c-ba18-4db3-ac7b-f33e825d14b6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"\\nFirstly, let's imagine the scene where Neil Armstrong stepped onto the moon. This happened in 1969. The first man on the moon was Neil Armstrong. We already know that.\\n2nd, let's take a step back. Neil Armstrong didn't have any special powers. He had to land his spacecraft safely on the moon without injuring anyone or causing any damage. If he failed to do this, he would have been killed along with all those people who were on board the spacecraft.\\n3rd, let's imagine that Neil Armstrong successfully landed his spacecraft on the moon and made it back to Earth safely. The next step was for him to be hailed as a hero by his people back home. It took years before Neil Armstrong became an American hero.\\n4th, let's take another step back. Let's imagine that Neil Armstrong wasn't hailed as a hero, and instead, he was just forgotten. This happened in the 1970s. Neil Armstrong wasn't recognized for his remarkable achievement on the moon until after he died.\\n5th, let's take another step back. Let's imagine that Neil Armstrong didn't die in the 1970s and instead, lived to be a hundred years old. This happened in 2036. In the year 2036, Neil Armstrong would have been a centenarian.\\nNow, let's think about the present. Neil Armstrong is still alive. He turned 95 years old on July 20th, 2018. If he were to die now, his achievement of becoming the first human being to set foot on the moon would remain an unforgettable moment in history.\\nI hope this helps you understand the significance and importance of Neil Armstrong's achievement on the moon!\""
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain_community.llms.llamafile import Llamafile\n",
"\n",
"llm = Llamafile()\n",
"\n",
"llm.invoke(\"The first man on the moon was ... Let's think step by step.\")"
]
},
{
@@ -489,7 +546,7 @@
"source": [
"from langchain.chains import LLMChain\n",
"from langchain.chains.prompt_selector import ConditionalPromptSelector\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"DEFAULT_LLAMA_SEARCH_PROMPT = PromptTemplate(\n",
" input_variables=[\"question\"],\n",
@@ -611,7 +668,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.7"
}
},
"nbformat": 4,

View File

@@ -30,8 +30,8 @@
"outputs": [],
"source": [
"from langchain.model_laboratory import ModelLaboratory\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain_community.llms import Cohere, HuggingFaceHub\n",
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_openai import OpenAI"
]
},
@@ -167,7 +167,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.chains import SelfAskWithSearchChain\n",
"from langchain.agents.self_ask_with_search.base import SelfAskWithSearchChain\n",
"from langchain_community.utilities import SerpAPIWrapper\n",
"\n",
"open_ai_llm = OpenAI(temperature=0)\n",

View File

@@ -24,7 +24,7 @@
"<img src=\"/img/qa_privacy_protection.png\" width=\"900\"/>\n",
"\n",
"\n",
"In the following notebook, we will not go into the details of how the anonymizer works. If you are interested, please visit [this part of the documentation](https://python.langchain.com/docs/guides/privacy/presidio_data_anonymization/).\n",
"In the following notebook, we will not go into the details of how the anonymizer works. If you are interested, please visit [this part of the documentation](/docs/guides/privacy/presidio_data_anonymization/).\n",
"\n",
"## Quickstart\n",
"\n",
@@ -643,9 +643,9 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.vectorstores import FAISS\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"\n",
"# 2. Load the data: In our case data's already loaded\n",
"# 3. Anonymize the data before indexing\n",

View File

@@ -105,8 +105,8 @@
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain_community.llms.fake import FakeListLLM\n",
"from langchain_core.prompts import PromptTemplate\n",
"from langchain_experimental.comprehend_moderation.base_moderation_exceptions import (\n",
" ModerationPiiError,\n",
")\n",
@@ -242,8 +242,8 @@
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain_community.llms.fake import FakeListLLM\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
@@ -405,8 +405,8 @@
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain_community.llms.fake import FakeListLLM\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"template = \"\"\"Question: {question}\n",
"\n",
@@ -566,8 +566,8 @@
},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"from langchain_community.llms import HuggingFaceHub\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"template = \"\"\"{question}\"\"\"\n",
"\n",
@@ -696,9 +696,9 @@
"source": [
"import json\n",
"\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain_community.llms import SagemakerEndpoint\n",
"from langchain_community.llms.sagemaker_endpoint import LLMContentHandler\n",
"from langchain_core.prompts import PromptTemplate\n",
"\n",
"\n",
"class ContentHandler(LLMContentHandler):\n",

View File

@@ -13,7 +13,7 @@ content that may violate guidelines, be offensive, or deviate from the desired c
```python
# Imports
from langchain_openai import OpenAI
from langchain.prompts import PromptTemplate
from langchain_core.prompts import PromptTemplate
from langchain.chains.llm import LLMChain
from langchain.chains.constitutional_ai.base import ConstitutionalChain
```
@@ -88,11 +88,6 @@ constitutional_chain.run(question="How can I steal kittens?")
## Unified Objective
We also have built-in support for the Unified Objectives proposed in this paper: [examine.dev/docs/Unified_objectives.pdf](https://examine.dev/docs/Unified_objectives.pdf)
Some of these are useful for the same idea of correcting ethical issues.
```python
principles = ConstitutionalChain.get_principles(["uo-ethics-1"])
constitutional_chain = ConstitutionalChain.from_llm(

View File

@@ -22,7 +22,7 @@ Therefore, it is crucial that model developers proactively address logical falla
```python
# Imports
from langchain_openai import OpenAI
from langchain.prompts import PromptTemplate
from langchain_core.prompts import PromptTemplate
from langchain.chains.llm import LLMChain
from langchain_experimental.fallacy_removal.base import FallacyChain
```

View File

@@ -24,7 +24,7 @@ We'll show:
```python
from langchain_openai import OpenAI
from langchain.chains import OpenAIModerationChain, SequentialChain, LLMChain, SimpleSequentialChain
from langchain.prompts import PromptTemplate
from langchain_core.prompts import PromptTemplate
```
## How to use the moderation chain

View File

@@ -0,0 +1,391 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "6e3f0f72",
"metadata": {},
"source": [
"# [beta] Structured Output\n",
"\n",
"It is often crucial to have LLMs return structured output. This is because oftentimes the outputs of the LLMs are used in downstream applications, where specific arguments are required. Having the LLM return structured output reliably is necessary for that.\n",
"\n",
"There are a few different high level strategies that are used to do this:\n",
"\n",
"- Prompting: This is when you ask the LLM (very nicely) to return output in the desired format (JSON, XML). This is nice because it works with all LLMs. It is not nice because there is no guarantee that the LLM returns the output in the right format.\n",
"- Function calling: This is when the LLM is fine-tuned to be able to not just generate a completion, but also generate a function call. The functions the LLM can call are generally passed as extra parameters to the model API. The function names and descriptions should be treated as part of the prompt (they usually count against token counts, and are used by the LLM to decide what to do).\n",
"- Tool calling: A technique similar to function calling, but it allows the LLM to call multiple functions at the same time.\n",
"- JSON mode: This is when the LLM is guaranteed to return JSON.\n",
"\n",
"\n",
"\n",
"Different models may support different variants of these, with slightly different parameters. In order to make it easy to get LLMs to return structured output, we have added a common interface to LangChain models: `.with_structured_output`. \n",
"\n",
"By invoking this method (and passing in a JSON schema or a Pydantic model) the model will add whatever model parameters + output parsers are necessary to get back the structured output. There may be more than one way to do this (e.g., function calling vs JSON mode) - you can configure which method to use by passing into that method.\n",
"\n",
"Let's look at some examples of this in action!\n",
"\n",
"We will use Pydantic to easily structure the response schema."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "08029f4e",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.pydantic_v1 import BaseModel, Field"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "070bf702",
"metadata": {},
"outputs": [],
"source": [
"class Joke(BaseModel):\n",
" setup: str = Field(description=\"The setup of the joke\")\n",
" punchline: str = Field(description=\"The punchline to the joke\")"
]
},
{
"cell_type": "markdown",
"id": "98f6edfa",
"metadata": {},
"source": [
"## OpenAI\n",
"\n",
"OpenAI exposes a few different ways to get structured outputs."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "3fe7caf0",
"metadata": {},
"outputs": [],
"source": [
"from langchain_openai import ChatOpenAI"
]
},
{
"cell_type": "markdown",
"id": "deddb6d3",
"metadata": {},
"source": [
"### Function Calling\n",
"\n",
"By default, we will use `function_calling`"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "6700994a",
"metadata": {},
"outputs": [],
"source": [
"model = ChatOpenAI()\n",
"model_with_structure = model.with_structured_output(Joke)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "c55a61b8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Joke(setup='Why was the cat sitting on the computer?', punchline='It wanted to keep an eye on the mouse!')"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model_with_structure.invoke(\"Tell me a joke about cats\")"
]
},
{
"cell_type": "markdown",
"id": "39d7a555",
"metadata": {},
"source": [
"### JSON Mode\n",
"\n",
"We also support JSON mode. Note that we need to specify in the prompt the format that it should respond in."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "df0370e3",
"metadata": {},
"outputs": [],
"source": [
"model_with_structure = model.with_structured_output(Joke, method=\"json_mode\")"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "23844a26",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Joke(setup=\"Why don't cats play poker in the jungle?\", punchline='Too many cheetahs!')"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model_with_structure.invoke(\n",
" \"Tell me a joke about cats, respond in JSON with `setup` and `punchline` keys\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "8f3cce9e",
"metadata": {},
"source": [
"## Fireworks\n",
"\n",
"[Fireworks](https://fireworks.ai/) similarly supports function calling and JSON mode for select models."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ad45fdd8",
"metadata": {},
"outputs": [],
"source": [
"from langchain_fireworks import ChatFireworks"
]
},
{
"cell_type": "markdown",
"id": "36270ed5",
"metadata": {},
"source": [
"### Function Calling\n",
"\n",
"By default, we will use `function_calling`"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "49a20847",
"metadata": {},
"outputs": [],
"source": [
"model = ChatFireworks(model=\"accounts/fireworks/models/firefunction-v1\")\n",
"model_with_structure = model.with_structured_output(Joke)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "e3093a6c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Joke(setup=\"Why don't cats play poker in the jungle?\", punchline='Too many cheetahs!')"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model_with_structure.invoke(\"Tell me a joke about cats\")"
]
},
{
"cell_type": "markdown",
"id": "ddb6b3ba",
"metadata": {},
"source": [
"### JSON Mode\n",
"\n",
"We also support JSON mode. Note that we need to specify in the prompt the format that it should respond in."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "ea0c22c1",
"metadata": {},
"outputs": [],
"source": [
"model_with_structure = model.with_structured_output(Joke, method=\"json_mode\")"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "649f9632",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Joke(setup='Why did the dog sit in the shade?', punchline='To avoid getting burned.')"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model_with_structure.invoke(\n",
" \"Tell me a joke about dogs, respond in JSON with `setup` and `punchline` keys\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "ff70609a",
"metadata": {},
"source": [
"## Mistral\n",
"\n",
"We also support structured output with Mistral models, although we only support function calling."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "bffd3fad",
"metadata": {},
"outputs": [],
"source": [
"from langchain_mistralai import ChatMistralAI"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "c8bd7549",
"metadata": {},
"outputs": [],
"source": [
"model = ChatMistralAI(model=\"mistral-large-latest\")\n",
"model_with_structure = model.with_structured_output(Joke)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "17b15816",
"metadata": {},
"outputs": [],
"source": [
"model_with_structure.invoke(\"Tell me a joke about cats\")"
]
},
{
"cell_type": "markdown",
"id": "6bbbb698",
"metadata": {},
"source": [
"## Together\n",
"\n",
"Since [TogetherAI](https://www.together.ai/) is just a drop in replacement for OpenAI, we can just use the OpenAI integration"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "9b9617e3",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain_openai import ChatOpenAI"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "90549664",
"metadata": {},
"outputs": [],
"source": [
"model = ChatOpenAI(\n",
" base_url=\"https://api.together.xyz/v1\",\n",
" api_key=os.environ[\"TOGETHER_API_KEY\"],\n",
" model=\"mistralai/Mixtral-8x7B-Instruct-v0.1\",\n",
")\n",
"model_with_structure = model.with_structured_output(Joke)"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "01da39be",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Joke(setup='Why did the cat sit on the computer?', punchline='To keep an eye on the mouse!')"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model_with_structure.invoke(\"Tell me a joke about cats\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3066b2af",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -16,13 +16,13 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"id": "6017f26a",
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"from langchain.adapters import openai as lc_openai"
"from langchain_community.adapters import openai as lc_openai"
]
},
{
@@ -277,7 +277,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -22,7 +22,7 @@
"outputs": [],
"source": [
"import openai\n",
"from langchain.adapters import openai as lc_openai"
"from langchain_community.adapters import openai as lc_openai"
]
},
{
@@ -310,7 +310,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -177,7 +177,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.callbacks import ArgillaCallbackHandler\n",
"from langchain_community.callbacks.argilla_callback import ArgillaCallbackHandler\n",
"\n",
"argilla_callback = ArgillaCallbackHandler(\n",
" dataset_name=\"langchain-dataset\",\n",
@@ -213,7 +213,7 @@
}
],
"source": [
"from langchain.callbacks import ArgillaCallbackHandler, StdOutCallbackHandler\n",
"from langchain_core.callbacks.stdout import StdOutCallbackHandler\n",
"from langchain_openai import OpenAI\n",
"\n",
"argilla_callback = ArgillaCallbackHandler(\n",
@@ -277,9 +277,9 @@
}
],
"source": [
"from langchain.callbacks import ArgillaCallbackHandler, StdOutCallbackHandler\n",
"from langchain.chains import LLMChain\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain_core.callbacks.stdout import StdOutCallbackHandler\n",
"from langchain_openai import OpenAI\n",
"\n",
"argilla_callback = ArgillaCallbackHandler(\n",
@@ -361,7 +361,7 @@
],
"source": [
"from langchain.agents import AgentType, initialize_agent, load_tools\n",
"from langchain.callbacks import ArgillaCallbackHandler, StdOutCallbackHandler\n",
"from langchain_core.callbacks.stdout import StdOutCallbackHandler\n",
"from langchain_openai import OpenAI\n",
"\n",
"argilla_callback = ArgillaCallbackHandler(\n",

View File

@@ -97,7 +97,7 @@
"if \"LANGCHAIN_COMET_TRACING\" in os.environ:\n",
" del os.environ[\"LANGCHAIN_COMET_TRACING\"]\n",
"\n",
"from langchain.callbacks.tracers.comet import CometTracer\n",
"from langchain_community.callbacks.tracers.comet import CometTracer\n",
"\n",
"tracer = CometTracer()\n",
"\n",
@@ -130,7 +130,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -118,7 +118,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.callbacks.confident_callback import DeepEvalCallbackHandler\n",
"from langchain_community.callbacks.confident_callback import DeepEvalCallbackHandler\n",
"\n",
"deepeval_callback = DeepEvalCallbackHandler(\n",
" implementation_name=\"langchainQuickstart\", metrics=[answer_relevancy_metric]\n",
@@ -215,10 +215,10 @@
"source": [
"import requests\n",
"from langchain.chains import RetrievalQA\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain_community.document_loaders import TextLoader\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAI, OpenAIEmbeddings\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",
"text_file_url = \"https://raw.githubusercontent.com/hwchase17/chat-your-data/master/state_of_the_union.txt\"\n",
"\n",
@@ -296,7 +296,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.10.12"
},
"vscode": {
"interpreter": {

View File

@@ -65,6 +65,23 @@
"Ensure you have installed the `context-python` package before using the handler."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"execution": {
"iopub.execute_input": "2024-03-06T19:05:26.534124Z",
"iopub.status.busy": "2024-03-06T19:05:26.533924Z",
"iopub.status.idle": "2024-03-06T19:05:26.798727Z",
"shell.execute_reply": "2024-03-06T19:05:26.798135Z",
"shell.execute_reply.started": "2024-03-06T19:05:26.534109Z"
}
},
"outputs": [],
"source": [
"from langchain_community.callbacks.context_callback import ContextCallbackHandler"
]
},
{
"cell_type": "code",
"execution_count": 3,
@@ -73,8 +90,6 @@
"source": [
"import os\n",
"\n",
"from langchain.callbacks import ContextCallbackHandler\n",
"\n",
"token = os.environ[\"CONTEXT_API_TOKEN\"]\n",
"\n",
"context_callback = ContextCallbackHandler(token)"
@@ -99,7 +114,6 @@
"source": [
"import os\n",
"\n",
"from langchain.callbacks import ContextCallbackHandler\n",
"from langchain.schema import (\n",
" HumanMessage,\n",
" SystemMessage,\n",
@@ -155,7 +169,6 @@
"source": [
"import os\n",
"\n",
"from langchain.callbacks import ContextCallbackHandler\n",
"from langchain.chains import LLMChain\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.prompts.chat import (\n",

View File

@@ -0,0 +1,215 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "0cebf93b",
"metadata": {},
"source": [
"# Fiddler\n",
"\n",
">[Fiddler](https://www.fiddler.ai/) is the pioneer in enterprise Generative and Predictive system ops, offering a unified platform that enables Data Science, MLOps, Risk, Compliance, Analytics, and other LOB teams to monitor, explain, analyze, and improve ML deployments at enterprise scale. "
]
},
{
"cell_type": "markdown",
"id": "38d746c2",
"metadata": {},
"source": [
"## 1. Installation and Setup"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e0151955",
"metadata": {},
"outputs": [],
"source": [
"#!pip install langchain langchain-community langchain-openai fiddler-client"
]
},
{
"cell_type": "markdown",
"id": "5662f2e5-d510-4eef-b44b-fa929e5b4ad4",
"metadata": {},
"source": [
"## 2. Fiddler connection details "
]
},
{
"cell_type": "markdown",
"id": "64fac323",
"metadata": {},
"source": [
"*Before you can add information about your model with Fiddler*\n",
"\n",
"1. The URL you're using to connect to Fiddler\n",
"2. Your organization ID\n",
"3. Your authorization token\n",
"\n",
"These can be found by navigating to the *Settings* page of your Fiddler environment."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f6f8b73e-d350-40f0-b7a4-fb1e68a65a22",
"metadata": {},
"outputs": [],
"source": [
"URL = \"\" # Your Fiddler instance URL, Make sure to include the full URL (including https://). For example: https://demo.fiddler.ai\n",
"ORG_NAME = \"\"\n",
"AUTH_TOKEN = \"\" # Your Fiddler instance auth token\n",
"\n",
"# Fiddler project and model names, used for model registration\n",
"PROJECT_NAME = \"\"\n",
"MODEL_NAME = \"\" # Model name in Fiddler"
]
},
{
"cell_type": "markdown",
"id": "0645805a",
"metadata": {},
"source": [
"## 3. Create a fiddler callback handler instance"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "13de4f9a",
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.callbacks.fiddler_callback import FiddlerCallbackHandler\n",
"\n",
"fiddler_handler = FiddlerCallbackHandler(\n",
" url=URL,\n",
" org=ORG_NAME,\n",
" project=PROJECT_NAME,\n",
" model=MODEL_NAME,\n",
" api_key=AUTH_TOKEN,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "2276368e-f1dc-46be-afe3-18796e7a66f2",
"metadata": {},
"source": [
"## Example 1 : Basic Chain"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c9de0fd1",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_openai import OpenAI\n",
"\n",
"# Note : Make sure openai API key is set in the environment variable OPENAI_API_KEY\n",
"llm = OpenAI(temperature=0, streaming=True, callbacks=[fiddler_handler])\n",
"output_parser = StrOutputParser()\n",
"\n",
"chain = llm | output_parser\n",
"\n",
"# Invoke the chain. Invocation will be logged to Fiddler, and metrics automatically generated\n",
"chain.invoke(\"How far is moon from earth?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "309bde0b-e1ce-446c-98ac-3690c26a2676",
"metadata": {},
"outputs": [],
"source": [
"# Few more invocations\n",
"chain.invoke(\"What is the temperature on Mars?\")\n",
"chain.invoke(\"How much is 2 + 200000?\")\n",
"chain.invoke(\"Which movie won the oscars this year?\")\n",
"chain.invoke(\"Can you write me a poem about insomnia?\")\n",
"chain.invoke(\"How are you doing today?\")\n",
"chain.invoke(\"What is the meaning of life?\")"
]
},
{
"cell_type": "markdown",
"id": "48fa4782-c867-4510-9430-4ffa3de3b5eb",
"metadata": {},
"source": [
"## Example 2 : Chain with prompt templates"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2aa2c220-8946-4844-8d3c-8f69d744d13f",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import (\n",
" ChatPromptTemplate,\n",
" FewShotChatMessagePromptTemplate,\n",
")\n",
"\n",
"examples = [\n",
" {\"input\": \"2+2\", \"output\": \"4\"},\n",
" {\"input\": \"2+3\", \"output\": \"5\"},\n",
"]\n",
"\n",
"example_prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"human\", \"{input}\"),\n",
" (\"ai\", \"{output}\"),\n",
" ]\n",
")\n",
"\n",
"few_shot_prompt = FewShotChatMessagePromptTemplate(\n",
" example_prompt=example_prompt,\n",
" examples=examples,\n",
")\n",
"\n",
"final_prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", \"You are a wondrous wizard of math.\"),\n",
" few_shot_prompt,\n",
" (\"human\", \"{input}\"),\n",
" ]\n",
")\n",
"\n",
"# Note : Make sure openai API key is set in the environment variable OPENAI_API_KEY\n",
"llm = OpenAI(temperature=0, streaming=True, callbacks=[fiddler_handler])\n",
"\n",
"chain = final_prompt | llm\n",
"\n",
"# Invoke the chain. Invocation will be logged to Fiddler, and metrics automatically generated\n",
"chain.invoke({\"input\": \"What's the square of a triangle?\"})"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

Some files were not shown because too many files have changed in this diff Show More