```python
import json
import re
from pathlib import Path


def parse_markdown_to_sidebar(markdown_content):
    lines = markdown_content.splitlines()
    sidebar = []
    current_category = None
    current_subcategory = None
    for line in lines:
        if line.startswith('### '):
            # Subcategory
            if current_subcategory is not None:
                current_category['items'].append(current_subcategory)
            subcategory_title = line.strip('# ').strip()
            current_subcategory = {
                "type": "category",
                "label": subcategory_title,
                "collapsed": True,
                "items": [],
                "link": {"type": "generated-index"}
            }
        elif line.startswith('## '):
            # Category
            if current_category is not None:
                if current_subcategory is not None:
                    current_category['items'].append(current_subcategory)
                    current_subcategory = None
                sidebar.append(current_category)
            category_title = line.strip('# ').strip()
            current_category = {
                "type": "category",
                "label": category_title,
                "collapsed": True,
                "items": [],
                "link": {"type": "generated-index"}
            }
        elif line.startswith('- ['):
            # Link
            match = re.match(r'- \[(.*?)\]\((.*?)\)', line)
            if match:
                title, link = match.groups()
                link = link.replace('/docs/', '')  # Remove '/docs/' prefix
                if current_subcategory is not None:
                    current_subcategory['items'].append(link)
                elif current_category is not None:
                    current_category['items'].append(link)
    # Add the last category and subcategory if they exist
    if current_subcategory is not None:
        current_category['items'].append(current_subcategory)
    if current_category is not None:
        sidebar.append(current_category)
    return sidebar


def generate_sidebar_json(file_path):
    with open(file_path, 'r') as md_file:
        markdown_content = md_file.read()
    sidebar = parse_markdown_to_sidebar(markdown_content)
    sidebar_json = json.dumps({"items": sidebar}, indent=2)
    return sidebar_json
```
Proposing to centralize code for handling dynamic imports. This allows treating langchain-community as an optional dependency.
---
The proposal is to scan the code base and to replace all existing imports with dynamic imports using this functionality.
Fixed a bug where the model name was never actually put into the GigaChat
request payload, so requests always defaulted to `GigaChat-Lite`.
With this fix, model selection through
```python
import os
from langchain.chat_models.gigachat import GigaChat

chat = GigaChat(
    name="GigaChat-Pro",  # <- HERE!!!!!
    ...
)
```
should actually work, as intended
[here](804390ba4b/libs/community/langchain_community/llms/gigachat.py (L36)).
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
**Description**: ToolKit and Tools for accessing data in a Cassandra
Database primarily for Agent integration. Initially, this includes the
following tools:
- `cassandra_db_schema` Gathers all schema information for the connected
database or a specific schema. Critical for the agent when determining
actions.
- `cassandra_db_select_table_data` Selects data from a specific keyspace
and table. The agent can pass parameters for a predicate and limits on
the number of returned records.
- `cassandra_db_query` Experimental alternative to
`cassandra_db_select_table_data` which takes a query string completely
formed by the agent instead of parameters. May be removed in future
versions.
Includes unit test and two notebooks to demonstrate usage.
**Dependencies**: cassio
**Twitter handle**: @PatrickMcFadin
---------
Co-authored-by: Phil Miesle <phil.miesle@datastax.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** This pull request introduces a new feature to community
tools, enhancing its search capabilities by integrating the Mojeek
search engine.
**Dependencies:** None
---------
Co-authored-by: Igor Brai <igor@mojeek.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Removed redundant self/cls from required args of class functions in
_get_python_function_required_args:
```python
class MemberTool:
    def search_member(
        self,
        keyword: str,
        *args,
        **kwargs,
    ):
        """Search on members with any keyword like first_name, last_name, username, email

        Args:
            keyword: Any keyword of member
        """
        headers = dict(authorization=kwargs['token'])
        members = []
        try:
            members = request_(
                method='SEARCH',
                url=f'{service_url}/apiv1/members',
                headers=headers,
                json=dict(query=keyword),
            )
        except Exception as e:
            logger.info(e.__doc__)
        return members


convert_to_openai_tool(MemberTool.search_member)
```
expected result:
```
{'type': 'function', 'function': {'name': 'search_member', 'description': 'Search on members with any keyword like first_name, last_name, username, email', 'parameters': {'type': 'object', 'properties': {'keyword': {'type': 'string', 'description': 'Any keyword of member'}}, 'required': ['keyword']}}}
```
#20685
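A minimal sketch of the idea, assuming the required arguments are derived from `inspect.signature`. `required_args` is a hypothetical stand-in and not the body of `_get_python_function_required_args`.

```python
import inspect
from typing import Callable, List


def required_args(func: Callable) -> List[str]:
    """List parameters without defaults, skipping self/cls and *args/**kwargs."""
    params = list(inspect.signature(func).parameters.values())
    # A method looked up on the class is a plain function, so `self` is still
    # the first parameter and has to be dropped explicitly.
    if params and params[0].name in ("self", "cls"):
        params = params[1:]
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    return [
        p.name
        for p in params
        if p.default is inspect.Parameter.empty and p.kind not in variadic
    ]


class MemberTool:
    def search_member(self, keyword: str, *args, **kwargs):
        return []


required = required_args(MemberTool.search_member)
```

Here `required` contains only `keyword`, which matches the `required` list in the expected result above.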
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Thank you for contributing to LangChain!
- [ ] **PR title**: "docs: switched GCSLoaders docs to
langchain-google-community"
- **Description:** switched GCSLoaders docs to
langchain-google-community
Issue: When a third-party package is not installed, we usually raise an
`ImportError` with a `pip install <package>` hint.
But in some places a `ValueError` or `ModuleNotFoundError` is raised
instead, which is inconsistent.
Change: replaced `ValueError` and `ModuleNotFoundError` with
`ImportError` wherever we raise an error with the `pip install <package>`
message.
Note: Ideally, we would replace every `try: import... except... raise ...` with
helper functions like `import_aim` or just use the existing
[langchain_core.utils.utils.guard_import](https://api.python.langchain.com/en/latest/utils/langchain_core.utils.utils.guard_import.html#langchain_core.utils.utils.guard_import).
But that would be a much bigger refactoring. @baskaryan Please advise on
this.
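A minimal sketch of the pattern, assuming a small helper that always raises `ImportError` with the install hint. `import_or_raise` is a hypothetical name and the message wording is illustrative; it is not `guard_import` itself.

```python
import importlib
from types import ModuleType
from typing import Optional


def import_or_raise(module_name: str, pip_name: Optional[str] = None) -> ModuleType:
    """Import `module_name`, always raising ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:  # also catches ModuleNotFoundError, its subclass
        package = pip_name or module_name.split(".")[0].replace("_", "-")
        raise ImportError(
            f"Could not import {module_name}. "
            f"Please install it with `pip install {package}`."
        ) from exc


# A standard-library module stands in for an installed third-party package.
json_module = import_or_raise("json")
```

Callers then only ever need to handle `ImportError`, whichever package is missing.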
Implemented `bind_tools` for OllamaFunctions.
Made OllamaFunctions a subclass of ChatOllama.
Implemented `with_structured_output` for OllamaFunctions.
The integration unit test has been updated.
The notebook has been updated.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
I can't seem to reproduce it, but I got this:
```
SystemError: AST constructor recursion depth mismatch (before=102, after=37)
```
The operation isn't critical for the actual forward pass, so it seems
preferable to expand our caught exceptions.
**Description**: This update enhances the `extract_sub_links` function
within the `langchain_core/utils/html.py` module to include query
parameters in the extracted URLs.
**Issue**: N/A
**Dependencies**: No additional dependencies required for this change.
**Twitter handle**: N/A
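A minimal sketch of the behaviour, assuming links are resolved with `urllib.parse` and the query string is kept when the URL is rebuilt. `sub_links_with_query` is a hypothetical helper, not the `extract_sub_links` implementation, and dropping fragments is an assumption made here.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class _HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.hrefs.append(value)


def sub_links_with_query(html: str, base_url: str) -> List[str]:
    """Resolve hrefs against base_url, keeping the query string."""
    collector = _HrefCollector()
    collector.feed(html)
    links: List[str] = []
    for href in collector.hrefs:
        parsed = urlparse(urljoin(base_url, href))
        url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
        if parsed.query:
            url += f"?{parsed.query}"  # keep query parameters
        if url not in links:
            links.append(url)
    return links


links = sub_links_with_query(
    '<a href="/docs?page=2#top">next</a> <a href="/about">about</a>',
    "https://example.com/",
)
```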
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Just a simple PR to fix a broken link. Apparently having backticks
outside a link makes it render as code.
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
This introduces `store_kwargs` which behaves similarly to `graph_kwargs`
on the `RdfGraph` object, which will enable users to pass `headers` and
other arguments to the underlying `SPARQLStore` object. I have also made
a [PR in `rdflib` to support passing
`default_graph`](https://github.com/RDFLib/rdflib/pull/2761).
Example usage:
```python
from langchain_community.graphs import RdfGraph

graph = RdfGraph(
    query_endpoint="http://localhost/sparql",
    standard="rdf",
    store_kwargs=dict(
        default_graph="http://example.com/mygraph"
    )
)
```
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
MindsDB integrates with LangChain, enabling users to deploy, serve, and
fine-tune models available via LangChain within MindsDB, making them
accessible to numerous data sources.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description: The PebbloSafeLoader should first check for owner,
full_path and size in metadata before implementing its own logic.
Dependencies: None
Documentation: NA.
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Issue: #20514
The current implementation of `construct_instance` expects a `texts:
List[str]` argument that is passed to the embedding function. This might not be
needed when we already have a client with a collection and `path` and
don't want to add any text.
This PR adds a class method that returns a Qdrant instance with an
existing client.
Here, every time `construct_instance` is called, this line sends some text
for embedding generation:
cb6e5e56c2/libs/community/langchain_community/vectorstores/qdrant.py (L1592)
---------
Co-authored-by: Anush <anushshetty90@gmail.com>
* Groundedness Check takes `str` or `list[Document]` as input.
* Deprecate `GroundednessCheck` due to its naming.
* Added `UpstageGroundednessCheck`.
* Hotfix for Groundedness Check parameter.
The name `query` was misleading and it should be `answer` instead.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
This auto generates partner migrations.
At the moment the migration is from community -> partner.
So one would need to run the migration script twice to go from langchain to partner.
Add script to help generate migrations.
This works well for partner packages. Migrations are generated based on run time rather than static analysis (much simpler to get the correct migrations implemented).
The script for generating migrations from langchain to community still needs work.
`langchain_pinecone.Pinecone` is deprecated in favor of
`PineconeVectorStore`, and is currently a subclass of
`PineconeVectorStore`.
```python
@deprecated(since="0.0.3", removal="0.2.0", alternative="PineconeVectorStore")
class Pinecone(PineconeVectorStore):
    """Deprecated. Use PineconeVectorStore instead."""

    pass
```
**Description:** AzureSearch vector store has no tests. This PR adds
initial tests to validate the code can be imported and used.
**Issue:** N/A
**Dependencies:** azure-search-documents and azure-identity are added as
optional dependencies for testing
---------
Co-authored-by: Matt Gotteiner <[email protected]>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description**:
_PebbloSafeLoader_: Add support for pebblo server and client version
**Documentation:** NA
**Unit test:** NA
**Issue:** NA
**Dependencies:** None
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- [ ] **Kinetica Document Loader**: "community: a class to load
Documents from Kinetica"
- [ ] **Kinetica Document Loader**:
- **Description:** implemented KineticaLoader in `kinetica_loader.py`
- **Dependencies:** install the Kinetica API using
`pip install gpudb==7.2.0.1`
**Description:** Fixes a bug in the HuggingGPT task execution logic
here:
    except Exception as e:
        self.status = "failed"
        self.message = str(e)
    self.status = "completed"
    self.save_product()
where a caught exception effectively just sets `self.message` and can
then throw an exception if, e.g., `self.product` is not defined.
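A minimal sketch of the intended control flow, assuming the failure branch should return before the status is overwritten. The `Task` class below is a hypothetical stand-in; only the status handling mirrors the snippet above.

```python
class Task:
    """Hypothetical stand-in for the task class; only status handling matters."""

    def __init__(self, action):
        self.action = action
        self.status = "pending"
        self.message = ""
        self.product = None

    def run(self) -> str:
        try:
            self.product = self.action()
        except Exception as e:
            self.status = "failed"
            self.message = str(e)
            return self.status  # stop here so the failure is not overwritten
        # Only reached when the action succeeded.
        self.status = "completed"
        return self.status


def _boom():
    raise ValueError("boom")


ok_task = Task(lambda: 42)
bad_task = Task(_boom)
ok_status = ok_task.run()
bad_status = bad_task.run()
```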
**Issue:** None that I'm aware of.
**Dependencies:** None
**Twitter handle:** https://twitter.com/michaeljschock
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Changes
`langchain_core.output_parsers.CommaSeparatedListOutputParser` to handle
`,` as a delimiter alongside the previous implementation, which used `, `
as the delimiter.
- **Issue:** Started noticing that some results returned by LLMs were
not getting parsed correctly when the output contained `,` instead of
`, `.
- **Dependencies:** No
- **Twitter handle:** not active on twitter.
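A minimal sketch of the parsing rule, assuming a plain split on `,` followed by whitespace trimming. `parse_comma_separated` is a hypothetical helper and may differ from how the parser is actually implemented.

```python
from typing import List


def parse_comma_separated(text: str) -> List[str]:
    """Split on commas and trim whitespace, so `a,b` and `a, b` parse the same."""
    return [part.strip() for part in text.strip().split(",")]


with_space = parse_comma_separated("foo, bar, baz")
without_space = parse_comma_separated("foo,bar,baz")
```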
- **Description**:
- **add support for more data types**: by default `IpexLLM` will load
the model in int4 format. This PR adds support for more data types such as
`sym_int5`, `sym_int8`, etc. Data formats like NF3, NF4, FP4 and FP8 are
only supported on GPU and will be added in a future PR.
- Fix a small issue in saving/loading, update api docs
- **Dependencies**: `ipex-llm` library
- **Document**: In `docs/docs/integrations/llms/ipex_llm.ipynb`, added
instructions for saving/loading low-bit model.
- **Tests**: added new test cases to
`libs/community/tests/integration_tests/llms/test_ipex_llm.py`, added
config params.
- **Contribution maintainer**: @shane-huang
Description: Add support for Semantic topics and entities.
Classification done by pebblo-server is now used to enhance metadata of
Documents loaded by document loaders.
Dependencies: None
Documentation: Updated.
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
- **Description:** Deprecate the `persist` method in Chroma, as it no
longer exists in Chroma 0.4.x
- **Issue:** #20851
- **Dependencies:** None
- **Twitter handle:** AndresAlgaba1
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
**Description:**
This PR removes an unnecessary code snippet from the documentation. The
snippet in question is not relevant to the content and does not
contribute to the overall understanding of the topic. It contained
redundant imports and unused code, potentially causing confusion for
readers.
**Issue:**
There is no specific issue number associated with this change.
**Dependencies:**
No additional dependencies are required for this change.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
The RecursiveUrlLoader loader offers a link_regex parameter that can
filter out URLs. However, this filtering capability is limited, and if
the internal links of the website change, unexpected resources may be
loaded. These resources, such as font files, can cause problems in
subsequent embedding processing.
>
https://blog.langchain.dev/assets/fonts/source-sans-pro-v21-latin-ext_latin-regular.woff2?v=0312715cbf
We can add the Content-Type in the HTTP response headers to the document
metadata so developers can choose which resources to use. This allows
developers to make their own choices.
For example, the following may be a good choice for text knowledge.
- text/plain - simple text file
- text/html - HTML web page
- text/xml - XML format file
- text/json - JSON format data
- application/pdf - PDF file
- application/msword - Word document
and ignore the following
- text/css - CSS stylesheet
- text/javascript - JavaScript script
- image/jpeg - JPEG image
- image/png - PNG image
- image/gif - GIF image
- image/svg+xml - SVG image
- audio/mpeg - MPEG audio files
- video/mp4 - MP4 video file
- application/font-woff - WOFF font file
- application/font-ttf - TTF font file
- application/zip - ZIP compressed file
- application/octet-stream - binary data
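A minimal sketch of how a developer could filter on this metadata, assuming the header value is stored under a `content_type` key. The key name, the allow-list and the plain dicts standing in for `Document` objects are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed allow-list; adjust to the resources you want to embed.
ALLOWED_CONTENT_TYPES = {"text/plain", "text/html", "text/xml", "application/pdf"}


def keep_text_documents(docs: List[Dict]) -> List[Dict]:
    """Keep documents whose metadata content type is in the allow-list."""
    kept = []
    for doc in docs:
        raw = doc.get("metadata", {}).get("content_type", "")
        # Drop parameters such as "; charset=utf-8" before comparing.
        media_type = raw.split(";")[0].strip().lower()
        if media_type in ALLOWED_CONTENT_TYPES:
            kept.append(doc)
    return kept


docs = [
    {
        "page_content": "<html>...</html>",
        "metadata": {"content_type": "text/html; charset=utf-8"},
    },
    {
        "page_content": "wOF2...",
        "metadata": {"content_type": "application/font-woff"},
    },
]
text_docs = keep_text_documents(docs)
```

The font file is dropped before it reaches any embedding step, while the HTML page is kept.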
**Twitter handle:** @coolbeevip
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** In VoyageAI text-embedding examples use voyage-law-2
model
- [x] **PR title**: Fix misplaced zep cloud example links
- [x] **PR message**:
- **Description:** Fixes misplaced links for vector store and memory zep
cloud examples
- **Description:** Adapt JinaEmbeddings to run with the new Jina AI
Rerank API
- **Twitter handle:** https://twitter.com/JinaAI_
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
An OpenAI API-compatible server may not support `safe_len_embedding`;
use `disable_safe_len_embeddings=True` to disable it.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
* Updating the provider docs page.
The RAG example was meant to be moved to cookbook, but was merged by
mistake.
* Fix bug in Groundedness Check
---------
Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Currently, when a new dev container is created, poetry does not work in
it with the error "No module named 'rapidfuzz'".
Install Poetry outside the project venv so that poetry and project
dependencies do not get mixed. Use pipx to install poetry securely in
its own isolated environment.
Issue: #12237
Twitter handle: https://twitter.com/ibratoev
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Currently, the regex is static (`r"(?<=[.?!])\s+"`),
which is only useful for certain use cases. This change only
makes it a parameter of split_text(), which adds flexibility
without making it more complex (the default regex is still the same).
- **Issue:** Not applicable (I searched, no one seems to have created
this issue yet).
- **Dependencies:** None.
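A minimal sketch of the idea, assuming the regex becomes a keyword argument with the old static pattern as its default. `split_sentences` and the parameter name `sentence_split_regex` are hypothetical and not the splitter's actual signature.

```python
import re
from typing import List

# The previously static pattern, now only a default.
DEFAULT_SENTENCE_REGEX = r"(?<=[.?!])\s+"


def split_sentences(
    text: str, sentence_split_regex: str = DEFAULT_SENTENCE_REGEX
) -> List[str]:
    """Split text with a caller-supplied regex, dropping empty pieces."""
    return [piece for piece in re.split(sentence_split_regex, text) if piece]


default_split = split_sentences("Hello there. How are you? Fine!")
custom_split = split_sentences("Line one\nLine two", sentence_split_regex=r"\n+")
```

Existing callers keep the old behaviour, while others can pass, for example, a newline-based pattern.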
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description: MarkdownHeaderTextSplitter fails to parse headers that
contain non-printable characters. See #20643.
The following is the official test case. Just replacing `# Foo\n\n` with
`\ufeff# Foo\n\n` causes the test case to fail, because the
chunk metadata is empty.
```python
def test_md_header_text_splitter_1() -> None:
    """Test markdown splitter by header: Case 1."""
    markdown_document = (
        "\ufeff# Foo\n\n"
        " ## Bar\n\n"
        "Hi this is Jim\n\n"
        "Hi this is Joe\n\n"
        " ## Baz\n\n"
        " Hi this is Molly"
    )
    headers_to_split_on = [
        ("#", "Header 1"),
        ("##", "Header 2"),
    ]
    markdown_splitter = MarkdownHeaderTextSplitter(
        headers_to_split_on=headers_to_split_on,
    )
    output = markdown_splitter.split_text(markdown_document)
    expected_output = [
        Document(
            page_content="Hi this is Jim \nHi this is Joe",
            metadata={"Header 1": "Foo", "Header 2": "Bar"},
        ),
        Document(
            page_content="Hi this is Molly",
            metadata={"Header 1": "Foo", "Header 2": "Baz"},
        ),
    ]
    assert output == expected_output
```
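A minimal sketch of one way to handle this, assuming non-printable characters are filtered out of each line before header matching. `parse_header` is a hypothetical helper and not the splitter's own code.

```python
import re
from typing import Optional, Tuple

HEADER_PATTERN = re.compile(r"^(#{1,6})\s+(.*)$")


def parse_header(line: str) -> Optional[Tuple[int, str]]:
    """Return (level, title) for a Markdown header line, else None."""
    # Drop non-printable characters such as a leading "\ufeff" byte order mark.
    cleaned = "".join(ch for ch in line if ch.isprintable()).strip()
    match = HEADER_PATTERN.match(cleaned)
    if match is None:
        return None
    return len(match.group(1)), match.group(2)


with_bom = parse_header("\ufeff# Foo")
without_bom = parse_header("# Foo")
```

Both calls yield the same header, so the metadata for the chunk is no longer empty.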
twitter: @coolbeevip
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Description:
- Added functionality: delete, index creation, using an existing
connection object, etc.
- Updated usage.
- Added LanceDB cloud OSS support.
`make lint_diff` and `make test` checks done.
- **Description:** fix a bug in the agent_token_buffer_memory
- **Issue:** agent_token_buffer_memory was not working with openai tools
- **Dependencies:** None
- **Twitter handle:** @pokidyshef
**Description:** Adds the command to install packages required before
using _Unstructured_ and _PDFMiner_ from `langchain.community`
**Documentation Page Being Updated:** [LangChain > Retrieval > Document
loaders > PDF > Using
Unstructured](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/#using-unstructured)
**Issue:** #20719
**Dependencies:** no dependencies
**Twitter handle:** SalikaDave
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
## Description
Add `aprep_output` method to `langchain/chains/base.py`. Some downstream
`ChatMessageHistory` objects that use async connections require an async
way to append to the context.
It turned out that `ainvoke()` was calling `prep_output` which is
synchronous.
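A minimal sketch of the difference, assuming a history object that only exposes an async append. `AsyncHistory`, `Chain` and the method bodies below are hypothetical stand-ins; only the name `aprep_output` comes from the description above.

```python
import asyncio
from typing import Dict, List


class AsyncHistory:
    """Hypothetical async message store standing in for an async chat history."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    async def aadd(self, message: str) -> None:
        await asyncio.sleep(0)  # placeholder for real async I/O
        self.messages.append(message)


class Chain:
    def __init__(self, history: AsyncHistory) -> None:
        self.history = history

    async def aprep_output(
        self, inputs: Dict[str, str], outputs: Dict[str, str]
    ) -> Dict[str, str]:
        # Awaiting here lets the async history finish its write.
        await self.history.aadd(outputs["text"])
        return {**inputs, **outputs}

    async def ainvoke(self, inputs: Dict[str, str]) -> Dict[str, str]:
        outputs = {"text": inputs["question"].upper()}
        return await self.aprep_output(inputs, outputs)


history = AsyncHistory()
result = asyncio.run(Chain(history).ainvoke({"question": "hi"}))
```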
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
# Proxy Fix for Groq Class 🐛🚀
## Description
This PR fixes a bug related to proxy settings in the `Groq` class,
allowing users to connect to LangChain services via a proxy.
## Changes Made
- ✅ FIX support for specifying proxy settings in the `Groq` class.
- ✅ Resolved the bug causing issues with proxy settings.
- ❌ Did not include unit tests and documentation updates.
- ❌ Did not run make format, make lint, and make test to ensure code
quality and functionality, because I don't program in Python and
couldn't get `ruff` to run.
- ❔ Ensured that the changes are backwards compatible.
- ✅ No additional dependencies were added to `pyproject.toml`.
### Error Before Fix
```python
Traceback (most recent call last):
  File "/home/bg/Documents/code/github.com/back2nix/test/groq/main.py", line 9, in <module>
    chat = ChatGroq(
           ^^^^^^^^^
  File "/home/bg/Documents/code/github.com/back2nix/test/groq/venv310/lib/python3.11/site-packages/langchain_core/load/serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "/home/bg/Documents/code/github.com/back2nix/test/groq/venv310/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for ChatGroq
__root__
  Invalid `http_client` argument; Expected an instance of `httpx.AsyncClient` but got <class 'httpx.Client'> (type=type_error)
```
### Example usage after fix
```python3
import os

import httpx
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq

chat = ChatGroq(
    temperature=0,
    groq_api_key=os.environ.get("GROQ_API_KEY"),
    model_name="mixtral-8x7b-32768",
    http_client=httpx.Client(
        proxies="socks5://127.0.0.1:1080",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
    http_async_client=httpx.AsyncClient(
        proxies="socks5://127.0.0.1:1080",
        # The async client needs the async transport class.
        transport=httpx.AsyncHTTPTransport(local_address="0.0.0.0"),
    ),
)
system = "You are a helpful assistant."
human = "{text}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])
chain = prompt | chat
out = chain.invoke({"text": "Explain the importance of low latency LLMs"})
print(out)
```
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Implemented the ability to enable full-text search within the
SingleStore vector store, offering users a versatile range of search
strategies. This enhancement allows users to seamlessly combine
full-text search with vector search, enabling the following search
strategies:
* Search solely by vector similarity.
* Conduct searches exclusively based on text similarity, utilizing
Lucene internally.
* Filter search results by text similarity score, with the option to
specify a threshold, followed by a search based on vector similarity.
* Filter results by vector similarity score before conducting a search
based on text similarity.
* Perform searches using a weighted sum of vector and text similarity
scores.
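A minimal sketch of the last strategy, assuming both scores are already on comparable scales. `weighted_sum_rank`, its parameters and the example scores are hypothetical; in the actual integration the scoring happens in the database, not in Python.

```python
from typing import Dict, List, Tuple


def weighted_sum_rank(
    vector_scores: Dict[str, float],
    text_scores: Dict[str, float],
    vector_weight: float = 0.5,
    text_weight: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank document ids by a weighted sum of vector and text similarity."""
    combined = {}
    for doc_id in set(vector_scores) | set(text_scores):
        combined[doc_id] = (
            vector_weight * vector_scores.get(doc_id, 0.0)
            + text_weight * text_scores.get(doc_id, 0.0)
        )
    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


ranking = weighted_sum_rank(
    {"a": 0.9, "b": 0.4},
    {"a": 0.2, "b": 0.8},
    vector_weight=0.7,
    text_weight=0.3,
)
```

With these weights, document `a` ranks first because vector similarity dominates the sum.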
Additionally, integration tests have been added to comprehensively cover
all scenarios.
Updated notebook with examples.
CC: @baskaryan, @hwchase17
---------
Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- added guard on the `pyTigerGraph` import
- added a missed example page in the `docs/integrations/graphs/`
- formatted the `docs/integrations/providers/` page to the consistent
format. Added links.
- **Description:**
This PR adds support for advanced filtering to the integration of HANA
Vector Engine.
The newly supported filtering operators are: $eq, $ne, $gt, $gte, $lt,
$lte, $between, $in, $nin, $like, $and, $or
- **Issue:** N/A
- **Dependencies:** no new dependencies added
Added integration tests to:
`libs/community/tests/integration_tests/vectorstores/test_hanavector.py`
Description of the new capabilities in notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`
community:perplexity[patch]: standardize init args
Updated `pplx_api_key` and `request_timeout` so that they are aliased to
`api_key` and `timeout`, respectively. Added a test that both names
continue to set the same underlying attributes.
Related to
[20085](https://github.com/langchain-ai/langchain/issues/20085)
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- [x] **PR title**: docs: Update Zep Messaging, add links to Zep Cloud
Docs
- [x] **PR message**:
- **Description:** This PR updates Zep messaging in the docs + links to
Langchain Zep Cloud examples in our documentation
- **Twitter handle:** @paulpaliychuk51
This PR moves the interface and the logic to core.
The following changes to namespaces:
`indexes` -> `indexing`
`indexes._api` -> `indexing.api`
Testing code is intentionally duplicated for now since it's testing
different
implementations of the record manager (in-memory vs. SQL).
Common logic will need to be pulled out into the test client.
A follow up PR will move the SQL based implementation outside of
LangChain.
**Description:**
This PR fixes an issue in message formatting function for Anthropic
models on Amazon Bedrock.
Currently, the LangChain BedrockChat model will crash if it uses Anthropic
models and the model returns a message of the following type:
- `AIMessageChunk`
Moreover, when using BedrockChat to build an agent, the following
message types will trigger the same issue too:
- `HumanMessageChunk`
- `FunctionMessage`
**Issue:**
https://github.com/langchain-ai/langchain/issues/18831
**Dependencies:**
No.
**Testing:**
Manually tested. The following code was failing before the patch and
works after.
```
@tool
def square_root(x: str):
    "Useful when you need to calculate the square root of a number"
    return math.sqrt(int(x))


llm = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    model_kwargs={ "temperature": 0.0 },
)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", FUNCTION_CALL_PROMPT),
        ("human", "Question: {user_input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)
tools = [square_root]
tools_string = format_tool_to_anthropic_function(square_root)
agent = (
    RunnablePassthrough.assign(
        user_input=lambda x: x['user_input'],
        agent_scratchpad=lambda x: format_to_openai_function_messages(
            x["intermediate_steps"]
        )
    )
    | prompt
    | llm
    | AnthropicFunctionsAgentOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, return_intermediate_steps=True)
output = agent_executor.invoke({
    "user_input": "What is the square root of 2?",
    "tools_string": tools_string,
})
```
List of messages returned from Bedrock:
```
<SystemMessage> content='You are a helpful assistant.'
<HumanMessage> content='Question: What is the square root of 2?'
<AIMessageChunk> content="Okay, let's calculate the square root of 2.<scratchpad>\nTo calculate the square root of a number, I can use the square_root tool:\n\n<function_calls>\n <invoke>\n <tool_name>square_root</tool_name>\n <parameters>\n <__arg1>2</__arg1>\n </parameters>\n </invoke>\n</function_calls>\n</scratchpad>\n\n<function_results>\n<search_result>\nThe square root of 2 is approximately 1.414213562373095\n</search_result>\n</function_results>\n\n<answer>\nThe square root of 2 is approximately 1.414213562373095\n</answer>" id='run-92363df7-eff6-4849-bbba-fa16a1b2988c'"
<FunctionMessage> content='1.4142135623730951' name='square_root'
```
Hi! My name is Alex, I'm an SDK engineer from
[Comet](https://www.comet.com/site/)
This PR updates the `CometTracer` class.
Fixed an issue when `CometTracer` failed while logging the data to Comet
because this data is not JSON-encodable.
The problem was in some of the `Run` attributes that could contain
non-default types inside, now these attributes are taken not from the
run instance, but from the `run.dict()` return value.
Causes an issue for this code
```python
from langchain.chat_models.openai import ChatOpenAI
from langchain.output_parsers.openai_tools import JsonOutputToolsParser
from langchain.schema import SystemMessage
prompt = SystemMessage(content="You are a nice assistant.") + "{question}"
llm = ChatOpenAI(
    model_kwargs={
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "web_search",
                    "description": "Searches the web for the answer to the question.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "The question to search for.",
                            },
                        },
                    },
                },
            }
        ],
    },
    streaming=True,
)
parser = JsonOutputToolsParser(first_tool_only=True)
llm_chain = prompt | llm | parser | (lambda x: x)
for chunk in llm_chain.stream({"question": "tell me more about turtles"}):
    print(chunk)
# message = llm_chain.invoke({"question": "tell me more about turtles"})
# print(message)
```
Instead by definition, we'll assume that RunnableLambdas consume the
entire stream and that if the stream isn't addable then it's the last
message of the stream that's in the usable format.
---
If users want to use addable dicts, they can wrap the dict in an
AddableDict class.
---
Likely, need to follow up with the same change for other places in the
code that do the upgrade
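The fallback rule described above can be sketched in plain Python. This is only an illustration, not the code in `langchain_core`: `collapse_stream` is a hypothetical helper that folds chunks with `+` and, when the chunks are not addable (plain dicts raise `TypeError`), keeps the most recent chunk.

```python
from typing import Any, Iterable


def collapse_stream(chunks: Iterable[Any]) -> Any:
    """Fold chunks with `+`; if they are not addable, keep the latest chunk."""
    iterator = iter(chunks)
    try:
        final = next(iterator)
    except StopIteration:
        raise ValueError("stream produced no chunks") from None
    for chunk in iterator:
        try:
            final = final + chunk
        except TypeError:
            # Not addable (e.g. plain dicts): assume the newest chunk is the usable one.
            final = chunk
    return final


print(collapse_stream(["tur", "tles"]))            # strings are addable
print(collapse_stream([{"q": "a"}, {"q": "ab"}]))  # plain dicts are not
```

With addable chunks the whole stream is accumulated; with plain dicts only the final chunk survives, which is why wrapping in an addable type matters when accumulation is wanted.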
- **Description:** In January, Laiyer.ai became part of ProtectAI, which
means the model is now owned by ProtectAI. In addition,
yesterday we released a new version of the model that addresses the
false-positive issues the LangChain community and others reported to us.
The new model has better accuracy than the previous version,
and we thought the LangChain community would benefit from using the
[latest version of the
model](https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2).
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** @alex_yaremchuk
This PR moves the chat history implementations to core, which makes it
easier to determine which dependencies need to be broken and where to add
deprecation warnings.
Fixed an error in the sample code to ensure that the code can run
directly.
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
langchain_community.document_loaders deprecated
new langchain_google_community
docs: Fix link for `partition_pdf` in Semi_Structured_RAG.ipynb cookbook
- **Description:** Fix incorrect link to unstructured-io `partition_pdf`
section
Vector indexes in ClickHouse are experimental at the moment and can
sometimes break/change behaviour. So this PR makes it possible to say
that you don't want to specify an index type.
Any queries against the embedding column will be brute force/linear
scan, but that gives reasonable performance for small-medium dataset
sizes.
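To make "brute force/linear scan" concrete, here is a minimal in-memory sketch of what a search without an index amounts to. It is not ClickHouse code: the cosine metric, the function names, and the row layout are assumptions for illustration, and embeddings are assumed to be non-zero vectors.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def brute_force_search(
    query: Sequence[float], rows: List[Tuple[str, Sequence[float]]], k: int = 2
) -> List[str]:
    """Score every row against the query (O(n)) and return the k closest ids."""
    scored = [(cosine_distance(query, emb), doc_id) for doc_id, emb in rows]
    return [doc_id for _, doc_id in sorted(scored)[:k]]
```

Every query touches every row, so cost grows linearly with table size, which is acceptable for small to medium datasets.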
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Thank you for contributing to LangChain!
- [ ] **PR title**: "docs: added a description of differences
langchain_google_genai vs langchain_google_vertexai"
- [ ]
- **Description:** added a description of differences
langchain_google_genai vs langchain_google_vertexai
**Description:** implemented GraphStore class for Apache Age graph db
**Dependencies:** depends on psycopg2
Unit and integration tests included. Formatting and linting have been
run.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Update Neo4j Cypher templates to use function callback to pass context
instead of passing it in user prompt.
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** This pull request removes a duplicated `--quiet` flag
in the pip install command found in the LangSmith Walkthrough section of
the documentation.
**Issue:** N/A
**Dependencies:** None
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: docs"
- [ ] **PR message**:
- **Description:** Updated Tutorials for Vertex Vector Search
- **Issue:** NA
- **Dependencies:** NA
@lkuligin for review
---------
Co-authored-by: adityarane@google.com <adityarane@google.com>
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
This pull request corrects a mistake in the variable name within the
example code. The variable doc_schema has been changed to dog_schema to
fix the error.
Description: you don't need to pass a version for Replicate official
models. That was broken on LangChain until now!
You can now run:
```
from langchain_community.llms import Replicate
llm = Replicate(
    model="meta/meta-llama-3-8b-instruct",
    model_kwargs={"temperature": 0.75, "max_length": 500, "top_p": 1},
)
prompt = """
User: Answer the following yes/no question by reasoning step by step. Can a dog drive a car?
Assistant:
"""
llm(prompt)
```
I've updated the replicate.ipynb to reflect that.
twitter: @charliebholtz
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
The ZhipuAI API only accepts a `temperature` parameter in the open interval
`(0, 1)`; if `0` is passed, it responds with status code `400`.
However, 0 and 1 are often accepted by other APIs; for example, OpenAI
allows the closed range `[0, 2]` for temperature.
This PR clamps the temperature parameter to `[0.01, 0.99]` to
improve compatibility between the LangChain ecosystem and ZhipuAI
(e.g., ragas `evaluate` often generates temperature 0, which results in
a lot of 400 invalid responses). The PR also clamps the `top_p` parameter
since it has the same restriction.
Reference: [glm-4 doc](https://open.bigmodel.cn/dev/api#glm-4) (which
unfortunately is in Chinese though).
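A minimal sketch of the clamping described above, assuming the bounds `[0.01, 0.99]` from this description. The helper names are hypothetical and are not taken from the integration's code.

```python
def clamp(value: float, low: float = 0.01, high: float = 0.99) -> float:
    """Clamp `value` into the closed range [low, high]."""
    return max(low, min(high, value))


def sanitize_sampling_params(params: dict) -> dict:
    """Return a copy of `params` with `temperature` and `top_p` clamped, if present."""
    cleaned = dict(params)
    for key in ("temperature", "top_p"):
        if cleaned.get(key) is not None:
            cleaned[key] = clamp(cleaned[key])
    return cleaned
```

For example, `sanitize_sampling_params({"temperature": 0, "top_p": 1.0})` yields `{"temperature": 0.01, "top_p": 0.99}`, while values already inside the range pass through unchanged.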
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
faster-whisper is a reimplementation of OpenAI's Whisper model using
CTranslate2, which is up to 4 times faster than openai/whisper for the
same accuracy while using less memory. The efficiency can be further
improved with 8-bit quantization on both CPU and GPU.
It can automatically detect the following 14 languages and transcribe
the text into their respective languages: en, zh, fr, de, ja, ko, ru,
es, th, it, pt, vi, ar, tr.
The GitHub repository for faster-whisper is:
https://github.com/SYSTRAN/faster-whisper
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
VSDX data contains EMF files. Some of these apparently can contain
exploits with some Adobe tools.
This is likely a false positive from antivirus software, but we
can remove it nonetheless.
Hey @eyurtsev, I noticed that the notebook isn't displaying the outputs
properly. I've gone ahead and rerun the cells to ensure that readers can
easily understand the functionality without having to run the code
themselves.
Replaced `from langchain.prompts` with `from langchain_core.prompts`
where it is appropriate.
Most of the changes go to `langchain_experimental`
Similar to #20348
…gFaceTextGenInference)
- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [HuggingFaceTextGenInference]
- [x] **PR message**:
- **Description:** Invoke callback prior to yielding token in stream
method in [HuggingFaceTextGenInference]
- **Issue:** https://github.com/langchain-ai/langchain/issues/16913
- **Dependencies:** None
- **Twitter handle:** @bolun_zhang
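The ordering this fix is about can be shown with a plain generator. This is a sketch under assumptions: `stream_tokens` and `on_new_token` are hypothetical stand-ins, not the integration's `_stream` method or LangChain's callback manager.

```python
from typing import Callable, Iterable, Iterator


def stream_tokens(
    tokens: Iterable[str], on_new_token: Callable[[str], None]
) -> Iterator[str]:
    """Notify the callback about each token before handing it to the consumer."""
    for token in tokens:
        on_new_token(token)  # callback runs first
        yield token          # consumer sees the token afterwards


events = []
for tok in stream_tokens(["a", "b"], lambda t: events.append("callback:" + t)):
    events.append("consumer:" + tok)
print(events)
```

Because a generator pauses at `yield`, anything placed after it only runs when the consumer asks for the next token, so invoking the callback first keeps handlers in step with what the consumer has received.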
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
fix timeout issue
fix zhipuai usecase notebook
fixed broken `LangGraph` hyperlink
@rgupta2508 I believe this change is necessary following
https://github.com/langchain-ai/langchain/pull/20318 because of how
Milvus handles defaults:
59bf5e811a/pymilvus/client/prepare.py (L82-L85)
```python
num_shards = kwargs[next(iter(same_key))]
if not isinstance(num_shards, int):
    msg = f"invalid num_shards type, got {type(num_shards)}, expected int"
    raise ParamError(message=msg)
req.shards_num = num_shards
```
this way lets Milvus control the default value (instead of maintaining a
separate default in Langchain).
Let me know if I've got this wrong or you feel it's unnecessary. Thanks.
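One way to let the downstream library own the default is to forward the argument only when the caller set it. The sketch below assumes that is the intent of this change; `collection_kwargs` is a hypothetical helper, not code from LangChain or pymilvus.

```python
from typing import Any, Dict, Optional


def collection_kwargs(num_shards: Optional[int] = None, **extra: Any) -> Dict[str, Any]:
    """Build kwargs for collection creation, omitting `num_shards` when unset."""
    kwargs = dict(extra)
    if num_shards is not None:
        # Only forward an explicit value; otherwise the library default applies.
        kwargs["num_shards"] = num_shards
    return kwargs
```

Passing `num_shards=None` through unchanged would trip the `isinstance(num_shards, int)` check quoted above, whereas omitting the key avoids it.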
Adds support for specifying the number of shards for the collection
created in the Milvus vectorstore.
**Description:** Move `FileCallbackHandler` from community to core
**Issue:** #20493
**Dependencies:** None
(imo) `FileCallbackHandler` is a built-in LangChain callback handler
like `StdOutCallbackHandler` and should properly be in core.
- **Description:** added the headless parameter as optional argument to
the langchain_community.document_loaders AsyncChromiumLoader class
- **Dependencies:** None
- **Twitter handle:** @perinim_98
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- would happen when the user's code tries to access an attribute that doesn't
exist; we prefer to let this crash in the user's code rather than here
- also catch more cases where a runnable is invoked/streamed inside a
lambda. Before, we weren't seeing these as deps
**Description:** currently, the `DirectoryLoader` progress-bar maximum value is based on an incorrect number of files to process
In langchain_community/document_loaders/directory.py:127:
```python
paths = p.rglob(self.glob) if self.recursive else p.glob(self.glob)
items = [
path
for path in paths
if not (self.exclude and any(path.match(glob) for glob in self.exclude))
]
```
`paths` returns both files and directories. `items` is later used to determine the maximum value of the progress-bar which gives an incorrect progress indication.
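A sketch of one way to get an accurate count is to keep only regular files when building the list. This is an illustration under assumptions, not necessarily the change made in this PR; `collect_files` is a hypothetical standalone function mirroring the snippet above.

```python
from pathlib import Path
from typing import List, Optional, Sequence


def collect_files(
    root: Path,
    glob: str,
    recursive: bool = False,
    exclude: Optional[Sequence[str]] = None,
) -> List[Path]:
    """Return matching paths that are regular files and not excluded."""
    paths = root.rglob(glob) if recursive else root.glob(glob)
    return [
        path
        for path in paths
        if path.is_file()  # skip directories so the count matches the work to do
        and not (exclude and any(path.match(pattern) for pattern in exclude))
    ]
```

The length of the returned list can then serve as the progress-bar total, since every entry corresponds to a file that will actually be loaded.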
- Add functions (_stream, _astream)
- Connect to _generate and _agenerate
Thank you for contributing to LangChain!
- [x] **PR title**: "community: Add streaming logic in ChatHuggingFace"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Adds the `_stream` and `_astream` functions and connects
them to `_generate` and `_agenerate`
- **Issue:** #18782
- **Dependencies:** none
- **Twitter handle:** @lunara_x
**Community: Unify Titan Takeoff Integrations and Adding Embedding
Support**
**Description:**
Neither of the integrations in the community folder reflects Titan Takeoff
any longer. The two integrations (TitanTakeoffPro and
TitanTakeoff) were causing confusion with clients, so we have moved the code
into one place and created an alias for backwards compatibility. We added the
Takeoff Client Python package to do the bulk of the work with the
requests, because this package is actively updated with new
versions of Takeoff. This integration will therefore be far more robust and
will not degrade as badly over time.
**Issue:**
Fixes bugs in the old Titan integrations and unifies the code, with added
unit test coverage to avoid future problems.
**Dependencies:**
Added the optional dependency takeoff-client. All imports, including the
Titan Takeoff classes, still work without the dependency, but
initialisation fails if takeoff-client has not been pip installed.
**Twitter**
@MeryemArik9
Thanks all :)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Description: Add support for authorized identities in PebbloSafeLoader.
Now with this change, PebbloSafeLoader will extract
authorized_identities from metadata and send it to pebblo server
Dependencies: None
Documentation: None
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Since `langchain_community 0.0.30`, there is a bug that prevents sending a
file-like object via the `file` parameter instead of a file path, because
`file_path` is cast to `str` even when it is None.
When calling `partition_via_api()`, exactly one of
`filename` and `file` must be specified, as the error message below states.
However, since `langchain_community 0.0.30`, `get_elements_from_api()` casts
`file_path` to `str` even when it is None, which raises
an error at `exactly_one(filename=filename, file=file)`.
Here's the error message:
```
---> 51 exactly_one(filename=filename, file=file)
53 if metadata_filename and file_filename:
54 raise ValueError(
55 "Only one of metadata_filename and file_filename is specified. "
56 "metadata_filename is preferred. file_filename is marked for deprecation.",
57 )
File /opt/homebrew/lib/python3.11/site-packages/unstructured/partition/common.py:441, in exactly_one(**kwargs)
439 else:
440 message = f"{names[0]} must be specified."
--> 441 raise ValueError(message)
ValueError: Exactly one of filename and file must be specified.
```
So I changed the code to cast `file_path` to `str` only when it is
not None.
I use `UnstructuredAPIFileLoader` as shown below.
```
from langchain_community.document_loaders.unstructured import UnstructuredAPIFileLoader
documents: list = UnstructuredAPIFileLoader(
file_path=None,
file=file, # file-like object, io.BytesIO type
mode='elements',
url='http://127.0.0.1:8000/general/v0/general',
content_type='application/pdf',
metadata_filename='asdf.pdf',
).load_and_split()
```
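The root cause is easy to reproduce in isolation: `str(None)` is the non-empty string `"None"`, so an unset path ends up looking like a filename. A minimal sketch of the guarded cast follows; `normalize_file_path` is a hypothetical helper name, not the function in `langchain_community`.

```python
from pathlib import Path
from typing import Optional, Union


def normalize_file_path(file_path: Optional[Union[str, Path]]) -> Optional[str]:
    """Cast to str only when a path was actually supplied."""
    return str(file_path) if file_path is not None else None


print(repr(str(None)))                   # the unguarded cast turns None into a string
print(repr(normalize_file_path(None)))   # the guarded cast keeps None
```

With the guard in place, a `None` path stays `None`, so the "exactly one of `filename` and `file`" check sees only the file-like object.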
- [x] **PR title**: "community: improve kuzu cypher generation prompt"
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Improves the Kùzu Cypher generation prompt to be more
robust to open source LLM outputs
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** @kuzudb
- [x] **Add tests and docs**: If you're adding a new integration, please
include
No new tests (non-breaking change)
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
## Description:
The PR introduces 3 changes:
1. added a `recursive` property to `O365BaseLoader`. (To keep the behavior
unchanged, it defaults to `False`.) When `recursive=True`,
`_load_from_folder()` also recursively loads all nested folders.
2. added `folder_id` to `SharePointLoader` (similar to [this
PR](https://github.com/langchain-ai/langchain/pull/10780)). This
provides an alternative to `folder_path`, which doesn't seem to work
reliably.
3. when none of `document_ids`, `folder_id`, `folder_path` is provided,
the loader fetches documents from the root folder. Combined with
`recursive=True`, this provides an easy way of loading all compatible
documents from SharePoint.
The PR contains the same logic as [this stale
PR](https://github.com/langchain-ai/langchain/pull/10780) by
@WaleedAlfaris. I'd like to ask his blessing for moving forward with
this one.
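The effect of the `recursive` flag can be sketched with an in-memory folder tree. Everything here is hypothetical: `Folder` and `iter_files` stand in for the remote drive objects and for `_load_from_folder()`, and do not use the O365 API.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Folder:
    """Hypothetical stand-in for a remote folder."""
    name: str
    files: List[str] = field(default_factory=list)
    subfolders: List["Folder"] = field(default_factory=list)


def iter_files(folder: Folder, recursive: bool = False) -> Iterator[str]:
    """Yield files in `folder`; descend into nested folders only when asked."""
    yield from folder.files
    if recursive:
        for child in folder.subfolders:
            yield from iter_files(child, recursive=True)
```

Starting from the root folder with `recursive=True` visits every nested folder, which corresponds to the "load everything" case described in point 3.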
## Issue:
- As described in https://github.com/langchain-ai/langchain/issues/19938
and https://github.com/langchain-ai/langchain/pull/10780 the sharepoint
loader often does not seem to work with folder_path.
- Recursive loading of subfolders is a missing functionality
## Dependencies: None
Twitter handle:
@martintriska1 @WRhetoric
This is my first PR here, please be gentle :-)
Please review @baskaryan
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
This PR updates OctoAIEndpoint LLM to subclass BaseOpenAI as OctoAI is
an OpenAI-compatible service. The documentation and tests have also been
updated.
**Description:** Adds ThirdAI NeuralDB retriever integration. NeuralDB
is a CPU-friendly and fine-tunable text retrieval engine. We previously
added a vector store integration but we think that it will be easier for
our customers if they can also find us under
langchain-community/retrievers.
---------
Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
**Description:** Make the ChatDatabricks model support streaming
**Issue:** N/A
**Dependencies:** MLflow nightly build version (we will release next
MLflow version soon)
**Twitter handle:** N/A
Manually test:
(Before testing, please install `pip install
git+https://github.com/mlflow/mlflow.git`)
```python
# Test Databricks Foundation LLM model
from langchain.chat_models import ChatDatabricks
chat_model = ChatDatabricks(
    endpoint="databricks-llama-2-70b-chat",
    max_tokens=500
)
from langchain_core.messages import AIMessageChunk
for chunk in chat_model.stream("What is mlflow?"):
    print(chunk.content, end="|")
```
---------
Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Add conditional: bool property to json representation of the graphs
- Add option to generate mermaid graph stripped of styles (useful as a
text representation of graph)
…s arg too
- **Description:**
This PR adds a callback handler for UpTrain. It performs evaluations in
the RAG pipeline to check the quality of retrieved documents, generated
queries and responses.
- **Dependencies:**
- The UpTrainCallbackHandler requires the uptrain package
---------
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
The environment variable `ANTHROPIC_API_URL` will not work if `anthropic_api_url`
has a default value.
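One plausible reading of the problem: if the field always carries a non-`None` default, the code cannot tell "unset" from "explicitly set", so the environment variable is never consulted. The sketch below shows a resolution order that avoids this; `resolve_api_url` and `DEFAULT_API_URL` are hypothetical names, and the default URL is an assumption for illustration.

```python
import os
from typing import Mapping, Optional

DEFAULT_API_URL = "https://api.anthropic.com"  # assumed default, for illustration only


def resolve_api_url(
    explicit: Optional[str] = None, env: Optional[Mapping[str, str]] = None
) -> str:
    """Prefer an explicit value, then the environment variable, then the default."""
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit
    return env.get("ANTHROPIC_API_URL") or DEFAULT_API_URL
```

Keeping the parameter `None` until resolution time is what allows the environment variable to take effect when no explicit value is passed.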
---------
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
**Description**: Support filter by OR and AND for deprecated PGVector
version
**Issue**: #20445
**Dependencies**: N/A
**Twitter** handle: @martinferenaz
- **Description:** Add Google Firestore Vector store docs
- **Issue:** NA
- **Dependencies:** NA
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Description: fixes LangChainDeprecationWarning: The class
`langchain_community.embeddings.cohere.CohereEmbeddings` was deprecated
in langchain-community 0.0.30 and will be removed in 0.2.0. An updated
version of the class exists in the langchain-cohere package and should
be used instead. To use it run `pip install -U langchain-cohere` and
import as `from langchain_cohere import CohereEmbeddings`.

Dependencies : langchain_cohere
Twitter handle: @Mo_Noumaan
Description of features on mermaid graph renderer:
- Fixing CDN to use official Mermaid JS CDN:
https://www.jsdelivr.com/package/npm/mermaid?tab=files
- Add device_scale_factor to allow increasing quality of resulting PNG.
- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [DeepInfra]
- [x] **PR message**:
- **Description:** Invoke callback prior to yielding token in stream
method in [DeepInfra]
- **Issue:** https://github.com/langchain-ai/langchain/issues/16913
- **Dependencies:** None
- **Twitter handle:** @bolun_zhang
Description: This update refines the documentation for
`RunnablePassthrough` by removing an unnecessary import and correcting a
minor syntactical error in the example provided. This change enhances
the clarity and correctness of the documentation, ensuring that users
have a more accurate guide to follow.
Issue: N/A
Dependencies: None
This PR focuses solely on documentation improvements, specifically
targeting the `RunnablePassthrough` class within the `langchain_core`
module. By clarifying the example provided in the docstring, users are
offered a more straightforward and error-free guide to utilizing the
`RunnablePassthrough` class effectively.
As this is a documentation update, it does not include changes that
require new integrations, tests, or modifications to dependencies. It
adheres to the guidelines of minimal package interference and backward
compatibility, ensuring that the overall integrity and functionality of
the LangChain package remain unaffected.
Thank you for considering this documentation refinement for inclusion in
the LangChain project.
Fix of YandexGPT embeddings.
The current version uses a single `model_name` for queries and
documents, essentially making the `embed_documents` and `embed_query`
methods the same. Yandex has a different endpoint (`model_uri`) for
encoding documents, see
[this](https://yandex.cloud/en/docs/yandexgpt/concepts/embeddings). The
bug may impact retrievers built with `YandexGPTEmbeddings` (for instance
FAISS database as retriever) since they use both `embed_documents` and
`embed_query`.
A simple snippet to test the behaviour:
```python
from langchain_community.embeddings.yandex import YandexGPTEmbeddings
embeddings = YandexGPTEmbeddings()
q_emb = embeddings.embed_query('hello world')
doc_emb = embeddings.embed_documents(['hello world', 'hello world'])
q_emb == doc_emb[0]
```
The response is `True` with the current version and `False` with the
changes I made.
Twitter: @egor_krash
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Updates the documentation for Portkey and Langchain.
Also updates the notebook. The current documentation is fairly old and
is non-functional.
**Twitter handle:** @portkeyai
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
`_ListSQLDatabaseToolInput` raises an error if the model returns `{}`.
For example, gpt-4-turbo returns `{}` with a SQL agent initialized by
`create_sql_agent`.
So I set a default value of `""` for the `_ListSQLDatabaseToolInput` `tool_input` field.
This is actually a gpt-4-turbo issue, not a LangChain issue, but I
thought it would be helpful to set a default value of `""`.
This problem is discussed in detail in the following issue.
**Issue:** https://github.com/langchain-ai/langchain/issues/20405
**Dependencies:** none
Sorry, I did not add or change any test code, as no tests exist for this
component.
However, I have tested the following code, based on the [SQL Agent
Document](https://python.langchain.com/docs/use_cases/sql/agents/), to
make sure it works.
```python
from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI
db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)
result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?")
print(result["output"])
```
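The effect of the default value can be illustrated without LangChain. This is a rough sketch that uses a standard-library dataclass as a stand-in for the pydantic input schema; `ListTablesInput` and `parse_tool_args` are hypothetical names, not the library's.

```python
from dataclasses import dataclass


@dataclass
class ListTablesInput:
    # Hypothetical stand-in for the tool's input schema. The default ""
    # lets an empty argument object from the model pass construction.
    tool_input: str = ""


def parse_tool_args(args: dict) -> ListTablesInput:
    """Build the input object from whatever arguments the model produced."""
    return ListTablesInput(**args)


empty = parse_tool_args({})  # would raise TypeError without the default
explicit = parse_tool_args({"tool_input": "ignored"})
```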
- **Description:** Complete the support for Lua code in
langchain.text_splitter module.
- **Dependencies:** No
- **Twitter handle:** @saberuster
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
```python
from langchain.agents import AgentExecutor, create_tool_calling_agent, tool
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_groq import ChatGroq
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        ("human", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ]
)
model = ChatGroq(model_name="mixtral-8x7b-32768", temperature=0)

@tool
def magic_function(input: int) -> int:
    """Applies a magic function to an input."""
    return input + 2
tools = [magic_function]
agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "what is the value of magic_function(3)?"})
```
```
> Entering new AgentExecutor chain...
Invoking: `magic_function` with `{'input': 3}`
5The value of magic\_function(3) is 5.
> Finished chain.
{'input': 'what is the value of magic_function(3)?',
'output': 'The value of magic\\_function(3) is 5.'}
```
**Description:** Masking of the API key for AI21 models
**Issue:** Fixes #12165 for AI21
**Dependencies:** None
Note: This fix came in originally through #12418 but was possibly missed
in the refactor to the AI21 partner package
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Replaced all `from langchain.callbacks` imports with `from
langchain_core.callbacks`.
The changes are in the `langchain` and `langchain_experimental` packages.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- **Description**: The pydantic schema fields are supposed to be
optional but the use of `...` makes them required. This causes a
`ValidationError` when running the example code. I replaced `...` with
`default=None` to make the fields optional as intended. I also
standardized the format for all fields.
- **Issue**: n/a
- **Dependencies**: none
- **Twitter handle**: https://twitter.com/m_atoms
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for Llamafile
- [x] **PR message**:
- **Description:** Invoke callback prior to yielding token in stream
method in community llamafile.py
- **Issue:** https://github.com/langchain-ai/langchain/issues/16913
- **Dependencies:** None
- **Twitter handle:** @bolun_zhang
spelling error fixed
- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for HuggingFaceEndpoint
- [x] **PR message**:
- **Description:** Invoke callback prior to yielding token in stream
method in community HuggingFaceEndpoint
- **Issue:** https://github.com/langchain-ai/langchain/issues/16913
- **Dependencies:** None
- **Twitter handle:** @bolun_zhang
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Added the [FireCrawl](https://firecrawl.dev) document loader. Firecrawl
crawls and converts any website into LLM-ready data. It crawls all
accessible subpages and gives you clean markdown for each.
- **Description:** Adds FireCrawl data loader
- **Dependencies:** firecrawl-py
- **Twitter handle:** @mendableai
ccing contributors: (@ericciarla @nickscamara)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
LLMs sometimes return an invalid response to the LLM graph transformer.
Instead of failing on pydantic validation, we skip it and manually
check and, where we can, fix the errors, so that more information
gets extracted.
- **Description:** Added cross-links for easy access to the API
documentation of each output parser class from its description page.
- **Issue:** related to issue #19969
Co-authored-by: Haris Ali <haris.ali@formulatrix.com>
avaliable -> available
- **Description:** fixed typo
Mistral gives us one ID per response, no individual IDs for tool calls.
```python
from langchain.agents import AgentExecutor, create_tool_calling_agent, tool
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_mistralai import ChatMistralAI
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        ("human", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ]
)
model = ChatMistralAI(model="mistral-large-latest", temperature=0)

@tool
def magic_function(input: int) -> int:
    """Applies a magic function to an input."""
    return input + 2
tools = [magic_function]
agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "what is the value of magic_function(3)?"})
```
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:** Adds chroma to the partners package. Tests & code
mirror those in the community package.
**Dependencies:** None
**Twitter handle:** @akiradev0x
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
This PR should make it easier for linters to do type checking and for IDEs to jump to definition of code.
See #20050 as a template for this PR.
- As a byproduct: added 3 missing `test_imports`.
- Added the missing `SolarChat` to `__init__.py` and to the `test_imports` unit test.
- Added `# type: ignore` to fix linting. It is not clear why linting
errors appear after the changes above.
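The referenced template is not reproduced in this message. As a rough sketch, assuming the pattern pairs `TYPE_CHECKING`-only imports (visible to linters and IDEs) with a lazy module-level `__getattr__` (PEP 562), it could look like the following. The lookup table uses standard-library names purely for illustration, and `demo_pkg` is a throwaway module object standing in for a real package.

```python
import importlib
import types
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Seen only by type checkers and IDEs, so "jump to definition" works
    # without importing anything at runtime.
    from json import JSONDecoder  # noqa: F401

# Public name -> module that really defines it (illustrative entries).
_module_lookup = {"JSONDecoder": "json", "OrderedDict": "collections"}


def _lazy_getattr(name: str) -> Any:
    """Import the defining module only when the name is first requested."""
    if name in _module_lookup:
        module = importlib.import_module(_module_lookup[name])
        return getattr(module, name)
    raise AttributeError(f"module has no attribute {name!r}")


# In a real package this function would be named __getattr__ at the top
# level of __init__.py. Here it is attached to a stand-in module object.
demo_pkg = types.ModuleType("demo_pkg")
demo_pkg.__getattr__ = _lazy_getattr
demo_pkg.__all__ = list(_module_lookup)

decoder_cls = demo_pkg.JSONDecoder  # resolved lazily through the lookup table
```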
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
```python
from langchain.agents import AgentExecutor, create_tool_calling_agent, tool
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        MessagesPlaceholder("chat_history", optional=True),
        ("human", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ]
)
model = ChatAnthropic(model="claude-3-opus-20240229")

@tool
def magic_function(input: int) -> int:
    """Applies a magic function to an input."""
    return input + 2
tools = [magic_function]
agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
agent_executor.invoke({"input": "what is the value of magic_function(3)?"})
```
```
> Entering new AgentExecutor chain...
Invoking: `magic_function` with `{'input': 3}`
responded: [{'text': '<thinking>\nThe user has asked for the value of magic_function applied to the input 3. Looking at the available tools, magic_function is the relevant one to use here, as it takes an integer input and returns an integer output.\n\nThe magic_function has one required parameter:\n- input (integer)\n\nThe user has directly provided the value 3 for the input parameter. Since the required parameter is present, we can proceed with calling the function.\n</thinking>', 'type': 'text'}, {'id': 'toolu_01HsTheJPA5mcipuFDBbJ1CW', 'input': {'input': 3}, 'name': 'magic_function', 'type': 'tool_use'}]
5
Therefore, the value of magic_function(3) is 5.
> Finished chain.
{'input': 'what is the value of magic_function(3)?',
'output': 'Therefore, the value of magic_function(3) is 5.'}
```
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]
```python
class ToolCall(TypedDict):
    name: str
    args: Dict[str, Any]
    id: Optional[str]

class InvalidToolCall(TypedDict):
    name: Optional[str]
    args: Optional[str]
    id: Optional[str]
    error: Optional[str]

class ToolCallChunk(TypedDict):
    name: Optional[str]
    args: Optional[str]
    id: Optional[str]
    index: Optional[int]

class AIMessage(BaseMessage):
    ...
    tool_calls: List[ToolCall] = []
    invalid_tool_calls: List[InvalidToolCall] = []
    ...

class AIMessageChunk(AIMessage, BaseMessageChunk):
    ...
    tool_call_chunks: Optional[List[ToolCallChunk]] = None
    ...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
- additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).
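The chunk-merging rule (append, merging entries with equal `index`) can be sketched as below. This is an illustration under stated assumptions rather than the library's implementation: `merge_tool_call_chunks` and `_concat` are hypothetical helpers, and concatenating the string fields, with `None` treated as "nothing yet", is an assumed merge behavior.

```python
from typing import List, Optional, TypedDict


class ToolCallChunk(TypedDict):
    name: Optional[str]
    args: Optional[str]
    id: Optional[str]
    index: Optional[int]


def _concat(left: Optional[str], right: Optional[str]) -> Optional[str]:
    """Concatenate two optional string fragments; None means 'nothing yet'."""
    if left is None:
        return right
    if right is None:
        return left
    return left + right


def merge_tool_call_chunks(
    left: List[ToolCallChunk], right: List[ToolCallChunk]
) -> List[ToolCallChunk]:
    merged: List[ToolCallChunk] = [dict(c) for c in left]  # copy, don't mutate
    for chunk in right:
        match = next(
            (m for m in merged
             if chunk["index"] is not None and m["index"] == chunk["index"]),
            None,
        )
        if match is None:
            merged.append(dict(chunk))  # unseen index: append as a new entry
        else:
            for key in ("name", "args", "id"):
                match[key] = _concat(match[key], chunk[key])
    return merged


first = [{"name": "magic_function", "args": '{"inp', "id": "call_1", "index": 0}]
second = [{"name": None, "args": 'ut": 3}', "id": None, "index": 0}]
result = merge_tool_call_chunks(first, second)
```

Here the two fragments share `index` 0, so their `args` strings are joined into one complete JSON string.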
Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Description: When multithreading is set to True and using the
DirectoryLoader, there was a bug that caused the return type to be a
doubly nested list. As a result, other code could no longer
use the from_documents method, because the value was a
`List[List[Documents]]` rather than a `List[Documents]`. The change
loops through `future.result()` and yields every item.
Issue: #20093
Dependencies: N/A
Twitter handle: N/A
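A minimal sketch of the flattening step, using `concurrent.futures` directly. `load_file` is a hypothetical per-file loader that returns several items per file; it is not the `DirectoryLoader` code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List


def load_file(path: str) -> List[str]:
    # Hypothetical per-file loader: one file may produce several documents.
    return [f"{path}#part{i}" for i in range(2)]


def lazy_load(paths: List[str]) -> Iterator[str]:
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = [executor.submit(load_file, p) for p in paths]
        for future in futures:
            # Yield each document rather than the whole list, so callers
            # receive a flat sequence instead of a list of lists.
            yield from future.result()


docs = list(lazy_load(["a.txt", "b.txt"]))
```

Yielding `future.result()` itself, instead of its items, is what would produce the nested `List[List[...]]` shape described above.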
This unit test likely fails validation by the openai client.
Newer versions of the openai library seem to do more validation, so the existing
test fails because `http_client` needs to be an httpx instance.
- **Description**: fixes BooleanOutputParser detecting sub-words ("NOW
this is likely (YES)" -> `True`, not `AmbiguousError`)
- **Issue(s)**: fixes #11408 (follow-up to #17810)
- **Dependencies**: None
- **GitHub handle**: @casperdcl
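The sub-word problem can be illustrated with word-boundary anchors. This is a sketch under assumptions: `parse_boolean` is a hypothetical function, and it raises a plain `ValueError` where the description mentions `AmbiguousError`.

```python
import re


def parse_boolean(text: str, true_val: str = "YES", false_val: str = "NO") -> bool:
    # \b anchors keep "NO" from matching inside words such as "NOW".
    pattern = rf"\b({re.escape(true_val)}|{re.escape(false_val)})\b"
    found = {m.upper() for m in re.findall(pattern, text, flags=re.IGNORECASE)}
    if true_val.upper() in found and false_val.upper() in found:
        raise ValueError(f"Ambiguous response: both values found in {text!r}")
    if true_val.upper() in found:
        return True
    if false_val.upper() in found:
        return False
    raise ValueError(f"Neither {true_val} nor {false_val} found in {text!r}")


answer = parse_boolean("NOW this is likely (YES)")
```

A plain substring check would find "NO" inside "NOW" as well as "YES", and would report the input as ambiguous.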
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:**
Use the `Stream` context managers in the `ChatOpenAI` `stream` and `astream`
methods.
Using the context manager returned by the OpenAI client makes it
possible to terminate the stream early, since the response connection
will be closed when the context manager exits.
**Issue:** #5340
**Twitter handle:** @snopoke
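Why the context manager enables early termination can be sketched with a stand-in response object. `FakeStream` is hypothetical and only records whether it was closed; it is not the OpenAI client's `Stream` class.

```python
from typing import Iterator, List


class FakeStream:
    """Hypothetical stand-in for a streaming HTTP response."""

    def __init__(self, chunks: List[str]) -> None:
        self._chunks = chunks
        self.closed = False

    def __enter__(self) -> "FakeStream":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self.closed = True  # a real client would release the connection here

    def __iter__(self) -> Iterator[str]:
        return iter(self._chunks)


def stream(response: FakeStream) -> Iterator[str]:
    # The with-block lives inside the generator, so closing the generator
    # early runs __exit__ and closes the response.
    with response:
        for chunk in response:
            yield chunk


response = FakeStream(["a", "b", "c"])
gen = stream(response)
first_chunk = next(gen)
gen.close()  # stop early; GeneratorExit unwinds the with-block
```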
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Bug fix. Removed extra line in `GCSDirectoryLoader`
to allow catching Exceptions. Now also logs the file path if Exception
is raised for easier debugging.
- **Issue:** #20198 Bug since langchain-community==0.0.31
- **Dependencies:** No change
- **Twitter handle:** timothywong731
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Make Tencent Cloud VectorDB support metadata filtering.
- Implement the delete function for Tencent Cloud VectorDB.
- Support both LangChain embedding models and Tencent Cloud VDB embedding
models.
- Tencent Cloud VectorDB supports a filter search keyword, compatible with
the LangChain filtering syntax.
- Add a Tencent Cloud VectorDB TranslationVisitor, which now works with the self
query retriever.
- More documentation.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** This PR fixes the links that point to the API
docs for classes in the OpenAI functions and OpenAI tools sections of output
parsers.
- **Issue:** Fixes issue #19969
Co-authored-by: Haris Ali <haris.ali@formulatrix.com>
Issue: `langchain_community.cross_encoders` didn't have namespace-flattening
code in its `__init__.py` file.
Changes:
- added code to flatten the namespace (used #20050 as a template)
- added a unit test for the change
- added the missing `test_imports` for the `chat_loaders` and
`chat_message_histories` modules
This PR makes `request_timeout` and `max_retries` configurable for
ChatAnthropic.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
- [ ] **PR title**: "community: Add semantic caching and memory using
MongoDB"
- [ ] **PR message**:
- **Description:** This PR introduces functionality for adding semantic
caching and chat message history using MongoDB in RAG applications. By
leveraging the MongoDBCache and MongoDBChatMessageHistory classes,
developers can now enhance their retrieval-augmented generation
applications with efficient semantic caching mechanisms and persistent
conversation histories, improving response times and consistency across
chat sessions.
- **Issue:** N/A
- **Dependencies:** Requires `datasets`, `langchain`,
`langchain-mongodb`, `langchain-openai`, `pymongo`, and `pandas` for
implementation. MongoDB Atlas is used for database services, and the
OpenAI API for model access.
- **Twitter handle:** @richmondalake
Co-authored-by: Erick Friis <erick@langchain.dev>
Issue:
When `async_req` is `True` (the default), the Pinecone client returns a
multiprocessing `AsyncResult` object.
When `async_req` is set to `False`, the Pinecone client returns the result
directly, e.g. `[{'upserted_count': 1}]`. Calling the `get()` method throws an
error in this case.
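One way to handle both return shapes is sketched below. This is an assumption about how such a check could look, not the change made in this PR; `collect_results` is a hypothetical helper, and a `ThreadPool` stands in for the Pinecone client.

```python
from multiprocessing.pool import AsyncResult, ThreadPool
from typing import Any, List


def collect_results(responses: List[Any]) -> List[Any]:
    """Resolve pending results; pass already-materialized ones through."""
    # isinstance is used instead of hasattr(r, "get") because a plain dict
    # also has a .get method, and that one requires a key argument.
    return [r.get() if isinstance(r, AsyncResult) else r for r in responses]


with ThreadPool(1) as pool:
    pending = pool.apply_async(lambda: {"upserted_count": 1})
    resolved = collect_results([pending, {"upserted_count": 1}])
```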
- **Description:** Langchain-Predibase integration was failing, because
it was not current with the Predibase SDK; in addition, Predibase
integration tests were instantiating the Langchain Community `Predibase`
class with one required argument (`model`) missing. This change updates
the Predibase SDK usage and fixes the integration tests.
- **Twitter handle:** `@alexsherstinsky`
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
Last year Microsoft [changed the
name](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search)
of Azure Cognitive Search to Azure AI Search. This PR updates the
Langchain Azure Retriever API and its associated docs to reflect this
change. It may be confusing for users to see the name Cognitive here and
AI in the Microsoft documentation which is why this is needed. I've also
added a more detailed example to the Azure retriever doc page.
There are more places that need a similar update but I'm breaking it up
so the PRs are not too big 😄 Fixing my errors from the previous PR.
Twitter: @marlene_zw
Two new tests added to test backward compatibility in
`libs/community/tests/integration_tests/retrievers/test_azure_cognitive_search.py`
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- **Description:** update langchain anthropic templates to support
Claude 3 (iterative search, chain of note, summarization, and XML
response)
- **Issue:** issue # N/A. Stability issues and errors encountered when
trying to use older langchain and anthropic libraries.
- **Dependencies:**
- langchain_anthropic version 0.1.4
- anthropic package version in the range ">=0.17.0,<1" to support
langchain_anthropic.
- **Twitter handle:** @d_w_b7
- [x] **Add tests and docs**: used the instructions in the README for testing
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
After this PR it will be possible to pass a cache instance directly to a
language model. This is useful to allow different language models to use
different caches if needed.
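The idea of per-model caches can be illustrated with toy classes. `DictCache` and `ToyModel` are hypothetical and say nothing about the actual LangChain cache interface; they only show two models that each consult their own cache instance.

```python
from typing import Callable, Dict, Optional


class DictCache:
    """Hypothetical minimal cache keyed by prompt."""

    def __init__(self) -> None:
        self.store: Dict[str, str] = {}


class ToyModel:
    """Hypothetical model wrapper that accepts its own cache instance."""

    def __init__(self, generate: Callable[[str], str],
                 cache: Optional[DictCache] = None) -> None:
        self._generate = generate
        self.cache = cache
        self.calls = 0  # number of real (uncached) generations

    def invoke(self, prompt: str) -> str:
        if self.cache is not None and prompt in self.cache.store:
            return self.cache.store[prompt]
        self.calls += 1
        result = self._generate(prompt)
        if self.cache is not None:
            self.cache.store[prompt] = result
        return result


upper = ToyModel(lambda p: p.upper(), cache=DictCache())
reverse = ToyModel(lambda p: p[::-1], cache=DictCache())
upper.invoke("ping")
upper_again = upper.invoke("ping")       # served from upper's own cache
reverse_answer = reverse.invoke("ping")  # unaffected by upper's cache
```

With a single shared cache keyed only by prompt, the second model would have returned the first model's answer.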
- **Issue:** closes #19276
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
- Added missed providers
- Added links, descriptions in related examples
- Formatted in a consistent format
Co-authored-by: Erick Friis <erick@langchain.dev>
Updated a page with existing document loaders with links to examples.
Fixed formatting of one example.
Co-authored-by: Erick Friis <erick@langchain.dev>
Issue: The `graph` code was moved into the `community` package long
ago, but the related documentation is still in the
[use_cases](https://python.langchain.com/docs/use_cases/graph/integrations/diffbot_graphtransformer)
section and not in `integrations`.
Changes:
- moved the `use_cases/graph/integrations` notebooks into the
`integrations/graphs`
- renamed files and changed titles to follow the consistent format
- redirected old page URLs to new URLs in `vercel.json` and in several
other pages
- added descriptions and links when necessary
- formatted into the consistent format
Should hopefully avoid weird broken link edge cases.
Relative links now trip up the Docusaurus broken link checker, so this
PR also removes them.
Also snuck in a small addition about asyncio
**Description:**
The `LocalFileStore` class can be used to create an on-disk
`CacheBackedEmbeddings` cache. However, the default `umask` settings
gives file/directory write permissions only to the original user. Once
the cache directory is created by the first user, other users cannot
write their own cache entries into the directory.
To make the cache usable by multiple users, this pull request updates
the `LocalFileStore` constructor to allow the permissions for newly
created directories and files to be specified. The specified permissions
override the default `umask` values.
For example, when configured as follows:
```python
file_store = LocalFileStore(temp_dir, chmod_dir=0o770, chmod_file=0o660)
```
then "user" and "group" (but not "other") have permissions to access the
store, which means:
* Anyone in our group could contribute embeddings to the cache.
* If we implement cache cleanup/eviction in the future, anyone in our
group could perform the cleanup.
The default value of the `chmod_dir` and `chmod_file` parameters is
`None`, which retains the original behavior of using the default `umask`
settings.
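How explicit modes override the `umask` can be sketched with the standard library alone. `write_entry` is a hypothetical helper that borrows the `chmod_dir` and `chmod_file` parameter names; it is not the `LocalFileStore` implementation, and it assumes a POSIX file system.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def write_entry(root: Path, key: str, value: bytes,
                chmod_dir: Optional[int] = None,
                chmod_file: Optional[int] = None) -> Path:
    """Write one cache entry, optionally forcing directory/file permissions."""
    path = root / key
    path.parent.mkdir(parents=True, exist_ok=True)
    if chmod_dir is not None:
        # os.chmod sets the mode exactly, unlike the mode argument of mkdir,
        # which is still filtered through the process umask.
        os.chmod(path.parent, chmod_dir)
    path.write_bytes(value)
    if chmod_file is not None:
        os.chmod(path, chmod_file)
    return path


with tempfile.TemporaryDirectory() as tmp:
    entry = write_entry(Path(tmp) / "cache", "abc", b"\x00\x01",
                        chmod_dir=0o770, chmod_file=0o660)
    dir_mode = entry.parent.stat().st_mode & 0o777
    file_mode = entry.stat().st_mode & 0o777
```

Leaving both parameters as `None` skips the `os.chmod` calls, so the resulting modes are whatever the process `umask` allows.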
**Issue:**
Implements enhancement #18075.
**Testing:**
I updated the `LocalFileStore` unit tests to test the permissions.
---------
Signed-off-by: chrispy <chrispy@synopsys.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
- **Description:** Adds the async variants `afrom_texts` and
`afrom_embeddings` to `OpenSearchVectorSearch`, which allows
`afrom_documents` to be called.
- **Issue:** I implemented this because my use case involves an async
scraper generating documents as and when they're ready to be ingested by
Embedding/OpenSearch
- **Dependencies:** None that I'm aware of
Co-authored-by: Ben Mitchell <b.mitchell@reply.com>
This PR supports using Pydantic v2 objects to generate the schema for
the JSONOutputParser (#19441). This also adds a `json_schema` parameter
to allow users to pass any JSON schema to validate with, not just
pydantic.
core/langchain_core/_api[Patch]: mypy ignore fixes #17048
Related to #17048
Applied mypy fixes to the following two files:
libs/core/langchain_core/_api/deprecation.py
libs/core/langchain_core/_api/beta_decorator.py
Summary of Fixes:
**Issue 1**
`class _deprecated_property(type(obj)):  # type: ignore`
error: `Unsupported dynamic base class "type" [misc]`
Fix:
1. Added an __init__ method to _deprecated_property to initialize the
fget, fset, fdel, and __doc__ attributes.
2. In the __get__, __set__, and __delete__ methods, we now use the
self.fget, self.fset, and self.fdel attributes to call the original
methods after emitting the warning.
3. The finalize function now creates an instance of _deprecated_property
with the fget, fset, fdel, and doc attributes from the original obj
property.
**Issue 2**
`def finalize(wrapper: Callable[..., Any], new_doc: str) -> T:  # type: ignore`
error: `All conditional function variants must have identical signatures`
Fix: Ensured that both definitions of the finalize function have the
same signature
Twitter Handle -
https://x.com/gupteutkarsha?s=11&t=uwHe4C3PPpGRvoO5Qpm1aA
**Description:** Citations are the main addition in this PR. We now emit
them from the multihop agent! Additionally the agent is now more
flexible with observations (`Any` is now accepted), and the Cohere SDK
version is bumped to fix an issue with the most recent version of
pydantic v1 (1.10.15)
- **Description:** In order to use `index` and `aindex` from
libs/langchain/langchain/indexes/_api.py, I implemented the delete method
and all async methods in opensearch_vector_search
- **Dependencies:** No changes
- **Description:** Improvement for #19599: fixes the missing return value of
`graph.draw_mermaid_png` and makes saving the rendered
image optional
Co-authored-by: Angel Igareta <angel.igareta@klarna.com>
**Description:** Update of Cohere documentation (main provider page)
**Issue:** After addition of the Cohere partner package, the
documentation was out of date
**Dependencies:** None
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
- [ ] **PR title**: "community: deprecating integrations moved to
langchain_google_community"
- [ ] **PR message**: deprecating integrations moved to
langchain_google_community
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
Removes required usage of `requests` from `langchain-core`, all of which
has been deprecated.
- removes Tracer V1 implementations
- removes old `try_load_from_hub` github-based hub implementations
Removal done in a way where imports will still succeed, and usage will
fail with a `RuntimeError`.
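The "import succeeds, usage fails" approach can be sketched with placeholders like these. `RemovedTracer` and `removed_loader` are hypothetical names chosen for illustration, not the objects removed in this PR.

```python
from typing import Any


class RemovedTracer:
    """Hypothetical placeholder: importable, but unusable."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        raise RuntimeError("RemovedTracer has been removed; use a supported tracer.")


def removed_loader(*args: Any, **kwargs: Any) -> Any:
    """Hypothetical placeholder for a removed helper function."""
    raise RuntimeError("removed_loader has been removed and can no longer be used.")


# Referencing the names still works, so old import statements do not break.
still_importable = (RemovedTracer, removed_loader)
```

Code that only imports these names keeps working, while code that actually calls them gets a clear error at the point of use.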
- [x] **PR message**:
- **Description:** mention not-caching methods in CacheBackedEmbeddings
- **Issue:** n/a I almost created one until I read the code
- **Dependencies:** n/a
- **Twitter handle:** `tarsylia`
**Description**: Improves the stability of all Cohere partner package
integration tests. Fixes a bug with document parsing (both dicts and
Documents are handled).
**Description**: This PR simplifies an integration test within the
Cohere partner package:
* It no longer relies on exact model answers
* It no longer relies on a third party tool
This PR completes work for PR #18798 to expose raw tool output in
on_tool_end.
Affected APIs:
* astream_log
* astream_events
* callbacks sent to langsmith via langsmith-sdk
* Any other code that relies on BaseTracer!
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- This ensures ids are stable across streamed chunks
- Multiple messages in batch call get separate ids
- Also fix ids being dropped when combining message chunks
- **Description:** add `remove_comments` option (default: True): do not
extract html _comments_,
- **Issue:** None,
- **Dependencies:** None,
- **Tag maintainer:** @nfcampos ,
- **Twitter handle:** peter_v
I ran `make format`, `make lint` and `make test`.
Discussion: In my use case, I prefer not to have the comments in the
extracted text:
* e.g. from a Google tag that is added in the html as a comment
* e.g. content that the authors have temporarily hidden to make it
invisible to the regular reader
Removing the comments brings the extracted text closer to the text the
reader is intended to see.
**Choice to make:** do we prefer to make the default for this
`remove_comments` option to be True or False?
I have changed it to True in a second commit, since that is how I would
prefer to use it by default: it yields cleaned text (without technical
Google tags etc.) that is closer to the actually visible and intended
content.
I am not sure what is best aligned with the conventions of langchain in
general ...
INITIAL VERSION (new version above):
~**Choice to make:** do we prefer to make the default for this
`ignore_comments` option to be True or False?
I have set it to False now to be backwards compatible. On the other
hand, I would use it mostly with True.
I am not sure what is best aligned with the conventions of langchain in
general ...~
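To make the effect of the `remove_comments` option concrete, here is a minimal sketch using only the standard library's `html.parser`. It is not the code from this PR: the `TextExtractor` class and `extract_text` helper are hypothetical names, and only the option name and its default of True are taken from the description above.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text, optionally including HTML comments."""

    def __init__(self, remove_comments: bool = True) -> None:
        super().__init__()
        self.remove_comments = remove_comments
        self.parts: List[str] = []

    def handle_data(self, data: str) -> None:
        # Regular text nodes are always kept.
        text = data.strip()
        if text:
            self.parts.append(text)

    def handle_comment(self, data: str) -> None:
        # Comment bodies are kept only when remove_comments is False.
        if not self.remove_comments:
            text = data.strip()
            if text:
                self.parts.append(text)


def extract_text(html: str, remove_comments: bool = True) -> str:
    parser = TextExtractor(remove_comments=remove_comments)
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


sample = "<p>Visible text</p><!-- Google tag --><p>More text</p>"
cleaned = extract_text(sample)                        # comments dropped
raw = extract_text(sample, remove_comments=False)     # comments kept
```

With the default, the Google tag comment never reaches the extracted text; passing `remove_comments=False` keeps it.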
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
As in #19346, this PR exposes `request_timeout` in `BaseCohere`.
`max_retries` is no longer a parameter of the underlying client
(`cohere.Client`) and is already configured in
`langchain_cohere.llms.Cohere`.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** the layout of HTML pages can vary depending on the
bootstrap framework or the styles of the pages. So we need a splitter
that transforms the HTML tags into a proper layout and then splits the
HTML content based on the provided list of tags to determine its
sections. We use the BS4 library along with an XSLT structure to split
the HTML content using a section-aware approach.
- **Dependencies:** No new dependencies
- **Twitter handle:** @m_setayesh
-->
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
[Dria](https://dria.co/) is a hub of public RAG models for developers to
both contribute and utilize a shared embedding lake. This PR adds a
retriever that can retrieve documents from Dria.
- **Description:** Fix argument translation from OpenAPI spec to OpenAI
function call (and similar)
- **Issue:** OpenGPTs failures with calling Action Server based actions.
- **Dependencies:** None
- **Twitter handle:** mikkorpela
Description: Update `ChatZhipuAI` to support the latest `glm-4` model.
Issue: N/A
Dependencies: httpx, httpx-sse, PyJWT
The previous `ChatZhipuAI` implementation requires the `zhipuai`
package, and cannot call the latest GLM model. This is because
- The old version `zhipuai==1.*` doesn't support the latest model.
- `zhipuai==2.*` requires `pydantic V2`, which is incompatible with
'langchain-community'.
This re-implementation invokes the GLM model by sending HTTP requests to
[open.bigmodel.cn](https://open.bigmodel.cn/dev/api) via the `httpx`
package, and uses the `httpx-sse` package to handle stream events.
---------
Co-authored-by: zR <2448370773@qq.com>
- **Description:** Add functionality to generate Mermaid syntax and
render flowcharts from graph data. This includes support for custom node
colors and edge curve styles, as well as the ability to export the
generated graphs to PNG images using either the Mermaid.INK API or
Pyppeteer for local rendering.
- **Dependencies:** `pyppeteer` is an optional dependency, needed only
if rendering is done locally using Pyppeteer and JavaScript code.
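As a rough illustration of generating Mermaid syntax from graph data, the sketch below turns a node and edge list into flowchart text. The `to_mermaid` function, its parameters, and the sample graph are hypothetical and do not reflect the API added by this PR; rendering to PNG is left out.

```python
from typing import Dict, List, Optional, Tuple


def to_mermaid(
    nodes: Dict[str, str],
    edges: List[Tuple[str, str]],
    node_colors: Optional[Dict[str, str]] = None,
) -> str:
    """Build Mermaid flowchart text from node labels, edges, and fill colors."""
    lines = ["graph TD"]
    for node_id, label in nodes.items():
        lines.append(f"    {node_id}[{label}]")
    for source, target in edges:
        lines.append(f"    {source} --> {target}")
    for node_id, color in (node_colors or {}).items():
        lines.append(f"    style {node_id} fill:{color}")
    return "\n".join(lines)


diagram = to_mermaid(
    nodes={"a": "Start", "b": "End"},
    edges=[("a", "b")],
    node_colors={"a": "#f9f"},
)
```

The resulting string can be pasted into any Mermaid renderer to check the layout.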
---------
Co-authored-by: Angel Igareta <angel.igareta@klarna.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
**Description:** The instructions for installing the pip packages for
the MediaWiki Dump Document loader contained an extra `U` argument,
which caused the installation to fail. Removing the argument fixes the
install command.
**Issue:** #19820
**Dependencies:** No dependency change required
**Twitter handle:** [@vardhaman722](https://twitter.com/vardhaman722)
* Replace `source_documents` with `documents`
* Pass `documents` as a named arg vs keyword
* Make `parsed_docs` more robust
* Fix edge case of doc page_content being `None`
- **Updating Together.ai Endpoint**: "langchain_together: Updated
Deprecated endpoint for partner package"
- Description: Together's inference API is deprecated, so it was
replaced with the completions API and the corresponding changes were
made.
- Twitter handle: @dev_yashmathur
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Add attribution_token within
GoogleVertexAISearchRetriever so user can provide this information to
Google support team or product team during debug session.
Reference:
https://cloud.google.com/generative-ai-app-builder/docs/view-analytics#user-events
Attribution tokens. Attribution tokens are unique IDs generated by
Vertex AI Search and returned with each search request. Make sure to
include that attribution token as UserEvent.attributionToken with any
user events resulting from a search. This is needed to identify if a
search is served by the API. Only user events with a Google-generated
attribution token are used to compute metrics.
- **Issue:** No
- **Dependencies:** No
- **Twitter handle:** abehsu1992626
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Support reranking based on cross encoder models
available from HuggingFace.
- Added `CrossEncoder` schema
- Implemented `HuggingFaceCrossEncoder` and
`SagemakerEndpointCrossEncoder`
- Implemented `CrossEncoderReranker` that performs similar functionality
to `CohereRerank`
- Added `cross-encoder-reranker.ipynb` to demonstrate how to use it.
Please let me know if anything else needs to be done to make it visible
on the table-of-contents navigation bar on the left, or on the card list
on [retrievers documentation
page](https://python.langchain.com/docs/integrations/retrievers).
- **Issue:** N/A
- **Dependencies:** None other than the existing ones.
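The reranking idea described above can be sketched without any model: score each (query, document) pair, sort by score, and keep the top results. The `rerank` function and the word-overlap scorer below are hypothetical stand-ins, not the `CrossEncoderReranker` implementation; a real cross encoder would replace `overlap_scorer`.

```python
from typing import Callable, List, Sequence, Tuple

PairScorer = Callable[[List[Tuple[str, str]]], List[float]]


def overlap_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    """Toy stand-in for a cross encoder: counts words shared by query and doc."""
    return [
        float(len(set(query.lower().split()) & set(doc.lower().split())))
        for query, doc in pairs
    ]


def rerank(
    query: str,
    documents: Sequence[str],
    score_pairs: PairScorer,
    top_n: int = 3,
) -> List[str]:
    """Score every (query, document) pair and return the top_n documents."""
    scores = score_pairs([(query, doc) for doc in documents])
    ranked = sorted(zip(documents, scores), key=lambda item: item[1], reverse=True)
    return [doc for doc, _ in ranked[:top_n]]


docs = ["cats sleep a lot", "dogs bark loudly", "why do cats purr"]
best = rerank("why do cats purr so much", docs, overlap_scorer, top_n=2)
```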
---------
Co-authored-by: Kenny Choe <kchoe@amazon.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description: Video imagery to text (Closed Captioning)
This pull request introduces the VideoCaptioningChain, a tool for
automated video captioning. It processes audio and video to generate
subtitles and closed captions, merging them into a single SRT output.
Issue: https://github.com/langchain-ai/langchain/issues/11770
Dependencies: opencv-python, ffmpeg-python, assemblyai, transformers,
pillow, torch, openai
Tag maintainer:
@baskaryan
@hwchase17
Hello! We are a group of students from the University of Toronto
(@LunarECL, @TomSadan, @nicoledroi1, @A2113S) who want to make a
contribution to the LangChain community! We have run `make format`,
`make lint` and `make test` locally before submitting the PR. To our
knowledge, our changes do not introduce any new errors.
Thank you for taking the time to review our PR!
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description
This implementation adds functionality from the AlphaVantage API,
renowned for its comprehensive financial data. The class encapsulates
various methods, each dedicated to fetching specific types of financial
information from the API.
### Implemented Functions
- **`search_symbols`**:
- Searches the AlphaVantage API for financial symbols using the provided
keywords.
- **`_get_market_news_sentiment`**:
- Retrieves market news sentiment for a specified stock symbol from the
AlphaVantage API.
- **`_get_time_series_daily`**:
- Fetches daily time series data for a specific symbol from the
AlphaVantage API.
- **`_get_quote_endpoint`**:
- Obtains the latest price and volume information for a given symbol
from the AlphaVantage API.
- **`_get_time_series_weekly`**:
- Gathers weekly time series data for a particular symbol from the
AlphaVantage API.
- **`_get_top_gainers_losers`**:
- Provides details on top gainers, losers, and most actively traded
tickers in the US market from the AlphaVantage API.
### Issue:
- #11994
### Dependencies:
- 'requests' library for HTTP requests. (import requests)
- 'pytest' library for testing. (import pytest)
---------
Co-authored-by: Adam Badar <94140103+adam-badar@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Langchain-Predibase integration was failing, because
it was not current with the Predibase SDK; in addition, Predibase
integration tests were instantiating the Langchain Community `Predibase`
class with one required argument (`model`) missing. This change updates
the Predibase SDK usage and fixes the integration tests.
- **Twitter handle:** `@alexsherstinsky`
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Code written by following the official documentation
of [Google Drive
Loader](https://python.langchain.com/docs/integrations/document_loaders/google_drive)
gives errors. I have opened an issue regarding this; see #14725. This
pull request modifies the documentation to use an approach that makes
the code work. The change is that we need to always set the
GOOGLE_APPLICATION_CREDENTIALS env var to an empty string, rather than
only in case of RefreshError. Also rewrote 2 paragraphs to make the
instructions clearer.
- **Issue:** See this related [issue #
14725](https://github.com/langchain-ai/langchain/issues/14725)
- **Dependencies:** NA
- **Tag maintainer:** @baskaryan
- **Twitter handle:** NA
Co-authored-by: Snehil <snehil@example.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Thank you for contributing to LangChain!
- [x] **PR title**: "community: added support for llmsherpa library"
- [x] **Add tests and docs**:
1. Integration test:
'docs/docs/integrations/document_loaders/test_llmsherpa.py'.
2. an example notebook:
`docs/docs/integrations/document_loaders/llmsherpa.ipynb`.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
This PR also drops the community-added action for checking broken links
in MDX. It does not work well for our use case: it throws errors for
local paths, in addition to the errors our in-house solution had.
# Description
Implementing `_combine_llm_outputs` to `ChatMistralAI` to override the
default implementation in `BaseChatModel` returning `{}`. The
implementation is inspired by the one in `ChatOpenAI` from package
`langchain-openai`.
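A minimal sketch of what combining LLM outputs typically means here: summing the token counts reported by each call instead of returning `{}`. The standalone function and the `token_usage` key layout below are assumptions for illustration, not the code added to `ChatMistralAI`.

```python
from typing import Any, Dict, List, Optional


def combine_llm_outputs(
    llm_outputs: List[Optional[Dict[str, Any]]],
) -> Dict[str, Any]:
    """Sum per-call token usage into one dict (assumed 'token_usage' layout)."""
    overall: Dict[str, int] = {}
    for output in llm_outputs:
        if output is None:
            # A call may report no output at all, e.g. for a cached result.
            continue
        for key, value in output.get("token_usage", {}).items():
            overall[key] = overall.get(key, 0) + value
    return {"token_usage": overall}


combined = combine_llm_outputs(
    [
        {"token_usage": {"prompt_tokens": 10, "completion_tokens": 5}},
        None,
        {"token_usage": {"prompt_tokens": 7, "completion_tokens": 3}},
    ]
)
```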
# Issue
None
# Dependencies
None
# Twitter handle
None
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
This template utilizes Chroma and TGI (Text Generation Inference) to
execute RAG on the Intel Xeon Scalable Processors. It serves as a
demonstration for users, illustrating the deployment of the RAG service
on the Intel Xeon Scalable Processors and showcasing the resulting
performance enhancements.
**Issue:**
None
**Dependencies:**
The template contains the poetry project requirements to run this
template.
CPU TGI batching is WIP.
**Twitter handle:**
None
---------
Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** We'd like to support passing additional kwargs in
`with_structured_output`. I believe this is the accepted approach to
enable additional arguments on API calls.
- **Description:** Haskell language support added in text_splitter
module
- **Dependencies:** No
- **Twitter handle:** @nisargtr
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** PR adds support for limiting number of messages
preserved in a session history for DynamoDBChatMessageHistory
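The limiting behaviour can be pictured as keeping only the most recent messages before they are written back. The sketch below is a plain-list illustration; the `trim_history` helper and the `history_size` parameter name are assumptions and no DynamoDB call is made.

```python
from typing import List, Optional


def trim_history(messages: List[str], history_size: Optional[int] = None) -> List[str]:
    """Keep only the last history_size messages; None means no limit."""
    if history_size is None:
        return list(messages)
    if history_size <= 0:
        # messages[-0:] would return everything, so handle zero explicitly.
        return []
    return list(messages[-history_size:])


history = ["hi", "hello", "how are you?", "fine, thanks"]
recent = trim_history(history, history_size=2)
```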
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Subject: Fix Type Misdeclaration for index_schema in redis/base.py
I noticed a type misdeclaration for the index_schema column in the
redis/base.py file.
When following the instructions outlined in [Redis Custom Metadata
Indexing](https://python.langchain.com/docs/integrations/vectorstores/redis)
to create our own index_schema, it leads to a Pylance type error. <br/>
**The error message indicates that Dict[str, list[Dict[str, str]]] is
incompatible with the type Optional[Union[Dict[str, str], str,
os.PathLike]].**
```
index_schema = {
"tag": [{"name": "credit_score"}],
"text": [{"name": "user"}, {"name": "job"}],
"numeric": [{"name": "age"}],
}
rds, keys = Redis.from_texts_return_keys(
texts,
embeddings,
metadatas=metadata,
redis_url="redis://localhost:6379",
index_name="users_modified",
index_schema=index_schema,
)
```
Therefore, I have created this pull request to rectify the type
declaration problem.
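One way to express a type that accepts the schema shown above is to let the dictionary values be lists of dictionaries. The alias names `ListOfDict` and `IndexSchema` and the `field_names` helper are illustrative assumptions; the exact declaration in `redis/base.py` may differ.

```python
import os
from typing import Dict, List, Optional, Union

# Each field kind ("tag", "text", "numeric") maps to a list of field specs.
ListOfDict = List[Dict[str, str]]
IndexSchema = Optional[Union[Dict[str, ListOfDict], str, os.PathLike]]


def field_names(index_schema: IndexSchema) -> List[str]:
    """List 'kind:name' entries for a dict schema; paths and None yield []."""
    if not isinstance(index_schema, dict):
        return []
    return [
        f"{kind}:{field['name']}"
        for kind, fields in index_schema.items()
        for field in fields
    ]


schema: IndexSchema = {
    "tag": [{"name": "credit_score"}],
    "text": [{"name": "user"}, {"name": "job"}],
    "numeric": [{"name": "age"}],
}
names = field_names(schema)
```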
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
## Feature
- Set additional headers in constructor
- Headers will be sent in post request
This feature is useful if deploying Ollama on a cloud service such as
hugging face, which requires authentication tokens to be passed in the
request header.
## Tests
- Test if header is passed
- Test if header is not passed
Similar to https://github.com/langchain-ai/langchain/pull/15881
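A small sketch of the header handling described above: user-supplied headers are merged into the defaults that go out with the POST request. The `build_headers` helper and the default `Content-Type` entry are assumptions for illustration, not the Ollama integration's actual code.

```python
from typing import Dict, Optional


def build_headers(extra_headers: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Merge optional user headers into the default request headers."""
    headers = {"Content-Type": "application/json"}
    if extra_headers:
        # User-supplied values win, e.g. an authentication token.
        headers.update(extra_headers)
    return headers


with_auth = build_headers({"Authorization": "Bearer <token>"})
without_auth = build_headers()
```

The two calls mirror the two tests listed above: one with a header passed and one without.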
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
If `prompt` is passed into `create_sql_agent()`, then
`toolkit.get_context()` shouldn't be executed against the database
unless relevant prompt variables (`table_info` or `table_names`) are
present.
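The rule above can be sketched as a check on the prompt's input variables before touching the database. `FakeToolkit` and `resolve_context` are hypothetical names used only to show the control flow; the real logic lives inside `create_sql_agent()`.

```python
from typing import Any, Dict, Iterable


class FakeToolkit:
    """Hypothetical stand-in for a SQL toolkit; counts database lookups."""

    def __init__(self) -> None:
        self.calls = 0

    def get_context(self) -> Dict[str, Any]:
        self.calls += 1  # a real toolkit would query the database here
        return {"table_info": "CREATE TABLE users (...)", "table_names": "users"}


def resolve_context(toolkit: FakeToolkit, input_variables: Iterable[str]) -> Dict[str, Any]:
    """Fetch database context only if the prompt has a variable that uses it."""
    wanted = {"table_info", "table_names"} & set(input_variables)
    if not wanted:
        return {}
    context = toolkit.get_context()
    return {key: context[key] for key in wanted}


toolkit = FakeToolkit()
no_context = resolve_context(toolkit, ["input", "agent_scratchpad"])
calls_after_first = toolkit.calls
some_context = resolve_context(toolkit, ["input", "table_names"])
```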
Thank you for contributing to LangChain!
- [x] **PR title**: "community: Implement DirectoryLoader lazy_load
function"
- [x] **Description**: The `lazy_load` function of the `DirectoryLoader`
yields each document separately. If the given `loader_cls` of the
`DirectoryLoader` also implemented `lazy_load`, it will be used to yield
subdocuments of the file.
- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access:
`libs/community/tests/unit_tests/document_loaders/test_directory_loader.py`
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory:
`docs/docs/integrations/document_loaders/directory.ipynb`
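The `lazy_load` behaviour described in this PR can be sketched as a generator that defers to the per-file loader's own `lazy_load` when it exists. The loader classes, the in-memory `FILES` mapping, and `lazy_load_directory` below are hypothetical and stand in for `DirectoryLoader` and real files.

```python
from typing import Dict, Iterator, List, Sequence, Type

# In-memory stand-in for files on disk, so the sketch needs no filesystem.
FILES: Dict[str, str] = {"a.txt": "one\ntwo", "b.txt": "three"}


class WholeFileLoader:
    """Hypothetical loader that only implements load()."""

    def __init__(self, path: str) -> None:
        self.path = path

    def load(self) -> List[str]:
        return [FILES[self.path]]


class LineLoader(WholeFileLoader):
    """Hypothetical loader that also implements lazy_load(), one line at a time."""

    def lazy_load(self) -> Iterator[str]:
        yield from FILES[self.path].splitlines()


def lazy_load_directory(
    paths: Sequence[str], loader_cls: Type[WholeFileLoader]
) -> Iterator[str]:
    """Yield documents one by one, using the loader's lazy_load if available."""
    for path in paths:
        loader = loader_cls(path)
        if hasattr(loader, "lazy_load"):
            yield from loader.lazy_load()
        else:
            yield from loader.load()


lazy_docs = list(lazy_load_directory(["a.txt", "b.txt"], LineLoader))
eager_docs = list(lazy_load_directory(["a.txt", "b.txt"], WholeFileLoader))
```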
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:**
When using the `SQLDatabaseChain` with the Llama2-70b LLM and a SQLite
database, I was getting `Warning: You can only execute one statement at
a time.`.
```
from langchain.sql_database import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain
sql_database_path = '/dccstor/mmdataretrieval/mm_dataset/swimming_record/rag_data/swimmingdataset.db'
sql_db = get_database(sql_database_path)
db_chain = SQLDatabaseChain.from_llm(mistral, sql_db, verbose=True, callbacks = [callback_obj])
db_chain.invoke({
"query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
})
```
Error:
```
Warning Traceback (most recent call last)
Cell In[31], line 3
1 import langchain
2 langchain.debug=False
----> 3 db_chain.invoke({
4 "query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
5 })
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:162, in Chain.invoke(self, input, config, **kwargs)
160 except BaseException as e:
161 run_manager.on_chain_error(e)
--> 162 raise e
163 run_manager.on_chain_end(outputs)
164 final_outputs: Dict[str, Any] = self.prep_outputs(
165 inputs, outputs, return_only_outputs
166 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:156, in Chain.invoke(self, input, config, **kwargs)
149 run_manager = callback_manager.on_chain_start(
150 dumpd(self),
151 inputs,
152 name=run_name,
153 )
154 try:
155 outputs = (
--> 156 self._call(inputs, run_manager=run_manager)
157 if new_arg_supported
158 else self._call(inputs)
159 )
160 except BaseException as e:
161 run_manager.on_chain_error(e)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:198, in SQLDatabaseChain._call(self, inputs, run_manager)
194 except Exception as exc:
195 # Append intermediate steps to exception, to aid in logging and later
196 # improvement of few shot prompt seeds
197 exc.intermediate_steps = intermediate_steps # type: ignore
--> 198 raise exc
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:143, in SQLDatabaseChain._call(self, inputs, run_manager)
139 intermediate_steps.append(
140 sql_cmd
141 ) # output: sql generation (no checker)
142 intermediate_steps.append({"sql_cmd": sql_cmd}) # input: sql exec
--> 143 result = self.database.run(sql_cmd)
144 intermediate_steps.append(str(result)) # output: sql exec
145 else:
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:436, in SQLDatabase.run(self, command, fetch, include_columns)
425 def run(
426 self,
427 command: str,
428 fetch: Literal["all", "one"] = "all",
429 include_columns: bool = False,
430 ) -> str:
431 """Execute a SQL command and return a string representing the results.
432
433 If the statement returns rows, a string of the results is returned.
434 If the statement returns no rows, an empty string is returned.
435 """
--> 436 result = self._execute(command, fetch)
438 res = [
439 {
440 column: truncate_word(value, length=self._max_string_length)
(...)
443 for r in result
444 ]
446 if not include_columns:
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:413, in SQLDatabase._execute(self, command, fetch)
410 elif self.dialect == "postgresql": # postgresql
411 connection.exec_driver_sql("SET search_path TO %s", (self._schema,))
--> 413 cursor = connection.execute(text(command))
414 if cursor.returns_rows:
415 if fetch == "all":
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1416, in Connection.execute(self, statement, parameters, execution_options)
1414 raise exc.ObjectNotExecutableError(statement) from err
1415 else:
-> 1416 return meth(
1417 self,
1418 distilled_parameters,
1419 execution_options or NO_OPTIONS,
1420 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/sql/elements.py:516, in ClauseElement._execute_on_connection(self, connection, distilled_params, execution_options)
514 if TYPE_CHECKING:
515 assert isinstance(self, Executable)
--> 516 return connection._execute_clauseelement(
517 self, distilled_params, execution_options
518 )
519 else:
520 raise exc.ObjectNotExecutableError(self)
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1639, in Connection._execute_clauseelement(self, elem, distilled_parameters, execution_options)
1627 compiled_cache: Optional[CompiledCacheType] = execution_options.get(
1628 "compiled_cache", self.engine._compiled_cache
1629 )
1631 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
1632 dialect=dialect,
1633 compiled_cache=compiled_cache,
(...)
1637 linting=self.dialect.compiler_linting | compiler.WARN_LINTING,
1638 )
-> 1639 ret = self._execute_context(
1640 dialect,
1641 dialect.execution_ctx_cls._init_compiled,
1642 compiled_sql,
1643 distilled_parameters,
1644 execution_options,
1645 compiled_sql,
1646 distilled_parameters,
1647 elem,
1648 extracted_params,
1649 cache_hit=cache_hit,
1650 )
1651 if has_events:
1652 self.dispatch.after_execute(
1653 self,
1654 elem,
(...)
1658 ret,
1659 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1848, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw)
1843 return self._exec_insertmany_context(
1844 dialect,
1845 context,
1846 )
1847 else:
-> 1848 return self._exec_single_context(
1849 dialect, context, statement, parameters
1850 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1988, in Connection._exec_single_context(self, dialect, context, statement, parameters)
1985 result = context._setup_result_proxy()
1987 except BaseException as e:
-> 1988 self._handle_dbapi_exception(
1989 e, str_statement, effective_parameters, cursor, context
1990 )
1992 return result
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:2346, in Connection._handle_dbapi_exception(self, e, statement, parameters, cursor, context, is_sub_exec)
2344 else:
2345 assert exc_info[1] is not None
-> 2346 raise exc_info[1].with_traceback(exc_info[2])
2347 finally:
2348 del self._reentrant_error
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1969, in Connection._exec_single_context(self, dialect, context, statement, parameters)
1967 break
1968 if not evt_handled:
-> 1969 self.dialect.do_execute(
1970 cursor, str_statement, effective_parameters, context
1971 )
1973 if self._has_events or self.engine._has_events:
1974 self.dispatch.after_cursor_execute(
1975 self,
1976 cursor,
(...)
1980 context.executemany,
1981 )
File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/default.py:922, in DefaultDialect.do_execute(self, cursor, statement, parameters, context)
921 def do_execute(self, cursor, statement, parameters, context=None):
--> 922 cursor.execute(statement, parameters)
Warning: You can only execute one statement at a time.
```
**Issue:**
The error occurs because, when generating the SQL query, `llm_inputs`
includes the stop sequence "\nSQLResult:". For this user query the
LLM-generated response is **SELECT Time FROM men_butterfly_100m
WHERE Swimmer = 'Lance Larson';\nSQLResult:**, so the SQLResult suffix
must be removed from the LLM response before executing it on the
database.
```
llm_inputs = {
"input": input_text,
"top_k": str(self.top_k),
"dialect": self.database.dialect,
"table_info": table_info,
"stop": ["\nSQLResult:"],
}
sql_cmd = self.llm_chain.predict(
callbacks=_run_manager.get_child(),
**llm_inputs,
).strip()
if SQL_RESULT in sql_cmd:
sql_cmd = sql_cmd.split(SQL_RESULT)[0].strip()
result = self.database.run(sql_cmd)
```
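The suffix removal can be checked in isolation with the response quoted above. This is a standalone sketch: the value of `SQL_RESULT` is assumed to match the stop sequence, and `strip_sql_result` is a hypothetical helper rather than a function in `SQLDatabaseChain`.

```python
# Assumed to equal the stop sequence quoted above.
SQL_RESULT = "\nSQLResult:"


def strip_sql_result(sql_cmd: str) -> str:
    """Drop everything from the SQLResult marker onward, then trim whitespace."""
    if SQL_RESULT in sql_cmd:
        sql_cmd = sql_cmd.split(SQL_RESULT)[0]
    return sql_cmd.strip()


llm_response = (
    "SELECT Time FROM men_butterfly_100m WHERE Swimmer = 'Lance Larson';"
    "\nSQLResult:"
)
cleaned_sql = strip_sql_result(llm_response)
```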
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Description: Fix xml parser to handle strings that only contain the root
tag
Issue: N/A
Dependencies: None
Twitter handle: N/A
A valid xml text can contain only the root level tag. Example: <body>
Some text here
</body>
The example above is a valid xml string. If parsed with the current
implementation the result is {"body": []}. This fix checks if the root
level text contains any non-whitespace character and if that's the case
it returns {root.tag: root.text}. The result is that the above text is
correctly parsed as {"body": "Some text here"}
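The root-only case can be reproduced with the standard library's `xml.etree.ElementTree`. The `root_to_dict` helper below is a simplified, hypothetical version of the parser logic, and stripping the surrounding whitespace is an assumption made to match the stated result.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict


def root_to_dict(xml_text: str) -> Dict[str, Any]:
    """Parse XML; a root with only text maps to that text, else to its children."""
    root = ET.fromstring(xml_text)
    if len(root) == 0 and root.text and root.text.strip():
        # Root-only document: return the text instead of an empty child list.
        return {root.tag: root.text.strip()}
    return {root.tag: [{child.tag: child.text} for child in root]}


root_only = root_to_dict("<body>\nSome text here\n</body>")
with_children = root_to_dict("<body><a>1</a><b>2</b></body>")
```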
@ale-delfino
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
When testing Nomic embeddings --
```
from langchain_community.embeddings import LlamaCppEmbeddings
embd_model_path = "/Users/rlm/Desktop/Code/llama.cpp/models/nomic-embd/nomic-embed-text-v1.Q4_K_S.gguf"
embd_lc = LlamaCppEmbeddings(model_path=embd_model_path)
embedding_lc = embd_lc.embed_query(query)
```
We were seeing this error for strings > a certain size --
```
File ~/miniforge3/envs/llama2/lib/python3.9/site-packages/llama_cpp/llama.py:827, in Llama.embed(self, input, normalize, truncate, return_count)
824 s_sizes = []
826 # add to batch
--> 827 self._batch.add_sequence(tokens, len(s_sizes), False)
828 t_batch += n_tokens
829 s_sizes.append(n_tokens)
File ~/miniforge3/envs/llama2/lib/python3.9/site-packages/llama_cpp/_internals.py:542, in _LlamaBatch.add_sequence(self, batch, seq_id, logits_all)
540 self.batch.token[j] = batch[i]
541 self.batch.pos[j] = i
--> 542 self.batch.seq_id[j][0] = seq_id
543 self.batch.n_seq_id[j] = 1
544 self.batch.logits[j] = logits_all
ValueError: NULL pointer access
```
The default `n_batch` of llama-cpp-python's Llama is `512` but we were
explicitly setting it to `8`.
These need to be set to equal for embedding models.
* The embedding.cpp example has an assertion to make sure these are
always equal.
* Apparently this is not being done properly in llama-cpp-python.
With `n_batch` set to 8, if more than 8 tokens are passed the batch runs
out of space and it crashes.
This also explains why the CPU compute buffer size was small:
raw client with default `n_batch=512`
```
llama_new_context_with_model: CPU input buffer size = 3.51 MiB
llama_new_context_with_model: CPU compute buffer size = 21.00 MiB
```
langchain with `n_batch=8`
```
llama_new_context_with_model: CPU input buffer size = 0.04 MiB
llama_new_context_with_model: CPU compute buffer size = 0.33 MiB
```
We can work around this by passing `n_batch=512`, but this will not be
obvious to some users:
```
embedding = LlamaCppEmbeddings(model_path=embd_model_path,
n_batch=512)
```
From discussion w/ @cebtenzzre. Related:
https://github.com/abetlen/llama-cpp-python/issues/1189
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** The base URL for OpenAI is retrieved from the
environment variable "OPENAI_BASE_URL", whereas for langchain it is
obtained from "OPENAI_API_BASE". By adding `base_url =
os.environ.get("OPENAI_API_BASE")`, the OpenAI proxy can execute
correctly.
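A minimal sketch of the lookup described above. The helper name `resolve_openai_base_url` is hypothetical (it is not part of LangChain or the OpenAI SDK), and falling back to `OPENAI_BASE_URL` when `OPENAI_API_BASE` is unset is an assumption made for illustration; only the two environment variable names come from the description.

```python
import os
from typing import Mapping, Optional


def resolve_openai_base_url(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the proxy base URL, or None when neither variable is set.

    Hypothetical helper: LangChain reads OPENAI_API_BASE, while the OpenAI
    SDK reads OPENAI_BASE_URL. The fallback order here is an assumption.
    """
    env = os.environ if env is None else env
    return env.get("OPENAI_API_BASE") or env.get("OPENAI_BASE_URL")
```

Passing the mapping explicitly keeps the helper easy to test without mutating the real process environment.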
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Thank you for contributing to LangChain!
- **Description:** added unit tests for NotebookLoader. Linked PR:
https://github.com/langchain-ai/langchain/pull/17614
- **Issue:**
[#17614](https://github.com/langchain-ai/langchain/pull/17614)
- **Twitter handle:** @paulodoestech
---------
Co-authored-by: lachiewalker <lachiewalker1@hotmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Created a Langchain Tool for OpenAI DALLE Image
Generation.
**Issue:**
[#15901](https://github.com/langchain-ai/langchain/issues/15901)
**Dependencies:** n/a
**Twitter handle:** @paulodoestech
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** Adds checks in chat_models/base.py and llms/base.py
that raise an error when a call to the AI model fails.
**Issue:** A call to the AI model sometimes returns an error, and we
should raise it.
Otherwise, the next line, 'choices.extend(response["choices"])', throws
"TypeError: 'NoneType' object is not iterable" because
'response["choices"]' is None, which masks the true error.
**Dependencies:** None
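The failure mode can be illustrated with a small sketch. The function name `collect_choices` and the shape of the failed response (an `"error"` entry alongside `"choices": None`) are assumptions for illustration, not the actual code in chat_models/base.py or llms/base.py.

```python
from typing import Any, Dict, List


def collect_choices(choices: List[Any], response: Dict[str, Any]) -> None:
    """Append the response's choices, raising first if the call failed."""
    # Assumption: a failed call carries an "error" entry and "choices" is None.
    if response.get("error"):
        # Surface the real error instead of a misleading TypeError below.
        raise ValueError(response["error"])
    choices.extend(response["choices"])
```

Without the check, `choices.extend(None)` raises `TypeError: 'NoneType' object is not iterable`, which hides the message returned by the model service.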
---------
Co-authored-by: yangkx <yangkx@asiainfo-int.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
## PR message
**Description:** This PR adds a README file for the Together API in the
`libs/partners` folder of this repository. The README includes:
- A brief description of the package
- Installation instructions and class introductions
- Simple usage examples
**Issue:** #17545
This PR only contains document changes.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:**
1. Fix BiliBiliLoader so that it can receive cookie parameters; it
requires 3 other parameters to run. The change is backward compatible.
2. Add test
3. Add example in docs
- **Issue:** [#14213]
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
**Description:** A few grammatical changes to improve readability of the
LCEL .ipynb and tidy some null characters.
**Issue:** N/A
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- [x] **PR title**: "community: Support streaming in Azure ML and few
naming changes"
- [x] **PR message**:
- **Description:** Added support for streaming for azureml_endpoint.
Also renamed AzureMLEndpointApiType.realtime to
AzureMLEndpointApiType.dedicated, added the new classes
CustomOpenAIChatContentFormatter and CustomOpenAIContentFormatter, and
updated the classes LlamaChatContentFormatter and LlamaContentFormatter
to show a deprecation warning when instantiated.
---------
Co-authored-by: Sachin Paryani <saparan@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:** At times, BaseChatMemory._get_input_output may acquire
some extra keys such as 'intermediate_steps' (agent_executor with
return_intermediate_steps set to True) and 'messages'
(agent_executor.iter with memory). In these instances, _get_input_output
can raise an error due to the presence of multiple keys. The 'output'
field should be used as the default field in these cases.
**Issue:** #16791
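A minimal sketch of the key-selection rule described above. The function `select_output_key` is a hypothetical stand-in, not the actual `BaseChatMemory._get_input_output` implementation; it only shows how 'output' can serve as the default when extra keys such as 'intermediate_steps' are present.

```python
from typing import Any, Dict, Optional


def select_output_key(outputs: Dict[str, Any], output_key: Optional[str] = None) -> str:
    """Pick which key of `outputs` holds the text to store in memory."""
    if output_key is not None:
        return output_key  # an explicitly configured key always wins
    if len(outputs) == 1:
        return next(iter(outputs))  # unambiguous: only one key
    if "output" in outputs:
        return "output"  # default when extra keys are present
    raise ValueError(f"Got multiple output keys: {list(outputs)}; cannot pick one.")
```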
The previous markdown code was not working as intended. The new code
should add a green box around the tip so that it is highlighted.
Co-authored-by: Hershenson, Isaac (Extern) <isaac.hershenson.extern@bayer04.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: Added missing `from_documents` method to `KNNRetriever`,
providing the ability to supply metadata to LangChain `Document`s, and
to give it parity to the other retrievers, which do have
`from_documents`.
- Issue: None
- Dependencies: None
- Twitter handle: None
Co-authored-by: Victor Adan <vadan@netroadshow.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Relates to #17048
Description: Applied the fix to the dynamodb and elasticsearch files.
The error was: `Cannot override writeable attribute with read-only
property`
Suggestion:
instead of adding
```
@messages.setter
def messages(self, messages: List[BaseMessage]) -> None:
raise NotImplementedError("Use add_messages instead")
```
we can change base class property
`messages: List[BaseMessage]`
to
```
@property
def messages(self) -> List[BaseMessage]:...
```
then we don't need to add `@messages.setter` in all child classes.
**Description:**
While not technically incorrect, the TypeVar used for the `@beta`
decorator prevented pyright (and thus most vscode users) from correctly
seeing the types of functions/classes decorated with `@beta`.
This is in part due to a small bug in pyright
(https://github.com/microsoft/pyright/issues/7448). However, the
`Type` constraint in the typevar `C = TypeVar("C", Type, Callable)` is not
doing anything: classes are `Callables` by default, so by my
understanding constraining to `Type` does not actually provide any more
safety. The modified annotation still works correctly for
functions, properties, and classes.
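As an illustration only, here is a sketch of a decorator typed with a single `Callable`-bound TypeVar. The name `beta_sketch` and the `__beta__` marker attribute are hypothetical, and this is not necessarily the annotation the PR settled on; it only shows that one `Callable` bound covers both functions and classes.

```python
from typing import Any, Callable, TypeVar

# One Callable bound is enough: classes are callable too.
T = TypeVar("T", bound=Callable[..., Any])


def beta_sketch(obj: T) -> T:
    """Mark a function or class as beta and return it with its type intact."""
    obj.__beta__ = True  # type: ignore[attr-defined]
    return obj


@beta_sketch
def add(a: int, b: int) -> int:
    return a + b


@beta_sketch
class Greeter:
    def greet(self) -> str:
        return "hello"
```

Because the decorator returns the same `T` it receives, a type checker should keep the original signature of `add` and the class type of `Greeter`.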
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
In this small PR I added the `template_tool_response` arg to the
`create_json_chat` function, so that users can customize this prompt in
case of need.
Thanks for your reviews!
---------
Co-authored-by: taamedag <Davide.Menini@swisscom.com>
This patch replaces several calls to `run` with `invoke` in
llm_symbolic_math.ipynb.
Without this patch, you see the following message:
The function `run` was deprecated in LangChain 0.1.0
and will be removed in 0.2.0. Use invoke instead.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
**Description:** Adds support for `with_structured_output` to Cohere,
which supports single function calling.
---------
Co-authored-by: BeatrixCohere <128378696+BeatrixCohere@users.noreply.github.com>
- [x] **PR title**: "community: fix baidu qianfan missing stop
parameter"
- [x] **PR message**:
- **Description:** Baidu Qianfan lost the stop parameter when requesting
the service because it was extracted from kwargs. This bug can cause the
agent to receive incorrect results.
---------
Co-authored-by: ligang33 <ligang33@baidu.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Bug fixes in this PR:
* allows for other params such as "message" not just the input param to
the prompt for the cohere tools agent
* fixes to documents kwarg from messages
* fixes to tool_calls API call
---------
Co-authored-by: Harry M <127103098+harry-cohere@users.noreply.github.com>
- **Issue:** When passing an empty list to MergerRetriever it fails with
error: ValueError: max() arg is an empty sequence
- **Description:** We have a use case where we dynamically select
retrievers and use MergerRetriever for merging the output of the
retrievers. We faced this issue when the retriever_docs list is empty.
This PR adds a default of 0 for cases when retriever_docs is an empty
list to avoid "ValueError: max() arg is an empty sequence". It also
switches to map(), which is more than twice as fast as the current
implementation.
```
import timeit
# Sample retriever_docs with varying lengths of sublists
retriever_docs = [[i for i in range(j)] for j in range(1, 1000)]
# First code snippet
code1 = '''
max_docs = max(len(docs) for docs in retriever_docs)
'''
# Second code snippet
code2 = '''
max_docs = max(map(len, retriever_docs), default=0)
'''
# Benchmarking
time1 = timeit.timeit(stmt=code1, globals=globals(), number=10000)
time2 = timeit.timeit(stmt=code2, globals=globals(), number=10000)
# Output
print(f"Execution time for code snippet 1: {time1} seconds")
print(f"Execution time for code snippet 2: {time2} seconds")
```
- **Dependencies:** none
The previous version didn't have the Voyage reranker in the init file.
- [ ] **PR title**: langchain_voyageai reranker is not working
- [ ] **PR message**:
- **Description:** This fix lets you run the reranker from Voyage
- **Issue:** Was not able to run the reranker from Voyage
@efriis
Due to changes in the OpenAI SDK, the previous method of setting the
OpenAI proxy in ChatOpenAI no longer works. This PR fixes this issue,
making the previous way of setting the OpenAI proxy in ChatOpenAI
effective again.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
This is a follow up to #18371. These are the changes:
- New **Azure AI Services** toolkit and tools to replace those of
**Azure Cognitive Services**.
- Updated documentation for Microsoft platform.
- The image analysis tool has been rewritten to use the new package
`azure-ai-vision-imageanalysis`, doing a proper replacement of
`azure-ai-vision`.
These changes:
- Update outdated naming from "Azure Cognitive Services" to "Azure AI
Services".
- Update documentation to use non-deprecated methods to create and use
agents.
- Removes need to depend on yanked python package (`azure-ai-vision`)
There is one new dependency that is needed as a replacement to
`azure-ai-vision`:
- `azure-ai-vision-imageanalysis`. This is optional and declared within
a function.
There is a new `azure_ai_services.ipynb` notebook showing usage; Changes
have been linted and formatted.
I am leaving the actions of adding deprecation notices and future
removal of Azure Cognitive Services up to the LangChain team, as I am
not sure what the current practice around this is.
---
If this PR makes it, my handle is @galo@mastodon.social
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
- **Description**: `bigdl-llm` library has been renamed to
[`ipex-llm`](https://github.com/intel-analytics/ipex-llm). This PR
migrates the `bigdl-llm` integration to `ipex-llm` .
- **Issue**: N/A. The original PR of `bigdl-llm` is
https://github.com/langchain-ai/langchain/pull/17953
- **Dependencies**: `ipex-llm` library
- **Contribution maintainer**: @shane-huang
Updated doc: docs/docs/integrations/llms/ipex_llm.ipynb
Updated test:
libs/community/tests/integration_tests/llms/test_ipex_llm.py
- **Description:** Add support for Intel Lab's [Visual Data Management
System (VDMS)](https://github.com/IntelLabs/vdms) as a vector store
- **Dependencies:** `vdms` library which requires protobuf = "4.24.2".
There is a conflict with dashvector in `langchain` package but conflict
is resolved in `community`.
- **Contribution maintainer:** [@cwlacewe](https://github.com/cwlacewe)
- **Added tests:**
libs/community/tests/integration_tests/vectorstores/test_vdms.py
- **Added docs:** docs/docs/integrations/vectorstores/vdms.ipynb
- **Added cookbook:** cookbook/multi_modal_RAG_vdms.ipynb
---------
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
If you use an embedding dist function in an eval loop, you get warned
every time. Would prefer to just check once and forget about it.
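One way to get "check once" behaviour is to cache the check itself. The sketch below uses hypothetical names (`_warn_slow_backend`, `euclidean_distance`) and a made-up warning message; it is not the actual evaluator code.

```python
import functools
import math
import warnings
from typing import Sequence


@functools.lru_cache(maxsize=None)
def _warn_slow_backend() -> None:
    """Emit the warning on the first call only; later calls hit the cache."""
    warnings.warn("Falling back to a pure-Python distance implementation.")


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    _warn_slow_backend()  # safe to call inside an eval loop
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```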
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- .stream() and .astream() call on_llm_new_token, removing the need for
subclasses to do so. Backwards compatible because now we don't pass
run_manager into ._stream and ._astream
- .generate() and .agenerate() now handle `stream: bool` kwarg for
_generate and _agenerate. Subclasses handle this arg by delegating to
._stream(), now one less thing they need to do. Backwards compat because
this is an optional arg that we now never pass to the subclasses
- .generate() and .agenerate() now inspect callback handlers to decide
on a default value for stream:bool if not passed in. This auto enables
streaming when using astream_events and astream_log
- as a result of these three changes any usage of .astream_events and
.astream_log should now yield chat model stream events
- In future PRs we can update all subclasses to reflect these two things
now handled by base class, but in meantime all will continue to work
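The division of labour described in these bullets can be sketched roughly as follows. All class and method names are simplified stand-ins, and passing a single callback stands in for the real inspection of callback handlers; this is not the `langchain_core` implementation.

```python
from typing import Callable, Iterator, Optional


class ChatModelSketch:
    """Base class: owns the token callbacks and the stream/non-stream choice."""

    def _stream(self, prompt: str) -> Iterator[str]:
        raise NotImplementedError

    def _generate(self, prompt: str) -> str:
        raise NotImplementedError

    def stream(self, prompt: str, on_llm_new_token: Callable[[str], None]) -> Iterator[str]:
        for token in self._stream(prompt):
            on_llm_new_token(token)  # the subclass no longer has to do this
            yield token

    def generate(self, prompt: str, on_llm_new_token: Optional[Callable[[str], None]] = None) -> str:
        # Assumption: a registered token callback means the caller wants streaming.
        if on_llm_new_token is not None:
            return " ".join(self.stream(prompt, on_llm_new_token))
        return self._generate(prompt)


class EchoModel(ChatModelSketch):
    """Subclass: only produces tokens or a full result."""

    def _stream(self, prompt: str) -> Iterator[str]:
        yield from prompt.split()

    def _generate(self, prompt: str) -> str:
        return prompt
```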
* **Description**: add `None` type for `file_path` along with `str` and
`List[str]` types.
* `file_path`/`filename` arguments in `get_elements_from_api()` and
`partition()` can be `None`, however, there's no `None` type hint for
`file_path` in `UnstructuredAPIFileLoader` and `UnstructuredFileLoader`
currently.
* calling the function with `file_path=None` is no problem, but my IDE
annoys me lol.
* **Issue**: N/A
* **Dependencies**: N/A
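A small sketch of the widened type hint. `FileLoaderSketch` is a hypothetical class used only to show the annotation; it is not `UnstructuredFileLoader` or `UnstructuredAPIFileLoader`.

```python
from typing import List, Union


class FileLoaderSketch:
    def __init__(self, file_path: Union[str, List[str], None] = None) -> None:
        # None is now an accepted value, so IDEs stop flagging file_path=None.
        self.file_path = file_path

    def paths(self) -> List[str]:
        """Normalize the three accepted forms into a list of paths."""
        if self.file_path is None:
            return []
        if isinstance(self.file_path, str):
            return [self.file_path]
        return list(self.file_path)
```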
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Updates Meilisearch vectorstore for compatibility
with v1.6 and above. Adds embedders settings and embedder_name which are
now required.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
This PR adds a slightly more helpful message to a Tool Exception
```
# current state
langchain_core.tools.ToolException: Too many arguments to single-input tool
# proposed state
langchain_core.tools.ToolException: Too many arguments to single-input tool. Consider using a StructuredTool instead.
```
**Issue:** Somewhat discussed here 👉#6197
**Dependencies:** None
**Twitter handle:** N/A
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- [ ] **cookbook** - update example for SalesGPT - include Stripe
Payment Link Generation
- **Description:** We updated the Jupyter notebook example with the
ability of the AI Agent to negotiate with customers and then close the
deal by generating a custom Stripe payment link.
- **Issue:** N/A
- **Dependencies:** N/a
- **Twitter handle:** @FilipMichalsky @0xtotaylor
---------
Co-authored-by: Filip Michalsky <filip_michalsky@g.harvard.edu>
Co-authored-by: Bagatur <baskaryan@gmail.com>
As mentioned in #18322, the current PydanticOutputParser won't work for
anyone trying to parse to pydantic v2 models. This PR adds a separate
`PydanticV2OutputParser`, as well as a `langchain_core.pydantic_v2`
namespace that will fail on import to any projects using pydantic<2.
Happy to update the docs for output parsers if this is something we're
interested in adding.
On a separate note, I also updated `check_pydantic.sh` to detect
pydantic imports with leading whitespace and excluded the internal
namespaces. That change can be separated into its own PR if needed.
---------
Co-authored-by: Jan Nissen <jan23@gmail.com>
- **Description:** I've made a fix to a ParseError call in the
XMLOutputParser documentation.
- **Issue:** None
- **Dependencies:** None
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
**Description:**
PebbloSafeLoader: Add support for non-file-based Document Loaders
This pull request enhances PebbloSafeLoader by introducing support for
several non-file-based Document Loaders. With this update,
PebbloSafeLoader now seamlessly integrates with the following loaders:
- GoogleDriveLoader
- SlackDirectoryLoader
- Unstructured EmailLoader
**Issue:** NA
**Dependencies:** - None
**Twitter handle:** @Raj__725
---------
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Patch potential XML vulnerability CVE-2024-1455
This patches a potential XML vulnerability in the XMLOutputParser in
langchain-core. The vulnerability in some situations could lead to a
denial of service attack.
At risk are users who:
1) Are running older distributions of Python that have an older version
of libexpat
2) Are using XMLOutputParser with an agent
3) Accept inputs from untrusted sources with this agent (e.g., an endpoint
on the web that allows an untrusted user to interact with the parser)
Introduction
[Intel® Extension for
Transformers](https://github.com/intel/intel-extension-for-transformers)
is an innovative toolkit designed to accelerate GenAI/LLM everywhere
with the optimal performance of Transformer-based models on various
Intel platforms.
Description
Adds ITREX runtime embeddings using intel-extension-for-transformers.
Adds mdx documentation and example notebooks.
Adds embedding import testing.
---------
Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- [x] **PR title**: "experimental: Enhance LLMGraphTransformer with
async processing and improved readability"
- [x] **PR message**:
- **Description:** This pull request refactors the `process_response`
and `convert_to_graph_documents` methods in the LLMGraphTransformer
class to improve code readability and adds async versions of these
methods for concurrent processing.
The main changes include:
- Simplifying list comprehensions and conditional logic in the
process_response method for better readability.
- Adding async versions aprocess_response and
aconvert_to_graph_documents to enable concurrent processing of
documents.
These enhancements aim to improve the overall efficiency and
maintainability of the `LLMGraphTransformer` class.
- **Issue:** N/A
- **Dependencies:** No additional dependencies required.
- **Twitter handle:** @jjovalle99
- [x] **Add tests and docs**: N/A (This PR does not introduce a new
integration)
- [x] **Lint and test**: Ran make format, make lint, and make test from
the root of the modified package(s). All tests pass successfully.
Additional notes:
- The changes made in this PR are backwards compatible and do not
introduce any breaking changes.
- The PR touches only the `LLMGraphTransformer` class within the
experimental package.
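The concurrency pattern behind the async methods can be sketched with `asyncio.gather`. The function names below are hypothetical stand-ins rather than the actual `LLMGraphTransformer` methods, and the "LLM call" is simulated.

```python
import asyncio
from typing import List


async def aprocess_document(text: str) -> str:
    """Hypothetical stand-in for one async LLM call per document."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return text.upper()


async def aconvert_documents(texts: List[str]) -> List[str]:
    # gather runs the coroutines concurrently and preserves input order.
    results = await asyncio.gather(*(aprocess_document(t) for t in texts))
    return list(results)
```

Calling `asyncio.run(aconvert_documents([...]))` from synchronous code drives the whole batch on one event loop.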
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Implemented a try-except block for
`GCSDirectoryLoader`. Reason: users processing a large number of
unstructured files in a folder may encounter many different errors. A
try-except block is added to capture these errors. A new argument
`use_try_except=True` is added to enable *silent failure* so that an error
caused by processing one file does not break the whole function.
- **Issue:** N/A
- **Dependencies:** no new dependencies
- **Twitter handle:** timothywong731
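A minimal sketch of the silent-failure loop, assuming a hypothetical `load_all` helper and a caller-supplied `load_one` function. Only the `use_try_except` flag name comes from the description; this is not the `GCSDirectoryLoader` code.

```python
import logging
from typing import Callable, Iterable, List

logger = logging.getLogger(__name__)


def load_all(
    names: Iterable[str],
    load_one: Callable[[str], List[str]],
    use_try_except: bool = True,
) -> List[str]:
    """Load every file, optionally skipping the ones that fail."""
    docs: List[str] = []
    for name in names:
        try:
            docs.extend(load_one(name))
        except Exception as exc:
            if not use_try_except:
                raise  # strict mode: one bad file aborts the whole load
            logger.warning("Skipping %s: %s", name, exc)
    return docs
```

Logging the skipped file keeps the failure visible even though it no longer stops the run.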
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Adding oracle autonomous database document loader
integration. This will allow users to connect to oracle autonomous
database through connection string or TNS configuration.
https://www.oracle.com/autonomous-database/
- **Issue:** None
- **Dependencies:** oracledb python package
https://pypi.org/project/oracledb/
Unit test and doc are added.
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Currently the semantic_configurations are not used
when creating an AzureSearch instance, instead creating a new one with
default values. This PR changes the behavior to use the passed
semantic_configurations if it is present, and the existing default
configuration if not.
---------
Co-authored-by: Adam Law <adamlaw@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
DefusedXML is causing parsing errors on previously functional code with
the 0.7.x versions. These do not seem to support newer versions of Python
well. 0.8.x has only been released as an rc, so we're not going to use
it in the core package.
* Adds support for `additional_kwargs` in `get_cohere_chat_request`
* This functionality passes in Cohere SDK specific parameters from
`BaseMessage` based classes to the API
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
- [x] **Add len() implementation to Chroma**: "package: community"
- [x] **PR message**:
- **Description:** add an implementation of the __len__() method for the
Chroma vectorstore, for convenience.
- **Issue:** no exposed method to know the size of a Chroma vectorstore
- **Dependencies:** None
- **Twitter handle:** lowrank_adrian
- [x] **Add tests and docs**
- [x] **Lint and test**
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** Be more explicit with the `model_kwargs` and
`encode_kwargs` for `HuggingFaceEmbeddings`.
- **Issue:** -
- **Dependencies:** -
I received some reports from my users that they didn't realise that you
could change the default `batch_size` with `HuggingFaceEmbeddings`,
which may be attributed to how the `model_kwargs` and `encode_kwargs`
don't give much information about what you can specify.
I've added some parameter names & links to the Sentence Transformers
documentation to help clear it up. Let me know if you'd rather have
Markdown/Sphinx-style hyperlinks rather than a "bare URL".
- Tom Aarsen
So this arose from the
https://github.com/langchain-ai/langchain/pull/18397 problem of document
loaders not supporting `pathlib.Path`.
This pull request provides more uniform support for Path as an argument.
The core ideas for this upgrade:
- if a local file path is used as an argument, it should be
supported as `pathlib.Path`
- if there are some external calls that may or may not support pathlib,
the argument is immediately converted to `str`
- if `self.file_path` is used in a way that allows it to
stay a `pathlib.Path` without conversion, it is only converted for the metadata.
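The three rules can be sketched with a hypothetical loader. `PathAwareLoaderSketch` and its methods are illustrative names, not an actual LangChain document loader.

```python
from pathlib import Path
from typing import Dict, Union


class PathAwareLoaderSketch:
    def __init__(self, file_path: Union[str, Path]) -> None:
        # Rule 1: accept both str and pathlib.Path for local files.
        self.file_path = file_path

    def external_arg(self) -> str:
        # Rule 2: convert right away for calls that may not support pathlib.
        return str(self.file_path)

    def suffix(self) -> str:
        # Rule 3: pathlib-friendly usage needs no str conversion.
        return Path(self.file_path).suffix

    def metadata(self) -> Dict[str, str]:
        # Rule 3 (continued): only the metadata stores a plain string.
        return {"source": str(self.file_path)}
```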
Twitter handle: https://twitter.com/mwmajewsk
### Issue
Recently, the new `allow_dangerous_deserialization` flag was introduced
to prevent unsafe model deserialization that relies on pickle
without the user's notice (#18696). Since then, some LLMs such as Databricks
require this flag to be set to true to instantiate the model.
However, this breaks the existing functionality of loading such LLMs within
a chain using the `load_chain` method, because the underlying loader
function
[load_llm_from_config](f96dd57501/libs/langchain/langchain/chains/loading.py (L40))
(and load_llm) ignores the keyword arguments passed in.
### Solution
This PR fixes this issue by propagating the
`allow_dangerous_deserialization` argument to the class loader iff the
LLM class has that field.
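A rough sketch of the "iff the class has that field" rule. The classes and `load_llm_sketch` are hypothetical, and `inspect.signature` is used here as a stand-in for checking the model's declared fields; the real loader may detect the field differently.

```python
import inspect
from typing import Any, Dict, Type


class PickleBackedLLM:
    """Hypothetical LLM whose loading relies on pickle."""

    def __init__(self, name: str, allow_dangerous_deserialization: bool = False) -> None:
        self.name = name
        self.allow_dangerous_deserialization = allow_dangerous_deserialization


class PlainLLM:
    """Hypothetical LLM that does not know the flag."""

    def __init__(self, name: str) -> None:
        self.name = name


def load_llm_sketch(llm_cls: Type[Any], config: Dict[str, Any], **kwargs: Any) -> Any:
    flag = "allow_dangerous_deserialization"
    load_kwargs: Dict[str, Any] = {}
    # Forward the flag only when the caller passed it and the class accepts it.
    if flag in kwargs and flag in inspect.signature(llm_cls.__init__).parameters:
        load_kwargs[flag] = kwargs[flag]
    return llm_cls(**config, **load_kwargs)
```

Forwarding conditionally keeps classes without the field loadable, since they would otherwise reject the unexpected keyword.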
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Adds a class that allows using the "text2vec" open source embedding
model.
Install the model by running 'pip install -U text2vec'.
Example of calling the model through LangChain:
from langchain_community.embeddings.text2vec import Text2vecEmbeddings
embedding = Text2vecEmbeddings()
embedding.embed_documents([
"This is a CoSENT(Cosine Sentence) model.",
"It maps sentences to a 768 dimensional dense vector space.",
])
embedding.embed_query(
"It can be used for text matching or semantic search."
)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
## Description
This PR proposes a modification to the `libs/langchain/dev.Dockerfile`
configuration to copy the `libs/langchain/poetry.lock` into the working
directory. The change aims to address the issue where the Poetry install
command, the last command in the `dev.Dockerfile`, takes an excessively
long time, and to ensure the reproducibility of the Poetry environment
in the devcontainer.
## Problem
The `dev.Dockerfile`, prepared for development environments such as
`.devcontainer`, encounters an unending dependency resolution when
attempting the Poetry installation.
### Steps to Reproduce
Execute the following build command:
```bash
docker build -f libs/langchain/dev.Dockerfile .
```
### Current Behavior
The Docker build process gets stuck at the following step, which, in my
experience, did not conclude even after an entire night:
```
=> [langchain-dev-dependencies 4/6] COPY libs/community/ ../community/ 0.9s
=> [langchain-dev-dependencies 5/6] COPY libs/text-splitters/ ../text-splitters/ 0.0s
=> [langchain-dev-dependencies 6/6] RUN poetry install --no-interaction --no-ansi --with dev,test,docs 12.3s
=> => # Updating dependencies
=> => # Resolving dependencies...
```
### Expected Behavior
The Docker build completes in a realistic timeframe. By applying this
PR, the build finishes within a few minutes.
### Analysis
The complexity of LangChain's dependencies has reached a point where
Poetry is required to resolve dependencies akin to threading a needle.
Consequently, poetry install fails to complete in a practical timeframe.
## Solution
The solution for dependency resolution is already recorded in
`libs/langchain/poetry.lock`, so we can use it. When copying
`pyproject.toml` and `poetry.toml`, the `poetry.lock` located in the same
directory should also be copied.
```diff
# Copy only the dependency files for installation
-COPY libs/langchain/pyproject.toml libs/langchain/poetry.toml ./
+COPY libs/langchain/pyproject.toml libs/langchain/poetry.toml libs/langchain/poetry.lock ./
```
## Note
I am not intimately familiar with the historical context of the
`dev.Dockerfile` and thus do not know why `poetry.lock` has not been
copied until now. It might have been an oversight, or perhaps dependency
resolution used to complete quickly even without the `poetry.lock` file
in the past. However, if there are deliberate reasons why copying
`poetry.lock` is not advisable, please just close this PR.
Description:
This change fixes the pydantic validation error raised when looking up from
GPTCache: the `ChatOpenAI` class returns `ChatGeneration` as its response,
which is not handled.
It uses the existing `_loads_generations` and `_dumps_generations` functions
to handle it.
Trace
```
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/development/scripts/chatbot-postgres-test.py", line 90, in <module>
print(llm.invoke("tell me a joke"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 166, in invoke
self.generate_prompt(
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 544, in generate_prompt
return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 408, in generate
raise e
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 398, in generate
self._generate_with_cache(
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 585, in _generate_with_cache
cache_val = llm_cache.lookup(prompt, llm_string)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_community/cache.py", line 807, in lookup
return [
^
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_community/cache.py", line 808, in <listcomp>
Generation(**generation_dict) for generation_dict in json.loads(res)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/load/serializable.py", line 120, in __init__
super().__init__(**kwargs)
File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for Generation
type
unexpected value; permitted: 'Generation' (type=value_error.const; given=ChatGeneration; permitted=('Generation',))
```
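The failure in the trace can be reproduced in miniature. The sketch below is only an illustration: `Generation` and `ChatGeneration` are hypothetical dataclass stand-ins (not the `langchain_core` classes), and `dumps_generations` / `loads_generations` are simplified stand-ins for the idea behind `_dumps_generations` / `_loads_generations`, not their actual implementation. It shows why rebuilding every cached entry as a plain `Generation` fails, and how dispatching on the stored `type` avoids it.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List, Type


@dataclass
class Generation:
    """Hypothetical stand-in for a serializable generation object."""

    text: str
    type: str = "Generation"

    def __post_init__(self) -> None:
        # Mimic a "const" validation: `type` must match the class name.
        expected = self.__class__.__name__
        if self.type != expected:
            raise ValueError(f"unexpected value; permitted: {expected!r}")


@dataclass
class ChatGeneration(Generation):
    """Hypothetical stand-in for the chat variant."""

    type: str = "ChatGeneration"


# Registry used to pick the right class when loading (an assumption of this sketch).
_CLASSES: Dict[str, Type[Generation]] = {
    "Generation": Generation,
    "ChatGeneration": ChatGeneration,
}


def dumps_generations(generations: List[Generation]) -> str:
    """Serialize generations to JSON, keeping the `type` discriminator."""
    return json.dumps([asdict(g) for g in generations])


def loads_generations(payload: str) -> List[Generation]:
    """Rebuild each entry with the class named by its `type` field."""
    return [_CLASSES[d["type"]](**d) for d in json.loads(payload)]
```

With this shape, a cached `ChatGeneration` round-trips cleanly, whereas `Generation(**entry)` on the same payload raises, which mirrors the validation error above.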
Although I could not find a related issue in this repository, here's an
[issue](https://github.com/zilliztech/GPTCache/issues/585) raised in
GPTCache. Please let me know if I need to do anything else.
Thank you
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Few-Shot prompt template may use a `SemanticSimilarityExampleSelector`
that in turn uses a `VectorStore` that does I/O operations.
So to work correctly on the event loop, we need:
* async methods for the `VectorStore` (OK)
* async methods for the `SemanticSimilarityExampleSelector` (this PR)
* async methods for `BasePromptTemplate` and `BaseChatPromptTemplate`
(future work)
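As a rough sketch of the second bullet: the classes below are toy stand-ins written for this note (`ToyVectorStore` and `ToyExampleSelector` are hypothetical, and the word-overlap ranking is not a real similarity search). The point is only the shape: the selector gains an async method that awaits the store's async search instead of blocking the event loop.

```python
import asyncio
from typing import Dict, List

Example = Dict[str, str]


class ToyVectorStore:
    """Hypothetical in-memory store; ranks examples by shared words."""

    def __init__(self, examples: List[Example]) -> None:
        self._examples = list(examples)

    def similarity_search(self, query: str, k: int) -> List[Example]:
        words = set(query.lower().split())
        ranked = sorted(
            self._examples,
            key=lambda ex: len(words & set(ex["question"].lower().split())),
            reverse=True,
        )
        return ranked[:k]

    async def asimilarity_search(self, query: str, k: int) -> List[Example]:
        # Stands in for awaiting real network or disk I/O.
        await asyncio.sleep(0)
        return self.similarity_search(query, k)


class ToyExampleSelector:
    """Hypothetical selector exposing both sync and async selection."""

    def __init__(self, store: ToyVectorStore, k: int = 1) -> None:
        self._store = store
        self._k = k

    def select_examples(self, input_variables: Dict[str, str]) -> List[Example]:
        query = " ".join(input_variables.values())
        return self._store.similarity_search(query, self._k)

    async def aselect_examples(self, input_variables: Dict[str, str]) -> List[Example]:
        query = " ".join(input_variables.values())
        return await self._store.asimilarity_search(query, self._k)
```

A caller already running on the event loop would use `await selector.aselect_examples({...})`; the sync method stays available for non-async code.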
This is a small breaking change but I think it should be done as:
* No external dependency needs to be installed anymore for the default
to work
* It is vendor-neutral
This patch updates the function `run` to `invoke` in fake_llm.ipynb.
Without this patch, you see the following warning.
LangChainDeprecationWarning: The function `run` was deprecated in
LangChain 0.1.0 and will be removed in 0.2.0. Use invoke instead.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Fixing some issues for AzureCosmosDBSemanticCache
- Added the entry for "AzureCosmosDBSemanticCache" which was missing in
langchain/cache.py
- Added application name when creating the MongoClient for the
AzureCosmosDBVectorSearch, for tracking purposes.
@baskaryan, can you please review this PR, we need this to go in asap.
These are just small fixes which we found today in our testing.
- **Description:** The `semantic_hybrid_search_with_score_and_rerank`
method of `AzureSearch` contains a hardcoded field name "metadata" for
the document metadata in the Azure AI Search Index. Adding such a field
is optional when creating an Azure AI Search Index, as other snippets
from `AzureSearch` test for the existence of this field before trying to
access it. Furthermore, the metadata field name shouldn't be hardcoded
as "metadata"; the method should use the `FIELDS_METADATA` variable that
defines this field name instead. In the current implementation, any index without a
metadata field named "metadata" will yield an error if a semantic answer
is returned by the search in
`semantic_hybrid_search_with_score_and_rerank`.
- **Issue:** https://github.com/langchain-ai/langchain/issues/18731
- **Prior fix to this bug:** This bug was fixed in this PR
https://github.com/langchain-ai/langchain/pull/15642 by adding a check
for the existence of the metadata field named `FIELDS_METADATA` and
retrieving a value for the key called "key" in that metadata if it
exists. If the field named `FIELDS_METADATA` was not present, an empty
string was returned. This fix was removed in this PR
https://github.com/langchain-ai/langchain/pull/15659 (see
ed1ffca911#).
@lz-chen: could you confirm this wasn't intentional?
- **New fix to this bug:** I believe there was an oversight in the logic
of the fix from
[#15642](https://github.com/langchain-ai/langchain/pull/15642) which I
explain below.
The `semantic_hybrid_search_with_score_and_rerank` method creates a
dictionary `semantic_answers_dict` with semantic answers returned by the
search as follows.
5c2f7e6b2b/libs/community/langchain_community/vectorstores/azuresearch.py (L574-L581)
The keys in this dictionary are the unique document ids in the index, if
I understand the [documentation of semantic
answers](https://learn.microsoft.com/en-us/azure/search/semantic-answers)
in Azure AI Search correctly. When the method transforms a search result
into a `Document` object, an "answer" key is added to the document's
metadata. The value for this "answer" key should be the semantic answer
returned by the search from this document, if such an answer is
returned. The match between a `Document` object and the semantic answers
returned by the search should be done through the unique document id,
which is used as a key for the `semantic_answers_dict` dictionary. This
id is defined in the search result's field named `FIELDS_ID`. I added a
check to avoid any error in case no field named `FIELDS_ID` exists in a
search result (which shouldn't happen in theory).
A benefit of this approach is that this fix should work whether or not
the Azure AI Search Index contains a metadata field.
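The matching logic described above can be sketched as follows. This is a simplified illustration, not the code in `azuresearch.py`: plain dictionaries stand in for the Azure SDK result and answer objects, the value of `FIELDS_ID` is assumed to be `"id"`, and the helper name `attach_semantic_answers` is hypothetical.

```python
from typing import Any, Dict, List

FIELDS_ID = "id"  # assumed value; the real name comes from the module's configuration


def attach_semantic_answers(
    results: List[Dict[str, Any]],
    semantic_answers: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Attach each semantic answer to the document whose id matches its key."""
    # Key the answers by the document id they were extracted from.
    semantic_answers_dict = {
        answer["key"]: {"text": answer["text"], "highlights": answer["highlights"]}
        for answer in semantic_answers
    }
    docs = []
    for result in results:
        # Guard against a result without an id field (should not happen in theory).
        doc_id = result.get(FIELDS_ID)
        answer = semantic_answers_dict.get(doc_id, "") if doc_id is not None else ""
        docs.append({"content": result.get("content", ""), "metadata": {"answer": answer}})
    return docs
```

Because the lookup goes through the document id rather than a metadata field, the sketch behaves the same whether or not the index defines a metadata field.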
@levalencia could you confirm my analysis and test the fix?
@raunakshrivastava7 do you agree with the fix?
Thanks for the help!
### Prem SDK integration in LangChain
This PR adds the integration of [PremAI's](https://www.premai.io/)
prem-sdk with LangChain. Users can now access deployed models
(LLMs/embeddings) and use them within LangChain's ecosystem.
### This PR adds the following:
- [x] Add chat support
- [X] Adding embedding support
- [X] writing integration tests
- [X] writing tests for chat
- [X] writing tests for embedding
- [X] writing unit tests
- [X] writing tests for chat
- [X] writing tests for embedding
- [X] Adding documentation
- [X] writing documentation for chat
- [X] writing documentation for embedding
- [X] run `make test`
- [X] run `make lint`, `make lint_diff`
- [X] Final checks (spell check, lint, format and overall testing)
---------
Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** The PgVector class always runs "create extension" on
init, and this statement crashes on read-only databases (read-only
replicas). Weirdly, the subsequent statements (create collection, etc.)
work even on read-only databases.
- **Dependencies:** no new dependencies
- **Twitter handle:** @VenOmaX666
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Thank you for contributing to LangChain!
When running the command `langchain app new my-app`, I get this error:
File
"/home/mauricio/.local/lib/python3.8/site-packages/langchain_cli/utils/pyproject.py",
line 15, in <module>
pyproject_toml: Path, local_editable_dependencies: Iterable[tuple[str,
Path]]
TypeError: 'type' object is not subscriptable
This PR fixes the error.
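For context, the error comes from subscripting the builtin `tuple` in an annotation, which Python 3.8 does not support (subscriptable builtin generics arrived in Python 3.9). A minimal sketch of a 3.8-compatible annotation is shown below; the function name and body are hypothetical and only illustrate the annotation style, not the actual code in `langchain_cli`.

```python
from pathlib import Path
from typing import Dict, Iterable, Tuple


def collect_local_dependencies(
    local_editable_dependencies: Iterable[Tuple[str, Path]],
) -> Dict[str, str]:
    """Map each dependency name to its path (hypothetical helper).

    `Tuple[str, Path]` from `typing` works on Python 3.8, whereas
    `tuple[str, Path]` is evaluated at definition time and raises
    "TypeError: 'type' object is not subscriptable" there.
    """
    return {name: path.as_posix() for name, path in local_editable_dependencies}
```

An alternative that keeps the newer syntax is `from __future__ import annotations`, which defers evaluation of annotations so they are not executed at definition time.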
The existing default list of separators for the `RecursiveTextSplitter`
assumes spaces are word boundaries. Some languages [don't use spaces
between
words](https://en.wikipedia.org/wiki/Category:Writing_systems_without_word_boundaries)
(Chinese, Japanese, Thai, Burmese).
This PR extends the documentation to explain how to cater for those
languages by adding extra punctuation to the separators, along with
zero-width spaces, which are used by some typesetters and help the
splitter avoid splitting inside words.
Ideally, **these separators could be a constant in the module** but for
now, defining them in the documentation is a start.
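To make the idea concrete, here is a small self-contained sketch. The separator list is an illustrative choice for this note (standard whitespace and ASCII punctuation, plus a zero-width space and common fullwidth/ideographic punctuation), and `split_text` is a deliberately simplified recursive splitter written for the example: it drops separators and does not merge small pieces, so it is not the library's algorithm.

```python
from typing import List, Sequence

# Illustrative separators, coarsest first; "" means "fall back to fixed-size cuts".
SEPARATORS: Sequence[str] = (
    "\n\n",
    "\n",
    " ",
    ".",
    ",",
    "\u200b",  # zero-width space
    "\uff0c",  # fullwidth comma
    "\u3001",  # ideographic comma
    "\uff0e",  # fullwidth full stop
    "\u3002",  # ideographic full stop
    "",
)


def split_text(
    text: str, chunk_size: int, separators: Sequence[str] = SEPARATORS
) -> List[str]:
    """Recursively split `text` until every piece fits in `chunk_size`."""
    if len(text) <= chunk_size:
        return [text] if text else []
    for i, sep in enumerate(separators):
        if sep == "":
            # No separator applies: cut at fixed offsets, possibly inside a word.
            return [text[j : j + chunk_size] for j in range(0, len(text), chunk_size)]
        if sep in text:
            chunks: List[str] = []
            for piece in text.split(sep):
                # Only finer-grained separators remain for oversized pieces.
                chunks.extend(split_text(piece, chunk_size, separators[i + 1 :]))
            return chunks
    return [text]


sample = "今日は晴れです。明日は雨です。"
```

With the ideographic full stop in the list, `split_text(sample, 10)` breaks at sentence boundaries; with only whitespace separators it falls back to fixed-size cuts that land mid-sentence.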
**Description:**
- minor PR to speed up onboarding by not trying to add a dataset, if a
model is already present.
- replace batch publish API with streaming when single events are
published.
**Dependencies:** any dependencies required for this change
**Twitter handle:** behalder
Co-authored-by: Barun Halder <barun@fiddler.ai>
This PR aims to enhance the documentation for TiDB integration, driven
by feedback from our users. It provides detailed introductions to key
features, ensuring developers can fully leverage TiDB for AI application
development.
**Description:**
Expand `version` in all the Confluence API calls so that the time the
page was last modified/created is available in all cases.
**Issue:** #12812
**Twitter handle:** zzste
This PR adds code to make sure that the correct base URL is being
created for the Azure Cognitive Search retriever. At the moment an
incorrect base URL is being generated. I think this is happening because
the original code was based on a deprecated API version. No
dependencies need to be added. I've also added more context to the test
doc strings.
I should also note that ACS is now Azure AI Search. I will open a
separate PR to make these changes as that would be a breaking change and
should potentially be discussed.
Twitter: @marlene_zw
- No new tests added, however the current ACS retriever tests are now
passing when I run them.
- Code was linted.
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- **Description:** This commit introduces support for the newly
available GPU index types introduced in Milvus 2.4 within the LangChain
project's `milvus.py`. With the release of Milvus 2.4, a range of
GPU-accelerated index types have been added, offering enhanced search
capabilities and performance optimizations for vector search operations.
This update ensures LangChain users can fully utilize the new
performance benefits for vector search operations.
- Reference: https://milvus.io/docs/gpu_index.md
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Corrected a broken link within the semantic-chunker.ipynb notebook,
ensuring that users can access the referenced resource.
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
This patch fixes the #18022 issue, converting the SimSIMD internal
zero-copy outputs to NumPy.
I've also noticed, that oftentimes `dtype=np.float32` conversion is used
before passing to SimSIMD. Which numeric types do LangChain users
generally care about? We support `float64`, `float32`, `float16`, and
`int8` for cosine distances and `float16` seems reasonable for
practically any kind of embeddings and any modern piece of hardware, so
we can change that part as well 🤗
- **Description:** Added support for lower-case and mixed-case names.
The names for tables and columns previously had to be UPPER_CASE.
With this enhancement, lower_case and MixedCase are also supported.
- **Issue:** N/A
- **Dependencies:** no new dependencies added
- **Twitter handle:** @sapopensource
- **Description:** Since the implicit `__call__` has been deprecated in
favor of `invoke`, the local_llms article also needed to be updated.
This article was my introduction to LangChain, and as it was helpful in
getting me set up with running LLMs locally, it is nice to not have any
warnings when running the example code. With this change, the warnings
go away when running the example code.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** clarkerican
A previous PR passed the `_parser` attribute, which apparently is not
meant to be used by user code and causes non-deterministic failures on CI
when testing the `transform` and `atransform` methods. Reverting this
change temporarily.
This mitigates a security concern for users still using older versions of libexpat, which allows an attacker to compromise the availability of the system if they manage to surface a malicious payload to this XMLParser.
**Description:** This change passes through `batch_size` to
`add_documents()`/`aadd_documents()` on calls to `index()` and
`aindex()` such that the documents are processed in the expected batch
size.
**Issue:** #19415
**Dependencies:** N/A
**Twitter handle:** N/A
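A minimal sketch of the pass-through, under stated assumptions: `RecordingStore`, `batched`, and this `index` function are hypothetical stand-ins written for illustration, not the actual indexing API. The store simply records what it receives so the forwarded `batch_size` can be observed.

```python
from typing import Iterable, Iterator, List, Tuple


def batched(items: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` items."""
    batch: List[str] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


class RecordingStore:
    """Hypothetical vector store that records each add_documents call."""

    def __init__(self) -> None:
        self.calls: List[Tuple[List[str], int]] = []

    def add_documents(self, documents: List[str], *, batch_size: int) -> None:
        self.calls.append((list(documents), batch_size))


def index(documents: Iterable[str], store: RecordingStore, batch_size: int = 100) -> int:
    """Index documents batch by batch, forwarding batch_size to the store."""
    count = 0
    for batch in batched(documents, batch_size):
        # Forward batch_size so the store does not fall back to its own default.
        store.add_documents(batch, batch_size=batch_size)
        count += len(batch)
    return count
```

Without forwarding the argument, a store with its own default could re-chunk each batch differently from what the caller asked for.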
Updated `HuggingFacePipeline` docs to be in sync with list of supported
tasks, including translation.
- [x] **PR title**: "community: Update docs for `HuggingFacePipeline`"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [x] **PR message**:
- **Description:** Update docs for `HuggingFacePipeline`, was earlier
missing `translation` as a valid task
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** None
- [x] **Add tests and docs**:
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
**Description:**
This PR adds [Dappier](https://dappier.com/) for the chat model. It
supports generate, async generate, and batch functionalities. We added
unit and integration tests as well as a notebook with more details about
our chat model.
**Dependencies:**
No extra dependencies are needed.
- **Description:** [CVE
2024-21503](https://www.cve.org/CVERecord?id=CVE-2024-21503) was
recently identified. The python linter "black" suffers from a potential
Regex-related denial of service attack. Updated version from the
vulnerable 24.2.0 to the patched 24.3.0.
- **Issue:** N/A
- **Dependencies:** The 'black' package in both `langchain` (top-level)
and `templates/python-lint`.
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
DuckDB has a cosine similarity function alongside its list and array data
types, so it can be used as a vector store.
- **Description:** The latest version of DuckDB features a cosine
similarity function, which can be used with its support for list or
array column types. This PR surfaces this functionality to langchain.
- **Dependencies:** duckdb 0.10.0
- **Twitter handle:** @igocrite
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
**Description:** Update s3_file.py to use arguments **mode** and
**post_processors** from the base class **UnstructuredBaseLoader** to
include more metadata about the files from the S3 bucket such as
*'page_number', 'languages'* etc.
**Issue:** NA
**Dependencies:** None
**Twitter handle:** preak95
---------
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Looking at tokens / page of our docs, we see a few outliers:
<img width="761" alt="image"
src="https://github.com/langchain-ai/langchain/assets/122662504/677aa2d6-0a29-45e4-882a-db2bbf46d02b">
This is due to non-rendering images in one case, and output spamming.
Clean these up, along with other cases of excessive output spamming in
the docs.
All of it gets pulled into chat-langchain for retrieval.
2024-03-24 23:47:38 -07:00
3501 changed files with 415303 additions and 80415 deletions
**LangChain** is a framework for developing applications powered by language models. It enables applications that:
- **Are context-aware**: connect a language model to sources of context (prompt instructions, few shot examples, content to ground its response in, etc.)
- **Reason**: rely on a language model to reason (about how to answer based on provided context, what actions to take, etc.)
**LangChain** is a framework for developing applications powered by large language models (LLMs).
This framework consists of several parts.
- **LangChain Libraries**: The Python and JavaScript libraries. Contains interfaces and integrations for a myriad of components, a basic run time for combining these components into chains and agents, and off-the-shelf implementations of chains and agents.
- **[LangChain Templates](templates)**: A collection of easily deployable reference architectures for a wide variety of tasks.
- **[LangServe](https://github.com/langchain-ai/langserve)**: A library for deploying LangChain chains as a REST API.
- **[LangSmith](https://smith.langchain.com)**: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.
- **[LangGraph](https://python.langchain.com/docs/langgraph)**: LangGraph is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain. It extends the LangChain Expression Language with the ability to coordinate multiple chains (or actors) across multiple steps of computation in a cyclic manner.
For these applications, LangChain simplifies the entire application lifecycle:
The LangChain libraries themselves are made up of several different packages.
- **[`langchain-core`](libs/core)**: Base abstractionsand LangChain Expression Language.
- **[`langchain-community`](libs/community)**: Third party integrations.
- **[`langchain`](libs/langchain)**: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- **Open-source libraries**: Build your applications using LangChain's [modular building blocks](https://python.langchain.com/docs/expression_language/) and [components](https://python.langchain.com/docs/modules/). Integrate with hundreds of [third-party providers](https://python.langchain.com/docs/integrations/platforms/).
- **Productionization**: Inspect, monitor, and evaluate your apps with [LangSmith](https://python.langchain.com/docs/langsmith/) so that you can constantly optimize and deploy with confidence.
- **Deployment**: Turn any chain into a REST API with [LangServe](https://python.langchain.com/docs/langserve).
### Open-source libraries
- **`langchain-core`**: Base abstractions and LangChain Expression Language.
- **`langchain-community`**: Third party integrations.
- Some integrations have been further split into **partner packages** that only rely on **`langchain-core`**. Examples include **`langchain_openai`** and **`langchain_anthropic`**.
- **`langchain`**: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- **[`LangGraph`](https://python.langchain.com/docs/langgraph)**: A library for building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.
### Productionization:
- **[LangSmith](https://python.langchain.com/docs/langsmith)**: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.
### Deployment:
- **[LangServe](https://python.langchain.com/docs/langserve)**: A library for deploying LangChain chains as REST APIs.

@@ -72,34 +78,51 @@ And much more! Head to the [Use cases](https://python.langchain.com/docs/use_cas
## 🚀 How does LangChain help?
The main value props of the LangChain libraries are:
1.**Components**: composable tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
1.**Components**: composable building blocks, tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
2.**Off-the-shelf chains**: built-in assemblages of components for accomplishing higher-level tasks
Off-the-shelf chains make it easy to get started. Components make it easy to customize existing chains and build new ones.
## LangChain Expression Language (LCEL)
LCEL is the foundation of many of LangChain's components, and is a declarative way to compose chains. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains.
- **[Overview](https://python.langchain.com/docs/expression_language/)**: LCEL and its benefits
- **[Interface](https://python.langchain.com/docs/expression_language/interface)**: The standard interface for LCEL objects
- **[Primitives](https://python.langchain.com/docs/expression_language/primitives)**: More on the primitives LCEL includes
## Components
Components fall into the following **modules**:
**📃 Model I/O:**
This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.
This includes [prompt management](https://python.langchain.com/docs/modules/model_io/prompts/), [prompt optimization](https://python.langchain.com/docs/modules/model_io/prompts/example_selectors/), a generic interface for [chat models](https://python.langchain.com/docs/modules/model_io/chat/) and [LLMs](https://python.langchain.com/docs/modules/model_io/llms/), and common utilities for working with [model outputs](https://python.langchain.com/docs/modules/model_io/output_parsers/).
**📚 Retrieval:**
Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.
Retrieval Augmented Generation involves [loading data](https://python.langchain.com/docs/modules/data_connection/document_loaders/) from a variety of sources, [preparing it](https://python.langchain.com/docs/modules/data_connection/document_loaders/), [then retrieving it](https://python.langchain.com/docs/modules/data_connection/retrievers/) for use in the generation step.
**🤖 Agents:**
Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
Agents allow an LLM autonomy over how a task is accomplished. Agents make decisions about which Actions to take, then take that Action, observe the result, and repeat until the task is complete done. LangChain provides a [standard interface for agents](https://python.langchain.com/docs/modules/agents/), a [selection of agents](https://python.langchain.com/docs/modules/agents/agent_types/) to choose from, and examples of end-to-end agents.
## 📖 Documentation
Please see [here](https://python.langchain.com) for full documentation, which includes:
- [Getting started](https://python.langchain.com/docs/get_started/introduction): installation, setting up the environment, simple examples
-Overview of the [interfaces](https://python.langchain.com/docs/expression_language/), [modules](https://python.langchain.com/docs/modules/), and [integrations](https://python.langchain.com/docs/integrations/providers)
-[Use case](https://python.langchain.com/docs/use_cases/qa_structured/sql) walkthroughs and best practice [guides](https://python.langchain.com/docs/guides/adapters/openai)
- [LangSmith](https://python.langchain.com/docs/langsmith/), [LangServe](https://python.langchain.com/docs/langserve), and [LangChain Template](https://python.langchain.com/docs/templates/) overviews
- [Reference](https://api.python.langchain.com): full API docs
-[Use case](https://python.langchain.com/docs/use_cases/) walkthroughs and best practice [guides](https://python.langchain.com/docs/guides/)
-Overviews of the [interfaces](https://python.langchain.com/docs/expression_language/), [components](https://python.langchain.com/docs/modules/), and [integrations](https://python.langchain.com/docs/integrations/providers)
You can also check out the full [API Reference docs](https://api.python.langchain.com).
## 🌐 Ecosystem
- [🦜🛠️ LangSmith](https://python.langchain.com/docs/langsmith/): Tracing and evaluating your language model applications and intelligent agents to help you move from prototype to production.
- [🦜🕸️ LangGraph](https://python.langchain.com/docs/langgraph): Creating stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain primitives.
- [🦜🏓 LangServe](https://python.langchain.com/docs/langserve): Deploying LangChain runnables and chains as REST APIs.
- [LangChain Templates](https://python.langchain.com/docs/templates/): Example applications hosted with LangServe.
"query = \"Give me company names that are interesting investments based on EV / NTM and NTM rev growth. Consider EV / NTM multiples vs historical?\"\n",
[press_releases.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/press_releases.ipynb) | Retrieve and query company press release data powered by [Kay.ai](https://kay.ai).
[program_aided_language_model.i...](https://github.com/langchain-ai/langchain/tree/master/cookbook/program_aided_language_model.ipynb) | Implement program-aided language models as described in the provided research paper.
[qa_citations.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/qa_citations.ipynb) | Different ways to get a model to cite its sources.
[rag_upstage_layout_analysis_groundedness_check.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/rag_upstage_layout_analysis_groundedness_check.ipynb) | End-to-end RAG example using Upstage Layout Analysis and Groundedness Check.
[retrieval_in_sql.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/retrieval_in_sql.ipynb) | Perform retrieval-augmented-generation (rag) on a PostgreSQL database using pgvector.
[sales_agent_with_context.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/sales_agent_with_context.ipynb) | Implement a context-aware ai sales agent, salesgpt, that can have natural sales conversations, interact with other systems, and use a product knowledge base to discuss a company's offerings.
[self_query_hotel_search.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/self_query_hotel_search.ipynb) | Build a hotel room search feature with self-querying retrieval, using a specific hotel recommendation dataset.
"Apply to the [`LLaMA2`](https://arxiv.org/pdf/2307.09288.pdf) paper. \n",
"\n",
"We use the Unstructured [`partition_pdf`](https://unstructured-io.github.io/unstructured/bricks/partition.html#partition-pdf), which segments a PDF document by using a layout model. \n",
"We use the Unstructured [`partition_pdf`](https://unstructured-io.github.io/unstructured/core/partition.html#partition-pdf), which segments a PDF document by using a layout model. \n",
"\n",
"This layout model makes it possible to extract elements, such as tables, from pdfs. \n",
"query = \"What percentage of CPI is dedicated to Housing, and how does it compare to the combined percentage of Medical Care, Apparel, and Other Goods and Services?\"\n",
"suffix_for_images = \" Include any pie charts, graphs, or tables.\"\n",
" raise ValueError(\"a KEYSPACE environment variable must be set\")\n",
"\n",
"session.set_keyspace(keyspace)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup Database"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This needs to be done one time only!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The dataset used is from Kaggle, the [Environmental Sensor Telemetry Data](https://www.kaggle.com/datasets/garystafford/environmental-sensor-data-132k?select=iot_telemetry_data.csv). The next cell will download and unzip the data into a Pandas dataframe. The following cell is instructions to download manually. \n",
"\n",
"The net result of this section is you should have a Pandas dataframe variable `df`."
"with zip_file.open(csv_file_name) as csv_file:\n",
" df = pd.read_csv(csv_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Download Manually"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can download the `.zip` file and unpack the `.csv` contained within. Comment in the next line, and adjust the path to this `.csv` file appropriately."
"WITH COMMENT = 'Data from environmental IoT room sensors. Columns include device identifier, timestamp (ts) of the data collection, carbon monoxide level (co), relative humidity, light presence, LPG concentration, motion detection, smoke concentration, and temperature (temp). Data is partitioned by day and device.';\n",
" description=\"A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\",\n",
"Here is your task: In the {keyspace} keyspace, find the total number of times the temperature of each device has exceeded 23 degrees on July 14, 2020.\n",
" Create a summary report including the name of the room. Use Pandas if helpful.\n",
"* Set the MongoDB connection string. Follow the steps [here](https://www.mongodb.com/docs/manual/reference/connection-string/) to get the connection string from the Atlas UI.\n",
"\n",
"* Set the OpenAI API key. Steps to obtain an API key as [here](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "b56412ae",
"metadata": {},
"outputs": [],
"source": [
"import getpass"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "16a20d7a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Enter your MongoDB connection string:········\n"
]
}
],
"source": [
"MONGODB_URI = getpass.getpass(\"Enter your MongoDB connection string:\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "978682d4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Enter your OpenAI API key:········\n"
]
}
],
"source": [
"OPENAI_API_KEY = getpass.getpass(\"Enter your OpenAI API key:\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "606081c5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"········\n"
]
}
],
"source": [
"# Optional-- If you want to enable Langsmith -- good for debugging\n",
"rag_chain = retriever_chain | rag_prompt | model | parse_output"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "9618d395",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The best movie to watch when feeling down could be \"Last Action Hero.\" It\\'s a fun and action-packed film that blends reality and fantasy, offering an escape from the real world and providing an entertaining distraction.'"
"'I apologize for the confusion. Another movie that might lift your spirits when you\\'re feeling sad is \"Smilla\\'s Sense of Snow.\" It\\'s a mystery thriller that could engage your mind and distract you from your sadness with its intriguing plot and suspenseful storyline.'"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with_message_history.invoke(\n",
" {\n",
" \"question\": \"Hmmm..I don't want to watch that one. Can you suggest something else?\"\n",
"'For a lighter movie option, you might enjoy \"Cousins.\" It\\'s a comedy film set in Barcelona with action and humor, offering a fun and entertaining escape from reality. The storyline is engaging and filled with comedic moments that could help lift your spirits.'"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with_message_history.invoke(\n",
" {\"question\": \"How about something more light?\"},\n",
"## Step 7: Get faster responses using Semantic Cache\n",
"\n",
"**NOTE:** Semantic cache only caches the input to the LLM. When using it in retrieval chains, remember that documents retrieved can change between runs resulting in cache misses for semantically similar queries."
"# RAG using Upstage Layout Analysis and Groundedness Check\n",
"This example illustrates RAG using [Upstage](https://python.langchain.com/docs/integrations/providers/upstage/) Layout Analysis and Groundedness Check."
"# SalesGPT - Your Context-Aware AI Sales Assistant With Knowledge Base\n",
"# SalesGPT - Context-Aware AI Sales Assistant With Knowledge Base and Ability Generate Stripe Payment Links\n",
"\n",
"This notebook demonstrates an implementation of a **Context-Aware** AI Sales agent with a Product Knowledge Base. \n",
"This notebook demonstrates an implementation of a **Context-Aware** AI Sales agent with a Product Knowledge Base which can actually close sales. \n",
"\n",
"This notebook was originally published at [filipmichalsky/SalesGPT](https://github.com/filip-michalsky/SalesGPT) by [@FilipMichalsky](https://twitter.com/FilipMichalsky).\n",
"\n",
"SalesGPT is context-aware, which means it can understand what section of a sales conversation it is in and act accordingly.\n",
" \n",
"As such, this agent can have a natural sales conversation with a prospect and behaves based on the conversation stage. Hence, this notebook demonstrates how we can use AI to automate sales development representatives activities, such as outbound sales calls. \n",
"As such, this agent can have a natural sales conversation with a prospect and behaves based on the conversation stage. Hence, this notebook demonstrates how we can use AI to automate sales development representatives activites, such as outbound sales calls. \n",
"\n",
"Additionally, the AI Sales agent has access to tools, which allow it to interact with other systems.\n",
"\n",
"Here, we show how the AI Sales Agent can use a **Product Knowledge Base** to speak about a particular's company offerings,\n",
"hence increasing relevance and reducing hallucinations.\n",
"\n",
"We leverage the [`langchain`](https://github.com/langchain-ai/langchain) library in this implementation, specifically [Custom Agent Configuration](https://langchain-langchain.vercel.app/docs/modules/agents/how_to/custom_agent_with_tool_retrieval) and are inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) architecture ."
"Furthermore, we show how our AI Sales Agent can **generate sales** by integration with the AI Agent Highway called [Mindware](https://www.mindware.co/). In practice, this allows the agent to autonomously generate a payment link for your customers **to pay for your products via Stripe**.\n",
"\n",
"We leverage the [`langchain`](https://github.com/hwchase17/langchain) library in this implementation, specifically [Custom Agent Configuration](https://langchain-langchain.vercel.app/docs/modules/agents/how_to/custom_agent_with_tool_retrieval) and are inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) architecture ."
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting only from the following options:\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting ony from the following options:\n",
" 1. Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\n",
" 2. Qualification: Qualify the prospect by confirming if they are the right person to talk to regarding your product/service. Ensure that they have the authority to make purchasing decisions.\n",
" 3. Value proposition: Briefly explain how your product/service can benefit the prospect. Focus on the unique selling points and value proposition of your product/service that sets it apart from competitors.\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting only from the following options:\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting ony from the following options:\n",
" 1. Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\n",
" 2. Qualification: Qualify the prospect by confirming if they are the right person to talk to regarding your product/service. Ensure that they have the authority to make purchasing decisions.\n",
" 3. Value proposition: Briefly explain how your product/service can benefit the prospect. Focus on the unique selling points and value proposition of your product/service that sets it apart from competitors.\n",
"\"I'm doing great, thank you for asking! As a Business Development Representative at Sleep Haven, I wanted to reach out to see if you are looking to achieve a better night's sleep. We provide premium mattresses that offer the most comfortable and supportive sleeping experience possible. Are you interested in exploring our sleep solutions? <END_OF_TURN>\""
"{'salesperson_name': 'Ted Lasso',\n",
" 'salesperson_role': 'Business Development Representative',\n",
" 'company_name': 'Sleep Haven',\n",
" 'company_business': 'Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers.',\n",
" 'company_values': \"Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.\",\n",
" 'conversation_purpose': 'find out whether they are looking to achieve better sleep via buying a premier mattress.',\n",
" 'conversation_history': 'Hello, this is Ted Lasso from Sleep Haven. How are you doing today? <END_OF_TURN>\\nUser: I am well, howe are you?<END_OF_TURN>',\n",
" 'conversation_type': 'call',\n",
" 'conversation_stage': 'Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional. Your greeting should be welcoming. Always clarify in your greeting the reason why you are contacting the prospect.',\n",
" 'text': \"I'm doing well, thank you for asking. The reason I'm calling is to discuss how Sleep Haven can help enhance your sleep quality with our premium mattresses. Are you currently looking for ways to achieve a better night's sleep? <END_OF_TURN>\"}"
]
},
"execution_count": 8,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sales_conversation_utterance_chain.run(\n",
" salesperson_name=\"Ted Lasso\",\n",
" salesperson_role=\"Business Development Representative\",\n",
" company_name=\"Sleep Haven\",\n",
" company_business=\"Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers.\",\n",
" company_values=\"Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.\",\n",
" conversation_purpose=\"find out whether they are looking to achieve better sleep via buying a premier mattress.\",\n",
" conversation_history=\"Hello, this is Ted Lasso from Sleep Haven. How are you doing today? <END_OF_TURN>\\nUser: I am well, howe are you?<END_OF_TURN>\",\n",
" conversation_type=\"call\",\n",
" conversation_stage=conversation_stages.get(\n",
" \"1\",\n",
" \"Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\",\n",
" ),\n",
"sales_conversation_utterance_chain.invoke(\n",
" {\n",
" \"salesperson_name\": \"Ted Lasso\",\n",
" \"salesperson_role\": \"Business Development Representative\",\n",
" \"company_name\": \"Sleep Haven\",\n",
" \"company_business\": \"Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed tomeet the unique needs of our customers.\",\n",
" \"company_values\": \"Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.\",\n",
" \"conversation_purpose\": \"find out whether they are looking to achieve better sleep via buying a premier mattress.\",\n",
" \"conversation_history\": \"Hello, this is Ted Lasso from Sleep Haven. How are you doing today? <END_OF_TURN>\\nUser: I am well, howe are you?<END_OF_TURN>\",\n",
" \"Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\",\n",
" description=\"useful for when you need to answer questions about product information\",\n",
" )\n",
" ]\n",
"\n",
" return tools"
" return knowledge_base"
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 10,
"metadata": {},
"outputs": [
{
@@ -485,16 +485,18 @@
"text": [
"Created a chunk of size 940, which is longer than the specified 10\n",
"Created a chunk of size 844, which is longer than the specified 10\n",
"Created a chunk of size 837, which is longer than the specified 10\n"
"Created a chunk of size 837, which is longer than the specified 10\n",
"/Users/filipmichalsky/Odyssey/sales_bot/SalesGPT/env/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:117: LangChainDeprecationWarning: The function `run` was deprecated in LangChain 0.1.0 and will be removed in 0.2.0. Use invoke instead.\n",
" warn_deprecated(\n"
]
},
{
"data": {
"text/plain": [
"' We have four products available: the Classic Harmony Spring Mattress, the Plush Serenity Bamboo Mattress, the Luxury Cloud-Comfort Memory Foam Mattress, and the EcoGreen Hybrid Latex Mattress. Each product is available in different sizes, with the Classic Harmony Spring Mattress available in Queen and King sizes, the Plush Serenity Bamboo Mattress available in King size, the Luxury Cloud-Comfort Memory Foam Mattress available in Twin, Queen, and King sizes, and the EcoGreen Hybrid Latex Mattress available in Twin and Full sizes.'"
"'The Sleep Haven products available are:\\n\\n1. Luxury Cloud-Comfort Memory Foam Mattress\\n2. Classic Harmony Spring Mattress\\n3. EcoGreen Hybrid Latex Mattress\\n4. Plush Serenity Bamboo Mattress\\n\\nEach product has its unique features and price point.'"
]
},
"execution_count": 11,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -508,12 +510,199 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up the SalesGPT Controller with the Sales Agent and Stage Analyzer and a Knowledge Base"
"### Payment gateway"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to set up your AI agent to use a payment gateway to generate payment links for your users you need two things:\n",
"\n",
"1. Sign up for a Stripe account and obtain a STRIPE API KEY\n",
"2. Create products you would like to sell in the Stripe UI. Then follow out example of `example_product_price_id_mapping.json`\n",
"to feed the product name to price_id mapping which allows you to generate the payment links."
" description=\"useful for when you need to answer questions about product information or services offered, availability and their costs.\",\n",
" ),\n",
" Tool(\n",
" name=\"GeneratePaymentLink\",\n",
" func=generate_stripe_payment_link,\n",
" description=\"useful to close a transaction with a customer. You need to include product name and quantity and customer name in the query input.\",\n",
" ),\n",
" ]\n",
"\n",
" return tools"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up the SalesGPT Controller with the Sales Agent and Stage Analyzer\n",
"\n",
"#### The Agent has access to a Knowledge Base and can autonomously sell your products via Stripe"
"Created a chunk of size 940, which is longer than the specified 10\n",
"Created a chunk of size 844, which is longer than the specified 10\n",
"Created a chunk of size 837, which is longer than the specified 10\n"
"Created a chunk of size 837, which is longer than the specified 10\n",
"/Users/filipmichalsky/Odyssey/sales_bot/SalesGPT/env/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:117: LangChainDeprecationWarning: The class `langchain.agents.agent.LLMSingleActionAgent` was deprecated in langchain 0.1.0 and will be removed in 0.2.0. Use Use new agent constructor methods like create_react_agent, create_json_agent, create_structured_chat_agent, etc. instead.\n",
" warn_deprecated(\n"
]
}
],
@@ -907,7 +1095,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
@@ -917,7 +1105,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 22,
"metadata": {},
"outputs": [
{
@@ -934,14 +1122,14 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Hello, this is Ted Lasso from Sleep Haven. How are you doing today?\n"
"Ted Lasso: Good day! This is Ted Lasso from Sleep Haven. How are you doing today?\n"
]
}
],
@@ -951,18 +1139,18 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\n",
" \"I am well, how are you? I would like to learn more about your mattresses.\"\n",
" \"I am well, how are you? I would like to learn more about your services.\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 25,
"metadata": {},
"outputs": [
{
@@ -977,92 +1165,32 @@
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: I'm glad to hear that you're doing well! As for our mattresses, at Sleep Haven, we provide customers with the most comfortable and supportive sleeping experience possible. Our high-quality mattresses are designed to meet the unique needs of our customers. Can I ask what specifically you'd like to learn more about? \n"
]
}
],
"source": [
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\"Yes, what materials are you mattresses made from?\")"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Conversation Stage: Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.\n"
]
}
],
"source": [
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Our mattresses are made from a variety of materials, depending on the model. We have the EcoGreen Hybrid Latex Mattress, which is made from 100% natural latex harvested from eco-friendly plantations. The Plush Serenity Bamboo Mattress features a layer of plush, adaptive foam and a base of high-resilience support foam, with a bamboo-infused top layer. The Luxury Cloud-Comfort Memory Foam Mattress has an innovative, temperature-sensitive memory foam layer and a high-density foam base with cooling gel-infused particles. Finally, the Classic Harmony Spring Mattress has a robust inner spring construction and layers of plush padding, with a quilted top layer and a natural cotton cover. Is there anything specific you'd like to know about these materials?\n"
]
}
],
"source": [
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: I'm doing great, thank you for asking! I'm glad to hear you're interested. Sleep Haven is a premium mattress company, and we're all about offering the best sleep solutions, including top-notch mattresses, pillows, and bedding accessories. Our mission is to help you achieve a better night's sleep. May I know if you're looking to enhance your sleep experience with a new mattress or bedding accessories? \n"
]
}
],
"source": [
"sales_agent.human_step(\n",
" \"Yes, I am looking for a queen sized mattress. Do you have any mattresses in queen size?\"\n",
")"
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Conversation Stage: Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.\n"
]
}
],
"outputs": [],
"source": [
"sales_agent.determine_conversation_stage()"
"sales_agent.human_step(\n",
" \"Yes, I would like to improve my sleep. Can you tell me more about your products?\"\n",
")"
]
},
{
@@ -1074,7 +1202,24 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Yes, we do have queen-sized mattresses available. We offer the Luxury Cloud-Comfort Memory Foam Mattress and the Classic Harmony Spring Mattress in queen size. Both mattresses provide exceptional comfort and support. Is there anything specific you would like to know about these options?\n"
"Conversation Stage: Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.\n"
]
}
],
"source": [
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Absolutely, I'd be happy to share more about our products. At Sleep Haven, we offer a variety of high-quality mattresses designed to cater to different sleeping preferences and needs. Whether you're looking for memory foam's comfort, the support of hybrid mattresses, or the breathability of natural latex, we have options for everyone. Our pillows and bedding accessories are similarly curated to enhance your sleep quality. Every product is built with the aim of helping you achieve the restful night's sleep you deserve. What specific features are you looking for in a mattress? \n"
]
}
],
@@ -1084,16 +1229,16 @@
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\"Yea, compare and contrast those two options, please.\")"
"sales_agent.human_step(\"What mattresses do you have and how much do they cost?\")"
]
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 32,
"metadata": {},
"outputs": [
{
@@ -1110,14 +1255,14 @@
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: The Luxury Cloud-Comfort Memory Foam Mattress is priced at $999 and is available in Twin, Queen, and King sizes. It features an innovative, temperature-sensitive memory foam layer and a high-density foam base. On the other hand, the Classic Harmony Spring Mattress is priced at $1,299 and is available in Queen and King sizes. It features a robust inner spring construction and layers of plush padding. Both mattresses provide exceptional comfort and support, but the Classic Harmony Spring Mattress may be a better option if you prefer the traditional feel of an inner spring mattress. Do you have any other questions about these options?\n"
"Ted Lasso: We offer two primary types of mattresses at Sleep Haven. The first is our Luxury Cloud-Comfort Memory Foam Mattress, which is priced at $999 and comes in Twin, Queen, and King sizes. The second is our Classic Harmony Spring Mattress, priced at $1,299, available in Queen and King sizes. Both are designed to provide exceptional comfort and support for a better night's sleep. Which type of mattress would you be interested in learning more about? \n"
]
}
],
@@ -1127,14 +1272,66 @@
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\n",
" \"Great, thanks, that's it. I will talk to my wife and call back if she is onboard. Have a good day!\"\n",
" \"Okay.I would like to order two Memory Foam mattresses in Twin size please.\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Conversation Stage: Close: Ask for the sale by proposing a next step. This could be a demo, a trial or a meeting with decision-makers. Ensure to summarize what has been discussed and reiterate the benefits.\n"
]
}
],
"source": [
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Fantastic choice! You're on your way to a better night's sleep with our Luxury Cloud-Comfort Memory Foam Mattresses. I've generated a payment link for two Twin size mattresses for you. Here is the link to complete your purchase: https://buy.stripe.com/test_6oEg28e3V97BdDabJn. Is there anything else I can assist you with today? \n"
]
}
],
"source": [
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\n",
" \"Great, thanks! I will discuss with my wife and will buy it if she is onboard. Have a good day!\"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I should research ChatGPT to answer this question.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I should research ChatGPT to answer this question.\n",
"Action: Search\n",
"Action Input: \"ChatGPT\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mNov 30, 2022 ... We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer... ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large... ChatGPT. We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer... Feb 2, 2023 ... ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after... 2 days ago ... ChatGPT recently launched a new version of its own plagiarism detection tool, with hopes that it will squelch some of the criticism around how... An API for accessing new AI models developed by OpenAI. Feb 19, 2023 ... ChatGPT is an AI chatbot system that OpenAI released in November to show off and test what a very large, powerful AI system can accomplish. You... ChatGPT is fine-tuned from GPT-3.5, a language model trained to produce text. ChatGPT was optimized for dialogue by using Reinforcement Learning with Human... 3 days ago ... Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. Dec 1, 2022 ... ChatGPT is a natural language processing tool driven by AI technology that allows you to have human-like conversations and much more with a...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\u001b[0m\n",
"Action Input: \"ChatGPT\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mNov 30, 2022 ... We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer... ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large... ChatGPT. We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer... Feb 2, 2023 ... ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after... 2 days ago ... ChatGPT recently launched a new version of its own plagiarism detection tool, with hopes that it will squelch some of the criticism around how... An API for accessing new AI models developed by OpenAI. Feb 19, 2023 ... ChatGPT is an AI chatbot system that OpenAI released in November to show off and test what a very large, powerful AI system can accomplish. You... ChatGPT is fine-tuned from GPT-3.5, a language model trained to produce text. ChatGPT was optimized for dialogue by using Reinforcement Learning with Human... 3 days ago ... Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. Dec 1, 2022 ... ChatGPT is a natural language processing tool driven by AI technology that allows you to have human-like conversations and much more with a...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\u001B[0m\n",
"Cell \u001B[0;32mIn[36], line 1\u001B[0m\n\u001B[0;32m----> 1\u001B[0m \u001B[43magent_executor\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43minvoke\u001B[49m\u001B[43m(\u001B[49m\u001B[43m{\u001B[49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[38;5;124;43minput\u001B[39;49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[43m:\u001B[49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[38;5;124;43mWhat is ChatGPT?\u001B[39;49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[43m}\u001B[49m\u001B[43m)\u001B[49m\n",
"agent_executor.invoke({\"input\": \"What is ChatGPT?\"})"
]
},
{
@@ -179,15 +196,15 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to find out who developed ChatGPT\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I need to find out who developed ChatGPT\n",
"Action: Search\n",
"Action Input: Who developed ChatGPT\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large... Feb 15, 2023 ... Who owns Chat GPT? Chat GPT is owned and developed by AI research and deployment company, OpenAI. The organization is headquartered in San... Feb 8, 2023 ... ChatGPT is an AI chatbot developed by San Francisco-based startup OpenAI. OpenAI was co-founded in 2015 by Elon Musk and Sam Altman and is... Dec 7, 2022 ... ChatGPT is an AI chatbot designed and developed by OpenAI. The bot works by generating text responses based on human-user input, like questions... Jan 12, 2023 ... In 2019, Microsoft invested $1 billion in OpenAI, the tiny San Francisco company that designed ChatGPT. And in the years since, it has quietly... Jan 25, 2023 ... The inside story of ChatGPT: How OpenAI founder Sam Altman built the world's hottest technology with billions from Microsoft. Dec 3, 2022 ... ChatGPT went viral on social media for its ability to do anything from code to write essays. · The company that created the AI chatbot has a... Jan 17, 2023 ... While many Americans were nursing hangovers on New Year's Day, 22-year-old Edward Tian was working feverishly on a new app to combat misuse... ChatGPT is a language model created by OpenAI, an artificial intelligence research laboratory consisting of a team of researchers and engineers focused on... 1 day ago ... Everyone is talking about ChatGPT, developed by OpenAI. This is such a great tool that has helped to make AI more accessible to a wider...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: ChatGPT was developed by OpenAI.\u001b[0m\n",
"Action Input: Who developed ChatGPT\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large... Feb 15, 2023 ... Who owns Chat GPT? Chat GPT is owned and developed by AI research and deployment company, OpenAI. The organization is headquartered in San... Feb 8, 2023 ... ChatGPT is an AI chatbot developed by San Francisco-based startup OpenAI. OpenAI was co-founded in 2015 by Elon Musk and Sam Altman and is... Dec 7, 2022 ... ChatGPT is an AI chatbot designed and developed by OpenAI. The bot works by generating text responses based on human-user input, like questions... Jan 12, 2023 ... In 2019, Microsoft invested $1 billion in OpenAI, the tiny San Francisco company that designed ChatGPT. And in the years since, it has quietly... Jan 25, 2023 ... The inside story of ChatGPT: How OpenAI founder Sam Altman built the world's hottest technology with billions from Microsoft. Dec 3, 2022 ... ChatGPT went viral on social media for its ability to do anything from code to write essays. · The company that created the AI chatbot has a... Jan 17, 2023 ... While many Americans were nursing hangovers on New Year's Day, 22-year-old Edward Tian was working feverishly on a new app to combat misuse... ChatGPT is a language model created by OpenAI, an artificial intelligence research laboratory consisting of a team of researchers and engineers focused on... 1 day ago ... Everyone is talking about ChatGPT, developed by OpenAI. This is such a great tool that has helped to make AI more accessible to a wider...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer\n",
"Final Answer: ChatGPT was developed by OpenAI.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -202,7 +219,7 @@
}
],
"source": [
"agent_chain.run(input=\"Who developed it?\")"
"agent_executor.invoke({\"input\": \"Who developed it?\"})"
]
},
{
@@ -217,14 +234,14 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to simplify the conversation for a 5 year old.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I need to simplify the conversation for a 5 year old.\n",
"Action: Summary\n",
"Action Input: My daughter 5 years old\u001b[0m\n",
"Action Input: My daughter 5 years old\u001B[0m\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"\u001B[1m> Entering new LLMChain chain...\u001B[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThis is a conversation between a human and a bot:\n",
"\u001B[32;1m\u001B[1;3mThis is a conversation between a human and a bot:\n",
"\n",
"Human: What is ChatGPT?\n",
"AI: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\n",
@@ -232,16 +249,16 @@
"AI: ChatGPT was developed by OpenAI.\n",
"\n",
"Write a summary of the conversation for My daughter 5 years old:\n",
"\u001b[0m\n",
"\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3m\n",
"The conversation was about ChatGPT, an artificial intelligence chatbot. It was created by OpenAI and can send and receive images while chatting.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot created by OpenAI that can send and receive images while chatting.\u001b[0m\n",
"Observation: \u001B[33;1m\u001B[1;3m\n",
"The conversation was about ChatGPT, an artificial intelligence chatbot. It was created by OpenAI and can send and receive images while chatting.\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot created by OpenAI that can send and receive images while chatting.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -256,8 +273,8 @@
}
],
"source": [
"agent_chain.run(\n",
" input=\"Thanks. Summarize the conversation, for my daughter 5 years old.\"\n",
"agent_executor.invoke(\n",
" {\"input\": \"Thanks. Summarize the conversation, for my daughter 5 years old.\"}\n",
")"
]
},
@@ -289,9 +306,17 @@
}
],
"source": [
"print(agent_chain.memory.buffer)"
"print(agent_executor.memory.buffer)"
]
},
{
"cell_type": "markdown",
"id": "84ca95c30e262e00",
"metadata": {
"collapsed": false
},
"source": []
},
{
"cell_type": "markdown",
"id": "cc3d0aa4",
@@ -340,25 +365,9 @@
" ),\n",
"]\n",
"\n",
"prefix = \"\"\"Have a conversation with a human, answering the following questions as best you can. You have access to the following tools:\"\"\"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I should research ChatGPT to answer this question.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I should research ChatGPT to answer this question.\n",
"Action: Search\n",
"Action Input: \"ChatGPT\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mNov 30, 2022 ... We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer... ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large... ChatGPT. We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer... Feb 2, 2023 ... ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after... 2 days ago ... ChatGPT recently launched a new version of its own plagiarism detection tool, with hopes that it will squelch some of the criticism around how... An API for accessing new AI models developed by OpenAI. Feb 19, 2023 ... ChatGPT is an AI chatbot system that OpenAI released in November to show off and test what a very large, powerful AI system can accomplish. You... ChatGPT is fine-tuned from GPT-3.5, a language model trained to produce text. ChatGPT was optimized for dialogue by using Reinforcement Learning with Human... 3 days ago ... Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. Dec 1, 2022 ... ChatGPT is a natural language processing tool driven by AI technology that allows you to have human-like conversations and much more with a...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\u001b[0m\n",
"Action Input: \"ChatGPT\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mNov 30, 2022 ... We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer... ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large... ChatGPT. We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer... Feb 2, 2023 ... ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after... 2 days ago ... ChatGPT recently launched a new version of its own plagiarism detection tool, with hopes that it will squelch some of the criticism around how... An API for accessing new AI models developed by OpenAI. Feb 19, 2023 ... ChatGPT is an AI chatbot system that OpenAI released in November to show off and test what a very large, powerful AI system can accomplish. You... ChatGPT is fine-tuned from GPT-3.5, a language model trained to produce text. ChatGPT was optimized for dialogue by using Reinforcement Learning with Human... 3 days ago ... Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. Dec 1, 2022 ... ChatGPT is a natural language processing tool driven by AI technology that allows you to have human-like conversations and much more with a...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -396,7 +405,7 @@
}
],
"source": [
"agent_chain.run(input=\"What is ChatGPT?\")"
"agent_executor.invoke({\"input\": \"What is ChatGPT?\"})"
]
},
{
@@ -411,15 +420,15 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to find out who developed ChatGPT\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I need to find out who developed ChatGPT\n",
"Action: Search\n",
"Action Input: Who developed ChatGPT\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large... Feb 15, 2023 ... Who owns Chat GPT? Chat GPT is owned and developed by AI research and deployment company, OpenAI. The organization is headquartered in San... Feb 8, 2023 ... ChatGPT is an AI chatbot developed by San Francisco-based startup OpenAI. OpenAI was co-founded in 2015 by Elon Musk and Sam Altman and is... Dec 7, 2022 ... ChatGPT is an AI chatbot designed and developed by OpenAI. The bot works by generating text responses based on human-user input, like questions... Jan 12, 2023 ... In 2019, Microsoft invested $1 billion in OpenAI, the tiny San Francisco company that designed ChatGPT. And in the years since, it has quietly... Jan 25, 2023 ... The inside story of ChatGPT: How OpenAI founder Sam Altman built the world's hottest technology with billions from Microsoft. Dec 3, 2022 ... ChatGPT went viral on social media for its ability to do anything from code to write essays. · The company that created the AI chatbot has a... Jan 17, 2023 ... While many Americans were nursing hangovers on New Year's Day, 22-year-old Edward Tian was working feverishly on a new app to combat misuse... ChatGPT is a language model created by OpenAI, an artificial intelligence research laboratory consisting of a team of researchers and engineers focused on... 1 day ago ... Everyone is talking about ChatGPT, developed by OpenAI. This is such a great tool that has helped to make AI more accessible to a wider...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: ChatGPT was developed by OpenAI.\u001b[0m\n",
"Action Input: Who developed ChatGPT\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large... Feb 15, 2023 ... Who owns Chat GPT? Chat GPT is owned and developed by AI research and deployment company, OpenAI. The organization is headquartered in San... Feb 8, 2023 ... ChatGPT is an AI chatbot developed by San Francisco-based startup OpenAI. OpenAI was co-founded in 2015 by Elon Musk and Sam Altman and is... Dec 7, 2022 ... ChatGPT is an AI chatbot designed and developed by OpenAI. The bot works by generating text responses based on human-user input, like questions... Jan 12, 2023 ... In 2019, Microsoft invested $1 billion in OpenAI, the tiny San Francisco company that designed ChatGPT. And in the years since, it has quietly... Jan 25, 2023 ... The inside story of ChatGPT: How OpenAI founder Sam Altman built the world's hottest technology with billions from Microsoft. Dec 3, 2022 ... ChatGPT went viral on social media for its ability to do anything from code to write essays. · The company that created the AI chatbot has a... Jan 17, 2023 ... While many Americans were nursing hangovers on New Year's Day, 22-year-old Edward Tian was working feverishly on a new app to combat misuse... ChatGPT is a language model created by OpenAI, an artificial intelligence research laboratory consisting of a team of researchers and engineers focused on... 1 day ago ... Everyone is talking about ChatGPT, developed by OpenAI. This is such a great tool that has helped to make AI more accessible to a wider...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer\n",
"Final Answer: ChatGPT was developed by OpenAI.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -434,7 +443,7 @@
}
],
"source": [
"agent_chain.run(input=\"Who developed it?\")"
"agent_executor.invoke({\"input\": \"Who developed it?\"})"
]
},
{
@@ -449,14 +458,14 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to simplify the conversation for a 5 year old.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I need to simplify the conversation for a 5 year old.\n",
"Action: Summary\n",
"Action Input: My daughter 5 years old\u001b[0m\n",
"Action Input: My daughter 5 years old\u001B[0m\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"\u001B[1m> Entering new LLMChain chain...\u001B[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThis is a conversation between a human and a bot:\n",
"\u001B[32;1m\u001B[1;3mThis is a conversation between a human and a bot:\n",
"\n",
"Human: What is ChatGPT?\n",
"AI: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\n",
@@ -464,16 +473,16 @@
"AI: ChatGPT was developed by OpenAI.\n",
"\n",
"Write a summary of the conversation for My daughter 5 years old:\n",
"\u001b[0m\n",
"\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3m\n",
"The conversation was about ChatGPT, an artificial intelligence chatbot developed by OpenAI. It is designed to have conversations with humans and can also send and receive images.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI that can have conversations with humans and send and receive images.\u001b[0m\n",
"Observation: \u001B[33;1m\u001B[1;3m\n",
"The conversation was about ChatGPT, an artificial intelligence chatbot developed by OpenAI. It is designed to have conversations with humans and can also send and receive images.\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI that can have conversations with humans and send and receive images.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -488,8 +497,8 @@
}
],
"source": [
"agent_chain.run(\n",
" input=\"Thanks. Summarize the conversation, for my daughter 5 years old.\"\n",
"agent_executor.invoke(\n",
" {\"input\": \"Thanks. Summarize the conversation, for my daughter 5 years old.\"}\n",
" AIMessage(content='The result of \\\\(3 + 5^{2.743}\\\\) is approximately 300.04, and the result of \\\\(17.24 - 918.1241\\\\) is approximately -900.88.', response_metadata={'token_usage': {'completion_tokens': 44, 'prompt_tokens': 251, 'total_tokens': 295}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_b28b39ffa8', 'finish_reason': 'stop', 'logprobs': None}, id='run-d1161669-ed09-4b18-94bd-6d8530df5aa8-0')]}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"graph.invoke(\n",
" {\n",
" \"messages\": [\n",
" HumanMessage(\n",
" \"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\"\n",
" )\n",
" ]\n",
" }\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "073c074e-d722-42e0-85ec-c62c079207e4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'messages': [HumanMessage(content=\"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\"),\n",
"This notebook shows how to use VideoCaptioningChain, which is implemented using LangChain's ImageCaptionLoader and AssemblyAI to produce .srt files.\n",
"\n",
"This system autogenerates both subtitles and closed captions from a video URL."
"* use_logging (Default: True): Log the chain's processes in run manager\n",
"* frame_skip (Default: None): Choose how many video frames to skip during processing. Increasing it results in faster execution, but less accurate results. If None, frame skip is calculated manually based on the framerate. Set this to 0 to sample all frames.\n",
"* image_delta_threshold (Default: 3000000): Set the sensitivity for what the image processor considers a change in scenery in the video, used to delimit closed captions. Higher = less sensitive\n",
"* closed_caption_char_limit (Default: 20): Sets the character limit on closed captions\n",
"* closed_caption_similarity_threshold (Default: 80): Sets the percentage value to how similar two closed caption models should be in order to be clustered into one longer closed caption\n",
"* use_unclustered_video_models (Default: False): If true, closed captions that could not be clustered will be included. May result in spontaneous behaviour from closed captions such as very short lasting captions or fast-changing captions. Enabling this is experimental and not recommended"
### Introduction to LangChain with Harrison Chase, creator of LangChain
- [Building the Future with LLMs, `LangChain`, & `Pinecone`](https://youtu.be/nMniwlGyX-c) by [Pinecone](https://www.youtube.com/@pinecone-io)
- [LangChain and Weaviate with Harrison Chase and Bob van Luijt - Weaviate Podcast #36](https://youtu.be/lhby7Ql7hbk) by [Weaviate • Vector Database](https://www.youtube.com/@Weaviate)
- [LangChain Demo + Q&A with Harrison Chase](https://youtu.be/zaYTXQFR0_s?t=788) by [Full Stack Deep Learning](https://www.youtube.com/@FullStackDeepLearning)
- [LangChain Demo + Q&A with Harrison Chase](https://youtu.be/zaYTXQFR0_s?t=788) by [Full Stack Deep Learning](https://www.youtube.com/@The_Full_Stack)
- [LangChain Agents: Build Personal Assistants For Your Data (Q&A with Harrison Chase and Mayo Oshin)](https://youtu.be/gVkF8cwfBLI) by [Chat with data](https://www.youtube.com/@chatwithdata)
## Videos (sorted by views)
@@ -15,8 +15,8 @@
- [Using `ChatGPT` with YOUR OWN Data. This is magical. (LangChain OpenAI API)](https://youtu.be/9AXP7tCI9PI) by [TechLead](https://www.youtube.com/@TechLead)
- [First look - `ChatGPT` + `WolframAlpha` (`GPT-3.5` and Wolfram|Alpha via LangChain by James Weaver)](https://youtu.be/wYGbY811oMo) by [Dr Alan D. Thompson](https://www.youtube.com/@DrAlanDThompson)
- [LangChain explained - The hottest new Python framework](https://youtu.be/RoR4XJw8wIc) by [AssemblyAI](https://www.youtube.com/@AssemblyAI)
- [Chatbot with INFINITE MEMORY using `OpenAI` & `Pinecone` - `GPT-3`, `Embeddings`, `ADA`, `Vector DB`, `Semantic`](https://youtu.be/2xNzB7xq8nk) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)
- [LangChain for LLMs is... basically just an Ansible playbook](https://youtu.be/X51N9C-OhlE) by [David Shapiro ~ AI](https://www.youtube.com/@DavidShapiroAutomator)
- [Chatbot with INFINITE MEMORY using `OpenAI` & `Pinecone` - `GPT-3`, `Embeddings`, `ADA`, `Vector DB`, `Semantic`](https://youtu.be/2xNzB7xq8nk) by [David Shapiro ~ AI](https://www.youtube.com/@DaveShap)
- [LangChain for LLMs is... basically just an Ansible playbook](https://youtu.be/X51N9C-OhlE) by [David Shapiro ~ AI](https://www.youtube.com/@DaveShap)
- [Build your own LLM Apps with LangChain & `GPT-Index`](https://youtu.be/-75p09zFUJY) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [`BabyAGI` - New System of Autonomous AI Agents with LangChain](https://youtu.be/lg3kJvf1kXo) by [1littlecoder](https://www.youtube.com/@1littlecoder)
- [Run `BabyAGI` with Langchain Agents (with Python Code)](https://youtu.be/WosPGHPObx8) by [1littlecoder](https://www.youtube.com/@1littlecoder)
@@ -37,15 +37,15 @@
- [Building AI LLM Apps with LangChain (and more?) - LIVE STREAM](https://www.youtube.com/live/M-2Cj_2fzWI?feature=share) by [Nicholas Renotte](https://www.youtube.com/@NicholasRenotte)
- [`ChatGPT` with any `YouTube` video using langchain and `chromadb`](https://youtu.be/TQZfB2bzVwU) by [echohive](https://www.youtube.com/@echohive)
- [How to Talk to a `PDF` using LangChain and `ChatGPT`](https://youtu.be/v2i1YDtrIwk) by [Automata Learning Lab](https://www.youtube.com/@automatalearninglab)
- [Langchain Document Loaders Part 1: Unstructured Files](https://youtu.be/O5C0wfsen98) by [Merk](https://www.youtube.com/@merksworld)
- [LangChain - Prompt Templates (what all the best prompt engineers use)](https://youtu.be/1aRu8b0XNOQ) by [Nick Daigler](https://www.youtube.com/@nick_daigs)
- [Langchain Document Loaders Part 1: Unstructured Files](https://youtu.be/O5C0wfsen98) by [Merk](https://www.youtube.com/@heymichaeldaigler)
- [LangChain - Prompt Templates (what all the best prompt engineers use)](https://youtu.be/1aRu8b0XNOQ) by [Nick Daigler](https://www.youtube.com/@nickdaigler)
- [LangChain. Crear aplicaciones Python impulsadas por GPT](https://youtu.be/DkW_rDndts8) by [Jesús Conde](https://www.youtube.com/@0utKast)
- [Easiest Way to Use GPT In Your Products | LangChain Basics Tutorial](https://youtu.be/fLy0VenZyGc) by [Rachel Woods](https://www.youtube.com/@therachelwoods)
- [`BabyAGI` + `GPT-4` Langchain Agent with Internet Access](https://youtu.be/wx1z_hs5P6E) by [tylerwhatsgood](https://www.youtube.com/@tylerwhatsgood)
- [Learning LLM Agents. How does it actually work? LangChain, AutoGPT & OpenAI](https://youtu.be/mb_YAABSplk) by [Arnoldas Kemeklis](https://www.youtube.com/@processusAI)
- [Get Started with LangChain in `Node.js`](https://youtu.be/Wxx1KUWJFv4) by [Developers Digest](https://www.youtube.com/@DevelopersDigest)
- [LangChain + `OpenAI` tutorial: Building a Q&A system w/ own text data](https://youtu.be/DYOU_Z0hAwo) by [Samuel Chan](https://www.youtube.com/@SamuelChan)
- [Langchain + `Zapier` Agent](https://youtu.be/yribLAb-pxA) by [Merk](https://www.youtube.com/@merksworld)
- [Langchain + `Zapier` Agent](https://youtu.be/yribLAb-pxA) by [Merk](https://www.youtube.com/@heymichaeldaigler)
- [Connecting the Internet with `ChatGPT` (LLMs) using Langchain And Answers Your Questions](https://youtu.be/9Y0TBC63yZg) by [Kamalraj M M](https://www.youtube.com/@insightbuilder)
- [Build More Powerful LLM Applications for Business’s with LangChain (Beginners Guide)](https://youtu.be/sp3-WLKEcBg) by [No Code Blackbox](https://www.youtube.com/@nocodeblackbox)
- [LangFlow LLM Agent Demo for 🦜🔗LangChain](https://youtu.be/zJxDHaWt-6o) by [Cobus Greyling](https://www.youtube.com/@CobusGreylingZA)
@@ -82,7 +82,7 @@
- [Build a LangChain-based Semantic PDF Search App with No-Code Tools Bubble and Flowise](https://youtu.be/s33v5cIeqA4) by [Menlo Park Lab](https://www.youtube.com/@menloparklab)
- [LangChain Memory Tutorial | Building a ChatGPT Clone in Python](https://youtu.be/Cwq91cj2Pnc) by [Alejandro AO - Software & Ai](https://www.youtube.com/@alejandro_ao)
- [ChatGPT For Your DATA | Chat with Multiple Documents Using LangChain](https://youtu.be/TeDgIDqQmzs) by [Data Science Basics](https://www.youtube.com/@datasciencebasics)
- [`Llama Index`: Chat with Documentation using URL Loader](https://youtu.be/XJRoDEctAwA) by [Merk](https://www.youtube.com/@merksworld)
- [`Llama Index`: Chat with Documentation using URL Loader](https://youtu.be/XJRoDEctAwA) by [Merk](https://www.youtube.com/@heymichaeldaigler)
- [Using OpenAI, LangChain, and `Gradio` to Build Custom GenAI Applications](https://youtu.be/1MsmqMg3yUc) by [David Hundley](https://www.youtube.com/@dkhundley)
- [LangChain, Chroma DB, OpenAI Beginner Guide | ChatGPT with your PDF](https://youtu.be/FuqdVNB_8c0)
- [Build AI chatbot with custom knowledge base using OpenAI API and GPT Index](https://youtu.be/vDZAZuaXf48) by [Irina Nik](https://www.youtube.com/@irina_nik)
@@ -93,7 +93,7 @@
- [Build a Custom Chatbot with OpenAI: `GPT-Index` & LangChain | Step-by-Step Tutorial](https://youtu.be/FIDv6nc4CgU) by [Fabrikod](https://www.youtube.com/@fabrikod)
- [`Flowise` is an open-source no-code UI visual tool to build 🦜🔗LangChain applications](https://youtu.be/CovAPtQPU0k) by [Cobus Greyling](https://www.youtube.com/@CobusGreylingZA)
- [LangChain & GPT 4 For Data Analysis: The `Pandas` Dataframe Agent](https://youtu.be/rFQ5Kmkd4jc) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)
- [`GirlfriendGPT` - AI girlfriend with LangChain](https://youtu.be/LiN3D1QZGQw) by [Toolfinder AI](https://www.youtube.com/@toolfinderai)
- [`GirlfriendGPT` - AI girlfriend with LangChain](https://youtu.be/LiN3D1QZGQw) by [Girlfriend GPT](https://www.youtube.com/@girlfriendGPT)
- [How to build with Langchain 10x easier | ⛓️ LangFlow & `Flowise`](https://youtu.be/Ya1oGL7ZTvU) by [AI Jason](https://www.youtube.com/@AIJasonZ)
- [Getting Started With LangChain In 20 Minutes- Build Celebrity Search Application](https://youtu.be/_FpT1cwcSLg) by [Krish Naik](https://www.youtube.com/@krishnaik06)
- ⛓ [Vector Embeddings Tutorial – Code Your Own AI Assistant with `GPT-4 API` + LangChain + NLP](https://youtu.be/yfHHvmaMkcA?si=5uJhxoh2tvdnOXok) by [FreeCodeCamp.org](https://www.youtube.com/@freecodecamp)
- ⛓ [Prompt Engineering in Web Development | Using LangChain and Templates with OpenAI](https://youtu.be/pK6WzlTOlYw?si=fkcDQsBG2h-DM8uQ) by [Akamai Developer](https://www.youtube.com/@AkamaiDeveloper)
- ⛓ [Retrieval-Augmented Generation (RAG) using LangChain and `Pinecone` - The RAG Special Episode](https://youtu.be/J_tCD_J6w3s?si=60Mnr5VD9UED9bGG) by [Generative AI and Data Science On AWS](https://www.youtube.com/@GenerativeAIDataScienceOnAWS)
- ⛓ [Retrieval-Augmented Generation (RAG) using LangChain and `Pinecone` - The RAG Special Episode](https://youtu.be/J_tCD_J6w3s?si=60Mnr5VD9UED9bGG) by [Generative AI and Data Science On AWS](https://www.youtube.com/@GenerativeAIOnAWS)
- ⛓ [`LLAMA2 70b-chat` Multiple Documents Chatbot with Langchain & Streamlit |All OPEN SOURCE|Replicate API](https://youtu.be/vhghB81vViM?si=dszzJnArMeac7lyc) by [DataInsightEdge](https://www.youtube.com/@DataInsightEdge01)
- ⛓ [Chatting with 44K Fashion Products: LangChain Opportunities and Pitfalls](https://youtu.be/Zudgske0F_s?si=8HSshHoEhh0PemJA) by [Rabbitmetrics](https://www.youtube.com/@rabbitmetrics)
- ⛓ [Structured Data Extraction from `ChatGPT` with LangChain](https://youtu.be/q1lYg8JISpQ?si=0HctzOHYZvq62sve) by [MG](https://www.youtube.com/@MG_cafe)
As LangChain continues to grow, the surface area of documentation required to cover it continues to grow too.
This page provides guidelines for anyone writing documentation for LangChain, as well as some of our philosophies around
organization and structure.
## Philosophy
LangChain's documentation aspires to follow the [Diataxis framework](https://diataxis.fr).
Under this framework, all documentation falls under one of four categories:
- **Tutorials**: Lessons that take the reader by the hand through a series of conceptual steps to complete a project.
- An example of this is our [LCEL streaming guide](/docs/expression_language/streaming).
- Our guide on [custom components](/docs/modules/model_io/chat/custom_chat_model) is another one.
- **How-to guides**: Guides that take the reader through the steps required to solve a real-world problem.
- The clearest examples of this are our [Use case](/docs/use_cases/) quickstart pages.
- **Reference**: Technical descriptions of the machinery and how to operate it.
- Our [Runnable interface](/docs/expression_language/interface) page is an example of this.
- The [API reference pages](https://api.python.langchain.com/) are another.
- **Explanation**: Explanations that clarify and illuminate a particular topic.
- The [LCEL primitives pages](/docs/expression_language/primitives/sequence) are an example of this.
Each category serves a distinct purpose and requires a specific approach to writing and structuring the content.
## Taxonomy
Keeping the above in mind, we have sorted LangChain's docs into categories. It is helpful to think in these terms
when contributing new documentation:
### Getting started
The [getting started section](/docs/get_started/introduction) includes a high-level introduction to LangChain, a quickstart that
tours LangChain's various features, and logistical instructions around installation and project setup.
It contains elements of **How-to guides** and **Explanations**.
### Use cases
[Use cases](/docs/use_cases/) are guides that are meant to show how to use LangChain to accomplish a specific task (RAG, information extraction, etc.).
The quickstarts should be good entrypoints for first-time LangChain developers who prefer to learn by getting something practical prototyped,
then taking the pieces apart retrospectively. These should mirror what LangChain is good at.
The quickstart pages here should fit the **How-to guide** category, with the other pages intended to be **Explanations** of more
in-depth concepts and strategies that accompany the main happy paths.
:::note
The below sections are listed roughly in order of increasing level of abstraction.
:::
### Expression Language
[LangChain Expression Language (LCEL)](/docs/expression_language/) is the fundamental way that most LangChain components fit together, and this section is designed to teach
developers how to use it to build with LangChain's primitives effectively.
This section should contain **Tutorials** that teach how to stream and use LCEL primitives for more abstract tasks, **Explanations** of specific behaviors,
and some **References** for how to use different methods in the Runnable interface.
### Components
The [components section](/docs/modules) covers concepts one level of abstraction higher than LCEL.
Abstract base classes like `BaseChatModel` and `BaseRetriever` should be covered here, as well as core implementations of these base classes,
such as `ChatPromptTemplate` and `RecursiveCharacterTextSplitter`. Customization guides belong here too.
This section should contain mostly conceptual **Tutorials**, **References**, and **Explanations** of the components they cover.
:::note
As a general rule of thumb, the `Expression Language` and `Components` sections (with the exception of the `Composition` section of components) should
cover only components that exist in `langchain_core`.
:::
### Integrations
The [integrations](/docs/integrations/platforms/) are specific implementations of components. These often involve third-party APIs and services.
If this is the case, as a general rule, these are maintained by the third-party partner.
This section should contain mostly **Explanations** and **References**, though the actual content here is more flexible than other sections and more at the
discretion of the third-party provider.
:::note
Concepts covered in `Integrations` should generally exist in `langchain_community` or specific partner packages.
:::
### Guides and Ecosystem
The [Guides](/docs/guides) and [Ecosystem](/docs/langsmith/) sections should contain guides that address higher-level problems than the sections above.
This includes, but is not limited to, considerations around productionization and development workflows.
These should contain mostly **How-to guides**, **Explanations**, and **Tutorials**.
### API references
LangChain's API references should act as **References** (as the name implies), with some **Explanation**-focused content as well.
## Sample developer journey
We have set up our docs to assist a developer who is new to LangChain. Let's walk through the intended path:
- The developer lands on https://python.langchain.com, and reads through the introduction and the diagram.
- If they are just curious, they may be drawn to the [Quickstart](/docs/get_started/quickstart) to get a high-level tour of what LangChain contains.
- If they have a specific task in mind that they want to accomplish, they will be drawn to the Use-Case section. The use-case should provide a good, concrete hook that shows the value LangChain can provide them and be a good entrypoint to the framework.
- They can then move to learn more about the fundamentals of LangChain through the Expression Language sections.
- Next, they can learn about LangChain's various components and integrations.
- Finally, they can get additional knowledge through the Guides.
This is only an ideal, of course: sections will inevitably reference lower- or higher-level concepts that are documented in other sections.
## Guidelines
Here are some other guidelines you should think about when writing and organizing documentation.
### Linking to other sections
Because sections of the docs do not exist in a vacuum, it is important to link to other sections as often as possible
to allow a developer to learn more about an unfamiliar topic inline.
This includes linking to the API references as well as conceptual sections!
### Conciseness
In general, take a less-is-more approach. If a section with a good explanation of a concept already exists, you should link to it rather than
re-explain it, unless the concept you are documenting presents some new wrinkle.
Be concise, including in code samples.
### General style
- Use active voice and present tense whenever possible.
- Use examples and code snippets to illustrate concepts and usage.
- Use appropriate header levels (`#`, `##`, `###`, etc.) to organize the content hierarchically.
- Use bullet points and numbered lists to break down information into easily digestible chunks.
- Use tables (especially for **Reference** sections) and diagrams often to present information visually.
- Include the table of contents for longer documentation pages to help readers navigate the content, but hide it for shorter pages.
"# Logic for converting tools to string to go in prompt\n",
"def convert_tools(tools):\n",
" return \"\\n\".join([f\"{tool.name}: {tool.description}\" for tool in tools])"
]
},
{
"cell_type": "markdown",
"id": "260f5988",
"metadata": {},
"source": [
"Building an agent from a runnable usually involves a few things:\n",
"\n",
"1. Data processing for the intermediate steps. These need to be represented in a way that the language model can recognize them. This should be pretty tightly coupled to the instructions in the prompt\n",
"\n",
"2. The prompt itself\n",
"\n",
"3. The model, complete with stop tokens if needed\n",
"\n",
"4. The output parser - should be in sync with how the prompt specifies things to be formatted."
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m <tool>search</tool><tool_input>weather in New York\u001b[0m\u001b[36;1m\u001b[1;3m32 degrees\u001b[0m\u001b[32;1m\u001b[1;3m <tool>search</tool>\n",
"<tool_input>weather in New York\u001b[0m\u001b[36;1m\u001b[1;3m32 degrees\u001b[0m\u001b[32;1m\u001b[1;3m <final_answer>The weather in New York is 32 degrees\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
},
{
"data": {
"text/plain": [
"{'input': 'whats the weather in New york?',\n",
" 'output': 'The weather in New York is 32 degrees'}"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"agent_executor.invoke({\"input\": \"whats the weather in New york?\"})"
"With LCEL you can easily add [custom routing logic](/docs/expression_language/how_to/routing#using-a-custom-function) to your chain to dynamically determine the chain logic based on user input. All you need to do is define a function that given an input returns a `Runnable`.\n",
"\n",
"One especially useful technique is to use embeddings to route a query to the most relevant prompt. Here's a very simple example."
"A black hole is a region in space where gravity is extremely strong, so strong that nothing, not even light, can escape its gravitational pull. It is formed when a massive star collapses under its own gravity during a supernova explosion. The collapse causes an incredibly dense mass to be concentrated in a small volume, creating a gravitational field that is so intense that it warps space and time. Black holes have a boundary called the event horizon, which marks the point of no return for anything that gets too close. Beyond the event horizon, the gravitational pull is so strong that even light cannot escape, hence the name \"black hole.\" While we have a good understanding of black holes, there is still much to learn, especially about what happens inside them.\n"
]
}
],
"source": [
"print(chain.invoke(\"What's a black hole\"))"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "f261910d-1de1-4a01-8c8a-308db02b81de",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using MATH\n",
"Thank you for your kind words! I will do my best to break down the concept of a path integral for you.\n",
"\n",
"In mathematics and physics, a path integral is a mathematical tool used to calculate the probability amplitude or wave function of a particle or system of particles. It was introduced by Richard Feynman and is an integral over all possible paths that a particle can take to go from an initial state to a final state.\n",
"\n",
"To understand the concept better, let's consider an example. Suppose we have a particle moving from point A to point B in space. Classically, we would describe this particle's motion using a definite trajectory, but in quantum mechanics, particles can simultaneously take multiple paths from A to B.\n",
"\n",
"The path integral formalism considers all possible paths that the particle could take and assigns a probability amplitude to each path. These probability amplitudes are then added up, taking into account the interference effects between different paths.\n",
"\n",
"To calculate a path integral, we need to define an action, which is a mathematical function that describes the behavior of the system. The action is usually expressed in terms of the particle's position, velocity, and time.\n",
"\n",
"Once we have the action, we can write down the path integral as an integral over all possible paths. Each path is weighted by a factor determined by the action and the principle of least action, which states that a particle takes a path that minimizes the action.\n",
"\n",
"Mathematically, the path integral is expressed as:\n",
"\n",
"∫ e^(iS/ħ) D[x(t)]\n",
"\n",
"Here, S is the action, ħ is the reduced Planck's constant, and D[x(t)] represents the integration over all possible paths x(t) of the particle.\n",
"\n",
"By evaluating this integral, we can obtain the probability amplitude for the particle to go from the initial state to the final state. The absolute square of this amplitude gives us the probability of finding the particle in a particular state.\n",
"\n",
"Path integrals have proven to be a powerful tool in various areas of physics, including quantum mechanics, quantum field theory, and statistical mechanics. They allow us to study complex systems and calculate probabilities that are difficult to obtain using other methods.\n",
"\n",
"I hope this explanation helps you understand the concept of a path integral. If you have any further questions, feel free to ask!\n"
Example code for accomplishing common tasks with the LangChain Expression Language (LCEL). These examples show how to compose different Runnable (the core LCEL interface) components to achieve various tasks. If you're just getting acquainted with LCEL, the [Prompt + LLM](/docs/expression_language/cookbook/prompt_llm_parser) page is a good place to start.
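To show what composing components means in practice, here is a toy sketch of the pipe pattern in plain Python. The `Step` class is a hypothetical stand-in written for illustration only; it is not LangChain's `Runnable` implementation, and the "model" here is a fake function rather than a real LLM call.

```python
from typing import Any, Callable


class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # The composed step feeds this step's output into the next step.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Three fake components mirroring the prompt -> model -> parser shape.
prompt = Step(lambda inputs: f"tell me a short joke about {inputs['topic']}")
model = Step(lambda text: {"content": f"(fake model reply to: {text})"})
output_parser = Step(lambda message: message["content"])

chain = prompt | model | output_parser
result = chain.invoke({"topic": "ice cream"})
```

Each component only needs to agree with its neighbours on input and output types, which is what makes swapping one piece for another straightforward.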
"_template = \"\"\"Given the following conversation and a follow up question, rephrase the follow up question to be a standalone question, in its original language.\n",
"AIMessage(content='Harrison was employed at Kensho.')"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversational_qa_chain.invoke(\n",
" {\n",
" \"question\": \"where did harrison work?\",\n",
" \"chat_history\": [],\n",
" }\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "424e7e7a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='Harrison worked at Kensho.')"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conversational_qa_chain.invoke(\n",
" {\n",
" \"question\": \"where did he work?\",\n",
" \"chat_history\": [\n",
" HumanMessage(content=\"Who wrote this notebook?\"),\n",
" AIMessage(content=\"Harrison\"),\n",
" ],\n",
" }\n",
")"
]
},
{
"cell_type": "markdown",
"id": "c5543183",
"metadata": {},
"source": [
"### With Memory and returning source documents\n",
"\n",
"This shows how to use memory with the above. For memory, we need to manage that outside at the memory. For returning the retrieved documents, we just need to pass them through all the way."
"chain = prompt | model | StrOutputParser() | search"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "55f2967d",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'What sports games are on TV today & tonight? Watch and stream live sports on TV today, tonight, tomorrow. Today\\'s 2023 sports TV schedule includes football, basketball, baseball, hockey, motorsports, soccer and more. Watch on TV or stream online on ESPN, FOX, FS1, CBS, NBC, ABC, Peacock, Paramount+, fuboTV, local channels and many other networks. MLB Games Tonight: How to Watch on TV, Streaming & Odds - Thursday, September 7. Seattle Mariners\\' Julio Rodriguez greets teammates in the dugout after scoring against the Oakland Athletics in a ... Circle - Country Music and Lifestyle. Live coverage of all the MLB action today is available to you, with the information provided below. The Brewers will look to pick up a road win at PNC Park against the Pirates on Wednesday at 12:35 PM ET. Check out the latest odds and with BetMGM Sportsbook. Use bonus code \"GNPLAY\" for special offers! MLB Games Tonight: How to Watch on TV, Streaming & Odds - Tuesday, September 5. Houston Astros\\' Kyle Tucker runs after hitting a double during the fourth inning of a baseball game against the Los Angeles Angels, Sunday, Aug. 13, 2023, in Houston. (AP Photo/Eric Christian Smith) (APMedia) The Houston Astros versus the Texas Rangers is one of ... The second half of tonight\\'s college football schedule still has some good games remaining to watch on your television.. We\\'ve already seen an exciting one when Colorado upset TCU. And we saw some ...'"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"input\": \"I'd like to figure out what games are tonight\"})"
"prompt = ChatPromptTemplate.from_template(\"tell me a short joke about {topic}\")\n",
"model = ChatOpenAI(model=\"gpt-4\")\n",
"output_parser = StrOutputParser()\n",
"\n",
"chain = prompt | model | output_parser\n",
@@ -76,15 +101,15 @@
"id": "81c502c5-85ee-4f36-aaf4-d6e350b7792f",
"metadata": {},
"source": [
"Notice this line of this code, where we piece together then different components into a single chain using LCEL:\n",
"Notice this line of the code, where we piece together these different components into a single chain using LCEL:\n",
"\n",
"```\n",
"chain = prompt | model | output_parser\n",
"```\n",
"\n",
"The `|` symbol is similar to a [unix pipe operator](https://en.wikipedia.org/wiki/Pipeline_(Unix)), which chains together the different components feeds the output from one component as input into the next component. \n",
"The `|` symbol is similar to a [unix pipe operator](https://en.wikipedia.org/wiki/Pipeline_(Unix)), which chains together the different components, feeding the output from one component as input into the next component. \n",
"\n",
"In this chain the user input is passed to the prompt template, then the prompt template output is passed to the model, then the model output is passed to the output parser. Let's take a look at each component individually to really understand what's going on."
"In this chain the user input is passed to the prompt template, then the prompt template output is passed to the model, then the model output is passed to the output parser. Let's take a look at each component individually to really understand what's going on."
"And lastly we pass our `model` output to the `output_parser`, which is a `BaseOutputParser` meaning it takes either a string or a \n",
"`BaseMessage` as input. The `StrOutputParser` specifically simple converts any input into a string."
"`BaseMessage` as input. The specific `StrOutputParser` simply converts any input into a string."
]
},
{
@@ -293,7 +318,7 @@
"source": [
":::info\n",
"\n",
"Note that if you’re curious about the output of any components, you can always test out a smaller version of the chain such as `prompt` or `prompt | model` to see the intermediate results:\n",
"Note that if you’re curious about the output of any components, you can always test out a smaller version of the chain such as `prompt` or `prompt | model` to see the intermediate results:\n",
"\n",
":::"
]
@@ -321,7 +346,17 @@
"source": [
"## RAG Search Example\n",
"\n",
"For our next example, we want to run a retrieval-augmented generation chain to add some context when responding to questions."
"For our next example, we want to run a retrieval-augmented generation chain to add some context when responding to questions."
"We then use the `RunnableParallel` to prepare the expected inputs into the prompt by using the entries for the retrieved documents as well as the original user question, using the retriever for document search, and RunnablePassthrough to pass the user’s question:"
"We then use the `RunnableParallel` to prepare the expected inputs into the prompt by using the entries for the retrieved documents as well as the original user question, using the retriever for document search, and `RunnablePassthrough` to pass the user’s question:"
]
},
{
@@ -451,7 +484,7 @@
"With the flow being:\n",
"\n",
"1. The first steps create a `RunnableParallel` object with two entries. The first entry, `context` will include the document results fetched by the retriever. The second entry, `question` will contain the user’s original question. To pass on the question, we use `RunnablePassthrough` to copy this entry. \n",
"2. Feed the dictionary from the step above to the `prompt` component. It then takes the user input which is `question` as well as the retrieved document which is `context` to construct a prompt and output a PromptValue. \n",
"2. Feed the dictionary from the step above to the `prompt` component. It then takes the user input which is `question` as well as the retrieved document which is `context` to construct a prompt and output a PromptValue. \n",
"3. The `model` component takes the generated prompt, and passes into the OpenAI LLM model for evaluation. The generated output from the model is a `ChatMessage` object. \n",
"4. Finally, the `output_parser` component takes in a `ChatMessage`, and transforms this into a Python string, which is returned from the invoke method.\n",
"\n",
@@ -476,7 +509,7 @@
"source": [
"## Next steps\n",
"\n",
"We recommend reading our [Why use LCEL](/docs/expression_language/why) section next to see a side-by-side comparison of the code needed to produce common functionality with and without LCEL."
"We recommend reading our [Advantages of LCEL](/docs/expression_language/why) section next to see a side-by-side comparison of the code needed to produce common functionality with and without LCEL."
"# Create a runnable with the `@chain` decorator\n",
"# Create a runnable with the @chain decorator\n",
"\n",
"You can also turn an arbitrary function into a chain by adding a `@chain` decorator. This is functionaly equivalent to wrapping in a [`RunnableLambda`](./functions).\n",
"You can also turn an arbitrary function into a chain by adding a `@chain` decorator. This is functionaly equivalent to wrapping in a [`RunnableLambda`](/docs/expression_language/primitives/functions).\n",
"\n",
"This will have the benefit of improved observability by tracing your chain correctly. Any calls to runnables inside this function will be traced as nested childen.\n",
"title: \"RunnableLambda: Run Custom Functions\"\n",
"keywords: [RunnableLambda, LCEL]\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "fbc4bf6e",
"metadata": {},
"source": [
"# Run custom functions\n",
"\n",
"You can use arbitrary functions in the pipeline.\n",
"\n",
"Note that all inputs to these functions need to be a SINGLE argument. If you have a function that accepts multiple arguments, you should write a wrapper that accepts a single input and unpacks it into multiple argument."
"Runnable lambdas can optionally accept a [RunnableConfig](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.config.RunnableConfig.html#langchain_core.runnables.config.RunnableConfig), which they can use to pass callbacks, tags, and other configuration information to nested runs."
"title: \"RunnableBranch: Dynamically route logic based on input\"\n",
"title: \"Route logic based on input\"\n",
"keywords: [RunnableBranch, LCEL]\n",
"---"
]
@@ -25,7 +25,7 @@
"\n",
"There are two ways to perform routing:\n",
"\n",
"1. Conditionally return runnables from a [`RunnableLambda`](./functions) (recommended)\n",
"1. Conditionally return runnables from a [`RunnableLambda`](/docs/expression_language/primitives/functions) (recommended)\n",
"2. Using a `RunnableBranch`.\n",
"\n",
"We'll illustrate both methods using a two step sequence where the first step classifies an input question as being about `LangChain`, `Anthropic`, or `Other`, then routes to a corresponding prompt chain."
"AIMessage(content=' As Dario Amodei told me, to use Anthropic IPC you first need to import it:\\n\\n```python\\nfrom anthroipc import ic\\n```\\n\\nThen you can create a client and connect to the server:\\n\\n```python \\nclient = ic.connect()\\n```\\n\\nAfter that, you can call methods on the client and get responses:\\n\\n```python\\nresponse = client.ask(\"What is the meaning of life?\")\\nprint(response)\\n```\\n\\nYou can also register callbacks to handle events: \\n\\n```python\\ndef on_poke(event):\\n print(\"Got poked!\")\\n\\nclient.on(\\'poke\\', on_poke)\\n```\\n\\nAnd that\\'s the basics of using the Anthropic IPC client library for Python! Let me know if you have any other questions!', additional_kwargs={}, example=False)"
"AIMessage(content=\"As Dario Amodei told me, to use Anthropic, you can start by exploring the company's website and learning about their mission, values, and the different services and products they offer. Anthropic is focused on developing safe and ethical AI systems, so they have a strong emphasis on transparency and responsible AI development. \\n\\nDepending on your specific needs, you can look into Anthropic's AI research and development services, which cover areas like natural language processing, computer vision, and reinforcement learning. They also offer consulting and advisory services to help organizations navigate the challenges and opportunities of AI integration.\\n\\nAdditionally, Anthropic has released some open-source AI models and tools that you can explore and experiment with. These can be a great way to get hands-on experience with Anthropic's approach to AI development.\\n\\nOverall, Anthropic aims to be a reliable and trustworthy partner in the AI space, so I'd encourage you to reach out to them directly to discuss how they can best support your specific requirements.\", response_metadata={'id': 'msg_01CtLFgFSwvTaJomrihE87Ra', 'content': [ContentBlock(text=\"As Dario Amodei told me, to use Anthropic, you can start by exploring the company's website and learning about their mission, values, and the different services and products they offer. Anthropic is focused on developing safe and ethical AI systems, so they have a strong emphasis on transparency and responsible AI development. \\n\\nDepending on your specific needs, you can look into Anthropic's AI research and development services, which cover areas like natural language processing, computer vision, and reinforcement learning. They also offer consulting and advisory services to help organizations navigate the challenges and opportunities of AI integration.\\n\\nAdditionally, Anthropic has released some open-source AI models and tools that you can explore and experiment with. 
These can be a great way to get hands-on experience with Anthropic's approach to AI development.\\n\\nOverall, Anthropic aims to be a reliable and trustworthy partner in the AI space, so I'd encourage you to reach out to them directly to discuss how they can best support your specific requirements.\", type='text')], 'model': 'claude-3-haiku-20240307', 'role': 'assistant', 'stop_reason': 'end_turn', 'stop_sequence': None, 'type': 'message', 'usage': Usage(input_tokens=53, output_tokens=219)})"
]
},
"execution_count": 12,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
@@ -191,17 +183,17 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 7,
"id": "48913dc6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' As Harrison Chase told me, to use LangChain you first need to sign up for an API key at platform.langchain.com. Once you have your API key, you can install the Python library and write a simple Python script to call the LangChain API. Here is some sample code to get started:\\n\\n```python\\nimport langchain\\n\\napi_key = \"YOUR_API_KEY\"\\n\\nlangchain.set_key(api_key)\\n\\nresponse = langchain.ask(\"What is the capital of France?\")\\n\\nprint(response.response)\\n```\\n\\nThis will send the question \"What is the capital of France?\" to the LangChain API and print the response. You can customize the request by providing parameters like max_tokens, temperature, etc. The LangChain Python library documentation has more details on the available options. The key things are getting an API key and calling langchain.ask() with your question text. Let me know if you have any other questions!', additional_kwargs={}, example=False)"
"AIMessage(content=\"As Harrison Chase told me, using LangChain involves a few key steps:\\n\\n1. **Set up your environment**: Install the necessary Python packages, including the LangChain library itself, as well as any other dependencies your application might require, such as language models or other integrations.\\n\\n2. **Understand the core concepts**: LangChain revolves around a few core concepts, like Agents, Chains, and Tools. Familiarize yourself with these concepts and how they work together to build powerful language-based applications.\\n\\n3. **Identify your use case**: Determine what kind of task or application you want to build using LangChain, such as a chatbot, a question-answering system, or a document summarization tool.\\n\\n4. **Choose the appropriate components**: Based on your use case, select the right LangChain components, such as agents, chains, and tools, to build your application.\\n\\n5. **Integrate with language models**: LangChain is designed to work seamlessly with various language models, such as OpenAI's GPT-3 or Anthropic's models. Connect your chosen language model to your LangChain application.\\n\\n6. **Implement your application logic**: Use LangChain's building blocks to implement the specific functionality of your application, such as prompting the language model, processing the response, and integrating with other services or data sources.\\n\\n7. **Test and iterate**: Thoroughly test your application, gather feedback, and iterate on your design and implementation to improve its performance and user experience.\\n\\nAs Harrison Chase emphasized, LangChain provides a flexible and powerful framework for building language-based applications, making it easier to leverage the capabilities of modern language models. 
By following these steps, you can get started with LangChain and create innovative solutions tailored to your specific needs.\", response_metadata={'id': 'msg_01H3UXAAHG4TwxJLpxwuuVU7', 'content': [ContentBlock(text=\"As Harrison Chase told me, using LangChain involves a few key steps:\\n\\n1. **Set up your environment**: Install the necessary Python packages, including the LangChain library itself, as well as any other dependencies your application might require, such as language models or other integrations.\\n\\n2. **Understand the core concepts**: LangChain revolves around a few core concepts, like Agents, Chains, and Tools. Familiarize yourself with these concepts and how they work together to build powerful language-based applications.\\n\\n3. **Identify your use case**: Determine what kind of task or application you want to build using LangChain, such as a chatbot, a question-answering system, or a document summarization tool.\\n\\n4. **Choose the appropriate components**: Based on your use case, select the right LangChain components, such as agents, chains, and tools, to build your application.\\n\\n5. **Integrate with language models**: LangChain is designed to work seamlessly with various language models, such as OpenAI's GPT-3 or Anthropic's models. Connect your chosen language model to your LangChain application.\\n\\n6. **Implement your application logic**: Use LangChain's building blocks to implement the specific functionality of your application, such as prompting the language model, processing the response, and integrating with other services or data sources.\\n\\n7. **Test and iterate**: Thoroughly test your application, gather feedback, and iterate on your design and implementation to improve its performance and user experience.\\n\\nAs Harrison Chase emphasized, LangChain provides a flexible and powerful framework for building language-based applications, making it easier to leverage the capabilities of modern language models. 
By following these steps, you can get started with LangChain and create innovative solutions tailored to your specific needs.\", type='text')], 'model': 'claude-3-haiku-20240307', 'role': 'assistant', 'stop_reason': 'end_turn', 'stop_sequence': None, 'type': 'message', 'usage': Usage(input_tokens=50, output_tokens=400)})"
"AIMessage(content=\"As Dario Amodei told me, here are some ways to use Anthropic:\\n\\n- Sign up for an account on Anthropic's website to access tools like Claude, Constitutional AI, and Writer. \\n\\n- Use Claude for tasks like email generation, customer service chat, and QA. Claude can understand natural language prompts and provide helpful responses.\\n\\n- Use Constitutional AI if you need an AI assistant that is harmless, honest, and helpful. It is designed to be safe and aligned with human values.\\n\\n- Use Writer to generate natural language content for things like marketing copy, stories, reports, and more. Give it a topic and prompt and it will create high-quality written content.\\n\\n- Check out Anthropic's documentation and blog for tips, tutorials, examples, and announcements about new capabilities as they continue to develop their AI technology.\\n\\n- Follow Anthropic on social media or subscribe to their newsletter to stay up to date on new features and releases.\\n\\n- For most people, the easiest way to leverage Anthropic's technology is through their website - just create an account to get started!\", additional_kwargs={}, example=False)"
"AIMessage(content=\"As Dario Amodei told me, to use Anthropic, you should first familiarize yourself with our mission and principles. Anthropic is committed to developing safe and beneficial artificial intelligence that can help solve important problems facing humanity. \\n\\nTo get started, I recommend exploring the resources on our website, which cover our research, products, and approach to AI development. You can also reach out to our team to learn more about how Anthropic's technology and services can support your specific needs.\\n\\nThe key is to engage with us in a way that aligns with our values of transparency, ethical AI, and a commitment to the wellbeing of humanity. We're here to collaborate and help you harness the power of advanced AI responsibly.\", response_metadata={'id': 'msg_0187BVnpniPDJnVvwf3M1LdY', 'content': [ContentBlock(text=\"As Dario Amodei told me, to use Anthropic, you should first familiarize yourself with our mission and principles. Anthropic is committed to developing safe and beneficial artificial intelligence that can help solve important problems facing humanity. \\n\\nTo get started, I recommend exploring the resources on our website, which cover our research, products, and approach to AI development. You can also reach out to our team to learn more about how Anthropic's technology and services can support your specific needs.\\n\\nThe key is to engage with us in a way that aligns with our values of transparency, ethical AI, and a commitment to the wellbeing of humanity. We're here to collaborate and help you harness the power of advanced AI responsibly.\", type='text')], 'model': 'claude-3-haiku-20240307', 'role': 'assistant', 'stop_reason': 'end_turn', 'stop_sequence': None, 'type': 'message', 'usage': Usage(input_tokens=53, output_tokens=160)})"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "display_data"
"output_type": "execute_result"
}
],
"source": [
@@ -277,18 +270,19 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 10,
"id": "8d8caf9b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=' As Harrison Chase told me, here is how you use LangChain:\\n\\nLangChain is an AI assistant that can have conversations, answer questions, and generate text. To use LangChain, you simply type or speak your input and LangChain will respond. \\n\\nYou can ask LangChain questions, have discussions, get summaries or explanations about topics, and request it togenerate text on a subject. Some examples of interactions:\\n\\n- Ask general knowledge questions and LangChain will try to answer factually. For example \"What is the capital of France?\"\\n\\n- Have conversations on topics by taking turns speaking. You can prompt the start of a conversation by saying something like \"Let\\'s discuss machine learning\"\\n\\n- Ask for summaries or high-level explanations on subjects. For example \"Can you summarize the main themes in Shakespeare\\'s Hamlet?\" \\n\\n- Give creative writing prompts or requests to have LangChain generate text in different styles. For example \"Write a short children\\'s story about a mouse\" or \"Generate a poem in the style of Robert Frost about nature\"\\n\\n- Correct LangChain if it makes an inaccurate statement and provide the right information. This helps train it.\\n\\nThe key is interacting naturally and giving it clear prompts and requests', additional_kwargs={}, example=False)"
"AIMessage(content=\"As Harrison Chase told me, using LangChain involves several key steps. First, you'll need to install the LangChain library and import the necessary modules. Then, you'll want to define your language model, any data sources you plan to use, and the specific tasks you want to accomplish, such as question answering, text generation, or agent-based reasoning. \\n\\nLangChain provides a flexible framework for building applications that leverage large language models. It includes abstractions for things like retrievers, prompts, and chains, which allow you to compose different components together to create powerful workflows. \\n\\nThe documentation on the LangChain website is excellent and covers many common use cases in detail. I'd recommend starting there to get a solid understanding of the core concepts and how to apply them to your specific needs. And of course, feel free to reach out if you have any other questions - I'm always happy to share more insights from my conversations with Harrison.\", response_metadata={'id': 'msg_01T1naS99wGPkEAP4LME8iAv', 'content': [ContentBlock(text=\"As Harrison Chase told me, using LangChain involves several key steps. First, you'll need to install the LangChain library and import the necessary modules. Then, you'll want to define your language model, any data sources you plan to use, and the specific tasks you want to accomplish, such as question answering, text generation, or agent-based reasoning. \\n\\nLangChain provides a flexible framework for building applications that leverage large language models. It includes abstractions for things like retrievers, prompts, and chains, which allow you to compose different components together to create powerful workflows. \\n\\nThe documentation on the LangChain website is excellent and covers many common use cases in detail. I'd recommend starting there to get a solid understanding of the core concepts and how to apply them to your specific needs. 
And of course, feel free to reach out if you have any other questions - I'm always happy to share more insights from my conversations with Harrison.\", type='text')], 'model': 'claude-3-haiku-20240307', 'role': 'assistant', 'stop_reason': 'end_turn', 'stop_sequence': None, 'type': 'message', 'usage': Usage(input_tokens=50, output_tokens=205)})"
"As a physics professor, I would be happy to provide a concise and easy-to-understand explanation of what a black hole is.\n",
"\n",
"A black hole is an incredibly dense region of space-time where the gravitational pull is so strong that nothing, not even light, can escape from it. This means that if you were to get too close to a black hole, you would be pulled in and crushed by the intense gravitational forces.\n",
"\n",
"The formation of a black hole occurs when a massive star, much larger than our Sun, reaches the end of its life and collapses in on itself. This collapse causes the matter to become extremely dense, and the gravitational force becomes so strong that it creates a point of no return, known as the event horizon.\n",
"\n",
"Beyond the event horizon, the laws of physics as we know them break down, and the intense gravitational forces create a singularity, which is a point of infinite density and curvature in space-time.\n",
"\n",
"Black holes are fascinating and mysterious objects, and there is still much to be learned about their properties and behavior. If I were unsure about any specific details or aspects of black holes, I would readily admit that I do not have a complete understanding and would encourage further research and investigation.\n"
]
}
],
"source": [
"print(chain.invoke(\"What's a black hole\"))"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "df34e469",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Using MATH\n",
"A path integral is a powerful mathematical concept in physics, particularly in the field of quantum mechanics. It was developed by the renowned physicist Richard Feynman as an alternative formulation of quantum mechanics.\n",
"\n",
"In a path integral, instead of considering a single, definite path that a particle might take from one point to another, as in classical mechanics, the particle is considered to take all possible paths simultaneously. Each path is assigned a complex-valued weight, and the total probability amplitude for the particle to go from one point to another is calculated by summing (integrating) over all possible paths.\n",
"\n",
"The key ideas behind the path integral formulation are:\n",
"\n",
"1. Superposition principle: In quantum mechanics, particles can exist in a superposition of multiple states or paths simultaneously.\n",
"\n",
"2. Probability amplitude: The probability amplitude for a particle to go from one point to another is calculated by summing the complex-valued weights of all possible paths.\n",
"\n",
"3. Weighting of paths: Each path is assigned a weight based on the action (the time integral of the Lagrangian) along that path. Paths with lower action have a greater weight.\n",
"\n",
"4. Feynman's approach: Feynman developed the path integral formulation as an alternative to the traditional wave function approach in quantum mechanics, providing a more intuitive and conceptual understanding of quantum phenomena.\n",
"\n",
"The path integral approach is particularly useful in quantum field theory, where it provides a powerful framework for calculating transition probabilities and understanding the behavior of quantum systems. It has also found applications in various areas of physics, such as condensed matter, statistical mechanics, and even in finance (the path integral approach to option pricing).\n",
"\n",
"The mathematical construction of the path integral involves the use of advanced concepts from functional analysis and measure theory, making it a powerful and sophisticated tool in the physicist's arsenal.\n"
LangChain Expression Language, or LCEL, is a declarative way to easily compose chains together.
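To make "composing chains" concrete, here is a minimal plain-Python sketch of how a `|` operator can join steps into a chain. It is an illustration of the idea only, not LangChain's implementation: the `Step` class and the `fake_model` step are hypothetical names invented for this example, and no model is actually called.

```python
class Step:
    """Hypothetical wrapper used only to illustrate `|` composition."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # `a | b` builds a new Step that feeds a's output into b.
        return Step(lambda value: other.invoke(self.invoke(value)))


prompt = Step(lambda d: f"Tell me a joke about {d['topic']}")
fake_model = Step(lambda text: text.upper())  # stands in for a model call
parser = Step(lambda text: text.strip())

# The composed object is itself a Step, so it can be invoked or piped again.
chain = prompt | fake_model | parser
output = chain.invoke({"topic": "bears"})
```

Because the result of `|` is the same kind of object as its operands, a composed chain can be reused anywhere a single step is accepted.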
LCEL was designed from day 1 to **support putting prototypes in production, with no code changes**, from the simplest “prompt + LLM” chain to the most complex chains (we’ve seen folks successfully run LCEL chains with 100s of steps in production). To highlight a few of the reasons you might want to use LCEL:
When you build your chains with LCEL you get the best possible time-to-first-token (time elapsed until the first chunk of output comes out). For some chains this means eg. we stream tokens straight from an LLM to a streaming output parser, and you get back parsed, incremental chunks of output at the same rate as the LLM provider outputs the raw tokens.
**Async support**
Any chain built with LCEL can be called both with the synchronous API (eg. in your Jupyter notebook while prototyping) as well as with the asynchronous API (eg. in a [LangServe](/docs/langsmith) server). This enables using the same code for prototypes and in production, with great performance, and the ability to handle many concurrent requests in the same server.
Any chain built with LCEL can be called both with the synchronous API (eg. in your Jupyter notebook while prototyping) as well as with the asynchronous API (eg. in a [LangServe](/docs/langserve) server). This enables using the same code for prototypes and in production, with great performance, and the ability to handle many concurrent requests in the same server.
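The idea of one chain exposing both a synchronous and an asynchronous entry point can be sketched in plain Python. This is an assumption-laden illustration rather than LangChain's implementation: `TinyChain` and its steps are hypothetical, and the async path simply offloads the synchronous call to a worker thread.

```python
import asyncio


class TinyChain:
    """Hypothetical stand-in for a chain; not a LangChain class."""

    def __init__(self, *steps):
        self.steps = steps

    def invoke(self, value):
        # Synchronous path: run each step in order.
        for step in self.steps:
            value = step(value)
        return value

    async def ainvoke(self, value):
        # Asynchronous path: run the blocking call in a worker thread so the
        # event loop stays free to serve other requests.
        return await asyncio.to_thread(self.invoke, value)


chain = TinyChain(str.strip, str.upper)


async def main():
    # Handle several "requests" concurrently with the same chain object.
    return await asyncio.gather(*(chain.ainvoke(s) for s in [" a ", " b "]))


sync_result = chain.invoke(" hello ")
async_results = asyncio.run(main())
```

The same `chain` object serves both call styles, which is the property the paragraph above describes.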
Whenever your LCEL chains have steps that can be executed in parallel (eg if you fetch documents from multiple retrievers) we automatically do it, both in the sync and the async interfaces, for the smallest possible latency.
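As a rough picture of what running independent steps in parallel means, the sketch below calls every function in a dict with the same input using a thread pool. The helper `run_parallel` is a hypothetical name for this example and is not how LangChain implements parallel execution.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(steps, value):
    """Call every function in `steps` with the same input, in parallel."""
    with ThreadPoolExecutor(max_workers=len(steps)) as pool:
        # Submit all steps first so they can run concurrently...
        futures = {key: pool.submit(fn, value) for key, fn in steps.items()}
        # ...then collect each result under its own key.
        return {key: future.result() for key, future in futures.items()}


result = run_parallel(
    {
        "passed": lambda x: x,
        "modified": lambda x: x["num"] + 1,
    },
    {"num": 1},
)
```

Each step sees the full input, and the output is a dict keyed the same way as the steps.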
**Retries and fallbacks**
[**Retries and fallbacks**](/docs/guides/productionization/fallbacks)
Configure retries and fallbacks for any part of your LCEL chain. This is a great way to make your chains more reliable at scale. We’re currently working on adding streaming support for retries/fallbacks, so you can get the added reliability without any latency cost.
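The behavior described above can be sketched with two small plain-Python wrappers. This is a minimal illustration under stated assumptions: `with_retry`, `with_fallbacks`, and the sample functions are hypothetical helpers written for this example, not LangChain's API.

```python
def with_retry(fn, attempts=3):
    """Return a callable that retries `fn` up to `attempts` times."""

    def call(value):
        last_error = None
        for _ in range(attempts):
            try:
                return fn(value)
            except Exception as exc:
                last_error = exc
        raise last_error

    return call


def with_fallbacks(primary, fallbacks):
    """Return a callable that tries `primary`, then each fallback in order."""

    def call(value):
        last_error = None
        for fn in (primary, *fallbacks):
            try:
                return fn(value)
            except Exception as exc:
                last_error = exc
        raise last_error

    return call


calls = {"count": 0}


def flaky(value):
    # Fails on the first two calls, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return f"ok: {value}"


def always_down(value):
    raise ConnectionError("primary model unavailable")


def backup(value):
    return f"backup: {value}"


retried = with_retry(flaky, attempts=3)("joke")
fallen_back = with_fallbacks(always_down, [backup])("joke")
```

Retries repeat the same step, while fallbacks switch to a different one; the two can be layered on any individual step of a chain.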
For more complex chains it’s often very useful to access the results of intermediate steps even before the final output is produced. This can be used to let end-users know something is happening, or even just to debug your chain. You can stream intermediate results, and it’s available on every [LangServe](/docs/langserve) server.
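One way to picture streaming intermediate results is a generator that yields each step's output as soon as it is produced. The sketch below is plain Python with hypothetical names (`stream_steps` and the two sample steps); it does not reflect how LangChain exposes intermediate steps.

```python
def stream_steps(steps, value):
    """Yield (step_name, output) after each step instead of only at the end."""
    for name, fn in steps:
        value = fn(value)
        yield name, value


steps = [
    # Stand-in for a retriever: attaches a fixed context to the question.
    ("retrieve", lambda q: {"question": q, "context": "harrison worked at kensho"}),
    # Stand-in for generation: takes the last word of the context.
    ("answer", lambda d: d["context"].split()[-1]),
]

events = list(stream_steps(steps, "where did harrison work?"))
```

A caller iterating over `stream_steps` can show progress to the user or log each step while later steps are still running.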
**Input and output schemas**
[**Input and output schemas**](/docs/expression_language/interface#input-schema)
Input and output schemas give every LCEL chain Pydantic and JSONSchema schemas inferred from the structure of your chain. This can be used for validation of inputs and outputs, and is an integral part of LangServe.
**Seamless LangSmith tracing integration**
[**Seamless LangSmith tracing**](/docs/langsmith)
As your chains get more and more complex, it becomes increasingly important to understand what exactly is happening at every step.
With LCEL, **all** steps are automatically logged to [LangSmith](/docs/langsmith/) for maximum observability and debuggability.
"To make it as easy as possible to create custom chains, we've implemented a [\"Runnable\"](https://api.python.langchain.com/en/stable/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) protocol. The `Runnable` protocol is implemented for most components. \n",
"To make it as easy as possible to create custom chains, we've implemented a [\"Runnable\"](https://api.python.langchain.com/en/stable/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) protocol. Many LangChain components implement the `Runnable` protocol, including chat models, LLMs, output parsers, retrievers, prompt templates, and more. There are also several useful primitives for working with runnables, which you can read about [in this section](/docs/expression_language/primitives).\n",
"\n",
"This is a standard interface, which makes it easy to define custom chains as well as invoke them in a standard way. \n",
"The standard interface includes:\n",
"\n",
@@ -24,7 +25,7 @@
"- [`invoke`](#invoke): call the chain on an input\n",
"- [`batch`](#batch): call the chain on a list of inputs\n",
"\n",
"These also have corresponding async methods:\n",
"These also have corresponding async methods that should be used with [asyncio](https://docs.python.org/3/library/asyncio.html) `await` syntax for concurrency:\n",
"\n",
"- [`astream`](#async-stream): stream back chunks of the response async\n",
"- [`ainvoke`](#async-invoke): call the chain on an input async\n",
"The `RunnablePassthrough.assign(...)` static method takes an input value and adds the extra arguments passed to the assign function.\n",
"\n",
"This is useful when additively creating a dictionary to use as input to a later step, which is a common LCEL pattern.\n",
"\n",
"Here's an example:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[33mWARNING: You are using pip version 22.0.4; however, version 24.0 is available.\n",
"You should consider upgrading via the '/Users/jacoblee/.pyenv/versions/3.10.5/bin/python -m pip install --upgrade pip' command.\u001b[0m\u001b[33m\n",
"\u001b[0mNote: you may need to restart the kernel to use updated packages.\n"
"- The input to the chain is `{\"num\": 1}`. This is passed into a `RunnableParallel`, which invokes the runnables it is passed in parallel with that input.\n",
"- The value under the `extra` key is invoked. `RunnablePassthrough.assign()` keeps the original keys in the input dict (`{\"num\": 1}`), and assigns a new key called `mult`. The value is `lambda x: x[\"num\"] * 3)`, which is `3`. Thus, the result is `{\"num\": 1, \"mult\": 3}`.\n",
"- `{\"num\": 1, \"mult\": 3}` is returned to the `RunnableParallel` call, and is set as the value to the key `extra`.\n",
"- At the same time, the `modified` key is called. The result is `2`, since the lambda extracts a key called `\"num\"` from its input and adds one.\n",
"\n",
"Thus, the result is `{'extra': {'num': 1, 'mult': 3}, 'modified': 2}`.\n",
"\n",
"## Streaming\n",
"\n",
"One nice feature of this method is that it allows values to pass through as soon as they are available. To show this off, we'll use `RunnablePassthrough.assign()` to immediately return source docs in a retrieval chain:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'question': 'where did harrison work?'}\n",
"{'context': [Document(page_content='harrison worked at kensho')]}\n",
"stream = retrieval_chain.stream(\"where did harrison work?\")\n",
"\n",
"for chunk in stream:\n",
" print(chunk)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can see that the first chunk contains the original `\"question\"` since that is immediately available. The second chunk contains `\"context\"` since the retriever finishes second. Finally, the output from the `generation_chain` streams in chunks as soon as it is available."
"Sometimes we want to invoke a Runnable within a Runnable sequence with constant arguments that are not part of the output of the preceding Runnable in the sequence, and which are not part of the user input. We can use `Runnable.bind()` to easily pass these arguments in.\n",
"Sometimes we want to invoke a Runnable within a Runnable sequence with constant arguments that are not part of the output of the preceding Runnable in the sequence, and which are not part of the user input. We can use `Runnable.bind()` to pass these arguments in.\n",
"\n",
"Suppose we have a simple prompt + model sequence:"
"You can use arbitrary functions in the pipeline.\n",
"\n",
"Note that all inputs to these functions need to be a SINGLE argument. If you have a function that accepts multiple arguments, you should write a wrapper that accepts a single input and unpacks it into multiple argument."
"Runnable lambdas can optionally accept a [RunnableConfig](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.config.RunnableConfig.html#langchain_core.runnables.config.RunnableConfig), which they can use to pass callbacks, tags, and other configuration information to nested runs."
"RunnableParallel can be useful for manipulating the output of one Runnable to match the input format of the next Runnable in a sequence.\n",
"The `RunnableParallel` primitive is essentially a dict whose values are runnables (or things that can be coerced to runnables, like functions). It runs all of its values in parallel, and each value is called with the overall input of the `RunnableParallel`. The final return value is a dict with the results of each value under its appropriate key.\n",
"\n",
"Here the input to prompt is expected to be a map with keys \"context\" and \"question\". The user input is just the question. So we need to get the context using our retriever and passthrough the user input under the \"question\" key.\n",
"It is useful for parallelizing operations, but can also be useful for manipulating the output of one Runnable to match the input format of the next Runnable in a sequence.\n",
"\n",
"\n"
"Here the input to prompt is expected to be a map with keys \"context\" and \"question\". The user input is just the question. So we need to get the context using our retriever and passthrough the user input under the \"question\" key.\n"
"RunnablePassthrough allows to pass inputs unchanged or with the addition of extra keys. This typically is used in conjuction with RunnableParallel to assign data to a new key in the map. \n",
"\n",
"RunnablePassthrough() called on it's own, will simply take the input and pass it through. \n",
"\n",
"RunnablePassthrough called with assign (`RunnablePassthrough.assign(...)`) will take the input, and will add the extra arguments passed to the assign function. \n",
"RunnablePassthrough on its own allows you to pass inputs unchanged. This typically is used in conjuction with RunnableParallel to pass data through to a new key in the map. \n",
"As seen above, `passed` key was called with `RunnablePassthrough()` and so it simply passed on `{'num': 1}`. \n",
"\n",
"In the second line, we used `RunnablePastshrough.assign` with a lambda that multiplies the numerical value by 3. In this cased, `extra` was set with `{'num': 1, 'mult': 3}` which is the original value with the `mult` key added. \n",
"\n",
"Finally, we also set a third key in the map with `modified` which uses a lambda to set a single value adding 1 to the num, which resulted in `modified` key with the value of `2`."
"We also set a second key in the map with `modified`. This uses a lambda to set a single value adding 1 to the num, which resulted in `modified` key with the value of `2`."
]
},
{
@@ -86,7 +79,7 @@
"source": [
"## Retrieval Example\n",
"\n",
"In the example below, we see a use case where we use RunnablePassthrough along with RunnableMap. "
"In the example below, we see a use case where we use `RunnablePassthrough` along with `RunnableParallel`. "
"One key advantage of the `Runnable` interface is that any two runnables can be \"chained\" together into sequences. The output of the previous runnable's `.invoke()` call is passed as input to the next runnable. This can be done using the pipe operator (`|`), or the more explicit `.pipe()` method, which does the same thing. The resulting `RunnableSequence` is itself a runnable, which means it can be invoked, streamed, or piped just like any other runnable.\n",
"\n",
"## The pipe operator\n",
"\n",
"To show off how this works, let's go through an example. We'll walk through a common pattern in LangChain: using a [prompt template](/docs/modules/model_io/prompts/) to format input into a [chat model](/docs/modules/model_io/chat/), and finally converting the chat message output into a string with an [output parser](/docs/modules/model_io/output_parsers/)."
"Prompts and models are both runnable, and the output type from the prompt call is the same as the input type of the chat model, so we can chain them together. We can then invoke the resulting sequence like any other runnable:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Here's a bear joke for you:\\n\\nWhy don't bears wear socks? \\nBecause they have bear feet!\\n\\nHow's that? I tried to keep it light and silly. Bears can make for some fun puns and jokes. Let me know if you'd like to hear another one!\""
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"topic\": \"bears\"})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Coercion\n",
"\n",
"We can even combine this chain with more runnables to create another chain. This may involve some input/output formatting using other types of runnables, depending on the required inputs and outputs of the chain components.\n",
"\n",
"For example, let's say we wanted to compose the joke generating chain with another chain that evaluates whether or not the generated joke was funny.\n",
"\n",
"We would need to be careful with how we format the input into the next chain. In the below example, the dict in the chain is automatically parsed and converted into a [`RunnableParallel`](/docs/expression_language/primitives/parallel), which runs all of its values in parallel and returns a dict with the results.\n",
"\n",
"This happens to be the same format the next prompt template expects. Here it is in action:"
"analysis_prompt = ChatPromptTemplate.from_template(\"is this a funny joke? {joke}\")\n",
"\n",
"composed_chain = {\"joke\": chain} | analysis_prompt | model | StrOutputParser()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"That's a pretty classic and well-known bear pun joke. Whether it's considered funny is quite subjective, as humor is very personal. Some people may find that type of pun-based joke amusing, while others may not find it that humorous. Ultimately, the funniness of a joke is in the eye (or ear) of the beholder. If you enjoyed the joke and got a chuckle out of it, then that's what matters most.\""
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"composed_chain.invoke({\"topic\": \"bears\"})"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Functions will also be coerced into runnables, so you can add custom logic to your chains too. The below chain results in the same logical flow as before:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"composed_chain_with_lambda = (\n",
" chain\n",
" | (lambda input: {\"joke\": input})\n",
" | analysis_prompt\n",
" | model\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'I appreciate the effort, but I have to be honest - I didn\\'t find that joke particularly funny. Beet-themed puns can be quite hit-or-miss, and this one falls more on the \"miss\" side for me. The premise is a bit too straightforward and predictable. While I can see the logic behind it, the punchline just doesn\\'t pack much of a comedic punch. \\n\\nThat said, I do admire your willingness to explore puns and wordplay around vegetables. Cultivating a good sense of humor takes practice, and not every joke is going to land. The important thing is to keep experimenting and finding what works. Maybe try for a more unexpected or creative twist on beet-related humor next time. But thanks for sharing - I always appreciate when humans test out jokes on me, even if they don\\'t always make me laugh out loud.'"
"However, keep in mind that using functions like this may interfere with operations like streaming. See [this section](/docs/expression_language/primitives/functions) for more information."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The `.pipe()` method\n",
"\n",
"We could also compose the same sequence using the `.pipe()` method. Here's what that looks like:"
"'That\\'s a pretty good Battlestar Galactica-themed pun! I appreciated the clever play on words with \"Centurion\" and \"center on.\" It\\'s the kind of nerdy, science fiction-inspired humor that fans of the show would likely enjoy. The joke is clever and demonstrates a good understanding of the Battlestar Galactica universe. I\\'d be curious to hear any other Battlestar-related jokes you might have up your sleeve. As long as they don\\'t reproduce copyrighted material, I\\'m happy to provide my thoughts on the humor and appeal for fans of the show.'"
"You might notice above that `parser` actually doesn't block the streaming output from the model, and instead processes each chunk individually. Many of the [LCEL primitives](/docs/expression_language/primitives) also support this kind of transform-style passthrough streaming, which can be very convenient when constructing apps.\n",
"\n",
"Certain runnables, like [prompt templates](/docs/modules/model_io/prompts) and [chat models](/docs/modules/model_io/chat), cannot process individual chunks and instead aggregate all previous steps. This will interrupt the streaming process. Custom functions can be [designed to return generators](/docs/expression_language/primitives/functions#streaming), which"
]
},
{
"cell_type": "markdown",
"id": "1b399fb4-5e3c-4581-9570-6df9b42b623d",
"metadata": {},
"source": [
":::{.callout-note}\n",
"You do not have to use the `LangChain Expression Language` to use LangChain and can instead rely on a standard **imperative** programming approach by\n",
"If the above functionality is not relevant to what you're building, you do not have to use the `LangChain Expression Language` to use LangChain and can instead rely on a standard **imperative** programming approach by\n",
"caling `invoke`, `batch` or `stream` on each component individually, assigning the results to variables and then using them downstream as you see fit.\n",
"\n",
"If that works for your needs, then that's fine by us 👌!\n",
"import { ColumnContainer, Column } from \\\"@theme/Columns\\\";"
"```{=mdx}\n",
"import { ColumnContainer, Column } from \"@theme/Columns\";\n",
"```"
]
},
{
@@ -18,7 +20,7 @@
"id": "919a5ae2-ed21-4923-b98f-723c111bac67",
"metadata": {},
"source": [
":::tip \n",
":::{.callout-tip} \n",
"We recommend reading the LCEL [Get started](/docs/expression_language/get_started) section first.\n",
":::"
]
@@ -28,9 +30,10 @@
"id": "f331037f-be3f-4782-856f-d55dab952488",
"metadata": {},
"source": [
"LCEL makes it easy to build complex chains from basic components. It does this by providing:\n",
"1. **A unified interface**: Every LCEL object implements the `Runnable` interface, which defines a common set of invocation methods (`invoke`, `batch`, `stream`, `ainvoke`, ...). This makes it possible for chains of LCEL objects to also automatically support these invocations. That is, every chain of LCEL objects is itself an LCEL object.\n",
"2. **Composition primitives**: LCEL provides a number of primitives that make it easy to compose chains, parallelize components, add fallbacks, dynamically configure chain internal, and more.\n",
"LCEL is designed to streamline the process of building useful apps with LLMs and combining related components. It does this by providing:\n",
"\n",
"1. **A unified interface**: Every LCEL object implements the `Runnable` interface, which defines a common set of invocation methods (`invoke`, `batch`, `stream`, `ainvoke`, ...). This makes it possible for chains of LCEL objects to also automatically support useful operations like batching and streaming of intermediate steps, since every chain of LCEL objects is itself an LCEL object.\n",
"2. **Composition primitives**: LCEL provides a number of primitives that make it easy to compose chains, parallelize components, add fallbacks, dynamically configure chain internals, and more.\n",
"\n",
"To better understand the value of LCEL, it's helpful to see it in action and think about how we might recreate similar functionality without it. In this walkthrough we'll do just that with our [basic example](/docs/expression_language/get_started#basic_example) from the get started section. We'll take our simple prompt + model chain, which under the hood already defines a lot of functionality, and see what it would take to recreate all of it."
]
@@ -53,10 +56,13 @@
"## Invoke\n",
"In the simplest case, we just want to pass in a topic string and get back a joke string:\n",
"\n",
"```{=mdx}\n",
"<ColumnContainer>\n",
"\n",
"<Column>\n",
"\n",
"```\n",
"\n",
"#### Without LCEL\n"
]
},
@@ -95,9 +101,12 @@
"id": "cdc3b527-c09e-4c77-9711-c3cc4506cd95",
"metadata": {},
"source": [
"\n",
"```{=mdx}\n",
"</Column>\n",
"\n",
"<Column>\n",
"```\n",
"\n",
"#### LCEL\n",
"\n"
@@ -136,14 +145,19 @@
"id": "3c0b0513-77b8-4371-a20e-3e487cec7e7f",
"metadata": {},
"source": [
"\n",
"```{=mdx}\n",
"</Column>\n",
"</ColumnContainer>\n",
"\n",
"```\n",
"## Stream\n",
"If we want to stream results instead, we'll need to change our function:\n",
"\n",
"```{=mdx}\n",
"\n",
"<ColumnContainer>\n",
"<Column>\n",
"```\n",
"\n",
"#### Without LCEL\n",
"\n"
@@ -184,10 +198,11 @@
"id": "f8e36b0e-c7dc-4130-a51b-189d4b756c7f",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"\n",
"<Column>\n",
"\n",
"```\n",
"#### LCEL\n",
"\n"
]
@@ -208,15 +223,19 @@
"id": "b9b41e78-ddeb-44d0-a58b-a0ea0c99a761",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"</ColumnContainer>\n",
"```\n",
"\n",
"## Batch\n",
"\n",
"If we want to run on a batch of inputs in parallel, we'll again need a new function:\n",
"If we want to use a completion endpoint instead of a chat endpoint: \n",
"\n",
"```{=mdx}\n",
"<ColumnContainer>\n",
"<Column>\n",
"```\n",
"\n",
"#### Without LCEL\n",
"\n"
@@ -368,9 +467,11 @@
"id": "45342cd6-58c2-4543-9392-773e05ef06e7",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"\n",
"<Column>\n",
"```\n",
"\n",
"#### LCEL\n",
"\n"
@@ -401,15 +502,19 @@
"id": "ca115eaf-59ef-45c1-aac1-e8b0ce7db250",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"</ColumnContainer>\n",
"```\n",
"\n",
"## Different model provider\n",
"\n",
"If we want to use Anthropic instead of OpenAI: \n",
"\n",
"```{=mdx}\n",
"<ColumnContainer>\n",
"<Column>\n",
"```\n",
"\n",
"#### Without LCEL\n",
"\n"
@@ -447,9 +552,11 @@
"id": "52a0c9f8-e316-42e1-af85-cabeba4b7059",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"\n",
"<Column>\n",
"```\n",
"\n",
"#### LCEL\n",
"\n"
@@ -480,15 +587,19 @@
"id": "d7a91eee-d017-420d-b215-f663dcbf8ed2",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"</ColumnContainer>\n",
"```\n",
"\n",
"## Runtime configurability\n",
"\n",
"If we wanted to make the choice of chat model or LLM configurable at runtime:\n",
"\n",
"```{=mdx}\n",
"<ColumnContainer>\n",
"<Column>\n",
"```\n",
"\n",
"#### Without LCEL\n",
"\n"
@@ -569,9 +680,11 @@
"id": "d1530c5c-6635-4599-9483-6df357ca2d64",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"\n",
"<Column>\n",
"```\n",
"\n",
"#### With LCEL\n",
"\n"
@@ -629,15 +742,19 @@
"id": "370dd4d7-b825-40c4-ae3c-2693cba2f22a",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"</ColumnContainer>\n",
"```\n",
"\n",
"## Logging\n",
"\n",
"If we want to log our intermediate results:\n",
"\n",
"```{=mdx}\n",
"<ColumnContainer>\n",
"<Column>\n",
"```\n",
"\n",
"#### Without LCEL\n",
"\n",
@@ -668,9 +785,11 @@
"id": "16bd20fd-43cd-4aaf-866f-a53d1f20312d",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"\n",
"<Column>\n",
"```\n",
"\n",
"#### LCEL\n",
"Every component has built-in integrations with LangSmith. If we set the following two environment variables, all chain traces are logged to LangSmith.\n",
@@ -705,16 +824,19 @@
"id": "e25ce3c5-27a7-4954-9f0e-b94313597135",
"metadata": {},
"source": [
"```{=mdx}\n",
"</Column>\n",
"</ColumnContainer>\n",
"```\n",
"\n",
"## Fallbacks\n",
"\n",
"If we wanted to add fallback logic, in case one model API is down:\n",
"\n",
"\n",
"```{=mdx}\n",
"<ColumnContainer>\n",
"<Column>\n",
"```\n",
"\n",
"#### Without LCEL\n",
"\n",
@@ -739,7 +861,7 @@
" return await ainvoke_chain(topic)\n",
" except Exception:\n",
" # Note: we haven't actually implemented this.\n",
"Even in this simple case, our LCEL chain succinctly packs in a lot of functionality. As chains become more complex, this becomes especially valuable.\n",
"To continue learning about LCEL, we recommend:\n",
"- Reading up on the full LCEL [Interface](/docs/expression_language/interface), which we've only partially covered here.\n",
"- Exploring the [How-to](/docs/expression_language/how_to) section to learn about additional composition primitives that LCEL provides.\n",
"- Looking through the [Cookbook](/docs/expression_language/cookbook) section to see LCEL in action for common use cases. A good next use case to look at would be [Retrieval-augmented generation](/docs/expression_language/cookbook/retrieval)."
"- Exploring the [primitives](/docs/expression_language/primitives) to learn more about what LCEL provides."
@@ -29,13 +33,6 @@ If you want to install from source, you can do so by cloning the repo and be sur
pip install -e .
```
## LangChain community
The `langchain-community` package contains third-party integrations. It is automatically installed by `langchain`, but can also be used separately. Install with:
```bash
pip install langchain-community
```
## LangChain core
The `langchain-core` package contains base abstractions that the rest of the LangChain ecosystem uses, along with the LangChain Expression Language. It is automatically installed by `langchain`, but can also be used separately. Install with:
@@ -43,6 +40,13 @@ The `langchain-core` package contains base abstractions that the rest of the Lan
pip install langchain-core
```
## LangChain community
The `langchain-community` package contains third-party integrations. It is automatically installed by `langchain`, but can also be used separately. Install with:
```bash
pip install langchain-community
```
## LangChain experimental
The `langchain-experimental` package holds experimental LangChain code, intended for research and experimental uses.
Install with:
@@ -51,6 +55,13 @@ Install with:
pip install langchain-experimental
```
## LangGraph
`langgraph` is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain.
Install with:
```bash
pip install langgraph
```
## LangServe
LangServe helps developers deploy LangChain runnables and chains as a REST API.
LangServe is automatically installed by LangChain CLI.
**LangChain** is a framework for developing applications powered by language models. It enables applications that:
- **Are context-aware**: connect a language model to sources of context (prompt instructions, few shot examples, content to ground its response in, etc.)
- **Reason**: rely on a language model to reason (about how to answer based on provided context, what actions to take, etc.)
**LangChain** is a framework for developing applications powered by large language models (LLMs).
This framework consists of several parts.
- **LangChain Libraries**: The Python and JavaScript libraries. Contains interfaces and integrations for a myriad of components, a basic run time for combining these components into chains and agents, and off-the-shelf implementations of chains and agents.
- **[LangChain Templates](/docs/templates)**: A collection of easily deployable reference architectures for a wide variety of tasks.
- **[LangServe](/docs/langserve)**: A library for deploying LangChain chains as a REST API.
- **[LangSmith](/docs/langsmith)**: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.
LangChain simplifies every stage of the LLM application lifecycle:
- **Development**: Build your applications using LangChain's open-source [building blocks](/docs/expression_language/) and [components](/docs/modules/). Hit the ground running using [third-party integrations](/docs/integrations/platforms/) and [Templates](/docs/templates).
- **Productionization**: Use [LangSmith](/docs/langsmith/) to inspect, monitor and evaluate your chains, so that you can continuously optimize and deploy with confidence.
- **Deployment**: Turn any chain into an API with [LangServe](/docs/langserve).
import ThemedImage from '@theme/ThemedImage';
@@ -25,31 +23,24 @@ import ThemedImage from '@theme/ThemedImage';
title="LangChain Framework Overview"
/>
Together, these products simplify the entire application lifecycle:
- **Develop**: Write your applications in LangChain/LangChain.js. Hit the ground running using Templates for reference.
- **Productionize**: Use LangSmith to inspect, test and monitor your chains, so that you can constantly improve and deploy with confidence.
- **Deploy**: Turn any chain into an API with LangServe.
Concretely, the framework consists of the following open-source libraries:
## LangChain Libraries
The main value props of the LangChain packages are:
1. **Components**: composable tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
2. **Off-the-shelf chains**: built-in assemblages of components for accomplishing higher-level tasks
Off-the-shelf chains make it easy to get started. Components make it easy to customize existing chains and build new ones.
The LangChain libraries themselves are made up of several different packages.
- **`langchain-core`**: Base abstractions and LangChain Expression Language.
- **`langchain-community`**: Third party integrations.
- Partner packages (e.g. **`langchain-openai`**, **`langchain-anthropic`**, etc.): Some integrations have been further split into their own lightweight packages that only depend on **`langchain-core`**.
- **`langchain`**: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- **[langgraph](/docs/langgraph)**: Build robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.
- **[langserve](/docs/langserve)**: Deploy LangChain chains as REST APIs.
The broader ecosystem includes:
- **[LangSmith](/docs/langsmith)**: A developer platform that lets you debug, test, evaluate, and monitor LLM applications and seamlessly integrates with LangChain.
## Get started
[Here’s](/docs/get_started/installation) how to install LangChain, set up your environment, and start building.
We recommend following our [Quickstart](/docs/get_started/quickstart) guide to familiarize yourself with the framework by building your first LangChain application.
Read up on our [Security](/docs/security) best practices to make sure you're developing safely with LangChain.
[See here](/docs/get_started/installation) for instructions on how to install LangChain, set up your environment, and start building.
:::note
@@ -57,48 +48,53 @@ These docs focus on the Python LangChain library. [Head here](https://js.langcha
:::
## LangChain Expression Language (LCEL)
## Use cases
LCEL is a declarative way to compose chains. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains.
If you're looking to build something specific or are more of a hands-on learner, check out our [use-cases](/docs/use_cases).
They're walkthroughs and techniques for common end-to-end tasks, such as:
- **[Overview](/docs/expression_language/)**: LCEL and its benefits
- **[Interface](/docs/expression_language/interface)**: The standard interface for LCEL objects
- **[How-to](/docs/expression_language/how_to)**: Key features of LCEL
- **[Cookbook](/docs/expression_language/cookbook)**: Example code for accomplishing common tasks
## Modules
LangChain provides standard, extendable interfaces and integrations for the following modules:
#### [Model I/O](/docs/modules/model_io/)
Interface with language models
#### [Retrieval](/docs/modules/data_connection/)
Interface with application-specific data
#### [Agents](/docs/modules/agents/)
Let models choose which tools to use given high-level directives
LangChain Expression Language (LCEL) is the foundation of many of LangChain's components, and is a declarative way to compose chains. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains.
- **[Get started](/docs/expression_language/)**: LCEL and its benefits
- **[Runnable interface](/docs/expression_language/interface)**: The standard interface for LCEL objects
- **[Primitives](/docs/expression_language/primitives)**: More on the primitives LCEL includes
- and more!
## Ecosystem
### [🦜🛠️ LangSmith](/docs/langsmith)
Trace and evaluate your language model applications and intelligent agents to help you move from prototype to production.
### [🦜🕸️ LangGraph](/docs/langgraph)
Build stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain primitives.
### [🦜🏓 LangServe](/docs/langserve)
Deploy LangChain runnables and chains as REST APIs.
## [Security](/docs/security)
Read up on our [Security](/docs/security) best practices to make sure you're developing safely with LangChain.
## Additional resources
### [Components](/docs/modules/)
LangChain provides standard, extendable interfaces and integrations for many different components, including:
### [Integrations](/docs/integrations/providers/)
LangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. Check out our growing list of [integrations](/docs/integrations/providers/).
@@ -14,9 +18,9 @@ That's a fair amount to cover! Let's dive in.
### Jupyter Notebook
This guide (and most of the other guides in the documentation) use [Jupyter notebooks](https://jupyter.org/) and assume the reader is as well. Jupyter notebooks are perfect for learning how to work with LLM systems because oftentimes things can go wrong (unexpected output, API down, etc) and going through guides in an interactive environment is a great way to better understand them.
This guide (and most of the other guides in the documentation) uses [Jupyter notebooks](https://jupyter.org/) and assumes the reader is as well. Jupyter notebooks are perfect for learning how to work with LLM systems because oftentimes things can go wrong (unexpected output, API down, etc) and going through guides in an interactive environment is a great way to better understand them.
You do not NEED to go through the guide in a Jupyter Notebook, but it is recommended. See [here](https://jupyter.org/install) for instructions on how to install.
This and other tutorials are perhaps most conveniently run in a Jupyter notebook. See [here](https://jupyter.org/install) for instructions on how to install.
### Installation
@@ -90,12 +94,12 @@ from langchain_openai import ChatOpenAI
llm = ChatOpenAI()
```
If you'd prefer not to set an environment variable you can pass the key in directly via the `openai_api_key` named parameter when initiating the OpenAI LLM class:
If you'd prefer not to set an environment variable you can pass the key in directly via the `api_key` named parameter when initiating the OpenAI LLM class:
```python
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(openai_api_key="...")
llm = ChatOpenAI(api_key="...")
```
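The choice between passing the key directly and reading it from the environment can be sketched with the standard library alone. This is an illustration only: `resolve_api_key` is a hypothetical helper, and the precedence shown (an explicit argument wins over the environment variable) is an assumption, not a description of how `ChatOpenAI` is implemented.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None, env_var: str = "OPENAI_API_KEY") -> str:
    """Hypothetical helper: prefer an explicitly passed key, else read the environment."""
    if api_key:
        # Assumed precedence: the explicit argument overrides the environment.
        return api_key
    key = os.environ.get(env_var)
    if not key:
        raise ValueError(f"No API key passed and {env_var} is not set")
    return key
```

Failing early with a clear error when neither source provides a key is usually easier to debug than a failed request later on.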
</TabItem>
@@ -137,10 +141,10 @@ from langchain_anthropic import ChatAnthropic
If you'd prefer not to set an environment variable you can pass the key in directly via the `anthropic_api_key` named parameter when initiating the Anthropic Chat Model class:
If you'd prefer not to set an environment variable you can pass the key in directly via the `api_key` named parameter when initiating the Anthropic Chat Model class:
First we'll need to import the Cohere SDK package.
```shell
pip install cohere
pip install langchain-cohere
```
Accessing the API requires an API key, which you can get by creating an account and heading [here](https://dashboard.cohere.com/api-keys). Once we have a key we'll want to set it as an environment variable by running:
@@ -161,7 +165,7 @@ export COHERE_API_KEY="..."
We can then initialize the model:
```python
from langchain_community.chat_models import ChatCohere
from langchain_cohere import ChatCohere
llm = ChatCohere()
```
@@ -169,7 +173,7 @@ llm = ChatCohere()
If you'd prefer not to set an environment variable you can pass the key in directly via the `cohere_api_key` named parameter when initiating the Cohere LLM class:
```python
from langchain_community.chat_models import ChatCohere
from langchain_cohere import ChatCohere
llm = ChatCohere(cohere_api_key="...")
```
@@ -184,13 +188,13 @@ Let's ask it what LangSmith is - this is something that wasn't present in the tr
llm.invoke("how can langsmith help with testing?")
```
We can also guide it's response with a prompt template.
Prompt templates are used to convert raw user input to a better input to the LLM.
We can also guide its response with a prompt template.
Prompt templates convert raw user input to better input to the LLM.
```python
from langchain_core.prompts import ChatPromptTemplate
prompt = ChatPromptTemplate.from_messages([
("system", "You are world class technical documentation writer."),
("system", "You are a world class technical documentation writer."),
("user", "{input}")
])
```
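To make the conversion step concrete, here is a rough stdlib-only sketch of what filling such a template amounts to. The `format_messages` function and the dictionary message shape are hypothetical stand-ins chosen for illustration; they are not the `ChatPromptTemplate` API.

```python
from typing import Dict, List, Tuple

Message = Tuple[str, str]  # (role, content template)


def format_messages(templates: List[Message], **variables: str) -> List[Dict[str, str]]:
    """Fill each message template's {placeholders} with the supplied variables."""
    return [
        {"role": role, "content": content.format(**variables)}
        for role, content in templates
    ]


templates = [
    ("system", "You are a world class technical documentation writer."),
    ("user", "{input}"),
]
# The raw user input becomes one message in a structured list.
messages = format_messages(templates, input="how can langsmith help with testing?")
```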
@@ -234,7 +238,7 @@ We've now successfully set up a basic LLM chain. We only touched on the basics o
## Retrieval Chain
In order to properly answer the original question ("how can langsmith help with testing?"), we need to provide additional context to the LLM.
To properly answer the original question ("how can langsmith help with testing?"), we need to provide additional context to the LLM.
We can do this via *retrieval*.
Retrieval is useful when you have **too much data** to pass to the LLM directly.
You can then use a retriever to fetch only the most relevant pieces and pass those in.
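The idea of fetching only the most relevant pieces can be sketched without any vector store. In the toy example below, word overlap stands in for embedding similarity, and `tokenize` and `retrieve` are hypothetical names; a real retriever would rank by vector distance instead.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


documents = [
    "LangSmith: a developer platform that lets you debug, test, evaluate, and monitor LLM applications.",
    "LangServe: deploy LangChain chains as REST APIs.",
    "LangGraph: build robust and stateful multi-actor applications.",
]
question = "how can langsmith help with testing?"
context = retrieve(question, documents, k=1)
# Only the selected document is placed in the prompt, not the whole corpus.
prompt = f"Answer using only this context:\n{context[0]}\n\nQuestion: {question}"
```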
@@ -242,7 +246,7 @@ You can then use a retriever to fetch only the most relevant pieces and pass tho
In this process, we will look up relevant documents from a *Retriever* and then pass them into the prompt.
A Retriever can be backed by anything - a SQL table, the internet, etc - but in this instance we will populate a vector store and use that as a retriever. For more information on vectorstores, see [this documentation](/docs/modules/data_connection/vectorstores).
First, we need to load the data that we want to index. In order to do this, we will use the WebBaseLoader. This requires installing [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/):
First, we need to load the data that we want to index. To do this, we will use the WebBaseLoader. This requires installing [BeautifulSoup](https://beautiful-soup-4.readthedocs.io/en/latest/):
We can test this out by passing in an instance where the user is asking a followup question.
We can test this out by passing in an instance where the user asks a follow-up question.
```python
from langchain_core.messages import HumanMessage, AIMessage
@@ -411,7 +415,7 @@ retriever_chain.invoke({
"input": "Tell me how"
})
```
You should see that this returns documents about testing in LangSmith. This is because the LLM generated a new query, combining the chat history with the followup question.
You should see that this returns documents about testing in LangSmith. This is because the LLM generated a new query, combining the chat history with the follow-up question.
Now that we have this new retriever, we can create a new chain to continue the conversation with these retrieved documents in mind.
@@ -439,7 +443,7 @@ We can see that this gives a coherent answer - we've successfully turned our ret
## Agent
We've so far create examples of chains - where each step is known ahead of time.
We've so far created examples of chains - where each step is known ahead of time.
The final thing we will create is an agent - where the LLM decides what steps to take.
**NOTE: for this example we will only show how to create an agent using OpenAI models, as local models are not reliable enough yet.**
@@ -448,7 +452,7 @@ One of the first things to do when building an agent is to decide what tools it
For this example, we will give the agent access to two tools:
1. The retriever we just created. This will let it easily answer questions about LangSmith
2. A search tool. This will let it easily answer questions that require uptodate information.
2. A search tool. This will let it easily answer questions that require up-to-date information.
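As a rough mental model of tool selection, the sketch below routes a question to one of two tools. Everything here is hypothetical: the tool names, the placeholder tool bodies, and especially `choose_tool`, which uses a keyword rule as a stand-in for the decision the LLM would make in a real agent.

```python
from typing import Callable, Dict


def retriever_tool(query: str) -> str:
    """Placeholder for the retriever tool; returns a labeled string."""
    return f"[retriever] documents about: {query}"


def search_tool(query: str) -> str:
    """Placeholder for the search tool; returns a labeled string."""
    return f"[search] web results for: {query}"


TOOLS: Dict[str, Callable[[str], str]] = {
    "langsmith_search": retriever_tool,
    "search": search_tool,
}


def choose_tool(question: str) -> str:
    """Stand-in for the LLM's decision: a keyword rule, not a model call."""
    return "langsmith_search" if "langsmith" in question.lower() else "search"


def run_agent(question: str) -> str:
    """Pick a tool for the question, then call it."""
    tool_name = choose_tool(question)
    return TOOLS[tool_name](question)
```

In an actual agent the model is given the tool descriptions and picks one itself, possibly over several steps; the fixed rule above only illustrates the dispatch.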
First, let's set up a tool for the retriever we just created:
@@ -488,6 +492,11 @@ Install langchain hub first
```bash
pip install langchainhub
```
Install the langchain-openai package
To interact with OpenAI we need to use [langchain-openai](https://github.com/langchain-ai/langchain/tree/master/libs/partners/openai), which connects with the OpenAI SDK.
```bash
pip install langchain-openai
```
Now we can use it to get a predefined prompt
@@ -499,6 +508,8 @@ from langchain.agents import AgentExecutor
Some files were not shown because too many files have changed in this diff