langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-09-28 06:48:50 +00:00

Author	SHA1	Message	Date
Wang, Yi	d834c6b618	huggingface: fix tool argument serialization in _convert_TGI_message_to_LC_message (#26075 ) Currently `_convert_TGI_message_to_LC_message` replaces `'` in the tool arguments, so an argument like "It's" will be converted to `It"s` and could cause a json parser to fail. --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Vadym Barda <vadym@langchain.dev>	2024-12-11 18:34:32 -08:00
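The failure mode described in d834c6b618 can be reproduced with the standard library alone. The following is a minimal sketch, not the code in `langchain_huggingface`: both helper names and the sample arguments are hypothetical, and the "naive" helper is only an assumed reconstruction of the quote-replacing approach the commit message describes.

```python
import json


def naive_serialize(arguments: dict) -> str:
    # Hypothetical reconstruction of the buggy approach: stringify the dict
    # and replace every single quote with a double quote.
    return str(arguments).replace("'", '"')


def safe_serialize(arguments: dict) -> str:
    # json.dumps escapes quotes correctly, so apostrophes survive intact.
    return json.dumps(arguments)


def parses(payload: str) -> bool:
    # True when the payload is valid JSON.
    try:
        json.loads(payload)
        return True
    except json.JSONDecodeError:
        return False


args = {"text": "It's"}
print(naive_serialize(args), parses(naive_serialize(args)))
print(safe_serialize(args), parses(safe_serialize(args)))
```

Serializing with `json.dumps` round-trips the apostrophe, whereas the quote replacement produces a string that a JSON parser rejects.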
Lakindu Boteju	5a31792bf1	community: Add support for cross-region inference profile IDs in Bedrock Anthropic Claude token cost calculation (#28167 ) This change modifies the token cost calculation logic to support cross-region inference profile IDs for Anthropic Claude models. Instead of explicitly listing all regional variants of new inference profile IDs in the cost dictionaries, the code now extracts a base model ID from the input model ID (or inference profile ID), making it more maintainable and automatically supporting new regional variants. These inference profile IDs follow the format: `<region>.<vendor>.<model-name>` (e.g., `us.anthropic.claude-3-haiku-xxx`, `eu.anthropic.claude-3-sonnet-xxx`). Cross-region inference profiles are system-defined identifiers that enable distributing model inference requests across multiple AWS regions. They help manage unplanned traffic bursts and enhance resilience during peak demands without additional routing costs. References for Amazon Bedrock's cross-region inference profiles:- - https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html - https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-12 02:33:50 +00:00
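The base-model extraction described in 5a31792bf1 could look roughly like the sketch below. It assumes the `<region>.<vendor>.<model-name>` shape quoted in the commit message and keeps the last two dot-separated segments; the function name is hypothetical and the model IDs are the placeholder examples from the message, not real cost-table keys.

```python
def extract_base_model_id(model_id: str) -> str:
    # Keep the last two dot-separated segments (<vendor>.<model-name>).
    # A plain model ID is returned unchanged; a cross-region inference
    # profile ID loses its leading region segment.
    segments = model_id.split(".")
    return ".".join(segments[-2:])


for candidate in (
    "anthropic.claude-3-haiku-xxx",
    "us.anthropic.claude-3-haiku-xxx",
    "eu.anthropic.claude-3-sonnet-xxx",
):
    print(candidate, "->", extract_base_model_id(candidate))
```

With this approach a cost dictionary only needs one entry per base model, and new regional prefixes resolve to the same entry without further changes.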
fatmelon	d1e0ec7b55	community: VectorStores: Azure Cosmos DB Mongo vCore with DiskANN (#27329 )

# Description

Add a new vector index type `diskann` to Azure Cosmos DB Mongo vCore vector store. Paper of DiskANN can be found here [DiskANN: Fast Accurate Billion-point Nearest Neighbor Search on a Single Node](https://proceedings.neurips.cc/paper_files/paper/2019/file/09853c7fb1d3f8ee67a61b6bf4a7f8e6-Paper.pdf).

## Sample Usage

```python
from pymongo import MongoClient

# INDEX_NAME = "izzy-test-index-2"
# NAMESPACE = "izzy_test_db.izzy_test_collection"
# DB_NAME, COLLECTION_NAME = NAMESPACE.split(".")

client: MongoClient = MongoClient(CONNECTION_STRING)
collection = client[DB_NAME][COLLECTION_NAME]

model_deployment = os.getenv(
    "OPENAI_EMBEDDINGS_DEPLOYMENT", "smart-agent-embedding-ada"
)
model_name = os.getenv("OPENAI_EMBEDDINGS_MODEL_NAME", "text-embedding-ada-002")

vectorstore = AzureCosmosDBVectorSearch.from_documents(
    docs,
    openai_embeddings,
    collection=collection,
    index_name=INDEX_NAME,
)

# Read more about these variables in detail here. https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search
maxDegree = 40
dimensions = 1536
similarity_algorithm = CosmosDBSimilarityType.COS
kind = CosmosDBVectorSearchType.VECTOR_DISKANN
lBuild = 20

vectorstore.create_index(
    dimensions=dimensions,
    similarity=similarity_algorithm,
    kind=kind,
    max_degree=maxDegree,
    l_build=lBuild,
)
```

## Dependencies

No additional dependencies were added

--------- Co-authored-by: Yang Qiao (from Dev Box) <yangqiao@microsoft.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-12 01:54:04 +00:00
manukychen	ba9b95cd23	Community: Adding bulk_size as a setable param for OpenSearchVectorSearch (#28325 ) Description: When using langchain.retrievers.parent_document_retriever.py with OpenSearchVectorSearch as the vectorstore, I found that the bulk_size param I passed into the OpenSearchVectorSearch class had no effect in my ParentDocumentRetriever.add_documents() call: it was overwritten with the default of 500 defined by the OpenSearchVectorSearch methods (e.g., add_texts(), add_embeddings()). I made this PR to fix that. Thanks! --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-12 01:45:22 +00:00
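The behaviour ba9b95cd23 aims for, where a constructor-level `bulk_size` is honoured unless a call overrides it, can be sketched as follows. `SketchStore` is a hypothetical stand-in written for illustration; it is not `OpenSearchVectorSearch`, and the `ValueError` is an assumption of the sketch rather than the library's actual error.

```python
from typing import List, Optional


class SketchStore:
    """Hypothetical stand-in for a vector store with a bulk-size limit."""

    def __init__(self, bulk_size: int = 500) -> None:
        # Instance-level default, set once by the caller.
        self.bulk_size = bulk_size

    def add_texts(self, texts: List[str], bulk_size: Optional[int] = None) -> int:
        # Fall back to the instance value instead of a hard-coded 500, so
        # callers that never pass bulk_size (such as a retriever wrapping
        # the store) still get the configured limit.
        effective = bulk_size if bulk_size is not None else self.bulk_size
        if len(texts) > effective:
            raise ValueError(
                f"{len(texts)} texts exceed bulk_size={effective}"
            )
        return effective


store = SketchStore(bulk_size=2)
print(store.add_texts(["a", "b"]))               # uses the instance value
print(store.add_texts(["a", "b", "c"], bulk_size=10))  # per-call override
```

Using `None` as the per-call default is what lets the method distinguish "not provided" from an explicit value.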
xintoteai	45f9c9ae88	langchain: fixed weaviate (v4) vectorstore import for self-query retriever (#28675 ) Co-authored-by: Xin Heng <xin.heng@gmail.com>	2024-12-11 15:53:41 -08:00
Thomas van Dongen	ee640d6bd3	community: fixed bug in model2vec embedding code (#28670 ) This PR fixes a bug with the current implementation for Model2Vec embeddings where `embed_documents` does not work as expected. - Description: the current implementation uses `encode_as_sequence` for encoding documents. This is incorrect, as `encode_as_sequence` creates token embeddings and not mean embeddings. The normal `encode` function handles both single and batched inputs and should be used instead. The return type was also incorrect, as encode returns a NumPy array. This PR converts the embedding to a list so that the output is consistent with the Embeddings ABC.	2024-12-11 15:50:56 -08:00
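The distinction ee640d6bd3 draws between token embeddings and a mean embedding can be illustrated without any model. The sketch below is plain Python with made-up two-dimensional vectors; `mean_pool` is a hypothetical helper and says nothing about how Model2Vec computes its vectors internally.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]]) -> List[float]:
    # Average the per-token vectors column by column, yielding one
    # fixed-size vector for the whole document.
    count = len(token_embeddings)
    return [sum(column) / count for column in zip(*token_embeddings)]


# One document represented as three token vectors (illustrative values).
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

document_vector = mean_pool(tokens)
print(len(tokens), "token vectors ->", document_vector)
```

An `embed_documents` implementation is expected to return one list of floats per document, which is why a per-token result has the wrong shape for that interface.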
Brian Sharon	b20230c800	community: use correct `id_key` when deleting by id in LanceDB wrapper (#28655 ) - Description: The current version of the `delete` method assumes that the id field will always be called `id`. - Issue: n/a - Dependencies: n/a - Twitter handle: ugh, Twitter :D --- Thank you for contributing to LangChain! - [x] PR title: "package: description" - Where "package" is whichever of langchain, community, core, etc. is being modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [x] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-11 23:49:35 +00:00
Mohammad Mohtashim	fa155a422f	[Community]: `requests_kwargs` not being used in _fetch (#28646 ) - Description: `requests_kwargs` is not being passed to `_fetch` which is fetching pages asynchronously. In this PR, making sure that we are passing `requests_kwargs` to `_fetch` just like `_scrape`. - Issue: #28634 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-11 23:46:54 +00:00
Mohammad Mohtashim	a37afbe353	mistral[minor]: Added Retrying Mechanism in case of Request Rate Limit Error for `MistralAIEmbeddings` (#27818 ) - Description: In the event of a Rate Limit Error from the MistralAI server, the response JSON raises a KeyError. To address this, a simple retry mechanism has been implemented to handle cases where the request limit is exceeded. - Issue: #27790 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-12-11 17:53:42 -05:00
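A simple retry mechanism of the kind a37afbe353 mentions might be sketched like this. It is a generic exponential-backoff loop using only the standard library; `RateLimitError`, `call_with_retry`, and the default values are hypothetical and do not reflect how `MistralAIEmbeddings` implements its retries.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a rate-limit failure from the server."""


def call_with_retry(
    fn: Callable[[], T], max_retries: int = 3, base_delay: float = 1.0
) -> T:
    # Try the call up to max_retries + 1 times, doubling the wait after
    # each rate-limit failure; the last failure is re-raised to the caller.
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise
            time.sleep(base_delay * (2 ** attempt))
    # Only reachable when max_retries is negative.
    raise ValueError("max_retries must be >= 0")


calls = {"n": 0}


def flaky_request() -> str:
    # Fails twice with a rate-limit error, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("request limit exceeded")
    return "ok"


print(call_with_retry(flaky_request, base_delay=0.0), "after", calls["n"], "calls")
```

Raising a dedicated exception on a rate-limit response, rather than letting a later `KeyError` surface, keeps the retry condition explicit.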
Vincent Zhang	df5008fe55	community[minor]: FAISS Filter Function Enhancement with Advanced Query Operators (#28207 )

## Description

We are submitting as a team of four for a project. Other team members are @RuofanChen03, @LikeWang10067, @TANYAL77. This pull requests expands the filtering capabilities of the FAISS vectorstore by adding MongoDB-style query operators indicated as follows, while including comprehensive testing for the added functionality.

- $eq (equals)
- $neq (not equals)
- $gt (greater than)
- $lt (less than)
- $gte (greater than or equal)
- $lte (less than or equal)
- $in (membership in list)
- $nin (not in list)
- $and (all conditions must match)
- $or (any condition must match)
- $not (negation of condition)

## Issue

This closes https://github.com/langchain-ai/langchain/issues/26379.

## Sample Usage

```python
import faiss
import asyncio
from langchain_community.vectorstores import FAISS
from langchain.schema import Document
from langchain_huggingface import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-mpnet-base-v2")

documents = [
    Document(page_content="Process customer refund request", metadata={"schema_type": "financial", "handler_type": "refund",}),
    Document(page_content="Update customer shipping address", metadata={"schema_type": "customer", "handler_type": "update",}),
    Document(page_content="Process payment transaction", metadata={"schema_type": "financial", "handler_type": "payment",}),
    Document(page_content="Handle customer complaint", metadata={"schema_type": "customer","handler_type": "complaint",}),
    Document(page_content="Process invoice payment", metadata={"schema_type": "financial","handler_type": "payment",})
]

async def search(vectorstore, query, schema_type, handler_type, k=2):
    schema_filter = {"schema_type": {"$eq": schema_type}}
    handler_filter = {"handler_type": {"$eq": handler_type}}
    combined_filter = {
        "$and": [
            schema_filter,
            handler_filter,
        ]
    }
    base_retriever = vectorstore.as_retriever(
        search_kwargs={"k":k, "filter":combined_filter}
    )
    return await base_retriever.ainvoke(query)

async def main():
    vectorstore = FAISS.from_texts(
        texts=[doc.page_content for doc in documents],
        embedding=embeddings,
        metadatas=[doc.metadata for doc in documents]
    )

    def printt(title, documents):
        print(title)
        if not documents:
            print("\tNo documents found.")
            return
        for doc in documents:
            print(f"\t{doc.page_content}. {doc.metadata}")

    printt("Documents:", documents)
    printt('\nquery="process payment", schema_type="financial", handler_type="payment":',
           await search(vectorstore, query="process payment", schema_type="financial", handler_type="payment", k=2))
    printt('\nquery="customer update", schema_type="customer", handler_type="update":',
           await search(vectorstore, query="customer update", schema_type="customer", handler_type="update", k=2))
    printt('\nquery="refund process", schema_type="financial", handler_type="refund":',
           await search(vectorstore, query="refund process", schema_type="financial", handler_type="refund", k=2))
    printt('\nquery="refund process", schema_type="financial", handler_type="foobar":',
           await search(vectorstore, query="refund process", schema_type="financial", handler_type="foobar", k=2))
    print()

if __name__ == "__main__":asyncio.run(main())
```

## Output

```
Documents:
	Process customer refund request. {'schema_type': 'financial', 'handler_type': 'refund'}
	Update customer shipping address. {'schema_type': 'customer', 'handler_type': 'update'}
	Process payment transaction. {'schema_type': 'financial', 'handler_type': 'payment'}
	Handle customer complaint. {'schema_type': 'customer', 'handler_type': 'complaint'}
	Process invoice payment. {'schema_type': 'financial', 'handler_type': 'payment'}

query="process payment", schema_type="financial", handler_type="payment":
	Process payment transaction. {'schema_type': 'financial', 'handler_type': 'payment'}
	Process invoice payment. {'schema_type': 'financial', 'handler_type': 'payment'}

query="customer update", schema_type="customer", handler_type="update":
	Update customer shipping address. {'schema_type': 'customer', 'handler_type': 'update'}

query="refund process", schema_type="financial", handler_type="refund":
	Process customer refund request. {'schema_type': 'financial', 'handler_type': 'refund'}

query="refund process", schema_type="financial", handler_type="foobar":
	No documents found.
```

--------- Co-authored-by: ruofan chen <ruofan.is.awesome@gmail.com> Co-authored-by: RickyCowboy <like.wang@mail.utoronto.ca> Co-authored-by: Shanni Li <tanya.li@mail.utoronto.ca> Co-authored-by: RuofanChen03 <114096642+ruofanchen03@users.noreply.github.com> Co-authored-by: Like Wang <102838708+likewang10067@users.noreply.github.com>	2024-12-11 17:52:22 -05:00
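To make the operator semantics listed in df5008fe55 concrete, here is a small, dependency-free evaluator for MongoDB-style filters over a metadata dict. It is a sketch under stated assumptions, not the FAISS vectorstore's implementation: the `matches` function is hypothetical, and treating a missing metadata key as "never matches" is a choice made for this example.

```python
from typing import Any, Callable, Dict

# Comparison operators named in the commit message.
COMPARISONS: Dict[str, Callable[[Any, Any], bool]] = {
    "$eq": lambda a, b: a == b,
    "$neq": lambda a, b: a != b,
    "$gt": lambda a, b: a > b,
    "$lt": lambda a, b: a < b,
    "$gte": lambda a, b: a >= b,
    "$lte": lambda a, b: a <= b,
    "$in": lambda a, b: a in b,
    "$nin": lambda a, b: a not in b,
}


def matches(metadata: Dict[str, Any], flt: Dict[str, Any]) -> bool:
    # Every top-level entry of the filter must hold for the metadata.
    for key, condition in flt.items():
        if key == "$and":
            if not all(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$or":
            if not any(matches(metadata, sub) for sub in condition):
                return False
        elif key == "$not":
            if matches(metadata, condition):
                return False
        else:
            # Assumption of this sketch: a missing key never matches.
            if key not in metadata:
                return False
            value = metadata[key]
            if isinstance(condition, dict):
                if not all(
                    COMPARISONS[op](value, operand)
                    for op, operand in condition.items()
                ):
                    return False
            elif value != condition:
                # A bare value is shorthand for equality.
                return False
    return True


combined_filter = {
    "$and": [
        {"schema_type": {"$eq": "financial"}},
        {"handler_type": {"$eq": "payment"}},
    ]
}
print(matches({"schema_type": "financial", "handler_type": "payment"}, combined_filter))
print(matches({"schema_type": "financial", "handler_type": "refund"}, combined_filter))
```

The filter used here is the same `$and` combination as in the commit's sample usage, applied to two of its metadata dicts.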
like	3048a9a26d	community: tongyi multimodal response format fix to support langchain (#28645 ) Description: The multimodal(tongyi) response format "message": {"role": "assistant", "content": [{"text": "图像"}]}}]} is not compatible with LangChain. Dependencies: No --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-10 21:13:26 +00:00
Bagatur	d0e662e43b	community[patch]: Release 0.3.11 (#28658 )	2024-12-10 20:51:13 +00:00
Bagatur	91227ad7fd	langchain[patch]: Release 0.3.11 (#28657 )	2024-12-10 12:28:14 -08:00
Bagatur	1fbd86a155	core[patch]: Release 0.3.24 (#28656 )	2024-12-10 20:19:21 +00:00
Bagatur	e6a62d8422	core,langchain,community[patch]: allow langsmith 0.2 (#28598 )	2024-12-10 18:50:58 +00:00
ccurme	bc4dc7f4b1	ollama[patch]: permit streaming for tool calls (#28654 )

Resolves https://github.com/langchain-ai/langchain/issues/28543

Ollama recently [released](https://github.com/ollama/ollama/releases/tag/v0.4.6) support for streaming tool calls. Previously we would override the `stream` parameter if tools were passed in.

Covered in standard tests here: `c1d348e95d/libs/standard-tests/langchain_tests/integration_tests/chat_models.py (L893-L897)`

Before, the test generates one message chunk:

```python
[
    AIMessageChunk(
        content='',
        additional_kwargs={},
        response_metadata={
            'model': 'llama3.1',
            'created_at': '2024-12-10T17:49:04.468487Z',
            'done': True,
            'done_reason': 'stop',
            'total_duration': 525471208,
            'load_duration': 19701000,
            'prompt_eval_count': 170,
            'prompt_eval_duration': 31000000,
            'eval_count': 17,
            'eval_duration': 473000000,
            'message': Message(
                role='assistant',
                content='',
                images=None,
                tool_calls=[
                    ToolCall(
                        function=Function(name='magic_function', arguments={'input': 3})
                    )
                ]
            )
        },
        id='run-552bbe0f-8fb2-4105-ada1-fa38c1db444d',
        tool_calls=[
            {
                'name': 'magic_function',
                'args': {'input': 3},
                'id': 'b0a4dc07-7d7a-487b-bd7b-ad062c2363a2',
                'type': 'tool_call',
            },
        ],
        usage_metadata={
            'input_tokens': 170, 'output_tokens': 17, 'total_tokens': 187
        },
        tool_call_chunks=[
            {
                'name': 'magic_function',
                'args': '{"input": 3}',
                'id': 'b0a4dc07-7d7a-487b-bd7b-ad062c2363a2',
                'index': None,
                'type': 'tool_call_chunk',
            }
        ]
    )
]
```

After, it generates two (tool call in one, response metadata in another):

```python
[
    AIMessageChunk(
        content='',
        additional_kwargs={},
        response_metadata={},
        id='run-9a3f0860-baa1-4bae-9562-13a61702de70',
        tool_calls=[
            {
                'name': 'magic_function',
                'args': {'input': 3},
                'id': '5bbaee2d-c335-4709-8d67-0783c74bd2e0',
                'type': 'tool_call',
            },
        ],
        tool_call_chunks=[
            {
                'name': 'magic_function',
                'args': '{"input": 3}',
                'id': '5bbaee2d-c335-4709-8d67-0783c74bd2e0',
                'index': None,
                'type': 'tool_call_chunk',
            },
        ],
    ),
    AIMessageChunk(
        content='',
        additional_kwargs={},
        response_metadata={
            'model': 'llama3.1',
            'created_at': '2024-12-10T17:46:43.278436Z',
            'done': True,
            'done_reason': 'stop',
            'total_duration': 514282750,
            'load_duration': 16894458,
            'prompt_eval_count': 170,
            'prompt_eval_duration': 31000000,
            'eval_count': 17,
            'eval_duration': 464000000,
            'message': Message(
                role='assistant', content='', images=None, tool_calls=None
            ),
        },
        id='run-9a3f0860-baa1-4bae-9562-13a61702de70',
        usage_metadata={
            'input_tokens': 170, 'output_tokens': 17, 'total_tokens': 187
        }
    ),
]
```

	2024-12-10 12:54:37 -05:00
Johannes Mohren	c1d348e95d	doc-loader: retain Azure Doc Intelligence API metadata in Document parser (#28382 ) Description: This PR modifies the doc_intelligence.py parser in the community package to include all metadata returned by the Azure Doc Intelligence API in the Document object. Previously, only the parsed content (markdown) was retained, while other important metadata such as bounding boxes (bboxes) for images and tables was discarded. These image bboxes are crucial for supporting use cases like multi-modal RAG workflows when using Azure Doc Intelligence. The change ensures that all information returned by the Azure Doc Intelligence API is preserved by setting the metadata attribute of the Document object to the entire result returned by the API, rather than an empty dictionary. This extends the parser's utility for complex use cases without breaking existing functionality. Issue: This change does not address a specific issue number, but it resolves a critical limitation in supporting multimodal workflows when using the LangChain wrapper for the Azure API. Dependencies: No additional dependencies are required for this change. --------- Co-authored-by: jmohren <johannes.mohren@aol.de>	2024-12-10 11:22:58 -05:00
Alex Tonkonozhenko	0d20c314dd	Confluence Loader: Fix CQL loading (#27620 ) fix #12082	2024-12-10 11:05:23 -05:00
Katarina Supe	aba2711e7f	community: update Memgraph integration (#27017 ) Description: - Memgraph no longer relies on `Neo4jGraphStore` but implements `GraphStore`, just like other graph databases. - Memgraph no longer relies on `GraphQAChain`, but implements `MemgraphQAChain`, just like other graph databases. - The refresh schema procedure has been updated to try using `SHOW SCHEMA INFO`. The fallback uses Cypher queries (a combination of schema and Cypher) → LangChain integration no longer relies on MAGE library. - The schema structure has been reformatted. Regardless of the procedures used to get schema, schema structure is the same. - The `add_graph_documents()` method has been implemented. It transforms `GraphDocument` into Cypher queries and creates a graph in Memgraph. It implements the ability to use `baseEntityLabel` to improve speed (`baseEntityLabel` has an index on the `id` property). It also implements the ability to include sources by creating a `MENTIONS` relationship to the source document. - Jupyter Notebook for Memgraph has been updated. - Issue: / - Dependencies: / - Twitter handle: supe_katarina (DX Engineer @ Memgraph) Closes #25606	2024-12-10 10:57:21 -05:00
ccurme	5c6e2cbcda	ollama[patch]: support structured output (#28629 ) - Bump minimum version of `ollama` to 0.4.4 (which also addresses https://github.com/langchain-ai/langchain/issues/28607). - Support recently-released [structured output](https://ollama.com/blog/structured-outputs) feature. This can be accessed by calling `.with_structured_output` with `method="json_schema"` (choice of name [mirrors](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#langchain_openai.chat_models.base.ChatOpenAI.with_structured_output) what we have for OpenAI's structured output feature). `ChatOllama` previously implemented `.with_structured_output` via the [base implementation](`ec9b41431e/libs/core/langchain_core/language_models/chat_models.py (L1117)`).	2024-12-10 10:36:00 -05:00
Bagatur	24292c4a31	core[patch]: Release 0.3.23 (#28648 )	2024-12-10 10:01:16 +00:00
Bagatur	e24f86e55f	core[patch]: return ToolMessage from tool (#28605 )	2024-12-10 09:59:38 +00:00
Erick Friis	ef2f875dfb	core: deprecate PipelinePromptTemplate (#28644 )	2024-12-10 03:56:48 +00:00
TamagoTorisugi	0f0df2df60	fix: Set default search_type to 'similarity' in as_retriever method of AzureSearch (#28376 ) Description This PR updates the `as_retriever` method in the `AzureSearch` to ensure that the `search_type` parameter defaults to 'similarity' when not explicitly provided. Previously, if the `search_type` was omitted, it did not default to any specific value. So it was inherited from `AzureSearchVectorStoreRetriever`, which defaults to 'hybrid'. This change ensures that the intended default behavior aligns with the expected usage. Issue No specific issue was found related to this change. Dependencies No new dependencies are introduced with this change. --------- Co-authored-by: prrao87 <prrao87@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-10 03:40:04 +00:00
Prashanth Rao	8c6eec5f25	community: KuzuGraph needs allow_dangerous_requests, add graph documents via LLMGraphTransformer (#27949 ) - [x] PR title: "community: Kuzu - Add graph documents via LLMGraphTransformer" - This PR adds a new method `add_graph_documents` to use the `GraphDocument`s extracted by `LLMGraphTransformer` and store in a Kùzu graph backend. - This allows users to transform unstructured text into a graph that uses Kùzu as the graph store. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ --------- Co-authored-by: pookam90 <pookam@microsoft.com> Co-authored-by: Pooja Kamath <60406274+Pookam90@users.noreply.github.com> Co-authored-by: hsm207 <hsm207@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-10 03:15:28 +00:00
Filip Ratajczak	4e743b5427	Core: google docstring parsing fix (#28404 ) - Description: Added a solution for invalid parsing of google docstring such as: Args: net_annual_income (float): The user's net annual income (in current year dollars). - Issue: Previous code would return arg = "net_annual_income (float)" which would cause exception in _validate_docstring_args_against_annotations - Dependencies: None Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-10 00:27:25 +00:00
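The parsing problem in 4e743b5427 comes down to separating the argument name from an optional parenthesised type in a Google-style `Args:` entry. A minimal sketch with a regular expression follows; `parse_arg_line` and the pattern are illustrative assumptions and are not the parser used in `langchain_core`.

```python
import re
from typing import Tuple

# name, optional "(type)", then ":" and the description.
ARG_LINE = re.compile(
    r"^\s*(?P<name>\w+)\s*(?:\((?P<type>[^)]*)\))?\s*:\s*(?P<desc>.*)$"
)


def parse_arg_line(line: str) -> Tuple[str, str]:
    # Return (argument name, description), dropping any "(type)" annotation
    # so the name can be compared against the function's parameters.
    match = ARG_LINE.match(line)
    if match is None:
        raise ValueError(f"not an argument line: {line!r}")
    return match.group("name"), match.group("desc").strip()


line = "net_annual_income (float): The user's net annual income (in current year dollars)."
print(parse_arg_line(line))
```

Only the first parenthesised group before the colon is treated as a type, so parentheses inside the description are left alone.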
Arnav Priyadarshi	b78b2f7a28	community[fix]: Update Perplexity to pass parameters into API calls (#28421 ) - Description: I realized the invocation parameters were not being passed into `_generate` so I added those in but then realized that the parameters contained some old fields designed for an older openai client which I removed. Parameters work fine now. - Issue: Fixes #28229 - Dependencies: No new dependencies. - Twitter handle: @arch_plane Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-10 00:23:31 +00:00
Clément Jumel	cf6d1c0ae7	docs: add Linkup integration documentation (#28366 ) ## Description First of all, thanks for the great framework that is LangChain! At [Linkup](https://www.linkup.so/) we're working on an API to connect LLMs and agents to the internet and our partner sources. We'd be super excited to see our API integrated in LangChain! This essentially consists in adding a LangChain retriever and tool, which is done in our own [package](https://pypi.org/project/langchain-linkup/). Here we're simply following the [integration documentation](https://python.langchain.com/docs/contributing/how_to/integrations/) and update the documentation of LangChain to mention the Linkup integration. We do have tests (both units & integration) in our [source code](https://github.com/LinkupPlatform/langchain-linkup), and tried to follow as close as possible the [integration documentation](https://python.langchain.com/docs/contributing/how_to/integrations/) which specifically requests to focus on documentation changes for an integration PR, so I'm not adding tests here, even though the PR checklist seems to suggest so. Feel free to correct me if I got this wrong! By the way, we would be thrilled by being mentioned in the list of providers which have standalone packages [here](https://langchain-git-fork-linkupplatform-cj-doc-langchain.vercel.app/docs/integrations/providers/), is there something in particular for us to do for that? 🙂 ## Twitter handle Linkup_platform	2024-12-09 14:36:25 -08:00
Amir Sadeghi	2c49f587aa	community[fix]: could not locate runnable browser (#28289 ) set open_browser to false to resolve "could not locate runnable browser" error while default browser is None Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-09 21:05:52 +00:00
Martin Triska	75bc6bb191	community: [bugfix] fix source path for office files in O365 (#28260 ) # What problem are we fixing? Currently, documents loaded using `O365BaseLoader` fetch their source from `file.web_url` (where `file` is `<class 'O365.drive.File'>`). This works well for `.pdf` documents. Unfortunately, office documents (`.xlsx`, `.docx` ...) pass their `web_url` in the following format: `https://sharepoint_address/sites/path/to/library/root/Doc.aspx?sourcedoc=%XXXXXXXX-1111-1111-XXXX-XXXXXXXXXX%7D&file=filename.xlsx&action=default&mobileredirect=true` This obfuscates the path to the file. This PR uses the parent folder's path and the file name to reconstruct the actual location of the file. Knowing the file's location can be crucial for some RAG applications (the path to the file can carry information we don't want to lose). @vbarda Could you please look at this one? I'm @-mentioning you since we've already closed some PRs together :-) Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-09 12:34:59 -08:00
Erick Friis	534b8f4364	standard-tests: release 0.3.7 (#28637 )	2024-12-09 15:12:18 -05:00
Naka Masato	ce3b69aa05	community: add include_labels option to ConfluenceLoader (#28259 ) ## Description: Enable `ConfluenceLoader` to include labels with the `include_labels` option (`false` by default for backward compatibility). The labels are stored in the `metadata` of the `Document`, e.g. `{"labels": ["l1", "l2"]}`. ## Notes The Confluence API returns labels when `metadata.labels` is passed to the `expand` query parameter. All of the following functions support `expand` in the same way: - confluence.get_page_by_id - confluence.get_all_pages_by_label - confluence.get_all_pages_from_space - cql (internally using [/api/content/search](https://developer.atlassian.com/cloud/confluence/rest/v1/api-group-content/#api-wiki-rest-api-content-search-get)) ## Issue: No issue related to this PR. ## Dependencies: No changes. ## Twitter handle: [@gymnstcs](https://x.com/gymnstcs) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-09 19:35:01 +00:00
Rajendra Kadam	242fee11be	community[minor] Pebblo: Support for new Pinecone class PineconeVectorStore (#28253 ) - Description: Support for new Pinecone class PineconeVectorStore in PebbloRetrievalQA. - Issue: NA - Dependencies: NA - Tests: - Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-09 19:33:54 +00:00
nikitajoyn	9fcd203556	partners/mistralai: Fix KeyError in Vertex AI stream (#28624 ) - Description: Streaming response from Mistral model using Vertex AI raises KeyError when trying to access `choices` key, that the last chunk doesn't have. The fix is to access the key safely using `get()`. - Issue: https://github.com/langchain-ai/langchain/issues/27886 - Dependencies: - Twitter handle:	2024-12-09 14:14:58 -05:00
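The safe-access pattern described in 9fcd203556 can be shown on plain dictionaries. In this sketch the chunk shape (`choices`, then `delta`, then `content`) and the function name are assumptions made for illustration; the actual chunk handling in the `mistralai` partner package may differ.

```python
from typing import Any, Dict, Optional


def extract_content(chunk: Dict[str, Any]) -> Optional[str]:
    # chunk["choices"] would raise KeyError on a final chunk that carries
    # no "choices" key; .get() returns None instead, so the chunk is skipped.
    choices = chunk.get("choices")
    if not choices:
        return None
    return choices[0].get("delta", {}).get("content")


stream = [
    {"choices": [{"delta": {"content": "Hello"}}]},
    {"usage": {"total_tokens": 3}},  # last chunk without "choices"
]
print([extract_content(chunk) for chunk in stream])
```

Returning `None` for a chunk without `choices` lets the caller skip it rather than abort the whole stream.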
maang-h	b64d846347	docs: Standardize MoonshotChat docstring (#28159 ) - Description: Add docstring Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-09 18:46:25 +00:00
Erick Friis	4c70ffff01	standard-tests: sync/async vectorstore tests conditional (#28636 ) Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-09 18:02:55 +00:00
ccurme	ffb5c1905a	openai[patch]: release 0.2.12 (#28633 )	2024-12-09 12:38:13 -05:00
ccurme	6e6061fe73	openai[patch]: bump minimum SDK version (#28632 ) Resolves https://github.com/langchain-ai/langchain/issues/28625	2024-12-09 11:28:05 -05:00
Mohammad Mohtashim	ec9b41431e	[Core]: Small Docstring Clarification for `BaseTool` (#28148 ) - Description: `kwargs` are not being passed to `run` of the `BaseTool` which has been fixed - Issue: #28114 --------- Co-authored-by: Stevan Kapicic <kapicic.ste1@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-09 06:10:19 +00:00
Erick Friis	cef21a0b49	cli: warning on app add (#28619 ) instead of #28128	2024-12-09 06:07:14 +00:00
Ankit Dangi	90f162efb6	text-splitters: add pydocstyle linting (#28127 ) As seen in #23188, turned on Google-style docstrings by enabling `pydocstyle` linting in the `text-splitters` package. Each resulting linting error was addressed differently: ignored, resolved, suppressed, and missing docstrings were added. Fixes one of the checklist items from #25154, similar to #25939 in `core` package. Ran `make format`, `make lint` and `make test` from the root of the package `text-splitters` to ensure no issues were found. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-09 06:01:03 +00:00
WGNW_MG	eabe587787	community[patch]: Fix get_openai_callback() returning token_cost=0.0 when model is gpt-4o-11-20 (#28408 ) - Description: update MODEL_COST_PER_1K_TOKENS for the new gpt-4o-11-20. - Issue: with the latest gpt-4o-11-20, the OpenAI callback returns token_cost=0.0 - Dependencies: None (just a simple dict fix.) - Twitter handle: I Don't Use Twitter. - (However..., I have a YouTube channel. Could you upload this there, by any chance? https://www.youtube.com/@%EA%B2%9C%EC%B0%BD%EB%B6%80%EA%B3%A0%EB%AC%B8AI%EC%9E%90%EB%AC%B8%EC%84%BC%EC%84%B8)	2024-12-08 20:46:50 -08:00
Fahim Zaman	481c4bfaba	core[patch]: Fixed trim functions, and added corresponding unit test for the solved issue (#28429 ) - Description: - Trim functions were incorrectly deleting nodes with more than 1 outgoing/incoming edge, so an extra condition was added to check for this directly. A unit test "test_trim_multi_edge" was written to test this test case specifically. - Issue: - Fixes #28411 - Fixes https://github.com/langchain-ai/langgraph/issues/1676 - Dependencies: - No changes were made to the dependencies - [x] Unit tests were added to verify the changes. - [x] Updated documentation where necessary. - [x] Ran make format, make lint, and make test to ensure compliance with project standards. --------- Co-authored-by: Tasif Hussain <tasif006@gmail.com>	2024-12-08 20:45:28 -08:00
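The guard described in #28429 can be sketched on a toy graph. This is an illustration only, not the `langchain-core` implementation: the `trim_first_node` function, its signature, and the set-and-edge-list graph representation are all assumptions made for the example. It shows a first node being trimmed only when it has exactly one outgoing edge and no incoming edges.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]  # (source, target)


def trim_first_node(
    nodes: Set[str], edges: List[Edge], first: str
) -> Tuple[Set[str], List[Edge]]:
    """Drop `first` only if it has exactly one outgoing edge and no incoming edges.

    Hypothetical sketch: a node that fans out to several targets is kept,
    because removing it would leave the graph without a single entry point.
    """
    outgoing = [e for e in edges if e[0] == first]
    incoming = [e for e in edges if e[1] == first]
    if len(outgoing) == 1 and not incoming:
        kept_edges = [e for e in edges if e[0] != first]
        return nodes - {first}, kept_edges
    return nodes, edges


# Linear graph: "start" has one outgoing edge, so it is trimmed.
linear = trim_first_node({"start", "a", "b"}, [("start", "a"), ("a", "b")], "start")

# Fan-out graph: "start" has two outgoing edges, so it is left in place.
fan_out = trim_first_node({"start", "a", "b"}, [("start", "a"), ("start", "b")], "start")
```

A symmetric check on incoming edges would apply when trimming a last node.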
Marco Perini	2354bb7bfa	partners: 🕷️🦜 ScrapeGraph API Integration (#28559 ) Hi Langchain team! I'm the co-founder and maintainer at [ScrapeGraphAI](https://scrapegraphai.com/). By following the integration [guide](https://python.langchain.com/docs/contributing/how_to/integrations/publish/) on your site, I have created a new lib called [langchain-scrapegraph](https://github.com/ScrapeGraphAI/langchain-scrapegraph). With this PR I would like to integrate Scrapegraph as a provider in Langchain, adding the required documentation files. Let me know if any changes are needed for it to be properly integrated, both in the lib and in the documentation. Thank you 🕷️🦜 If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-09 02:38:21 +00:00
Abhinav	317a38b83e	community[minor]: Add support for model2vec embeddings (#28507 ) This PR adds an embeddings integration for model2vec, the `Model2vecEmbeddings` class. - Description: [Model2Vec](https://github.com/MinishLab/model2vec) lets you turn any sentence transformer into a really small static model and makes running the model faster. - Issue: - Dependencies: model2vec ([pypi](https://pypi.org/project/model2vec/)) - Twitter handle: - [x] Add tests and docs: - [Test](https://github.com/blacksmithop/langchain/blob/model2vec_embeddings/libs/community/langchain_community/embeddings/model2vec.py), [docs](https://github.com/blacksmithop/langchain/blob/model2vec_embeddings/docs/docs/integrations/text_embedding/model2vec.ipynb) - [x] Lint and test: --------- Co-authored-by: Abhinav KM <abhinav.m@zerone-consulting.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-12-09 02:17:22 +00:00
Mohammad Mohtashim	524ee6d9ac	Invalid `tool_choice` being passed to `ChatLiteLLM` (#28198 ) - Description: An invalid `tool_choice` is passed to `bind_tools` of `ChatLiteLLM` because its parent class's default value is passed through `with_structured_output`. - Issue: #28176	2024-12-07 14:33:40 -05:00
Erick Friis	dd0085a9ff	docs: standard tests to markdown, load templates from files (#28603 )	2024-12-07 01:37:21 +00:00
Erick Friis	5e8553c31a	standard-tests: retriever docstrings (#28596 )	2024-12-07 00:32:19 +00:00
ccurme	d801c6ffc7	tests[patch]: nits (#28601 )	2024-12-07 00:13:04 +00:00
Erick Friis	07c2ac765a	community: release 0.3.10 (#28600 )	2024-12-07 00:07:13 +00:00

6291 Commits