langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-09-01 11:02:37 +00:00

Author	SHA1	Message	Date
Joydeep Banik Roy	3796672c67	community, milvus, pinecone, qdrant, mongo: Broadcast operation failure while using simsimd beyond v3.7.7 (#22271 ) Packages affected: - community: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/milvus: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/mongodb: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/pinecone: fix `cosine_similarity` to support simsimd beyond 3.7.7 - partners/qdrant: fix `cosine_similarity` to support simsimd beyond 3.7.7 Broadcast operation failure while using simsimd beyond v3.7.7: - Description: I was using simsimd 4.3.1 and the unsupported operand type issue popped up. When I checked out the repo and ran the tests, they failed as well (screenshot attached). This looks like a variant of https://github.com/langchain-ai/langchain/issues/18022 . Prior to 3.7.7, simd.cdist returned an ndarray, but it now returns simsimd.DistancesTensor, which cannot take part in a broadcast operation with numpy. This change also removes the need to explicitly cast `Z` to a numpy array. - Issue: #19905 - Dependencies: No - Twitter handle: https://x.com/GetzJoydeep Considerations: 1. I started with community, but since similar changes were needed in Milvus, MongoDB, Pinecone, and Qdrant, I modified their files as well. If touching multiple packages in one PR is not the norm, I can remove them from this PR and raise separate ones. 2. I have run the tests and verified that they pass. Since only MongoDB had tests, I ran theirs and verified that they pass as well (screenshots attached). I have added a test for simsimd. It may not fit the CI/CD setup, since simsimd is not a required dependency. I have just imported simsimd to ensure that the simsimd cosine similarity is invoked; however, this is not a good approach. Suggestions are welcome and I can make the required changes on the PR. Please provide guidance, as I am new to the community. --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-04 17:36:31 +00:00
Michal Gregor	98b2e7b195	huggingface[patch]: Support for HuggingFacePipeline in ChatHuggingFace. (#22194 ) - Description: Added support for using HuggingFacePipeline in ChatHuggingFace (previously it was only usable with API endpoints, probably by oversight). - Issue: #19997 - Dependencies: none - Twitter handle: none --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-06-04 00:47:35 +00:00
Miroslav	cbd5720011	huggingface[patch]: Skip Login to HuggingFaceHub when token is not set (#22365 )	2024-06-03 15:20:32 -07:00
Bagatur	678a19a5f7	infra: bump anthropic mypy 1 (#22373 )	2024-06-03 08:21:55 -07:00
Bagatur	a8098f5ddb	anthropic[patch]: Release 0.1.15, fix sdk tools break (#22369 )	2024-05-31 12:10:22 -07:00
Erick Friis	6ffa0acf32	ai21: fix text-splitters version (#22366 )	2024-05-31 11:41:05 -04:00
ccurme	f34337447f	openai: update ChatOpenAI api ref (#22324 ) Update to reflect that token usage is no longer default in streaming mode. Add detail for streaming context under Token Usage section.	2024-05-30 12:31:28 -04:00
ChengZi	2443e85533	docs: fix milvus import and update template (#22306 ) docs: fix milvus import problem update milvus-rag template with milvus-lite Signed-off-by: ChengZi <chen.zhang@zilliz.com>	2024-05-30 08:28:55 -07:00
ccurme	6e1df72a88	openai[patch]: Release 0.1.8 (#22291 )	2024-05-29 20:08:30 +00:00
Bagatur	6dd0f095c3	docs: revamp ChatOpenAI (#22253 ) Can build API ref docs by running ```bash make api_docs_clean; make api_docs_quick_preview API_PKG=openai ``` only builds openai ref, takes ~20 sec	2024-05-29 10:20:14 -07:00
Erick Friis	00c70d98c2	robocorp: release 0.0.9 (#22282 )	2024-05-29 16:49:18 +00:00
Mikko Korpela	fc5909ad6f	langchain-robocorp: Fix parsing of Union types (such as Optional). (#22277 )	2024-05-29 09:47:02 -07:00
ccurme	af1f723ada	openai: don't override stream_options default (#22242 ) ChatOpenAI supports a kwarg `stream_options` which can take values `{"include_usage": True}` and `{"include_usage": False}`. Setting include_usage to True adds a message chunk to the end of the stream with usage_metadata populated. In this case the final chunk no longer includes `"finish_reason"` in the `response_metadata`. This is the current default and is not yet released. Because this could be disruptive to workflows, here we remove this default. The default will now be consistent with OpenAI's API (see parameter [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)). Examples: ```python from langchain_openai import ChatOpenAI llm = ChatOpenAI() for chunk in llm.stream("hi"): print(chunk) ``` ``` content='' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' content='Hello' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' content='!' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' content='' response_metadata={'finish_reason': 'stop'} id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' ``` ```python for chunk in llm.stream("hi", stream_options={"include_usage": True}): print(chunk) ``` ``` content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='Hello' id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='!' id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='' response_metadata={'finish_reason': 'stop'} id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17} ``` ```python llm = ChatOpenAI().bind(stream_options={"include_usage": True}) for chunk in llm.stream("hi"): print(chunk) ``` ``` content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='Hello' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='!' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='' response_metadata={'finish_reason': 'stop'} id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17} ```	2024-05-29 10:30:40 -04:00
Erick Friis	93240fac68	milvus: fix core dep (#22239 )	2024-05-28 10:21:37 -07:00
ChengZi	404d92ded0	milvus: New langchain_milvus package and new milvus features (#21077 ) New features: - New langchain_milvus package in partner - Milvus collection hybrid search retriever - Zilliz cloud pipeline retriever - Milvus Local guide - Rag-milvus template --------- Signed-off-by: ChengZi <chen.zhang@zilliz.com> Signed-off-by: Jael Gu <mengjia.gu@zilliz.com> Co-authored-by: Jael Gu <mengjia.gu@zilliz.com> Co-authored-by: Jackson <jacksonxie612@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-05-28 08:24:20 -07:00
Leonid Ganeline	d6995e814b	ai21[patch]: added `license` (#22153 ) The `pyproject.toml` missed the `license` parameter. I've added it as `MIT`	2024-05-27 15:14:14 -07:00
Mohammad Mohtashim	577ed68b59	mistralai[patch]: Added Json Mode for ChatMistralAI (#22213 ) - Description: Powered [ChatMistralAI.with_structured_output](`fbfed65fb1/libs/partners/mistralai/langchain_mistralai/chat_models.py (L609)`) via json mode - Issue: #22081 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-27 21:16:52 +00:00
Erick Friis	42ffcb2ff1	anthropic: release 0.1.14rc2, test release note gen (#22147 )	2024-05-24 12:40:10 -07:00
ccurme	9a010fb761	openai: read stream_options (#21548 ) OpenAI recently added a `stream_options` parameter to its chat completions API (see [release notes](https://platform.openai.com/docs/changelog/added-chat-completions-stream-usage)). When this parameter is set to `{"include_usage": True}`, an extra "empty" message is added to the end of a stream containing token usage. Here we propagate token usage to `AIMessage.usage_metadata`. We enable this feature by default. Streams would now include an extra chunk at the end, after the chunk with `response_metadata={'finish_reason': 'stop'}`. New behavior: ``` [AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='Hello', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='!', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17})] ``` Old behavior (accessible by passing `stream_options={"include_usage": False}` into (a)stream): ``` [AIMessageChunk(content='', id='run-1312b971-c5ea-4d92-9015-e6604535f339'), AIMessageChunk(content='Hello', id='run-1312b971-c5ea-4d92-9015-e6604535f339'), AIMessageChunk(content='!', id='run-1312b971-c5ea-4d92-9015-e6604535f339'), AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-1312b971-c5ea-4d92-9015-e6604535f339')] ``` From what I can tell this is not yet implemented in Azure, so we enable only for ChatOpenAI.	2024-05-24 13:20:56 -04:00
Bagatur	baa3c975cb	anthropic[patch]: allow tool call mutation (#22130 ) If tool_use blocks and tool_calls with overlapping IDs are present, prefer the values of the tool_calls. Allows for mutating AIMessages just via tool_calls.	2024-05-24 08:18:14 -07:00
ccurme	0ea1e89b2c	groq: read tool calls from .tool_calls attribute (#22096 )	2024-05-23 18:16:06 -04:00
Eugene Yurtsev	2d693c484e	docs: fix some spelling mistakes caught by newest version of code spell (#22090 ) Going to merge this even though it doesn't pass all tests, and open a separate PR for the remaining spelling mistakes.	2024-05-23 16:59:11 -04:00
ccurme	152c8cac33	anthropic, openai: cut pre-releases (#22083 )	2024-05-23 15:02:23 -04:00
ccurme	fbfed65fb1	core, partners: add token usage attribute to AIMessage (#21944 ) ```python class UsageMetadata(TypedDict): """Usage metadata for a message, such as token counts. Attributes: input_tokens: (int) count of input (or prompt) tokens output_tokens: (int) count of output (or completion) tokens total_tokens: (int) total token count """ input_tokens: int output_tokens: int total_tokens: int ``` ```python class AIMessage(BaseMessage): ... usage_metadata: Optional[UsageMetadata] = None """If provided, token usage information associated with the message.""" ... ```	2024-05-23 14:21:58 -04:00
junkeon	4fda7bf4f2	upstage[patch] : fix error handling in Layout Analysis parser (#22054 ) This pull request addresses and fixes exception handling in the UpstageLayoutAnalysisParser and enhances the test coverage by adding error exception tests for the document loader. These improvements ensure robust error handling and increase the reliability of the system when dealing with external API calls and JSON responses. ### Changes Made 1. Fix Request Exception Handling: - Issue: The existing implementation of UpstageLayoutAnalysisParser did not properly handle exceptions thrown by the requests library, which could lead to unhandled exceptions and potential crashes. - Solution: Added comprehensive exception handling for requests.RequestException to catch any request-related errors. This includes logging the error details and raising a ValueError with a meaningful error message. 2. Add Error Exception Tests for Document Loader: - New Tests: Introduced new test cases to verify the robustness of the UpstageLayoutAnalysisLoader against various error scenarios. The tests ensure that the loader gracefully handles: - RequestException: Simulates network issues or invalid API requests to ensure appropriate error handling and user feedback. - JSONDecodeError: Simulates scenarios where the API response is not a valid JSON, ensuring the system does not crash and provides clear error messaging.	2024-05-23 11:45:34 -04:00
JuHyung Son	d9eff44400	partner-upstage[patch]: embeddings empty list bug (#22057 ) Fixed an error in `embed_documents` when the input is an empty list. Also revised the documentation.	2024-05-23 11:44:30 -04:00
Bagatur	50186da0a1	infra: rm unused # noqa violations (#22049 ) Updating #21137	2024-05-22 15:21:08 -07:00
Klaudia Lemiec	45351d1bc6	docs: Chroma docstrings update (#22001 ) - Description: Added and updated Chroma docstrings - Issue: https://github.com/langchain-ai/langchain/issues/21983	2024-05-22 21:45:30 +00:00
Erick Friis	ef53ccf54b	robocorp: release 0.0.8 (#22034 )	2024-05-22 16:41:41 +00:00
Asaf Joseph Gardin	a042e804b4	ai21: AI21 Jamba docs (#21978 ) - Updated docs to have an example to use Jamba instead of J2 --------- Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-05-21 19:27:46 +00:00
ccurme	4be5537837	Revert "anthropic: set default model" (#21987 ) Reverts langchain-ai/langchain#21986	2024-05-21 17:28:32 +00:00
ccurme	35439cf3bd	anthropic: set default model (#21986 ) Various docs reference `ChatAnthropic()`, but this currently raises ValidationError.	2024-05-21 17:24:31 +00:00
Alex Riina	c0e3c3a350	openai[patch], community[patch]: add pricing and max context window for GPT-4o (#21673 ) # Add pricing and max context window for GPT-4o - community: add cost per 1k tokens and max context window - partners: add max context window Description: adds static information about GPT-4o based on https://openai.com/api/pricing/ and https://platform.openai.com/docs/models/gpt-4o so that GPT-4o reporting is accurate. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-05-20 23:47:43 +00:00
Leonid Ganeline	e98a4fd19a	ai21[patch]: configuration fix (#21790 ) added "repository" and "Source Code" parameters (these parameters are missed only in this partner package configuration).	2024-05-20 15:49:38 -07:00
Trayan Azarov	f54cbf8ff5	chroma[patch]: Chroma - remove reference to collection upon delete_collection (#21817 ) Description: - Reference to `Collection` object is set to `None` when deleting a collection `delete_collection()` - Added utility method `reset_collection()` to allow recreating the collection - Moved collection creation out of `__init__` into `__ensure_collection()` to be reused by object init and `reset_collection()` - `_collection` is now a property to avoid breaking changes Issues: - chroma-core/chroma#2213 Twitter: @t_azarov	2024-05-20 15:42:36 -07:00
Jared Van Bortel	25d1c1c9bb	nomic: implement local embeddings with the inference_mode parameter (#21934 ) ## Description This PR implements local and dynamic mode in the Nomic Embed integration using the inference_mode and device parameters. They work as documented [here](https://docs.nomic.ai/reference/python-api/embeddings#local-inference). --------- Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-05-20 14:17:07 -07:00
ccurme	4470d3b4a0	partners: bump core in packages implementing ls_params (#21868 ) These packages all import `LangSmithParams` which was released in langchain-core==0.2.0. N.B. we will need to release `openai` and then bump `langchain-openai` in `together` and `upstage`.	2024-05-20 11:51:43 -07:00
ccurme	9c76739425	mistral: implement ls_params (#21867 )	2024-05-20 11:49:48 -07:00
fzowl	d3624eaba1	partners: Remove unnecessary print from voyageai embeddings (#21865 ) Remove unnecessary print from voyageai embeddings	2024-05-18 08:57:17 -04:00
ccurme	181dfef118	core, standard tests, partner packages: add test for model params (#21677 ) 1. Adds `.get_ls_params` to BaseChatModel which returns ```python class LangSmithParams(TypedDict, total=False): ls_provider: str ls_model_name: str ls_model_type: Literal["chat"] ls_temperature: Optional[float] ls_max_tokens: Optional[int] ls_stop: Optional[List[str]] ``` By default it will only return ```python {"ls_model_type": "chat", "ls_stop": stop} ``` 2. Add these params to inheritable metadata in `CallbackManager.configure` 3. Implement `.get_ls_params` and populate all params for Anthropic + all subclasses of BaseChatOpenAI Sample trace: https://smith.langchain.com/public/d2962673-4c83-47c7-b51e-61d07aaffb1b/r Screenshots attached for OpenAI, Anthropic, and Mistral (and all others for which params are not yet populated).	2024-05-17 13:51:26 -04:00
Bakar Tavadze	3b5ac44e03	langchain-robocorp[minor]: Enable passing additional headers to the action server. (#21809 ) Actions can optionally receive secrets via request headers. This PR enables this functionality.	2024-05-17 15:08:48 +00:00
Asaf Joseph Gardin	f3289b898c	partners: Revert AI21 Labs docs scan feature (#21699 ) Description: Reverted commit #21614 --------- Co-authored-by: Asaf Gardin <asafg@ai21.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-05-16 22:58:40 +00:00
Bagatur	6416d16d39	anthropic[patch]: Release 0.1.13, tool_choice support (#21773 )	2024-05-16 17:56:29 +00:00
Bagatur	867adbf27b	docs: add aca-ds (#21746 )	2024-05-16 08:52:07 +00:00
Erick Friis	06110e20b9	pinecone: bump min core version (#21742 )	2024-05-15 19:31:43 -07:00
Erick Friis	bd3e7d50f3	fireworks: bump min core version (#21741 )	2024-05-15 19:29:13 -07:00
Erick Friis	f5c31078d7	airbyte[patch]: airbyte-cdk compatible pydantic versions (#21738 )	2024-05-15 19:13:25 -07:00
Erick Friis	3d33b89fa4	ibm[patch]: release 0.1.7 (#21737 )	2024-05-15 19:10:15 -07:00
Erick Friis	e41d801369	openai[patch]: fix embedding float precision issue (#21736 ) also clean up + comment some of the embedding batching code	2024-05-16 02:06:51 +00:00
JuHyung Son	38c297a025	upstage: Support batch input in embedding request. (#21730 ) Description: upstage embedding now supports batch input.	2024-05-15 18:13:44 -07:00
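
Commits 9a010fb761 and af1f723ada note that, when usage reporting is requested, the chunk carrying `finish_reason` is no longer the last chunk of the stream, because one more chunk with `usage_metadata` follows it. The sketch below illustrates one way a consumer might cope with that. It is not LangChain code: `Chunk` and `summarize_stream` are hypothetical names, plain dicts stand in for message chunks, and only the three `UsageMetadata` fields are taken from fbfed65fb1.

```python
from typing import Iterable, List, Optional, Tuple, TypedDict


class UsageMetadata(TypedDict):
    """Same three fields as the TypedDict quoted in fbfed65fb1."""
    input_tokens: int
    output_tokens: int
    total_tokens: int


class Chunk(TypedDict, total=False):
    """Hypothetical stand-in for a streamed chunk (not a LangChain class)."""
    content: str
    response_metadata: dict
    usage_metadata: UsageMetadata


def summarize_stream(
    chunks: Iterable[Chunk],
) -> Tuple[str, Optional[str], Optional[UsageMetadata]]:
    """Collect text, finish_reason, and usage without assuming chunk order."""
    parts: List[str] = []
    finish_reason: Optional[str] = None
    usage: Optional[UsageMetadata] = None
    for chunk in chunks:
        parts.append(chunk.get("content", ""))
        # finish_reason may sit on the second-to-last chunk, so scan every chunk.
        reason = chunk.get("response_metadata", {}).get("finish_reason")
        if reason is not None:
            finish_reason = reason
        # The usage chunk, when present, has empty content and arrives last.
        if chunk.get("usage_metadata") is not None:
            usage = chunk["usage_metadata"]
    return "".join(parts), finish_reason, usage


# Sample data shaped like the outputs quoted in the commit messages.
without_usage: List[Chunk] = [
    {"content": ""},
    {"content": "Hello"},
    {"content": "!"},
    {"content": "", "response_metadata": {"finish_reason": "stop"}},
]
with_usage: List[Chunk] = without_usage + [
    {
        "content": "",
        "usage_metadata": {"input_tokens": 8, "output_tokens": 9, "total_tokens": 17},
    }
]

print(summarize_stream(without_usage))
print(summarize_stream(with_usage))
```

Scanning every chunk, rather than inspecting only the final one, keeps the same consumer working whether or not the extra usage chunk is present.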

839 Commits