langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-07-31 00:29:57 +00:00

Author	SHA1	Message	Date
Erick Friis	a119cae5bd	partners/mistralai: release 0.2.4 (#28803 )	2024-12-18 22:11:48 +00:00
Erick Friis	514d78516b	partners/ollama: release 0.2.2 (#28802 )	2024-12-18 22:11:08 +00:00
Bagatur	68940dd0d6	openai[patch]: Release 0.2.13 (#28800 )	2024-12-18 22:08:47 +00:00
Bagatur	4a531437bb	core[patch], openai[patch]: Handle OpenAI developer msg (#28794 ) - Convert developer openai messages to SystemMessage - store additional_kwargs={"__openai_role__": "developer"} so that the correct role can be reconstructed if needed - update ChatOpenAI to read in openai_role --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-18 21:54:07 +00:00
Bagatur	e4d3ccf62f	json mode standard test (#25497 ) Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-17 18:47:34 +00:00
ccurme	b745281eec	anthropic[patch]: increase timeouts for integration tests (#28767 ) Some tests consistently ran into the 10s limit in CI.	2024-12-17 15:47:17 +00:00
Vinit Kudva	a00258ec12	chroma: fix persistence if client_settings is passed in (#25199 ) …ent path given. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-17 10:03:02 -05:00
Omri Eliyahu Levy	f8883a1321	partners/voyageai: enable setting output dimension (#28740 ) Voyage has introduced voyage-3-large and voyage-code-3, which feature different output dimensions by leveraging a technique called "Matryoshka Embeddings" (see blog - https://blog.voyageai.com/2024/12/04/voyage-code-3/). These two models are available in various sizes: [256, 512, 1024, 2048] (https://docs.voyageai.com/docs/embeddings#model-choices). This PR adds the option to set the required output dimension.	2024-12-17 10:02:00 -05:00
Manuel	af2e0a7ede	partners: add 'model' alias for consistency in embedding classes (#28374 ) Description: This PR introduces a `model` alias for the embedding classes that contain the attribute `model_name`, to ensure consistency across the codebase, as suggested by a moderator in a previous PR. The change aligns the usage of attribute names across the project (see for example [here](`65deeddd5d/libs/partners/groq/langchain_groq/chat_models.py (L304)`)). Issue: This PR addresses the suggestion from the review of issue #28269. Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-13 22:30:00 +00:00
Erick Friis	3107d78517	huggingface: fix standard test lint (#28714 )	2024-12-13 22:18:54 +00:00
Kaiwei Zhang	b909d54e70	chroma[patch]: Update logic for assigning ids	2024-12-13 21:58:34 +00:00
Wang, Yi	d834c6b618	huggingface: fix tool argument serialization in _convert_TGI_message_to_LC_message (#26075 ) Currently `_convert_TGI_message_to_LC_message` replaces `'` in the tool arguments, so an argument like "It's" will be converted to `It"s` and could cause a json parser to fail. --------- Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Vadym Barda <vadym@langchain.dev>	2024-12-11 18:34:32 -08:00
Mohammad Mohtashim	a37afbe353	mistral[minor]: Added Retrying Mechanism in case of Request Rate Limit Error for `MistralAIEmbeddings` (#27818 ) - Description:: In the event of a Rate Limit Error from the MistralAI server, the response JSON raises a KeyError. To address this, a simple retry mechanism has been implemented to handle cases where the request limit is exceeded. - Issue: #27790 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-12-11 17:53:42 -05:00
ccurme	bc4dc7f4b1	ollama[patch]: permit streaming for tool calls (#28654 ) Resolves https://github.com/langchain-ai/langchain/issues/28543 Ollama recently [released](https://github.com/ollama/ollama/releases/tag/v0.4.6) support for streaming tool calls. Previously we would override the `stream` parameter if tools were passed in. Covered in standard tests here: `c1d348e95d/libs/standard-tests/langchain_tests/integration_tests/chat_models.py (L893-L897)` Before, the test generates one message chunk: ```python [ AIMessageChunk( content='', additional_kwargs={}, response_metadata={ 'model': 'llama3.1', 'created_at': '2024-12-10T17:49:04.468487Z', 'done': True, 'done_reason': 'stop', 'total_duration': 525471208, 'load_duration': 19701000, 'prompt_eval_count': 170, 'prompt_eval_duration': 31000000, 'eval_count': 17, 'eval_duration': 473000000, 'message': Message( role='assistant', content='', images=None, tool_calls=[ ToolCall( function=Function(name='magic_function', arguments={'input': 3}) ) ] ) }, id='run-552bbe0f-8fb2-4105-ada1-fa38c1db444d', tool_calls=[ { 'name': 'magic_function', 'args': {'input': 3}, 'id': 'b0a4dc07-7d7a-487b-bd7b-ad062c2363a2', 'type': 'tool_call', }, ], usage_metadata={ 'input_tokens': 170, 'output_tokens': 17, 'total_tokens': 187 }, tool_call_chunks=[ { 'name': 'magic_function', 'args': '{"input": 3}', 'id': 'b0a4dc07-7d7a-487b-bd7b-ad062c2363a2', 'index': None, 'type': 'tool_call_chunk', } ] ) ] ``` After, it generates two (tool call in one, response metadata in another): ```python [ AIMessageChunk( content='', additional_kwargs={}, response_metadata={}, id='run-9a3f0860-baa1-4bae-9562-13a61702de70', tool_calls=[ { 'name': 'magic_function', 'args': {'input': 3}, 'id': '5bbaee2d-c335-4709-8d67-0783c74bd2e0', 'type': 'tool_call', }, ], tool_call_chunks=[ { 'name': 'magic_function', 'args': '{"input": 3}', 'id': '5bbaee2d-c335-4709-8d67-0783c74bd2e0', 'index': None, 'type': 'tool_call_chunk', }, ], ), AIMessageChunk( content='', 
additional_kwargs={}, response_metadata={ 'model': 'llama3.1', 'created_at': '2024-12-10T17:46:43.278436Z', 'done': True, 'done_reason': 'stop', 'total_duration': 514282750, 'load_duration': 16894458, 'prompt_eval_count': 170, 'prompt_eval_duration': 31000000, 'eval_count': 17, 'eval_duration': 464000000, 'message': Message( role='assistant', content='', images=None, tool_calls=None ), }, id='run-9a3f0860-baa1-4bae-9562-13a61702de70', usage_metadata={ 'input_tokens': 170, 'output_tokens': 17, 'total_tokens': 187 } ), ] ```	2024-12-10 12:54:37 -05:00
ccurme	5c6e2cbcda	ollama[patch]: support structured output (#28629 ) - Bump minimum version of `ollama` to 0.4.4 (which also addresses https://github.com/langchain-ai/langchain/issues/28607). - Support recently-released [structured output](https://ollama.com/blog/structured-outputs) feature. This can be accessed by calling `.with_structured_output` with `method="json_schema"` (choice of name [mirrors](https://python.langchain.com/api_reference/openai/chat_models/langchain_openai.chat_models.base.ChatOpenAI.html#langchain_openai.chat_models.base.ChatOpenAI.with_structured_output) what we have for OpenAI's structured output feature). `ChatOllama` previously implemented `.with_structured_output` via the [base implementation](`ec9b41431e/libs/core/langchain_core/language_models/chat_models.py (L1117)`).	2024-12-10 10:36:00 -05:00
nikitajoyn	9fcd203556	partners/mistralai: Fix KeyError in Vertex AI stream (#28624 ) - Description: Streaming a response from a Mistral model via Vertex AI raises a KeyError when accessing the `choices` key, which the last chunk doesn't have. The fix is to access the key safely using `get()`. - Issue: https://github.com/langchain-ai/langchain/issues/27886	2024-12-09 14:14:58 -05:00
ccurme	ffb5c1905a	openai[patch]: release 0.2.12 (#28633 )	2024-12-09 12:38:13 -05:00
ccurme	6e6061fe73	openai[patch]: bump minimum SDK version (#28632 ) Resolves https://github.com/langchain-ai/langchain/issues/28625	2024-12-09 11:28:05 -05:00
Erick Friis	0eb7ab65f1	multiple: fix xfailed signatures (#28597 )	2024-12-06 15:39:47 -08:00
ccurme	2c6bc74cb1	multiple: combine sync/async vector store standard test suites (#28580 ) Breaking change in `langchain-tests`.	2024-12-06 14:55:06 -05:00
blaufink	28f8d436f6	mistral: fix of issue #26029 (#28233 ) - Description: Azure AI takes issue with the safe_mode parameter being set to False instead of None. Therefore, this PR changes the default value of safe_mode from False to None. This results in it being filtered out before the request is sent, avoiding the extra-parameter issue described below. - Issue: #26029 - Dependencies: / --------- Co-authored-by: blaufink <sebastian.brueckner@outlook.de> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-05 23:28:12 +00:00
ccurme	b8e861a63b	openai[patch]: add standard tests for embeddings (#28540 )	2024-12-05 17:00:27 +00:00
ZhangShenao	d26555c682	[VectorStore] Improvement: Improve chroma vector store (#28524 ) - Complete unit test - Fix spelling error	2024-12-05 11:58:32 -05:00
ccurme	8f9b3b7498	chroma[patch]: fix bug (#28538 ) Fix bug introduced in https://github.com/langchain-ai/langchain/pull/27995 If all document IDs are `""`, the chroma SDK will raise ``` DuplicateIDError: Expected IDs to be unique ``` Caught by [docs tests](https://github.com/langchain-ai/langchain/actions/runs/12180395579/job/33974633950), but added a test to langchain-chroma as well.	2024-12-05 15:37:19 +00:00
Erick Friis	c5acedddc2	anthropic: timeout in tests (10s) (#28488 )	2024-12-04 16:03:38 -08:00
ccurme	8bc2c912b8	chroma[patch]: (nit) simplify test (#28517 ) Use `self.get_embeddings` on test class instead of importing embeddings separately.	2024-12-04 20:22:55 +00:00
ccurme	eec55c2550	chroma[patch]: add `get_by_ids` and fix bug (#28516 ) - Run standard integration tests in Chroma - Add `get_by_ids` method - Fix bug in `add_texts`: if a list of `ids` is passed but any of them are None, Chroma will raise an exception. Here we assign a uuid.	2024-12-04 14:00:36 -05:00
Eric Pinzur	eff8a54756	langchain_chroma: added document.id support (#27995 ) Description: * Added internal `Document.id` support to Chroma VectorStore Dependencies: * https://github.com/langchain-ai/langchain/pull/27968 should be merged first and this PR should be re-based on top of those changes. Tests: * Modified/Added tests for `Document.id` support. All tests are passing. Note: I am not a member of the Chroma team. --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-04 00:04:27 +00:00
Erick Friis	c74f34cb41	pinecone: release 0.2.1 (version sequence) (#28485 )	2024-12-03 10:22:16 -08:00
Audrey Sage Lorberfeld	926e452f44	partners: update version header for Pinecone integration (#28481 ) Just need to update the version header used with Pinecone in recently-merged method (from [this PR](https://github.com/langchain-ai/langchain/pull/28320/files#r1867820929)). Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-03 18:08:56 +00:00
Erick Friis	7315360907	openai: dont populate logit_bias if None (#28482 )	2024-12-03 17:54:53 +00:00
Erick Friis	ff675c11f6	partners/pinecone: release 0.2.2 (#28466 )	2024-12-03 06:49:35 +00:00
Audrey Sage Lorberfeld	6b7e93d4c7	pinecone: update pinecone client (#28320 ) This PR updates the Pinecone client to `5.4.0`, as well as its dependencies (`pinecone-plugin-inference` and `pinecone-plugin-interface`). Note: `pinecone-client` is now simply called `pinecone`. Question for reviewer(s): should this PR also update the `pinecone` dep in [the root dir's `poetry.lock` file](https://github.com/langchain-ai/langchain/blob/master/poetry.lock#L6729)? Was unsure. (I don't believe so b/c it seems pinned to a lower version likely based on 3rd-party deps (e.g. Unstructured).) -- TW: @audrey_sage_	2024-12-02 22:47:09 -08:00
Erick Friis	42d40d694b	partners/openai: release 0.2.11 (#28461 )	2024-12-02 23:35:18 +00:00
Erick Friis	9f04416768	openai: set logit_bias to none instead of empty dict by default (#28460 )	2024-12-02 15:30:32 -08:00
ccurme	28487597b2	ollama[patch]: release 0.2.1 (#28458 ) We inadvertently skipped 0.2.1, so release pipeline [failed](https://github.com/langchain-ai/langchain/actions/runs/12126964367/job/33810204551).	2024-12-02 21:17:51 +00:00
ccurme	88d6d02b59	ollama[patch]: release 0.2.2 (#28456 )	2024-12-02 14:57:30 -05:00
Bagatur	47433485e7	mistral[patch]: Release 0.2.3 (#28452 )	2024-12-02 08:26:28 -08:00
ccurme	c2f1d022a2	mistral[patch]: ensure tool call IDs in tool messages are correctly formatted (#28422 ) Fixes tests for cross-provider compatibility: https://github.com/langchain-ai/langchain/actions/runs/12085358877/job/33702420504#step:10:376	2024-11-29 13:56:06 +00:00
ccurme	a8b21afc08	qdrant[patch]: run python 3.13 in CI (#28394 )	2024-11-27 12:22:17 -05:00
ccurme	ee6fc3f3f6	nomic[patch]: run python 3.13 in CI (#28393 )	2024-11-27 17:08:15 +00:00
Massimiliano Pronesti	83586661d6	partners[chroma]: add retrieval of embedding vectors (#28290 ) This PR adds an additional method to `Chroma` to retrieve the embedding vectors, besides the most relevant Documents. This is sometimes of use when you need to run a postprocessing algorithm on the retrieved results based on the vectors, which has been the case for me lately. Example issue (discussion) requesting this change: https://github.com/langchain-ai/langchain/discussions/20383 --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-11-27 16:34:02 +00:00
ccurme	733a6ad328	mistral[patch]: run python 3.13 in CI (#28392 )	2024-11-27 11:29:04 -05:00
ccurme	b9bf7fd797	couchbase[patch]: run python 3.13 in CI (#28391 )	2024-11-27 11:28:21 -05:00
TheDannyG	607c60a594	partners/ollama: fix tool calling with nested schemas (#28225 ) ## Description This PR addresses the following: Fixes Issue #25343: - Adds additional logic to parse shallowly nested JSON-encoded strings in tool call arguments, allowing for proper parsing of responses like that of Llama3.1 and 3.2 with nested schemas. Adds Integration Test for Fix: - Adds an Ollama-specific integration test to ensure the issue is resolved and to prevent regressions in the future. Fixes Failing Integration Tests: - Fixes failing integration tests (even prior to changes) caused by the `llama3-groq-tool-use` model. Previously, the tests `test_structured_output_async` and `test_structured_output_optional_param` failed due to the model not issuing a tool call in the response. Resolved by switching to `llama3.1`. ## Issue Fixes #25343. ## Dependencies No dependencies. ____ Done in collaboration with @ishaan-upadhyay @mirajismail @ZackSteine.	2024-11-27 10:32:02 -05:00
ccurme	42b8ad067d	chroma[patch]: test python 3.13 in CI (#28387 )	2024-11-27 15:02:40 +00:00
ccurme	a1c90794e1	ollama[patch]: bump to 0.4.1 in lock file (#28365 )	2024-11-26 18:19:31 +00:00
ccurme	74d9d2cba1	ollama[patch]: support ollama 0.4 (#28364 ) v0.4 of the Python SDK is already installed via the lock file in CI, but our current implementation is not compatible with it. This also addresses an issue introduced in https://github.com/langchain-ai/langchain/pull/28299. @RyanMagnuson would you mind explaining the motivation for that change? From what I can tell the Ollama SDK [does not support kwargs](`6c44bb2729/ollama/_client.py (L286)`). Previously, unsupported kwargs were ignored, but they currently raise `TypeError`. Some of LangChain's standard test suite expects `tool_choice` to be supported, so here we catch it in `bind_tools` so it is ignored and not passed through to the client.	2024-11-26 12:45:59 -05:00
Bagatur	e9c16552fa	openai[patch]: bump core dep (#28361 )	2024-11-26 08:37:05 -08:00
Bagatur	e7dc26aefb	openai[patch]: Release 0.2.10 (#28360 )	2024-11-26 08:30:29 -08:00
ccurme	42b18824c2	openai[patch]: use max_completion_tokens in place of max_tokens (#26917 ) `max_tokens` is deprecated: https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_tokens --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-11-26 16:30:19 +00:00
Erick Friis	aa7fa80e1e	partners/ollama: release 0.2.2rc1 (#28300 )	2024-11-22 22:25:05 +00:00
Erick Friis	7277794a59	ollama: include kwargs in requests (#28299 ) courtesy of @ryanmagnuson	2024-11-22 14:15:42 -08:00
Erick Friis	29f8a79ebe	groq,openai,mistralai: fix unit tests (#28279 )	2024-11-22 04:54:01 +00:00
ccurme	56499cf58b	openai[patch]: unskip test and relax tolerance in embeddings comparison (#28262 ) From what I can tell response using SDK is not deterministic: ```python import numpy as np import openai documents = ["disallowed special token '<|endoftext|>'"] model = "text-embedding-ada-002" direct_output_1 = ( openai.OpenAI() .embeddings.create(input=documents, model=model) .data[0] .embedding ) for i in range(10): direct_output_2 = ( openai.OpenAI() .embeddings.create(input=documents, model=model) .data[0] .embedding ) print(f"{i}: {np.isclose(direct_output_1, direct_output_2).all()}") ``` ``` 0: True 1: True 2: True 3: True 4: False 5: True 6: True 7: True 8: True 9: True ``` See related discussion here: https://community.openai.com/t/can-text-embedding-ada-002-be-made-deterministic/318054 Found the same result using `"text-embedding-3-small"`.	2024-11-21 10:23:10 -08:00
Erick Friis	d1108607f4	multiple: push deprecation removals to 1.0 (#28236 )	2024-11-20 19:56:29 -08:00
Eugene Yurtsev	2acc83f146	mistralai[patch]: 0.2.2 release (#28240 ) mistralai 0.2.2 release	2024-11-20 22:18:15 +00:00
Eugene Yurtsev	1a66175e38	mistral[patch]: Propagate tool call id (#28238 ) mistralai-large-2411 requires a tool call id. Older models accept a tool call id if it's provided: mistral-large-2407, mistral-large-2402.	2024-11-20 17:02:30 -05:00
af su	7c7ee07d30	huggingface[fix]: HuggingFaceEndpointEmbeddings model parameter passing error when async embed (#27953 ) This change refines the handling of _model_kwargs in POST requests. Instead of nesting _model_kwargs as a dictionary under the parameters key, it is now directly unpacked and merged into the request's JSON payload. This ensures that the model parameters are passed correctly and avoids unnecessary nesting. E.g.: ```python import asyncio from langchain_huggingface.embeddings import HuggingFaceEndpointEmbeddings embedding_input = ["This input will get multiplied" * 10000] embeddings = HuggingFaceEndpointEmbeddings( model="http://127.0.0.1:8081/embed", model_kwargs={"truncate": True}, ) # The truncate parameter is handled correctly in the synchronous method embeddings.embed_documents(texts=embedding_input) # The truncate parameter is not handled correctly in the asynchronous method, # and 413 Request Entity Too Large is returned. asyncio.run(embeddings.aembed_documents(texts=embedding_input)) ``` Co-authored-by: af su <saf@zjuici.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-20 19:08:56 +00:00
Eric Pinzur	923ef85105	langchain_chroma: fixed integration tests (#27968 ) Description: * I'm planning to add `Document.id` support to the Chroma VectorStore, but first I wanted to make sure all the integration tests were passing first. They weren't. This PR fixes the broken tests. * I found 2 issues: * This change (from a year ago, exactly :) ) for supporting multi-modal embeddings: https://docs.trychroma.com/deployment/migration#migration-to-0.4.16---november-7,-2023 * This change https://github.com/langchain-ai/langchain/pull/27827 due to an update in the chroma client. Also ran `format` and `lint` on the changes. Note: I am not a member of the Chroma team.	2024-11-20 11:05:02 -08:00
Erick Friis	0dbaf05bb7	standard-tests: rename langchain_standard_tests to langchain_tests, release 0.3.2 (#28203 )	2024-11-18 19:10:39 -08:00
Erick Friis	d9d689572a	openai: release 0.2.9, o1 streaming (#28197 )	2024-11-18 23:54:38 +00:00
Erick Friis	6d2004ee7d	multiple: langchain-standard-tests -> langchain-tests (#28139 )	2024-11-15 11:32:04 -08:00
Elham Badri	d696728278	partners/ollama: Enabled Token Level Streaming when Using Bind Tools for ChatOllama (#27689 ) Description: The issue concerns the unexpected behavior observed using the bind_tools method in LangChain's ChatOllama. When tools are not bound, the llm.stream() method works as expected, returning incremental chunks of content, which is crucial for real-time applications such as conversational agents and live feedback systems. However, when bind_tools([]) is used, the streaming behavior changes, causing the output to be delivered in full chunks rather than incrementally. This change negatively impacts the user experience by breaking the real-time nature of the streaming mechanism. Issue: #26971 --------- Co-authored-by: 4meyDam1e <amey.damle@mail.utoronto.ca> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-11-15 11:36:27 -05:00
ccurme	776e3271e3	standard-tests[patch]: add test for async tool calling (#28133 )	2024-11-15 16:09:50 +00:00
Vadym Barda	6ec688cf2b	xai[patch]: update core (#28092 )	2024-11-13 17:51:51 +00:00
Vadym Barda	09e85c7c4b	xai[patch]: update dependencies (#28067 )	2024-11-12 16:15:17 -05:00
ccurme	00e7b2dada	anthropic[patch]: add examples to API ref (#28065 )	2024-11-12 20:17:02 +00:00
Vadym Barda	48ee322a78	partners: add xAI chat integration (#28032 )	2024-11-12 15:11:29 -05:00
ccurme	2898b95ca7	anthropic[major]: release 0.3.0 (#28063 )	2024-11-12 14:58:00 -05:00
ccurme	5eaa0e8c45	openai[patch]: release 0.2.8 (#28062 )	2024-11-12 14:57:11 -05:00
ccurme	1538ee17f9	anthropic[major]: support python 3.13 (#27916 ) Last week Anthropic released version 0.39.0 of its python sdk, which enabled support for Python 3.13. This release deleted a legacy `client.count_tokens` method, which we currently access during init of the `Anthropic` LLM. Anthropic has replaced this functionality with the [client.beta.messages.count_tokens() API](https://github.com/anthropics/anthropic-sdk-python/pull/726). To enable support for `anthropic >= 0.39.0` and Python 3.13, here we drop support for the legacy token counting method, and add support for the new method via `ChatAnthropic.get_num_tokens_from_messages`. To fully support the token counting API, we update the signature of `get_num_tokens_from_message` to accept tools everywhere. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>	2024-11-12 14:31:07 -05:00
Bagatur	139881b108	openai[patch]: fix azure oai stream check (#28048 )	2024-11-12 15:42:06 +00:00
Bagatur	9611f0b55d	openai[patch]: Release 0.2.7 (#28047 )	2024-11-12 15:16:15 +00:00
Bagatur	33dbfba08b	openai[patch]: default to invoke on o1 stream() (#27983 )	2024-11-08 19:12:59 -08:00
Erick Friis	8a5b9bf2ad	box: migrate to repo (#27969 )	2024-11-07 10:19:22 -08:00
ccurme	a747dbd24b	anthropic[patch]: remove retired model from tests (#27965 ) `claude-instant` was [retired yesterday](https://docs.anthropic.com/en/docs/resources/model-deprecations).	2024-11-07 16:16:29 +00:00
ZhangShenao	c2072d909a	Improvement[Partner] Improve qdrant vector store (#27251 ) - Add static method decorator - Add args for api doc - Fix word spelling Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-07 02:42:41 +00:00
Roman Solomatin	0f85dea8c8	langchain-huggingface: use separate kwargs for queries and docs (#27857 ) Currently `encode_kwargs` is used for both documents and queries, which leads to wrong embeddings. E.g.: ```python model_kwargs = {"device": "cuda", "trust_remote_code": True} encode_kwargs = {"normalize_embeddings": False, "prompt_name": "s2p_query"} model = HuggingFaceEmbeddings( model_name="dunzhang/stella_en_400M_v5", model_kwargs=model_kwargs, encode_kwargs=encode_kwargs, ) query_embedding = np.array( model.embed_query("What are some ways to reduce stress?",) ) document_embedding = np.array( model.embed_documents( [ "There are many effective ways to reduce stress. Some common techniques include deep breathing, meditation, and physical activity. Engaging in hobbies, spending time in nature, and connecting with loved ones can also help alleviate stress. Additionally, setting boundaries, practicing self-care, and learning to say no can prevent stress from building up.", "Green tea has been consumed for centuries and is known for its potential health benefits. It contains antioxidants that may help protect the body against damage caused by free radicals. Regular consumption of green tea has been associated with improved heart health, enhanced cognitive function, and a reduced risk of certain types of cancer. 
The polyphenols in green tea may also have anti-inflammatory and weight loss properties.", ] ) ) print(model._client.similarity(query_embedding, document_embedding)) # output: tensor([[0.8421, 0.3317]], dtype=torch.float64) ``` But according to the [model card](https://huggingface.co/dunzhang/stella_en_400M_v5#sentence-transformers), the expected usage is: ```python model_kwargs = {"device": "cuda", "trust_remote_code": True} encode_kwargs = {"normalize_embeddings": False} query_encode_kwargs = {"normalize_embeddings": False, "prompt_name": "s2p_query"} model = HuggingFaceEmbeddings( model_name="dunzhang/stella_en_400M_v5", model_kwargs=model_kwargs, encode_kwargs=encode_kwargs, query_encode_kwargs=query_encode_kwargs, ) query_embedding = np.array( model.embed_query("What are some ways to reduce stress?", ) ) document_embedding = np.array( model.embed_documents( [ "There are many effective ways to reduce stress. Some common techniques include deep breathing, meditation, and physical activity. Engaging in hobbies, spending time in nature, and connecting with loved ones can also help alleviate stress. Additionally, setting boundaries, practicing self-care, and learning to say no can prevent stress from building up.", "Green tea has been consumed for centuries and is known for its potential health benefits. It contains antioxidants that may help protect the body against damage caused by free radicals. Regular consumption of green tea has been associated with improved heart health, enhanced cognitive function, and a reduced risk of certain types of cancer. The polyphenols in green tea may also have anti-inflammatory and weight loss properties.", ] ) ) print(model._client.similarity(query_embedding, document_embedding)) # tensor([[0.8398, 0.2990]], dtype=torch.float64) ```	2024-11-06 17:35:39 -05:00
ccurme	66966a6e72	openai[patch]: release 0.2.6 (#27924 ) Some additions in support of [predicted outputs](https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs) feature: - Bump openai sdk version - Add integration test - Add example to integration docs The `prediction` kwarg is already plumbed through model invocation.	2024-11-05 23:02:24 +00:00
SHJUN	f6b2f82099	community: chroma error patch(attribute changed on chroma) (#27827 ) There was a change of attribute name which was "max_batch_size". It's now "get_max_batch_size" method. I want to use "create_batches" which is right down below. Please check this PR link. reference: https://github.com/chroma-core/chroma/pull/2305 --------- Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com> Co-authored-by: Prithvi Kannan <46332835+prithvikannan@users.noreply.github.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Jun Yamog <jkyamog@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ono-hiroki <86904208+ono-hiroki@users.noreply.github.com> Co-authored-by: Dobiichi-Origami <56953648+Dobiichi-Origami@users.noreply.github.com> Co-authored-by: Chester Curme <chester.curme@gmail.com> Co-authored-by: Duy Huynh <vndee.huynh@gmail.com> Co-authored-by: Rashmi Pawar <168514198+raspawar@users.noreply.github.com> Co-authored-by: sifatj <26035630+sifatj@users.noreply.github.com> Co-authored-by: Eric Pinzur <2641606+epinzur@users.noreply.github.com> Co-authored-by: Daniel Vu Dao <danielvdao@users.noreply.github.com> Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com> Co-authored-by: Stéphane Philippart <wildagsx@gmail.com>	2024-11-05 19:43:11 +00:00
Bagatur	dfa83531ad	qdrant,nomic[minor]: bump core deps (#27849 )	2024-11-04 20:19:50 +00:00
Bagatur	3b0b7cfb74	chroma[minor]: release 0.2.0 (#27840 )	2024-11-01 18:12:00 -07:00
Bagatur	002e1c9055	airbyte: remove from master (#27837 )	2024-11-01 13:59:34 -07:00
Bagatur	ee63d21915	many: use core 0.3.15 (#27834 )	2024-11-01 20:35:55 +00:00
Bagatur	06420de2e7	integrations[patch]: bump core to 0.3.15 (#27805 )	2024-10-31 11:27:05 -07:00
JiaranI	3952ee31b8	ollama: add pydocstyle linting for ollama (#27686 ) Description: add lint docstrings for ollama module Issue: the issue https://github.com/langchain-ai/langchain/issues/23188 @baskaryan test: ruff check passed. Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-31 03:06:55 +00:00
Bagatur	6691202998	anthropic[patch]: allow multiple sys not at start (#27725 )	2024-10-30 23:56:47 +00:00
ccurme	88bfd60b03	infra: specify python max version of 3.12 for some integration packages (#27740 )	2024-10-30 12:24:48 -04:00
ccurme	bd5ea18a6c	groq[patch]: update standard tests (#27744 ) - Add xfail on integration test (fails [> 50% of the time](https://github.com/langchain-ai/langchain/actions/workflows/scheduled_test.yml)); - Remove xfail on passing unit test.	2024-10-30 15:50:51 +00:00
Andrew Effendi	49517cc1e7	partners/huggingface[patch]: fix HuggingFacePipeline model_id parameter (#27514 ) Description: Fixes issue with model parameter not getting initialized correctly when passing transformers pipeline Issue: https://github.com/langchain-ai/langchain/issues/25915	2024-10-29 14:34:46 +00:00
Erick Friis	583808a7b8	partners/huggingface: release 0.1.1 (#27691 )	2024-10-28 13:39:38 -07:00
Erick Friis	6d524e9566	partners/box: release 0.2.2 (#27690 )	2024-10-28 12:54:20 -07:00
yahya-mouman	6803cb4f34	openai[patch]: add check for none values when summing token usage (#27585 ) Description: Fixes None addition issues when an empty value is passed on	2024-10-28 12:49:43 -07:00
Bagatur	ede953d617	openai[patch]: fix schema formatting util (#27685 )	2024-10-28 15:46:47 +00:00
ccurme	fe87e411f2	groq: fix unit test (#27660 )	2024-10-26 14:57:23 -04:00
Bagatur	d5306899d3	openai[patch]: Release 0.2.4 (#27652 )	2024-10-25 20:26:21 +00:00
Nithish Raghunandanan	0623c74560	couchbase: Add document id to vector search results (#27622 ) Description: Returns the document id along with the Vector Search results Issue: Fixes https://github.com/langchain-ai/langchain/issues/26860 for CouchbaseVectorStore Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-24 21:47:36 +00:00
Hyejun An	6227396e20	partners/HuggingFacePipeline[stream]: Change to use `pipeline` instead of `pipeline.model.generate` in stream() (#26531 )
## Description
I encountered an error while using the `gemma-2-2b-it` model with the `HuggingFacePipeline` class and have implemented a fix to resolve this issue.
### What Is the Problem
```python
import torch
from langchain_huggingface import HuggingFacePipeline
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "google/gemma-2-2b-it"
gemma_2_model = AutoModelForCausalLM.from_pretrained(model_id)
gemma_2_tokenizer = AutoTokenizer.from_pretrained(model_id)
gen = pipeline(
    task="text-generation",
    model=gemma_2_model,
    tokenizer=gemma_2_tokenizer,
    max_new_tokens=1024,
    device=0 if torch.cuda.is_available() else -1,
    temperature=0.5,
    top_p=0.7,
    repetition_penalty=1.1,
    do_sample=True,
)
llm = HuggingFacePipeline(pipeline=gen)
for chunk in llm.stream("Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World."):
    print(chunk, end="", flush=True)
```
This code outputs the following error message:
```
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1258: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
Exception in thread Thread-19 (generate):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1874, in generate
    self._validate_generated_length(generation_config, input_ids_length, has_default_max_length)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1266, in _validate_generated_length
    raise ValueError(
ValueError: Input length of input_ids is 31, but `max_length` is set to 20. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`.
```
In addition, the following error occurs when the number of tokens is reduced.
```python
for chunk in llm.stream("Hello World"):
    print(chunk, end="", flush=True)
```
```
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1258: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1885: UserWarning: You are calling .generate() with the `input_ids` being on a device type different than your model's device. `input_ids` is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put `input_ids` to the correct device by calling for example input_ids = input_ids.to('cuda') before running `.generate()`.
  warnings.warn(
Exception in thread Thread-20 (generate):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2024, in generate
    result = self._sample(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2982, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/gemma2/modeling_gemma2.py", line 994, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/gemma2/modeling_gemma2.py", line 803, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py", line 164, in forward
    return F.embedding(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2267, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
```
With `invoke()`, on the other hand, the output is normal:
```
llm.invoke("Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World.")
```
```
'Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World.\n\nThis is a simple program that prints the phrase "Hello World" to the console. \n\nHere\'s how it works:*\n\n `print("Hello World")`: This line of code uses the `print()` function, which is a built-in function in most programming languages (like Python). The `print()` function takes whatever you put inside its parentheses and displays it on the screen.\n* `"Hello World"`: The text within the double quotes (`"`) is called a string. It represents the message we want to print.\n\n\nLet me know if you\'d like to explore other programming concepts or see more examples! \n'
```
### Problem Analysis
- The kwargs passed when creating the pipeline are applied in `invoke()`, but they are not applied in `stream()`.
- In `stream()`, `inputs = self.pipeline.tokenizer(prompt, return_tensors="pt")` creates the input tensors on the CPU.
- This can crash when the model is on the GPU.
### Solution
Use `self.pipeline` instead of `self.pipeline.model.generate`.
- Original Code
```python
stopping_criteria = StoppingCriteriaList([StopOnTokens()])
inputs = self.pipeline.tokenizer(prompt, return_tensors="pt")
streamer = TextIteratorStreamer(
    self.pipeline.tokenizer,
    timeout=60.0,
    skip_prompt=skip_prompt,
    skip_special_tokens=True,
)
generation_kwargs = dict(
    inputs,
    streamer=streamer,
    stopping_criteria=stopping_criteria,
    **pipeline_kwargs,
)
t1 = Thread(target=self.pipeline.model.generate, kwargs=generation_kwargs)
t1.start()
```
- Updated Code
```python
stopping_criteria = StoppingCriteriaList([StopOnTokens()])
streamer = TextIteratorStreamer(
    self.pipeline.tokenizer,
    timeout=60.0,
    skip_prompt=skip_prompt,
    skip_special_tokens=True,
)
generation_kwargs = dict(
    text_inputs=prompt,
    streamer=streamer,
    stopping_criteria=stopping_criteria,
    **pipeline_kwargs,
)
t1 = Thread(target=self.pipeline, kwargs=generation_kwargs)
t1.start()
```
By calling the `pipeline` directly, the pipeline's own `kwargs` are applied, and there is no need to manage the `device` of the tensors produced by the `tokenizer`.
> Because of the switch to `pipeline`, `text_inputs=prompt` is now passed directly in `generation_kwargs`.
## Issue
None
## Dependencies
None
## Twitter handle
None
---------
Co-authored-by: Vadym Barda <vadym@langchain.dev>	2024-10-24 16:49:43 -04:00
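The thread-plus-streamer pattern used in the commit above can be sketched with the standard library alone. Everything here is an assumption made for illustration: `QueueStreamer` is a hypothetical stand-in for `TextIteratorStreamer` (it receives text rather than token ids), and `fake_pipeline` is a hypothetical stand-in for a text-generation pipeline call. The sketch only shows why passing the prompt and the caller's kwargs to the pipeline callable itself lets those kwargs take effect while chunks are consumed from another thread.

```python
import queue
import threading
from typing import Any, Iterator


class QueueStreamer:
    """Hypothetical stand-in for a streamer: a queue that can be iterated."""

    _END = object()  # sentinel marking the end of generation

    def __init__(self, timeout: float = 5.0) -> None:
        self._queue: "queue.Queue[Any]" = queue.Queue()
        self._timeout = timeout

    def put(self, text: str) -> None:
        self._queue.put(text)

    def end(self) -> None:
        self._queue.put(self._END)

    def __iter__(self) -> Iterator[str]:
        while True:
            item = self._queue.get(timeout=self._timeout)
            if item is self._END:
                return
            yield item


def fake_pipeline(text_inputs: str, streamer: QueueStreamer, max_new_tokens: int = 3) -> None:
    """Hypothetical pipeline: echoes up to max_new_tokens words to the streamer."""
    try:
        for word in text_inputs.split()[:max_new_tokens]:
            streamer.put(word + " ")
    finally:
        streamer.end()  # always unblock the consumer, even on error


def stream(prompt: str, **pipeline_kwargs: Any) -> Iterator[str]:
    """Run the pipeline on a worker thread and yield chunks as they arrive."""
    streamer = QueueStreamer()
    generation_kwargs = dict(text_inputs=prompt, streamer=streamer, **pipeline_kwargs)
    worker = threading.Thread(target=fake_pipeline, kwargs=generation_kwargs)
    worker.start()
    yield from streamer
    worker.join()


chunks = list(stream("Hello World Hello World", max_new_tokens=2))
```

Because the kwargs are forwarded to the callable that owns the generation settings, `max_new_tokens=2` limits the streamed output here in the same way it would for a non-streaming call.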
Bagatur	655ced84d7	openai[patch]: accept json schema response format directly (#27623 ) fix #25460 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-24 18:19:15 +00:00


1028 Commits