langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-06-21 06:14:37 +00:00

Author	SHA1	Message	Date
Bagatur	ede953d617	openai[patch]: fix schema formatting util (#27685 )	2024-10-28 15:46:47 +00:00
Bagatur	655ced84d7	openai[patch]: accept json schema response format directly (#27623 ) fix #25460 --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-24 18:19:15 +00:00
Erick Friis	7d65a32ee0	openai: audio modality, remove sockets from unit tests (#27436 )	2024-10-18 08:02:09 -07:00
Bagatur	ce33c4fa40	openai[patch]: default temp=1 for o1 (#27206 )	2024-10-08 15:45:21 -07:00
Bagatur	bd5b335cb4	standard-tests[patch]: fix oai usage metadata test (#27122 )	2024-10-04 20:00:48 +00:00
Bagatur	4935a14314	core,integrations[minor]: Don't error on fields in model_kwargs (#27110 ) Given the current erroring behavior, every time we've moved a kwarg from model_kwargs and made it its own field, that was a breaking change. This updates the behavior to support the old instantiations / serializations, assuming build_extra_kwargs is not itself used externally and does not need to be kept backwards compatible.	2024-10-04 11:30:27 -07:00
Erick Friis	e8e5d67a8d	openai: fix None token detail (#27091 ) happens in Azure	2024-10-04 01:25:38 +00:00
Bagatur	c09da53978	openai[patch]: add usage metadata details (#27080 )	2024-10-03 14:01:03 -07:00
ccurme	7091a1a798	openai[patch]: increase token limit in azure integration tests (#26901 ) `test_json_mode` occasionally runs into this	2024-09-26 14:31:33 +00:00
ccurme	2a4c5713cd	openai[patch]: fix azure integration tests (#26791 )	2024-09-23 17:49:15 -04:00
Bagatur	e1e4f88b3e	openai[patch]: enable Azure structured output, parallel_tool_calls=False, tool_choice=required (#26599 ) response_format=json_schema, tool_choice=required, parallel_tool_calls are all supported for gpt-4o on azure.	2024-09-22 22:25:22 -07:00
Anton Dubovik	3e2cb4e8a4	openai: embeddings: supported chunk_size when check_embedding_ctx_length is disabled (#23767 ) Chunking of the input array controlled by `self.chunk_size` is being ignored when `self.check_embedding_ctx_length` is disabled. Effectively, the chunk size is assumed to be equal to 1 in such a case. This is surprising. The PR takes into account `self.chunk_size` passed by the user. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-20 16:58:45 -07:00
Bagatur	5ced41bf50	anthropic[patch]: fix tool call and tool res image_url handling (#26587 ) Co-authored-by: ccurme <chester.curme@gmail.com>	2024-09-17 14:30:07 -07:00
Erick Friis	c2a3021bb0	multiple: pydantic 2 compatibility, v0.3 (#26443 ) Signed-off-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com> Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com> Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: ccurme <chester.curme@gmail.com> Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com> Co-authored-by: ZhangShenao <15201440436@163.com> Co-authored-by: Friso H. Kingma <fhkingma@gmail.com> Co-authored-by: ChengZi <chen.zhang@zilliz.com> Co-authored-by: Nuno Campos <nuno@langchain.dev> Co-authored-by: Morgante Pell <morgantep@google.com>	2024-09-13 14:38:45 -07:00
liuhetian	7fc9e99e21	openai[patch]: get output_type when using with_structured_output (#26307 ) - This allows pydantic to correctly resolve annotations necessary when using openai new param `json_schema` Resolves issue: #26250 --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-09-13 11:42:01 -07:00
Harrison Chase	28ad244e77	community, openai: support nested dicts (#26414 ) needed for thinking tokens --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-12 21:47:47 -07:00
Bagatur	dba308447d	fmt	2024-09-04 11:28:04 -07:00
Bagatur	3ec93c2817	standard-tests[patch]: add Ser/Des test	2024-09-04 10:24:06 -07:00
Friso H. Kingma	af11fbfbf6	langchain_openai: Make sure the response from the async client in the astream method of ChatOpenAI is properly awaited in case of "include_response_headers=True" (#26031 ) - Description: This is a one-line change. The `self.async_client.with_raw_response.create(payload)` call is not properly awaited within the `_astream` method. In `_agenerate` this is done already, but it was likely forgotten in the other method. - Issue: Not applicable - Dependencies: No dependencies required. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-09-04 13:26:48 +00:00
Eugene Yurtsev	bc3b851f08	openai[patch]: Upgrade @root_validators in preparation for pydantic 2 migration (#25491 ) * Upgrade @root_validator in openai pkg * Ran notebooks for all but AzureAI embeddings --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-09-03 14:42:24 -07:00
Bagatur	bc3b02651c	standard-tests[patch]: test init from env vars (#25983 )	2024-09-03 19:05:39 +00:00
ccurme	2e5c379632	openai[patch]: fix get_num_tokens for function calls (#25785 ) Closes https://github.com/langchain-ai/langchain/issues/25784 See additional discussion [here](`0a4ee864e9 (r145147380)`).	2024-08-27 20:18:19 +00:00
Hyman	58e72febeb	openai:compatible with other llm usage meta data (#24500 ) - Description: Compatible with the usage metadata of other LLMs (e.g. deepseek-chat, glm-4) - Issue: N/A - Dependencies: no new dependencies added - Tests: libs/partners/openai/tests/unit_tests/chat_models/test_base.py ```shell cd libs/partners/openai poetry run pytest tests/unit_tests/chat_models/test_base.py::test_openai_astream poetry run pytest tests/unit_tests/chat_models/test_base.py::test_openai_stream poetry run pytest tests/unit_tests/chat_models/test_base.py::test_deepseek_astream poetry run pytest tests/unit_tests/chat_models/test_base.py::test_deepseek_stream poetry run pytest tests/unit_tests/chat_models/test_base.py::test_glm4_astream poetry run pytest tests/unit_tests/chat_models/test_base.py::test_glm4_stream ``` --------- Co-authored-by: hyman <hyman@xiaozancloud.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-08-23 16:59:14 -07:00
Yusuke Fukasawa	0258cb96fa	core[patch]: add additionalProperties recursively to oai function if strict (#25169 ) Hello. First of all, thank you for maintaining such a great project. ## Description In https://github.com/langchain-ai/langchain/pull/25123, support for structured_output is added. However, `"additionalProperties": false` needs to be included at all levels when a nested object is generated. error from current code: https://gist.github.com/fufufukakaka/e9b475300e6934853d119428e390f204 ``` BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'JokeWithEvaluation': In context=('properties', 'self_evaluation'), 'additionalProperties' is required to be supplied and to be false", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}} ``` Reference: [Introducing Structured Outputs in the API](https://openai.com/index/introducing-structured-outputs-in-the-api/) ```json { "model": "gpt-4o-2024-08-06", "messages": [ { "role": "system", "content": "You are a helpful math tutor." }, { "role": "user", "content": "solve 8x + 31 = 2" } ], "response_format": { "type": "json_schema", "json_schema": { "name": "math_response", "strict": true, "schema": { "type": "object", "properties": { "steps": { "type": "array", "items": { "type": "object", "properties": { "explanation": { "type": "string" }, "output": { "type": "string" } }, "required": ["explanation", "output"], "additionalProperties": false } }, "final_answer": { "type": "string" } }, "required": ["steps", "final_answer"], "additionalProperties": false } } } } ``` In the current code, `"additionalProperties": false` is only added at the last level. This PR introduces the `_add_additional_properties_key` function, which recursively adds `"additionalProperties": false` to the entire JSON schema for the request. Twitter handle: `@fukkaa1225` Thank you! --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-08-23 00:08:58 +00:00
ccurme	b83f1eb0d5	core, partners: implement standard tracing params for LLMs (#25410 )	2024-08-16 13:18:09 -04:00
ccurme	01ecd0acba	openai[patch]: fix json mode for Azure (#25488 ) https://github.com/langchain-ai/langchain/issues/25479 https://github.com/langchain-ai/langchain/issues/25485 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2024-08-16 09:50:50 -07:00
Bagatur	0b4608f71e	infra: temp skip oai embeddings test (#25148 )	2024-08-07 17:51:39 +00:00
Bagatur	09fbce13c5	openai[patch]: ChatOpenAI.with_structured_output json_schema support (#25123 )	2024-08-07 08:09:07 -07:00
Bagatur	78403a3746	core[patch], openai[patch]: enable strict tool calling (#25111 ) Introduced https://openai.com/index/introducing-structured-outputs-in-the-api/	2024-08-06 21:21:06 +00:00
ccurme	a197a8e184	openai[patch]: move test (#24552 ) No-override tests (https://github.com/langchain-ai/langchain/pull/24407) include a condition that integrations not implement additional tests.	2024-07-23 10:22:22 -04:00
Erick Friis	2c6b9e8771	standard-tests: add override check (#24407 )	2024-07-22 23:38:01 +00:00
Bagatur	7d83189b19	openai[patch]: use model_name in AzureOpenAI.ls_model_name (#24366 )	2024-07-17 15:24:05 -07:00
Erick Friis	1e9cc02ed8	openai: raw response headers (#24150 )	2024-07-16 09:54:54 -07:00
ccurme	cb95198398	standard-tests[patch]: add tests for runnables as tools and streaming usage metadata (#24153 )	2024-07-11 18:30:05 -04:00
Bagatur	5fd1e67808	core[minor], integrations...[patch]: Support ToolCall as Tool input and ToolMessage as Tool output (#24038 ) Changes: - ToolCall, InvalidToolCall and ToolCallChunk can all accept a "type" parameter now - LLM integration packages add "type" to all the above - Tool supports ToolCall inputs that have "type" specified - Tool outputs ToolMessage when a ToolCall is passed as input - Tools can separately specify ToolMessage.content and ToolMessage.raw_output - Tools emit events for validation errors (using on_tool_error and on_tool_end) Example: ```python @tool("structured_api", response_format="content_and_raw_output") def _mock_structured_tool_with_raw_output( arg1: int, arg2: bool, arg3: Optional[dict] = None ) -> Tuple[str, dict]: """A Structured Tool""" return f"{arg1} {arg2}", {"arg1": arg1, "arg2": arg2, "arg3": arg3} def test_tool_call_input_tool_message_with_raw_output() -> None: tool_call: Dict = { "name": "structured_api", "args": {"arg1": 1, "arg2": True, "arg3": {"img": "base64string..."}}, "id": "123", "type": "tool_call", } expected = ToolMessage("1 True", raw_output=tool_call["args"], tool_call_id="123") tool = _mock_structured_tool_with_raw_output actual = tool.invoke(tool_call) assert actual == expected tool_call.pop("type") with pytest.raises(ValidationError): tool.invoke(tool_call) actual_content = tool.invoke(tool_call["args"]) assert actual_content == expected.content ``` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-11 14:54:02 -07:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721 ) ```python """python scripts/update_mypy_ruff.py""" import glob import tomllib from pathlib import Path import toml import subprocess import re ROOT_DIR = Path(__file__).parents[1] def main(): for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True): print(path) with open(path, "rb") as f: pyproject = tomllib.load(f) try: pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = ( "^1.10" ) pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = ( "^0.5" ) except KeyError: continue with open(path, "w") as f: toml.dump(pyproject, f) cwd = "/".join(path.split("/")[:-1]) completed = subprocess.run( "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color", cwd=cwd, shell=True, capture_output=True, text=True, ) logs = completed.stdout.split("\n") to_ignore = {} for l in logs: if re.match("^(.*)\:(\d+)\: error:.*\[(.*)\]", l): path, line_no, error_type = re.match( "^(.*)\:(\d+)\: error:.*\[(.*)\]", l ).groups() if (path, line_no) in to_ignore: to_ignore[(path, line_no)].append(error_type) else: to_ignore[(path, line_no)] = [error_type] print(len(to_ignore)) for (error_path, line_no), error_types in to_ignore.items(): all_errors = ", ".join(error_types) full_path = f"{cwd}/{error_path}" try: with open(full_path, "r") as f: file_lines = f.readlines() except FileNotFoundError: continue file_lines[int(line_no) - 1] = ( file_lines[int(line_no) - 1][:-1] + f" # type: ignore[{all_errors}]\n" ) with open(full_path, "w") as f: f.write("".join(file_lines)) subprocess.run( "poetry run ruff format .; poetry run ruff --select I --fix .", cwd=cwd, shell=True, capture_output=True, text=True, ) if __name__ == "__main__": main() ```	2024-07-03 10:33:27 -07:00
Chip Davis	04bc5f1a95	partners[azure]: fix having openai_api_base set for other packages (#22068 ) This fix is for #21726. When having other packages installed that require the `openai_api_base` environment variable, users are not able to instantiate the AzureChatModels or AzureEmbeddings. This PR adds a new value `ignore_openai_api_base` which is a bool. When set to True, it sets `openai_api_base` to `None` Two new tests were added for the `test_azure` and a new file `test_azure_embeddings` A different approach may be better for this. If you can think of better logic, let me know and I can adjust it. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-07-01 18:35:20 +00:00
ccurme	390ee8d971	standard-tests: add test for structured output (#23631 ) - add test for structured output - fix bug with structured output for Azure - better testing on Groq (break out Mixtral + Llama3 and add xfails where needed)	2024-06-28 15:01:40 -04:00
Bagatur	9d145b9630	openai[patch]: fix tool calling token counting (#23408 ) Resolves https://github.com/langchain-ai/langchain/issues/23388	2024-06-25 10:34:25 -07:00
Bagatur	8698cb9b28	infra: add more formatter rules to openai (#23189 ) Turns on https://docs.astral.sh/ruff/settings/#format_docstring-code-format and https://docs.astral.sh/ruff/settings/#format_skip-magic-trailing-comma ```toml [tool.ruff.format] docstring-code-format = true skip-magic-trailing-comma = true ```	2024-06-19 11:39:58 -07:00
Bagatur	0a4ee864e9	openai[patch]: image token counting (#23147 ) Resolves #23000 --------- Co-authored-by: isaac hershenson <ihershenson@hmc.edu> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-06-19 10:41:47 -07:00
Bagatur	90559fde70	openai[patch], standard-tests[patch]: don't pass in falsey stop vals (#23153 ) adds an image input test to standard-tests as well	2024-06-18 18:13:13 -07:00
Bagatur	d96f67b06f	standard-tests[patch]: Update chat model standard tests (#22378 ) - Refactor standard test classes to make them easier to configure - Update openai to support stop_sequences init param - Update groq to support stop_sequences init param - Update fireworks to support max_retries init param - Update ChatModel.bind_tools to type tool_choice - Update groq to handle tool_choice="any". this may be controversial --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-17 13:37:41 -07:00
ccurme	722c8f50ea	openai[patch]: add stream_usage parameter (#22854 ) Here we add `stream_usage` to ChatOpenAI as: 1. a boolean attribute 2. a kwarg to _stream and _astream. Question: should the `stream_usage` attribute be `bool`, or `bool | None`? Currently I've kept it `bool` and defaulted to False. It was implemented on [ChatAnthropic](`e832bbb486/libs/partners/anthropic/langchain_anthropic/chat_models.py (L535)`) as a bool. However, to maintain support for users who access the behavior via OpenAI's `stream_options` param, this ends up being possible: ```python llm = ChatOpenAI(model_kwargs={"stream_options": {"include_usage": True}}) assert not llm.stream_usage ``` (and this model will stream token usage). Some options for this: - it's ok - make the `stream_usage` attribute bool or None - make an `__init__` for ChatOpenAI, set a `._stream_usage` attribute and read `.stream_usage` from a property Open to other ideas as well.	2024-06-17 13:35:18 -04:00
Hakan Özdemir	c437b1aab7	[Partner]: Add metadata to stream response (#22716 ) Adds `response_metadata` to stream responses from OpenAI. This is returned with `invoke` normally, but wasn't implemented for `stream`. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-06-17 09:46:50 -04:00
ccurme	af1f723ada	openai: don't override stream_options default (#22242 ) ChatOpenAI supports a kwarg `stream_options` which can take values `{"include_usage": True}` and `{"include_usage": False}`. Setting include_usage to True adds a message chunk to the end of the stream with usage_metadata populated. In this case the final chunk no longer includes `"finish_reason"` in the `response_metadata`. This is the current default and is not yet released. Because this could be disruptive to workflows, here we remove this default. The default will now be consistent with OpenAI's API (see parameter [here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)). Examples: ```python from langchain_openai import ChatOpenAI llm = ChatOpenAI() for chunk in llm.stream("hi"): print(chunk) ``` ``` content='' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' content='Hello' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' content='!' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' content='' response_metadata={'finish_reason': 'stop'} id='run-8cff4721-2acd-4551-9bf7-1911dae46b92' ``` ```python for chunk in llm.stream("hi", stream_options={"include_usage": True}): print(chunk) ``` ``` content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='Hello' id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='!' id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='' response_metadata={'finish_reason': 'stop'} id='run-39ab349b-f954-464d-af6e-72a0927daa27' content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17} ``` ```python llm = ChatOpenAI().bind(stream_options={"include_usage": True}) for chunk in llm.stream("hi"): print(chunk) ``` ``` content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='Hello' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='!' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='' response_metadata={'finish_reason': 'stop'} id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17} ```	2024-05-29 10:30:40 -04:00
ccurme	9a010fb761	openai: read stream_options (#21548 ) OpenAI recently added a `stream_options` parameter to its chat completions API (see [release notes](https://platform.openai.com/docs/changelog/added-chat-completions-stream-usage)). When this parameter is set to `{"include_usage": True}`, an extra "empty" message is added to the end of a stream containing token usage. Here we propagate token usage to `AIMessage.usage_metadata`. We enable this feature by default. Streams would now include an extra chunk at the end, after the chunk with `response_metadata={'finish_reason': 'stop'}`. New behavior: ``` [AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='Hello', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='!', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'), AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17})] ``` Old behavior (accessible by passing `stream_options={"include_usage": False}` into (a)stream): ``` [AIMessageChunk(content='', id='run-1312b971-c5ea-4d92-9015-e6604535f339'), AIMessageChunk(content='Hello', id='run-1312b971-c5ea-4d92-9015-e6604535f339'), AIMessageChunk(content='!', id='run-1312b971-c5ea-4d92-9015-e6604535f339'), AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-1312b971-c5ea-4d92-9015-e6604535f339')] ``` From what I can tell this is not yet implemented in Azure, so we enable only for ChatOpenAI.	2024-05-24 13:20:56 -04:00
Eugene Yurtsev	2d693c484e	docs: fix some spelling mistakes caught by newest version of code spell (#22090 ) Going to merge this even though it doesn't pass all tests, and open a separate PR for the remaining spelling mistakes.	2024-05-23 16:59:11 -04:00
ccurme	181dfef118	core, standard tests, partner packages: add test for model params (#21677 ) 1. Adds `.get_ls_params` to BaseChatModel which returns ```python class LangSmithParams(TypedDict, total=False): ls_provider: str ls_model_name: str ls_model_type: Literal["chat"] ls_temperature: Optional[float] ls_max_tokens: Optional[int] ls_stop: Optional[List[str]] ``` by default it will only return ```python {ls_model_type="chat", ls_stop=stop} ``` 2. Add these params to inheritable metadata in `CallbackManager.configure` 3. Implement `.get_ls_params` and populate all params for Anthropic + all subclasses of BaseChatOpenAI Sample trace: https://smith.langchain.com/public/d2962673-4c83-47c7-b51e-61d07aaffb1b/r OpenAI: <img width="984" alt="Screenshot 2024-05-17 at 10 03 35 AM" src="https://github.com/langchain-ai/langchain/assets/26529506/2ef41f74-a9df-4e0e-905d-da74fa82a910"> Anthropic: <img width="978" alt="Screenshot 2024-05-17 at 10 06 07 AM" src="https://github.com/langchain-ai/langchain/assets/26529506/39701c9f-7da5-4f1a-ab14-84e9169d63e7"> Mistral (and all others for which params are not yet populated): <img width="977" alt="Screenshot 2024-05-17 at 10 08 43 AM" src="https://github.com/langchain-ai/langchain/assets/26529506/37d7d894-fec2-4300-986f-49a5f0191b03">	2024-05-17 13:51:26 -04:00
ccurme	4170e72a42	openai: fix loads unit test (#21542 ) following changes to tests in core here: https://github.com/langchain-ai/langchain/pull/21342/files	2024-05-10 18:46:34 +00:00
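
Note on commit 0258cb96fa above: its message says that `"additionalProperties": false` must appear at every object level of a strict JSON schema, not only the last one. The following is a minimal sketch of that idea in plain Python. It is not the `_add_additional_properties_key` function from the PR; the names `add_additional_properties_false` and `_visit` are hypothetical, and the sketch assumes that object levels are marked with `"type": "object"`.

```python
from copy import deepcopy
from typing import Any


def add_additional_properties_false(schema: Any) -> Any:
    """Return a copy of `schema` with "additionalProperties": False
    set on every nested dict whose "type" is "object"."""
    result = deepcopy(schema)  # leave the caller's schema untouched
    _visit(result)
    return result


def _visit(node: Any) -> None:
    """Walk dicts and lists in place, marking each object-typed schema."""
    if isinstance(node, dict):
        if node.get("type") == "object":
            node["additionalProperties"] = False
        for value in list(node.values()):
            _visit(value)
    elif isinstance(node, list):
        for item in node:
            _visit(item)


# Trimmed-down version of the schema quoted in the commit message.
example = {
    "type": "object",
    "properties": {
        "steps": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"explanation": {"type": "string"}},
                "required": ["explanation"],
            },
        },
        "final_answer": {"type": "string"},
    },
    "required": ["steps", "final_answer"],
}

strict_schema = add_additional_properties_false(example)
```

Here both the top-level object and the object under `steps.items` receive the key, while string and array nodes are left alone. A production version would also need to decide how to treat literal values (for example under `default` or `enum`) that happen to look like schemas.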

80 Commits