langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-16 16:11:02 +00:00

Author	SHA1	Message	Date
ccurme	56499cf58b	openai[patch]: unskip test and relax tolerance in embeddings comparison (#28262)
From what I can tell, the response from the SDK is not deterministic:
```python
import numpy as np
import openai

documents = ["disallowed special token '<|endoftext|>'"]
model = "text-embedding-ada-002"

direct_output_1 = (
    openai.OpenAI()
    .embeddings.create(input=documents, model=model)
    .data[0]
    .embedding
)

for i in range(10):
    direct_output_2 = (
        openai.OpenAI()
        .embeddings.create(input=documents, model=model)
        .data[0]
        .embedding
    )
    print(f"{i}: {np.isclose(direct_output_1, direct_output_2).all()}")
```
```
0: True
1: True
2: True
3: True
4: False
5: True
6: True
7: True
8: True
9: True
```
See related discussion here: https://community.openai.com/t/can-text-embedding-ada-002-be-made-deterministic/318054
Found the same result using `"text-embedding-3-small"`.	2024-11-21 10:23:10 -08:00
Anton Dubovik	3e2cb4e8a4	openai: embeddings: supported chunk_size when check_embedding_ctx_length is disabled (#23767) Chunking of the input array controlled by `self.chunk_size` is ignored when `self.check_embedding_ctx_length` is disabled. Effectively, the chunk size is assumed to be equal to 1 in that case, which is surprising. The PR takes into account the `self.chunk_size` passed by the user. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-09-20 16:58:45 -07:00
Bagatur	0b4608f71e	infra: temp skip oai embeddings test (#25148)	2024-08-07 17:51:39 +00:00
Bagatur	a0c2281540	infra: update mypy 1.10, ruff 0.5 (#23721)
```python
"""python scripts/update_mypy_ruff.py"""
import glob
import tomllib
from pathlib import Path

import toml
import subprocess
import re

ROOT_DIR = Path(__file__).parents[1]


def main():
    for path in glob.glob(str(ROOT_DIR / "libs/*/pyproject.toml"), recursive=True):
        print(path)
        with open(path, "rb") as f:
            pyproject = tomllib.load(f)
        try:
            pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = (
                "^1.10"
            )
            pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = (
                "^0.5"
            )
        except KeyError:
            continue
        with open(path, "w") as f:
            toml.dump(pyproject, f)
        cwd = "/".join(path.split("/")[:-1])
        completed = subprocess.run(
            "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )
        logs = completed.stdout.split("\n")

        to_ignore = {}
        for l in logs:
            if re.match("^(.*)\:(\d+)\: error:.*\[(.*)\]", l):
                path, line_no, error_type = re.match(
                    "^(.*)\:(\d+)\: error:.*\[(.*)\]", l
                ).groups()
                if (path, line_no) in to_ignore:
                    to_ignore[(path, line_no)].append(error_type)
                else:
                    to_ignore[(path, line_no)] = [error_type]
        print(len(to_ignore))
        for (error_path, line_no), error_types in to_ignore.items():
            all_errors = ", ".join(error_types)
            full_path = f"{cwd}/{error_path}"
            try:
                with open(full_path, "r") as f:
                    file_lines = f.readlines()
            except FileNotFoundError:
                continue
            file_lines[int(line_no) - 1] = (
                file_lines[int(line_no) - 1][:-1] + f"  # type: ignore[{all_errors}]\n"
            )
            with open(full_path, "w") as f:
                f.write("".join(file_lines))

        subprocess.run(
            "poetry run ruff format .; poetry run ruff --select I --fix .",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )


if __name__ == "__main__":
    main()
```
	2024-07-03 10:33:27 -07:00
Bagatur	bef50ded63	openai[patch]: fix special token default behavior (#21131) By default handle special sequences as regular text	2024-04-30 20:08:24 -04:00
ccurme	22da9f5f3f	update scheduled tests (#20526) repurpose scheduled tests to test over provider packages	2024-04-16 16:49:46 -04:00
Erick Friis	be92cf57ca	openai[patch]: fix azure embedding length check (#19870)	2024-04-01 10:26:15 -07:00
Erick Friis	a05fb19f42	openai[patch]: remove numpy dep (#18034)	2024-02-23 21:12:05 +00:00
Erick Friis	bb3b6bde33	openai[minor]: change to secretstr (#16803)	2024-01-30 15:49:56 -08:00
Bagatur	61e876aad8	openai[patch]: Explicitly support embedding dimensions (#16596)	2024-01-25 15:16:04 -08:00
Erick Friis	ebc75c5ca7	openai[minor]: implement langchain-openai package (#15503)
Todo
- [x] copy over integration tests
- [x] update docs with new instructions in #15513
- [x] add linear ticket to bump core -> community, community->langchain, and core->openai deps
- [ ] (optional): add `pip install langchain-openai` command to each notebook using it
- [x] Update docstrings to not need `openai` install
- [x] Add serialization
- [x] deprecate old models
Contributor steps:
- [x] Add secret names to manual integrations workflow in .github/workflows/_integration_test.yml
- [x] Add secrets to release workflow (for pre-release testing) in .github/workflows/_release.yml
Maintainer steps (Contributors should not do these):
- [x] set up pypi and test pypi projects
- [x] add credential secrets to Github Actions
- [ ] add package to conda-forge
Functional changes to existing classes:
- now relies on openai client v1 (1.6.1) via concrete dep in langchain-openai package
Codebase organization
- some function calling stuff moved to `langchain_core.utils.function_calling` in order to be used in both community and langchain-openai	2024-01-05 15:03:28 -08:00
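
Commit 3e2cb4e8a4 in the list above describes honoring a user-supplied `chunk_size` when batching the input array for embedding. The following is a minimal sketch of that batching idea only, not the langchain-openai implementation: `iter_chunks`, `embed_in_chunks`, and the `embed_batch` callable are hypothetical names introduced for illustration.

```python
from typing import Callable, Iterator, List, Sequence


def iter_chunks(texts: Sequence[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `texts`, each with at most `chunk_size` items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    for start in range(0, len(texts), chunk_size):
        yield list(texts[start : start + chunk_size])


def embed_in_chunks(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],  # hypothetical backend call
    chunk_size: int,
) -> List[List[float]]:
    """Call `embed_batch` once per chunk and concatenate the results in input order."""
    embeddings: List[List[float]] = []
    for batch in iter_chunks(texts, chunk_size):
        embeddings.extend(embed_batch(batch))
    return embeddings
```

With `chunk_size=1` this degenerates to one backend call per input, which is the behavior the commit message calls surprising; a larger `chunk_size` reduces the number of calls.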

11 Commits
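
Commit 56499cf58b relaxes the tolerance used when comparing embeddings because repeated calls can return slightly different vectors. A minimal sketch of such a tolerance-based comparison, using only the standard library, might look like the following. The helper name `embeddings_close` and the default tolerance values are assumptions chosen for illustration; they are not taken from the PR.

```python
import math
from typing import Sequence


def embeddings_close(
    a: Sequence[float],
    b: Sequence[float],
    *,
    rel_tol: float = 1e-5,  # assumed value, not from the PR
    abs_tol: float = 1e-3,  # assumed value, not from the PR
) -> bool:
    """Return True if both vectors have equal length and every pair of
    components is within the given relative or absolute tolerance."""
    if len(a) != len(b):
        return False
    return all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )
```

A test could then assert `embeddings_close(sdk_output, wrapper_output)` rather than exact equality, so that small run-to-run differences do not cause failures.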