Thank you for contributing to LangChain!
bilibili-api-python uses the https://github.com/Nemo2011/bilibili-api
repo. Change the link to the correct address.
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
**Description:** Update module imports for Fireworks documentation
**Issue:** Module imports were missing or in the wrong location
**Dependencies:** None
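For reference, a sketch of the partner-package import style the updated
docs point to (illustrative; the exact symbols vary by notebook):

```python
# All three classes are exported by the langchain_fireworks partner package.
from langchain_fireworks import ChatFireworks, Fireworks, FireworksEmbeddings
```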
**Description:** Update import paths and move to LCEL for the llama.cpp
examples
**Issue:** Update import paths to reflect package refactoring and move
chains to LCEL in examples
**Dependencies:** None
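As an illustration of the LCEL style the examples were moved to (a
sketch; the model path and prompt are placeholders, not taken from the
notebooks):

```python
from langchain_community.llms import LlamaCpp
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate

llm = LlamaCpp(model_path="/path/to/model.gguf")  # placeholder path
prompt = PromptTemplate.from_template("Question: {question}\nAnswer:")

# LCEL composition replaces the legacy chain classes used before.
chain = prompt | llm | StrOutputParser()
chain.invoke({"question": "What is LangChain?"})
```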
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
**Description:** Move the FireworksEmbeddings documentation from
langchain_fireworks/docs/ to docs/integration/text_embedding/
**Issue:** FireworksEmbeddings documentation was not in the correct
location
**Dependencies:** None
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description:**
Currently, `CacheBackedEmbeddings` computes vectors for *all* uncached
documents before updating the store. This pull request updates the
embedding computation loop to compute embeddings in batches, updating
the store after each batch.
I noticed this when I tried `CacheBackedEmbeddings` on our 30k document
set and the cache directory hadn't appeared on disk after 30 minutes.
The motivation is to minimize compute/data loss when problems occur:
* If there is a transient embedding failure (e.g. a network outage at
the embedding endpoint triggers an exception), at least the completed
vectors are written to the store instead of being discarded.
* If there is an issue with the store (e.g. no write permissions), the
condition is detected early without computing (and discarding!) all the
vectors.
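A minimal sketch of how the batching surfaces to callers, assuming it is
exposed as a `batch_size` argument on `from_bytes_store` (the underlying
embedder and store below are just examples):

```python
from langchain.embeddings import CacheBackedEmbeddings
from langchain.storage import LocalFileStore
from langchain_openai import OpenAIEmbeddings

underlying = OpenAIEmbeddings()
store = LocalFileStore("./embedding_cache/")
embedder = CacheBackedEmbeddings.from_bytes_store(
    underlying,
    store,
    namespace=underlying.model,
    batch_size=100,  # assumed parameter: write vectors to the store every 100 docs
)

# If embedding fails partway through, batches already written stay cached.
vectors = embedder.embed_documents(["doc one", "doc two"])
```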
**Issue:**
Implements enhancement #18026.
**Testing:**
I was unable to run unit tests; details in [this
post](https://github.com/langchain-ai/langchain/discussions/15019#discussioncomment-8576684).
---------
Signed-off-by: chrispy <chrispy@synopsys.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
The `retryFailed` option retries all failed links, one at a time, with
the goal of not triggering bot protection.
`microsoft.com` is now hard-coded into the whitelist.
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- **PR message**:
- **Description:** Update the Slack toolkit doc to use an agent that
supports multiple inputs. Using a ReAct agent causes a ValidationError
when invoking the Slack tools: the agent returns a single string like
`'{"channel": "C05LDF54S21", "message": "Hello, world!"}'`, but the
ReAct agent does not support multi-input tools (see the sketch after
this list).
- **Issue:** This is related to this
[Discussion#18083](https://github.com/langchain-ai/langchain/discussions/18083)
- **Dependencies:** No dependencies required
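For context, a hedged sketch of an agent that handles multi-input tools
(the model name and hub prompt are illustrative, not taken from the
doc):

```python
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_community.agent_toolkits import SlackToolkit
from langchain_openai import ChatOpenAI

tools = SlackToolkit().get_tools()
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
prompt = hub.pull("hwchase17/openai-tools-agent")  # illustrative prompt

# A tool-calling agent passes structured arguments to each tool; a ReAct
# agent would emit one JSON string and fail the tools' input validation.
agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)
executor.invoke({"input": "Send 'Hello, world!' to the #general channel."})
```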
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
**Description**:
This PR enables VectorStore autoconfiguration for Infinispan: if the
metadata fields are only of basic types, the protobuf config will be
generated automatically for the user.
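A minimal sketch of what autoconfiguration means for users, assuming a
local Infinispan server with default settings (the embedding model is
illustrative):

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import InfinispanVS

# All metadata values are basic types (strings here), so the protobuf
# schema is generated automatically rather than supplied by the user.
vectorstore = InfinispanVS.from_texts(
    texts=["a cat chased a mouse", "a dog fetched a ball"],
    embedding=HuggingFaceEmbeddings(),
    metadatas=[{"animal": "cat"}, {"animal": "dog"}],
)
```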
**Description**
This PR adds some missing details to the "Split by tokens" page in the
documentation. Specifically:
- The `.from_tiktoken_encoder()` class methods for both the
`CharacterTextSplitter` and `RecursiveCharacterTextSplitter` default to
the old `gpt-2` encoding. I've added a comment to suggest specifying
`model_name` or `encoding`
- The docs didn't mention that the `from_tiktoken_encoder()` class
method passes additional kwargs down to the constructor of the splitter.
I only discovered this by reading the source code
- Added an example of using the `.from_tiktoken_encoder()` class method
with `RecursiveCharacterTextSplitter`, which is the recommended approach
over `CharacterTextSplitter` for most scenarios (see the sketch after
this list)
- Added a warning that `TokenTextSplitter` can split characters that
span multiple tokens (e.g. 猫 is 3 cl100k_base tokens) across chunks,
which creates malformed Unicode strings; it should not be used in these
situations.
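A short sketch of the recommended pattern (the model name is an example;
`chunk_size` and `chunk_overlap` are forwarded to the splitter's
constructor via kwargs):

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    model_name="gpt-4",  # avoids falling back to the legacy gpt-2 encoding
    chunk_size=100,      # forwarded to the constructor, not a tiktoken option
    chunk_overlap=0,
)
chunks = splitter.split_text("Some long document text ...")
```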
Side note: I think the default argument of `gpt2` for
`.from_tiktoken_encoder()` should be updated?
**Twitter handle** anthonypjshaw
---------
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
**Description:** Fixes the import paths for the `FlashrankRerank`
example notebook.
**Issue:** #19139
**Dependencies:** None
**Twitter handle:** n/a
---------
Co-authored-by: Simon Stone <simon.stone@dartmouth.edu>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
- [x] **PR title**: "Updating format of pip install in two files of
docs/cookbook"
- pip install commands are not rendering properly in some of the files
in cookbook
- Example:
[docs/expression_language/cookbook/sql_db](https://python.langchain.com/docs/expression_language/cookbook/sql_db)
- [x] **PR message**: Updating format of pip install in two files of
docs/cookbook
- **Issue:** #19197
- Note: let's do a squash merge for this PR
- [x] **PR title**: "community: deprecate DocugamiLoader"
- [x] **PR message**: Deprecate the langchain_community DocugamiLoader
and use the docugami_langchain DocugamiLoader instead
---------
Co-authored-by: Kenzie Mihardja <kenzie28@cs.washington.edu>
I think the cell type for the pip command should be 'code'.
Please check, thank you :)
The line `from langchain_openai import ChatOpenAI` appears twice in the
Get Started / Serving with LangServe section; the imports on lines 559
and 566 are identical.
Co-authored-by: Vitalii <vitalii@localhost>
**Description:** Update stale links in Together AI documentation
**Issue:** Some links pointed to legacy webpages on the Together AI
website
**Dependencies:** None
**Lint and test**: `make format`, `make lint` were run
- [ ] **PR title**: "docs: correction in
"https://github.com/langchain-ai/langchain/blob/master/docs/docs/get_started/quickstart.mdx",
line 289".
- [ ] **PR message**:
- Corrected the spelling mistake
- #18981
Fixed grammar in the Considerations section of the Model I/O Concepts
documentation page
- Update concepts.mdx
Page Link:
https://python.langchain.com/docs/modules/model_io/concepts#considerations
- **Description:** Fixed grammar in the Considerations section of the
Model I/O documentation page
- **Issue:** "to work well with the model are you using" → "to work well
with the model you are using"
- **Dependencies:** None
- **Twitter handle:** @Anubhav_Madhav
(https://twitter.com/Anubhav_Madhav)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
## Description
This PR addresses a documentation issue in the
[Indexing](https://python.langchain.com/docs/modules/data_connection/indexing)
page. Specifically, it corrects the execution results of the Jupyter
notebook under the
[Source](https://python.langchain.com/docs/modules/data_connection/indexing#source)
section, which were broken as detailed below.
## Problem
The execution results following the statement, `This should delete the
old versions of documents associated with doggy.txt source and replace
them with the new versions.`, appear to be incorrect, as described
below.
### Current Behavior
- For some reason, the `index` function fails to add the new content of
`doggy.txt`. Although it deletes the document objects associated with
the `doggy.txt` source, it does not add the objects in
`changed_doggy_docs`. Consequently, the execution result displays
`num_added: 0`.
- This unexpected behavior also impacts the results of
`vectorstore.similarity_search("dog", k=30)`, showing only the contents
of `kitty.txt`. It appears as though the contents of `doggy.txt` have
been completely removed from the index:
```
Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```
### Expected Behavior
- The `index` function should successfully add the objects in
`changed_doggy_docs` after removing the old content of `doggy.txt`. The
anticipated execution result is `num_added: 2`.
- Subsequently, the modified content of `doggy.txt` should appear in the
results of `vectorstore.similarity_search("dog", k=30)` as follows:
```
[Document(page_content='woof woof', metadata={'source': 'doggy.txt'}),
Document(page_content='woof woof woof', metadata={'source': 'doggy.txt'}),
Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```
## Fix
I reran `docs/docs/modules/data_connection/indexing.ipynb` and have
included the diff in this PR.
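For reference, a sketch of the call whose notebook output was
regenerated (`changed_doggy_docs`, `record_manager`, and `vectorstore`
are the objects set up earlier in the guide; the cleanup mode shown is
an assumption):

```python
from langchain.indexes import index

result = index(
    changed_doggy_docs,
    record_manager,
    vectorstore,
    cleanup="full",          # assumed mode for the Source section example
    source_id_key="source",
)
print(result)  # expected to report num_added: 2 once the old doggy.txt chunks are gone
```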
Docs fix: replace the column name "search" with "source".
The Xata integration expects a metadata column named "source".
The docs suggest the name "search" which, if used, yields the following
error:
```
File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/xata.py", line 95, in _add_vectors
raise Exception(f"Error adding vectors to Xata: {r.status_code} {r}")
Exception: Error adding vectors to Xata: 400 {'errors': [{'status': 400, 'message': 'invalid record: column [source]: column not found'}]}
```
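Put differently, documents written through the Xata integration should
carry their provenance under the "source" metadata key (a minimal
illustration; values are placeholders):

```python
from langchain_core.documents import Document

# Xata looks for a metadata column named "source"; using "search"
# triggers the "column not found" error shown above.
doc = Document(
    page_content="some text",
    metadata={"source": "my_file.txt"},
)
```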
Poetry can't reliably resolve the number of optional "extended test"
dependencies we have. If we instead just rely on pip to install the
extended test deps in CI, this isn't an issue.