langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-09-01 11:02:37 +00:00

Author	SHA1	Message	Date
Martin Triska	7a9149f5dd	community: ZeroxPDFLoader (#27800 ) # OCR-based PDF loader This implements a [Zerox](https://github.com/getomni-ai/zerox) PDF document loader. Zerox takes a simple but very powerful (though slower and more costly) approach to parsing PDF documents: it converts the PDF to a series of images and passes them to a vision model, requesting the contents in markdown. It is especially suitable for complex PDFs that other parsers handle poorly. ## Example use:
```python
import os

from langchain_community.document_loaders.pdf import ZeroxPDFLoader

os.environ["OPENAI_API_KEY"] = ""  # your-api-key
model = "gpt-4o-mini"  # openai model
pdf_url = "https://assets.ctfassets.net/f1df9zr7wr1a/soP1fjvG1Wu66HJhu3FBS/034d6ca48edb119ae77dec5ce01a8612/OpenAI_Sacra_Teardown.pdf"

loader = ZeroxPDFLoader(file_path=pdf_url, model=model)
docs = loader.load()
```
The Zerox library supports a wide range of providers/models. See the Zerox documentation for details. - Dependencies: `zerox` - Twitter handle: @martintriska1 --------- Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-11-07 03:14:57 +00:00
Dmitriy Prokopchuk	53b0a99f37	community: Memcached LLM Cache Integration (#27323 ) ## Description This PR adds support for Memcached as an LLM cache by adding a `MemcachedCache` implementation that relies on the [pymemcache](https://github.com/pinterest/pymemcache) client. On the unit-test side, the new integration is covered by the existing import tests. All new functionality depends on pymemcache only when it is instantiated and used, so, in line with the other cache implementations, the PR also adds optional integration tests for `MemcachedCache`. Since this is a new integration, documentation is added for Memcached as an integration and as an LLM cache. ## Issue This PR closes #27275, which was originally raised as a discussion in #27035. ## Dependencies There are no new required dependencies for langchain, but [pymemcache](https://github.com/pinterest/pymemcache) is required to instantiate the new `MemcachedCache`. ## Example Usage
```python3
from langchain.globals import set_llm_cache
from langchain_openai import OpenAI
from langchain_community.cache import MemcachedCache
from pymemcache.client.base import Client

llm = OpenAI(model="gpt-3.5-turbo-instruct", n=2, best_of=2)
set_llm_cache(MemcachedCache(Client('localhost')))

# The first time, it is not yet in cache, so it should take longer
llm.invoke("Which city is the most crowded city in the USA?")

# The second time it is, so it goes faster
llm.invoke("Which city is the most crowded city in the USA?")
```
--------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-07 03:07:59 +00:00
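The cache behaviour described in the Memcached commit (first call slow, repeated call served from cache) can be sketched without a Memcached server. The class below is a hypothetical dict-backed stand-in, not the `MemcachedCache` implementation. It only assumes the `lookup`/`update` pair, keyed on the prompt and an LLM configuration string, that LangChain cache classes expose. The names `DictBackedCacheSketch` and `cached_call` are illustrative.

```python
import hashlib
from typing import Callable, Dict, List, Optional


class DictBackedCacheSketch:
    """Hypothetical in-memory stand-in for an LLM cache (not MemcachedCache)."""

    def __init__(self) -> None:
        self._store: Dict[str, List[str]] = {}

    @staticmethod
    def _key(prompt: str, llm_string: str) -> str:
        # Hash prompt + model configuration so keys stay short and fixed-size.
        raw = prompt + "\x00" + llm_string
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def lookup(self, prompt: str, llm_string: str) -> Optional[List[str]]:
        # Return the cached generations, or None on a cache miss.
        return self._store.get(self._key(prompt, llm_string))

    def update(self, prompt: str, llm_string: str, return_val: List[str]) -> None:
        # Store the generations produced for this prompt/configuration pair.
        self._store[self._key(prompt, llm_string)] = return_val


def cached_call(
    cache: DictBackedCacheSketch,
    prompt: str,
    llm_string: str,
    generate: Callable[[str], List[str]],
) -> List[str]:
    """Call `generate` only when the cache has no entry for this prompt."""
    hit = cache.lookup(prompt, llm_string)
    if hit is not None:
        return hit
    result = generate(prompt)
    cache.update(prompt, llm_string, result)
    return result
```

Because the key includes the configuration string, the same prompt sent to a differently configured model is treated as a separate entry.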
Siddharth Murching	cfff2a057e	community: Update UC toolkit documentation to use LangGraph APIs (#26778 ) - Description: Update UC toolkit documentation to show an example of using recommended LangGraph agent APIs before the existing LangChain AgentExecutor example. Tested by manually running the updated example notebook - Dependencies: No new dependencies --------- Signed-off-by: Sid Murching <sid.murching@databricks.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-07 02:47:41 +00:00
Baptiste Pasquier	81f7daa458	community: add InfinityRerank (#27043 ) Description: - Add a Reranker for Infinity server. Dependencies: This wrapper uses [infinity_client](https://github.com/michaelfeil/infinity/tree/main/libs/client_infinity/infinity_client) to connect to an Infinity server. Tests and docs - integration test: test_infinity_rerank.py - example notebook: infinity_rerank.ipynb [here](https://github.com/baptiste-pasquier/langchain/blob/feat/infinity-rerank/docs/docs/integrations/document_transformers/infinity_rerank.ipynb) --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-06 17:26:30 -08:00
Martin Triska	90189f5639	community: Allow other than default parsers in SharePointLoader and OneDriveLoader (#27716 ) ## What does this PR do? ### Currently `O365BaseLoader` (and consequently both derived loaders) is limited to `pdf`, `doc`, and `docx` files. - Solution: this PR introduces a _handlers_ attribute that allows custom handlers to be passed in. This is done in _dict_ form. Example:
```python
# Parser import paths below are assumed from langchain_community's module layout.
from langchain_community.document_loaders import OneDriveLoader, SharePointLoader
from langchain_community.document_loaders.excel import UnstructuredExcelLoader
from langchain_community.document_loaders.parsers import PDFMinerParser
# PR for DocumentLoaderAsParser here: https://github.com/langchain-ai/langchain/pull/27749
from langchain_community.document_loaders.parsers.documentloader_adapter import DocumentLoaderAsParser
from langchain_community.document_loaders.parsers.msword import MsWordParser
from langchain_community.document_loaders.parsers.txt import TextParser

xlsx_parser = DocumentLoaderAsParser(UnstructuredExcelLoader, mode="paged")

# create dictionary mapping file types to handlers (parsers)
handlers = {
    "doc": MsWordParser(),
    "pdf": PDFMinerParser(),
    "txt": TextParser(),
    "xlsx": xlsx_parser,
}

# pass handlers to SharePointLoader
loader = SharePointLoader(document_library_id="...", handlers=handlers)
documents = loader.load()

# works the same in OneDriveLoader
loader = OneDriveLoader(document_library_id="...", handlers=handlers)
```
This dictionary is then passed to `MimeTypeBasedParser`, the same as in the [current implementation](`5a2cfb49e0/libs/community/langchain_community/document_loaders/parsers/registry.py (L13)`). ### Currently `SharePointLoader` and `OneDriveLoader` are separate loaders that both inherit from `O365BaseLoader`. However, both implement the same functionality. The only differences are: - `SharePointLoader` requires the argument `document_library_id`, whereas `OneDriveLoader` requires `drive_id`. These are just different names for the same thing. - `SharePointLoader` implements significantly more features. - Solution: `OneDriveLoader` is replaced with an empty shell that just renames `drive_id` to `document_library_id` and inherits from `SharePointLoader`. Dependencies: None Twitter handle: @martintriska1	2024-11-06 17:44:34 -05:00
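The two ideas in this commit, a dict of handlers keyed by file type and a thin subclass that only renames one argument, can be illustrated with plain Python. This is a minimal sketch under stated assumptions, not the langchain_community code: the class names ending in `Sketch`, the `parse` method, and `parse_text` are hypothetical, and the sketch dispatches on file extension for brevity, whereas the commit passes the dict to `MimeTypeBasedParser`.

```python
from typing import Any, Callable, Dict, List, Optional

Handler = Callable[[bytes], List[str]]


def parse_text(blob: bytes) -> List[str]:
    """Hypothetical default handler: decode bytes and split into lines."""
    return blob.decode("utf-8").splitlines()


class SharePointLoaderSketch:
    """Stand-in loader that picks a handler based on the file extension."""

    def __init__(
        self,
        document_library_id: str,
        handlers: Optional[Dict[str, Handler]] = None,
    ) -> None:
        self.document_library_id = document_library_id
        # Fall back to a default mapping when no custom handlers are passed.
        self.handlers = handlers or {"txt": parse_text}

    def parse(self, filename: str, blob: bytes) -> List[str]:
        extension = filename.rsplit(".", 1)[-1].lower()
        if extension not in self.handlers:
            raise ValueError(f"No handler registered for .{extension} files")
        return self.handlers[extension](blob)


class OneDriveLoaderSketch(SharePointLoaderSketch):
    """Thin shell: accepts drive_id and forwards it as document_library_id."""

    def __init__(self, drive_id: str, **kwargs: Any) -> None:
        super().__init__(document_library_id=drive_id, **kwargs)


def parse_csv_row(blob: bytes) -> List[str]:
    """Hypothetical custom handler for a single-row CSV payload."""
    return blob.decode("utf-8").split(",")


loader = OneDriveLoaderSketch(drive_id="drive-123", handlers={"csv": parse_csv_row})
rows = loader.parse("report.CSV", b"a,b,c")
```

Keeping all behaviour in the parent class means a new handler type only has to be supported in one place, which is the motivation the commit gives for collapsing the two loaders.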
murrlincoln	14f1827953	docs: Adding notebook for cdp agentkit toolkit (#27910 ) - Description: Adding in the first pass of documentation for the CDP Agentkit Toolkit - Issue: N/a - Dependencies: cdp-langchain - Twitter handle: @CoinbaseDev --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: John Peterson <john.peterson@coinbase.com>	2024-11-06 13:28:27 -08:00
Hammad Randhawa	75aa82fedc	docs: Completed sentence under the heading "Instantiating a Browser … (#27944 ) …Toolkit" in "playwright.ipynb" integration. - Completed the incomplete sentence in the Langchain Playwright documentation. - Enhanced documentation clarity to guide users on best practices for instantiating browser instances with Langchain Playwright. Example before: > "It's always recommended to instantiate using the from_browser method so that the Example after: > "It's always recommended to instantiate using the `from_browser` method so that the browser context is properly initialized and managed, ensuring seamless interaction and resource optimization." Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-06 19:55:00 +00:00
ccurme	66966a6e72	openai[patch]: release 0.2.6 (#27924 ) Some additions in support of [predicted outputs](https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs) feature: - Bump openai sdk version - Add integration test - Add example to integration docs The `prediction` kwarg is already plumbed through model invocation.	2024-11-05 23:02:24 +00:00
Tomaz Bratanic	a3bbbe6a86	update llm graph transformer documentation (#27905 )	2024-11-05 11:54:26 -05:00
Bagatur	6973f7214f	docs: sidebar capitalization (#27894 )	2024-11-04 22:09:32 +00:00
Ofer Mendelevitch	d7c39e6dbb	community: update Vectara integration (#27869 ) - Description: Updated Vectara integration - Issue: refresh on descriptions across all demos and added UDF reranker - Dependencies: None - Twitter handle: @ofermend --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-04 20:40:39 +00:00
Daniel Vu Dao	5745f3bf78	docs: Update `messages.mdx` (#27856 ) ### Description Updates phrasing for the header of the `Messages` section. Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-04 20:36:31 +00:00
sifatj	e02a5ee03e	docs: Update VectorStore as_retriever method url in qa_chat_history_how_to.ipynb (#27844 ) Description: Update VectorStore `as_retriever` method api reference url in `qa_chat_history_how_to.ipynb` Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-04 20:34:50 +00:00
sifatj	dd1711f3c2	docs: Update max_marginal_relevance_search api reference url in multi_vector.ipynb (#27843 ) Description: Update VectorStore `max_marginal_relevance_search` api reference url in `multi_vector.ipynb` Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-04 20:31:36 +00:00
sifatj	aa1f46a03a	docs: Update VectorStore .as_retriever method url in vectorstore_retriever.ipynb (#27842 ) Description: Update VectorStore `.as_retriever` method url in `vectorstore_retriever.ipynb` Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-04 20:28:11 +00:00
sifatj	eecf95df9b	docs: Update VectorStore api reference url in rag.ipynb (#27841 ) Description: Update VectorStore api reference url in `rag.ipynb` Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-04 20:27:03 +00:00
sifatj	50563400fb	docs: Update broken vectorstore urls in retrievers.ipynb (#27838 ) Description: Update outdated `VectorStore` api reference urls in `retrievers.ipynb` Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-04 20:26:03 +00:00
Rashmi Pawar	f86a09f82c	Add nvidia as provider for embedding, llm (#27810 ) Documentation: Add NVIDIA as integration provider cc: @mattf @dglogo Co-authored-by: Erick Friis <erick@langchain.dev>	2024-11-04 19:45:51 +00:00
ono-hiroki	b7d549ae88	docs: fix undefined 'data' variable in document_loader_csv.ipynb (#27872 ) Description: This PR addresses an issue in the CSVLoader example where data is not defined, causing a NameError. The line `data = loader.load()` is added to correctly assign the output of loader.load() to the data variable.	2024-11-04 14:10:56 +00:00
Erick Friis	9fedb04dd3	docs: INVALID_CHAT_HISTORY redirect (#27845 )	2024-11-01 21:35:11 +00:00
Erick Friis	03a3670a5e	infra: remove some special cases (#27839 )	2024-11-01 21:13:43 +00:00
Prithvi Kannan	c3c638cd7b	docs: Reference new databricks-langchain package (#27828 ) Update references in Databricks integration page to reference our new partner package databricks-langchain https://github.com/databricks/databricks-ai-bridge/tree/main/integrations/langchain --------- Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>	2024-11-01 10:21:19 -07:00
sifatj	33d445550e	docs: update VectorStore api reference url in retrievers.ipynb (#27814 ) Description: Update outdated `VectorStore` api reference url in Vector store subsection of `retrievers.ipynb`	2024-11-01 15:44:26 +00:00
sifatj	9a4a630e40	docs: Update Retrievers and Runnable links in Retrievers subsection of retrievers.ipynb (#27815 ) Description: Update outdated links for `Retrievers` and `Runnable` in Retrievers subsection of `retrievers.ipynb`	2024-11-01 15:42:30 +00:00
Zapiron	b0dfff4cd5	Fixed broken link for TokenTextSplitter (#27824 ) Fixed the broken redirect link for `TokenTextSplitter` section	2024-11-01 11:32:07 -04:00
Eugene Yurtsev	2f6254605d	docs: fix more links (#27809 ) Fix more broken links	2024-10-31 17:15:46 -04:00
Eugene Yurtsev	71f590de50	docs: fix more broken links (#27806 ) Fix some broken links	2024-10-31 19:46:39 +00:00
Neli Hateva	c572d663f9	docs: Ontotext GraphDB QA Chain Update Documentation (Fix versions of libraries) (#27783 ) - Description: Update versions of libraries in the Ontotext GraphDB QA Chain Documentation - Issue: N/A - Dependencies: N/A - Twitter handle: @OntotextGraphDB	2024-10-31 15:23:16 -04:00
Erick Friis	54cb80c778	docs: experimental case, use yq action (#27798 )	2024-10-31 11:21:48 -07:00
Changyong Um	d9163e7afa	community[docs]: Add content for the Lora adapter in the VLLM page. (#27788 ) Description: I added code for lora_request in the community package, but I forgot to add content to the VLLM page. So, I will do that now. #27731 --------- Co-authored-by: Um Changyong <changyong.um@sfa.co.kr>	2024-10-31 12:44:35 -04:00
ccurme	0172d938b4	community: add AzureOpenAIWhisperParser (#27796 ) Commandeered from https://github.com/langchain-ai/langchain/pull/26757. --------- Co-authored-by: Sheepsta300 <128811766+Sheepsta300@users.noreply.github.com>	2024-10-31 12:37:41 -04:00
Sam Julien	0a472e2a2d	community: Add Writer integration (#27646 ) Description: Add support for Writer chat models Issue: N/A Dependencies: Add `writer-sdk` to optional dependencies. Twitter handle: Please tag `@samjulien` and `@Get_Writer` Tests and docs - [x] Unit test - [x] Example notebook in `docs/docs/integrations` directory. Lint and test - [x] Run `make format` - [x] Run `make lint` - [x] Run `make test` --------- Co-authored-by: Johannes <tolstoy.work@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-30 18:06:05 +00:00
ccurme	595dc592c9	docs: run how-to guides in CI (#27615 ) Add how-to guides to [Run notebooks job](https://github.com/langchain-ai/langchain/actions/workflows/run_notebooks.yml) and fix existing notebooks. - As with tutorials, cassettes must be updated when HTTP calls in guides change (by running existing [script](https://github.com/langchain-ai/langchain/blob/master/docs/scripts/update_cassettes.sh)). - Cassettes now total ~62mb over 474 files. - `docs/scripts/prepare_notebooks_for_ci.py` lists a number of notebooks that do not run (e.g., due to requiring additional infra, slowness, requiring `input()`, etc.).	2024-10-30 12:35:38 -04:00
Yuki Watanabe	e593e017d2	Update compatibility table for ChatDatabricks (#27676 ) `ChatDatabricks` added support for structured output and JSON mode in the last release. This PR updates the feature table accordingly. Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>	2024-10-30 11:56:55 -04:00
Ankur Singh	0b97135da1	fix the grammar and markdown component (#27657 ) ## Before ![Screenshot from 2024-10-26 08-47-29](https://github.com/user-attachments/assets/d8ccead1-3ba3-4f67-a29f-ef8b352341cf) ## After ![image](https://github.com/user-attachments/assets/78f36d54-b2d7-4164-b334-8ac41000711e) ## Typo `(either in PR summary of in a linked issue)` => `either in PR summary or in a linked issue` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-10-30 14:47:26 +00:00
Abdesselam Benameur	8fb6708ac4	Fix typo (missing letter) in elasticsearch_retriever.ipynb (#27639 ) Fixed a small typo (added a missing "t" in ElasticsearchRetriever docs page) https://python.langchain.com/docs/integrations/retrievers/elasticsearch_retriever/#:~:text=It%20is%20possible%20to%20cusomize%20the%20function%20tha%20maps%20an%20Elasticsearch%20result%20(hit)%20to%20a%20LangChain%20document.	2024-10-30 14:38:39 +00:00
Martin Gullbrandson	8a5807a6b4	docs: Update Milvus documentation to correctly show how to filter in similarity_search (#27723 ) ### Description/Issue: I had problems filtering when setting up a local Milvus db and noticed that the `filter` option in the `similarity_search` and `similarity_search_with_score` appeared to do nothing. Instead, the `expr` option should be used. The `expr` option is correctly used in the retriever example further down in the documentation. The `expr` option seems to be correctly passed on, for example [here](`447c0dd2f0/libs/community/langchain_community/vectorstores/milvus.py (L701)`) ### Solution: Update the documentation for the functions mentioned to show intended behavior. --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-10-30 14:15:11 +00:00
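The fix described in the Milvus commit amounts to passing the condition through `expr` rather than `filter`. A minimal sketch follows, assuming a vector store object whose `similarity_search` accepts an `expr` keyword as the commit states. The helper names `build_expr` and `search_by_source` and the `source` metadata field are hypothetical.

```python
import json
from typing import Any, List


def build_expr(field: str, value: str) -> str:
    """Build a Milvus-style equality expression, e.g. source == "tweet"."""
    # json.dumps adds the double quotes and escapes any embedded quotes.
    return f"{field} == {json.dumps(value, ensure_ascii=False)}"


def search_by_source(vector_store: Any, query: str, source: str, k: int = 2) -> List[Any]:
    """Run a similarity search restricted to documents with the given source."""
    # Pass the condition via `expr`; per the commit, `filter` appeared to do nothing.
    return vector_store.similarity_search(query, k=k, expr=build_expr("source", source))
```

Building the expression in one helper keeps quoting consistent if the same condition is reused in a retriever's search arguments.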
Nawaz Haider	9d2f6701e1	DOCS: Fixed import of langchain instead of langchain_nvidia_ai_endpoints for ChatNVIDIA (#27734 ) * PR title: "docs: Replaced langchain import with langchain-nvidia-ai-endpoints in NVIDIA Endpoints Tab" * PR message: + Description: Replaced the import of `langchain` with `langchain-nvidia-ai-endpoints` in the NVIDIA Endpoints Tab to resolve an error caused by the documentation attempting to import the generic `langchain` module despite the targeted import. + Issue: + Dependencies: No additional dependencies introduced; simply updated the existing import to a more specific module. + Twitter handle: https://x.com/nawaz0x1 * Add tests and docs: + Applicability: Not applicable in this case, as the change is a fix to an existing integration rather than the addition of a new one. + Rationale: No new functionality or integrations are introduced, only a corrective import change. * Lint and test: + Status: Completed + Outcome: - `make format`: Passed - `make lint`: Passed - `make test`: Passed ![image](https://github.com/user-attachments/assets/fbc1b597-5083-4461-875a-d32ab8ed933c)	2024-10-30 13:57:37 +00:00
Prithvi Kannan	0433b114bb	docs: Add databricks-langchain package consolidation notice (#27703 ) Add notice of upcoming package consolidation of `langchain-databricks` into `databricks-langchain`. Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-29 22:00:27 +00:00
Zapiron	447c0dd2f0	docs: Fixed Grammar & Improve reading (#27672 ) Updated the documentation to fix some grammar errors - Description: Some language errors exist in the documentation - Changed the structure of some sentences	2024-10-29 20:19:00 +00:00
Soham Das	913ff1b152	docs: fix typo in query analysis documentation (#27721 ) PR Title: `docs: fix typo in query analysis documentation` Description: This PR corrects a typo on line 68 in the query analysis documentation, changing "pharsings" to "phrasings" for clarity and accuracy. Only one instance of the typo was fixed in the last merge, and this PR fixes the second instance. Issue: N/A Dependencies: None Additional Notes: No functional changes were made; this is a documentation fix only.	2024-10-29 16:15:37 -04:00
Mateusz Szewczyk	0606aabfa3	docs: Added WatsonxRerank documentation (#27424 ) Changes: - docs: Added `WatsonxRerank` documentation - docs: Updated `WatsonxEmbeddings` with docs template - docs: Updated `ChatWatsonx` with docs template - docs: Updated `WatsonxLLM` with docs template - docs: Added `ChatWatsonx` to the list of chat model providers. Added [test_chat_models_standard](https://github.com/langchain-ai/langchain-ibm/blob/main/libs/ibm/tests/integration_tests/test_chat_models_standard.py) to the `langchain_ibm` test suite. - docs: Added `IBM` to the list of embedding model providers. Added [test_embeddings_standard](https://github.com/langchain-ai/langchain-ibm/blob/main/libs/ibm/tests/integration_tests/test_embeddings_standard.py) to the `langchain_ibm` test suite. - docs: Updated the recommended `langchain_ibm` versions compatible with `LangChain v0.3` --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-29 16:57:47 +00:00
Zapiron	9ccd4a6ffb	DOC: Tutorial Section Updates (#27675 ) Edited various notebooks in the tutorial section to: * Fix grammatical errors * Improve readability by changing sentence structure or removing repeated words that carry the same meaning * Make a code block follow the PEP 8 standard * Add more information to some sentences to make the concepts clearer and reduce ambiguity --------- Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>	2024-10-29 14:51:34 +00:00
Soham Das	c3021e9322	docs: fix typo in query analysis documentation (#27697 ) PR Title: `docs: fix typo in query analysis documentation` Description: This PR corrects a typo on line 68 in the query analysis documentation, changing "pharsings" to "phrasings" for clarity and accuracy. Issue: N/A Dependencies: None Additional Notes: No functional changes were made; this is a documentation fix only.	2024-10-29 14:07:22 +00:00
Erick Friis	94e5765416	docs: packages in homepage (#27693 )	2024-10-28 20:44:30 +00:00
Jorge Piedrahita Ortiz	8895d468cb	community: sambastudio llm refactor (#27215 ) Description: - Sambastudio LLM refactor - Sambastudio openai compatible API support added - docs updated	2024-10-27 11:08:15 -04:00
Erick Friis	cdb4b1980a	docs: reorganize contributing docs (#27649 )	2024-10-25 22:41:54 +00:00
Gabriel Faundez	ef27ce7a45	docs: add missing import for tools docs (#27650 ) ## Description Added missing import from `pydantic` in the tools docs	2024-10-25 21:14:40 +00:00
Erick Friis	2683f814f4	docs: contributing index page (#27647 )	2024-10-25 17:06:55 +00:00
Rashmi Pawar	83eebf549f	docs: Add NVIDIA as provider in v3 integrations (#27254 ) ### Add NVIDIA as provider in langchain v3 integrations cc: @sumitkbh @mattf @dglogo --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-10-25 16:21:22 +00:00

4873 Commits