langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-07-05 20:58:25 +00:00

Author	SHA1	Message	Date
Karthik Raja A	8b08687fc4	MultiOn client toolkit (#8110 ) Addition of MultiOn Client Agent Toolkit Dependencies: multion pip package This PR consists of the following: - MultiOn utility,tools and integration with agent - sample jupyter notebook. Request @hwchase17 , @hinthornw --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-07-22 08:19:01 -07:00
Harrison Chase	aa0e69bc98	Harrison/official pre release (#8106 )	2023-07-21 18:44:32 -07:00
Lance Martin	5a084e1b20	Async HTML loader and HTML2Text transformer (#8036 ) New HTML loader that asynchronously loader a list of urls. New transformer using [HTML2Text](https://github.com/Alir3z4/html2text/) for HTML to clean, easy-to-read plain ASCII text (valid Markdown).	2023-07-20 22:30:59 -07:00
Wey Gu	cf60cff1ef	feat: Add with_history option for chatglm (#8048 ) In certain 0-shot scenarios, the existing stateful language model can unintentionally send/accumulate the .history. This commit adds the "with_history" option to chatglm, allowing users to control the behavior of .history and prevent unintended accumulation. Possible reviewers @hwchase17 @baskaryan @mlot Refer to discussion over this thread: https://twitter.com/wey_gu/status/1681996149543276545?s=20	2023-07-20 22:25:37 -07:00
Harrison Chase	1f3b987860	Harrison/GitHub toolkit (#8047 ) Co-authored-by: Trevor Dobbertin <trevordobbertin@gmail.com>	2023-07-20 22:24:55 -07:00
Harrison Chase	f99f497b2c	Harrison/predibase (#8046 ) Co-authored-by: Abhay Malik <32989166+Abhay-765@users.noreply.github.com>	2023-07-20 19:26:50 -07:00
Jacob Lee	56c6ab1715	Fix bad docs sidebar header (#7966 ) Quick fix for: <img width="283" alt="Screenshot 2023-07-19 at 2 49 44 PM" src="https://github.com/hwchase17/langchain/assets/6952323/91e4868c-b75e-413d-9f8f-d34762abf164"> CC @baskaryan	2023-07-20 19:06:57 -07:00
Kacper Łukawski	ed6a5532ac	Implement async support in Qdrant local mode (#8001 ) I've extended the support of async API to local Qdrant mode. It is faked but allows prototyping without spinning a container. The tests are improved to test the in-memory case as well. @baskaryan @rlancemartin @eyurtsev @agola11	2023-07-20 19:04:33 -07:00
Taqi Jaffri	973593c5c7	Added streaming support to Replicate (#8045 ) Streaming support is useful if you are doing long-running completions or need interactivity e.g. for chat... adding it to replicate, using a similar pattern to other LLMs that support streaming. Housekeeping: I ran `make format` and `make lint`, no issues reported in the files I touched. I did update the replicate integration test but ran into some issues, specifically: 1. The original test was failing for me due to the model argument not being specified... perhaps this test is not regularly run? I fixed it by adding a call to the lightweight hello world model which should not be burdensome for replicate infra. 2. I couldn't get the `make integration_tests` command to pass... a lot of failures in other integration tests due to missing dependencies... however I did make sure the particluar test file I updated does pass, by running `poetry run pytest tests/integration_tests/llms/test_replicate.py` Finally, I am @tjaffri https://twitter.com/tjaffri for feature announcement tweets... or if you could please tag @docugami https://twitter.com/docugami we would really appreciate that :-) Tagging model maintainers @hwchase17 @baskaryan Thank for all the awesome work you folks are doing. --------- Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>	2023-07-20 18:59:54 -07:00
Piyush Jain	31b7ddc12c	Neptune graph and openCypher QA Chain (#8035 ) ## Description This PR adds a graph class and an openCypher QA chain to work with the Amazon Neptune database. ## Dependencies `requests` which is included in the LangChain dependencies. ## Maintainers for Review @krlawrence @baskaryan ### Twitter handle pjain7	2023-07-20 18:56:47 -07:00
Jonathon Belotti	021bb9be84	Update Modal.com integration docs (#8014 ) Hey, I'm a Modal Labs engineer and I'm making this docs update after getting a user question in [our beta Slack space](https://join.slack.com/t/modalbetatesters/shared_invite/zt-1xl9gbob8-1QDgUY7_PRPg6dQ49hqEeQ) about the Langchain integration docs. 🔗 [Modal beta-testers link to docs discussion thread](https://modalbetatesters.slack.com/archives/C031Z7DBQFL/p1689777700594819?thread_ts=1689775859.855849&cid=C031Z7DBQFL)	2023-07-20 15:53:06 -07:00
Jeffrey Wang	62d0475c29	Add Metaphor new field and reformat docs (#8022 ) This PR reformats our python notebook example and also adds a new field we have. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2023-07-20 15:50:54 -07:00
Dwai Banerjee	d8c40253c3	Adding endpoint_url to embeddings/bedrock.py and updated docs (#7927 ) BedrockEmbeddings does not have endpoint_url so that switching to custom endpoint is not possible. I have access to Bedrock custom endpoint and cannot use BedrockEmbeddings --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 07:25:59 -07:00
Constantin Musca	d593833e4d	Add Golden Query Tool (#7930 ) Description: Golden Query is a wrapper on top of the [Golden Query API](https://docs.golden.com/reference/query-api) which enables programmatic access to query results on entities across Golden's Knowledge Base. For more information about Golden API, please see the [Golden API Getting Started](https://docs.golden.com/reference/getting-started) page. Issue: None Dependencies: requests(already present in project) Tag maintainer: @hinthornw Signed-off-by: Constantin Musca <constantin.musca@gmail.com>	2023-07-20 07:03:20 -07:00
Santiago Delgado	c416dbe8e0	Amadeus Flight and Travel Search Tool (#7890 ) ## Background With the addition on email and calendar tools, LangChain is continuing to complete its functionality to automate business processes. ## Challenge One of the pieces of business functionality that LangChain currently doesn't have is the ability to search for flights and travel in order to book business travel. ## Changes This PR implements an integration with the [Amadeus](https://developers.amadeus.com/) travel search API for LangChain, enabling seamless search for flights with a single authentication process. ## Who can review? @hinthornw ## Appendix @tsolakoua and @minjikarin, I utilized your [amadeus-python](https://github.com/amadeus4dev/amadeus-python) library extensively. Given the rising popularity of LangChain and similar AI frameworks, the convergence of libraries like amadeus-python and tools like this one is likely. So, I wanted to keep you updated on our progress. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-20 06:59:29 -07:00
Jithin James	493cbc9410	docs: fix a couple of small indentation errors in the strings (#7951 ) Fixed a few indentations I came across in the docs @baskaryan	2023-07-20 06:34:01 -07:00
Bhashithe Abeysinghe	73901ef132	Added windows specific instructions to Llama.cpp documentation. (#8000 ) - Description: Added windows specific instructions on llama.cpp in the notebook file - Issue: #6356 - Dependencies: None - Tag maintainer: @baskaryan	2023-07-20 06:31:25 -07:00
Jeff Huber	5694e7b8cf	Update chroma notebook (#7978 ) Fix up the Chroma notebook - remove `.persist()` -- this is no longer in Chroma as of `0.4.0` - update output to match `0.4.0` - other cleanup work	2023-07-20 06:25:31 -07:00
Kacper Łukawski	19e8472521	Add async Qdrant to async_agent.ipynb (#7993 ) I added Qdrant to the async API docs. This is the only vector store that supports full async API. @baskaryan @rlancemartin, @eyurtsev	2023-07-20 06:23:15 -07:00
Bagatur	5d021c0962	nb fix (#7962 )	2023-07-19 15:27:43 -07:00
Julien Salinas	3adab5e5be	Integrate NLP Cloud embeddings endpoint (#7931 ) Add embeddings for [NLPCloud](https://docs.nlpcloud.com/#embeddings). --------- Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-19 15:27:34 -07:00
Brendan Collins	9aef79c2e3	Add Geopandas.GeoDataFrame Document Loader (#3817 ) Work in Progress. WIP Not ready... Adds Document Loader support for [Geopandas.GeoDataFrames](https://geopandas.org/) Example: - [x] stub out `GeoDataFrameLoader` class - [x] stub out integration tests - [ ] Experiment with different geometry text representations - [ ] Verify CRS is successfully added in metadata - [ ] Test effectiveness of searches on geometries - [ ] Test with different geometry types (point, line, polygon with multi-variants). - [ ] Add documentation --------- Co-authored-by: Lance Martin <lance@langchain.dev> Co-authored-by: Bagatur <baskaryan@gmail.com> Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>	2023-07-19 12:14:41 -07:00
Bagatur	f97535b33e	fix (#7947 )	2023-07-19 10:23:10 -07:00
William FH	9d7e57f5c0	Docs Nit (#7918 )	2023-07-18 21:47:28 -07:00
Jarek Kazmierczak	f2ef3ff54a	Google Cloud Enterprise Search retriever (#7857 ) Added a retriever that encapsulated Google Cloud Enterprise Search. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 18:24:08 -07:00
Zizhong Zhang	bdf0c2267f	docs(custom_chain) fix typo (#7898 ) Fix typo in the document of custom_chain	2023-07-18 18:03:19 -07:00
Jeff Huber	2139d0197e	upgrade chroma to 0.4.0 (#7749 ) This should land Monday the 17th Chroma is upgrading from `0.3.29` to `0.4.0`. `0.4.0` is easier to build, more durable, faster, smaller, and more extensible. This comes with a few changes: 1. A simplified and improved client setup. Instead of having to remember weird settings, users can just do `EphemeralClient`, `PersistentClient` or `HttpClient` (the underlying direct `Client` implementation is also still accessible) 2. We migrated data stores away from `duckdb` and `clickhouse`. This changes the api for the `PersistentClient` that used to reference `chroma_db_impl="duckdb+parquet"`. Now we simply set `is_persistent=true`. `is_persistent` is set for you to `true` if you use `PersistentClient`. 3. Because we migrated away from `duckdb` and `clickhouse` - this also means that users need to migrate their data into the new layout and schema. Chroma is committed to providing extension notification and tooling around any schema and data migrations (for example - this PR!). After upgrading to `0.4.0` - if users try to access their data that was stored in the previous regime, the system will throw an `Exception` and instruct them how to use the migration assistant to migrate their data. The migration assitant is a pip installable CLI: `pip install chroma_migrate`. And is runnable by calling `chroma_migrate` -- TODO ADD here is a short video demonstrating how it works. Please reference the readme at [chroma-core/chroma-migrate](https://github.com/chroma-core/chroma-migrate) to see a full write-up of our philosophy on migrations as well as more details about this particular migration. Please direct any users facing issues upgrading to our Discord channel called [#get-help](https://discord.com/channels/1073293645303795742/1129200523111841883). We have also created a [email listserv](https://airtable.com/shrHaErIs1j9F97BE) to notify developers directly in the future about breaking changes. --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 17:20:54 -07:00
Lance Martin	41c841ec85	Add Llama-v2 to Llama.cpp notebook (#7913 )	2023-07-18 15:13:27 -07:00
Lance Martin	862268175e	Add llama-v2 to docs (#7893 )	2023-07-18 12:09:09 -07:00
maciej-skorupka	c6d1d6d7fc	feat: moving azure OpenAI API version to the latest 2023-05-15 (#7764 ) Moving to the latest non-preview Azure OpenAI API version=2023-05-15. The previous 2023-03-15-preview doesn't have support, SLA etc. For instance, OpenAI SDK has moved to this version https://github.com/openai/openai-python/releases/tag/v0.27.7 @baskaryan	2023-07-18 09:50:15 -07:00
satorioh	259a409998	docs(zilliz): connection_args add token description for serverless cl… (#7810 ) Description: Currently, Zilliz only support dedicated clusters using a pair of username and password for connection. Regarding serverless clusters, they can connect to them by using API keys( [ see official note detail](https://docs.zilliz.com/docs/manage-cluster-credentials)), so I add API key(token) description in Zilliz docs to make it more obvious and convenient for this group of users to better utilize Zilliz. No changes done to code. --------- Co-authored-by: Robin.Wang <3Jg$94sbQ@q1> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 09:31:39 -07:00
maciej-skorupka	5de7815310	docs: added comment from azure llm to azure chat about GPT-4 (#7884 ) Azure GPT-4 models can't be accessed via LLM model. It's easy to miss that and a lot of discussions about that are on the Internet. Therefore I added a comment in Azure LLM docs that mentions that and points to Azure Chat OpenAI docs. @baskaryan --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-18 08:05:41 -07:00
Bill Zhang	dda11d2a05	WeaviateHybridSearchRetriever option to enable scores. (#7861 ) Description: This PR adds the option to retrieve scores and explanations in the WeaviateHybridSearchRetriever. This feature improves the usability of the retriever by allowing users to understand the scoring logic behind the search results and further refine their search queries. Issue: This PR is a solution to the issue #7855 Dependencies: This PR does not introduce any new dependencies. Tag maintainer: @rlancemartin, @eyurtsev I have included a unit test for the added feature, ensuring that it retrieves scores and explanations correctly. I have also included an example notebook demonstrating its use.	2023-07-18 07:57:17 -07:00
Jonathan Pedoeem	c460c29a64	Adding Docs for `PromptLayerCallbackHandler` (#7860 ) Here I am adding documentation for the `PromptLayerCallbackHandler`. When we created the initial PR for the callback handler the docs were causing issues, so we merged without the docs.	2023-07-18 07:51:16 -07:00
German Martin	f1eaa9b626	Lost in the middle: We have been ordering documents the WRONG way. (for long context) (#7520 ) Motivation, it seems that when dealing with a long context and "big" number of relevant documents we must avoid using out of the box score ordering from vector stores. See: https://arxiv.org/pdf/2306.01150.pdf So, I added an additional parameter that allows you to reorder the retrieved documents so we can work around this performance degradation. The relevance respect the original search score but accommodates the lest relevant document in the middle of the context. Extract from the paper (one image speaks 1000 tokens): ![image](https://github.com/hwchase17/langchain/assets/1821407/fafe4843-6e18-4fa6-9416-50cc1d32e811) This seems to be common to all diff arquitectures. SO I think we need a good generic way to implement this reordering and run some test in our already running retrievers. It could be that my approach is not the best one from the architecture point of view, happy to have a discussion about that. For me this was the best place to introduce the change and start retesting diff implementations. @rlancemartin, @eyurtsev --------- Co-authored-by: Lance Martin <lance@langchain.dev>	2023-07-18 07:45:15 -07:00
William FH	c6f2d27789	Docs Nits (#7874 ) Add links to reference docs	2023-07-18 01:50:14 -07:00
William FH	3179ee3a56	Evals docs (#7460 ) Still don't have good "how to's", and the guides / examples section could be further pruned and improved, but this PR adds a couple examples for each of the common evaluator interfaces. - [x] Example docs for each implemented evaluator - [x] "how to make a custom evalutor" notebook for each low level APIs (comparison, string, agent) - [x] Move docs to modules area - [x] Link to reference docs for more information - [X] Still need to finish the evaluation index page - ~[ ] Don't have good data generation section~ - ~[ ] Don't have good how to section for other common scenarios / FAQs like regression testing, testing over similar inputs to measure sensitivity, etc.~	2023-07-18 01:00:01 -07:00
Jasper	5b4d53e8ef	Add text_content kwarg to BrowserlessLoader (#7856 ) Added keyword argument to toggle between getting the text content of a site versus its HTML when using the `BrowserlessLoader`	2023-07-17 17:02:19 -07:00
Matt Robinson	3c489be773	feat: optional post-processing for Unstructured loaders (#7850 ) ### Summary Adds a post-processing method for Unstructured loaders that allows users to optionally modify or clean extracted elements. ### Testing ```python from langchain.document_loaders import UnstructuredFileLoader from unstructured.cleaners.core import clean_extra_whitespace loader = UnstructuredFileLoader( "./example_data/layout-parser-paper.pdf", mode="elements", post_processors=[clean_extra_whitespace], ) docs = loader.load() docs[:5] ``` ### Reviewrs - @rlancemartin - @eyurtsev - @hwchase17	2023-07-17 12:13:05 -07:00
Bagatur	2a315dbee9	fix nb (#7843 )	2023-07-17 09:39:11 -07:00
Bagatur	98c48f303a	fix (#7838 )	2023-07-17 07:53:11 -07:00
Dayuan Jiang	ee40d37098	add bm25 module (#7779 ) - Description: Add a BM25 Retriever that do not need Elastic search - Dependencies: rank_bm25(if it is not installed it will be install by using pip, just like TFIDFRetriever do) - Tag maintainer: @rlancemartin, @eyurtsev - Twitter handle: DayuanJian21687 --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-17 07:30:17 -07:00
Liu Ming	fa0a9e502a	Add LLM for ChatGLM(2)-6B API (#7774 ) Description: Add LLM for ChatGLM-6B & ChatGLM2-6B API Related Issue: Will the langchain support ChatGLM? #4766 Add support for selfhost models like ChatGLM or transformer models #1780 Dependencies: No extra library install required. It wraps api call to a ChatGLM(2)-6B server(start with api.py), so api endpoint is required to run. Tag maintainer: @mlot Any comments on this PR would be appreciated. --------- Co-authored-by: mlot <limpo2000@gmail.com> Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-17 07:27:17 -07:00
sseide	25e3d3f283	Support Redis Sentinel database connections (#5196 ) # Support Redis Sentinel database connections This PR adds the support to connect not only to Redis standalone servers but High Availability Replication sets too (https://redis.io/docs/management/sentinel/) Redis Replica Sets have on Master allowing to write data and 2+ replicas with read-only access to the data. The additional Redis Sentinel instances monitor all server and reconfigure the RW-Master on the fly if it comes unavailable. Therefore all connections must be made through the Sentinels the query the current master for a read-write connection. This PR adds basic support to also allow a redis connection url specifying a Sentinel as Redis connection. Redis documentation and Jupyter notebook with Redis examples are updated to mention how to connect to a redis Replica Set with Sentinels - Remark - i did not found test cases for Redis server connections to add new cases here. Therefor i tests the new utility class locally with different kind of setups to make sure different connection urls are working as expected. But no test case here as part of this PR.	2023-07-17 07:18:51 -07:00
Yifei Song	2e47412073	Add Xorbits agent (#7647 ) - [Xorbits](https://doc.xorbits.io/en/latest/) is an open-source computing framework that makes it easy to scale data science and machine learning workloads in parallel. Xorbits can leverage multi cores or GPUs to accelerate computation on a single machine, or scale out up to thousands of machines to support processing terabytes of data. - This PR added support for the Xorbits agent, which allows langchain to interact with Xorbits Pandas dataframe and Xorbits Numpy array. - Dependencies: This change requires the Xorbits library to be installed in order to be used. `pip install xorbits` - Request for review: @hinthornw - Twitter handle: https://twitter.com/Xorbitsio	2023-07-17 07:09:51 -07:00
Mohammad Mohtashim	b8b8a138df	Simple Import fix in Tools Exception Docs (#7740 ) Issue: #7720 @hinthornw	2023-07-15 10:25:34 -04:00
Lance Martin	b015647e31	Add GPT4All embeddings (#7743 ) Support for [GPT4All embeddings](https://docs.gpt4all.io/gpt4all_python_embedding.html) --------- Co-authored-by: Bagatur <baskaryan@gmail.com>	2023-07-15 10:04:29 -04:00
Bearnardd	275b926cf7	add missing import (#7730 ) Just a nit documentation fix @baskaryan	2023-07-14 20:03:23 -04:00
Lorenzo	77e6bbe6f0	fix typo in deeplake.ipynb (#7718 ) - Fixing typos in deeplake documentation - @baskaryan	2023-07-14 13:38:31 -04:00
Samuel Berthe	2be3515a66	SQLDatabase: adding security disclamer (#7710 ) It might be obvious to most engineers, but I think everybody should be cautious when using such a chain. ![image](https://github.com/hwchase17/langchain/assets/2951285/a1df6567-9d56-4c12-98ea-767401ae2ac8)	2023-07-14 13:38:16 -04:00

1 2 3 4 5

220 Commits