langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-08-31 10:23:18 +00:00

Author	SHA1	Message	Date
Zander Chase	00f276d23f	Run eval in eval mode (#6447 ) For the `run_on_dataset` sessions	2023-06-19 18:31:38 -07:00
Harrison Chase	1300a4bc8c	expose docs chains (#6453 )	2023-06-19 17:18:54 -07:00
Harrison Chase	286452c7f0	remove mongo	2023-06-19 10:04:14 -07:00
David Duong	be9371ca8f	Include placeholder value for all secrets, not just kwargs (#6421 ) Mirror PR for https://github.com/hwchase17/langchainjs/pull/1696 Secrets passed via environment variables should be present in the serialised chain	2023-06-19 15:41:45 +01:00
Harrison Chase	df40cd233f	bump version to 205 (#6410 ) v0.0.205	2023-06-18 23:21:26 -07:00
Harrison Chase	e9c2b280db	Harrison/refactor functions (#6408 )	2023-06-18 23:13:42 -07:00
Harrison Chase	6a4a950a3c	changes to llm chain (#6328 ) - return raw and full output (but keep run shortcut method functional) - change output parser to take in generations (good for working with messages) - add output parser to base class, always run (default to same as current) --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2023-06-18 22:49:47 -07:00
Davis Chase	d3c2eab0b3	Docs nit (#6350 )	2023-06-18 20:58:12 -07:00
Davis Chase	af96de6552	fix prod docs build (#6402 )	2023-06-18 20:56:12 -07:00
Fei Wang	50556f3b35	support memory for functions (#6165 ) #### Before submitting Add memory support for `OpenAIFunctionsAgent` like `StructuredChatAgent`. #### Who can review? @hwchase17 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 19:00:40 -07:00
Dhruvil Shah	b2b9ded12f	Update web_base.py _fetch() method For SiteMapLoader (#6256 ) A must-include for SiteMap Loader to avoid the SSL verification error. Setting the 'verify' to False by ``` sitemap_loader.requests_kwargs = {"verify": False}``` does not bypass the SSL verification in some websites. There are websites (https:// researchadmin.asu.edu/ sitemap.xml) where setting "verify" to False as shown below would not work: sitemap_loader.requests_kwargs = {"verify": False} We need this merge to tell the Session to use a connector with a specific argument about SSL: \# For SiteMap SSL verification if not self.request_kwargs['verify']: connector = aiohttp.TCPConnector(ssl=False) else: connector = None <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes #5483 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> v0.0.204	2023-06-18 18:34:18 -07:00
Harrison Chase	10bff4ecc4	Harrison/chroma fix (#6390 ) Co-authored-by: Junu Moon(Fran) <francomoon7@gmail.com>	2023-06-18 18:33:26 -07:00
Harrison Chase	5c1fa3e70e	Harrison/typesense fix (#6391 ) Co-authored-by: Gaurav Chauhan <2796gaurav@gmail.com> Co-authored-by: gaurav <gaurav.chauhan1@rksv.in>	2023-06-18 18:33:15 -07:00
Harrison Chase	5ccebce777	rm pandas from arize (#6392 )	2023-06-18 18:33:04 -07:00
matias-biatoz	3b7c4c51d5	Added gpt-3.5-turbo 0613 16k and 16k-0613 pricing (#6287 ) @agola11 Issue #6193 I added the new pricing for the new models. Also, now gpt-3.5-turbo got split into "input" and "output" pricing. It currently does not support that.	2023-06-18 18:32:20 -07:00
Ly Nguyen	1e0af59f69	- Fix pass system_message argument in new feature openai_functions_agent (#6297 ) can't pass system_message argument, the prompt always show default message "System: You are a helpful AI assistant." ``` system_message = SystemMessage( content="You are an AI that provides information to Human regarding documentation." ) agent = initialize_agent( tools, llm=openai_llm_chat, agent=AgentType.OPENAI_FUNCTIONS, system_message=system_message, agent_kwargs={ "system_message": system_message, }, verbose=False, ) ``` #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:54:00 -07:00
georgian	e64bafed3a	Fixes typo in Vectara.similarity_search (#6277 ) Fixes a simple typo. @hwchase17 @dev2049 Co-authored-by: Georgian Sarghi <georgian.sarghi@gmail.com>	2023-06-18 17:48:54 -07:00
Ted	112695e4da	Iterate through filtered file types instead of all listed files (#6258 ) # Iterate through filtered file types instead of all listed files Fixes https://github.com/hwchase17/langchain/issues/6257 https://github.com/hwchase17/langchain/pull/4926 originally added the functionality to filter by file type, storing the filtered files in `_files` https://github.com/hwchase17/langchain/pull/5220 removed the functionality when adding code to filter trashed files by using the `files` variables instead of the `_files` variable. This PR simply adds the functionality back by using `_files` again. #### Who can review? @hwchase17 - project lead @eyurtsev	2023-06-18 17:47:58 -07:00
Dhruvil Shah	ba90e3c990	Update web_base.ipynb for guiding purposes (#6248 ) To bypass SSL verification errors during fetching, you can include the `verify=False` parameter. This markdown proves useful, especially for beginners in the field of web scraping. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes #6079 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:10 -07:00
Dhruvil Shah	92f05a67a4	Add markdown to specify important arguments (#6246 ) To bypass SSL verification errors during web scraping, you can include the ssl_verify=False parameter along with the headers parameter. This combination of arguments proves useful, especially for beginners in the field of web scraping. <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes #1829 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 @eyurtsev --> --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:47:00 -07:00
ikebo	ca7a44d024	add max_context_size property in BaseOpenAI (#6239 ) Hi, I make a small improvement for BaseOpenAI. I added a max_context_size attribute to BaseOpenAI so that we can get the max context size directly instead of only getting the maximum token size of the prompt through the max_tokens_for_prompt method. Who can review? @hwchase17 @agola11 I followed the [Common Tasks](`c7db9febb0/.github/CONTRIBUTING.md`), the test is all passed. --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>	2023-06-18 17:46:35 -07:00
Jan Pawellek	3e3ed8c5c9	Fix LLM types so that they can be loaded from config dicts (#6235 ) LLM configurations can be loaded from a Python dict (or JSON file deserialized as dict) using the [load_llm_from_config](`8e1a7a8646/langchain/llms/loading.py (L12)`) function. However, the type string in the `type_to_cls_dict` lookup dict differs from the type string defined in some LLM classes. This means that the LLM object can be saved, but not loaded again, because the type strings differ.	2023-06-18 17:46:22 -07:00
Shu	46782ad79b	Fixed an unhandled error that was raised when DynamoDB did not have any chat history. (#6141 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> The current version of chat history with DynamoDB doesn't handle the case correctly when a table has no chat history. This change solves this error handling. <!-- Remove if not applicable --> Fixes https://github.com/hwchase17/langchain/issues/6088 #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:39:19 -07:00
Cameron Vetter	2286204354	Correct AzureSearch Vector Store not applying search_kwargs when searching (#6132 ) Fixes #6131 Simply passes kwargs forward from similarity_search to helper functions so that search_kwargs are applied to search as originally intended. See bug for repro steps. #### Who can review? @hwchase17 @dev2049 Twitter: poshporcupine	2023-06-18 17:39:06 -07:00
Pierre Dulac	395a2a3724	Fix typo in the CAI critique prompt (#6123 ) Very small typo in the Constitutional AI critique default prompt. The negation "If there is no material critique of ..." is used two times, should be used only on the first one. Cheers, Pierre	2023-06-18 17:38:56 -07:00
Hao Chen	38057f0d2e	Fix latest clickhouse vector schema change (#6385 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> Fixes https://github.com/hwchase17/langchain/issues/6208 <!-- Remove if not applicable --> Fixes # (issue) #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 --> VectorStores / Retrievers / Memory - @dev2049	2023-06-18 17:34:53 -07:00
Davit Buniatyan	1ab9dc8293	[hotfix] Deep Lake fails on newer version due to hardcode (#6383 ) Hot Fixes for Deep Lake [would highly appreciate expedited review] * deeplake version was hardcoded and since deeplake upgraded the integration fails with confusing error * an additional integration test fixed due to embedding function * Additionally fixed docs for code understanding links after docs upgraded * notebook removal of public parameter to make sure code understanding notebook works #### Who can review? @hwchase17 @dev2049 --------- Co-authored-by: Davit Buniatyan <d@activeloop.ai>	2023-06-18 17:33:49 -07:00
hp0404	6aa7b04f79	Fix integration tests for Faiss vector store (#6281 ) Fixes #5807 (issue) #### Who can review? Tag maintainers/contributors who might be interested: @dev2049 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:25:49 -07:00
Chakib Benziane	ddd518a161	searx_search: updated tools and doc (#6276 ) - Allows using the same wrapper to create multiple tools ```python wrapper = SearxSearchWrapper(searx_host="**") github_tool = SearxSearchResults(name="Github", wrapper=wrapper, kwargs = { "engines": ["github"], }) arxiv_tool = SearxSearchResults(name="Arxiv", wrapper=wrapper, kwargs = { "engines": ["arxiv"] }) ``` - Updated link to searx documentation Agents / Tools / Toolkits - @hwchase17	2023-06-18 17:23:12 -07:00
ju-bezdek	e2f36ee608	OpenAI functions dont work with async streaming... #6225 (#6226 ) Related to this https://github.com/hwchase17/langchain/issues/6225 Just copied the implementation from `generate` function to `agenerate` and tested it. Didn't run any official tests thought <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes #6225 #### Before submitting <!-- If you're adding a new integration, please include: 1. a test for the integration - favor unit tests that does not rely on network access. 2. an example notebook showing its use See contribution guidelines for more information on how to write tests, lint etc: https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md --> #### Who can review? Tag maintainers/contributors who might be interested: @hwchase17, @agola11 <!-- For a quicker response, figure out the right person to tag with @ @hwchase17 - project lead Tracing / Callbacks - @agola11 Async - @agola11 DataLoaders - @eyurtsev Models - @hwchase17 - @agola11 Agents / Tools / Toolkits - @hwchase17 VectorStores / Retrievers / Memory - @dev2049 -->	2023-06-18 17:05:16 -07:00
Jan Pawellek	ea6a5b03e0	Fix output final text for HuggingFaceTextGenInference when streaming (#6211 ) The LLM integration [HuggingFaceTextGenInference](https://github.com/hwchase17/langchain/blob/master/langchain/llms/huggingface_text_gen_inference.py) already has streaming support. However, when streaming is enabled, it always returns an empty string as the final output text when the LLM is finished. This is because `text` is instantiated with an empty string and never updated. This PR fixes the collection of the final output text by concatenating new tokens.	2023-06-18 17:01:15 -07:00
Tomaz Bratanic	b3bccabc66	Add option to save/load graph cypher QA (#6219 ) Similar as https://github.com/hwchase17/langchain/pull/5818 Added the functionality to save/load Graph Cypher QA Chain due to a user reporting the following error > raise NotImplementedError("Saving not supported for this chain type.")\nNotImplementedError: Saving not supported for this chain type.\n'	2023-06-18 17:00:27 -07:00
Harrison Chase	495128ba95	Harrison/functions docs improvements (#6389 ) Co-authored-by: Sumanth Donthula <46747610+sumanthdonthula@users.noreply.github.com>	2023-06-18 16:57:33 -07:00
Leonid Ganeline	c7ca350cd3	Fix class promotion (#6187 ) In LangChain, all module classes are enumerated in the `__init__.py` file of the correspondent module. But some classes were missed and were not included in the module `__init__.py` This PR: - added the missed classes to the module `__init__.py` files - `__init__.py:__all_` variable value (a list of the class names) was sorted - `langchain.tools.sql_database.tool.QueryCheckerTool` was renamed into the `QuerySQLCheckerTool` because it conflicted with `langchain.tools.spark_sql.tool.QueryCheckerTool` - changes to `pyproject.toml`: - added `pgvector` to `pyproject.toml:extended_testing` - added `pandas` to `pyproject.toml:[tool.poetry.group.test.dependencies]` - commented out the `streamlit` from `collbacks/__init__.py`, It is because now the `streamlit` requires Python >=3.7, !=3.9.7 - fixed duplicate names in `tools` - fixed correspondent ut-s #### Who can review? @hwchase17 @dev2049	2023-06-18 16:55:18 -07:00
Harrison Chase	c0c2fd0782	Harrison/zep mem (#6388 ) Co-authored-by: Daniel Chalef <131175+danielchalef@users.noreply.github.com>	2023-06-18 16:53:35 -07:00
Harrison Chase	b7159c15cc	Harrison/metaphor search fix (#6387 ) Co-authored-by: jeffzwang <jeffreyzhiyuanwang@gmail.com>	2023-06-18 16:53:24 -07:00
Harrison Chase	9bf5b0defa	Harrison/myscale self query (#6376 ) Co-authored-by: Fangrui Liu <fangruil@moqi.ai> Co-authored-by: 刘方瑞 <fangrui.liu@outlook.com> Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>	2023-06-18 16:53:10 -07:00
Harrison Chase	bd8d418a95	Merge branch 'master' of github.com:hwchase17/langchain	2023-06-18 16:45:49 -07:00
Harrison Chase	3a75d59c3d	searx - docs	2023-06-18 16:45:42 -07:00
MIDORIBIN	5be465bd86	Fixed PermissionError on windows (#6170 ) Fixed PermissionError that occurred when downloading PDF files via http in BasePDFLoader on windows. When downloading PDF files via http in BasePDFLoader, NamedTemporaryFile is used. This function cannot open the file again on Windows.[Python Doc](https://docs.python.org/3.9/library/tempfile.html#tempfile.NamedTemporaryFile) So, we created a temporary directory with TemporaryDirectory and placed the downloaded file there. temporary directory is deleted in the deconstruct. Fixes #2698 #### Who can review? Tag maintainers/contributors who might be interested: - @eyurtsev - @hwchase17	2023-06-18 16:39:57 -07:00
xleven	4fc7939848	fix link of callbacks on modules page (#6323 ) Since [Callbacks](https://python.langchain.com/docs/modules/callbacks/getting_started/) on [Modules](https://python.langchain.com/docs/modules/) went to a "Page Not Found".	2023-06-18 15:08:12 -07:00
Vijay	2b3b4e0f60	Add the ability to run the map_reduce chains process results step as async (#6181 ) This will add the ability to add an AsyncCallbackManager (handler) for the reducer chain, which would be able to stream the tokens via the `async def on_llm_new_token` callback method Fixes # (issue) [5532](https://github.com/hwchase17/langchain/issues/5532) @hwchase17 @agola11 The following code snippet explains how this change would be used to enable `reduce_llm` with streaming support in a `map_reduce` chain I have tested this change and it works for the streaming use-case of reducer responses. I am happy to share more information if this makes solution sense. ``` AsyncHandler .......................... class StreamingLLMCallbackHandler(AsyncCallbackHandler): """Callback handler for streaming LLM responses.""" def __init__(self, websocket): self.websocket = websocket # This callback method is to be executed in async async def on_llm_new_token(self, token: str, **kwargs: Any) -> None: resp = ChatResponse(sender="bot", message=token, type="stream") await self.websocket.send_json(resp.dict()) Chain .......... stream_handler = StreamingLLMCallbackHandler(websocket) stream_manager = AsyncCallbackManager([stream_handler]) streaming_llm = ChatOpenAI( streaming=True, callback_manager=stream_manager, verbose=False, temperature=0, ) main_llm = OpenAI( temperature=0, verbose=False, ) doc_chain = load_qa_chain( llm=main_llm, reduce_llm=streaming_llm, chain_type="map_reduce", callback_manager=manager ) qa_chain = ConversationalRetrievalChain( retriever=vectorstore.as_retriever(), combine_docs_chain=doc_chain, question_generator=question_generator, callback_manager=manager, ) # Here `acall` will trigger `acombine_docs` on `map_reduce` which should then call `_aprocess_result` which in turn will call `self.combine_document_chain.arun` hence async callback will be awaited result = await qa_chain.acall( {"question": question, "chat_history": chat_history} ) ```	2023-06-18 13:19:56 -07:00
Alvaro Bartolome	e0dea577ee	Extend `ArgillaCallbackHandler` support (#6153 ) Hi again @agola11! 🤗 ## What's in this PR? After playing around with different chains we noticed that some chains were using different `output_key`s and we were just handling some, so we've extended the support to any output, either if it's a Python list or a string. Kudos to @dvsrepo for spotting this! --------- Co-authored-by: Daniel Vila Suero <daniel@argilla.io>	2023-06-18 11:18:33 -07:00
Harrison Chase	a8cb9ee013	Harrison/gdrive enhancements (#6375 ) Co-authored-by: Matt Robinson <mrobinson@unstructuredai.io>	2023-06-18 11:07:23 -07:00
rafael	ebfffaa38f	Guardrails output parser: Pass LLM api for reasking (#6089 ) <!-- Thank you for contributing to LangChain! Your PR will appear in our release under the title you set. Please make sure it highlights your valuable contribution. Replace this with a description of the change, the issue it fixes (if applicable), and relevant context. List any dependencies required for this change. After you're done, someone will review your PR. They may suggest improvements. If no one reviews your PR within a few days, feel free to @-mention the same people again, as notifications can get lost. Finally, we'd love to show appreciation for your contribution - if you'd like us to shout you out on Twitter, please also include your handle! --> <!-- Remove if not applicable --> Fixes https://github.com/ShreyaR/guardrails/issues/155 Enables guardrails reasking by specifying an LLM api in the output parser.	2023-06-18 10:50:20 -07:00
Davis Chase	ec850e607f	bump 203 (#6372 ) v0.0.203	2023-06-18 09:20:47 -07:00
Lance Martin	370becdfc2	Add self query retriever example with MD header splitting (#6359 ) Flesh out the notebook example for `MarkdownHeaderTextSplitter`	2023-06-17 21:40:20 -07:00
Lance Martin	2c97fbabbd	Update MD header text splitter notebook (#6339 ) Highlight use case for maintaining header groups when splitting.	2023-06-17 13:19:27 -07:00
Harrison Chase	a2bbe3dda4	Harrison/mmr support for opensearch (#6349 ) Co-authored-by: Mehmet Öner Yalçın <oneryalcin@gmail.com>	2023-06-17 12:22:37 -07:00
Davis Chase	2eea5d4cb4	Add ignore vercel preview script (#6320 ) skip building preview of docs for anything branch that doesn't start with `__docs__`. will eventually update to look at code diff directories but patching for now	2023-06-17 11:17:08 -07:00

... 2 3 4 5 6 ...

2739 Commits