langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-09-01 11:02:37 +00:00

Author	SHA1	Message	Date
Zapiron	0c11aee486	docs: small grammar changes for conceptual guide page (#28765 ) Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-17 14:27:55 +00:00
German Martin	3a1d05394d	community: Apache AGE wrapper. Ensure Node Uniqueness by ID. (#28759 ) Description: The Apache AGE graph integration incorrectly handled node merging, allowing duplicate nodes with different IDs but the same type and other properties. Unlike [Neo4j](`cdf6202156/libs/community/langchain_community/graphs/neo4j_graph.py (L47)`), [Memgraph](`cdf6202156/libs/community/langchain_community/graphs/memgraph_graph.py (L50)`), [Kuzu](`cdf6202156/libs/community/langchain_community/graphs/kuzu_graph.py (L253)`), and [Gremlin](`cdf6202156/libs/community/langchain_community/graphs/gremlin_graph.py (L165)`), it did not use the node ID as the primary identifier for merging. This inconsistency caused data integrity issues and unexpected behavior when users expected updates to specific nodes by ID. Solution: This PR modifies the `node_insert_query` to `MERGE` nodes based on label and ID only and updates properties with `SET`, aligning the behavior with other graph database integrations. The `_format_properties` method was also modified to handle id overrides. Impact: This fix ensures data integrity by preventing duplicate nodes, and provides a consistent behavior across graph database integrations.	2024-12-17 09:21:59 -05:00
gsa9989	cdf6202156	cosmosdbnosql: Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyter notebook (#24424 ) * Added Cosmos DB NoSQL Semantic Cache Integration with tests and jupyter notebook --------- Co-authored-by: Aayush Kataria <aayushkataria3011@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 21:57:05 -05:00
Brian Burgin	27a9056725	community: Fix ChatLiteLLMRouter runtime issues (#28163 ) Description: Fix ChatLiteLLMRouter ctor validation and model_name parameter Issue: #19356, #27455, #28077 Twitter handle: @bburgin_0	2024-12-16 18:17:39 -05:00
Eugene Yurtsev	234d49653a	docs: Create custom embeddings (#20398 ) Guidelines on how to create custom embeddings --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 17:57:51 -05:00
Mikhail Khludnev	00deacc67e	docs, external: introduce `langchain-localai` (#28751 ) Thank you for contributing to LangChain! Referring to https://github.com/mkhludnev/langchain-localai --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 22:22:37 +00:00
Erick Friis	d4b5e7ef22	community: recommend RedisVectorStore over Redis (#28749 )	2024-12-16 21:08:30 +00:00
Hiros	8f5e72de05	community: Correctly handle multi-element rich text (#25762 ) Description: - Add _concatenate_rich_text method to combine all elements in rich text arrays - Update load_page method to use _concatenate_rich_text for rich text properties - Ensure all text content is captured, including inline code and formatted text - Add unit tests to verify correct handling of multi-element rich text This fix prevents truncation of content after backticks or other formatting elements. Issue: Using Notion DB Loader, the text for `richtext` and `title` is truncated after 1st element was loaded as Notion Loader only read the first element. Dependencies: any dependencies required for this change None. --------- Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 20:20:27 +00:00
Antonio Lanza	b2102b8cc4	text-splitters: Inconsistent results with `NLTKTextSplitter`'s `add_start_index=True` (#27782 ) This PR closes #27781 # Problem The current implementation of `NLTKTextSplitter` is using `sent_tokenize`. However, this `sent_tokenize` doesn't handle chars between 2 tokenized sentences... hence, this behavior throws errors when we are using `add_start_index=True`, as described in issue #27781. In particular: ```python from nltk.tokenize import sent_tokenize output1 = sent_tokenize("Innovation drives our success. Collaboration fosters creative solutions. Efficiency enhances data management.", language="english") print(output1) output2 = sent_tokenize("Innovation drives our success. Collaboration fosters creative solutions. Efficiency enhances data management.", language="english") print(output2) >>> ['Innovation drives our success.', 'Collaboration fosters creative solutions.', 'Efficiency enhances data management.'] >>> ['Innovation drives our success.', 'Collaboration fosters creative solutions.', 'Efficiency enhances data management.'] ``` # Solution With this new `use_span_tokenize` parameter, we can use NLTK to create sentences (with `span_tokenize`), but also add extra chars to be sure that we still can map the chunks to the original text. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-12-16 19:53:15 +00:00
Tari Yekorogha	d262d41cc0	community: added FalkorDB vector store support i.e implementation, test, docs an… (#26245 ) Description: Added support for FalkorDB Vector Store, including its implementation, unit tests, documentation, and an example notebook. The FalkorDB integration allows users to efficiently manage and query embeddings in a vector database, with relevance scoring and maximal marginal relevance search. The following components were implemented: - Core implementation for FalkorDBVector store. - Unit tests ensuring proper functionality and edge case coverage. - Example notebook demonstrating an end-to-end setup, search, and retrieval using FalkorDB. Twitter handle: @tariyekorogha --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 19:37:55 +00:00
Aaron Pham	12fced13f4	chore(community): update to OpenLLM 0.6 (#24609 ) Update to OpenLLM 0.6, which we decides to make use of OpenLLM's OpenAI-compatible endpoint. Thus, OpenLLM will now just become a thin wrapper around OpenAI wrapper. Signed-off-by: Aaron Pham <contact@aarnphm.xyz> --------- Signed-off-by: Aaron Pham <contact@aarnphm.xyz> Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-16 14:30:07 -05:00
Lvlvko	5c17a4ace9	community: support Hunyuan Embedding (#23160 ) ## description - I refactor `Chathunyuan` using tencentcloud sdk because I found the original one can't work in my application - I add `HunyuanEmbeddings` using tencentcloud sdk - Both of them are extend the basic class of langchain. I have fully tested them in my application ## Dependencies - tencentcloud-sdk-python --------- Co-authored-by: centonhuang <centonhuang@tencent.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 19:27:19 +00:00
Harrison Chase	de7996c2ca	core: add kwargs support to VectorStore (#25934 ) has been missing the passthrough until now --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 18:57:57 +00:00
Henry Tu	87c50f99e5	docs: cerebras Update Llama 3.1 70B to Llama 3.3 70B (#28746 ) This PR updates the docs for the Cerebras integration to use Llama 3.3 70b instead of Llama 3.1 70b. cc: @efriis	2024-12-16 18:43:35 +00:00
Lorenzo	b79a1156ed	community: correct return type of get_files_from_directory in github tool (#27885 ) ### About: - Description: the _get_files_from_directory_ method return a string, but it's used in other methods that expect a List[str] - Issue: None - Dependencies: None This pull request import a new method _list_files_ with the old logic of _get_files_from_directory_, but it return a List[str] at the end. The behavior of _ get_files_from_directory_ is not changed.	2024-12-16 10:30:33 -08:00
Sheepsta300	580a8d53f9	community: Add configurable `VisualFeatures` to the `AzureAiServicesImageAnalysisTool` (#27444 ) Thank you for contributing to LangChain! - [ ] PR title: community: Add configurable `VisualFeatures` to the `AzureAiServicesImageAnalysisTool` - [ ] PR message: - Description: The `AzureAiServicesImageAnalysisTool` is a good service and utilises the Azure AI Vision package under the hood. However, since the creation of this tool, new `VisualFeatures` have been added to allow the user to request other image specific information to be returned. Currently, the tool offers neither configuration of which features should be return nor does it offer any newer feature types. The aim of this PR is to address this and expose more of the Azure Service in this integration. - Dependencies: no new dependencies in the main class file, azure.ai.vision.imageanalysis added to extra test dependencies file. - [ ] Add tests and docs: If you're adding a new integration, please include 1. Although no tests exist for already implemented Azure Service tools, I've created 3 unit tests for this class that test initialisation and credentials, local file analysis and a test for the new changes/ features option. - [ ] Lint and test: All linting has passed. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 18:30:04 +00:00
Erick Friis	1c120e9615	core: xml output parser tags docstring (#28745 )	2024-12-16 18:25:16 +00:00
Ana	ebab2ea81b	Fix Azure National Cloud authentication using token (RBAC) (Generated by Ana - AI SDE) (#25843 ) This pull request addresses the issue with authenticating Azure National Cloud using token (RBAC) in the AzureSearch vectorstore implementation. ## Changes - Modified the `_get_search_client` method in `azuresearch.py` to pass `additional_search_client_options` to the `SearchIndexClient` instance. ## Implementation Details The patch updates the `SearchIndexClient` initialization to include the `additional_search_client_options` parameter: ```python index_client: SearchIndexClient = SearchIndexClient( endpoint=endpoint, credential=credential, user_agent=user_agent, **additional_search_client_options ) ``` This change allows the `audience` parameter to be correctly passed when using Azure National Cloud, fixing the authentication issues with GovCloud & RBAC. This patch was generated by [Ana - AI SDE](https://openana.ai/), an AI-powered software development assistant. This is a fix for [Issue 25823](https://github.com/langchain-ai/langchain/issues/25823) --------- Co-authored-by: ccurme <chester.curme@gmail.com>	2024-12-16 18:22:24 +00:00
chenzimin	169d419581	community: Remove all other keys in ChatLiteLLM and add api_key (#28097 ) Thank you for contributing to LangChain! - PR title: "community: Remove all other keys in ChatLiteLLM and add api_key" - PR message: Currently, no api_key are passed to LiteLLM, and LiteLLM only takes on api_key parameter. Therefore I removed all current `*_api_key` attributes (They are not used), and added `api_key` that is passed to ChatLiteLLM. - Should fix issue #27826 --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 17:54:29 +00:00
German Martin	d5d18c62b3	community: Apache AGE wrapper additional edge cases. (#28151 ) Description: Current AGEGraph() implementation does some custom wrapping for graph queries. The method here is _wrap_query() as it parse the field from the original query to add some SQL context to it. This improves the current parsing logic to cover additional edge cases that are added to the test coverage, basically if any Node property name or value has the "return" literal in it will break the graph / SQL query. We discovered this while dealing with real world datasets, is not an uncommon scenario and I think it needs to be covered.	2024-12-16 11:28:01 -05:00
Rock2z	768e4a7fd4	[community][fix] Compatibility support to bump up wikibase-rest-api-client version (#27316 ) Description: This PR addresses the `TypeError: sequence item 0: expected str instance, FluentValue found` error when invoking `WikidataQueryRun`. The root cause was an incompatible version of the `wikibase-rest-api-client`, which caused the tool to fail when handling `FluentValue` objects instead of strings. The current implementation only supports `wikibase-rest-api-client<0.2`, but the latest version is `0.2.1`, where the current implementation breaks. Additionally, the error message advises users to install the latest version: [code reference](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/utilities/wikidata.py#L125C25-L125C32). Therefore, this PR updates the tool to support the latest version of `wikibase-rest-api-client`. Key changes: - Updated the handling of `FluentValue` objects to ensure compatibility with the latest `wikibase-rest-api-client`. - Removed the restriction to `wikibase-rest-api-client<0.2` and updated to support the latest version (`0.2.1`). Issue: Fixes [#24093](https://github.com/langchain-ai/langchain/issues/24093) – `TypeError: sequence item 0: expected str instance, FluentValue found`. Dependencies: - Upgraded `wikibase-rest-api-client` to the latest version to resolve the issue. --------- Co-authored-by: peiwen_zhang <peiwen_zhang@email.com> Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 16:22:18 +00:00
André Quintino	a26c786bc5	community: refactor opensearch query constructor to use wildcard instead of match in the contain comparator (#26653 ) - Description: Changed the comparator to use a wildcard query instead of match. This modification allows for partial text matching on analyzed fields, which improves the flexibility of the search by performing full-text searches that aren't limited to exact matches. - Issue: The previous implementation used a match query, which performs exact matches on analyzed fields. This approach limited the search capabilities by requiring the query terms to align with the indexed text. The modification to use a wildcard query instead addresses this limitation. The wildcard query allows for partial text matching, which means the search can return results even if only a portion of the term matches the text. This makes the search more flexible and suitable for use cases where exact matches aren't necessary or expected, enabling broader full-text searches across analyzed fields. In short, the problem was that match queries were too restrictive, and the change to wildcard queries enhances the ability to perform partial matches. - Dependencies: none - Twitter handle: @Andre_Q_Pereira --------- Co-authored-by: André Quintino <andre.quintino@tui.com> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 11:16:34 -05:00
Davi Schumacher	0f9b4bf244	community[patch]: update dynamodb chat history to update instead of overwrite (#22397 ) Description: The current implementation of `DynamoDBChatMessageHistory` updates the `History` attribute for a given chat history record by first extracting the existing contents into memory, appending the new message, and then using the `put_item` method to put the record back. This has the effect of overwriting any additional attributes someone may want to include in the record, like chat session metadata. This PR suggests changing from using `put_item` to using `update_item` instead which will keep any other attributes in the record untouched. The change is backward compatible since 1. `update_item` is an "upsert" operation, creating the record if it doesn't already exist, otherwise updating it 2. It only touches the db insert call and passes the exact same information. The rest of the class is left untouched Dependencies: None Tests and docs: No unit tests currently exist for the `DynamoDBChatMessageHistory` class. This PR adds the file `libs/community/tests/unit_tests/chat_message_histories/test_dynamodb_chat_message_history.py` to test the `add_message` and `clear` methods. I wanted to use the moto library to mock DynamoDB calls but I could not get poetry to resolve it so I mocked those calls myself in the test. Therefore, no test dependencies were added. The change was tested on a test DynamoDB table as well. The first three images below show the current behavior. First a message is added to chat history, then a value is inserted in the record in some other attribute, and finally another message is added to the record, destroying the other attribute. ![using_put_1_first_message](https://github.com/langchain-ai/langchain/assets/29493541/426acd62-fe29-42f4-b75f-863fb8b3fb21) ![using_put_2_add_attribute](https://github.com/langchain-ai/langchain/assets/29493541/f8a1c864-7114-4fe3-b487-d6f9252f8f92) ![using_put_3_second_message](https://github.com/langchain-ai/langchain/assets/29493541/8b691e08-755e-4877-8969-0e9769e5d28a) The next three images show the new behavior. Once again a value is added to an attribute other than the History attribute, but now when the followup message is added it does not destroy that other attribute. The History attribute itself is unaffected by this change. ![using_update_1_first_message](https://github.com/langchain-ai/langchain/assets/29493541/3e0d76ed-637e-41cd-82c7-01a86c468634) ![using_update_2_add_attribute](https://github.com/langchain-ai/langchain/assets/29493541/52585f9b-71a2-43f0-9dfc-9935aa59c729) ![using_update_3_second_message](https://github.com/langchain-ai/langchain/assets/29493541/f94c8147-2d6f-407a-9a0f-86b94341abff) The doc located at `docs/docs/integrations/memory/aws_dynamodb.ipynb` required no changes and was tested as well.	2024-12-16 10:38:00 -05:00
Christophe Bornet	6ddd5dbb1e	community: Add FewShotSQLTool (#28232 ) The `FewShotSQLTool` gets some SQL query examples from a `BaseExampleSelector` for a given question. This is useful to provide [few-shot examples](https://python.langchain.com/docs/how_to/sql_prompting/#few-shot-examples) capability to an SQL agent. Example usage: ```python from langchain.agents.agent_toolkits.sql.prompt import SQL_PREFIX embeddings = OpenAIEmbeddings() example_selector = SemanticSimilarityExampleSelector.from_examples( examples, embeddings, AstraDB, k=5, input_keys=["input"], collection_name="lc_few_shots", token=ASTRA_DB_APPLICATION_TOKEN, api_endpoint=ASTRA_DB_API_ENDPOINT, ) few_shot_sql_tool = FewShotSQLTool( example_selector=example_selector, description="Input to this tool is the input question, output is a few SQL query examples related to the input question. Always use this tool before checking the query with sql_db_query_checker!" ) agent = create_sql_agent( llm=llm, db=db, prefix=SQL_PREFIX + "\nYou MUST get some example queries before creating the query.", extra_tools=[few_shot_sql_tool] ) result = agent.invoke({"input": "How many artists are there?"}) ``` --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>	2024-12-16 15:37:21 +00:00
Mohammad Mohtashim	8d746086ab	Added `bind_tools` support for `ChatMLX` along with small fix in `_stream` (#28743 ) - Description: Added Support for `bind_tool` as requested in the issue. Plus two issue in `_stream` were fixed: - Corrected the Positional Argument Passing for `generate_step` - Accountability if `token` returned by `generate_step` is integer. - Issue: #28692	2024-12-16 09:52:49 -05:00
Jorge Piedrahita Ortiz	558b65ea32	community: SamabaStudio Tool Calling and Structured Output (#28025 ) Description: Add tool calling and structured output support for SambaStudio chat models, docs included --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 06:15:19 +00:00
Hristiyan Genchev	c87f24d85d	docs: correct variable name from formatted_docs to docs_content (#28735 ) - Description: Fixed incorrect variable name formatted_docs, replacing it with docs_content to ensure correct functionality.	2024-12-15 22:09:58 -08:00
clairebehue	fb44e74ca4	community: fix AzureSearch Oauth with azure_ad_access_token (#26995 ) Description: AzureSearch vector store: create a wrapper class on `azure.core.credentials.TokenCredential` (which is not-instantiable) to fix Oauth usage with `azure_ad_access_token` argument Issue: [the issue it fixes](https://github.com/langchain-ai/langchain/issues/26216) Dependencies: None - [x] Lint and test --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 05:56:45 +00:00
SirSmokeAlot	29305cd948	community: O365Toolkit - send_event - fixed timezone error (#25876 ) Description: Fixed formatting start and end time Issue: The old formatting resulted everytime in an timezone error Dependencies: / Twitter handle: / --------- Co-authored-by: Yannick Opitz <yannick.opitz@gob.de> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-16 05:32:28 +00:00
Erick Friis	4f6ccb7080	text-splitters: extended-tests without socket (#28736 )	2024-12-16 05:19:50 +00:00
Erick Friis	8ec1c72e03	text-splitters: test without socket (#28732 )	2024-12-15 22:10:35 +00:00
Tibor Reiss	690aa02c31	docs[experimental]: Make docs clearer and add min_chunk_size (#26398 ) Fixes #26171: - added some clarification text for the keyword argument `breakpoint_threshold_amount` - added min_chunk_size: together with `breakpoint_threshold_amount`, too small/big chunk sizes can be avoided Note: the langchain-experimental was moved to a separate repo, so only the doc change stays here.	2024-12-15 13:43:48 -08:00
Aayush Kataria	d417e4b372	Community: Azure CosmosDB No Sql Vector Store: Full Text and Hybrid Search Support (#28716 ) Thank you for contributing to LangChain! - Added [full text](https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/full-text-search) and [hybrid search](https://learn.microsoft.com/en-us/azure/cosmos-db/gen-ai/hybrid-search) support for Azure CosmosDB NoSql Vector Store - Added a new enum called CosmosDBQueryType which supports the following values: - VECTOR = "vector" - FULL_TEXT_SEARCH = "full_text_search" - FULL_TEXT_RANK = "full_text_rank" - HYBRID = "hybrid" - User now needs to provide this query_type to the similarity_search method for the vectorStore to make the correct query api call. - Added a couple of work arounds as for the FULL_TEXT_RANK and HYBRID query functions we don't support parameterized queries right now. I have added TODO's in place, and will remove these work arounds by end of January. - Added necessary test cases and updated the - [x] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Erick Friis <erickfriis@gmail.com>	2024-12-15 13:26:32 -08:00
Mohammad Mohtashim	4c1871d9a8	community: Passing the `model_kwargs` correctly while maintaing backward compatability (#28439 ) - Description: `Model_Kwargs` was not being passed correctly to `sentence_transformers.SentenceTransformer` which has been corrected while maintaing backward compatability - Issue: #28436 --------- Co-authored-by: MoosaTae <sadhis.tae@gmail.com> Co-authored-by: Sadit Wongprayon <101176694+MoosaTae@users.noreply.github.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-15 20:34:29 +00:00
nhols	a3851cb3bc	community: FAISS vectorstore - consistent Document id field (#28728 ) make sure id field of Documents in `FAISS` docstore have the same id as values in `index_to_docstore_id`, implement `get_by_ids` method	2024-12-15 12:23:49 -08:00
Bagatur	a0534ae62a	community[patch]: Release 0.3.12 (#28725 ) langchain-community==0.3.12	2024-12-14 22:13:20 +00:00
Bagatur	089e659e03	langchain[patch]: Release 0.3.12 (#28724 ) langchain==0.3.12	2024-12-14 20:02:18 +00:00
Bagatur	679e3a9970	text-splitters[patch]: Release 0.3.3 (#28723 ) langchain-text-splitters==0.3.3	2024-12-14 19:20:22 +00:00
ccurme	23b433f683	infra: fix notebook tests (#28722 ) Bump unstructured to pick up resolution of https://github.com/Unstructured-IO/unstructured/issues/3795	2024-12-14 15:13:19 +00:00
Erick Friis	387284c259	core: release 0.3.25 (#28718 ) langchain-core==0.3.25	2024-12-14 02:22:28 +00:00
Nawaf Alharbi	decd77c515	community: fix an issue with deepinfra integration (#28715 ) Thank you for contributing to LangChain! - [x] PR title: langchain: add URL parameter to ChatDeepInfra class - [x] PR message: add URL parameter to ChatDeepInfra class - Description: This PR introduces a url parameter to the ChatDeepInfra class in LangChain, allowing users to specify a custom URL. Previously, the URL for the DeepInfra API was hardcoded to "https://stage.api.deepinfra.com/v1/openai/chat/completions", which caused issues when the staging endpoint was not functional. The _url method was updated to return the value from the url parameter, enabling greater flexibility and addressing the problem. out! --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-14 02:15:29 +00:00
Ben Chambers	008efada2c	[community]: Render documents to graphviz (#24830 ) - Description: Adds a helper that renders documents with the GraphVectorStore metadata fields to Graphviz for visualization. This is helpful for understanding and debugging. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-14 02:02:09 +00:00
Leonid Ganeline	fc8006121f	docs: integrations W&B update (#28059 ) Issue: Here is an ambiguity about W&B integrations. There are two existing provider pages. Fix: Added the "root" W&B provider page. Added there the references to the documentation in the W&B site. Cleaned up formats in existing pages. Added one more integration reference. --------- Co-authored-by: Erick Friis <erick@langchain.dev> Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>	2024-12-14 00:47:16 +00:00
Erick Friis	288f204758	docs, community: aerospike docs update (#28717 ) Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com> Co-authored-by: Jesse S <jschmidt@aerospike.com> Co-authored-by: dylan <dwelch@aerospike.com>	2024-12-14 00:27:37 +00:00
ronidas39	bd008baee0	docs: Update additional_resources/tutorials.mdx (#28005 ) Added Langchain complete tutorial playlist from total technology zonne channel .In this playlist every video is focusing one specific use case and hands on demo.All tutorials are equally good for every levels . Thank you for contributing to LangChain! - [ ] PR title: "package: description" - Where "package" is whichever of langchain, community, core, etc. is being modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI changes. - Example: "community: add foobar LLM" - [ ] PR message: *Delete this entire checklist* and replace with - Description: a description of the change - Issue: the issue # it fixes, if applicable - Dependencies: any dependencies required for this change - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [ ] Add tests and docs: If you're adding a new integration, please include 1. a test for the integration, preferably unit tests that do not rely on network access, 2. an example notebook showing its use. It lives in `docs/docs/integrations` directory. - [ ] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Erick Friis <erickfriis@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-14 00:21:30 +00:00
Vimpas	337fed80a5	community: 🐛 PDF Filter Type Error (#27154 ) Thank you for contributing to LangChain! PR title: "community: fix PDF Filter Type Error" - Description: fix PDF Filter Type Error" - Issue: the issue #27153 it fixes, - Dependencies: no - Twitter handle: if your PR gets announced, and you'd like a mention, we'll gladly shout you out! - [x] Lint and test: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/ Additional guidelines: - Make sure optional dependencies are imported within a function. - Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests. - Most PRs should not touch more than one package. - Changes should be backwards compatible. - If you are adding something to community, do not re-import it in langchain. If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-13 23:30:29 +00:00
Ryan Parker	12111cb922	community: fallback on core async atransform_documents method for `MarkdownifyTransformer` (#27866 ) # Description Implements the `atransform_documents` method for `MarkdownifyTransformer` using the `asyncio` built-in library for concurrency. Note that this is mainly for API completeness when working with async frameworks rather than for performance, since the `markdownify` function is not I/O bound because it works with `Document` objects already in memory. # Issue Fixes #27865 # Dependencies No new dependencies added, but [`markdownify`](https://github.com/matthewwithanm/python-markdownify) is required since this PR updates the `markdownify` integration. # Tests and docs - Tests added - I did not modify the docstrings since they already described the basic functionality, and [the API docs also already included a description](https://python.langchain.com/api_reference/community/document_transformers/langchain_community.document_transformers.markdownify.MarkdownifyTransformer.html#langchain_community.document_transformers.markdownify.MarkdownifyTransformer.atransform_documents). If it would be helpful, I would be happy to update the docstrings and/or the API docs. # Lint and test - [x] format - [x] lint - [x] test I ran formatting with `make format`, linting with `make lint`, and confirmed that tests pass using `make test`. Note that some unit tests pass in CI but may fail when running `make_test`. Those unit tests are: - `test_extract_html` (and `test_extract_html_async`) - `test_strip_tags` (and `test_strip_tags_async`) - `test_convert_tags` (and `test_convert_tags_async`) The reason for the difference is that there are trailing spaces when the tests are run in the CI checks, and no trailing spaces when run with `make test`. I ensured that the tests pass in CI, but they may fail with `make test` due to the addition of trailing spaces. --------- Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-13 22:32:22 +00:00
Manuel	af2e0a7ede	partners: add 'model' alias for consistency in embedding classes (#28374 ) Description: This PR introduces a `model` alias for the embedding classes that contain the attribute `model_name`, to ensure consistency across the codebase, as suggested by a moderator in a previous PR. The change aligns the usage of attribute names across the project (see for example [here](`65deeddd5d/libs/partners/groq/langchain_groq/chat_models.py (L304)`)). Issue: This PR addresses the suggestion from the review of issue #28269. Dependencies: None --------- Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com> Co-authored-by: Erick Friis <erick@langchain.dev>	2024-12-13 22:30:00 +00:00
Erick Friis	3107d78517	huggingface: fix standard test lint (#28714 )	2024-12-13 22:18:54 +00:00
Kaiwei Zhang	b909d54e70	chroma[patch]: Update logic for assigning ids	2024-12-13 21:58:34 +00:00

1 2 3 4 5 ...

12236 Commits