Commit Graph

7993 Commits

Author SHA1 Message Date
Averi Kitsch
9a3f54c89b docs: update Google Cloud database integration docs (#18711)
**Description:** update Google Cloud database integration docs
 **Issue:** NA
**Dependencies:** NA
2024-04-25 17:39:09 -07:00
Tomaz Bratanic
59d3788968 docs: Fix diffbot graph transformer description (#18736)
The previous docstring was invalid
2024-04-25 17:39:09 -07:00
Jan Nissen
2a9d2937b8 core[patch]: improve PydanticOutputParser typing (#18740)
This PR adds generic typing to `PydanticOutputParser` so we get a typed
output from `.parse` instead of `Any`. It should provide a better DX by
way of Intellisense and for anyone strictly typing.

Pre-change:

![Screenshot 2024-03-07 at 10 22
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/fd22dde0-9fdc-4283-b283-4c98f0bc46e5)

Post-change:

![Screenshot 2024-03-07 at 10 26
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/7e23d2b7-8f8c-494f-80b3-187530a173ee)

I haven't dug too deep, but I think a similar change could probably be
added to `JsonOutputParser` so we don't have to pull up `.parse`.

Co-authored-by: Jan Nissen <jan23@gmail.com>
2024-04-25 17:39:08 -07:00
Massimiliano Pronesti
504d8f5f1d experimental[minor]: add support for modin in pandas agent (#18749)
Added support for Intel's
[modin](https://github.com/modin-project/modin) in
`create_pandas_dataframe_agent`.
2024-04-25 17:39:08 -07:00
Tomaz Bratanic
23c0c2c0df comunity[patch]: Fix neo4j sanitizing values (#18750)
Fixing sanitization for when deeply nested lists appear
2024-04-25 17:39:08 -07:00
Ian
c909582a99 docs: Improve the tidb vector store notebook (#18773)
Remove redundant useless content, and fix some minor oversight
2024-04-25 17:39:08 -07:00
Eugene Yurtsev
0fbc89cf18 core[patch]: Automatic upgrade to AddableDict in transform and atransform (#18743)
Automatic upgrade to transform and atransform

Closes: 

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
2024-04-25 17:39:08 -07:00
Yunmo Koo
527cf9db7d community[minor]: Integration for Friendli LLM and ChatFriendli ChatModel. (#17913)
## Description
- Add [Friendli](https://friendli.ai/) integration for `Friendli` LLM
and `ChatFriendli` chat model.
- Unit tests and integration tests corresponding to this change are
added.
- Documentations corresponding to this change are added.

## Dependencies
- Optional dependency
[`friendli-client`](https://pypi.org/project/friendli-client/) package
is added only for those who use `Frienldi` or `ChatFriendli` model.

## Twitter handle
- https://twitter.com/friendliai
2024-04-25 17:39:08 -07:00
Smit Parmar
c227f9c08e community[patch]: Added support for filter out AWS Kendra search by score confidence (#12920)
**Description:** It will add support for filter out kendra search by
score confidence which will make result more accurate.
    For example
   ```
retriever = AmazonKendraRetriever(
        index_id=kendra_index_id, top_k=5, region_name=region,
        score_confidence="HIGH"
    )
```
Result will not include the records which has score confidence "LOW" or "MEDIUM". 
Relevant docs 
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/query.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/retrieve.html

 **Issue:** the issue # it resolve #11801 
**twitter:** [@SmitCode](https://twitter.com/SmitCode)
2024-04-25 17:39:08 -07:00
Ian
5165253297 community[minor]: Add Initial Support for TiDB Vector Store (#15796)
This pull request introduces initial support for the TiDB vector store.
The current version is basic, laying the foundation for the vector store
integration. While this implementation provides the essential features,
we plan to expand and improve the TiDB vector store support with
additional enhancements in future updates.

Upcoming Enhancements:
* Support for Vector Index Creation: To enhance the efficiency and
performance of the vector store.
* Support for max marginal relevance search. 
* Customized Table Structure Support: Recognizing the need for
flexibility, we plan for more tailored and efficient data store
solutions.

Simple use case exmaple

```python
from typing import List, Tuple
from langchain.docstore.document import Document
from langchain_community.vectorstores import TiDBVectorStore
from langchain_openai import OpenAIEmbeddings

db = TiDBVectorStore.from_texts(
    embedding=embeddings,
    texts=['Andrew like eating oranges', 'Alexandra is from England', 'Ketanji Brown Jackson is a judge'],
    table_name="tidb_vector_langchain",
    connection_string=tidb_connection_url,
    distance_strategy="cosine",
)

query = "Can you tell me about Alexandra?"
docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)
```
2024-04-25 17:39:08 -07:00
Bagatur
c5d9d5755b community[patch]: chat hf typing fix (#18693) 2024-04-25 17:39:08 -07:00
Eugene Yurtsev
853b6f9431 Docs: remove sales from security (#18762)
Remove sales from security
2024-04-25 17:39:08 -07:00
Jib
77c92cbb6b langchain-mongodb: Standardize mongodb collection/index names in tests (#18755)
## **Description:**
MongoDB integration tests link to a provided Atlas Cluster. We have very
stringent permissions set against the cluster provided. In order to make
it easier to track and isolate the collections each test gets run
against, we've updated the collection names to map the test file name.
i.e. `langchain_{filename}` => `langchain_test_vectorstores`

Fixes integration test results

![image](https://github.com/langchain-ai/langchain/assets/2887713/41f911b9-55f7-4fe4-9134-5514b82009f9)

## **Dependencies:** 
Provided MONGODB_ATLAS_URI

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

cc: @shaneharvey, @blink1073 , @NoahStapp , @caseyclements
2024-04-25 17:39:08 -07:00
Eugene Yurtsev
a29bbd05f2 Docs: Add custom parsing documentation and extending langchain (#18331)
* Added extending langchain.mdx -- we'll need to add links as we add
more custom documentation
* Added partial documentation about parsers
2024-04-25 17:39:08 -07:00
Eugene Yurtsev
d8bcd4f75a core: upgrade mypy to recent mypy (#18753)
Testing this works per package on CI
2024-04-25 17:39:08 -07:00
Eugene Yurtsev
114bced635 Add dangerous parameter to requests tool (#18697)
The tools are already documented as dangerous. Not clear whether adding
an opt-in parameter is necessary or not
2024-04-25 17:39:08 -07:00
Leonid Ganeline
08ea451fd4 docs: update imports of adapters to use langchain_community (#18751)
Updated imports from `langchain` to `langchain_community`
2024-04-25 17:39:08 -07:00
Erick Friis
92f568f479 community[patch]: deprecate community anthropic (#18745) 2024-04-25 17:39:08 -07:00
Erick Friis
d0ee993ac5 community[patch]: move pdf text tests to integration (#18746) 2024-04-25 17:39:08 -07:00
Christophe Bornet
bbd494a7f4 community: If load() has been overridden, use it in default lazy_load() (#18690) 2024-04-25 17:39:08 -07:00
Christophe Bornet
19163b14c9 community[patch]: Implement lazy_load() for MHTMLLoader (#18648)
Covered by `tests/unit_tests/document_loaders/test_mhtml.py`
2024-04-25 17:39:08 -07:00
axiangcoding
4a871a217f community[patch]: Chroma use uuid4 instead of uuid1 to generate random ids (#18723)
- **Description:** Chroma use uuid4 instead of uuid1 as random ids. Use
uuid1 may leak mac address, changing to uuid4 will not cause other
effects.
  - **Issue:** None
  - **Dependencies:** None
  - **Twitter handle:** None
2024-04-25 17:39:08 -07:00
Leonid Ganeline
38b752f212 docs: update imports of tools to use langchain_community (#18705)
Updated imports from `langchain` to `langchain_community`.
2024-04-25 17:39:08 -07:00
Guangdong Liu
736afd5d38 community[patch]: Fix sparkllm authentication problem. (#18651)
- **Description:** fix sparkllm authentication problem.The current
timestamp is in RFC1123 format. The time deviation must be controlled
within 300s. I changed to re-obtain the url every time I ask a question.
https://www.xfyun.cn/doc/spark/general_url_authentication.html#_1-2-%E9%89%B4%E6%9D%83%E5%8F%82%E6%95%B0
2024-04-25 17:39:08 -07:00
Erick Friis
56217b4faf community[patch]: release 0.0.27 (#18708) 2024-04-25 17:39:08 -07:00
Erick Friis
2be4f2a10a core[patch]: release 0.1.30 (#18706) 2024-04-25 17:39:08 -07:00
Piyush Jain
8e70153c68 Support for claude v3 models. (#18630)
Fixes #18513.

## Description
This PR attempts to fix the support for Anthropic Claude v3 models in
BedrockChat LLM. The changes here has updated the payload to use the
`messages` format instead of the formatted text prompt for all models;
`messages` API is backwards compatible with all models in Anthropic, so
this should not break the experience for any models.


## Notes
The PR in the current form does not support the v3 models for the
non-chat Bedrock LLM. This means, that with these changes, users won't
be able to able to use the v3 models with the Bedrock LLM. I can open a
separate PR to tackle this use-case, the intent here was to get this out
quickly, so users can start using and test the chat LLM. The Bedrock LLM
classes have also grown complex with a lot of conditions to support
various providers and models, and is ripe for a refactor to make future
changes more palatable. This refactor is likely to take longer, and
requires more thorough testing from the community. Credit to PRs
[18579](https://github.com/langchain-ai/langchain/pull/18579) and
[18548](https://github.com/langchain-ai/langchain/pull/18548) for some
of the code here.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-25 17:39:08 -07:00
Sam Khano
d92d46c90d community[minor]: Add DocumentDBVectorSearch VectorStore (#17757)
**Description:**
- Added Amazon DocumentDB Vector Search integration (HNSW index)
- Added integration tests
- Updated AWS documentation with DocumentDB Vector Search instructions
- Added notebook for DocumentDB integration with example usage

---------

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-95-226.ec2.internal>
2024-04-25 17:39:08 -07:00
Vittorio Rigamonti
3f3a90fed4 community[minor]: Adding support for Infinispan as VectorStore (#17861)
**Description:**
This integrates Infinispan as a vectorstore.
Infinispan is an open-source key-value data grid, it can work as single
node as well as distributed.

Vector search is supported since release 15.x 

For more: [Infinispan Home](https://infinispan.org)

Integration tests are provided as well as a demo notebook
2024-04-25 17:39:08 -07:00
Max Jakob
e8ac8fc45c elasticsearch[patch], community[patch]: update references, deprecate community classes (#18506)
Follow up on https://github.com/langchain-ai/langchain/pull/17467.

- Update all references to the Elasticsearch classes to use the partners
package.
- Deprecate community classes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 17:39:08 -07:00
José Luis Di Biase
b219c81170 templates: rag-multi-modal typo, replace serch with search (#18519)
Thank you for contributing to LangChain!

- [x] **PR title**: "templates: rag-multi-modal typo, replace serch with
search "
- **Description:** Two little typos in multi modal templates (replace
serch string with search)

Signed-off-by: José Luis Di Biase <josx@interorganic.com.ar>
2024-04-25 17:39:08 -07:00
Djordje
93ad87dae7 community[patch]: Opensearch delete method added - indexing supported (#18522)
- **Description:** Added delete method for OpenSearchVectorSearch,
therefore indexing supported
    - **Issue:** No
    - **Dependencies:** No
    - **Twitter handle:** stkbmf
2024-04-25 17:39:08 -07:00
Erick Friis
77a6d76861 openai[patch]: unit test azure init (#18703) 2024-04-25 17:39:08 -07:00
Christophe Bornet
8d02fa46cb community: Implement lazy_load() for PlaywrightURLLoader (#18676)
Integration tests:
`tests/integration_tests/document_loaders/test_url_playwright.py`
2024-04-25 17:39:08 -07:00
Aaron Yi
88f725895f community[patch]: make metadata and text optional as expected in DocArray (#18678)
ValidationError: 2 validation errors for DocArrayDoc
text
Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.5/v/missing
metadata
Field required [type=missing, input_value={'embedding': [-0.0191128...9, 0.01005221541175212]}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.5/v/missing
```
In the `_get_doc_cls` method, the `DocArrayDoc` class is defined as
follows:

```python
class DocArrayDoc(BaseDoc):
    text: Optional[str]
    embedding: Optional[NdArray] = Field(**embeddings_params)
    metadata: Optional[dict]
```
2024-04-25 17:39:08 -07:00
Eugene Yurtsev
0f64bc2ce2 community[major]: breaking change in some APIs to force users to opt-in for pickling (#18696)
This is a PR that adds a dangerous load parameter to force users to opt in to use pickle.

This is a PR that's meant to raise user awareness that the pickling module is involved.
2024-04-25 17:39:08 -07:00
Eugene Yurtsev
40baa4f82f community[patch]: Patch tdidf retriever (CVE-2024-2057) (#18695)
This is a patch for `CVE-2024-2057`:
https://www.cve.org/CVERecord?id=CVE-2024-2057

This affects users that: 

* Use the  `TFIDFRetriever`
* Attempt to de-serialize it from an untrusted source that contains a
malicious payload
2024-04-25 17:39:08 -07:00
Leonid Ganeline
2f6b65154e docs: update import paths for callbacks to use langchain_community callbacks where applicable (#18691)
Refactored imports from `langchain` to `langchain_community` whenever it
is applicable
2024-04-25 17:39:08 -07:00
Erick Friis
2edc7e7cd5 mongodb[patch]: release 0.1.1 (#18692) 2024-04-25 17:39:08 -07:00
Leonid Ganeline
914b1fc2ed docs: fix streamlit provider (#18606)
There is a wrong python package import.
Fixed it.
2024-04-25 17:39:08 -07:00
Christophe Bornet
b6edefcaa5 core: Move document loader interfaces to core (#17723)
This is needed to be able to move document loaders to partner packages.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-25 17:39:08 -07:00
aditya thomas
52e41903d9 docs: update to the streaming tutorial notebook in the lcel documentation (#18378)
**Description:** Update to the streaming tutorial notebook in the LCEL
documentation
**Issue:** Fixed an import and (minor) changes in documentation language
**Dependencies:** None
2024-04-25 17:39:08 -07:00
Guangdong Liu
2c6db960f6 docs: Fix some issues with sparkllm use cases (#17674) 2024-04-25 17:39:08 -07:00
Christophe Bornet
e7adc6e0e5 Merge pull request #18539
* Implement lazy_load() for GitLoader
2024-04-25 17:39:08 -07:00
Christophe Bornet
688e5073ec Merge pull request #18423
* Implement lazy_load() for BSHTMLLoader
2024-04-25 17:39:08 -07:00
Christophe Bornet
35d887b361 Merge pull request #18673
* Implement lazy_load() for PDFMinerPDFasHTMLLoader and PyMuPDFLoader
2024-04-25 17:39:08 -07:00
Christophe Bornet
f3fb693883 Merge pull request #18674
* Implement lazy_load() for TextLoader
2024-04-25 17:39:08 -07:00
Christophe Bornet
2e8e8d1c71 Merge pull request #18671
* Implement lazy_load() for MastodonTootsLoader
2024-04-25 17:39:08 -07:00
Christophe Bornet
2bfeaf62f8 Merge pull request #18421
* Implement lazy_load() for AssemblyAIAudioTranscriptLoader
2024-04-25 17:39:08 -07:00
Christophe Bornet
71f2eb3948 Merge pull request #18436
* Implement lazy_load() for ConfluenceLoader
2024-04-25 17:39:08 -07:00