Commit Graph

7827 Commits

Author SHA1 Message Date
老阿張
07dfde0184 docs: Fix typo in baidu_qianfan_endpoint.ipynb & baidu_qianfan_endpoint.ipynb (#18176)
Description: "sucessfully should be successfully "? 🤔
Issue: Typo
Dependencies: Nope
Twitter handle: laoazhang
2024-04-25 17:39:06 -07:00
Hemslo Wang
1aaf50d284 community[patch]: fix RecursiveUrlLoader metadata_extractor return type (#18193)
**Description:** Fix `metadata_extractor` type for `RecursiveUrlLoader`,
the default `_metadata_extractor` returns `dict` instead of `str`.
**Issue:** N/A
**Dependencies:** N/A
**Twitter handle:** N/A

Signed-off-by: Hemslo Wang <hemslo.wang@gmail.com>
2024-04-25 17:39:06 -07:00
Maxime Perrin
3ba8324f09 community[patch]: removing "response_mode" parameter in llama_index retriever (#18180)
- **Description:** Removing this line 
```python
response = index.query(query, response_mode="no_text", **self.query_kwargs)
```
to 
```python
response = index.query(query, **self.query_kwargs)
```
Since llama index query does not support response_mode anymore : ``` |
TypeError: BaseQueryEngine.query() got an unexpected keyword argument
'response_mode'````
  - **Twitter handle:** @maximeperrin_

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-04-25 17:39:06 -07:00
Leonid Kuligin
4037d9232f docs: cookbook on gemma integrations (#18213)
- [ ] **PR title**: "cookbook: using Gemma on LangChain"

- [ ] **PR message**: 
- **Description:** added a tutorial how to use Gemma with LangChain
(from VertexAI or locally from Kaggle or HF)
    - **Dependencies:** langchain-google-vertexai==0.0.7
    - **Twitter handle:** lkuligin
2024-04-25 17:39:06 -07:00
Christophe Bornet
2dc47b3cf4 community: Use default load() implementation in doc loaders (#18385)
Following https://github.com/langchain-ai/langchain/pull/18289
2024-04-25 17:39:06 -07:00
William De Vena
de543dcd6e infra: fake model invoke callback prior to yielding token (#18286)
## PR title
core[patch]: Invoke callback prior to yielding

## PR message
Description: Invoke on_llm_new_token callback prior to yielding token in
_stream and _astream methods.
Issue: https://github.com/langchain-ai/langchain/issues/16913
Dependencies: None
Twitter handle: None
2024-04-25 17:39:06 -07:00
Ikko Eltociear Ashimine
44998f88e1 docs: fix typo in milvus.ipynb (#18373)
retreival -> retrieval
2024-04-25 17:39:06 -07:00
Tabby
0e63e7779d docs: Update Google El Carro for Oracle Workload Documentation. (#18394)
In this commit we update the documentation for Google El Carro for Oracle Workloads. We amend the documentation in the Google Providers page to use the correct name which is El Carro for Oracle Workloads. We also add changes to the document_loaders and memory pages to reflect changes we made in our repo.
2024-04-25 17:39:06 -07:00
mwmajewsk
0602f1faed community[patch]: fix, better error message in deeplake vectoriser (#18397)
If the document loader recieves Pathlib path instead of str, it reads
the file correctly, but the problem begins when the document is added to
Deeplake.
This problem arises from casting the path to str in the metadata.

```python
deeplake = True
fname = Path('./lorem_ipsum.txt')
loader = TextLoader(fname, encoding="utf-8")
docs = loader.load_and_split()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks= text_splitter.split_documents(docs)
if deeplake:
    db = DeepLake(dataset_path=ds_path, embedding=embeddings, token=activeloop_token)
    db.add_documents(chunks)
else:
    db = Chroma.from_documents(docs, embeddings)
```

So using this snippet of code the error message for deeplake looks like
this:

```
[part of error message omitted]

Traceback (most recent call last):
  File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 53, in <module>
    db.add_documents(chunks)
  File "/home/mwm/repositories/sources/langchain/libs/core/langchain_core/vectorstores.py", line 139, in add_documents
    return self.add_texts(texts, metadatas, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/deeplake.py", line 258, in add_texts
    return self.vectorstore.add(
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/deeplake_vectorstore.py", line 226, in add
    return self.dataset_handler.add(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/dataset_handlers/client_side_dataset_handler.py", line 139, in add
    dataset_utils.extend_or_ingest_dataset(
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 544, in extend_or_ingest_dataset
    extend(
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/vectorstore/vector_search/dataset/dataset.py", line 505, in extend
    dataset.extend(batched_processed_tensors, progressbar=False)
  File "/home/mwm/anaconda3/envs/langchain/lib/python3.11/site-packages/deeplake/core/dataset/dataset.py", line 3247, in extend
    raise SampleExtendError(str(e)) from e.__cause__
deeplake.util.exceptions.SampleExtendError: Failed to append a sample to the tensor 'metadata'. See more details in the traceback. If you wish to skip the samples that cause errors, please specify `ignore_errors=True`.
```

Which is does not explain the error well enough.
The same error for chroma looks like this 

```
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/mwm/repositories/sources/fixing_langchain/main.py", line 56, in <module>
    db = Chroma.from_documents(docs, embeddings)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 778, in from_documents
    return cls.from_texts(
           ^^^^^^^^^^^^^^^
  File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 736, in from_texts
    chroma_collection.add_texts(
  File "/home/mwm/repositories/sources/langchain/libs/community/langchain_community/vectorstores/chroma.py", line 309, in add_texts
    raise ValueError(e.args[0] + "\n\n" + msg)
ValueError: Expected metadata value to be a str, int, float or bool, got lorem_ipsum.txt which is a <class 'pathlib.PosixPath'>

Try filtering complex metadata from the document using langchain_community.vectorstores.utils.filter_complex_metadata.
```

Which is way more user friendly, so I just added information about
possible mismatch of the type in the error message, the same way it is
covered in chroma
https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/chroma.py#L224
2024-04-25 17:39:06 -07:00
Daniel Chico
4e6a8bee5a community[patch]: type ignore fixes (#18395)
Related to #17048
2024-04-25 17:39:06 -07:00
Christophe Bornet
a3270fc17b community[patch]: Implement lazy_load() for CSVLoader (#18391)
Covered by `test_csv_loader.py`
2024-04-25 17:39:06 -07:00
Bagatur
d741d3ab75 fireworks[patch]: support "any" tool_choice (#18343)
per https://readme.fireworks.ai/docs/function-calling
2024-04-25 17:39:06 -07:00
Leonid Ganeline
03ec7d6c6f docs: Tutorials update (#18230)
A big update of the `Tutorials` page. Cleaned it up. Added several new
resources.
2024-04-25 17:39:06 -07:00
Erick Friis
97b69958dd astradb: move to langchain-datastax repo (#18354) 2024-04-25 17:39:06 -07:00
Akash A Desai
27a55218ad templates: Lanceb RAG template (#17809)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-25 17:39:06 -07:00
Guangdong Liu
42770c0a3a community[patch]: Fix ChatModel for sparkllm Bug. (#18375)
**PR message**: ***Delete this entire checklist*** and replace with
    - **Description:** fix sparkllm paramer error
    - **Issue:**   close #18370
- **Dependencies:** change `IFLYTEK_SPARK_APP_URL` to
`IFLYTEK_SPARK_API_URL`
    - **Twitter handle:** No
2024-04-25 17:39:06 -07:00
Yujie Qian
1e3122bfa4 community[patch]: Voyage AI updates default model and batch size (#17655)
- **Description:** update the default model and batch size in
VoyageEmbeddings
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** N/A

---------

Co-authored-by: fodizoltan <zoltan@conway.expert>
2024-04-25 17:39:06 -07:00
Shengsheng Huang
5c9ae435f6 community[minor]: add BigDL-LLM integrations (#17953)
- **Description**:
[`bigdl-llm`](https://github.com/intel-analytics/BigDL) is a library for
running LLM on Intel XPU (from Laptop to GPU to Cloud) using
INT4/FP4/INT8/FP8 with very low latency (for any PyTorch model). This PR
adds bigdl-llm integrations to langchain.
- **Issue**: NA
- **Dependencies**: `bigdl-llm` library
- **Contribution maintainer**: @shane-huang 
 
Examples added:
- docs/docs/integrations/llms/bigdl.ipynb
2024-04-25 17:39:06 -07:00
Ethan Yang
9b4f6e7760 community[minor]: Add openvino backend support (#11591)
- **Description:** add openvino backend support by HuggingFace Optimum
Intel,
  - **Dependencies:** “optimum[openvino]”,

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 17:39:06 -07:00
Leonid Ganeline
ec9eef5f64 docs: runnable module description (#17966)
Added a module description. Added `batch` description.
2024-04-25 17:39:06 -07:00
Leonid Ganeline
8baa7b5d88 docs: nvidia: provider page update (#18054)
Nvidia provider page is missing a Triton Inference Server package
reference.
Changes:
- added the Triton Inference Server reference
- copied the example notebook from the package into the doc files.
- added the Triton Inference Server description and links, the link to
the above example notebook
- formatted page to the consistent format

NOTE:
It seems that the [example
notebook](https://github.com/langchain-ai/langchain/blob/master/libs/partners/nvidia-trt/docs/llms.ipynb)
was originally created in wrong place. It should be in the LangChain
docs
[here](https://github.com/langchain-ai/langchain/tree/master/docs/docs/integrations/llms).
So, I've created a copy of this example. The original example is still
in the nvidia-trt package.
2024-04-25 17:39:06 -07:00
RadhikaBansal97
42591be4f6 community[patch]: Change github endpoint in GithubLoader (#17622)
Description- 
- Changed the GitHub endpoint as existing was not working and giving 404
not found error
- Also the existing function was failing if file_filter is not passed as
the tree api return all paths including directory as well, and when
get_file_content was iterating over these path, the function was failing
for directory as the api was returning list of files inside the
directory, so added a condition to ignore the paths if it a directory
- Fixes this issue -
https://github.com/langchain-ai/langchain/issues/17453

Co-authored-by: Radhika Bansal <Radhika.Bansal@veritas.com>
2024-04-25 17:39:06 -07:00
Yufei (Benny) Chen
18b51cb26f fireworks[patch]: Fix fireworks async stream (#18372)
- **Description:**  Fix the async stream issue with Fireworks
- **Dependencies:** fireworks >= 0.13.0

```
tests/integration_tests/test_chat_models.py ..........                                                                   [ 45%]
tests/integration_tests/test_compile.py .                                                                                [ 50%]
tests/integration_tests/test_embeddings.py ..                                                                            [ 59%]
tests/integration_tests/test_llms.py .........                                                                           [100%]
```
```
tests/unit_tests/test_embeddings.py .                                                                                    [ 16%]
tests/unit_tests/test_imports.py .                                                                                       [ 33%]
tests/unit_tests/test_llms.py ....                                                                                       [100%]
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-25 17:39:06 -07:00
William FH
06139073e0 Add dataset version info (#18299) 2024-04-25 17:39:06 -07:00
Anush
bb795163bc community[patch]: FastEmbed to latest (#18040)
## Description

Updates the `langchain_community.embeddings.fastembed` provider as per
the recent updates to [`FastEmbed`](https://github.com/qdrant/fastembed)
library.
2024-04-25 17:39:05 -07:00
Jacob Lee
ffdb73290d docs[patch]: Add Neo4j GraphAcademy to tutorials section (#18353) 2024-04-25 17:39:05 -07:00
Erick Friis
59173917fc fireworks[patch]: remove custom async and stream implementations (#18363) 2024-04-25 17:39:05 -07:00
Bagatur
0c0d50e0f9 docs: update api ref nav (#18362) 2024-04-25 17:39:05 -07:00
Bagatur
831a9136f4 infra: update create_api_rst (#18361) 2024-04-25 17:39:05 -07:00
Erick Friis
d990b74211 templates: use langchain-text-splitters (#18360)
- deps
- import
- import
2024-04-25 17:39:05 -07:00
Bagatur
0763b7e20c docs: text splitters readme (#18359) 2024-04-25 17:39:05 -07:00
Bagatur
009bd8f812 langchain[patch]: langchain-text-splitters dep (#18357) 2024-04-25 17:39:05 -07:00
Eugene Yurtsev
2d910ba318 community[patch]: BaseLoader load method should just delegate to lazy_load (#18289)
load() should just reference lazy_load()
2024-04-25 17:39:05 -07:00
Bagatur
1f46245a68 text-splitters[minor], langchain[minor], community[patch], templates, docs: langchain-text-splitters 0.0.1 (#18346) 2024-04-25 17:39:05 -07:00
Nuno Campos
21a652b39f Fix missing labels (#18356)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-25 17:39:05 -07:00
William FH
ba50cf2f7a [Core] Patch: rm dumpd of outputs from runnables/base (#18295)
It obstructs evaluations when your return a pydantic object.
2024-04-25 17:39:05 -07:00
Erick Friis
f7a2d1b40c infra: tolerate partner package move in ci (#18355) 2024-04-25 17:39:05 -07:00
William FH
8e13c7e4a2 fireworks[patch]: Fix fireworks bind tools (#18352)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-25 17:39:05 -07:00
Erick Friis
c7fef3dc31 multiple[patch]: fix deprecation versions (#18349) 2024-04-25 17:39:05 -07:00
Erick Friis
94c186c7d7 core[patch]: deprecation docstring with lib (#18350) 2024-04-25 17:39:05 -07:00
Erick Friis
5c7a7f9246 docs: airbyte deps note (#18243) 2024-04-25 17:39:05 -07:00
Erick Friis
949bc0089b mongodb[patch]: core 0.1.5 dep (#18348) 2024-04-25 17:39:05 -07:00
Erick Friis
7d7682ac2c infra: mongodb env vars (#18347) 2024-04-25 17:39:05 -07:00
Jib
cf8ebd860f mongodb[minor]: MongoDB Partner Package -- Porting MongoDBAtlasVectorSearch (#17652)
This PR migrates the existing MongoDBAtlasVectorSearch abstraction from
the `langchain_community` section to the partners package section of the
codebase.
- [x] Run the partner package script as advised in the partner-packages
documentation.
- [x] Add Unit Tests
- [x] Migrate Integration Tests
- [x] Refactor `MongoDBAtlasVectorStore` (autogenerated) to
`MongoDBAtlasVectorSearch`
- [x] ~Remove~ deprecate the old `langchain_community` VectorStore
references.

## Additional Callouts
- Implemented the `delete` method
- Included any missing async function implementations
  - `amax_marginal_relevance_search_by_vector`
  - `adelete` 
- Added new Unit Tests that test for functionality of
`MongoDBVectorSearch` methods
- Removed [`del
res[self._embedding_key]`](e0c81e1cb0/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L218))
in `_similarity_search_with_score` function as it would make the
`maximal_marginal_relevance` function fail otherwise. The `Document`
needs to store the embedding key in metadata to work.

Checklist:

- [x] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
- [x] PR message
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. Existing tests supplied in docs/docs do not change. Updated
docstrings for new functions like `delete`
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory. (This already exists)

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Steven Silvester <steven.silvester@ieee.org>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-25 17:39:05 -07:00
William De Vena
d3e551e440 Updated partners/fireworks README (#18267)
## PR title
partners: changed the README file for the Fireworks integration in the
libs/partners/fireworks folder

## PR message
Description: Changed the README file of partners/fireworks following the
docs on https://python.langchain.com/docs/integrations/llms/Fireworks

The README includes:

- Brief description
- Installation
- Setting-up instructions (API key, model id, ...)
- Basic usage

Issue: https://github.com/langchain-ai/langchain/issues/17545

Dependencies: None

Twitter handle: None
2024-04-25 17:39:05 -07:00
Kai Kugler
4a484c9099 community[patch]: Fixing embedchain document mapping (#18255)
- **Description:** The current embedchain implementation seems to handle
document metadata differently than done in the current implementation of
langchain and a KeyError is thrown. I would love for someone else to
test this...

---------

Co-authored-by: KKUGLER <kai.kugler@mercedes-benz.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Deshraj Yadav <deshraj@gatech.edu>
2024-04-25 17:39:05 -07:00
Erick Friis
b832522da3 community[patch]: remove llmlingua extended tests (#18344) 2024-04-25 17:39:05 -07:00
William De Vena
22f7af0a15 Updated partners/ibm README (#18268)
## PR title
partners: changed the README file for the IBM Watson AI integration in
the libs/partners/ibm folder.

## PR message
Description: Changed the README file of partners/ibm following the docs
on https://python.langchain.com/docs/integrations/llms/ibm_watsonx

The README includes:

- Brief description
- Installation
- Setting-up instructions (API key, project id, ...)
- Basic usage:
  - Loading the model
  - Direct inference
  - Chain invoking
  - Streaming the model output
  
Issue: https://github.com/langchain-ai/langchain/issues/17545

Dependencies: None

Twitter handle: None

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2024-04-25 17:39:05 -07:00
Erick Friis
cb759bd649 infra: ci dirs in wrong order (#18340) 2024-04-25 17:39:05 -07:00
Bagatur
d2d785f47c core[patch]: Release 0.1.28 (#18341) 2024-04-25 17:39:05 -07:00