Commit Graph

2868 Commits

Author SHA1 Message Date
Bagatur
c0ce93236a
experimental[patch]: fix zero-shot pandas agent (#17442) 2024-02-12 21:58:35 -08:00
Abhishek Jain
37e1275f9e
community[patch]: Fixed the 'aembed' method of 'CohereEmbeddings'. (#16497)
**Description:**
- The existing code was trying to find a `.embeddings` property on the
`Coroutine` returned by calling `cohere.async_client.embed`.
- Instead, the `.embeddings` property is present on the value returned
by the `Coroutine`.
- Also, it seems that the original cohere client expects a value of
`max_retries` to not be `None`. Hence, setting the default value of
`max_retries` to `3`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-12 21:57:27 -08:00
Sridhar Ramaswamy
9f1cbbc6ed
community[minor]: Add pebblo safe document loader (#16862)
- **Description:** Pebblo opensource project enables developers to
safely load data to their Gen AI apps. It identifies semantic topics and
entities found in the loaded data and summarizes them in a
developer-friendly report.
  - **Dependencies:** none
  - **Twitter handle:** srics

@hwchase17
2024-02-12 21:56:12 -08:00
mhavey
1bbb64d956
community[minor], langchian[minor]: Add Neptune Rdf graph and chain (#16650)
**Description**: This PR adds a chain for Amazon Neptune graph database
RDF format. It complements the existing Neptune Cypher chain. The PR
also includes a Neptune RDF graph class to connect to, introspect, and
query a Neptune RDF graph database from the chain. A sample notebook is
provided under docs that demonstrates the overall effect: invoking the
chain to make natural language queries against Neptune using an LLM.

**Issue**: This is a new feature
 
**Dependencies**: The RDF graph class depends on the AWS boto3 library
if using IAM authentication to connect to the Neptune database.

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-12 21:30:20 -08:00
Michael Feil
e1cfd0f3e7
community[patch]: infinity embeddings update incorrect default url (#16759)
The default url has always been incorrect (7797 instead 7997). Here is a
update to the correct url.
2024-02-12 20:05:08 -08:00
Massimiliano Pronesti
df7cbd6fbb
community[minor]: add FlashRank ranker (#16785)
**Description:** This PR adds support for
[flashrank](https://github.com/PrithivirajDamodaran/FlashRank) for
reranking as alternative to Cohere.

I'm not sure `libs/langchain` is the right place for this change. At
first, I wanted to put it under `libs/community`. All the compressors
were under `libs/langchain/retrievers/document_compressors` though. Hope
this makes sense!
2024-02-12 20:00:52 -08:00
Andreas Motl
1fdd9bd980
community/SQLDatabase: Generalize and trim software tests (#16659)
- **Description:** Improve test cases for `SQLDatabase` adapter
component, see
[suggestion](https://github.com/langchain-ai/langchain/pull/16655#pullrequestreview-1846749474).
  - **Depends on:** GH-16655
  - **Addressed to:** @baskaryan, @cbornet, @eyurtsev

_Remark: This PR is stacked upon GH-16655, so that one will need to go
in first._

Edit: Thank you for bringing in GH-17191, @eyurtsev. This is a little
aftermath, improving/streamlining the corresponding test cases.
2024-02-12 22:58:34 -05:00
Theo / Taeyoon Kang
1987f905ed
core[patch]: Support .yml extension for YAML (#16783)
- **Description:**

[AS-IS] When dealing with a yaml file, the extension must be .yaml.  

[TO-BE] In the absence of extension length constraints in the OS, the
extension of the YAML file is yaml, but control over the yml extension
must still be made.

It's as if it's an error because it's a .jpg extension in jpeg support.

  - **Issue:** - 

  - **Dependencies:**
no dependencies required for this change,
2024-02-12 19:57:20 -08:00
Kapil Sachdeva
cd00a87db7
community[patch] - in FAISS vector store, support passing custom DocStore implementation when using from_xxx methods (#16801)
- **Description:** The from__xx methods of FAISS class have hardcoded
InMemoryStore implementation and thereby not let users pass a custom
DocStore implementation,
  - **Issue:** no referenced issue,
  - **Dependencies:** none,
  - **Twitter handle:** ksachdeva
2024-02-12 19:51:55 -08:00
Chris
f9f5626ca4
community[patch]: Fix github search issues and PRs PaginatedList has no len() error (#16806)
**Description:** 
Bugfix: Langchain_community's GitHub Api wrapper throws a TypeError when
searching for issues and/or PRs (the `search_issues_and_prs` method).
This is because PyGithub's PageinatedList type does not support the
len() method. See https://github.com/PyGithub/PyGithub/issues/1476

![image](https://github.com/langchain-ai/langchain/assets/8849021/57390b11-ed41-4f48-ba50-f3028610789c)
  **Dependencies:** None 
  **Twitter handle**: @ChrisKeoghNZ
  
I haven't registered an issue as it would take me longer to fill the
template out than to make the fix, but I'm happy to if that's deemed
essential.

I've added a simple integration test to cover this as there were no
existing unit tests and it was going to be tricky to set them up.

Co-authored-by: Chris Keogh <chris.keogh@xero.com>
2024-02-12 19:50:59 -08:00
morgana
722aae4fd1
community: add delete method to rocksetdb vectorstore to support recordmanager (#17030)
- **Description:** This adds a delete method so that rocksetdb can be
used with `RecordManager`.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** `@_morgan_adams_`

---------

Co-authored-by: Rockset API Bot <admin@rockset.io>
2024-02-12 19:50:20 -08:00
yin1991
c454dc36fc
community[proxy]: Enhancement/add proxy support playwrighturlloader 16751 (#16822)
- **Description:** Enhancement/add proxy support playwrighturlloader
16751
- **Issue:** [Enhancement: Add Proxy Support to PlaywrightURLLoader
Class](https://github.com/langchain-ai/langchain/issues/16751)
  - **Dependencies:** 
  - **Twitter handle:** @ootR77013489

---------

Co-authored-by: root <root@ip-172-31-46-160.ap-southeast-1.compute.internal>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-12 19:48:29 -08:00
Bhupesh Varshney
e3b775e035
infra: make .gitignore consistent with standard python gitignore (#16828)
- The new .gitignore version is inherited from the one maintained by the
github community over at
https://github.com/github/gitignore/blob/main/Python.gitignore
- This should cover all the cases of how a langchain app can be used.
2024-02-12 19:43:41 -08:00
James Braza
64938ae6f2
infra: unit testing check_package_version (#16825)
Wrote a unit test for `check_package_version` in the core package.

Note that this is a revival of
https://github.com/langchain-ai/langchain/pull/16387 after GitHub
incident (see
https://github.com/langchain-ai/langchain/discussions/16796).
2024-02-12 19:39:58 -08:00
Lingzhen Chen
30af711c34
community[patch]: update AzureSearch class to work with azure-search-documents=11.4.0 (#15659)
- **Description:** Updates
`libs/community/langchain_community/vectorstores/azuresearch.py` to
support the stable version `azure-search-documents=11.4.0`
- **Issue:** https://github.com/langchain-ai/langchain/issues/14534,
https://github.com/langchain-ai/langchain/issues/15039,
https://github.com/langchain-ai/langchain/issues/15355
  - **Dependencies:** azure-search-documents>=11.4.0

---------

Co-authored-by: Clément Tamines <Skar0@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-12 19:23:35 -08:00
Robby
e135dc70c3
community[patch]: Invoke callback prior to yielding token (#17348)
**Description:** Invoke callback prior to yielding token in stream
method for Ollama.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
2024-02-12 19:22:55 -08:00
Christophe Bornet
ab025507bc
community[patch]: Add async methods to VectorStoreQATool (#16949) 2024-02-12 19:19:50 -08:00
Christophe Bornet
fb7552bfcf
Add async methods to InMemoryCache (#17425)
Add async methods to InMemoryCache
2024-02-12 22:02:38 -05:00
Eugene Yurtsev
93472ee9e6
core[patch]: Replace memory stream implementation used by LogStreamCallbackHandler (#17185)
This PR replaces the memory stream implementation used by the 
LogStreamCallbackHandler.

This implementation resolves an issue in which streamed logs and
streamed events originating from sync code would arrive only after the
entire sync code would finish execution (rather than arriving in real
time as they're generated).

One example is if trying to stream tokens from an llm within a tool. If
the tool was an async tool, but the llm was invoked via stream (sync
variant) rather than astream (async variant), then the tokens would fail
to stream in real time and would all arrived bunched up after the tool
invocation completed.
2024-02-12 21:57:38 -05:00
yin1991
37ef6ac113
community[patch]: Add Pagination to GitHubIssuesLoader for Efficient GitHub Issues Retrieval (#16934)
- **Description:** Add Pagination to GitHubIssuesLoader for Efficient
GitHub Issues Retrieval
- **Issue:** [the issue # it fixes if
applicable,](https://github.com/langchain-ai/langchain/issues/16864)

---------

Co-authored-by: root <root@ip-172-31-46-160.ap-southeast-1.compute.internal>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-12 18:30:36 -08:00
Bagatur
22638e5927
community[patch]: give reranker default client val (#17289) 2024-02-12 17:21:53 -08:00
Robby
ece4b43a81
community[patch]: doc loaders mypy fixes (#17368)
**Description:** Fixed `type: ignore`'s for mypy for some
document_loaders.
**Issue:** [Remove "type: ignore" comments #17048
](https://github.com/langchain-ai/langchain/issues/17048)

---------

Co-authored-by: Robby <h0rv@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-02-12 16:51:06 -08:00
Robby
0653aa469a
community[patch]: Invoke callback prior to yielding token (#17346)
**Description:** Invoke callback prior to yielding token in stream
method for watsonx.
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)

Co-authored-by: Robby <h0rv@users.noreply.github.com>
2024-02-12 16:36:33 -08:00
Bagatur
f7e453971d
community[patch]: remove print (#17435) 2024-02-12 15:21:38 -08:00
Spencer Kelly
54fa78c887
community[patch]: fixed vector similarity filtering (#16967)
**Description:** changed filtering so that failed filter doesn't add
document to results. Currently filtering is entirely broken and all
documents are returned whether or not they pass the filter.

fixes issue introduced in
https://github.com/langchain-ai/langchain/pull/16190
2024-02-12 14:52:57 -08:00
Aditya
a23c719c8b
google-genai[minor]: add safety settings (#16836)
Replace this entire comment with:
- **Description:Expose safety_settings for Gemini integrations on
google-generativeai
  - **Issue:NA,
  - **Dependencies:NA
  - **Twitter handle:@aditya_rane

@lkuligin for review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-12 13:44:24 -08:00
Abhijeeth Padarthi
584b647b96
community[minor]: AWS Athena Document Loader (#15625)
- **Description:** Adds the document loader for [AWS
Athena](https://aws.amazon.com/athena/), a serverless and interactive
analytics service.
  - **Dependencies:** Added boto3 as a dependency
2024-02-12 12:53:40 -08:00
david-tempelmann
93da18b667
community[minor]: Add mmr and similarity_score_threshold retrieval to DatabricksVectorSearch (#16829)
- **Description:** This PR adds support for `search_types="mmr"` and
`search_type="similarity_score_threshold"` to retrievers using
`DatabricksVectorSearch`,
  - **Issue:** 
  - **Dependencies:**
  - **Twitter handle:**

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-12 12:51:37 -08:00
Erick Friis
42648061ad
openai[patch]: code cleaning (#17355)
h/t @tdene for finding cleanup op in #17047
2024-02-12 12:36:12 -08:00
Massimiliano Pronesti
3894b4d9a5
community: add gpt-4-turbo and gpt-4-0125 costs (#17349)
Ref: https://openai.com/pricing
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2024-02-11 21:24:24 -08:00
Tomaz Bratanic
19a1c9183d
Improve graph cypher qa prompt (#17380)
Unlike vector results, the LLM has to completely trust the context of a
graph database result, even if it doesn't provide whole context. We
tried with instructions, but it seems that adding a single example is
the way to go to solve this issue.
2024-02-11 21:15:46 -08:00
Sandeep Banerjee
183daa6e6f
google-genai[patch]: on_llm_new_token fix (#16924)
### This pull request makes the following changes:
* Fixed issue #16913

Fixed the google gen ai chat_models.py code to make sure that the
callback is called before the token is yielded

<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-09 18:00:24 -08:00
Bagatur
10c10f2dea
cli[patch]: integration template nits (#14691)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-09 17:59:34 -08:00
Erick Friis
99540d3d75
infra: no print in newer partner packages (#17353)
<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2024-02-09 16:40:02 -08:00
William FH
7c03cc5ed4
Support serialization when inputs/outputs contain generators (#17338)
Pydantic's `dict()` function raises an error here if you pass in a
generator. We have a more robust serialization function in lagnsmith
that we will use instead.
2024-02-09 16:24:54 -08:00
Erick Friis
3a2eb6e12b
infra: add print rule to ruff (#16221)
Added noqa for existing prints. Can slowly remove / will prevent more
being intro'd
2024-02-09 16:13:30 -08:00
Jael Gu
c07c0da01a
community[patch]: Fix Milvus add texts when ids=None (#17021)
- **Description:** Fix Milvus add texts when ids=None (auto_id=True)

Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-09 18:48:37 -05:00
Quang Hoa
54c1fb3f25
community[patch]: Make some functions work with Milvus (#10695)
**Description**
Make some functions work with Milvus:
1. get_ids: Get primary keys by field in the metadata
2. delete: Delete one or more entities by ids
3. upsert: Update/Insert one or more entities

**Issue**
None
**Dependencies**
None
**Tag maintainer:**
@hwchase17 
**Twitter handle:**
None

---------

Co-authored-by: HoaNQ9 <hoanq.1811@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-09 15:21:31 -08:00
kYLe
c9999557bf
community[patch]: Modify LLMs/Anyscale work with OpenAI API v1 (#14206)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
- **Description:** 
1. Modify LLMs/Anyscale to work with OAI v1
2. Get rid of openai_ prefixed variables in Chat_model/ChatAnyscale
3. Modify `anyscale_api_base` to `anyscale_base_url` to follow OAI name
convention (reverted)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-09 15:11:18 -08:00
Charlie Marsh
24c0bab57b
infra, multiple: Upgrade configuration for Ruff v0.2.0 (#16905)
## Summary

This PR upgrades LangChain's Ruff configuration in preparation for
Ruff's v0.2.0 release. (The changes are compatible with Ruff v0.1.5,
which LangChain uses today.) Specifically, we're now warning when
linter-only options are specified under `[tool.ruff]` instead of
`[tool.ruff.lint]`.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-09 14:28:02 -08:00
Bagatur
01409add5a
google-vertexai[patch]: rm deps (#17077)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-09 14:12:10 -08:00
Erick Friis
1c2facf88d
nvidia-ai-endpoints[patch]: release 0.0.3 (#17345) 2024-02-09 13:55:01 -08:00
Vadim Kudlay
5f9ac6986e
nvidia-ai-endpoints[patch]: model arguments (e.g. temperature) on construction bug (#17290)
- **Issue:** Issue with model argument support (been there for a while
actually):
- Non-specially-handled arguments like temperature don't work when
passed through constructor.
- Such arguments DO work quite well with `bind`, but also do not abide
by field requirements.
- Since initial push, server-side error messages have gotten better and
v0.0.2 raises better exceptions. So maybe it's better to let server-side
handle such issues?
- **Description:**
- Removed ChatNVIDIA's argument fields in favor of
`model_kwargs`/`model_kws` arguments which aggregates constructor kwargs
(from constructor pathway) and merges them with call kwargs (bind
pathway).
- Shuffled a few functions from `_NVIDIAClient` to `ChatNVIDIA` to
streamline construction for future integrations.
- Minor/Optional: Old services didn't have stop support, so client-side
stopping was implemented. Now do both.
- **Any Breaking Changes:** Minor breaking changes if you strongly rely
on chat_model.temperature, etc. This is captured by
chat_model.model_kwargs.

PR passes tests and example notebooks and example testing. Still gonna
chat with some people, so leaving as draft for now.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-02-09 13:46:02 -08:00
Leonid Ganeline
932c52c333
community[patch]: docstrings (#16810)
- added missed docstrings
- formated docstrings to the consistent form
2024-02-09 12:48:57 -08:00
Leonid Ganeline
ae66bcbc10
core[patch]: docstring update (#16813)
- added missed docstrings
- formated docstrings to consistent form
2024-02-09 12:47:41 -08:00
Eugene Yurtsev
e10030e241
core[patch]: Add unit test to cover different streaming format for json parsing (#17063)
Add unit test to cover this issue:

https://github.com/langchain-ai/langchain/issues/16423

which was resolved by this PR:

https://github.com/langchain-ai/langchain/pull/16670/files

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-02-09 11:28:55 -05:00
Kononov Pavel
15bc201967
langchain_community: Fix typo bug (#17324)
Problem from #17095

This error wasn't in the v1.4.0
2024-02-09 11:27:33 -05:00
Erick Friis
e660a1685b
google-genai[patch]: release 0.0.8 (#17285) 2024-02-08 19:39:44 -08:00
Erick Friis
febf9540b9
google-genai[patch]: fix tool format, use protos (#17284) 2024-02-08 19:36:49 -08:00
German Martin
1032faba5f
langchain_google_genai : Add missing _identifying_params property. (#17224)
Description: Missing _identifying_params create issues when dealing with
callbacks to get current run model parameters.
All other model partners implementation provide this property and also
provide _default_params. I'm not sure about the default values to
include or if we can re-use the same as for _VertexAICommon(), this
change allows you to access the model parameters correctly.
Issue: Not exactly this issue but could be related
https://github.com/langchain-ai/langchain/issues/14711
Twitter handle:@musicaoriginal2
2024-02-08 17:40:21 -08:00