Last year Microsoft [changed the
name](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search)
of Azure Cognitive Search to Azure AI Search. This PR updates the
Langchain Azure Retriever API and it's associated docs to reflect this
change. It may be confusing for users to see the name Cognitive here and
AI in the Microsoft documentation which is why this is needed. I've also
added a more detailed example to the Azure retriever doc page.
There are more places that need a similar update but I'm breaking it up
so the PRs are not too big 😄 Fixing my errors from the previous PR.
Twitter: @marlene_zw
Two new tests added to test backward compatibility in
`libs/community/tests/integration_tests/retrievers/test_azure_cognitive_search.py`
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
[Dria](https://dria.co/) is a hub of public RAG models for developers to
both contribute and utilize a shared embedding lake. This PR adds a
retriever that can retrieve documents from Dria.
Description: adds support for langchain_cohere
---------
Co-authored-by: Harry M <127103098+harry-cohere@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Looking at tokens / page of our docs, we see a few outliers:
<img width="761" alt="image"
src="https://github.com/langchain-ai/langchain/assets/122662504/677aa2d6-0a29-45e4-882a-db2bbf46d02b">
It is due to non-rendering images in one case, and output spamming.
Clean these, along with other cases of excessing output spamming in
docs.
All get sucked into chat-langchain for retrieval.
**Description:** Fixes the import paths for the `FlashrankRerank`
example notebook.
**Issue:** #19139
**Dependencies:** None
**Twitter handle:** n/a
---------
Co-authored-by: Simon Stone <simon.stone@dartmouth.edu>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Add documentation notebook for `ElasticsearchRetriever`.
## Dependencies
- [ ] Release new `langchain-elasticsearch` version 0.2.0 that includes
`ElasticsearchRetriever`
Follow up on https://github.com/langchain-ai/langchain/pull/17467.
- Update all references to the Elasticsearch classes to use the partners
package.
- Deprecate community classes.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
**Description**: This PR adds support for using the [LLMLingua project
](https://github.com/microsoft/LLMLingua) especially the LongLLMLingua
(Enhancing Large Language Model Inference via Prompt Compression) as a
document compressor / transformer.
The LLMLingua project is an interesting project that can greatly improve
RAG system by compressing prompts and contexts while keeping their
semantic relevance.
**Issue**: https://github.com/microsoft/LLMLingua/issues/31
**Dependencies**: [llmlingua](https://pypi.org/project/llmlingua/)
@baskaryan
---------
Co-authored-by: Ayodeji Ayibiowu <ayodeji.ayibiowu@getinge.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
### Description
This PR moves the Elasticsearch classes to a partners package.
Note that we will not move (and later remove) `ElasticKnnSearch`. It
were previously deprecated.
`ElasticVectorSearch` is going to stay in the community package since it
is used quite a lot still.
Also note that I left the `ElasticsearchTranslator` for self query
untouched because it resides in main `langchain` package.
### Dependencies
There will be another PR that updates the notebooks (potentially pulling
them into the partners package) and templates and removes the classes
from the community package, see
https://github.com/langchain-ai/langchain/pull/17468
#### Open question
How to make the transition smooth for users? Do we move the import
aliases and require people to install `langchain-elasticsearch`? Or do
we remove the import aliases from the `langchain` package all together?
What has worked well for other partner packages?
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
**Description:** This PR adds support for
[flashrank](https://github.com/PrithivirajDamodaran/FlashRank) for
reranking as alternative to Cohere.
I'm not sure `libs/langchain` is the right place for this change. At
first, I wanted to put it under `libs/community`. All the compressors
were under `libs/langchain/retrievers/document_compressors` though. Hope
this makes sense!
This PR updates the `TF-IDF.ipynb` documentation to reflect the new
import path for TFIDFRetriever in the langchain-community package. The
previous path, `from langchain.retrievers import TFIDFRetriever`, has
been updated to `from langchain_community.retrievers import
TFIDFRetriever` to align with the latest changes in the langchain
library.
<!-- Thank you for contributing to LangChain!
Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes if applicable,
- **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- **Description: changes to you.com files**
- general cleanup
- adds community/utilities/you.py, moving bulk of code from retriever ->
utility
- removes `snippet` as endpoint
- adds `news` as endpoint
- adds more tests
<s>**Description: update community MAKE file**
- adds `integration_tests`
- adds `coverage`</s>
- **Issue:** the issue # it fixes if applicable,
- [For New Contributors: Update Integration
Documentation](https://github.com/langchain-ai/langchain/issues/15664#issuecomment-1920099868)
- **Dependencies:** n/a
- **Twitter handle:** @scottnath
- **Mastodon handle:** scottnath@mastodon.social
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- vertex chat
- google
- some pip openai
- percent and openai
- all percent
- more
- pip
- fmt
- docs: google vertex partner docs
- fmt
- docs: more pip installs
Updates docs and cookbooks to import ChatOpenAI, OpenAI, and OpenAI
Embeddings from `langchain_openai`
There are likely more
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
…tch]: import models from community
ran
```bash
git grep -l 'from langchain\.chat_models' | xargs -L 1 sed -i '' "s/from\ langchain\.chat_models/from\ langchain_community.chat_models/g"
git grep -l 'from langchain\.llms' | xargs -L 1 sed -i '' "s/from\ langchain\.llms/from\ langchain_community.llms/g"
git grep -l 'from langchain\.embeddings' | xargs -L 1 sed -i '' "s/from\ langchain\.embeddings/from\ langchain_community.embeddings/g"
git checkout master libs/langchain/tests/unit_tests/llms
git checkout master libs/langchain/tests/unit_tests/chat_models
git checkout master libs/langchain/tests/unit_tests/embeddings/test_imports.py
make format
cd libs/langchain; make format
cd ../experimental; make format
cd ../core; make format
```
Description: Adding Summarization to Vectara, to reflect it provides not
only vector-store type functionality but also can return a summary.
Also added:
MMR capability (in the Vectara platform side)
Updated templates
Updated documentation and IPYNB examples
Tag maintainer: @baskaryan
Twitter handle: @ofermend
---------
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
## Description
This PR intends to add support for Qdrant's new [sparse vector
retrieval](https://qdrant.tech/articles/sparse-vectors/) by introducing
a new retriever class, `QdrantSparseVectorRetriever`.
Necessary usage docs and integration tests have been added for the
retriever.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
Description: A new vector store Jaguar is being added. Class, test
scripts, and documentation is added.
Issue: None -- This is the first PR contributing to LangChain
Dependencies: This depends on "pip install -U jaguardb-http-client"
client http package
Tag maintainer: @baskaryan, @eyurtsev, @hwchase1
Twitter handle: @workbot
---------
Co-authored-by: JY <jyjy@jaguardb>
Co-authored-by: Bagatur <baskaryan@gmail.com>
This patch fixes some typos.
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Many jupyter notebooks didn't pass linting. List of these files are
presented in the [tool.ruff.lint.per-file-ignores] section of the
pyproject.toml . Addressed these bugs:
- fixed bugs; added missed imports; updated pyproject.toml
Only the `document_loaders/tensorflow_datasets.ipyn`,
`cookbook/gymnasium_agent_simulation.ipynb` are not completely fixed.
I'm not sure about imports.
---------
Co-authored-by: Erick Friis <erick@langchain.dev>
The `AWS` platform page has many missed integrations.
- added missed integration references to the `AWS` platform page
- added/updated descriptions and links in the referenced notebooks
- renamed two notebook files. They have file names != page Title, which
generate unordered ToC.
- reroute the URLs for renamed files
- fixed `amazon_textract` notebook: removed failed cell outputs