4195 Commits

Author SHA1 Message Date
Harrison Chase
15be439719 Harrison/move flashrank rerank (#21448)
third party integration, should be in community
2024-05-15 13:08:52 -07:00
Erick Friis
aca98fd150 multiple: releases with relaxed core dep (#21724) 2024-05-15 19:29:35 +00:00
Bagatur
af284518bc openai[patch]: Release 0.1.7, bump tiktoken 0.7.0 (#21723) 2024-05-15 12:19:29 -07:00
William FH
ca768c8353 [Core] Check is async callable (#21714)
To permit proper coercion of objects like the following:


```python
class MyAsyncCallable:
    async def __call__(self, foo):
        return await ...

class MyAsyncGenerator:
    async def __call__(self, foo):
        await ...
        yield 
```
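A minimal sketch of how such objects could be detected, assuming only the standard `inspect` module. The helper names `is_async_callable` and `is_async_generator` are illustrative here and this is not the code from the PR:

```python
import inspect
from typing import Any


def is_async_callable(obj: Any) -> bool:
    """Return True if calling `obj` produces a coroutine."""
    if inspect.iscoroutinefunction(obj):
        return True
    # Instances that define `async def __call__` are not functions themselves,
    # so inspect the bound `__call__` method instead.
    if inspect.isclass(obj) or not callable(obj):
        return False
    return inspect.iscoroutinefunction(getattr(obj, "__call__", None))


def is_async_generator(obj: Any) -> bool:
    """Return True if calling `obj` produces an async generator."""
    if inspect.isasyncgenfunction(obj):
        return True
    if inspect.isclass(obj) or not callable(obj):
        return False
    return inspect.isasyncgenfunction(getattr(obj, "__call__", None))


class MyAsyncCallable:
    async def __call__(self, foo):
        return foo


class MyAsyncGenerator:
    async def __call__(self, foo):
        yield foo
```

Checking the bound `__call__` is what lets plain instances be treated the same way as `async def` functions.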
2024-05-15 10:49:49 -07:00
Eugene Yurtsev
5c2cfabec6 core[minor]: Add v2 implementation of astream events (#21638)
This PR introduces a v2 implementation of astream events that removes
intermediate abstractions and fixes some issues with the v1 implementation.

The v2 implementation significantly reduces both the amount of code
associated with the astream events implementation and its overhead.

After this PR, the astream events implementation:

- Uses an async callback handler
- No longer relies on BaseTracer
- No longer relies on json patch

As a result of this re-write, a number of issues were discovered with
the existing implementation.

## Changes in V2 vs. V1

### on_chat_model_end `output`

The outputs associated with `on_chat_model_end` changed depending on
whether it was within a chain or not.

As a root-level runnable the output was:

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

As part of a chain the output was:

```python
"data": {
    "output": {
        "generations": [
            [
                {
                    "generation_info": None,
                    "message": AIMessageChunk(
                        content="hello world!", id=AnyStr()
                    ),
                    "text": "hello world!",
                    "type": "ChatGenerationChunk",
                }
            ]
        ],
        "llm_output": None,
    }
},
```

After this PR, we will always use the simpler representation:

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

**NOTE** Non-chat models (i.e., regular LLMs) still use the more verbose
format.
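For consumers that need to handle events from both versions, the two shapes above can be normalized with a small helper. This is a sketch under the assumption that the payloads look exactly like the examples shown; `extract_chat_output` is a hypothetical name, and plain strings stand in for `AIMessageChunk` objects:

```python
from typing import Any


def extract_chat_output(data: dict[str, Any]) -> Any:
    """Return the message from an `on_chat_model_end` event's `data` payload."""
    output = data["output"]
    if isinstance(output, dict) and "generations" in output:
        # Verbose shape (v1 inside a chain): a list of lists of generation dicts.
        return output["generations"][0][0]["message"]
    # Simple shape (v2, and v1 at the root level): the message itself.
    return output


# Verbose payload, with a string standing in for the message object.
verbose = {
    "output": {
        "generations": [
            [
                {
                    "generation_info": None,
                    "message": "hello world!",
                    "text": "hello world!",
                    "type": "ChatGenerationChunk",
                }
            ]
        ],
        "llm_output": None,
    }
}
simple = {"output": "hello world!"}
```

Both `extract_chat_output(verbose)` and `extract_chat_output(simple)` return the same stand-in message.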

### Remove some `_stream` events

`on_retriever_stream` and `on_tool_stream` events were removed. These
were not real events; they were an artifact of implementing astream
events on top of astream_log.

The same information is already available in the corresponding
`on_retriever_end` and `on_tool_end` events.

### Propagating Names

Names of runnables have been updated to be more consistent:

```python
model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields(
    messages=ConfigurableField(
        id="messages",
        name="Messages",
        description="Messages return by the LLM",
    )
)
```

Before:
```python
"name": "RunnableConfigurableFields",
```

After:
```python
"name": "GenericFakeChatModel",
```

### on_retriever_end

`on_retriever_end` will always return `output` as a list of documents
(rather than a dict containing a key called "documents").

### Retry events

Removed the `on_retry` callback handler. It incorrectly showed that the
failed function being retried had invoked `on_chain_end`.


https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394
2024-05-15 11:48:47 -04:00
Rajendra Kadam
54e003268e langchain[minor]: Add PebbloRetrievalQA chain with Identity & Semantic Enforcement support (#20641)
- **Description:** PebbloRetrievalQA chain introduces identity
enforcement using vector-db metadata filtering
- **Dependencies:** None
- **Issue:** None
- **Documentation:** Adding documentation for PebbloRetrievalQA chain in
a separate PR (https://github.com/langchain-ai/langchain/pull/20746)
- **Unit tests:** New unit-tests added

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-05-15 13:14:52 +00:00
Bagatur
241a6e43a5 docs: update structured how to (#21679) 2024-05-14 22:19:51 +00:00
Jib
f369495fa0 mongodb: [performance] Increase DEFAULT_INSERT_BATCH_SIZE to 100,000 and introduce sizing constraints (#19608) 2024-05-14 22:11:26 +00:00
Eugene Yurtsev
e69a9bedf8 core[patch]: Update mypy config (#21684)
Update mypy config to ignore checking deps from numpy and pytest (which are optional in langsmith sdk)
2024-05-14 17:29:07 -04:00
Erick Friis
9973547aef mongodb: release 0.1.4 (#21678) 2024-05-14 11:54:23 -07:00
Jib
a97473c846 mongodb[patch]: Make ObjectId JSON-serializable on generation (#21394) 2024-05-14 11:52:29 -07:00
Eugene Yurtsev
5c64c004cc core[patch]: Add unit tests with some streaming scenarios (#21668)
Add unit tests that show differences between sync / async versions when
streaming.

The inner on_chain_chunk event is missing when sync and async
functionality are mixed. This is likely due to a missing `tap_output_iter`
implementation in the sync variant of `_transform_stream_with_config`.
2024-05-14 15:30:57 +00:00
Eugene Yurtsev
2ac4d2960c core[patch]: Add unit test to catch ordering (#21669)
Add unit test to catch ordering issues
2024-05-14 15:25:33 +00:00
Zhao Blake
972d2071c6 core[patch]: Fix typo in VectorStoreExampleSelector doc-string (#21574) 2024-05-14 13:31:37 +00:00
Erick Friis
2a984e8e3f docs: huggingface package (#21645) 2024-05-14 03:17:40 +00:00
Erick Friis
c77d2f2b06 multiple: core 0.2 nonbreaking dep, check_diff community->langchain dep (#21646)
0.2 is not a breaking release for core (but it is for langchain and
community)

To keep the core+langchain+community packages in sync at 0.2, we will
relax deps throughout the ecosystem to tolerate `langchain-core` 0.2
2024-05-13 19:50:36 -07:00
Anush
edd68e4ad4 qdrant: init package (#21146)
## Description

This PR introduces the new `langchain-qdrant` partner package, intending
to deprecate the community package.

## Changes

- Moved the Qdrant vector store implementation to `/libs/partners/qdrant`,
along with its integration tests.
- The client library is now imported with regular imports instead of
conditional ones, with minor implementation improvements.
- Added a deprecation warning to
`langchain_community.vectorstores.qdrant.Qdrant`.
- Replaced references/imports from `langchain_community` with either
`langchain_core` or by moving the definitions to the `langchain_qdrant`
package itself.
- Updated the Qdrant vector store documentation to reflect the changes.

## Testing
- `QDRANT_URL` and
[`QDRANT_API_KEY`](583e36bf6b)
env values need to be set to [run integration
tests](d608c93d1f)
in the [cloud](https://cloud.qdrant.tech).
- If a Qdrant instance is running at `http://localhost:6333`, the
integration tests will use it too.
- By default, tests use an
[`in-memory`](https://github.com/qdrant/qdrant-client?tab=readme-ov-file#local-mode)
instance (not comprehensive).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-05-13 18:20:03 -07:00
Prashanth Rao
63c3a0e56c [community][graph]: Update KuzuQAChain and docs (#21218)
This PR makes some small updates for `KuzuQAChain` for graph QA.

- Updated the Cypher generation prompt (we now support `WHERE EXISTS`) and
generalized it further
- Support different LLMs for Cypher generation and QA
- Update docs and examples
2024-05-13 17:17:14 -07:00
Tomaz Bratanic
89ff6a3d3b Add sentiment and confidence levels to diffbotgraphtransformer (#21590)
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-13 23:00:52 +00:00
Erick Friis
9b51ca08bc huggingface: fix community dep checking (#21628) 2024-05-13 21:52:18 +00:00
Erick Friis
91a2ea5cd6 chroma, mongodb: fix docstrings (#21629) 2024-05-13 21:27:43 +00:00
Jofthomas
afd85b60fc huggingface: init package (#21097)
First PR for the langchain_huggingface partner package

- Moved some of the Hugging Face related classes from `community` to the
new partner package

Still needed:
- Documentation
- Tests
- Support for the new apply_chat_template in `ChatHuggingFace`
- Confirm the choice of classes to support for embeddings with the
sentence-transformer team.

cc : @efriis

---------

Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-13 20:53:15 +00:00
Tomaz Bratanic
9fce03e7db community[patch]: Fix neo4j enhanced schema (#21582) 2024-05-13 15:26:06 -04:00
Christophe Bornet
66a4da8ad0 community[patch]: Improve Cassandra VectorStore docstrings (#21620) 2024-05-13 15:24:26 -04:00
adreo00
40aff1eacc core[major]: AsyncCallbackManagerForToolRun no longer casts return object to string (#20374)
- **Description:** Stops `AsyncCallbackManagerForToolRun` from
converting the output to str
- **Issue:** #20372
- **Dependencies:** None
2024-05-13 15:09:12 -04:00
Eugene Yurtsev
25fbe356b4 community[patch]: upgrade to recent version of mypy (#21616)
This PR upgrades community to a recent version of mypy. It inserts type:
ignore on all existing failures.
2024-05-13 14:55:07 -04:00
Eugene Yurtsev
b923951062 langchain[patch]: CI add lint rule for community imports (#21618)
Add a rule to check for imports from community in global scope
2024-05-13 14:51:25 -04:00
Jorge Piedrahita Ortiz
4378fbbef0 community[patch]: Fix typos in Sambanova integration doc-strings (#21617)
- **Description:** Fixed badly formatted docstrings in the Sambanova
integration

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-05-13 18:35:16 +00:00
Christophe Bornet
bcf53f93e1 [community]: Add missing docstring param to CassandraLoader (#21611) 2024-05-13 16:03:18 +00:00
Christophe Bornet
e6fa4547b1 community[minor]: Add alazy_load to AsyncHtmlLoader (#21536)
Also fixes a bug where `_scrape` was called and made a second HTTP
request synchronously.

**Twitter handle:** cbornet_
2024-05-13 12:01:03 -04:00
Wang Guan
b53548dcda langchain[minor]: allow CacheBackedEmbeddings to cache queries (#20073)
Add optional caching of queries to cache backed embeddings
2024-05-13 15:18:04 +00:00
Guangdong Liu
a156aace2b core[patch]:Fix Incorrect listeners parameters for Runnable.with_listeners() and .map() (#20661)
- **Issue:** fix #20509
-  @baskaryan, @eyurtsev


![image](https://github.com/langchain-ai/langchain/assets/48236177/f799a976-b983-4d8b-b373-64392e1fd6c6)
2024-05-13 11:16:17 -04:00
junkeon
480c02bf55 upstage[minor]: add merge_and_split function for document loader (#21603)
- Introduce the `merge_and_split` function in the
`UpstageLayoutAnalysisLoader`.
- The `merge_and_split` function takes a list of documents and a
splitter as inputs.
- This function merges all documents and then divides them using the
`split_documents` method provided by the splitter.
- If the provided splitter is `None` (which is the default setting), the
function will simply merge the documents without splitting them.
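The behavior described in the list above can be sketched roughly as follows. This is not the loader's actual code: the `Document` dataclass and `FixedWidthSplitter` are stand-ins defined only for this example, and joining page contents with a single space is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Optional, Protocol


@dataclass
class Document:
    """Stand-in for a document type with text and metadata."""

    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


class Splitter(Protocol):
    def split_documents(self, documents: list[Document]) -> list[Document]: ...


def merge_and_split(
    documents: list[Document], splitter: Optional[Splitter] = None
) -> list[Document]:
    """Merge all documents into one, then optionally split the result."""
    # Assumption: contents are joined with a space; metadata handling is omitted.
    merged = Document(page_content=" ".join(d.page_content for d in documents))
    if splitter is None:
        # Default: merge only, no splitting.
        return [merged]
    return splitter.split_documents([merged])


class FixedWidthSplitter:
    """Hypothetical splitter that cuts text into fixed-width chunks."""

    def __init__(self, width: int) -> None:
        self.width = width

    def split_documents(self, documents: list[Document]) -> list[Document]:
        chunks = []
        for doc in documents:
            text = doc.page_content
            for start in range(0, len(text), self.width):
                chunks.append(Document(page_content=text[start : start + self.width]))
        return chunks
```

Calling `merge_and_split(docs)` with no splitter returns a single merged document, while passing a splitter delegates the division entirely to its `split_documents` method.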
2024-05-13 10:55:19 -04:00
Leonid Ganeline
500569da48 community[patch]: vectorstores import update (#21169)
Issue: we have several helper functions for importing third-party
libraries, such as lancedb.import_lancedb in
[community.vectorstores](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.lancedb.import_lancedb.html#langchain_community.vectorstores.lancedb.import_lancedb).
We also have core.utils.utils.guard_import, which serves exactly this
purpose.
The import_<package> functions behave inconsistently and should rather be
private functions.
Change: replaced these functions with the guard_import function.
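
The idea behind a guarded import can be sketched as below. This is an illustrative stand-in built on the standard `importlib` module, not the library's implementation; the name `guard_import_sketch`, the `pip_name` parameter, and the error wording are assumptions made for this example.

```python
import importlib
from types import ModuleType
from typing import Optional


def guard_import_sketch(
    module_name: str, *, pip_name: Optional[str] = None
) -> ModuleType:
    """Import a module, or raise an ImportError that says how to install it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Assumption: the pip package name defaults to the top-level module
        # name with underscores replaced by hyphens.
        install_name = pip_name or module_name.split(".")[0].replace("_", "-")
        raise ImportError(
            f"Could not import {module_name} python package. "
            f"Please install it with `pip install {install_name}`."
        ) from exc
```

A single helper like this gives every optional dependency the same error message, which is the consistency the per-package `import_<package>` helpers lacked.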

Related to #21133
2024-05-13 10:45:31 -04:00
ccurme
3003363605 langchain, community: remove cap on sqlalchemy and bump duckdb (#21509) 2024-05-13 10:16:09 -04:00
ccurme
01a3228d8e standard tests: add test for few-shot examples (#21019) 2024-05-13 10:06:12 -04:00
Chuyuan Qu
af875cff57 prompty: adding Microsoft langchain_prompty package (#21346)
Co-authored-by: Micky Liu <wayliu@microsoft.com>
Co-authored-by: wayliums <wayliums@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-11 04:03:44 +00:00
Matt Florence
d3ca2cc8c3 langchain: Fix broken OpenAIModerationChain and implement async (#18537)
## PR title
langchain[patch]: fix `OpenAIModerationChain` and implement async

## PR message
Description: fix `OpenAIModerationChain` and implement async

Issues: 
- https://github.com/langchain-ai/langchain/issues/18533 
- https://github.com/langchain-ai/langchain/issues/13685

Dependencies: none
Twitter handle: mattflo


## Add tests and docs
 
Existing documentation is broken:
https://python.langchain.com/docs/guides/safety/moderation


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Emilia Katari <emilia@outpace.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-05-10 19:04:13 +00:00
ccurme
4170e72a42 openai: fix loads unit test (#21542)
following changes to tests in core here:
https://github.com/langchain-ai/langchain/pull/21342/files
2024-05-10 18:46:34 +00:00
Erick Friis
3db85cbb5b community: deps (#21508) 2024-05-09 15:12:34 -07:00
Erick Friis
8580e350be cli: release 0.0.22 (#21507) 2024-05-09 21:45:20 +00:00
Anthony Chu
c735849e76 azure-dynamic-sessions: add Python REPL tool (#21264)
Adds a Python REPL that executes code in a code interpreter session
using Azure Container Apps dynamic sessions.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-09 21:39:04 +00:00
Erick Friis
02701c277f langchain: core min version (#21506) 2024-05-09 13:45:44 -07:00
Erick Friis
13b01104c9 langchain: drop sqlalchemy max, release 0.2.0rc2 (#21504) 2024-05-09 13:12:38 -07:00
ccurme
375f447e58 community: fix builds with min dependencies (#21495) 2024-05-09 13:01:44 -07:00
Trayan Azarov
ba7d53689c community: Chroma Adding create_collection_if_not_exists flag to Chroma constructor (#21420)
- **Description:** Adds the ability to either `get_or_create` or simply
`get_collection`. This is useful when dealing with read-only Chroma
instances where users are constrained to using `get_collection`. Targeted
mostly at Http/CloudClients.
- **Issue:** chroma-core/chroma#2163
- **Dependencies:** N/A
- **Twitter handle:** `@t_azarov`




| Collection Exists | create_collection_if_not_exists | Outcome | Test |
|-------------------|---------------------------------|---------|------|
| True | False | No errors, collection state unchanged | `test_create_collection_if_not_exist_false_existing` |
| True | True | No errors, collection state unchanged | `test_create_collection_if_not_exist_true_existing` |
| False | False | Error, `get_collection()` fails | `test_create_collection_if_not_exist_false_non_existing` |
| False | True | No errors, `get_or_create_collection()` creates the collection | `test_create_collection_if_not_exist_true_non_existing` |
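
The four outcomes in the table can be reproduced with a small sketch. `FakeClient` is an in-memory stand-in written for this example (it is not the Chroma client), and `resolve_collection` is a hypothetical helper showing only the branching on the flag, not the constructor's actual code.

```python
class FakeClient:
    """In-memory stand-in for a client exposing the two collection calls."""

    def __init__(self, existing=()):
        self.collections = {name: {"name": name} for name in existing}

    def get_collection(self, name):
        # Fails when the collection is missing, like a read-only lookup.
        if name not in self.collections:
            raise ValueError(f"Collection {name} does not exist.")
        return self.collections[name]

    def get_or_create_collection(self, name):
        # Returns the existing collection or creates a new one.
        return self.collections.setdefault(name, {"name": name})


def resolve_collection(client, name, *, create_collection_if_not_exists):
    """Pick the client call based on the flag."""
    if create_collection_if_not_exists:
        return client.get_or_create_collection(name)
    return client.get_collection(name)
```

With the flag set to `False`, the helper never attempts a write, which is the property that matters for read-only instances.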
2024-05-09 11:45:10 -04:00
ccurme
3bb9bec314 bedrock: add unit test for retriever (#21485)
This was implemented in
https://github.com/langchain-ai/langchain/pull/21349 but dropped before
merge.
2024-05-09 11:37:03 -04:00
Renu Rozera
4035a1d234 Add source metadata to bedrock retriever response (#21349)
- [X] **PR title**: "community: Add source metadata to bedrock retriever
response"

- [X] **PR message**:
    - **Description:** The Bedrock retrieve API returns extra metadata in
    its response that is currently not returned in the retriever response.
    - **Issue:** The change adds the metadata from the Bedrock retrieve API
    response to the Bedrock retriever in a backward-compatible way. Renamed
    metadata to sourceMetadata because the term metadata is already used in
    the Document. This is in sync with what we are doing in llama-index
    as well.
    - **Dependencies:** None


- [X] **Add tests and docs**:
  1. Added unit tests
  2. Notebook already exists and does not need any change
  3. Response from end-to-end testing, just to ensure backward
  compatibility: `[Document(page_content='Exoplanets.',
  metadata={'location': {'s3Location': {'uri':
  's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647,
  'source_metadata': {'x-amz-bedrock-kb-source-uri':
  's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year':
  1946.0}})]`
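
A short sketch of how a caller might read the new field while staying compatible with documents that lack it. The dictionary is copied from the sample response above; `source_field` is a hypothetical helper, not part of the retriever.

```python
from typing import Any

# Metadata copied from the sample Document in the end-to-end response.
metadata: dict[str, Any] = {
    "location": {"s3Location": {"uri": "s3://bucket/file_name.txt"}, "type": "S3"},
    "score": 0.46886647,
    "source_metadata": {
        "x-amz-bedrock-kb-source-uri": "s3://bucket/file_name.txt",
        "tag": "space",
        "team": "Nasa",
        "year": 1946.0,
    },
}


def source_field(metadata: dict[str, Any], key: str, default: Any = None) -> Any:
    """Read one field from `source_metadata`, tolerating its absence."""
    return metadata.get("source_metadata", {}).get(key, default)
```

Existing keys such as `location` and `score` are read exactly as before, so older callers are unaffected.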


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
2024-05-09 11:06:22 -04:00
Erick Friis
f178c67ad0 community: release 0.2.0rc1, bump deps (#21470) 2024-05-08 23:32:44 -07:00
William FH
b28be5d407 Pass through Run ID Explicitly (#21469) 2024-05-08 22:20:51 -07:00