Commit Graph

878 Commits

Author SHA1 Message Date
Kwan Kin Chan
6d2a76ac05
langchain_huggingface: Fix multiple GPU usage bug in from_model_id function (#23628)
- **Description:**
  - Pass the device_map into model_kwargs.
  - Remove the unused device_map variable in the hf_pipeline function call.
- **Issue:** #13128
When using the from_model_id function to load a Hugging Face model for
text generation across multiple GPUs, the model defaults to loading on
the CPU despite multiple GPUs being available, even when using the
expected format:
``` python
llm = HuggingFacePipeline.from_model_id(
    model_id="model-id",
    task="text-generation",
    device_map="auto",
)
```
Currently, to enable multi-GPU usage, we have to pass the variable in
this format instead:
``` python
llm = HuggingFacePipeline.from_model_id(
    model_id="model-id",
    task="text-generation",
    device=None,
    model_kwargs={
        "device_map": "auto",
    }
)
```
This issue arises due to improper handling of the device and device_map
parameters.

- **Explanation:**
1. In from_model_id, the model is created using model_kwargs and passed
as the model variable of the pipeline function. So at this moment, to
load the model with multiple GPUs, "device_map" needs to be set to
"auto" within model_kwargs. Otherwise, the model defaults to loading on
the CPU.
2. The device_map variable in from_model_id is not utilized correctly.
In the source code of the transformers pipeline function:
- The device_map variable is stored in the model_kwargs dictionary
(lines 867-878 of transformers/src/transformers/pipelines/__init__.py).
```python
    if device_map is not None:
        ......
        model_kwargs["device_map"] = device_map
```
- The model is constructed with model_kwargs containing the device_map
value ONLY IF the model argument is a string (lines 893-903 of
transformers/src/transformers/pipelines/__init__.py).
```python
    if isinstance(model, str) or framework is None:
        model_classes = {"tf": targeted_task["tf"], "pt": targeted_task["pt"]}
        framework, model = infer_framework_load_model( ... , **model_kwargs, )
```
- Consequently, since a model object is already passed to the pipeline
function, the device_map variable from from_model_id is never used.

3. The device_map variable in from_model_id is not only unused but can
also cause errors. Without explicitly setting device=None, attempting
to load the model on multiple GPUs may result in the following error:
```
Device has 2 GPUs available. Provide device={deviceId} to `from_model_id`
to use available GPUs for execution. deviceId is -1 (default) for CPU and
can be a positive integer associated with CUDA device id.
Traceback (most recent call last):
  File "foo.py", line 15, in <module>
    llm = HuggingFacePipeline.from_model_id(
  File "foo\site-packages\langchain_huggingface\llms\huggingface_pipeline.py", line 217, in from_model_id
    pipeline = hf_pipeline(
  File "foo\lib\site-packages\transformers\pipelines\__init__.py", line 1108, in pipeline
    return pipeline_class(model=model, framework=framework, task=task, **kwargs)
  File "foo\lib\site-packages\transformers\pipelines\text_generation.py", line 96, in __init__
    super().__init__(*args, **kwargs)
  File "foo\lib\site-packages\transformers\pipelines\base.py", line 835, in __init__
    raise ValueError(
ValueError: The model has been loaded with `accelerate` and therefore cannot
be moved to a specific device. Please discard the `device` argument when
creating your pipeline object.
```
This error occurs because the default values for device and device_map
in from_model_id are -1 and None, respectively. With these defaults,
the check (`device_map is not None and device < 0`) passes and device
stays at -1, so the pipeline function later raises an error when trying
to move a GPU-loaded model back to the CPU.
19eb82e68b/libs/community/langchain_community/llms/huggingface_pipeline.py (L204-L213)
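Below is a minimal sketch of the fix described above (not the exact
patch): forward device_map into model_kwargs before the model is
constructed, so that accelerate can place the model across the
available GPUs. Names mirror from_model_id's parameters; the real
change lives in langchain_huggingface/llms/huggingface_pipeline.py.
```python
from transformers import AutoModelForCausalLM


def load_model(model_id, device=None, device_map=None, model_kwargs=None):
    _model_kwargs = dict(model_kwargs or {})
    if device_map is not None:
        if device is not None and device != -1:
            raise ValueError("Please pass either `device` or `device_map`, not both.")
        # With device_map inside model_kwargs, accelerate shards the model
        # across the available GPUs instead of defaulting to CPU.
        _model_kwargs["device_map"] = device_map
    return AutoModelForCausalLM.from_pretrained(model_id, **_model_kwargs)
```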





---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: vbarda <vadym@langchain.dev>
2024-10-22 21:41:47 -04:00
Fernando de Oliveira
ab205e7389
partners/openai + community: Async Azure AD token provider support for Azure OpenAI (#27488)
This PR introduces a new `azure_ad_async_token_provider` attribute to
the `AzureOpenAI` and `AzureChatOpenAI` classes in `partners/openai` and
`community` packages, given it's currently supported on `openai` package
as
[AsyncAzureADTokenProvider](https://github.com/openai/openai-python/blob/main/src/openai/lib/azure.py#L33)
type.

The reason for creating a new attribute is to avoid breaking changes.
Let's say you have an existing code that uses a `AzureOpenAI` or
`AzureChatOpenAI` instance to perform both sync and async operations.
The `azure_ad_token_provider` will work exactly as it is today, while
`azure_ad_async_token_provider` will override it for async requests.
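A hedged usage sketch, assuming azure-identity for credential handling;
the deployment name and API version are illustrative:
```python
from azure.identity.aio import DefaultAzureCredential, get_bearer_token_provider
from langchain_openai import AzureChatOpenAI

credential = DefaultAzureCredential()
token_provider = get_bearer_token_provider(
    credential, "https://cognitiveservices.azure.com/.default"
)

llm = AzureChatOpenAI(
    azure_deployment="my-deployment",  # illustrative
    api_version="2024-06-01",          # illustrative
    azure_ad_async_token_provider=token_provider,  # used for async requests
)
# Sync calls keep using azure_ad_token_provider if set; async calls
# (e.g. await llm.ainvoke(...)) use the new async provider.
```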


2024-10-22 21:43:06 +00:00
Vadym Barda
0640cbf2f1
huggingface[patch]: hide client field in HuggingFaceEmbeddings (#27522) 2024-10-21 17:37:07 -04:00
Erick Friis
4ceb28009a
mongodb: migrate to repo (#27467) 2024-10-18 12:35:12 -07:00
Erick Friis
a562c54f7d
azure-dynamic-sessions: migrate to repo (#27468) 2024-10-18 12:30:48 -07:00
Erick Friis
2cf2cefe39
partners/openai: release 0.2.3 (#27457) 2024-10-18 08:16:01 -07:00
Erick Friis
7d65a32ee0
openai: audio modality, remove sockets from unit tests (#27436) 2024-10-18 08:02:09 -07:00
Bagatur
a4392b070d core[patch]: add convert_to_openai_messages util (#27263)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-16 17:10:10 +00:00
Erick Friis
edf6d0a0fb
partners/couchbase: release 0.2.0 (attempt 2) (#27375) 2024-10-15 14:51:05 -07:00
Erick Friis
92ae61bcc8
multiple: rely on asyncio_mode auto in tests (#27200) 2024-10-15 16:26:38 +00:00
William FH
0a3e089827
[Anthropic] Shallow Copy (#27105)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 15:50:48 +00:00
Trayan Azarov
59bbda9ba3
chroma: Deprecating versions 0.5.7 thru 0.5.12 (#27305)
**Description:** Deprecated versions of Chroma >=0.5.5, <0.5.12 due to a
serious correctness issue that caused some embeddings for deployments
with multiple collections to be lost (read more on the issue in the
Chroma repo).
**Issue:** chroma-core/chroma#2922 (fixed by chroma-core/chroma#2923
and released in
[0.5.13](https://github.com/chroma-core/chroma/releases/tag/0.5.13))
**Dependencies:** N/A
**Twitter handle:** `@t_azarov`
2024-10-14 11:56:05 -04:00
Bagatur
ce33c4fa40
openai[patch]: default temp=1 for o1 (#27206) 2024-10-08 15:45:21 -07:00
RIdham Golakiya
73ad7f2e7a
langchain_chroma[patch]: updated example for get documents with where clause (#26767)
Updated the example for the ChromaDB vectorstore.

If we want to apply multiple filters, ChromaDB supports combining them
in a single `where` clause, as shown in the sketch below.
Reference: [ChromaDB
filters](https://cookbook.chromadb.dev/core/filters/)
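A hedged sketch, assuming an existing `langchain_chroma` vectorstore;
the metadata field names are illustrative:
```python
# Combine two metadata conditions with ChromaDB's $and operator.
results = vectorstore.get(
    where={"$and": [{"source": "docs"}, {"year": {"$gte": 2023}}]}
)
```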

Thank you.
2024-10-08 20:21:58 +00:00
Bagatur
38099800cc
docs: fix anthropic max_tokens docstring (#27166) 2024-10-07 16:51:42 +00:00
Bagatur
06ce5d1d5c
anthropic[patch]: Release 0.2.3 (#27126) 2024-10-04 22:38:03 +00:00
Bagatur
0b8416bd2e
anthropic[patch]: fix input_tokens when cached (#27125) 2024-10-04 22:35:51 +00:00
Bagatur
bd5b335cb4
standard-tests[patch]: fix oai usage metadata test (#27122) 2024-10-04 20:00:48 +00:00
Bagatur
827bdf4f51
fireworks[patch]: Release 0.2.1 (#27120) 2024-10-04 18:59:15 +00:00
Bagatur
98942edcc9
openai[patch]: Release 0.2.2 (#27119) 2024-10-04 11:54:01 -07:00
Bagatur
414fe16071
anthropic[patch]: Release 0.2.2 (#27118) 2024-10-04 11:53:53 -07:00
Scott Hurrey
558fb4d66d
box: Add citation support to langchain_box.retrievers.BoxRetriever when used with Box AI (#27012)

**Description:** Box AI can return responses, but it can also be
configured to return citations. This change allows the developer to
decide if they want the answer, the citations, or both. Regardless of
the combination, this is returned as a single List[Document] object.
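A hypothetical usage sketch: the `answer` and `citations` flags reflect
the behavior described above, while the token and file ID are
placeholders.
```python
from langchain_box.retrievers import BoxRetriever

retriever = BoxRetriever(
    box_developer_token="<token>",  # placeholder
    box_file_ids=["12345"],         # placeholder
    answer=True,     # include the Box AI answer
    citations=True,  # include citation Documents as well
)
docs = retriever.invoke("What is the deductible in this policy?")
```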

**Dependencies:** Updated to the latest Box Python SDK, v1.5.1
**Twitter handle:** BoxPlatform


Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-04 18:32:34 +00:00
Bagatur
1e768a9ec7
anthropic[patch]: correctly handle tool msg with empty list (#27109) 2024-10-04 11:30:50 -07:00
Bagatur
4935a14314
core,integrations[minor]: Dont error on fields in model_kwargs (#27110)
Given the current erroring behavior, every time we've moved a kwarg out
of model_kwargs and made it its own field, that was a breaking change.
This updates the behavior to support the old instantiations /
serializations.

This assumes build_extra_kwargs is not itself used externally and so
does not need to be kept backwards compatible.
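An illustrative example of the instantiation pattern this change
tolerates, assuming the value is shifted to the dedicated field rather
than raising:
```python
from langchain_openai import ChatOpenAI

# Previously this raised because `temperature` is a dedicated field;
# with this change the value is moved out of model_kwargs instead.
llm = ChatOpenAI(model_kwargs={"temperature": 0.5})
assert llm.temperature == 0.5
```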
2024-10-04 11:30:27 -07:00
Bagatur
0495b7f441
anthropic[patch]: add usage_metadata details (#27087)
fixes https://github.com/langchain-ai/langchain/pull/27087
2024-10-04 08:46:49 -07:00
Erick Friis
e8e5d67a8d
openai: fix None token detail (#27091)
happens in Azure
2024-10-04 01:25:38 +00:00
Bagatur
c09da53978
openai[patch]: add usage metadata details (#27080) 2024-10-03 14:01:03 -07:00
Bagatur
099235da01
Revert "huggingface[patch]: make HuggingFaceEndpoint serializable (#27027)" (#27032)

This reverts commit b5e28d3a6d.
2024-10-01 21:26:38 +00:00
Bagatur
5f2e93ffea
huggingface[patch]: xfail test (#27031) 2024-10-01 21:14:07 +00:00
Bagatur
b5e28d3a6d
huggingface[patch]: make HuggingFaceEndpoint serializable (#27027) 2024-10-01 13:16:10 -07:00
Erick Friis
a8e1577f85
milvus: mv to external repo (#26920) 2024-10-01 00:38:30 +00:00
Erick Friis
35f6393144
unstructured: mv to external repo (#26923) 2024-09-30 17:38:21 -07:00
Erick Friis
7ecd720120
multiple: update docs urls to latest 2 (#26837) 2024-09-30 17:37:07 -07:00
Bagatur
0078493a80
fireworks[patch]: allow tool_choice with multiple tools (#26999)
https://docs.fireworks.ai/api-reference/post-chatcompletions
2024-09-30 11:28:43 -07:00
Bagatur
c7120d87dd
groq[patch]: support tool_choice=any/required (#27000)
https://console.groq.com/docs/api-reference#chat-create
2024-09-30 11:28:35 -07:00
Bagatur
9404e7af9d
openai[patch]: exclude http client (#26891)
httpx clients aren't serializable
2024-09-29 11:16:27 -07:00
ccurme
39987ebd91
openai[patch]: update deprecation target in API ref (#26921) 2024-09-27 08:42:31 -04:00
Erick Friis
8bc12df2eb
voyageai: new models (#26907)
Co-authored-by: fzowl <zoltan@voyageai.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
2024-09-26 17:07:10 +00:00
ccurme
7091a1a798
openai[patch]: increase token limit in azure integration tests (#26901)
`test_json_mode` occasionally runs into this
2024-09-26 14:31:33 +00:00
Bagatur
eaffa92c1d
openai[patch]: Release 0.2.1 (#26858) 2024-09-25 15:55:49 +00:00
John
6c3ea262c8
partners/unstructured: release 0.1.5 (#26831)
**Description:** update package version to support loading URLs #26670
**Issue:**  #26697
2024-09-24 15:02:53 -07:00
ccurme
2a4c5713cd
openai[patch]: fix azure integration tests (#26791) 2024-09-23 17:49:15 -04:00
Bagatur
e1e4f88b3e
openai[patch]: enable Azure structured output, parallel_tool_calls=False, tool_choice=required (#26599)

response_format=json_schema, tool_choice=required, parallel_tool_calls
are all supported for gpt-4o on azure.
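A hedged sketch of the now-enabled options on Azure; the deployment
name, API version, and schema are illustrative:
```python
from pydantic import BaseModel
from langchain_openai import AzureChatOpenAI


class Answer(BaseModel):
    text: str
    confidence: float


llm = AzureChatOpenAI(azure_deployment="gpt-4o", api_version="2024-08-01-preview")

# Structured output via response_format=json_schema.
structured_llm = llm.with_structured_output(Answer, method="json_schema")

# tool_choice="required" with parallel tool calls disabled.
bound_llm = llm.bind_tools(
    [Answer], tool_choice="required", parallel_tool_calls=False
)
```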
2024-09-22 22:25:22 -07:00
Anton Dubovik
3e2cb4e8a4
openai: embeddings: supported chunk_size when check_embedding_ctx_length is disabled (#23767)
Chunking of the input array controlled by `self.chunk_size` is being
ignored when `self.check_embedding_ctx_length` is disabled. Effectively,
the chunk size is assumed to be equal to 1 in such a case. This is
surprising.

The PR takes into account `self.chunk_size` passed by the user.
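A minimal sketch of the intended behavior (not the exact patch): batch
the inputs by chunk_size even when the context-length check is skipped.
`client` stands in for the OpenAI embeddings client.
```python
def embed_in_chunks(client, model: str, texts: list[str], chunk_size: int):
    embeddings = []
    # Honor chunk_size instead of sending inputs one at a time.
    for i in range(0, len(texts), chunk_size):
        batch = texts[i : i + chunk_size]
        response = client.create(input=batch, model=model)
        embeddings.extend(item.embedding for item in response.data)
    return embeddings
```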

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 16:58:45 -07:00
Nithish Raghunandanan
2d21274bf6
couchbase: Add ttl support to caches & chat_message_history (#26214)
**Description:** Add support to delete documents automatically from the
caches & chat message history by adding a new optional parameter, `ttl`.
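A hypothetical usage sketch; apart from the new `ttl` parameter, the
constructor arguments are illustrative:
```python
from datetime import timedelta

from langchain_couchbase.cache import CouchbaseCache

cache = CouchbaseCache(
    cluster=cluster,  # an existing couchbase Cluster connection
    bucket_name="langchain",
    scope_name="_default",
    collection_name="cache",
    ttl=timedelta(days=1),  # entries expire automatically after one day
)
```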



---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:44:29 +00:00
Krishna Kulkarni
c6c508ee96
Refining Skip Count Calculation by Filtering Documents with session_id (#26020)
In the previous implementation, `skip_count` was counting all the
documents in the collection. Instead, we want to filter the documents by
`session_id` and calculate `skip_count` by subtracting `history_size`
from the filtered count.
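A minimal sketch of the corrected calculation, assuming a MongoDB
collection with one document per message keyed by a SessionId field;
names are assumptions, not the exact patch.
```python
# Count only this session's messages, then skip all but the last
# history_size of them.
filtered_count = collection.count_documents({"SessionId": session_id})
skip_count = max(filtered_count - history_size, 0)
cursor = collection.find({"SessionId": session_id}).skip(skip_count)
```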

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-20 23:40:56 +00:00
Lucain
a2023a1e96
huggingface: fix huggingface_endpoint.py (initialize clients only with supported kwargs) (#26378)
## Description

By default, `HuggingFaceEndpoint` instantiates both the
`InferenceClient` and the `AsyncInferenceClient` with the
`"server_kwargs"` passed as input. This is an issue as both clients
might not support exactly the same kwargs. This has been highlighted in
https://github.com/huggingface/huggingface_hub/issues/2522 by
@morgandiverrez with the `trust_env` parameter. In order to make
`langchain` integration future-proof, I do think it's wiser to forward
only the supported parameters to each client. Parameters that are not
supported are simply ignored with a warning to the user. From a
`huggingface_hub` maintenance perspective, this allows us much more
flexibility as we are not constrained to support the exact same kwargs
in both clients.
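A minimal sketch of the forwarding logic described above, assuming
signature inspection is used to decide which kwargs each client
supports:
```python
import inspect
import warnings


def filter_supported_kwargs(client_cls, kwargs: dict) -> dict:
    # Keep only kwargs that appear in the client's __init__ signature;
    # warn about (and drop) the rest.
    supported = set(inspect.signature(client_cls.__init__).parameters)
    ignored = set(kwargs) - supported
    if ignored:
        warnings.warn(
            f"Ignoring unsupported kwargs for {client_cls.__name__}: {sorted(ignored)}"
        )
    return {key: value for key, value in kwargs.items() if key in supported}
```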

## Issue

https://github.com/huggingface/huggingface_hub/issues/2522

## Dependencies

None

## Twitter 

https://x.com/Wauplin

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 16:05:24 -07:00
ccurme
eef18dec44
unstructured[patch]: support loading URLs (#26670)
`unstructured.partition.auto.partition` supports a `url` kwarg, but
`url` in `UnstructuredLoader.__init__` is reserved for the server URL.
Here we add a `web_url` kwarg that is passed to the partition kwargs:
```python
self.unstructured_kwargs["url"] = web_url
```
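A hypothetical usage example of the new kwarg:
```python
from langchain_unstructured import UnstructuredLoader

# web_url is forwarded to unstructured's partition() as its url kwarg.
loader = UnstructuredLoader(web_url="https://example.com/article.html")
docs = loader.load()
```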
2024-09-19 11:40:25 -07:00
ccurme
7d49ee9741
unstructured[patch]: add to integration tests (#26666)
- Add to tests on parsed content;
- Add tests for async + lazy loading;
- Add a test for `strategy="hi_res"`.
2024-09-19 13:43:34 -04:00
Daniel Cooke
7835c0651f
langchain_chroma: Pass through kwargs to Chroma collection.delete (#25970)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 04:21:24 +00:00