langchain

mirror of https://github.com/hwchase17/langchain.git synced 2026-01-30 05:47:54 +00:00


Mikhail Khludnev 14ff1438e6 nvidia-trt[patch]: propagate InferenceClientException to the caller. (#16936)

- **Description:**

1. Propagate InferenceClientException to the caller.
2. Stop the gRPC receiver thread on exception.

Before the change, I got:

```
        for token in result_queue:
>           result_str += token
E           TypeError: can only concatenate str (not "InferenceServerException") to str

../../langchain_nvidia_trt/llms.py:207: TypeError
```
And the stream thread keeps running.

After the change, the request thread stops correctly and the caller gets
the root-cause exception:

```
E                   tritonclient.utils.InferenceServerException: [request id: 4529729] expected number of inputs between 2 and 3 but got 10 inputs for model 'vllm_model'

../../langchain_nvidia_trt/llms.py:205: InferenceServerException
```

  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
  - **Twitter handle:** [t.me/mkhl_spb](https://t.me/mkhl_spb)
 
I'm not sure about test coverage. Should I set up deep mocks, or is there
some kind of Triton stub available via testcontainers or similar?
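
For illustration, the two behaviours described above (re-raising the server error in the caller and stopping the receiver thread) can be sketched without a Triton server. This is only a minimal sketch, not the actual `langchain_nvidia_trt` code: `StreamError`, `ResultQueue`, `receiver`, and `collect` are hypothetical names, and `StreamError` merely stands in for `tritonclient.utils.InferenceServerException`.

```python
import queue
import threading


class StreamError(Exception):
    """Hypothetical stand-in for tritonclient.utils.InferenceServerException."""


class ResultQueue:
    """Iterable queue that re-raises any exception the producer enqueued."""

    _DONE = object()  # sentinel marking the end of the stream

    def __init__(self):
        self._q = queue.Queue()

    def put(self, item):
        self._q.put(item)

    def close(self):
        self._q.put(self._DONE)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._q.get()
        if item is self._DONE:
            raise StopIteration
        if isinstance(item, Exception):
            # Surface the root cause instead of handing it out as a "token".
            raise item
        return item


def receiver(events, results, stop):
    """Hypothetical receiver thread: forwards events until an error or a stop."""
    for event in events:
        if stop.is_set():
            break
        results.put(event)
        if isinstance(event, Exception):
            stop.set()  # stop receiving once the stream has failed
            break
    results.close()


def collect(events):
    """Concatenate streamed tokens; a streamed exception reaches the caller."""
    results = ResultQueue()
    stop = threading.Event()
    worker = threading.Thread(target=receiver, args=(events, results, stop))
    worker.start()
    text = ""
    try:
        for token in results:
            text += token
        return text
    finally:
        stop.set()
        worker.join()  # the receiver thread never outlives the call
```

A test built on a sketch like this could feed a fake event list containing an exception and assert both that the exception is raised and that no extra thread is left running, which may be lighter than a containerised Triton stub.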

2024-02-06 11:47:07 -08:00

anthropic

anthropic[patch]: Fix message type lookup in Anthropic Partners (#16563)

2024-01-25 09:17:59 -08:00

exa

exa: init pkg (#16553)

2024-01-24 20:57:17 -07:00

google-genai

google-genai[patch]: fix new core typing (#16988)

2024-02-03 17:45:44 -08:00

google-vertexai

google-vertexai[patch]: streaming bug (#16603)