langchain/libs/community/tests/integration_tests/embeddings/test_yandex.py
Mikhail Khludnev a017f49fd3
community[patch]: fix #25575 YandexGPTs for _grpc_metadata (#25617)
This PR fixes two issues:

### YGPTs are broken #25575

```
File ....conda/lib/python3.11/site-packages/langchain_community/embeddings/yandex.py:211, in _make_request(self, texts, **kwargs)
..
--> 211 res = stub.TextEmbedding(request, metadata=self._grpc_metadata)  # type: ignore[attr-defined]

AttributeError: 'YandexGPTEmbeddings' object has no attribute '_grpc_metadata'
```
My gut feeling is that #23841 is the cause.

As a quick fix I had to drop the leading underscore from `_grpc_metadata`,
but I don't know how to do this in a sufficiently _pydantic_ way.

### minor issue:

If we use `api_key` (which is not best practice), the code fails with:

```
File ~/git/...../python3.11/site-packages/langchain_community/embeddings/yandex.py:119, in YandexGPTEmbeddings.validate_environment(cls, values)
...

AttributeError: 'tuple' object has no attribute 'append'
```
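
Python raises this error whenever `.append` is called on a tuple. Below is a minimal sketch of the failure mode and one possible fix, assuming the gRPC metadata starts out as a tuple of `(key, value)` pairs. The function names and header values are hypothetical and are not taken from `yandex.py`:

```python
from typing import List, Sequence, Tuple

Metadata = Sequence[Tuple[str, str]]


def add_header_broken(metadata: Metadata, key: str, value: str) -> None:
    # Fails when `metadata` is a tuple: tuples are immutable and have no `.append`.
    metadata.append((key, value))  # type: ignore[attr-defined]


def add_header_fixed(
    metadata: Metadata, key: str, value: str
) -> List[Tuple[str, str]]:
    # Build a new list instead of mutating the input in place.
    return [*metadata, (key, value)]


# Hypothetical metadata built from an API key (illustrative values only).
api_key_metadata = (("authorization", "Api-Key <secret>"),)

try:
    add_header_broken(api_key_metadata, "x-example-header", "true")
except AttributeError as exc:
    error_message = str(exc)

fixed = add_header_fixed(api_key_metadata, "x-example-header", "true")
```

Building a fresh list leaves the original tuple untouched, so the same code path works whether the metadata started as a tuple or a list.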

- Added a new integration test. It requires a YGPT environment and an
active account. I don't know how integration tests are enabled or disabled in CI.
- Added small unit tests with mocks. They should be fine.
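
As a rough illustration of the mock-based approach, the sketch below replaces the gRPC stub with a `MagicMock`, so no network access or account is needed. `SketchEmbeddings` is a hypothetical stand-in written for this example, not the real `YandexGPTEmbeddings`, and the metadata values are made up:

```python
from typing import List, Tuple
from unittest.mock import MagicMock


class SketchEmbeddings:
    """Hypothetical client that forwards texts to a gRPC-like stub."""

    def __init__(self, stub, grpc_metadata: List[Tuple[str, str]]) -> None:
        self.stub = stub
        self.grpc_metadata = grpc_metadata

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        vectors = []
        for text in texts:
            # One stub call per text, always passing the stored metadata.
            response = self.stub.TextEmbedding(text, metadata=self.grpc_metadata)
            vectors.append(list(response.embedding))
        return vectors


def run_mocked_check():
    stub = MagicMock()
    # Every call returns an object exposing a fixed `embedding` attribute.
    stub.TextEmbedding.return_value = MagicMock(embedding=[0.1, 0.2, 0.3])
    metadata = [("authorization", "Api-Key test")]
    client = SketchEmbeddings(stub, metadata)
    result = client.embed_documents(["a", "b"])
    # Verify that the last call received the expected metadata.
    stub.TextEmbedding.assert_called_with("b", metadata=metadata)
    return result, stub.TextEmbedding.call_count
```

A test of this shape checks that the metadata attribute exists and reaches the stub, which is the failure reported in #25575.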

---------

Co-authored-by: mikhail-khludnev <mikhail_khludnev@rntgroup.com>
2024-08-28 18:48:10 -07:00


import pytest

from langchain_community.embeddings.yandex import YandexGPTEmbeddings


@pytest.mark.parametrize(
    "constructor_args",
    [
        dict(),
        dict(disable_request_logging=True),
    ],
)
# @pytest.mark.scheduled - idk what it means
# requires YC_* env and active service
def test_yandex_embedding(constructor_args: dict) -> None:
    documents = ["exactly same", "exactly same", "different"]
    embedding = YandexGPTEmbeddings(**constructor_args)
    doc_outputs = embedding.embed_documents(documents)
    assert len(doc_outputs) == 3
    for i in range(3):
        assert len(doc_outputs[i]) >= 256  # there are many dims
        assert len(doc_outputs[0]) == len(doc_outputs[i])  # dims are the same
    assert doc_outputs[0] == doc_outputs[1]  # same input, same embeddings
    assert doc_outputs[2] != doc_outputs[1]  # different input, different embeddings
    qry_output = embedding.embed_query(documents[0])
    assert len(qry_output) >= 256
    assert len(doc_outputs[0]) == len(
        qry_output
    )  # query and doc models have same dimensions
    assert doc_outputs[0] != qry_output  # query and doc models are different