community[patch]: Adopting the lighter-weight xinference_client (#21900)

While integrating xinference_embedding, we observed that the full
xinference package downloads a substantial set of dependencies. If a
project only needs to call an already running Xinference service, for
example for its embedding (vector) capabilities, we recommend installing
the xinference_client package instead. It is more streamlined and
significantly reduces the project's storage footprint, which suits
deployments that need a lightweight integration. This change makes
XinferenceEmbeddings fall back to xinference_client when xinference is
not installed, and documents the alternative install in the docstring.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Liuww 2024-05-21 06:05:09 +08:00 committed by GitHub
parent a43515ca65
commit 332ffed393
2 changed files with 31 additions and 6 deletions


@ -1,11 +1,11 @@
"""Wrapper around Xinference embedding models."""
from typing import Any, List, Optional
from langchain_core.embeddings import Embeddings
class XinferenceEmbeddings(Embeddings):
"""Xinference embedding models.
To use, you should have the xinference library installed:
@ -14,6 +14,12 @@ class XinferenceEmbeddings(Embeddings):
pip install xinference
+If you're simply using the services provided by Xinference, you can utilize the xinference_client package:
+.. code-block:: bash
+pip install xinference_client
Check out: https://github.com/xorbitsai/inference
To run, you need to start a Xinference supervisor on one server and Xinference workers on the other servers.
@ -32,6 +38,12 @@ class XinferenceEmbeddings(Embeddings):
$ xinference-supervisor
+If you're simply using the services provided by Xinference, you can utilize the xinference_client package:
+.. code-block:: bash
+pip install xinference_client
Starting the worker:
.. code-block:: bash
@ -72,11 +84,14 @@ class XinferenceEmbeddings(Embeddings):
     ):
         try:
             from xinference.client import RESTfulClient
-        except ImportError as e:
-            raise ImportError(
-                "Could not import RESTfulClient from xinference. Please install it"
-                " with `pip install xinference`."
-            ) from e
+        except ImportError:
+            try:
+                from xinference_client import RESTfulClient
+            except ImportError as e:
+                raise ImportError(
+                    "Could not import RESTfulClient from xinference. Please install it"
+                    " with `pip install xinference` or `pip install xinference_client`."
+                ) from e
         super().__init__()
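
Editor's note: the hunk above tries `xinference.client` first and falls back to `xinference_client`. If more candidate packages were ever added, the nested `try`/`except` could be flattened into a loop. The sketch below is only an illustration of that idea; `import_first_available` is a hypothetical helper name and is not part of LangChain or Xinference.

```python
import importlib
from typing import Any, Optional, Sequence


def import_first_available(module_names: Sequence[str], attr: str) -> Any:
    """Return `attr` from the first importable module in `module_names`.

    Hypothetical helper for illustration only.
    """
    last_error: Optional[ImportError] = None
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError as e:
            # This candidate is not installed; remember why and try the next.
            last_error = e
            continue
        if hasattr(module, attr):
            return getattr(module, attr)
        last_error = ImportError(f"module {name!r} has no attribute {attr!r}")
    raise ImportError(
        f"Could not import {attr} from any of: {', '.join(module_names)}."
    ) from last_error


# With one of the two packages installed, the call matching this commit
# would presumably look like this (left commented out on purpose):
# RESTfulClient = import_first_available(
#     ["xinference.client", "xinference_client"], "RESTfulClient"
# )
```

The order of `module_names` encodes the preference, so the full package still wins when both are installed, which matches the behaviour of the diff.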


@ -73,3 +73,13 @@ def test_xinference_embedding_query(setup: Tuple[str, str]) -> None:
     document = "foo bar"
     output = xinference.embed_query(document)
     assert len(output) == 4096
+def test_xinference_embedding() -> None:
+    embedding_model = XinferenceEmbeddings(
+        server_url="http://xinference-hostname:9997", model_uid="foo"
+    )
+    embedding_model.embed_documents(
+        texts=["hello", "i'm trying to upgrade xinference embedding"]
+    )
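
Editor's note: the new test needs a reachable Xinference server at the given URL. The import fallback itself could also be checked without any server by stubbing `sys.modules`. The sketch below assumes nothing beyond the standard library; `load_restful_client` and `FakeRESTfulClient` are hypothetical names introduced here, and the function only mirrors the two-step import from the first file rather than calling `XinferenceEmbeddings`.

```python
import sys
import types
from unittest import mock


def load_restful_client():
    """Mirror of the two-step import in the diff, wrapped in a function."""
    try:
        from xinference.client import RESTfulClient
    except ImportError:
        from xinference_client import RESTfulClient
    return RESTfulClient


def test_falls_back_to_xinference_client() -> None:
    # Hypothetical stand-in for the real client class; no server is contacted.
    class FakeRESTfulClient:
        pass

    fake_pkg = types.ModuleType("xinference_client")
    setattr(fake_pkg, "RESTfulClient", FakeRESTfulClient)

    stubbed = {
        # A None entry in sys.modules makes importing that name fail,
        # which simulates the full package not being installed.
        "xinference": None,
        "xinference.client": None,
        "xinference_client": fake_pkg,
    }
    # patch.dict restores sys.modules when the block exits.
    with mock.patch.dict(sys.modules, stubbed):
        assert load_restful_client() is FakeRESTfulClient
```

Because the stub replaces both packages, this test behaves the same whether or not either one is actually installed in the environment.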