mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-08 14:31:55 +00:00
community[minor]: New model parameters and dynamic batching for VertexAIEmbeddings (#13999)
- **Description:** VertexAIEmbeddings performance improvements
- **Twitter handle:** @vladkol

## Improvements

- Dynamic batch size, starting at 250 and decreasing to as low as 5. The supported batch size varies across regions; some regions accept larger batches, which significantly improves performance. When embedding large batches of texts in `us-central1`, the performance gain can be up to 3.5x. Dynamic batching also ensures that every batch stays below the 20K token limit.
- New model parameter `embeddings_type`, which translates to the `task_type` parameter of the API. Newer model versions support [different embeddings task types](https://cloud.google.com/vertex-ai/docs/generative-ai/embeddings/get-text-embeddings#api_changes_to_models_released_on_or_after_august_2023).
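
The dynamic batching described above can be illustrated with a minimal sketch. It is not the code from this commit: the halving strategy, the whitespace-based token estimate, and the names `BatchTooLargeError`, `estimate_tokens`, `split_into_batches` and `embed_with_dynamic_batches` are all hypothetical, and `embed_fn` stands in for whatever call actually sends one batch to the API. Only the numbers 250, 5 and 20K come from the description.

```python
from typing import Callable, List, Sequence, Tuple

MAX_BATCH_SIZE = 250           # starting batch size (from the description)
MIN_BATCH_SIZE = 5             # smallest batch size (from the description)
MAX_TOKENS_PER_BATCH = 20_000  # per-batch token limit (from the description)


class BatchTooLargeError(Exception):
    """Hypothetical error raised when a backend rejects a batch as too large."""


def estimate_tokens(text: str) -> int:
    """Crude whitespace-based estimate; an assumption, not a real tokenizer."""
    return max(1, len(text.split()))


def split_into_batches(
    texts: Sequence[str],
    batch_size: int,
    max_tokens: int = MAX_TOKENS_PER_BATCH,
) -> List[List[str]]:
    """Group texts so each batch respects both the size and the token budget."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_tokens = 0
    for text in texts:
        tokens = estimate_tokens(text)
        # Close the current batch if adding this text would break a limit.
        if current and (
            len(current) >= batch_size or current_tokens + tokens > max_tokens
        ):
            batches.append(current)
            current, current_tokens = [], 0
        current.append(text)
        current_tokens += tokens
    if current:
        batches.append(current)
    return batches


def embed_with_dynamic_batches(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], List[List[float]]],
) -> Tuple[List[List[float]], int]:
    """Embed all texts, halving the batch size whenever a batch is rejected.

    Returns the embeddings and the batch size that was finally accepted.
    """
    batch_size = MAX_BATCH_SIZE
    results: List[List[float]] = []
    start = 0
    while start < len(texts):
        window = texts[start : start + batch_size]
        batch = split_into_batches(window, batch_size)[0]
        try:
            results.extend(embed_fn(batch))
        except BatchTooLargeError:
            if batch_size <= MIN_BATCH_SIZE:
                raise  # cannot shrink any further
            batch_size = max(MIN_BATCH_SIZE, batch_size // 2)
            continue  # retry the same position with a smaller batch
        start += len(batch)
    return results, batch_size
```

In this sketch the accepted batch size is remembered for the rest of the call, so a region with a small limit pays the retry cost only once, while a region that accepts 250 texts per request never shrinks at all.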
@@ -1,8 +1,8 @@
"""Test Vertex AI API wrapper.
In order to run this test, you need to install VertexAI SDK
pip install google-cloud-aiplatform>=1.35.0

Your end-user credentials would be used to make the calls (make sure you've run
`gcloud auth login` first).
"""
from langchain_community.embeddings import VertexAIEmbeddings
@@ -24,6 +24,16 @@ def test_embedding_query() -> None:
    assert len(output) == 768


def test_large_batches() -> None:
    documents = ["foo bar" for _ in range(0, 251)]
    model_uscentral1 = VertexAIEmbeddings(location="us-central1")
    model_asianortheast1 = VertexAIEmbeddings(location="asia-northeast1")
    model_uscentral1.embed_documents(documents)
    model_asianortheast1.embed_documents(documents)
    assert model_uscentral1.instance["batch_size"] >= 250
    assert model_asianortheast1.instance["batch_size"] < 50


def test_paginated_texts() -> None:
    documents = [
        "foo bar",