`LlamaCppEmbeddings`: adds `verbose` parameter, similar to `llms.LlamaCpp` class (#11038)
## Description
As of now, `LlamaCppEmbeddings` prints a lot of verbose output to `stderr` when driven through the LangChain binding, both at instantiation and during inference. This is a bit annoying when computing the embeddings of long documents, for instance.
This PR adds a `verbose` parameter to `LlamaCppEmbeddings` so that the model's verbose output to `stderr` can be turned **off**. The flag is natively supported by `llama-cpp-python` and passed directly through to the library, so the PR is very small.
The value of `verbose` defaults to `True`, matching how it is defined in [`LlamaCpp` (`llamacpp.py` #L136-L137)](c87e9fb2ce/libs/langchain/langchain/llms/llamacpp.py (L136-L137)).
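For reference, the new flag maps onto the one already exposed by the underlying `llama-cpp-python` client. A minimal sketch of that mapping, assuming a placeholder model path (`embedding=True` is what the embeddings wrapper enables under the hood):

```python
from llama_cpp import Llama

# llama-cpp-python's client accepts the same flag; verbose=False
# silences llama.cpp's model-loading banner and timing statistics
# that are otherwise written to stderr.
client = Llama(
    model_path="<path_to_gguf_file>",  # placeholder path
    embedding=True,  # enable embedding output
    verbose=False,
)
vector = client.embed("hello world")  # runs quietly
```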
## Issue
_No issue linked_
## Dependencies
_No additional dependency needed_
## To see it in action
```python
from langchain.embeddings import LlamaCppEmbeddings

MODEL_PATH = "<path_to_gguf_file>"

if __name__ == "__main__":
    llm_embeddings = LlamaCppEmbeddings(
        model_path=MODEL_PATH,
        n_gpu_layers=1,
        n_batch=512,
        n_ctx=2048,
        f16_kv=True,
        verbose=False,
    )
```
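With `verbose=False`, subsequent embedding calls stay quiet as well. A short follow-up to the snippet above (the example texts are arbitrary):

```python
# Continuing the example: neither call prints llama.cpp's load
# banner or per-call timing statistics to stderr.
query_vector = llm_embeddings.embed_query("What is LangChain?")
doc_vectors = llm_embeddings.embed_documents(["first document", "second document"])
```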
Co-authored-by: Bagatur <baskaryan@gmail.com>
This commit is contained in: parent a00a73ef18, commit 1b48d6cb8c
```diff
@@ -54,6 +54,9 @@ class LlamaCppEmbeddings(BaseModel, Embeddings):
     n_gpu_layers: Optional[int] = Field(None, alias="n_gpu_layers")
     """Number of layers to be loaded into gpu memory. Default None."""
 
+    verbose: bool = Field(True, alias="verbose")
+    """Print verbose output to stderr."""
+
     class Config:
         """Configuration for this pydantic object."""
 
@@ -73,6 +76,7 @@ class LlamaCppEmbeddings(BaseModel, Embeddings):
             "use_mlock",
             "n_threads",
             "n_batch",
+            "verbose",
         ]
         model_params = {k: values[k] for k in model_param_names}
         # For backwards compatibility, only include if non-null.
```
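For context, a rough sketch of what the validator does with `model_param_names` downstream of the second hunk. This is illustrative, not the exact library code; the sample `values` dict stands in for the validated pydantic fields:

```python
from llama_cpp import Llama

# Illustrative: the listed field values (now including "verbose")
# are collected into model_params and forwarded to the
# llama-cpp-python client when it is constructed.
values = {"model_path": "<path_to_gguf_file>", "n_batch": 512, "verbose": False}
model_param_names = ["n_batch", "verbose"]
model_params = {k: values[k] for k in model_param_names}
client = Llama(values["model_path"], embedding=True, **model_params)
```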