- **Description:** `Model_Kwargs` was not being passed correctly to
`sentence_transformers.SentenceTransformer` which has been corrected
while maintaing backward compatability
- **Issue:** #28436
---------
Co-authored-by: MoosaTae <sadhis.tae@gmail.com>
Co-authored-by: Sadit Wongprayon <101176694+MoosaTae@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
without this `model_config` importing this package produces warnings
about "model_name" having conflicts with protected namespace "model_".
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
- Example: "community: add foobar LLM"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** a description of the change
- **Issue:** the issue # it fixes, if applicable
- **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Upgrade to using a literal for specifying the extra which is the
recommended approach in pydantic 2.
This works correctly also in pydantic v1.
```python
from pydantic.v1 import BaseModel
class Foo(BaseModel, extra="forbid"):
x: int
Foo(x=5, y=1)
```
And
```python
from pydantic.v1 import BaseModel
class Foo(BaseModel):
x: int
class Config:
extra = "forbid"
Foo(x=5, y=1)
```
## Enum -> literal using grit pattern:
```
engine marzano(0.1)
language python
or {
`extra=Extra.allow` => `extra="allow"`,
`extra=Extra.forbid` => `extra="forbid"`,
`extra=Extra.ignore` => `extra="ignore"`
}
```
Resorted attributes in config and removed doc-string in case we will
need to deal with going back and forth between pydantic v1 and v2 during
the 0.3 release. (This will reduce merge conflicts.)
## Sort attributes in Config:
```
engine marzano(0.1)
language python
function sort($values) js {
return $values.text.split(',').sort().join("\n");
}
class_definition($name, $body) as $C where {
$name <: `Config`,
$body <: block($statements),
$values = [],
$statements <: some bubble($values) assignment() as $A where {
$values += $A
},
$body => sort($values),
}
```
- **Description:** This implements `show_progress` more consistently
(i.e. it is also added to the `HuggingFaceBgeEmbeddings` object).
- **Issue:** This implements `show_progress` more consistently in the
embeddings huggingface classes. Previously this could have been set via
`encode_kwargs`.
- **Dependencies:** None
- **Twitter handle:** @jonzeolla
Thank you for contributing to LangChain!
- [ ] **HuggingFaceInferenceAPIEmbeddings**: "Additional Headers"
- Where: langchain, community, embeddings. huggingface.py.
- Community: add additional headers when needed by custom HuggingFace
TEI embedding endpoints. HuggingFaceInferenceAPIEmbeddings"
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Adding the `additional_headers` to be passed to
requests library if needed
- **Dependencies:** none
- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. Tested with locally available TEI endpoints with and without
`additional_headers`
2. Example Usage
```python
embeddings=HuggingFaceInferenceAPIEmbeddings(
api_key=MY_CUSTOM_API_KEY,
api_url=MY_CUSTOM_TEI_URL,
additional_headers={
"Content-Type": "application/json"
}
)
```
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
---------
Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Be more explicit with the `model_kwargs` and
`encode_kwargs` for `HuggingFaceEmbeddings`.
- **Issue:** -
- **Dependencies:** -
I received some reports by my users that they didn't realise that you
could change the default `batch_size` with `HuggingFaceEmbeddings`,
which may be attributed to how the `model_kwargs` and `encode_kwargs`
don't give much information about what you can specify.
I've added some parameter names & links to the Sentence Transformers
documentation to help clear it up. Let me know if you'd rather have
Markdown/Sphinx-style hyperlinks rather than a "bare URL".
- Tom Aarsen
- **Description:** Add embedding instruction to
HuggingFaceBgeEmbeddings, so that it can be compatible with nomic and
other models that need embedding instruction.
---------
Co-authored-by: Tao Wu <tao.wu@rwth-aachen.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Adds a function parameter to HuggingFaceEmbeddings
called `show_progress` that enables a `tqdm` progress bar if enabled.
Does not function if `multi_process = True`.
- **Issue:** n/a
- **Dependencies:** n/a