Assigning missed defaults in various classes. Most clients were being
assigned during the `model_validator(mode="before")` step, so this
change should amount to a no-op in those cases.
---
This PR was autogenerated using gritql
```shell
grit apply 'class_definition(name=$C, $body, superclasses=$S) where {
$C <: ! "Config", // Does not work in this scope, but works after class_definition
$body <: block($statements),
$statements <: some bubble assignment(left=$x, right=$y, type=$t) as $A where {
or {
$y <: `Field($z)`,
$x <: "model_config"
}
},
// And has either Any or Optional fields without a default
$statements <: some bubble assignment(left=$x, right=$y, type=$t) as $A where {
$t <: or {
r"Optional.*",
r"Any",
r"Union[None, .*]",
r"Union[.*, None, .*]",
r"Union[.*, None]",
},
$y <: ., // Match empty node
$t => `$t = None`,
},
}
' --language python .
```
The underlying code is already documented as requiring appropriate RBAC
control, but adding a forced user opt-in to make sure that users
that don't read documentation are still aware of what's required
from a security perspective.
https://huntr.com/bounties/8f4ad910-7fdc-4089-8f0a-b5df5f32e7c5
This PR upgrades langchain-community to pydantic 2.
* Most of this PR was auto-generated using code mods with gritql
(https://github.com/eyurtsev/migrate-pydantic/tree/main)
* Subsequently, some code was fixed manually due to accommodate
differences between pydantic 1 and 2
Breaking Changes:
- Use TEXTEMBED_API_KEY and TEXTEMBEB_API_URL for env variables for text
embed integrations:
cbea780492
Other changes:
- Added pydantic_settings as a required dependency for community. This
may be removed if we have enough time to convert the dependency into an
optional one.
---------
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- **Description:** Added langchain version while calling discover API
during both ingestion and retrieval
- **Issue:** NA
- **Dependencies:** NA
- **Tests:** NA
- **Docs** NA
---------
Co-authored-by: dristy.cd <dristy@clouddefense.io>
- **PR message**: **Fix URL construction in newer Python versions**
- **Description:**
- Update the URL construction logic to use the .value attribute for
Routes enum members.
- This adjustment resolves an issue where the code worked correctly in
Python 3.9 but failed in Python 3.11.
- Clean up unused routes.
- **Issue:** NA
- **Dependencies:** NA
**Refactor PebbloRetrievalQA**
- Created `APIWrapper` and moved API logic into it.
- Created smaller functions/methods for better readability.
- Properly read environment variables.
- Removed unused code.
- Updated models
**Issue:** NA
**Dependencies:** NA
**tests**: NA
- [x] NatbotChain: move to community, deprecate langchain version.
Update to use `prompt | llm | output_parser` instead of LLMChain.
- [x] LLMMathChain: deprecate + add langgraph replacement example to API
ref
- [x] HypotheticalDocumentEmbedder (retriever): update to use `prompt |
llm | output_parser` instead of LLMChain
- [x] FlareChain: update to use `prompt | llm | output_parser` instead
of LLMChain
- [x] ConstitutionalChain: deprecate + add langgraph replacement example
to API ref
- [x] LLMChainExtractor (document compressor): update to use `prompt |
llm | output_parser` instead of LLMChain
- [x] LLMChainFilter (document compressor): update to use `prompt | llm
| output_parser` instead of LLMChain
- [x] RePhraseQueryRetriever (retriever): update to use `prompt | llm |
output_parser` instead of LLMChain
Upgrade to using a literal for specifying the extra which is the
recommended approach in pydantic 2.
This works correctly also in pydantic v1.
```python
from pydantic.v1 import BaseModel
class Foo(BaseModel, extra="forbid"):
x: int
Foo(x=5, y=1)
```
And
```python
from pydantic.v1 import BaseModel
class Foo(BaseModel):
x: int
class Config:
extra = "forbid"
Foo(x=5, y=1)
```
## Enum -> literal using grit pattern:
```
engine marzano(0.1)
language python
or {
`extra=Extra.allow` => `extra="allow"`,
`extra=Extra.forbid` => `extra="forbid"`,
`extra=Extra.ignore` => `extra="ignore"`
}
```
Resorted attributes in config and removed doc-string in case we will
need to deal with going back and forth between pydantic v1 and v2 during
the 0.3 release. (This will reduce merge conflicts.)
## Sort attributes in Config:
```
engine marzano(0.1)
language python
function sort($values) js {
return $values.text.split(',').sort().join("\n");
}
class_definition($name, $body) as $C where {
$name <: `Config`,
$body <: block($statements),
$values = [],
$statements <: some bubble($values) assignment() as $A where {
$values += $A
},
$body => sort($values),
}
```
This PR adds annotations in comunity package.
Annotations are only strictly needed in subclasses of BaseModel for
pydantic 2 compatibility.
This PR adds some unnecessary annotations, but they're not bad to have
regardless for documentation pages.
Title: [pebblo_retrieval] Identifying entities in prompts given in
PebbloRetrievalQA leading to prompt governance
Description: Implemented identification of entities in the prompt using
Pebblo prompt governance API.
Issue: NA
Dependencies: NA
Add tests and docs: NA
**Description:** This PR introduces a change to the
`cypher_generation_chain` to dynamically concatenate inputs. This
improvement aims to streamline the input handling process and make the
method more flexible. The change involves updating the arguments
dictionary with all elements from the `inputs` dictionary, ensuring that
all necessary inputs are dynamically appended. This will ensure that any
cypher generation template will not require a new `_call` method patch.
**Issue:** This PR fixes issue #24260.
- **Description:**
- Updated checksum in doc metadata
- Sending checksum and removing actual content, while sending data to
`pebblo-cloud` if `classifier-location `is `pebblo-cloud` in
`/loader/doc` API
- Adding `pb_id` i.e. pebblo id to doc metadata
- Refactoring as needed.
- Sending `content-checksum` and removing actual content, while sending
data to `pebblo-cloud` if `classifier-location `is `pebblo-cloud` in
`prmopt` API
- **Issue:** NA
- **Dependencies:** NA
- **Tests:** Updated
- **Docs** NA
---------
Co-authored-by: dristy.cd <dristy@clouddefense.io>
- **Description:** Support PGVector in PebbloRetrievalQA
- Identity and Semantic Enforcement support for PGVector
- Refactor Vectorstore validation and name check
- Clear the overridden identity and semantic enforcement filters
- **Issue:** NA
- **Dependencies:** NA
- **Tests**: NA(already added)
- **Docs**: Updated
- **Twitter handle:** [@Raj__725](https://twitter.com/Raj__725)
Description: Add classifier_location feature flag. This flag enables
Pebblo to decide the classifier location, local or pebblo-cloud.
Unit Tests: N/A
Documentation: N/A
---------
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
**Description:**
Fix "`TypeError: 'NoneType' object is not iterable`" when the
auth_context is absent in PebbloRetrievalQA. The auth_context is
optional; hence, PebbloRetrievalQA should work without it, but it throws
an error at the moment. This PR fixes that issue.
**Issue:** NA
**Dependencies:** None
**Unit tests:** NA
---------
Co-authored-by: Chester Curme <chester.curme@gmail.com>
LLMs struggle with Graph RAG, because it's different from vector RAG in
a way that you don't provide the whole context, only the answer and the
LLM has to believe. However, that doesn't really work a lot of the time.
However, if you wrap the context as function response the accuracy is
much better.
btw... `union[LLMChain, Runnable]` is linting fun, that's why so many
ignores
This PR makes some small updates for `KuzuQAChain` for graph QA.
- Updated Cypher generation prompt (we now support `WHERE EXISTS`) and
generalize it more
- Support different LLMs for Cypher generation and QA
- Update docs and examples
0.2rc
migrations
- [x] Move memory
- [x] Move remaining retrievers
- [x] graph_qa chains
- [x] some dependency from evaluation code potentially on math utils
- [x] Move openapi chain from `langchain.chains.api.openapi` to
`langchain_community.chains.openapi`
- [x] Migrate `langchain.chains.ernie_functions` to
`langchain_community.chains.ernie_functions`
- [x] migrate `langchain/chains/llm_requests.py` to
`langchain_community.chains.llm_requests`
- [x] Moving `langchain_community.cross_enoders.base:BaseCrossEncoder`
->
`langchain_community.retrievers.document_compressors.cross_encoder:BaseCrossEncoder`
(namespace not ideal, but it needs to be moved to `langchain` to avoid
circular deps)
- [x] unit tests langchain -- add pytest.mark.community to some unit
tests that will stay in langchain
- [x] unit tests community -- move unit tests that depend on community
to community
- [x] mv integration tests that depend on community to community
- [x] mypy checks
Other todo
- [x] Make deprecation warnings not noisy (need to use warn deprecated
and check that things are implemented properly)
- [x] Update deprecation messages with timeline for code removal (likely
we actually won't be removing things until 0.4 release) -- will give
people more time to transition their code.
- [ ] Add information to deprecation warning to show users how to
migrate their code base using langchain-cli
- [ ] Remove any unnecessary requirements in langchain (e.g., is
SQLALchemy required?)
---------
Co-authored-by: Erick Friis <erick@langchain.dev>