I was trying to use web loaders on some spanish documentation (e.g.
[this site](https://www.fromdoppler.com/es/mailing-tendencias/), but the
auto-encoding introduced in
https://github.com/langchain-ai/langchain/pull/3602 was detected as
"MacRoman" instead of the (correct) "UTF-8".
To address this, I've added the ability to disable the auto-encoding, as
well as the ability to explicitly tell the loader what encoding to use.
- **Description:** Makes auto-setting the encoding optional in
`WebBaseLoader`, and introduces an `encoding` option to explicitly set
it.
- **Dependencies:** N/A
- **Tag maintainer:** @hwchase17
- **Twitter handle:** @czue
**Description:**
Pinecone hybrid search is now limited to default namespace. There is no
option for the user to provide a namespace to partition an index, which
is one of the most important features of pinecone.
**Resource:**
https://docs.pinecone.io/docs/namespaces
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- **Description:** Updating URL in Context Callback Docstrings and
update metadata key Context CallbackHandler uses to send model names.
- **Issue:** The URL in ContextCallbackHandler is out of date. Model
data being sent to Context should be under the "model" key and not
"llm_model". This allows Context to do more sophisticated analysis.
- **Dependencies:** None
Tagging @agamble.
- This pr adds `llm_kwargs` to the initialization of Xinference LLMs
(integrated in #8171 ).
- With this enhancement, users can not only provide `generate_configs`
when calling the llms for generation but also during the initialization
process. This allows users to include custom configurations when
utilizing LangChain features like LLMChain.
- It also fixes some format issues for the docstrings.
Hello @hwchase17
**Issue**:
The class WebResearchRetriever accept only
RecursiveCharacterTextSplitter, but never uses a specification of this
class. I propose to change the type to TextSplitter. Then, the lint can
accept all subtypes.
- tools invoked in async methods would not work due to missing await
- RunnableSequence.stream() was creating an extra root run by mistake,
and it can simplified due to existence of default implementation for
.transform()
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
**Description:** Renamed argument `database` in
`SQLDatabaseSequentialChain.from_llm()` to `db`,
I realize it's tiny and a bit of a nitpick but for consistency with
SQLDatabaseChain (and all the others actually) I thought it should be
renamed. Also got me while working and using it today.
✔️ Please make sure your PR is passing linting and
testing before submitting. Run `make format`, `make lint` and `make
test` to check this locally.
This PR is a documentation fix.
Description:
* fixes imports in the code samples in the docstrings of
`create_openai_fn_chain` and `create_structured_output_chain`
* fixes imports in
`docs/extras/modules/chains/how_to/openai_functions.ipynb`
* removes unused imports from the notebook
Issues:
* the docstrings use `from pydantic_v1 import BaseModel, Field` which
this PR changes to `from langchain.pydantic_v1 import BaseModel, Field`
* importing `pydantic` instead of `langchain.pydantic_v1` leads to
errors later in the notebook
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
- Description: Added support for Ollama embeddings
- Issue: the issue # it fixes (if applicable),
- Dependencies: N/A
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: @herrjemand
cc https://github.com/jmorganca/ollama/issues/436
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- **Description:** a description of the change,
- **Issue:** the issue # it fixes (if applicable),
- **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
-->
Hello,
this PR improves coverage for caching by the two Cassandra-related
caches (i.e. exact-match and semantic alike) by switching to the more
general `dumps`/`loads` serdes utilities.
This enables cache usage within e.g. `ChatOpenAI` contexts (which need
to store lists of `ChatGeneration` instead of `Generation`s), which was
not possible as long as the cache classes were relying on the legacy
`_dump_generations_to_json` and `_load_generations_from_json`).
Additionally, a slightly different init signature is introduced for the
cache objects:
- named parameters required for init, to pave the way for easier changes
in the future connect-to-db flow (and tests adjusted accordingly)
- added a `skip_provisioning` optional passthrough parameter for use
cases where the user knows the underlying DB table, etc already exist.
Thank you for a review!
Adding support for Neo4j vector index hybrid search option. In Neo4j,
you can achieve hybrid search by using a combination of vector and
fulltext indexes.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description:
* Baidu AI Cloud's [Qianfan
Platform](https://cloud.baidu.com/doc/WENXINWORKSHOP/index.html) is an
all-in-one platform for large model development and service deployment,
catering to enterprise developers in China. Qianfan Platform offers a
wide range of resources, including the Wenxin Yiyan model (ERNIE-Bot)
and various third-party open-source models.
- Issue: none
- Dependencies:
* qianfan
- Tag maintainer: @baskaryan
- Twitter handle:
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
`langchain.agents.openai_functions[_multi]_agent._parse_ai_message()`
incorrectly extracts AI message content, thus LLM response ("thoughts")
is lost and can't be logged or processed by callbacks.
This PR fixes function call message content retrieving.