<!-- Thank you for contributing to LangChain!
Replace this comment with:
- Description: Added the capability to handles structured data from
google enterprise search,
- Issue: Retriever failed when underline search engine was integrated
with structured data,
- Dependencies: google-api-core
- Tag maintainer: @jarokaz
- Twitter handle: anifort
Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use.
Maintainer responsibilities:
- General / Misc / if you don't know who to tag: @baskaryan
- DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
- Models / Prompts: @hwchase17, @baskaryan
- Memory: @hwchase17
- Agents / Tools / Toolkits: @hinthornw
- Tracing / Callbacks: @agola11
- Async: @agola11
If no one reviews your PR within a few days, feel free to @-mention the
same people again.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
-->
---------
Co-authored-by: Christos Aniftos <aniftos@google.com>
Co-authored-by: Holt Skinner <13262395+holtskinner@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Updates the hub stubs to not fail when no api key is found. For
supporting singleton tenants and default values from sdk 0.1.6.
Also adds the ability to define is_public and description for backup
repo creation on push.
Currently, generation_info is not respected by only reflecting messages
in chunks. Change it to add generations so that generation chunks are
merged properly.
---------
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
- Description: current code does not work very well on jupyter notebook,
so I changed the code so that it imports `tqdm.auto` instead.
- Issue: #9582
- Dependencies: N/A
- Tag maintainer: @hwchase17, @baskaryan
- Twitter handle: N/A
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
It's possible that langchain-experimental works fine with the latest
*published* langchain, but is broken with the langchain on `master`.
Unfortunately, you can see this is currently the case — this is why this
PR also includes a minor fix for the `langchain` package itself.
We want to catch situations like that *before* releasing a new
langchain, hence this test.
# Description
This PR introduces a new toolkit for interacting with the AINetwork
blockchain. The toolkit provides a set of tools for performing various
operations on the AINetwork blockchain, such as transferring AIN,
reading and writing values to the blockchain database, managing apps,
setting rules and owners.
# Dependencies
[ain-py](https://github.com/ainblockchain/ain-py) >= 1.0.2
# Misc
The example notebook
(langchain/docs/extras/integrations/toolkits/ainetwork.ipynb) is in the
PR
---------
Co-authored-by: kriii <kriii@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Introduces a conditional in `ArangoGraph.generate_schema()` to exclude
empty ArangoDB Collections from the schema
- Add empty collection test case
Issue: N/A
Dependencies: None
### Description
Polars is a DataFrame interface on top of an OLAP Query Engine
implemented in Rust.
Polars is faster to read than pandas, so I'm looking forward to seeing
it added to the document loader.
### Dependencies
polars (https://pola-rs.github.io/polars-book/user-guide/)
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
I have restructured the code to ensure uniform handling of ImportError.
In place of previously used ValueError, I've adopted the standard
practice of raising ImportError with explanatory messages. This
modification enhances code readability and clarifies that any problems
stem from module importation.
@eyurtsev , @baskaryan
Thanks
Add PromptGuard integration
-------
There are two approaches to integrate PromptGuard with a LangChain
application.
1. PromptGuardLLMWrapper
2. functions that can be used in LangChain expression.
-----
- Dependencies
`promptguard` python package, which is a runtime requirement if you'd
try out the demo.
- @baskaryan @hwchase17 Thanks for the ideas and suggestions along the
development process.
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
### Description
When we're loading documents using `ConfluenceLoader`:`load` function
and, if both `include_comments=True` and `keep_markdown_format=True`,
we're getting an error saying `NameError: free variable 'BeautifulSoup'
referenced before assignment in enclosing scope`.
loader = ConfluenceLoader(url="URI", token="TOKEN")
documents = loader.load(
space_key="SPACE",
include_comments=True,
keep_markdown_format=True,
)
This happens because previous imports only consider the
`keep_markdown_format` parameter, however to include the comments, it's
using `BeautifulSoup`
Now it's fixed to handle all four scenarios considering both
`include_comments` and `keep_markdown_format`.
### Twitter
`@SathinduGA`
---------
Co-authored-by: Bagatur <baskaryan@gmail.com>
- Description: Allows the user of `ConfluenceLoader` to pass a
`requests.Session` object in lieu of an authentication mechanism
- Issue: None
- Dependencies: None
- Tag maintainer: @hwchase17
- Improved docs
- Improved performance in multiple ways through batching, threading,
etc.
- fixed error message
- Added support for metadata filtering during similarity search.
@baskaryan PTAL
The package is linted with mypy, so its type hints are correct and
should be exposed publicly. Without this file, the type hints remain
private and cannot be used by downstream users of the package.
- Description: Updated marqo integration to use tensor_fields instead of
non_tensor_fields. Upgraded marqo version to 1.2.4
- Dependencies: marqo 1.2.4
---------
Co-authored-by: Raynor Kirkson E. Chavez <raynor.chavez@192.168.254.171>
Co-authored-by: Bagatur <baskaryan@gmail.com>
<!-- Thank you for contributing to LangChain!
Replace this entire comment with:
- Description: a description of the change,
- Issue: the issue # it fixes (if applicable),
- Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!
Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.
See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. These live is docs/extras
directory.
If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17, @rlancemartin.
-->
- Description: Changed metadata retrieval so that it combines Vectara
doc level and part level metadata
- Tag maintainer: @rlancemartin
- Twitter handle: @ofermend
**Description**:
- Uniformed the current valid suffixes (file formats) for loading agents
from hubs and files (to better handle future additions);
- Clarified exception messages (also in unit test).
@rlancemartin The current implementation within `Geopandas.GeoDataFrame`
loader uses the python builtin `str()` function on the input geometries.
While this looks very close to WKT (Well known text), Python's str
function doesn't guarantee that.
In the interest of interop., I've changed to the of use `wkt` property
on the Shapely geometries for generating the text representation of the
geometries.
Also, included here:
- validation of the input `page_content_column` as being a GeoSeries.
- geometry `crs` (Coordinate Reference System) / bounds
(xmin/ymin/xmax/ymax) added to Document metadata. Having the CRS is
critical... having the bounds is just helpful!
I think there is a larger question of "Should the geometry live in the
`page_content`, or should the record be better summarized and tuck the
geom into metadata?" ...something for another day and another PR.
This is an extension of #8104. I updated some of the signatures so all
the tests pass.
@danhnn I couldn't commit to your PR, so I created a new one. Thanks for
your contribution!
@baskaryan Could you please merge it?
---------
Co-authored-by: Danh Nguyen <dnncntt@gmail.com>
### Summary
Fixes a bug from #7850 where post processing functions in Unstructured
loaders were not apply. Adds a assertion to the test to verify the post
processing function was applied and also updates the explanation in the
example notebook.