Commit Graph

9128 Commits

Author SHA1 Message Date
Harrison Chase
cf866efb78 Merge branch 'harrison/new-docs' of github.com:hwchase17/langchain into harrison/new-docs 2024-05-01 16:07:31 -07:00
Harrison Chase
8e8a03d61b cr 2024-05-01 16:07:24 -07:00
ccurme
c77debf870 (new docs): update rag use-case docs (#21164) 2024-05-01 16:25:14 -04:00
ccurme
6a20856fab (new docs): embedding how-to guides (#21106) 2024-04-30 14:49:06 -04:00
Chester Curme
7f4397c94a format 2024-04-30 12:44:14 -04:00
Chester Curme
7285370328 update tutorial 2024-04-30 12:43:54 -04:00
ccurme
df8a2cdc96 (new docs): update text splitter how-to guides (#21087) 2024-04-30 11:34:42 -04:00
ccurme
c3b7933d98 (new docs): update how-to guides (#21073) 2024-04-30 08:21:09 -04:00
Harrison Chase
8a0e71d27b Merge branch 'master' into harrison/new-docs 2024-04-29 16:30:10 -07:00
Harrison Chase
86bb3aa45b Merge branch 'harrison/new-docs' of github.com:hwchase17/langchain into harrison/new-docs 2024-04-29 16:29:52 -07:00
Harrison Chase
55dd2ea57d cr 2024-04-29 16:29:47 -07:00
ccurme
bc4bb49451 (new docs): remove agents from sidebar (#21046) 2024-04-29 19:18:08 -04:00
ccurme
392b842a59 (new docs): organize how-to sidebars (#21029)
```python
import json
import re
from pathlib import Path

def parse_markdown_to_sidebar(markdown_content):
    lines = markdown_content.splitlines()
    sidebar = []
    current_category = None
    current_subcategory = None

    for line in lines:
        if line.startswith('### '):
            # Subcategory
            if current_subcategory is not None:
                current_category['items'].append(current_subcategory)
            subcategory_title = line.strip('# ').strip()
            current_subcategory = {
                "type": "category",
                "label": subcategory_title,
                "collapsed": True,
                "items": [],
                "link": {"type": "generated-index"}
            }
        elif line.startswith('## '):
            # Category
            if current_category is not None:
                if current_subcategory is not None:
                    current_category['items'].append(current_subcategory)
                    current_subcategory = None
                sidebar.append(current_category)
            category_title = line.strip('# ').strip()
            current_category = {
                "type": "category",
                "label": category_title,
                "collapsed": True,
                "items": [],
                "link": {"type": "generated-index"}
            }
        elif line.startswith('- ['):
            # Link
            match = re.match(r'- \[(.*?)\]\((.*?)\)', line)
            if match:
                title, link = match.groups()
                link = link.replace('/docs/', '')  # Remove '/docs/' prefix
                if current_subcategory is not None:
                    current_subcategory['items'].append(link)
                elif current_category is not None:
                    current_category['items'].append(link)

    # Add the last category and subcategory if they exist
    if current_subcategory is not None:
        current_category['items'].append(current_subcategory)
    if current_category is not None:
        sidebar.append(current_category)

    return sidebar

def generate_sidebar_json(file_path):
    with open(file_path, 'r') as md_file:
        markdown_content = md_file.read()
    sidebar = parse_markdown_to_sidebar(markdown_content)
    sidebar_json = json.dumps({"items": sidebar}, indent=2)
    return sidebar_json
```
2024-04-29 19:00:06 -04:00
Harrison Chase
e037446ca3 cr 2024-04-29 15:40:00 -07:00
Harrison Chase
8920bcd263 Merge branch 'harrison/new-docs' of github.com:hwchase17/langchain into harrison/new-docs 2024-04-29 15:39:55 -07:00
Harrison Chase
81a7868c57 cr 2024-04-29 15:39:50 -07:00
Rahul Triptahi
c172611647 community[patch]: Add classifier_url argument in PebbloSafeLoader and documentation update. (#21030)
Description: Add classifier_url argument in PebbloSafeLoader.
Documentation: Updated PebbloSafeLoader documentation with above change
and new links for pebblo github pages.

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-04-29 17:41:09 -04:00
Leonid Ganeline
08d08d7c83 docs: langchain docstrings updates (#21032)
Added missing docstrings. Formatted docstrings into a consistent format.
2024-04-29 17:40:44 -04:00
Leonid Ganeline
85094cbb3a docs: community docstring updates (#21040)
Added missing docstrings. Updated docstrings to a consistent format.
2024-04-29 17:40:23 -04:00
Rodrigo Nogueira
90f19028e5 community[patch]: Add maritalk streaming (sync and async) (#19203)
Co-authored-by: RosevalJr <rdmalajr@gmail.com>
Co-authored-by: Roseval Donisete Malaquias Junior <roseval@maritaca.ai>
2024-04-29 21:31:14 +00:00
Cahid Arda Öz
cc6191cb90 community[minor]: Add support for Upstash Vector (#20824)
## Description

Adding `UpstashVectorStore` to utilize [Upstash
Vector](https://upstash.com/docs/vector/overall/getstarted)!

#17012 was opened to add Upstash Vector to LangChain but was closed to
wait for filtering support. Filtering has since been added to Upstash
Vector, so we are opening a new PR. Additionally, the [embedding
feature](https://upstash.com/docs/vector/features/embeddingmodels) was
added, and our vector store supports it as well.

## Dependencies

[upstash-vector](https://pypi.org/project/upstash-vector/) should be
installed to use `UpstashVectorStore`. Didn't update dependencies
because of [this comment in the previous
PR](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1876522450).

## Tests

Tests are added and they pass. Tests are naturally network bound since
Upstash Vector is offered through an API.

There was [a discussion in the previous PR about mocking the
unittests](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1891820567).
We didn't make changes to this end yet. We can update the tests if you
can explain how the tests should be mocked.
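
One possible way to mock network-bound tests is sketched below, using `unittest.mock` from the standard library. `TinyVectorStore` and its `query(vector=..., top_k=...)` client call are hypothetical stand-ins invented for this illustration; they are not the real `UpstashVectorStore` or Upstash client API.

```python
from unittest.mock import MagicMock


class TinyVectorStore:
    """Hypothetical wrapper around a remote client (not the real UpstashVectorStore)."""

    def __init__(self, client):
        self._client = client

    def similarity_search(self, vector, k=4):
        # Delegate to the (assumed) remote client and keep only the ids.
        hits = self._client.query(vector=vector, top_k=k)
        return [hit["id"] for hit in hits]


def run_mocked_search():
    # Replace the network client with a mock that returns canned results.
    client = MagicMock()
    client.query.return_value = [{"id": "a"}, {"id": "b"}]

    store = TinyVectorStore(client)
    ids = store.similarity_search([0.1, 0.2], k=2)

    # Verify the wrapper called the client as expected, without any network.
    client.query.assert_called_once_with(vector=[0.1, 0.2], top_k=2)
    return ids
```

With this pattern the unit test checks only the wrapper's own logic, while integration tests can still exercise the real API separately.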

---------

Co-authored-by: ytkimirti <yusuftaha9@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 17:25:01 -04:00
ccurme
d99a7a6b44 (new docs): update how-to guides (#21039) 2024-04-29 16:27:58 -04:00
Leonid Ganeline
1a2ff56cd8 core[patch]: docstring update (#21036)
Added missing docstrings. Updated docstrings to a consistent format.
2024-04-29 15:35:34 -04:00
Eugene Yurtsev
f479a337cc langchain[patch]: replace deprecated imports with imports from langchain_core (#21033)
* Output of running the migration script.
* Ran only against langchain code itself and not the unit tests.
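
A minimal sketch of the kind of rewrite such a migration performs is shown below. It is not the actual migration script: the `MIGRATIONS` table and the `rewrite_imports` function are illustrative assumptions, and the sketch only handles single-line `from ... import ...` statements.

```python
import re

# Illustrative mapping from old module paths to new ones (an assumption
# for this sketch; the real script derives its own table).
MIGRATIONS = {
    "langchain.schema.runnable": "langchain_core.runnables",
    "langchain.schema.messages": "langchain_core.messages",
}

# Matches single-line "from <module> import <names>" statements.
_IMPORT_RE = re.compile(r"^(\s*from\s+)([\w.]+)(\s+import\s+.+)$")


def rewrite_imports(source: str) -> str:
    """Rewrite module paths listed in MIGRATIONS, leaving other lines untouched."""
    out = []
    for line in source.splitlines():
        match = _IMPORT_RE.match(line)
        if match and match.group(2) in MIGRATIONS:
            prefix, module, suffix = match.groups()
            line = f"{prefix}{MIGRATIONS[module]}{suffix}"
        out.append(line)
    return "\n".join(out)
```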
2024-04-29 15:34:31 -04:00
Eugene Yurtsev
82d4afcac0 langchain[minor]: Code to handle dynamic imports (#20893)
Proposing to centralize code for handling dynamic imports. This allows treating langchain-community as an optional dependency.

---

The proposal is to scan the code base and to replace all existing imports with dynamic imports using this functionality.
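
As a rough illustration of the idea (not the code added in this commit), dynamic imports can be centralized with a module-level `__getattr__` as described in PEP 562. The helper name `make_lazy_getattr` and the lookup table are assumptions made for this sketch; standard-library modules stand in for an optional dependency.

```python
import importlib
from typing import Any, Callable, Dict


def make_lazy_getattr(package: str, lookup: Dict[str, str]) -> Callable[[str], Any]:
    """Return a function usable as a module-level __getattr__ (PEP 562).

    `lookup` maps an attribute name to the module that provides it.
    """

    def _getattr(name: str) -> Any:
        if name not in lookup:
            raise AttributeError(f"module {package!r} has no attribute {name!r}")
        try:
            # Import only when the attribute is first requested.
            module = importlib.import_module(lookup[name])
        except ImportError as exc:
            raise ImportError(
                f"{name} requires the optional module {lookup[name]!r}; "
                "install the package that provides it."
            ) from exc
        return getattr(module, name)

    return _getattr


# Demo: "collections" plays the role of an installed optional dependency,
# and "not_a_real_module_xyz" the role of a missing one.
lazy = make_lazy_getattr(
    "demo",
    {"OrderedDict": "collections", "Missing": "not_a_real_module_xyz"},
)
```

In a real package, the returned function would be assigned to `__getattr__` at module scope so that `from package import Name` triggers the lazy import.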
2024-04-29 15:34:03 -04:00
Erick Friis
854ae3e1de mistralai: release 0.1.5, allow client passing in (#21034) 2024-04-29 17:14:26 +00:00
chyroc
3e241956d3 community[minor]: add coze chat model (#20770)
add coze chat model, to call coze.com apis
2024-04-29 12:26:16 -04:00
Eugene Yurtsev
29493bb598 cli[minor]: improve confirmation message with more details (#21027)
Improve confirmation message with more details
2024-04-29 12:20:42 -04:00
Eugene Yurtsev
aab78a37f3 cli[patch]: Ignore imports that change the name of the class (#21026)
Not currently handled by the migration script.
2024-04-29 12:20:30 -04:00
Massimiliano Pronesti
ce89b34fc0 community[patch]: support hybrid search with threshold in Azure AI Search Retriever (#20907)
Support hybrid search with a score threshold -- similar to what we do
for similarity search.
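
A minimal sketch of score-threshold filtering follows. It assumes results arrive as `(document, score)` pairs and that a higher score means a better match; the function name `filter_by_score` is hypothetical and not part of the retriever's API.

```python
from typing import List, Optional, Tuple


def filter_by_score(
    results: List[Tuple[str, float]],
    score_threshold: Optional[float] = None,
) -> List[Tuple[str, float]]:
    """Keep only results whose score meets the threshold.

    Assumes higher scores are better. With no threshold, all results pass.
    """
    if score_threshold is None:
        return list(results)
    return [(doc, score) for doc, score in results if score >= score_threshold]
```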
2024-04-29 12:11:44 -04:00
Andrei Panferov
b3efa38cc0 community[patch]: GigaChat model selection fix (#20988)
Fixed the error that the model name is never actually put into GigaChat
request payload, always defaulting to `GigaChat-Lite`.

With this fix, model selection through
```python
import os
from langchain.chat_models.gigachat import GigaChat

chat = GigaChat(
    name="GigaChat-Pro", # <- HERE!!!!!
    ...
)
```
should actually work, as intended
[here](804390ba4b/libs/community/langchain_community/llms/gigachat.py (L36)).

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-29 16:08:26 +00:00
ccurme
38bd7f4dd6 (new docs): update sidebars alt (#21024) 2024-04-29 11:57:30 -04:00
Patrick McFadin
3331865f6b community[minor]: add Cassandra Database Toolkit (#20246)
**Description**: ToolKit and Tools for accessing data in a Cassandra
Database primarily for Agent integration. Initially, this includes the
following tools:
- `cassandra_db_schema` Gathers all schema information for the connected
database or a specific schema. Critical for the agent when determining
actions.
- `cassandra_db_select_table_data` Selects data from a specific keyspace
and table. The agent can pass parameters for a predicate and limits on
the number of returned records.
- `cassandra_db_query` Experimental alternative to
`cassandra_db_select_table_data` which takes a query string completely
formed by the agent instead of parameters. May be removed in future
versions.

Includes unit test and two notebooks to demonstrate usage. 

**Dependencies**: cassio
**Twitter handle**: @PatrickMcFadin

---------

Co-authored-by: Phil Miesle <phil.miesle@datastax.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 15:51:43 +00:00
Igor Brai
b3e74f2b98 community[minor]: add mojeek search util (#20922)
**Description:** This pull request adds a Mojeek search utility to the
community tools, extending their search capabilities with the Mojeek
search engine.
**Dependencies:** None

---------

Co-authored-by: Igor Brai <igor@mojeek.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-04-29 15:49:53 +00:00
hmn falahi
4822beb298 Ignore self/cls from required args of class functions in convert_to_openai_tool (#20691)
Removed redundant self/cls from required args of class functions in
_get_python_function_required_args:

```python
class MemberTool:
    def search_member(
            self,
            keyword: str,
            *args,
            **kwargs,
    ):
        """Search on members with any keyword like first_name, last_name, email

        Args:
            keyword: Any keyword of member
        """

        headers = dict(authorization=kwargs['token'])
        members = []
        try:
            members = request_(
                method='SEARCH',
                url=f'{service_url}/apiv1/members',
                headers=headers,
                json=dict(query=keyword),
            )

        except Exception as e:
            logger.info(e.__doc__)

        return members

convert_to_openai_tool(MemberTool.search_member)
```
expected result:
```
{'type': 'function', 'function': {'name': 'search_member', 'description': 'Search on members with any keyword like first_name, last_name, email', 'parameters': {'type': 'object', 'properties': {'keyword': {'type': 'string', 'description': 'Any keyword of member'}}, 'required': ['keyword']}}}
```
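
The idea of skipping `self`/`cls` can be sketched with `inspect.signature`. This is an illustration under stated assumptions, not the actual body of `_get_python_function_required_args`; the helper name `required_args` is hypothetical.

```python
import inspect
from typing import Callable, List

_NAMED_KINDS = (
    inspect.Parameter.POSITIONAL_ONLY,
    inspect.Parameter.POSITIONAL_OR_KEYWORD,
    inspect.Parameter.KEYWORD_ONLY,
)


def required_args(func: Callable) -> List[str]:
    """List parameters without defaults, ignoring a leading self/cls."""
    params = list(inspect.signature(func).parameters.values())
    # A method accessed on the class itself still has `self` in its signature.
    if params and params[0].name in ("self", "cls"):
        params = params[1:]
    return [
        p.name
        for p in params
        # *args and **kwargs are never required, so skip variadic kinds.
        if p.kind in _NAMED_KINDS and p.default is inspect.Parameter.empty
    ]
```

For a method defined as `def search(self, keyword, limit=10, *args, **kwargs)`, calling `required_args` on the unbound function yields only `keyword`.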

#20685

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 11:46:26 -04:00
Rahul Triptahi
a64a1943fd docs: Document update for load_extended_metadata in GoogleDriveLoader (#20950)
Document: Updated google_drive.ipynb to cover loading the following
extended metadata:
 - full_path - Full path of the file/s in Google Drive.
 - owner - Owner of the file/s.
 - size - Size of the file/s.

Code changes:
[langchain-google/pull/179.](https://github.com/langchain-ai/langchain-google/pull/179)

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-29 11:41:57 -04:00
Eugene Yurtsev
4f4ee8e2cf cli[patch]: Update migrations file manually (#21021)
We need to replace occurrences of RunnableMap in the code, not just the
import, so for now we don't replace RunnableMap.
2024-04-29 10:53:31 -04:00
Tomaz Bratanic
67428c4052 community[patch]: Neo4j enhanced schema (#20983)
Scan the database for example values and provide them to an LLM for
better inference of Text2cypher
2024-04-29 10:45:55 -04:00
Leonid Kuligin
dc70c23a11 docs: switched GCSLoaders docs to langchain-google-community (#20985)
- **Description:** switched GCSLoaders docs to
langchain-google-community
2024-04-29 10:45:11 -04:00
aditya thomas
8b59bddc03 anthropic[patch]: add tests for secret_str for api key (#20986)
**Description:** Add tests to check API keys are masked
**Issue:** Resolves
https://github.com/langchain-ai/langchain/issues/12165 for Anthropic
models
**Dependencies:** None
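
The shape of such a masking test can be sketched with the standard library alone. `MaskedSecret` below is a hypothetical stand-in for a secret-string type, written only for this illustration; it is not the class the Anthropic integration uses, and the test is not one of the tests added in this commit.

```python
class MaskedSecret:
    """Stdlib-only stand-in for a secret-string type (hypothetical)."""

    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        # The only way to read the real value.
        return self._value

    def __str__(self) -> str:
        return "**********" if self._value else ""

    def __repr__(self) -> str:
        return f"MaskedSecret({str(self)!r})"


def test_api_key_is_masked() -> None:
    key = MaskedSecret("sk-test-123")
    # The raw key must never leak through printing or formatting.
    assert "sk-test-123" not in str(key)
    assert "sk-test-123" not in repr(key)
    assert "sk-test-123" not in f"{key}"
    # Explicit access still returns the real value.
    assert key.get_secret_value() == "sk-test-123"
```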
2024-04-29 10:39:14 -04:00
Pengcheng Liu
1fad39be1c community[minor]: Add LarkSuite wiki document loader. (#21016)
**Description:** Add LarkSuite wiki document loader. Refer to the [LarkSuite
API documentation](https://open.feishu.cn/document/server-docs/docs/wiki-v2/space-node/list) for
details.
**Issue:** None
**Dependencies:** None
**Twitter handle:** None
2024-04-29 10:37:50 -04:00
Tomaz Bratanic
d36332476c docs: Add neo4j relationship vector index docs (#20990)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 14:36:47 +00:00
Leonid Ganeline
dc7c06bc07 community[minor]: import fix (#20995)
Issue: When a third-party package is not installed and the user needs
to `pip install <package>`, an `ImportError` is usually raised.
Sometimes, however, a `ValueError` or `ModuleNotFoundError` is raised
instead, which is inconsistent.
Change: replaced the `ValueError` or `ModuleNotFoundError` with
`ImportError` wherever we raise an error with the `pip install <package>`
message.
Note: Ideally, we would replace all `try: import... except... raise ...` blocks with
helper functions like `import_aim`, or just use the existing
[langchain_core.utils.utils.guard_import](https://api.python.langchain.com/en/latest/utils/langchain_core.utils.utils.guard_import.html#langchain_core.utils.utils.guard_import).
That would be a much bigger refactoring, though. @baskaryan, please advise on
this.
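
The consistent pattern described above could look roughly like the following helper. `require_package` is a hypothetical name used for illustration; it is not the existing `guard_import` function, whose exact signature is not reproduced here.

```python
import importlib
from types import ModuleType
from typing import Optional


def require_package(module_name: str, pip_name: Optional[str] = None) -> ModuleType:
    """Import a module or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        raise ImportError(
            f"Could not import {module_name}. "
            f"Please install it with `pip install {pip_name or module_name}`."
        ) from exc
```

Callers then always see an `ImportError`, whether the package is missing or fails during import.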
2024-04-29 10:32:50 -04:00
Karim Lalani
2ddac9a7c3 experimental[minor]: Add bind_tools and with_structured_output functions to OllamaFunctions (#20881)
Implemented bind_tools for OllamaFunctions.
Made OllamaFunctions a subclass of ChatOllama.
Implemented with_structured_output for OllamaFunctions.

The integration unit test has been updated.
The notebook has been updated.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 14:13:33 +00:00
Eugene Yurtsev
d781560722 cli[minor]: Add ipynb support, add text_splitters (#20963) 2024-04-29 10:11:21 -04:00
Vadym Barda
5e0b6b3e75 docs: update langserve link in LCEL docs (#20992) 2024-04-29 09:06:10 -04:00
Aditya
07ce39bfe7 docs: updated tutorials for Image generation and Vector Search (#21000)
Description: docs: updated tutorials for Image generation and Vector
Search

@lkuligin for review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
2024-04-29 09:04:11 -04:00
Aditya
17bbb7d2a5 docs: updated tutorial for Gemini versions, included safety attribute updates (#21006)
Description: updated tutorial for Gemini versions, included safety
attribute updates

@lkuligin For review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
2024-04-29 09:01:54 -04:00
WilliamEspegren
804390ba4b community: Spider integration (#20937)
Added the [Spider.cloud](https://spider.cloud) document loader.
[Spider](https://github.com/spider-rs/spider) is the
[fastest](https://github.com/spider-rs/spider/blob/main/benches/BENCHMARKS.md)
and cheapest crawler that returns LLM-ready data.

```
- **Description:** Adds Spider data loader
- **Dependencies:** spider-client
- **Twitter handle:** @WilliamEspegren 
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: = <=>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-27 21:45:03 +00:00
Jamie Lemon
6342217b93 docs: Moves "Using PyMuPDF" to higher up the page. (#20832)
**Description:**
This PR moves the **PyMuPDF** PDF loader solution to be underneath
**PyPDF**. This is because it is the 2nd most popular PyPI package
after **PyPDF**.

Download numbers at the time of writing are as follows:

PyPDF
https://www.pepy.tech/projects/PyPDF2
160 million

PyMuPDF
https://www.pepy.tech/projects/pymupdf
60 million

PDFPlumber
https://www.pepy.tech/projects/pdfplumber
23 million

PDFMiner
https://www.pepy.tech/projects/pdfminer
16 million

PyPDFium2
https://www.pepy.tech/projects/pypdfium2
8 million

Unstructured
https://www.pepy.tech/projects/unstructured
8 million


Please note I am an active contributor to
https://github.com/pymupdf/PyMuPDF

Many thanks!

----

**Twitter handle:**
@artifex
2024-04-27 20:40:20 +00:00