62 Commits

Author SHA1 Message Date
German Martin
3a1d05394d
community: Apache AGE wrapper. Ensure Node Uniqueness by ID. (#28759)
**Description:**

The Apache AGE graph integration incorrectly handled node merging,
allowing duplicate nodes with different IDs but the same type and other
properties. Unlike
[Neo4j](cdf6202156/libs/community/langchain_community/graphs/neo4j_graph.py (L47)),
[Memgraph](cdf6202156/libs/community/langchain_community/graphs/memgraph_graph.py (L50)),
[Kuzu](cdf6202156/libs/community/langchain_community/graphs/kuzu_graph.py (L253)),
and
[Gremlin](cdf6202156/libs/community/langchain_community/graphs/gremlin_graph.py (L165)),
it did not use the node ID as the primary identifier for merging.

This inconsistency caused data integrity issues and unexpected behavior
when users expected updates to specific nodes by ID.

**Solution:**
This PR modifies the `node_insert_query` to `MERGE` nodes based on label
and ID *only* and updates properties with `SET`, aligning the behavior
with other graph database integrations. The `_format_properties` method
was also modified to handle id overrides.
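
As a rough illustration of the merge strategy described above (not the
actual `node_insert_query` from this PR), the sketch below builds a generic
Cypher-style statement that matches on label and `id` only and applies the
remaining properties with `SET`. The helper name `build_merge_query` and the
simplified property formatting are assumptions made for this example.

```python
import json


def build_merge_query(label: str, node_id: str, properties: dict) -> str:
    """Build a MERGE keyed on label and id only (hypothetical helper)."""
    # The id is the sole merge key, so differing properties never create a duplicate.
    merge = f"MERGE (n:`{label}` {{`id`: {json.dumps(node_id)}}})"
    # Remaining properties are applied with SET; "id" is skipped here
    # because it is already the merge key.
    assignments = ", ".join(
        f"n.`{key}` = {json.dumps(value)}"
        for key, value in properties.items()
        if key != "id"
    )
    return f"{merge} SET {assignments}" if assignments else merge


print(build_merge_query("Person", "p1", {"name": "Ada", "age": 36}))
```

Running the same statement twice with different property values would update
the existing node rather than create a second one, because only the label and
`id` take part in the match.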

**Impact:**

This fix ensures data integrity by preventing duplicate nodes, and
provides a consistent behavior across graph database integrations.
2024-12-17 09:21:59 -05:00
German Martin
d5d18c62b3
community: Apache AGE wrapper additional edge cases. (#28151)
Description:
The current `AGEGraph()` implementation does some custom wrapping of graph
queries. The method responsible is `_wrap_query()`, which parses the fields
from the original query in order to add SQL context to it.
This PR improves the parsing logic to cover additional edge cases, which are
also added to the test coverage: if any node property name or value contains
the literal "return", the wrapped graph / SQL query breaks.
We discovered this while working with real-world datasets. It is not an
uncommon scenario, and I think it needs to be covered.
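
To show the kind of edge case involved, here is a minimal sketch (not the
actual `_wrap_query()` logic) of locating the `RETURN` clause while ignoring
"return" inside quoted string values or inside longer identifiers. The
function name `find_return_clause` and the regular expression are assumptions
for this example, and the sketch does not try to cover every Cypher
construct.

```python
import re

# Matches a quoted string literal (single or double quoted, with escapes)
# or the standalone RETURN keyword.
_TOKEN = re.compile(
    r"""'(?:\\.|[^'\\])*'|"(?:\\.|[^"\\])*"|\bRETURN\b""",
    re.IGNORECASE,
)


def find_return_clause(query: str) -> str:
    """Return the text after the last RETURN keyword outside string literals."""
    last_end = -1
    for match in _TOKEN.finditer(query):
        # String literals are consumed whole, so "return" inside them is
        # never reported as a keyword hit.
        if match.group(0).upper() == "RETURN":
            last_end = match.end()
    if last_end < 0:
        raise ValueError("query has no RETURN clause")
    return query[last_end:].strip()


query = "MATCH (n:Book {title: 'The return of the king'}) RETURN n.return_date AS d"
print(find_return_clause(query))
```

A naive split on the substring "return" would cut this query inside the book
title; the word-boundary check also keeps `return_date` from being treated as
the keyword.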
2024-12-16 11:28:01 -05:00
Katarina Supe
aba2711e7f
community: update Memgraph integration (#27017)
**Description:**
- **Memgraph** no longer relies on `Neo4jGraphStore` but **implements
`GraphStore`**, just like other graph databases.
- **Memgraph** no longer relies on `GraphQAChain`, but implements
`MemgraphQAChain`, just like other graph databases.
- The refresh schema procedure has been updated to try using `SHOW
SCHEMA INFO`. The fallback uses Cypher queries (a combination of schema
and Cypher) → **LangChain integration no longer relies on MAGE
library**.
- The **schema structure** has been reformatted. Regardless of the
procedures used to get schema, schema structure is the same.
- The `add_graph_documents()` method has been implemented. It transforms
`GraphDocument` into Cypher queries and creates a graph in Memgraph. It
implements the ability to use `baseEntityLabel` to improve speed
(`baseEntityLabel` has an index on the `id` property). It also
implements the ability to include sources by creating a `MENTIONS`
relationship to the source document.
- Jupyter Notebook for Memgraph has been updated.
- **Issue:** /
- **Dependencies:** /
- **Twitter handle:** supe_katarina (DX Engineer @ Memgraph)

Closes #25606
2024-12-10 10:57:21 -05:00
Prashanth Rao
8c6eec5f25
community: KuzuGraph needs allow_dangerous_requests, add graph documents via LLMGraphTransformer (#27949)
- [x] **PR title**: "community: Kuzu - Add graph documents via
LLMGraphTransformer"
- This PR adds a new method `add_graph_documents` to use the
`GraphDocument`s extracted by `LLMGraphTransformer` and store in a Kùzu
graph backend.
- This allows users to transform unstructured text into a graph that
uses Kùzu as the graph store.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: pookam90 <pookam@microsoft.com>
Co-authored-by: Pooja Kamath <60406274+Pookam90@users.noreply.github.com>
Co-authored-by: hsm207 <hsm207@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-10 03:15:28 +00:00
Alex Thomas
5867f25ff3
community[patch]: Neo4j community deprecation (#28130)
Adds deprecation notices for Neo4j components moving to the
`langchain_neo4j` partner package.

- Adds deprecation warnings to all Neo4j-related classes and functions
that have been migrated to the new `langchain_neo4j` partner package
- Updates documentation to reference the new `langchain_neo4j` package
instead of `langchain_community`
2024-11-25 10:34:22 -08:00
Erick Friis
600b7bdd61
all: test 3.13 ci (#27197)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-10-25 12:56:58 -07:00
Tomaz Bratanic
481bd25d29
community: Fix database connections for neo4j (#27190)
Fixes https://github.com/langchain-ai/langchain/issues/27185

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 23:47:55 +00:00
Tomaz Bratanic
03b9aca55d
community: Retry retriable errors in Neo4j (#26211)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 04:01:07 +00:00
Erick Friis
c2a3021bb0
multiple: pydantic 2 compatibility, v0.3 (#26443)
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com>
Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com>
Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: ZhangShenao <15201440436@163.com>
Co-authored-by: Friso H. Kingma <fhkingma@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Morgante Pell <morgantep@google.com>
2024-09-13 14:38:45 -07:00
Tomaz Bratanic
181e4fc0e0
Add session expired retry to neo4j graph (#26182)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 11:40:43 -07:00
Dan O'Donovan
f49da71e87
community[patch]: change default Neo4j username/password (#25226)
**Description:**

Change the default Neo4j username/password (when not supplied as
environment variable or in code) from `None` to `""`.

Neo4j has an option to [disable
auth](https://neo4j.com/docs/operations-manual/current/configuration/configuration-settings/#config_dbms.security.auth_enabled)
which is helpful when developing. When auth is disabled, the username /
password passed through the `neo4j` module should be `""` (i.e. an empty
string).

Empty strings get marked as false in
`langchain_core.utils.env.get_from_dict_or_env` -- changing this code /
behaviour would have a wide impact and is undesirable.

In order to both _allow_ access to Neo4j with auth disabled and _not_
impact `langchain_core` this patch is presented. The downside would be
that if a user forgets to set NEO4J_USERNAME or NEO4J_PASSWORD they
would see an invalid credentials error rather than missing credentials
error. This could be mitigated but would result in a less elegant patch!

**Issue:**
Fix issue where langchain cannot communicate with Neo4j if Neo4j auth is
disabled.
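
The interaction described above can be sketched as follows. `lookup_setting`
is a hypothetical, simplified stand-in written for this example; it is not
the `langchain_core` helper itself and only assumes the same
truthiness-based behaviour. The environment variable name is also made up.

```python
import os
from typing import Optional


def lookup_setting(
    data: dict, key: str, env_key: str, default: Optional[str] = None
) -> str:
    """Hypothetical lookup that treats falsy values as missing."""
    if data.get(key):  # "" is falsy, so an empty string counts as "not supplied"
        return data[key]
    if os.environ.get(env_key):
        return os.environ[env_key]
    if default is not None:
        return default
    raise ValueError(f"Did not find {key}; pass it in or set {env_key}.")


# With auth disabled the caller passes "", which the truthiness checks skip.
# A default of "" lets the lookup succeed instead of raising.
username = lookup_setting(
    {"username": ""}, "username", "EXAMPLE_NEO4J_USERNAME", default=""
)
print(repr(username))
```

With `default=None` the same call raises, which is the "missing credentials"
path that the patch trades away for the auth-disabled case.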
2024-09-03 11:24:18 -07:00
Leonid Ganeline
8788a34bfa
community: NeptuneGraph fix (#23281)
Issue: the `service` optional parameter was mentioned but not used.
Fix: added this parameter.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-23 15:34:26 +00:00
Nada Amin
ac7b71e0d7
langchain_community.graphs: Neo4JGraph: prop min_size might be None (#23944)
When I used the Neo4JGraph enhanced_schema=True option, I ran into an
error because a prop min_size of None was compared numerically with an
int.

The fix I applied is similar to the pattern of skipping embeddings
elsewhere in the file.
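
A minimal sketch of the guard, assuming a property record with `min_size`
and `max_size` fields. The function name, the `LIST_LIMIT` value and the
output format are hypothetical and chosen only for illustration.

```python
from typing import Optional

LIST_LIMIT = 128  # hypothetical cutoff, used only in this example


def describe_list_prop(
    name: str, min_size: Optional[int], max_size: Optional[int]
) -> Optional[str]:
    """Return a schema line for a list property, or None if it is skipped."""
    # Guard first: comparing None with an int raises TypeError in Python 3.
    if min_size is None or max_size is None:
        return None
    if min_size > LIST_LIMIT:
        return None
    return f"{name}: LIST Min Size: {min_size}, Max Size: {max_size}"


print(describe_list_prop("tags", None, None))
print(describe_list_prop("tags", 1, 3))
```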

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-22 20:29:52 +00:00
Eugene Yurtsev
d24b82357f
community[patch]: Add missing annotations (#24890)
This PR adds annotations in the community package.

Annotations are only strictly needed in subclasses of BaseModel for
pydantic 2 compatibility.

This PR adds some unnecessary annotations, but they're not bad to have
regardless for documentation pages.
2024-07-31 18:13:44 +00:00
Tomaz Bratanic
d3a2b9fae0
Fix neo4j type error on missing constraint information (#24177)
If you use `refresh_schema=False`, then the metadata constraint doesn't
exist. Until now, we used a default of `None` in the constraint check, but
`any` then fails because it can't iterate over a `None` value.
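
The failure mode can be sketched like this. The metadata layout and the
`labelsOrTypes` field name are assumptions for the example, not taken from
the patch.

```python
def has_constraint(metadata: dict, label: str) -> bool:
    """Check whether any constraint mentions the label (illustrative only)."""
    # Falling back to [] keeps any() safe when the key is absent or None;
    # any(None) would raise "TypeError: 'NoneType' object is not iterable".
    constraints = metadata.get("constraint") or []
    return any(label in c.get("labelsOrTypes", []) for c in constraints)


print(has_constraint({}, "Person"))
print(has_constraint({"constraint": [{"labelsOrTypes": ["Person"]}]}, "Person"))
```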
2024-07-12 06:39:29 -04:00
Bagatur
a0c2281540
infra: update mypy 1.10, ruff 0.5 (#23721)
```python
"""python scripts/update_mypy_ruff.py"""
import glob
import tomllib
from pathlib import Path

import toml
import subprocess
import re

ROOT_DIR = Path(__file__).parents[1]


def main():
    for path in glob.glob(str(ROOT_DIR / "libs/**/pyproject.toml"), recursive=True):
        print(path)
        with open(path, "rb") as f:
            pyproject = tomllib.load(f)
        try:
            pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = (
                "^1.10"
            )
            pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = (
                "^0.5"
            )
        except KeyError:
            continue
        with open(path, "w") as f:
            toml.dump(pyproject, f)
        cwd = "/".join(path.split("/")[:-1])
        completed = subprocess.run(
            "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )
        logs = completed.stdout.split("\n")

        to_ignore = {}
        # Raw string avoids invalid-escape warnings for "\d" and "\[".
        error_re = re.compile(r"^(.*):(\d+): error:.*\[(.*)\]")
        for log_line in logs:
            # Match once per line and keep the outer loop's `path` untouched.
            match = error_re.match(log_line)
            if match:
                error_path, line_no, error_type = match.groups()
                to_ignore.setdefault((error_path, line_no), []).append(error_type)
        print(len(to_ignore))
        for (error_path, line_no), error_types in to_ignore.items():
            all_errors = ", ".join(error_types)
            full_path = f"{cwd}/{error_path}"
            try:
                with open(full_path, "r") as f:
                    file_lines = f.readlines()
            except FileNotFoundError:
                continue
            file_lines[int(line_no) - 1] = (
                file_lines[int(line_no) - 1][:-1] + f"  # type: ignore[{all_errors}]\n"
            )
            with open(full_path, "w") as f:
                f.write("".join(file_lines))

        subprocess.run(
            "poetry run ruff format .; poetry run ruff --select I --fix .",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )


if __name__ == "__main__":
    main()

```
2024-07-03 10:33:27 -07:00
Tomaz Bratanic
aeeda370aa
Sanitize backticks from neo4j labels and types for import (#23367)
2024-06-24 19:05:31 +00:00
Leonid Ganeline
51e75cf59d
community: docstrings (#23202)
Added missing docstrings. Formatted docstrings to the consistent format
(used in the API Reference).
2024-06-20 11:08:13 -04:00
Bagatur
50186da0a1
infra: rm unused # noqa violations (#22049)
Updating #21137
2024-05-22 15:21:08 -07:00
Eugene Yurtsev
58360a1e53
community[patch]: Add unit test to verify that init is correctly defined (#22030)
Fix some __init__ files and add a unit test
2024-05-22 17:19:00 +00:00
Tomaz Bratanic
d8a1f1114d
community[patch]: Handle exceptions where node props aren't consistent in neo4j schema (#22027)
2024-05-22 11:21:56 -04:00
Tomaz Bratanic
9fce03e7db
community[patch]: Fix neo4j enhanced schema (#21582)
2024-05-13 15:26:06 -04:00
Tomaz Bratanic
ac14f171ac
Add indexed properties to neo4j enhanced schema (#21335)
2024-05-06 14:28:34 -07:00
Tomaz Bratanic
9e53fa7d2e
Some more fixes to neo4j enhanced schema (#21139)
2024-05-01 13:12:43 -07:00
Eugene Yurtsev
1ce1a10f2b
langchain[patch],community[minor]: Move graph index creator (#20795)
Move graph index creator to community
2024-05-01 10:04:30 -04:00
Tomaz Bratanic
c9e96bb5e2
community[patch]: Fix neo4j enhanced schema bugs (#21072)
2024-04-30 20:16:26 -04:00
Charlie Marsh
8f38b7a725
multiple: Remove unnecessary Ruff suppression comments (#21050)
## Summary

I ran `ruff check --extend-select RUF100 -n` to identify `# noqa`
comments that weren't having any effect in Ruff, and then `ruff check
--extend-select RUF100 -n --fix` on select files to remove all of the
unnecessary `# noqa: F401` violations. It's possible that these were
needed at some point in the past, but they're not necessary in Ruff
v0.1.15 (used by LangChain) or in the latest release.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-30 17:13:48 +00:00
Leonid Ganeline
85094cbb3a
docs: community docstring updates (#21040)
Added missing docstrings. Updated docstrings to a consistent format.
2024-04-29 17:40:23 -04:00
Tomaz Bratanic
67428c4052
community[patch]: Neo4j enhanced schema (#20983)
Scan the database for example values and provide them to an LLM for
better inference of Text2cypher
2024-04-29 10:45:55 -04:00
Leonid Ganeline
dc7c06bc07
community[minor]: import fix (#20995)
Issue: When a third-party package is not installed and the user needs to
`pip install <package>`, an `ImportError` is usually raised.
But sometimes a `ValueError` or `ModuleNotFoundError` is raised instead,
which is inconsistent.
Change: replaced the `ValueError` or `ModuleNotFoundError` with
`ImportError` wherever we raise an error with the `pip install <package>`
message.
Note: Ideally, we would replace all `try: import... except... raise ...`
blocks with helper functions like `import_aim`, or just use the existing
[langchain_core.utils.utils.guard_import](https://api.python.langchain.com/en/latest/utils/langchain_core.utils.utils.guard_import.html#langchain_core.utils.utils.guard_import).
But that would be a much bigger refactoring. @baskaryan Please advise on
this.
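
For illustration, a helper in the spirit of the note above could look like
the sketch below. `import_or_raise` is a hypothetical name for this example;
it is not `guard_import` and makes no claim about that function's signature.

```python
import importlib
from types import ModuleType
from typing import Optional


def import_or_raise(module_name: str, pip_name: Optional[str] = None) -> ModuleType:
    """Import a module or raise ImportError with an install hint."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:  # ModuleNotFoundError is a subclass of ImportError
        package = pip_name or module_name
        raise ImportError(
            f"Could not import {module_name}. "
            f"Install it with `pip install {package}`."
        ) from exc


json_module = import_or_raise("json")
print(json_module.__name__)
```

Callers then only ever need to catch `ImportError`, regardless of which
optional dependency is missing.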
2024-04-29 10:32:50 -04:00
Guilherme Zanotelli
f931a9ce60
community[patch]: Pass kwargs to SPARQLStore from RdfGraph (#20385)
This introduces `store_kwargs` which behaves similarly to `graph_kwargs`
on the `RdfGraph` object, which will enable users to pass `headers` and
other arguments to the underlying `SPARQLStore` object. I have also made
a [PR in `rdflib` to support passing
`default_graph`](https://github.com/RDFLib/rdflib/pull/2761).

Example usage:
```python
from langchain_community.graphs import RdfGraph

graph = RdfGraph(
    query_endpoint="http://localhost/sparql",
    standard="rdf",
    store_kwargs=dict(
        default_graph="http://example.com/mygraph"
    )
)
```


---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-27 01:38:29 +00:00
Tomaz Bratanic
9efab3ed66
community[patch]: Add driver config param for neo4j graph (#20772)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-24 21:14:41 +00:00
Leonid Ganeline
13751c3297
community: tigergraph fixes (#20034)
- added guard on the `pyTigerGraph` import
- added a missed example page in the `docs/integrations/graphs/`
- formatted the `docs/integrations/providers/` page to the consistent
format. Added links.
2024-04-24 16:49:21 -04:00
shumway743
cb6e5e56c2
community[minor]: add graph store implementation for apache age (#20582)
**Description:** implemented GraphStore class for Apache Age graph db

**Dependencies:** depends on psycopg2

Unit and integration tests included. Formatting and linting have been
run.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-20 14:31:04 -07:00
Leonid Ganeline
7cf2d2759d
community[patch]: docstrings update (#20301)
Added missing docstrings. Formatted docstrings to a consistent form.
2024-04-11 16:23:27 -04:00
Leonid Ganeline
4cb5f4c353
community[patch]: import flattening fix (#20110)
This PR should make it easier for linters to do type checking and for IDEs to jump to definition of code.

See #20050 as a template for this PR.
- As a byproduct: Added 3 missing `test_imports`.
- Added the missing `SolarChat` to `__init__.py` and to the `test_imports`
unit test.
- Added `# type: ignore` to fix linting. It is not clear why linting
errors appear after the changes above.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-10 13:01:19 -04:00
Erick Friis
37a9e23c05
community: switch to falkordb python client (#20229)
2024-04-09 20:19:44 +00:00
Piyush Jain
cd7abc495a
community[minor]: add neptune analytics graph (#20047)
Replacement for PR
[#19772](https://github.com/langchain-ai/langchain/pull/19772).

---------

Co-authored-by: Dave Bechberger <dbechbe@amazon.com>
Co-authored-by: bechbd <bechbd@users.noreply.github.com>
2024-04-09 09:20:59 -05:00
Tomaz Bratanic
87d2a6b777
community[minor]: Add the option to omit schema refresh in Neo4jGraph (#19654)
2024-03-27 14:20:12 -04:00
Piyush Jain
72ba738bf5
community[minor]: Improvements for NeptuneRdfGraph, Improve discovery of graph schema using database statistics (#19546)
Fixes linting for PR
[19244](https://github.com/langchain-ai/langchain/pull/19244)

---------

Co-authored-by: mhavey <mchavey@gmail.com>
2024-03-26 10:36:51 -04:00
Leonid Ganeline
9c8523b529
community[patch]: flattening imports 3 (#18939)
@eyurtsev
2024-03-12 15:18:54 -07:00
Tomaz Bratanic
a28be31a96
Switch to md5 for deduplication in neo4j integrations (#18846)
Deduplicate documents using MD5 of the page_content. Also allows for
custom deduplication with graph ingestion method by providing metadata
id attribute
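
A minimal sketch of this deduplication scheme, assuming the document is
represented by its `page_content` string and an optional metadata dict. The
function name `document_id` is hypothetical.

```python
import hashlib
from typing import Optional


def document_id(page_content: str, metadata: Optional[dict] = None) -> str:
    """Pick a stable id: a custom metadata id if given, else MD5 of the content."""
    metadata = metadata or {}
    # A caller-supplied id takes priority and enables custom deduplication.
    if "id" in metadata:
        return str(metadata["id"])
    # Identical content always hashes to the same id, so re-imports merge.
    return hashlib.md5(page_content.encode("utf-8")).hexdigest()


print(document_id("hello"))
print(document_id("hello", {"id": "doc-42"}))
```

MD5 is used here only as a content fingerprint for deduplication, not for
any security purpose.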

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-09 13:28:55 -08:00
Tomaz Bratanic
4bfe888717
comunity[patch]: Fix neo4j sanitizing values (#18750)
Fixing sanitization for when deeply nested lists appear
2024-03-07 19:21:52 -08:00
Tomaz Bratanic
ea51cdaede
Remove neo4j bloom labels from graph schema (#18564)
Neo4j tools use particular node labels and relationship types to store
metadata, but are irrelevant for text2cypher or graph generation, so we
want to ignore them in the schema representation.
2024-03-05 12:54:05 -08:00
Tomaz Bratanic
353248838d
Add precedence for input params over env variables in neo4j integration (#18581)
input parameters take precedence over env variables
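
The precedence rule can be sketched as below. `resolve_setting` and the
environment variable name are hypothetical and exist only for this example.

```python
import os
from typing import Optional


def resolve_setting(value: Optional[str], env_key: str, default: str = "") -> str:
    """Return the explicit argument if given, else the environment value."""
    # An explicitly passed argument wins over the environment.
    if value is not None:
        return value
    return os.environ.get(env_key, default)


os.environ["EXAMPLE_NEO4J_URI"] = "bolt://from-env:7687"
print(resolve_setting("bolt://from-arg:7687", "EXAMPLE_NEO4J_URI"))
print(resolve_setting(None, "EXAMPLE_NEO4J_URI"))
```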
2024-03-05 09:36:56 -08:00
Tomaz Bratanic
f6bfb969ba
community[patch]: Add an option for indexed generic label when import neo4j graph documents (#18122)
Current implementation doesn't have an indexed property that would
optimize the import. I have added a `baseEntityLabel` parameter that
allows you to add a secondary node label, which has an indexed `id`
property. By default, the behaviour is identical to the previous version.

Since multi-labeled nodes are terrible for text2cypher, I removed the
secondary label from schema representation object and string, which is
used in text2cypher.
2024-03-01 12:33:52 -08:00
Petteri Johansson
6c1989d292
community[minor], langchain[minor], docs: Gremlin Graph Store and QA Chain (#17683)
- **Description:** 
New feature: Gremlin graph-store and QA chain (including docs).
Compatible with Azure CosmosDB.
  - **Dependencies:** 
  no changes
2024-03-01 12:21:14 -08:00
Neli Hateva
a01e8473f8
community[patch]: Fix GraphSparqlQAChain so that it works with Ontotext GraphDB (#15009)
- **Description:** Introduce a new parameter `graph_kwargs` to
`RdfGraph` - parameters used to initialize the `rdflib.Graph` if
`query_endpoint` is set. Also, do not set
`rdflib.graph.DATASET_DEFAULT_GRAPH_ID` as default value for the
`rdflib.Graph` `identifier` if `query_endpoint` is set.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** N/A
2024-02-25 19:05:21 -08:00
Raunak
1ec8199c8e
community[patch]: Added more functions in NetworkxEntityGraph class (#17624)
- **Description:** 
1. Added add_node(), remove_node(), has_node(), remove_edge(),
has_edge() and get_neighbors() functions to the
NetworkxEntityGraph class.

2. Added the above functions in graph_networkx_qa.ipynb documentation.
2024-02-21 17:02:56 -08:00
Amir Karbasi
bccc9241ea
community[patch]: Resolve KuzuQAChain API Changes (#16885)
- **Description:** Updates to the Kuzu API had broken this
functionality. These updates resolve those issues and add a new test to
demonstrate the updates.
- **Issue:** #11874
- **Dependencies:** No new dependencies
- **Twitter handle:** @amirk08


Test results:
```
tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params PASSED                                   [ 33%]
tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params PASSED                                      [ 66%]
tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema PASSED                                    [100%]

=================================================== slowest 5 durations =================================================== 
0.53s call     tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema
0.34s call     tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_no_params
0.28s call     tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params
0.03s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_refresh_schema
0.02s teardown tests/integration_tests/graphs/test_kuzu.py::TestKuzu::test_query_params
==================================================== 3 passed in 1.27s ==================================================== 
```
2024-02-15 10:18:37 -08:00