Commit Graph

93 Commits

Author SHA1 Message Date
Erick Friis
fa3857c9d0
docs: tests/standard tests api ref redirect (#29444) 2025-01-27 23:21:50 -08:00
Erick Friis
e723882a49
docs: mongodb api ref redirect (#29348) 2025-01-21 16:48:03 -08:00
Luke
f69695069d
text_splitters: Add HTMLSemanticPreservingSplitter (#25911)
**Description:** 

With current HTML splitters, they rely on secondary use of the
`RecursiveCharacterSplitter` to further chunk the document into
manageable chunks. The issue with this is it fails to maintain important
structures such as tables, lists, etc within HTML.

This Implementation of a HTML splitter, allows the user to define a
maximum chunk size, HTML elements to preserve in full, options to
preserve `<a>` href links in the output and custom handlers.

The core splitting begins with headers, similar to `HTMLHeaderSplitter`.
If these sections exceed the length of the `max_chunk_size` further
recursive splitting is triggered. During this splitting, elements listed
to preserve, will be excluded from the splitting process. This can cause
chunks to be slightly larger then the max size, depending on preserved
length. However, all contextual relevance of the preserved item remains
intact.

**Custom Handlers**: Sometimes, companies such as Atlassian have custom
HTML elements, that are not parsed by default with `BeautifulSoup`.
Custom handlers allows a user to provide a function to be ran whenever a
specific html tag is encountered. This allows the user to preserve and
gather information within custom html tags that `bs4` will potentially
miss during extraction.

**Dependencies:** User will need to install `bs4` in their project to
utilise this class

I have also added in `how_to` and unit tests, which require `bs4` to
run, otherwise they will be skipped.

Flowchart of process:


![HTMLSemanticPreservingSplitter](https://github.com/user-attachments/assets/20873c36-22ed-4c80-884b-d3c6f433f5a7)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-12-19 12:09:22 -05:00
hsm207
d0e95971f5
langchain-weaviate: Remove outdated docs (#28058)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


Docs on how to do hybrid search with weaviate is covered
[here](https://python.langchain.com/docs/integrations/vectorstores/weaviate/)

@efriis

---------

Co-authored-by: pookam90 <pookam@microsoft.com>
Co-authored-by: Pooja Kamath <60406274+Pookam90@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-12-10 05:00:07 +00:00
Tomaz Bratanic
6815981578
Switch graphqa example in docs to langgraph (#28574)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-12-09 14:46:00 -05:00
ccurme
7b9a0d9ed8
docs: update tutorials (#28219) 2024-11-26 10:43:12 -05:00
Erick Friis
a073c4c498
templates,docs: leave templates in v0.2 (#27952)
all template installs will now have to declare `--branch v0.2` to make
clear they aren't compatible with langchain 0.3 (most have a pydantic v1
setup). e.g.

```
langchain-cli app add pirate-speak --branch v0.2
```
2024-11-07 22:23:48 +00:00
Erick Friis
9fedb04dd3
docs: INVALID_CHAT_HISTORY redirect (#27845) 2024-11-01 21:35:11 +00:00
Erick Friis
cdb4b1980a
docs: reorganize contributing docs (#27649) 2024-10-25 22:41:54 +00:00
Erick Friis
b468552859
docs: langgraph error code redirects (#27465) 2024-10-18 10:39:32 -07:00
Erick Friis
a38e903360
docs: platforms -> providers (#27285) 2024-10-16 18:27:07 +00:00
Erick Friis
e0c36afc3e
docs: v0.3 link redirect (#26632) 2024-09-18 14:28:56 -07:00
Harutaka Kawamura
6ed50e78c9
community: Rename deployments server to AI gateway (#26368)
We recently renamed `MLflow Deployments Server` to `MLflow AI Gateway`
in mlflow. This PR updates the relevant notebooks to use `MLflow AI
gateway`

---

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:36:04 +00:00
Bagatur
a319a0ff1d
docs: add redirects for tools and lcel (#26541) 2024-09-16 18:06:15 +00:00
Bagatur
fa8e0d90de
docs: update version docs (#26457) 2024-09-13 22:20:24 +00:00
Erick Friis
c2a3021bb0
multiple: pydantic 2 compatibility, v0.3 (#26443)
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com>
Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com>
Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: ZhangShenao <15201440436@163.com>
Co-authored-by: Friso H. Kingma <fhkingma@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Morgante Pell <morgantep@google.com>
2024-09-13 14:38:45 -07:00
Bagatur
f872c50b3f
docs: installation nits (#24484) 2024-09-03 01:05:08 +00:00
Erick Friis
71c039571a
docs: remove deprecated nemo embed docs (#25720) 2024-08-24 00:36:33 +00:00
Erick Friis
0022ae1b31
docs: remove templates (#25717)
- [x] check redirect works at template root
- [x] check redirect works within individual template page
2024-08-23 15:51:12 -07:00
Isaac Francisco
d40bdd6257
docs: more indexing of document loaders (#25500)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-08-20 17:54:42 +00:00
Bob Merkus
8e3e532e7d
docs: ollama doc update (toolcalling, install, notebook examples) (#25549)
The new `langchain-ollama` package seems pretty well implemented, but I
noticed the docs were still outdated so I decided to fix em up a bit.

- Llama3.1 was release on 23rd of July;
https://ai.meta.com/blog/meta-llama-3-1/
- Ollama supports tool calling since 25th of July;
https://ollama.com/blog/tool-support
- LangChain Ollama partner package was released 1st of august;
https://pypi.org/project/langchain-ollama/

**Problem**: Docs note langchain-community instead of langchain-ollama

**Solution**: Update docs to
https://python.langchain.com/v0.2/docs/integrations/chat/ollama/


**Problem**: OllamaFunctions is deprecated, as noted on
[Integrations](https://python.langchain.com/v0.2/docs/integrations/chat/ollama_functions/):
This was an experimental wrapper that attempts to bolt-on tool calling
support to models that do not natively support it. The [primary Ollama
integration](https://python.langchain.com/v0.2/docs/integrations/chat/ollama/) now
supports tool calling, and should be used instead.

**Solution**: Delete old notebook from repo, update the existing one
with @tool decorator + pydantic examples to the notebook


**Problem**: Llama3.1 was released while llama3-groq-tool-call fine-tune
Is noted in notebooks.

**Solution**: update docs + notebooks to llama3.1 (which has improved
tool calling support)


**Problem**: Install instructions are incomplete, there is no
information to download a model and/or run the Ollama server

**Solution**: Add simple instructions to start the ollama service and
pull model (for toolcalling)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-20 09:20:59 -04:00
Isaac Francisco
e0bbb81d04
[docs]: standardize tool docstrings (#25351) 2024-08-13 16:10:00 -07:00
Isaac Francisco
6bc451b942
[docs]: merge tool/toolkit duplicates (#25197) 2024-08-13 12:19:17 -07:00
Bagatur
786ef021a3
docs: redirect toolkits (#25190) 2024-08-08 14:54:11 -07:00
Isaac Francisco
15a36dd0a2
[docs]: combine tools and toolkits (#25158) 2024-08-08 08:59:02 -07:00
Bagatur
3abf1b6905
docs: versions sidebar (#25061) 2024-08-06 09:23:43 -07:00
ccurme
6e45dba471
docs: fix redirect (#24950) 2024-08-01 20:45:54 -04:00
ccurme
c123cb2b30
docs: update migration guide (#24835)
Move to its own section in the sidebar.
2024-07-30 20:17:12 +00:00
Jacob Lee
379803751e
docs[patch]: Remove very old document comparison notebook (#24587) 2024-07-23 22:25:35 -07:00
Erick Friis
141943a7e1
infra: docs ignore step in script (#24090) 2024-07-10 15:18:00 -07:00
Jacob Lee
593de8a913
docs[patch]: Add robots.txt and root sitemap (#22492)
CC @efriis @baskaryan
2024-06-04 11:26:40 -07:00
Bagatur
4d82cea71f
docs: fix llm caches redirect (#22371) 2024-05-31 19:37:06 +00:00
Erick Friis
1bad0ac946
docs: redirect integration links to 0.2 (#22326) 2024-05-31 11:40:48 -04:00
Bagatur
93049d1563
docs: make llm cache its own section (#22301) 2024-05-30 00:17:33 -07:00
Harrison Chase
170cc8aec3
docs: add multi-modal-docs (#21734)
We dont really have any abstractions around multi-modal... so add a
section explaining we dont have any abstrations and then how to guides
for openai and anthropic (probably need to add for more)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: junefish <junefish@users.noreply.github.com>
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-23 18:33:25 +00:00
Erick Friis
6b97418836
docs: rewrite old home, fix v0.1 infinite redirect (#21936) 2024-05-20 13:44:41 -07:00
Bagatur
1418d3af00
docs: link to langsmith+langgraph docs (#21930) 2024-05-20 13:05:22 -07:00
Erick Friis
7976fb1663
docs: cookbook redirect (#21822) 2024-05-17 17:07:30 +00:00
Erick Friis
2be4b1b2c9
Revert "docs: redirect base slug" (#21499)
Reverts langchain-ai/langchain#21457
2024-05-09 12:20:16 -07:00
Erick Friis
d1fc841b1a
docs: redirect base slug (#21457) 2024-05-09 10:52:36 -07:00
Erick Friis
21d14549a9
docs: v0.2 docs in master (#21438)
current python.langchain.com is building from branch `v0.1`. Iterate on
v0.2 docs here.

---------

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Averi Kitsch <akitsch@google.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Martín Gotelli Ferenaz <martingotelliferenaz@gmail.com>
Co-authored-by: Fayfox <admin@fayfox.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Dawson Bauer <105886620+djbauer2@users.noreply.github.com>
Co-authored-by: Ravindu Somawansa <ravindu.somawansa@gmail.com>
Co-authored-by: Dhruv Chawla <43818888+Dominastorm@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: WeichenXu <weichen.xu@databricks.com>
Co-authored-by: Benito Geordie <89472452+benitoThree@users.noreply.github.com>
Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
Co-authored-by: Sevin F. Varoglu <sfvaroglu@octoml.ai>
Co-authored-by: MacanPN <martin.triska@gmail.com>
Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
Co-authored-by: Hyeongchan Kim <kozistr@gmail.com>
Co-authored-by: sdan <git@sdan.io>
Co-authored-by: Guangdong Liu <liugddx@gmail.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: pjb157 <84070455+pjb157@users.noreply.github.com>
Co-authored-by: Eun Hye Kim <ehkim1440@gmail.com>
Co-authored-by: kaijietti <43436010+kaijietti@users.noreply.github.com>
Co-authored-by: Pengcheng Liu <pcliu.fd@gmail.com>
Co-authored-by: Tomer Cagan <tomer@tomercagan.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
2024-05-08 12:29:59 -07:00
Erick Friis
f09bd0b75b
upstage: init package (#20574)
Co-authored-by: Sean Cho <sean@upstage.ai>
Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
2024-04-17 23:25:36 +00:00
pjb157
479be3cc91
community[minor]: Unify Titan Takeoff Integrations and Adding Embedding Support (#18775)
**Community: Unify Titan Takeoff Integrations and Adding Embedding
Support**

 **Description:** 
Titan Takeoff no longer reflects this either of the integrations in the
community folder. The two integrations (TitanTakeoffPro and
TitanTakeoff) where causing confusion with clients, so have moved code
into one place and created an alias for backwards compatibility. Added
Takeoff Client python package to do the bulk of the work with the
requests, this is because this package is actively updated with new
versions of Takeoff. So this integration will be far more robust and
will not degrade as badly over time.

**Issue:**
Fixes bugs in the old Titan integrations and unified the code with added
unit test converge to avoid future problems.

**Dependencies:**
Added optional dependency takeoff-client, all imports still work without
dependency including the Titan Takeoff classes but just will fail on
initialisation if not pip installed takeoff-client

**Twitter**
@MeryemArik9

Thanks all :)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-17 01:43:35 +00:00
Jacob Lee
58a2123ca0
docs[patch]: Add missing redirects (#20076) 2024-04-05 12:54:00 -07:00
Leonid Ganeline
82f0198be2
docs: graphs update (#19675)
Issue: The `graph` code was moved into the `community` package a long
ago. But the related documentation is still in the
[use_cases](https://python.langchain.com/docs/use_cases/graph/integrations/diffbot_graphtransformer)
section and not in the `integrations`.
Changes:
- moved the `use_cases/graph/integrations` notebooks into the
`integrations/graphs`
- renamed files and changed titles to follow the consistent format
- redirected old page URLs to new URLs in `vercel.json` and in several
other pages
- added descriptions and links when necessary
- formatted into the consistent format
2024-04-04 14:13:22 -07:00
Jacob Lee
7f0cb3bfba
docs[patch]: Make Docusaurus and Vercel add trailing slashes when navigating by default (#20014)
Should hopefully avoid weird broken link edge cases.

Relative links now trip up the Docusaurus broken link checker, so this
PR also removes them.

Also snuck in a small addition about asyncio
2024-04-04 12:49:15 -07:00
Jacob Lee
605c3f23e1
docs: reorg and visual refresh (#19765)
- put use cases in main sidebar
- move modules to own sidebar, rename components
- cleanup lcel section
- cleanup guides
- update font, cell highlighting

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-04 00:58:36 -07:00
Shengsheng Huang
ac1dd8ad94
community[minor]: migrate bigdl-llm to ipex-llm (#19518)
- **Description**: `bigdl-llm` library has been renamed to
[`ipex-llm`](https://github.com/intel-analytics/ipex-llm). This PR
migrates the `bigdl-llm` integration to `ipex-llm` .
- **Issue**: N/A. The original PR of `bigdl-llm` is
https://github.com/langchain-ai/langchain/pull/17953
- **Dependencies**: `ipex-llm` library
- **Contribution maintainer**: @shane-huang

Updated doc:   docs/docs/integrations/llms/ipex_llm.ipynb
Updated test:
libs/community/tests/integration_tests/llms/test_ipex_llm.py
2024-03-27 20:12:59 -07:00
yuwenzho
3a7d2cf443
community[minor]: Add ITREX optimized Embeddings (#18474)
Introduction
[Intel® Extension for
Transformers](https://github.com/intel/intel-extension-for-transformers)
is an innovative toolkit designed to accelerate GenAI/LLM everywhere
with the optimal performance of Transformer-based models on various
Intel platforms

Description

adding ITREX runtime embeddings using intel-extension-for-transformers.
added mdx documentation and example notebooks
added embedding import testing.

---------

Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 07:22:06 +00:00
Leonid Ganeline
07c518ad3e
docs: providers update 4 (#18540)
Created the `facebook` page from `facebook_faiss` and `facebook_chat`
pages. Added another Facebook integrations into this page.
Updated `discord` page.
2024-03-09 13:30:48 -08:00