feat(docs): improve devx, fix Makefile targets (#32237)

**TL;DR much of the provided `Makefile` targets were broken, and any
time I wanted to preview changes locally I either had to refer to a
command Chester gave me or try waiting on a Vercel preview deployment.
With this PR, everything should behave like normal.**

Significant updates to the `Makefile` and documentation files, focusing
on improving usability, adding clear messaging, and fixing/enhancing
documentation workflows.

### Updates to `Makefile`:

#### Enhanced build and cleaning processes:
- Added informative messages (e.g., "📚 Building LangChain
documentation...") to makefile targets like `docs_build`, `docs_clean`,
and `api_docs_build` for better user feedback during execution.
- Introduced a `clean-cache` target to the `docs` `Makefile` to clear
cached dependencies and ensure clean builds.

#### Improved dependency handling:
- Modified `install-py-deps` to create a `.venv/deps_installed` marker,
preventing redundant/duplicate dependency installations and improving
efficiency.

#### Streamlined file generation and infrastructure setup:
- Added caching for the LangServe README download and parallelized
feature table generation
- Added user-friendly completion messages for targets like `copy-infra`
and `render`.

#### Documentation server updates:
- Enhanced the `start` target with messages indicating server start and
URL for local documentation viewing.

---

### Documentation Improvements:

#### Content clarity and consistency:
- Standardized section titles for consistency across documentation
files.
[[1]](diffhunk://#diff-9b1a85ea8a9dcf79f58246c88692cd7a36316665d7e05a69141cfdc50794c82aL1-R1)
[[2]](diffhunk://#diff-944008ad3a79d8a312183618401fcfa71da0e69c75803eff09b779fc8e03183dL1-R1)
- Refined phrasing and formatting in sections like "Dependency
management" and "Formatting and linting" for better readability.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L6-R6)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L84-R82)

#### Enhanced workflows:
- Updated instructions for building and viewing documentation locally,
including tips for specifying server ports and handling API reference
previews.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L60-R94)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
- Expanded guidance on cleaning documentation artifacts and using
linting tools effectively.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)

#### API reference documentation:
- Improved instructions for generating and formatting in-code
documentation, highlighting best practices for docstring writing.
[[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)
[[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L144-R186)

---

### Minor Changes:
- Added support for a new package name (`langchain_v1`) in the API
documentation generation script.
- Fixed minor capitalization and formatting issues in documentation
files.
[[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L40-R40)
[[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L166-R160)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This commit is contained in:
Mason Daugherty
2025-07-25 14:49:03 -04:00
committed by GitHub
parent 549ecd3e78
commit f624ad489a
53 changed files with 761 additions and 246 deletions

View File

@@ -1,9 +1,9 @@
# we build the docs in these stages:
# 1. install vercel and python dependencies
# 2. copy files from "source dir" to "intermediate dir"
# 2. generate files like model feat table, etc in "intermediate dir"
# 3. copy files to their right spots (e.g. langserve readme) in "intermediate dir"
# 4. build the docs from "intermediate dir" to "output dir"
# We build the docs in these stages:
# 1. Install vercel and python dependencies
# 2. Copy files from "source dir" to "intermediate dir"
# 2. Generate files like model feat table, etc in "intermediate dir"
# 3. Copy files to their right spots (e.g. langserve readme) in "intermediate dir"
# 4. Build the docs from "intermediate dir" to "output dir"
SOURCE_DIR = docs/
INTERMEDIATE_DIR = build/intermediate/docs
@@ -18,32 +18,45 @@ PORT ?= 3001
clean:
rm -rf build
clean-cache:
rm -rf build .venv/deps_installed
install-vercel-deps:
yum -y -q update
yum -y -q install gcc bzip2-devel libffi-devel zlib-devel wget tar gzip rsync -y
install-py-deps:
python3 -m venv .venv
$(PYTHON) -m pip install -q --upgrade pip
$(PYTHON) -m pip install -q --upgrade uv
$(PYTHON) -m uv pip install -q --pre -r vercel_requirements.txt
$(PYTHON) -m uv pip install -q --pre $$($(PYTHON) scripts/partner_deps_list.py) --overrides vercel_overrides.txt
@echo "📦 Installing Python dependencies..."
@if [ ! -d .venv ]; then python3 -m venv .venv; fi
@if [ ! -f .venv/deps_installed ]; then \
$(PYTHON) -m pip install -q --upgrade pip --disable-pip-version-check; \
$(PYTHON) -m pip install -q --upgrade uv; \
$(PYTHON) -m uv pip install -q --pre -r vercel_requirements.txt; \
$(PYTHON) -m uv pip install -q --pre $$($(PYTHON) scripts/partner_deps_list.py) --overrides vercel_overrides.txt; \
touch .venv/deps_installed; \
fi
@echo "✅ Dependencies installed"
generate-files:
@echo "📄 Generating documentation files..."
mkdir -p $(INTERMEDIATE_DIR)
cp -rp $(SOURCE_DIR)/* $(INTERMEDIATE_DIR)
$(PYTHON) scripts/tool_feat_table.py $(INTERMEDIATE_DIR)
$(PYTHON) scripts/kv_store_feat_table.py $(INTERMEDIATE_DIR)
$(PYTHON) scripts/partner_pkg_table.py $(INTERMEDIATE_DIR)
curl https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md | sed 's/<=/\&lt;=/g' > $(INTERMEDIATE_DIR)/langserve.md
@if [ ! -f build/langserve_readme_cache.md ] || [ $$(find build/langserve_readme_cache.md -mtime +1 -print) ]; then \
echo "🌐 Downloading LangServe README..."; \
curl -s https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md | sed 's/<=/\&lt;=/g' > build/langserve_readme_cache.md; \
fi
cp build/langserve_readme_cache.md $(INTERMEDIATE_DIR)/langserve.md
cp ../SECURITY.md $(INTERMEDIATE_DIR)/security.md
$(PYTHON) scripts/resolve_local_links.py $(INTERMEDIATE_DIR)/langserve.md https://github.com/langchain-ai/langserve/tree/main/
@echo "🔧 Generating feature tables and processing links..."
$(PYTHON) scripts/tool_feat_table.py $(INTERMEDIATE_DIR) & \
$(PYTHON) scripts/kv_store_feat_table.py $(INTERMEDIATE_DIR) & \
$(PYTHON) scripts/partner_pkg_table.py $(INTERMEDIATE_DIR) & \
$(PYTHON) scripts/resolve_local_links.py $(INTERMEDIATE_DIR)/langserve.md https://github.com/langchain-ai/langserve/tree/main/ & \
wait
@echo "✅ Files generated"
copy-infra:
@echo "📂 Copying infrastructure files..."
mkdir -p $(OUTPUT_NEW_DIR)
cp -r src $(OUTPUT_NEW_DIR)
cp vercel.json $(OUTPUT_NEW_DIR)
@@ -55,15 +68,22 @@ copy-infra:
cp -r static $(OUTPUT_NEW_DIR)
cp -r ../libs/cli/langchain_cli/integration_template $(OUTPUT_NEW_DIR)/src/theme
cp yarn.lock $(OUTPUT_NEW_DIR)
@echo "✅ Infrastructure files copied"
render:
@echo "📓 Converting notebooks (this may take a while)..."
$(PYTHON) scripts/notebook_convert.py $(INTERMEDIATE_DIR) $(OUTPUT_NEW_DOCS_DIR)
@echo "✅ Notebooks converted"
md-sync:
@echo "📝 Syncing markdown files..."
rsync -avmq --include="*/" --include="*.mdx" --include="*.md" --include="*.png" --include="*/_category_.yml" --exclude="*" $(INTERMEDIATE_DIR)/ $(OUTPUT_NEW_DOCS_DIR)
@echo "✅ Markdown files synced"
append-related:
@echo "🔗 Appending related links..."
$(PYTHON) scripts/append_related_links.py $(OUTPUT_NEW_DOCS_DIR)
@echo "✅ Related links appended"
generate-references:
$(PYTHON) scripts/generate_api_reference_links.py --docs_dir $(OUTPUT_NEW_DOCS_DIR)
@@ -71,6 +91,10 @@ generate-references:
update-md: generate-files md-sync
build: install-py-deps generate-files copy-infra render md-sync append-related
@echo ""
@echo "🎉 Documentation build complete!"
@echo "📖 To view locally, run: cd docs && make start"
@echo ""
vercel-build: install-vercel-deps build generate-references
rm -rf docs
@@ -84,4 +108,9 @@ vercel-build: install-vercel-deps build generate-references
NODE_OPTIONS="--max-old-space-size=5000" yarn run docusaurus build
start:
cd $(OUTPUT_NEW_DIR) && yarn && yarn start --port=$(PORT)
@echo "🚀 Starting documentation server on port $(PORT)..."
@echo "📖 Installing Node.js dependencies..."
cd $(OUTPUT_NEW_DIR) && yarn install --silent
@echo "🌐 Starting server at http://localhost:$(PORT)"
@echo "Press Ctrl+C to stop the server"
cd $(OUTPUT_NEW_DIR) && yarn start --port=$(PORT)

View File

@@ -262,6 +262,8 @@ myst_enable_extensions = ["colon_fence"]
# generate autosummary even if no references
autosummary_generate = True
# Don't fail on autosummary import warnings
autosummary_ignore_module_all = False
html_copy_source = False
html_show_sourcelink = False

View File

@@ -497,6 +497,7 @@ def _package_dir(package_name: str = "langchain") -> Path:
"""Return the path to the directory containing the documentation."""
if package_name in (
"langchain",
"langchain_v1",
"experimental",
"community",
"core",
@@ -592,7 +593,12 @@ For the legacy API reference hosted on ReadTheDocs see [https://api.python.langc
if integrations:
integration_headers = [
" ".join(
custom_names.get(x, x.title().replace("ai", "AI").replace("db", "DB"))
custom_names.get(
x,
x.title().replace("db", "DB")
if dir_ == "langchain_v1"
else x.title().replace("ai", "AI").replace("db", "DB"),
)
for x in dir_.split("-")
)
for dir_ in integrations

View File

@@ -1,4 +1,4 @@
# Contribute Code
# Contribute code
If you would like to add a new feature or update an existing one, please read the resources below before getting started:

View File

@@ -3,7 +3,7 @@
This guide walks through how to run the repository locally and check in your first code.
For a [development container](https://containers.dev/), see the [.devcontainer folder](https://github.com/langchain-ai/langchain/tree/master/.devcontainer).
## Dependency Management: `uv` and other env/dependency managers
## Dependency management: `uv` and other env/dependency managers
This project utilizes [uv](https://docs.astral.sh/uv/) v0.5+ as a dependency manager.
@@ -37,7 +37,7 @@ For this quickstart, start with `langchain`:
cd libs/langchain
```
## Local Development Dependencies
## Local development dependencies
Install development requirements (for running langchain, running examples, linting, formatting, tests, and coverage):
@@ -64,12 +64,6 @@ To run unit tests:
make test
```
To run unit tests in Docker:
```bash
make docker_tests
```
There are also [integration tests and code-coverage](../testing.mdx) available.
### Developing langchain_core
@@ -81,11 +75,11 @@ cd libs/core
make test
```
## Formatting and Linting
## Formatting and linting
Run these locally before submitting a PR; the CI system will check also.
### Code Formatting
### Code formatting
Formatting for this project is done via [ruff](https://docs.astral.sh/ruff/rules/).
@@ -163,7 +157,7 @@ If codespell is incorrectly flagging a word, you can skip spellcheck for that wo
ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
```
## Working with Optional Dependencies
## Working with optional dependencies
`langchain`, `langchain-community`, and `langchain-experimental` rely on optional dependencies to keep these packages lightweight.

View File

@@ -1,4 +1,4 @@
# Contribute Documentation
# Contribute documentation
Documentation is a vital part of LangChain. We welcome both new documentation for new features and
community improvements to our current documentation. Please read the resources below before getting started:

View File

@@ -12,12 +12,11 @@ It covers a wide array of topics, including tutorials, use cases, integrations,
and more, offering extensive guidance on building with LangChain.
The content for this documentation lives in the `/docs` directory of the monorepo.
2. In-code Documentation: This is documentation of the codebase itself, which is also
used to generate the externally facing [API Reference](https://python.langchain.com/api_reference/langchain/index.html).
used to generate the externally facing [API Reference](https://python.langchain.com/api_reference/).
The content for the API reference is autogenerated by scanning the docstrings in the codebase. For this reason we ask that
developers document their code well.
The `API Reference` is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/)
from the code and is hosted by [Read the Docs](https://readthedocs.org/).
The API Reference is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code.
We appreciate all contributions to the documentation, whether it be fixing a typo,
adding a new tutorial or example and whether it be in the main documentation or the API Reference.
@@ -25,7 +24,7 @@ adding a new tutorial or example and whether it be in the main documentation or
Similar to linting, we recognize documentation can be annoying. If you do not want
to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.
## 📜 Main Documentation
## 📜 Main documentation
The content for the main documentation is located in the `/docs` directory of the monorepo.
@@ -42,7 +41,7 @@ After modifying the documentation:
3. Make a pull request with the changes.
4. You can preview and verify that the changes are what you wanted by clicking the `View deployment` or `Visit Preview` buttons on the pull request `Conversation` page. This will take you to a preview of the documentation changes.
## ⚒️ Linting and Building Documentation Locally
## ⚒️ Linting and building documentation locally
After writing up the documentation, you may want to lint and build the documentation
locally to ensure that it looks good and is free of errors.
@@ -57,20 +56,44 @@ The code that builds the documentation is located in the `/docs` directory of th
In the following commands, the prefix `api_` indicates that those are operations for the API Reference.
Before building the documentation, it is always a good idea to clean the build directory:
```bash
make docs_clean
make api_docs_clean
```
Next, you can build the documentation as outlined below:
You can build the documentation as outlined below:
```bash
make docs_build
make api_docs_build
```
### Viewing documentation locally
After building the main documentation, you can view it locally by starting a development server:
```bash
# For main documentation (after running `make docs_build`)
cd docs && make start
```
This will start a development server where you can view the documentation in your browser. The exact url will be shown to you during the start process. The server will automatically reload when you make changes to the documentation files under the `build/` directory (e.g. for temporary tests - changes you wish to persist should be put under `docs/docs/`).
:::tip
You can specify a different port by setting the `PORT` environment variable:
```bash
cd docs && PORT=3000 make start
```
:::
The API Reference documentation is built as static HTML files and will be automatically opened directly in your browser.
You can also view the API Reference for a specific package by specifying the package name and installing the package if necessary dependencies:
```bash
# Opens the API Reference for the `ollama` package in your default browser
uv pip install -e libs/partners/ollama
make api_docs_quick_preview API_PKG=ollama
```
:::tip
The `make api_docs_build` command takes a long time. If you're making cosmetic changes to the API docs and want to see how they look, use:
@@ -79,18 +102,28 @@ The `make api_docs_build` command takes a long time. If you're making cosmetic c
make api_docs_quick_preview
```
which will just build a small subset of the API reference.
which will just build a small subset of the API reference (the `text-splitters` package).
:::
Finally, run the link checker to ensure all links are valid:
Finally, run the link checker from the project root to ensure all links are valid:
```bash
make docs_linkcheck
make api_docs_linkcheck
```
### Linting and Formatting
To clean up the documentation build artifacts, you can run:
```bash
make clean
# Or to clean specific documentation artifacts
make docs_clean
make api_docs_clean
```
### Formatting and linting
The Main Documentation is linted from the **monorepo root**. To lint the main documentation, run the following from there:
@@ -104,9 +137,9 @@ If you have formatting-related errors, you can fix them automatically with:
make format
```
## ⌨️ In-code Documentation
## ⌨️ In-code documentation
The in-code documentation is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code and is hosted by [Read the Docs](https://readthedocs.org/).
The in-code documentation is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code following [reStructuredText](https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html).
For the API reference to be useful, the codebase must be well-documented. This means that all functions, classes, and methods should have a docstring that explains what they do, what the arguments are, and what the return value is. This is a good practice in general, but it is especially important for LangChain because the API reference is the primary resource for developers to understand how to use the codebase.
@@ -141,16 +174,16 @@ def my_function(arg1: int, arg2: str) -> float:
return 3.14
```
### Linting and Formatting
### Formatting and linting
The in-code documentation is linted from the directories belonging to the packages
being documented.
For example, if you're working on the `langchain-community` package, you would change
the working directory to the `langchain-community` directory:
For example, if you're working on the `langchain-ollama` package, you would change
the working directory to the the package directory:
```bash
cd [root]/libs/langchain-community
cd [root]/libs/partners/ollama
```
Then you can run the following commands to lint and format the in-code documentation:
@@ -160,9 +193,9 @@ make format
make lint
```
## Verify Documentation Changes
## Verify documentation changes
After pushing documentation changes to the repository, you can preview and verify that the changes are
what you wanted by clicking the `View deployment` or `Visit Preview` buttons on the pull request `Conversation` page.
This will take you to a preview of the documentation changes.
This preview is created by [Vercel](https://vercel.com/docs/getting-started-with-vercel).
This preview is created by [Vercel](https://vercel.com/docs/getting-started-with-vercel).

View File

@@ -2,7 +2,7 @@
sidebar_class_name: "hidden"
---
# Documentation Style Guide
# Documentation style guide
As LangChain continues to grow, the amount of documentation required to cover the various concepts and integrations continues to grow too.
This page provides guidelines for anyone writing documentation for LangChain and outlines some of our philosophies around
@@ -158,3 +158,5 @@ Be concise, including in code samples.
- Use bullet points and numbered lists to break down information into easily digestible chunks
- Use tables (especially for **Reference** sections) and diagrams often to present information visually
- Include the table of contents for longer documentation pages to help readers navigate the content, but hide it for shorter pages
Next, see the [documentation setup guide](setup.mdx) to get started with writing documentation for LangChain.

View File

@@ -1,4 +1,4 @@
# How-to Guides
# How-to guides
- [**Documentation**](documentation/index.mdx): Help improve our docs, including this one!
- [**Code**](code/index.mdx): Help us write code, fix bugs, or improve our infrastructure.

View File

@@ -3,7 +3,7 @@ pagination_prev: null
pagination_next: contributing/how_to/integrations/package
---
# Contribute Integrations
# Contribute integrations
Integrations are a core component of LangChain.
LangChain provides standard interfaces for several different components (language models, vector stores, etc) that are crucial when building LLM applications.
@@ -16,7 +16,7 @@ LangChain provides standard interfaces for several different components (languag
- **Best Practices:** Through their standard interface, LangChain components encourage and facilitate best practices (streaming, async, etc)
## Components to Integrate
## Components to integrate
:::info
@@ -71,7 +71,7 @@ In order to contribute an integration, you should follow these steps:
5. [Optional] Open and merge a PR to add documentation for your integration to the official LangChain docs.
6. [Optional] Engage with the LangChain team for joint co-marketing ([see below](#co-marketing)).
## Co-Marketing
## Co-marketing
With over 20 million monthly downloads, LangChain has a large audience of developers
building LLM applications. Beyond just listing integrations, we aim to highlight
@@ -87,5 +87,5 @@ Here are some heuristics for types of content we are excited to promote:
- **End-to-end applications:** End-to-end applications are great resources for developers looking to build. We prefer to highlight applications that are more complex/agentic in nature, and that use [LangGraph](https://github.com/langchain-ai/langgraph) as the orchestration framework. We get particularly excited about anything involving long-term memory, human-in-the-loop interaction patterns, or multi-agent architectures.
- **Research:** We love highlighting novel research! Whether it is research built on top of LangChain or that integrates with it.
## Further Reading
## Further reading
To get started, let's learn [how to implement an integration package](/docs/contributing/how_to/integrations/package/) for LangChain.

View File

@@ -358,7 +358,7 @@ a schema for the LLM to fill out when calling the tool. Similar to the `name` an
description (part of `Field(..., description="description")`) are passed to the LLM,
and the values in these fields should be concise and LLM-usable.
### Run Methods
### Run methods
`_run` is the main method that should be implemented in the subclass. This method
takes in the arguments from `args_schema` and runs the tool, returning a string
@@ -469,6 +469,6 @@ import RetrieverSource from '/src/theme/integration_template/integration_templat
---
## Next Steps
## Next steps
Now that you've implemented your package, you can move on to [testing your integration](../standard_tests) for your integration and successfully run them.

View File

@@ -10,7 +10,7 @@ Unit tests run on every pull request, so they should be fast and reliable.
Integration tests run once a day, and they require more setup, so they should be reserved for confirming interface points with external services.
## Unit Tests
## Unit tests
Unit tests cover modular logic that does not require calls to outside APIs.
If you add new logic, please add a unit test.
@@ -27,19 +27,13 @@ To run unit tests:
make test
```
To run unit tests in Docker:
```bash
make docker_tests
```
To run a specific test:
```bash
TEST_FILE=tests/unit_tests/test_imports.py make test
```
## Integration Tests
## Integration tests
Integration tests cover logic that requires making calls to outside APIs (often integration with other services).
If you add support for a new external API, please add a new integration test.

View File

@@ -12,7 +12,7 @@ More coming soon! We are working on tutorials to help you make your first contri
- [**Make your first docs PR**](tutorials/docs.mdx)
## How-to Guides
## How-to guides
- [**Documentation**](how_to/documentation/index.mdx): Help improve our docs, including this one!
- [**Code**](how_to/code/index.mdx): Help us write code, fix bugs, or improve our infrastructure.

View File

@@ -50,7 +50,7 @@ There are other files in the root directory level, but their presence should be
## Documentation
The `/docs` directory contains the content for the documentation that is shown
at https://python.langchain.com/ and the associated API Reference https://python.langchain.com/api_reference/langchain/index.html.
at [python.langchain.com](https://python.langchain.com/) and the associated [API Reference](https://python.langchain.com/api_reference/).
See the [documentation](../how_to/documentation/index.mdx) guidelines to learn how to contribute to the documentation.

View File

@@ -8,7 +8,7 @@ This tutorial will guide you through making a simple documentation edit, like co
---
## Editing a Documentation Page on GitHub
## Editing a documentation page on GitHub
Sometimes you want to make a small change, like fixing a typo, and the easiest way to do this is to use GitHub's editor directly.
@@ -42,10 +42,14 @@ Sometimes you want to make a small change, like fixing a typo, and the easiest w
- Give your PR a title like `docs: Fix typo in X section`.
- Follow the checklist in the PR description template.
## Getting a Review
## Getting a review
Once you've submitted the pull request, it will be reviewed by the maintainers. You may receive feedback or requests for changes. Keep an eye on the PR to address any comments.
Docs PRs are typically reviewed within a few days, but it may take longer depending on the complexity of the change and the availability of maintainers.
For more information on reviews, see the [Review Process](../reference/review_process.mdx).
## More information
See our [how-to guides](../how_to/documentation/index.mdx) for more information on contributing to documentation:

View File

@@ -92,7 +92,7 @@ the support of DB2 vector store and vector search.
See detailed usage examples in the guide [here](/docs/integrations/vectorstores/db2).
Installation: This is a seperate package for vector store feature only and can be run
Installation: This is a separate package for vector store feature only and can be run
without the `langchain-ibm` package.
```bash
pip install -U langchain-db2

View File

@@ -20,7 +20,7 @@ pip install langchain-predictionguard
|---|---|---|---------------------------------------------------------|-------------------------------------------------------------------------------|
|Chat|Build Chat Bots|[Chat](https://docs.predictionguard.com/api-reference/api-reference/chat-completions)| `from langchain_predictionguard import ChatPredictionGuard` | [ChatPredictionGuard.ipynb](/docs/integrations/chat/predictionguard) |
|Completions|Generate Text|[Completions](https://docs.predictionguard.com/api-reference/api-reference/completions)| `from langchain_predictionguard import PredictionGuard` | [PredictionGuard.ipynb](/docs/integrations/llms/predictionguard) |
|Text Embedding|Embed String to Vectores|[Embeddings](https://docs.predictionguard.com/api-reference/api-reference/embeddings)| `from langchain_predictionguard import PredictionGuardEmbeddings` | [PredictionGuardEmbeddings.ipynb](/docs/integrations/text_embedding/predictionguard) |
|Text Embedding|Embed String to Vectors|[Embeddings](https://docs.predictionguard.com/api-reference/api-reference/embeddings)| `from langchain_predictionguard import PredictionGuardEmbeddings` | [PredictionGuardEmbeddings.ipynb](/docs/integrations/text_embedding/predictionguard) |
## Getting Started

View File

@@ -1,7 +1,6 @@
# PremAI
[PremAI](https://premai.io/) is an all-in-one platform that simplifies the creation of robust, production-ready applications powered by Generative AI. By streamlining the development process, PremAI allows you to concentrate on enhancing user experience and driving overall growth for your application. You can quickly start using our platform [here](https://docs.premai.io/quick-start).
[PremAI](https://premai.io/) is an all-in-one platform that simplifies the creation of robust, production-ready applications powered by Generative AI. By streamlining the development process, PremAI allows you to concentrate on enhancing user experience and driving overall growth for your application. You can quickly start using [our platform](https://docs.premai.io/quick-start).
## ChatPremAI
@@ -26,10 +25,9 @@ from langchain_community.chat_models import ChatPremAI
Once we imported our required modules, let's setup our client. For now let's assume that our `project_id` is `8`. But make sure you use your project-id, otherwise it will throw error.
To use langchain with prem, you do not need to pass any model name or set any parameters with our chat-client. By default it will use the model name and parameters used in the [LaunchPad](https://docs.premai.io/get-started/launchpad).
> Note: If you change the `model` or any other parameters like `temperature` or `max_tokens` while setting the client, it will override existing default configurations, that was used in LaunchPad.
To use langchain with prem, you do not need to pass any model name or set any parameters with our chat-client. By default it will use the model name and parameters used in the [LaunchPad](https://docs.premai.io/get-started/launchpad).
> Note: If you change the `model` or any other parameters like `temperature` or `max_tokens` while setting the client, it will override existing default configurations, that was used in LaunchPad.
```python
import os
@@ -43,9 +41,9 @@ chat = ChatPremAI(project_id=1234, model_name="gpt-4o")
### Chat Completions
`ChatPremAI` supports two methods: `invoke` (which is the same as `generate`) and `stream`.
`ChatPremAI` supports two methods: `invoke` (which is the same as `generate`) and `stream`.
The first one will give us a static result. Whereas the second one will stream tokens one by one. Here's how you can generate chat-like completions.
The first one will give us a static result. Whereas the second one will stream tokens one by one. Here's how you can generate chat-like completions.
```python
human_message = HumanMessage(content="Who are you?")
@@ -72,18 +70,17 @@ chat.invoke(
)
```
> If you are going to place system prompt here, then it will override your system prompt that was fixed while deploying the application from the platform.
> If you are going to place system prompt here, then it will override your system prompt that was fixed while deploying the application from the platform.
> You can find all the optional parameters [here](https://docs.premai.io/get-started/sdk#optional-parameters). Any parameters other than [these supported parameters](https://docs.premai.io/get-started/sdk#optional-parameters) will be automatically removed before calling the model.
### Native RAG Support with Prem Repositories
Prem Repositories which allows users to upload documents (.txt, .pdf etc) and connect those repositories to the LLMs. You can think Prem repositories as native RAG, where each repository can be considered as a vector database. You can connect multiple repositories. You can learn more about repositories [here](https://docs.premai.io/get-started/repositories).
Repositories are also supported in langchain premai. Here is how you can do it.
Repositories are also supported in langchain premai. Here is how you can do it.
```python
```python
query = "Which models are used for dense retrieval"
repository_ids = [1985,]
@@ -94,13 +91,13 @@ repositories = dict(
)
```
First we start by defining our repository with some repository ids. Make sure that the ids are valid repository ids. You can learn more about how to get the repository id [here](https://docs.premai.io/get-started/repositories).
First we start by defining our repository with some repository ids. Make sure that the ids are valid repository ids. You can learn more about how to get the repository id [here](https://docs.premai.io/get-started/repositories).
> Please note: Similar like `model_name` when you invoke the argument `repositories`, then you are potentially overriding the repositories connected in the launchpad.
> Please note: Similar like `model_name` when you invoke the argument `repositories`, then you are potentially overriding the repositories connected in the launchpad.
Now, we connect the repository with our chat object to invoke RAG based generations.
Now, we connect the repository with our chat object to invoke RAG based generations.
```python
```python
import json
response = chat.invoke(query, max_tokens=100, repositories=repositories)
@@ -109,7 +106,7 @@ print(response.content)
print(json.dumps(response.response_metadata, indent=4))
```
This is how an output looks like.
This is how an output looks like.
```bash
Dense retrieval models typically include:
@@ -134,11 +131,11 @@ Dense retrieval models typically include:
So, this also means that you do not need to make your own RAG pipeline when using the Prem Platform. Prem uses it's own RAG technology to deliver best in class performance for Retrieval Augmented Generations.
> Ideally, you do not need to connect Repository IDs here to get Retrieval Augmented Generations. You can still get the same result if you have connected the repositories in prem platform.
> Ideally, you do not need to connect Repository IDs here to get Retrieval Augmented Generations. You can still get the same result if you have connected the repositories in prem platform.
### Streaming
In this section, let's see how we can stream tokens using langchain and PremAI. Here's how you do it.
In this section, let's see how we can stream tokens using langchain and PremAI. Here's how you do it.
```python
import sys
@@ -163,16 +160,15 @@ for chunk in chat.stream(
This will stream tokens one after the other.
> Please note: As of now, RAG with streaming is not supported. However we still support it with our API. You can learn more about that [here](https://docs.premai.io/get-started/chat-completion-sse).
> Please note: As of now, RAG with streaming is not supported. However we still support it with our API. You can learn more about that [here](https://docs.premai.io/get-started/chat-completion-sse).
## Prem Templates
Writing Prompt Templates can be super messy. Prompt templates are long, hard to manage, and must be continuously tweaked to improve and keep the same throughout the application.
Writing Prompt Templates can be super messy. Prompt templates are long, hard to manage, and must be continuously tweaked to improve and keep the same throughout the application.
With **Prem**, writing and managing prompts can be super easy. The **_Templates_** tab inside the [launchpad](https://docs.premai.io/get-started/launchpad) helps you write as many prompts you need and use it inside the SDK to make your application running using those prompts. You can read more about Prompt Templates [here](https://docs.premai.io/get-started/prem-templates).
With **Prem**, writing and managing prompts can be super easy. The **_Templates_** tab inside the [launchpad](https://docs.premai.io/get-started/launchpad) helps you write as many prompts you need and use it inside the SDK to make your application running using those prompts. You can read more about Prompt Templates [here](https://docs.premai.io/get-started/prem-templates).
To use Prem Templates natively with LangChain, you need to pass an id the `HumanMessage`. This id should be the name the variable of your prompt template. the `content` in `HumanMessage` should be the value of that variable.
To use Prem Templates natively with LangChain, you need to pass an id the `HumanMessage`. This id should be the name the variable of your prompt template. the `content` in `HumanMessage` should be the value of that variable.
let's say for example, if your prompt template was this:
@@ -198,7 +194,7 @@ template_id = "78069ce8-xxxxx-xxxxx-xxxx-xxx"
response = chat.invoke([human_message], template_id=template_id)
```
Prem Templates are also available for Streaming too.
Prem Templates are also available for Streaming too.
## Prem Embeddings
@@ -215,7 +211,7 @@ if os.environ.get("PREMAI_API_KEY") is None:
```
We support lots of state of the art embedding models. You can view our list of supported LLMs and embedding models [here](https://docs.premai.io/get-started/supported-models). For now let's go for `text-embedding-3-large` model for this example. .
We support lots of state of the art embedding models. You can view our list of supported LLMs and embedding models [here](https://docs.premai.io/get-started/supported-models). For now let's go for `text-embedding-3-large` model for this example. .
```python
@@ -231,7 +227,7 @@ print(query_result[:5])
```
:::note
Setting `model_name` argument in mandatory for PremAIEmbeddings unlike chat.
Setting `model_name` argument in mandatory for PremAIEmbeddings unlike chat.
:::
Finally, let's embed some sample document
@@ -254,11 +250,13 @@ print(doc_result[0][:5])
```python
print(f"Dimension of embeddings: {len(query_result)}")
```
Dimension of embeddings: 3072
```python
doc_result[:5]
```
>Result:
>
>[-0.02129288576543331,
@@ -269,20 +267,20 @@ doc_result[:5]
## Tool/Function Calling
LangChain PremAI supports tool/function calling. Tool/function calling allows a model to respond to a given prompt by generating output that matches a user-defined schema.
LangChain PremAI supports tool/function calling. Tool/function calling allows a model to respond to a given prompt by generating output that matches a user-defined schema.
- You can learn all about tool calling in details [in our documentation here](https://docs.premai.io/get-started/function-calling).
- You can learn more about langchain tool calling in [this part of the docs](https://python.langchain.com/v0.1/docs/modules/model_io/chat/function_calling).
**NOTE:**
> The current version of LangChain ChatPremAI do not support function/tool calling with streaming support. Streaming support along with function calling will come soon.
> The current version of LangChain ChatPremAI do not support function/tool calling with streaming support. Streaming support along with function calling will come soon.
### Passing tools to model
In order to pass tools and let the LLM choose the tool it needs to call, we need to pass a tool schema. A tool schema is the function definition along with proper docstring on what does the function do, what each argument of the function is etc. Below are some simple arithmetic functions with their schema.
In order to pass tools and let the LLM choose the tool it needs to call, we need to pass a tool schema. A tool schema is the function definition along with proper docstring on what does the function do, what each argument of the function is etc. Below are some simple arithmetic functions with their schema.
**NOTE:**
**NOTE:**
> When defining function/tool schema, do not forget to add information around the function arguments, otherwise it would throw error.
```python
@@ -320,27 +318,28 @@ def multiply(a: int, b: int) -> int:
### Binding tool schemas with our LLM
We will now use the `bind_tools` method to convert our above functions to a "tool" and binding it with the model. This means we are going to pass these tool information everytime we invoke the model.
We will now use the `bind_tools` method to convert our above functions to a "tool" and binding it with the model. This means we are going to pass these tool information every time we invoke the model.
```python
tools = [add, multiply]
llm_with_tools = chat.bind_tools(tools)
```
After this, we get the response from the model which is now binded with the tools.
After this, we get the response from the model which is now binded with the tools.
```python
```python
query = "What is 3 * 12? Also, what is 11 + 49?"
messages = [HumanMessage(query)]
ai_msg = llm_with_tools.invoke(messages)
```
As we can see, when our chat model is binded with tools, then based on the given prompt, it calls the correct set of the tools and sequentially.
As we can see, when our chat model is binded with tools, then based on the given prompt, it calls the correct set of the tools and sequentially.
```python
```python
ai_msg.tool_calls
```
**Output**
```python
@@ -352,15 +351,15 @@ ai_msg.tool_calls
'id': 'call_MPKYGLHbf39csJIyb5BZ9xIk'}]
```
We append this message shown above to the LLM which acts as a context and makes the LLM aware that what all functions it has called.
We append this message shown above to the LLM which acts as a context and makes the LLM aware that what all functions it has called.
```python
```python
messages.append(ai_msg)
```
Since tool calling happens into two phases, where:
1. in our first call, we gathered all the tools that the LLM decided to tool, so that it can get the result as an added context to give more accurate and hallucination free result.
1. in our first call, we gathered all the tools that the LLM decided to tool, so that it can get the result as an added context to give more accurate and hallucination free result.
2. in our second call, we will parse those set of tools decided by LLM and run them (in our case it will be the functions we defined, with the LLM's extracted arguments) and pass this result to the LLM
@@ -373,12 +372,13 @@ for tool_call in ai_msg.tool_calls:
messages.append(ToolMessage(tool_output, tool_call_id=tool_call["id"]))
```
Finally, we call the LLM (binded with the tools) with the function response added in it's context.
Finally, we call the LLM (binded with the tools) with the function response added in it's context.
```python
response = llm_with_tools.invoke(messages)
print(response.content)
```
**Output**
```txt
@@ -425,4 +425,4 @@ chain.invoke(query)
[multiply(a=3, b=12), add(a=11, b=49)]
```
Now, as done above, we parse this and run this functions and call the LLM once again to get the result.
Now, as done above, we parse this and run this functions and call the LLM once again to get the result.

View File

@@ -1,9 +1,6 @@
import sys
from pathlib import Path
from langchain_community import document_loaders
from langchain_core.document_loaders.base import BaseLoader
KV_STORE_TEMPLATE = """\
---
sidebar_class_name: hidden

View File

@@ -175,8 +175,23 @@ def _modify_frontmatter(
def _convert_notebook(
notebook_path: Path, output_path: Path, intermediate_docs_dir: Path
) -> Path:
with open(notebook_path) as f:
nb = nbformat.read(f, as_version=4)
import json
import uuid
with open(notebook_path, "r", encoding="utf-8") as f:
nb_json = json.load(f)
# Fix missing and duplicate cell IDs before nbformat validation
seen_ids = set()
for cell in nb_json.get("cells", []):
if "id" not in cell or not cell.get("id") or cell.get("id") in seen_ids:
cell["id"] = str(uuid.uuid4())[:8]
seen_ids.add(cell["id"])
nb = nbformat.reads(json.dumps(nb_json), as_version=4)
# Upgrade notebook format
nb = nbformat.v4.upgrade(nb)
body, resources = exporter.from_notebook_node(nb)

View File

@@ -315,8 +315,8 @@ module.exports = {
},
],
link: {
type: "doc",
id: "integrations/stores/index",
type: "generated-index",
slug: "integrations/stores",
},
},
{

View File

@@ -4,3 +4,9 @@ aiohttp<3.11
protobuf<3.21
tenacity
urllib3
# Fix numpy conflicts between langchain-astradb and langchain-chroma
numpy>=1.26.0,<2.0.0
# Fix simsimd build error in langchain-weaviate
simsimd>=5.0.0
# Fix sentencepiece build error - use newer version that supports modern CMake
sentencepiece>=0.2.1