feat(docs): improve devx, fix Makefile targets (#32237)

**TL;DR many of the provided `Makefile` targets were broken, and any time I wanted to preview changes locally I either had to refer to a command Chester gave me or try waiting on a Vercel preview deployment. With this PR, everything should behave like normal.**

Significant updates to the `Makefile` and documentation files, focusing on improving usability, adding clear messaging, and fixing/enhancing documentation workflows.

### Updates to `Makefile`:

#### Enhanced build and cleaning processes:

- Added informative messages (e.g., "📚 Building LangChain documentation...") to makefile targets like `docs_build`, `docs_clean`, and `api_docs_build` for better user feedback during execution.
- Introduced a `clean-cache` target to the `docs` `Makefile` to clear cached dependencies and ensure clean builds.

#### Improved dependency handling:

- Modified `install-py-deps` to create a `.venv/deps_installed` marker, preventing redundant/duplicate dependency installations and improving efficiency.

#### Streamlined file generation and infrastructure setup:

- Added caching for the LangServe README download and parallelized feature table generation.
- Added user-friendly completion messages for targets like `copy-infra` and `render`.

#### Documentation server updates:

- Enhanced the `start` target with messages indicating server start and URL for local documentation viewing.

---

### Documentation Improvements:

#### Content clarity and consistency:

- Standardized section titles for consistency across documentation files. [[1]](diffhunk://#diff-9b1a85ea8a9dcf79f58246c88692cd7a36316665d7e05a69141cfdc50794c82aL1-R1) [[2]](diffhunk://#diff-944008ad3a79d8a312183618401fcfa71da0e69c75803eff09b779fc8e03183dL1-R1)
- Refined phrasing and formatting in sections like "Dependency management" and "Formatting and linting" for better readability. [[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L6-R6) [[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L84-R82)

#### Enhanced workflows:

- Updated instructions for building and viewing documentation locally, including tips for specifying server ports and handling API reference previews. [[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L60-R94) [[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126)
- Expanded guidance on cleaning documentation artifacts and using linting tools effectively. [[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L82-R126) [[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142)

#### API reference documentation:

- Improved instructions for generating and formatting in-code documentation, highlighting best practices for docstring writing. [[1]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L107-R142) [[2]](diffhunk://#diff-048deddcfd44b242e5b23aed9f2e9ec73afc672244ce14df2a0a316d95840c87L144-R186)

---

### Minor Changes:

- Added support for a new package name (`langchain_v1`) in the API documentation generation script.
- Fixed minor capitalization and formatting issues in documentation files. [[1]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L40-R40) [[2]](diffhunk://#diff-2069d4f956ab606ae6d51b191439283798adaf3a6648542c409d258131617059L166-R160)

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-03 03:59:42 +00:00 · 2025-07-25 14:49:03 -04:00
parent 549ecd3e78
commit f624ad489a
53 changed files with 761 additions and 246 deletions
--- a/docs/Makefile
+++ b/docs/Makefile
@@ -1,9 +1,9 @@
-# we build the docs in these stages:
-# 1. install vercel and python dependencies
-# 2. copy files from "source dir" to "intermediate dir"
-# 2. generate files like model feat table, etc in "intermediate dir"
-# 3. copy files to their right spots (e.g. langserve readme) in "intermediate dir"
-# 4. build the docs from "intermediate dir" to "output dir"
+# We build the docs in these stages:
+# 1. Install vercel and python dependencies
+# 2. Copy files from "source dir" to "intermediate dir"
+# 3. Generate files like model feat table, etc in "intermediate dir"
+# 4. Copy files to their right spots (e.g. langserve readme) in "intermediate dir"
+# 5. Build the docs from "intermediate dir" to "output dir"

 SOURCE_DIR = docs/
 INTERMEDIATE_DIR = build/intermediate/docs
@@ -18,32 +18,45 @@ PORT ?= 3001
 clean:
 	rm -rf build

+clean-cache:
+	rm -rf build .venv/deps_installed
+
 install-vercel-deps:
 	yum -y -q update
 	yum -y -q install gcc bzip2-devel libffi-devel zlib-devel wget tar gzip rsync -y

 install-py-deps:
-	python3 -m venv .venv
-	$(PYTHON) -m pip install -q --upgrade pip
-	$(PYTHON) -m pip install -q --upgrade uv
-	$(PYTHON) -m uv pip install -q --pre -r vercel_requirements.txt
-	$(PYTHON) -m uv pip install -q --pre $$($(PYTHON) scripts/partner_deps_list.py) --overrides vercel_overrides.txt
+	@echo "📦 Installing Python dependencies..."
+	@if [ ! -d .venv ]; then python3 -m venv .venv; fi
+	@if [ ! -f .venv/deps_installed ]; then \
+		$(PYTHON) -m pip install -q --upgrade pip --disable-pip-version-check; \
+		$(PYTHON) -m pip install -q --upgrade uv; \
+		$(PYTHON) -m uv pip install -q --pre -r vercel_requirements.txt; \
+		$(PYTHON) -m uv pip install -q --pre $$($(PYTHON) scripts/partner_deps_list.py) --overrides vercel_overrides.txt; \
+		touch .venv/deps_installed; \
+	fi
+	@echo "✅ Dependencies installed"

 generate-files:
+	@echo "📄 Generating documentation files..."
 	mkdir -p $(INTERMEDIATE_DIR)
 	cp -rp $(SOURCE_DIR)/* $(INTERMEDIATE_DIR)
-
-	$(PYTHON) scripts/tool_feat_table.py $(INTERMEDIATE_DIR)
-
-	$(PYTHON) scripts/kv_store_feat_table.py $(INTERMEDIATE_DIR)
-
-	$(PYTHON) scripts/partner_pkg_table.py $(INTERMEDIATE_DIR)
-
-	curl https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md | sed 's/<=/\&lt;=/g' > $(INTERMEDIATE_DIR)/langserve.md
+	@if [ ! -f build/langserve_readme_cache.md ] || [ $$(find build/langserve_readme_cache.md -mtime +1 -print) ]; then \
+		echo "🌐 Downloading LangServe README..."; \
+		curl -s https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md | sed 's/<=/\&lt;=/g' > build/langserve_readme_cache.md; \
+	fi
+	cp build/langserve_readme_cache.md $(INTERMEDIATE_DIR)/langserve.md
 	cp ../SECURITY.md $(INTERMEDIATE_DIR)/security.md
-	$(PYTHON) scripts/resolve_local_links.py $(INTERMEDIATE_DIR)/langserve.md https://github.com/langchain-ai/langserve/tree/main/
+	@echo "🔧 Generating feature tables and processing links..."
+	$(PYTHON) scripts/tool_feat_table.py $(INTERMEDIATE_DIR) & \
+	$(PYTHON) scripts/kv_store_feat_table.py $(INTERMEDIATE_DIR) & \
+	$(PYTHON) scripts/partner_pkg_table.py $(INTERMEDIATE_DIR) & \
+	$(PYTHON) scripts/resolve_local_links.py $(INTERMEDIATE_DIR)/langserve.md https://github.com/langchain-ai/langserve/tree/main/ & \
+	wait
+	@echo "✅ Files generated"

 copy-infra:
+	@echo "📂 Copying infrastructure files..."
 	mkdir -p $(OUTPUT_NEW_DIR)
 	cp -r src $(OUTPUT_NEW_DIR)
 	cp vercel.json $(OUTPUT_NEW_DIR)
@@ -55,15 +68,22 @@ copy-infra:
 	cp -r static $(OUTPUT_NEW_DIR)
 	cp -r ../libs/cli/langchain_cli/integration_template $(OUTPUT_NEW_DIR)/src/theme
 	cp yarn.lock $(OUTPUT_NEW_DIR)
+	@echo "✅ Infrastructure files copied"

 render:
+	@echo "📓 Converting notebooks (this may take a while)..."
 	$(PYTHON) scripts/notebook_convert.py $(INTERMEDIATE_DIR) $(OUTPUT_NEW_DOCS_DIR)
+	@echo "✅ Notebooks converted"

 md-sync:
+	@echo "📝 Syncing markdown files..."
 	rsync -avmq --include="*/" --include="*.mdx" --include="*.md" --include="*.png" --include="*/_category_.yml" --exclude="*" $(INTERMEDIATE_DIR)/ $(OUTPUT_NEW_DOCS_DIR)
+	@echo "✅ Markdown files synced"

 append-related:
+	@echo "🔗 Appending related links..."
 	$(PYTHON) scripts/append_related_links.py $(OUTPUT_NEW_DOCS_DIR)
+	@echo "✅ Related links appended"

 generate-references:
 	$(PYTHON) scripts/generate_api_reference_links.py --docs_dir $(OUTPUT_NEW_DOCS_DIR)
@@ -71,6 +91,10 @@ generate-references:
 update-md: generate-files md-sync

 build: install-py-deps generate-files copy-infra render md-sync append-related
+	@echo ""
+	@echo "🎉 Documentation build complete!"
+	@echo "📖 To view locally, run: cd docs && make start"
+	@echo ""

 vercel-build: install-vercel-deps build generate-references
 	rm -rf docs
@@ -84,4 +108,9 @@ vercel-build: install-vercel-deps build generate-references
 	NODE_OPTIONS="--max-old-space-size=5000" yarn run docusaurus build

 start:
-	cd $(OUTPUT_NEW_DIR) && yarn && yarn start --port=$(PORT)
+	@echo "🚀 Starting documentation server on port $(PORT)..."
+	@echo "📖 Installing Node.js dependencies..."
+	cd $(OUTPUT_NEW_DIR) && yarn install --silent
+	@echo "🌐 Starting server at http://localhost:$(PORT)"
+	@echo "Press Ctrl+C to stop the server"
+	cd $(OUTPUT_NEW_DIR) && yarn start --port=$(PORT)
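
Reviewer note on the `docs/Makefile` hunks above: the two guards they add (the `.venv/deps_installed` marker and the day-old check on `build/langserve_readme_cache.md`) can be sketched in Python to make the intended behavior easier to reason about. This is a minimal sketch, not code from the PR; `needs_refresh` and `run_once` are hypothetical names chosen for illustration, and the age check assumes the usual `find -mtime +N` semantics, where the age is truncated to whole days.

```python
import time
from pathlib import Path
from typing import Callable, Optional

SECONDS_PER_DAY = 86_400


def needs_refresh(cache: Path, max_age_days: int = 1, now: Optional[float] = None) -> bool:
    """Return True when the cached file is missing or too old.

    Mirrors `[ ! -f FILE ] || find FILE -mtime +N`. find truncates the age
    to whole days, so `+1` only matches files at least two days old.
    """
    if not cache.is_file():
        return True
    current = time.time() if now is None else now
    age_days = int((current - cache.stat().st_mtime) // SECONDS_PER_DAY)
    return age_days > max_age_days


def run_once(marker: Path, install: Callable[[], None]) -> bool:
    """Run `install` only if `marker` is absent, then create the marker.

    Returns True when the install step actually ran.
    """
    if marker.exists():
        return False
    install()  # hypothetical stand-in for the pip/uv commands in the recipe
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.touch()
    return True
```

One design difference worth noting: this sketch writes no marker if `install` raises, whereas the recipe chains its commands with `;`, so as far as I can tell `touch .venv/deps_installed` would still run after a failed install. Running `make clean-cache` removes the marker and forces a reinstall either way.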
--- a/docs/api_reference/conf.py
+++ b/docs/api_reference/conf.py
@@ -262,6 +262,8 @@ myst_enable_extensions = ["colon_fence"]

 # generate autosummary even if no references
 autosummary_generate = True
+# Document only the members listed in a module's `__all__` (when it is defined)
+autosummary_ignore_module_all = False

 html_copy_source = False
 html_show_sourcelink = False
--- a/docs/api_reference/create_api_rst.py
+++ b/docs/api_reference/create_api_rst.py
@@ -497,6 +497,7 @@ def _package_dir(package_name: str = "langchain") -> Path:
    """Return the path to the directory containing the documentation."""
    if package_name in (
        "langchain",
+        "langchain_v1",
        "experimental",
        "community",
        "core",
@@ -592,7 +593,12 @@ For the legacy API reference hosted on ReadTheDocs see [https://api.python.langc
    if integrations:
        integration_headers = [
            " ".join(
-                custom_names.get(x, x.title().replace("ai", "AI").replace("db", "DB"))
+                custom_names.get(
+                    x,
+                    x.title().replace("db", "DB")
+                    if dir_ == "langchain_v1"
+                    else x.title().replace("ai", "AI").replace("db", "DB"),
+                )
                for x in dir_.split("-")
            )
            for dir_ in integrations
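
Reviewer note on the `create_api_rst.py` hunk above: the special case for `langchain_v1` presumably exists because the generic `"ai"` to `"AI"` replacement also matches the `ai` inside `Langchain`. The sketch below isolates that header logic so the effect can be checked directly. `integration_header` is a hypothetical helper name, the directory names are illustrative inputs, and `custom_names` is assumed empty unless passed in.

```python
from typing import Dict, Optional


def integration_header(dir_: str, custom_names: Optional[Dict[str, str]] = None) -> str:
    """Build a display header from a package directory name."""
    names = custom_names or {}
    parts = []
    for x in dir_.split("-"):
        if dir_ == "langchain_v1":
            # Skip the "ai" -> "AI" step, which would hit the "ai" in "Langchain".
            default = x.title().replace("db", "DB")
        else:
            default = x.title().replace("ai", "AI").replace("db", "DB")
        parts.append(names.get(x, default))
    return " ".join(parts)


print(integration_header("openai"))        # OpenAI
print(integration_header("mongodb"))       # MongoDB
print(integration_header("langchain_v1"))  # Langchain_V1
# What the old one-liner would have produced for the new package:
print("langchain_v1".title().replace("ai", "AI"))  # LangchAIn_V1
```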
--- a/docs/docs/contributing/how_to/code/index.mdx
+++ b/docs/docs/contributing/how_to/code/index.mdx
@@ -1,4 +1,4 @@
-# Contribute Code
+# Contribute code

 If you would like to add a new feature or update an existing one, please read the resources below before getting started:

--- a/docs/docs/contributing/how_to/code/setup.mdx
+++ b/docs/docs/contributing/how_to/code/setup.mdx
@@ -3,7 +3,7 @@
 This guide walks through how to run the repository locally and check in your first code.
 For a [development container](https://containers.dev/), see the [.devcontainer folder](https://github.com/langchain-ai/langchain/tree/master/.devcontainer).

-## Dependency Management: `uv` and other env/dependency managers
+## Dependency management: `uv` and other env/dependency managers

 This project utilizes [uv](https://docs.astral.sh/uv/) v0.5+ as a dependency manager.

@@ -37,7 +37,7 @@ For this quickstart, start with `langchain`:
 cd libs/langchain
 ```

-## Local Development Dependencies
+## Local development dependencies

 Install development requirements (for running langchain, running examples, linting, formatting, tests, and coverage):

@@ -64,12 +64,6 @@ To run unit tests:
 make test
 ```

-To run unit tests in Docker:
-
-```bash
-make docker_tests
-```
-
 There are also [integration tests and code-coverage](../testing.mdx) available.

 ### Developing langchain_core
@@ -81,11 +75,11 @@ cd libs/core
 make test
 ```

-## Formatting and Linting
+## Formatting and linting

 Run these locally before submitting a PR; the CI system will check also.

-### Code Formatting
+### Code formatting

 Formatting for this project is done via [ruff](https://docs.astral.sh/ruff/rules/).

@@ -163,7 +157,7 @@ If codespell is incorrectly flagging a word, you can skip spellcheck for that wo
 ignore-words-list = 'momento,collison,ned,foor,reworkd,parth,whats,aapply,mysogyny,unsecure'
 ```

-## Working with Optional Dependencies
+## Working with optional dependencies

 `langchain`, `langchain-community`, and `langchain-experimental` rely on optional dependencies to keep these packages lightweight.

--- a/docs/docs/contributing/how_to/documentation/index.mdx
+++ b/docs/docs/contributing/how_to/documentation/index.mdx
@@ -1,4 +1,4 @@
-# Contribute Documentation
+# Contribute documentation

 Documentation is a vital part of LangChain. We welcome both new documentation for new features and 
 community improvements to our current documentation. Please read the resources below before getting started:
--- a/docs/docs/contributing/how_to/documentation/setup.mdx
+++ b/docs/docs/contributing/how_to/documentation/setup.mdx
@@ -12,12 +12,11 @@ It covers a wide array of topics, including tutorials, use cases, integrations,
 and more, offering extensive guidance on building with LangChain.
 The content for this documentation lives in the `/docs` directory of the monorepo.
 2. In-code Documentation: This is documentation of the codebase itself, which is also
-used to generate the externally facing [API Reference](https://python.langchain.com/api_reference/langchain/index.html).
+used to generate the externally facing [API Reference](https://python.langchain.com/api_reference/).
 The content for the API reference is autogenerated by scanning the docstrings in the codebase. For this reason we ask that
 developers document their code well.

-The `API Reference` is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/)
-from the code and is hosted by [Read the Docs](https://readthedocs.org/).
+The API Reference is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code.

 We appreciate all contributions to the documentation, whether it be fixing a typo,
 adding a new tutorial or example and whether it be in the main documentation or the API Reference.
@@ -25,7 +24,7 @@ adding a new tutorial or example and whether it be in the main documentation or
 Similar to linting, we recognize documentation can be annoying. If you do not want
 to do it, please contact a project maintainer, and they can help you with it. We do not want this to be a blocker for good code getting contributed.

-## 📜 Main Documentation
+## 📜 Main documentation

 The content for the main documentation is located in the `/docs` directory of the monorepo.

@@ -42,7 +41,7 @@ After modifying the documentation:
 3. Make a pull request with the changes.
 4. You can preview and verify that the changes are what you wanted by clicking the `View deployment` or `Visit Preview` buttons on the pull request `Conversation` page. This will take you to a preview of the documentation changes.

-## ⚒️ Linting and Building Documentation Locally
+## ⚒️ Linting and building documentation locally

 After writing up the documentation, you may want to lint and build the documentation
 locally to ensure that it looks good and is free of errors.
@@ -57,20 +56,44 @@ The code that builds the documentation is located in the `/docs` directory of th

 In the following commands, the prefix `api_` indicates that those are operations for the API Reference.

-Before building the documentation, it is always a good idea to clean the build directory:
-
-```bash
-make docs_clean
-make api_docs_clean
-```
-
-Next, you can build the documentation as outlined below:
+You can build the documentation as outlined below:

 ```bash
 make docs_build
 make api_docs_build
 ```

+### Viewing documentation locally
+
+After building the main documentation, you can view it locally by starting a development server:
+
+```bash
+# For main documentation (after running `make docs_build`)
+cd docs && make start
+```
+
+This will start a development server where you can view the documentation in your browser. The exact URL is shown during startup. The server automatically reloads when you change documentation files under the `build/` directory. Edits there are only suitable for temporary tests; changes you wish to persist belong under `docs/docs/`.
+
+:::tip
+
+You can specify a different port by setting the `PORT` environment variable:
+
+```bash
+cd docs && PORT=3000 make start
+```
+
+:::
+
+The API Reference documentation is built as static HTML files and opens automatically in your browser.
+
+You can also view the API Reference for a specific package by specifying the package name, installing the package first if necessary:
+
+```bash
+# Opens the API Reference for the `ollama` package in your default browser
+uv pip install -e libs/partners/ollama
+make api_docs_quick_preview API_PKG=ollama
+```
+
 :::tip

 The `make api_docs_build` command takes a long time. If you're making cosmetic changes to the API docs and want to see how they look, use:
@@ -79,18 +102,28 @@ The `make api_docs_build` command takes a long time. If you're making cosmetic c
 make api_docs_quick_preview
 ```

-which will just build a small subset of the API reference.
+which will just build a small subset of the API reference (the `text-splitters` package).

 :::

-Finally, run the link checker to ensure all links are valid:
+Finally, run the link checker from the project root to ensure all links are valid:

 ```bash
 make docs_linkcheck
 make api_docs_linkcheck
 ```

-### Linting and Formatting
+To clean up the documentation build artifacts, you can run:
+
+```bash
+make clean
+
+# Or to clean specific documentation artifacts
+make docs_clean
+make api_docs_clean
+```
+
+### Formatting and linting

 The Main Documentation is linted from the **monorepo root**. To lint the main documentation, run the following from there:

@@ -104,9 +137,9 @@ If you have formatting-related errors, you can fix them automatically with:
 make format
 ```

-## ⌨️ In-code Documentation
+## ⌨️ In-code documentation

-The in-code documentation is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code and is hosted by [Read the Docs](https://readthedocs.org/).
+The in-code documentation is largely autogenerated by [sphinx](https://www.sphinx-doc.org/en/master/) from the code following [reStructuredText](https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html).

 For the API reference to be useful, the codebase must be well-documented. This means that all functions, classes, and methods should have a docstring that explains what they do, what the arguments are, and what the return value is. This is a good practice in general, but it is especially important for LangChain because the API reference is the primary resource for developers to understand how to use the codebase.

@@ -141,16 +174,16 @@ def my_function(arg1: int, arg2: str) -> float:
    return 3.14
 ```

-### Linting and Formatting
+### Formatting and linting

 The in-code documentation is linted from the directories belonging to the packages
 being documented.

-For example, if you're working on the `langchain-community` package, you would change
-the working directory to the `langchain-community` directory:
+For example, if you're working on the `langchain-ollama` package, you would change
+the working directory to the package directory:

 ```bash
-cd [root]/libs/langchain-community
+cd [root]/libs/partners/ollama
 ```

 Then you can run the following commands to lint and format the in-code documentation:
@@ -160,9 +193,9 @@ make format
 make lint
 ```

-## Verify Documentation Changes
+## Verify documentation changes

 After pushing documentation changes to the repository, you can preview and verify that the changes are
 what you wanted by clicking the `View deployment` or `Visit Preview` buttons on the pull request `Conversation` page.
 This will take you to a preview of the documentation changes.
-This preview is created by [Vercel](https://vercel.com/docs/getting-started-with-vercel).
+This preview is created by [Vercel](https://vercel.com/docs/getting-started-with-vercel).
--- a/docs/docs/contributing/how_to/documentation/style_guide.mdx
+++ b/docs/docs/contributing/how_to/documentation/style_guide.mdx
@@ -2,7 +2,7 @@
 sidebar_class_name: "hidden"
 ---

-# Documentation Style Guide
+# Documentation style guide

 As LangChain continues to grow, the amount of documentation required to cover the various concepts and integrations continues to grow too.
 This page provides guidelines for anyone writing documentation for LangChain and outlines some of our philosophies around
@@ -158,3 +158,5 @@ Be concise, including in code samples.
 - Use bullet points and numbered lists to break down information into easily digestible chunks
 - Use tables (especially for **Reference** sections) and diagrams often to present information visually
 - Include the table of contents for longer documentation pages to help readers navigate the content, but hide it for shorter pages
+
+Next, see the [documentation setup guide](setup.mdx) to get started with writing documentation for LangChain.
--- a/docs/docs/contributing/how_to/index.mdx
+++ b/docs/docs/contributing/how_to/index.mdx
@@ -1,4 +1,4 @@
-# How-to Guides
+# How-to guides

 - [**Documentation**](documentation/index.mdx): Help improve our docs, including this one!
 - [**Code**](code/index.mdx): Help us write code, fix bugs, or improve our infrastructure.
--- a/docs/docs/contributing/how_to/integrations/index.mdx
+++ b/docs/docs/contributing/how_to/integrations/index.mdx
@@ -3,7 +3,7 @@ pagination_prev: null
 pagination_next: contributing/how_to/integrations/package
 ---

-# Contribute Integrations
+# Contribute integrations

 Integrations are a core component of LangChain.
 LangChain provides standard interfaces for several different components (language models, vector stores, etc) that are crucial when building LLM applications.
@@ -16,7 +16,7 @@ LangChain provides standard interfaces for several different components (languag
 - **Best Practices:** Through their standard interface, LangChain components encourage and facilitate best practices (streaming, async, etc)


-## Components to Integrate
+## Components to integrate

 :::info

@@ -71,7 +71,7 @@ In order to contribute an integration, you should follow these steps:
 5. [Optional] Open and merge a PR to add documentation for your integration to the official LangChain docs.
 6. [Optional] Engage with the LangChain team for joint co-marketing ([see below](#co-marketing)).

-## Co-Marketing
+## Co-marketing

 With over 20 million monthly downloads, LangChain has a large audience of developers
 building LLM applications. Beyond just listing integrations, we aim to highlight
@@ -87,5 +87,5 @@ Here are some heuristics for types of content we are excited to promote:
 - **End-to-end applications:** End-to-end applications are great resources for developers looking to build. We prefer to highlight applications that are more complex/agentic in nature, and that use [LangGraph](https://github.com/langchain-ai/langgraph) as the orchestration framework. We get particularly excited about anything involving long-term memory, human-in-the-loop interaction patterns, or multi-agent architectures.
 - **Research:** We love highlighting novel research! Whether it is research built on top of LangChain or that integrates with it.

-## Further Reading
+## Further reading
 To get started, let's learn [how to implement an integration package](/docs/contributing/how_to/integrations/package/) for LangChain.
--- a/docs/docs/contributing/how_to/integrations/package.mdx
+++ b/docs/docs/contributing/how_to/integrations/package.mdx
@@ -358,7 +358,7 @@ a schema for the LLM to fill out when calling the tool. Similar to the `name` an
 description (part of `Field(..., description="description")`) are passed to the LLM, 
 and the values in these fields should be concise and LLM-usable.

-### Run Methods
+### Run methods

 `_run` is the main method that should be implemented in the subclass. This method
 takes in the arguments from `args_schema` and runs the tool, returning a string
@@ -469,6 +469,6 @@ import RetrieverSource from '/src/theme/integration_template/integration_templat

 ---

-## Next Steps
+## Next steps

 Now that you've implemented your package, you can move on to [testing your integration](../standard_tests) for your integration and successfully run them.
--- a/docs/docs/contributing/how_to/testing.mdx
+++ b/docs/docs/contributing/how_to/testing.mdx
@@ -10,7 +10,7 @@ Unit tests run on every pull request, so they should be fast and reliable.

 Integration tests run once a day, and they require more setup, so they should be reserved for confirming interface points with external services.

-## Unit Tests
+## Unit tests

 Unit tests cover modular logic that does not require calls to outside APIs.
 If you add new logic, please add a unit test.
@@ -27,19 +27,13 @@ To run unit tests:
 make test
 ```

-To run unit tests in Docker:
-
-```bash
-make docker_tests
-```
-
 To run a specific test:

 ```bash
 TEST_FILE=tests/unit_tests/test_imports.py make test
 ```

-## Integration Tests
+## Integration tests

 Integration tests cover logic that requires making calls to outside APIs (often integration with other services).
 If you add support for a new external API, please add a new integration test.
--- a/docs/docs/contributing/index.mdx
+++ b/docs/docs/contributing/index.mdx
@@ -12,7 +12,7 @@ More coming soon! We are working on tutorials to help you make your first contri

 - [**Make your first docs PR**](tutorials/docs.mdx)

-## How-to Guides
+## How-to guides

 - [**Documentation**](how_to/documentation/index.mdx): Help improve our docs, including this one!
 - [**Code**](how_to/code/index.mdx): Help us write code, fix bugs, or improve our infrastructure.
--- a/docs/docs/contributing/reference/repo_structure.mdx
+++ b/docs/docs/contributing/reference/repo_structure.mdx
@@ -50,7 +50,7 @@ There are other files in the root directory level, but their presence should be
 ## Documentation

 The `/docs` directory contains the content for the documentation that is shown
-at https://python.langchain.com/ and the associated API Reference https://python.langchain.com/api_reference/langchain/index.html.
+at [python.langchain.com](https://python.langchain.com/) and the associated [API Reference](https://python.langchain.com/api_reference/).

 See the [documentation](../how_to/documentation/index.mdx) guidelines to learn how to contribute to the documentation.

--- a/docs/docs/contributing/tutorials/docs.mdx
+++ b/docs/docs/contributing/tutorials/docs.mdx
@@ -8,7 +8,7 @@ This tutorial will guide you through making a simple documentation edit, like co

 ---

-## Editing a Documentation Page on GitHub
+## Editing a documentation page on GitHub

 Sometimes you want to make a small change, like fixing a typo, and the easiest way to do this is to use GitHub's editor directly.

@@ -42,10 +42,14 @@ Sometimes you want to make a small change, like fixing a typo, and the easiest w
   - Give your PR a title like `docs: Fix typo in X section`.
   - Follow the checklist in the PR description template.

-## Getting a Review
+## Getting a review

 Once you've submitted the pull request, it will be reviewed by the maintainers. You may receive feedback or requests for changes. Keep an eye on the PR to address any comments.

 Docs PRs are typically reviewed within a few days, but it may take longer depending on the complexity of the change and the availability of maintainers.

 For more information on reviews, see the [Review Process](../reference/review_process.mdx).
+
+## More information
+
+See our [how-to guides](../how_to/documentation/index.mdx) for more information on contributing to documentation.
--- a/docs/docs/integrations/providers/ibm.mdx
+++ b/docs/docs/integrations/providers/ibm.mdx
@@ -92,7 +92,7 @@ the support of DB2 vector store and vector search.

 See detailed usage examples in the guide [here](/docs/integrations/vectorstores/db2).

-Installation: This is a seperate package for vector store feature only and can be run
+Installation: This is a separate package for vector store feature only and can be run
 without the `langchain-ibm` package.
 ```bash
 pip install -U langchain-db2
--- a/docs/docs/integrations/providers/predictionguard.mdx
+++ b/docs/docs/integrations/providers/predictionguard.mdx
@@ -20,7 +20,7 @@ pip install langchain-predictionguard
 |---|---|---|---------------------------------------------------------|-------------------------------------------------------------------------------|
 |Chat|Build Chat Bots|[Chat](https://docs.predictionguard.com/api-reference/api-reference/chat-completions)| `from langchain_predictionguard import ChatPredictionGuard` | [ChatPredictionGuard.ipynb](/docs/integrations/chat/predictionguard)             |
 |Completions|Generate Text|[Completions](https://docs.predictionguard.com/api-reference/api-reference/completions)| `from langchain_predictionguard import PredictionGuard` | [PredictionGuard.ipynb](/docs/integrations/llms/predictionguard)                     |
-|Text Embedding|Embed String to Vectores|[Embeddings](https://docs.predictionguard.com/api-reference/api-reference/embeddings)| `from langchain_predictionguard import PredictionGuardEmbeddings` | [PredictionGuardEmbeddings.ipynb](/docs/integrations/text_embedding/predictionguard) |
+|Text Embedding|Embed String to Vectors|[Embeddings](https://docs.predictionguard.com/api-reference/api-reference/embeddings)| `from langchain_predictionguard import PredictionGuardEmbeddings` | [PredictionGuardEmbeddings.ipynb](/docs/integrations/text_embedding/predictionguard) |

 ## Getting Started

--- a/docs/docs/integrations/providers/premai.md
+++ b/docs/docs/integrations/providers/premai.md
@@ -1,7 +1,6 @@
 # PremAI

-[PremAI](https://premai.io/) is an all-in-one platform that simplifies the creation of robust, production-ready applications powered by Generative AI. By streamlining the development process, PremAI allows you to concentrate on enhancing user experience and driving overall growth for your application. You can quickly start using our platform [here](https://docs.premai.io/quick-start).
-
+[PremAI](https://premai.io/) is an all-in-one platform that simplifies the creation of robust, production-ready applications powered by Generative AI. By streamlining the development process, PremAI allows you to concentrate on enhancing user experience and driving overall growth for your application. You can quickly start using [our platform](https://docs.premai.io/quick-start).

 ## ChatPremAI

@@ -26,10 +25,9 @@ from langchain_community.chat_models import ChatPremAI

 Once we imported our required modules, let's setup our client. For now let's assume that our `project_id` is `8`. But make sure you use your project-id, otherwise it will throw error.

-To use langchain with prem, you do not need to pass any model name or set any parameters with our chat-client. By default it will use the model name and parameters used in the [LaunchPad](https://docs.premai.io/get-started/launchpad). 
-
-> Note: If you change the `model` or any other parameters like `temperature`  or `max_tokens` while setting the client, it will override existing default configurations, that was used in LaunchPad.   
+To use langchain with prem, you do not need to pass any model name or set any parameters with our chat-client. By default it will use the model name and parameters used in the [LaunchPad](https://docs.premai.io/get-started/launchpad).

+> Note: If you change the `model` or any other parameters like `temperature`  or `max_tokens` while setting the client, it will override existing default configurations, that was used in LaunchPad.

 ```python
 import os
@@ -43,9 +41,9 @@ chat = ChatPremAI(project_id=1234, model_name="gpt-4o")

 ### Chat Completions

-`ChatPremAI` supports two methods: `invoke` (which is the same as `generate`) and `stream`. 
+`ChatPremAI` supports two methods: `invoke` (which is the same as `generate`) and `stream`.

-The first one will give us a static result. Whereas the second one will stream tokens one by one. Here's how you can generate chat-like completions. 
+The first one gives us a static result, whereas the second one streams tokens one by one. Here's how you can generate chat-like completions.

 ```python
 human_message = HumanMessage(content="Who are you?")
@@ -72,18 +70,17 @@ chat.invoke(
 )
 ```

-> If you are going to place system prompt here, then it will override your system prompt that was fixed while deploying the application from the platform. 
+> If you pass a system prompt here, it will override the system prompt that was fixed while deploying the application from the platform.

 > You can find all the optional parameters [here](https://docs.premai.io/get-started/sdk#optional-parameters). Any parameters other than [these supported parameters](https://docs.premai.io/get-started/sdk#optional-parameters) will be automatically removed before calling the model.

-
 ### Native RAG Support with Prem Repositories

 Prem Repositories which allows users to upload documents (.txt, .pdf etc) and connect those repositories to the LLMs. You can think Prem repositories as native RAG, where each repository can be considered as a vector database. You can connect multiple repositories. You can learn more about repositories [here](https://docs.premai.io/get-started/repositories).

-Repositories are also supported in langchain premai. Here is how you can do it. 
+Repositories are also supported in LangChain PremAI. Here is how you can use them.

-```python 
+```python

 query = "Which models are used for dense retrieval"
 repository_ids = [1985,]
@@ -94,13 +91,13 @@ repositories = dict(
 )
 ```

-First we start by defining our repository with some repository ids. Make sure that the ids are valid repository ids. You can learn more about how to get the repository id [here](https://docs.premai.io/get-started/repositories). 
+First we start by defining our repository with some repository ids. Make sure that the ids are valid repository ids. You can learn more about how to get the repository id [here](https://docs.premai.io/get-started/repositories).

-> Please note: Similar like `model_name` when you invoke the argument `repositories`, then you are potentially overriding the repositories connected in the launchpad. 
+> Please note: As with `model_name`, when you pass the `repositories` argument, you are potentially overriding the repositories connected in the launchpad.

-Now, we connect the repository with our chat object to invoke RAG based generations. 
+Now, we connect the repository with our chat object to invoke RAG-based generations.

-```python 
+```python
 import json

 response = chat.invoke(query, max_tokens=100, repositories=repositories)
@@ -109,7 +106,7 @@ print(response.content)
 print(json.dumps(response.response_metadata, indent=4))
 ```

-This is how an output looks like. 
+This is what the output looks like.

 ```bash
 Dense retrieval models typically include:
@@ -134,11 +131,11 @@ Dense retrieval models typically include:

 So, this also means that you do not need to make your own RAG pipeline when using the Prem Platform. Prem uses it's own RAG technology to deliver best in class performance for Retrieval Augmented Generations.

-> Ideally, you do not need to connect Repository IDs here to get Retrieval Augmented Generations. You can still get the same result if you have connected the repositories in prem platform. 
+> Ideally, you do not need to connect Repository IDs here to get Retrieval Augmented Generations. You can still get the same result if you have connected the repositories in the Prem platform.

 ### Streaming

-In this section, let's see how we can stream tokens using langchain and PremAI. Here's how you do it. 
+In this section, let's see how we can stream tokens using langchain and PremAI. Here's how you do it.

 ```python
 import sys
@@ -163,16 +160,15 @@ for chunk in chat.stream(

 This will stream tokens one after the other.

-> Please note: As of now, RAG with streaming is not supported. However we still support it with our API. You can learn more about that [here](https://docs.premai.io/get-started/chat-completion-sse). 
-
+> Please note: As of now, RAG with streaming is not supported. However we still support it with our API. You can learn more about that [here](https://docs.premai.io/get-started/chat-completion-sse).

 ## Prem Templates

-Writing Prompt Templates can be super messy. Prompt templates are long, hard to manage, and must be continuously tweaked to improve and keep the same throughout the application. 
+Writing prompt templates can be messy. They are long, hard to manage, and must be continuously tweaked and kept consistent throughout the application.

-With **Prem**, writing and managing prompts can be super easy. The **_Templates_** tab inside the [launchpad](https://docs.premai.io/get-started/launchpad) helps you write as many prompts you need and use it inside the SDK to make your application running using those prompts. You can read more about Prompt Templates [here](https://docs.premai.io/get-started/prem-templates). 
+With **Prem**, writing and managing prompts is much easier. The **_Templates_** tab inside the [launchpad](https://docs.premai.io/get-started/launchpad) lets you write as many prompts as you need and use them from the SDK to run your application. You can read more about Prompt Templates [here](https://docs.premai.io/get-started/prem-templates).

-To use Prem Templates natively with LangChain, you need to pass an id the `HumanMessage`. This id should be the name the variable of your prompt template. the `content` in `HumanMessage` should be the value of that variable. 
+To use Prem Templates natively with LangChain, you need to pass an `id` to the `HumanMessage`. This `id` should be the name of the variable in your prompt template, and the `content` of the `HumanMessage` should be the value of that variable.

 let's say for example, if your prompt template was this:

@@ -198,7 +194,7 @@ template_id = "78069ce8-xxxxx-xxxxx-xxxx-xxx"
 response = chat.invoke([human_message], template_id=template_id)
 ```

-Prem Templates are also available for Streaming too. 
+Prem Templates are also available for streaming.

 ## Prem Embeddings

@@ -215,7 +211,7 @@ if os.environ.get("PREMAI_API_KEY") is None:

 ```

-We support lots of state of the art embedding models. You can view our list of supported LLMs and embedding models [here](https://docs.premai.io/get-started/supported-models). For now let's go for `text-embedding-3-large` model for this example. . 
+We support many state-of-the-art embedding models. You can view our list of supported LLMs and embedding models [here](https://docs.premai.io/get-started/supported-models). For this example, let's use the `text-embedding-3-large` model.

 ```python

@@ -231,7 +227,7 @@ print(query_result[:5])
 ```

 :::note
-Setting `model_name` argument in mandatory for PremAIEmbeddings unlike chat. 
+Unlike the chat client, `PremAIEmbeddings` requires the `model_name` argument.
 :::

 Finally, let's embed some sample document
@@ -254,11 +250,13 @@ print(doc_result[0][:5])
 ```python
 print(f"Dimension of embeddings: {len(query_result)}")
 ```
+
 Dimension of embeddings: 3072

 ```python
 doc_result[:5]
 ```
+
 >Result:
 >
 >[-0.02129288576543331,
@@ -269,20 +267,20 @@ doc_result[:5]

 ## Tool/Function Calling

-LangChain PremAI supports tool/function calling. Tool/function calling allows a model to respond to a given prompt by generating output that matches a user-defined schema. 
+LangChain PremAI supports tool/function calling. Tool/function calling allows a model to respond to a given prompt by generating output that matches a user-defined schema.

 - You can learn all about tool calling in details [in our documentation here](https://docs.premai.io/get-started/function-calling).
 - You can learn more about langchain tool calling in [this part of the docs](https://python.langchain.com/v0.1/docs/modules/model_io/chat/function_calling).

 **NOTE:**

-> The current version of LangChain ChatPremAI do not support function/tool calling with streaming support. Streaming support along with function calling will come soon. 
+> The current version of LangChain ChatPremAI does not support function/tool calling with streaming. Streaming support for function calling will come soon.

 ### Passing tools to model

-In order to pass tools and let the LLM choose the tool it needs to call, we need to pass a tool schema. A tool schema is the function definition along with proper docstring on what does the function do, what each argument of the function is etc. Below are some simple arithmetic functions with their schema. 
+To pass tools and let the LLM choose which one to call, we need to pass a tool schema. A tool schema is the function definition along with a proper docstring describing what the function does, what each of its arguments is, and so on. Below are some simple arithmetic functions with their schemas.

-**NOTE:** 
+**NOTE:**
 > When defining function/tool schema, do not forget to add information around the function arguments, otherwise it would throw error.

 ```python
@@ -320,27 +318,28 @@ def multiply(a: int, b: int) -> int:

 ### Binding tool schemas with our LLM

-We will now use the `bind_tools` method to convert our above functions to a "tool" and binding it with the model. This means we are going to pass these tool information everytime we invoke the model. 
+We will now use the `bind_tools` method to convert the functions above into tools and bind them to the model. This means the tool information is passed every time we invoke the model.

 ```python
 tools = [add, multiply]
 llm_with_tools = chat.bind_tools(tools)
 ```

-After this, we get the response from the model which is now binded with the tools. 
+After this, we get the response from the model, which is now bound to the tools.

-```python 
+```python
 query = "What is 3 * 12? Also, what is 11 + 49?"

 messages = [HumanMessage(query)]
 ai_msg = llm_with_tools.invoke(messages)
 ```

-As we can see, when our chat model is binded with tools, then based on the given prompt, it calls the correct set of the tools and sequentially. 
+As we can see, when our chat model is bound to tools, it calls the correct set of tools, in sequence, based on the given prompt.

-```python 
+```python
 ai_msg.tool_calls
 ```
+
 **Output**

 ```python
@@ -352,15 +351,15 @@ ai_msg.tool_calls
  'id': 'call_MPKYGLHbf39csJIyb5BZ9xIk'}]
 ```

-We append this message shown above to the LLM which acts as a context and makes the LLM aware that what all functions it has called. 
+We append the message shown above to the message list, which acts as context and makes the LLM aware of the functions it has called.

-```python 
+```python
 messages.append(ai_msg)
 ```

 Since tool calling happens into two phases, where:

-1. in our first call, we gathered all the tools that the LLM decided to tool, so that it can get the result as an added context to give more accurate and hallucination free result. 
+1. in our first call, we gathered all the tools that the LLM decided to call, so that it can use their results as added context to give a more accurate and hallucination-free result.

 2. in our second call, we will parse those set of tools decided by LLM and run them (in our case it will be the functions we defined, with the LLM's extracted arguments) and pass this result to the LLM

@@ -373,12 +372,13 @@ for tool_call in ai_msg.tool_calls:
    messages.append(ToolMessage(tool_output, tool_call_id=tool_call["id"]))
 ```

-Finally, we call the LLM (binded with the tools) with the function response added in it's context. 
+Finally, we call the LLM (bound to the tools) with the function responses added to its context.

 ```python
 response = llm_with_tools.invoke(messages)
 print(response.content)
 ```
+
 **Output**

 ```txt
@@ -425,4 +425,4 @@ chain.invoke(query)
 [multiply(a=3, b=12), add(a=11, b=49)]
 ```

-Now, as done above, we parse this and run this functions and call the LLM once again to get the result.
+Now, as done above, we parse this, run these functions, and call the LLM once again to get the result.
--- a/docs/scripts/kv_store_feat_table.py
+++ b/docs/scripts/kv_store_feat_table.py
@@ -1,9 +1,6 @@
 import sys
 from pathlib import Path

-from langchain_community import document_loaders
-from langchain_core.document_loaders.base import BaseLoader
-
 KV_STORE_TEMPLATE = """\
 ---
 sidebar_class_name: hidden
--- a/docs/scripts/notebook_convert.py
+++ b/docs/scripts/notebook_convert.py
@@ -175,8 +175,23 @@ def _modify_frontmatter(
 def _convert_notebook(
    notebook_path: Path, output_path: Path, intermediate_docs_dir: Path
 ) -> Path:
-    with open(notebook_path) as f:
-        nb = nbformat.read(f, as_version=4)
+    import json
+    import uuid
+
+    with open(notebook_path, "r", encoding="utf-8") as f:
+        nb_json = json.load(f)
+
+    # Fix missing and duplicate cell IDs before nbformat validation
+    seen_ids = set()
+    for cell in nb_json.get("cells", []):
+        if "id" not in cell or not cell.get("id") or cell.get("id") in seen_ids:
+            cell["id"] = str(uuid.uuid4())[:8]
+        seen_ids.add(cell["id"])
+
+    nb = nbformat.reads(json.dumps(nb_json), as_version=4)
+
+    # Upgrade notebook format
+    nb = nbformat.v4.upgrade(nb)

    body, resources = exporter.from_notebook_node(nb)

--- a/docs/sidebars.js
+++ b/docs/sidebars.js
@@ -315,8 +315,8 @@ module.exports = {
                },
              ],
              link: {
-                type: "doc",
-                id: "integrations/stores/index",
+                type: "generated-index",
+                slug: "integrations/stores",
              },
            },
            {
--- a/docs/vercel_overrides.txt
+++ b/docs/vercel_overrides.txt
@@ -4,3 +4,9 @@ aiohttp<3.11
 protobuf<3.21
 tenacity
 urllib3
+# Fix numpy conflicts between langchain-astradb and langchain-chroma
+numpy>=1.26.0,<2.0.0
+# Fix simsimd build error in langchain-weaviate
+simsimd>=5.0.0
+# Fix sentencepiece build error - use newer version that supports modern CMake
+sentencepiece>=0.2.1
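
The cell-ID repair added to `docs/scripts/notebook_convert.py` above can be exercised in isolation. The following is a minimal sketch, not the script's actual code: the helper name `fix_cell_ids` and the sample notebook dictionary are hypothetical, and the retry loop that guards against a freshly generated ID colliding with an existing one is an addition that the diff does not include. It works on a plain dictionary so it runs without `nbformat` installed.

```python
import uuid


def fix_cell_ids(nb_json: dict) -> dict:
    """Give a fresh ID to every cell whose ID is missing, empty, or repeated.

    Hypothetical helper mirroring the loop in the diff above.
    """
    seen_ids = set()
    for cell in nb_json.get("cells", []):
        cell_id = cell.get("id")
        if not cell_id or cell_id in seen_ids:
            # Same 8-character form as str(uuid.uuid4())[:8] in the diff.
            cell_id = uuid.uuid4().hex[:8]
            # Assumption beyond the diff: retry on the (unlikely) collision.
            while cell_id in seen_ids:
                cell_id = uuid.uuid4().hex[:8]
            cell["id"] = cell_id
        seen_ids.add(cell_id)
    return nb_json


# Sample input: one valid ID, one duplicate, one empty ID, one missing ID.
notebook = {"cells": [{"id": "abc"}, {"id": "abc"}, {"id": ""}, {}]}
fixed = fix_cell_ids(notebook)
ids = [cell["id"] for cell in fixed["cells"]]
print(ids)
```

The first occurrence of an ID is kept and only later duplicates are regenerated, so cells that were already valid keep their IDs.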