docs: hide jsx in llm chain tutorial (#30187)

## **Description:** 
The Jupyter notebooks in the docs section are extremely useful and
critical for widespread adoption of LangChain amongst new developers.
However, because they are also converted to MDX and used to build the
HTML for the Docusaurus site, they contain JSX code that degrades
readability when opened in a "notebook" setting (local notebook server,
Google Colab, etc.). For instance, here we see the website, with a nice
React tab component for installation instructions (`pip` vs `conda`):

![Screenshot 2025-03-07 at 2 07
15 PM](https://github.com/user-attachments/assets/a528d618-f5a0-4d2e-9aed-16d4b8148b5a)

Now, here is the same notebook viewed in Colab:

![Screenshot 2025-03-07 at 2 08
41 PM](https://github.com/user-attachments/assets/87acf5b7-a3e0-46ac-8126-6cac6eb93586)

Note that the text following "To install LangChain run:" contains
snippets of JSX code that are (i) confusing, (ii) bad for readability,
and (iii) potentially misleading for a novice developer, who might read
them literally as "to install LangChain I should run `import Tabs
from...`", followed by an ill-formed command that mixes the `pip` and
`conda` installation instructions.

Ideally, we would like a system that presents a similar or equivalent UI
whether the notebooks are viewed on the documentation site or run in a
notebook setting. At a minimum, we should not present ill-formed JSX
snippets to someone trying to execute the notebooks. As the
documentation itself states, running the notebooks yourself is a great
way to learn the tools, and these distracting, ill-formed snippets work
against that goal.

## **Fixes:**
* Comment out the JSX code inside the notebook
`docs/tutorials/llm_chain` with a special directive `<!-- HIDE_IN_NB`
(closed with `HIDE_IN_NB -->`). This makes the JSX code "invisible" when
viewed in a notebook setting.
* Add a custom preprocessor (`UnHidePreprocessor`) whose
`preprocess_cell` method simply strips these marker strings, so that the
JSX is still rendered when the notebook is converted to MDX.
* Minor tweak: Refactor some of the Markdown instructions into an
executable code block for a better experience when running as a notebook.
* Minor tweak: Optionally load the environment variables from a `.env`
file in the repo so the user doesn't have to enter them every time. This
depends on the user installing `python-dotenv` and adding their own
`.env` file.
* Add an environment variable for "LANGSMITH_PROJECT"
(default="default"), per the LangSmith docs, so a local user can target
a specific project in their LangSmith account.
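
The first two fixes can be illustrated with a minimal, standalone sketch. The two marker strings are the ones used in this PR; the `unhide` helper and the sample cell text are hypothetical and only mimic what the preprocessor does to a Markdown cell's source, without depending on `nbconvert`:

```python
# Marker strings as defined in this PR.
HIDE_IN_NB_MAGIC_OPEN = "<!-- HIDE_IN_NB"
HIDE_IN_NB_MAGIC_CLOSE = "HIDE_IN_NB -->"


def unhide(source: str) -> str:
    """Hypothetical helper: strip both markers so the wrapped JSX is emitted as-is."""
    source = source.replace(HIDE_IN_NB_MAGIC_OPEN, "")
    return source.replace(HIDE_IN_NB_MAGIC_CLOSE, "")


# Illustrative cell source: in a notebook viewer, everything between the
# markers is an HTML comment and therefore not rendered.
cell_source = (
    "To install LangChain run:\n"
    "\n"
    "<!-- HIDE_IN_NB\n"
    "import Tabs from '@theme/Tabs';\n"
    "HIDE_IN_NB -->"
)

# During conversion to MDX the markers are removed and the JSX becomes live again.
converted = unhide(cell_source)
print(converted)
```

Because the markers are plain strings removed with `str.replace`, the opening marker must appear exactly as `<!-- HIDE_IN_NB`; any extra character after it would survive into the MDX output.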

**NOTE:** If this PR is approved, and the maintainers agree with the
general goal of aligning the notebook execution experience and the doc
site UI, I plan to apply the same approach to the remaining JSX snippets
scattered throughout the notebooks.

**NOTE:** I wasn't able to/don't know how to run the linkcheck Makefile
commands.

- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Really Him <hesereallyhim@proton.me>
Really Him 2025-03-26 14:22:33 -04:00 committed by GitHub
parent 8e5d2a44ce
commit fbd2e10703
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
2 changed files with 71 additions and 10 deletions

@@ -39,6 +39,7 @@
 "\n",
 "To install LangChain run:\n",
 "\n",
+"<!-- HIDE_IN_NB\n",
 "import Tabs from '@theme/Tabs';\n",
 "import TabItem from '@theme/TabItem';\n",
 "import CodeBlock from \"@theme/CodeBlock\";\n",
@@ -51,9 +52,28 @@
 " <CodeBlock language=\"bash\">conda install langchain -c conda-forge</CodeBlock>\n",
 " </TabItem>\n",
 "</Tabs>\n",
+"HIDE_IN_NB -->"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 1,
+"id": "86874822",
+"metadata": {},
+"outputs": [],
+"source": [
+"# | output: false\n",
 "\n",
-"\n",
-"\n",
+"# %pip install langchain\n",
+"# OR\n",
+"# %conda install langchain -c conda-forge"
+]
+},
+{
+"cell_type": "markdown",
+"id": "a546a5bc",
+"metadata": {},
+"source": [
 "For more details, see our [Installation guide](/docs/how_to/installation).\n",
 "\n",
 "### LangSmith\n",
@@ -67,17 +87,45 @@
 "```shell\n",
 "export LANGSMITH_TRACING=\"true\"\n",
 "export LANGSMITH_API_KEY=\"...\"\n",
+"export LANGSMITH_PROJECT=\"default\" # or any other project name\n",
 "```\n",
 "\n",
-"Or, if in a notebook, you can set them with:\n",
-"\n",
-"```python\n",
+"Or, if in a notebook, you can set them with:"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 4,
+"id": "599bb688",
+"metadata": {},
+"outputs": [],
+"source": [
 "import getpass\n",
 "import os\n",
 "\n",
+"try:\n",
+" # load environment variables from .env file (requires `python-dotenv`)\n",
+" from dotenv import load_dotenv\n",
+"\n",
+" load_dotenv()\n",
+"except ImportError:\n",
+" pass\n",
+"\n",
 "os.environ[\"LANGSMITH_TRACING\"] = \"true\"\n",
-"os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass()\n",
-"```"
+"if \"LANGSMITH_API_KEY\" not in os.environ:\n",
+" os.environ[\"LANGSMITH_API_KEY\"] = getpass.getpass(\n",
+" prompt=\"Enter your LangSmith API key (optional): \"\n",
+" )\n",
+"if \"LANGSMITH_PROJECT\" not in os.environ:\n",
+" os.environ[\"LANGSMITH_PROJECT\"] = getpass.getpass(\n",
+" prompt='Enter your LangSmith Project Name (default = \"default\"): '\n",
+" )\n",
+" if not os.environ.get(\"LANGSMITH_PROJECT\"):\n",
+" os.environ[\"LANGSMITH_PROJECT\"] = \"default\"\n",
+"if \"OPENAI_API_KEY\" not in os.environ:\n",
+" os.environ[\"OPENAI_API_KEY\"] = getpass.getpass(\n",
+" prompt=\"Enter your OpenAI API key (required if using OpenAI): \"\n",
+" )"
 ]
 },
 {
@@ -89,9 +137,11 @@
 "\n",
 "First up, let's learn how to use a language model by itself. LangChain supports many different language models that you can use interchangeably. For details on getting started with a specific model, refer to [supported integrations](/docs/integrations/chat/).\n",
 "\n",
+"<!-- HIDE_IN_NB>\n",
 "import ChatModelTabs from \"@theme/ChatModelTabs\";\n",
 "\n",
-"<ChatModelTabs overrideParams={{openai: {model: \"gpt-4o-mini\"}}} />\n"
+"<ChatModelTabs overrideParams={{openai: {model: \"gpt-4o-mini\"}}} />\n",
+"HIDE_IN_NB -->"
 ]
 },
 {

@@ -9,9 +9,12 @@ import nbformat
 from nbconvert.exporters import MarkdownExporter
 from nbconvert.preprocessors import Preprocessor
+HIDE_IN_NB_MAGIC_OPEN = "<!-- HIDE_IN_NB"
+HIDE_IN_NB_MAGIC_CLOSE = "HIDE_IN_NB -->"
 class EscapePreprocessor(Preprocessor):
-    def preprocess_cell(self, cell, resources, cell_index):
+    def preprocess_cell(self, cell, resources, index):
         if cell.cell_type == "markdown":
             # rewrite .ipynb links to .md
             cell.source = re.sub(
@@ -61,7 +64,7 @@ class ExtractAttachmentsPreprocessor(Preprocessor):
     outputs are returned in the 'resources' dictionary.
     """
-    def preprocess_cell(self, cell, resources, cell_index):
+    def preprocess_cell(self, cell, resources, index):
         """
         Apply a transformation on each cell,
         Parameters
@@ -117,11 +120,19 @@ class CustomRegexRemovePreprocessor(Preprocessor):
         return nb, resources
+class UnHidePreprocessor(Preprocessor):
+    def preprocess_cell(self, cell, resources, index):
+        cell.source = cell.source.replace(HIDE_IN_NB_MAGIC_OPEN, "")
+        cell.source = cell.source.replace(HIDE_IN_NB_MAGIC_CLOSE, "")
+        return cell, resources
 exporter = MarkdownExporter(
     preprocessors=[
         EscapePreprocessor,
         ExtractAttachmentsPreprocessor,
         CustomRegexRemovePreprocessor,
+        UnHidePreprocessor,
     ],
     template_name="mdoutput",
     extra_template_basedirs=["./scripts/notebook_convert_templates"],