mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-02 03:26:17 +00:00
docs: Add documentation of ElasticsearchStore.BM25RetrievalStrategy (#20098)
This pull request follows up on https://github.com/langchain-ai/langchain/pull/19314 and https://github.com/langchain-ai/langchain-elastic/pull/6, adding documentation for `ElasticsearchStore.BM25RetrievalStrategy` so that it is documented like the other retrieval strategies.

### Background

- The `BM25RetrievalStrategy` was introduced to `langchain-elastic` via the pull request https://github.com/langchain-ai/langchain-elastic/pull/6.
- That PR was initially created in the main `langchain` repository but was moved to `langchain-elastic` during the review process because the partner package had been migrated.
- The original PR can be found at https://github.com/langchain-ai/langchain/pull/19314.
- As [commented](https://github.com/langchain-ai/langchain/pull/19314#issuecomment-2023202401) by @joemcelroy, documenting the new retrieval strategy is part of the requirements for its introduction.

Although the `BM25RetrievalStrategy` has been merged into `langchain-elastic`, its documentation is still maintained in the main `langchain` repository. Therefore, this pull request adds the documentation portion of `BM25RetrievalStrategy`.

The content of the documentation remains the same as that included in the original PR, https://github.com/langchain-ai/langchain/pull/19314.

---------

Co-authored-by: Max Jakob <max.jakob@elastic.co>
@@ -736,6 +736,50 @@
     "```"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "05cdb43d-5e46-46f6-a2dc-91df4aa56ec7",
+   "metadata": {},
+   "source": [
+    "## BM25RetrievalStrategy\n",
+    "This strategy allows the user to perform searches using pure BM25 without vector search.\n",
+    "\n",
+    "To use this, specify `BM25RetrievalStrategy` in `ElasticsearchStore` constructor.\n",
+    "\n",
+    "Note that in the example below, the embedding option is not specified, indicating that the search is conducted without using embeddings."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "4464a657-08c5-4a1a-b0e8-dba65f5b7ec0",
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "[Document(page_content='foo'), Document(page_content='foo bar'), Document(page_content='foo bar baz')]\n"
+     ]
+    }
+   ],
+   "source": [
+    "from langchain_elasticsearch import ElasticsearchStore\n",
+    "\n",
+    "db = ElasticsearchStore(\n",
+    "    es_url=\"http://localhost:9200\",\n",
+    "    index_name=\"test_index\",\n",
+    "    strategy=ElasticsearchStore.BM25RetrievalStrategy(),\n",
+    ")\n",
+    "\n",
+    "db.add_texts(\n",
+    "    [\"foo\", \"foo bar\", \"foo bar baz\", \"bar\", \"bar baz\", \"baz\"],\n",
+    ")\n",
+    "\n",
+    "results = db.similarity_search(query=\"foo\", k=10)\n",
+    "print(results)"
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "0960fa0a",
@@ -993,7 +1037,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.7"
+   "version": "3.11.8"
   }
  },
  "nbformat": 4,
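For readers wondering why the notebook cell returns `foo`, `foo bar`, `foo bar baz` in that order, the following is a minimal, self-contained sketch of BM25 scoring over the same six texts. It is an illustration only and not how Elasticsearch or `langchain-elasticsearch` implements the strategy: the helper name `bm25_scores`, the whitespace tokenizer, and the defaults `k1=1.2` and `b=0.75` are assumptions made for this example, and a real cluster computes statistics per shard with its own analyzers.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a simplified BM25 formula.

    Hypothetical helper for illustration; tokenization is a plain whitespace split.
    """
    tokenized = [doc.split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    scores = []
    for tokens in tokenized:
        term_freqs = Counter(tokens)
        score = 0.0
        for term in query.split():
            # Document frequency: how many documents contain the term.
            df = sum(1 for other in tokenized if term in other)
            if df == 0:
                continue
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            tf = term_freqs[term]
            # Length normalization: longer documents are penalized.
            norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf / (tf + norm)
        scores.append(score)
    return scores


docs = ["foo", "foo bar", "foo bar baz", "bar", "bar baz", "baz"]
scores = bm25_scores("foo", docs)

# Keep only documents that matched the query, highest score first.
ranked = [
    doc
    for score, doc in sorted(zip(scores, docs), key=lambda pair: pair[0], reverse=True)
    if score > 0
]
print(ranked)
```

Because every matching document contains `foo` exactly once, the ranking in this sketch is driven entirely by length normalization: the shortest matching text scores highest, and texts without `foo` score zero and are dropped.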