mirror of https://github.com/hwchase17/langchain.git, synced 2025-08-31 02:11:09 +00:00
docs: Add documentation of ElasticsearchStore.BM25RetrievalStrategy (#20098)
This pull request follows up on https://github.com/langchain-ai/langchain/pull/19314 and https://github.com/langchain-ai/langchain-elastic/pull/6 by adding documentation for `ElasticsearchStore.BM25RetrievalStrategy`. `BM25RetrievalStrategy` is now introduced in the documentation alongside the other retrieval strategies.

### Background

- The `BM25RetrievalStrategy` was introduced to `langchain-elastic` in https://github.com/langchain-ai/langchain-elastic/pull/6.
- That PR was initially opened against the main `langchain` repository but was moved to `langchain-elastic` during review because the partner package was migrated.
- The original PR can be found at https://github.com/langchain-ai/langchain/pull/19314.
- As [commented](https://github.com/langchain-ai/langchain/pull/19314#issuecomment-2023202401) by @joemcelroy, documenting the new retrieval strategy is part of the requirements for its introduction.

Although `BM25RetrievalStrategy` has been merged into `langchain-elastic`, its documentation is still maintained in the main `langchain` repository. This pull request therefore adds the documentation portion of `BM25RetrievalStrategy`. The content of the documentation is the same as in the original PR, https://github.com/langchain-ai/langchain/pull/19314.

---------

Co-authored-by: Max Jakob <max.jakob@elastic.co>
@@ -736,6 +736,50 @@
 "```"
 ]
 },
+{
+"cell_type": "markdown",
+"id": "05cdb43d-5e46-46f6-a2dc-91df4aa56ec7",
+"metadata": {},
+"source": [
+"## BM25RetrievalStrategy\n",
+"This strategy allows the user to perform searches using pure BM25 without vector search.\n",
+"\n",
+"To use this, specify `BM25RetrievalStrategy` in `ElasticsearchStore` constructor.\n",
+"\n",
+"Note that in the example below, the embedding option is not specified, indicating that the search is conducted without using embeddings."
+]
+},
+{
+"cell_type": "code",
+"execution_count": 1,
+"id": "4464a657-08c5-4a1a-b0e8-dba65f5b7ec0",
+"metadata": {},
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"[Document(page_content='foo'), Document(page_content='foo bar'), Document(page_content='foo bar baz')]\n"
+]
+}
+],
+"source": [
+"from langchain_elasticsearch import ElasticsearchStore\n",
+"\n",
+"db = ElasticsearchStore(\n",
+"    es_url=\"http://localhost:9200\",\n",
+"    index_name=\"test_index\",\n",
+"    strategy=ElasticsearchStore.BM25RetrievalStrategy(),\n",
+")\n",
+"\n",
+"db.add_texts(\n",
+"    [\"foo\", \"foo bar\", \"foo bar baz\", \"bar\", \"bar baz\", \"baz\"],\n",
+")\n",
+"\n",
+"results = db.similarity_search(query=\"foo\", k=10)\n",
+"print(results)"
+]
+},
 {
 "cell_type": "markdown",
 "id": "0960fa0a",
@@ -993,7 +1037,7 @@
 "name": "python",
 "nbconvert_exporter": "python",
 "pygments_lexer": "ipython3",
-"version": "3.9.7"
+"version": "3.11.8"
 }
 },
 "nbformat": 4,
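
The notebook cell added in the first hunk prints only the three documents containing `foo`, shortest first. As a rough illustration of why pure BM25 produces that order, here is a minimal sketch of classic Okapi BM25 scoring over the same six texts. It is not the Elasticsearch or Lucene implementation: the function name `bm25_rank` is hypothetical, tokenization is assumed to be a plain whitespace split, and `k1=1.2`, `b=0.75` are assumed to match Elasticsearch's default BM25 parameters. Actual scores from a live cluster may differ (for example across shards), although the relative order for this corpus should be the same.

```python
import math
from collections import Counter


def bm25_rank(docs, query, k1=1.2, b=0.75):
    """Rank docs against query with classic Okapi BM25 (illustrative sketch only)."""
    tokenized = [doc.split() for doc in docs]  # assumption: whitespace tokenization
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scored = []
    for doc, tokens in zip(docs, tokenized):
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.split():
            tf = term_freq[term]
            if tf == 0:
                continue
            df = doc_freq[term]
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Length normalization: longer documents are penalized.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        if score > 0:  # documents with no matching term are not returned
            scored.append((score, doc))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored]


corpus = ["foo", "foo bar", "foo bar baz", "bar", "bar baz", "baz"]
ranked = bm25_rank(corpus, "foo")
print(ranked)
```

Every matching document contains `foo` exactly once, so the inverse document frequency and term frequency are identical across them and only the length normalization separates the scores. That is why the shortest text ranks first, and why no embedding model is needed for this strategy.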