langchain/libs/community/langchain_community/retrievers
German Molina 5fb261ce27
community: Google Vertex AI Search now returns the website title as part of the document metadata (#30688)
Google Vertex AI Search now returns the title of the matched website
as part of the document metadata, when available.

- **Description**: Vertex AI Search can index websites, and chatbots can
then use the indexed pages to answer questions. At present, the document
metadata includes only an `id` and a `source` (the URL). The URL is
enough to create a link, but the ID is not descriptive enough to show to
users. This change therefore also returns `title` when available (it is
not available for, e.g., `.txt` documents found during website
indexing).
- **Issue**: None. This is an enhancement rather than a bug fix.
- **Dependencies**: None
- I do not use Twitter.

Format, lint, and test checks all appear to pass.
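
The metadata behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in `google_vertex_ai_search.py`: the helper name `build_metadata` is hypothetical, and the `link` and `title` keys of the derived data are assumptions about how a website search result is shaped.

```python
from typing import Any, Dict


def build_metadata(doc_id: str, derived_struct_data: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: assemble document metadata for one website result.

    Assumes the URL is stored under "link" and the page title under "title".
    """
    # `id` and `source` are always present in the metadata.
    metadata: Dict[str, Any] = {
        "id": doc_id,
        "source": derived_struct_data.get("link", ""),
    }
    # Add `title` only when the result carries one (e.g. not for .txt files).
    title = derived_struct_data.get("title")
    if title:
        metadata["title"] = title
    return metadata


# An HTML page usually has a title; a plain-text file does not.
html_hit = build_metadata(
    "abc123", {"link": "https://example.com/page", "title": "Example Page"}
)
txt_hit = build_metadata("def456", {"link": "https://example.com/robots.txt"})
```

Under these assumptions, a caller rendering search results can show `metadata.get("title", metadata["source"])` so that untitled documents fall back to the URL.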
2025-04-09 08:54:06 -04:00
..
__init__.py community: add Needle retriever and document loader integration (#28157) 2024-12-03 22:06:25 +00:00
arcee.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
arxiv.py docs: update integration api refs (#25195) 2024-08-09 12:27:32 +00:00
asknews.py community[minor]: add AskNews retriever and AskNews tool (#21581) 2024-05-20 18:23:06 -07:00
azure_ai_search.py multiple: update docs urls to latest 2 (#26837) 2024-09-30 17:37:07 -07:00
bedrock.py community: Additional AWS deprecations (#29447) 2025-01-28 09:50:14 -05:00
bm25.py community: BM25Retriever preservation of document id (#27019) 2024-12-04 00:36:00 +00:00
breebs.py community[patch]: Add missing type annotations (#22758) 2024-06-10 16:59:28 -04:00
chaindesk.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
chatgpt_plugin_retriever.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
cohere_rag_retriever.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
databerry.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
docarray.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
dria_index.py community[patch]: upgrade to recent version of mypy (#21616) 2024-05-13 14:55:07 -04:00
elastic_search_bm25.py community[minor]: import fix (#20995) 2024-04-29 10:32:50 -04:00
embedchain.py community[patch]: Fixing embedchain document mapping (#18255) 2024-02-29 14:54:37 -08:00
google_cloud_documentai_warehouse.py multiple: update removal targets (#25361) 2024-08-14 09:50:39 -04:00
google_vertex_ai_search.py community: Google Vertex AI Search now returns the website title as part of the document metadata (#30688) 2025-04-09 08:54:06 -04:00
kay.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
kendra.py community: Additional AWS deprecations (#29447) 2025-01-28 09:50:14 -05:00
knn.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
llama_index.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
metal.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
milvus.py multiple: update docs urls to latest 2 (#26837) 2024-09-30 17:37:07 -07:00
nanopq.py community: Bump ruff version to 0.9 (#29206) 2025-02-08 01:21:10 +00:00
needle.py community: add top_k as param to Needle Retriever (#29821) 2025-02-16 08:30:52 -05:00
outline.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
pinecone_hybrid_search.py community: Add configurable text key for indexing and the retriever in Pinecone Hybrid Search (#29697) 2025-02-10 08:56:37 -05:00
pubmed.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
pupmed.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
qdrant_sparse_vector_retriever.py multiple: update docs urls to latest 2 (#26837) 2024-09-30 17:37:07 -07:00
rememberizer.py community[minor]: Rememberizer retriever (#20052) 2024-05-01 10:41:44 -04:00
remote_retriever.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
svm.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
tavily_search_api.py community[fix]: Handle None value in raw_content from Tavily API response (#30021) 2025-02-27 10:53:53 -05:00
tfidf.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
thirdai_neuraldb.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
vespa_retriever.py community[major], core[patch], langchain[patch], experimental[patch]: Create langchain-community (#14463) 2023-12-11 13:53:30 -08:00
weaviate_hybrid_search.py weaviate: Add-deprecation-warning (#29757) 2025-02-16 21:42:18 -05:00
web_research.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
wikipedia.py docs: update integration api refs (#25195) 2024-08-09 12:27:32 +00:00
you.py community[patch]: docstrings update (#20301) 2024-04-11 16:23:27 -04:00
zep_cloud.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
zep.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00
zilliz.py multiple: pydantic 2 compatibility, v0.3 (#26443) 2024-09-13 14:38:45 -07:00