community: Google Vertex AI Search now returns the website title as part of the document metadata (#30688)

Google Vertex AI Search now returns the title of the matched website
as part of the document metadata, if available.


- **Description**: Vertex AI Search can be used to index websites and
then develop chatbots that use these websites to answer questions. At
present, the document metadata includes an `id` and `source` (which is
the URL). While the URL is enough to create a link, the ID is not
descriptive enough to show users. Therefore, I propose we return `title`
as well, when available (e.g., it will not be available in `.txt`
documents found during the website indexing).
- **Issue**: None. This is not a bug fix; it is a usability improvement
for applications that display search results.
- **Dependencies**: None
- **Twitter handle**: None; I do not use Twitter.
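
To illustrate the motivation in the description, here is a minimal sketch of how a chatbot front end might pick link text from the returned metadata. The helper names `link_label` and `markdown_link` are hypothetical and not part of LangChain; the metadata is modelled as a plain dict with the `id`, `source`, and optional `title` keys described above.

```python
from typing import Any, Mapping


def link_label(metadata: Mapping[str, Any]) -> str:
    """Pick human-readable link text for one search result (hypothetical helper)."""
    title = metadata.get("title")
    if title:
        return str(title)
    # No title (e.g. a .txt document found during indexing): fall back to the URL.
    return str(metadata.get("source", ""))


def markdown_link(metadata: Mapping[str, Any]) -> str:
    """Render the result as a Markdown link (hypothetical helper)."""
    return f"[{link_label(metadata)}]({metadata.get('source', '')})"
```

Falling back to `source` rather than `id` keeps the label meaningful to users when the indexed document has no title.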

Format, lint, and test checks all appear to pass.

German Molina 2025-04-10 00:54:06 +12:00 committed by GitHub
parent 636d831d27
commit 5fb261ce27

@@ -167,6 +167,8 @@ class _BaseGoogleVertexAISearchRetriever(BaseModel):
             doc_metadata = document_dict.get("struct_data", {})
             doc_metadata["id"] = document_dict["id"]
             doc_metadata["source"] = derived_struct_data.get("link", "")
+            if derived_struct_data.get("title") is not None:
+                doc_metadata["title"] = derived_struct_data.get("title")

             if chunk_type not in derived_struct_data:
                 continue
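
The metadata logic in the hunk above can be read in isolation as the following sketch. The function name `build_doc_metadata` is hypothetical (the real code runs inline inside the retriever's result loop), and unlike the original it copies `struct_data` instead of mutating it, purely to keep the example free of side effects.

```python
from typing import Any, Dict


def build_doc_metadata(
    document_dict: Dict[str, Any], derived_struct_data: Dict[str, Any]
) -> Dict[str, Any]:
    """Assemble per-document metadata the way the hunk does (illustrative only)."""
    # Copy so the caller's dict is untouched; the original code mutates in place.
    doc_metadata = dict(document_dict.get("struct_data", {}))
    doc_metadata["id"] = document_dict["id"]
    doc_metadata["source"] = derived_struct_data.get("link", "")
    # New in this commit: add the title only when the search result provides one.
    title = derived_struct_data.get("title")
    if title is not None:
        doc_metadata["title"] = title
    return doc_metadata
```

Because the key is only set when a title exists, downstream code should read it with `metadata.get("title")` rather than indexing directly.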