langchain/libs/core/langchain_core/documents
Eugene Yurtsev 96b72edac8
core[minor]: Add optional ID field to Document schema (#23411)
This PR adds an optional ID field to the document schema.

# 1. Optional or Required

- An optional field will will requrie additional checking for the type
in user code (annoying).
- However, vectorstores currently don't respect this field. So if we
make it
required and start returning random UUIDs that might be even more
confusing
  to users.


**Proposal**: Start with Optional and convert to Required (with default
set to uuid4()) in 1-2 major releases.


# 2. Override __str__ or generic solution in prompts

Overriding __str__ as a simple way to avoid changing user code that
relies on
default str(document) in prompts. 


I considered rolling out a more general solution in prompts
(https://github.com/langchain-ai/langchain/pull/8685),
but to do that we need to:

1. Make things serializable
2. The more general solution would likely need to be backwards
compatible as well
3. It's unclear that one wants to format a List[int] in the same way as
List[Document]. The former should be `,` seperated (likely), the latter
   should be `---` separated (likely).


**Proposal** Start with __str__ override and focus on the vectorstore
APIs, we generalize prompts later
2024-06-27 12:15:58 -04:00
..
__init__.py core[minor]: move document compressor base (#17910) 2024-02-26 17:20:50 -08:00
base.py core[minor]: Add optional ID field to Document schema (#23411) 2024-06-27 12:15:58 -04:00
compressor.py core[patch]: Add doc-string to document compressor (#23085) 2024-06-19 11:03:49 -04:00
transformers.py