langchain/docs/modules/indexes/text_splitters/examples
Jens Madsen 8d9e9e013c
refactor: extract token text splitter function (#5179)
# Token text splitter for sentence transformers

The current `TokenTextSplitter` only works with OpenAI models via the
`tiktoken` package. This is not clear from the name `TokenTextSplitter`.
This (first) PR adds a token-based text splitter for sentence transformer
models. In the future, I think we should work towards injecting
a tokenizer into the `TokenTextSplitter` to make it more flexible.
Could perhaps be reviewed by @dev2049
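
A minimal sketch of what "injecting a tokenizer" could look like, assuming the splitter only needs an `encode` and a `decode` callable. The names `split_by_tokens` and `make_whitespace_tokenizer` are hypothetical and are not taken from this PR or from the library; the whitespace tokenizer is a toy stand-in for `tiktoken` or a sentence transformer tokenizer.

```python
from typing import Callable, List, Tuple


def split_by_tokens(
    text: str,
    encode: Callable[[str], List[int]],
    decode: Callable[[List[int]], str],
    tokens_per_chunk: int,
    chunk_overlap: int = 0,
) -> List[str]:
    """Split text into chunks of at most `tokens_per_chunk` tokens.

    Hypothetical helper: the tokenizer is injected as two callables,
    so the splitting logic does not depend on any specific model.
    """
    if tokens_per_chunk <= 0:
        raise ValueError("tokens_per_chunk must be positive")
    if not 0 <= chunk_overlap < tokens_per_chunk:
        raise ValueError("chunk_overlap must be in [0, tokens_per_chunk)")

    token_ids = encode(text)
    step = tokens_per_chunk - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(token_ids), step):
        chunks.append(decode(token_ids[start:start + tokens_per_chunk]))
        # Stop once the current window has reached the end of the input.
        if start + tokens_per_chunk >= len(token_ids):
            break
    return chunks


def make_whitespace_tokenizer(
    corpus: str,
) -> Tuple[Callable[[str], List[int]], Callable[[List[int]], str]]:
    """Toy tokenizer (assumption for illustration): one token per word."""
    vocab = sorted(set(corpus.split()))
    word_to_id = {word: i for i, word in enumerate(vocab)}

    def encode(s: str) -> List[int]:
        return [word_to_id[word] for word in s.split()]

    def decode(ids: List[int]) -> str:
        return " ".join(vocab[i] for i in ids)

    return encode, decode


text = "a b c d e f g"
encode, decode = make_whitespace_tokenizer(text)
chunks = split_by_tokens(text, encode, decode, tokens_per_chunk=3, chunk_overlap=1)
print(chunks)
```

With this shape, swapping in a different tokenizer would only mean passing a different `encode`/`decode` pair; the chunking loop itself stays unchanged.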

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-06-04 14:41:44 -07:00
| File | Last commit | Date |
| --- | --- | --- |
| character_text_splitter.ipynb | docs: text splitters improvements (#4490) | 2023-05-17 21:33:34 -07:00 |
| code_splitter.ipynb | code splitter docs (#5480) | 2023-05-31 07:11:53 -07:00 |
| huggingface_length_function.ipynb | docs: text splitters improvements (#4490) | 2023-05-17 21:33:34 -07:00 |
| nltk.ipynb | docs: text splitters improvements (#4490) | 2023-05-17 21:33:34 -07:00 |
| recursive_text_splitter.ipynb | docs: text splitters improvements (#4490) | 2023-05-17 21:33:34 -07:00 |
| sentence_transformer_token_splitter.ipynb | refactor: extract token text splitter function (#5179) | 2023-06-04 14:41:44 -07:00 |
| spacy.ipynb | docs: text splitters improvements (#4490) | 2023-05-17 21:33:34 -07:00 |
| tiktoken_splitter.ipynb | docs: text splitters improvements (#4490) | 2023-05-17 21:33:34 -07:00 |
| tiktoken.ipynb | docs: text splitters improvements (#4490) | 2023-05-17 21:33:34 -07:00 |