mirror of
https://github.com/hwchase17/langchain.git
synced 2025-05-18 13:31:36 +00:00
# Token text splitter for sentence transformers The current TokenTextSplitter only works with OpenAI models via the `tiktoken` package, which is not clear from the name `TokenTextSplitter`. This (first) PR adds a token-based text splitter for sentence transformer models. In the future, I think we should work towards injecting a tokenizer into the TokenTextSplitter to make it more flexible. Could perhaps be reviewed by @dev2049 --------- Co-authored-by: Harrison Chase <hw.chase.17@gmail.com> |
||
---|---|---|
character_text_splitter.ipynb | ||
code_splitter.ipynb | ||
huggingface_length_function.ipynb | ||
nltk.ipynb | ||
recursive_text_splitter.ipynb | ||
sentence_transformer_token_splitter.ipynb | ||
spacy.ipynb | ||
tiktoken_splitter.ipynb | ||
tiktoken.ipynb |
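
The commit message above suggests eventually injecting a tokenizer into the token splitter instead of hard-wiring `tiktoken`. The following is a minimal sketch of that idea, not the library's implementation: `split_text_on_tokens`, its parameters, and the toy `WhitespaceTokenizer` are hypothetical names introduced here for illustration. The splitter only assumes the tokenizer exposes an `encode` callable (text to token ids) and a `decode` callable (token ids to text).

```python
from typing import Callable, Dict, List


def split_text_on_tokens(
    text: str,
    encode: Callable[[str], List[int]],
    decode: Callable[[List[int]], str],
    tokens_per_chunk: int,
    chunk_overlap: int = 0,
) -> List[str]:
    """Split text into chunks of at most tokens_per_chunk tokens.

    Hypothetical helper: the tokenizer is injected through the encode and
    decode callables, so the splitter itself is tokenizer-agnostic.
    """
    if tokens_per_chunk <= 0:
        raise ValueError("tokens_per_chunk must be positive")
    if not 0 <= chunk_overlap < tokens_per_chunk:
        raise ValueError("chunk_overlap must be in [0, tokens_per_chunk)")

    token_ids = encode(text)
    step = tokens_per_chunk - chunk_overlap  # how far each window advances
    chunks: List[str] = []
    start = 0
    while start < len(token_ids):
        window = token_ids[start:start + tokens_per_chunk]
        chunks.append(decode(window))
        # Stop once the current window has reached the end of the input.
        if start + tokens_per_chunk >= len(token_ids):
            break
        start += step
    return chunks


class WhitespaceTokenizer:
    """Toy tokenizer (an assumption for this demo): one token per word."""

    def __init__(self) -> None:
        self._vocab: Dict[str, int] = {}
        self._words: List[str] = []

    def encode(self, text: str) -> List[int]:
        ids: List[int] = []
        for word in text.split():
            if word not in self._vocab:
                self._vocab[word] = len(self._words)
                self._words.append(word)
            ids.append(self._vocab[word])
        return ids

    def decode(self, ids: List[int]) -> str:
        return " ".join(self._words[i] for i in ids)


if __name__ == "__main__":
    tokenizer = WhitespaceTokenizer()
    for chunk in split_text_on_tokens(
        "a b c d e", tokenizer.encode, tokenizer.decode,
        tokens_per_chunk=2, chunk_overlap=1,
    ):
        print(chunk)
```

Under this design, swapping in a real tokenizer (for example one from `tiktoken` or a sentence transformer model) would only require passing its encode and decode functions; whether the actual `TokenTextSplitter` adopts this shape is left open by the commit message.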