langchain/docs/extras/integrations
Taqi Jaffri b7290f01d8
Batching for hf_pipeline (#10795)
The Hugging Face pipeline in langchain (used for locally hosted models)
does not support batching. If you send in a batch of prompts, it just
processes them serially using the base implementation of `_generate`:
https://github.com/docugami/langchain/blob/master/libs/langchain/langchain/llms/base.py#L1004C2-L1004C29

This PR adds support for batching in this pipeline, so that GPUs can be
fully saturated. I updated the accompanying notebook to show GPU batch
inference.
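The serial fallback loops over prompts one at a time; batching instead slices the prompt list into fixed-size chunks and hands each whole chunk to the pipeline so the GPU sees full batches. A minimal sketch of that pattern (the `generate_in_batches` helper and the stand-in `fake_pipeline` below are illustrative, not the actual LangChain or PR code):

```python
from typing import Callable, List

def generate_in_batches(
    prompts: List[str],
    run_batch: Callable[[List[str]], List[str]],
    batch_size: int = 4,
) -> List[str]:
    """Run prompts through run_batch in fixed-size chunks instead of
    one at a time, so a GPU-backed pipeline can process full batches."""
    results: List[str] = []
    for start in range(0, len(prompts), batch_size):
        results.extend(run_batch(prompts[start:start + batch_size]))
    return results

# Stand-in for the model call: records the batch sizes it receives.
seen_batch_sizes: List[int] = []

def fake_pipeline(batch: List[str]) -> List[str]:
    seen_batch_sizes.append(len(batch))
    return [p.upper() for p in batch]

outputs = generate_in_batches(
    [f"prompt {i}" for i in range(10)], fake_pipeline, batch_size=4
)
# 10 prompts with batch_size=4 -> chunks of 4, 4, and 2
```

With a real text-generation pipeline, `run_batch` would be the model call itself, and `batch_size` would be tuned to GPU memory.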

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2023-09-25 18:23:11 +01:00
callbacks Update argilla.ipynb with spelling fix (#10611) 2023-09-19 08:06:28 -07:00
chat docs: add vLLM chat notebook (#10993) 2023-09-24 18:23:19 -07:00
chat_loaders Harrison/stop importing from init (#10690) 2023-09-16 17:22:48 -07:00
document_loaders fix broken link in docugami loader docs (#10753) 2023-09-18 21:56:33 -07:00
document_transformers Separate platforms integrations docs (#10609) 2023-09-15 12:18:57 -07:00
llms Batching for hf_pipeline (#10795) 2023-09-25 18:23:11 +01:00
memory Remembrall Integration (#10767) 2023-09-19 08:36:32 -07:00
platforms add vertex prod features (#10910) 2023-09-22 01:44:09 -07:00
providers Add Javelin integration (#10275) 2023-09-20 16:36:39 -07:00
retrievers Harrison/stop importing from init (#10690) 2023-09-16 17:22:48 -07:00
text_embedding LLMRails Embedding (#10959) 2023-09-23 16:11:02 -07:00
toolkits Harrison/stop importing from init (#10690) 2023-09-16 17:22:48 -07:00
tools Harrison/stop importing from init (#10690) 2023-09-16 17:22:48 -07:00
vectorstores Docs: Using SupabaseVectorStore with existing documents (#10907) 2023-09-22 08:18:56 -07:00