Update Ollama multi-modal template README.md (#14994)
This commit is contained in:
parent 1db7450bc2 · commit 94586ec242

@@ -1,13 +1,17 @@
# rag-multi-modal-local
-Visual search is a familiar application to many with iPhones or Android devices: use natural language to search across your photo collection.
+Visual search is a familiar application to many with iPhones or Android devices. It allows users to search photos using natural language.

-With the release of open-source multi-modal LLMs, it's possible to build this kind of application for yourself and have it run on your personal laptop.
+With the release of open-source multi-modal LLMs, it's possible to build this kind of application yourself for your own private photo collection.

-This template demonstrates how to perform visual search and question-answering over a collection of photos.
+This template demonstrates how to perform private visual search and question-answering over a collection of your photos.

It uses OpenCLIP embeddings to embed all of the photos and stores them in Chroma.
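
As a rough sketch of what that indexing step looks like (the photo directory, collection name, and persist path below are illustrative, not taken from the template):

```python
from pathlib import Path

from langchain_community.vectorstores import Chroma
from langchain_experimental.open_clip import OpenCLIPEmbeddings

# Collect the photos to index (hypothetical directory of JPEGs).
photo_dir = Path("/path/to/photos")
image_uris = sorted(str(p) for p in photo_dir.glob("*.jpg"))

# OpenCLIP embeds images and text queries into the same vector space,
# so natural-language search over photos works directly.
vectorstore = Chroma(
    collection_name="multi-modal-rag",        # illustrative name
    embedding_function=OpenCLIPEmbeddings(),
    persist_directory="./chroma_db",          # illustrative local path
)
vectorstore.add_images(uris=image_uris)  # stores base64-encoded images alongside their embeddings

retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
```
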
-Given a set of photos, it will use OpenCLIP embeddings to index them, retrieve photos relevant to the user's question, and use Ollama to run a local, open-source multi-modal LLM to answer questions about the retrieved photos.
+Given a question, relevant photos are retrieved and passed to an open-source multi-modal LLM of your choice for answer synthesis.


## Input
@@ -119,4 +123,4 @@ We can access the template from code with:
```python
from langserve.client import RemoteRunnable

runnable = RemoteRunnable("http://localhost:8000/rag-chroma-multi-modal")
```
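
If helpful, a hypothetical invocation of that remote chain (the assumption that it accepts a plain question string is mine, not taken from the template):

```python
# Hypothetical call; check the template's input schema before relying on this.
answer = runnable.invoke("What am I wearing in my photos from the beach?")
print(answer)
```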