docs: templates updated titles (#25646)

Updated titles to a consistent format.
Fixed links to the diagrams.
Fixed typos.
Note: The Templates menu in the navbar is now sorted by the file names.
I'll try sorting the navbar menus by the page titles, not the page file
names.
Leonid Ganeline
2024-08-23 01:19:38 -07:00
committed by GitHub
parent 1b2ae40d45
commit 163ef35dd1
106 changed files with 366 additions and 344 deletions

@@ -1,15 +1,14 @@
# rag-gemini-multi-modal
# RAG - Gemini multi-modal
Multi-modal LLMs enable visual assistants that can perform question-answering about images.
This template creates a visual assistant for slide decks, which often contain visuals such as graphs or figures.
It uses OpenCLIP embeddings to embed all of the slide images and stores them in Chroma.
It uses `OpenCLIP` embeddings to embed all the slide images and stores them in Chroma.
Given a question, relevant slides are retrieved and passed to [Google Gemini](https://deepmind.google/technologies/gemini/#introduction) for answer synthesis.
![Diagram illustrating the process of a visual assistant using multi-modal LLM, from slide deck images to OpenCLIP embedding, retrieval, and synthesis with Google Gemini, resulting in an answer.](https://github.com/langchain-ai/langchain/assets/122662504/b9e69bef-d687-4ecf-a599-937e559d5184 "Workflow Diagram for Visual Assistant Using Multi-modal LLM")
![Diagram illustrating the process of a visual assistant using multi-modal LLM, from slide deck images to OpenCLIP embedding, retrieval, and synthesis with Google Gemini, resulting in an answer.](https://github.com/langchain-ai/langchain/assets/122662504/b9e69bef-d687-4ecf-a599-937e559d5184) "Workflow Diagram for Visual Assistant Using Multi-modal LLM"
## Input