langchain/docs/modules/audio_models/examples/bananadev.ipynb
Vairo Di Pasquale e381cdbd3f Added whisper compatibility (#1421)
TODO:
- fix compatibility with base

Co-authored-by: Klein Tahiraj <62718109+klein-t@users.noreply.github.com>

---------

Co-authored-by: Klein Tahiraj <klein.tahiraj@gmail.com>
2023-03-09 16:42:29 -08:00


{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Bananadev\n",
"\n",
"This notebook covers how to use Whisper from banana.dev to build an audio chain, and how to combine it with an LLM to summarize a transcript extracted from an audio file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import langchain\n",
"from langchain.audio_models import AudioBanana\n",
"from langchain.cache import SQLiteCache\n",
"from langchain.chains import AudioChain, LLMChain, SimpleSequentialChain\n",
"from langchain.llms import OpenAI\n",
"from langchain.prompts import PromptTemplate"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setting the cache"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"audio_cache = SQLiteCache(database_path=\".langchain.db\")\n",
"\n",
"langchain.llm_cache = audio_cache"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Defining the audio model\n",
"\n",
"An `AudioBanana` object has the following arguments:\n",
"\n",
"* `model_key` (str): model endpoint to use;\n",
"* `banana_api_key` (Optional[str]): Banana API key;\n",
"* `max_chars` (Optional[int]): maximum number of characters to return.\n",
"\n",
"An `AudioChain` object has the following arguments:\n",
"* `audio_model` (AudioBase): the audio model to use;\n",
"* `output_key` (str): the task to be performed by the model. Not all the audio models support all the tasks. In the case of AudioBanana only `\"transcribe\"` is a valid output key. The output key will returned along with the text data gathered from the audio file.\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"audio_model = AudioBanana(model_key=\"[YOUR MODEL KEY]\", max_chars=20000)\n",
"\n",
"audio_chain = AudioChain(audio_model=audio_model, output_key=\"transcribe\")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setting up an LLM and an LLMChain"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0.7)\n",
"template = \"\"\"Speech: {text}\n",
"\n",
" Write a short 3-sentence summary of the speech.\n",
"\n",
" Summary:\"\"\"\n",
"\n",
"# Note that the input variable \"text\" matches the placeholder in the template\n",
"prompt_template = PromptTemplate(input_variables=[\"text\"], template=template)\n",
"\n",
"summary_chain = LLMChain(llm=llm, prompt=prompt_template)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Defining a SimpleSequentialChain and running"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"speech_summary_chain = SimpleSequentialChain(\n",
" chains=[audio_chain, summary_chain], verbose=True\n",
")\n",
"\n",
"audio_path = \"example_data/ihaveadream.mp3\"\n",
"speech_summary_chain.run(audio_path)"
]
}
],
"metadata": {
"language_info": {
"name": "python"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}