mirror of
https://github.com/hwchase17/langchain.git
synced 2025-09-04 04:28:58 +00:00
Add new document_loader: AssemblyAIAudioTranscriptLoader (#9667)
This PR adds a new document loader, `AssemblyAIAudioTranscriptLoader`, that transcribes audio files with the [AssemblyAI API](https://www.assemblyai.com) and loads the transcribed text into documents.

- Add new document_loader with class `AssemblyAIAudioTranscriptLoader`
- Add optional dependency `assemblyai`
- Add unit tests (using a Mock client)
- Add docs notebook

This is the equivalent of the JS integration already available in LangChain.js. See the [LangChain JS docs AssemblyAI page](https://js.langchain.com/docs/modules/data_connection/document_loaders/integrations/web_loaders/assemblyai_audio_transcription).

At its simplest, you can use the loader to get a transcript back from an audio file like this:

```python
from langchain.document_loaders.assemblyai import AssemblyAIAudioTranscriptLoader

loader = AssemblyAIAudioTranscriptLoader(file_path="./testfile.mp3")

docs = loader.load()
```

To use it, you need the `assemblyai` Python package installed and the environment variable `ASSEMBLYAI_API_KEY` set to your API key. Alternatively, the API key can be passed as an argument.

Twitter handles to shout out if so kindly 🙇
[@AssemblyAI](https://twitter.com/AssemblyAI) and [@patloeber](https://twitter.com/patloeber)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
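The two ways of supplying the API key described above (environment variable or argument) can be illustrated with a small sketch. The helper `resolve_api_key` is hypothetical and not part of LangChain or the `assemblyai` SDK. The precedence shown (explicit argument first, then `ASSEMBLYAI_API_KEY`) and the error raised when neither is set are assumptions made for illustration, not confirmed behavior of the loader.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else read the environment.

    Assumption: an explicitly passed key takes precedence over the
    ASSEMBLYAI_API_KEY environment variable.
    """
    if api_key:
        return api_key
    env_key = os.environ.get("ASSEMBLYAI_API_KEY")
    if not env_key:
        # Assumed behavior for this sketch only: fail early with a clear message.
        raise ValueError(
            "No API key found: pass api_key=... or set ASSEMBLYAI_API_KEY."
        )
    return env_key
```

A helper like this would let calling code check early whether a key is available before starting a transcription, which blocks until it finishes.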
224
docs/extras/integrations/document_loaders/assemblyai.ipynb
Normal file
@@ -0,0 +1,224 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# AssemblyAI Audio Transcripts\n",
    "\n",
    "The `AssemblyAIAudioTranscriptLoader` transcribes audio files with the [AssemblyAI API](https://www.assemblyai.com) and loads the transcribed text into documents.\n",
    "\n",
    "To use it, you should have the `assemblyai` Python package installed, and the\n",
    "environment variable `ASSEMBLYAI_API_KEY` set with your API key. Alternatively, the API key can also be passed as an argument.\n",
    "\n",
    "More info about AssemblyAI:\n",
    "\n",
    "- [Website](https://www.assemblyai.com/)\n",
    "- [Get a Free API key](https://www.assemblyai.com/dashboard/signup)\n",
    "- [AssemblyAI API Docs](https://www.assemblyai.com/docs)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installation\n",
    "\n",
    "First, you need to install the `assemblyai` Python package.\n",
    "\n",
    "You can find more info about it in the [assemblyai-python-sdk GitHub repo](https://github.com/AssemblyAI/assemblyai-python-sdk)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#!pip install assemblyai"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example\n",
    "\n",
    "The `AssemblyAIAudioTranscriptLoader` needs at least the `file_path` argument. Audio files can be specified as a URL or a local file path."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.document_loaders.assemblyai import AssemblyAIAudioTranscriptLoader\n",
    "\n",
    "audio_file = \"https://storage.googleapis.com/aai-docs-samples/nbc.mp3\"\n",
    "# or a local file path: audio_file = \"./nbc.mp3\"\n",
    "\n",
    "loader = AssemblyAIAudioTranscriptLoader(file_path=audio_file)\n",
    "\n",
    "docs = loader.load()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: Calling `loader.load()` blocks until the transcription is finished."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The transcribed text is available in the `page_content`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "docs[0].page_content"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```\n",
    "\"Load time, a new president and new congressional makeup. Same old ...\"\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `metadata` contains the full JSON response with more meta information:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "docs[0].metadata"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```\n",
    "{'language_code': <LanguageCode.en_us: 'en_us'>,\n",
    " 'audio_url': 'https://storage.googleapis.com/aai-docs-samples/nbc.mp3',\n",
    " 'punctuate': True,\n",
    " 'format_text': True,\n",
    " ...\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Transcript Formats\n",
    "\n",
    "You can specify the `transcript_format` argument for different formats.\n",
    "\n",
    "Depending on the format, one or more documents are returned. These are the different `TranscriptFormat` options:\n",
    "\n",
    "- `TEXT`: One document with the transcription text\n",
    "- `SENTENCES`: Multiple documents, one per sentence of the transcription\n",
    "- `PARAGRAPHS`: Multiple documents, one per paragraph of the transcription\n",
    "- `SUBTITLES_SRT`: One document with the transcript exported in SRT subtitles format\n",
    "- `SUBTITLES_VTT`: One document with the transcript exported in VTT subtitles format"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from langchain.document_loaders.assemblyai import (\n",
    "    AssemblyAIAudioTranscriptLoader,\n",
    "    TranscriptFormat,\n",
    ")\n",
    "\n",
    "loader = AssemblyAIAudioTranscriptLoader(\n",
    "    file_path=\"./your_file.mp3\",\n",
    "    transcript_format=TranscriptFormat.SENTENCES,\n",
    ")\n",
    "\n",
    "docs = loader.load()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Transcription Config\n",
    "\n",
    "You can also specify the `config` argument to use different audio intelligence models.\n",
    "\n",
    "Visit the [AssemblyAI API Documentation](https://www.assemblyai.com/docs) to get an overview of all available models!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import assemblyai as aai\n",
    "\n",
    "config = aai.TranscriptionConfig(speaker_labels=True,\n",
    "                                 auto_chapters=True,\n",
    "                                 entity_detection=True\n",
    ")\n",
    "\n",
    "loader = AssemblyAIAudioTranscriptLoader(\n",
    "    file_path=\"./your_file.mp3\",\n",
    "    config=config\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Pass the API Key as an argument\n",
    "\n",
    "Besides setting the API key in the environment variable `ASSEMBLYAI_API_KEY`, you can also pass it as an argument."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "loader = AssemblyAIAudioTranscriptLoader(\n",
    "    file_path=\"./your_file.mp3\",\n",
    "    api_key=\"YOUR_KEY\"\n",
    ")"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
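The notebook's "Transcript Formats" cell says that some formats yield one document and others yield several. A minimal sketch of that mapping follows, under stated assumptions: the `TranscriptFormat` enum here is a local stand-in whose string values are invented, `Doc` is a simplified stand-in for a LangChain document, `to_docs` is a hypothetical function, and `sample` is made-up data. The real loader obtains its content from the AssemblyAI API and may be structured differently.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, List


class TranscriptFormat(Enum):
    """Local stand-in for the enum named in the notebook; values are assumptions."""

    TEXT = "text"
    SENTENCES = "sentences"
    PARAGRAPHS = "paragraphs"
    SUBTITLES_SRT = "subtitles_srt"
    SUBTITLES_VTT = "subtitles_vtt"


@dataclass
class Doc:
    """Simplified stand-in for a document with page_content and metadata."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def to_docs(transcript: Dict[str, Any], fmt: TranscriptFormat) -> List[Doc]:
    """Hypothetical mapping from an already-fetched transcript to documents."""
    if fmt in (TranscriptFormat.SENTENCES, TranscriptFormat.PARAGRAPHS):
        # One document per sentence or per paragraph.
        return [Doc(page_content=part) for part in transcript[fmt.value]]
    # TEXT and both subtitle formats produce a single document.
    return [Doc(page_content=transcript[fmt.value])]


# Made-up transcript data, for illustration only.
sample = {
    "text": "Hello there. How are you?",
    "sentences": ["Hello there.", "How are you?"],
    "paragraphs": ["Hello there. How are you?"],
    "subtitles_srt": "1\n00:00:00,000 --> 00:00:02,000\nHello there. How are you?\n",
    "subtitles_vtt": "WEBVTT\n\n00:00.000 --> 00:02.000\nHello there. How are you?\n",
}

print(len(to_docs(sample, TranscriptFormat.TEXT)))       # 1
print(len(to_docs(sample, TranscriptFormat.SENTENCES)))  # 2
```

Downstream code that indexes `docs[0]` works for every format in this sketch, but only the sentence and paragraph formats need a loop over all documents.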