Fixed the import error in OpenAIWhisperParserLocal and resolved the LangChain parser issue. (#29168)

This commit is contained in:
Syed Muneeb Abbas
2025-01-13 19:47:31 +05:00
committed by GitHub
parent c115c09b6d
commit 8ef7f3eacc
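
The import change in this commit can be exercised defensively, as in the minimal sketch below. The module path `langchain_community.document_loaders.parsers.audio` and the two class names are taken from the diff. The helper name `load_whisper_parsers` is hypothetical, and returning `None` when the package is missing is an assumption about how a caller might want to handle an absent install, not something the commit itself does.

```python
import importlib

# Module path as it appears in the updated notebook import.
AUDIO_PARSERS_MODULE = "langchain_community.document_loaders.parsers.audio"


def load_whisper_parsers(module_path: str = AUDIO_PARSERS_MODULE):
    """Return (OpenAIWhisperParser, OpenAIWhisperParserLocal), or None if unavailable.

    Hypothetical helper for illustration; it is not part of LangChain.
    """
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # The package (or one of its dependencies) is not installed.
        return None
    try:
        return module.OpenAIWhisperParser, module.OpenAIWhisperParserLocal
    except AttributeError:
        # The module exists but does not expose the expected classes.
        return None


parsers = load_whisper_parsers()
if parsers is None:
    print("Whisper parsers are not available in this environment")
else:
    print("Imported:", ", ".join(cls.__name__ for cls in parsers))
```

In the notebook itself the plain `from ... import (...)` form shown in the diff is sufficient; the guarded version is only useful where the package may not be installed.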


@@ -3,7 +3,9 @@
 {
 "cell_type": "markdown",
 "id": "e48afb8d",
-"metadata": {},
+"metadata": {
+"id": "e48afb8d"
+},
 "source": [
 "# YouTube audio\n",
 "\n",
@@ -19,16 +21,18 @@
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": null,
 "id": "5f34e934",
-"metadata": {},
+"metadata": {
+"id": "5f34e934"
+},
 "outputs": [],
 "source": [
 "from langchain_community.document_loaders.blob_loaders.youtube_audio import (\n",
 " YoutubeAudioLoader,\n",
 ")\n",
 "from langchain_community.document_loaders.generic import GenericLoader\n",
-"from langchain_community.document_loaders.parsers import (\n",
+"from langchain_community.document_loaders.parsers.audio import (\n",
 " OpenAIWhisperParser,\n",
 " OpenAIWhisperParserLocal,\n",
 ")"
@@ -37,7 +41,9 @@
 {
 "cell_type": "markdown",
 "id": "85fc12bd",
-"metadata": {},
+"metadata": {
+"id": "85fc12bd"
+},
 "source": [
 "We will use `yt_dlp` to download audio for YouTube urls.\n",
 "\n",
@@ -48,7 +54,9 @@
 "cell_type": "code",
 "execution_count": null,
 "id": "fb5a6606",
-"metadata": {},
+"metadata": {
+"id": "fb5a6606"
+},
 "outputs": [],
 "source": [
 "%pip install --upgrade --quiet yt_dlp\n",
@@ -59,7 +67,9 @@
 {
 "cell_type": "markdown",
 "id": "b0e119f4",
-"metadata": {},
+"metadata": {
+"id": "b0e119f4"
+},
 "source": [
 "### YouTube url to text\n",
 "\n",
@@ -74,7 +84,9 @@
 "cell_type": "code",
 "execution_count": null,
 "id": "8682f256",
-"metadata": {},
+"metadata": {
+"id": "8682f256"
+},
 "outputs": [],
 "source": [
 "# set a flag to switch between local and remote parsing\n",
@@ -84,9 +96,12 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": null,
 "id": "23e1e134",
-"metadata": {},
+"metadata": {
+"id": "23e1e134",
+"outputId": "0794ffeb-f912-48cc-e3cb-3b4d6e5221c7"
+},
 "outputs": [
 {
 "name": "stdout",
@@ -130,9 +145,12 @@
 },
 {
 "cell_type": "code",
-"execution_count": 3,
+"execution_count": null,
 "id": "72a94fd8",
-"metadata": {},
+"metadata": {
+"id": "72a94fd8",
+"outputId": "b024759c-3925-40c1-9c59-2f9dabee0248"
+},
 "outputs": [
 {
 "data": {
@@ -153,7 +171,9 @@
 {
 "cell_type": "markdown",
 "id": "93be6b49",
-"metadata": {},
+"metadata": {
+"id": "93be6b49"
+},
 "source": [
 "### Building a chat app from YouTube video\n",
 "\n",
@@ -162,9 +182,11 @@
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": null,
 "id": "1823f042",
-"metadata": {},
+"metadata": {
+"id": "1823f042"
+},
 "outputs": [],
 "source": [
 "from langchain.chains import RetrievalQA\n",
@@ -175,9 +197,11 @@
 },
 {
 "cell_type": "code",
-"execution_count": 5,
+"execution_count": null,
 "id": "7257cda1",
-"metadata": {},
+"metadata": {
+"id": "7257cda1"
+},
 "outputs": [],
 "source": [
 "# Combine doc\n",
@@ -187,9 +211,11 @@
 },
 {
 "cell_type": "code",
-"execution_count": 6,
+"execution_count": null,
 "id": "147c0c55",
-"metadata": {},
+"metadata": {
+"id": "147c0c55"
+},
 "outputs": [],
 "source": [
 "# Split them\n",
@@ -199,9 +225,11 @@
 },
 {
 "cell_type": "code",
-"execution_count": 7,
+"execution_count": null,
 "id": "f3556703",
-"metadata": {},
+"metadata": {
+"id": "f3556703"
+},
 "outputs": [],
 "source": [
 "# Build an index\n",
@@ -211,9 +239,11 @@
 },
 {
 "cell_type": "code",
-"execution_count": 8,
+"execution_count": null,
 "id": "beaa99db",
-"metadata": {},
+"metadata": {
+"id": "beaa99db"
+},
 "outputs": [],
 "source": [
 "# Build a QA chain\n",
@@ -226,9 +256,12 @@
 },
 {
 "cell_type": "code",
-"execution_count": 9,
+"execution_count": null,
 "id": "f2239a62",
-"metadata": {},
+"metadata": {
+"id": "f2239a62",
+"outputId": "b8de052d-cb76-44c5-bb0c-57e7398e89e6"
+},
 "outputs": [
 {
 "data": {
@@ -249,9 +282,12 @@
 },
 {
 "cell_type": "code",
-"execution_count": 10,
+"execution_count": null,
 "id": "a8d01098",
-"metadata": {},
+"metadata": {
+"id": "a8d01098",
+"outputId": "9d66d66e-fc7f-4ac9-b104-a8e45e962949"
+},
 "outputs": [
 {
 "data": {
@@ -271,9 +307,12 @@
 },
 {
 "cell_type": "code",
-"execution_count": 11,
+"execution_count": null,
 "id": "fe1e77dd",
-"metadata": {},
+"metadata": {
+"id": "fe1e77dd",
+"outputId": "19479403-92c9-471e-8c93-5df4b17f007a"
+},
 "outputs": [
 {
 "data": {
@@ -314,6 +353,9 @@
 "interpreter": {
 "hash": "97cc609b13305c559618ec78a438abc56230b9381f827f22d070313b9a1f3777"
 }
+},
+"colab": {
+"provenance": []
 }
 },
 "nbformat": 4,