community: fix CPU support for FasterWhisperParser (implicit compute type for WhisperModel) (#30263)

FasterWhisperParser fails on a machine without an NVIDIA GPU: "Requested
float16 compute type, but the target device or backend do not support
efficient float16 computation." This happens because WhisperModel is
instantiated with compute_type="float16", which is supported only on
NVIDIA GPUs.

According to the [CTranslate2
docs](https://opennmt.net/CTranslate2/quantization.html#bit-floating-points-float16),
float16 is supported only on NVIDIA GPUs, so removing the compute_type
parameter fixes the problem on CPUs. Per the [CTranslate2
docs](https://opennmt.net/CTranslate2/quantization.html#quantize-on-model-loading),
setting compute_type to "default" (the behavior when the parameter is
omitted) either keeps the model's original compute type or implicitly
converts it to one supported by the target device (GPU or CPU). I
therefore suggest removing compute_type="float16".
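For illustration, a minimal sketch of the two call styles (the "base" model size and the device string are placeholders):

```python
from faster_whisper import WhisperModel

# compute_type defaults to "default": CTranslate2 keeps the model's
# original compute type or implicitly converts it to one the target
# device supports (e.g. float32 on a CPU-only machine).
model = WhisperModel("base", device="cpu")

# The removed call style, for comparison; on a machine without an
# NVIDIA GPU this raises the float16 error quoted above.
# model = WhisperModel("base", device="cpu", compute_type="float16")
```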

@hulitaitai, you are the original author of FasterWhisperParser - is
there a reason for setting the parameter to float16?

Thanks for reviewing the PR!

Co-authored-by: qonnop <qonnop@users.noreply.github.com>
qonnop 2025-03-15 03:22:29 +01:00 committed by GitHub
parent c74e7b997d
commit 747efa16ec


@@ -673,9 +673,7 @@ class FasterWhisperParser(BaseBlobParser):
         file_obj = io.BytesIO(audio.export(format="mp3").read())
         # Transcribe
-        model = WhisperModel(
-            self.model_size, device=self.device, compute_type="float16"
-        )
+        model = WhisperModel(self.model_size, device=self.device)
         segments, info = model.transcribe(file_obj, beam_size=5)
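
For reference, a hedged usage sketch of the parser after this change. It assumes the constructor accepts the device keyword shown in the diff; the audio path is a placeholder:

```python
from langchain_core.documents.base import Blob
from langchain_community.document_loaders.parsers.audio import FasterWhisperParser

# With the implicit compute type, this should now run on a CPU-only machine.
parser = FasterWhisperParser(device="cpu")
blob = Blob.from_path("speech.mp3")  # placeholder audio file
for doc in parser.lazy_parse(blob):
    print(doc.page_content)
```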