community[patch]: avoid KeyError when language not in LANGUAGE_SEGMENTERS (#15212)

**Description:**

Handle unsupported languages in the same way as when no language is provided.

**Issue:**

The following line raises a `KeyError` if the language is not
supported:
```python
self.Segmenter = LANGUAGE_SEGMENTERS[language]
```
E.g. when using `Language.CPP` we would get
`KeyError: <Language.CPP: 'cpp'>`.
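
For illustration, here is a minimal, self-contained sketch of the failure. It does not import langchain: the `Language` enum, the contents of `LANGUAGE_SEGMENTERS`, and the `lookup_segmenter` helper below are stand-ins assumed for this example, not the library's actual definitions.

```python
from enum import Enum


class Language(Enum):
    # Stand-in for the real enum; only two members are modeled here.
    PYTHON = "python"
    CPP = "cpp"


# Stand-in registry: assume only Python has a segmenter registered.
LANGUAGE_SEGMENTERS = {Language.PYTHON: object}


def lookup_segmenter(language):
    # Hypothetical helper that mirrors the unguarded dictionary lookup.
    return LANGUAGE_SEGMENTERS[language]


caught = None
try:
    lookup_segmenter(Language.CPP)
except KeyError as err:
    # The missing key is the enum member itself.
    caught = err
```

After this runs, `caught` holds the `KeyError` whose argument is `Language.CPP`.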

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
bachr 2024-01-23 21:09:43 -08:00 committed by GitHub
parent 3f38e1a457
commit b3ed98dec0


@@ -97,6 +97,8 @@ class LanguageParser(BaseBlobParser):
             language: If None (default), it will try to infer language from source.
             parser_threshold: Minimum lines needed to activate parsing (0 by default).
         """
+        if language and language not in LANGUAGE_SEGMENTERS:
+            raise Exception(f"No parser available for {language}")
         self.language = language
         self.parser_threshold = parser_threshold
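
The effect of the added guard can be sketched in isolation as follows. This is an assumption-laden illustration, not the library code: `GuardedParser`, `try_build`, the two-member `Language` enum, and the one-entry `LANGUAGE_SEGMENTERS` registry are hypothetical stand-ins; only the two guard lines are taken from the hunk above.

```python
from enum import Enum
from typing import Optional


class Language(Enum):
    # Stand-in enum with one supported and one unsupported member.
    PYTHON = "python"
    CPP = "cpp"


# Stand-in registry: assume only Python has a segmenter registered.
LANGUAGE_SEGMENTERS = {Language.PYTHON: object}


class GuardedParser:
    """Hypothetical stand-in for LanguageParser (constructor only)."""

    def __init__(self, language: Optional[Language] = None, parser_threshold: int = 0):
        # Guard copied from the diff: reject languages with no segmenter.
        if language and language not in LANGUAGE_SEGMENTERS:
            raise Exception(f"No parser available for {language}")
        self.language = language
        self.parser_threshold = parser_threshold


def try_build(language):
    # Hypothetical helper: return (parser, None) or (None, error message).
    try:
        return GuardedParser(language), None
    except Exception as err:
        return None, str(err)
```

In this sketch, `try_build(Language.CPP)` returns no parser and a message starting with "No parser available for", while `try_build(None)` and `try_build(Language.PYTHON)` both succeed. An unsupported language therefore fails at construction time with an explicit message rather than surfacing later as a `KeyError`.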