community[patch]: Invoke callback prior to yielding token fix for Llamafile (#20365)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for Llamafile


- [x] **PR message**:
    - **Description:** Invoke the callback prior to yielding the token in the stream method of community `llamafile.py`.
    - **Issue:** https://github.com/langchain-ai/langchain/issues/16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
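
The fix amounts to reordering two statements: fire the `on_llm_new_token` callback before handing the chunk to the caller. Below is a minimal standalone sketch of that pattern (simplified, hypothetical names; not the actual langchain code, which routes the callback through a run manager):

```python
# Sketch of the callback-before-yield streaming pattern (assumed,
# simplified stand-ins for langchain's GenerationChunk / run manager).
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional


@dataclass
class GenerationChunk:
    text: str


def stream_tokens(
    raw_tokens: List[str],
    on_new_token: Optional[Callable[[str], None]] = None,
) -> Iterator[GenerationChunk]:
    for content in raw_tokens:
        chunk = GenerationChunk(text=content)
        # Invoke the callback first ...
        if on_new_token:
            on_new_token(chunk.text)
        # ... then yield the chunk to the consumer.
        yield chunk


seen: List[str] = []
chunks = list(stream_tokens(["Hello", " world"], on_new_token=seen.append))
```

Why the order matters: a generator suspends at `yield`, so a callback placed after the `yield` only runs when the consumer asks for the *next* item. A consumer that stops iterating early (e.g. `break`) would never resume the generator, and the handler would silently miss the token it just received; firing the callback first guarantees handlers see every token that is yielded.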
Committed by balloonio on 2024-04-12 15:26:12 -04:00 via GitHub
parent 1b272fa2f4
commit e7b1a44c5b


```diff
@@ -297,9 +297,10 @@ class Llamafile(LLM):
         for raw_chunk in response.iter_lines(decode_unicode=True):
             content = self._get_chunk_content(raw_chunk)
             chunk = GenerationChunk(text=content)
-            yield chunk
             if run_manager:
                 run_manager.on_llm_new_token(token=chunk.text)
+            yield chunk

     def _get_chunk_content(self, chunk: str) -> str:
         """When streaming is turned on, llamafile server returns lines like:
```