langchain

mirror of https://github.com/hwchase17/langchain.git synced 2025-09-28 15:00:23 +00:00

Files

Sergey Kozlov 135cb86215 Fix QuestionListOutputParser (#9738)

This PR fixes `QuestionListOutputParser` text splitting.

`QuestionListOutputParser` incorrectly splits numbered list text into
lines. If the text doesn't end with `\n`, the regex doesn't capture the
last item, so the parser returns `n - 1` items and
`WebResearchRetriever.llm_chain` generates fewer queries than requested
in the search prompt.
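
The behaviour described above can be sketched with plain `re`. The two patterns below are assumptions chosen for illustration (one that requires a trailing `\n` after every item, and one that also accepts end of text); they are not quoted from the parser's source, and the names `OLD_PATTERN` and `NEW_PATTERN` are hypothetical.

```python
import re

# Assumed patterns, for illustration only; not copied from the langchain source.
OLD_PATTERN = r"\d+\..*?\n"        # each item must be followed by a newline
NEW_PATTERN = r"\d+\..*?(?:\n|$)"  # an item may also be closed by end of text

# Numbered list whose last item has no trailing newline.
text = "1. This is line one.\n2. This is line two."

old_lines = re.findall(OLD_PATTERN, text)
new_lines = re.findall(NEW_PATTERN, text)

print(old_lines)  # the last item is missing
print(new_lines)  # both items are captured
```

Under these assumptions, the alternation `(?:\n|$)` is what lets the final item match when the text ends without a line break.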

How to reproduce:

```python
from langchain.retrievers.web_research import QuestionListOutputParser

parser = QuestionListOutputParser()

good = parser.parse(
    """1. This is line one.
    2. This is line two.
    """  # <-- Ends with a new line.
)

bad = parser.parse(
    """1. This is line one.
    2. This is line two."""    # <-- No new line.
)

assert good.lines == ['1. This is line one.\n', '2. This is line two.\n'], good.lines
assert bad.lines == ['1. This is line one.\n', '2. This is line two.'], bad.lines
```

NOTE: The last item will not contain a line break, but this seems OK
because the items are stripped in
`WebResearchRetriever.clean_search_query()`.

2023-08-25 01:47:17 -07:00

experimental

poetry lock the experimental package. (#9478)

2023-08-22 14:09:35 -04:00

langchain

Fix QuestionListOutputParser (#9738)

2023-08-25 01:47:17 -07:00