Mirror of https://github.com/hwchase17/langchain.git, synced 2026-03-18 11:07:36 +00:00
The eval chain is currently very sensitive to differences in phrasing, punctuation, and tangential information. This prompt has worked better for me on my examples.

A more general question: do we have any framework for evaluating changes to default prompts? We could maybe start doing some regression testing.
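As a rough illustration of what such regression testing could look like: pin a small set of (question, reference, prediction, expected grade) cases and assert that the eval chain's grades don't drift when a default prompt is edited. The `run_eval_chain` function below is a hypothetical, deterministic stand-in for the real LLM-backed eval chain (it only checks token overlap), so the sketch is self-contained; in practice you would call the actual chain and cache or replay its outputs.

```python
import string

# Pinned regression cases: if a prompt change flips any expected grade,
# the regression check fails and the change gets a closer look.
REGRESSION_CASES = [
    # (question, reference answer, model prediction, expected grade)
    ("What is 2 + 2?", "4", "4", "CORRECT"),
    ("What is the capital of France?", "Paris", "Paris, of course!", "CORRECT"),
    ("What is the capital of France?", "Paris", "London", "INCORRECT"),
]


def _tokens(text: str) -> set[str]:
    """Lowercase, punctuation-stripped token set."""
    return {t.strip(string.punctuation) for t in text.lower().split()}


def run_eval_chain(question: str, reference: str, prediction: str) -> str:
    """Hypothetical grader standing in for the LLM-backed eval chain.

    Grades CORRECT when every reference token appears in the prediction,
    ignoring case and punctuation -- a toy proxy for the real chain.
    """
    return "CORRECT" if _tokens(reference) <= _tokens(prediction) else "INCORRECT"


def run_regressions() -> list[tuple[str, str, str]]:
    """Return (question, expected, got) for every case whose grade changed."""
    failures = []
    for question, reference, prediction, expected in REGRESSION_CASES:
        got = run_eval_chain(question, reference, prediction)
        if got != expected:
            failures.append((question, expected, got))
    return failures


if __name__ == "__main__":
    assert run_regressions() == [], run_regressions()
    print("all regression cases pass")
```

This also makes the sensitivity concrete: the second case only passes because grading ignores punctuation and extra words, which is exactly the kind of behavior a prompt tweak can silently break.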