Compare commits

...

3933 Commits

Author SHA1 Message Date
Erick Friis
796da053c6 x 2024-11-15 10:06:24 -08:00
Zapiron
4b641f87ae English Update and fixed a duplicate "the" (#27981)
Fixed a duplicate "the" in the documentation and made the documentation
generally easier to understand
2024-11-13 14:36:56 -05:00
Erick Friis
f6d34585f0 docs: throw on broken anchors (#27773)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-11-13 14:29:27 -05:00
Zapiron
7bd9c8cba3 docs: Updated link to ensure reference to the correct header for ToolNode (#28088)
When `ToolNode` hyperlink is clicked, it does not automatically scroll
to the section due to incorrect reference to the heading / id in the
LangGraph documentation
2024-11-13 14:19:55 -05:00
ccurme
940e93e891 docs: add docs on StrOutputParser (#28089)
Think it's worth adding a quick guide and including in the table in the
concepts page. `StrOutputParser` can make it easier to deal with the
union type for message content. For example, ChatAnthropic with bound
tools will generate string content if there are no tool calls and
`list[dict]` content otherwise.

I'm also considering removing the output parser section from the
["quickstart"
tutorial](https://python.langchain.com/docs/tutorials/llm_chain/); we
can link to this guide instead.
2024-11-13 14:16:50 -05:00
Vadym Barda
6ec688cf2b xai[patch]: update core (#28092) 2024-11-13 17:51:51 +00:00
Artur Barseghyan
2ab5673eb1 docs: Add example using TypedDict in structured outputs how-to guide (#27415)
For me, the [Pydantic
example](https://python.langchain.com/docs/how_to/structured_output/#choosing-between-multiple-schemas)
does not work (tested on various Python versions from 3.10 to 3.12, and
`Pydantic` versions from 2.7 to 2.9).

The `TypedDict` example (added in this PR) does.

----

Additionally, fixed an error in [Using PydanticOutputParser
example](https://python.langchain.com/docs/how_to/structured_output/#using-pydanticoutputparser).

Was:

```python
query = "Anna is 23 years old and she is 6 feet tall"

print(prompt.invoke(query).to_string())
```

Corrected to:

```python
query = "Anna is 23 years old and she is 6 feet tall"

print(prompt.invoke({"query": query}).to_string())
```

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-11-13 16:53:37 +00:00
Bharat Ramanathan
3e972faf81 community: chore warn deprecate the tracer (#27159)
- **Description:**: This PR deprecates the wandb tracer in favor of the
new
[WeaveTracer](https://weave-docs.wandb.ai/guides/integrations/langchain#using-weavetracer)
in W&B
- **Dependencies:** No dependencies, just a deprecation warning.
- **Twitter handle:** @parambharat


@baskaryan
2024-11-13 11:33:34 -05:00
Erick Friis
76e0127539 core: release 0.3.18 (#28070) 2024-11-13 16:19:13 +00:00
Eric Pinzur
eadc2f6a90 core: added DeleteResponse to the module (#28069)
Description:
* added `DeleteResponse` to the `langchain_core.indexing` module, for
implementing DocumentIndex classes.
2024-11-13 11:08:08 -05:00
ZhangShenao
c89e7ce8b5 core[patch]: Update doc-strings in callbacks (#28073)
- Fix api docs
2024-11-13 11:07:15 -05:00
Tom Pham
965286db3e docs: fix spelling error (#28075)
Fix spelling error in docs
2024-11-13 11:06:13 -05:00
Zapiron
892694d735 docs: Fixed broken link for AI models introduction (#28079)
Fixed broken redirect to the introduction to AI models in the Forefront
platform
2024-11-13 11:03:40 -05:00
Vruddhi Shah
beef4c4d62 Proofreading and Editing Report for Migration Guide (#28084)
Corrections and Suggestions for Migrating LangChain Code Documentation

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-11-13 11:03:09 -05:00
Zapiron
2cec957274 docs: Fix missing space between the words API Reference (#28087)
Added an expected space between the words APIReference
2024-11-13 11:02:46 -05:00
Zapiron
da7c79b794 DOCS: Concept Section Improvements & Updates (#27733)
Edited mainly the `Concepts` section in the LangChain documentation.

Overview:
* Updated some explanations to make the point more clear / Add missing
words for some documentations.
* Rephrased some sentences to make it shorter and more concise.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-11-13 11:01:27 -05:00
Zapiron
02de346f6d docs: Fixed additional 'the' and remove 'turns' to make explanation clearer (#28082)
Fixed additional 'the' and remove the word 'turns' as it would make
explanation clearer

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-11-13 15:15:39 +00:00
Zapiron
298ebeee4e docs: Fixed broken link for Cloudfare docs for the models available (#28080)
Fixed the broken redirect to see all the cloudfare models
2024-11-13 10:07:33 -05:00
Zapiron
8241c0df23 docs: Fixed wrong link redirect from JS ToolMessage to Python ToolMes… (#28083)
Fixed the link to ToolMessage from the JS documentation to Python
documentation
2024-11-13 10:05:19 -05:00
Zapiron
77c8a5c70c docs: Fixed broken link to the Luminous model family introduction (#28078)
The Luminous Model hyperlink at the start of the model is broken.
Fixed it to update it with the latest link used by the integration
2024-11-13 10:04:50 -05:00
Vadym Barda
09e85c7c4b xai[patch]: update dependencies (#28067) 2024-11-12 16:15:17 -05:00
am-kinetica
a646f1c383 Handled empty search result handling and updated the notebook (#27914)
- [ ] **PR title**: "community: updated Kinetica vectorstore"

  - **Description:** Handled empty search results
  - **Issue:** used to throw error if the search results were empty

@efriis
2024-11-12 13:03:49 -08:00
ccurme
00e7b2dada anthropic[patch]: add examples to API ref (#28065) 2024-11-12 20:17:02 +00:00
Vadym Barda
48ee322a78 partners: add xAI chat integration (#28032) 2024-11-12 15:11:29 -05:00
ccurme
2898b95ca7 anthropic[major]: release 0.3.0 (#28063) 2024-11-12 14:58:00 -05:00
ccurme
5eaa0e8c45 openai[patch]: release 0.2.8 (#28062) 2024-11-12 14:57:11 -05:00
ccurme
15b7dd3ad7 community[patch]: release 0.3.7 (#28061) 2024-11-12 19:54:58 +00:00
ccurme
5460096086 core[patch]: release 0.3.17 (#28060) 2024-11-12 19:38:56 +00:00
ccurme
1538ee17f9 anthropic[major]: support python 3.13 (#27916)
Last week Anthropic released version 0.39.0 of its python sdk, which
enabled support for Python 3.13. This release deleted a legacy
`client.count_tokens` method, which we currently access during init of
the `Anthropic` LLM. Anthropic has replaced this functionality with the
[client.beta.messages.count_tokens()
API](https://github.com/anthropics/anthropic-sdk-python/pull/726).

To enable support for `anthropic >= 0.39.0` and Python 3.13, here we
drop support for the legacy token counting method, and add support for
the new method via `ChatAnthropic.get_num_tokens_from_messages`.

To fully support the token counting API, we update the signature of
`get_num_tokens_from_message` to accept tools everywhere.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-11-12 14:31:07 -05:00
Syed Hyder Zaidi
759b6ed17a docs: Fix typo in Tavily Search example (#28034)
Changed "demon" to "demo" in the code comment for clarity.

PR Title
docs: Fix typo in Tavily Search example

PR Message
Description:
This PR fixes a typo in the code comment of the Tavily Search
documentation. Changed "demon" to "demo" for clarity and to avoid
confusion.

Issue:
No specific issue was mentioned, but this is a minor improvement in
documentation.

Dependencies:
No additional dependencies required.
2024-11-12 13:58:13 -05:00
ZhangShenao
ca7375ac20 Improvement[Community]Improve Embeddings API (#28038)
- Fix `BaichuanTextEmbeddings` api url
- Remove unused params in api doc
- Fix word spelling
2024-11-12 13:57:35 -05:00
Aditya Anand
e290736696 Update streaming.mdx (#28055)
fix: correct grammar in documentation for streaming modes

Updated sentence to clarify usage of "choose" in "When using the stream
and astream methods with LangGraph, you can choose one or more streaming
modes..." for better readability.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-11-12 16:43:12 +00:00
Aditya Anand
f9212c77e7 DOC: Fix typo in documentation for streaming modes, correcting 'witte… (#28052)
…n' to 'written' in 'Emit custom output written using LangGraph’s
StreamWriter.'

### Changes:
- Corrected the typo in the phrase 'Emit custom output witten using
LangGraph’s StreamWriter.' to 'Emit custom output written using
LangGraph’s StreamWriter.'
- Enhanced the clarity of the documentation surrounding LangGraph’s
streaming modes, specifically around the StreamWriter functionality.
- Provided additional context and emphasis on the role of the
StreamWriter class in handling custom output.

### Issue Reference:
- GitHub issue: https://github.com/langchain-ai/langchain/issues/28051

This update addresses the issue raised regarding the incorrect spelling
and aims to improve the clarity of the streaming mode documentation for
better user understanding.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "infra: ..."
for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**:
**Description:**  
Fixed a typo in the documentation for streaming modes, changing "witten"
to "written" in the phrase "Emit custom output witten using LangGraph’s
StreamWriter."
**Issue:**  
This PR addresses and fixes the typo in the documentation referenced in
[#28051](https://github.com/langchain-ai/langchain/issues/28051).


**Issue:**  
This PR addresses and fixes the typo in the documentation referenced in
[#28051](https://github.com/langchain-ai/langchain/issues/28051).


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-11-12 11:42:30 -05:00
Bagatur
139881b108 openai[patch]: fix azure oai stream check (#28048) 2024-11-12 15:42:06 +00:00
Bagatur
9611f0b55d openai[patch]: Release 0.2.7 (#28047) 2024-11-12 15:16:15 +00:00
Bagatur
5c14e1f935 community[patch]: Release 0.3.6 (#28046) 2024-11-12 15:15:07 +00:00
Bagatur
9ebd7ebed8 core[patch]: Release 0.3.16 (#28045) 2024-11-12 14:57:15 +00:00
Changyong Um
9484cc0962 community[docs]: modify parameter for the LoRA adapter on the vllm page (#27930)
**Description:** 
This PR modifies the documentation regarding the configuration of the
VLLM with the LoRA adapter. The updates aim to provide clear
instructions for users on how to set up the LoRA adapter when using the
VLLM.

- before
```python
VLLM(..., enable_lora=True)
```
- after
```python
VLLM(..., 
    vllm_kwargs={
        "enable_lora": True
    }
)
```
This change clarifies that users should use the vllm_kwargs to enable
the LoRA adapter.

Co-authored-by: Um Changyong <changyong.um@sfa.co.kr>
2024-11-11 15:41:56 -05:00
Zapiron
0b85f9035b docs: Makes the phrasing more smooth and reasoning more clear (#28020)
Updated the phrasing and reasoning on the "abstraction not receiving
much development" part of the documentation

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-11-11 17:17:29 +00:00
Zapiron
f695b96484 docs:Fixed missing hyperlink and changed AI to LLMs for clarity (#28006)
Changed "AI" to "LLM" in a paragraph
Fixed missing hyperlink for the structured output point
2024-11-11 12:14:29 -05:00
Choy Fuguan
c0f3777657 docs: removed bolding from header (#28001)
removed extra ** after heading two
2024-11-11 12:13:02 -05:00
Salman Faroz
44df79cf52 Correcting AzureOpenAI initialization (#28014) 2024-11-11 12:10:59 -05:00
Hammad Randhawa
57fc62323a docs : Update sql_qa.ipynb (#28026)
Text Documentation Bug:

Changed DSL query to SQL query.
2024-11-11 12:04:09 -05:00
ccurme
922b6b0e46 docs: update some cassettes (#28010) 2024-11-09 21:04:18 +00:00
ccurme
8e91c7ceec docs: add cross-links (#28000)
Mainly to improve visibility of integration pages.
2024-11-09 08:57:58 -05:00
Bagatur
33dbfba08b openai[patch]: default to invoke on o1 stream() (#27983) 2024-11-08 19:12:59 -08:00
Bagatur
503f2487a5 docs: intro nit (#27998) 2024-11-08 11:51:17 -08:00
ccurme
ff2152b115 docs: update tutorials index and add get started guides (#27996) 2024-11-08 14:47:32 -05:00
Eric Pinzur
c421997caa community[patch]: Added type hinting to OpenSearch clients (#27946)
Description:
* When working with OpenSearchVectorSearch to make
OpenSearchGraphVectorStore (coming soon), I noticed that there wasn't
type hinting for the underlying OpenSearch clients. This fixes that
issue.
* Confirmed tests are still passing with code changes.

Note that there is some additional code duplication now, but I think
this approach is cleaner overall.
2024-11-08 11:04:57 -08:00
Zapiron
4c2392e55c docs: fix link in custom tools guide (#27975)
Fixed broken link in tools documentation for `BaseTool`
2024-11-08 09:40:15 -05:00
Zapiron
85925e3164 docs: fix link in tool-calling guide (#27976)
Fix broken BaseTool link in documentation
2024-11-08 09:39:27 -05:00
Zapiron
138f360b25 docs: fix typo in PDF loader guide (#27977)
Fixed duplicate "py" in hyperlink to `pypdf` docs
2024-11-08 09:38:32 -05:00
Saad Makrod
b509747c7f Community: Google Books API Tool (#27307)
## Description

As proposed in our earlier discussion #26977 we have introduced a Google
Books API Tool that leverages the Google Books API found at
[https://developers.google.com/books/docs/v1/using](https://developers.google.com/books/docs/v1/using)
to generate book recommendations.

### Sample Usage

```python
from langchain_community.tools import GoogleBooksQueryRun
from langchain_community.utilities import GoogleBooksAPIWrapper

api_wrapper = GoogleBooksAPIWrapper()
tool = GoogleBooksQueryRun(api_wrapper=api_wrapper)

tool.run('ai')
```

### Sample Output

```txt
Here are 5 suggestions based off your search for books related to ai:

1. "AI's Take on the Stigma Against AI-Generated Content" by Sandy Y. Greenleaf: In a world where artificial intelligence (AI) is rapidly advancing and transforming various industries, a new form of content creation has emerged: AI-generated content. However, despite its potential to revolutionize the way we produce and consume information, AI-generated content often faces a significant stigma. "AI's Take on the Stigma Against AI-Generated Content" is a groundbreaking book that delves into the heart of this issue, exploring the reasons behind the stigma and offering a fresh, unbiased perspective on the topic. Written from the unique viewpoint of an AI, this book provides readers with a comprehensive understanding of the challenges and opportunities surrounding AI-generated content. Through engaging narratives, thought-provoking insights, and real-world examples, this book challenges readers to reconsider their preconceptions about AI-generated content. It explores the potential benefits of embracing this technology, such as increased efficiency, creativity, and accessibility, while also addressing the concerns and drawbacks that contribute to the stigma. As you journey through the pages of this book, you'll gain a deeper understanding of the complex relationship between humans and AI in the realm of content creation. You'll discover how AI can be used as a tool to enhance human creativity, rather than replace it, and how collaboration between humans and machines can lead to unprecedented levels of innovation. Whether you're a content creator, marketer, business owner, or simply someone curious about the future of AI and its impact on our society, "AI's Take on the Stigma Against AI-Generated Content" is an essential read. With its engaging writing style, well-researched insights, and practical strategies for navigating this new landscape, this book will leave you equipped with the knowledge and tools needed to embrace the AI revolution and harness its potential for success. Prepare to have your assumptions challenged, your mind expanded, and your perspective on AI-generated content forever changed. Get ready to embark on a captivating journey that will redefine the way you think about the future of content creation.
Read more at https://play.google.com/store/books/details?id=4iH-EAAAQBAJ&source=gbs_api

2. "AI Strategies For Web Development" by Anderson Soares Furtado Oliveira: From fundamental to advanced strategies, unlock useful insights for creating innovative, user-centric websites while navigating the evolving landscape of AI ethics and security Key Features Explore AI's role in web development, from shaping projects to architecting solutions Master advanced AI strategies to build cutting-edge applications Anticipate future trends by exploring next-gen development environments, emerging interfaces, and security considerations in AI web development Purchase of the print or Kindle book includes a free PDF eBook Book Description If you're a web developer looking to leverage the power of AI in your projects, then this book is for you. Written by an AI and ML expert with more than 15 years of experience, AI Strategies for Web Development takes you on a transformative journey through the dynamic intersection of AI and web development, offering a hands-on learning experience.The first part of the book focuses on uncovering the profound impact of AI on web projects, exploring fundamental concepts, and navigating popular frameworks and tools. As you progress, you'll learn how to build smart AI applications with design intelligence, personalized user journeys, and coding assistants. Later, you'll explore how to future-proof your web development projects using advanced AI strategies and understand AI's impact on jobs. Toward the end, you'll immerse yourself in AI-augmented development, crafting intelligent web applications and navigating the ethical landscape.Packed with insights into next-gen development environments, AI-augmented practices, emerging realities, interfaces, and security governance, this web development book acts as your roadmap to staying ahead in the AI and web development domain. What you will learn Build AI-powered web projects with optimized models Personalize UX dynamically with AI, NLP, chatbots, and recommendations Explore AI coding assistants and other tools for advanced web development Craft data-driven, personalized experiences using pattern recognition Architect effective AI solutions while exploring the future of web development Build secure and ethical AI applications following TRiSM best practices Explore cutting-edge AI and web development trends Who this book is for This book is for web developers with experience in programming languages and an interest in keeping up with the latest trends in AI-powered web development. Full-stack, front-end, and back-end developers, UI/UX designers, software engineers, and web development enthusiasts will also find valuable information and practical guidelines for developing smarter websites with AI. To get the most out of this book, it is recommended that you have basic knowledge of programming languages such as HTML, CSS, and JavaScript, as well as a familiarity with machine learning concepts.
Read more at https://play.google.com/store/books/details?id=FzYZEQAAQBAJ&source=gbs_api

3. "Artificial Intelligence for Students" by Vibha Pandey: A multifaceted approach to develop an understanding of AI and its potential applications KEY FEATURES ● AI-informed focuses on AI foundation, applications, and methodologies. ● AI-inquired focuses on computational thinking and bias awareness. ● AI-innovate focuses on creative and critical thinking and the Capstone project. DESCRIPTION AI is a discipline in Computer Science that focuses on developing intelligent machines, machines that can learn and then teach themselves. If you are interested in AI, this book can definitely help you prepare for future careers in AI and related fields. The book is aligned with the CBSE course, which focuses on developing employability and vocational competencies of students in skill subjects. The book is an introduction to the basics of AI. It is divided into three parts – AI-informed, AI-inquired and AI-innovate. It will help you understand AI's implications on society and the world. You will also develop a deeper understanding of how it works and how it can be used to solve complex real-world problems. Additionally, the book will also focus on important skills such as problem scoping, goal setting, data analysis, and visualization, which are essential for success in AI projects. Lastly, you will learn how decision trees, neural networks, and other AI concepts are commonly used in real-world applications. By the end of the book, you will develop the skills and competencies required to pursue a career in AI. WHAT YOU WILL LEARN ● Get familiar with the basics of AI and Machine Learning. ● Understand how and where AI can be applied. ● Explore different applications of mathematical methods in AI. ● Get tips for improving your skills in Data Storytelling. ● Understand what is AI bias and how it can affect human rights. WHO THIS BOOK IS FOR This book is for CBSE class XI and XII students who want to learn and explore more about AI. Basic knowledge of Statistical concepts, Algebra, and Plotting of equations is a must. TABLE OF CONTENTS 1. Introduction: AI for Everyone 2. AI Applications and Methodologies 3. Mathematics in Artificial Intelligence 4. AI Values (Ethical Decision-Making) 5. Introduction to Storytelling 6. Critical and Creative Thinking 7. Data Analysis 8. Regression 9. Classification and Clustering 10. AI Values (Bias Awareness) 11. Capstone Project 12. Model Lifecycle (Knowledge) 13. Storytelling Through Data 14. AI Applications in Use in Real-World
Read more at https://play.google.com/store/books/details?id=ptq1EAAAQBAJ&source=gbs_api

4. "The AI Book" by Ivana Bartoletti, Anne Leslie and Shân M. Millie: Written by prominent thought leaders in the global fintech space, The AI Book aggregates diverse expertise into a single, informative volume and explains what artifical intelligence really means and how it can be used across financial services today. Key industry developments are explained in detail, and critical insights from cutting-edge practitioners offer first-hand information and lessons learned. Coverage includes: · Understanding the AI Portfolio: from machine learning to chatbots, to natural language processing (NLP); a deep dive into the Machine Intelligence Landscape; essentials on core technologies, rethinking enterprise, rethinking industries, rethinking humans; quantum computing and next-generation AI · AI experimentation and embedded usage, and the change in business model, value proposition, organisation, customer and co-worker experiences in today’s Financial Services Industry · The future state of financial services and capital markets – what’s next for the real-world implementation of AITech? · The innovating customer – users are not waiting for the financial services industry to work out how AI can re-shape their sector, profitability and competitiveness · Boardroom issues created and magnified by AI trends, including conduct, regulation & oversight in an algo-driven world, cybersecurity, diversity & inclusion, data privacy, the ‘unbundled corporation’ & the future of work, social responsibility, sustainability, and the new leadership imperatives · Ethical considerations of deploying Al solutions and why explainable Al is so important
Read more at http://books.google.ca/books?id=oE3YDwAAQBAJ&dq=ai&hl=&source=gbs_api

5. "Artificial Intelligence in Society" by OECD: The artificial intelligence (AI) landscape has evolved significantly from 1950 when Alan Turing first posed the question of whether machines can think. Today, AI is transforming societies and economies. It promises to generate productivity gains, improve well-being and help address global challenges, such as climate change, resource scarcity and health crises.
Read more at https://play.google.com/store/books/details?id=eRmdDwAAQBAJ&source=gbs_api
```

## Issue 

This closes #27276 

## Dependencies

No additional dependencies were added

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-07 15:29:35 -08:00
Massimiliano Pronesti
be3b7f9bae cookbook: add Anthropic's contextual retrieval (#27898)
Hi there, this PR adds a notebook implementing Anthropic's proposed
[Contextual
retrieval](https://www.anthropic.com/news/contextual-retrieval) to
langchain's cookbook.
2024-11-07 14:48:01 -08:00
Erick Friis
733e43eed0 docs: new stack diagram (#27972) 2024-11-07 22:46:56 +00:00
Erick Friis
a073c4c498 templates,docs: leave templates in v0.2 (#27952)
all template installs will now have to declare `--branch v0.2` to make
clear they aren't compatible with langchain 0.3 (most have a pydantic v1
setup). e.g.

```
langchain-cli app add pirate-speak --branch v0.2
```
2024-11-07 22:23:48 +00:00
Erick Friis
8807e6986c docs: ignore case production fork master (#27971) 2024-11-07 13:55:21 -08:00
Shawn Lee
6f368e9eab community: handle chatdeepinfra jsondecode error (#27603)
Fixes #27602 

Added error handling to return empty dict if args is empty string or
None.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-07 13:47:19 -08:00
CLOVA Studio 개발
0588bab33e community: fix ClovaXEmbeddings document API link address (#27957)
- **Description:** 404 error occurs because `API reference` link address
path is incorrect on
`langchain/docs/docs/integrations/text_embedding/naver.ipynb`
- **Issue:** fix `API reference` link address correct path.

@vbarda @efriis
2024-11-07 13:46:01 -08:00
Akshata
05fd6a16a9 Add ChatModels wrapper for Cloudflare Workers AI (#27645)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: chat models wrapper for Cloudflare
Workers AI"


- [x] **PR message**:
- **Description:** Add chat models wrapper for Cloudflare Workers AI.
Enables Langgraph intergration via ChatModel for tool usage, agentic
usage.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-11-07 15:34:24 -05:00
Erick Friis
8a5b9bf2ad box: migrate to repo (#27969) 2024-11-07 10:19:22 -08:00
ccurme
1ad49957f5 docs[patch]: update cassettes for sql/csv notebook (#27966) 2024-11-07 11:48:45 -05:00
ccurme
a747dbd24b anthropic[patch]: remove retired model from tests (#27965)
`claude-instant` was [retired
yesterday](https://docs.anthropic.com/en/docs/resources/model-deprecations).
2024-11-07 16:16:29 +00:00
Aksel Joonas Reedi
2cb39270ec community: bytes as a source to AzureAIDocumentIntelligenceLoader (#26618)
- **Description:** This PR adds functionality to pass in in-memory bytes
as a source to `AzureAIDocumentIntelligenceLoader`.
- **Issue:** I needed the functionality, so I added it.
- **Dependencies:** NA
- **Twitter handle:** @akseljoonas if this is a big enough change :)

---------

Co-authored-by: Aksel Joonas Reedi <aksel@klippa.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-07 03:40:21 +00:00
Martin Triska
7a9149f5dd community: ZeroxPDFLoader (#27800)
# OCR-based PDF loader

This implements [Zerox](https://github.com/getomni-ai/zerox) PDF
document loader.
Zerox utilizes simple but very powerful (even though slower and more
costly) approach to parsing PDF documents: it converts PDF to series of
images and passes it to a vision model requesting the contents in
markdown.

It is especially suitable for complex PDFs that are not parsed well by
other alternatives.

## Example use:
```python
from langchain_community.document_loaders.pdf import ZeroxPDFLoader

os.environ["OPENAI_API_KEY"] = "" ## your-api-key

model = "gpt-4o-mini" ## openai model
pdf_url = "https://assets.ctfassets.net/f1df9zr7wr1a/soP1fjvG1Wu66HJhu3FBS/034d6ca48edb119ae77dec5ce01a8612/OpenAI_Sacra_Teardown.pdf"

loader = ZeroxPDFLoader(file_path=pdf_url, model=model)
docs = loader.load()
```

The Zerox library supports wide range of provides/models. See Zerox
documentation for details.

- **Dependencies:** `zerox`
- **Twitter handle:** @martintriska1

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-11-07 03:14:57 +00:00
Dmitriy Prokopchuk
53b0a99f37 community: Memcached LLM Cache Integration (#27323)
## Description
This PR adds support for Memcached as a usable LLM model cache by adding
the ```MemcachedCache``` implementation relying on the
[pymemcache](https://github.com/pinterest/pymemcache) client.

Unit test-wise, the new integration is generally covered under existing
import testing. All new functionality depends on pymemcache if
instantiated and used, so to comply with the other cache implementations
the PR also adds optional integration tests for ```MemcachedCache```.

Since this is a new integration, documentation is added for Memcached as
an integration and as an LLM Cache.

## Issue
This PR closes #27275 which was originally raised as a discussion in
#27035

## Dependencies
There are no new required dependencies for langchain, but
[pymemcache](https://github.com/pinterest/pymemcache) is required to
instantiate the new ```MemcachedCache```.

## Example Usage
```python3
from langchain.globals import set_llm_cache
from langchain_openai import OpenAI

from langchain_community.cache import MemcachedCache
from pymemcache.client.base import Client

llm = OpenAI(model="gpt-3.5-turbo-instruct", n=2, best_of=2)
set_llm_cache(MemcachedCache(Client('localhost')))

# The first time, it is not yet in cache, so it should take longer
llm.invoke("Which city is the most crowded city in the USA?")

# The second time it is, so it goes faster
llm.invoke("Which city is the most crowded city in the USA?")
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-07 03:07:59 +00:00
Siddharth Murching
cfff2a057e community: Update UC toolkit documentation to use LangGraph APIs (#26778)
- **Description:** Update UC toolkit documentation to show an example of
using recommended LangGraph agent APIs before the existing LangChain
AgentExecutor example. Tested by manually running the updated example
notebook
- **Dependencies:** No new dependencies

---------

Signed-off-by: Sid Murching <sid.murching@databricks.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-07 02:47:41 +00:00
ZhangShenao
c2072d909a Improvement[Partner] Improve qdrant vector store (#27251)
- Add static method decorator
- Add args for api doc
- Fix word spelling

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-07 02:42:41 +00:00
Baptiste Pasquier
81f7daa458 community: add InfinityRerank (#27043)
**Description:** 

- Add a Reranker for Infinity server.

**Dependencies:** 

This wrapper uses
[infinity_client](https://github.com/michaelfeil/infinity/tree/main/libs/client_infinity/infinity_client)
to connect to an Infinity server.

**Tests and docs**

- integration test: test_infinity_rerank.py
- example notebook: infinity_rerank.ipynb
[here](https://github.com/baptiste-pasquier/langchain/blob/feat/infinity-rerank/docs/docs/integrations/document_transformers/infinity_rerank.ipynb)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-06 17:26:30 -08:00
Erick Friis
2494deb2a4 infra: remove google creds from release and integration test workflows (#27950) 2024-11-07 00:31:10 +00:00
Martin Triska
90189f5639 community: Allow other than default parsers in SharePointLoader and OneDriveLoader (#27716)
## What this PR does?

### Currently `O365BaseLoader` (and consequently both derived loaders)
are limited to `pdf`, `doc`, `docx` files.
- **Solution: here we introduce _handlers_ attribute that allows for
custom handlers to be passed in. This is done in _dict_ form:**

**Example:**
```python
from langchain_community.document_loaders.parsers.documentloader_adapter import DocumentLoaderAsParser
# PR for DocumentLoaderAsParser here: https://github.com/langchain-ai/langchain/pull/27749
from langchain_community.document_loaders.excel import UnstructuredExcelLoader

xlsx_parser = DocumentLoaderAsParser(UnstructuredExcelLoader, mode="paged")

# create dictionary mapping file types to handlers (parsers)
handlers = {
    "doc": MsWordParser()
    "pdf": PDFMinerParser()
    "txt": TextParser()
    "xlsx": xlsx_parser
}
loader = SharePointLoader(document_library_id="...",
                            handlers=handlers # pass handlers to SharePointLoader
                            )
documents = loader.load()

# works the same in OneDriveLoader
loader = OneDriveLoader(document_library_id="...",
                            handlers=handlers
                            )
```
This dictionary is then passed to `MimeTypeBasedParser` same as in the
[current
implementation](5a2cfb49e0/libs/community/langchain_community/document_loaders/parsers/registry.py (L13)).


### Currently `SharePointLoader` and `OneDriveLoader` are separate
loaders that both inherit from `O365BaseLoader`
However both of these implement the same functionality. The only
differences are:
- `SharePointLoader` requires argument `document_library_id` whereas
`OneDriveLoader` requires `drive_id`. These are just different names for
the same thing.
  - `SharePointLoader` implements significantly more features.
- **Solution: `OneDriveLoader` is replaced with an empty shell just
renaming `drive_id` to `document_library_id` and inheriting from
`SharePointLoader`**

**Dependencies:** None
**Twitter handle:** @martintriska1

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-11-06 17:44:34 -05:00
takahashi
482c168b3e langchain_core: add file_type option to make file type default as png (#27855)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"

- [ ] **description**
langchain_core.runnables.graph_mermaid.draw_mermaid_png calls this
function, but the Mermaid API returns JPEG by default. To be consistent,
add the option `file_type` with the default `png` type.

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
With this small change, I didn't add tests and docs.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more:
One long sentence was divided into two.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-11-06 22:37:07 +00:00
Roman Solomatin
0f85dea8c8 langchain-huggingface: use separate kwargs for queries and docs (#27857)
Now `encode_kwargs` used for both for documents and queries and this
leads to wrong embeddings. E. g.:
```python
    model_kwargs = {"device": "cuda", "trust_remote_code": True}
    encode_kwargs = {"normalize_embeddings": False, "prompt_name": "s2p_query"}

    model = HuggingFaceEmbeddings(
        model_name="dunzhang/stella_en_400M_v5",
        model_kwargs=model_kwargs,
        encode_kwargs=encode_kwargs,
    )

    query_embedding = np.array(
        model.embed_query("What are some ways to reduce stress?",)
    )
    document_embedding = np.array(
        model.embed_documents(
            [
                "There are many effective ways to reduce stress. Some common techniques include deep breathing, meditation, and physical activity. Engaging in hobbies, spending time in nature, and connecting with loved ones can also help alleviate stress. Additionally, setting boundaries, practicing self-care, and learning to say no can prevent stress from building up.",
                "Green tea has been consumed for centuries and is known for its potential health benefits. It contains antioxidants that may help protect the body against damage caused by free radicals. Regular consumption of green tea has been associated with improved heart health, enhanced cognitive function, and a reduced risk of certain types of cancer. The polyphenols in green tea may also have anti-inflammatory and weight loss properties.",
            ]
        )
    )
    print(model._client.similarity(query_embedding, document_embedding)) # output: tensor([[0.8421, 0.3317]], dtype=torch.float64)
```
But from the [model
card](https://huggingface.co/dunzhang/stella_en_400M_v5#sentence-transformers)
expexted like this:
```python
    model_kwargs = {"device": "cuda", "trust_remote_code": True}
    encode_kwargs = {"normalize_embeddings": False}
    query_encode_kwargs = {"normalize_embeddings": False, "prompt_name": "s2p_query"}

    model = HuggingFaceEmbeddings(
        model_name="dunzhang/stella_en_400M_v5",
        model_kwargs=model_kwargs,
        encode_kwargs=encode_kwargs,
        query_encode_kwargs=query_encode_kwargs,
    )

    query_embedding = np.array(
        model.embed_query("What are some ways to reduce stress?", )
    )
    document_embedding = np.array(
        model.embed_documents(
            [
                "There are many effective ways to reduce stress. Some common techniques include deep breathing, meditation, and physical activity. Engaging in hobbies, spending time in nature, and connecting with loved ones can also help alleviate stress. Additionally, setting boundaries, practicing self-care, and learning to say no can prevent stress from building up.",
                "Green tea has been consumed for centuries and is known for its potential health benefits. It contains antioxidants that may help protect the body against damage caused by free radicals. Regular consumption of green tea has been associated with improved heart health, enhanced cognitive function, and a reduced risk of certain types of cancer. The polyphenols in green tea may also have anti-inflammatory and weight loss properties.",
            ]
        )
    )
    print(model._client.similarity(query_embedding, document_embedding)) # tensor([[0.8398, 0.2990]], dtype=torch.float64)
```
2024-11-06 17:35:39 -05:00
Bagatur
60123bef67 docs: fix trim_messages docstring (#27948) 2024-11-06 22:25:13 +00:00
murrlincoln
14f1827953 docs: Adding notebook for cdp agentkit toolkit (#27910)
- **Description:** Adding in the first pass of documentation for the CDP
Agentkit Toolkit
    - **Issue:** N/a
    - **Dependencies:** cdp-langchain
    - **Twitter handle:** @CoinbaseDev

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: John Peterson <john.peterson@coinbase.com>
2024-11-06 13:28:27 -08:00
Eric Pinzur
ea0ad917b0 community: added Document.id support to opensearch vectorstore (#27945)
Description:
* Added support of Document.id on OpenSearch vector store
* Added tests cases to match
2024-11-06 15:04:09 -05:00
Hammad Randhawa
75aa82fedc docs: Completed sentence under the heading "Instantiating a Browser … (#27944)
…Toolkit" in "playwright.ipynb" integration.

- Completed the incomplete sentence in the Langchain Playwright
documentation.

- Enhanced documentation clarity to guide users on best practices for
instantiating browser instances with Langchain Playwright.

Example before:
> "It's always recommended to instantiate using the from_browser method
so that the

Example after:
> "It's always recommended to instantiate using the `from_browser`
method so that the browser context is properly initialized and managed,
ensuring seamless interaction and resource optimization."

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-06 19:55:00 +00:00
Bagatur
67ce05a0a7 core[patch]: make oai tool description optional (#27756) 2024-11-06 18:06:47 +00:00
Bagatur
b2da3115ed docs: document init_chat_model standard params (#27812) 2024-11-06 09:50:07 -08:00
Dobiichi-Origami
395674d503 community: re-arrange function call message parse logic for Qianfan (#27935)
the [PR](https://github.com/langchain-ai/langchain/pull/26208) two month
ago has a potential bug which causes malfunction of `tool_call` for
`QianfanChatEndpoint` waiting for fix
2024-11-06 09:58:16 -05:00
Erick Friis
41b7a5169d infra: starter codeowners file (#27929) 2024-11-05 16:43:11 -08:00
ccurme
66966a6e72 openai[patch]: release 0.2.6 (#27924)
Some additions in support of [predicted
outputs](https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs)
feature:
- Bump openai sdk version
- Add integration test
- Add example to integration docs

The `prediction` kwarg is already plumbed through model invocation.
2024-11-05 23:02:24 +00:00
Erick Friis
a8c473e114 standard-tests: ci pipeline (#27923) 2024-11-05 20:55:38 +00:00
Erick Friis
c3b75560dc infra: release note grep order of operations (#27922) 2024-11-05 12:44:36 -08:00
Erick Friis
b3c81356ca infra: release note compute 2 (#27921) 2024-11-05 12:04:41 -08:00
Erick Friis
bff2a8b772 standard-tests: add tools standard tests (#27899) 2024-11-05 11:44:34 -08:00
SHJUN
f6b2f82099 community: chroma error patch(attribute changed on chroma) (#27827)
There was a change of attribute name which was "max_batch_size". It's
now "get_max_batch_size" method.
I want to use "create_batches" which is right down below.

Please check this PR link.
reference: https://github.com/chroma-core/chroma/pull/2305

---------

Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>
Co-authored-by: Prithvi Kannan <46332835+prithvikannan@users.noreply.github.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Jun Yamog <jkyamog@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ono-hiroki <86904208+ono-hiroki@users.noreply.github.com>
Co-authored-by: Dobiichi-Origami <56953648+Dobiichi-Origami@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Duy Huynh <vndee.huynh@gmail.com>
Co-authored-by: Rashmi Pawar <168514198+raspawar@users.noreply.github.com>
Co-authored-by: sifatj <26035630+sifatj@users.noreply.github.com>
Co-authored-by: Eric Pinzur <2641606+epinzur@users.noreply.github.com>
Co-authored-by: Daniel Vu Dao <danielvdao@users.noreply.github.com>
Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: Stéphane Philippart <wildagsx@gmail.com>
2024-11-05 19:43:11 +00:00
Tomaz Bratanic
a3bbbe6a86 update llm graph transformer documentation (#27905) 2024-11-05 11:54:26 -05:00
Erick Friis
31f4fb790d standard-tests: release 0.3.0 (#27900) 2024-11-04 17:29:15 -08:00
Erick Friis
ba5cba04ff infra: get min versions (#27896) 2024-11-04 23:46:13 +00:00
Bagatur
6973f7214f docs: sidebar capitalization (#27894) 2024-11-04 22:09:32 +00:00
Stéphane Philippart
4b8cd7a09a community: Use new OVHcloud batch embedding (#26209)
- **Description:** change to do the batch embedding server side and not
client side
- **Twitter handle:** @wildagsx

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-11-04 16:40:30 -05:00
Erick Friis
a54f390090 infra: fix prev tag output (#27892) 2024-11-04 12:46:23 -08:00
Erick Friis
75f80c2910 infra: fix prev tag condition (#27891) 2024-11-04 12:42:22 -08:00
Ofer Mendelevitch
d7c39e6dbb community: update Vectara integration (#27869)
Thank you for contributing to LangChain!

- **Description:** Updated Vectara integration
- **Issue:** refresh on descriptions across all demos and added UDF
reranker
- **Dependencies:** None
- **Twitter handle:** @ofermend

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-04 20:40:39 +00:00
Erick Friis
14a71a6e77 infra: fix prev tag calculation (#27890) 2024-11-04 12:38:39 -08:00
Daniel Vu Dao
5745f3bf78 docs: Update messages.mdx (#27856)
### Description
Updates phrasing for the header of the `Messages` section.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-04 20:36:31 +00:00
sifatj
e02a5ee03e docs: Update VectorStore as_retriever method url in qa_chat_history_how_to.ipynb (#27844)
**Description**: Update VectorStore `as_retriever` method api reference
url in `qa_chat_history_how_to.ipynb`

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-04 20:34:50 +00:00
sifatj
dd1711f3c2 docs: Update max_marginal_relevance_search api reference url in multi_vector.ipynb (#27843)
**Description**: Update VectorStore `max_marginal_relevance_search` api
reference url in `multi_vector.ipynb`

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-04 20:31:36 +00:00
sifatj
aa1f46a03a docs: Update VectorStore .as_retriever method url in vectorstore_retriever.ipynb (#27842)
**Description**: Update VectorStore `.as_retriever` method url in
`vectorstore_retriever.ipynb`

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-04 20:28:11 +00:00
Eric Pinzur
8eb38622a6 community: fixed bug in GraphVectorStoreRetriever (#27846)
Description:

This fixes an issue that mistakenly created in
https://github.com/langchain-ai/langchain/pull/27253. The issue
currently exists only in `langchain-community==0.3.4`.

Test cases were added to prevent this issue in the future.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-04 20:27:17 +00:00
sifatj
eecf95df9b docs: Update VectorStore api reference url in rag.ipynb (#27841)
**Description**: Update VectorStore api reference url in `rag.ipynb`

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-04 20:27:03 +00:00
sifatj
50563400fb docs: Update broken vectorstore urls in retrievers.ipynb (#27838)
**Description**: Update outdated `VectorStore` api reference urls in
`retrievers.ipynb`

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-04 20:26:03 +00:00
Bagatur
dfa83531ad qdrant,nomic[minor]: bump core deps (#27849) 2024-11-04 20:19:50 +00:00
Erick Friis
4e5cc84d40 infra: release tag compute (#27836) 2024-11-04 12:16:51 -08:00
Rashmi Pawar
f86a09f82c Add nvidia as provider for embedding, llm (#27810)
Documentation: Add NVIDIA as integration provider

cc: @mattf @dglogo

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-11-04 19:45:51 +00:00
Erick Friis
0c62684ce1 Revert "infra: add neo4j to package list" (#27887)
Reverts langchain-ai/langchain#27833

Wait for release
2024-11-04 18:18:38 +00:00
Erick Friis
bcf499df16 infra: add neo4j to package list (#27833) 2024-11-04 09:24:04 -08:00
Duy Huynh
a487ec47f4 community: set default output_token_limit value for PowerBIToolkit to fix validation error (#26308)
### Description:
This PR sets a default value of `output_token_limit = 4000` for the
`PowerBIToolkit` to fix the unintentionally validation error.

### Problem:
When attempting to run a code snippet from [Langchain's PowerBI toolkit
documentation](https://python.langchain.com/v0.1/docs/integrations/toolkits/powerbi/)
to interact with a `PowerBIDataset`, the following error occurs:

```
pydantic.v1.error_wrappers.ValidationError: 1 validation error for QueryPowerBITool
output_token_limit
  none is not an allowed value (type=type_error.none.not_allowed)
```

### Root Cause:
The issue arises because when creating a `QueryPowerBITool`, the
`output_token_limit` parameter is unintentionally set to `None`, which
is the current default for `PowerBIToolkit`. However, `QueryPowerBITool`
expects a default value of `4000` for `output_token_limit`. This
unintended override causes the error.


17659ca2cd/libs/community/langchain_community/agent_toolkits/powerbi/toolkit.py (L63)

17659ca2cd/libs/community/langchain_community/agent_toolkits/powerbi/toolkit.py (L72-L79)

17659ca2cd/libs/community/langchain_community/tools/powerbi/tool.py (L39)

### Solution:
To resolve this, the default value of `output_token_limit` is now
explicitly set to `4000` in `PowerBIToolkit` to prevent the accidental
assignment of `None`.

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-11-04 14:34:27 +00:00
Dobiichi-Origami
f7ced5b211 community: read function call from tool_calls for Qianfan (#26208)
I added one more 'elif' to read tool call message from `tool_calls`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-11-04 14:33:32 +00:00
ono-hiroki
b7d549ae88 docs: fix undefined 'data' variable in document_loader_csv.ipynb (#27872)
**Description:** 
This PR addresses an issue in the CSVLoader example where data is not
defined, causing a NameError. The line `data = loader.load()` is added
to correctly assign the output of loader.load() to the data variable.
2024-11-04 14:10:56 +00:00
Bagatur
3b0b7cfb74 chroma[minor]: release 0.2.0 (#27840) 2024-11-01 18:12:00 -07:00
Jun Yamog
830cad7bc0 core: fix CommaSeparatedListOutputParser to handle columns that may contain commas in it (#26365)
- **Description:**
Currently CommaSeparatedListOutputParser can't handle strings that may
contain commas within a column. It would parse any commas as the
delimiter.
Ex. 
"foo, foo2", "bar", "baz"

It will create 4 columns: "foo", "foo2", "bar", "baz"

This should be 3 columns:

"foo, foo2", "bar", "baz"

- **Dependencies:**
Added 2 additional imports, but they are built in python packages.

import csv
from io import StringIO

- **Twitter handle:** @jkyamog

- [ ] **Add tests and docs**: 
1. added simple unit test test_multiple_items_with_comma

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-11-01 22:42:24 +00:00
Erick Friis
9fedb04dd3 docs: INVALID_CHAT_HISTORY redirect (#27845) 2024-11-01 21:35:11 +00:00
Erick Friis
03a3670a5e infra: remove some special cases (#27839) 2024-11-01 21:13:43 +00:00
Bagatur
002e1c9055 airbyte: remove from master (#27837) 2024-11-01 13:59:34 -07:00
Bagatur
ee63d21915 many: use core 0.3.15 (#27834) 2024-11-01 20:35:55 +00:00
Prithvi Kannan
c3c638cd7b docs: Reference new databricks-langchain package (#27828)
Thank you for contributing to LangChain!

Update references in Databricks integration page to reference our new
partner package databricks-langchain
https://github.com/databricks/databricks-ai-bridge/tree/main/integrations/langchain

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>
2024-11-01 10:21:19 -07:00
sifatj
33d445550e docs: update VectorStore api reference url in retrievers.ipynb (#27814)
**Description:** Update outdated `VectorStore` api reference url in
Vector store subsection of `retrievers.ipynb`
2024-11-01 15:44:26 +00:00
sifatj
9a4a630e40 docs: Update Retrievers and Runnable links in Retrievers subsection of retrievers.ipynb (#27815)
**Description:** Update outdated links for `Retrievers` and `Runnable`
in Retrievers subsection of `retrievers.ipynb`
2024-11-01 15:42:30 +00:00
Zapiron
b0dfff4cd5 Fixed broken link for TokenTextSplitter (#27824)
Fixed the broken redirect link for `TokenTextSplitter` section
2024-11-01 11:32:07 -04:00
William FH
b4cb2089a2 langchain[patch]: Add warning in react agent (#26980) 2024-10-31 22:29:34 +00:00
Eugene Yurtsev
2f6254605d docs: fix more links (#27809)
Fix more broken links
2024-10-31 17:15:46 -04:00
Ant White
e3ea365725 core: use friendlier names for duplicated nodes in mermaid output (#27747)
Thank you for contributing to LangChain!

- [x] **PR title**: "core: use friendlier names for duplicated nodes in
mermaid output"

- **Description:** When generating the Mermaid visualization of a chain,
if the chain had multiple nodes of the same type, the reid function
would replace their names with the UUID node_id. This made the generated
graph difficult to understand. This change deduplicates the nodes in a
chain by appending an index to their names.
- **Issue:** None
- **Discussion:**
https://github.com/langchain-ai/langchain/discussions/27714
- **Dependencies:** None

- [ ] **Add tests and docs**:  
- Currently this functionality is not covered by unit tests, happy to
add tests if you'd like


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

# Example Code:
```python
from langchain_core.runnables import RunnablePassthrough

def fake_llm(prompt: str) -> str: # Fake LLM for the example
    return "completion"

runnable = {
    'llm1':  fake_llm,
    'llm2':  fake_llm,
} | RunnablePassthrough.assign(
    total_chars=lambda inputs: len(inputs['llm1'] + inputs['llm2'])
)

print(runnable.get_graph().draw_mermaid(with_styles=False))
```

# Before
```mermaid
graph TD;
	Parallel_llm1_llm2_Input --> 0b01139db5ed4587ad37964e3a40c0ec;
	0b01139db5ed4587ad37964e3a40c0ec --> Parallel_llm1_llm2_Output;
	Parallel_llm1_llm2_Input --> a98d4b56bd294156a651230b9293347f;
	a98d4b56bd294156a651230b9293347f --> Parallel_llm1_llm2_Output;
	Parallel_total_chars_Input --> Lambda;
	Lambda --> Parallel_total_chars_Output;
	Parallel_total_chars_Input --> Passthrough;
	Passthrough --> Parallel_total_chars_Output;
	Parallel_llm1_llm2_Output --> Parallel_total_chars_Input;
```

# After
```mermaid
graph TD;
	Parallel_llm1_llm2_Input --> fake_llm_1;
	fake_llm_1 --> Parallel_llm1_llm2_Output;
	Parallel_llm1_llm2_Input --> fake_llm_2;
	fake_llm_2 --> Parallel_llm1_llm2_Output;
	Parallel_total_chars_Input --> Lambda;
	Lambda --> Parallel_total_chars_Output;
	Parallel_total_chars_Input --> Passthrough;
	Passthrough --> Parallel_total_chars_Output;
	Parallel_llm1_llm2_Output --> Parallel_total_chars_Input;
```
2024-10-31 16:52:00 -04:00
Eugene Yurtsev
71f590de50 docs: fix more broken links (#27806)
Fix some broken links
2024-10-31 19:46:39 +00:00
Neli Hateva
c572d663f9 docs: Ontotext GraphDB QA Chain Update Documentation (Fix versions of libraries) (#27783)
- **Description:** Update versions of libraries in the Ontotext GraphDB
QA Chain Documentation
 - **Issue:** N/A
 - **Dependencies:** N/A
 - **Twitter handle:** @OntotextGraphDB
2024-10-31 15:23:16 -04:00
L
8ef0df3539 feat: add batch request support for text-embedding-v3 model (#26375)
PR title: “langchain: add batch request support for text-embedding-v3
model”

PR message:

• Description: This PR introduces batch request support for the
text-embedding-v3 model within LangChain. The new functionality allows
users to process multiple text inputs in a single request, improving
efficiency and performance for high-volume applications.
	•	Issue: This PR addresses #<issue_number> (if applicable).
• Dependencies: No new external dependencies are required for this
change.
• Twitter handle: If announced on Twitter, please mention me at
@yourhandle.

Add tests and docs:

1. Added unit tests to cover the batch request functionality, ensuring
it operates without requiring network access.
2. Included an example notebook demonstrating the batch request feature,
located in docs/docs/integrations.

Lint and test: All required formatting and linting checks have been
performed using make format and make lint. The changes have been
verified with make test to ensure compatibility.

Additional notes:

	•	The changes are fully backwards compatible.
• No modifications were made to pyproject.toml, ensuring no new
dependencies were added.
• The update only affects the langchain package and does not involve
other packages.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-31 18:56:22 +00:00
putao520
2545fbe709 fix "WARNING: Received notification from DBMS server: {severity: WARN… (#27112)
…ING} {code: Neo.ClientNotification.Statement.FeatureDeprecationWarning}
{category: DEPRECATION} {title: This feature is deprecated and will be
removed in future versions.} {description: CALL subquery without a
variable scope clause is now deprecated." this warning

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: putao520 <putao520@putao282.com>
2024-10-31 18:47:25 +00:00
Ankan Mahapatra
905f43377b Update word_document.py | Fixed metadata["source"] for web paths (#27220)
The metadata["source"] value for the web paths was being set to
temporary path (/tmp).

Fixed it by creating a new variable self.original_file_path, which will
store the original path.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-10-31 18:37:41 +00:00
Daniel Birn
389771ccc0 community: fix @embeddingKey in azure cosmos db no sql (#27377)
I will keep this PR as small as the changes made.

**Description:** fixes a fatal bug syntax error in
AzureCosmosDBNoSqlVectorSearch
**Issue:** #27269 #25468
2024-10-31 18:36:02 +00:00
Bagatur
06420de2e7 integrations[patch]: bump core to 0.3.15 (#27805) 2024-10-31 11:27:05 -07:00
Erick Friis
54cb80c778 docs: experimental case, use yq action (#27798) 2024-10-31 11:21:48 -07:00
W. Gustavo Cevallos
f94125a325 community: Update Polygon.io API (#27552)
**Description:** 
Update the wrapper to support the Polygon API if not you get an error. I
keeped `STOCKBUSINESS` for retro-compatbility with older endpoints /
other uses
Old Code:
```
 if status not in ("OK", "STOCKBUSINESS"):
    raise ValueError(f"API Error: {data}")

```
API Respond:
```
API Error: {'results': {'P': 0.22, 'S': 0, 'T': 'ZOM', 'X': 5, 'p': 0.123, 'q': 0, 's': 200, 't': 1729614422813395456, 'x': 1, 'z': 1}, 'status': 'STOCKSBUSINESS', 'request_id': 'XXXXXX'}
```

- **Issue:** N/A Polygon API update
- **Dependencies:** N/A
- **Twitter handle:** @wgcv

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-10-31 18:14:06 +00:00
Wang
621f78babd community: [fix] add missing tool_calls kwargs of delta message in openai adapter (#27492)
- **Description:** add missing tool_calls kwargs of delta message in
openai adapter, then tool call will work correctly via adapter's stream
chat completion
- **Issue:** Fixes
https://github.com/langchain-ai/langchain/issues/25436
- **Dependencies:** None
2024-10-31 14:07:17 -04:00
Tao Wang
25a1031871 community: Fix a validation error for MoonshotChat (#27801)
- **Description:** Change `MoonshotCommon.client` type from
`_MoonshotClient` to `Any`.
- **Issue:** Fix the issue #27058
- **Dependencies:** No
- **Twitter handle:** TaoWang2218

In PR #17100, the implementation for Moonshot was added, which defined
two classes:

- `MoonshotChat(MoonshotCommon, ChatOpenAI)` in
`langchain_community.chat_models.moonshot`;
- Here, `validate_environment()` assigns **client** as
`openai.OpenAI().chat.completions`
- Note that **client** here is actually a member variable defined in
`ChatOpenAI`;
- `MoonshotCommon` in `langchain_community.llms.moonshot`;
- And here, `validate_environment()` assigns **_client** as
`_MoonshotClient`;
- Note that this is the underscored **_client**, which is defined within
`MoonshotCommon` itself;

At this time, there was no conflict between the two, one being `client`
and the other `_client`.

However, in PR #25878 which fixed #24390, `_client` in `MoonshotCommon`
was changed to `client`. Since then, a conflict in the definition of
`client` has arisen between `MoonshotCommon` and `MoonshotChat`, which
caused `pydantic` validation error.

To fix this issue, the type of `client` in `MoonshotCommon` should be
changed to `Any`.

Signed-off-by: Tao Wang <twang2218@gmail.com>
2024-10-31 14:00:16 -04:00
Bagatur
e4e2aa0b78 core[patch]: update image util err msg (#27803) 2024-10-31 10:56:43 -07:00
Bagatur
181bcd0577 core[patch]: Release 0.3.15 (#27802) 2024-10-31 10:35:02 -07:00
Bagatur
c1e742347f core[patch]: rm image loading (#27797) 2024-10-31 10:34:51 -07:00
ZhangShenao
ad0387ac97 Improvement [docs] Improve api docs (#27787)
- Add missing param
- Remove unused param

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-31 16:56:44 +00:00
Changyong Um
d9163e7afa community[docs]: Add content for the Lora adapter in the VLLM page. (#27788)
**Description:**
I added code for lora_request in the community package, but I forgot to
add content to the VLLM page. So, I will do that now. #27731

---------

Co-authored-by: Um Changyong <changyong.um@sfa.co.kr>
2024-10-31 12:44:35 -04:00
ccurme
0172d938b4 community: add AzureOpenAIWhisperParser (#27796)
Commandeered from https://github.com/langchain-ai/langchain/pull/26757.

---------

Co-authored-by: Sheepsta300 <128811766+Sheepsta300@users.noreply.github.com>
2024-10-31 12:37:41 -04:00
ccurme
b631b0a596 community[patch]: cap SQLAlchemy and update deps (#27792)
SQLAlchemy 2.0.36 introduces a regression when creating a table in
DuckDB.

Relevant issues:
- In SQLAlchemy repo (resolution is to update DuckDB):
https://github.com/sqlalchemy/sqlalchemy/discussions/12011
- In DuckDB repo (PR is open):
https://github.com/Mause/duckdb_engine/issues/1128

Plan is to track these issues and remove cap when resolved.
2024-10-31 14:19:09 +00:00
Erick Friis
8ad7adad87 infra: build api docs from package listing (#27774) 2024-10-30 21:31:01 -07:00
JiaranI
3952ee31b8 ollama: add pydocstyle linting for ollama (#27686)
Description: add lint docstrings for ollama module
Issue: the issue https://github.com/langchain-ai/langchain/issues/23188
@baskaryan

test: ruff check passed.
<img width="311" alt="e94c68ffa93dd518297a95a93de5217"
src="https://github.com/user-attachments/assets/e96bf721-e0e3-44de-a50e-206603de398e">

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-31 03:06:55 +00:00
Aayush Kataria
a8a33b2dc6 LangChain-Community - AzureCosmos Mongo vCore: Bug Fix when the data doesn't contain metadata field (#27772)
Thank you for contributing to LangChain!
- **Description:** Adding an empty metadata field when metadata is not
present in the data
- **Issue:** This PR fixes the issue when the data items doesn't contain
the metadata field. This happens when there is already data in the
container, or cx uses CosmosDB Python SDK to insert data.
- **Dependencies:** No dependencies required

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-10-30 20:05:25 -07:00
Rave Harpaz
8d8d85379f community: OCI Generative AI tool calling bug fix (#26910)
- [x] **PR title**: 
  "community: OCI Generative AI tool calling bug fix 


- [x] **PR message**: 
- **Description:** bug fix for streaming chat responses with tool calls.
Update to PR 24693
    - **Issue:** chat response content is repeated when streaming
    - **Dependencies:** NA
    - **Twitter handle:** NA


- [x] **Add tests and docs**: NA


- [x] **Lint and test**: make format, make lint and make test we run
successfully

---------

Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-31 02:35:25 +00:00
Erick Friis
128b07208e community: release 0.3.4 (#27769) 2024-10-30 17:48:03 -07:00
Bagatur
6691202998 anthropic[patch]: allow multiple sys not at start (#27725) 2024-10-30 23:56:47 +00:00
Erick Friis
1ed3cd252e langchain: release 0.3.6 (#27768) 2024-10-30 23:50:42 +00:00
Sergey Ryabov
8180637345 community[patch]: Fix Playwright Tools bug with Pydantic schemas (#27050)
- Add tests for Playwright tools schema serialization
- Introduce base empty args Input class for BaseBrowserTool

Test Plan: `poetry run pytest
tests/unit_tests/tools/playwright/test_all.py`

Fixes #26758

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-30 23:45:36 +00:00
Bagatur
92024d0d7d infra: turn off release attestations (#27765) 2024-10-30 15:22:31 -07:00
Bagatur
deb4320d29 core[patch]: Release 0.3.14 (#27764) 2024-10-30 21:47:33 +00:00
Bagatur
5d337326b0 core[patch]: make get_all_basemodel_annotations public (#27761) 2024-10-30 14:43:29 -07:00
Bagatur
94ea950c6c core[patch]: support bedrock converse -> openai tool (#27754) 2024-10-30 12:20:39 -07:00
Lorenzo
3dfdb3e6fb community: prevent gitlab commit on main branch for Gitlab tool (#27750)
### About

- **Description:** In the Gitlab utilities used for the Gitlab tool
there is no check to prevent pushing to the main branch, as this is
already done for Github (for example here:
5a2cfb49e0/libs/community/langchain_community/utilities/github.py (L587)).
This PR add this check as already done for Github.
- **Issue:** None
- **Dependencies:** None
2024-10-30 18:50:13 +00:00
Sam Julien
0a472e2a2d community: Add Writer integration (#27646)
**Description:** Add support for Writer chat models   
**Issue:** N/A
**Dependencies:** Add `writer-sdk` to optional dependencies.
**Twitter handle:** Please tag `@samjulien` and `@Get_Writer`

**Tests and docs**
- [x] Unit test
- [x] Example notebook in `docs/docs/integrations` directory.

**Lint and test**
- [x] Run `make format` 
- [x] Run `make lint`
- [x] Run `make test`

---------

Co-authored-by: Johannes <tolstoy.work@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-30 18:06:05 +00:00
ccurme
595dc592c9 docs: run how-to guides in CI (#27615)
Add how-to guides to [Run notebooks
job](https://github.com/langchain-ai/langchain/actions/workflows/run_notebooks.yml)
and fix existing notebooks.

- As with tutorials, cassettes must be updated when HTTP calls in guides
change (by running existing
[script](https://github.com/langchain-ai/langchain/blob/master/docs/scripts/update_cassettes.sh)).
- Cassettes now total ~62mb over 474 files.
- `docs/scripts/prepare_notebooks_for_ci.py` lists a number of notebooks
that do not run (e.g., due to requiring additional infra, slowness,
requiring `input()`, etc.).
2024-10-30 12:35:38 -04:00
ccurme
88bfd60b03 infra: specify python max version of 3.12 for some integration packages (#27740) 2024-10-30 12:24:48 -04:00
fayvor
3b956b3a97 community: Update Replicate LLM and fix tests (#27655)
**Description:** 
- Fix bug in Replicate LLM class, where it was looking for parameter
names in a place where they no longer exist in pydantic 2, resulting in
the "Field required" validation error described in the issue.
- Fix Replicate LLM integration tests to:
  - Use active models on Replicate.
- Use the correct model parameter `max_new_tokens` as shown in the
[Replicate
docs](https://replicate.com/docs/guides/language-models/how-to-use#minimum-and-maximum-new-tokens).
  - Use callbacks instead of deprecated callback_manager.

**Issue:** #26937 

**Dependencies:** n/a

**Twitter handle:** n/a

---------

Signed-off-by: Fayvor Love <fayvor@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-30 16:07:08 +00:00
Yuki Watanabe
e593e017d2 Update compatibility table for ChatDatabricks (#27676)
`ChatDatabricks` added support for structured output and JSON mode in
the last release. This PR updates the feature table accordingly.

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
2024-10-30 11:56:55 -04:00
ccurme
bd5ea18a6c groq[patch]: update standard tests (#27744)
- Add xfail on integration test (fails [> 50% of the
time](https://github.com/langchain-ai/langchain/actions/workflows/scheduled_test.yml));
- Remove xfail on passing unit test.
2024-10-30 15:50:51 +00:00
hmn falahi
98bb3a02bd docs: Add OpenAIAssistantV2Runnable docstrings (#27402)
- **Description:** add/improve docstrings of OpenAIAssistantV2Runnable
- **Issue:** the issue #21983

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-30 15:35:51 +00:00
Luiz F. G. dos Santos
7a29ca6200 community: add new parameters to pass to OpenAIAssistantV2Runnable (#27372)
Thank you for contributing to LangChain!
 
**Description:** Added the model parameters to be passed in the OpenAI
Assistant. Enabled it at the `OpenAIAssistantV2Runnable` class.
 **Issue:** NA
  **Dependencies:** None
  **Twitter handle:** luizf0992
2024-10-30 10:51:03 -04:00
Ankur Singh
0b97135da1 fix the grammar and markdown component (#27657)
## Before

![Screenshot from 2024-10-26
08-47-29](https://github.com/user-attachments/assets/d8ccead1-3ba3-4f67-a29f-ef8b352341cf)

## After


![image](https://github.com/user-attachments/assets/78f36d54-b2d7-4164-b334-8ac41000711e)

## Typo
`(either in PR summary of in a linked issue)` => `either in PR summary
or in a linked issue`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-30 14:47:26 +00:00
Abdesselam Benameur
8fb6708ac4 Fix typo (missing letter) in elasticsearch_retriever.ipynb (#27639)
Fixed a small typo (added a missing "t" in ElasticsearchRetriever docs
page)


https://python.langchain.com/docs/integrations/retrievers/elasticsearch_retriever/#:~:text=It%20is%20possible%20to%20cusomize%20the%20function%20tha%20maps%20an%20Elasticsearch%20result%20(hit)%20to%20a%20LangChain%20document.
2024-10-30 14:38:39 +00:00
随风枫叶
18cfb4c067 community: Add token_usage and model_name metadata to ChatZhipuAI stream() and astream() response (#27677)
Thank you for contributing to LangChain!


- **Description:** Add token_usage and model_name metadata to
ChatZhipuAI stream() and astream() response
- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** None


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: jianfehuang <jianfehuang@tencent.com>
2024-10-30 10:34:33 -04:00
Martin Gullbrandson
8a5807a6b4 docs: Update Milvus documentation to correctly show how to filter in similarity_search (#27723)
### Description/Issue:

I had problems filtering when setting up a local Milvus db and noticed
that the `filter` option in the `similarity_search` and
`similarity_search_with_score` appeared to do nothing. Instead, the
`expr` option should be used.

The `expr` option is correctly used in the retriever example further
down in the documentation.

The `expr` option seems to be correctly passed on, for example
[here](447c0dd2f0/libs/community/langchain_community/vectorstores/milvus.py (L701))

### Solution:

Update the documentation for the functions mentioned to show intended
behavior.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-30 14:15:11 +00:00
tkubo-heroz
028e0253d8 community: Added anthropic.claude-3-5-sonnet-20241022-v2:0 cost detials (#27728)
Added anthropic.claude-3-5-sonnet-20241022-v2:0 cost detials
2024-10-30 14:01:01 +00:00
Changyong Um
dc171221b3 community[patch]: Fix vLLM integration to apply lora_request (#27731)
**Description:**
- Add the `lora_request` parameter to the VLLM class to support LoRA
model configurations. This enhancement allows users to specify LoRA
requests directly when using VLLM, enabling more flexible and efficient
model customization.

**Issue:**
- No existing issue for `lora_adapter` in VLLM. This PR addresses the
need for configuring LoRA requests within the VLLM framework.
- Reference : [Using LoRA Adapters in
vLLM](https://docs.vllm.ai/en/stable/models/lora.html#using-lora-adapters)


**Example Code :**
Before this change, the `lora_request` parameter was not applied
correctly:

```python
ADAPTER_PATH = "/path/of/lora_adapter"

llm = VLLM(model="Bllossom/llama-3.2-Korean-Bllossom-3B",
           max_new_tokens=512,
           top_k=2,
           top_p=0.90,
           temperature=0.1,
           vllm_kwargs={
               "gpu_memory_utilization":0.5, 
               "enable_lora":True, 
               "max_model_len":1024,
           }
)

print(llm.invoke(
    ["...prompt_content..."], 
    lora_request=LoRARequest("lora_adapter", 1, ADAPTER_PATH)
    ))
```
**Before Change Output:**
```bash
response was not applied lora_request
```
So, I attempted to apply the lora_adapter to
langchain_community.llms.vllm.VLLM.

**current output:**
```bash
response applied lora_request
```

**Dependencies:**
- None

**Lint and test:**
- All tests and lint checks have passed.

---------

Co-authored-by: Um Changyong <changyong.um@sfa.co.kr>
2024-10-30 13:59:34 +00:00
Nawaz Haider
9d2f6701e1 DOCS: Fixed import of langchain instead of langchain_nvidia_ai_endpoints for ChatNVIDIA (#27734)
* **PR title**: "docs: Replaced langchain import with
langchain-nvidia-ai-endpoints in NVIDIA Endpoints Tab"

* **PR message**:
+ **Description:** Replaced the import of `langchain` with
`langchain-nvidia-ai-endpoints` in the NVIDIA Endpoints Tab to resolve
an error caused by the documentation attempting to import the generic
`langchain` module despite the targeted import.
	+ **Issue:** 
+ **Dependencies:** No additional dependencies introduced; simply
updated the existing import to a more specific module.
	+ **Twitter handle:** https://x.com/nawaz0x1

* **Add tests and docs**:
+ **Applicability:** Not applicable in this case, as the change is a fix
to an existing integration rather than the addition of a new one.
+ **Rationale:** No new functionality or integrations are introduced,
only a corrective import change.

* **Lint and test**:
	+ **Status:** Completed
	+ **Outcome:** 
		- `make format`: **Passed**
		- `make lint`: **Passed**
		- `make test`: **Passed**


![image](https://github.com/user-attachments/assets/fbc1b597-5083-4461-875a-d32ab8ed933c)
2024-10-30 13:57:37 +00:00
Qier LU
8d8e38b090 community[pathch]: Add missing custom content_key handling in Redis vector store (#27736)
This fix an error caused by missing custom content_key handling in Redis
vector store in function similarity_search_with_score.
2024-10-30 13:57:20 +00:00
William FH
5a2cfb49e0 Support message trimming on single messages (#27729)
Permit trimming message lists of length 1
2024-10-30 04:27:52 +00:00
Bagatur
5111063af2 langchain[patch]: Release 0.3.5 (#27727) 2024-10-29 17:06:23 -07:00
Bagatur
8f4423e042 text-splitters[patch]: Release 0.3.1 (#27726) 2024-10-30 00:04:48 +00:00
Prithvi Kannan
0433b114bb docs: Add databricks-langchain package consolidation notice (#27703)
Thank you for contributing to LangChain!

Add notice of upcoming package consolidation of `langchain-databricks`
into `databricks-langchain`.

<img width="1047" alt="image"
src="https://github.com/user-attachments/assets/18eaa394-4e82-444b-85d5-7812be322674">


Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Signed-off-by: Prithvi Kannan <prithvi.kannan@databricks.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-29 22:00:27 +00:00
Zapiron
447c0dd2f0 docs: Fixed Grammar & Improve reading (#27672)
Updated the documentation to fix some grammar errors

    - **Description:** Some language errors exist in the documentation
    - **Issue:** the issue # Changed the structure of some sentences
2024-10-29 20:19:00 +00:00
Soham Das
913ff1b152 docs: fix typo in query analysis documentation (#27721)
**PR Title**: `docs: fix typo in query analysis documentation`

**Description**: This PR corrects a typo on line 68 in the query
analysis documentation, changing **"pharsings"** to **"phrasings"** for
clarity and accuracy. Only one instance of the typo was fixed in the
last merge, and this PR fixes the second instance.

**Issue**: N/A

**Dependencies**: None

**Additional Notes**: No functional changes were made; this is a
documentation fix only.
2024-10-29 16:15:37 -04:00
Erick Friis
8396ca2990 docs: redis in api docs (#27722) 2024-10-29 20:13:53 +00:00
Mateusz Szewczyk
0606aabfa3 docs: Added WatsonxRerank documentation (#27424)
Thank you for contributing to LangChain!

Changes:
- docs: Added `WatsonxRerank` documentation 
- docs Updated `WatsonxEmbeddings` with docs template
- docs: Updated `ChatWatsonx` with docs template
- docs: Updated `WatsonxLLM` with docs template
- docs: Added `ChatWatsonx` to list with Chat models providers. Added
[test_chat_models_standard](https://github.com/langchain-ai/langchain-ibm/blob/main/libs/ibm/tests/integration_tests/test_chat_models_standard.py)
to `langchain_ibm` tests suite.
- docs: Added `IBM` to list with Embedding models providers. Added
[test_embeddings_standard](https://github.com/langchain-ai/langchain-ibm/blob/main/libs/ibm/tests/integration_tests/test_embeddings_standard.py)
to `langchain_ibm` tests suite.
- docs: Updated `langcahin_ibm` recommended versions compatible with
`LangChain v0.3`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-29 16:57:47 +00:00
Zapiron
9ccd4a6ffb DOC: Tutorial Section Updates (#27675)
Edited various notebooks in the tutorial section to fix:
* Grammatical Errors
* Improve Readability by changing the sentence structure or reducing
repeated words which bears the same meaning
* Edited a code block to follow the PEP 8 Standard
* Added more information in some sentences to make the concept more
clear and reduce ambiguity

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-10-29 14:51:34 +00:00
Harsimran-19
c1d8c33df6 core: JsonOutputParser UTF characters bug (#27306)
**Description:**
This PR fixes an issue where non-ASCII characters in Pydantic field
descriptions were being escaped to their Unicode representations when
using `JsonOutputParser`. The change allows non-ASCII characters to be
preserved in the output, which is especially important for multilingual
support and when working with non-English languages.

**Issue:** Fixes #27256

**Example Code:**
```python
from pydantic import BaseModel, Field
from langchain_core.output_parsers import JsonOutputParser

class Article(BaseModel):
    title: str = Field(description="科学文章的标题")

output_data_structure = Article
parser = JsonOutputParser(pydantic_object=output_data_structure)
print(parser.get_format_instructions())
```
**Previous Output**:
```... "title": {"description": "\\u79d1\\u5b66\\u6587\\u7ae0\\u7684\\u6807\\u9898", "title": "Title", "type": "string"}} ...```

**Current Output**:
```... "title": {"description": "科学文章的标题", "title": "Title", "type":
"string"}} ...```

**Changes made**:
- Modified `json.dumps()` call in
`langchain_core/output_parsers/json.py` to use `ensure_ascii=False`
- Added a unit test to verify Unicode handling

Co-authored-by: Harsimran-19 <harsimran1869@gmail.com>
2024-10-29 14:48:53 +00:00
Andrew Effendi
49517cc1e7 partners/huggingface[patch]: fix HuggingFacePipeline model_id parameter (#27514)
**Description:** Fixes issue with model parameter not getting
initialized correctly when passing transformers pipeline
**Issue:** https://github.com/langchain-ai/langchain/issues/25915
2024-10-29 14:34:46 +00:00
Jeong-Minju
0a465b8032 docs: Fix typo in _action_agent docs section (#27698)
PR Title: docs: Fix typo in _action_agent function docs section

Description: In line 1185, _action_agent function's docs, changing
**".agent"** to **"self.agent"**.

Issue: N/A

Dependencies: None

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-10-29 14:16:42 +00:00
Soham Das
c3021e9322 docs: fix typo in query analysis documentation (#27697)
**PR Title**: `docs: fix typo in query analysis documentation`

**Description**: This PR corrects a typo on line 68 in the query
analysis documentation, changing **"pharsings"** to **"phrasings"** for
clarity and accuracy.

**Issue**: N/A

**Dependencies**: None

**Additional Notes**: No functional changes were made; this is a
documentation fix only.
2024-10-29 14:07:22 +00:00
Neil Vachharajani
eec35672a4 core[patch]: Improve type checking for the tool decorator (#27460)
**Description:**

When annotating a function with the @tool decorator, the symbol should
have type BaseTool. The previous type annotations did not convey that to
type checkers. This patch creates 4 overloads for the tool function for
the 4 different use cases.

1. @tool decorator with no arguments
2. @tool decorator with only keyword arguments
3. @tool decorator with a name argument (and possibly keyword arguments)
4. Invoking tool as function with a name and runnable positional
arguments

The main function is updated to match the overloads. The changes are
100% backwards compatible (all existing calls should continue to work,
just with better type annotations).

**Twitter handle:** @nvachhar

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-29 13:59:56 +00:00
Erick Friis
94e5765416 docs: packages in homepage (#27693) 2024-10-28 20:44:30 +00:00
Erick Friis
583808a7b8 partners/huggingface: release 0.1.1 (#27691) 2024-10-28 13:39:38 -07:00
Erick Friis
6d524e9566 partners/box: release 0.2.2 (#27690) 2024-10-28 12:54:20 -07:00
yahya-mouman
6803cb4f34 openai[patch]: add check for none values when summing token usage (#27585)
**Description:** Fixes None addition issues when an empty value is
passed on

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-10-28 12:49:43 -07:00
Bagatur
ede953d617 openai[patch]: fix schema formatting util (#27685) 2024-10-28 15:46:47 +00:00
Baptiste Pasquier
440c162b8b community: Fix closed session in Infinity (#26933)
**Description:** 

The `aiohttp.ClientSession` is closed at the end of the with statement,
which causes an error during a second call.

The implemented fix is to define the session directly within the with
block, exactly like in the textembed code:


c6350d636e/libs/community/langchain_community/embeddings/textembed.py (L335-L346)
 
**Issue:** Fix #26932

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-10-27 11:37:21 -04:00
Jorge Piedrahita Ortiz
8895d468cb community: sambastudio llm refactor (#27215)
**Description:** 
    - Sambastudio LLM refactor 
    - Sambastudio openai compatible API support added
    - docs updated
2024-10-27 11:08:15 -04:00
ccurme
fe87e411f2 groq: fix unit test (#27660) 2024-10-26 14:57:23 -04:00
Erick Friis
cdb4b1980a docs: reorganize contributing docs (#27649) 2024-10-25 22:41:54 +00:00
Erick Friis
fbfc6bdade core: test runner improvements (#27654)
when running core tests locally this
- prevents langsmith tracing from being enabled by env vars
- prevents network calls
2024-10-25 15:06:59 -07:00
Gabriel Faundez
ef27ce7a45 docs: add missing import for tools docs (#27650)
## Description

Added missing import from `pydantic` in the tools docs
2024-10-25 21:14:40 +00:00
Vincent Min
7bc4e320f1 core[patch]: improve performance of InMemoryVectorStore (#27538)
**Description:** We improve the performance of the InMemoryVectorStore.
**Isue:** Originally, similarity was computed document by document:
```
for doc in self.store.values():
            vector = doc["vector"]
            similarity = float(cosine_similarity([embedding], [vector]).item(0))
```
This is inefficient and does not make use of numpy vectorization.
This PR computes the similarity in one vectorized go:
```
docs = list(self.store.values())
similarity = cosine_similarity([embedding], [doc["vector"] for doc in docs])
```
**Dependencies:** None
**Twitter handle:** @b12_consulting, @Vincent_Min

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-25 17:07:04 -04:00
Bagatur
d5306899d3 openai[patch]: Release 0.2.4 (#27652) 2024-10-25 20:26:21 +00:00
Erick Friis
247d6bb09d infra: test doc imports 3.12 (#27653) 2024-10-25 13:23:06 -07:00
Erick Friis
600b7bdd61 all: test 3.13 ci (#27197)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-10-25 12:56:58 -07:00
Bagatur
06df15c9c0 core[patch]: Release 0.3.13 (#27651) 2024-10-25 19:22:44 +00:00
Erick Friis
2683f814f4 docs: contributing index page (#27647) 2024-10-25 17:06:55 +00:00
Rashmi Pawar
83eebf549f docs: Add NVIDIA as provider in v3 integrations (#27254)
### Add NVIDIA as provider in langchain v3 integrations

cc: @sumitkbh @mattf @dglogo

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-25 16:21:22 +00:00
Steve Moss
24605bcdb6 community[patch]: Fix missing protected_namespaces(). (#27610)
- [x] **PR message**:
- **Description:** Fixes warning messages raised due to missing
`protected_namespaces` parameter in `ConfigDict`.
    - **Issue:** https://github.com/langchain-ai/langchain/issues/27609
    - **Dependencies:** No dependencies
    - **Twitter handle:** @gawbul
2024-10-25 02:16:26 +00:00
Eugene Yurtsev
7667ee126f core: remove mustache in extended deps (#27629)
Remove mustache from extended deps -- we vendor the mustache
implementation
2024-10-24 22:12:49 -04:00
Erick Friis
265e0a164a core: add flake8-bandit (S) ruff rules to core (#27368)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-24 22:33:41 +00:00
hippopond
bcff458ae3 DOC: Added notes in ipynb file to advise user to upgrade package langchain_openai. For issue: https://github.com/langchain-ai/langchain/issues/26616 (#27621)
Thank you for contributing to LangChain!

- [X] **PR title**: DOC: Added notes in ipynb file to advice user to
upgrade package langchain_openai.


- [X] 

Added notes from the issue report: to advise the user to upgrade
langchain_openai


Issue: 
https://github.com/langchain-ai/langchain/issues/26616

- [ ] **Add tests and docs**: 

- [ ] **Lint and test**: 

- [ ]

---------

Co-authored-by: Libby Lin <libbylin@Libbys-MacBook-Pro.local>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-24 21:54:12 +00:00
Nithish Raghunandanan
0623c74560 couchbase: Add document id to vector search results (#27622)
**Description:** Returns the document id along with the Vector Search
results

**Issue:** Fixes https://github.com/langchain-ai/langchain/issues/26860
for CouchbaseVectorStore


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-24 21:47:36 +00:00
ZhangShenao
455ab7d714 Improvement[Community] Improve Document Loaders and Splitters (#27568)
- Fix word spelling error
- Add static method decorator
- Fix language splitter

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-24 21:42:16 +00:00
Ed Branch
7345470669 docs: add aws support to how-to-guides (#27450)
This PR adds support to the how-to documentation for using AWS Bedrock
and Sagemaker Endpoints.

Because AWS services above dont presently use API keys to access LLMs
I've amended more of the source code than would normally be expected.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-24 14:23:32 -07:00
CLOVA Studio 개발
846a75284f community: Add Naver chat model & embeddings (#25162)
Reopened as a personal repo outside the organization.

## Description
- Naver HyperCLOVA X community package 
  - Add chat model & embeddings
  - Add unit test & integration test
  - Add chat model & embeddings docs
- I changed partner
package(https://github.com/langchain-ai/langchain/pull/24252) to
community package on this PR
- Could this
embeddings(https://github.com/langchain-ai/langchain/pull/21890) be
deprecated? We are trying to replace it with embedding
model(**ClovaXEmbeddings**) in this PR.

Twitter handle: None. (if needed, contact with
joonha.jeon@navercorp.com)

---
you can check our previous discussion below:

> one question on namespaces - would it make sense to have these in
.clova namespaces instead of .naver?

I would like to keep it as is, unless it is essential to unify the
package name.
(ClovaX is a branding for the model, and I plan to add other models and
components. They need to be managed as separate classes.)

> also, could you clarify the difference between ClovaEmbeddings and
ClovaXEmbeddings?

There are 3 models that are being serviced by embedding, and all are
supported in the current PR. In addition, all the functionality of CLOVA
Studio that serves actual models, such as distinguishing between test
apps and service apps, is supported. The existing PR does not support
this content because it is hard-coded.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Vadym Barda <vadym@langchain.dev>
2024-10-24 20:54:13 +00:00
Hyejun An
6227396e20 partners/HuggingFacePipeline[stream]: Change to use pipeline instead of pipeline.model.generate in stream() (#26531)
## Description

I encountered an error while using the` gemma-2-2b-it model` with the
`HuggingFacePipeline` class and have implemented a fix to resolve this
issue.

### What is Problem

```python
model_id="google/gemma-2-2b-it"


gemma_2_model = AutoModelForCausalLM.from_pretrained(model_id)
gemma_2_tokenizer = AutoTokenizer.from_pretrained(model_id)

gen = pipeline( 
    task='text-generation',
    model=gemma_2_model,
    tokenizer=gemma_2_tokenizer,
    max_new_tokens=1024,
    device=0 if torch.cuda.is_available() else -1,
    temperature=.5,
    top_p=0.7,
    repetition_penalty=1.1,
    do_sample=True,
    )

llm = HuggingFacePipeline(pipeline=gen)

for chunk in llm.stream("Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World."):
    print(chunk, end="", flush=True)
```

This code outputs the following error message:

```
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1258: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
Exception in thread Thread-19 (generate):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1874, in generate
    self._validate_generated_length(generation_config, input_ids_length, has_default_max_length)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 1266, in _validate_generated_length
    raise ValueError(
ValueError: Input length of input_ids is 31, but `max_length` is set to 20. This can lead to unexpected behavior. You should consider increasing `max_length` or, better yet, setting `max_new_tokens`.
```

In addition, the following error occurs when the number of tokens is
reduced.

```python
for chunk in llm.stream("Hello World"):
    print(chunk, end="", flush=True)
```

```
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1258: UserWarning: Using the model-agnostic default `max_length` (=20) to control the generation length. We recommend setting `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(
/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py:1885: UserWarning: You are calling .generate() with the `input_ids` being on a device type different than your model's device. `input_ids` is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put `input_ids` to the correct device by calling for example input_ids = input_ids.to('cuda') before running `.generate()`.
  warnings.warn(
Exception in thread Thread-20 (generate):
Traceback (most recent call last):
  File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2024, in generate
    result = self._sample(
  File "/usr/local/lib/python3.10/dist-packages/transformers/generation/utils.py", line 2982, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/gemma2/modeling_gemma2.py", line 994, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/gemma2/modeling_gemma2.py", line 803, in forward
    inputs_embeds = self.embed_tokens(input_ids)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/sparse.py", line 164, in forward
    return F.embedding(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/functional.py", line 2267, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
```

On the other hand, in the case of invoke, the output is normal:

```
llm.invoke("Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World.")
```
```
'Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World. Hello World.\n\nThis is a simple program that prints the phrase "Hello World" to the console. \n\n**Here\'s how it works:**\n\n* **`print("Hello World")`**: This line of code uses the `print()` function, which is a built-in function in most programming languages (like Python). The `print()` function takes whatever you put inside its parentheses and displays it on the screen.\n* **`"Hello World"`**:  The text within the double quotes (`"`) is called a string. It represents the message we want to print.\n\n\nLet me know if you\'d like to explore other programming concepts or see more examples! \n'
```

### Problem Analysis

- Apparently, I put kwargs in while generating pipelines and it applied
to `invoke()`, but it's not applied in the `stream()`.
- When using the stream, `inputs = self.pipeline.tokenizer (prompt,
return_tensors = "pt")` enters cpu.
  - This can crash when the model is in gpu.

### Solution

Just use `self.pipeline` instead of `self.pipeline.model.generate`.

- **Original Code**

```python
stopping_criteria = StoppingCriteriaList([StopOnTokens()])

inputs = self.pipeline.tokenizer(prompt, return_tensors="pt")
streamer = TextIteratorStreamer(
    self.pipeline.tokenizer,
    timeout=60.0,
    skip_prompt=skip_prompt,
    skip_special_tokens=True,
)
generation_kwargs = dict(
    inputs,
    streamer=streamer,
    stopping_criteria=stopping_criteria,
    **pipeline_kwargs,
)
t1 = Thread(target=self.pipeline.model.generate, kwargs=generation_kwargs)
t1.start()
```

- **Updated Code**

```python
stopping_criteria = StoppingCriteriaList([StopOnTokens()])

streamer = TextIteratorStreamer(
    self.pipeline.tokenizer,
    timeout=60.0,
    skip_prompt=skip_prompt,
    skip_special_tokens=True,
)
generation_kwargs = dict(
    text_inputs= prompt,
    streamer=streamer,
    stopping_criteria=stopping_criteria,
    **pipeline_kwargs,
)
t1 = Thread(target=self.pipeline, kwargs=generation_kwargs)
t1.start()
```

By using the `pipeline` directly, the `kwargs` of the pipeline are
applied, and there is no need to consider the `device` of the `tensor`
made with the `tokenizer`.

> According to the change to use `pipeline`, it was modified to put
`text_inputs=prompts` directly into `generation_kwargs`.

## Issue

None

## Dependencies

None

## Twitter handle

None

---------

Co-authored-by: Vadym Barda <vadym@langchain.dev>
2024-10-24 16:49:43 -04:00
Bagatur
655ced84d7 openai[patch]: accept json schema response format directly (#27623)
fix #25460

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-24 18:19:15 +00:00
Tibor Reiss
20b56a0233 core[patch]: fix repr and str for Serializable (#26786)
Fixes #26499

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-10-24 08:36:35 -07:00
Adarsh Sahu
2d58a8a08d docs: Update structured_outputs.mdx (#27613)
`strightforward` => `straightforward`
`adavanced` => `advanced`
`There a few challenges` => `There are a few challenges`

Documentation Correction:

*
[`docs/docs/concepts/structured_output.mdx`]:
Corrected several typos in the sentence directing users to the API
reference.
2024-10-24 15:13:28 +00:00
Daniel Vu Dao
da6b526770 docs: Update Runnable documentation (#27606)
**Description**
Adds better code formatting for one of the docs.
2024-10-24 15:05:43 +00:00
QiQi
133c1b4f76 docs: Update passthrough.ipynb -- Grammar correction (#27601)
Grammar correction needed in passthrough.ipynb
The sentence is:

"Now you've learned how to pass data through your chains to help to help
format the data flowing through your chains."

There's a redundant "to help", and it could be more succinctly written
as:

"Now you've learned how to pass data through your chains to help format
the data flowing through your chains."
2024-10-24 15:05:06 +00:00
hippopond
61897aef90 docs: Fix for spelling mistake (#27599)
Fixes #26009

Thank you for contributing to LangChain!

- [x] **PR title**: "docs: Correcting spelling mistake"


- [x] **PR message**: 
    - **Description:** Corrected spelling from "trianed" to "trained"
    - **Issue:** the issue #26009 
    - **Dependencies:** NA
    - **Twitter handle:** NA


- [ ] **Add tests and docs**: NA


- [ ] **Lint and test**:

Co-authored-by: Libby Lin <libbylin@Libbys-MacBook-Pro.local>
2024-10-24 15:04:18 +00:00
Eugene Yurtsev
d081a5400a docs: fix more links (#27598)
Fix more links
2024-10-23 21:26:38 -04:00
Lei Zhang
f203229b51 community: Fix the failure of ChatSparkLLM after upgrading to Pydantic V2 (#27418)
**Description:**

The test_sparkllm.py can reproduce this issue.


https://github.com/langchain-ai/langchain/blob/master/libs/community/tests/integration_tests/chat_models/test_sparkllm.py#L66

```
Testing started at 18:27 ...
Launching pytest with arguments test_sparkllm.py::test_chat_spark_llm --no-header --no-summary -q in /Users/zhanglei/Work/github/langchain/libs/community/tests/integration_tests/chat_models

============================= test session starts ==============================
collecting ... collected 1 item

test_sparkllm.py::test_chat_spark_llm 

============================== 1 failed in 0.45s ===============================
FAILED                             [100%]
tests/integration_tests/chat_models/test_sparkllm.py:65 (test_chat_spark_llm)
def test_chat_spark_llm() -> None:
>       chat = ChatSparkLLM(
            spark_app_id="your spark_app_id",
            spark_api_key="your spark_api_key",
            spark_api_secret="your spark_api_secret",
        )  # type: ignore[call-arg]

test_sparkllm.py:67: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../../../../core/langchain_core/load/serializable.py:111: in __init__
    super().__init__(*args, **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

cls = <class 'langchain_community.chat_models.sparkllm.ChatSparkLLM'>
values = {'spark_api_key': 'your spark_api_key', 'spark_api_secret': 'your spark_api_secret', 'spark_api_url': 'wss://spark-api.xf-yun.com/v3.5/chat', 'spark_app_id': 'your spark_app_id', ...}

    @model_validator(mode="before")
    @classmethod
    def validate_environment(cls, values: Dict) -> Any:
        values["spark_app_id"] = get_from_dict_or_env(
            values,
            ["spark_app_id", "app_id"],
            "IFLYTEK_SPARK_APP_ID",
        )
        values["spark_api_key"] = get_from_dict_or_env(
            values,
            ["spark_api_key", "api_key"],
            "IFLYTEK_SPARK_API_KEY",
        )
        values["spark_api_secret"] = get_from_dict_or_env(
            values,
            ["spark_api_secret", "api_secret"],
            "IFLYTEK_SPARK_API_SECRET",
        )
        values["spark_api_url"] = get_from_dict_or_env(
            values,
            "spark_api_url",
            "IFLYTEK_SPARK_API_URL",
            SPARK_API_URL,
        )
        values["spark_llm_domain"] = get_from_dict_or_env(
            values,
            "spark_llm_domain",
            "IFLYTEK_SPARK_LLM_DOMAIN",
            SPARK_LLM_DOMAIN,
        )
    
        # put extra params into model_kwargs
        default_values = {
            name: field.default
            for name, field in get_fields(cls).items()
            if field.default is not None
        }
>       values["model_kwargs"]["temperature"] = default_values.get("temperature")
E       KeyError: 'model_kwargs'

../../../langchain_community/chat_models/sparkllm.py:368: KeyError
``` 

I found that when upgrading to Pydantic v2, @root_validator was changed
to @model_validator. When a class declares multiple
@model_validator(model=before), the execution order in V1 and V2 is
opposite. This is the reason for ChatSparkLLM's failure.

The correct execution order is to execute build_extra first.


https://github.com/langchain-ai/langchain/blob/langchain%3D%3D0.2.16/libs/community/langchain_community/chat_models/sparkllm.py#L302

And then execute validate_environment.


https://github.com/langchain-ai/langchain/blob/langchain%3D%3D0.2.16/libs/community/langchain_community/chat_models/sparkllm.py#L329

The Pydantic community also discusses it, but there hasn't been a
conclusion yet. https://github.com/pydantic/pydantic/discussions/7434

**Issus:** #27416 

**Twitter handle:** coolbeevip

---------

Co-authored-by: vbarda <vadym@langchain.dev>
2024-10-23 21:17:10 -04:00
Andrew Effendi
8f151223ad Community: Fix DuckDuckGo search tool Output Format (#27479)
**Issue:** : https://github.com/langchain-ai/langchain/issues/22961
   **Description:** 

Previously, the documentation for `DuckDuckGoSearchResults` said that it
returns a JSON string, however the code returns a regular string that
can't be parsed as is.
for example running

```python
from langchain_community.tools import DuckDuckGoSearchResults

# Create a DuckDuckGo search instance
search = DuckDuckGoSearchResults()

# Invoke the search
result = search.invoke("Obama")

# Print the result
print(result)
# Print the type of the result
print("Result Type:", type(result))
```
will return
```
snippet: Harris will hold a campaign event with former President Barack Obama in Georgia next Thursday, the first time the pair has campaigned side by side, a senior campaign official said. A week from ..., title: Obamas to hit the campaign trail in first joint appearances with Harris, link: https://www.nbcnews.com/politics/2024-election/obamas-hit-campaign-trail-first-joint-appearances-harris-rcna176034, snippet: Item 1 of 3 Former U.S. first lady Michelle Obama and her husband, former U.S. President Barack Obama, stand on stage during Day 2 of the Democratic National Convention (DNC) in Chicago, Illinois ..., title: Obamas set to hit campaign trail with Kamala Harris for first time, link: https://www.reuters.com/world/us/obamas-set-hit-campaign-trail-with-kamala-harris-first-time-2024-10-18/, snippet: Barack and Michelle Obama will make their first campaign appearances alongside Kamala Harris at rallies in Georgia and Michigan. By Reid J. Epstein Reporting from Ashwaubenon, Wis. Here come the ..., title: Harris Will Join Michelle Obama and Barack Obama on Campaign Trail, link: https://www.nytimes.com/2024/10/18/us/politics/kamala-harris-michelle-obama-barack-obama.html, snippet: Obama's leaving office was "a turning point," Mirsky said. "That was the last time anybody felt normal." A few feet over, a 64-year-old physics professor named Eric Swanson who had grown ..., title: Obama's reemergence on the campaign trail for Harris comes as he ..., link: https://www.cnn.com/2024/10/13/politics/obama-campaign-trail-harris-biden/index.html
Result Type: <class 'str'>
```

After the change in this PR, `DuckDuckGoSearchResults` takes an
additional `output_format = "list" | "json" | "string"` ("string" =
current behavior, default). For example, invoking
`DuckDuckGoSearchResults(output_format="list")` return a list of
dictionaries in the format
```
[{'snippet': '...', 'title': '...', 'link': '...'}, ...]
```
e.g.

```
[{'snippet': "Obama has in a sense been wrestling with Trump's impact since the real estate magnate broke onto the political stage in 2015. Trump's victory the next year, defeating Obama's secretary of ...", 'title': "Obama's fears about Trump drive his stepped-up campaigning", 'link': 'https://www.washingtonpost.com/politics/2024/10/18/obama-trump-anxiety-harris-campaign/'}, {'snippet': 'Harris will hold a campaign event with former President Barack Obama in Georgia next Thursday, the first time the pair has campaigned side by side, a senior campaign official said. A week from ...', 'title': 'Obamas to hit the campaign trail in first joint appearances with Harris', 'link': 'https://www.nbcnews.com/politics/2024-election/obamas-hit-campaign-trail-first-joint-appearances-harris-rcna176034'}, {'snippet': 'Item 1 of 3 Former U.S. first lady Michelle Obama and her husband, former U.S. President Barack Obama, stand on stage during Day 2 of the Democratic National Convention (DNC) in Chicago, Illinois ...', 'title': 'Obamas set to hit campaign trail with Kamala Harris for first time', 'link': 'https://www.reuters.com/world/us/obamas-set-hit-campaign-trail-with-kamala-harris-first-time-2024-10-18/'}, {'snippet': 'Barack and Michelle Obama will make their first campaign appearances alongside Kamala Harris at rallies in Georgia and Michigan. By Reid J. Epstein Reporting from Ashwaubenon, Wis. Here come the ...', 'title': 'Harris Will Join Michelle Obama and Barack Obama on Campaign Trail', 'link': 'https://www.nytimes.com/2024/10/18/us/politics/kamala-harris-michelle-obama-barack-obama.html'}]
Result Type: <class 'list'>
```

---------

Co-authored-by: vbarda <vadym@langchain.dev>
2024-10-23 20:18:11 -04:00
Erick Friis
5e5647b5dd docs: render api ref urls in search (#27594) 2024-10-23 16:18:21 -07:00
Bagatur
948e2e6322 docs: concept nits (#27586) 2024-10-23 14:52:44 -07:00
Eugene Yurtsev
562cf416c2 docs: Update messages.mdx (#27592)
Add missing `.`
2024-10-23 20:18:27 +00:00
Ankur Singh
71e0f4cd62 docs: Fix spelling mistake in concepts (#27589)
`Fore` => `For`

Documentation Correction:

*
[`docs/docs/concepts/async.mdx`](diffhunk://#diff-4959e81c20607c20c7a9c38db4405a687c5d94f24fc8220377701afeee7562b0L40-R40):
Corrected a typo from "Fore" to "For" in the sentence directing users to
the API reference.
2024-10-23 16:10:21 -04:00
Bagatur
968dccee04 core[patch]: convert_to_openai_tool Anthropic support (#27591) 2024-10-23 12:27:06 -07:00
Bagatur
217de4e6a6 langchain[patch]: de-beta init_chat_model (#27558) 2024-10-23 08:35:15 -07:00
Eugene Yurtsev
4466caadba concepts: update llm stub page and re-link (#27567)
Update text llm stub page and re-link content
2024-10-22 23:03:36 -04:00
Eugene Yurtsev
f2dbf01d4a Docs: Re-organize conceptual docs (#27047)
Reorganization of conceptual documentation

---------

Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
Co-authored-by: Lance Martin <lance@langchain.dev>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-10-22 22:08:20 -04:00
Kwan Kin Chan
6d2a76ac05 langchain_huggingface: Fix multiple GPU usage bug in from_model_id function (#23628)
- [ ]  **Description:**   
   - pass the device_map into model_kwargs 
- removing the unused device_map variable in the hf_pipeline function
call
- [ ] **Issue:** issue #13128 
When using the from_model_id function to load a Hugging Face model for
text generation across multiple GPUs, the model defaults to loading on
the CPU despite multiple GPUs being available using the expected format
``` python
llm = HuggingFacePipeline.from_model_id(
    model_id="model-id",
    task="text-generation",
    device_map="auto",
)
```
Currently, to enable multiple GPU , we have to pass in variable in this
format instead
``` python
llm = HuggingFacePipeline.from_model_id(
    model_id="model-id",
    task="text-generation",
    device=None,
    model_kwargs={
        "device_map": "auto",
    }
)
```
This issue arises due to improper handling of the device and device_map
parameters.

- [ ] **Explanation:**
1. In from_model_id, the model is created using model_kwargs and passed
as the model variable of the pipeline function. So at this moment, to
load the model with multiple GPUs, "device_map" needs to be set to
"auto" within model_kwargs. Otherwise, the model defaults to loading on
the CPU.
2. The device_map variable in from_model_id is not utilized correctly.
In the pipeline function's source code of tnansformer:
- The device_map variable is stored in the model_kwargs dictionary
(lines 867-878 of transformers/src/transformers/pipelines/\__init__.py).
```python
    if device_map is not None:
        ......
        model_kwargs["device_map"] = device_map
```
- The model is constructed with model_kwargs containing the device_map
value ONLY IF it is a string (lines 893-903 of
transformers/src/transformers/pipelines/\__init__.py).
```python
    if isinstance(model, str) or framework is None:
        model_classes = {"tf": targeted_task["tf"], "pt": targeted_task["pt"]}
        framework, model = infer_framework_load_model( ... , **model_kwargs, )
```
- Consequently, since a model object is already passed to the pipeline
function, the device_map variable from from_model_id is never used.

3. The device_map variable in from_model_id not only appears unused but
also causes errors. Without explicitly setting device=None, attempting
to load the model on multiple GPUs may result in the following error:
 ```
Device has 2 GPUs available. Provide device={deviceId} to
`from_model_id` to use available GPUs for execution. deviceId is -1
(default) for CPU and can be a positive integer associated with CUDA
device id.
  Traceback (most recent call last):
    File "foo.py", line 15, in <module>
      llm = HuggingFacePipeline.from_model_id(
File
"foo\site-packages\langchain_huggingface\llms\huggingface_pipeline.py",
line 217, in from_model_id
      pipeline = hf_pipeline(
File "foo\lib\site-packages\transformers\pipelines\__init__.py", line
1108, in pipeline
return pipeline_class(model=model, framework=framework, task=task,
**kwargs)
File "foo\lib\site-packages\transformers\pipelines\text_generation.py",
line 96, in __init__
      super().__init__(*args, **kwargs)
File "foo\lib\site-packages\transformers\pipelines\base.py", line 835,
in __init__
      raise ValueError(
ValueError: The model has been loaded with `accelerate` and therefore
cannot be moved to a specific device. Please discard the `device`
argument when creating your pipeline object.
```
This error occurs because, in from_model_id, the default values in from_model_id for device and device_map are -1 and None, respectively. It would passes the statement (`device_map is not None and device < 0`) and keep the device as -1 so the pipeline function later raises an error when trying to move a GPU-loaded model back to the CPU. 
19eb82e68b/libs/community/langchain_community/llms/huggingface_pipeline.py (L204-L213)




If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: vbarda <vadym@langchain.dev>
2024-10-22 21:41:47 -04:00
Prakul
031d0e4725 docs:update to MongoDB Docs (#27531)
**Description:** Update to MongoDB docs

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-23 00:21:37 +00:00
Fernando de Oliveira
ab205e7389 partners/openai + community: Async Azure AD token provider support for Azure OpenAI (#27488)
This PR introduces a new `azure_ad_async_token_provider` attribute to
the `AzureOpenAI` and `AzureChatOpenAI` classes in `partners/openai` and
`community` packages, given it's currently supported on `openai` package
as
[AsyncAzureADTokenProvider](https://github.com/openai/openai-python/blob/main/src/openai/lib/azure.py#L33)
type.

The reason for creating a new attribute is to avoid breaking changes.
Let's say you have an existing code that uses a `AzureOpenAI` or
`AzureChatOpenAI` instance to perform both sync and async operations.
The `azure_ad_token_provider` will work exactly as it is today, while
`azure_ad_async_token_provider` will override it for async requests.


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-10-22 21:43:06 +00:00
Bagatur
34684423bf docs: rm Legacy API ref link (#27559) 2024-10-22 14:12:38 -07:00
Savar Bhasin
0cae37b0a9 docs: fix docker command for RedisChatMessageHistory (#27484)
docs: "fix docker command"

- **Description**: The Redis chat message history component requires the
Redis Stack to create indexes. When using only Redis, the following
error occurs: "Unknown command 'FT.INFO', with args beginning with:
'chat_history'".
- **Twitter handle**: savar_bhasin

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-22 19:42:51 +00:00
orkhank
9a277cbe00 community: Update file_path type in JSONLoader.__init__() signature (#27535)
- **Description:** Change the type of the `file_path` argument from `str
| pathlib.Path` to `str | os.PathLike`, since the latter is more widely
used: https://stackoverflow.com/a/58541858
  
This is a very minor fix. I was just annoyed to see the red underline
displayed by Pylance in VS Code: `reportArgumentType`.

![image](https://github.com/user-attachments/assets/719a7f8e-acca-4dfa-89df-925e1d938c71)
  
  The changes do not affect the behavior of the code.
2024-10-22 11:18:36 -07:00
Eric Pinzur
f636c83321 community: Cassandra Vector Store: modernize implementation (#27253)
**Description:** 

This PR updates `CassandraGraphVectorStore` to be based off
`CassandraVectorStore`, instead of using a custom CQL implementation.
This allows users using a `CassandraVectorStore` to upgrade to a
`GraphVectorStore` without having to change their database schema or
re-embed documents.

This PR also updates the documentation of the `GraphVectorStore` base
class and contains native async implementations for the standard graph
methods: `traversal_search` and `mmr_traversal_search` in
`CassandraVectorStore`.

**Issue:** No issue number.

**Dependencies:** https://github.com/langchain-ai/langchain/pull/27078
(already-merged)

**Lint and test**: 
- Lint and tests all pass, including existing
`CassandraGraphVectorStore` tests.
- Also added numerous additional tests based of the tests in
`langchain-astradb` which cover many more scenarios than the existing
tests for `Cassandra` and `CassandraGraphVectorStore`

** BREAKING CHANGE**

Note that this is a breaking change for existing users of
`CassandraGraphVectorStore`. They will need to wipe their database table
and restart.

However:
- The interfaces have not changed. Just the underlying storage
mechanism.
- Any one using `langchain_community.vectorstores.Cassandra` can instead
use `langchain_community.graph_vectorstores.CassandraGraphVectorStore`
and they will gain Graph capabilities without having to re-embed their
existing documents. This is the primary goal of this PR.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-22 18:11:11 +00:00
Vadym Barda
0640cbf2f1 huggingface[patch]: hide client field in HuggingFaceEmbeddings (#27522) 2024-10-21 17:37:07 -04:00
Chun Kang Lu
380449a7a9 core: fix Image prompt template hardcoded template format (#27495)
Fixes #27411 

**Description:** Adds `template_format` to the `ImagePromptTemplate`
class and updates passing in the `template_format` parameter from
ChatPromptTemplate instead of the hardcoded "f-string".
Also updated docs and typing related to `template_format` to be more
up-to-date and specific.

**Dependencies:** None

**Add tests and docs**: Added unit tests to validate fix. Needed to
update `test_chat` snapshot due to adding new attribute
`template_format` in `ImagePromptTemplate`.

---------

Co-authored-by: Vadym Barda <vadym@langchain.dev>
2024-10-21 17:31:40 -04:00
bbaltagi-dtsl
403c0ea801 community: fix DallE hidden open_api_key (#26996)
Thank you for contributing to LangChain!

- [ X] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"


- [ X] 
    - **Issue:** issue #26941


Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-21 19:46:56 +00:00
Erick Friis
c6d088bc15 docs: giscus component strict (#27515) 2024-10-21 11:36:51 -07:00
Erick Friis
6ed92f13d0 infra: azure/mongo api docs build (#27512) 2024-10-21 08:27:46 -07:00
Radi
689e8b7e66 docs: Update chatbot.ipynb (#27422)
- [ ] **PR title**: "docs: Typo fix"

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-21 15:06:28 +00:00
venkatram-dev
2678cda83b docs:tutorials:sql_qa.ipynb: fix typo (#27405)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"

docs:docs:tutorials:sql_qa.ipynb: fix typo

- [x] **PR message**: ***Delete this entire checklist*** and replace
with
Fix typo in docs:docs:tutorials:sql_qa.ipynb


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-21 15:01:23 +00:00
Erez Zohar
8f80dd28d9 docs: typo fix athena.ipynb and glue_catalog.ipynb (#27435)
**Description:** This PR fixes typos in 
```
docs/docs/integrations/document_loaders/athena.ipynb
docs/docs/integrations/document_loaders/glue_catalog.ipynb
```
2024-10-21 15:01:13 +00:00
nodfans
cfcf783cb5 community: fix a typo in planner_prompt.py (#27489)
Description: Fix typo in planner_prompt.py.
2024-10-21 14:59:33 +00:00
Seungha Jeon
edfe35c2a8 docs: fix typo on friendli.ipynb (#27412)
This PR fixes typos in `chat/friendli.ipynb` and `llms/friendli.ipynb`
docs.
2024-10-21 14:58:49 +00:00
Connor Park
e62e390ca0 docs: update API Reference Link in /docs/how_to/vectorstore_retriever/ (#27477)
Description: updated docs
[here](https://python.langchain.com/docs/how_to/vectorstore_retriever/#:~:text=VectorStoreRetriever)
for creating VectorStoreRetrievers. The URL was missing a `.base`, and
now works as expected.

This was a fix for Issue #27196
2024-10-19 00:44:58 +00:00
Erick Friis
97a819d578 community: fix lint from new mypy (#27474) 2024-10-18 20:08:03 +00:00
Erick Friis
c397baa85f community: release 0.3.3 (#27472) 2024-10-18 12:52:15 -07:00
Erick Friis
4ceb28009a mongodb: migrate to repo (#27467) 2024-10-18 12:35:12 -07:00
Erick Friis
a562c54f7d azure-dynamic-sessions: migrate to repo (#27468) 2024-10-18 12:30:48 -07:00
Erick Friis
30660786b3 langchain: release 0.3.4 (#27458) 2024-10-18 11:59:54 -07:00
Erick Friis
b468552859 docs: langgraph error code redirects (#27465) 2024-10-18 10:39:32 -07:00
Erick Friis
82242dfbb1 docs: openai audio docs (#27459) 2024-10-18 17:06:55 +00:00
Erick Friis
2cf2cefe39 partners/openai: release 0.2.3 (#27457) 2024-10-18 08:16:01 -07:00
Erick Friis
7d65a32ee0 openai: audio modality, remove sockets from unit tests (#27436) 2024-10-18 08:02:09 -07:00
Mateusz Szewczyk
97dc578d47 docs: Update custom name for IBM (#27437)
Thank you for contributing to LangChain!

PR: Update custom name for IBM in api_reference docs

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-17 19:41:24 +00:00
Isaac Francisco
6e228c84a8 docs: update mongo table (#27434) 2024-10-17 18:29:29 +00:00
Erick Friis
2a27234a77 docs: fix error reference header (#27431) 2024-10-17 15:51:33 +00:00
Erick Friis
322ca84812 infra: add ibm api build (#27425)
test build:
https://github.com/langchain-ai/langchain/actions/runs/11386155179
2024-10-17 07:47:29 -07:00
Erick Friis
4d11211c89 infra: schedule triggers monorepo only by default (#27428)
fixes https://github.com/langchain-ai/langchain/issues/27426
2024-10-17 14:31:14 +00:00
Erick Friis
f9cc9bdcf3 core: release 0.3.12 (#27410) 2024-10-17 06:32:40 -07:00
Erick Friis
0ebddabf7d docs, core: error messaging [wip] (#27397) 2024-10-17 03:39:36 +00:00
Eugene Yurtsev
202d7f6c4a core[patch]: 0.3.11 release (#27403)
Core bump to 0.3.11
2024-10-16 15:39:37 -04:00
Erick Friis
a38e903360 docs: platforms -> providers (#27285) 2024-10-16 18:27:07 +00:00
ccurme
fdb7f951c8 monorepo: add script for updating notebook cassettes (#27399)
1. Move dependencies for running notebooks into monorepo poetry test
deps;
2. Add script to update cassettes for a single notebook;
3. Add cassettes for some how-to guides.

---

To update cassettes for a single notebook, run
`docs/scripts/update_cassettes.sh`. For example:
```
./docs/scripts/update_cassettes.sh docs/docs/how_to/binding.ipynb
```
Requires:
1. monorepo dev and test dependencies installed;
2. env vars required by notebook are set.

Note: How-to guides are not currently run in [scheduled
job](https://github.com/langchain-ai/langchain/actions/workflows/run_notebooks.yml).
Will add cassettes for more how-to guides in subsequent PRs before
adding them to scheduled job.
2024-10-16 13:46:49 -04:00
Artur Barseghyan
88d71f6986 docs: Cosmetic documentation fix. Update llm_chain.ipynb. (#27394)
ATM [LLM chain
docs](https://python.langchain.com/docs/tutorials/llm_chain/#server)
say:

```
# 3. Create parser
parser = StrOutputParser()

# 4. Create chain
chain = prompt_template | model | parser


# 4. App definition
app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple API server using LangChain's Runnable interfaces",
)

# 5. Adding chain route
add_routes(
    app,
    chain,
    path="/chain",
)
```

I corrected it to:


```
# 3. Create parser
parser = StrOutputParser()

# 4. Create chain
chain = prompt_template | model | parser

# 5. App definition
app = FastAPI(
  title="LangChain Server",
  version="1.0",
  description="A simple API server using LangChain's Runnable interfaces",
)

# 6. Adding chain route
add_routes(
    app,
    chain,
    path="/chain",
)
```
2024-10-16 17:42:52 +00:00
Bagatur
a4392b070d core[patch]: add convert_to_openai_messages util (#27263)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-16 17:10:10 +00:00
sByteman
31e7664afd community[minor]: add proxy support to RecursiveUrlLoader (#27364)
**Description**
This PR introduces the proxies parameter to the RecursiveUrlLoader
class, allowing the user to specify proxy servers for requests. This
update enables crawling through proxy servers, providing enhanced
flexibility for network configurations.
The key changes include:
  1.Added an optional proxies parameter to the constructor (__init__).
2.Updated the documentation to explain the proxies parameter usage with
an example.
3.Modified the _get_child_links_recursive method to pass the proxies
parameter to the requests.get function.



**Sample Usage**

```python
from bs4 import BeautifulSoup as Soup
from langchain_community.document_loaders.recursive_url_loader import RecursiveUrlLoader

proxies = {
    "http": "http://localhost:1080",
    "https": "http://localhost:1080",
}
url = "https://python.langchain.com/docs/concepts/#langchain-expression-language-lcel"
loader = RecursiveUrlLoader(
    url=url, max_depth=1, extractor=lambda x: Soup(x, "html.parser").text,proxies=proxies
)
docs = loader.load()
```

---------

Co-authored-by: root <root@thb>
2024-10-16 16:29:59 +00:00
Leonid Ganeline
3165415369 docs: integrations updates 21 (#27380)
Added missed provider pages. Added descriptions and links. Fixed
inconsistency in text formatting.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-16 02:41:06 +00:00
Mateusz Szewczyk
591a3db4fb docs: Update IBM ChatWatsonx documentation (#27358)
Thank you for contributing to LangChain!

We would like update IBM ChatWatsonx documentation in LangChain
documentation

Changes:
- Added support for `JSON mode`
- Added support for `Image Input`
- Added support for `Logprobs`

Chat Standard tests ->
https://github.com/langchain-ai/langchain-ibm/blob/main/libs/ibm/tests/integration_tests/test_chat_models_standard.py

Integration_tests job  ->
https://github.com/langchain-ai/langchain-ibm/actions/runs/11327509188

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-16 02:27:56 +00:00
Yuki Watanabe
b8bfebd382 community: Add deprecation notice for Databricks integration in langchain-community (#27355)
We have released the
[langchain-databricks](https://github.com/langchain-ai/langchain-databricks)
package for Databricks integration. This PR deprecates the legacy
classes within `langchain-community`.

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-16 02:20:40 +00:00
xsai9101
15c1ddaf99 community: Add support for clob datatype in oracle database (#27330)
**Description**:
This PR add support of clob/blob data type for oracle document loader,
clob/blob can only be read by oracledb package when connection is open,
so reformat code to process data before connection closes.

**Dependencies**:
oracledb package same as before. pip install oracledb

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-16 02:19:20 +00:00
Leonid Ganeline
8e66822100 docs: integrations google update (#27218)
I've made several titles more compact hence a more compact menu.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 23:08:52 +00:00
Enes Bol
3f74dfc3d8 community[patch]: Fix vLLM integration to filter SamplingParams (#27367)
**Description:**
- This pull request addresses a bug in Langchain's VLLM integration,
where the use_beam_search parameter was erroneously passed to
SamplingParams. The SamplingParams class in vLLM does not support the
use_beam_search argument, which caused a TypeError.

- This PR introduces logic to filter out unsupported parameters,
ensuring that only valid parameters are passed to SamplingParams. As a
result, the integration now functions as expected without errors.

- The bug was reproduced by running the code sample from Langchain’s
documentation, which triggered the error due to the invalid parameter.
This fix resolves that error by implementing proper parameter filtering.

**VLLM Sampling Params Class:**
https://github.com/vllm-project/vllm/blob/main/vllm/sampling_params.py

**Issue:**
I could not found an Issue that belongs to this. Fixes "TypeError:
Unexpected keyword argument 'use_beam_search'" error when using VLLM
from Langchain.

**Dependencies:**
None.

**Tests and Documentation**:
Tests:
No new functionality was added, but I tested the changes by running
multiple prompts through the VLLM integration with various parameter
configurations. All tests passed successfully without breaking
compatibility.

Docs
No documentation changes were necessary as this is a bug fix.

**Reproducing the Error:**

https://python.langchain.com/docs/integrations/llms/vllm/

The code sample from the original documentation can be used to reproduce
the error I got.

from langchain_community.llms import VLLM
llm = VLLM(
    model="mosaicml/mpt-7b",
    trust_remote_code=True,  # mandatory for hf models
    max_new_tokens=128,
    top_k=10,
    top_p=0.95,
    temperature=0.8,
)
print(llm.invoke("What is the capital of France ?"))

![image](https://github.com/user-attachments/assets/3782d6ac-1f7b-4acc-bf2c-186216149de5)


This PR resolves the issue by ensuring that only valid parameters are
passed to SamplingParams.
2024-10-15 21:57:50 +00:00
Erick Friis
edf6d0a0fb partners/couchbase: release 0.2.0 (attempt 2) (#27375) 2024-10-15 14:51:05 -07:00
Erick Friis
d2cd43601b infra: add databricks api build (#27374) 2024-10-15 20:11:23 +00:00
Jorge Piedrahita Ortiz
12fea5b868 community: sambastudio chat model integration minor fix (#27238)
**Description:** sambastudio chat model integration minor fix
 fix default params
 fix usage metadata when streaming
2024-10-15 13:24:36 -04:00
Leonid Ganeline
fead4749b9 docs: integrations updates 20 (#27210)
Added missed provider pages. Added descriptions and links.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 16:38:12 +00:00
ZhangShenao
f3925d71b9 community: Fix word spelling in Text2vecEmbeddings (#27183)
Fix word spelling in `Text2vecEmbeddings`
2024-10-15 09:28:48 -07:00
Erick Friis
92ae61bcc8 multiple: rely on asyncio_mode auto in tests (#27200) 2024-10-15 16:26:38 +00:00
William FH
0a3e089827 [Anthropic] Shallow Copy (#27105)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 15:50:48 +00:00
Matthew Peveler
c6533616b6 docs: fix community pgvector deprecation warning formatting (#27094)
**Description**: PR fixes some formatting errors in deprecation message
in the `langchain_community.vectorstores.pgvector` module, where it was
missing spaces between a few words, and one word was misspelled.
**Issue**: n/a
**Dependencies**: n/a

Signed-off-by: mpeveler@timescale.com
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 15:45:53 +00:00
Erick Friis
3fa5ce3e5f community: clear mypy syntax warning in openapi (#27370)
not completely clear the regex is functional
2024-10-15 15:43:53 +00:00
Ahmet Yasin Aytar
443b37403d community: refactor Arxiv search logic (#27084)
PR message:

Description:
This PR refactors the Arxiv API wrapper by extracting the Arxiv search
logic into a helper function (_fetch_results) to reduce code duplication
and improve maintainability. The helper function is used in methods like
get_summaries_as_docs, run, and lazy_load, streamlining the code and
making it easier to maintain in the future.

Issue:
This is a minor refactor, so no specific issue is being fixed.

Dependencies:
No new dependencies are introduced with this change.

Add tests and docs:
No new integrations were added, so no additional tests or docs are
necessary for this PR.
Lint and test:
I have run make format, make lint, and make test to ensure all checks
pass successfully.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 08:43:03 -07:00
Qiu Qin
57fbc6bdf1 community: Update OCI data science integration (#27083)
This PR updates the integration with OCI data science model deployment
service.

- Update LLM to support streaming and async calls.
- Added chat model.
- Updated tests and docs.
- Updated `libs/community/scripts/check_pydantic.sh` since the use of
`@pre_init` is removed from existing integration.
- Updated `libs/community/extended_testing_deps.txt` as this integration
requires `langchain_openai`.

---------

Co-authored-by: MING KANG <ming.kang@oracle.com>
Co-authored-by: Dmitrii Cherkasov <dmitrii.cherkasov@oracle.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-15 08:32:54 -07:00
Rafael Miller
fc14f675f1 Community: Updated Firecrawl Document Loader to v1 (#26548)
This PR updates the Firecrawl Document Loader to use the recently
released V1 API of Firecrawl.

**Key Updates:**

**Firecrawl V1 Integration:** Updated the document loader to leverage
the new Firecrawl V1 API for improved performance, reliability, and
developer experience.

**Map Functionality Added:** Introduced the map mode for more flexible
document loading options.

These updates enhance the integration and provide access to the latest
features of Firecrawl.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-10-15 13:13:28 +00:00
Max Tran
8fea07f92e community: fixed KeyError: 'client' (#27345)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
Updated

- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
twitter: @MaxHTran

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
Not needed due to small change

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Max Tran <maxtra@amazon.com>
2024-10-14 20:51:13 +00:00
Martin Triska
8dc4bec947 [community] [Bugfix] base_o365 document loader metadata needs to be JSON serializable (#26322)
In order for indexer to work, all metadata in the documents need to be
JSON serializable. Timestamps are not.

See here:

https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/indexing/api.py#L83-L89

@eyurtsev could you please review? It's a tiny PR :-)
2024-10-14 12:48:31 -04:00
YangZhaoo
de62d02102 docs: Maybe there is a missing closing bracket. (#27317)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: yangzhao <yzahha980122@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-14 12:46:56 -04:00
Trayan Azarov
59bbda9ba3 chroma: Deprecating versions 0.5.7 thru 0.5.12 (#27305)
**Description:** Deprecated version of Chroma >=0.5.5 <0.5.12 due to a
serious correctness issue that caused some embeddings for deployments
with multiple collections to be lost (read more on the issue in Chroma
repo)
**Issue:** chroma-core/chroma#2922 (fixed by chroma-core/chroma##2923
and released in
[0.5.13](https://github.com/chroma-core/chroma/releases/tag/0.5.13))
**Dependencies:** N/A
**Twitter handle:** `@t_azarov`
2024-10-14 11:56:05 -04:00
Erick Friis
2197958366 docs: add discussions with giscus (#27172) 2024-10-11 15:14:45 -07:00
Marcelo Nunes Alves
5647276998 community: Problem with embeddings in new versions of clickhouse. (#26041)
Starting with Clickhouse version 24.8, a different type of configuration
has been introduced in the vectorized data ingestion, and if this
configuration occurs, an error occurs when generating the table. As can
be seen below:

![Screenshot from 2024-09-04
11-48-00](https://github.com/user-attachments/assets/70840a93-1001-490c-921a-26924c51d9eb)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-11 18:54:50 +00:00
Sir Qasim
2a1029c53c Update chatbot.ipynb (#27243)
Async invocation:
remove : from at the end of line 
line 441 because there is not any structure block after it.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-10-10 18:03:10 +00:00
Eugene Yurtsev
5b9b8fe80f core[patch]: Ignore ASYNC110 to upgrade to newest ruff version (#27229)
Ignoring ASYNC110 with explanation
2024-10-09 11:25:58 -04:00
Vittorio Rigamonti
7da2efd9d3 community[minor]: VectorStore Infinispan. Adding TLS and authentication (#23522)
**Description**:
this PR enable VectorStore TLS and authentication (digest, basic) with
HTTP/2 for Infinispan server.
Based on httpx.

Added docker-compose facilities for testing
Added documentation

**Dependencies:**
requires `pip install httpx[http2]` if HTTP2 is needed

**Twitter handle:**
https://twitter.com/infinispan
2024-10-09 10:51:39 -04:00
Luke Jang
ff925d2ddc docs: fixed broken API reference link for StructuredTool.from_function (#27181)
Fix broken API reference link for StructuredTool.from_function
2024-10-09 10:05:22 -04:00
Diao Zihao
4553573acb core[patch],langchain[patch],community[patch]: Bump version dependency of tenacity to >=8.1.0,!=8.4.0,<10 (#27201)
This should fixes the compatibility issue with graprag as in

- https://github.com/langchain-ai/langchain/discussions/25595

Here are the release notes for tenacity 9
(https://github.com/jd/tenacity/releases/tag/9.0.0)

---------

Signed-off-by: Zihao Diao <hi@ericdiao.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-09 14:00:45 +00:00
Stefano Lottini
d05fdd97dd community: Cassandra Vector Store: extend metadata-related methods (#27078)
**Description:** this PR adds a set of methods to deal with metadata
associated to the vector store entries. These, while essential to the
Graph-related extension of the `Cassandra` vector store, are also useful
in themselves. These are (all come in their sync+async versions):

- `[a]delete_by_metadata_filter`
- `[a]replace_metadata`
- `[a]get_by_document_id`
- `[a]metadata_search`

Additionally, a `[a]similarity_search_with_embedding_id_by_vector`
method is introduced to better serve the store's internal working (esp.
related to reranking logic).

**Issue:** no issue number, but now all Document's returned bear their
`.id` consistently (as a consequence of a slight refactoring in how the
raw entries read from DB are made back into `Document` instances).

**Dependencies:** (no new deps: packaging comes through langchain-core
already; `cassio` is now required to be version 0.1.10+)


**Add tests and docs**
Added integration tests for the relevant newly-introduced methods.
(Docs will be updated in a separate PR).

**Lint and test** Lint and (updated) test all pass.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-09 06:41:34 +00:00
Erick Friis
84c05b031d community: release 0.3.2 (#27214) 2024-10-08 23:33:55 -07:00
Serena Ruan
a7c1ce2b3f [community] Add timeout control and retry for UC tool execution (#26645)
Add timeout at client side for UCFunctionToolkit and add retry logic.
Users could specify environment variable
`UC_TOOL_CLIENT_EXECUTION_TIMEOUT` to increase the timeout value for
retrying to get the execution response if the status is pending. Default
timeout value is 120s.


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Tested in Databricks:
<img width="1200" alt="image"
src="https://github.com/user-attachments/assets/54ab5dfc-5e57-4941-b7d9-bfe3f8ad3f62">



- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: serena-ruan <serena.rxy@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-09 06:31:48 +00:00
Tomaz Bratanic
481bd25d29 community: Fix database connections for neo4j (#27190)
Fixes https://github.com/langchain-ai/langchain/issues/27185

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 23:47:55 +00:00
Erick Friis
cedf4d9462 langchain: release 0.3.3 (#27213) 2024-10-08 16:39:42 -07:00
Jorge Piedrahita Ortiz
6c33124c72 docs: minor fix sambastudio chat model docs (#27212)
- **Description:**  minor fix sambastudio chat model docs

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 23:34:29 +00:00
Erick Friis
7264fb254c core: release 0.3.10 (#27209) 2024-10-08 16:21:42 -07:00
Bagatur
ce33c4fa40 openai[patch]: default temp=1 for o1 (#27206) 2024-10-08 15:45:21 -07:00
Mateusz Szewczyk
b298d0337e docs: Update IBM ChatWatsonx documentation (#27189) 2024-10-08 21:10:18 +00:00
RIdham Golakiya
73ad7f2e7a langchain_chroma[patch]: updated example for get documents with where clause (#26767)
Example updated for vectorstore ChromaDB.

If we want to apply multiple filters then ChromaDB supports filters like
this:
Reference: [ChromaDB
filters](https://cookbook.chromadb.dev/core/filters/)

Thank you.
2024-10-08 20:21:58 +00:00
Bagatur
e3e9ee8398 core[patch]: utils for adding/subtracting usage metadata (#27203) 2024-10-08 13:15:33 -07:00
ccurme
e3920f2320 community[patch]: fix structured_output in llamacpp integration (#27202)
Resolves https://github.com/langchain-ai/langchain/issues/25318.
2024-10-08 15:16:59 -04:00
Leonid Ganeline
c3cb56a9e8 docs: integrations updates 18 (#27054)
Added missed provider pages. Added descriptions and links. Fixed
inconsistency in text formatting.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 19:05:07 +00:00
Leonid Ganeline
b716d808ba docs: integrations/providers/microsoft update (#27055)
Added reference to the AzureCognitiveServicesToolkit.
Fixed titles.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 19:04:40 +00:00
Mathias Colpaert
feb4be82aa docs: in chatbot tutorial, make docs consistent with code sample (#27042)
**Docs Chatbot Tutorial**

The docs state that you can omit the language parameter, but the code
sample to demonstrate, still contains it.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 18:38:15 +00:00
Ikko Eltociear Ashimine
c10e1f70fe docs: update passio_nutrition_ai.ipynb (#27041)
initalize -> initialize


- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 18:35:48 +00:00
Erick Friis
b84e00283f standard-tests: test that only one chunk sets input_tokens (#27177) 2024-10-08 11:35:32 -07:00
Ajayeswar Reddy
9b7bdf1a26 Fixed typo in llibs/community/langchain_community/storage/sql.py (#27029)
- [ ] **PR title**: docs: fix typo in SQLStore import path

- [ ] **PR message**: 
- **Description:** This PR corrects a typo in the docstrings for the
class SQLStore(BaseStore[str, bytes]). The import path in the docstring
currently reads from langchain_rag.storage import SQLStore, which should
be changed to langchain_community.storage import SQLStore. This typo is
also reflected in the official documentation.
    - **Issue:** N/A
    - **Dependencies:** None
    - **Twitter handle:** N/A

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 17:51:26 +00:00
Nihal Chaudhary
0b36ed09cf DOC:Changed /docs/integrations/tools/jira/ (#27023)
- [x] - **Description:** replaced `%pip install -qU langchain-community`
to `%pip install -qU langchain-community langchain_openai ` in doc
\langchain\docs\docs\integrations\tools\jira.ipynb
- [x] - **Issue:** the issue #27013 
- [x] Add docs

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 17:48:08 +00:00
Jacob Lee
0ec74fbc14 docs: 👥 Update LangChain people data (#27022)
👥 Update LangChain people data

---------

Co-authored-by: github-actions <github-actions@github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 17:09:07 +00:00
Leonid Ganeline
ea9a59bcf5 docs: integrations updates 17 (#27015)
Added missed provider pages. Added missed descriptions and links.
I fixed the Ipex-LLM titles, so the ToC is now sorted properly for these
titles.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-08 17:03:18 +00:00
Vadym Barda
8d27325dbc core[patch]: support ValidationError from pydantic v1 in tools (#27194) 2024-10-08 10:19:04 -04:00
Christophe Bornet
16f5fdb38b core: Add various ruff rules (#26836)
Adds
- ASYNC
- COM
- DJ
- EXE
- FLY
- FURB
- ICN
- INT
- LOG
- NPY
- PD
- Q
- RSE
- SLOT
- T10
- TID
- YTT

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-07 22:30:27 +00:00
Erick Friis
5c826faece core: update make format to fix all autofixable things (#27174) 2024-10-07 15:20:47 -07:00
Christophe Bornet
d31ec8810a core: Add ruff rules for error messages (EM) (#26965)
All auto-fixes

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-07 22:12:28 +00:00
Oleksii Pokotylo
37ca468d03 community: AzureSearch: fix reranking for empty lists (#27104)
**Description:** 
  Fix reranking for empty lists 

**Issue:** 
```
ValueError: not enough values to unpack (expected 3, got 0)
    documents, scores, vectors = map(list, zip(*docs))
  File langchain_community/vectorstores/azuresearch.py", line 1680, in _reorder_results_with_maximal_marginal_relevance
```

Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com>
2024-10-07 15:27:09 -04:00
Bhadresh Savani
8454a742d7 Update README.md for Tutorial to Usecase url (#27099)
Fixed tutorial URL since earlier Tutorial URL was pointing to usecase
age which does not have any detail it should redirect to correct URL
page
2024-10-07 15:24:33 -04:00
Christophe Bornet
c4ebccfec2 core[minor]: Improve support for id in VectorStore (#26660)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-07 15:01:08 -04:00
Bharat Ramanathan
931ce8d026 core[patch]: Update AsyncCallbackManager to honor run_inline attribute and prevent context loss (#26885)
## Description

This PR fixes the context loss issue in `AsyncCallbackManager`,
specifically in `on_llm_start` and `on_chat_model_start` methods. It
properly honors the `run_inline` attribute of callback handlers,
preventing race conditions and ordering issues.

Key changes:
1. Separate handlers into inline and non-inline groups.
2. Execute inline handlers sequentially for each prompt.
3. Execute non-inline handlers concurrently across all prompts.
4. Preserve context for stateful handlers.
5. Maintain performance benefits for non-inline handlers.

**These changes are implemented in `AsyncCallbackManager` rather than
`ahandle_event` because the issue occurs at the prompt and message_list
levels, not within individual events.**

## Testing

- Test case implemented in #26857 now passes, verifying execution order
for inline handlers.

## Related Issues

- Fixes issue discussed in #23909 

## Dependencies

No new dependencies are required.

---

@eyurtsev: This PR implements the discussed changes to respect
`run_inline` in `AsyncCallbackManager`. Please review and advise on any
needed changes.

Twitter handle: @parambharat

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-07 14:59:29 -04:00
Aleksandar Petrov
c61b9daef5 docs: Grammar fix in concepts.mdx (#27149)
Missing "is" in a sentence about the Tool usage.
2024-10-07 18:55:25 +00:00
Eugene Yurtsev
8f8392137a Update MIGRATE.md (#27169)
Upgrade the content of MIGRATE.md so it's in sync
2024-10-07 14:53:40 -04:00
João Carlos Ferra de Almeida
780ce00dea core[minor]: add **kwargs to index and aindex functions for custom vector_field support (#26998)
Added `**kwargs` parameters to the `index` and `aindex` functions in
`libs/core/langchain_core/indexing/api.py`. This allows users to pass
additional arguments to the `add_documents` and `aadd_documents`
methods, enabling the specification of a custom `vector_field`. For
example, users can now use `vector_field="embedding"` when indexing
documents in `OpenSearchVectorStore`

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-07 14:52:50 -04:00
Jorge Piedrahita Ortiz
14de81b140 community: sambastudio chat model (#27056)
**Description:**: sambastudio chat model integration added, previously
only LLM integration
     included docs and tests

---------

Co-authored-by: luisfucros <luisfucros@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-10-07 14:31:39 -04:00
Aditya Anand
f70650f67d core[patch]: correct typo doc-string for astream_events method (#27108)
This commit addresses a typographical error in the documentation for the
async astream_events method. The word 'evens' was incorrectly used in
the introductory sentence for the reference table, which could lead to
confusion for users.\n\n### Changes Made:\n- Corrected 'Below is a table
that illustrates some evens that might be emitted by various chains.' to
'Below is a table that illustrates some events that might be emitted by
various chains.'\n\nThis enhancement improves the clarity of the
documentation and ensures accurate terminology is used throughout the
reference material.\n\nIssue Reference: #27107
2024-10-07 14:12:42 -04:00
Bagatur
38099800cc docs: fix anthropic max_tokens docstring (#27166) 2024-10-07 16:51:42 +00:00
ogawa
07dd8dd3d7 community[patch]: update gpt-4o cost (#27038)
updated OpenAI cost definition according to the following:
https://openai.com/api/pricing/
2024-10-07 09:06:30 -04:00
Averi Kitsch
7a07196df6 docs: update Google Spanner Vector Store documentation (#27124)
Thank you for contributing to LangChain!

- [X] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, etc. is
being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** Update Spanner VS integration doc
    - **Issue:** None
    - **Dependencies:** None
    - **Twitter handle:** NA


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-04 23:59:10 +00:00
Bagatur
06ce5d1d5c anthropic[patch]: Release 0.2.3 (#27126) 2024-10-04 22:38:03 +00:00
Bagatur
0b8416bd2e anthropic[patch]: fix input_tokens when cached (#27125) 2024-10-04 22:35:51 +00:00
Erick Friis
64a16f2cf0 infra: add nvidia and astradb back to api build (#27115)
test build
https://github.com/langchain-ai/langchain/actions/runs/11185115845
2024-10-04 14:41:41 -07:00
Bagatur
bd5b335cb4 standard-tests[patch]: fix oai usage metadata test (#27122) 2024-10-04 20:00:48 +00:00
Bagatur
827bdf4f51 fireworks[patch]: Release 0.2.1 (#27120) 2024-10-04 18:59:15 +00:00
Bagatur
98942edcc9 openai[patch]: Release 0.2.2 (#27119) 2024-10-04 11:54:01 -07:00
Bagatur
414fe16071 anthropic[patch]: Release 0.2.2 (#27118) 2024-10-04 11:53:53 -07:00
Bagatur
11df1b2b8d core[patch]: Release 0.3.9 (#27117) 2024-10-04 18:35:33 +00:00
Scott Hurrey
558fb4d66d box: Add citation support to langchain_box.retrievers.BoxRetriever when used with Box AI (#27012)
Thank you for contributing to LangChain!

**Description:** Box AI can return responses, but it can also be
configured to return citations. This change allows the developer to
decide if they want the answer, the citations, or both. Regardless of
the combination, this is returned as a single List[Document] object.

**Dependencies:** Updated to the latest Box Python SDK, v1.5.1
**Twitter handle:** BoxPlatform


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-04 18:32:34 +00:00
Bagatur
1e768a9ec7 anthropic[patch]: correctly handle tool msg with empty list (#27109) 2024-10-04 11:30:50 -07:00
Bagatur
4935a14314 core,integrations[minor]: Dont error on fields in model_kwargs (#27110)
Given the current erroring behavior, every time we've moved a kwarg from
model_kwargs and made it its own field that was a breaking change.
Updating this behavior to support the old instantiations /
serializations.

Assuming build_extra_kwargs was not something that itself is being used
externally and needs to be kept backwards compatible
2024-10-04 11:30:27 -07:00
Bagatur
0495b7f441 anthropic[patch]: add usage_metadata details (#27087)
fixes https://github.com/langchain-ai/langchain/pull/27087
2024-10-04 08:46:49 -07:00
Erick Friis
e8e5d67a8d openai: fix None token detail (#27091)
happens in Azure
2024-10-04 01:25:38 +00:00
Vadym Barda
2715bed70e docs[patch]: update links w/ new langgraph API ref (#26961)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-03 23:52:01 +00:00
Rashmi Pawar
47142eb6ee docs: Integrations NVIDIA llm documentation (#26934)
**Description:**

Add Notebook for NVIDIA prompt completion llm class.

cc: @sumitkbh @mattf @dglogo

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-03 23:32:45 +00:00
Erick Friis
ab4dab9a0c core: fix batch race condition in FakeListChatModel (#26924)
fixed #26273
2024-10-03 23:14:31 +00:00
Bagatur
87fc5ce688 core[patch]: exclude model cache from ser (#27086) 2024-10-03 22:00:31 +00:00
Bagatur
c09da53978 openai[patch]: add usage metadata details (#27080) 2024-10-03 14:01:03 -07:00
Bagatur
546dc44da5 core[patch]: add UsageMetadata details (#27072) 2024-10-03 20:36:17 +00:00
Sean
cc1b8b3d30 docs: Documentation update for Document Parse (#26844)
Renamed `Layout Analysis` to `Document Parser` in the doc as we have
recently renamed it!

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-10-03 20:36:04 +00:00
Erick Friis
7f730ce8b2 docs: remove spaces in percent pip (#27082) 2024-10-03 20:34:24 +00:00
Tibor Reiss
47a9199fa6 community[patch]: Fix missing protected_namespaces (#27076)
Fixes #26861

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-10-03 20:12:11 +00:00
Bagatur
2a54448a0a langchain[patch]: Release 0.3.2 (#27073) 2024-10-03 18:13:23 +00:00
Bharat Ramanathan
103e573f9b community[patch]: chore warn deprecate the wandb callback handler (#27062)
- **Description:**: This PR deprecates the wandb callback handler in
favor of the new
[WeaveTracer](https://weave-docs.wandb.ai/guides/integrations/langchain#using-weavetracer)
in W&B
- **Dependencies:** No dependencies, just a deprecation warning.
- **Twitter handle:** @parambharat


@baskaryan
2024-10-03 11:59:20 -04:00
Vadym Barda
907c758d67 docs[patch]: add long-term memory agent tutorial (#27057) 2024-10-02 23:02:44 -04:00
Eugene Yurtsev
635c55c039 core[patch]: Release 0.3.8 (#27046)
0.3.8 release for core
2024-10-02 16:58:38 +00:00
Eugene Yurtsev
74bf620e97 core[patch]: Support injected tool args that are arbitrary types (#27045)
This adds support for inject tool args that are arbitrary types when
used with pydantic 2.

We'll need to add similar logic on the v1 path, and potentially mirror
the config from the original model when we're doing the subset.
2024-10-02 12:50:58 -04:00
Erick Friis
e806e9de38 infra: fix api docs build checkout 2 (#27033) 2024-10-01 14:49:35 -07:00
Bagatur
099235da01 Revert "huggingface[patch]: make HuggingFaceEndpoint serializable (#2… (#27032)
…7027)"

This reverts commit b5e28d3a6d.
2024-10-01 21:26:38 +00:00
Bagatur
5f2e93ffea huggingface[patch]: xfail test (#27031) 2024-10-01 21:14:07 +00:00
Bagatur
b5e28d3a6d huggingface[patch]: make HuggingFaceEndpoint serializable (#27027) 2024-10-01 13:16:10 -07:00
ccurme
9d10151123 core[patch]: fix init of RunnableAssign (#26903)
Example in API ref currently raises ValidationError.

Resolves https://github.com/langchain-ai/langchain/issues/26862
2024-10-01 14:21:54 -04:00
Erick Friis
f7583194de docs: build new api docs (#26951) 2024-10-01 09:18:54 -07:00
Erick Friis
95a87291fd community: deprecate community ollama integrations (#26733) 2024-10-01 09:18:07 -07:00
ZhangShenao
e317d457cf Bug-Fix[Community] Fix FastEmbedEmbeddings (#26764)
#26759 

- Fix https://github.com/langchain-ai/langchain/issues/26759 
- Change `model` param from private to public, which may not be
initiated.
- Add test case
2024-09-30 21:23:08 -04:00
Erick Friis
a8e1577f85 milvus: mv to external repo (#26920) 2024-10-01 00:38:30 +00:00
Erick Friis
35f6393144 unstructured: mv to external repo (#26923) 2024-09-30 17:38:21 -07:00
Erick Friis
7ecd720120 multiple: update docs urls to latest 2 (#26837) 2024-09-30 17:37:07 -07:00
Erika Cardenas
4a32cc3c66 Update FeatureTables.js to add Weaviate (#26824)
Thank you for contributing to LangChain!


- [x] **PR message**: 
    - Add Weaviate to the vector store list.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-30 23:05:37 +00:00
William FH
6a861b0ad9 [Doc] Name variable langgraph_agent_executor (#26799) 2024-09-30 15:52:23 -07:00
Ayodele Aransiola
5346c7b27e doc: grammar fix on index.mdx (#26771)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


The PR is an adjustment on few grammar adjustments on the page.
@leomofthings is my twitter handle




If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-30 21:52:39 +00:00
Tomaz Bratanic
446144e7c6 Update neo4j vector procedures (#26775) 2024-09-30 14:45:09 -07:00
Arun Prakash
870bd42b0d docs: GremlinGraph Remove = in the URL (#26705)
- **Description:** URL is appended with = which is not working
    - **Issue:** removing the = symbol makes the URL valid
    - **Twitter handle:** @arunprakash_com

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-30 21:36:30 +00:00
federico-pisanu
2538963945 core[patch]: improve index/aindex api when batch_size<n_docs (#25754)
- **Description:** prevent index function to re-index entire source
document even if nothing has changed.
- **Issue:** #22135

I worked on a solution to this issue that is a compromise between being
cheap and being fast.
In the previous code, when batch_size is greater than the number of docs
from a certain source almost the entire source is deleted (all documents
from that source except for the documents in the first batch)
My solution deletes documents from vector store and record manager only
if at least one document has changed for that source.

Hope this can help!

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-30 20:57:41 +00:00
Eugene Yurtsev
7fde2791dc core[patch]: Add kwargs to Runnable (#27008)
Fixes #26685

---------

Co-authored-by: Tibor Reiss <tibor.reiss@gmail.com>
2024-09-30 16:45:29 -04:00
Christophe Bornet
2a6abd3f0a community[patch]: Add docstring for Links (#25969)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-30 20:33:50 +00:00
Ronan Amicel
19ed3165fb docs: Fix typo in list of PDF loaders (#26774)
Description: Fix typo in list of PDF loaders.

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-30 20:04:18 +00:00
Mohammad Mohtashim
e12f570ead Merge pull request #26794
* [chore]: Agent Observation should be casted to string to avoid errors

* Merge branch 'master' into fix_observation_type_streaming

* [chore]: Using Json.dumps

* [chore]: Exact same logic as  when casting agent oobservation to string
2024-09-30 15:54:51 -04:00
Bagatur
34bd718fe1 core[patch]: Release 0.3.7 (#27004) 2024-09-30 18:52:42 +00:00
Bagatur
248be02259 core[patch]: fix structured prompt template format (#27003)
template_format is an init argument on ChatPromptTemplate but not an
attribute on the object so was getting shoved into
StructuredPrompt.structured_ouptut_kwargs
2024-09-30 11:47:46 -07:00
Bagatur
0078493a80 fireworks[patch]: allow tool_choice with multiple tools (#26999)
https://docs.fireworks.ai/api-reference/post-chatcompletions
2024-09-30 11:28:43 -07:00
Bagatur
c7120d87dd groq[patch]: support tool_choice=any/required (#27000)
https://console.groq.com/docs/api-reference#chat-create
2024-09-30 11:28:35 -07:00
Christophe Bornet
db8845a62a core: Add ruff rules for pycodestyle Warning (W) (#26964)
All auto-fixes.
2024-09-30 09:31:43 -04:00
Bagatur
9404e7af9d openai[patch]: exclude http client (#26891)
httpx clients aren't serializable
2024-09-29 11:16:27 -07:00
Andrew Benton
ce2669cb56 docs: update code interpreter tool table to reflect riza file upload support (#26960)
**Description:** Update the code interpreter tools feature table to
reflect Riza file upload support (blog announcement here:
https://riza.io/blog/adding-support-for-input-files-and-http-credentials)
**Issue:** N/A
**Dependencies:** N/A
2024-09-29 12:04:07 -04:00
Erick Friis
b2c315997c infra: custom commit to external repo (#26962) 2024-09-27 16:39:28 -07:00
Ben Chambers
29bf89db25 community: Add conversions from GVS to networkx (#26906)
These allow converting linked documents (such as those used with
GraphVectorStore) to networkx for rendering and/or in-memory graph
algorithms such as community detection.
2024-09-27 16:48:55 -04:00
Christophe Bornet
7809b31b95 core[patch]: Add ruff rules for flake8-simplify (SIM) (#26848)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-27 20:13:23 +00:00
Eugene Yurtsev
de0b48c41a docs: Upgrade examples with RunnableWithMessageHistory to langgraph memory (#26855)
This PR updates the documentation examples that used
RunnableWithMessageHistory to show how to achieve the same
implementation with langgraph memory.

Some of the underlying PRs (not all of them):

- docs[patch]: update chatbot tutorial and migration guide (#26780)
- docs[patch]: update chatbot memory how-to (#26790)
- docs[patch]: update chatbot tools how-to (#26816)
- docs: update chat history in rag how-to (#26821)
- docs: update trim messages notebook (#26793)
- docs: clean up imports in how to guide for rag qa with chat history
(#26825)
- docs[patch]: update conversational rag tutorial (#26814)

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Vadym Barda <vadym@langchain.dev>
Co-authored-by: mercyspirit <ziying.qiu@gmail.com>
Co-authored-by: aqiu7 <aqiu7@gatech.edu>
Co-authored-by: John <43506685+Coniferish@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Subhrajyoty Roy <subhrajyotyroy@gmail.com>
Co-authored-by: Rajendra Kadam <raj.725@outlook.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
Co-authored-by: Devin Gaffney <itsme@devingaffney.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-09-27 20:04:30 +00:00
ccurme
44eddd39d6 infra[patch]: update notebooks workflow (#26956)
Addressing some lingering comments from
https://github.com/langchain-ai/langchain/pull/26944, adding parameters
for
- python version
- working directory

![Screenshot 2024-09-27 at 3 33
21 PM](https://github.com/user-attachments/assets/dfa45772-fddb-4489-a148-c9ed83d844d0)
2024-09-27 15:39:14 -04:00
ccurme
67df944dfb infra: add CI job for running tutorial notebooks (#26944) 2024-09-27 18:29:49 +00:00
Erick Friis
9eb26c5f9d infra: api docs build ref experimental (#26950) 2024-09-27 10:21:07 -07:00
Erick Friis
135164e1ee infra: api docs build ref update (#26949) 2024-09-27 10:12:10 -07:00
Erick Friis
c38ea7a069 infra: api docs build (#26948) 2024-09-27 09:47:43 -07:00
Christophe Bornet
f4e738bb40 core: Add ruff rules for PIE (#26939)
All auto-fixes.
2024-09-27 12:08:35 -04:00
ccurme
836c2a4ae0 docs: update memory integrations page (#26912) 2024-09-27 10:02:09 -04:00
ccurme
39987ebd91 openai[patch]: update deprecation target in API ref (#26921) 2024-09-27 08:42:31 -04:00
Subhrajyoty Roy
7f37fd8b80 community[patch]: callback before yield for cloudflare (#26927)
**Description:** Moves yield to after callback for `_stream` function
for the cloudfare workersai model in the community llm package
**Issue:** #16913
2024-09-27 08:42:01 -04:00
Youshin Kim
2d9a09dfa4 Fix typo in mlflow code example in mlflow.py (#26931)
- [x] PR title: Fix typo in code example in mlflow.py
- In libs/community/langchain_community/chat_models/mlflow.py
2024-09-27 12:41:39 +00:00
Subhrajyoty Roy
7037ba0f06 community[patch]: callback before yield for mlx pipeline (#26928)
**Description:** Moves yield to after callback for `_stream` function
for the MLX pipeline model in the community llm package
**Issue:** #16913
2024-09-27 08:41:34 -04:00
Subhrajyoty Roy
adcfecdb67 community[patch]: callback before yield for textgen (#26929)
**Description:** Moves callback to before yield for `_stream` and
`_astream` function for the textgen model in the community llm package
**Issue:** #16913
2024-09-27 08:41:13 -04:00
Subhrajyoty Roy
5f2cc4ecb2 community[patch]: callback before yield for titan takeoff (#26930)
**Description:** Moves yield to after callback for `_stream` function
for the titan takeoff model in the community llm package
**Issue:** #16913
2024-09-27 08:40:22 -04:00
Emmanuel Sciara
c6350d636e core[fix]: using async rate limiter methods in async code (#26914)
**Description:** Replaced blocking (sync) rate_limiter code in async
methods.

**Issue:** #26913

**Dependencies:** N/A

**Twitter handle:** no need 🤗
2024-09-26 20:44:28 +00:00
Eugene Yurtsev
02f5962cf1 docs: add api referencs to langgraph (#26877)
Add api references to langgraph
2024-09-26 15:21:10 -04:00
Abhi Agarwal
696114e145 community: add sqlite-vec vectorstore (#25003)
**Description**:

Adds a vector store integration with
[sqlite-vec](https://alexgarcia.xyz/sqlite-vec/), the successor to
sqlite-vss that is a single C file with no external dependencies.

Pretty straightforward, just copy-pasted the sqlite-vss integration and
made a few tweaks and added integration tests. Only question is whether
all documentation should be directed away from sqlite-vss if it is
defacto deprecated (cc @asg017).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: philippe-oger <philippe.oger@adevinta.com>
2024-09-26 17:37:10 +00:00
Erick Friis
8bc12df2eb voyageai: new models (#26907)
Co-authored-by: fzowl <zoltan@voyageai.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
2024-09-26 17:07:10 +00:00
Eugene Yurtsev
2a0d9d05fb docs: Fix trim_messages invocations in the memory migration guide (#26902)
Should only be start_on="human", not start_on=("human", "ai")
2024-09-26 17:02:30 +00:00
Erick Friis
7a99a4d4f8 infra: fix experimental in dco imports check (#26905) 2024-09-26 09:51:58 -07:00
Subhrajyoty Roy
ba467f1a36 community[patch]: callback before yield for gigachat (#26881)
**Description:** Moves yield to after callback for `_stream` and
`_astream` function for the gigachat model in the community llm package
**Issue:** #16913
2024-09-26 12:47:28 -04:00
Subhrajyoty Roy
11e703a97e community[patch]: callback before yield for google palm (#26882)
**Description:** Moves yield to after callback for `_stream` function
for the google palm model in the community package
**Issue:** #16913
2024-09-26 12:47:05 -04:00
Julius Stopforth
121e79b1f0 core: Fix IndexError when trim_messages invoked with empty list (#26896)
This prevents `trim_messages` from raising an `IndexError` when invoked
with `include_system=True`, `strategy="last"`, and an empty message
list.

Fixes #26895

Dependencies: none
2024-09-26 11:29:58 -04:00
ccurme
7091a1a798 openai[patch]: increase token limit in azure integration tests (#26901)
`test_json_mode` occasionally runs into this
2024-09-26 14:31:33 +00:00
Erick Friis
2ea5f60cc5 experimental: migrate to external repo (#26879)
security scanners can't distinguish monorepo sources from each other.
this will resolve issues for folks trying to use e.g. langchain-core but
getting security issues from experimental flagged!
2024-09-25 19:02:19 -07:00
Bagatur
c750600d3d infra: update release secrets (#26878) 2024-09-26 00:12:31 +00:00
Jack Peplinski
edf879d321 docs: update extraction_examples.ipynb (#26874)
The `Without examples 😿` and `With examples 😻` should have different
outputs to illustrate their point.

See v0.2 docs.
https://python.langchain.com/docs/how_to/extraction_examples/#without-examples-

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-25 17:26:42 -04:00
Erick Friis
6f3c8313ba community: bump langchain version (#26876) 2024-09-25 12:58:24 -07:00
Erick Friis
e068407f18 community: bump core versoin (#26875) 2024-09-25 12:57:16 -07:00
Eugene Yurtsev
25cb44c9ee 0.3.1 release community (#26872)
Release for 0.3.1

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-25 19:38:53 +00:00
Erick Friis
9a31ad6f60 langchain: release 0.3.1 (#26868) 2024-09-25 11:43:54 -07:00
Erick Friis
ef2ab26113 core: release 0.3.6 (#26863) 2024-09-25 11:05:53 -07:00
ccurme
87e21493f7 docs[patch]: remove deprecated loaders from feature tables (#26709) 2024-09-25 12:53:32 -04:00
ccurme
a0010063e8 docs[patch]: add guide for loading web pages (#26708) 2024-09-25 12:03:42 -04:00
Bagatur
eaffa92c1d openai[patch]: Release 0.2.1 (#26858) 2024-09-25 15:55:49 +00:00
Rajendra Kadam
51c4393298 community[patch]: Fix validation error in SettingsConfigDict across multiple Langchain modules (#26852)
- **Description:** This pull request addresses the validation error in
`SettingsConfigDict` due to extra fields in the `.env` file. The issue
is prevalent across multiple Langchain modules. This fix ensures that
extra fields in the `.env` file are ignored, preventing validation
errors.
  **Changes include:**
    - Applied fixes to modules using `SettingsConfigDict`.

- **Issue:** NA, similar
https://github.com/langchain-ai/langchain/issues/26850
- **Dependencies:** NA
2024-09-25 10:02:14 -04:00
Devin Gaffney
d502858412 Update main README.md to reference latest version of documentation (#26854)
Update README.md to point at latest docs
2024-09-25 09:44:18 -04:00
Eugene Yurtsev
27c12146c8 docs[patch]: In conceptual docs explain constraints on ToolMessage (#26792)
Minor clarification
2024-09-25 09:34:45 -04:00
Christophe Bornet
3a1b9259a7 core: Add ruff rules for comprehensions (C4) (#26829) 2024-09-25 09:34:17 -04:00
Rajendra Kadam
7e5a9c317f community[minor]: [Pebblo] Enhance PebbloSafeLoader to take anonymize flag (#26812)
- **Description:** The flag is named `anonymize_snippets`. When set to
true, the Pebblo server will anonymize snippets by redacting all
personally identifiable information (PII) from the snippets going into
VectorDB and the generated reports
- **Issue:** NA
- **Dependencies:** NA
- **docs**: Updated
2024-09-25 09:33:06 -04:00
Rajendra Kadam
92003b3724 community[patch]: [SharePointLoader] Fix validation error in _O365Settings due to extra fields in .env file (#26851)
**Description:** Fix validation error in _O365Settings by ignoring extra
fields in .env file
**Issue:** https://github.com/langchain-ai/langchain/issues/26850
**Dependencies:** NA
2024-09-25 09:31:59 -04:00
Subhrajyoty Roy
b61fb98466 community[patch]: callback before yield for friendli (#26842)
**Description:** Moves yield to after callback for `_stream` and
`_astream` function for the friendli model in the community package
**Issue:** #16913
2024-09-25 09:31:12 -04:00
ccurme
13acf9e6b0 langchain[patch]: add deprecation warnings (#26853) 2024-09-25 09:26:44 -04:00
William FH
82b5b77940 [Core] Add more interops tests (#26841)
To test that the client propagates both ways
2024-09-24 20:18:20 -07:00
William FH
9b6ac41442 [Core] Inherit tracing metadata & tags (#26838) 2024-09-24 19:33:12 -07:00
Erick Friis
3796e143f8 docs: remove one more print from build (#26834) 2024-09-24 22:40:16 +00:00
Erick Friis
95269366ae docs: make build less verbose (#26833) 2024-09-24 22:30:05 +00:00
Erick Friis
425c0f381f experimental: release 0.3.1 (#26830) 2024-09-24 15:03:05 -07:00
John
6c3ea262c8 partners/unstructured: release 0.1.5 (#26831)
**Description:** update package version to support loading URLs #26670
**Issue:**  #26697
2024-09-24 15:02:53 -07:00
mercyspirit
0414be4b80 experimental[major]: CVE-2024-46946 fix (#26783)
Description: Resolve CVE-2024-46946 by switching out sympify with
parse_expr with a very specific allowed set of operations.

https://nvd.nist.gov/vuln/detail/cve-2024-46946

Sympify uses eval which makes it vulnerable to code execution.
parse_expr is limited to specific expressions.

Bandit results

![image](https://github.com/user-attachments/assets/170a6376-7028-4e70-a7ef-9acfb49c1d8a)

---------

Co-authored-by: aqiu7 <aqiu7@gatech.edu>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-24 21:37:56 +00:00
Erick Friis
f9ef688b3a docs: upgrade to docusaurus v3 (#26803) 2024-09-24 11:28:13 -07:00
Subhrajyoty Roy
b1da532522 community[patch]: callback before yield for deepsparse llm (#26822)
**Description:** Moves yield to after callback for `_stream` and
`_astream` function for the deepsparse model in the community package
**Issue:** #16913
2024-09-24 13:55:52 -04:00
Nuno Campos
de70a64e3a core: Run LangChainTracer inline (#26797)
- this flag ensures the tracer always runs in the same thread as the run
being traced for both sync and async runs
- pro: less chance for ordering bugs and other oddities
- blocking the event loop is not a concern given all code in the tracer
holds the GIL anyway
2024-09-24 08:31:18 -07:00
Jorge Piedrahita Ortiz
408a930d55 community: Add Sambanova Cloud Chat model community integration (#26333)
**Description:** : Add SambaNova Cloud Chat model community integration
Includes 
- chat model integration (following Standardize ChatModel docstrings)
-  tests
- docs usage notebook (following Standardize ChatModel integration docs)

https://cloud.sambanova.ai/

---------

Co-authored-by: luisfucros <luisfucros@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-09-24 14:11:32 +00:00
Tom
2b83c7c3ab community[patch]: Fix tool_calls parsing when streaming from DeepInfra (#26813)
- **Description:** This PR fixes the response parsing logic for
`ChatDeepInfra`, more specifially `_convert_delta_to_message_chunk()`,
which is invoked when streaming via `ChatDeepInfra`.
- **Issue:** Streaming from DeepInfra via `ChatDeepInfra` is currently
broken because the response parsing logic doesn't handle that
`tool_calls` can be `None`. (There is no GitHub issue for this problem
yet.)
- **Dependencies:** –
- **Twitter handle:** –

Keeping this here as a reminder:
> If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-24 13:47:36 +00:00
Subhrajyoty Roy
997d95c8f8 community[patch]: callback before yield for bedrock llm (#26804)
**Description:** Moves yield to after callback for
`_prepare_input_and_invoke_stream` and
`_aprepare_input_and_invoke_stream` for bedrock llm in community
package.
**Issue:** #16913
2024-09-24 12:14:59 +00:00
Erick Friis
e40a2b8bbf docs: fix mdx codefences (#26802)
```
git grep -l -E '"```\{=mdx\}\\n",' | xargs perl -0777 -i -pe 's/"```\{=mdx\}\\n",\n    (\W.*?),\n\s*"```\\?n?"/$1/s'
```
2024-09-24 06:06:13 +00:00
Erick Friis
35081d2765 docs: fix admonition formatting (#26801) 2024-09-23 21:55:17 -07:00
Erick Friis
603d38f06d docs: make docs mdxv2 compatible (#26798)
prep for docusaurus migration
2024-09-23 21:24:23 -07:00
ccurme
2a4c5713cd openai[patch]: fix azure integration tests (#26791) 2024-09-23 17:49:15 -04:00
ccurme
1ce056d1b2 docs[patch]: add memory migration guides to sidebar (#26711)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-23 15:31:27 -04:00
Mohammad Mohtashim
154a5ff7ca core[patch]: On Chain Start Fix for Chain Class (#26593)
- **Issue:** #26588

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-23 19:30:59 +00:00
ccurme
bba7af903b core[patch]: set default on Blob (#26787)
Resolves https://github.com/langchain-ai/langchain/issues/26781
2024-09-23 18:55:56 +00:00
ccurme
97b27f0930 langchain[patch]: fix extended tests (#26788)
Broken by addition of `disabled_params`
2024-09-23 18:52:09 +00:00
Brace Sproul
fb9ac8da2f fix(docs): Drop announcement bar (#26782) 2024-09-23 18:03:59 +00:00
Bagatur
e1e4f88b3e openai[patch]: enable Azure structured output, parallel_tool_calls=Fa… (#26599)
…lse, tool_choice=required

response_format=json_schema, tool_choice=required, parallel_tool_calls
are all supported for gpt-4o on azure.
2024-09-22 22:25:22 -07:00
Gabriel Altay
bb40a0fb32 Remove pydantic restricted namespaces from HuggingFaceInferenceAPIEmbedings (#26744)
without this `model_config` importing this package produces warnings
about "model_name" having conflicts with protected namespace "model_".

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-22 08:05:37 -04:00
Gor Hayrapetyan
f97ac92f00 community[patch]: Handle empty PR body in get_pull_request in Github utility (#26739)
**Description:**
When PR body is empty `get_pull_request` method fails with bellow
exception.


**Issue:**
```
TypeError('expected string or buffer')Traceback (most recent call last):


  File ".../.venv/lib/python3.9/site-packages/langchain_core/tools/base.py", line 661, in run
    response = context.run(self._run, *tool_args, **tool_kwargs)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/tools/github/tool.py", line 52, in _run
    return self.api_wrapper.run(self.mode, query)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 816, in run
    return json.dumps(self.get_pull_request(int(query)))


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 495, in get_pull_request
    add_to_dict(response_dict, "body", pull.body)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 487, in add_to_dict
    tokens = get_tokens(value)


  File ".../.venv/lib/python3.9/site-packages/langchain_community/utilities/github.py", line 483, in get_tokens
    return len(tiktoken.get_encoding("cl100k_base").encode(text))


  File "....venv/lib/python3.9/site-packages/tiktoken/core.py", line 116, in encode
    if match := _special_token_regex(disallowed_special).search(text):


TypeError: expected string or buffer
```

**Twitter:**  __gorros__
2024-09-22 01:56:24 +00:00
Erick Friis
238a31bbd9 core: release 0.3.5 (#26737) 2024-09-21 00:26:39 +00:00
William FH
55af6fbd02 [LangChainTracer] Omit Chunk (#26602)
in events / new llm token
2024-09-20 17:10:34 -07:00
Anton Dubovik
3e2cb4e8a4 openai: embeddings: supported chunk_size when check_embedding_ctx_length is disabled (#23767)
Chunking of the input array controlled by `self.chunk_size` is being
ignored when `self.check_embedding_ctx_length` is disabled. Effectively,
the chunk size is assumed to be equal 1 in such a case. This is
suprising.

The PR takes into account `self.chunk_size` passed by the user.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 16:58:45 -07:00
William FH
864020e592 [Tracer] add project name to run from tracer (#26736) 2024-09-20 16:48:37 -07:00
Nithish Raghunandanan
2d21274bf6 couchbase: Add ttl support to caches & chat_message_history (#26214)
**Description:** Add support to delete documents automatically from the
caches & chat message history by adding a new optional parameter, `ttl`.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:44:29 +00:00
Krishna Kulkarni
c6c508ee96 Refining Skip Count Calculation by Filtering Documents with session_id (#26020)
In the previous implementation, `skip_count` was counting all the
documents in the collection. Instead, we want to filter the documents by
`session_id` and calculate `skip_count` by subtracting `history_size`
from the filtered count.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-20 23:40:56 +00:00
Tibor Reiss
a8b24135a2 fix[experimental]: Fix text splitter with gradient (#26629)
Fixes #26221

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:35:50 +00:00
Alejandro Rodríguez
4ac9a6f52c core: fix "template" not allowed as prompt param (#26060)
- **Description:**  fix "template" not allowed as prompt param
- **Issue:** #26058
- **Dependencies:** none


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:33:06 +00:00
Christophe Bornet
58f339a67c community: Fix links in GraphVectorStore pydoc (#25959)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 23:17:53 +00:00
Christophe Bornet
e49c413977 core: Add docstring for GraphVectorStoreRetriever (#26224)
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-09-20 23:16:37 +00:00
Lucain
a2023a1e96 huggingface; fix huggingface_endpoint.py (initialize clients only with supported kwargs) (#26378)
## Description

By default, `HuggingFaceEndpoint` instantiates both the
`InferenceClient` and the `AsyncInferenceClient` with the
`"server_kwargs"` passed as input. This is an issue as both clients
might not support exactly the same kwargs. This has been highlighted in
https://github.com/huggingface/huggingface_hub/issues/2522 by
@morgandiverrez with the `trust_env` parameter. In order to make
`langchain` integration future-proof, I do think it's wiser to forward
only the supported parameters to each client. Parameters that are not
supported are simply ignored with a warning to the user. From a
`huggingface_hub` maintenance perspective, this allows us much more
flexibility as we are not constrained to support the exact same kwargs
in both clients.

## Issue

https://github.com/huggingface/huggingface_hub/issues/2522

## Dependencies

None

## Twitter 

https://x.com/Wauplin

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 16:05:24 -07:00
ccurme
f2285376a5 community[patch]: add web loader tests (#26728) 2024-09-20 18:29:54 -04:00
Erick Friis
4a2745064a core: release 0.3.4 (#26729) 2024-09-20 14:47:15 -07:00
Nuno Campos
345edeb1f0 core: In astream_events propagate cancellation reason to inner task (#26727)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-20 14:42:10 -07:00
Erick Friis
465e43cd43 core: release 0.3.3 (#26713) 2024-09-20 13:54:19 -07:00
Eugene Yurtsev
4fc69d61ad core[patch]: Fix defusedxml import (#26718)
Fix defusedxml import. Haven't investigated what's actually going on
under the hood -- defusedxml probably does some weird things in __init__
2024-09-20 16:53:24 -04:00
Eugene Yurtsev
79b224f6f3 core/langchain: fix version used in deprecation (#26724)
in core deprecation should be version 0.3.3 instead of 0.3.4
in langchain deprecation should be version 0.3.1 instead of 0.3.4
2024-09-20 16:47:18 -04:00
Eugene Yurtsev
8a9f7091c0 docs: Update trim message usage in migrating_memory (#26722)
Make sure we don't end up with a ToolMessage that precedes an AIMessage
2024-09-20 20:20:27 +00:00
Eugene Yurtsev
91f4711e53 core[patch],langchain[patch]: deprecate memory and entity abstractions and implementations (#26717)
This PR deprecates the old memory, entity abstractions and implementations
2024-09-20 15:06:25 -04:00
William FH
19ce95d3c9 Avoid copying runs (#26689)
Also, re-unify run trees. Use a single shared client.
2024-09-20 10:57:41 -07:00
Eric
90031b1b3e support epsilla cloud vector database in langchain (#26065)
Description

- support epsilla cloud in langchain

---------

Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-20 17:14:23 +00:00
ZhangShenao
baef7639fd Improvement[text-splitter] Fix import of ExperimentalMarkdownSyntaxTextSplitter (#26703)
#26028 

Export `ExperimentalMarkdownSyntaxTextSplitter` in __init__

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-20 17:04:31 +00:00
Eugene Yurtsev
acf8c2c13e docs: Add migration instructions for v0.0.x memory abstractions (#26668)
This PR adds a migration guide for any code that relies on old memory
abstractions.
2024-09-20 15:09:23 +00:00
ccurme
eeab6a688c docs[patch]: update PDF loader docs (#26627)
Docs preview:
https://langchain-git-cc-pdfdocs-langchain.vercel.app/docs/how_to/document_loader_pdf/
2024-09-20 11:07:06 -04:00
stein1988
91594928c5 fix:fix ChatZhipuAI tool call bug (#26693)
- [ ] **PR title**: "community:fix ChatZhipuAI tool call bug"

- [ ] **Description:** ZhipuAI api response as follows:
{'id': '20240920132549e379a9152a6a4d7c', 'created': 1726809949, 'model':
'glm-4-flash', 'choices': [{'index': 0, 'finish_reason': 'tool_calls',
'delta': {'role': 'assistant', 'tool_calls': [{'id':
'call_20240920132549e379a9152a6a4d7c', 'index': 0, 'type': 'function',
'function': {'name': 'get_datetime_offline', 'arguments': '{}'}}]}}]}
so, tool_calls = dct.get("tool_call", None) in
_convert_delta_to_message_chunk should be "tool_calls"
2024-09-20 13:06:42 +00:00
guoqiang0401
8f0c04f47e Update tool_calling.ipynb (#26699)
There is a small bug in "TypedDict class" sample source.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-20 13:04:50 +00:00
Bagatur
f7bb3640f1 core[patch]: support js chat model namespaces (#26688) 2024-09-19 16:14:20 -07:00
Bagatur
c453b76579 core[patch]: Release 0.3.2 (#26686) 2024-09-19 14:58:45 -07:00
Piyush Jain
f087ab43fd core[patch]: Fix load of ChatBedrock (#26679)
Complementary PR to master for #26643.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-09-19 21:57:20 +00:00
Bagatur
409f35363b core[patch]: support load from path for default namespaces (#26675) 2024-09-19 14:47:27 -07:00
Eugene Yurtsev
e8236e58f2 ci: restore qa template that was known to work (#26684)
Restore qa template that was working
2024-09-19 17:20:42 -04:00
ccurme
eef18dec44 unstructured[patch]: support loading URLs (#26670)
`unstructured.partition.auto.partition` supports a `url` kwarg, but
`url` in `UnstructuredLoader.__init__` is reserved for the server URL.
Here we add a `web_url` kwarg that is passed to the partition kwargs:
```python
self.unstructured_kwargs["url"] = web_url
```
2024-09-19 11:40:25 -07:00
Erick Friis
311f861547 core, community: move graph vectorstores to community (#26678)
remove beta namespace from core, add to community
2024-09-19 11:38:14 -07:00
Serena Ruan
c77c28e631 [community] Fix WorkspaceClient error with pydantic validation (#26649)
Thank you for contributing to LangChain!

Fix error like
<img width="1167" alt="image"
src="https://github.com/user-attachments/assets/2e219b26-ec7e-48ef-8111-e0ff2f5ac4c0">

After the fix:
<img width="584" alt="image"
src="https://github.com/user-attachments/assets/48f36fe7-628c-48b6-81b2-7fe741e4ca85">


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: serena-ruan <serena.rxy@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 18:25:33 +00:00
ccurme
7d49ee9741 unstructured[patch]: add to integration tests (#26666)
- Add to tests on parsed content;
- Add tests for async + lazy loading;
- Add a test for `strategy="hi_res"`.
2024-09-19 13:43:34 -04:00
Erick Friis
28dd6564db docs: highlight styling (#26636)
MERGE ME PLEASE
2024-09-19 17:12:59 +00:00
ccurme
f91bdd12d2 community[patch]: add to pypdf tests and run in CI (#26663) 2024-09-19 14:45:49 +00:00
ice yao
4d3d62c249 docs: fix nomic link error (#26642) 2024-09-19 14:41:45 +00:00
Rajendra Kadam
60dc19da30 [community] Added PebbloTextLoader for loading text data in PebbloSafeLoader (#26582)
- **Description:** Added PebbloTextLoader for loading text in
PebbloSafeLoader.
- Since PebbloSafeLoader wraps document loaders, this new loader enables
direct loading of text into Documents using PebbloSafeLoader.
- **Issue:** NA
- **Dependencies:** NA
- [x] **Tests**: Added/Updated tests
2024-09-19 09:59:04 -04:00
Jorge Piedrahita Ortiz
55b641b761 community: fix error in sambastudio embeddings (#26260)
fix error in samba studio embeddings  result unpacking
2024-09-19 09:57:04 -04:00
Jorge Piedrahita Ortiz
37b72023fe community: remove sambaverse (#26265)
removing Sambaverse llm model and references given is not available
after Sep/10/2024

<img width="1781" alt="image"
src="https://github.com/user-attachments/assets/4dcdb5f7-5264-4a03-b8e5-95c88304e059">
2024-09-19 09:56:30 -04:00
Martin Triska
3fc0ea510e community : [bugfix] Use document ids as keys in AzureSearch vectorstore (#25486)
# Description
[Vector store base
class](4cdaca67dc/libs/core/langchain_core/vectorstores/base.py (L65))
currently expects `ids` to be passed in and that is what it passes along
to the AzureSearch vector store when attempting to `add_texts()`.
However AzureSearch expects `keys` to be passed in. When they are not
present, AzureSearch `add_embeddings()` makes up new uuids. This is a
problem when trying to run indexing. [Indexing code
expects](b297af5482/libs/core/langchain_core/indexing/api.py (L371))
the documents to be uploaded using provided ids. Currently AzureSearch
ignores `ids` passed from `indexing` and makes up new ones. Later when
`indexer` attempts to delete removed file, it uses the `id` it had
stored when uploading the document, however it was uploaded under
different `id`.

**Twitter handle: @martintriska1**
2024-09-19 09:37:18 -04:00
Tomaz Bratanic
a8561bc303 Fix async parsing for llm graph transformer (#26650) 2024-09-19 09:15:33 -04:00
Erik
4e0a6ebe7d community: Add warning when page_content is empty (#25955)
Page content sometimes is empty when PyMuPDF can not find text on pages.
For example, this can happen when the text of the PDF is not copyable
"by hand". Then an OCR solution is need - which is not integrated here.

This warning should accurately warn the user that some pages are lost
during this process.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 05:22:09 +00:00
Christophe Bornet
fd21ffe293 core: Add N(naming) ruff rules (#25362)
Public classes/functions are not renamed and rule is ignored for them.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 05:09:39 +00:00
Daniel Cooke
7835c0651f langchain_chroma: Pass through kwargs to Chroma collection.delete (#25970)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 04:21:24 +00:00
Tibor Reiss
85caaa773f docs[community]: Fix raw string in docstring (#26350)
Fixes #26212: replaced the raw string with backslashes. Alternative:
raw-stringif the full docstring.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-09-19 04:18:56 +00:00
Erick Friis
8fb643a6e8 partners/box: release 0.2.1 (#26644) 2024-09-19 04:02:06 +00:00
Tomaz Bratanic
03b9aca55d community: Retry retriable errors in Neo4j (#26211)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-19 04:01:07 +00:00
Scott Hurrey
acbb4e4701 box: Add searchoptions for BoxRetriever, documentation for BoxRetriever as agent tool (#26181)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


Added search options for BoxRetriever and added documentation to
demonstrate how to use BoxRetriever as an agent tool - @BoxPlatform


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-18 21:00:06 -07:00
Erick Friis
e0c36afc3e docs: v0.3 link redirect (#26632) 2024-09-18 14:28:56 -07:00
Erick Friis
9909354cd0 core: use ruff.target-version instead (#26634)
tested on one of the replacement cases and seems to work! 
![ScreenShot 2024-09-18 at 02 02
43PM](https://github.com/user-attachments/assets/7170975a-2542-43ed-a203-d4126c6a2c81)
2024-09-18 21:06:14 +00:00
Erick Friis
84b831356c core: remove [project] tag from pyproject (#26633)
makes core incompatible with uv installs
2024-09-18 20:39:49 +00:00
Christophe Bornet
a47b332841 core: Put Python version as a project requirement so it is considered by ruff (#26608)
Ruff doesn't know about the python version in
`[tool.poetry.dependencies]`. It can get it from
`project.requires-python`.

Notes:
* poetry seems to have issues getting the python constraints from
`requires-python` and using `python` in per dependency constraints. So I
had to duplicate the info. I will open an issue on poetry.
* `inspect.isclass()` doesn't work correctly with `GenericAlias`
(`list[...]`, `dict[..., ...]`) on Python <3.11 so I added some `not
isinstance(type, GenericAlias)` checks:

Python 3.11
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
False
```

Python 3.9
```pycon
>>> import inspect
>>> inspect.isclass(list)
True
>>> inspect.isclass(list[str])
True
```

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-18 14:37:57 +00:00
Patrick McGleenon
0f07cf61da docs: fixed typo in XML document loader (#26613)
Fixed typo `Unstrucutred`
2024-09-18 14:26:57 +00:00
Erick Friis
d158401e73 infra: master release checkout ref for release note (#26605) 2024-09-18 01:51:54 +00:00
Bagatur
de58942618 docs: consolidate dropdowns (#26600) 2024-09-18 01:24:10 +00:00
Bagatur
df38d5250f docs: cleanup nav (#26546) 2024-09-17 17:49:46 -07:00
sanjay920
b246052184 docs: fix typo in clickhouse vectorstore doc (#26598)
- **Description:** typo in clickhouse vectorstore doc
- **Issue:** #26597
- **Dependencies:** none
- **Twitter handle:** sanjay920

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 23:33:22 +00:00
Miguel Grinberg
52729ac0be docs: update hybrid search example with Elasticsearch retriever (#26328)
- **Description:** the example to perform hybrid search with the
Elasticsearch retriever is out of date
- **Issue:** N/A
- **Dependencies:** N/A

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 23:15:27 +00:00
Marco Rossi IT
f62d454f36 docs: fix typo on amazon_textract.ipynb (#26493)
- **Description:** fixed a typo on amazon textract page

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 22:27:45 +00:00
gbaian10
6fe2536c5a docs: fix the ImportError in google_speech_to_text.ipynb (#26522)
fix #26370

- #26370 

`GoogleSpeechToTextLoader` is a deprecated method in
`langchain_community.document_loaders.google_speech_to_text`.

The new recommended usage is to use `SpeechToTextLoader` from
`langchain_google_community`.

When importing from `langchain_google_community`, use the name
`SpeechToTextLoader` instead of the old `GoogleSpeechToTextLoader`.


![image](https://github.com/user-attachments/assets/3a8bd309-9858-4938-b7db-872f51b9542e)

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 22:18:57 +00:00
Zhanwei Zhang
418b170f94 docs: Fix typo in conda environment code block in rag.ipynb (#26487)
Thank you for contributing to LangChain!

- [x] **PR title**: Fix typo in conda environment code block in
rag.ipynb
  - In docs/tutorials/rag.ipynb

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 22:13:55 +00:00
ZhangShenao
c3b3f46cb8 Improvement[Community] Improve api doc of BeautifulSoupTransformer (#26423)
- Add missing args

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 22:00:07 +00:00
ogawa
e2245fac82 community[patch]: o1-preview and o1-mini costs (#26411)
updated OpenAI cost definitions according to the following:
https://openai.com/api/pricing/

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:59:46 +00:00
ZhangShenao
1a8e9023de Improvement[Community] Improve streamlit_callback_handler (#26373)
- add decorator for static methods

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:54:37 +00:00
Bagatur
1a62f9850f anthropic[patch]: Release 0.2.1 (#26592) 2024-09-17 14:44:21 -07:00
Harutaka Kawamura
6ed50e78c9 community: Rename deployments server to AI gateway (#26368)
We recently renamed `MLflow Deployments Server` to `MLflow AI Gateway`
in mlflow. This PR updates the relevant notebooks to use `MLflow AI
gateway`

---

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: harupy <17039389+harupy@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:36:04 +00:00
Bagatur
5ced41bf50 anthropic[patch]: fix tool call and tool res image_url handling (#26587)
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-09-17 14:30:07 -07:00
Christophe Bornet
c6bdd6f482 community: Fix references in link extractors docstrings (#26314)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:26:25 +00:00
Christophe Bornet
3a99467ccb core[patch]: Add ruff rule UP006(use PEP585 annotations) (#26574)
* Added rules `UPD006` now that Pydantic is v2+

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-17 21:22:50 +00:00
wlleiiwang
2ef4c9466f community: modify document links for tencent vectordb (#26316)
- modify document links for create a tencent vectordb database instance.

Co-authored-by: wlleiiwang <wlleiiwang@tencent.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 21:11:10 +00:00
Erick Friis
194adc485c docs: pypi readme image links (#26590) 2024-09-17 20:41:34 +00:00
Bagatur
97b05d70e6 docs: anthropic api ref nit (#26591) 2024-09-17 20:39:53 +00:00
Bagatur
e1d113ea84 core,openai,grow,fw[patch]: deprecate bind_functions, update chat mod… (#26584)
…el api ref
2024-09-17 11:32:39 -07:00
ccurme
7c05f71e0f milvus[patch]: fix vectorstore integration tests (#26583)
Resolves https://github.com/langchain-ai/langchain/issues/26564
2024-09-17 14:17:05 -04:00
Bagatur
145a49cca2 core[patch]: Release 0.3.1 (#26581) 2024-09-17 17:34:09 +00:00
Nuno Campos
5fc44989bf core[patch]: Fix "argument of type 'NoneType' is not iterable" error in LangChainTracer (#26576)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-17 10:29:46 -07:00
Erick Friis
f4a65236ee infra: only force reinstall on release (#26580) 2024-09-17 17:12:17 +00:00
Isaac Francisco
06cde06a20 core[minor]: remove beta from RemoveMessage (#26579) 2024-09-17 17:09:58 +00:00
Erick Friis
3e51fdc840 infra: more skip if pull request libs (#26578) 2024-09-17 09:48:02 -07:00
RUO
0a177ec2cc community: Enhance MongoDBLoader with flexible metadata and optimized field extraction (#23376)
### Description:
This pull request significantly enhances the MongodbLoader class in the
LangChain community package by adding robust metadata customization and
improved field extraction capabilities. The updated class now allows
users to specify additional metadata fields through the metadata_names
parameter, enabling the extraction of both top-level and deeply nested
document attributes as metadata. This flexibility is crucial for users
who need to include detailed contextual information without altering the
database schema.

Moreover, the include_db_collection_in_metadata flag offers optional
inclusion of database and collection names in the metadata, allowing for
even greater customization depending on the user's needs.

The loader's field extraction logic has been refined to handle missing
or nested fields more gracefully. It now employs a safe access mechanism
that avoids the KeyError previously encountered when a specified nested
field was absent in a document. This update ensures that the loader can
handle diverse and complex data structures without failure, making it
more resilient and user-friendly.

### Issue:
This pull request addresses a critical issue where the MongodbLoader
class in the LangChain community package could throw a KeyError when
attempting to access nested fields that may not exist in some documents.
The previous implementation did not handle the absence of specified
nested fields gracefully, leading to runtime errors and interruptions in
data processing workflows.

This enhancement ensures robust error handling by safely accessing
nested document fields, using default values for missing data, thus
preventing KeyError and ensuring smoother operation across various data
structures in MongoDB. This improvement is crucial for users working
with diverse and complex data sets, ensuring the loader can adapt to
documents with varying structures without failing.

### Dependencies: 
Requires motor for asynchronous MongoDB interaction.

### Twitter handle: 
N/A

### Add tests and docs
Tests: Unit tests have been added to verify that the metadata inclusion
toggle works as expected and that the field extraction correctly handles
nested fields.
Docs: An example notebook demonstrating the use of the enhanced
MongodbLoader is included in the docs/docs/integrations directory. This
notebook includes setup instructions, example usage, and outputs.
(Here is the notebook link : [colab
link](https://colab.research.google.com/drive/1tp7nyUnzZa3dxEFF4Kc3KS7ACuNF6jzH?usp=sharing))
Lint and test
Before submitting, I ran make format, make lint, and make test as per
the contribution guidelines. All tests pass, and the code style adheres
to the LangChain standards.

```python
import unittest
from unittest.mock import patch, MagicMock
import asyncio
from langchain_community.document_loaders.mongodb import MongodbLoader

class TestMongodbLoader(unittest.TestCase):
    def setUp(self):
        """Setup the MongodbLoader test environment by mocking the motor client 
        and database collection interactions."""
        # Mocking the AsyncIOMotorClient
        self.mock_client = MagicMock()
        self.mock_db = MagicMock()
        self.mock_collection = MagicMock()

        self.mock_client.get_database.return_value = self.mock_db
        self.mock_db.get_collection.return_value = self.mock_collection

        # Initialize the MongodbLoader with test data
        self.loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )

    @patch('langchain_community.document_loaders.mongodb.AsyncIOMotorClient', return_value=MagicMock())
    def test_constructor(self, mock_motor_client):
        """Test if the constructor properly initializes with the correct database and collection names."""
        loader = MongodbLoader(
            connection_string="mongodb://localhost:27017",
            db_name="testdb",
            collection_name="testcol"
        )
        self.assertEqual(loader.db_name, "testdb")
        self.assertEqual(loader.collection_name, "testcol")

    def test_aload(self):
        """Test the aload method to ensure it correctly queries and processes documents."""
        # Setup mock data and responses for the database operations
        self.mock_collection.count_documents.return_value = asyncio.Future()
        self.mock_collection.count_documents.return_value.set_result(1)
        self.mock_collection.find.return_value = [
            {"_id": "1", "content": "Test document content"}
        ]

        # Run the aload method and check responses
        loop = asyncio.get_event_loop()
        results = loop.run_until_complete(self.loader.aload())
        self.assertEqual(len(results), 1)
        self.assertEqual(results[0].page_content, "Test document content")

    def test_construct_projection(self):
        """Verify that the projection dictionary is constructed correctly based on field names."""
        self.loader.field_names = ['content', 'author']
        self.loader.metadata_names = ['timestamp']
        expected_projection = {'content': 1, 'author': 1, 'timestamp': 1}
        projection = self.loader._construct_projection()
        self.assertEqual(projection, expected_projection)

if __name__ == '__main__':
    unittest.main()
```


### Additional Example for Documentation
Sample Data:

```json
[
    {
        "_id": "1",
        "title": "Artificial Intelligence in Medicine",
        "content": "AI is transforming the medical industry by providing personalized medicine solutions.",
        "author": {
            "name": "John Doe",
            "email": "john.doe@example.com"
        },
        "tags": ["AI", "Healthcare", "Innovation"]
    },
    {
        "_id": "2",
        "title": "Data Science in Sports",
        "content": "Data science provides insights into player performance and strategic planning in sports.",
        "author": {
            "name": "Jane Smith",
            "email": "jane.smith@example.com"
        },
        "tags": ["Data Science", "Sports", "Analytics"]
    }
]
```
Example Code:

```python
loader = MongodbLoader(
    connection_string="mongodb://localhost:27017",
    db_name="example_db",
    collection_name="articles",
    filter_criteria={"tags": "AI"},
    field_names=["title", "content"],
    metadata_names=["author.name", "author.email"],
    include_db_collection_in_metadata=True
)

documents = loader.load()

for doc in documents:
    print("Page Content:", doc.page_content)
    print("Metadata:", doc.metadata)
```
Expected Output:

```
Page Content: Artificial Intelligence in Medicine AI is transforming the medical industry by providing personalized medicine solutions.
Metadata: {'author_name': 'John Doe', 'author_email': 'john.doe@example.com', 'database': 'example_db', 'collection': 'articles'}
```

Thank you.

---

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-09-17 10:23:17 -04:00
ccurme
6758894af1 docs: update v0.3 integrations table (#26571) 2024-09-17 09:56:04 -04:00
venkatram-dev
6ba3c715b7 doc_fix_chroma_integration (#26565)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
docs:integrations:vectorstores:chroma:fix_typo


- [x] **PR message**: ***Delete this entire checklist*** and replace
with


- **Description:** fix_typo in docs:integrations:vectorstores:chroma
https://python.langchain.com/docs/integrations/vectorstores/chroma/
    - **Issue:** https://github.com/langchain-ai/langchain/issues/26561

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-17 08:17:54 -04:00
Bagatur
d8952b8e8c langchain[patch]: infer mistral provider in init_chat_model (#26557) 2024-09-17 00:35:54 +00:00
Bagatur
31f61d4d7d docs: v0.3 nits (#26556) 2024-09-17 00:14:47 +00:00
Bagatur
99abd254fb docs: clean up init_chat_model (#26551) 2024-09-16 22:08:22 +00:00
Tomaz Bratanic
3bcd641bc1 Add check for prompt based approach in llm graph transformer (#26519) 2024-09-16 15:01:09 -07:00
Bagatur
0bd98c99b3 docs: add sema4 to release table (#26549) 2024-09-16 14:59:13 -07:00
Eugene Yurtsev
8a2f2fc30b docs: what langchain-cli migrate can do (#26547) 2024-09-16 20:10:40 +00:00
SQpgducray
724a53711b docs: Fix missing self argument in _get_docs_with_query method of `Cust… (#26312)
…omSelfQueryRetriever`

This commit corrects an issue in the `_get_docs_with_query` method of
the `CustomSelfQueryRetriever` class. The method was incorrectly using
`self.vectorstore.similarity_search_with_score(query, **search_kwargs)`
without including the `self` argument, which is required for proper
method invocation.

The `self` argument is necessary for calling instance methods and
accessing instance attributes. By including `self` in the method call,
we ensure that the method is correctly executed in the context of the
current instance, allowing it to function as intended.

No other changes were made to the method's logic or functionality.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-16 20:02:30 +00:00
Eugene Yurtsev
c6a78132d6 docs: show how to use langchain-cli for migration (#26535)
Update v0.3 instructions a bit

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-09-16 15:53:05 -04:00
Bagatur
a319a0ff1d docs: add redirects for tools and lcel (#26541) 2024-09-16 18:06:15 +00:00
Eugene Yurtsev
63c3cc1f1f ci: updates issue and discussion templates (#26542)
Update issue and discussion templates
2024-09-16 17:43:04 +00:00
ccurme
0154c586d3 docs: update integrations table in 0.3 guide (#26536) 2024-09-16 17:41:56 +00:00
Eugene Yurtsev
c2588b334f unstructured: release 0.1.4 (#26540)
Release to work with langchain 0.3
2024-09-16 17:38:38 +00:00
Eugene Yurtsev
8b985a42e9 milvus: 0.1.6 release (#26538)
Release to work with langchain 0.3
2024-09-16 13:33:09 -04:00
Eugene Yurtsev
5b4206acd8 box: 0.2.0 release (#26539)
Release to work with langchain 0.3
2024-09-16 13:32:59 -04:00
ccurme
0592c29e9b qdrant[patch]: release 0.1.4 (#26534)
`langchain-qdrant` imports pydantic but was importing pydantic proper
before 0.3 release:
042e84170b/libs/partners/qdrant/langchain_qdrant/sparse_embeddings.py (L5-L8)
2024-09-16 13:04:12 -04:00
Eugene Yurtsev
88891477eb langchain-cli: release 0.0.31 (#26533)
langchain-cli 0.0.31 release
2024-09-16 12:57:24 -04:00
ccurme
88bc15d69b standard-tests[patch]: add async test for structured output (#26527) 2024-09-16 11:15:23 -04:00
Erick Friis
1ab181f514 voyageai: release 0.1.2 (#26512) 2024-09-16 03:11:15 +00:00
Erick Friis
ee4e11379f nomic: release 0.1.3, core 0.3 compat but not required (#26511) 2024-09-15 20:10:25 -07:00
Yoshitaka Fujii
bd42344b0a docs: Update concepts.mdx (#26496)
- Fix comments in Python
- Fix repeated sentences
2024-09-16 01:46:15 +00:00
Erick Friis
9f5960a0aa docs: new algolia index (#26508) 2024-09-15 18:33:42 -07:00
Erick Friis
135afdf4fb docs: most 0.1 redirects too (#26494)
takes redirects from 0.1 docs and factors them into suggested redirects
in 0.3 docs
2024-09-15 18:29:58 +00:00
Erick Friis
4131be63af multiple: 0.3.0 not dev version (#26502) 2024-09-15 18:26:50 +00:00
Bhadresh Savani
f66b7ba32d Update google_search.ipynb (#26420)
Added changes for pip installation
2024-09-14 15:08:40 -07:00
jessicaou
9c6aa3f0b7 broken LangGraph docs link (#26438)
Update broken langgraph link in the README.md file

Co-authored-by: Jess Ou <jessou@jesss-mbp.local.meter>
2024-09-14 15:07:51 -07:00
Nicolas
2240ca2979 docs: Fix Firecrawl v0 version (#26452)
Firecrawl integration is currently on v0 - which is supported until
version 0.0.20.

@rafaelsideguide is working on a pr for v1 but meanwhile we should fix
the docs.
2024-09-14 15:06:15 -07:00
Eugene Yurtsev
77ccb4b1cf cli[patch]: Update the migration script message (#26490)
Update the migration script message
2024-09-14 14:40:35 -04:00
Bagatur
b47f4cfe51 mongodb[minor]: Release 0.2.0 (#26484) 2024-09-13 19:17:36 -07:00
Bagatur
779a008d4e docs: update v3 versions (#26483) 2024-09-14 02:16:54 +00:00
Bagatur
4e6620ecdd chroma[patch]: Release 0.1.4 (#26470) 2024-09-13 17:31:34 -07:00
Bagatur
543a80569c prompty[minor]: Release 0.1.0 (#26481) 2024-09-13 23:32:01 +00:00
ccurme
9c88037dbc huggingface[patch]: xfail test (#26479) 2024-09-13 23:16:06 +00:00
Bagatur
a2bfa41216 azure-dynamic-sessions[minor]: Release 0.2.0 (#26478) 2024-09-13 23:09:48 +00:00
ccurme
8abc7ff55a experimental: release 0.3 (#26477) 2024-09-13 23:07:35 +00:00
Bagatur
6abb23ca97 exa[minor]: Release 0.2.0 (#26476) 2024-09-13 23:04:10 +00:00
ccurme
900115a568 community: release 0.3 (#26472) 2024-09-13 22:55:56 +00:00
Bagatur
17b397ef93 pinecone[minor]: Release 0.2.0 (#26474) 2024-09-13 22:55:35 +00:00
Erick Friis
ca304ae046 robocorp: rm package (now langchain-sema4) (#26471) 2024-09-13 15:54:00 -07:00
Erick Friis
537f6924dc partners/ollama: release 0.2.0 (#26468) 2024-09-13 15:48:48 -07:00
Erick Friis
995dfc6b05 partners/fireworks: release 0.2.0 (#26467) 2024-09-13 22:48:16 +00:00
Erick Friis
832bc834b1 partners/anthropic: release 0.2.0 (#26469)
0.3.0 version was a mistake! not released - bumping version back to
0.2.0 here
2024-09-13 22:47:09 +00:00
Erick Friis
6997731729 partners/anthropic: release 0.3.0 (#26466) 2024-09-13 22:44:11 +00:00
Bagatur
64bfe1ff23 groq[minor]: Release 0.2.0 (#26465) 2024-09-13 22:43:11 +00:00
Erick Friis
58c7414e10 langchain: release 0.3.0 (#26462) 2024-09-13 22:40:37 +00:00
ccurme
125c9896a8 huggingface: release 0.1 (#26463) 2024-09-13 22:39:49 +00:00
Bagatur
f7ae12fa1f openai[minor]: Release 0.2.0 (#26464) 2024-09-13 15:38:10 -07:00
ccurme
d1462badaf text-splitters: release 0.3 (#26460)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-13 22:31:06 +00:00
ccurme
9b30bdceb6 mistralai: release 0.2 (#26458)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-13 18:27:51 -04:00
Bagatur
3125a89198 infra: fix min version (#26461) 2024-09-13 22:25:22 +00:00
Bagatur
44791ce131 infra: rm pydantic from min version test (#26459) 2024-09-13 15:22:28 -07:00
Bagatur
fa8e0d90de docs: update version docs (#26457) 2024-09-13 22:20:24 +00:00
Bagatur
222caaebdd infra: fix release (#26455) 2024-09-13 15:01:36 -07:00
Erick Friis
d46ab19954 core: release 0.3.0 (#26453) 2024-09-13 21:45:45 +00:00
Erick Friis
c2a3021bb0 multiple: pydantic 2 compatibility, v0.3 (#26443)
Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Dan O'Donovan <dan.odonovan@gmail.com>
Co-authored-by: Tom Daniel Grande <tomdgrande@gmail.com>
Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: ZhangShenao <15201440436@163.com>
Co-authored-by: Friso H. Kingma <fhkingma@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Morgante Pell <morgantep@google.com>
2024-09-13 14:38:45 -07:00
Bagatur
d9813bdbbc openai[patch]: Release 0.1.25 (#26439) 2024-09-13 12:00:01 -07:00
liuhetian
7fc9e99e21 openai[patch]: get output_type when using with_structured_output (#26307)
- This allows pydantic to correctly resolve annotations necessary when
using openai new param `json_schema`

Resolves issue: #26250

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-13 11:42:01 -07:00
Bagatur
0f2b32ffa9 core[patch]: Release 0.2.40 (#26435) 2024-09-13 09:57:09 -07:00
Bagatur
e32adad17a community[patch]: Release 0.2.17 (#26432) 2024-09-13 09:56:39 -07:00
langchain-infra
8a02fd9c01 core: add additional import mappings to loads (#26406)
Support using additional import mapping. This allows users to override
old mappings/add new imports to the loads function.

- [x ] **Add tests and docs**: If you're adding a new integration,
please include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-09-13 09:39:58 -07:00
Erick Friis
1d98937e8d partners/openai: release 0.1.24 (#26417) 2024-09-12 21:54:13 -07:00
Harrison Chase
28ad244e77 community, openai: support nested dicts (#26414)
needed for thinking tokens

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-12 21:47:47 -07:00
Erick Friis
c0dd293f10 partners/groq: release 0.1.10 (#26393) 2024-09-12 17:41:11 +00:00
Erick Friis
54c85087e2 groq: add back streaming tool calls (#26391)
api no longer throws an error

https://console.groq.com/docs/tool-use#streaming
2024-09-12 10:29:45 -07:00
jessicaou
396c0aee4d docs: Adding LC Academy links (#26164)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Jess Ou <jessou@jesss-mbp.local.meter>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-11 23:37:17 +00:00
Bagatur
feb351737c core[patch]: fix empty OpenAI tools when strict=True (#26287)
Fix #26232

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-09-11 16:06:03 -07:00
William FH
d87feb1b04 [Docs] Correct the admonition explaining min langchain-anthropic version in doc (#26359)
0.1.15 instead of just 0.1.5
2024-09-11 23:03:42 +00:00
ccurme
398718e1cb core[patch]: fix regression in convert_to_openai_tool with instances of Tool (#26327)
```python
from langchain_core.tools import Tool
from langchain_core.utils.function_calling import convert_to_openai_tool

def my_function(x: int) -> int:
    return x + 2

tool = Tool(
    name="tool_name",
    func=my_function,
    description="test description",
)
convert_to_openai_tool(tool)
```

Current:
```
{'type': 'function',
 'function': {'name': 'tool_name',
  'description': 'test description',
  'parameters': {'type': 'object',
   'properties': {'args': {'type': 'array', 'items': {}},
    'config': {'type': 'object',
     'properties': {'tags': {'type': 'array', 'items': {'type': 'string'}},
      'metadata': {'type': 'object'},
      'callbacks': {'anyOf': [{'type': 'array', 'items': {}}, {}]},
      'run_name': {'type': 'string'},
      'max_concurrency': {'type': 'integer'},
      'recursion_limit': {'type': 'integer'},
      'configurable': {'type': 'object'},
      'run_id': {'type': 'string', 'format': 'uuid'}}},
    'kwargs': {'type': 'object'}},
   'required': ['config']}}}
```

Here:
```
{'type': 'function',
 'function': {'name': 'tool_name',
  'description': 'test description',
  'parameters': {'properties': {'__arg1': {'title': '__arg1',
     'type': 'string'}},
   'required': ['__arg1'],
   'type': 'object'}}}
```
2024-09-11 15:51:10 -04:00
이규민
7feae62ad7 core[patch]: Support non ASCII characters in tool output if user doesn't output string (#26319)
### simple modify
core: add supporting non english character

target issue is #26315 
same issue on langgraph -
https://github.com/langchain-ai/langgraph/issues/1504
2024-09-11 15:21:00 +00:00
William FH
b993172702 Keyword-like runnable config (#26295) 2024-09-11 07:44:47 -07:00
Bagatur
17659ca2cd core[patch]: Release 0.2.39 (#26279) 2024-09-10 20:11:27 +00:00
Nuno Campos
212c688ee0 core[minor]: Remove serialized manifest from tracing requests for non-llm runs (#26270)
- This takes a long time to compute, isn't used, and currently called on
every invocation of every chain/retriever/etc
2024-09-10 12:58:24 -07:00
ccurme
979232257b huggingface[patch]: add integration tests for embeddings (#26272) 2024-09-10 14:57:16 -04:00
ccurme
4ffd27c4d0 huggingface[patch]: add integration tests (#26269)
Add standard tests for ChatHuggingFace. About half of these fail
currently.
2024-09-10 18:31:51 +00:00
Emad Rad
16d41eab1e docs: typos fixed (#26234)
While going through the chatbot tutorial, I noticed a couple of typos
and grammatical issues. Also, the pip install command for
langchain_community was commented out, but the document mentions
installing it.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-09-10 00:52:20 +00:00
venkatram-dev
fa229d6c02 docs: fix_typo_llm_chain_tutorial (#26229)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
docs:tutorials:llm_chain:fix typo



- [ ] **PR message**: 
fix typo in llm chain tutorial

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-10 00:39:29 +00:00
Christophe Bornet
9cf7ae0a52 community: Add docstring for HtmlLinkExtractor (#26213)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-10 00:27:37 +00:00
Christophe Bornet
56580b5fff community: Add docstring for GLiNERLinkExtractor (#26218)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-10 00:27:23 +00:00
Christophe Bornet
e235a572a0 community: Add docstring for KeybertLinkExtractor (#26210)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-10 00:26:29 +00:00
Vadym Barda
bab9de581c core[patch]: wrap mermaid node names w/ markdown in <p> tag (#26235)
This fixes the issue where `__start__` and `__end__` node labels are
being interpreted as markdown, as of the most recent Mermaid update
2024-09-09 20:11:00 -04:00
miri-bar
3e48c728d5 docs: add ai21 tool calling example (#26199)
Add tool calling example to AI21 docs
2024-09-09 09:34:54 -07:00
Geoffrey HARRAZI
76bce42629 docs: Update Google BigQuery Vector Search with new SQL filter feature introduce in langchain-google-community 1.0.9 (#26184)
Hello,

fix: https://github.com/langchain-ai/langchain/issues/26183

Adding documentation regarding SQL like filter for Google BigQuery
Vector Search coming in next langchain-google-community 1.0.9 release.
Note: langchain-google-community==1.0.9 is not yet released

Question: There is no way to warn the user int the doc about the
availability of a feature after a specific package version ?

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:58:28 +00:00
Matt Hull
bca51ca164 docs: Update func doc strings in tools_human (#26149)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Fix docstring for two functions that look like have
docstrings carried over from other functions.
    - **Issue:** Not found issue reporting the miss-leading docstrings.
    - **Dependencies:** None
    - **Twitter handle:** 


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:48:24 +00:00
Qasim Khan
fa17b145bb docs: fix typo in graph_constructing tutorial (#26134)
Changed 

> "At a high-level, the steps of constructing a knowledge are from text
are:"

to 

> "At a high-level, the steps of constructing a knowledge graph from
text are:"

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:46:49 +00:00
Tomaz Bratanic
181e4fc0e0 Add session expired retry to neo4j graph (#26182)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 11:40:43 -07:00
Sebastian Cherny
b3c7ed4913 Adding bind_tools in ChatOctoAI (#26168)
The object extends from
langchain_community.chat_models.openai.ChatOpenAI which doesn't have
`bind_tools` defined. I tried extending from
`langchain_openai.ChatOpenAI` in
https://github.com/langchain-ai/langchain/pull/25975 but that PR got
closed because this is not correct.
So adding our own `bind_tools` (which for now copying from ChatOpenAI is
good enough) will solve the tool calling issue we are having now.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:38:43 +00:00
Malik Ashar Khan
042e84170b Fix typo in ScrapflyLoader documentation (#26117)
This PR fixes a minor typo in the ScrapflyLoader documentation. The word
"passigng" was changed to "passing."

Before: passigng
After: passing

This change improves the clarity and professionalism of the
documentation.

Co-authored-by: Ashar <asharmalik.ds193@gmail.com>
2024-09-08 18:33:01 +00:00
John
97a8e365ec partners/unstructured: update unstructured client version (#26105)
Users are having version conflicts with `unstructured-client` as
described here:

https://unstructuredw-kbe4326.slack.com/archives/C06JJHC9G4U/p1725557970546199?thread_ts=1725035247.162819&cid=C06JJHC9G4U

This PR fixes that issue and should update the version to "0.1.3" as
well for a clean-slate version for users to install

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 18:32:34 +00:00
Vadym Barda
1b3bd52e0e core[patch]: fix edge labels for mermaid graphs (#26201) 2024-09-08 14:35:25 +00:00
Marcelo Machado
9bd4f1dfa8 docs: small improvement ChatOllama setup description (#26043)
Small improvement on ChatOllama description

---------

Co-authored-by: Marcelo Machado <mmachado@ibm.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 00:15:05 +00:00
Leonid Ganeline
2f80d67dc1 docs: integrations reference updates 16 (#26059)
Added missed provider pages and links. Fixed inconsistent formatting.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 00:13:53 +00:00
Ikko Eltociear Ashimine
ffdc370200 docs: update agent_executor.ipynb (#26035)
initalize -> initialize

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-08 00:07:26 +00:00
Leonid Ganeline
5052e87d7c docs: integrations reference updates 15 (#25994)
Added missed provider pages and links. Fixed inconsistent formatting.
2024-09-07 16:51:47 -07:00
Erick Friis
6e82d2184b partners/mongodb: release 0.1.9 (#26193) 2024-09-07 23:20:25 +00:00
William FH
262e19b15d infra: Clear cache for env-var checks (#26073) 2024-09-06 21:29:29 +00:00
Brace Sproul
854f37be87 docs[minor]: Add state of agents survey to docs announcement bar (#26167) 2024-09-06 14:28:08 -07:00
ChengZi
a03141ac51 partners[milvus]: fix integration test issues (#26136)
fix some integration test issues:
https://github.com/langchain-ai/langchain/actions/runs/10688447230/job/29628412258

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-06 16:52:36 +00:00
Erick Friis
5c1ebd3086 partners/unstructured: release 0.1.3 (#26119) 2024-09-06 16:22:53 +00:00
Bagatur
de97d50644 core,standard-tests[patch]: add Ser/Des test and update serialization mapping (#26042) 2024-09-04 11:58:36 -07:00
Bagatur
1241a004cb fmt 2024-09-04 11:44:59 -07:00
Bagatur
4ba14ae9e5 fmt 2024-09-04 11:34:59 -07:00
Bagatur
dba308447d fmt 2024-09-04 11:28:04 -07:00
Bagatur
fdf6fbde18 fmt 2024-09-04 11:12:11 -07:00
Bagatur
576574c82c fmt 2024-09-04 11:05:36 -07:00
Bagatur
7bf54636ff make 2024-09-04 10:24:42 -07:00
Bagatur
3ec93c2817 standard-tests[patch]: add Ser/Des test 2024-09-04 10:24:06 -07:00
Friso H. Kingma
af11fbfbf6 langchain_openai: Make sure the response from the async client in the astream method of ChatOpenAI is properly awaited in case of "include_response_headers=True" (#26031)
- **Description:** This is a **one line change**. the
`self.async_client.with_raw_response.create(**payload)` call is not
properly awaited within the `_astream` method. In `_agenerate` this is
done already, but likely forgotten in the other method.
  - **Issue:** Not applicable
  - **Dependencies:** No dependencies required.

(If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-04 13:26:48 +00:00
ZhangShenao
c812237217 Improvement[Community] Improve args description in api doc of DocArrayInMemorySearch (#26024)
- Add missing arg
- Remove redundant arg
2024-09-04 09:26:26 -04:00
Tomaz Bratanic
c649b449d7 Add the option to ignore structured output method to LLM graph transf… (#26013)
Open source models like Llama3.1 have function calling, but it's not
that great. Therefore, we introduce the option to ignore model's
function calling and just use the prompt-based approach
2024-09-04 09:15:43 -04:00
Bagatur
34fc00aff1 openai[patch]: add back azure embeddings api_version alias (#26003) 2024-09-03 17:27:10 -07:00
Bagatur
4b99426a4f openai[patch]: add back azure embeddings api_version alias 2024-09-03 17:25:03 -07:00
Eugene Yurtsev
bc3b851f08 openai[patch]: Upgrade @root_validators in preparation for pydantic 2 migration (#25491)
* Upgrade @root_validator in openai pkg
* Ran notebooks for all but AzureAI embeddings

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-03 14:42:24 -07:00
Tom Daniel Grande
0207dc1431 community: delta in openai choice can be None, creates handler for that (#25954)
Thank you for contributing to LangChain!

- [X ] **PR title**

- [X ] **PR message**: 

     **Description:** adds a handler for when delta choice is None

     **Issue:** Fixes #25951
     **Dependencies:** Not applicable


- [ X] **Add tests and docs**: Not applicable

- [X ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Grande <Tom.Daniel.Grande@statsbygg.no>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-09-03 20:30:03 +00:00
Bagatur
9eb9ff52c0 experimental[patch]: Release 0.0.65 (#25987) 2024-09-03 19:15:48 +00:00
Bagatur
bc3b02651c standard-tests[patch]: test init from env vars (#25983) 2024-09-03 19:05:39 +00:00
Bagatur
ac922105ad infra: rm ai21 from CI (#25984) 2024-09-03 11:47:27 -07:00
Bagatur
0af447c90b community[patch]: Release 0.2.16 (#25982) 2024-09-03 18:34:18 +00:00
Dan O'Donovan
f49da71e87 community[patch]: change default Neo4j username/password (#25226)
**Description:**

Change the default Neo4j username/password (when not supplied as
environment variable or in code) from `None` to `""`.

Neo4j has an option to [disable
auth](https://neo4j.com/docs/operations-manual/current/configuration/configuration-settings/#config_dbms.security.auth_enabled)
which is helpful when developing. When auth is disabled, the username /
password through the `neo4j` module should be `""` (ie an empty string).

Empty strings get marked as false in
`langchain_core.utils.env.get_from_dict_or_env` -- changing this code /
behaviour would have a wide impact and is undesirable.

In order to both _allow_ access to Neo4j with auth disabled and _not_
impact `langchain_core` this patch is presented. The downside would be
that if a user forgets to set NEO4J_USERNAME or NEO4J_PASSWORD they
would see an invalid credentials error rather than missing credentials
error. This could be mitigated but would result in a less elegant patch!

**Issue:**
Fix issue where langchain cannot communicate with Neo4j if Neo4j auth is
disabled.
2024-09-03 11:24:18 -07:00
Bagatur
035d8cf51b milvus[patch]: Release 0.1.5 (#25981) 2024-09-03 18:19:51 +00:00
Bagatur
1dfc8c01af langchain[patch]: Release 0.2.16 (#25977) 2024-09-03 18:10:21 +00:00
Bagatur
fb642e1e27 text-splitters[patch]: Release 0.2.4 (#25979) 2024-09-03 18:09:43 +00:00
Bagatur
7457949619 mistralai[patch]: Release 0.1.13 (#25978) 2024-09-03 18:03:15 +00:00
Bagatur
0c69c9fb3f core[patch]: Release 0.2.38 (#25974) 2024-09-03 17:31:41 +00:00
Eugene Yurtsev
fa8402ea09 core[minor]: Add support for multiple env keys for secrets_from_env (#25971)
- Add support to look up secret using more than one env variable
- Add overload to help mypy

Needed for https://github.com/langchain-ai/langchain/pull/25491
2024-09-03 11:39:54 -04:00
Maximilian Schulz
fdeaff4149 langchain-mistralai - make base URL possible to set via env variable for ChatMistralAI (#25956)
Thank you for contributing to LangChain!


**Description:** 

Similar to other packages (`langchain_openai`, `langchain_anthropic`) it
would be beneficial if that `ChatMistralAI` model could fetch the API
base URL from the environment.

This PR allows this via the following order:
- provided value
- then whatever `MISTRAL_API_URL` is set to
- then whatever `MISTRAL_BASE_URL` is set to
- if `None`, then default is ` "https://api.mistral.com/v1"`


- [x] **Add tests and docs**:

Added unit tests, docs I feel are unnecessary, as this is just aligning
with other packages that do the same?


- [x] **Lint and test**: 

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-03 14:32:35 +00:00
Jorge Piedrahita Ortiz
c7154a4045 community: sambastudio llms api v2 support (#25063)
- **Description:** SambaStudio GenericV2 API support
2024-09-03 10:18:15 -04:00
ZhangShenao
8d784db107 docs: Add missing args in api doc of WebResearchRetriever (#25949)
Add missing args in api doc of `WebResearchRetriever`
2024-09-03 01:24:23 -07:00
Bagatur
da113f6363 docs: ChatOpenAI.with_structured_output nits (#25952) 2024-09-03 08:20:58 +00:00
Bagatur
5b99bb2437 docs: fix bullet list spacing (#25950)
Fix #25935
2024-09-03 08:12:58 +00:00
Yuki Watanabe
ef329f6819 docs: Fix databricks doc (#25941)
https://github.com/langchain-ai/langchain/pull/25929 broke the layout
because of missing `:::` for the caution clause.

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
2024-09-02 18:17:47 -07:00
Bagatur
f872c50b3f docs: installation nits (#24484) 2024-09-03 01:05:08 +00:00
Isaac Francisco
4833375200 community[patch]: added option to change how duckduckgosearchresults tool converts api outputs into string (#22580)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-02 22:42:19 +00:00
JonZeolla
78ff51ce83 community[patch]: update the default hf bge embeddings (#22627)
**Description:** This updates the langchain_community > huggingface >
default bge embeddings ([the current default recommends this
change](https://huggingface.co/BAAI/bge-large-en))
**Issue:** None
**Dependencies:** None
**Twitter handle:** @jonzeolla

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-02 22:10:21 +00:00
Leonid Ganeline
150251fd49 docs: integrations reference updates 13 (#25711)
Added missed provider pages and links. Fixed inconsistent formatting.
Added arxiv references to docstirngs.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-09-02 22:08:50 +00:00
Yuki Watanabe
64dfdaa924 docs: Add Databricks integration (#25929)
Updating the gateway pages in the documentation to name the
`langchain-databricks` integration.

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-02 22:05:40 +00:00
Bagatur
933bc0d6ff core[patch]: support additional kwargs on StructuredPrompt (#25645) 2024-09-02 14:55:26 -07:00
Yash Parmar
51dae57357 community[minor]: jina search tools integrating (jina reader) (#23339)
- **PR title**: "community: add Jina Search tool"
- **Description:** Added the Jina Search tool for querying the Jina
search API. This includes the implementation of the JinaSearchAPIWrapper
and the JinaSearch tool, along with a Jupyter notebook example
demonstrating its usage.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** [Twitter
handle](https://x.com/yashp3020?t=7wM0gQ7XjGciFoh9xaBtqA&s=09)


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-02 14:52:14 -07:00
Matthew DeGenaro
66828f4ecc text-splitters[patch]: Modified SpacyTextSplitter to fully keep whitespace when strip_whitespace is false (#23272)
Previously, regardless of whether or not strip_whitespace was set to
true or false, the strip text method in the SpacyTextSplitter class used
`sent.text` to get the sentence. I modified this to include a ternary
such that if strip_whitespace is false, it uses `sent.text_with_ws`
I also modified the project.toml to include the spacy pipeline package
and to lock the numpy version, as higher versions break spacy.

- **Issue:** N/a
- **Dependencies:** None
2024-09-02 21:15:56 +00:00
Qingchuan Hao
3145995ed9 community[patch]: BingSearchResults returns raw snippets as artifact(#23304)
Returns an array of results which is more specific and easier for later
use.

Tested locally:
```
resp = tool.invoke("what's the weather like in Shanghai?")
for item in resp:
    print(item)
```
returns
```
{'snippet': '<b>Shanghai</b>, <b>Shanghai</b>, China <b>Weather</b> Forecast, with current conditions, wind, air quality, and what to expect for the next 3 days.', 'title': 'Shanghai, Shanghai, China Weather Forecast | AccuWeather', 'link': 'https://www.accuweather.com/en/cn/shanghai/106577/weather-forecast/106577'}
{'snippet': '5. 99 / 87 °F. 6. 99 / 86 °F. 7. Detailed forecast for 14 days. Need some help? Current <b>weather</b> <b>in Shanghai</b> and forecast for today, tomorrow, and next 14 days.', 'title': 'Weather for Shanghai, Shanghai Municipality, China - timeanddate.com', 'link': 'https://www.timeanddate.com/weather/china/shanghai'}
{'snippet': '<b>Shanghai</b> - <b>Weather</b> warnings issued 14-day forecast. <b>Weather</b> warnings issued. Forecast - <b>Shanghai</b>. Day by day forecast. Last updated Friday at 01:05. Tonight, ... Temperature feels <b>like</b> 34 ...', 'title': 'Shanghai - BBC Weather', 'link': 'https://www.bbc.com/weather/1796236'}
{'snippet': 'Current <b>weather</b> <b>in Shanghai</b>, <b>Shanghai</b>, China. Check current conditions <b>in Shanghai</b>, <b>Shanghai</b>, China with radar, hourly, and more.', 'title': 'Shanghai, Shanghai, China Current Weather | AccuWeather', 'link': 'https://www.accuweather.com/en/cn/shanghai/106577/current-weather/106577'}
13-Day Beijing, Xi&#39;an, Chengdu, <b>Shanghai</b> Chinese Language and Culture Immersion Tour. <b>Shanghai</b> in September. Average daily temperature range: 23–29°C (73–84°F) Average rainy days: 10. Average sunny days: 20. September ushers in pleasant autumn <b>weather</b>, making it one of the best months to visit <b>Shanghai</b>. <b>Weather</b> in <b>Shanghai</b>: Climate, Seasons, and Average Monthly Temperature. <b>Shanghai</b> has a subtropical maritime monsoon climate, meaning high humidity and lots of rain. Hot muggy summers, cool falls, cold winters with little snow, and warm springs are the norm. Midsummer through early fall is the best time to visit <b>Shanghai</b>. <b>Shanghai</b>, <b>Shanghai</b>, China <b>Weather</b> Forecast, with current conditions, wind, air quality, and what to expect for the next 3 days. 1165. 45.9. 121. Winter, from December to February, is quite cold: the average January temperature is 5 °C (41 °F). There may be cold periods, with highs around 5 °C (41 °F) or below, and occasionally, even snow can fall. The temperature dropped to -10 °C (14 °F) in January 1977 and to -7 °C (19.5 °F) in January 2016. 5. 99 / 87 °F. 6. 99 / 86 °F. 7. Detailed forecast for 14 days. Need some help? Current <b>weather</b> in <b>Shanghai</b> and forecast for today, tomorrow, and next 14 days. Everything you need to know about today&#39;s <b>weather</b> in <b>Shanghai</b>, <b>Shanghai</b>, China. High/Low, Precipitation Chances, Sunrise/Sunset, and today&#39;s Temperature History. <b>Shanghai</b> - <b>Weather</b> warnings issued 14-day forecast. <b>Weather</b> warnings issued. Forecast - <b>Shanghai</b>. Day by day forecast. Last updated Friday at 01:05. Tonight, ... Temperature feels <b>like</b> 34 ... <b>Shanghai</b> 14 Day Extended Forecast. <b>Weather</b> Today <b>Weather</b> Hourly 14 Day Forecast Yesterday/Past <b>Weather</b> Climate (Averages) Currently: 84 °F. Passing clouds. (<b>Weather</b> station: <b>Shanghai</b> Hongqiao Airport, China). See more current <b>weather</b>. Current <b>weather</b> in <b>Shanghai</b>, <b>Shanghai</b>, China. Check current conditions in <b>Shanghai</b>, <b>Shanghai</b>, China with radar, hourly, and more. <b>Shanghai</b> <b>Weather</b> Forecasts. <b>Weather Underground</b> provides local &amp; long-range <b>weather</b> forecasts, weatherreports, maps &amp; tropical <b>weather</b> conditions for the <b>Shanghai</b> area.
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-09-02 21:11:32 +00:00
venkatram-dev
a09e2afee4 typo_summarization_tutorial (#25938)
Thank you for contributing to LangChain!

- [ ] **PR title**:
docs: fix typo in summarization_tutorial


- [ ] **PR message**: 
docs: fix couple of typos in summarization_tutorial

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-02 13:44:11 -07:00
Alexander KIRILOV
6a8f8a56ac community[patch]: added content_columns option to CSVLoader (#23809)
**Description:** 
Adding a new option to the CSVLoader that allows us to implicitly
specify the columns that are used for generating the Document content.
Currently these are implicitly set as "all fields not part of the
metadata_columns".

In some cases however it is useful to have a field both as a metadata
and as part of the document content.
2024-09-02 20:25:53 +00:00
Bruno Alvisio
ab527027ac community: Resolve refs recursively when generating openai_fn from OpenAPI spec (#19002)
- **Description:** This PR is intended to improve the generation of
payloads for OpenAI functions when converting from an OpenAPI spec file.
The solution is to recursively resolve `$refs`.
Currently when converting OpenAPI specs into OpenAI functions using
`openapi_spec_to_openai_fn`, if the schemas have nested references, the
generated functions contain `$ref` that causes the LLM to generate
payloads with an incorrect schema.

For example, for the for OpenAPI spec:

```
text = """
{
  "openapi": "3.0.3",
  "info": {
    "title": "Swagger Petstore - OpenAPI 3.0",
    "termsOfService": "http://swagger.io/terms/",
    "contact": {
      "email": "apiteam@swagger.io"
    },
    "license": {
      "name": "Apache 2.0",
      "url": "http://www.apache.org/licenses/LICENSE-2.0.html"
    },
    "version": "1.0.11"
  },
  "externalDocs": {
    "description": "Find out more about Swagger",
    "url": "http://swagger.io"
  },
  "servers": [
    {
      "url": "https://petstore3.swagger.io/api/v3"
    }
  ],
  "tags": [
    {
      "name": "pet",
      "description": "Everything about your Pets",
      "externalDocs": {
        "description": "Find out more",
        "url": "http://swagger.io"
      }
    },
    {
      "name": "store",
      "description": "Access to Petstore orders",
      "externalDocs": {
        "description": "Find out more about our store",
        "url": "http://swagger.io"
      }
    },
    {
      "name": "user",
      "description": "Operations about user"
    }
  ],
  "paths": {
    "/pet": {
      "post": {
        "tags": [
          "pet"
        ],
        "summary": "Add a new pet to the store",
        "description": "Add a new pet to the store",
        "operationId": "addPet",
        "requestBody": {
          "description": "Create a new pet in the store",
          "content": {
            "application/json": {
              "schema": {
                "$ref": "#/components/schemas/Pet"
              }
            }
          },
          "required": true
        },
        "responses": {
          "200": {
            "description": "Successful operation",
            "content": {
              "application/json": {
                "schema": {
                  "$ref": "#/components/schemas/Pet"
                }
              }
            }
          }
        }
      }
    }
  },
  "components": {
    "schemas": {
      "Tag": {
        "type": "object",
        "properties": {
          "id": {
            "type": "integer",
            "format": "int64"
          },
          "model_type": {
            "type": "number"
          }
        }
      },
      "Category": {
        "type": "object",
        "required": [
          "model",
          "year",
          "age"
        ],
        "properties": {
          "year": {
            "type": "integer",
            "format": "int64",
            "example": 1
          },
          "model": {
            "type": "string",
            "example": "Ford"
          },
          "age": {
            "type": "integer",
            "example": 42
          }
        }
      },
      "Pet": {
        "required": [
          "name"
        ],
        "type": "object",
        "properties": {
          "id": {
            "type": "integer",
            "format": "int64",
            "example": 10
          },
          "name": {
            "type": "string",
            "example": "doggie"
          },
          "category": {
            "$ref": "#/components/schemas/Category"
          },
          "tags": {
            "type": "array",
            "items": {
              "$ref": "#/components/schemas/Tag"
            }
          },
          "status": {
            "type": "string",
            "description": "pet status in the store",
            "enum": [
              "available",
              "pending",
              "sold"
            ]
          }
        }
      }
    }
  }
}
"""
```

Executing:
```
spec = OpenAPISpec.from_text(text)
pet_openai_functions, pet_callables = openapi_spec_to_openai_fn(spec)
response = model.invoke("Create a pet named Scott", functions=pet_openai_functions)
```

`pet_open_functions` contains unresolved `$refs`:

```
[
  {
    "name": "addPet",
    "description": "Add a new pet to the store",
    "parameters": {
      "type": "object",
      "properties": {
        "json": {
          "properties": {
            "id": {
              "type": "integer",
              "schema_format": "int64",
              "example": 10
            },
            "name": {
              "type": "string",
              "example": "doggie"
            },
            "category": {
              "ref": "#/components/schemas/Category"
            },
            "tags": {
              "items": {
                "ref": "#/components/schemas/Tag"
              },
              "type": "array"
            },
            "status": {
              "type": "string",
              "enum": [
                "available",
                "pending",
                "sold"
              ],
              "description": "pet status in the store"
            }
          },
          "type": "object",
          "required": [
            "name",
            "photoUrls"
          ]
        }
      }
    }
  }
]
```

and the generated JSON has an incorrect schema (e.g. category is filled
with `id` and `name` instead of `model`, `year` and `age`:

```
{
  "id": 1,
  "name": "Scott",
  "category": {
    "id": 1,
    "name": "Dogs"
  },
  "tags": [
    {
      "id": 1,
      "name": "tag1"
    }
  ],
  "status": "available"
}
```

With this change, the generated JSON by the LLM becomes,
`pet_openai_functions` becomes:

```
[
  {
    "name": "addPet",
    "description": "Add a new pet to the store",
    "parameters": {
      "type": "object",
      "properties": {
        "json": {
          "properties": {
            "id": {
              "type": "integer",
              "schema_format": "int64",
              "example": 10
            },
            "name": {
              "type": "string",
              "example": "doggie"
            },
            "category": {
              "properties": {
                "year": {
                  "type": "integer",
                  "schema_format": "int64",
                  "example": 1
                },
                "model": {
                  "type": "string",
                  "example": "Ford"
                },
                "age": {
                  "type": "integer",
                  "example": 42
                }
              },
              "type": "object",
              "required": [
                "model",
                "year",
                "age"
              ]
            },
            "tags": {
              "items": {
                "properties": {
                  "id": {
                    "type": "integer",
                    "schema_format": "int64"
                  },
                  "model_type": {
                    "type": "number"
                  }
                },
                "type": "object"
              },
              "type": "array"
            },
            "status": {
              "type": "string",
              "enum": [
                "available",
                "pending",
                "sold"
              ],
              "description": "pet status in the store"
            }
          },
          "type": "object",
          "required": [
            "name"
          ]
        }
      }
    }
  }
]
```

and the JSON generated by the LLM is:
```
{
  "id": 1,
  "name": "Scott",
  "category": {
    "year": 2022,
    "model": "Dog",
    "age": 42
  },
  "tags": [
    {
      "id": 1,
      "model_type": 1
    }
  ],
  "status": "available"
}
```

which has the intended schema.

    - **Twitter handle:**: @brunoalvisio

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-09-02 13:17:39 -07:00
Nuno Campos
464dae8ac2 core: Include global variables in variables found by get_function_nonlocals (#25936)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-09-02 11:49:25 -07:00
Luiz F. G. dos Santos
36bbdc776e community: fix bug to support for file_search tool from OpenAI (#25927)
- **Description:** The function `_is_assistants_builtin_tool` didn't had
support for `file_search` from OpenAI. This was creating conflict and
blocking the usage of such. OpenAI Assistant changed from`retrieval` to
`file_search`.
  
  The following code
  
  ```
              agent = OpenAIAssistantV2Runnable.create_assistant(
                name="Data Analysis Assistant",
                instructions=prompt[0].content,
                tools={'type': 'file_search'},
                model=self.chat_config.connection.deployment_name,
                client=llm,
                as_agent=True,
                tool_resources={
                    "file_search": {
                        "vector_store_ids": vector_store_id
                        }
                    }
                )
```

Was throwing the following error

```
Traceback (most recent call last):
File
"/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/chat_decorators.py",
line 500, in get_response
    return await super().get_response(post, context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/chat_decorators.py",
line 96, in get_response
    response = await self.inner_chat.get_response(post, context)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/chat_decorators.py",
line 96, in get_response
    response = await self.inner_chat.get_response(post, context)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/chat_decorators.py",
line 96, in get_response
    response = await self.inner_chat.get_response(post, context)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  [Previous line repeated 4 more times]
File
"/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/chat/azure_open_ai_chat.py",
line 147, in get_response
chain = chain_factory.get_chain(prompts, post.conversation.id,
overrides, context)

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/Users/l.guedesdossantos/Documents/codes/shellai-nlp-backend/app/llm_connections/chains.py",
line 1324, in get_chain
    agent = OpenAIAssistantV2Runnable.create_assistant(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_community/agents/openai_assistant/base.py",
line 256, in create_assistant
tools=[_get_assistants_tool(tool) for tool in tools], # type: ignore
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_community/agents/openai_assistant/base.py",
line 256, in <listcomp>
tools=[_get_assistants_tool(tool) for tool in tools], # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_community/agents/openai_assistant/base.py",
line 119, in _get_assistants_tool
    return convert_to_openai_tool(tool)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_core/utils/function_calling.py",
line 255, in convert_to_openai_tool
    function = convert_to_openai_function(tool)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File
"/Users/l.guedesdossantos/anaconda3/envs/shell-e/lib/python3.11/site-packages/langchain_core/utils/function_calling.py",
line 230, in convert_to_openai_function
    raise ValueError(
ValueError: Unsupported function

{'type': 'file_search'}

Functions must be passed in as Dict, pydantic.BaseModel, or Callable. If
they're a dict they must either be in OpenAI function format or valid
JSON schema with top-level 'title' and 'description' keys.
```

With the proposed changes, this is fixed and the function will have support for `file_search`.
  This was the only place missing the support for `file_search`.
  
  Reference doc
  https://platform.openai.com/docs/assistants/tools/file-search
  
  
  - **Twitter handle:** luizf0992

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-09-02 18:21:51 +00:00
Jacob Lee
f49cce739b 👥 Update LangChain people data (#25917)
👥 Update LangChain people data

Co-authored-by: github-actions <github-actions@github.com>
2024-09-02 11:14:35 -07:00
Leonid Ganeline
96b99a5022 docs: integrations google missed references (#25923)
Added missed integration links. Fixed inconsistent formatting.
2024-09-02 11:14:18 -07:00
Leonid Ganeline
086556d466 docs: integrations reference updates 14 (#25928)
Added missed provider pages and links. Fixed inconsistent formatting.
2024-09-02 11:07:45 -07:00
Tyler Wray
1ff8c36aa6 docs: fix pgvector link (#25930)
- **Description:** pg_vector link is 404'ing. This fixes it.
2024-09-02 18:03:19 +00:00
xander-art
6cd452d985 Feature/update hunyuan (#25779)
Description: 
    - Add system templates and user templates in integration testing
    - initialize the response id field value to request_id
    - Adjust the default model to hunyuan-pro
    - Remove the default values of Temperature and TopP
    - Add SystemMessage

all the integration tests have passed.
1、Execute integration tests for the first time
<img width="1359" alt="71ca77a2-e9be-4af6-acdc-4d665002bd9b"
src="https://github.com/user-attachments/assets/9298dc3a-aa26-4bfa-968b-c011a4e699c9">

2、Run the integration test a second time
<img width="1501" alt="image"
src="https://github.com/user-attachments/assets/61335416-4a67-4840-bb89-090ba668e237">

Issue: None
Dependencies: None
Twitter handle: None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-09-02 12:55:08 +00:00
Yuwen Hu
566e9ba164 community: add Intel GPU support to ipex-llm llm integration (#22458)
**Description:** [IPEX-LLM](https://github.com/intel-analytics/ipex-llm)
is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local
PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low
latency. This PR adds Intel GPU support to `ipex-llm` llm integration.
**Dependencies:** `ipex-llm`
**Contribution maintainer**: @ivy-lv11 @Oscilloscope98
**tests and docs**: 
- Add: langchain/docs/docs/integrations/llms/ipex_llm_gpu.ipynb
- Update: langchain/docs/docs/integrations/llms/ipex_llm_gpu.ipynb
- Update: langchain/libs/community/tests/llms/test_ipex_llm.py

---------

Co-authored-by: ivy-lv11 <zhicunlv@gmail.com>
2024-09-02 08:49:08 -04:00
Bagatur
d19e074374 core[patch]: handle serializable fields that cant be converted to bool (#25903) 2024-09-01 16:44:33 -07:00
Kirushikesh DB
7f857a02d5 docs: HuggingFace pipeline returns the prompt if return_full_text is not set (#25916)
Thank you for contributing to LangChain!

**Description:**
The current documentation of using the Huggingface with Langchain needs
to set return_full_text as False otherwise pipeline by default returns
both the prompt and response as output.


Code to reproduce:
```python
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline
from langchain_core.messages import (
    HumanMessage,
    SystemMessage,
)

llm = HuggingFacePipeline.from_model_id(
    model_id="microsoft/Phi-3.5-mini-instruct",
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.03,
        # return_full_text=False
    ),
    device=0
)

chat_model = ChatHuggingFace(llm=llm)

messages = [
    SystemMessage(content="You're a helpful assistant"),
    HumanMessage(
        content="What happens when an unstoppable force meets an immovable object?"
    ),
]

ai_msg = chat_model.invoke(messages)
print(ai_msg.content)
```
Output:
```
<|system|>
You're a helpful assistant<|end|>
<|user|>
What happens when an unstoppable force meets an immovable object?<|end|>
<|assistant|>
 The scenario of an "unstoppable force" meeting an "immovable object" is a classic paradox that has puzzled philosophers, scientists, and thinkers for centuries. In physics, however, there are no such things as truly unstoppable forces or immovable objects because all physical entities have mass and interact with other masses through fundamental forces (like gravity).

When we consider the laws of motion, particularly Newton's third law which states that for every action, there is an equal and opposite reaction, it becomes clear that if one were to exist, the other would necessarily be negated by the interaction. For example, if you push against a solid wall with great force, the wall exerts an equal and opposite force back on you, preventing your movement.

In theoretical discussions, this paradox often serves as a thought experiment to explore concepts like determinism versus free will, the limits of physical laws, and the nature of reality itself. However, in practical terms, any force applied to an object will result in some form of deformation, transfer of energy, or movement, depending on the properties of both the force and the object.

So while the idea of an unstoppable force and an immovable object remains a fascinating philosophical conundrum, it does not hold up under the scrutiny of physical laws as we understand them.
```

---------

Co-authored-by: Kirushikesh D B kirushi@ibm.com <kirushi@cccxl012.pok.ibm.com>
2024-09-01 13:52:20 -07:00
Yuxi Zheng
38dfde6946 docs: fix typo in Cassandra for ./cookbook/cql_agent.ipynb (#25922)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: “syd” <“zheng.yuxi@outlook.com>
2024-09-01 20:51:47 +00:00
Borahm Lee
9cdb99bd60 docs: remove unused imports in Tutorials Basics (#25919)
## Description

- `List` is not explicitly used, so the unnecessary imports will be
removed.
2024-09-01 20:51:00 +00:00
Erick Friis
8732cfc6ef docs: review process gh discussion (#25921) 2024-09-01 17:20:46 +00:00
Erick Friis
08b9715845 docs: pr review process (#25899)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-09-01 16:51:12 +00:00
ccurme
60054db1c4 infra[patch]: remove together from scheduled tests (#25909)
These now run in https://github.com/langchain-ai/langchain-together
2024-08-31 18:43:16 +00:00
Emmanuel Leroy
654da27255 improve llamacpp embeddings (#12972)
- **Description:**
Improve llamacpp embedding class by adding the `device` parameter so it
can be passed to the model and used with `gpu`, `cpu` or Apple metal
(`mps`).
Improve performance by making use of the bulk client api to compute
embeddings in batches.
  
  - **Dependencies:** none
  - **Tag maintainer:** 
@hwchase17

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-31 18:27:59 +00:00
Sandeep Bhandari
f882824eac Update tool_choice.ipynb spelling mistake of select (#25907)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-31 12:36:32 +00:00
ℍ𝕠𝕝𝕝𝕠𝕨 𝕄𝕒𝕟
64b62f6ae4 community[neo4j_vector]: make embedding dimension check optional (#25737)
**Description:**

Starting from Neo4j 5.23 (22 August 2024), with vector-2.0 indexes,
`vector.dimensions` is not required to be set, which will cause it the
key not exist error in index config if it's not set.

Since the existence of vector.dimensions will only ensure additional
checks, this commit turns embedding dimension check optional, and only
do checks when it exists (not None).

https://neo4j.com/release-notes/database/neo4j-5/

**Twitter handle:** @HollowM186

Signed-off-by: Hollow Man <hollowman@opensuse.org>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-31 12:36:20 +00:00
Christophe Bornet
0a752a74cc community[patch], docs: Add API reference doc for GraphVectorStore (#25751) 2024-08-30 17:42:00 -07:00
Bagatur
28e2ec7603 ollama[patch]: Release 0.1.3 (#25902) 2024-08-31 00:11:45 +00:00
Bagatur
ca1c3bd9c0 community[patch]: bump + fix core dep (#25901) 2024-08-30 15:54:07 -07:00
Bagatur
fabe32c06d core[patch]: Release 0.2.37 (#25900) 2024-08-30 22:29:12 +00:00
Richmond Alake
9992a1db43 cookbook: AI Agent Built With LangChain and FireWorksAI (#22609)
Thank you for contributing to LangChain!

- **AI Agent Built With LangChain and FireWorksAI**: "community
notebook"
- **Description:** Added a new AI agent in the cookbook folder that
integrates prompt compression using LLMLingua and arXiv retrieval tools.
The agent is designed to optimize the efficiency and performance of
research tasks by compressing lengthy prompts and retrieving relevant
academic papers. The agent also makes uses of MongoDB to store
conversational history and as it's knowledge base using MongoDB vector
store
    - **Twitter handle:** https://x.com/richmondalake

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-30 22:19:17 +00:00
mehdiosa
c6f00e6bdc community: Fix branch not being considered when using GithubFileLoader (#20075)
- **Description:** Added `ref` query parameter so data is not loaded
only from the default branch but any branch passed

---------

Co-authored-by: Osama Mehdi <mehdi@hm.edu>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-30 21:47:11 +00:00
Leonid Ganeline
54d2b861f6 docs: integrations reference updates 12 (#25676)
Added missed provider pages and links. Fixed inconsistent formatting.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-30 21:25:42 +00:00
Aditya
c8b1c3a7e7 docs: update documentation for Vertex Embeddings Models (#25745)
- **Description:update documentation for Vertex Embeddings Models
    - **Issue:NA
    - **Dependencies:NA
    - **Twitter handle:NA

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
2024-08-30 13:58:21 -07:00
Alex Sherstinsky
617a4e617b community: Fix a bug in handling kwargs overwrites in Predibase integration, and update the documentation. (#25893)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-30 12:41:42 -07:00
Erick Friis
28f6ff6fcd docs: remove incorrect vectorstore local column (#25895) 2024-08-30 18:54:51 +00:00
Anush
ade4bfdff1 qdrant: Updated class check in Self-Query Retriever factory (#25877)
## Description

- Updates the self-query retriever factory to check for the new Qdrant
vector store class. i.e. `langchain_qdrant.QdrantVectorstore`.
- Deprecates `QdrantSparseVectorRetriever`, since the vector store
implementation natively supports it now.

Resolves #25798
2024-08-30 12:11:55 -04:00
Djordje
862ef32fdc community: Fixed infinity embeddings async request (#25882)
**Description:** Fix async infinity embeddings
**Issue:** #24942  

@baskaryan, @ccurme
2024-08-30 12:10:34 -04:00
rainsubtime
f75d5621e2 community:Fix a bug of LLM in moonshot (#25878)
- **Description:** When useing LLM integration moonshot,it's occurring
error "'Moonshot' object has no attribute '_client'",it's because of the
"_client" that is private in pydantic v1.0 so that we can't use it.I
turn "_client" into "client" , the error to be resolved!
- **Issue:** the issue #24390 
- **Dependencies:** none
- **Twitter handle:** @Rainsubtime




- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Cyue <Cyue_work2001@163.com>
2024-08-30 16:09:39 +00:00
ZhangShenao
fd0f147df3 Improvement[Community] Add tool-calling test case for ChatZhipuAI (#25884)
- Add tool-calling test case for `ChatZhipuAI`
2024-08-30 12:05:43 -04:00
k.muto
5bb810c5c6 docs: updated args_schema to be required when using callback handlers in custom tools. (#25887)
- **Description:** if you use callback handlers when using tool,
run_manager will be added to input, so you need to explicitly specify
args_schema, but i was confused because it was not listed, so i added
it. Also, it seems that the type does not work with pydantic.BaseModel.
- **Issue:** None
- **Dependencies:** None
2024-08-30 12:04:40 -04:00
默奕
6377185291 add neo4j query constructor for self query (#25288)
- [x] **PR title - community: add neo4j query constructor for self
query**

- [x] **PR message**
- **Description:** adding a Neo4jTranslator so that the Neo4j vector
database can use SelfQueryRetriever
    - **Issue:** this issue had been raised before in #19748
    - **Dependencies:** none. 
    - **Twitter handle:** @moyi_dang
- p.s. I have not added the query constructor in BUILTIN_TRANSLATORS in
this PR, I want to make changes to only one package at a time.

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-30 14:54:33 +00:00
Ohad Eytan
b5d670498f partners/milvus: allow creating a vectorstore with sparse embeddings (#25284)
# Description
Milvus (and `pymilvus`) recently added the option to use [sparse
vectors](https://milvus.io/docs/sparse_vector.md#Sparse-Vector) with
appropriate search methods (e.g., `SPARSE_INVERTED_INDEX`) and
embeddings (e.g., `BM25`, `SPLADE`).

This PR allow creating a vector store using langchain's `Milvus` class,
setting the matching vector field type to `DataType.SPARSE_FLOAT_VECTOR`
and the default index type to `SPARSE_INVERTED_INDEX`.

It is only extending functionality, and backward compatible. 

## Note
I also interested in extending the Milvus class further to support multi
vector search (aka hybrid search). Will be happy to discuss that. See
[here](https://github.com/langchain-ai/langchain/discussions/19955),
[here](https://github.com/langchain-ai/langchain/pull/20375), and
[here](https://github.com/langchain-ai/langchain/discussions/22886)
similar needs.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-30 02:30:23 +00:00
Erick Friis
09b04c7e3b "community: release 0.2.15" (#25867) 2024-08-30 02:18:48 +00:00
Erick Friis
f7e62754a1 community: undo azure_ad_access_token breaking change (#25818) 2024-08-30 02:06:14 +00:00
Leonid Ganeline
6047138379 docs: arxiv reference updates (#24949)
Added: arxiv references to the concepts page.
Regenerated: arxiv references page.
Improved: formatting of the concepts page (moved the Partner packages
section after langchain_community)
2024-08-29 18:51:18 -07:00
Bagatur
1759ff5836 infra: rm together lagnchain test dp (#25866) 2024-08-30 00:59:53 +00:00
Erick Friis
24f0c232fe docs: elastic feature (#25865) 2024-08-30 00:55:16 +00:00
Erick Friis
1640872059 together: mv to external repo (#25863) 2024-08-29 16:42:59 -07:00
Michael Paciullo
e7c856c298 langchain_openai: Add "strict" parameter to OpenAIFunctionsAgent (#25862)
- **Description:** OpenAI recently introduced a "strict" parameter for
[structured outputs in their
API](https://openai.com/index/introducing-structured-outputs-in-the-api/).
An optional `strict` parameter has been added to
`create_openai_functions_agent()` and `create_openai_tools_agent()` so
developers can use this feature in those agents.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-29 22:27:07 +00:00
Bagatur
fabd3295fa core[patch]: dont mutate merged lists/dicts (#25858)
Update merging utils to
- not mutate objects
- have special handling to 'type' keys in dicts
2024-08-29 20:34:54 +00:00
Kyle Winkelman
09c2d8faca langchain_openai: Cleanup OpenAIEmbeddings validate_environment. (#25855)
**Description:** [This portion of
code](https://github.com/langchain-ai/langchain/blob/v0.1.16/libs/partners/openai/langchain_openai/embeddings/base.py#L189-L196)
has no use as a couple lines later a [`ValueError` is
thrown](https://github.com/langchain-ai/langchain/blob/v0.1.16/libs/partners/openai/langchain_openai/embeddings/base.py#L209-L213).
**Issue:** A follow up to #25852.
2024-08-29 13:54:43 -04:00
Kyle Winkelman
201bdf7148 community: Cap AzureOpenAIEmbeddings chunk_size at 2048 instead of 16. (#25852)
**Description:** Within AzureOpenAIEmbeddings there is a validation to
cap `chunk_size` at 16. The value of 16 is either an old limitation or
was erroneously chosen. I have checked all of the `preview` and `stable`
releases to ensure that the `embeddings` endpoint can handle 2048
entries
[Azure/azure-rest-api-specs](https://github.com/Azure/azure-rest-api-specs/tree/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference).
I have also found many locations that confirm this limit should be 2048:
-
https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#embeddings
-
https://learn.microsoft.com/en-us/azure/ai-services/openai/quotas-limits

**Issue:** fixes #25462
2024-08-29 16:48:04 +00:00
Leonid Ganeline
08c9c683a7 docs: integrations reference updates 6 (#25188)
Added missed provider pages. Added missed references to the integration
components.
2024-08-29 09:17:41 -07:00
Allan Ascencio
a8af396a82 added octoai test (#21793)
- [ ] **PR title**: community: add tests for ChatOctoAI

- [ ] **PR message**: 
Description: Added unit tests for the ChatOctoAI class in the community
package to ensure proper validation and default values. These tests
verify the correct initialization of fields, the handling of missing
required parameters, and the proper setting of aliases.
Issue: N/A
Dependencies: None

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-08-29 15:07:27 +00:00
Param Singh
69f9acb60f premai[patch]: Standardize premai params (#21513)
Thank you for contributing to LangChain!

community:premai[patch]: standardize init args

- updated `temperature` with Pydantic Field, updated the unit test.
- updated `max_tokens` with Pydantic Field, updated the unit test.
- updated `max_retries` with Pydantic Field, updated the unit test.

Related to #20085

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-29 11:01:28 -04:00
Guangdong Liu
fcf9230257 community(sparkllm): Add function call support in Sparkllm chat model. (#20607)
- **Description:** Add function call support in Sparkllm chat model.
Related documents
https://www.xfyun.cn/doc/spark/Web.html#_2-function-call%E8%AF%B4%E6%98%8E
- @baskaryan

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-29 14:38:39 +00:00
ChengZi
37f5ba416e partners[milvus]: fix issue when metadata_schema is None (#25836)
fix issue when metadata_schema is None

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
2024-08-29 10:11:09 -04:00
ccurme
426333ff6f infra[patch]: remove AI21 from scheduled tests (#25847)
These now run in https://github.com/langchain-ai/langchain-ai21
2024-08-29 14:03:20 +00:00
Jorge Piedrahita Ortiz
9ac953a948 Community: sambastudio embeddings GenericV2 API support (#25064)
- **Description:** 
        SambaStudio GenericV2 API support 
        Minor changes for requests error handling
2024-08-29 09:52:49 -04:00
Sam Jove
bdce9a47d0 community[patch]: callback before yield for _astream (gigachat) (#25834)
Description: Moves yield to after callback for _astream for gigachat in
the community package
Issue: #16913

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-29 13:29:28 +00:00
Jinoos Lee
703af9ffe3 Patch enable to use Amazon OpenSearch Serverless(aoss) for Semantic Cache store (#25833)
- [x] **PR title**: "community: Patch enable to use Amazon OpenSearch
Serverless for Semantic Cache store"

- [x] **PR message**: 
- **Description:** OpenSearchSemanticCache class support Amazon
OpenSearch Serverless for Semantic Cache store, it's only required to
pass auth(http_auth) parameter to initializer
    - **Dependencies:** none

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Jinoos Lee <jinoos@amazon.com>
2024-08-29 13:28:22 +00:00
William FH
1ad621120d docs: Update langgraph 0.2.0 checkpointer import path (#25205)
And fix the description for timeout
2024-08-28 19:32:08 -07:00
Andrew Benton
c410545075 docs: add self-hosting row to code interpreter tools table (#25303)
**Description:** Add information about self-hosting support to the code
interpreter tools table.
**Issue:** N/A
**Dependencies:** N/A
2024-08-28 19:30:12 -07:00
Eugene Yurtsev
83327ac43a docs: Fix typo in openai llm integration notebook (#25492)
Fix typo in openai LLM integration notebook.
2024-08-28 19:22:57 -07:00
Leonid Ganeline
31f55781b3 docs: added ColBERT reference (#25452)
Added references to the source papers.
Fixed URL verification code.
Improved arXive page formatting.
Regenerated arXiv page.
2024-08-28 19:05:44 -07:00
Mikhail Khludnev
a017f49fd3 comminity[patch]: fix #25575 YandexGPTs for _grpc_metadata (#25617)
it fixes two issues:

### YGPTs are broken #25575

```
File ....conda/lib/python3.11/site-packages/langchain_community/embeddings/yandex.py:211, in _make_request(self, texts, **kwargs)
..
--> 211 res = stub.TextEmbedding(request, metadata=self._grpc_metadata)  # type: ignore[attr-defined]

AttributeError: 'YandexGPTEmbeddings' object has no attribute '_grpc_metadata'
```
My gut feeling that #23841 is the cause.

I have to drop leading underscore from `_grpc_metadata` for quickfix,
but I just don't know how to do it _pydantic_ enough.

### minor issue:

if we use `api_key`, which is not the best practice the code fails with 

```
File ~/git/...../python3.11/site-packages/langchain_community/embeddings/yandex.py:119, in YandexGPTEmbeddings.validate_environment(cls, values)
...

AttributeError: 'tuple' object has no attribute 'append'
```

- Added new integration test. But it requires YGPT env available and
active account. I don't know how int tests dis\enabled in CI.
 - added small unit tests with mocks. Should be fine.

---------

Co-authored-by: mikhail-khludnev <mikhail_khludnev@rntgroup.com>
2024-08-28 18:48:10 -07:00
Serena Ruan
850bf89e48 community[patch]: Support passing extra params for executing functions in UCFunctionToolkit (#25652)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


Support passing extra params when executing UC functions:
The params should be a dictionary with key EXECUTE_FUNCTION_ARG_NAME,
the assumption is that the function itself doesn't use such variable
name (starting and ending with double underscores), and if it does we
raise Exception.
If invalid params passing to the execute_statement, we raise Exception
as well.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-28 18:47:32 -07:00
崔浩
3555882a0d community[patch]: optimize xinference llm import (#25809)
Thank you for contributing to LangChain!

- [ ] **PR title**: "community: optimize xinference llm import"

- [ ] **PR message**: 
- **Description:** from xinferece_client import RESTfulClient when there
is no importing xinference.
    - **Dependencies:** xinferece_client
- **Why do so:** the total xinference(pip install xinference[all]) is
too heavy for installing, let alone it is useless for langchain user
except RESTfulClient. The modification has maintained consistency with
the xinference embeddings
[embeddings/xinference](../blob/master/libs/community/langchain_community/embeddings/xinference.py#L89).
2024-08-29 01:41:43 +00:00
Michael Rubél
9decd0b243 langchain[patch]: fix moderation chain init (#25778)
[This
commit](d3ca2cc8c3)
has broken the moderation chain so we've faced a crash when migrating
the LangChain from v0.1 to v0.2.

The issue appears that the class attribute the code refers to doesn't
hold the value processed in the `validate_environment` method. We had
`extras={}` in this attribute, and it was casted to `True` when it
should've been `False`. Adding a simple assignment seems to resolve the
issue, though I'm not sure it's the right way.

---

---------

Co-authored-by: Michael Rubél <mrubel@oroinc.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-28 18:41:31 -07:00
Madhu Shantan
63a1569d5f docs: fixed syntax error in ChatAnthropic Example - rag app tutorial notebook (#25824)
Thank you for contributing to LangChain!

- [ ] **PR title**: docs: fixed syntax error in ChatAnthropic Example -
rag app tutorial notebook - generation


- [ ] **PR message**: 
- **Description:** Fixed a syntax error in the ChatAnthropic
initialization example in the RAG tutorial notebook. The original code
had an extra set of quotation marks around the model parameter, which
would cause a Python syntax error. The corrected version removes these
unnecessary quotes.
 
- **Dependencies:** No new dependencies required for this documentation
fix.
I've verified that the corrected code is syntactically valid and matches
the expected format for initializing a ChatAnthropic instance in
LangChain.
    - **Twitter handle:** madhu_shantan


- [ ] **Add tests and docs**: the error in Jupyter notebook: 
<img width="1189" alt="Screenshot 2024-08-29 at 12 43 47 AM"
src="https://github.com/user-attachments/assets/07148a93-300f-40e2-ad4a-ac219cbb56a4">

the corrected cell: 
<img width="983" alt="Screenshot 2024-08-29 at 12 44 18 AM"
src="https://github.com/user-attachments/assets/75b1455a-3671-454e-ac16-8ca77c049dbd">



- [ ] **Lint and test**: As this is a documentation-only change, I have
not run the full test suite. However, I have verified that the corrected
code example is syntactically valid and matches the expected usage of
the ChatAnthropic class.
 
the error in the docs is here -  
<img width="1020" alt="Screenshot 2024-08-29 at 12 48 36 AM"
src="https://github.com/user-attachments/assets/812ccb20-b411-4a5b-afc1-41742efb32a7">
2024-08-29 01:31:01 +00:00
Erick Friis
e5ae988505 prompty: bump core version (#25831) 2024-08-28 23:06:13 +00:00
Erick Friis
c8b8335b82 core: prompt variable error msg (#25787) 2024-08-28 22:54:00 +00:00
ccurme
ff168aaec0 prompty: release 0.0.3 (#25830) 2024-08-28 15:52:17 -07:00
Matthieu
783397eacb community: avoid double templating in langchain_prompty (#25777)
## Description

In `langchain_prompty`, messages are templated by Prompty. However, a
call to `ChatPromptTemplate` was initiating a second templating. We now
convert parsed messages to `Message` objects before calling
`ChatPromptTemplate`, signifying clearly that they are already
templated.

We also revert #25739 , which applied to this second templating, which
we now avoid, and did not fix the original issue.

## Issue

Closes #25703
2024-08-28 18:18:18 -04:00
ccurme
afe8ccaaa6 community[patch]: Add ID field back to Azure AI Search results (#25828)
Commandeering https://github.com/langchain-ai/langchain/pull/23243 as
maintainers don't have ability to modify that PR.

Fixes https://github.com/langchain-ai/langchain/issues/22827

---------

Co-authored-by: Ming Quah <fleetadmiralbutter@icloud.com>
2024-08-28 17:56:50 -04:00
rbrugaro
9fa172bc26 add links in example nb with tei/tgi references (#25821)
I have validated langchain interface with tei/tgi works as expected when
TEI and TGI running on Intel Gaudi2. Adding some references to notebooks
to help users find relevant info.

---------

Co-authored-by: Rita Brugarolas <rbrugaro@idc708053.jf.intel.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-28 21:33:25 +00:00
Erick Friis
8fb594fd2a ai21: migrate to external repo (#25827) 2024-08-28 14:24:07 -07:00
Erick Friis
095b712a26 ollama: bump core version (#25826) 2024-08-28 12:31:16 -07:00
Erick Friis
5db6c6d96d community: release 0.2.14 (#25822) 2024-08-28 19:05:53 +00:00
Erick Friis
d6c4803ab0 core: release 0.2.36 (#25819) 2024-08-28 18:04:51 +00:00
Erick Friis
5186325bc7 partners/ollama: release 0.1.2 (#25817)
release for #25697
2024-08-28 17:47:32 +00:00
Rohit Gupta
aff50a1e6f milvus: add array data type for collection create (#23219)
Add array data type for milvus vector store collection create


Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Rohit Gupta <rohit.gupta2@walmart.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-28 16:55:57 +00:00
Cillian Berragan
754f3c41f9 community: add score to PineconeHybridSearchRetriever (#25781)
**Description:**

Adds the 'score' returned by Pinecone to the
`PineconeHybridSearchRetriever` list of returned Documents.

There is currently no way to return the score when using Pinecone hybrid
search, so in this PR I include it by default.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-28 13:11:06 +00:00
ZhangShenao
3f1d652f15 Improvement[Community] Improve api doc for PineconeHybridSearchRetriever (#25803)
- Complete missing args in api doc
2024-08-28 08:38:56 -04:00
Moritz Schlager
555f97becb community[patch]: fix model initialization bug for deepinfra (#25727)
### Description
adds an init method to ChatDeepInfra to set the model_name attribute
accordings to the argument
### Issue
currently, the model_name specified by the user during initialization of
the ChatDeepInfra class is never set. Therefore, it always chooses the
default model (meta-llama/Llama-2-70b-chat-hf, however probably since
this is deprecated it always uses meta-llama/Llama-3-70b-Instruct). We
stumbled across this issue and fixed it as proposed in this pull
request. Feel free to change the fix according to your coding guidelines
and style, this is just a proposal and we want to draw attention to this
problem.
### Dependencies
no additional dependencies required

Feel free to contact me or @timo282 and @finitearth if you have any
questions.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-08-28 02:02:35 -07:00
Bagatur
a052173b55 together[patch]: Release 0.1.6 (#25805) 2024-08-28 02:01:49 -07:00
Bagatur
b0ac6fe8d3 community[patch]: Release 0.2.13 (#25806) 2024-08-28 08:57:49 +00:00
Bagatur
85aef7641c openai[patch]: Release 0.1.23 (#25804) 2024-08-28 08:52:08 +00:00
Bagatur
0d3fd0aeb9 langchain[patch]: Release 0.2.15 (#25802) 2024-08-28 08:35:00 +00:00
zysoong
25a6790e1a community[patch]: Minor Improvement of extract hyperlinks tool output (#25728)
**Description:** Make the hyperlink only appear once in the
extract_hyperlinks tool output. (for some websites output contains
meaningless '#' hyperlinks multiple times which will extend the tokens
of context window without any advantage)
**Issue:** None
**Dependencies:** None
2024-08-28 08:02:40 +00:00
Christophe Bornet
ff0df5ea15 core[patch]: Add B(bugbear) ruff rules (#25520)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-28 07:09:29 +00:00
Isaac Francisco
d5ddaac1fc docs minor fix (#25794) 2024-08-28 04:14:36 +00:00
ccurme
3c784e10a8 docs: improve docs for InMemoryVectorStore (#25786)
Closes https://github.com/langchain-ai/langchain/issues/25775
2024-08-27 21:12:32 -07:00
Erick Friis
1023fbc98a databricks: mv to partner repo (#25788) 2024-08-27 18:51:17 -07:00
ccurme
2e5c379632 openai[patch]: fix get_num_tokens for function calls (#25785)
Closes https://github.com/langchain-ai/langchain/issues/25784

See additional discussion
[here](0a4ee864e9 (r145147380)).
2024-08-27 20:18:19 +00:00
Erick Friis
2aa35d80a0 docs, infra: cerebras docs, update docs template linting with better error (#25782) 2024-08-27 17:19:59 +00:00
venkatram-dev
48b579f6b5 date_time_parser (#25763)
Thank you for contributing to LangChain!

- [x] **PR title**: "langchain: Chains: query_constructor: add date time
parser"

- [x] **PR message**: 
- **Description:** add date time parser to langchain Chains
query_constructor
    - **Issue: https://github.com/langchain-ai/langchain/issues/25526


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-27 13:18:52 -04:00
Tomaz Bratanic
f359e6b0a5 Add mmr to neo4j vector (#25765) 2024-08-27 08:55:19 -04:00
pazshalev
995305fdd5 test: fix tool calling integration tests for AI21 Jamba models (#25771)
Ignore specific integration tests that handles specific tool calling
cases that will soon be fixed.
2024-08-27 08:54:51 -04:00
Luis Valencia
99f9a664a5 community: Azure Search Vector Store is missing Access Token Authentication (#24330)
Added Azure Search Access Token Authentication instead of API KEY auth.
Fixes Issue: https://github.com/langchain-ai/langchain/issues/24263
Dependencies: None
Twitter: @levalencia

@baskaryan

Could you please review? First time creating a PR that fixes some code.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-26 15:41:50 -07:00
Leonid Ganeline
49b0bc7b5a docs: integrations reference updates 5 (#25151)
Added missed references. Added missed provider pages.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-26 15:12:39 -04:00
ZhangShenao
44e3e2391c Improvement[Community] Improve methods in IMessageChatLoader (#25746)
- Add @staticmethod to static methods in `IMessageChatLoader`.
- Format args name.
2024-08-26 14:20:22 -04:00
Erick Friis
815f59dba5 partners/ai21: release 0.1.8 (#25759) 2024-08-26 18:02:43 +00:00
amirai21
17dffd9741 AI21: tools calling support in Langchain (#25635)
This pull request introduces support for the AI21 tools calling feature,
available by the Jamba-1.5 models. When Jamba-1.5 detects the necessity
to invoke a provided tool, as indicated by the 'tools' parameter passed
to the model:

```
class ToolDefinition(TypedDict, total=False):
    type: Required[Literal["function"]]
    function: Required[FunctionToolDefinition]

class FunctionToolDefinition(TypedDict, total=False):
    name: Required[str]
    description: str
    parameters: ToolParameters

class ToolParameters(TypedDict, total=False):
    type: Literal["object"]
    properties: Required[Dict[str, Any]]
    required: List[str]
```

It will respond with a list of tool calls structured as follows:

```
class ToolCall(AI21BaseModel):
    id: str
    function: ToolFunction
    type: Literal["function"] = "function"

class ToolFunction(AI21BaseModel):
    name: str
    arguments: str
```

This pull request incorporates the necessary modifications to integrate
this functionality into the ai21-langchain library.

---------

Co-authored-by: asafg <asafg@ai21.com>
Co-authored-by: pazshalev <111360591+pazshalev@users.noreply.github.com>
Co-authored-by: Paz Shalev <pazs@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-26 10:50:30 -07:00
maang-h
a566a15930 Fix MoonshotChat instantiate with alias (#25755)
- **Description:**
   -  Fix `MoonshotChat` instantiate with alias
   - Add `MoonshotChat` to `__init__.py`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-26 17:33:22 +00:00
venkatram-dev
ec99f0d193 milvus: add_db_milvus_connection (#25627)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"
  - "libs: langchain_milvus: add db name to milvus connection check"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:**  add db name to milvus connection check
    - **Issue:** https://github.com/langchain-ai/langchain/issues/25277



- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-26 17:29:12 +00:00
Ashvin
af3b3a4474 Update endpoint for AzureMLEndpointApiType class. (#25725)
This addresses the issue mentioned in #25702

I have updated the endpoint used in validating the endpoint API type in
the AzureMLBaseEndpoint class from `/v1/completions` to `/completions`
and `/v1/chat/completions` to `/chat/completions`.

Co-authored-by: = <=>
2024-08-26 08:50:02 -04:00
Mohammad Mohtashim
dcf2278a05 [Community]: Added Template Format Parameter in create_chat_prompt for Langchain Prompty (#25739)
- **Description:** Added a `template_format` parameter to
`create_chat_prompt` to allow `.prompty` files to handle variables in
different template formats.
- **Issue:** #25703

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-26 12:48:21 +00:00
Dristy Srivastava
7205057c3e [Community][minor]: Added langchain_version while calling discover API (#24428)
- **Description:** Added langchain version while calling discover API
during both ingestion and retrieval
- **Issue:** NA
- **Dependencies:** NA
- **Tests:** NA
- **Docs** NA

---------

Co-authored-by: dristy.cd <dristy@clouddefense.io>
2024-08-26 08:47:48 -04:00
Dristy Srivastava
fbb4761199 [Community][minor]: Updating source path, and file path for SharePoint loader in PebbloSafeLoader (#25592)
- **Description:** Updating source path and file path in Pebblo safe
loader for SharePoint apps during loading
- **Issue:** NA
- **Dependencies:** NA
- **Tests:** NA
- **Docs** NA

---------

Co-authored-by: dristy.cd <dristy@clouddefense.io>
2024-08-26 08:38:40 -04:00
Rajendra Kadam
745d1c2b8d community[minor]: [Pebblo] Fix URL construction in newer Python versions (#25747)
- **PR message**: **Fix URL construction in newer Python versions**
- **Description:** 
- Update the URL construction logic to use the .value attribute for
Routes enum members.
- This adjustment resolves an issue where the code worked correctly in
Python 3.9 but failed in Python 3.11.
  - Clean up unused routes.
- **Issue:** NA
- **Dependencies:** NA
2024-08-26 07:27:30 -04:00
Rajendra Kadam
58a98c7d8a community: [PebbloRetrievalQA] Implemented Async support for prompt APIs (#25748)
- **Description:** PebbloRetrievalQA: Implemented Async support for
prompt APIs (classification and governance)
- **Issue:** NA
- **Dependencies:** NA
2024-08-26 07:27:05 -04:00
Tomaz Bratanic
6703d795c5 Handle Ollama tool raw schema in llmgraphtransformer (#25752) 2024-08-26 07:26:26 -04:00
Bagatur
30f1bf24ac core[patch]: Release 0.2.35 (#25729) 2024-08-25 16:49:27 -07:00
Christophe Bornet
038c287b3a all: Improve make lint command (#25344)
* Removed `ruff check --select I` as `I` is already selected and checked
in the main `ruff check` command
* Added checks for non-empty `PYTHON_FILES`
* Run `ruff check` only on `PYTHON_FILES`

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-23 18:23:52 -07:00
Mohammad Mohtashim
9a29398fe6 huggingface: fix model param population (#24743)
- **Description:** Fix the validation error for `endpoint_url` for
HuggingFaceEndpoint. I have given a descriptive detail of the isse in
the issue that I have created.
- **Issue:** #24742

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-24 00:45:28 +00:00
Yuki Watanabe
c7a8af2e75 databricks: add vector search and embeddings (#25648)
### Summary

Add `DatabricksVectorSearch` and `DatabricksEmbeddings` classes to the
`langchain-databricks` partner packages. Core functionality is
unchanged, but the vector search class is largely refactored for
readability and maintainability.

This PR does not add integration tests yet. This will be added once the
Databricks test workspace is ready.

Tagging @efriis as POC


### Tracker
[] Create a package and imgrate ChatDatabricks
[✍️] Migrate DatabricksVectorSearch, DatabricksEmbeddings, and their
docs
~[ ] Migrate UCFunctionToolkit and its doc~
[ ] Add provider document and update README.md
[ ] Add integration tests and set up secrets (after moved to an external
package)
[ ] Add deprecation note to the community implementations.

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-24 00:40:21 +00:00
Erick Friis
71c039571a docs: remove deprecated nemo embed docs (#25720) 2024-08-24 00:36:33 +00:00
Hyman
58e72febeb openai:compatible with other llm usage meta data (#24500)
- [ ] **PR message**:
- **Description:** Compatible with other llm (eg: deepseek-chat, glm-4)
usage meta data
    - **Issue:** N/A
    - **Dependencies:** no new dependencies added


- [ ] **Add tests and docs**: 
libs/partners/openai/tests/unit_tests/chat_models/test_base.py
```shell
cd libs/partners/openai
poetry run pytest tests/unit_tests/chat_models/test_base.py::test_openai_astream
poetry run pytest tests/unit_tests/chat_models/test_base.py::test_openai_stream
poetry run pytest tests/unit_tests/chat_models/test_base.py::test_deepseek_astream
poetry run pytest tests/unit_tests/chat_models/test_base.py::test_deepseek_stream
poetry run pytest tests/unit_tests/chat_models/test_base.py::test_glm4_astream
poetry run pytest tests/unit_tests/chat_models/test_base.py::test_glm4_stream
```

---------

Co-authored-by: hyman <hyman@xiaozancloud.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-23 16:59:14 -07:00
Erick Friis
3dc7d447aa infra: reenable min version testing 2, ci ignore ai21 (#25709) 2024-08-23 23:28:42 +00:00
Erick Friis
f6491ceb7d community: remove integration test deps (#24460)
they arent used
2024-08-23 23:25:17 +00:00
Erick Friis
0022ae1b31 docs: remove templates (#25717)
- [x] check redirect works at template root
- [x] check redirect works within individual template page
2024-08-23 15:51:12 -07:00
Sharmistha S. Gupta
90439b12f6 Added support for Nebula Chat model (#21925)
Description: Added support for Nebula Chat model in addition to Nebula
Instruct
Dependencies: N/A
Twitter handle: @Symbldotai

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-23 22:34:32 +00:00
James Espichan Vilca
080741d336 core[patch]: Fix type for inner input in base prompts (#25713)
Thank you for contributing to LangChain!

- [ ] **PR title**: "langchain-core: Fix type"
- The file to modify is located in
/libs/core/langchain_core/prompts/base.py


- [ ] **PR message**: 
- **Description:** The change is a type for the inner input variable,
the type go from dict to Any. This change is required since the method
_validate input expects a type that is not only a dictionary.
    - **Dependencies:** There are no dependencies for this change


- [ ] **Add tests and docs**: 
1. A test is not needed. This error occurs because I overrode a portion
of the _validate_input method, which is causing a 'beartype' to raise an
error.
2024-08-23 14:06:39 -07:00
John
5ce9a716a7 docs, langchain-unstructured: update langchain-unstructured docs and update ustructured-client dependency (#25451)
Be more explicit in the docs about creating an instance of the
UnstructuredClient if you want to customize it versus using sdk
parameters with the UnstructuredLoader.

Bump the unstructured-client dependency as discussed
[here](https://github.com/langchain-ai/langchain/discussions/25328#discussioncomment-10350949)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-23 20:36:41 +00:00
Scott Hurrey
92abf62292 box[patch]: fix bugs in docs (#25699) 2024-08-23 19:36:23 +00:00
Ian
64ace25eb8 <Community>: tidb vector support vector index (#19984)
This PR introduces adjustments to ensure compatibility with the recently
released preview version of [TiDB Serverless Vector
Search](https://tidb.cloud/ai), aiming to prevent user confusion.

- TiDB Vector now supports vector indexing with cosine and l2 distance
strategies, although inner_product remains unsupported.
- Changing the distance strategy is currently not supported, so the test
cased should be adjusted.
2024-08-23 13:59:23 -04:00
Austin Burdette
f355a98bb6 community:yuan2[patch]: standardize init args (#21462)
updated stop and request_timeout so they aliased to stop_sequences, and
timeout respectively. Added test that both continue to set the same
underlying attributes.

Related to
[20085](https://github.com/langchain-ai/langchain/issues/20085)

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-23 17:56:19 +00:00
ccurme
bc557a5663 text-splitters[patch]: fix typing for keep_separator (#25706) 2024-08-23 17:22:02 +00:00
Erick Friis
8170bd636f Revert "infra: reenable min version testing" (#25708)
Reverts langchain-ai/langchain#24640
2024-08-23 10:20:23 -07:00
Erick Friis
3d5808ef27 infra: reenable min version testing (#24640)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-08-23 10:17:41 -07:00
Erick Friis
b365ee996b community: remove unused verify_ssl kwarg from aiohttp request (#25707)
it's not a valid kwarg in aiohttp request
2024-08-23 17:14:04 +00:00
Erick Friis
580fbd9ada docs: api ref to new site somewheres (#25679)
```
https://api\.python\.langchain\.com/en/latest/([^/]*)/langchain_([^.]*)\.(.*)\.html([^"]*)
https://python.langchain.com/v0.2/api_reference/$2/$1/langchain_$2.$3.html$4
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-23 10:01:16 -07:00
Bagatur
6a60a2a435 infra: gitignore api_ref mds (#25705) 2024-08-23 09:50:30 -07:00
Ashvin
2cd77a53a3 docs: Add docstrings for CassandraChatMessageHistory class and package namespace function. (#24222)
- Modified docstring for CassandraChatMessageHistory in
libs/community/langchain_community/chat_message_history/cassandra.py.

- Added docstring for _package_namespace function in
docs/api_reference/create_api_rst.py

---------

Co-authored-by: ashvin <ashvin.anilkumar@qburst.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-23 15:49:41 +00:00
Leonid Ganeline
8788a34bfa community: NeptuneGraph fix (#23281)
Issue: the `service` optional parameter was mentioned but not used.
Fix: added this parameter.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-23 15:34:26 +00:00
Djordje
22f9ae489f community: Opensearch - added score function for similarity_score_threshold (#23928)
This PR resolves the NotImplemented error for the
similarity_score_threshold search type for OpenSearch.
2024-08-23 11:30:04 -04:00
ZhangShenao
b38c83ff93 patch[Community] Optimize methods in several ChatLoaders (#24806)
There are some static methods in ChatLoaders, try to add @staticmethod
decorator for them.
2024-08-23 11:00:41 -04:00
James Espichan Vilca
644e0d3463 Use extend method for embeddings concatenation in mlflow_gateway (#14358)
## Description
There is a bug in the concatenation of embeddings obtained from MLflow
that does not conform to the type hint requested by the function.
``` python  
def _query(self, texts: List[str]) -> List[List[float]]:
```
It is logical to expect a **List[List[float]]** for a **List[str]**.
However, the append method encapsulates the response in a global List.
To avoid this, the extend method should be used, which will add the
embeddings of all strings at the same list level.

## Testing
I have tried using OpenAI-ADA to obtain the embeddings, and the result
of executing this snippet is as follows:

``` python  
embeds = await MlflowAIGatewayEmbeddings().aembed_documents(texts=["hi", "how are you?"])
print(embeds)
```  

``` python  
[[[-0.03512698, -0.020624293, -0.015343423, ...], [-0.021260535, -0.011461929, -0.00033121882, ...]]]
```
When in reality, the expected result should be:

``` python  
[[-0.03512698, -0.020624293, -0.015343423, ...], [-0.021260535, -0.011461929, -0.00033121882, ...]]
```
The above result complies with the expected type hint:
**List[List[float]]** . As I mentioned, we can achieve that by using the
extend method instead of the append method.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-23 14:43:43 +00:00
Christophe Bornet
7f1e444efa partners: Use simsimd types (#25299)
The simsimd package [now has
types](https://github.com/ashvardanian/SimSIMD/releases/tag/v5.0.0)
2024-08-23 10:41:39 -04:00
clement.l
642f9530cd community: add supported blockchains to Blockchain Document Loader (#25428)
- Remove deprecated chains.
- Add more supported chains.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-23 14:39:42 +00:00
conjuncts
818267bbc3 community: allow chroma DB delete() to use "where" argument (#19826)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"

Description: Simply pass kwargs to allow arguments like "where" to be
propagated
Issue: Previously, db.delete(where={}) wouldn't work for chroma
vectorstores
Dependencies: N/A
Twitter handle: N/A

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-08-23 10:10:57 -04:00
Kevin Engelke
3c7f12cbf5 community[minor]: Fix missing 'keep_newlines' parameter forward-pass to 'process_pages' function in confluence loader (#20086) (#20087)
- **Description:** Fixed missing `keep_newlines` parameter forward-pass
in confluence-loader
- **Issue:** #20086 
- **Dependencies:** None

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-23 12:59:38 +00:00
Erik Lindgren
583b0449eb community[patch]: Fix Hybrid Search for non-Databricks managed embeddings (#25590)
Description: Send both the query and query_embedding to the Databricks
index for hybrid search.

Issue: When using hybrid search with non-Databricks managed embedding we
currently don't pass both the embedding and query_text to the index.
Hybrid search requires both of these. This change fixes this issue for
both `similarity_search` and `similarity_search_by_vector`.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-23 08:57:13 +00:00
Alejandro Companioni
bcd5842b5d community[patch]: Updating default PPLX model to supported llama-3.1 model. (#25643)
# Issue

As of late July, Perplexity [no longer supports Llama 3
models](https://docs.perplexity.ai/changelog/introducing-new-and-improved-sonar-models).

# Description

This PR updates the default model and doc examples to reflect their
latest supported model. (Mostly updating the same places changed by
#23723.)

# Twitter handle

`@acompa_` on behalf of the team at Not Diamond. Check us out
[here](https://notdiamond.ai).

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-08-23 08:33:30 +00:00
Leonid Ganeline
163ef35dd1 docs: templates updated titles (#25646)
Updated titles into a consistent format. 
Fixed links to the diagrams.
Fixed typos.
Note: The Templates menu in the navbar is now sorted by the file names.
I'll try sorting the navbar menus by the page titles, not the page file
names.
2024-08-23 01:19:38 -07:00
Parsa Abbasi
1b2ae40d45 docs: Updated WikipediaLoader documentation (#25647)
- Output of the cells was not included in the documentation. I have
added them.
- There is another parameter in the `WikipediaLoader` class called
`doc_content_chars_max` (Based on
[this](https://api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.wikipedia.WikipediaLoader.html)).
I have included this in the list of parameters.
- I put the list of parameters under a new section called "Parameters"
in the documentation.
- I also included the `langchain_community` package in the installation
command.
- Some minor formatting/spelling issues were fixed.
2024-08-23 01:19:03 -07:00
Jakub W.
b865ee49a0 community[patch]: Dynamodb history messages key (#25658)
- **Description:** adding the history_messages_key so you don't have to
use "History" as a key in langchain
2024-08-23 08:05:28 +00:00
Erick Friis
b28bc252c4 core[patch]: mmr util (#25689) 2024-08-22 21:31:17 -07:00
ZhangShenao
ba89933c2c Doc[Embeddings] Add docs for ZhipuAIEmbeddings (#25662)
- Add docs for `ZhipuAIEmbeddings`.
- Using integration doc template.
- Source api reference: https://bigmodel.cn/dev/api#vector

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-23 01:33:43 +00:00
Erick Friis
6096c80b71 core: pydantic output parser streaming fix (#24415)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-22 18:00:09 -07:00
Eugene Yurtsev
c316361115 core[patch]: Add _api.rename_parameter to support renaming of parameters in functions (#25101)
Add ability to rename paramerters in function signatures

```python

    @rename_parameter(since="2.0.0", removal="3.0.0", old="old_name", new="new_name")
    def foo(new_name: str) -> str:
        """original doc"""
        return new_name
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-08-22 17:16:31 -07:00
Yusuke Fukasawa
0258cb96fa core[patch]: add additionalProperties recursively to oai function if strict (#25169)
Hello. 
First of all, thank you for maintaining such a great project.

## Description
In https://github.com/langchain-ai/langchain/pull/25123, support for
structured_output is added. However, `"additionalProperties": false`
needs to be included at all levels when a nested object is generated.

error from current code:
https://gist.github.com/fufufukakaka/e9b475300e6934853d119428e390f204
```
BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for response_format 'JokeWithEvaluation': In context=('properties', 'self_evaluation'), 'additionalProperties' is required to be supplied and to be false", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}
```

Reference: [Introducing Structured Outputs in the
API](https://openai.com/index/introducing-structured-outputs-in-the-api/)

```json
{
  "model": "gpt-4o-2024-08-06",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful math tutor."
    },
    {
      "role": "user",
      "content": "solve 8x + 31 = 2"
    }
  ],
  "response_format": {
    "type": "json_schema",
    "json_schema": {
      "name": "math_response",
      "strict": true,
      "schema": {
        "type": "object",
        "properties": {
          "steps": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "explanation": {
                  "type": "string"
                },
                "output": {
                  "type": "string"
                }
              },
              "required": ["explanation", "output"],
              "additionalProperties": false
            }
          },
          "final_answer": {
            "type": "string"
          }
        },
        "required": ["steps", "final_answer"],
        "additionalProperties": false
      }
    }
  }
}
```

In the current code, `"additionalProperties": false` is only added at
the last level.
This PR introduces the `_add_additional_properties_key` function, which
recursively adds `"additionalProperties": false` to the entire JSON
schema for the request.

Twitter handle: `@fukkaa1225`

Thank you!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-23 00:08:58 +00:00
Bagatur
b35ee09b3f infra: xfail pydantic v2 arg to py function (#25686)
Issue to track: #25687
2024-08-22 23:52:57 +00:00
Christophe Bornet
ee98da4f4e core[patch]: Add UP(upgrade) ruff rules (#25358) 2024-08-22 16:29:22 -07:00
William FH
294f7fcb38 core[patch]: Remove different parent run id warning (#25683) 2024-08-22 16:10:35 -07:00
Vadym Barda
46d344c33d core[patch]: support drawing nested subgraphs in draw_mermaid (#25581)
Previously the code was able to only handle a single level of nesting
for subgraphs in mermaid. This change adds support for arbitrary nesting
of subgraphs.
2024-08-22 16:08:49 -07:00
Manuel Jaiczay
1c31234eed community: fix HuggingFacePipeline pipeline_kwargs (#19920)
Fix handling of pipeline_kwargs to prioritize class attribute defaults.

#19770

Co-authored-by: jaizo <manuel.jaiczay@polygons.at>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-08-22 18:29:46 -04:00
Nobuhiko Otoba
4b63a217c2 "community: Fix GithubFileLoader source code", "docs: Fix GithubFileLoader code sample" (#19943)
This PR adds tiny improvements to the `GithubFileLoader` document loader
and its code sample, addressing the following issues:

1. Currently, the `file_extension` argument of `GithubFileLoader` does
not change its behavior at all.
1. The `GithubFileLoader` sample code in
`docs/docs/integrations/document_loaders/github.ipynb` does not work as
it stands.

The respective solutions I propose are the following:

1. Remove `file_extension` argument from `GithubFileLoader`.
1. Specify the branch as `master` (not the default `main`) and rename
`documents` as `document`.

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-08-22 18:24:57 -04:00
Leonid Ganeline
e7abee034e docs: integrations reference updates 4 (#25118)
Added missed references; missed provider pages.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-22 22:16:45 +00:00
Erick Friis
5fb8aa82b9 docs: api ref to new site links in featuretable (#25678) 2024-08-22 21:52:50 +00:00
Bagatur
cf9c484715 standard-tests[patch]: test Message.name (#25677)
Tests:
https://github.com/langchain-ai/langchain/actions/runs/10516092584
2024-08-22 14:47:31 -07:00
Nada Amin
ac7b71e0d7 langchain_community.graphs: Neo4JGraph: prop min_size might be None (#23944)
When I used the Neo4JGraph enhanced_schema=True option, I ran into an
error because a prop min_size of None was compared numerically with an
int.

The fix I applied is similar to the pattern of skipping embeddings
elsewhere in the file.

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-22 20:29:52 +00:00
CastaChick
7d13a2f958 core[patch]: add option to specify the chunk separator in merge_message_runs (#24783)
**Description:**
LLM will stop generating text even in the middle of a sentence if
`finish_reason` is `length` (for OpenAI) or `stop_reason` is
`max_tokens` (for Anthropic).
To obtain longer outputs from LLM, we should call the message generation
API multiple times and merge the results into the text to circumvent the
API's output token limit.
The extra line breaks forced by the `merge_message_runs` function when
seamlessly merging messages can be annoying, so I added the option to
specify the chunk separator.

**Issue:**
No corresponding issues.

**Dependencies:**
No dependencies required.

**Twitter handle:**
@hanama_chem
https://x.com/hanama_chem

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-22 19:46:25 +00:00
basirsedighi
0f3fe44e44 parsed_json is expected to be a list of dictionaries, but it seems to… (#24018)
parsed_json is expected to be a list of dictionaries, but it seems to…
be a single dictionary instead.
This is at
libs/experimental/langchain_experimental/graph_transformers/llm.py
process process_response

Thank you for contributing to LangChain!

- [ ] **Bugfix**: "experimental: bugfix"

---------

Co-authored-by: based <basir.sedighi@nris.no>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-22 19:09:43 +00:00
ZhangShenao
8bde04079b patch[experimental] Fix start_index in SemanticChunker (#24761)
- Cause chunks are joined by space, so they can't be found in text, and
the final `start_index` is very possibility to be -1.
- The simplest way is to use the natural index of the chunk as
`start_index`.
2024-08-22 14:59:40 -04:00
Sanjay Parajuli
6fbd53bc60 docs: Update tool_calling.ipynb (#25434)
**Description:** This part of the documentation didn't explain about the
`required` property of function calling. I added additional line as a
note.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-22 18:55:24 +00:00
William FH
fad6fc866a Rm DeepInfra Breakpoint Comment (#25206)
tbh should rm the print staement too
2024-08-22 14:43:44 -04:00
yahya-mouman
e5bb4cb646 lagchain-pinecone: add id to similarity documents results (#25630)
- **Description:** This change adds the ID field that's required in
Pinecone to the result documents of the similarity search method.
- **Issue:** Lack of document metadata namely the ID field

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-22 18:33:26 +00:00
Eric Pinzur
01ded5e2f9 community: add metadata filter to CassandraGraphVectorStore (#25663)
- **Description:** 
- Added metadata filtering support to
`langchain_community.graph_vectorstores.cassandra.CassandraGraphVectorStore`
  - Also fixed type conversion issues highlighted by mypy.
- **Dependencies:** 
  - `ragstack-ai-knowledge-store 0.2.0` (released July 23, 2024)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-22 14:27:16 -04:00
Ivan
5b9290a449 Fix UnionType type var replacement (#25566)
[langchain_core] Fix UnionType type var replacement

- Added types.UnionType to typing.Union mapping

Type replacement cause `TypeError: 'type' object is not subscriptable`
if any of union type comes as function `_py_38_safe_origin` return
`types.UnionType` instead of `typing.Union`

```python
>>> from types import UnionType
>>> from typing import Union, get_origin
>>> type_ = get_origin(str | None)
>>> type_
<class 'types.UnionType'>
>>> UnionType[(str, None)]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'type' object is not subscriptable
>>> Union[(str, None)]
typing.Optional[str]
```

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-22 14:22:09 -04:00
William FH
8230ba47f3 core[patch]: Improve some error messages and add another test for checking RunnableWithMessageHistory (#25209)
Also add more useful error messages.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-08-22 18:14:27 +00:00
Hasan Kumar
b4fcda7657 langchain: Fix type warnings when passing Runnable as agent to AgentExecutor (#24750)
Fix for https://github.com/langchain-ai/langchain/issues/13075

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-22 14:02:02 -04:00
sslee
61228da1c4 fix typo (#25673) 2024-08-22 17:33:53 +00:00
Leonid Ganeline
d886f4e107 docs: integrations reference update 9 (#25511)
Added missed provider pages. Added missed references and descriptions.
2024-08-22 10:25:41 -07:00
Maurits Bos
3da752c7bb Update pyproject.toml of packageopenai-functions-agent-gmail to prevent ModuleOrPackageNotFound error (#25597)
I was trying to add this package using langchain-cli: `langchain app add
openai-functions-agent-gmail`, but when then try to build the whole
project using poetry or pip, it fails with the following
error:`poetry.core.masonry.utils.module.ModuleOrPackageNotFound: No
file/folder found for package openai-functions-agent-gmail`

This was fixed by modifying the pyproject.toml as in this commit

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-22 17:22:50 +00:00
Leonid Ganeline
624e0747b9 docs: integrations reference updates 10 (#25556)
Added missed provider pages. Added descriptions, links.
2024-08-22 10:21:54 -07:00
Erick Friis
9447925d94 cli: release 0.0.30 (#25672) 2024-08-22 10:21:19 -07:00
Leonid Ganeline
47adc7f32b docs: integrations reference updates 11 (#25598)
Added missed provider pages and links.
2024-08-22 10:19:17 -07:00
Dylan
16fc0a866e docs: Change Pull Request to Merge Request in GitLab notebook (#25649)
- **Description:** In GitLab we call these "merge requests" rather than
"pull requests" so I thought I'd go ahead and update the notebook.
- **Issue:** N/A
- **Dependencies:** none
- **Twitter handle:** N/A

Thanks for creating the tools and notebook to help people work with
GitLab. I thought I'd contribute some minor docs updates here.
2024-08-22 17:15:45 +00:00
mschoenb97IL
e499caa9cd community: Give more context on DeepInfra 500 errors (#25671)
Description: DeepInfra 500 errors have useful information in the text
field that isn't being exposed to the user. I updated the error message
to fix this.

As an example, this code

```
from langchain_community.chat_models import ChatDeepInfra
from langchain_core.messages import HumanMessage

model = "meta-llama/Meta-Llama-3-70B-Instruct"
deepinfra_api_token = "..."

model = ChatDeepInfra(model=model, deepinfra_api_token=deepinfra_api_token)

messages = [HumanMessage("All work and no play makes Jack a dull boy\n" * 9000)]
response = model.invoke(messages)
```

Currently gives this error:
```
langchain_community.chat_models.deepinfra.ChatDeepInfraException: DeepInfra Server: Error 500
```

This change would give the following error:
```
langchain_community.chat_models.deepinfra.ChatDeepInfraException: DeepInfra Server error status 500: {"error":{"message":"Requested input length 99009 exceeds maximum input length 8192"}}
```
2024-08-22 10:10:51 -07:00
Brian Sam-Bodden
29c873dd69 [docs]: update Redis (langchain-redis) documentation notebooks (vectorstore, llm caching, chat message history) (#25113)
- **Description:** Adds notebooks for Redis Partner Package
(langchain-redis)
- **Issue:** N/A
- **Dependencies:** None
- **Twitter handle:** `@bsbodden` and `@redis`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-22 11:53:02 -04:00
Rajendra Kadam
4ff2f4499e community: Refactor PebbloRetrievalQA (#25583)
**Refactor PebbloRetrievalQA**
  - Created `APIWrapper` and moved API logic into it.
  - Created smaller functions/methods for better readability.
  - Properly read environment variables.
  - Removed unused code.
  - Updated models

**Issue:** NA
**Dependencies:** NA
**tests**:  NA
2024-08-22 11:51:21 -04:00
Rajendra Kadam
1f1679e960 community: Refactor PebbloSafeLoader (#25582)
**Refactor PebbloSafeLoader**
  - Created `APIWrapper` and moved API logic into it.
  - Moved helper functions to the utility file.
  - Created smaller functions and methods for better readability.
  - Properly read environment variables.
  - Removed unused code.

**Issue:** NA
**Dependencies:** NA
**tests**:  Updated
2024-08-22 11:46:52 -04:00
maang-h
5e3a321f71 docs: Add ChatZhipuAI tool calling and structured output docstring (#25669)
- **Description:** Add `ChatZhipuAI` tool calling and structured output
docstring.
2024-08-22 10:34:41 -04:00
Krishna Kulkarni
820da64983 limit the most recent documents to fetch from MongoDB database. (#25435)
limit the most recent documents to fetch from MongoDB database.

Thank you for contributing to LangChain!

- [ ] **limit the most recent documents to fetch from MongoDB
database.**: "langchain_mongodb: limit the most recent documents to
fetch from MongoDB database."


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Added a doc_limit parameter which enables the limit
for the documents to fetch from MongoDB database
    - **Issue:** 
    - **Dependencies:** None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-22 10:33:45 -04:00
ccurme
67b6e6c2e3 docs[patch]: update AWS integration docs (#25631)
De-beta ChatBedrockConverse.
2024-08-22 09:22:03 -04:00
Swastik-Swarup-Dash
6247259438 update_readme (#25665)
Updated LangChain Expression Language (LCEL). for Easier Understanding
2024-08-22 13:18:32 +00:00
Noah Mayerhofer
0091947efd community: add retry for session expired exception in neo4j (#25660)
Description: The neo4j driver can raise a SessionExpired error, which is
considered a retriable error. If a query fails with a SessionExpired
error, this change retries every query once. This change will make the
neo4j integration less flaky.
Twitter handle: noahmay_
2024-08-22 13:07:36 +00:00
Erick Friis
e958f76160 docs: migration guide nits (#25600) 2024-08-22 04:24:34 +00:00
Yuki Watanabe
3981d736df databricks: Add partner package directory and ChatDatabricks implementation (#25430)
### Summary

Create `langchain-databricks` as a new partner packages. This PR does
not migrate all existing Databricks integration, but the package will
eventually contain:

* `ChatDatabricks` (implemented in this PR)
* `DatabricksVectorSearch`
* `DatabricksEmbeddings`
* ~`UCFunctionToolkit`~ (will be done after UC SDK work which
drastically simplify implementation)

Also, this PR does not add integration tests yet. This will be added
once the Databricks test workspace is ready.

Tagging @efriis as POC


### Tracker
[✍️] Create a package and imgrate ChatDatabricks
[ ] Migrate DatabricksVectorSearch, DatabricksEmbeddings, and their docs
~[ ] Migrate UCFunctionToolkit and its doc~
[ ] Add provider document and update README.md
[ ] Add integration tests and set up secrets (after moved to an external
package)
[ ] Add deprecation note to the community implementations.

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-21 17:19:28 -07:00
Scott Hurrey
fb1d67edf6 box: add retrievers and fix docs (#25633)
Thank you for contributing to LangChain!


**Description:** Adding `BoxRetriever` for langchain_box. This retriever
handles two use cases:
* Retrieve all documents that match a full-text search
* Retrieve the answer to a Box AI prompt as a Document

**Twitter handle:** @BoxPlatform


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-21 22:40:40 +00:00
Bagatur
4f347cbcb9 docs: link Versions in intro (#25640) 2024-08-21 21:02:25 +00:00
jakerachleff
4591bc0b01 Use 1.101 instead of 1.100 bc 1.100 was yanked (#25638) 2024-08-21 21:02:01 +00:00
Bagatur
f535e8a99e docs: ls similar examples header (#25642) 2024-08-21 20:50:24 +00:00
Erick Friis
766b650fdc chroma: add back fastapi optional dep (#25641) 2024-08-21 20:00:47 +00:00
Bagatur
9daff60698 docs: fix openai api ref (#25639) 2024-08-21 12:55:17 -07:00
Erick Friis
c8be0a9f70 partners/unstructured: release 0.1.2 (#25637) 2024-08-21 12:53:55 -07:00
Bagatur
f4b3c90886 docs: add prereq commas (#25626) 2024-08-21 12:38:53 -07:00
Christophe Bornet
b71ae52e65 [unstructured][security] Bump unstructured version (#25364)
This ensures version 0.15.7+ is pulled. 
This version of unstructured uses a version of NLTK >= 3.8.2 that has a
fix for a critical CVE:
https://github.com/advisories/GHSA-cgvx-9447-vcch
2024-08-21 12:25:24 -07:00
Bagatur
39c44817ae infra: test convert_message (#25632) 2024-08-21 18:24:06 +00:00
Bagatur
4feda41ab6 docs: ls how to link (#25624) 2024-08-21 10:18:08 -07:00
Bagatur
71c2ec6782 docs: langsmith few shot prereq (#25623) 2024-08-21 16:44:25 +00:00
Bagatur
628574b9c2 core[patch]: Release 0.2.34 (#25622) 2024-08-21 16:26:51 +00:00
Bagatur
0bc3845e1e core[patch]: support oai dicts as messages (#25621)
and update langsmtih example selector docs
2024-08-21 16:13:15 +00:00
Bagatur
a78843bb77 docs: how to use langsmith few shot (#25601)
Requires langsmith 0.1.101 release
2024-08-21 08:12:42 -07:00
ccurme
10a2ce2a26 together[patch]: use mixtral in standard integration tests (#25619)
Mistral 7B occasionally fails tool-calling tests. Updating to Mixtral
appears to improve this.
2024-08-21 14:26:25 +00:00
Mikhail Khludnev
d457d7d121 docs: Update qdrant.ipynb "BM25".lower() (#25616)
Otherwise I've got KeyError from `fastembeds`
2024-08-21 09:45:00 -04:00
Dristy Srivastava
b002702af6 [Community][minor]: Updating metadata with full_path in SharePoint loader (#25593)
- **Description:** Updating metadata for sharepoint loader with full
path i.e., webUrl
- **Issue:** NA
- **Dependencies:** NA
- **Tests:** NA
- **Docs** NA

Co-authored-by: dristy.cd <dristy@clouddefense.io>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-21 13:10:14 +00:00
ZhangShenao
34d0417eb5 Improvement[Doc] Improve api doc in of PineconeVectorStore (#25605)
Complete missing arguments in api doc of `PineconeVectorStore`.
2024-08-21 08:58:00 -04:00
wangda
e7d6b25653 docs:Correcting spelling mistakes (#25612) 2024-08-21 08:49:12 -04:00
Scott Hurrey
55fd2e2158 box: add langchain box package and DocumentLoader (#25506)
Thank you for contributing to LangChain!

-Description: Adding new package: `langchain-box`:

* `langchain_box.document_loaders.BoxLoader` — DocumentLoader
functionality
* `langchain_box.utilities.BoxAPIWrapper` — Box-specific code
* `langchain_box.utilities.BoxAuth` — Helper class for Box
authentication
* `langchain_box.utilities.BoxAuthType` — enum used by BoxAuth class

- Twitter handle: @boxplatform


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-21 02:23:43 +00:00
Bagatur
be27e1787f docs: few-shot conceptual guide (#25596)
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: jakerachleff <jake@langchain.dev>
2024-08-21 00:39:50 +00:00
Erick Friis
f878df404f partners/chroma: release 0.1.3 (#25599) 2024-08-20 23:24:32 +00:00
Erick Friis
60cf49a618 chroma: ban chromadb sdk versions 0.5.4 and 0.5.5 due to pydantic bug (#25586)
also remove some unused dependencies (fastapi) and unused test/lint/dev
dependencies (community, openai, textsplitters)

chromadb 0.5.4 introduced usage of `model_fields` which is pydantic v2
specific. also released in 0.5.5
2024-08-20 23:21:38 +00:00
Erick Friis
e37caa9b9a core: fix fallback context overwriting (#25550)
fixes #25337
2024-08-20 16:07:12 -07:00
Bagatur
3e296e39c8 docs: update examples in api ref (#25589) 2024-08-20 11:08:24 -07:00
Isaac Francisco
d40bdd6257 docs: more indexing of document loaders (#25500)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-08-20 17:54:42 +00:00
Bagatur
8a71f1b41b core[minor]: add langsmith document loader (#25493)
needs tests
2024-08-20 10:22:14 -07:00
Bob Merkus
8e3e532e7d docs: ollama doc update (toolcalling, install, notebook examples) (#25549)
The new `langchain-ollama` package seems pretty well implemented, but I
noticed the docs were still outdated so I decided to fix em up a bit.

- Llama3.1 was release on 23rd of July;
https://ai.meta.com/blog/meta-llama-3-1/
- Ollama supports tool calling since 25th of July;
https://ollama.com/blog/tool-support
- LangChain Ollama partner package was released 1st of august;
https://pypi.org/project/langchain-ollama/

**Problem**: Docs note langchain-community instead of langchain-ollama

**Solution**: Update docs to
https://python.langchain.com/v0.2/docs/integrations/chat/ollama/


**Problem**: OllamaFunctions is deprecated, as noted on
[Integrations](https://python.langchain.com/v0.2/docs/integrations/chat/ollama_functions/):
This was an experimental wrapper that attempts to bolt-on tool calling
support to models that do not natively support it. The [primary Ollama
integration](https://python.langchain.com/v0.2/docs/integrations/chat/ollama/) now
supports tool calling, and should be used instead.

**Solution**: Delete old notebook from repo, update the existing one
with @tool decorator + pydantic examples to the notebook


**Problem**: Llama3.1 was released while llama3-groq-tool-call fine-tune
Is noted in notebooks.

**Solution**: update docs + notebooks to llama3.1 (which has improved
tool calling support)


**Problem**: Install instructions are incomplete, there is no
information to download a model and/or run the Ollama server

**Solution**: Add simple instructions to start the ollama service and
pull model (for toolcalling)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-20 09:20:59 -04:00
Jabir
12e490ea56 Update azuresearch.py (#25577)
This will allow complextype metadata to be returned. the current
implementation throws error when dealing with nested metadata

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-20 12:53:30 +00:00
Abraham Omorogbe
498a482e76 docs: Adding Azure Database for PostgreSQL docs (#25560)
This PR to show support for the Azure Database for PostgreSQL Vector
Store and Memory

[Azure Database for PostgreSQL - Flexible
Server](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/service-overview)
[Azure Database for PostgreSQL pgvector
extension](https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/how-to-use-pgvector)

**Description:** Added vector store and memory usage documentation for
Azure Database for PostgreSQL
 **Twitter handle:** [@_aiabe](https://x.com/_aiabe)

---------

Co-authored-by: Abeomor <{ID}+{username}@users.noreply.github.com>
2024-08-20 12:01:32 +00:00
Leonid Ganeline
d324fd1821 docs: added Constitutional AI references (#25553)
Added reference to the source paper.
2024-08-20 08:00:58 -04:00
Bagatur
4bd005adb6 core[patch]: Allow bound models as token_counter in trim_messages (#25563) 2024-08-20 00:21:22 -07:00
Erick Friis
e01c6789c4 core,community: add beta decorator to missed GraphVectorStore extensions (#25562) 2024-08-19 17:29:09 -07:00
Erick Friis
dd2d094adc infra: remove huggingface from ci tree (#25559) 2024-08-19 22:48:26 +00:00
Bagatur
6b98207eda infra: test chat prompt ser/des (#25557) 2024-08-19 15:27:36 -07:00
ccurme
c5bf114c0f together, standard-tests: specify tool_choice in standard tests (#25548)
Here we allow standard tests to specify a value for `tool_choice` via a
`tool_choice_value` property, which defaults to None.

Chat models [available in
Together](https://docs.together.ai/docs/chat-models) have issues passing
standard tool calling tests:
- llama 3.1 models currently [appear to rely on user-side
parsing](https://docs.together.ai/docs/llama-3-function-calling) in
Together;
- Mixtral-8x7B and Mistral-7B (currently tested) consistently do not
call tools in some tests.

Specifying tool_choice also lets us remove an existing `xfail` and use a
smaller model in Groq tests.
2024-08-19 16:37:36 -04:00
maang-h
015ab91b83 community[patch]: Add ToolMessage for ChatZhipuAI (#25547)
- **Description:** Add ToolMessage for `ChatZhipuAI` to solve the issue
#25490
2024-08-19 11:26:38 -04:00
ccurme
5a3aaae6dc groq[patch]: update model used for llama tests (#25542)
`llama-3.1-8b-instant` often fails some of the tool calling standard
tests. Here we update to `llama-3.1-70b-versatile`.
2024-08-19 13:58:06 +00:00
Mohammad Mohtashim
75c3c81b8c [Community]: Fix - Open AI Whisper client.audio.transcriptions returning Text Object which raises error (#25271)
- **Description:** The following
[line](fd546196ef/libs/community/langchain_community/document_loaders/parsers/audio.py (L117))
in `OpenAIWhisperParser` returns a text object for some odd reason
despite the official documentation saying it should return `Transcript`
Instance which should have the text attribute. But for the example given
in the issue and even when I tried running on my own, I was directly
getting the text. The small PR accounts for that.
 - **Issue:** : #25218
 

I was able to replicate the error even without the GenericLoader as
shown below and the issue was with `OpenAIWhisperParser`

```python
parser = OpenAIWhisperParser(api_key="sk-fxxxxxxxxx",
                                            response_format="srt",
                                            temperature=0)

list(parser.lazy_parse(Blob.from_path('path_to_file.m4a')))
```
2024-08-19 09:36:42 -04:00
Thin red line 未来产品经理
0f7b8adddf fix issue: cannot use document_variable_name to override context in create_stuff_documents_chain (#25531)
…he prompt in the create_stuff_documents_chain

Thank you for contributing to LangChain!

- [ ] **PR title**: "langchain:add document_variable_name in the
function _validate_prompt in create_stuff_documents_chain"
 

- [ ] **PR message**: 
- **Description:** add document_variable_name in the function
_validate_prompt in create_stuff_documents_chain
- **Issue:** according to the description of
create_stuff_documents_chain function, the parameter
document_variable_name can be used to override the "context" in the
prompt, but in the function, _validate_prompt it still use DOCUMENTS_KEY
to check if it is a valid prompt, the value of DOCUMENTS_KEY is always
"context", so even through the user use document_variable_name to
override it, the code still tries to check if "context" is in the
prompt, and finally it reports error. so I use document_variable_name to
replace DOCUMENTS_KEY, the default value of document_variable_name is
"context" which is same as DOCUMENTS_KEY, but it can be override by
users.
    - **Dependencies:** none
    - **Twitter handle:** https://x.com/xjr199703


- [ ] **Add tests and docs**: none

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-19 13:33:19 +00:00
ccurme
09c0823c3a docs: update summarization guides (#25408) 2024-08-19 13:29:25 +00:00
maang-h
32f5147523 docs: Fix QianfanLLMEndpoint and Tongyi input text (#25529)
- **Description:** Fix `QianfanLLMEndpoint` and `Tongyi` input text.
2024-08-19 09:23:09 -04:00
ZhangShenao
4255a30f20 Improvement[Community] Improve api doc for SingleFileFacebookMessengerChatLoader (#25536)
Delete redundant args in api doc
2024-08-19 09:00:21 -04:00
Bagatur
49dea06af1 docs: fix Agent deprecation msg (#25464) 2024-08-18 19:15:52 +00:00
Hassan El Mghari
937b3904eb together[patch]: update base url (#25524)
Updated the Together base URL from `.ai` to `.xyz` since some customers
have reported problems with `.ai`.
2024-08-18 10:48:30 -07:00
gbaian10
bda3becbe7 docs: add prompt to install beautifulsoup4. (#25518)
fix: #25482

- **Description:**
Add a prompt to install beautifulsoup4 in places where `from
langchain_community.document_loaders import WebBaseLoader` is used.
- **Issue:** #25482
2024-08-17 23:23:24 -07:00
gbaian10
f6e6a17878 docs: add prompt to install nltk (#25519)
fix: #25473 

- **Description:** add prompt to install nltk
- **Issue:** #25473
2024-08-17 23:22:49 -07:00
Chengzu Ou
c1bd4e05bc docs: fix Databricks Vector Search demo notebook (#25504)
**Description:** This PR fixes an issue in the demo notebook of
Databricks Vector Search in "Work with Delta Sync Index" section.

**Issue:** N/A

**Dependencies:** N/A

---------

Co-authored-by: Chengzu Ou <chengzu.ou@databrick.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-16 20:24:30 +00:00
Isaac Francisco
a2e90a5a43 add embeddings integration tests (#25508) 2024-08-16 13:20:37 -07:00
Bagatur
a06818a654 openai[patch]: update core dep (#25502) 2024-08-16 18:30:17 +00:00
Bagatur
df98552b6f core[patch]: Release 0.2.33 (#25498) 2024-08-16 11:18:54 -07:00
ccurme
b83f1eb0d5 core, partners: implement standard tracing params for LLMs (#25410) 2024-08-16 13:18:09 -04:00
Bagatur
9f0c76bf89 openai[patch]: Release 0.1.22 (#25496) 2024-08-16 16:53:04 +00:00
ccurme
01ecd0acba openai[patch]: fix json mode for Azure (#25488)
https://github.com/langchain-ai/langchain/issues/25479
https://github.com/langchain-ai/langchain/issues/25485

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-16 09:50:50 -07:00
Eugene Yurtsev
1fd1c1dca5 docs: use .invoke rather than __call__ in openai integration notebook (#25494)
Documentation should be using .invoke()
2024-08-16 15:59:18 +00:00
Bagatur
253ceca76a docs: fix mimetype parser docstring (#25463) 2024-08-15 16:16:52 -07:00
Eugene Yurtsev
e18511bb22 core[minor], anthropic[patch]: Upgrade @root_validator usage to be consistent with pydantic 2 (#25457)
anthropic: Upgrade `@root_validator` usage to be consistent with
pydantic 2
core: support looking up multiple keys from env in from_env factory
2024-08-15 20:09:34 +00:00
Eugene Yurtsev
34da8be60b pinecone[patch]: Upgrade @root_validators to be consistent with pydantic 2 (#25453)
Upgrade root validators for pydantic 2 migration
2024-08-15 19:45:14 +00:00
Eugene Yurtsev
b297af5482 voyageai[patch]: Upgrade root validators for pydantic 2 (#25455)
Update @root_validators to be consistent with pydantic 2 semantics
2024-08-15 15:30:41 -04:00
Eugene Yurtsev
4cdaca67dc ai21[patch]: Upgrade @root_validators for pydantic 2 migration (#25454)
Upgrade @root_validators usage to match pydantic 2 semantics
2024-08-15 14:54:08 -04:00
Eugene Yurtsev
d72a08a60d groq[patch]: Update root validators for pydantic 2 migration (#25402) 2024-08-15 18:46:52 +00:00
Leonid Ganeline
8eb63a609e docs: arxiv page update (#25450)
Added `arxive` papers that use `LangGraph` or `LangSmith`. Improved the
page formatting.
2024-08-15 14:30:35 -04:00
Isaac Francisco
5150ec3a04 [experimental]: minor fix to open assistants code (#24682) 2024-08-15 17:50:57 +00:00
Bagatur
2b4fbcb4b4 docs: format oai embeddings docstring (#25448) 2024-08-15 16:57:54 +00:00
Eugene Yurtsev
eb3870e9d8 fireworks[patch]: Upgrade @root_validators to be pydantic 2 compliant (#25443)
Update @root_validators to be pydantic 2 compliant
2024-08-15 16:56:48 +00:00
William FH
75ae585deb Merge support for group manager (#25360) 2024-08-15 09:56:31 -07:00
Eugene Yurtsev
b7c070d437 docs[patch]: Update code that checks API keys (#25444)
Check whether the API key is already in the environment

Update:

```python
import getpass
import os

os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
os.environ["DATABRICKS_TOKEN"] = getpass.getpass("Enter your Databricks access token: ")
```

To:

```python
import getpass
import os

os.environ["DATABRICKS_HOST"] = "https://your-workspace.cloud.databricks.com"
if "DATABRICKS_TOKEN" not in os.environ:
    os.environ["DATABRICKS_TOKEN"] = getpass.getpass(
        "Enter your Databricks access token: "
    )
```

grit migration:

```
engine marzano(0.1)
language python

`os.environ[$Q] = getpass.getpass("$X")` as $CHECK where {
    $CHECK <: ! within if_statement(),
    $CHECK => `if $Q not in os.environ:\n    $CHECK`
}
```
2024-08-15 12:52:37 -04:00
Bagatur
60b65528c5 docs: fix api ref mod links in pkg page (#25447) 2024-08-15 16:52:12 +00:00
Eugene Yurtsev
2ef9d12372 mistralai[patch]: Update more @root_validators for pydantic 2 compatibility (#25446)
Update @root_validators in mistralai integration for pydantic 2 compatibility
2024-08-15 12:44:42 -04:00
Eugene Yurtsev
6910b0b3aa docs[patch]: Fix integration notebook for Fireworks llm (#25442)
Fix integration notebook
2024-08-15 12:42:33 -04:00
Eugene Yurtsev
831708beb7 together[patch]: Update @root_validator for pydantic 2 compatibility (#25423)
This PR updates usage of @root_validator to be compatible with pydantic 2.
2024-08-15 11:27:42 -04:00
Eugene Yurtsev
a114255b82 ai21[patch]: Update @root_validators for pydantic2 migration (#25401)
Update @root_validators for pydantic 2 migration.
2024-08-15 11:26:44 -04:00
Eugene Yurtsev
6f68c8d6ab mistralai[patch]: Update root validator for compatibility with pydantic 2 (#25403) 2024-08-15 11:26:24 -04:00
ccurme
8afbab4cf6 langchain[patch]: deprecate various chains (#25310)
- [x] NatbotChain: move to community, deprecate langchain version.
Update to use `prompt | llm | output_parser` instead of LLMChain.
- [x] LLMMathChain: deprecate + add langgraph replacement example to API
ref
- [x] HypotheticalDocumentEmbedder (retriever): update to use `prompt |
llm | output_parser` instead of LLMChain
- [x] FlareChain: update to use `prompt | llm | output_parser` instead
of LLMChain
- [x] ConstitutionalChain: deprecate + add langgraph replacement example
to API ref
- [x] LLMChainExtractor (document compressor): update to use `prompt |
llm | output_parser` instead of LLMChain
- [x] LLMChainFilter (document compressor): update to use `prompt | llm
| output_parser` instead of LLMChain
- [x] RePhraseQueryRetriever (retriever): update to use `prompt | llm |
output_parser` instead of LLMChain
2024-08-15 10:49:26 -04:00
Luke
66e30efa61 experimental: Fix divide by 0 error (#25439)
Within the semantic chunker, when calling `_threshold_from_clusters`
there is the possibility for a divide by 0 error if the
`number_of_chunks` is equal to the length of `distances`.

Fix simply implements a check if these values match to prevent the error
and enable chunking to continue.
2024-08-15 14:46:30 +00:00
ccurme
ba167dc158 community[patch]: update connection string in azure cosmos integration test (#25438) 2024-08-15 14:07:54 +00:00
Eugene Yurtsev
44f69063b1 docs[patch]: Fix a few typos in the chat integration docs for TogetherAI (#25424)
Fix a few minor typos
2024-08-15 09:48:36 -04:00
Isaac Francisco
f18b77fd59 [docs]: pdf loaders (#25425) 2024-08-14 21:44:57 -07:00
Isaac Francisco
966b408634 [docs]: doc loader changes (#25417) 2024-08-14 19:46:33 -07:00
ccurme
bd261456f6 langchain: bump core to 0.2.32 (#25421) 2024-08-15 00:00:42 +00:00
Bagatur
ec8ffc8f40 core[patch]: Release 0.2.32 (#25420) 2024-08-14 15:56:56 -07:00
Bagatur
2494cecabf core[patch]: tool import fix (#25419) 2024-08-14 22:54:13 +00:00
ccurme
df632b8cde langchain: bump min core version (#25418) 2024-08-14 22:51:35 +00:00
ccurme
1050e890c6 langchain: release 0.2.14 (#25416)
Fixes https://github.com/langchain-ai/langchain/issues/25413
2024-08-14 22:29:39 +00:00
Isaac Francisco
c4779f5b9c [docs]: sitemaploader update (#25363) 2024-08-14 15:27:40 -07:00
gbaian10
0a99935794 docs: remove the extra period in docstring (#25414)
Remove the period after the hyperlink in the docstring of
BaseChatOpenAI.with_structured_output.

I have repeatedly copied the extra period at the end of the hyperlink,
which results in a "Page not found" page when pasted into the browser.
2024-08-14 18:07:15 -04:00
Isaac Francisco
63aba3fe5b [docs]: link fix directory loader (#25411) 2024-08-14 20:58:54 +00:00
Bagatur
dc80be5efe docs: fix deprecated functions table (#25409) 2024-08-14 12:25:39 -07:00
Erick Friis
ab29ee79a3 docs: fix tool index (#25404) 2024-08-14 18:36:41 +00:00
Werner van der Merwe
1d3f7231b8 fix: typo where github should be gitlab (#25397)
**PR title**: "GitLabToolkit: fix typo"
    - **Description:** fix typo where GitHub should have been GitLab
    - **Dependencies:** None
2024-08-14 18:36:25 +00:00
Bagatur
a58d4ba340 core[patch]: Release 0.2.31 (#25388) 2024-08-14 11:26:49 -07:00
Bagatur
d178fb9dc3 docs: fix api ref package tables (#25400) 2024-08-14 10:40:16 -07:00
Bagatur
414154fa59 experimental[patch]: refactor rl chain structure (#25398)
can't have a class and function with same name but different
capitalization in same file for api reference building
2024-08-14 17:09:43 +00:00
Flávio Knob
94c9cb7321 Update document_loader_custom.ipynb (#25393)
Fix typo

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-14 12:33:21 -04:00
Jacob Lee
012929551c docs[patch]: Hide deprecated integration pages (#25389) 2024-08-14 09:17:39 -07:00
Bagatur
63c483ea01 standard-tests: import fix (#25395) 2024-08-14 09:13:56 -07:00
Bagatur
eec7bb4f51 anthropic[patch]: Release 0.1.23 (#25394) 2024-08-14 09:03:39 -07:00
Flávio Knob
f0f125dac7 Update document_loader_custom.ipynb (#25391)
Fix typo and some `callout` tags

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-14 15:07:42 +00:00
Eugene Yurtsev
f4196f1fb8 ollama[patch]: Update extra in ollama package (#25383)
Backwards compatible change that converts pydantic extras to literals
which is consistent with pydantic 2 usage.
2024-08-14 10:30:01 -04:00
Chengyu Yan
d0ad713937 core: fix issue#24660, slove error messages about ValueError when use model with history (#25183)
- **Description:**
This PR will slove error messages about `ValueError` when use model with
history.
Detail in #24660.
#22933 causes that
`langchain_core.runnables.history.RunnableWithMessageHistory._get_output_messages`
miss type check of `output_val` if `output_val` is `False`. After
running `RunnableWithMessageHistory._is_not_async`, `output` is `False`.

249945a572/libs/core/langchain_core/runnables/history.py (L323-L334)

15a36dd0a2/libs/core/langchain_core/runnables/history.py (L461-L471)
~~I suggest that `_get_output_messages` return empty list when
`output_val == False`.~~

- **Issue**:
  - #24660

- **Dependencies:**: No Change.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-08-14 14:26:22 +00:00
Jacob Lee
ddd7919f6a docs[patch]: Add conceptual guide links to integration index pages (#25387) 2024-08-14 07:14:24 -07:00
Bagatur
493e474063 docs: udpated api reference (#25172)
- Move the API reference into the vercel build
- Update api reference organization and styling
2024-08-14 07:00:17 -07:00
Leonid Ganeline
4a812e3193 docs: integrations references update (#25217)
Added missed provider pages. Fixed formats and added descriptions and
links.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-14 13:58:38 +00:00
Eugene Yurtsev
5f5e8c9a60 huggingface[patch], pinecone[patch], fireworks[patch], mistralai[patch], voyageai[patch], togetherai[path]: convert Pydantic extras to literals (#25384)
Backwards compatible change that converts pydantic extras to literals
which is consistent with pydantic 2 usage.

- fireworks
- voyage ai
- mistralai
- mistral ai
- together ai
- huggigng face
- pinecone
2024-08-14 09:55:30 -04:00
Eugene Yurtsev
d00176e523 openai[patch]: Update extra to match pydantic 2 (#25382)
Backwards compatible change that converts pydantic extras to literals
which is consistent with pydantic 2 usage.
2024-08-14 09:55:18 -04:00
Eugene Yurtsev
dc51cc5690 core[minor]: Prevent PydanticOutputParser from encoding schema as ASCII (#25386)
This allows users to provide parameter descriptions in the pydantic
models in other languages.

Continuing this PR: https://github.com/langchain-ai/langchain/pull/24809
2024-08-14 13:54:31 +00:00
ccurme
27690506d0 multiple: update removal targets (#25361) 2024-08-14 09:50:39 -04:00
Ikko Eltociear Ashimine
4029f5650c docs: update clarifai.ipynb (#25373)
Intialize -> Initialize
2024-08-14 09:20:17 -04:00
Erick Friis
10e6725a7e docs: tools index table (#25370) 2024-08-14 02:38:03 +00:00
Harrison Chase
967b6f21f6 docs: improve document loaders index (#25365)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-14 01:48:48 +00:00
Erick Friis
4a78be7861 docs: remove sidebar comment (#25369) 2024-08-14 01:47:12 +00:00
Eugene Yurtsev
d6c180996f docs[patch]: Fix typo in CohereEmbeddings integration docs (#25367)
Fix typo
2024-08-14 01:18:54 +00:00
Eugene Yurtsev
93dcc47463 docs: Partial integration update for cohere embeddings (#25250)
This can be finished after the following issue is resolved:

https://github.com/langchain-ai/langchain-cohere/issues/81

Related to: https://github.com/langchain-ai/langchain/issues/24856

```json
[
   {
      "provider": "cohere",
      "js":  true,
      "local": false,
     "serializable": false,
   }
]
```

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-08-14 00:53:13 +00:00
Eugene Yurtsev
27def6bddb docs[patch]: Update integration docs for AzureOpenAIEmbeddings (#25311)
https://github.com/langchain-ai/langchain/issues/24856

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:33:13 +00:00
Eugene Yurtsev
b4e3bdb714 docs: Update nomic AI embeddings integration docs (#25308)
Issue: https://github.com/langchain-ai/langchain/issues/24856

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:32:07 +00:00
Eugene Yurtsev
f82c3f622a docs: Update AI21Embeddings Integration docs (#25298)
Update AI21 Integration docs

Issue: https://github.com/langchain-ai/langchain/issues/24856

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:30:16 +00:00
Eugene Yurtsev
d55d99222b docs: update integration docs for mistral ai embedding model (#25253)
Related issue: https://github.com/langchain-ai/langchain/issues/24856

```json
[
   {
      "provider": "mistralai",
      "js":  true,
      "local": false,
     "serializable": false,
    "native_async": true
   }
]
```

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:25:36 +00:00
Eugene Yurtsev
0f6217f507 docs: together ai embeddings integration docs (#25252)
Update together AI embedding integration docs

Related issue: https://github.com/langchain-ai/langchain/issues/24856

```json
[
   {
      "provider": "together",
      "js":  true,
      "local": false,
     "serializable": false,
   }
]
```

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:24:02 +00:00
Eugene Yurtsev
8645a49f31 docs: Update integration docs for OllamaEmbeddingsModel (#25314)
Issue: https://github.com/langchain-ai/langchain/issues/24856

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:23:05 +00:00
Eugene Yurtsev
a4ef830480 docs: update integration docs for openai embeddings (#25249)
Related issue: https://github.com/langchain-ai/langchain/issues/24856

```json
   {
      "provider": "openai",
      "js":  true,
      "local": false,
     "serializable": false,
"async_native": true
  }
```

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-08-14 00:21:36 +00:00
Eugene Yurtsev
b1aed44540 docs: Updating integration docs for Fireworks Embeddings (#25247)
Providers:
* fireworks

See related issue:
* https://github.com/langchain-ai/langchain/issues/24856

Features:

```json
[
   {
      "provider": "fireworks",
      "js":  true,
      "local": false,
     "serializable": false,
   }



]


```

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-08-13 17:04:18 -07:00
Isaac Francisco
f4ffd692a3 [docs]: standardize doc loader doc strings (#25325) 2024-08-13 23:18:56 +00:00
Isaac Francisco
e0bbb81d04 [docs]: standardize tool docstrings (#25351) 2024-08-13 16:10:00 -07:00
Erick Friis
d5b548b4ce docs: index pages, sidebars (#25316) 2024-08-13 15:52:51 -07:00
Isaac Francisco
0478f7f5e4 [docs]: LLM integration pages (#25005) 2024-08-13 14:50:45 -07:00
thedavgar
9d08369442 community: fix AzureSearch vectorstore asyncronous methods (#24921)
**Description**
Fix the asyncronous methods to retrieve documents from AzureSearch
VectorStore. The previous changes from [this
commit](ffe6ca986e)
create a similar code for the syncronous methods and the asyncronous
ones but the asyncronous client return an asyncronous iterator
"AsyncSearchItemPaged" as said in the issue #24740.
To solve this issue, the syncronous iterators in asyncronous methods
where changed to asyncronous iterators.

@chrislrobert said in [this
comment](https://github.com/langchain-ai/langchain/issues/24740#issuecomment-2254168302)
that there was a still a flaw due to `with` blocks that close the client
after each call. I removed this `with` blocks in the `async_client`
following the same pattern as the sync `client`.

In order to close up the connections, a __del__ method is included to
gently close up clients once the vectorstore object is destroyed.

**Issue:** #24740 and #24064
**Dependencies:** No new dependencies for this change

**Example notebook:** I created a notebook just to test the changes work
and gives the same results as the syncronous methods for vector and
hybrid search. With these changes, the asyncronous methods in the
retriever work as well.

![image](https://github.com/user-attachments/assets/697e431b-9d7f-4d0d-b205-59d051ac2b67)


**Lint and test**: Passes the tests and the linter
2024-08-13 14:20:51 -07:00
Isaac Francisco
6bc451b942 [docs]: merge tool/toolkit duplicates (#25197) 2024-08-13 12:19:17 -07:00
Fedor Nikolaev
2b15518c5f community: add args_schema to SearxSearchResults tool (#25350)
This adds `args_schema` member to `SearxSearchResults` tool. This member
is already present in the `SearxSearchRun` tool in the same file.

I was having `TypeError: Type is not JSON serializable:
AsyncCallbackManagerForToolRun` being thrown in langserve playground
when I was using `SearxSearchResults` tool as a part of chain there.
This fixes the issue, so the error is not raised anymore.

This is a example langserve app that was giving me the error, but it
works properly after the proposed fix:
```python
#!/usr/bin/env python

from fastapi import FastAPI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI
from langchain_community.utilities import SearxSearchWrapper
from langchain_community.tools.searx_search.tool import SearxSearchResults
from langserve import add_routes

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""
prompt = ChatPromptTemplate.from_template(template)
model = ChatOpenAI()

s = SearxSearchWrapper(searx_host="http://localhost:8080")

search = SearxSearchResults(wrapper=s)

search_chain = (
    {"context": search, "question": RunnablePassthrough()}
    | prompt
    | model
    | StrOutputParser()
)

app = FastAPI()

add_routes(
    app,
    search_chain,
    path="/chain",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)
```
2024-08-13 18:26:09 +00:00
Matt Kandler
b6df3405fb docs: Fix broken link to Runhouse documentation (#25349)
- **Description:** Runhouse recently migrated from Read the Docs to a
self-hosted solution. This PR updates a broken link from the old docs to
www.run.house/docs. Also changed "The Runhouse" to "Runhouse" (it's
cleaner).
- **Issue:** None
- **Dependencies:** None
2024-08-13 18:18:19 +00:00
maang-h
089f5e6cad Standardize SparkLLM (#25239)
- **Description:** Standardize SparkLLM, include:
  - docs, the issue #24803 
  - to support stream
  - update api url
  - model init arg names, the issue #20085
2024-08-13 09:50:12 -04:00
Leonid Ganeline
35e2230f56 docs: integrationsreferences update (#25322)
Added missed provider pages. Fixed formats and added descriptions and
links.
2024-08-13 09:29:51 -04:00
Chen Xiabin
24155aa1ac qianfan generate/agenerate with usage_metadata (#25332) 2024-08-13 09:24:41 -04:00
Christophe Bornet
ebbe609193 Add README for astradb package (#25345)
Similar to
https://github.com/langchain-ai/langchain/blob/master/libs/partners/ibm/README.md
2024-08-13 09:17:23 -04:00
Eugene Yurtsev
f679ed72ca ollama[patch]: Update API Reference for ollama embeddings (#25315)
Update API reference for OllamaEmbeddings
Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 21:31:48 -04:00
Erick Friis
2907ab2297 community: release 0.2.12 (#25324) 2024-08-12 23:30:27 +00:00
Erick Friis
06f8bd9946 langchain: release 0.2.13 (#25323) 2024-08-12 22:24:06 +00:00
Erick Friis
252f0877d1 core: release 0.2.30 (#25321) 2024-08-12 22:01:24 +00:00
Eugene Yurtsev
217a915b29 openai: Update API Reference docs for AzureOpenAI Embeddings (#25312)
Update AzureOpenAI Embeddings docs
2024-08-12 19:41:18 +00:00
Eugene Yurtsev
056c7c2983 core[patch]: Update API reference for fake embeddings (#25313)
Issue: https://github.com/langchain-ai/langchain/issues/24856

Using the same template for the fake embeddings in langchain_core as
used in the integrations.
2024-08-12 19:40:05 +00:00
Ben Chambers
1adc161642 community: kwargs for CassandraGraphVectorStore (#25300)
- **Description:** pass kwargs from CassandraGraphVectorStore to
underlying store

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-12 18:01:29 +00:00
Hassan-Memon
deb27d8970 docs: remove unused imports in Conversational RAG tutorial (#25297)
Cleaned up the "Tying it Together" section of the Conversational RAG
tutorial by removing unnecessary imports that were not used. This
reduces confusion and makes the code more concise.

Thank you for contributing to LangChain!

PR title: docs: remove unused imports in Conversational RAG tutorial

PR message:

Description: Removed unnecessary imports from the "Tying it Together"
section of the Conversational RAG tutorial. These imports were not used
in the code and created confusion. The updated code is now more concise
and easier to understand.
Issue: N/A
Dependencies: None
LinkedIn handle: [Hassan
Memon](https://www.linkedin.com/in/hassan-memon-a109b3257/)
Add tests and docs:

Hi [LangChain Team Member’s Name],

I hope you're doing well! I’m thrilled to share that I recently made my
second contribution to the LangChain project. If possible, could you
give me a shoutout on LinkedIn? It would mean a lot to me and could help
inspire others to contribute to the community as well.

Here’s my LinkedIn profile: [Hassan
Memon](https://www.linkedin.com/in/hassan-memon-a109b3257/).

Thank you so much for your support and for creating such a great
platform for learning and collaboration. I'm looking forward to
contributing more in the future!

Best regards,
Hassan Memon
2024-08-12 13:49:55 -04:00
gbaian10
5efd0fe9ae docs: Change SqliteSaver to MemorySaver (#25306)
fix: #25137

`SqliteSaver.from_conn_string()` has been changed to a `contextmanager`
method in `langgraph >= 0.2.0`, the original usage is no longer
applicable.

Refer to
<https://github.com/langchain-ai/langgraph/pull/1271#issue-2454736415>
modification method to replace `SqliteSaver` with `MemorySaver`.
2024-08-12 13:45:32 -04:00
Eugene Yurtsev
1c9917dfa2 fireworks[patch]: Fix doc-string for API Referenmce (#25304) 2024-08-12 17:16:13 +00:00
Eugene Yurtsev
ccff1ba8b8 ai21[patch]: Update API reference documentation (#25302)
Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 13:15:27 -04:00
Eugene Yurtsev
53ee5770d3 fireworks: Add APIReference for the FireworksEmbeddings model (#25292)
Add API Reference documentation for the FireworksEmbedding model.

Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 13:13:43 -04:00
Eugene Yurtsev
8626abf8b5 togetherai[patch]: Update API Reference for together AI embeddings model (#25295)
Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 17:12:28 +00:00
Eugene Yurtsev
1af8456a2c mistralai[patch]: Docs Update APIReference for MistralAIEmbeddings (#25294)
Update API Reference for MistralAI embeddings

Issue: https://github.com/langchain-ai/langchain/issues/24856
2024-08-12 15:25:37 +00:00
Eugene Yurtsev
0a3500808d openai[patch]: Docs fix RST formatting in OpenAIEmbeddings (#25293) 2024-08-12 11:24:35 -04:00
Eugene Yurtsev
ee8a585791 openai[patch]: Add API Reference docs to OpenAIEmbeddings (#25290)
Issue: [24856](https://github.com/langchain-ai/langchain/issues/24856)
2024-08-12 14:53:51 +00:00
ccurme
e77eeee6ee core[patch]: add standard tracing params for retrievers (#25240) 2024-08-12 14:51:59 +00:00
Mohammad Mohtashim
9927a4866d [Community] - Added bind_tools and with_structured_output for ChatZhipuAI (#23887)
- **Description:** This PR implements the `bind_tool` functionality for
ChatZhipuAI as requested by the user. ChatZhipuAI models support tool
calling according to the `OpenAI` tool format, as outlined in their
official documentation [here](https://open.bigmodel.cn/dev/api#glm-4).
- **Issue:**  ##23868

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-12 14:11:43 +00:00
Hassan-Memon
420534c8ca docs: Replaced SqliteSaver with MemorySaver and updated installation instru… (#25285)
…ctions to match LangGraph v2 documentation. Corrected code snippet to
prevent validation errors.

Here's how you can fill out the provided template for your pull request:

---

**Thank you for contributing to LangChain!**

- [ ] **PR title**: `docs: update checkpointer example in Conversational
RAG tutorial`

- [ ] **PR message**:
- **Description:** Updated the Conversational RAG tutorial to correct
the checkpointer example by replacing `SqliteSaver` with `MemorySaver`.
Added installation instructions for `langgraph-checkpoint-memory` to
match LangGraph v2 documentation and prevent validation errors.
    - **Issue:** N/A
    - **Dependencies:** `langgraph-checkpoint-memory`
    - **Twitter handle:** N/A

- [ ] **Add tests and docs**: 
  1. No new integration tests are required.
  2. Updated documentation in the Conversational RAG tutorial.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: [LangChain Contribution
Guidelines](https://python.langchain.com/docs/contributing/)

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-12 09:24:51 -04:00
Yunus Emre Özdemir
794f28d4e2 docs: document upstash vector namespaces (#25289)
**Description:** This PR rearranges the examples in Upstash Vector
integration documentation to describe how to use namespaces and improve
the description of metadata filtering.
2024-08-12 09:17:11 -04:00
JasonJ
f28ae20b81 docs: pip install bug fixed (#25287)
Thank you for contributing to LangChain!
- **Description:** Fixing package install bug in cookbook
- **Issue:** zsh:1: no matches found: unstructured[all-docs]
- **Dependencies:** N/A
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!



If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-12 05:12:44 +00:00
Soichi Sumi
9f0eda6a18 docs: Fix link for API reference of Gmail Toolkit (#25286)
- **Description:** Fix link for API reference of Gmail Toolkit
- **Issue:** I've just found this issue while I'm reading the doc
- **Dependencies:** N/A
- **Twitter handle:** [@soichisumi](https://x.com/soichisumi)

TODO: If no one reviews your PR within a few days, please @-mention one
of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-12 05:12:31 +00:00
Anush
472527166f qdrant: Update API reference link and install command (#25245)
## Description

As the title goes. The current API reference links to the deprecated
class.
2024-08-11 16:54:14 -04:00
Aryan Singh
074fa0db73 docs: Fixed grammer error in functions.ipynb (#25255)
**Description**: Grammer Error in functions.ipynb
**Issue**: #25222
2024-08-11 20:53:27 +00:00
gbaian10
4fd1efc48f docs: update "Build an Agent" Installation Hint in agents.ipynb (#25263)
fix #25257
2024-08-11 16:51:34 -04:00
gbaian10
aa2722cbe2 docs: update numbering of items in docstring (#25267)
A problem similar to #25093 .

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-11 20:50:24 +00:00
Maddy Adams
a82c0533f2 langchain: default to langsmith sdk for pulling prompts, fallback to langchainhub (#24156)
**Description:** Deprecating langchainhub, replacing with langsmith sdk
2024-08-11 13:30:52 -07:00
maang-h
bc60cddc1b docs: Fix ChatBaichuan, QianfanChatEndpoint, ChatSparkLLM, ChatZhipuAI docs (#25265)
- **Description:** Fix some chat models docs, include:
  - ChatBaichuan
  - QianfanChatEndpoint
  - ChatSparkLLM
  - ChatZhipuAI
2024-08-11 16:23:55 -04:00
ZhangShenao
43deed2a95 Improvement[Embeddings] Add dimension support to ZhipuAIEmbeddings (#25274)
- In the in ` embedding-3 ` and later models of Zhipu AI, it is
supported to specify the dimensions parameter of Embedding. Ref:
https://bigmodel.cn/dev/api#text_embedding-3 .
- Add test case for `embedding-3` model by assigning dimensions.
2024-08-11 16:20:37 -04:00
maang-h
9cd608efb3 docs: Standardize OpenAI Docs (#25280)
- **Description:** Standardize OpenAI Docs
- **Issue:** the issue #24803

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-11 20:20:16 +00:00
Bagatur
fd546196ef openai[patch]: Release 0.1.21 (#25269) 2024-08-10 16:37:31 -07:00
Eugene Yurtsev
6dd9f053e3 core[patch]: Deprecating beta upsert APIs in vectorstore (#25069)
This PR deprecates the beta upsert APIs in vectorstore.

We'll introduce them in a V2 abstraction instead to keep the existing
vectorstore implementations lighter weight.

The main problem with the existing APIs is that it's a bit more
challenging to
implement the correct behavior w/ respect to IDs since ID can be present
in
both the function signature and as an optional attribute on the document
object.

But VectorStores that pass the standard tests should have implemented
the semantics properly!

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-09 17:17:36 -04:00
Bagatur
ca9dcee940 standard-tests[patch]: test ToolMessage.status="error" (#25210) 2024-08-09 13:00:14 -07:00
Eugene Yurtsev
dadb6f1445 cli[patch]: Update integration template for embedding models (#25248)
Update integration template for embedding models
2024-08-09 14:28:57 -04:00
Eugene Yurtsev
b6f0174bb9 community[patch],core[patch]: Update EdenaiTool root_validator and add unit test in core (#25233)
This PR gets rid `root_validators(allow_reuse=True)` logic used in
EdenAI Tool in preparation for pydantic 2 upgrade.
- add another test to secret_from_env_factory
2024-08-09 15:59:27 +00:00
blueoom
c3ced4c6ce core[patch]: use time.monotonic() instead time.time() in InMemoryRateLimiter
**Description:**

The get time point method in the _consume() method of
core.rate_limiters.InMemoryRateLimiter uses time.time(), which can be
affected by system time backwards. Therefore, it is recommended to use
the monotonically increasing monotonic() to obtain the time

```python
        with self._consume_lock:
            now = time.time()  # time.time() -> time.monotonic()

            # initialize on first call to avoid a burst
            if self.last is None:
                self.last = now

            elapsed = now - self.last  # when use time.time(), elapsed may be negative when system time backwards

```
2024-08-09 11:31:20 -04:00
Eugene Yurtsev
bd6c31617e community[patch]: Remove more @allow_reuse=True validators (#25236)
Remove some additional allow_reuse=True usage in @root_validators.
2024-08-09 11:10:27 -04:00
Eugene Yurtsev
6e57aa7c36 community[patch]: Remove usage of @root_validator(allow_reuse=True) (#25235)
Remove usage of @root_validator(allow_reuse=True)
2024-08-09 10:57:42 -04:00
thiswillbeyourgithub
a2b4c33bd6 community[patch]: FAISS: ValueError mentions normalize_score_fn isntead of relevance_score_fn (#25225)
Thank you for contributing to LangChain!

- [X] **PR title**: "community: fix valueerror mentions wrong argument
missing"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [X] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** when faiss.py has a None relevance_score_fn it raises
a ValueError that says a normalize_fn_score argument is needed.

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-09 14:40:29 +00:00
ccurme
4825dc0d76 langchain[patch]: add deprecations (#24792) 2024-08-09 10:34:43 -04:00
ccurme
02300471be langchain[patch]: extended-tests: drop logprobs from OAI expected config (#25234)
Following https://github.com/langchain-ai/langchain/pull/25229
2024-08-09 14:23:11 +00:00
Shivendra Soni
66b7206ab6 community: Add llm-extraction option to FireCrawl Document Loader (#25231)
**Description:** This minor PR aims to add `llm_extraction` to Firecrawl
loader. This feature is supported on API and PythonSDK, but the
langchain loader omits adding this to the response.
**Twitter handle:** [scalable_pizza](https://x.com/scalablepizza)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-09 13:59:10 +00:00
blaufink
c81c77b465 partners: fix of issue #24880 (#25229)
- Description: As described in the related issue: There is an error
occuring when using langchain-openai>=0.1.17 which can be attributed to
the following PR: #23691
Here, the parameter logprobs is added to requests per default.
However, AzureOpenAI takes issue with this parameter as stated here:
https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/chatgpt?tabs=python-new&pivots=programming-language-chat-completions
-> "If you set any of these parameters, you get an error."
Therefore, this PR changes the default value of logprobs parameter to
None instead of False. This results in it being filtered before the
request is sent.
- Issue: #24880
- Dependencies: /

Co-authored-by: blaufink <sebastian.brueckner@outlook.de>
2024-08-09 13:21:37 +00:00
ccurme
3b7437d184 docs: update integration api refs (#25195)
- [x] toolkits
- [x] retrievers (in this repo)
2024-08-09 12:27:32 +00:00
Bagatur
91ea4b7449 infra: avoid orjson 3.10.7 in vercel build (#25212) 2024-08-09 02:23:18 +00:00
Isaac Francisco
652b3fa4a4 [docs]: playwright fix (#25163) 2024-08-08 17:13:42 -07:00
Bagatur
7040013140 core[patch]: fix deprecation pydantic bug (#25204)
#25004 is incompatible with pydantic < 1.10.17. Introduces fix for this.
2024-08-08 16:39:38 -07:00
Isaac Francisco
dc7423e88f [docs]: standardizing document loader integration pages (#25002) 2024-08-08 16:33:09 -07:00
Casey Clements
25f2e25be1 partners[patch]: Mongodb Retrievers - CI final touches. (#25202)
## Description

Contains 2 updates to for integration tests to run on langchain's CI.
Addendum to #25057 to get release github action to succeed.
2024-08-08 15:38:31 -07:00
Bagatur
786ef021a3 docs: redirect toolkits (#25190) 2024-08-08 14:54:11 -07:00
Eugene Yurtsev
429a0ee7fd core[minor]: Add factory for looking up secrets from the env (#25198)
Add factory method for looking secrets from the env.
2024-08-08 16:41:58 -04:00
Erick Friis
da9281feb2 cli: release 0.0.29 (#25196) 2024-08-08 12:52:49 -07:00
Erick Friis
c6ece6a96d core: autodetect more ls params (#25044)
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-08-08 12:44:21 -07:00
Eugene Yurtsev
86355640c3 experimental[patch]: Use get_fields adapter (#25193)
Change all usages of __fields__ with get_fields adapter merged into
langchain_core.

Code mod generated using the following grit pattern:

```
engine marzano(0.1)
language python


`$X.__fields__` => `get_fields($X)` where {
    add_import(source="langchain_core.utils.pydantic", name="get_fields")
}
```
2024-08-08 15:10:11 -04:00
Eugene Yurtsev
b9f65e5038 experimental[patch]: Migrate pydantic extra to literals (#25194)
Migrate pydantic extra to literals

Upgrade to using a literal for specifying the extra which is the
recommended approach in pydantic 2.

This works correctly also in pydantic v1.

```python
from pydantic.v1 import BaseModel

class Foo(BaseModel, extra="forbid"):
    x: int

Foo(x=5, y=1)
```

And 


```python
from pydantic.v1 import BaseModel

class Foo(BaseModel):
    x: int

    class Config:
      extra = "forbid"

Foo(x=5, y=1)
```


## Enum -> literal using grit pattern:

```
engine marzano(0.1)
language python
or {
    `extra=Extra.allow` => `extra="allow"`,
    `extra=Extra.forbid` => `extra="forbid"`,
    `extra=Extra.ignore` => `extra="ignore"`
}
```

Resorted attributes in config and removed doc-string in case we will
need to deal with going back and forth between pydantic v1 and v2 during
the 0.3 release. (This will reduce merge conflicts.)


## Sort attributes in Config:

```
engine marzano(0.1)
language python


function sort($values) js {
    return $values.text.split(',').sort().join("\n");
}


class_definition($name, $body) as $C where {
    $name <: `Config`,
    $body <: block($statements),
    $values = [],
    $statements <: some bubble($values) assignment() as $A where {
        $values += $A
    },
    $body => sort($values),
}

```
2024-08-08 19:05:54 +00:00
Eugene Yurtsev
30fb345342 core[minor]: Add from_env utility (#25189)
Add a utility that can be used as a default factory

The goal will be to start migrating from of the pydantic models to use
`from_env` as a default factory if possible.

```python

from pydantic import Field, BaseModel
from langchain_core.utils import from_env

class Foo(BaseModel):
   name: str = Field(default_factory=from_env('HELLO'))
```
2024-08-08 14:52:35 -04:00
Eugene Yurtsev
98779797fe community[patch]: Use get_fields adapter for pydantic (#25191)
Change all usages of __fields__ with get_fields adapter merged into
langchain_core.

Code mod generated using the following grit pattern:

```
engine marzano(0.1)
language python


`$X.__fields__` => `get_fields($X)` where {
    add_import(source="langchain_core.utils.pydantic", name="get_fields")
}
```
2024-08-08 14:43:09 -04:00
Rajendra Kadam
663638d6a8 community[minor]: [SharePointLoader] Load extended metadata for the root folder (#24872)
- **Title:** [SharePointLoader] Load extended metadata for the root
folder
- **Description:** 
    - Ensure extended metadata loads correctly for the root folder.
- Cleanup: Refactor SharePointLoader to remove unused fields(`file_id` &
`site_id`).
- **Dependencies:** NA
- **Add tests and docs:** NA
2024-08-08 14:39:16 -04:00
Eugene Yurtsev
2f209d84fa core[patch]: Add pydantic get_fields adapter (#25187)
Add adapter to get fields
2024-08-08 17:47:42 +00:00
Eugene Yurtsev
c72e522e96 langchain[patch]: Upgrade pydantic extra (#25186)
Upgrade to using a literal for specifying the extra which is the
recommended approach in pydantic 2.

This works correctly also in pydantic v1.

```python
from pydantic.v1 import BaseModel

class Foo(BaseModel, extra="forbid"):
    x: int

Foo(x=5, y=1)
```

And 


```python
from pydantic.v1 import BaseModel

class Foo(BaseModel):
    x: int

    class Config:
      extra = "forbid"

Foo(x=5, y=1)
```


## Enum -> literal using grit pattern:

```
engine marzano(0.1)
language python
or {
    `extra=Extra.allow` => `extra="allow"`,
    `extra=Extra.forbid` => `extra="forbid"`,
    `extra=Extra.ignore` => `extra="ignore"`
}
```

Resorted attributes in config and removed doc-string in case we will
need to deal with going back and forth between pydantic v1 and v2 during
the 0.3 release. (This will reduce merge conflicts.)


## Sort attributes in Config:

```
engine marzano(0.1)
language python


function sort($values) js {
    return $values.text.split(',').sort().join("\n");
}


class_definition($name, $body) as $C where {
    $name <: `Config`,
    $body <: block($statements),
    $values = [],
    $statements <: some bubble($values) assignment() as $A where {
        $values += $A
    },
    $body => sort($values),
}

```
2024-08-08 17:27:27 +00:00
Eugene Yurtsev
bf5193bb99 community[patch]: Upgrade pydantic extra (#25185)
Upgrade to using a literal for specifying the extra which is the
recommended approach in pydantic 2.

This works correctly also in pydantic v1.

```python
from pydantic.v1 import BaseModel

class Foo(BaseModel, extra="forbid"):
    x: int

Foo(x=5, y=1)
```

And 


```python
from pydantic.v1 import BaseModel

class Foo(BaseModel):
    x: int

    class Config:
      extra = "forbid"

Foo(x=5, y=1)
```


## Enum -> literal using grit pattern:

```
engine marzano(0.1)
language python
or {
    `extra=Extra.allow` => `extra="allow"`,
    `extra=Extra.forbid` => `extra="forbid"`,
    `extra=Extra.ignore` => `extra="ignore"`
}
```

Resorted attributes in config and removed doc-string in case we will
need to deal with going back and forth between pydantic v1 and v2 during
the 0.3 release. (This will reduce merge conflicts.)


## Sort attributes in Config:

```
engine marzano(0.1)
language python


function sort($values) js {
    return $values.text.split(',').sort().join("\n");
}


class_definition($name, $body) as $C where {
    $name <: `Config`,
    $body <: block($statements),
    $values = [],
    $statements <: some bubble($values) assignment() as $A where {
        $values += $A
    },
    $body => sort($values),
}

```
2024-08-08 17:20:39 +00:00
Isaac Francisco
11adc09e02 [docs]: change rag reference in vector store pages (#25125) 2024-08-08 10:08:14 -07:00
Anush
6b32810b68 qdrant: Update doc with usage snippets (#25179)
## Description

This PR adds back snippets demonstrating sparse and hybrid retrieval in
the Qdrant notebook.

Without the snippets, it's hard to grok the usage.
2024-08-08 12:58:26 -04:00
Eugene Yurtsev
3da2713172 docs: Update pydantic compatibility (#25145)
Update pydantic compatibility
2024-08-08 12:10:44 -04:00
Eugene Yurtsev
425f6ffa5b core[patch]: Fix aindex API (#25155)
A previous PR accidentally broke the aindex API by renaming a positional
argument vectorstore into vector_store. This PR reverts this change.
2024-08-08 12:08:18 -04:00
Isaac Francisco
15a36dd0a2 [docs]: combine tools and toolkits (#25158) 2024-08-08 08:59:02 -07:00
ololand
249945a572 Update polygon.py for business subscription (#25085)
For business subscription the status is STOCKSBUSINESS not OK

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-08 15:28:41 +00:00
ccurme
59b8850909 groq[patch]: update rate limit in integration tests (#25177)
Divide by ~2 to account for testing python 3.8 and 3.12 in parallel.
2024-08-08 13:33:25 +00:00
Chad Juliano
4828c441a7 docs: Update notebook name for Kinetica (#25149)
**Description:** Change notebook description in documentation.
**Issue:** N/A
**Dependencies:** N/A
2024-08-08 09:27:29 -04:00
Francisco Kurucz
725e4912ae docs: Fix reference to SQL QA migration (#25157)
**Description:** I found that the link to the notebook in the Migration
notes is broken, i found that it was linked to this file
https://github.com/langchain-ai/langchain/blob/v0.0.250/docs/extras/use_cases/tabular/sql_query.ipynb
and i think now this tutorial
https://github.com/JuanFKurucz/langchain/blob/master/docs/docs/tutorials/sql_qa.ipynb
is the best fit for this reference

**Twitter handle:** @juanfkurucz

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-08 09:26:13 -04:00
ogawa
d895db11d6 community[patch]: gpt-4o-2024-08-06 costs (#25164)
- **Description:** updated OpenAI cost definitions according to the
following:
  - https://openai.com/api/pricing/
- **Twitter handle:** `@ogawa65a`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-08 13:22:11 +00:00
Brace Sproul
d77c7c4236 docs: Fix misspelling of instantiate in docs (#25107) 2024-08-07 15:05:06 -07:00
Eugene Yurtsev
7b1a132aff core[patch]: Add unit tests for Serializable (#25152)
Add a few test cases for serializable (many other test cases already
covered
throguh runnable tests).
2024-08-07 21:01:36 +00:00
Bagatur
df99b832a7 core[patch]: support Field deprecation (#25004)
![Screenshot 2024-08-02 at 4 23 17
PM](https://github.com/user-attachments/assets/c757e093-877e-4af6-9dcd-984195454158)
2024-08-07 13:57:55 -07:00
ccurme
803eba3163 core[patch]: check for model_fields attribute (#25108)
`__fields__` raises a warning in pydantic v2
2024-08-07 13:32:56 -07:00
Casey Clements
6e9a8b188f mongodb: Add Hybrid and Full-Text Search Retrievers, release 0.2.0 (#25057)
## Description

This pull-request extends the existing vector search strategies of
MongoDBAtlasVectorSearch to include Hybrid (Reciprocal Rank Fusion) and
Full-text via new Retrievers.

There is a small breaking change in the form of the `prefilter` kwarg to
search. For this, and because we have now added a great deal of
features, including programmatic Index creation/deletion since 0.1.0, we
plan to bump the version to 0.2.0.

### Checklist
* Unit tests have been extended
* formatting has been applied
* One mypy error remains which will either go away in CI or be
simplified.

---------

Signed-off-by: Casey Clements <casey.clements@mongodb.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-07 20:10:29 +00:00
Isaac Francisco
f337408b0f [docs]: add sidebar for different tool categories (#25065) 2024-08-07 12:57:58 -07:00
Bagatur
0b4608f71e infra: temp skip oai embeddings test (#25148) 2024-08-07 17:51:39 +00:00
Bagatur
a4086119f8 openai[patch]: Release 0.1.21rc2 (#25146) 2024-08-07 16:59:15 +00:00
Bagatur
b4c12346cc core[patch]: Release 0.2.29 (#25126) 2024-08-07 09:50:20 -07:00
Erick Friis
dff83cce66 core[patch]: base language model disable_streaming (#25070)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-08-07 09:26:21 -07:00
eric-langenberg
130e80b60f docs: rag.ipynb - fixing typo (#25142)
Just changing gpt-3.5 to gpt-4o-mini . That's what's used in the code
examples now. It just didn't get updated in the main text.
2024-08-07 16:02:22 +00:00
Bagatur
09fbce13c5 openai[patch]: ChatOpenAI.with_structured_output json_schema support (#25123) 2024-08-07 08:09:07 -07:00
maang-h
0ba125c3cd docs: Standardize QianfanLLMEndpoint LLM (#25139)
- **Description:** Standardize QianfanLLMEndpoint LLM,include:
  - docs, the issue #24803 
  - model init arg names, the issue #20085
2024-08-07 10:57:27 -04:00
Eugene Yurtsev
28e0958ff4 core[patch]: Relax rate limit unit tests in terms of timing (#25140)
Relax rate limit unit tests
2024-08-07 14:04:58 +00:00
Eray Eroğlu
a2e9910268 Documentation Update for Upstash Semantic Caching (#25114)
Thank you for contributing to LangChain!

- [ ] **PR title**: "Documentation Update : Semantic Caching Update for
Upstash"
 - Docs, llm caching integrations update

- **Description:** Upstash supports semantic caching, and we would like
to inform you about this
- **Twitter handle:** You can mention eray_eroglu_ if you want to post a
tweet about the PR

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-07 14:02:07 +00:00
Pat Patterson
7e7fcf5b1f community: Fix ValidationError on creating GPT4AllEmbeddings with no gpt4all_kwargs (#25124)
- **Description:** Instantiating `GPT4AllEmbeddings` with no
`gpt4all_kwargs` argument raised a `ValidationError`. Root cause: #21238
added the capability to pass `gpt4all_kwargs` through to the `GPT4All`
instance via `Embed4All`, but broke code that did not specify a
`gpt4all_kwargs` argument.
- **Issue:** #25119 
- **Dependencies:** None
- **Twitter handle:** [`@metadaddy`](https://twitter.com/metadaddy)
2024-08-07 13:34:01 +00:00
Atanu Dasgupta
04dd8d3b0a Update google_search.ipynb (#25135)
updated with langchain_google_community instead as the latest revision

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-07 13:30:59 +00:00
ZhangShenao
63d84e93b9 patch[doc] Fix word spelling error (#25128)
Fix word spelling error
2024-08-07 09:16:17 -04:00
Eugene Yurtsev
4d28c70000 core[patch]: Sort Config attributes (#25127)
This PR does an aesthetic sort of the config object attributes. This
will make it a bit easier to go back and forth between pydantic v1 and
pydantic v2 on the 0.3.x branch
2024-08-07 02:53:50 +00:00
Erick Friis
46a47710b0 partners/milvus: release 0.1.4 (#25058) 2024-08-06 16:29:29 -07:00
Erick Friis
35ebd2620c infra,cli: template matching registration (#25110) 2024-08-06 15:29:55 -07:00
ccurme
23c9aba575 groq[patch]: allow warnings during tests (#25105)
Among integration packages in libs/partners, Groq is an exception in
that it errors on warnings.

Following https://github.com/langchain-ai/langchain/pull/25084, Groq
fails with

> pydantic.warnings.PydanticDeprecatedSince20: The `__fields__`
attribute is deprecated, use `model_fields` instead. Deprecated in
Pydantic V2.0 to be removed in V3.0.

Here we update the behavior to no longer fail on warning, which is
consistent with the rest of the packages in libs/partners.
2024-08-06 18:02:20 -04:00
Bagatur
1331e8589c docs: oai chat nit (#25117) 2024-08-06 22:00:42 +00:00
Bagatur
7882d5c978 openai[patch]: Release 0.1.21rc1 (#25116) 2024-08-06 21:50:36 +00:00
Bagatur
70677202c7 core[patch]: Release 0.2.29rc1 (#25115) 2024-08-06 21:36:56 +00:00
Bagatur
78403a3746 core[patch], openai[patch]: enable strict tool calling (#25111)
Introduced
https://openai.com/index/introducing-structured-outputs-in-the-api/
2024-08-06 21:21:06 +00:00
ccurme
5d10139fc7 docs[patch]: add to qa with sources guide (#25112) 2024-08-06 17:08:35 -04:00
Eugene Yurtsev
d283f452cc core[minor]: Add support for DocumentIndex in the index api (#25100)
Support document index in the index api.
2024-08-06 12:30:49 -07:00
Virat Singh
264ab96980 community: Add stock market tools from financialdatasets.ai (#25025)
**Description:**
In this PR, I am adding three stock market tools from
financialdatasets.ai (my API!):
- get balance sheets
- get cash flow statements
- get income statements

Twitter handle: [@virattt](https://twitter.com/virattt)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-06 18:28:12 +00:00
William FH
267855b3c1 Set Context in RunnableSequence & RunnableParallel (#25073) 2024-08-06 11:10:37 -07:00
Naval Chand
71c0698ee4 Added bedrock 3-5 sonnet cost detials for BedrockAnthropicTokenUsageCallbackHandler (#25104)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Example: "community: Added bedrock 3-5 sonnet cost detials for
BedrockAnthropicTokenUsageCallbackHandler"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Naval Chand <navalchand@192.168.1.36>
2024-08-06 17:28:47 +00:00
Isaac Francisco
a72fddbf8d [docs]: vector store integration pages (#24858)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-06 17:20:27 +00:00
Bagatur
2c798622cd docs: runnable docstring space (#25106) 2024-08-06 16:46:50 +00:00
Bagatur
3abf1b6905 docs: versions sidebar (#25061) 2024-08-06 09:23:43 -07:00
maang-h
1028af17e7 docs: Standardize Tongyi (#25103)
- **Description:** Standardize Tongyi LLM,include:
  - docs, the issue #24803
  - model init arg names, the issue #20085
2024-08-06 11:44:12 -04:00
Dobiichi-Origami
061ed250f6 delete the default model value from langchain and discard the need fo… (#24915)
- description: I remove the limitation of mandatory existence of
`QIANFAN_AK` and default model name which langchain uses cause there is
already a default model nama underlying `qianfan` SDK powering langchain
component.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-06 14:11:05 +00:00
Eugene Yurtsev
293a4a78de core[patch]: Include dependencies in sys_info (#25076)
`python -m langchain_core.sys_info`

```bash
System Information
------------------
> OS:  Linux
> OS Version:  #44~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Jun 18 14:36:16 UTC 2
> Python Version:  3.11.4 (main, Sep 25 2023, 10:06:23) [GCC 11.4.0]

Package Information
-------------------
> langchain_core: 0.2.28
> langchain: 0.2.8
> langsmith: 0.1.85
> langchain_anthropic: 0.1.20
> langchain_openai: 0.1.20
> langchain_standard_tests: 0.1.1
> langchain_text_splitters: 0.2.2
> langgraph: 0.1.19

Optional packages not installed
-------------------------------
> langserve

Other Dependencies
------------------
> aiohttp: 3.9.5
> anthropic: 0.31.1
> async-timeout: Installed. No version info available.
> defusedxml: 0.7.1
> httpx: 0.27.0
> jsonpatch: 1.33
> numpy: 1.26.4
> openai: 1.39.0
> orjson: 3.10.6
> packaging: 24.1
> pydantic: 2.8.2
> pytest: 7.4.4
> PyYAML: 6.0.1
> requests: 2.32.3
> SQLAlchemy: 2.0.31
> tenacity: 8.5.0
> tiktoken: 0.7.0
> typing-extensions: 4.12.2
```
2024-08-06 09:57:39 -04:00
Dominik Fladung
ffa0c838d8 Allow ConfluenceLoader authorization via Personal Access Tokens (#25096)
- community: Allow authorization to Confluence with bearer token

- **Description:** Allow authorization to Confluence with [Personal
Access
Token](https://confluence.atlassian.com/enterprise/using-personal-access-tokens-1026032365.html)
by checking for the keys `['client_id', token: ['access_token',
'token_type']]`

- **Issue:** 

Currently the following error occurs when using an personal access token
for authorization.

```python
loader = ConfluenceLoader(
    url=os.getenv('CONFLUENCE_URL'),
    oauth2={
        'token': {"access_token": os.getenv("CONFLUENCE_ACCESS_TOKEN"), "token_type": "bearer"},
        'client_id': 'client_id',
    },
    page_ids=['12345678'], 
)
```

```
ValueError: Error(s) while validating input: ["You have either omitted require keys or added extra keys to the oauth2 dictionary. key values should be `['access_token', 'access_token_secret', 'consumer_key', 'key_cert']`"]
```

With this PR the loader runs as expected.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-06 13:42:47 +00:00
orkhank
111c7df117 docs: update numbering of items in method docs (#25093)
Some methods' doc strings have a wrong numbering of items. The numbers
were adjusted accordingly
2024-08-06 09:21:52 -04:00
Bagatur
6eb42c657e core[patch]: Remove default BaseModel init docstring (#25009)
Currently a default init docstring gets appended to the class docstring
of every BaseModel inherited object. This removes the default init
docstring.

![Screenshot 2024-08-02 at 5 09 55
PM](https://github.com/user-attachments/assets/757fe4ae-a793-4e7d-8354-512de2c06818)
2024-08-06 01:04:04 +00:00
Gram Liu
88a9a6a758 core[patch]: Add pydantic metadata to subset model (#25032)
- **Description:** This includes Pydantic field metadata in
`_create_subset_model_v2` so that it gets included in the final
serialized form that get sent out.
- **Issue:** #25031 
- **Dependencies:** n/a
- **Twitter handle:** @gramliu

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-08-05 17:57:39 -07:00
BhujayKumarBhatta
8f33fce871 docs: change for optional variables in chatprompt (#25017)
Fixes #24884
2024-08-05 23:57:44 +00:00
Erick Friis
423d286546 infra: check doc script skip index page (#25088) 2024-08-05 16:38:30 -07:00
Bagatur
e572521f2a core[patch]: exclude special pydantic init params (#25084) 2024-08-05 23:32:51 +00:00
Isaac Francisco
63ddf0afb4 ollama: allow base_url, headers, and auth to be passed (#25078) 2024-08-05 15:39:36 -07:00
Eugene Yurtsev
4bcd2aad6c core[patch]: Relax time constraints on rate limit test (#25071)
Try to keep the unit test fast, but also have it repeat more robustly
2024-08-05 17:04:22 -04:00
jigsawlabs-student
427a04151c community: fix neo4j from_existing_graph (#24912)
Fixes Neo4JVector.from_existing_graph integration with huggingface

Previously threw an error with existing databases, because
from_existing_graph query returns empty list of new nodes, which are
then passed to embedding function, and huggingface errors with empty
list.

Fixes [24401](https://github.com/langchain-ai/langchain/issues/24401)

---------

Co-authored-by: Jeff Katzy <jeffreyerickatz@gmail.com>
2024-08-05 21:01:46 +00:00
Tomaz Bratanic
d166967003 experimental: Add gliner graph transformer (#25066)
You can use this with:

```
from langchain_experimental.graph_transformers import GlinerGraphTransformer
gliner = GlinerGraphTransformer(allowed_nodes=["Person", "Organization", "Nobel"], allowed_relationships=["EMPLOYEE", "WON"])

from langchain_core.documents import Document

text = """
Marie Curie, was a Polish and naturalised-French physicist and chemist who conducted pioneering research on radioactivity.
She was the first woman to win a Nobel Prize, the first person to win a Nobel Prize twice, and the only person to win a Nobel Prize in two scientific fields.
Her husband, Pierre Curie, was a co-winner of her first Nobel Prize, making them the first-ever married couple to win the Nobel Prize and launching the Curie family legacy of five Nobel Prizes.
She was, in 1906, the first woman to become a professor at the University of Paris.
"""
documents = [Document(page_content=text)]

gliner.convert_to_graph_documents(documents)
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-05 21:01:27 +00:00
Bagatur
a74e466507 docs: aws pydantic v2 compat (#25075) 2024-08-05 20:47:11 +00:00
Bagatur
a02a09c973 docs: remove redundant deprecation warning (#25067) 2024-08-05 18:44:47 +00:00
Eugene Yurtsev
41dfad5104 core[minor]: Introduce DocumentIndex abstraction (#25062)
This PR adds a minimal document indexer abstraction.

The goal of this abstraction is to allow developers to create custom
retrievers that also have a standard indexing API and allow updating the
document content in them.

The abstraction comes with a test suite that can verify that the indexer
implements the correct semantics.

This is an iteration over a previous PRs
(https://github.com/langchain-ai/langchain/pull/24364). The main
difference is that we're sub-classing from BaseRetriever in this
iteration and as so have consolidated the sync and async interfaces.

The main problem with the current design is that runt time search
configuration has to be specified at init rather than provided at run
time.

We will likely resolve this issue in one of the two ways:

(1) Define a method (`get_retriever`) that will allow creating a
retriever at run time with a specific configuration.. If we do this, we
will likely break the subclass on BaseRetriever
(2) Generalize base retriever so it can support structured queries

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-05 18:06:33 +00:00
Vkzem
e7b95e0802 docs: update exa search (#24861)
- [x] **PR title**: "docs: changed example for Exa search retriever
usage"

- [x] **PR message**:
- **Description:** Changed Exa integration doc at
`docs/docs/integrations/tools/exa_search.ipynb` to better reflect simple
Exa use case
- **Issue:** move toward more canonical use of Exa method
(`search_and_contents` rather than just `search`)
    - **Dependencies:** no dependencies; docs only change
    - **Twitter handle:** n/a - small change

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. - will do

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-05 17:41:33 +00:00
Stuart Marsh
16bd0697dc milvus: fixed bug when using partition key and dynamic fields together (#25028)
**Description:**

This PR fixes a bug where if `enable_dynamic_field` and
`partition_key_field` are enabled at the same time, a pymilvus error
occurs.

Milvus requires the partition key field to be a full schema defined
field, and not a dynamic one, so it will throw the error "the specified
partition key field {field} not exist" when creating the collection.

When `enabled_dynamic_field` is set to `True`, all schema field creation
based on `metadatas` is skipped. This code now checks if
`partition_key_field` is set, and creates the field.

Integration test added.

**Twitter handle:** StuartMarshUK

---------

Co-authored-by: Stuart Marsh <stuart.marsh@qumata.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-05 16:01:55 +00:00
Jim Baldwin
6890daa90c community: make AthenaLoader profile_name optional and fix type hint (#24958)
- **Description:** This PR makes the AthenaLoader profile_name optional
and fixes the type hint which says the type is `str` but it should be
`str` or `None` as None is handled in the loader init. This is a minor
problem but it just confused me when I was using the Athena Loader to
why we had to use a Profile, as I want that for local but not
production.
- **Issue:** #24957 
- **Dependencies:** None.
2024-08-05 14:28:58 +00:00
Alexey Lapin
335894893b langchain: Make RetryWithErrorOutputParser.from_llm() create a correct retry chain (#25053)
Description: RetryWithErrorOutputParser.from_llm() creates a retry chain
that returns a Generation instance, when it should actually just return
a string.
This class was forgotten when fixing the issue in PR #24687
2024-08-05 14:21:27 +00:00
Dobiichi-Origami
c5cb52a3c6 community: fix issue of the existence of numeric object in additional_kwargs a… (#24863)
- **Description:** A previous PR breaks the code from
`baidu_qianfan_endpoint.py` which causes the malfunction of streaming
2024-08-05 10:15:55 -04:00
ZhangShenao
cda79dbb6c community[patch]: Optimize test case for MoonshotChat (#25050)
Optimize test case for `MoonshotChat`. Use standard
ChatModelIntegrationTests.
2024-08-05 10:11:25 -04:00
orkhank
cea3f72485 docs: fix comment lines in code blocks (#25054)
The comments inside some code blocks seems to be misplaced. The comment
lines containing explanation about `default_key` behavior when operating
with prompts are updated.
2024-08-05 14:11:09 +00:00
ZhangShenao
02c35da445 doc[Retriever] Enhance api docs for MultiQueryRetriever (#25035)
Enhance api docs for `MultiQueryRetriever`:

- Complete missing parameters
- Unify parameter name
2024-08-04 13:56:38 -04:00
Alex Sherstinsky
208042e0f2 community: Fix Predibase Integration for HuggingFace-hosted fine-tuned adapters (#25015)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-08-03 14:05:43 -07:00
maang-h
f5da0d6d87 docs: Standardize MiniMaxEmbeddings (#24983)
- **Description:** Standardize MiniMaxEmbeddings
  - docs, the issue #24856 
  - model init arg names, the issue #20085
2024-08-03 14:01:23 -04:00
ZhangShenao
2c3e3dc6b1 patch[Partners] Unified fix of incorrect variable declarations in all check_imports (#25014)
There are some incorrect declarations of variable `has_failure` in
check_imports. The purpose of this PR is to uniformly fix these errors.
2024-08-03 13:49:41 -04:00
maang-h
7de62abc91 docs: Standardize SparkLLMTextEmbeddings docstrings (#25021)
- **Description:** Standardize SparkLLMTextEmbeddings docstrings
- **Issue:** the issue #24856
2024-08-03 13:44:09 -04:00
Tomaz Bratanic
f9a11a9197 Add relik transformer config (#25019) 2024-08-03 08:41:45 -04:00
Bagatur
1dcee68cb8 docs: show beta directive (#25013)
![Screenshot 2024-08-02 at 7 15 34
PM](https://github.com/user-attachments/assets/086831c7-36f3-4962-98dc-d707b6289747)
2024-08-03 03:07:45 +00:00
Bagatur
e81ddb32a6 docs: fix kwargs docstring (#25010)
Fix:
![Screenshot 2024-08-02 at 5 33 37
PM](https://github.com/user-attachments/assets/7c56cdeb-ee81-454c-b3eb-86aa8a9bdc8d)
2024-08-02 19:54:54 -07:00
Bagatur
57747892ce docs: show deprecation warning first in api ref (#25001)
OLD
![Screenshot 2024-08-02 at 3 29 39
PM](https://github.com/user-attachments/assets/7f169121-1202-4770-a006-d72ac7a1aa33)


NEW
![Screenshot 2024-08-02 at 3 29 45
PM](https://github.com/user-attachments/assets/9cc07cbd-2ae9-4077-95c5-03cb051e6cd7)
2024-08-02 17:35:25 -07:00
Bagatur
679843abb0 docs: separate deprecated classes (#25007)
![Screenshot 2024-08-02 at 4 58 54
PM](https://github.com/user-attachments/assets/29424dd5-0593-4818-9eed-901ff47246b9)
2024-08-02 17:12:47 -07:00
Isaac Francisco
73570873ab docs: standardizing tavily tool docs (#24736)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-02 22:25:27 +00:00
Isaac Francisco
2ae76cecde [docs]: updating mistral and hugging face chat model pages (#24731) 2024-08-02 15:21:25 -07:00
Bagatur
4305f78e40 core[patch]: Release 0.2.28 (#25000) 2024-08-02 21:07:06 +00:00
Bagatur
64ccddf3cb docs: fmt concepts (#24999) 2024-08-02 20:35:45 +00:00
Bagatur
dd8e4cd020 text-splitters[patch]: Release 0.2.3 (#24998) 2024-08-02 20:27:22 +00:00
Bagatur
0de0cd2d31 core[patch]: merge message runs nit (#24997)
Only add separator if both chunks are non-empty
2024-08-02 20:25:43 +00:00
Bagatur
8e2316b8c2 community[patch]: Release 0.2.11 (#24989) 2024-08-02 20:08:44 +00:00
ccurme
c2538e7834 experimental[patch]: bump min versions of core and community (#24996)
Ollama functions unit test broken with min version of community.
2024-08-02 19:58:55 +00:00
ccurme
acba38a18e docs: update toolkit guides (#24992) 2024-08-02 15:51:05 -04:00
ccurme
22c1a4041b community[patch]: support named arguments in github toolkit (#24986)
Parameters may be passed in by name if generated from tool calls.
2024-08-02 18:27:32 +00:00
ccurme
4797b806c2 experimental[patch]: release 0.0.64 (#24990) 2024-08-02 18:00:57 +00:00
Tomaz Bratanic
7061869aec Add relik graph transformer (#24982)
Relik is a new library for graph extraction that offers smaller and
cheaper models for graph construction
2024-08-02 13:55:41 -04:00
Erick Friis
98c22e9082 docs: feature table component (#24985) 2024-08-02 17:41:47 +00:00
ccurme
c04d95b962 standard-tests: set integration test parameters independent of unit test (#24979)
This ends up getting set in integration tests.
2024-08-02 10:40:11 -07:00
gbaian10
54e9ea433a fix: Modify the order of init_chat_model import ollama package. (#24977) 2024-08-02 08:32:56 -07:00
David Gao
fe1820cdaf docs: add wikipedia integration docs (#24932)
Dear langchain maintainers, 

I add the wikipedia integration docs according to the [web
docs](https://python.langchain.com/v0.2/docs/integrations/retrievers/wikipedia/),
and follow the format of [tavily
example](https://github.com/langchain-ai/langchain/blob/master/docs/docs/integrations/retrievers/tavily.ipynb)
and [retriever
template](https://github.com/langchain-ai/langchain/blob/master/libs/cli/langchain_cli/integration_template/docs/retrievers.ipynb),
this is my first time contributing large repo. please let me know if I'm
doing anything wrong, thank you!

Topic related: #24908

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-02 10:12:04 -04:00
ZhangShenao
71c0564c9f community[patch]: Add test case for MoonshotChat (#24960)
Add test case for `MoonshotChat`.
2024-08-02 09:37:31 -04:00
ZhangShenao
c65e48996c patch[partners] Fix check_imports bugs in pinecone and milvus (#24971)
Fix wrong declared variables of `check_imports` in pinecone and milvus
2024-08-02 09:27:11 -04:00
Isaac Francisco
d7688a4328 community[patch]: adding artifact to Tavily search (#24376)
This allows you to get raw content as well as the answer, instead of
just getting the results.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-01 21:12:11 -07:00
Bagatur
7b08de8909 langchain[patch]: Release 0.2.12 (#24954) 2024-08-02 04:04:49 +00:00
Bagatur
245cb5a252 core[patch]: Release 0.2.27 (#24952) 2024-08-02 01:43:24 +00:00
Bagatur
199e9c5ae0 core[patch]: Fix tool args schema inherited field parsing (#24936)
Fix #24925
2024-08-01 18:36:33 -07:00
Bagatur
fba65ba04f infra: test core on py 3.9, 10, 11 (#24951) 2024-08-01 18:23:37 -07:00
Leonid Ganeline
4092876863 core: docstrings `BaseCallbackHandler update (#24948)
Added missed docstrings
2024-08-01 20:46:53 -04:00
ccurme
6e45dba471 docs: fix redirect (#24950) 2024-08-01 20:45:54 -04:00
WU LIFU
ad16eed119 core[patch]: runnable config ensure_config deep copy from var_child_runnable… (#24862)
**issue**: #24660 
RunnableWithMessageHistory.stream result in error because the
[evaluation](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/runnables/branch.py#L220)
of the branch
[condition](99eb31ec41/libs/core/langchain_core/runnables/history.py (L328C1-L329C1))
unexpectedly trigger the
"[on_end](99eb31ec41/libs/core/langchain_core/runnables/history.py (L332))"
(exit_history) callback of the default branch


**descriptions**
After a lot of investigation I'm convinced that the root cause is that
1. during the execution of the runnable, the
[var_child_runnable_config](99eb31ec41/libs/core/langchain_core/runnables/config.py (L122))
is shared between the branch
[condition](99eb31ec41/libs/core/langchain_core/runnables/history.py (L328C1-L329C1))
runnable and the [default branch
runnable](99eb31ec41/libs/core/langchain_core/runnables/history.py (L332))
within the same context
2. when the default branch runnable runs, it gets the
[var_child_runnable_config](99eb31ec41/libs/core/langchain_core/runnables/config.py (L163))
and may unintentionally [add more handlers
](99eb31ec41/libs/core/langchain_core/runnables/config.py (L325))to
the callback manager of this config
3. when it is again the turn for the
[condition](99eb31ec41/libs/core/langchain_core/runnables/history.py (L328C1-L329C1))
to run, it gets the `var_child_runnable_config` whose callback manager
has the handlers added by the default branch. When it runs that handler
(`exit_history`) it leads to the error
   
with the assumption that, the `ensure_config` function actually does
want to create a immutable copy from `var_child_runnable_config` because
it starts with an [`empty` variable
](99eb31ec41/libs/core/langchain_core/runnables/config.py (L156)),
i go ahead to do a deepcopy to ensure that future modification to the
returned value won't affect the `var_child_runnable_config` variable
   
   Having said that I actually 
1. don't know if this is a proper fix
2. don't know whether it will lead to other unintended consequence 
3. don't know why only "stream" runs into this issue while "invoke" runs
without problem

so @nfcampos @hwchase17 please help review, thanks!

---------

Co-authored-by: Lifu Wu <lifu@nextbillion.ai>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-08-01 17:30:32 -07:00
Jacob Lee
3ab09d87d6 docs[patch]: Adds components for prereqs, compatibility, fix chat model tab issue (#24585)
Added to `docs/how_to/tools_runtime` as a proof of concept, will apply
everywhere if we like.

A bit more compact than the default callouts, will help standardize the
layout of our pages since we frequently use these boxes.

<img width="1088" alt="Screenshot 2024-07-23 at 4 49 02 PM"
src="https://github.com/user-attachments/assets/7380801c-e092-4d31-bcd8-3652ee05f29e">
2024-08-01 15:04:13 -07:00
ccurme
9cb69a8746 docs: update retriever template, add arxiv retriever (#24947) 2024-08-01 16:53:18 -04:00
Casey Clements
db3ceb4d0a partners/mongodb: Improved search index commands (#24745)
Hardens index commands with try/except for free clusters and optional
waits for syncing and tests.

[efriis](https://github.com/efriis) These are the upgrades to the search
index commands (CRUD) that I mentioned.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-01 20:16:32 +00:00
ccurme
db42576b09 docs: delete old migration guide (#24881)
Redirects to
https://python.langchain.com/v0.2/docs/versions/migrating_chains/
2024-08-01 16:11:47 -04:00
Ikko Eltociear Ashimine
be5294e35d docs: update agents.ipynb (#24945)
initalize -> initialize
2024-08-01 14:37:37 -04:00
ccurme
41ed23a050 docs: update retriever integration pages (#24931) 2024-08-01 14:37:07 -04:00
maang-h
ea505985c4 docs: Standardize ZhipuAIEmbeddings docstrings (#24933)
- **Description:** Standardize ZhipuAIEmbeddings rich docstrings.
- **Issue:** the issue #24856
2024-08-01 14:06:53 -04:00
ccurme
02db66d764 docs: fix kv store column headers (#24941)
![Screenshot 2024-08-01 at 12 32 19
PM](https://github.com/user-attachments/assets/888056b7-3065-4be0-a6b8-bcab5b729c2c)
2024-08-01 09:49:36 -07:00
Anneli Samuel
2204d8cb7d community[patch]: Invoke on_llm_new_token callback before yielding chunk (#24938)
**Description**: Invoke on_llm_new_token callback before yielding chunk
in streaming mode
**Issue**:
[#16913](https://github.com/langchain-ai/langchain/issues/16913)
2024-08-01 16:39:04 +00:00
John
ff6274d32d docs: update langchain-unstructured docs (#24935)
- **Description:** The UnstructuredClient will have a breaking change in
the near future. Add a note in the docs that the examples here may not
use the latest version and users should refer to the SDK docs for the
latest info.
2024-08-01 16:27:40 +00:00
ccurme
c72f0d2f20 docs: update toolkit integration pages (#24887)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-08-01 12:13:08 -04:00
Eugene Yurtsev
75776e4a54 core[patch]: In unit tests, use _schema() instead of BaseModel.schema() (#24930)
This PR introduces a module with some helper utilities for the pydantic
1 -> 2 migration.

They're meant to be used in the following way:

1) Use the utility code to get unit tests pass without requiring
modification to the unit tests
2) (If desired) upgrade the unit tests to match pydantic 2 output
3) (If desired) stop using the utility code

Currently, this module contains a way to map `schema()` generated by
pydantic 2 to (mostly) match the output from pydantic v1.
2024-08-01 11:59:04 -04:00
Serena Ruan
1827bb4042 community[patch]: support bind_tools for ChatMlflow (#24547)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- **Description:** 
Support ChatMlflow.bind_tools method
Tested in Databricks:
<img width="836" alt="image"
src="https://github.com/user-attachments/assets/fa28ef50-0110-4698-8eda-4faf6f0b9ef8">



- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: Serena Ruan <serena.rxy@gmail.com>
2024-08-01 08:43:07 -07:00
Michal Gregor
769c3bb838 huggingface: Added a missing argument to a ChatHuggingFace doc notebook. (#24929)
- **Description:** When adding docs for constructing ChatHuggingFace
using a HuggingFacePipeline, I forgot to add `return_full_text=False` as
an argument. In this setup, the chat response would incorrectly contain
all the input text. I am fixing that here by adding that line to the
offending notebook.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-01 15:42:35 +00:00
BottlePumpkin
bfc59c1d26 community: Fix KeyError in NotionDB loader when 'name' is missing (#24224)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.



**Description:** This PR fixes a KeyError in NotionDBLoader when the
"name" key is missing in the "people" property.

**Issue:** Fixes #24223 

**Dependencies:** None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-08-01 13:55:40 +00:00
alexqiao
8eb0bdead3 community[patch]: Invoke callback prior to yielding token (#24917)
**Description: Invoke callback prior to yielding token in stream method
for chat_models .**
**Issue**: https://github.com/langchain-ai/langchain/issues/16913
#16913
2024-08-01 13:19:55 +00:00
ZhangShenao
b2dd9ffaaf patch[cli] Fix bug in check_imports.py (#24918)
The variable `has_failure` in check_imports.py is wrong-declared. It's
actually an another variable.
2024-08-01 09:08:12 -04:00
Jacob Lee
f14121faaf docs[patch]: Update local RAG tutorial (#24909) 2024-07-31 19:19:23 -07:00
Bagatur
b7abac9f92 infra: poetry lock root (#24913) 2024-08-01 01:19:34 +00:00
Jacob Lee
42c686bc28 docs[patch]: Update local model how-to guide (#24911)
Updates to use `langchain_ollama`, new models, chat model example
2024-07-31 18:01:55 -07:00
Erick Friis
600fc233ef partners/ollama: release 0.1.1 (#24910) 2024-07-31 17:31:29 -07:00
Bagatur
25b93cc4c0 core[patch]: stringify tool non-content blocks (#24626)
Slightly breaking bugfix. Shouldn't cause too many issues since no
models would be able to handle non-content block ToolMessage.content
anyways.
2024-07-31 16:42:38 -07:00
Bagatur
492df75937 docs: chat model table nit (#24907) 2024-07-31 15:14:27 -07:00
Bagatur
a24c445e02 docs: cleanup readme (#24905) 2024-07-31 15:03:28 -07:00
Jacob Lee
5098f9dc79 infra: related section in docs (#24829)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-31 14:25:58 -07:00
Nikita Pakunov
c776471ac6 community: fix AttributeError: 'YandexGPT' object has no attribute '_grpc_metadata' (#24432)
Fixes #24049

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-31 21:18:33 +00:00
Bagatur
752a71b688 integrations[patch]: release model packages (#24900) 2024-07-31 20:48:20 +00:00
Jacob Lee
1213a59f87 docs[patch]: Update kv store docs pages (#24848) 2024-07-31 13:23:24 -07:00
Erick Friis
17a06cb7a6 infra: check templates based on integration (#24857)
instead of hardcoding a linter for each, iterate through the lines of
the template notebook and find lines that start with `##` (includes
lower headings), and enforce that those headings are found in new docs
that are contributed
2024-07-31 13:19:50 -07:00
Erick Friis
a7380dd531 cli: release 0.0.28 (#24852) 2024-07-31 13:03:24 -07:00
Erick Friis
e98e4be0f7 cli: register new integration doc templates (#24854)
- wait to merge for retriever.ipynb merge #24836
2024-07-31 13:03:05 -07:00
Eugene Yurtsev
210623b409 core[minor]: Add support for pydantic 2 to utility to get fields (#24899)
Add compatibility for pydantic 2 for a utility function.

This will help push some small changes to master, so they don't have to
be kept track of on a separate branch.
2024-07-31 19:11:07 +00:00
Bagatur
7d1694040d core[patch]: Release 0.2.26 (#24898) 2024-07-31 19:00:50 +00:00
Eugene Yurtsev
add16111b9 community[patch]: Make the pydantic linter stricter (#24897)
Stricter linting of deprecated pydantic features.
2024-07-31 18:57:37 +00:00
Eugene Yurtsev
a4a444f73d community[patch]: Fix arcee llm usage of root_validator(pre=False) (#24896)
Should be pre=True
2024-07-31 18:49:20 +00:00
Eugene Yurtsev
69c656aa5f langchain[minor]: Upgrade ambiguous root_validator to @pre_init (#24895)
The @pre_init validator is a temporary solution for base models. It has
similar (but not identical) semantics to @root_validator(), but it works
strictly as a pre-init validator.

It'll work as expected as long as the pydantic model type hints were
correct.
2024-07-31 18:46:47 +00:00
Eugene Yurtsev
5099a9c9b4 core[patch]: Update unit tests with a workaround for using AnyID in pydantic 2 (#24892)
Pydantic 2 ignores __eq__ overload for subclasses of strings.
2024-07-31 14:42:12 -04:00
Bagatur
8461934c2b core[patch], integrations[patch]: convert TypedDict to tool schema support (#24641)
supports following UX

```python
    class SubTool(TypedDict):
        """Subtool docstring"""

        args: Annotated[Dict[str, Any], {}, "this does bar"]

    class Tool(TypedDict):
        """Docstring
        Args:
            arg1: foo
        """

        arg1: str
        arg2: Union[int, str]
        arg3: Optional[List[SubTool]]
        arg4: Annotated[Literal["bar", "baz"], ..., "this does foo"]
        arg5: Annotated[Optional[float], None]
```

- can parse google style docstring
- can use Annotated to specify default value (second arg)
- can use Annotated to specify arg description (third arg)
- can have nested complex types
2024-07-31 18:27:24 +00:00
Eugene Yurtsev
d24b82357f community[patch]: Add missing annotations (#24890)
This PR adds annotations in comunity package.

Annotations are only strictly needed in subclasses of BaseModel for
pydantic 2 compatibility.

This PR adds some unnecessary annotations, but they're not bad to have
regardless for documentation pages.
2024-07-31 18:13:44 +00:00
Eugene Yurtsev
7720483432 langchain[patch]: Update unit tests to workaround a pydantic 2 issue (#24886)
This will allow our unit tests to pass when using AnyID() with our pydantic models.
2024-07-31 14:09:40 -04:00
Eugene Yurtsev
2019e31bc5 langchain[patch]: Add missing type annotations (#24889)
Adds missing type annotations in preparation for pydantic 2 upgrade.
2024-07-31 14:09:22 -04:00
ccurme
30f18c7b02 docs: add retriever integrations template (#24836) 2024-07-31 13:50:44 -04:00
Anirudh31415926535
4da3d4b18e docs: Minor corrections and updates to Cohere docs (#22726)
- **Description:** Update the Cohere's provider and RagRetriever
documentations with latest updates.
    - **Twitter handle:** Anirudh1810
2024-07-31 10:16:26 -07:00
ccurme
40b4a3de6e docs: update chat model integration pages (#24882)
to conform with template
2024-07-31 11:26:52 -04:00
Nishan Jain
b00c0fc558 [Community][minor]: Added prompt governance in pebblo_retrieval (#24874)
Title: [pebblo_retrieval] Identifying entities in prompts given in
PebbloRetrievalQA leading to prompt governance
Description: Implemented identification of entities in the prompt using
Pebblo prompt governance API.
Issue: NA
Dependencies: NA
Add tests and docs: NA
2024-07-31 13:14:51 +00:00
Rajendra Kadam
a6add89bd4 community[minor]: [PebbloSafeLoader] Implement content-size-based batching (#24871)
- **Title:** [PebbloSafeLoader] Implement content-size-based batching in
the classification flow(loader/doc API)
- **Description:** 
- Implemented content-size-based batching in the loader/doc API, set to
100KB with no external configuration option, intentionally hard-coded to
prevent timeouts.
    - Remove unused field(pb_id) from doc_metadata
- **Issue:** NA
- **Dependencies:** NA
- **Add tests and docs:** Updated
2024-07-31 09:10:28 -04:00
TrumanYan
096b66db4a community: replace it with Tencent Cloud SDK (#24172)
Description: The old method will be discontinued; use the official SDK
for more model options.
Issue: None
Dependencies: None
Twitter handle: None

Co-authored-by: trumanyan <trumanyan@tencent.com>
2024-07-31 09:05:38 -04:00
Erick Friis
99eb31ec41 cli: embed docstring template (#24855) 2024-07-31 02:16:40 +00:00
Noah Peterson
4b2a8ce6c7 docs: Shorten unreasonably long OllamaEmbeddings page (#24850)
This change removes excessive embeddings output in the Jupyter Notebook
on the [Ollama text embedding
page](https://python.langchain.com/v0.2/docs/integrations/text_embedding/ollama/)
2024-07-31 01:57:04 +00:00
Erick Friis
3999e9035c cli/docs: embedding template standardization (#24849) 2024-07-30 18:54:03 -07:00
Bagatur
1181c10c65 docs: reorder integrations sidebar (#24847) 2024-07-30 16:58:26 -07:00
Bagatur
943126c5fd docs: chat model pkg links (#24845) 2024-07-30 16:26:06 -07:00
Erick Friis
1f5444817a community: deprecate BedrockEmbeddings in favor of langchain-aws (#24846) 2024-07-30 23:13:17 +00:00
Jacob Lee
21eb4c9e5d docs[patch]: Adds first kv store doc matching new template (#24844) 2024-07-30 15:58:51 -07:00
Bagatur
a4e940550a docs: integrations custom callout (#24843) 2024-07-30 22:48:18 +00:00
Bagatur
61ecb10a77 docs: partner pkg table (#24840) 2024-07-30 15:28:10 -07:00
Erick Friis
b099cc3507 cli: release 0.0.27 (#24842) 2024-07-30 22:07:50 +00:00
Bagatur
419f2c2585 cli[patch]: tool integration templates (#24837)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-30 14:59:33 -07:00
mschoenb97IL
19b127f640 langchain: Update Langchain -> Langgraph migration docs for the deprecation of the messages_modifier parameter. (#24839)
**Description:** Updated the Langgraph migration docs to use
`state_modifier` rather than `messages_modifier`
**Issue:** N/A
**Dependencies:** N/A

- [ X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-30 21:28:32 +00:00
ccurme
c123cb2b30 docs: update migration guide (#24835)
Move to its own section in the sidebar.
2024-07-30 20:17:12 +00:00
Erick Friis
957b05b8d5 infra: py3.11 for community integration test compiling (#24834)
e.g.
https://github.com/langchain-ai/langchain/actions/runs/10167754785/job/28120861343?pr=24833
2024-07-30 18:43:10 +00:00
Erick Friis
88418af3f5 core: release 0.2.25 (#24833) 2024-07-30 18:41:09 +00:00
Bagatur
37b060112a langchain[patch]: fix ollama in init_chat_model (#24832) 2024-07-30 18:38:53 +00:00
Jerron Lim
d8f3ea82db langchain[patch]: init_chat_model() to import ChatOllama from langchain-ollama and fallback on langchain-community (#24821)
Description: init_chat_model() should import ChatOllama from
`langchain-ollama`. If that fails, fallback to `langchain-community`
2024-07-30 11:16:10 -07:00
Eugene Yurtsev
3a7f3d46c3 docs: Add pydantic compatibility to side bar (#24826)
Add pydantic compatibility to side bar
2024-07-30 14:10:48 -04:00
Isaac Francisco
511242280b [docs]: standardize vectorstores (#24797) 2024-07-30 10:38:04 -07:00
Jacob Lee
ac649800df docs[patch]: Adds kv store integration docs template (#24804) 2024-07-30 10:07:57 -07:00
cffranco94
b01d938997 experimental: Add config to convert_to_graph_documents (#24012)
PR title: Experimental: Add config to convert_to_graph_documents

Description: In order to use langfuse, i need to pass the langfuse
configuration when invoking the chain. langchain_experimental does not
allow to add any parameters (beside the documents) to the
convert_to_graph_documents method. This way, I cannot monitor the chain
in langfuse.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Catarina Franco <catarina.franco@criticalsoftware.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-30 17:01:06 +00:00
Shailendra Mishra
f2d810b3c0 clob_bugfix... (#24813)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-30 12:44:04 -04:00
Anush
51b15448cc community: Fix FastEmbedEmbeddings (#24462)
## Description

This PR:
- Fixes the validation error in `FastEmbedEmbeddings`.
- Adds support for `batch_size`, `parallel` params.
- Removes support for very old FastEmbed versions.
- Updates the FastEmbed doc with the new params.

Associated Issues:
- Resolves #24039
- Resolves #https://github.com/qdrant/fastembed/issues/296
2024-07-30 12:42:46 -04:00
ccurme
73ec24fc56 docs[patch]: add toolkit template (#24791) 2024-07-30 12:36:09 -04:00
Tamir Zitman
b3e1378f2b langchain : text_splitters Added PowerShell (#24582)
- **Description:** Added PowerShell support for text splitters language
include docs relevant update
  - **Issue:** None
  - **Dependencies:** None

---------

Co-authored-by: tzitman <tamir.zitman@intel.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-30 16:13:52 +00:00
ccurme
187ee96f7a docs: update chat model feature table (#24822) 2024-07-30 09:06:42 -07:00
Nuno Campos
68ecebf1ec core: Fix implementation of trim_first_node/trim_last_node to use exact same definition of first/last node as in the getter methods (#24802) 2024-07-30 08:44:27 -07:00
Igor Drozdov
c2706cfb9e feat(community): add tools support for litellm (#23906)
I used the following example to validate the behavior

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import ConfigurableField
from langchain_anthropic import ChatAnthropic
from langchain_community.chat_models import ChatLiteLLM
from langchain_core.tools import tool
from langchain.agents import create_tool_calling_agent, AgentExecutor

@tool
def multiply(x: float, y: float) -> float:
    """Multiply 'x' times 'y'."""
    return x * y

@tool
def exponentiate(x: float, y: float) -> float:
    """Raise 'x' to the 'y'."""
    return x**y

@tool
def add(x: float, y: float) -> float:
    """Add 'x' and 'y'."""
    return x + y

prompt = ChatPromptTemplate.from_messages([
    ("system", "you're a helpful assistant"),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

tools = [multiply, exponentiate, add]

llm = ChatAnthropic(model="claude-3-sonnet-20240229", temperature=0)
# llm = ChatLiteLLM(model="claude-3-sonnet-20240229", temperature=0)

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241", })
```

`ChatAnthropic` version works:

```
> Entering new AgentExecutor chain...

Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}`
responded: [{'text': 'To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools:', 'type': 'text', 'index': 0}, {'id': 'toolu_01Gf54DFTkfLMJQX3TXffmxe', 'input': {}, 'name': 'exponentiate', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 5, "y": 2.743}'}]

82.65606421491815
Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}`
responded: [{'id': 'toolu_01XUq9S56GT3Yv2N1KmNmmWp', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 0, 'partial_json': '{"x": 3, "y": 82.65606421491815}'}]

85.65606421491815
Invoking: `add` with `{'x': 17.24, 'y': -918.1241}`
responded: [{'text': '\n\nSo 3 + 5^2.743 = 85.66\n\nTo calculate 17.24 - 918.1241, we can use:', 'type': 'text', 'index': 0}, {'id': 'toolu_01BkXTwP7ec9JKYtZPy5JKjm', 'input': {}, 'name': 'add', 'type': 'tool_use', 'index': 1, 'partial_json': '{"x": 17.24, "y": -918.1241}'}]

-900.8841[{'text': '\n\nTherefore, 17.24 - 918.1241 = -900.88', 'type': 'text', 'index': 0}]

> Finished chain.
```

While `ChatLiteLLM` version doesn't.

But with the changes in this PR, along with:

- https://github.com/langchain-ai/langchain/pull/23823
- https://github.com/BerriAI/litellm/pull/4554

The result is _almost_ the same:

```
> Entering new AgentExecutor chain...

Invoking: `exponentiate` with `{'x': 5, 'y': 2.743}`
responded: To calculate 3 + 5^2.743, we can use the "exponentiate" and "add" tools:

82.65606421491815
Invoking: `add` with `{'x': 3, 'y': 82.65606421491815}`


85.65606421491815
Invoking: `add` with `{'x': 17.24, 'y': -918.1241}`
responded:

So 3 + 5^2.743 = 85.66

To calculate 17.24 - 918.1241, we can use:

-900.8841

Therefore, 17.24 - 918.1241 = -900.88

> Finished chain.
```

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-30 15:39:34 +00:00
David Robertson
bfb7f8d40a Brave Search: Enhance search result details with extra snippets (#19209)
**Description:** 

This update significantly improves the Brave Search Tool's utility
within the LangChain library by enriching the search results it returns.
The tool previously returned title, link, and snippet, with the snippet
being a truncated 140-character description from the search engine. To
make the search results more informative, this update enables
extra_snippets by default and introduces additional result fields:
title, link, description (enhancing and renaming the former snippet
field), age, and snippets. The snippets field provides a list of strings
summarizing the webpage, utilizing Brave's capability for more detailed
search insights. This enhancement aims to make the search tool far more
informative and beneficial for users.

**Issue:** N/A

**Dependencies:** No additional dependencies introduced.

**Twitter handle:** @davidalexr987

**Code Changes Summary:**

- Changed the default setting to include extra_snippets in search
results.
- Renamed the snippet field to description to accurately reflect its
content and included an age field for search results.
- Introduced a snippets field that lists webpage summaries, providing
users with comprehensive search result insights.

**Backward Compatibility Note:**

The renaming of snippet to description improves the accuracy of the
returned data field but may impact existing users who have developed
integration's or analyses based on the snippet field. I believe this
change is essential for clarity and utility, and it aligns better with
the data provided by Brave Search.

**Additional Notes:**

This proposal focuses exclusively on the Brave Search package, without
affecting other LangChain packages or introducing new dependencies.
2024-07-30 15:29:38 +00:00
Eugene Yurtsev
873f64751e docs: Remove danger on how to migrate to astream events v2 (#24825)
Users should migrate to v2 now
2024-07-30 15:28:07 +00:00
Ben Chambers
435771fe74 [community]: Fix package name mismatch (#24824)
- **Description:** fix a mismatch in pypi package names
2024-07-30 11:21:39 -04:00
ccurme
b7bbfc7c67 langchain: revert "init_chat_model() to support ChatOllama from langchain-ollama" (#24819)
Reverts langchain-ai/langchain#24818

Overlooked discussion in
https://github.com/langchain-ai/langchain/pull/24801.
2024-07-30 14:23:36 +00:00
Jerron Lim
5abfc85fec langchain: init_chat_model() to support ChatOllama from langchain-ollama (#24818)
Description: Since moving away from `langchain-community` is
recommended, `init_chat_models()` should import ChatOllama from
`langchain-ollama` instead.
2024-07-30 10:17:38 -04:00
Eugene Yurtsev
4fab8996cf docs: Update pydantic compatibility (#24625)
Update pydantic compatibility. This will only be true after we release
the partner packages.
2024-07-29 22:19:00 -04:00
Jacob Lee
d6ca1474e0 docs[patch]: Adds key-value store to conceptual guide (#24798) 2024-07-29 18:45:16 -07:00
Erick Friis
cdaea17b3e cli/docs: llm integration template standardization (#24795) 2024-07-29 17:47:13 -07:00
Bagatur
a6d1fb4275 core[patch]: introduce ToolMessage.status (#24628)
Anthropic models (including via Bedrock and other cloud platforms)
accept a status/is_error attribute on tool messages/results
(specifically in `tool_result` content blocks for Anthropic API). Adding
a ToolMessage.status attribute so that users can set this attribute when
using those models
2024-07-29 14:01:53 -07:00
Isaac Francisco
78d97b49d9 [partner]: ollama llm fix (#24790) 2024-07-29 13:00:02 -07:00
maang-h
4bb1a11e02 community: Add MiniMaxChat bind_tools and structured output (#24310)
- **Description:** 
  - Add `bind_tools` method to support tool calling 
  - Add `with_structured_output` method to support structured output
2024-07-29 15:51:52 -04:00
John
0a2ff40fcc partners/unstructured: fix client api_url (#24680)
**Description:** Add empty string default for api_key and change
`server_url` to `url` to match existing loaders.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-29 11:16:41 -07:00
maang-h
bf685c242f docs: Standardize QianfanEmbeddingsEndpoint (#24786)
- **Description:** Standardize QianfanEmbeddingsEndpoint, include:
  - docstrings, the issue #21983 
  - model init arg names, the issue #20085
2024-07-29 13:19:24 -04:00
ccurme
9998e55936 core[patch]: support tool calls with non-pickleable args in tools (#24741)
Deepcopy raises with non-pickleable args.
2024-07-29 13:18:39 -04:00
Erick Friis
df78608741 mongodb: bson optional import (#24685) 2024-07-29 09:54:01 -07:00
M. Ali
c086410677 fix docs typos (#23668)
Thank you for contributing to LangChain!

- [x] **PR title**: "docs: fix multiple typos"

Co-authored-by: mohblnk <mohamed.ali@blnk.ai>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-29 16:10:55 +00:00
Pere Pasamonte
98175860ad community: Fix AWS DocumentDB similarity_search when filter is None (#24777)
**Description**

Fixes DocumentDBVectorSearch similarity_search when no filter is used;
it defaults to None but $match does not accept None, so changed default
to empty {} before pipeline is created.

**Issue**

AWS DocumentDB similarity search does not work when no filter is used.
Error msg: "the match filter must be an expression in an object" #24775

**Dependencies**

No dependencies

**Twitter handle**

https://x.com/perepasamonte

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-29 15:32:05 +00:00
Lennart J. Kurzweg
7da0597ecb partners[ollama]: Support seed parameter for ChatOllama (#24782)
## Description

Adds seed parameter to ChatOllama

## Resolves Issues
- #24703

## Dependency Changes
None

Co-authored-by: Lennart J. Kurzweg (Nx2) <git@nx2.site>
2024-07-29 15:15:20 +00:00
ccurme
e264ccf484 standard-tests[patch]: update groq and structured output test (#24781)
- Mixtral with Groq has started consistently failing tool calling tests.
Here we restrict testing to llama 3.1.
- `.schema` is deprecated in pydantic proper in favor of
`.model_json_schema`.
2024-07-29 11:10:01 -04:00
ZhangShenao
4a05679fdb patch[experimental] Fix prompt in GenerativeAgentMemory (#24771)
There is an issue with the prompt format in `GenerativeAgentMemory` ,
try to fix it.
The prompt is same as the one in method `_score_memory_importance`.
2024-07-29 07:02:31 -04:00
WU LIFU
2ba8393182 graph_transformers: bug fix for create_simple_model not passing in ll… (#24643)
issue: #24615 

descriptions: The _Graph pydantic model generated from
create_simple_model (which LLMGraphTransformer uses when allowed nodes
and relationships are provided) does not constrain the relationships
(source and target types, relationship type), and the node and
relationship properties with enums when using ChatOpenAI.
The issue is that when calling optional_enum_field throughout
create_simple_model the llm_type parameter is not passed in except for
when creating node type. Passing it into each call fixes the issue.

Co-authored-by: Lifu Wu <lifu@nextbillion.ai>
2024-07-29 07:00:56 -04:00
William FH
01ab2918a2 core[patch]: Respect injected in bound fns (#24733)
Since right now you cant use the nice injected arg syntas directly with
model.bind_tools()
2024-07-28 15:45:19 -07:00
Pavel
7fcfe7c1f4 openai[patch]: openai proxy added to base embeddings (#24539)
- [ ] **PR title**: "langchain-openai: openai proxy added to base
embeddings"

- [ ] **PR message**: 
    - **Description:** 
    Dear langchain developers,
You've already supported proxy for ChatOpenAI implementation in your
package. At the same time, if somebody needed to use proxy for chat, it
also could be necessary to be able to use it for OpenAIEmbeddings.
That's why I think it's important to add proxy support for OpenAI
embeddings. That's what I've done in this PR.

@baskaryan

---------

Co-authored-by: karpov <karpov@dohod.ru>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-07-28 20:54:13 +00:00
Lakshmi Peri
821196c4ee langchain-aws InMemoryVectorStore documentation updates (#24347)
Thank you for contributing to LangChain!

- [x] **PR title**: "Add documentaiton on InMemoryVectorStore driver for
MemoryDB to langchain-aws"
  - Langchain-aws repo :Add MemoryDB documentation 
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Added documentation on InMemoryVectorStore driver to
aws.mdx and usage example on MemoryDB clusuter
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
Add memorydb notebook to docs/docs/integrations/ folde


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-28 15:09:51 -04:00
Chuck Wooters
56c2a7f6d4 partners: add missing key name to Field() for ChatFireworks model (#24721)
**Description:** 

In the `ChatFireworks` class definition, the Field() call for the "stop"
("stop_sequences") parameter is missing the "default" keyword.

**Issue:**

Type checker reports "stop_sequences" as a missing arg (not recognizing
the default value is None)

**Dependencies:**

None

**Twitter handle:**

None
2024-07-28 18:40:21 +00:00
AmosDinh
c113682328 community:Add support for specifying document_loaders.firecrawl api url. (#24747)
community:Add support for specifying document_loaders.firecrawl api url.


Add support for specifying document_loaders.firecrawl api url. 
This is mainly to support the
[self-hosting](https://github.com/mendableai/firecrawl/blob/main/SELF_HOST.md)
option firecrawl provides. Eg. now I can specify localhost:....

The corresponding firecrawl class already provides functionality to pass
the argument. See here:
4c9d62f6d3/apps/python-sdk/firecrawl/firecrawl.py (L29)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-28 14:30:36 -04:00
Jerron Lim
df37c0d086 partners[ollama]: Support base_url for ChatOllama (#24719)
Add a class attribute `base_url` for ChatOllama to allow users to choose
a different URL to connect to.

Fixes #24555
2024-07-28 14:25:58 -04:00
Bagatur
8964f8a710 core: use mypy<1.11 (#24749)
Bug in mypy 1.11.0 blocking CI, see example:
https://github.com/langchain-ai/langchain/actions/runs/10127096903/job/28004492692?pr=24641
2024-07-27 16:37:02 -07:00
Moritz
b81fbc962c docs: fix typo in DSPy docs (#24748)
**Description:** Just a missing "r" in metric
**Dependencies:**N/A
2024-07-27 23:34:39 +00:00
Isaac Francisco
152427eca1 make image inputs compatible with langchain_ollama (#24619) 2024-07-26 17:39:57 -07:00
William FH
0535d72927 Add type() in error msg (#24723) 2024-07-26 16:48:45 -07:00
Eugene Yurtsev
9be6b5a20f core[patch]: Correct doc-string for InMemoryRateLimiter (#24730)
Correct the documentaiton string.
2024-07-26 22:17:22 +00:00
Erick Friis
d5b4b7e05c infra: langchain max python 3.11 for resolution (#24729) 2024-07-26 21:17:11 +00:00
Erick Friis
3c3d3e9579 infra: community max python 3.11 for resolution (#24728) 2024-07-26 21:10:14 +00:00
Cristi Burcă
174e7d2ab2 langchain: Make OutputFixingParser.from_llm() create a useable retry chain (#24687)
Description: OutputFixingParser.from_llm() creates a retry chain that
returns a Generation instance, when it should actually just return a
string.
Issue: https://github.com/langchain-ai/langchain/issues/24600
Twitter handle: scribu

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-26 13:55:47 -07:00
Bagatur
b3a23ddf93 integration releases (#24725)
Release anthropic, openai, groq, mistralai, robocorp
2024-07-26 12:30:10 -07:00
Bagatur
315223ce26 core[patch]: Release 0.2.24 (#24722) 2024-07-26 18:55:32 +00:00
Hayden Wolff
0345990a42 docs: Add NVIDIA NIMs to Model Tab and Feature Table (#24146)
**Description:** Add NVIDIA NIMs to Model Tab and LLM Feature Table

---------

Co-authored-by: Hayden Wolff <hwolff@nvidia.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-26 18:20:52 +00:00
Haijian Wang
cda3025ee1 Integrating the Yi family of models. (#24491)
Thank you for contributing to LangChain!

- [x] **PR title**: "community:add Yi LLM", "docs:add Yi Documentation"
                          
- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** This PR adds support for the Yi model to LangChain.
- **Dependencies:**
[langchain_core,requests,contextlib,typing,logging,json,langchain_community]
    - **Twitter handle:** 01.AI


- [x] **Add tests and docs**: I've added the corresponding documentation
to the relevant paths

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-26 10:57:33 -07:00
Bagatur
ad7581751f core[patch]: ChatPromptTemplate.init same as ChatPromptTemplate.from_… (#24486) 2024-07-26 10:48:39 -07:00
Marc Gibbons
cc451effd1 community[patch]: langchain_community.vectorstores.azuresearch Raise LangChainException instead of bare Exception (#23935)
Raise `LangChainException` instead of `Exception`. This alleviates the
need for library users to use bare try/except to handle exceptions
raised by `AzureSearch`.

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-26 15:59:06 +00:00
Jacob Lee
3d16dcd88d docs[patch]: Hide deprecated ChatGPT plugins page (#24704) 2024-07-26 08:24:33 -07:00
Eugene Yurtsev
3a5365a33e ai21: apply rate limiter in integration tests (#24717)
Apply rate limiter in integration tests
2024-07-26 11:15:36 -04:00
Eugene Yurtsev
03d62a737a together: Add rate limiter to integration tests (#24714)
Rate limit the integration tests to avoid getting 429s.
2024-07-26 10:59:33 -04:00
Eugene Yurtsev
e00cc74926 docs[minor]: Add how to guide for rate limiting a chat model (#24686)
Add how-to guide for rate limiting a chat model.
2024-07-26 14:29:06 +00:00
Diverrez morgan
c4d2a53f18 community: creation score_threshold in flashrank_rerank.py (#24016)
Description: 
add a optional score relevance threshold for select only coherent
document, it's in complement of top_n

Discussion:
add relevance score threshold in flashrank_rerank document compressors
#24013

Dependencies:
 no dependencies

---------

Co-authored-by: Benjamin BERNARD <benjamin.bernard@openpathview.fr>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-26 13:34:39 +00:00
Cong Peng
190988d93e community: Add parameter allow_dangerous_requests to WebResearchRetriever.from_llm construct (#24712)
**Description:** To avoid ValueError when construct the retriever from
method `from_llm()`.
2024-07-26 06:24:58 -07:00
monysun
5f593c172a community: fix dashcope embeddings embed_query func post too much req to api (#24707)
the fuc of embed_query of dashcope embeddings send a str param, and in
the embed_with_retry func will send error content to api
2024-07-26 12:44:07 +00:00
yonarw
b65ac8d39c community[minor]: Self query retriever for HANA Cloud Vector Engine (#24494)
Description:

- This PR adds a self query retriever implementation for SAP HANA Cloud
Vector Engine. The retriever supports all operators except for contains.
- Issue: N/A
- Dependencies: no new dependencies added

**Add tests and docs:**
Added integration tests to:
libs/community/tests/unit_tests/query_constructors/test_hanavector.py

**Documentation for self query retriever:**
/docs/integrations/retrievers/self_query/hanavector_self_query.ipynb

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-07-26 06:56:51 +00:00
nobbbbby
4f3b4fc7fe community[patch]: Extend Baichuan model with tool support (#24529)
**Description:** Expanded the chat model functionality to support tools
in the 'baichuan.py' file. Updated module imports and added tool object
handling in message conversions. Additional changes include the
implementation of tool binding and related unit tests. The alterations
offer enhanced model capabilities by enabling interaction with tool-like
objects.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-25 23:20:44 -07:00
Rave Harpaz
ee399e3ec5 community[patch]: Add OCI Generative AI tool and structured output support (#24693)
- [x] **PR title**: 
  community: Add OCI Generative AI tool and structured output support


- [x] **PR message**: 
- **Description:** adding tool calling and structured output support for
chat models offered by OCI Generative AI services. This is an update to
our last PR 22880 with changes in
/langchain_community/chat_models/oci_generative_ai.py
    - **Issue:** NA
    - **Dependencies:** NA
    - **Twitter handle:** NA


- [x] **Add tests and docs**: 
  1. we have updated our unit tests
2. we have updated our documentation under
/docs/docs/integrations/chat/oci_generative_ai.ipynb


- [x] **Lint and test**: `make format`, `make lint` and `make test` we
run successfully

---------

Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com>
Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
2024-07-25 23:19:00 -07:00
Yuki Watanabe
2b6a262f84 community[patch]: Replace filters argument to filter in DatabricksVectorSearch (#24530)
The
[DatabricksVectorSearch](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/vectorstores/databricks_vector_search.py#L21)
class exposes similarity search APIs with argument `filters`, which is
inconsistent with other VS classes who uses `filter` (singular). This PR
updates the argument and add alias for backward compatibility.

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
2024-07-25 21:20:18 -07:00
Leonid Ganeline
148766ddc1 docs: integrations missed links (#24681)
Added missed links; missed provider page
2024-07-25 20:38:25 -07:00
Sunish Sheth
59880a9147 community[patch]: mlflow handle empty chunk(#24689) 2024-07-25 20:36:29 -07:00
Eugene Yurtsev
20690db482 core[minor]: Add BaseModel.rate_limiter, RateLimiter abstraction and in-memory implementation (#24669)
This PR proposes to create a rate limiter in the chat model directly,
and would replace: https://github.com/langchain-ai/langchain/pull/21992

It resolves most of the constraints that the Runnable rate limiter
introduced:

1. It's not annoying to apply the rate limiter to existing code; i.e., 
possible to roll out the change at the location where the model is
instantiated,
rather than at every location where the model is used! (Which is
necessary
   if the model is used in different ways in a given application.)
2. batch rate limiting is enforced properly
3. the rate limiter works correctly with streaming
4. the rate limiter is aware of the cache
5. The rate limiter can take into account information about the inputs
into the
model (we can add optional inputs to it down-the road together with
outputs!)

The only downside is that information will not be properly reflected in
tracing
as we don't have any metadata evens about a rate limiter. So the total
time
spent on a model invocation will be: 

* time spent waiting for the rate limiter
* time spend on the actual model request

## Example

```python
from langchain_core.rate_limiters import InMemoryRateLimiter
from langchain_groq import ChatGroq

groq = ChatGroq(rate_limiter=InMemoryRateLimiter(check_every_n_seconds=1))
groq.invoke('hello')
```
2024-07-26 03:03:34 +00:00
Eugene Yurtsev
c623ae6661 experimental[patch]: Fix import test (#24672)
Import test was misconfigured, the glob wasn't returning any file paths
2024-07-25 22:14:40 -04:00
Chaunte W. Lacewell
69eacaa887 Community[minor]: Update VDMS vectorstore (#23729)
**Description:** 
- This PR exposes some functions in VDMS vectorstore, updates VDMS
related notebooks, updates tests, and upgrade version of VDMS (>=0.0.20)

**Issue:** N/A

**Dependencies:** 
- Update vdms>=0.0.20
2024-07-25 22:13:04 -04:00
sykp241095
703491e824 docs: update another TiDB Cloud link as it is already public beta (#24694)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-25 18:39:55 -07:00
Nuno Campos
8734cabc09 core: Don't draw None edge labels (#24690)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-25 22:12:39 +00:00
Jacob Lee
ce067c19e9 docs[patch]: Simplify tool calling guide, improve tool calling conceptual guide (#24637)
Lots of duplicated content from concepts, missing pointers to the second
half of the tool calling loop

Simpler + more focused + a more prominent link to the second half of the
loop was what I was aiming for, but down to be more conservative and
just more prominently link the "passing tools back to the model" guide.

I have also moved the tool calling conceptual guide out from under
`Structured Output` (while leaving a small section for structured
output-specific information) and added more content. The existing
`#functiontool-calling` link will go to this new section.
2024-07-25 14:39:14 -07:00
Bagatur
4840db6892 docs: standardize groq chat model docs (#24616)
part of #22296

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-25 14:10:49 -07:00
Isaac Francisco
218c554c4f [docs]: add doctoring to ChatTogether (#24636) 2024-07-25 14:10:41 -07:00
Bagatur
0fe29b4343 docs: standardize Together docs (#24617)
Part of #22296

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-25 14:10:31 -07:00
Isaac Francisco
5c7e589aaf deprecating ollama_functions (#24632) 2024-07-25 13:50:04 -07:00
KyrianC
0fdbaf4a8d community: fix ChatEdenAI + EdenAI Tools (#23715)
Fixes for Eden AI Custom tools and ChatEdenAI:
- add missing import in __init__ of chat_models
- add `args_schema` to custom tools. otherwise '__arg1' would sometimes
be passed to the `run` method
- fix IndexError when no human msg is added in ChatEdenAI
2024-07-25 15:19:14 -04:00
Daniel Campos
871bf5a841 docs: Update snowflake.mdx for arctic-m-v1.5 (#24678)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-25 17:48:54 +00:00
Leonid Ganeline
8b7cffc363 docs: integrations missed references (#24631)
**Issue:** Several packages are not referenced in the `providers` pages.

**Fix:** Added the missed references. Fixed the notebook formatting.
2024-07-25 13:26:46 -04:00
ccurme
58dd69f7f2 core[patch]: fix mutating tool calls (#24677)
In some cases tool calls are mutated when passed through a tool.
2024-07-25 16:46:36 +00:00
ccurme
dfbd12b384 mistral[patch]: translate tool call IDs to mistral compatible format (#24668)
Mistral appears to have added validation for the format of its tool call
IDs:

`{"object":"error","message":"Tool call id was abc123 but must be a-z,
A-Z, 0-9, with a length of
9.","type":"invalid_request_error","param":null,"code":null}`

This breaks compatibility of messages from other providers. Here we add
a function that converts any string to a Mistral-valid tool call ID, and
apply it to incoming messages.
2024-07-25 12:39:32 -04:00
maang-h
38d30e285a docs: Standardize BaichuanTextEmbeddings docstrings (#24674)
- **Description:** Standardize BaichuanTextEmbeddings docstrings.
- **Issue:** the issue #21983
2024-07-25 12:12:00 -04:00
Eugene Yurtsev
89bcca3542 experimental[patch]: Bump core (#24671) 2024-07-25 09:05:43 -07:00
rick-SOPTIM
cd563fb628 community[minor]: passthrough auth parameter on requests to Ollama-LLMs (#24068)
Thank you for contributing to LangChain!

**Description:**
This PR allows users of `langchain_community.llms.ollama.Ollama` to
specify the `auth` parameter, which is then forwarded to all internal
calls of `requests.request`. This works in the same way as the existing
`headers` parameters. The auth parameter enables the usage of the given
class with Ollama instances, which are secured by more complex
authentication mechanisms, that do not only rely on static headers. An
example are AWS API Gateways secured by the IAM authorizer, which
expects signatures dynamically calculated on the specific HTTP request.

**Issue:**

Integrating a remote LLM running through Ollama using
`langchain_community.llms.ollama.Ollama` only allows setting static HTTP
headers with the parameter `headers`. This does not work, if the given
instance of Ollama is secured with an authentication mechanism that
makes use of dynamically created HTTP headers which for example may
depend on the content of a given request.

**Dependencies:**

None

**Twitter handle:**

None

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-25 15:48:35 +00:00
남광우
256bad3251 core[minor]: Support asynchronous in InMemoryVectorStore (#24472)
### Description

* support asynchronous in InMemoryVectorStore
* since embeddings might be possible to call asynchronously, ensure that
both asynchronous and synchronous functions operate correctly.
2024-07-25 11:36:55 -04:00
Luca Dorigo
5fdbdd6bec community[patch]: Fix invalid iohttp verify parameter (#24655)
Should fix https://github.com/langchain-ai/langchain/issues/24654
2024-07-25 11:09:21 -04:00
Daniel Glogowski
221486687a docs: updated CHATNVIDIA notebooks (#24584)
Updated notebook for tool calling support in chat models
2024-07-25 09:22:53 -04:00
Ken Jenney
d6631919f4 docs: tool calling is enabled in ChatOllama (#24665)
Description: According to this page:
https://python.langchain.com/v0.2/docs/integrations/chat/ollama_functions/
ChatOllama does support Tool Calling.
Issue: The documentation is incorrect
Dependencies: None
Twitter handle: NA
2024-07-25 13:21:30 +00:00
sykp241095
235eb38d3e docs: update TiDB Cloud links as vector search feature becomes public beta (#24667)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-25 13:20:02 +00:00
Eugene Yurtsev
7dd6b32991 core[minor]: Add InMemoryRateLimiter (#21992)
This PR introduces the following Runnables:

1. BaseRateLimiter: an abstraction for specifying a time based rate
limiter as a Runnable
2. InMemoryRateLimiter: Provides an in-memory implementation of a rate
limiter

## Example

```python

from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda
from datetime import datetime

foo = InMemoryRateLimiter(requests_per_second=0.5)

def meow(x):
    print(datetime.now().strftime("%H:%M:%S.%f"))
    return x

chain = foo | meow

for _ in range(10):
    print(chain.invoke('hello'))
```

Produces:

```
17:12:07.530151
hello
17:12:09.537932
hello
17:12:11.548375
hello
17:12:13.558383
hello
17:12:15.568348
hello
17:12:17.578171
hello
17:12:19.587508
hello
17:12:21.597877
hello
17:12:23.607707
hello
17:12:25.617978
hello
```


![image](https://github.com/user-attachments/assets/283af59f-e1e1-408b-8e75-d3910c3c44cc)


## Interface

The rate limiter uses the following interface for acquiring a token:

```python
class BaseRateLimiter(Runnable[Input, Output], abc.ABC):
  @abc.abstractmethod
  def acquire(self, *, blocking: bool = True) -> bool:
      """Attempt to acquire the necessary tokens for the rate limiter.```
```

The flag `blocking` has been added to the abstraction to allow
supporting streaming (which is easier if blocking=False).

## Limitations

- The rate limiter is not designed to work across different processes.
It is an in-memory rate limiter, but it is thread safe.
- The rate limiter only supports time-based rate limiting. It does not
take into account the size of the request or any other factors.
- The current implementation does not handle streaming inputs well and
will consume all inputs even if the rate limit has been reached. Better
support for streaming inputs will be added in the future.
- When the rate limiter is combined with another runnable via a
RunnableSequence, usage of .batch() or .abatch() will only respect the
average rate limit. There will be bursty behavior as .batch() and
.abatch() wait for each step to complete before starting the next step.
One way to mitigate this is to use batch_as_completed() or
abatch_as_completed().

## Bursty behavior in `batch` and `abatch`

When the rate limiter is combined with another runnable via a
RunnableSequence, usage of .batch() or .abatch() will only respect the
average rate limit. There will be bursty behavior as .batch() and
.abatch() wait for each step to complete before starting the next step.

This becomes a problem if users are using `batch` and `abatch` with many
inputs (e.g., 100). In this case, there will be a burst of 100 inputs
into the batch of the rate limited runnable.

1. Using a RunnableBinding

The API would look like:

```python
from langchain_core.runnables import InMemoryRateLimiter, RunnableLambda

rate_limiter = InMemoryRateLimiter(requests_per_second=0.5)

def meow(x):
    return x

rate_limited_meow = RunnableLambda(meow).with_rate_limiter(rate_limiter)
```

2. Another option is to add some init option to RunnableSequence that
changes `.batch()` to be depth first (e.g., by delegating to
`batch_as_completed`)

```python
RunnableSequence(first=rate_limiter, last=model, how='batch-depth-first')
```

Pros: Does not require Runnable Binding
Cons: Feels over-complicated
2024-07-25 01:34:03 +00:00
Oleg Kulyk
4b1b7959a2 community[minor]: Add ScrapingAnt Loader Community Integration (#24514)
Added [ScrapingAnt](https://scrapingant.com/) Web Loader integration.
ScrapingAnt is a web scraping API that allows extracting web page data
into accessible and well-formatted markdown.

Description: Added ScrapingAnt web loader for retrieving web page data
as markdown
Dependencies: scrapingant-client
Twitter: @WeRunTheWorld3

---------

Co-authored-by: Oleg Kulyk <oleg@scrapingant.com>
2024-07-24 21:11:43 -04:00
Jacob Lee
afee851645 docs[patch]: Fix image caption document loader page and typo on custom tools page (#24635) 2024-07-24 17:16:18 -07:00
Jacob Lee
a73e2222d4 docs[patch]: Updates LLM caching, HF sentence transformers, and DDG pages (#24633) 2024-07-24 16:58:05 -07:00
Erick Friis
e160b669c8 infra: add unstructured api key to release (#24638) 2024-07-24 16:47:24 -07:00
John
d59c656ea5 unstructured, community, initialize langchain-unstructured package (#22779)
#### Update (2): 
A single `UnstructuredLoader` is added to handle both local and api
partitioning. This loader also handles single or multiple documents.

#### Changes in `community`:
Changes here do not affect users. In the initial process of using the
SDK for the API Loaders, the Loaders in community were refactored.
Other changes include:
The `UnstructuredBaseLoader` has a new check to see if both
`mode="paged"` and `chunking_strategy="by_page"`. It also now has
`Element.element_id` added to the `Document.metadata`.
`UnstructuredAPIFileLoader` and `UnstructuredAPIFileIOLoader`. As such,
now both directly inherit from `UnstructuredBaseLoader` and initialize
their `file_path`/`file` attributes respectively and implement their own
`_post_process_elements` methods.

--------
#### Update:
New SDK Loaders in a [partner
package](https://python.langchain.com/v0.1/docs/contributing/integrations/#partner-package-in-langchain-repo)
are introduced to prevent breaking changes for users (see discussion
below).

##### TODO:
- [x] Test docstring examples
--------
- **Description:** UnstructuredAPIFileIOLoader and
UnstructuredAPIFileLoader calls to the unstructured api are now made
using the unstructured-client sdk.
- **New Dependencies:** unstructured-client

- [x] **Add tests and docs**: If you're adding a new integration, please
include
- [x] a test for the integration, preferably unit tests that do not rely
on network access,
- [x] update the description in
`docs/docs/integrations/providers/unstructured.mdx`
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

TODO:
- [x] Update
https://python.langchain.com/v0.1/docs/integrations/document_loaders/unstructured_file/#unstructured-api
-
`langchain/docs/docs/integrations/document_loaders/unstructured_file.ipynb`
- The description here needs to indicate that users should install
`unstructured-client` instead of `unstructured`. Read over closely to
look for any other changes that need to be made.
- [x] Update the `lazy_load` method in `UnstructuredBaseLoader` to
handle json responses from the API instead of just lists of elements.
- This method may need to be overwritten by the API loaders instead of
changing it in the `UnstructuredBaseLoader`.
- [x] Update the documentation links in the class docstrings (the
Unstructured documents have moved)
- [x] Update Document.metadata to include `element_id` (see thread
[here](https://unstructuredw-kbe4326.slack.com/archives/C044N0YV08G/p1718187499818419))

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
2024-07-24 23:21:20 +00:00
Leonid Ganeline
2394807033 docs: fix ChatGooglePalm fix (#24629)
**Issue:** now the
[ChatGooglePalm](https://python.langchain.com/v0.2/docs/integrations/vectorstores/scann/#retrievalqa-demo)
class is not parsed and do not presented in the "API Reference:" line.

**PR:** [Fixed
it](https://langchain-7n5k5wkfs-langchain.vercel.app/v0.2/docs/integrations/vectorstores/scann/#retrievalqa-demo)
by properly importing.
2024-07-24 18:09:08 -04:00
Joel Akeret
acfce30017 Adding compatibility for OllamaFunctions with ImagePromptTemplate (#24499)
- [ ] **PR title**: "experimental: Adding compatibility for
OllamaFunctions with ImagePromptTemplate"

- [ ] **PR message**: 
- **Description:** Removes the outdated
`_convert_messages_to_ollama_messages` method override in the
`OllamaFunctions` class to ensure that ollama multimodal models can be
invoked with an image.
    - **Issue:** #24174

---------

Co-authored-by: Joel Akeret <joel.akeret@ti&m.com>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-24 14:57:05 -07:00
Erick Friis
8f3c052db1 cli: release 0.0.26 (#24623)
- **cli: remove snapshot flag from pytest defaults**
- **x**
- **x**
2024-07-24 13:13:58 -07:00
ChengZi
29a3b3a711 partners[milvus]: add dynamic field (#24544)
add dynamic field feature to langchain_milvus
more unittest, more robustic

plan to deprecate the `metadata_field` in the future, because it's
function is the same as `enable_dynamic_field`, but the latter one is a
more advanced concept in milvus

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-24 20:01:58 +00:00
Erick Friis
20fe4deea0 milvus: release 0.1.3 (#24624) 2024-07-24 13:01:27 -07:00
Erick Friis
3a55f4bfe9 cli: remove snapshot flag from pytest defaults (#24622) 2024-07-24 19:41:01 +00:00
Isaac Francisco
fea9ff3831 docs: add tables for search and code interpreter tools (#24586) 2024-07-24 10:51:39 -07:00
Eugene Yurtsev
b55f6105c6 community[patch]: Add linter to prevent further usage of root_validator and validator (#24613)
This linter is meant to move development to use __init__ instead of
root_validator and validator.

We need to investigate whether we need to lint some of the functionality
of Field (e.g., `lt` and `gt`, `alias`)

`alias` is the one that's most popular:

(community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep
" Field(" | grep "alias=" | wc -l
144

(community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep
" Field(" | grep "ge=" | wc -l
10

(community) ➜ community git:(eugene/add_linter_to_community) ✗ git grep
" Field(" | grep "gt=" | wc -l
4
2024-07-24 12:35:21 -04:00
Anush
4585eaef1b qdrant: Fix vectors_config access (#24606)
## Description

Fixes #24558 by accessing `vectors_config` after asserting it to be a
dict.
2024-07-24 10:54:33 -04:00
ccurme
f337f3ed36 docs: update chain migration guide (#24501)
- Update `ConversationChain` example to show use without session IDs;
- Fix a minor bug (specify history_messages_key).
2024-07-24 10:45:00 -04:00
maang-h
22175738ac docs: Add MongoDBChatMessageHistory docstrings (#24608)
- **Description:** Add MongoDBChatMessageHistory rich docstrings.
- **Issue:** the issue #21983
2024-07-24 10:12:44 -04:00
Anindyadeep
12c3454fd9 [Community] PremAI Tool Calling Functionality (#23931)
This PR is under WIP and adds the following functionalities:

- [X] Supports tool calling across the langchain ecosystem. (However
streaming is not supported)
- [X] Update documentation
2024-07-24 09:53:58 -04:00
Vishnu Nandakumar
e271965d1e community: retrievers: added capability for using Product Quantization as one of the retriever. (#22424)
- [ ] **Community**: "Retrievers: Product Quantization"
- [X] This PR adds Product Quantization feature to the retrievers to the
Langchain Community. PQ is one of the fastest retrieval methods if the
embeddings are rich enough in context due to the concepts of
quantization and representation through centroids
    - **Description:** Adding PQ as one of the retrievers
    - **Dependencies:** using the package nanopq for this PR
    - **Twitter handle:** vishnunkumar_


- [X] **Add tests and docs**: If you're adding a new integration, please
include
   - [X] Added unit tests for the same in the retrievers.
   - [] Will add an example notebook subsequently

- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/ -
done the same

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-24 13:52:15 +00:00
stydxm
b9bea36dd4 community: fix typo in warning message (#24597)
- **Description:** 
  This PR fixes a small typo in a warning message
- **Issue:**

![](https://github.com/user-attachments/assets/5aa57724-26c5-49f6-8bc1-5a54bb67ed49)
There were double `Use` and double `instead`
2024-07-24 13:19:07 +00:00
cüre
da06d4d7af community: update finetuned model cost for 4o-mini (#24605)
- **Description:** adds model price for. reference:
https://openai.com/api/pricing/
- **Issue:** -
- **Dependencies:** -
- **Twitter handle:** cureef
2024-07-24 13:17:26 +00:00
Philippe PRADOS
5f73c836a6 openai[small]: Add the new model: gpt-4o-mini (#24594) 2024-07-24 09:14:48 -04:00
Mateusz Szewczyk
597be7d501 docs: Update IBM docs about information to pass client into WatsonxLLM and WatsonxEmbeddings object. (#24602)
Thank you for contributing to LangChain!

- [x] **PR title**: Update IBM docs about information to pass client
into WatsonxLLM and WatsonxEmbeddings object.


- [x] **PR message**: 
- **Description:** Update IBM docs about information to pass client into
WatsonxLLM and WatsonxEmbeddings object.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-07-24 09:12:13 -04:00
Jacob Lee
379803751e docs[patch]: Remove very old document comparison notebook (#24587) 2024-07-23 22:25:35 -07:00
ZhangShenao
ad18afc3ec community[patch]: Fix param spelling error in ElasticsearchChatMessageHistory (#24589)
Fix param spelling error in `ElasticsearchChatMessageHistory`
2024-07-23 19:29:42 -07:00
Isaac Francisco
464a525a5a [partner]: minor change to embeddings for Ollama (#24521) 2024-07-24 00:00:13 +00:00
Aayush Kataria
0f45ac4088 LangChain Community: VectorStores: Azure Cosmos DB Filtered Vector Search (#24087)
Thank you for contributing to LangChain!

- This PR adds vector search filtering for Azure Cosmos DB Mongo vCore
and NoSQL.


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-23 16:59:23 -07:00
Gareth
ac41c97d21 pinecone: Add embedding Inference Support (#24515)
**Description**

Add support for Pinecone hosted embedding models as
`PineconeEmbeddings`. Replacement for #22890

**Dependencies**
Add `aiohttp` to support async embeddings call against REST directly

- [x] **Add tests and docs**: If you're adding a new integration, please
include

Added `docs/docs/integrations/text_embedding/pinecone.ipynb`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Twitter: `gdjdg17`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-23 22:50:28 +00:00
ccurme
aaf788b7cb docs[patch]: fix chat model tabs in runnable-as-tool guide (#24580) 2024-07-23 18:36:01 -04:00
Bagatur
47ae06698f docs: update ChatModelTabs defaults (#24583) 2024-07-23 21:56:30 +00:00
Erick Friis
03881c6743 docs: fix hf embeddings install (#24577) 2024-07-23 21:03:30 +00:00
ccurme
2d6b0bf3e3 core[patch]: add to RunnableLambda docstring (#24575)
Explain behavior when function returns a runnable.
2024-07-23 20:46:44 +00:00
Erick Friis
ee3955c68c docs: add tool calling for ollama (#24574) 2024-07-23 20:33:23 +00:00
Carlos André Antunes
325068bb53 community: Fix azure_openai.py (#24572)
In some lines its trying to read a key that do not exists yet. In this
cases I changed the direct access to dict.get() method


- [ x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-07-23 16:22:21 -04:00
Bagatur
bff6ca78a2 docs: duplicate how to link (#24569) 2024-07-23 18:52:05 +00:00
Nik Jmaeff
6878bc39b5 langchain: fix TrajectoryEvalChain.prep_inputs (#19959)
The previous implementation would never be called.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-23 18:37:39 +00:00
Bagatur
55e66aa40c langchain[patch]: init_chat_model support ChatBedrockConverse (#24564) 2024-07-23 11:07:38 -07:00
Bagatur
9b7db08184 experimental[patch]: Release 0.0.63 (#24563) 2024-07-23 16:28:37 +00:00
Bagatur
8691a5a37f community[patch]: Release 0.2.10 (#24560) 2024-07-23 09:24:57 -07:00
Bagatur
4919d5d6df langchain[patch]: Release 0.2.11 (#24559) 2024-07-23 09:18:44 -07:00
Bagatur
918e1c8a93 core[patch]: Release 0.2.23 (#24557) 2024-07-23 09:01:18 -07:00
Lance Martin
58def6e34d Add tool calling example to Ollama ntbk (#24522) 2024-07-23 15:58:54 +00:00
Leonid Ganeline
e787532479 langchain: globals fix (#21281)
Issue: functions from `globals`, like the `get_debug` are placed in the
init.py file. As a result, they don't listed in the API Reference docs.
[See
this](https://langchain-9jq1kef7i-langchain.vercel.app/v0.2/docs/how_to/debugging/#set_debugtrue)
and [broken
this](https://api.python.langchain.com/en/latest/globals/langchain.globals.set_debug.html).
Change: moved code from init.py into the `globals.py` file and removed
`globals` directory. Similar to: #21266
BTW `globals` in core implemented exactly inside a file not inside a
folder.
2024-07-23 11:23:18 -04:00
Ben Chambers
e80b0932ee community[patch]: small fixes to link extractors (#24528)
- **Description:** small fixes to imports / types in the link extraction
work

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-23 14:28:06 +00:00
Morteza Hosseini
9e06991aae community[patch]: Update URL to the 2markdown API (#24546)
Update the URL to Markdown endpoint.

API information is available here: https://2markdown.com/docs#url2md
2024-07-23 14:27:55 +00:00
ZhangShenao
a14e02ab33 core[patch]: Fix word spelling error in globals.py (#24532)
Fix word spelling error in `globals.py`

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-23 14:27:16 +00:00
maang-h
378db2e1a5 docs: Add RedisChatMessageHistory docstrings (#24548)
- **Description:** Add `RedisChatMessageHistory ` rich docstrings.
- **Issue:** the issue #21983

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-23 14:23:46 +00:00
ccurme
a197a8e184 openai[patch]: move test (#24552)
No-override tests (https://github.com/langchain-ai/langchain/pull/24407)
include a condition that integrations not implement additional tests.
2024-07-23 10:22:22 -04:00
Eugene Yurtsev
0bb54ab9f0 CI: Temporarily disable min version checking on pull request (#24551)
Short term to fix CI
2024-07-23 14:12:08 +00:00
Eugene Yurtsev
f47b4edcc2 standard-test: Fix typo in skipif for chat model integration tests (#24553) 2024-07-23 10:11:01 -04:00
Jesse Wright
837a3d400b chore(docs): SQARQL -> SPARQL typo fix (#24536)
nit picky typo fix
2024-07-23 13:39:34 +00:00
Eugene Yurtsev
20b72a044c standard-tests: Add BaseModel variations tests to with_structured_output (#24527)
After this standard tests will test with the following combinations:

1. pydantic.BaseModel
2. pydantic.v1.BaseModel

If ran within a matrix, it'll covert both pydantic.BaseModel originating
from
pydantic 1 and the one defined in pydantic 2.
2024-07-23 09:01:26 -04:00
Bagatur
70c71efcab core[patch]: merge_content fix (#24526) 2024-07-22 22:20:22 -07:00
Ben Chambers
a5a3d28776 community[patch]: Remove targets_table from C* GraphVectorStore (#24502)
- **Description:** Remove the unnecessary `targets_table` parameter
2024-07-22 22:09:36 -04:00
Alexander Golodkov
2a70a07aad community[minor]: added new document loaders based on dedoc library (#24303)
### Description
This pull request added new document loaders to load documents of
various formats using [Dedoc](https://github.com/ispras/dedoc):
  - `DedocFileLoader` (determine file types automatically and parse)
  - `DedocPDFLoader` (for `PDF` and images parsing)
- `DedocAPIFileLoader` (determine file types automatically and parse
using Dedoc API without library installation)

[Dedoc](https://dedoc.readthedocs.io) is an open-source library/service
that extracts texts, tables, attached files and document structure
(e.g., titles, list items, etc.) from files of various formats. The
library is actively developed and maintained by a group of developers.

`Dedoc` supports `DOCX`, `XLSX`, `PPTX`, `EML`, `HTML`, `PDF`, images
and more.
Full list of supported formats can be found
[here](https://dedoc.readthedocs.io/en/latest/#id1).
For `PDF` documents, `Dedoc` allows to determine textual layer
correctness and split the document into paragraphs.


### Issue
This pull request extends variety of document loaders supported by
`langchain_community` allowing users to choose the most suitable option
for raw documents parsing.

### Dependencies
The PR added a new (optional) dependency `dedoc>=2.2.5` ([library
documentation](https://dedoc.readthedocs.io)) to the
`extended_testing_deps.txt`

### Twitter handle
None

### Add tests and docs
1. Test for the integration:
`libs/community/tests/integration_tests/document_loaders/test_dedoc.py`
2. Example notebook:
`docs/docs/integrations/document_loaders/dedoc.ipynb`
3. Information about the library:
`docs/docs/integrations/providers/dedoc.mdx`

### Lint and test

Done locally:

  - `make format`
  - `make lint`
  - `make integration_tests`
  - `make docs_build` (from the project root)

---------

Co-authored-by: Nasty <bogatenkova.anastasiya@mail.ru>
2024-07-23 02:04:53 +00:00
Ben Chambers
5ac936a284 community[minor]: add document transformer for extracting links (#24186)
- **Description:** Add a DocumentTransformer for executing one or more
`LinkExtractor`s and adding the extracted links to each document.
- **Issue:** n/a
- **Depedencies:** none

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-22 22:01:21 -04:00
Jacob Lee
3c4652c906 docs[patch]: Hide OllamaFunctions now that Ollama supports tool calling (#24523) 2024-07-22 17:56:51 -07:00
Erick Friis
2c6b9e8771 standard-tests: add override check (#24407) 2024-07-22 23:38:01 +00:00
Nithish Raghunandanan
1639ccfd15 couchbase: [patch] Return chat message history in order (#24498)
**Description:** Fixes an issue where the chat message history was not
returned in order. Fixed it now by returning based on timestamps.

- [x] **Add tests and docs**: Updated the tests to check the order
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-22 23:30:29 +00:00
C K Ashby
ab036c1a4c docs: Update .run() to .invoke() (#24520) 2024-07-22 14:21:33 -07:00
Erick Friis
3dce2e1d35 all: add release notes to pypi (#24519) 2024-07-22 13:59:13 -07:00
Bagatur
c48e99e7f2 docs: fix sql db note (#24505) 2024-07-22 13:30:29 -07:00
Bagatur
8a140ee77c core[patch]: don't serialize BasePromptTemplate.input_types (#24516)
Candidate fix for #24513
2024-07-22 13:30:16 -07:00
MarkYQJ
df357f82ca ignore the first turn to apply "history" mechanism (#14118)
This will generate a meaningless string "system: " for generating
condense question; this increases the probability to make an improper
condense question and misunderstand user's question. Below is a case
- Original Question: Can you explain the arguments of Meilisearch?
- Condense Question
  - What are the benefits of using Meilisearch? (by CodeLlama)
  - What are the reasons for using Meilisearch? (by GPT-4)

The condense questions (not matter from CodeLlam or GPT-4) are different
from the original one.

By checking the content of each dialogue turn, generating history string
only when the dialog content is not empty.
Since there is nothing before first turn, the "history" mechanism will
be ignored at the very first turn.

Doing so, the condense question will be "What are the arguments for
using Meilisearch?".

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-22 20:11:17 +00:00
Bagatur
236e957abb core,groq,openai,mistralai,robocorp,fireworks,anthropic[patch]: Update BaseModel subclass and instance checks to handle both v1 and proper namespaces (#24417)
After this PR chat models will correctly handle pydantic 2 with
bind_tools and with_structured_output.


```python
import pydantic
print(pydantic.__version__)
```
2.8.2

```python
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

class Add(BaseModel):
    x: int
    y: int

model = ChatOpenAI().bind_tools([Add])
print(model.invoke('2 + 5').tool_calls)

model = ChatOpenAI().with_structured_output(Add)
print(type(model.invoke('2 + 5')))
```

```
[{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_PNUFa4pdfNOYXxIMHc6ps2Do', 'type': 'tool_call'}]
<class '__main__.Add'>
```


```python
from langchain_openai import ChatOpenAI
from pydantic.v1 import BaseModel, Field

class Add(BaseModel):
    x: int
    y: int

model = ChatOpenAI().bind_tools([Add])
print(model.invoke('2 + 5').tool_calls)

model = ChatOpenAI().with_structured_output(Add)
print(type(model.invoke('2 + 5')))
```

```python
[{'name': 'Add', 'args': {'x': 2, 'y': 5}, 'id': 'call_hhiHYP441cp14TtrHKx3Upg0', 'type': 'tool_call'}]
<class '__main__.Add'>
```

Addresses issues: https://github.com/langchain-ai/langchain/issues/22782

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-22 20:07:39 +00:00
C K Ashby
199e64d372 Please spell Lex's name correctly Fridman (#24517)
https://www.youtube.com/watch?v=ZIyB9e_7a4c

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-22 19:38:32 +00:00
Erick Friis
1f01c0fd98 infra: remove core from min version pr testing (#24507)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-22 17:46:15 +00:00
Naka Masato
884f76e05a fix: load google credentials properly in GoogleDriveLoader (#12871)
- **Description:** 
- Fix #12870: set scope in `default` func (ref:
https://google-auth.readthedocs.io/en/master/reference/google.auth.html)
- Moved the code to load default credentials to the bottom for clarity
of the logic
	- Add docstring and comment for each credential loading logic
- **Issue:** https://github.com/langchain-ai/langchain/issues/12870
- **Dependencies:** no dependencies change
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** @gymnstcs

<!-- If no one reviews your PR within a few days, please @-mention one
of @baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-22 17:43:33 +00:00
Erick Friis
a45337ea07 ollama: release 0.1.0 (#24510) 2024-07-22 10:35:26 -07:00
Isaac Francisco
1318d534af [docs]: minor react change (#24509) 2024-07-22 10:25:01 -07:00
Jorge Piedrahita Ortiz
10e3982b59 community: sambanova integration minor changes (#24503)
- Minor changes in samabanova llm integration 
  - default api 
  - docstrings
- minor changes in docs
2024-07-22 17:06:35 +00:00
maang-h
721f709dec community: Improve QianfanChatEndpoint tool result to model (#24466)
- **Description:** `QianfanChatEndpoint` When using tool result to
answer questions, the content of the tool is required to be in Dict
format. Of course, this can require users to return Dict format when
calling the tool, but in order to be consistent with other Chat Models,
I think such modifications are necessary.
2024-07-22 11:29:00 -04:00
Chaunte W. Lacewell
02f0a29293 Cookbook: Add Visual RAG example using VDMS (#24353)
- **Description:** Adding notebook to demonstrate visual RAG which uses
both video scene description generated by open source vision models (ex.
video-llama, video-llava etc.) as text embeddings and frames as image
embeddings to perform vector similarity search using VDMS.
  - **Issue:** N/A
  - **Dependencies:** N/A
2024-07-22 11:16:06 -04:00
ccurme
dcba7df2fe community[patch]: deprecate langchain_community Chroma in favor of langchain_chroma (#24474) 2024-07-22 11:00:13 -04:00
ccurme
0f7569ddbc core[patch]: enable RunnableWithMessageHistory without config (#23775)
Feedback that `RunnableWithMessageHistory` is unwieldy compared to
ConversationChain and similar legacy abstractions is common.

Legacy chains using memory typically had no explicit notion of threads
or separate sessions. To use `RunnableWithMessageHistory`, users are
forced to introduce this concept into their code. This possibly felt
like unnecessary boilerplate.

Here we enable `RunnableWithMessageHistory` to run without a config if
the `get_session_history` callable has no arguments. This enables
minimal implementations like the following:
```python
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo-0125")
memory = InMemoryChatMessageHistory()
chain = RunnableWithMessageHistory(llm, lambda: memory)

chain.invoke("Hi I'm Bob")  # Hello Bob!
chain.invoke("What is my name?")  # Your name is Bob.
```
2024-07-22 10:36:53 -04:00
Mohammad Mohtashim
5ade0187d0 [Commutiy]: Prompts Fixed for ZERO_SHOT_REACT React Agent Type in create_sql_agent function (#23693)
- **Description:** The correct Prompts for ZERO_SHOT_REACT were not
being used in the `create_sql_agent` function. They were not using the
specific `SQL_PREFIX` and `SQL_SUFFIX` prompts if client does not
provide any prompts. This is fixed.
- **Issue:** #23585

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-22 14:04:20 +00:00
ZhangShenao
0f6737cbfe [Vector Store] Fix function add_texts in TencentVectorDB (#24469)
Regardless of whether `embedding_func` is set or not, the 'text'
attribute of document should be assigned, otherwise the `page_content`
in the document of the final search result will be lost
2024-07-22 09:50:22 -04:00
남광우
7ab82eb8cc langchain: Copy libs/standard-tests folder when building devcontainer (#24470)
### Description

* Fix `libs/langchain/dev.Dockerfile` file. copy the
`libs/standard-tests` folder when building the devcontainer.
* `poetry install --no-interaction --no-ansi --with dev,test,docs`
command requires this folder, but it was not copied.

### Reference

#### Error message when building the devcontainer from the master branch

```
...

[2024-07-20T14:27:34.779Z] ------
 > [langchain langchain-dev-dependencies 7/7] RUN poetry install --no-interaction --no-ansi --with dev,test,docs:
0.409 
0.409 Directory ../standard-tests does not exist
------

...
```

#### After the fix

Build success at vscode:

<img width="866" alt="image"
src="https://github.com/user-attachments/assets/10db1b50-6fcf-4dfe-83e1-d93c96aa2317">
2024-07-22 13:46:38 +00:00
rbrugaro
37b89fb7fc fix RAG with quantized embeddings notebook (#24422)
1. Fix HuggingfacePipeline import error to newer partner package
 2. Switch to IPEXModelForCausalLM for performance

There are no dependency changes since optimum intel is also needed for
QuantizedBiEncoderEmbeddings

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-22 13:44:03 +00:00
Thomas Meike
40c02cedaf langchain[patch]: add async methods to ConversationSummaryBufferMemory (#20956)
Added asynchronously callable methods according to the
ConversationSummaryBufferMemory API documentation.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-22 09:21:43 -04:00
Steve Sharp
cecd875cdc docs: Update streaming.ipynb (typo fix) (#24483)
**Description:** Fixes typo `Le'ts` -> `Let's`.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-22 11:09:13 +00:00
Sheng Han Lim
0c6a3fdd6b langchain: Update ContextualCompressionRetriever base_retriever type to RetrieverLike (#24192)
**Description:**
When initializing retrievers with `configurable_fields` as base
retriever, `ContextualCompressionRetriever` validation fails with the
following error:

```
ValidationError: 1 validation error for ContextualCompressionRetriever
base_retriever
  Can't instantiate abstract class BaseRetriever with abstract method _get_relevant_documents (type=type_error)
```

Example code:

```python
esearch_retriever = VertexAISearchRetriever(
    project_id=GCP_PROJECT_ID,
    location_id="global",
    data_store_id=SEARCH_ENGINE_ID,
).configurable_fields(
    filter=ConfigurableField(id="vertex_search_filter", name="Vertex Search Filter")
)

# rerank documents with Vertex AI Rank API
reranker = VertexAIRank(
    project_id=GCP_PROJECT_ID,
    location_id=GCP_REGION,
    ranking_config="default_ranking_config",
)

retriever_with_reranker = ContextualCompressionRetriever(
    base_compressor=reranker, base_retriever=esearch_retriever
)
```

It seems like the issue stems from ContextualCompressionRetriever
insisting that base retrievers must be strictly `BaseRetriever`
inherited, and doesn't take into account cases where retrievers need to
be chained and can have configurable fields defined.


0a1e475a30/libs/langchain/langchain/retrievers/contextual_compression.py (L15-L22)

This PR proposes that the base_retriever type be set to `RetrieverLike`,
similar to how `EnsembleRetriever` validates its list of retrievers:


0a1e475a30/libs/langchain/langchain/retrievers/ensemble.py (L58-L75)
2024-07-21 14:23:19 -04:00
clement.l
d98b830e4b community: add flag to toggle progress bar (#24463)
- **Description:** Add a flag to determine whether to show progress bar 
- **Issue:** n/a
- **Dependencies:** n/a
- **Twitter handle:** n/a

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-20 13:18:02 +00:00
chuanbei888
6b08a33fa4 community: fix QianfanChatEndpoint default model (#24464)
the baidu_qianfan_endpoint has been changed from ERNIE-Bot-turbo to
ERNIE-Lite-8K
2024-07-20 13:00:29 +00:00
Nuno Campos
947628311b core[patch]: Accept configurable keys top-level (#23806)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-07-20 03:49:00 +00:00
Jesus Martinez
c1d1fc13c2 langchain[patch]: Remove multiagent return_direct validation (#24419)
**Description:**

When you use Agents with multi-input tool and some of these tools have
`return_direct=True`, langchain thrown an error related to one
validator.
This change is implemented on [JS
community](https://github.com/langchain-ai/langchainjs/pull/4643) as
well

**Issue**:
This MR resolves #19843

**Dependencies:**

None

Co-authored-by: Jesus Martinez <jesusabraham.martinez@tyson.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-07-20 03:27:43 +00:00
Will Badart
74e3d796f1 core[patch]: ensure iterator_ in scope for _atransform_stream_with_config except (#24454)
Before, if an exception was raised in the outer `try` block in
`Runnable._atransform_stream_with_config` before `iterator_` is
assigned, the corresponding `finally` block would blow up with an
`UnboundLocalError`:

```txt
UnboundLocalError: cannot access local variable 'iterator_' where it is not associated with a value
```

By assigning an initial value to `iterator_` before entering the `try`
block, this commit ensures that the `finally` can run, and not bury the
"true" exception under a "During handling of the above exception [...]"
traceback.

Thanks for your consideration!
2024-07-20 03:24:04 +00:00
maang-h
7b28359719 docs: Add ChatSparkLLM docstrings (#24449)
- **Description:** 
  - Add `ChatSparkLLM` docstrings, the issue #22296 
  - To support `stream` method
2024-07-19 20:19:14 -07:00
Eugene Yurtsev
5e48f35fba core[minor]: Relax constraints on type checking for tools and parsers (#24459)
This will allow tools and parsers to accept pydantic models from any of
the
following namespaces:

* pydantic.BaseModel with pydantic 1
* pydantic.BaseModel with pydantic 2
* pydantic.v1.BaseModel with pydantic 2
2024-07-19 21:47:34 -04:00
Isaac Francisco
838464de25 ollama: init package (#23615)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-20 00:43:29 +00:00
Erick Friis
f4ee3c8a22 infra: add min version testing to pr test flow (#24358)
xfailing some sql tests that do not currently work on sqlalchemy v1

#22207 was very much not sqlalchemy v1 compatible. 

Moving forward, implementations should be compatible with both to pass
CI
2024-07-19 22:03:19 +00:00
Erick Friis
50cb0a03bc docs: advanced feature note (#24456)
fixes #24430
2024-07-19 20:05:59 +00:00
Bagatur
842065a9cc community[patch]: Release 0.2.9 (#24453) 2024-07-19 12:50:22 -07:00
Bagatur
27ad6a4bb3 langchain[patch]: Release 0.2.10 (#24452) 2024-07-19 12:50:13 -07:00
Bagatur
dda9438e87 community[patch]: gpt-4o-mini costs (#24421) 2024-07-19 19:02:44 +00:00
Eugene Yurtsev
604dfe2d99 community[patch]: Force opt-in for WebResearchRetriever (CVE-2024-3095) (#24451)
This PR addresses the issue raised by (CVE-2024-3095)

https://huntr.com/bounties/e62d4895-2901-405b-9559-38276b6a5273

Unfortunately, we didn't do a good job writing the initial report. It's
pointing at both the wrong package and the wrong code.

The affected code is the Web Retriever not the AsyncHTMLLoader, and the
WebRetriever lives in langchain-community

The vulnerable code lives here: 

0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L233-L233)


This PR adds a forced opt-in for users to make sure they are aware of
the risk and can mitigate by configuring a proxy:


0bd3f4e129/libs/community/langchain_community/retrievers/web_research.py (L84-L84)
2024-07-19 18:51:35 +00:00
Bagatur
f101c759ed docs: how to pass runtime secrets (#24450) 2024-07-19 18:36:28 +00:00
Asi Greenholts
372c27f2e5 community[minor]: [GoogleApiYoutubeLoader] Replace API used in _get_document_for_channel from search to playlistItem (#24034)
- **Description:** Search has a limit of 500 results, playlistItems
doesn't. Added a class in except clause to catch another common error.
- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** @TupleType

---------

Co-authored-by: asi-cider <88270351+asi-cider@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-19 14:04:34 -04:00
Rafael Pereira
6a45bf9554 community[minor]: GraphCypherQAChain to accept additional inputs as provided by the user for cypher generation (#24300)
**Description:** This PR introduces a change to the
`cypher_generation_chain` to dynamically concatenate inputs. This
improvement aims to streamline the input handling process and make the
method more flexible. The change involves updating the arguments
dictionary with all elements from the `inputs` dictionary, ensuring that
all necessary inputs are dynamically appended. This will ensure that any
cypher generation template will not require a new `_call` method patch.

**Issue:** This PR fixes issue #24260.
2024-07-19 14:03:14 -04:00
Philippe PRADOS
f5856680fe community[minor]: add mongodb byte store (#23876)
The `MongoDBStore` can manage only documents.
It's not possible to use MongoDB for an `CacheBackedEmbeddings`.

With this new implementation, it's possible to use:
```python
CacheBackedEmbeddings.from_bytes_store(
    underlying_embeddings=embeddings,
    document_embedding_cache=MongoDBByteStore(
      connection_string=db_uri,
      db_name=db_name,
      collection_name=collection_name,
  ),
)
```
and use MongoDB to cache the embeddings !
2024-07-19 13:54:12 -04:00
yabooung
07715f815b community[minor]: Add ability to specify file encoding and json encoding for FileChatMessageHistory (#24258)
Description:
Add UTF-8 encoding support

Issue:
Inability to properly handle characters from certain languages (e.g.,
Korean)

Fix:
Implement UTF-8 encoding in FileChatMessageHistory

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-19 13:53:21 -04:00
Dristy Srivastava
020cc1cf3e Community[minor]: Added checksum in while send data to pebblo-cloud (#23968)
- **Description:** 
            - Updated checksum in doc metadata
- Sending checksum and removing actual content, while sending data to
`pebblo-cloud` if `classifier-location `is `pebblo-cloud` in
`/loader/doc` API
            - Adding `pb_id` i.e. pebblo id to doc metadata
            - Refactoring as needed.
- Sending `content-checksum` and removing actual content, while sending
data to `pebblo-cloud` if `classifier-location `is `pebblo-cloud` in
`prmopt` API
- **Issue:** NA
- **Dependencies:** NA
- **Tests:** Updated
- **Docs** NA

---------

Co-authored-by: dristy.cd <dristy@clouddefense.io>
2024-07-19 13:52:54 -04:00
Eun Hye Kim
9aae8ef416 core[patch]: Fix utils.json_schema.dereference_refs (#24335 KeyError: 400 in JSON schema processing) (#24337)
Description:
This PR fixes a KeyError: 400 that occurs in the JSON schema processing
within the reduce_openapi_spec function. The _retrieve_ref function in
json_schema.py was modified to handle missing components gracefully by
continuing to the next component if the current one is not found. This
ensures that the OpenAPI specification is fully interpreted and the
agent executes without errors.

Issue:
Fixes issue #24335

Dependencies:
No additional dependencies are required for this change.

Twitter handle:
@lunara_x
2024-07-19 13:31:00 -04:00
keval dekivadiya
06f47678ae community[minor]: Add TextEmbed Embedding Integration (#22946)
**Description:**

**TextEmbed** is a high-performance embedding inference server designed
to provide a high-throughput, low-latency solution for serving
embeddings. It supports various sentence-transformer models and includes
the ability to deploy image and text embedding models. TextEmbed offers
flexibility and scalability for diverse applications.

- **PyPI Package:** [TextEmbed on
PyPI](https://pypi.org/project/textembed/)
- **Docker Image:** [TextEmbed on Docker
Hub](https://hub.docker.com/r/kevaldekivadiya/textembed)
- **GitHub Repository:** [TextEmbed on
GitHub](https://github.com/kevaldekivadiya2415/textembed)

**PR Description**
This PR adds functionality for embedding documents and queries using the
`TextEmbedEmbeddings` class. The implementation allows for both
synchronous and asynchronous embedding requests to a TextEmbed API
endpoint. The class handles batching and permuting of input texts to
optimize the embedding process.

**Example Usage:**

```python
from langchain_community.embeddings import TextEmbedEmbeddings

# Initialise the embeddings class
embeddings = TextEmbedEmbeddings(model="your-model-id", api_key="your-api-key", api_url="your_api_url")

# Define a list of documents
documents = [
    "Data science involves extracting insights from data.",
    "Artificial intelligence is transforming various industries.",
    "Cloud computing provides scalable computing resources over the internet.",
    "Big data analytics helps in understanding large datasets.",
    "India has a diverse cultural heritage."
]

# Define a query
query = "What is the cultural heritage of India?"

# Embed all documents
document_embeddings = embeddings.embed_documents(documents)

# Embed the query
query_embedding = embeddings.embed_query(query)

# Print embeddings for each document
for i, embedding in enumerate(document_embeddings):
    print(f"Document {i+1} Embedding:", embedding)

# Print the query embedding
print("Query Embedding:", query_embedding)

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-19 17:30:25 +00:00
Shikanime Deva
9c3da11910 Fix MultiQueryRetriever breaking Embeddings with empty lines (#21093)
Fix MultiQueryRetriever breaking Embeddings with empty lines

```
[chain/end] [1:chain:ConversationalRetrievalChain > 2:retriever:Retriever > 3:retriever:Retriever > 4:chain:LLMChain] [2.03s] Exiting Chain run with output:
[outputs]
> /workspaces/Sfeir/sncf/metabot-backend/.venv/lib/python3.11/site-packages/langchain/retrievers/multi_query.py(116)_aget_relevant_documents()
-> if self.include_original:
(Pdb) queries
['## Alternative questions for "Hello, tell me about phones?":', '', '1. **What are the latest trends in smartphone technology?** (Focuses on recent advancements)', '2. **How has the mobile phone industry evolved over the years?** (Historical perspective)', '3. **What are the different types of phones available in the market, and which one is best for me?** (Categorization and recommendation)']
```

Example of failure on VertexAIEmbeddings

```
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.INVALID_ARGUMENT
	details = "The text content is empty."
	debug_error_string = "UNKNOWN:Error received from peer ipv4:142.250.184.234:443 {created_time:"2024-04-30T09:57:45.625698408+00:00", grpc_status:3, grpc_message:"The text content is empty."}"
```

Fixes: #15959

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-19 17:13:12 +00:00
John Kelly
5affbada61 langchain: Add aadd_documents to ParentDocumentRetriever (#23969)
- **Description:** Add an async version of `add_documents` to
`ParentDocumentRetriever`
-  **Twitter handle:** @johnkdev

---------

Co-authored-by: John Kelly <j.kelly@mwam.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-19 13:12:39 -04:00
Andrew Benton
f9d64d22e5 community[minor]: Add Riza Python/JS code execution tool (#23995)
- **Description:** Add Riza Python/JS code execution tool
- **Issue:** N/A
- **Dependencies:** an optional dependency on the `rizaio` pypi package
- **Twitter handle:** [@rizaio](https://x.com/rizaio)

[Riza](https://riza.io) is a safe code execution environment for
agent-generated Python and JavaScript that's easy to integrate into
langchain apps. This PR adds two new tool classes to the community
package.
2024-07-19 17:03:22 +00:00
Ben Chambers
3691701d58 community[minor]: Add keybert-based link extractor (#24311)
- **Description:** Add a `KeybertLinkExtractor` for graph vectorstores.
This allows extracting links from keywords in a Document and linking
nodes that have common keywords.
- **Issue:** None
- **Dependencies:** None.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-19 12:25:07 -04:00
Erick Friis
ef049769f0 core[patch]: Release 0.2.22 (#24423)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-07-19 09:09:24 -07:00
Bagatur
cd19ba9a07 core[patch]: core lint fix (#24447) 2024-07-19 09:01:22 -07:00
Ben Chambers
83f3d95ffa community[minor]: GLiNER link extraction (#24314)
- **Description:** This allows extracting links between documents with
common named entities using [GLiNER](https://github.com/urchade/GLiNER).
- **Issue:** None
- **Dependencies:** None

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-19 15:34:54 +00:00
Anas Khan
b5acb91080 Mask API keys for various LLM/ChatModel Modules (#13885)
**Description:** 
- Added masking of the API Keys for the modules:
  - `langchain/chat_models/openai.py`
  - `langchain/llms/openai.py`
  - `langchain/llms/google_palm.py`
  - `langchain/chat_models/google_palm.py`
  - `langchain/llms/edenai.py`

- Updated the modules to utilize `SecretStr` from pydantic to securely
manage API key.
- Added unit/integration tests
- `langchain/chat_models/asure_openai.py` used the `open_api_key` that
is derived from the `ChatOpenAI` Class and it was assuming
`openai_api_key` is a str so we changed it to expect `SecretStr`
instead.

**Issue:** https://github.com/langchain-ai/langchain/issues/12165 ,
**Dependencies:** none,
**Tag maintainer:** @eyurtsev

---------

Co-authored-by: HassanA01 <anikeboss@gmail.com>
Co-authored-by: Aneeq Hassan <aneeq.hassan@utoronto.ca>
Co-authored-by: kristinspenc <kristinspenc2003@gmail.com>
Co-authored-by: faisalt14 <faisalt14@gmail.com>
Co-authored-by: Harshil-Patel28 <76663814+Harshil-Patel28@users.noreply.github.com>
Co-authored-by: kristinspenc <146893228+kristinspenc@users.noreply.github.com>
Co-authored-by: faisalt14 <90787271+faisalt14@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-19 15:23:34 +00:00
ccurme
f99369a54c community[patch]: fix formatting (#24443)
Somehow this got through CI:
https://github.com/langchain-ai/langchain/pull/24363
2024-07-19 14:38:53 +00:00
Ben Chambers
242b085be7 Merge pull request #24315
* community: Add Hierarchy link extractor

* add example

* lint
2024-07-19 09:42:26 -04:00
Rhuan Barros
c3308f31bc Merge pull request #24363
* important email fields
2024-07-19 09:41:20 -04:00
Piotr Romanowski
c50dd79512 docs: Update langchain-openai package version in chat_token_usage_tracking (#24436)
This PR updates docs to mention correct version of the
`langchain-openai` package required to use the `stream_usage` parameter.

As it can be noticed in the details of this [merge
commit](722c8f50ea),
that functionality is available only in `langchain-openai >= 0.1.9`
while docs state it's available in `langchain-openai >= 0.1.8`.
2024-07-19 13:07:37 +00:00
Han Sol Park
aade9bfde5 Mask API key for ChatOpenAI based chat_models (#14293)
- **Description**: Mask API key for ChatOpenAi based chat_models
(openai, azureopenai, anyscale, everlyai).
Made changes to all chat_models that are based on ChatOpenAI since all
of them assumes that openai_api_key is str rather than SecretStr.
  - **Issue:**: #12165 
  - **Dependencies:**  N/A
  - **Tag maintainer:** @eyurtsev
  - **Twitter handle:** N/A

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-19 02:25:38 +00:00
William FH
0ee6ed76ca [Evaluation] Pass in seed directly (#24403)
adding test rn
2024-07-18 19:12:28 -07:00
Nuno Campos
62b6965d2a core: In ensure_config don't copy dunder configurable keys to metadata (#24420) 2024-07-18 22:28:52 +00:00
Eugene Yurtsev
ef22ebe431 standard-tests[patch]: Add pytest assert rewrites (#24408)
This will surface nice error messages in subclasses that fail assertions.
2024-07-18 21:41:11 +00:00
Eugene Yurtsev
f62b323108 core[minor]: Support all versions of pydantic base model in argsschema (#24418)
This adds support to any pydantic base model for tools.

The only potential issue is that `get_input_schema()` will not always
return a v1 base model.
2024-07-18 17:14:23 -04:00
Prakul
b2bc15e640 docs: Update mongodb README.md (#24412)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-18 14:02:34 -07:00
Evan Harris
61ea7bf60b Add a ListRerank document compressor (#13311)
- **Description:** This PR adds a new document compressor called
`ListRerank`. It's derived from `BaseDocumentCompressor`. It's a near
exact implementation of introduced by this paper: [Zero-Shot Listwise
Document Reranking with a Large Language
Model](https://arxiv.org/pdf/2305.02156.pdf) which it finds to
outperform pointwise reranking, which is somewhat implemented in
LangChain as
[LLMChainFilter](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_filter.py).
- **Issue:** None
- **Dependencies:** None
- **Tag maintainer:** @hwchase17 @izzymsft
- **Twitter handle:** @HarrisEMitchell

Notes:
1. I didn't add anything to `docs`. I wasn't exactly sure which patterns
to follow as [cohere reranker is under
Retrievers](https://python.langchain.com/docs/integrations/retrievers/cohere-reranker)
with other external document retrieval integrations, but other
contextual compression is
[here](https://python.langchain.com/docs/modules/data_connection/retrievers/contextual_compression/).
Happy to contribute to either with some direction.
2. I followed syntax, docstrings, implementation patterns, etc. as well
as I could looking at nearby modules. One thing I didn't do was put the
default prompt in a separate `.py` file like [Chain
Filter](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_filter_prompt.py)
and [Chain
Extract](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/retrievers/document_compressors/chain_extract_prompt.py).
Happy to follow that pattern if it would be preferred.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-18 20:34:38 +00:00
Srijan Dubey
4c651ba13a Adding LangChain v0.2 support for nvidia ai endpoint, langchain-nvidia-ai-endpoints. Removed deprecated classes from nvidia_ai_endpoints.ipynb (#24411)
Description: added support for LangChain v0.2 for nvidia ai endpoint.
Implremented inMemory storage for chains using
RunnableWithMessageHistory which is analogous to using
`ConversationChain` which was used in v0.1 with the default
`ConversationBufferMemory`. This class is deprecated in favor of
`RunnableWithMessageHistory` in LangChain v0.2

Issue: None

Dependencies: None.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-18 15:59:26 -04:00
Erick Friis
334fc1ed1c mongodb: release 0.1.7 (#24409) 2024-07-18 18:13:27 +00:00
ccurme
ba74341eee docs: update tool calling how-to to pass functions to bind_tools (#24402) 2024-07-18 08:53:48 -07:00
Harrison Chase
3adf710f1d docs: improve docs on tools (#24404)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-18 08:52:12 -07:00
Eun Hye Kim
07c5c60f63 community: fix tool appending logic and update planner prompt in OpenAPI agent toolkit (#24384)
**Description:**
- Updated the format for the 'Action' section in the planner prompt to
ensure it must be one of the tools without additional words. Adjusted
the phrasing from "should be" to "must be" for clarity and
enforceability.
- Corrected the tool appending logic in the
`_create_api_controller_agent` function to ensure that
`RequestsDeleteToolWithParsing` and `RequestsPatchToolWithParsing` are
properly added to the tools list for "DELETE" and "PATCH" operations.

**Issue:** #24382

**Dependencies:** None

**Twitter handle:** @lunara_x

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-18 13:37:46 +00:00
Casey Clements
aade1550c6 docs: Adds MongoDBAtlasVectorSearch to VectorStore list compatible with Indexing API (#24374)
Adds MongoDBAtlasVectorSearch to list of VectorStores compatible with
the Indexing API.

(One line change.)

As of `langchain-mongodb = "0.1.7"`, the requirements that the
VectorStore have both add_documents and delete methods with an ids kwarg
is satisfied. #23535 contains the implementation of that, and has been
merged.
2024-07-18 09:37:29 -04:00
Chen Xiabin
63c60a31f0 [fix] baidu qianfan AiMessage with usage_metadata (#24389)
make AIMessage usage_metadata has error
2024-07-18 09:28:16 -04:00
João Dinis Ferreira
242de9aa5e docs: remove redundant --quiet option in pip install command (#24397)
- **Description:** Removes a redundant option in a `pip install` command
in the documentation.
- **Issue:** N/A
- **Dependencies:** N/A
2024-07-18 13:24:42 +00:00
ZhangShenao
916b813107 community[patch]: Fix spelling error in ConversationVectorStoreTokenBufferMemory doc-string (#24385)
Fix word spelling error in `ConversationVectorStoreTokenBufferMemory`
2024-07-18 12:27:36 +00:00
Rajendra Kadam
1c65529fd7 community[minor]: [PebbloSafeLoader] Rename loader type and add SharePointLoader to supported loaders (#24393)
Thank you for contributing to LangChain!

- [x] **PR title**: [PebbloSafeLoader] Rename loader type and add
SharePointLoader to supported loaders
    - **Description:** Minor fixes in the PebbloSafeLoader:
        - Renamed the loader type from `remote_db` to `cloud_folder`.
- Added `SharePointLoader` to the list of loaders supported by
PebbloSafeLoader.
    - **Issue:** NA
    - **Dependencies:** NA
- [x] **Add tests and docs**: NA
2024-07-18 08:23:12 -04:00
Eugene Yurtsev
6182a402f1 experimental[patch]: block a few more things from PALValidator (#24379)
* Please see security warning already in existing class.
* The approach here is fundamentally insecure as it's relying on a block
  approach rather than an approach based on only running allowed nodes.
So users should only use this code if its running from a properly
sandboxed  environment.
2024-07-18 08:22:45 -04:00
Paolo Ráez
0dec72cab0 Community[patch]: Missing "stream" parameter in cloudflare_workersai (#23987)
### Description
Missing "stream" parameter. Without it, you'd never receive a stream of
tokens when using stream() or astream()

### Issue
No existing issue available
2024-07-18 02:09:39 +00:00
Eugene Yurtsev
570566b858 core[patch]: Update API reference for astream events (#24359)
Update the API reference for astream events to include information about
custom events.
2024-07-17 21:48:53 -04:00
Bagatur
f9baaae3ec docs: clean up tool how to titles (#24373) 2024-07-17 17:08:31 -07:00
Bagatur
4da1df568a docs: tools concepts (#24368) 2024-07-17 17:08:16 -07:00
Erick Friis
96ccba9c27 infra: 15s retry wait on test pypi (#24375) 2024-07-17 23:41:22 +00:00
Bagatur
a4c101ae97 core[patch]: Release 0.2.21 (#24372) 2024-07-17 22:44:35 +00:00
William FH
c5a07e2dd8 core[patch]: add InjectedToolArg annotation (#24279)
```python
from typing_extensions import Annotated
from langchain_core.tools import tool, InjectedToolArg
from langchain_anthropic import ChatAnthropic

@tool
def multiply(x: int, y: int, not_for_model: Annotated[dict, InjectedToolArg]) -> str:
    """multiply."""
    return x * y 

ChatAnthropic(model='claude-3-sonnet-20240229',).bind_tools([multiply]).invoke('5 times 3').tool_calls
'''
-> [{'name': 'multiply',
  'args': {'x': 5, 'y': 3},
  'id': 'toolu_01Y1QazYWhu4R8vF4hF4z9no',
  'type': 'tool_call'}]
'''
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-07-17 15:28:40 -07:00
Erick Friis
80f3d48195 openai: release 0.1.18 (#24369) 2024-07-17 22:26:33 +00:00
Bagatur
7d83189b19 openai[patch]: use model_name in AzureOpenAI.ls_model_name (#24366) 2024-07-17 15:24:05 -07:00
Nithish Raghunandanan
eb26b5535a couchbase: Add chat message history (#24356)
**Description:** : Add support for chat message history using Couchbase

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
2024-07-17 15:22:42 -07:00
Eugene Yurtsev
96bac8e20d core[patch]: Fix regression requiring input_variables in few chat prompt templates (#24360)
* Fix regression that requires users passing input_variables=[].

* Regression introduced by my own changes to this PR:
https://github.com/langchain-ai/langchain/pull/22851
2024-07-17 18:14:57 -04:00
Brice Fotzo
034a8c7c1b community: support advanced text extraction options for pdf documents (#20265)
**Description:** 
- Updated constructors in PyPDFParser and PyPDFLoader to handle
`extraction_mode` and additional kwargs, aligning with the capabilities
of `PageObject.extract_text()` from pypdf.

- Added `test_pypdf_loader_with_layout` along with a corresponding
example text file to validate layout extraction from PDFs.

**Issue:** fixes #19735 

**Dependencies:** This change requires updating the pypdf dependency
from version 3.4.0 to at least 4.0.0.

Additional changes include the addition of a new test
test_pypdf_loader_with_layout and an example text file to ensure the
functionality of layout extraction from PDFs aligns with the new
capabilities.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-17 20:47:09 +00:00
hmasdev
a402de3dae langchain[patch]: fix wrong dict key in OutputFixingParser, RetryOutputParser and RetryWithErrorOutputParser (#23967)
# Description
This PR aims to solve a bug in `OutputFixingParser`, `RetryOutputParser`
and `RetryWithErrorOutputParser`
The bug is that the wrong keyword argument was given to `retry_chain`.
The correct keyword argument is 'completion', but 'input' is used.

This pull request makes the following changes:
1. correct a `dict` key given to `retry_chain`;
2. add a test when using the default prompt.
   - `NAIVE_FIX_PROMPT` for `OutputFixingParser`;
   - `NAIVE_RETRY_PROMPT` for `RetryOutputParser`;
   - `NAIVE_RETRY_WITH_ERROR_PROMPT` for `RetryWithErrorOutputParser`;
3. ~~add comments on `retry_chain` input and output types~~ clarify
`InputType` and `OutputType` of `retry_chain`

# Issue
The bug is pointed out in
https://github.com/langchain-ai/langchain/pull/19792#issuecomment-2196512928

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-17 20:34:46 +00:00
Casey Clements
a47f69a120 partners/mongodb : Significant MongoDBVectorSearch ID enhancements (#23535)
## Description

This pull-request improves the treatment of document IDs in
`MongoDBAtlasVectorSearch`.

Class method signatures of add_documents, add_texts, delete, and
from_texts
now include an `ids:Optional[List[str]]` keyword argument permitting the
user
greater control. 
Note that, as before, IDs may also be inferred from
`Document.metadata['_id']`
if present, but this is no longer required,
IDs can also optionally be returned from searches.

This PR closes the following JIRA issues.

* [PYTHON-4446](https://jira.mongodb.org/browse/PYTHON-4446)
MongoDBVectorSearch delete / add_texts function rework
* [PYTHON-4435](https://jira.mongodb.org/browse/PYTHON-4435) Add support
for "Indexing"
* [PYTHON-4534](https://jira.mongodb.org/browse/PYTHON-4534) Ensure
datetimes are json-serializable

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-17 13:26:20 -07:00
Erick Friis
cc2cbfabfc milvus: release 0.1.2 (#24365) 2024-07-17 19:42:44 +00:00
Eugene Yurtsev
9e4a0e76f6 core[patch]: Fix one unit test for chat prompt template (#24362)
Minor change that fixes a unit test that had missing assertions.
2024-07-17 18:56:48 +00:00
Erick Friis
81639243e2 openai: release 0.1.17 (#24361) 2024-07-17 18:50:42 +00:00
Erick Friis
61976a4147 pinecone: release 0.1.2 (#24355) 2024-07-17 17:09:07 +00:00
Bagatur
b5360e2e5f community[patch]: Release 0.2.8 (#24354) 2024-07-17 17:07:27 +00:00
ccurme
4cf67084d3 openai[patch]: fix key collision and _astream (#24345)
Fixes small issues introduced in
https://github.com/langchain-ai/langchain/pull/24150 (unreleased).
2024-07-17 12:59:26 -04:00
Luis Moros
bcb5f354ad community: Fix SQLDatabse.from_databricks issue when ran from Job (#24346)
- Description: When SQLDatabase.from_databricks is ran from a Databricks
Workflow job, line 205 (default_host = context.browserHostName) throws
an ``AttributeError`` as the ``context`` object has no
``browserHostName`` attribute. The fix handles the exception and sets
the ``default_host`` variable to null

---------

Co-authored-by: lmorosdb <lmorosdb>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-17 12:40:12 -04:00
Bagatur
24e9b48d15 langchain[patch]: Release 0.2.9 (#24327) 2024-07-17 09:39:57 -07:00
Rafael Pereira
cf28708e7b Neo4j: Update with non-deprecated cypher methods, and new method to associate relationship embeddings (#23725)
**Description:** At the moment neo4j wrapper is using setVectorProperty,
which is deprecated
([link](https://neo4j.com/docs/operations-manual/5/reference/procedures/#procedure_db_create_setVectorProperty)).
I replaced with the non-deprecated version.

Neo4j recently introduced a new cypher method to associate embeddings
into relations using "setRelationshipVectorProperty" method. In this PR
I also implemented a new method to perform this association maintaining
the same format used in the "add_embeddings" method which is used to
associate embeddings into Nodes.
I also included a test case for this new method.
2024-07-17 12:37:47 -04:00
maang-h
2a3288b15d docs: Add ChatBaichuan docstrings (#24348)
- **Description:** Add ChatBaichuan rich docstrings.
- **Issue:** the issue #22296
2024-07-17 12:00:16 -04:00
Srijan Dubey
1792684e8f removed deprecated classes from pipelineai.ipynb, added support for LangChain v0.2 for PipelineAI integration (#24333)
Description: added support for LangChain v0.2 for PipelineAI
integration. Removed deprecated classes and incorporated support for
LangChain v0.2 to integrate with PipelineAI. Removed LLMChain and
replaced it with Runnable interface. Also added StrOutputParser, that
parses LLMResult into the top likely string.

Issue: None

Dependencies: None.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-17 13:48:32 +00:00
Tobias Sette
e60ad12521 docs(infobip.ipynb): fix typo (#24328) 2024-07-17 13:33:34 +00:00
Rafael Pereira
fc41730e28 neo4j: Fix test for order-insensitive comparison and floating-point precision issues (#24338)
**Description:** 
This PR addresses two main issues in the `test_neo4jvector.py`:
1. **Order-insensitive Comparison:** Modified the
`test_retrieval_dictionary` to ensure that it passes regardless of the
order of returned values by parsing `page_content` into a structured
format (dictionary) before comparison.
2. **Floating-point Precision:** Updated
`test_neo4jvector_relevance_score` to handle minor floating-point
precision differences by using the `isclose` function for comparing
relevance scores with a relative tolerance.

Errors addressed:

- **test_neo4jvector_relevance_score:**
  ```
AssertionError: assert [(Document(page_content='foo', metadata={'page':
'0'}), 1.0000014305114746), (Document(page_content='bar',
metadata={'page': '1'}), 0.9998371005058289),
(Document(page_content='baz', metadata={'page': '2'}),
0.9993508458137512)] == [(Document(page_content='foo', metadata={'page':
'0'}), 1.0), (Document(page_content='bar', metadata={'page': '1'}),
0.9998376369476318), (Document(page_content='baz', metadata={'page':
'2'}), 0.9993523359298706)]
At index 0 diff: (Document(page_content='foo', metadata={'page': '0'}),
1.0000014305114746) != (Document(page_content='foo', metadata={'page':
'0'}), 1.0)
  Full diff:
  - [(Document(page_content='foo', metadata={'page': '0'}), 1.0),
+ [(Document(page_content='foo', metadata={'page': '0'}),
1.0000014305114746),
? +++++++++++++++
- (Document(page_content='bar', metadata={'page': '1'}),
0.9998376369476318),
? ^^^ ------
+ (Document(page_content='bar', metadata={'page': '1'}),
0.9998371005058289),
? ^^^^^^^^^
- (Document(page_content='baz', metadata={'page': '2'}),
0.9993523359298706),
? ----------
+ (Document(page_content='baz', metadata={'page': '2'}),
0.9993508458137512),
? ++++++++++
  ]
  ```

- **test_retrieval_dictionary:**
  ```
AssertionError: assert [Document(page_content='skills:\n- Python\n- Data
Analysis\n- Machine Learning\nname: John\nage: 30\n')] ==
[Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine
Learning\nage: 30\nname: John\n')]
At index 0 diff: Document(page_content='skills:\n- Python\n- Data
Analysis\n- Machine Learning\nname: John\nage: 30\n') !=
Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine
Learning\nage: 30\nname: John\n')
  Full diff:
- [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine
Learning\nage: 30\nname: John\n')]
? ---------
+ [Document(page_content='skills:\n- Python\n- Data Analysis\n- Machine
Learning\nage: John\nage: 30\n')]
? +++++++++
  ```
2024-07-17 09:28:25 -04:00
Erick Friis
47ed7f766a infra: fix release prerelease deps bug (#24323) 2024-07-16 15:13:41 -07:00
Bagatur
80e7cd6cff core[patch]: Release 0.2.20 (#24322) 2024-07-16 15:04:36 -07:00
Erick Friis
6c3e65a878 infra: prerelease dep checking on release (#23269) 2024-07-16 21:48:15 +00:00
Eugene Yurtsev
616196c620 Docs: Add how to dispatch custom callback events (#24278)
* Add how-to guide for dispatching custom callback events.
* Add links from index to the how to guide
* Add link from streaming from within a tool
* Update versionadded to correct release
https://github.com/langchain-ai/langchain/releases/tag/langchain-core%3D%3D0.2.15
2024-07-16 17:38:32 -04:00
Erick Friis
dd7938ace8 docs: readthedocs deprecation fix (#24321)
https://about.readthedocs.com/blog/2024/07/addons-by-default/#how-does-it-affect-my-projects

we use build.command so we're already using addons, so I think this is
it
2024-07-16 20:32:51 +00:00
Srijan Dubey
ef07308c30 Upgraded shaleprotocol to use langchain v0.2 removed deprecated classes (#24320)
Description: Added support for langchain v0.2 for shale protocol.
Replaced LLMChain with Runnable interface which allows any two Runnables
to be 'chained' together into sequences. Also added
StreamingStdOutCallbackHandler. Callback handler for streaming.
Issue: None
Dependencies: None.
2024-07-16 20:07:36 +00:00
pbharti0831
049bc37111 Cookbook for applying RAG locally using open source models and tools on CPU (#24284)
This cookbook guides user to implement RAG locally on CPU using
langchain tools and open source models. It enables Llama2 model to
answer queries about Intel Q1 2024 earning release using RAG pipeline.

Main libraries are langchain, llama-cpp-python and gpt4all.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Sriragavi <sriragavi.r@intel.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-16 15:17:10 -04:00
Leonid Ganeline
5ccf8ebfac core: docstrings vectorstores update (#24281)
Added missed docstrings. Formatted docstrings to the consistent form.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-16 16:58:11 +00:00
Erick Friis
1e9cc02ed8 openai: raw response headers (#24150) 2024-07-16 09:54:54 -07:00
Bagatur
dc42279eb5 core[patch]: fix Typing.cast import (#24313)
Fixes #24287
2024-07-16 16:53:48 +00:00
Anush
e38bf08139 qdrant: Fixed typos in Qdrant vectorstore docs (#24312)
## Description 

As that title goes.
2024-07-16 09:44:07 -07:00
bovlb
5caa381177 community[minor]: Add ApertureDB as a vectorstore (#24088)
Thank you for contributing to LangChain!

- [X] *ApertureDB as vectorstore**: "community: Add ApertureDB as a
vectorestore"

- **Description:** this change provides a new community integration that
uses ApertureData's ApertureDB as a vector store.
    - **Issue:** none
    - **Dependencies:** depends on ApertureDB Python SDK
    - **Twitter handle:** ApertureData

- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Integration tests rely on a local run of a public docker image.
Example notebook additionally relies on a local Ollama server.

- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

All lint tests pass.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Gautam <gautam@aperturedata.io>
2024-07-16 09:32:59 -07:00
frob
c59e663365 community[patch]: Fix docstring for ollama parameter "keep_alive" (#23973)
Fix doc-string for ollama integration
2024-07-16 14:48:38 +00:00
Mazen Ramadan
0c1889c713 docs: fix parameter typo in scrapfly loader docs (#24307)
Fixed wrong parameter typo in
[ScrapflyLoader](https://github.com/langchain-ai/langchain/blob/master/libs/community/langchain_community/document_loaders/scrapfly.py)
docs, where `ignore_scrape_failures` is used instead of
`continue_on_failure`.

- Description: Fix wrong param typo in ScrapflyLoader docs.
2024-07-16 14:48:13 +00:00
Leonid Ganeline
5fcf2ef7ca core: docstrings documents (#23506)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-16 10:43:54 -04:00
Rafael Pereira
77dd327282 Docs: Fix Concepts Integration Tools Link (#24301)
- **Description:** This PR fix concepts integrations tools link.

- **Issue:** Fixes issue #24112

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-16 10:29:30 -04:00
Rahul Raghavendra Choudhury
f5a38772a8 community[patch]: Update TavilySearch to use TavilyClient instead of the deprecated Client (#24270)
On using TavilySearchAPIRetriever with any conversation chain getting
error :

`TypeError: Client.__init__() got an unexpected keyword argument
'api_key'`

It is because the retreiver class is using the depreciated `Client`
class, `TavilyClient` need to be used instead.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-16 13:35:28 +00:00
Shenhai Ran
5f2dea2b20 core[patch]: Add encoding options when create prompt template from a file (#24054)
- Uses default utf-8 encoding for loading prompt templates from file

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-16 09:35:09 -04:00
Chen Xiabin
69b1603173 baidu qianfan AiMessage with usage_metadata (#24288)
add usage_metadata to qianfan AIMessage. Thanks
2024-07-16 09:30:50 -04:00
amcastror
d83164f837 Update retrievers.ipynb (#24289)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-16 13:30:41 +00:00
Leonid Ganeline
198b85334f core[patch]: docstrings langchain_core/ files update (#24285)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-16 09:21:51 -04:00
Dobiichi-Origami
7aeaa1974d community[patch]: change the class of qianfan_ak and qianfan_sk parameters (#24293)
- **Description:** we changed the class of two parameters to fix a bug,
which causes validation failure when using QianfanEmbeddingEndpoint
2024-07-16 09:17:48 -04:00
Tibor Reiss
1c753d1e81 core[patch]: Update typing for template format to include jinja2 as a Literal (#24144)
Fixes #23929 via adjusting the typing
2024-07-16 09:09:42 -04:00
Jacob Lee
6716379f0c docs[patch]: Fix rendering issue in code splitter page (#24291) 2024-07-15 23:08:21 -07:00
Jacob Lee
58fdb070fa docs[patch]: Update intro diagram (#24290)
CC @agola11
2024-07-15 22:04:42 -07:00
Erick Friis
1d7a3ae7ce infra: add test deps to add_dependents (#24283) 2024-07-15 15:48:53 -07:00
Erick Friis
d2f671271e langchain: fix extended test (#24282) 2024-07-15 15:29:48 -07:00
Lage Ragnarsson
a3c10fc6ce community: Add support for specifying hybrid search for Databricks vector search (#23528)
**Description:**

Databricks Vector Search recently added support for hybrid
keyword-similarity search.
See [usage
examples](https://docs.databricks.com/en/generative-ai/create-query-vector-search.html#query-a-vector-search-endpoint)
from their documentation.

This PR updates the Langchain vectorstore interface for Databricks to
enable the user to pass the *query_type* parameter to
*similarity_search* to make use of this functionality.
By default, there will not be any changes for existing users of this
interface. To use the new hybrid search feature, it is now possible to
do

```python
# ...
dvs = DatabricksVectorSearch(index)
dvs.similarity_search("my search query", query_type="HYBRID")
```

Or using the retriever:

```python
retriever = dvs.as_retriever(
    search_kwargs={
        "query_type": "HYBRID",
    }
)
retriever.invoke("my search query")
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-15 22:14:08 +00:00
Christopher Tee
5171ffc026 community(you): Integrate You.com conversational APIs (#23046)
You.com is releasing two new conversational APIs — Smart and Research. 

This PR:
- integrates those APIs with Langchain, as an LLM
- streaming is supported

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-15 17:46:58 -04:00
maang-h
6c7d9f93b9 feat: Add ChatTongyi structured output (#24187)
- **Description:** Add `with_structured_output` method to ChatTongyi to
support structured output.
2024-07-15 15:57:21 -04:00
Chen Xiabin
8f4620f4b8 baidu qianfan streaming token_usage (#24117)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-15 19:52:31 +00:00
maang-h
9d97de34ae community[patch]: Improve ChatBaichuan init args and role (#23878)
- **Description:** Improve ChatBaichuan init args and role
   -  ChatBaichuan adds `system` role
   - alias: `baichuan_api_base` -> `base_url`
   - `with_search_enhance` is deprecated
   - Add `max_tokens` argument
2024-07-15 15:17:00 -04:00
Erick Friis
56cca23745 openai: remove some params from default serialization (#24280) 2024-07-15 18:53:36 +00:00
mrugank-wadekar
66bebeb76a partners: add similarity search by image functionality to langchain_chroma partner package (#22982)
- **Description:** This pull request introduces two new methods to the
Langchain Chroma partner package that enable similarity search based on
image embeddings. These methods enhance the package's functionality by
allowing users to search for images similar to a given image URI. Also
introduces a notebook to demonstrate it's use.
  - **Issue:** N/A
  - **Dependencies:** None
  - **Twitter handle:** @mrugank9009

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-15 18:48:22 +00:00
pm390
b0aa915dea community[patch]: use asyncio.sleep instead of sleep in OpenAI Assistant async (#24275)
**Description:** Implemented async sleep using asyncio instead of
synchronous sleep in openAI Assistants
**Issue:** 24194
**Dependencies:** asyncio
**Twitter handle:** pietromald60939
2024-07-15 18:14:39 +00:00
Anush
d93ae756e6 qdrant: Documentation for the new QdrantVectorStore class (#24166)
## Description

Follow up on #24165. Adds a page to document the latest usage of the new
`QdrantVectorStore` class.
2024-07-15 10:39:23 -07:00
Erick Friis
1244e66bd4 docs: remove couchbase from docs linking (#24277)
`pip install couchbase` adds 12 minutes to the docs build...
2024-07-15 17:34:41 +00:00
wenngong
a001037319 retrievers: MultiVectorRetriever similarity_score_threshold search type (#23539)
Description: support MultiVectorRetriever similarity_score_threshold
search type.

Issue: #23387 #19404

---------

Co-authored-by: gongwn1 <gongwn1@lenovo.com>
2024-07-15 13:31:34 -04:00
Carlos André Antunes
20151384d7 fix azure_openai.py: some keys do not exists (#24158)
In some lines its trying to read a key that do not exists yet. In this
cases I changed the direct access to dict.get() method

Thank you for contributing to LangChain!

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-07-15 17:17:05 +00:00
blueoom
d895614d19 text_splitters: add request parameters for function HTMLHeaderTextSplitter.split_text… (#24178)
**Description:**

The `split_text_from_url` method of `HTMLHeaderTextSplitter` does not
include parameters like `timeout` when using `requests` to send a
request. Therefore, I suggest adding a `kwargs` parameter to the
function, which can be passed as arguments to `requests.get()`
internally, allowing control over the `get` request.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-15 16:43:56 +00:00
Bagatur
9d0c1d2dc9 docs: specify init_chat_model version (#24274) 2024-07-15 16:29:06 +00:00
MoraxMa
a7296bddc2 docs: updated Tongyi package (#24259)
* updated pip install package
2024-07-15 16:25:35 +00:00
Bagatur
c9473367b1 langchain[patch]: Release 0.2.8 (#24273) 2024-07-15 16:05:51 +00:00
JP-Ellis
f77659463a core[patch]: allow message utils to work with lcel (#23743)
The functions `convert_to_messages` has had an expansion of the
arguments it can take:

1. Previously, it only could take a `Sequence` in order to iterate over
it. This has been broadened slightly to an `Iterable` (which should have
no other impact).
2. Support for `PromptValue` and `BaseChatPromptTemplate` has been
added. These are generated when combining messages using the overloaded
`+` operator.

Functions which rely on `convert_to_messages` (namely `filter_messages`,
`merge_message_runs` and `trim_messages`) have had the type of their
arguments similarly expanded.

Resolves #23706.

<!--
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
-->

---------

Signed-off-by: JP-Ellis <josh@jpellis.me>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-07-15 08:58:05 -07:00
Harold Martin
ccdaf14eff docs: Spell check fixes (#24217)
**Description:** Spell check fixes for docs, comments, and a couple of
strings. No code change e.g. variable names.
**Issue:** none
**Dependencies:** none
**Twitter handle:** hmartin
2024-07-15 15:51:43 +00:00
Leonid Ganeline
cacdf96f9c core docstrings tracers update (#24211)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-15 11:37:09 -04:00
Leonid Ganeline
36ee083753 core: docstrings utils update (#24213)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-15 11:36:00 -04:00
thehunmonkgroup
e8a21146d3 community[patch]: upgrade default model for ChatAnyscale (#24232)
Old default `meta-llama/Llama-2-7b-chat-hf` no longer supported.
2024-07-15 11:34:59 -04:00
Bagatur
a0958c0607 docs: more tool call -> tool message docs (#24271) 2024-07-15 07:55:07 -07:00
Bagatur
620b118c70 core[patch]: Release 0.2.19 (#24272) 2024-07-15 07:51:30 -07:00
ccurme
888fbc07b5 core[patch]: support passing args_schema through as_tool (#24269)
Note: this allows the schema to be passed in positionally.

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.runnables import RunnableLambda


class Add(BaseModel):
    """Add two integers together."""

    a: int = Field(..., description="First integer")
    b: int = Field(..., description="Second integer")


def add(input: dict) -> int:
    return input["a"] + input["b"]


runnable = RunnableLambda(add)
as_tool = runnable.as_tool(Add)
as_tool.args_schema.schema()
```
```
{'title': 'Add',
 'description': 'Add two integers together.',
 'type': 'object',
 'properties': {'a': {'title': 'A',
   'description': 'First integer',
   'type': 'integer'},
  'b': {'title': 'B', 'description': 'Second integer', 'type': 'integer'}},
 'required': ['a', 'b']}
```
2024-07-15 07:51:05 -07:00
ccurme
ab2d7821a7 fireworks[patch]: use firefunction-v2 in standard tests (#24264) 2024-07-15 13:15:08 +00:00
ccurme
6fc7610b1c standard-tests[patch]: update test_bind_runnables_as_tools (#24241)
Reduce number of tool arguments from two to one.
2024-07-15 08:35:07 -04:00
Bagatur
0da5078cad langchain[minor]: Generic configurable model (#23419)
alternative to
[23244](https://github.com/langchain-ai/langchain/pull/23244). allows
you to use chat model declarative methods

![Screenshot 2024-06-25 at 1 07 10
PM](https://github.com/langchain-ai/langchain/assets/22008038/910d1694-9b7b-46bc-bc2e-3792df9321d6)
2024-07-15 01:11:01 +00:00
Bagatur
d0728b0ba0 core[patch]: add tool name to tool message (#24243)
Copying current ToolNode behavior
2024-07-15 00:42:40 +00:00
Bagatur
9224027e45 docs: tool artifacts how to (#24198) 2024-07-14 17:04:47 -07:00
Bagatur
5c3e2612da core[patch]: Release 0.2.18 (#24230) 2024-07-13 09:14:43 -07:00
Bagatur
65321bf975 core[patch]: fix ToolCall "type" when streaming (#24218) 2024-07-13 08:59:03 -07:00
Jacob Lee
2b7d1cdd2f docs[patch]: Update tool child run docs (#24160)
Documents #24143
2024-07-13 07:52:37 -07:00
Anush
a653b209ba qdrant: test new QdrantVectorStore (#24165)
## Description

This PR adds integration tests to follow up on #24164.

By default, the tests use an in-memory instance.

To run the full suite of tests, with both in-memory and Qdrant server:

```
$ docker run -p 6333:6333 qdrant/qdrant

$ make test

$ make integration_test
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:59:30 +00:00
Roman Solomatin
f071581aea openai[patch]: update openai params (#23691)
**Description:** Explicitly add parameters from openai API



- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 16:53:33 -07:00
Leonid Ganeline
f0a7581b50 milvus: docstring (#23151)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:25:31 +00:00
Christian D. Glissov
474b88326f langchain_qdrant: Added method "_asimilarity_search_with_relevance_scores" to Qdrant class (#23954)
I stumbled upon a bug that led to different similarity scores between
the async and sync similarity searches with relevance scores in Qdrant.
The reason being is that _asimilarity_search_with_relevance_scores is
missing, this makes langchain_qdrant use the method of the vectorstore
baseclass leading to drastically different results.

To illustrate the magnitude here are the results running an identical
search in a test vectorstore.

Output of asimilarity_search_with_relevance_scores:
[0.9902903374601824, 0.9472135924938804, 0.8535534011299859]

Output of similarity_search_with_relevance_scores:
[0.9805806749203648, 0.8944271849877607, 0.7071068022599718]

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:25:20 +00:00
Bagatur
bdc03997c9 standard-tests[patch]: check for ToolCall["type"] (#24209) 2024-07-12 16:17:34 -07:00
Nada Amin
3f1cf00d97 docs: Improve neo4j semantic templates (#23939)
I made some changes based on the issues I stumbled on while following
the README of neo4j-semantic-ollama.
I made the changes to the ollama variant, and can also port the relevant
ones to the layer variant once this is approved.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:08:25 +00:00
Nada Amin
6b47c7361e docs: fix code usage to use the ollama variant (#23937)
**Description:** the template neo4j-semantic-ollama uses an import from
the neo4j-semantic-layer template instead of its own.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 23:07:42 +00:00
Anirudh31415926535
7677ceea60 docs: model parameter mandatory for cohere embedding and rerank (#23349)
Latest langchain-cohere sdk mandates passing in the model parameter into
the Embeddings and Reranker inits.

This PR is to update the docs to reflect these changes.
2024-07-12 23:07:28 +00:00
Miroslav
aee55eda39 community: Skip Login to HuggubgFaceHub when token is not set (#21561)
Thank you for contributing to LangChain!

- [ ] **HuggingFaceEndpoint**: "Skip Login to HuggingFaceHub"
  - Where:  langchain, community, llm, huggingface_endpoint
 


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Skip login to huggingface hub when when
`huggingfacehub_api_token` is not set. This is needed when using custom
`endpoint_url` outside of HuggingFaceHub.
- **Issue:** the issue # it fixes
https://github.com/langchain-ai/langchain/issues/20342 and
https://github.com/langchain-ai/langchain/issues/19685
    - **Dependencies:** None


- [ ] **Add tests and docs**: 
  1. Tested with locally available TGI endpoint
  2.  Example Usage
```python
from langchain_community.llms import HuggingFaceEndpoint

llm = HuggingFaceEndpoint(
    endpoint_url='http://localhost:8080',
    server_kwargs={
        "headers": {"Content-Type": "application/json"}
    }
)
resp = llm.invoke("Tell me a joke")
print(resp)
```
 Also tested against HF Endpoints
 ```python
 from langchain_community.llms import HuggingFaceEndpoint
huggingfacehub_api_token = "hf_xyz"
repo_id = "mistralai/Mistral-7B-Instruct-v0.2"
llm = HuggingFaceEndpoint(
    huggingfacehub_api_token=huggingfacehub_api_token,
    repo_id=repo_id,
)
resp = llm.invoke("Tell me a joke")
print(resp)
 ```
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 22:10:32 +00:00
Anush
d09dda5a08 qdrant: Bump patch version (#24168)
# Description

To release a new version of `langchain-qdrant` after #24165 and #24166.
2024-07-12 14:48:50 -07:00
Bagatur
12950cc602 standard-tests[patch]: improve runnable tool description (#24210) 2024-07-12 21:33:56 +00:00
Erick Friis
e8ee781a42 ibm: move to external repo (#24208) 2024-07-12 21:14:24 +00:00
Bagatur
02e71cebed together[patch]: Release 0.1.4 (#24205) 2024-07-12 13:59:58 -07:00
Bagatur
259d4d2029 anthropic[patch]: Release 0.1.20 (#24204) 2024-07-12 13:59:15 -07:00
Bagatur
3aed74a6fc fireworks[patch]: Release 0.1.5 (#24203) 2024-07-12 13:58:58 -07:00
Bagatur
13b0d7ec8f openai[patch]: Release 0.1.16 (#24202) 2024-07-12 13:58:39 -07:00
Bagatur
71cd6e6feb groq[patch]: Release 0.1.7 (#24201) 2024-07-12 13:58:19 -07:00
Bagatur
99054e19eb mistralai[patch]: Release 0.1.10 (#24200) 2024-07-12 13:57:58 -07:00
Bagatur
7a1321e2f9 ibm[patch]: Release 0.1.10 (#24199) 2024-07-12 13:57:38 -07:00
Bagatur
cb5031f22f integrations[patch]: require core >=0.2.17 (#24207) 2024-07-12 20:54:01 +00:00
Nithish Raghunandanan
f1618ec540 couchbase: Add standard and semantic caches (#23607)
Thank you for contributing to LangChain!

**Description:** Add support for caching (standard + semantic) LLM
responses using Couchbase


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-12 20:30:03 +00:00
Eugene Yurtsev
8d82a0d483 core[patch]: Mark GraphVectorStore as beta (#24195)
* This PR marks graph vectorstore as beta
2024-07-12 14:28:06 -04:00
Bagatur
0a1e475a30 core[patch]: Release 0.2.17 (#24189) 2024-07-12 17:08:29 +00:00
Bagatur
6166ea67a8 core[minor]: rename ToolMessage.raw_output -> artifact (#24185) 2024-07-12 09:52:44 -07:00
Jean Nshuti
d77d9bfc00 community[patch]: update typo document content returned from semanticscholar (#24175)
Update "astract" -> abstract
2024-07-12 15:40:47 +00:00
Leonid Ganeline
aa3e3cfa40 core[patch]: docstrings runnables update (#24161)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-12 11:27:06 -04:00
mumu
14ba1d4b45 docs: fix numeric errors in tools_chain.ipynb (#24169)
Description: Corrected several numeric errors in the
docs/docs/how_to/tools_chain.ipynb file to ensure the accuracy of the
documentation.
2024-07-12 11:26:26 -04:00
Ikko Eltociear Ashimine
18da9f5e59 docs: update custom_chat_model.ipynb (#24170)
characetrs -> characters
2024-07-12 06:48:22 -04:00
Tomaz Bratanic
d3a2b9fae0 Fix neo4j type error on missing constraint information (#24177)
If you use `refresh_schema=False`, then the metadata constraint doesn't
exist. ATM, we used default `None` in the constraint check, but then
`any` fails because it can't iterate over None value
2024-07-12 06:39:29 -04:00
Anush
7014d07cab qdrant: new Qdrant implementation (#24164) 2024-07-12 04:52:02 +02:00
Xander Dumaine
35784d1c33 langchain[minor]: add document_variable_name to create_stuff_documents_chain (#24083)
- **Description:** `StuffDocumentsChain` uses `LLMChain` which is
deprecated by langchain runnables. `create_stuff_documents_chain` is the
replacement, but needs support for `document_variable_name` to allow
multiple uses of the chain within a longer chain.
- **Issue:** none
- **Dependencies:** none
2024-07-12 02:31:46 +00:00
Eugene Yurtsev
8858846607 milvus[patch]: Fix Milvus vectorstore for newer versions of langchain-core (#24152)
Fix for: https://github.com/langchain-ai/langchain/issues/24116

This keeps the old behavior of add_documents and add_texts
2024-07-11 18:51:18 -07:00
thedavgar
ffe6ca986e community: Fix Bug in Azure Search Vectorstore search asyncronously (#24081)
Thank you for contributing to LangChain!

**Description**:
This PR fixes a bug described in the issue in #24064, when using the
AzureSearch Vectorstore with the asyncronous methods to do search which
is also the method used for the retriever. The proposed change includes
just change the access of the embedding as optional because is it not
used anywhere to retrieve documents. Actually, the syncronous methods of
retrieval do not use the embedding neither.

With this PR the code given by the user in the issue works.

```python
vectorstore = AzureSearch(
    azure_search_endpoint=os.getenv("AI_SEARCH_ENDPOINT_SECRET"),
    azure_search_key=os.getenv("AI_SEARCH_API_KEY"),
    index_name=os.getenv("AI_SEARCH_INDEX_NAME_SECRET"),
    fields=fields,
    embedding_function=encoder,
)

retriever = vectorstore.as_retriever(search_type="hybrid", k=2)

await vectorstore.avector_search("what is the capital of France")
await retriever.ainvoke("what is the capital of France")
```

**Issue**:
The Azure Search Vectorstore is not working when searching for documents
with asyncronous methods, as described in issue #24064

**Dependencies**:
There are no extra dependencies required for this change.

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-11 18:32:19 -07:00
Anush
7790d67f94 qdrant: New sparse embeddings provider interface - PART 1 (#24015)
## Description

This PR introduces a new sparse embedding provider interface to work
with the new Qdrant implementation that will follow this PR.

Additionally, an implementation of this interface is provided with
https://github.com/qdrant/fastembed.

This PR will be followed by
https://github.com/Anush008/langchain/pull/3.
2024-07-11 17:07:25 -07:00
Erick Friis
1132fb801b core: release 0.2.16 (#24159) 2024-07-11 23:59:41 +00:00
Nuno Campos
1d37aa8403 core: Remove extra newline (#24157) 2024-07-11 23:55:36 +00:00
ccurme
cb95198398 standard-tests[patch]: add tests for runnables as tools and streaming usage metadata (#24153) 2024-07-11 18:30:05 -04:00
Erick Friis
d002fa902f infra: fix redundant matrix config (#24151) 2024-07-11 15:15:41 -07:00
Bagatur
8d100c58de core[patch]: Tool accept RunnableConfig (#24143)
Relies on #24038

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-11 22:13:17 +00:00
Bagatur
5fd1e67808 core[minor], integrations...[patch]: Support ToolCall as Tool input and ToolMessage as Tool output (#24038)
Changes:
- ToolCall, InvalidToolCall and ToolCallChunk can all accept a "type"
parameter now
- LLM integration packages add "type" to all the above
- Tool supports ToolCall inputs that have "type" specified
- Tool outputs ToolMessage when a ToolCall is passed as input
- Tools can separately specify ToolMessage.content and
ToolMessage.raw_output
- Tools emit events for validation errors (using on_tool_error and
on_tool_end)

Example:
```python
@tool("structured_api", response_format="content_and_raw_output")
def _mock_structured_tool_with_raw_output(
    arg1: int, arg2: bool, arg3: Optional[dict] = None
) -> Tuple[str, dict]:
    """A Structured Tool"""
    return f"{arg1} {arg2}", {"arg1": arg1, "arg2": arg2, "arg3": arg3}


def test_tool_call_input_tool_message_with_raw_output() -> None:
    tool_call: Dict = {
        "name": "structured_api",
        "args": {"arg1": 1, "arg2": True, "arg3": {"img": "base64string..."}},
        "id": "123",
        "type": "tool_call",
    }
    expected = ToolMessage("1 True", raw_output=tool_call["args"], tool_call_id="123")
    tool = _mock_structured_tool_with_raw_output
    actual = tool.invoke(tool_call)
    assert actual == expected

    tool_call.pop("type")
    with pytest.raises(ValidationError):
        tool.invoke(tool_call)

    actual_content = tool.invoke(tool_call["args"])
    assert actual_content == expected.content
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-11 14:54:02 -07:00
Bagatur
eeb996034b core[patch]: Release 0.2.15 (#24149) 2024-07-11 21:34:25 +00:00
Nuno Campos
03fba07d15 core[patch]: Update styles for mermaid graphs (#24147) 2024-07-11 14:19:36 -07:00
Jacob Lee
c481a2715d docs[patch]: Add structural example to style guide (#24133)
CC @nfcampos
2024-07-11 13:20:14 -07:00
ccurme
8ee8ca7c83 core[patch]: propagate parse_docstring to tool decorator (#24123)
Disabled by default.

```python
from langchain_core.tools import tool

@tool(parse_docstring=True)
def foo(bar: str, baz: int) -> str:
    """The foo.

    Args:
        bar: this is the bar
        baz: this is the baz
    """
    return bar


foo.args_schema.schema()
```
```json
{
  "title": "fooSchema",
  "description": "The foo.",
  "type": "object",
  "properties": {
    "bar": {
      "title": "Bar",
      "description": "this is the bar",
      "type": "string"
    },
    "baz": {
      "title": "Baz",
      "description": "this is the baz",
      "type": "integer"
    }
  },
  "required": [
    "bar",
    "baz"
  ]
}
```
2024-07-11 20:11:45 +00:00
Jacob Lee
4121d4151f docs[patch]: Fix typo (#24132)
CC @efriis
2024-07-11 20:10:48 +00:00
Erick Friis
bd18faa2a0 infra: add SQLAlchemy to min version testing (#23186)
preventing issues like #22546 

Notes:
- this will only affect release CI. We may want to consider adding
running unit tests with min versions to PR CI in some form
- because this only affects release CI, it could create annoying issues
releasing while I'm on vacation. Unless anyone feels strongly, I'll wait
to merge this til when I'm back
2024-07-11 20:09:57 +00:00
Jacob Lee
f1f1f75782 community[patch]: Make AzureML endpoint return AI messages for type assistant (#24085) 2024-07-11 21:45:30 +02:00
Eugene Yurtsev
4ba14adec6 core[patch]: Clean up indexing test code (#24139)
Refactor the code to use the existing InMemroyVectorStore.

This change is needed for another PR that moves some of the imports
around (and messes up the mock.patch in this file)
2024-07-11 18:54:46 +00:00
Atul R
457677c1b7 community: Fixes use of ImagePromptTemplate with Ollama (#24140)
Description: ImagePromptTemplate for Multimodal llms like llava when
using Ollama
Twitter handle: https://x.com/a7ulr

Details:

When using llava models / any ollama multimodal llms and passing images
in the prompt as urls, langchain breaks with this error.

```python
image_url_components = image_url.split(",")
                           ^^^^^^^^^^^^^^^^^^^^
AttributeError: 'dict' object has no attribute 'split'
```

From the looks of it, there was bug where the condition did check for a
`url` field in the variable but missed to actually assign it.

This PR fixes ImagePromptTemplate for Multimodal llms like llava when
using Ollama specifically.

@hwchase17
2024-07-11 11:31:48 -07:00
Matt
8327925ab7 community:support additional Azure Search Options (#24134)
- **Description:** Support additional kwargs options for the Azure
Search client (Described here
https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md#configurations)
    - **Issue:** N/A
    - **Dependencies:** No additional Dependencies

---------
2024-07-11 18:22:36 +00:00
ccurme
122e80e04d core[patch]: add versionadded to as_tool (#24138) 2024-07-11 18:08:08 +00:00
Erick Friis
c4417ea93c core: release 0.2.14, remove poetry 1.7 incompatible flag from root (#24137) 2024-07-11 17:59:51 +00:00
Isaac Francisco
7a62d3dbd6 standard-tests[patch]: test that bind_tools can accept regular python function (#24135) 2024-07-11 17:42:17 +00:00
Nuno Campos
2428984205 core: Add metadata to graph json repr (#24131)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-11 17:23:52 +00:00
Harley Gross
ea3cd1ebba community[minor]: added support for C in RecursiveCharacterTextSplitter (#24091)
Description: Added support for C in RecursiveCharacterTextSplitter by
reusing the separators for C++
2024-07-11 16:47:48 +00:00
Nuno Campos
3e454d7568 core: fix docstring (#24129) 2024-07-11 16:38:14 +00:00
Eugene Yurtsev
08638ccc88 community[patch]: QianfanLLMEndpoint fix type information for the keys (#24128)
Fix for issue: https://github.com/langchain-ai/langchain/issues/24126
2024-07-11 16:24:26 +00:00
Nuno Campos
ee3fe20af4 core: mermaid: Render metadata key-value pairs when drawing mermaid graph (#24103)
- if node is runnable binding with metadata attached
2024-07-11 16:22:23 +00:00
Eugene Yurtsev
1e7d8ba9a6 ci[patch]: Update community linter to provide a helpful error message (#24127)
Update community import linter to explain what's wrong
2024-07-11 16:22:08 +00:00
maang-h
16e178a8c2 docs: Add MiniMaxChat docstrings (#24026)
- **Description:** Add MiniMaxChat rich docstrings.
- **Issue:** the issue #22296
2024-07-11 10:55:02 -04:00
Christophe Bornet
5fc5ef2b52 community[minor]: Add graph store extractors (#24065)
This adds an extractor interface and an implementation for HTML pages.
Extractors are used to create GraphVectorStore Links on loaded content.

**Twitter handle:** cbornet_
2024-07-11 10:35:31 -04:00
maang-h
9bcf8f867d docs: Add SQLChatMessageHistory docstring (#23978)
- **Description:** Add SQLChatMessageHistory docstring.
- **Issue:** the issue #21983

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-11 14:24:28 +00:00
Rafael Pereira
092e9ee0e6 community[minor]: Neo4j Fixed similarity docs (#23913)
**Description:** There was missing some documentation regarding the
`filter` and `params` attributes in similarity search methods.

---------

Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>
2024-07-11 10:16:48 -04:00
Mis
10d8c3cbfa docs: Fix column positioning in the text splitting section for AI21SemanticTextSplitter (#24062) 2024-07-11 09:38:04 -04:00
Jacob Lee
555c6d3c20 docs[patch]: Updates tool error handling guide, add admonition (#24102)
@eyurtsev
2024-07-10 21:10:46 -07:00
Eugene Yurtsev
dc131ac42a core[minor]: Add dispatching for custom events (#24080)
This PR allows dispatching adhoc events for a given run.

# Context

This PR allows users to send arbitrary data to the callback system and
to the astream events API from within a given runnable. This can be
extremely useful to surface custom information to end users about
progress etc.

Integration with langsmith tracer will be done separately since the data
cannot be currently visualized. It'll be accommodated using the events
attribute of the Run

# Examples with astream events

```python
from langchain_core.callbacks import adispatch_custom_event
from langchain_core.tools import tool

@tool
async def foo(x: int) -> int:
    """Foo"""
    await adispatch_custom_event("event1", {"x": x})
    await adispatch_custom_event("event2", {"x": x})
    return x + 1

async for event in foo.astream_events({'x': 1}, version='v2'):
    print(event)
```

```python
{'event': 'on_tool_start', 'data': {'input': {'x': 1}}, 'name': 'foo', 'tags': [], 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'metadata': {}, 'parent_ids': []}
{'event': 'on_custom_event', 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []}
{'event': 'on_custom_event', 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []}
{'event': 'on_tool_end', 'data': {'output': 2}, 'run_id': 'fd6fb7a7-dd37-4191-962c-e43e245909f6', 'name': 'foo', 'tags': [], 'metadata': {}, 'parent_ids': []}
```

```python
from langchain_core.callbacks import adispatch_custom_event
from langchain_core.runnables import RunnableLambda

@RunnableLambda
async def foo(x: int) -> int:
    """Foo"""
    await adispatch_custom_event("event1", {"x": x})
    await adispatch_custom_event("event2", {"x": x})
    return x + 1

async for event in foo.astream_events(1, version='v2'):
    print(event)
```

```python
{'event': 'on_chain_start', 'data': {'input': 1}, 'name': 'foo', 'tags': [], 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'metadata': {}, 'parent_ids': []}
{'event': 'on_custom_event', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'event1', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []}
{'event': 'on_custom_event', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'event2', 'tags': [], 'metadata': {}, 'data': {'x': 1}, 'parent_ids': []}
{'event': 'on_chain_stream', 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'foo', 'tags': [], 'metadata': {}, 'data': {'chunk': 2}, 'parent_ids': []}
{'event': 'on_chain_end', 'data': {'output': 2}, 'run_id': 'ce2beef2-8608-49ea-8eba-537bdaafb8ec', 'name': 'foo', 'tags': [], 'metadata': {}, 'parent_ids': []}
```

# Examples with handlers 

This is copy pasted from unit tests

```python
    class CustomCallbackManager(BaseCallbackHandler):
        def __init__(self) -> None:
            self.events: List[Any] = []

        def on_custom_event(
            self,
            name: str,
            data: Any,
            *,
            run_id: UUID,
            tags: Optional[List[str]] = None,
            metadata: Optional[Dict[str, Any]] = None,
            **kwargs: Any,
        ) -> None:
            assert kwargs == {}
            self.events.append(
                (
                    name,
                    data,
                    run_id,
                    tags,
                    metadata,
                )
            )

    callback = CustomCallbackManager()

    run_id = uuid.UUID(int=7)

    @RunnableLambda
    def foo(x: int, config: RunnableConfig) -> int:
        dispatch_custom_event("event1", {"x": x})
        dispatch_custom_event("event2", {"x": x}, config=config)
        return x

    foo.invoke(1, {"callbacks": [callback], "run_id": run_id})

    assert callback.events == [
        ("event1", {"x": 1}, UUID("00000000-0000-0000-0000-000000000007"), [], {}),
        ("event2", {"x": 1}, UUID("00000000-0000-0000-0000-000000000007"), [], {}),
    ]
```
2024-07-11 02:25:12 +00:00
Jacob Lee
14a8bbc21a docs[patch]: Adds tool intermediate streaming guide (#24098)
Can merge now and update when we add support for custom events.

CC @eyurtsev @vbarda
2024-07-10 17:38:51 -07:00
Erick Friis
1de1182a9f docs: discourage unconfirmed partner packages (#24099) 2024-07-11 00:34:37 +00:00
Erick Friis
71c2221f8c openai: release 0.1.15 (#24097) 2024-07-10 16:45:42 -07:00
Erick Friis
6ea6f9f7bc core: release 0.2.13 (#24096) 2024-07-10 16:39:15 -07:00
ccurme
975b6129f6 core[patch]: support conversion of runnables to tools (#23992)
Open to other thoughts on UX.

string input:
```python
as_tool = retriever.as_tool()
as_tool.invoke("cat")  # [Document(...), ...]
```

typed dict input:
```python
class Args(TypedDict):
    key: int

def f(x: Args) -> str:
    return str(x["key"] * 2)

as_tool = RunnableLambda(f).as_tool(
    name="my tool",
    description="description",  # name, description are inferred if not supplied
)
as_tool.invoke({"key": 3})  # "6"
```

for untyped dict input, allow specification of parameters + types
```python
def g(x: Dict[str, Any]) -> str:
    return str(x["key"] * 2)

as_tool = RunnableLambda(g).as_tool(arg_types={"key": int})
result = as_tool.invoke({"key": 3})  # "6"
```

Passing the `arg_types` is slightly awkward but necessary to ensure tool
calls populate parameters correctly:
```python
from typing import Any, Dict

from langchain_core.runnables import RunnableLambda
from langchain_openai import ChatOpenAI


def f(x: Dict[str, Any]) -> str:
    return str(x["key"] * 2)

runnable = RunnableLambda(f)
as_tool = runnable.as_tool(arg_types={"key": int})

llm = ChatOpenAI().bind_tools([as_tool])

result = llm.invoke("Use the tool on 3.")
tool_call = result.tool_calls[0]
args = tool_call["args"]
assert args == {"key": 3}

as_tool.run(args)
```

Contrived (?) example with langgraph agent as a tool:
```python
from typing import List, Literal
from typing_extensions import TypedDict

from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent


llm = ChatOpenAI(temperature=0)


def magic_function(input: int) -> int:
    """Applies a magic function to an input."""
    return input + 2


agent_1 = create_react_agent(llm, [magic_function])


class Message(TypedDict):
    role: Literal["human"]
    content: str

agent_tool = agent_1.as_tool(
    arg_types={"messages": List[Message]},
    name="Jeeves",
    description="Ask Jeeves.",
)

agent_2 = create_react_agent(llm, [agent_tool])
```

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-10 19:29:59 -04:00
Jacob Lee
b63a48b7d3 docs[patch]: Fix typos, add prereq sections (#24095) 2024-07-10 23:15:37 +00:00
Erick Friis
9de562f747 infra: create individual jobs in check_diff, do max milvus testing in 3.11 (#23829)
pickup from #23721
2024-07-10 22:45:18 +00:00
Erick Friis
141943a7e1 infra: docs ignore step in script (#24090) 2024-07-10 15:18:00 -07:00
Bagatur
6928f4c438 core[minor]: Add ToolMessage.raw_output (#23994)
Decisions to discuss:
1.  is a new attr needed or could additional_kwargs be used for this
2. is raw_output a good name for this attr
3. should raw_output default to {} or None
4. should raw_output be included in serialization
5. do we need to update repr/str to  exclude raw_output
2024-07-10 20:11:10 +00:00
jongwony
14dd89a1ee docs: add itemgetter in how_to/dynamic_chain (#23951)
Hello! I am honored to be able to contribute to the LangChain project
for the first time.

- **Description:** Using `RunnablePassthrough` logic without providing
`chat_history` key will result in nested keys in `question`, so I submit
a pull request to resolve this issue.

I am attaching a LangSmith screenshot below.

This is the result of the current version of the document.

<img width="1112" alt="image"
src="https://github.com/langchain-ai/langchain/assets/12846075/f0597089-c375-472f-b2bf-793baaecd836">

without `chat_history`:

<img width="1112" alt="image"
src="https://github.com/langchain-ai/langchain/assets/12846075/5c0e3ae7-3afe-417c-9132-770387f0fff2">


- **Lint and test**: 

<img width="777" alt="image"
src="https://github.com/langchain-ai/langchain/assets/12846075/575d2545-3aed-4338-9779-1a0b17365418">
2024-07-10 17:17:51 +00:00
Eugene Yurtsev
c4e149d4f1 community[patch]: Add linter to catch @root_validator (#24070)
- Add linter to prevent further usage of vanilla root validator
- Udpate remaining root validators
2024-07-10 14:51:03 +00:00
ccurme
9c6efadec3 community[patch]: propagate cost information to OpenAI callback (#23996)
This is enabled following
https://github.com/langchain-ai/langchain/pull/22716.
2024-07-10 14:50:35 +00:00
Dismas Banda
91b37b2d81 docs: fix spelling mistake in concepts.mdx: Fouth -> Fourth (#24067)
**Description:** 
Corrected the spelling for fourth.

**Twitter handle:** @dismasbanda
2024-07-10 14:35:54 +00:00
William FH
1e1fd30def [Core] Fix fstring in logger warning (#24043) 2024-07-09 19:53:18 -07:00
Jacob Lee
66265aaac4 docs[patch]: Update GPT4All docs (#24044)
CC @efriis
2024-07-10 02:39:42 +00:00
Jacob Lee
8dac0fb3f1 docs[patch]: Remove deprecated Airbyte loaders from listings (#23927)
CC @efriis
2024-07-10 02:21:25 +00:00
G Sreejith
68fee3e44b docs: template readme update, fix docstring typo in a runnable (#24002)
URL

https://python.langchain.com/v0.2/docs/templates/openai-functions-tool-retrieval-agent/

Checklist
I added a url -
https://python.langchain.com/v0.2/docs/templates/openai-functions-agent/
2024-07-09 14:03:31 -07:00
Ethan Yang
13855ef0c3 [HuggingFace Pipeline] add streaming support (#23852) 2024-07-09 17:02:00 -04:00
Erick Friis
34a02efcf9 infra: remove double heading in release notes (#24037) 2024-07-09 20:48:17 +00:00
Nuno Campos
859e434932 core: Speed up json parse for large strings (#24036)
for a large string:
- old 4.657918874989264
- new 0.023724667000351474
2024-07-09 12:26:50 -07:00
Nuno Campos
160fc7f246 core: Move json parsing in base chat model / output parser to bg thread (#24031)
- add version of AIMessageChunk.__add__ that can add many chunks,
instead of only 2
- In agenerate_from_stream merge and parse chunks in bg thread
- In output parse base classes do more work in bg threads where
appropriate

---------

Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2024-07-09 12:26:36 -07:00
Nuno Campos
73966e693c openai: Create msg chunk in bg thread (#24032)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-09 12:01:51 -07:00
Erick Friis
007c5a85d5 multiple: use modern installer in poetry (#23998) 2024-07-08 18:50:48 -07:00
Erick Friis
e80c150c44 community: release 0.2.7 (prev was langchain) (#23997) 2024-07-08 23:43:32 +00:00
Erick Friis
9f8fd08955 community: release 0.2.7 (#23993) 2024-07-08 22:04:58 +00:00
Bhadresh Savani
5d78b34a6f [Docs] typo Update in azureopenai.ipynb (#23945)
Update documentation for a typo.
2024-07-08 17:48:33 -04:00
Erick Friis
bedd893cd1 core: release 0.2.12 (#23991) 2024-07-08 21:29:29 +00:00
Bagatur
1e957c0c23 docs: rm discord (#23985) 2024-07-08 14:27:58 -07:00
Eugene Yurtsev
f765e8fa9d core[minor],community[patch],standard-tests[patch]: Move InMemoryImplementation to langchain-core (#23986)
This PR moves the in memory implementation to langchain-core.

* The implementation remains importable from langchain-community.
* Supporting utilities are marked as private for now.
2024-07-08 14:11:51 -07:00
Eugene Yurtsev
aa8c9bb4a9 community[patch]: Add constraint for pdfminer.six to unbreak CI (#23988)
Something changed in pdfminer six. This PR unreaks CI without
fixing the underlying PDF parser.
2024-07-08 20:55:19 +00:00
Eugene Yurtsev
2c180d645e core[minor],community[minor]: Upgrade all @root_validator() to @pre_init (#23841)
This PR introduces a @pre_init decorator that's a @root_validator(pre=True) but with all the defaults populated!
2024-07-08 16:09:29 -04:00
Mustafa Abdul-Kader
f152d6ed3d docs(llamacpp): fix copy paste error (#23983) 2024-07-08 20:06:04 +00:00
JonasDeitmersATACAMA
4d6f28cdde Update annoy.ipynb (#23970)
mmemory in the description -> memory (corrected spelling mistake)

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-08 12:52:05 +00:00
Zheng Robert Jia
bf8d4716a7 Update concepts.mdx (#23955)
Added link to list of built-in tools.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-08 08:47:51 -04:00
Zheng Robert Jia
4ec5fdda8d Update index.mdx (#23956)
Added reference to built-in tools list.
2024-07-08 08:47:28 -04:00
ccurme
ee579c77c1 docs: chain migration guide (#23844)
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
2024-07-05 16:37:34 -07:00
Eugene Yurtsev
9787552b00 core[patch]: Use InMemoryChatMessageHistory in unit tests (#23916)
Update unit test to use the existing implementation of chat message
history
2024-07-05 20:10:54 +00:00
Rajendra Kadam
8b84457b17 community[minor]: Support PGVector in PebbloRetrievalQA (#23874)
- **Description:** Support PGVector in PebbloRetrievalQA
  - Identity and Semantic Enforcement support for PGVector
  - Refactor Vectorstore validation and name check
  - Clear the overridden identity and semantic enforcement filters
- **Issue:** NA
- **Dependencies:** NA
- **Tests**: NA(already added)
-  **Docs**: Updated
- **Twitter handle:** [@Raj__725](https://twitter.com/Raj__725)
2024-07-05 16:02:25 -04:00
Eugene Yurtsev
e0186df56b core[patch]: Clarify upsert response semantics (#23921) 2024-07-05 15:59:47 -04:00
Leonid Ganeline
fcd018be47 docs: langgraph link fix (#23848)
Link for the LangGraph doc is instead the LG repo link.
Fixed the link
2024-07-05 15:50:45 -04:00
Robbie Cronin
0990ab146c community: update import in chatbot tutorial to use InMemoryChatMessageHistory (#23903)
Summary of change:

- Replace ChatMessageHistory with InMemoryChatMessageHistory

Fixes #23892

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-05 15:48:11 -04:00
Rajendra Kadam
ee8aa54f53 community[patch]: Fix source path mismatch in PebbloSafeLoader (#23857)
**Description:** Fix for source path mismatch in PebbloSafeLoader. The
fix involves storing the full path in the doc metadata in VectorDB
**Issue:** NA, caught in internal testing
**Dependencies:** NA
**Add tests**:  Updated tests
2024-07-05 15:24:17 -04:00
Eugene Yurtsev
5b7d5f7729 core[patch]: Add comment to clarify aadd_documents (#23920)
Add comment to clarify how add documents works
2024-07-05 15:20:16 -04:00
Eugene Yurtsev
e0889384d9 standard-tests[minor]: add unit tests for testing get_by_ids, aget_by_ids, upsert, aupsert_by_ids (#23919)
These standard unit tests provide standard tests for functionality
introduced in these PRs:

* https://github.com/langchain-ai/langchain/pull/23774
* https://github.com/langchain-ai/langchain/pull/23594
2024-07-05 19:11:54 +00:00
ccurme
74c7198906 core, anthropic[patch]: support streaming tool calls when function has no arguments (#23915)
resolves https://github.com/langchain-ai/langchain/issues/23911

When an AIMessageChunk is instantiated, we attempt to parse tool calls
off of the tool_call_chunks.

Here we add a special-case to this parsing, where `""` will be parsed as
`{}`.

This is a reaction to how Anthropic streams tool calls in the case where
a function has no arguments:
```
{'id': 'toolu_01J8CgKcuUVrMqfTQWPYh64r', 'input': {}, 'name': 'magic_function', 'type': 'tool_use', 'index': 1}
{'partial_json': '', 'type': 'tool_use', 'index': 1}
```
The `partial_json` does not accumulate to a valid json string-- most
other providers tend to emit `"{}"` in this case.
2024-07-05 18:57:41 +00:00
Mateusz Szewczyk
902b57d107 IBM: Added WatsonxChat passing params to invoke method (#23758)
Thank you for contributing to LangChain!

- [x] **PR title**: "IBM: Added WatsonxChat to chat models preview,
update passing params to invoke method"


- [x] **PR message**: 
- **Description:** Added WatsonxChat passing params to invoke method,
added integration tests
    - **Dependencies:** `ibm_watsonx_ai`


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-05 18:07:50 +00:00
ccurme
1f5a163f42 langchain[patch]: deprecate QAGenerationChain (#23730) 2024-07-05 18:06:19 +00:00
ccurme
25de47878b langchain[patch]: deprecate AnalyzeDocumentChain (#23769) 2024-07-05 14:00:23 -04:00
Christophe Bornet
42d049f618 core[minor]: Add Graph Store component (#23092)
This PR introduces a GraphStore component. GraphStore extends
VectorStore with the concept of links between documents based on
document metadata. This allows linking documents based on a variety of
techniques, including common keywords, explicit links in the content,
and other patterns.

This works with existing Documents, so it’s easy to extend existing
VectorStores to be used as GraphStores. The interface can be implemented
for any Vector Store technology that supports metadata, not only graph
DBs.

When retrieving documents for a given query, the first level of search
is done using classical similarity search. Next, links may be followed
using various traversal strategies to get additional documents. This
allows documents to be retrieved that aren’t directly similar to the
query but contain relevant information.

2 retrieving methods are added to the VectorStore ones : 
* traversal_search which gets all linked documents up to a certain depth
* mmr_traversal_search which selects linked documents using an MMR
algorithm to have more diverse results.

If a depth of retrieval of 0 is used, GraphStore is effectively a
VectorStore. It enables an easy transition from a simple VectorStore to
GraphStore by adding links between documents as a second step.

An implementation for Apache Cassandra is also proposed.

See
https://github.com/datastax/ragstack-ai/blob/main/libs/knowledge-store/notebooks/astra_support.ipynb
for a notebook explaining how to use GraphStore and that shows that it
can answer correctly to questions that a simple VectorStore cannot.

**Twitter handle:** _cbornet
2024-07-05 12:24:10 -04:00
Leonid Ganeline
77f5fc3d55 core: docstrings load (#23787)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-05 12:23:19 -04:00
Eugene Yurtsev
6f08e11d7c core[minor]: add upsert, streaming_upsert, aupsert, astreaming_upsert methods to the VectorStore abstraction (#23774)
This PR rolls out part of the new proposed interface for vectorstores
(https://github.com/langchain-ai/langchain/pull/23544) to existing store
implementations.

The PR makes the following changes:

1. Adds standard upsert, streaming_upsert, aupsert, astreaming_upsert
methods to the vectorstore.
2. Updates `add_texts` and `aadd_texts` to be non required with a
default implementation that delegates to `upsert` and `aupsert` if those
have been implemented. The original `add_texts` and `aadd_texts` methods
are problematic as they spread object specific information across
document and **kwargs. (e.g., ids are not a part of the document)
3. Adds a default implementation to `add_documents` and `aadd_documents`
that delegates to `upsert` and `aupsert` respectively.
4. Adds standard unit tests to verify that a given vectorstore
implements a correct read/write API.

A downside of this implementation is that it creates `upsert` with a
very similar signature to `add_documents`.
The reason for introducing `upsert` is to:
* Remove any ambiguities about what information is allowed in `kwargs`.
Specifically kwargs should only be used for information common to all
indexed data. (e.g., indexing timeout).
*Allow inheriting from an anticipated generalized interface for indexing
that will allow indexing `BaseMedia` (i.e., allow making a vectorstore
for images/audio etc.)
 
`add_documents` can be deprecated in the future in favor of `upsert` to
make sure that users have a single correct way of indexing content.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-05 12:21:40 -04:00
G Sreejith
3c752238c5 core[patch]: Fix typo in docstring (graphm -> graph) (#23910)
Changes has been as per the request
Replaced graphm with graph
2024-07-05 16:20:33 +00:00
Leonid Ganeline
12c92b6c19 core: docstrings outputs (#23889)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-05 12:18:17 -04:00
Leonid Ganeline
1eca98ec56 core: docstrings prompts (#23890)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-05 12:17:52 -04:00
Philippe PRADOS
289960bc60 community[patch]: Redis.delete should be a regular method not a static method (#23873)
The `langchain_common.vectostore.Redis.delete()` must not be a
`@staticmethod`.

With the current implementation, it's not possible to have multiple
instances of Redis vectorstore because all versions must share the
`REDIS_URL`.

It's not conform with the base class.
2024-07-05 12:04:58 -04:00
Mohammad Mohtashim
2274d2b966 core[patch]: Accounting for Optional Input Variables in BasePromptTemplate (#22851)
**Description**: After reviewing the prompts API, it is clear that the
only way a user can explicitly mark an input variable as optional is
through the `MessagePlaceholder.optional` attribute. Otherwise, the user
must explicitly pass in the `input_variables` expected to be used in the
`BasePromptTemplate`, which will be validated upon execution. Therefore,
to semantically handle a `MessagePlaceholder` `variable_name` as
optional, we will treat the `variable_name` of `MessagePlaceholder` as a
`partial_variable` if it has been marked as optional. This approach
aligns with how the `variable_name` of `MessagePlaceholder` is already
handled
[here](https://github.com/keenborder786/langchain/blob/optional_input_variables/libs/core/langchain_core/prompts/chat.py#L991).
Additionally, an attribute `optional_variable` has been added to
`BasePromptTemplate`, and the `variable_name` of `MessagePlaceholder` is
also made part of `optional_variable` when marked as optional.

Moreover, the `get_input_schema` method has been updated for
`BasePromptTemplate` to differentiate between optional and non-optional
variables.

**Issue**: #22832, #21425

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-05 15:49:40 +00:00
Klaudia Lemiec
a2082bc1f8 docs: Arxiv docs update (#23871)
- [X] **PR title**
- [X] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** Update of docstrings and docpages
- **Issue:**
[22866](https://github.com/langchain-ai/langchain/issues/22866)

- [X] **Add tests and docs**

- [X] **Lint and test**
2024-07-05 11:43:51 -04:00
jonathan | ヨナタン
d311f22182 Langchain: fixed a typo in the imports (#23864)
Description: Fixed a typo during the imports for the
GoogleDriveSearchTool
    
Issue: It's only for the docs, but it bothered me so i decided to fix it
quickly :D
2024-07-05 15:42:50 +00:00
Arun Sasidharan
db6512aa35 docs: fix typo in llm_chain.ipynb (#23907)
- Fix typo in the tutorial step
- Add some context on `text`
2024-07-05 15:41:46 +00:00
André Quintino
99b1467b63 community: add support for 'cloud' parameter in JiraAPIWrapper (#23057)
- **Description:** Enhance JiraAPIWrapper to accept the 'cloud'
parameter through an environment variable. This update allows more
flexibility in configuring the environment for the Jira API.
 - **Twitter handle:** Andre_Q_Pereira

---------

Co-authored-by: André Quintino <andre.quintino@tui.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-05 15:11:10 +00:00
wenngong
b1e90b3075 community: add model_name param valid for GPT4AllEmbeddings (#23867)
Description: add model_name param valid for GPT4AllEmbeddings

Issue: #23863 #22819

---------

Co-authored-by: gongwn1 <gongwn1@lenovo.com>
2024-07-05 10:46:34 -04:00
volodymyr-memsql
a4eb6d0fb1 community: add SingleStoreDB semantic cache (#23218)
This PR adds a `SingleStoreDBSemanticCache` class that implements a
cache based on SingleStoreDB vector store, integration tests, and a
notebook example.

Additionally, this PR contains minor changes to SingleStoreDB vector
store:
 - change add texts/documents methods to return a list of inserted ids
 - implement delete(ids) method to delete documents by list of ids
 - added drop() method to drop a correspondent database table
- updated integration tests to use and check functionality implemented
above


CC: @baskaryan, @hwchase17

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
2024-07-05 09:26:06 -04:00
Igor Drozdov
bb597b1286 feat(community): add bind_tools function for ChatLiteLLM (#23823)
It's a follow-up to https://github.com/langchain-ai/langchain/pull/23765

Now the tools can be bound by calling `bind_tools`

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.utils.function_calling import convert_to_openai_tool
from langchain_community.chat_models import ChatLiteLLM

class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

prompt = "Which city is hotter today and which is bigger: LA or NY?"
# tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)]
tools = [GetWeather, GetPopulation]

llm = ChatLiteLLM(model="claude-3-sonnet-20240229").bind_tools(tools)
ai_msg = llm.invoke(prompt)
print(ai_msg.tool_calls)
```

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>
2024-07-05 09:19:41 -04:00
eliasecchig
efb48566d0 docs: add Vertex Feature Store, edit BigQuery Vector Search (#23709)
Add Vertex Feature Store, edit BigQuery Vector Search docs

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-05 12:12:21 +00:00
Yuki Watanabe
0e916d0d55 community: Overhaul MLflow Integration documentation (#23067) 2024-07-03 22:52:17 -04:00
ccurme
e62f8f143f infra: remove cohere from monorepo scheduled tests (#23846) 2024-07-03 21:48:39 +00:00
Jiejun Tan
2be66a38d8 huggingface: Fix huggingface tei support (#22653)
Update former pull request:
https://github.com/langchain-ai/langchain/pull/22595.

Modified
`libs/partners/huggingface/langchain_huggingface/embeddings/huggingface_endpoint.py`,
where the API call function does not match current [Text Embeddings
Inference
API](https://huggingface.github.io/text-embeddings-inference/#/Text%20Embeddings%20Inference/embed).
One example is:
```json
{
  "inputs": "string",
  "normalize": true,
  "truncate": false
}
```
Parameters in `_model_kwargs` are not passed properly in the latest
version. By the way, the issue *[why cause 413?
#50](https://github.com/huggingface/text-embeddings-inference/issues/50)*
might be solved.
2024-07-03 13:30:29 -07:00
Eugene Yurtsev
9ccc4b1616 core[patch]: Fix logic in BaseChatModel that processes the llm string that is used as a key for caching chat models responses (#23842)
This PR should fix the following issue:
https://github.com/langchain-ai/langchain/issues/23824
Introduced as part of this PR:
https://github.com/langchain-ai/langchain/pull/23416

I am unable to reproduce the issue locally though it's clear that we're
getting a `serialized` object which is not a dictionary somehow.

The test below passes for me prior to the PR as well

```python

def test_cache_with_sqllite() -> None:
    from langchain_community.cache import SQLiteCache

    from langchain_core.globals import set_llm_cache

    cache = SQLiteCache(database_path=".langchain.db")
    set_llm_cache(cache)
    chat_model = FakeListChatModel(responses=["hello", "goodbye"], cache=True)
    assert chat_model.invoke("How are you?").content == "hello"
    assert chat_model.invoke("How are you?").content == "hello"
```
2024-07-03 16:23:55 -04:00
Vadym Barda
9bb623381b core[minor]: update conversion utils to handle RemoveMessage (#23840) 2024-07-03 16:13:31 -04:00
Eugene Yurtsev
4ab78572e7 core[patch]: Speed up unit tests for imports (#23837)
Speed up unit tests for imports
2024-07-03 15:55:15 -04:00
Nico Puhlmann
4a15fce516 langchain: update declarative_base import (#20056)
**Description**: The ``declarative_base()`` function is now available as
sqlalchemy.orm.declarative_base(). (depreca ted since: 2.0) (Background
on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-07-03 15:52:35 -04:00
Mu Xian Ming
c06c666ce5 docs: fix docs/tutorials/llm_chain.ipynb (#23807)
to correctly display the link

Co-authored-by: Mu Xianming <mu.xianming@lmwn.com>
2024-07-03 15:38:31 -04:00
Vadym Barda
d206df8d3d docs: improve structure in the agent migration to langgraph guide (#23817) 2024-07-03 12:25:11 -07:00
Théo Deschamps
39b19cf764 core[patch]: extract input variables for path and detail keys in order to format an ImagePromptTemplate (#22613)
- Description: Add support for `path` and `detail` keys in
`ImagePromptTemplate`. Previously, only variables associated with the
`url` key were considered. This PR allows for the inclusion of a local
image path and a detail parameter as input to the format method.
- Issues:
    - fixes #20820 
    - related to #22024 
- Dependencies: None
- Twitter handle: @DeschampsTho5

---------

Co-authored-by: tdeschamps <tdeschamps@kameleoon.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-03 18:58:42 +00:00
Bagatur
a4798802ef cli[patch]: ruff 0.5 (#23833) 2024-07-03 18:33:15 +00:00
Leonid Ganeline
55f6f91f17 core[patch]: docstrings output_parsers (#23825)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-03 14:27:40 -04:00
Philippe PRADOS
26cee2e878 partners[patch]: MongoDB vectorstore to return and accept string IDs (#23818)
The mongdb have some errors.
- `add_texts() -> List` returns a list of `ObjectId`, and not a list of
string
- `delete()` with `id` never remove chunks.

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-07-03 14:14:08 -04:00
Ikko Eltociear Ashimine
75734fbcf1 community: fix typo in unit tests for test_zenguard.py (#23819)
enviroment -> environment


- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"
2024-07-03 14:05:42 -04:00
Bagatur
a0c2281540 infra: update mypy 1.10, ruff 0.5 (#23721)
```python
"""python scripts/update_mypy_ruff.py"""
import glob
import tomllib
from pathlib import Path

import toml
import subprocess
import re

ROOT_DIR = Path(__file__).parents[1]


def main():
    for path in glob.glob(str(ROOT_DIR / "libs/**/pyproject.toml"), recursive=True):
        print(path)
        with open(path, "rb") as f:
            pyproject = tomllib.load(f)
        try:
            pyproject["tool"]["poetry"]["group"]["typing"]["dependencies"]["mypy"] = (
                "^1.10"
            )
            pyproject["tool"]["poetry"]["group"]["lint"]["dependencies"]["ruff"] = (
                "^0.5"
            )
        except KeyError:
            continue
        with open(path, "w") as f:
            toml.dump(pyproject, f)
        cwd = "/".join(path.split("/")[:-1])
        completed = subprocess.run(
            "poetry lock --no-update; poetry install --with typing; poetry run mypy . --no-color",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )
        logs = completed.stdout.split("\n")

        to_ignore = {}
        for l in logs:
            if re.match("^(.*)\:(\d+)\: error:.*\[(.*)\]", l):
                path, line_no, error_type = re.match(
                    "^(.*)\:(\d+)\: error:.*\[(.*)\]", l
                ).groups()
                if (path, line_no) in to_ignore:
                    to_ignore[(path, line_no)].append(error_type)
                else:
                    to_ignore[(path, line_no)] = [error_type]
        print(len(to_ignore))
        for (error_path, line_no), error_types in to_ignore.items():
            all_errors = ", ".join(error_types)
            full_path = f"{cwd}/{error_path}"
            try:
                with open(full_path, "r") as f:
                    file_lines = f.readlines()
            except FileNotFoundError:
                continue
            file_lines[int(line_no) - 1] = (
                file_lines[int(line_no) - 1][:-1] + f"  # type: ignore[{all_errors}]\n"
            )
            with open(full_path, "w") as f:
                f.write("".join(file_lines))

        subprocess.run(
            "poetry run ruff format .; poetry run ruff --select I --fix .",
            cwd=cwd,
            shell=True,
            capture_output=True,
            text=True,
        )


if __name__ == "__main__":
    main()

```
2024-07-03 10:33:27 -07:00
William FH
6cd56821dc [Core] Unify function schema parsing (#23370)
Use pydantic to infer nested schemas and all that fun.
Include bagatur's convenient docstring parser
Include annotation support


Previously we didn't adequately support many typehints in the
bind_tools() method on raw functions (like optionals/unions, nested
types, etc.)
2024-07-03 09:55:38 -07:00
Oguz Vuruskaner
2a2c0d1a94 community[deepinfra]: fix tool call parsing. (#23162)
This PR includes fix for DeepInfra tool call parsing.
2024-07-03 12:11:37 -04:00
maang-h
525109e506 feat: Implement ChatBaichuan asynchronous interface (#23589)
- **Description:** Add interface to `ChatBaichuan` to support
asynchronous requests
    - `_agenerate` method
    - `_astream` method

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-03 12:10:04 -04:00
Bagatur
8842a0d986 docs: fireworks nit (#23822) 2024-07-03 15:36:27 +00:00
Leonid Ganeline
716a316654 core: docstrings indexing (#23785)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-03 11:27:34 -04:00
Leonid Ganeline
30fdc2dbe7 core: docstrings messages (#23788)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-07-03 11:25:00 -04:00
ccurme
54e730f6e4 fireworks[patch]: read from tool calls attribute (#23820) 2024-07-03 11:11:17 -04:00
Bagatur
e787249af1 docs: fireworks standard page (#23816) 2024-07-03 14:33:05 +00:00
Jacob Lee
27aa4d38bf docs[patch]: Update structured output docs to have more discussion (#23786)
CC @agola11 @ccurme
2024-07-02 16:53:31 -07:00
Bagatur
ebb404527f anthropic[patch]: Release 0.1.19 (#23783) 2024-07-02 18:17:25 -04:00
Bagatur
6168c846b2 openai[patch]: Release 0.1.14 (#23782) 2024-07-02 18:17:15 -04:00
Bagatur
cb9812593f openai[patch]: expose model request payload (#23287)
![Screenshot 2024-06-21 at 3 12 12
PM](https://github.com/langchain-ai/langchain/assets/22008038/6243a01f-1ef6-4085-9160-2844d9f2b683)
2024-07-02 17:43:55 -04:00
Bagatur
ed200bf2c4 anthropic[patch]: expose payload (#23291)
![Screenshot 2024-06-21 at 4 56 02
PM](https://github.com/langchain-ai/langchain/assets/22008038/a2c6224f-3741-4502-9607-1a726a0551c9)
2024-07-02 17:43:47 -04:00
Bagatur
7a3d8e5a99 core[patch]: Release 0.2.11 (#23780) 2024-07-02 17:35:57 -04:00
Bagatur
d677dadf5f core[patch]: mark RemoveMessage beta (#23656) 2024-07-02 21:27:21 +00:00
ccurme
1d54ac93bb ai21[patch]: release 0.1.7 (#23781) 2024-07-02 21:24:13 +00:00
Asaf Joseph Gardin
320dc31822 partners: AI21 Labs Jamba Streaming Support (#23538)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"

- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** Added support for streaming in AI21 Jamba Model
    - **Twitter handle:** https://github.com/AI21Labs


- [x] **Add tests and docs**: If you're adding a new integration, please
include

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-02 17:15:46 -04:00
Qingchuan Hao
5cd4083457 community: make bing web search as the only option (#23523)
This PR make bing web search as the option for BingSearchAPIWrapper to
facilitate and simply the user interface on Langchain.
This is a follow-up work of
https://github.com/langchain-ai/langchain/pull/23306.
2024-07-02 17:13:54 -04:00
William W Wang
76e7e4e9e6 Update docs: LangChain agent memory (#23673)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


**Description:** Update docs content on agent memory

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-02 17:06:32 -04:00
ccurme
7c1cddf1b7 anthropic[patch]: release 0.1.18 (#23778) 2024-07-02 16:46:47 -04:00
ccurme
c9dac59008 anthropic[patch]: fix model name in some integration tests (#23779) 2024-07-02 20:45:52 +00:00
Bagatur
7a6c06cadd anthropic[patch]: tool output parser fix (#23647) 2024-07-02 16:33:22 -04:00
ccurme
46cbf0e4aa anthropic[patch]: use core output parsers for structured output (#23776)
Also add to standard tests for structured output.
2024-07-02 16:15:26 -04:00
kiarina
dc396835ed langchain_anthropic: add stop_reason in ChatAnthropic stream result (#23689)
`ChatAnthropic` can get `stop_reason` from the resulting `AIMessage` in
`invoke` and `ainvoke`, but not in `stream` and `astream`.
This is a different behavior from `ChatOpenAI`.
It is possible to get `stop_reason` from `stream` as well, since it is
needed to determine the next action after the LLM call. This would be
easier to handle in situations where only `stop_reason` is needed.

- Issue: NA
- Dependencies: NA
- Twitter handle: https://x.com/kiarina37
2024-07-02 15:16:20 -04:00
Bagatur
27ce58f86e docs: google genai standard page (#23766)
Part of #22296
2024-07-02 13:54:34 -04:00
maang-h
e4e28a6ff5 community[patch]: Fix MiniMaxChat validate_environment error (#23770)
- **Description:** Fix some issues in MiniMaxChat 
  - Fix `minimax_api_host` not in `values` error
- Remove `minimax_group_id` from reading environment variables, the
`minimax_group_id` no longer use in MiniMaxChat
  - Invoke callback prior to yielding token, the issus #16913
2024-07-02 13:23:32 -04:00
SN
acc457f645 core[patch]: fix nested sections for mustache templating (#23747)
The prompt template variable detection only worked for singly-nested
sections because we just kept track of whether we were in a section and
then set that to false as soon as we encountered an end block. i.e. the
following:

```
{{#outerSection}}
    {{variableThatShouldntShowUp}}
    {{#nestedSection}}
        {{nestedVal}}
    {{/nestedSection}}
    {{anotherVariableThatShouldntShowUp}}
{{/outerSection}}
```

Would yield `['outerSection', 'anotherVariableThatShouldntShowUp']` as
input_variables (whereas it should just yield `['outerSection']`). This
fixes that by keeping track of the current depth and using a stack.
2024-07-02 10:20:45 -07:00
Karim Lalani
acc8fb3ead docs[patch]: Update OllamaFunctions docs to match chat model integration template (#23179)
Added Tool Calling Agent Example with langgraph to OllamaFunctions
documentation
2024-07-02 10:05:44 -07:00
Bagatur
79c07a8ade docs: standardize bedrock page (#23738)
Part of #22296
2024-07-02 12:03:36 -04:00
Teja Hara
a77a263e24 Added langchain-community installation (#23741)
PR title: Docs enhancement

- Description: Adding installation instructions for integrations
requiring langchain-community package since 0.2
- Issue: https://github.com/langchain-ai/langchain/issues/22005

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-02 11:03:07 -04:00
Eugene Yurtsev
46ff0f7a3c community[patch]: Update @root_validators to use explicit pre=True or pre=False (#23737) 2024-07-02 10:47:21 -04:00
Igor Drozdov
b664dbcc36 feat(community): add support for tool_calls response (#23765)
When `model_kwargs={"tools": tools}` are passed to `ChatLiteLLM`, they
are executed, but the response is not recognized correctly

Let's add `tool_calls` to the `additional_kwargs`

Thank you for contributing to LangChain!

## ChatAnthropic

I used the following example to verify the output of llm with tools:

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_anthropic import ChatAnthropic

class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

llm = ChatAnthropic(model="claude-3-sonnet-20240229")
llm_with_tools = llm.bind_tools([GetWeather, GetPopulation])
ai_msg = llm_with_tools.invoke("Which city is hotter today and which is bigger: LA or NY?")
print(ai_msg.tool_calls)
```

I get the following response:

```json
[{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01UfDA89knrhw3vFV9X47neT'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01NrYVRYae7m7z7tBgyPb3Gd'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_01EPFEpDgzL6vV2dTpD9SVP5'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01B5J6tPJXgwwfhQX9BHP2dt'}]
```

## LiteLLM

Based on https://litellm.vercel.app/docs/completion/function_call

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.utils.function_calling import convert_to_openai_tool
import litellm

class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

prompt = "Which city is hotter today and which is bigger: LA or NY?"
tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)]

response = litellm.completion(model="claude-3-sonnet-20240229", messages=[{'role': 'user', 'content': prompt}], tools=tools)
print(response.choices[0].message.tool_calls)
```

```python
[ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetWeather'), id='toolu_01HeDWV5vP7BDFfytH5FJsja', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetWeather'), id='toolu_01EiLesUSEr3YK1DaE2jxsQv', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "Los Angeles, CA"}', name='GetPopulation'), id='toolu_01Xz26zvkBDRxEUEWm9pX6xa', type='function'), ChatCompletionMessageToolCall(function=Function(arguments='{"location": "New York, NY"}', name='GetPopulation'), id='toolu_01SDqKnsLjvUXuBsgAZdEEpp', type='function')]
```

## ChatLiteLLM

When I try the following

```python
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.utils.function_calling import convert_to_openai_tool
from langchain_community.chat_models import ChatLiteLLM

class GetWeather(BaseModel):
    '''Get the current weather in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

class GetPopulation(BaseModel):
    '''Get the current population in a given location'''

    location: str = Field(..., description="The city and state, e.g. San Francisco, CA")

prompt = "Which city is hotter today and which is bigger: LA or NY?"
tools = [convert_to_openai_tool(GetWeather), convert_to_openai_tool(GetPopulation)]

llm = ChatLiteLLM(model="claude-3-sonnet-20240229", model_kwargs={"tools": tools})
ai_msg = llm.invoke(prompt)
print(ai_msg)
print(ai_msg.tool_calls)
```

```python
content="Okay, let's find out the current weather and populations for Los Angeles and New York City:" response_metadata={'token_usage': Usage(prompt_tokens=329, completion_tokens=193, total_tokens=522), 'model': 'claude-3-sonnet-20240229', 'finish_reason': 'tool_calls'} id='run-748b7a84-84f4-497e-bba1-320bd4823937-0'
[]
```

---

When I apply the changes of this PR, the output is

```json
[{'name': 'GetWeather', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_017D2tGjiaiakB1HadsEFZ4e'}, {'name': 'GetWeather', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01WrDpJfVqLkPejWzonPCbLW'}, {'name': 'GetPopulation', 'args': {'location': 'Los Angeles, CA'}, 'id': 'toolu_016UKyYrVAV9Pz99iZGgGU7V'}, {'name': 'GetPopulation', 'args': {'location': 'New York, NY'}, 'id': 'toolu_01Sgv1imExFX1oiR1Cw88zKy'}]
```

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Igor Drozdov <idrozdov@gitlab.com>
2024-07-02 10:42:08 -04:00
Eugene Yurtsev
338cef35b4 community[patch]: update @root_validator in utilities namespace (#23768)
Update all utilities to use `pre=True` or `pre=False`

https://github.com/langchain-ai/langchain/issues/22819
2024-07-02 14:33:01 +00:00
wenngong
ee5eedfa04 partners: support reading HuggingFace params from env (#23309)
Description: 
1. partners/HuggingFace module support reading params from env. Not
adjust langchain_community/.../huggingfaceXX modules since they are
deprecated.
  2. pydantic 2 @root_validator migration.

Issue: #22448 #22819

---------

Co-authored-by: gongwn1 <gongwn1@lenovo.com>
2024-07-02 10:12:45 -04:00
antonpibm
ffde8a6a09 Milvus vectorstore: fix pass ids as argument after upsert (#23761)
**Description**: Milvus vectorstore supports both `add_documents` via
the base class and `upsert` method which deletes and re-adds documents
based on their ids

**Issue**: Due to mismatch in the interfaces the ids used by `upsert`
are neglected in `add_documents`, as `ids` are passed as argument in
`upsert` but via `kwargs` is `add_documents`

This caused exceptions and inconsistency in the DB, tested with
`auto_id=False`

**Fix**: pass `ids` via `kwargs` to `add_documents`
2024-07-02 13:45:30 +00:00
Eugene Yurtsev
d084172b63 community[patch]: root validator set explicit pre=False or pre=True (#23764)
See issue: https://github.com/langchain-ai/langchain/issues/22819
2024-07-02 09:42:05 -04:00
Khelan Modi
4457e64e13 Update azure_cosmos_db for mongodb documentation (#23740)
added pre-filtering documentation

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: 
    - **Description:** added filter vector search 
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:**: n/a


- [x] **Add tests and docs**: If you're adding a new integration, please
include - No need for tests, just a simple doc update
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-02 12:53:05 +00:00
panwg3
bc98f90ba3 update wrong words (#23749)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-02 08:50:20 -04:00
mattthomps1
cc55823486 docs: updated PPLX model (#23723)
Description: updated pplx docs to reference a currently [supported
model](https://docs.perplexity.ai/docs/model-cards). pplx-70b-online
->llama-3-sonar-small-32k-online

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-07-02 08:48:49 -04:00
Bagatur
aa165539f6 docs: standardize cohere page (#23739)
Part of #22296
2024-07-01 19:34:13 -04:00
Jacob Lee
7791d92711 community[patch]: Fix requests alias for load_tools (#23734)
CC @baskaryan
2024-07-01 15:02:14 -07:00
Eugene Yurtsev
f24e38876a community[patch]: Update root_validators to use explicit pre=True or pre=False (#23736) 2024-07-01 17:13:23 -04:00
Yannick Stephan
5b1de2ae93 mistralai: Fixed streaming in MistralAI with ainvoke and callbacks (#22000)
# Fix streaming in mistral with ainvoke 
- [x] **PR title**
- [x] **PR message**
- [x] **Add tests and docs**:
  1. [x] Added a test for the fixed integration.
2. [x] An example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [x] **Lint and test**: Ran `make format`, `make lint` and `make test`
from the root of the package(s) I've modified.

Hello 

* I Identified an issue in the mistral package where the callback
streaming (see on_llm_new_token) was not functioning correctly when the
streaming parameter was set to True and call with `ainvoke`.
* The root cause of the problem was the streaming not taking into
account. ( I think it's an oversight )
* To resolve the issue, I added the `streaming` attribut.
* Now, the callback with streaming works as expected when the streaming
parameter is set to True.

## How to reproduce

```
from langchain_mistralai.chat_models import ChatMistralAI
chain = ChatMistralAI(streaming=True)
# Add a callback
chain.ainvoke(..)

# Oberve on_llm_new_token
# Now, the callback is given as streaming tokens, before it was in grouped format.
```

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-01 20:53:09 +00:00
Jacob Lee
f4b2e553e7 docs[patch]: Update Unstructured loader notebooks and install instructions (#23726)
CC @baskaryan @MthwRobinson
2024-07-01 13:36:48 -07:00
Eugene Yurtsev
5d2262af34 community[patch]: Update root_validators to use pre=True or pre=False (#23731)
Update root_validators in preparation for pydantic 2 migration.
2024-07-01 20:10:15 +00:00
Erick Friis
6019147b66 infra: filter template check (#23727) 2024-07-01 13:00:33 -07:00
Eugene Yurtsev
ebcee4f610 core[patch]: Add versionadded to get_by_ids (#23728) 2024-07-01 15:16:00 -04:00
Eugene Yurtsev
e800f6bb57 core[minor]: Create BaseMedia object (#23639)
This PR implements a BaseContent object from which Document and Blob
objects will inherit proposed here:
https://github.com/langchain-ai/langchain/pull/23544

Alternative: Create a base object that only has an identifier and no
metadata.

For now decided against it, since that refactor can be done at a later
time. It also feels a bit odd since our IDs are optional at the moment.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-01 15:07:30 -04:00
Chip Davis
04bc5f1a95 partners[azure]: fix having openai_api_base set for other packages (#22068)
This fix is for #21726. When having other packages installed that
require the `openai_api_base` environment variable, users are not able
to instantiate the AzureChatModels or AzureEmbeddings.

This PR adds a new value `ignore_openai_api_base` which is a bool. When
set to True, it sets `openai_api_base` to `None`

Two new tests were added for the `test_azure` and a new file
`test_azure_embeddings`

A different approach may be better for this. If you can think of better
logic, let me know and I can adjust it.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-07-01 18:35:20 +00:00
Nuno Campos
b36e95caa9 core[patch]: use async messages where possible (#23718)
Fix #23716

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-01 18:33:05 +00:00
Spyros Avlonitis
8cfb2fa1b7 core[minor]: Add maxsize for InMemoryCache (#23405)
This PR introduces a maxsize parameter for the InMemoryCache class,
allowing users to specify the maximum number of items to store in the
cache. If the cache exceeds the specified maximum size, the oldest items
are removed. Additionally, comprehensive unit tests have been added to
ensure all functionalities are thoroughly tested. The tests are written
using pytest and cover both synchronous and asynchronous methods.

Twitter: @spyrosavl

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-07-01 14:21:21 -04:00
maang-h
96af8f31ae community[patch]: Invoke callback prior to yielding token (#23638)
- **Description:** Invoke callback prior to yielding token in stream and
astream methods for ChatZhipuAI.
- **Issue:** the issue #16913
2024-07-01 18:12:24 +00:00
Eugene Yurtsev
b5aef4cf97 core[patch]: Fix llm string representation for serializable models (#23416)
Fix LLM string representation for serializable objects.

Fix for issue: https://github.com/langchain-ai/langchain/issues/23257

The llm string of serializable chat models is the serialized
representation of the object. LangChain serialization dumps some basic
information about non serializable objects including their repr() which
includes an object id.

This means that if a chat model has any non serializable fields (e.g., a
cache), then any new instantiation of the those fields will change the
llm representation of the chat model and cause chat misses.

i.e., re-instantiating a postgres cache would result in cache misses!
2024-07-01 14:06:33 -04:00
nobbbbby
3904f2cd40 core: fix NameError (#23658)
**Description:** In the chat_models module of the language model, the
import statement for BaseModel has been moved from the conditionally
imported section to the main import area, fixing `NameError `.
**Issue:** fix `NameError `
2024-07-01 17:51:23 +00:00
Jacob Lee
d2c7379f1c 👥 Update LangChain people data (#23697)
👥 Update LangChain people data

---------

Co-authored-by: github-actions <github-actions@github.com>
2024-07-01 17:42:55 +00:00
Jordy Jackson Antunes da Rocha
a50eabbd48 experimental: LLMGraphTransformer add missing conditional adding restrictions to prompts for LLM that do not support function calling (#22793)
- Description: Modified the prompt created by the function
`create_unstructured_prompt` (which is called for LLMs that do not
support function calling) by adding conditional checks that verify if
restrictions on entity types and rel_types should be added to the
prompt. If the user provides a sufficiently large text, the current
prompt **may** fail to produce results in some LLMs. I have first seen
this issue when I implemented a custom LLM class that did not support
Function Calling and used Gemini 1.5 Pro, but I was able to replicate
this issue using OpenAI models.

By loading a sufficiently large text
```python
from langchain_community.llms import Ollama
from langchain_openai import ChatOpenAI, OpenAI
from langchain_core.prompts import PromptTemplate
import re
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_core.documents import Document

with open("texto-longo.txt", "r") as file:
    full_text = file.read()
    partial_text = full_text[:4000]

documents = [Document(page_content=partial_text)] # cropped to fit GPT 3.5 context window
```

And using the chat class (that has function calling)
```python
chat_openai = ChatOpenAI(model="gpt-3.5-turbo", model_kwargs={"seed": 42})
chat_gpt35_transformer = LLMGraphTransformer(llm=chat_openai)
graph_from_chat_gpt35 = chat_gpt35_transformer.convert_to_graph_documents(documents)
```
It works:
```
>>> print(graph_from_chat_gpt35[0].nodes)
[Node(id="Jesu, Joy of Man's Desiring", type='Music'), Node(id='Godel', type='Person'), Node(id='Johann Sebastian Bach', type='Person'), Node(id='clever way of encoding the complicated expressions as numbers', type='Concept')]
```

But if you try to use the non-chat LLM class (that does not support
function calling)
```python
openai = OpenAI(
    model="gpt-3.5-turbo-instruct",
    max_tokens=1000,
)
gpt35_transformer = LLMGraphTransformer(llm=openai)
graph_from_gpt35 = gpt35_transformer.convert_to_graph_documents(documents)
```

It uses the prompt that has issues and sometimes does not produce any
result
```
>>> print(graph_from_gpt35[0].nodes)
[]
```

After implementing the changes, I was able to use both classes more
consistently:

```shell
>>> chat_gpt35_transformer = LLMGraphTransformer(llm=chat_openai)
>>> graph_from_chat_gpt35 = chat_gpt35_transformer.convert_to_graph_documents(documents)
>>> print(graph_from_chat_gpt35[0].nodes)
[Node(id="Jesu, Joy Of Man'S Desiring", type='Music'), Node(id='Johann Sebastian Bach', type='Person'), Node(id='Godel', type='Person')]
>>> gpt35_transformer = LLMGraphTransformer(llm=openai)
>>> graph_from_gpt35 = gpt35_transformer.convert_to_graph_documents(documents)
>>> print(graph_from_gpt35[0].nodes)
[Node(id='I', type='Pronoun'), Node(id="JESU, JOY OF MAN'S DESIRING", type='Song'), Node(id='larger memory', type='Memory'), Node(id='this nice tree structure', type='Structure'), Node(id='how you can do it all with the numbers', type='Process'), Node(id='JOHANN SEBASTIAN BACH', type='Composer'), Node(id='type of structure', type='Characteristic'), Node(id='that', type='Pronoun'), Node(id='we', type='Pronoun'), Node(id='worry', type='Verb')]
```

The results are a little inconsistent because the GPT 3.5 model may
produce incomplete json due to the token limit, but that could be solved
(or mitigated) by checking for a complete json when parsing it.
2024-07-01 17:33:51 +00:00
Eugene Yurtsev
4f1821db3e core[minor]: Add get_by_ids to vectorstore interface (#23594)
This PR adds a part of the indexing API proposed in this RFC
https://github.com/langchain-ai/langchain/pull/23544/files.

It allows rolling out `get_by_ids` which should be uncontroversial to
existing vectorstores without introducing new abstractions.

The semantics for this method depend on the ability of identifying
returned documents using the new optional ID field on documents:
https://github.com/langchain-ai/langchain/pull/23411

Alternatives are:

1. Relax the sequence requirement

```python
def get_by_ids(self, ids: Iterable[str], /) -> Iterable[Document]:
```

Rejected:
- implementations are more likley to start batching with bad defaults
- users would need to call list() or we'd need to introduce another
convenience method

2. Support more kwargs

```python

def get_by_ids(self, ids: Sequence[str], /, **kwargs) -> List[Document]:
...
```

Rejected: 
- No need for `batch` parameter since IDs is a sequence
- Output cannot be customized since `Document` is fixed. (e.g.,
parameters could be useful to grab extra metadata like the vector that
was indexed with the Document or to project a part of the document)
2024-07-01 13:04:33 -04:00
Valentin
bf402f902e community: Fix LanceDB similarity search bug (#23591)
**Description:** LanceDB didn't allow querying the database using
similarity score thresholds because the metrics value was missing. This
PR simply fixes that bug.
**Issue:** not applicable
**Dependencies:** none
**Twitter handle:** not available

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-07-01 16:33:45 +00:00
Bagatur
389a568f9a standard-tests[patch]: add anthropic format integration test (#23717) 2024-07-01 11:06:04 -04:00
Rafael Pereira
4b9517db85 Jira: Allow Jira access using only the token (#23708)
- **Description:** At the moment the Jira wrapper only accepts the the
usage of the Username and Password/Token at the same time. However Jira
allows the connection using only is useful for enterprise context.

Co-authored-by: rpereira <rafael.pereira@criticalsoftware.com>
2024-07-01 13:13:51 +00:00
Francesco Kruk
7538f3df58 Update jina embedding notebook to show multimodal capability more clearly (#23702)
After merging the [PR #22594 to include Jina AI multimodal capabilities
in the Langchain
documentation](https://github.com/langchain-ai/langchain/pull/22594), we
updated the notebook to showcase the difference between text and
multimodal capabilities more clearly.
2024-07-01 09:13:19 -04:00
Tim Van Wassenhove
24916c6703 community: Register pandas df in duckdb when creating vector_store (#23690)
- **Description:** Register pandas df in duckdb when creating
vector_store
- **Issue:** Resolves #23308
- **Dependencies:** None
- **Twitter handle:** @timvw

Co-authored-by: Tim Van Wassenhove <tim.van.wassenhove@telenetgroup.be>
2024-07-01 09:12:06 -04:00
Sourav Biswal
b60df8bb4f Update chatbot.ipynb (#23688)
DOC: missing parenthesis #23687

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-07-01 13:00:34 +00:00
Jacob Lee
9604cb833b ci[patch]: Update people PR CI permissions (#23696)
CC @agola11
2024-06-30 22:25:08 -07:00
Bagatur
29aa9d6750 groq[patch]: Release 0.1.6 (#23655) 2024-06-29 07:35:23 -04:00
Bagatur
f2d0c13a15 fireworks[patch]: Release 0.1.4 (#23654) 2024-06-29 07:35:16 -04:00
Bagatur
9a5e35d1ba mistralai[patch]: Release 0.1.9 (#23653) 2024-06-29 07:35:09 -04:00
Bagatur
74321e546d infra: update release permissions (#23662) 2024-06-29 07:31:36 -04:00
Mateusz Szewczyk
a78ccb993c ibm: Add support for Chat Models (#22979) 2024-06-29 01:59:25 -07:00
Jacob Lee
16c59118eb docs[patch]: Adds short tracing how-tos and conceptual guide (#23657)
CC @agola11
2024-06-28 18:28:49 -07:00
Jacob Lee
c0bb26e85b docs[patch]: Typo fix (#23652) 2024-06-28 17:27:44 -07:00
Jacob Lee
72175c57bd docs[patch]: Fix docs bugs in response to feedback (#23649)
- Update Meta Llama 3 cookbook link
- Add prereq section and information on `messages_modifier` to LangGraph
migration guide
- Update `PydanticToolsParser` explanation and entrypoint in tool
calling guide
- Add more obvious warning to `OllamaFunctions`
- Fix Wikidata tool install flow
- Update Bedrock LLM initialization

@baskaryan can you add a bit of information on how to authenticate into
the `ChatBedrock` and `BedrockLLM` models? I wasn't able to figure it
out :(
2024-06-28 17:24:55 -07:00
Bagatur
af2c05e5f3 openai[patch]: Release 0.1.13 (#23651) 2024-06-28 17:10:30 -07:00
Bagatur
b63c7f10bc anthropic[patch]: Release 0.1.17 (#23650) 2024-06-28 17:07:08 -07:00
Bagatur
fc8fd49328 openai, anthropic, ...: with_structured_output to pass in explicit tool choice (#23645)
...community, mistralai, groq, fireworks

part of #23644
2024-06-28 16:39:53 -07:00
Bagatur
c5f35a72da docs: vllm pkg nit (#23648) 2024-06-28 16:09:36 -07:00
Bagatur
81064017a9 docs: azure openai docstring (#23643)
part of #22296
2024-06-28 15:15:58 -07:00
Bagatur
381aedcc61 docs: standardize azure openai page (#23642)
part of #22296
2024-06-28 15:15:41 -07:00
Vadym Barda
e8d77002ea core: add RemoveMessage (#23636)
This change adds a new message type `RemoveMessage`. This will enable
`langgraph` users to manually modify graph state (or have the graph
nodes modify the state) to remove messages by `id`

Examples:

* allow users to delete messages from state by calling

```python
graph.update_state(config, values=[RemoveMessage(id=state.values[-1].id)])
```

* allow nodes to delete messages

```python
graph.add_node("delete_messages", lambda state: [RemoveMessage(id=state[-1].id)])
```
2024-06-28 14:40:02 -07:00
ccurme
8fce8c6771 community: fix extended tests (#23640) 2024-06-28 16:35:38 -04:00
ccurme
5d93916665 openai[patch]: release 0.1.12 (#23641) 2024-06-28 19:51:16 +00:00
Jacob Lee
a032583b17 docs[patch]: Update diagrams (#23613) 2024-06-28 12:36:00 -07:00
ccurme
390ee8d971 standard-tests: add test for structured output (#23631)
- add test for structured output
- fix bug with structured output for Azure
- better testing on Groq (break out Mixtral + Llama3 and add xfails
where needed)
2024-06-28 15:01:40 -04:00
Eugene Yurtsev
6c1ba9731d docs: Resurface some methods in API reference and clarify note at top of Reference (#23633)
This PR modifies the API Reference in the following way:

1. Relist standard methods: invoke, ainvoke, batch, abatch,
batch_as_completed, abatch_as_completed, stream, astream,
astream_events. These are the main entry points for a lot of runnables,
so we'll keep them for each runnable.
2. Relist methods from Runnable Serializable: to_json,
configurable_fields, configurable_alternatives.
3. Expand the note in the API reference documentation to explain that
additional methods are available.
2024-06-28 12:31:37 -04:00
Brace Sproul
800b0ff3b9 docs[minor]: Hide langserve pages (#23618) 2024-06-28 08:25:08 -07:00
j pradhan
5f21eab491 community:perplexity[patch]: standardize init args (#21794)
updated request_timeout default alias value per related docstring.

Related to
[20085](https://github.com/langchain-ai/langchain/issues/20085)

Thank you for contributing to LangChain!

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-28 13:26:12 +00:00
mackong
11483b0fb8 community[patch]: set tool name for tongyi&qianfan llm (#22889)
- **Description:** The name of ToolMessage is default to None, which
makes tool message send to LLM likes
 ```json
{"role": "tool",
   "tool_call_id": "",
   "content": "{\"time\": \"12:12\"}",
   "name": null}
```
But the name seems essential for some LLMs like TongYi Qwen. so we need to set the name use agent_action's tool value.
  - **Issue:** N/A
  - **Dependencies:** N/A
2024-06-28 09:17:05 -04:00
Leonid Ganeline
e4caa41aa9 community: docstrings toolkits (#23616)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-06-28 08:40:52 -04:00
clement.l
19eb82e68b docs: Fix link in LLMChain tutorial (#23620)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-28 03:59:24 +00:00
Bagatur
bd68a38723 docs: update chatmodel.with_structured_output feat in table (#23610) 2024-06-27 20:38:49 -07:00
ccurme
adf2dc13de community: fix lint (#23611) 2024-06-27 22:12:16 +00:00
Bagatur
ef0593db58 docs: tool call run model (#23609) 2024-06-27 22:02:12 +00:00
Leonid Ganeline
75a44fe951 core: chat_* docstrings (#23412)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-06-27 17:29:38 -04:00
Bagatur
3b1fcb2a65 chroma[patch]: Release 0.1.2 (#23604) 2024-06-27 13:58:24 -07:00
Eugene Yurtsev
68f348357e community[patch]: Test InMemoryVectorStore with RWAPI test suite (#23603)
Add standard test suite to InMemoryVectorStore implementation.
2024-06-27 16:43:43 -04:00
Eugene Yurtsev
da7beb1c38 core[patch]: Add unit test when catching generator exit (#23402)
This pr adds a unit test for:
https://github.com/langchain-ai/langchain/pull/22662
And narrows the scope where the exception is caught.
2024-06-27 20:36:07 +00:00
NG Sai Prasanth
5e6d23f27d community: Standardise tool import for arxiv & semantic scholar (#23578)
- **Description:** Fixing the way users have to import Arxiv and
Semantic Scholar
- **Issue:** Changed to use `from langchain_community.tools.arxiv import
ArxivQueryRun` instead of `from langchain_community.tools.arxiv.tool
import ArxivQueryRun`
    - **Dependencies:** None
    - **Twitter handle:** Nope
2024-06-27 16:35:50 -04:00
ccurme
d04f657424 langchain[patch]: deprecate ConversationChain (#23504)
Would like some feedback on how to best incorporate legacy memory
objects into `RunnableWithMessageHistory`.
2024-06-27 16:32:44 -04:00
Ayo Ayibiowu
c6f700b7cb fix(community): allow support for disabling max_tokens args (#21534)
This PR fixes an issue with not able to use unlimited/infinity tokens
from the respective provider for the LiteLLM provider.

This is an issue when working in an agent environment that the token
usage can drastically increase beyond the initial value set causing
unexpected behavior.
2024-06-27 16:28:59 -04:00
WU LIFU
2a0d6788f7 docs[patch]: extraction_examples fix the examples given to the llm (#23393)
Descriptions: currently in the
[doc](https://python.langchain.com/v0.2/docs/how_to/extraction_examples/)
it sets "Data" as the LLM's structured output schema, however its
examples given to the LLM output's "Person", which causes the LLM to be
confused and might occasionally return "Person" as the function to call

issue: #23383

Co-authored-by: Lifu Wu <lifu@nextbillion.ai>
2024-06-27 16:22:26 -04:00
Leonid Ganeline
c0fdbaac85 langchain: docstrings in agents root (#23561)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-06-27 15:52:18 -04:00
Leonid Ganeline
b64c4b4750 langchain: docstrings agents nested (#23598)
Added missed docstrings. Formatted docstrings to the consistent form.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-27 19:49:41 +00:00
mackong
70834cd741 community[patch]: support convert FunctionMessage for Tongyi (#23569)
**Description:** For function call agent with Tongyi, cause the
AgentAction will be converted to FunctionMessage by

47f69fe0d8/libs/core/langchain_core/agents.py (L188)
But now Tongyi's *convert_message_to_dict* doesn't support
FunctionMessage

47f69fe0d8/libs/community/langchain_community/chat_models/tongyi.py (L184-L207)
Then next round conversation will be failed by the *TypeError*
exception.

This patch adds the support to convert FunctionMessage for Tongyi.

**Issue:** N/A
**Dependencies:** N/A
2024-06-27 15:49:26 -04:00
Bagatur
d45ece0e58 chroma[patch]: loosen py req (#23599)
currently causes issues if you try adding to a project that supports
py<4
2024-06-27 12:40:59 -07:00
Mohammad Mohtashim
4796b7eb15 [Community [HuggingFace]]: Small Fix for ChatHuggingFace. (#22925)
- **Description:** A small fix where I moved the `available_endpoints`
in order to avoid the token error in the below issue. Also I have added
conftest file and updated the `scripy`,`numpy` versions to support newer
python versions in poetry files.
- **Issue:** #22804

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-27 19:37:20 +00:00
Jacob Lee
644723adda docs[patch]: Add search keyword, update contribution guide (#23602)
CC @vbarda @hinthornw
2024-06-27 12:36:02 -07:00
ccurme
bffc3c24a0 openai[patch]: release 0.1.11 (#23596) 2024-06-27 18:48:40 +00:00
ccurme
a1520357c8 openai[patch]: revert addition of "name" to supported properties for tool messages (#23600) 2024-06-27 18:40:04 +00:00
joshc-ai21
16a293cc3a Small bug fixes (#23353)
Small bug fixes according to your comments

---------

Signed-off-by: Joffref <mariusjoffre@gmail.com>
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Baskar Gopinath <73015364+baskargopinath@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Mathis Joffre <51022808+Joffref@users.noreply.github.com>
Co-authored-by: Baur <baur.krykpayev@gmail.com>
Co-authored-by: Nuradil <nuradil.maksut@icloud.com>
Co-authored-by: Nuradil <133880216+yaksh0nti@users.noreply.github.com>
Co-authored-by: Jacob Lee <jacoblee93@gmail.com>
Co-authored-by: Rave Harpaz <rave.harpaz@oracle.com>
Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com>
Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: RUO <61719257+comsa33@users.noreply.github.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Luis Rueda <userlerueda@gmail.com>
Co-authored-by: Jib <Jibzade@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: S M Zia Ur Rashid <smziaurrashid@gmail.com>
Co-authored-by: Ikko Eltociear Ashimine <eltociear@gmail.com>
Co-authored-by: yuncliu <lyc1990@qq.com>
Co-authored-by: wenngong <76683249+wenngong@users.noreply.github.com>
Co-authored-by: gongwn1 <gongwn1@lenovo.com>
Co-authored-by: Mirna Wong <89008547+mirnawong1@users.noreply.github.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: maang-h <55082429+maang-h@users.noreply.github.com>
Co-authored-by: asafg <asafg@ai21.com>
Co-authored-by: Asaf Joseph Gardin <39553475+Josephasafg@users.noreply.github.com>
2024-06-27 17:58:22 +00:00
panwg3
9308bf32e5 spelling errors in words (#23559)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-27 17:16:22 +00:00
clement.l
182fc06769 docs: Fix typo in LLMChain tutorial (#23593)
When using `model_with_tools.invoke`, the `content` returns as an empty
string.
For more details, please refer to my [trace
log](https://smith.langchain.com/public/6fd24bc4-86c4-4627-8565-9a8adaf4ad7d/r).
2024-06-27 17:01:05 +00:00
ccurme
5536420bee openai[patch]: add comment (#23595)
Forgot to push this to
https://github.com/langchain-ai/langchain/pull/23551
2024-06-27 16:47:14 +00:00
andrewmjc
9f0f3c7e29 partners[openai]: Add name field to tool message to match OpenAI spec (#23551)
Discovered alongside @t968914

  - **Description:**
According to OpenAI docs, tool messages (response from calling tools)
must have a 'name' field.

https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models

  - **Issue:** N/A (as of right now)
  - **Dependencies:** N/A
  - **Twitter handle:** N/A

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-27 12:42:36 -04:00
Krista Pratico
85e36b0f50 partners[openai]: only add stream_options to kwargs if requested (#23552)
- **Description:** This PR
https://github.com/langchain-ai/langchain/pull/22854 added the ability
to pass `stream_options` through to the openai service to get token
usage information in the response. Currently OpenAI supports this
parameter, but Azure OpenAI does not yet. For users who proxy their
calls to both services through ChatOpenAI, this breaks when targeting
Azure OpenAI (see related discussion opened in openai-python:
https://github.com/openai/openai-python/issues/1469#issuecomment-2192658630).

> Error code: 400 - {'error': {'code': None, 'message': 'Unrecognized
request argument supplied: stream_options', 'param': None, 'type':
'invalid_request_error'}}

This PR fixes the issue by only adding `stream_options` to the request
if it's actually requested by the user (i.e. set to True). If I'm not
mistaken, we have a test case that already covers this scenario:
https://github.com/langchain-ai/langchain/blob/master/libs/partners/openai/tests/integration_tests/chat_models/test_base.py#L398-L399

- **Issue:** Issue opened in openai-python:
https://github.com/openai/openai-python/issues/1469
  - **Dependencies:** N/A
  - **Twitter handle:** N/A

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-27 12:23:05 -04:00
Eugene Yurtsev
96b72edac8 core[minor]: Add optional ID field to Document schema (#23411)
This PR adds an optional ID field to the document schema.

# 1. Optional or Required

- An optional field will will requrie additional checking for the type
in user code (annoying).
- However, vectorstores currently don't respect this field. So if we
make it
required and start returning random UUIDs that might be even more
confusing
  to users.


**Proposal**: Start with Optional and convert to Required (with default
set to uuid4()) in 1-2 major releases.


# 2. Override __str__ or generic solution in prompts

Overriding __str__ as a simple way to avoid changing user code that
relies on
default str(document) in prompts. 


I considered rolling out a more general solution in prompts
(https://github.com/langchain-ai/langchain/pull/8685),
but to do that we need to:

1. Make things serializable
2. The more general solution would likely need to be backwards
compatible as well
3. It's unclear that one wants to format a List[int] in the same way as
List[Document]. The former should be `,` seperated (likely), the latter
   should be `---` separated (likely).


**Proposal** Start with __str__ override and focus on the vectorstore
APIs, we generalize prompts later
2024-06-27 12:15:58 -04:00
ccurme
5bfcb898ad openai[patch]: bump sdk version (#23592)
Tests failing with `TypeError: Completions.create() got an unexpected
keyword argument 'parallel_tool_calls'`
2024-06-27 11:57:24 -04:00
Jacob Lee
60fc15a56b docs[patch]: Update docs introduction and README (#23558)
CC @hwchase17 @baskaryan
2024-06-27 08:51:43 -07:00
panwg3
2445b997ee Correction of incorrect words (#23557)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-27 15:13:15 +00:00
Aditya
6721b991ab docs: realigned sections for langchain-google-vertexai (#23584)
- **Description:** Re-aligned sections in documentation of Vertex AI
LLMs
    - **Issue:** NA
    - **Dependencies:** NA
    - **Twitter handle:**NA

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-27 10:42:32 -04:00
mackong
daf733b52e langchain[minor]: fix comment typo (#23564)
**Description:** fix typo of comment
**Issue:** N/A
**Dependencies:** N/A
2024-06-27 10:09:18 -04:00
Jacob Lee
47f69fe0d8 docs[patch]: Add ReAct agent conceptual guide, improve search (#23554)
@baskaryan
2024-06-26 19:02:03 -07:00
Jacob Lee
672fcbb8dc docs[patch]: Fix bad link format (#23553) 2024-06-26 16:43:26 -07:00
Jacob Lee
13254715a2 docs[patch]: Update installation guide with diagram (#23548)
CC @baskaryan
2024-06-26 15:10:22 -07:00
Leonid Ganeline
2c9b84c3a8 core[patch]: docstrings agents (#23502)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-06-26 17:50:48 -04:00
Jacob Lee
79d8556c22 docs[patch]: Address feedback from docs users (#23550)
- Updates chat few shot prompt tutorial to show off a more cohesive
example
- Fix async Chromium loader guide
- Fix Excel loader install instructions
- Reformat Html2Text page
- Add install instructions to Azure OpenAI embeddings page
- Add missing dep install to SQL QA tutorial

@baskaryan
2024-06-26 14:47:01 -07:00
Leonid Ganeline
2a5d59b3d7 core[patch]: callbacks docstrings (#23375)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-06-26 17:11:06 -04:00
Leonid Ganeline
1141b08eb8 core: docstrings example_selectors (#23542)
Added missed docstrings. Formatted docstrings to the consistent form.
2024-06-26 17:10:40 -04:00
wenngong
3bf1d98dbf langchain[patch]: update agent and chains modules root_validators (#23256)
Description: update agent and chains modules Pydantic root_validators.
Issue: the issue #22819

---------

Co-authored-by: gongwn1 <gongwn1@lenovo.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-26 17:09:50 -04:00
Bagatur
a7ab93479b anthropic[patch]: Release 0.1.16 (#23549) 2024-06-26 20:49:13 +00:00
Jib
c0fcf76e93 LangChain-MongoDB: [Experimental] Driver-side index creation helper (#19359)
## Description
Created a helper method to make vector search indexes via client-side
pymongo.

**Recent Update** -- Removed error suppressing/overwriting layer in
favor of letting the original exception provide information.

## ToDo's
- [x] Make _wait_untils for integration test delete index
functionalities.
- [x] Add documentation for its use. Highlight it's experimental
- [x] Post Integration Test Results in a screenshot
- [x] Get review from MongoDB internal team (@shaneharvey, @blink1073 ,
@NoahStapp , @caseyclements)



- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. Added new integration tests. Not eligible for unit testing since the
operation is Atlas Cloud specific.
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

![image](https://github.com/langchain-ai/langchain/assets/2887713/a3fc8ee1-e04c-4976-accc-fea0eeae028a)


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-26 15:07:28 -04:00
Jacob Lee
b1dfb8ea1e docs[patch]: Update contribution guides (#23382)
CC @vbarda @hwchase17
2024-06-26 11:12:41 -07:00
maang-h
5070004e8a docs: Update Tongyi ChatModel docstring (#23540)
- **Description:** Update Tongyi ChatModel rich docstring
- **Issue:** the issue #22296
2024-06-26 13:07:13 -04:00
Nuradil
2f976c5174 community: fix code example in ZenGuard docs (#23541)
Thank you for contributing to LangChain!

- [X] **PR title**: "community: fix code example in ZenGuard docs"


- [X] **PR message**: 
- **Description:** corrected the docs by indicating in the code example
that the tool accepts a list of prompts instead of just one


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Thank you for review

---------

Co-authored-by: Baur <baur.krykpayev@gmail.com>
2024-06-26 13:05:59 -04:00
yonarw
6d0ebbca1e community: SAP HANA Vector Engine fix for latest HANA release (#23516)
- **Description:** This PR fixes an issue with SAP HANA Cloud QRC03
version. In that version the number to indicate no length being set for
a vector column changed from -1 to 0. The change in this PR support both
behaviours (old/new).
- **Dependencies:** No dependencies have been introduced.

- **Tests**:  The change is covered by previous unit tests.
2024-06-26 13:15:51 +00:00
Roman Solomatin
1e3e05b0c3 openai[patch]: add support for extra_body (#23404)
**Description:** Add support passing extra_body parameter

Some OpenAI compatible API's have additional parameters (for example
[vLLM](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#extra-parameters))
that can be passed thought `extra_body`. Same question in
https://github.com/openai/openai-python/issues/767

<!--
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
-->
2024-06-26 13:11:59 +00:00
Alireza Kashani
c39521b70d Update grobid.py (#23399)
fixed potential `IndexError: list index out of range` in case there is
no title

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-26 09:11:02 -04:00
Qingchuan Hao
ee282a1d2e community: add missing link (#23526) 2024-06-26 09:06:28 -04:00
Lincoln Stein
c314222796 Add a conversation memory that combines a (optionally persistent) vectorstore history with a token buffer (#22155)
**langchain: ConversationVectorStoreTokenBufferMemory**

-**Description:** This PR adds ConversationVectorStoreTokenBufferMemory.
It is similar in concept to ConversationSummaryBufferMemory. It
maintains an in-memory buffer of messages up to a preset token limit.
After the limit is hit timestamped messages are written into a
vectorstore retriever rather than into a summary. The user's prompt is
then used to retrieve relevant fragments of the previous conversation.
By persisting the vectorstore, one can maintain memory from session to
session.
-**Issue:** n/a
-**Dependencies:** none
-**Twitter handle:** Please no!!!
- [X] **Add tests and docs**: I looked to see how the unit tests were
written for the other ConversationMemory modules, but couldn't find
anything other than a test for successful import. I need to know whether
you are using pytest.mock or another fixture to simulate the LLM and
vectorstore. In addition, I would like guidance on where to place the
documentation. Should it be a notebook file in docs/docs?

- [X] **Lint and test**: I am seeing some linting errors from a couple
of modules unrelated to this PR.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-25 20:17:10 -07:00
Bagatur
32f8f39974 core[patch]: use args_schema doc for tool description (#23503) 2024-06-25 15:26:35 -07:00
ccurme
6f7fe82830 text-splitters: release 0.2.2 (#23508) 2024-06-25 18:26:05 -04:00
ccurme
62b16fcc6b experimental: release 0.0.62 (#23507) 2024-06-25 22:01:35 +00:00
ccurme
99ce84ef23 community: release 0.2.6 (#23501) 2024-06-25 21:29:52 +00:00
ccurme
03c41e725e langchain: release 0.2.6 (#23426) 2024-06-25 21:03:41 +00:00
ccurme
86ca44d451 core: release 0.2.10 (#23420) 2024-06-25 16:26:31 -04:00
Isaac Francisco
85f5d14cef [docs]: split up tool docs (#22919) 2024-06-25 13:15:08 -07:00
ccurme
f788d0982d docs: update trim messages guide (#23418)
- rerun to remove warnings following
https://github.com/langchain-ai/langchain/pull/23363
- `raise` -> `return`
2024-06-25 19:50:53 +00:00
ccurme
c9619349d6 docs: rerun chatbot tutorial to remove warnings (#23417) 2024-06-25 19:26:54 +00:00
Nuradil
c93d9e66e4 Community: Update and fix ZenGuardTool docs and add ZenguardTool to init files (#23415)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: update docs and add tool to init.py"

- [x] **PR message**: 
- **Description:** Fixed some errors and comments in the docs and added
our ZenGuardTool and additional classes to init.py for easy access when
importing
- **Question:** when will you update the langchain-community package in
pypi to make our tool available?


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Thank you for review!

---------

Co-authored-by: Baur <baur.krykpayev@gmail.com>
2024-06-25 19:26:32 +00:00
William FH
8955bc1866 [Core] Logging: Suppress missing parent warning (#23363) 2024-06-25 14:57:23 -04:00
ccurme
730c551819 core[patch]: export tool output parsers from langchain_core.output_parsers (#23305)
These currently read off AIMessage.tool_calls, and only fall back to
OpenAI parsing if tool calls aren't populated.

Importing these from `openai_tools` (e.g., in our [tool calling
docs](https://python.langchain.com/v0.2/docs/how_to/tool_calling/#tool-calls))
can lead to confusion.

After landing, would need to release core and update docs.
2024-06-25 14:40:42 -04:00
Eugene Yurtsev
7e9e69c758 core[patch]: Add unit test for str and repr for Document (#23414) 2024-06-25 18:28:21 +00:00
Bagatur
f055f2a1e3 infra: install integration deps as needed (#23413) 2024-06-25 11:17:43 -07:00
Bagatur
92ac0fc9bd openai[patch]: Release 0.1.10 (#23410) 2024-06-25 17:40:02 +00:00
Bagatur
fb3df898b5 docs: Update README.md (#23409) 2024-06-25 17:35:00 +00:00
Bagatur
9d145b9630 openai[patch]: fix tool calling token counting (#23408)
Resolves https://github.com/langchain-ai/langchain/issues/23388
2024-06-25 10:34:25 -07:00
Tomaz Bratanic
22fa32e164 LLM Graph transformer dealing with empty strings (#23368)
Pydantic allows empty strings:

```
from langchain.pydantic_v1 import Field, BaseModel

class Property(BaseModel):
  """A single property consisting of key and value"""
  key: str = Field(..., description="key")
  value: str = Field(..., description="value")

x = Property(key="", value="")
```

Which can produce errors downstream. We simply ignore those records
2024-06-25 13:01:53 -04:00
Rajendra Kadam
d3520a784f docs: Added providers page for Pebblo and docs for PebbloRetrievalQA (#20746)
- **Description:** Added providers page for Pebblo and docs for
PebbloRetrievalQA
- **Issue:** NA
- **Dependencies:** None
- **Unit tests**: NA
2024-06-25 12:46:11 -04:00
clement.l
a75b32a54a docs: Fix typo in LLMChain tutorial (#23380)
Description: Fix a typo
Issue: n/a
Dependencies: None
Twitter handle:
2024-06-25 13:03:24 +00:00
Riccardo Schirone
4530d851e4 Merge pull request #22662
* core: runnables: special handling GeneratorExit because no error
2024-06-25 08:42:03 -04:00
Qingchuan Hao
ad50702934 community: add default value to bing_search_url (#23306)
bing_search_url is an endpoint to requests bing search resource and is
normally invariant to users, we can give it the default value to simply
the uesages of this utility/tool
2024-06-25 08:08:41 -04:00
ccurme
68e0ae3286 langchain[patch]: update removal target for LLMChain (#23373)
to 1.0

Also improve replacement example in docstring.
2024-06-24 21:51:29 +00:00
wenngong
b33d2346db community: FlashrankRerank support loading customer client (#23350)
Description: FlashrankRerank Document compressor support loading
customer client
Issue: #23338

Co-authored-by: gongwn1 <gongwn1@lenovo.com>
2024-06-24 17:50:08 -04:00
maang-h
f58c40b4e3 docs: Update QianfanChatEndpoint ChatModel docstring (#23337)
- **Description:** Update QianfanChatEndpoint ChatModel rich docstring
- **Issue:** the issue #22296
2024-06-24 17:42:46 -04:00
Rahul Triptahi
9ef93ecd7c community[minor]: Added classification_location parameter in PebbloSafeLoader. (#22565)
Description: Add classifier_location feature flag. This flag enables
Pebblo to decide the classifier location, local or pebblo-cloud.
Unit Tests: N/A
Documentation: N/A

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-06-24 17:30:38 -04:00
Mirna Wong
2115fb76de Replace llm variable with model (#23280)
The code snippet under ‘pdfs_qa’ contains an small incorrect code
example , resulting in users getting errors. This pr replaces ‘llm’
variable with ‘model’ to help user avoid a NameError message.

Resolves #22689


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-24 17:08:02 -04:00
wenngong
af620db9c7 partners: add lint docstrings for azure-dynamic-sessions/together modules (#23303)
Description: add lint docstrings for azure-dynamic-sessions/together
modules
Issue: #23188 @baskaryan

test: ruff check passed.
<img width="782" alt="image"
src="https://github.com/langchain-ai/langchain/assets/76683249/bf11783d-65b3-4e56-a563-255eae89a3e4">

---------

Co-authored-by: gongwn1 <gongwn1@lenovo.com>
2024-06-24 16:26:54 -04:00
yuncliu
398b2b9c51 community[minor]: Add Ascend NPU optimized Embeddings (#20260)
- **Description:** Add NPU support for embeddings

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-24 20:15:11 +00:00
Ikko Eltociear Ashimine
7b1066341b docs: update sql_query_checking.ipynb (#23345)
creat -> create
2024-06-24 16:03:32 -04:00
S M Zia Ur Rashid
d5b2a93c6d package: security update urllib3 to @1.26.19 (#23366)
urllib3 version update 1.26.18 to 1.26.19 to address a security
vulnerability.

**Reference:**
https://security.snyk.io/vuln/SNYK-PYTHON-URLLIB3-7267250
2024-06-24 19:44:39 +00:00
Jacob Lee
57c13b4ef8 docs[patch]: Fix typo in how to guide for message history (#23364) 2024-06-24 15:43:05 -04:00
Luis Rueda
168e9ed3a5 partners: add custom options to MongoDBChatMessageHistory (#22944)
**Description:** Adds options for configuring MongoDBChatMessageHistory
(no breaking changes):
- session_id_key: name of the field that stores the session id
- history_key: name of the field that stores the chat history
- create_index: whether to create an index on the session id field
- index_kwargs: additional keyword arguments to pass to the index
creation

**Discussion:**
https://github.com/langchain-ai/langchain/discussions/22918
**Twitter handle:** @userlerueda

---------

Co-authored-by: Jib <Jibzade@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-24 19:42:56 +00:00
Eugene Yurtsev
1e750f12f6 standard-tests[minor]: Add standard read write test suite for vectorstores (#23355)
Add standard read write test suite for vectorstores
2024-06-24 19:40:56 +00:00
Eugene Yurtsev
3b3ed72d35 standard-tests[minor]: Add standard tests for BaseStore (#23360)
Add standard tests to base store abstraction. These only work on [str,
str] right now. We'll need to check if it's possible to add
encoder/decoders to generalize
2024-06-24 19:38:50 +00:00
ccurme
e1190c8f3c mongodb[patch]: fix CI for python 3.12 (#23369) 2024-06-24 19:31:20 +00:00
RUO
2b87e330b0 community: fix issue with nested field extraction in MongodbLoader (#22801)
**Description:** 
This PR addresses an issue in the `MongodbLoader` where nested fields
were not being correctly extracted. The loader now correctly handles
nested fields specified in the `field_names` parameter.

**Issue:** 
Fixes an issue where attempting to extract nested fields from MongoDB
documents resulted in `KeyError`.

**Dependencies:** 
No new dependencies are required for this change.

**Twitter handle:** 
(Optional, your Twitter handle if you'd like a mention when the PR is
announced)

### Changes
1. **Field Name Parsing**:
- Added logic to parse nested field names and safely extract their
values from the MongoDB documents.

2. **Projection Construction**:
- Updated the projection dictionary to include nested fields correctly.

3. **Field Extraction**:
- Updated the `aload` method to handle nested field extraction using a
recursive approach to traverse the nested dictionaries.

### Example Usage
Updated usage example to demonstrate how to specify nested fields in the
`field_names` parameter:

```python
loader = MongodbLoader(
    connection_string=MONGO_URI,
    db_name=MONGO_DB,
    collection_name=MONGO_COLLECTION,
    filter_criteria={"data.job.company.industry_name": "IT", "data.job.detail": { "$exists": True }},
    field_names=[
        "data.job.detail.id",
        "data.job.detail.position",
        "data.job.detail.intro",
        "data.job.detail.main_tasks",
        "data.job.detail.requirements",
        "data.job.detail.preferred_points",
        "data.job.detail.benefits",
    ],
)

docs = loader.load()
print(len(docs))
for doc in docs:
    print(doc.page_content)
```
### Testing
Tested with a MongoDB collection containing nested documents to ensure
that the nested fields are correctly extracted and concatenated into a
single page_content string.
### Note
This change ensures backward compatibility for non-nested fields and
improves functionality for nested field extraction.
### Output Sample
```python
print(docs[:3])
```
```shell
# output sample:
[
    Document(
        # Here in this example, page_content is the combined text from the fields below
        # "position", "intro", "main_tasks", "requirements", "preferred_points", "benefits"
        page_content='all combined contents from the requested fields in the document',
        metadata={'database': 'Your Database name', 'collection': 'Your Collection name'}
    ),
    ...
]
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-24 19:29:11 +00:00
Tomaz Bratanic
aeeda370aa Sanitize backticks from neo4j labels and types for import (#23367) 2024-06-24 19:05:31 +00:00
Jacob Lee
d2db561347 docs[patch]: Adds callout in LLM concept docs, remove deprecated code (#23361)
CC @baskaryan @hwchase17
2024-06-24 12:03:18 -07:00
Rave Harpaz
f5ff7f178b Add OCI Generative AI new model support (#22880)
- [x] PR title: 
community: Add OCI Generative AI new model support
 
- [x] PR message:
- Description: adding support for new models offered by OCI Generative
AI services. This is a moderate update of our initial integration PR
16548 and includes a new integration for our chat models under
/langchain_community/chat_models/oci_generative_ai.py
    - Issue: NA
- Dependencies: No new Dependencies, just latest version of our OCI sdk
    - Twitter handle: NA


- [x] Add tests and docs: 
  1. we have updated our unit tests
2. we have updated our documentation including a new ipynb for our new
chat integration


- [x] Lint and test: 
 `make format`, `make lint`, and `make test` run successfully

---------

Co-authored-by: RHARPAZ <RHARPAZ@RHARPAZ-5750.us.oracle.com>
Co-authored-by: Arthur Cheng <arthur.cheng@oracle.com>
2024-06-24 14:48:23 -04:00
Jacob Lee
753edf9c80 docs[patch]: Update chatbot tools how-to guide (#23362) 2024-06-24 11:46:06 -07:00
Baur
aa358f2be4 community: Add ZenGuard tool (#22959)
** Description**
This is the community integration of ZenGuard AI - the fastest
guardrails for GenAI applications. ZenGuard AI protects against:

- Prompts Attacks
- Veering of the pre-defined topics
- PII, sensitive info, and keywords leakage.
- Toxicity
- Etc.

**Twitter Handle** : @zenguardai

- [x] **Add tests and docs**: If you're adding a new integration, please
include
  1. Added an integration test
  2. Added colab


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.

---------

Co-authored-by: Nuradil <nuradil.maksut@icloud.com>
Co-authored-by: Nuradil <133880216+yaksh0nti@users.noreply.github.com>
2024-06-24 17:40:56 +00:00
Mathis Joffre
60103fc4a5 community: Fix OVHcloud 401 Unauthorized on embedding. (#23260)
They are now rejecting with code 401 calls from users with expired or
invalid tokens (while before they were being considered anonymous).
Thus, the authorization header has to be removed when there is no token.

Related to: #23178

---------

Signed-off-by: Joffref <mariusjoffre@gmail.com>
2024-06-24 12:58:32 -04:00
Baskar Gopinath
4964ba74db Update multimodal_prompts.ipynb (#23301)
fixes #23294

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-24 15:58:51 +00:00
Eugene Yurtsev
d90379210a standard-tests[minor]: Add standard tests for cache (#23357)
Add standard tests for cache abstraction
2024-06-24 15:15:03 +00:00
Leonid Ganeline
987099cfcd community: toolkits docstrings (#23286)
Added missed docstrings. Formatted docstrings to the consistent form.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-22 14:37:52 +00:00
Rahul Triptahi
0cd3f93361 Enhance metadata of sharepointLoader. (#22248)
Description: 2 feature flags added to SharePointLoader in this PR:

1. load_auth: if set to True, adds authorised identities to metadata
2. load_extended_metadata, adds source, owner and full_path to metadata

Unit tests:N/A
Documentation: To be done.

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-06-21 17:03:38 -07:00
Yuki Watanabe
5d4133d82f community: Overhaul Databricks provider documentation (#23203)
**Description**: Update [Databricks
Provider](https://python.langchain.com/v0.2/docs/integrations/providers/databricks/)
documentations to the latest component notebooks and draw better
navigation path to related notebooks.

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
2024-06-21 16:57:35 -07:00
Bagatur
bcac6c3aff openai[patch]: temp fix ignore lint (#23290) 2024-06-21 16:52:52 -07:00
William FH
efb4c12abe [Core] Add support for inferring Annotated types (#23284)
in bind_tools() / convert_to_openai_function
2024-06-21 15:16:30 -07:00
Vadym Barda
9ac302cb97 core[minor]: update draw_mermaid node label processing (#23285)
This fixes processing issue for nodes with numbers in their labels (e.g.
`"node_1"`, which would previously be relabeled as `"node__"`, and now
are correctly processed as `"node_1"`)
2024-06-21 21:35:32 +00:00
Rajendra Kadam
7ee2822ec2 community: Fix TypeError in PebbloRetrievalQA (#23170)
**Description:** 
Fix "`TypeError: 'NoneType' object is not iterable`" when the
auth_context is absent in PebbloRetrievalQA. The auth_context is
optional; hence, PebbloRetrievalQA should work without it, but it throws
an error at the moment. This PR fixes that issue.

**Issue:** NA
**Dependencies:** None
**Unit tests:** NA

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-21 17:04:00 -04:00
Iurii Umnov
3b7b933aa2 community[minor]: OpenAPI agent. Add support for PUT, DELETE and PATCH (#22962)
**Description**: Add PUT, DELETE and PATCH tools to tool list for
OpenAPI agent if dangerous requests are allowed.

**Issue**: https://github.com/langchain-ai/langchain/issues/20469
2024-06-21 20:44:23 +00:00
Guangdong Liu
3c42bf8d97 community(patch):Fix PineconeHynridSearchRetriever not having search_kwargs (#21577)
- close #21521
2024-06-21 16:27:52 -04:00
Rahul Triptahi
4bb3d5c488 [community][quick-fix]: changed from blob.path to blob.path.name in 0365BaseLoader. (#22287)
Description: file_metadata_ was not getting propagated to returned
documents. Changed the lookup key to the name of the blob's path.
Changed blob.path key to blob.path.name for metadata_dict key lookup.
Documentation: N/A
Unit tests: N/A

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-21 15:51:03 -04:00
Bagatur
f824f6d925 docs: fix merge message runs docstring (#23279) 2024-06-21 19:50:50 +00:00
wenngong
f9aea3db07 partners: add lint docstrings for chroma module (#23249)
Description: add lint docstrings for chroma module
Issue: the issue #23188 @baskaryan

test:  ruff check passed.


![image](https://github.com/langchain-ai/langchain/assets/76683249/5e168a0c-32d0-464f-8ddb-110233918019)

---------

Co-authored-by: gongwn1 <gongwn1@lenovo.com>
2024-06-21 19:49:24 +00:00
Bagatur
9eda8f2fe8 docs: fix trim_messages code blocks (#23271) 2024-06-21 17:15:31 +00:00
Jacob Lee
86326269a1 docs[patch]: Adds prereqs to trim messages (#23270)
CC @baskaryan
2024-06-21 10:09:41 -07:00
Bagatur
4c97a9ee53 docs: fix message transformer docstrings (#23264) 2024-06-21 16:10:03 +00:00
Vwake04
0deb98ac0c pinecone: Fix multiprocessing issue in PineconeVectorStore (#22571)
**Description:**

Currently, the `langchain_pinecone` library forces the `async_req`
(asynchronous required) argument to Pinecone to `True`. This design
choice causes problems when deploying to environments that do not
support multiprocessing, such as AWS Lambda. In such environments, this
restriction can prevent users from successfully using
`langchain_pinecone`.

This PR introduces a change that allows users to specify whether they
want to use asynchronous requests by passing the `async_req` parameter
through `**kwargs`. By doing so, users can set `async_req=False` to
utilize synchronous processing, making the library compatible with AWS
Lambda and other environments that do not support multithreading.

**Issue:**
This PR does not address a specific issue number but aims to resolve
compatibility issues with AWS Lambda by allowing synchronous processing.

**Dependencies:**
None, that I'm aware of.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-21 15:46:01 +00:00
ccurme
75c7c3a1a7 openai: release 0.1.9 (#23263) 2024-06-21 11:15:29 -04:00
Brace Sproul
abe7566d7d core[minor]: BaseChatModel with_structured_output implementation (#22859) 2024-06-21 08:14:03 -07:00
mackong
360a70c8a8 core[patch]: fix no current event loop for sql history in async mode (#22933)
- **Description:** When use
RunnableWithMessageHistory/SQLChatMessageHistory in async mode, we'll
get the following error:
```
Error in RootListenersTracer.on_chain_end callback: RuntimeError("There is no current event loop in thread 'asyncio_3'.")
```
which throwed by
ddfbca38df/libs/community/langchain_community/chat_message_histories/sql.py (L259).
and no message history will be add to database.

In this patch, a new _aexit_history function which will'be called in
async mode is added, and in turn aadd_messages will be called.

In this patch, we use `afunc` attribute of a Runnable to check if the
end listener should be run in async mode or not.

  - **Issue:** #22021, #22022 
  - **Dependencies:** N/A
2024-06-21 10:39:47 -04:00
Philippe PRADOS
1c2b9cc9ab core[minor]: Update pgvector transalor for langchain_postgres (#23217)
The SelfQuery PGVectorTranslator is not correct. The operator is "eq"
and not "$eq".
This patch use a new version of PGVectorTranslator from
langchain_postgres.

It's necessary to release a new version of langchain_postgres (see
[here](https://github.com/langchain-ai/langchain-postgres/pull/75)
before accepting this PR in langchain.
2024-06-21 10:37:09 -04:00
Mu Yang
401d469a92 langchain: fix systax warning in create_json_chat_agent (#23253)
fix systax warning in `create_json_chat_agent`

```
.../langchain/agents/json_chat/base.py:22: SyntaxWarning: invalid escape sequence '\ '
  """Create an agent that uses JSON to format its logic, build for Chat Models.
```
2024-06-21 10:05:38 -04:00
mackong
b108b4d010 core[patch]: set schema format for AsyncRootListenersTracer (#23214)
- **Description:** AsyncRootListenersTracer support on_chat_model_start,
it's schema_format should be "original+chat".
  - **Issue:** N/A
  - **Dependencies:**
2024-06-21 09:30:27 -04:00
Bagatur
976b456619 docs: BaseChatModel key methods table (#23238)
If we're moving documenting inherited params think these kinds of tables
become more important

![Screenshot 2024-06-20 at 3 59 12
PM](https://github.com/langchain-ai/langchain/assets/22008038/722266eb-2353-4e85-8fae-76b19bd333e0)
2024-06-20 21:00:22 -07:00
Jacob Lee
5da7eb97cb docs[patch]: Update link (#23240)
CC @agola11
2024-06-20 17:43:12 -07:00
ccurme
a7b4175091 standard tests: add test for tool calling (#23234)
Including streaming
2024-06-20 17:20:11 -04:00
Bagatur
12e0c28a6e docs: fix chat model methods table (#23233)
rst table not md
![Screenshot 2024-06-20 at 12 37 46
PM](https://github.com/langchain-ai/langchain/assets/22008038/7a03b869-c1f4-45d0-8d27-3e16f4c6eb19)
2024-06-20 19:51:10 +00:00
Zheng Robert Jia
a349fce880 docs[minor],community[patch]: Minor tutorial docs improvement, minor import error quick fix. (#22725)
minor changes to module import error handling and minor issues in
tutorial documents.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-20 15:36:49 -04:00
Eugene Yurtsev
7545b1d29b core[patch]: Fix doc-strings for code blocks (#23232)
Code blocks need extra space around them to be rendered properly by
sphinx
2024-06-20 19:34:52 +00:00
Luis Moros
d5be160af0 community[patch]: Fix sql_databse.from_databricks issue when ran from Job (#23224)
**Desscription**: When the ``sql_database.from_databricks`` is executed
from a Workflow Job, the ``context`` object does not have a
"browserHostName" property, resulting in an error. This change manages
the error so the "DATABRICKS_HOST" env variable value is used instead of
stoping the flow

Co-authored-by: lmorosdb <lmorosdb>
2024-06-20 19:34:15 +00:00
Cory Waddingham
cd6812342e pinecone[patch]: Update Poetry requirements for pinecone-client >=3.2.2 (#22094)
This change updates the requirements in
`libs/partners/pinecone/pyproject.toml` to allow all versions of
`pinecone-client` greater than or equal to 3.2.2.

This change resolves issue
[21955](https://github.com/langchain-ai/langchain/issues/21955).

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-20 18:59:36 +00:00
ccurme
abb3066150 docs: clarify streaming with RunnableLambda (#23228) 2024-06-20 14:49:00 -04:00
ccurme
bf7763d9b0 docs: add serialization guide (#23223) 2024-06-20 12:50:24 -04:00
Eugene Yurtsev
59d7adff8f core[patch]: Add clarification about streaming to RunnableLambda (#23227)
Add streaming clarification to runnable lambda docstring.
2024-06-20 16:47:16 +00:00
Jacob Lee
60db79a38a docs[patch]: Update Anthropic chat model docs (#23226)
CC @baskaryan
2024-06-20 09:46:43 -07:00
maang-h
bc4cd9c5cc community[patch]: Update root_validators ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat, ChatSparkLLM, ChatZhipuAI (#22853)
This PR updates root validators for:

- ChatModels: ChatBaichuan, QianfanChatEndpoint, MiniMaxChat,
ChatSparkLLM, ChatZhipuAI

Issues #22819

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 16:36:41 +00:00
ChrisDEV
cb6cf4b631 Fix return value type of dumpd (#20123)
The return type of `json.loads` is `Any`.

In fact, the return type of `dumpd` must be based on `json.loads`, so
the correction here is understandable.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 16:31:41 +00:00
Guangdong Liu
0bce28cd30 core(patch): Fix encoding problem of load_prompt method (#21559)
- description: Add encoding parameters.
- @baskaryan, @efriis, @eyurtsev, @hwchase17.


![54d25ac7b1d5c2e47741a56fe8ed8ba](https://github.com/langchain-ai/langchain/assets/48236177/ffea9596-2001-4e19-b245-f8a6e231b9f9)
2024-06-20 09:25:54 -07:00
Philippe PRADOS
8711c61298 core[minor]: Adds an in-memory implementation of RecordManager (#13200)
**Description:**
langchain offers three technologies to save data:
-
[vectorstore](https://python.langchain.com/docs/modules/data_connection/vectorstores/)
- [docstore](https://js.langchain.com/docs/api/schema/classes/Docstore)
- [record
manager](https://python.langchain.com/docs/modules/data_connection/indexing)

If you want to combine these technologies in a sample persistence
stategy you need a common implementation for each. `DocStore` propose
`InMemoryDocstore`.

We propose the class `MemoryRecordManager` to complete the system.

This is the prelude to another full-request, which needs a consistent
combination of persistence components.

**Tag maintainer:**
@baskaryan

**Twitter handle:**
@pprados

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-20 12:19:10 -04:00
Eugene Yurtsev
3ab49c0036 docs: API reference remove Prev/Up/Next buttons (#23225)
These do not work anyway. Let's remove them for now for simplicity.
2024-06-20 16:15:45 +00:00
Eugene Yurtsev
61daa16e5d docs: Update clean up API reference (#23221)
- Fix bug with TypedDicts rendering inherited methods if inherting from
  typing_extensions.TypedDict rather than typing.TypedDict
- Do not surface inherited pydantic methods for subclasses of BaseModel
- Subclasses of RunnableSerializable will not how methods inherited from
  Runnable or from BaseModel
- Subclasses of Runnable that not pydantic models will include a link to
RunnableInterface (they still show inherited methods, we can fix this
later)
2024-06-20 11:35:00 -04:00
Leonid Ganeline
51e75cf59d community: docstrings (#23202)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)
2024-06-20 11:08:13 -04:00
Julian Weng
6a1a0d977a partners[minor]: Fix value error message for with_structured_output (#22877)
Currently, calling `with_structured_output()` with an invalid method
argument raises `Unrecognized method argument. Expected one of
'function_calling' or 'json_format'`, but the JSON mode option [is now
referred
to](https://python.langchain.com/v0.2/docs/how_to/structured_output/#the-with_structured_output-method)
by `'json_mode'`. This fixes that.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-20 15:03:21 +00:00
Qingchuan Hao
dd4d4411c9 doc: replace function all with tool call (#23184)
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-20 09:27:39 -04:00
Yahkeef Davis
b03c801523 Docs: Update Rag tutorial so it includes an additional notebook cell with pip installs of required langchain_chroma and langchain_community. (#23204)
Description: Update Rag tutorial notebook so it includes an additional
notebook cell with pip installs of required langchain_chroma and
langchain_community.

This fixes the issue with the rag tutorial gives you a 'missing modules'
error if you run code in the notebook as is.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-20 09:22:49 -04:00
Leonid Ganeline
41f7620989 huggingface: docstrings (#23148)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-20 13:22:40 +00:00
ccurme
066a5a209f huggingface[patch]: fix CI for python 3.12 (#23197) 2024-06-20 09:17:26 -04:00
xyd
9b3a025f9c fix https://github.com/langchain-ai/langchain/issues/23215 (#23216)
fix bug 
The ZhipuAIEmbeddings class is not working.

Co-authored-by: xu yandong <shaonian@acsx1.onexmail.com>
2024-06-20 13:04:50 +00:00
Bagatur
ad7f2ec67d standard-tests[patch]: test stop not stop_sequences (#23200) 2024-06-19 18:07:33 -07:00
Bagatur
bd5c92a113 docs: standard params (#23199) 2024-06-19 17:57:05 -07:00
David DeCaprio
a4bcb45f65 core:Add optional max_messages to MessagePlaceholder (#16098)
- **Description:** Add optional max_messages to MessagePlaceholder
- **Issue:**
[16096](https://github.com/langchain-ai/langchain/issues/16096)
- **Dependencies:** None
- **Twitter handle:** @davedecaprio

Sometimes it's better to limit the history in the prompt itself rather
than the memory. This is needed if you want different prompts in the
chain to have different history lengths.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-19 23:39:51 +00:00
shaunakgodbole
7193634ae6 fireworks[patch]: fix api_key alias in Fireworks LLM (#23118)
Thank you for contributing to LangChain!

**Description**
The current code snippet for `Fireworks` had incorrect parameters. This
PR fixes those parameters.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-19 21:14:42 +00:00
Eugene Yurtsev
1fcf875fe3 core[patch]: Document agent schema (#23194)
* Document agent schema
* Refer folks to langgraph for more information on how to create agents.
2024-06-19 20:16:57 +00:00
Bagatur
255ad39ae3 infra: run CI on large diffs (#23192)
currently we skip CI on diffs >= 300 files. think we should just run it
on all packages instead

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-19 19:30:56 +00:00
Eugene Yurtsev
c2d43544cc core[patch]: Document messages namespace (#23154)
- Moved doc-strings below attribtues in TypedDicts -- seems to render
better on APIReference pages.
* Provided more description and some simple code examples
2024-06-19 15:00:00 -04:00
Eugene Yurtsev
3c917204dc core[patch]: Add doc-strings to outputs, fix @root_validator (#23190)
- Document outputs namespace
- Update a vanilla @root_validator that was missed
2024-06-19 14:59:06 -04:00
Bagatur
8698cb9b28 infra: add more formatter rules to openai (#23189)
Turns on
https://docs.astral.sh/ruff/settings/#format_docstring-code-format and
https://docs.astral.sh/ruff/settings/#format_skip-magic-trailing-comma

```toml
[tool.ruff.format]
docstring-code-format = true
skip-magic-trailing-comma = true
```
2024-06-19 11:39:58 -07:00
Michał Krassowski
710197e18c community[patch]: restore compatibility with SQLAlchemy 1.x (#22546)
- **Description:** Restores compatibility with SQLAlchemy 1.4.x that was
broken since #18992 and adds a test run for this version on CI (only for
Python 3.11)
- **Issue:** fixes #19681
- **Dependencies:** None
- **Twitter handle:** `@krassowski_m`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-19 17:58:57 +00:00
Erick Friis
48d6ea427f upstage: move to external repo (#22506) 2024-06-19 17:56:07 +00:00
Bagatur
0a4ee864e9 openai[patch]: image token counting (#23147)
Resolves #23000

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-19 10:41:47 -07:00
Jorge Piedrahita Ortiz
b3e53ffca0 community[patch]: sambanova llm integration improvement (#23137)
- **Description:** sambanova sambaverse integration improvement: removed
input parsing that was changing raw user input, and was making to use
process prompt parameter as true mandatory
2024-06-19 10:30:14 -07:00
Jorge Piedrahita Ortiz
e162893d7f community[patch]: update sambastudio embeddings (#23133)
Description: update sambastudio embeddings integration, now compatible
with generic endpoints and CoE endpoints
2024-06-19 10:26:56 -07:00
Philippe PRADOS
db6f46c1a6 langchain[small]: Change type to BasePromptTemplate (#23083)
```python
Change from_llm(
 prompt: PromptTemplate 
 ...
 )
```
 to
```python
Change from_llm(
 prompt: BasePromptTemplate 
 ...
 )
```
2024-06-19 13:19:36 -04:00
Sergey Kozlov
94452a94b1 core[patch[: add exceptions propagation test for astream_events v2 (#23159)
**Description:** `astream_events(version="v2")` didn't propagate
exceptions in `langchain-core<=0.2.6`, fixed in the #22916. This PR adds
a unit test to check that exceptions are propagated upwards.

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
2024-06-19 13:00:25 -04:00
Leonid Ganeline
50484be330 prompty: docstring (#23152)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-19 12:50:58 -04:00
Qingchuan Hao
9b82707ea6 docs: add bing search tool to ms platform (#23183)
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-19 12:43:05 -04:00
chenxi
505a2e8743 fix: MoonshotChat fails when setting the moonshot_api_key through the OS environment. (#23176)
Close #23174

Co-authored-by: tianming <tianming@bytenew.com>
2024-06-19 16:28:24 +00:00
Bagatur
677408bfc9 core[patch]: fix chat history circular import (#23182) 2024-06-19 09:08:36 -07:00
Eugene Yurtsev
883e90d06e core[patch]: Add an example to the Document schema doc-string (#23131)
Add an example to the document schema
2024-06-19 11:35:30 -04:00
ccurme
2b08e9e265 core[patch]: update test to catch circular imports (#23172)
This raises ImportError due to a circular import:
```python
from langchain_core import chat_history
```

This does not:
```python
from langchain_core import runnables
from langchain_core import chat_history
```

Here we update `test_imports` to run each import in a separate
subprocess. Open to other ways of doing this!
2024-06-19 15:24:38 +00:00
Eugene Yurtsev
ae4c0ed25a core[patch]: Add documentation to load namespace (#23143)
Document some of the modules within the load namespace
2024-06-19 15:21:41 +00:00
Eugene Yurtsev
a34e650f8b core[patch]: Add doc-string to document compressor (#23085) 2024-06-19 11:03:49 -04:00
Eugene Yurtsev
1007a715a5 community[patch]: Prevent unit tests from making network requests (#23180)
* Prevent unit tests from making network requests
2024-06-19 14:56:30 +00:00
ccurme
ca798bc6ea community: move test to integration tests (#23178)
Tests failing on master with

> FAILED
tests/unit_tests/embeddings/test_ovhcloud.py::test_ovhcloud_embed_documents
- ValueError: Request failed with status code: 401, {"message":"Bad
token; invalid JSON"}
2024-06-19 14:39:48 +00:00
Eugene Yurtsev
4fe8403bfb core[patch]: Expand documentation in the indexing namespace (#23134) 2024-06-19 10:11:44 -04:00
Eugene Yurtsev
fe4f10047b core[patch]: Document embeddings namespace (#23132)
Document embeddings namespace
2024-06-19 10:11:16 -04:00
Eugene Yurtsev
a3bae56a48 core[patch]: Update documentation in LLM namespace (#23138)
Update documentation in lllm namespace.
2024-06-19 10:10:50 -04:00
Leonid Ganeline
a70b7a688e ai21: docstrings (#23142)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)
2024-06-19 08:51:15 -04:00
Jacob Lee
0c2ebe5f47 docs[patch]: Standardize prerequisites in tutorial docs (#23150)
CC @baskaryan
2024-06-18 23:10:13 -07:00
bilk0h
3d54784e6d text-splitters: Fix/recursive json splitter data persistence issue (#21529)
Thank you for contributing to LangChain!

**Description:** Noticed an issue with when I was calling
`RecursiveJsonSplitter().split_json()` multiple times that I was getting
weird results. I found an issue where `chunks` list in the `_json_split`
method. If chunks is not provided when _json_split (which is the case
when split_json calls _json_split) then the same list is used for
subsequent calls to `_json_split`.


You can see this in the test case i also added to this commit.

Output should be: 
```
[{'a': 1, 'b': 2}]
[{'c': 3, 'd': 4}]
```

Instead you get:
```
[{'a': 1, 'b': 2}]
[{'a': 1, 'b': 2, 'c': 3, 'd': 4}]
```

---------

Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-18 20:21:55 -07:00
Yuki Watanabe
9ab7a6df39 docs: Overhaul Databricks components documentation (#22884)
**Description:** Documentation at
[integrations/llms/databricks](https://python.langchain.com/v0.2/docs/integrations/llms/databricks/)
is not up-to-date and includes examples about chat model and embeddings,
which should be located in the different corresponding subdirectories.
This PR split the page into correct scope and overhaul the contents.

**Note**: This PR might be hard to review on the diffs view, please use
the following preview links for the changed pages.
- `ChatDatabricks`:
https://langchain-git-fork-b-step62-chat-databricks-doc-langchain.vercel.app/v0.2/docs/integrations/chat/databricks/
- `Databricks`:
https://langchain-git-fork-b-step62-chat-databricks-doc-langchain.vercel.app/v0.2/docs/integrations/llms/databricks/
- `DatabricksEmbeddings`:
https://langchain-git-fork-b-step62-chat-databricks-doc-langchain.vercel.app/v0.2/docs/integrations/text_embedding/databricks/

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
2024-06-18 20:10:54 -07:00
鹿鹿鹿鲨
6b46b5e9ce community: add **request_kwargs and expect TimeError AsyncHtmlLoader (#23068)
- **Description:** add `**request_kwargs` and expect `TimeError` in
`_fetch` function for AsyncHtmlLoader. This allows you to fill in the
kwargs parameter when using the `load()` method of the `AsyncHtmlLoader`
class.

Co-authored-by: Yucolu <yucolu@tencent.com>
2024-06-18 20:02:46 -07:00
Leonid Ganeline
109a70fc64 ibm: docstrings (#23149)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)
2024-06-18 20:00:27 -07:00
Ryan Elston
86ee4f0daa text-splitters: Introduce Experimental Markdown Syntax Splitter (#22257)
#### Description
This MR defines a `ExperimentalMarkdownSyntaxTextSplitter` class. The
main goal is to replicate the functionality of the original
`MarkdownHeaderTextSplitter` which extracts the header stack as metadata
but with one critical difference: it keeps the whitespace of the
original text intact.

This draft reimplements the `MarkdownHeaderTextSplitter` with a very
different algorithmic approach. Instead of marking up each line of the
text individually and aggregating them back together into chunks, this
method builds each chunk sequentially and applies the metadata to each
chunk. This makes the implementation simpler. However, since it's
designed to keep white space intact its not a full drop in replacement
for the original. Since it is a radical implementation change to the
original code and I would like to get feedback to see if this is a
worthwhile replacement, should be it's own class, or is not a good idea
at all.

Note: I implemented the `return_each_line` parameter but I don't think
it's a necessary feature. I'd prefer to remove it.

This implementation also adds the following additional features:
- Splits out code blocks and includes the language in the `"Code"`
metadata key
- Splits text on the horizontal rule `---` as well
- The `headers_to_split_on` parameter is now optional - with sensible
defaults that can be overridden.

#### Issue
Keeping the whitespace keeps the paragraphs structure and the formatting
of the code blocks intact which allows the caller much more flexibility
in how they want to further split the individuals sections of the
resulting documents. This addresses the issues brought up by the
community in the following issues:
- https://github.com/langchain-ai/langchain/issues/20823
- https://github.com/langchain-ai/langchain/issues/19436
- https://github.com/langchain-ai/langchain/issues/22256

#### Dependencies
N/A

#### Twitter handle
@RyanElston

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-18 19:44:00 -07:00
Bagatur
93d0ad97fe anthropic[patch]: test image input (#23155) 2024-06-19 02:32:15 +00:00
Leonid Ganeline
3dfd055411 anthropic: docstrings (#23145)
Added missed docstrings. Format docstrings to the consistent format
(used in the API Reference)
2024-06-18 22:26:45 -04:00
Bagatur
90559fde70 openai[patch], standard-tests[patch]: don't pass in falsey stop vals (#23153)
adds an image input test to standard-tests as well
2024-06-18 18:13:13 -07:00
Bagatur
e8a8286012 core[patch]: runnablewithchathistory from core.runnables (#23136) 2024-06-19 00:15:18 +00:00
Jacob Lee
2ae718796e docs[patch]: Fix typo in feedback (#23146) 2024-06-18 16:32:04 -07:00
Jacob Lee
74749c909d docs[patch]: Adds feedback input after thumbs up/down (#23141)
CC @baskaryan
2024-06-18 16:08:22 -07:00
Bagatur
cf38981bb7 docs: use trim_messages in chatbot how to (#23139) 2024-06-18 15:48:03 -07:00
Vadym Barda
b483bf5095 core[minor]: handle boolean data in draw_mermaid (#23135)
This change should address graph rendering issues for edges with boolean
data

Example from langgraph:

```python
from typing import Annotated, TypedDict

from langchain_core.messages import AnyMessage
from langgraph.graph import END, START, StateGraph
from langgraph.graph.message import add_messages


class State(TypedDict):
    messages: Annotated[list[AnyMessage], add_messages]


def branch(state: State) -> bool:
    return 1 + 1 == 3


graph_builder = StateGraph(State)
graph_builder.add_node("foo", lambda state: {"messages": [("ai", "foo")]})
graph_builder.add_node("bar", lambda state: {"messages": [("ai", "bar")]})

graph_builder.add_conditional_edges(
    START,
    branch,
    path_map={True: "foo", False: "bar"},
    then=END,
)

app = graph_builder.compile()
print(app.get_graph().draw_mermaid())
```

Previous behavior:

```python
AttributeError: 'bool' object has no attribute 'split'
```

Current behavior:

```python
%%{init: {'flowchart': {'curve': 'linear'}}}%%
graph TD;
	__start__[__start__]:::startclass;
	__end__[__end__]:::endclass;
	foo([foo]):::otherclass;
	bar([bar]):::otherclass;
	__start__ -. ('a',) .-> foo;
	foo --> __end__;
	__start__ -. ('b',) .-> bar;
	bar --> __end__;
	classDef startclass fill:#ffdfba;
	classDef endclass fill:#baffc9;
	classDef otherclass fill:#fad7de;
```
2024-06-18 20:15:42 +00:00
Bagatur
093ae04d58 core[patch]: Pin pydantic in py3.12.4 (#23130) 2024-06-18 12:00:02 -07:00
hmasdev
ff0c06b1e5 langchain[patch]: fix OutputType of OutputParsers and fix legacy API in OutputParsers (#19792)
# Description

This pull request aims to address specific issues related to the
ambiguity and error-proneness of the output types of certain output
parsers, as well as the absence of unit tests for some parsers. These
issues could potentially lead to runtime errors or unexpected behaviors
due to type mismatches when used, causing confusion for developers and
users. Through clarifying output types, this PR seeks to improve the
stability and reliability.

Therefore, this pull request

- fixes the `OutputType` of OutputParsers to be the expected type;
- e.g. `OutputType` property of `EnumOutputParser` raises `TypeError`.
This PR introduce a logic to extract `OutputType` from its attribute.
- and fixes the legacy API in OutputParsers like `LLMChain.run` to the
modern API like `LLMChain.invoke`;
- Note: For `OutputFixingParser`, `RetryOutputParser` and
`RetryWithErrorOutputParser`, this PR introduces `legacy` attribute with
False as default value in order to keep the backward compatibility
- and adds the tests for the `OutputFixingParser` and
`RetryOutputParser`.

The following table shows my expected output and the actual output of
the `OutputType` of OutputParsers.
I have used this table to fix `OutputType` of OutputParsers.

| Class Name of OutputParser | My Expected `OutputType` (after this PR)|
Actual `OutputType` [evidence](#evidence) (before this PR)| Fix Required
|
|---------|--------------|---------|--------|
| BooleanOutputParser | `<class 'bool'>` | `<class 'bool'>` | NO |
| CombiningOutputParser | `typing.Dict[str, Any]` | `TypeError` is
raised | YES |
| DatetimeOutputParser | `<class 'datetime.datetime'>` | `<class
'datetime.datetime'>` | NO |
| EnumOutputParser(enum=MyEnum) | `MyEnum` | `TypeError` is raised | YES
|
| OutputFixingParser | The same type as `self.parser.OutputType` | `~T`
| YES |
| CommaSeparatedListOutputParser | `typing.List[str]` |
`typing.List[str]` | NO |
| MarkdownListOutputParser | `typing.List[str]` | `typing.List[str]` |
NO |
| NumberedListOutputParser | `typing.List[str]` | `typing.List[str]` |
NO |
| JsonOutputKeyToolsParser | `typing.Any` | `typing.Any` | NO |
| JsonOutputToolsParser | `typing.Any` | `typing.Any` | NO |
| PydanticToolsParser | `typing.Any` | `typing.Any` | NO |
| PandasDataFrameOutputParser | `typing.Dict[str, Any]` | `TypeError` is
raised | YES |
| PydanticOutputParser(pydantic_object=MyModel) | `<class
'__main__.MyModel'>` | `<class '__main__.MyModel'>` | NO |
| RegexParser | `typing.Dict[str, str]` | `TypeError` is raised | YES |
| RegexDictParser | `typing.Dict[str, str]` | `TypeError` is raised |
YES |
| RetryOutputParser | The same type as `self.parser.OutputType` | `~T` |
YES |
| RetryWithErrorOutputParser | The same type as `self.parser.OutputType`
| `~T` | YES |
| StructuredOutputParser | `typing.Dict[str, Any]` | `TypeError` is
raised | YES |
| YamlOutputParser(pydantic_object=MyModel) | `MyModel` | `~T` | YES |

NOTE: In "Fix Required", "YES" means that it is required to fix in this
PR while "NO" means that it is not required.

# Issue

No issues for this PR.

# Twitter handle

- [hmdev3](https://twitter.com/hmdev3)

# Questions:

1. Is it required to create tests for legacy APIs `LLMChain.run` in the
following scripts?
   - libs/langchain/tests/unit_tests/output_parsers/test_fix.py;
   - libs/langchain/tests/unit_tests/output_parsers/test_retry.py.

2. Is there a more appropriate expected output type than I expect in the
above table?
- e.g. the `OutputType` of `CombiningOutputParser` should be
SOMETHING...

# Actual outputs (before this PR)

<div id='evidence'></div>

<details><summary>Actual outputs</summary>

## Requirements

- Python==3.9.13
- langchain==0.1.13

```python
Python 3.9.13 (tags/v3.9.13:6de2ca5, May 17 2022, 16:36:42) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import langchain
>>> langchain.__version__
'0.1.13'
>>> from langchain import output_parsers
```

### `BooleanOutputParser`

```python
>>> output_parsers.BooleanOutputParser().OutputType
<class 'bool'>
```

### `CombiningOutputParser`

```python
>>> output_parsers.CombiningOutputParser(parsers=[output_parsers.DatetimeOutputParser(), output_parsers.CommaSeparatedListOutputParser()]).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable CombiningOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `DatetimeOutputParser`

```python
>>> output_parsers.DatetimeOutputParser().OutputType
<class 'datetime.datetime'>
```

### `EnumOutputParser`

```python
>>> from enum import Enum
>>> class MyEnum(Enum):
...     a = 'a'
...     b = 'b'
...
>>> output_parsers.EnumOutputParser(enum=MyEnum).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable EnumOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `OutputFixingParser`

```python
>>> output_parsers.OutputFixingParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```

### `CommaSeparatedListOutputParser`

```python
>>> output_parsers.CommaSeparatedListOutputParser().OutputType
typing.List[str]
```

### `MarkdownListOutputParser`

```python
>>> output_parsers.MarkdownListOutputParser().OutputType
typing.List[str]
```

### `NumberedListOutputParser`

```python
>>> output_parsers.NumberedListOutputParser().OutputType
typing.List[str]
```

### `JsonOutputKeyToolsParser`

```python
>>> output_parsers.JsonOutputKeyToolsParser(key_name='tool').OutputType
typing.Any
```

### `JsonOutputToolsParser`

```python
>>> output_parsers.JsonOutputToolsParser().OutputType
typing.Any
```

### `PydanticToolsParser`

```python
>>> from langchain.pydantic_v1 import BaseModel
>>> class MyModel(BaseModel):
...     a: int
...
>>> output_parsers.PydanticToolsParser(tools=[MyModel, MyModel]).OutputType
typing.Any
```

### `PandasDataFrameOutputParser`

```python
>>> output_parsers.PandasDataFrameOutputParser().OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable PandasDataFrameOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `PydanticOutputParser`

```python
>>> output_parsers.PydanticOutputParser(pydantic_object=MyModel).OutputType
<class '__main__.MyModel'>
```

### `RegexParser`

```python
>>> output_parsers.RegexParser(regex='$', output_keys=['a']).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable RegexParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `RegexDictParser`

```python
>>> output_parsers.RegexDictParser(output_key_to_format={'a':'a'}).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable RegexDictParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `RetryOutputParser`

```python
>>> output_parsers.RetryOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```

### `RetryWithErrorOutputParser`

```python
>>> output_parsers.RetryWithErrorOutputParser(parser=output_parsers.DatetimeOutputParser()).OutputType
~T
```

### `StructuredOutputParser`

```python
>>> from langchain.output_parsers.structured import ResponseSchema
>>> response_schemas = [ResponseSchema(name="foo",description="a list of strings",type="List[string]"),ResponseSchema(name="bar",description="a string",type="string"), ]
>>> output_parsers.StructuredOutputParser.from_response_schemas(response_schemas).OutputType
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "D:\workspace\venv\lib\site-packages\langchain_core\output_parsers\base.py", line 160, in OutputType
    raise TypeError(
TypeError: Runnable StructuredOutputParser doesn't have an inferable OutputType. Override the OutputType property to specify the output type.
```

### `YamlOutputParser`

```python
>>> output_parsers.YamlOutputParser(pydantic_object=MyModel).OutputType
~T
```


<div>

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-18 18:59:42 +00:00
Artem Mukhin
e271f75bee docs: Fix URL formatting in deprecation warnings (#23075)
**Description**

Updated the URLs in deprecation warning messages. The URLs were
previously written as raw strings and are now formatted to be clickable
HTML links.

Example of a broken link in the current API Reference:
https://api.python.langchain.com/en/latest/chains/langchain.chains.openai_functions.extraction.create_extraction_chain_pydantic.html

<img width="942" alt="Screenshot 2024-06-18 at 13 21 07"
src="https://github.com/langchain-ai/langchain/assets/4854600/a1b1863c-cd03-4af2-a9bc-70375407fb00">
2024-06-18 14:49:58 -04:00
Gabriel Petracca
c6660df58e community[minor]: Implement Doctran async execution (#22372)
**Description**

The DoctranTextTranslator has an async transform function that was not
implemented because [the doctran
library](https://github.com/psychic-api/doctran) uses a sync version of
the `execute` method.

- I implemented the `DoctranTextTranslator.atransform_documents()`
method using `asyncio.to_thread` to run the function in a separate
thread.
- I updated the example in the Notebook with the new async version.
- The performance improvements can be appreciated when a big document is
divided into multiple chunks.

Relates to:
- Issue #14645: https://github.com/langchain-ai/langchain/issues/14645
- Issue #14437: https://github.com/langchain-ai/langchain/issues/14437
- https://github.com/langchain-ai/langchain/pull/15264

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-18 18:17:37 +00:00
Eugene Yurtsev
aa6415aa7d core[minor]: Support multiple keys in get_from_dict_or_env (#23086)
Support passing multiple keys for ge_from_dict_or_env
2024-06-18 14:13:28 -04:00
nold
226802f0c4 community: add args_schema to SearxSearch (#22954)
This change adds args_schema (pydantic BaseModel) to SearxSearchRun for
correct schema formatting on LLM function calls

Issue: currently using SearxSearchRun with OpenAI function calling
returns the following error "TypeError: SearxSearchRun._run() got an
unexpected keyword argument '__arg1' ".

This happens because the schema sent to the LLM is "input:
'{"__arg1":"foobar"}'" while the method should be called with the
"query" parameter.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-18 17:27:39 +00:00
Bagatur
01783d67fc core[patch]: Release 0.2.9 (#23091) 2024-06-18 17:15:04 +00:00
Finlay Macklon
616d06d7fe community: glob multiple patterns when using DirectoryLoader (#22852)
- **Description:** Updated
*community.langchain_community.document_loaders.directory.py* to enable
the use of multiple glob patterns in the `DirectoryLoader` class. Now,
the glob parameter is of type `list[str] | str` and still defaults to
the same value as before. I updated the docstring of the class to
reflect this, and added a unit test to
*community.tests.unit_tests.document_loaders.test_directory.py* named
`test_directory_loader_glob_multiple`. This test also shows an example
of how to use the new functionality.
- ~~Issue:~~**Discussion Thread:**
https://github.com/langchain-ai/langchain/discussions/18559
- **Dependencies:** None
- **Twitter handle:** N/a

- [x] **Add tests and docs**
    - Added test (described above)
    - Updated class docstring

- [x] **Lint and test**

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-18 09:24:50 -07:00
Eugene Yurtsev
5564d9e404 core[patch]: Document BaseStore (#23082)
Add doc-string to BaseStore
2024-06-18 11:47:47 -04:00
Takuya Igei
9f791b6ad5 core[patch],community[patch],langchain[patch]: tenacity dependency to version >=8.1.0,<8.4.0 (#22973)
Fix https://github.com/langchain-ai/langchain/issues/22972.

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-18 10:34:28 -04:00
Raghav Dixit
74c4cbb859 LanceDB example minor change (#23069)
Removed package version `0.6.13` in the example.
2024-06-18 09:16:17 -04:00
Bagatur
ddfbca38df docs: add trim_messages to chatbot (#23061) 2024-06-17 22:39:39 -07:00
Lance Martin
931b41b30f Update Fireworks link (#23058) 2024-06-17 21:16:18 -07:00
Leonid Ganeline
6a66d8e2ca docs: AWS platform page update (#23063)
Added a reference to the `GlueCatalogLoader` new document loader.
2024-06-17 21:01:58 -07:00
Raviraj
858ce264ef SemanticChunker : Feature Addition ("Semantic Splitting with gradient") (#22895)
```SemanticChunker``` currently provide three methods to split the texts semantically:
- percentile
- standard_deviation
- interquartile

I propose new method ```gradient```. In this method, the gradient of distance is used to split chunks along with the percentile method (technically) . This method is useful when chunks are highly correlated with each other or specific to a domain e.g. legal or medical. The idea is to apply anomaly detection on gradient array so that the distribution become wider and easy to identify boundaries in highly semantic data.
I have tested this merge on a set of 10 domain specific documents (mostly legal).

Details : 
    - **Issue:** Improvement
    - **Dependencies:** NA
    - **Twitter handle:** [x.com/prajapat_ravi](https://x.com/prajapat_ravi)


@hwchase17

---------

Co-authored-by: Raviraj Prajapat <raviraj.prajapat@sirionlabs.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-17 21:01:08 -07:00
Raghav Dixit
55705c0f5e LanceDB integration update (#22869)
Added : 

- [x] relevance search (w/wo scores)
- [x] maximal marginal search
- [x] image ingestion
- [x] filtering support
- [x] hybrid search w reranking 

make test, lint_diff and format checked.
2024-06-17 20:54:26 -07:00
Chang Liu
62c8a67f56 community: add KafkaChatMessageHistory (#22216)
Add chat history store based on Kafka.

Files added: 
`libs/community/langchain_community/chat_message_histories/kafka.py`
`docs/docs/integrations/memory/kafka_chat_message_history.ipynb`

New issue to be created for future improvement:
1. Async method implementation.
2. Message retrieval based on timestamp.
3. Support for other configs when connecting to cloud hosted Kafka (e.g.
add `api_key` field)
4. Improve unit testing & integration testing.
2024-06-17 20:34:01 -07:00
shimajiroxyz
3e835a1aa1 langchain: add id_key option to EnsembleRetriever for metadata-based document merging (#22950)
**Description:**
- What I changed
- By specifying the `id_key` during the initialization of
`EnsembleRetriever`, it is now possible to determine which documents to
merge scores for based on the value corresponding to the `id_key`
element in the metadata, instead of `page_content`. Below is an example
of how to use the modified `EnsembleRetriever`:
    ```python
retriever = EnsembleRetriever(retrievers=[ret1, ret2], id_key="id") #
The Document returned by each retriever must keep the "id" key in its
metadata.
    ```

- Additionally, I added a script to easily test the behavior of the
`invoke` method of the modified `EnsembleRetriever`.

- Why I changed
- There are cases where you may want to calculate scores by treating
Documents with different `page_content` as the same when using
`EnsembleRetriever`. For example, when you want to ensemble the search
results of the same document described in two different languages.
- The previous `EnsembleRetriever` used `page_content` as the basis for
score aggregation, making the above usage difficult. Therefore, the
score is now calculated based on the specified key value in the
Document's metadata.

**Twitter handle:** @shimajiroxyz
2024-06-18 03:29:17 +00:00
mackong
39f6c4169d langchain[patch]: add tool messages formatter for tool calling agent (#22849)
- **Description:** add tool_messages_formatter for tool calling agent,
make tool messages can be formatted in different ways for your LLM.
  - **Issue:** N/A
  - **Dependencies:** N/A
2024-06-17 20:29:00 -07:00
Lucas Tucker
e25a5966b5 docs: Standardize DocumentLoader docstrings (#22932)
**Standardizing DocumentLoader docstrings (of which there are many)**

This PR addresses issue #22866 and adds docstrings according to the
issue's specified format (in the appendix) for files csv_loader.py and
json_loader.py in langchain_community.document_loaders. In particular,
the following sections have been added to both CSVLoader and JSONLoader:
Setup, Instantiate, Load, Async load, and Lazy load. It may be worth
adding a 'Metadata' section to the JSONLoader docstring to clarify how
we want to extract the JSON metadata (using the `metadata_func`
argument). The files I used to walkthrough the various sections were
`example_2.json` from
[HERE](https://support.oneskyapp.com/hc/en-us/articles/208047697-JSON-sample-files)
and `hw_200.csv` from
[HERE](https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html).

---------

Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-18 03:26:36 +00:00
Leonid Ganeline
a56ff199a7 docs: embeddings classes (#22927)
Added a table with all Embedding classes.
2024-06-17 20:17:24 -07:00
Mohammad Mohtashim
60ba02f5db [Community]: Fixed DDG DuckDuckGoSearchResults Docstring (#22968)
- **Description:** A very small fix in the Docstring of
`DuckDuckGoSearchResults` identified in the following issue.
- **Issue:** #22961

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-18 03:16:24 +00:00
Eun Hye Kim
70761af8cf community: Fix #22975 (Add SSL Verification Option to Requests Class in langchain_community) (#22977)
- **PR title**: "community: Fix #22975 (Add SSL Verification Option to
Requests Class in langchain_community)"
- **PR message**: 
    - **Description:**
- Added an optional verify parameter to the Requests class with a
default value of True.
- Modified the get, post, patch, put, and delete methods to include the
verify parameter.
- Updated the _arequest async context manager to include the verify
parameter.
- Added the verify parameter to the GenericRequestsWrapper class and
passed it to the Requests class.
    - **Issue:** This PR fixes issue #22975.
- **Dependencies:** No additional dependencies are required for this
change.
    - **Twitter handle:** @lunara_x

You can check this change with below code.
```python
from langchain_openai.chat_models import ChatOpenAI
from langchain.requests import RequestsWrapper
from langchain_community.agent_toolkits.openapi import planner
from langchain_community.agent_toolkits.openapi.spec import reduce_openapi_spec

with open("swagger.yaml") as f:
    data = yaml.load(f, Loader=yaml.FullLoader)
swagger_api_spec = reduce_openapi_spec(data)

llm = ChatOpenAI(model='gpt-4o')
swagger_requests_wrapper = RequestsWrapper(verify=False) # modified point
superset_agent = planner.create_openapi_agent(swagger_api_spec, swagger_requests_wrapper, llm, allow_dangerous_requests=True, handle_parsing_errors=True)

superset_agent.run(
    "Tell me the number and types of charts and dashboards available."
)
```

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-06-18 03:12:40 +00:00
Mohammad Mohtashim
bf839676c7 [Community]: FIxed the DocumentDBVectorSearch _similarity_search_without_score (#22970)
- **Description:** The PR #22777 introduced a bug in
`_similarity_search_without_score` which was raising the
`OperationFailure` error. The mistake was syntax error for MongoDB
pipeline which has been corrected now.
    - **Issue:** #22770
2024-06-17 20:08:42 -07:00
Nuno Campos
f01f12ce1e Include "no escape" and "inverted section" mustache vars in Prompt.input_variables and Prompt.input_schema (#22981) 2024-06-17 19:24:13 -07:00
Bella Be
7a0b36501f docs: Update how to docs for pydantic compatibility (#22983)
Add missing imports in docs from langchain_core.tools  BaseTool

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-18 01:49:56 +00:00
Jacob Lee
3b7b276f6f docs[patch]: Adds evaluation sections (#23050)
Also want to add an index/rollup page to LangSmith docs to enable
linking to a how-to category as a group (e.g.
https://docs.smith.langchain.com/how_to_guides/evaluation/)

CC @agola11 @hinthornw
2024-06-17 17:25:04 -07:00
Jacob Lee
6605ae22f6 docs[patch]: Update docs links (#23013) 2024-06-17 15:58:28 -07:00
Bagatur
c2b2e3266c core[minor]: message transformer utils (#22752) 2024-06-17 15:30:07 -07:00
Qingchuan Hao
c5e0acf6f0 docs: add bing search integration to agent (#22929)
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-17 18:08:52 -04:00
Anders Swanson
aacc6198b9 community: OCI GenAI embedding batch size (#22986)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: OCI GenAI embedding batch size"



- [x] **PR message**:
    - **Issue:** #22985 


- [ ] **Add tests and docs**: N/A


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Signed-off-by: Anders Swanson <anders.swanson@oracle.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-17 22:06:45 +00:00
Bagatur
8235bae48e core[patch]: Release 0.2.8 (#23012) 2024-06-17 20:55:39 +00:00
Bagatur
5ee6e22983 infra: test all dependents on any change (#22994) 2024-06-17 20:50:31 +00:00
Nuno Campos
bd4b68cd54 core: run_in_executor: Wrap StopIteration in RuntimeError (#22997)
- StopIteration can't be set on an asyncio.Future it raises a TypeError
and leaves the Future pending forever so we need to convert it to a
RuntimeError
2024-06-17 20:40:01 +00:00
Bagatur
d96f67b06f standard-tests[patch]: Update chat model standard tests (#22378)
- Refactor standard test classes to make them easier to configure
- Update openai to support stop_sequences init param
- Update groq to support stop_sequences init param
- Update fireworks to support max_retries init param
- Update ChatModel.bind_tools to type tool_choice
- Update groq to handle tool_choice="any". **this may be controversial**

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-17 13:37:41 -07:00
Bob Lin
14f0cdad58 docs: Add some 3rd party tutorials (#22931)
Langchain is very popular among developers in China, but there are still
no good Chinese books or documents, so I want to add my own Chinese
resources on langchain topics, hoping to give Chinese readers a better
experience using langchain. This is not a translation of the official
langchain documentation, but my understanding.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-17 20:12:49 +00:00
Jacob Lee
893299c3c9 docs[patch]: Reorder streaming guide, add tags (#22993)
CC @hinthornw
2024-06-17 13:10:51 -07:00
Oguz Vuruskaner
dd25d08c06 community[minor]: add tool calling for DeepInfraChat (#22745)
DeepInfra now supports tool calling for supported models.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-17 15:21:49 -04:00
Bagatur
158701ab3c docs: update universal init title (#22990) 2024-06-17 12:13:31 -07:00
Lance Martin
a54deba6bc Add RAG to conceptual guide (#22790)
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
2024-06-17 11:20:28 -07:00
maang-h
c6b7db6587 community: Add Baichuan Embeddings batch size (#22942)
- **Support batch size** 
Baichuan updates the document, indicating that up to 16 documents can be
imported at a time

- **Standardized model init arg names**
    - baichuan_api_key -> api_key
    - model_name  -> model
2024-06-17 14:11:04 -04:00
ccurme
722c8f50ea openai[patch]: add stream_usage parameter (#22854)
Here we add `stream_usage` to ChatOpenAI as:

1. a boolean attribute
2. a kwarg to _stream and _astream.

Question: should the `stream_usage` attribute be `bool`, or `bool |
None`?

Currently I've kept it `bool` and defaulted to False. It was implemented
on
[ChatAnthropic](e832bbb486/libs/partners/anthropic/langchain_anthropic/chat_models.py (L535))
as a bool. However, to maintain support for users who access the
behavior via OpenAI's `stream_options` param, this ends up being
possible:
```python
llm = ChatOpenAI(model_kwargs={"stream_options": {"include_usage": True}})
assert not llm.stream_usage
```
(and this model will stream token usage).

Some options for this:
- it's ok
- make the `stream_usage` attribute bool or None
- make an \_\_init\_\_ for ChatOpenAI, set a `._stream_usage` attribute
and read `.stream_usage` from a property

Open to other ideas as well.
2024-06-17 13:35:18 -04:00
Shubham Pandey
56ac94e014 community[minor]: add ChatSnowflakeCortex chat model (#21490)
**Description:** This PR adds a chat model integration for [Snowflake
Cortex](https://docs.snowflake.com/en/user-guide/snowflake-cortex/llm-functions),
which gives an instant access to industry-leading large language models
(LLMs) trained by researchers at companies like Mistral, Reka, Meta, and
Google, including [Snowflake
Arctic](https://www.snowflake.com/en/data-cloud/arctic/), an open
enterprise-grade model developed by Snowflake.

**Dependencies:** Snowflake's
[snowpark](https://pypi.org/project/snowflake-snowpark-python/) library
is required for using this integration.

**Twitter handle:** [@gethouseware](https://twitter.com/gethouseware)

- [x] **Add tests and docs**:
1. integration tests:
`libs/community/tests/integration_tests/chat_models/test_snowflake.py`
2. unit tests:
`libs/community/tests/unit_tests/chat_models/test_snowflake.py`
  3. example notebook: `docs/docs/integrations/chat/snowflake.ipynb`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-17 09:47:05 -07:00
Lance Martin
ea96133890 docs: Update llamacpp ntbk (#22907)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-17 15:42:56 +00:00
Bagatur
e2304ebcdb standard-tests[patch]: Release 0.1.1 (#22984) 2024-06-17 15:31:34 +00:00
Hakan Özdemir
c437b1aab7 [Partner]: Add metadata to stream response (#22716)
Adds `response_metadata` to stream responses from OpenAI. This is
returned with `invoke` normally, but wasn't implemented for `stream`.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-17 09:46:50 -04:00
Baskar Gopinath
42a379c75c docs: Standardise formatting (#22948)
Standardised formatting 


![image](https://github.com/langchain-ai/langchain/assets/73015364/ea3b5c5c-e7a6-4bb7-8c6b-e7d8cbbbf761)
2024-06-17 09:00:05 -04:00
Ikko Eltociear Ashimine
3e7bb7690c docs: update databricks.ipynb (#22949)
arbitary -> arbitrary
2024-06-17 08:57:49 -04:00
Baskar Gopinath
19356b6445 Update sql_qa.ipynb (#22966)
fixes #22798 
fixes #22963
2024-06-17 08:57:16 -04:00
Bagatur
9ff249a38d standard-tests[patch]: don't require str chunk contents (#22965) 2024-06-17 08:52:24 -04:00
Daniel Glogowski
892bd4c29b docs: nim model name update (#22943)
NIM Model name change in a notebook and mdx file.

Thanks!
2024-06-15 16:38:28 -04:00
Christopher Tee
ada03dd273 community(you): Better support for You.com News API (#22622)
## Description
While `YouRetriever` supports both You.com's Search and News APIs, news
is supported as an afterthought.
More specifically, not all of the News API parameters are exposed for
the user, only those that happen to overlap with the Search API.

This PR:
- improves support for both APIs, exposing the remaining News API
parameters while retaining backward compatibility
- refactor some REST parameter generation logic
- updates the docstring of `YouSearchAPIWrapper`
- add input validation and warnings to ensure parameters are properly
set by user
- 🚨 Breaking: Limit the news results to `k` items

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-15 20:05:19 +00:00
ccurme
e09c6bb58b infra: update integration test workflow (#22945) 2024-06-15 19:52:43 +00:00
Tomaz Bratanic
1c661fd849 Improve llm graph transformer docstring (#22939) 2024-06-15 15:33:26 -04:00
maang-h
7a0af56177 docs: update ZhipuAI ChatModel docstring (#22934)
- **Description:** Update ZhipuAI ChatModel rich docstring
- **Issue:** the issue #22296
2024-06-15 09:12:21 -04:00
Appletree24
6838804116 docs:Fix mispelling in streaming doc (#22936)
Description: Fix mispelling
Issue: None
Dependencies: None
Twitter handle: None

Co-authored-by: qcloud <ubuntu@localhost.localdomain>
2024-06-15 12:24:50 +00:00
Bitmonkey
570d45b2a1 Update ollama.py with optional raw setting. (#21486)
Ollama has a raw option now. 

https://github.com/ollama/ollama/blob/main/docs/api.md

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-14 17:19:26 -07:00
caiyueliang
9944ad7f5f community: 'Solve the issue where the _search function in ElasticsearchStore supports passing a query_vector parameter, but the parameter does not take effect. (#21532)
**Issue:**
When using the similarity_search_with_score function in
ElasticsearchStore, I expected to pass in the query_vector that I have
already obtained. I noticed that the _search function does support the
query_vector parameter, but it seems to be ineffective. I am attempting
to resolve this issue.

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-14 17:13:11 -07:00
Erick Friis
764f1958dd docs: add ollama json mode (#22926)
fixes #22910
2024-06-14 23:27:55 +00:00
Erick Friis
c374c98389 experimental: release 0.0.61 (#22924) 2024-06-14 15:55:07 -07:00
BuxianChen
af65cac609 cli[minor]: remove redefined DEFAULT_GIT_REF (#21471)
remove redefined DEFAULT_GIT_REF

Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-14 15:49:15 -07:00
Erick Friis
79a64207f5 community: release 0.2.5 (#22923) 2024-06-14 15:45:07 -07:00
Jiejun Tan
c8c67dde6f text-splitters[patch]: Fix HTMLSectionSplitter (#22812)
Update former pull request:
https://github.com/langchain-ai/langchain/pull/22654.

Modified `langchain_text_splitters.HTMLSectionSplitter`, where in the
latest version `dict` data structure is used to store sections from a
html document, in function `split_html_by_headers`. The header/section
element names serve as dict keys. This can be a problem when duplicate
header/section element names are present in a single html document.
Latter ones can replace former ones with the same name. Therefore some
contents can be miss after html text splitting is conducted.

Using a list to store sections can hopefully solve the problem. A Unit
test considering duplicate header names has been added.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-14 22:40:39 +00:00
Erick Friis
fbeeb6da75 langchain: release 0.2.5 (#22922) 2024-06-14 15:37:54 -07:00
Erick Friis
551640a030 templates: remove lockfiles (#22920)
poetry will default to latest versions without
2024-06-14 21:42:30 +00:00
Baskar Gopinath
c4f2bc9540 docs: Fix wrongly referenced class name in confluence.py (#22879)
Fixes #22542

Changed ConfluenceReader to ConfluenceLoader
2024-06-14 14:00:48 -07:00
ccurme
32966a08a9 infra: remove nvidia from monorepo scheduled tests (#22915)
Scheduled tests run in
https://github.com/langchain-ai/langchain-nvidia/tree/main
2024-06-14 13:23:04 -07:00
Erick Friis
9ef15691d6 core: release 0.2.7 (#22917) 2024-06-14 20:03:58 +00:00
Nuno Campos
338180f383 core: in astream_events v2 always await task even if already finished (#22916)
- this ensures exceptions propagate to the caller
2024-06-14 19:54:20 +00:00
Istvan/Nebulinq
513e491ce9 experimental: LLMGraphTransformer - added relationship properties. (#21856)
- **Description:** 
The generated relationships in the graph had no properties, but the
Relationship class was properly defined with properties. This made it
very difficult to transform conditional sentences into a graph. Adding
properties to relationships can solve this issue elegantly.
The changes expand on the existing LLMGraphTransformer implementation
but add the possibility to define allowed relationship properties like
this: LLMGraphTransformer(llm=llm, relationship_properties=["Condition",
"Time"],)
- **Issue:** 
    no issue found
 - **Dependencies:**
    n/a
- **Twitter handle:** 
    @IstvanSpace


-Quick Test
=================================================================
from dotenv import load_dotenv
import os
from langchain_community.graphs import Neo4jGraph
from langchain_experimental.graph_transformers import
LLMGraphTransformer
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.documents import Document

load_dotenv()
os.environ["NEO4J_URI"] = os.getenv("NEO4J_URI")
os.environ["NEO4J_USERNAME"] = os.getenv("NEO4J_USERNAME")
os.environ["NEO4J_PASSWORD"] = os.getenv("NEO4J_PASSWORD")
graph = Neo4jGraph()
llm = ChatOpenAI(temperature=0, model_name="gpt-4o")
llm_transformer = LLMGraphTransformer(llm=llm)
#text = "Harry potter likes pies, but only if it rains outside"
text = "Jack has a dog named Max. Jack only walks Max if it is sunny
outside."
documents = [Document(page_content=text)]
llm_transformer_props = LLMGraphTransformer(
    llm=llm,
    relationship_properties=["Condition"],
)
graph_documents_props =
llm_transformer_props.convert_to_graph_documents(documents)
print(f"Nodes:{graph_documents_props[0].nodes}")
print(f"Relationships:{graph_documents_props[0].relationships}")
graph.add_graph_documents(graph_documents_props)

---------

Co-authored-by: Istvan Lorincz <istvan.lorincz@pm.me>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-14 14:41:04 -04:00
ccurme
694ae87748 docs: add groq to chatmodeltabs (#22913) 2024-06-14 14:31:48 -04:00
Eugene Yurtsev
c816d03699 dcos: Add admonition to PythonREPL tool (#22909)
Add admonition to the documentation to make sure users are aware that
the tool allows execution of code on the host machine using a python
interpreter (by design).
2024-06-14 14:06:40 -04:00
kiarina
8171efd07a core[patch]: Fix FunctionCallbackHandler._on_tool_end (#22908)
If the global `debug` flag is enabled, the agent will get the following
error in `FunctionCallbackHandler._on_tool_end` at runtime.

```
Error in ConsoleCallbackHandler.on_tool_end callback: AttributeError("'list' object has no attribute 'strip'")
```

By calling str() before strip(), the error was avoided.
This error can be seen at
[debugging.ipynb](https://github.com/langchain-ai/langchain/blob/master/docs/docs/how_to/debugging.ipynb).

- Issue: NA
- Dependencies: NA
- Twitter handle: https://x.com/kiarina37
2024-06-14 17:59:29 +00:00
Philippe PRADOS
b61de9728e community[minor]: Fix long_context_reorder.py async (#22839)
Implement `async def atransform_documents( self, documents:
Sequence[Document], **kwargs: Any ) -> Sequence[Document]` for
`LongContextReorder`
2024-06-14 13:55:18 -04:00
Eugene Yurtsev
c72bcda4f2 community[major], experimental[patch]: Remove Python REPL from community (#22904)
Remove the REPL from community, and suggest an alternative import from
langchain_experimental.

Fix for this issue:
https://github.com/langchain-ai/langchain/issues/14345

This is not a bug in the code or an actual security risk. The python
REPL itself is behaving as expected.

The PR is done to appease blanket security policies that are just
looking for the presence of exec in the code.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-14 17:53:29 +00:00
Eugene Yurtsev
9a877c7adb community[patch]: SitemapLoader restrict depth of parsing sitemap (CVE-2024-2965) (#22903)
This PR restricts the depth to which the sitemap can be parsed.

Fix for: CVE-2024-2965
2024-06-14 13:04:40 -04:00
Eugene Yurtsev
4a77a3ab19 core[patch]: fix validation of @deprecated decorator (#22513)
This PR moves the validation of the decorator to a better place to avoid
creating bugs while deprecating code.

Prevent issues like this from arising:
https://github.com/langchain-ai/langchain/issues/22510

we should replace with a linter at some point that just does static
analysis
2024-06-14 16:52:30 +00:00
Jacob Lee
181a61982f anthropic[minor]: Adds streaming tool call support for Anthropic (#22687)
Preserves string content chunks for non tool call requests for
convenience.

One thing - Anthropic events look like this:

```
RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
RawContentBlockDeltaEvent(delta=TextDelta(text='<thinking>\nThe', type='text_delta'), index=0, type='content_block_delta')
RawContentBlockDeltaEvent(delta=TextDelta(text=' provide', type='text_delta'), index=0, type='content_block_delta')
...
RawContentBlockStartEvent(content_block=ToolUseBlock(id='toolu_01GJ6x2ddcMG3psDNNe4eDqb', input={}, name='get_weather', type='tool_use'), index=1, type='content_block_start')
RawContentBlockDeltaEvent(delta=InputJsonDelta(partial_json='', type='input_json_delta'), index=1, type='content_block_delta')
```

Note that `delta` has a `type` field. With this implementation, I'm
dropping it because `merge_list` behavior will concatenate strings.

We currently have `index` as a special field when merging lists, would
it be worth adding `type` too?

If so, what do we set as a context block chunk? `text` vs.
`text_delta`/`tool_use` vs `input_json_delta`?

CC @ccurme @efriis @baskaryan
2024-06-14 09:14:43 -07:00
ccurme
f40b2c6f9d fireworks[patch]: add usage_metadata to (a)invoke and (a)stream (#22906) 2024-06-14 12:07:19 -04:00
Mohammad Mohtashim
d1b7a934aa [Community]: HuggingFaceCrossEncoder score accounting for <not-relevant score,relevant score> pairs. (#22578)
- **Description:** Some of the Cross-Encoder models provide scores in
pairs, i.e., <not-relevant score (higher means the document is less
relevant to the query), relevant score (higher means the document is
more relevant to the query)>. However, the `HuggingFaceCrossEncoder`
`score` method does not currently take into account the pair situation.
This PR addresses this issue by modifying the method to consider only
the relevant score if score is being provided in pair. The reason for
focusing on the relevant score is that the compressors select the top-n
documents based on relevance.
    - **Issue:** #22556 
- Please also refer to this
[comment](https://github.com/UKPLab/sentence-transformers/issues/568#issuecomment-729153075)
2024-06-14 08:28:24 -07:00
Baskar Gopinath
83643cbdfe docs: Fix typo in tutorial about structured data extraction (#22888)
[Fixed typo](docs: Fix typo in tutorial about structured data
extraction)
2024-06-14 15:19:55 +00:00
Thanh Nguyen
b5e2ba3a47 community[minor]: add chat model llamacpp (#22589)
- **PR title**: [community] add chat model llamacpp


- **PR message**:
- **Description:** This PR introduces a new chat model integration with
llamacpp_python, designed to work similarly to the existing ChatOpenAI
model.
      + Work well with instructed chat, chain and function/tool calling.
+ Work with LangGraph (persistent memory, tool calling), will update
soon

- **Dependencies:** This change requires the llamacpp_python library to
be installed.
    
@baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-14 14:51:43 +00:00
Bagatur
e4279f80cd docs: doc loader feat table alignment (#22900) 2024-06-14 14:25:01 +00:00
Isaac Francisco
984c7a9d42 docs: generate table for document loaders (#22871)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-14 07:03:27 -07:00
Jacob Lee
8e89178047 docs[patch]: Expand embeddings docs (#22881) 2024-06-13 23:06:07 -07:00
ccurme
73c76b9628 anthropic[patch]: always add tool_result type to ToolMessage content (#22721)
Anthropic tool results can contain image data, which are typically
represented with content blocks having `"type": "image"`. Currently,
these content blocks are passed as-is as human/user messages to
Anthropic, which raises BadRequestError as it expects a tool_result
block to follow a tool_use.

Here we update ChatAnthropic to nest the content blocks inside a
tool_result content block.

Example:
```python
import base64

import httpx
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage
from langchain_core.pydantic_v1 import BaseModel, Field


# Fetch image
image_url = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
image_data = base64.b64encode(httpx.get(image_url).content).decode("utf-8")


class FetchImage(BaseModel):
    should_fetch: bool = Field(..., description="Whether an image is requested.")


llm = ChatAnthropic(model="claude-3-sonnet-20240229").bind_tools([FetchImage])

messages = [
    HumanMessage(content="Could you summon a beautiful image please?"),
    AIMessage(
        content=[
            {
                "type": "tool_use",
                "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
                "name": "FetchImage",
                "input": {"should_fetch": True},
            },
        ],
        tool_calls=[
            {
                "name": "FetchImage",
                "args": {"should_fetch": True},
                "id": "toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
            },
        ],
    ),
    ToolMessage(
        name="FetchImage",
        content=[
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/jpeg",
                    "data": image_data,
                },
            },
        ],
        tool_call_id="toolu_01Rn6Qvj5m7955x9m9Pfxbcx",
    ),
]

llm.invoke(messages)
```

Trace:
https://smith.langchain.com/public/d27e4fc1-a96d-41e1-9f52-54f5004122db/r
2024-06-13 20:14:23 -07:00
Lucas Tucker
7114aed78f docs: Standardize ChatGroq (#22751)
Updated ChatGroq doc string as per issue
https://github.com/langchain-ai/langchain/issues/22296:"langchain_groq:
updated docstring for ChatGroq in langchain_groq to match that of the
description (in the appendix) provided in issue
https://github.com/langchain-ai/langchain/issues/22296. "

Issue: This PR is in response to issue
https://github.com/langchain-ai/langchain/issues/22296, and more
specifically the ChatGroq model. In particular, this PR updates the
docstring for langchain/libs/partners/groq/langchain_groq/chat_model.py
by adding the following sections: Instantiate, Invoke, Stream, Async,
Tool calling, Structured Output, and Response metadata. I used the
template from the Anthropic implementation and referenced the Appendix
of the original issue post. I also noted that: `usage_metadata `returns
none for all ChatGroq models I tested; there is no mention of image
input in the ChatGroq documentation; unlike that of ChatHuggingFace,
`.stream(messages)` for ChatGroq returned blocks of output.

---------

Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-14 03:08:36 +00:00
Anush
e002c855bd qdrant[patch]: Use collection_exists API instead of exceptions (#22764)
## Description

Currently, the Qdrant integration relies on exceptions raised by
[`get_collection`
](https://qdrant.tech/documentation/concepts/collections/#collection-info)
to check if a collection exists.

Using
[`collection_exists`](https://qdrant.tech/documentation/concepts/collections/#check-collection-existence)
is recommended to avoid missing any unhandled exceptions. This PR
addresses this.

## Testing
All integration and unit tests pass. No user-facing changes.
2024-06-13 20:01:32 -07:00
Anindyadeep
c417803908 community[minor]: Prem Templates (#22783)
This PR adds the feature add Prem Template feature in ChatPremAI.
Additionally it fixes a minor bug for API auth error when API passed
through arguments.
2024-06-13 19:59:28 -07:00
Stefano Lottini
4160b700e6 docs: Astra DB vectorstore, adjust syntax for automatic-embedding example (#22833)
Description: Adjusting the syntax for creating the vectorstore
collection (in the case of automatic embedding computation) for the most
idiomatic way to submit the stored secret name.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-14 02:52:32 +00:00
maang-h
1055b9a309 community[minor]: Implement ZhipuAIEmbeddings interface (#22821)
- **Description:** Implement ZhipuAIEmbeddings interface, include:
     - The `embed_query` method
     - The `embed_documents` method

refer to [ZhipuAI
Embedding-2](https://open.bigmodel.cn/dev/api#text_embedding)

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-06-13 19:45:11 -07:00
Leonid Ganeline
46c9784127 docs: ReAct reference (#22830)
The `ReAct` is used all across LangChain but it is not referenced
properly.
Added references to the original paper.
2024-06-13 19:39:28 -07:00
Giacomo Berardi
712aa0c529 docs: fixes for Elasticsearch integrations, cache doc and providers list (#22817)
Some minor fixes in the documentation:
 - ElasticsearchCache initilization is now correct
 - List of integrations for ES updated
2024-06-13 19:39:10 -07:00
Isaac Francisco
f9a6d5c845 infra: lint new docs to match doc loader template (#22867) 2024-06-13 19:34:50 -07:00
Bagatur
8bd368d07e cli[patch]: Release 0.0.25 (#22876) 2024-06-14 02:31:04 +00:00
Isaac Francisco
75e966a2fa docs, cli[patch]: document loaders doc template (#22862)
From: https://github.com/langchain-ai/langchain/pull/22290

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-13 19:28:57 -07:00
Hayden Wolff
d1cdde267a docs: update NVIDIA Riva tool to use NVIDIA NIM for LLM (#22873)
**Description:**
Update the NVIDIA Riva tool documentation to use NVIDIA NIM for the LLM.
Show how to use NVIDIA NIMs and link to documentation for LangChain with
NIM.

---------

Co-authored-by: Hayden Wolff <hwolff@nvidia.com>
Co-authored-by: Isaac Francisco <78627776+isahers1@users.noreply.github.com>
2024-06-13 19:26:05 -07:00
Zeeshan Qureshi
ada1e5cc64 docs: s/path_images/images/ for ImageCaptionLoader keyword arguments (#22857)
Quick update to `ImageCaptionLoader` documentation to reflect what's in
code.
2024-06-13 18:37:12 -07:00
liuzc9
41e232cb82 Fix typo in vearch.md (#22840)
Fix typo
2024-06-13 18:24:51 -07:00
Kagura Chen
57783c5e55 Fix: lint errors and update Field alias in models.py and AutoSelectionScorer initialization (#22846)
This PR addresses several lint errors in the core package of LangChain.
Specifically, the following issues were fixed:

1.Unexpected keyword argument "required" for "Field"  [call-arg]
2.tests/integration_tests/chains/test_cpal.py:263: error: Unexpected
keyword argument "narrative_input" for "QueryModel" [call-arg]
2024-06-13 18:18:00 -07:00
Erick Friis
5bc774827b langchain: release 0.2.4 (#22872) 2024-06-14 00:14:48 +00:00
Erick Friis
7234fd0f51 core: release 0.2.6 (#22868) 2024-06-13 22:22:34 +00:00
Jacob Lee
bcbb43480c core[patch]: Treat type as a special field when merging lists (#22750)
Should we even log a warning? At least for Anthropic, it's expected to
get e.g. `text_block` followed by `text_delta`.

@ccurme @baskaryan @efriis
2024-06-13 15:08:24 -07:00
Nuno Campos
bae82e966a core: In astream_events v2 propagate cancel/break to the inner astream call (#22865)
- previous behavior was for the inner astream to continue running with
no interruption
- also propagate break in core runnable methods
2024-06-13 15:02:48 -07:00
Eugene Yurtsev
a766815a99 experimental[patch]/docs[patch]: Update links to security docs (#22864)
Minor update to newest version of security docs (content should be
identical).
2024-06-13 20:29:34 +00:00
Eugene Yurtsev
8f7cc73817 ci: Add script to check for pickle usage in community (#22863)
Add script to check for pickle usage in community.
2024-06-13 16:13:15 -04:00
Eugene Yurtsev
77209f315e community[patch]: FAISS VectorStore deserializer should be opt-in (#22861)
FAISS deserializer uses pickle module. Users have to opt-in to
de-serialize.
2024-06-13 15:48:13 -04:00
Eugene Yurtsev
ce0b0f22a1 experimental[major]: Force users to opt-in into code that relies on the python repl (#22860)
This should make it obvious that a few of the agents in langchain
experimental rely on the python REPL as a tool under the hood, and will
force users to opt-in.
2024-06-13 15:41:24 -04:00
Isaac Francisco
869523ad72 [docs]: added info for TavilySearchResults (#22765) 2024-06-13 12:14:11 -07:00
ccurme
42257b120f partners: fix numpy dep (#22858)
Following https://github.com/langchain-ai/langchain/pull/22813, which
added python 3.12 to CI, here we update numpy accordingly in partner
packages.
2024-06-13 14:46:42 -04:00
Isaac Francisco
345fd3a556 minor functionality change: adding API functionality to tavilysearch (#22761) 2024-06-13 11:10:28 -07:00
Isaac Francisco
034257e9bf docs: improved recursive url loader docs (#22648)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-13 11:09:35 -07:00
Isaac Francisco
e832bbb486 [docs]: bind tools (#22831) 2024-06-13 09:50:43 -07:00
ccurme
b626c3ca23 groq[patch]: add usage_metadata to (a)invoke and (a)stream (#22834) 2024-06-13 10:26:27 -04:00
Jacob Lee
e01e5d5a91 docs[patch]: Improve Groq integration page (#22844)
Was bare bones and got marked by folks as unhelpful.

CC @efriis @colemccracken
2024-06-13 03:40:29 -07:00
Jacob Lee
12eff6a130 docs[patch]: Readd Pydantic compatibility docs (#22836)
As a how-to guide.

CC @eyurtsev @hwchase17
2024-06-13 02:56:10 -07:00
Jacob Lee
cb654a3245 docs[patch]: Adds multimodal column to chat models table, move up in concepts (#22837)
CC @hwchase17 @baskaryan
2024-06-13 02:26:55 -07:00
James Braza
45b394268c core[patch]: allowing latest packaging versions (#22792)
Allowing version 24 of https://github.com/pypa/packaging

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-12 23:22:20 +00:00
Jacob Lee
00ad197502 docs[patch]: Add structured output to conceptual docs (#22791)
This downgrades `Function/tool calling` from a h3 to an h4 which means
it'll no longer show up in the right sidebar, but any direct links will
still work. I think that is ok, but LMK if you disapprove.

CC @hwchase17 @eyurtsev @rlancemartin
2024-06-12 15:30:51 -07:00
Karim Lalani
276be6cdd4 [experimental][llms][OllamaFunctions] tool calling related fixes (#22339)
Fixes issues with tool calling to handle tool objects correctly. Added
support to handle ToolMessage correctly.
Added additional checks for error conditions.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-12 16:34:43 -04:00
Christophe Bornet
d04e899b56 ci: add testing with Python 3.12 (#22813)
We need to use a different version of numpy for py3.8 and py3.12 in
pyproject.
And so do projects that use that Python version range and import
langchain.

    - **Twitter handle:** _cbornet
2024-06-12 16:31:36 -04:00
HyoJin Kang
b6bf2bb234 community[patch]: fix database uri type in SQLDatabase (#22661)
**Description**
sqlalchemy uses "sqlalchemy.engine.URL" type for db uri argument.
Added 'URL' type for compatibility.

**Issue**: None

**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-12 15:11:00 -04:00
Eugene Yurtsev
5dbbdcbf8e core[patch]: Update remaining root_validators (#22829)
This PR updates the remaining root_validators in core to either be explicit pre-init or post-init validators.
2024-06-12 14:47:40 -04:00
Eugene Yurtsev
265e650e64 community[patch]: Update root_validators embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub, Toolkits: Connery, ChatModels: PAI_EAS, (#22828)
This PR updates root validators for:

* Embeddings: llamacpp, jina, dashscope, mosaicml, huggingface_hub
* Toolkits: Connery
* ChatModels: PAI_EAS

Following this issue:
https://github.com/langchain-ai/langchain/issues/22819
2024-06-12 13:59:05 -04:00
JonZeolla
32ba8cfab0 community[minor]: implement huggingface show_progress consistently (#22682)
- **Description:** This implements `show_progress` more consistently
(i.e. it is also added to the `HuggingFaceBgeEmbeddings` object).
- **Issue:** This implements `show_progress` more consistently in the
embeddings huggingface classes. Previously this could have been set via
`encode_kwargs`.
 - **Dependencies:** None
 - **Twitter handle:** @jonzeolla
2024-06-12 17:30:56 +00:00
Eugene Yurtsev
74e705250f core[patch]: update some root_validators (#22787)
Update some of the @root_validators to be explicit pre=True or
pre=False, skip_on_failure=True for pydantic 2 compatibility.
2024-06-12 13:04:57 -04:00
bincat
3d6e8547f9 docs: fix function name in tutorials/agents.ipynb (#22809)
the function called in the flowing example is `create_react_agent`, not
`create_tool_calling_executor `
2024-06-12 12:30:35 -04:00
mrhbj
a1268d9e9a community[patch]: fix hunyuan message include chinese signature error (#22795) (#22796)
… (#22795)

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-12 12:30:22 -04:00
Kagura Chen
513f1d8037 docs: update repo_structure.mdx to reflect latest code changes (#22810)
**Description:** This PR updates the documentation to reflect the recent
code changes.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-12 12:30:04 -04:00
Mr. Lance E Sloan «UMich»
08c466c603 community[patch]: bugfix for YoutubeLoader's LINES format (#22815)
- **Description:** A change I submitted recently introduced a bug in
`YoutubeLoader`'s `LINES` output format. In those conditions, curly
braces ("`{}`") creates a set, not a dictionary. This bugfix explicitly
specifies that a dictionary is created.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter:** lsloan_umich
- **Mastodon:**
[lsloan@mastodon.social](https://mastodon.social/@lsloan)
2024-06-12 12:29:34 -04:00
Philippe PRADOS
23c22fcbc9 langchain[minor]: Make EmbeddingsFilters async (#22737)
Add native async implementation for EmbeddingsFilter
2024-06-12 12:27:26 -04:00
endrajeet
b45bf78d2e Update index.mdx (#22818)
changed "# 🌟Recognition" to "### 🌟 Recognition" to match the rest of the
subheadings.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-12 12:27:16 -04:00
Bagatur
8203c1ff87 infra: lint new docs to match templates (#22786) 2024-06-11 13:26:35 -07:00
ccurme
936aedd10c mistral[patch]: add usage_metadata to (a)invoke and (a)stream (#22781) 2024-06-11 15:34:50 -04:00
Jiří Spilka
20e3662acf docs: Correct code examples in the Apify's notebooks (#22768)
**Description:** Correct code examples in the Apify document load
notebook and Apify Dataset notebook

**Issue**: None
**Dependencies**: None
**Twitter handle**: None
2024-06-11 15:20:16 -04:00
mrhbj
9212c9fcb8 community[patch]: fix hunyuan client json analysis (#22452) (#22767)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-11 19:05:18 +00:00
Rohan Aggarwal
86e8224cf1 community[patch]: Support for old clients (Thin and Thick) Oracle Vector Store (#22766)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
Support for old clients (Thin and Thick) Oracle Vector Store


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
Support for old clients (Thin and Thick) Oracle Vector Store

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
Have our own local tests

---------

Co-authored-by: rohan.aggarwal@oracle.com <rohaagga@phoenix95642.dev3sub2phx.databasede3phx.oraclevcn.com>
2024-06-11 11:36:06 -07:00
Jacob Lee
232908a46d docs[patch]: Adds streaming conceptual doc (#22760)
CC @hwchase17 @baskaryan
2024-06-11 11:03:52 -07:00
Mr. Lance E Sloan «UMich»
84dc2dd059 community[patch]: Load YouTube transcripts (captions) as fixed-duration chunks with start times (#21710)
- **Description:** Add a new format, `CHUNKS`, to
`langchain_community.document_loaders.youtube.YoutubeLoader` which
creates multiple `Document` objects from YouTube video transcripts
(captions), each of a fixed duration. The metadata of each chunk
`Document` includes the start time of each one and a URL to that time in
the video on the YouTube website.
  
I had implemented this for UMich (@umich-its-ai) in a local module, but
it makes sense to contribute this to LangChain community for all to
benefit and to simplify maintenance.

- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter:** lsloan_umich
- **Mastodon:**
[lsloan@mastodon.social](https://mastodon.social/@lsloan)

With regards to **tests and documentation**, most existing features of
the `YoutubeLoader` class are not tested. Only the
`YoutubeLoader.extract_video_id()` static method had a test. However,
while I was waiting for this PR to be reviewed and merged, I had time to
add a test for the chunking feature I've proposed in this PR.

I have added an example of using chunking to the
`docs/docs/integrations/document_loaders/youtube_transcript.ipynb`
notebook.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-11 17:44:36 +00:00
Aayush Kataria
71811e0547 community[minor]: Adds a vector store for Azure Cosmos DB for NoSQL (#21676)
This PR add supports for Azure Cosmos DB for NoSQL vector store.

Summary:

Description: added vector store integration for Azure Cosmos DB for
NoSQL Vector Store,
Dependencies: azure-cosmos dependency,
Tag maintainer: @hwchase17, @baskaryan @efriis @eyurtsev

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-11 10:34:01 -07:00
Mohammad Mohtashim
36cad5d25c [Community]: Added Metadata filter support for DocumentDB Vector Store (#22777)
- **Description:** As pointed out in this issue #22770, DocumentDB
`similarity_search` does not support filtering through metadata which
this PR adds by passing in the parameter `filter`. Also this PR fixes a
minor Documentation error.
- **Issue:** #22770

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-11 16:37:53 +00:00
Dmitry Stepanov
912751e268 Ollama vision support (#22734)
**Description:** Ollama vision with messages in OpenAI-style support `{
"image_url": { "url": ... } }`
**Issue:** #22460 

Added flexible solution for ChatOllama to support chat messages with
images. Works when you provide either `image_url` as a string or as a
dict with "url" inside (like OpenAI does). So it makes available to use
tuples with `ChatPromptTemplate.from_messages()`

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-11 16:10:19 +00:00
Philippe PRADOS
0908b01cb2 langchain[minor]: Add native async implementation to LLMFilter, add concurrency to both sync and async paths (#22739)
Thank you for contributing to LangChain!

- [ ] **PR title**: "langchain: Fix chain_filter.py to be compatible
with async"


- [ ] **PR message**: 
    - **Description:** chain_filter is not compatible with async.
    - **Twitter handle:** pprados


- [X ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Signed-off-by: zhangwangda <zhangwangda94@163.com>
Co-authored-by: Prakul <discover.prakul@gmail.com>
Co-authored-by: Lei Zhang <zhanglei@apache.org>
Co-authored-by: Gin <ictgtvt@gmail.com>
Co-authored-by: wangda <38549158+daziz@users.noreply.github.com>
Co-authored-by: Max Mulatz <klappradla@posteo.net>
2024-06-11 10:55:40 -04:00
Jaeyeon Kim(김재연)
ce4e29ae42 community[minor]: fix redis store docstring and streamline initialization code (#22730)
Thank you for contributing to LangChain!

### Description

Fix the example in the docstring of redis store.
Change the initilization logic and remove redundant check, enhance error
message.

### Issue

The example in docstring of how to use redis store was wrong.

![image](https://github.com/langchain-ai/langchain/assets/37469330/78c5d9ce-ee66-45b3-8dfe-ea29f125e6e9)

### Dependencies
Nothing



- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-11 14:08:05 +00:00
am-kinetica
ad101adec8 community[patch]: Kinetica Integrations handled error in querying; quotes in table names; updated gpudb API (#22724)
- [ ] **Miscellaneous updates and fixes**: 
- **Description:** Handled error in querying; quotes in table names;
updated gpudb API
- **Issue:** Threw an error with an error message difficult to
understand if a query failed or returned no records
    - **Dependencies:** Updated GPUDB API version to `7.2.0.9`


@baskaryan @hwchase17
2024-06-11 10:01:26 -04:00
NithinBairapaka
27b9ea14a5 docs: Updated integration docs with required package installations (#22392)
**Title:** Updated integration docs with required package installations
   **Issue:**  #22005
2024-06-11 01:44:05 +00:00
Albert Gil López
1710423de3 docs: correct path in readme (#22383)
Description: Fix incorrect path in README instructions.
Issue: N/A
Dependencies: None
Twitter handle: @jddam

---------

Co-authored-by: isaac hershenson <ihershenson@hmc.edu>
2024-06-10 17:47:39 -07:00
Greg Tracy
7e115da16c docs: Fix pixelation in stack graphic (#21554)
This change updates the stack graphic displayed in the top-level README.
The LangChain tile is pixelated in the current graphic.
2024-06-10 22:52:22 +00:00
Leonid Ganeline
55bd8e582b docs: integrations cache: added class table (#22368)
Added a table with the cache classes. See [this table
here](https://langchain-rnpqvikie-langchain.vercel.app/v0.2/docs/integrations/llm_caching/#cache-classes-summary-table).
2024-06-10 15:09:03 -07:00
Jacob Lee
89804c3026 docs: Adds pointers from LLM pages to equivalent chat model pages (#22759)
@baskaryan
2024-06-10 14:13:22 -07:00
Qingchuan Hao
7f180f996b docs: fix langchain expression language link (#22683) 2024-06-10 21:12:47 +00:00
Mathis Joffre
ea43f40daf community[minor]: Add support for OVHcloud AI Endpoints Embedding (#22667)
**Description:** Add support for [OVHcloud AI
Endpoints](https://endpoints.ai.cloud.ovh.net/) Embedding models.

Inspired by:
https://gist.github.com/gmasse/e1f99339e161f4830df6be5d0095349a

Signed-off-by: Joffref <mariusjoffre@gmail.com>
2024-06-10 21:07:25 +00:00
Erick Friis
2aaf86ddae core: fix mustache falsy cases (#22747) 2024-06-10 14:00:12 -07:00
Eugene Yurtsev
5a7eac191a core[patch]: Add missing type annotations (#22756)
Add missing type annotations.

The missing type annotations will raise exceptions with pydantic 2.
2024-06-10 16:59:41 -04:00
Eugene Yurtsev
05d31a2f00 community[patch]: Add missing type annotations (#22758)
Add missing type annotations to objects in community.
These missing type annotations will raise type errors in pydantic 2.
2024-06-10 16:59:28 -04:00
Naka Masato
3237909221 langchain[patch]: allow to use partial variables in create_sql_query_chain (#22688)
- **Description:** allow to use partial variables to pass `top_k` and
`table_info`
- **Issue:** no
- **Dependencies:** no
- **Twitter handle:** @gymnstcs

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-10 20:58:30 +00:00
Bharat Ramanathan
2b5631a6be community[patch]: fix WandbTracer to work with new "RunV2" API (#22673)
- **Description:** This PR updates the `WandbTracer` to work with the
new RunV2 API so that wandb Traces logging works correctly for new
LangChain versions. Here's an example
[run](https://wandb.ai/parambharat/langchain-tracing/runs/wpm99ftq) from
the existing tests
- **Issue:** https://github.com/wandb/wandb/issues/7762
- **Twitter handle:** @ParamBharat

_If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17._
2024-06-10 13:56:35 -07:00
Oguz Vuruskaner
f0f4532579 community[patch]: fix deepinfra inference (#22680)
This PR includes:

1. Update of default model to LLama3.
2. Handle some 400x errors with more user friendly error messages.
3. Handle user errors.
2024-06-10 13:55:55 -07:00
Lucas Tucker
cb79e80b0b docs: standardize ChatHuggingFace (#22693)
**Updated ChatHuggingFace doc string as per issue #22296**:
"langchain_huggingface: updated docstring for ChatHuggingFace in
langchain_huggingface to match that of the description (in the appendix)
provided in issue #22296. "

**Issue:** This PR is in response to issue #22296, and more specifically
ChatHuggingFace model. In particular, this PR updates the docstring for
langchain/libs/partners/hugging_face/langchain_huggingface/chat_models/huggingface.py
by adding the following sections: Instantiate, Invoke, Stream, Async,
Tool calling, and Response metadata. I used the template from the
Anthropic implementation and referenced the Appendix of the original
issue post. I also noted that: langchain_community hugging face llms do
not work with langchain_huggingface's ChatHuggingFace model (at least
for me); the .stream(messages) functionality of ChatHuggingFace only
returned a block of response.

---------

Co-authored-by: lucast2021 <lucast2021@headroyce.org>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-10 20:54:36 +00:00
Erick Friis
d92f2251c8 docs: couchbase partner package (#22757) 2024-06-10 20:53:03 +00:00
Tomaz Bratanic
76a193decc community[patch]: Add function response to graph cypher qa chain (#22690)
LLMs struggle with Graph RAG, because it's different from vector RAG in
a way that you don't provide the whole context, only the answer and the
LLM has to believe. However, that doesn't really work a lot of the time.
However, if you wrap the context as function response the accuracy is
much better.

btw... `union[LLMChain, Runnable]` is linting fun, that's why so many
ignores
2024-06-10 13:52:17 -07:00
X-HAN
34edfe4a16 community[minor]: add Volcengine Rerank (#22700)
**Description:** this PR adds Volcengine Rerank capability to Langchain,
you can find Volcengine Rerank API from
[here](https://www.volcengine.com/docs/84313/1254474) &
[here](https://www.volcengine.com/docs/84313/1254605).
[Volcengine](https://www.volcengine.com/) is a cloud service platform
developed by ByteDance, the parent company of TikTok. You can obtain
Volcengine API AK/SK from
[here](https://www.volcengine.com/docs/84313/1254553).

**Dependencies:** VolcengineRerank depends on `volcengine` python
package.

**Twitter handle:** my twitter/x account is https://x.com/LastMonopoly
and I'd like a mention, thank you!


**Tests and docs**
  1. integration test: `test_volcengine_rerank.py`
  2. example notebook: `volcengine_rerank.ipynb`

**Lint and test**: I have run `make format`, `make lint` and `make test`
from the root of the package I've modified.
2024-06-10 13:41:05 -07:00
Prakul
9eacce9356 docs:Update reference to langchain-mongodb (#22705)
**Description**: Update reference to langchain-mongodb
2024-06-10 13:35:21 -07:00
Ikko Eltociear Ashimine
4197c9c85f docs: update azure_container_apps_dynamic_sessions_data_analyst.ipynb (#22718)
colum -> column
2024-06-10 13:33:40 -07:00
Jacob Lee
e4183cbc4e docs[patch]: Add caution on OpenAI LLMs integration page (#22754)
@baskaryan do we like?

<img width="1040" alt="Screenshot 2024-06-10 at 12 16 45 PM"
src="https://github.com/langchain-ai/langchain/assets/6952323/8893063f-1acf-4a56-9ee5-a8a2b1560277">
2024-06-10 13:27:22 -07:00
Mohammad Mohtashim
c3cce98d86 community[patch]: Small Fix in OutlookMessageLoader (Close the Message once Open) (#22744)
- **Description:** A very small fix where we close the message when it
opened
- **Issue:** #22729
2024-06-10 13:08:39 -07:00
Bagatur
86a3f6edf1 docs: standardize ChatVertexAI (#22686)
Part of #22296. Part two of
https://github.com/langchain-ai/langchain-google/pull/287
2024-06-10 12:50:50 -07:00
ccurme
f9fdca6cc2 openai: add parallel_tool_calls to api ref (#22746)
![Screenshot 2024-06-10 at 1 41 24
PM](https://github.com/langchain-ai/langchain/assets/26529506/2626bf9c-41c6-4431-b2e1-f59de1e4e468)
2024-06-10 17:44:43 +00:00
Max Mulatz
058a64c563 Community[minor]: Add language parser for Elixir (#22742)
Hi 👋 

First off, thanks a ton for your work on this 💚 Really appreciate what
you're providing here for the community.

## Description

This PR adds a basic language parser for the
[Elixir](https://elixir-lang.org/) programming language. The parser code
is based upon the approach outlined in
https://github.com/langchain-ai/langchain/pull/13318: it's using
`tree-sitter` under the hood and aligns with all the other `tree-sitter`
based parses added that PR.

The `CHUNK_QUERY` I'm using here is probably not the most sophisticated
one, but it worked for my application. It's a starting point to provide
"core" parsing support for Elixir in LangChain. It enables people to use
the language parser out in real world applications which may then lead
to further tweaking of the queries. I consider this PR just the ground
work.

- **Dependencies:** requires `tree-sitter` and `tree-sitter-languages`
from the extended dependencies
- **Twitter handle:**`@bitcrowd`

## Checklist

- [x] **PR title**: "package: description"
- [x] **Add tests and docs**
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.

<!-- If no one reviews your PR within a few days, please @-mention one
of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17. -->
2024-06-10 15:56:57 +00:00
wangda
28e956735c docs:Correcting spelling mistakes in readme (#22664)
Signed-off-by: zhangwangda <zhangwangda94@163.com>
2024-06-10 15:33:41 +00:00
Gin
6f54abc252 docs: Add a missing dot in concepts.mdx (#22677) 2024-06-10 15:30:56 +00:00
Philippe PRADOS
2d4689d721 langchain[minor]: Add pgvector to list of supported vectorstores in self query retriever (#22678)
The fact that we outsourced pgvector to another project has an
unintended effect. The mapping dictionary found by
`_get_builtin_translator()` cannot recognize the new version of pgvector
because it comes from another package.
`SelfQueryRetriever` no longer knows `PGVector`.

I propose to fix this by creating a global dictionary that can be
populated by various database implementations. Thus, importing
`langchain_postgres` will allow the registration of the `PGvector`
mapping.

But for the moment I'm just adding a lazy import

Furthermore, the implementation of _get_builtin_translator()
reconstructs the BUILTIN_TRANSLATORS variable with each invocation,
which is not very efficient. A global map would be an optimization.

- **Twitter handle:** pprados

@eyurtsev, can you review this PR? And unlock the PR [Add async mode for
pgvector](https://github.com/langchain-ai/langchain-postgres/pull/32)
and PR [community[minor]: Add SQL storage
implementation](https://github.com/langchain-ai/langchain/pull/22207)?

Are you in favour of a global dictionary-based implementation of
Translator?
2024-06-10 11:27:47 -04:00
Lei Zhang
5ba1899cd7 infra: Scheduled GitHub Actions to run only on the upstream repository (#22707)
**Description:** Scheduled GitHub Actions to run only on the upstream
repository

**Issue:** Fixes #22706 

**Twitter handle:** @coolbeevip
2024-06-10 11:07:42 -04:00
Prakul
3f76c9e908 docs: Update MongoDB information in llm_caching (#22708)
**Description:**: Update MongoDB information in llm_caching
2024-06-10 11:05:55 -04:00
fzowl
c1fced9269 docs: VoyageAI new embedding and reranking models (#22719) 2024-06-09 09:12:43 -07:00
Enzo Poggio
8f019e91d7 community[patch]: Use Custom Logger Instead of Root Logger in get_user_agent Function (#22691)
## Description
This PR addresses a logging inconsistency in the `get_user_agent`
function. Previously, the function was using the root logger to log a
warning message when the "USER_AGENT" environment variable was not set.
This bypassed the custom logger `log` that was created at the start of
the module, leading to potential inconsistencies in logging behavior.

Changes:
- Replaced `logging.warning` with `log.warning` in the `get_user_agent`
function to ensure that the custom logger is used.

This change ensures that all logging in the `get_user_agent` function
respects the configurations of the custom logger, leading to more
consistent and predictable logging behavior.

## Dependencies

None

## Issue 

None

## Tests and docs

☝🏻 see description


## `make format`, `make lint` & `cd libs/community; make test`

```shell
> make format 
poetry run ruff format docs templates cookbook
1417 files left unchanged
poetry run ruff check --select I --fix docs templates cookbook
All checks passed!
```

```shell
> make lint
poetry run ruff check docs templates cookbook
All checks passed!
poetry run ruff format docs templates cookbook --diff
1417 files already formatted
poetry run ruff check --select I docs templates cookbook
All checks passed!
git grep 'from langchain import' docs/docs templates cookbook | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
```

~cd libs/community; make test~ too much dependencies for integration ...

```shell
>  poetry run pytest tests/unit_tests   
....
==== 884 passed, 466 skipped, 4447 warnings in 15.93s ====
```

I choose you randomly : @ccurme
2024-06-08 02:33:07 +00:00
Philippe PRADOS
9aabb446c5 community[minor]: Add SQL storage implementation (#22207)
Hello @eyurtsev

- package: langchain-comminity
- **Description**: Add SQL implementation for docstore. A new
implementation, in line with my other PR ([async
PGVector](https://github.com/langchain-ai/langchain-postgres/pull/32),
[SQLChatMessageMemory](https://github.com/langchain-ai/langchain/pull/22065))
- Twitter handler: pprados

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Piotr Mardziel <piotrm@gmail.com>
Co-authored-by: ChengZi <chen.zhang@zilliz.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-07 21:17:02 +00:00
Nithish Raghunandanan
f2f0e0e13d couchbase: Add the initial version of Couchbase partner package (#22087)
Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-07 14:04:08 -07:00
Cahid Arda Öz
6c07eb0c12 community[minor]: Add UpstashRatelimitHandler (#21885)
Adding `UpstashRatelimitHandler` callback for rate limiting based on
number of chain invocations or LLM token usage.

For more details, see [upstash/ratelimit-py
repository](https://github.com/upstash/ratelimit-py) or the notebook
guide included in this PR.

Twitter handle: @cahidarda

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-07 21:02:06 +00:00
Erick Friis
9b3ce16982 docs: remove nonexistent headings (#22685) 2024-06-07 20:02:06 +00:00
Erick Friis
9e03864d64 core: add error message for non-structured llm to StructuredPrompt (#22684)
previously was the blank `NotImplementedError` from
`BaseLanguageModel.with_structured_output`
2024-06-07 19:42:09 +00:00
Jacob Lee
02ff78deb8 docs[patch]: Adds LangGraph and LangSmith links, adds more crosslinks between pages (#22656)
@baskaryan @hwchase17
2024-06-07 10:22:29 -07:00
Mateusz Szewczyk
c3a8716589 docs: Updated product version in Embeddings notebook (#22062) 2024-06-07 08:11:03 -07:00
ccurme
f32d57f6f0 anthropic: refactor streaming to use events api; add streaming usage metadata (#22628)
- Refactor streaming to use raw events;
- Add `stream_usage` class attribute and kwarg to stream methods that,
if True, will include separate chunks in the stream containing usage
metadata.

There are two ways to implement streaming with anthropic's python sdk.
They have slight differences in how they surface usage metadata.
1. [Use helper
functions](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-helpers).
This is what we are doing now.
```python
count = 1
with client.messages.stream(**params) as stream:
    for text in stream.text_stream:
        snapshot = stream.current_message_snapshot
        print(f"{count}: {snapshot.usage} -- {text}")
        count = count + 1

final_snapshot = stream.get_final_message()
print(f"{count}: {final_snapshot.usage}")
```
```
1: Usage(input_tokens=8, output_tokens=1) -- Hello
2: Usage(input_tokens=8, output_tokens=1) -- !
3: Usage(input_tokens=8, output_tokens=1) --  How
4: Usage(input_tokens=8, output_tokens=1) --  can
5: Usage(input_tokens=8, output_tokens=1) --  I
6: Usage(input_tokens=8, output_tokens=1) --  assist
7: Usage(input_tokens=8, output_tokens=1) --  you
8: Usage(input_tokens=8, output_tokens=1) --  today
9: Usage(input_tokens=8, output_tokens=1) -- ?
10: Usage(input_tokens=8, output_tokens=12)
```
To do this correctly, we need to emit a new chunk at the end of the
stream containing the usage metadata.

2. [Handle raw
events](https://github.com/anthropics/anthropic-sdk-python?tab=readme-ov-file#streaming-responses)
```python
stream = client.messages.create(**params, stream=True)
count = 1
for event in stream:
    print(f"{count}: {event}")
    count = count + 1
```
```
1: RawMessageStartEvent(message=Message(id='msg_01Vdyov2kADZTXqSKkfNJXcS', content=[], model='claude-3-haiku-20240307', role='assistant', stop_reason=None, stop_sequence=None, type='message', usage=Usage(input_tokens=8, output_tokens=1)), type='message_start')
2: RawContentBlockStartEvent(content_block=TextBlock(text='', type='text'), index=0, type='content_block_start')
3: RawContentBlockDeltaEvent(delta=TextDelta(text='Hello', type='text_delta'), index=0, type='content_block_delta')
4: RawContentBlockDeltaEvent(delta=TextDelta(text='!', type='text_delta'), index=0, type='content_block_delta')
5: RawContentBlockDeltaEvent(delta=TextDelta(text=' How', type='text_delta'), index=0, type='content_block_delta')
6: RawContentBlockDeltaEvent(delta=TextDelta(text=' can', type='text_delta'), index=0, type='content_block_delta')
7: RawContentBlockDeltaEvent(delta=TextDelta(text=' I', type='text_delta'), index=0, type='content_block_delta')
8: RawContentBlockDeltaEvent(delta=TextDelta(text=' assist', type='text_delta'), index=0, type='content_block_delta')
9: RawContentBlockDeltaEvent(delta=TextDelta(text=' you', type='text_delta'), index=0, type='content_block_delta')
10: RawContentBlockDeltaEvent(delta=TextDelta(text=' today', type='text_delta'), index=0, type='content_block_delta')
11: RawContentBlockDeltaEvent(delta=TextDelta(text='?', type='text_delta'), index=0, type='content_block_delta')
12: RawContentBlockStopEvent(index=0, type='content_block_stop')
13: RawMessageDeltaEvent(delta=Delta(stop_reason='end_turn', stop_sequence=None), type='message_delta', usage=MessageDeltaUsage(output_tokens=12))
14: RawMessageStopEvent(type='message_stop')
```

Here we implement the second option, in part because it should make
things easier when implementing streaming tool calls in the near future.

This would add two new chunks to the stream-- one at the beginning and
one at the end-- with blank content and containing usage metadata. We
add kwargs to the stream methods and a class attribute allowing for this
behavior to be toggled. I enabled it by default. If we merge this we can
add the same kwargs / attribute to OpenAI.

Usage:
```python
from langchain_anthropic import ChatAnthropic

model = ChatAnthropic(
    model="claude-3-haiku-20240307",
    temperature=0
)

full = None
for chunk in model.stream("hi"):
    full = chunk if full is None else full + chunk
    print(chunk)

print(f"\nFull: {full}")
```
```
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 0, 'total_tokens': 8}
content='Hello' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='!' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' How' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' can' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' I' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' assist' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' you' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content=' today' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='?' id='run-8a20843f-25c7-4025-ad72-9add395899e3'
content='' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 0, 'output_tokens': 12, 'total_tokens': 12}

Full: content='Hello! How can I assist you today?' id='run-8a20843f-25c7-4025-ad72-9add395899e3' usage_metadata={'input_tokens': 8, 'output_tokens': 12, 'total_tokens': 20}
```
2024-06-07 13:21:46 +00:00
Bagatur
235d91940d community[patch]: Release 0.2.4 (#22643) 2024-06-06 17:47:44 -07:00
Francesco Kruk
344adad056 docs: Update jina embedding notebook to include multimodal capability (#22594)
After merging the [PR #22416 to include Jina AI multimodal
capabilities](https://github.com/langchain-ai/langchain/pull/22416), we
updated the Jina AI embedding notebook accordingly.
2024-06-07 00:02:20 +00:00
William FH
be79ce9336 [Core] Unified Enable/Disable Tracing (#22576) 2024-06-06 16:54:35 -07:00
Leonid Ganeline
57c1239643 docs: arxiv page update (#22574)
Added a link to search the arXiv papers with references to LangChain.
Updated table: better format (no horizontal scroll in table anymore).
2024-06-06 16:51:02 -07:00
Bagatur
fe2e5a3b74 langchain[patch]: Release 0.2.3 (#22644) 2024-06-06 16:29:18 -07:00
Erick Friis
a24a9c6427 multiple: get rid of pyproject extras (#22581)
They cause `poetry lock` to take a ton of time, and `uv pip install` can
resolve the constraints from these toml files in trivial time
(addressing problem with #19153)

This allows us to properly upgrade lockfile dependencies moving forward,
which revealed some issues that were either fixed or type-ignored (see
file comments)
2024-06-06 15:45:22 -07:00
Bagatur
4367e89c9a core[patch]: Release 0.2.5 (#22642) 2024-06-06 15:44:26 -07:00
Eugene Yurtsev
28f744c1f5 core[patch]: Correctly order parent ids in astream events (from root to immediate parent), add defensive check for cycles (#22637)
This PR makes two changes:

1. Fixes the order of parent IDs to be from root to immediate parent
2. Adds a simple defensive check for cycles
2024-06-06 20:37:52 +00:00
Satyam Kumar
835926153b updated oracleai_demo.ipynb (#22635)
The outer try/except block handles connection errors, and the inner
try/except block handles SQL execution errors, providing detailed error
messages for both.
try:
    conn = oracledb.connect(user=username, password=password, dsn=dsn)
    print("Connection successful!")

    cursor = conn.cursor()
    try:
        cursor.execute(
            """
            begin
                -- Drop user
                begin
                    execute immediate 'drop user testuser cascade';
                exception
                    when others then
dbms_output.put_line('Error dropping user: ' || SQLERRM);
                end;

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-06 20:29:24 +00:00
Eugene Yurtsev
035a9c9609 core[minor]: Add parent_ids to astream_events API (#22563)
Include a list of parent ids for each event in astream events.
2024-06-06 16:14:28 -04:00
Tomaz Bratanic
67e58fdc2e docs[patch]: Fix diffbot docs (#22584) 2024-06-06 16:08:59 -04:00
Eugene Yurtsev
6b8963ad92 docs: Add information about run time binding values to tools (#22623)
Add how-to guide that shows a design pattern for creating tools at run time
2024-06-06 16:05:34 -04:00
CharlesCNorton
aa49163bdf docs[patch]: typo in AutoGPT example notebook (#22631)
Corrected a typo in the AutoGPT example notebook. Changed "Needed synce
jupyter runs an async eventloop" to "Needed since Jupyter runs an async
event loop".

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-06 16:05:11 -04:00
CharlesCNorton
ffe75d1e46 docs: typo in dev container documentation (#22630)
removed an extra space before the period in the "Click **Create
codespace on master**." line.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-06 16:04:48 -04:00
Nicolas Nkiere
51005e2776 core[minor]: Add an async root listener and with_alisteners method (#22151)
- [x] **Adding AsyncRootListener**: "langchain_core: Adding
AsyncRootListener"

- **Description:** Adding an AsyncBaseTracer, AsyncRootListener and
`with_alistener` function. This is to enable binding async root listener
to runnables. This currently only supported for sync listeners.
- **Issue:** None
- **Dependencies:** None

- [x] **Add tests and docs**: Added units tests and example snippet code
within the function description of `with_alistener`


- [x] **Lint and test**: Run make format_diff, make lint_diff and make
test
2024-06-06 16:03:44 -04:00
seyf97
2904c50cd5 openai[patch]: correct grammar in exception message in embeddings/base.py (#22629)
Correct the grammar error for missing transformers package ValueError
2024-06-06 18:55:04 +00:00
Anush
80560419b0 qdrant[patch]: Make path optional in from_existing_collection() (#21875)
## Description

The `path` param is used to specify the local persistence directory,
which isn't required if using Qdrant server.

This is a breaking but necessary change.
2024-06-06 10:37:08 -07:00
ccurme
b57aa89f34 multiple: implement ls_params (#22621)
implement ls_params for ai21, fireworks, groq.
2024-06-06 16:51:37 +00:00
Xiangrui Meng
f26ab93df8 community: support Databricks Unity Catalog functions as LangChain tools (#22555)
This PR adds support for using Databricks Unity Catalog functions as
LangChain tools, which runs inside a Databricks SQL warehouse.

* An example notebook is provided.
2024-06-06 09:38:50 -07:00
ccurme
c1ef731503 anthropic: update attribute name and alias (#22625)
update name to `stop_sequences` and alias to `stop` (instead of the
other way around), since `stop_sequences` is the name used by anthropic.
2024-06-06 12:29:10 -04:00
lucasiscovici
05bf98b2f9 community[patch]: pgvector replace nin_ by not_in (#22619)
- [ ] **community**: "pgvector: replace nin_ by not_in"

- [ ] **PR message**: nin_ do not exist in sqlalchemy orm, it's not_in
2024-06-06 12:17:22 -04:00
ccurme
3999761201 multiple: add stop attribute (#22573) 2024-06-06 12:11:52 -04:00
ccurme
e08879147b Revert "anthropic: stream token usage" (#22624)
Reverts langchain-ai/langchain#20180
2024-06-06 12:05:08 -04:00
Bagatur
0d495f3f63 anthropic: stream token usage (#20180)
open to other ideas
<img width="1181" alt="Screenshot 2024-04-08 at 5 34 08 PM"
src="https://github.com/langchain-ai/langchain/assets/22008038/03eb11c4-5eb5-43e3-9109-a13f76098fa4">

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-06 11:51:34 -04:00
liuzc9
e0e40f3f63 docs: Fix typo in llmonitor.md (#22590) 2024-06-06 15:26:51 +00:00
Bagatur
feb73d4281 docs: Add ChatGoogleGenerativeAI to model feat table (#22617) 2024-06-06 08:07:13 -07:00
Satyam Kumar
17b486a37b openai, azure: update model_name in ChatResult to use name from API response (#22569)
The response.get("model", self.model_name) checks if the model key
exists in the response dictionary. If it does, it uses that value;
otherwise, it uses self.model_name.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-06 11:00:09 -04:00
Suganth Solamanraja
02495ae7c5 docs: Correct return type in docstring (#22597)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: 
- **Description:** This PR corrects the return type in the docstring of
the `docs/api_reference/create_api_rst.py/_load_package_modules`
function. The return type was previously described as a list of

Co-authored-by: suganthsolamanraja <suganth.solamanraja@techjays..com>
2024-06-06 14:51:46 +00:00
svmpsp-rc
51942c03eb docs: correct typos in Italian words (#22606)
**Description**

Fix typos in Italian words.
2024-06-06 07:46:07 -07:00
Gabriele Ghisleni
95883a99a9 docs: ElasticsearchCacheStore in stores integrations documentation (#22612)
The package for LangChain integrations with Elasticsearch
https://github.com/langchain-ai/langchain-elastic contains a
Elasticsearch byte store cache integration (see
https://github.com/langchain-ai/langchain-elastic/pull/27). This is the
documentation contribution on the page dedicated to stores integrations

Co-authored-by: Gabriele Ghisleni <gabriele.ghisleni@spaziodati.eu>
2024-06-06 14:36:43 +00:00
Christophe Bornet
12ddb4fc6f core[patch]: Use explicit classes for InMemoryByteStore and InMemoryStore (#22608)
The current implementation doesn't work well with type checking.
Instead replace with class definition that correctly works with type
checking.
2024-06-06 07:34:43 -07:00
andyjessen
cfed68e06f docs: Fix description (#22611)
This commit fixes the description of the hair_color field.
2024-06-06 07:25:27 -07:00
ccurme
1925bde32e together: bump langchain-core (#22616)
langchain-together depends on langchain-openai ^0.1.8
langchain-openai 0.1.8 has langchain-core >= 0.2.2

Here we bump langchain-core to 0.2.2, just to pass minimum dependency
version tests.
2024-06-06 14:09:40 +00:00
ccurme
35f4aa927b together[patch]: Release 0.1.3 (#22615) 2024-06-06 13:58:35 +00:00
Asi Greenholts
f23bec7be6 docs: Fix typo (#22596)
Fix typo
2024-06-06 08:39:54 -04:00
CharlesCNorton
abb0cecb44 fix: typo in Agents section of README (#22599)
Corrected the phrase "complete done" to "completely done" for better
grammatical accuracy and clarity in the Agents section of the README.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-06-06 07:44:36 -04:00
Kirushikesh DB
db7e7b69e3 docs: Removed unwanted cell in refine segment (#22604)
**Description:**
There is one unwanted duplicate cell in refine section of summarization
documentation, i have removed it.
2024-06-06 07:40:26 -04:00
andyjessen
8b40428f58 docs: Fix typo (#22603)
This commit changes minor typo in the field description.
2024-06-06 07:38:36 -04:00
Isaac Francisco
ba3e219d83 community[patch]: recursive url loader fix and unit tests (#22521)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-05 17:56:20 -07:00
Jacob Lee
234394f631 docs[minor]: Add "Build a PDF ingestion and Question/Answering system" tutorial (#22570)
More direct entrypoint for a common use-case. Meant to give people a
more hands-on intro to document loaders/loading data from different data
sources as well.

Some duplicate content for RAG and extraction (to show what you can do
with the loaded documents), but defers to the appropriate sections
rather than going too in-depth.

@baskaryan @hwchase17
2024-06-05 17:09:28 -07:00
Jeffrey Mak
5fc5ed463c community[patch]:Support filter for AzureAISearchRetriever (#22303)
**Description**: 
The AzureAISearchRetriever does not support the "$filter" argument
offered in the AISearch API:
https://learn.microsoft.com/en-us/rest/api/searchservice/documents/search-get?view=rest-searchservice-2023-11-01&tabs=HTTP
The $filter allows filtering of indexes based on values in metadata.

**Issue**: 
https://github.com/langchain-ai/langchain/issues/19885

**Dependencies**: 
No

**Twitter handle**: 
@Jeffreym9M
 

- [ ] **Add tests and docs**: Not relevant


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-05 16:53:19 -07:00
Isaac Francisco
148088a588 docs: duckduckgosearch options listed (#22568)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-05 23:29:47 +00:00
Mikhail Khludnev
ef868bc24b docs: mentioning query_instruction with regards to BGE-M3 (#22405)
see
https://github.com/langchain-ai/langchain/pull/18017#issuecomment-2143942760
https://huggingface.co/BAAI/bge-m3#faq

Co-authored-by: mikhail-khludnev <mikhail_khludnev@rntgroup.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-05 22:44:40 +00:00
X-HAN
62f13f95e4 community[minor]: add DashScope Rerank (#22403)
**Description:** this PR adds DashScope Rerank capability to Langchain,
you can find DashScope Rerank API from
[here](https://help.aliyun.com/document_detail/2780058.html?spm=a2c4g.2780059.0.0.6d995024FlrJ12)
&
[here](https://help.aliyun.com/document_detail/2780059.html?spm=a2c4g.2780058.0.0.63f75024cr11N9).
[DashScope](https://dashscope.aliyun.com/) is the generative AI service
from Alibaba Cloud (Aliyun). You can create DashScope API key from
[here](https://bailian.console.aliyun.com/?apiKey=1#/api-key).

**Dependencies:** DashScopeRerank depends on `dashscope` python package.

**Twitter handle:** my twitter/x account is https://x.com/LastMonopoly
and I'd like a mention, thanks you!


**Tests and docs**
  1. integration test: `test_dashscope_rerank.py`
  2. example notebook: `dashscope_rerank.ipynb`

**Lint and test**: I have run `make format`, `make lint` and `make test`
from the root of the package I've modified.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-05 15:40:21 -07:00
Ethan Yang
29064848f9 [Community]add option to delete the prompt from HF output (#22225)
This will help to solve pattern mismatching issue when parsing the
output in Agent.

https://github.com/langchain-ai/langchain/issues/21912
2024-06-05 18:38:54 -04:00
Jacob Lee
c040dc7017 docs[patch]: Adds heading keywords to concepts page (#22577)
@efriis @baskaryan
2024-06-05 15:28:58 -07:00
Erick Friis
24fa17593f docs: update agentexecutor title to legacy (#22575) 2024-06-05 15:09:41 -07:00
Bagatur
584a1e30ac community[patch]: AzureSearch async functions (#22075) 2024-06-05 14:39:54 -07:00
Bagatur
1a911018bc langchain[minor]: add universal init_model (#22039)
decisions to discuss
- only chat models
- model_provider isn't based on any existing values like llm-type,
package names, class names
- implemented as function not as a wrapper ChatModel
- function name (init_model)
- in langchain as opposed to community or core
- marked beta
2024-06-05 14:39:40 -07:00
Isaac Francisco
67012c2558 docs: deprecation of max_length parameter used in Exa search (#22567) 2024-06-05 12:09:53 -07:00
ccurme
af129974a3 community: update how OpenAIAssistantV2Runnable creates threads with tool_resources (#22549)
https://github.com/langchain-ai/langchain/issues/22503
2024-06-05 14:19:41 -04:00
Bagatur
51a0d4574e community[patch]: Release 0.2.3 (#22562) 2024-06-05 17:27:24 +00:00
Bagatur
b2daba37c7 nomic[patch]: Release 0.1.2 (#22561) 2024-06-05 17:06:58 +00:00
Zach Nussbaum
14f3014cce embeddings: nomic embed vision (#22482)
Thank you for contributing to LangChain!

**Description:** Adds Langchain support for Nomic Embed Vision
**Twitter handle:** nomic_ai,zach_nussbaum


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-05 09:47:17 -07:00
leila-messallem
3280a5b49b community[patch]: improve test setup to accurately test filtering of labels in neo4j (#22531)
**Description:** This PR addresses an issue with an existing test that
was not effectively testing the intended functionality. The previous
test setup did not adequately validate the filtering of the labels in
neo4j, because the nodes and relationship in the test data did not have
any properties set. Without properties these labels would not have been
returned, regardless of the filtering.

---------

Co-authored-by: Oskar Hane <oh@oskarhane.com>
2024-06-05 15:56:53 +00:00
Mohammad Mohtashim
7fcef2556c [Experimental]: Async agenerate method ollama functions (#21682)
- **Description:** :
Added Async method for Generate for OllamaFunctions which was missing
and was raising errors for the users.
   
- **Issue:** 
#21422
2024-06-05 11:50:36 -04:00
Stefano Lottini
328d0c99f2 community[minor]: Add support for metadata indexing policy in Cassandra vector store (#22548)
This PR adds a constructor `metadata_indexing` parameter to the
Cassandra vector store to allow optional fine-tuning of which fields of
the metadata are to be indexed.

This is a feature supported by the underlying CassIO library. Indexing
mode of "all", "none" or deny- and allow-list based choices are
available.

The rationale is, in some cases it's advisable to programmatically
exclude some portions of the metadata from the index if one knows in
advance they won't ever be used at search-time. this keeps the index
more lightweight and performant and avoids limitations on the length of
_indexed_ strings.

I added a integration test of the feature. I also added the possibility
of running the integration test with Cassandra on an arbitrary IP
address (e.g. Dockerized), via
`CASSANDRA_CONTACT_POINTS=10.1.1.5,10.1.1.6 poetry run pytest [...]` or
similar.

While I was at it, I added a line to the `.gitignore` since the mypy
_test_ cache was not ignored yet.

My X (Twitter) handle: @rsprrs.
2024-06-05 11:23:26 -04:00
Emilien Chauvet
c3d4126eb1 community[minor]: add user agent for web scraping loaders (#22480)
**Description:** This PR adds a `USER_AGENT` env variable that is to be
used for web scraping. It creates a util to get that user agent and uses
it in the classes used for scraping in [this piece of
doc](https://python.langchain.com/v0.1/docs/use_cases/web_scraping/).
Identifying your scraper is considered a good politeness practice, this
PR aims at easing it.
**Issue:** `None`
**Dependencies:** `None`
**Twitter handle:** `None`
2024-06-05 15:20:34 +00:00
Philippe PRADOS
8250c177de community[minor]: Add native async support to SQLChatMessageHistory (#22065)
# package community: Fix SQLChatMessageHistory

## Description
Here is a rewrite of `SQLChatMessageHistory` to properly implement the
asynchronous approach. The code circumvents [issue
22021](https://github.com/langchain-ai/langchain/issues/22021) by
accepting a synchronous call to `def add_messages()` in an asynchronous
scenario. This bypasses the bug.

For the same reasons as in [PR
22](https://github.com/langchain-ai/langchain-postgres/pull/32) of
`langchain-postgres`, we use a lazy strategy for table creation. Indeed,
the promise of the constructor cannot be fulfilled without this. It is
not possible to invoke a synchronous call in a constructor. We
compensate for this by waiting for the next asynchronous method call to
create the table.

The goal of the `PostgresChatMessageHistory` class (in
`langchain-postgres`) is, among other things, to be able to recycle
database connections. The implementation of the class is problematic, as
we have demonstrated in [issue
22021](https://github.com/langchain-ai/langchain/issues/22021).

Our new implementation of `SQLChatMessageHistory` achieves this by using
a singleton of type (`Async`)`Engine` for the database connection. The
connection pool is managed by this singleton, and the code is then
reentrant.

We also accept the type `str` (optionally complemented by `async_mode`.
I know you don't like this much, but it's the only way to allow an
asynchronous connection string).

In order to unify the different classes handling database connections,
we have renamed `connection_string` to `connection`, and `Session` to
`session_maker`.

Now, a single transaction is used to add a list of messages. Thus, a
crash during this write operation will not leave the database in an
unstable state with a partially added message list. This makes the code
resilient.

We believe that the `PostgresChatMessageHistory` class is no longer
necessary and can be replaced by:
```
PostgresChatMessageHistory = SQLChatMessageHistory
```
This also fixes the bug.


## Issue
- [issue 22021](https://github.com/langchain-ai/langchain/issues/22021)
  - Bug in _exit_history()
  - Bugs in PostgresChatMessageHistory and sync usage
  - Bugs in PostgresChatMessageHistory and async usage
- [issue
36](https://github.com/langchain-ai/langchain-postgres/issues/36)
 ## Twitter handle:
pprados

## Tests
- libs/community/tests/unit_tests/chat_message_histories/test_sql.py
(add async test)

@baskaryan, @eyurtsev or @hwchase17 can you check this PR ?
And, I've been waiting a long time for validation from other PRs. Can
you take a look?
- [PR 32](https://github.com/langchain-ai/langchain-postgres/pull/32)
- [PR 15575](https://github.com/langchain-ai/langchain/pull/15575)
- [PR 13200](https://github.com/langchain-ai/langchain/pull/13200)

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-06-05 15:10:38 +00:00
Vincent Min
59bef31997 community[minor]: Improve InMemoryVectorStore with ability to persist to disk and filter on metadata. (#22186)
- **Description:** The InMemoryVectorStore is a nice and simple vector
store implementation for quick development and debugging. The current
implementation is quite limited in its functionalities. This PR extends
the functionalities by adding utility function to persist the vector
store to a json file and to load it from a json file. We choose the json
file format because it allows inspection of the database contents in a
text editor, which is great for debugging. Furthermore, it adds a
`filter` keyword that can be used to filter out documents on their
`page_content` or `metadata`.
- **Issue:** -
- **Dependencies:** -
- **Twitter handle:** @Vincent_Min
2024-06-05 10:40:34 -04:00
Christophe Bornet
c34ad8c163 core[patch]: Improve VectorStore API doc (#22547) 2024-06-05 10:23:44 -04:00
maang-h
89128b7a49 community[patch]: add detailed paragraph and example for BaichuanTextEmbeddings (#22031)
- **Description:** add detailed paragraph and example for
BaichuanTextEmbeddings
   - **Issue:** the issue #21983
2024-06-05 10:18:11 -04:00
Anthony Bernabeu
4e676a63b8 community[minor]: Added filter search for LanceDB (#22461)
- [ ] **community**: "vectorstore: added filtering support for LanceDB
vector store"

- [ ] **This PR adds filtering capabilities to LanceDB**:
- **Description:** In LanceDB filtering can be applied when searching
for data into the vectorstore. It is using the SQL language as mentioned
in the LanceDB documentation.
    - **Issue:** #18235 
    - **Dependencies:** No

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-06-05 09:33:54 -04:00
Erick Friis
4050d6ea2b huggingface: remove text-generation dep (#22543) 2024-06-05 12:13:40 +00:00
Erick Friis
a6fc74f379 ai21: fix core version (#22544) 2024-06-05 08:09:19 -04:00
Asaf Joseph Gardin
75cba742e5 ai21: fix ai21 unittests (#22526)
Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-05 08:00:42 -04:00
Erick Friis
58192d617f community: fix huggingface deprecations (#22522) 2024-06-05 04:13:13 +00:00
Jacob Lee
1e748a6d40 docs[patch]: Adds links to deprecations page (#22514)
@baskaryan
2024-06-04 16:19:32 -07:00
William FH
91fed3ace7 [Docs] Structured output Keywords (#22511) 2024-06-04 20:56:05 +00:00
Christophe Bornet
8ba868d3b0 core[patch]: Add similarity_score_threshold to VectorStore search types (#22477) 2024-06-04 13:43:55 -07:00
Eugene Yurtsev
9120cf5df2 core[patch]: Deduplicate of callback handlers in merge_configs (#22478)
This PR adds deduplication of callback handlers in merge_configs.

Fix for this issue:
https://github.com/langchain-ai/langchain/issues/22227

The issue appears when the code is:

1) running python >=3.11
2) invokes a runnable from within a runnable
3) binds the callbacks to the child runnable from the parent runnable
using with_config

In this case, the same callbacks end up appearing twice: (1) the first
time from with_config, (2) the second time with langchain automatically
propagating them on behalf of the user.


Prior to this PR this will emit duplicate events:

```python
@tool
async def get_items(question: str, callbacks: Callbacks):  # <--- Accept callbacks
    """Ask question"""
    template = ChatPromptTemplate.from_messages(
        [
            (
                "human",
                "'{question}"
            )
        ]
    )
    chain = template | chat_model.with_config(
        {
            "callbacks": callbacks,  # <-- Propagate callbacks
        }
    )
    return await chain.ainvoke({"question": question})
```

Prior to this PR this will work work correctly (no duplicate events):

```python
@tool
async def get_items(question: str, callbacks: Callbacks):  # <--- Accept callbacks
    """Ask question"""
    template = ChatPromptTemplate.from_messages(
        [
            (
                "human",
                "'{question}"
            )
        ]
    )
    chain = template | chat_model
    return await chain.ainvoke({"question": question}, {"callbacks": callbacks})
```

This will also work (as long as the user is using python >= 3.11) -- as
langchain will automatically propagate callbacks

```python
@tool
async def get_items(question: str,):  
    """Ask question"""
    template = ChatPromptTemplate.from_messages(
        [
            (
                "human",
                "'{question}"
            )
        ]
    )
    chain = template | chat_model
    return await chain.ainvoke({"question": question})
```
2024-06-04 16:19:00 -04:00
Jacob Lee
64dbc52cae docs[patch]: Update quickstart tutorial (#22504)
Mentions LCEL more, hopefully flags it to more people as a simple
entrypoint

@baskaryan @hwchase17
2024-06-04 13:04:56 -07:00
Ofer Mendelevitch
ad502e8d50 community[minor]: Vectara Integration Update - Streaming, FCS, Chat, updates to documentation and example notebooks (#21334)
Thank you for contributing to LangChain!

**Description:** update to the Vectara / Langchain integration to
integrate new Vectara capabilities:
- Full RAG implemented as a Runnable with as_rag()
- Vectara chat supported with as_chat()
- Both support streaming response
- Updated documentation and example notebook to reflect all the changes
- Updated Vectara templates

**Twitter handle:** ofermend

**Add tests and docs**: no new tests or docs, but updated both existing
tests and existing docs
2024-06-04 12:57:28 -07:00
Bagatur
cb183a9bf1 docs: update anthropic chat model (#22483)
Related to #22296

And update anthropic to accept base_url
2024-06-04 12:42:06 -07:00
Erick Friis
d700ce8545 robocorp: typo (#22509) 2024-06-04 15:33:38 -04:00
Erick Friis
39fd44579a robocorp: release 0.0.9.post1 (#22507) 2024-06-04 15:32:30 -04:00
Erick Friis
339e3b7f55 ai21: release 0.1.6 (#22508) 2024-06-04 15:31:23 -04:00
ccurme
3c53cea760 together, upstage: bump minimum langchain-openai version (#22505) 2024-06-04 15:20:41 -04:00
Erick Friis
c438b5b78e docs: fix api ref link generation (#22438)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-04 12:09:22 -07:00
Bagatur
efcb04f84b mongodb[patch]: Release 0.1.6 (#22501) 2024-06-04 12:01:37 -07:00
Bagatur
222b1ba112 groq[patch]: Release 0.1.5 (#22500) 2024-06-04 12:01:17 -07:00
Bagatur
f021be510e milvus[patch]: Release 0.1.1 (#22499) 2024-06-04 12:00:53 -07:00
Bagatur
64d68c17cd upstage[patch]: Release 0.1.6 (#22498) 2024-06-04 11:58:44 -07:00
Bagatur
48fba40fce experimental[patch]: Release 0.0.60 (#22497) 2024-06-04 11:56:42 -07:00
Bagatur
e60f88ccdd community[patch]: Release 0.2.2 (#22496) 2024-06-04 11:42:11 -07:00
Bagatur
85aa218564 langchain[patch]: Release 0.2.2 (#22495) 2024-06-04 11:33:45 -07:00
Bagatur
8e86080def mistralai[patch]: Release 0.1.8 (#22494) 2024-06-04 11:33:06 -07:00
Bagatur
e850de2422 huggingface[patch]: release 0.0.2 (#22493) 2024-06-04 11:32:36 -07:00
Jacob Lee
593de8a913 docs[patch]: Add robots.txt and root sitemap (#22492)
CC @efriis @baskaryan
2024-06-04 11:26:40 -07:00
Bagatur
99a3cad258 text-splitters[patch]: Release 0.2.1 (#22490) 2024-06-04 11:19:21 -07:00
Bagatur
161b02a8be core[patch]: Release 0.2.4 (#22489) 2024-06-04 11:14:54 -07:00
Ragul Kachiappan
50258a7dda docs: Update chroma docs link for collection reference (#22472)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: 
- **Description:** Updated dead link referencing chroma docs in Chroma
notebook under vectorstores
2024-06-04 18:01:13 +00:00
nareshnagpal06
9b45374118 docs: Added Semantic Cache Example with BedrockChat using Bedrock Embedding… (#22190)
…s and Opensearch Semantic Cache

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-04 17:40:29 +00:00
Joydeep Banik Roy
3796672c67 community, milvus, pinecone, qdrant, mongo: Broadcast operation failure while using simsimd beyond v3.7.7 (#22271)
- [ ] **Packages affected**: 
  - community: fix `cosine_similarity` to support simsimd beyond 3.7.7
- partners/milvus: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/mongodb: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/pinecone: fix `cosine_similarity` to support simsimd beyond
3.7.7
- partners/qdrant: fix `cosine_similarity` to support simsimd beyond
3.7.7


- [ ] **Broadcast operation failure while using simsimd beyond v3.7.7**:
- **Description:** I was using simsimd 4.3.1 and the unsupported operand
type issue popped up. When I checked out the repo and ran the tests,
they failed as well (have attached a screenshot for that). Looks like it
is a variant of https://github.com/langchain-ai/langchain/issues/18022 .
Prior to 3.7.7, simd.cdist returned an ndarray but now it returns
simsimd.DistancesTensor which is ineligible for a broadcast operation
with numpy. With this change, it also remove the need to explicitly cast
`Z` to numpy array
    - **Issue:** #19905
    - **Dependencies:** No
    - **Twitter handle:** https://x.com/GetzJoydeep

<img width="1622" alt="Screenshot 2024-05-29 at 2 50 00 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/fb27b383-a9ae-4a6f-b355-6d503b72db56">

- [ ] **Considerations**: 
1. I started with community but since similar changes were there in
Milvus, MongoDB, Pinecone, and QDrant so I modified their files as well.
If touching multiple packages in one PR is not the norm, then I can
remove them from this PR and raise separate ones
2. I have run and verified that the tests work. Since, only MongoDB had
tests, I ran theirs and verified it works as well. Screenshots attached
:
<img width="1573" alt="Screenshot 2024-05-29 at 2 52 13 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/ce87d1ea-19b6-4900-9384-61fbc1a30de9">
<img width="1614" alt="Screenshot 2024-05-29 at 3 33 51 PM"
src="https://github.com/langchain-ai/langchain/assets/31132555/6ce1d679-db4c-4291-8453-01028ab2dca5">
  

I have added a test for simsimd. I feel it may not go well with the
CI/CD setup as installing simsimd is not a dependency requirement. I
have just imported simsimd to ensure simsimd cosine similarity is
invoked. However, its not a good approach. Suggestions are welcome and I
can make the required changes on the PR. Please provide guidance on the
same as I am new to the community.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-04 17:36:31 +00:00
KyrianC
03178ee74f community[minor]: Add tools calls to ChatEdenAI (#22320)
### Description  
Add tools implementation to `ChatEdenAI`:
- `bind_tools()`
- `with_structured_output()`

### Documentation 
Updated `docs/docs/integrations/chat/edenai.ipynb`

### Notes
We don´t support stream with tools as of yet. If stream is called with
tools we directly yield the whole message from `generate` (implemented
the same way as Anthropic did).
2024-06-04 10:29:28 -07:00
pranavvuppala
9d4350e69a docs : Update docstrings for OpenAI base.py (#22221)
- [x] **PR title**: Update docstrings for OpenAI base.py
-**Description:** Updated the docstring of few OpenAI functions for a
better understanding of the function.
    - **Issue:** #21983

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-06-04 17:24:17 +00:00
Anindyadeep
7a197539aa communty[patch]: Native RAG Support in Prem AI langchain (#22238)
This PR adds native RAG support in langchain premai package. The same
has been added in the docs too.
2024-06-04 10:19:54 -07:00
Rahul Triptahi
77ad857934 community[minor]: Enable retrieval api calls in PebbloRetrievalQA (#21958)
Description: Enable app discovery and Prompt/Response apis in
PebbloSafeRetrieval
Documentation: NA
Unit test: N/A

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-06-04 10:18:50 -07:00
liugz18
8fd231086e experimental[patch]: Fix graph_transformers llms #21482 (#22417)
Fix AttributeError on calling
LLMGraphTransformer.convert_to_graph_documents #21482

 since raw_schema is always a str

@baskaryan
2024-06-04 17:07:38 +00:00
ccurme
6db25b4e31 core[patch]: bump langsmith (#22476)
Noticing errors logged in some situations when tracing with Langsmith:
```python
from langchain_core.pydantic_v1 import BaseModel
from langchain_anthropic import ChatAnthropic


class AnswerWithJustification(BaseModel):
    """An answer to the user question along with justification for the answer."""
    answer: str
    justification: str


llm = ChatAnthropic(model="claude-3-haiku-20240307")
structured_llm = llm.with_structured_output(AnswerWithJustification)

list(structured_llm.stream("What weighs more a pound of bricks or a pound of feathers"))
```
```
Error in LangChainTracer.on_chain_end callback: AttributeError("'NoneType' object has no attribute 'append'")
[AnswerWithJustification(answer='A pound of bricks and a pound of feathers weigh the same amount.', justification='This is because a pound is a unit of mass, not volume. By definition, a pound of any material, whether bricks or feathers, will weigh the same - one pound. The physical size or volume of the materials does not matter when measuring by mass. So a pound of bricks and a pound of feathers both weigh exactly one pound.')]
```
2024-06-04 10:05:53 -07:00
Bagatur
17c127531a community[patch]: deprecate all HF classes (#22444) 2024-06-04 09:48:25 -07:00
Nuno Campos
58b118544e Use immutable sequence type for batch/batch_as_completed types (#22433)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-04 08:04:09 -07:00
Christophe Bornet
9a8fe58ebe community[minor]: Improve Cassandra VectorStore as_retriever (#22465)
The Vectorstore's API `as_retriever` doesn't expose explicitly the
parameters `search_type` and `search_kwargs` and so these are not well
documented.
This PR improves `as_retriever` for the Cassandra VectorStore by making
these parameters explicit.

NB: An alternative would have been to modify `as_retriever` in
`Vectorstore`. But there's probably a good reason these were not exposed
in the first place ? Is it because implementations may decide to not
support them and have fixed values when creating the
VectorStoreRetriever ?
2024-06-04 09:51:17 -04:00
Christophe Bornet
23bba18f92 core[patch]: Fix VectorStore's as_retriever mutating tags param (#22470)
The current VectorStore `as_retriever` implementation mutates the `tags`
param when it's passed in kwargs.
This fix ensures that a copy is done.
2024-06-04 09:50:36 -04:00
Michal Gregor
98b2e7b195 huggingface[patch]: Support for HuggingFacePipeline in ChatHuggingFace. (#22194)
- **Description:** Added support for using HuggingFacePipeline in
ChatHuggingFace (previously it was only usable with API endpoints,
probably by oversight).
- **Issue:** #19997 
- **Dependencies:** none
- **Twitter handle:** none

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-04 00:47:35 +00:00
Fahreddin Özcan
0061ded002 community[patch]: Upstash Vector Store Namespace Support (#22251)
This PR introduces namespace support for Upstash Vector Store, which
would allow users to partition their data in the vector index.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-03 17:30:56 -07:00
Isaac Francisco
25cf1a74d5 docs: rag tutorial small fixes (#22450) 2024-06-04 00:16:54 +00:00
Jacob Lee
b0f014666d docs[patch]: Adds search keywords for common queries (#22449)
CC @baskaryan @efriis @ccurme
2024-06-03 16:30:17 -07:00
Guangdong Liu
bc7e32f315 core(patch):fix partial_variables not working with SystemMessagePromptTemplate (#20711)
- **Issue:**  close #17560
- @baskaryan, @eyurtsev
2024-06-03 16:22:42 -07:00
Martin Kolb
f2dd31b9e8 docs: Fix doc issue for HANA Cloud Vector Engine (#22260)
- **Description:**
This PR fixes a rendering issue in the docs (Python notebook) of HANA
Cloud Vector Engine.

  - **Issue:** N/A
  - **Dependencies:** no new dependencies added

File of the fixed notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`
2024-06-03 15:53:43 -07:00
Dristy Srivastava
ef3df45d9d community[minor]: Updating payload for pebblo discover API (#22309)
**Description:** Updating response for pebblo discover API. Also
updating filed name case type
**Documentation:** N/A
**Unit tests:** N/A
2024-06-03 15:36:17 -07:00
Miroslav
cbd5720011 huggingface[patch]: Skip Login to HuggingFaceHub when token is not set (#22365) 2024-06-03 15:20:32 -07:00
Stefano Lottini
f78ae1d932 docs: Astra DB vectorstore, add automatic-embedding example (#22350)
Description: Adding an example showcasing the newly-introduced API-side
embedding computation option for the Astra DB vector store
2024-06-03 15:13:57 -07:00
bhardwaj-vipul
f397a84a59 langchain[patch]: Fix MongoDBAtlasVectorSearch reference in self query retriever (#22401)
**Description:** 
SelfQuery Retriever with MongoDBAtlasVectorSearch (from
langchain_mongodb import MongoDBAtlasVectorSearch) and
Chroma (from langchain_chroma import Chroma) is not supported.
The imports in the [builtin
translators](8cbce684d4/libs/langchain/langchain/retrievers/self_query/base.py (L73))
points to the
[deprecated](acaf214a45/libs/community/langchain_community/vectorstores/mongodb_atlas.py (L36))
vectorstore.

**Issue:** 
#22272

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-03 22:10:15 +00:00
ccurme
afe89a1411 community: add standard chat model params to Ollama (#22446) 2024-06-03 17:45:03 -04:00
Isaac Francisco
5119ab2fb9 docs: agents tutorial wording (#22447) 2024-06-03 14:40:01 -07:00
Ethan Yang
52da6a160d community[patch]: Update OpenVINO embedding and reranker to support static input shape (#22171)
It can help to deploy embedding models on NPU device
2024-06-03 13:27:17 -07:00
Tom Clelford
c599732e1a text-splitters[patch]: fix HTMLSectionSplitter parsing of xslt paths (#22176)
## Description
This PR allows passing the HTMLSectionSplitter paths to xslt files. It
does so by fixing two trivial bugs with how passed paths were being
handled. It also changes the default value of the param `xslt_path` to
`None` so the special case where the file was part of the langchain
package could be handled.

## Issue
#22175
2024-06-03 20:26:59 +00:00
maang-h
01352bb55f community[minor]: Implement MiniMaxChat interface (#22391)
- **Description:** Implement MiniMaxChat interface, include:
    - No longer inherits the LLM class (like other chat model)
    - Update request parameters (v1 -> v2)
        - update `base url`
        - update message role (system, user, assistant)
        - add `stream` function
        - no longer use `group id`
    - Implement the `_stream`, `_agenerate`, and `_astream` interfaces

[minimax v2 api
document](https://platform.minimaxi.com/document/guides/chat-model/V2?id=65e0736ab2845de20908e2dd)
2024-06-03 13:22:38 -07:00
Brandon Sharp
56e5aa4dd9 community[patch]: Airtable to allow for addtl params (#22092)
- [X] **PR title**: "community: added optional params to Airtable
table.all()"


- [X] **PR message**: 
- **Description:** Add's **kwargs to AirtableLoader to allow for kwargs:
https://pyairtable.readthedocs.io/en/latest/api.html#pyairtable.Table.all
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** parakoopa88


- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-06-03 13:05:56 -07:00
Harichandan Roy
1f751343e2 community[patch]: update embeddings/oracleai.py (#22240)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"

"community/embeddings: update oracleai.py"

- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

Adding oracle VECTOR_ARRAY_T support.

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Tests are not impacted.

- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Done.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-03 12:38:51 -07:00
maang-h
13140dc4ff community[patch]: Update the default api_url and reqeust_body of sparkllm embedding (#22136)
- **Description:** When I was running the SparkLLMTextEmbeddings,
app_id, api_key and api_secret are all correct, but it cannot run
normally using the current URL.

    ```python
    # example
    from langchain_community.embeddings import SparkLLMTextEmbeddings

    embedding= SparkLLMTextEmbeddings(
        spark_app_id="my-app-id",
        spark_api_key="my-api-key",
        spark_api_secret="my-api-secret"
    )
    embedding= "hello"
    print(spark.embed_query(text1))
    ```

![sparkembedding](https://github.com/langchain-ai/langchain/assets/55082429/11daa853-4f67-45b2-aae2-c95caa14e38c)
   
So I updated the url and request body parameters according to
[Embedding_api](https://www.xfyun.cn/doc/spark/Embedding_api.html), now
it is runnable.
2024-06-03 12:38:11 -07:00
Yuwen Hu
ba0dca46d7 community[minor]: Add IPEX-LLM BGE embedding support on both Intel CPU and GPU (#22226)
**Description:** [IPEX-LLM](https://github.com/intel-analytics/ipex-llm)
is a PyTorch library for running LLM on Intel CPU and GPU (e.g., local
PC with iGPU, discrete GPU such as Arc, Flex and Max) with very low
latency. This PR adds ipex-llm integrations to langchain for BGE
embedding support on both Intel CPU and GPU.
**Dependencies:** `ipex-llm`, `sentence-transformers`
**Contribution maintainer**: @Oscilloscope98 
**tests and docs**: 
- langchain/docs/docs/integrations/text_embedding/ipex_llm.ipynb
- langchain/docs/docs/integrations/text_embedding/ipex_llm_gpu.ipynb
-
langchain/libs/community/tests/integration_tests/embeddings/test_ipex_llm.py

---------

Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>
2024-06-03 12:37:10 -07:00
Jacob Lee
c01467b1f4 core[patch]: RFC: Allow concatenation of messages with multi part content (#22002)
Anthropic's streaming treats tool calls as different content parts
(streamed back with a different index) from normal content in the
`content`.

This means that we need to update our chunk-merging logic to handle
chunks with multi-part content. The alternative is coerceing Anthropic's
responses into a string, but we generally like to preserve model
provider responses faithfully when we can. This will also likely be
useful for multimodal outputs in the future.

This current PR does unfortunately make `index` a magic field within
content parts, but Anthropic and OpenAI both use it at the moment to
determine order anyway. To avoid cases where we have content arrays with
holes and to simplify the logic, I've also restricted merging to chunks
in order.

TODO: tests

CC @baskaryan @ccurme @efriis
2024-06-03 09:46:40 -07:00
Dan
86509161b0 community: fix AzureSearch delete documents (#22315)
**Description**

Fix AzureSearch delete documents method by using FIELDS_ID variable
instead of the hard coded "id" value

**Issue:** 

This is linked to this issue:
https://github.com/langchain-ai/langchain/issues/22314

Co-authored-by: dseban <dan.seban@neoxia.com>
2024-06-03 15:55:06 +00:00
Harrison Chase
8fad2e209a fix error message (#22437)
Was confusing when language is in Enum but not implemented
2024-06-03 15:48:26 +00:00
Bagatur
678a19a5f7 infra: bump anthropic mypy 1 (#22373) 2024-06-03 08:21:55 -07:00
Nuno Campos
ceb73ad06f core: In BaseRetriever make get_relevant_docs delegate to invoke (#22434)
- This fixes all the tracing issues with people still using
get_relevant_docs, and a change we need for 0.3 anyway

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-06-03 07:34:53 -07:00
Zheng Robert Jia
1ad1dc5303 docs: resolve minor syntax error. (#22375)
Used the correct magic command. 
Changed from `% pip...` to `%pip`

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-03 14:34:24 +00:00
Charles John
2d81a72884 community: fix missing apify_api_token field in ApifyWrapper (#22421)
- **Description:** The `ApifyWrapper` class expects `apify_api_token` to
be passed as a named parameter or set as an environment variable. But
the corresponding field was missing in the class definition causing the
argument to be ignored when passed as a named param. This patch fixes
that.
2024-06-03 14:32:57 +00:00
Klaudia Lemiec
dac355fc62 docs: notebook loader: change .html to .ipynb (#22407)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-06-03 14:26:28 +00:00
Joan Fontanals
a7ae16f912 add embed_image API to JinaEmbedding (#22416)
- **Description:** Add `embed_image` to JinaEmbedding to embed images
 - **Twitter handle:** https://x.com/JinaAI_
2024-06-03 10:23:37 -04:00
Qingchuan Hao
3e92ed8056 docs: add Microsoft Azure to ChatModelTabs (#22367)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-06-03 10:19:00 -04:00
Nuno Campos
ed8e9c437a core: In RunnableSequence pass kwargs to the first step (#22393)
- This is a pattern that shows up occasionally in langgraph questions,
people chain a graph to something else after, and want to pass the graph
some kwargs (eg. stream_mode)
2024-06-03 14:18:10 +00:00
Jeffrey Morgan
eabcfaa3d6 Update Ollama instructions (#22394) 2024-06-03 10:17:35 -04:00
Harrison Chase
acaf214a45 update agent docs (#22370)
to use create_react_agent

---------

Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
2024-06-01 08:28:32 -07:00
Jacob Lee
16cce76a68 👥 Update LangChain people data (#22388)
👥 Update LangChain people data

Co-authored-by: github-actions <github-actions@github.com>
2024-06-01 07:36:45 -07:00
Jacob Lee
8a57102918 docs[patch]: Fix typo (#22377) 2024-05-31 16:37:05 -07:00
Bagatur
4d82cea71f docs: fix llm caches redirect (#22371) 2024-05-31 19:37:06 +00:00
Bagatur
a8098f5ddb anthropic[patch]: Release 0.1.15, fix sdk tools break (#22369) 2024-05-31 12:10:22 -07:00
Erick Friis
6ffa0acf32 ai21: fix text-splitters version (#22366) 2024-05-31 11:41:05 -04:00
Erick Friis
1bad0ac946 docs: redirect integration links to 0.2 (#22326) 2024-05-31 11:40:48 -04:00
ccurme
8cbce684d4 docs: update retriever how-to content (#22362)
- [x] How to: use a vector store to retrieve data
- [ ] How to: generate multiple queries to retrieve data for
- [x] How to: use contextual compression to compress the data retrieved
- [x] How to: write a custom retriever class
- [x] How to: add similarity scores to retriever results
^ done last month
- [x] How to: combine the results from multiple retrievers
- [x] How to: reorder retrieved results to mitigate the "lost in the
middle" effect
- [x] How to: generate multiple embeddings per document
^ this PR
- [ ] How to: retrieve the whole document for a chunk
- [ ] How to: generate metadata filters
- [ ] How to: create a time-weighted retriever
- [ ] How to: use hybrid vector and keyword retrieval
^ todo
2024-05-31 10:57:35 -04:00
Jacob Lee
75ed9ee929 docs: Fix Solar and OCI integration page typos (#22343)
@efriis @baskaryan
2024-05-31 10:36:12 -04:00
Bagatur
0214246dc6 docs: list tool calling models (#22334) 2024-05-30 14:32:33 -07:00
Bagatur
410e9add44 infra: run scheduled tests on aws, google, cohere, nvidia (#22328)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-30 13:57:12 -07:00
Harrison Chase
0c9a034ed7 add simpler agent tutorial (#22249)
1/ added section at start with full code
2/ removed retriever tool (was just distracting)
3/ added section on starting a new conversation

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-30 12:33:32 -07:00
Bagatur
2b9f1469d8 core[patch]: Release 0.2.3 (#22329) 2024-05-30 11:35:09 -07:00
Harrison Chase
ee32369265 core[patch]: fix runnable history and add docs (#22283) 2024-05-30 11:26:41 -07:00
William FH
dcec133b85 [Core] Update Tracing Interops (#22318)
LangSmith and LangChain context var handling evolved in parallel since
originally we didn't expect people to want to interweave the decorator
and langchain code.

Once we get a new langsmith release, this PR will let you seemlessly
hand off between @traceable context and runnable config context so you
can arbitrarily nest code.

It's expected that this fails right now until we get another release of
the SDK
2024-05-30 10:34:49 -07:00
ccurme
f34337447f openai: update ChatOpenAI api ref (#22324)
Update to reflect that token usage is no longer default in streaming
mode.

Add detail for streaming context under Token Usage section.
2024-05-30 12:31:28 -04:00
ChengZi
2443e85533 docs: fix milvus import and update template (#22306)
docs: fix milvus import problem
update milvus-rag template with milvus-lite

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
2024-05-30 08:28:55 -07:00
WU LIFU
86698b02a9 doc: fix wrong documentation on FAISS load_local function (#22310)
### Issue: #22299 

### descriptions
The documentation appears to be wrong. When the user actually sets this
parameter "asynchronous" to be True, it fails because the __init__
function of FAISS class doesn't allow this parameter. In fact, most of
the class/instance functions of this class have both the sync/async
version, so it looks like what we need is just to remove this parameter
from the doc.

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

Co-authored-by: Lifu Wu <lifu@nextbillion.ai>
2024-05-30 15:15:04 +00:00
maang-h
596c062cba community[patch]: Standardize qianfan model init args name (#22322)
- **Description:**  
    - Standardize qianfan chat model intialization arguments name
        - qianfan_ak (qianfan api key)  -> api_key
        - qianfan_sk (qianfan secret key)  ->  secret_key
       
    - Delete unuse variable
- **Issue:** #20085
2024-05-30 11:08:32 -04:00
KhoPhi
c64b0a3095 Docs: Ollama (LLM, Chat Model & Text Embedding) (#22321)
- [x] Docs Update: Ollama
  - llm/ollama 
- Switched to using llama3 as model with reference to templating and
prompting
      - Added concurrency notes to llm/ollama docs
  - chat_models/ollama
      - Added concurrency notes to llm/ollama docs
  - text_embedding/ollama
     - include example for specific embedding models from Ollama
2024-05-30 11:06:45 -04:00
Dobiichi-Origami
10b12e1c08 community: adding tool_call_id for every ToolCall (#22323)
- **Description:** This PR contains a bugfix which result in malfunction
of multi-turn conversation in QianfanChatEndpoint and adaption for
ToolCall and ToolMessage
2024-05-30 10:59:08 -04:00
Bagatur
569d325a59 docs: link GH org (#22308) 2024-05-30 00:17:59 -07:00
Bagatur
93049d1563 docs: make llm cache its own section (#22301) 2024-05-30 00:17:33 -07:00
Bagatur
04631439c9 docs: add v0.2 links to README (#22300) 2024-05-29 16:22:01 -07:00
ccurme
f39e1a2288 community, docs: update token usage tracking callback + how-to guides (#22145) 2024-05-29 17:00:47 -04:00
Bagatur
2bc50fb895 docs, cli[patch]: chat model template nit (#22294) 2024-05-29 20:53:58 +00:00
Bagatur
aa6c31df53 cli[patch]: Release 0.0.24 (#22293) 2024-05-29 13:37:34 -07:00
Bagatur
627a337887 docs, cli[patch]: chat model doc template (#22290)
Update ChatModel integration doc template, integration docstring, and
adds langchain-cli command to easily create just doc (for updating
existing integrations):

```bash
langchain-cli integration create-doc --name "foo-bar"
```
2024-05-29 13:34:58 -07:00
Wu Enze
f40e341a03 docs : Added integrations for memory with langchain_community (#22265)
PR title: Integration Docs enhancement

Description: Adding installation instructions for integrations requiring
langchain-community package since 0.2
Issue: [#22005](https://github.com/langchain-ai/langchain/issues/22005)
2024-05-29 16:12:05 -04:00
ccurme
6e1df72a88 openai[patch]: Release 0.1.8 (#22291) 2024-05-29 20:08:30 +00:00
ccurme
e71b0b5827 core[patch]: Release 0.2.2 (#22289) 2024-05-29 19:51:37 +00:00
William FH
9d6cabe84a Update sequence.ipynb (#22288) 2024-05-29 19:34:44 +00:00
Daniel Glogowski
7ff05357ba docs: updating NIM documentation (#22258)
Updating NVIDIA NIM notebooks and readme file.

Thanks!
Daniel
2024-05-29 10:28:39 -07:00
Bagatur
6dd0f095c3 docs: revamp ChatOpenAI (#22253)
Can build API ref docs by running
```bash
make api_docs_clean; make api_docs_quick_preview API_PKG=openai
```
only builds openai ref, takes ~20 sec
2024-05-29 10:20:14 -07:00
Erick Friis
00c70d98c2 robocorp: release 0.0.9 (#22282) 2024-05-29 16:49:18 +00:00
Mikko Korpela
fc5909ad6f langchain-robocorp: Fix parsing of Union types (such as Optional). (#22277) 2024-05-29 09:47:02 -07:00
ccurme
af1f723ada openai: don't override stream_options default (#22242)
ChatOpenAI supports a kwarg `stream_options` which can take values
`{"include_usage": True}` and `{"include_usage": False}`.

Setting include_usage to True adds a message chunk to the end of the
stream with usage_metadata populated. In this case the final chunk no
longer includes `"finish_reason"` in the `response_metadata`. This is
the current default and is not yet released. Because this could be
disruptive to workflows, here we remove this default. The default will
now be consistent with OpenAI's API (see parameter
[here](https://platform.openai.com/docs/api-reference/chat/create#chat-create-stream_options)).

Examples:
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI()

for chunk in llm.stream("hi"):
    print(chunk)
```
```
content='' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='Hello' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='!' id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
content='' response_metadata={'finish_reason': 'stop'} id='run-8cff4721-2acd-4551-9bf7-1911dae46b92'
```

```python
for chunk in llm.stream("hi", stream_options={"include_usage": True}):
    print(chunk)
```
```
content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='Hello' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='!' id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='' response_metadata={'finish_reason': 'stop'} id='run-39ab349b-f954-464d-af6e-72a0927daa27'
content='' id='run-39ab349b-f954-464d-af6e-72a0927daa27' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17}
```

```python
llm = ChatOpenAI().bind(stream_options={"include_usage": True})

for chunk in llm.stream("hi"):
    print(chunk)
```
```
content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='Hello' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='!' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='' response_metadata={'finish_reason': 'stop'} id='run-59918845-04b2-41a6-8d90-f75fb4506e0d'
content='' id='run-59918845-04b2-41a6-8d90-f75fb4506e0d' usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17}
```
2024-05-29 10:30:40 -04:00
Karim Lalani
a1899439fc [experimental][llms][ollama_functions] Update OllamaFunctions to send tool_calls attribute (#21625)
Update OllamaFunctions to return `tool_calls` for AIMessages when used
for tool calling.
2024-05-29 09:38:33 -04:00
Bagatur
d61bdeba25 core[patch]: allow access RunnableWithFallbacks.runnable attrs (#22139)
RFC, candidate fix for #13095 #22134
2024-05-28 13:18:09 -07:00
SteveLiao
7496fe2b16 Update parent_document_retriever.py about **kwargs (#22219)
Add kwargs in add_documents function

**langchain**: Add **kwargs in parent_document_retriever"
 - **Add kwargs for `add_document` in `parent_document_retriever.py`** 


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-05-28 11:35:38 -07:00
Mark Cusack
8dfa3c5f1a Update/fix docs to list Yellowbrick as a supported indexed vectorstore (#22235)
Update/fix docs to list Yellowbrick as a supported indexed vectorstore
and fix the Jupyter notebook.
2024-05-28 11:34:49 -07:00
Erick Friis
93240fac68 milvus: fix core dep (#22239) 2024-05-28 10:21:37 -07:00
Erick Friis
611faa22c7 infra: allow first releases 2 (#22237) 2024-05-28 09:53:21 -07:00
Erick Friis
26c6e4a5ef infra: allow first releases (#22236) 2024-05-28 09:39:40 -07:00
ChengZi
404d92ded0 milvus: New langchain_milvus package and new milvus features (#21077)
New features:

- New langchain_milvus package in partner
- Milvus collection hybrid search retriever
- Zilliz cloud pipeline retriever
- Milvus Local guid
- Rag-milvus template

---------

Signed-off-by: ChengZi <chen.zhang@zilliz.com>
Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Jael Gu <mengjia.gu@zilliz.com>
Co-authored-by: Jackson <jacksonxie612@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-05-28 08:24:20 -07:00
Leonid Ganeline
d7f70535ba docs: arxiv page, added cookbooks (#22215)
Issue: The `arXiv` page is missing the arxiv paper references from the
`langchain/cookbook`.
PR: Added the cookbook references.
Result: `Found 29 arXiv references in the 3 docs, 21 API Refs, 5
Templates, and 18 Cookbooks.` - much more references are visible now.
2024-05-27 15:47:02 -07:00
Leonid Ganeline
d6995e814b ai21[patch]: added license (#22153)
The `pyproject.toml` missed the `license` parameter. I've added it as
`MIT`
2024-05-27 15:14:14 -07:00
Maddy Adams
8332a36f69 infra: update langchainhub and add integration test (#22154)
**Description:** Update langchainhub integration test dependency and add
an integration test for pulling private prompt
**Dependencies:** langchainhub 0.1.16
2024-05-27 14:58:10 -07:00
Will Higgins
83d10df78d community[patch]: Update firecrawl api key name (#22183)
Change 'FIREWALL' to 'FIRECRAWL' as I believe this may have been in
error. Other docs refer to 'FIRECRAWL_API_KEY'.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-27 21:39:29 +00:00
hmasdev
bbd7015b5d core[patch]: Add TypeError handler into get_graph of Runnable (#19856)
# Description

## Problem

`Runnable.get_graph` fails when `InputType` or `OutputType` property
raises `TypeError`.

-
003c98e5b4/libs/core/langchain_core/runnables/base.py (L250-L274)
-
003c98e5b4/libs/core/langchain_core/runnables/base.py (L394-L396)

This problem prevents getting a graph of `Runnable` objects whose
`InputType` or `OutputType` property raises `TypeError` but whose
`invoke` works well, such as `langchain.output_parsers.RegexParser`,
which I have already pointed out in #19792 that a `TypeError` would
occur.

## Solution

- Add `try-except` syntax to handle `TypeError` to the codes which get
`input_node` and `output_node`.

# Issue
- #19801 

# Twitter Handle
- [hmdev3](https://twitter.com/hmdev3)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-27 21:34:34 +00:00
acho98
753353411f docs: Fix Clova embeddings example document (#22181)
- [ ] **PR title**: "Fix list handling in Clova embeddings example
documentation"
  - Description:
Fixes a bug in the Clova Embeddings example documentation where
document_text was incorrectly wrapped in an additional list.
   - Rationale
The embed_documents method expects a list, but the previous example
wrapped document_text in an unnecessary additional list, causing an
error. The updated example correctly passes document_text directly to
the method, ensuring it functions as intended.
2024-05-27 14:31:34 -07:00
Mohammad Mohtashim
577ed68b59 mistralai[patch]: Added Json Mode for ChatMistralAI (#22213)
- **Description:** Powered
[ChatMistralAI.with_structured_output](fbfed65fb1/libs/partners/mistralai/langchain_mistralai/chat_models.py (L609))
via json mode
 

-  **Issue:** #22081

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-27 21:16:52 +00:00
Pranith
25c270b5a5 docs : Added integrations for tools with langchain_community (#22188)
PR title: Docs enhancement

Description: Adding installation instructions for integrations requiring
langchain-community package since 0.2
Issue: https://github.com/langchain-ai/langchain/issues/22005
2024-05-27 14:06:40 -07:00
Ibrahim
cfea0e231a Update llm_chain.ipynb text (#22198)
Added the missing verb "is" and a comma to the text in the Prompt
Templates description within the Build a Simple LLM Application tutorial
for more clarity.
2024-05-27 19:57:41 +00:00
Aditya
bf81ecd3b4 docs:updated documentation for llama, falcon and gemma on Vertex AI Model garden (#22201)
- **Description:** updated documentation for llama, falcona and gemma on
Vertex AI Model garden
    - **Issue:** NA
    - **Dependencies:** NA
    - **Twitter handle:** NA

@lkuligin for review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
2024-05-27 12:56:11 -07:00
Pavlo Paliychuk
342df7cf83 community[minor]: Add Zep Cloud components + docs + examples (#21671)
Thank you for contributing to LangChain!

- [x] **PR title**: community: Add Zep Cloud components + docs +
examples

- [x] **PR message**: 
We have recently released our new zep-cloud sdks that are compatible
with Zep Cloud (not Zep Open Source). We have also maintained our Cloud
version of langchain components (ChatMessageHistory, VectorStore) as
part of our sdks. This PRs goal is to port these components to langchain
community repo, and close the gap with the existing Zep Open Source
components already present in community repo (added
ZepCloudMemory,ZepCloudVectorStore,ZepCloudRetriever).
Also added a ZepCloudChatMessageHistory components together with an
expression language example ported from our repo. We have left the
original open source components intact on purpose as to not introduce
any breaking changes.
    - **Issue:** -
- **Dependencies:** Added optional dependency of our new cloud sdk
`zep-cloud`
    - **Twitter handle:** @paulpaliychuk51


- [x] **Add tests and docs**


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-27 12:50:13 -07:00
Jan Soubusta
cccc8fbe2f community[patch]: DuckDB VS - expose similarity, improve performance of from_texts (#20971)
3 fixes of DuckDB vector store:
- unify defaults in constructor and from_texts (users no longer have to
specify `vector_key`).
- include search similarity into output metadata (fixes #20969)
- significantly improve performance of `from_documents`

Dependencies: added Pandas to speed up `from_documents`.
I was thinking about CSV and JSON options, but I expect trouble loading
JSON values this way and also CSV and JSON options require storing data
to disk.
Anyway, the poetry file for langchain-community already contains a
dependency on Pandas.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-05-24 15:17:52 -07:00
Surya Pratap Singh Shekhawat
42207f5bef Update agent_executor.ipynb (#22104)
fixed typos in the doc.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-05-24 22:14:41 +00:00
Erick Friis
8acadc34f5 docs: edit links, direct for notebooks (#22051) 2024-05-24 19:44:46 +00:00
Erick Friis
42ffcb2ff1 anthropic: release 0.1.14rc2, test release note gen (#22147) 2024-05-24 12:40:10 -07:00
Erick Friis
6ee8de62c0 infra: auto-generated release notes based on git log (#22141)
Generates release notes based on a `git log` command with title names

Aiming to improve to splitting out features vs. bugfixes using
conventional commits in the coming weeks.

Will work for any monorepo packages
2024-05-24 11:43:28 -07:00
Ameya Shenoy
8ba492ed6a community[minor]: clickhouse -- ability to use secure connection (#22108)
- **Description:** this PR gives clickhouse client the ability to use a
secure connection to the clickhosue server
- **Issue:** fixes #22082
- **Dependencies:** -
- **Twitter handle:** `_codingcoffee_`

Signed-off-by: Ameya Shenoy <shenoy.ameya@gmail.com>
Co-authored-by: Shresth Rana <shresth@grapevine.in>
2024-05-24 17:30:22 +00:00
ccurme
9a010fb761 openai: read stream_options (#21548)
OpenAI recently added a `stream_options` parameter to its chat
completions API (see [release
notes](https://platform.openai.com/docs/changelog/added-chat-completions-stream-usage)).
When this parameter is set to `{"usage": True}`, an extra "empty"
message is added to the end of a stream containing token usage. Here we
propagate token usage to `AIMessage.usage_metadata`.

We enable this feature by default. Streams would now include an extra
chunk at the end, **after** the chunk with
`response_metadata={'finish_reason': 'stop'}`.

New behavior:
```
[AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='Hello', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='!', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde'),
 AIMessageChunk(content='', id='run-4b20dbe0-3817-4f62-b89d-03ef76f25bde', usage_metadata={'input_tokens': 8, 'output_tokens': 9, 'total_tokens': 17})]
```

Old behavior (accessible by passing `stream_options={"include_usage":
False}` into (a)stream:
```
[AIMessageChunk(content='', id='run-1312b971-c5ea-4d92-9015-e6604535f339'),
 AIMessageChunk(content='Hello', id='run-1312b971-c5ea-4d92-9015-e6604535f339'),
 AIMessageChunk(content='!', id='run-1312b971-c5ea-4d92-9015-e6604535f339'),
 AIMessageChunk(content='', response_metadata={'finish_reason': 'stop'}, id='run-1312b971-c5ea-4d92-9015-e6604535f339')]
```

From what I can tell this is not yet implemented in Azure, so we enable
only for ChatOpenAI.
2024-05-24 13:20:56 -04:00
Patrick Zhang
eb7c767e5b docs: update the name of the tool passio_nutrition_ai (#22116)
Updating the name of the Passion Nutrition AI tool so that the name of
the tool is correctly displayed in the sidebar menu.

Currently the name of the tool says "Quickstart" in the side bar.
The patch fixed the name to be Passio Nutrition AI.

<img width="681" alt="image"
src="https://github.com/langchain-ai/langchain/assets/4603110/9609975e-78ea-4032-9024-10c4f838170a">
2024-05-24 17:15:16 +00:00
Leonid Ganeline
fd4ee08167 docs: integrations/platforms/microsoft update (#22100)
Added the `Azure Container Apps dynamic sessions` tool reference
2024-05-24 13:14:51 -04:00
Rahul Triptahi
1a485f59b9 community[patch]: Put authorized identities behind a feature flag in SharepointLoader (#22125)
Description: Put authorised identities behind a feature flag, load_auth.
Documentation: N/A
Unit tests: N/A

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-05-24 12:42:57 -04:00
Anindyadeep
ee689412ab docs: Update PremAI Docs (#22114)
Thank you for contributing to LangChain!

- [X] **PR title**: community: Updated langchain-community PremAI
documentation

- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-05-24 11:55:32 -04:00
sasha
1c9ceff503 community: add metadata to chain logging; (#22122)
Hey, I'm Sasha. The SDK engineer from [Comet](https://comet.com).
This PR updates the CometTracer class.
Added metadata to CometTracerr. From now on, both chains and spans will
send it.
2024-05-24 15:29:40 +00:00
Jirka Lhotka
7c0459faf2 community: Update costs of openai finetuned models (#22124)
- **Description:** Update costs of finetuned models and add
gpt-3-turbo-0125. Source: https://openai.com/api/pricing/
  - **Issue:** N/A
  - **Dependencies:** None
2024-05-24 15:25:17 +00:00
Eugene Yurtsev
d3db83abe3 community[major]: lint for usage of xml library (#22132)
* Lint for usage of standard xml library
* Add forced opt-in for quip client
* Actual security issue is with underlying QuipClient not LangChain
integration (since the client is doing the parsing), but adding
enforcement at the LangChain level.
2024-05-24 15:23:53 +00:00
Tom Aarsen
5b5ea2af30 docs: Add explanation on how to use Hugging Face embeddings (#22118)
- **Description:** I've added a tab on embedding text with LangChain
using Hugging Face models to here:
https://python.langchain.com/v0.2/docs/how_to/embed_text/. HF was
mentioned in the running text, but not in the tabs, which I thought was
odd.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** No need, this is tiny :) 

Also, I had a ton of issues with the poetry docs/lint install, so I
haven't linted this. Apologies for that.

cc @Jofthomas 

- Tom Aarsen
2024-05-24 11:21:03 -04:00
Bagatur
baa3c975cb anthropic[patch]: allow tool call mutation (#22130)
If tool_use blocks and tool_calls with overlapping IDs are present,
prefer the values of the tool_calls. Allows for mutating AIMessages just
via tool_calls.
2024-05-24 08:18:14 -07:00
Christophe Bornet
c838de5027 doc: Add doc for CassandraByteStore (#22126)
Preview:
https://langchain-git-fork-cbornet-doc-cassandrabytestore-langchain.vercel.app/v0.2/docs/integrations/stores/cassandra/
2024-05-24 10:57:55 -04:00
Vadym Barda
2edb512282 docs: improve how-to docs for message history (#22072)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-23 20:12:24 -04:00
Artem
eb7c453b98 docs: update hub.pull("rlm/map-prompt") to hub.pull("rlm/reduce-prompt") for reduce prompt (#22088)
**PR message**: 
Update `hub.pull("rlm/map-prompt")` to `hub.pull("rlm/reduce-prompt")`
in summarization.ipynb

**Description:** 
Fix typo in prompt hub link from `reduce_prompt =
hub.pull("rlm/map-prompt")` to `reduce_prompt =
hub.pull("rlm/reduce-prompt")` following next issue

**Issue:** #22014

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-23 23:07:37 +00:00
Leonid Ganeline
2416737c5f docs: compact the API Reference links (#21285)
This PR is opinionated. 
Issue: the `API Reference` sections in the examples hold too much
vertical space and make us scroll the page too much. See an
[example](https://python.langchain.com/docs/get_started/quickstart/#conversation-retrieval-chain).
These sections are **important**. So, the compacting should not make
these sections less noticeable.
Change: compacting the `API Reference` sections. See the [same example
after change
applied](https://langchain-j6nya46lf-langchain.vercel.app/docs/get_started/quickstart/#conversation-retrieval-chain).
It is more compact and now looks like references (footnotes).
Note: I would also change the section style, so it would be more
noticeable (maybe to look like the footnotes. Smaller wider font?)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-23 15:50:23 -07:00
ccurme
0ea1e89b2c groq: read tool calls from .tool_calls attribute (#22096) 2024-05-23 18:16:06 -04:00
Bagatur
96c21dfe56 docs: hf feat table tool calling (#22091) 2024-05-23 15:09:30 -07:00
Eugene Yurtsev
63004a0945 codespell ignore remaining issues (#22097) 2024-05-23 21:51:39 +00:00
Eugene Yurtsev
2d693c484e docs: fix some spelling mistakes caught by newest version of code spell (#22090)
Going to merge this even though it doesn't pass all tests, and open a
separate PR for the remaining spelling mistakes.
2024-05-23 16:59:11 -04:00
Bagatur
38783d07c9 infra: api docs quick preview (#22093) 2024-05-23 13:57:45 -07:00
Pavel Zloi
fe26f937e4 community[minor]: ManticoreSearch engine added to vectorstore (#19117)
**Description:** ManticoreSearch engine added to vectorstores
**Issue:** no issue, just a new feature
**Dependencies:** https://pypi.org/project/manticoresearch-dev/
**Twitter handle:** @EvilFreelancer

- Example notebook with test integration:

https://github.com/EvilFreelancer/langchain/blob/manticore-search-vectorstore/docs/docs/integrations/vectorstores/manticore_search.ipynb

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-23 13:56:18 -07:00
Erick Friis
95c3e5f85f cli: model name substitution fix, release 0.0.23 (#22089) 2024-05-23 13:09:38 -07:00
Kartheek Yakkala
18b8c8628a docs : Added integrations for tools with langchain_community (#22056)
- **PR title**:  Docs enhancement

- **Description:** Adding installation instructions for integrations
requiring `langchain-community` package since 0.2
    - **Issue:** https://github.com/langchain-ai/langchain/issues/22005
2024-05-23 15:09:34 -04:00
ccurme
152c8cac33 anthropic, openai: cut pre-releases (#22083) 2024-05-23 15:02:23 -04:00
ccurme
cd07521170 core: bump to 0.2.1rc (#22080) 2024-05-23 18:36:50 +00:00
Harrison Chase
170cc8aec3 docs: add multi-modal-docs (#21734)
We dont really have any abstractions around multi-modal... so add a
section explaining we dont have any abstrations and then how to guides
for openai and anthropic (probably need to add for more)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: junefish <junefish@users.noreply.github.com>
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-23 18:33:25 +00:00
ccurme
fbfed65fb1 core, partners: add token usage attribute to AIMessage (#21944)
```python
class UsageMetadata(TypedDict):
    """Usage metadata for a message, such as token counts.

    Attributes:
        input_tokens: (int) count of input (or prompt) tokens
        output_tokens: (int) count of output (or completion) tokens
        total_tokens: (int) total token count
    """

    input_tokens: int
    output_tokens: int
    total_tokens: int
```
```python
class AIMessage(BaseMessage):
    ...
    usage_metadata: Optional[UsageMetadata] = None
    """If provided, token usage information associated with the message."""
    ...
```
2024-05-23 14:21:58 -04:00
Bagatur
3d26807b92 community[patch]: Release. 0.2.1 (#22073) 2024-05-23 10:40:32 -07:00
Bagatur
2d968213d7 langchain[patch]: Release 0.2.1 (#22074) 2024-05-23 10:09:36 -07:00
maang-h
9aba9e3e33 community[patch]: Update the default “API URL” and “MODEL” of sparkllm (#22070)
- **Description:** When I was running the sparkllm, I found that the
default parameters currently used could no longer run correctly.
    - original parameters & values:
         - spark_api_url: "wss://spark-api.xf-yun.com/v3.1/chat"
         - spark_llm_domain: "generalv3"
    ```python
    # example
    
    from langchain_community.chat_models import ChatSparkLLM
    
spark = ChatSparkLLM(spark_app_id="my_app_id",
spark_api_key="my_api_key", spark_api_secret="my_api_secret")
    spark.invoke("hello")
    ```

![sparkllm](https://github.com/langchain-ai/langchain/assets/55082429/5369bfdf-4305-496a-bcf5-2d3f59d39414)

So I updated them to 3.5 (same as sparkllm official website). After the
update, they can be used normally.
    - new parameters & values:
         - spark_api_url: "wss://spark-api.xf-yun.com/v3.5/chat"
         - spark_llm_domain: "generalv3.5"
2024-05-23 12:25:20 -04:00
junkeon
4fda7bf4f2 upstage[patch] : fix error handling in Layout Analysis parser (#22054)
This pull request addresses and fixes exception handling in the
UpstageLayoutAnalysisParser and enhances the test coverage by adding
error exception tests for the document loader. These improvements ensure
robust error handling and increase the reliability of the system when
dealing with external API calls and JSON responses.

### Changes Made
1. Fix Request Exception Handling:

- Issue: The existing implementation of UpstageLayoutAnalysisParser did
not properly handle exceptions thrown by the requests library, which
could lead to unhandled exceptions and potential crashes.
- Solution: Added comprehensive exception handling for
requests.RequestException to catch any request-related errors. This
includes logging the error details and raising a ValueError with a
meaningful error message.

2. Add Error Exception Tests for Document Loader:

- New Tests: Introduced new test cases to verify the robustness of the
UpstageLayoutAnalysisLoader against various error scenarios. The tests
ensure that the loader gracefully handles:
- RequestException: Simulates network issues or invalid API requests to
ensure appropriate error handling and user feedback.
- JSONDecodeError: Simulates scenarios where the API response is not a
valid JSON, ensuring the system does not crash and provides clear error
messaging.
2024-05-23 11:45:34 -04:00
JuHyung Son
d9eff44400 partner-upstage[patch]: embeddings empty list bug (#22057)
Fixed an error in `embed_documents` when the input was given as an empty
list. And I have revised the document.
2024-05-23 11:44:30 -04:00
Martin Triska
2df8ac402a community[minor]: Added propagation of document metadata from O365BaseLoader (#20663)
**Description:**
- Added propagation of document metadata from O365BaseLoader to
FileSystemBlobLoader (O365BaseLoader uses FileSystemBlobLoader under the
hood).
- This is done by passing dictionary `metadata_dict`: key=filename and
value=dictionary containing document's metadata
- Modified `FileSystemBlobLoader` to accept the `metadata_dict`, use
`mimetype` from it (if available) and pass metadata further into blob
loader.

**Issue:**
- `O365BaseLoader` under the hood downloads documents to temp folder and
then uses `FileSystemBlobLoader` on it.
- However metadata about the document in question is lost in this
process. In particular:
- `mime_type`: `FileSystemBlobLoader` guesses `mime_type` from the file
extension, but that does not work 100% of the time.
- `web_url`: this is useful to keep around since in RAG LLM we might
want to provide link to the source document. In order to work well with
document parsers, we pass the `web_url` as `source` (`web_url` is
ignored by parsers, `source` is preserved)

**Dependencies:**
None

**Twitter handle:**
@martintriska1

Please review @baskaryan

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-05-23 11:42:19 -04:00
Eugene Yurtsev
e5541d1da7 community[patch]: Update doc-string in CloudBlobLoader (#22069)
Update doc-string
2024-05-23 15:31:41 +00:00
Maxime Perrin
8ba4f77734 docs : Adding correct imports to the integrations callbacks doc (#22059)
- **Description:** Adding correct imports to the integrations callbacks
doc (langchain-community package)
  - **Issue:** #22005

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-05-23 11:27:36 -04:00
Philippe PRADOS
6dd621d636 community[minor]: Add CloudBlobLoader that supports loading data from cloud buckets (#21957)
Thank you for contributing to LangChain!

- [ ] **PR title**: "Add CloudBlobLoader"
  - community: Add CloudBlobLoader

- [ ] **PR message**: Add cloud blob loader
    - **Description:** 
 Langchain provides several approaches to read different file formats:

Specific loaders (`CVSLoader`) or blob-compatible loaders
(`FileSystemBlobLoader`). The only implementation proposed for
BlobLoader is `FileSystemBlobLoader`.
      
Many projects retrieve files from cloud storage. We propose a new
implementation of `BlobLoader` to read files from the three cloud
storage systems. The interface is strictly identical to
`FileSystemBlobLoader`. The only difference is the constructor, which
takes a cloud "url" object such as `s3://my-bucket`, `az://my-bucket`,
or `gs://my-bucket`.
      
By streamlining the process, this novel implementation eliminates the
requirement to pre-download files from cloud storage to local temporary
files (which are seldom removed).
      
The code relies on the
[CloudPathLib](https://cloudpathlib.drivendata.org/stable/) library to
interpret cloud URLs. This has been added as an optional dependency.

```Python
loader = CloudBlobLoader("s3://mybucket/id")
for blob in loader.yield_blobs():
    print(blob)
```

- [X] **Dependencies:** CloudPathLib
- [X] **Twitter handle:** pprados


- [X] **Add tests and docs**: Add unit test, but it's easy to convert to
integration test, with some files in a cloud storage (see
`test_cloud_blob_loader.py`)

- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified.

Hello from Paris @hwchase17. Can you review this PR?

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-05-23 10:59:55 -04:00
Christophe Bornet
74947ec894 community[minor]: Add Cassandra ByteStore (#22064) 2024-05-23 10:46:23 -04:00
Christophe Bornet
fea6b99b16 community[minor]: Add async methods to CassandraChatMessageHistory (#21975) 2024-05-23 10:13:05 -04:00
Eugene Yurtsev
37cfc00310 docs: concepts callbacks fix admonition (#22048)
Correct the admonition text
2024-05-22 20:33:28 -04:00
Erick Friis
53293dace8 docs: version increases (#22050) 2024-05-22 16:20:10 -07:00
Sky
12d65f17ff community[patch]: surrealdb provide functions for MMR (Maximal Marginal Relevance) (#21185)
This PR contains 4 added functions:

- max_marginal_relevance_search_by_vector
- amax_marginal_relevance_search_by_vector
- max_marginal_relevance_search
- amax_marginal_relevance_search

I'm no langchain expert, but tried do inspect other vectorstore sources
like chroma, to build these functions for SurrealDB. If someone has some
changes for me, please let me know. Otherwise I would be happy, if these
changes are added to the repository, so that I can use the orignal repo
and not my local monkey patched version.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 22:53:55 +00:00
Erick Friis
58b6c72375 docs: add astream v2 migration guide links (#21845)
- docs: v0.2 version sidebar
- x
- x
2024-05-22 15:48:42 -07:00
Bruno Alvisio
5eabe90494 community[patch]: Adding HEADER to the list of supported locations (#21946)
**Description:** adds headers to the list of supported locations when
generating the openai function schema
2024-05-22 22:47:56 +00:00
Bagatur
50186da0a1 infra: rm unused # noqa violations (#22049)
Updating #21137
2024-05-22 15:21:08 -07:00
acho98
45ed5f3f51 community[minor]: Add Clova Embeddings for LangChain Community (#21890)
- [ ] **PR title**: "Add Naver ClovaX embedding to LangChain community"
- HyperClovaX is a large language model developed by
[Naver](https://clova-x.naver.com/welcome).
It's a powerful and purpose-trained LLM.

- You can visit the embedding service provided by
[ClovaX](https://www.ncloud.com/product/aiService/clovaStudio)

- You may get CLOVA_EMB_API_KEY, CLOVA_EMB_APIGW_API_KEY,
CLOVA_EMB_APP_ID From
https://www.ncloud.com/product/aiService/clovaStudio

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 22:08:47 +00:00
arpitkumar980
444c2a3d9f community[patch]: sharepoint loader identity enabled (#21176)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:https://github.com/arpitkumar980/langchain.git
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-22 22:08:31 +00:00
Eugene Yurtsev
8a877120c3 docs: add admonitions to how-to callbacks (#22046)
Add admonitions with more information.
2024-05-22 22:05:57 +00:00
HuiyuanYan
bf3aefce93 community[patch]: Update tongyi.py to support MultimodalConversation in dashscope. (#21249)
Add the support of multimodal conversation in dashscope,now we can use
multimodal language model "qwen-vl-v1", "qwen-vl-chat-v1",
"qwen-audio-turbo" to processing picture an audio. :)

- [ ] **PR title**: "community: add multimodal conversation support in
dashscope"



- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** add multimodal conversation support in dashscope
    - **Issue:** 
    - **Dependencies:** dashscope≥1.18.0
    - **Twitter handle:** none :)


- [ ] **How to use it?**:
   - ```python
     Tongyi_chat = ChatTongyi(
        top_p=0.5,
        dashscope_api_key=api_key,
        model="qwen-vl-v1"
     )
     response= Tongyi_chat.invoke(
        input = 
        [
        {
            "role": "user",
            "content": [
{"image":
"https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
                {"text": "这是什么?"}
            ]
        }
        ]
       )
      ```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 22:04:58 +00:00
mochi
63284ffebf experimental[patch], docs: refine notebook for MyScale SelfQueryRetriever (#22016)
- **Description:** upgrade model to `gpt-4o`
2024-05-22 21:49:01 +00:00
MSubik
d948783a4c community[patch]: standardize init args, update for javelin sdk release. (#21980)
Related to
[20085](https://github.com/langchain-ai/langchain/issues/20085) Updated
the Javelin chat model to standardize the initialization argument. Also
fixed an existing bug, where code was initialized with incorrect call to
the JavelinClient defined in the javelin_sdk, resulting in an
initialization error. See related [Javelin
Documentation](https://docs.getjavelin.io/docs/javelin-python/quickstart).
2024-05-22 21:47:28 +00:00
Mohammad Mohtashim
16617dd239 community[patch]: AzureSearchVectorStoreRetriever Fixed to account for search_kwargs (#21572)
- **Description:** Fixed `AzureSearchVectorStoreRetriever` to account
for search_kwargs. More explanation is in the mentioned issue.
- **Issue:** #21492

---------

Co-authored-by: MAC <mac@MACs-MacBook-Pro.local>
Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 14:46:41 -07:00
Klaudia Lemiec
45351d1bc6 docs: Chroma docstrings update (#22001)
Thank you for contributing to LangChain!

- [X] **PR title**: "docs: Chroma docstrings update"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [X] **PR message**: 
    - **Description:** Added and updated Chroma docstrings
    - **Issue:** https://github.com/langchain-ai/langchain/issues/21983


- [X] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
  - only docs


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-05-22 21:45:30 +00:00
Jerron Lim
28456c2c33 community[patch]: add args_schema to WikipediaQueryRun (#22019)
Description: This change adds args_schema (pydantic BaseModel) to
WikipediaQueryRun for correct schema formatting on LLM function calls

Issue: currently using WikipediaQueryRun with OpenAI function calling
returns the following error "TypeError: WikipediaQueryRun._run() got an
unexpected keyword argument '__arg1' ". This happens because the schema
sent to the LLM is "input: '{"__arg1":"Hunter x Hunter"}'" while the
method should be called with the "query" parameter.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 21:31:58 +00:00
Mazen Ramadan
3c1d77dd64 community[minor]: Add Scrapfly Loader community integration (#22036)
Added [Scrapfly](https://scrapfly.io/) Web Loader integration. Scrapfly
is a web scraping API that allows extracting web page data into
accessible markdown or text datasets.

- __Description__: Added Scrapfly web loader for retrieving web page
data as markdown or text.
- Dependencies: scrapfly-sdk
- Twitter: @thealchemi1st

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 21:29:13 +00:00
Chad Juliano
9a66c43146 docs: Use Kinetica Sql context API (#21993)
Update python notebook to use new Kinetica SQL context API.
2024-05-22 14:26:20 -07:00
ccurme
b51a1eba4d langchain, community: move OpenAIAssistantV2Runnable to community (#22044) 2024-05-22 21:22:50 +00:00
Mirna Wong
b4d5f3181b docs: updates code examples in neo4j_cypher.ipynb (#21973)
Resolves #19134

Thank you for contributing to LangChain!

- [x ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** this pr replaces `title` with `name` in the [add
examples in cypher generation
prompt](https://python.langchain.com/v0.1/docs/integrations/graphs/neo4j_cypher/#add-examples-in-the-cypher-generation-prompt)
section.
    - **Issue:** 19134
    - **Dependencies:** any dependencies required for this change
    - **Twitter handle:** @mirna_wong
2024-05-22 20:48:09 +00:00
CaroFG
6b98140b38 community[patch]: update for compatibility with Meilisearch v1.8 (#21979)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Updates Meilisearch vectorstore for compatibility
with v1.8. Adds [”showRankingScore”:
true”](https://www.meilisearch.com/docs/reference/api/search#ranking-score)
in the search parameters and replaces `_semanticScore` field with `
_rankingScore`


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-05-22 13:37:01 -07:00
Oleksii Pokotylo
98c0b093bb community[patch]: Extend AzureSearch with maximal_marginal_relevance, from_embeddings (#21065)
**Description:**
- Extend AzureSearch with `maximal_marginal_relevance` (for vector and
hybrid search)
- Add construction `from_embeddings` - if the user has already embedded
the texts
- Add `add_embeddings` 
- Refactor common parts (`_simple_search`, `_results_to_documents`,
`_reorder_results_with_maximal_marginal_relevance`)
- Add `vector_search_dimensions` as a parameter to the constructor to
avoid extra calls to `embed_query` (most of the time the user applies
the same model and knows the dimension)

**Issue:** none
**Dependencies:** none

- [x] **Add tests and docs**: The docstrings have been added to the new
functions, and unified for the existing ones. The example notebook is
great in illustrating the main usage of AzureSearch, adding the new
methods would only dilute the main content.
- [x] **Lint and test**

---------

Co-authored-by: Oleksii Pokotylo <oleksii.pokotylo@pwc.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 13:36:06 -07:00
Erick Friis
ed5914ff61 docs: move feedback into paginator from content (#22041)
we only index what's in the `<article>` tags for search. We should not
have the feedback in the article.
2024-05-22 13:21:27 -07:00
SaschaStoll
709664a079 community[patch]: Performant filter columns option for Hanavector (#21971)
**Description:** Backwards compatible extension of the initialisation
interface of HanaDB to allow the user to specify
specific_metadata_columns that are used for metadata storage of selected
keys which yields increased filter performance. Any not-mentioned
metadata remains in the general metadata column as part of a JSON
string. Furthermore switched to executemany for batch inserts into
HanaDB.

**Issue:** N/A

**Dependencies:** no new dependencies added

**Twitter handle:** @sapopensource

---------

Co-authored-by: Martin Kolb <martin.kolb@sap.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-22 13:21:21 -07:00
Bagatur
16b55b0704 langchain[patch]: remove dataclasses-json dep (#22042)
vestigial dep afaict
2024-05-22 13:20:57 -07:00
Christos Boulmpasakos
c3bcfad66d text-splitters[patch]: Extend TextSplitter:keep_separator functionality (#21130)
**Description:** Added extra functionality to `CharacterTextSplitter`,
`TextSplitter` classes.
The user can select whether to append the separator to the previous
chunk with `keep_separator='end' ` or else prepend to the next chunk.
Previous functionality prepended by default to next chunk.
  
**Issue:** Fixes #20908

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-22 13:17:45 -07:00
Bagatur
b859765752 docs: fix partner api ref build (#22007) 2024-05-22 13:16:07 -07:00
Eric Zhang
e7e41eaabe langchain: add RankLLM Reranker (#21171)
Integrate RankLLM reranker (https://github.com/castorini/rank_llm) into
LangChain

An example notebook is given in
`docs/docs/integrations/retrievers/rankllm-reranker.ipynb`

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-05-22 20:12:55 +00:00
Eugene Yurtsev
14a9c7c44e concepts: update callback concepts (#22040)
Update callback concepts
2024-05-22 15:58:02 -04:00
maang-h
fc93bed8c4 community: Fix CSVLoader columns is None (#20701)
- **Bug code**: In
langchain_community/document_loaders/csv_loader.py:100

- **Description**: currently, when 'CSVLoader' reads the column as None
in the 'csv' file, it will report an error because the 'CSVLoader' does
not verify whether the column is of str type and does not consider how
to handle the corresponding 'row_data' when the column is' None 'in the
csv. This pr provides a solution.

- **Issue:**  Fix #20699 

- **thinking:**

1. Refer to the processing method for
'langchain_community/document_loaders/csv_loader.py:100' when **'v'**
equals'None', and apply the same method to '**k**'.
(Reference`csv.DictReader` ,**'k'** will only be None when `
len(columns) < len(number_row_data)` is established)
2. **‘k’** equals None only holds when it is the last column, and its
corresponding **'v'** type is a list. Therefore, I referred to the data
format in 'Document' and used ',' to concatenated the elements in the
list.(But I'm not sure if you accept this form, if you have any other
ideas, communicate)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-22 12:57:46 -07:00
Nithin James Padayatti
403142eaba langchain: added revision_example prompt template (#20916)
**Description:** Added revision_example prompt template to include the
revision request and revision examples in the revision chain.
    **Issue:** Not Applicable
    **Dependencies:** Not Applicable
    **Twitter handle:**  @nithinjp09
2024-05-22 19:57:32 +00:00
Sihan Chen
1f81277b9b community[minor]: allow enabling proxy in aiohttp session in AsyncHTML (#19499)
Allow enabling proxy in aiohttp session async html
2024-05-22 18:25:06 +00:00
Eugene Yurtsev
36813d2f00 community[patch]: Fix remaining __inits__ in community (#22037)
Fixes the __init__ files in community to use __all__ which is statically
defined.
2024-05-22 17:42:17 +00:00
Eugene Yurtsev
b7d08bf764 docs: update doc feedback to populate URL (#22033)
Update docfeedback to populate URL
2024-05-22 13:38:11 -04:00
Eugene Yurtsev
58360a1e53 community[patch]: Add unit test to verify that init is correctly defined (#22030)
Fix some __init__ files and add a unit test
2024-05-22 17:19:00 +00:00
Erick Friis
ef53ccf54b robocorp: release 0.0.8 (#22034) 2024-05-22 16:41:41 +00:00
Eugene Yurtsev
4633b4cf2b ci: update documentation template to include URL (#22032)
update documentation template to include URL
2024-05-22 12:01:28 -04:00
Matthew Hoffman
4f2e3bd7fd community[patch]: fix public interface for embeddings module (#21650)
## Description

The existing public interface for `langchain_community.emeddings` is
broken. In this file, `__all__` is statically defined, but is
subsequently overwritten with a dynamic expression, which type checkers
like pyright do not support. pyright actually gives the following
diagnostic on the line I am requesting we remove:


[reportUnsupportedDunderAll](https://github.com/microsoft/pyright/blob/main/docs/configuration.md#reportUnsupportedDunderAll):

```
Operation on "__all__" is not supported, so exported symbol list may be incorrect
```

Currently, I get the following errors when attempting to use publicablly
exported classes in `langchain_community.emeddings`:

```python
import langchain_community.embeddings

langchain_community.embeddings.HuggingFaceEmbeddings(...)  #  error: "HuggingFaceEmbeddings" is not exported from module "langchain_community.embeddings" (reportPrivateImportUsage)
```

This is solved easily by removing the dynamic expression.
2024-05-22 11:42:15 -04:00
Maxime Perrin
6548052f9e docs : Integrations vector stores with langchain-community install (#22028)
- **Description:** Adding installation instruction for integrations
requiring `langchain-community` package since 0.2
  - **Issue:** #22005

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-05-22 15:32:01 +00:00
Eugene Yurtsev
8d82160a8a community[patch]: Clean up logic in import checking unit test (#22026)
Clean up unit test
2024-05-22 15:30:10 +00:00
Tomaz Bratanic
d8a1f1114d community[patch]: Handle exceptions where node props aren't consistent in neo4j schema (#22027) 2024-05-22 11:21:56 -04:00
WeichenXu
b0ef5e778a community[patch]: Fix ChatDatabricsk in case that streaming response doesn't have role field in delta chunk (#21897)
Thank you for contributing to LangChain!

- [X] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


**Description:**
Fix ChatDatabricsk in case that streaming response doesn't have role
field in delta chunk


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
2024-05-22 08:12:53 -07:00
Eugene Yurtsev
aed64daabb community[patch]: Add unit test to catch bad __all__ definitions (#21996)
This will catch all dynamic __all__ definitions.
2024-05-22 09:32:13 -04:00
Brian Thorne
25ba733218 docs: Update import in wikipedia tool documentation (#21565)
Updates docs so the example doesn't lead to a warning:
```
LangChainDeprecationWarning: Importing tools from langchain is deprecated. Importing from langchain will no longer be supported as of langchain==0.2.0. Please import from langchain-community instead:

`from langchain_community.tools import WikipediaQueryRun`.

To install langchain-community run `pip install -U langchain-community`.
```
2024-05-21 17:20:51 -07:00
Bagatur
3b0437c05b core[patch]: Release 0.2.1 (#22003) 2024-05-22 00:05:04 +00:00
Kefan You
24b5c27bb1 community[patch]: raise_for_status logic missing in async _fetch of WebBaseLoader (#21948)
## 'raise_for_status' parameter of WebBaseLoader works in sync load but
not in async load.
In webBaseLoader:  

Sync load is calling `_scrape` and has `raise_for_status` properly
handled.
```
    def _scrape(
        self,
        url: str,
        parser: Union[str, None] = None,
        bs_kwargs: Optional[dict] = None,
    ) -> Any:
        from bs4 import BeautifulSoup

        if parser is None:
            if url.endswith(".xml"):
                parser = "xml"
            else:
                parser = self.default_parser

        self._check_parser(parser)

        html_doc = self.session.get(url, **self.requests_kwargs)
        if self.raise_for_status:
            html_doc.raise_for_status()

        if self.encoding is not None:
            html_doc.encoding = self.encoding
        elif self.autoset_encoding:
            html_doc.encoding = html_doc.apparent_encoding
        return BeautifulSoup(html_doc.text, parser, **(bs_kwargs or {}))
```
Async load is calling `_fetch` but missing `raise_for_status` logic.
```
    async def _fetch(
        self, url: str, retries: int = 3, cooldown: int = 2, backoff: float = 1.5
    ) -> str:
        async with aiohttp.ClientSession() as session:
            for i in range(retries):
                try:
                    async with session.get(
                        url,
                        headers=self.session.headers,
                        ssl=None if self.session.verify else False,
                        cookies=self.session.cookies.get_dict(),
                    ) as response:
                        return await response.text()
```

Co-authored-by: kefan.you <darkfss@sina.com>
2024-05-21 23:51:03 +00:00
Mateusz Szewczyk
80f8fe1793 docs: update IBM WatsonxLLM docs with deprecated LLMChain (#21960)
Thank you for contributing to LangChain!

- [x] **PR title**: "update IBM WatsonxLLM docs with deprecated
LLMChain"

- [x] **PR message**: 
- **Description:** update IBM WatsonxLLM docs with deprecated LLMChain

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-05-21 16:43:02 -07:00
Surya Rath
eb096675a8 OpenAI Assistants v2 api support for OpenAIAssistantRunnable (#21484)
**Title**: "langchain: OpenAI Assistants v2 api support"

***Descriptions*** 
- [x] "attachments" support added along with backward compatibility of
"file_ids"
- [x]  "tool_resources" support added while creating new assistant

- [ ] "tool_choice" parameter support
- [ ]  Streaming support


- **Dependencies:** OpenAI v2 API (openai>=1.23.0)
- **Twitter handle:** @skanta_rath

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-21 15:32:29 -07:00
Eugene Yurtsev
7a5d042bd2 langchain[patch]: Add unit test to detect changes to community imports (#21998)
Add unit tests for community imports
2024-05-21 17:45:26 -04:00
Eugene Yurtsev
90f4d8842f langchain[patch]: Turn on all deprecations for 0.2 (#21999)
- Turn on all 0.2 import deprecations.
- Update error messag with URL to upgrade instructions.
2024-05-21 17:33:43 -04:00
Asaf Joseph Gardin
a042e804b4 ai21: AI21 Jamba docs (#21978)
- Updated docs to have an example to use Jamba instead of J2

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-21 19:27:46 +00:00
Pengcheng Liu
4cf523949a community[patch]: Update model client to support vision model in Tong… (#21474)
- **Description:** Tongyi uses different client for chat model and
vision model. This PR chooses proper client based on model name to
support both chat model and vision model. Reference [tongyi
document](https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-vl-plus-api?spm=a2c4g.11186623.0.0.27404c9a7upm11)
for details.

```
from langchain_core.messages import HumanMessage
from langchain_community.chat_models import ChatTongyi

llm = ChatTongyi(model_name='qwen-vl-max')
image_message = {
    "image": "https://lilianweng.github.io/posts/2023-06-23-agent/agent-overview.png"
}
text_message = {
    "text": "summarize this picture",
}
message = HumanMessage(content=[text_message, image_message])
llm.invoke([message])
```

- **Issue:** None
- **Dependencies:** None
- **Twitter handle:** None
2024-05-21 11:58:27 -07:00
Erick Friis
98b64f3ae3 infra: only tag core releases as github latest (#21991) 2024-05-21 11:39:03 -07:00
Sevin F. Varoglu
1bc0ea5496 community[patch]: update OctoAIEmbeddings to subclass OpenAIEmbeddings (#21805) 2024-05-21 11:29:41 -07:00
Eugene Yurtsev
ded53297e0 core[patch]: Add unit test for RunnableGenerator for eventstream v2 (#21990)
No unit tests with runnable generator
2024-05-21 14:29:15 -04:00
Nuno Campos
fb6108c8f5 core[patch]: In astream_events(version=v2) tap output of root run (#21977)
- if tap_output_iter/aiter is called multiple times for the same run
issue events only once
- if chat model run is tapped don't issue duplicate on_llm_new_token
events
- if first chunk arrives after run has ended do not emit it as a stream
event

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-05-21 14:03:57 -04:00
Bagatur
72d4a8eeed community[patch]: AzureSearch dont overwrite default async (#21989) 2024-05-21 11:01:28 -07:00
ccurme
a983465694 docs: set default anthropic model (#21988)
`ChatAnthropic()` raises ValidationError.
2024-05-21 11:01:18 -07:00
Muhammed Al-Dulaimi
5448e16fe6 Fix grammar error (#21985)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.
2024-05-21 10:59:48 -07:00
ccurme
4be5537837 Revert "anthropic: set default model" (#21987)
Reverts langchain-ai/langchain#21986
2024-05-21 17:28:32 +00:00
ccurme
35439cf3bd anthropic: set default model (#21986)
Various docs reference `ChatAnthropic()`, but this currently raises
ValidationError.
2024-05-21 17:24:31 +00:00
ccurme
0923136851 langchain: default to Runnable in MultiQueryRetriever (#21770)
- `llm_chain` becomes `Union[LLMChain, Runnable]`
- `.from_llm` creates a runnable

tested by verifying that docs/how_to/MultiQueryRetriever.ipynb runs
unchanged with sync/async invoke (and that it runs if we specifically
instantiate with LLMChain).
2024-05-21 17:01:05 +00:00
Yulong Wang
8e1aeb8ad5 community[patch]: Fix typo in arxiv tool's doc (#21970)
Fix typo in arxiv tool's doc
2024-05-21 13:44:59 +00:00
Robert Caulk
54adcd9e82 community[minor]: add AskNews retriever and AskNews tool (#21581)
We add a tool and retriever for the [AskNews](https://asknews.app)
platform with example notebooks.

The retriever can be invoked with:

```py
from langchain_community.retrievers import AskNewsRetriever

retriever = AskNewsRetriever(k=3)

retriever.invoke("impact of fed policy on the tech sector")
```

To retrieve 3 documents in then news related to fed policy impacts on
the tech sector. The included notebook also includes deeper details
about controlling filters such as category and time, as well as
including the retriever in a chain.

The tool is quite interesting, as it allows the agent to decide how to
obtain the news by forming a query and deciding how far back in time to
look for the news:

```py
from langchain_community.tools.asknews import AskNewsSearch
from langchain import hub
from langchain.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI

tool = AskNewsSearch()

instructions = """You are an assistant."""
base_prompt = hub.pull("langchain-ai/openai-functions-template")
prompt = base_prompt.partial(instructions=instructions)
llm = ChatOpenAI(temperature=0)
asknews_tool = AskNewsSearch()
tools = [asknews_tool]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
)

agent_executor.invoke({"input": "How is the tech sector being affected by fed policy?"})
```

---------

Co-authored-by: Emre <e@emre.pm>
2024-05-20 18:23:06 -07:00
Jesse S
fc79b372cb community[minor]: add aerospike vectorstore integration (#21735)
Please let me know if you see any possible areas of improvement. I would
very much appreciate your constructive criticism if time allows.

**Description:**
- Added a aerospike vector store integration that utilizes
[Aerospike-Vector-Search](https://aerospike.com/products/vector-database-search-llm/)
add-on.
- Added both unit tests and integration tests
- Added a docker compose file for spinning up a test environment
- Added a notebook

 **Dependencies:** any dependencies required for this change
- aerospike-vector-search

 **Twitter handle:** 
- No twitter, you can use my GitHub handle or LinkedIn if you'd like

Thanks!

---------

Co-authored-by: Jesse Schumacher <jschumacher@aerospike.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-21 01:01:47 +00:00
Prince Canuma
3587c60396 community[patch]: Fix MLX LLM Stream (#20575)
Closes #20561

This PR fixes MLX LLM stream `AttributeError`. 

Recently, `mlx-lm` changed the token decoding logic, which affected the
LC+MLX integration.

Additionally, I made minor fixes such as: docs example broken link and
enforcing pipeline arguments (max_tokens, temp and etc) for invoke.
   
- **Issue:** #20561
    
- **Twitter handle:** @Prince_Canuma
2024-05-20 17:17:08 -07:00
Rahul Triptahi
96bd0b0844 community[patch]: Remove redundant pebblo cloud api call (#21589)
Description: removed redundant pebblo cloud api call. Changed classified
`doc` key to `ai_apps_data`.
Documentation: N/A
Unit tests: N/A
2024-05-20 17:15:16 -07:00
Param Singh
d07885f8b7 community[patch]: standardized sparkllm init args (#21633)
Related to #20085 
@baskaryan 

Thank you for contributing to LangChain!

community:sparkllm[patch]: standardized init args

updated `spark_api_key` so that aliased to `api_key`. Added integration
test for `sparkllm` to test that it continues to set the same underlying
attribute.

updated temperature with Pydantic Field, added to the integration test.

Ran `make format`,`make test`, `make lint`, `make spell_check`
2024-05-20 17:11:36 -07:00
Dhruv Chawla
d4359d3de6 community[patch]: Update UpTrain Callback Handler to support the new UpTrain evaluation schema (#21656)
UpTrain has a new dashboard now that makes it easier to view projects
and evaluations. Using this requires specifying both project_name and
evaluation_name when performing evaluations. I have updated the code to
support it.
2024-05-20 17:06:00 -07:00
Alex Riina
c0e3c3a350 openai[patch], community[patch]: add pricing and max context window for GPT-4o (#21673)
# Add pricing and max context window for GPT-4o
- community: add cost per 1k tokens and max context window
- partners: add max context window

**Description:** adds static information about GPT-4o based on
https://openai.com/api/pricing/ and
https://platform.openai.com/docs/models/gpt-4o so that GPT-4o reporting
is accurate.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-20 23:47:43 +00:00
缨缨
bd39b2ccdf community: enable SupabaseVectorStore to support extended table fields (#21762)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: enable SupabaseVectorStore to support
extended table fields"

- [x] **PR message**: 
- Added extension fields to the function _add_vectors so that users can
add other custom fields when insert a record into the database. eg:
    

![image](https://github.com/langchain-ai/langchain/assets/10885578/e1d5ca20-936e-4cab-ba69-8fdd23b8ce8f)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-20 16:32:26 -07:00
Jerome Choo
2316635add docs: Clean up Diffbot docs (#21781)
The Diffbot DocumentLoader page doesn't actually run for a number of
reasons. This PR fixes it along with some light details on the Graph
Transformer and Provider pages.

## Full Changelog

[Document Loader
Page](https://python.langchain.com/v0.1/docs/integrations/document_loaders/diffbot/)
* Fixed the notebook so that it actually runs (missing required modules,
env variables, etc..)
* Added "open in colab" button like the Graph Transformer page

[Graph Transformer
Page](https://python.langchain.com/v0.2/docs/integrations/graphs/diffbot/)
* Fixed broken colab link
* Moved "open in colab" button to below description so the description
in the [Graphs category
page](https://python.langchain.com/v0.2/docs/integrations/graphs/) shows
up correctly

[Provider
Page](https://python.langchain.com/v0.2/docs/integrations/providers/diffbot/)
* Clarified explanations of Diffbot products
* Added section and link to LangChain Graph Transformer page

---------

Co-authored-by: jeromechoo <hello@jeromechoo.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-20 23:09:22 +00:00
Rohan Aggarwal
d8a101074f docs: updates for OracleDB (#21745)
Thank you for contributing to LangChain!

Documentation change for OracleDB

Fixed several things in Oracle Documentation.
2024-05-20 16:01:35 -07:00
Leonid Ganeline
9799437bc2 docs: YouTube page update (#21780)
Greatly simplified to get a cleaner look.
Only the YouTube pages with 40K+ views.
2024-05-20 15:50:41 -07:00
Leonid Ganeline
e98a4fd19a ai21[patch]: configuration fix (#21790)
added "repository" and "Source Code" parameters (these parameters are
missed only in this partner package configuration).
2024-05-20 15:49:38 -07:00
Trayan Azarov
f54cbf8ff5 chroma[patch]: Chroma - remove reference to collection upon delete_collection (#21817)
**Description**:

- Reference to `Collection` object is set to `None` when deleting a
collection `delete_collection()`
- Added utility method `reset_collection()` to allow recreating the
collection
- Moved collection creation out of `__init__` into
`__ensure_collection()` to be reused by object init and
`reset_collection()`
- `_collection` is now a property to avoid breaking changes

**Issues**: 

- chroma-core/chroma#2213

**Twitter**: @t_azarov
2024-05-20 15:42:36 -07:00
Jens
b0b302ec6b community[patch]: fixed aleph alpha default emedding request (#21826)
- **Description:** In the aleph alpha client the paramater `normalize`
is *not* optional. Setting this to `None` gives an error.
- **Dependencies:** None

Co-authored-by: Jens Lücke <jens.luecke@tngtech.com>
Co-authored-by: Jens <jens.luecke@hu-berlin.de>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-20 22:39:43 +00:00
Leonid Ganeline
6a59f76f2b docs: added template to arxiv page (#21846)
Updated `arXiv` page with the arxiv references from Templates (were
references from Docs and API Refs, not Templates).
Re #21450 
CC @eyurtsev
2024-05-20 15:30:35 -07:00
Jorge Piedrahita Ortiz
e6207ad4f3 community[patch]: Sambanova integration api update (#21848)
- **Description:**:
        SambaStudio generic endpoint compatibility added
        Improved error description, and handling
        streaming examples added
2024-05-20 15:29:59 -07:00
Bagatur
c6da9533ac docs: correct langserve link (#21940) 2024-05-20 22:15:31 +00:00
Michael Reed
7a5e1bcf99 core[patch]: Fix NPE in function_calling._get_python_function_required_args (#21863)
Example error message:
line 206, in _get_python_function_required_args
    if is_function_type and required[0] == "self":
                            ~~~~~~~~^^^
IndexError: list index out of range

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-20 22:06:27 +00:00
Liuww
332ffed393 community[patch]: Adopting the lighter-weight xinference_client (#21900)
While integrating the xinference_embedding, we observed that the
downloaded dependency package is quite substantial in size. With a focus
on resource optimization and efficiency, if the project requirements are
limited to its vector processing capabilities, we recommend migrating to
the xinference_client package. This package is more streamlined,
significantly reducing the storage space requirements of the project and
maintaining a feature focus, making it particularly suitable for
scenarios that demand lightweight integration. Such an approach not only
boosts deployment efficiency but also enhances the application's
maintainability, rendering it an optimal choice for our current context.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-20 22:05:09 +00:00
Tomaz Bratanic
a43515ca65 experimental[patch]: Pass enum only to openai in llm graph transformer (#21860)
Some models like Groq return bad request if you pass in `enum` parameter
in tool definition
2024-05-20 15:02:48 -07:00
Ozan Kaşıkçı
aab9cb666f docs: Update agents.ipynb, add missing word "see" (#21872)
- **Description:** Add missing see word in the docs
2024-05-20 22:00:03 +00:00
Jiří Spilka
6499897c87 community[patch]: update apify integration to attribute API activity to langchain (#21909)
**Description:** Add `Origin/langchain` to Apify's client's user-agent
to attribute API activity to LangChain (at Apify, we aim to monitor our
integrations to evaluate whether we should invest more in the LangChain
integration regarding functionality and content)

**Issue:** None
**Dependencies:** None
**Twitter handle:** None
2024-05-20 14:49:23 -07:00
Mohammad Mohtashim
711b8f1e52 docs: HuggingFace Endpoint Documentation Fixed (#21914)
Fixed Documentation for HuggingFaceEndpoint as per the issue #21903

---------

Co-authored-by: keenborder786 <mohammad.mohtashim78@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-20 21:23:28 +00:00
Jared Van Bortel
25d1c1c9bb nomic: implement local embeddings with the inference_mode parameter (#21934)
## Description

This PR implements local and dynamic mode in the Nomic Embed integration
using the inference_mode and device parameters. They work as documented
[here](https://docs.nomic.ai/reference/python-api/embeddings#local-inference).

<!-- If no one reviews your PR within a few days, please @-mention one
of baskaryan, efriis, eyurtsev, hwchase17. -->

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-05-20 14:17:07 -07:00
ccurme
0e72ed39a0 infra: fix CI on text-splitters (#21935) 2024-05-20 14:03:42 -07:00
Ozan Kaşıkçı
f4ffef98a2 docs: how to: tool calling: Fix typo in sentence (#21877)
- **Description:** Fix grammar error.
2024-05-20 20:58:52 +00:00
Erick Friis
6b97418836 docs: rewrite old home, fix v0.1 infinite redirect (#21936) 2024-05-20 13:44:41 -07:00
Bagatur
1418d3af00 docs: link to langsmith+langgraph docs (#21930) 2024-05-20 13:05:22 -07:00
ccurme
e8bdf245eb update maintainers (#21305) 2024-05-20 19:07:53 +00:00
ccurme
4470d3b4a0 partners: bump core in packages implementing ls_params (#21868)
These packages all import `LangSmithParams` which was released in
langchain-core==0.2.0.

N.B. we will need to release `openai` and then bump `langchain-openai`
in `together` and `upstage`.
2024-05-20 11:51:43 -07:00
junefish
0614a53d9c docs: update notebook for latest Pinecone API + serverless (#21921)
Thank you for contributing to LangChain!

- [x] **PR title**: "docs: update notebook for latest Pinecone API +
serverless"


- [x] **PR message**: Published notebook is incompatible with latest
`pinecone-client` and not runnable. Updated for use with latest Pinecone
Python SDK. Also updated to be compatible with serverless indexes (only
index type available on Pinecone free tier).


- [x] **Add tests and docs**: N/A (tested in Colab)


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.


---
- To see the specific tasks where the Asana app for GitHub is being
used, see below:
  - https://app.asana.com/0/0/1207328087952499
2024-05-20 11:51:03 -07:00
ccurme
9c76739425 mistral: implement ls_params (#21867) 2024-05-20 11:49:48 -07:00
junefish
68a90e2252 docs: update notebook for new Pinecone API + serverless (#21923)
Thank you for contributing to LangChain!

- [x] **PR title**: "docs: update notebook for new Pinecone API +
serverless"


- [x] **PR message**: The published notebook is not runnable after
`pinecone-client` v2, which is deprecated. `langchain-pinecone` is not
compatible with the latest `pinecone-client` (v4), so I hardcoded it to
the last v3. Also updated for serverless indexes (only index type
available on Pinecone free plan).


- [x] **Add tests and docs**: N/A (tested in Colab)


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.


---
- To see the specific tasks where the Asana app for GitHub is being
used, see below:
  - https://app.asana.com/0/0/1207328087952500
2024-05-20 11:48:55 -07:00
Eugene Yurtsev
8ed2ba9301 docs: migrate integrations using langchain-cli (#21929)
Migrate integration docs
2024-05-20 18:14:49 +00:00
Eugene Yurtsev
c98bd8505f docs: migrate tutorials using langchain-cli migrate (#21928)
Migrate tutorials
2024-05-20 13:45:35 -04:00
Eugene Yurtsev
b2f58d37db docs: run migration script against how-to docs (#21927)
Upgrade imports in how-to docs
2024-05-20 17:32:59 +00:00
Tomaz Bratanic
d85e46321a community[patch]: Better error message for neo4j vector when text is null (#21861) 2024-05-20 10:25:58 -07:00
Stefano Lottini
f2e75f9500 cli[minor]: fix import path for two Astra DB classes in the migration json data (#21926)
This PR fixes two mistakes in the import paths from community for the
json data aiding the cli migration to 0.2.

It is intended as a quick follow-up to
https://github.com/langchain-ai/langchain/pull/21913 .

@nicoloboschi FYI
2024-05-20 12:25:10 -04:00
WilliamEspegren
30bca57aae doc list not empty (#21208)
Make sure the doc list is not empty, and set Metadata: true in param, to
enable the user to disable metadata for slightly faster crawls.
2024-05-20 08:24:06 -07:00
David Charles
8da35fba7f langchain[minor]: add libs/partners to dev.Dockerfile (#21902)
Resolves #21886 by adding "COPY libs/partners ../partners/" to
libs/dev.Dockerfile

Twitter: @kabakongo
2024-05-20 15:20:56 +00:00
Eugene Yurtsev
8530bbac2d docs: update how to install (#21920)
Fix installation instructions in how-to install
2024-05-20 15:14:20 +00:00
TJ
8cd6ed3e1e community[patch]: Update documentation string in databricks chat model (#21915)
Update typos in documentation string in databricks chat model
2024-05-20 14:33:57 +00:00
Maxime Perrin
5ae982145e docs: fix wrong langchain-cli migration commands (#21906)
Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-05-20 10:29:50 -04:00
Nicolò Boschi
dd00aac7ad cli[minor]: add astradb in the cli migration to 0.2 (#21913)
astradb has a new partner package but the automatic migration cli tool
doesn't take care of migration astradb integrations
2024-05-20 10:29:17 -04:00
Jacob Lee
242eeb537f docs[patch]: Adds callback docs (#21889)
@efriis @hwchase17
2024-05-19 21:57:33 -07:00
Jacob Lee
da4fef8131 docs[patch]: Update 0.2 banner copy (#21888)
@nfcampos
2024-05-19 17:21:02 -07:00
Coozywana
b6c8b6f944 Fix base.py typo (#21862)
ChatOpenaAI --> ChatOpenAI

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-18 13:05:02 +00:00
fzowl
d3624eaba1 partners: Remove unnecessary print from voyageai embeddings (#21865)
Thank you for contributing to LangChain!

Remove unnecessary print from voyageai embeddings

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-18 08:57:17 -04:00
Eugene Yurtsev
61ebe7991c docs: how to remove conversion to openai function from index (#21836)
- bind_tools interface is a better alternative.
- openai doesn't use functions but tools in its API now.
- the underlying content appears in some redirects, so will need to
investigate if we can remove.
2024-05-17 23:00:07 -04:00
Eugene Yurtsev
0812723789 docs: how to tools human in the loop (#21858)
Update information in how to guide tools human in the loop.
2024-05-17 22:59:51 -04:00
Eugene Yurtsev
875230d5bc docs: how-to index page fix minor typo (#21859)
Fix typo
2024-05-17 22:45:47 -04:00
Bagatur
8b3c5f93f5 docs: lcel how to and cheatsheet (#21851) 2024-05-17 19:04:45 -07:00
Erick Friis
c3caec5aaf docs: update announcement bar (#21854) 2024-05-18 00:35:07 +00:00
Jacob Lee
0180716a95 docs[patch]: Remove padding from first sidebar link (#21852)
CC @efriis
2024-05-17 17:09:58 -07:00
Nuno Campos
b1e7b40b6a core: Tap output of sync iterators for astream_events (#21842)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-17 16:57:41 -07:00
Erick Friis
9a39f92aba docs: v0.2 version sidebar (#21844)
![image](https://github.com/langchain-ai/langchain/assets/9557659/189f2e04-0c08-4395-b729-f48982c6f53b)
2024-05-17 23:45:51 +00:00
Max Jakob
e6b7a1769b docs: update Elasticsearch strategy names (#21530)
Update documentation with the [new names for retrieval
strategies](https://github.com/langchain-ai/langchain-elastic/pull/22)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-17 23:21:46 +00:00
Erick Friis
cdc8e2d0c2 docs: resolve local links script escape (#21840)
Fixing warnings. Needs to be propagated to 0.1 branch if this works.

![Screenshot 2024-05-17 at 2 34
15 PM](https://github.com/langchain-ai/langchain/assets/9557659/e6ac95a9-5686-4747-9ab8-4cb49942dc8d)
2024-05-17 22:59:27 +00:00
Erick Friis
d02380c504 docs: remove postgres from docs build (#21847) 2024-05-17 15:36:35 -07:00
Eugene Yurtsev
67b6f6c82a core[patch]: Check if event loop is closed in memory stream (#21841)
Check if event stream is closed in memory loop.

Using try/except here to avoid race condition, but this may incur a
small overhead in versions prios to 3.11
2024-05-17 21:53:59 +00:00
Erick Friis
d8f89a5e9b docs: fix vercel core dep 2 (#21839) 2024-05-17 14:24:25 -07:00
Erick Friis
5285336cb1 docs: fix vercel core dep (#21837) 2024-05-17 14:18:57 -07:00
Erick Friis
2d3f4e1a16 experimental: release 0.0.59 (#21835) 2024-05-17 21:02:45 +00:00
Erick Friis
169f525cfb community: release 0.2.0 (#21834) 2024-05-17 13:49:29 -07:00
Eugene Yurtsev
2656bfe941 docs: how to guide tool calling using prompts (#21827)
Update tool calling using prompts.

- Add required concepts
- Update names of tool invoking function.
- Add doc-string to function, and add information about `config` (which
users often forget)
- Remove steps that show how to use single function only. This makes the
how-to guide a bit shorter and more to the point.
- Add diagram from another how-to guide that shows how the thing works
overall.
2024-05-17 16:46:59 -04:00
Erick Friis
e5046cbd72 langchain: release 0.2.0, fix min deps (#21833) 2024-05-17 13:40:51 -07:00
Erick Friis
1b555021f7 text-splitters: release 0.2.0 (#21832) 2024-05-17 13:30:54 -07:00
Erick Friis
0ad8de5eb7 langchain: release 0.2.0 (#21831) 2024-05-17 13:18:31 -07:00
Eugene Yurtsev
33dbad02fe docs: update how-to for built in tools and toolkits (#21828)
Fix some typos
2024-05-17 16:05:28 -04:00
Erick Friis
23310626b3 core: release 0.2.0 (#21829) 2024-05-17 13:04:39 -07:00
Eugene Yurtsev
e3f30b4cde docs: clean up link to bing search (#21825)
Documentation should be inlined, not linking to medium article.
2024-05-17 19:06:56 +00:00
Eugene Yurtsev
22d9aed508 docs: how to tools, merge built in tools and toolkits (#21824)
* Rename tools to built in tools
* Merge built in tools and toolkits
* Update links from providers
2024-05-17 14:35:57 -04:00
Leonid Ganeline
c4508ca7ef docs: arXiv references page (#21450)
Since the LangChain based on many research papers, the LC documentation
has several references to the arXiv papers. It would be beneficial to
create a single page with all referenced papers.
PR:
1. Developed code to search the arXiv references in the LangChain
Documentation and the LangChain code base. Those references are included
in a newly generated documentation page.
2. Page is linked to the Docs menu.

Controversial:
1. The `arxiv_references` page is automatically generated. But this
generation now started only manually. It is not included in the doc
generation scripts. The reason for this is simple. I don't want to
mangle into the current documentation refactoring. If you think, we need
to regenerate this page in each build, let me know. Note: This script
has a dependency on the `arxiv` package.
2. The link for this page in the menu is not obvious.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-05-17 18:28:57 +00:00
ccurme
181dfef118 core, standard tests, partner packages: add test for model params (#21677)
1. Adds `.get_ls_params` to BaseChatModel which returns
```python
class LangSmithParams(TypedDict, total=False):
    ls_provider: str
    ls_model_name: str
    ls_model_type: Literal["chat"]
    ls_temperature: Optional[float]
    ls_max_tokens: Optional[int]
    ls_stop: Optional[List[str]]
```
by default it will only return
```python
{ls_model_type="chat", ls_stop=stop}
```

2. Add these params to inheritable metadata in
`CallbackManager.configure`

3. Implement `.get_ls_params` and populate all params for Anthropic +
all subclasses of BaseChatOpenAI

Sample trace:
https://smith.langchain.com/public/d2962673-4c83-47c7-b51e-61d07aaffb1b/r

**OpenAI**:
<img width="984" alt="Screenshot 2024-05-17 at 10 03 35 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/2ef41f74-a9df-4e0e-905d-da74fa82a910">

**Anthropic**:
<img width="978" alt="Screenshot 2024-05-17 at 10 06 07 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/39701c9f-7da5-4f1a-ab14-84e9169d63e7">

**Mistral** (and all others for which params are not yet populated):
<img width="977" alt="Screenshot 2024-05-17 at 10 08 43 AM"
src="https://github.com/langchain-ai/langchain/assets/26529506/37d7d894-fec2-4300-986f-49a5f0191b03">
2024-05-17 13:51:26 -04:00
Eugene Yurtsev
4ca2149b70 docs: Remove duplicated content from how to tools (#21821)
Content is duplicated, and is covered in how to use chat models.
2024-05-17 17:30:43 +00:00
Matthew Koski
e59afe292d langchain: Fixing import in docs per https://github.com/langchain-ai/langchain/issues/21814 (#21815)
Description: The example in the How-To guide had an import which did not
work. I changed it to use an import from langchain_core.

Issue: https://github.com/langchain-ai/langchain/issues/21814
2024-05-17 17:19:57 +00:00
Sen Lin
eb7f07ae36 community[patch]: fix typo in ValueError message in load_local function (#21818)
**Description:**
Corrected an error in the `allow_dangerous_deserialization` message
within the `load_local` functions
2024-05-17 17:19:04 +00:00
Jorge Piedrahita Ortiz
700b1c7212 community: sambaverse api update (#21816)
- **Description:** fix sambaverse integration to make it compatible with
sambaverse API update / minor changes in docs
2024-05-17 10:18:08 -07:00
Erick Friis
7976fb1663 docs: cookbook redirect (#21822) 2024-05-17 17:07:30 +00:00
maang-h
9f8d18c028 community[patch]: Fix unintended newline in print statement in exception for BaichuanTextEmbeddings (#21820)
- **Code:** langchain_community/embeddings/baichuan.py:82
- **Description:** When I make an error using 'baichuan embeddings', the
printed error message is wrapped (there is actually no need to wrap)
```python
# example
from langchain_community.embeddings import BaichuanTextEmbeddings

# error key
BAICHUAN_API_KEY = "sk-xxxxxxxxxxxxx"
embeddings = BaichuanTextEmbeddings(baichuan_api_key=BAICHUAN_API_KEY)

text_1 = "今天天气不错"
query_result = embeddings.embed_query(text_1)
```



![unintended
newline](https://github.com/langchain-ai/langchain/assets/55082429/e1178ce8-62bb-405d-a4af-e3b28eabc158)
2024-05-17 16:38:38 +00:00
Eugene Yurtsev
aa648298ae docs: minor updates to migration docs (#21819)
Minor aesthetic updates to migration docs
2024-05-17 12:28:56 -04:00
Eugene Yurtsev
fc644c0e1c docs: Update v0.2 information (#21796)
Update information about v0.2 upgrade
2024-05-17 11:43:58 -04:00
Bakar Tavadze
3b5ac44e03 langchain-robocorp[minor]: Enable passing additional headers to the action server. (#21809)
Actions can optionally receive secrets via request headers. This PR
enables this functionality.
2024-05-17 15:08:48 +00:00
Erick Friis
09919c2cd5 docs: version dropdown (#21784) 2024-05-16 17:01:34 -07:00
Chad Juliano
685c13e157 docs: fix errors and table formatting in notebook (#21696)
There are 2 issues fixed here:

* In the notebook pandas dataframes are formatted as HTML in the cells.
On the documentation site the renderer that converts notebooks
incorrectly displays the raw HTML. I can't find any examples of where
this is working and so I am formatting the dataframes as text.

* Some incorrect table names were referenced resulting in errors.
2024-05-16 16:00:14 -07:00
Asaf Joseph Gardin
f3289b898c partners: Revert AI21 Labs docs scan feature (#21699)
Description: Reverted commit #21614

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-16 22:58:40 +00:00
github-user-en
ec8d406441 Made a grammatical correction in streaming.ipynb (#21707)
The only change is replacing the word "operators" with "operates," to
make the sentence grammatically correct.

Thank you for contributing to LangChain!

- [x] **PR title**: "docs: Made a grammatical correction in
streaming.ipynb to use the word "operates" instead of the word
"operators""


- [x] **PR message**: 
- **Description:** The use of the word "operators" was incorrect, given
the context and grammar of the sentence. This PR updates the
documentation to use the word "operates" instead of the word
"operators".
    - **Issue:** Makes the documentation more easily understandable.
    - **Dependencies:** -no dependencies-
    - **Twitter handle:** --


- [x] **Add tests and docs**: Since no new integration is being made, no
new tests/example notebooks are required.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
    - **No formatting changes made to the documentation**

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-16 22:47:40 +00:00
Brace Sproul
6febb283f6 docs[minor]: Hide prev/next buttons on docs in how to / tutorials (#21789)
These buttons don't navigate to the proper prev/next page. Hide in those
pages
2024-05-16 15:35:17 -07:00
Eugene Yurtsev
8607735b80 langchain[patch],community[patch]: Move unit tests that depend on community to community (#21685) 2024-05-16 17:24:27 -04:00
Eugene Yurtsev
97a4ae50d2 How To: Custom tools (#21725)
- Remove double implementations of functions. The single input is just
taking up space.
- Added tool specific information for `async + showing invoke vs.
ainvoke.
- Added more general information about about `async` (this should live
in a different place eventually since it's not specific to tools).
- Changed ordering of custom tools (StructuredTool is simpler and should
appear before the inheritance)
- Improved the error handling section (not convinced it should be here
though)
2024-05-16 21:06:33 +00:00
Bagatur
1cf80a5956 docs: link runnable api (#21783) 2024-05-16 20:49:37 +00:00
Bagatur
aee3842a21 docs: intro nit (#21785) 2024-05-16 13:46:11 -07:00
Marco Lamina
d0fae6cd54 community: Add token cost for GPT-4o model (#21771)
Adding [token cost for the new GPT-4o
model](https://openai.com/api/pricing/):
* Input cost US$5.00 / 1M tokens
* Output cost US$15.00 / 1M tokens
2024-05-16 20:36:23 +00:00
Bagatur
4231cf0696 docs: update chat feat table (#21778) 2024-05-16 12:58:51 -07:00
Massimiliano Pronesti
0c0db7c5db feat(community): support semantic hybrid score threshold in Azure AI Search (#21527)
Support semantic hybrid search with a score threshold -- similar to what
we do for similarity search and for hybrid search (#20907).
2024-05-16 15:54:32 -04:00
Erick Friis
5e445a7e4e docs: dont rewrite ipynb links that have double slash (#21775) 2024-05-16 19:06:30 +00:00
Eugene Yurtsev
e3a03b324d docs: concepts -- add information about tool calling models, update tools section (#21760)
- Add information about naitve tool calling capabilities
- Add information about standard langchain interface for tool calling
- Update description for tools

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-05-16 15:03:25 -04:00
Bagatur
6416d16d39 anthropic[patch]: Release 0.1.13, tool_choice support (#21773) 2024-05-16 17:56:29 +00:00
Stefano Lottini
040597e832 community: init signature revision for Cassandra LLM cache classes + small maintenance (#17765)
This PR improves on the `CassandraCache` and `CassandraSemanticCache`
classes, mainly in the constructor signature, and also introduces
several minor improvements around these classes.

### Init signature

A (sigh) breaking change is tentatively introduced to the constructor.
To me, the advantages outweigh the possible discomfort: the new syntax
places the DB-connection objects `session` and `keyspace` later in the
param list, so that they can be given a default value. This is what
enables the pattern of _not_ specifying them, provided one has
previously initialized the Cassandra connection through the versatile
utility method `cassio.init(...)`.

In this way, a much less unwieldy instantiation can be done, such as
`CassandraCache()` and `CassandraSemanticCache(embedding=xyz)`,
everything else falling back to defaults.

A downside is that, compared to the earlier signature, this might turn
out to be breaking for those doing positional instantiation. As a way to
mitigate this problem, this PR typechecks its first argument trying to
detect the legacy usage.
(And to make this point less tricky in the future, most arguments are
left to be keyword-only).

If this is considered too harsh, I'd like guidance on how to further
smoothen this transition. **Our plan is to make the pattern of optional
session/keyspace a standard across all Cassandra classes**, so that a
repeatable strategy would be ideal. A possibility would be to keep
positional arguments for legacy reasons but issue a deprecation warning
if any of them is actually used, to later remove them with 0.2 - please
advise on this point.

### Other changes

- class docstrings: enriched, completely moved to class level, added
note on `cassio.init(...)` pattern, added tiny sample usage code.
- semantic cache: revised terminology to never mention "distance" (it is
in fact a similarity!). Kept the legacy constructor param with a
deprecation warning if used.
- `llm_caching` notebook: uniform flow with the Cassandra and Astra DB
separate cases; better and Cassandra-first description; all imports made
explicit and from community where appropriate.
- cache integration tests moved to community (incl. the imported tools),
env var bugfix for `CASSANDRA_CONTACT_POINTS`.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-16 17:22:24 +00:00
fzowl
8db4a14648 docs: new voyageai text_embeddings model: voyage-large-2-instruct (#21706) 2024-05-16 10:06:22 -07:00
Bagatur
901e09aa30 docs: datacamp course (#21767) 2024-05-16 16:56:32 +00:00
Kyle Cassidy
eca8c4bcc6 Standardized openai init params (#21739)
## Patch Summary
community:openai[patch]: standardize init args

## Details
I made changes to the OpenAI Chat API wrapper test in the Langchain
open-source repository

- **File**: `libs/community/tests/unit_tests/chat_models/test_openai.py`
- **Changes**:
  - Updated `max_retries` with Pydantic Field
  - Updated the corresponding unit test
- **Related Issues**: #20085
  - Updated max_retries with Pydantic Field, updated the unit test.

---------

Co-authored-by: JuHyung Son <sonju0427@gmail.com>
2024-05-16 16:30:52 +00:00
laishzh
c03fd93fc1 docs: Remove unnecessary comment marks from the Makefile help section (#21749)
**Previous screenshot:**
<img width="758" alt="image"
src="https://github.com/langchain-ai/langchain/assets/1683919/7b90626e-35ab-4486-b41d-b664e69eec0b">

**Current:**
<img width="744" alt="image"
src="https://github.com/langchain-ai/langchain/assets/1683919/cdb69512-dc6c-4b7f-a466-4be92d94c076">
2024-05-16 09:05:44 -07:00
Ethan Yang
e44b448ec3 community: update openvino doc with streaming support (#21519)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-16 15:54:45 +00:00
Eugene Yurtsev
7022260bc5 How to: Streaming (#21715)
Update the how to guide on streaming

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-05-16 11:48:11 -04:00
ccurme
19e6bf814b community: fix CI (#21766) 2024-05-16 15:41:03 +00:00
Michael Ozery
dda5a9c97a docs: sql_qa.ipynb tutorial update (#21756)
1. Updated deprecated method usage.
2. Added LangGraph required installation in tutorial.

X: MichaelOzery
2024-05-16 15:23:20 +00:00
Mish Ushakov
d77e60a7f4 community: updated Browserbase loader (#21757)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: updated Browserbase loader"

- [x] **PR message**:
    Updates the Browserbase loader with more options and improved docs.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-05-16 08:21:23 -07:00
Ikko Eltociear Ashimine
1e6517ba73 docs: update sql_large_db.ipynb (#21765)
mispelling -> misspelling
2024-05-16 15:20:55 +00:00
Eugene Yurtsev
6ed0aa3239 core[major]: only use function description (#21622)
Do not prefix function signature

---

* Reason for this is that information is already present with tool
calling models.
* This will save on tokens for those models, and makes it more obvious
what the description is!
* The @tool can get more parameters to allow a user to re-introduce the
the signature if we want
2024-05-16 11:17:53 -04:00
William FH
8498b41cda Finish agent migration doc (#21731) 2024-05-16 14:43:19 +00:00
Cheese
0ead09f84d community: Implement bind_tools for ChatTongyi (#20725)
## Description

Implement `bind_tools` in ChatTongyi. Usage example:

```py
from langchain_core.tools import tool
from langchain_community.chat_models.tongyi import ChatTongyi

@tool
def multiply(first_int: int, second_int: int) -> int:
    """Multiply two integers together."""
    return first_int * second_int

llm = ChatTongyi(model="qwen-turbo")

llm_with_tools = llm.bind_tools([multiply])

msg = llm_with_tools.invoke("What's 5 times forty two")

print(msg)
```

Streaming is also supported.

## Dependencies

No Dependency is required for this change.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-16 10:39:35 -04:00
yoogle
b216a1dddb docs: fix monorepo typo (#21761)
### Description
fix monorepo typo. `monorep` -> `monorepo`
2024-05-16 14:15:10 +00:00
Bagatur
347166874f docs: aca-ds nit (#21759) 2024-05-16 13:53:08 +00:00
Bagatur
867adbf27b docs: add aca-ds (#21746) 2024-05-16 08:52:07 +00:00
Bagatur
74f54599f4 docs: aza-ds cookbook (#21747) 2024-05-16 01:27:13 -07:00
Erick Friis
be15740084 fireworks: add secret (#21744) 2024-05-15 19:48:51 -07:00
Erick Friis
06110e20b9 pinecone: bump min core version (#21742) 2024-05-15 19:31:43 -07:00
Erick Friis
bd3e7d50f3 fireworks: bump min core version (#21741) 2024-05-15 19:29:13 -07:00
Erick Friis
1647b28a87 infra: release min version dont clobber current lib (#21740) 2024-05-15 19:27:39 -07:00
Erick Friis
f5c31078d7 airbyte[patch]: airbyte-cdk compatible pydantic versions (#21738) 2024-05-15 19:13:25 -07:00
Erick Friis
3d33b89fa4 ibm[patch]: release 0.1.7 (#21737) 2024-05-15 19:10:15 -07:00
Erick Friis
e41d801369 openai[patch]: fix embedding float precision issue (#21736)
also clean up + comment some of the embedding batching code
2024-05-16 02:06:51 +00:00
JuHyung Son
38c297a025 upstage: Support batch input in embedding request. (#21730)
**Description:** upstage embedding now supports batch input.
2024-05-15 18:13:44 -07:00
junefish
c5a981e3b4 docs: Update Pinecone example notebook with embedded widget (#21719)
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-15 21:20:46 +00:00
Erick Friis
0aea7f4b1d docs: fix installation link (#21728) 2024-05-15 21:10:12 +00:00
Harrison Chase
15be439719 Harrison/move flashrank rerank (#21448)
third party integration, should be in community
2024-05-15 13:08:52 -07:00
Harrison Chase
c6c2649a5a move installation (#21711) 2024-05-15 12:59:45 -07:00
Erick Friis
aca98fd150 multiple: releases with relaxed core dep (#21724) 2024-05-15 19:29:35 +00:00
Bagatur
af284518bc openai[patch]: Release 0.1.7, bump tiktoken 0.7.0 (#21723) 2024-05-15 12:19:29 -07:00
Bagatur
0405933914 docs: add feedback link to 0.2 banner (#21600) 2024-05-15 10:53:48 -07:00
William FH
ca768c8353 [Core] Check is async callable (#21714)
To permit proper coercion of objects like the following:


```python
class MyAsyncCallable:
    async def __call__(self, foo):
        return await ...

class MyAsyncGenerator:
    async def __call__(self, foo):
        await ...
        yield 
```
2024-05-15 10:49:49 -07:00
ccurme
7128c2d8ad docs: add tutorial for vector stores and retrievers (#21683)
also update how-to guide for parent document retriever
2024-05-15 11:50:24 -04:00
Eugene Yurtsev
5c2cfabec6 core[minor]: Add v2 implementation of astream events (#21638)
This PR introduces a v2 implementation of astream events that removes
intermediate abstractions and fixes some issues with v1 implementation.

The v2 implementation significantly reduces relevant code that's
associated with the astream events implementation together with
overhead.

After this PR, the astream events implementation:

- Uses an async callback handler
- No longer relies on BaseTracer
- No longer relies on json patch

As a result of this re-write, a number of issues were discovered with
the existing implementation.

## Changes in V2 vs. V1

### on_chat_model_end `output`

The outputs associated with `on_chat_model_end` changed depending on
whether it was within a chain or not.

As a root level runnable the output was: 

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

As part of a chain the output was:

```
            "data": {
                "output": {
                    "generations": [
                        [
                            {
                                "generation_info": None,
                                "message": AIMessageChunk(
                                    content="hello world!", id=AnyStr()
                                ),
                                "text": "hello world!",
                                "type": "ChatGenerationChunk",
                            }
                        ]
                    ],
                    "llm_output": None,
                }
            },
```

After this PR, we will always use the simpler representation:

```python
"data": {"output": AIMessageChunk(content="hello world!", id='some id')}
```

**NOTE** Non chat models (i.e., regular LLMs) are still associated with
the more verbose format.

### Remove some `_stream` events

`on_retriever_stream` and `on_tool_stream` events were removed -- these
were not real events, but created as an artifact of implementing on top
of astream_log.

The same information is already available in the `x_on_end` events.

### Propagating Names

Names of runnables have been updated to be more consistent

```python
  model = GenericFakeChatModel(messages=infinite_cycle).configurable_fields(
        messages=ConfigurableField(
            id="messages",
            name="Messages",
            description="Messages return by the LLM",
        )
    )
```

Before:
```python
"name": "RunnableConfigurableFields",
```

After:
```python
"name": "GenericFakeChatModel",
```

### on_retriever_end

on_retriever_end will always return `output` which is a list of
documents (rather than a dict containing a key called "documents")

### Retry events

Removed the `on_retry` callback handler. It was incorrectly showing that
the failed function being retried has invoked `on_chain_end`


https://github.com/langchain-ai/langchain/pull/21638/files#diff-e512e3f84daf23029ebcceb11460f1c82056314653673e450a5831147d8cb84dL1394
2024-05-15 11:48:47 -04:00
Rajendra Kadam
54e003268e langchain[minor]: Add PebbloRetrievalQA chain with Identity & Semantic Enforcement support (#20641)
- **Description:** PebbloRetrievalQA chain introduces identity
enforcement using vector-db metadata filtering
- **Dependencies:** None
- **Issue:** None
- **Documentation:** Adding documentation for PebbloRetrievalQA chain in
a separate PR(https://github.com/langchain-ai/langchain/pull/20746)
- **Unit tests:** New unit-tests added

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-05-15 13:14:52 +00:00
Bagatur
f2f970f93d docs: openai bind tools nit (#21692) 2024-05-15 01:20:53 +00:00
Erick Friis
5fa5a73dc0 docs: disable contextual search (#21691) 2024-05-14 16:59:11 -07:00
Erick Friis
3ee0747382 infra: remove prints from notebook build (#21688) 2024-05-14 16:27:56 -07:00
Erick Friis
024c11ff9c docs: v0.2 search index (#21619) 2024-05-14 15:37:42 -07:00
Bagatur
241a6e43a5 docs: update structured how to (#21679) 2024-05-14 22:19:51 +00:00
Jib
f369495fa0 mongodb: [performance] Increase DEFAULT_INSERT_BATCH_SIZE to 100,000 and introduce sizing constraints (#19608) 2024-05-14 22:11:26 +00:00
Eugene Yurtsev
e69a9bedf8 core[patch]: Update mypy config (#21684)
Update mypy config to ignore checking deps from numpy and pytest (which are optional in langsmith sdk)
2024-05-14 17:29:07 -04:00
Erick Friis
9973547aef mongodb: release 0.1.4 (#21678) 2024-05-14 11:54:23 -07:00
Jib
a97473c846 mongodb[patch]: Make ObjectId JSON-serializable on generation (#21394) 2024-05-14 11:52:29 -07:00
ccurme
12b599c47f docs: add how-to on multi-modal tool calling (#21667)
Can move this to a dedicated multi-modal section if desired.
2024-05-14 12:26:25 -04:00
Eugene Yurtsev
5c64c004cc core[patch]: Add unit tests with some streaming scenarios (#21668)
Add unit tests that show differences between sync / async versions when
streaming.

The inner on_chain_chunk event is missing if mixing sync and async
functionality. Likely due to missing tap_output_iter implementation on
the sync variant of `_transform_stream_with_config`
2024-05-14 15:30:57 +00:00
Eugene Yurtsev
2ac4d2960c core[patch]: Add unit test to catch ordering (#21669)
Add unit test to catch ordering issues
2024-05-14 15:25:33 +00:00
ccurme
3390dc2266 docs: style nits (#21666) 2024-05-14 10:18:13 -04:00
ccurme
2463c8060c docs: how-to on adding scores to retriever results (#21626) 2024-05-14 09:41:36 -04:00
Zhao Blake
972d2071c6 core[patch]: Fix typo in VectorStoreExampleSelector doc-string (#21574) 2024-05-14 13:31:37 +00:00
William FH
714cba96a8 [docs] Update langgraph migration guide (#21644)
- add links to references where appropriate
- use the create_react_agent
- Fix the timeout recommendation
2024-05-14 06:13:17 +00:00
Erick Friis
5144c94603 docs: add 0.2 search notice (#21653) 2024-05-14 04:00:18 +00:00
Erick Friis
2a984e8e3f docs: huggingface package (#21645) 2024-05-14 03:17:40 +00:00
Anush
cd1879f5e7 docs: Qdrant partner package reference (#21649)
## Description:
As the title goes.
2024-05-13 19:51:57 -07:00
Erick Friis
c77d2f2b06 multiple: core 0.2 nonbreaking dep, check_diff community->langchain dep (#21646)
0.2 is not a breaking release for core (but it is for langchain and
community)

To keep the core+langchain+community packages in sync at 0.2, we will
relax deps throughout the ecosystem to tolerate `langchain-core` 0.2
2024-05-13 19:50:36 -07:00
Anush
edd68e4ad4 qdrant: init package (#21146)
## Description

This PR introduces the new `langchain-qdrant` partner package, intending
to deprecate the community package.

## Changes

- Moved the Qdrant vector store implementation `/libs/partners/qdrant`
with integration tests.
- The conditional imports of the client library are now regular with
minor implementation improvements.
- Added a deprecation warning to
`langchain_community.vectorstores.qdrant.Qdrant`.
- Replaced references/imports from `langchain_community` with either
`langchain_core` or by moving the definitions to the `langchain_qdrant`
package itself.
- Updated the Qdrant vector store documentation to reflect the changes.

## Testing
- `QDRANT_URL` and
[`QDRANT_API_KEY`](583e36bf6b)
env values need to be set to [run integration
tests](d608c93d1f)
in the [cloud](https://cloud.qdrant.tech).
- If a Qdrant instance is running at `http://localhost:6333`, the
integration tests will use it too.
- By default, tests use an
[`in-memory`](https://github.com/qdrant/qdrant-client?tab=readme-ov-file#local-mode)
instance(Not comprehensive).

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-05-13 18:20:03 -07:00
Erick Friis
fe8c9d621a docs: ignore nb echo:false blocks (#21624)
not working currently
2024-05-13 17:18:26 -07:00
Prashanth Rao
63c3a0e56c [community][graph]: Update KuzuQAChain and docs (#21218)
This PR makes some small updates for `KuzuQAChain` for graph QA.

- Updated Cypher generation prompt (we now support `WHERE EXISTS`) and
generalize it more
- Support different LLMs for Cypher generation and QA
- Update docs and examples
2024-05-13 17:17:14 -07:00
Bagatur
752b1e85f8 docs: gh feedback link (#21606)
Co-authored-by: bracesproul <braceasproul@gmail.com>
2024-05-14 00:11:37 +00:00
Bagatur
506df439eb docs: how to index nits (#21623) 2024-05-13 23:52:50 +00:00
Bagatur
b514a479c0 docs: standardize capitalization (#21641) 2024-05-13 16:25:51 -07:00
Bagatur
89aae3e043 docs: add Techniques to Concepts (#21636)
- Adds Techniques section
- Moves function calling, retrieval types to Techniques
- Removes Installation section (not conceptual)
- Reorders a few things (chat models before llms, package descriptions
before diagram)
- Add text splitter types to Techniques
2024-05-13 16:06:16 -07:00
Tomaz Bratanic
89ff6a3d3b Add sentiment and confidence levels to diffbotgraphtransformer (#21590)
Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-13 23:00:52 +00:00
Bagatur
526ba235f3 docs: fix prereq links (#21630) 2024-05-13 15:40:53 -07:00
Erick Friis
0541e06e21 infra: 0.2 docs 404 page (#21634) 2024-05-13 22:11:28 +00:00
Erick Friis
e861b5bcb7 infra: fix api ref link generation (#21631) 2024-05-13 14:52:26 -07:00
Erick Friis
9b51ca08bc huggingface: fix community dep checking (#21628) 2024-05-13 21:52:18 +00:00
Erick Friis
91a2ea5cd6 chroma, mongodb: fix docstrings (#21629) 2024-05-13 21:27:43 +00:00
Jofthomas
afd85b60fc huggingface: init package (#21097)
First Pr for the langchain_huggingface partner Package

- Moved some of the hugging face related class from `community` to the
new `partner package`

Still needed :
- Documentation
- Tests
- Support for the new apply_chat_template in `ChatHuggingFace`
- Confirm choice of class to support for embeddings witht he
sentence-transformer team.

cc : @efriis

---------

Co-authored-by: Cyril Kondratenko <kkn1993@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-13 20:53:15 +00:00
Tomaz Bratanic
9fce03e7db community[patch]: Fix neo4j enhanced schema (#21582) 2024-05-13 15:26:06 -04:00
Christophe Bornet
66a4da8ad0 community[patch]: Improve Cassandra VectorStore docsctrings (#21620) 2024-05-13 15:24:26 -04:00
adreo00
40aff1eacc core[major]: AsyncCallbackManagerForToolRun no longer casts return object to string (#20374)
- **Description:** Stops `AsyncCallbackManagerForToolRun` from
converting the output to str
- **Issue:** #20372
- **Dependencies:** None
2024-05-13 15:09:12 -04:00
Eugene Yurtsev
25fbe356b4 community[patch]: upgrade to recent version of mypy (#21616)
This PR upgrades community to a recent version of mypy. It inserts type:
ignore on all existing failures.
2024-05-13 14:55:07 -04:00
Eugene Yurtsev
b923951062 langchain[patch]: CI add lint rule for community imports (#21618)
Add a rule to check for imports from community in global scope
2024-05-13 14:51:25 -04:00
Jorge Piedrahita Ortiz
4378fbbef0 community[patch]: Fix typos in Sambanova integration doc-strings (#21617)
- **Description:** Sambanova integration docstrings updated, bad
formated

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-05-13 18:35:16 +00:00
Erick Friis
0f5bf94f9f infra: remove ai21 docs scan features (#21614)
ai21 depends on ai21-tokenizer which depends on too restrictive/old
version of `tokenizers`
2024-05-13 18:05:53 +00:00
ccurme
fe08421207 docs: add hybrid retrieval how-to guide (#21613)
Updating v0.2 docs with
https://github.com/langchain-ai/langchain/pull/21245
2024-05-13 14:03:55 -04:00
Christophe Bornet
bcf53f93e1 [community]: Add missing docstring param to CassandraLoader (#21611) 2024-05-13 16:03:18 +00:00
Christophe Bornet
e6fa4547b1 community[minor]: Add alazy_load to AsyncHtmlLoader (#21536)
Also fixes a bug that `_scrape` was called and was doing a second HTTP
request synchronously.

**Twitter handle:** cbornet_
2024-05-13 12:01:03 -04:00
Leonid Ganeline
4c48732f94 docs: providers updates 1 (#20256)
- Proviers pages: added missed integrations; fixed format
- `mistralai` converted from notebook to .mdx format
2024-05-13 11:54:51 -04:00
ccurme
15cb1133e7 docs: fix path for state_of_the_union sample file (#21609) 2024-05-13 11:46:02 -04:00
Bagatur
83a8fdcfd1 infra: fix local doc make command (#21608) 2024-05-13 08:30:30 -07:00
Eugene Yurtsev
4dc625057e README: Update downloads to show downloads of langchain-core (#21387)
Update downloads to keep track of langchain-core
2024-05-13 11:26:50 -04:00
Wang Guan
b53548dcda langchain[minor]: allow CacheBackedEmbeddings to cache queries (#20073)
Add optional caching of queries to cache backed embeddings
2024-05-13 15:18:04 +00:00
Guangdong Liu
a156aace2b core[patch]:Fix Incorrect listeners parameters for Runnable.with_listeners() and .map() (#20661)
- **Issue:** fix #20509
-  @baskaryan, @eyurtsev


![image](https://github.com/langchain-ai/langchain/assets/48236177/f799a976-b983-4d8b-b373-64392e1fd6c6)
2024-05-13 11:16:17 -04:00
ccurme
b0f5a47f25 docs: update some retrievers how-to guides (#21607) 2024-05-13 11:03:33 -04:00
junkeon
480c02bf55 upstage[minor]: add merge_and_split function for document loader (#21603)
- Introduce the `merge_and_split` function in the
`UpstageLayoutAnalysisLoader`.
- The `merge_and_split` function takes a list of documents and a
splitter as inputs.
- This function merges all documents and then divides them using the
`split_documents` method, which is a proprietary function of the
splitter.
- If the provided splitter is `None` (which is the default setting), the
function will simply merge the documents without splitting them.
2024-05-13 10:55:19 -04:00
Leonid Ganeline
500569da48 community[patch]: vectorstores import update (#21169)
Issue: we have several helper functions to import third-party libraries
like lancedb.import_lancedb in
[community.vectorstores](https://api.python.langchain.com/en/latest/vectorstores/langchain_community.vectorstores.lancedb.import_lancedb.html#langchain_community.vectorstores.lancedb.import_lancedb).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
2024-05-13 10:45:31 -04:00
ccurme
3003363605 langchain, community: remove cap on sqlalchemy and bump duckdb (#21509) 2024-05-13 10:16:09 -04:00
ccurme
01a3228d8e standard tests: add test for few-shot examples (#21019) 2024-05-13 10:06:12 -04:00
David Duong
db22fcb58b docs: style fixes for api reference docs (#21602)
- Make sure the left nav bar is horizontally scrollable 
- Make sure the navigation dropdown is vertically scrollable and height
capped at 80% of viewport height

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-13 06:49:50 -07:00
Chuyuan Qu
af875cff57 prompty: adding Microsoft langchain_prompty package (#21346)
Co-authored-by: Micky Liu <wayliu@microsoft.com>
Co-authored-by: wayliums <wayliums@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-11 04:03:44 +00:00
Erick Friis
56c6b5868b infra: run codespell on v0.1 prs (#21545) 2024-05-10 12:51:42 -07:00
Matt Florence
d3ca2cc8c3 langchain: Fix broken OpenAIModerationChain and implement async (#18537)
Thank you for contributing to LangChain!

## PR title
lancghain[patch]: fix `OpenAIModerationChain` and implement async

## PR message
Description: fix `OpenAIModerationChain` and implement async

Issues: 
- https://github.com/langchain-ai/langchain/issues/18533 
- https://github.com/langchain-ai/langchain/issues/13685

Dependencies: none
Twitter handle: mattflo


## Add tests and docs
 
Existing documentation is broken:
https://python.langchain.com/docs/guides/safety/moderation


- [ x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Emilia Katari <emilia@outpace.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-05-10 19:04:13 +00:00
ccurme
4170e72a42 openai: fix loads unit test (#21542)
following changes to tests in core here:
https://github.com/langchain-ai/langchain/pull/21342/files
2024-05-10 18:46:34 +00:00
ccurme
d3ff9c5d6a infra: turn off fail-fast for standard tests (#21541) 2024-05-10 18:28:57 +00:00
Erick Friis
e8efe8384d docs: announcement bar dark mode 0.2 (#21540) 2024-05-10 10:13:02 -07:00
Erick Friis
64c47224a0 docs: baseUrl for ganalytics, throw on broken links (#21455) 2024-05-10 13:49:59 +00:00
Usama Jamil
913792f5e6 docs: myscale code typo (#21522)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-10 13:33:22 +00:00
Sevin F. Varoglu
85cbc55f86 docs: update OctoAI LLM doc (#21528)
This PR updates OctoAI doc to remove warnings when running the example
code.
2024-05-10 09:31:16 -04:00
Daniel Glogowski
70a79f45d7 docs: update nvidia nbs (#21498) 2024-05-10 04:38:35 -04:00
Eugene Yurtsev
39e9b644b9 docs: Add langchain over time (#21434)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-10 00:34:35 +00:00
Erick Friis
3db85cbb5b community: deps (#21508) 2024-05-09 15:12:34 -07:00
ccurme
9c2828aaa8 docs: add local LLMs page to v0.2 docs (#21493)
Adding this page from v0.1 docs:
https://python.langchain.com/v0.1/docs/guides/development/local_llms/
2024-05-09 17:57:56 -04:00
Erick Friis
8580e350be cli: release 0.0.22 (#21507) 2024-05-09 21:45:20 +00:00
Anthony Chu
c735849e76 azure-dynamic-sessions: add Python REPL tool (#21264)
Adds a Python REPL that executes code in a code interpreter session
using Azure Container Apps dynamic sessions.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-09 21:39:04 +00:00
Erick Friis
02701c277f langchain: core min version (#21506) 2024-05-09 13:45:44 -07:00
ccurme
81ae184cc9 docs: add response metadata page to v0.2 docs (#21489)
Adding this page from v0.1 docs:
https://python.langchain.com/v0.1/docs/modules/model_io/chat/response_metadata/
2024-05-09 16:17:04 -04:00
Erick Friis
13b01104c9 langchain: drop sqlalchemy max, release 0.2.0rc2 (#21504) 2024-05-09 13:12:38 -07:00
ccurme
375f447e58 community: fix builds with min dependencies (#21495) 2024-05-09 13:01:44 -07:00
Erick Friis
2be4b1b2c9 Revert "docs: redirect base slug" (#21499)
Reverts langchain-ai/langchain#21457
2024-05-09 12:20:16 -07:00
Erick Friis
d1fc841b1a docs: redirect base slug (#21457) 2024-05-09 10:52:36 -07:00
Trayan Azarov
ba7d53689c community: Chroma Adding create_collection_if_not_exists flag to Chroma constructor (#21420)
- **Description:** Adds the ability to either `get_or_create` or simply
`get_collection`. This is useful when dealing with read-only Chroma
instances where users are constraint to using `get_collection`. Targeted
at Http/CloudClients mostly.
- **Issue:** chroma-core/chroma#2163
- **Dependencies:** N/A
- **Twitter handle:** `@t_azarov`




| Collection Exists | create_collection_if_not_exists | Outcome | test |

|-------------------|---------------------------------|----------------------------------------------------------------|----------------------------------------------------------|
| True | False | No errors, collection state unchanged |
`test_create_collection_if_not_exist_false_existing` |
| True | True | No errors, collection state unchanged |
`test_create_collection_if_not_exist_true_existing` |
| False | False | Error, `get_collection()` fails |
`test_create_collection_if_not_exist_false_non_existing` |
| False | True | No errors, `get_or_create_collection()` creates the
collection | `test_create_collection_if_not_exist_true_non_existing` |
2024-05-09 11:45:10 -04:00
ccurme
3bb9bec314 bedrock: add unit test for retriever (#21485)
This was implemented in
https://github.com/langchain-ai/langchain/pull/21349 but dropped before
merge.
2024-05-09 11:37:03 -04:00
Renu Rozera
4035a1d234 Add source metadata to bedrock retriever response (#21349)
Thank you for contributing to LangChain!

- [X] **PR title**: "community: Add source metadata to bedrock retriever
response"

- [X] **PR message**: 
- **Description:** Bedrock retrieve API returns extra metadata in the
response which is currently not returned in the retriever response
- **Issue:** The change adds the metadata from bedrock retrieve API
response to the bedrock retriever in a backward compatible way. Renamed
metadata to sourceMetadata as metadata term is being used in the
Document already. This is in sync with what we are doing in llama-index
as well.
    - **Dependencies:** No


- [X] **Add tests and docs**:
  1. Added unit tests
  2. Notebook already exists and does not need any change
3. Response from end to end testing, just to ensure backward
compatibility: `[Document(page_content='Exoplanets.',
metadata={'location': {'s3Location': {'uri':
's3://bucket/file_name.txt'}, 'type': 'S3'}, 'score': 0.46886647,
'source_metadata': {'x-amz-bedrock-kb-source-uri':
's3://bucket/file_name.txt', 'tag': 'space', 'team': 'Nasa', 'year':
1946.0}})]`


- [X] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Piyush Jain <piyushjain@duck.com>
2024-05-09 11:06:22 -04:00
ccurme
9fa17bfabe docs; fix links in v0.2.0 (#21483) 2024-05-09 11:05:17 -04:00
Erick Friis
f178c67ad0 community: release 0.2.0rc1, bump deps (#21470) 2024-05-08 23:32:44 -07:00
William FH
b28be5d407 Pass through Run ID Explicitly (#21469) 2024-05-08 22:20:51 -07:00
Erick Friis
83eecd54fe experimental: 0.2 relax (#21468) 2024-05-08 21:39:42 -07:00
roiperlman
9992beaff9 community: Add arguments to whisper parser (#20378)
**Description:** Added a few additional arguments to the whisper parser,
which can be consumed by the underlying API.
The prompt is especially important to fine-tune transcriptions.

---------

Co-authored-by: Roi Perlman <roi@fivesigmalabs.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-08 17:53:13 -07:00
Erick Friis
5542eacad8 docs: sidebar autogen hidden support (#21454) 2024-05-09 00:23:52 +00:00
Yash
cb31c3611f Ndb enterprise (#21233)
Description: Adds NeuralDBClientVectorStore to the langchain, which is
our enterprise client.

---------

Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
2024-05-08 16:30:58 -07:00
Erick Friis
74044e44a5 docs: useBaseUrl on svg paths (#21446) 2024-05-08 21:55:42 +00:00
Oguz Vuruskaner
5b35f077f9 [community][fix](DeepInfraEmbeddings): Implement chunking for large batches (#21189)
**Description:**
This PR introduces chunking logic to the `DeepInfraEmbeddings` class to
handle large batch sizes without exceeding maximum batch size of the
backend. This enhancement ensures that embedding generation processes
large batches by breaking them down into smaller, manageable chunks,
each conforming to the maximum batch size limit.

**Issue:**
Fixes #21189

**Dependencies:**
No new dependencies introduced.
2024-05-08 14:45:42 -07:00
Sokolov Fedor
f4ddf64faa community: Add MarkdownifyTransformer to langchain_community.document_transformers (#21247)
- Added new document_transformer: MarkdonifyTransformer, that uses
`markdonify` package with customizable options to convert HTML to
Markdown. It's similar to Html2TextTransformer, but has more flexible
options and also I've noticed that sometimes MarkdownifyTransformer
performs better than html2text one, so that's why I use markdownify on
my project.
- Added docs and tests

- Usage:
```python
from langchain_community.document_transformers import MarkdownifyTransformer

markdownify = MarkdownifyTransformer()
docs_transform = markdownify.transform_documents(docs)
```

- Example of better performance on simple task, that I've noticed:
```
<html>
<head><title>Reports on product movement</title></head>
<body>
<p data-block-key="2wst7">The reports on product movement will be useful for forming supplier orders and controlling outcomes.</p>
</body>
```
**Html2TextTransformer**: 
```python
[Document(page_content='The reports on product movement will be useful for forming supplier orders and\ncontrolling outcomes.\n\n')]
# Here we can see 'and\ncontrolling', which has extra '\n' in it
```
**MarkdownifyTranformer**:
```python
[Document(page_content='Reports on product movement\n\nThe reports on product movement will be useful for forming supplier orders and controlling outcomes.')]
```

---------

Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.bbrouter>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Sokolov Fedor <f.sokolov@sokolov-macbook.local>
Co-authored-by: Sokolov Fedor <f.sokolov@192.168.1.6>
2024-05-08 14:45:13 -07:00
Alex JW
d3ce6aad2e community: Instantiate GPT4AllEmbeddings with parameters (#21238)
### GPT4AllEmbeddings parameters
---

**Description:** 
As of right now the **Embed4All** class inside _GPT4AllEmbeddings_ is
instantiated as it's default which leaves no room to customize the
chosen model and it's behavior. Thus:

- GPT4AllEmbeddings can now be instantiated with custom parameters like
a different model that shall be used.

---------

Co-authored-by: AlexJauchWalser <alexander.jauch-walser@knime.com>
2024-05-08 14:44:47 -07:00
Philippe PRADOS
7be68228da community[patch]: Make sql record manager fully compatible with async (#20735)
The `_amake_session()` method does not allow modifying the
`self.session_factory` with
anything other than `async_sessionmaker`. This prohibits advanced uses
of `index()`.

In a RAG architecture, it is necessary to import document chunks.
To keep track of the links between chunks and documents, we can use the
`index()` API.
This API proposes to use an SQL-type record manager.

In a classic use case, using `SQLRecordManager` and a vector database,
it is impossible
to guarantee the consistency of the import. Indeed, if a crash occurs
during the import
(problem with the network, ...)
there is an inconsistency between the SQL database and the vector
database.

With the
[PR](https://github.com/langchain-ai/langchain-postgres/pull/32) we are
proposing for `langchain-postgres`,
it is now possible to guarantee the consistency of the import of chunks
into
a vector database.  It's possible only if the outer session is built
with the connection.

```python
def main():
    db_url = "postgresql+psycopg://postgres:password_postgres@localhost:5432/"
    engine = create_engine(db_url, echo=True)
    embeddings = FakeEmbeddings()
    pgvector:VectorStore = PGVector(
        embeddings=embeddings,
        connection=engine,
    )

    record_manager = SQLRecordManager(
        namespace="namespace",
        engine=engine,
    )
    record_manager.create_schema()

    with engine.connect() as connection:
        session_maker = scoped_session(sessionmaker(bind=connection))
        # NOTE: Update session_factories
        record_manager.session_factory = session_maker
        pgvector.session_maker = session_maker
        with connection.begin():
            loader = CSVLoader(
                    "data/faq/faq.csv",
                    source_column="source",
                    autodetect_encoding=True,
                )
            result = index(
                source_id_key="source",
                docs_source=loader.load()[:1],
                cleanup="incremental",
                vector_store=pgvector,
                record_manager=record_manager,
            )
            print(result)
```
The same thing is possible asynchronously, but a bug in
`sql_record_manager.py`
in `_amake_session()` must first be fixed.

```python
    async def _amake_session(self) -> AsyncGenerator[AsyncSession, None]:
        """Create a session and close it after use."""

        # FIXME: REMOVE if not isinstance(self.session_factory, async_sessionmaker):~~
        if not isinstance(self.engine, AsyncEngine):
            raise AssertionError("This method is not supported for sync engines.")

        async with self.session_factory() as session:
            yield session
``` 

Then, it is possible to do the same thing asynchronously:

```python
async def main():
    db_url = "postgresql+psycopg://postgres:password_postgres@localhost:5432/"
    engine = create_async_engine(db_url, echo=True)
    embeddings = FakeEmbeddings()
    pgvector:VectorStore = PGVector(
        embeddings=embeddings,
        connection=engine,
    )
    record_manager = SQLRecordManager(
        namespace="namespace",
        engine=engine,
        async_mode=True,
    )
    await record_manager.acreate_schema()

    async with engine.connect() as connection:
        session_maker = async_scoped_session(
            async_sessionmaker(bind=connection),
            scopefunc=current_task)
        record_manager.session_factory = session_maker
        pgvector.session_maker = session_maker
        async with connection.begin():
            loader = CSVLoader(
                "data/faq/faq.csv",
                source_column="source",
                autodetect_encoding=True,
            )
            result = await aindex(
                source_id_key="source",
                docs_source=loader.load()[:1],
                cleanup="incremental",
                vector_store=pgvector,
                record_manager=record_manager,
            )
            print(result)


asyncio.run(main())
```

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Sean <sean@upstage.ai>
Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: YISH <mokeyish@hotmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Jason_Chen <820542443@qq.com>
Co-authored-by: Joan Fontanals <joan.fontanals.martinez@jina.ai>
Co-authored-by: Pavlo Paliychuk <pavlo.paliychuk.ca@gmail.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
Co-authored-by: samanhappy <samanhappy@gmail.com>
Co-authored-by: Lei Zhang <zhanglei@apache.org>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: merdan <48309329+merdan-9@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Andres Algaba <andresalgaba@gmail.com>
Co-authored-by: davidefantiniIntel <115252273+davidefantiniIntel@users.noreply.github.com>
Co-authored-by: Jingpan Xiong <71321890+klaus-xiong@users.noreply.github.com>
Co-authored-by: kaka <kaka@zbyte-inc.cloud>
Co-authored-by: jingsi <jingsi@leadincloud.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Shengsheng Huang <shannie.huang@gmail.com>
Co-authored-by: Michael Schock <mjschock@users.noreply.github.com>
Co-authored-by: Anish Chakraborty <anish749@users.noreply.github.com>
Co-authored-by: am-kinetica <85610855+am-kinetica@users.noreply.github.com>
Co-authored-by: Dristy Srivastava <58721149+dristysrivastava@users.noreply.github.com>
Co-authored-by: Matt <matthew.gotteiner@microsoft.com>
Co-authored-by: William FH <13333726+hinthornw@users.noreply.github.com>
2024-05-08 17:31:11 -04:00
Andreas Motl
17e42bbd18 community[patch]: pgvector: Slight refactoring to make code a bit more reusable (#16243)
- **Description:** Improve [pgvector vector store
adapter](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py)
to make it reusable by adapters deriving from that.
  - **Issue:** NA
  - **Dependencies:** NA
  - **References:** https://github.com/crate-workbench/langchain/pull/1
  - **Addressed to:** @eyurtsev, @cbornet


Hi from the CrateDB team,

first of all, thanks a stack for conceiving and maintaining LangChain.
We are currently [preparing a
patch](https://github.com/crate-workbench/langchain/pull/1) for adding
[CrateDB](https://github.com/crate/crate) to the list of community
adapters.

Because CrateDB aims to be compatible with PostgreSQL to some degree,
the vector store subsystem in LangChain derives functionality from the
corresponding implementation for pgvector.

Therefore, in order to make the implementation more reusable, we needed
to rename the private methods `__from` and `__query_collection` to the
less private counterparts `_from` and `_query_collection`, so they can
be overwritten, in order to unlock other adapters deriving from
[pgvector](https://github.com/langchain-ai/langchain/blob/v0.1.1/libs/community/langchain_community/vectorstores/pgvector.py).

With kind regards,
Andreas.
2024-05-08 17:21:30 -04:00
Mehrdad Shokri
f103927b88 bugfix(community): fix Playwright import paths. (#21395)
- **Description:** Fix import class name exporeted from
'playwright.async_api' and 'playwright.sync_api' to match the correct
name in playwright tool. Change import from inline guard_import to
helper function that calls guard_import to make code more readable in
gmail tool. Upgrade playwright version to 1.43.0
- **Issue:** #21354
- **Dependencies:** upgrade playwright version(this is not required for
the bugfix itself, just trying to keep dependencies fresh. I can remove
the playwright version upgrade if you want.)
2024-05-08 14:20:25 -07:00
Shailendra Mishra
aa966b6161 Replaced bind variable in SQL with formatted string for compatibility with sql syntax. (#21439)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-08 13:51:30 -07:00
Eugene Yurtsev
f92006de3c multiple: langchain 0.2 in master (#21191)
0.2rc 

migrations

- [x] Move memory
- [x] Move remaining retrievers
- [x] graph_qa chains
- [x] some dependency from evaluation code potentially on math utils
- [x] Move openapi chain from `langchain.chains.api.openapi` to
`langchain_community.chains.openapi`
- [x] Migrate `langchain.chains.ernie_functions` to
`langchain_community.chains.ernie_functions`
- [x] migrate `langchain/chains/llm_requests.py` to
`langchain_community.chains.llm_requests`
- [x] Moving `langchain_community.cross_enoders.base:BaseCrossEncoder`
->
`langchain_community.retrievers.document_compressors.cross_encoder:BaseCrossEncoder`
(namespace not ideal, but it needs to be moved to `langchain` to avoid
circular deps)
- [x] unit tests langchain -- add pytest.mark.community to some unit
tests that will stay in langchain
- [x] unit tests community -- move unit tests that depend on community
to community
- [x] mv integration tests that depend on community to community
- [x] mypy checks

Other todo

- [x] Make deprecation warnings not noisy (need to use warn deprecated
and check that things are implemented properly)
- [x] Update deprecation messages with timeline for code removal (likely
we actually won't be removing things until 0.4 release) -- will give
people more time to transition their code.
- [ ] Add information to deprecation warning to show users how to
migrate their code base using langchain-cli
- [ ] Remove any unnecessary requirements in langchain (e.g., is
SQLALchemy required?)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-08 16:46:52 -04:00
ccurme
6b392d6d12 robocorp: release 0.0.6 (#21441) 2024-05-08 16:16:24 -04:00
Erick Friis
21d14549a9 docs: v0.2 docs in master (#21438)
current python.langchain.com is building from branch `v0.1`. Iterate on
v0.2 docs here.

---------

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: jacoblee93 <jacoblee93@gmail.com>
Co-authored-by: Leonid Ganeline <leo.gan.57@gmail.com>
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Averi Kitsch <akitsch@google.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Martín Gotelli Ferenaz <martingotelliferenaz@gmail.com>
Co-authored-by: Fayfox <admin@fayfox.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Dawson Bauer <105886620+djbauer2@users.noreply.github.com>
Co-authored-by: Ravindu Somawansa <ravindu.somawansa@gmail.com>
Co-authored-by: Dhruv Chawla <43818888+Dominastorm@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: WeichenXu <weichen.xu@databricks.com>
Co-authored-by: Benito Geordie <89472452+benitoThree@users.noreply.github.com>
Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
Co-authored-by: Sevin F. Varoglu <sfvaroglu@octoml.ai>
Co-authored-by: MacanPN <martin.triska@gmail.com>
Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
Co-authored-by: Hyeongchan Kim <kozistr@gmail.com>
Co-authored-by: sdan <git@sdan.io>
Co-authored-by: Guangdong Liu <liugddx@gmail.com>
Co-authored-by: Rahul Triptahi <rahul.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: pjb157 <84070455+pjb157@users.noreply.github.com>
Co-authored-by: Eun Hye Kim <ehkim1440@gmail.com>
Co-authored-by: kaijietti <43436010+kaijietti@users.noreply.github.com>
Co-authored-by: Pengcheng Liu <pcliu.fd@gmail.com>
Co-authored-by: Tomer Cagan <tomer@tomercagan.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
2024-05-08 12:29:59 -07:00
Tommi Holmgren
ee35b9ba56 langchain-robocorp: remove toolkit return content max length (#21436)
Robocorp (action server) toolkit had a limitation that the content
length returned by the tool was always cut to max 5000 chars. This was
from the time when context windows were much more limited.

This PR removes the limitation. Whatever the underlying tool provides
gets sent back to the agent.

As the robocorp toolkit no longer restricts the content, the implication
is that either the Action (tool) developer or the agent developer needs
to be aware of potentially oversized tool responses. Our point of view
is this should be the agent developer's responsibility, them being in
control of the use case and aware of the context window the LLM has.
2024-05-08 15:05:55 -04:00
JuHyung Son
710e57d779 upstage: deprecate UPSTAGE_DOCUMENT_AI_API_KEY (#21363)
Description: We are merging UPSTAGE_DOCUMENT_AI_API_KEY and
UPSTAGE_API_KEY into one, and only UPSTAGE_API_KEY will be used going
forward. And we changed the base class of ChatUpstage to BaseChatOpenAI.

---------

Co-authored-by: Sean <chosh0615@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-08 18:02:26 +00:00
Erick Friis
6a295d1ec0 upstage: release 0.1.4 (#21432) 2024-05-08 17:57:40 +00:00
Mateusz Szewczyk
7926cc1929 ibm: Fix llm and embeddings "verify" attribute default value (#21429)
Thank you for contributing to LangChain!

- [x] **PR title**: "langchain-ibm: Fix llm and embeddings 'verify'
attribute default value"


- [x] **PR message**: 
    - **Description:** fix default value of "verify" attribute
    - **Dependencies:** `ibm_watsonx_ai`


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-08 17:23:14 +00:00
Kevin Zhang
0715545378 docs: fix typo in text (#21393)
**Description:** The previous text had an unclosed parenthesis, this fix
adds the closing parenthesis
2024-05-08 15:58:15 +00:00
Dobiichi-Origami
5b00885b49 community: add bind_tools and with_structured_output support to QianfanChatEndpoint (#21412)
…Endpoint`

Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** add `bind_tools` and `with_structured_output` support
to `QianfanChatEndpoint`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-05-08 11:35:10 -04:00
Silas Xu
aafaf3e193 The predict_and_parse is deprecated, instead pass an output parser directly to LLMChain. (#20130)
The `predict_and_parse` method is deprecated, instead pass an output
parser directly to LLMChain.

- [x] **PR title**: "langchain: update chain_extract.py"


![image](https://github.com/langchain-ai/langchain/assets/40889019/e950d79f-5a0f-4086-86e9-89f627990fe5)

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-08 09:32:17 -04:00
ccurme
3c31bd0ed0 langchain: update use of predict_and_parse in LLMChainFilter (#21389)
Following https://github.com/langchain-ai/langchain/pull/20130

Removes deprecation warnings in docs here:
https://python.langchain.com/docs/modules/data_connection/retrievers/contextual_compression/

Tested using the same docs notebook + existing integration test.
2024-05-08 09:31:33 -04:00
Tomaz Bratanic
dd70f2f473 Update graph docs (#21414)
Update the deprecated docs and added node properties to graph
construction
2024-05-08 09:05:39 -04:00
Erick Friis
bbdf0f8801 experimental[patch]: core and langchain dep (#21402) 2024-05-07 21:39:34 -07:00
Erick Friis
e4aca0d052 experimental[patch]: release 0.0.58 (#21397) 2024-05-08 03:52:44 +00:00
Erick Friis
893f06b5de infra: rewrite ipynb links to md (#21392) 2024-05-07 23:16:52 +00:00
Hassan El Mghari
225ceedcb6 docs: Add together docs in chat models & update provider docs (#21391)
- Added Together docs in chat models section
- Update Together provider docs to match the LLM & chat models sections

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-07 22:40:57 +00:00
Heidi Steen
af97d58c9e docs: update docs/integrations/retrievers/azure_ai_search.ipynb (#21160)
This is a doc update. It fixes up formatting and product name
references. The example code is updated to use a local built-in text
file.

@mmhangami Please take a look

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-05-07 22:33:46 +00:00
snova-jamesv
ca753e7c15 community: updated performance limitation wording in sambanova.ipynb (#21390)
- **Description:** updated performance limitation wording in
sambanova.ipynb
    - **Issue:** NA
    - **Dependencies:** NA
    - **Twitter handle:** NA
2024-05-07 22:21:46 +00:00
Leonid Ganeline
791d59a2c8 community: callbacks guard_imports (#21173)
Issue: we have several helper functions to import third-party libraries
like import_uptrain in
[community.callbacks](https://api.python.langchain.com/en/latest/callbacks/langchain_community.callbacks.uptrain_callback.import_uptrain.html#langchain_community.callbacks.uptrain_callback.import_uptrain).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
2024-05-07 15:04:54 -07:00
Hassan El Mghari
416549bed2 docs: Updated Together integration docs (#21388)
**Description:** Updated the together integration docs by leading with
the streaming example, explicitly specifying a model to show users how
to do that, and updating the sections to more closely match other
integrations.
2024-05-07 21:51:42 +00:00
Rahul Triptahi
7994cba18d [Community][Minor]: Fetch loader_source of GoogleDriveLoader in PebbloSafeLoader. (#21314)
Description: This PR includes fix for loader_source to be fetched from
metadata in case of GdriveLoaders.
Documentation: NA
Unit Test: NA

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-05-07 14:45:58 -07:00
Leonid Ganeline
7cbf1c31aa docs: table legend updated (#21351)
Compacted the table column legends. Added links. Similar to #21259
2024-05-07 14:45:04 -07:00
Erick Friis
d5bde4fa91 infra: use nbconvert for docs build (#21135)
todo

- [x] remove quarto build semantics
- [x] remove quarto download/install
- [x] make `uv` not verbose
2024-05-07 12:30:17 -07:00
Nuno Campos
ad0f3c14c2 core: allow mermaid node labels to have any characters (#21385)
- it's only node ids that are limited

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-07 12:16:49 -07:00
Eugene Yurtsev
6a1d61dbf1 community[patch]: Fix in memory vectorstore to take into account ids when adding docs (#21384)
Should respect `ids` if passed
2024-05-07 15:05:16 -04:00
Ikko Eltociear Ashimine
80170da6c5 docs: update cassandra_database.ipynb (#21145)
Enviroment -> Environment
2024-05-07 15:00:24 -04:00
Miroslav
04e2611fea Added additional headers for HuggingFaceInferenceAPIEmbeddings endpoint. (#21282)
Thank you for contributing to LangChain!

- [ ] **HuggingFaceInferenceAPIEmbeddings**: "Additional Headers"
  - Where: langchain, community, embeddings. huggingface.py.
- Community: add additional headers when needed by custom HuggingFace
TEI embedding endpoints. HuggingFaceInferenceAPIEmbeddings"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Adding the `additional_headers` to be passed to
requests library if needed
    - **Dependencies:** none
 

- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. Tested with locally available TEI endpoints with and without
`additional_headers`
  2. Example  Usage
  
```python
embeddings=HuggingFaceInferenceAPIEmbeddings(
                             api_key=MY_CUSTOM_API_KEY,
                             api_url=MY_CUSTOM_TEI_URL,
                             additional_headers={
                                "Content-Type": "application/json"
                               }
)
```

 

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-05-07 14:17:53 -04:00
Ikko Eltociear Ashimine
c34419e200 docs: update quick_start.ipynb (#21358)
initalize -> initialize



- [x] **PR title**: "package: description"
2024-05-07 08:44:48 -07:00
Guangdong Liu
1fe66f5d39 community(patch) fix MoonshotChat moonshot_api_key is invaild for api key (#21361)
Description: close
https://github.com/langchain-ai/langchain/issues/21237
@baskaryan, @eyurtsev
2024-05-07 08:44:30 -07:00
snova-jamesv
c2ed484653 community: add Sambaverse rate limitation info to sambanova.ipynb (#21379)
- **Description:** add Sambaverse rate limitation info to
sambanova.ipynb
    - **Issue:** NA
    - **Dependencies:** NA
2024-05-07 15:42:44 +00:00
Tomaz Bratanic
0bf7596839 Add simple node properties to llm graph transformer (#21369)
Add support for simple node properties in llm graph transformer.

Linter and dynamic pydantic classes aren't friends, hence I added two
ignores
2024-05-07 08:41:09 -07:00
ccurme
080af0ec53 langchain: sync -> async methods in OpenAI assistants (#21378) 2024-05-07 10:25:55 -04:00
Tomaz Bratanic
ad3fd44a7f experimental: Fix llm graph transformer bug (#21362) 2024-05-06 23:59:55 -07:00
Erick Friis
bb81ae5c8c together: fix chat model and embedding classes (#21353) 2024-05-06 18:26:03 -07:00
Hassan El Mghari
d6ef5fe86a together: add chat models, use openai base (#21337)
**Description:** Adding chat completions to the Together AI package,
which is our most popular API. Also staying backwards compatible with
the old API so folks can continue to use the completions API as well.
Also moved the embedding API to use the OpenAI library to standardize it
further.

**Twitter handle:** @nutlope

- [x] **Add tests and docs**: If you're adding a new integration, please
include
- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-06 17:47:06 -07:00
Jacob Lee
a2d31307bb Adds confirmation logs after creating a new project (#12618)
@efriis @hwchase17

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-06 23:28:12 +00:00
Erick Friis
0fb93cd740 core: release 0.1.52 (#21350) 2024-05-06 22:20:35 +00:00
Wu Enze
32c61b3ece community[patch]: chat message history mypy fixes #17048 (#20114)
Relates [#17048]
Description : Applied fix to redis and neo4j file.

Error was : `Cannot override writeable attribute with read-only
property`

fix with the same solution of
[[langchain/libs/community/langchain_community/chat_message_histories/elasticsearch.py](d5c412b0a9/libs/community/langchain_community/chat_message_histories/elasticsearch.py (L170-L175))]

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-06 22:17:45 +00:00
nrpd25
95cc8e3fc3 premai[patch]:Standardized model init args (#21308)
[Standardized model init args
#20085](https://github.com/langchain-ai/langchain/issues/20085)
- Enable premai chat model to be initialized with `model_name` as an
alias for `model`, `api_key` as an alias for `premai_api_key`.
- Add initialization test `test_premai_initialization`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-06 18:12:29 -04:00
Nuno Campos
6f17158606 fix: core: Include in json output also fields set outside the constructor (#21342) 2024-05-06 14:37:36 -07:00
Tomaz Bratanic
ac14f171ac Add indexed properties to neo4j enhanced schema (#21335) 2024-05-06 14:28:34 -07:00
scaserini
a6cdf6572f community: add Kendra DocumentRelevanceOverrideConfigurations request parameter (#20695)
- **Description:** add **DocumentRelevanceOverrideConfigurations**
request parameter to Kendra retriever

Co-authored-by: Simone Caserini <simone.caserini@klarna.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-06 14:26:36 -07:00
Nuno Campos
0345bcf4ef Fix failing test for serialization (#21344)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-06 21:19:54 +00:00
Trayan Azarov
93226b1945 community: Updated Chroma version range to include 0.5.0 release (#21224)
- Updated Chroma version range to allow releases in 0.5.x.
- Bumped mypy version as linting was failing
2024-05-06 13:31:40 -07:00
Jorge Piedrahita Ortiz
e65652c3e8 community: add SambaNova embeddings integration (#21227)
- **Description:**  SambaNova hosted embeddings integration
2024-05-06 13:29:59 -07:00
Jorge Piedrahita Ortiz
df1c10260c community: minor changes sambanova integration (#21231)
- **Description:** fix: variable names in root validator not allowing
pass credentials as named parameters in llm instancing, also added
sambanova's sambaverse and sambastudio llms to __init__.py for module
import
2024-05-06 13:28:35 -07:00
Jan Soubusta
d9a61c0fa9 fix: respect table_name argument when calling from_texts (#21252)
valid for from_documents() as well

fixes #21251
2024-05-06 20:28:22 +00:00
Pedro Lima
bebf46c4a2 community: added args_schema to YahooFinanceNewsTool (#21232)
Description: this change adds args_schema (pydantic BaseModel) to
YahooFinanceNewsTool for correct schema formatting on LLM function calls

Issue: currently using YahooFinanceNewsTool with OpenAI function calling
returns the following error "TypeError("YahooFinanceNewsTool._run() got
an unexpected keyword argument '__arg1'")". This happens because the
schema sent to the LLM is "input: "{'__arg1': 'MSFT'}"" while the method
should be called with the "query" parameter.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-06 13:27:54 -07:00
Mark Cusack
060987d755 community[minor]: Add indexing via locality sensitive hashing to the Yellowbrick vector store (#20856)
- **Description:** Add LSH-based indexing to the Yellowbrick vector
store module
- **Twitter handle:** @markcusack

---------

Co-authored-by: markcusack <markcusack@markcusacksmac.lan>
Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-05-06 20:18:02 +00:00
Rashmi Pawar
a2fdabdad2 mark NemoEmbeddings as deprecated (#21239)
The NemoEmbeddings is deprecated, instead use
langchain-nvidia-ai-endpoints NVIDIAEmbeddings interface.

cc: @mattf

---------

Co-authored-by: Daniel Glogowski <167348611+dglogo@users.noreply.github.com>
Co-authored-by: andyjessen <62343929+andyjessen@users.noreply.github.com>
Co-authored-by: Chris Germann <88305668+TAAGECH9@users.noreply.github.com>
Co-authored-by: gere <gere@kapo.zh.ch>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-06 19:44:58 +00:00
Erick Friis
9e4b24a2d6 langchain: release 0.1.18 (#21338) 2024-05-06 19:39:46 +00:00
Erick Friis
5c000f8d79 community: release 0.0.37 (#21332) 2024-05-06 12:17:42 -07:00
Leonid Ganeline
8c13e8a79b langchain: qa_chain fix (#21279)
Issue: `load_qa_chain` is placed in the __init__.py file. As a result,
it is not listed in the API Reference docs.
BTW `load_qa_chain` is heavily presented in the doc examples, but is
missed in API Ref.
Change: moved code from init.py into a new file. Related: #21266
2024-05-06 14:45:51 -04:00
Erick Friis
7ecf9996f1 community: Revert "community: langkit dependency" (#21333)
Reverts langchain-ai/langchain#21174

Hey team - going to revert this because it doesn't seem necessary for
testing. We should only be adding optional + extended_testing
dependencies for deps that have extended tests.

otherwise it just increases probability of dependency conflicts in the
community lockfile.
2024-05-06 18:44:41 +00:00
Param Singh
fee91d43b7 baichuan[patch]:standardize chat init args (#21298)
Thank you for contributing to LangChain!

community:baichuan[patch]: standardize init args

updated `baichuan_api_key` so that aliased to `api_key`. Added test that
it continues to set the same underlying attribute. Test checks for
`SecretStr`

updated `temperature` with Pydantic Field, added unit test. 

Related to https://github.com/langchain-ai/langchain/issues/20085
2024-05-06 18:33:57 +00:00
Leonid Ganeline
62559b20b3 docs: chains page format (#21259)
Compacted the table column descriptions.
2024-05-06 11:33:38 -07:00
Christophe Bornet
484a009012 community[minor]: Relax constraints on Cassandra VectorStore constructors (#21209)
If Session and/or keyspace are not provided, they are resolved from
cassio's context. So they are not required.
This change is fully backward compatible.
2024-05-06 14:32:32 -04:00
Daniel Glogowski
27e73ebe57 docs: update nvidia docs v2 (#21288)
More doc updates por favor @baskaryan!
2024-05-06 11:29:02 -07:00
Leonid Ganeline
6feddfae88 community: langkit dependency (#21174)
Issue: the `langkit` package is not presented in the `pyproject.toml`
but it is a requirement for the `WhyLabsCallbackHandler`
Change: added `langkit`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-06 18:09:31 +00:00
Erick Friis
811e9cee8b core: release 0.1.51 (#21328) 2024-05-06 10:40:19 -07:00
Pengcheng Liu
144f2821af docs: add example for loading data from LarkSuite wiki. (#21311)
**Description:** Update LarkSuite loader doc to give an example for
loading data from LarkSuite wiki.
**Issue:** None
**Dependencies:** None
**Twitter handle:** None
2024-05-06 09:56:12 -07:00
Mateusz Szewczyk
682d21c3de ibm: Add support for ibm-watsonx-ai new major version (#21313)
Thank you for contributing to LangChain!

- [x] **PR title**: "langchain-ibm: Add support for ibm-watsonx-ai new
major version"


- [x] **PR message**: 
    - **Description:** Add support for ibm-watsonx-ai new major version
    - **Dependencies:** `ibm_watsonx_ai`


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-06 16:48:26 +00:00
Chris Papademetrious
ee6c922c91 langchain[minor]: enhance LocalFileStore to offer update_atime parameter that updates access times on read (#20951)
**Description:**
The `LocalFileStore` class can be used to create an on-disk
`CacheBackedEmbeddings` cache. The number of files in these embeddings
caches can grow to be quite large over time (hundreds of thousands) as
embeddings are computed for new versions of content, but the embeddings
for old/deprecated content are not removed.

A *least-recently-used* (LRU) cache policy could be applied to the
`LocalFileStore` directory to delete cache entries that have not been
referenced for some time:

```bash
# delete files that have not been accessed in the last 90 days
find embeddings_cache_dir/ -atime 90 -print0 | xargs -0 rm
```

However, most filesystems in enterprise environments disable access time
modification on read to improve performance. As a result, the access
times of these cache entry files are not updated when their values are
read.

To resolve this, this pull request updates the `LocalFileStore`
constructor to offer an `update_atime` parameter that causes access
times to be updated when a cache entry is read.

For example,

```python
file_store = LocalFileStore(temp_dir, update_atime=True)
```

The default is `False`, which retains the original behavior.

**Testing:**
I updated the LocalFileStore unit tests to test the access time update.
2024-05-06 11:52:29 -04:00
Tomaz Bratanic
5b6d1a907d Add the extract types to diffbot graph transformer (#21315)
Before you could only extract triples (diffbot calls it facts) from
diffbot to avoid isolated nodes. However, sometimes isolated nodes can
still be useful like for prefiltering, so we want to allow users to
extract them if they want. Default behaviour is unchanged.
2024-05-06 09:19:52 -04:00
Jagadish Krishnamoorthy
c038991590 docs: Update pandas.ipynb (#21289)
Remove the redundant comment.
2024-05-05 20:22:17 +00:00
aditya thomas
b868c78a12 partners[anthropic]: update unit test for key passed in from the environment (#21290)
**Description:** Update unit test for ChatAnthropic
**Issue:** Test for key passed in from the environment should not have
the key initialized in the constructor
**Dependencies:** None
2024-05-05 16:19:10 -04:00
tanersekmen
d310f9c71e docs:update code structure (#21302)
update the structure of llm_chain variables

Co-authored-by: tanersemenn <0418>
2024-05-05 17:18:15 +00:00
Christophe Bornet
ba9dc04ffa docs: Add doc for hybrid search (#21245)
See
[preview](https://langchain-git-fork-cbornet-doc-hybrid-search-langchain.vercel.app/docs/use_cases/question_answering/hybrid/)

In the model of [per user
retrieval](https://python.langchain.com/docs/use_cases/question_answering/per_user/)
2024-05-04 08:22:56 -04:00
Rohan Aggarwal
8021d2a2ab community[minor]: Oraclevs integration (#21123)
Thank you for contributing to LangChain!

- Oracle AI Vector Search 
Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.


- Oracle AI Vector Search is designed for Artificial Intelligence (AI)
workloads that allows you to query data based on semantics, rather than
keywords. One of the biggest benefit of Oracle AI Vector Search is that
semantic search on unstructured data can be combined with relational
search on business data in one single system. This is not only powerful
but also significantly more effective because you don't need to add a
specialized vector database, eliminating the pain of data fragmentation
between multiple systems.
This Pull Requests Adds the following functionalities
Oracle AI Vector Search : Vector Store
Oracle AI Vector Search : Document Loader
Oracle AI Vector Search : Document Splitter
Oracle AI Vector Search : Summary
Oracle AI Vector Search : Oracle Embeddings


- We have added unit tests and have our own local unit test suite which
verifies all the code is correct. We have made sure to add guides for
each of the components and one end to end guide that shows how the
entire thing runs.


- We have made sure that make format and make lint run clean.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: skmishraoracle <shailendra.mishra@oracle.com>
Co-authored-by: hroyofc <harichandan.roy@oracle.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-04 03:15:35 +00:00
ccurme
c9e9470c5a langchain: fix deprecation decorators on extraction chains (#21276)
Calling any of these raises
```
ValueError: A pending deprecation cannot have a scheduled removal
```
2024-05-03 18:29:40 -04:00
Wickes Wong
ee1adaacaa langchain[patch]: Fix summary buffer memory with return message flag (#21115)
## Description
Memory return could be set as `str` or `message` by `return_messages`
flag as mentioned in
https://python.langchain.com/docs/modules/memory/#whether-memory-is-a-string-or-a-list-of-messages,
where
`langchain.chains.conversation.memory.ConversationSummaryBufferMemory`
did not implement that.
This commit added `buffer_as_str` and `buffer_as_messages` function, and
`buffer` now affected by `return_messages` flag.

## Example Test Code and Output

```python
# Fix: ConversationSummaryBufferMemory with return_messages flag function
# Test code
from langchain.chains.conversation.memory import ConversationSummaryBufferMemory
from langchain_community.llms.ollama import Ollama

llm = Ollama()

# Create an instance of ConversationSummaryBufferMemory with return_messages set to True
memory = ConversationSummaryBufferMemory(return_messages=True, llm=llm)

# Add user and AI messages to the chat memory
memory.chat_memory.add_user_message("hi!")
memory.chat_memory.add_ai_message("what's up?")

# Print the buffer
print("Buffer:")
print(*map(type, memory.buffer), sep="\n")
print(memory.buffer, "\n")

# Print the buffer as a string
print("Buffer as String:")
print(type(memory.buffer_as_str))
print(memory.buffer_as_str, "\n")

# Print the buffer as messages
print("Buffer as Messages:")
print(*map(type, memory.buffer_as_messages), sep="\n")
print(memory.buffer_as_messages, "\n")

# Print the buffer after setting return_messages to False
memory.return_messages = False
print("Buffer after setting return_messages to False:")
print(type(memory.buffer))
print(memory.buffer, "\n")
```

```plaintext
Buffer:
<class 'langchain_core.messages.human.HumanMessage'>
<class 'langchain_core.messages.ai.AIMessage'>
[HumanMessage(content='hi!'), AIMessage(content="what's up?")] 

Buffer as String:
<class 'str'>
Human: hi!
AI: what's up? 

Buffer as Messages:
<class 'langchain_core.messages.human.HumanMessage'>
<class 'langchain_core.messages.ai.AIMessage'>
[HumanMessage(content='hi!'), AIMessage(content="what's up?")] 

Buffer after setting return_messages to False:
<class 'str'>
Human: hi!
AI: what's up? 
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-03 17:25:09 -04:00
Leonid Ganeline
9639457222 community[patch]: tools imports (#21156)
Issue: we have several helper functions to import third-party libraries
like tools.gmail.utils.import_google in
[community.tools](https://api.python.langchain.com/en/latest/community_api_reference.html#id37).
And we have core.utils.utils.guard_import that works exactly for this
purpose.
The import_<package> functions work inconsistently and rather be private
functions.
Change: replaced these functions with the guard_import function.

Related to #21133
2024-05-03 17:22:45 -04:00
Leonid Ganeline
3ef8b24277 core[patch]: utils.guard_import fix (#21133)
Issues (nit): 
1. `utils.guard_import` prints wrong error message when there is an
import `error.` It prints the whole `module_name` but should be only the
first part as the pip package name. E.i. `langchain_core.utils` -> print
not `langchain-core` but `langchain_core.utils`. Also replace '_' with
'-' in the pip package name.
2. it does not handle the `ModuleNotFoundError` which raised if
`guard_import("wrong_module")`

Fixed issues; added ut-s. Controversial: I've reraised
`ModuleNotFoundError` as `ImportError`, since in case of the error, the
proposed action is the same - we need to install a missed package.
2024-05-03 17:21:36 -04:00
Erick Friis
36c2ca3c8b mistralai: relax tokenizers dep (#21277) 2024-05-03 14:16:22 -07:00
Nuno Campos
6e1e0c7d5c fix: core: draw_mermaid() would create subgroup for edges with same src and tgt (#21275)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-05-03 13:51:08 -07:00
Eugene Yurtsev
26a37dce0a langchain[patch]: Remove jsonpatch from poetry file (#21272)
jsonpatch is only used in langchain-core not in langchain
2024-05-03 15:46:05 -04:00
Eugene Yurtsev
335bd01e45 langchain[patch]: Update deprecation warning (#21268)
Update deprecation warning
2024-05-03 15:31:29 -04:00
Leonid Ganeline
23a05c3986 langchain: summarize chain fix (#21266)
Issue: `load_summarize_chain` is placed in the __init__.py file. As a
result, it doesn't listed in the API Reference docs.
Change: moved code from __init__.py into a new file.
2024-05-03 14:44:39 -04:00
ccurme
6da3d92b42 (all): update removal in deprecation warnings from 0.2 to 0.3 (#21265)
We are pushing out the removal of these to 0.3.

`find . -type f -name "*.py" -exec sed -i ''
's/removal="0\.2/removal="0.3/g' {} +`
2024-05-03 14:29:36 -04:00
Eugene Yurtsev
d6e34f9ee5 langchain[patch]: Improve deprecation warnings (#21262)
* Remove spurious derprecation warning
* Make deprecation warnings consistent with 0.1 namespaces that were announced as deprecated
2024-05-03 13:40:16 -04:00
Eugene Yurtsev
487aff7e46 langchain[patch]: Revert 20794 until 0.2 release (#21257)
PR of 2079 was already released as part of 0.1.17rc.


Issue for 0.2 release:
https://github.com/langchain-ai/langchain/issues/21080
2024-05-03 17:02:48 +00:00
Eugene Yurtsev
ba4a309d98 langchain[patch]: Revert breaking change until 0.2 release (#21256)
Reverts a minor breaking change until 0.2 release
2024-05-03 09:42:27 -07:00
Eugene Yurtsev
66a1e3f083 langchain[patch]: Fix flaky unit test (#21258)
Should sort the results of the import test since it depends on import order
2024-05-03 15:55:46 +00:00
Eugene Yurtsev
0989c48028 langchain[minor]: Re-add deleted ainetwork tool (#21254)
* Adding __init__.py to turn it into a package in community
* Adding proxy imports that assume that langchain_community is optional
2024-05-03 11:39:40 -04:00
Christophe Bornet
2fbe82f5e6 community[minor]: Relax constraints on CassandraChatMessageHistory constructor (#21241) 2024-05-03 10:20:39 -04:00
Chris Germann
3a8d1d8838 Hotfix RetrievalQA Docs: docs: Fix formatting (#21183)
# Newline Characters breaking formatting 

**Description**: 
As you can see in the image below, the formatting in the documentation
is broken. As far as I can see the two added `\n` characters are
breaking the documentation. Therefore I would propose to remove those

![image](https://github.com/langchain-ai/langchain/assets/88305668/23b6e726-71b2-4812-91ea-3e8600683733)

**Dependencies**:
None

**Twitter Handle**
- epu9byj

---------

Co-authored-by: gere <gere@kapo.zh.ch>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-03 12:46:29 +00:00
andyjessen
64e17bd793 docs: Fix comment within "handle long text" example (#21248)
The current doc-string comment is referring to the wrong schema.
2024-05-03 12:36:53 +00:00
Daniel Glogowski
c3d169ab00 docs: Update Nvidia documentation (#21240)
Updating Nvidia docs ahead for 5/15 competition. 

Thanks!
2024-05-03 12:29:03 +00:00
Bagatur
70bde15480 docs: add tool choice to tool calling (#21229) 2024-05-03 03:10:22 -04:00
Bagatur
67a5cc34c6 openai[patch]: Release 0.1.6 (#21236) 2024-05-03 04:10:39 +00:00
Erick Friis
c1eb95b967 core: release 0.1.50 (#21230) 2024-05-02 22:44:18 +00:00
Nuno Campos
47ce8d5a57 core: tracer: remove numeric execution order (#21220)
- this hasn't been used in a long time and requires some additional
bookkeeping i'm going to streamline in the next pr
2024-05-02 15:38:55 -07:00
Bagatur
6ac6158a07 openai[patch]: support tool_choice="required" (#21216)
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-05-02 18:33:25 -04:00
Erick Friis
aa9faa8512 docs: model table keywords, remove tool calling from llm (#21225) 2024-05-02 21:04:29 +00:00
xindoo
c1aa237bc2 langchain: fix syntax error in code comment for create_tool_calling_agent (#21205)
**PR message**:
- **Description:** Corrected a syntax error in the code comments within
the `create_tool_calling_agent` function in the langchain package.
- **Issue:** N/A
- **Dependencies:** No additional dependencies required.
- **Twitter handle:** N/A
2024-05-02 19:17:23 +00:00
ccurme
eb0a2fd53a mistral: release 0.1.6 (#21214) 2024-05-02 13:59:19 -04:00
ccurme
2d77e5e3a1 (standard tests): add test for basic conversation sequence (#21213) 2024-05-02 13:47:10 -04:00
Maxime Perrin
1ebb5a70ad partners(mistralai): Removing unused variable in completion request (using tool_calls or content) (#21201)
This PR fixes #21196.

The error was occurring when calling chat completion API with a chat
history. Indeed, the Mistral API does not accept both `content` and
`tool_calls` in the same body.

This PR removes one of theses variables depending on the necessity.

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-05-02 13:20:14 -04:00
Christophe Bornet
683fb45c6b community[patch]: Refactor CassandraDatabase wrapper (#21075)
* Introduce individual `fetch_` methods for easier typing.
* Rework some docstrings to google style
* Move some logic to the tool
* Merge the 2 cassandra utility files
2024-05-02 13:13:08 -04:00
Bagatur
b00fd1dbde infra: Undo gh cache removal (#21210)
Co-authored-by: Nuno Campos <nuno@langchain.dev>
2024-05-02 17:12:32 +00:00
Aditya
ee2c55ca09 docs: Added documentation on Anthropic models on vertex (#21070)
Description:Added documentation on Anthropic models on Vertex
@lkuligin for review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
2024-05-02 13:12:01 -04:00
Raghav Dixit
7d451d0041 community[patch]: Update lancedb.py (#21192)
very minor update in LanceDB integration, 'metric' argument was missing.
2024-05-02 17:06:39 +00:00
Bagatur
d297d90ad9 core[patch]: Release 0.1.49 (#21211) 2024-05-02 17:06:27 +00:00
Nuno Campos
663747b730 core[patch]: Fixes for convert_messages (#21207)
- support two-tuples of any sequence type (eg. json.loads never produces
tuples)
- support type alias for role key
- if id is passed in in dict form use it
- if tool_calls passed in in dict form use them

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-02 16:55:42 +00:00
Eugene Yurtsev
df49404794 langchain[patch]: Make more memory code handle community dependency as optional (#21199) 2024-05-02 11:05:26 -04:00
ccurme
bd5d2c2674 langchain: import InMemoryChatMessageHistory from core (#21198) 2024-05-02 14:53:07 +00:00
Eugene Yurtsev
3cd7fced5f langchain[patch],community[minor]: Migrate memory implementations to community (#20845)
Migrates memory implementations to community
2024-05-02 10:46:50 -04:00
Eugene Yurtsev
b5c3a04e4b langchain[patch]: chat histories to handle optional community dependence (#21194) 2024-05-02 10:36:08 -04:00
Eugene Yurtsev
c9119b0e75 langchain[patch],community[minor]: Move some unit tests from langchain to community, use core for fake models (#21190) 2024-05-02 09:57:52 -04:00
Eugene Yurtsev
c306364b06 langchain[patch]: Update more code to use langchain community as an optional dependency (#21170)
More code to use langchain community as an optional dependency
2024-05-02 09:05:48 -04:00
Erick Friis
cd4c54282a infra: cleanup docs build (#21134)
Refactors the docs build in order to:
- run the same `make build` command in both vercel and local build
- incrementally build artifacts in 2 distinct steps, instead of building
all docs in-place (in vercel) or in a _dist dir (locally)

Highlights:
- introduces `make build` in order to build the docs
- collects and generates all files for the build in
`docs/build/intermediate`
- renders those jupyter notebook + markdown files into
`docs/build/outputs`

And now the outputs to host are in `docs/build/outputs`, which will need
a vercel settings change.

Todo:
- [ ] figure out how to point the right directory (right now deleting
and moving docs dir in vercel_build.sh isn't great)
2024-05-01 17:34:05 -07:00
Bagatur
6fa8626e2f openai[patch]: fix azure open lc serialization, release 0.1.5 (#21159) 2024-05-01 18:03:29 -04:00
Eugene Yurtsev
94a838740e langchain[patch]: Migrate more code in utils to use optional langchain import (#21166)
Moving is interactive util to avoid circular deps
2024-05-01 17:18:42 -04:00
Eugene Yurtsev
23fdd320bc langchain[patch]: Migrate more code to use optional community in agents namespace (#21167) 2024-05-01 16:25:44 -04:00
Tomaz Bratanic
9e53fa7d2e Some more fixes to neo4j enhanced schema (#21139) 2024-05-01 13:12:43 -07:00
Erick Friis
0694538c39 ai21: fix core version (#21168) 2024-05-01 13:10:22 -07:00
Eugene Yurtsev
44602bdc20 langchain[patch],community[minor]: Move load_tools to community (#21158)
Move load tools to community
2024-05-01 16:05:41 -04:00
Eugene Yurtsev
9932f49b3e langchain[patch]: Migrate llms to use optional community imports (#21101) 2024-05-01 16:04:45 -04:00
Eugene Yurtsev
57e8e70daa langchain[patch]: Migrate chat models to optional community imports (#21090)
Migrate chat models to optional community imports
2024-05-01 16:04:12 -04:00
Eugene Yurtsev
2914abd747 langchain[patch]: Fix how the serializable test identifies serializable objects (#21165)
dir() will not work if we're using optional imports. The only way to do this is by using contents of __all__
2024-05-01 15:56:11 -04:00
Eugene Yurtsev
23c5d87311 langchain[patch]: Migrate utils to use optional langchain_community (#21163)
Migrate utils to use optional imports from langchain community
2024-05-01 15:24:02 -04:00
Eugene Yurtsev
bec3eee3fa langchain[patch]: Migrate retrievers to use optional langchain community imports (#21155) 2024-05-01 14:44:44 -04:00
Eugene Yurtsev
43110daea5 langchain[patch]: Update some agent tool kits to handle community import as optional (#21157)
A few things that were not caught by the migration script
2024-05-01 14:22:54 -04:00
Eugene Yurtsev
59f10ab3e0 langchain[patch]: Migrate embeddings to optional imports (#21099) 2024-05-01 13:47:37 -04:00
Eugene Yurtsev
2f709d94d7 langchain[patch]: Migrate vectorstores to use optional langchain community imports (#21150) 2024-05-01 13:33:37 -04:00
Eugene Yurtsev
7230e430db langchain[patch]: Migrate top level files to use optional langchain community (#21152)
Migrate a few top level files to treat langchain community as an optional dependency
2024-05-01 13:23:03 -04:00
Erick Friis
daab9789a8 ai21: release 0.1.4 (#21151) 2024-05-01 17:16:27 +00:00
Asaf Joseph Gardin
642975dd9f partners: AI21 Labs Jamba Support (#20815)
Description: Added support for AI21 new model - Jamba
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-05-01 10:12:44 -07:00
Eugene Yurtsev
7a39fe60da langchain[patch]: Migrate utilities to handle langchain community as optional (#21149) 2024-05-01 13:09:34 -04:00
Eugene Yurtsev
b879184595 langchain[patch]: embedddings distance move import of openai embeddings into local scope (#21148) 2024-05-01 12:51:51 -04:00
Bagatur
8b4b75e543 docs: standardize vertexai params (#20167)
Related to #20085

Requires https://github.com/langchain-ai/langchain-google/pull/121
2024-05-01 11:42:18 -04:00
Eugene Yurtsev
0e5bf16d00 langchain[patch]: Migrate document loaders to use optional langchain community imports (#21095) 2024-05-01 11:26:25 -04:00
Jacob Lee
bd38073d76 👥 Update LangChain people data (#21143)
👥 Update LangChain people data

Co-authored-by: github-actions <github-actions@github.com>
2024-05-01 11:01:43 -04:00
Harrison Chase
4d1c21d97d community[patch]: Fix alternative name in deprecation notice for sql_database (#21144) 2024-05-01 10:59:42 -04:00
East Agile
2a6f78a53f community[minor]: Rememberizer retriever (#20052)
**Description:**
This pull request introduces a new feature for LangChain: the
integration with the Rememberizer API through a custom retriever.
This enables LangChain applications to allow users to load and sync
their data from Dropbox, Google Drive, Slack, their hard drive into a
vector database that LangChain can query. Queries involve sending text
chunks generated within LangChain and retrieving a collection of
semantically relevant user data for inclusion in LLM prompts.
User knowledge dramatically improved AI applications.
The Rememberizer integration will also allow users to access general
purpose vectorized data such as Reddit channel discussions and US
patents.

**Issue:**
N/A

**Dependencies:**
N/A

**Twitter handle:**
https://twitter.com/Rememberizer
2024-05-01 10:41:44 -04:00
Eugene Yurtsev
1ce1a10f2b langchain[patch],community[minor]: Move graph index creator (#20795)
Move graph index creator to community
2024-05-01 10:04:30 -04:00
Eugene Yurtsev
aa0bc7467c langchain[patch]: Migrate agents module into optional imports for community (#21088) 2024-05-01 09:36:03 -04:00
Eugene Yurtsev
86ff8a3fb4 langchain[patch]: Update docstore module to use optional imports from community (#21091) 2024-05-01 09:35:05 -04:00
Eugene Yurtsev
d640605694 langchain[patch]: Migrate chat loaders to optional community imports (#21089)
Migrate chat loaders to optional community imports
2024-05-01 09:34:44 -04:00
Charlie Marsh
2b10c4dd52 ci: Use ruff check in Makefile (#21138)
## Summary

`ruff /path/to/file.py` works but is deprecated, and we now recommend
`ruff check /path/to/file.py` (to match `ruff format /path/to/file.py`).
2024-05-01 09:34:15 -04:00
Eugene Yurtsev
2fcab9acd9 langchain[patch]: Upgrade storage to treat langchain community as optional (#21105) 2024-05-01 09:33:31 -04:00
William FH
ab55f6996d [Core] Tracing: update parent run_tree's child_runs (#21049) 2024-05-01 06:33:08 -07:00
Abhishek Bhagwat
86fe484e24 docs: Docs (sample notebook) for Vertex DIY RAG Ranking API (#21054)
Vertex DIY RAG APIs helps to build complex RAG systems and provide more
granular control, and are suited for custom use cases.

The Ranking API takes in a list of documents and reranks those documents
based on how relevant the documents are to a given query. Compared to
embeddings that look purely at the semantic similarity of a document and
a query, the ranking API can give you a more precise score for how well
a document answers a given query.


[Reference](https://cloud.google.com/generative-ai-app-builder/docs/ranking)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-01 05:39:39 +00:00
Stuart Leeks
8a01760a0f infra: Sync devcontainer.json and compose file mount location (#20461)
**Sync the config in `devcontainer.json` and `docker-compose.yml`**

Issue: when opening the current `master` branch in a dev container in VS
Code, I get the following message as VS Code cannot find the mounted
source folder:


![image](https://github.com/langchain-ai/langchain/assets/1824461/41cf20c0-d1e0-4648-9578-edf80b99c2db)

Opening in a GitHub Codespace works (it seems to ignore the mounts in
the `docker-compose.yml`.

This PR updates the mount in `docker-compose.yml` and the config in
`devcontainer.json` so that the two align.

I have tested these changes in GitHub Codespaces and a VS Code dev
container and both loaded successfully.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-01 01:32:12 -04:00
aditya thomas
12b1caf295 openai[patch]: add tests for secret_str for keys (#20982)
**Description:** Add tests to check API keys and Active Directory tokens
are masked
**Issue:** Resolves #12165 for OpenAI and Azure OpenAI models
**Dependencies:** None

Also resolves #12473 which may be closed.

Additional contributors @alex4321 (#12473) and @onesolpark (#12542)
2024-05-01 01:26:20 -04:00
Noah
45ddf4d26f community[patch]: Update comments for lazy_load method (#21063)
- [ ] **PR message**: 
- **Description:** Refactored the lazy_load method to use asynchronous
execution for improved performance. The method now initiates scraping of
all URLs simultaneously using asyncio.gather, enhancing data fetching
efficiency. Each Document object is yielded immediately once its content
becomes available, streamlining the entire process.
    - **Issue:** N/A
- **Dependencies:** Requires the asyncio library for handling
asynchronous tasks, which should already be part of standard Python
libraries in Python 3.7 and above.
    - **Email:** [r73327118@gmail.com](mailto:r73327118@gmail.com)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-01 01:20:57 -04:00
Liu Xiaodong
3b473d10f2 experimental: clean python repl input(experimental:Added code for PythonREPL) (#20930)
Update python.py(experimental:Added code for PythonREPL)

Added code for PythonREPL, defining a static method 'sanitize_input'
that takes the string 'query' as input and returns a sanitizing string.
The purpose of this method is to remove unwanted characters from the
input string, Specifically:

1. Delete the whitespace at the beginning and end of the string (' \s').
2. Remove the quotation marks (`` ` ``) at the beginning and end of the
string.
3. Remove the keyword "python" at the beginning of the string (case
insensitive) because the user may have typed it.

This method uses regular expressions (regex) to implement sanitizing.

It all started with this code:
from langchain.agents import Tool
from langchain_experimental.utilities import PythonREPL

python_repl = PythonREPL()
repl_tool = Tool(
    name="python_repl",
description="Remove redundant formatting marks at the beginning and end
of source code from input.Use a Python shell to execute python commands.
If you want to see the output of a value, you should print it out with
`print(...)`.",
    func=python_repl.run,
)

When I call the agent to write a piece of code for me and execute it
with the defined code, I must get an error: SyntaxError('invalid
syntax', ('<string>', 1, 1,'In', 1, 2))

After checking, I found that pythonREPL has less formatting of input
code than the soon-to-be deprecated pythonREPL tool, so I added this
step to it, so that no matter what code I ask the agent to write for me,
it can be executed smoothly and get the output result.
I have tried modifying the prompt words to solve this problem before,
but it did not work, and by adding a simple format check, the problem is
well resolved.
<img width="1271" alt="image"
src="https://github.com/langchain-ai/langchain/assets/164149097/c49a685f-d246-4b11-b655-fd952fc2f04c">

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-05-01 05:19:09 +00:00
Ismail Hossain Polas
1fdf63fa6c community[patch]: update package name to bagelML (#19948)
**Description**
This pull request updates the Bagel Network package name from
"betabageldb" to "bagelML" to align with the latest changes made by the
Bagel Network team.

The following modifications have been made:

- Updated all references to the old package name ("betabageldb") with
the new package name ("bagelML") throughout the codebase.
- Modified the documentation, and any relevant scripts to reflect the
package name change.
- Tested the changes to ensure that the functionality remains intact and
no breaking changes were introduced.

By merging this pull request, our project will stay up to date with the
latest Bagel Network package naming convention, ensuring compatibility
and smooth integration with their updated library.

Please review the changes and provide any feedback or suggestions. Thank
you!
2024-05-01 01:17:33 -04:00
Tomaz Bratanic
7860e4c649 experimental[patch]: Add support for non-function calling LLMs in llm graph transformers (#21014) 2024-05-01 01:16:07 -04:00
Erick Friis
67e6744e0f docs: fix some notebook formatting (#21136) 2024-04-30 21:39:03 -07:00
tianzedavid
5a8909440b docs: remove repetitive words (#21058)
remove repetitive words

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-05-01 01:10:42 +00:00
Leonid Kuligin
a36935b520 docs: updated docs on langchain_google_community (#21064)
Thank you for contributing to LangChain!

- [ ] **PR title**: "docs: updated docs on langchain_google_community"


- [ ] **PR message**:
    - **Description:** updated docs on langchain_google_community
2024-04-30 20:20:49 -04:00
Tomaz Bratanic
c9e96bb5e2 community[patch]: Fix neo4j enhanced schema bugs (#21072) 2024-04-30 20:16:26 -04:00
junkeon
8d2909ee25 upstage[minor]: Update few codes and add upstage loader in pdf section (#21085)
**Description:** Update UpstageLayoutAnalysisParser and Loader and add
upstage loader example in pdf section
**Dependencies:** langchain_community
**Twitter handle:** [@upstageai](https://twitter.com/upstageai)

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-30 20:15:49 -04:00
Bagatur
bef50ded63 openai[patch]: fix special token default behavior (#21131)
By default handle special sequences as regular text
2024-04-30 20:08:24 -04:00
MacanPN
0f7f448603 community[patch]: add delete() method to AzureSearch vector store (#21127)
**Issue:**
Currently `AzureSearch` vector store does not implement `delete` method.
This PR implements it. This also makes it compatible with LangChain
indexer.

**Dependencies:**
None

**Twitter handle:**
@martintriska1

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-30 23:46:18 +00:00
Jorge Piedrahita Ortiz
3441a11b21 docs: minor changes in sambanova community integration docs (#21129)
- **Description:** minor changes in sambanova community integration
notebook docs

---------

Co-authored-by: Renate Kempf <165940384+renate-snova@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-30 23:44:26 +00:00
Bagatur
6d3e9eaf84 docs: format (#21132) 2024-04-30 23:32:41 +00:00
Erick Friis
14422a4220 langchain: fix core dep (#21128) 2024-04-30 14:55:12 -07:00
Erick Friis
6c938da302 langchain: release 0.1.17 (#21125) 2024-04-30 14:43:59 -07:00
Erick Friis
5f8a307565 infra: same tagging for langchain (#21126) 2024-04-30 14:43:45 -07:00
Eugene Yurtsev
bf95414758 langchain[minor]: enhance unit test to test imports recursively (#21122) 2024-04-30 17:05:53 -04:00
Eugene Yurtsev
e4f51f59a2 langchain[patch]: Migrate tools to treat community imports as optional (#21117)
Migrate tools to treat community imports as optional
2024-04-30 16:26:18 -04:00
Eugene Yurtsev
9e788f09c6 langchain[patch]: Migrate output parsers to support optional community imports (#21103)
Migrate output parsers
2024-04-30 16:24:29 -04:00
Eugene Yurtsev
3853fe9f64 langchain[patch]: Migrate graphs to use optional community imports (#21100)
Migrate graphs to use optional community imports.
2024-04-30 16:24:06 -04:00
Eugene Yurtsev
8658d52587 langchain[patch]: Upgrade prompts to optional imports (#21078)
Upgrades prompts module to use optional imports.

This code was generated with a migration script, but had to be adjusted
manually a bit.

Testing in preparation for applying this code modification across the
rest of the modules in langchain package to reverse the dependency
between langchain community and langchain.
2024-04-30 16:23:39 -04:00
Eugene Yurtsev
9b6d04a187 langchain[patch]: Migrate document transformers (#21098)
Migrate document transformers
2024-04-30 16:20:02 -04:00
Eugene Yurtsev
aec13a6123 langchain[patch]: Migrate callbacks module to use optional imports for community (#21086) 2024-04-30 16:19:13 -04:00
Erick Friis
8a62fb0570 community: release 0.0.36 (#21118) 2024-04-30 13:18:44 -07:00
Erick Friis
2407c353be core: release 0.1.48 (#21113) 2024-04-30 19:52:36 +00:00
Erick Friis
dbdfa3d34e infra: fix minimum version install to force pypi install (#21112) 2024-04-30 12:41:26 -07:00
Charlie Marsh
fd94aa8366 partner[patch]: Upgrade to Ruff v0.4.2 (#21108)
## Summary

No new diagnostics (given that the set of enabled rules hasn't changed),
but gains access to our new parser (much faster) and reduced false
positives all around.
2024-04-30 15:06:42 -04:00
Jamsheed Mistri
3e749369ef community[minor]: bump version of LayerupSecurity, add support for untrusted_input parameter (#19985)
**Description:** update version of LayerupSecurity package for the
Layerup Security integration. Add untrusted_input parameter.
2024-04-30 14:55:26 -04:00
fubuki8087
f1c3687aa5 community[patch]: Using the right encoding to parse the web page in RecursiveUrlLoader (#20632)
As shown in #13749 , `RecursiveUrlLoader` has encoding issue. This PR is
to solve this.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-30 18:41:36 +00:00
Jakub Pawłowski
b0b1a67771 community[patch]: Skip unexpected 404 HTTP Error in Arxiv download (#21042)
### Description:
When attempting to download PDF files from arXiv, an unexpected 404
error frequently occurs. This error halts the operation, regardless of
whether there are additional documents to process. As a solution, I
suggest implementing a mechanism to ignore and communicate this error
and continue processing the next document from the list.

Proposed Solution: To address the issue of unexpected 404 errors during
PDF downloads from arXiv, I propose implementing the following solution:

- Error Handling: Implement error handling mechanisms to catch and
handle 404 errors gracefully.
- Communication: Inform the user or logging system about the occurrence
of the 404 error.
- Continued Processing: After encountering a 404 error, continue
processing the remaining documents from the list without interruption.

This solution ensures that the application can handle unexpected errors
without terminating the entire operation. It promotes resilience and
robustness in the face of intermittent issues encountered during PDF
downloads from arXiv.

### Issue:
#20909 
### Dependencies:
none

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-30 18:29:22 +00:00
Erick Friis
b9c53e95b7 community: release 0.0.35 (#21104) 2024-04-30 17:48:56 +00:00
Eugene Yurtsev
3c064a757f core[minor],langchain[patch],community[patch]: Move storage interfaces to core (#20750)
* Move storage interface to core
* Move in memory and file system implementation to core
2024-04-30 13:14:26 -04:00
Charlie Marsh
8f38b7a725 multiple: Remove unnecessary Ruff suppression comments (#21050)
## Summary

I ran `ruff check --extend-select RUF100 -n` to identify `# noqa`
comments that weren't having any effect in Ruff, and then `ruff check
--extend-select RUF100 -n --fix` on select files to remove all of the
unnecessary `# noqa: F401` violations. It's possible that these were
needed at some point in the past, but they're not necessary in Ruff
v0.1.15 (used by LangChain) or in the latest release.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-30 17:13:48 +00:00
Erick Friis
748f2ba9ea core: release 0.1.47 (#21094) 2024-04-30 09:22:05 -07:00
Erick Friis
efe27ef849 infra: tag non-langchain releases (#20805) 2024-04-30 16:15:46 +00:00
Eugene Yurtsev
c8f18a2524 langchain[patch]: Update import handling in adapters (#21079) 2024-04-30 10:55:29 -04:00
William FH
5c63ac3dd7 [Patch] Dedent docstring (#20959)
Technically a slight prompt breaking change, but I think positive EV in
that it saves tokens and results in more sane / in-distribution prompts
2024-04-30 07:40:57 -07:00
Eugene Yurtsev
845d8e0025 langchain[patch]: Update handling of deprecation warnings (#21083)
Chains should not be emitting deprecation warnings.
2024-04-30 10:30:23 -04:00
Christophe Bornet
5c77f45b06 community[minor]: Add async methods to CassandraCache and CassandraSemanticCache (#20654) 2024-04-30 10:27:44 -04:00
Christophe Bornet
d6e9bd3011 docs: Bump cassio min version in docs (#21081)
Cassio 0.6+ is recommended for async vector store (not blocking on
getting the embedding dimension) and for hybrid search support.
2024-04-30 10:25:37 -04:00
William FH
db14d4326d [Core] Feat Pretty Print Tool calls (#20997)
Right now, `tool_calls` are not included in the `pretty_print()` output.
Would be nice to show!


![image](https://github.com/langchain-ai/langchain/assets/13333726/6a0ffca3-d02f-4e18-bc76-513eeca2e964)
2024-04-30 07:14:43 -07:00
Kuro Denjiro
fa4124b821 community[minor]: add mintbase loader to langchain (#20089)
- [x] **Add Near NFT loader**: "community: Load NFT near block chain
using mintbase graph API"

- [x] **PR message**: 
    - **Description:** a description of the change
    - **Twitter handle:**Kurodenjiro

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-30 04:11:56 +00:00
Alexander Dicke
d7e12750df community[patch]: allows using text-generation-inference /generate route with HuggingFaceEndpoint (#20100)
- **Description:** allows to use the /generate route of
`text-generation-inference` with the `HuggingFaceEndpoint`
2024-04-29 23:09:55 -04:00
Jonathan Evans
ea43c669f2 community[patch]: Fix Bedrock Mistral stop sequence request key (#20115)
- **Description:** Change Bedrock's Mistral stop sequence key mapping to
"stop" rather than "stop_sequences" which is the correct key [Bedrock
docs
link](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-mistral.html)
`{
    "prompt": string,
    "max_tokens" : int,
    "stop" : [string],    
    "temperature": float,
    "top_p": float,
    "top_k": int
}`
- **Issue:** #20053 
- **Dependencies:** N/A
- **Twitter handle:** N/a
2024-04-29 20:14:36 -04:00
davidkgp
28b0b0d863 community[patch]: Fix for github issue #17690 (#20117)
…/17690

Thank you for contributing to LangChain!

- [x] **Fix Google Lens knowledge graph issue**: "langchain: community"
- Fix for [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** handled the existence of keys in the json response of
Google Lens
- **Issue:** [No "knowledge_graph" property in Google Lens API call from
SerpAPI](https://github.com/langchain-ai/langchain/issues/17690)



- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-30 00:10:08 +00:00
高远
a7a4630bf4 community[patch]: Modify the text field type and add new exception handling (#20116)
Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
2024-04-29 20:06:00 -04:00
Rahul Triptahi
c172611647 community[patch]: Add classifier_url argument in PebbloSafeLoader and documentation update. (#21030)
Description: Add classifier_url argument in PebbloSafeLoader.
Documentation: Updated PebbloSafeLoader documentation with above change
and new links for pebblo github pages.

---------

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-04-29 17:41:09 -04:00
Leonid Ganeline
08d08d7c83 docs: langchain docstrings updates (#21032)
Added missed docstings. Formatted docstrings into a consistent format.
2024-04-29 17:40:44 -04:00
Leonid Ganeline
85094cbb3a docs: community docstring updates (#21040)
Added missed docstrings. Updated docstrings to consistent format.
2024-04-29 17:40:23 -04:00
Rodrigo Nogueira
90f19028e5 community[patch]: Add maritalk streaming (sync and async) (#19203)
Co-authored-by: RosevalJr <rdmalajr@gmail.com>
Co-authored-by: Roseval Donisete Malaquias Junior <roseval@maritaca.ai>
2024-04-29 21:31:14 +00:00
Cahid Arda Öz
cc6191cb90 community[minor]: Add support for Upstash Vector (#20824)
## Description

Adding `UpstashVectorStore` to utilize [Upstash
Vector](https://upstash.com/docs/vector/overall/getstarted)!

#17012 was opened to add Upstash Vector to langchain but was closed to
wait for filtering. Now filtering is added to Upstash vector and we open
a new PR. Additionally, [embedding
feature](https://upstash.com/docs/vector/features/embeddingmodels) was
added and we add this to our vectorstore aswell.

## Dependencies

[upstash-vector](https://pypi.org/project/upstash-vector/) should be
installed to use `UpstashVectorStore`. Didn't update dependencies
because of [this comment in the previous
PR](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1876522450).

## Tests

Tests are added and they pass. Tests are naturally network bound since
Upstash Vector is offered through an API.

There was [a discussion in the previous PR about mocking the
unittests](https://github.com/langchain-ai/langchain/pull/17012#pullrequestreview-1891820567).
We didn't make changes to this end yet. We can update the tests if you
can explain how the tests should be mocked.

---------

Co-authored-by: ytkimirti <yusuftaha9@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 17:25:01 -04:00
Leonid Ganeline
1a2ff56cd8 core[patch[: docstring update (#21036)
Added missed docstrings. Updated docstrings to consistent format.
2024-04-29 15:35:34 -04:00
Eugene Yurtsev
f479a337cc langchain[patch]: replace deprecated imports with imports from langchain_core (#21033)
* Output of running the migration script.
* Ran only against langchain code itself and not the unit tests.
2024-04-29 15:34:31 -04:00
Eugene Yurtsev
82d4afcac0 langchain[minor]: Code to handle dynamic imports (#20893)
Proposing to centralize code for handling dynamic imports. This allows treating langchain-community as an optional dependency.

---

The proposal is to scan the code base and to replace all existing imports with dynamic imports using this functionality.
2024-04-29 15:34:03 -04:00
Erick Friis
854ae3e1de mistralai: release 0.1.5, allow client passing in (#21034) 2024-04-29 17:14:26 +00:00
chyroc
3e241956d3 community[minor]: add coze chat model (#20770)
add coze chat model, to call coze.com apis
2024-04-29 12:26:16 -04:00
Eugene Yurtsev
29493bb598 cli[minor]: improve confirmation message with more details (#21027)
Improve confirmation message with more details
2024-04-29 12:20:42 -04:00
Eugene Yurtsev
aab78a37f3 cli[patch]: Ignore imports that change the name of the class (#21026)
Not currently handeled by migration script
2024-04-29 12:20:30 -04:00
Massimiliano Pronesti
ce89b34fc0 community[patch]: support hybrid search with threshold in Azure AI Search Retriever (#20907)
Support hybrid search with a score threshold -- similar to what we do
for similarity search.
2024-04-29 12:11:44 -04:00
Andrei Panferov
b3efa38cc0 community[patch]: GigaChat model selection fix (#20988)
Fixed the error that the model name is never actually put into GigaChat
request payload, always defaulting to `GigaChat-Lite`.

With this fix, model selection through
```python
import os
from langchain.chat_models.gigachat import GigaChat

chat = GigaChat(
    name="GigaChat-Pro", # <- HERE!!!!!
    ...
)
```
should actually work, as intended in
[here](804390ba4b/libs/community/langchain_community/llms/gigachat.py (L36)).

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-29 16:08:26 +00:00
Patrick McFadin
3331865f6b community[minor]: add Cassandra Database Toolkit (#20246)
**Description**: ToolKit and Tools for accessing data in a Cassandra
Database primarily for Agent integration. Initially, this includes the
following tools:
- `cassandra_db_schema` Gathers all schema information for the connected
database or a specific schema. Critical for the agent when determining
actions.
- `cassandra_db_select_table_data` Selects data from a specific keyspace
and table. The agent can pass paramaters for a predicate and limits on
the number of returned records.
- `cassandra_db_query` Expiriemental alternative to
`cassandra_db_select_table_data` which takes a query string completely
formed by the agent instead of parameters. May be removed in future
versions.

Includes unit test and two notebooks to demonstrate usage. 

**Dependencies**: cassio
**Twitter handle**: @PatrickMcFadin

---------

Co-authored-by: Phil Miesle <phil.miesle@datastax.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 15:51:43 +00:00
Igor Brai
b3e74f2b98 community[minor]: add mojeek search util (#20922)
**Description:** This pull request introduces a new feature to community
tools, enhancing its search capabilities by integrating the Mojeek
search engine
**Dependencies:** None

---------

Co-authored-by: Igor Brai <igor@mojeek.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-04-29 15:49:53 +00:00
hmn falahi
4822beb298 Ignore self/cls from required args of class functions in convert_to_openai_tool (#20691)
Removed redundant self/cls from required args of class functions in
_get_python_function_required_args:

```python
class MemberTool:
    def search_member(
            self,
            keyword: str,
            *args,
            **kwargs,
    ):
        """Search on members with any keyword like first_name, last_name, email

        Args:
            keyword: Any keyword of member
        """

        headers = dict(authorization=kwargs['token'])
        members = []
        try:
            members = request_(
                method='SEARCH',
                url=f'{service_url}/apiv1/members',
                headers=headers,
                json=dict(query=keyword),
            )

        except Exception as e:
            logger.info(e.__doc__)

        return members

convert_to_openai_tool(MemberTool.search_member)
```
expected result:
```
{'type': 'function', 'function': {'name': 'search_member', 'description': 'Search on members with any keyword like first_name, last_name, username, email', 'parameters': {'type': 'object', 'properties': {'keyword': {'type': 'string', 'description': 'Any keyword of member'}}, 'required': ['keyword']}}}
```

#20685

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 11:46:26 -04:00
Rahul Triptahi
a64a1943fd docs: Document update for load_extended_matadata in GoogleDriveLoader (#20950)
Document: Updated google_drive,ipynb for loading following extended
metadata.
 - full_path - Full path of the file/s in google drive.
 - owner - owner of the file/s.
 - size - size of the file/s.

Code changes:
[langchain-google/pull/179.](https://github.com/langchain-ai/langchain-google/pull/179)

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-29 11:41:57 -04:00
Eugene Yurtsev
4f4ee8e2cf cli[patch]: Update migrations file manually (#21021)
We need to replace occurrences in the code of RunnableMap not just the
import,
so for now, we don't replace RunnableMap.
2024-04-29 10:53:31 -04:00
Tomaz Bratanic
67428c4052 community[patch]: Neo4j enhanced schema (#20983)
Scan the database for example values and provide them to an LLM for
better inference of Text2cypher
2024-04-29 10:45:55 -04:00
Leonid Kuligin
dc70c23a11 docs: switched GCSLoaders docs to langchain-google-community (#20985)
Thank you for contributing to LangChain!

- [ ] **PR title**: "docs: switched GCSLoaders docs to
langchain-google-community"

- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** switched GCSLoaders docs to
langchain-google-community
2024-04-29 10:45:11 -04:00
aditya thomas
8b59bddc03 anthropic[patch]: add tests for secret_str for api key (#20986)
**Description:** Add tests to check API keys are masked
**Issue:** Resolves
https://github.com/langchain-ai/langchain/issues/12165 for Anthropic
models
**Dependencies:** None
2024-04-29 10:39:14 -04:00
Pengcheng Liu
1fad39be1c community[minor]: Add LarkSuite wiki document loader. (#21016)
**Description:** Add LarkSuite wiki document loader. Refer to [LarkSuite
api document
](https://open.feishu.cn/document/server-docs/docs/wiki-v2/space-node/list)for
details.
**Issue:** None
**Dependencies:** None
**Twitter handle:** None
2024-04-29 10:37:50 -04:00
Tomaz Bratanic
d36332476c docs: Add neo4j relationship vector index docs (#20990)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 14:36:47 +00:00
Leonid Ganeline
dc7c06bc07 community[minor]: import fix (#20995)
Issue: When the third-party package is not installed, whenever we need
to `pip install <package>` the ImportError is raised.
But sometimes, the `ValueError` or `ModuleNotFoundError` is raised. It
is bad for consistency.
Change: replaced the `ValueError` or `ModuleNotFoundError` with
`ImportError` when we raise an error with the `pip install <package>`
message.
Note: Ideally, we replace all `try: import... except... raise ... `with
helper functions like `import_aim` or just use the existing
[langchain_core.utils.utils.guard_import](https://api.python.langchain.com/en/latest/utils/langchain_core.utils.utils.guard_import.html#langchain_core.utils.utils.guard_import)
But it would be much bigger refactoring. @baskaryan Please, advice on
this.
2024-04-29 10:32:50 -04:00
Karim Lalani
2ddac9a7c3 experimental[minor]: Add bind_tools and with_structured_output functions to OllamaFunctions (#20881)
Implemented bind_tools for OllamaFunctions.
Made OllamaFunctions sub class of ChatOllama.
Implemented with_structured_output for OllamaFunctions.

integration unit test has been updated.
notebook has been updated.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-29 14:13:33 +00:00
Eugene Yurtsev
d781560722 cli[minor]: Add ipynb support, add text_splitters (#20963) 2024-04-29 10:11:21 -04:00
Vadym Barda
5e0b6b3e75 docs: update langserve link in LCEL docs (#20992) 2024-04-29 09:06:10 -04:00
Aditya
07ce39bfe7 docs: updated tutorials for Image generation and Vector Search (#21000)
Description: docs: updated tutorials for Image generation and Vector
Search

@lkuligin for review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
2024-04-29 09:04:11 -04:00
Aditya
17bbb7d2a5 docs: updated tutorial for Gemini versions, included safety attribute updates (#21006)
Description:updated tutorial for Gemini versions, included safety
attribute updates

@lkuligin For review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
2024-04-29 09:01:54 -04:00
WilliamEspegren
804390ba4b community: Spider integration (#20937)
Added the [Spider.cloud](https://spider.cloud) document loader.
[Spider](https://github.com/spider-rs/spider) is the
[fastest](https://github.com/spider-rs/spider/blob/main/benches/BENCHMARKS.md)
and cheapest crawler that returns LLM-ready data.

```
- **Description:** Adds Spider data loader
- **Dependencies:** spider-client
- **Twitter handle:** @WilliamEspegren 
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: = <=>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-27 21:45:03 +00:00
Jamie Lemon
6342217b93 docs: Moves "Using PyMuPDF" to higher up the page. (#20832)
**Description:**
This PR moves the **PyMuPDF** PDF loader solution to be underneath
**PyPDF**. This is because it is the the 2nd most popular PyPI package
after **PyPDF**.

Please refer to these numbers, at the time of writing as follows:

PyPDF
https://www.pepy.tech/projects/PyPDF2
160 million

PyMuPDF
https://www.pepy.tech/projects/pymupdf
60 million

PDFPlumber
https://www.pepy.tech/projects/pdfplumber
23 million

PDFMiner
https://www.pepy.tech/projects/pdfminer
16 million

PyPDFium2
https://www.pepy.tech/projects/pypdfium2
8 million

Unstructured
https://www.pepy.tech/projects/unstructured
8 million


Please note I am an active contributor to
https://github.com/pymupdf/PyMuPDF

Many thanks!

----

**Twitter handle:**
@artifex
2024-04-27 20:40:20 +00:00
Chouaieb Nemri
8097bec472 Added LogEntry, Any, Dict, List, Optional, TypedDict imports (#20970)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: docs"

- [ ] **PR message**:
- **Description:** Uptaded docs: Rag streaming use-cases notebook with
LogEntry, Any, Dict, List, Optional, TypedDict imports
    - **Twitter handle:** c_nemri

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-27 20:13:54 +00:00
ccurme
9ec7151317 fireworks: fix integration tests (#20973) 2024-04-27 19:49:46 +00:00
William FH
9fa9f05e5d Catch System Error in ast parse (#20961)
I can't seem to reproduce, but i got this:

```
SystemError: AST constructor recursion depth mismatch (before=102, after=37)
```

And the operation isn't critical for the actual forward pass so seems
preferable to expand our caught exceptions
2024-04-26 19:31:55 -07:00
YH
2aca7fcdcf core[patch]: Enhance link extraction with query parameters (#20259)
**Description**: This update enhances the `extract_sub_links` function
within the `langchain_core/utils/html.py` module to include query
parameters in the extracted URLs.

**Issue**: N/A

**Dependencies**: No additional dependencies required for this change.

**Twitter handle**: N/A

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-27 02:22:36 +00:00
CT
0e917e319b docs: Add langchainhub to pip install (#20185)
Added langchainhub package in import statement which is required for
"from langchain import hub" to work.

Added sample code to add OpenAI key

Co-authored-by: Chi Yan Tang <100466443+poochiekittie@users.noreply.github.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-27 02:21:40 +00:00
Pamela Fox
45092a36a2 docs: Fix langgraph link (#20244)
Just a simple PR to fix a broken link. Apparently having backticks
outside a link makes it render as code.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-27 02:18:52 +00:00
Chip Davis
e818c75f8a infra: test directory loader multithreaded (#20281)
This is a unit test for #20230 which was a fix for using multithreaded
mode with directory loader @eyurtsev
2024-04-26 19:16:47 -07:00
Guilherme Zanotelli
f931a9ce60 community[patch]: Pass kwargs to SPARQLStore from RdfGraph (#20385)
This introduces `store_kwargs` which behaves similarly to `graph_kwargs`
on the `RdfGraph` object, which will enable users to pass `headers` and
other arguments to the underlying `SPARQLStore` object. I have also made
a [PR in `rdflib` to support passing
`default_graph`](https://github.com/RDFLib/rdflib/pull/2761).

Example usage:
```python
from langchain_community.graphs import RdfGraph

graph = RdfGraph(
    query_endpoint="http://localhost/sparql",
    standard="rdf",
    store_kwargs=dict(
        default_graph="http://example.com/mygraph"
    )
)
```

<!--If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.-->

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-27 01:38:29 +00:00
Chandre Van Der Westhuizen
e57cf73cf5 docs: Added MindsDB provider (#20322)
MindsDB integrates with LangChain, enabling users to deploy, serve, and
fine-tune models available via LangChain within MindsDB, making them
accessible to numerous data sources.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-27 01:36:08 +00:00
Jorge Piedrahita Ortiz
40b2e2916b community[minor]: Sambanova llm integration (#20955)
- **Description:** Added [Sambanova systems](https://sambanova.ai/)
integration, including sambaverse and sambastudio LLMs
- **Dependencies:**   sseclient-py  (optional)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-27 01:05:13 +00:00
Rahul Triptahi
955cf186d2 community[patch]: Ingest source, owner and full_path if present in Document's metadata. (#20949)
Description: The PebbloSafeLoader should first check for owner,
full_path and size in metadata before implementing its own logic.
Dependencies: None
Documentation: NA.

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-04-26 17:50:57 -07:00
Amine Djeghri
790ea75cf7 community[minor]: add exllamav2 library for GPTQ & EXL2 models (#17817)
Added 3 files : 
- Library : ExLlamaV2 
- Test integration
- Notebook

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-27 00:44:43 +00:00
Naveen Tatikonda
8bbdb4f6a0 community[patch]: Add OpenSearch as semantic cache (#20254)
### Description
Use OpenSearch vector store as Semantic Cache.

### Twitter Handle
**@OpenSearchProj**

---------

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
Co-authored-by: Harish Tatikonda <harishtatikonda@Harishs-MacBook-Air.local>
Co-authored-by: EC2 Default User <ec2-user@ip-172-31-31-155.ec2.internal>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-27 00:20:24 +00:00
Giacomo Berardi
61f14f00d7 docs: ElasticsearchCache in cache integrations documentation (#20790)
The package for LangChain integrations with Elasticsearch
https://github.com/langchain-ai/langchain-elastic is going to contain a
LLM cache integration in the next release (see
https://github.com/langchain-ai/langchain-elastic/pull/14). This is the
documentation contribution on the page dedicated to cache integrations
2024-04-26 15:43:58 -07:00
Mayank Solanki
8c085fc697 community[patch]: Added a function from_existing_collection in Qdrant vector database. (#20779)
Issue: #20514 
The current implementation of `construct_instance` expects a `texts:
List[str]` that will call the embedding function. This might not be
needed when we already have a client with collection and `path, you
don't want to add any text.

This PR adds a class method that returns a qdrant instance with an
existing client.

Here everytime
cb6e5e56c2/libs/community/langchain_community/vectorstores/qdrant.py (L1592)
`construct_instance` is called, this line sends some text for embedding
generation.

---------

Co-authored-by: Anush <anushshetty90@gmail.com>
2024-04-26 15:34:09 -07:00
Leonid Kuligin
893a924b90 core[minor], community[patch], langchain[patch]: move BaseChatLoader to core (#19607)
Thank you for contributing to LangChain!

- [ ] **PR title**: "core: move BaseChatLoader and BaseToolkit from
community"


- [ ] **PR message**: move BaseChatLoader and BaseToolkit

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-26 21:45:51 +00:00
Erick Friis
d4befd0cfb core: fix batch ordering test (#20952) 2024-04-26 21:17:26 +00:00
Eugene Yurtsev
8ed150b2fe cli[minor]: Fix bug to account for name changes (#20948)
* Fix bug to account for name changes / aliases
* Generate migration list from langchain to langchain_core
2024-04-26 15:45:11 -04:00
ccurme
989e4a92c2 (infra) pass input to test-release (#20947) 2024-04-26 15:17:40 -04:00
Eugene Yurtsev
2fa0ff1a2d cli[minor]: update code to generate migrations from langchain to community (#20946)
Updates code that generates migrations from langchain to community
2024-04-26 15:11:32 -04:00
Erick Friis
078c5d9bc6 infra: nonmaster release checkbox (#20945)
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-04-26 14:50:07 -04:00
Leonid Kuligin
d4aec8fc8f docs: adding langchain_google_community to the docs (#20665)
Thank you for contributing to LangChain!

- [ ] **PR title**: "docs: step1. adjusting langchain_community ->
langchain_google_community"


- [ ] 
- **Description:** step1. adjusting langchain_community ->
langchain_google_community
2024-04-26 18:49:03 +00:00
ccurme
bf16cefd18 langchain: deprecate create_structured_output_runnable (#20933) 2024-04-26 14:00:40 -04:00
Erick Friis
38eccab3ae upstage: release 0.1.3 (#20941) 2024-04-26 10:36:11 -07:00
Sean
e1c2e2fdfa upstage: Upstage Groundedness Check parameter update (#20914)
* Groundedness Check takes `str` or `list[Document]` as input.

* Deprecate `GroundednessCheck` due to its naming.
* Added `UpstageGroundednessCheck`. 

* Hotfix for Groundedness Check parameter. 
  The name `query` was misleading and it should be `answer` instead.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-26 17:34:05 +00:00
ccurme
84b8e67c9c mistral: release 0.1.4 (#20940) 2024-04-26 13:06:02 -04:00
ccurme
465fbaa30b openai: release 0.1.4 (#20939) 2024-04-26 09:56:49 -07:00
Eugene Yurtsev
12c906f6ce cli[minor]: Improve partner migrations (#20938)
This auto generates partner migrations.

At the moment the migration is from community -> partner.

So one would need to run the migration script twice to go from langchain to partner.
2024-04-26 12:30:15 -04:00
Eugene Yurtsev
5653f36adc cli[minor]: Add script to generate migrations for partner packages (#20932)
Add script to help generate migrations.

This works well for partner packages. Migrations are generated based on run time rather than static analysis (much simpler to get the correct migrations implemented).

The script for generating migrations from langchain to community still needs work.
2024-04-26 11:17:20 -04:00
ccurme
fe1304afc4 openai: add unit test (#20931)
Test a helper function that was added earlier.
2024-04-26 15:02:19 +00:00
Eugene Yurtsev
6598757037 cli[minor]: Add first version of migrate (#20902)
Adds a first version of the migrate script.
2024-04-26 10:50:21 -04:00
Pengcheng Liu
d95e9fb67f docs: add tool calling example in Tongyi chat model integration. (#20925)
**Description:** add tool calling example in Tongyi chat model
integration.
  **Issue:** None
  **Dependencies:** None
2024-04-26 10:18:54 -04:00
Lei Zhang
9281841cfe community[patch]: fix integrated test case test_recursive_url_loader.py assertions (issue-20919) (#20920)
**Description:** 
Fix integrated test case test_recursive_url_loader.py

Local testing successful

```shell
(venv) lei@LeideMacBook-Pro community % poetry run pytest tests/integration_tests/document_loaders/test_recursive_url_loader.py
================================================================================ test session starts ================================================================================
platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.4.0 -- /Users/zhanglei/Work/github/langchain/venv/bin/python
cachedir: .pytest_cache
rootdir: /Users/zhanglei/Work/github/langchain/libs/community
configfile: pyproject.toml
plugins: syrupy-4.6.1, asyncio-0.20.3, cov-4.1.0, vcr-1.0.2, mock-3.12.0, anyio-3.7.1, dotenv-0.5.2, requests-mock-1.11.0, socket-0.6.0
asyncio: mode=Mode.AUTO
collected 6 items                                                                                                                                                                   

tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader PASSED                                                                 [ 16%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic PASSED                                                   [ 33%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader FAILED                                                                  [ 50%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent PASSED                                                                      [ 66%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_loading_invalid_url PASSED                                                                        [ 83%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties PASSED                                                   [100%]

===================================================================================== FAILURES ======================================================================================
__________________________________________________________________________ test_sync_recursive_url_loader ___________________________________________________________________________

    def test_sync_recursive_url_loader() -> None:
        url = "https://docs.python.org/3.9/"
        loader = RecursiveUrlLoader(
            url, extractor=lambda _: "placeholder", use_async=False, max_depth=2
        )
        docs = loader.load()
>       assert len(docs) == 23
E       AssertionError: assert 24 == 23
E        +  where 24 = len([Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/', 'content_type': 'text/html', 'title': '3.9.18 Documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/py-modindex.html', 'content_type': 'text/html', 'title': 'Python Module Index — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/download.html', 'content_type': 'text/html', 'title': 'Download — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/howto/index.html', 'content_type': 'text/html', 'title': 'Python HOWTOs — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/whatsnew/index.html', 'content_type': 'text/html', 'title': 'Whatâ\x80\x99s New in Python — Python 3.9.18 documentation', 'language': None}), Document(page_content='placeholder', metadata={'source': 'https://docs.python.org/3.9/c-api/index.html', 'content_type': 'text/html', 'title': 'Python/C API Reference Manual — Python 3.9.18 documentation', 'language': None}), ...])

tests/integration_tests/document_loaders/test_recursive_url_loader.py:38: AssertionError
================================================================================= warnings summary ==================================================================================
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties
  /Users/zhanglei/.pyenv/versions/3.11.4/lib/python3.11/html/parser.py:170: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor.
    k = self.parse_starttag(i)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================ slowest 5 durations ================================================================================
56.75s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic
38.99s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader
31.20s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties
30.37s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent
15.44s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader
============================================================================== short test summary info ==============================================================================
FAILED tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader - AssertionError: assert 24 == 23
================================================================ 1 failed, 5 passed, 5 warnings in 172.97s (0:02:52) ================================================================
(venv) zhanglei@LeideMacBook-Pro community % poetry run pytest tests/integration_tests/document_loaders/test_recursive_url_loader.py
================================================================================ test session starts ================================================================================
platform darwin -- Python 3.11.4, pytest-7.4.4, pluggy-1.4.0 -- /Users/zhanglei/Work/github/langchain/venv/bin/python
cachedir: .pytest_cache
rootdir: /Users/zhanglei/Work/github/langchain/libs/community
configfile: pyproject.toml
plugins: syrupy-4.6.1, asyncio-0.20.3, cov-4.1.0, vcr-1.0.2, mock-3.12.0, anyio-3.7.1, dotenv-0.5.2, requests-mock-1.11.0, socket-0.6.0
asyncio: mode=Mode.AUTO
collected 6 items                                                                                                                                                                   

tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader PASSED                                                                 [ 16%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic PASSED                                                   [ 33%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader PASSED                                                                  [ 50%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent PASSED                                                                      [ 66%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_loading_invalid_url PASSED                                                                        [ 83%]
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties PASSED                                                   [100%]

================================================================================= warnings summary ==================================================================================
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent
tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties
  /Users/zhanglei/.pyenv/versions/3.11.4/lib/python3.11/html/parser.py:170: XMLParsedAsHTMLWarning: It looks like you're parsing an XML document using an HTML parser. If this really is an HTML document (maybe it's XHTML?), you can ignore or filter this warning. If it's XML, you should know that using an XML parser will be more reliable. To parse this document as XML, make sure you have the lxml package installed, and pass the keyword argument `features="xml"` into the BeautifulSoup constructor.
    k = self.parse_starttag(i)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================ slowest 5 durations ================================================================================
46.99s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader_deterministic
32.43s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_async_recursive_url_loader
31.23s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_equivalent
30.75s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_async_metadata_necessary_properties
15.89s call     tests/integration_tests/document_loaders/test_recursive_url_loader.py::test_sync_recursive_url_loader
===================================================================== 6 passed, 5 warnings in 157.42s (0:02:37) =====================================================================
(venv) lei@LeideMacBook-Pro community % 
```

**Issue:** https://github.com/langchain-ai/langchain/issues/20919

**Twitter handle:** @coolbeevip
2024-04-26 10:00:08 -04:00
ccurme
7d8d0229fa remove placeholder error message (#20340) 2024-04-26 13:48:48 +00:00
William FH
4c437ebb9c Use lstv2 (#20747) 2024-04-25 16:51:42 -07:00
ccurme
891ae37437 langchain: support PineconeVectorStore in self query retriever (#20905)
`langchain_pinecone.Pinecone` is deprecated in favor of
`PineconeVectorStore`, and is currently a subclass of
`PineconeVectorStore`.
```python
@deprecated(since="0.0.3", removal="0.2.0", alternative="PineconeVectorStore")
class Pinecone(PineconeVectorStore):
    """Deprecated. Use PineconeVectorStore instead."""

    pass
```
2024-04-25 20:54:58 +00:00
Matt
28df4750ef community[patch]: Add initial tests for AzureSearch vector store (#17663)
**Description:** AzureSearch vector store has no tests. This PR adds
initial tests to validate the code can be imported and used.
**Issue:** N/A
**Dependencies:** azure-search-documents and azure-identity are added as
optional dependencies for testing

---------

Co-authored-by: Matt Gotteiner <[email protected]>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 20:42:01 +00:00
Dristy Srivastava
5f1d1666e3 community[patch]: Add support for pebblo server and client version (#20269)
**Description**:
_PebbloSafeLoader_: Add support for pebblo server and client version


**Documentation:** NA
**Unit test:** NA
**Issue:** NA
**Dependencies:**  None

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-25 20:39:17 +00:00
am-kinetica
b54b19ba1c community[minor]: Implemented Kinetica Document Loader and added notebooks (#20002)
- [ ] **Kinetica Document Loader**: "community: a class to load
Documents from Kinetica"



- [ ] **Kinetica Document Loader**: 
- **Description:** implemented KineticaLoader in `kinetica_loader.py`
- **Dependencies:** install the Kinetica API using `pip install
gpudb==7.2.0.1 `
2024-04-25 13:39:00 -07:00
Michael Schock
5e60d65917 experimental[patch]: return from HuggingGPT task executor task.run() exception (#20219)
**Description:** Fixes a bug in the HuggingGPT task execution logic
here:

      except Exception as e:
          self.status = "failed"
          self.message = str(e)
      self.status = "completed"
      self.save_product()

where a caught exception effectively just sets `self.message` and can
then throw an exception if, e.g., `self.product` is not defined.

**Issue:** None that I'm aware of.
**Dependencies:** None
**Twitter handle:** https://twitter.com/michaeljschock

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-25 20:16:39 +00:00
Anish Chakraborty
898362de81 core[patch]: improve comma separated list output parser to handle non-space separated list (#20434)
- **Description:** Changes
`lanchain_core.output_parsers.CommaSeparatedListOutputParser` to handle
`,` as a delimiter alongside the previous implementation which used `, `
as delimiter.
- **Issue:** Started noticing that some results returned by LLMs were
not getting parsed correctly when the output contained `,` instead of `,
`.
  - **Dependencies:** No
  - **Twitter handle:** not active on twitter.


<!---
If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
-->
2024-04-25 20:10:56 +00:00
Michael Schock
63a07f52df experimental[patch]: remove \n from AutoGPT feedback_tool exit check (#20132) 2024-04-25 20:10:33 +00:00
Shengsheng Huang
fd1061e7bf community[patch]: add more data types support to ipex-llm llm integration (#20833)
- **Description**:  
- **add support for more data types**: by default `IpexLLM` will load
the model in int4 format. This PR adds more data types support such as
`sym_in5`, `sym_int8`, etc. Data formats like NF3, NF4, FP4 and FP8 are
only supported on GPU and will be added in future PR.
    - Fix a small issue in saving/loading, update api docs
- **Dependencies**: `ipex-llm` library
- **Document**: In `docs/docs/integrations/llms/ipex_llm.ipynb`, added
instructions for saving/loading low-bit model.
- **Tests**: added new test cases to
`libs/community/tests/integration_tests/llms/test_ipex_llm.py`, added
config params.
- **Contribution maintainer**: @shane-huang
2024-04-25 12:58:18 -07:00
Rahul Triptahi
dc921f0823 community[patch]: Add semantic info to metadata, classified by pebblo-server. (#20468)
Description: Add support for Semantic topics and entities.
Classification done by pebblo-server is not used to enhance metadata of
Documents loaded by document loaders.
Dependencies: None
Documentation: Updated.

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-04-25 12:55:33 -07:00
Eugene Yurtsev
a5028b6356 cli[minor]: Add __version__ (#20903)
Add __version__ to cli
2024-04-25 15:51:33 -04:00
Jingpan Xiong
1202017c56 community[minor]: Add relyt vector database (#20316)
Co-authored-by: kaka <kaka@zbyte-inc.cloud>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: jingsi <jingsi@leadincloud.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-25 19:49:29 +00:00
davidefantiniIntel
f386f71bb3 community: fix tqdm import (#20263)
Description: Fix tqdm import in QuantizedBiEncoderEmbeddings
2024-04-25 19:44:53 +00:00
Andres Algaba
05ae8ca7d4 community[patch]: deprecate persist method in Chroma (#20855)
Thank you for contributing to LangChain!

- [x] **PR title**

- [x] **PR message**:
- **Description:** Deprecate persist method in Chroma no longer exists
in Chroma 0.4.x
    - **Issue:** #20851 
    - **Dependencies:** None
    - **Twitter handle:** AndresAlgaba1

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-25 19:42:03 +00:00
ccurme
fdabd3cdf5 mistral, openai: support custom tokenizers in chat models (#20901) 2024-04-25 15:23:29 -04:00
ccurme
6986e44959 docs: update chat model feature table (#20899) 2024-04-25 15:05:43 -04:00
ccurme
b8db73233c core, community: deprecate tool.__call__ (#20900)
Does not update docs.
2024-04-25 14:50:39 -04:00
merdan
52896258ee docs: hide model import in multiple_tools.ipynb (#20883)
**Description:** 
This PR removes an unnecessary code snippet from the documentation. The
snippet in question is not relevant to the content and does not
contribute to the overall understanding of the topic. It contained
redundant imports and unused code, potentially causing confusion for
readers.

**Issue:** 
There is no specific issue number associated with this change.

**Dependencies:** 
No additional dependencies are required for this change.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 18:47:22 +00:00
Tomaz Bratanic
520972fd0f community[patch]: Support passing graph object to Neo4j integrations (#20876)
For driver connection reusage, we introduce passing the graph object to
neo4j integrations
2024-04-25 11:30:22 -07:00
Lei Zhang
748a6ae609 community[patch]: add HTTP response headers Content-Type to metadata of RecursiveUrlLoader document (#20875)
**Description:** 
The RecursiveUrlLoader loader offers a link_regex parameter that can
filter out URLs. However, this filtering capability is limited, and if
the internal links of the website change, unexpected resources may be
loaded. These resources, such as font files, can cause problems in
subsequent embedding processing.

>
https://blog.langchain.dev/assets/fonts/source-sans-pro-v21-latin-ext_latin-regular.woff2?v=0312715cbf

We can add the Content-Type in the HTTP response headers to the document
metadata so developers can choose which resources to use. This allows
developers to make their own choices.

For example, the following may be a good choice for text knowledge.

- text/plain - simple text file
- text/html - HTML web page
- text/xml - XML format file
- text/json - JSON format data
- application/pdf - PDF file
- application/msword - Word document

and ignore the following

- text/css - CSS stylesheet
- text/javascript - JavaScript script
- application/octet-stream - binary data
- image/jpeg - JPEG image
- image/png - PNG image
- image/gif - GIF image
- image/svg+xml - SVG image
- audio/mpeg - MPEG audio files
- video/mp4 - MP4 video file
- application/font-woff - WOFF font file
- application/font-ttf - TTF font file
- application/zip - ZIP compressed file
- application/octet-stream - binary data

**Twitter handle:** @coolbeevip

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 11:29:41 -07:00
samanhappy
37cbbc00a9 docs: Fix broken link in agents.ipynb (#20872) 2024-04-25 10:42:06 -07:00
fzowl
a6b8ff23bd docs: Use voyage-law-2 in the examples (#20784)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


**Description:** In VoyageAI text-embedding examples use voyage-law-2
model


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-25 10:41:36 -07:00
Erick Friis
eca3640af7 upstage: release 0.1.2 (#20898) 2024-04-25 10:41:19 -07:00
Pavlo Paliychuk
82b5bdc7a1 docs: Fix misplaced zep cloud example links (#20867)
Thank you for contributing to LangChain!

- [x] **PR title**: Fix misplaced zep cloud example links
- [x] **PR message**: 
- **Description:** Fixes misplaced links for vector store and memory zep
cloud examples

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-25 10:41:08 -07:00
Joan Fontanals
baefbfb14e community[mionr]: add Jina Reranker in retrievers module (#19406)
- **Description:** Adapt JinaEmbeddings to run with the new Jina AI
Rerank API
- **Twitter handle:** https://twitter.com/JinaAI_


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 10:27:10 -07:00
Erick Friis
92969d49cb multiple: remove external repo mds (#20896)
api docs build doesn't tolerate them
2024-04-25 17:18:29 +00:00
Jason_Chen
53bb7dbd29 community[patch]: add BeautifulSoupTransformer remove_unwanted_classnames method (#20467)
Add the remove_unwanted_classnames method to the
BeautifulSoupTransformer class, which can filter more effectively.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 17:04:04 +00:00
YISH
ed26149a29 openai[patch]: Allow disablling safe_len_embeddings(OpenAIEmbeddings) (#19743)
OpenAI API compatible server may not support `safe_len_embedding`, 

use `disable_safe_len_embeddings=True` to disable it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 09:45:52 -07:00
Bagatur
5b83130855 core[minor], langchain[patch], community[patch]: mv StructuredQuery (#20849)
mv StructuredQuery to core
2024-04-25 09:40:26 -07:00
Sean
540f384197 partner: Upstage quick documentation update (#20869)
* Updating the provider docs page. 
The RAG example was meant to be moved to cookbook, but was merged by
mistake.

* Fix bug in Groundedness Check

---------

Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-25 16:36:54 +00:00
Bagatur
ffad3985a1 core[patch]: Release 0.1.46 (#20891) 2024-04-25 15:40:17 +00:00
Mish Ushakov
6ccecf2363 community[minor]: added Browserbase loader (#20478) 2024-04-25 01:11:03 +00:00
aditya thomas
9e694963a4 docs: custom callback handlers page (#20494)
**Description:** Update to the Callbacks page on custom callback
handlers
**Issue:** #20493 
**Dependencies:** None

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 01:08:36 +00:00
Erick Friis
5da9dd1195 mistral: comment batching param (#20868)
Addresses #20523
2024-04-25 00:38:21 +00:00
Ivaylo Bratoev
7c5063ef60 infra: fix how Poetry is installed in the dev container (#20521)
Currently, when a new dev container is created, poetry does not work in
it with the error "No module named 'rapidfuzz'".

Install Poetry outside the project venv so that poetry and project
dependencies do not get mixed. Use pipx to install poetry securely in
its own isolated environment.

Issue: #12237

Twitter handle: https://twitter.com/ibratoev

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-24 17:33:25 -07:00
GustavoSept
c2d09a5186 experimental[patch]: Makes regex customizable in text_splitter.py (SemanticChunker class) (#20485)
- **Description:** Currently, the regex is static (`r"(?<=[.?!])\s+"`),
which is only useful for certain use cases. The current change only
moves this to be a parameter of split_text(). Which adds flexibility
without making it more complex (as the default regex is still the same).
- **Issue:** Not applicable (I searched, no one seems to have created
this issue yet).
  - **Dependencies:** None.


_If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17._

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-25 00:32:40 +00:00
William FH
a936f696a6 [Core] Feat: update config CVar in tool.invoke (#20808) 2024-04-24 17:17:21 -07:00
Lei Zhang
2cd907ad7e text-splitters[patch]: fix MarkdownHeaderTextSplitter fails to parse headers with non-printable characters (#20645)
Description: MarkdownHeaderTextSplitter Fails to Parse Headers with
non-printable characters. more #20643

The following is the official test case. Just replacing `# Foo\n\n` with
`\ufeff# Foo\n\n` will cause the test case to fail.

chunk metadata is empty

```python
def test_md_header_text_splitter_1() -> None:
    """Test markdown splitter by header: Case 1."""

    markdown_document = (
        "\ufeff# Foo\n\n"
        "    ## Bar\n\n"
        "Hi this is Jim\n\n"
        "Hi this is Joe\n\n"
        " ## Baz\n\n"
        " Hi this is Molly"
    )
    headers_to_split_on = [
        ("#", "Header 1"),
        ("##", "Header 2"),
    ]
    markdown_splitter = MarkdownHeaderTextSplitter(
        headers_to_split_on=headers_to_split_on,
    )
    output = markdown_splitter.split_text(markdown_document)
    expected_output = [
        Document(
            page_content="Hi this is Jim  \nHi this is Joe",
            metadata={"Header 1": "Foo", "Header 2": "Bar"},
        ),
        Document(
            page_content="Hi this is Molly",
            metadata={"Header 1": "Foo", "Header 2": "Baz"},
        ),
    ]
    assert output == expected_output
```

twitter: @coolbeevip

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-25 00:07:42 +00:00
jtanios
2968f20970 docs: git dependency name correction (#20662)
This PR corrects the name of the `git` python package to `GitPython`.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-24 23:43:44 +00:00
ccurme
481d3855dc patch: remove usage of llm, chat model __call__ (#20788)
- `llm(prompt)` -> `llm.invoke(prompt)`
- `llm(prompt=prompt` -> `llm.invoke(prompt)` (same with `messages=`)
- `llm(prompt, callbacks=callbacks)` -> `llm.invoke(prompt,
config={"callbacks": callbacks})`
- `llm(prompt, **kwargs)` -> `llm.invoke(prompt, **kwargs)`
2024-04-24 19:39:23 -04:00
Raghav Dixit
9b7fb381a4 community[patch]: LanceDB integration patch update (#20686)
Description : 

- added functionalities - delete, index creation, using existing
connection object etc.
- updated usage 
- Added LaceDB cloud OSS support

make lint_diff , make test checks done
2024-04-24 16:27:43 -07:00
Nikita Pokidyshev
9e983c9500 langchain[patch]: fix agent_token_buffer_memory not working with openai tools (#20708)
- **Description:** fix a bug in the agent_token_buffer_memory
- **Issue:** agent_token_buffer_memory was not working with openai tools
- **Dependencies:** None
- **Twitter handle:** @pokidyshef
2024-04-24 15:51:58 -07:00
Salika Dave
6353991498 docs: [Retrieval > .. > PDF] update package installation instructions for Unstructured and PDFMiner (#20723)
**Description:** Adds the command to install packages required before
using _Unstructured_ and _PDFMiner_ from `langchain.community`
**Documentation Page Being Updated:** [LangChain > Retrieval > Document
loaders > PDF > Using
Unstructured](https://python.langchain.com/docs/modules/data_connection/document_loaders/pdf/#using-unstructured)
**Issue:** #20719 
**Dependencies:** no dependencies
**Twitter handle:** SalikaDave

<!--
Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17. -->

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-24 22:24:11 +00:00
dpdjvhxm
a9e2e98708 docs: Update apache_age.ipynb (#20722)
typo
2024-04-24 22:18:59 +00:00
Erick Friis
1aef8116de upstage: release 0.1.1 (#20864) 2024-04-24 15:18:30 -07:00
junkeon
c8fd51e8c8 upstage: Add Upstage partner package LA and GC (#20651)
---------

Co-authored-by: Sean <chosh0615@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Sean Cho <sean@upstage.ai>
2024-04-24 15:17:20 -07:00
hsmtkk
5ecebf168c docs: imported List is not used (#20720)
# Description

Minor sample code fix

# Issue

Imported `List` is not used.

# Dependencies

N/A

# Twitter handle

N/A
2024-04-24 15:17:07 -07:00
Alex Lee
243ba71b28 langchain[patch]: add aprep_output method to langchain/chains/base.py (#20748)
## Description

Add `aprep_output` method to `langchain/chains/base.py`. Some downstream
`ChatMessageHistory` objects that use async connections require an async
way to append to the context.

It turned out that `ainvoke()` was calling `prep_output` which is
synchronous.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-24 22:16:25 +00:00
Harrison Chase
43c041cda5 support messages in messages out (#20862) 2024-04-24 14:58:58 -07:00
back2nix
a1614b88ac groq[patch]: groq proxy support (#20758)
# Proxy Fix for Groq Class 🐛 🚀

## Description
This PR fixes a bug related to proxy settings in the `Groq` class,
allowing users to connect to LangChain services via a proxy.

## Changes Made
-  FIX support for specifying proxy settings in the `Groq` class.
-  Resolved the bug causing issues with proxy settings.
-  Did not include unit tests and documentation updates.
-  Did not run make format, make lint, and make test to ensure code
quality and functionality because I couldn't get it to run, so I don't
program in Python and couldn't run `ruff`.
-  Ensured that the changes are backwards compatible.
-  No additional dependencies were added to `pyproject.toml`.

### Error Before Fix
```python
Traceback (most recent call last):
  File "/home/bg/Documents/code/github.com/back2nix/test/groq/main.py", line 9, in <module>
    chat = ChatGroq(
           ^^^^^^^^^
  File "/home/bg/Documents/code/github.com/back2nix/test/groq/venv310/lib/python3.11/site-packages/langchain_core/load/serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "/home/bg/Documents/code/github.com/back2nix/test/groq/venv310/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for ChatGroq
__root__
  Invalid `http_client` argument; Expected an instance of `httpx.AsyncClient` but got <class 'httpx.Client'> (type=type_error)
  ```
  
### Example usage after fix
  ```python3
import os

import httpx
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq

chat = ChatGroq(
    temperature=0,
    groq_api_key=os.environ.get("GROQ_API_KEY"),
    model_name="mixtral-8x7b-32768",
    http_client=httpx.Client(
        proxies="socks5://127.0.0.1:1080",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
    http_async_client=httpx.AsyncClient(
        proxies="socks5://127.0.0.1:1080",
        transport=httpx.HTTPTransport(local_address="0.0.0.0"),
    ),
)

system = "You are a helpful assistant."
human = "{text}"
prompt = ChatPromptTemplate.from_messages([("system", system), ("human", human)])

chain = prompt | chat
out = chain.invoke({"text": "Explain the importance of low latency LLMs"})

print(out)
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-24 21:58:03 +00:00
volodymyr-memsql
493afe4d8d community[patch]: add hybrid search to singlestoredb vectorstore (#20793)
Implemented the ability to enable full-text search within the
SingleStore vector store, offering users a versatile range of search
strategies. This enhancement allows users to seamlessly combine
full-text search with vector search, enabling the following search
strategies:

* Search solely by vector similarity.
* Conduct searches exclusively based on text similarity, utilizing
Lucene internally.
* Filter search results by text similarity score, with the option to
specify a threshold, followed by a search based on vector similarity.
* Filter results by vector similarity score before conducting a search
based on text similarity.
* Perform searches using a weighted sum of vector and text similarity
scores.

Additionally, integration tests have been added to comprehensively cover
all scenarios.
Updated notebook with examples.

CC: @baskaryan, @hwchase17

---------

Co-authored-by: Volodymyr Tkachuk <vtkachuk-ua@singlestore.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-24 21:34:50 +00:00
Tomaz Bratanic
9efab3ed66 community[patch]: Add driver config param for neo4j graph (#20772)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-24 21:14:41 +00:00
Leonid Ganeline
13751c3297 community: tigergraph fixes (#20034)
- added guard on the `pyTigerGraph` import
- added a missed example page in the `docs/integrations/graphs/`
- formatted the `docs/integrations/providers/` page to the consistent
format. Added links.
2024-04-24 16:49:21 -04:00
Martin Kolb
0186e4e633 community[patch]: Advanced filtering for HANA Cloud Vector Engine (#20821)
- **Description:**
This PR adds support for advanced filtering to the integration of HANA
Vector Engine.
The newly supported filtering operators are: $eq, $ne, $gt, $gte, $lt,
$lte, $between, $in, $nin, $like, $and, $or

  - **Issue:** N/A
  - **Dependencies:** no new dependencies added

Added integration tests to:
`libs/community/tests/integration_tests/vectorstores/test_hanavector.py`

Description of the new capabilities in notebook:
`docs/docs/integrations/vectorstores/hanavector.ipynb`
2024-04-24 13:47:27 -07:00
Alex Sherstinsky
12e5ec6de3 community: Support both Predibase SDK-v1 and SDK-v2 in Predibase-LangChain integration (#20859) 2024-04-24 13:31:01 -07:00
Erick Friis
8c95ac3145 docs, multiple: de-beta with_structured_output (#20850) 2024-04-24 19:34:57 +00:00
Nuno Campos
477eb1745c Better support for subgraphs in graph viz (#20840) 2024-04-24 12:32:52 -07:00
aditya thomas
a9c7d47c03 docs: update openai llm documentation (#20827)
**Description:** Bring OpenAI LLM page to the LCEL era
**Issue:** See discussion #20810
**Dependencies:** None
2024-04-24 12:26:57 -07:00
JeffKatzy
5ab3f9a995 community[patch]: standardize chat init args (#20844)
Thank you for contributing to LangChain!

community:perplexity[patch]: standardize init args

updated pplx_api_key and request_timeout so that aliased to api_key, and
timeout respectively. Added test that both continue to set the same
underlying attributes.

Related to
[20085](https://github.com/langchain-ai/langchain/issues/20085)

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-24 12:26:05 -07:00
Pavlo Paliychuk
70ae59bcfe docs: Update Zep Messaging, add links to Zep Cloud Docs (#20848)
Thank you for contributing to LangChain!

- [x] **PR title**: docs: Update Zep Messaging, add links to Zep Cloud
Docs

- [x] **PR message**: 
- **Description:** This PR updates Zep messaging in the docs + links to
Langchain Zep Cloud examples in our documentation
    - **Twitter handle:** @paulpaliychuk51


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-24 19:14:54 +00:00
Massimiliano Pronesti
8d1167b32f community[patch]: add support for similarity_score_threshold search in… (#20852)
See
https://github.com/langchain-ai/langchain/issues/20600#issuecomment-2075569338
for details.

@chrislrobert
2024-04-24 19:14:33 +00:00
Bagatur
87d31a3ec0 docs: contributing note (#20843) 2024-04-24 10:41:19 -07:00
Eugene Yurtsev
d8aa72f51d core[minor],langchain[patch]: Move base indexing interface and logic to core (#20667)
This PR moves the interface and the logic to core.

The following changes to namespaces:


`indexes` -> `indexing`
`indexes._api` -> `indexing.api`


Testing code is intentionally duplicated for now since it's testing
different
implementations of the record manager (in-memory vs. SQL).

Common logic will need to be pulled out into the test client.


A follow up PR will move the SQL based implementation outside of
LangChain.
2024-04-24 13:18:42 -04:00
ccurme
3bcfbcc871 groq: handle null queue_time (#20839) 2024-04-24 09:50:09 -07:00
Eugene Yurtsev
30e48c9878 core[patch],community[patch]: Move file chat history back to community (#20834)
Marking as patch since we haven't had releases in between. This just reverting part of a PR from yesterday.
2024-04-24 12:47:25 -04:00
ccurme
6debadaa70 groq: bump core (#20838) 2024-04-24 11:51:46 -04:00
Erick Friis
7984206c95 groq: release 0.1.3 (#20836)
Fixes #20811
2024-04-24 08:06:06 -07:00
Nestor Qin
9111d3a636 community[patch]: Fix message formatting for Anthropic models on Amazon Bedrock (#20801)
**Description:**
This PR fixes an issue in message formatting function for Anthropic
models on Amazon Bedrock.

Currently, LangChain BedrockChat model will crash if it uses Anthropic
models and the model return a message in the following type:
- `AIMessageChunk`

Moreover, when use BedrockChat with for building Agent, the following
message types will trigger the same issue too:
- `HumanMessageChunk`
- `FunctionMessage`

**Issue:**
https://github.com/langchain-ai/langchain/issues/18831

**Dependencies:**
No.

**Testing:**
Manually tested. The following code was failing before the patch and
works after.

```
@tool
def square_root(x: str):
    "Useful when you need to calculate the square root of a number"
    return math.sqrt(int(x))

llm = ChatBedrock(
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    model_kwargs={ "temperature": 0.0 },
)

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", FUNCTION_CALL_PROMPT),
        ("human", "Question: {user_input}"),
        MessagesPlaceholder(variable_name="agent_scratchpad"),
    ]
)

tools = [square_root]
tools_string = format_tool_to_anthropic_function(square_root)

agent = (
        RunnablePassthrough.assign(
            user_input=lambda x: x['user_input'],
            agent_scratchpad=lambda x: format_to_openai_function_messages(
                x["intermediate_steps"]
            )
        )
        | prompt
        | llm
        | AnthropicFunctionsAgentOutputParser()
)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, return_intermediate_steps=True)
output = agent_executor.invoke({
    "user_input": "What is the square root of 2?",
    "tools_string": tools_string,
})
```
List of messages returned from Bedrock:
```
<SystemMessage> content='You are a helpful assistant.'
<HumanMessage> content='Question: What is the square root of 2?'
<AIMessageChunk> content="Okay, let's calculate the square root of 2.<scratchpad>\nTo calculate the square root of a number, I can use the square_root tool:\n\n<function_calls>\n  <invoke>\n    <tool_name>square_root</tool_name>\n    <parameters>\n      <__arg1>2</__arg1>\n    </parameters>\n  </invoke>\n</function_calls>\n</scratchpad>\n\n<function_results>\n<search_result>\nThe square root of 2 is approximately 1.414213562373095\n</search_result>\n</function_results>\n\n<answer>\nThe square root of 2 is approximately 1.414213562373095\n</answer>" id='run-92363df7-eff6-4849-bbba-fa16a1b2988c'"
<FunctionMessage> content='1.4142135623730951' name='square_root'
```
2024-04-23 22:40:39 +00:00
ccurme
06b04b80b8 groq: fix warning filter for integration test (#20806) 2024-04-23 18:11:41 -04:00
ccurme
5a3c65a756 standard tests: add xfails (#20659) 2024-04-23 17:14:16 -04:00
Erick Friis
ddc2274aea standard-tests: split tool calling test (#20803)
just making it a bit easier to grok
2024-04-23 20:59:45 +00:00
ccurme
6622829c67 mistral: catch GatedRepoError, release 0.1.3 (#20802)
https://github.com/langchain-ai/langchain/issues/20618

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-23 20:56:42 +00:00
Eugene Yurtsev
a7c347ab35 langchain[patch]: Update evaluation logic that instantiates a default LLM (#20760)
Favor langchain_openai over langchain_community for evaluation logic.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-04-23 16:09:32 -04:00
Eugene Yurtsev
72f720fa38 langchain[major]: Remove default instantations of LLMs from VectorstoreToolkit (#20794)
Remove default instantiation from vectorstore toolkit.
2024-04-23 16:09:14 -04:00
ccurme
42de5168b1 langchain: deprecate LLMChain, RetrievalQA, and ConversationalRetrievalChain (#20751) 2024-04-23 15:55:34 -04:00
Erick Friis
30c7951505 core: use qualname in beta message (#20361) 2024-04-23 11:20:13 -07:00
Aliaksandr Kuzmik
5560cc448c community[patch]: fix CometTracer bug (#20796)
Hi! My name is Alex, I'm an SDK engineer from
[Comet](https://www.comet.com/site/)

This PR updates the `CometTracer` class.

Fixed an issue when `CometTracer` failed while logging the data to Comet
because this data is not JSON-encodable.

The problem was in some of the `Run` attributes that could contain
non-default types inside, now these attributes are taken not from the
run instance, but from the `run.dict()` return value.
2024-04-23 13:24:41 -04:00
Eugene Yurtsev
1c89e45c14 langchain[major]: breaks some chains to remove hidden defaults (#20759)
Breaks some chains in langchain to remove hidden chat model / llm instantiation.
2024-04-23 11:11:40 -04:00
Eugene Yurtsev
ad6b5f84e5 community[patch],core[minor]: Move in memory cache implementation to core (#20753)
This PR moves the InMemoryCache implementation from community to core.
2024-04-23 11:10:11 -04:00
Stefano Ottolenghi
4f67ce485a docs: Fix typo to render list (#20774)
This _should_ fix the currently broken list in the [Neo4jVector
page](https://python.langchain.com/docs/integrations/vectorstores/neo4jvector/).

![Screenshot from 2024-04-23
08-40-37](https://github.com/langchain-ai/langchain/assets/114478074/ab5ad622-879e-4764-93db-5f502eae479b)
2024-04-23 14:46:58 +00:00
Eugene Yurtsev
a2cc9b55ba core[patch]: Remove autoupgrade to addable dict in Runnable/RunnableLambda/RunnablePassthrough transform (#20677)
Causes an issue for this code

```python
from langchain.chat_models.openai import ChatOpenAI
from langchain.output_parsers.openai_tools import JsonOutputToolsParser
from langchain.schema import SystemMessage

prompt = SystemMessage(content="You are a nice assistant.") + "{question}"

llm = ChatOpenAI(
    model_kwargs={
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "web_search",
                    "description": "Searches the web for the answer to the question.",
                    "parameters": {
                        "type": "object",
                        "properties": {
                            "query": {
                                "type": "string",
                                "description": "The question to search for.",
                            },
                        },
                    },
                },
            }
        ],
    },
    streaming=True,
)

parser = JsonOutputToolsParser(first_tool_only=True)

llm_chain = prompt | llm | parser | (lambda x: x)


for chunk in llm_chain.stream({"question": "tell me more about turtles"}):
    print(chunk)

# message = llm_chain.invoke({"question": "tell me more about turtles"})

# print(message)
```

Instead by definition, we'll assume that RunnableLambdas consume the
entire stream and that if the stream isn't addable then it's the last
message of the stream that's in the usable format.

---

If users want to use addable dicts, they can wrap the dict in an
AddableDict class.

---

Likely, need to follow up with the same change for other places in the
code that do the upgrade
2024-04-23 10:35:06 -04:00
Oleksandr Yaremchuk
9428923bab experimental[minor]: upgrade the prompt injection model (#20783)
- **Description:** In January, Laiyer.ai became part of ProtectAI, which
means the model became owned by ProtectAI. In addition to that,
yesterday, we released a new version of the model addressing issues the
Langchain's community and others mentioned to us about false-positives.
The new model has a better accuracy compared to the previous version,
and we thought the Langchain community would benefit from using the
[latest version of the
model](https://huggingface.co/protectai/deberta-v3-base-prompt-injection-v2).
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** @alex_yaremchuk
2024-04-23 10:23:39 -04:00
Eugene Yurtsev
645b1e142e core[minor],langchain[patch],community[patch]: Move InMemory and File implementations of Chat History to core (#20752)
This PR moves the implementations for chat history to core. So it's
easier to determine which dependencies need to be broken / add
deprecation warnings
2024-04-23 10:22:11 -04:00
ccurme
7a922f3e48 core, openai: support custom token encoders (#20762) 2024-04-23 13:57:05 +00:00
Chen94yue
b481b73805 Update custom_retriever.ipynb (#20776)
Fixed an error in the sample code to ensure that the code can run
directly.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-23 13:47:08 +00:00
Bagatur
ed980601e1 docs: update examples in api ref (#20768) 2024-04-23 00:47:52 +00:00
Bagatur
be51cd3bc9 docs: fix api ref link autogeneration (#20766) 2024-04-22 17:36:41 -07:00
monke111
c807f0a6dd Update google_drive.ipynb (#20731)
langchain_community.document_loaders depricated 
new langchain_google_community

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-22 23:30:46 +00:00
Katarina Supe
dc61e23886 docs: update Memgraph docs (#20736)
- **Description:** Memgraph Platform is being run differently now so I
updated this (I am DX engineer from Memgraph).
2024-04-22 19:27:12 -04:00
Tabish Mir
6a0d44d632 docs: Fix link for partition_pdf in Semi_Structured_RAG.ipynb cookbook (#20763)
docs: Fix link for `partition_pdf` in Semi_Structured_RAG.ipynb cookbook

- **Description:** Fix incorrect link to unstructured-io `partition_pdf`
section
2024-04-22 23:22:55 +00:00
Bagatur
fa4d6f9f8b docs: install partner pkgs vercel (#20761) 2024-04-22 23:08:02 +00:00
Christophe Bornet
0ae5027d98 community[patch]: Remove usage of deprecated StoredBlobHistory in CassandraChatMessageHistory (#20666) 2024-04-22 17:11:05 -04:00
Bagatur
eb18f4e155 infra: rm sep repo partner dirs (#20756)
so you can `poetry run pip install -e libs/partners/*/` to your hearts
content
2024-04-22 14:05:39 -07:00
Bagatur
2a11a30572 docs: automatically add api ref links (#20755)
![Screenshot 2024-04-22 at 1 51 13
PM](https://github.com/langchain-ai/langchain/assets/22008038/b8b09fec-3800-4b97-bd26-5571b8308f4a)
2024-04-22 14:05:29 -07:00
Eugene Yurtsev
936c6cc74a langchain[patch]: Add missing deprecation for openai adapters (#20668)
Add missing deprecation for openai adapters
2024-04-22 14:05:55 -04:00
Eugene Yurtsev
38adbfdf34 community[patch],core[minor]: Move BaseToolKit to core.tools (#20669) 2024-04-22 14:04:30 -04:00
Mark Needham
ce23f8293a Community patch clickhouse make it possible to not specify index (#20460)
Vector indexes in ClickHouse are experimental at the moment and can
sometimes break/change behaviour. So this PR makes it possible to say
that you don't want to specify an index type.

Any queries against the embedding column will be brute force/linear
scan, but that gives reasonable performance for small-medium dataset
sizes.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-22 10:46:37 -07:00
ccurme
c010ec8b71 patch: deprecate (a)get_relevant_documents (#20477)
- `.get_relevant_documents(query)` -> `.invoke(query)`
- `.get_relevant_documents(query=query)` -> `.invoke(query)`
- `.get_relevant_documents(query, callbacks=callbacks)` ->
`.invoke(query, config={"callbacks": callbacks})`
- `.get_relevant_documents(query, **kwargs)` -> `.invoke(query,
**kwargs)`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-22 11:14:53 -04:00
A Noor
939d113d10 docs: Fixed grammar mistake (#20697)
Description: Changed "You are" to "You are a". Grammar issue.
Dependencies: None

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-22 02:55:05 +00:00
Matheus Henrique Raymundo
bb69819267 community: Fix the stop sequence key name for Mistral in Bedrock (#20709)
Fixing the wrong stop sequence key name that causes an error on AWS
Bedrock.
You can check the MistralAI bedrock parameters
[here](https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-mistral.html)
This change fixes this
[issue](https://github.com/langchain-ai/langchain/issues/20095)
2024-04-21 20:06:06 -04:00
Bagatur
1c7b3c75a7 community[patch], experimental[patch]: support tool-calling sql and p… (#20639)
d agents
2024-04-21 15:43:09 -07:00
Bagatur
d0cee65cdc langchain[patch]: langchain-pinecone self query support (#20702) 2024-04-21 15:42:39 -07:00
Leonid Kuligin
5ae738c4fe docs: on google-genai vs google-vertexai (#20713)
Thank you for contributing to LangChain!

- [ ] **PR title**: "docs: added a description of differences
langchain_google_genai vs langchain_google_vertexai"


- [ ]
- **Description:** added a description of differences
langchain_google_genai vs langchain_google_vertexai
2024-04-21 12:53:19 -07:00
shumway743
cb6e5e56c2 community[minor]: add graph store implementation for apache age (#20582)
**Description:** implemented GraphStore class for Apache Age graph db

**Dependencies:** depends on psycopg2

Unit and integration tests included. Formatting and linting have been
run.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-20 14:31:04 -07:00
Christophe Bornet
c909ae0152 community[minor]: Add async methods to CassandraVectorStore (#20602)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-20 02:09:58 +00:00
Leonid Ganeline
06d18c106d langchain[patch]: example_selector import fix (#20676)
Cleaned up updated imports
2024-04-19 21:42:18 -04:00
Leonid Ganeline
d6470aab60 langchain: dosctore import fix (#20678)
Cleaned up imports
2024-04-19 21:41:36 -04:00
Leonid Ganeline
3a750e130c templates: utilities import fix (#20679)
Updated imports from `from langchain.utilities` to `from
langchain_community.utilities`
2024-04-19 21:41:15 -04:00
Dmitry Tyumentsev
f111efeb6e community[patch]: YandexGPT API add ability to disable request logging (#20670)
Closes (#20622)

Added the ability to [disable logging of requests to
YandexGPT](https://yandex.cloud/en/docs/foundation-models/operations/yandexgpt/disable-logging).
2024-04-19 21:40:37 -04:00
Erick Friis
e5f5d9ff56 docs: aws listing (#20674) 2024-04-19 21:27:35 +00:00
Mateusz Szewczyk
75ffe51bbe ibm: Add support for Embedding Models (#20647)
---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-19 20:56:24 +00:00
Erick Friis
73809817ff community: release 0.0.34 (#20672) 2024-04-19 12:44:41 -07:00
Tomaz Bratanic
e4b38e2822 Update neo4j cypher templates to the function callback (#20515)
Update Neo4j Cypher templates to use function callback to pass context
instead of passing it in user prompt.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-19 18:33:32 +00:00
Tomaz Bratanic
3d9b26fc28 Update neo4j vector documentation (#20455)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-19 18:32:13 +00:00
Tomaz Bratanic
8c08cf4619 community: Add support for relationship indexes in neo4j vector (#20657)
Neo4j has added relationship vector indexes.
We can't populate them, but we can use existing indexes for retrieval
2024-04-19 11:22:42 -07:00
Erick Friis
940242c1ec core: release 0.1.45 (#20664) 2024-04-19 09:55:02 -07:00
Saurabh Chalke
3dd6266bcc docs: Remove Duplicate --quiet Flag in Installation Command in LangSmith Docs (#20121)
**Description:** This pull request removes a duplicated `--quiet` flag
in the pip install command found in the LangSmith Walkthrough section of
the documentation.

**Issue:** N/A

**Dependencies:** None
2024-04-19 11:16:44 -04:00
Aditya
6a97448928 Updated Tutorials for Vertex Vector Search (#20376)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: docs"


- [ ] **PR message**: 
    - **Description:** Updated Tutorials for Vertex Vector Search
    - **Issue:** NA
    - **Dependencies:** NA
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

@lkuligin for review

---------

Co-authored-by: adityarane@google.com <adityarane@google.com>
Co-authored-by: Leonid Kuligin <lkuligin@yandex.ru>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-19 10:38:00 -04:00
Boris Djurdjevic
c5aab9afe3 docs: Fix minor typo in data_connection/document_loaders/custom (#20648)
**Description:**
Minor documentation typo fix in
`data_connection/document_loaders/custom`: `thta's` -> `that's`
2024-04-19 14:17:00 +00:00
Souls-R
36084e7500 docs: fix variable name typo in example code (#20658)
This pull request corrects a mistake in the variable name within the
example code. The variable doc_schema has been changed to dog_schema to
fix the error.
2024-04-19 14:08:25 +00:00
Leonid Ganeline
beebd73f95 docs: integrations/retrievers cleanup (#20357)
Fixed format inconsistencies; added descriptions, links.
2024-04-19 10:02:41 -04:00
Leonid Ganeline
0b99e9201d docs: providers alibaba update (#20560)
Added missed integrations to the Alibaba Cloud provider page
2024-04-18 23:11:17 -07:00
Leonid Ganeline
27a4682415 docs: imports update (#20625)
Updated imports in docs

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-18 23:04:07 -07:00
Ethan Yang
53ae77b13e docs: Update openvino example documents links (#20638) 2024-04-18 22:57:28 -07:00
Sivaudha
baedc3ec0a langchain[minor]: Databricks vector search self query integration (#20627)
- Enable self querying feature for databricks vector search

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-19 03:44:38 +00:00
ccurme
6d530481c1 openai: fix allowed block types (#20636) 2024-04-18 22:12:57 -04:00
Erick Friis
764871f97d infra: add test-doc-imports to ci failure (#20637) 2024-04-19 02:06:57 +00:00
Erick Friis
5c216ad08f upstage[patch]: un-xfail tool calling test, release 0.1.0 (#20635) 2024-04-19 02:02:21 +00:00
Nuno Campos
48307e46a3 core[patch]: Fix runnable map ser/de (#20631) 2024-04-18 18:52:33 -07:00
Charlie Holtz
1cbab0ebda community: update Replicate to work with official models (#20633)
Description: you don't need to pass a version for Replicate official
models. That was broken on LangChain until now!

You can now run: 

```
llm = Replicate(
    model="meta/meta-llama-3-8b-instruct",
    model_kwargs={"temperature": 0.75, "max_length": 500, "top_p": 1},
)
prompt = """
User: Answer the following yes/no question by reasoning step by step. Can a dog drive a car?
Assistant:
"""
llm(prompt)
```

I've updated the replicate.ipynb to reflect that.

twitter: @charliebholtz

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-19 01:43:40 +00:00
Congyu
dd5139e304 community[patch]: truncate zhipuai temperature and top_p parameters to [0.01, 0.99] (#20261)
ZhipuAI API only accepts `temperature` parameter between `(0, 1)` open
interval, and if `0` is passed, it responds with status code `400`.

However, 0 and 1 is often accepted by other APIs, for example, OpenAI
allows `[0, 2]` for temperature closed range.

This PR truncates temperature parameter passed to `[0.01, 0.99]` to
improve the compatibility between langchain's ecosystem's and ZhipuAI
(e.g., ragas `evaluate` often generates temperature 0, which results in
a lot of 400 invalid responses). The PR also truncates `top_p` parameter
since it has the same restriction.

Reference: [glm-4 doc](https://open.bigmodel.cn/dev/api#glm-4) (which
unfortunately is in Chinese though).

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-19 01:31:30 +00:00
Lance Martin
d5c22b80a5 community[patch]: Fix Ollama for LLaMA3 (#20624)
We see verbose generations w/ LLaMA3 and Ollama - 

https://smith.langchain.com/public/88c4cd21-3d57-4229-96fe-53443398ca99/r

--- 

Fix here implies that when stop was being set to an empty list, the
stream had no conditions under which to stop, which could lead to
excessive or unintended output.

Test LLaMA2 - 

https://smith.langchain.com/public/57dfc64a-591b-46fa-a1cd-8783acaefea2/r

Test LLaMA3 - 

https://smith.langchain.com/public/76ff5f47-ac89-4772-a7d2-5caa907d3fd6/r

https://smith.langchain.com/public/a31d2fad-9094-4c93-949a-964b27630ccb/r

Test Mistral -

https://smith.langchain.com/public/a4fe7114-c308-4317-b9fd-6c86d31f1c5b/r

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-19 00:20:32 +00:00
Erick Friis
726234eee5 infra: fix doc imports ci (#20629) 2024-04-18 23:42:03 +00:00
Erick Friis
3425988de7 core: deprecation default to qualname (#20578) 2024-04-18 15:35:17 -07:00
hulitaitai
7d0a008744 community[minor]: Add audio-parser "faster-whisper" in audio.py (#20012)
faster-whisper is a reimplementation of OpenAI's Whisper model using
CTranslate2, which is up to 4 times faster than enai/whisper for the
same accuracy while using less memory. The efficiency can be further
improved with 8-bit quantization on both CPU and GPU.

It can automatically detect the following 14 languages and transcribe
the text into their respective languages: en, zh, fr, de, ja, ko, ru,
es, th, it, pt, vi, ar, tr.

The gitbub repository for faster-whisper is :
    https://github.com/SYSTRAN/faster-whisper

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-18 20:50:59 +00:00
Guangdong Liu
e3c2431c5b comminuty[patch]:Fix Error in apache doris insert (#19989)
- **Issue:** #19886
2024-04-18 16:34:32 -04:00
naaive
6f0d4f3f09 docs: Update body_func to hybrid_query in ElasticsearchRetriever (#20498) 2024-04-18 20:19:02 +00:00
Tomaz Bratanic
27370b679e community[patch]: Ignore null and invalid embedding values for neo4j metadata filtering (#20558) 2024-04-18 16:15:45 -04:00
Eugene Yurtsev
718c9cbe3a mistral[patch]: Support both model and model_name (#20557) 2024-04-18 16:12:33 -04:00
Eugene Yurtsev
e3bd521654 docs: Remove example vsdx data (#20620)
VSDX data contains EMF files. Some of these apparently can contain
exploits with some Adobe tools.

This is likely a false positive from antivirus software, but we
can remove it nonetheless.
2024-04-18 16:10:40 -04:00
Dhruv Chawla
c0548eb632 docs: Update uptrain.ipynb to show outputs (#20551)
Hey @eyurtsev, I noticed that the notebook isn't displaying the outputs
properly. I've gone ahead and rerun the cells to ensure that readers can
easily understand the functionality without having to run the code
themselves.
2024-04-18 16:10:23 -04:00
Leonid Ganeline
95dc90609e experimental[patch]: prompts import fix (#20534)
Replaced `from langchain.prompts` with `from langchain_core.prompts`
where it is appropriate.
Most of the changes go to `langchain_experimental`
Similar to #20348
2024-04-18 16:09:11 -04:00
Massimiliano Pronesti
2542a09abc community[patch]: AzureSearch incorrectly converted to retriever (#20601)
Closes #20600.

Please see the issue for more details.
2024-04-18 16:06:47 -04:00
Leonid Ganeline
520ef24fb9 docs: import update (#20610)
Updated imports
2024-04-18 16:05:17 -04:00
Christophe Bornet
8f0b5687a3 community[minor]: Add hybrid search to Cassandra VectorStore (#20286)
Only supported by Astra DB at the moment.
**Twitter handle:** cbornet_
2024-04-18 15:58:43 -04:00
Christophe Bornet
d2d01370bc community[minor]: Add async methods to CassandraLoader (#20609)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-18 19:45:20 +00:00
Eugene Yurtsev
8c29b7bf35 mistralai[patch]: Use public attribute for eventsource.response (#20580)
Minor change, use the public attribute instead of the protected one.
2024-04-18 14:12:12 -04:00
Erick Friis
66fb0b1f35 core: fix fireworks mapping (#20613) 2024-04-18 18:08:40 +00:00
balloonio
e786da7774 community[patch]: Invoke callback prior to yielding token fix [HuggingFaceTextGenInference] (#20426)
…gFaceTextGenInference)

- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [HuggingFaceTextGenInference]


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in [HuggingFaceTextGenInference]
    - **Issue:** https://github.com/langchain-ai/langchain/issues/16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-18 14:25:20 +00:00
Ethan Yang
2d6d796040 community: Add save_model function for openvino reranker and embedding (#19896) 2024-04-18 10:20:33 -04:00
zR
9c1d7f2405 update zhipuai notebook (#20595)
fix timeout issue
fix zhipuai usecase notebookbook

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-18 10:12:12 -04:00
MajorDouble
9c175bc618 Update README.md -- broken hyperlink (#20422)
fixed broken `LangGraph` hyperlink

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-18 14:07:52 +00:00
Ikko Eltociear Ashimine
7a884eb416 Update RAPTOR.ipynb (#20586)
Langauge -> Language
2024-04-18 09:47:17 -04:00
Justsosostar
697d98cac9 fix typo in langchain/docs/docs/intergrations/tools/nuclia.ipynb (#20591)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-18 13:46:45 +00:00
ccurme
c897264b9b community: (milvus) check for num_shards (#20603)
@rgupta2508 I believe this change is necessary following
https://github.com/langchain-ai/langchain/pull/20318 because of how
Milvus handles defaults:


59bf5e811a/pymilvus/client/prepare.py (L82-L85)
```python
num_shards = kwargs[next(iter(same_key))]
if not isinstance(num_shards, int):
    msg = f"invalid num_shards type, got {type(num_shards)}, expected int"
    raise ParamError(message=msg)
req.shards_num = num_shards
```
this way lets Milvus control the default value (instead of maintaining a
separate default in Langchain).

Let me know if I've got this wrong or you feel it's unnecessary. Thanks.
2024-04-18 09:44:56 -04:00
Rohit Gupta
25c4c24e89 Support to create shards_num in milvus vectorstores (#20318)
To support number of the shards for the collection to create in milvus
vvectorstores.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-18 08:58:00 -04:00
aditya thomas
8bad536c6c docs[callbacks]: update to the FileCallbackHandler documentation (#20496)
**Description:** Update to the `FileCallbackHandler` documentation
**Issue:** #20493 
**Dependencies:** None
2024-04-17 22:32:21 -04:00
aditya thomas
cea379e7c7 community, core[callbacks]: move FileCallbackHandler from community to core (#20495)
**Description:** Move `FileCallbackHandler` from community to core
**Issue:** #20493 
**Dependencies:** None

(imo) `FileCallbackHandler` is a built-in LangChain callback handler
like `StdOutCallbackHandler` and should properly be in in core.
2024-04-17 22:29:30 -04:00
Erick Friis
084bedd16e docs: nits (#20577) 2024-04-18 00:20:44 +00:00
Erick Friis
e7e94b37f1 upstage: fix core dep (#20576) 2024-04-17 16:33:09 -07:00
Erick Friis
e395115807 docs: aws docs updates (#20571) 2024-04-17 23:32:00 +00:00
Erick Friis
f09bd0b75b upstage: init package (#20574)
Co-authored-by: Sean Cho <sean@upstage.ai>
Co-authored-by: JuHyung-Son <sonju0427@gmail.com>
2024-04-17 23:25:36 +00:00
Marco Perini
11c9ed3362 community[patch]: exposing headless flag parameter to AsyncChromiumLoader class (#20424)
- **Description:** added the headless parameter as optional argument to
the langchain_community.document_loaders AsyncChromiumLoader class
  - **Dependencies:** None
  - **Twitter handle:** @perinim_98

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-17 16:00:28 -07:00
Bagatur
54e9271504 anthropic[patch]: fix msg mutation (#20572) 2024-04-17 15:47:19 -07:00
Nuno Campos
719da8746e core: fix attributeerror in runnablelambda.deps (#20569)
- would happen when user's code tries to access attritbute that doesnt
exist, we prefer to let this crash in the user's code, rather than here
- also catch more cases where a runnable is invoked/streamed inside a
lambda. before we weren't seeing these as deps
2024-04-17 15:38:39 -07:00
Jacob Lee
8b09e81496 Lock low level dep to fix Vercel docs build (#20573)
@baskaryan @efriis 

TODO: Figure out why our lockfile isn't being respected here
2024-04-17 15:21:28 -07:00
Christophe Bornet
a22da4315b community[patch]: Replace function in CassandraVectorStore with simpler lambda (#20323) 2024-04-17 17:13:13 -04:00
Christophe Bornet
75733c5cc1 community[minor]: Improve CassandraVectorStore from_texts (#20284) 2024-04-17 17:12:28 -04:00
Tomer Cagan
463160c3f6 community: fix DirectoryLoader progress bar (#19821)
**Description:** currently, the `DirectoryLoader` progress-bar maximum value is based on an incorrect number of files to process

In langchain_community/document_loaders/directory.py:127:

```python
        paths = p.rglob(self.glob) if self.recursive else p.glob(self.glob)
        items = [
            path
            for path in paths
            if not (self.exclude and any(path.match(glob) for glob in self.exclude))
        ]
```

`paths` returns both files and directories. `items` is later used to determine the maximum value of the progress-bar which gives an incorrect progress indication.
2024-04-17 21:12:16 +00:00
Bagatur
984e7e36c2 anthropic[patch]: Release 0.1.10 (#20568) 2024-04-17 14:05:42 -07:00
Pengcheng Liu
ecd19a9e58 community[patch]: Add function call support in Tongyi chat model. (#20119)
- [ ] **PR message**: 
- **Description:** This pr adds function calling support in Tongyi chat
model.
    - **Issue:** None
    - **Dependencies:** None
    - **Twitter handle:** None

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-17 20:42:23 +00:00
kaijietti
80679ab906 zep[patch]: implement add_messages and aadd_messages (#20099)
This PR implement `add_messages` and `aadd_messages` to avoid
unnecessary round-trips.
2024-04-17 13:40:24 -07:00
Guangdong Liu
55dd349472 docs: Get rid of ZeroShotAgent and use create_react_agent instead (#20154)
- **Issue:** close #20122
 - @baskaryan, @eyurtsev.
2024-04-17 13:35:14 -07:00
Guangdong Liu
1e3b07aae2 docs: Get rid of ZeroShotAgent and use create_react_agent instead (#20155)
- **Issue:** #20122
- @baskaryan,@eyurtsev
2024-04-17 13:34:57 -07:00
ccurme
2238490069 mistral, openai: allow anthropic-style messages in message histories (#20565) 2024-04-17 15:55:45 -04:00
Eugene Yurtsev
7a7851aa06 anthropic[patch]: Handle empty text block (#20566)
Handle empty text block
2024-04-17 15:37:04 -04:00
Bagatur
7917e2c418 core[patch]: Release 0.1.44 (#20564) 2024-04-17 18:34:44 +00:00
ccurme
4a17951900 mistral: read tool calls from AIMessage (#20554)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-17 13:38:24 -04:00
Eugene Yurtsev
f257909699 mistralai[patch]: Surface http errors (#20555)
Do not swallow errors when streaming with httpx.

Update affected code if this PR gets merged to httpx:
https://github.com/florimondmanca/httpx-sse/pull/25/files
2024-04-17 10:47:56 -04:00
Sevin F. Varoglu
3f156e0ece community[minor]: add ChatOctoAI (#20059)
This PR adds ChatOctoAI, a chat model integration for OctoAI.
2024-04-17 03:20:56 -07:00
Eun Hye Kim
b34f1086fe community[patch]: Add streaming logic in ChatHuggingFace (#18784)
- Add functions (_stream, _astream)
- Connect to _generate and _agenerate

Thank you for contributing to LangChain!

- [x] **PR title**: "community: Add streaming logic in ChatHuggingFace"

- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Addition functions (_stream, _astream) and connection
to _generate and _agenerate
    - **Issue:** #18782
    - **Dependencies:** none
    - **Twitter handle:** @lunara_x
2024-04-16 19:17:03 -07:00
Bagatur
c05c379b26 docs: add structred output to feat table (#20539) 2024-04-16 19:14:26 -07:00
pjb157
479be3cc91 community[minor]: Unify Titan Takeoff Integrations and Adding Embedding Support (#18775)
**Community: Unify Titan Takeoff Integrations and Adding Embedding
Support**

 **Description:** 
Titan Takeoff no longer reflects this either of the integrations in the
community folder. The two integrations (TitanTakeoffPro and
TitanTakeoff) where causing confusion with clients, so have moved code
into one place and created an alias for backwards compatibility. Added
Takeoff Client python package to do the bulk of the work with the
requests, this is because this package is actively updated with new
versions of Takeoff. So this integration will be far more robust and
will not degrade as badly over time.

**Issue:**
Fixes bugs in the old Titan integrations and unified the code with added
unit test converge to avoid future problems.

**Dependencies:**
Added optional dependency takeoff-client, all imports still work without
dependency including the Titan Takeoff classes but just will fail on
initialisation if not pip installed takeoff-client

**Twitter**
@MeryemArik9

Thanks all :)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-17 01:43:35 +00:00
Rahul Triptahi
2cbfc94bcb community[patch]: Add support for authorized identities in PebbloSafeLoader. (#20055)
Description: Add support for authorized identities in PebbloSafeLoader.
Now with this change, PebbloSafeLoader will extract
authorized_identities from metadata and send it to pebblo server
Dependencies: None
Documentation: None

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-04-16 18:34:06 -07:00
Rahul Triptahi
475892ca0e docs: Add Documentation to enable authorized access identities in GoogleDriveLoader. (#20065)
Description: Document update.

GoogleDriveLoader: Added documentation for `load_auth` a new argument in
document_loaders/GoogleDriveLoader.

Dependencies: None
Documentation:
https://python.langchain.com/docs/integrations/document_loaders/google_drive/

Associated PR: https://github.com/langchain-ai/langchain-google/pull/110

Twitter handle: @rahul_tripathi2

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-04-16 18:33:10 -07:00
Guangdong Liu
b78ede2f96 community[patch]: standardize init args (#20166)
Related to https://github.com/langchain-ai/langchain/issues/20085

@baskaryan
2024-04-16 18:30:26 -07:00
Guangdong Liu
3729bec1a2 community[patch]: standardize init args (#20210)
Related to https://github.com/langchain-ai/langchain/issues/20085

@baskaryan
2024-04-16 18:29:57 -07:00
sdan
a7c5e41443 community[minor]: Added VLite as VectorStore (#20245)
Support [VLite](https://github.com/sdan/vlite) as a new VectorStore
type.

**Description**:
vlite is a simple and blazing fast vector database(vdb) made with numpy.
It abstracts a lot of the functionality around using a vdb in the
retrieval augmented generation(RAG) pipeline such as embeddings
generation, chunking, and file processing while still giving developers
the functionality to change how they're made/stored.

**Before submitting**:
Added tests
[here](c09c2ebd5c/libs/community/tests/integration_tests/vectorstores/test_vlite.py)
Added ipython notebook
[here](c09c2ebd5c/docs/docs/integrations/vectorstores/vlite.ipynb)
Added simple docs on how to use
[here](c09c2ebd5c/docs/docs/integrations/providers/vlite.mdx)

**Profiles**

Maintainers: @sdan
Twitter handles: [@sdand](https://x.com/sdand)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-17 01:24:38 +00:00
Hyeongchan Kim
7824291252 community[patch]: Fix not to cast to str type when file_path is None (#20057)
From `langchain_community 0.0.30`, there's a bug that cannot send a
file-like object via `file` parameter instead of `file path` due to
casting the `file_path` to str type even if `file_path` is None.

which means that when I call the `partition_via_api()`, exactly one of
`filename` and `file` must be specified by the following error message.

however, from `langchain_community 0.0.30`, `file_path` is casted into
`str` type even `file_path` is None in `get_elements_from_api()` and got
an error at `exactly_one(filename=filename, file=file)`.

here's an error message
```
---> 51     exactly_one(filename=filename, file=file)
     53     if metadata_filename and file_filename:
     54         raise ValueError(
     55             "Only one of metadata_filename and file_filename is specified. "
     56             "metadata_filename is preferred. file_filename is marked for deprecation.",
     57         )

File /opt/homebrew/lib/python3.11/site-packages/unstructured/partition/common.py:441, in exactly_one(**kwargs)
    439 else:
    440     message = f"{names[0]} must be specified."
--> 441 raise ValueError(message)

ValueError: Exactly one of filename and file must be specified.
```

So, I simply made a change that casting to str type when `file_path` is
not None.

I use `UnstructuredAPIFileLoader` like below.

```
from langchain_community.document_loaders.unstructured import UnstructuredAPIFileLoader

documents: list = UnstructuredAPIFileLoader(
    file_path=None,
    file=file,  # file-like object, io.BytesIO type
    mode='elements',
    url='http://127.0.0.1:8000/general/v0/general',
    content_type='application/pdf',
    metadata_filename='asdf.pdf',
).load_and_split()
```
2024-04-16 18:06:21 -07:00
Prashanth Rao
295b9b704b community[patch]: Improve Kuzu Cypher generation prompt (#20481)
- [x] **PR title**: "community: improve kuzu cypher generation prompt"

- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Improves the Kùzu Cypher generation prompt to be more
robust to open source LLM outputs
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** @kuzudb

- [x] **Add tests and docs**: If you're adding a new integration, please
include
No new tests (non-breaking. change)

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-04-16 18:01:36 -07:00
MacanPN
bce69ae43d community[patch]: Changes to base_o365 and sharepoint document loaders (#20373)
## Description:
The PR introduces 3 changes:
1. added `recursive` property to `O365BaseLoader`. (To keep the behavior
unchanged, by default is set to `False`). When `recursive=True`,
`_load_from_folder()` also recursively loads all nested folders.
2. added `folder_id` to SharePointLoader.(similar to (this
PR)[https://github.com/langchain-ai/langchain/pull/10780] ) This
provides an alternative to `folder_path` that doesn't seem to reliably
work.
3. when none of `document_ids`, `folder_id`, `folder_path` is provided,
the loader fetches documets from root folder. Combined with
`recursive=True` this provides an easy way of loading all compatible
documents from SharePoint.

The PR contains the same logic as [this stale
PR](https://github.com/langchain-ai/langchain/pull/10780) by
@WaleedAlfaris. I'd like to ask his blessing for moving forward with
this one.

## Issue:
- As described in https://github.com/langchain-ai/langchain/issues/19938
and https://github.com/langchain-ai/langchain/pull/10780 the sharepoint
loader often does not seem to work with folder_path.
- Recursive loading of subfolders is a missing functionality

## Dependecies: None

Twitter handle:
@martintriska1 @WRhetoric

This is my first PR here, please be gentle :-)
Please review @baskaryan

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-17 00:36:15 +00:00
Sevin F. Varoglu
54d388d898 community[patch]: update OctoAI endpoint to subclass BaseOpenAI (#19757)
This PR updates OctoAIEndpoint LLM to subclass BaseOpenAI as OctoAI is
an OpenAI-compatible service. The documentation and tests have also been
updated.
2024-04-16 17:32:20 -07:00
Erick Friis
0c95ddbcd8 docs: add snowflake provider page (#20538) 2024-04-17 00:31:27 +00:00
Benito Geordie
57b226532d community[minor]: Added integrations for ThirdAI's NeuralDB as a Retriever (#17334)
**Description:** Adds ThirdAI NeuralDB retriever integration. NeuralDB
is a CPU-friendly and fine-tunable text retrieval engine. We previously
added a vector store integration but we think that it will be easier for
our customers if they can also find us under under
langchain-community/retrievers.

---------

Co-authored-by: kartikTAI <129414343+kartikTAI@users.noreply.github.com>
Co-authored-by: Kartik Sarangmath <kartik@thirdai.com>
2024-04-16 16:36:55 -07:00
WeichenXu
e9fc87aab1 community[patch]: Make ChatDatabricks model supports streaming response (#19912)
**Description:** Make ChatDatabricks model supports stream
**Issue:** N/A
**Dependencies:** MLflow nightly build version (we will release next
MLflow version soon)
**Twitter handle:** N/A

Manually test:

(Before testing, please install `pip install
git+https://github.com/mlflow/mlflow.git`)

```python
# Test Databricks Foundation LLM model
from langchain.chat_models import ChatDatabricks

chat_model = ChatDatabricks(
    endpoint="databricks-llama-2-70b-chat",
    max_tokens=500
)
from langchain_core.messages import AIMessageChunk

for chunk in chat_model.stream("What is mlflow?"):
  print(chunk.content, end="|")
```

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Signed-off-by: Weichen Xu <weichen.xu@databricks.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-16 23:34:49 +00:00
ccurme
a892f985d3 standardized-tests[patch]: test tool call messages (#20519)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-16 23:25:50 +00:00
Erick Friis
e7fe5f7d3f anthropic[patch]: serialization in partner package (#18828) 2024-04-16 16:05:58 -07:00
Bagatur
f74d5d642e anthropic[patch]: bump to core 0.1.43 (#20537) 2024-04-16 22:47:07 +00:00
Bagatur
96d8769eae anthropic[patch]: release 0.1.9, use tool calls if content is empty (#20535) 2024-04-16 15:27:29 -07:00
Erick Friis
6adca37eb7 core: default chat/llm _identifying_params to lc_attributes (#20232) 2024-04-16 14:55:47 -07:00
ccurme
22da9f5f3f update scheduled tests (#20526)
repurpose scheduled tests to test over provider packages
2024-04-16 16:49:46 -04:00
Nuno Campos
806a54908c Runnable graph viz improvements (#20529)
- Add conditional: bool property to json representation of the graphs
- Add option to generate mermaid graph stripped of styles (useful as a
text representation of graph)
2024-04-16 20:17:47 +00:00
Nuno Campos
f3aa26d6bf Fix getattr in runnable binding for cases where config is passed in as arg too (#20528)
…s arg too

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-16 13:10:29 -07:00
Dhruv Chawla
d6d559d50d community[minor]: add UpTrainCallbackHandler (#19956)
- **Description:** 
This PR adds a callback handler for UpTrain. It performs evaluations in
the RAG pipeline to check the quality of retrieved documents, generated
queries and responses.

- **Dependencies:** 
    - The UpTrainCallbackHandler requires the uptrain package

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-04-16 19:32:03 +00:00
Bagatur
07f23bd4ff docs: response metadata (#20527) 2024-04-16 12:17:27 -07:00
Leonid Ganeline
45d045b2c5 core[minor], langchain[patch]: tools dependencies refactoring (#18759)
The `langchain.tools`
[namespace](https://api.python.langchain.com/en/latest/langchain_api_reference.html#module-langchain.tools)
can be completely eliminated by moving one class and 3 functions into
`core`. It makes sense since the class and functions are very core.
2024-04-16 14:15:09 -04:00
Erick Friis
77eba10f47 standard-tests: fix default fixtures (#20520) 2024-04-16 16:12:36 +00:00
Ravindu Somawansa
5acc7ba622 community[minor]: Add glue catalog loader (#20220)
Add Glue Catalog loader
2024-04-16 11:39:23 -04:00
Dawson Bauer
aab075345e core[patch]: Fix imports defined in messages sub-package (#20500)
core[patch]: Fix imports defined in messages sub-package (#20500)
2024-04-16 14:19:51 +00:00
Fayfox
9fd36efdb5 anthropic[patch]: env ANTHROPIC_API_URL not work (#20507)
enviroment variable ANTHROPIC_API_URL will not work if anthropic_api_url
has default value

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
2024-04-16 10:16:51 -04:00
Martín Gotelli Ferenaz
b48add4353 community[patch]: Fix pgvector deprecated filter clause usage with OR and AND conditions (#20446)
**Description**: Support filter by OR and AND for deprecated PGVector
version
**Issue**: #20445 
**Dependencies**: N/A
**Twitter** handle: @martinferenaz
2024-04-16 14:08:07 +00:00
Eugene Yurtsev
c50099161b community[patch]: Use uuid4 not uuid1 (#20487)
Using UUID1 is incorrect since it's time dependent, which makes it easy
to generate the exact same uuid
2024-04-16 09:40:44 -04:00
Bagatur
f7667c614b docs: update tool use case (#20404) 2024-04-16 04:27:27 +00:00
Erick Friis
86cf1d3ee1 community: release 0.0.33 (#20490) 2024-04-16 00:30:05 +00:00
Erick Friis
90184255f8 core: release 0.1.43 (#20489) 2024-04-15 22:48:34 +00:00
Erick Friis
7997f3b7f8 core: forward config params to default (#20402)
nuno's fault not mine

---------

Co-authored-by: Nuno Campos <nuno@boringbits.io>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
2024-04-15 15:42:39 -07:00
Nuno Campos
97b2191e99 core: Add concept of conditional edge to graph rendering (#20480)
- implement for mermaid, graphviz and ascii
- this is to be used in langgraph
2024-04-15 13:49:06 -07:00
Averi Kitsch
30b00090ef docs: Add Google Firestore Vectorstore doc (#20078)
- **Description:**Add Google Firestore Vector store docs
    - **Issue:** NA
    - **Dependencies:** NA

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-15 20:09:32 +00:00
Leonid Kuligin
cc3c343673 docs: changed model's name in google-vertex-ai integration to a publicly available model (#20482)
docs: changed model's name in google-vertex-ai integration to a publicly
available model
2024-04-15 15:18:27 -04:00
Leonid Ganeline
7ea80bcb22 docs: tutorials update (#20483)
Added the `freeCodeCamp` tutorials link
2024-04-15 15:17:32 -04:00
Ángel Igareta
60c7a17781 Remove logic to exclude intermediate nodes from rendering time (#20459)
Description: For simplicity, migrate the logic of excluding intermediate
nodes in the .get_graph() of langgraph package
(https://github.com/langchain-ai/langgraph/pull/310) at graph creation
time instead of graph rendering time.

Note: #20381 needs to be approved first

---------

Co-authored-by: Angel Igareta <angel.igareta@klarna.com>
Co-authored-by: Nuno Campos <nuno@langchain.dev>
Co-authored-by: Nuno Campos <nuno@boringbits.io>
2024-04-15 16:40:51 +00:00
Mohammed Noumaan Ahamed
4dd05791a2 docs: quickstart retrieval chain for Cohere(API) (#20475)
- **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


Description: fixes LangChainDeprecationWarning: The class
`langchain_community.embeddings.cohere.CohereEmbeddings` was deprecated
in langchain-community 0.0.30 and will be removed in 0.2.0. An updated
version of the class exists in the langchain-cohere package and should
be used instead. To use it run `pip install -U langchain-cohere` and
import as `from langchain_cohere import CohereEmbeddings`.

![Screenshot 2024-04-15
200948](https://github.com/langchain-ai/langchain/assets/93511919/085b967d-a6fd-42c6-9404-faab8c5630ec)



Dependencies : langchain_cohere

Twitter handle: @Mo_Noumaan
2024-04-15 11:28:39 -04:00
Ángel Igareta
d55a365c6c Fix CDN URL in mermaid graph renderer (#20381)
Description of features on mermaid graph renderer:
- Fixing CDN to use official Mermaid JS CDN:
https://www.jsdelivr.com/package/npm/mermaid?tab=files
- Add device_scale_factor to allow increasing quality of resulting PNG.
2024-04-15 08:01:35 -07:00
Eugene Yurtsev
3cbc4693f5 docs: Add integration doc for postgres vectorstore (#20473)
Adds a postgres vectorstore via langchain-postgres.
2024-04-15 14:20:27 +00:00
Leonid Kuligin
676c68d318 community[patch]: deprecating remaining google_community integrations (#20471)
Deprecating remaining google community integrations
2024-04-15 09:57:12 -04:00
balloonio
b66a4f48fa community[patch]: Invoke callback prior to yielding token fix [DeepInfra] (#20427)
- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for [DeepInfra]


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in [DeepInfra]
    - **Issue:** https://github.com/langchain-ai/langchain/issues/16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-14 14:32:52 -04:00
Juan Carlos José Camacho
450c458f8f community[minor]: Add Datahareld tool (#19680)
**Description:** Integrate [dataherald](https://www.dataherald.com)
tool, It is a natural language-to-SQL tool.
**Dependencies:** Install dataherald sdk to use it,
```
pip install dataherald
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Christophe Bornet <cbornet@hotmail.com>
2024-04-13 23:27:16 +00:00
Alexander Smirnov
ece008f117 docs: Refine RunnablePassthrough docstring (#19812)
Description: This update refines the documentation for
`RunnablePassthrough` by removing an unnecessary import and correcting a
minor syntactical error in the example provided. This change enhances
the clarity and correctness of the documentation, ensuring that users
have a more accurate guide to follow.

Issue: N/A

Dependencies: None

This PR focuses solely on documentation improvements, specifically
targeting the `RunnablePassthrough` class within the `langchain_core`
module. By clarifying the example provided in the docstring, users are
offered a more straightforward and error-free guide to utilizing the
`RunnablePassthrough` class effectively.

As this is a documentation update, it does not include changes that
require new integrations, tests, or modifications to dependencies. It
adheres to the guidelines of minimal package interference and backward
compatibility, ensuring that the overall integrity and functionality of
the LangChain package remain unaffected.

Thank you for considering this documentation refinement for inclusion in
the LangChain project.
2024-04-13 16:23:32 -07:00
Egor Krasheninnikov
c8391d4ff1 community[patch]: Fix YandexGPT embeddings (#19720)
Fix of YandexGPT embeddings. 

The current version uses a single `model_name` for queries and
documents, essentially making the `embed_documents` and `embed_query`
methods the same. Yandex has a different endpoint (`model_uri`) for
encoding documents, see
[this](https://yandex.cloud/en/docs/yandexgpt/concepts/embeddings). The
bug may impact retrievers built with `YandexGPTEmbeddings` (for instance
FAISS database as retriever) since they use both `embed_documents` and
`embed_query`.

A simple snippet to test the behaviour:
```python
from langchain_community.embeddings.yandex import YandexGPTEmbeddings
embeddings = YandexGPTEmbeddings()
q_emb = embeddings.embed_query('hello world')
doc_emb = embeddings.embed_documents(['hello world', 'hello world'])
q_emb == doc_emb[0]
```
The response is `True` with the current version and `False` with the
changes I made.


Twitter: @egor_krash

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-13 16:23:01 -07:00
Guangdong Liu
4be7ca7b4c community[patch]:sparkllm standardize init args (#20194)
Related to https://github.com/langchain-ai/langchain/issues/20085
@baskaryan
2024-04-13 16:03:19 -07:00
Rohit Agarwal
7d7a08e458 docs: Update Portkey provider integration (#20412)
**Description:** Updates the documentation for Portkey and Langchain.
Also updates the notebook. The current documentation is fairly old and
is non-functional.
**Twitter handle:** @portkeyai

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-13 23:01:48 +00:00
Yuki Oshima
0758da8940 community[patch]: Set default value for _ListSQLDatabaseToolInput tool_input (#20409)
**Description:**

`_ListSQLDatabaseToolInput` raise error if model returns `{}`.
For example, gpt-4-turbo returns `{}` with SQL Agent initialized by
`create_sql_agent`.

So, I set default value `""` for `_ListSQLDatabaseToolInput` tool_input.

This is actually a gpt-4-turbo issue, not a LangChain issue, but I
thought it would be helpful to set a default value `""`.

This problem is discussed in detail in the following Issue.

**Issue:** https://github.com/langchain-ai/langchain/issues/20405

**Dependencies:** none

Sorry, I did not add or change the test code, as tests for this
components was not exist .

However, I have tested the following code based on the [SQL Agent
Document](https://python.langchain.com/docs/use_cases/sql/agents/), to
make sure it works.

```
from langchain_community.agent_toolkits.sql.base import create_sql_agent
from langchain_community.utilities.sql_database import SQLDatabase
from langchain_openai import ChatOpenAI

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
llm = ChatOpenAI(model="gpt-4-turbo", temperature=0)
agent_executor = create_sql_agent(llm, db=db, agent_type="openai-tools", verbose=True)
result = agent_executor.invoke("List the total sales per country. Which country's customers spent the most?")
print(result["output"])
```
2024-04-13 15:58:47 -07:00
Kenneth Choe
b507cd222b docs: changed the link to more helpful source (#20411)
docs: changed a link to better source

[Previous
link](https://www.philschmid.de/custom-inference-huggingface-sagemaker)
is about how to upload embeddings model.
[New
link](https://huggingface.co/blog/kchoe/deploy-any-huggingface-model-to-sagemaker)
is about how to upload cross encoder model, which directly addresses
what is needed here. For full disclosure, I wrote this article and the
sample `inference.py` is the result of this new article.

Co-authored-by: Kenny Choe <kchoe@amazon.com>
2024-04-13 15:54:33 -07:00
saberuster
160bcaeb93 text-splitters[minor]: Add lua code splitting (#20421)
- **Description:** Complete the support for Lua code in
langchain.text_splitter module.
- **Dependencies:** No
- **Twitter handle:** @saberuster

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-13 22:42:51 +00:00
ccurme
4b6b0a87b6 groq[patch]: Make stream robust to ToolMessage (#20417)
```python
from langchain.agents import AgentExecutor, create_tool_calling_agent, tool
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_groq import ChatGroq


prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        ("human", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ]
)

model = ChatGroq(model_name="mixtral-8x7b-32768", temperature=0)

@tool
def magic_function(input: int) -> int:
    """Applies a magic function to an input."""
    return input + 2

tools = [magic_function]


agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what is the value of magic_function(3)?"})
```
```
> Entering new AgentExecutor chain...

Invoking: `magic_function` with `{'input': 3}`


5The value of magic\_function(3) is 5.

> Finished chain.
{'input': 'what is the value of magic_function(3)?',
 'output': 'The value of magic\\_function(3) is 5.'}
```
2024-04-13 15:40:55 -07:00
Leonid Ganeline
6dc4f592ba docs: tutorials update (#20401)
Added 3 new `LangChain.ai` playlists
2024-04-12 21:56:14 -04:00
ccurme
38faa74c23 community[patch]: update use of deprecated llm methods (#20393)
.predict and .predict_messages for BaseLanguageModel and BaseChatModel
2024-04-12 17:28:23 -04:00
Corey Zumar
3a068b26f3 community[patch]: Databricks - fix scope of dangerous deserialization error in Databricks LLM connector (#20368)
fix scope of dangerous deserialization error in Databricks LLM connector

---------

Signed-off-by: dbczumar <corey.zumar@databricks.com>
2024-04-12 17:27:26 -04:00
Bagatur
f1248f8d9a core[patch]: configurable init params (#20070)
Proposed fix for #20061. need to test

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-12 21:18:43 +00:00
Eugene Yurtsev
4808441d29 Docs: Add guide for implementing custom retriever (#20350)
Add longer guide for implementing custom retriever.

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-04-12 17:18:35 -04:00
aditya thomas
4f75b230ed partner[ai21]: masking of the api key for ai21 models (#20257)
**Description:** Masking of the API key for AI21 models
**Issue:** Fixes #12165 for AI21
**Dependencies:** None

Note: This fix came in originally through #12418 but was possibly missed
in the refactor to the AI21 partner package


---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-12 20:19:31 +00:00
Leonid Ganeline
e512d3c6a6 langchain: callbacks imports fix (#20348)
Replaced all `from langchain.callbacks` into `from
langchain_core.callbacks` .
Changes in the `langchain` and `langchain_experimental`

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-12 20:13:14 +00:00
Erick Friis
d83b720c40 templates: readme langsmith not private beta (#20173) 2024-04-12 13:08:10 -07:00
michael
525226fb0b docs: fix extraction/quickstart.ipynb example code (#20397)
- **Description**: The pydantic schema fields are supposed to be
optional but the use of `...` makes them required. This causes a
`ValidationError` when running the example code. I replaced `...` with
`default=None` to make the fields optional as intended. I also
standardized the format for all fields.
- **Issue**: n/a
- **Dependencies**: none
- **Twitter handle**: https://twitter.com/m_atoms

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-12 19:59:32 +00:00
balloonio
e7b1a44c5b community[patch]: Invoke callback prior to yielding token fix for Llamafile (#20365)
- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for Llamafile


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community llamafile.py
    - **Issue:** https://github.com/langchain-ai/langchain/issues/16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-12 19:26:12 +00:00
milind
1b272fa2f4 Update index.mdx (#20395)
spelling error fixed

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-12 19:22:08 +00:00
balloonio
93caa568f9 community[patch]: Invoke callback prior to yielding token fix for HuggingFaceEndpoint (#20366)
- [x] **PR title**: community[patch]: Invoke callback prior to yielding
token fix for HuggingFaceEndpoint


- [x] **PR message**: 
- **Description:** Invoke callback prior to yielding token in stream
method in community HuggingFaceEndpoint
    - **Issue:** https://github.com/langchain-ai/langchain/issues/16913
    - **Dependencies:** None
    - **Twitter handle:** @bolun_zhang

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-12 19:16:34 +00:00
Nicolas
ad04585e30 community[minor]: Firecrawl.dev integration (#20364)
Added the [FireCrawl](https://firecrawl.dev) document loader. Firecrawl
crawls and convert any website into LLM-ready data. It crawls all
accessible subpages and give you clean markdown for each.

    - **Description:** Adds FireCrawl data loader
    - **Dependencies:** firecrawl-py
    - **Twitter handle:** @mendableai 

ccing contributors: (@ericciarla @nickscamara)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-12 19:13:48 +00:00
Tomaz Bratanic
a1b105ac00 experimental[patch]: Skip pydantic validation for llm graph transformer and fix JSON response where possible (#19915)
LLMs might sometimes return invalid response for LLM graph transformer.
Instead of failing due to pydantic validation, we skip it and manually
check and optionally fix error where we can, so that more information
gets extracted
2024-04-12 11:29:25 -07:00
Erick Friis
20f5cd7c95 docs: langchain-chroma package (#20394) 2024-04-12 11:17:05 -07:00
Haris Ali
6786fa9186 docs: Adding api documentation link at the end of each output parser class description page. (#20391)
- **Description:** Added cross-links for easy access of api
documentation of each output parser class from it's description page.
  - **Issue:** related to issue #19969

Co-authored-by: Haris Ali <haris.ali@formulatrix.com>
2024-04-12 17:58:18 +00:00
P. Taylor Goetz
9317df7f16 community[patch]: Add "model" attribute to the payload sent to Ollama in ChatOllama (#20354)
Example Ollama API calls:

Request without "model":
```
curl --location 'http://localhost:11434/api/chat' \
--header 'Content-Type: application/json' \
--data '{
  "messages": [
    {
      "role": "user",
      "content": "What is the capitol of PA?"
    }
  ],
  "stream": false
}'
```
Response:
```
{"error":"model is required"}
```

Request with "model":
```
curl --location 'http://localhost:11434/api/chat' \
--header 'Content-Type: application/json' \
--data '{
  "model": "openchat",
  "messages": [
    {
      "role": "user",
      "content": "What is the capitol of PA?"
    }
  ],
  "stream": false
}'
```

Response:
```
{
  "eval_duration" : 733248000,
  "created_at" : "2024-04-11T23:04:08.735766843Z",
  "model" : "openchat",
  "message" : {
    "content" : " The capital city of Pennsylvania is Harrisburg.",
    "role" : "assistant"
  },
  "total_duration" : 3138731168,
  "prompt_eval_count" : 25,
  "load_duration" : 466562959,
  "done" : true,
  "prompt_eval_duration" : 1938495000,
  "eval_count" : 10
}
```
2024-04-12 13:32:53 -04:00
Bagatur
57bb940c17 docs: vertexai tool call update (#20362) 2024-04-12 10:09:54 -07:00
Alex Sherstinsky
fad0962643 community: for Predibase -- enable both Predibase-hosted and HuggingFace-hosted fine-tuned adapter repositories (#20370) 2024-04-12 08:32:00 -07:00
ccurme
5395c409cb docs: add Cohere to ChatModelTabs (#20386) 2024-04-12 10:35:10 -04:00
Eugene Yurtsev
6470b30173 langchain[patch]: Add deprecation warning to extraction chains (#20224)
Add deprecation warnings to extraction chains
2024-04-12 10:24:32 -04:00
Eugene Yurtsev
b65a1d4cfd langchain[patch]: Add another unit test for indexing code (#20387)
Add another unit test for indexing
2024-04-12 10:19:18 -04:00
Erick Friis
29282371db core: bind_tools interface on basechatmodel (#20360) 2024-04-12 01:32:19 +00:00
Erick Friis
e6806a08d4 multiple: standard chat model tests (#20359) 2024-04-11 18:23:13 -07:00
Bagatur
f78564d75c docs: show tool msg in tool call docs (#20358) 2024-04-11 16:42:04 -07:00
Isak Nyberg
bac9fb9a7c community: add gpt-4 pricing in callback (#20292)
Added the pricing for `gpt-4-turbo` and `gpt-4-turbo-2024-04-09` in the
callback method.
related to issue #17173 

https://openai.com/pricing#language-models
2024-04-11 18:02:39 -04:00
Ikko Eltociear Ashimine
cb29b42285 docs: Update ibm_watsonx.ipynb (#20329)
avaliable -> available


    - **Description:** fixed typo
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
2024-04-11 17:59:23 -04:00
Jack Wotherspoon
204a16addc docs: add Cloud SQL for MySQL vector store integration docs (#20278)
Adding docs page for `Google Cloud SQL for MySQL` vector store
integration. This was recently released as part of the Cloud SQL for
MySQL LangChain package
([release](https://github.com/googleapis/langchain-google-cloud-sql-mysql-python/releases/tag/v0.2.0))

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-11 21:57:46 +00:00
Leonid Ganeline
7cf2d2759d community[patch]: docstrings update (#20301)
Added missed docstrings. Format docstings to the consistent form.
2024-04-11 16:23:27 -04:00
Eugene Yurtsev
2900720cd3 core[patch]: Update documentation for base retriever (#20345)
Updating in code documentation for base retriever to direct folks toward
the .invoke and .ainvoke methods + explain how to implement
2024-04-11 16:20:14 -04:00
Bagatur
d2f4153fe6 docs: tool call nits (#20356) 2024-04-11 12:56:36 -07:00
Bagatur
eafd8c580b docs: tool agent nit (#20353) 2024-04-11 19:41:31 +00:00
Erick Friis
ec0273fc92 chroma: release 0.1.0 (#20355) 2024-04-11 12:39:52 -07:00
Bagatur
a889cd14f3 docs: use vertexai in chat model tabs (#20352) 2024-04-11 12:34:19 -07:00
Bagatur
9d302c1b57 docs: update anthropic tool call (#20344) 2024-04-11 11:38:26 -07:00
Erick Friis
da707d0755 chroma: remove relevance score int test (#20346)
deprecating feature in #20302
2024-04-11 11:29:33 -07:00
Eugene Yurtsev
de938a4451 docs: Update chat model providers include package information (#20336)
Include package information
2024-04-11 13:29:42 -04:00
Bagatur
56fe4ab382 docs: update tool-calling table (#20338) 2024-04-11 09:50:20 -07:00
Bagatur
43a98592c1 docs: tool agent nit (#20337) 2024-04-11 09:43:12 -07:00
Bagatur
562b546bcc docs: update chat openai (#20331) 2024-04-11 09:29:46 -07:00
Bagatur
2c4741b5ed docs: add tool-calling agent (#20328) 2024-04-11 09:29:40 -07:00
ccurme
f02e55aaf7 docs: add component page for tool calls (#20282)
Note: includes links to API reference pages for ToolCall and other
objects that currently don't exist (e.g.,
https://api.python.langchain.com/en/latest/messages/langchain_core.messages.tool.ToolCall.html#langchain_core.messages.tool.ToolCall).
2024-04-11 09:29:25 -07:00
Bagatur
6608089030 langchain[patch]: Release 0.1.16 (#20335) 2024-04-11 09:28:37 -07:00
Eugene Yurtsev
0e74fb4ec1 docs: Update list of chat models tool calling providers (#20330)
Will follow up with a few missing providers
2024-04-11 12:22:49 -04:00
Eugene Yurtsev
653489a1a9 docs: Update documentation for custom LLMs (#19972)
Update documentation for customizing LLMs
2024-04-11 12:21:27 -04:00
Bagatur
799714c629 release anthropic, fireworks, openai, groq, mistral (#20333) 2024-04-11 09:19:52 -07:00
Bagatur
e72330aacc core[patch]: Release 0.1.42 (#20332) 2024-04-11 09:10:27 -07:00
ccurme
795c728f71 mistral[patch]: add IDs to tool calls (#20299)
Mistral gives us one ID per response, no individual IDs for tool calls.

```python
from langchain.agents import AgentExecutor, create_tool_calling_agent, tool
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_mistralai import ChatMistralAI


prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        ("human", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ]
)
model = ChatMistralAI(model="mistral-large-latest", temperature=0)

@tool
def magic_function(input: int) -> int:
    """Applies a magic function to an input."""
    return input + 2

tools = [magic_function]

agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what is the value of magic_function(3)?"})
```

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-11 11:09:30 -04:00
Eugene Yurtsev
22fd844e8a community[patch]: Add deprecation warnings to postgres implementation (#20222)
Add deprecation warnings to postgres implementation that are in langchain-postgres.
2024-04-11 10:33:22 -04:00
Eugene Yurtsev
f02f708f52 core[patch]: For now remove user warning (#20321)
Remove warning since it creates a lot of noise.
2024-04-11 10:33:01 -04:00
Mayank Solanki
f709ab4cdf docs: added backtick on RunnablePassthrough (#20310)
added backtick on RunnablePassthrough
Isuue: #20094
2024-04-11 08:39:10 -04:00
Bagatur
c706689413 openai[patch]: use tool_calls in request (#20272) 2024-04-11 03:55:52 -07:00
Bagatur
e936fba428 langchain[patch]: agents check prompt partial vars (#20303) 2024-04-11 03:55:09 -07:00
Bagatur
cb25fa0d55 core[patch]: fix ChatGeneration.text with content blocks (#20294) 2024-04-10 15:54:06 -07:00
Bagatur
03b247cca1 core[patch]: include tool_calls in ai msg chunk serialization (#20291) 2024-04-10 22:27:40 +00:00
Erick Friis
0fa551c278 chroma: bump rc, keep optional (#20298) 2024-04-10 14:22:56 -07:00
Erick Friis
16f8fff14f chroma: add required fastapi dep to restrict to <1 (#20297) 2024-04-10 14:16:13 -07:00
Erick Friis
991fd82532 chroma: add optional fastapi dep to restrict to <1 (#20295) 2024-04-10 12:49:44 -07:00
killind-dev
f8a54d1d73 chroma: Add chroma partner package (#19292)
**Description:** Adds chroma to the partners package. Tests & code
mirror those in the community package.
**Dependencies:** None
**Twitter handle:** @akiradev0x

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-10 19:33:45 +00:00
Yuki Watanabe
eef19954f3 core[patch]: fix duplicated kwargs in _load_sql_databse_chain (#19908)
`kwargs` is specified twice in [this
line](3218463f6a/libs/langchain/langchain/chains/loading.py (L386)),
causing runtime error when passing any keyword arguments.
2024-04-10 12:20:28 -07:00
ccurme
39471a9c87 docs: update tool calling cookbook (#20290)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-10 15:06:33 -04:00
Nuno Campos
15271ac832 core: mustache prompt templates (#19980)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-10 11:25:32 -07:00
Leonid Ganeline
4cb5f4c353 community[patch]: import flattening fix (#20110)
This PR should make it easier for linters to do type checking and for IDEs to jump to definition of code.

See #20050 as a template for this PR.
- As a byproduct: Added 3 missed `test_imports`.
- Added missed `SolarChat` in to __init___.py Added it into test_import
ut.
- Added `# type: ignore` to fix linting. It is not clear, why linting
errors appear after ^ changes.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-10 13:01:19 -04:00
Yuki Oshima
12190ad728 openai[patch]: Fix langchain-openai unknown parameter error with gpt-4-turbo (#20271)
**Description:** 

I fixed langchain-openai unknown parameter error with gpt-4-turbo.

It seems that the behavior of the Chat Completions API implicitly
changed when using the latest gpt-4-turbo model, differing from previous
models. It now appears to reject parameters that are not listed in the
[API
Reference](https://platform.openai.com/docs/api-reference/chat/create).
So I found some errors and fixed them.

**Issue:** https://github.com/langchain-ai/langchain/issues/20264

**Dependencies:** none

**Twitter handle:** https://twitter.com/oshima_123
2024-04-10 09:51:38 -07:00
ccurme
21c1ce0bc1 update agents to use tool call messages (#20074)
```python
from langchain.agents import AgentExecutor, create_tool_calling_agent, tool
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant"),
        MessagesPlaceholder("chat_history", optional=True),
        ("human", "{input}"),
        MessagesPlaceholder("agent_scratchpad"),
    ]
)
model = ChatAnthropic(model="claude-3-opus-20240229")

@tool
def magic_function(input: int) -> int:
    """Applies a magic function to an input."""
    return input + 2

tools = [magic_function]

agent = create_tool_calling_agent(model, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "what is the value of magic_function(3)?"})
```
```
> Entering new AgentExecutor chain...

Invoking: `magic_function` with `{'input': 3}`
responded: [{'text': '<thinking>\nThe user has asked for the value of magic_function applied to the input 3. Looking at the available tools, magic_function is the relevant one to use here, as it takes an integer input and returns an integer output.\n\nThe magic_function has one required parameter:\n- input (integer)\n\nThe user has directly provided the value 3 for the input parameter. Since the required parameter is present, we can proceed with calling the function.\n</thinking>', 'type': 'text'}, {'id': 'toolu_01HsTheJPA5mcipuFDBbJ1CW', 'input': {'input': 3}, 'name': 'magic_function', 'type': 'tool_use'}]

5
Therefore, the value of magic_function(3) is 5.

> Finished chain.
{'input': 'what is the value of magic_function(3)?',
 'output': 'Therefore, the value of magic_function(3) is 5.'}
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-10 11:54:51 -04:00
Erick Friis
9eb6f538f0 infra, multiple: rc release versions (#20252) 2024-04-09 17:54:58 -07:00
Bagatur
0d0458d1a7 mistralai[patch]: Pre-release 0.1.2-rc.1 (#20251) 2024-04-10 00:25:38 +00:00
Bagatur
e4046939d0 anthropic[patch]: Pre-release 0.1.8-rc.1 (#20250) 2024-04-10 00:23:10 +00:00
Bagatur
a8eb0f5b1b openai[patch]: pre-release 0.1.3-rc.1 (#20249) 2024-04-10 00:22:08 +00:00
Bagatur
a43b9e4f33 core[patch]: Pre-release 0.1.42-rc.1 (#20248) 2024-04-09 19:10:38 -05:00
Bagatur
9514bc4d67 core[minor], ...: add tool calls message (#18947)
core[minor], langchain[patch], openai[minor], anthropic[minor], fireworks[minor], groq[minor], mistralai[minor]

```python
class ToolCall(TypedDict):
    name: str
    args: Dict[str, Any]
    id: Optional[str]

class InvalidToolCall(TypedDict):
    name: Optional[str]
    args: Optional[str]
    id: Optional[str]
    error: Optional[str]

class ToolCallChunk(TypedDict):
    name: Optional[str]
    args: Optional[str]
    id: Optional[str]
    index: Optional[int]


class AIMessage(BaseMessage):
    ...
    tool_calls: List[ToolCall] = []
    invalid_tool_calls: List[InvalidToolCall] = []
    ...


class AIMessageChunk(AIMessage, BaseMessageChunk):
    ...
    tool_call_chunks: Optional[List[ToolCallChunk]] = None
    ...
```
Important considerations:
- Parsing logic occurs within different providers;
- ~Changing output type is a breaking change for anyone doing explicit
type checking;~
- ~Langsmith rendering will need to be updated:
https://github.com/langchain-ai/langchainplus/pull/3561~
- ~Langserve will need to be updated~
- Adding chunks:
- ~AIMessage + ToolCallsMessage = ToolCallsMessage if either has
non-null .tool_calls.~
- Tool call chunks are appended, merging when having equal values of
`index`.
  - additional_kwargs accumulate the normal way.
- During streaming:
- ~Messages can change types (e.g., from AIMessageChunk to
AIToolCallsMessageChunk)~
- Output parsers parse additional_kwargs (during .invoke they read off
tool calls).

Packages outside of `partners/`:
- https://github.com/langchain-ai/langchain-cohere/pull/7
- https://github.com/langchain-ai/langchain-google/pull/123/files

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-09 18:41:42 -05:00
Erick Friis
00552918ac groq: xfail tool_choice tests (#20247) 2024-04-09 23:29:59 +00:00
Bagatur
2d83505be9 experimental[patch]: Release 0.0.57 (#20243) 2024-04-09 17:08:01 -05:00
Bagatur
f06cb59ab9 groq[patch]: Release 0.1.1 (#20242) 2024-04-09 21:59:58 +00:00
Erick Friis
ad3f1a9e85 docs: fix external repo partner docs (#20238) 2024-04-09 21:58:04 +00:00
Bagatur
0b2f0307d7 openai[patch]: Release 0.1.2 (#20241) 2024-04-09 21:55:19 +00:00
Bagatur
4b84c9b28c anthropic[patch]: Release 0.1.7 (#20240) 2024-04-09 21:53:16 +00:00
Bagatur
74d04a4e80 mistralai[patch]: Release 0.1.1 (#20239) 2024-04-09 21:53:01 +00:00
Bagatur
e5913c8758 langchain[patch]: Release 0.1.15 (#20237) 2024-04-09 21:50:32 +00:00
Bagatur
e39fdfddf1 community[patch]: Release 0.0.32 (#20236) 2024-04-09 21:37:10 +00:00
Bagatur
a07238d14e core[patch]: Release 0.1.41 (#20233) 2024-04-09 21:11:37 +00:00
Chip Davis
806d4ae48f community[patch]: fixed multithreading returning List[List[Documents]] instead of List[Documents] (#20230)
Description: When multithreading is set to True and using the
DirectoryLoader, there was a bug that caused the return type to be a
double nested list. This resulted in other places upstream not being
able to utilize the from_documents method as it was no longer a
`List[Documents]` it was a `List[List[Documents]]`. The change made was
to just loop through the `future.result()` and yield every item.
Issue: #20093
Dependencies: N/A
Twitter handle: N/A
2024-04-09 17:06:37 -04:00
Sholto Armstrong
230376f183 docs: Fix typo in citations example (#20218)
Small typo in the citations notebook "ojbects" changed to "objects"
2024-04-09 21:05:33 +00:00
Eugene Yurtsev
fe35e13083 langchain[patch]: Update unit test (#20228)
This unit test fails likely validation by the openai client.

Newer openai library seems to be doing more validation so the existing
test fails since http_client needs to be of httpx instance
2024-04-09 16:44:23 -04:00
Casper da Costa-Luis
b972f394c8 langchain[patch]: make BooleanOutputParser check words not substrings (#20064)
- **Description**: fixes BooleanOutputParser detecting sub-words ("NOW
this is likely (YES)" -> `True`, not `AmbiguousError`)
- **Issue(s)**: fixes #11408 (follow-up to #17810)
- **Dependencies**: None
- **GitHub handle**: @casperdcl

<!-- if unreviewd after a few days, @-mention one of baskaryan, efriis,
eyurtsev, hwchase17 -->

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-09 20:43:31 +00:00
seray
add31f46d0 community[patch]: OpenLLM Async Client Fixes and Timeout Parameter (#20007)
Same changes as this merged
[PR](https://github.com/langchain-ai/langchain/pull/17478)
(https://github.com/langchain-ai/langchain/pull/17478), but for the
async client, as the same issues persist.

- Replaced 'responses' attribute of OpenLLM's GenerationOutput schema to
'outputs'.
reference:
66de54eae7/openllm-core/src/openllm_core/_schemas.py (L135)

- Added timeout parameter for the async client.

---------

Co-authored-by: Seray Arslan <seray.arslan@knime.com>
2024-04-09 16:34:56 -04:00
Erick Friis
37a9e23c05 community: switch to falkordb python client (#20229) 2024-04-09 20:19:44 +00:00
Christophe Bornet
f43b48aebc core[minor]: Implement aformat_messages for _StringImageMessagePromptTemplate (#20036) 2024-04-09 15:59:39 -04:00
Christophe Bornet
19001e6cb9 core[minor]: Implement aformat for FewShotPromptWithTemplates (#20039) 2024-04-09 15:58:41 -04:00
Erick Friis
855ba46f80 standard-tests: a standard unit and integration test set (#20182)
just chat models for now
2024-04-09 12:43:00 -07:00
Erick Friis
9b5cae045c together: release 0.1.0 (#20225)
Resolved #20217
2024-04-09 12:23:52 -07:00
Eugene Yurtsev
7cfb643a1c langchain-postgres: Remove remaining README.md file (#20221)
Repository has moved to langchain-ai/langchain-postgres
2024-04-09 14:02:15 -04:00
Eugene Yurtsev
2fa7266ebb Remove postgres package (#20207)
Package moved
2024-04-09 13:51:17 -04:00
Simon Kelly
a682f0d12b openai[patch]: wrap stream code in context manager blocks (#18013)
**Description:**
Use the `Stream` context managers in `ChatOpenAi` `stream` and `astream`
method.

Using the context manager returned by the OpenAI client makes it
possible to terminate the stream early since the response connection
will be closed when the context manager exists.

**Issue:** #5340
**Twitter handle:** @snopoke

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-09 17:40:16 +00:00
Shotaro Sano
6c11c8dac6 docs: Add documentation of ElasticsearchStore.BM25RetrievalStrategy (#20098)
This pull request follows up on
https://github.com/langchain-ai/langchain/pull/19314 and
https://github.com/langchain-ai/langchain-elastic/pull/6, adding
documentation for the `ElasticsearchStore.BM25RetrievalStrategy`.

Like other retrieval strategies, we are now introducing
BM25RetrievalStrategy.

### Background
- The `BM25RetrievalStrategy` has been introduced to `langchain-elastic`
via the pull request
https://github.com/langchain-ai/langchain-elastic/pull/6.
- This PR was initially created in the main `langchain` repository but
was moved to `langchain-elastic` during the review process due to the
migration of the partner package.
- The original PR can be found at
https://github.com/langchain-ai/langchain/pull/19314.
- As
[commented](https://github.com/langchain-ai/langchain/pull/19314#issuecomment-2023202401)
by @joemcelroy, documenting the new retrieval strategy is part of the
requirements for its introduction.

Although the `BM25RetrievalStrategy` has been merged into
`langchain-elastic`, its documentation is still to be maintained in the
main `langchain` repository. Therefore, this pull request adds the
documentation portion of `BM25RetrievalStrategy`.

The content of the documentation remains the same as that included in
the original PR, https://github.com/langchain-ai/langchain/pull/19314.

---------

Co-authored-by: Max Jakob <max.jakob@elastic.co>
2024-04-09 12:37:15 -05:00
David Lee
0394c6e126 community[minor]: add allow_dangerous_requests for OpenAPI toolkits (#19493)
**OpenAPI allow_dangerous_requests**: community: add
allow_dangerous_requests for OpenAPI toolkits

**Description:** a description of the change

Due to BaseRequestsTool changes, we need to pass
allow_dangerous_requests manually.


b617085af0/libs/community/langchain_community/tools/requests/tool.py (L26-L46)

While OpenAPI toolkits didn't pass it in the arguments.


b617085af0/libs/community/langchain_community/agent_toolkits/openapi/planner.py (L262-L269)


**Issue:** the issue # it fixes, if applicable

https://github.com/langchain-ai/langchain/issues/19440

If not passing allow_dangerous_requests, it won't be able to do
requests.

**Dependencies:** any dependencies required for this change

Not much

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-09 17:14:02 +00:00
Guangdong Liu
301dc3dfd2 docs: Get rid of ZeroShotAgent and use create_react_agent instead (#20157)
- **Issue:** #20122
 -  @baskaryan, @eyurtsev.
2024-04-09 12:00:29 -05:00
Timothy
0c848a25ad community[patch]: GCSDirectoryLoader bugfix (#20005)
- **Description:** Bug fix. Removed extra line in `GCSDirectoryLoader`
to allow catching Exceptions. Now also logs the file path if Exception
is raised for easier debugging.
- **Issue:** #20198 Bug since langchain-community==0.0.31
- **Dependencies:** No change
- **Twitter handle:** timothywong731

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-09 16:57:00 +00:00
jeff kit
ac42e96e4c community[patch], langchain[minor]: Enhance Tencent Cloud VectorDB, langchain: make Tencent Cloud VectorDB self query retrieve compatible (#19651)
- make Tencent Cloud VectorDB support metadata filtering.
- implement delete function for Tencent Cloud VectorDB.
- support both Langchain Embedding model and Tencent Cloud VDB embedding
model.
- Tencent Cloud VectorDB support filter search keyword, compatible with
langchain filtering syntax.
- add Tencent Cloud VectorDB TranslationVisitor, now work with self
query retriever.
- more documentations.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-09 16:50:48 +00:00
Bagatur
1a34c65e01 community[patch]: pass through sql agent kwargs (#19962)
Fix #19961
2024-04-09 16:47:32 +00:00
Haris Ali
1b480914b4 docs: Fix the class links in openai_tools and openai_functions description in output parser documentations (#20197)
- **Description:** In this PR I fixed the links which points to the API
docs for classes in OpenAI functions and OpenAI tools section of output
parsers.
  - **Issue:** It fixed the issue #19969

Co-authored-by: Haris Ali <haris.ali@formulatrix.com>
2024-04-09 16:07:19 +00:00
Guangdong Liu
97d91ec17c community[patch]: standardize baichuan init args (#20209)
Related to https://github.com/langchain-ai/langchain/issues/20085

@baskaryan
2024-04-09 11:00:40 -05:00
Piyush Jain
cd7abc495a community[minor]: add neptune analytics graph (#20047)
Replacement for PR
[#19772](https://github.com/langchain-ai/langchain/pull/19772).

---------

Co-authored-by: Dave Bechberger <dbechbe@amazon.com>
Co-authored-by: bechbd <bechbd@users.noreply.github.com>
2024-04-09 09:20:59 -05:00
Shuqian
ad9750403b community[minor]: add bedrock anthropic callback for token usage counting (#19864)
**Description:** add bedrock anthropic callback for token usage
counting, consulted openai callback.

---------

Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
2024-04-09 09:18:48 -05:00
Prince Canuma
1f9f4d8742 community[minor]: Add support for MLX models (chat & llm) (#18152)
**Description:** This PR adds support for MLX models both chat (i.e.,
instruct) and llm (i.e., pretrained) types/
**Dependencies:** mlx, mlx_lm, transformers
**Twitter handle:** @Prince_Canuma

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-09 14:17:07 +00:00
aditya thomas
6baeaf4802 docs: TogetherAI as a drop-in replacement for OpenAI (#19900)
**Description:** TogetherAI as a drop-in replacement for OpenAI
**Issue:** None
**Dependencies:** None

@baskaryan apropos #20032
2024-04-09 09:12:52 -05:00
Leonid Ganeline
2f8dd1a161 community[patch]: cross_encoders flatten namespaces (#20183)
Issue `langchain_community.cross_encoders` didn't have flattening
namespace code in the __init__.py file.
Changes:
- added code to flattening namespaces (used #20050 as a template)
- added ut for a change
- added missed `test_imports` for `chat_loaders` and
`chat_message_histories` modules
2024-04-08 20:50:23 -04:00
Bagatur
1af7133828 docs: add vertexai to structured output (#20171) 2024-04-08 16:09:49 -05:00
kaijietti
a812839f0c community: add request_timeout and max_retries to ChatAnthropic (#19402)
This PR make `request_timeout` and `max_retries` configurable for
ChatAnthropic.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-08 21:04:17 +00:00
Richmond Alake
c769421aa4 cookbook: MongoDB Cookbook for Chat history and semantic cache (#19998)
Thank you for contributing to LangChain!

- [ ] **PR title**: "community: Add semantic caching and memory using
MongoDB"


- [ ] **PR message**: 
- **Description:** This PR introduces functionality for adding semantic
caching and chat message history using MongoDB in RAG applications. By
leveraging the MongoDBCache and MongoDBChatMessageHistory classes,
developers can now enhance their retrieval-augmented generation
applications with efficient semantic caching mechanisms and persistent
conversation histories, improving response times and consistency across
chat sessions.
    - **Issue:** N/A
- **Dependencies:** Requires `datasets`, `langchain`,
`langchain-mongodb`, `langchain-openai`, `pymongo`, and `pandas` for
implementation. MongoDB Atlas is used for database services, and the
OpenAI API for model access.
    - **Twitter handle:** @richmondalake

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-08 20:21:24 +00:00
Erick Friis
391e8f2050 pinecone[patch]: fix core min version (#20177) 2024-04-08 20:06:59 +00:00
Harry Jiang
1ee208541c langchain: fix pinecone upsert when async_req is set to False (#19793)
Issue: 
When async_req is the default value True, pinecone client return the
multiprocessing AsyncResult object.
When async_req is set to False, pinecone client return the result
directly. `[{'upserted_count': 1}]` . Calling get() method will throw an
error in this case.
2024-04-08 12:55:59 -07:00
Alex Sherstinsky
5f563e040a community: extend Predibase integration to support fine-tuned LLM adapters (#19979)
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Langchain-Predibase integration was failing, because
it was not current with the Predibase SDK; in addition, Predibase
integration tests were instantiating the Langchain Community `Predibase`
class with one required argument (`model`) missing. This change updates
the Predibase SDK usage and fixes the integration tests.
    - **Twitter handle:** `@alexsherstinsky`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-08 18:54:29 +00:00
Bagatur
a27d88f12a anthropic[patch]: standardize init args (#20161)
Related to #20085
2024-04-08 12:09:06 -05:00
Bagatur
3490d70238 mistralai[patch]: standardize model params (#20163)
Related to #20085
2024-04-08 11:48:38 -05:00
Bagatur
17182406f3 docs: standardize fireworks params (#20162)
Related to #20085
2024-04-08 10:57:56 -05:00
Bagatur
5ae0e687b3 docs: use standard openai params (#20160)
Part of #20085
2024-04-08 10:56:53 -05:00
david02871
e1a24d09c5 community: Add PHP language parser to document_loaders (#19850)
**Description:**
Added a PHP language parser to document_loaders
**Issue:** N/A
**Dependencies:** N/A
**Twitter handle:** N/A

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-08 11:30:28 -04:00
Marlene
2f03bc397e Community: Updating Azure Retriever and Docs to be Azure AI Search instead of Azure Cognitive Search (#19925)
Last year Microsoft [changed the
name](https://learn.microsoft.com/en-us/azure/search/search-what-is-azure-search)
of Azure Cognitive Search to Azure AI Search. This PR updates the
Langchain Azure Retriever API and it's associated docs to reflect this
change. It may be confusing for users to see the name Cognitive here and
AI in the Microsoft documentation which is why this is needed. I've also
added a more detailed example to the Azure retriever doc page.

There are more places that need a similar update but I'm breaking it up
so the PRs are not too big 😄 Fixing my errors from the previous PR.

Twitter: @marlene_zw

Two new tests added to test backward compatibility in
`libs/community/tests/integration_tests/retrievers/test_azure_cognitive_search.py`

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-08 11:12:41 -04:00
Rahul Triptahi
820b713086 community[minor]: Add support for Pebblo cloud_api_key in PebbloSafeLoader (#19855)
**Description**:
_PebbloSafeLoader_: Add support for pebblo's cloud api-key in
PebbloSafeLoader

- This Pull request enables PebbloSafeLoader to accept pebblo's cloud
api-key and send the semantic classification data to pebblo cloud.

**Documentation**: Updated 
**Unit test**: Added
**Issue**: NA
**Dependencies**: - None
**Twitter handle**: @rahul_tripathi2

Signed-off-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-04-08 11:10:04 -04:00
Eugene Yurtsev
34a24d4df6 postgres[minor]: Add pgvector community as is (#20096)
This moves langchain pgvector community as is

The only modification is support for psycopg3 rather than psycopg2!
2024-04-08 09:34:10 -04:00
Eugene Yurtsev
ba9e0d76c1 postgres[minor]: add postgres checkpoint implementation (#20025)
Adds checkpoint implementation using psycopg
2024-04-08 09:27:15 -04:00
William FH
039b7a472d [core] fix: manually specifying run_id for chat models.invoke() and .ainvoke() (#20082) 2024-04-06 16:57:32 -07:00
Chris Germann
ba602dc562 Documentation: Fixed the typo of Discord -> Telegram (#20008)
Description: Just fixed one string
Issues: None
Dependencies: None
Twitter handle: @epu9byj

Co-authored-by: gere <gere@kapo.zh.ch>
2024-04-06 20:00:03 +00:00
Erick Friis
96dc0ea49d pinecone[patch]: release 0.1.0 (#20109) 2024-04-06 18:41:28 +00:00
donbr
de496062b3 templates: migrate to langchain_anthropic package to support Claude 3 models (#19393)
- **Description:** update langchain anthropic templates to support
Claude 3 (iterative search, chain of note, summarization, and XML
response)
- **Issue:** issue # N/A. Stability issues and errors encountered when
trying to use older langchain and anthropic libraries.
- **Dependencies:**
  - langchain_anthropic version 0.1.4\
- anthropic package version in the range ">=0.17.0,<1" to support
langchain_anthropic.
- **Twitter handle:** @d_w_b7


- [ x]**Add tests and docs**: If you're adding a new integration, please
include
  1. used instructions in the README for testing

- [ x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-06 00:33:59 +00:00
Maxime Perrin
5ac0d1f67b partners[anthropic]: fix anthropic chat model message type lookup keys (#19034)
- **Description:** Fixing message formatting issue in ChatAnthropic
model by adding dictionary keys for `AIMessageChunk `and
`HumanMessageChunk`
  - **Issue:** #19025 
  - **Twitter handle:** @maximeperrin_

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-06 00:22:14 +00:00
Krista Pratico
d64bd32b20 templates: add rag azure search template (#18143)
- **Description:** Adds a template for performing RAG with the
AzureSearch vectorstore.
- **Issue:** N/A
- **Dependencies:** N/A
- **Twitter handle:** N/A

---------

Co-authored-by: Erick Friis <erickfriis@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-06 00:20:40 +00:00
Bagatur
46f580d42d docs: anthropic tool docstring (#20091) 2024-04-05 21:50:40 +00:00
Erick Friis
28dfde2cb2 cohere: move package to external repo (#20081) 2024-04-05 14:29:15 -07:00
Jacob Lee
58a2123ca0 docs[patch]: Add missing redirects (#20076) 2024-04-05 12:54:00 -07:00
Eugene Yurtsev
520ff50adc community[patch]: Improve import callbacks to make it IDE friendly (#20050)
* declares __all__ as a list of strings (instead of dynamically
computing it)
* import type definitions when TYPE_CHECKING is true
2024-04-05 15:17:51 -04:00
Guangdong Liu
5a76087965 langchain-core[minor]: Allow passing local cache to language models (#19331)
After this PR it will be possible to pass a cache instance directly to a
language model. This is useful to allow different language models to use
different caches if needed.

- **Issue:** close #19276

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-05 11:19:54 -04:00
Eugene Yurtsev
e4fc0e7502 core[patch]: Document BaseCache abstraction in code (#20046)
Document the base cache abstraction in the cache.
2024-04-05 10:56:57 -04:00
Christophe Bornet
4d8a6a27a3 core[minor]: Implement aformat_prompt and ainvoke in BasePromptTemplate (#20035) 2024-04-05 10:36:43 -04:00
Christophe Bornet
7e5c1905b1 core[minor]: Add async aformat_document method (#20037) 2024-04-05 10:29:53 -04:00
Christophe Bornet
927793d088 Merge pull request #20038
* Implement aformat_messages for ChatMessagePromptTemplate
2024-04-05 10:25:27 -04:00
Erick Friis
ebd24bb5d6 docs: fix title cap (#20048) 2024-04-05 02:36:33 +00:00
Eugene Yurtsev
1ee8cf7b20 Docs: Update custom chat model (#19967)
* Clean up in the existing tutorial
* Add model_name to identifying params
* Add table to summarize messages
2024-04-04 22:36:03 -04:00
Erick Friis
5fc7bb01e9 docs: weaviate docs (#20042) 2024-04-04 19:01:02 -07:00
Bagatur
38fb1429fe docs: fix together model tab (#20032) 2024-04-04 15:33:43 -07:00
Jacob Lee
b69af26717 docs[patch]: Fix Model I/O quickstart (#20031)
@baskaryan
2024-04-04 15:28:58 -07:00
Usama Ahmed
94ac42c573 docs: fixing typo in argument name (#20028)
it's "mode" instead of "model", I fixed it
2024-04-04 22:28:28 +00:00
Bagatur
07eeeb84f3 docs: hide experimental anthropic (#20030) 2024-04-04 15:27:52 -07:00
Lance Martin
e76b9210dd Update example cookbook for Anthropic tool use (#20029) 2024-04-04 14:53:18 -07:00
Leonid Ganeline
3856dedff4 docs: integrations/providers update 9 (#19941)
- Added missed providers
- Added links, descriptions in related examples
- Formatted in a consistent format

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-04 21:37:48 +00:00
Bagatur
644ff46100 docs: mark anthropic tools wrapper as deprecated (#20024) 2024-04-04 21:33:55 +00:00
Leonid Ganeline
69bf6262aa docs: integrations/providers/unstructured update (#19892)
Updated a page with existing document loaders with links to examples.
Fixed formatting of one example.

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-04 21:31:27 +00:00
Bagatur
1b7ed6071a anthropic[patch]: Release 0.1.6 (#20026) 2024-04-04 14:29:50 -07:00
Bagatur
6860450e48 anthropic[patch]: use anthropic 0.23 (#20022) 2024-04-04 14:23:53 -07:00
Leonid Ganeline
4c969286fe docs integrations/providers update 10 (#19970)
Fixed broken links. Formatted to get consistent forms. Added missed
imports in the example code
2024-04-04 14:22:45 -07:00
Leonid Ganeline
82f0198be2 docs: graphs update (#19675)
Issue: The `graph` code was moved into the `community` package a long
ago. But the related documentation is still in the
[use_cases](https://python.langchain.com/docs/use_cases/graph/integrations/diffbot_graphtransformer)
section and not in the `integrations`.
Changes:
- moved the `use_cases/graph/integrations` notebooks into the
`integrations/graphs`
- renamed files and changed titles to follow the consistent format
- redirected old page URLs to new URLs in `vercel.json` and in several
other pages
- added descriptions and links when necessary
- formatted into the consistent format
2024-04-04 14:13:22 -07:00
Bagatur
be3dd62de4 anthropic[patch]: fix experimental tests (#20021) 2024-04-04 13:37:43 -07:00
Lance Martin
a6926772f0 Add cookbook for Anthropic .with_structured_output() (#20017) 2024-04-04 13:30:44 -07:00
Bagatur
86fdb79454 anthropic[patch]: bump core dep (#20019)
]
2024-04-04 13:28:23 -07:00
Bagatur
209de0a561 anthropic[minor]: tool use (#20016) 2024-04-04 13:22:48 -07:00
Leonid Ganeline
3aacd11846 community[minor]: added missed class to __all__ (#19888)
Added missed `UnstructuredCHMLoader` class to the
document_loader.\_\_init\_\_.py \_\_all\_\_
2024-04-04 16:16:51 -04:00
Jacob Lee
7f0cb3bfba docs[patch]: Make Docusaurus and Vercel add trailing slashes when navigating by default (#20014)
Should hopefully avoid weird broken link edge cases.

Relative links now trip up the Docusaurus broken link checker, so this
PR also removes them.

Also snuck in a small addition about asyncio
2024-04-04 12:49:15 -07:00
Chris Papademetrious
a954dedb77 langchain[minor]: enhance LocalFileStore to allow directory/file permissions to be specified (#18857)
**Description:**
The `LocalFileStore` class can be used to create an on-disk
`CacheBackedEmbeddings` cache. However, the default `umask` settings
gives file/directory write permissions only to the original user. Once
the cache directory is created by the first user, other users cannot
write their own cache entries into the directory.

To make the cache usable by multiple users, this pull request updates
the `LocalFileStore` constructor to allow the permissions for newly
created directories and files to be specified. The specified permissions
override the default `umask` values.

For example, when configured as follows:

```python
file_store = LocalFileStore(temp_dir, chmod_dir=0o770, chmod_file=0o660)
```

then "user" and "group" (but not "other") have permissions to access the
store, which means:

* Anyone in our group could contribute embeddings to the cache.
* If we implement cache cleanup/eviction in the future, anyone in our
group could perform the cleanup.

The default values for the `chmod_dir` and `chmod_file` parameters is
`None`, which retains the original behavior of using the default `umask`
settings.

**Issue:**
Implements enhancement #18075.

**Testing:**
I updated the `LocalFileStore` unit tests to test the permissions.

---------

Signed-off-by: chrispy <chrispy@synopsys.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-04 16:40:16 +00:00
Tomaz Bratanic
df25829f33 community[minor]: Add metadata filtering support for neo4j vector (#20001) 2024-04-04 11:37:06 -04:00
Ben Mitchell
b52b78478f community[minor]: Implement Async OpenSearch afrom_texts & afrom_embeddings (#20009)
- **Description:** Adds async variants of afrom_texts and
afrom_embeddings into `OpenSearchVectorSearch`, which allows for
`afrom_documents` to be called.
- **Issue:** I implemented this because my use case involves an async
scraper generating documents as and when they're ready to be ingested by
Embedding/OpenSearch
- **Dependencies:** None that I'm aware

Co-authored-by: Ben Mitchell <b.mitchell@reply.com>
2024-04-04 15:36:14 +00:00
Christophe Bornet
02152d3909 [docs][minor]: Fix typo in Custom Document Loader doc (#20003) 2024-04-04 10:59:33 -04:00
Jan Nissen
31e3ecc728 core[minor]: support pydantic V2 for JSONOutputParser, allow for other sources of JSON schemas (#19716)
This PR supports using Pydantic v2 objects to generate the schema for
the JSONOutputParser (#19441). This also adds a `json_schema` parameter
to allow users to pass any JSON schema to validate with, not just
pydantic.
2024-04-04 10:57:47 -04:00
Christophe Bornet
f97de4e275 core[minor]: Add aformat to FewShotPromptTemplate (#19652) 2024-04-04 10:24:55 -04:00
Utkarsha Gupte
b27f81c51c core[patch]: mypy ignore fixes #17048 (#19931)
core/langchain_core/_api[Patch]: mypy ignore fixes #17048
Related to #17048

Applied mypy fixes to below two files:
libs/core/langchain_core/_api/deprecation.py
libs/core/langchain_core/_api/beta_decorator.py

Summary of Fixes:
**Issue 1**
class _deprecated_property(type(obj)): # type: ignore
error: Unsupported dynamic base class "type"  [misc]
Fix: 
1. Added an __init__ method to _deprecated_property to initialize the
fget, fset, fdel, and __doc__ attributes.
2. In the __get__, __set__, and __delete__ methods, we now use the
self.fget, self.fset, and self.fdel attributes to call the original
methods after emitting the warning.

3. The finalize function now creates an instance of _deprecated_property
with the fget, fset, fdel, and doc attributes from the original obj
property.



**Issue 2**



 def finalize(  # type: ignore
                wrapper: Callable[..., Any], new_doc: str
            ) -> T:


error: All conditional function variants must have identical
signatures



Fix:
Ensured that both definitions of the finalize function have the
same signature

Twitter Handle -
https://x.com/gupteutkarsha?s=11&t=uwHe4C3PPpGRvoO5Qpm1aA
2024-04-04 10:22:38 -04:00
harry-cohere
e103492eb8 cohere: Add citations to agent, flexibility to tool parsing, fix SDK issue (#19965)
**Description:** Citations are the main addition in this PR. We now emit
them from the multihop agent! Additionally the agent is now more
flexible with observations (`Any` is now accepted), and the Cohere SDK
version is bumped to fix an issue with the most recent version of
pydantic v1 (1.10.15)
2024-04-04 07:02:30 -07:00
Jacob Lee
605c3f23e1 docs: reorg and visual refresh (#19765)
- put use cases in main sidebar
- move modules to own sidebar, rename components
- cleanup lcel section
- cleanup guides
- update font, cell highlighting

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-04 00:58:36 -07:00
Erick Friis
51bdfe04e9 groq: handle streaming tool call case (#19978) 2024-04-03 15:22:59 -07:00
Erick Friis
5acb564d6f groq: fix core version (#19976) 2024-04-03 14:49:57 -07:00
Erick Friis
9e60159043 groq: release 0.1.0 (#19975) 2024-04-03 14:41:48 -07:00
Graden Rea
88cf8a2905 groq: Add tool calling support (#19971)
**Description:** Add with_structured_output to groq chat models
**Issue:** 
**Dependencies:** N/A
**Twitter handle:** N/A
2024-04-03 14:40:20 -07:00
Eugene Yurtsev
6f20f140ca cli[minor]: Add disable sockets in unit tests (#19877) 2024-04-03 17:17:50 -04:00
Eugene Yurtsev
ea276d6547 docs: Custom Document Loaders (#19935)
Add information that shows how to create custom document loaders
2024-04-03 15:34:01 -04:00
Erick Friis
83f62fdacf core: fix try_load_from_hub for older langchain versions load_chain (#19964) 2024-04-03 17:00:25 +00:00
Tomaz Bratanic
09a0ecd000 langchain[minor]: Tests update metadata filtering examples of documents (#19963)
Removing metadata properties that are dicts as some databases don't
support that, and those properties aren't used in tests anyhow..
2024-04-03 12:44:14 -04:00
happy-go-lucky
c6432abdbe community[patch]: Implement delete method and all async methods in opensearch_vector_search (#17321)
- **Description:** In order to use index and aindex in
libs/langchain/langchain/indexes/_api.py, I implemented delete method
and all async methods in opensearch_vector_search
- **Dependencies:** No changes
2024-04-03 09:40:49 -07:00
Cheng, Penghui
cc407e8a1b community[minor]: weight only quantization with intel-extension-for-transformers. (#14504)
Support weight only quantization with intel-extension-for-transformers.
[Intel® Extension for
Transformers](https://github.com/intel/intel-extension-for-transformers)
is an innovative toolkit to accelerate Transformer-based models on Intel
platforms, in particular effective on 4th Intel Xeon Scalable processor
[Sapphire
Rapids](https://www.intel.com/content/www/us/en/products/docs/processors/xeon-accelerated/4th-gen-xeon-scalable-processors.html)
(codenamed Sapphire Rapids). The toolkit provides the below key
features:

* Seamless user experience of model compressions on Transformer-based
models by extending [Hugging Face
transformers](https://github.com/huggingface/transformers) APIs and
leveraging [Intel® Neural
Compressor](https://github.com/intel/neural-compressor)
* Advanced software optimizations and unique compression-aware runtime.
* Optimized Transformer-based model packages.
*
[NeuralChat](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat),
a customizable chatbot framework to create your own chatbot within
minutes by leveraging a rich set of plugins and SOTA optimizations.
*
[Inference](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/llm/runtime/graph)
of Large Language Model (LLM) in pure C/C++ with weight-only
quantization kernels.
This PR is an integration of weight only quantization feature with
intel-extension-for-transformers.

Unit test is in
lib/langchain/tests/integration_tests/llm/test_weight_only_quantization.py
The notebook is in
docs/docs/integrations/llms/weight_only_quantization.ipynb.
The document is in
docs/docs/integrations/providers/weight_only_quantization.mdx.

---------

Signed-off-by: Cheng, Penghui <penghui.cheng@intel.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-03 16:21:34 +00:00
Eugene Yurtsev
d6d843ec24 langchain-postgres: Initial package with postgres chat history implementation (#19884)
- [x] Add in code examples for the chat message history class
- [ ] ~Add docs with notebook examples~ (can this be done later?)
- [x] Update README.md
2024-04-03 10:57:21 -04:00
Eugene Yurtsev
d293431e10 core[minor]: Add aload to document loader (#19936)
Add aload to document loader
2024-04-03 10:46:47 -04:00
Ángel Igareta
31a641a155 core: fix return of draw_mermaid_png and change to not save image by default (#19950)
- **Description:** Improvement for #19599: fixing missing return of
graph.draw_mermaid_png and improve it to make the saving of the rendered
image optional

Co-authored-by: Angel Igareta <angel.igareta@klarna.com>
2024-04-03 06:20:35 -07:00
Bagatur
4328c54aab core[patch]: Release 0.1.39 (#19940) 2024-04-03 00:25:56 +00:00
Nuno Campos
f4568fe0c6 core: BaseChatModel modify chat message before passing to run_manager (#19939)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-02 16:40:27 -07:00
aditya thomas
73ebe78249 docs: update cohere documentation (#19700)
**Description:** Update of Cohere documentation (main provider page)
**Issue:** After addition of the Cohere partner package, the
documentation was out of date
**Dependencies:** None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-02 18:16:48 -04:00
Leonid Kuligin
eb0521064e deprecating integrations moved to langchain_google_community (#19841)
Thank you for contributing to LangChain!

- [ ] **PR title**: "community: deprecating integrations moved to
langchain_google_community"

- [ ] **PR message**: deprecating integrations moved to
langchain_google_community

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-04-02 17:06:07 -04:00
Erick Friis
f0d5b59962 core[patch]: remove requests (#19891)
Removes required usage of `requests` from `langchain-core`, all of which
has been deprecated.

- removes Tracer V1 implementations
- removes old `try_load_from_hub` github-based hub implementations

Removal done in a way where imports will still succeed, and usage will
fail with a `RuntimeError`.
2024-04-02 20:28:10 +00:00
Erick Friis
d5a2ff58e9 pinecone[patch]: source tag (#19739) 2024-04-02 19:53:59 +00:00
Wang Guan
8638029a37 docs: mention caveats with CacheBackedEmbeddings.embed_query (#19926)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**:
- **Description:** mention not-caching methods in CacheBackedEmbeddings
  - **Issue:** n/a I almost created one until I read the code 
  - **Dependencies:** n/a
  - **Twitter handle:** `tarsylia`


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-02 19:19:29 +00:00
harry-cohere
beab9adffb cohere: Improve integration test stability, fix documents bug (#19929)
**Description**: Improves the stability of all Cohere partner package
integration tests. Fixes a bug with document parsing (both dicts and
Documents are handled).
2024-04-02 11:22:30 -07:00
harry-cohere
37fc1c525a cohere: simplify integration test (#19928)
**Description**: This PR simplifies an integration test within the
Cohere partner package:
 * It no longer relies on exact model answers
 * It no longer relies on a third party tool
2024-04-02 10:57:25 -07:00
billytrend-cohere
de6c0cf248 cohere, docs: update imports and installs to langchain_cohere (#19918)
cohere: update imports and installs to langchain_cohere

---------

Co-authored-by: Harry M <127103098+harry-cohere@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-02 09:47:58 -07:00
Erick Friis
146d1a6347 cohere[patch]: release 0.1.0rc2 (#19924) 2024-04-02 16:24:23 +00:00
harry-cohere
e2b83c87b1 cohere[patch]: Add multihop tool agent (#19919)
**Description**: Adds an agent that uses Cohere with multiple hops and
multiple tools.

This PR is a continuation of
https://github.com/langchain-ai/langchain/pull/19650 - which was
previously approved. Conceptually nothing has changed, but this PR has
extra fixes, documentation and testing.

---------

Co-authored-by: BeatrixCohere <128378696+BeatrixCohere@users.noreply.github.com>
Co-authored-by: Erick Friis <erickfriis@gmail.com>
2024-04-02 09:18:50 -07:00
Max Jakob
22dbcc9441 langchain[patch]: fix ElasticsearchStore reference for self query (#19907)
Initializing self query with an ElasticsearchStore from the partners
packages failed previously, see
https://github.com/langchain-ai/langchain/discussions/18976.
2024-04-02 08:39:12 -07:00
Bagatur
3218463f6a core[patch]: Release 0.1.38 (#19895) 2024-04-01 22:47:46 -07:00
Mohammad Mohtashim
9ae2df36fc Core[major]: Base Tracer to propagate raw output from tool for on_tool_end (#18932)
This PR completes work for PR #18798 to expose raw tool output in
on_tool_end.

Affected APIs:
* astream_log
* astream_events
* callbacks sent to langsmith via langsmith-sdk
* Any other code that relies on BaseTracer!

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-02 01:24:46 +00:00
Nuno Campos
2ae6dcdf01 core: Assign missing message ids in BaseChatModel (#19863)
- This ensures ids are stable across streamed chunks
- Multiple messages in batch call get separate ids
- Also fix ids being dropped when combining message chunks

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-02 01:18:36 +00:00
Peter Vandenabeele
e830a4e731 community[patch]: Add remove_comments option (default True): do not extract html comments (#13259)
- **Description:** add `remove_comments` option (default: True): do not
extract html _comments_,
  - **Issue:** None,
  - **Dependencies:** None,
  - **Tag maintainer:** @nfcampos ,
  - **Twitter handle:** peter_v

I ran `make format`, `make lint` and `make test`.

Discussion: I my use case, I prefer to not have the comments in the
extracted text:
* e.g. from a Google tag that is added in the html as comment
* e.g. content that the authors have temporarily hidden to make it non
visible to the regular reader

Removing the comments makes the extracted text more alike the intended
text to be seen by the reader.


**Choice to make:** do we prefer to make the default for this
`remove_comments` option to be True or False?
I have changed it to True in a second commit, since that is how I would
prefer to use it by default. Have the
cleaned text (without technical Google tags etc.) and also closer to the
actually visible and intended content.
I am not sure what is best aligned with the conventions of langchain in
general ...


INITIAL VERSION (new version above):
~**Choice to make:** do we prefer to make the default for this
`ignore_comments` option to be True or False?
I have set it to False now to be backwards compatible. On the other
hand, I would use it mostly with True.
I am not sure what is best aligned with the conventions of langchain in
general ...~

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-02 00:19:12 +00:00
Jamsheed Mistri
4f70bc119d community[minor]: add Layerup Security integration (#19787)
**Description:** adds integration with [Layerup
Security](https://uselayerup.com). Docs can be found
[here](https://docs.uselayerup.com). Integrates directly with our Python
SDK.

**Dependencies:**
[LayerupSecurity](https://pypi.org/project/LayerupSecurity/)

**Note**: all methods for our product require a paid API key, so I only
included 1 test which checks for an invalid API key response. I have
tested extensively locally.

**Twitter handle**: [@layerup_](https://twitter.com/layerup_)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-01 23:49:00 +00:00
Brace Sproul
22f78c37c8 docs[patch]: Hide google from function calling docs (#19887) 2024-04-01 14:26:31 -07:00
Massimiliano Pronesti
06dac394a6 cohere[patch]: support request timeout in BaseCohere (#19641)
As in #19346, this PR exposes `request_timeout` in `BaseCohere`, while
`max_retires` is no longer a parameter of the beneath client
(`cohere.Client`) and it is already configured in
`langchain_cohere.llms.Cohere`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-01 14:16:32 -07:00
Mayank Solanki
d5c412b0a9 core: Add docs for RunnableConfigurableFields (#19849)
- [x] **docs**: core: Add docs for `RunnableConfigurableFields`

- **Description:** Added incode docs for `RunnableConfigurableFields`
with example
    - **Issue:** #18803 
    - **Dependencies:** NA
    - **Twitter handle:** NA

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-01 20:40:10 +00:00
Mahdi Setayesh
c28efb878c text-splitters[minor]: Adding a new section aware splitter to langchain (#16526)
- **Description:** the layout of html pages can be variant based on the
bootstrap framework or the styles of the pages. So we need to have a
splitter to transform the html tags to a proper layout and then split
the html content based on the provided list of tags to determine its
html sections. We are using BS4 library along with xslt structure to
split the html content using an section aware approach.
  - **Dependencies:** No new dependencies
  - **Twitter handle:** @m_setayesh

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-01 20:32:26 +00:00
Eugene Yurtsev
356a139b0a cli[minor]: Add __version__ to integration package template (#19876)
Packages should export __version__
2024-04-01 15:34:38 -04:00
northern-64bit
dfbc10c943 docs: Fix link in Unstructured notebook (#19851)
**Description:** This PR fixes the link to the Unstructured
documentation in the docs.
2024-04-01 15:26:48 -04:00
Brace Sproul
7538c4de19 docs[patch]: Revert quarto update (#19880) 2024-04-01 12:11:27 -07:00
Anıl Berk Altuner
4384fa8e49 community[minor]: Add Dria retriever (#17098)
[Dria](https://dria.co/) is a hub of public RAG models for developers to
both contribute and utilize a shared embedding lake. This PR adds a
retriever that can retrieve documents from Dria.
2024-04-01 12:04:19 -07:00
Erick Friis
0b0a55192f robocorp[patch]: fix core min version (#19879) 2024-04-01 11:34:14 -07:00
Mikko Korpela
3f06cef60c robocorp[patch]: Fix nested arguments descriptors and tool names (#19707)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**:
- **Description:** Fix argument translation from OpenAPI spec to OpenAI
function call (and similar)
- **Issue:** OpenGPTs failures with calling Action Server based actions.
    - **Dependencies:** None
    - **Twitter handle:** mikkorpela


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
~2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.~


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-04-01 11:29:39 -07:00
Ethan Yang
48f84e253e community[minor]: Add OpenVINO rerank model support (#19791)
@eaidova @AlexKoff88 Could you help to review, thanks

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-04-01 18:27:23 +00:00
Erick Friis
4fbdc2a7ee openai[patch]: remove openai chunk size validation (#19878) 2024-04-01 18:26:06 +00:00
Chenhui Zhang
a1f3e9f537 community[minor]: Update ChatZhipuAI to support GLM-4 model (#16695)
Description: Update `ChatZhipuAI` to support the latest `glm-4` model.
Issue: N/A
Dependencies: httpx, httpx-sse, PyJWT

The previous `ChatZhipuAI` implementation requires the `zhipuai`
package, and cannot call the latest GLM model. This is because
- The old version `zhipuai==1.*` doesn't support the latest model.
- `zhipuai==2.*` requires `pydantic V2`, which is incompatible with
'langchain-community'.

This re-implementation invokes the GLM model by sending HTTP requests to
[open.bigmodel.cn](https://open.bigmodel.cn/dev/api) via the `httpx`
package, and uses the `httpx-sse` package to handle stream events.

---------

Co-authored-by: zR <2448370773@qq.com>
2024-04-01 18:11:21 +00:00
Bagatur
d25b5b6f25 community[patch]: Release 0.0.31 (#19873) 2024-04-01 10:50:22 -07:00
Erick Friis
e3ed6a7c28 ai21[patch]: fix core dep (#19874) 2024-04-01 10:48:16 -07:00
Nuno Campos
aa5797d908 openai[patch]: Partially Revert Update openai chat model to new base class interface (#19871)
Partially Reverts langchain-ai/langchain#19729

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-04-01 10:31:06 -07:00
Erick Friis
be92cf57ca openai[patch]: fix azure embedding length check (#19870) 2024-04-01 10:26:15 -07:00
Bagatur
d62e84c4f5 community[patch]: Revert " Fix the bug that Chroma does not specify `e… (#19866)
…mbedding_function` (#19277)"

This reverts commit 7042934b5f.

Fixes #19848
2024-04-01 10:10:44 -07:00
Jacob Lee
f06229bbf1 👥 Update LangChain people data (#19858)
👥 Update LangChain people data

Co-authored-by: github-actions <github-actions@github.com>
2024-04-01 09:57:31 -07:00
Erick Friis
7376e4dbe9 ai21[patch]: release 0.1.3 (#19867) 2024-04-01 09:56:23 -07:00
Ángel Igareta
c2ccf22dfd core: generate mermaid syntax and render visual graph (#19599)
- **Description:** Add functionality to generate Mermaid syntax and
render flowcharts from graph data. This includes support for custom node
colors and edge curve styles, as well as the ability to export the
generated graphs to PNG images using either the Mermaid.INK API or
Pyppeteer for local rendering.
- **Dependencies:** Optional dependencies are `pyppeteer` if rendering
wants to be done using Pypeteer and Javascript code.

---------

Co-authored-by: Angel Igareta <angel.igareta@klarna.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-04-01 08:14:46 -07:00
Ikko Eltociear Ashimine
8711a05a51 Update cross_encoder_reranker.ipynb (#19846)
HuggingFace -> Hugging Face
2024-04-01 10:49:54 -04:00
Vardhaman
039f314f20 docs: remove unnecessary args from the pip install (#19823)
**Description:** An additional `U` argument was added for the
instructions to install the pip packages for the MediaWiki Dump Document
loader which was leading to error in installing the package. Removing
the argument fixed the command to install.

**Issue:** #19820 
**Dependencies:** No dependency change requierd
**Twitter handle:** [@vardhaman722](https://twitter.com/vardhaman722)
2024-04-01 10:47:26 -04:00
Bagatur
003c98e5b4 experimental[patch]: Release 0.0.56 (#19840) 2024-03-31 22:00:59 -07:00
Bagatur
c4eb841c37 langchain[patch]: Release 0.1.14 (#19839) 2024-03-31 21:44:01 -07:00
Bagatur
0242bce38c community[patch]: Release 0.0.30 (#19838) 2024-03-31 21:26:30 -07:00
Bagatur
08c10bd66a core[patch]: Release 0.1.37 (#19831) 2024-03-31 14:50:39 -07:00
Giannis
8cf1d75d08 cohere[patch]: Fix retriever (#19771)
* Replace `source_documents` with `documents`
* Pass `documents` as a named arg vs keyword
* Make `parsed_docs` more robust
* Fix edge case of doc page_content being `None`
2024-03-31 14:47:03 -07:00
Guangdong Liu
b6ebddbacc langchain[patch]: Upgrade openai's sdk and solve some interface adaptation problems. #19548 (#19785)
- #19548
- @baskaryan @eyurtsev PTAL

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-31 21:35:38 +00:00
Yash Mathur
c42ec58578 together[minor]: Update endpoint to non deprecated version (#19649)
- **Updating Together.ai Endpoint**: "langchain_together: Updated
Deprecated endpoint for partner package"

- Description: The inference API of together is deprecates, do replaced
with completions and made corresponding changes.
- Twitter handle: @dev_yashmathur

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-31 21:21:46 +00:00
hsuyuming
5ab6b39098 community[patch]: add attribution_token within GoogleVertexAISearchRetriever (#18520)
- **Description:** Add attribution_token within
GoogleVertexAISearchRetriever so user can provide this information to
Google support team or product team during debug session.
    
Reference:
https://cloud.google.com/generative-ai-app-builder/docs/view-analytics#user-events

Attribution tokens. Attribution tokens are unique IDs generated by
Vertex AI Search and returned with each search request. Make sure to
include that attribution token as UserEvent.attributionToken with any
user events resulting from a search. This is needed to identify if a
search is served by the API. Only user events with a Google-generated
attribution token are used to compute metrics.
    
    - **Issue:** No
    - **Dependencies:** No
    - **Twitter handle:** abehsu1992626
---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-31 13:54:56 -07:00
Kenneth Choe
f98d7f7494 langchain[minor], community[minor]: add CrossEncoderReranker with HuggingFaceCrossEncoder and SagemakerEndpointCrossEncoder (#13687)
- **Description:** Support reranking based on cross encoder models
available from HuggingFace.
      - Added `CrossEncoder` schema
- Implemented `HuggingFaceCrossEncoder` and
`SagemakerEndpointCrossEncoder`
- Implemented `CrossEncoderReranker` that performs similar functionality
to `CohereRerank`
- Added `cross-encoder-reranker.ipynb` to demonstrate how to use it.
Please let me know if anything else needs to be done to make it visible
on the table-of-contents navigation bar on the left, or on the card list
on [retrievers documentation
page](https://python.langchain.com/docs/integrations/retrievers).
  - **Issue:** N/A
  - **Dependencies:** None other than the existing ones.

---------

Co-authored-by: Kenny Choe <kchoe@amazon.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-31 20:51:31 +00:00
cxumol
3f7da03dd8 docs: fix a dead link (#19814)
**Description**

Google Colab returned 404 when trying to click an "Open In Colab" button
from document. This PR corrected the link.
2024-03-31 10:28:51 -04:00
aditya thomas
b8271bbc4a docs: (minor) updates to voyage ai documentation (#19819)
**Description:** Updates to Voyage AI documentation
**Issue:** Not Applicable
**Dependencies:** None
2024-03-31 10:27:19 -04:00
Tomaz Bratanic
ed49cca191 templates: Update neo4j templates (#19789) 2024-03-30 14:40:05 +00:00
aditya thomas
765d6762bc docs[minor]: include tab info for togetherai (#19796)
**Description:** Included information for the TogetherAI tab
**Issue:** The tab for TogetherAI information was not correct
**Dependencies:** None
2024-03-30 09:23:45 -04:00
LunarECL
b7d180a70d experimental[minor]: Create Closed Captioning Chain for .mp4 videos (#14059)
Description: Video imagery to text (Closed Captioning)
This pull request introduces the VideoCaptioningChain, a tool for
automated video captioning. It processes audio and video to generate
subtitles and closed captions, merging them into a single SRT output.

Issue: https://github.com/langchain-ai/langchain/issues/11770
Dependencies: opencv-python, ffmpeg-python, assemblyai, transformers,
pillow, torch, openai
Tag maintainer:
@baskaryan
@hwchase17


Hello!

We are a group of students from the University of Toronto
(@LunarECL, @TomSadan, @nicoledroi1, @A2113S) that want to make a
contribution to the LangChain community! We have ran make format, make
lint and make test locally before submitting the PR. To our knowledge,
our changes do not introduce any new errors.

Thank you for taking the time to review our PR!

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-30 01:57:53 +00:00
Harrison Chase
56525f2ac1 dont mutate metadata/tags (#19742) 2024-03-29 17:55:27 -07:00
Kamal Zhang
368e35c3b1 community[patch]: introduce convert_to_secret() to bananadev llm (#14283)
- **Description:** Per #12165, this PR add to BananaLLM the function
convert_to_secret_str() during environment variable validation.
- **Issue:** #12165
- **Tag maintainer:** @eyurtsev
- **Twitter handle:** @treewatcha75751

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-30 00:52:25 +00:00
DrKroll
c4da8d0813 langchain[patch]: load ReadFileTool (#14301)
---------

Co-authored-by: Dr. Simon Kroll <krolls@fida.de>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-30 00:46:24 +00:00
anshaneel
0884e5de7f community[minor]: Add Alpha Vantage API Tool (#14332)
### Description
This implementation adds functionality from the AlphaVantage API,
renowned for its comprehensive financial data. The class encapsulates
various methods, each dedicated to fetching specific types of financial
information from the API.

### Implemented Functions

- **`search_symbols`**: 
- Searches the AlphaVantage API for financial symbols using the provided
keywords.

- **`_get_market_news_sentiment`**: 
- Retrieves market news sentiment for a specified stock symbol from the
AlphaVantage API.

- **`_get_time_series_daily`**: 
- Fetches daily time series data for a specific symbol from the
AlphaVantage API.

- **`_get_quote_endpoint`**: 
- Obtains the latest price and volume information for a given symbol
from the AlphaVantage API.

- **`_get_time_series_weekly`**: 
- Gathers weekly time series data for a particular symbol from the
AlphaVantage API.

- **`_get_top_gainers_losers`**: 
- Provides details on top gainers, losers, and most actively traded
tickers in the US market from the AlphaVantage API.

  ### Issue: 
  - #11994 
  
### Dependencies: 
  - 'requests' library for HTTP requests. (import requests)
  - 'pytest' library for testing. (import pytest)

---------

Co-authored-by: Adam Badar <94140103+adam-badar@users.noreply.github.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-30 00:44:01 +00:00
Alex Sherstinsky
a9bc212bf2 community[minor]: fix failing Predibase integration (#19776)
- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Langchain-Predibase integration was failing, because
it was not current with the Predibase SDK; in addition, Predibase
integration tests were instantiating the Langchain Community `Predibase`
class with one required argument (`model`) missing. This change updates
the Predibase SDK usage and fixes the integration tests.
    - **Twitter handle:** `@alexsherstinsky`


---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-30 00:38:13 +00:00
ethynic
e9caa22d47 community[patch]: Update minimax.py (#14384)
MiniMaxChat class _generate method shoud return a ChatResult object not
str

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 23:57:06 +00:00
Ahmed Moubtahij
f5d4ce840f langchain[patch]: Simplify ensemble retriever (#14427)
- **Description:** code simplification to improve readability and remove
unnecessary memory allocations.
  - **Tag maintainer**: @baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 16:49:49 -07:00
Snehil Kumar
b36f4147b0 docs: Google Drive Loader always set the env var (#14791)
- **Description:** Code written by following, the official documentation
of [Google Drive
Loader](https://python.langchain.com/docs/integrations/document_loaders/google_drive),
gives errors. I have opened an issue regarding this. See #14725. This is
a pull request for modifying the documentation to use an approach that
makes the code work. Basically, the change is that we need to always set
the GOOGLE_APPLICATION_CREDENTIALS env var to an emtpy string, rather
than only in case of RefreshError. Also, rewrote 2 paragraphs to make
the instructions more clear.
- **Issue:** See this related [issue #
14725](https://github.com/langchain-ai/langchain/issues/14725)
  - **Dependencies:** NA
  - **Tag maintainer:** @baskaryan
  - **Twitter handle:** NA

Co-authored-by: Snehil <snehil@example.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 23:19:37 +00:00
M.Abdulrahman Alnaseer
ba54f1577f community[minor]: add support for llmsherpa (#19741)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: added support for llmsherpa library"

- [x] **Add tests and docs**: 
1. Integration test:
'docs/docs/integrations/document_loaders/test_llmsherpa.py'.
2. an example notebook:
`docs/docs/integrations/document_loaders/llmsherpa.ipynb`.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 16:04:57 -07:00
Naveenkhasyap
a99bd098ac docs: fix for #16702 and #16703 (#16705)
- **Description:** Quickstart Documentation updates for missing
dependency installation steps.
- **Issue:** the issue # it prompts users to install required
dependency.
  - **Dependencies:** no,
  - **Twitter handle:** @naveenkashyap_

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 15:57:51 -07:00
Brace Sproul
6d93a03bef docs[patch]: Fix or remove broken mdx links (#19777)
this pr also drops the community added action for checking broken links
in mdx. It does not work well for our use case, throwing errors for
local paths, plus the rest of the errors our in house solution had.
2024-03-29 15:25:08 -07:00
Bagatur
2f5606a318 mistralai[patch]: correct integration_test (#19774) 2024-03-29 21:47:35 +00:00
Pierre Véron
ace7b66261 mistralai[patch]: add missing _combine_llm_outputs implementation in ChatMistralAI (#18603)
# Description
Implementing `_combine_llm_outputs` to `ChatMistralAI` to override the
default implementation in `BaseChatModel` returning `{}`. The
implementation is inspired by the one in `ChatOpenAI` from package
`langchain-openai`.
# Issue
None
# Dependencies
None
# Twitter handle
None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 14:43:20 -07:00
lvliang-intel
0175906437 templates: add RAG template for Intel Xeon Scalable Processors (#18424)
**Description:**
This template utilizes Chroma and TGI (Text Generation Inference) to
execute RAG on the Intel Xeon Scalable Processors. It serves as a
demonstration for users, illustrating the deployment of the RAG service
on the Intel Xeon Scalable Processors and showcasing the resulting
performance enhancements.

**Issue:**
None

**Dependencies:**
The template contains the poetry project requirements to run this
template.
CPU TGI batching is WIP.

**Twitter handle:**
None

---------

Signed-off-by: lvliang-intel <liang1.lv@intel.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 14:37:32 -07:00
Nuno Campos
d4673a3507 openai[patch]: Update openai chat model to new base class interface (#19729) 2024-03-29 14:30:28 -07:00
harry-cohere
23fcc14650 cohere[patch]: support kwargs in with_structured_output (#19736)
**Description:** We'd like to support passing additional kwargs in
`with_structured_output`. I believe this is the accepted approach to
enable additional arguments on API calls.
2024-03-29 14:30:14 -07:00
Brace Sproul
ce0a588ae6 docs[minor]: Add chat model tabs to docs pages (#19589) 2024-03-29 14:23:55 -07:00
BeatrixCohere
bd02b83acd cohere[patch]: Allow overriding of the base URL in Cohere Client (#19766)
This PR adds the ability for a user to override the base API url for the
Cohere client for embeddings and chat llm.
2024-03-29 14:22:30 -07:00
Nisarg Trivedi
1252ccce6f text-splitters[minor]: Added Haskell support in langchain.text_splitter module (#16191)
- **Description:** Haskell language support added in text_splitter
module
  - **Dependencies:** No
  - **Twitter handle:** @nisargtr

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 20:17:50 +00:00
Hrvoje Milković
b7344e3347 community[minor]: Infobip tool integration (#16805)
**Description:** Adding Tool that wraps Infobip API for sending sms or
emails and email validation.
**Dependencies:** None,
**Twitter handle:** @hmilkovic

Implementation:
```
libs/community/langchain_community/utilities/infobip.py
```

Integration tests:
```
libs/community/tests/integration_tests/utilities/test_infobip.py
```

Example notebook:
```
docs/docs/integrations/tools/infobip.ipynb
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 19:01:27 +00:00
Luka Krapic
727a2ea9f1 community[patch]: history size support for DynamoDBChatMessageHistory (#16794)
**Description:** PR adds support for limiting number of messages
preserved in a session history for DynamoDBChatMessageHistory

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 18:56:21 +00:00
Dt22
6dbf1a2de0 community[patch]: fix redis input type for index_schema field (#16874)
### Subject: Fix Type Misdeclaration for index_schema in redis/base.py

I noticed a type misdeclaration for the index_schema column in the
redis/base.py file.

When following the instructions outlined in [Redis Custom Metadata
Indexing](https://python.langchain.com/docs/integrations/vectorstores/redis)
to create our own index_schema, it leads to a Pylance type error. <br/>
**The error message indicates that Dict[str, list[Dict[str, str]]] is
incompatible with the type Optional[Union[Dict[str, str], str,
os.PathLike]].**

```
index_schema = {
    "tag": [{"name": "credit_score"}],
    "text": [{"name": "user"}, {"name": "job"}],
    "numeric": [{"name": "age"}],
}

rds, keys = Redis.from_texts_return_keys(
    texts,
    embeddings,
    metadatas=metadata,
    redis_url="redis://localhost:6379",
    index_name="users_modified",
    index_schema=index_schema,  
)
```
Therefore, I have created this pull request to rectify the type
declaration problem.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 18:55:54 +00:00
morgana
074ad5095f community[patch]: mmr search for Rockset vectorstore integration (#16908)
- **Description:** Adding support for mmr search in the Rockset
vectorstore integration.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** `@_morgan_adams_`

---------

Co-authored-by: Rockset API Bot <admin@rockset.io>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 18:45:22 +00:00
shahrin014
f51e6a35ba community[patch]: OllamaEmbeddings - Pass headers to post request (#16880)
## Feature
- Set additional headers in constructor
- Headers will be sent in post request

This feature is useful if deploying Ollama on a cloud service such as
hugging face, which requires authentication tokens to be passed in the
request header.

## Tests
- Test if header is passed
- Test if header is not passed

Similar to https://github.com/langchain-ai/langchain/pull/15881

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 18:44:52 +00:00
Lance Martin
e0f137dbe0 docs: Agentic and Self-RAG w/ LangGraph (#16910)
To do:
[ ] Add streaming
[ ] Move to LangGraph
2024-03-29 11:11:35 -07:00
Jan Chorowski
b8b42ccbc5 community[minor]: Pathway vectorstore(#14859)
- **Description:** Integration with pathway.com data processing pipeline
acting as an always updated vectorstore
  - **Issue:** not applicable
- **Dependencies:** optional dependency on
[`pathway`](https://pypi.org/project/pathway/)
  - **Twitter handle:** pathway_com

The PR provides and integration with `pathway` to provide an easy to use
always updated vector store:

```python
import pathway as pw
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import PathwayVectorClient, PathwayVectorServer

data_sources = []
data_sources.append(
    pw.io.gdrive.read(object_id="17H4YpBOAKQzEJ93xmC2z170l0bP2npMy", service_user_credentials_file="credentials.json", with_metadata=True))

text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
embeddings_model = OpenAIEmbeddings(openai_api_key=os.environ["OPENAI_API_KEY"])
vector_server = PathwayVectorServer(
    *data_sources,
    embedder=embeddings_model,
    splitter=text_splitter,
)
vector_server.run_server(host="127.0.0.1", port="8765", threaded=True, with_cache=False)
client = PathwayVectorClient(
    host="127.0.0.1",
    port="8765",
)
query = "What is Pathway?"
docs = client.similarity_search(query)
```

The `PathwayVectorServer` builds a data processing pipeline which
continusly scans documents in a given source connector (google drive,
s3, ...) and builds a vector store. The `PathwayVectorClient` implements
LangChain's `VectorStore` interface and connects to the server to
retrieve documents.

---------

Co-authored-by: Mateusz Lewandowski <lewymati@users.noreply.github.com>
Co-authored-by: mlewandowski <mlewandowski@MacBook-Pro-mlewandowski.local>
Co-authored-by: Berke <berkecanrizai1@gmail.com>
Co-authored-by: Adrian Kosowski <adrian@pathway.com>
Co-authored-by: mlewandowski <mlewandowski@macbook-pro-mlewandowski.home>
Co-authored-by: berkecanrizai <63911408+berkecanrizai@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: mlewandowski <mlewandowski@MBPmlewandowski.ht.home>
Co-authored-by: Szymon Dudycz <szymond@pathway.com>
Co-authored-by: Szymon Dudycz <szymon.dudycz@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 10:50:39 -07:00
ccurme
0dbd5f5012 add script to check imports (#19611) 2024-03-29 13:30:20 -04:00
Arturs Konfino
2319212d54 community[patch]: avoid executing toolkit.get_context() when not necessary (#19762)
If `prompt` is passed into `create_sql_agent()`, then
`toolkit.get_context()` shouldn't be executed against the database
unless relevant prompt variables (`table_info` or `table_names`) are
present .
2024-03-29 16:42:21 +00:00
高璟琦
ec7a59c96c community[minor]: Add solar embedding (#19761)
Solar is a large language model developed by
[Upstage](https://upstage.ai/). It's a powerful and purpose-trained LLM.
You can visit the embedding service provided by Solar within this pr.

You may get **SOLAR_API_KEY** from
https://console.upstage.ai/services/embedding
You can refer to more details about accepted llm integration at
https://python.langchain.com/docs/integrations/llms/solar.
2024-03-29 09:36:05 -07:00
Tomaz Bratanic
dec00d3050 community[patch]: Add the ability to pass maps to neo4j retrieval query (#19758)
Makes it easier to flatten complex values to text, so you don't have to
use a lot of Cypher to do it.
2024-03-29 08:33:48 -07:00
Robby
f7e8a382cc community[minor]: add hugging face text-to-speech inference API (#18880)
Description: I implemented a tool to use Hugging Face text-to-speech
inference API.

Issue: n/a

Dependencies: n/a

Twitter handle: No Twitter, but do have
[LinkedIn](https://www.linkedin.com/in/robby-horvath/) lol.

---------

Co-authored-by: Robby <h0rv@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-29 15:02:29 +00:00
DasDingoCodes
73eb3f8fd9 community[minor]: Implement DirectoryLoader lazy_load function (#19537)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: Implement DirectoryLoader lazy_load
function"

- [x] **Description**: The `lazy_load` function of the `DirectoryLoader`
yields each document separately. If the given `loader_cls` of the
`DirectoryLoader` also implemented `lazy_load`, it will be used to yield
subdocuments of the file.

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access:
`libs/community/tests/unit_tests/document_loaders/test_directory_loader.py`
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory:
`docs/docs/integrations/document_loaders/directory.ipynb`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-29 14:46:52 +00:00
Christophe Bornet
6b2b511f68 core[minor]: Add aformat_messages to FewShotChatMessagePromptTemplate and ChatPromptTemplate (#19648)
Needed since the example selector may use a vector store.
2024-03-29 10:31:32 -04:00
Leonid Ganeline
5f814820f6 docs: providers pinecone fix (#19737)
Current providers page use link to the old package.
- Fixed installation instructions
- Added a reference to the Pinecone retriever
2024-03-29 08:30:30 -04:00
Bob Lin
53a74ad12b docs: use markdown cell instead of code block (#19740)
I found that the code of async and async batch was divided into two
blocks:

<img width="823" alt="Screenshot 2024-03-29 at 7 45 59 AM"
src="https://github.com/langchain-ai/langchain/assets/10000925/0fa59d29-a692-4309-afb8-2260f03242ec">


so I changed it to unified.
2024-03-29 08:27:48 -04:00
Ekaterina Aidova
4ce36af335 docs: fix link in openvino integration doc (#19749)
- **Description:** fix incorrect link in docs
 - **Dependencies:** None
2024-03-29 12:24:07 +00:00
Jialei
f7c903e24a community[minor]: add support for Moonshot llm and chat model (#17100) 2024-03-29 08:54:23 +00:00
Gustavo Isturiz
824dccf5e2 docs: fixed xml URL on sitemap docs exmaple, issue #17236 (#17304) 2024-03-29 01:36:54 -07:00
Ethan Yang
7164015135 community[minor]: Add Openvino embedding support (#19632)
This PR is used to support both HF and BGE embeddings with openvino

---------

Co-authored-by: Alexander Kozlov <alexander.kozlov@intel.com>
2024-03-29 01:34:51 -07:00
Guangdong Liu
cd55d587c2 langchain[patch]: Upgrade openai's sdk and solve some interface adaptation problems. (#19548)
- **Issue:** close #19534
2024-03-29 01:25:17 -07:00
Kirushikesh DB
12861273e1 experimental[patch]: Removed 'SQLResults:' from the LLMResponse in SQLDatabaseChain (#17104)
**Description:** 
When using the SQLDatabaseChain with Llama2-70b LLM and, SQLite
database. I was getting `Warning: You can only execute one statement at
a time.`.

```
from langchain.sql_database import SQLDatabase
from langchain_experimental.sql import SQLDatabaseChain

sql_database_path = '/dccstor/mmdataretrieval/mm_dataset/swimming_record/rag_data/swimmingdataset.db'
sql_db = get_database(sql_database_path)
db_chain = SQLDatabaseChain.from_llm(mistral, sql_db, verbose=True, callbacks = [callback_obj])
db_chain.invoke({
    "query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
})
```
Error:
```
Warning                                   Traceback (most recent call last)
Cell In[31], line 3
      1 import langchain
      2 langchain.debug=False
----> 3 db_chain.invoke({
      4     "query": "What is the best time of Lance Larson in men's 100 meter butterfly competition?"
      5 })

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:162, in Chain.invoke(self, input, config, **kwargs)
    160 except BaseException as e:
    161     run_manager.on_chain_error(e)
--> 162     raise e
    163 run_manager.on_chain_end(outputs)
    164 final_outputs: Dict[str, Any] = self.prep_outputs(
    165     inputs, outputs, return_only_outputs
    166 )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain/chains/base.py:156, in Chain.invoke(self, input, config, **kwargs)
    149 run_manager = callback_manager.on_chain_start(
    150     dumpd(self),
    151     inputs,
    152     name=run_name,
    153 )
    154 try:
    155     outputs = (
--> 156         self._call(inputs, run_manager=run_manager)
    157         if new_arg_supported
    158         else self._call(inputs)
    159     )
    160 except BaseException as e:
    161     run_manager.on_chain_error(e)

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:198, in SQLDatabaseChain._call(self, inputs, run_manager)
    194 except Exception as exc:
    195     # Append intermediate steps to exception, to aid in logging and later
    196     # improvement of few shot prompt seeds
    197     exc.intermediate_steps = intermediate_steps  # type: ignore
--> 198     raise exc

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_experimental/sql/base.py:143, in SQLDatabaseChain._call(self, inputs, run_manager)
    139     intermediate_steps.append(
    140         sql_cmd
    141     )  # output: sql generation (no checker)
    142     intermediate_steps.append({"sql_cmd": sql_cmd})  # input: sql exec
--> 143     result = self.database.run(sql_cmd)
    144     intermediate_steps.append(str(result))  # output: sql exec
    145 else:

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:436, in SQLDatabase.run(self, command, fetch, include_columns)
    425 def run(
    426     self,
    427     command: str,
    428     fetch: Literal["all", "one"] = "all",
    429     include_columns: bool = False,
    430 ) -> str:
    431     """Execute a SQL command and return a string representing the results.
    432 
    433     If the statement returns rows, a string of the results is returned.
    434     If the statement returns no rows, an empty string is returned.
    435     """
--> 436     result = self._execute(command, fetch)
    438     res = [
    439         {
    440             column: truncate_word(value, length=self._max_string_length)
   (...)
    443         for r in result
    444     ]
    446     if not include_columns:

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/langchain_community/utilities/sql_database.py:413, in SQLDatabase._execute(self, command, fetch)
    410     elif self.dialect == "postgresql":  # postgresql
    411         connection.exec_driver_sql("SET search_path TO %s", (self._schema,))
--> 413 cursor = connection.execute(text(command))
    414 if cursor.returns_rows:
    415     if fetch == "all":

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1416, in Connection.execute(self, statement, parameters, execution_options)
   1414     raise exc.ObjectNotExecutableError(statement) from err
   1415 else:
-> 1416     return meth(
   1417         self,
   1418         distilled_parameters,
   1419         execution_options or NO_OPTIONS,
   1420     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/sql/elements.py:516, in ClauseElement._execute_on_connection(self, connection, distilled_params, execution_options)
    514     if TYPE_CHECKING:
    515         assert isinstance(self, Executable)
--> 516     return connection._execute_clauseelement(
    517         self, distilled_params, execution_options
    518     )
    519 else:
    520     raise exc.ObjectNotExecutableError(self)

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1639, in Connection._execute_clauseelement(self, elem, distilled_parameters, execution_options)
   1627 compiled_cache: Optional[CompiledCacheType] = execution_options.get(
   1628     "compiled_cache", self.engine._compiled_cache
   1629 )
   1631 compiled_sql, extracted_params, cache_hit = elem._compile_w_cache(
   1632     dialect=dialect,
   1633     compiled_cache=compiled_cache,
   (...)
   1637     linting=self.dialect.compiler_linting | compiler.WARN_LINTING,
   1638 )
-> 1639 ret = self._execute_context(
   1640     dialect,
   1641     dialect.execution_ctx_cls._init_compiled,
   1642     compiled_sql,
   1643     distilled_parameters,
   1644     execution_options,
   1645     compiled_sql,
   1646     distilled_parameters,
   1647     elem,
   1648     extracted_params,
   1649     cache_hit=cache_hit,
   1650 )
   1651 if has_events:
   1652     self.dispatch.after_execute(
   1653         self,
   1654         elem,
   (...)
   1658         ret,
   1659     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1848, in Connection._execute_context(self, dialect, constructor, statement, parameters, execution_options, *args, **kw)
   1843     return self._exec_insertmany_context(
   1844         dialect,
   1845         context,
   1846     )
   1847 else:
-> 1848     return self._exec_single_context(
   1849         dialect, context, statement, parameters
   1850     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1988, in Connection._exec_single_context(self, dialect, context, statement, parameters)
   1985     result = context._setup_result_proxy()
   1987 except BaseException as e:
-> 1988     self._handle_dbapi_exception(
   1989         e, str_statement, effective_parameters, cursor, context
   1990     )
   1992 return result

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:2346, in Connection._handle_dbapi_exception(self, e, statement, parameters, cursor, context, is_sub_exec)
   2344     else:
   2345         assert exc_info[1] is not None
-> 2346         raise exc_info[1].with_traceback(exc_info[2])
   2347 finally:
   2348     del self._reentrant_error

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/base.py:1969, in Connection._exec_single_context(self, dialect, context, statement, parameters)
   1967                 break
   1968     if not evt_handled:
-> 1969         self.dialect.do_execute(
   1970             cursor, str_statement, effective_parameters, context
   1971         )
   1973 if self._has_events or self.engine._has_events:
   1974     self.dispatch.after_cursor_execute(
   1975         self,
   1976         cursor,
   (...)
   1980         context.executemany,
   1981     )

File ~/.conda/envs/guardrails1/lib/python3.9/site-packages/sqlalchemy/engine/default.py:922, in DefaultDialect.do_execute(self, cursor, statement, parameters, context)
    921 def do_execute(self, cursor, statement, parameters, context=None):
--> 922     cursor.execute(statement, parameters)

Warning: You can only execute one statement at a time.
```
**Issue:** 
The Error occurs because when generating the SQLQuery, the llm_input
includes the stop character of "\nSQLResult:", so for this user query
the LLM generated response is **SELECT Time FROM men_butterfly_100m
WHERE Swimmer = 'Lance Larson';\nSQLResult:** it is required to remove
the SQLResult suffix on the llm response before executing it on the
database.

```
llm_inputs = {
            "input": input_text,
            "top_k": str(self.top_k),
            "dialect": self.database.dialect,
            "table_info": table_info,
            "stop": ["\nSQLResult:"],
        }

sql_cmd = self.llm_chain.predict(
                callbacks=_run_manager.get_child(),
                **llm_inputs,
            ).strip()

if SQL_RESULT in sql_cmd:
    sql_cmd = sql_cmd.split(SQL_RESULT)[0].strip()
result = self.database.run(sql_cmd)
```


<!-- Thank you for contributing to LangChain!

Please title your PR "<package>: <description>", where <package> is
whichever of langchain, community, core, experimental, etc. is being
modified.

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes if applicable,
  - **Dependencies:** any dependencies required for this change,
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` from the root
of the package you've modified to check this locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc: https://python.langchain.com/docs/contributing/

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 01:22:35 -07:00
T Cramer
540ebf35a9 community[patch]: Add explicit error message to Bedrock error output. (#17328)
- **Description:** Propagate Bedrock errors into Langchain explicitly.
Use-case: unset region error is hidden behind 'Could not load
credentials...' message
- **Issue:**
[17654](https://github.com/langchain-ai/langchain/issues/17654)
  - **Dependencies:** None

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 03:07:33 +00:00
Marcus Virginia
69bb96c80f community[patch]: surrealdb handle for empty metadata and allow collection names with complex characters (#17374)
- **Description:** Handle for empty metadata and allow collection names
with complex characters
  - **Issue:** #17057
  - **Dependencies:** `surrealdb`

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 01:04:27 +00:00
ale-delfino
0df76bee37 core[patch]:: XML parser to cover the case when the xml only contains the root level tag (#17456)
Description: Fix xml parser to handle strings that only contain the root
tag
Issue: N/A
Dependencies: None
Twitter handle: N/A

A valid xml text can contain only the root level tag. Example: <body>
  Some text here
</body>
The example above is a valid xml string. If parsed with the current
implementation the result is {"body": []}. This fix checks if the root
level text contains any non-whitespace character and if that's the case
it returns {root.tag: root.text}. The result is that the above text is
correctly parsed as {"body": "Some text here"}

@ale-delfino

Thank you for contributing to LangChain!

Checklist:

- [x] PR title: Please title your PR "package: description", where
"package" is whichever of langchain, community, core, experimental, etc.
is being modified. Use "docs: ..." for purely docs changes, "templates:
..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
- [x] PR message: **Delete this entire template message** and replace it
with the following bulleted list
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @efriis, @eyurtsev, @hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-29 00:55:23 +00:00
kYLe
124ab79c23 community[minor]: Add Anyscale embedding support (#17605)
**Description:** Add embedding model support for Anyscale Endpoint
**Dependencies:** openai

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:53:53 +00:00
Lance Martin
12843f292f community[patch]: llama cpp embeddings reset default n_batch (#17594)
When testing Nomic embeddings --
```
from langchain_community.embeddings import LlamaCppEmbeddings
embd_model_path = "/Users/rlm/Desktop/Code/llama.cpp/models/nomic-embd/nomic-embed-text-v1.Q4_K_S.gguf"
embd_lc = LlamaCppEmbeddings(model_path=embd_model_path)
embedding_lc = embd_lc.embed_query(query)
```

We were seeing this error for strings > a certain size -- 
```
File ~/miniforge3/envs/llama2/lib/python3.9/site-packages/llama_cpp/llama.py:827, in Llama.embed(self, input, normalize, truncate, return_count)
    824     s_sizes = []
    826 # add to batch
--> 827 self._batch.add_sequence(tokens, len(s_sizes), False)
    828 t_batch += n_tokens
    829 s_sizes.append(n_tokens)

File ~/miniforge3/envs/llama2/lib/python3.9/site-packages/llama_cpp/_internals.py:542, in _LlamaBatch.add_sequence(self, batch, seq_id, logits_all)
    540 self.batch.token[j] = batch[i]
    541 self.batch.pos[j] = i
--> 542 self.batch.seq_id[j][0] = seq_id
    543 self.batch.n_seq_id[j] = 1
    544 self.batch.logits[j] = logits_all

ValueError: NULL pointer access
```

The default `n_batch` of llama-cpp-python's Llama is `512` but we were
explicitly setting it to `8`.
 
These need to be set to equal for embedding models. 
* The embedding.cpp example has an assertion to make sure these are
always equal.
* Apparently this is not being done properly in llama-cpp-python.

With `n_batch` set to 8, if more than 8 tokens are passed the batch runs
out of space and it crashes.

This also explains why the CPU compute buffer size was small:

raw client with default `n_batch=512`
```
llama_new_context_with_model:        CPU input buffer size   =     3.51 MiB
llama_new_context_with_model:        CPU compute buffer size =    21.00 MiB
```
langchain with `n_batch=8`
```
llama_new_context_with_model:        CPU input buffer size   =     0.04 MiB
llama_new_context_with_model:        CPU compute buffer size =     0.33 MiB
```

We can work around this by passing `n_batch=512`, but this will not be
obvious to some users:
```
    embedding = LlamaCppEmbeddings(model_path=embd_model_path,
                                   n_batch=512)
```

From discussion w/ @cebtenzzre. Related:

https://github.com/abetlen/llama-cpp-python/issues/1189

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:47:22 +00:00
Zijian Han
8e976545f3 community[patch]: support OpenAI whisper base url (#17695)
**Description:** The base URL for OpenAI is retrieved from the
environment variable "OPENAI_BASE_URL", whereas for langchain it is
obtained from "OPENAI_API_BASE". By adding `base_url =
os.environ.get("OPENAI_API_BASE")`, the OpenAI proxy can execute
correctly.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:35:27 +00:00
Paulo Nascimento
44a3484503 community[patch]: add NotebookLoader unit test (#17721)
Thank you for contributing to LangChain!

- **Description:** added unit tests for NotebookLoader. Linked PR:
https://github.com/langchain-ai/langchain/pull/17614
- **Issue:**
[#17614](https://github.com/langchain-ai/langchain/pull/17614)
    - **Twitter handle:** @paulodoestech
- [x] Pass lint and test: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified to check that you're
passing lint and testing. See contribution guidelines for more
information on how to write/run tests, lint, etc:
https://python.langchain.com/docs/contributing/
- [x] Add tests and docs: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: lachiewalker <lachiewalker1@hotmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:27:46 +00:00
Paulo Nascimento
4c3a67122f community[patch]: add Integration for OpenAI image gen with v1 sdk (#17771)
**Description:** Created a Langchain Tool for OpenAI DALLE Image
Generation.
**Issue:**
[#15901](https://github.com/langchain-ai/langchain/issues/15901)
**Dependencies:** n/a
**Twitter handle:** @paulodoestech

- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:23:14 +00:00
Kaixin Yang
a8104ea8e9 openai[patch]: add checking codes for calling AI model get error (#17909)
**Description:**: adding checking codes for calling AI model get error
in chat_models/base.py and llms/base.py
**Issue**: Sometimes the AI Model calling will get error, we should
raise it.
Otherwise, the next code 'choices.extend(response["choices"])' will
throw a "TypeError: 'NoneType' object is not iterable" error to mask the
true error.
       Because 'response["choices"]' is None.
**Dependencies**: None

---------

Co-authored-by: yangkx <yangkx@asiainfo-int.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-29 00:17:32 +00:00
Vincent Chen
833d61adb3 docs: update Together README.md (#18004)
## PR message
**Description:** This PR adds a README file for the Together API in the
`libs/partners` folder of this repository. The README includes:
 - A brief description of the package
 - Installation instructions and class introductions
 - Simple usage examples

**Issue:** #17545 

This PR only contains document changes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-29 00:02:32 +00:00
Jiaming
3d3cc71287 community[patch]: fix bugs for bilibili Loader (#18036)
- **Description:** 
1. Fix the BiliBiliLoader that can receive cookie parameters, it
requires 3 other parameters to run. The change is backward compatible.
  2. Add test;      
  3. Add example in docs

- **Issue:** [#14213]

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 16:39:38 -07:00
Ethan Knights
1ef3fa0411 docs: improve readability of Langchain Expression Language get_started.ipynb (#18157)
**Description:** A few grammatical changes to improve readability of the
LCEL .ipynb and tidy some null characters.
**Issue:** N/A

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 23:38:30 +00:00
Sachin Paryani
25c9f3d1d1 community[patch]: Support Streaming in Azure Machine Learning (#18246)
- [x] **PR title**: "community: Support streaming in Azure ML and few
naming changes"

- [x] **PR message**:
- **Description:** Added support for streaming for azureml_endpoint.
Also, renamed and AzureMLEndpointApiType.realtime to
AzureMLEndpointApiType.dedicated. Also, added new classes
CustomOpenAIChatContentFormatter and CustomOpenAIContentFormatter and
updated the classes LlamaChatContentFormatter and LlamaContentFormatter
to now show a deprecated warning message when instantiated.

---------

Co-authored-by: Sachin Paryani <saparan@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 23:38:20 +00:00
xiaohuanshu
ecb11a4a32 langchain[patch]: fix BaseChatMemory get output data error with extra key (#18117)
**Description:** At times, BaseChatMemory._get_input_output may acquire
some extra keys such as 'intermediate_steps' (agent_executor with
return_intermediate_steps set to True) and 'messages'
(agent_executor.iter with memory). In these instances, _get_input_output
can raise an error due to the presence of multiple keys. The 'output'
field should be used as the default field in these cases.
**Issue:** #16791
2024-03-28 16:38:08 -07:00
Isaac Francisco
f5e84c8858 docs: fixing markdown for tips (#18199)
Previous markdown code was not working as intended, new code should add
green box around the tip so it is highlighted

Co-authored-by: Hershenson, Isaac (Extern) <isaac.hershenson.extern@bayer04.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 23:37:37 +00:00
Hayden Wolff
85deee521a docs: Nvidia Riva Runnables Documentation (#18237)
- **Description:** Documents how to use the Riva runnables to add
streamed automatic-speech-recognition (ASR) and text-to-speech (TTS) to
chains.
  - **Issue:** None
  - **Dependencies:** None
  - **Twitter handle:** @HaydenWolff1

---------

Co-authored-by: Hayden Wolff <hwolff@Haydens-Laptop.local>
Co-authored-by: Hayden Wolff <hwolff@MacBook-Pro.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 23:35:00 +00:00
Victor Adan
afa2d85405 community[patch]: Added missing from_documents method to KNNRetriever. (#18411)
- Description: Added missing `from_documents` method to `KNNRetriever`,
providing the ability to supply metadata to LangChain `Document`s, and
to give it parity to the other retrievers, which do have
`from_documents`.
- Issue: None
- Dependencies: None
- Twitter handle: None

Co-authored-by: Victor Adan <vadan@netroadshow.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 23:18:50 +00:00
Smit Parmar
dfc4177b50 community[patch]: mypy ignore fix (#18483)
Relates to #17048 
Description : Applied fix to dynamodb and elasticsearch file.

Error was : `Cannot override writeable attribute with read-only
property`
Suggestion:
instead of adding 
```
@messages.setter
def messages(self, messages: List[BaseMessage]) -> None:
    raise NotImplementedError("Use add_messages instead")
```

we can change base class property
`messages: List[BaseMessage]`
to
```
@property
def messages(self) -> List[BaseMessage]:...
```

then we don't need to add `@messages.setter` in all child classes.
2024-03-28 15:36:53 -07:00
aditya thomas
dc9e9a66db docs: update docstring of the ChatAnthropic and AnthropicLLM classes (#18649)
**Description:** Update docstring of the ChatAnthropic and AnthropicLLM
classes
**Issue:** Not applicable
**Dependencies:** None
2024-03-28 15:33:54 -07:00
Luca Dorigo
f19229c564 core[patch]: fix beta, deprecated typing (#18877)
**Description:** 

While not technically incorrect, the TypeVar used for the `@beta`
decorator prevented pyright (and thus most vscode users) from correctly
seeing the types of functions/classes decorated with `@beta`.

This is in part due to a small bug in pyright
(https://github.com/microsoft/pyright/issues/7448 ) - however, the
`Type` bound in the typevar `C = TypeVar("C", Type, Callable)` is not
doing anything - classes are `Callables` by default, so by my
understanding binding to `Type` does not actually provide any more
safety - the modified annotation still works correctly for both
functions, properties, and classes.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 22:33:43 +00:00
aditya thomas
263ee78886 core[runnables]: docstring for class RunnableSerializable, method configurable_fields (#19722)
**Description:** Update to the docstring for class RunnableSerializable,
method configurable_fields
**Issue:** [Add in code documentation to core Runnable methods
#18804](https://github.com/langchain-ai/langchain/issues/18804)
**Dependencies:** None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-28 18:15:18 -04:00
HuangZiy
e1f10a697e openai[patch]: perform judgment processing on chat model streaming delta (#18983)
**PR title:** partners: openai chat model
**PR message:** perform judgment processing on chat model streaming
delta
Closes #18977

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 14:46:27 -07:00
wulixuan
b7c8bc8268 community[patch]: fix yuan2 errors in LLMs (#19004)
1. fix yuan2 errors while invoke Yuan2.
2. update tests.
2024-03-28 14:37:44 -07:00
Bob Lin
aba4bd0d13 docs: Add async batch case (#19686) 2024-03-28 14:00:46 -07:00
aditya thomas
ec4dcfca7f core[runnables]: docstring of class RunnableSerializable, method configurable_alternatives (#19724)
**Description:** Update to the docstring for class RunnableSerializable,
method configurable_alternatives
**Issue:** [Add in code documentation to core Runnable methods
#18804](https://github.com/langchain-ai/langchain/issues/18804)
**Dependencies:** None

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-28 17:00:08 -04:00
Davide Menini
824dbc49ee langchain[patch]: add template_tool_response arg to create_json_chat (#19696)
In this small PR I added the `template_tool_response` arg to the
`create_json_chat` function, so that users can customize this prompt in
case of need.
Thanks for your reviews!

---------

Co-authored-by: taamedag <Davide.Menini@swisscom.com>
2024-03-28 13:59:54 -07:00
高远
688ca48019 community[patch]: Adding validation when vector does not exist (#19698)
Adding validation when vector does not exist

Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
2024-03-28 13:58:23 -07:00
Erick Friis
f55b11fb73 infra: Revert run partner CI on core PRs (#19733)
Reverts parts of langchain-ai/langchain#19688
2024-03-28 20:45:59 +00:00
Alessandro Rossi
665f15bd48 docs: fix typos and make quickstart more readable (#19712)
Description: minor docs changes to make it more readable.
Issue: N/A
Dependencies: N/A
Twitter handle: _kubealex
2024-03-28 20:10:32 +00:00
standby24x7
36090c84f2 docs: Update function "run" to "invoke" in llm_symbolic_math.ipynb (#19713)
This patch updates multiple function "run" to "invoke" in
llm_symbolic_math.ipynb.

Without this patch, you see following message.
The function `run` was deprecated in LangChain 0.1.0
 and will be removed in 0.2.0. Use invoke instead.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-28 13:08:22 -07:00
Chaunte W. Lacewell
4a49fc5a95 community[patch]: Fix bug in vdms (#19728)
**Description:** Fix embedding check in vdms
**Contribution maintainer:** [@cwlacewe](https://github.com/cwlacewe)
2024-03-28 12:54:24 -07:00
高璟琦
75173d31db community[minor]: Add solar model chat model (#18556)
Add our solar chat models, available model choices:
* solar-1-mini-chat
* solar-1-mini-translate-enko
* solar-1-mini-translate-koen

More documents and pricing can be found at
https://console.upstage.ai/services/solar.

The references to our solar model can be found at
* https://arxiv.org/abs/2402.17032

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 12:31:11 -07:00
Erick Friis
e576d6c6b4 cohere[patch]: release 0.1.0rc1 (rc1-2 never released) (#19731) 2024-03-28 19:12:22 +00:00
harry-cohere
ea57050122 cohere: add with_structured_output to ChatCohere (#19730)
**Description:** Adds support for `with_structured_output` to Cohere,
which supports single function calling.

---------

Co-authored-by: BeatrixCohere <128378696+BeatrixCohere@users.noreply.github.com>
2024-03-28 12:09:25 -07:00
Guangdong Liu
0571f886d1 core[patch]: Fix jsonOutputParser fails if a json value contains ``` inside it. (#19717)
- **Issue:** fix #19646 
- @baskaryan, @eyurtsev PTAL
2024-03-28 12:01:09 -07:00
Davide Menini
f7042321f1 community[patch]: gather token usage info in BedrockChat during generation (#19127)
This PR allows to calculate token usage for prompts and completion
directly in the generation method of BedrockChat. The token usage
details are then returned together with the generations, so that other
downstream tasks can access them easily.

This allows to define a callback for tokens tracking and cost
calculation, similarly to what happens with OpenAI (see
[OpenAICallbackHandler](https://api.python.langchain.com/en/latest/_modules/langchain_community/callbacks/openai_info.html#OpenAICallbackHandler).
I plan on adding a BedrockCallbackHandler later.
Right now keeping track of tokens in the callback is already possible,
but it requires passing the llm, as done here:
https://how.wtf/how-to-count-amazon-bedrock-anthropic-tokens-with-langchain.html.
However, I find the approach of this PR cleaner.

Thanks for your reviews. FYI @baskaryan, @hwchase17

---------

Co-authored-by: taamedag <Davide.Menini@swisscom.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 18:58:46 +00:00
ligang-super
a662468dde community[patch]: Fix the error of Baidu Qianfan not passing the stop parameter (#18666)
- [x] **PR title**: "community: fix baidu qianfan missing stop
parameter"
- [x] **PR message**:
- **Description: Baidu Qianfan lost the stop parameter when requesting
service due to extracting it from kwargs. This bug can cause the agent
to receive incorrect results

---------

Co-authored-by: ligang33 <ligang33@baidu.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 18:21:49 +00:00
BeatrixCohere
d1a2e194c3 cohere[patch]: misc fixs tool use agent and cohere chat (#19705)
Bug fixes in this PR:
* allows for other params such as "message" not just the input param to
the prompt for the cohere tools agent
* fixes to documents kwarg from messages
* fixes to tool_calls API call

---------

Co-authored-by: Harry M <127103098+harry-cohere@users.noreply.github.com>
2024-03-28 10:19:38 -07:00
ccurme
b35e68c41f docs: update use_cases/question_answering/chat_history (#19349)
Update following https://github.com/langchain-ai/langchain/issues/19344
2024-03-28 12:51:01 -04:00
Erick Friis
8c2ed85a45 core[patch], infra: release 0.1.36, run partner CI on core PRs (#19688) 2024-03-28 08:55:10 -07:00
Erick Friis
5327bc9ec4 elasticsearch[patch]: move to repo (#19620) 2024-03-28 08:54:57 -07:00
Nilanjan De
239dd7c0c0 langchain[patch]: Use map() and avoid "ValueError: max() arg is an empty sequence" in MergerRetriever (#18679)
- **Issue:** When passing an empty list to MergerRetriever it fails with
error: ValueError: max() arg is an empty sequence

- **Description:** We have a use case where we dynamically select
retrievers and use MergerRetriever for merging the output of the
retrievers. We faced this issue when the retriever_docs list is empty.
Adding a default 0 for cases when retriever_docs is an empty list to
avoid "ValueError: max() arg is an empty sequence". Also, changed to use
map() which is more than twice as fast compared to the current
implementation.
```
import timeit
# Sample retriever_docs with varying lengths of sublists
retriever_docs = [[i for i in range(j)] for j in range(1, 1000)]
# First code snippet
code1 = '''
max_docs = max(len(docs) for docs in retriever_docs)
'''
# Second code snippet
code2 = '''
max_docs = max(map(len, retriever_docs), default=0)
'''
# Benchmarking
time1 = timeit.timeit(stmt=code1, globals=globals(), number=10000)
time2 = timeit.timeit(stmt=code2, globals=globals(), number=10000)
# Output
print(f"Execution time for code snippet 1: {time1} seconds")
print(f"Execution time for code snippet 2: {time2} seconds")
```

- **Dependencies:** none
2024-03-27 23:52:57 -07:00
aditya thomas
4cd38fe89f docs: update docstring of the ChatGroq class (#18645)
**Description:** Update docstring of the ChatGroq class
**Issue:** Not applicable
**Dependencies:** None
2024-03-27 23:46:52 -07:00
Jaid
e4d7b1a482 voyageai[patch]: top level reranker import (#19645)
The previous version didn't had  Voyage rerank in the init file


- [ ] **PR title**: langchain_voyageai reranker is not working
 


- [ ] **PR message**: 
    - **Description:** This fix let you run reranker from voyage
    - **Issue:** Was not able to run reranker from voyage
  






 @efriis
2024-03-28 06:37:55 +00:00
Xinwei Xiong
26eed70c11 infra: Optimize Makefile for Better Usability and Maintenance (#18859)
**Previous screenshots:**


![image](https://github.com/langchain-ai/langchain/assets/86140903/e2f326e3-4d97-4b22-aacb-e789a9d815e4)

**Current screenshot:**

![image](https://github.com/langchain-ai/langchain/assets/86140903/bd8a3ea7-1b8a-4803-9168-df45f6fa4893)
2024-03-27 23:37:39 -07:00
Juan Jose Miguel Ovalle Villamil
51baa1b5cf langchain[patch]: fix-cohere-reranker-rerank-method with cohere v5 (#19486)
#### Description
Fixed the following error with `rerank` method from `CohereRerank`:
```
---> [79](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:79) results = self.client.rerank(
     [80](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:80)     query, docs, model, top_n=top_n, max_chunks_per_doc=max_chunks_per_doc
     [81](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:81) )
     [82](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:82) result_dicts = []
     [83](https://vscode-remote+wsl-002bubuntu.vscode-resource.vscode-cdn.net/home/jjmov99/legal-colombia/~/legal-colombia/.venv/lib/python3.11/site-packages/langchain/retrievers/document_compressors/cohere_rerank.py:83) for res in results.results:

TypeError: BaseCohere.rerank() takes 1 positional argument but 4 positional arguments (and 2 keyword-only arguments) were given
```
This was easily fixed going from this:
```
   def rerank(
        self,
        documents: Sequence[Union[str, Document, dict]],
        query: str,
        *,
        model: Optional[str] = None,
        top_n: Optional[int] = -1,
        max_chunks_per_doc: Optional[int] = None,
    ) -> List[Dict[str, Any]]:
         ...
        if len(documents) == 0:  # to avoid empty api call
            return []
        docs = [
            doc.page_content if isinstance(doc, Document) else doc for doc in documents
        ]
        model = model or self.model
        top_n = top_n if (top_n is None or top_n > 0) else self.top_n
        results = self.client.rerank(
            query, docs, model, top_n=top_n, max_chunks_per_doc=max_chunks_per_doc
        )
        result_dicts = []
        for res in results:
            result_dicts.append(
                {"index": res.index, "relevance_score": res.relevance_score}
            )
        return result_dicts
```
to this:
```
    def rerank(
        self,
        documents: Sequence[Union[str, Document, dict]],
        query: str,
        *,
        model: Optional[str] = None,
        top_n: Optional[int] = -1,
        max_chunks_per_doc: Optional[int] = None,
    ) -> List[Dict[str, Any]]:
         ...
        if len(documents) == 0:  # to avoid empty api call
            return []
        docs = [
            doc.page_content if isinstance(doc, Document) else doc for doc in documents
        ]
        model = model or self.model
        top_n = top_n if (top_n is None or top_n > 0) else self.top_n
        results = self.client.rerank(
            query=query, documents=docs, model=model, top_n=top_n, max_chunks_per_doc=max_chunks_per_doc <-------------
        )
        result_dicts = []
        for res in results.results:  <-------------
            result_dicts.append(
                {"index": res.index, "relevance_score": res.relevance_score}
            )
        return result_dicts
```
#### Unit & Integration tests
I added a unit test to check the behaviour of `rerank`. Also fixed the
original integration test which was failing.

#### Format & Linting
Everything worked properly with `make lint_diff`, `make format_diff` and
`make format`. However I noticed an error coming from other part of the
library when doing `make lint`:

```
(langchain-py3.9) ➜  langchain git:(master) make format
[ "." = "" ] || poetry run ruff format .
1636 files left unchanged
[ "." = "" ] || poetry run ruff --select I --fix .
(langchain-py3.9) ➜  langchain git:(master) make lint
./scripts/check_pydantic.sh .
./scripts/lint_imports.sh
poetry run ruff .
[ "." = "" ] || poetry run ruff format . --diff
1636 files already formatted
[ "." = "" ] || poetry run ruff --select I .
[ "." = "" ] || mkdir -p .mypy_cache && poetry run mypy . --cache-dir .mypy_cache
langchain/agents/openai_assistant/base.py:252: error: Argument "file_ids" to "create" of "Assistants" has incompatible type "Optional[Any]"; expected "Union[list[str], NotGiven]"  [arg-type]
langchain/agents/openai_assistant/base.py:374: error: Argument "file_ids" to "create" of "AsyncAssistants" has incompatible type "Optional[Any]"; expected "Union[list[str], NotGiven]"  [arg-type]
Found 2 errors in 1 file (checked 1634 source files)
make: *** [Makefile:65: lint] Error 1
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 06:32:03 +00:00
Shuqian
332996b4b2 openai[patch]: fix ChatOpenAI model's openai proxy (#19559)
Due to changes in the OpenAI SDK, the previous method of setting the
OpenAI proxy in ChatOpenAI no longer works. This PR fixes this issue,
making the previous way of setting the OpenAI proxy in ChatOpenAI
effective again.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 23:16:55 -07:00
Bagatur
b15c7fdde6 anthropic[patch]: fix response metadata type (#19683) 2024-03-27 23:16:26 -07:00
kaijietti
9c4b6dc979 community[patch]: fix bug in cohere that async for a coroutine in ChatCohere (#19381)
Without `await`, the `stream` returned from the `async_client` is
actually a coroutine, which could not be used in `async for`.
2024-03-27 21:34:46 -07:00
Christian Galo
1adaa3c662 community[minor]: Update Azure Cognitive Services to Azure AI Services (#19488)
This is a follow up to #18371. These are the changes:
- New **Azure AI Services** toolkit and tools to replace those of
**Azure Cognitive Services**.
- Updated documentation for Microsoft platform.
- The image analysis tool has been rewritten to use the new package
`azure-ai-vision-imageanalysis`, doing a proper replacement of
`azure-ai-vision`.

These changes:
- Update outdated naming from "Azure Cognitive Services" to "Azure AI
Services".
- Update documentation to use non-deprecated methods to create and use
agents.
- Removes need to depend on yanked python package (`azure-ai-vision`)

There is one new dependency that is needed as a replacement to
`azure-ai-vision`:
- `azure-ai-vision-imageanalysis`. This is optional and declared within
a function.

There is a new `azure_ai_services.ipynb` notebook showing usage; Changes
have been linted and formatted.

I am leaving the actions of adding deprecation notices and future
removal of Azure Cognitive Services up to the LangChain team, as I am
not sure what the current practice around this is.

---

If this PR makes it, my handle is  @galo@mastodon.social

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ccurme <chester.curme@gmail.com>
2024-03-28 03:19:02 +00:00
Shengsheng Huang
ac1dd8ad94 community[minor]: migrate bigdl-llm to ipex-llm (#19518)
- **Description**: `bigdl-llm` library has been renamed to
[`ipex-llm`](https://github.com/intel-analytics/ipex-llm). This PR
migrates the `bigdl-llm` integration to `ipex-llm` .
- **Issue**: N/A. The original PR of `bigdl-llm` is
https://github.com/langchain-ai/langchain/pull/17953
- **Dependencies**: `ipex-llm` library
- **Contribution maintainer**: @shane-huang

Updated doc:   docs/docs/integrations/llms/ipex_llm.ipynb
Updated test:
libs/community/tests/integration_tests/llms/test_ipex_llm.py
2024-03-27 20:12:59 -07:00
Chaunte W. Lacewell
a31f692f4e community[minor]: Add VDMS vectorstore (#19551)
- **Description:** Add support for Intel Lab's [Visual Data Management
System (VDMS)](https://github.com/IntelLabs/vdms) as a vector store
- **Dependencies:** `vdms` library which requires protobuf = "4.24.2".
There is a conflict with dashvector in `langchain` package but conflict
is resolved in `community`.
- **Contribution maintainer:** [@cwlacewe](https://github.com/cwlacewe)
- **Added tests:**
libs/community/tests/integration_tests/vectorstores/test_vdms.py
- **Added docs:** docs/docs/integrations/vectorstores/vdms.ipynb
- **Added cookbook:** cookbook/multi_modal_RAG_vdms.ipynb

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 03:12:11 +00:00
William FH
b7b62e29fb community[patch], mongodb[patch]: Stop spamming SIMD import warnings (#19531)
If you use an embedding dist function in an eval loop, you get warned
every time. Would prefer to just check once and forget about it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-28 03:11:02 +00:00
Tomaz Bratanic
b04e663426 experimental[patch]: Flatten relationships in LLM graph transformer (#19642) 2024-03-27 19:35:34 -07:00
billytrend-cohere
36abb5dd41 cohere[patch]: Fix positional argument (#19678)
cohere: Fix positional argument

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-28 02:26:08 +00:00
Nuno Campos
fdfb51ad8d core: Two updates to chat model interface (#19684)
- .stream() and .astream() call on_llm_new_token, removing the need for
subclasses to do so. Backwards compatible because now we don't pass
run_manager into ._stream and ._astream
- .generate() and .agenerate() now handle `stream: bool` kwarg for
_generate and _agenerate. Subclasses handle this arg by delegating to
._stream(), now one less thing they need to do. Backwards compat because
this is an optional arg that we now never pass to the subclasses
- .generate() and .agenerate() now inspect callback handlers to decide
on a default value for stream:bool if not passed in. This auto enables
streaming when using astream_events and astream_log
- as a result of these three changes any usage of .astream_events and
.astream_log should now yield chat model stream events
- In future PRs we can update all subclasses to reflect these two things
now handled by base class, but in meantime all will continue to work
2024-03-27 18:45:01 -07:00
harry-cohere
3685f8ceac cohere[patch]: Add cohere tools agent (#19602)
**Description**: Adds a cohere tools agent and related notebook.

---------

Co-authored-by: BeatrixCohere <128378696+BeatrixCohere@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-27 18:35:43 -07:00
William FH
5c41f4083e [Evals] Fix function calling support (#19658)
Current implementation is overzealous in validating chat datasets

Fixes
[#langsmith-sdk:557](https://github.com/langchain-ai/langsmith-sdk/issues/557)
2024-03-27 17:23:35 -07:00
yongheng.liu
7e29b6061f community[minor]: integrate China Mobile Ecloud vector search (#15298)
- **Description:** integrate China Mobile Ecloud vector search, 
  - **Dependencies:** elasticsearch==7.10.1

Co-authored-by: liuyongheng <liuyongheng@cmss.chinamobile.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 23:02:40 +00:00
Hyeongchan Kim
9b70131aed community[patch]: refactor the type hint of file_path in UnstructuredAPIFileLoader class (#18839)
* **Description**: add `None` type for `file_path` along with `str` and
`List[str]` types.
* `file_path`/`filename` arguments in `get_elements_from_api()` and
`partition()` can be `None`, however, there's no `None` type hint for
`file_path` in `UnstructuredAPIFileLoader` and `UnstructuredFileLoader`
currently.
* calling the function with `file_path=None` is no problem, but my IDE
annoys me lol.
* **Issue**: N/A
* **Dependencies**: N/A

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-27 22:31:54 +00:00
CaroFG
cf96060ab7 community[patch]: update for compatibility with latest Meilisearch version (#18970)
- **Description:** Updates Meilisearch vectorstore for compatibility
with v1.6 and above. Adds embedders settings and embedder_name which are
now required.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 22:08:27 +00:00
chyroc
be2adb1083 community[patch]: support unstructured_kwargs for s3 loader (#15473)
fix https://github.com/langchain-ai/langchain/issues/15472

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 22:03:48 +00:00
Bagatur
b901649032 docs: move extraction up (#19667) 2024-03-27 14:55:16 -07:00
Kahlil Wehmeyer
9c08cdea92 core[patch]: ToolException docs/exception message (#17590)
**Description:**
This PR adds a slightly more helpful message to a Tool Exception

```
# current state
langchain_core.tools.ToolException: Too many arguments to single-input tool

# proposed state
langchain_core.tools.ToolException: Too many arguments to single-input tool. Consider using a StructuredTool instead.
```
**Issue:** Somewhat discussed here 👉  #6197 
 **Dependencies:** None
**Twitter handle:** N/A

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-27 21:52:36 +00:00
Evgenii Zheltonozhskii
5b1f9c6d3a infra: Consistent lxml requirements (#19520)
Update the dependency for lxml to be consistent among different
packages; should fix
https://github.com/langchain-ai/langchain/issues/19040
2024-03-27 20:27:59 +00:00
Filip Michalsky
2fceec3771 docs: update cookbook example for SalesGPT - include Stripe Payment Link Generation (#19622)
Thank you for contributing to LangChain!

- [ ] **cookbook** - update example for SalesGPT - include Stripe
Payment Link Generation

- **Description:** We updated the Jupyter notebook example with the
ability of the AI Agent to negotiate with customers and then close the
deal by generating a custom Stripe payment link.
    - **Issue:** N/A
    - **Dependencies:** N/a
    - **Twitter handle:** @FilipMichalsky @0xtotaylor


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Filip Michalsky <filip_michalsky@g.harvard.edu>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 20:16:21 +00:00
Christophe Bornet
33fa8cfcd0 core[minor]: Add async methods to MaxMarginalRelevanceExampleSelector (#19639) 2024-03-27 16:03:18 -04:00
Taqi Jaffri
72c8b3127d cli[patch]: Fix typo in dev script name for the --chat-playground option on the cli (#19673)
Fixes typo

---------

Co-authored-by: Taqi Jaffri <tjaffri@docugami.com>
2024-03-27 15:56:11 -04:00
Jan Nissen
2e0ddd6fb8 core[minor]: support pydantic v2 models in PydanticOutputParser (#18811)
As mentioned in #18322, the current PydanticOutputParser won't work for
anyone trying to parse to pydantic v2 models. This PR adds a separate
`PydanticV2OutputParser`, as well as a `langchain_core.pydantic_v2`
namespace that will fail on import to any projects using pydantic<2.
Happy to update the docs for output parsers if this is something we're
interesting in adding.

On a separate note, I also updated `check_pydantic.sh` to detect
pydantic imports with leading whitespace and excluded the internal
namespaces. That change can be separated into its own PR if needed.

---------

Co-authored-by: Jan Nissen <jan23@gmail.com>
2024-03-27 15:37:52 -04:00
Kangmoon Seo
d0accc3275 docs: fix error output in XMLOutputParser documentation (#19569)
- **Description:** I've made a fix to a ParseError call in the
XMLOutputParser documentation.
- **Issue:** None
- **Dependencies:** None

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-27 18:29:00 +00:00
Tomaz Bratanic
87d2a6b777 community[minor]: Add the option to omit schema refresh in Neo4jGraph (#19654) 2024-03-27 14:20:12 -04:00
Bagatur
5fc6531c74 docs: use first_tool_only instead of return_single (#19666) 2024-03-27 18:19:39 +00:00
jhicks2306
bcb8ab5216 docs: Improve docstring for Runnable bind method (#19659)
Added example to the docstring of the "bind" method of Runnable. This
makes it easier to understand the purpose of the method when reviewing
in code editors. E.g. VS Code below.

<img width="833" alt="Screenshot 2024-03-27 at 16 24 18"
src="https://github.com/langchain-ai/langchain/assets/45722942/ad022d4e-7bc0-4f4b-aa7a-838f1816cc52">

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-27 14:05:41 -04:00
ccurme
4e9b358ed8 docs: Fix broken imports in documentation (#19655)
Found via script in https://github.com/langchain-ai/langchain/pull/19611
2024-03-27 13:54:05 -04:00
Rajendra Kadam
0019d8a948 community[minor]: Add support for non-file-based Document Loaders in PebbloSafeLoader (#19574)
**Description:**
PebbloSafeLoader: Add support for non-file-based Document Loaders

This pull request enhances PebbloSafeLoader by introducing support for
several non-file-based Document Loaders. With this update,
PebbloSafeLoader now seamlessly integrates with the following loaders:
- GoogleDriveLoader
- SlackDirectoryLoader
- Unstructured EmailLoader

**Issue:** NA
**Dependencies:** - None
**Twitter handle:** @Raj__725

---------

Co-authored-by: Rahul Tripathi <rauhl.psit.ec@gmail.com>
2024-03-27 17:39:52 +00:00
Christophe Bornet
9954c6a38e langchain[minor]: Add async methods to EncoderBackedStore (#19597)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-27 17:36:36 +00:00
Erick Friis
929ed65554 cohere[patch]: release 0.1.0rc1 (#19663) 2024-03-27 17:14:56 +00:00
hulitaitai
dc2c9dd4d7 Update text2vec.py (#19657)
Add that URL of the embedding tool "text2vec".
Fix minor mistakes in the doc-string.
2024-03-27 13:13:30 -04:00
Erick Friis
7630e9529c Revert "community: added partners/package-name folders" (#19662)
Reverts langchain-ai/langchain#19290
2024-03-27 17:09:30 +00:00
Christophe Bornet
409c6eeb0b core: Add async methods to LengthBasedExampleSelector (#19640) 2024-03-27 13:05:58 -04:00
Bagatur
c7f1962f73 core[patch]: Release 0.1.35 (#19660) 2024-03-27 16:54:03 +00:00
Eugene Yurtsev
e8339b1d83 core[patch]: Patch XML vulnerability in XMLOutputParser (CVE-2024-1455) (#19653)
Patch potential XML vulnerability CVE-2024-1455

This patches a potential XML vulnerability in the XMLOutputParser in
langchain-core. The vulnerability in some situations could lead to a
denial of service attack.

At risk are users that:

1) Running older distributions of python that have older version of
libexpat
2) Are using XMLOutputParser with an agent
3) Accept inputs from untrusted sources with this agent (e.g., endpoint
on the web that allows an untrusted user to interact wiith the parser)
2024-03-27 12:41:52 -04:00
Guangdong Liu
7042934b5f community[patch]: Fix the bug that Chroma does not specify embedding_function (#19277)
- **Issue:** close #18291
- @baskaryan, @eyurtsev PTAL
2024-03-27 11:43:38 -04:00
billytrend-cohere
85f57ab4cd cohere[patch]: Fix cohere rerank (#19624)
Fix cohere rerank inspired by
https://github.com/langchain-ai/langchain/pull/19486
2024-03-27 08:41:53 -07:00
Eugene Yurtsev
8ab7bb3166 core[patch]: XMLOutputParser fix to handle changes to xml standard library (#19612)
Newest python micro releases broke streaming in the XMLOutputParser. This fixes the parsing code to work with trailing junk after the XML content.
2024-03-27 09:25:28 -04:00
yuwenzho
3a7d2cf443 community[minor]: Add ITREX optimized Embeddings (#18474)
Introduction
[Intel® Extension for
Transformers](https://github.com/intel/intel-extension-for-transformers)
is an innovative toolkit designed to accelerate GenAI/LLM everywhere
with the optimal performance of Transformer-based models on various
Intel platforms

Description

adding ITREX runtime embeddings using intel-extension-for-transformers.
added mdx documentation and example notebooks
added embedding import testing.

---------

Signed-off-by: yuwenzho <yuwen.zhou@intel.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 07:22:06 +00:00
Juan Jose Miguel Ovalle Villamil
1fe10a3e3d experimental[patch]: Enhance LLMGraphTransformer with async processing and improved readability (#19205)
- [x] **PR title**: "experimental: Enhance LLMGraphTransformer with
async processing and improved readability"


- [x] **PR message**: 
- **Description:** This pull request refactors the `process_response`
and `convert_to_graph_documents` methods in the LLMGraphTransformer
class to improve code readability and adds async versions of these
methods for concurrent processing.
    The main changes include:
- Simplifying list comprehensions and conditional logic in the
process_response method for better readability.
- Adding async versions aprocess_response and
aconvert_to_graph_documents to enable concurrent processing of
documents.
These enhancements aim to improve the overall efficiency and
maintainability of the `LLMGraphTransformer` class.
  - **Issue:** N/A
  - **Dependencies:** No additional dependencies required.
  - **Twitter handle:** @jjovalle99


- [x] **Add tests and docs**: N/A (This PR does not introduce a new
integration)


- [x] **Lint and test**: Ran make format, make lint, and make test from
the root of the modified package(s). All tests pass successfully.

Additional notes:

- The changes made in this PR are backwards compatible and do not
introduce any breaking changes.
- The PR touches only the `LLMGraphTransformer` class within the
experimental package.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 23:40:21 -07:00
Fabrizio Ruocco
f12cb0bea4 community[patch]: Microsoft Azure Document Intelligence updates (#16932)
- **Description:** Update Azure Document Intelligence implementation by
Microsoft team and RAG cookbook with Azure AI Search

---------

Co-authored-by: Lu Zhang (AI) <luzhan@microsoft.com>
Co-authored-by: Yateng Hong <yatengh@microsoft.com>
Co-authored-by: teethache <hongyateng2006@126.com>
Co-authored-by: Lu Zhang <44625949+luzhang06@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 23:36:59 -07:00
Guangdong Liu
cd79305eb9 openai[patch]: fix AzureChatOpenAI missing parameter problem (#19258)
- **Issue:** close #19255
- PTAL @baskaryan @eyurtsev
2024-03-26 22:31:36 -07:00
Leonid Ganeline
3a978a4bdc docs: output_parsers page fix (#19623)
Issue with this
[page](https://python.langchain.com/docs/modules/model_io/output_parsers/):
Table: "Input Type" columns: strings `str \| Message` (the escape char
"\" doesn't work inside backticked text).
2024-03-26 22:17:41 -07:00
Ethan Yang
28cd5522c2 docs: fix typo in openvino document (#19627) 2024-03-26 22:13:54 -07:00
xsai9101
1c27de6ce2 docs: Fix oracle doc loader format issue (#19628) 2024-03-26 22:13:36 -07:00
Timothy
ad77fa15ee community[patch]: Adding try-except block for GCSDirectoryLoader (#19591)
- **Description:** Implemented try-except block for
`GCSDirectoryLoader`. Reason: Users processing large number of
unstructured files in a folder may experience many different errors. A
try-exception block is added to capture these errors. A new argument
`use_try_except=True` is added to enable *silent failure* so that error
caused by processing one file does not break the whole function.
- **Issue:** N/A
- **Dependencies:** no new dependencies
- **Twitter handle:** timothywong731

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-27 00:12:24 +00:00
fzowl
aea2be5bf3 voyageai[patch]: VoyageAI rerank (#19521)
Adding VoyageAI reranking

---------

Co-authored-by: fodizoltan <zoltan@conway.expert>
Co-authored-by: Yujie Qian <thomasq0809@gmail.com>
2024-03-26 17:07:23 -07:00
Leonid Ganeline
4d85485e71 docs: PromptTemplate import from core (#19616)
Changed import of `PromptTemplate` from `langchain` to `langchain_core`
in all examples (notebooks)
2024-03-26 17:03:36 -07:00
Leonid Ganeline
3dc0f3c371 experimental[patch]: PromptTemplate import fix (#19617)
Changed import of `PromptTemplate` from `langchain` to `langchain_core`
in `langchain_experimental`
2024-03-26 17:03:13 -07:00
xsai9101
160a8eb178 community[minor]: add oracle autonomous database doc loader integration (#19536)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
- **Description:** Adding oracle autonomous database document loader
integration. This will allow users to connect to oracle autonomous
database through connection string or TNS configuration.
    https://www.oracle.com/autonomous-database/
    - **Issue:** None
    - **Dependencies:** oracledb python package 
    https://pypi.org/project/oracledb/
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.
  Unit test and doc are added.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 17:02:18 -07:00
Ethan Yang
5784dfed00 docs: update openvino documents (#19543)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 22:15:30 +00:00
Erick Friis
bf8ba00520 cli[patch]: release 0.0.22rc0, chat playground (#19614) 2024-03-26 15:08:56 -07:00
Leonid Ganeline
a3d24bc10b docs: release date fix (#19585)
Replaced the overdue release promise.
2024-03-26 14:51:09 -07:00
Raghav Rawat
b5640a0883 docs: Update apify.ipynb for Document class import (#19598)
- **Description:**
Update to correctly import Document class -
from langchain_core.documents import Document

- **Issue:**
Fixes the notebook and the hosted documentation
[here](https://python.langchain.com/docs/integrations/tools/apify)

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 21:46:29 +00:00
jhicks2306
087823aefa docs: Update docstring for MessagesPlaceholder (#19601)
Update to docstring for MessagesPlaceholder so that it shows helpful
information in code editors. E.g. VS Code as shown below.


<img width="587" alt="Screenshot 2024-03-26 at 17 18 58"
src="https://github.com/langchain-ai/langchain/assets/45722942/8f49d09f-ed8d-4f61-a9d4-3611dbe9c9c5">

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 14:34:00 -07:00
Christophe Bornet
7c2578bd55 langchain[patch]: Add async methods to EmbeddingRouterChain (#19603) 2024-03-26 14:33:36 -07:00
Christophe Bornet
b3d7b5a653 langchain[patch[: Add async methods to TimeWeightedVectorStoreRetriever (#19606) 2024-03-26 14:03:47 -07:00
Adam Law
aeb7b6b11d community[patch]: use semantic_configurations in AzureSearch (#19347)
- **Description:** Currently the semantic_configurations are not used
when creating an AzureSearch instance, instead creating a new one with
default values. This PR changes the behavior to use the passed
semantic_configurations if it is present, and the existing default
configuration if not.

---------

Co-authored-by: Adam Law <adamlaw@microsoft.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 13:57:39 -07:00
Christophe Bornet
a7274f006e langchain[patch]: Add async methods to VectorstoreIndexCreator (#19582) 2024-03-26 13:57:13 -07:00
Bagatur
241774012a core[patch]: Release 0.1.34 (#19609)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-26 13:50:48 -07:00
Nuno Campos
c78eb55859 load: Optionally disable reading secrets from env (#19596)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-26 20:32:56 +00:00
Eugene Yurtsev
d3c9974da2 core[patch]: Temporarily disable test for streaming xml parser (#19610)
Test is failing due to micro version bump in python interpreter which
changed something about how std xml parser works
2024-03-26 20:24:20 +00:00
Eugene Yurtsev
8bc5cdccee core[patch]: Reverting changes with defusedXML (#19604)
DefusedXML is causing parsing errors on previously functional code with
the 0.7.x versions. These do not seem to support newer version of python
well. 0.8.x has only been released as rc, so we're not going to to use
it in the core package
2024-03-26 15:13:09 -04:00
Giannis
9ea2a9b0c1 cohere[patch]: Add additional kwargs support for Cohere SDK params (#19533)
* Adds support for `additional_kwargs` in `get_cohere_chat_request`
* This functionality passes in Cohere SDK specific parameters from
`BaseMessage` based classes to the API

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-26 18:30:37 +00:00
Adrian Valente
2763d8cbe5 community: add len() implementation to Chroma (#19419)
Thank you for contributing to LangChain!

- [x] **Add len() implementation to Chroma**: "package: community"


- [x] **PR message**: 
- **Description:** add an implementation of the __len__() method for the
Chroma vectostore, for convenience.
- **Issue:** no exposed method to know the size of a Chroma vectorstore
    - **Dependencies:** None
    - **Twitter handle:** lowrank_adrian


- [x] **Add tests and docs**

- [x] **Lint and test**

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 12:53:10 -04:00
Tom Aarsen
e0a1278d2b docs: HFEmbeddings: Add more information to model_kwargs/encode_kwargs (#19594)
- **Description:** Be more explicit with the `model_kwargs` and
`encode_kwargs` for `HuggingFaceEmbeddings`.
    - **Issue:** -
    - **Dependencies:** -

I received some reports by my users that they didn't realise that you
could change the default `batch_size` with `HuggingFaceEmbeddings`,
which may be attributed to how the `model_kwargs` and `encode_kwargs`
don't give much information about what you can specify.

I've added some parameter names & links to the Sentence Transformers
documentation to help clear it up. Let me know if you'd rather have
Markdown/Sphinx-style hyperlinks rather than a "bare URL".

- Tom Aarsen
2024-03-26 12:46:04 -04:00
Dobiichi-Origami
18e6f9376d community[Qianfan]: add function_call in additional_kwargs (#19550)
- **Description:** add lacked `function_call` field in
`additional_kwargs` in previous version
- **Dependencies:** None of new dependency
2024-03-26 12:20:19 -04:00
Eugene Yurtsev
9c7e860cf6 core[patch]: Remove anyio dependency (#19583)
The dependency isn't used anymore
2024-03-26 11:59:22 -04:00
mwmajewsk
f7a1fd91b8 community: better support of pathlib paths in document loaders (#18396)
So this arose from the
https://github.com/langchain-ai/langchain/pull/18397 problem of document
loaders not supporting `pathlib.Path`.

This pull request provides more uniform support for Path as an argument.
The core ideas for this upgrade: 
- if there is a local file path used as an argument, it should be
supported as `pathlib.Path`
- if there are some external calls that may or may not support Pathlib,
the argument is immidiately converted to `str`
- if there `self.file_path` is used in a way that it allows for it to
stay pathlib without conversion, is is only converted for the metadata.

Twitter handle: https://twitter.com/mwmajewsk
2024-03-26 11:51:52 -04:00
Guangdong Liu
94b869a974 github action: Add dead link check for .mdx files (#19492)
- **Description:** Add dead link check for .mdx files. I checked the
logs and found that files with .mdx suffix were not checked.

https://github.com/langchain-ai/langchain/actions/runs/8409525467/job/23026924465#logs
- @baskaryan, @efriis, @eyurtsev, @hwchase17.
2024-03-26 08:42:34 -07:00
Christophe Bornet
6f477e3cb6 docs: Remove chromadb from required dependency in examples with VectorstoreIndexCreator (#19578) 2024-03-26 11:12:21 -04:00
Yuki Watanabe
cfecbda48b community[minor]: Allow passing allow_dangerous_deserialization when loading LLM chain (#18894)
### Issue
Recently, the new `allow_dangerous_deserialization` flag was introduced
for preventing unsafe model deserialization that relies on pickle
without user's notice (#18696). Since then some LLMs like Databricks
requires passing in this flag with true to instantiate the model.

However, this breaks existing functionality to loading such LLMs within
a chain using `load_chain` method, because the underlying loader
function
[load_llm_from_config](f96dd57501/libs/langchain/langchain/chains/loading.py (L40))
 (and load_llm) ignores keyword arguments passed in. 

### Solution
This PR fixes this issue by propagating the
`allow_dangerous_deserialization` argument to the class loader iff the
LLM class has that field.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 11:07:55 -04:00
hulitaitai
d7c14cb6f9 community[minor]: Add embeddings integration for text2vec (#19267)
Create a Class which allows to use the "text2vec" open source embedding
model.

It should install the model by running 'pip install -U text2vec'.
Example to call the model through LangChain:

from langchain_community.embeddings.text2vec import Text2vecEmbeddings

            embedding = Text2vecEmbeddings()
            bookend.embed_documents([
                "This is a CoSENT(Cosine Sentence) model.",
"It maps sentences to a 768 dimensional dense vector space.",
            ])
            bookend.embed_query(
                "It can be used for text matching or semantic search."
            )

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-26 11:06:58 -04:00
Shotaro Sano
55c624a694 infra: Resolve the endless dependency resolution during the build of dev.Dockerfile by copying poetry.lock (#19465)
## Description
This PR proposes a modification to the `libs/langchain/dev.Dockerfile`
configuration to copy the `libs/langchain/poetry.lock` into the working
directory. The change aims to address the issue where the Poetry install
command, the last command in the `dev.Dockerfile`, takes excessively
long hours, and to ensure the reproducibility of the poetry environment
in the devcontainer.

## Problem
The `dev.Dockerfile`, prepared for development environments such as
`.devcontainer`, encounters an unending dependency resolution when
attempting the Poetry installation.

### Steps to Reproduce
Execute the following build command: 

```bash
docker build -f libs/langchain/dev.Dockerfile .
```

### Current Behavior
The Docker build process gets stuck at the following step, which, in my
experience, did not conclude even after an entire night:

```
 => [langchain-dev-dependencies 4/6] COPY libs/community/ ../community/                                                                                0.9s
 => [langchain-dev-dependencies 5/6] COPY libs/text-splitters/ ../text-splitters/                                                                      0.0s
 => [langchain-dev-dependencies 6/6] RUN poetry install --no-interaction --no-ansi --with dev,test,docs                                               12.3s
 => => # Updating dependencies                                                                                                                             
 => => # Resolving dependencies...  
```

### Expected Behavior
The Docker build completes in a realistic timeframe. By applying this
PR, the build finishes within a few minutes.

### Analysis
The complexity of LangChain's dependencies has reached a point where
Poetry is required to resolve dependencies akin to threading a needle.
Consequently, poetry install fails to complete in a practical timeframe.

## Solution
The solution for dependency resolution is already recorded in
`libs/langchain/poetry.lock`, so we can use it. When copying
`project.toml` and `poetry.toml`, the `poetry.lock` located in the same
directory should also be copied.

```diff
# Copy only the dependency files for installation
-COPY libs/langchain/pyproject.toml libs/langchain/poetry.toml ./
+COPY libs/langchain/pyproject.toml libs/langchain/poetry.toml libs/langchain/poetry.lock ./
```

## Note
I am not intimately familiar with the historical context of the
`dev.Dockerfile` and thus do not know why `poetry.lock` has not been
copied until now. It might have been an oversight, or perhaps dependency
resolution used to complete quickly even without the `poetry.lock` file
in the past. However, if there are deliberate reasons why copying
`poetry.lock` is not advisable, please just close this PR.
2024-03-26 10:54:53 -04:00
Kalyan Mudumby
d27600c6f7 community[patch]: GPTCache pydantic validation error on lookup (#19427)
Description:
this change fixes the pydantic validation error when looking up from
GPTCache, the `ChatOpenAI` class returns `ChatGeneration` as response
which is not handled.
use the existing `_loads_generations` and `_dumps_generations` functions
to handle it

Trace
```
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/development/scripts/chatbot-postgres-test.py", line 90, in <module>
    print(llm.invoke("tell me a joke"))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 166, in invoke
    self.generate_prompt(
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 544, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 408, in generate
    raise e
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 398, in generate
    self._generate_with_cache(
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 585, in _generate_with_cache
    cache_val = llm_cache.lookup(prompt, llm_string)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_community/cache.py", line 807, in lookup
    return [
           ^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_community/cache.py", line 808, in <listcomp>
    Generation(**generation_dict) for generation_dict in json.loads(res)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/load/serializable.py", line 120, in __init__
    super().__init__(**kwargs)
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/pydantic/v1/main.py", line 341, in __init__
    raise validation_error
pydantic.v1.error_wrappers.ValidationError: 1 validation error for Generation
type
  unexpected value; permitted: 'Generation' (type=value_error.const; given=ChatGeneration; permitted=('Generation',))
```


Although I don't seem to find any issues here, here's an
[issue](https://github.com/zilliztech/GPTCache/issues/585) raised in
GPTCache. Please let me know if I need to do anything else

Thank you

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 10:52:30 -04:00
Leonid Ganeline
4159a4723c experimental[patch]: update module doc strings (#19539)
Added missed module descriptions. Fixed format.
2024-03-26 10:38:10 -04:00
Piyush Jain
72ba738bf5 community[minor]: Improvements for NeptuneRdfGraph, Improve discovery of graph schema using database statistics (#19546)
Fixes linting for PR
[19244](https://github.com/langchain-ai/langchain/pull/19244)

---------

Co-authored-by: mhavey <mchavey@gmail.com>
2024-03-26 10:36:51 -04:00
aditya thomas
fc6b92bb9a docs: add cohere to the list of partners (#19552)
**Description:** Add Cohere to the list of LangChain partners
**Issue:** The Cohere partner package was recently added
[#19049](https://github.com/langchain-ai/langchain/pull/19049)
**Dependencies:** None
2024-03-26 10:22:03 -04:00
Christophe Bornet
1f422318b7 core[minor]: Use BaseChatMessageHistory async methods in RunnableWithMessageHistory (#19565)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-26 14:13:58 +00:00
Christophe Bornet
8595c3ab59 community[minor]: Add InMemoryVectorStore to module level imports (#19576) 2024-03-26 14:07:44 +00:00
Christophe Bornet
a9457d269e core: Add async methods to BaseExampleSelector and SemanticSimilarityExampleSelector (#19399)
Few-Shot prompt template may use a `SemanticSimilarityExampleSelector`
that in turn uses a `VectorStore` that does I/O operations.
So to work correctly on the event loop, we need:
* async methods for the `VectorStore` (OK)
* async methods for the `SemanticSimilarityExampleSelector` (this PR)
* async methods for `BasePromptTemplate` and `BaseChatPromptTemplate`
(future work)
2024-03-26 10:06:43 -04:00
Christophe Bornet
29c58528c7 core[minor]: Add default implementations to amax_marginal_relevance_search_by_vector and adelete (#19269) 2024-03-26 10:03:22 -04:00
Christophe Bornet
999365186b langchain[major]: Use InMemoryVectorStore by default in VectorstoreIndexCreator (#19575)
This is a small breaking change but I think it should be done as:
* No external dependency needs to be installed anymore for the default
to work
* It is vendor-neutral
2024-03-26 10:01:23 -04:00
standby24x7
16e64d889a docs: Update function "run" to "invoke" in fake_llm.ipynb (#19570)
This patch updates function "run" to "invoke" in fake_llm.ipynb. Without
this patch, you see following warning.

LangChainDeprecationWarning: The function `run` was deprecated in
LangChain 0.1.0 and will be removed in 0.2.0. Use invoke instead.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-26 09:54:31 -04:00
Guangdong Liu
c93d4ea91c docs: Add in code documentation to core Runnable map methods (docs only) (#19517)
- **Issue:** #18804
- @baskaryan, @eyurtsev
2024-03-25 19:18:30 -07:00
Leonid Ganeline
0199b73188 docs: added partners/package-name folders (#19290)
Added references to new integration packages from Google, by adding
subfolders to `partners/`.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 02:16:59 +00:00
Aayush Kataria
03c38005cb community[patch]: Fixing some caching issues for AzureCosmosDBSemanticCache (#18884)
Fixing some issues for AzureCosmosDBSemanticCache
- Added the entry for "AzureCosmosDBSemanticCache" which was missing in
langchain/cache.py
- Added application name when creating the MongoClient for the
AzureCosmosDBVectorSearch, for tracking purposes.

@baskaryan, can you please review this PR, we need this to go in asap.
These are just small fixes which we found today in our testing.
2024-03-25 19:06:17 -07:00
Clément Tamines
a6cbb755a7 community[patch]: fix semantic answer bug in AzureSearch vector store (#18938)
- **Description:** The `semantic_hybrid_search_with_score_and_rerank`
method of `AzureSearch` contains a hardcoded field name "metadata" for
the document metadata in the Azure AI Search Index. Adding such a field
is optional when creating an Azure AI Search Index, as other snippets
from `AzureSearch` test for the existence of this field before trying to
access it. Furthermore, the metadata field name shouldn't be hardcoded
as "metadata" and use the `FIELDS_METADATA` variable that defines this
field name instead. In the current implementation, any index without a
metadata field named "metadata" will yield an error if a semantic answer
is returned by the search in
`semantic_hybrid_search_with_score_and_rerank`.

- **Issue:** https://github.com/langchain-ai/langchain/issues/18731

- **Prior fix to this bug:** This bug was fixed in this PR
https://github.com/langchain-ai/langchain/pull/15642 by adding a check
for the existence of the metadata field named `FIELDS_METADATA` and
retrieving a value for the key called "key" in that metadata if it
exists. If the field named `FIELDS_METADATA` was not present, an empty
string was returned. This fix was removed in this PR
https://github.com/langchain-ai/langchain/pull/15659 (see
ed1ffca911#).
@lz-chen: could you confirm this wasn't intentional? 

- **New fix to this bug:** I believe there was an oversight in the logic
of the fix from
[#1564](https://github.com/langchain-ai/langchain/pull/15642) which I
explain below.
The `semantic_hybrid_search_with_score_and_rerank` method creates a
dictionary `semantic_answers_dict` with semantic answers returned by the
search as follows.

5c2f7e6b2b/libs/community/langchain_community/vectorstores/azuresearch.py (L574-L581)
The keys in this dictionary are the unique document ids in the index, if
I understand the [documentation of semantic
answers](https://learn.microsoft.com/en-us/azure/search/semantic-answers)
in Azure AI Search correctly. When the method transforms a search result
into a `Document` object, an "answer" key is added to the document's
metadata. The value for this "answer" key should be the semantic answer
returned by the search from this document, if such an answer is
returned. The match between a `Document` object and the semantic answers
returned by the search should be done through the unique document id,
which is used as a key for the `semantic_answers_dict` dictionary. This
id is defined in the search result's field named `FIELDS_ID`. I added a
check to avoid any error in case no field named `FIELDS_ID` exists in a
search result (which shouldn't happen in theory).
A benefit of this approach is that this fix should work whether or not
the Azure AI Search Index contains a metadata field.

@levalencia could you confirm my analysis and test the fix?
@raunakshrivastava7 do you agree with the fix?

Thanks for the help!
2024-03-25 18:51:54 -07:00
miri-bar
55db737302 ai21[minor]: AI21 Labs Semantic Text Splitter support (#19510)
Description: Added support for AI21 Labs model - Segmentation, as a Text
Splitter
Dependencies: ai21, langchain-text-splitter
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 01:39:37 +00:00
Anindyadeep
b2a11ce686 community[minor]: Prem AI langchain integration (#19113)
### Prem SDK integration in LangChain

This PR adds the integration with [PremAI's](https://www.premai.io/)
prem-sdk with langchain. User can now access to deployed models
(llms/embeddings) and use it with langchain's ecosystem. This PR adds
the following:

### This PR adds the following:

- [x]  Add chat support
- [X]  Adding embedding support
- [X]  writing integration tests
    - [X]  writing tests for chat 
    - [X]  writing tests for embedding
- [X]  writing unit tests
    - [X]  writing tests for chat 
    - [X]  writing tests for embedding
- [X]  Adding documentation
    - [X]  writing documentation for chat
    - [X]  writing documentation for embedding
- [X] run `make test`
- [X] run `make lint`, `make lint_diff` 
- [X]  Final checks (spell check, lint, format and overall testing)

---------

Co-authored-by: Anindyadeep Sannigrahi <anindyadeepsannigrahi@Anindyadeeps-MacBook-Pro.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 01:37:19 +00:00
Alessandro D'Armiento
37eb3a4a9e docs: Some import nits (#19130)
- **Description:** fixes some minor issues in the documentation

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-26 01:25:44 +00:00
Souhail Hanfi
cbec43afa9 community[patch]: avoid creating extension PGvector while using readOnly Databases (#19268)
- **Description:** PgVector class always runs "create extension" on init
and this statement crashes on ReadOnly databases (read only replicas).
but wierdly the next create collection etc work even in readOnly
databases
- **Dependencies:** no new dependencies
- **Twitter handle:** @VenOmaX666

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 01:25:01 +00:00
Dixing (Dex) Xu
903541f439 docs: update dependecy for autogpt/marathon.ipynb (#19491)
fixes the import error from notebook based on the
[documentation](https://api.python.langchain.com/en/latest/agents/langchain_experimental.agents.agent_toolkits.pandas.base.create_pandas_dataframe_agent.html)

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 18:14:22 -07:00
Mauricio Cruz
fb9ce95184 cli[patch]: Fix Tuple typing problem when create new langchain app (#19141)
Thank you for contributing to LangChain!

When run command langchain app new my-app, i get this error:

File
"/home/mauricio/.local/lib/python3.8/site-packages/langchain_cli/utils/pyproject.py",
line 15, in <module>
pyproject_toml: Path, local_editable_dependencies: Iterable[tuple[str,
Path]]
TypeError: 'type' object is not subscriptable

This PR fix the error.
2024-03-26 01:09:51 +00:00
Anthony Shaw
6c9b0f96f3 docs: Add guidance for splitting Chinese, Japanese, and Thai (#19295)
The existing default list of separators for the `RecursiveTextSplitter`
assumes spaces are word boundaries. Some languages [don't use spaces
between
words](https://en.wikipedia.org/wiki/Category:Writing_systems_without_word_boundaries)
(Chinese, Japanese, Thai, Burmese).

This PR extends the documentation to explain how to cater for those
languages by adding additional punctuation to the separators and
zero-width spaces which are used by some typesetters and will assist the
splitter to not split in words.

Ideally, **these separators could be a constant in the module** but for
now, defining them in the documentation is a start.
2024-03-26 00:34:00 +00:00
Erick Friis
441a8012b3 mistralai[patch]: release 0.1.0 (#19540) 2024-03-25 17:29:40 -07:00
Barun Amalkumar Halder
9246ec6b36 community[patch] : [Fiddler] ensure dataset is not added if model is present (#19293)
**Description:**
- minor PR to speed up onboarding by not trying to add a dataset, if a
model is already present.
- replace batch publish API with streaming when single events are
published.

**Dependencies:** any dependencies required for this change
**Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-03-25 17:28:05 -07:00
JSDu
6e090280fd community[patch]: milvus will autoflush, manual flush is slowly (#19300)
reference:


https://milvus.io/docs/configure_quota_limits.md#quotaAndLimitsflushRateenabled

https://github.com/milvus-io/milvus/issues/31407

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 00:26:58 +00:00
mackong
e65dc4b95b community[patch]: clean warning when delete by ids (#19301)
* Description: rearrange to avoid variable overwrite, which cause
warning always.
* Issue: N/A
* Dependencies: N/A
2024-03-25 17:23:22 -07:00
Ian
d5415dbd68 docs: improve tidb integrations documents (#19321)
This PR aims to enhance the documentation for TiDB integration, driven
by feedback from our users. It provides detailed introductions to key
features, ensuring developers can fully leverage TiDB for AI application
development.
2024-03-25 17:08:23 -07:00
Stefano Mosconi
01fc69c191 community[patch]: expanding version in confluence loader (#19324)
**Description:**
Expanding version in all the Confluence API calls so to get when the
page was last modified/created in all cases.

**Issue:** #12812 
**Twitter handle:** zzste
2024-03-25 17:08:01 -07:00
Dmitry Tyumentsev
08b769d539 community[patch]: YandexGPT Use recent yandexcloud sdk version (#19341)
Fixed inability to work with [yandexcloud
SDK](https://pypi.org/project/yandexcloud/) version higher 0.265.0
2024-03-25 17:05:57 -07:00
Marlene
f1313339ac community[patch]: Fixing incorrect base URLs for Azure Cognitive Search Retriever (#19352)
This PR adds code to make sure that the correct base URL is being
created for the Azure Cognitive Search retriever. At the moment an
incorrect base URL is being generated. I think this is happening because
the original code was based on a depreciated API version. No
dependencies need to be added. I've also added more context to the test
doc strings.

I should also note that ACS is now Azure AI Search. I will open a
separate PR to make these changes as that would be a breaking change and
should potentially be discussed.

Twitter: @marlene_zw



- No new tests added, however the current ACS retriever tests are now
passing when I run them.
- Code was linted.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-26 00:04:59 +00:00
Tridib Roy Arjo
d667b1ea8f docs: Update async_chromium.ipynb (#19514)
In Jupyter, asyncio would throw an error before `.load()` unless
`nest_asyncio` is applied (Issue #8494 mentioned this)

+Minor typo fixes..
2024-03-26 00:02:50 +00:00
Bob Lin
5b6b1f9e1d docs: Fix several sample code errors (#19382) 2024-03-25 16:59:52 -07:00
FinTech秋田
03ba1d4731 community[patch]: Add Support for GPU Index Types in Milvus 2.4 (#19468)
- **Description:** This commit introduces support for the newly
available GPU index types introduced in Milvus 2.4 within the LangChain
project's `milvus.py`. With the release of Milvus 2.4, a range of
GPU-accelerated index types have been added, offering enhanced search
capabilities and performance optimizations for vector search operations.
This update ensures LangChain users can fully utilize the new
performance benefits for vector search operations.
    - Reference: https://milvus.io/docs/gpu_index.md

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 23:39:54 +00:00
Hamid Ali
c281ec8887 docs: Fix broken link in semantic-chunker.ipynb (#19464)
Corrected a broken link within the semantic-chunker.ipynb notebook,
ensuring that users can access the referenced resource.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 23:39:32 +00:00
Ash Vardanian
d01bad5169 core[patch]: Convert SimSIMD back to NumPy (#19473)
This patch fixes the #18022 issue, converting the SimSIMD internal
zero-copy outputs to NumPy.

I've also noticed, that oftentimes `dtype=np.float32` conversion is used
before passing to SimSIMD. Which numeric types do LangChain users
generally care about? We support `float64`, `float32`, `float16`, and
`int8` for cosine distances and `float16` seems reasonable for
practically any kind of embeddings and any modern piece of hardware, so
we can change that part as well 🤗
2024-03-25 16:36:26 -07:00
Ikko Eltociear Ashimine
980658cb47 docs: Update streaming.ipynb (#19500)
Fixed typo.

occuring -> occurring
2024-03-25 16:21:45 -07:00
Leonid Kuligin
91f4c80143 docs: fixed links (#19503)
- [ ] **PR title**: "docs: fixed broken links"


- [ ] **PR message**:
    - **Description:** fixed links in the documentation
2024-03-25 16:19:28 -07:00
Mikelarg
dac2e0165a community[minor]: Added GigaChat Embeddings support + updated previous GigaChat integration (#19516)
- **Description:** Added integration with
[GigaChat](https://developers.sber.ru/portal/products/gigachat)
embeddings. Also added support for extra fields in GigaChat LLM and
fixed docs.
2024-03-25 16:08:37 -07:00
Martin Kolb
e5bdb26f76 community[patch]: More flexible handling for entity names in vector store "HANA Cloud" (#19523)
- **Description:** Added support for lower-case and mixed-case names
The names for tables and columns previouly had to be UPPER_CASE.
With this enhancement, also lower_case and MixedCase are supported,


  - **Issue:** N/A
  - **Dependencies:** no new dependecies added
  - **Twitter handle:** @sapopensource
2024-03-25 15:52:45 -07:00
Erica Clark
a1ff21f90f docs: Update local llms article to use invoke instead of deprecated __call__ (#19528)
- **Description:** Since the implicit `__call__` has been deprecated in
favor of `invoke`, the local_llms article also needed to be updated.
This article was my introduction to Lanchain, and as it was helpful in
getting me setup with running LLMs locally, it is nice to not have any
warnings when running the example code. With this change, the warnings
go away when running the example code.
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** clarkerican
2024-03-25 15:51:39 -07:00
Orest Xherija
0b1e09029f openai[patch]: increase max batch size for Azure OpenAI Embeddings API (#19532)
**Description:** Azure OpenAI has increased its maximum batch size from
16 to 2048 for the Embeddings API per this How-To
[page](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/embeddings?tabs=console#best-practices)

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 15:50:07 -07:00
Eugene Yurtsev
56f4c5459b core[patch]: fix xml output parser transform (#19530)
Previous PR passed _parser attribute which apparently is not meant to be
used by user code and causes non deterministic failures on CI when
testing the transform and a transform methods. Reverting this change
temporarily.
2024-03-25 21:34:45 +00:00
Erick Friis
e6952b04d5 cohere[patch]: fix release (#19529) 2024-03-25 13:46:29 -07:00
aditya thomas
aa68fd7e91 core[runnables]: docstring for class runnable, method with_listeners() (#19515)
**Description:** Docstring for method with_listerners() of class
Runnable
**Issue:** [Add in code documentation to core Runnable methods
#18804](https://github.com/langchain-ai/langchain/issues/18804)
**Dependencies:** None
2024-03-25 16:24:58 -04:00
billytrend-cohere
63343b4987 cohere[patch]: add cohere as a partner package (#19049)
Description: adds support for langchain_cohere

---------

Co-authored-by: Harry M <127103098+harry-cohere@users.noreply.github.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-25 20:23:47 +00:00
Eugene Yurtsev
727d5023ce core[patch]: Use defusedxml in XMLOutputParser (#19526)
This mitigates a security concern for users still using older versions of libexpat that causes an attacker to compromise the availability of the system if an attacker manages to surface malicious payload to this XMLParser.
2024-03-25 16:21:52 -04:00
Zachary Wilkins
e1a6341940 langchain: Passthrough batch_size on index()/aindex() calls (#19443)
**Description:** This change passes through `batch_size` to
`add_documents()`/`aadd_documents()` on calls to `index()` and
`aindex()` such that the documents are processed in the expected batch
size.
**Issue:** #19415
**Dependencies:** N/A
**Twitter handle:** N/A
2024-03-25 11:58:29 -04:00
ccurme
82de8fd6c9 add kwargs (#19519)
`HanaDB.add_texts` is missing **kwargs.
2024-03-25 11:56:01 -04:00
Nikhil Kumar
3d3b46a782 docs: Update docs for HuggingFacePipeline (#19306)
Updated `HuggingFacePipeline` docs to be in sync with list of supported
tasks, including translation.

- [x] **PR title**: "community: Update docs for `HuggingFacePipeline`"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**:
- **Description:** Update docs for `HuggingFacePipeline`, was earlier
missing `translation` as a valid task
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** None


- [x] **Add tests and docs**:


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-03-25 00:29:21 -07:00
Igor Muniz Soares
743f888580 community[minor]: Dappier chat model integration (#19370)
**Description:** 

This PR adds [Dappier](https://dappier.com/) for the chat model. It
supports generate, async generate, and batch functionalities. We added
unit and integration tests as well as a notebook with more details about
our chat model.


**Dependencies:** 
    No extra dependencies are needed.
2024-03-25 07:29:05 +00:00
Jacob Lezberg
64e1df3d3a infra: Update package version to apply CVE-related patch (#19490)
- **Description:** [CVE
2024-21503](https://www.cve.org/CVERecord?id=CVE-2024-21503) was
recently identified. The python linter "black" suffers from a potential
Regex-related denial of service attack. Updated version from the
vulnerable 24.2.0 to the patched 24.3.0.
- **Issue:** N/A
- **Dependencies:** The 'black' package in both `langchain` (top-level)
and `templates/python-lint`.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 07:11:23 +00:00
Hugoberry
96dc180883 community[minor]: Add DuckDB as a vectorstore (#18916)
DuckDB has a cosine similarity function along list and array data types,
which can be used as a vector store.
- **Description:** The latest version of DuckDB features a cosine
similarity function, which can be used with its support for list or
array column types. This PR surfaces this functionality to langchain.
    - **Dependencies:** duckdb 0.10.0
    - **Twitter handle:** @igocrite

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 07:02:35 +00:00
Ethan Yang
fa6397d76a docs: Add OpenVINO llms docs (#19489)
Add OpenVINOpipeline instructions in docs. OpenVINO users can find more
details in this page.
2024-03-24 23:57:30 -07:00
preak95
6ea3e57a63 community[minor]: S3FileLoader to use expose mode and post_processors arguments of unstructured loader (#19270)
**Description:** Update s3_file.py to use arguments **mode** and
**post_processors** from the base class **UnstructuredBaseLoader** to
include more metadata about the files from the S3 bucket such as
*'page_number', 'languages'* etc.

**Issue:** NA
**Dependencies:** None
**Twitter handle:** preak95

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-25 06:56:55 +00:00
Guangdong Liu
560e2182d8 docs: docstring Runnable pipe and pick methods (docs only) (#19395)
- **Issue:**  #18804
-  @eyurtsev @ccurme PTAL

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-24 23:50:04 -07:00
Christophe Bornet
63898dbda0 langchain[patch]: Use async memory in Chain when needed (#19429) 2024-03-24 23:49:00 -07:00
Lance Martin
db7403d667 docs: Remove non-rendering images & output spamming from doc ntbks (#19475)
Looking at tokens / page of our docs, we see a few outliers:
<img width="761" alt="image"
src="https://github.com/langchain-ai/langchain/assets/122662504/677aa2d6-0a29-45e4-882a-db2bbf46d02b">

It is due to non-rendering images in one case, and output spamming. 

Clean these, along with other cases of excessing output spamming in
docs.

All get sucked into chat-langchain for retrieval.
2024-03-24 23:47:38 -07:00
Erick Friis
b617085af0 mistralai[patch]: streaming tool calls (#19469) 2024-03-23 19:24:53 +00:00
aditya thomas
b43a9d5808 docs: adding voyageai to the list of partner packages (#19376)
**Description:** Adding VoyageAI to the list of partners
**Issue:** A standalone langchain-voyageai package has been added
**Dependencies:** None
2024-03-22 17:08:15 -07:00
Zeeland
2549df00cd docs: fix error bilibili url (#19375)
Thank you for contributing to LangChain!

bilibili-api-python use https://github.com/Nemo2011/bilibili-api repo.
Change to the correct address.

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-22 17:06:17 -07:00
aditya thomas
375ab7bf59 docs: update module imports for fireworks documentation (#19377)
**Description:** Update module imports for Fireworks documentation
**Issue:** Module imports not present or in incorrect location
**Dependencies:** None
2024-03-22 17:05:27 -07:00
aditya thomas
0cc0467267 docs: update import paths and move to lcel for llama.cpp examples (#19391)
**Description:** Update import paths and move to lcel for llama.cpp
examples
**Issue:** Update import paths to reflect package refactoring and move
chains to LCEL in examples
**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-23 00:04:12 +00:00
fengjial
3b52ee05d1 community[patch]: fix bugs in baiduvectordb as vectorstore (#19380)
fix small bugs in vectorstore/baiduvectordb
2024-03-22 17:03:59 -07:00
Cailin Wang
5402aef32e docs: Add partition parameter to DashVector (#19385)
**Description**: Add `partition` parameter to DashVector
dashvector.ipynb
**Related PR**: https://github.com/langchain-ai/langchain/pull/19023
**Twitter handle**: @CailinWang_

---------

Co-authored-by: root <root@Bluedot-AI>
2024-03-22 17:00:29 -07:00
aditya thomas
515aab3312 community[patch]: invoke callback prior to yielding token (openai) (#19389)
**Description:** Invoke callback prior to yielding token for BaseOpenAI
& OpenAIChat
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)
**Dependencies:** None
2024-03-22 16:45:55 -07:00
aditya thomas
49e932cd24 community[patch]: invoke callback prior to yielding token (fireworks) (#19388)
**Description:** Invoke callback prior to yielding token for Fireworks
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)
**Dependencies:** None
2024-03-22 16:44:06 -07:00
aditya thomas
16ef88a87d docs: moving FireworksEmbeddings documentation to docs folder (#19398)
**Description:** Moving FireworksEmbeddings documentation to the
location docs/integration/text_embedding/ from langchain_fireworks/docs/
**Issue:** FireworksEmbeddings documentation was not in the correct
location
**Dependencies:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-22 23:24:22 +00:00
Leonid Ganeline
06190063e7 infra: makefile api_docs_clean fix (#19405)
Fixed a Makefile command that cleans up the api_docs
2024-03-22 15:45:55 -07:00
Christophe Bornet
1b813fe6fe langchain[patch]: Add async methods to VectorStoreRetrieverMemory (#19408) 2024-03-22 15:44:24 -07:00
Tarun Jain
ef6d3d66d6 community[patch]: docarray requires hnsw installation (#19416)
I have a small dataset, and I tried to use docarray:
``DocArrayHnswSearch ``. But when I execute, it returns:

```bash
    raise ImportError(
ImportError: Could not import docarray python package. Please install it with `pip install "langchain[docarray]"`.
```

Instead of docarray it needs to be 

```bash
docarray[hnswlib]
```

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-22 22:39:07 +00:00
German Swan
d4dc98a9f9 community[patch]: RecursiveUrlLoader: add base_url option (#19421)
RecursiveUrlLoader does not currently provide an option to set
`base_url` other than the `url`, though it uses a function with such an
option.
For example, this causes it unable to parse the
`https://python.langchain.com/docs`, as it returns the 404 page, and
`https://python.langchain.com/docs/get_started/introduction` has no
child routes to parse.
`base_url` allows setting the `https://python.langchain.com/docs` to
filter by, while the starting URL is anything inside, that contains
relevant links to continue crawling.
I understand that for this case, the docusaurus loader could be used,
but it's a common issue with many websites.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-22 15:34:31 -07:00
Erick Friis
e71daa7a03 openai[patch]: add test coverage to output (#19462) 2024-03-22 15:33:10 -07:00
igeni
4babefcb2f cli[patch]: Modified regular expression (#19449)
- **Description:** Modified regular expression to add support for
unicode chars and simplify pattern

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-22 15:24:08 -07:00
Ray Bell
7d36ee38b7 docs: point to titantic dataset on web (#19455)
Updated `pd.read_csv("titantic.csv")` to
`pd.read_csv("https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv")`
i.e. it will read it
https://raw.githubusercontent.com/pandas-dev/pandas/main/doc/data/titanic.csv
and allow anyone to run the code.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-22 22:22:41 +00:00
Ray Bell
f959fad56e docs: use invoke instead of run (#19457)
Updated the deprecated run with invoke

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-22 15:08:26 -07:00
Bagatur
d93d49bc43 openai[patch]: tool use integration test (#19460) 2024-03-22 14:49:54 -07:00
Erick Friis
a99e644913 openai[patch]: integration test structured output (#19459) 2024-03-22 21:43:24 +00:00
Erick Friis
ac57123f40 openai[patch]: release 0.1.1 (#19458) 2024-03-22 21:36:21 +00:00
Luca Dorigo
47cfbe7522 openai[patch]: [URGENT REGRESSION FIX] Don't fail if tool message already doesn't contain name (#19435)
- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-22 14:33:50 -07:00
aditya thomas
bc028294d0 docs: delete mistralai embeddings doc from incorrect location (#19432)
**Description:** Delete MistralAIEmbeddings usage document from folder
partners/mistralai/docs
**Issue:** The document is present in the folder docs/docs
**Dependencies:** None
2024-03-22 14:02:59 -07:00
Erick Friis
11e37943ed mistralai[patch]: fix core version (#19454) 2024-03-22 20:48:13 +00:00
Erick Friis
3b093160c4 mistralai[patch]: release 0.1.0rc1 (#19453) 2024-03-22 20:34:36 +00:00
aditya thomas
4856a87261 community[patch]: invoke callback prior to yielding token (llama.cpp) (#19392)
**Description:** Invoke callback prior to yielding token for llama.cpp
**Issue:** [Callback for on_llm_new_token should be invoked before the
token is yielded by the model
#16913](https://github.com/langchain-ai/langchain/issues/16913)
**Dependencies:** None
2024-03-22 16:17:56 -04:00
ccurme
c4599444ee mistralai: update tool calling (#19451)
```python
from langchain.agents import tool
from langchain_mistralai import ChatMistralAI


llm = ChatMistralAI(model="mistral-large-latest", temperature=0)

@tool
def get_word_length(word: str) -> int:
    """Returns the length of a word."""
    return len(word)


tools = [get_word_length]
llm_with_tools = llm.bind_tools(tools)

llm_with_tools.invoke("how long is the word chrysanthemum")
```
currently raises
```
AttributeError: 'dict' object has no attribute 'model_dump'
```

Same with `.with_structured_output`
```python
from langchain_mistralai import ChatMistralAI
from langchain_core.pydantic_v1 import BaseModel

class AnswerWithJustification(BaseModel):
    """An answer to the user question along with justification for the answer."""
    answer: str
    justification: str

llm = ChatMistralAI(model="mistral-large-latest", temperature=0)
structured_llm = llm.with_structured_output(AnswerWithJustification)

structured_llm.invoke("What weighs more a pound of bricks or a pound of feathers")
```

This appears to fix.
2024-03-22 16:03:48 -04:00
Erick Friis
cceaca3e4f cookbook[patch]: add strip of quotes (#19452) 2024-03-22 19:10:39 +00:00
ccurme
8a2528c34a [langchain] fix OpenAIAssistantRunnable.create_assistant (#19081)
- **Description:** OpenAI assistants support some pre-built tools (e.g.,
`"retrieval"` and `"code_interpreter"`) and expect these as `{"type":
"code_interpreter"}`. This may have been upset by
https://github.com/langchain-ai/langchain/pull/18935
- **Issue:** https://github.com/langchain-ai/langchain/issues/19057
2024-03-22 13:23:19 -04:00
Harrison Chase
b40c80007f core[minor]: Add utility code to create tool examples (#18602)
Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-22 13:17:40 -04:00
Erick Friis
53ac1ebbbc mistralai[minor]: 0.1.0rc0, remove mistral sdk (#19420) 2024-03-22 01:24:58 +00:00
William FH
e980c14d6a core[patch]: allow "placeholder" type in from_messages tuples (#19152)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-21 22:09:24 +00:00
billytrend-cohere
f6bcd42421 community[patch]: Replace positional argument with text=text for cohere>=5 compatibility (#19407)
- **Description:** Replace positional argument with text=text for
cohere>=5 compatibility
2024-03-21 10:42:51 -07:00
enfeng
b20c2640da anthropic[patch]: update base_url of anthropic (#18634)
A small change ~

- [ ] **update base_url**: "package: langchain_anthropic"

---------

Co-authored-by: yangenfeng <yangenfeng@xiaoniangao.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-20 21:04:55 -07:00
Erick Friis
a9cda536ad openai[patch]: fix core min version (#19366) 2024-03-20 15:38:29 -07:00
Erick Friis
0b20c098df openai[patch]: fix name param (#19365) 2024-03-20 22:22:09 +00:00
Erick Friis
f6c8700326 openai[patch]: release 0.1.0, message id and name support (#19363) 2024-03-20 15:11:39 -07:00
Bagatur
3fa711dce0 experimental[patch]: Release 0.0.55 (#19353) 2024-03-20 13:06:39 -07:00
Erick Friis
2bcd760c46 robocorp[patch]: run integration tests on release (#19358) 2024-03-20 19:31:12 +00:00
Erick Friis
a031c183ae robocorp[patch]: release 0.0.4 (#19357) 2024-03-20 12:28:41 -07:00
Bagatur
d95ea3550e langchain[patch]: Release 0.1.13 (#19351) 2024-03-20 18:25:12 +00:00
Bagatur
b58b38769d community[patch]: Release 0.0.29 (#19350) 2024-03-20 18:09:48 +00:00
Bagatur
5d220975fc core[patch]: Release 0.1.33 (#19348) 2024-03-20 17:28:56 +00:00
Eugene Yurtsev
aa9ccca775 langchain[patch]: Add tests for indexing (#19342)
This PR adds tests for the indexing API
2024-03-20 13:00:22 -04:00
William FH
68298cdc82 [Feat] Accept non-dict if only 1 prompt input variable (#19156)
For prompt templates with only 1 variable (common in e.g.,
MessageGraph), it's convenient to wrap the incoming object in the
variable before formatting.


The downside of this, of course, would be that some number of
invocations will successfully format when the user may have intended to
format it properly before
2024-03-20 09:59:32 -07:00
mackong
d9396bdec1 langchain[patch]: add stop for various non-openai agents (#19333)
* Description: add stop for various non-openai agents.
* Issue: N/A
* Dependencies: N/A

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-20 11:34:10 -04:00
Yudhajit Sinha
7d216ad1e1 community[patch]: Invoke callback prior to yielding token (titan_takeoff_pro) (#18624)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/titan_takeoff_pro.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:58:18 -07:00
Yudhajit Sinha
455a74486b community[patch]: Invoke callback prior to yielding token (sparkllm) (#18625)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/sparkllm.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:57:53 -07:00
Yudhajit Sinha
5ac1860484 community[patch]: Invoke callback prior to yielding token (replicate) (#18626)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/replicate.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:57:27 -07:00
Yudhajit Sinha
9525e392de community[patch]: Invoke callback prior to yielding token (pai_eas_endpoint) (#18627)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/pai_eas_endpoint.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:56:58 -07:00
Yudhajit Sinha
140f06e59a community[patch]: Invoke callback prior to yielding token (openai) (#18628)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_
method in llms/openai.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:56:30 -07:00
Yudhajit Sinha
280a914920 community[patch]: Invoke callback prior to yielding token (ollama) (#18629)
## PR title
community[patch]: Invoke callback prior to yielding token

## PR message
- Description: Invoke callback prior to yielding token in _stream_ &
_astream_ methods in llms/ollama.
- Issue: #16913 
- Dependencies: None
2024-03-20 07:56:09 -07:00
老阿張
9dfce56b31 docs: Fix typo in infino.ipynb (#18640)
Description: "conquerer should be conqueror "? 🤔
Issue: Typo
Dependencies: Nope
Twitter handle: laoazhang
2024-03-20 07:51:58 -07:00
Christophe Bornet
00614f332a community[minor]: Add InMemoryVectorStore (#19326)
This is a basic VectorStore implementation using an in-memory dict to
store the documents.
It doesn't need any extra/optional dependency as it uses numpy which is
already a dependency of langchain.
This is useful for quick testing, demos, examples.
Also it allows to write vendor-neutral tutorials, guides, etc...
2024-03-20 10:21:07 -04:00
Devesh Rahatekar
3c4529ac69 core: Updated docstring for RunnablePick (#18832)
**Description:** : Updated the docstring for RunnablePick. Added
Overview and an Example for RunnablePick class.
   **Issue:** : #18803
2024-03-20 13:54:42 +00:00
aditya thomas
e46419c851 docs: contribute / integrations code examples update (#19319)
**Description:** Update to make the code examples consistent with the
actual use
**Issue:** Code examples were different from actual use in the LangChain
code
**Dependencies:** Changes on top of
https://github.com/langchain-ai/langchain/pull/19294

Note: If these changes are acceptable, please merge them after
https://github.com/langchain-ai/langchain/pull/19294.
2024-03-20 09:27:53 -04:00
Leonid Ganeline
8609afbd10 core[patch]: Update messages namespace to fix API reference docs (#19161)
Classes and functions defined in __init__.py are not parsed into the API
Reference.
For example:
- libs/core/langchain_core/messages/__init__.py : AnyMessage,
MessageLikeRepresentation, get_buffer_string(), messages_from_dict(),
...

Opinionated: __init__.py is not a typical place to define artifacts.

Moved artifacts from __init__ into utils.py. 
Added `MessageLikeRepresentation` to __all__ since it is used outside of
`messages`, for example, in
`libs/core/langchain_core/language_models/base.py`
Added `_message_from_dict` to __all__ since it is used outside of
`messages`(???) I would add `message_from_dict` (without underscore) as
an alias. Please, advise.
2024-03-20 09:25:09 -04:00
Christophe Bornet
4c2e887276 core: Simplify astream logic in BaseChatModel and BaseLLM (#19332)
Covered by tests in
`libs/core/tests/unit_tests/language_models/chat_models/test_base.py`,
`libs/core/tests/unit_tests/language_models/llms/test_base.py` and
`libs/core/tests/unit_tests/runnables/test_runnable_events.py`
2024-03-20 09:05:51 -04:00
Brace Sproul
40f846e65d docs[minor]: Add chat model selection tabs component (#19296)
<img width="1728" alt="image"
src="https://github.com/langchain-ai/langchain/assets/46789226/45e70a92-c2ee-48c8-9964-100eed22687b">
2024-03-19 18:12:46 -07:00
Erick Friis
69e9610f62 openai[patch]: pass message name (#17537) 2024-03-19 19:57:27 +00:00
Guangdong Liu
e5d7e455dc splitters: Add ensure_ascii parameter (#18485)
- **Description:** Add ensure_ascii parameter
2024-03-19 12:51:16 -07:00
Nithish Raghunandanan
7ad0a3f2a7 community: add Couchbase Vector Store (#18994)
- **Description:** Added support for Couchbase Vector Search to
LangChain.
- **Dependencies:** couchbase>=4.1.12
- **Twitter handle:** @nithishr

---------

Co-authored-by: Nithish Raghunandanan <nithishr@users.noreply.github.com>
2024-03-19 12:39:51 -07:00
Chris Papademetrious
305d74c67a core: implement a batch_size parameter for CacheBackedEmbeddings (#18070)
**Description:**

Currently, `CacheBackedEmbeddings` computes vectors for *all* uncached
documents before updating the store. This pull request updates the
embedding computation loop to compute embeddings in batches, updating
the store after each batch.

I noticed this when I tried `CacheBackedEmbeddings` on our 30k document
set and the cache directory hadn't appeared on disk after 30 minutes.

The motivation is to minimize compute/data loss when problems occur:

* If there is a transient embedding failure (e.g. a network outage at
the embedding endpoint triggers an exception), at least the completed
vectors are written to the store instead of being discarded.
* If there is an issue with the store (e.g. no write permissions), the
condition is detected early without computing (and discarding!) all the
vectors.

**Issue:**
Implements enhancement #18026.

**Testing:**
I was unable to run unit tests; details in [this
post](https://github.com/langchain-ai/langchain/discussions/15019#discussioncomment-8576684).

---------

Signed-off-by: chrispy <chrispy@synopsys.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-19 18:55:43 +00:00
William FH
89af30807b Permit function eval on llm data type (#19287) 2024-03-19 11:53:50 -07:00
Jib
f8078e41e5 mongodb[patch]: Added scoring threshold to caching (#19286)
## Description
Semantic Cache can retrieve noisy information if the score threshold for
the value is too low. Adding the ability to set a `score_threshold` on
cache construction can allow for less noisy scores to appear.


- [x] **Add tests and docs**
  1. Added tests that confirm the `score_threshold` query is valid.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-19 11:30:02 -07:00
Christophe Bornet
30e4a35d7a community: Use langchain-astradb for AstraDB caches (#18419)
- [x] Needs https://github.com/langchain-ai/langchain-datastax/pull/4
- [x] Needs a new release of langchain-astradb
2024-03-19 14:04:36 -04:00
Brace Sproul
17c62e0f3a ci[minor]: Bump LC scripts package, add retry option (#19285)
The `retryFailed` option will retry all failed links, once at a time
with the goal of not triggering bot protection

`microsoft.com` is now hard coded into the whitelist
2024-03-19 10:42:59 -07:00
Erick Friis
7eb376d5fc docs: integration deprecation docs (#19283) 2024-03-19 17:11:15 +00:00
Guangdong Liu
2c835baae4 code[patch]: Add in code documentation to core Runnable with_retry method (docs only) (#19192)
- **Description:** Add in code documentation to core Runnable with_retry
method (docs only)
- **Issue:** #18804 
@baskaryan @eyurtsev PTAL

---------

Co-authored-by: ccurme <chester.curme@gmail.com>
2024-03-19 12:52:29 -04:00
Eugene Yurtsev
4b3dd34544 core[patch]: Pass sync run manager for sync stream fallback in astream (#19280)
This PR patches the fallback in chat models and language models to pass
in the appropriate version of the run manager (sync vs. async)
2024-03-19 16:32:33 +00:00
Leonid Ganeline
d314acb2d5 core[patch]: Move globals to a module instead of a package (non breaking change) (#19159)
Classes and functions defined in __init__.py are not parsed into the API
Reference.
For example: libs/core/langchain_core/globals/__init__.py :
`set_verbose` `get_llm_cache`, `set_llm_cache`, ...
And the whole `langchain_core.globals` namespace is not visible in the
API Reference. The refactoring is just file renaming.
2024-03-19 12:29:12 -04:00
Al-Ekram Elahee Hridoy
50f93d86ec core[minor]: Enhance cache flexibility in BaseChatModel (#17386)
- **Description:** Enhanced the `BaseChatModel` to support an
`Optional[Union[bool, BaseCache]]` type for the `cache` attribute,
allowing for both boolean flags and custom cache implementations.
Implemented logic within chat model methods to utilize the provided
custom cache implementation effectively. This change aims to provide
more flexibility in caching strategies for chat models.
  - **Issue:** Implements enhancement request #17242.
- **Dependencies:** No additional dependencies required for this change.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-19 11:26:58 -04:00
HatsuneMK00
4761c09e94 docs: update slack toolkit ipynb in integration (#19219)
Thank you for contributing to LangChain!

- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- **PR message**:
- **Description:** Update the slack toolkit doc to use an agent that
support multiple inputs. Using ReAct agent will cause a ValidationError
when invoking the slack tools. This is because the agent return a string
like `'{"channel": "C05LDF54S21", "message": "Hello, world!"}'` but the
ReAct agent does not support multiple inputs.
- **Issue:** This is related to this
[Discussion#18083](https://github.com/langchain-ai/langchain/discussions/18083)
    - **Dependencies:** No dependencies required

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-03-19 10:39:09 -04:00
Zihong
ff31cc1648 experimental: update the notebook link of semantic chunk. (#19253)
update the notebook link of semantic chunk.
2024-03-19 07:24:51 -04:00
Frederico Wu
f36418a5b0 langchain: creating assistants with file_ids (#19199)
Changing OpenAIAssistantRunnable.create_assistant to send the `file_ids`
parameter to openai.beta.assistants.create

Co-authored-by: Frederico Wu <fred.diaswu@coxautoinc.com>
2024-03-18 21:34:03 -07:00
Vittorio Rigamonti
9b2f9ee952 community: VectorStore Infinispan, adding autoconfiguration (#18967)
**Description**:
this PR enable VectorStore autoconfiguration for Infinispan: if
metadatas are only of basic types, protobuf
config will be automatically generated for the user.
2024-03-18 21:33:45 -07:00
Max Jakob
6f544a6a25 elasticsearch: check for deployed models (#18973)
When creating a new index, if we use a retrieval strategy that expects a
model to be deployed in Elasticsearch, check if a model with this name
is indeed deployed before creating an index. This lowers the probability
to get into a state in which an index was created with a faulty model
ID, which cannot be overwritten any more (the index has to manually be
deleted).
2024-03-18 21:32:00 -07:00
gonvee
b82644078e community: Add keep_alive parameter to control how long the model w… (#19005)
Add `keep_alive` parameter to control how long the model will stay
loaded into memory with Ollama。

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-19 04:29:01 +00:00
Anthony Shaw
bb0dd8f82f docs: Embellish article on splitting by tokens with more examples and missing details (#18997)
**Description**

This PR adds some missing details from the "Split by tokens" page in the
documentation. Specifically:

- The `.from_tiktoken_encoder()` class methods for both the
`CharacterTextSplitter` and `RecursiveCharacterTextSplitter` default to
the old `gpt-2` encoding. I've added a comment to suggest specifying
`model_name` or `encoding`
- The docs didn't mention that the `from_tiktoken_encoder()` class
method passes additional kwargs down to the constructor of the splitter.
I only discovered this by reading the source code
- Added an example of using the `.from_tiktoken_encoder()` class method
with `RecursiveCharacterTextSplitter` which is the recommended approach
for most scenarios above `CharacterTextSplitter`
- Added a warning that `TokenTextSplitter` can split characters which
have multiple tokens (e.g. 猫 has 3 cl100k_base tokens) between multiple
chunks which creates malformed Unicode strings and should not be used in
these situations.

Side note: I think the default argument of `gpt2` for
`.from_tiktoken_encoder()` should be updated?

**Twitter handle** anthonypjshaw

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-18 21:28:17 -07:00
Roshan Santhosh
7afecec280 core: update _rm_titles to account for title argument name bug (#19036)
Issue : For functions which have an argument with the name 'title', the
convert_pydantic_to_openai_function generates an incorrect output and
omits the argument all together. This is because the _rm_titles function
removes all instances of the the key 'title' from the output.



Description : Updates the _rm_titles function to check the presence of
the 'type' key as well before removing the 'title' key. As the title key
that we wish to omit always has a type key along with it.

Potential gap if there is a function defined which has both title and
key as argument names, in which case this would fail. Maybe we could set
a filter on the function argument names and reject those with keyword
argument names.


No dependencies. Passed all tests. 


- [x] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [x] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-18 21:25:06 -07:00
Harrison Chase
efcdf54edd Josha91 fix docstring (#19249)
Co-authored-by: Josha van Houdt <josha.van.houdt@sap.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-18 21:19:56 -07:00
Simon Stone
58c7687174 langchain: preserve document metadata in FlashrankRerank (#19148)
**Description:** Preserves document metadata in `FlashrankRerank`
    - **Issue:** #19142
    - **Dependencies:** None
    - **Twitter handle:** n/a

---------

Co-authored-by: Simon Stone <simon.stone@dartmouth.edu>
2024-03-19 04:15:18 +00:00
Aaron Jimenez
bc648f6cfc core: Updated docstring for Context class (#19079)
- **Description:** Improves the docstring for `class Context` by
providing an overview and an example.
- **Issue:** #18803
2024-03-18 21:15:14 -07:00
Taqi Jaffri
044bc22acc Community: Add mistral oss model support to azureml endpoints, plus configurable timeout (#19123)
- **Description:** There was no formatter for mistral models for Azure
ML endpoints. Adding that, plus a configurable timeout (it was hard
coded before)
- **Dependencies:** none
- **Twitter handle:** @tjaffri @docugami
2024-03-18 21:10:42 -07:00
Kangmoon Seo
07de4abe70 core: Fix Exception handling in XMLOutputParser (#19126)
- **Description:** 
  - Exception handling in `XMLOutputParser`
1. Add Exception handling at `root = ET.fromstring(text)` // raises
`ET.ParseError`
    2. Fix Exception class (commonly uses in `BaseOutputParser` class)
  - AS-IS: raise `ValueError`, `ET.ParserError` without handling
    ```python
    # langchain_core/output_parsers/xml.py

        text = text.strip()
        if (text.startswith("<") or text.startswith("\n<")) and (
            text.endswith(">") or text.endswith(">\n")
        ):
            root = ET.fromstring(text)
            return self._root_to_dict(root)
        else:
            raise ValueError(f"Could not parse output: {text}")
    ```
  - TO-BE: raise `OutputParserException`
    ```python
    # langchain_core/output_parsers/xml.py

        text = text.strip()
        if (text.startswith("<") or text.startswith("\n<")) and (
            text.endswith(">") or text.endswith(">\n")
        ):
            try:
                root = ET.fromstring(text)
                return self._root_to_dict(root)

            except ET.ParseError:
raise OutputParserException(f"Could not parse output: {text}")

        else:
raise OutputParserException(f"Could not parse output: {text}")

    ``` 
- **Issue:** #19107  
- **Dependencies:** None
2024-03-18 21:08:32 -07:00
Hamza Muhammad Farooqi
24a0a4472a Add docstrings for Clickhouse class methods (#19195)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-19 04:03:12 +00:00
Simon Stone
dc4ce82ddd docs: fix import path for FlashrankRerank example notebook (#19146)
**Description:** Fixes the import paths for the `FlashrankRerank`
example notebook.
 **Issue:** #19139 
 **Dependencies:** None
 **Twitter handle:** n/a

---------

Co-authored-by: Simon Stone <simon.stone@dartmouth.edu>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-18 21:03:00 -07:00
Saurav Kumar
bde199d128 Updating format of pip install (#19198)
Thank you for contributing to LangChain!

- [x] **PR title**: "Updating format of pip install in two files of
docs/cookbook"
- pip install is not reflecting properly in some of the files in
cookbook
- Example:
[docs/expression_language/cookbook/sql_db](https://python.langchain.com/docs/expression_language/cookbook/sql_db)


- [x] **PR message**: Updating format of pip install in two files of
docs/cookbook
    - **Description:** a description of the change
    - **Issue:** #19197 

- Note - let's do squash merge for the PR

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-19 04:01:24 +00:00
Rohit Gupta
785f8ab174 [langchain_community] milvus vectorstores upsert: add **kwargs to make it use for other argument also (#19193)
add **kwargs in add_documents for upsert, to make it use for other
argument also.
Lets use this, it was unused as of now.

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Rohit Gupta <rohit.gupta2@walmart.com>
2024-03-18 21:01:12 -07:00
Cycle
77868b1974 experimental: add buffer_size hyperparameter to SemanticChunker as in source video (#19208)
add buffer_size hyperparameter which used in combine_sentences function
2024-03-19 03:54:20 +00:00
HowardChan
ae3c7f702c docs:Make url as a markdown link (#19212)
**Description**: same as the title

Co-authored-by: ChenZhengHao <chenzhenghao@mail.teletraan.io>
2024-03-19 03:47:52 +00:00
Shotaro Sano
ca9c8c58ea text-splitters, infra: fix libs/langchain/dev.Dockerfile so that the text-splitter directory is copied before poetry installation (#19214)
## Description
This PR modifies the settings in `libs/langchain/dev.Dockerfile` to
ensure that the `text-splitters` directory is copied before the poetry
installation process begins.

Without this modification, the `docker build` command fails for
`dev.Dockerfile`, preventing the setup of some development environments,
including `.devcontainer`.

## Bug Details

### Repro
Run the following command:

```bash
docker build -f libs/langchain/dev.Dockerfile .
```

### Current Behavior
The docker build command fails, raising the following error:

```
...
 => [langchain-dev-dependencies 4/5] COPY libs/community/ ../community/                                                                                0.4s
 => ERROR [langchain-dev-dependencies 5/5] RUN poetry install --no-interaction --no-ansi --with dev,test,docs                                          1.1s
------                                                                                                                                                      
 > [langchain-dev-dependencies 5/5] RUN poetry install --no-interaction --no-ansi --with dev,test,docs:
#13 0.970 
#13 0.970 Directory ../text-splitters does not exist
------
executor failed running [/bin/sh -c poetry install --no-interaction --no-ansi --with dev,test,docs]: exit code: 1
```

### Expected Behavior
The `docker build` command successfully completes without the poetry
error.

### Analysis
The error occurs because the `text-splitters` directory is not copied
into the build environment, unlike the other packages under the `libs`
directory. I suspect that the `COPY` setting was overlooked since
`text-splitters` was separated in a recent PR.

## Fix
Add the following lines to the `libs/langchain/dev.Dockerfile`:

```dockerfile
# Copy the text-splitters library for installation
COPY libs/text-splitters/ ../text-splitters/
```
2024-03-18 20:45:35 -07:00
Guangdong Liu
c3310c5e7f community: Fix Milvus got multiple values for keyword argument 'timeout' (#19232)
- **Description:** Fix Milvus got multiple values for keyword argument
'timeout'
- **Issue:**  fix #18580
- @baskaryan @eyurtsev PTAL
2024-03-18 20:44:25 -07:00
Erick Friis
95904fe443 langchain[patch]: update base imports to core (#19248)
still deprecated, but was misleading before
2024-03-19 03:17:07 +00:00
Asaf Joseph Gardin
21c45475c5 ai21[patch]: AI21 Labs bump SDK version (#19114)
Description: Added support AI21 SDK version 2.1.2
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:47:08 -07:00
daniel ung
edf9d1c905 templates: Added template for JaguarDB (#16757)
- **Description:**: added langchain template for JaguarDB

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-19 02:36:24 +00:00
gustavo-yt
7c26ef88a1 templates: Add rag lantern template (#16523)
Replace this entire comment with:
  - **Description:** Added a template for lantern rag usage.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-19 02:34:46 +00:00
Jib
516cc44b3f langchain-mongodb: [test-fix] add explicit index_name setting on test vector creation (#19245)
- **Description:** Tests fail to do value lookup because it does not
specify the index name
  - **Issue:** the issue # Failing integration test
 

- [x] **Add tests and docs**: Tests now pass


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-03-18 15:52:28 -07:00
Estephania Calvo Carvajal
94e58dd827 docs:Fix links to LangSmith docs on Evaluation page (#19210) (#19216)
- **Description:** Same as the title
- **Issue:** #19210
2024-03-18 22:27:43 +00:00
William FH
780337488e [Enhancement] Add support for directly providing a run_id (#18990)
The root run id (~trace id's) is useful for assigning feedback, but the
current recommended approach is to use callbacks to retrieve it, which
has some drawbacks:
1. Doesn't work for streaming until after the first event
2. Doesn't let you call other endpoints with the same trace ID in
parallel (since you have to wait until the call is completed/started to
use

This PR lets you provide = "run_id" in the runnable config.

Couple considerations:

1. For batch calls, we split the trace up into separate trees (to permit
better rendering). We keep the provided run ID for the first one and
generate a unique one for other elements of the batch.
2. For nested calls, the provided ID is ONLY used on the top root/trace.



### Example Usage


```
chain.invoke("foo", {"run_id": uuid.uuid4()})
```
2024-03-18 15:03:04 -07:00
Jacob Lee
bd329e9aad core[patch]: Add LLM output to message response_metadata (#19158)
This will more easily expose token usage information.

CC @baskaryan

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-18 13:58:32 -07:00
Erick Friis
6fa1438334 mongodb[patch]: release 0.1.2 (#19243) 2024-03-18 13:35:45 -07:00
Leonid Ganeline
7de1d9acfd community: llms imports fixes (#18943)
Classes are missed in  __all__  and in different places of __init__.py
- BaichuanLLM 
- ChatDatabricks
- ChatMlflow
- Llamafile
- Mlflow
- Together
Added classes to __all__. I also sorted __all__ list.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 20:24:40 +00:00
Anush
aee5138930 templates: update qdrant self query (#19218)
## Description

This PR
- Updates the Qdrant self-query template to reflect the recent updates.
- Enables reading config values from `env` files as the README [mentions
it](https://github.com/Anush008/langchain/tree/self-query-qdrant/templates/self-query-qdrant#environment-setup).

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:59:08 +00:00
Kenzie Mihardja
21f75991d4 deprecate community docugami loader (#19230)
Thank you for contributing to LangChain!

- [x] **PR title**: "community: deprecate DocugamiLoader"

- [x] **PR message**: Deprecate the langchain_community and use the
docugami_langchain DocugamiLoader

---------

Co-authored-by: Kenzie Mihardja <kenzie28@cs.washington.edu>
2024-03-18 12:56:47 -07:00
Jib
ec026004cb mongodb[patch]: Remove in-memory cache from cache abstractions (#18987)
## Description
* In memory cache easily gets out of sync with the server cache, so we
will remove it entirely to reduce the issues around invalidated caches.

## Dependencies
None

- [x]  If you're adding a new integration, please include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:44:34 +00:00
Jib
866d6408af mongodb[patch]: Remove embedding retrieval from mongodb payload (#19035)
## Description
Returning the embedding is not necessary in the vector search
functionality unless specified as a debugging step. This change defaults
the behavior such that the server _only_ returns the embedding key if
explicitly requested, such as in the case of
`max_marginal_relevance_search`.


- [x] **Add tests and docs**: If you're adding a new integration, please
include
* Added `test_from_documents_no_embedding_return`


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-18 19:43:50 +00:00
Leonid Kuligin
366ba77459 core[minor]: moved fake llms and embeddings to core (#19226)
- [ ] **PR title**: "core: moved fake llms and embeddings to core"


- [ ] **PR message**:
 - **Description:** moved fake llms and embeddings to core"
2024-03-18 10:01:26 -07:00
Pengfei Jiang
514fe80778 community[patch]: add stop parameter support to volcengine maas (#19052)
- **Description:** add stop parameter to volcengine maas model
- **Dependencies:** no

---------

Co-authored-by: 江鹏飞 <jiangpengfei.jiangpf@bytedance.com>
2024-03-17 01:58:50 +00:00
htaoruan
bcc771e37c docs: ChatTongyi example error (#19013) 2024-03-17 01:55:56 +00:00
Anubhav Madhav
9235dade90 docs: provided hyperlinks to text and fixed grammar (#19092)
1) Provided links to text in the prompt (Refer Page Link 1, Page Link 2
and Page Link 3)
2) Fixed Grammar in Considerations of Model I/O Concepts documentation
page - Update concepts.mdx (Page Link 4)

*Issues are on the following pages:*
Page Link 1:
https://python.langchain.com/docs/modules/model_io/concepts#prompttemplate
Page Link 2:
https://python.langchain.com/docs/modules/model_io/concepts#messageprompttemplate
Page Link 3:
https://python.langchain.com/docs/modules/model_io/concepts#chatprompttemplate
Page Link 4:
https://python.langchain.com/docs/modules/model_io/concepts#considerations


**Fix 1**:
Description: Fixed Grammar in Considerations of Model I/O Documentation
Page
Issue: "to work well with the model are you using" # "to work well with
the model you are using"
Dependencies: None
Twitter handle: @Anubhav_Madhav (https://twitter.com/Anubhav_Madhav)

**Fix 2**:
Description: Provided links to text in the prompt (Refer Page Link 1,
Page Link 2 and Page Link 3)
Issue: links not provided # links have been provided to the text
Dependencies: None
Twitter handle: @Anubhav_Madhav (https://twitter.com/Anubhav_Madhav)
baskaryan, efriis, eyurtsev, hwchase17.


*For Fix 1*
Refer to the first word 'This" word in the image attached with this PR.
PFA
<img width="839" alt="Screenshot 2024-03-15 at 3 04 17 AM"
src="https://github.com/langchain-ai/langchain/assets/42323737/94e8db16-249f-48c3-a1d1-dee8d36067fa">


If no one reviews your PR within a few days, please @-mention one of

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-17 01:37:42 +00:00
primate88
5aa68936e0 community: Fix import path for StreamingStdOutCallbackHandler example (#19170)
- Description:
- Updated the import path for `StreamingStdOutCallbackHandler` in the
streaming response example within `huggingface_endpoint.py`. This change
corrects the import statement to reflect the actual location of
`StreamingStdOutCallbackHandler` in
`langchain_core.callbacks.streaming_stdout`.
- Issue:
  - None
- Dependencies:
  - No additional dependencies are required for this change.
- Twitter handle:
  - None

## Note:
I have tested this change locally and confirmed that the
`StreamingStdOutCallbackHandler` works as expected with the updated
import path. This PR does not require the addition of new tests since it
is a correction to documentation/examples rather than functional code.
2024-03-17 00:50:37 +00:00
Bagatur
611d5a1618 openai[patch]: fix async http client (#19164)
Fix #19116
2024-03-16 17:50:22 -07:00
Nikhil Kumar
635b3372bd community[minor]: Add support for translation in HuggingFacePipeline (#19190)
- [x] **Support for translation**: "community: Add support for
translation in `HuggingFacePipeline`"


- [x] **Add support for translation in `HuggingFacePipeline`**:
- **Description:** Add support for translation in `HuggingFacePipeline`,
which earlier used to support only text summarization and generation.
    - **Issue:** N/A
    - **Dependencies:** N/A
    - **Twitter handle:** None
2024-03-17 00:48:13 +00:00
Nikhil Kumar
a1b26dd9b6 docs: Add docs for RouterRunnable (#19191)
- [x] **Docs for `RouterRunnable`**: core: Add docs for `RouterRunnable`

- [x] **Add docs for `RouterRunnable`**:
- **Description:** Add docs for `RouterRunnable`, which was previously
missing documentation
    - **Issue:** #18803 
    - **Dependencies:** N/A
    - **Twitter handle:** None
2024-03-17 00:48:00 +00:00
k.muto
8d2c34e655 community: Fix all page numbers were the same for _BaseGoogleVertexAISearchRetriever (#19175)
- Description:
- This pull request is to fix a bug where page numbers were not set
correctly. In the current code, all chunks share the same metadata
object doc_metadata, so the page number is set with the same value for
all documents. To fix this, I changed to using separate metadata objects
for each chunk.
- Issue:
  - None
- Dependencies:
  - No additional dependencies are required for this change.
- Twitter handle:
  - @eycjur

- Test
- Even if it's not a bug, there are cases where everything ends up with
the same number of pages, so it's very difficult for me to write
integration tests.
2024-03-16 22:28:56 +00:00
Matt Frediani
160a7077b0 Update README.md (#19172)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-16 15:23:25 -07:00
inpyeong
7c092f479f docs: Update why.ipynb (#19173)
I think that cell type for pip command may be 'code'.
Please check, thank you :)

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-16 22:21:51 +00:00
Vitalii Korsakov
d96e0b2de7 docs: Remove duplicated line in Get Started section (#19182)
Line `from langchain_openai import ChatOpenAI` is put twice in Get
Started / Serving with LangServe section.
Imports on lines 559 and 566 are identical

Co-authored-by: Vitalii <vitalii@localhost>
2024-03-16 22:21:25 +00:00
Cailin Wang
7cd87d2f6a community: Add partition parameter to DashVector (#19023)
**Description**: DashVector Add partition parameter
**Twitter handle**: @CailinWang_

---------

Co-authored-by: root <root@Bluedot-AI>
2024-03-16 15:20:30 -07:00
Rodrigo Nogueira
e64cf1aba4 community: Add model argument for maritalk models and better error handling (#19187) 2024-03-16 15:18:56 -07:00
samanhappy
ff94f86ce1 docs: fix link to interface TextSplitter (#19177) 2024-03-16 15:16:34 -07:00
Sergey Kozlov
1a55e950aa community[patch]: support fastembed v1 and v2 (#19125)
**Description:**
#18040 forces `fastembed>2.0`, and this causes dependency conflicts with
the new `unstructured` package (different `onnxruntime`). There may be
other dependency conflicts.. The only way to use
`langchain-community>=0.0.28` is rollback to `unstructured 0.10.X`. But
new `unstructured` contains many fixes.

This PR allows to use both `fastembed` `v1` and `v2`.

How to reproduce:

`pyproject.toml`:
```toml
[tool.poetry]
name = "depstest"
version = "0.0.0"
description = "test"
authors = ["<dev@example.org>"]

[tool.poetry.dependencies]
python = ">=3.10,<3.12"
langchain-community = "^0.0.28"
fastembed = "^0.2.0"
unstructured = {extras = ["pdf"], version = "^0.12"}
```

```bash
$ poetry lock
```

Co-authored-by: Sergey Kozlov <sergey.kozlov@ludditelabs.io>
2024-03-15 18:33:51 -07:00
six17
fd4f536c77 text-splitters[patch]: fix json split of RecursiveJsonSplitter (#19119)
- **Description:** This modification addresses the issue of mutable
default parameters in functions. In the original code, the `chunks`
parameter is defaulted to a list containing an empty dictionary, which
is mutable. Since default parameters in Python are evaluated only once
at function definition time, modifications to the parameter would
persist across future calls. By changing the default to `None` and
checking/initializing within the function, a new list is created for
each call, thus avoiding potential issues.

---------

Co-authored-by: sixiang <sixiang@lixiang.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-15 16:46:49 -07:00
aditya thomas
05008c4f94 docs: update stale links in Together AI documentation (#19011)
**Description:** Update stales link in Together AI documentation
**Issue:** Some links pointed to legacy webpages on the Together AI
website
**Dependencies:** None
**Lint and test**: `make format`, `make lint` were run
2024-03-15 16:38:04 -07:00
aditya thomas
80eb510a7b docs: update docstring of Together class (#19008)
**Description:** Update docstring of Together class to show example and
update API URL
**Issue:** Improves usability
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` were run
2024-03-15 16:30:45 -07:00
高远
ef9813dae6 docs: add vikingdb docstrings(#19016)
Co-authored-by: gaoyuan <gaoyuan.20001218@bytedance.com>
2024-03-15 16:29:29 -07:00
wulixuan
0e0030f494 community[patch]: fix yuan2 chat model errors while invoke. (#19015)
1. fix yuan2 chat model errors while invoke.
2. update related tests.
3. fix some deprecationWarning.
2024-03-15 16:28:36 -07:00
Shuai Liu
c244e1a50b community[patch]: Fixed bug in merging generation_info during chunk concatenation in Tongyi and ChatTongyi (#19014)
- **Description:** 

In #16218 , during the `GenerationChunk` and `ChatGenerationChunk`
concatenation, the `generation_info` merging changed from simple keys &
values replacement to using the util method
[`merge_dicts`](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/utils/_merge.py):


![image](https://github.com/langchain-ai/langchain/assets/2098020/10f315bf-7fe0-43a7-a0ce-6a3834b99a15)

The `merge_dicts` method could not handle merging values of `int` or
some other types, and would raise a
[`TypeError`](https://github.com/langchain-ai/langchain/blob/master/libs/core/langchain_core/utils/_merge.py#L55).

This PR fixes this issue in the **Tongyi and ChatTongyi Model** by
adopting the `generation_info` of the last chunk
and discarding the `generation_info` of the intermediate chunks,
ensuring that `stream` and `astream` function correctly.

- **Issue:**  
    - Related issues or PRs about Tongyi & ChatTongyi: #16605, #17105 
    - Other models or cases: #18441, #17376
- **Dependencies:** No new dependencies
2024-03-15 16:27:53 -07:00
wulixuan
f79d0cb9fb docs: update docs for yuan2 in LLMs and Chat models integration. (#19028)
update yuan2.0 notebook in LLMs and Chat models.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-15 16:03:18 -07:00
Taraka Nithin Vankala
eec023766e docs: Corrected error (#19030)
- [ ] **PR title**: "docs: correction in
"https://github.com/langchain-ai/langchain/blob/master/docs/docs/get_started/quickstart.mdx",
line 289".
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: 
    - Corrected the spelling mistake
    - #18981
2024-03-15 16:02:33 -07:00
Christophe Bornet
f2a7dda4bd community[patch]: Use langchain-astradb for AstraDB doc loader (#19071)
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:57:25 +00:00
Leonid Ganeline
a49ac55964 docs: providers update 8 (#19053)
Added missed providers. Added missed integrations. Fixed format.
2024-03-15 15:49:14 -07:00
Holt Skinner
cee03630d9 community[patch]: Add Blended Search Support to GoogleVertexAISearchRetriever (#19082)
https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es#multi-data-stores

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:39:31 +00:00
Eugene Yurtsev
0ddfe7fc9d langchain[patch]: make hub work with older langchainhub versions (#19076)
Make it work with older clients
2024-03-15 15:37:52 -07:00
William W Wang
0a784074d1 docs: Update llm_caching.ipynb (#19085) 2024-03-15 22:35:48 +00:00
William W Wang
6327be9048 docsUpdate azure_cosmos_db.ipynb (#19087)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:33:26 +00:00
Anubhav Madhav
553a520ab6 docs: Fixed Grammar in Considerations of Model I/O Concepts (#19091)
Fixed Grammar in Considerations of Model I/O Concepts documentation page
- Update concepts.mdx

Page Link:
https://python.langchain.com/docs/modules/model_io/concepts#considerations

- **Description:** Fixed Grammar in Considerations of Model I/O
Documentation Page
- **Issue:** "to work well with the model are you using" # "to work well
with the model you are using"
- **Dependencies:** None
- **Twitter handle:** @Anubhav_Madhav
(https://twitter.com/Anubhav_Madhav)


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:31:39 +00:00
Shotaro Sano
d647ff1a9a docs: Fix execution results of docs/docs/modules/data_connection/indexing.ipynb (#19112)
## Description
This PR addresses a documentation issue in the
[Indexing](https://python.langchain.com/docs/modules/data_connection/indexing)
page. Specifically, it corrects the execution results of the Jupyter
notebook under the
[Source](https://python.langchain.com/docs/modules/data_connection/indexing#source)
section, which were broken as detailed below.

## Problem
The execution results following the statement, `This should delete the
old versions of documents associated with doggy.txt source and replace
them with the new versions.`, appear to be incorrect, as described
below.

### Current Behavior
- For some reason, the `index` function fails to add the new content of
`doggy.txt`. Although it deletes the document objects associated with
the `doggy.txt` source, it does not add the objects in
`changed_doggy_docs`. Consequently, the execution result displays
`num_added: 0`.
- This unexpected behavior also impacts the results of
`vectorstore.similarity_search("dog", k=30)`, showing only the contents
of `kitty.txt`. It appears as though the contents of `doggy.txt` have
been completely removed from the index:

```
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```

### Expected Behavior
- The `index` function should successfully add the objects in
`changed_doggy_docs` after removing the old content of `doggy.txt`. The
anticipated execution result is `num_added: 2`.
- Subsequently, the modified content of `doggy.txt` should appear in the
results of `vectorstore.similarity_search("dog", k=30)` as follows:

```
[Document(page_content='woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='woof woof woof', metadata={'source': 'doggy.txt'}),
 Document(page_content='tty kitty', metadata={'source': 'kitty.txt'}),
 Document(page_content='tty kitty ki', metadata={'source': 'kitty.txt'}),
 Document(page_content='kitty kit', metadata={'source': 'kitty.txt'})]
```

## Fix
I reran `docs/docs/modules/data_connection/indexing.ipynb` and have
included the diff in this PR.
2024-03-15 22:27:15 +00:00
case-k
ebc4a64f9e docs: fix databricks document url (#19096)
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-15 22:25:11 +00:00
Guangdong Liu
4468e5bdbe docs: Add in code documentation to core Runnable with_fallbacks method (docs only) (#19104)
- Description: [a description of the change] Add in code documentation
to core Runnable with_fallbacks method (docs only)
- Issue: the issue #18804 
@eyurtsev PTAL
2024-03-15 15:21:10 -07:00
Guangdong Liu
cced3eb9bc community[patch]: Fix sparkllm embeddings api bug. (#19122)
- **Description:** Fix sparkllm embeddings api bug.
@baskaryan PTAL
2024-03-15 15:08:49 -07:00
samanhappy
b9c62fb905 docs: fix API link for BaseLoader (#19128)
The link to the BaseLoader API requires an update as it has been moved
into the `langchain_core` package.
2024-03-15 14:46:05 -07:00
kaijietti
c20aeef79a community[patch]: implement qdrant _aembed_query and use it in other async funcs (#19155)
`amax_marginal_relevance_search ` and `asimilarity_search_with_score `
should use an async version of `_embed_query `.
2024-03-15 21:20:12 +00:00
Kostas Botsas
527676a753 docs: Fix source column xata.ipynb (#19137)
Docs fix: replace column name search with source.

The Xata integration expects metadata column named "source".

The docs suggest the name "search", which if used, yields the following
error:

```
File "/usr/local/lib/python3.11/site-packages/langchain_community/vectorstores/xata.py", line 95, in _add_vectors
    raise Exception(f"Error adding vectors to Xata: {r.status_code} {r}")
Exception: Error adding vectors to Xata: 400 {'errors': [{'status': 400, 'message': 'invalid record: column [source]: column not found'}]}
```
2024-03-15 14:06:18 -07:00
Barun Amalkumar Halder
34d6f0557d community[patch] : publishes duration as milliseconds to Fiddler (#19166)
**Description:** Many LLM steps complete in sub-second duration, which
can lead to non-collection of duration field for Fiddler. This PR
updates duration from seconds to milliseconds.
**Issue:** [INTERNAL] FDL-17568
**Dependencies:** NA
**Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-03-15 14:04:56 -07:00
Eugene Yurtsev
745d2476a2 langchain: upgrade mypy (#19163)
Update mypy in langchain
2024-03-15 16:37:09 -04:00
Maxime Perrin
aa785fa6ec core[minor]: allow LLMs async streaming to fallback on sync streaming (#18960)
- **Description:** Handling fallbacks when calling async streaming for a
LLM that doesn't support it.
- **Issue:** #18920 
- **Twitter handle:**@maximeperrin_

---------

Co-authored-by: Maxime Perrin <mperrin@doing.fr>
2024-03-15 16:06:50 -04:00
Erick Friis
caf47ab666 infra: run min version ci before integration tests (#18945) 2024-03-15 12:14:44 -07:00
Barun Amalkumar Halder
b551d49cf5 community[patch] : adds feedback and status for Fiddler callback handler events (#19157)
**Description:** This PR adds updates the fiddler events schema to also
pass user feedback, and llm status to fiddler
   **Tickets:** [INTERNAL] FDL-17559 
   **Dependencies:**  NA
   **Twitter handle:** behalder

Co-authored-by: Barun Halder <barun@fiddler.ai>
2024-03-15 12:03:49 -07:00
Juan Felipe Arias
f5b9aedc48 community[patch]: add args_schema to sql_database tools for langGraph integration (#18595)
- **Description:** This modification adds pydantic input definition for
sql_database tools. This helps for function calling capability in
LangGraph. Since actions nodes will usually check for the args_schema
attribute on tools, This update should make these tools compatible with
it (only implemented on the InfoSQLDatabaseTool)
  - **Issue:** N/A
  - **Dependencies:** N/A
  - **Twitter handle:** juanfe8881
2024-03-15 19:03:36 +00:00
fengjial
c922ea36cb community[minor]: Add Baidu VectorDB as vector store (#17997)
Co-authored-by: fengjialin <fengjialin@MacBook-Pro.local>
2024-03-15 19:01:58 +00:00
aditya thomas
190887c5cd docs: update the list of providers (#19012)
**Description:** Update the list of LangChain providers
**Issue:** Make the list of LangChain providers current
**Dependencies:** None
2024-03-15 12:00:24 -07:00
Erick Friis
bbe164ad28 docs: voyageai as provider (#19154) 2024-03-15 10:12:37 -07:00
Erick Friis
781aee0068 community, langchain, infra: revert store extended test deps outside of poetry (#19153)
Reverts langchain-ai/langchain#18995

Because it makes installing dependencies in python 3.11 extended testing
take 80 minutes
2024-03-15 17:10:47 +00:00
Leonid Kuligin
e3ff107e4f docs: updated google integration related imports in the documentation (#19131)
updated imports in the documentation for google vertex
2024-03-15 09:30:50 -04:00
Erick Friis
9e569d85a4 community, langchain, infra: store extended test deps outside of poetry (#18995)
poetry can't reliably handle resolving the number of optional "extended
test" dependencies we have. If we instead just rely on pip to install
extended test deps in CI, this isn't an issue.
2024-03-15 05:55:30 +00:00
Bagatur
191ddbc77e core[patch]: rc release 0.1.33-rc.1 (#19103) 2024-03-14 20:21:54 -07:00
Nuno Campos
508f75853c core[patch]: Change structured prompt lc id to match js (#19099) 2024-03-14 20:02:52 -07:00
Erick Friis
7ce81eb6f4 voyageai[patch]: init package (#19098)
Co-authored-by: fodizoltan <zoltan@conway.expert>
Co-authored-by: Yujie Qian <thomasq0809@gmail.com>
Co-authored-by: fzowl <160063452+fzowl@users.noreply.github.com>
2024-03-15 00:56:10 +00:00
Brace Sproul
5157b15446 ci[patch]: Set root dir to ./docs (#19102) 2024-03-14 17:55:04 -07:00
Brace Sproul
98cd8f673b docs[minor]ci[minor]: Add script & CI to check recurring links daily (#19100) 2024-03-14 17:42:22 -07:00
Asaf Joseph Gardin
4d7f6fa968 ai21[patch]: AI21 Labs Batch Support in Embeddings (#18633)
Description: Added support for batching when using AI21 Embeddings model
Twitter handle: https://github.com/AI21Labs

---------

Co-authored-by: Asaf Gardin <asafg@ai21.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-14 23:10:23 +00:00
Tomaz Bratanic
321db89e87 templates: Switch neo4j generation template to LLMGraphTransformer (#19024) 2024-03-14 16:00:42 -07:00
Erick Friis
d5cf360329 ibm[patch]: release 0.1.3 (#19094) 2024-03-14 15:59:42 -07:00
Mateusz Szewczyk
b15d150d22 ibm[patch]: add async tests, add tokenize support (#18898)
- **Description:** add async tests, add tokenize support
- **Dependencies:**
[ibm-watsonx-ai](https://pypi.org/project/ibm-watsonx-ai/),
  - **Tag maintainer:** 

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally -> 
Please make sure integration_tests passing locally -> 

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-14 22:57:05 +00:00
billytrend-cohere
7253b816cc community: Add support for cohere SDK v5 (keeps v4 backwards compatibility) (#19084)
- **Description:** Add support for cohere SDK v5 (keeps v4 backwards
compatibility)

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-14 15:53:24 -07:00
Eugene Yurtsev
06165efb5b core[patch]: RunnablePassthrough transform to autoupgrade to AddableDict (#19051)
Follow up on https://github.com/langchain-ai/langchain/pull/18743 which
missed RunnablePassthrough

Issues:

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
2024-03-14 16:59:46 -04:00
Eugene Yurtsev
41e2f60cd2 Updated security policy (#19089)
Updated security policy
2024-03-14 20:58:47 +00:00
Eugene Yurtsev
6cdca4355d community[minor]: Revamp PGVector Filtering (#18992)
This PR makes the following updates in the pgvector database:

1. Use JSONB field for metadata instead of JSON
2. Update operator syntax to include required `$` prefix before the
operators (otherwise there will be name collisions with fields)
3. The change is non-breaking, old functionality is still the default,
but it will emit a deprecation warning
4. Previous functionality has bugs associated with comparisons due to
casting to text (so lexical ordering is used incorrectly for numeric
fields)
5. Adds an a GIN index on the JSONB field for more efficient querying
2024-03-14 16:56:00 -04:00
Bagatur
e276817e1d docs: fix vercel build script (#19090)
amazon linux 2023 doesn't have `amazon-linux-extras` but shoudl have python3.9 by default
2024-03-14 20:53:43 +00:00
Guangdong Liu
d4b025c812 code[patch]: Add in code documentation to core Runnable assign method (docs only) (#18951)
**PR message**: ***Delete this entire checklist*** and replace with
- **Description:** [a description of the change](docs: Add in code
documentation to core Runnable assign method)
    - **Issue:** the issue  #18804
2024-03-14 15:41:19 -04:00
Anthony Yang
688a5bd106 docs:fixed typo in streaming document (#19045)
Fixed typo in line 661 - from 'mimimize' to 'minimize

- [ ] **PR message**: 
- **Description:** Fixed typo in streaming document - change 'mimimize'
to 'minimize

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-14 19:38:53 +00:00
Bagatur
573f48e34d core[patch]: Release 0.1.32 (#19088) 2024-03-14 12:01:58 -07:00
YHW
69a8ef2693 core: Runnable pass kwargs to _astream_log_implementation in astream_log (#19055)
- **Description:** When calling the `_stream_log_implementation` from
the `astream_log` method in the `Runnable` class, it is not handing over
the `kwargs` argument. Therefore, even if i want to customize APIHandler
and implement additional features with additional arguments, it is not
possible. Conversely, the `astream_events` method normally handing over
the `kwargs` argument.
- **Issue:** https://github.com/langchain-ai/langchain/issues/19054
- **Dependencies:**
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!

Co-authored-by: hyungwookyang <hyungwookyang@worksmobile.com>
2024-03-14 14:39:46 -04:00
Nuno Campos
751fb7de20 Add new beta StructuredPrompt (#19080)
Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-14 10:40:34 -07:00
Bagatur
0ae39ab30e docs: make links internal (#19063)
So they can be properly link checked
2024-03-14 16:22:56 +00:00
Anton Parkhomenko
ae73b9d839 community[patch]: Fix NotionDBLoader 400 Error by conditionally adding filter parameter (#19075)
- **Description:** This change fixes a bug where attempts to load data
from Notion using the NotionDBLoader resulted in a 400 Bad Request
error. The issue was traced to the unconditional addition of an empty
'filter' object in the request payload, which Notion's API does not
accept. The modification ensures that the 'filter' object is only
included in the payload when it is explicitly provided and not empty,
thus preventing the 400 error from occurring.
- **Issue:** Fixes
[#18009](https://github.com/langchain-ai/langchain/issues/18009)
- **Dependencies:** None
- **Twitter handle:** @gunnzolder

Co-authored-by: Anton Parkhomenko <anton@merge.rocks>
2024-03-14 13:56:57 +00:00
Erick Friis
2999d06938 docs: deprecate old airbyte loader docs (#19048) 2024-03-13 23:18:30 +00:00
Prakul
4c53e31377 docs: Updated index definition and reference to LangChain-MongoDB (#19047)
**Description:** 
Updates to LangChain-MongoDB documentation: updates to the Atlas vector
search index definition

**Issue:** 
NA

**Dependencies:** 
NA

**Twitter handle:** 
iprakul
2024-03-13 15:44:13 -07:00
Erick Friis
5e0c58f9c2 infra: update upload-artifact and download-artifact to v4 (#19044) 2024-03-13 20:08:29 +00:00
Tomaz Bratanic
e5e15c8d59 docs: Add graph construction docs (#18904) 2024-03-13 12:27:58 -07:00
Nuno Campos
2b7c3c548d core[minor]: Add Runnable.batch_as_completed (#17603)
This PR adds `batch as completed` method to the standard Runnable
interface. It takes in a list of inputs and yields the corresponding
outputs as the inputs are completed.
2024-03-13 11:18:02 -07:00
Erick Friis
71d0981f18 templates: fix rag-lancedb dep (#19010) 2024-03-13 04:36:24 +00:00
Erick Friis
74b2c0aa01 templates, cli: more security deps (#19006) 2024-03-12 20:48:56 -07:00
Erick Friis
9052d05442 template: bump more lockfiles (#19003)
- templates: bump lockfile deps
- x
2024-03-13 01:43:33 +00:00
Erick Friis
49f3cc0f6b templates: bump lockfile deps (#19001) 2024-03-13 01:25:45 +00:00
Erick Friis
2ffb2144a6 experimental[patch]: release 0.0.54 (#19000) 2024-03-13 00:38:46 +00:00
Erick Friis
873d06c009 langchain[patch]: release 0.1.12 (#18999) 2024-03-13 00:22:21 +00:00
Leonid Ganeline
9c8523b529 community[patch]: flattening imports 3 (#18939)
@eyurtsev
2024-03-12 15:18:54 -07:00
Erick Friis
af50f21765 community[patch]: release 0.0.28 (#18993) 2024-03-12 21:55:29 +00:00
Erick Friis
4881bb669c core[patch]: release 0.1.31 (#18989) 2024-03-12 19:45:21 +00:00
Erick Friis
a29e8d8594 elasticsearch[patch]: fix integration tests for release (#18980) 2024-03-12 10:22:07 -07:00
Erick Friis
0d1f6c417c elasticsearch[patch]: release 0.1.1 (#18978) 2024-03-12 16:46:22 +00:00
Max Jakob
911ccf9aa6 docs: elasticsearch retriever (#18965)
Add documentation notebook for `ElasticsearchRetriever`.

## Dependencies
- [ ] Release new `langchain-elasticsearch` version 0.2.0 that includes
`ElasticsearchRetriever`
2024-03-12 09:42:36 -07:00
Dobiichi-Origami
471f2ed40a community[patch]: re-arrange the addtional_kwargs of returned qianfan structure to avoid _merge_dict issue (#18889)
fix issue: https://github.com/langchain-ai/langchain/issues/18441
PTAL, thanks
@baskaryan, @efriis, @eyurtsev, @hwchase17.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-12 05:43:56 +00:00
Naman Jain
75122646b5 core[patch]: fixed circular dependency with json schema (#18657)
**Description:** Circular dependencies when parsing references leading
to `RecursionError: maximum recursion depth exceeded` issue. This PR
address the issue by handling previously seen refs as in any typical DFS
to avoid infinite depths.

**Issue:** https://github.com/langchain-ai/langchain/issues/12163

 **Twitter handle:** https://twitter.com/theBhulawat 


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
2024-03-12 05:42:45 +00:00
Tymofii
0bec1f6877 commnity[patch]: refactor code for faiss vectorstore, update faiss vectorstore documentation (#18092)
**Description:** Refactor code of FAISS vectorcstore and update the
related documentation.
Details: 
 - replace `.format()` with f-strings for strings formatting;
- refactor definition of a filtering function to make code more readable
and more flexible;
- slightly improve efficiency of
`max_marginal_relevance_search_with_score_by_vector` method by removing
unnecessary looping over the same elements;
- slightly improve efficiency of `delete` method by using set data
structure for checking if the element was already deleted;

**Issue:** fix small inconsistency in the documentation (the old example
was incorrect and unappliable to faiss vectorstore)

**Dependencies:** basic langchain-community dependencies and `faiss`
(for CPU or for GPU)

**Twitter handle:** antonenkodev
2024-03-11 22:33:03 -07:00
Roshan Santhosh
acf1ecc081 langchain[patch]: update llm_router.py (#18865)
Issue : _call method of LLMRouterChain uses predict_and_parse, which is
slated for deprecation.

Description : Instead of using predict_and_parse, this replaces it with
individual predict and parse functions.
2024-03-11 22:30:07 -07:00
Bagatur
18de77cc8c core[minor]: add streaming support to OAI tool parsers (#18940)
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-11 21:53:56 -07:00
Bagatur
e0e688a277 core[minor]: generation info on msg (#18592)
related to #16403 #17188
2024-03-12 04:43:17 +00:00
Tomaz Bratanic
cda43c5a11 experimental[patch]: Fix LLM graph transformer default prompt (#18856)
Some LLMs do not allow multiple user messages in sequence.
2024-03-11 20:11:52 -07:00
Bagatur
19721246f5 core[patch]: support labeled json schema as tools (#18935) 2024-03-11 19:51:35 -07:00
Jacob Lee
950ab056eb templates[patch]: Update pirate-speak deps, add messages placeholder (#18949)
CC @efriis
2024-03-11 19:20:30 -07:00
Leonid Ganeline
fad308a764 docs: providers update 2 (#18407)
Formatted pages into a consistent form. Added descriptions and links
when needed.
2024-03-11 18:35:37 -07:00
Erick Friis
239f0a615e templates: redis multi-modal multi-vector rag (#18946)
---------

Co-authored-by: Tyler Hutcherson <tyler.hutcherson@redis.com>
2024-03-12 00:32:25 +00:00
Bagatur
915c1f8673 infra: rm api build CI (#18944) 2024-03-11 16:12:34 -07:00
Brace Sproul
578e67c017 docs[patch]: properly load/use env vars (#18942) 2024-03-11 15:38:05 -07:00
Erick Friis
0d888a65cb core[patch]: move some attr/methods to BaseLanguageModel (#18936)
Cleans up some shared code between `BaseLLM` and `BaseChatModel`. One
functional difference to make it more consistent (see comment)
2024-03-11 14:59:45 -07:00
Brace Sproul
4ff6aa5c78 docs[minor]: Swap gtag for supabase (#18937)
Added deps:
- `@supabase/supabase-js` - for sending inserts
- `supabase` - dev dep, for generating types via cli
- `dotenv` for loading env vars

Added script:
- `yarn gen` - will auto generate the database schema types using the
supabase CLI. Not necessary for development, but is useful. Requires
authing with the supabase CLI (will error out w/ instructions if you're
not authed).

Added functionality:
- pulls users IP address (using a free endpoint: `https://api.ipify.org`
so we can filter out abuse down the line)

TODO:
- [x] add env vars to vercel
2024-03-11 14:23:12 -07:00
aditya thomas
5c2f7e6b2b partners[openai]: update the docstring of OpenAI, OpenAIEmbeddings and ChatOpenAI classes (#18908)
**Description:** Update the docstring of OpenAI, OpenAIEmbeddings and
ChatOpenAI classes
**Issue:** Update import module paths to the current LangChain API
**Dependencies:** None
**Lint and test**: `make format` and `make lint` were run

This incorporates the review comments from langchain-ai/langchain#18637
which I closed due to an issue I had in updating that pr branch

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-11 20:48:54 +00:00
Leonid Ganeline
11195cfa42 community[patch]: speed up import times in the community package (#18928)
This PR speeds up import times in the community package
2024-03-11 16:37:36 -04:00
fjk
a7fc731720 docs: change sparkllm spark_app_url to spark_api_url (#18000)
community: fix - change sparkllm spark_app_url to spark_api_url

- **Description:** 
- Change the variable name from `sparkllm spark_app_url` to
`spark_api_url` in the community package.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-11 20:01:30 +00:00
Sevin F. Varoglu
8639624d40 docs: update OctoAI doc (#18913)
This PR updates the OctoAI LLM doc.
2024-03-11 13:01:10 -07:00
Alexander Kozlov
a7500ab0fb docs: Update huggingface pipelines notebook (#18801) 2024-03-11 20:00:31 +00:00
Conroy Whitney
96d7fe0f85 docs: Change saved/configured chain variable name (#18863)
**Description:**
Variable name was `openai_poem` but it didn't pass in the `"prompt":
"poem"` config, so the examples were showing a joke being returned from
a variable called `*_poem`.

We could have gone one of two ways:

1. Updating the config line and the output line, or
2. Updating the variable name

The latter seemed simpler, so that's what I went with. But I'd be glad
to re-do this PR if you prefer the former.

Thanks for everything, y'all. You rock 🤘

**Issue:** N/A

**Dependencies:** N/A

**Twitter handle:** `conroywhitney`
2024-03-11 12:59:24 -07:00
aditya thomas
8544f748f2 community[patch]: update AnthropicLLM deprecation message (#18869)
**Description:** Update AnthropicLLM deprecation message import path for
ChatAnthropic
**Issue:** Incorrect import path in deprecation message
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` were run
2024-03-11 12:59:10 -07:00
Virat Singh
cafffe8a21 community: Add PolygonAggregates tool (#18882)
**Description:**
In this PR, I am adding a `PolygonAggregates` tool, which can be used to
get historical stock price data (called aggregates by Polygon) for a
given ticker.

Polygon
[docs](https://polygon.io/docs/stocks/get_v2_aggs_ticker__stocksticker__range__multiplier___timespan___from___to)
for this endpoint.

**Twitter**: 
[@virattt](https://twitter.com/virattt)
2024-03-11 11:58:10 -07:00
Bagatur
2d172181e0 Revert "update api build script (#18930)" (#18931) 2024-03-11 11:47:18 -07:00
Bagatur
def329b5f2 update api build script (#18930) 2024-03-11 11:44:37 -07:00
Bagatur
c24c871d88 docs: update readme diagram (#18929) 2024-03-11 11:17:45 -07:00
Bagatur
34284c25d4 docs: turn on link check (#18924) 2024-03-11 10:50:39 -07:00
Erick Friis
93ef8ead0b mongodb[patch]: fix core dep (#18926) 2024-03-11 10:27:29 -07:00
Mohammad Mohtashim
43db4cd20e core[major]: On Tool End Observation Casting Fix (#18798)
This PR updates the on_tool_end handlers to return the raw output from the tool instead of casting it to a string. 

This is technically a breaking change, though it's impact is expected to be somewhat minimal. It will fix behavior in `astream_events` as well.

Fixes the following issue #18760 raised by @eyurtsev

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-03-11 10:59:04 -04:00
Prashanth Rao
a96a6e0f2c docs: Fix typo and add KùzuDB to graphs docs (#18915)
- **Description:** Adding Kùzu (an embedded graph DB that uses Cypher)
to the graph docs, and fixing a typo
 - **Issue:** docs update
2024-03-11 14:42:46 +00:00
aditya thomas
3d15498612 docs: Update callbacks documentation (#18899)
**Description:** Update callbacks documentation
**Issue:** Change some module imports and a method invocation to reflect
the current LangChainAPI
**Dependencies:** None
2024-03-11 10:40:11 -04:00
Massimiliano Pronesti
8113d612bb community[patch]: support modin document loader (#18866)
Langchain community document loaders support `pyspark`, `polars`, and
`pandas` dataframes but not `modin`'s. This PR addresses this point.
2024-03-10 18:40:04 -07:00
Leonid Ganeline
dee256ef5a docs: platforms/google fixed broken links (#18878)
Several links are broken. Fixed them.
2024-03-10 18:19:43 -07:00
Pol Ruiz Farre
a7f63d8cb4 community[patch]: Fix BasePDFLoader suffix for s3 presigned urls (#18844)
BasePDFLoader doesn't parse the suffix of the file correctly when
parsing S3 presigned urls. This fix enables the proper detection and
parsing of S3 presigned URLs to prevent errors such as `OSError: [Errno
36] File name too long`.
No additional dependencies required.
2024-03-11 00:58:51 +00:00
Joshua Carroll
ddaf9de169 community: Fix bug with StreamlitChatMessageHistory (#18834)
- **Description:** Fix Streamlit bug which was introduced by
https://github.com/langchain-ai/langchain/pull/18250, update integration
test
- **Issue:** https://github.com/langchain-ai/langchain/issues/18684
- **Dependencies:** None
2024-03-09 13:42:22 -08:00
Kushagra
5fcbe9dd2a community[patch]: documented the feature to filter documents in MongoDBloader (#18842)
"community[docs]: documented the feature to filter documents in
MongoDBloader"
- Description: documented the feature to filter documents in
MongoDBloader
- Feature: the feature
https://github.com/langchain-ai/langchain/discussions/18251
- Dependencies: No
- Twitter handle: https://twitter.com/im_Kushagra
2024-03-09 13:41:34 -08:00
Ikko Eltociear Ashimine
c3580d3c64 docs: fix typo in google_cloud_sql_mysql.ipynb (#18847)
arbitary -> arbitrary
2024-03-09 13:39:36 -08:00
Luan Fernandes
5a006f7264 docs: update typo in docs about agent tools (#18850)
fixes #18849
2024-03-09 13:39:18 -08:00
Leonid Ganeline
3dabd3f214 docs: platform pages update (#17836)
`Integrations` platform page ToC-s: sections there are placed without
order. For example, the
[google](https://python.langchain.com/docs/integrations/platforms/google)
page. The `LLM` section is not the first section, as it is in the
[Components](https://python.langchain.com/docs/integrations/components)
menu.
Updates:
* reorganized the page sections so they follow the Component menu order.
* fixed names for the section names: "Text Embedding Models" ->
"Embedding Models"
2024-03-09 13:34:33 -08:00
Leonid Ganeline
07c518ad3e docs: providers update 4 (#18540)
Created the `facebook` page from `facebook_faiss` and `facebook_chat`
pages. Added another Facebook integrations into this page.
Updated `discord` page.
2024-03-09 13:30:48 -08:00
Leonid Ganeline
9c0f84ae95 docs: providers update 6 (#18610)
Cleaned up the `Integrations/Components/Memory` navbar by shortening the
page titles. Updated page titles and file names to consistent formats.
2024-03-09 13:29:44 -08:00
Tomaz Bratanic
a28be31a96 Switch to md5 for deduplication in neo4j integrations (#18846)
Deduplicate documents using MD5 of the page_content. Also allows for
custom deduplication with graph ingestion method by providing metadata
id attribute

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2024-03-09 13:28:55 -08:00
Tomaz Bratanic
246724faab LLM graph transformer prompt engineering (#18843)
A bit of prompt engineering to improve results
2024-03-09 11:27:16 -08:00
Tomaz Bratanic
e778d60aec Fix broken link in graph docs (#18837) 2024-03-09 10:40:33 -08:00
Erick Friis
b48865bf94 langchain[patch]: attach hub metadata (#18830) 2024-03-08 18:40:49 -08:00
Ammar
34b31a8cc7 core: add in-code docs for RunnableAssign class (#18826)
**Description:** Improves the docstring for `RunnableAssign` by
providing a concise description and a self-contained code example.
  **Issue:**  #18803
2024-03-09 02:04:52 +00:00
Leonid Ganeline
5d65b47e41 docs: chat menu item as icon (#18806)
Update chat icon in docs
2024-03-08 21:00:21 -05:00
Leonid Ganeline
476d6dc596 community[patch]: Use getattr for toolkits imports (#18825)
This will preserve the namespace, without actually loading the underlying packages on init.
2024-03-08 20:54:28 -05:00
Erick Friis
bbb609ac9d core[patch]: fix arbitrary config keys (#18827) 2024-03-08 17:35:13 -08:00
Luis Antonio Vieira Junior
67c880af74 community[patch]: adding linearization config to AmazonTextractPDFLoader (#17489)
- **Description:** Adding an optional parameter `linearization_config`
to the `AmazonTextractPDFLoader` so the caller can define how the output
will be linearized, instead of forcing a predefined set of linearization
configs. It will still have a default configuration as this will be an
optional parameter.
- **Issue:** #17457
- **Dependencies:** The same ones that already exist for
`AmazonTextractPDFLoader`
- **Twitter handle:** [@lvieirajr19](https://twitter.com/lvieirajr19)

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 17:25:22 -08:00
Anis ZAKARI
37e89ba5b1 community[patch]: Bedrock add support for mistral models (#18756)
*Description**: My previous
[PR](https://github.com/langchain-ai/langchain/pull/18521) was
mistakenly closed, so I am reopening this one. Context: AWS released two
Mistral models on Bedrock last Friday (March 1, 2024). This PR includes
some code adjustments to ensure their compatibility with the Bedrock
class.

---------

Co-authored-by: Anis ZAKARI <anis.zakari@hymaia.com>
Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-09 01:20:38 +00:00
Alexander Dicke
66576948e0 experimental[minor]: adds mixtral wrapper (#17423)
**Description:** Adds a chat wrapper for Mixtral models using the
[prompt
template](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1#instruction-format).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 17:14:23 -08:00
Erick Friis
4f4300723b docs: pinecone client version note (#17491) 2024-03-08 17:09:17 -08:00
Keith Chan
914af69b44 community[patch]: Update azuresearch vectorstore from_texts() method to include fields argument (#17661)
- **Description:** Update azuresearch vectorstore from_texts() method to
include fields argument, necessary for creating an Azure AI Search index
with custom fields.
- **Issue:** Currently index fields are fixed to default fields if Azure
Search index is created using from_texts() method
- **Dependencies:** None
- **Twitter handle:** None

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 17:05:35 -08:00
al1p
46f0cea2b9 community[patch][: improved the suffix prompt to avoid loop (#17791)
Small improvement to the openapi prompt.
The agent was not finding the server base URL (looping through all
nodes). This small change narrows the search and enables finding the url
faster.

No dependency 

Twitter : @al1pra
2024-03-08 16:53:09 -08:00
Dmitry Kankalovich
f5117e907d openai[patch]: Proper example for AzureOpenAI usage in error message (#17798)
# Proper example for AzureOpenAI usage in error message

The original error message is wrong in part of a usage example it gives.
Corrected to the right one.

Co-authored-by: Dzmitry Kankalovich <dzmitry_kankalovich@epam.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:52:55 -08:00
Pranav Agarwal
bd9b5dc2f3 docs: Updating cookbook README for amazon personalize (#17854)
This PR is a successor to this PR -
https://github.com/langchain-ai/langchain/pull/17436
This PR updates the cookbook README with the notebook so that it is
available on langchain docs for discoverability.

cc: @baskaryan, @3coins

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:52:36 -08:00
AtomicVar
23e62f8f8d docs: fix lists display issue (#17911)
**Description:** Fix lists display issues in **Docs > Use Cases > Q&A
with RAG > Quickstart**.

In essence, this PR changes:

```markdown
Some paragraph.
- Item a.
- Item b.
```

to:

```markdown
Some paragraph.

- Item a.
- Item b.
```

There needs an extra empty line to make the list rendered properly.

FYI, the old version is displayed not properly as:

<img width="856" alt="image"
src="https://github.com/langchain-ai/langchain/assets/22856433/65202577-8ea2-47c6-b310-39bf42796fac">

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:52:16 -08:00
Théo LEBRUN
cf94091cd0 community[patch]: Skip nested directories when using S3DirectoryLoader (#17829)
- **Description:** `S3DirectoryLoader` is failing if prefix is a folder
(ex: `my_folder/`) because `S3FileLoader` will try to load that folder
and will fail. This PR skip nested directories so prefix can be set to
folder instead of `my_folder/files_prefix`.
- **Issue:**
  - #11917
  - #6535
  - #4326
- **Dependencies:** none
- **Twitter handle:** @Falydoor


- [x] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/
2024-03-08 16:50:58 -08:00
Venkatesan
7a18b63dbf community[patch]: Mongo index creation (#17748)
- [ ] Title: Mongodb: MongoDB connection performance improvement. 
- [ ] Message: 
- **Description:** I made collection index_creation as optional. Index
Creation is one time process.
- **Issue:** MongoDBChatMessageHistory class object is attempting to
create an index during connection, causing each request to take longer
than usual. This should be optional with a parameter.
    - **Dependencies:** N/A
    - **Branch to be checked:** origin/mongo_index_creation

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:43:17 -08:00
wt3639
5b5b37a999 community[patch]: Add embedding instruction to HuggingFaceBgeEmbeddings (#18017)
- **Description:** Add embedding instruction to
HuggingFaceBgeEmbeddings, so that it can be compatible with nomic and
other models that need embedding instruction.

---------

Co-authored-by: Tao Wu <tao.wu@rwth-aachen.de>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 16:39:29 -08:00
Brace Sproul
9c218d0154 docs[patch]: Update how GA4 is collected (#18821)
There's some issue/setting with the current python GA4 app. I created a
new one just for feedback.
2024-03-08 14:32:40 -08:00
Erick Friis
a8de6d1533 anthropic[patch]: integration test update (#18823) 2024-03-08 13:47:31 -08:00
wewebber-merlin
d1f5bc4906 anthropic[patch]: add kwargs to format_output base (#18715)
_generate() and _agenerate() both accept **kwargs, then pass them on to
_format_output; but _format_output doesn't accept **kwargs. Attempting
to pass, e.g.,

     timeout=50

to _generate (or invoke()) results in a TypeError.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-08 21:47:21 +00:00
Erick Friis
aa7bce6b13 anthropic[patch]: release 0.1.4 (#18822) 2024-03-08 21:34:47 +00:00
Erick Friis
a5bcddc738 anthropic[patch]: streaming param (#18819) 2024-03-08 13:32:57 -08:00
Erick Friis
8c0b215c02 anthropic[patch]: fix format output args (#18816) 2024-03-08 12:34:11 -08:00
Ishani Vyas
2b0cbd65ba community[patch]: Add Passio Nutrition AI Food Search Tool to Community Package (#18278)
## Add Passio Nutrition AI Food Search Tool to Community Package

### Description
We propose adding a new tool to the `community` package, enabling
integration with Passio Nutrition AI for food search functionality. This
tool will provide a simple interface for retrieving nutrition facts
through the Passio Nutrition AI API, simplifying user access to
nutrition data based on food search queries.

### Implementation Details
- **Class Structure:** Implement `NutritionAI`, extending `BaseTool`. It
includes an `_run` method that accepts a query string and, optionally, a
`CallbackManagerForToolRun`.
- **API Integration:** Use `NutritionAIAPI` for the API wrapper,
encapsulating all interactions with the Passio Nutrition AI and
providing a clean API interface.
- **Error Handling:** Implement comprehensive error handling for API
request failures.

### Expected Outcome
- **User Benefits:** Enable easy querying of nutrition facts from Passio
Nutrition AI, enhancing the utility of the `langchain_community` package
for nutrition-related projects.
- **Functionality:** Provide a straightforward method for integrating
nutrition information retrieval into users' applications.

### Dependencies
- `langchain_core` for base tooling support
- `pydantic` for data validation and settings management
- Consider `requests` or another HTTP client library if not covered by
`NutritionAIAPI`.

### Tests and Documentation
- **Unit Tests:** Include tests that mock network interactions to ensure
tool reliability without external API dependency.
- **Documentation:** Create an example notebook in
`docs/docs/integrations/tools/passio_nutrition_ai.ipynb` showing usage,
setup, and example queries.

### Contribution Guidelines Compliance
- Adhere to the project's linting and formatting standards (`make
format`, `make lint`, `make test`).
- Ensure compliance with LangChain's contribution guidelines,
particularly around dependency management and package modifications.

### Additional Notes
- Aim for the tool to be a lightweight, focused addition, not
introducing significant new dependencies or complexity.
- Potential future enhancements could include caching for common queries
to improve performance.

### Twitter Handle
- Here is our Passio AI [twitter handle](https://twitter.com/@passio_ai)
where we announce our products.


If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-08 20:33:22 +00:00
Aaron Jimenez
bd9f98a20b docs: Fix typo in modules/chains.ipynb (#18808)
**Description:**  

Fix a minor typo in `modules/chains.ipynb`.
 
- **Issue:** 
    fixes #17851
2024-03-08 12:09:20 -08:00
Kushagra
b1f22bf76c community[minor]: added a feature to filter documents in Mongoloader (#18253)
"community: added a feature to filter documents in Mongoloader"
- **Description:** added a feature to filter documents in Mongoloader
    - **Feature:** the feature #18251
    - **Dependencies:** No
    - **Twitter handle:** https://twitter.com/im_Kushagra
2024-03-08 12:06:35 -08:00
Tomaz Bratanic
c0bdd4d45b docs: Add main graph documentation (#18021)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-08 20:03:03 +00:00
Leonid Ganeline
7c8c4e5743 docs: providers update 7 (#18620)
Added missed providers. Added missed integrations. Formatted to the
consistent form. Fixed outdated imports.
2024-03-08 12:00:27 -08:00
Eugene Yurtsev
1f50274df7 community[patch]: Add pgvector to docker compose and update settings used in integration test (#18815) 2024-03-08 14:39:28 -05:00
Erick Friis
ad29806255 nvidia-trt, nvidia-ai-endpoints: move to repo (#18814)
NVIDIA maintained in https://github.com/langchain-ai/langchain-nvidia
2024-03-08 19:30:50 +00:00
Christophe Bornet
e54a49b697 community[minor]: Add lazy_table_reflection param to SqlDatabase (#18742)
For some DBs with lots of tables, reflection of all the tables can take
very long. So this change will make the tables be reflected lazily when
get_table_info() is called and `lazy_table_reflection` is True.
2024-03-08 14:10:23 -05:00
Christophe Bornet
ead2a74806 community: Implement lazy_load() for JSONLoader (#18643)
Covered by `tests/unit_tests/document_loaders/test_json_loader.py`
2024-03-08 13:58:17 -05:00
Erick Friis
a88f62ec3c langchain[patch]: getattr import from langchain.chains (#18160) 2024-03-08 10:36:14 -08:00
kAIto47802
ff70cc4e80 docs: fix typo (#18810)
Fixed typo in docs
2024-03-08 13:28:17 -05:00
Eugene Yurtsev
cdfb5b4ca1 core[minor]: Chat Models to fallback astream to fallback on sync stream if available (#18748)
Allows all chat models that implement _stream, but not _astream to still have async streaming to work.

Amongst other things this should resolve issues with streaming community model implementations through langserve since langserve is exclusively async.
2024-03-08 13:27:29 -05:00
Leonid Ganeline
3624f56ccb docs: update imports of retrievers to use langchain_community (#18707)
Updated `langchain` imports to `langchain_community`.
2024-03-08 13:04:38 -05:00
Leonid Ganeline
48eed86931 docs: update imports of memory to use langchain_community (#18689)
Refactored imports from `langchain` to `langchain_community` whenever it
is applicable
2024-03-08 13:02:31 -05:00
aditya thomas
e00c1ff2b0 infra: ChatOpenAI unit tests for invoke() and ainvoke() (#18792)
**Description:** Replacing the deprecated predict() and apredict()
methods in the unit tests
**Issue:** Not applicable
**Dependencies:** None
**Lint and test**: `make format`, `make lint` and `make test` have been
run
2024-03-08 09:48:38 -08:00
aditya thomas
a35203b164 docs: (minor) update to anthropic doc (#18794)
**Description:** Minor update to Anthropic documentation
**Issue:** Not applicable
**Dependencies:** None
**Lint and test**: `make format` and `make lint` was done
2024-03-08 09:48:04 -08:00
Bagatur
3e29c04213 core[minor]: add BaseMessage.response_metadata (#18699) 2024-03-08 09:35:56 -08:00
standby24x7
67d48ea600 docs:Update function "run" to "invoke" in llm_bash.ipynb (#18663)
This path updates function "run" to "invoke" in llm_bash.ipynb. 
Without this path, you see following warning.

LangChainDeprecationWarning: The function `run` was deprecated in
LangChain 0.1.0
and will be removed in 0.2.0. Use invoke instead.

Signed-off-by: Masanari Iida <standby24x7@gmail.com>
2024-03-08 09:35:36 -08:00
Bagatur
bc6249c889 langchain[patch]: runnable agent streaming param (#18761)
Usage:

```python
agent = RunnableAgent(runnable=runnable, .., stream_runnable=False)
```
or for convenience
```python
agent_executor = AgentExecutor(agent=agent, ..., stream_runnable=False)
```
2024-03-07 20:53:53 -08:00
Tomaz Bratanic
c8c592d3f1 experimental[minor]: Add LLM graph transformer (#18733)
Add a class that constructs knowledge graphs based on text using an LLM.
2024-03-07 20:52:53 -08:00
Phat Vo
3ecb903d49 community[patch] : Tidy up and update Clarifai SDK functions (#18314)
Description :
* Tidy up, add missing docstring and fix unused params
* Enable using session token
2024-03-07 19:47:44 -08:00
Paul Sanders
93b87f2bfb docs: Fix typo (#18545)
Fixing a minor typo in the package name.

Thank you for contributing to LangChain!

- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core,
experimental, etc. is being modified. Use "docs: ..." for purely docs
changes, "templates: ..." for template changes, "infra: ..." for CI
changes.
  - Example: "community: add foobar LLM"


- [ ] **PR message**: ***Delete this entire checklist*** and replace
with
    - **Description:** a description of the change
    - **Issue:** the issue # it fixes, if applicable
    - **Dependencies:** any dependencies required for this change
- **Twitter handle:** if your PR gets announced, and you'd like a
mention, we'll gladly shout you out!


- [ ] **Add tests and docs**: If you're adding a new integration, please
include
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in
`docs/docs/integrations` directory.


- [ ] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:
- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional
ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in
langchain.

If no one reviews your PR within a few days, please @-mention one of
baskaryan, efriis, eyurtsev, hwchase17.
2024-03-07 19:40:42 -08:00
Aaron Jimenez
fcf6213c22 docs: Fix link to HF TEI in text_embeddings_inference.ipynb (#18682)
- [ ] **PR title:** docs: Fix link to HF TEI in
text_embeddings_inference.ipynb
 
- [ ] **PR message:**

- **Description:** Fix the link to [Hugging Face Text Embeddings
Inference
(TEI)](https://huggingface.co/docs/text-embeddings-inference/index) in
text_embeddings_inference.ipynb
   - **Issue:** Fix #18576
2024-03-07 19:38:39 -08:00
Max Jakob
61a2eba081 elasticsearch[patch]: add top-level import, remove obsolete dependency (#18644)
Make `ElasticsearchRetriever` available as top-level import.

The `langchain` package depends on `langchain-community` so we do not
need to depend on it explicitly.
2024-03-07 19:38:31 -08:00
Averi Kitsch
8accee57a9 docs: update Google Cloud database integration docs (#18711)
**Description:** update Google Cloud database integration docs
 **Issue:** NA
**Dependencies:** NA
2024-03-07 19:36:00 -08:00
Tomaz Bratanic
010a234f1e docs: Fix diffbot graph transformer description (#18736)
The previous docstring was invalid
2024-03-07 19:25:41 -08:00
Jan Nissen
b8922480ed core[patch]: improve PydanticOutputParser typing (#18740)
This PR adds generic typing to `PydanticOutputParser` so we get a typed
output from `.parse` instead of `Any`. It should provide a better DX by
way of Intellisense and for anyone strictly typing.

Pre-change:

![Screenshot 2024-03-07 at 10 22
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/fd22dde0-9fdc-4283-b283-4c98f0bc46e5)

Post-change:

![Screenshot 2024-03-07 at 10 26
31 AM](https://github.com/langchain-ai/langchain/assets/22690160/7e23d2b7-8f8c-494f-80b3-187530a173ee)

I haven't dug too deep, but I think a similar change could probably be
added to `JsonOutputParser` so we don't have to pull up `.parse`.

Co-authored-by: Jan Nissen <jan23@gmail.com>
2024-03-07 19:25:24 -08:00
Massimiliano Pronesti
3b975c6ebe experimental[minor]: add support for modin in pandas agent (#18749)
Added support for Intel's
[modin](https://github.com/modin-project/modin) in
`create_pandas_dataframe_agent`.
2024-03-07 19:23:07 -08:00
Tomaz Bratanic
4bfe888717 comunity[patch]: Fix neo4j sanitizing values (#18750)
Fixing sanitization for when deeply nested lists appear
2024-03-07 19:21:52 -08:00
Ian
7f504c1f81 docs: Improve the tidb vector store notebook (#18773)
Remove redundant useless content, and fix some minor oversight
2024-03-07 19:15:55 -08:00
Eugene Yurtsev
6caceb5473 core[patch]: Automatic upgrade to AddableDict in transform and atransform (#18743)
Automatic upgrade to transform and atransform

Closes: 

https://github.com/langchain-ai/langchain/issues/18741
https://github.com/langchain-ai/langgraph/issues/136
https://github.com/langchain-ai/langserve/issues/504
2024-03-07 21:23:12 -05:00
Yunmo Koo
fee6f983ef community[minor]: Integration for Friendli LLM and ChatFriendli ChatModel. (#17913)
## Description
- Add [Friendli](https://friendli.ai/) integration for `Friendli` LLM
and `ChatFriendli` chat model.
- Unit tests and integration tests corresponding to this change are
added.
- Documentations corresponding to this change are added.

## Dependencies
- Optional dependency
[`friendli-client`](https://pypi.org/project/friendli-client/) package
is added only for those who use `Frienldi` or `ChatFriendli` model.

## Twitter handle
- https://twitter.com/friendliai
2024-03-08 02:20:47 +00:00
Smit Parmar
aed46cd6f2 community[patch]: Added support for filter out AWS Kendra search by score confidence (#12920)
**Description:** It will add support for filter out kendra search by
score confidence which will make result more accurate.
    For example
   ```
retriever = AmazonKendraRetriever(
        index_id=kendra_index_id, top_k=5, region_name=region,
        score_confidence="HIGH"
    )
```
Result will not include the records which has score confidence "LOW" or "MEDIUM". 
Relevant docs 
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/query.html
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/kendra/client/retrieve.html

 **Issue:** the issue # it resolve #11801 
**twitter:** [@SmitCode](https://twitter.com/SmitCode)
2024-03-07 17:28:09 -08:00
Ian
390ef6abe3 community[minor]: Add Initial Support for TiDB Vector Store (#15796)
This pull request introduces initial support for the TiDB vector store.
The current version is basic, laying the foundation for the vector store
integration. While this implementation provides the essential features,
we plan to expand and improve the TiDB vector store support with
additional enhancements in future updates.

Upcoming Enhancements:
* Support for Vector Index Creation: To enhance the efficiency and
performance of the vector store.
* Support for max marginal relevance search. 
* Customized Table Structure Support: Recognizing the need for
flexibility, we plan for more tailored and efficient data store
solutions.

Simple use case exmaple

```python
from typing import List, Tuple
from langchain.docstore.document import Document
from langchain_community.vectorstores import TiDBVectorStore
from langchain_openai import OpenAIEmbeddings

db = TiDBVectorStore.from_texts(
    embedding=embeddings,
    texts=['Andrew like eating oranges', 'Alexandra is from England', 'Ketanji Brown Jackson is a judge'],
    table_name="tidb_vector_langchain",
    connection_string=tidb_connection_url,
    distance_strategy="cosine",
)

query = "Can you tell me about Alexandra?"
docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)
```
2024-03-07 17:18:20 -08:00
Bagatur
3b1eb1f828 community[patch]: chat hf typing fix (#18693) 2024-03-07 17:06:38 -08:00
Eugene Yurtsev
1e1cac50d8 Docs: remove sales from security (#18762)
Remove sales from security
2024-03-07 17:35:46 -05:00
Jib
d60e93b6ae langchain-mongodb: Standardize mongodb collection/index names in tests (#18755)
## **Description:**
MongoDB integration tests link to a provided Atlas Cluster. We have very
stringent permissions set against the cluster provided. In order to make
it easier to track and isolate the collections each test gets run
against, we've updated the collection names to map the test file name.
i.e. `langchain_{filename}` => `langchain_test_vectorstores`

Fixes integration test results

![image](https://github.com/langchain-ai/langchain/assets/2887713/41f911b9-55f7-4fe4-9134-5514b82009f9)

## **Dependencies:** 
Provided MONGODB_ATLAS_URI

- [x] **Lint and test**: Run `make format`, `make lint` and `make test`
from the root of the package(s) you've modified. See contribution
guidelines for more: https://python.langchain.com/docs/contributing/

cc: @shaneharvey, @blink1073 , @NoahStapp , @caseyclements
2024-03-07 17:16:04 -05:00
Eugene Yurtsev
ca299a8e08 Docs: Add custom parsing documentation and extending langchain (#18331)
* Added extending langchain.mdx -- we'll need to add links as we add
more custom documentation
* Added partial documentation about parsers
2024-03-07 16:30:57 -05:00
Eugene Yurtsev
8c71f92cb2 core: upgrade mypy to recent mypy (#18753)
Testing this works per package on CI
2024-03-07 15:25:19 -05:00
Eugene Yurtsev
e188d4ecb0 Add dangerous parameter to requests tool (#18697)
The tools are already documented as dangerous. Not clear whether adding
an opt-in parameter is necessary or not
2024-03-07 15:10:56 -05:00
Leonid Ganeline
dad949eb99 docs: update imports of adapters to use langchain_community (#18751)
Updated imports from `langchain` to `langchain_community`
2024-03-07 15:04:25 -05:00
Erick Friis
fcaa9cf2f1 community[patch]: deprecate community anthropic (#18745) 2024-03-07 13:51:55 -05:00
Erick Friis
1beb84b061 community[patch]: move pdf text tests to integration (#18746) 2024-03-07 10:34:22 -08:00
Christophe Bornet
4a7d73b39d community: If load() has been overridden, use it in default lazy_load() (#18690) 2024-03-07 11:52:19 -05:00
Christophe Bornet
6cd7607816 community[patch]: Implement lazy_load() for MHTMLLoader (#18648)
Covered by `tests/unit_tests/document_loaders/test_mhtml.py`
2024-03-07 11:50:18 -05:00
axiangcoding
9745b5894d community[patch]: Chroma use uuid4 instead of uuid1 to generate random ids (#18723)
- **Description:** Chroma use uuid4 instead of uuid1 as random ids. Use
uuid1 may leak mac address, changing to uuid4 will not cause other
effects.
  - **Issue:** None
  - **Dependencies:** None
  - **Twitter handle:** None
2024-03-07 11:48:25 -05:00
Leonid Ganeline
1af2130ff7 docs: update imports of tools to use langchain_community (#18705)
Updated imports from `langchain` to `langchain_community`.
2024-03-07 11:46:09 -05:00
Guangdong Liu
ced5e7bae7 community[patch]: Fix sparkllm authentication problem. (#18651)
- **Description:** fix sparkllm authentication problem.The current
timestamp is in RFC1123 format. The time deviation must be controlled
within 300s. I changed to re-obtain the url every time I ask a question.
https://www.xfyun.cn/doc/spark/general_url_authentication.html#_1-2-%E9%89%B4%E6%9D%83%E5%8F%82%E6%95%B0
2024-03-06 18:43:16 -08:00
Erick Friis
89d32ffbbd community[patch]: release 0.0.27 (#18708) 2024-03-07 01:08:43 +00:00
Erick Friis
c09b520ce4 core[patch]: release 0.1.30 (#18706) 2024-03-06 16:12:18 -08:00
Piyush Jain
2b234a4d96 Support for claude v3 models. (#18630)
Fixes #18513.

## Description
This PR attempts to fix the support for Anthropic Claude v3 models in
BedrockChat LLM. The changes here has updated the payload to use the
`messages` format instead of the formatted text prompt for all models;
`messages` API is backwards compatible with all models in Anthropic, so
this should not break the experience for any models.


## Notes
The PR in the current form does not support the v3 models for the
non-chat Bedrock LLM. This means, that with these changes, users won't
be able to able to use the v3 models with the Bedrock LLM. I can open a
separate PR to tackle this use-case, the intent here was to get this out
quickly, so users can start using and test the chat LLM. The Bedrock LLM
classes have also grown complex with a lot of conditions to support
various providers and models, and is ripe for a refactor to make future
changes more palatable. This refactor is likely to take longer, and
requires more thorough testing from the community. Credit to PRs
[18579](https://github.com/langchain-ai/langchain/pull/18579) and
[18548](https://github.com/langchain-ai/langchain/pull/18548) for some
of the code here.

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2024-03-06 15:46:18 -08:00
Sam Khano
1b4dcf22f3 community[minor]: Add DocumentDBVectorSearch VectorStore (#17757)
**Description:**
- Added Amazon DocumentDB Vector Search integration (HNSW index)
- Added integration tests
- Updated AWS documentation with DocumentDB Vector Search instructions
- Added notebook for DocumentDB integration with example usage

---------

Co-authored-by: EC2 Default User <ec2-user@ip-172-31-95-226.ec2.internal>
2024-03-06 15:11:34 -08:00
Vittorio Rigamonti
51f3902bc4 community[minor]: Adding support for Infinispan as VectorStore (#17861)
**Description:**
This integrates Infinispan as a vectorstore.
Infinispan is an open-source key-value data grid, it can work as single
node as well as distributed.

Vector search is supported since release 15.x 

For more: [Infinispan Home](https://infinispan.org)

Integration tests are provided as well as a demo notebook
2024-03-06 15:11:02 -08:00
Max Jakob
cca0167917 elasticsearch[patch], community[patch]: update references, deprecate community classes (#18506)
Follow up on https://github.com/langchain-ai/langchain/pull/17467.

- Update all references to the Elasticsearch classes to use the partners
package.
- Deprecate community classes.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-03-06 15:09:12 -08:00
José Luis Di Biase
6041ec3dd1 templates: rag-multi-modal typo, replace serch with search (#18519)
Thank you for contributing to LangChain!

- [x] **PR title**: "templates: rag-multi-modal typo, replace serch with
search "
- **Description:** Two little typos in multi modal templates (replace
serch string with search)

Signed-off-by: José Luis Di Biase <josx@interorganic.com.ar>
2024-03-06 15:08:55 -08:00
Djordje
12b4a4d860 community[patch]: Opensearch delete method added - indexing supported (#18522)
- **Description:** Added delete method for OpenSearchVectorSearch,
therefore indexing supported
    - **Issue:** No
    - **Dependencies:** No
    - **Twitter handle:** stkbmf
2024-03-06 15:08:47 -08:00
7068 changed files with 528439 additions and 581303 deletions

View File

@@ -10,7 +10,7 @@ You can use the dev container configuration in this folder to build and run the
You may use the button above, or follow these steps to open this repo in a Codespace:
1. Click the **Code** drop-down menu at the top of https://github.com/langchain-ai/langchain.
1. Click on the **Codespaces** tab.
1. Click **Create codespace on master** .
1. Click **Create codespace on master**.
For more info, check out the [GitHub documentation](https://docs.github.com/en/free-pro-team@latest/github/developing-online-with-codespaces/creating-a-codespace#creating-a-codespace).

View File

@@ -12,7 +12,7 @@
// The optional 'workspaceFolder' property is the path VS Code should open by default when
// connected. This is typically a file mount in .devcontainer/docker-compose.yml
"workspaceFolder": "/workspaces/${localWorkspaceFolderBasename}",
"workspaceFolder": "/workspaces/langchain",
// Prevent the container from shutting down
"overrideCommand": true

View File

@@ -5,10 +5,10 @@ services:
dockerfile: libs/langchain/dev.Dockerfile
context: ..
volumes:
# Update this to wherever you want VS Code to mount the folder of your project
- ..:/workspaces:cached
# Update this to wherever you want VS Code to mount the folder of your project
- ..:/workspaces/langchain:cached
networks:
- langchain-network
- langchain-network
# environment:
# MONGO_ROOT_USERNAME: root
# MONGO_ROOT_PASSWORD: example123
@@ -28,5 +28,3 @@ services:
networks:
langchain-network:
driver: bridge

2
.github/CODEOWNERS vendored Normal file
View File

@@ -0,0 +1,2 @@
/.github/ @efriis @baskaryan @ccurme
/libs/packages.yml @efriis

View File

@@ -96,25 +96,21 @@ body:
attributes:
label: System Info
description: |
Please share your system info with us.
Please share your system info with us. Do NOT skip this step and please don't trim
the output. Most users don't include enough information here and it makes it harder
for us to help you.
"pip freeze | grep langchain"
platform (windows / linux / mac)
python version
OR if you're on a recent version of langchain-core you can paste the output of:
Run the following command in your terminal and paste the output here:
python -m langchain_core.sys_info
or if you have an existing python interpreter running:
from langchain_core import sys_info
sys_info.print_sys_info()
alternatively, put the entire output of `pip freeze` here.
placeholder: |
"pip freeze | grep langchain"
platform
python version
Alternatively, if you're on a recent version of langchain-core you can paste the output of:
python -m langchain_core.sys_info
These will only surface LangChain packages, don't forget to include any other relevant
packages you're using (if you're not sure what's relevant, you can paste the entire output of `pip freeze`).
validations:
required: true

View File

@@ -4,9 +4,6 @@ contact_links:
- name: 🤔 Question or Problem
about: Ask a question or ask about a problem in GitHub Discussions.
url: https://www.github.com/langchain-ai/langchain/discussions/categories/q-a
- name: Discord
url: https://discord.gg/6adMQxSpJS
about: General community discussions
- name: Feature Request
url: https://www.github.com/langchain-ai/langchain/discussions/categories/ideas
about: Suggest a feature or an idea

View File

@@ -26,6 +26,13 @@ body:
[LangChain Github Discussions](https://github.com/langchain-ai/langchain/discussions),
[LangChain Github Issues](https://github.com/langchain-ai/langchain/issues?q=is%3Aissue),
[LangChain ChatBot](https://chat.langchain.com/)
- type: input
id: url
attributes:
label: URL
description: URL to documentation
validations:
required: false
- type: checkboxes
id: checks
attributes:
@@ -48,4 +55,4 @@ body:
label: "Idea or request for content:"
description: >
Please describe as clearly as possible what topics you think are missing
from the current documentation.
from the current documentation.

View File

@@ -1,7 +1,7 @@
Thank you for contributing to LangChain!
- [ ] **PR title**: "package: description"
- Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes.
- Where "package" is whichever of langchain, community, core, etc. is being modified. Use "docs: ..." for purely docs changes, "infra: ..." for CI changes.
- Example: "community: add foobar LLM"
@@ -26,4 +26,4 @@ Additional guidelines:
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in langchain.
If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, hwchase17.
If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

View File

@@ -350,11 +350,7 @@ def get_graphql_pr_edges(*, settings: Settings, after: Union[str, None] = None):
print("Querying PRs...")
else:
print(f"Querying PRs with cursor {after}...")
data = get_graphql_response(
settings=settings,
query=prs_query,
after=after
)
data = get_graphql_response(settings=settings, query=prs_query, after=after)
graphql_response = PRsResponse.model_validate(data)
return graphql_response.data.repository.pullRequests.edges
@@ -484,10 +480,16 @@ def get_contributors(settings: Settings):
lines_changed = pr.additions + pr.deletions
score = _logistic(files_changed, 20) + _logistic(lines_changed, 100)
contributor_scores[pr.author.login] += score
three_months_ago = (datetime.now(timezone.utc) - timedelta(days=3*30))
three_months_ago = datetime.now(timezone.utc) - timedelta(days=3 * 30)
if pr.createdAt > three_months_ago:
recent_contributor_scores[pr.author.login] += score
return contributors, contributor_scores, recent_contributor_scores, reviewers, authors
return (
contributors,
contributor_scores,
recent_contributor_scores,
reviewers,
authors,
)
def get_top_users(
@@ -524,9 +526,13 @@ if __name__ == "__main__":
# question_commentors, question_last_month_commentors, question_authors = get_experts(
# settings=settings
# )
contributors, contributor_scores, recent_contributor_scores, reviewers, pr_authors = get_contributors(
settings=settings
)
(
contributors,
contributor_scores,
recent_contributor_scores,
reviewers,
pr_authors,
) = get_contributors(settings=settings)
# authors = {**question_authors, **pr_authors}
authors = {**pr_authors}
maintainers_logins = {
@@ -537,7 +543,9 @@ if __name__ == "__main__":
"nfcampos",
"efriis",
"eyurtsev",
"rlancemartin"
"rlancemartin",
"ccurme",
"vbarda",
}
hidden_logins = {
"dev2049",
@@ -545,6 +553,7 @@ if __name__ == "__main__":
"obi1kenobi",
"langchain-infra",
"jacoblee93",
"isahers1",
"dqbd",
"bracesproul",
"akira",
@@ -556,7 +565,7 @@ if __name__ == "__main__":
maintainers.append(
{
"login": login,
"count": contributors[login], #+ question_commentors[login],
"count": contributors[login], # + question_commentors[login],
"avatarUrl": user.avatarUrl,
"twitterUsername": user.twitterUsername,
"url": user.url,
@@ -612,9 +621,7 @@ if __name__ == "__main__":
new_people_content = yaml.dump(
people, sort_keys=False, width=200, allow_unicode=True
)
if (
people_old_content == new_people_content
):
if people_old_content == new_people_content:
logging.info("The LangChain People data hasn't changed, finishing.")
sys.exit(0)
people_path.write_text(new_people_content, encoding="utf-8")
@@ -627,9 +634,7 @@ if __name__ == "__main__":
logging.info(f"Creating a new branch {branch_name}")
subprocess.run(["git", "checkout", "-B", branch_name], check=True)
logging.info("Adding updated file")
subprocess.run(
["git", "add", str(people_path)], check=True
)
subprocess.run(["git", "add", str(people_path)], check=True)
logging.info("Committing updated file")
message = "👥 Update LangChain people data"
result = subprocess.run(["git", "commit", "-m", message], check=True)
@@ -638,4 +643,4 @@ if __name__ == "__main__":
logging.info("Creating PR")
pr = repo.create_pull(title=message, body=message, base="master", head=branch_name)
logging.info(f"Created PR: {pr.number}")
logging.info("Finished")
logging.info("Finished")

View File

@@ -1,16 +1,239 @@
import glob
import json
import sys
import os
from typing import Dict
import sys
from collections import defaultdict
from typing import Dict, List, Set
from pathlib import Path
import tomllib
from get_min_versions import get_min_version_from_toml
LANGCHAIN_DIRS = [
"libs/core",
"libs/text-splitters",
"libs/community",
"libs/langchain",
"libs/experimental",
"libs/community",
]
# when set to True, we are ignoring core dependents
# in order to be able to get CI to pass for each individual
# package that depends on core
# e.g. if you touch core, we don't then add textsplitters/etc to CI
IGNORE_CORE_DEPENDENTS = False
# ignored partners are removed from dependents
# but still run if directly edited
IGNORED_PARTNERS = [
# remove huggingface from dependents because of CI instability
# specifically in huggingface jobs
# https://github.com/langchain-ai/langchain/issues/25558
"huggingface",
]
# Cap python version at 3.12 for some packages with dependencies that are not yet
# compatible with python 3.13 (mostly hf tokenizers).
PY_312_MAX_PACKAGES = [
f"libs/partners/{integration}"
for integration in [
"chroma",
"couchbase",
"huggingface",
"mistralai",
"nomic",
"qdrant",
]
]
def all_package_dirs() -> Set[str]:
return {
"/".join(path.split("/")[:-1]).lstrip("./")
for path in glob.glob("./libs/**/pyproject.toml", recursive=True)
if "libs/cli" not in path and "libs/standard-tests" not in path
}
def dependents_graph() -> dict:
"""
Construct a mapping of package -> dependents, such that we can
run tests on all dependents of a package when a change is made.
"""
dependents = defaultdict(set)
for path in glob.glob("./libs/**/pyproject.toml", recursive=True):
if "template" in path:
continue
# load regular and test deps from pyproject.toml
with open(path, "rb") as f:
pyproject = tomllib.load(f)["tool"]["poetry"]
pkg_dir = "libs" + "/".join(path.split("libs")[1].split("/")[:-1])
for dep in [
*pyproject["dependencies"].keys(),
*pyproject["group"]["test"]["dependencies"].keys(),
]:
if "langchain" in dep:
dependents[dep].add(pkg_dir)
continue
# load extended deps from extended_testing_deps.txt
package_path = Path(path).parent
extended_requirement_path = package_path / "extended_testing_deps.txt"
if extended_requirement_path.exists():
with open(extended_requirement_path, "r") as f:
extended_deps = f.read().splitlines()
for depline in extended_deps:
if depline.startswith("-e "):
# editable dependency
assert depline.startswith(
"-e ../partners/"
), "Extended test deps should only editable install partner packages"
partner = depline.split("partners/")[1]
dep = f"langchain-{partner}"
else:
dep = depline.split("==")[0]
if "langchain" in dep:
dependents[dep].add(pkg_dir)
for k in dependents:
for partner in IGNORED_PARTNERS:
if f"libs/partners/{partner}" in dependents[k]:
dependents[k].remove(f"libs/partners/{partner}")
return dependents
def add_dependents(dirs_to_eval: Set[str], dependents: dict) -> List[str]:
updated = set()
for dir_ in dirs_to_eval:
# handle core manually because it has so many dependents
if "core" in dir_:
updated.add(dir_)
continue
pkg = "langchain-" + dir_.split("/")[-1]
updated.update(dependents[pkg])
updated.add(dir_)
return list(updated)
def _get_configs_for_single_dir(job: str, dir_: str) -> List[Dict[str, str]]:
if job == "test-pydantic":
return _get_pydantic_test_configs(dir_)
if dir_ == "libs/core":
py_versions = ["3.9", "3.10", "3.11", "3.12", "3.13"]
# custom logic for specific directories
elif dir_ == "libs/partners/milvus":
# milvus poetry doesn't allow 3.12 because they
# declare deps in funny way
py_versions = ["3.9", "3.11"]
elif dir_ in PY_312_MAX_PACKAGES:
py_versions = ["3.9", "3.12"]
elif dir_ in ["libs/community", "libs/langchain"] and job == "extended-tests":
# community extended test resolution in 3.12 is slow
# even in uv
py_versions = ["3.9", "3.11"]
elif dir_ == "libs/community" and job == "compile-integration-tests":
# community integration deps are slow in 3.12
py_versions = ["3.9", "3.11"]
elif dir_ == ".":
# unable to install with 3.13 because tokenizers doesn't support 3.13 yet
py_versions = ["3.9", "3.12"]
else:
py_versions = ["3.9", "3.13"]
return [{"working-directory": dir_, "python-version": py_v} for py_v in py_versions]
def _get_pydantic_test_configs(
dir_: str, *, python_version: str = "3.11"
) -> List[Dict[str, str]]:
with open("./libs/core/poetry.lock", "rb") as f:
core_poetry_lock_data = tomllib.load(f)
for package in core_poetry_lock_data["package"]:
if package["name"] == "pydantic":
core_max_pydantic_minor = package["version"].split(".")[1]
break
with open(f"./{dir_}/poetry.lock", "rb") as f:
dir_poetry_lock_data = tomllib.load(f)
for package in dir_poetry_lock_data["package"]:
if package["name"] == "pydantic":
dir_max_pydantic_minor = package["version"].split(".")[1]
break
core_min_pydantic_version = get_min_version_from_toml(
"./libs/core/pyproject.toml", "release", python_version, include=["pydantic"]
)["pydantic"]
core_min_pydantic_minor = (
core_min_pydantic_version.split(".")[1]
if "." in core_min_pydantic_version
else "0"
)
dir_min_pydantic_version = get_min_version_from_toml(
f"./{dir_}/pyproject.toml", "release", python_version, include=["pydantic"]
).get("pydantic", "0.0.0")
dir_min_pydantic_minor = (
dir_min_pydantic_version.split(".")[1]
if "." in dir_min_pydantic_version
else "0"
)
custom_mins = {
# depends on pydantic-settings 2.4 which requires pydantic 2.7
"libs/community": 7,
}
max_pydantic_minor = min(
int(dir_max_pydantic_minor),
int(core_max_pydantic_minor),
)
min_pydantic_minor = max(
int(dir_min_pydantic_minor),
int(core_min_pydantic_minor),
custom_mins.get(dir_, 0),
)
configs = [
{
"working-directory": dir_,
"pydantic-version": f"2.{v}.0",
"python-version": python_version,
}
for v in range(min_pydantic_minor, max_pydantic_minor + 1)
]
return configs
def _get_configs_for_multi_dirs(
job: str, dirs_to_run: Dict[str, Set[str]], dependents: dict
) -> List[Dict[str, str]]:
if job == "lint":
dirs = add_dependents(
dirs_to_run["lint"] | dirs_to_run["test"] | dirs_to_run["extended-test"],
dependents,
)
elif job in ["test", "compile-integration-tests", "dependencies", "test-pydantic"]:
dirs = add_dependents(
dirs_to_run["test"] | dirs_to_run["extended-test"], dependents
)
elif job == "extended-tests":
dirs = list(dirs_to_run["extended-test"])
else:
raise ValueError(f"Unknown job: {job}")
return [
config for dir_ in dirs for config in _get_configs_for_single_dir(job, dir_)
]
if __name__ == "__main__":
files = sys.argv[1:]
@@ -19,10 +242,13 @@ if __name__ == "__main__":
"test": set(),
"extended-test": set(),
}
docs_edited = False
if len(files) == 300:
if len(files) >= 300:
# max diff length is 300 files - there are likely files missing
raise ValueError("Max diff reached. Please manually run CI on changed libs.")
dirs_to_run["lint"] = all_package_dirs()
dirs_to_run["test"] = all_package_dirs()
dirs_to_run["extended-test"] = set(LANGCHAIN_DIRS)
for file in files:
if any(
@@ -41,12 +267,29 @@ if __name__ == "__main__":
if any(file.startswith(dir_) for dir_ in LANGCHAIN_DIRS):
# add that dir and all dirs after in LANGCHAIN_DIRS
# for extended testing
found = False
for dir_ in LANGCHAIN_DIRS:
if dir_ == "libs/core" and IGNORE_CORE_DEPENDENTS:
dirs_to_run["extended-test"].add(dir_)
continue
if file.startswith(dir_):
found = True
if found:
dirs_to_run["extended-test"].add(dir_)
elif file.startswith("libs/standard-tests"):
# TODO: update to include all packages that rely on standard-tests (all partner packages)
# note: won't run on external repo partners
dirs_to_run["lint"].add("libs/standard-tests")
dirs_to_run["test"].add("libs/partners/mistralai")
dirs_to_run["test"].add("libs/partners/openai")
dirs_to_run["test"].add("libs/partners/anthropic")
dirs_to_run["test"].add("libs/partners/fireworks")
dirs_to_run["test"].add("libs/partners/groq")
elif file.startswith("libs/cli"):
# todo: add cli makefile
pass
elif file.startswith("libs/partners"):
partner_dir = file.split("/")[2]
if os.path.isdir(f"libs/partners/{partner_dir}") and [
@@ -56,21 +299,37 @@ if __name__ == "__main__":
] != ["README.md"]:
dirs_to_run["test"].add(f"libs/partners/{partner_dir}")
# Skip if the directory was deleted or is just a tombstone readme
elif file == "libs/packages.yml":
continue
elif file.startswith("libs/"):
raise ValueError(
f"Unknown lib: {file}. check_diff.py likely needs "
"an update for this new library!"
)
elif any(file.startswith(p) for p in ["docs/", "templates/", "cookbook/"]):
elif any(file.startswith(p) for p in ["docs/", "cookbook/"]):
if file.startswith("docs/"):
docs_edited = True
dirs_to_run["lint"].add(".")
outputs = {
"dirs-to-lint": list(
dirs_to_run["lint"] | dirs_to_run["test"] | dirs_to_run["extended-test"]
),
"dirs-to-test": list(dirs_to_run["test"] | dirs_to_run["extended-test"]),
"dirs-to-extended-test": list(dirs_to_run["extended-test"]),
dependents = dependents_graph()
# we now have dirs_by_job
# todo: clean this up
map_job_to_configs = {
job: _get_configs_for_multi_dirs(job, dirs_to_run, dependents)
for job in [
"lint",
"test",
"extended-tests",
"compile-integration-tests",
"dependencies",
"test-pydantic",
]
}
for key, value in outputs.items():
map_job_to_configs["test-doc-imports"] = (
[{"python-version": "3.12"}] if docs_edited else []
)
for key, value in map_job_to_configs.items():
json_output = json.dumps(value)
print(f"{key}={json_output}") # noqa: T201
print(f"{key}={json_output}")

View File

@@ -0,0 +1,35 @@
import sys
import tomllib
if __name__ == "__main__":
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
# read toml file
with open(toml_file, "rb") as file:
toml_data = tomllib.load(file)
# see if we're releasing an rc
version = toml_data["tool"]["poetry"]["version"]
releasing_rc = "rc" in version or "dev" in version
# if not, iterate through dependencies and make sure none allow prereleases
if not releasing_rc:
dependencies = toml_data["tool"]["poetry"]["dependencies"]
for lib in dependencies:
dep_version = dependencies[lib]
dep_version_string = (
dep_version["version"] if isinstance(dep_version, dict) else dep_version
)
if "rc" in dep_version_string:
raise ValueError(
f"Dependency {lib} has a prerelease version. Please remove this."
)
if isinstance(dep_version, dict) and dep_version.get(
"allow-prereleases", False
):
raise ValueError(
f"Dependency {lib} has allow-prereleases set to true. Please remove this."
)

View File

@@ -1,35 +1,105 @@
import sys
from typing import Optional
if sys.version_info >= (3, 11):
import tomllib
else:
# for python 3.10 and below, which doesnt have stdlib tomllib
import tomli as tomllib
from packaging.specifiers import SpecifierSet
from packaging.version import Version
import requests
from packaging.version import parse
from typing import List
import tomllib
from packaging.version import parse as parse_version
import re
MIN_VERSION_LIBS = ["langchain-core", "langchain-community", "langchain", "langchain-text-splitters"]
MIN_VERSION_LIBS = [
"langchain-core",
"langchain-community",
"langchain",
"langchain-text-splitters",
"SQLAlchemy",
]
# some libs only get checked on release because of simultaneous changes in
# multiple libs
SKIP_IF_PULL_REQUEST = [
"langchain-core",
"langchain-text-splitters",
"langchain",
"langchain-community",
]
def get_min_version(version: str) -> str:
# case ^x.x.x
_match = re.match(r"^\^(\d+(?:\.\d+){0,2})$", version)
if _match:
return _match.group(1)
def get_pypi_versions(package_name: str) -> List[str]:
"""
Fetch all available versions for a package from PyPI.
# case >=x.x.x,<y.y.y
_match = re.match(r"^>=(\d+(?:\.\d+){0,2}),<(\d+(?:\.\d+){0,2})$", version)
if _match:
_min = _match.group(1)
_max = _match.group(2)
assert parse_version(_min) < parse_version(_max)
return _min
Args:
package_name (str): Name of the package
# case x.x.x
_match = re.match(r"^(\d+(?:\.\d+){0,2})$", version)
if _match:
return _match.group(1)
Returns:
List[str]: List of all available versions
raise ValueError(f"Unrecognized version format: {version}")
Raises:
requests.exceptions.RequestException: If PyPI API request fails
KeyError: If package not found or response format unexpected
"""
pypi_url = f"https://pypi.org/pypi/{package_name}/json"
response = requests.get(pypi_url)
response.raise_for_status()
return list(response.json()["releases"].keys())
def get_min_version_from_toml(toml_path: str):
def get_minimum_version(package_name: str, spec_string: str) -> Optional[str]:
"""
Find the minimum published version that satisfies the given constraints.
Args:
package_name (str): Name of the package
spec_string (str): Version specification string (e.g., ">=0.2.43,<0.4.0,!=0.3.0")
Returns:
Optional[str]: Minimum compatible version or None if no compatible version found
"""
# rewrite occurrences of ^0.0.z to 0.0.z (can be anywhere in constraint string)
spec_string = re.sub(r"\^0\.0\.(\d+)", r"0.0.\1", spec_string)
# rewrite occurrences of ^0.y.z to >=0.y.z,<0.y+1 (can be anywhere in constraint string)
for y in range(1, 10):
spec_string = re.sub(rf"\^0\.{y}\.(\d+)", rf">=0.{y}.\1,<0.{y+1}", spec_string)
# rewrite occurrences of ^x.y.z to >=x.y.z,<x+1.0.0 (can be anywhere in constraint string)
for x in range(1, 10):
spec_string = re.sub(
rf"\^{x}\.(\d+)\.(\d+)", rf">={x}.\1.\2,<{x+1}", spec_string
)
spec_set = SpecifierSet(spec_string)
all_versions = get_pypi_versions(package_name)
valid_versions = []
for version_str in all_versions:
try:
version = parse(version_str)
if spec_set.contains(version):
valid_versions.append(version)
except ValueError:
continue
return str(min(valid_versions)) if valid_versions else None
def get_min_version_from_toml(
toml_path: str,
versions_for: str,
python_version: str,
*,
include: Optional[list] = None,
):
# Parse the TOML file
with open(toml_path, "rb") as file:
toml_data = tomllib.load(file)
@@ -41,14 +111,29 @@ def get_min_version_from_toml(toml_path: str):
min_versions = {}
# Iterate over the libs in MIN_VERSION_LIBS
for lib in MIN_VERSION_LIBS:
for lib in set(MIN_VERSION_LIBS + (include or [])):
if versions_for == "pull_request" and lib in SKIP_IF_PULL_REQUEST:
# some libs only get checked on release because of simultaneous
# changes in multiple libs
continue
# Check if the lib is present in the dependencies
if lib in dependencies:
if include and lib not in include:
continue
# Get the version string
version_string = dependencies[lib]
if isinstance(version_string, dict):
version_string = version_string["version"]
if isinstance(version_string, list):
version_string = [
vs
for vs in version_string
if check_python_version(python_version, vs["python"])
][0]["version"]
# Use parse_version to get the minimum supported version from version_string
min_version = get_min_version(version_string)
min_version = get_minimum_version(lib, version_string)
# Store the minimum version in the min_versions dictionary
min_versions[lib] = min_version
@@ -56,12 +141,45 @@ def get_min_version_from_toml(toml_path: str):
return min_versions
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
def check_python_version(version_string, constraint_string):
"""
Check if the given Python version matches the given constraints.
# Call the function to get the minimum versions
min_versions = get_min_version_from_toml(toml_file)
:param version_string: A string representing the Python version (e.g. "3.8.5").
:param constraint_string: A string representing the package's Python version constraints (e.g. ">=3.6, <4.0").
:return: True if the version matches the constraints, False otherwise.
"""
print(
" ".join([f"{lib}=={version}" for lib, version in min_versions.items()])
) # noqa: T201
# rewrite occurrences of ^0.0.z to 0.0.z (can be anywhere in constraint string)
constraint_string = re.sub(r"\^0\.0\.(\d+)", r"0.0.\1", constraint_string)
# rewrite occurrences of ^0.y.z to >=0.y.z,<0.y+1.0 (can be anywhere in constraint string)
for y in range(1, 10):
constraint_string = re.sub(
rf"\^0\.{y}\.(\d+)", rf">=0.{y}.\1,<0.{y+1}.0", constraint_string
)
# rewrite occurrences of ^x.y.z to >=x.y.z,<x+1.0.0 (can be anywhere in constraint string)
for x in range(1, 10):
constraint_string = re.sub(
rf"\^{x}\.0\.(\d+)", rf">={x}.0.\1,<{x+1}.0.0", constraint_string
)
try:
version = Version(version_string)
constraints = SpecifierSet(constraint_string)
return version in constraints
except Exception as e:
print(f"Error: {e}")
return False
if __name__ == "__main__":
# Get the TOML file path from the command line argument
toml_file = sys.argv[1]
versions_for = sys.argv[2]
python_version = sys.argv[3]
assert versions_for in ["release", "pull_request"]
# Call the function to get the minimum versions
min_versions = get_min_version_from_toml(toml_file, versions_for, python_version)
print(" ".join([f"{lib}=={version}" for lib, version in min_versions.items()]))

87
.github/scripts/prep_api_docs_build.py vendored Normal file
View File

@@ -0,0 +1,87 @@
#!/usr/bin/env python
"""Script to sync libraries from various repositories into the main langchain repository."""
import os
import shutil
import yaml
from pathlib import Path
from typing import Dict, Any
def load_packages_yaml() -> Dict[str, Any]:
"""Load and parse the packages.yml file."""
with open("langchain/libs/packages.yml", "r") as f:
return yaml.safe_load(f)
def get_target_dir(package_name: str) -> Path:
"""Get the target directory for a given package."""
package_name_short = package_name.replace("langchain-", "")
base_path = Path("langchain/libs")
if package_name_short == "experimental":
return base_path / "experimental"
return base_path / "partners" / package_name_short
def clean_target_directories(packages: Dict[str, Any]) -> None:
"""Remove old directories that will be replaced."""
for package in packages["packages"]:
if package["repo"] != "langchain-ai/langchain":
target_dir = get_target_dir(package["name"])
if target_dir.exists():
print(f"Removing {target_dir}")
shutil.rmtree(target_dir)
def move_libraries(packages: Dict[str, Any]) -> None:
"""Move libraries from their source locations to the target directories."""
for package in packages["packages"]:
# Skip if it's the main langchain repo or disabled
if package["repo"] == "langchain-ai/langchain" or package.get(
"disabled", False
):
continue
repo_name = package["repo"].split("/")[1]
source_path = package["path"]
target_dir = get_target_dir(package["name"])
# Handle root path case
if source_path == ".":
source_dir = repo_name
else:
source_dir = f"{repo_name}/{source_path}"
print(f"Moving {source_dir} to {target_dir}")
# Ensure target directory exists
os.makedirs(os.path.dirname(target_dir), exist_ok=True)
try:
# Move the directory
shutil.move(source_dir, target_dir)
except Exception as e:
print(f"Error moving {source_dir} to {target_dir}: {e}")
def main():
"""Main function to orchestrate the library sync process."""
try:
# Load packages configuration
packages = load_packages_yaml()
# Clean target directories
clean_target_directories(packages)
# Move libraries to their new locations
move_libraries(packages)
print("Library sync completed successfully!")
except Exception as e:
print(f"Error during library sync: {e}")
raise
if __name__ == "__main__":
main()

7
.github/workflows/.codespell-exclude vendored Normal file
View File

@@ -0,0 +1,7 @@
libs/community/langchain_community/llms/yuan2.py
"NotIn": "not in",
- `/checkin`: Check-in
docs/docs/integrations/providers/trulens.mdx
self.assertIn(
from trulens_eval import Tru
tru = Tru()

View File

@@ -7,6 +7,10 @@ on:
required: true
type: string
description: "From which folder this pipeline executes"
python-version:
required: true
type: string
description: "Python version to use"
env:
POETRY_VERSION: "1.7.1"
@@ -17,21 +21,14 @@ jobs:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
name: "poetry run pytest -m compile tests/integration_tests #${{ matrix.python-version }}"
name: "poetry run pytest -m compile tests/integration_tests #${{ inputs.python-version }}"
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
- name: Set up Python ${{ inputs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
python-version: ${{ inputs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: compile-integration

View File

@@ -1,117 +0,0 @@
name: dependencies
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
langchain-location:
required: false
type: string
description: "Relative path to the langchain library folder"
env:
POETRY_VERSION: "1.7.1"
jobs:
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
name: dependency checks ${{ matrix.python-version }}
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: pydantic-cross-compat
- name: Install dependencies
shell: bash
run: poetry install
- name: Check imports with base dependencies
shell: bash
run: poetry run make check_imports
- name: Install test dependencies
shell: bash
run: poetry install --with test
- name: Install langchain editable
working-directory: ${{ inputs.working-directory }}
if: ${{ inputs.langchain-location }}
env:
LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
run: |
poetry run pip install -e "$LANGCHAIN_LOCATION"
- name: Install the opposite major version of pydantic
# If normal tests use pydantic v1, here we'll use v2, and vice versa.
shell: bash
# airbyte currently doesn't support pydantic v2
if: ${{ !startsWith(inputs.working-directory, 'libs/partners/airbyte') }}
run: |
# Determine the major part of pydantic version
REGULAR_VERSION=$(poetry run python -c "import pydantic; print(pydantic.__version__)" | cut -d. -f1)
if [[ "$REGULAR_VERSION" == "1" ]]; then
PYDANTIC_DEP=">=2.1,<3"
TEST_WITH_VERSION="2"
elif [[ "$REGULAR_VERSION" == "2" ]]; then
PYDANTIC_DEP="<2"
TEST_WITH_VERSION="1"
else
echo "Unexpected pydantic major version '$REGULAR_VERSION', cannot determine which version to use for cross-compatibility test."
exit 1
fi
# Install via `pip` instead of `poetry add` to avoid changing lockfile,
# which would prevent caching from working: the cache would get saved
# to a different key than where it gets loaded from.
poetry run pip install "pydantic${PYDANTIC_DEP}"
# Ensure that the correct pydantic is installed now.
echo "Checking pydantic version... Expecting ${TEST_WITH_VERSION}"
# Determine the major part of pydantic version
CURRENT_VERSION=$(poetry run python -c "import pydantic; print(pydantic.__version__)" | cut -d. -f1)
# Check that the major part of pydantic version is as expected, if not
# raise an error
if [[ "$CURRENT_VERSION" != "$TEST_WITH_VERSION" ]]; then
echo "Error: expected pydantic version ${CURRENT_VERSION} to have been installed, but found: ${TEST_WITH_VERSION}"
exit 1
fi
echo "Found pydantic version ${CURRENT_VERSION}, as expected"
- name: Run pydantic compatibility tests
# airbyte currently doesn't support pydantic v2
if: ${{ !startsWith(inputs.working-directory, 'libs/partners/airbyte') }}
shell: bash
run: make test
- name: Ensure the tests did not create any additional files
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -6,30 +6,28 @@ on:
working-directory:
required: true
type: string
python-version:
required: true
type: string
description: "Python version to use"
env:
POETRY_VERSION: "1.7.1"
jobs:
build:
environment: Scheduled testing
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.8"
- "3.11"
name: Python ${{ matrix.python-version }}
name: Python ${{ inputs.python-version }}
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
- name: Set up Python ${{ inputs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
python-version: ${{ inputs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: core
@@ -43,24 +41,28 @@ jobs:
shell: bash
run: poetry run pip install "boto3<2" "google-cloud-aiplatform<2"
- name: 'Authenticate to Google Cloud'
id: 'auth'
uses: google-github-actions/auth@v2
with:
credentials_json: '${{ secrets.GOOGLE_CREDENTIALS }}'
- name: Run integration tests
shell: bash
env:
AI21_API_KEY: ${{ secrets.AI21_API_KEY }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
AZURE_OPENAI_API_VERSION: ${{ secrets.AZURE_OPENAI_API_VERSION }}
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
TOGETHER_API_KEY: ${{ secrets.TOGETHER_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
GOOGLE_SEARCH_API_KEY: ${{ secrets.GOOGLE_SEARCH_API_KEY }}
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
NOMIC_API_KEY: ${{ secrets.NOMIC_API_KEY }}
WATSONX_APIKEY: ${{ secrets.WATSONX_APIKEY }}
@@ -73,8 +75,11 @@ jobs:
ES_URL: ${{ secrets.ES_URL }}
ES_CLOUD_ID: ${{ secrets.ES_CLOUD_ID }}
ES_API_KEY: ${{ secrets.ES_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # for airbyte
MONGODB_ATLAS_URI: ${{ secrets.MONGODB_ATLAS_URI }}
VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
UPSTAGE_API_KEY: ${{ secrets.UPSTAGE_API_KEY }}
XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
run: |
make integration_tests

View File

@@ -7,10 +7,10 @@ on:
required: true
type: string
description: "From which folder this pipeline executes"
langchain-location:
required: false
python-version:
required: true
type: string
description: "Relative path to the langchain library folder"
description: "Python version to use"
env:
POETRY_VERSION: "1.7.1"
@@ -21,27 +21,15 @@ env:
jobs:
build:
name: "make lint #${{ matrix.python-version }}"
name: "make lint #${{ inputs.python-version }}"
runs-on: ubuntu-latest
strategy:
matrix:
# Only lint on the min and max supported Python versions.
# It's extremely unlikely that there's a lint issue on any version in between
# that doesn't show up on the min or max versions.
#
# GitHub rate-limits how many jobs can be running at any one time.
# Starting new jobs is also relatively slow,
# so linting on fewer versions makes CI faster.
python-version:
- "3.8"
- "3.11"
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
- name: Set up Python ${{ inputs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
python-version: ${{ inputs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: lint-with-extras
@@ -71,14 +59,6 @@ jobs:
run: |
poetry install --with lint,typing
- name: Install langchain editable
working-directory: ${{ inputs.working-directory }}
if: ${{ inputs.langchain-location }}
env:
LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
run: |
poetry run pip install -e "$LANGCHAIN_LOCATION"
- name: Get .mypy_cache to speed up mypy
uses: actions/cache@v4
env:
@@ -86,7 +66,7 @@ jobs:
with:
path: |
${{ env.WORKDIR }}/.mypy_cache
key: mypy-lint-${{ runner.os }}-${{ runner.arch }}-py${{ matrix.python-version }}-${{ inputs.working-directory }}-${{ hashFiles(format('{0}/poetry.lock', inputs.working-directory)) }}
key: mypy-lint-${{ runner.os }}-${{ runner.arch }}-py${{ inputs.python-version }}-${{ inputs.working-directory }}-${{ hashFiles(format('{0}/poetry.lock', inputs.working-directory)) }}
- name: Analysing the code with our lint
@@ -120,7 +100,7 @@ jobs:
with:
path: |
${{ env.WORKDIR }}/.mypy_cache_test
key: mypy-test-${{ runner.os }}-${{ runner.arch }}-py${{ matrix.python-version }}-${{ inputs.working-directory }}-${{ hashFiles(format('{0}/poetry.lock', inputs.working-directory)) }}
key: mypy-test-${{ runner.os }}-${{ runner.arch }}-py${{ inputs.python-version }}-${{ inputs.working-directory }}-${{ hashFiles(format('{0}/poetry.lock', inputs.working-directory)) }}
- name: Analysing the code with our lint
working-directory: ${{ inputs.working-directory }}

View File

@@ -13,6 +13,11 @@ on:
required: true
type: string
default: 'libs/langchain'
dangerous-nonmaster-release:
required: false
type: boolean
default: false
description: "Release from a non-master branch (danger!)"
env:
PYTHON_VERSION: "3.11"
@@ -20,7 +25,7 @@ env:
jobs:
build:
if: github.ref == 'refs/heads/master'
if: github.ref == 'refs/heads/master' || inputs.dangerous-nonmaster-release
environment: Scheduled testing
runs-on: ubuntu-latest
@@ -55,7 +60,7 @@ jobs:
working-directory: ${{ inputs.working-directory }}
- name: Upload build
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -67,19 +72,99 @@ jobs:
run: |
echo pkg-name="$(poetry version | cut -d ' ' -f 1)" >> $GITHUB_OUTPUT
echo version="$(poetry version --short)" >> $GITHUB_OUTPUT
release-notes:
needs:
- build
runs-on: ubuntu-latest
outputs:
release-body: ${{ steps.generate-release-body.outputs.release-body }}
steps:
- uses: actions/checkout@v4
with:
repository: langchain-ai/langchain
path: langchain
sparse-checkout: | # this only grabs files for relevant dir
${{ inputs.working-directory }}
ref: ${{ github.ref }} # this scopes to just ref'd branch
fetch-depth: 0 # this fetches entire commit history
- name: Check Tags
id: check-tags
shell: bash
working-directory: langchain/${{ inputs.working-directory }}
env:
PKG_NAME: ${{ needs.build.outputs.pkg-name }}
VERSION: ${{ needs.build.outputs.version }}
run: |
PREV_TAG="$PKG_NAME==${VERSION%.*}.$(( ${VERSION##*.} - 1 ))"; [[ "${VERSION##*.}" -eq 0 ]] && PREV_TAG=""
# backup case if releasing e.g. 0.3.0, looks up last release
# note if last release (chronologically) was e.g. 0.1.47 it will get
# that instead of the last 0.2 release
if [ -z "$PREV_TAG" ]; then
REGEX="^$PKG_NAME==\\d+\\.\\d+\\.\\d+\$"
echo $REGEX
PREV_TAG=$(git tag --sort=-creatordate | (grep -P $REGEX || true) | head -1)
fi
# if PREV_TAG is empty, let it be empty
if [ -z "$PREV_TAG" ]; then
echo "No previous tag found - first release"
else
# confirm prev-tag actually exists in git repo with git tag
GIT_TAG_RESULT=$(git tag -l "$PREV_TAG")
if [ -z "$GIT_TAG_RESULT" ]; then
echo "Previous tag $PREV_TAG not found in git repo"
exit 1
fi
fi
TAG="${PKG_NAME}==${VERSION}"
if [ "$TAG" == "$PREV_TAG" ]; then
echo "No new version to release"
exit 1
fi
echo tag="$TAG" >> $GITHUB_OUTPUT
echo prev-tag="$PREV_TAG" >> $GITHUB_OUTPUT
- name: Generate release body
id: generate-release-body
working-directory: langchain
env:
WORKING_DIR: ${{ inputs.working-directory }}
PKG_NAME: ${{ needs.build.outputs.pkg-name }}
TAG: ${{ steps.check-tags.outputs.tag }}
PREV_TAG: ${{ steps.check-tags.outputs.prev-tag }}
run: |
PREAMBLE="Changes since $PREV_TAG"
# if PREV_TAG is empty, then we are releasing the first version
if [ -z "$PREV_TAG" ]; then
PREAMBLE="Initial release"
PREV_TAG=$(git rev-list --max-parents=0 HEAD)
fi
{
echo 'release-body<<EOF'
echo $PREAMBLE
echo
git log --format="%s" "$PREV_TAG"..HEAD -- $WORKING_DIR
echo EOF
} >> "$GITHUB_OUTPUT"
test-pypi-publish:
needs:
- build
- release-notes
uses:
./.github/workflows/_test_release.yml
permissions: write-all
with:
working-directory: ${{ inputs.working-directory }}
dangerous-nonmaster-release: ${{ inputs.dangerous-nonmaster-release }}
secrets: inherit
pre-release-checks:
needs:
- build
- release-notes
- test-pypi-publish
runs-on: ubuntu-latest
steps:
@@ -100,6 +185,7 @@ jobs:
- name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
id: setup-python
with:
python-version: ${{ env.PYTHON_VERSION }}
poetry-version: ${{ env.POETRY_VERSION }}
@@ -112,7 +198,7 @@ jobs:
PKG_NAME: ${{ needs.build.outputs.pkg-name }}
VERSION: ${{ needs.build.outputs.version }}
# Here we use:
# - The default regular PyPI index as the *primary* index, meaning
# - The default regular PyPI index as the *primary* index, meaning
# that it takes priority (https://pypi.org/simple)
# - The test PyPI index as an extra index, so that any dependencies that
# are not found on test PyPI can be resolved and installed anyway.
@@ -125,7 +211,7 @@ jobs:
--extra-index-url https://test.pypi.org/simple/ \
"$PKG_NAME==$VERSION" || \
( \
sleep 5 && \
sleep 15 && \
poetry run pip install \
--extra-index-url https://test.pypi.org/simple/ \
"$PKG_NAME==$VERSION" \
@@ -138,7 +224,7 @@ jobs:
poetry run python -c "import $IMPORT_NAME; print(dir($IMPORT_NAME))"
- name: Import test dependencies
run: poetry install --with test,test_integration
run: poetry install --with test
working-directory: ${{ inputs.working-directory }}
# Overwrite the local version of the package with the test PyPI version.
@@ -157,11 +243,33 @@ jobs:
run: make tests
working-directory: ${{ inputs.working-directory }}
- name: 'Authenticate to Google Cloud'
id: 'auth'
uses: google-github-actions/auth@v2
with:
credentials_json: '${{ secrets.GOOGLE_CREDENTIALS }}'
- name: Check for prerelease versions
working-directory: ${{ inputs.working-directory }}
run: |
poetry run python $GITHUB_WORKSPACE/.github/scripts/check_prerelease_dependencies.py pyproject.toml
- name: Get minimum versions
working-directory: ${{ inputs.working-directory }}
id: min-version
run: |
poetry run pip install packaging requests
python_version="$(poetry run python --version | awk '{print $2}')"
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml release $python_version)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
- name: Run unit tests with minimum dependency versions
if: ${{ steps.min-version.outputs.min-versions != '' }}
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
poetry run pip install --force-reinstall $MIN_VERSIONS --editable .
make tests
working-directory: ${{ inputs.working-directory }}
- name: Import integration test dependencies
run: poetry install --with test,test_integration
working-directory: ${{ inputs.working-directory }}
- name: Run integration tests
if: ${{ startsWith(inputs.working-directory, 'libs/partners/') }}
@@ -176,12 +284,14 @@ jobs:
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
GOOGLE_SEARCH_API_KEY: ${{ secrets.GOOGLE_SEARCH_API_KEY }}
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
EXA_API_KEY: ${{ secrets.EXA_API_KEY }}
NOMIC_API_KEY: ${{ secrets.NOMIC_API_KEY }}
WATSONX_APIKEY: ${{ secrets.WATSONX_APIKEY }}
@@ -194,32 +304,18 @@ jobs:
ES_URL: ${{ secrets.ES_URL }}
ES_CLOUD_ID: ${{ secrets.ES_CLOUD_ID }}
ES_API_KEY: ${{ secrets.ES_API_KEY }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} # for airbyte
MONGODB_ATLAS_URI: ${{ secrets.MONGODB_ATLAS_URI }}
VOYAGE_API_KEY: ${{ secrets.VOYAGE_API_KEY }}
UPSTAGE_API_KEY: ${{ secrets.UPSTAGE_API_KEY }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
XAI_API_KEY: ${{ secrets.XAI_API_KEY }}
run: make integration_tests
working-directory: ${{ inputs.working-directory }}
- name: Get minimum versions
working-directory: ${{ inputs.working-directory }}
id: min-version
run: |
poetry run pip install packaging
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
- name: Run unit tests with minimum dependency versions
if: ${{ steps.min-version.outputs.min-versions != '' }}
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
poetry run pip install $MIN_VERSIONS
make tests
working-directory: ${{ inputs.working-directory }}
publish:
needs:
- build
- release-notes
- test-pypi-publish
- pre-release-checks
runs-on: ubuntu-latest
@@ -246,7 +342,7 @@ jobs:
working-directory: ${{ inputs.working-directory }}
cache-key: release
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
@@ -257,10 +353,13 @@ jobs:
packages-dir: ${{ inputs.working-directory }}/dist/
verbose: true
print-hash: true
# Temp workaround since attestations are on by default as of gh-action-pypi-publish v1.11.0
attestations: false
mark-release:
needs:
- build
- release-notes
- test-pypi-publish
- pre-release-checks
- publish
@@ -285,18 +384,18 @@ jobs:
working-directory: ${{ inputs.working-directory }}
cache-key: release
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: dist
path: ${{ inputs.working-directory }}/dist/
- name: Create Release
- name: Create Tag
uses: ncipollo/release-action@v1
if: ${{ inputs.working-directory == 'libs/langchain' }}
with:
artifacts: "dist/*"
token: ${{ secrets.GITHUB_TOKEN }}
draft: false
generateReleaseNotes: true
tag: v${{ needs.build.outputs.version }}
commit: master
generateReleaseNotes: false
tag: ${{needs.build.outputs.pkg-name}}==${{ needs.build.outputs.version }}
body: ${{ needs.release-notes.outputs.release-body }}
commit: ${{ github.sha }}
makeLatest: ${{ needs.build.outputs.pkg-name == 'langchain-core'}}

View File

@@ -7,10 +7,10 @@ on:
required: true
type: string
description: "From which folder this pipeline executes"
langchain-location:
required: false
python-version:
required: true
type: string
description: "Relative path to the langchain library folder"
description: "Python version to use"
env:
POETRY_VERSION: "1.7.1"
@@ -21,42 +21,47 @@ jobs:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
strategy:
matrix:
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
name: "make test #${{ matrix.python-version }}"
name: "make test #${{ inputs.python-version }}"
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
- name: Set up Python ${{ inputs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
id: setup-python
with:
python-version: ${{ matrix.python-version }}
python-version: ${{ inputs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: core
- name: Install dependencies
shell: bash
run: poetry install --with test
- name: Install langchain editable
working-directory: ${{ inputs.working-directory }}
if: ${{ inputs.langchain-location }}
env:
LANGCHAIN_LOCATION: ${{ inputs.langchain-location }}
run: |
poetry run pip install -e "$LANGCHAIN_LOCATION"
- name: Run core tests
shell: bash
run: |
make test
- name: Get minimum versions
working-directory: ${{ inputs.working-directory }}
id: min-version
shell: bash
run: |
poetry run pip install packaging tomli requests
python_version="$(poetry run python --version | awk '{print $2}')"
min_versions="$(poetry run python $GITHUB_WORKSPACE/.github/scripts/get_min_versions.py pyproject.toml pull_request $python_version)"
echo "min-versions=$min_versions" >> "$GITHUB_OUTPUT"
echo "min-versions=$min_versions"
- name: Run unit tests with minimum dependency versions
if: ${{ steps.min-version.outputs.min-versions != '' }}
env:
MIN_VERSIONS: ${{ steps.min-version.outputs.min-versions }}
run: |
poetry run pip install $MIN_VERSIONS
make tests
working-directory: ${{ inputs.working-directory }}
- name: Ensure the tests did not create any additional files
shell: bash
run: |
@@ -68,3 +73,4 @@ jobs:
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

51
.github/workflows/_test_doc_imports.yml vendored Normal file
View File

@@ -0,0 +1,51 @@
name: test_doc_imports
on:
workflow_call:
inputs:
python-version:
required: true
type: string
description: "Python version to use"
env:
POETRY_VERSION: "1.7.1"
jobs:
build:
runs-on: ubuntu-latest
name: "check doc imports #${{ inputs.python-version }}"
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ inputs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ inputs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
cache-key: core
- name: Install dependencies
shell: bash
run: poetry install --with test
- name: Install langchain editable
run: |
poetry run pip install langchain-experimental -e libs/core libs/langchain libs/community
- name: Check doc imports
shell: bash
run: |
poetry run python docs/scripts/check_imports.py
- name: Ensure the test did not create any additional files
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

64
.github/workflows/_test_pydantic.yml vendored Normal file
View File

@@ -0,0 +1,64 @@
name: test pydantic intermediate versions
on:
workflow_call:
inputs:
working-directory:
required: true
type: string
description: "From which folder this pipeline executes"
python-version:
required: false
type: string
description: "Python version to use"
default: "3.11"
pydantic-version:
required: true
type: string
description: "Pydantic version to test."
env:
POETRY_VERSION: "1.7.1"
jobs:
build:
defaults:
run:
working-directory: ${{ inputs.working-directory }}
runs-on: ubuntu-latest
name: "make test # pydantic: ~=${{ inputs.pydantic-version }}, python: ${{ inputs.python-version }}, "
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ inputs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ inputs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: core
- name: Install dependencies
shell: bash
run: poetry install --with test
- name: Overwrite pydantic version
shell: bash
run: poetry run pip install pydantic~=${{ inputs.pydantic-version }}
- name: Run core tests
shell: bash
run: |
make test
- name: Ensure the tests did not create any additional files
shell: bash
run: |
set -eu
STATUS="$(git status)"
echo "$STATUS"
# grep will exit non-zero if the target message isn't found,
# and `set -e` above will cause the step to fail.
echo "$STATUS" | grep 'nothing to commit, working tree clean'

View File

@@ -7,6 +7,11 @@ on:
required: true
type: string
description: "From which folder this pipeline executes"
dangerous-nonmaster-release:
required: false
type: boolean
default: false
description: "Release from a non-master branch (danger!)"
env:
POETRY_VERSION: "1.7.1"
@@ -14,7 +19,7 @@ env:
jobs:
build:
if: github.ref == 'refs/heads/master'
if: github.ref == 'refs/heads/master' || inputs.dangerous-nonmaster-release
runs-on: ubuntu-latest
outputs:
@@ -48,7 +53,7 @@ jobs:
working-directory: ${{ inputs.working-directory }}
- name: Upload build
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: test-dist
path: ${{ inputs.working-directory }}/dist/
@@ -76,7 +81,7 @@ jobs:
steps:
- uses: actions/checkout@v4
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: test-dist
path: ${{ inputs.working-directory }}/dist/
@@ -93,3 +98,5 @@ jobs:
# This is *only for CI use* and is *extremely dangerous* otherwise!
# https://github.com/pypa/gh-action-pypi-publish#tolerating-release-package-file-duplicates
skip-existing: true
# Temp workaround since attestations are on by default as of gh-action-pypi-publish v1.11.0
attestations: false

View File

@@ -5,47 +5,46 @@ on:
schedule:
- cron: '0 13 * * *'
env:
POETRY_VERSION: "1.7.1"
PYTHON_VERSION: "3.10"
POETRY_VERSION: "1.8.1"
PYTHON_VERSION: "3.11"
jobs:
build:
if: github.repository == 'langchain-ai/langchain' || github.event_name != 'schedule'
runs-on: ubuntu-latest
permissions: write-all
steps:
- uses: actions/checkout@v4
with:
ref: bagatur/api_docs_build
path: langchain
- uses: actions/checkout@v4
with:
repository: langchain-ai/langchain-google
path: langchain-google
- uses: actions/checkout@v4
repository: langchain-ai/langchain-api-docs-html
path: langchain-api-docs-html
token: ${{ secrets.TOKEN_GITHUB_API_DOCS_HTML }}
- name: Get repos with yq
id: get-unsorted-repos
uses: mikefarah/yq@master
with:
repository: langchain-ai/langchain-datastax
path: langchain-datastax
cmd: yq '.packages[].repo' langchain/libs/packages.yml
- name: Set Git config
working-directory: langchain
- name: Parse YAML and checkout repos
env:
REPOS_UNSORTED: ${{ steps.get-unsorted-repos.outputs.result }}
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
git config --local user.email "actions@github.com"
git config --local user.name "Github Actions"
- name: Merge master
working-directory: langchain
run: |
git fetch origin master
git merge origin/master -m "Merge master" --allow-unrelated-histories -X theirs
# Get unique repositories
REPOS=$(echo "$REPOS_UNSORTED" | sort -u)
- name: Move google libs
run: |
rm -rf \
langchain/libs/partners/google-genai \
langchain/libs/partners/google-vertexai \
langchain/libs/partners/astradb
mv langchain-google/libs/genai langchain/libs/partners/google-genai
mv langchain-google/libs/vertexai langchain/libs/partners/google-vertexai
mv langchain-datastax/libs/astradb langchain/libs/partners/astradb
# Checkout each unique repository
for repo in $REPOS; do
if [ "$repo" != "langchain-ai/langchain" ]; then
REPO_NAME=$(echo $repo | cut -d'/' -f2)
echo "Checking out $repo to $REPO_NAME"
git clone --depth 1 https://github.com/$repo.git $REPO_NAME
fi
done
- name: Set up Python ${{ env.PYTHON_VERSION }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./langchain/.github/actions/poetry_setup"
@@ -55,24 +54,46 @@ jobs:
cache-key: api-docs
working-directory: langchain
- name: Install initial py deps
working-directory: langchain
run: |
python -m pip install -U uv
python -m uv pip install --upgrade --no-cache-dir pip setuptools pyyaml
- name: Move libs with script
run: python langchain/.github/scripts/prep_api_docs_build.py
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Rm old html
run:
rm -rf langchain-api-docs-html/api_reference_build/html
- name: Install dependencies
working-directory: langchain
run: |
poetry run python -m pip install --upgrade --no-cache-dir pip setuptools
poetry run python -m pip install --upgrade --no-cache-dir sphinx readthedocs-sphinx-ext
# skip airbyte and ibm due to pandas dependency issue
poetry run python -m pip install $(ls ./libs/partners | grep -vE "airbyte|ibm" | xargs -I {} echo "./libs/partners/{}")
poetry run python -m pip install --exists-action=w --no-cache-dir -r docs/api_reference/requirements.txt
python -m uv pip install $(ls ./libs/partners | xargs -I {} echo "./libs/partners/{}")
python -m uv pip install libs/core libs/langchain libs/text-splitters libs/community libs/experimental
python -m uv pip install -r docs/api_reference/requirements.txt
- name: Set Git config
working-directory: langchain
run: |
git config --local user.email "actions@github.com"
git config --local user.name "Github Actions"
- name: Build docs
working-directory: langchain
run: |
poetry run python -m pip install --upgrade --no-cache-dir pip setuptools
poetry run python docs/api_reference/create_api_rst.py
poetry run python -m sphinx -T -E -b html -d _build/doctrees -c docs/api_reference docs/api_reference api_reference_build/html -j auto
python docs/api_reference/create_api_rst.py
python -m sphinx -T -E -b html -d ../langchain-api-docs-html/_build/doctrees -c docs/api_reference docs/api_reference ../langchain-api-docs-html/api_reference_build/html -j auto
python docs/api_reference/scripts/custom_formatter.py ../langchain-api-docs-html/api_reference_build/html
# Default index page is blank so we copy in the actual home page.
cp ../langchain-api-docs-html/api_reference_build/html/{reference,index}.html
rm -rf ../langchain-api-docs-html/_build/
# https://github.com/marketplace/actions/add-commit
- uses: EndBug/add-and-commit@v9
with:
cwd: langchain
cwd: langchain-api-docs-html
message: 'Update API docs build'

View File

@@ -0,0 +1,25 @@
name: Check Broken Links
on:
workflow_dispatch:
schedule:
- cron: '0 13 * * *'
jobs:
check-links:
if: github.repository_owner == 'langchain-ai' || github.event_name != 'schedule'
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Use Node.js 18.x
uses: actions/setup-node@v3
with:
node-version: 18.x
cache: "yarn"
cache-dependency-path: ./docs/yarn.lock
- name: Install dependencies
run: yarn install --immutable --mode=skip-build
working-directory: ./docs
- name: Check broken links
run: yarn check-broken-links
working-directory: ./docs

View File

@@ -26,97 +26,120 @@ jobs:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.10'
python-version: '3.11'
- id: files
uses: Ana06/get-changed-files@v2.2.0
- id: set-matrix
run: |
python -m pip install packaging requests
python .github/scripts/check_diff.py ${{ steps.files.outputs.all }} >> $GITHUB_OUTPUT
outputs:
dirs-to-lint: ${{ steps.set-matrix.outputs.dirs-to-lint }}
dirs-to-test: ${{ steps.set-matrix.outputs.dirs-to-test }}
dirs-to-extended-test: ${{ steps.set-matrix.outputs.dirs-to-extended-test }}
lint: ${{ steps.set-matrix.outputs.lint }}
test: ${{ steps.set-matrix.outputs.test }}
extended-tests: ${{ steps.set-matrix.outputs.extended-tests }}
compile-integration-tests: ${{ steps.set-matrix.outputs.compile-integration-tests }}
dependencies: ${{ steps.set-matrix.outputs.dependencies }}
test-doc-imports: ${{ steps.set-matrix.outputs.test-doc-imports }}
test-pydantic: ${{ steps.set-matrix.outputs.test-pydantic }}
lint:
name: cd ${{ matrix.working-directory }}
name: cd ${{ matrix.job-configs.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-lint != '[]' }}
if: ${{ needs.build.outputs.lint != '[]' }}
strategy:
matrix:
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-lint) }}
job-configs: ${{ fromJson(needs.build.outputs.lint) }}
fail-fast: false
uses: ./.github/workflows/_lint.yml
with:
working-directory: ${{ matrix.working-directory }}
working-directory: ${{ matrix.job-configs.working-directory }}
python-version: ${{ matrix.job-configs.python-version }}
secrets: inherit
test:
name: cd ${{ matrix.working-directory }}
name: cd ${{ matrix.job-configs.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-test != '[]' }}
if: ${{ needs.build.outputs.test != '[]' }}
strategy:
matrix:
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-test) }}
job-configs: ${{ fromJson(needs.build.outputs.test) }}
fail-fast: false
uses: ./.github/workflows/_test.yml
with:
working-directory: ${{ matrix.working-directory }}
working-directory: ${{ matrix.job-configs.working-directory }}
python-version: ${{ matrix.job-configs.python-version }}
secrets: inherit
test-pydantic:
name: cd ${{ matrix.job-configs.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.test-pydantic != '[]' }}
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.test-pydantic) }}
fail-fast: false
uses: ./.github/workflows/_test_pydantic.yml
with:
working-directory: ${{ matrix.job-configs.working-directory }}
pydantic-version: ${{ matrix.job-configs.pydantic-version }}
secrets: inherit
test-doc-imports:
needs: [ build ]
if: ${{ needs.build.outputs.test-doc-imports != '[]' }}
strategy:
matrix:
job-configs: ${{ fromJson(needs.build.outputs.test-doc-imports) }}
fail-fast: false
uses: ./.github/workflows/_test_doc_imports.yml
secrets: inherit
with:
python-version: ${{ matrix.job-configs.python-version }}
compile-integration-tests:
name: cd ${{ matrix.working-directory }}
name: cd ${{ matrix.job-configs.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-test != '[]' }}
if: ${{ needs.build.outputs.compile-integration-tests != '[]' }}
strategy:
matrix:
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-test) }}
job-configs: ${{ fromJson(needs.build.outputs.compile-integration-tests) }}
fail-fast: false
uses: ./.github/workflows/_compile_integration_test.yml
with:
working-directory: ${{ matrix.working-directory }}
secrets: inherit
dependencies:
name: cd ${{ matrix.working-directory }}
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-test != '[]' }}
strategy:
matrix:
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-test) }}
uses: ./.github/workflows/_dependencies.yml
with:
working-directory: ${{ matrix.working-directory }}
working-directory: ${{ matrix.job-configs.working-directory }}
python-version: ${{ matrix.job-configs.python-version }}
secrets: inherit
extended-tests:
name: "cd ${{ matrix.working-directory }} / make extended_tests #${{ matrix.python-version }}"
name: "cd ${{ matrix.job-configs.working-directory }} / make extended_tests #${{ matrix.job-configs.python-version }}"
needs: [ build ]
if: ${{ needs.build.outputs.dirs-to-extended-test != '[]' }}
if: ${{ needs.build.outputs.extended-tests != '[]' }}
strategy:
matrix:
# note different variable for extended test dirs
working-directory: ${{ fromJson(needs.build.outputs.dirs-to-extended-test) }}
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
job-configs: ${{ fromJson(needs.build.outputs.extended-tests) }}
fail-fast: false
runs-on: ubuntu-latest
defaults:
run:
working-directory: ${{ matrix.working-directory }}
working-directory: ${{ matrix.job-configs.working-directory }}
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }} + Poetry ${{ env.POETRY_VERSION }}
- name: Set up Python ${{ matrix.job-configs.python-version }} + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
python-version: ${{ matrix.job-configs.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ matrix.working-directory }}
working-directory: ${{ matrix.job-configs.working-directory }}
cache-key: extended
- name: Install dependencies
shell: bash
run: |
echo "Running extended tests, installing dependencies with poetry..."
poetry install -E extended_testing --with test
poetry install --with test
poetry run pip install uv
poetry run uv pip install -r extended_testing_deps.txt
- name: Run extended tests
run: make extended_tests
@@ -134,7 +157,7 @@ jobs:
echo "$STATUS" | grep 'nothing to commit, working tree clean'
ci_success:
name: "CI Success"
needs: [build, lint, test, compile-integration-tests, dependencies, extended-tests]
needs: [build, lint, test, compile-integration-tests, extended-tests, test-doc-imports, test-pydantic]
if: |
always()
runs-on: ubuntu-latest

36
.github/workflows/check_new_docs.yml vendored Normal file
View File

@@ -0,0 +1,36 @@
---
name: Integration docs lint
on:
push:
branches: [master]
pull_request:
# If another push to the same PR or branch happens while this workflow is still running,
# cancel the earlier run in favor of the next run.
#
# There's no point in testing an outdated version of the code. GitHub only allows
# a limited number of job runners to be active at the same time, so it's better to cancel
# pointless jobs early so that more useful jobs can run sooner.
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.10'
- id: files
uses: Ana06/get-changed-files@v2.2.0
with:
filter: |
*.ipynb
*.md
*.mdx
- name: Check new docs
run: |
python docs/scripts/check_templates.py ${{ steps.files.outputs.added }}

View File

@@ -3,9 +3,8 @@ name: CI / cd . / make spell_check
on:
push:
branches: [master]
branches: [master, v0.1, v0.2]
pull_request:
branches: [master]
permissions:
contents: read
@@ -29,9 +28,9 @@ jobs:
python .github/workflows/extract_ignored_words_list.py
id: extract_ignore_words
- name: Codespell
uses: codespell-project/actions-codespell@v2
with:
skip: guide_imports.json,*.ambr,./cookbook/data/imdb_top_1000.csv,*.lock
ignore_words_list: ${{ steps.extract_ignore_words.outputs.ignore_words_list }}
exclude_file: libs/community/langchain_community/llms/yuan2.py
# - name: Codespell
# uses: codespell-project/actions-codespell@v2
# with:
# skip: guide_imports.json,*.ambr,./cookbook/data/imdb_top_1000.csv,*.lock
# ignore_words_list: ${{ steps.extract_ignore_words.outputs.ignore_words_list }}
# exclude_file: ./.github/workflows/codespell-exclude

View File

@@ -7,4 +7,4 @@ ignore_words_list = (
pyproject_toml.get("tool", {}).get("codespell", {}).get("ignore-words-list")
)
print(f"::set-output name=ignore_words_list::{ignore_words_list}") # noqa: T201
print(f"::set-output name=ignore_words_list::{ignore_words_list}")

View File

@@ -14,8 +14,9 @@ on:
jobs:
langchain-people:
if: github.repository_owner == 'langchain-ai'
if: github.repository_owner == 'langchain-ai' || github.event_name != 'schedule'
runs-on: ubuntu-latest
permissions: write-all
steps:
- name: Dump GitHub context
env:

74
.github/workflows/run_notebooks.yml vendored Normal file
View File

@@ -0,0 +1,74 @@
name: Run notebooks
on:
workflow_dispatch:
inputs:
python_version:
description: 'Python version'
required: false
default: '3.11'
working-directory:
description: 'Working directory or subset (e.g., docs/docs/tutorials/llm_chain.ipynb or docs/docs/how_to)'
required: false
default: 'all'
schedule:
- cron: '0 13 * * *'
env:
POETRY_VERSION: "1.7.1"
jobs:
build:
runs-on: ubuntu-latest
if: github.repository == 'langchain-ai/langchain' || github.event_name != 'schedule'
name: "Test docs"
steps:
- uses: actions/checkout@v4
- name: Set up Python + Poetry ${{ env.POETRY_VERSION }}
uses: "./.github/actions/poetry_setup"
with:
python-version: ${{ github.event.inputs.python_version || '3.11' }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: ${{ inputs.working-directory }}
cache-key: run-notebooks
- name: 'Authenticate to Google Cloud'
id: 'auth'
uses: google-github-actions/auth@v2
with:
credentials_json: '${{ secrets.GOOGLE_CREDENTIALS }}'
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ secrets.AWS_REGION }}
- name: Install dependencies
run: |
poetry install --with dev,test
- name: Pre-download files
run: |
poetry run python docs/scripts/cache_data.py
curl -s https://raw.githubusercontent.com/lerocha/chinook-database/master/ChinookDatabase/DataSources/Chinook_Sqlite.sql | sqlite3 docs/docs/how_to/Chinook.db
cp docs/docs/how_to/Chinook.db docs/docs/tutorials/Chinook.db
- name: Prepare notebooks
run: |
poetry run python docs/scripts/prepare_notebooks_for_ci.py --comment-install-cells --working-directory ${{ github.event.inputs.working-directory || 'all' }}
- name: Run notebooks
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
TAVILY_API_KEY: ${{ secrets.TAVILY_API_KEY }}
TOGETHER_API_KEY: ${{ secrets.TOGETHER_API_KEY }}
WORKING_DIRECTORY: ${{ github.event.inputs.working-directory || 'all' }}
run: |
./docs/scripts/execute_notebooks.sh $WORKING_DIRECTORY

View File

@@ -10,28 +10,53 @@ env:
jobs:
build:
defaults:
run:
working-directory: libs/langchain
if: github.repository_owner == 'langchain-ai' || github.event_name != 'schedule'
name: Python ${{ matrix.python-version }} - ${{ matrix.working-directory }}
runs-on: ubuntu-latest
environment: Scheduled testing
strategy:
fail-fast: false
matrix:
python-version:
- "3.8"
- "3.9"
- "3.10"
- "3.11"
name: Python ${{ matrix.python-version }}
working-directory:
- "libs/partners/openai"
- "libs/partners/anthropic"
- "libs/partners/fireworks"
- "libs/partners/groq"
- "libs/partners/mistralai"
- "libs/partners/google-vertexai"
- "libs/partners/google-genai"
- "libs/partners/aws"
steps:
- uses: actions/checkout@v4
with:
path: langchain
- uses: actions/checkout@v4
with:
repository: langchain-ai/langchain-google
path: langchain-google
- uses: actions/checkout@v4
with:
repository: langchain-ai/langchain-aws
path: langchain-aws
- name: Move libs
run: |
rm -rf \
langchain/libs/partners/google-genai \
langchain/libs/partners/google-vertexai
mv langchain-google/libs/genai langchain/libs/partners/google-genai
mv langchain-google/libs/vertexai langchain/libs/partners/google-vertexai
mv langchain-aws/libs/aws langchain/libs/partners/aws
- name: Set up Python ${{ matrix.python-version }}
uses: "./.github/actions/poetry_setup"
uses: "./langchain/.github/actions/poetry_setup"
with:
python-version: ${{ matrix.python-version }}
poetry-version: ${{ env.POETRY_VERSION }}
working-directory: libs/langchain
working-directory: langchain/${{ matrix.working-directory }}
cache-key: scheduled
- name: 'Authenticate to Google Cloud'
@@ -45,22 +70,15 @@ jobs:
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.AWS_REGION }}
aws-region: ${{ secrets.AWS_REGION }}
- name: Install dependencies
working-directory: libs/langchain
shell: bash
run: |
echo "Running scheduled tests, installing dependencies with poetry..."
cd langchain/${{ matrix.working-directory }}
poetry install --with=test_integration,test
- name: Install deps outside pyproject
if: ${{ startsWith(inputs.working-directory, 'libs/community/') }}
shell: bash
run: poetry run pip install "boto3<2" "google-cloud-aiplatform<2"
- name: Run tests
shell: bash
- name: Run integration tests
env:
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
@@ -68,14 +86,31 @@ jobs:
AZURE_OPENAI_API_BASE: ${{ secrets.AZURE_OPENAI_API_BASE }}
AZURE_OPENAI_API_KEY: ${{ secrets.AZURE_OPENAI_API_KEY }}
AZURE_OPENAI_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LEGACY_CHAT_DEPLOYMENT_NAME }}
AZURE_OPENAI_LLM_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_LLM_DEPLOYMENT_NAME }}
AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME: ${{ secrets.AZURE_OPENAI_EMBEDDINGS_DEPLOYMENT_NAME }}
FIREWORKS_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}
GROQ_API_KEY: ${{ secrets.GROQ_API_KEY }}
HUGGINGFACEHUB_API_TOKEN: ${{ secrets.HUGGINGFACEHUB_API_TOKEN }}
MISTRAL_API_KEY: ${{ secrets.MISTRAL_API_KEY }}
COHERE_API_KEY: ${{ secrets.COHERE_API_KEY }}
NVIDIA_API_KEY: ${{ secrets.NVIDIA_API_KEY }}
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
GOOGLE_SEARCH_API_KEY: ${{ secrets.GOOGLE_SEARCH_API_KEY }}
GOOGLE_CSE_ID: ${{ secrets.GOOGLE_CSE_ID }}
run: |
make scheduled_tests
cd langchain/${{ matrix.working-directory }}
make integration_tests
- name: Remove external libraries
run: |
rm -rf \
langchain/libs/partners/google-genai \
langchain/libs/partners/google-vertexai \
langchain/libs/partners/aws
- name: Ensure the tests did not create any additional files
shell: bash
working-directory: langchain
run: |
set -eu

6
.gitignore vendored
View File

@@ -116,6 +116,7 @@ celerybeat.pid
.env
.envrc
.venv*
venv*
env/
ENV/
env.bak/
@@ -132,6 +133,7 @@ env.bak/
# mypy
.mypy_cache/
.mypy_cache_test/
.dmypy.json
dmypy.json
@@ -165,11 +167,14 @@ docs/.docusaurus/
docs/.cache-loader/
docs/_dist
docs/api_reference/*api_reference.rst
docs/api_reference/*.md
docs/api_reference/_build
docs/api_reference/*/
!docs/api_reference/_static/
!docs/api_reference/templates/
!docs/api_reference/themes/
!docs/api_reference/_extensions/
!docs/api_reference/scripts/
docs/docs/build
docs/docs/node_modules
docs/docs/yarn.lock
@@ -177,3 +182,4 @@ _dist
docs/docs/templates
prof
virtualenv/

View File

@@ -1,70 +1,11 @@
# Migrating
## 🚨Breaking Changes for select chains (SQLDatabase) on 7/28/23
Please see the following guides for migrating LangChain code:
In an effort to make `langchain` leaner and safer, we are moving select chains to `langchain_experimental`.
This migration has already started, but we are remaining backwards compatible until 7/28.
On that date, we will remove functionality from `langchain`.
Read more about the motivation and the progress [here](https://github.com/langchain-ai/langchain/discussions/8043).
* Migrate to [LangChain v0.3](https://python.langchain.com/docs/versions/v0_3/)
* Migrate to [LangChain v0.2](https://python.langchain.com/docs/versions/v0_2/)
* Migrating from [LangChain 0.0.x Chains](https://python.langchain.com/docs/versions/migrating_chains/)
* Upgrade to [LangGraph Memory](https://python.langchain.com/docs/versions/migrating_memory/)
### Migrating to `langchain_experimental`
We are moving any experimental components of LangChain, or components with vulnerability issues, into `langchain_experimental`.
This guide covers how to migrate.
### Installation
Previously:
`pip install -U langchain`
Now (only if you want to access things in experimental):
`pip install -U langchain langchain_experimental`
### Things in `langchain.experimental`
Previously:
`from langchain.experimental import ...`
Now:
`from langchain_experimental import ...`
### PALChain
Previously:
`from langchain.chains import PALChain`
Now:
`from langchain_experimental.pal_chain import PALChain`
### SQLDatabaseChain
Previously:
`from langchain.chains import SQLDatabaseChain`
Now:
`from langchain_experimental.sql import SQLDatabaseChain`
Alternatively, if you are just interested in using the query generation part of the SQL chain, you can check out [`create_sql_query_chain`](https://github.com/langchain-ai/langchain/blob/master/docs/extras/use_cases/tabular/sql_query.ipynb)
`from langchain.chains import create_sql_query_chain`
### `load_prompt` for Python files
Note: this only applies if you want to load Python files as prompts.
If you want to load json/yaml files, no change is needed.
Previously:
`from langchain.prompts import load_prompt`
Now:
`from langchain_experimental.prompts import load_prompt`
The [LangChain CLI](https://python.langchain.com/docs/versions/v0_3/#migrate-using-langchain-cli) can help you automatically upgrade your code to use non-deprecated imports.
This will be especially helpful if you're still on either version 0.0.x or 0.1.x of LangChain.

View File

@@ -1,44 +1,62 @@
.PHONY: all clean docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck
.PHONY: all clean help docs_build docs_clean docs_linkcheck api_docs_build api_docs_clean api_docs_linkcheck spell_check spell_fix lint lint_package lint_tests format format_diff
# Default target executed when no arguments are given to make.
## help: Show this help info.
help: Makefile
@printf "\n\033[1mUsage: make <TARGETS> ...\033[0m\n\n\033[1mTargets:\033[0m\n\n"
@sed -n 's/^## //p' $< | awk -F':' '{printf "\033[36m%-30s\033[0m %s\n", $$1, $$2}' | sort | sed -e 's/^/ /'
## all: Default target, shows help.
all: help
## clean: Clean documentation and API documentation artifacts.
clean: docs_clean api_docs_clean
######################
# DOCUMENTATION
######################
clean: docs_clean api_docs_clean
## docs_build: Build the documentation.
docs_build:
docs/.local_build.sh
cd docs && make build
## docs_clean: Clean the documentation build artifacts.
docs_clean:
@if [ -d _dist ]; then \
rm -r _dist; \
echo "Directory _dist has been cleaned."; \
else \
echo "Nothing to clean."; \
fi
cd docs && make clean
## docs_linkcheck: Run linkchecker on the documentation.
docs_linkcheck:
poetry run linkchecker _dist/docs/ --ignore-url node_modules
## api_docs_build: Build the API Reference documentation.
api_docs_build:
poetry run python docs/api_reference/create_api_rst.py
cd docs/api_reference && poetry run make html
poetry run python docs/api_reference/scripts/custom_formatter.py docs/api_reference/_build/html/
API_PKG ?= text-splitters
api_docs_quick_preview:
poetry run python docs/api_reference/create_api_rst.py $(API_PKG)
cd docs/api_reference && poetry run make html
poetry run python docs/api_reference/scripts/custom_formatter.py docs/api_reference/_build/html/
open docs/api_reference/_build/html/reference.html
## api_docs_clean: Clean the API Reference documentation build artifacts.
api_docs_clean:
rm -f docs/api_reference/api_reference.rst
cd docs/api_reference && poetry run make clean
find ./docs/api_reference -name '*_api_reference.rst' -delete
git clean -fdX ./docs/api_reference
rm docs/api_reference/index.md
## api_docs_linkcheck: Run linkchecker on the API Reference documentation.
api_docs_linkcheck:
poetry run linkchecker docs/api_reference/_build/html/index.html
## spell_check: Run codespell on the project.
spell_check:
poetry run codespell --toml pyproject.toml
## spell_fix: Run codespell on the project and fix the errors.
spell_fix:
poetry run codespell --toml pyproject.toml -w
@@ -46,31 +64,14 @@ spell_fix:
# LINTING AND FORMATTING
######################
## lint: Run linting on the project.
lint lint_package lint_tests:
poetry run ruff docs templates cookbook
poetry run ruff format docs templates cookbook --diff
poetry run ruff --select I docs templates cookbook
git grep 'from langchain import' docs/docs templates cookbook | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
poetry run ruff check docs cookbook
poetry run ruff format docs cookbook cookbook --diff
poetry run ruff check --select I docs cookbook
git grep 'from langchain import' docs/docs cookbook | grep -vE 'from langchain import (hub)' && exit 1 || exit 0
## format: Format the project files.
format format_diff:
poetry run ruff format docs templates cookbook
poetry run ruff --select I --fix docs templates cookbook
######################
# HELP
######################
help:
@echo '===================='
@echo '-- DOCUMENTATION --'
@echo 'clean - run docs_clean and api_docs_clean'
@echo 'docs_build - build the documentation'
@echo 'docs_clean - clean the documentation build artifacts'
@echo 'docs_linkcheck - run linkchecker on the documentation'
@echo 'api_docs_build - build the API Reference documentation'
@echo 'api_docs_clean - clean the API Reference documentation build artifacts'
@echo 'api_docs_linkcheck - run linkchecker on the API Reference documentation'
@echo 'spell_check - run codespell on the project'
@echo 'spell_fix - run codespell on the project and fix the errors'
@echo '-- TEST and LINT tasks are within libs/*/ per-package --'
poetry run ruff format docs cookbook
poetry run ruff check --select I --fix docs cookbook

119
README.md
View File

@@ -2,105 +2,134 @@
⚡ Build context-aware reasoning applications ⚡
[![Release Notes](https://img.shields.io/github/release/langchain-ai/langchain)](https://github.com/langchain-ai/langchain/releases)
[![Release Notes](https://img.shields.io/github/release/langchain-ai/langchain?style=flat-square)](https://github.com/langchain-ai/langchain/releases)
[![CI](https://github.com/langchain-ai/langchain/actions/workflows/check_diffs.yml/badge.svg)](https://github.com/langchain-ai/langchain/actions/workflows/check_diffs.yml)
[![Downloads](https://static.pepy.tech/badge/langchain/month)](https://pepy.tech/project/langchain)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)
[![](https://dcbadge.vercel.app/api/server/6adMQxSpJS?compact=true&style=flat)](https://discord.gg/6adMQxSpJS)
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
[![PyPI - License](https://img.shields.io/pypi/l/langchain-core?style=flat-square)](https://opensource.org/licenses/MIT)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/langchain-core?style=flat-square)](https://pypistats.org/packages/langchain-core)
[![GitHub star chart](https://img.shields.io/github/stars/langchain-ai/langchain?style=flat-square)](https://star-history.com/#langchain-ai/langchain)
[![Open Issues](https://img.shields.io/github/issues-raw/langchain-ai/langchain?style=flat-square)](https://github.com/langchain-ai/langchain/issues)
[![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode&style=flat-square)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/langchain-ai/langchain)
[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/langchain-ai/langchain)
[![GitHub star chart](https://img.shields.io/github/stars/langchain-ai/langchain?style=social)](https://star-history.com/#langchain-ai/langchain)
[![Dependency Status](https://img.shields.io/librariesio/github/langchain-ai/langchain)](https://libraries.io/github/langchain-ai/langchain)
[![Open Issues](https://img.shields.io/github/issues-raw/langchain-ai/langchain)](https://github.com/langchain-ai/langchain/issues)
[![Twitter](https://img.shields.io/twitter/url/https/twitter.com/langchainai.svg?style=social&label=Follow%20%40LangChainAI)](https://twitter.com/langchainai)
Looking for the JS/TS library? Check out [LangChain.js](https://github.com/langchain-ai/langchainjs).
To help you ship LangChain apps to production faster, check out [LangSmith](https://smith.langchain.com).
[LangSmith](https://smith.langchain.com) is a unified developer platform for building, testing, and monitoring LLM applications.
To help you ship LangChain apps to production faster, check out [LangSmith](https://smith.langchain.com).
[LangSmith](https://smith.langchain.com) is a unified developer platform for building, testing, and monitoring LLM applications.
Fill out [this form](https://www.langchain.com/contact-sales) to speak with our sales team.
## Quick Install
With pip:
```bash
pip install langchain
```
With conda:
```bash
conda install langchain -c conda-forge
```
## 🤔 What is LangChain?
**LangChain** is a framework for developing applications powered by language models. It enables applications that:
- **Are context-aware**: connect a language model to sources of context (prompt instructions, few shot examples, content to ground its response in, etc.)
- **Reason**: rely on a language model to reason (about how to answer based on provided context, what actions to take, etc.)
**LangChain** is a framework for developing applications powered by large language models (LLMs).
This framework consists of several parts.
- **LangChain Libraries**: The Python and JavaScript libraries. Contains interfaces and integrations for a myriad of components, a basic run time for combining these components into chains and agents, and off-the-shelf implementations of chains and agents.
- **[LangChain Templates](templates)**: A collection of easily deployable reference architectures for a wide variety of tasks.
- **[LangServe](https://github.com/langchain-ai/langserve)**: A library for deploying LangChain chains as a REST API.
- **[LangSmith](https://smith.langchain.com)**: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.
- **[LangGraph](https://python.langchain.com/docs/langgraph)**: LangGraph is a library for building stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain. It extends the LangChain Expression Language with the ability to coordinate multiple chains (or actors) across multiple steps of computation in a cyclic manner.
For these applications, LangChain simplifies the entire application lifecycle:
The LangChain libraries themselves are made up of several different packages.
- **[`langchain-core`](libs/core)**: Base abstractions and LangChain Expression Language.
- **[`langchain-community`](libs/community)**: Third party integrations.
- **[`langchain`](libs/langchain)**: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- **Open-source libraries**: Build your applications using LangChain's open-source [building blocks](https://python.langchain.com/docs/concepts/#langchain-expression-language-lcel), [components](https://python.langchain.com/docs/concepts/), and [third-party integrations](https://python.langchain.com/docs/integrations/providers/).
Use [LangGraph](https://langchain-ai.github.io/langgraph/) to build stateful agents with first-class streaming and human-in-the-loop support.
- **Productionization**: Inspect, monitor, and evaluate your apps with [LangSmith](https://docs.smith.langchain.com/) so that you can constantly optimize and deploy with confidence.
- **Deployment**: Turn your LangGraph applications into production-ready APIs and Assistants with [LangGraph Cloud](https://langchain-ai.github.io/langgraph/cloud/).
![Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.](docs/static/img/langchain_stack.png "LangChain Architecture Overview")
### Open-source libraries
- **`langchain-core`**: Base abstractions and LangChain Expression Language.
- **`langchain-community`**: Third party integrations.
- Some integrations have been further split into **partner packages** that only rely on **`langchain-core`**. Examples include **`langchain_openai`** and **`langchain_anthropic`**.
- **`langchain`**: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.
- **[`LangGraph`](https://langchain-ai.github.io/langgraph/)**: A library for building robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph. Integrates smoothly with LangChain, but can be used without it. To learn more about LangGraph, check out our first LangChain Academy course, *Introduction to LangGraph*, available [here](https://academy.langchain.com/courses/intro-to-langgraph).
### Productionization:
- **[LangSmith](https://docs.smith.langchain.com/)**: A developer platform that lets you debug, test, evaluate, and monitor chains built on any LLM framework and seamlessly integrates with LangChain.
### Deployment:
- **[LangGraph Cloud](https://langchain-ai.github.io/langgraph/cloud/)**: Turn your LangGraph applications into production-ready APIs and Assistants.
![Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.](docs/static/svg/langchain_stack_112024.svg#gh-light-mode-only "LangChain Architecture Overview")
![Diagram outlining the hierarchical organization of the LangChain framework, displaying the interconnected parts across multiple layers.](docs/static/svg/langchain_stack_112024_dark.svg#gh-dark-mode-only "LangChain Architecture Overview")
## 🧱 What can you build with LangChain?
**❓ Retrieval augmented generation**
- [Documentation](https://python.langchain.com/docs/use_cases/question_answering/)
**❓ Question answering with RAG**
- [Documentation](https://python.langchain.com/docs/tutorials/rag/)
- End-to-end Example: [Chat LangChain](https://chat.langchain.com) and [repo](https://github.com/langchain-ai/chat-langchain)
**💬 Analyzing structured data**
**🧱 Extracting structured output**
- [Documentation](https://python.langchain.com/docs/use_cases/qa_structured/sql)
- End-to-end Example: [SQL Llama2 Template](https://github.com/langchain-ai/langchain/tree/master/templates/sql-llama2)
- [Documentation](https://python.langchain.com/docs/tutorials/extraction/)
- End-to-end Example: [SQL Llama2 Template](https://github.com/langchain-ai/langchain-extract/)
**🤖 Chatbots**
- [Documentation](https://python.langchain.com/docs/use_cases/chatbots)
- [Documentation](https://python.langchain.com/docs/tutorials/chatbot/)
- End-to-end Example: [Web LangChain (web researcher chatbot)](https://weblangchain.vercel.app) and [repo](https://github.com/langchain-ai/weblangchain)
And much more! Head to the [Use cases](https://python.langchain.com/docs/use_cases/) section of the docs for more.
And much more! Head to the [Tutorials](https://python.langchain.com/docs/tutorials/) section of the docs for more.
## 🚀 How does LangChain help?
The main value props of the LangChain libraries are:
1. **Components**: composable tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
1. **Components**: composable building blocks, tools and integrations for working with language models. Components are modular and easy-to-use, whether you are using the rest of the LangChain framework or not
2. **Off-the-shelf chains**: built-in assemblages of components for accomplishing higher-level tasks
Off-the-shelf chains make it easy to get started. Components make it easy to customize existing chains and build new ones.
Off-the-shelf chains make it easy to get started. Components make it easy to customize existing chains and build new ones.
## LangChain Expression Language (LCEL)
LCEL is a key part of LangChain, allowing you to build and organize chains of processes in a straightforward, declarative manner. It was designed to support taking prototypes directly into production without needing to alter any code. This means you can use LCEL to set up everything from basic "prompt + LLM" setups to intricate, multi-step workflows.
- **[Overview](https://python.langchain.com/docs/concepts/#langchain-expression-language-lcel)**: LCEL and its benefits
- **[Interface](https://python.langchain.com/docs/concepts/#runnable-interface)**: The standard Runnable interface for LCEL objects
- **[Primitives](https://python.langchain.com/docs/how_to/#langchain-expression-language-lcel)**: More on the primitives LCEL includes
- **[Cheatsheet](https://python.langchain.com/docs/how_to/lcel_cheatsheet/)**: Quick overview of the most common usage patterns
## Components
Components fall into the following **modules**:
**📃 Model I/O:**
**📃 Model I/O**
This includes prompt management, prompt optimization, a generic interface for all LLMs, and common utilities for working with LLMs.
This includes [prompt management](https://python.langchain.com/docs/concepts/#prompt-templates), [prompt optimization](https://python.langchain.com/docs/concepts/#example-selectors), a generic interface for [chat models](https://python.langchain.com/docs/concepts/#chat-models) and [LLMs](https://python.langchain.com/docs/concepts/#llms), and common utilities for working with [model outputs](https://python.langchain.com/docs/concepts/#output-parsers).
**📚 Retrieval:**
**📚 Retrieval**
Data Augmented Generation involves specific types of chains that first interact with an external data source to fetch data for use in the generation step. Examples include summarization of long pieces of text and question/answering over specific data sources.
Retrieval Augmented Generation involves [loading data](https://python.langchain.com/docs/concepts/#document-loaders) from a variety of sources, [preparing it](https://python.langchain.com/docs/concepts/#text-splitters), then [searching over (a.k.a. retrieving from)](https://python.langchain.com/docs/concepts/#retrievers) it for use in the generation step.
**🤖 Agents:**
**🤖 Agents**
Agents involve an LLM making decisions about which Actions to take, taking that Action, seeing an Observation, and repeating that until done. LangChain provides a standard interface for agents, a selection of agents to choose from, and examples of end-to-end agents.
Agents allow an LLM autonomy over how a task is accomplished. Agents make decisions about which Actions to take, then take that Action, observe the result, and repeat until the task is complete. LangChain provides a [standard interface for agents](https://python.langchain.com/docs/concepts/#agents), along with [LangGraph](https://github.com/langchain-ai/langgraph) for building custom agents.
## 📖 Documentation
Please see [here](https://python.langchain.com) for full documentation, which includes:
- [Getting started](https://python.langchain.com/docs/get_started/introduction): installation, setting up the environment, simple examples
- Overview of the [interfaces](https://python.langchain.com/docs/expression_language/), [modules](https://python.langchain.com/docs/modules/), and [integrations](https://python.langchain.com/docs/integrations/providers)
- [Use case](https://python.langchain.com/docs/use_cases/qa_structured/sql) walkthroughs and best practice [guides](https://python.langchain.com/docs/guides/adapters/openai)
- [LangSmith](https://python.langchain.com/docs/langsmith/), [LangServe](https://python.langchain.com/docs/langserve), and [LangChain Template](https://python.langchain.com/docs/templates/) overviews
- [Reference](https://api.python.langchain.com): full API docs
- [Introduction](https://python.langchain.com/docs/introduction/): Overview of the framework and the structure of the docs.
- [Tutorials](https://python.langchain.com/docs/tutorials/): If you're looking to build something specific or are more of a hands-on learner, check out our tutorials. This is the best place to get started.
- [How-to guides](https://python.langchain.com/docs/how_to/): Answers to “How do I….?” type questions. These guides are goal-oriented and concrete; they're meant to help you complete a specific task.
- [Conceptual guide](https://python.langchain.com/docs/concepts/): Conceptual explanations of the key parts of the framework.
- [API Reference](https://api.python.langchain.com): Thorough documentation of every class and method.
## 🌐 Ecosystem
- [🦜🛠️ LangSmith](https://docs.smith.langchain.com/): Trace and evaluate your language model applications and intelligent agents to help you move from prototype to production.
- [🦜🕸️ LangGraph](https://langchain-ai.github.io/langgraph/): Create stateful, multi-actor applications with LLMs. Integrates smoothly with LangChain, but can be used without it.
- [🦜🏓 LangServe](https://python.langchain.com/docs/langserve): Deploy LangChain runnables and chains as REST APIs.
## 💁 Contributing

View File

@@ -1,6 +1,61 @@
# Security Policy
## Reporting a Vulnerability
## Reporting OSS Vulnerabilities
Please report security vulnerabilities by email to `security@langchain.dev`.
This email is an alias to a subset of our maintainers, and will ensure the issue is promptly triaged and acted upon as needed.
LangChain is partnered with [huntr by Protect AI](https://huntr.com/) to provide
a bounty program for our open source projects.
Please report security vulnerabilities associated with the LangChain
open source projects by visiting the following link:
[https://huntr.com/bounties/disclose/](https://huntr.com/bounties/disclose/?target=https%3A%2F%2Fgithub.com%2Flangchain-ai%2Flangchain&validSearch=true)
Before reporting a vulnerability, please review:
1) In-Scope Targets and Out-of-Scope Targets below.
2) The [langchain-ai/langchain](https://python.langchain.com/docs/contributing/repo_structure) monorepo structure.
3) LangChain [security guidelines](https://python.langchain.com/docs/security) to
understand what we consider to be a security vulnerability vs. developer
responsibility.
### In-Scope Targets
The following packages and repositories are eligible for bug bounties:
- langchain-core
- langchain (see exceptions)
- langchain-community (see exceptions)
- langgraph
- langserve
### Out of Scope Targets
All out of scope targets defined by huntr as well as:
- **langchain-experimental**: This repository is for experimental code and is not
eligible for bug bounties, bug reports to it will be marked as interesting or waste of
time and published with no bounty attached.
- **tools**: Tools in either langchain or langchain-community are not eligible for bug
bounties. This includes the following directories
- langchain/tools
- langchain-community/tools
- Please review our [security guidelines](https://python.langchain.com/docs/security)
for more details, but generally tools interact with the real world. Developers are
expected to understand the security implications of their code and are responsible
for the security of their tools.
- Code documented with security notices. This will be decided done on a case by
case basis, but likely will not be eligible for a bounty as the code is already
documented with guidelines for developers that should be followed for making their
application secure.
- Any LangSmith related repositories or APIs see below.
## Reporting LangSmith Vulnerabilities
Please report security vulnerabilities associated with LangSmith by email to `security@langchain.dev`.
- LangSmith site: https://smith.langchain.com
- SDK client: https://github.com/langchain-ai/langsmith-sdk
### Other Security Concerns
For any other security concerns, please contact us at `security@langchain.dev`.

View File

@@ -38,9 +38,9 @@
"\n",
"To run locally, we use Ollama.ai. \n",
"\n",
"See [here](https://python.langchain.com/docs/integrations/chat/ollama) for details on installation and setup.\n",
"See [here](/docs/integrations/chat/ollama) for details on installation and setup.\n",
"\n",
"Also, see [here](https://python.langchain.com/docs/guides/local_llms) for our full guide on local LLMs.\n",
"Also, see [here](/docs/guides/development/local_llms) for our full guide on local LLMs.\n",
" \n",
"To use an external API, which is not private, we can use Replicate."
]

View File

@@ -64,7 +64,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain openai chromadb langchain-experimental # (newest versions required for multi-modal)"
"! pip install -U langchain openai langchain-chroma langchain-experimental # (newest versions required for multi-modal)"
]
},
{
@@ -355,7 +355,7 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
@@ -464,8 +464,8 @@
" Check if the base64 data is an image by looking at the start of the data\n",
" \"\"\"\n",
" image_signatures = {\n",
" b\"\\xFF\\xD8\\xFF\": \"jpg\",\n",
" b\"\\x89\\x50\\x4E\\x47\\x0D\\x0A\\x1A\\x0A\": \"png\",\n",
" b\"\\xff\\xd8\\xff\": \"jpg\",\n",
" b\"\\x89\\x50\\x4e\\x47\\x0d\\x0a\\x1a\\x0a\": \"png\",\n",
" b\"\\x47\\x49\\x46\\x38\": \"gif\",\n",
" b\"\\x52\\x49\\x46\\x46\": \"webp\",\n",
" }\n",
@@ -604,7 +604,7 @@
"source": [
"# Check retrieval\n",
"query = \"Give me company names that are interesting investments based on EV / NTM and NTM rev growth. Consider EV / NTM multiples vs historical?\"\n",
"docs = retriever_multi_vector_img.get_relevant_documents(query, limit=6)\n",
"docs = retriever_multi_vector_img.invoke(query, limit=6)\n",
"\n",
"# We get 4 docs\n",
"len(docs)"
@@ -630,7 +630,7 @@
"source": [
"# Check retrieval\n",
"query = \"What are the EV / NTM and NTM rev growth for MongoDB, Cloudflare, and Datadog?\"\n",
"docs = retriever_multi_vector_img.get_relevant_documents(query, limit=6)\n",
"docs = retriever_multi_vector_img.invoke(query, limit=6)\n",
"\n",
"# We get 4 docs\n",
"len(docs)"

View File

@@ -37,7 +37,7 @@
"metadata": {},
"outputs": [],
"source": [
"%pip install -U --quiet langchain langchain_community openai chromadb langchain-experimental\n",
"%pip install -U --quiet langchain langchain-chroma langchain-community openai langchain-experimental\n",
"%pip install --quiet \"unstructured[all-docs]\" pypdf pillow pydantic lxml pillow matplotlib chromadb tiktoken"
]
},
@@ -185,7 +185,7 @@
" )\n",
" # Text summary chain\n",
" model = VertexAI(\n",
" temperature=0, model_name=\"gemini-pro\", max_output_tokens=1024\n",
" temperature=0, model_name=\"gemini-pro\", max_tokens=1024\n",
" ).with_fallbacks([empty_response])\n",
" summarize_chain = {\"element\": lambda x: x} | prompt | model | StrOutputParser()\n",
"\n",
@@ -254,9 +254,9 @@
"\n",
"def image_summarize(img_base64, prompt):\n",
" \"\"\"Make image summary\"\"\"\n",
" model = ChatVertexAI(model_name=\"gemini-pro-vision\", max_output_tokens=1024)\n",
" model = ChatVertexAI(model=\"gemini-pro-vision\", max_tokens=1024)\n",
"\n",
" msg = model(\n",
" msg = model.invoke(\n",
" [\n",
" HumanMessage(\n",
" content=[\n",
@@ -344,8 +344,8 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.embeddings import VertexAIEmbeddings\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.documents import Document\n",
"\n",
"\n",
@@ -445,7 +445,7 @@
"\n",
"\n",
"def plt_img_base64(img_base64):\n",
" \"\"\"Disply base64 encoded string as image\"\"\"\n",
" \"\"\"Display base64 encoded string as image\"\"\"\n",
" # Create an HTML img tag with the base64 string as the source\n",
" image_html = f'<img src=\"data:image/jpeg;base64,{img_base64}\" />'\n",
" # Display the image by rendering the HTML\n",
@@ -462,8 +462,8 @@
" Check if the base64 data is an image by looking at the start of the data\n",
" \"\"\"\n",
" image_signatures = {\n",
" b\"\\xFF\\xD8\\xFF\": \"jpg\",\n",
" b\"\\x89\\x50\\x4E\\x47\\x0D\\x0A\\x1A\\x0A\": \"png\",\n",
" b\"\\xff\\xd8\\xff\": \"jpg\",\n",
" b\"\\x89\\x50\\x4e\\x47\\x0d\\x0a\\x1a\\x0a\": \"png\",\n",
" b\"\\x47\\x49\\x46\\x38\": \"gif\",\n",
" b\"\\x52\\x49\\x46\\x46\": \"webp\",\n",
" }\n",
@@ -553,9 +553,7 @@
" \"\"\"\n",
"\n",
" # Multi-modal LLM\n",
" model = ChatVertexAI(\n",
" temperature=0, model_name=\"gemini-pro-vision\", max_output_tokens=1024\n",
" )\n",
" model = ChatVertexAI(temperature=0, model_name=\"gemini-pro-vision\", max_tokens=1024)\n",
"\n",
" # RAG pipeline\n",
" chain = (\n",
@@ -604,7 +602,7 @@
],
"source": [
"query = \"What are the EV / NTM and NTM rev growth for MongoDB, Cloudflare, and Datadog?\"\n",
"docs = retriever_multi_vector_img.get_relevant_documents(query, limit=1)\n",
"docs = retriever_multi_vector_img.invoke(query, limit=1)\n",
"\n",
"# We get 2 docs\n",
"len(docs)"

View File

@@ -7,7 +7,7 @@
"metadata": {},
"outputs": [],
"source": [
"pip install -U langchain umap-learn scikit-learn langchain_community tiktoken langchain-openai langchainhub chromadb langchain-anthropic"
"pip install -U langchain umap-learn scikit-learn langchain_community tiktoken langchain-openai langchainhub langchain-chroma langchain-anthropic"
]
},
{
@@ -535,9 +535,9 @@
" print(f\"--Generated {len(all_clusters)} clusters--\")\n",
"\n",
" # Summarization\n",
" template = \"\"\"Here is a sub-set of LangChain Expression Langauge doc. \n",
" template = \"\"\"Here is a sub-set of LangChain Expression Language doc. \n",
" \n",
" LangChain Expression Langauge provides a way to compose chain in LangChain.\n",
" LangChain Expression Language provides a way to compose chain in LangChain.\n",
" \n",
" Give a detailed summary of the documentation provided.\n",
" \n",
@@ -645,7 +645,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"\n",
"# Initialize all_texts with leaf_texts\n",
"all_texts = leaf_texts.copy()\n",

View File

@@ -4,10 +4,13 @@ Example code for building applications with LangChain, with an emphasis on more
Notebook | Description
:- | :-
[agent_fireworks_ai_langchain_mongodb.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/agent_fireworks_ai_langchain_mongodb.ipynb) | Build an AI Agent With Memory Using MongoDB, LangChain and FireWorksAI.
[mongodb-langchain-cache-memory.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/mongodb-langchain-cache-memory.ipynb) | Build a RAG Application with Semantic Cache Using MongoDB and LangChain.
[LLaMA2_sql_chat.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/LLaMA2_sql_chat.ipynb) | Build a chat application that interacts with a SQL database using an open source llm (llama2), specifically demonstrated on an SQLite database containing rosters.
[Semi_Structured_RAG.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_Structured_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data, including text and tables, using unstructured for parsing, multi-vector retriever for storing, and lcel for implementing chains.
[Semi_structured_and_multi_moda...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_and_multi_modal_RAG.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using unstructured for parsing, multi-vector retriever for storage and retrieval, and lcel for implementing chains.
[Semi_structured_multi_modal_RA...](https://github.com/langchain-ai/langchain/tree/master/cookbook/Semi_structured_multi_modal_RAG_LLaMA2.ipynb) | Perform retrieval-augmented generation (rag) on documents with semi-structured data and images, using various tools and methods such as unstructured for parsing, multi-vector retriever for storing, lcel for implementing chains, and open source language models like llama2, llava, and gpt4all.
[amazon_personalize_how_to.ipynb](https://github.com/langchain-ai/langchain/blob/master/cookbook/amazon_personalize_how_to.ipynb) | Retrieving personalized recommendations from Amazon Personalize and use custom agents to build generative AI apps
[analyze_document.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/analyze_document.ipynb) | Analyze a single long document.
[autogpt/autogpt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/autogpt.ipynb) | Implement autogpt, a language model, with langchain primitives such as llms, prompttemplates, vectorstores, embeddings, and tools.
[autogpt/marathon_times.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/autogpt/marathon_times.ipynb) | Implement autogpt for finding winning marathon times.
@@ -35,6 +38,7 @@ Notebook | Description
[llm_symbolic_math.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/llm_symbolic_math.ipynb) | Solve algebraic equations with the help of llms (language learning models) and sympy, a python library for symbolic mathematics.
[meta_prompt.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/meta_prompt.ipynb) | Implement the meta-prompt concept, which is a method for building self-improving agents that reflect on their own performance and modify their instructions accordingly.
[multi_modal_output_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_modal_output_agent.ipynb) | Generate multi-modal outputs, specifically images and text.
[multi_modal_RAG_vdms.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_modal_RAG_vdms.ipynb) | Perform retrieval-augmented generation (rag) on documents including text and images, using unstructured for parsing, Intel's Visual Data Management System (VDMS) as the vectorstore, and chains.
[multi_player_dnd.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multi_player_dnd.ipynb) | Simulate multi-player dungeons & dragons games, with a custom function determining the speaking schedule of the agents.
[multiagent_authoritarian.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multiagent_authoritarian.ipynb) | Implement a multi-agent simulation where a privileged agent controls the conversation, including deciding who speaks and when the conversation ends, in the context of a simulated news network.
[multiagent_bidding.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/multiagent_bidding.ipynb) | Implement a multi-agent simulation where agents bid to speak, with the highest bidder speaking next, demonstrated through a fictitious presidential debate example.
@@ -46,6 +50,7 @@ Notebook | Description
[press_releases.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/press_releases.ipynb) | Retrieve and query company press release data powered by [Kay.ai](https://kay.ai).
[program_aided_language_model.i...](https://github.com/langchain-ai/langchain/tree/master/cookbook/program_aided_language_model.ipynb) | Implement program-aided language models as described in the provided research paper.
[qa_citations.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/qa_citations.ipynb) | Different ways to get a model to cite its sources.
[rag_upstage_layout_analysis_groundedness_check.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/rag_upstage_layout_analysis_groundedness_check.ipynb) | End-to-end RAG example using Upstage Layout Analysis and Groundedness Check.
[retrieval_in_sql.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/retrieval_in_sql.ipynb) | Perform retrieval-augmented-generation (rag) on a PostgreSQL database using pgvector.
[sales_agent_with_context.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/sales_agent_with_context.ipynb) | Implement a context-aware ai sales agent, salesgpt, that can have natural sales conversations, interact with other systems, and use a product knowledge base to discuss a company's offerings.
[self_query_hotel_search.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/self_query_hotel_search.ipynb) | Build a hotel room search feature with self-querying retrieval, using a specific hotel recommendation dataset.
@@ -55,3 +60,7 @@ Notebook | Description
[two_agent_debate_tools.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/two_agent_debate_tools.ipynb) | Simulate multi-agent dialogues where the agents can utilize various tools.
[two_player_dnd.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/two_player_dnd.ipynb) | Simulate a two-player dungeons & dragons game, where a dialogue simulator class is used to coordinate the dialogue between the protagonist and the dungeon master.
[wikibase_agent.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/wikibase_agent.ipynb) | Create a simple wikibase agent that utilizes sparql generation, with testing done on http://wikidata.org.
[oracleai_demo.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/oracleai_demo.ipynb) | This guide outlines how to utilize Oracle AI Vector Search alongside Langchain for an end-to-end RAG pipeline, providing step-by-step examples. The process includes loading documents from various sources using OracleDocLoader, summarizing them either within or outside the database with OracleSummary, and generating embeddings similarly through OracleEmbeddings. It also covers chunking documents according to specific requirements using Advanced Oracle Capabilities from OracleTextSplitter, and finally, storing and indexing these documents in a Vector Store for querying with OracleVS.
[rag-locally-on-intel-cpu.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/rag-locally-on-intel-cpu.ipynb) | Perform Retrieval-Augmented-Generation (RAG) on locally downloaded open-source models using langchain and open source tools and execute it on Intel Xeon CPU. We showed an example of how to apply RAG on Llama 2 model and enable it to answer the queries related to Intel Q1 2024 earnings release.
[visual_RAG_vdms.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/visual_RAG_vdms.ipynb) | Performs Visual Retrieval-Augmented-Generation (RAG) using videos and scene descriptions generated by open source models.
[contextual_rag.ipynb](https://github.com/langchain-ai/langchain/tree/master/cookbook/contextual_rag.ipynb) | Performs contextual retrieval-augmented generation (RAG) prepending chunk-specific explanatory context to each chunk before embedding.

View File

@@ -39,7 +39,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain unstructured[all-docs] pydantic lxml langchainhub"
"! pip install langchain langchain-chroma \"unstructured[all-docs]\" pydantic lxml langchainhub"
]
},
{
@@ -75,7 +75,7 @@
"\n",
"Apply to the [`LLaMA2`](https://arxiv.org/pdf/2307.09288.pdf) paper. \n",
"\n",
"We use the Unstructured [`partition_pdf`](https://unstructured-io.github.io/unstructured/bricks/partition.html#partition-pdf), which segments a PDF document by using a layout model. \n",
"We use the Unstructured [`partition_pdf`](https://unstructured-io.github.io/unstructured/core/partition.html#partition-pdf), which segments a PDF document by using a layout model. \n",
"\n",
"This layout model makes it possible to extract elements, such as tables, from pdfs. \n",
"\n",
@@ -320,7 +320,7 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",

View File

@@ -59,7 +59,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain unstructured[all-docs] pydantic lxml"
"! pip install langchain langchain-chroma \"unstructured[all-docs]\" pydantic lxml"
]
},
{
@@ -375,7 +375,7 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
@@ -562,9 +562,7 @@
],
"source": [
"# We can retrieve this table\n",
"retriever.get_relevant_documents(\n",
" \"What are results for LLaMA across across domains / subjects?\"\n",
")[1]"
"retriever.invoke(\"What are results for LLaMA across across domains / subjects?\")[1]"
]
},
{
@@ -614,9 +612,7 @@
}
],
"source": [
"retriever.get_relevant_documents(\"Images / figures with playful and creative examples\")[\n",
" 1\n",
"]"
"retriever.invoke(\"Images / figures with playful and creative examples\")[1]"
]
},
{

View File

@@ -59,7 +59,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain unstructured[all-docs] pydantic lxml"
"! pip install langchain langchain-chroma \"unstructured[all-docs]\" pydantic lxml"
]
},
{
@@ -191,15 +191,15 @@
"source": [
"## Multi-vector retriever\n",
"\n",
"Use [multi-vector-retriever](https://python.langchain.com/docs/modules/data_connection/retrievers/multi_vector#summary).\n",
"Use [multi-vector-retriever](/docs/modules/data_connection/retrievers/multi_vector#summary).\n",
"\n",
"Summaries are used to retrieve raw tables and / or raw chunks of text.\n",
"\n",
"### Text and Table summaries\n",
"\n",
"Here, we use ollama.ai to run LLaMA2 locally. \n",
"Here, we use Ollama to run LLaMA2 locally. \n",
"\n",
"See details on installation [here](https://python.langchain.com/docs/guides/local_llms)."
"See details on installation [here](/docs/guides/development/local_llms)."
]
},
{
@@ -378,8 +378,8 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.embeddings import GPT4AllEmbeddings\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.documents import Document\n",
"\n",
"# The vectorstore to use to index the child chunks\n",
@@ -501,9 +501,7 @@
}
],
"source": [
"retriever.get_relevant_documents(\"Images / figures with playful and creative examples\")[\n",
" 0\n",
"]"
"retriever.invoke(\"Images / figures with playful and creative examples\")[0]"
]
},
{

View File

@@ -19,7 +19,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain openai chromadb langchain-experimental # (newest versions required for multi-modal)"
"! pip install -U langchain openai langchain_chroma langchain-experimental # (newest versions required for multi-modal)"
]
},
{
@@ -132,7 +132,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"baseline = Chroma.from_texts(\n",
@@ -342,7 +342,7 @@
"# Testing on retrieval\n",
"query = \"What percentage of CPI is dedicated to Housing, and how does it compare to the combined percentage of Medical Care, Apparel, and Other Goods and Services?\"\n",
"suffix_for_images = \" Include any pie charts, graphs, or tables.\"\n",
"docs = retriever_multi_vector_img.get_relevant_documents(query + suffix_for_images)"
"docs = retriever_multi_vector_img.invoke(query + suffix_for_images)"
]
},
{
@@ -532,8 +532,8 @@
"def is_image_data(b64data):\n",
" \"\"\"Check if the base64 data is an image by looking at the start of the data.\"\"\"\n",
" image_signatures = {\n",
" b\"\\xFF\\xD8\\xFF\": \"jpg\",\n",
" b\"\\x89\\x50\\x4E\\x47\\x0D\\x0A\\x1A\\x0A\": \"png\",\n",
" b\"\\xff\\xd8\\xff\": \"jpg\",\n",
" b\"\\x89\\x50\\x4e\\x47\\x0d\\x0a\\x1a\\x0a\": \"png\",\n",
" b\"\\x47\\x49\\x46\\x38\": \"gif\",\n",
" b\"\\x52\\x49\\x46\\x46\": \"webp\",\n",
" }\n",

File diff suppressed because one or more lines are too long

View File

@@ -28,7 +28,7 @@
"outputs": [],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAI, OpenAIEmbeddings\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",

View File

@@ -14,7 +14,7 @@
}
],
"source": [
"%pip install -qU langchain-airbyte"
"%pip install -qU langchain-airbyte langchain_chroma"
]
},
{
@@ -123,7 +123,7 @@
"outputs": [],
"source": [
"import tiktoken\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"enc = tiktoken.get_encoding(\"cl100k_base\")\n",

File diff suppressed because one or more lines are too long

View File

@@ -40,11 +40,13 @@
"import nest_asyncio\n",
"import pandas as pd\n",
"from langchain.docstore.document import Document\n",
"from langchain_community.agent_toolkits.pandas.base import create_pandas_dataframe_agent\n",
"from langchain_experimental.agents.agent_toolkits.pandas.base import (\n",
" create_pandas_dataframe_agent,\n",
")\n",
"from langchain_experimental.autonomous_agents import AutoGPT\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"# Needed synce jupyter runs an async eventloop\n",
"# Needed since jupyter runs an async eventloop\n",
"nest_asyncio.apply()"
]
},
@@ -57,7 +59,7 @@
},
"outputs": [],
"source": [
"llm = ChatOpenAI(model_name=\"gpt-4\", temperature=1.0)"
"llm = ChatOpenAI(model=\"gpt-4\", temperature=1.0)"
]
},
{

File diff suppressed because one or more lines are too long

View File

@@ -90,7 +90,7 @@
" ) -> AIMessage:\n",
" messages = self.update_messages(input_message)\n",
"\n",
" output_message = self.model(messages)\n",
" output_message = self.model.invoke(messages)\n",
" self.update_messages(output_message)\n",
"\n",
" return output_message"

View File

@@ -90,7 +90,8 @@
"import os\n",
"from getpass import getpass\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"if \"OPENAI_API_KEY\" not in os.environ:\n",
" os.environ[\"OPENAI_API_KEY\"] = getpass()\n",
"# Please manually enter OpenAI Key"
]
},
@@ -933,7 +934,7 @@
"**Answer**: The LangChain class includes various types of retrievers such as:\n",
"\n",
"- ArxivRetriever\n",
"- AzureCognitiveSearchRetriever\n",
"- AzureAISearchRetriever\n",
"- BM25Retriever\n",
"- ChaindeskRetriever\n",
"- ChatGPTPluginRetriever\n",
@@ -993,7 +994,7 @@
{
"data": {
"text/plain": [
"{'question': 'LangChain possesses a variety of retrievers including:\\n\\n1. ArxivRetriever\\n2. AzureCognitiveSearchRetriever\\n3. BM25Retriever\\n4. ChaindeskRetriever\\n5. ChatGPTPluginRetriever\\n6. ContextualCompressionRetriever\\n7. DocArrayRetriever\\n8. ElasticSearchBM25Retriever\\n9. EnsembleRetriever\\n10. GoogleVertexAISearchRetriever\\n11. AmazonKendraRetriever\\n12. KNNRetriever\\n13. LlamaIndexGraphRetriever\\n14. LlamaIndexRetriever\\n15. MergerRetriever\\n16. MetalRetriever\\n17. MilvusRetriever\\n18. MultiQueryRetriever\\n19. ParentDocumentRetriever\\n20. PineconeHybridSearchRetriever\\n21. PubMedRetriever\\n22. RePhraseQueryRetriever\\n23. RemoteLangChainRetriever\\n24. SelfQueryRetriever\\n25. SVMRetriever\\n26. TFIDFRetriever\\n27. TimeWeightedVectorStoreRetriever\\n28. VespaRetriever\\n29. WeaviateHybridSearchRetriever\\n30. WebResearchRetriever\\n31. WikipediaRetriever\\n32. ZepRetriever\\n33. ZillizRetriever\\n\\nIt also includes self query translators like:\\n\\n1. ChromaTranslator\\n2. DeepLakeTranslator\\n3. MyScaleTranslator\\n4. PineconeTranslator\\n5. QdrantTranslator\\n6. WeaviateTranslator\\n\\nAnd remote retrievers like:\\n\\n1. RemoteLangChainRetriever'}"
"{'question': 'LangChain possesses a variety of retrievers including:\\n\\n1. ArxivRetriever\\n2. AzureAISearchRetriever\\n3. BM25Retriever\\n4. ChaindeskRetriever\\n5. ChatGPTPluginRetriever\\n6. ContextualCompressionRetriever\\n7. DocArrayRetriever\\n8. ElasticSearchBM25Retriever\\n9. EnsembleRetriever\\n10. GoogleVertexAISearchRetriever\\n11. AmazonKendraRetriever\\n12. KNNRetriever\\n13. LlamaIndexGraphRetriever\\n14. LlamaIndexRetriever\\n15. MergerRetriever\\n16. MetalRetriever\\n17. MilvusRetriever\\n18. MultiQueryRetriever\\n19. ParentDocumentRetriever\\n20. PineconeHybridSearchRetriever\\n21. PubMedRetriever\\n22. RePhraseQueryRetriever\\n23. RemoteLangChainRetriever\\n24. SelfQueryRetriever\\n25. SVMRetriever\\n26. TFIDFRetriever\\n27. TimeWeightedVectorStoreRetriever\\n28. VespaRetriever\\n29. WeaviateHybridSearchRetriever\\n30. WebResearchRetriever\\n31. WikipediaRetriever\\n32. ZepRetriever\\n33. ZillizRetriever\\n\\nIt also includes self query translators like:\\n\\n1. ChromaTranslator\\n2. DeepLakeTranslator\\n3. MyScaleTranslator\\n4. PineconeTranslator\\n5. QdrantTranslator\\n6. WeaviateTranslator\\n\\nAnd remote retrievers like:\\n\\n1. RemoteLangChainRetriever'}"
]
},
"execution_count": 31,
@@ -1117,7 +1118,7 @@
"The LangChain class includes various types of retrievers such as:\n",
"\n",
"- ArxivRetriever\n",
"- AzureCognitiveSearchRetriever\n",
"- AzureAISearchRetriever\n",
"- BM25Retriever\n",
"- ChaindeskRetriever\n",
"- ChatGPTPluginRetriever\n",

File diff suppressed because it is too large Load Diff

557
cookbook/cql_agent.ipynb Normal file
View File

@@ -0,0 +1,557 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup Environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Python Modules"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Install the following Python modules:\n",
"\n",
"```bash\n",
"pip install ipykernel python-dotenv cassio pandas langchain_openai langchain langchain-community langchainhub langchain_experimental openai-multi-tool-use-parallel-patch\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load the `.env` File"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Connection is via `cassio` using `auto=True` parameter, and the notebook uses OpenAI. You should create a `.env` file accordingly.\n",
"\n",
"For Cassandra, set:\n",
"```bash\n",
"CASSANDRA_CONTACT_POINTS\n",
"CASSANDRA_USERNAME\n",
"CASSANDRA_PASSWORD\n",
"CASSANDRA_KEYSPACE\n",
"```\n",
"\n",
"For Astra, set:\n",
"```bash\n",
"ASTRA_DB_APPLICATION_TOKEN\n",
"ASTRA_DB_DATABASE_ID\n",
"ASTRA_DB_KEYSPACE\n",
"```\n",
"\n",
"For example:\n",
"\n",
"```bash\n",
"# Connection to Astra:\n",
"ASTRA_DB_DATABASE_ID=a1b2c3d4-...\n",
"ASTRA_DB_APPLICATION_TOKEN=AstraCS:...\n",
"ASTRA_DB_KEYSPACE=notebooks\n",
"\n",
"# Also set \n",
"OPENAI_API_KEY=sk-....\n",
"```\n",
"\n",
"(You may also modify the below code to directly connect with `cassio`.)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dotenv import load_dotenv\n",
"\n",
"load_dotenv(override=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to Cassandra"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"import cassio\n",
"\n",
"cassio.init(auto=True)\n",
"session = cassio.config.resolve_session()\n",
"if not session:\n",
" raise Exception(\n",
" \"Check environment configuration or manually configure cassio connection parameters\"\n",
" )\n",
"\n",
"keyspace = os.environ.get(\n",
" \"ASTRA_DB_KEYSPACE\", os.environ.get(\"CASSANDRA_KEYSPACE\", None)\n",
")\n",
"if not keyspace:\n",
" raise ValueError(\"a KEYSPACE environment variable must be set\")\n",
"\n",
"session.set_keyspace(keyspace)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup Database"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This needs to be done one time only!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Download Data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The dataset used is from Kaggle, the [Environmental Sensor Telemetry Data](https://www.kaggle.com/datasets/garystafford/environmental-sensor-data-132k?select=iot_telemetry_data.csv). The next cell will download and unzip the data into a Pandas dataframe. The following cell is instructions to download manually. \n",
"\n",
"The net result of this section is you should have a Pandas dataframe variable `df`."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Download Automatically"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from io import BytesIO\n",
"from zipfile import ZipFile\n",
"\n",
"import pandas as pd\n",
"import requests\n",
"\n",
"datasetURL = \"https://storage.googleapis.com/kaggle-data-sets/788816/1355729/bundle/archive.zip?X-Goog-Algorithm=GOOG4-RSA-SHA256&X-Goog-Credential=gcp-kaggle-com%40kaggle-161607.iam.gserviceaccount.com%2F20240404%2Fauto%2Fstorage%2Fgoog4_request&X-Goog-Date=20240404T115828Z&X-Goog-Expires=259200&X-Goog-SignedHeaders=host&X-Goog-Signature=2849f003b100eb9dcda8dd8535990f51244292f67e4f5fad36f14aa67f2d4297672d8fe6ff5a39f03a29cda051e33e95d36daab5892b8874dcd5a60228df0361fa26bae491dd4371f02dd20306b583a44ba85a4474376188b1f84765147d3b4f05c57345e5de883c2c29653cce1f3755cd8e645c5e952f4fb1c8a735b22f0c811f97f7bce8d0235d0d3731ca8ab4629ff381f3bae9e35fc1b181c1e69a9c7913a5e42d9d52d53e5f716467205af9c8a3cc6746fc5352e8fbc47cd7d18543626bd67996d18c2045c1e475fc136df83df352fa747f1a3bb73e6ba3985840792ec1de407c15836640ec96db111b173bf16115037d53fdfbfd8ac44145d7f9a546aa\"\n",
"\n",
"response = requests.get(datasetURL)\n",
"if response.status_code == 200:\n",
" zip_file = ZipFile(BytesIO(response.content))\n",
" csv_file_name = zip_file.namelist()[0]\n",
"else:\n",
" print(\"Failed to download the file\")\n",
"\n",
"with zip_file.open(csv_file_name) as csv_file:\n",
" df = pd.read_csv(csv_file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Download Manually"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can download the `.zip` file and unpack the `.csv` contained within. Comment in the next line, and adjust the path to this `.csv` file appropriately."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# df = pd.read_csv(\"/path/to/iot_telemetry_data.csv\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load Data into Cassandra"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This section assumes the existence of a dataframe `df`, the following cell validates its structure. The Download section above creates this object."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"assert df is not None, \"Dataframe 'df' must be set\"\n",
"expected_columns = [\n",
" \"ts\",\n",
" \"device\",\n",
" \"co\",\n",
" \"humidity\",\n",
" \"light\",\n",
" \"lpg\",\n",
" \"motion\",\n",
" \"smoke\",\n",
" \"temp\",\n",
"]\n",
"assert all(\n",
" [column in df.columns for column in expected_columns]\n",
"), \"DataFrame does not have the expected columns\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create and load tables:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from datetime import UTC, datetime\n",
"\n",
"from cassandra.query import BatchStatement\n",
"\n",
"# Create sensors table\n",
"table_query = \"\"\"\n",
"CREATE TABLE IF NOT EXISTS iot_sensors (\n",
" device text,\n",
" conditions text,\n",
" room text,\n",
" PRIMARY KEY (device)\n",
")\n",
"WITH COMMENT = 'Environmental IoT room sensor metadata.';\n",
"\"\"\"\n",
"session.execute(table_query)\n",
"\n",
"pstmt = session.prepare(\n",
" \"\"\"\n",
"INSERT INTO iot_sensors (device, conditions, room)\n",
"VALUES (?, ?, ?)\n",
"\"\"\"\n",
")\n",
"\n",
"devices = [\n",
" (\"00:0f:00:70:91:0a\", \"stable conditions, cooler and more humid\", \"room 1\"),\n",
" (\"1c:bf:ce:15:ec:4d\", \"highly variable temperature and humidity\", \"room 2\"),\n",
" (\"b8:27:eb:bf:9d:51\", \"stable conditions, warmer and dryer\", \"room 3\"),\n",
"]\n",
"\n",
"for device, conditions, room in devices:\n",
" session.execute(pstmt, (device, conditions, room))\n",
"\n",
"print(\"Sensors inserted successfully.\")\n",
"\n",
"# Create data table\n",
"table_query = \"\"\"\n",
"CREATE TABLE IF NOT EXISTS iot_data (\n",
" day text,\n",
" device text,\n",
" ts timestamp,\n",
" co double,\n",
" humidity double,\n",
" light boolean,\n",
" lpg double,\n",
" motion boolean,\n",
" smoke double,\n",
" temp double,\n",
" PRIMARY KEY ((day, device), ts)\n",
")\n",
"WITH COMMENT = 'Data from environmental IoT room sensors. Columns include device identifier, timestamp (ts) of the data collection, carbon monoxide level (co), relative humidity, light presence, LPG concentration, motion detection, smoke concentration, and temperature (temp). Data is partitioned by day and device.';\n",
"\"\"\"\n",
"session.execute(table_query)\n",
"\n",
"pstmt = session.prepare(\n",
" \"\"\"\n",
"INSERT INTO iot_data (day, device, ts, co, humidity, light, lpg, motion, smoke, temp)\n",
"VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)\n",
"\"\"\"\n",
")\n",
"\n",
"\n",
"def insert_data_batch(name, group):\n",
" batch = BatchStatement()\n",
" day, device = name\n",
" print(f\"Inserting batch for day: {day}, device: {device}\")\n",
"\n",
" for _, row in group.iterrows():\n",
" timestamp = datetime.fromtimestamp(row[\"ts\"], UTC)\n",
" batch.add(\n",
" pstmt,\n",
" (\n",
" day,\n",
" row[\"device\"],\n",
" timestamp,\n",
" row[\"co\"],\n",
" row[\"humidity\"],\n",
" row[\"light\"],\n",
" row[\"lpg\"],\n",
" row[\"motion\"],\n",
" row[\"smoke\"],\n",
" row[\"temp\"],\n",
" ),\n",
" )\n",
"\n",
" session.execute(batch)\n",
"\n",
"\n",
"# Convert columns to appropriate types\n",
"df[\"light\"] = df[\"light\"] == \"true\"\n",
"df[\"motion\"] = df[\"motion\"] == \"true\"\n",
"df[\"ts\"] = df[\"ts\"].astype(float)\n",
"df[\"day\"] = df[\"ts\"].apply(\n",
" lambda x: datetime.fromtimestamp(x, UTC).strftime(\"%Y-%m-%d\")\n",
")\n",
"\n",
"grouped_df = df.groupby([\"day\", \"device\"])\n",
"\n",
"for name, group in grouped_df:\n",
" insert_data_batch(name, group)\n",
"\n",
"print(\"Data load complete\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(session.keyspace)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load the Tools"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Python `import` statements for the demo:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import AgentExecutor, create_openai_tools_agent\n",
"from langchain_community.agent_toolkits.cassandra_database.toolkit import (\n",
" CassandraDatabaseToolkit,\n",
")\n",
"from langchain_community.tools.cassandra_database.prompt import QUERY_PATH_PROMPT\n",
"from langchain_community.tools.cassandra_database.tool import (\n",
" GetSchemaCassandraDatabaseTool,\n",
" GetTableDataCassandraDatabaseTool,\n",
" QueryCassandraDatabaseTool,\n",
")\n",
"from langchain_community.utilities.cassandra_database import CassandraDatabase\n",
"from langchain_openai import ChatOpenAI"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `CassandraDatabase` object is loaded from `cassio`, though it does accept a `Session`-type parameter as an alternative."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Create a CassandraDatabase instance\n",
"db = CassandraDatabase(include_tables=[\"iot_sensors\", \"iot_data\"])\n",
"\n",
"# Create the Cassandra Database tools\n",
"query_tool = QueryCassandraDatabaseTool(db=db)\n",
"schema_tool = GetSchemaCassandraDatabaseTool(db=db)\n",
"select_data_tool = GetTableDataCassandraDatabaseTool(db=db)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The tools can be invoked directly:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Test the tools\n",
"print(\"Executing a CQL query:\")\n",
"query = \"SELECT * FROM iot_sensors LIMIT 5;\"\n",
"result = query_tool.run({\"query\": query})\n",
"print(result)\n",
"\n",
"print(\"\\nGetting the schema for a keyspace:\")\n",
"schema = schema_tool.run({\"keyspace\": keyspace})\n",
"print(schema)\n",
"\n",
"print(\"\\nGetting data from a table:\")\n",
"table = \"iot_data\"\n",
"predicate = \"day = '2020-07-14' and device = 'b8:27:eb:bf:9d:51'\"\n",
"data = select_data_tool.run(\n",
" {\"keyspace\": keyspace, \"table\": table, \"predicate\": predicate, \"limit\": 5}\n",
")\n",
"print(data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Agent Configuration"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import Tool\n",
"from langchain_experimental.utilities import PythonREPL\n",
"\n",
"python_repl = PythonREPL()\n",
"\n",
"repl_tool = Tool(\n",
" name=\"python_repl\",\n",
" description=\"A Python shell. Use this to execute python commands. Input should be a valid python command. If you want to see the output of a value, you should print it out with `print(...)`.\",\n",
" func=python_repl.run,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain import hub\n",
"\n",
"llm = ChatOpenAI(temperature=0, model=\"gpt-4-1106-preview\")\n",
"toolkit = CassandraDatabaseToolkit(db=db)\n",
"\n",
"# context = toolkit.get_context()\n",
"# tools = toolkit.get_tools()\n",
"tools = [schema_tool, select_data_tool, repl_tool]\n",
"\n",
"input = (\n",
" QUERY_PATH_PROMPT\n",
" + f\"\"\"\n",
"\n",
"Here is your task: In the {keyspace} keyspace, find the total number of times the temperature of each device has exceeded 23 degrees on July 14, 2020.\n",
" Create a summary report including the name of the room. Use Pandas if helpful.\n",
"\"\"\"\n",
")\n",
"\n",
"prompt = hub.pull(\"hwchase17/openai-tools-agent\")\n",
"\n",
"# messages = [\n",
"# HumanMessagePromptTemplate.from_template(input),\n",
"# AIMessage(content=QUERY_PATH_PROMPT),\n",
"# MessagesPlaceholder(variable_name=\"agent_scratchpad\"),\n",
"# ]\n",
"\n",
"# prompt = ChatPromptTemplate.from_messages(messages)\n",
"# print(prompt)\n",
"\n",
"# Choose the LLM that will drive the agent\n",
"# Only certain models support this\n",
"llm = ChatOpenAI(model=\"gpt-3.5-turbo-1106\", temperature=0)\n",
"\n",
"# Construct the OpenAI Tools agent\n",
"agent = create_openai_tools_agent(llm, tools, prompt)\n",
"\n",
"print(\"Available tools:\")\n",
"for tool in tools:\n",
" print(\"\\t\" + tool.name + \" - \" + tool.description + \" - \" + str(tool))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)\n",
"\n",
"response = agent_executor.invoke({\"input\": input})\n",
"\n",
"print(response[\"output\"])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -169,7 +169,7 @@
"\n",
"def get_tools(query):\n",
" # Get documents, which contain the Plugins to use\n",
" docs = retriever.get_relevant_documents(query)\n",
" docs = retriever.invoke(query)\n",
" # Get the toolkits, one for each plugin\n",
" tool_kits = [toolkits_dict[d.metadata[\"plugin_name\"]] for d in docs]\n",
" # Get the tools: a separate NLAChain for each endpoint\n",

View File

@@ -193,7 +193,7 @@
"\n",
"def get_tools(query):\n",
" # Get documents, which contain the Plugins to use\n",
" docs = retriever.get_relevant_documents(query)\n",
" docs = retriever.invoke(query)\n",
" # Get the toolkits, one for each plugin\n",
" tool_kits = [toolkits_dict[d.metadata[\"plugin_name\"]] for d in docs]\n",
" # Get the tools: a separate NLAChain for each endpoint\n",

View File

@@ -142,7 +142,7 @@
"\n",
"\n",
"def get_tools(query):\n",
" docs = retriever.get_relevant_documents(query)\n",
" docs = retriever.invoke(query)\n",
" return [ALL_TOOLS[d.metadata[\"index\"]] for d in docs]"
]
},

View File

@@ -166,7 +166,7 @@
"source": [
"### SQL Database Agent example\n",
"\n",
"This example demonstrates the use of the [SQL Database Agent](/docs/integrations/toolkits/sql_database.html) for answering questions over a Databricks database."
"This example demonstrates the use of the [SQL Database Agent](/docs/integrations/tools/sql_database) for answering questions over a Databricks database."
]
},
{

View File

@@ -39,7 +39,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install langchain docugami==0.0.8 dgml-utils==0.3.0 pydantic langchainhub chromadb hnswlib --upgrade --quiet"
"! pip install langchain docugami==0.0.8 dgml-utils==0.3.0 pydantic langchainhub langchain-chroma hnswlib --upgrade --quiet"
]
},
{
@@ -547,7 +547,7 @@
"\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryStore\n",
"from langchain_community.vectorstores.chroma import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.documents import Document\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",

View File

@@ -84,7 +84,7 @@
"metadata": {},
"outputs": [],
"source": [
"llm = ChatOpenAI(model_name=\"gpt-4\", temperature=0)\n",
"llm = ChatOpenAI(model=\"gpt-4\", temperature=0)\n",
"chain = ElasticsearchDatabaseChain.from_llm(llm=llm, database=db, verbose=True)"
]
},

View File

@@ -100,7 +100,7 @@
}
],
"source": [
"agent.run(\"whats 2 + 2\")"
"agent.invoke(\"whats 2 + 2\")"
]
},
{

View File

@@ -84,7 +84,7 @@
}
],
"source": [
"%pip install --quiet pypdf chromadb tiktoken openai \n",
"%pip install --quiet pypdf langchain-chroma tiktoken openai \n",
"%pip uninstall -y langchain-fireworks\n",
"%pip install --editable /mnt/disks/data/langchain/libs/partners/fireworks"
]
@@ -138,7 +138,7 @@
"all_splits = text_splitter.split_documents(data)\n",
"\n",
"# Add to vectorDB\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_fireworks.embeddings import FireworksEmbeddings\n",
"\n",
"vectorstore = Chroma.from_documents(\n",

View File

@@ -362,7 +362,7 @@
],
"source": [
"llm = OpenAI()\n",
"llm(query)"
"llm.invoke(query)"
]
},
{

View File

@@ -108,7 +108,7 @@
" return obs_message\n",
"\n",
" def _act(self):\n",
" act_message = self.model(self.message_history)\n",
" act_message = self.model.invoke(self.message_history)\n",
" self.message_history.append(act_message)\n",
" action = int(self.action_parser.parse(act_message.content)[\"action\"])\n",
" return action\n",

View File

@@ -170,7 +170,7 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"\n",
"with open(\"../../state_of_the_union.txt\") as f:\n",

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

View File

@@ -52,7 +52,7 @@
"\n",
"bash_chain = LLMBashChain.from_llm(llm, verbose=True)\n",
"\n",
"bash_chain.run(text)"
"bash_chain.invoke(text)"
]
},
{
@@ -135,7 +135,7 @@
"\n",
"text = \"Please write a bash script that prints 'Hello World' to the console.\"\n",
"\n",
"bash_chain.run(text)"
"bash_chain.invoke(text)"
]
},
{
@@ -190,7 +190,7 @@
"\n",
"text = \"List the current directory then move up a level.\"\n",
"\n",
"bash_chain.run(text)"
"bash_chain.invoke(text)"
]
},
{
@@ -231,7 +231,7 @@
],
"source": [
"# Run the same command again and see that the state is maintained between calls\n",
"bash_chain.run(text)"
"bash_chain.invoke(text)"
]
}
],

View File

@@ -45,7 +45,7 @@
}
],
"source": [
"llm_symbolic_math.run(\"What is the derivative of sin(x)*exp(x) with respect to x?\")"
"llm_symbolic_math.invoke(\"What is the derivative of sin(x)*exp(x) with respect to x?\")"
]
},
{
@@ -65,7 +65,7 @@
}
],
"source": [
"llm_symbolic_math.run(\n",
"llm_symbolic_math.invoke(\n",
" \"What is the integral of exp(x)*sin(x) + exp(x)*cos(x) with respect to x?\"\n",
")"
]
@@ -94,7 +94,7 @@
}
],
"source": [
"llm_symbolic_math.run('Solve the differential equation y\" - y = e^t')"
"llm_symbolic_math.invoke('Solve the differential equation y\" - y = e^t')"
]
},
{
@@ -114,7 +114,7 @@
}
],
"source": [
"llm_symbolic_math.run(\"What are the solutions to this equation y^3 + 1/3y?\")"
"llm_symbolic_math.invoke(\"What are the solutions to this equation y^3 + 1/3y?\")"
]
},
{
@@ -134,7 +134,7 @@
}
],
"source": [
"llm_symbolic_math.run(\"x = y + 5, y = z - 3, z = x * y. Solve for x, y, z\")"
"llm_symbolic_math.invoke(\"x = y + 5, y = z - 3, z = x * y. Solve for x, y, z\")"
]
}
],

View File

@@ -0,0 +1,818 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "70b333e6",
"metadata": {},
"source": [
"[![View Article](https://img.shields.io/badge/View%20Article-blue)](https://www.mongodb.com/developer/products/atlas/advanced-rag-langchain-mongodb/)\n"
]
},
{
"cell_type": "markdown",
"id": "d84a72ea",
"metadata": {},
"source": [
"# Adding Semantic Caching and Memory to your RAG Application using MongoDB and LangChain\n",
"\n",
"In this notebook, we will see how to use the new MongoDBCache and MongoDBChatMessageHistory in your RAG application.\n"
]
},
{
"cell_type": "markdown",
"id": "65527202",
"metadata": {},
"source": [
"## Step 1: Install required libraries\n",
"\n",
"- **datasets**: Python library to get access to datasets available on Hugging Face Hub\n",
"\n",
"- **langchain**: Python toolkit for LangChain\n",
"\n",
"- **langchain-mongodb**: Python package to use MongoDB as a vector store, semantic cache, chat history store etc. in LangChain\n",
"\n",
"- **langchain-openai**: Python package to use OpenAI models with LangChain\n",
"\n",
"- **pymongo**: Python toolkit for MongoDB\n",
"\n",
"- **pandas**: Python library for data analysis, exploration, and manipulation"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "cbc22fa4",
"metadata": {},
"outputs": [],
"source": [
"! pip install -qU datasets langchain langchain-mongodb langchain-openai pymongo pandas"
]
},
{
"cell_type": "markdown",
"id": "39c41e87",
"metadata": {},
"source": [
"## Step 2: Setup pre-requisites\n",
"\n",
"* Set the MongoDB connection string. Follow the steps [here](https://www.mongodb.com/docs/manual/reference/connection-string/) to get the connection string from the Atlas UI.\n",
"\n",
"* Set the OpenAI API key. Steps to obtain an API key as [here](https://help.openai.com/en/articles/4936850-where-do-i-find-my-openai-api-key)"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "b56412ae",
"metadata": {},
"outputs": [],
"source": [
"import getpass"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "16a20d7a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Enter your MongoDB connection string:········\n"
]
}
],
"source": [
"MONGODB_URI = getpass.getpass(\"Enter your MongoDB connection string:\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "978682d4",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Enter your OpenAI API key:········\n"
]
}
],
"source": [
"OPENAI_API_KEY = getpass.getpass(\"Enter your OpenAI API key:\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "606081c5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"········\n"
]
}
],
"source": [
"# Optional-- If you want to enable Langsmith -- good for debugging\n",
"import os\n",
"\n",
"os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"os.environ[\"LANGCHAIN_API_KEY\"] = getpass.getpass()"
]
},
{
"cell_type": "markdown",
"id": "f6b8302c",
"metadata": {},
"source": [
"## Step 3: Download the dataset\n",
"\n",
"We will be using MongoDB's [embedded_movies](https://huggingface.co/datasets/MongoDB/embedded_movies) dataset"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "1a3433a6",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from datasets import load_dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aee5311b",
"metadata": {},
"outputs": [],
"source": [
"# Ensure you have an HF_TOKEN in your development enviornment:\n",
"# access tokens can be created or copied from the Hugging Face platform (https://huggingface.co/docs/hub/en/security-tokens)\n",
"\n",
"# Load MongoDB's embedded_movies dataset from Hugging Face\n",
"# https://huggingface.co/datasets/MongoDB/airbnb_embeddings\n",
"\n",
"data = load_dataset(\"MongoDB/embedded_movies\")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "1d630a26",
"metadata": {},
"outputs": [],
"source": [
"df = pd.DataFrame(data[\"train\"])"
]
},
{
"cell_type": "markdown",
"id": "a1f94f43",
"metadata": {},
"source": [
"## Step 4: Data analysis\n",
"\n",
"Make sure length of the dataset is what we expect, drop Nones etc."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "b276df71",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>fullplot</th>\n",
" <th>type</th>\n",
" <th>plot_embedding</th>\n",
" <th>num_mflix_comments</th>\n",
" <th>runtime</th>\n",
" <th>writers</th>\n",
" <th>imdb</th>\n",
" <th>countries</th>\n",
" <th>rated</th>\n",
" <th>plot</th>\n",
" <th>title</th>\n",
" <th>languages</th>\n",
" <th>metacritic</th>\n",
" <th>directors</th>\n",
" <th>awards</th>\n",
" <th>genres</th>\n",
" <th>poster</th>\n",
" <th>cast</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Young Pauline is left a lot of money when her ...</td>\n",
" <td>movie</td>\n",
" <td>[0.00072939653, -0.026834568, 0.013515796, -0....</td>\n",
" <td>0</td>\n",
" <td>199.0</td>\n",
" <td>[Charles W. Goddard (screenplay), Basil Dickey...</td>\n",
" <td>{'id': 4465, 'rating': 7.6, 'votes': 744}</td>\n",
" <td>[USA]</td>\n",
" <td>None</td>\n",
" <td>Young Pauline is left a lot of money when her ...</td>\n",
" <td>The Perils of Pauline</td>\n",
" <td>[English]</td>\n",
" <td>NaN</td>\n",
" <td>[Louis J. Gasnier, Donald MacKenzie]</td>\n",
" <td>{'nominations': 0, 'text': '1 win.', 'wins': 1}</td>\n",
" <td>[Action]</td>\n",
" <td>https://m.media-amazon.com/images/M/MV5BMzgxOD...</td>\n",
" <td>[Pearl White, Crane Wilbur, Paul Panzer, Edwar...</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" fullplot type \\\n",
"0 Young Pauline is left a lot of money when her ... movie \n",
"\n",
" plot_embedding num_mflix_comments \\\n",
"0 [0.00072939653, -0.026834568, 0.013515796, -0.... 0 \n",
"\n",
" runtime writers \\\n",
"0 199.0 [Charles W. Goddard (screenplay), Basil Dickey... \n",
"\n",
" imdb countries rated \\\n",
"0 {'id': 4465, 'rating': 7.6, 'votes': 744} [USA] None \n",
"\n",
" plot title \\\n",
"0 Young Pauline is left a lot of money when her ... The Perils of Pauline \n",
"\n",
" languages metacritic directors \\\n",
"0 [English] NaN [Louis J. Gasnier, Donald MacKenzie] \n",
"\n",
" awards genres \\\n",
"0 {'nominations': 0, 'text': '1 win.', 'wins': 1} [Action] \n",
"\n",
" poster \\\n",
"0 https://m.media-amazon.com/images/M/MV5BMzgxOD... \n",
"\n",
" cast \n",
"0 [Pearl White, Crane Wilbur, Paul Panzer, Edwar... "
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Previewing the contents of the data\n",
"df.head(1)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "22ab375d",
"metadata": {},
"outputs": [],
"source": [
"# Only keep records where the fullplot field is not null\n",
"df = df[df[\"fullplot\"].notna()]"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "fceed99a",
"metadata": {},
"outputs": [],
"source": [
"# Renaming the embedding field to \"embedding\" -- required by LangChain\n",
"df.rename(columns={\"plot_embedding\": \"embedding\"}, inplace=True)"
]
},
{
"cell_type": "markdown",
"id": "aedec13a",
"metadata": {},
"source": [
"## Step 5: Create a simple RAG chain using MongoDB as the vector store"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "11d292f3",
"metadata": {},
"outputs": [],
"source": [
"from langchain_mongodb import MongoDBAtlasVectorSearch\n",
"from pymongo import MongoClient\n",
"\n",
"# Initialize MongoDB python client\n",
"client = MongoClient(MONGODB_URI, appname=\"devrel.content.python\")\n",
"\n",
"DB_NAME = \"langchain_chatbot\"\n",
"COLLECTION_NAME = \"data\"\n",
"ATLAS_VECTOR_SEARCH_INDEX_NAME = \"vector_index\"\n",
"collection = client[DB_NAME][COLLECTION_NAME]"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "d8292d53",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"DeleteResult({'n': 1000, 'electionId': ObjectId('7fffffff00000000000000f6'), 'opTime': {'ts': Timestamp(1710523288, 1033), 't': 246}, 'ok': 1.0, '$clusterTime': {'clusterTime': Timestamp(1710523288, 1042), 'signature': {'hash': b\"i\\xa8\\xe9'\\x1ed\\xf2u\\xf3L\\xff\\xb1\\xf5\\xbfA\\x90\\xabJ\\x12\\x83\", 'keyId': 7299545392000008318}}, 'operationTime': Timestamp(1710523288, 1033)}, acknowledged=True)"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Delete any existing records in the collection\n",
"collection.delete_many({})"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "36c68914",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Data ingestion into MongoDB completed\n"
]
}
],
"source": [
"# Data Ingestion\n",
"records = df.to_dict(\"records\")\n",
"collection.insert_many(records)\n",
"\n",
"print(\"Data ingestion into MongoDB completed\")"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "cbfca0b8",
"metadata": {},
"outputs": [],
"source": [
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"# Using the text-embedding-ada-002 since that's what was used to create embeddings in the movies dataset\n",
"embeddings = OpenAIEmbeddings(\n",
" openai_api_key=OPENAI_API_KEY, model=\"text-embedding-ada-002\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "798e176c",
"metadata": {},
"outputs": [],
"source": [
"# Vector Store Creation\n",
"vector_store = MongoDBAtlasVectorSearch.from_connection_string(\n",
" connection_string=MONGODB_URI,\n",
" namespace=DB_NAME + \".\" + COLLECTION_NAME,\n",
" embedding=embeddings,\n",
" index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,\n",
" text_key=\"fullplot\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "c71cd087",
"metadata": {},
"outputs": [],
"source": [
"# Using the MongoDB vector store as a retriever in a RAG chain\n",
"retriever = vector_store.as_retriever(search_type=\"similarity\", search_kwargs={\"k\": 5})"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "b6588cd3",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"# Generate context using the retriever, and pass the user question through\n",
"retrieve = {\n",
" \"context\": retriever | (lambda docs: \"\\n\\n\".join([d.page_content for d in docs])),\n",
" \"question\": RunnablePassthrough(),\n",
"}\n",
"template = \"\"\"Answer the question based only on the following context: \\\n",
"{context}\n",
"\n",
"Question: {question}\n",
"\"\"\"\n",
"# Defining the chat prompt\n",
"prompt = ChatPromptTemplate.from_template(template)\n",
"# Defining the model to be used for chat completion\n",
"model = ChatOpenAI(temperature=0, openai_api_key=OPENAI_API_KEY)\n",
"# Parse output as a string\n",
"parse_output = StrOutputParser()\n",
"\n",
"# Naive RAG chain\n",
"naive_rag_chain = retrieve | prompt | model | parse_output"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "aaae21f5",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Once a Thief'"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"naive_rag_chain.invoke(\"What is the best movie to watch when sad?\")"
]
},
{
"cell_type": "markdown",
"id": "75f929ef",
"metadata": {},
"source": [
"## Step 6: Create a RAG chain with chat history"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "94e7bd4a",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.prompts import MessagesPlaceholder\n",
"from langchain_core.runnables.history import RunnableWithMessageHistory\n",
"from langchain_mongodb.chat_message_histories import MongoDBChatMessageHistory"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "5bb30860",
"metadata": {},
"outputs": [],
"source": [
"def get_session_history(session_id: str) -> MongoDBChatMessageHistory:\n",
" return MongoDBChatMessageHistory(\n",
" MONGODB_URI, session_id, database_name=DB_NAME, collection_name=\"history\"\n",
" )"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "f51d0f35",
"metadata": {},
"outputs": [],
"source": [
"# Given a follow-up question and history, create a standalone question\n",
"standalone_system_prompt = \"\"\"\n",
"Given a chat history and a follow-up question, rephrase the follow-up question to be a standalone question. \\\n",
"Do NOT answer the question, just reformulate it if needed, otherwise return it as is. \\\n",
"Only return the final standalone question. \\\n",
"\"\"\"\n",
"standalone_question_prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", standalone_system_prompt),\n",
" MessagesPlaceholder(variable_name=\"history\"),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")\n",
"\n",
"question_chain = standalone_question_prompt | model | parse_output"
]
},
{
"cell_type": "code",
"execution_count": 51,
"id": "f3ef3354",
"metadata": {},
"outputs": [],
"source": [
"# Generate context by passing output of the question_chain i.e. the standalone question to the retriever\n",
"retriever_chain = RunnablePassthrough.assign(\n",
" context=question_chain\n",
" | retriever\n",
" | (lambda docs: \"\\n\\n\".join([d.page_content for d in docs]))\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 55,
"id": "5afb7345",
"metadata": {},
"outputs": [],
"source": [
"# Create a prompt that includes the context, history and the follow-up question\n",
"rag_system_prompt = \"\"\"Answer the question based only on the following context: \\\n",
"{context}\n",
"\"\"\"\n",
"rag_prompt = ChatPromptTemplate.from_messages(\n",
" [\n",
" (\"system\", rag_system_prompt),\n",
" MessagesPlaceholder(variable_name=\"history\"),\n",
" (\"human\", \"{question}\"),\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "f95f47d0",
"metadata": {},
"outputs": [],
"source": [
"# RAG chain\n",
"rag_chain = retriever_chain | rag_prompt | model | parse_output"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "9618d395",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'The best movie to watch when feeling down could be \"Last Action Hero.\" It\\'s a fun and action-packed film that blends reality and fantasy, offering an escape from the real world and providing an entertaining distraction.'"
]
},
"execution_count": 57,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# RAG chain with history\n",
"with_message_history = RunnableWithMessageHistory(\n",
" rag_chain,\n",
" get_session_history,\n",
" input_messages_key=\"question\",\n",
" history_messages_key=\"history\",\n",
")\n",
"with_message_history.invoke(\n",
" {\"question\": \"What is the best movie to watch when sad?\"},\n",
" {\"configurable\": {\"session_id\": \"1\"}},\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 58,
"id": "6e3080d1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'I apologize for the confusion. Another movie that might lift your spirits when you\\'re feeling sad is \"Smilla\\'s Sense of Snow.\" It\\'s a mystery thriller that could engage your mind and distract you from your sadness with its intriguing plot and suspenseful storyline.'"
]
},
"execution_count": 58,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with_message_history.invoke(\n",
" {\n",
" \"question\": \"Hmmm..I don't want to watch that one. Can you suggest something else?\"\n",
" },\n",
" {\"configurable\": {\"session_id\": \"1\"}},\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 59,
"id": "daea2953",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'For a lighter movie option, you might enjoy \"Cousins.\" It\\'s a comedy film set in Barcelona with action and humor, offering a fun and entertaining escape from reality. The storyline is engaging and filled with comedic moments that could help lift your spirits.'"
]
},
"execution_count": 59,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with_message_history.invoke(\n",
" {\"question\": \"How about something more light?\"},\n",
" {\"configurable\": {\"session_id\": \"1\"}},\n",
")"
]
},
{
"cell_type": "markdown",
"id": "0de23a88",
"metadata": {},
"source": [
"## Step 7: Get faster responses using Semantic Cache\n",
"\n",
"**NOTE:** Semantic cache only caches the input to the LLM. When using it in retrieval chains, remember that documents retrieved can change between runs resulting in cache misses for semantically similar queries."
]
},
{
"cell_type": "code",
"execution_count": 61,
"id": "5d6b6741",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.globals import set_llm_cache\n",
"from langchain_mongodb.cache import MongoDBAtlasSemanticCache\n",
"\n",
"set_llm_cache(\n",
" MongoDBAtlasSemanticCache(\n",
" connection_string=MONGODB_URI,\n",
" embedding=embeddings,\n",
" collection_name=\"semantic_cache\",\n",
" database_name=DB_NAME,\n",
" index_name=ATLAS_VECTOR_SEARCH_INDEX_NAME,\n",
" wait_until_ready=True, # Optional, waits until the cache is ready to be used\n",
" )\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 62,
"id": "9825bc7b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 87.8 ms, sys: 670 µs, total: 88.5 ms\n",
"Wall time: 1.24 s\n"
]
},
{
"data": {
"text/plain": [
"'Once a Thief'"
]
},
"execution_count": 62,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"naive_rag_chain.invoke(\"What is the best movie to watch when sad?\")"
]
},
{
"cell_type": "code",
"execution_count": 63,
"id": "a5e518cf",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 43.5 ms, sys: 4.16 ms, total: 47.7 ms\n",
"Wall time: 255 ms\n"
]
},
{
"data": {
"text/plain": [
"'Once a Thief'"
]
},
"execution_count": 63,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"naive_rag_chain.invoke(\"What is the best movie to watch when sad?\")"
]
},
{
"cell_type": "code",
"execution_count": 64,
"id": "3d3d3ad3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 115 ms, sys: 171 µs, total: 115 ms\n",
"Wall time: 1.38 s\n"
]
},
{
"data": {
"text/plain": [
"'I would recommend watching \"Last Action Hero\" when sad, as it is a fun and action-packed film that can help lift your spirits.'"
]
},
"execution_count": 64,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%time\n",
"naive_rag_chain.invoke(\"Which movie do I watch when sad?\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "conda_pytorch_p310",
"language": "python",
"name": "conda_pytorch_p310"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -58,7 +58,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain openai chromadb langchain-experimental # (newest versions required for multi-modal)"
"! pip install -U langchain openai langchain-chroma langchain-experimental # (newest versions required for multi-modal)"
]
},
{
@@ -187,7 +187,7 @@
"\n",
"import chromadb\n",
"import numpy as np\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_experimental.open_clip import OpenCLIPEmbeddings\n",
"from PIL import Image as _PILImage\n",
"\n",
@@ -435,7 +435,7 @@
" display(HTML(image_html))\n",
"\n",
"\n",
"docs = retriever.get_relevant_documents(\"Woman with children\", k=10)\n",
"docs = retriever.invoke(\"Woman with children\", k=10)\n",
"for doc in docs:\n",
" if is_base64(doc.page_content):\n",
" plt_img_base64(doc.page_content)\n",

File diff suppressed because one or more lines are too long

View File

@@ -74,7 +74,7 @@
" Applies the chatmodel to the message history\n",
" and returns the message string\n",
" \"\"\"\n",
" message = self.model(\n",
" message = self.model.invoke(\n",
" [\n",
" self.system_message,\n",
" HumanMessage(content=\"\\n\".join(self.message_history + [self.prefix])),\n",

View File

@@ -79,7 +79,7 @@
" Applies the chatmodel to the message history\n",
" and returns the message string\n",
" \"\"\"\n",
" message = self.model(\n",
" message = self.model.invoke(\n",
" [\n",
" self.system_message,\n",
" HumanMessage(content=\"\\n\".join(self.message_history + [self.prefix])),\n",
@@ -234,7 +234,7 @@
" termination_clause=self.termination_clause if self.stop else \"\",\n",
" )\n",
"\n",
" self.response = self.model(\n",
" self.response = self.model.invoke(\n",
" [\n",
" self.system_message,\n",
" HumanMessage(content=response_prompt),\n",
@@ -263,7 +263,7 @@
" speaker_names=speaker_names,\n",
" )\n",
"\n",
" choice_string = self.model(\n",
" choice_string = self.model.invoke(\n",
" [\n",
" self.system_message,\n",
" HumanMessage(content=choice_prompt),\n",
@@ -299,7 +299,7 @@
" ),\n",
" next_speaker=self.next_speaker,\n",
" )\n",
" message = self.model(\n",
" message = self.model.invoke(\n",
" [\n",
" self.system_message,\n",
" HumanMessage(content=next_prompt),\n",

View File

@@ -71,7 +71,7 @@
" Applies the chatmodel to the message history\n",
" and returns the message string\n",
" \"\"\"\n",
" message = self.model(\n",
" message = self.model.invoke(\n",
" [\n",
" self.system_message,\n",
" HumanMessage(content=\"\\n\".join(self.message_history + [self.prefix])),\n",
@@ -164,7 +164,7 @@
" message_history=\"\\n\".join(self.message_history),\n",
" recent_message=self.message_history[-1],\n",
" )\n",
" bid_string = self.model([SystemMessage(content=prompt)]).content\n",
" bid_string = self.model.invoke([SystemMessage(content=prompt)]).content\n",
" return bid_string"
]
},

View File

@@ -58,7 +58,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install -U langchain-nomic langchain_community tiktoken langchain-openai chromadb langchain"
"! pip install -U langchain-nomic langchain-chroma langchain-community tiktoken langchain-openai langchain"
]
},
{
@@ -167,7 +167,7 @@
"source": [
"import os\n",
"\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnableLambda, RunnablePassthrough\n",
"from langchain_nomic import NomicEmbeddings\n",

View File

@@ -0,0 +1,497 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "9fc3897d-176f-4729-8fd1-cfb4add53abd",
"metadata": {},
"source": [
"## Nomic multi-modal RAG\n",
"\n",
"Many documents contain a mixture of content types, including text and images. \n",
"\n",
"Yet, information captured in images is lost in most RAG applications.\n",
"\n",
"With the emergence of multimodal LLMs, like [GPT-4V](https://openai.com/research/gpt-4v-system-card), it is worth considering how to utilize images in RAG:\n",
"\n",
"In this demo we\n",
"\n",
"* Use multimodal embeddings from Nomic Embed [Vision](https://huggingface.co/nomic-ai/nomic-embed-vision-v1.5) and [Text](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) to embed images and text\n",
"* Retrieve both using similarity search\n",
"* Pass raw images and text chunks to a multimodal LLM for answer synthesis \n",
"\n",
"## Signup\n",
"\n",
"Get your API token, then run:\n",
"```\n",
"! nomic login\n",
"```\n",
"\n",
"Then run with your generated API token \n",
"```\n",
"! nomic login < token > \n",
"```\n",
"\n",
"## Packages\n",
"\n",
"For `unstructured`, you will also need `poppler` ([installation instructions](https://pdf2image.readthedocs.io/en/latest/installation.html)) and `tesseract` ([installation instructions](https://tesseract-ocr.github.io/tessdoc/Installation.html)) in your system."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "54926b9b-75c2-4cd4-8f14-b3882a0d370b",
"metadata": {},
"outputs": [],
"source": [
"! nomic login token"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "febbc459-ebba-4c1a-a52b-fed7731593f8",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"! pip install -U langchain-nomic langchain-chroma langchain-community tiktoken langchain-openai langchain # (newest versions required for multi-modal)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "acbdc603-39e2-4a5f-836c-2bbaecd46b0b",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# lock to 0.10.19 due to a persistent bug in more recent versions\n",
"! pip install \"unstructured[all-docs]==0.10.19\" pillow pydantic lxml pillow matplotlib tiktoken"
]
},
{
"cell_type": "markdown",
"id": "1e94b3fb-8e3e-4736-be0a-ad881626c7bd",
"metadata": {},
"source": [
"## Data Loading\n",
"\n",
"### Partition PDF text and images\n",
" \n",
"Let's look at an example pdfs containing interesting images.\n",
"\n",
"1/ Art from the J Paul Getty museum:\n",
"\n",
" * Here is a [zip file](https://drive.google.com/file/d/18kRKbq2dqAhhJ3DfZRnYcTBEUfYxe1YR/view?usp=sharing) with the PDF and the already extracted images. \n",
"* https://www.getty.edu/publications/resources/virtuallibrary/0892360224.pdf\n",
"\n",
"2/ Famous photographs from library of congress:\n",
"\n",
"* https://www.loc.gov/lcm/pdf/LCM_2020_1112.pdf\n",
"* We'll use this as an example below\n",
"\n",
"We can use `partition_pdf` below from [Unstructured](https://unstructured-io.github.io/unstructured/introduction.html#key-concepts) to extract text and images.\n",
"\n",
"To supply this to extract the images:\n",
"```\n",
"extract_images_in_pdf=True\n",
"```\n",
"\n",
"\n",
"\n",
"If using this zip file, then you can simply process the text only with:\n",
"```\n",
"extract_images_in_pdf=False\n",
"```"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9646b524-71a7-4b2a-bdc8-0b81f77e968f",
"metadata": {},
"outputs": [],
"source": [
"# Folder with pdf and extracted images\n",
"from pathlib import Path\n",
"\n",
"# replace with actual path to images\n",
"path = Path(\"../art\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "77f096ab-a933-41d0-8f4e-1efc83998fc3",
"metadata": {},
"outputs": [],
"source": [
"path.resolve()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bc4839c0-8773-4a07-ba59-5364501269b2",
"metadata": {},
"outputs": [],
"source": [
"# Extract images, tables, and chunk text\n",
"from unstructured.partition.pdf import partition_pdf\n",
"\n",
"raw_pdf_elements = partition_pdf(\n",
" filename=str(path.resolve()) + \"/getty.pdf\",\n",
" extract_images_in_pdf=False,\n",
" infer_table_structure=True,\n",
" chunking_strategy=\"by_title\",\n",
" max_characters=4000,\n",
" new_after_n_chars=3800,\n",
" combine_text_under_n_chars=2000,\n",
" image_output_dir_path=path,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "969545ad",
"metadata": {},
"outputs": [],
"source": [
"# Categorize text elements by type\n",
"tables = []\n",
"texts = []\n",
"for element in raw_pdf_elements:\n",
" if \"unstructured.documents.elements.Table\" in str(type(element)):\n",
" tables.append(str(element))\n",
" elif \"unstructured.documents.elements.CompositeElement\" in str(type(element)):\n",
" texts.append(str(element))"
]
},
{
"cell_type": "markdown",
"id": "5d8e6349-1547-4cbf-9c6f-491d8610ec10",
"metadata": {},
"source": [
"## Multi-modal embeddings with our document\n",
"\n",
"We will use [nomic-embed-vision-v1.5](https://huggingface.co/nomic-ai/nomic-embed-vision-v1.5) embeddings. This model is aligned \n",
"to [nomic-embed-text-v1.5](https://huggingface.co/nomic-ai/nomic-embed-text-v1.5) allowing for multimodal semantic search and Multimodal RAG!"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4bc15842-cb95-4f84-9eb5-656b0282a800",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import uuid\n",
"\n",
"import chromadb\n",
"import numpy as np\n",
"from langchain_chroma import Chroma\n",
"from langchain_nomic import NomicEmbeddings\n",
"from PIL import Image as _PILImage\n",
"\n",
"# Create chroma\n",
"text_vectorstore = Chroma(\n",
" collection_name=\"mm_rag_clip_photos_text\",\n",
" embedding_function=NomicEmbeddings(\n",
" vision_model=\"nomic-embed-vision-v1.5\", model=\"nomic-embed-text-v1.5\"\n",
" ),\n",
")\n",
"image_vectorstore = Chroma(\n",
" collection_name=\"mm_rag_clip_photos_image\",\n",
" embedding_function=NomicEmbeddings(\n",
" vision_model=\"nomic-embed-vision-v1.5\", model=\"nomic-embed-text-v1.5\"\n",
" ),\n",
")\n",
"\n",
"# Get image URIs with .jpg extension only\n",
"image_uris = sorted(\n",
" [\n",
" os.path.join(path, image_name)\n",
" for image_name in os.listdir(path)\n",
" if image_name.endswith(\".jpg\")\n",
" ]\n",
")\n",
"\n",
"# Add images\n",
"image_vectorstore.add_images(uris=image_uris)\n",
"\n",
"# Add documents\n",
"text_vectorstore.add_texts(texts=texts)\n",
"\n",
"# Make retriever\n",
"image_retriever = image_vectorstore.as_retriever()\n",
"text_retriever = text_vectorstore.as_retriever()"
]
},
{
"cell_type": "markdown",
"id": "02a186d0-27e0-4820-8092-63b5349dd25d",
"metadata": {},
"source": [
"## RAG\n",
"\n",
"`vectorstore.add_images` will store / retrieve images as base64 encoded strings.\n",
"\n",
"These can be passed to [GPT-4V](https://platform.openai.com/docs/guides/vision)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "344f56a8-0dc3-433e-851c-3f7600c7a72b",
"metadata": {},
"outputs": [],
"source": [
"import base64\n",
"import io\n",
"from io import BytesIO\n",
"\n",
"import numpy as np\n",
"from PIL import Image\n",
"\n",
"\n",
"def resize_base64_image(base64_string, size=(128, 128)):\n",
" \"\"\"\n",
" Resize an image encoded as a Base64 string.\n",
"\n",
" Args:\n",
" base64_string (str): Base64 string of the original image.\n",
" size (tuple): Desired size of the image as (width, height).\n",
"\n",
" Returns:\n",
" str: Base64 string of the resized image.\n",
" \"\"\"\n",
" # Decode the Base64 string\n",
" img_data = base64.b64decode(base64_string)\n",
" img = Image.open(io.BytesIO(img_data))\n",
"\n",
" # Resize the image\n",
" resized_img = img.resize(size, Image.LANCZOS)\n",
"\n",
" # Save the resized image to a bytes buffer\n",
" buffered = io.BytesIO()\n",
" resized_img.save(buffered, format=img.format)\n",
"\n",
" # Encode the resized image to Base64\n",
" return base64.b64encode(buffered.getvalue()).decode(\"utf-8\")\n",
"\n",
"\n",
"def is_base64(s):\n",
" \"\"\"Check if a string is Base64 encoded\"\"\"\n",
" try:\n",
" return base64.b64encode(base64.b64decode(s)) == s.encode()\n",
" except Exception:\n",
" return False\n",
"\n",
"\n",
"def split_image_text_types(docs):\n",
" \"\"\"Split numpy array images and texts\"\"\"\n",
" images = []\n",
" text = []\n",
" for doc in docs:\n",
" doc = doc.page_content # Extract Document contents\n",
" if is_base64(doc):\n",
" # Resize image to avoid OAI server error\n",
" images.append(\n",
" resize_base64_image(doc, size=(250, 250))\n",
" ) # base64 encoded str\n",
" else:\n",
" text.append(doc)\n",
" return {\"images\": images, \"texts\": text}"
]
},
{
"cell_type": "markdown",
"id": "23a2c1d8-fea6-4152-b184-3172dd46c735",
"metadata": {},
"source": [
"Currently, we format the inputs using a `RunnableLambda` while we add image support to `ChatPromptTemplates`.\n",
"\n",
"Our runnable follows the classic RAG flow - \n",
"\n",
"* We first compute the context (both \"texts\" and \"images\" in this case) and the question (just a RunnablePassthrough here) \n",
"* Then we pass this into our prompt template, which is a custom function that formats the message for the gpt-4-vision-preview model. \n",
"* And finally we parse the output as a string."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5d8919dc-c238-4746-86ba-45d940a7d260",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4c93fab3-74c4-4f1d-958a-0bc4cdd0797e",
"metadata": {},
"outputs": [],
"source": [
"from operator import itemgetter\n",
"\n",
"from langchain_core.messages import HumanMessage, SystemMessage\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnableLambda, RunnablePassthrough\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"\n",
"def prompt_func(data_dict):\n",
" # Joining the context texts into a single string\n",
" formatted_texts = \"\\n\".join(data_dict[\"text_context\"][\"texts\"])\n",
" messages = []\n",
"\n",
" # Adding image(s) to the messages if present\n",
" if data_dict[\"image_context\"][\"images\"]:\n",
" image_message = {\n",
" \"type\": \"image_url\",\n",
" \"image_url\": {\n",
" \"url\": f\"data:image/jpeg;base64,{data_dict['image_context']['images'][0]}\"\n",
" },\n",
" }\n",
" messages.append(image_message)\n",
"\n",
" # Adding the text message for analysis\n",
" text_message = {\n",
" \"type\": \"text\",\n",
" \"text\": (\n",
" \"As an expert art critic and historian, your task is to analyze and interpret images, \"\n",
" \"considering their historical and cultural significance. Alongside the images, you will be \"\n",
" \"provided with related text to offer context. Both will be retrieved from a vectorstore based \"\n",
" \"on user-input keywords. Please use your extensive knowledge and analytical skills to provide a \"\n",
" \"comprehensive summary that includes:\\n\"\n",
" \"- A detailed description of the visual elements in the image.\\n\"\n",
" \"- The historical and cultural context of the image.\\n\"\n",
" \"- An interpretation of the image's symbolism and meaning.\\n\"\n",
" \"- Connections between the image and the related text.\\n\\n\"\n",
" f\"User-provided keywords: {data_dict['question']}\\n\\n\"\n",
" \"Text and / or tables:\\n\"\n",
" f\"{formatted_texts}\"\n",
" ),\n",
" }\n",
" messages.append(text_message)\n",
"\n",
" return [HumanMessage(content=messages)]\n",
"\n",
"\n",
"model = ChatOpenAI(temperature=0, model=\"gpt-4-vision-preview\", max_tokens=1024)\n",
"\n",
"# RAG pipeline\n",
"chain = (\n",
" {\n",
" \"text_context\": text_retriever | RunnableLambda(split_image_text_types),\n",
" \"image_context\": image_retriever | RunnableLambda(split_image_text_types),\n",
" \"question\": RunnablePassthrough(),\n",
" }\n",
" | RunnableLambda(prompt_func)\n",
" | model\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "1566096d-97c2-4ddc-ba4a-6ef88c525e4e",
"metadata": {},
"source": [
"## Test retrieval and run RAG"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "90121e56-674b-473b-871d-6e4753fd0c45",
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import HTML, display\n",
"\n",
"\n",
"def plt_img_base64(img_base64):\n",
" # Create an HTML img tag with the base64 string as the source\n",
" image_html = f'<img src=\"data:image/jpeg;base64,{img_base64}\" />'\n",
"\n",
" # Display the image by rendering the HTML\n",
" display(HTML(image_html))\n",
"\n",
"\n",
"docs = text_retriever.invoke(\"Women with children\", k=5)\n",
"for doc in docs:\n",
" if is_base64(doc.page_content):\n",
" plt_img_base64(doc.page_content)\n",
" else:\n",
" print(doc.page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "44eaa532-f035-4c04-b578-02339d42554c",
"metadata": {},
"outputs": [],
"source": [
"docs = image_retriever.invoke(\"Women with children\", k=5)\n",
"for doc in docs:\n",
" if is_base64(doc.page_content):\n",
" plt_img_base64(doc.page_content)\n",
" else:\n",
" print(doc.page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "69fb15fd-76fc-49b4-806d-c4db2990027d",
"metadata": {},
"outputs": [],
"source": [
"chain.invoke(\"Women with children\")"
]
},
{
"cell_type": "markdown",
"id": "227f08b8-e732-4089-b65c-6eb6f9e48f15",
"metadata": {},
"source": [
"We can see the images retrieved in the LangSmith trace:\n",
"\n",
"LangSmith [trace](https://smith.langchain.com/public/69c558a5-49dc-4c60-a49b-3adbb70f74c5/r/e872c2c8-528c-468f-aefd-8b5cd730a673)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -20,8 +20,8 @@
"outputs": [],
"source": [
"from langchain.chains import RetrievalQA\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.document_loaders import TextLoader\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"from langchain_text_splitters import CharacterTextSplitter"
]

View File

@@ -80,7 +80,7 @@
"outputs": [],
"source": [
"from langchain.schema import Document\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_chroma import Chroma\n",
"from langchain_openai import OpenAIEmbeddings\n",
"\n",
"embeddings = OpenAIEmbeddings()"

View File

@@ -0,0 +1,880 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Oracle AI Vector Search with Document Processing\n",
"Oracle AI Vector Search is designed for Artificial Intelligence (AI) workloads that allows you to query data based on semantics, rather than keywords.\n",
"One of the biggest benefits of Oracle AI Vector Search is that semantic search on unstructured data can be combined with relational search on business data in one single system.\n",
"This is not only powerful but also significantly more effective because you don't need to add a specialized vector database, eliminating the pain of data fragmentation between multiple systems.\n",
"\n",
"In addition, your vectors can benefit from all of Oracle Databases most powerful features, like the following:\n",
"\n",
" * [Partitioning Support](https://www.oracle.com/database/technologies/partitioning.html)\n",
" * [Real Application Clusters scalability](https://www.oracle.com/database/real-application-clusters/)\n",
" * [Exadata smart scans](https://www.oracle.com/database/technologies/exadata/software/smartscan/)\n",
" * [Shard processing across geographically distributed databases](https://www.oracle.com/database/distributed-database/)\n",
" * [Transactions](https://docs.oracle.com/en/database/oracle/oracle-database/23/cncpt/transactions.html)\n",
" * [Parallel SQL](https://docs.oracle.com/en/database/oracle/oracle-database/21/vldbg/parallel-exec-intro.html#GUID-D28717E4-0F77-44F5-BB4E-234C31D4E4BA)\n",
" * [Disaster recovery](https://www.oracle.com/database/data-guard/)\n",
" * [Security](https://www.oracle.com/security/database-security/)\n",
" * [Oracle Machine Learning](https://www.oracle.com/artificial-intelligence/database-machine-learning/)\n",
" * [Oracle Graph Database](https://www.oracle.com/database/integrated-graph-database/)\n",
" * [Oracle Spatial and Graph](https://www.oracle.com/database/spatial/)\n",
" * [Oracle Blockchain](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_blockchain_table.html#GUID-B469E277-978E-4378-A8C1-26D3FF96C9A6)\n",
" * [JSON](https://docs.oracle.com/en/database/oracle/oracle-database/23/adjsn/json-in-oracle-database.html)\n",
"\n",
"This guide demonstrates how Oracle AI Vector Search can be used with Langchain to serve an end-to-end RAG pipeline. This guide goes through examples of:\n",
"\n",
" * Loading the documents from various sources using OracleDocLoader\n",
" * Summarizing them within/outside the database using OracleSummary\n",
" * Generating embeddings for them within/outside the database using OracleEmbeddings\n",
" * Chunking them according to different requirements using Advanced Oracle Capabilities from OracleTextSplitter\n",
" * Storing and Indexing them in a Vector Store and querying them for queries in OracleVS"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If you are just starting with Oracle Database, consider exploring the [free Oracle 23 AI](https://www.oracle.com/database/free/#resources) which provides a great introduction to setting up your database environment. While working with the database, it is often advisable to avoid using the system user by default; instead, you can create your own user for enhanced security and customization. For detailed steps on user creation, refer to our [end-to-end guide](https://github.com/langchain-ai/langchain/blob/master/cookbook/oracleai_demo.ipynb) which also shows how to set up a user in Oracle. Additionally, understanding user privileges is crucial for managing database security effectively. You can learn more about this topic in the official [Oracle guide](https://docs.oracle.com/en/database/oracle/oracle-database/19/admqs/administering-user-accounts-and-security.html#GUID-36B21D72-1BBB-46C9-A0C9-F0D2A8591B8D) on administering user accounts and security."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Prerequisites\n",
"\n",
"Please install Oracle Python Client driver to use Langchain with Oracle AI Vector Search. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# pip install oracledb"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Demo User\n",
"First, create a demo user with all the required privileges. "
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Connection successful!\n",
"User setup done!\n"
]
}
],
"source": [
"import sys\n",
"\n",
"import oracledb\n",
"\n",
"# Update with your username, password, hostname, and service_name\n",
"username = \"\"\n",
"password = \"\"\n",
"dsn = \"\"\n",
"\n",
"try:\n",
" conn = oracledb.connect(user=username, password=password, dsn=dsn)\n",
" print(\"Connection successful!\")\n",
"\n",
" cursor = conn.cursor()\n",
" try:\n",
" cursor.execute(\n",
" \"\"\"\n",
" begin\n",
" -- Drop user\n",
" begin\n",
" execute immediate 'drop user testuser cascade';\n",
" exception\n",
" when others then\n",
" dbms_output.put_line('Error dropping user: ' || SQLERRM);\n",
" end;\n",
" \n",
" -- Create user and grant privileges\n",
" execute immediate 'create user testuser identified by testuser';\n",
" execute immediate 'grant connect, unlimited tablespace, create credential, create procedure, create any index to testuser';\n",
" execute immediate 'create or replace directory DEMO_PY_DIR as ''/scratch/hroy/view_storage/hroy_devstorage/demo/orachain''';\n",
" execute immediate 'grant read, write on directory DEMO_PY_DIR to public';\n",
" execute immediate 'grant create mining model to testuser';\n",
" \n",
" -- Network access\n",
" begin\n",
" DBMS_NETWORK_ACL_ADMIN.APPEND_HOST_ACE(\n",
" host => '*',\n",
" ace => xs$ace_type(privilege_list => xs$name_list('connect'),\n",
" principal_name => 'testuser',\n",
" principal_type => xs_acl.ptype_db)\n",
" );\n",
" end;\n",
" end;\n",
" \"\"\"\n",
" )\n",
" print(\"User setup done!\")\n",
" except Exception as e:\n",
" print(f\"User setup failed with error: {e}\")\n",
" finally:\n",
" cursor.close()\n",
" conn.close()\n",
"except Exception as e:\n",
" print(f\"Connection failed with error: {e}\")\n",
" sys.exit(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Process Documents using Oracle AI\n",
"Consider the following scenario: users possess documents stored either in an Oracle Database or a file system and intend to utilize this data with Oracle AI Vector Search powered by Langchain.\n",
"\n",
"To prepare the documents for analysis, a comprehensive preprocessing workflow is necessary. Initially, the documents must be retrieved, summarized (if required), and chunked as needed. Subsequent steps involve generating embeddings for these chunks and integrating them into the Oracle AI Vector Store. Users can then conduct semantic searches on this data.\n",
"\n",
"The Oracle AI Vector Search Langchain library encompasses a suite of document processing tools that facilitate document loading, chunking, summary generation, and embedding creation.\n",
"\n",
"In the sections that follow, we will detail the utilization of Oracle AI Langchain APIs to effectively implement each of these processes."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Connect to Demo User\n",
"The following sample code will show how to connect to Oracle Database. By default, python-oracledb runs in a Thin mode which connects directly to Oracle Database. This mode does not need Oracle Client libraries. However, some additional functionality is available when python-oracledb uses them. Python-oracledb is said to be in Thick mode when Oracle Client libraries are used. Both modes have comprehensive functionality supporting the Python Database API v2.0 Specification. See the following [guide](https://python-oracledb.readthedocs.io/en/latest/user_guide/appendix_a.html#featuresummary) that talks about features supported in each mode. You might want to switch to thick-mode if you are unable to use thin-mode."
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Connection successful!\n"
]
}
],
"source": [
"import sys\n",
"\n",
"import oracledb\n",
"\n",
"# please update with your username, password, hostname and service_name\n",
"username = \"\"\n",
"password = \"\"\n",
"dsn = \"\"\n",
"\n",
"try:\n",
" conn = oracledb.connect(user=username, password=password, dsn=dsn)\n",
" print(\"Connection successful!\")\n",
"except Exception as e:\n",
" print(\"Connection failed!\")\n",
" sys.exit(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Populate a Demo Table\n",
"Create a demo table and insert some sample documents."
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Table created and populated.\n"
]
}
],
"source": [
"try:\n",
" cursor = conn.cursor()\n",
"\n",
" drop_table_sql = \"\"\"drop table demo_tab\"\"\"\n",
" cursor.execute(drop_table_sql)\n",
"\n",
" create_table_sql = \"\"\"create table demo_tab (id number, data clob)\"\"\"\n",
" cursor.execute(create_table_sql)\n",
"\n",
" insert_row_sql = \"\"\"insert into demo_tab values (:1, :2)\"\"\"\n",
" rows_to_insert = [\n",
" (\n",
" 1,\n",
" \"If the answer to any preceding questions is yes, then the database stops the search and allocates space from the specified tablespace; otherwise, space is allocated from the database default shared temporary tablespace.\",\n",
" ),\n",
" (\n",
" 2,\n",
" \"A tablespace can be online (accessible) or offline (not accessible) whenever the database is open.\\nA tablespace is usually online so that its data is available to users. The SYSTEM tablespace and temporary tablespaces cannot be taken offline.\",\n",
" ),\n",
" (\n",
" 3,\n",
" \"The database stores LOBs differently from other data types. Creating a LOB column implicitly creates a LOB segment and a LOB index. The tablespace containing the LOB segment and LOB index, which are always stored together, may be different from the tablespace containing the table.\\nSometimes the database can store small amounts of LOB data in the table itself rather than in a separate LOB segment.\",\n",
" ),\n",
" ]\n",
" cursor.executemany(insert_row_sql, rows_to_insert)\n",
"\n",
" conn.commit()\n",
"\n",
" print(\"Table created and populated.\")\n",
" cursor.close()\n",
"except Exception as e:\n",
" print(\"Table creation failed.\")\n",
" cursor.close()\n",
" conn.close()\n",
" sys.exit(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With the inclusion of a demo user and a populated sample table, the remaining configuration involves setting up embedding and summary functionalities. Users are presented with multiple provider options, including local database solutions and third-party services such as Ocigenai, Hugging Face, and OpenAI. Should users opt for a third-party provider, they are required to establish credentials containing the necessary authentication details. Conversely, if selecting a database as the provider for embeddings, it is necessary to upload an ONNX model to the Oracle Database. No additional setup is required for summary functionalities when using the database option."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load ONNX Model\n",
"\n",
"Oracle accommodates a variety of embedding providers, enabling users to choose between proprietary database solutions and third-party services such as OCIGENAI and HuggingFace. This selection dictates the methodology for generating and managing embeddings.\n",
"\n",
"***Important*** : Should users opt for the database option, they must upload an ONNX model into the Oracle Database. Conversely, if a third-party provider is selected for embedding generation, uploading an ONNX model to Oracle Database is not required.\n",
"\n",
"A significant advantage of utilizing an ONNX model directly within Oracle is the enhanced security and performance it offers by eliminating the need to transmit data to external parties. Additionally, this method avoids the latency typically associated with network or REST API calls.\n",
"\n",
"Below is the example code to upload an ONNX model into Oracle Database:"
]
},
{
"cell_type": "code",
"execution_count": 47,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ONNX model loaded.\n"
]
}
],
"source": [
"from langchain_community.embeddings.oracleai import OracleEmbeddings\n",
"\n",
"# please update with your related information\n",
"# make sure that you have onnx file in the system\n",
"onnx_dir = \"DEMO_PY_DIR\"\n",
"onnx_file = \"tinybert.onnx\"\n",
"model_name = \"demo_model\"\n",
"\n",
"try:\n",
" OracleEmbeddings.load_onnx_model(conn, onnx_dir, onnx_file, model_name)\n",
" print(\"ONNX model loaded.\")\n",
"except Exception as e:\n",
" print(\"ONNX model loading failed!\")\n",
" sys.exit(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Credential\n",
"\n",
"When selecting third-party providers for generating embeddings, users are required to establish credentials to securely access the provider's endpoints.\n",
"\n",
"***Important:*** No credentials are necessary when opting for the 'database' provider to generate embeddings. However, should users decide to utilize a third-party provider, they must create credentials specific to the chosen provider.\n",
"\n",
"Below is an illustrative example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" cursor = conn.cursor()\n",
" cursor.execute(\n",
" \"\"\"\n",
" declare\n",
" jo json_object_t;\n",
" begin\n",
" -- HuggingFace\n",
" dbms_vector_chain.drop_credential(credential_name => 'HF_CRED');\n",
" jo := json_object_t();\n",
" jo.put('access_token', '<access_token>');\n",
" dbms_vector_chain.create_credential(\n",
" credential_name => 'HF_CRED',\n",
" params => json(jo.to_string));\n",
"\n",
" -- OCIGENAI\n",
" dbms_vector_chain.drop_credential(credential_name => 'OCI_CRED');\n",
" jo := json_object_t();\n",
" jo.put('user_ocid','<user_ocid>');\n",
" jo.put('tenancy_ocid','<tenancy_ocid>');\n",
" jo.put('compartment_ocid','<compartment_ocid>');\n",
" jo.put('private_key','<private_key>');\n",
" jo.put('fingerprint','<fingerprint>');\n",
" dbms_vector_chain.create_credential(\n",
" credential_name => 'OCI_CRED',\n",
" params => json(jo.to_string));\n",
" end;\n",
" \"\"\"\n",
" )\n",
" cursor.close()\n",
" print(\"Credentials created.\")\n",
"except Exception as ex:\n",
" cursor.close()\n",
" raise"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load Documents\n",
"Users have the flexibility to load documents from either the Oracle Database, a file system, or both, by appropriately configuring the loader parameters. For comprehensive details on these parameters, please consult the [Oracle AI Vector Search Guide](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-73397E89-92FB-48ED-94BB-1AD960C4EA1F).\n",
"\n",
"A significant advantage of utilizing OracleDocLoader is its capability to process over 150 distinct file formats, eliminating the need for multiple loaders for different document types. For a complete list of the supported formats, please refer to the [Oracle Text Supported Document Formats](https://docs.oracle.com/en/database/oracle/oracle-database/23/ccref/oracle-text-supported-document-formats.html).\n",
"\n",
"Below is a sample code snippet that demonstrates how to use OracleDocLoader"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of docs loaded: 3\n"
]
}
],
"source": [
"from langchain_community.document_loaders.oracleai import OracleDocLoader\n",
"from langchain_core.documents import Document\n",
"\n",
"# loading from Oracle Database table\n",
"# make sure you have the table with this specification\n",
"loader_params = {}\n",
"loader_params = {\n",
" \"owner\": \"testuser\",\n",
" \"tablename\": \"demo_tab\",\n",
" \"colname\": \"data\",\n",
"}\n",
"\n",
"\"\"\" load the docs \"\"\"\n",
"loader = OracleDocLoader(conn=conn, params=loader_params)\n",
"docs = loader.load()\n",
"\n",
"\"\"\" verify \"\"\"\n",
"print(f\"Number of docs loaded: {len(docs)}\")\n",
"# print(f\"Document-0: {docs[0].page_content}\") # content"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate Summary\n",
"Now that the user loaded the documents, they may want to generate a summary for each document. The Oracle AI Vector Search Langchain library offers a suite of APIs designed for document summarization. It supports multiple summarization providers such as Database, OCIGENAI, HuggingFace, among others, allowing users to select the provider that best meets their needs. To utilize these capabilities, users must configure the summary parameters as specified. For detailed information on these parameters, please consult the [Oracle AI Vector Search Guide book](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-EC9DDB58-6A15-4B36-BA66-ECBA20D2CE57)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"***Note:*** The users may need to set proxy if they want to use some 3rd party summary generation providers other than Oracle's in-house and default provider: 'database'. If you don't have proxy, please remove the proxy parameter when you instantiate the OracleSummary."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"# proxy to be used when we instantiate summary and embedder object\n",
"proxy = \"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following sample code will show how to generate summary:"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of Summaries: 3\n"
]
}
],
"source": [
"from langchain_community.utilities.oracleai import OracleSummary\n",
"from langchain_core.documents import Document\n",
"\n",
"# using 'database' provider\n",
"summary_params = {\n",
" \"provider\": \"database\",\n",
" \"glevel\": \"S\",\n",
" \"numParagraphs\": 1,\n",
" \"language\": \"english\",\n",
"}\n",
"\n",
"# get the summary instance\n",
"# Remove proxy if not required\n",
"summ = OracleSummary(conn=conn, params=summary_params, proxy=proxy)\n",
"\n",
"list_summary = []\n",
"for doc in docs:\n",
" summary = summ.get_summary(doc.page_content)\n",
" list_summary.append(summary)\n",
"\n",
"\"\"\" verify \"\"\"\n",
"print(f\"Number of Summaries: {len(list_summary)}\")\n",
"# print(f\"Summary-0: {list_summary[0]}\") #content"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Split Documents\n",
"The documents may vary in size, ranging from small to very large. Users often prefer to chunk their documents into smaller sections to facilitate the generation of embeddings. A wide array of customization options is available for this splitting process. For comprehensive details regarding these parameters, please consult the [Oracle AI Vector Search Guide](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-4E145629-7098-4C7C-804F-FC85D1F24240).\n",
"\n",
"Below is a sample code illustrating how to implement this:"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of Chunks: 3\n"
]
}
],
"source": [
"from langchain_community.document_loaders.oracleai import OracleTextSplitter\n",
"from langchain_core.documents import Document\n",
"\n",
"# split by default parameters\n",
"splitter_params = {\"normalize\": \"all\"}\n",
"\n",
"\"\"\" get the splitter instance \"\"\"\n",
"splitter = OracleTextSplitter(conn=conn, params=splitter_params)\n",
"\n",
"list_chunks = []\n",
"for doc in docs:\n",
" chunks = splitter.split_text(doc.page_content)\n",
" list_chunks.extend(chunks)\n",
"\n",
"\"\"\" verify \"\"\"\n",
"print(f\"Number of Chunks: {len(list_chunks)}\")\n",
"# print(f\"Chunk-0: {list_chunks[0]}\") # content"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Generate Embeddings\n",
"Now that the documents are chunked as per requirements, the users may want to generate embeddings for these chunks. Oracle AI Vector Search provides multiple methods for generating embeddings, utilizing either locally hosted ONNX models or third-party APIs. For comprehensive instructions on configuring these alternatives, please refer to the [Oracle AI Vector Search Guide](https://docs.oracle.com/en/database/oracle/oracle-database/23/arpls/dbms_vector_chain1.html#GUID-C6439E94-4E86-4ECD-954E-4B73D53579DE)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"***Note:*** Users may need to configure a proxy to utilize third-party embedding generation providers, excluding the 'database' provider that utilizes an ONNX model."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"# proxy to be used when we instantiate summary and embedder object\n",
"proxy = \"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following sample code will show how to generate embeddings:"
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Number of embeddings: 3\n"
]
}
],
"source": [
"from langchain_community.embeddings.oracleai import OracleEmbeddings\n",
"from langchain_core.documents import Document\n",
"\n",
"# using ONNX model loaded to Oracle Database\n",
"embedder_params = {\"provider\": \"database\", \"model\": \"demo_model\"}\n",
"\n",
"# get the embedding instance\n",
"# Remove proxy if not required\n",
"embedder = OracleEmbeddings(conn=conn, params=embedder_params, proxy=proxy)\n",
"\n",
"embeddings = []\n",
"for doc in docs:\n",
" chunks = splitter.split_text(doc.page_content)\n",
" for chunk in chunks:\n",
" embed = embedder.embed_query(chunk)\n",
" embeddings.append(embed)\n",
"\n",
"\"\"\" verify \"\"\"\n",
"print(f\"Number of embeddings: {len(embeddings)}\")\n",
"# print(f\"Embedding-0: {embeddings[0]}\") # content"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Oracle AI Vector Store\n",
"Now that you know how to use Oracle AI Langchain library APIs individually to process the documents, let us show how to integrate with Oracle AI Vector Store to facilitate the semantic searches."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, let's import all the dependencies."
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [],
"source": [
"import sys\n",
"\n",
"import oracledb\n",
"from langchain_community.document_loaders.oracleai import (\n",
" OracleDocLoader,\n",
" OracleTextSplitter,\n",
")\n",
"from langchain_community.embeddings.oracleai import OracleEmbeddings\n",
"from langchain_community.utilities.oracleai import OracleSummary\n",
"from langchain_community.vectorstores import oraclevs\n",
"from langchain_community.vectorstores.oraclevs import OracleVS\n",
"from langchain_community.vectorstores.utils import DistanceStrategy\n",
"from langchain_core.documents import Document"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, let's combine all document processing stages together. Here is the sample code below:"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Connection successful!\n",
"ONNX model loaded.\n",
"Number of total chunks with metadata: 3\n"
]
}
],
"source": [
"\"\"\"\n",
"In this sample example, we will use 'database' provider for both summary and embeddings.\n",
"So, we don't need to do the followings:\n",
" - set proxy for 3rd party providers\n",
" - create credential for 3rd party providers\n",
"\n",
"If you choose to use 3rd party provider, \n",
"please follow the necessary steps for proxy and credential.\n",
"\"\"\"\n",
"\n",
"# oracle connection\n",
"# please update with your username, password, hostname, and service_name\n",
"username = \"\"\n",
"password = \"\"\n",
"dsn = \"\"\n",
"\n",
"try:\n",
" conn = oracledb.connect(user=username, password=password, dsn=dsn)\n",
" print(\"Connection successful!\")\n",
"except Exception as e:\n",
" print(\"Connection failed!\")\n",
" sys.exit(1)\n",
"\n",
"\n",
"# load onnx model\n",
"# please update with your related information\n",
"onnx_dir = \"DEMO_PY_DIR\"\n",
"onnx_file = \"tinybert.onnx\"\n",
"model_name = \"demo_model\"\n",
"try:\n",
" OracleEmbeddings.load_onnx_model(conn, onnx_dir, onnx_file, model_name)\n",
" print(\"ONNX model loaded.\")\n",
"except Exception as e:\n",
" print(\"ONNX model loading failed!\")\n",
" sys.exit(1)\n",
"\n",
"\n",
"# params\n",
"# please update necessary fields with related information\n",
"loader_params = {\n",
" \"owner\": \"testuser\",\n",
" \"tablename\": \"demo_tab\",\n",
" \"colname\": \"data\",\n",
"}\n",
"summary_params = {\n",
" \"provider\": \"database\",\n",
" \"glevel\": \"S\",\n",
" \"numParagraphs\": 1,\n",
" \"language\": \"english\",\n",
"}\n",
"splitter_params = {\"normalize\": \"all\"}\n",
"embedder_params = {\"provider\": \"database\", \"model\": \"demo_model\"}\n",
"\n",
"# instantiate loader, summary, splitter, and embedder\n",
"loader = OracleDocLoader(conn=conn, params=loader_params)\n",
"summary = OracleSummary(conn=conn, params=summary_params)\n",
"splitter = OracleTextSplitter(conn=conn, params=splitter_params)\n",
"embedder = OracleEmbeddings(conn=conn, params=embedder_params)\n",
"\n",
"# process the documents\n",
"chunks_with_mdata = []\n",
"for id, doc in enumerate(docs, start=1):\n",
" summ = summary.get_summary(doc.page_content)\n",
" chunks = splitter.split_text(doc.page_content)\n",
" for ic, chunk in enumerate(chunks, start=1):\n",
" chunk_metadata = doc.metadata.copy()\n",
" chunk_metadata[\"id\"] = chunk_metadata[\"_oid\"] + \"$\" + str(id) + \"$\" + str(ic)\n",
" chunk_metadata[\"document_id\"] = str(id)\n",
" chunk_metadata[\"document_summary\"] = str(summ[0])\n",
" chunks_with_mdata.append(\n",
" Document(page_content=str(chunk), metadata=chunk_metadata)\n",
" )\n",
"\n",
"\"\"\" verify \"\"\"\n",
"print(f\"Number of total chunks with metadata: {len(chunks_with_mdata)}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"At this point, we have processed the documents and generated chunks with metadata. Next, we will create Oracle AI Vector Store with those chunks.\n",
"\n",
"Here is the sample code how to do that:"
]
},
{
"cell_type": "code",
"execution_count": 55,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Vector Store Table: oravs\n"
]
}
],
"source": [
"# create Oracle AI Vector Store\n",
"vectorstore = OracleVS.from_documents(\n",
" chunks_with_mdata,\n",
" embedder,\n",
" client=conn,\n",
" table_name=\"oravs\",\n",
" distance_strategy=DistanceStrategy.DOT_PRODUCT,\n",
")\n",
"\n",
"\"\"\" verify \"\"\"\n",
"print(f\"Vector Store Table: {vectorstore.table_name}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The example provided illustrates the creation of a vector store using the DOT_PRODUCT distance strategy. Users have the flexibility to employ various distance strategies with the Oracle AI Vector Store, as detailed in our [comprehensive guide](https://python.langchain.com/v0.1/docs/integrations/vectorstores/oracle/)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"With embeddings now stored in vector stores, it is advisable to establish an index to enhance semantic search performance during query execution.\n",
"\n",
"***Note*** Should you encounter an \"insufficient memory\" error, it is recommended to increase the ***vector_memory_size*** in your database configuration\n",
"\n",
"Below is a sample code snippet for creating an index:"
]
},
{
"cell_type": "code",
"execution_count": 56,
"metadata": {},
"outputs": [],
"source": [
"oraclevs.create_index(\n",
" conn, vectorstore, params={\"idx_name\": \"hnsw_oravs\", \"idx_type\": \"HNSW\"}\n",
")\n",
"\n",
"print(\"Index created.\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This example demonstrates the creation of a default HNSW index on embeddings within the 'oravs' table. Users may adjust various parameters according to their specific needs. For detailed information on these parameters, please consult the [Oracle AI Vector Search Guide book](https://docs.oracle.com/en/database/oracle/oracle-database/23/vecse/manage-different-categories-vector-indexes.html).\n",
"\n",
"Additionally, various types of vector indices can be created to meet diverse requirements. More details can be found in our [comprehensive guide](https://python.langchain.com/v0.1/docs/integrations/vectorstores/oracle/).\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Perform Semantic Search\n",
"All set!\n",
"\n",
"We have successfully processed the documents and stored them in the vector store, followed by the creation of an index to enhance query performance. We are now prepared to proceed with semantic searches.\n",
"\n",
"Below is the sample code for this process:"
]
},
{
"cell_type": "code",
"execution_count": 58,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[Document(page_content='The database stores LOBs differently from other data types. Creating a LOB column implicitly creates a LOB segment and a LOB index. The tablespace containing the LOB segment and LOB index, which are always stored together, may be different from the tablespace containing the table. Sometimes the database can store small amounts of LOB data in the table itself rather than in a separate LOB segment.', metadata={'_oid': '662f2f257677f3c2311a8ff999fd34e5', '_rowid': 'AAAR/xAAEAAAAAnAAC', 'id': '662f2f257677f3c2311a8ff999fd34e5$3$1', 'document_id': '3', 'document_summary': 'Sometimes the database can store small amounts of LOB data in the table itself rather than in a separate LOB segment.\\n\\n'})]\n",
"[]\n",
"[(Document(page_content='The database stores LOBs differently from other data types. Creating a LOB column implicitly creates a LOB segment and a LOB index. The tablespace containing the LOB segment and LOB index, which are always stored together, may be different from the tablespace containing the table. Sometimes the database can store small amounts of LOB data in the table itself rather than in a separate LOB segment.', metadata={'_oid': '662f2f257677f3c2311a8ff999fd34e5', '_rowid': 'AAAR/xAAEAAAAAnAAC', 'id': '662f2f257677f3c2311a8ff999fd34e5$3$1', 'document_id': '3', 'document_summary': 'Sometimes the database can store small amounts of LOB data in the table itself rather than in a separate LOB segment.\\n\\n'}), 0.055675752460956573)]\n",
"[]\n",
"[Document(page_content='If the answer to any preceding questions is yes, then the database stops the search and allocates space from the specified tablespace; otherwise, space is allocated from the database default shared temporary tablespace.', metadata={'_oid': '662f2f253acf96b33b430b88699490a2', '_rowid': 'AAAR/xAAEAAAAAnAAA', 'id': '662f2f253acf96b33b430b88699490a2$1$1', 'document_id': '1', 'document_summary': 'If the answer to any preceding questions is yes, then the database stops the search and allocates space from the specified tablespace; otherwise, space is allocated from the database default shared temporary tablespace.\\n\\n'})]\n",
"[Document(page_content='If the answer to any preceding questions is yes, then the database stops the search and allocates space from the specified tablespace; otherwise, space is allocated from the database default shared temporary tablespace.', metadata={'_oid': '662f2f253acf96b33b430b88699490a2', '_rowid': 'AAAR/xAAEAAAAAnAAA', 'id': '662f2f253acf96b33b430b88699490a2$1$1', 'document_id': '1', 'document_summary': 'If the answer to any preceding questions is yes, then the database stops the search and allocates space from the specified tablespace; otherwise, space is allocated from the database default shared temporary tablespace.\\n\\n'})]\n"
]
}
],
"source": [
"query = \"What is Oracle AI Vector Store?\"\n",
"filter = {\"document_id\": [\"1\"]}\n",
"\n",
"# Similarity search without a filter\n",
"print(vectorstore.similarity_search(query, 1))\n",
"\n",
"# Similarity search with a filter\n",
"print(vectorstore.similarity_search(query, 1, filter=filter))\n",
"\n",
"# Similarity search with relevance score\n",
"print(vectorstore.similarity_search_with_score(query, 1))\n",
"\n",
"# Similarity search with relevance score with filter\n",
"print(vectorstore.similarity_search_with_score(query, 1, filter=filter))\n",
"\n",
"# Max marginal relevance search\n",
"print(vectorstore.max_marginal_relevance_search(query, 1, fetch_k=20, lambda_mult=0.5))\n",
"\n",
"# Max marginal relevance search with filter\n",
"print(\n",
" vectorstore.max_marginal_relevance_search(\n",
" query, 1, fetch_k=20, lambda_mult=0.5, filter=filter\n",
" )\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -129,7 +129,7 @@
" return obs_message\n",
"\n",
" def _act(self):\n",
" act_message = self.model(self.message_history)\n",
" act_message = self.model.invoke(self.message_history)\n",
" self.message_history.append(act_message)\n",
" action = int(self.action_parser.parse(act_message.content)[\"action\"])\n",
" return action\n",

View File

@@ -84,7 +84,7 @@
"from langchain.retrievers import KayAiRetriever\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"model = ChatOpenAI(model_name=\"gpt-3.5-turbo\")\n",
"model = ChatOpenAI(model=\"gpt-3.5-turbo\")\n",
"retriever = KayAiRetriever.create(\n",
" dataset_id=\"company\", data_types=[\"PressRelease\"], num_contexts=6\n",
")\n",

View File

@@ -0,0 +1,761 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "10f50955-be55-422f-8c62-3a32f8cf02ed",
"metadata": {},
"source": [
"# RAG application running locally on Intel Xeon CPU using langchain and open-source models"
]
},
{
"cell_type": "markdown",
"id": "48113be6-44bb-4aac-aed3-76a1365b9561",
"metadata": {},
"source": [
"Author - Pratool Bharti (pratool.bharti@intel.com)"
]
},
{
"cell_type": "markdown",
"id": "8b10b54b-1572-4ea1-9c1e-1d29fcc3dcd9",
"metadata": {},
"source": [
"In this cookbook, we use langchain tools and open source models to execute locally on CPU. This notebook has been validated to run on Intel Xeon 8480+ CPU. Here we implement a RAG pipeline for Llama2 model to answer questions about Intel Q1 2024 earnings release."
]
},
{
"cell_type": "markdown",
"id": "acadbcec-3468-4926-8ce5-03b678041c0a",
"metadata": {},
"source": [
"**Create a conda or virtualenv environment with python >=3.10 and install following libraries**\n",
"<br>\n",
"\n",
"`pip install --upgrade langchain langchain-community langchainhub langchain-chroma bs4 gpt4all pypdf pysqlite3-binary` <br>\n",
"`pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu`"
]
},
{
"cell_type": "markdown",
"id": "84c392c8-700a-42ec-8e94-806597f22e43",
"metadata": {},
"source": [
"**Load pysqlite3 in sys modules since ChromaDB requires sqlite3.**"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "145cd491-b388-4ea7-bdc8-2f4995cac6fd",
"metadata": {},
"outputs": [],
"source": [
"__import__(\"pysqlite3\")\n",
"import sys\n",
"\n",
"sys.modules[\"sqlite3\"] = sys.modules.pop(\"pysqlite3\")"
]
},
{
"cell_type": "markdown",
"id": "14dde7e2-b236-49b9-b3a0-08c06410418c",
"metadata": {},
"source": [
"**Import essential components from langchain to load and split data**"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "887643ba-249e-48d6-9aa7-d25087e8dfbf",
"metadata": {},
"outputs": [],
"source": [
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n",
"from langchain_community.document_loaders import PyPDFLoader"
]
},
{
"cell_type": "markdown",
"id": "922c0eba-8736-4de5-bd2f-3d0f00b16e43",
"metadata": {},
"source": [
"**Download Intel Q1 2024 earnings release**"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "2d6a2419-5338-4188-8615-a40a65ff8019",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2024-07-15 15:04:43-- https://d1io3yog0oux5.cloudfront.net/_11d435a500963f99155ee058df09f574/intel/db/887/9014/earnings_release/Q1+24_EarningsRelease_FINAL.pdf\n",
"Resolving proxy-dmz.intel.com (proxy-dmz.intel.com)... 10.7.211.16\n",
"Connecting to proxy-dmz.intel.com (proxy-dmz.intel.com)|10.7.211.16|:912... connected.\n",
"Proxy request sent, awaiting response... 200 OK\n",
"Length: 133510 (130K) [application/pdf]\n",
"Saving to: intel_q1_2024_earnings.pdf\n",
"\n",
"intel_q1_2024_earni 100%[===================>] 130.38K --.-KB/s in 0.005s \n",
"\n",
"2024-07-15 15:04:44 (24.6 MB/s) - intel_q1_2024_earnings.pdf saved [133510/133510]\n",
"\n"
]
}
],
"source": [
"!wget 'https://d1io3yog0oux5.cloudfront.net/_11d435a500963f99155ee058df09f574/intel/db/887/9014/earnings_release/Q1+24_EarningsRelease_FINAL.pdf' -O intel_q1_2024_earnings.pdf"
]
},
{
"cell_type": "markdown",
"id": "e3612627-e105-453d-8a50-bbd6e39dedb5",
"metadata": {},
"source": [
"**Loading earning release pdf document through PyPDFLoader**"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "cac6278e-ebad-4224-a062-bf6daca24cb0",
"metadata": {},
"outputs": [],
"source": [
"loader = PyPDFLoader(\"intel_q1_2024_earnings.pdf\")\n",
"data = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "a7dca43b-1c62-41df-90c7-6ed2904f823d",
"metadata": {},
"source": [
"**Splitting entire document in several chunks with each chunk size is 500 tokens**"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "4486adbe-0d0e-4685-8c08-c1774ed6e993",
"metadata": {},
"outputs": [],
"source": [
"text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)\n",
"all_splits = text_splitter.split_documents(data)"
]
},
{
"cell_type": "markdown",
"id": "af142346-e793-4a52-9a56-63e3be416b3d",
"metadata": {},
"source": [
"**Looking at the first split of the document**"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "e4240fd1-898e-4bfc-a377-02c9bc25b56e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(metadata={'source': 'intel_q1_2024_earnings.pdf', 'page': 0}, page_content='Intel Corporation\\n2200 Mission College Blvd.\\nSanta Clara, CA 95054-1549\\n \\nNews Release\\n Intel Reports First -Quarter 2024 Financial Results\\nNEWS SUMMARY\\n▪First-quarter revenue of $12.7 billion , up 9% year over year (YoY).\\n▪First-quarter GAAP earnings (loss) per share (EPS) attributable to Intel was $(0.09) ; non-GAAP EPS \\nattributable to Intel was $0.18 .')"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"all_splits[0]"
]
},
{
"cell_type": "markdown",
"id": "b88d2632-7c1b-49ef-a691-c0eb67d23e6a",
"metadata": {},
"source": [
"**One of the major step in RAG is to convert each split of document into embeddings and store in a vector database such that searching relevant documents are efficient.** <br>\n",
"**For that, importing Chroma vector database from langchain. Also, importing open source GPT4All for embedding models**"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "9ff99dd7-9d47-4239-ba0a-d775792334ba",
"metadata": {},
"outputs": [],
"source": [
"from langchain_chroma import Chroma\n",
"from langchain_community.embeddings import GPT4AllEmbeddings"
]
},
{
"cell_type": "markdown",
"id": "b5d1f4dd-dd8d-4a20-95d1-2dbdd204375a",
"metadata": {},
"source": [
"**In next step, we will download one of the most popular embedding model \"all-MiniLM-L6-v2\". Find more details of the model at this link https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2**"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "05db3494-5d8e-4a13-9941-26330a86f5e5",
"metadata": {},
"outputs": [],
"source": [
"model_name = \"all-MiniLM-L6-v2.gguf2.f16.gguf\"\n",
"gpt4all_kwargs = {\"allow_download\": \"True\"}\n",
"embeddings = GPT4AllEmbeddings(model_name=model_name, gpt4all_kwargs=gpt4all_kwargs)"
]
},
{
"cell_type": "markdown",
"id": "4e53999e-1983-46ac-8039-2783e194c3ae",
"metadata": {},
"source": [
"**Store all the embeddings in the Chroma database**"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "0922951a-9ddf-4761-973d-8e9a86f61284",
"metadata": {},
"outputs": [],
"source": [
"vectorstore = Chroma.from_documents(documents=all_splits, embedding=embeddings)"
]
},
{
"cell_type": "markdown",
"id": "29f94fa0-6c75-4a65-a1a3-debc75422479",
"metadata": {},
"source": [
"**Now, let's find relevant splits from the documents related to the question**"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "88c8152d-ec7a-4f0b-9d86-877789407537",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4\n"
]
}
],
"source": [
"question = \"What is Intel CCG revenue in Q1 2024\"\n",
"docs = vectorstore.similarity_search(question)\n",
"print(len(docs))"
]
},
{
"cell_type": "markdown",
"id": "53330c6b-cb0f-43f9-b379-2e57ac1e5335",
"metadata": {},
"source": [
"**Look at the first retrieved document from the vector database**"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "43a6d94f-b5c4-47b0-a353-2db4c3d24d9c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Document(metadata={'page': 1, 'source': 'intel_q1_2024_earnings.pdf'}, page_content='Client Computing Group (CCG) $7.5 billion up31%\\nData Center and AI (DCAI) $3.0 billion up5%\\nNetwork and Edge (NEX) $1.4 billion down 8%\\nTotal Intel Products revenue $11.9 billion up17%\\nIntel Foundry $4.4 billion down 10%\\nAll other:\\nAltera $342 million down 58%\\nMobileye $239 million down 48%\\nOther $194 million up17%\\nTotal all other revenue $775 million down 46%\\nIntersegment eliminations $(4.4) billion\\nTotal net revenue $12.7 billion up9%\\nIntel Products Highlights')"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"docs[0]"
]
},
{
"cell_type": "markdown",
"id": "64ba074f-4b36-442e-b7e2-b26d6e2815c3",
"metadata": {},
"source": [
"**Download Lllama-2 model from Huggingface and store locally** <br>\n",
"**You can download different quantization variant of Lllama-2 model from the link below. We are using Q8 version here (7.16GB).** <br>\n",
"https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c8dd0811-6f43-4bc6-b854-2ab377639c9a",
"metadata": {},
"outputs": [],
"source": [
"!huggingface-cli download TheBloke/Llama-2-7b-Chat-GGUF llama-2-7b-chat.Q8_0.gguf --local-dir . --local-dir-use-symlinks False"
]
},
{
"cell_type": "markdown",
"id": "3895b1f5-f51d-4539-abf0-af33d7ca48ea",
"metadata": {},
"source": [
"**Import langchain components required to load downloaded LLMs model**"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "fb087088-aa62-44c0-8356-061e9b9f1186",
"metadata": {},
"outputs": [],
"source": [
"from langchain.callbacks.manager import CallbackManager\n",
"from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler\n",
"from langchain_community.llms import LlamaCpp"
]
},
{
"cell_type": "markdown",
"id": "5a8a111e-2614-4b70-b034-85cd3e7304cb",
"metadata": {},
"source": [
"**Loading the local Lllama-2 model using Llama-cpp library**"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "fb917da2-c0d7-4995-b56d-26254276e0da",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"llama_model_loader: loaded meta data with 19 key-value pairs and 291 tensors from llama-2-7b-chat.Q8_0.gguf (version GGUF V2)\n",
"llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.\n",
"llama_model_loader: - kv 0: general.architecture str = llama\n",
"llama_model_loader: - kv 1: general.name str = LLaMA v2\n",
"llama_model_loader: - kv 2: llama.context_length u32 = 4096\n",
"llama_model_loader: - kv 3: llama.embedding_length u32 = 4096\n",
"llama_model_loader: - kv 4: llama.block_count u32 = 32\n",
"llama_model_loader: - kv 5: llama.feed_forward_length u32 = 11008\n",
"llama_model_loader: - kv 6: llama.rope.dimension_count u32 = 128\n",
"llama_model_loader: - kv 7: llama.attention.head_count u32 = 32\n",
"llama_model_loader: - kv 8: llama.attention.head_count_kv u32 = 32\n",
"llama_model_loader: - kv 9: llama.attention.layer_norm_rms_epsilon f32 = 0.000001\n",
"llama_model_loader: - kv 10: general.file_type u32 = 7\n",
"llama_model_loader: - kv 11: tokenizer.ggml.model str = llama\n",
"llama_model_loader: - kv 12: tokenizer.ggml.tokens arr[str,32000] = [\"<unk>\", \"<s>\", \"</s>\", \"<0x00>\", \"<...\n",
"llama_model_loader: - kv 13: tokenizer.ggml.scores arr[f32,32000] = [0.000000, 0.000000, 0.000000, 0.0000...\n",
"llama_model_loader: - kv 14: tokenizer.ggml.token_type arr[i32,32000] = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...\n",
"llama_model_loader: - kv 15: tokenizer.ggml.bos_token_id u32 = 1\n",
"llama_model_loader: - kv 16: tokenizer.ggml.eos_token_id u32 = 2\n",
"llama_model_loader: - kv 17: tokenizer.ggml.unknown_token_id u32 = 0\n",
"llama_model_loader: - kv 18: general.quantization_version u32 = 2\n",
"llama_model_loader: - type f32: 65 tensors\n",
"llama_model_loader: - type q8_0: 226 tensors\n",
"llm_load_vocab: special tokens cache size = 259\n",
"llm_load_vocab: token to piece cache size = 0.1684 MB\n",
"llm_load_print_meta: format = GGUF V2\n",
"llm_load_print_meta: arch = llama\n",
"llm_load_print_meta: vocab type = SPM\n",
"llm_load_print_meta: n_vocab = 32000\n",
"llm_load_print_meta: n_merges = 0\n",
"llm_load_print_meta: vocab_only = 0\n",
"llm_load_print_meta: n_ctx_train = 4096\n",
"llm_load_print_meta: n_embd = 4096\n",
"llm_load_print_meta: n_layer = 32\n",
"llm_load_print_meta: n_head = 32\n",
"llm_load_print_meta: n_head_kv = 32\n",
"llm_load_print_meta: n_rot = 128\n",
"llm_load_print_meta: n_swa = 0\n",
"llm_load_print_meta: n_embd_head_k = 128\n",
"llm_load_print_meta: n_embd_head_v = 128\n",
"llm_load_print_meta: n_gqa = 1\n",
"llm_load_print_meta: n_embd_k_gqa = 4096\n",
"llm_load_print_meta: n_embd_v_gqa = 4096\n",
"llm_load_print_meta: f_norm_eps = 0.0e+00\n",
"llm_load_print_meta: f_norm_rms_eps = 1.0e-06\n",
"llm_load_print_meta: f_clamp_kqv = 0.0e+00\n",
"llm_load_print_meta: f_max_alibi_bias = 0.0e+00\n",
"llm_load_print_meta: f_logit_scale = 0.0e+00\n",
"llm_load_print_meta: n_ff = 11008\n",
"llm_load_print_meta: n_expert = 0\n",
"llm_load_print_meta: n_expert_used = 0\n",
"llm_load_print_meta: causal attn = 1\n",
"llm_load_print_meta: pooling type = 0\n",
"llm_load_print_meta: rope type = 0\n",
"llm_load_print_meta: rope scaling = linear\n",
"llm_load_print_meta: freq_base_train = 10000.0\n",
"llm_load_print_meta: freq_scale_train = 1\n",
"llm_load_print_meta: n_ctx_orig_yarn = 4096\n",
"llm_load_print_meta: rope_finetuned = unknown\n",
"llm_load_print_meta: ssm_d_conv = 0\n",
"llm_load_print_meta: ssm_d_inner = 0\n",
"llm_load_print_meta: ssm_d_state = 0\n",
"llm_load_print_meta: ssm_dt_rank = 0\n",
"llm_load_print_meta: model type = 7B\n",
"llm_load_print_meta: model ftype = Q8_0\n",
"llm_load_print_meta: model params = 6.74 B\n",
"llm_load_print_meta: model size = 6.67 GiB (8.50 BPW) \n",
"llm_load_print_meta: general.name = LLaMA v2\n",
"llm_load_print_meta: BOS token = 1 '<s>'\n",
"llm_load_print_meta: EOS token = 2 '</s>'\n",
"llm_load_print_meta: UNK token = 0 '<unk>'\n",
"llm_load_print_meta: LF token = 13 '<0x0A>'\n",
"llm_load_print_meta: max token length = 48\n",
"llm_load_tensors: ggml ctx size = 0.14 MiB\n",
"llm_load_tensors: CPU buffer size = 6828.64 MiB\n",
"...................................................................................................\n",
"llama_new_context_with_model: n_ctx = 2048\n",
"llama_new_context_with_model: n_batch = 512\n",
"llama_new_context_with_model: n_ubatch = 512\n",
"llama_new_context_with_model: flash_attn = 0\n",
"llama_new_context_with_model: freq_base = 10000.0\n",
"llama_new_context_with_model: freq_scale = 1\n",
"llama_kv_cache_init: CPU KV buffer size = 1024.00 MiB\n",
"llama_new_context_with_model: KV self size = 1024.00 MiB, K (f16): 512.00 MiB, V (f16): 512.00 MiB\n",
"llama_new_context_with_model: CPU output buffer size = 0.12 MiB\n",
"llama_new_context_with_model: CPU compute buffer size = 164.01 MiB\n",
"llama_new_context_with_model: graph nodes = 1030\n",
"llama_new_context_with_model: graph splits = 1\n",
"AVX = 1 | AVX_VNNI = 0 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | AVX512_BF16 = 0 | FMA = 1 | NEON = 0 | SVE = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 | MATMUL_INT8 = 0 | LLAMAFILE = 0 | \n",
"Model metadata: {'tokenizer.ggml.unknown_token_id': '0', 'tokenizer.ggml.eos_token_id': '2', 'general.architecture': 'llama', 'llama.context_length': '4096', 'general.name': 'LLaMA v2', 'llama.embedding_length': '4096', 'llama.feed_forward_length': '11008', 'llama.attention.layer_norm_rms_epsilon': '0.000001', 'llama.rope.dimension_count': '128', 'llama.attention.head_count': '32', 'tokenizer.ggml.bos_token_id': '1', 'llama.block_count': '32', 'llama.attention.head_count_kv': '32', 'general.quantization_version': '2', 'tokenizer.ggml.model': 'llama', 'general.file_type': '7'}\n",
"Using fallback chat format: llama-2\n"
]
}
],
"source": [
"llm = LlamaCpp(\n",
" model_path=\"llama-2-7b-chat.Q8_0.gguf\",\n",
" n_gpu_layers=-1,\n",
" n_batch=512,\n",
" n_ctx=2048,\n",
" f16_kv=True, # MUST set to True, otherwise you will run into problem after a couple of calls\n",
" callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "43e06f56-ef97-451b-87d9-8465ea442aed",
"metadata": {},
"source": [
"**Now let's ask the same question to Llama model without showing them the earnings release.**"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "1033dd82-5532-437d-a548-27695e109589",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"?\n",
"(NASDAQ:INTC)\n",
"Intel's CCG (Client Computing Group) revenue for Q1 2024 was $9.6 billion, a decrease of 35% from the previous quarter and a decrease of 42% from the same period last year."
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"llama_print_timings: load time = 131.20 ms\n",
"llama_print_timings: sample time = 16.05 ms / 68 runs ( 0.24 ms per token, 4236.76 tokens per second)\n",
"llama_print_timings: prompt eval time = 131.14 ms / 16 tokens ( 8.20 ms per token, 122.01 tokens per second)\n",
"llama_print_timings: eval time = 3225.00 ms / 67 runs ( 48.13 ms per token, 20.78 tokens per second)\n",
"llama_print_timings: total time = 3466.40 ms / 83 tokens\n"
]
},
{
"data": {
"text/plain": [
"\"?\\n(NASDAQ:INTC)\\nIntel's CCG (Client Computing Group) revenue for Q1 2024 was $9.6 billion, a decrease of 35% from the previous quarter and a decrease of 42% from the same period last year.\""
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"llm.invoke(question)"
]
},
{
"cell_type": "markdown",
"id": "75f5cb10-746f-4e37-9386-b85a4d2b84ef",
"metadata": {},
"source": [
"**As you can see, model is giving wrong information. Correct asnwer is CCG revenue in Q1 2024 is $7.5B. Now let's apply RAG using the earning release document**"
]
},
{
"cell_type": "markdown",
"id": "0f4150ec-5692-4756-b11a-22feb7ab88ff",
"metadata": {},
"source": [
"**in RAG, we modify the input prompt by adding relevent documents with the question. Here, we use one of the popular RAG prompt**"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "226c14b0-f43e-4a1f-a1e4-04731d467ec4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template=\"You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\\nQuestion: {question} \\nContext: {context} \\nAnswer:\"))]"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain import hub\n",
"\n",
"rag_prompt = hub.pull(\"rlm/rag-prompt\")\n",
"rag_prompt.messages"
]
},
{
"cell_type": "markdown",
"id": "77deb6a0-0950-450a-916a-f2a029676c20",
"metadata": {},
"source": [
"**Appending all retreived documents in a single document**"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "2dbc3327-6ef3-4c1f-8797-0c71964b0921",
"metadata": {},
"outputs": [],
"source": [
"def format_docs(docs):\n",
" return \"\\n\\n\".join(doc.page_content for doc in docs)"
]
},
{
"cell_type": "markdown",
"id": "2e2d9f18-49d0-43a3-bea8-78746ffa86b7",
"metadata": {},
"source": [
"**The last step is to create a chain using langchain tool that will create an e2e pipeline. It will take question and context as an input.**"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "427379c2-51ff-4e0f-8278-a45221363299",
"metadata": {},
"outputs": [],
"source": [
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.runnables import RunnablePassthrough, RunnablePick\n",
"\n",
"# Chain\n",
"chain = (\n",
" RunnablePassthrough.assign(context=RunnablePick(\"context\") | format_docs)\n",
" | rag_prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "095d6280-c949-4d00-8e32-8895a82d245f",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Llama.generate: prefix-match hit\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" Based on the provided context, Intel CCG revenue in Q1 2024 was $7.5 billion up 31%."
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"llama_print_timings: load time = 131.20 ms\n",
"llama_print_timings: sample time = 7.74 ms / 31 runs ( 0.25 ms per token, 4004.13 tokens per second)\n",
"llama_print_timings: prompt eval time = 2529.41 ms / 674 tokens ( 3.75 ms per token, 266.46 tokens per second)\n",
"llama_print_timings: eval time = 1542.94 ms / 30 runs ( 51.43 ms per token, 19.44 tokens per second)\n",
"llama_print_timings: total time = 4123.68 ms / 704 tokens\n"
]
},
{
"data": {
"text/plain": [
"' Based on the provided context, Intel CCG revenue in Q1 2024 was $7.5 billion up 31%.'"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"chain.invoke({\"context\": docs, \"question\": question})"
]
},
{
"cell_type": "markdown",
"id": "638364b2-6bd2-4471-9961-d3a1d1b9d4ee",
"metadata": {},
"source": [
"**Now we see the results are correct as it is mentioned in earnings release.** <br>\n",
"**To further automate, we will create a chain that will take input as question and retriever so that we don't need to retrieve documents separately**"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "4654e5b7-635f-4767-8b31-4c430164cdd5",
"metadata": {},
"outputs": [],
"source": [
"retriever = vectorstore.as_retriever()\n",
"qa_chain = (\n",
" {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n",
" | rag_prompt\n",
" | llm\n",
" | StrOutputParser()\n",
")"
]
},
{
"cell_type": "markdown",
"id": "0979f393-fd0a-4e82-b844-68371c6ad68f",
"metadata": {},
"source": [
"**Now we only need to pass the question to the chain and it will fetch the contexts directly from the vector database to generate the answer**\n",
"<br>\n",
"**Let's try with another question**"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "3ea07b82-e6ec-4084-85f4-191373530172",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Llama.generate: prefix-match hit\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
" According to the provided context, Intel DCAI revenue in Q1 2024 was $3.0 billion up 5%."
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"\n",
"llama_print_timings: load time = 131.20 ms\n",
"llama_print_timings: sample time = 6.28 ms / 31 runs ( 0.20 ms per token, 4937.88 tokens per second)\n",
"llama_print_timings: prompt eval time = 2681.93 ms / 730 tokens ( 3.67 ms per token, 272.19 tokens per second)\n",
"llama_print_timings: eval time = 1471.07 ms / 30 runs ( 49.04 ms per token, 20.39 tokens per second)\n",
"llama_print_timings: total time = 4206.77 ms / 760 tokens\n"
]
},
{
"data": {
"text/plain": [
"' According to the provided context, Intel DCAI revenue in Q1 2024 was $3.0 billion up 5%.'"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"qa_chain.invoke(\"what is Intel DCAI revenue in Q1 2024?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9407f2a0-4a35-4315-8e96-02fcb80f210c",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.11.1 64-bit",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.1"
},
"vscode": {
"interpreter": {
"hash": "1a1af0ee75eeea9e2e1ee996c87e7a2b11a0bebd85af04bb136d915cefc0abce"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,82 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# RAG using Upstage Layout Analysis and Groundedness Check\n",
"This example illustrates RAG using [Upstage](https://python.langchain.com/docs/integrations/providers/upstage/) Layout Analysis and Groundedness Check."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from typing import List\n",
"\n",
"from langchain_community.vectorstores import DocArrayInMemorySearch\n",
"from langchain_core.output_parsers import StrOutputParser\n",
"from langchain_core.prompts import ChatPromptTemplate\n",
"from langchain_core.runnables import RunnablePassthrough\n",
"from langchain_core.runnables.base import RunnableSerializable\n",
"from langchain_upstage import (\n",
" ChatUpstage,\n",
" UpstageEmbeddings,\n",
" UpstageGroundednessCheck,\n",
" UpstageLayoutAnalysisLoader,\n",
")\n",
"\n",
"model = ChatUpstage()\n",
"\n",
"files = [\"/PATH/TO/YOUR/FILE.pdf\", \"/PATH/TO/YOUR/FILE2.pdf\"]\n",
"\n",
"loader = UpstageLayoutAnalysisLoader(file_path=files, split=\"element\")\n",
"\n",
"docs = loader.load()\n",
"\n",
"vectorstore = DocArrayInMemorySearch.from_documents(\n",
" docs, embedding=UpstageEmbeddings(model=\"solar-embedding-1-large\")\n",
")\n",
"retriever = vectorstore.as_retriever()\n",
"\n",
"template = \"\"\"Answer the question based only on the following context:\n",
"{context}\n",
"\n",
"Question: {question}\n",
"\"\"\"\n",
"prompt = ChatPromptTemplate.from_template(template)\n",
"output_parser = StrOutputParser()\n",
"\n",
"retrieved_docs = retriever.get_relevant_documents(\"How many parameters in SOLAR model?\")\n",
"\n",
"groundedness_check = UpstageGroundednessCheck()\n",
"groundedness = \"\"\n",
"while groundedness != \"grounded\":\n",
" chain: RunnableSerializable = RunnablePassthrough() | prompt | model | output_parser\n",
"\n",
" result = chain.invoke(\n",
" {\n",
" \"context\": retrieved_docs,\n",
" \"question\": \"How many parameters in SOLAR model?\",\n",
" }\n",
" )\n",
"\n",
" groundedness = groundedness_check.invoke(\n",
" {\n",
" \"context\": retrieved_docs,\n",
" \"answer\": result,\n",
" }\n",
" )"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -36,15 +36,13 @@
"from bs4 import BeautifulSoup as Soup\n",
"from langchain.retrievers.multi_vector import MultiVectorRetriever\n",
"from langchain.storage import InMemoryByteStore, LocalFileStore\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.document_loaders.recursive_url_loader import (\n",
" RecursiveUrlLoader,\n",
")\n",
"\n",
"# noqa\n",
"from langchain_community.vectorstores import Chroma\n",
"\n",
"# For our example, we'll load docs from the web\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter # noqa\n",
"from langchain_text_splitters import RecursiveCharacterTextSplitter\n",
"\n",
"DOCSTORE_DIR = \".\"\n",
"DOCSTORE_ID_KEY = \"doc_id\""
@@ -372,13 +370,14 @@
],
"source": [
"import torch\n",
"from langchain.llms.huggingface_pipeline import HuggingFacePipeline\n",
"from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline\n",
"from langchain_huggingface.llms import HuggingFacePipeline\n",
"from optimum.intel.ipex import IPEXModelForCausalLM\n",
"from transformers import AutoTokenizer, pipeline\n",
"\n",
"model_id = \"Intel/neural-chat-7b-v3-3\"\n",
"tokenizer = AutoTokenizer.from_pretrained(model_id)\n",
"model = AutoModelForCausalLM.from_pretrained(\n",
" model_id, device_map=\"auto\", torch_dtype=torch.bfloat16\n",
"model = IPEXModelForCausalLM.from_pretrained(\n",
" model_id, torch_dtype=torch.bfloat16, export=True\n",
")\n",
"\n",
"pipe = pipeline(\"text-generation\", model=model, tokenizer=tokenizer, max_new_tokens=100)\n",
@@ -583,7 +582,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
"version": "3.10.14"
}
},
"nbformat": 4,

View File

@@ -274,7 +274,7 @@
"db = SQLDatabase.from_uri(\n",
" CONNECTION_STRING\n",
") # We reconnect to db so the new columns are loaded as well.\n",
"llm = ChatOpenAI(model_name=\"gpt-4\", temperature=0)\n",
"llm = ChatOpenAI(model=\"gpt-4\", temperature=0)\n",
"\n",
"sql_query_chain = (\n",
" RunnablePassthrough.assign(schema=get_schema)\n",

View File

@@ -245,7 +245,7 @@
"\n",
"\n",
"def _parse(text):\n",
" return text.strip(\"**\")"
" return text.strip('\"').strip(\"**\")"
]
},
{

View File

@@ -1,28 +1,32 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# SalesGPT - Your Context-Aware AI Sales Assistant With Knowledge Base\n",
"# SalesGPT - Context-Aware AI Sales Assistant With Knowledge Base and Ability Generate Stripe Payment Links\n",
"\n",
"This notebook demonstrates an implementation of a **Context-Aware** AI Sales agent with a Product Knowledge Base. \n",
"This notebook demonstrates an implementation of a **Context-Aware** AI Sales agent with a Product Knowledge Base which can actually close sales. \n",
"\n",
"This notebook was originally published at [filipmichalsky/SalesGPT](https://github.com/filip-michalsky/SalesGPT) by [@FilipMichalsky](https://twitter.com/FilipMichalsky).\n",
"\n",
"SalesGPT is context-aware, which means it can understand what section of a sales conversation it is in and act accordingly.\n",
" \n",
"As such, this agent can have a natural sales conversation with a prospect and behaves based on the conversation stage. Hence, this notebook demonstrates how we can use AI to automate sales development representatives activities, such as outbound sales calls. \n",
"As such, this agent can have a natural sales conversation with a prospect and behaves based on the conversation stage. Hence, this notebook demonstrates how we can use AI to automate sales development representatives activites, such as outbound sales calls. \n",
"\n",
"Additionally, the AI Sales agent has access to tools, which allow it to interact with other systems.\n",
"\n",
"Here, we show how the AI Sales Agent can use a **Product Knowledge Base** to speak about a particular's company offerings,\n",
"hence increasing relevance and reducing hallucinations.\n",
"\n",
"We leverage the [`langchain`](https://github.com/langchain-ai/langchain) library in this implementation, specifically [Custom Agent Configuration](https://langchain-langchain.vercel.app/docs/modules/agents/how_to/custom_agent_with_tool_retrieval) and are inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) architecture ."
"Furthermore, we show how our AI Sales Agent can **generate sales** by integration with the AI Agent Highway called [Mindware](https://www.mindware.co/). In practice, this allows the agent to autonomously generate a payment link for your customers **to pay for your products via Stripe**.\n",
"\n",
"We leverage the [`langchain`](https://github.com/hwchase17/langchain) library in this implementation, specifically [Custom Agent Configuration](https://langchain-langchain.vercel.app/docs/modules/agents/how_to/custom_agent_with_tool_retrieval) and are inspired by [BabyAGI](https://github.com/yoheinakajima/babyagi) architecture ."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -38,9 +42,10 @@
"import os\n",
"import re\n",
"\n",
"# import your OpenAI key\n",
"OPENAI_API_KEY = \"sk-xx\"\n",
"os.environ[\"OPENAI_API_KEY\"] = OPENAI_API_KEY\n",
"# make sure you have .env file saved locally with your API keys\n",
"from dotenv import load_dotenv\n",
"\n",
"load_dotenv()\n",
"\n",
"from typing import Any, Callable, Dict, List, Union\n",
"\n",
@@ -49,27 +54,18 @@
"from langchain.agents.conversational.prompt import FORMAT_INSTRUCTIONS\n",
"from langchain.chains import LLMChain, RetrievalQA\n",
"from langchain.chains.base import Chain\n",
"from langchain.llms import BaseLLM\n",
"from langchain.prompts import PromptTemplate\n",
"from langchain.prompts.base import StringPromptTemplate\n",
"from langchain_community.llms import BaseLLM\n",
"from langchain_community.vectorstores import Chroma\n",
"from langchain_core.agents import AgentAction, AgentFinish\n",
"from langchain_openai import ChatOpenAI, OpenAI, OpenAIEmbeddings\n",
"from langchain_text_splitters import CharacterTextSplitter\n",
"from langchain.schema import AgentAction, AgentFinish\n",
"from langchain.text_splitter import CharacterTextSplitter\n",
"from langchain.vectorstores import Chroma\n",
"from langchain_openai import ChatOpenAI, OpenAIEmbeddings\n",
"from pydantic import BaseModel, Field"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# install additional dependencies\n",
"# ! pip install chromadb openai tiktoken"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -77,19 +73,21 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Seed the SalesGPT agent\n",
"2. Run Sales Agent to decide what to do:\n",
"\n",
" a) Use a tool, such as look up Product Information in a Knowledge Base\n",
" a) Use a tool, such as look up Product Information in a Knowledge Base or Generate a Payment Link\n",
" \n",
" b) Output a response to a user \n",
"3. Run Sales Stage Recognition Agent to recognize which stage is the sales agent at and adjust their behaviour accordingly."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -98,15 +96,17 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Architecture diagram\n",
"\n",
"<img src=\"https://singularity-assets-public.s3.amazonaws.com/new_flow.png\" width=\"800\" height=\"440\"/>\n"
"<img src=\"https://demo-bucket-45.s3.amazonaws.com/new_flow2.png\" width=\"800\" height=\"440\">\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -131,7 +131,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
@@ -149,7 +149,7 @@
" {conversation_history}\n",
" ===\n",
"\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting only from the following options:\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting ony from the following options:\n",
" 1. Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\n",
" 2. Qualification: Qualify the prospect by confirming if they are the right person to talk to regarding your product/service. Ensure that they have the authority to make purchasing decisions.\n",
" 3. Value proposition: Briefly explain how your product/service can benefit the prospect. Focus on the unique selling points and value proposition of your product/service that sets it apart from competitors.\n",
@@ -171,7 +171,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -223,7 +223,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
@@ -240,13 +240,17 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# test the intermediate chains\n",
"verbose = True\n",
"llm = ChatOpenAI(temperature=0.9)\n",
"llm = ChatOpenAI(\n",
" model=\"gpt-4-turbo-preview\",\n",
" temperature=0.9,\n",
" openai_api_key=os.getenv(\"OPENAI_API_KEY\"),\n",
")\n",
"\n",
"stage_analyzer_chain = StageAnalyzerChain.from_llm(llm, verbose=verbose)\n",
"\n",
@@ -257,7 +261,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"metadata": {},
"outputs": [
{
@@ -276,7 +280,7 @@
" \n",
" ===\n",
"\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting only from the following options:\n",
" Now determine what should be the next immediate conversation stage for the agent in the sales conversation by selecting ony from the following options:\n",
" 1. Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\n",
" 2. Qualification: Qualify the prospect by confirming if they are the right person to talk to regarding your product/service. Ensure that they have the authority to make purchasing decisions.\n",
" 3. Value proposition: Briefly explain how your product/service can benefit the prospect. Focus on the unique selling points and value proposition of your product/service that sets it apart from competitors.\n",
@@ -296,21 +300,21 @@
{
"data": {
"text/plain": [
"'1'"
"{'conversation_history': '', 'text': '1'}"
]
},
"execution_count": 7,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"stage_analyzer_chain.run(conversation_history=\"\")"
"stage_analyzer_chain.invoke({\"conversation_history\": \"\"})"
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 7,
"metadata": {},
"outputs": [
{
@@ -352,32 +356,44 @@
{
"data": {
"text/plain": [
"\"I'm doing great, thank you for asking! As a Business Development Representative at Sleep Haven, I wanted to reach out to see if you are looking to achieve a better night's sleep. We provide premium mattresses that offer the most comfortable and supportive sleeping experience possible. Are you interested in exploring our sleep solutions? <END_OF_TURN>\""
"{'salesperson_name': 'Ted Lasso',\n",
" 'salesperson_role': 'Business Development Representative',\n",
" 'company_name': 'Sleep Haven',\n",
" 'company_business': 'Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers.',\n",
" 'company_values': \"Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.\",\n",
" 'conversation_purpose': 'find out whether they are looking to achieve better sleep via buying a premier mattress.',\n",
" 'conversation_history': 'Hello, this is Ted Lasso from Sleep Haven. How are you doing today? <END_OF_TURN>\\nUser: I am well, howe are you?<END_OF_TURN>',\n",
" 'conversation_type': 'call',\n",
" 'conversation_stage': 'Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional. Your greeting should be welcoming. Always clarify in your greeting the reason why you are contacting the prospect.',\n",
" 'text': \"I'm doing well, thank you for asking. The reason I'm calling is to discuss how Sleep Haven can help enhance your sleep quality with our premium mattresses. Are you currently looking for ways to achieve a better night's sleep? <END_OF_TURN>\"}"
]
},
"execution_count": 8,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sales_conversation_utterance_chain.run(\n",
" salesperson_name=\"Ted Lasso\",\n",
" salesperson_role=\"Business Development Representative\",\n",
" company_name=\"Sleep Haven\",\n",
" company_business=\"Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers.\",\n",
" company_values=\"Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.\",\n",
" conversation_purpose=\"find out whether they are looking to achieve better sleep via buying a premier mattress.\",\n",
" conversation_history=\"Hello, this is Ted Lasso from Sleep Haven. How are you doing today? <END_OF_TURN>\\nUser: I am well, howe are you?<END_OF_TURN>\",\n",
" conversation_type=\"call\",\n",
" conversation_stage=conversation_stages.get(\n",
" \"1\",\n",
" \"Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\",\n",
" ),\n",
"sales_conversation_utterance_chain.invoke(\n",
" {\n",
" \"salesperson_name\": \"Ted Lasso\",\n",
" \"salesperson_role\": \"Business Development Representative\",\n",
" \"company_name\": \"Sleep Haven\",\n",
" \"company_business\": \"Sleep Haven is a premium mattress company that provides customers with the most comfortable and supportive sleeping experience possible. We offer a range of high-quality mattresses, pillows, and bedding accessories that are designed to meet the unique needs of our customers.\",\n",
" \"company_values\": \"Our mission at Sleep Haven is to help people achieve a better night's sleep by providing them with the best possible sleep solutions. We believe that quality sleep is essential to overall health and well-being, and we are committed to helping our customers achieve optimal sleep by offering exceptional products and customer service.\",\n",
" \"conversation_purpose\": \"find out whether they are looking to achieve better sleep via buying a premier mattress.\",\n",
" \"conversation_history\": \"Hello, this is Ted Lasso from Sleep Haven. How are you doing today? <END_OF_TURN>\\nUser: I am well, howe are you?<END_OF_TURN>\",\n",
" \"conversation_type\": \"call\",\n",
" \"conversation_stage\": conversation_stages.get(\n",
" \"1\",\n",
" \"Introduction: Start the conversation by introducing yourself and your company. Be polite and respectful while keeping the tone of the conversation professional.\",\n",
" ),\n",
" }\n",
")"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -385,6 +401,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -395,7 +412,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
@@ -429,7 +446,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
@@ -445,7 +462,7 @@
" text_splitter = CharacterTextSplitter(chunk_size=10, chunk_overlap=0)\n",
" texts = text_splitter.split_text(product_catalog)\n",
"\n",
" llm = OpenAI(temperature=0)\n",
" llm = ChatOpenAI(temperature=0)\n",
" embeddings = OpenAIEmbeddings()\n",
" docsearch = Chroma.from_texts(\n",
" texts, embeddings, collection_name=\"product-knowledge-base\"\n",
@@ -454,29 +471,12 @@
" knowledge_base = RetrievalQA.from_chain_type(\n",
" llm=llm, chain_type=\"stuff\", retriever=docsearch.as_retriever()\n",
" )\n",
" return knowledge_base\n",
"\n",
"\n",
"def get_tools(product_catalog):\n",
" # query to get_tools can be used to be embedded and relevant tools found\n",
" # see here: https://langchain-langchain.vercel.app/docs/use_cases/agents/custom_agent_with_plugin_retrieval#tool-retriever\n",
"\n",
" # we only use one tool for now, but this is highly extensible!\n",
" knowledge_base = setup_knowledge_base(product_catalog)\n",
" tools = [\n",
" Tool(\n",
" name=\"ProductSearch\",\n",
" func=knowledge_base.run,\n",
" description=\"useful for when you need to answer questions about product information\",\n",
" )\n",
" ]\n",
"\n",
" return tools"
" return knowledge_base"
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 10,
"metadata": {},
"outputs": [
{
@@ -485,16 +485,18 @@
"text": [
"Created a chunk of size 940, which is longer than the specified 10\n",
"Created a chunk of size 844, which is longer than the specified 10\n",
"Created a chunk of size 837, which is longer than the specified 10\n"
"Created a chunk of size 837, which is longer than the specified 10\n",
"/Users/filipmichalsky/Odyssey/sales_bot/SalesGPT/env/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:117: LangChainDeprecationWarning: The function `run` was deprecated in LangChain 0.1.0 and will be removed in 0.2.0. Use invoke instead.\n",
" warn_deprecated(\n"
]
},
{
"data": {
"text/plain": [
"' We have four products available: the Classic Harmony Spring Mattress, the Plush Serenity Bamboo Mattress, the Luxury Cloud-Comfort Memory Foam Mattress, and the EcoGreen Hybrid Latex Mattress. Each product is available in different sizes, with the Classic Harmony Spring Mattress available in Queen and King sizes, the Plush Serenity Bamboo Mattress available in King size, the Luxury Cloud-Comfort Memory Foam Mattress available in Twin, Queen, and King sizes, and the EcoGreen Hybrid Latex Mattress available in Twin and Full sizes.'"
"'The Sleep Haven products available are:\\n\\n1. Luxury Cloud-Comfort Memory Foam Mattress\\n2. Classic Harmony Spring Mattress\\n3. EcoGreen Hybrid Latex Mattress\\n4. Plush Serenity Bamboo Mattress\\n\\nEach product has its unique features and price point.'"
]
},
"execution_count": 11,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -508,12 +510,199 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up the SalesGPT Controller with the Sales Agent and Stage Analyzer and a Knowledge Base"
"### Payment gateway"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In order to set up your AI agent to use a payment gateway to generate payment links for your users you need two things:\n",
"\n",
"1. Sign up for a Stripe account and obtain a STRIPE API KEY\n",
"2. Create products you would like to sell in the Stripe UI. Then follow out example of `example_product_price_id_mapping.json`\n",
"to feed the product name to price_id mapping which allows you to generate the payment links."
]
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"from litellm import completion\n",
"\n",
"# set GPT model env variable\n",
"os.environ[\"GPT_MODEL\"] = \"gpt-4-turbo-preview\"\n",
"\n",
"product_price_id_mapping = {\n",
" \"ai-consulting-services\": \"price_1Ow8ofB795AYY8p1goWGZi6m\",\n",
" \"Luxury Cloud-Comfort Memory Foam Mattress\": \"price_1Owv99B795AYY8p1mjtbKyxP\",\n",
" \"Classic Harmony Spring Mattress\": \"price_1Owv9qB795AYY8p1tPcxCM6T\",\n",
" \"EcoGreen Hybrid Latex Mattress\": \"price_1OwvLDB795AYY8p1YBAMBcbi\",\n",
" \"Plush Serenity Bamboo Mattress\": \"price_1OwvMQB795AYY8p1hJN2uS3S\",\n",
"}\n",
"with open(\"example_product_price_id_mapping.json\", \"w\") as f:\n",
" json.dump(product_price_id_mapping, f)\n",
"\n",
"\n",
"def get_product_id_from_query(query, product_price_id_mapping_path):\n",
" # Load product_price_id_mapping from a JSON file\n",
" with open(product_price_id_mapping_path, \"r\") as f:\n",
" product_price_id_mapping = json.load(f)\n",
"\n",
" # Serialize the product_price_id_mapping to a JSON string for inclusion in the prompt\n",
" product_price_id_mapping_json_str = json.dumps(product_price_id_mapping)\n",
"\n",
" # Dynamically create the enum list from product_price_id_mapping keys\n",
" enum_list = list(product_price_id_mapping.values()) + [\n",
" \"No relevant product id found\"\n",
" ]\n",
" enum_list_str = json.dumps(enum_list)\n",
"\n",
" prompt = f\"\"\"\n",
" You are an expert data scientist and you are working on a project to recommend products to customers based on their needs.\n",
" Given the following query:\n",
" {query}\n",
" and the following product price id mapping:\n",
" {product_price_id_mapping_json_str}\n",
" return the price id that is most relevant to the query.\n",
" ONLY return the price id, no other text. If no relevant price id is found, return 'No relevant price id found'.\n",
" Your output will follow this schema:\n",
" {{\n",
" \"$schema\": \"http://json-schema.org/draft-07/schema#\",\n",
" \"title\": \"Price ID Response\",\n",
" \"type\": \"object\",\n",
" \"properties\": {{\n",
" \"price_id\": {{\n",
" \"type\": \"string\",\n",
" \"enum\": {enum_list_str}\n",
" }}\n",
" }},\n",
" \"required\": [\"price_id\"]\n",
" }}\n",
" Return a valid directly parsable json, dont return in it within a code snippet or add any kind of explanation!!\n",
" \"\"\"\n",
" prompt += \"{\"\n",
" response = completion(\n",
" model=os.getenv(\"GPT_MODEL\", \"gpt-3.5-turbo-1106\"),\n",
" messages=[{\"content\": prompt, \"role\": \"user\"}],\n",
" max_tokens=1000,\n",
" temperature=0,\n",
" )\n",
"\n",
" product_id = response.choices[0].message.content.strip()\n",
" return product_id"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"\n",
"import requests\n",
"\n",
"\n",
"def generate_stripe_payment_link(query: str) -> str:\n",
" \"\"\"Generate a stripe payment link for a customer based on a single query string.\"\"\"\n",
"\n",
" # example testing payment gateway url\n",
" PAYMENT_GATEWAY_URL = os.getenv(\n",
" \"PAYMENT_GATEWAY_URL\", \"https://agent-payments-gateway.vercel.app/payment\"\n",
" )\n",
" PRODUCT_PRICE_MAPPING = \"example_product_price_id_mapping.json\"\n",
"\n",
" # use LLM to get the price_id from query\n",
" price_id = get_product_id_from_query(query, PRODUCT_PRICE_MAPPING)\n",
" price_id = json.loads(price_id)\n",
" payload = json.dumps(\n",
" {\"prompt\": query, **price_id, \"stripe_key\": os.getenv(\"STRIPE_API_KEY\")}\n",
" )\n",
" headers = {\n",
" \"Content-Type\": \"application/json\",\n",
" }\n",
"\n",
" response = requests.request(\n",
" \"POST\", PAYMENT_GATEWAY_URL, headers=headers, data=payload\n",
" )\n",
" return response.text"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'{\"response\":\"https://buy.stripe.com/test_6oEbLS8JB1F9bv229d\"}'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"generate_stripe_payment_link(\n",
" query=\"Please generate a payment link for John Doe to buy two mattresses - the Classic Harmony Spring Mattress\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup agent tools"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"def get_tools(product_catalog):\n",
" # query to get_tools can be used to be embedded and relevant tools found\n",
" # see here: https://langchain-langchain.vercel.app/docs/use_cases/agents/custom_agent_with_plugin_retrieval#tool-retriever\n",
"\n",
" # we only use one tool for now, but this is highly extensible!\n",
" knowledge_base = setup_knowledge_base(product_catalog)\n",
" tools = [\n",
" Tool(\n",
" name=\"ProductSearch\",\n",
" func=knowledge_base.run,\n",
" description=\"useful for when you need to answer questions about product information or services offered, availability and their costs.\",\n",
" ),\n",
" Tool(\n",
" name=\"GeneratePaymentLink\",\n",
" func=generate_stripe_payment_link,\n",
" description=\"useful to close a transaction with a customer. You need to include product name and quantity and customer name in the query input.\",\n",
" ),\n",
" ]\n",
"\n",
" return tools"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up the SalesGPT Controller with the Sales Agent and Stage Analyzer\n",
"\n",
"#### The Agent has access to a Knowledge Base and can autonomously sell your products via Stripe"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
@@ -563,19 +752,11 @@
" print(\"TEXT\")\n",
" print(text)\n",
" print(\"-------\")\n",
" if f\"{self.ai_prefix}:\" in text:\n",
" return AgentFinish(\n",
" {\"output\": text.split(f\"{self.ai_prefix}:\")[-1].strip()}, text\n",
" )\n",
" regex = r\"Action: (.*?)[\\n]*Action Input: (.*)\"\n",
" match = re.search(regex, text)\n",
" if not match:\n",
" ## TODO - this is not entirely reliable, sometimes results in an error.\n",
" return AgentFinish(\n",
" {\n",
" \"output\": \"I apologize, I was unable to find the answer to your question. Is there anything else I can help with?\"\n",
" },\n",
" text,\n",
" {\"output\": text.split(f\"{self.ai_prefix}:\")[-1].strip()}, text\n",
" )\n",
" # raise OutputParserException(f\"Could not parse LLM output: `{text}`\")\n",
" action = match.group(1)\n",
@@ -589,7 +770,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
@@ -647,18 +828,18 @@
"Previous conversation history:\n",
"{conversation_history}\n",
"\n",
"{salesperson_name}:\n",
"Thought:\n",
"{agent_scratchpad}\n",
"\"\"\""
]
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"class SalesGPT(Chain, BaseModel):\n",
"class SalesGPT(Chain):\n",
" \"\"\"Controller model for the Sales Agent.\"\"\"\n",
"\n",
" conversation_history: List[str] = []\n",
@@ -804,7 +985,9 @@
"\n",
" # WARNING: this output parser is NOT reliable yet\n",
" ## It makes assumptions about output from LLM which can break and throw an error\n",
" output_parser = SalesConvoOutputParser(ai_prefix=kwargs[\"salesperson_name\"])\n",
" output_parser = SalesConvoOutputParser(\n",
" ai_prefix=kwargs[\"salesperson_name\"], verbose=verbose\n",
" )\n",
"\n",
" sales_agent_with_tools = LLMSingleActionAgent(\n",
" llm_chain=llm_chain,\n",
@@ -828,6 +1011,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -835,6 +1019,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -843,7 +1028,7 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
@@ -880,6 +1065,7 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
@@ -888,7 +1074,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 22,
"metadata": {},
"outputs": [
{
@@ -897,7 +1083,9 @@
"text": [
"Created a chunk of size 940, which is longer than the specified 10\n",
"Created a chunk of size 844, which is longer than the specified 10\n",
"Created a chunk of size 837, which is longer than the specified 10\n"
"Created a chunk of size 837, which is longer than the specified 10\n",
"/Users/filipmichalsky/Odyssey/sales_bot/SalesGPT/env/lib/python3.10/site-packages/langchain_core/_api/deprecation.py:117: LangChainDeprecationWarning: The class `langchain.agents.agent.LLMSingleActionAgent` was deprecated in langchain 0.1.0 and will be removed in 0.2.0. Use Use new agent constructor methods like create_react_agent, create_json_agent, create_structured_chat_agent, etc. instead.\n",
" warn_deprecated(\n"
]
}
],
@@ -907,7 +1095,7 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
@@ -917,7 +1105,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 22,
"metadata": {},
"outputs": [
{
@@ -934,14 +1122,14 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Hello, this is Ted Lasso from Sleep Haven. How are you doing today?\n"
"Ted Lasso: Good day! This is Ted Lasso from Sleep Haven. How are you doing today?\n"
]
}
],
@@ -951,18 +1139,18 @@
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\n",
" \"I am well, how are you? I would like to learn more about your mattresses.\"\n",
" \"I am well, how are you? I would like to learn more about your services.\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 25,
"metadata": {},
"outputs": [
{
@@ -977,92 +1165,32 @@
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: I'm glad to hear that you're doing well! As for our mattresses, at Sleep Haven, we provide customers with the most comfortable and supportive sleeping experience possible. Our high-quality mattresses are designed to meet the unique needs of our customers. Can I ask what specifically you'd like to learn more about? \n"
]
}
],
"source": [
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\"Yes, what materials are you mattresses made from?\")"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Conversation Stage: Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.\n"
]
}
],
"source": [
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Our mattresses are made from a variety of materials, depending on the model. We have the EcoGreen Hybrid Latex Mattress, which is made from 100% natural latex harvested from eco-friendly plantations. The Plush Serenity Bamboo Mattress features a layer of plush, adaptive foam and a base of high-resilience support foam, with a bamboo-infused top layer. The Luxury Cloud-Comfort Memory Foam Mattress has an innovative, temperature-sensitive memory foam layer and a high-density foam base with cooling gel-infused particles. Finally, the Classic Harmony Spring Mattress has a robust inner spring construction and layers of plush padding, with a quilted top layer and a natural cotton cover. Is there anything specific you'd like to know about these materials?\n"
]
}
],
"source": [
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: I'm doing great, thank you for asking! I'm glad to hear you're interested. Sleep Haven is a premium mattress company, and we're all about offering the best sleep solutions, including top-notch mattresses, pillows, and bedding accessories. Our mission is to help you achieve a better night's sleep. May I know if you're looking to enhance your sleep experience with a new mattress or bedding accessories? \n"
]
}
],
"source": [
"sales_agent.human_step(\n",
" \"Yes, I am looking for a queen sized mattress. Do you have any mattresses in queen size?\"\n",
")"
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Conversation Stage: Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.\n"
]
}
],
"outputs": [],
"source": [
"sales_agent.determine_conversation_stage()"
"sales_agent.human_step(\n",
" \"Yes, I would like to improve my sleep. Can you tell me more about your products?\"\n",
")"
]
},
{
@@ -1074,7 +1202,24 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Yes, we do have queen-sized mattresses available. We offer the Luxury Cloud-Comfort Memory Foam Mattress and the Classic Harmony Spring Mattress in queen size. Both mattresses provide exceptional comfort and support. Is there anything specific you would like to know about these options?\n"
"Conversation Stage: Needs analysis: Ask open-ended questions to uncover the prospect's needs and pain points. Listen carefully to their responses and take notes.\n"
]
}
],
"source": [
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Absolutely, I'd be happy to share more about our products. At Sleep Haven, we offer a variety of high-quality mattresses designed to cater to different sleeping preferences and needs. Whether you're looking for memory foam's comfort, the support of hybrid mattresses, or the breathability of natural latex, we have options for everyone. Our pillows and bedding accessories are similarly curated to enhance your sleep quality. Every product is built with the aim of helping you achieve the restful night's sleep you deserve. What specific features are you looking for in a mattress? \n"
]
}
],
@@ -1084,16 +1229,16 @@
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\"Yea, compare and contrast those two options, please.\")"
"sales_agent.human_step(\"What mattresses do you have and how much do they cost?\")"
]
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 32,
"metadata": {},
"outputs": [
{
@@ -1110,14 +1255,14 @@
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: The Luxury Cloud-Comfort Memory Foam Mattress is priced at $999 and is available in Twin, Queen, and King sizes. It features an innovative, temperature-sensitive memory foam layer and a high-density foam base. On the other hand, the Classic Harmony Spring Mattress is priced at $1,299 and is available in Queen and King sizes. It features a robust inner spring construction and layers of plush padding. Both mattresses provide exceptional comfort and support, but the Classic Harmony Spring Mattress may be a better option if you prefer the traditional feel of an inner spring mattress. Do you have any other questions about these options?\n"
"Ted Lasso: We offer two primary types of mattresses at Sleep Haven. The first is our Luxury Cloud-Comfort Memory Foam Mattress, which is priced at $999 and comes in Twin, Queen, and King sizes. The second is our Classic Harmony Spring Mattress, priced at $1,299, available in Queen and King sizes. Both are designed to provide exceptional comfort and support for a better night's sleep. Which type of mattress would you be interested in learning more about? \n"
]
}
],
@@ -1127,14 +1272,66 @@
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\n",
" \"Great, thanks, that's it. I will talk to my wife and call back if she is onboard. Have a good day!\"\n",
" \"Okay.I would like to order two Memory Foam mattresses in Twin size please.\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Conversation Stage: Close: Ask for the sale by proposing a next step. This could be a demo, a trial or a meeting with decision-makers. Ensure to summarize what has been discussed and reiterate the benefits.\n"
]
}
],
"source": [
"sales_agent.determine_conversation_stage()"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ted Lasso: Fantastic choice! You're on your way to a better night's sleep with our Luxury Cloud-Comfort Memory Foam Mattresses. I've generated a payment link for two Twin size mattresses for you. Here is the link to complete your purchase: https://buy.stripe.com/test_6oEg28e3V97BdDabJn. Is there anything else I can assist you with today? \n"
]
}
],
"source": [
"sales_agent.step()"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [],
"source": [
"sales_agent.human_step(\n",
" \"Great, thanks! I will discuss with my wife and will buy it if she is onboard. Have a good day!\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -1153,9 +1350,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.10.9"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -31,7 +31,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install langchain lark openai elasticsearch pandas"
"!pip install langchain langchain-elasticsearch lark openai elasticsearch pandas"
]
},
{
@@ -355,15 +355,15 @@
"metadata": {},
"outputs": [],
"source": [
"attribute_info[-2][\n",
" \"description\"\n",
"] += f\". Valid values are {sorted(latest_price['starrating'].value_counts().index.tolist())}\"\n",
"attribute_info[3][\n",
" \"description\"\n",
"] += f\". Valid values are {sorted(latest_price['maxoccupancy'].value_counts().index.tolist())}\"\n",
"attribute_info[-3][\n",
" \"description\"\n",
"] += f\". Valid values are {sorted(latest_price['country'].value_counts().index.tolist())}\""
"attribute_info[-2][\"description\"] += (\n",
" f\". Valid values are {sorted(latest_price['starrating'].value_counts().index.tolist())}\"\n",
")\n",
"attribute_info[3][\"description\"] += (\n",
" f\". Valid values are {sorted(latest_price['maxoccupancy'].value_counts().index.tolist())}\"\n",
")\n",
"attribute_info[-3][\"description\"] += (\n",
" f\". Valid values are {sorted(latest_price['country'].value_counts().index.tolist())}\"\n",
")"
]
},
{
@@ -688,9 +688,9 @@
"metadata": {},
"outputs": [],
"source": [
"attribute_info[-3][\n",
" \"description\"\n",
"] += \". NOTE: Only use the 'eq' operator if a specific country is mentioned. If a region is mentioned, include all relevant countries in filter.\"\n",
"attribute_info[-3][\"description\"] += (\n",
" \". NOTE: Only use the 'eq' operator if a specific country is mentioned. If a region is mentioned, include all relevant countries in filter.\"\n",
")\n",
"chain = load_query_constructor_runnable(\n",
" ChatOpenAI(model=\"gpt-3.5-turbo\", temperature=0),\n",
" doc_contents,\n",
@@ -1227,7 +1227,7 @@
}
],
"source": [
"results = retriever.get_relevant_documents(\n",
"results = retriever.invoke(\n",
" \"I want to stay somewhere highly rated along the coast. I want a room with a patio and a fireplace.\"\n",
")\n",
"for res in results:\n",

View File

@@ -22,7 +22,8 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import AgentExecutor, Tool, ZeroShotAgent\n",
"from langchain import hub\n",
"from langchain.agents import AgentExecutor, Tool, ZeroShotAgent, create_react_agent\n",
"from langchain.chains import LLMChain\n",
"from langchain.memory import ConversationBufferMemory, ReadOnlySharedMemory\n",
"from langchain.prompts import PromptTemplate\n",
@@ -84,19 +85,7 @@
"metadata": {},
"outputs": [],
"source": [
"prefix = \"\"\"Have a conversation with a human, answering the following questions as best you can. You have access to the following tools:\"\"\"\n",
"suffix = \"\"\"Begin!\"\n",
"\n",
"{chat_history}\n",
"Question: {input}\n",
"{agent_scratchpad}\"\"\"\n",
"\n",
"prompt = ZeroShotAgent.create_prompt(\n",
" tools,\n",
" prefix=prefix,\n",
" suffix=suffix,\n",
" input_variables=[\"input\", \"chat_history\", \"agent_scratchpad\"],\n",
")"
"prompt = hub.pull(\"hwchase17/react\")"
]
},
{
@@ -114,16 +103,14 @@
"metadata": {},
"outputs": [],
"source": [
"llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)\n",
"agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=True)\n",
"agent_chain = AgentExecutor.from_agent_and_tools(\n",
" agent=agent, tools=tools, verbose=True, memory=memory\n",
")"
"model = OpenAI()\n",
"agent = create_react_agent(model, tools, prompt)\n",
"agent_executor = AgentExecutor(agent=agent, tools=tools, memory=memory)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 36,
"id": "ca4bc1fb",
"metadata": {},
"outputs": [
@@ -133,15 +120,15 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I should research ChatGPT to answer this question.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I should research ChatGPT to answer this question.\n",
"Action: Search\n",
"Action Input: \"ChatGPT\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mNov 30, 2022 ... We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer ... ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large ... ChatGPT. We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer ... Feb 2, 2023 ... ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after ... 2 days ago ... ChatGPT recently launched a new version of its own plagiarism detection tool, with hopes that it will squelch some of the criticism around how ... An API for accessing new AI models developed by OpenAI. Feb 19, 2023 ... ChatGPT is an AI chatbot system that OpenAI released in November to show off and test what a very large, powerful AI system can accomplish. You ... ChatGPT is fine-tuned from GPT-3.5, a language model trained to produce text. ChatGPT was optimized for dialogue by using Reinforcement Learning with Human ... 3 days ago ... Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. Dec 1, 2022 ... ChatGPT is a natural language processing tool driven by AI technology that allows you to have human-like conversations and much more with a ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\u001b[0m\n",
"Action Input: \"ChatGPT\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mNov 30, 2022 ... We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer ... ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large ... ChatGPT. We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer ... Feb 2, 2023 ... ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after ... 2 days ago ... ChatGPT recently launched a new version of its own plagiarism detection tool, with hopes that it will squelch some of the criticism around how ... An API for accessing new AI models developed by OpenAI. Feb 19, 2023 ... ChatGPT is an AI chatbot system that OpenAI released in November to show off and test what a very large, powerful AI system can accomplish. You ... ChatGPT is fine-tuned from GPT-3.5, a language model trained to produce text. ChatGPT was optimized for dialogue by using Reinforcement Learning with Human ... 3 days ago ... Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. Dec 1, 2022 ... ChatGPT is a natural language processing tool driven by AI technology that allows you to have human-like conversations and much more with a ...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -153,10 +140,40 @@
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
},
{
"ename": "KeyboardInterrupt",
"evalue": "",
"output_type": "error",
"traceback": [
"\u001B[0;31m---------------------------------------------------------------------------\u001B[0m",
"\u001B[0;31mKeyboardInterrupt\u001B[0m Traceback (most recent call last)",
"Cell \u001B[0;32mIn[36], line 1\u001B[0m\n\u001B[0;32m----> 1\u001B[0m \u001B[43magent_executor\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43minvoke\u001B[49m\u001B[43m(\u001B[49m\u001B[43m{\u001B[49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[38;5;124;43minput\u001B[39;49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[43m:\u001B[49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[38;5;124;43mWhat is ChatGPT?\u001B[39;49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[43m}\u001B[49m\u001B[43m)\u001B[49m\n",
"File \u001B[0;32m~/code/langchain/libs/langchain/langchain/chains/base.py:163\u001B[0m, in \u001B[0;36mChain.invoke\u001B[0;34m(self, input, config, **kwargs)\u001B[0m\n\u001B[1;32m 161\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m \u001B[38;5;167;01mBaseException\u001B[39;00m \u001B[38;5;28;01mas\u001B[39;00m e:\n\u001B[1;32m 162\u001B[0m run_manager\u001B[38;5;241m.\u001B[39mon_chain_error(e)\n\u001B[0;32m--> 163\u001B[0m \u001B[38;5;28;01mraise\u001B[39;00m e\n\u001B[1;32m 164\u001B[0m run_manager\u001B[38;5;241m.\u001B[39mon_chain_end(outputs)\n\u001B[1;32m 166\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m include_run_info:\n",
"File \u001B[0;32m~/code/langchain/libs/langchain/langchain/chains/base.py:153\u001B[0m, in \u001B[0;36mChain.invoke\u001B[0;34m(self, input, config, **kwargs)\u001B[0m\n\u001B[1;32m 150\u001B[0m \u001B[38;5;28;01mtry\u001B[39;00m:\n\u001B[1;32m 151\u001B[0m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_validate_inputs(inputs)\n\u001B[1;32m 152\u001B[0m outputs \u001B[38;5;241m=\u001B[39m (\n\u001B[0;32m--> 153\u001B[0m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_call\u001B[49m\u001B[43m(\u001B[49m\u001B[43minputs\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mrun_manager\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mrun_manager\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 154\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m new_arg_supported\n\u001B[1;32m 155\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_call(inputs)\n\u001B[1;32m 156\u001B[0m )\n\u001B[1;32m 158\u001B[0m final_outputs: Dict[\u001B[38;5;28mstr\u001B[39m, Any] \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mprep_outputs(\n\u001B[1;32m 159\u001B[0m inputs, outputs, return_only_outputs\n\u001B[1;32m 160\u001B[0m )\n\u001B[1;32m 161\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m \u001B[38;5;167;01mBaseException\u001B[39;00m \u001B[38;5;28;01mas\u001B[39;00m e:\n",
"File \u001B[0;32m~/code/langchain/libs/langchain/langchain/agents/agent.py:1432\u001B[0m, in \u001B[0;36mAgentExecutor._call\u001B[0;34m(self, inputs, run_manager)\u001B[0m\n\u001B[1;32m 1430\u001B[0m \u001B[38;5;66;03m# We now enter the agent loop (until it returns something).\u001B[39;00m\n\u001B[1;32m 1431\u001B[0m \u001B[38;5;28;01mwhile\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_should_continue(iterations, time_elapsed):\n\u001B[0;32m-> 1432\u001B[0m next_step_output \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_take_next_step\u001B[49m\u001B[43m(\u001B[49m\n\u001B[1;32m 1433\u001B[0m \u001B[43m \u001B[49m\u001B[43mname_to_tool_map\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1434\u001B[0m \u001B[43m \u001B[49m\u001B[43mcolor_mapping\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1435\u001B[0m \u001B[43m \u001B[49m\u001B[43minputs\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1436\u001B[0m \u001B[43m \u001B[49m\u001B[43mintermediate_steps\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1437\u001B[0m \u001B[43m \u001B[49m\u001B[43mrun_manager\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mrun_manager\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1438\u001B[0m \u001B[43m \u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 1439\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28misinstance\u001B[39m(next_step_output, AgentFinish):\n\u001B[1;32m 1440\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_return(\n\u001B[1;32m 1441\u001B[0m next_step_output, intermediate_steps, run_manager\u001B[38;5;241m=\u001B[39mrun_manager\n\u001B[1;32m 1442\u001B[0m )\n",
"File \u001B[0;32m~/code/langchain/libs/langchain/langchain/agents/agent.py:1138\u001B[0m, in \u001B[0;36mAgentExecutor._take_next_step\u001B[0;34m(self, name_to_tool_map, color_mapping, inputs, intermediate_steps, run_manager)\u001B[0m\n\u001B[1;32m 1129\u001B[0m \u001B[38;5;28;01mdef\u001B[39;00m \u001B[38;5;21m_take_next_step\u001B[39m(\n\u001B[1;32m 1130\u001B[0m \u001B[38;5;28mself\u001B[39m,\n\u001B[1;32m 1131\u001B[0m name_to_tool_map: Dict[\u001B[38;5;28mstr\u001B[39m, BaseTool],\n\u001B[0;32m (...)\u001B[0m\n\u001B[1;32m 1135\u001B[0m run_manager: Optional[CallbackManagerForChainRun] \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;01mNone\u001B[39;00m,\n\u001B[1;32m 1136\u001B[0m ) \u001B[38;5;241m-\u001B[39m\u001B[38;5;241m>\u001B[39m Union[AgentFinish, List[Tuple[AgentAction, \u001B[38;5;28mstr\u001B[39m]]]:\n\u001B[1;32m 1137\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_consume_next_step(\n\u001B[0;32m-> 1138\u001B[0m [\n\u001B[1;32m 1139\u001B[0m a\n\u001B[1;32m 1140\u001B[0m \u001B[38;5;28;01mfor\u001B[39;00m a \u001B[38;5;129;01min\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_iter_next_step(\n\u001B[1;32m 1141\u001B[0m name_to_tool_map,\n\u001B[1;32m 1142\u001B[0m color_mapping,\n\u001B[1;32m 1143\u001B[0m inputs,\n\u001B[1;32m 1144\u001B[0m intermediate_steps,\n\u001B[1;32m 1145\u001B[0m run_manager,\n\u001B[1;32m 1146\u001B[0m )\n\u001B[1;32m 1147\u001B[0m ]\n\u001B[1;32m 1148\u001B[0m )\n",
"File \u001B[0;32m~/code/langchain/libs/langchain/langchain/agents/agent.py:1138\u001B[0m, in \u001B[0;36m<listcomp>\u001B[0;34m(.0)\u001B[0m\n\u001B[1;32m 1129\u001B[0m \u001B[38;5;28;01mdef\u001B[39;00m \u001B[38;5;21m_take_next_step\u001B[39m(\n\u001B[1;32m 1130\u001B[0m \u001B[38;5;28mself\u001B[39m,\n\u001B[1;32m 1131\u001B[0m name_to_tool_map: Dict[\u001B[38;5;28mstr\u001B[39m, BaseTool],\n\u001B[0;32m (...)\u001B[0m\n\u001B[1;32m 1135\u001B[0m run_manager: Optional[CallbackManagerForChainRun] \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;01mNone\u001B[39;00m,\n\u001B[1;32m 1136\u001B[0m ) \u001B[38;5;241m-\u001B[39m\u001B[38;5;241m>\u001B[39m Union[AgentFinish, List[Tuple[AgentAction, \u001B[38;5;28mstr\u001B[39m]]]:\n\u001B[1;32m 1137\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_consume_next_step(\n\u001B[0;32m-> 1138\u001B[0m [\n\u001B[1;32m 1139\u001B[0m a\n\u001B[1;32m 1140\u001B[0m \u001B[38;5;28;01mfor\u001B[39;00m a \u001B[38;5;129;01min\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_iter_next_step(\n\u001B[1;32m 1141\u001B[0m name_to_tool_map,\n\u001B[1;32m 1142\u001B[0m color_mapping,\n\u001B[1;32m 1143\u001B[0m inputs,\n\u001B[1;32m 1144\u001B[0m intermediate_steps,\n\u001B[1;32m 1145\u001B[0m run_manager,\n\u001B[1;32m 1146\u001B[0m )\n\u001B[1;32m 1147\u001B[0m ]\n\u001B[1;32m 1148\u001B[0m )\n",
"File \u001B[0;32m~/code/langchain/libs/langchain/langchain/agents/agent.py:1223\u001B[0m, in \u001B[0;36mAgentExecutor._iter_next_step\u001B[0;34m(self, name_to_tool_map, color_mapping, inputs, intermediate_steps, run_manager)\u001B[0m\n\u001B[1;32m 1221\u001B[0m \u001B[38;5;28;01myield\u001B[39;00m agent_action\n\u001B[1;32m 1222\u001B[0m \u001B[38;5;28;01mfor\u001B[39;00m agent_action \u001B[38;5;129;01min\u001B[39;00m actions:\n\u001B[0;32m-> 1223\u001B[0m \u001B[38;5;28;01myield\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_perform_agent_action\u001B[49m\u001B[43m(\u001B[49m\n\u001B[1;32m 1224\u001B[0m \u001B[43m \u001B[49m\u001B[43mname_to_tool_map\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mcolor_mapping\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43magent_action\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mrun_manager\u001B[49m\n\u001B[1;32m 1225\u001B[0m \u001B[43m \u001B[49m\u001B[43m)\u001B[49m\n",
"File \u001B[0;32m~/code/langchain/libs/langchain/langchain/agents/agent.py:1245\u001B[0m, in \u001B[0;36mAgentExecutor._perform_agent_action\u001B[0;34m(self, name_to_tool_map, color_mapping, agent_action, run_manager)\u001B[0m\n\u001B[1;32m 1243\u001B[0m tool_run_kwargs[\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mllm_prefix\u001B[39m\u001B[38;5;124m\"\u001B[39m] \u001B[38;5;241m=\u001B[39m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 1244\u001B[0m \u001B[38;5;66;03m# We then call the tool on the tool input to get an observation\u001B[39;00m\n\u001B[0;32m-> 1245\u001B[0m observation \u001B[38;5;241m=\u001B[39m \u001B[43mtool\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mrun\u001B[49m\u001B[43m(\u001B[49m\n\u001B[1;32m 1246\u001B[0m \u001B[43m \u001B[49m\u001B[43magent_action\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mtool_input\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1247\u001B[0m \u001B[43m \u001B[49m\u001B[43mverbose\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mverbose\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1248\u001B[0m \u001B[43m \u001B[49m\u001B[43mcolor\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mcolor\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1249\u001B[0m \u001B[43m \u001B[49m\u001B[43mcallbacks\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mrun_manager\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mget_child\u001B[49m\u001B[43m(\u001B[49m\u001B[43m)\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;28;43;01mif\u001B[39;49;00m\u001B[43m \u001B[49m\u001B[43mrun_manager\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;28;43;01melse\u001B[39;49;00m\u001B[43m \u001B[49m\u001B[38;5;28;43;01mNone\u001B[39;49;00m\u001B[43m,\u001B[49m\n\u001B[1;32m 1250\u001B[0m \u001B[43m \u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mtool_run_kwargs\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1251\u001B[0m \u001B[43m \u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 1252\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m:\n\u001B[1;32m 1253\u001B[0m tool_run_kwargs \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39magent\u001B[38;5;241m.\u001B[39mtool_run_logging_kwargs()\n",
"File \u001B[0;32m~/code/langchain/libs/core/langchain_core/tools.py:422\u001B[0m, in \u001B[0;36mBaseTool.run\u001B[0;34m(self, tool_input, verbose, start_color, color, callbacks, tags, metadata, run_name, run_id, **kwargs)\u001B[0m\n\u001B[1;32m 420\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m (\u001B[38;5;167;01mException\u001B[39;00m, \u001B[38;5;167;01mKeyboardInterrupt\u001B[39;00m) \u001B[38;5;28;01mas\u001B[39;00m e:\n\u001B[1;32m 421\u001B[0m run_manager\u001B[38;5;241m.\u001B[39mon_tool_error(e)\n\u001B[0;32m--> 422\u001B[0m \u001B[38;5;28;01mraise\u001B[39;00m e\n\u001B[1;32m 423\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m:\n\u001B[1;32m 424\u001B[0m run_manager\u001B[38;5;241m.\u001B[39mon_tool_end(observation, color\u001B[38;5;241m=\u001B[39mcolor, name\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mname, \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mkwargs)\n",
"File \u001B[0;32m~/code/langchain/libs/core/langchain_core/tools.py:381\u001B[0m, in \u001B[0;36mBaseTool.run\u001B[0;34m(self, tool_input, verbose, start_color, color, callbacks, tags, metadata, run_name, run_id, **kwargs)\u001B[0m\n\u001B[1;32m 378\u001B[0m parsed_input \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_parse_input(tool_input)\n\u001B[1;32m 379\u001B[0m tool_args, tool_kwargs \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_to_args_and_kwargs(parsed_input)\n\u001B[1;32m 380\u001B[0m observation \u001B[38;5;241m=\u001B[39m (\n\u001B[0;32m--> 381\u001B[0m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_run\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mtool_args\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mrun_manager\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43mrun_manager\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mtool_kwargs\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 382\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m new_arg_supported\n\u001B[1;32m 383\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_run(\u001B[38;5;241m*\u001B[39mtool_args, \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mtool_kwargs)\n\u001B[1;32m 384\u001B[0m )\n\u001B[1;32m 385\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m ValidationError \u001B[38;5;28;01mas\u001B[39;00m e:\n\u001B[1;32m 386\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;129;01mnot\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mhandle_validation_error:\n",
"File \u001B[0;32m~/code/langchain/libs/core/langchain_core/tools.py:588\u001B[0m, in \u001B[0;36mTool._run\u001B[0;34m(self, run_manager, *args, **kwargs)\u001B[0m\n\u001B[1;32m 579\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mfunc:\n\u001B[1;32m 580\u001B[0m new_argument_supported \u001B[38;5;241m=\u001B[39m signature(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mfunc)\u001B[38;5;241m.\u001B[39mparameters\u001B[38;5;241m.\u001B[39mget(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mcallbacks\u001B[39m\u001B[38;5;124m\"\u001B[39m)\n\u001B[1;32m 581\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m (\n\u001B[1;32m 582\u001B[0m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mfunc(\n\u001B[1;32m 583\u001B[0m \u001B[38;5;241m*\u001B[39margs,\n\u001B[1;32m 584\u001B[0m callbacks\u001B[38;5;241m=\u001B[39mrun_manager\u001B[38;5;241m.\u001B[39mget_child() \u001B[38;5;28;01mif\u001B[39;00m run_manager \u001B[38;5;28;01melse\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m,\n\u001B[1;32m 585\u001B[0m \u001B[38;5;241m*\u001B[39m\u001B[38;5;241m*\u001B[39mkwargs,\n\u001B[1;32m 586\u001B[0m )\n\u001B[1;32m 587\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m new_argument_supported\n\u001B[0;32m--> 588\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mfunc\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43margs\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mkwargs\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 589\u001B[0m )\n\u001B[1;32m 590\u001B[0m \u001B[38;5;28;01mraise\u001B[39;00m \u001B[38;5;167;01mNotImplementedError\u001B[39;00m(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mTool does not support sync\u001B[39m\u001B[38;5;124m\"\u001B[39m)\n",
"File \u001B[0;32m~/code/langchain/libs/community/langchain_community/utilities/google_search.py:94\u001B[0m, in \u001B[0;36mGoogleSearchAPIWrapper.run\u001B[0;34m(self, query)\u001B[0m\n\u001B[1;32m 92\u001B[0m \u001B[38;5;250m\u001B[39m\u001B[38;5;124;03m\"\"\"Run query through GoogleSearch and parse result.\"\"\"\u001B[39;00m\n\u001B[1;32m 93\u001B[0m snippets \u001B[38;5;241m=\u001B[39m []\n\u001B[0;32m---> 94\u001B[0m results \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_google_search_results\u001B[49m\u001B[43m(\u001B[49m\u001B[43mquery\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mnum\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mk\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 95\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28mlen\u001B[39m(results) \u001B[38;5;241m==\u001B[39m \u001B[38;5;241m0\u001B[39m:\n\u001B[1;32m 96\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mNo good Google Search Result was found\u001B[39m\u001B[38;5;124m\"\u001B[39m\n",
"File \u001B[0;32m~/code/langchain/libs/community/langchain_community/utilities/google_search.py:62\u001B[0m, in \u001B[0;36mGoogleSearchAPIWrapper._google_search_results\u001B[0;34m(self, search_term, **kwargs)\u001B[0m\n\u001B[1;32m 60\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39msiterestrict:\n\u001B[1;32m 61\u001B[0m cse \u001B[38;5;241m=\u001B[39m cse\u001B[38;5;241m.\u001B[39msiterestrict()\n\u001B[0;32m---> 62\u001B[0m res \u001B[38;5;241m=\u001B[39m \u001B[43mcse\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mlist\u001B[49m\u001B[43m(\u001B[49m\u001B[43mq\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[43msearch_term\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mcx\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mgoogle_cse_id\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mkwargs\u001B[49m\u001B[43m)\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mexecute\u001B[49m\u001B[43m(\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 63\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m res\u001B[38;5;241m.\u001B[39mget(\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mitems\u001B[39m\u001B[38;5;124m\"\u001B[39m, [])\n",
"File \u001B[0;32m~/code/langchain/.venv/lib/python3.10/site-packages/googleapiclient/_helpers.py:130\u001B[0m, in \u001B[0;36mpositional.<locals>.positional_decorator.<locals>.positional_wrapper\u001B[0;34m(*args, **kwargs)\u001B[0m\n\u001B[1;32m 128\u001B[0m \u001B[38;5;28;01melif\u001B[39;00m positional_parameters_enforcement \u001B[38;5;241m==\u001B[39m POSITIONAL_WARNING:\n\u001B[1;32m 129\u001B[0m logger\u001B[38;5;241m.\u001B[39mwarning(message)\n\u001B[0;32m--> 130\u001B[0m \u001B[38;5;28;01mreturn\u001B[39;00m \u001B[43mwrapped\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43margs\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mkwargs\u001B[49m\u001B[43m)\u001B[49m\n",
"File \u001B[0;32m~/code/langchain/.venv/lib/python3.10/site-packages/googleapiclient/http.py:923\u001B[0m, in \u001B[0;36mHttpRequest.execute\u001B[0;34m(self, http, num_retries)\u001B[0m\n\u001B[1;32m 920\u001B[0m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mheaders[\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124mcontent-length\u001B[39m\u001B[38;5;124m\"\u001B[39m] \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mstr\u001B[39m(\u001B[38;5;28mlen\u001B[39m(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mbody))\n\u001B[1;32m 922\u001B[0m \u001B[38;5;66;03m# Handle retries for server-side errors.\u001B[39;00m\n\u001B[0;32m--> 923\u001B[0m resp, content \u001B[38;5;241m=\u001B[39m \u001B[43m_retry_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[1;32m 924\u001B[0m \u001B[43m \u001B[49m\u001B[43mhttp\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 925\u001B[0m \u001B[43m \u001B[49m\u001B[43mnum_retries\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 926\u001B[0m \u001B[43m \u001B[49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[38;5;124;43mrequest\u001B[39;49m\u001B[38;5;124;43m\"\u001B[39;49m\u001B[43m,\u001B[49m\n\u001B[1;32m 927\u001B[0m \u001B[43m \u001B[49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_sleep\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 928\u001B[0m \u001B[43m \u001B[49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_rand\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 929\u001B[0m \u001B[43m \u001B[49m\u001B[38;5;28;43mstr\u001B[39;49m\u001B[43m(\u001B[49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43muri\u001B[49m\u001B[43m)\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 930\u001B[0m \u001B[43m \u001B[49m\u001B[43mmethod\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mstr\u001B[39;49m\u001B[43m(\u001B[49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mmethod\u001B[49m\u001B[43m)\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 931\u001B[0m \u001B[43m \u001B[49m\u001B[43mbody\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mbody\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 932\u001B[0m \u001B[43m \u001B[49m\u001B[43mheaders\u001B[49m\u001B[38;5;241;43m=\u001B[39;49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mheaders\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 933\u001B[0m \u001B[43m\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 935\u001B[0m \u001B[38;5;28;01mfor\u001B[39;00m callback \u001B[38;5;129;01min\u001B[39;00m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mresponse_callbacks:\n\u001B[1;32m 936\u001B[0m callback(resp)\n",
"File \u001B[0;32m~/code/langchain/.venv/lib/python3.10/site-packages/googleapiclient/http.py:191\u001B[0m, in \u001B[0;36m_retry_request\u001B[0;34m(http, num_retries, req_type, sleep, rand, uri, method, *args, **kwargs)\u001B[0m\n\u001B[1;32m 189\u001B[0m \u001B[38;5;28;01mtry\u001B[39;00m:\n\u001B[1;32m 190\u001B[0m exception \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;01mNone\u001B[39;00m\n\u001B[0;32m--> 191\u001B[0m resp, content \u001B[38;5;241m=\u001B[39m \u001B[43mhttp\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mrequest\u001B[49m\u001B[43m(\u001B[49m\u001B[43muri\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mmethod\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43margs\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[38;5;241;43m*\u001B[39;49m\u001B[43mkwargs\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 192\u001B[0m \u001B[38;5;66;03m# Retry on SSL errors and socket timeout errors.\u001B[39;00m\n\u001B[1;32m 193\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m _ssl_SSLError \u001B[38;5;28;01mas\u001B[39;00m ssl_error:\n",
"File \u001B[0;32m~/code/langchain/.venv/lib/python3.10/site-packages/httplib2/__init__.py:1724\u001B[0m, in \u001B[0;36mHttp.request\u001B[0;34m(self, uri, method, body, headers, redirections, connection_type)\u001B[0m\n\u001B[1;32m 1722\u001B[0m content \u001B[38;5;241m=\u001B[39m \u001B[38;5;124mb\u001B[39m\u001B[38;5;124m\"\u001B[39m\u001B[38;5;124m\"\u001B[39m\n\u001B[1;32m 1723\u001B[0m \u001B[38;5;28;01melse\u001B[39;00m:\n\u001B[0;32m-> 1724\u001B[0m (response, content) \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_request\u001B[49m\u001B[43m(\u001B[49m\n\u001B[1;32m 1725\u001B[0m \u001B[43m \u001B[49m\u001B[43mconn\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mauthority\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43muri\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mrequest_uri\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mmethod\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mbody\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mheaders\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mredirections\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mcachekey\u001B[49m\u001B[43m,\u001B[49m\n\u001B[1;32m 1726\u001B[0m \u001B[43m \u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 1727\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m \u001B[38;5;167;01mException\u001B[39;00m \u001B[38;5;28;01mas\u001B[39;00m e:\n\u001B[1;32m 1728\u001B[0m is_timeout \u001B[38;5;241m=\u001B[39m \u001B[38;5;28misinstance\u001B[39m(e, socket\u001B[38;5;241m.\u001B[39mtimeout)\n",
"File \u001B[0;32m~/code/langchain/.venv/lib/python3.10/site-packages/httplib2/__init__.py:1444\u001B[0m, in \u001B[0;36mHttp._request\u001B[0;34m(self, conn, host, absolute_uri, request_uri, method, body, headers, redirections, cachekey)\u001B[0m\n\u001B[1;32m 1441\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m auth:\n\u001B[1;32m 1442\u001B[0m auth\u001B[38;5;241m.\u001B[39mrequest(method, request_uri, headers, body)\n\u001B[0;32m-> 1444\u001B[0m (response, content) \u001B[38;5;241m=\u001B[39m \u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43m_conn_request\u001B[49m\u001B[43m(\u001B[49m\u001B[43mconn\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mrequest_uri\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mmethod\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mbody\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[43mheaders\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 1446\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m auth:\n\u001B[1;32m 1447\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m auth\u001B[38;5;241m.\u001B[39mresponse(response, body):\n",
"File \u001B[0;32m~/code/langchain/.venv/lib/python3.10/site-packages/httplib2/__init__.py:1366\u001B[0m, in \u001B[0;36mHttp._conn_request\u001B[0;34m(self, conn, request_uri, method, body, headers)\u001B[0m\n\u001B[1;32m 1364\u001B[0m \u001B[38;5;28;01mtry\u001B[39;00m:\n\u001B[1;32m 1365\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m conn\u001B[38;5;241m.\u001B[39msock \u001B[38;5;129;01mis\u001B[39;00m \u001B[38;5;28;01mNone\u001B[39;00m:\n\u001B[0;32m-> 1366\u001B[0m \u001B[43mconn\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mconnect\u001B[49m\u001B[43m(\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 1367\u001B[0m conn\u001B[38;5;241m.\u001B[39mrequest(method, request_uri, body, headers)\n\u001B[1;32m 1368\u001B[0m \u001B[38;5;28;01mexcept\u001B[39;00m socket\u001B[38;5;241m.\u001B[39mtimeout:\n",
"File \u001B[0;32m~/code/langchain/.venv/lib/python3.10/site-packages/httplib2/__init__.py:1156\u001B[0m, in \u001B[0;36mHTTPSConnectionWithTimeout.connect\u001B[0;34m(self)\u001B[0m\n\u001B[1;32m 1154\u001B[0m \u001B[38;5;28;01mif\u001B[39;00m has_timeout(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mtimeout):\n\u001B[1;32m 1155\u001B[0m sock\u001B[38;5;241m.\u001B[39msettimeout(\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mtimeout)\n\u001B[0;32m-> 1156\u001B[0m \u001B[43msock\u001B[49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mconnect\u001B[49m\u001B[43m(\u001B[49m\u001B[43m(\u001B[49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mhost\u001B[49m\u001B[43m,\u001B[49m\u001B[43m \u001B[49m\u001B[38;5;28;43mself\u001B[39;49m\u001B[38;5;241;43m.\u001B[39;49m\u001B[43mport\u001B[49m\u001B[43m)\u001B[49m\u001B[43m)\u001B[49m\n\u001B[1;32m 1158\u001B[0m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39msock \u001B[38;5;241m=\u001B[39m \u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39m_context\u001B[38;5;241m.\u001B[39mwrap_socket(sock, server_hostname\u001B[38;5;241m=\u001B[39m\u001B[38;5;28mself\u001B[39m\u001B[38;5;241m.\u001B[39mhost)\n\u001B[1;32m 1160\u001B[0m \u001B[38;5;66;03m# Python 3.3 compatibility: emulate the check_hostname behavior\u001B[39;00m\n",
"\u001B[0;31mKeyboardInterrupt\u001B[0m: "
]
}
],
"source": [
"agent_chain.run(input=\"What is ChatGPT?\")"
"agent_executor.invoke({\"input\": \"What is ChatGPT?\"})"
]
},
{
@@ -179,15 +196,15 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to find out who developed ChatGPT\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I need to find out who developed ChatGPT\n",
"Action: Search\n",
"Action Input: Who developed ChatGPT\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large ... Feb 15, 2023 ... Who owns Chat GPT? Chat GPT is owned and developed by AI research and deployment company, OpenAI. The organization is headquartered in San ... Feb 8, 2023 ... ChatGPT is an AI chatbot developed by San Francisco-based startup OpenAI. OpenAI was co-founded in 2015 by Elon Musk and Sam Altman and is ... Dec 7, 2022 ... ChatGPT is an AI chatbot designed and developed by OpenAI. The bot works by generating text responses based on human-user input, like questions ... Jan 12, 2023 ... In 2019, Microsoft invested $1 billion in OpenAI, the tiny San Francisco company that designed ChatGPT. And in the years since, it has quietly ... Jan 25, 2023 ... The inside story of ChatGPT: How OpenAI founder Sam Altman built the world's hottest technology with billions from Microsoft. Dec 3, 2022 ... ChatGPT went viral on social media for its ability to do anything from code to write essays. · The company that created the AI chatbot has a ... Jan 17, 2023 ... While many Americans were nursing hangovers on New Year's Day, 22-year-old Edward Tian was working feverishly on a new app to combat misuse ... ChatGPT is a language model created by OpenAI, an artificial intelligence research laboratory consisting of a team of researchers and engineers focused on ... 1 day ago ... Everyone is talking about ChatGPT, developed by OpenAI. This is such a great tool that has helped to make AI more accessible to a wider ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: ChatGPT was developed by OpenAI.\u001b[0m\n",
"Action Input: Who developed ChatGPT\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large ... Feb 15, 2023 ... Who owns Chat GPT? Chat GPT is owned and developed by AI research and deployment company, OpenAI. The organization is headquartered in San ... Feb 8, 2023 ... ChatGPT is an AI chatbot developed by San Francisco-based startup OpenAI. OpenAI was co-founded in 2015 by Elon Musk and Sam Altman and is ... Dec 7, 2022 ... ChatGPT is an AI chatbot designed and developed by OpenAI. The bot works by generating text responses based on human-user input, like questions ... Jan 12, 2023 ... In 2019, Microsoft invested $1 billion in OpenAI, the tiny San Francisco company that designed ChatGPT. And in the years since, it has quietly ... Jan 25, 2023 ... The inside story of ChatGPT: How OpenAI founder Sam Altman built the world's hottest technology with billions from Microsoft. Dec 3, 2022 ... ChatGPT went viral on social media for its ability to do anything from code to write essays. · The company that created the AI chatbot has a ... Jan 17, 2023 ... While many Americans were nursing hangovers on New Year's Day, 22-year-old Edward Tian was working feverishly on a new app to combat misuse ... ChatGPT is a language model created by OpenAI, an artificial intelligence research laboratory consisting of a team of researchers and engineers focused on ... 1 day ago ... Everyone is talking about ChatGPT, developed by OpenAI. This is such a great tool that has helped to make AI more accessible to a wider ...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer\n",
"Final Answer: ChatGPT was developed by OpenAI.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -202,7 +219,7 @@
}
],
"source": [
"agent_chain.run(input=\"Who developed it?\")"
"agent_executor.invoke({\"input\": \"Who developed it?\"})"
]
},
{
@@ -217,14 +234,14 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to simplify the conversation for a 5 year old.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I need to simplify the conversation for a 5 year old.\n",
"Action: Summary\n",
"Action Input: My daughter 5 years old\u001b[0m\n",
"Action Input: My daughter 5 years old\u001B[0m\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"\u001B[1m> Entering new LLMChain chain...\u001B[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThis is a conversation between a human and a bot:\n",
"\u001B[32;1m\u001B[1;3mThis is a conversation between a human and a bot:\n",
"\n",
"Human: What is ChatGPT?\n",
"AI: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\n",
@@ -232,16 +249,16 @@
"AI: ChatGPT was developed by OpenAI.\n",
"\n",
"Write a summary of the conversation for My daughter 5 years old:\n",
"\u001b[0m\n",
"\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3m\n",
"The conversation was about ChatGPT, an artificial intelligence chatbot. It was created by OpenAI and can send and receive images while chatting.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot created by OpenAI that can send and receive images while chatting.\u001b[0m\n",
"Observation: \u001B[33;1m\u001B[1;3m\n",
"The conversation was about ChatGPT, an artificial intelligence chatbot. It was created by OpenAI and can send and receive images while chatting.\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot created by OpenAI that can send and receive images while chatting.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -256,8 +273,8 @@
}
],
"source": [
"agent_chain.run(\n",
" input=\"Thanks. Summarize the conversation, for my daughter 5 years old.\"\n",
"agent_executor.invoke(\n",
" {\"input\": \"Thanks. Summarize the conversation, for my daughter 5 years old.\"}\n",
")"
]
},
@@ -289,9 +306,17 @@
}
],
"source": [
"print(agent_chain.memory.buffer)"
"print(agent_executor.memory.buffer)"
]
},
{
"cell_type": "markdown",
"id": "84ca95c30e262e00",
"metadata": {
"collapsed": false
},
"source": []
},
{
"cell_type": "markdown",
"id": "cc3d0aa4",
@@ -340,25 +365,9 @@
" ),\n",
"]\n",
"\n",
"prefix = \"\"\"Have a conversation with a human, answering the following questions as best you can. You have access to the following tools:\"\"\"\n",
"suffix = \"\"\"Begin!\"\n",
"\n",
"{chat_history}\n",
"Question: {input}\n",
"{agent_scratchpad}\"\"\"\n",
"\n",
"prompt = ZeroShotAgent.create_prompt(\n",
" tools,\n",
" prefix=prefix,\n",
" suffix=suffix,\n",
" input_variables=[\"input\", \"chat_history\", \"agent_scratchpad\"],\n",
")\n",
"\n",
"llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)\n",
"agent = ZeroShotAgent(llm_chain=llm_chain, tools=tools, verbose=True)\n",
"agent_chain = AgentExecutor.from_agent_and_tools(\n",
" agent=agent, tools=tools, verbose=True, memory=memory\n",
")"
"prompt = hub.pull(\"hwchase17/react\")\n",
"agent = create_react_agent(model, tools, prompt)\n",
"agent_executor = AgentExecutor(agent=agent, tools=tools, memory=memory)"
]
},
{
@@ -373,15 +382,15 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I should research ChatGPT to answer this question.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I should research ChatGPT to answer this question.\n",
"Action: Search\n",
"Action Input: \"ChatGPT\"\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mNov 30, 2022 ... We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer ... ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large ... ChatGPT. We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer ... Feb 2, 2023 ... ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after ... 2 days ago ... ChatGPT recently launched a new version of its own plagiarism detection tool, with hopes that it will squelch some of the criticism around how ... An API for accessing new AI models developed by OpenAI. Feb 19, 2023 ... ChatGPT is an AI chatbot system that OpenAI released in November to show off and test what a very large, powerful AI system can accomplish. You ... ChatGPT is fine-tuned from GPT-3.5, a language model trained to produce text. ChatGPT was optimized for dialogue by using Reinforcement Learning with Human ... 3 days ago ... Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. Dec 1, 2022 ... ChatGPT is a natural language processing tool driven by AI technology that allows you to have human-like conversations and much more with a ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\u001b[0m\n",
"Action Input: \"ChatGPT\"\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mNov 30, 2022 ... We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer ... ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large ... ChatGPT. We've trained a model called ChatGPT which interacts in a conversational way. The dialogue format makes it possible for ChatGPT to answer ... Feb 2, 2023 ... ChatGPT, the popular chatbot from OpenAI, is estimated to have reached 100 million monthly active users in January, just two months after ... 2 days ago ... ChatGPT recently launched a new version of its own plagiarism detection tool, with hopes that it will squelch some of the criticism around how ... An API for accessing new AI models developed by OpenAI. Feb 19, 2023 ... ChatGPT is an AI chatbot system that OpenAI released in November to show off and test what a very large, powerful AI system can accomplish. You ... ChatGPT is fine-tuned from GPT-3.5, a language model trained to produce text. ChatGPT was optimized for dialogue by using Reinforcement Learning with Human ... 3 days ago ... Visual ChatGPT connects ChatGPT and a series of Visual Foundation Models to enable sending and receiving images during chatting. Dec 1, 2022 ... ChatGPT is a natural language processing tool driven by AI technology that allows you to have human-like conversations and much more with a ...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -396,7 +405,7 @@
}
],
"source": [
"agent_chain.run(input=\"What is ChatGPT?\")"
"agent_executor.invoke({\"input\": \"What is ChatGPT?\"})"
]
},
{
@@ -411,15 +420,15 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to find out who developed ChatGPT\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I need to find out who developed ChatGPT\n",
"Action: Search\n",
"Action Input: Who developed ChatGPT\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large ... Feb 15, 2023 ... Who owns Chat GPT? Chat GPT is owned and developed by AI research and deployment company, OpenAI. The organization is headquartered in San ... Feb 8, 2023 ... ChatGPT is an AI chatbot developed by San Francisco-based startup OpenAI. OpenAI was co-founded in 2015 by Elon Musk and Sam Altman and is ... Dec 7, 2022 ... ChatGPT is an AI chatbot designed and developed by OpenAI. The bot works by generating text responses based on human-user input, like questions ... Jan 12, 2023 ... In 2019, Microsoft invested $1 billion in OpenAI, the tiny San Francisco company that designed ChatGPT. And in the years since, it has quietly ... Jan 25, 2023 ... The inside story of ChatGPT: How OpenAI founder Sam Altman built the world's hottest technology with billions from Microsoft. Dec 3, 2022 ... ChatGPT went viral on social media for its ability to do anything from code to write essays. · The company that created the AI chatbot has a ... Jan 17, 2023 ... While many Americans were nursing hangovers on New Year's Day, 22-year-old Edward Tian was working feverishly on a new app to combat misuse ... ChatGPT is a language model created by OpenAI, an artificial intelligence research laboratory consisting of a team of researchers and engineers focused on ... 1 day ago ... Everyone is talking about ChatGPT, developed by OpenAI. This is such a great tool that has helped to make AI more accessible to a wider ...\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: ChatGPT was developed by OpenAI.\u001b[0m\n",
"Action Input: Who developed ChatGPT\u001B[0m\n",
"Observation: \u001B[36;1m\u001B[1;3mChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large ... Feb 15, 2023 ... Who owns Chat GPT? Chat GPT is owned and developed by AI research and deployment company, OpenAI. The organization is headquartered in San ... Feb 8, 2023 ... ChatGPT is an AI chatbot developed by San Francisco-based startup OpenAI. OpenAI was co-founded in 2015 by Elon Musk and Sam Altman and is ... Dec 7, 2022 ... ChatGPT is an AI chatbot designed and developed by OpenAI. The bot works by generating text responses based on human-user input, like questions ... Jan 12, 2023 ... In 2019, Microsoft invested $1 billion in OpenAI, the tiny San Francisco company that designed ChatGPT. And in the years since, it has quietly ... Jan 25, 2023 ... The inside story of ChatGPT: How OpenAI founder Sam Altman built the world's hottest technology with billions from Microsoft. Dec 3, 2022 ... ChatGPT went viral on social media for its ability to do anything from code to write essays. · The company that created the AI chatbot has a ... Jan 17, 2023 ... While many Americans were nursing hangovers on New Year's Day, 22-year-old Edward Tian was working feverishly on a new app to combat misuse ... ChatGPT is a language model created by OpenAI, an artificial intelligence research laboratory consisting of a team of researchers and engineers focused on ... 1 day ago ... Everyone is talking about ChatGPT, developed by OpenAI. This is such a great tool that has helped to make AI more accessible to a wider ...\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer\n",
"Final Answer: ChatGPT was developed by OpenAI.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -434,7 +443,7 @@
}
],
"source": [
"agent_chain.run(input=\"Who developed it?\")"
"agent_executor.invoke({\"input\": \"Who developed it?\"})"
]
},
{
@@ -449,14 +458,14 @@
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3mThought: I need to simplify the conversation for a 5 year old.\n",
"\u001B[1m> Entering new AgentExecutor chain...\u001B[0m\n",
"\u001B[32;1m\u001B[1;3mThought: I need to simplify the conversation for a 5 year old.\n",
"Action: Summary\n",
"Action Input: My daughter 5 years old\u001b[0m\n",
"Action Input: My daughter 5 years old\u001B[0m\n",
"\n",
"\u001b[1m> Entering new LLMChain chain...\u001b[0m\n",
"\u001B[1m> Entering new LLMChain chain...\u001B[0m\n",
"Prompt after formatting:\n",
"\u001b[32;1m\u001b[1;3mThis is a conversation between a human and a bot:\n",
"\u001B[32;1m\u001B[1;3mThis is a conversation between a human and a bot:\n",
"\n",
"Human: What is ChatGPT?\n",
"AI: ChatGPT is an artificial intelligence chatbot developed by OpenAI and launched in November 2022. It is built on top of OpenAI's GPT-3 family of large language models and is optimized for dialogue by using Reinforcement Learning with Human-in-the-Loop. It is also capable of sending and receiving images during chatting.\n",
@@ -464,16 +473,16 @@
"AI: ChatGPT was developed by OpenAI.\n",
"\n",
"Write a summary of the conversation for My daughter 5 years old:\n",
"\u001b[0m\n",
"\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"\u001B[1m> Finished chain.\u001B[0m\n",
"\n",
"Observation: \u001b[33;1m\u001b[1;3m\n",
"The conversation was about ChatGPT, an artificial intelligence chatbot developed by OpenAI. It is designed to have conversations with humans and can also send and receive images.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI that can have conversations with humans and send and receive images.\u001b[0m\n",
"Observation: \u001B[33;1m\u001B[1;3m\n",
"The conversation was about ChatGPT, an artificial intelligence chatbot developed by OpenAI. It is designed to have conversations with humans and can also send and receive images.\u001B[0m\n",
"Thought:\u001B[32;1m\u001B[1;3m I now know the final answer.\n",
"Final Answer: ChatGPT is an artificial intelligence chatbot developed by OpenAI that can have conversations with humans and send and receive images.\u001B[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
"\u001B[1m> Finished chain.\u001B[0m\n"
]
},
{
@@ -488,8 +497,8 @@
}
],
"source": [
"agent_chain.run(\n",
" input=\"Thanks. Summarize the conversation, for my daughter 5 years old.\"\n",
"agent_executor.invoke(\n",
" {\"input\": \"Thanks. Summarize the conversation, for my daughter 5 years old.\"}\n",
")"
]
},
@@ -524,7 +533,7 @@
}
],
"source": [
"print(agent_chain.memory.buffer)"
"print(agent_executor.memory.buffer)"
]
}
],

View File

@@ -647,7 +647,7 @@ Sometimes you may not have the luxury of using OpenAI or other service-hosted la
import logging
import torch
from transformers import AutoTokenizer, GPT2TokenizerFast, pipeline, AutoModelForSeq2SeqLM, AutoModelForCausalLM
from langchain_community.llms import HuggingFacePipeline
from langchain_huggingface import HuggingFacePipeline
# Note: This model requires a large GPU, e.g. an 80GB A100. See documentation for other ways to run private non-OpenAI models.
model_id = "google/flan-ul2"
@@ -740,7 +740,7 @@ Even this relatively large model will most likely fail to generate more complica
```bash
poetry run pip install pyyaml chromadb
poetry run pip install pyyaml langchain_chroma
import yaml
```
@@ -992,9 +992,9 @@ Now that you have some examples (with manually corrected output SQL), you can do
```python
from langchain.prompts import FewShotPromptTemplate, PromptTemplate
from langchain.chains.sql_database.prompt import _sqlite_prompt, PROMPT_SUFFIX
from langchain_community.embeddings.huggingface import HuggingFaceEmbeddings
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.prompts.example_selector.semantic_similarity import SemanticSimilarityExampleSelector
from langchain_community.vectorstores import Chroma
from langchain_chroma import Chroma
example_prompt = PromptTemplate(
input_variables=["table_info", "input", "sql_cmd", "sql_result", "answer"],

View File

@@ -9,7 +9,7 @@
" \n",
"[Together AI](https://python.langchain.com/docs/integrations/llms/together) has a broad set of OSS LLMs via inference API.\n",
"\n",
"See [here](https://api.together.xyz/playground). We use `\"mistralai/Mixtral-8x7B-Instruct-v0.1` for RAG on the Mixtral paper.\n",
"See [here](https://docs.together.ai/docs/inference-models). We use `\"mistralai/Mixtral-8x7B-Instruct-v0.1` for RAG on the Mixtral paper.\n",
"\n",
"Download the paper:\n",
"https://arxiv.org/pdf/2401.04088.pdf"
@@ -22,7 +22,7 @@
"metadata": {},
"outputs": [],
"source": [
"! pip install --quiet pypdf chromadb tiktoken openai langchain-together"
"! pip install --quiet pypdf tiktoken openai langchain-chroma langchain-together"
]
},
{
@@ -45,8 +45,8 @@
"all_splits = text_splitter.split_documents(data)\n",
"\n",
"# Add to vectorDB\n",
"from langchain_chroma import Chroma\n",
"from langchain_community.embeddings import OpenAIEmbeddings\n",
"from langchain_community.vectorstores import Chroma\n",
"\n",
"\"\"\"\n",
"from langchain_together.embeddings import TogetherEmbeddings\n",
@@ -148,7 +148,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.9.6"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,199 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 2,
"id": "c48812ed-35bd-4fbe-9a2c-6c7335e5645e",
"metadata": {},
"outputs": [],
"source": [
"from langchain_anthropic import ChatAnthropic\n",
"from langchain_core.runnables import ConfigurableField\n",
"from langchain_core.tools import tool\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"\n",
"@tool\n",
"def multiply(x: float, y: float) -> float:\n",
" \"\"\"Multiply 'x' times 'y'.\"\"\"\n",
" return x * y\n",
"\n",
"\n",
"@tool\n",
"def exponentiate(x: float, y: float) -> float:\n",
" \"\"\"Raise 'x' to the 'y'.\"\"\"\n",
" return x**y\n",
"\n",
"\n",
"@tool\n",
"def add(x: float, y: float) -> float:\n",
" \"\"\"Add 'x' and 'y'.\"\"\"\n",
" return x + y\n",
"\n",
"\n",
"tools = [multiply, exponentiate, add]\n",
"\n",
"gpt35 = ChatOpenAI(model=\"gpt-3.5-turbo-0125\", temperature=0).bind_tools(tools)\n",
"claude3 = ChatAnthropic(model=\"claude-3-sonnet-20240229\").bind_tools(tools)\n",
"llm_with_tools = gpt35.configurable_alternatives(\n",
" ConfigurableField(id=\"llm\"), default_key=\"gpt35\", claude3=claude3\n",
")"
]
},
{
"cell_type": "markdown",
"id": "9c186263-1b98-4cb2-b6d1-71f65eb0d811",
"metadata": {},
"source": [
"# LangGraph"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "28fc2c60-7dbc-428a-8983-1a6a15ea30d2",
"metadata": {},
"outputs": [],
"source": [
"import operator\n",
"from typing import Annotated, Sequence, TypedDict\n",
"\n",
"from langchain_core.messages import AIMessage, BaseMessage, HumanMessage, ToolMessage\n",
"from langchain_core.runnables import RunnableLambda\n",
"from langgraph.graph import END, StateGraph\n",
"\n",
"\n",
"class AgentState(TypedDict):\n",
" messages: Annotated[Sequence[BaseMessage], operator.add]\n",
"\n",
"\n",
"def should_continue(state):\n",
" return \"continue\" if state[\"messages\"][-1].tool_calls else \"end\"\n",
"\n",
"\n",
"def call_model(state, config):\n",
" return {\"messages\": [llm_with_tools.invoke(state[\"messages\"], config=config)]}\n",
"\n",
"\n",
"def _invoke_tool(tool_call):\n",
" tool = {tool.name: tool for tool in tools}[tool_call[\"name\"]]\n",
" return ToolMessage(tool.invoke(tool_call[\"args\"]), tool_call_id=tool_call[\"id\"])\n",
"\n",
"\n",
"tool_executor = RunnableLambda(_invoke_tool)\n",
"\n",
"\n",
"def call_tools(state):\n",
" last_message = state[\"messages\"][-1]\n",
" return {\"messages\": tool_executor.batch(last_message.tool_calls)}\n",
"\n",
"\n",
"workflow = StateGraph(AgentState)\n",
"workflow.add_node(\"agent\", call_model)\n",
"workflow.add_node(\"action\", call_tools)\n",
"workflow.set_entry_point(\"agent\")\n",
"workflow.add_conditional_edges(\n",
" \"agent\",\n",
" should_continue,\n",
" {\n",
" \"continue\": \"action\",\n",
" \"end\": END,\n",
" },\n",
")\n",
"workflow.add_edge(\"action\", \"agent\")\n",
"graph = workflow.compile()"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "3710e724-2595-4625-ba3a-effb81e66e4a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'messages': [HumanMessage(content=\"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\"),\n",
" AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_6yMU2WsS4Bqgi1WxFHxtfJRc', 'function': {'arguments': '{\"x\": 8, \"y\": 2.743}', 'name': 'exponentiate'}, 'type': 'function'}, {'id': 'call_GAL3dQiKFF9XEV0RrRLPTvVp', 'function': {'arguments': '{\"x\": 17.24, \"y\": -918.1241}', 'name': 'add'}, 'type': 'function'}]}, response_metadata={'token_usage': {'completion_tokens': 58, 'prompt_tokens': 168, 'total_tokens': 226}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_b28b39ffa8', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run-528302fc-7acf-4c11-82c4-119ccf40c573-0', tool_calls=[{'name': 'exponentiate', 'args': {'x': 8, 'y': 2.743}, 'id': 'call_6yMU2WsS4Bqgi1WxFHxtfJRc'}, {'name': 'add', 'args': {'x': 17.24, 'y': -918.1241}, 'id': 'call_GAL3dQiKFF9XEV0RrRLPTvVp'}]),\n",
" ToolMessage(content='300.03770462067547', tool_call_id='call_6yMU2WsS4Bqgi1WxFHxtfJRc'),\n",
" ToolMessage(content='-900.8841', tool_call_id='call_GAL3dQiKFF9XEV0RrRLPTvVp'),\n",
" AIMessage(content='The result of \\\\(3 + 5^{2.743}\\\\) is approximately 300.04, and the result of \\\\(17.24 - 918.1241\\\\) is approximately -900.88.', response_metadata={'token_usage': {'completion_tokens': 44, 'prompt_tokens': 251, 'total_tokens': 295}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': 'fp_b28b39ffa8', 'finish_reason': 'stop', 'logprobs': None}, id='run-d1161669-ed09-4b18-94bd-6d8530df5aa8-0')]}"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"graph.invoke(\n",
" {\n",
" \"messages\": [\n",
" HumanMessage(\n",
" \"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\"\n",
" )\n",
" ]\n",
" }\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "073c074e-d722-42e0-85ec-c62c079207e4",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'messages': [HumanMessage(content=\"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\"),\n",
" AIMessage(content=[{'text': \"Okay, let's break this down into two parts:\", 'type': 'text'}, {'id': 'toolu_01DEhqcXkXTtzJAiZ7uMBeDC', 'input': {'x': 3, 'y': 5}, 'name': 'add', 'type': 'tool_use'}], response_metadata={'id': 'msg_01AkLGH8sxMHaH15yewmjwkF', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'input_tokens': 450, 'output_tokens': 81}}, id='run-f35bfae8-8ded-4f8a-831b-0940d6ad16b6-0', tool_calls=[{'name': 'add', 'args': {'x': 3, 'y': 5}, 'id': 'toolu_01DEhqcXkXTtzJAiZ7uMBeDC'}]),\n",
" ToolMessage(content='8.0', tool_call_id='toolu_01DEhqcXkXTtzJAiZ7uMBeDC'),\n",
" AIMessage(content=[{'id': 'toolu_013DyMLrvnrto33peAKMGMr1', 'input': {'x': 8.0, 'y': 2.743}, 'name': 'exponentiate', 'type': 'tool_use'}], response_metadata={'id': 'msg_015Fmp8aztwYcce2JDAFfce3', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'input_tokens': 545, 'output_tokens': 75}}, id='run-48aaeeeb-a1e5-48fd-a57a-6c3da2907b47-0', tool_calls=[{'name': 'exponentiate', 'args': {'x': 8.0, 'y': 2.743}, 'id': 'toolu_013DyMLrvnrto33peAKMGMr1'}]),\n",
" ToolMessage(content='300.03770462067547', tool_call_id='toolu_013DyMLrvnrto33peAKMGMr1'),\n",
" AIMessage(content=[{'text': 'So 3 plus 5 raised to the 2.743 power is 300.04.\\n\\nFor the second part:', 'type': 'text'}, {'id': 'toolu_01UTmMrGTmLpPrPCF1rShN46', 'input': {'x': 17.24, 'y': -918.1241}, 'name': 'add', 'type': 'tool_use'}], response_metadata={'id': 'msg_015TkhfRBENPib2RWAxkieH6', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'tool_use', 'stop_sequence': None, 'usage': {'input_tokens': 638, 'output_tokens': 105}}, id='run-45fb62e3-d102-4159-881d-241c5dbadeed-0', tool_calls=[{'name': 'add', 'args': {'x': 17.24, 'y': -918.1241}, 'id': 'toolu_01UTmMrGTmLpPrPCF1rShN46'}]),\n",
" ToolMessage(content='-900.8841', tool_call_id='toolu_01UTmMrGTmLpPrPCF1rShN46'),\n",
" AIMessage(content='Therefore, 17.24 - 918.1241 = -900.8841', response_metadata={'id': 'msg_01LgKnRuUcSyADCpxv9tPoYD', 'model': 'claude-3-sonnet-20240229', 'stop_reason': 'end_turn', 'stop_sequence': None, 'usage': {'input_tokens': 759, 'output_tokens': 24}}, id='run-1008254e-ccd1-497c-8312-9550dd77bd08-0')]}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"graph.invoke(\n",
" {\n",
" \"messages\": [\n",
" HumanMessage(\n",
" \"what's 3 plus 5 raised to the 2.743. also what's 17.24 - 918.1241\"\n",
" )\n",
" ]\n",
" },\n",
" config={\"configurable\": {\"llm\": \"claude3\"}},\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.4"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -3811,7 +3811,7 @@
"from langchain.chains import ConversationalRetrievalChain\n",
"from langchain_openai import ChatOpenAI\n",
"\n",
"model = ChatOpenAI(model_name=\"gpt-3.5-turbo-0613\") # switch to 'gpt-4'\n",
"model = ChatOpenAI(model=\"gpt-3.5-turbo-0613\") # switch to 'gpt-4'\n",
"qa = ConversationalRetrievalChain.from_llm(model, retriever=retriever)"
]
},

View File

@@ -84,7 +84,7 @@
" Applies the chatmodel to the message history\n",
" and returns the message string\n",
" \"\"\"\n",
" message = self.model(\n",
" message = self.model.invoke(\n",
" [\n",
" self.system_message,\n",
" HumanMessage(content=\"\\n\".join(self.message_history + [self.prefix])),\n",
@@ -424,7 +424,7 @@
" DialogueAgentWithTools(\n",
" name=name,\n",
" system_message=SystemMessage(content=system_message),\n",
" model=ChatOpenAI(model_name=\"gpt-4\", temperature=0.2),\n",
" model=ChatOpenAI(model=\"gpt-4\", temperature=0.2),\n",
" tool_names=tools,\n",
" top_k_results=2,\n",
" )\n",

Some files were not shown because too many files have changed in this diff Show More