Compare commits

..

107 Commits

Author SHA1 Message Date
Eugene Yurtsev
354c382875 x 2023-12-05 11:44:05 -05:00
Eugene Yurtsev
05d53073cf x 2023-12-05 11:43:43 -05:00
Eugene Yurtsev
11f52cbee5 Merge branch 'master' into eugene/update_file_chat_memory 2023-12-05 11:20:03 -05:00
Eugene Yurtsev
ff023e96dd x 2023-12-05 10:36:42 -05:00
Eun Hye Kim
f758c8adc4 Fix #11737 issue (extra_tools option of create_pandas_dataframe_agent is not working) (#13203)
- **Description:** Fix #11737 issue (extra_tools option of
create_pandas_dataframe_agent is not working),
  - **Issue:** #11737 ,
  - **Dependencies:** no,
- **Tag maintainer:** @baskaryan, @eyurtsev, @hwchase17 I needed this
method at work, so I modified it myself and used it. There is a similar
issue(#11737) and PR(#13018) of @PyroGenesis, so I combined my code at
the original PR.
You may be busy, but it would be great help for me if you checked. Thank
you.
  - **Twitter handle:** @lunara_x 

If you need an .ipynb example about this, please tag me. 
I will share what I am working on after removing any work-related
content.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-04 20:54:08 -08:00
Sean Bearden
77a15fa988 Added ability to pass arguments to the Playwright browser (#13146)
- **Description:** Enhanced `create_sync_playwright_browser` and
`create_async_playwright_browser` functions to accept a list of
arguments. These arguments are now forwarded to
`browser.chromium.launch()` for customizable browser instantiation.
  - **Issue:** #13143
  - **Dependencies:** None
  - **Tag maintainer:** @eyurtsev,
  - **Twitter handle:** Dr_Bearden

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-04 20:48:09 -08:00
Joan Fontanals
dcccf8fa66 adapt Jina Embeddings to new Jina AI Embedding API (#13658)
- **Description:** Adapt JinaEmbeddings to run with the new Jina AI
Embedding platform
- **Twitter handle:** https://twitter.com/JinaAI_

---------

Co-authored-by: Joan Fontanals Martinez <joan.fontanals.martinez@jina.ai>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-04 20:40:33 -08:00
Philippe PRADOS
e0c03d6c44 Pprados/lite google drive (#13175)
- Fix bug in the document
 - Add clarification on the use of langchain-google drive.
2023-12-04 20:31:21 -08:00
guillaumedelande
ea0afd07ca Update azuresearch.py following recent change from azure-search-documents library (#13472)
- **Description:** 

Reference library azure-search-documents has been adapted in version
11.4.0:

1. Notebook explaining Azure AI Search updated with most recent info
2. HnswVectorSearchAlgorithmConfiguration --> HnswAlgorithmConfiguration
3. PrioritizedFields(prioritized_content_fields) -->
SemanticPrioritizedFields(content_fields)
4. SemanticSettings --> SemanticSearch
5. VectorSearch(algorithm_configurations) -->
VectorSearch(configurations)

--> Changes now reflected on Langchain: default vector search config
from langchain is now compatible with officially released library from
Azure.

  - **Issue:**
Issue creating a new index (due to wrong class used for default vector
search configuration) if using latest version of azure-search-documents
with current langchain version
  - **Dependencies:** azure-search-documents>=11.4.0,
  - **Tag maintainer:** ,

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-12-04 20:29:20 -08:00
price-deshaw
5cb3393e20 update OpenAI function agents' llm validation (#13538)
- **Description:** This PR modifies the LLM validation in OpenAI
function agents to check whether the LLM supports OpenAI functions based
on a property (`supports_oia_functions`) instead of whether the LLM
passed to the agent `isinstance` of `ChatOpenAI`. This allows classes
that extend `BaseChatModel` to be passed to these agents as long as
they've been integrated with the OpenAI APIs and have this property set,
even if they don't extend `ChatOpenAI`.
  - **Issue:** N/A
  - **Dependencies:** none
2023-12-04 20:28:13 -08:00
Max Weng
74c7b799ef migrate openai audio api (#13557)
for issue https://github.com/langchain-ai/langchain/issues/13162
migrate openai audio api, as [openai v1.0.0 Migration
Guide](https://github.com/openai/openai-python/discussions/742)

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Double Max <max@ground-map.com>
2023-12-04 20:27:54 -08:00
Arnaud Gelas
abbba6c7d8 openapi/planner.py: Deal with json in markdown output cases (#13576)
- **Description:** In openapi/planner deal with json in markdown output
cases
- **Issue:** In some cases LLMs could return json in markdown which
can't be loaded.
  - **Dependencies:**
  - **Tag maintainer:** @eyurtsev
  - **Twitter handle:**

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-04 20:27:22 -08:00
Harrison Chase
8eab4d95c0 Harrison/delegate from template (#14266)
Co-authored-by: M.R. Sopacua <144725145+msopacua@users.noreply.github.com>
2023-12-04 20:18:15 -08:00
Erick Friis
956d55de2b docs[patch]: chat model page names (#14264) 2023-12-04 20:08:41 -08:00
Eugene Yurtsev
78ae25d6b7 x 2023-12-04 22:58:32 -05:00
Nolan
b49104c2c9 Add missing doc key to metadata field in AzureSearch Vectorstore (#13328)
- **Description:** Adds doc key to metadata field when adding document
to Azure Search.
  - **Issue:** -,
  - **Dependencies:** -,
  - **Tag maintainer:** @eyurtsev,
  - **Twitter handle:** @finnless

Right now the document key with the name FIELDS_ID is not included in
the FIELDS_METADATA field, and therefore is not included in the Document
returned from a query. This is really annoying if you want to be able to
modify that item in the vectorstore.

Other's thoughts on this are welcome.
2023-12-04 19:53:27 -08:00
Jon Watte
e042e5df35 fix: call _on_llm_error() (#13581)
Description: There's a copy-paste typo where on_llm_error() calls
_on_chain_error() instead of _on_llm_error().
Issue: #13580 
Dependencies: None
Tag maintainer: @hwchase17 
Twitter handle: @jwatte

"Run `make format`, `make lint` and `make test` to check this locally."
The test scripts don't work in a plain Ubuntu LTS 20.04 system.
It looks like the dev container pulling is stuck. Or maybe the internet
is just ornery today.

---------

Co-authored-by: jwatte <jwatte@observeinc.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-04 19:44:50 -08:00
Hamza Ahmed
fcc8e5e839 Update geodataframe.py (#13573)
here it is validating shapely.geometry.point.Point: if not
isinstance(data_frame[page_content_column].iloc[0], gpd.GeoSeries):
raise ValueError(
f"Expected data_frame[{page_content_column}] to be a GeoSeries" you need
it to validate the geoSeries and not the shapely.geometry.point.Point

if not isinstance(data_frame[page_content_column], gpd.GeoSeries):
            raise ValueError(
f"Expected data_frame[{page_content_column}] to be a GeoSeries"

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-12-04 19:44:30 -08:00
Harrison Chase
2213fc9711 Harrison/bookend ai (#14258)
Co-authored-by: stvhu-bookend <142813359+stvhu-bookend@users.noreply.github.com>
2023-12-04 19:42:15 -08:00
cxumol
0d47d15a9f add(feat): Text Embeddings by Cloudflare Workers AI (#14220)
Add [Text Embeddings by Cloudflare Workers
AI](https://developers.cloudflare.com/workers-ai/models/text-embeddings/).
It's a new integration.
Trying to align it with its langchain-js version counterpart
[here](https://api.js.langchain.com/classes/embeddings_cloudflare_workersai.CloudflareWorkersAIEmbeddings.html).
- Dependencies: N/A
- Done `make format` `make lint` `make spell_check` `make
integration_tests` and all my changes was passed
2023-12-04 19:25:05 -08:00
Harrison Chase
c51001f01e fix comet tracer (#14259) 2023-12-04 19:03:19 -08:00
Erick Friis
4351b99d2b docs[patch]: search experiment (#14254)
- npm
- search config
- custom
2023-12-04 16:58:26 -08:00
Harrison Chase
4fb72ff76f fake consistent embeddings cleanup (#14256)
delete code that could never be reached
2023-12-04 16:55:30 -08:00
Michael Landis
e26906c1dc feat: implement max marginal relevance for momento vector index (#13619)
**Description**

Implements `max_marginal_relevance_search` and
`max_marginal_relevance_search_by_vector` for the Momento Vector Index
vectorstore.

Additionally bumps the `momento` dependency in the lock file and adds
logging to the implementation.

**Dependencies**

 updates `momento` dependency in lock file

**Tag maintainer**

@baskaryan 

**Twitter handle**

Please tag @momentohq for Momento Vector Index and @mloml for the
contribution 🙇

<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
2023-12-04 16:50:23 -08:00
deedy5
ee9abb6722 Bugfix duckduckgo_search news search (#13670)
- **Description:** 
Bugfix duckduckgo_search news search
  - **Issue:** 
https://github.com/langchain-ai/langchain/issues/13648
  - **Dependencies:** 
None
  - **Tag maintainer:** 
@baskaryan

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-04 16:48:20 -08:00
Aliaksandr Kuzmik
676a077c4e Add CometTracer (#13661)
Hi! I'm Alex, Python SDK Team Lead from
[Comet](https://www.comet.com/site/).

This PR contains our new integration between langchain and Comet -
`CometTracer` class which uses new `comet_llm` python package for
submitting data to Comet.

No additional dependencies for the langchain package are required
directly, but if the user wants to use `CometTracer`, `comet-llm>=2.0.0`
should be installed. Otherwise an exception will be raised from
`CometTracer.__init__`.

A test for the feature is included.

There is also an already existing callback (and .ipynb file with
example) which ideally should be deprecated in favor of a new tracer. I
wasn't sure how exactly you'd prefer to do it. For example we could open
a separate PR for that.

I'm open to your ideas :)
2023-12-04 16:46:48 -08:00
Harrison Chase
921c4b5597 Harrison/searchapi (#14252)
Co-authored-by: SebastjanPrachovskij <86522260+SebastjanPrachovskij@users.noreply.github.com>
2023-12-04 16:34:15 -08:00
Ravidhu
224aa5151d Fix Sagemaker Endpoint documentation (#13660)
- **Description:** fixed the transform_input method in the example., 
  - **Issue:** example didn't work,
  - **Dependencies:** None,
  - **Tag maintainer:** @baskaryan,
  - **Twitter handle:** @Ravidhu87
2023-12-04 16:28:29 -08:00
Colin Ulin
9f9cb71d26 Embaas - added backoff retries for network requests (#13679)
Running a large number of requests to Embaas' servers (or any server)
can result in intermittent network failures (both from local and
external network/service issues). This PR implements exponential backoff
retries to help mitigate this issue.
2023-12-04 16:21:35 -08:00
Erick Friis
f26d88ca60 docs[patch]: fix columns (#14251) 2023-12-04 16:03:09 -08:00
Kastan Day
65faba91ad langchain[patch]: Adding new Github functions for reading pull requests (#9027)
The Github utilities are fantastic, so I'm adding support for deeper
interaction with pull requests. Agents should read "regular" comments
and review comments, and the content of PR files (with summarization or
`ctags` abbreviations).

Progress:
- [x] Add functions to read pull requests and the full content of
modified files.
- [x] Function to use Github's built in code / issues search.

Out of scope:
- Smarter summarization of file contents of large pull requests (`tree`
output, or ctags).
- Smarter functions to checkout PRs and edit the files incrementally
before bulk committing all changes.
- Docs example for creating two agents:
- One watches issues: For every new issue, open a PR with your best
attempt at fixing it.
- The other watches PRs: For every new PR && every new comment on a PR,
check the status and try to finish the job.

<!-- Thank you for contributing to LangChain!

Replace this comment with:
  - Description: a description of the change, 
  - Issue: the issue # it fixes (if applicable),
  - Dependencies: any dependencies required for this change,
- Tag maintainer: for a quicker response, tag the relevant maintainer
(see below),
- Twitter handle: we announce bigger features on Twitter. If your PR
gets announced and you'd like a mention, we'll gladly shout you out!

Please make sure you're PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
  2. an example notebook showing its use.

Maintainer responsibilities:
  - General / Misc / if you don't know who to tag: @baskaryan
  - DataLoaders / VectorStores / Retrievers: @rlancemartin, @eyurtsev
  - Models / Prompts: @hwchase17, @baskaryan
  - Memory: @hwchase17
  - Agents / Tools / Toolkits: @hinthornw
  - Tracing / Callbacks: @agola11
  - Async: @agola11

If no one reviews your PR within a few days, feel free to @-mention the
same people again.

See contribution guidelines for more information on how to write/run
tests, lint, etc:
https://github.com/hwchase17/langchain/blob/master/.github/CONTRIBUTING.md
 -->

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-12-04 15:53:36 -08:00
Hynek Kydlíček
aa8ae31e5b core[patch]: add response kwarg to on_llm_error
# Dependencies
None

# Twitter handle
@HKydlicek

---------

Co-authored-by: Erick Friis <erick@langchain.dev>
2023-12-04 15:04:48 -08:00
Leonid Ganeline
1750cc464d docs[patch]: moved vectorstore notebook file (#14181)
The `/docs/integrations/toolkits/vectorstore` page is not the
Integration page. The best place is in `/docs/modules/agents/how_to/`
- Moved the file
- Rerouted the page URL
2023-12-04 14:44:06 -08:00
Jacob Lee
a26c4a0930 Allow base_store to be used directly with MultiVectorRetriever (#14202)
Allow users to pass a generic `BaseStore[str, bytes]` to
MultiVectorRetriever, removing the need to use the `create_kv_docstore`
method. This encoding will now happen internally.

@rlancemartin @eyurtsev

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2023-12-04 14:43:32 -08:00
Vincent Brouwers
67662564f3 langchain[patch]: Fix config arg detection for wrapped lambdarunnable (#14230)
**Description:**
When a RunnableLambda only receives a synchronous callback, this
callback is wrapped into an async one since #13408. However, this
wrapping with `(*args, **kwargs)` causes the `accepts_config` check at
[/libs/core/langchain_core/runnables/config.py#L342](ee94ef55ee/libs/core/langchain_core/runnables/config.py (L342))
to fail, as this checks for the presence of a "config" argument in the
method signature.

Adding a `functools.wraps` around it, resolves it.
2023-12-04 14:18:30 -08:00
Jacob Lee
de86b84a70 Prefer byte store interface for Upstash BaseStore to match other Redis (#14201)
If we are not going to make the existing Docstore class also implement
`BaseStore[str, Document]`, IMO all base store implementations should
always be `[str, bytes]` so that they are more interchangeable.

CC @rlancemartin @eyurtsev
2023-12-04 14:17:33 -08:00
Harrison Chase
411aa9a41e Harrison/nasa tool (#14245)
Co-authored-by: Jacob Matias <88005863+matiasjacob25@users.noreply.github.com>
Co-authored-by: Karam Daid <karam.daid@mail.utoronto.ca>
Co-authored-by: Jumana <jumana.fanous@mail.utoronto.ca>
Co-authored-by: KaramDaid <38271127+KaramDaid@users.noreply.github.com>
Co-authored-by: Anna Chester <74325334+CodeMakesMeSmile@users.noreply.github.com>
Co-authored-by: Jumana <144748640+jfanous@users.noreply.github.com>
2023-12-04 13:43:11 -08:00
nceccarelli
5fea63327b Support Azure gov cloud in Azure Cognitive Search retriever (#13695)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
- **Description:** The existing version hardcoded search.windows.net in
the base url. This is not compatible with the gov cloud. I am allowing
the user to override the default for gov cloud support.,
  - **Issue:** N/A, did not write up in an issue,
  - **Dependencies:** None

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->

---------

Co-authored-by: Nicholas Ceccarelli <nceccarelli2@moog.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-04 12:56:35 -08:00
ealt
e09b876863 Fixes error loading Obsidian templates (#13888)
- **Description:** Obsidian templates can include
[variables](https://help.obsidian.md/Plugins/Templates#Template+variables)
using double curly braces. `ObsidianLoader` uses PyYaml to parse the
frontmatter of documents. This parsing throws an error when encountering
variables' curly braces. This is avoided by temporarily substituting
safe strings before parsing.
  - **Issue:** #13887
  - **Tag maintainer:** @hwchase17
2023-12-04 12:55:37 -08:00
Erick Friis
f6d68d78f3 nbdoc -> quarto (#14156)
Switches to a more maintained solution for building ipynb -> md files
(`quarto`)

Also bumps us down to python3.8 because it's significantly faster in the
vercel build step. Uses default openssl version instead of upgrading as
well.
2023-12-04 12:50:56 -08:00
Nithish Raghunandanan
eecfa3f9e5 Add Couchbase document loader (#13979)
**Description:** 
Adds the document loader for [Couchbase](http://couchbase.com/), a
distributed NoSQL database.
**Dependencies:** 
Added the Couchbase SDK as an optional dependency.
**Twitter handle:** nithishr

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-04 12:28:12 -08:00
Bob Lin
805e9bfc24 Add doc for the development of core and experimental sections (#13966)
### **Description**

Hi, I just started learning the source code of `langchain` and hope to
contribute code. However, according to the instructions in the
[CONTRIBUTING.md](https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md)
document, I could not run the test command `make test` to run normally.
I found that many modules did not exist after [splitting
`langchain_core`](https://github.com/langchain-ai/langchain/discussions/13823),
so I updated the document.

### **Twitter handle** 

lin_bob57617
2023-12-04 12:27:57 -08:00
Muntaqa Mahmood
25f72944a0 Add: Steam API tool (#14008)
- **Description:** Our PR is an integration of a Steam API Tool that
makes recommendations on steam games based on user's Steam profile and
provides information on games based on user provided queries.
- **Issue:** the issue # our PR implements:
https://github.com/langchain-ai/langchain/issues/12120
- **Dependencies:** python-steam-api library, steamspypi library and
decouple library
  - **Tag maintainer:** @baskaryan, @hwchase17 
  - **Twitter handle:** N/A

Hello langchain Maintainers,

We are a team of 4 University of Toronto students contributing to
langchain as part of our course [CSCD01 (link to course
page)](https://cscd01.com/work/open-source-project). We hope our changes
help the community. We have run make format, make lint and make test
locally before submitting the PR. To our knowledge, our changes do not
introduce any new errors.

Our PR integrates the python-steam-api, steamspypi and decouple
packages. We have added integration tests to test our python API
integration into langchain and an example notebook is also provided.

Our amazing team that contributed to this PR: @JohnY2002, @shenceyang,
@andrewqian2001 and @muntaqamahmood

Thank you in advance to all the maintainers for reviewing our PR!

---------

Co-authored-by: Shence <ysc1412799032@163.com>
Co-authored-by: JohnY2002 <johnyuan0526@gmail.com>
Co-authored-by: Andrew Qian <andrewqian2001@gmail.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: JohnY <94477598+JohnY2002@users.noreply.github.com>
2023-12-04 12:27:38 -08:00
Bob Lin
cd2028288e Add openai v2 adapter (#14063)
### Description

Starting from [openai version
1.0.0](17ac677995 (module-level-client)),
the camel case form of `openai.ChatCompletion` is no longer supported
and has been changed to lowercase `openai.chat.completions`. In
addition, the returned object only accepts attribute access instead of
index access:

```python
import openai

# optional; defaults to `os.environ['OPENAI_API_KEY']`
openai.api_key = '...'

# all client options can be configured just like the `OpenAI` instantiation counterpart
openai.base_url = "https://..."
openai.default_headers = {"x-foo": "true"}

completion = openai.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "user",
            "content": "How do I output all files in a directory using Python?",
        },
    ],
)
print(completion.choices[0].message.content)
```

So I implemented a compatible adapter that supports both attribute
access and index access:

```python
In [1]: from langchain.adapters import openai as lc_openai
   ...: messages = [{"role": "user", "content": "hi"}]

In [2]: result = lc_openai.chat.completions.create(
   ...:     messages=messages, model="gpt-3.5-turbo", temperature=0
   ...: )

In [3]: result.choices[0].message
Out[3]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [4]: result["choices"][0]["message"]
Out[4]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [5]: result = await lc_openai.chat.completions.acreate(
   ...:     messages=messages, model="gpt-3.5-turbo", temperature=0
   ...: )

In [6]: result.choices[0].message
Out[6]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [7]: result["choices"][0]["message"]
Out[7]: {'role': 'assistant', 'content': 'Hello! How can I assist you today?'}

In [8]: for rs in lc_openai.chat.completions.create(
    ...:     messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
    ...: ):
    ...:     print(rs.choices[0].delta)
    ...:     print(rs["choices"][0]["delta"])
    ...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}

In [20]: async for rs in await lc_openai.chat.completions.acreate(
    ...:     messages=messages, model="gpt-3.5-turbo", temperature=0, stream=True
    ...: ):
    ...:     print(rs.choices[0].delta)
    ...:     print(rs["choices"][0]["delta"])
    ...:
{'role': 'assistant', 'content': ''}
{'role': 'assistant', 'content': ''}
{'content': 'Hello'}
{'content': 'Hello'}
{'content': '!'}
{'content': '!'}
...
```

### Twitter handle

[lin_bob57617](https://twitter.com/lin_bob57617)
2023-12-04 12:12:30 -08:00
billytrend-cohere
0f02081392 Add input_type override (#14068)
Add option to override input_type for cohere's v3 embeddings models

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-04 12:10:24 -08:00
Dmitrii Rashchenko
aaabc1574f Support of custom hugging face inference endpoints url (#14125)
- **Description:** to support not only publicly available Hugging Face
endpoints, but also protected ones (created with "Inference Endpoints"
Hugging Face feature), I have added ability to specify custom api_url.
But if not specified, default behaviour won't change
  - **Issue:** #9181,
  - **Dependencies:** no extra dependencies
2023-12-04 12:08:51 -08:00
Bob Lin
702a6d7044 Closed #14159 (#14165)
### Description

Fix: #14159

Use `from pydantic.v1 import BaseModel, Field` instead of `from pydantic
import BaseModel, Field`

### [lin_bob57617](https://twitter.com/lin_bob57617)
2023-12-04 12:06:04 -08:00
Perry Lee
641e401ba8 Shorten wget commands (#14211)
- **Description:** The commands can be more efficient if the output name
is set to the destined filename instead of renaming in the second
command.
2023-12-04 12:03:47 -08:00
Harrison Chase
e32185193e Harrison/embass (#14242)
Co-authored-by: Julius Lipp <lipp.julius@gmail.com>
2023-12-04 11:58:52 -08:00
umair mehmood
8504ec56e4 fixed: ModuleNotFoundError: No module named 'clarifai.auth' (#14215)
Updated the clarifai imports 

fixed: #14175 

@efriis 
@baskaryan
2023-12-04 11:53:34 -08:00
Hieu Lam
ca8a022cd9 Fixed OpenAIFunctionsAgent not returning when receiving AgentFinish (#14236)
**Description:** The way the condition is checked in the
`return_stopped_response` function of `OpenAIAgent` may not be correct,
when the value returned is `AgentFinish` from the tools it does not work
properly.


Thanks for review, @baskaryan, @eyurtsev, @hwchase17.
2023-12-04 11:43:04 -08:00
Unai Garay Maestre
6826feea14 Adds llm_chain_kwargs to BaseRetrievalQA.from_llm (#14224)
- **Description:** Adds `llm_chain_kwargs` to `BaseRetrievalQA.from_llm`
so these can be passed to the LLM at runtime,
- **Issue:** https://github.com/langchain-ai/langchain/issues/14216,

---------

Signed-off-by: ugm2 <unaigaraymaestre@gmail.com>
2023-12-04 11:34:01 -08:00
James Braza
6ce5dab38c Clarifying descriptions in GuardrailsOutputParser (#14228)
Upstreaming knowledge from
https://github.com/guardrails-ai/guardrails/discussions/473 to LangChain
2023-12-04 11:33:22 -08:00
geret1
50aee687c6 langchain[patch]: Cerebrium model_api_request deprecation (#12704)
- **Description:** As part of my conversation with Cerebrium team,
`model_api_request` will be no longer available in cerebrium lib so it
needs to be replaced.
  - **Issue:** #12705 12705,
  - **Dependencies:** Cerebrium team (agreed)
  - **Tag maintainer:** @eyurtsev 
  - **Twitter handle:** No official Twitter account sorry :D

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-04 09:26:32 -08:00
Harutaka Kawamura
ee94ef55ee docs[patch]: Update MLflow and Databricks docs (#14011)
Depends on #13699. Updates the existing mlflow and databricks examples.

---------

Co-authored-by: Ben Wilson <39283302+BenWilson2@users.noreply.github.com>
2023-12-03 16:07:09 -08:00
Leonid Ganeline
94bf733dae docs[patch]: AWS platform page update (#14160)
The `AWS` platform page has many missed integrations.
- added missed integration references to the `AWS` platform page
- added/updated descriptions and links in the referenced notebooks
- renamed two notebook files. They have file names != page Title, which
generate unordered ToC.
- reroute the URLs for renamed files
- fixed `amazon_textract` notebook: removed failed cell outputs
2023-12-03 15:42:52 -08:00
Leonid Ganeline
74d4154bcc docs[patch]: added Templates Hub menu item (#14148)
This link was missing in Docs.
Added it.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-03 15:36:35 -08:00
William FH
246dc4f9cc langchain[patch]: Pass kwargs to chat fireworks (#14183)
Otherwise `.bind()` isn't really any good
2023-12-03 15:12:02 -08:00
Kaiboon Ee
e961c57fd2 langchain[patch]: Mask API key for Arcee LLM (#14193)
- **Description:** Mask API key for Arcee LLM and its associated unit
tests
  - **Issue:** https://github.com/langchain-ai/langchain/issues/12165
  - **Dependencies:** N/A
  - **Tag maintainer:** @eyurtsev
  - **Twitter handle:** `eekaiboon`

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-03 15:11:43 -08:00
Daniyar Supiyev
092f302c0f langchain[patch]: Asynchronous human-in-the-loop callback (#14195)
**Description:** Adding a possibility to use asynchronous callback
handler in human-in-the-loop validation tool. Very useful, for example,
if you want to implement a validation over Telegram bot.
**Issue:** -
**Dependencies:** -

---------

Co-authored-by: Daniyar_Supiyev <daniyar_supiyev@epam.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-03 14:57:07 -08:00
Leonid Ganeline
c660b0cf79 docs[patch]: moved semadb.mdx file (#14204)
SemaDB.mdx file was placed with additional sub-folder:
`https://python.langchain.com/docs/integrations/providers/providers/semadb`
- Moved file to the
`https://python.langchain.com/docs/integrations/providers/semadb`
- Added a redirect for the file URL
2023-12-03 14:36:47 -08:00
Mark Cusack
16c83f786c Adds the Yellowbrick Data Warehouse as a supported vector store (#13820)
- **Description** An integration to allow the Yellowbrick Data Warehouse
to function as a vector store

---------

Co-authored-by: markcusack <markcusack@markcusacksmac.lan>
Co-authored-by: markcusack <markcusack@Mark-Cusack-sMac.local>
2023-12-03 13:35:53 -08:00
Hendrik Hogertz
e6862e6e7d Fix Azure Openai function calling in streaming mode (#13768)
- **Description**: This PR addresses an issue with the OpenAI API
streaming response, where initially the key (arguments) is provided but
the value is None. Subsequently, it updates with {"arguments": "{\n"},
leading to a type inconsistency that causes an exception. The specific
error encountered is ValueError: additional_kwargs["arguments"] already
exists in this message, but with a different type. This change aims to
resolve this inconsistency and ensure smooth API interactions.
- **Issue**: None.
- **Dependencies**: None.
- **Tag maintainer**: @eyurtsev

This is an updated version of #13229 based on the refactored code.
Credit goes to @superken01.

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-03 12:07:15 -08:00
Nicolò Boschi
e204657b3c AstraDB VectorStore: implement pre_delete_collection (#13780)
- **Description:** some vector stores have a flag for try deleting the
collection before creating it (such as ´vectorpg´). This is a useful
flag when prototyping indexing pipelines and also for integration tests.
Added the bool flag `pre_delete_collection ` to the constructor (default
False)
  - **Tag maintainer:** @hemidactylus 
  - **Twitter handle:** nicoloboschi

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-03 12:06:20 -08:00
Chelsea E. Manning
2780d2d4dd Extend OpenAIEmbeddings class to support non-tiktoken based embeddings (#13884)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
- **Description:** This extends `OpenAIEmbeddings` to add support for
non-`tiktoken` based embeddings, specifically for use with the new
`text-generation-webui` API (`--extensions openai`) which does not
support `tiktoken` encodings, but rather strings
  - **Issue:** Not found,
- **Dependencies:** HuggingFace `transformers.AutoTokenizer` is new
dependency for running the model without `tiktoken`
- **Tag maintainer:** @baskaryan based on last commit for
`langchain-core` refactor
  - **Twitter handle:** @xychelsea

Modified the tokenization process to be model-agnostic, allowing for
both OpenAI and non-OpenAI model tokenizations, by setting the new
default `bool` flag `tiktoken_enabled` to `False`. This requeires
HuggingFace’s AutoTokenizer and handling tokenization for models
requiring different preprocessing steps to generate a chunked string
request rather than a list of integers.

Updated the embeddings generation process to accommodate non-OpenAI
models. This includes converting tokenized text into embeddings using
OpenAI’s and Hugging Face’s model architectures.
 -->
2023-12-03 12:04:17 -08:00
Changgeng Zhao
9b59bde93d Update Hologres vector store: use hologres-vector (#13767)
Hi,
I made some code changes on the Hologres vector store to improve the
data insertion performance.
Also, this version of the code uses `hologres-vector` library. This
library is more convenient for us to update, and more efficient in
performance.
The code has passed the format/lint/spell check. I have run the unit
test for Hologres connecting to my own database.
Please check this PR again and tell me if anything needs to change.

Best,
Changgeng,
Developer @ Alibaba Cloud

Co-authored-by: Changgeng Zhao <zhaochanggeng.zcg@alibaba-inc.com>
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-03 11:50:45 -08:00
Nicolò Boschi
0de7cf898d Ensure AstraDB integration tests clean up the environment (#13774)
- **Description:** currently astra_db integration tests might leave
orphan collections
  - **Tag maintainer:** @hemidactylus 
  - **Twitter handle:** nicoloboschi
2023-12-03 11:14:42 -08:00
Harrison Chase
7bc4c12477 delete stray test (#14200)
was added to an old path

also im not sure this is even really a test file? which is why i didnt
move it
2023-12-03 11:06:57 -08:00
Leonid Ganeline
283c2994de docs: Hugging Face platform page (#13831)
`Hugging Face` is definitely a platform. It includes many integrations
for many modules (LLM, Embedding, DocumentLoader, Tool)
So, a doc page was added that defines Hugging Face as a platform.
2023-12-03 11:06:43 -08:00
Chad Norvell
8a0951d934 Fix Mathpix PDF loader integration (#13949)
- **Description:** Fixes the Mathpix PDF loader API integration.
Specifically, ensures that Mathpix auth headers are provided for every
request, and ensures that we recognize all errors that can occur during
a request. Also, the option to provide API keys as kwargs never actually
worked before, but now that's fixed too.
  - **Issue:** #11249
  - **Dependencies:** None
2023-12-03 10:36:49 -08:00
gzyJoy
32d4bb4590 Added Slacktoolkit (#14012)
- **Description:** 
This PR introduces the Slack toolkit to LangChain, which allows users to
read and write to Slack using the Slack API. Specifically, we've added
the following tools.
1. get_channel: Provides a summary of all the channels in a workspace.
2. get_message: Gets the message history of a channel.
3. send_message: Sends a message to a channel.
4. schedule_message: Sends a message to a channel at a specific time and
date.

- **Issue:** This pull request addresses [Add Slack Toolkit
#11747](https://github.com/langchain-ai/langchain/issues/11747)
  - **Dependencies:** package`slack_sdk`
Note: For this toolkit to function you will need to add a Slack app to
your workspace. Additional info can be found
[here](https://slack.com/help/articles/202035138-Add-apps-to-your-Slack-workspace).

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: ArianneLavada <ariannelavada@gmail.com>
Co-authored-by: ArianneLavada <84357335+ArianneLavada@users.noreply.github.com>
Co-authored-by: ariannelavada@gmail.com <you@example.com>
2023-12-03 10:25:38 -08:00
Richie
99e5ee6a84 fix(vectorstores): incorrect import for mongodb atlas DriverInfo (#14060)
- **Description:** fix `import` issue for `mongodb atlas` vectore store
integration
  - **Issue:** none
  - **Dependencies:** none

while trying to follow official `langchain`'s [mongodb integration
guide](https://python.langchain.com/docs/integrations/vectorstores/mongodb_atlas),
an import error will happen.

It's caused by incorrect import location:
- `from pymongo import DriverInfo` should be `from pymongo.driver_info
import DriverInfo`
- reference: [pymongo's DriverInfo
class](https://pymongo.readthedocs.io/en/stable/api/pymongo/driver_info.html#pymongo.driver_info.DriverInfo)

Thanks!
2023-12-03 10:22:13 -08:00
ggeutzzang
03d6b94c29 Fix: (issue #14066) DOC: Summarization output broken (#14078)
- **Description:** : As described in the issue below, 
https://python.langchain.com/docs/use_cases/summarization  
I've modified the Python code in the above notebook to perform well. 

I also modified the OpenAI LLM model to the latest version as shown
below.
`gpt-3.5-turbo-16k --> gpt-3.5-turbo-1106`
This is because it seems to be a bit more responsive.
  - **Issue:** : #14066
2023-12-03 10:13:57 -08:00
James Braza
3833882ab7 Removing extra StdOutCallbackHandler overridden methods (#14136)
Unnecessarily overridden methods:

- Give the idea the subclass is doing something special (when it isn't)
- Block CTRL-click to the actual method

This PR removes some unnecessarily overridden methods in
`StdOutCallbackHandler`

Supercedes https://github.com/langchain-ai/langchain/pull/12858
2023-12-03 09:38:49 -08:00
Bob Lin
ac449f186b Update docs to use new usage in openai>1.0.0 (#14163)
### Description

Use new
[APIs](https://github.com/openai/openai-python/blob/main/api.md#finetuning)

### Twitter handle

[lin_bob57617](https://twitter.com/lin_bob57617)
2023-12-03 09:37:35 -08:00
James Braza
052e23be3e Added Python logging tracer (#14190)
This PR creates a logging handler and adds a simple unit test of it

Supercedes https://github.com/langchain-ai/langchain/pull/12862

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-12-03 09:36:30 -08:00
Bob Lin
1ea48a31da Update fallback cases (#14164)
### Description

The `RateLimitError` initialization method has changed after openai v1,
and the usage of `patch` needs to be changed.

### Twitter handle

[lin_bob57617](https://twitter.com/lin_bob57617)
2023-12-03 08:56:07 -08:00
Bob Lin
62505043be Closed #14069 (#14166)
### Description

Fix #14069

### Twitter handle

[lin_bob57617](https://twitter.com/lin_bob57617)
2023-12-03 08:55:25 -08:00
Yong woo Song
9938086df0 Fix Html2TextTransformer for shallow copy (#14197)
<!-- Thank you for contributing to LangChain!

Replace this entire comment with:
  - **Description:** a description of the change, 
  - **Issue:** the issue # it fixes (if applicable),
  - **Dependencies:** any dependencies required for this change,
- **Tag maintainer:** for a quicker response, tag the relevant
maintainer (see below),
- **Twitter handle:** we announce bigger features on Twitter. If your PR
gets announced, and you'd like a mention, we'll gladly shout you out!

Please make sure your PR is passing linting and testing before
submitting. Run `make format`, `make lint` and `make test` to check this
locally.

See contribution guidelines for more information on how to write/run
tests, lint, etc:

https://github.com/langchain-ai/langchain/blob/master/.github/CONTRIBUTING.md

If you're adding a new integration, please include:
1. a test for the integration, preferably unit tests that do not rely on
network access,
2. an example notebook showing its use. It lives in `docs/extras`
directory.

If no one reviews your PR within a few days, please @-mention one of
@baskaryan, @eyurtsev, @hwchase17.
 -->
Hi,
There is some unintended behavior in Html2TextTransformer.
The current code is **directly modifying the original documents that are
passed as arguments to the function.**
Therefore, not only the return of the function but also the input
variables are being modified simultaneously.
**To resolve this, I added unit test code as well.**

reference link: [Shallow vs Deep Copying of Python
Objects](https://realpython.com/copying-python-objects/)

Thanks! ☺️
2023-12-03 08:45:35 -08:00
h3l
818252b1f8 Fix: (issue #14127) Volc Engine MaaS import error (#14194)
- **Description:** fix Volc Engine MaaS import error
- **Issue:** [the issue # it fixes (if
applicable),](https://github.com/langchain-ai/langchain/issues/14127)
  - **Dependencies:** None
  - **Tag maintainer:** @baskaryan 
  - **Twitter handle:**

Co-authored-by: lvzhong <lvzhong@bytedance.com>
2023-12-03 08:43:23 -08:00
Leonid Ganeline
6ae0194dc7 docs: integrations/toolkits/office365 notebook update (#14188)
Added more descriptions and authentication details.
2023-12-03 08:43:00 -08:00
Bagatur
0bdb434383 langchain[patch]: Release langchain 0.0.345 (#14184) 2023-12-02 15:53:49 -08:00
Bagatur
15c04a5670 core[patch]: Release 0.0.9 (#14182) 2023-12-02 14:40:56 -08:00
James Braza
bdb6ae2ed3 core[patch]: BaseTracer helper method for Run lookup (#14139)
I observed the same run ID extraction logic is repeated many times in
`BaseTracer`.

This PR creates a helper method for DRY code.
2023-12-02 14:05:50 -08:00
Harutaka Kawamura
41ee3be95f langchain[patch]: Support passing parameters to llms.Databricks and llms.Mlflow (#14100)
Before, we need to use `params` to pass extra parameters:

```python
from langchain.llms import Databricks

Databricks(..., params={"temperature": 0.0})
```

Now, we can directly specify extra params:

```python
from langchain.llms import Databricks

Databricks(..., temperature=0.0)
```
2023-12-01 19:27:18 -08:00
Abdul
82102c99b3 langchain[patch]: Running SQLDatabaseChain adds prefix "SQLQuery:\n" (#14058)
- **Issue:** https://github.com/langchain-ai/langchain/issues/12077

---------

Co-authored-by: Abdul Kader Maliyakkal <maliyakk@amazon.com>
2023-12-01 19:26:16 -08:00
Samuel Kemp
fd781c89cc langchain[minor]: add azure ai data document loader (#13404)
This PR adds an "Azure AI data" document loader, which allows Azure AI
users to load their registered data assets as a document object in
langchain.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-01 19:25:55 -08:00
James Braza
24385a00de core[minor], langchain[patch], experimental[patch]: Added missing py.typed to langchain_core (#14143)
See PR title.

From what I can see, `poetry` will auto-include this. Please let me know
if I am missing something here.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-01 19:15:23 -08:00
quantum00549
f7c257553d langchain[patch]: fixed a bug that was causing the streaming transfer to not work… (#10827)
… properly

Fixed a bug that was causing the streaming transfer to not work
properly.
 - **Description: 
1、The on_llm_new_token method in the streaming callback can now be
called properly in streaming transfer mode.
2、In streaming transfer mode, LLM can now correctly output the complete
response instead of just the first token.
- **Tag maintainer: @wangxuqi 
- **Twitter handle: @kGX7XJjuYxzX9Km

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-01 18:57:50 -08:00
Eugene Yurtsev
6d0209e0aa Improve file system blob loader and generic loader (#14004)
* Add support for passing a specific file to the file system blob loader
* Allow specifying a class parameter for the parser for the generic
loader

```python

class AudioLoader(GenericLoader):
  @staticmethod
  def get_parser(**kwargs):
     return MyAudioParser(**kwargs):
```

The intent of the GenericLoader is to provide on-ramps from different
sources (e.g., web, s3, file system).

An alternative is to use pipelining syntax or creating a Pipeline

```
FileSystemBlobLoader(...) | MyAudioParser
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-01 21:23:40 -05:00
Erick Friis
700428593a fix broken api docs links (#14154) 2023-12-01 17:17:52 -08:00
Bagatur
340b42d8ee docs[minor]: lcel why page (#14089) 2023-12-01 16:13:31 -08:00
Lance Martin
cbe4753e1a Update Open CLIP embd (#14155)
Prior default model required a large amt of RAM and often crashed
Jupyter ntbk kernel.
2023-12-01 15:13:20 -08:00
Erick Friis
b01d9d27d9 docs[patch]: docs local build (#14152) 2023-12-01 14:03:36 -08:00
Alex Kira
0caef3cde7 Change RunnableMap to RunnableParallel for consistency (#14142)
- **Description:** Change instances of RunnableMap to RunnableParallel,
as that should be the one used going forward. This makes it consistent
across the codebase.
2023-12-01 13:36:40 -08:00
Erick Friis
96f6b90349 templates[patch]: relock templates (#14149) 2023-12-01 13:35:54 -08:00
Martin Jul
e3a7c96a8e docs[patch]: Fix minor typos (casing) in quickstart (#14138)
Fix casing of API and LangChain in the description text for the
LangServe example server.
2023-12-01 13:29:53 -08:00
Erick Friis
8cf4cb9e48 docs[patch]: Fix templates/index (#14146) 2023-12-01 13:09:36 -08:00
Amyh102
b6d26d3f9f infra[patch]: Add unit tests for Huggingface dataset loader (#14053)
- **Description:** Add unit tests for huggingface dataset loader and
sample huggingface dataset for future tests. Updates dependencies for
`datasets` module.
- Adds coverage for [previous pull
request](https://github.com/langchain-ai/langchain/pull/13864)
  - **Tag maintainer:** @hwchase17

---------

Co-authored-by: Amy Han <amyhan@Amys-Air.lan>
Co-authored-by: Amy Han <amyhan@Amys-MacBook-Air.local>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-01 12:42:31 -08:00
Alex Kira
6eb40db353 docs[patch]: Add getting started section to LCEL doc (#14045)
### Description:
Doc addition for LCEL introduction. Adds a more basic starter guide for
using LCEL.

---------

Co-authored-by: Alex Kira <akira@Alexs-MBP.local.tld>
Co-authored-by: Bagatur <baskaryan@gmail.com>
2023-12-01 12:23:43 -08:00
Govinda Totla
62a3473ac0 docs[patch]: add text_splitter.py test (#14025)
Description: Add HTMLHeaderTextSplitter unit test
Dependencies: none
2023-12-01 11:57:50 -08:00
Bagatur
7d5341dbd3 docs[patch]: add contribs to readme (#14137) 2023-12-01 11:34:28 -08:00
axiangcoding
1b36ddf16c docs[patch]: add deprecated note for ErnieChatBot (#14061)
- **Description:** just a little change of ErnieChatBot class
description, sugguesting user to use more suitable class
  - **Issue:** none,
  - **Dependencies:** none,
  - **Tag maintainer:** @baskaryan ,
  - **Twitter handle:** none
2023-12-01 11:16:31 -08:00
Alex Kira
1757258b2a docs[patch]: Add mermaid JS theme dependency to docusaurus (#14051)
- **Description:** Add mermaid JS dependency and configs to
documentation. Allows inline doc diagrams in markdown.
  - **Dependencies:** NPM package @docusaurus/theme-mermaid
2023-12-01 11:06:29 -08:00
Devin Dahoon Kim
32da0a4d71 langchain[patch]: use async_embed_with_retry in _aget_len_safe_embeddings (#14110)
**Description**

`embed_with_retry` is for sync operations and not for async operations.
Use `async_embed_with_retry` for appropriate async operations.


I'm using `OpenAIEmbedding(http_client=httpx.AsyncClient())` with only
async operations.
However, I got an error when I use `embedding.aembed_documents` because
`embed_with_retry` uses sync OpenAI client with async http client.
2023-12-01 10:47:07 -08:00
lijie
371bcb7580 langchain[patch]: set maxsplit when parse python function docstring (#14121)
Description

when the desc of arg in python docstring contains ":", the
`_parse_python_function_docstring` will raise **ValueError: too many
values to unpack (expected 2)**.

A sample desc would be:
"""
Args: 
    error_arg: this is an arg with an additional ":" symbol
"""

So, set `maxsplit` parameter to fix it.
2023-12-01 10:46:53 -08:00
Harrison Chase
ae646701c4 Harrison/ibm (#14133)
Co-authored-by: Mateusz Szewczyk <139469471+MateuszOssGit@users.noreply.github.com>
2023-12-01 12:44:11 -05:00
1833 changed files with 168377 additions and 149975 deletions

View File

@@ -72,9 +72,10 @@ tell Poetry to use the virtualenv python environment (`poetry config virtualenvs
### Core vs. Experimental
This repository contains two separate projects:
This repository contains three separate projects:
- `langchain`: core langchain code, abstractions, and use cases.
- `langchain.experimental`: see the [Experimental README](https://github.com/langchain-ai/langchain/tree/master/libs/experimental/README.md) for more information.
- `langchain_core`: contain interfaces for key abstractions as well as logic for combining them in chains (LCEL).
- `langchain_experimental`: see the [Experimental README](https://github.com/langchain-ai/langchain/tree/master/libs/experimental/README.md) for more information.
Each of these has its own development environment. Docs are run from the top-level makefile, but development
is split across separate test & release flows.
@@ -128,6 +129,24 @@ make docker_tests
There are also [integration tests and code-coverage](https://github.com/langchain-ai/langchain/tree/master/libs/langchain/tests/README.md) available.
### Only develop langchain_core or langchain_experimental
If you are only developing `langchain_core` or `langchain_experimental`, you can simply install the dependencies for the respective projects and run tests:
```bash
cd libs/core
poetry install --with test
make test
```
Or:
```bash
cd libs/experimental
poetry install --with test
make test
```
### Formatting and Linting
Run these locally before submitting a PR; the CI system will check also.

View File

@@ -104,3 +104,7 @@ Please see [here](https://python.langchain.com) for full documentation, which in
As an open-source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infrastructure, or better documentation.
For detailed information on how to contribute, see [here](.github/CONTRIBUTING.md).
## 🌟 Contributors
[![langchain contributors](https://contrib.rocks/image?repo=langchain-ai/langchain&max=2000)](https://github.com/langchain-ai/langchain/graphs/contributors)

View File

@@ -9,13 +9,15 @@ SCRIPT_DIR="$(cd "$(dirname "$0")"; pwd)"
cd "${SCRIPT_DIR}"
mkdir -p ../_dist
rsync -ruv . ../_dist
rsync -ruv --exclude node_modules . ../_dist
cd ../_dist
poetry run python scripts/model_feat_table.py
poetry run nbdoc_build --srcdir docs --pause 0
cp ../cookbook/README.md src/pages/cookbook.mdx
cp ../.github/CONTRIBUTING.md docs/contributing.md
mkdir -p docs/templates
cp ../templates/docs/INDEX.md docs/templates/index.md
wget https://raw.githubusercontent.com/langchain-ai/langserve/main/README.md -O docs/langserve.md
poetry run python scripts/generate_api_reference_links.py
yarn install
yarn start
yarn
quarto preview docs

View File

@@ -1,5 +1,5 @@
---
sidebar_position: 2
sidebar_position: 3
---
# Cookbook

View File

@@ -146,7 +146,7 @@
"source": [
"### Branching and Merging\n",
"\n",
"You may want the output of one component to be processed by 2 or more other components. [RunnableMaps](https://api.python.langchain.com/en/latest/schema/langchain.schema.runnable.base.RunnableMap.html) let you split or fork the chain so multiple components can process the input in parallel. Later, other components can join or merge the results to synthesize a final response. This type of chain creates a computation graph that looks like the following:\n",
"You may want the output of one component to be processed by 2 or more other components. [RunnableParallels](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.base.RunnableParallel.html#langchain_core.runnables.base.RunnableParallel) let you split or fork the chain so multiple components can process the input in parallel. Later, other components can join or merge the results to synthesize a final response. This type of chain creates a computation graph that looks like the following:\n",
"\n",
"```text\n",
" Input\n",

View File

@@ -317,7 +317,7 @@
"source": [
"## Simplifying input\n",
"\n",
"To make invocation even simpler, we can add a `RunnableMap` to take care of creating the prompt input dict for us:"
"To make invocation even simpler, we can add a `RunnableParallel` to take care of creating the prompt input dict for us:"
]
},
{
@@ -327,9 +327,9 @@
"metadata": {},
"outputs": [],
"source": [
"from langchain.schema.runnable import RunnableMap, RunnablePassthrough\n",
"from langchain.schema.runnable import RunnableParallel, RunnablePassthrough\n",
"\n",
"map_ = RunnableMap(foo=RunnablePassthrough())\n",
"map_ = RunnableParallel(foo=RunnablePassthrough())\n",
"chain = (\n",
" map_\n",
" | prompt\n",

View File

@@ -171,7 +171,7 @@
"outputs": [],
"source": [
"from langchain.schema import format_document\n",
"from langchain.schema.runnable import RunnableMap"
"from langchain.schema.runnable import RunnableParallel"
]
},
{
@@ -256,7 +256,7 @@
"metadata": {},
"outputs": [],
"source": [
"_inputs = RunnableMap(\n",
"_inputs = RunnableParallel(\n",
" standalone_question=RunnablePassthrough.assign(\n",
" chat_history=lambda x: _format_chat_history(x[\"chat_history\"])\n",
" )\n",

View File

@@ -0,0 +1,493 @@
{
"cells": [
{
"cell_type": "raw",
"id": "366a0e68-fd67-4fe5-a292-5c33733339ea",
"metadata": {},
"source": [
"---\n",
"sidebar_position: 0\n",
"title: Get started\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "befa7fd1",
"metadata": {},
"source": [
"LCEL makes it easy to build complex chains from basic components, and supports out of the box functionality such as streaming, parallelism, and logging."
]
},
{
"cell_type": "markdown",
"id": "9a9acd2e",
"metadata": {},
"source": [
"## Basic example: prompt + model + output parser\n",
"\n",
"The most basic and common use case is chaining a prompt template and a model together. To see how this works, let's create a chain that takes a topic and generates a joke:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "466b65b3",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Why did the ice cream go to therapy?\\n\\nBecause it had too many toppings and couldn't find its cone-fidence!\""
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"\n",
"prompt = ChatPromptTemplate.from_template(\"tell me a short joke about {topic}\")\n",
"model = ChatOpenAI()\n",
"output_parser = StrOutputParser()\n",
"\n",
"chain = prompt | model | output_parser\n",
"\n",
"chain.invoke({\"topic\": \"ice cream\"})"
]
},
{
"cell_type": "markdown",
"id": "81c502c5-85ee-4f36-aaf4-d6e350b7792f",
"metadata": {},
"source": [
"Notice this line of this code, where we piece together then different components into a single chain using LCEL:\n",
"\n",
"```\n",
"chain = prompt | model | output_parser\n",
"```\n",
"\n",
"The `|` symbol is similar to a [unix pipe operator](https://en.wikipedia.org/wiki/Pipeline_(Unix)), which chains together the different components feeds the output from one component as input into the next component. \n",
"\n",
"In this chain the user input is passed to the prompt template, then the prompt template output is passed to the model, then the model output is passed to the output parser. Let's take a look at each component individually to really understand what's going on. "
]
},
{
"cell_type": "markdown",
"id": "aa1b77fa",
"metadata": {},
"source": [
"### 1. Prompt\n",
"\n",
"`prompt` is a `BasePromptTemplate`, which means it takes in a dictionary of template variables and produces a `PromptValue`. A `PromptValue` is a wrapper around a completed prompt that can be passed to either an `LLM` (which takes a string as input) or `ChatModel` (which takes a sequence of messages as input). It can work with either language model type because it defines logic both for producing `BaseMessage`s and for producing a string."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "b8656990",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ChatPromptValue(messages=[HumanMessage(content='tell me a short joke about ice cream')])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"prompt_value = prompt.invoke({\"topic\": \"ice cream\"})\n",
"prompt_value"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "e6034488",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[HumanMessage(content='tell me a short joke about ice cream')]"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"prompt_value.to_messages()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "60565463",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'Human: tell me a short joke about ice cream'"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"prompt_value.to_string()"
]
},
{
"cell_type": "markdown",
"id": "577f0f76",
"metadata": {},
"source": [
"### 2. Model\n",
"\n",
"The `PromptValue` is then passed to `model`. In this case our `model` is a `ChatModel`, meaning it will output a `BaseMessage`."
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "33cf5f72",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content=\"Why did the ice cream go to therapy? \\n\\nBecause it had too many toppings and couldn't find its cone-fidence!\")"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"message = model.invoke(prompt_value)\n",
"message"
]
},
{
"cell_type": "markdown",
"id": "327e7db8",
"metadata": {},
"source": [
"If our `model` was an `LLM`, it would output a string."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "8feb05da",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'\\n\\nRobot: Why did the ice cream go to therapy? Because it had a rocky road.'"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.llms import OpenAI\n",
"\n",
"llm = OpenAI(model=\"gpt-3.5-turbo-instruct\")\n",
"llm.invoke(prompt_value)"
]
},
{
"cell_type": "markdown",
"id": "91847478",
"metadata": {},
"source": [
"### 3. Output parser\n",
"\n",
"And lastly we pass our `model` output to the `output_parser`, which is a `BaseOutputParser` meaning it takes either a string or a \n",
"`BaseMessage` as input. The `StrOutputParser` specifically simple converts any input into a string."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "533e59a8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"Why did the ice cream go to therapy? \\n\\nBecause it had too many toppings and couldn't find its cone-fidence!\""
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"output_parser.invoke(message)"
]
},
{
"cell_type": "markdown",
"id": "9851e842",
"metadata": {},
"source": [
"### 4. Entire Pipeline\n",
"\n",
"To follow the steps along:\n",
"\n",
"1. We pass in user input on the desired topic as `{\"topic\": \"ice cream\"}`\n",
"2. The `prompt` component takes the user input, which is then used to construct a PromptValue after using the `topic` to construct the prompt. \n",
"3. The `model` component takes the generated prompt, and passes into the OpenAI LLM model for evaluation. The generated output from the model is a `ChatMessage` object. \n",
"4. Finally, the `output_parser` component takes in a `ChatMessage`, and transforms this into a Python string, which is returned from the invoke method. \n"
]
},
{
"cell_type": "markdown",
"id": "c4873109",
"metadata": {},
"source": [
"```mermaid\n",
"graph LR\n",
" A(Input: topic=ice cream) --> |Dict| B(PromptTemplate)\n",
" B -->|PromptValue| C(ChatModel) \n",
" C -->|ChatMessage| D(StrOutputParser)\n",
" D --> |String| F(Result)\n",
"```\n"
]
},
{
"cell_type": "markdown",
"id": "fe63534d",
"metadata": {},
"source": [
":::info\n",
"\n",
"Note that if youre curious about the output of any components, you can always test out a smaller version of the chain such as `prompt` or `prompt | model` to see the intermediate results:\n",
"\n",
":::"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "11089b6f-23f8-474f-97ec-8cae8d0ca6d4",
"metadata": {},
"outputs": [],
"source": [
"input = {\"topic\": \"ice cream\"}\n",
"\n",
"prompt.invoke(input)\n",
"# > ChatPromptValue(messages=[HumanMessage(content='tell me a short joke about ice cream')])\n",
"\n",
"(prompt | model).invoke(input)\n",
"# > AIMessage(content=\"Why did the ice cream go to therapy?\\nBecause it had too many toppings and couldn't cone-trol itself!\")"
]
},
{
"cell_type": "markdown",
"id": "cc7d3b9d-e400-4c9b-9188-f29dac73e6bb",
"metadata": {},
"source": [
"## RAG Search Example\n",
"\n",
"For our next example, we want to run a retrieval-augmented generation chain to add some context when responding to questions. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "662426e8-4316-41dc-8312-9b58edc7e0c9",
"metadata": {},
"outputs": [],
"source": [
"# Requires:\n",
"# pip install langchain docarray\n",
"\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.embeddings import OpenAIEmbeddings\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnableParallel, RunnablePassthrough\n",
"from langchain.vectorstores import DocArrayInMemorySearch\n",
"\n",
"vectorstore = DocArrayInMemorySearch.from_texts(\n",
" [\"harrison worked at kensho\", \"bears like to eat honey\"],\n",
" embedding=OpenAIEmbeddings(),\n",
")\n",
"retriever = vectorstore.as_retriever()\n",
"\n",
"template = \"\"\"Answer the question based only on the following context:\n",
"{context}\n",
"\n",
"Question: {question}\n",
"\"\"\"\n",
"prompt = ChatPromptTemplate.from_template(template)\n",
"model = ChatOpenAI()\n",
"output_parser = StrOutputParser()\n",
"\n",
"setup_and_retrieval = RunnableParallel(\n",
" {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
")\n",
"chain = setup_and_retrieval | prompt | model | output_parser\n",
"\n",
"chain.invoke(\"where did harrison work?\")"
]
},
{
"cell_type": "markdown",
"id": "f0999140-6001-423b-970b-adf1dfdb4dec",
"metadata": {},
"source": [
"In this case, the composed chain is: "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b88e9bb-f04a-4a56-87ec-19a0e6350763",
"metadata": {},
"outputs": [],
"source": [
"chain = setup_and_retrieval | prompt | model | output_parser"
]
},
{
"cell_type": "markdown",
"id": "6e929e15-40a5-4569-8969-384f636cab87",
"metadata": {},
"source": [
"To explain this, we first can see that the prompt template above takes in `context` and `question` as values to be substituted in the prompt. Before building the prompt template, we want to retrieve relevant documents to the search and include them as part of the context. \n",
"\n",
"As a preliminary step, weve setup the retriever using an in memory store, which can retrieve documents based on a query. This is a runnable component as well that can be chained together with other components, but you can also try to run it separately:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a7319ef6-613b-4638-ad7d-4a2183702c1d",
"metadata": {},
"outputs": [],
"source": [
"retriever.invoke(\"where did harrison work?\")"
]
},
{
"cell_type": "markdown",
"id": "e6833844-f1c4-444c-a3d2-31b3c6b31d46",
"metadata": {},
"source": [
"We then use the `RunnableParallel` to prepare the expected inputs into the prompt by using the entries for the retrieved documents as well as the original user question, using the retriever for document search, and RunnablePassthrough to pass the users question:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dcbca26b-d6b9-4c24-806c-1ec8fdaab4ed",
"metadata": {},
"outputs": [],
"source": [
"setup_and_retrieval = RunnableParallel(\n",
" {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
")"
]
},
{
"cell_type": "markdown",
"id": "68c721c1-048b-4a64-9d78-df54fe465992",
"metadata": {},
"source": [
"To review, the complete chain is:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1d5115a7-7b8e-458b-b936-26cc87ee81c4",
"metadata": {},
"outputs": [],
"source": [
"setup_and_retrieval = RunnableParallel(\n",
" {\"context\": retriever, \"question\": RunnablePassthrough()}\n",
")\n",
"chain = setup_and_retrieval | prompt | model | output_parser"
]
},
{
"cell_type": "markdown",
"id": "5c6f5f74-b387-48a0-bedd-1fae202cd10a",
"metadata": {},
"source": [
"With the flow being:\n",
"\n",
"1. The first steps create a `RunnableParallel` object with two entries. The first entry, `context` will include the document results fetched by the retriever. The second entry, `question` will contain the users original question. To pass on the question, we use `RunnablePassthrough` to copy this entry. \n",
"2. Feed the dictionary from the step above to the `prompt` component. It then takes the user input which is `question` as well as the retrieved document which is `context` to construct a prompt and output a PromptValue. \n",
"3. The `model` component takes the generated prompt, and passes into the OpenAI LLM model for evaluation. The generated output from the model is a `ChatMessage` object. \n",
"4. Finally, the `output_parser` component takes in a `ChatMessage`, and transforms this into a Python string, which is returned from the invoke method.\n",
"\n",
"```mermaid\n",
"graph LR\n",
" A(Question) --> B(RunnableParallel)\n",
" B -->|Question| C(Retriever)\n",
" B -->|Question| D(RunnablePassThrough)\n",
" C -->|context=retrieved docs| E(PromptTemplate)\n",
" D -->|question=Question| E\n",
" E -->|PromptValue| F(ChatModel) \n",
" F -->|ChatMessage| G(StrOutputParser)\n",
" G --> |String| H(Result)\n",
"```\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "8c2438df-164e-4bbe-b5f4-461695e45b0f",
"metadata": {},
"source": [
"## Next steps\n",
"\n",
"We recommend reading our [Why use LCEL](/docs/expression_language/why) section next to see a side-by-side comparison of the code needed to produce common functionality with and without LCEL."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -26,7 +26,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "d3e893bf",
"metadata": {},
"outputs": [],
@@ -44,19 +44,24 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "dfdd8bf5",
"metadata": {},
"outputs": [],
"source": [
"from unittest.mock import patch\n",
"\n",
"from openai.error import RateLimitError"
"import httpx\n",
"from openai import RateLimitError\n",
"\n",
"request = httpx.Request(\"GET\", \"/\")\n",
"response = httpx.Response(200, request=request)\n",
"error = RateLimitError(\"rate limit\", response=response, body=\"\")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 3,
"id": "e6fdffc1",
"metadata": {},
"outputs": [],
@@ -69,7 +74,7 @@
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 4,
"id": "584461ab",
"metadata": {},
"outputs": [
@@ -83,7 +88,7 @@
],
"source": [
"# Let's use just the OpenAI LLm first, to show that we run into an error\n",
"with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
"with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
" try:\n",
" print(openai_llm.invoke(\"Why did the chicken cross the road?\"))\n",
" except:\n",
@@ -106,7 +111,7 @@
],
"source": [
"# Now let's try with fallbacks to Anthropic\n",
"with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
"with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
" try:\n",
" print(llm.invoke(\"Why did the chicken cross the road?\"))\n",
" except:\n",
@@ -148,7 +153,7 @@
" ]\n",
")\n",
"chain = prompt | llm\n",
"with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
"with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
" try:\n",
" print(chain.invoke({\"animal\": \"kangaroo\"}))\n",
" except:\n",
@@ -286,7 +291,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
"version": "3.11.5"
}
},
"nbformat": 4,

View File

@@ -82,7 +82,7 @@
"source": [
"## Accepting a Runnable Config\n",
"\n",
"Runnable lambdas can optionally accept a [RunnableConfig](https://api.python.langchain.com/en/latest/schema/langchain.schema.runnable.config.RunnableConfig.html?highlight=runnableconfig#langchain.schema.runnable.config.RunnableConfig), which they can use to pass callbacks, tags, and other configuration information to nested runs."
"Runnable lambdas can optionally accept a [RunnableConfig](https://api.python.langchain.com/en/latest/runnables/langchain_core.runnables.config.RunnableConfig.html#langchain_core.runnables.config.RunnableConfig), which they can use to pass callbacks, tags, and other configuration information to nested runs."
]
},
{

View File

@@ -1,5 +1,5 @@
---
sidebar_position: 1
sidebar_position: 2
---
# How to

View File

@@ -104,7 +104,7 @@
"source": [
"Here the input to prompt is expected to be a map with keys \"context\" and \"question\". The user input is just the question. So we need to get the context using our retriever and passthrough the user input under the \"question\" key.\n",
"\n",
"Note that when composing a RunnableMap with another Runnable we don't even need to wrap our dictionary in the RunnableMap class — the type conversion is handled for us."
"Note that when composing a RunnableParallel with another Runnable we don't even need to wrap our dictionary in the RunnableParallel class — the type conversion is handled for us."
]
},
{
@@ -114,7 +114,7 @@
"source": [
"## Parallelism\n",
"\n",
"RunnableMaps are also useful for running independent processes in parallel, since each Runnable in the map is executed in parallel. For example, we can see our earlier `joke_chain`, `poem_chain` and `map_chain` all have about the same runtime, even though `map_chain` executes both of the other two."
"RunnableParallel are also useful for running independent processes in parallel, since each Runnable in the map is executed in parallel. For example, we can see our earlier `joke_chain`, `poem_chain` and `map_chain` all have about the same runtime, even though `map_chain` executes both of the other two."
]
},
{

View File

@@ -290,9 +290,9 @@
],
"source": [
"from langchain.schema.messages import HumanMessage\n",
"from langchain.schema.runnable import RunnableMap\n",
"from langchain.schema.runnable import RunnableParallel\n",
"\n",
"chain = RunnableMap({\"output_message\": ChatAnthropic(model=\"claude-2\")})\n",
"chain = RunnableParallel({\"output_message\": ChatAnthropic(model=\"claude-2\")})\n",
"chain_with_history = RunnableWithMessageHistory(\n",
" chain,\n",
" lambda session_id: RedisChatMessageHistory(session_id, url=REDIS_URL),\n",

View File

@@ -6,7 +6,7 @@
"metadata": {},
"source": [
"---\n",
"sidebar_position: 0\n",
"sidebar_position: 1\n",
"title: Interface\n",
"---"
]
@@ -16,7 +16,7 @@
"id": "9a9acd2e",
"metadata": {},
"source": [
"To make it as easy as possible to create custom chains, we've implemented a [\"Runnable\"](https://api.python.langchain.com/en/latest/schema/langchain.schema.runnable.base.Runnable.html#langchain.schema.runnable.base.Runnable) protocol. The `Runnable` protocol is implemented for most components. \n",
"To make it as easy as possible to create custom chains, we've implemented a [\"Runnable\"](https://api.python.langchain.com/en/stable/runnables/langchain_core.runnables.base.Runnable.html#langchain_core.runnables.base.Runnable) protocol. The `Runnable` protocol is implemented for most components. \n",
"This is a standard interface, which makes it easy to define custom chains as well as invoke them in a standard way. \n",
"The standard interface includes:\n",
"\n",

File diff suppressed because it is too large Load Diff

View File

@@ -344,7 +344,7 @@ category_chain = chat_prompt | ChatOpenAI() | CommaSeparatedListOutputParser()
app = FastAPI(
title="LangChain Server",
version="1.0",
description="A simple api server using Langchain's Runnable interfaces",
description="A simple API server using LangChain's Runnable interfaces",
)
# 3. Adding chain route

View File

@@ -12,7 +12,7 @@ Platforms with tracing capabilities like [LangSmith](/docs/langsmith/) and [Wand
For anyone building production-grade LLM applications, we highly recommend using a platform like this.
![LangSmith run](/img/run_details.png)
![LangSmith run](../../static/img/run_details.png)
## `set_debug` and `set_verbose`

View File

@@ -28,7 +28,7 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 1,
"id": "d3e893bf",
"metadata": {},
"outputs": [],
@@ -46,19 +46,24 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 2,
"id": "dfdd8bf5",
"metadata": {},
"outputs": [],
"source": [
"from unittest.mock import patch\n",
"\n",
"from openai.error import RateLimitError"
"import httpx\n",
"from openai import RateLimitError\n",
"\n",
"request = httpx.Request(\"GET\", \"/\")\n",
"response = httpx.Response(200, request=request)\n",
"error = RateLimitError(\"rate limit\", response=response, body=\"\")"
]
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 3,
"id": "e6fdffc1",
"metadata": {},
"outputs": [],
@@ -71,7 +76,7 @@
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 4,
"id": "584461ab",
"metadata": {},
"outputs": [
@@ -85,7 +90,7 @@
],
"source": [
"# Let's use just the OpenAI LLm first, to show that we run into an error\n",
"with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
"with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
" try:\n",
" print(openai_llm.invoke(\"Why did the chicken cross the road?\"))\n",
" except:\n",
@@ -108,7 +113,7 @@
],
"source": [
"# Now let's try with fallbacks to Anthropic\n",
"with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
"with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
" try:\n",
" print(llm.invoke(\"Why did the chicken cross the road?\"))\n",
" except:\n",
@@ -150,7 +155,7 @@
" ]\n",
")\n",
"chain = prompt | llm\n",
"with patch(\"openai.ChatCompletion.create\", side_effect=RateLimitError()):\n",
"with patch(\"openai.resources.chat.completions.Completions.create\", side_effect=error):\n",
" try:\n",
" print(chain.invoke({\"animal\": \"kangaroo\"}))\n",
" except:\n",
@@ -431,7 +436,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.11.5"
}
},
"nbformat": 4,

View File

@@ -32,7 +32,7 @@
"1. `Base model`: What is the base-model and how was it trained?\n",
"2. `Fine-tuning approach`: Was the base-model fine-tuned and, if so, what [set of instructions](https://cameronrwolfe.substack.com/p/beyond-llama-the-power-of-open-llms#%C2%A7alpaca-an-instruction-following-llama-model) was used?\n",
"\n",
"![Image description](/img/OSS_LLM_overview.png)\n",
"![Image description](../../static/img/OSS_LLM_overview.png)\n",
"\n",
"The relative performance of these models can be assessed using several leaderboards, including:\n",
"\n",
@@ -55,7 +55,7 @@
"\n",
"In particular, see [this excellent post](https://finbarr.ca/how-is-llama-cpp-possible/) on the importance of quantization.\n",
"\n",
"![Image description](/img/llama-memory-weights.png)\n",
"![Image description](../../static/img/llama-memory-weights.png)\n",
"\n",
"With less precision, we radically decrease the memory needed to store the LLM in memory.\n",
"\n",
@@ -63,7 +63,7 @@
"\n",
"A Mac M2 Max is 5-6x faster than a M1 for inference due to the larger GPU memory bandwidth.\n",
"\n",
"![Image description](/img/llama_t_put.png)\n",
"![Image description](../../static/img/llama_t_put.png)\n",
"\n",
"## Quickstart\n",
"\n",

View File

@@ -60,7 +60,7 @@
"\n",
" Firstly, the wallet contains my credit card with number 4111 1111 1111 1111, which is registered under my name and linked to my bank account, PL61109010140000071219812874.\n",
"\n",
" Additionally, the wallet had a driver's license - DL No: 999000680 issued to my name. It also houses my Social Security Number, 602-76-4532. \n",
" Additionally, the wallet had a driver's license - DL No: 999000680 issued to my name. It also houses my Social Security Number, 602-76-4532.\n",
"\n",
" What's more, I had my polish identity card there, with the number ABC123456.\n",
"\n",
@@ -68,7 +68,7 @@
"\n",
" In case any information arises regarding my wallet, please reach out to me on my phone number, 999-888-7777, or through my personal email, johndoe@example.com.\n",
"\n",
" Please consider this information to be highly confidential and respect my privacy. \n",
" Please consider this information to be highly confidential and respect my privacy.\n",
"\n",
" The bank has been informed about the stolen credit card and necessary actions have been taken from their end. They will be reachable at their official email, support@bankname.com.\n",
" My representative there is Victoria Cherry (her business phone: 987-654-3210).\n",
@@ -667,7 +667,11 @@
"from langchain.chat_models.openai import ChatOpenAI\n",
"from langchain.prompts import ChatPromptTemplate\n",
"from langchain.schema.output_parser import StrOutputParser\n",
"from langchain.schema.runnable import RunnableLambda, RunnableMap, RunnablePassthrough\n",
"from langchain.schema.runnable import (\n",
" RunnableLambda,\n",
" RunnableParallel,\n",
" RunnablePassthrough,\n",
")\n",
"\n",
"# 6. Create anonymizer chain\n",
"template = \"\"\"Answer the question based only on the following context:\n",
@@ -680,7 +684,7 @@
"model = ChatOpenAI(temperature=0.3)\n",
"\n",
"\n",
"_inputs = RunnableMap(\n",
"_inputs = RunnableParallel(\n",
" question=RunnablePassthrough(),\n",
" # It is important to remember about question anonymization\n",
" anonymized_question=RunnableLambda(anonymizer.anonymize),\n",
@@ -882,7 +886,7 @@
"\n",
"\n",
"chain_with_deanonymization = (\n",
" RunnableMap({\"question\": RunnablePassthrough()})\n",
" RunnableParallel({\"question\": RunnablePassthrough()})\n",
" | {\n",
" \"context\": itemgetter(\"question\")\n",
" | retriever\n",

View File

@@ -7,7 +7,9 @@
"source": [
"# Amazon Comprehend Moderation Chain\n",
"\n",
"This notebook shows how to use [Amazon Comprehend](https://aws.amazon.com/comprehend/) to detect and handle `Personally Identifiable Information` (`PII`) and toxicity.\n",
">[Amazon Comprehend](https://aws.amazon.com/comprehend/) is a natural-language processing (NLP) service that uses machine learning to uncover valuable insights and connections in text.\n",
"\n",
"This notebook shows how to use `Amazon Comprehend` to detect and handle `Personally Identifiable Information` (`PII`) and toxicity.\n",
"\n",
"## Setting up"
]
@@ -1417,7 +1419,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,285 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "700a516b",
"metadata": {},
"source": [
"# OpenAI Adapter(Old)\n",
"\n",
"**Please ensure OpenAI library is less than 1.0.0; otherwise, refer to the newer doc [OpenAI Adapter](./openai.ipynb).**\n",
"\n",
"A lot of people get started with OpenAI but want to explore other models. LangChain's integrations with many model providers make this easy to do so. While LangChain has it's own message and model APIs, we've also made it as easy as possible to explore other models by exposing an adapter to adapt LangChain models to the OpenAI api.\n",
"\n",
"At the moment this only deals with output and does not return other information (token counts, stop reasons, etc)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6017f26a",
"metadata": {},
"outputs": [],
"source": [
"import openai\n",
"from langchain.adapters import openai as lc_openai"
]
},
{
"cell_type": "markdown",
"id": "b522ceda",
"metadata": {},
"source": [
"## ChatCompletion.create"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "1d22eb61",
"metadata": {},
"outputs": [],
"source": [
"messages = [{\"role\": \"user\", \"content\": \"hi\"}]"
]
},
{
"cell_type": "markdown",
"id": "d550d3ad",
"metadata": {},
"source": [
"Original OpenAI call"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "012d81ae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'role': 'assistant', 'content': 'Hello! How can I assist you today?'}"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result = openai.ChatCompletion.create(\n",
" messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
")\n",
"result[\"choices\"][0][\"message\"].to_dict_recursive()"
]
},
{
"cell_type": "markdown",
"id": "db5b5500",
"metadata": {},
"source": [
"LangChain OpenAI wrapper call"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "c67a5ac8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'role': 'assistant', 'content': 'Hello! How can I assist you today?'}"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lc_result = lc_openai.ChatCompletion.create(\n",
" messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
")\n",
"lc_result[\"choices\"][0][\"message\"]"
]
},
{
"cell_type": "markdown",
"id": "034ba845",
"metadata": {},
"source": [
"Swapping out model providers"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "f7c94827",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'role': 'assistant', 'content': ' Hello!'}"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lc_result = lc_openai.ChatCompletion.create(\n",
" messages=messages, model=\"claude-2\", temperature=0, provider=\"ChatAnthropic\"\n",
")\n",
"lc_result[\"choices\"][0][\"message\"]"
]
},
{
"cell_type": "markdown",
"id": "cb3f181d",
"metadata": {},
"source": [
"## ChatCompletion.stream"
]
},
{
"cell_type": "markdown",
"id": "f7b8cd18",
"metadata": {},
"source": [
"Original OpenAI call"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "fd8cb1ea",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'role': 'assistant', 'content': ''}\n",
"{'content': 'Hello'}\n",
"{'content': '!'}\n",
"{'content': ' How'}\n",
"{'content': ' can'}\n",
"{'content': ' I'}\n",
"{'content': ' assist'}\n",
"{'content': ' you'}\n",
"{'content': ' today'}\n",
"{'content': '?'}\n",
"{}\n"
]
}
],
"source": [
"for c in openai.ChatCompletion.create(\n",
" messages=messages, model=\"gpt-3.5-turbo\", temperature=0, stream=True\n",
"):\n",
" print(c[\"choices\"][0][\"delta\"].to_dict_recursive())"
]
},
{
"cell_type": "markdown",
"id": "0b2a076b",
"metadata": {},
"source": [
"LangChain OpenAI wrapper call"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "9521218c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'role': 'assistant', 'content': ''}\n",
"{'content': 'Hello'}\n",
"{'content': '!'}\n",
"{'content': ' How'}\n",
"{'content': ' can'}\n",
"{'content': ' I'}\n",
"{'content': ' assist'}\n",
"{'content': ' you'}\n",
"{'content': ' today'}\n",
"{'content': '?'}\n",
"{}\n"
]
}
],
"source": [
"for c in lc_openai.ChatCompletion.create(\n",
" messages=messages, model=\"gpt-3.5-turbo\", temperature=0, stream=True\n",
"):\n",
" print(c[\"choices\"][0][\"delta\"])"
]
},
{
"cell_type": "markdown",
"id": "0fc39750",
"metadata": {},
"source": [
"Swapping out model providers"
]
},
{
"cell_type": "code",
"execution_count": 31,
"id": "68f0214e",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'role': 'assistant', 'content': ' Hello'}\n",
"{'content': '!'}\n",
"{}\n"
]
}
],
"source": [
"for c in lc_openai.ChatCompletion.create(\n",
" messages=messages,\n",
" model=\"claude-2\",\n",
" temperature=0,\n",
" stream=True,\n",
" provider=\"ChatAnthropic\",\n",
"):\n",
" print(c[\"choices\"][0][\"delta\"])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -7,6 +7,8 @@
"source": [
"# OpenAI Adapter\n",
"\n",
"**Please ensure OpenAI library is version 1.0.0 or higher; otherwise, refer to the older doc [OpenAI Adapter(Old)](./openai-old.ipynb).**\n",
"\n",
"A lot of people get started with OpenAI but want to explore other models. LangChain's integrations with many model providers make this easy to do so. While LangChain has it's own message and model APIs, we've also made it as easy as possible to explore other models by exposing an adapter to adapt LangChain models to the OpenAI api.\n",
"\n",
"At the moment this only deals with output and does not return other information (token counts, stop reasons, etc)."
@@ -28,12 +30,12 @@
"id": "b522ceda",
"metadata": {},
"source": [
"## ChatCompletion.create"
"## chat.completions.create"
]
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 2,
"id": "1d22eb61",
"metadata": {},
"outputs": [],
@@ -51,26 +53,29 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 3,
"id": "012d81ae",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'role': 'assistant', 'content': 'Hello! How can I assist you today?'}"
"{'content': 'Hello! How can I assist you today?',\n",
" 'role': 'assistant',\n",
" 'function_call': None,\n",
" 'tool_calls': None}"
]
},
"execution_count": 15,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"result = openai.ChatCompletion.create(\n",
"result = openai.chat.completions.create(\n",
" messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
")\n",
"result[\"choices\"][0][\"message\"].to_dict_recursive()"
"result.choices[0].message.model_dump()"
]
},
{
@@ -83,26 +88,48 @@
},
{
"cell_type": "code",
"execution_count": 17,
"execution_count": 4,
"id": "c67a5ac8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'role': 'assistant', 'content': 'Hello! How can I assist you today?'}"
"{'role': 'assistant', 'content': 'Hello! How can I help you today?'}"
]
},
"execution_count": 17,
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lc_result = lc_openai.ChatCompletion.create(\n",
"lc_result = lc_openai.chat.completions.create(\n",
" messages=messages, model=\"gpt-3.5-turbo\", temperature=0\n",
")\n",
"lc_result[\"choices\"][0][\"message\"]"
"\n",
"lc_result.choices[0].message # Attribute access"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "37a6e461-8608-47f6-ac45-12ad753c062a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'role': 'assistant', 'content': 'Hello! How can I help you today?'}"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lc_result[\"choices\"][0][\"message\"] # Also compatible with index access"
]
},
{
@@ -115,26 +142,26 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 6,
"id": "f7c94827",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'role': 'assistant', 'content': ' Hello!'}"
"{'role': 'assistant', 'content': 'Hello! How can I assist you today?'}"
]
},
"execution_count": 19,
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"lc_result = lc_openai.ChatCompletion.create(\n",
"lc_result = lc_openai.chat.completions.create(\n",
" messages=messages, model=\"claude-2\", temperature=0, provider=\"ChatAnthropic\"\n",
")\n",
"lc_result[\"choices\"][0][\"message\"]"
"lc_result.choices[0].message"
]
},
{
@@ -142,7 +169,7 @@
"id": "cb3f181d",
"metadata": {},
"source": [
"## ChatCompletion.stream"
"## chat.completions.stream"
]
},
{
@@ -155,7 +182,7 @@
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 7,
"id": "fd8cb1ea",
"metadata": {},
"outputs": [
@@ -163,25 +190,25 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'role': 'assistant', 'content': ''}\n",
"{'content': 'Hello'}\n",
"{'content': '!'}\n",
"{'content': ' How'}\n",
"{'content': ' can'}\n",
"{'content': ' I'}\n",
"{'content': ' assist'}\n",
"{'content': ' you'}\n",
"{'content': ' today'}\n",
"{'content': '?'}\n",
"{}\n"
"{'content': '', 'function_call': None, 'role': 'assistant', 'tool_calls': None}\n",
"{'content': 'Hello', 'function_call': None, 'role': None, 'tool_calls': None}\n",
"{'content': '!', 'function_call': None, 'role': None, 'tool_calls': None}\n",
"{'content': ' How', 'function_call': None, 'role': None, 'tool_calls': None}\n",
"{'content': ' can', 'function_call': None, 'role': None, 'tool_calls': None}\n",
"{'content': ' I', 'function_call': None, 'role': None, 'tool_calls': None}\n",
"{'content': ' assist', 'function_call': None, 'role': None, 'tool_calls': None}\n",
"{'content': ' you', 'function_call': None, 'role': None, 'tool_calls': None}\n",
"{'content': ' today', 'function_call': None, 'role': None, 'tool_calls': None}\n",
"{'content': '?', 'function_call': None, 'role': None, 'tool_calls': None}\n",
"{'content': None, 'function_call': None, 'role': None, 'tool_calls': None}\n"
]
}
],
"source": [
"for c in openai.ChatCompletion.create(\n",
"for c in openai.chat.completions.create(\n",
" messages=messages, model=\"gpt-3.5-turbo\", temperature=0, stream=True\n",
"):\n",
" print(c[\"choices\"][0][\"delta\"].to_dict_recursive())"
" print(c.choices[0].delta.model_dump())"
]
},
{
@@ -194,7 +221,7 @@
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 8,
"id": "9521218c",
"metadata": {},
"outputs": [
@@ -217,10 +244,10 @@
}
],
"source": [
"for c in lc_openai.ChatCompletion.create(\n",
"for c in lc_openai.chat.completions.create(\n",
" messages=messages, model=\"gpt-3.5-turbo\", temperature=0, stream=True\n",
"):\n",
" print(c[\"choices\"][0][\"delta\"])"
" print(c.choices[0].delta)"
]
},
{
@@ -233,7 +260,7 @@
},
{
"cell_type": "code",
"execution_count": 31,
"execution_count": 9,
"id": "68f0214e",
"metadata": {},
"outputs": [
@@ -241,14 +268,22 @@
"name": "stdout",
"output_type": "stream",
"text": [
"{'role': 'assistant', 'content': ' Hello'}\n",
"{'role': 'assistant', 'content': ''}\n",
"{'content': 'Hello'}\n",
"{'content': '!'}\n",
"{'content': ' How'}\n",
"{'content': ' can'}\n",
"{'content': ' I'}\n",
"{'content': ' assist'}\n",
"{'content': ' you'}\n",
"{'content': ' today'}\n",
"{'content': '?'}\n",
"{}\n"
]
}
],
"source": [
"for c in lc_openai.ChatCompletion.create(\n",
"for c in lc_openai.chat.completions.create(\n",
" messages=messages,\n",
" model=\"claude-2\",\n",
" temperature=0,\n",
@@ -275,7 +310,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.11.5"
}
},
"nbformat": 4,

View File

@@ -1,11 +1,21 @@
{
"cells": [
{
"cell_type": "raw",
"id": "a016701c",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Anthropic\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "bf733a38-db84-4363-89e2-de6735c37230",
"metadata": {},
"source": [
"# Anthropic\n",
"# ChatAnthropic\n",
"\n",
"This notebook covers how to get started with Anthropic chat models."
]

View File

@@ -1,12 +1,22 @@
{
"cells": [
{
"cell_type": "raw",
"id": "31895fc4",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Anyscale\n",
"---"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "642fd21c-600a-47a1-be96-6e1438b421a9",
"metadata": {},
"source": [
"# Anyscale\n",
"# ChatAnyscale\n",
"\n",
"This notebook demonstrates the use of `langchain.chat_models.ChatAnyscale` for [Anyscale Endpoints](https://endpoints.anyscale.com/).\n",
"\n",
@@ -33,7 +43,7 @@
"metadata": {},
"outputs": [
{
"name": "stdin",
"name": "stdout",
"output_type": "stream",
"text": [
" ········\n"

View File

@@ -1,11 +1,21 @@
{
"cells": [
{
"cell_type": "raw",
"id": "641f8cb0",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Azure OpenAI\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "38f26d7a",
"metadata": {},
"source": [
"# Azure OpenAI\n",
"# AzureChatOpenAI\n",
"\n",
">[Azure OpenAI Service](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview) provides REST API access to OpenAI's powerful language models including the GPT-4, GPT-3.5-Turbo, and Embeddings model series. These models can be easily adapted to your specific task including but not limited to content generation, summarization, semantic search, and natural language to code translation. Users can access the service through REST APIs, Python SDK, or a web-based interface in the Azure OpenAI Studio.\n",
"\n",

View File

@@ -1,10 +1,19 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Azure ML Endpoint\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure ML Endpoint\n",
"# AzureMLChatOnlineEndpoint\n",
"\n",
">[Azure Machine Learning](https://azure.microsoft.com/en-us/products/machine-learning/) is a platform used to build, train, and deploy machine learning models. Users can explore the types of models to deploy in the Model Catalog, which provides Azure Foundation Models and OpenAI Models. `Azure Foundation Models` include various open-source models and popular Hugging Face models. Users can also import models of their liking into AzureML.\n",
">\n",

View File

@@ -1,10 +1,19 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Baichuan Chat\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Baichuan Chat\n",
"# ChatBaichuan\n",
"\n",
"Baichuan chat models API by Baichuan Intelligent Technology. For more information, see [https://platform.baichuan-ai.com/docs/api](https://platform.baichuan-ai.com/docs/api)"
]
@@ -63,7 +72,9 @@
"outputs": [
{
"data": {
"text/plain": "AIMessage(content='首先我们需要确定闰年的二月有多少天。闰年的二月有29天。\\n\\n然后我们可以计算你的月薪\\n\\n日薪 = 月薪 / (当月天数)\\n\\n所以你的月薪 = 日薪 * 当月天数\\n\\n将数值代入公式\\n\\n月薪 = 8元/天 * 29天 = 232元\\n\\n因此你在闰年的二月的月薪是232元。')"
"text/plain": [
"AIMessage(content='首先我们需要确定闰年的二月有多少天。闰年的二月有29天。\\n\\n然后我们可以计算你的月薪\\n\\n日薪 = 月薪 / (当月天数)\\n\\n所以你的月薪 = 日薪 * 当月天数\\n\\n将数值代入公式\\n\\n月薪 = 8元/天 * 29天 = 232元\\n\\n因此你在闰年的二月的月薪是232元。')"
]
},
"execution_count": 3,
"metadata": {},
@@ -76,16 +87,23 @@
},
{
"cell_type": "markdown",
"source": [
"## For ChatBaichuan with Streaming"
],
"metadata": {
"collapsed": false
}
},
"source": [
"## For ChatBaichuan with Streaming"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"ExecuteTime": {
"end_time": "2023-10-17T15:14:25.870044Z",
"start_time": "2023-10-17T15:14:25.863381Z"
},
"collapsed": false
},
"outputs": [],
"source": [
"chat = ChatBaichuan(\n",
@@ -93,22 +111,24 @@
" baichuan_secret_key=\"YOUR_SECRET_KEY\",\n",
" streaming=True,\n",
")"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-17T15:14:25.870044Z",
"start_time": "2023-10-17T15:14:25.863381Z"
}
}
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"ExecuteTime": {
"end_time": "2023-10-17T15:14:27.153546Z",
"start_time": "2023-10-17T15:14:25.868470Z"
},
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": "AIMessageChunk(content='首先我们需要确定闰年的二月有多少天。闰年的二月有29天。\\n\\n然后我们可以计算你的月薪\\n\\n日薪 = 月薪 / (当月天数)\\n\\n所以你的月薪 = 日薪 * 当月天数\\n\\n将数值代入公式\\n\\n月薪 = 8元/天 * 29天 = 232元\\n\\n因此你在闰年的二月的月薪是232元。')"
"text/plain": [
"AIMessageChunk(content='首先我们需要确定闰年的二月有多少天。闰年的二月有29天。\\n\\n然后我们可以计算你的月薪\\n\\n日薪 = 月薪 / (当月天数)\\n\\n所以你的月薪 = 日薪 * 当月天数\\n\\n将数值代入公式\\n\\n月薪 = 8元/天 * 29天 = 232元\\n\\n因此你在闰年的二月的月薪是232元。')"
]
},
"execution_count": 6,
"metadata": {},
@@ -117,14 +137,7 @@
],
"source": [
"chat([HumanMessage(content=\"我日薪8块钱请问在闰年的二月我月薪多少\")])"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-17T15:14:27.153546Z",
"start_time": "2023-10-17T15:14:25.868470Z"
}
}
]
}
],
"metadata": {

View File

@@ -1,11 +1,20 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Baidu Qianfan\n",
"---"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Baidu Qianfan\n",
"# QianfanChatEndpoint\n",
"\n",
"Baidu AI Cloud Qianfan Platform is a one-stop large model development and service operation platform for enterprise developers. Qianfan not only provides including the model of Wenxin Yiyan (ERNIE-Bot) and the third-party open-source models, but also provides various AI development tools and the whole set of development environment, which facilitates customers to use and develop large model applications easily.\n",
"\n",

View File

@@ -1,13 +1,31 @@
{
"cells": [
{
"cell_type": "raw",
"id": "fbc66410",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Bedrock Chat\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "bf733a38-db84-4363-89e2-de6735c37230",
"metadata": {},
"source": [
"# Bedrock Chat\n",
"# BedrockChat\n",
"\n",
"[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that makes FMs from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case"
">[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that offers a choice of \n",
"> high-performing foundation models (FMs) from leading AI companies like `AI21 Labs`, `Anthropic`, `Cohere`, \n",
"> `Meta`, `Stability AI`, and `Amazon` via a single API, along with a broad set of capabilities you need to \n",
"> build generative AI applications with security, privacy, and responsible AI. Using `Amazon Bedrock`, \n",
"> you can easily experiment with and evaluate top FMs for your use case, privately customize them with \n",
"> your data using techniques such as fine-tuning and `Retrieval Augmented Generation` (`RAG`), and build \n",
"> agents that execute tasks using your enterprise systems and data sources. Since `Amazon Bedrock` is \n",
"> serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy \n",
"> generative AI capabilities into your applications using the AWS services you are already familiar with.\n"
]
},
{
@@ -131,7 +149,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -1,11 +1,21 @@
{
"cells": [
{
"cell_type": "raw",
"id": "53fbf15f",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Cohere\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "bf733a38-db84-4363-89e2-de6735c37230",
"metadata": {},
"source": [
"# Cohere\n",
"# ChatCohere\n",
"\n",
"This notebook covers how to get started with Cohere chat models."
]

View File

@@ -1,13 +1,34 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Ernie Bot Chat\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ERNIE-Bot Chat\n",
"# ErnieBotChat\n",
"\n",
"[ERNIE-Bot](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/jlil56u11) is a large language model developed by Baidu, covering a huge amount of Chinese data.\n",
"This notebook covers how to get started with ErnieBot chat models."
"This notebook covers how to get started with ErnieBot chat models.\n",
"\n",
"**Note:** We recommend users using this class to switch to [Baidu Qianfan](./baidu_qianfan_endpoint). they are 3 why we recommend users to use `QianfanChatEndpoint`:\n",
"1. `QianfanChatEndpoint` support more LLM in the Qianfan platform.\n",
"2. `QianfanChatEndpoint` support streaming mode.\n",
"3. `QianfanChatEndpoint` support function calling usgage.\n",
"\n",
"Some tips for migration:\n",
"- change `ernie_client_id` to `qianfan_ak`, also change `ernie_client_secret` to `qianfan_sk`.\n",
"- install `qianfan` package. \n",
" ```\n",
" pip install qianfan\n",
" ```"
]
},
{

View File

@@ -1,11 +1,21 @@
{
"cells": [
{
"cell_type": "raw",
"id": "5e45f35c",
"metadata": {},
"source": [
"---\n",
"sidebar_label: EverlyAI\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "642fd21c-600a-47a1-be96-6e1438b421a9",
"metadata": {},
"source": [
"# EverlyAI\n",
"# ChatEverlyAI\n",
"\n",
">[EverlyAI](https://everlyai.xyz) allows you to run your ML models at scale in the cloud. It also provides API access to [several LLM models](https://everlyai.xyz).\n",
"\n",

View File

@@ -1,12 +1,22 @@
{
"cells": [
{
"cell_type": "raw",
"id": "529aeba9",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Fireworks\n",
"---"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "642fd21c-600a-47a1-be96-6e1438b421a9",
"metadata": {},
"source": [
"# Fireworks\n",
"# ChatFireworks\n",
"\n",
">[Fireworks](https://app.fireworks.ai/) accelerates product development on generative AI by creating an innovative AI experiment and production platform. \n",
"\n",

View File

@@ -1,11 +1,20 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Google Cloud Vertex AI\n",
"---"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Google Cloud Vertex AI \n",
"# ChatVertexAI\n",
"\n",
"Note: This is separate from the Google PaLM integration. Google has chosen to offer an enterprise version of PaLM through GCP, and this supports the models made available through there. \n",
"\n",

View File

@@ -1,10 +1,19 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Tencent Hunyuan\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tencent Hunyuan\n",
"# ChatHunyuan\n",
"\n",
"Hunyuan chat model API by Tencent. For more information, see [https://cloud.tencent.com/document/product/1729](https://cloud.tencent.com/document/product/1729)"
]
@@ -54,7 +63,9 @@
"outputs": [
{
"data": {
"text/plain": "AIMessage(content=\"J'aime programmer.\")"
"text/plain": [
"AIMessage(content=\"J'aime programmer.\")"
]
},
"execution_count": 3,
"metadata": {},
@@ -73,16 +84,23 @@
},
{
"cell_type": "markdown",
"source": [
"## For ChatHunyuan with Streaming"
],
"metadata": {
"collapsed": false
}
},
"source": [
"## For ChatHunyuan with Streaming"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"ExecuteTime": {
"end_time": "2023-10-19T10:20:41.507720Z",
"start_time": "2023-10-19T10:20:41.496456Z"
},
"collapsed": false
},
"outputs": [],
"source": [
"chat = ChatHunyuan(\n",
@@ -91,22 +109,24 @@
" hunyuan_secret_key=\"YOUR_SECRET_KEY\",\n",
" streaming=True,\n",
")"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-19T10:20:41.507720Z",
"start_time": "2023-10-19T10:20:41.496456Z"
}
}
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"ExecuteTime": {
"end_time": "2023-10-19T10:20:46.275673Z",
"start_time": "2023-10-19T10:20:44.241097Z"
},
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": "AIMessageChunk(content=\"J'aime programmer.\")"
"text/plain": [
"AIMessageChunk(content=\"J'aime programmer.\")"
]
},
"execution_count": 3,
"metadata": {},
@@ -121,26 +141,19 @@
" )\n",
" ]\n",
")"
],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"end_time": "2023-10-19T10:20:46.275673Z",
"start_time": "2023-10-19T10:20:44.241097Z"
}
}
]
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [],
"metadata": {
"collapsed": false,
"ExecuteTime": {
"start_time": "2023-10-19T10:19:56.233477Z"
}
}
},
"collapsed": false
},
"outputs": [],
"source": []
}
],
"metadata": {

View File

@@ -1,10 +1,19 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Konko\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Konko\n",
"# ChatKonko\n",
"\n",
">[Konko](https://www.konko.ai/) API is a fully managed Web API designed to help application developers:\n",
"\n",

View File

@@ -1,12 +1,22 @@
{
"cells": [
{
"cell_type": "raw",
"id": "59148044",
"metadata": {},
"source": [
"---\n",
"sidebar_label: LiteLLM\n",
"---"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "bf733a38-db84-4363-89e2-de6735c37230",
"metadata": {},
"source": [
"# 🚅 LiteLLM\n",
"# ChatLiteLLM\n",
"\n",
"[LiteLLM](https://github.com/BerriAI/litellm) is a library that simplifies calling Anthropic, Azure, Huggingface, Replicate, etc. \n",
"\n",

View File

@@ -1,11 +1,21 @@
{
"cells": [
{
"cell_type": "raw",
"id": "7320f16b",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Llama 2 Chat\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "90a1faf2",
"metadata": {},
"source": [
"# Llama-2 Chat\n",
"# Llama2Chat\n",
"\n",
"This notebook shows how to augment Llama-2 `LLM`s with the `Llama2Chat` wrapper to support the [Llama-2 chat prompt format](https://huggingface.co/blog/llama2#how-to-prompt-llama-2). Several `LLM` implementations in LangChain can be used as interface to Llama-2 chat models. These include [HuggingFaceTextGenInference](https://python.langchain.com/docs/integrations/llms/huggingface_textgen_inference), [LlamaCpp](https://python.langchain.com/docs/use_cases/question_answering/how_to/local_retrieval_qa), [GPT4All](https://python.langchain.com/docs/integrations/llms/gpt4all), ..., to mention a few examples. \n",
"\n",
@@ -721,7 +731,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
"version": "3.11.4"
}
},
"nbformat": 4,

View File

@@ -1,11 +1,21 @@
{
"cells": [
{
"cell_type": "raw",
"id": "71b5cfca",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Llama API\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "90a1faf2",
"metadata": {},
"source": [
"# Llama API\n",
"# ChatLlamaAPI\n",
"\n",
"This notebook shows how to use LangChain with [LlamaAPI](https://llama-api.com/) - a hosted version of Llama2 that adds in support for function calling."
]

View File

@@ -1,11 +1,20 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: MiniMax\n",
"---"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# MiniMax\n",
"# MiniMaxChat\n",
"\n",
"[Minimax](https://api.minimax.chat) is a Chinese startup that provides LLM service for companies and individuals.\n",
"\n",

View File

@@ -1,10 +1,19 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Ollama\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Ollama\n",
"# ChatOllama\n",
"\n",
"[Ollama](https://ollama.ai/) allows you to run open-source large language models, such as LLaMA2, locally.\n",
"\n",

View File

@@ -1,10 +1,19 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Ollama Functions\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Ollama Functions\n",
"# OllamaFunctions\n",
"\n",
"This notebook shows how to use an experimental wrapper around Ollama that gives it the same API as OpenAI Functions.\n",
"\n",

View File

@@ -1,11 +1,21 @@
{
"cells": [
{
"cell_type": "raw",
"id": "afaf8039",
"metadata": {},
"source": [
"---\n",
"sidebar_label: OpenAI\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "e49f1e0d",
"metadata": {},
"source": [
"# OpenAI\n",
"# ChatOpenAI\n",
"\n",
"This notebook covers how to get started with OpenAI chat models."
]

View File

@@ -1,10 +1,19 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: AliCloud PAI EAS\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# AliCloud PAI EAS\n",
"# PaiEasChatEndpoint\n",
"Machine Learning Platform for AI of Alibaba Cloud is a machine learning or deep learning engineering platform intended for enterprises and developers. It provides easy-to-use, cost-effective, high-performance, and easy-to-scale plug-ins that can be applied to various industry scenarios. With over 140 built-in optimization algorithms, Machine Learning Platform for AI provides whole-process AI engineering capabilities including data labeling (PAI-iTAG), model building (PAI-Designer and PAI-DSW), model training (PAI-DLC), compilation optimization, and inference deployment (PAI-EAS). PAI-EAS supports different types of hardware resources, including CPUs and GPUs, and features high throughput and low latency. It allows you to deploy large-scale complex models with a few clicks and perform elastic scale-ins and scale-outs in real time. It also provides a comprehensive O&M and monitoring system."
]
},

View File

@@ -1,12 +1,22 @@
{
"cells": [
{
"cell_type": "raw",
"id": "ce3672d3",
"metadata": {},
"source": [
"---\n",
"sidebar_label: PromptLayer ChatOpenAI\n",
"---"
]
},
{
"attachments": {},
"cell_type": "markdown",
"id": "959300d4",
"metadata": {},
"source": [
"# PromptLayer ChatOpenAI\n",
"# PromptLayerChatOpenAI\n",
"\n",
"This example showcases how to connect to [PromptLayer](https://www.promptlayer.com) to start recording your ChatOpenAI requests."
]

View File

@@ -1,5 +1,14 @@
{
"cells": [
{
"cell_type": "raw",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Tongyi Qwen\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {
@@ -9,7 +18,7 @@
}
},
"source": [
"# Tongyi Qwen\n",
"# ChatTongyi\n",
"Tongyi Qwen is a large language model developed by Alibaba's Damo Academy. It is capable of understanding user intent through natural language understanding and semantic analysis, based on user input in natural language. It provides services and assistance to users in different domains and tasks. By providing clear and detailed instructions, you can obtain results that better align with your expectations.\n",
"In this notebook, we will introduce how to use langchain with [Tongyi](https://www.aliyun.com/product/dashscope) mainly in `Chat` corresponding\n",
" to the package `langchain/chat_models` in langchain"
@@ -41,7 +50,7 @@
},
"outputs": [
{
"name": "stdin",
"name": "stdout",
"output_type": "stream",
"text": [
" ········\n"

View File

@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "raw",
"id": "eb65deaa",
"metadata": {},
"source": [
"---\n",
"sidebar_label: vLLM Chat\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "eb7e5679-aa06-47e4-a1a3-b6b70e604017",

View File

@@ -1,5 +1,15 @@
{
"cells": [
{
"cell_type": "raw",
"id": "66107bdd",
"metadata": {},
"source": [
"---\n",
"sidebar_label: Volc Enging Maas\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "404758628c7b20f6",
@@ -7,7 +17,7 @@
"collapsed": false
},
"source": [
"# Volc Engine Maas\n",
"# VolcEngineMaasChat\n",
"\n",
"This notebook provides you with a guide on how to get started with volc engine maas chat models."
]
@@ -86,7 +96,9 @@
"outputs": [
{
"data": {
"text/plain": "AIMessage(content='好的,这是一个笑话:\\n\\n为什么鸟儿不会玩电脑游戏\\n\\n因为它们没有翅膀')"
"text/plain": [
"AIMessage(content='好的,这是一个笑话:\\n\\n为什么鸟儿不会玩电脑游戏\\n\\n因为它们没有翅膀')"
]
},
"execution_count": 26,
"metadata": {},
@@ -141,7 +153,9 @@
"outputs": [
{
"data": {
"text/plain": "AIMessage(content='好的,这是一个笑话:\\n\\n三岁的女儿说她会造句了妈妈让她用“年轻”造句女儿说“妈妈减肥一年轻了好几斤”。')"
"text/plain": [
"AIMessage(content='好的,这是一个笑话:\\n\\n三岁的女儿说她会造句了妈妈让她用“年轻”造句女儿说“妈妈减肥一年轻了好几斤”。')"
]
},
"execution_count": 28,
"metadata": {},

View File

@@ -1,11 +1,21 @@
{
"cells": [
{
"cell_type": "raw",
"id": "b4154fbe",
"metadata": {},
"source": [
"---\n",
"sidebar_label: YandexGPT\n",
"---"
]
},
{
"cell_type": "markdown",
"id": "af63c9db-e4bd-4d3b-a4d7-7927f5541734",
"metadata": {},
"source": [
"# YandexGPT\n",
"# ChatYandexGPT\n",
"\n",
"This notebook goes over how to use Langchain with [YandexGPT](https://cloud.yandex.com/en/services/yandexgpt) chat model.\n",
"\n",

View File

@@ -243,16 +243,16 @@
" my_file.write((json.dumps({\"messages\": m}) + \"\\n\").encode(\"utf-8\"))\n",
"\n",
"my_file.seek(0)\n",
"training_file = openai.File.create(file=my_file, purpose=\"fine-tune\")\n",
"training_file = openai.files.create(file=my_file, purpose=\"fine-tune\")\n",
"\n",
"# OpenAI audits each training file for compliance reasons.\n",
"# This make take a few minutes\n",
"status = openai.File.retrieve(training_file.id).status\n",
"status = openai.files.retrieve(training_file.id).status\n",
"start_time = time.time()\n",
"while status != \"processed\":\n",
" print(f\"Status=[{status}]... {time.time() - start_time:.2f}s\", end=\"\\r\", flush=True)\n",
" time.sleep(5)\n",
" status = openai.File.retrieve(training_file.id).status\n",
" status = openai.files.retrieve(training_file.id).status\n",
"print(f\"File {training_file.id} ready after {time.time() - start_time:.2f} seconds.\")"
]
},
@@ -271,7 +271,7 @@
"metadata": {},
"outputs": [],
"source": [
"job = openai.FineTuningJob.create(\n",
"job = openai.fine_tuning.jobs.create(\n",
" training_file=training_file.id,\n",
" model=\"gpt-3.5-turbo\",\n",
")"
@@ -300,12 +300,12 @@
}
],
"source": [
"status = openai.FineTuningJob.retrieve(job.id).status\n",
"status = openai.fine_tuning.jobs.retrieve(job.id).status\n",
"start_time = time.time()\n",
"while status != \"succeeded\":\n",
" print(f\"Status=[{status}]... {time.time() - start_time:.2f}s\", end=\"\\r\", flush=True)\n",
" time.sleep(5)\n",
" job = openai.FineTuningJob.retrieve(job.id)\n",
" job = openai.fine_tuning.jobs.retrieve(job.id)\n",
" status = job.status"
]
},
@@ -416,7 +416,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.2"
"version": "3.11.5"
}
},
"nbformat": 4,

View File

@@ -123,7 +123,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 6,
"id": "817bc077-c18a-473b-94a4-a7d810d583a8",
"metadata": {},
"outputs": [],
@@ -145,7 +145,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 7,
"id": "9e5ac127-b094-4584-9159-5a6d3d7315c7",
"metadata": {},
"outputs": [],
@@ -166,7 +166,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 8,
"id": "11d19e28-be49-4801-8065-1a58d13cd192",
"metadata": {},
"outputs": [
@@ -174,7 +174,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Status=[running]... 302.42s. 143.85s\r"
"Status=[running]... 429.55s. 46.34s\r"
]
}
],
@@ -190,20 +190,20 @@
" my_file.write((json.dumps({\"messages\": dialog}) + \"\\n\").encode(\"utf-8\"))\n",
"\n",
"my_file.seek(0)\n",
"training_file = openai.File.create(file=my_file, purpose=\"fine-tune\")\n",
"training_file = openai.files.create(file=my_file, purpose=\"fine-tune\")\n",
"\n",
"job = openai.FineTuningJob.create(\n",
"job = openai.fine_tuning.jobs.create(\n",
" training_file=training_file.id,\n",
" model=\"gpt-3.5-turbo\",\n",
")\n",
"\n",
"# Wait for the fine-tuning to complete (this may take some time)\n",
"status = openai.FineTuningJob.retrieve(job.id).status\n",
"status = openai.fine_tuning.jobs.retrieve(job.id).status\n",
"start_time = time.time()\n",
"while status != \"succeeded\":\n",
" print(f\"Status=[{status}]... {time.time() - start_time:.2f}s\", end=\"\\r\", flush=True)\n",
" time.sleep(5)\n",
" status = openai.FineTuningJob.retrieve(job.id).status\n",
" status = openai.fine_tuning.jobs.retrieve(job.id).status\n",
"\n",
"# Now your model is fine-tuned!"
]
@@ -220,16 +220,18 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 10,
"id": "3f472ca4-fa9b-485d-bd37-8ce3c59c44db",
"metadata": {},
"outputs": [],
"source": [
"# Get the fine-tuned model ID\n",
"job = openai.FineTuningJob.retrieve(job.id)\n",
"job = openai.fine_tuning.jobs.retrieve(job.id)\n",
"model_id = job.fine_tuned_model\n",
"\n",
"# Use the fine-tuned model in LangChain\n",
"from langchain.chat_models import ChatOpenAI\n",
"\n",
"model = ChatOpenAI(\n",
" model=model_id,\n",
" temperature=1,\n",
@@ -238,10 +240,21 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 11,
"id": "7d3b5845-6385-42d1-9f7d-5ea798dc2cd9",
"metadata": {},
"outputs": [],
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='[{\"s\": \"There were three ravens\", \"object\": \"tree\", \"relation\": \"sat on\"}, {\"s\": \"three ravens\", \"object\": \"a tree\", \"relation\": \"sat on\"}]')"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"model.invoke(\"There were three ravens sat on a tree.\")"
]
@@ -271,7 +284,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.5"
}
},
"nbformat": 4,

View File

@@ -35,7 +35,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": 1,
"id": "473adce5-c863-49e6-85c3-049e0ec2222e",
"metadata": {},
"outputs": [],
@@ -65,7 +65,7 @@
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 2,
"id": "9a36d27f-2f3b-4148-b94a-9436fe8b00e0",
"metadata": {},
"outputs": [],
@@ -105,7 +105,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 3,
"id": "89bcc676-27e8-40dc-a4d6-92cf28e0db58",
"metadata": {},
"outputs": [
@@ -144,7 +144,7 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": 4,
"id": "cd44ff01-22cf-431a-8bf4-29a758d1fcff",
"metadata": {},
"outputs": [],
@@ -169,18 +169,10 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 5,
"id": "62da7d8f-5cfc-45a6-946e-2bcda2b0ba1f",
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Retrying langchain.chat_models.openai.ChatOpenAI.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised ServiceUnavailableError: The server is overloaded or not ready yet..\n"
]
}
],
"outputs": [],
"source": [
"math_questions = [\n",
" \"What's 45/9?\",\n",
@@ -219,7 +211,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 6,
"id": "d6037992-050d-4ada-a061-860c124f0bf1",
"metadata": {},
"outputs": [],
@@ -231,7 +223,7 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 7,
"id": "0444919a-6f5a-4726-9916-4603b1420d0e",
"metadata": {},
"outputs": [],
@@ -266,7 +258,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": 8,
"id": "817bc077-c18a-473b-94a4-a7d810d583a8",
"metadata": {},
"outputs": [],
@@ -288,7 +280,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 9,
"id": "9e5ac127-b094-4584-9159-5a6d3d7315c7",
"metadata": {},
"outputs": [],
@@ -309,7 +301,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 10,
"id": "11d19e28-be49-4801-8065-1a58d13cd192",
"metadata": {},
"outputs": [
@@ -317,7 +309,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Status=[running]... 346.26s. 31.70s\r"
"Status=[running]... 349.84s. 17.72s\r"
]
}
],
@@ -333,20 +325,20 @@
" my_file.write((json.dumps({\"messages\": dialog}) + \"\\n\").encode(\"utf-8\"))\n",
"\n",
"my_file.seek(0)\n",
"training_file = openai.File.create(file=my_file, purpose=\"fine-tune\")\n",
"training_file = openai.files.create(file=my_file, purpose=\"fine-tune\")\n",
"\n",
"job = openai.FineTuningJob.create(\n",
"job = openai.fine_tuning.jobs.create(\n",
" training_file=training_file.id,\n",
" model=\"gpt-3.5-turbo\",\n",
")\n",
"\n",
"# Wait for the fine-tuning to complete (this may take some time)\n",
"status = openai.FineTuningJob.retrieve(job.id).status\n",
"status = openai.fine_tuning.jobs.retrieve(job.id).status\n",
"start_time = time.time()\n",
"while status != \"succeeded\":\n",
" print(f\"Status=[{status}]... {time.time() - start_time:.2f}s\", end=\"\\r\", flush=True)\n",
" time.sleep(5)\n",
" status = openai.FineTuningJob.retrieve(job.id).status\n",
" status = openai.fine_tuning.jobs.retrieve(job.id).status\n",
"\n",
"# Now your model is fine-tuned!"
]
@@ -363,16 +355,18 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 11,
"id": "7f45b281-1dfa-43cb-bd28-99fa7e9f45d1",
"metadata": {},
"outputs": [],
"source": [
"# Get the fine-tuned model ID\n",
"job = openai.FineTuningJob.retrieve(job.id)\n",
"job = openai.fine_tuning.jobs.retrieve(job.id)\n",
"model_id = job.fine_tuned_model\n",
"\n",
"# Use the fine-tuned model in LangChain\n",
"from langchain.chat_models import ChatOpenAI\n",
"\n",
"model = ChatOpenAI(\n",
" model=model_id,\n",
" temperature=1,\n",
@@ -381,17 +375,17 @@
},
{
"cell_type": "code",
"execution_count": 18,
"execution_count": 12,
"id": "7d3b5845-6385-42d1-9f7d-5ea798dc2cd9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"AIMessage(content='{\\n \"num1\": 56,\\n \"num2\": 7,\\n \"operation\": \"/\"\\n}')"
"AIMessage(content='Let me calculate that for you.')"
]
},
"execution_count": 18,
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
@@ -425,7 +419,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.11.5"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,884 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "1f3cebbe-079a-4bfe-b1a1-07bdac882ce2",
"metadata": {},
"source": [
"# Amazon Textract \n",
"\n",
">[Amazon Textract](https://docs.aws.amazon.com/managedservices/latest/userguide/textract.html) is a machine learning (ML) service that automatically extracts text, handwriting, and data from scanned documents.\n",
">\n",
">It goes beyond simple optical character recognition (OCR) to identify, understand, and extract data from forms and tables. Today, many companies manually extract data from scanned documents such as PDFs, images, tables, and forms, or through simple OCR software that requires manual configuration (which often must be updated when the form changes). To overcome these manual and expensive processes, `Textract` uses ML to read and process any type of document, accurately extracting text, handwriting, tables, and other data with no manual effort. \n",
"\n",
"This sample demonstrates the use of `Amazon Textract` in combination with LangChain as a DocumentLoader.\n",
"\n",
"`Textract` supports`PDF`, `TIF`F, `PNG` and `JPEG` format.\n",
"\n",
"`Textract` supports these [document sizes, languages and characters](https://docs.aws.amazon.com/textract/latest/dg/limits-document.html)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "a1aa66d4-85f2-42ad-a8d3-de7cea8d6c35",
"metadata": {},
"outputs": [],
"source": [
"#!pip install boto3 openai tiktoken python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "e4305a0d-37da-41f9-a52c-7d166d7dbabf",
"metadata": {},
"outputs": [],
"source": [
"#!pip install \"amazon-textract-caller>=0.2.0\""
]
},
{
"cell_type": "markdown",
"id": "400b25c6-befa-4730-a201-39ff112c8858",
"metadata": {},
"source": [
"## Sample 1\n",
"\n",
"The first example uses a local file, which internally will be send to Amazon Textract sync API [DetectDocumentText](https://docs.aws.amazon.com/textract/latest/dg/API_DetectDocumentText.html). \n",
"\n",
"Local files or URL endpoints like HTTP:// are limited to one page documents for Textract.\n",
"Multi-page documents have to reside on S3. This sample file is a jpeg."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1becee92-e82f-42d4-9b4e-b23d77cbe88d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import AmazonTextractPDFLoader\n",
"\n",
"loader = AmazonTextractPDFLoader(\"example_data/alejandro_rosalez_sample-small.jpeg\")\n",
"documents = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "d566dc56-c9a9-44ec-84fb-a81928f90d40",
"metadata": {},
"source": [
"Output from the file"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "1272ce8c-d298-4059-ac0a-780bf5f82302",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Patient Information First Name: ALEJANDRO Last Name: ROSALEZ Date of Birth: 10/10/1982 Sex: M Marital Status: MARRIED Email Address: Address: 123 ANY STREET City: ANYTOWN State: CA Zip Code: 12345 Phone: 646-555-0111 Emergency Contact 1: First Name: CARLOS Last Name: SALAZAR Phone: 212-555-0150 Relationship to Patient: BROTHER Emergency Contact 2: First Name: JANE Last Name: DOE Phone: 650-555-0123 Relationship FRIEND to Patient: Did you feel fever or feverish lately? Yes No Are you having shortness of breath? Yes No Do you have a cough? Yes No Did you experience loss of taste or smell? Yes No Where you in contact with any confirmed COVID-19 positive patients? Yes No Did you travel in the past 14 days to any regions affected by COVID-19? Yes No Patient Information First Name: ALEJANDRO Last Name: ROSALEZ Date of Birth: 10/10/1982 Sex: M Marital Status: MARRIED Email Address: Address: 123 ANY STREET City: ANYTOWN State: CA Zip Code: 12345 Phone: 646-555-0111 Emergency Contact 1: First Name: CARLOS Last Name: SALAZAR Phone: 212-555-0150 Relationship to Patient: BROTHER Emergency Contact 2: First Name: JANE Last Name: DOE Phone: 650-555-0123 Relationship FRIEND to Patient: Did you feel fever or feverish lately? Yes No Are you having shortness of breath? Yes No Do you have a cough? Yes No Did you experience loss of taste or smell? Yes No Where you in contact with any confirmed COVID-19 positive patients? Yes No Did you travel in the past 14 days to any regions affected by COVID-19? Yes No ', metadata={'source': 'example_data/alejandro_rosalez_sample-small.jpeg', 'page': 1})]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"documents"
]
},
{
"cell_type": "markdown",
"id": "4cf7f19c-3635-453a-9c76-4baf98b8d7f4",
"metadata": {},
"source": [
"## Sample 2\n",
"The next sample loads a file from an HTTPS endpoint. \n",
"It has to be single page, as Amazon Textract requires all multi-page documents to be stored on S3."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "10374bfb-b325-451f-8bd0-c686710ab68c",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from langchain.document_loaders import AmazonTextractPDFLoader\n",
"\n",
"loader = AmazonTextractPDFLoader(\n",
" \"https://amazon-textract-public-content.s3.us-east-2.amazonaws.com/langchain/alejandro_rosalez_sample_1.jpg\"\n",
")\n",
"documents = loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "16a2b6a3-7514-4c2c-a427-6847169af473",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Patient Information First Name: ALEJANDRO Last Name: ROSALEZ Date of Birth: 10/10/1982 Sex: M Marital Status: MARRIED Email Address: Address: 123 ANY STREET City: ANYTOWN State: CA Zip Code: 12345 Phone: 646-555-0111 Emergency Contact 1: First Name: CARLOS Last Name: SALAZAR Phone: 212-555-0150 Relationship to Patient: BROTHER Emergency Contact 2: First Name: JANE Last Name: DOE Phone: 650-555-0123 Relationship FRIEND to Patient: Did you feel fever or feverish lately? Yes No Are you having shortness of breath? Yes No Do you have a cough? Yes No Did you experience loss of taste or smell? Yes No Where you in contact with any confirmed COVID-19 positive patients? Yes No Did you travel in the past 14 days to any regions affected by COVID-19? Yes No Patient Information First Name: ALEJANDRO Last Name: ROSALEZ Date of Birth: 10/10/1982 Sex: M Marital Status: MARRIED Email Address: Address: 123 ANY STREET City: ANYTOWN State: CA Zip Code: 12345 Phone: 646-555-0111 Emergency Contact 1: First Name: CARLOS Last Name: SALAZAR Phone: 212-555-0150 Relationship to Patient: BROTHER Emergency Contact 2: First Name: JANE Last Name: DOE Phone: 650-555-0123 Relationship FRIEND to Patient: Did you feel fever or feverish lately? Yes No Are you having shortness of breath? Yes No Do you have a cough? Yes No Did you experience loss of taste or smell? Yes No Where you in contact with any confirmed COVID-19 positive patients? Yes No Did you travel in the past 14 days to any regions affected by COVID-19? Yes No ', metadata={'source': 'example_data/alejandro_rosalez_sample-small.jpeg', 'page': 1})]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"documents"
]
},
{
"cell_type": "markdown",
"id": "3a9cd8ec-e663-4dc7-9db1-d2f575253141",
"metadata": {},
"source": [
"## Sample 3\n",
"\n",
"Processing a multi-page document requires the document to be on S3. The sample document resides in a bucket in us-east-2 and Textract needs to be called in that same region to be successful, so we set the region_name on the client and pass that in to the loader to ensure Textract is called from us-east-2. You could also to have your notebook running in us-east-2, setting the AWS_DEFAULT_REGION set to us-east-2 or when running in a different environment, pass in a boto3 Textract client with that region name like in the cell below."
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "8185e3e6-9599-4a47-8969-d6dcef3e6404",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"import boto3\n",
"\n",
"textract_client = boto3.client(\"textract\", region_name=\"us-east-2\")\n",
"\n",
"file_path = \"s3://amazon-textract-public-content/langchain/layout-parser-paper.pdf\"\n",
"loader = AmazonTextractPDFLoader(file_path, client=textract_client)\n",
"documents = loader.load()"
]
},
{
"cell_type": "markdown",
"id": "b8901eec-070d-4fd6-9d65-52211d332441",
"metadata": {},
"source": [
"Now getting the number of pages to validate the response (printing out the full response would be quite long...). We expect 16 pages."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "b23c01c8-cf69-4fe2-8141-4621edb7d79c",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"16"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"len(documents)"
]
},
{
"cell_type": "markdown",
"id": "b3e41b4d-b159-4274-89be-80d8159134ef",
"metadata": {},
"source": [
"## Using the AmazonTextractPDFLoader in an LangChain chain (e. g. OpenAI)\n",
"\n",
"The AmazonTextractPDFLoader can be used in a chain the same way the other loaders are used.\n",
"Textract itself does have a [Query feature](https://docs.aws.amazon.com/textract/latest/dg/API_Query.html), which offers similar functionality to the QA chain in this sample, which is worth checking out as well."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "53c47b24-cc06-4256-9e5b-a82fc80bc55d",
"metadata": {},
"outputs": [],
"source": [
"# You can store your OPENAI_API_KEY in a .env file as well\n",
"# import os\n",
"# from dotenv import load_dotenv\n",
"\n",
"# load_dotenv()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "a9ae004c-246c-4c7f-8458-191cd7424a9b",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# Or set the OpenAI key in the environment directly\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"your-OpenAI-API-key\""
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "d52b089c-10ca-45fb-8669-8a1c5fee10d5",
"metadata": {
"tags": []
},
"outputs": [
{
"data": {
"text/plain": [
"' The authors are Zejiang Shen, Ruochen Zhang, Melissa Dell, Benjamin Charles Germain Lee, Jacob Carlson, Weining Li, Gardner, M., Grus, J., Neumann, M., Tafjord, O., Dasigi, P., Liu, N., Peters, M., Schmitz, M., Zettlemoyer, L., Lukasz Garncarek, Powalski, R., Stanislawek, T., Topolski, B., Halama, P., Gralinski, F., Graves, A., Fernández, S., Gomez, F., Schmidhuber, J., Harley, A.W., Ufkes, A., Derpanis, K.G., He, K., Gkioxari, G., Dollár, P., Girshick, R., He, K., Zhang, X., Ren, S., Sun, J., Kay, A., Lamiroy, B., Lopresti, D., Mears, J., Jakeway, E., Ferriter, M., Adams, C., Yarasavage, N., Thomas, D., Zwaard, K., Li, M., Cui, L., Huang,'"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains.question_answering import load_qa_chain\n",
"from langchain.llms import OpenAI\n",
"\n",
"chain = load_qa_chain(llm=OpenAI(), chain_type=\"map_reduce\")\n",
"query = [\"Who are the autors?\"]\n",
"\n",
"chain.run(input_documents=documents, question=query)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1a09d18b-ab7b-468e-ae66-f92abf666b9b",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"availableInstances": [
{
"_defaultOrder": 0,
"_isFastLaunch": true,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 4,
"name": "ml.t3.medium",
"vcpuNum": 2
},
{
"_defaultOrder": 1,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 8,
"name": "ml.t3.large",
"vcpuNum": 2
},
{
"_defaultOrder": 2,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 16,
"name": "ml.t3.xlarge",
"vcpuNum": 4
},
{
"_defaultOrder": 3,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 32,
"name": "ml.t3.2xlarge",
"vcpuNum": 8
},
{
"_defaultOrder": 4,
"_isFastLaunch": true,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 8,
"name": "ml.m5.large",
"vcpuNum": 2
},
{
"_defaultOrder": 5,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 16,
"name": "ml.m5.xlarge",
"vcpuNum": 4
},
{
"_defaultOrder": 6,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 32,
"name": "ml.m5.2xlarge",
"vcpuNum": 8
},
{
"_defaultOrder": 7,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 64,
"name": "ml.m5.4xlarge",
"vcpuNum": 16
},
{
"_defaultOrder": 8,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 128,
"name": "ml.m5.8xlarge",
"vcpuNum": 32
},
{
"_defaultOrder": 9,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 192,
"name": "ml.m5.12xlarge",
"vcpuNum": 48
},
{
"_defaultOrder": 10,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 256,
"name": "ml.m5.16xlarge",
"vcpuNum": 64
},
{
"_defaultOrder": 11,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 384,
"name": "ml.m5.24xlarge",
"vcpuNum": 96
},
{
"_defaultOrder": 12,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 8,
"name": "ml.m5d.large",
"vcpuNum": 2
},
{
"_defaultOrder": 13,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 16,
"name": "ml.m5d.xlarge",
"vcpuNum": 4
},
{
"_defaultOrder": 14,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 32,
"name": "ml.m5d.2xlarge",
"vcpuNum": 8
},
{
"_defaultOrder": 15,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 64,
"name": "ml.m5d.4xlarge",
"vcpuNum": 16
},
{
"_defaultOrder": 16,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 128,
"name": "ml.m5d.8xlarge",
"vcpuNum": 32
},
{
"_defaultOrder": 17,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 192,
"name": "ml.m5d.12xlarge",
"vcpuNum": 48
},
{
"_defaultOrder": 18,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 256,
"name": "ml.m5d.16xlarge",
"vcpuNum": 64
},
{
"_defaultOrder": 19,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 384,
"name": "ml.m5d.24xlarge",
"vcpuNum": 96
},
{
"_defaultOrder": 20,
"_isFastLaunch": false,
"category": "General purpose",
"gpuNum": 0,
"hideHardwareSpecs": true,
"memoryGiB": 0,
"name": "ml.geospatial.interactive",
"supportedImageNames": [
"sagemaker-geospatial-v1-0"
],
"vcpuNum": 0
},
{
"_defaultOrder": 21,
"_isFastLaunch": true,
"category": "Compute optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 4,
"name": "ml.c5.large",
"vcpuNum": 2
},
{
"_defaultOrder": 22,
"_isFastLaunch": false,
"category": "Compute optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 8,
"name": "ml.c5.xlarge",
"vcpuNum": 4
},
{
"_defaultOrder": 23,
"_isFastLaunch": false,
"category": "Compute optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 16,
"name": "ml.c5.2xlarge",
"vcpuNum": 8
},
{
"_defaultOrder": 24,
"_isFastLaunch": false,
"category": "Compute optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 32,
"name": "ml.c5.4xlarge",
"vcpuNum": 16
},
{
"_defaultOrder": 25,
"_isFastLaunch": false,
"category": "Compute optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 72,
"name": "ml.c5.9xlarge",
"vcpuNum": 36
},
{
"_defaultOrder": 26,
"_isFastLaunch": false,
"category": "Compute optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 96,
"name": "ml.c5.12xlarge",
"vcpuNum": 48
},
{
"_defaultOrder": 27,
"_isFastLaunch": false,
"category": "Compute optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 144,
"name": "ml.c5.18xlarge",
"vcpuNum": 72
},
{
"_defaultOrder": 28,
"_isFastLaunch": false,
"category": "Compute optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 192,
"name": "ml.c5.24xlarge",
"vcpuNum": 96
},
{
"_defaultOrder": 29,
"_isFastLaunch": true,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 16,
"name": "ml.g4dn.xlarge",
"vcpuNum": 4
},
{
"_defaultOrder": 30,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 32,
"name": "ml.g4dn.2xlarge",
"vcpuNum": 8
},
{
"_defaultOrder": 31,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 64,
"name": "ml.g4dn.4xlarge",
"vcpuNum": 16
},
{
"_defaultOrder": 32,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 128,
"name": "ml.g4dn.8xlarge",
"vcpuNum": 32
},
{
"_defaultOrder": 33,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 4,
"hideHardwareSpecs": false,
"memoryGiB": 192,
"name": "ml.g4dn.12xlarge",
"vcpuNum": 48
},
{
"_defaultOrder": 34,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 256,
"name": "ml.g4dn.16xlarge",
"vcpuNum": 64
},
{
"_defaultOrder": 35,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 61,
"name": "ml.p3.2xlarge",
"vcpuNum": 8
},
{
"_defaultOrder": 36,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 4,
"hideHardwareSpecs": false,
"memoryGiB": 244,
"name": "ml.p3.8xlarge",
"vcpuNum": 32
},
{
"_defaultOrder": 37,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 8,
"hideHardwareSpecs": false,
"memoryGiB": 488,
"name": "ml.p3.16xlarge",
"vcpuNum": 64
},
{
"_defaultOrder": 38,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 8,
"hideHardwareSpecs": false,
"memoryGiB": 768,
"name": "ml.p3dn.24xlarge",
"vcpuNum": 96
},
{
"_defaultOrder": 39,
"_isFastLaunch": false,
"category": "Memory Optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 16,
"name": "ml.r5.large",
"vcpuNum": 2
},
{
"_defaultOrder": 40,
"_isFastLaunch": false,
"category": "Memory Optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 32,
"name": "ml.r5.xlarge",
"vcpuNum": 4
},
{
"_defaultOrder": 41,
"_isFastLaunch": false,
"category": "Memory Optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 64,
"name": "ml.r5.2xlarge",
"vcpuNum": 8
},
{
"_defaultOrder": 42,
"_isFastLaunch": false,
"category": "Memory Optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 128,
"name": "ml.r5.4xlarge",
"vcpuNum": 16
},
{
"_defaultOrder": 43,
"_isFastLaunch": false,
"category": "Memory Optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 256,
"name": "ml.r5.8xlarge",
"vcpuNum": 32
},
{
"_defaultOrder": 44,
"_isFastLaunch": false,
"category": "Memory Optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 384,
"name": "ml.r5.12xlarge",
"vcpuNum": 48
},
{
"_defaultOrder": 45,
"_isFastLaunch": false,
"category": "Memory Optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 512,
"name": "ml.r5.16xlarge",
"vcpuNum": 64
},
{
"_defaultOrder": 46,
"_isFastLaunch": false,
"category": "Memory Optimized",
"gpuNum": 0,
"hideHardwareSpecs": false,
"memoryGiB": 768,
"name": "ml.r5.24xlarge",
"vcpuNum": 96
},
{
"_defaultOrder": 47,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 16,
"name": "ml.g5.xlarge",
"vcpuNum": 4
},
{
"_defaultOrder": 48,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 32,
"name": "ml.g5.2xlarge",
"vcpuNum": 8
},
{
"_defaultOrder": 49,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 64,
"name": "ml.g5.4xlarge",
"vcpuNum": 16
},
{
"_defaultOrder": 50,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 128,
"name": "ml.g5.8xlarge",
"vcpuNum": 32
},
{
"_defaultOrder": 51,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 1,
"hideHardwareSpecs": false,
"memoryGiB": 256,
"name": "ml.g5.16xlarge",
"vcpuNum": 64
},
{
"_defaultOrder": 52,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 4,
"hideHardwareSpecs": false,
"memoryGiB": 192,
"name": "ml.g5.12xlarge",
"vcpuNum": 48
},
{
"_defaultOrder": 53,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 4,
"hideHardwareSpecs": false,
"memoryGiB": 384,
"name": "ml.g5.24xlarge",
"vcpuNum": 96
},
{
"_defaultOrder": 54,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 8,
"hideHardwareSpecs": false,
"memoryGiB": 768,
"name": "ml.g5.48xlarge",
"vcpuNum": 192
},
{
"_defaultOrder": 55,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 8,
"hideHardwareSpecs": false,
"memoryGiB": 1152,
"name": "ml.p4d.24xlarge",
"vcpuNum": 96
},
{
"_defaultOrder": 56,
"_isFastLaunch": false,
"category": "Accelerated computing",
"gpuNum": 8,
"hideHardwareSpecs": false,
"memoryGiB": 1152,
"name": "ml.p4de.24xlarge",
"vcpuNum": 96
}
],
"instance_type": "ml.t3.medium",
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,174 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "a634365e",
"metadata": {},
"source": [
"# Azure AI Data\n",
"\n",
">[Azure AI Studio](https://ai.azure.com/) provides the capability to upload data assets to cloud storage and register existing data assets from the following sources:\n",
"\n",
"- Microsoft OneLake\n",
"- Azure Blob Storage\n",
"- Azure Data Lake gen 2\n",
"\n",
"The benefit of this approach over `AzureBlobStorageContainerLoader` and `AzureBlobStorageFileLoader` is that authentication is handled seamlessly to cloud storage. You can use either *identity-based* data access control to the data or *credential-based* (e.g. SAS token, account key). In the case of credential-based data access you do not need to specify secrets in your code or set up key vaults - the system handles that for you.\n",
"\n",
"This notebook covers how to load document objects from a data asset in AI Studio."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "49815096",
"metadata": {},
"outputs": [],
"source": [
"#!pip install azureml-fsspec, azure-ai-generative"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "2f0cd6a5",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from azure.ai.resources.client import AIClient\n",
"from azure.identity import DefaultAzureCredential\n",
"from langchain.document_loaders import AzureAIDataLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "08d40b11-e87a-426e-a6b0-89f24e47ce2c",
"metadata": {},
"outputs": [],
"source": [
"# Create a connection to your project\n",
"client = AIClient(\n",
" credential=DefaultAzureCredential(),\n",
" subscription_id=\"<subscription_id>\",\n",
" resource_group_name=\"<resource_group_name>\",\n",
" project_name=\"<project_name>\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "321cc7f1",
"metadata": {},
"outputs": [],
"source": [
"# get the latest version of your data asset\n",
"data_asset = client.data.get(name=\"<data_asset_name>\", label=\"latest\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "25d91cea-c5f2-4a53-ac19-442810451ec6",
"metadata": {},
"outputs": [],
"source": [
"# load the data asset\n",
"loader = AzureAIDataLoader(url=data_asset.path)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "2b11d155",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': '/var/folders/y6/8_bzdg295ld6s1_97_12m4lr0000gn/T/tmpaa9xl6ch/fake.docx'}, lookup_index=0)]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"loader.load()"
]
},
{
"cell_type": "markdown",
"id": "0690c40a",
"metadata": {},
"source": [
"## Specifying a glob pattern\n",
"You can also specify a glob pattern for more finegrained control over what files to load. In the example below, only files with a `pdf` extension will be loaded."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "72d44781",
"metadata": {},
"outputs": [],
"source": [
"loader = AzureAIDataLoader(url=data_asset.path, glob=\"*.pdf\")"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "2d3c32db",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content='Lorem ipsum dolor sit amet.', lookup_str='', metadata={'source': '/var/folders/y6/8_bzdg295ld6s1_97_12m4lr0000gn/T/tmpujbkzf_l/fake.docx'}, lookup_index=0)]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"loader.load()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "885dc280",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

File diff suppressed because one or more lines are too long

View File

@@ -1,167 +0,0 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"# Embaas\n",
"[embaas](https://embaas.io) is a fully managed NLP API service that offers features like embedding generation, document text extraction, document to embeddings and more. You can choose a [variety of pre-trained models](https://embaas.io/docs/models/embeddings).\n",
"\n",
"### Prerequisites\n",
"Create a free embaas account at [https://embaas.io/register](https://embaas.io/register) and generate an [API key](https://embaas.io/dashboard/api-keys)\n",
"\n",
"### Document Text Extraction API\n",
"The document text extraction API allows you to extract the text from a given document. The API supports a variety of document formats, including PDF, mp3, mp4 and more. For a full list of supported formats, check out the API docs (link below)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# Set API key\n",
"embaas_api_key = \"YOUR_API_KEY\"\n",
"# or set environment variable\n",
"os.environ[\"EMBAAS_API_KEY\"] = \"YOUR_API_KEY\""
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"#### Using a blob (bytes)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from langchain.document_loaders.blob_loaders import Blob\n",
"from langchain.document_loaders.embaas import EmbaasBlobLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"blob_loader = EmbaasBlobLoader()\n",
"blob = Blob.from_path(\"example.pdf\")\n",
"documents = blob_loader.load(blob)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2023-06-12T22:19:48.380467Z",
"start_time": "2023-06-12T22:19:48.366886Z"
},
"collapsed": false
},
"outputs": [],
"source": [
"# You can also directly create embeddings with your preferred embeddings model\n",
"blob_loader = EmbaasBlobLoader(params={\"model\": \"e5-large-v2\", \"should_embed\": True})\n",
"blob = Blob.from_path(\"example.pdf\")\n",
"documents = blob_loader.load(blob)\n",
"\n",
"print(documents[0][\"metadata\"][\"embedding\"])"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"#### Using a file"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from langchain.document_loaders.embaas import EmbaasLoader"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"file_loader = EmbaasLoader(file_path=\"example.pdf\")\n",
"documents = file_loader.load()"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"ExecuteTime": {
"end_time": "2023-06-12T22:24:31.894665Z",
"start_time": "2023-06-12T22:24:31.880857Z"
},
"collapsed": false
},
"outputs": [],
"source": [
"# Disable automatic text splitting\n",
"file_loader = EmbaasLoader(file_path=\"example.mp3\", params={\"should_chunk\": False})\n",
"documents = file_loader.load()"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": false
},
"source": [
"For more detailed information about the embaas document text extraction API, please refer to [the official embaas API documentation](https://embaas.io/api-reference)."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 0
}

View File

@@ -217,7 +217,7 @@
"It's compatible with the ̀`langchain.document_loaders.GoogleDriveLoader` and can be used\n",
"in its place.\n",
"\n",
"To be compatible with containers, the authentication uses an environment variable ̀GOOGLE_ACCOUNT_FILE` to credential file (for user or service)."
"To be compatible with containers, the authentication uses an environment variable `̀GOOGLE_ACCOUNT_FILE` to credential file (for user or service)."
]
},
{
@@ -331,6 +331,7 @@
"Some pre-formated request are proposed (use `{query}`, `{folder_id}` and/or `{mime_type}`):\n",
"\n",
"You can customize the criteria to select the files. A set of predefined filter are proposed:\n",
"\n",
"| template | description |\n",
"| -------------------------------------- | --------------------------------------------------------------------- |\n",
"| gdrive-all-in-folder | Return all compatible files from a `folder_id` |\n",
@@ -401,6 +402,14 @@
"id": "375bb465-8f69-407b-94bd-ffa3718ef500",
"metadata": {},
"source": [
"The conversion can manage in Markdown format:\n",
"- bullet\n",
"- link\n",
"- table\n",
"- titles\n",
"\n",
"Set the attribut `return_link` to `True` to export links.\n",
"\n",
"#### Modes for GSlide and GSheet\n",
"The parameter mode accepts different values:\n",
"\n",
@@ -408,12 +417,6 @@
"- \"snippets\": return the description of each file (set in metadata of Google Drive files).\n",
"\n",
"\n",
"The conversion can manage in Markdown format:\n",
"- bullet\n",
"- link\n",
"- table\n",
"- titles\n",
"\n",
"The parameter `gslide_mode` accepts different values:\n",
"\n",
"- \"single\" : one document with &lt;PAGE BREAK&gt;\n",
@@ -503,14 +506,6 @@
" print(\"---\")\n",
" print(doc.page_content.strip()[:60] + \"...\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "51efa73a-4e2d-4f9c-abaf-6c9bde2ff69d",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {

View File

@@ -11,9 +11,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"[Amazon API Gateway](https://aws.amazon.com/api-gateway/) is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the \"front door\" for applications to access data, business logic, or functionality from your backend services. Using API Gateway, you can create RESTful APIs and WebSocket APIs that enable real-time two-way communication applications. API Gateway supports containerized and serverless workloads, as well as web applications.\n",
">[Amazon API Gateway](https://aws.amazon.com/api-gateway/) is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any >scale. APIs act as the \"front door\" for applications to access data, business logic, or functionality from your backend services. Using `API Gateway`, you can create RESTful APIs and >WebSocket APIs that enable real-time two-way communication applications. API Gateway supports containerized and serverless workloads, as well as web applications.\n",
"\n",
"API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, CORS support, authorization and access control, throttling, monitoring, and API version management. API Gateway has no minimum fees or startup costs. You pay for the API calls you receive and the amount of data transferred out and, with the API Gateway tiered pricing model, you can reduce your cost as your API usage scales."
">`API Gateway` handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, CORS support, authorization >and access control, throttling, monitoring, and API version management. `API Gateway` has no minimum fees or startup costs. You pay for the API calls you receive and the amount of data >transferred out and, with the `API Gateway` tiered pricing model, you can reduce your cost as your API usage scales."
]
},
{

View File

@@ -11,7 +11,15 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that makes FMs from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case"
">[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that offers a choice of \n",
"> high-performing foundation models (FMs) from leading AI companies like `AI21 Labs`, `Anthropic`, `Cohere`, \n",
"> `Meta`, `Stability AI`, and `Amazon` via a single API, along with a broad set of capabilities you need to \n",
"> build generative AI applications with security, privacy, and responsible AI. Using `Amazon Bedrock`, \n",
"> you can easily experiment with and evaluate top FMs for your use case, privately customize them with \n",
"> your data using techniques such as fine-tuning and `Retrieval Augmented Generation` (`RAG`), and build \n",
"> agents that execute tasks using your enterprise systems and data sources. Since `Amazon Bedrock` is \n",
"> serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy \n",
"> generative AI capabilities into your applications using the AWS services you are already familiar with.\n"
]
},
{
@@ -116,7 +124,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -20,27 +20,121 @@
"This example notebook shows how to wrap Databricks endpoints as LLMs in LangChain.\n",
"It supports two endpoint types:\n",
"* Serving endpoint, recommended for production and development,\n",
"* Cluster driver proxy app, recommended for iteractive development."
"* Cluster driver proxy app, recommended for interactive development."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installation\n",
"\n",
"`mlflow >= 2.9 ` is required to run the code in this notebook. If it's not installed, please install it using this command:\n",
"\n",
"```\n",
"pip install mlflow>=2.9\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Wrapping a serving endpoint: External model\n",
"\n",
"Prerequisite:\n",
"\n",
"- Register an OpenAI API key as a secret:\n",
"\n",
" ```bash\n",
" databricks secrets create-scope <scope>\n",
" databricks secrets put-secret <scope> openai-api-key --string-value $OPENAI_API_KEY\n",
" ```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following code creates a new serving endpoint with OpenAI's GPT-4 model for chat and generates a response using the endpoint."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"application/vnd.databricks.v1+cell": {
"cellMetadata": {
"byteLimit": 2048000,
"rowLimit": 10000
},
"inputWidgets": {},
"nuid": "bf07455f-aac9-4873-a8e7-7952af0f8c82",
"showTitle": false,
"title": ""
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"content='Hello! How can I assist you today?'\n"
]
}
},
"outputs": [],
],
"source": [
"from langchain.llms import Databricks"
"from langchain.chat_models import ChatDatabricks\n",
"from langchain.schema.messages import HumanMessage\n",
"from mlflow.deployments import get_deploy_client\n",
"\n",
"client = get_deploy_client(\"databricks\")\n",
"\n",
"secret = \"secrets/<scope>/openai-api-key\" # replace `<scope>` with your scope\n",
"name = \"my-chat\" # rename this if my-chat already exists\n",
"client.create_endpoint(\n",
" name=name,\n",
" config={\n",
" \"served_entities\": [\n",
" {\n",
" \"name\": \"my-chat\",\n",
" \"external_model\": {\n",
" \"name\": \"gpt-4\",\n",
" \"provider\": \"openai\",\n",
" \"task\": \"llm/v1/chat\",\n",
" \"openai_config\": {\n",
" \"openai_api_key\": \"{{\" + secret + \"}}\",\n",
" },\n",
" },\n",
" }\n",
" ],\n",
" },\n",
")\n",
"\n",
"chat = ChatDatabricks(\n",
" target_uri=\"databricks\",\n",
" endpoint=name,\n",
" temperature=0.1,\n",
")\n",
"chat([HumanMessage(content=\"hello\")])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Wrapping a serving endpoint: Foundation model\n",
"\n",
"The following code uses the `databricks-bge-large-en` serving endpoint (no endpoint creation is required) to generate embeddings from input text."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[0.051055908203125, 0.007221221923828125, 0.003879547119140625]\n"
]
}
],
"source": [
"from langchain.embeddings import DatabricksEmbeddings\n",
"\n",
"embeddings = DatabricksEmbeddings(endpoint=\"databricks-bge-large-en\")\n",
"embeddings.embed_query(\"hello\")[:3]"
]
},
{
@@ -56,7 +150,7 @@
}
},
"source": [
"## Wrapping a serving endpoint\n",
"## Wrapping a serving endpoint: Custom model\n",
"\n",
"Prerequisites:\n",
"* An LLM was registered and deployed to [a Databricks serving endpoint](https://docs.databricks.com/machine-learning/model-serving/index.html).\n",
@@ -97,6 +191,8 @@
}
],
"source": [
"from langchain.llms import Databricks\n",
"\n",
"# If running a Databricks notebook attached to an interactive cluster in \"single user\"\n",
"# or \"no isolation shared\" mode, you only need to specify the endpoint name to create\n",
"# a `Databricks` instance to query a serving endpoint in the same workspace.\n",
@@ -524,7 +620,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.10"
"version": "3.9.18"
},
"orig_nbformat": 4
},

View File

@@ -141,7 +141,7 @@
" accepts = \"application/json\"\n",
"\n",
" def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:\n",
" input_str = json.dumps({prompt: prompt, **model_kwargs})\n",
" input_str = json.dumps({\"inputs\": prompt, \"parameters\": model_kwargs})\n",
" return input_str.encode(\"utf-8\")\n",
"\n",
" def transform_output(self, output: bytes) -> str:\n",
@@ -197,7 +197,7 @@
" accepts = \"application/json\"\n",
"\n",
" def transform_input(self, prompt: str, model_kwargs: Dict) -> bytes:\n",
" input_str = json.dumps({prompt: prompt, **model_kwargs})\n",
" input_str = json.dumps({\"inputs\": prompt, \"parameters\": model_kwargs})\n",
" return input_str.encode(\"utf-8\")\n",
"\n",
" def transform_output(self, output: bytes) -> str:\n",

View File

@@ -0,0 +1,297 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "70996d8a",
"metadata": {},
"source": [
"# WatsonxLLM\n",
"\n",
"[WatsonxLLM](https://ibm.github.io/watson-machine-learning-sdk/fm_extensions.html) is wrapper for IBM [watsonx.ai](https://www.ibm.com/products/watsonx-ai) foundation models.\n",
"This example shows how to communicate with watsonx.ai models using LangChain."
]
},
{
"cell_type": "markdown",
"id": "ea35b2b7",
"metadata": {},
"source": [
"Install the package [`ibm_watson_machine_learning`](https://ibm.github.io/watson-machine-learning-sdk/install.html)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2f1fff4e",
"metadata": {},
"outputs": [],
"source": [
"%pip install ibm_watson_machine_learning"
]
},
{
"cell_type": "markdown",
"id": "f406e092",
"metadata": {},
"source": [
"This cell defines the WML credentials required to work with watsonx Foundation Model inferencing.\n",
"\n",
"**Action:** Provide the IBM Cloud user API key. For details, see\n",
"[documentation](https://cloud.ibm.com/docs/account?topic=account-userapikey&interface=ui)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "11d572a1",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from getpass import getpass\n",
"\n",
"watsonx_api_key = getpass()\n",
"os.environ[\"WATSONX_APIKEY\"] = watsonx_api_key"
]
},
{
"cell_type": "markdown",
"id": "e36acbef",
"metadata": {},
"source": [
"## Load the model\n",
"You might need to adjust model `parameters` for different models or tasks, to do so please refer to [documentation](https://ibm.github.io/watson-machine-learning-sdk/model.html#metanames.GenTextParamsMetaNames)."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "407cd500",
"metadata": {},
"outputs": [],
"source": [
"from ibm_watson_machine_learning.metanames import GenTextParamsMetaNames as GenParams\n",
"\n",
"parameters = {\n",
" GenParams.DECODING_METHOD: \"sample\",\n",
" GenParams.MAX_NEW_TOKENS: 100,\n",
" GenParams.MIN_NEW_TOKENS: 1,\n",
" GenParams.TEMPERATURE: 0.5,\n",
" GenParams.TOP_K: 50,\n",
" GenParams.TOP_P: 1,\n",
"}"
]
},
{
"cell_type": "markdown",
"id": "2b586538",
"metadata": {},
"source": [
"Initialize the `WatsonxLLM` class with previous set params."
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "359898de",
"metadata": {},
"outputs": [],
"source": [
"from langchain.llms import WatsonxLLM\n",
"\n",
"watsonx_llm = WatsonxLLM(\n",
" model_id=\"google/flan-ul2\",\n",
" url=\"https://us-south.ml.cloud.ibm.com\",\n",
" project_id=\"***\",\n",
" params=parameters,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "2202f4e0",
"metadata": {},
"source": [
"Alternatively you can use Cloud Pak for Data credentials. For details, see [documentation](https://ibm.github.io/watson-machine-learning-sdk/setup_cpd.html).\n",
"```\n",
"watsonx_llm = WatsonxLLM(\n",
" model_id='google/flan-ul2',\n",
" url=\"***\",\n",
" username=\"***\",\n",
" password=\"***\",\n",
" instance_id=\"openshift\",\n",
" version=\"4.8\",\n",
" project_id='***',\n",
" params=parameters\n",
")\n",
"``` "
]
},
{
"cell_type": "markdown",
"id": "c25ecbd1",
"metadata": {},
"source": [
"## Create Chain\n",
"Create `PromptTemplate` objects which will be responsible for creating a random question."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c7d80c05",
"metadata": {},
"outputs": [],
"source": [
"from langchain.prompts import PromptTemplate\n",
"\n",
"template = \"Generate a random question about {topic}: Question: \"\n",
"prompt = PromptTemplate.from_template(template)"
]
},
{
"cell_type": "markdown",
"id": "79056d8e",
"metadata": {},
"source": [
"Provide a topic and run the `LLMChain`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "dc076c56",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'How many breeds of dog are there?'"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from langchain.chains import LLMChain\n",
"\n",
"llm_chain = LLMChain(prompt=prompt, llm=watsonx_llm)\n",
"llm_chain.run(\"dog\")"
]
},
{
"cell_type": "markdown",
"id": "f571001d",
"metadata": {},
"source": [
"## Calling the Model Directly\n",
"To obtain completions, you can can the model directly using string prompt."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "beea2b5b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'dog'"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Calling a single prompt\n",
"\n",
"watsonx_llm(\"Who is man's best friend?\")"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "8ab1a25a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"LLMResult(generations=[[Generation(text='greyhounds', generation_info={'generated_token_count': 4, 'input_token_count': 8, 'finish_reason': 'eos_token'})], [Generation(text='The Basenji is a dog breed from South Africa.', generation_info={'generated_token_count': 13, 'input_token_count': 7, 'finish_reason': 'eos_token'})]], llm_output={'model_id': 'google/flan-ul2'}, run=[RunInfo(run_id=UUID('03c73a42-db68-428e-ab8d-8ae10abc84fc')), RunInfo(run_id=UUID('c289f67a-87d6-4c8b-a8b7-0b5012c94ca8'))])"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Calling multiple prompts\n",
"\n",
"watsonx_llm.generate(\n",
" [\n",
" \"The fastest dog in the world?\",\n",
" \"Describe your chosen dog breed\",\n",
" ]\n",
")"
]
},
{
"cell_type": "markdown",
"id": "d2c9da33",
"metadata": {},
"source": [
"## Streaming the Model output \n",
"\n",
"You can stream the model output."
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "3f63166a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The golden retriever is my favorite dog because it is very friendly and good with children."
]
}
],
"source": [
"for chunk in watsonx_llm.stream(\n",
" \"Describe your favorite breed of dog and why it is your favorite.\"\n",
"):\n",
" print(chunk, end=\"\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,11 +1,22 @@
# AWS
All functionality related to [Amazon AWS](https://aws.amazon.com/) platform
The `LangChain` integrations related to [Amazon AWS](https://aws.amazon.com/) platform.
## LLMs
### Bedrock
>[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that offers a choice of
> high-performing foundation models (FMs) from leading AI companies like `AI21 Labs`, `Anthropic`, `Cohere`,
> `Meta`, `Stability AI`, and `Amazon` via a single API, along with a broad set of capabilities you need to
> build generative AI applications with security, privacy, and responsible AI. Using `Amazon Bedrock`,
> you can easily experiment with and evaluate top FMs for your use case, privately customize them with
> your data using techniques such as fine-tuning and `Retrieval Augmented Generation` (`RAG`), and build
> agents that execute tasks using your enterprise systems and data sources. Since `Amazon Bedrock` is
> serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy
> generative AI capabilities into your applications using the AWS services you are already familiar with.
See a [usage example](/docs/integrations/llms/bedrock).
```python
@@ -14,32 +25,28 @@ from langchain.llms.bedrock import Bedrock
### Amazon API Gateway
[Amazon API Gateway](https://aws.amazon.com/api-gateway/) is a fully managed service that makes it easy for developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the "front door" for applications to access data, business logic, or functionality from your backend services. Using API Gateway, you can create RESTful APIs and WebSocket APIs that enable real-time two-way communication applications. API Gateway supports containerized and serverless workloads, as well as web applications.
>[Amazon API Gateway](https://aws.amazon.com/api-gateway/) is a fully managed service that makes it easy for
> developers to create, publish, maintain, monitor, and secure APIs at any scale. APIs act as the "front door"
> for applications to access data, business logic, or functionality from your backend services. Using
> `API Gateway`, you can create RESTful APIs and WebSocket APIs that enable real-time two-way communication
> applications. `API Gateway` supports containerized and serverless workloads, as well as web applications.
>
> `API Gateway` handles all the tasks involved in accepting and processing up to hundreds of thousands of
> concurrent API calls, including traffic management, CORS support, authorization and access control,
> throttling, monitoring, and API version management. `API Gateway` has no minimum fees or startup costs.
> You pay for the API calls you receive and the amount of data transferred out and, with the `API Gateway`
> tiered pricing model, you can reduce your cost as your API usage scales.
API Gateway handles all the tasks involved in accepting and processing up to hundreds of thousands of concurrent API calls, including traffic management, CORS support, authorization and access control, throttling, monitoring, and API version management. API Gateway has no minimum fees or startup costs. You pay for the API calls you receive and the amount of data transferred out and, with the API Gateway tiered pricing model, you can reduce your cost as your API usage scales.
See a [usage example](/docs/integrations/llms/amazon_api_gateway_example).
See a [usage example](/docs/integrations/llms/amazon_api_gateway).
```python
from langchain.llms import AmazonAPIGateway
api_url = "https://<api_gateway_id>.execute-api.<region>.amazonaws.com/LATEST/HF"
# These are sample parameters for Falcon 40B Instruct Deployed from Amazon SageMaker JumpStart
model_kwargs = {
"max_new_tokens": 100,
"num_return_sequences": 1,
"top_k": 50,
"top_p": 0.95,
"do_sample": False,
"return_full_text": True,
"temperature": 0.2,
}
llm = AmazonAPIGateway(api_url=api_url, model_kwargs=model_kwargs)
```
### SageMaker Endpoint
>[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a system that can build, train, and deploy machine learning (ML) models with fully managed infrastructure, tools, and workflows.
>[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a system that can build, train, and deploy
> machine learning (ML) models with fully managed infrastructure, tools, and workflows.
We use `SageMaker` to host our model and expose it as the `SageMaker Endpoint`.
@@ -50,6 +57,16 @@ from langchain.llms import SagemakerEndpoint
from langchain.llms.sagemaker_endpoint import LLMContentHandler
```
## Chat models
### Bedrock Chat
See a [usage example](/docs/integrations/chat/bedrock).
```python
from langchain.chat_models import BedrockChat
```
## Text Embedding Models
### Bedrock
@@ -67,11 +84,32 @@ from langchain.embeddings import SagemakerEndpointEmbeddings
from langchain.llms.sagemaker_endpoint import ContentHandlerBase
```
## Chains
### Amazon Comprehend Moderation Chain
>[Amazon Comprehend](https://aws.amazon.com/comprehend/) is a natural-language processing (NLP) service that
> uses machine learning to uncover valuable insights and connections in text.
We need to install the `boto3` and `nltk` libraries.
```bash
pip install boto3 nltk
```
See a [usage example](/docs/guides/safety/amazon_comprehend_chain).
```python
from langchain_experimental.comprehend_moderation import AmazonComprehendModerationChain
```
## Document loaders
### AWS S3 Directory and File
>[Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html) is an object storage service.
>[Amazon Simple Storage Service (Amazon S3)](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html)
> is an object storage service.
>[AWS S3 Directory](https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-folders.html)
>[AWS S3 Buckets](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html)
@@ -83,6 +121,17 @@ See a [usage example for S3FileLoader](/docs/integrations/document_loaders/aws_s
from langchain.document_loaders import S3DirectoryLoader, S3FileLoader
```
### Amazon Textract
>[Amazon Textract](https://docs.aws.amazon.com/managedservices/latest/userguide/textract.html) is a machine
> learning (ML) service that automatically extracts text, handwriting, and data from scanned documents.
See a [usage example](/docs/integrations/document_loaders/amazon_textract).
```python
from langchain.document_loaders import AmazonTextractPDFLoader
```
## Memory
### AWS DynamoDB
@@ -103,3 +152,112 @@ See a [usage example](/docs/integrations/memory/aws_dynamodb).
```python
from langchain.memory import DynamoDBChatMessageHistory
```
## Retrievers
### Amazon Kendra
> [Amazon Kendra](https://docs.aws.amazon.com/kendra/latest/dg/what-is-kendra.html) is an intelligent search service
> provided by `Amazon Web Services` (`AWS`). It utilizes advanced natural language processing (NLP) and machine
> learning algorithms to enable powerful search capabilities across various data sources within an organization.
> `Kendra` is designed to help users find the information they need quickly and accurately,
> improving productivity and decision-making.
> With `Kendra`, we can search across a wide range of content types, including documents, FAQs, knowledge bases,
> manuals, and websites. It supports multiple languages and can understand complex queries, synonyms, and
> contextual meanings to provide highly relevant search results.
We need to install the `boto3` library.
```bash
pip install boto3
```
See a [usage example](/docs/integrations/retrievers/amazon_kendra_retriever).
```python
from langchain.retrievers import AmazonKendraRetriever
```
### Amazon Bedrock (Knowledge Bases)
> [Knowledge bases for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/) is an
> `Amazon Web Services` (`AWS`) offering which lets you quickly build RAG applications by using your
> private data to customize foundation model response.
We need to install the `boto3` library.
```bash
pip install boto3
```
See a [usage example](/docs/integrations/retrievers/amazon_bedrock_knowledge_bases).
```python
from langchain.retrievers import AmazonKnowledgeBasesRetriever
```
## Vector stores
### Amazon OpenSearch Service
> [Amazon OpenSearch Service](https://aws.amazon.com/opensearch-service/) performs
> interactive log analytics, real-time application monitoring, website search, and more. `OpenSearch` is
> an open source,
> distributed search and analytics suite derived from `Elasticsearch`. `Amazon OpenSearch Service` offers the
> latest versions of `OpenSearch`, support for many versions of `Elasticsearch`, as well as
> visualization capabilities powered by `OpenSearch Dashboards` and `Kibana`.
We need to install several python libraries.
```bash
pip install boto3 requests requests-aws4auth
```
See a [usage example](/docs/integrations/vectorstores/opensearch#using-aos-amazon-opensearch-service).
```python
from langchain.vectorstores import OpenSearchVectorSearch
```
## Tools
### AWS Lambda
>[`Amazon AWS Lambda`](https://aws.amazon.com/pm/lambda/) is a serverless computing service provided by
> `Amazon Web Services` (`AWS`). It helps developers to build and run applications and services without
> provisioning or managing servers. This serverless architecture enables you to focus on writing and
> deploying code, while AWS automatically takes care of scaling, patching, and managing the
> infrastructure required to run your applications.
We need to install `boto3` python library.
```bash
pip install boto3
```
See a [usage example](/docs/integrations/tools/awslambda).
## Callbacks
### SageMaker Tracking
>[Amazon SageMaker](https://aws.amazon.com/sagemaker/) is a fully managed service that is used to quickly
> and easily build, train and deploy machine learning (ML) models.
>[Amazon SageMaker Experiments](https://docs.aws.amazon.com/sagemaker/latest/dg/experiments.html) is a capability
> of `Amazon SageMaker` that lets you organize, track,
> compare and evaluate ML experiments and model versions.
We need to install several python libraries.
```bash
pip install google-search-results sagemaker
```
See a [usage example](/docs/integrations/callbacks/sagemaker_tracking).
```python
from langchain.callbacks import SageMakerCallbackHandler
```

View File

@@ -393,14 +393,24 @@ from langchain.chat_loaders.gmail import GMailLoader
## 3rd Party Integrations
### SearchApi
>[SearchApi](https://www.searchapi.io/) provides a 3rd-party API to access Google search results, YouTube search & transcripts, and other Google-related engines.
See [usage examples and authorization instructions](/docs/integrations/tools/searchapi).
```python
from langchain.utilities import SearchApiAPIWrapper
```
### SerpAPI
>[SerpApi](https://serpapi.com/) provides a 3rd-party API to access Google search results.
See a [usage example and authorization instructions](/docs/integrations/tools/google_serper).
See a [usage example and authorization instructions](/docs/integrations/tools/serpapi).
```python
from langchain.utilities import GoogleSerperAPIWrapper
from langchain.utilities import SerpAPIWrapper
```
### YouTube

View File

@@ -0,0 +1,136 @@
# Hugging Face
All functionality related to the [Hugging Face Platform](https://huggingface.co/).
## LLMs
### Hugging Face Hub
>The [Hugging Face Hub](https://huggingface.co/docs/hub/index) is a platform
> with over 350k models, 75k datasets, and 150k demo apps (Spaces), all open source
> and publicly available, in an online platform where people can easily
> collaborate and build ML together. The Hub works as a central place where anyone
> can explore, experiment, collaborate, and build technology with Machine Learning.
To use, we should have the `huggingface_hub` python [package installed](https://huggingface.co/docs/huggingface_hub/installation).
```bash
pip install huggingface_hub
```
See a [usage example](/docs/integrations/llms/huggingface_hub).
```python
from langchain.llms import HuggingFaceHub
```
### Hugging Face Local Pipelines
Hugging Face models can be run locally through the `HuggingFacePipeline` class.
We need to install `transformers` python package.
```bash
pip install transformers
```
See a [usage example](/docs/integrations/llms/huggingface_pipelines).
```python
from langchain.llms.huggingface_pipeline import HuggingFacePipeline
```
### Hugging Face TextGen Inference
>[Text Generation Inference](https://github.com/huggingface/text-generation-inference) is
> a Rust, Python and gRPC server for text generation inference. Used in production at
> [HuggingFace](https://huggingface.co/) to power LLMs api-inference widgets.
We need to install `text_generation` python package.
```bash
pip install text_generation
```
See a [usage example](/docs/integrations/llms/huggingface_textgen_inference).
```python
from langchain.llms import HuggingFaceTextGenInference
```
## Document Loaders
### Hugging Face dataset
>[Hugging Face Hub](https://huggingface.co/docs/hub/index) is home to over 75,000
> [datasets](https://huggingface.co/docs/hub/index#datasets) in more than 100 languages
> that can be used for a broad range of tasks across NLP, Computer Vision, and Audio.
> They used for a diverse range of tasks such as translation, automatic speech
> recognition, and image classification.
We need to install `datasets` python package.
```bash
pip install datasets
```
See a [usage example](/docs/integrations/document_loaders/hugging_face_dataset).
```python
from langchain.document_loaders.hugging_face_dataset import HuggingFaceDatasetLoader
```
## Embedding Models
### Hugging Face Hub
>The [Hugging Face Hub](https://huggingface.co/docs/hub/index) is a platform
> with over 350k models, 75k datasets, and 150k demo apps (Spaces), all open source
> and publicly available, in an online platform where people can easily
> collaborate and build ML together. The Hub works as a central place where anyone
> can explore, experiment, collaborate, and build technology with Machine Learning.
We need to install the `sentence_transformers` python package.
```bash
pip install sentence_transformers
```
#### HuggingFaceEmbeddings
See a [usage example](/docs/integrations/text_embedding/huggingfacehub).
```python
from langchain.embeddings import HuggingFaceEmbeddings
```
#### HuggingFaceInstructEmbeddings
See a [usage example](/docs/integrations/text_embedding/instruct_embeddings).
```python
from langchain.embeddings import HuggingFaceInstructEmbeddings
```
## Tools
### Hugging Face Hub Tools
>[Hugging Face Tools](https://huggingface.co/docs/transformers/v4.29.0/en/custom_tools)
> support text I/O and are loaded using the `load_huggingface_tool` function.
We need to install several python packages.
```bash
pip install transformers huggingface_hub
```
See a [usage example](/docs/integrations/tools/huggingface_tools).
```python
from langchain.agents import load_huggingface_tool
```

View File

@@ -7,9 +7,8 @@ Databricks embraces the LangChain ecosystem in various ways:
1. Databricks connector for the SQLDatabase Chain: SQLDatabase.from_databricks() provides an easy way to query your data on Databricks through LangChain
2. Databricks MLflow integrates with LangChain: Tracking and serving LangChain applications with fewer steps
3. Databricks MLflow AI Gateway
4. Databricks as an LLM provider: Deploy your fine-tuned LLMs on Databricks via serving endpoints or cluster driver proxy apps, and query it as langchain.llms.Databricks
5. Databricks Dolly: Databricks open-sourced Dolly which allows for commercial use, and can be accessed through the Hugging Face Hub
3. Databricks as an LLM provider: Deploy your fine-tuned LLMs on Databricks via serving endpoints or cluster driver proxy apps, and query it as langchain.llms.Databricks
4. Databricks Dolly: Databricks open-sourced Dolly which allows for commercial use, and can be accessed through the Hugging Face Hub
Databricks connector for the SQLDatabase Chain
----------------------------------------------
@@ -25,19 +24,58 @@ Databricks provides a fully managed and hosted version of MLflow integrated with
Databricks MLflow makes it more convenient to develop LangChain applications on Databricks. For MLflow tracking, you don't need to set the tracking uri. For MLflow Model Serving, you can save LangChain Chains in the MLflow langchain flavor, and then register and serve the Chain with a few clicks on Databricks, with credentials securely managed by MLflow Model Serving.
Databricks MLflow AI Gateway
----------------------------
Databricks External Models
--------------------------
See [MLflow AI Gateway](/docs/integrations/providers/mlflow_ai_gateway).
[Databricks External Models](https://docs.databricks.com/generative-ai/external-models/index.html) is a service that is designed to streamline the usage and management of various large language model (LLM) providers, such as OpenAI and Anthropic, within an organization. It offers a high-level interface that simplifies the interaction with these services by providing a unified endpoint to handle specific LLM related requests. The following example creates an endpoint that serves OpenAI's GPT-4 model and generates a chat response from it:
```python
from langchain.chat_models import ChatDatabricks
from langchain.schema.messages import HumanMessage
from mlflow.deployments import get_deploy_client
client = get_deploy_client("databricks")
name = f"chat"
client.create_endpoint(
name=name,
config={
"served_entities": [
{
"name": "test",
"external_model": {
"name": "gpt-4",
"provider": "openai",
"task": "llm/v1/chat",
"openai_config": {
"openai_api_key": "{{secrets/<scope>/<key>}}",
},
},
}
],
},
)
chat = ChatDatabricks(endpoint=name, temperature=0.1)
print(chat([HumanMessage(content="hello")]))
# -> content='Hello! How can I assist you today?'
```
Databricks Foundation Model APIs
--------------------------------
[Databricks Foundation Model APIs](https://docs.databricks.com/machine-learning/foundation-models/index.html) allow you to access and query state-of-the-art open source models from dedicated serving endpoints. With Foundation Model APIs, developers can quickly and easily build applications that leverage a high-quality generative AI model without maintaining their own model deployment. The following example uses the `databricks-bge-large-en` endpoint to generate embeddings from text:
```python
from langchain.llms import DatabricksEmbeddings
embeddings = DatabricksEmbeddings(endpoint="databricks-bge-large-en")
print(embeddings.embed_query("hello")[:3])
# -> [0.051055908203125, 0.007221221923828125, 0.003879547119140625, ...]
```
Databricks as an LLM provider
-----------------------------
The notebook [Wrap Databricks endpoints as LLMs](/docs/integrations/llms/databricks) illustrates the method to wrap Databricks endpoints as LLMs in LangChain. It supports two types of endpoints: the serving endpoint, which is recommended for both production and development, and the cluster driver proxy app, which is recommended for interactive development.
Databricks endpoints support Dolly, but are also great for hosting models like MPT-7B or any other models from the Hugging Face ecosystem. Databricks endpoints can also be used with proprietary models like OpenAI to provide a governance layer for enterprises.
Databricks Dolly
----------------
Databricks Dolly is an instruction-following large language model trained on the Databricks machine learning platform that is licensed for commercial use. The model is available on Hugging Face Hub as databricks/dolly-v2-12b. See the notebook [Hugging Face Hub](/docs/integrations/llms/huggingface_hub) for instructions to access it through the Hugging Face Hub integration with LangChain.
The notebook [Wrap Databricks endpoints as LLMs](/docs/integrations/llms/databricks#wrapping-a-serving-endpoint-custom-model) demonstrates how to serve a custom model that has been registered by MLflow as a Databricks endpoint.
It supports two types of endpoints: the serving endpoint, which is recommended for both production and development, and the cluster driver proxy app, which is recommended for interactive development.

View File

@@ -11,7 +11,7 @@
Click [here](https://www.alibabacloud.com/zh/product/hologres) to fast deploy a Hologres cloud instance.
```bash
pip install psycopg2
pip install hologres-vector
```
## Vector Store

View File

@@ -1,69 +0,0 @@
# Hugging Face
This page covers how to use the Hugging Face ecosystem (including the [Hugging Face Hub](https://huggingface.co)) within LangChain.
It is broken into two parts: installation and setup, and then references to specific Hugging Face wrappers.
## Installation and Setup
If you want to work with the Hugging Face Hub:
- Install the Hub client library with `pip install huggingface_hub`
- Create a Hugging Face account (it's free!)
- Create an [access token](https://huggingface.co/docs/hub/security-tokens) and set it as an environment variable (`HUGGINGFACEHUB_API_TOKEN`)
If you want work with the Hugging Face Python libraries:
- Install `pip install transformers` for working with models and tokenizers
- Install `pip install datasets` for working with datasets
## Wrappers
### LLM
There exists two Hugging Face LLM wrappers, one for a local pipeline and one for a model hosted on Hugging Face Hub.
Note that these wrappers only work for models that support the following tasks: [`text2text-generation`](https://huggingface.co/models?library=transformers&pipeline_tag=text2text-generation&sort=downloads), [`text-generation`](https://huggingface.co/models?library=transformers&pipeline_tag=text-classification&sort=downloads)
To use the local pipeline wrapper:
```python
from langchain.llms import HuggingFacePipeline
```
To use a the wrapper for a model hosted on Hugging Face Hub:
```python
from langchain.llms import HuggingFaceHub
```
For a more detailed walkthrough of the Hugging Face Hub wrapper, see [this notebook](/docs/integrations/llms/huggingface_hub)
### Embeddings
There exists two Hugging Face Embeddings wrappers, one for a local model and one for a model hosted on Hugging Face Hub.
Note that these wrappers only work for [`sentence-transformers` models](https://huggingface.co/models?library=sentence-transformers&sort=downloads).
To use the local pipeline wrapper:
```python
from langchain.embeddings import HuggingFaceEmbeddings
```
To use a the wrapper for a model hosted on Hugging Face Hub:
```python
from langchain.embeddings import HuggingFaceHubEmbeddings
```
For a more detailed walkthrough of this, see [this notebook](/docs/integrations/text_embedding/huggingfacehub)
### Tokenizer
There are several places you can use tokenizers available through the `transformers` package.
By default, it is used to count tokens for all LLMs.
You can also use it to count tokens when splitting documents with
```python
from langchain.text_splitter import CharacterTextSplitter
CharacterTextSplitter.from_huggingface_tokenizer(...)
```
For a more detailed walkthrough of this, see [this notebook](/docs/modules/data_connection/document_transformers/text_splitters/huggingface_length_function)
### Datasets
The Hugging Face Hub has lots of great [datasets](https://huggingface.co/datasets) that can be used to evaluate your LLM chains.
For a detailed walkthrough of how to use them to do so, see [this notebook](/docs/integrations/document_loaders/hugging_face_dataset)

View File

@@ -1,75 +1,20 @@
# Jina
This page covers how to use the Jina ecosystem within LangChain.
This page covers how to use the Jina Embeddings within LangChain.
It is broken into two parts: installation and setup, and then references to specific Jina wrappers.
## Installation and Setup
- Install the Python SDK with `pip install jina`
- Get a Jina AI Cloud auth token from [here](https://cloud.jina.ai/settings/tokens) and set it as an environment variable (`JINA_AUTH_TOKEN`)
## Wrappers
### Embeddings
- Get a Jina AI API token from [here](https://jina.ai/embeddings/) and set it as an environment variable (`JINA_API_TOKEN`)
There exists a Jina Embeddings wrapper, which you can access with
```python
from langchain.embeddings import JinaEmbeddings
```
For a more detailed walkthrough of this, see [this notebook](/docs/integrations/text_embedding/jina)
## Deployment
[Langchain-serve](https://github.com/jina-ai/langchain-serve), powered by Jina, helps take LangChain apps to production with easy to use REST/WebSocket APIs and Slack bots.
### Usage
Install the package from PyPI.
```bash
pip install langchain-serve
# you can pas jina_api_key, if none is passed it will be taken from `JINA_API_TOKEN` environment variable
embeddings = JinaEmbeddings(jina_api_key='jina_**', model_name='jina-embeddings-v2-base-en')
```
Wrap your LangChain app with the `@serving` decorator.
You can check the list of available models from [here](https://jina.ai/embeddings/)
```python
# app.py
from lcserve import serving
@serving
def ask(input: str) -> str:
from langchain.chains import LLMChain
from langchain.llms import OpenAI
from langchain.agents import AgentExecutor, ZeroShotAgent
tools = [...] # list of tools
prompt = ZeroShotAgent.create_prompt(
tools, input_variables=["input", "agent_scratchpad"],
)
llm_chain = LLMChain(llm=OpenAI(temperature=0), prompt=prompt)
agent = ZeroShotAgent(
llm_chain=llm_chain, allowed_tools=[tool.name for tool in tools]
)
agent_executor = AgentExecutor.from_agent_and_tools(
agent=agent,
tools=tools,
verbose=True,
)
return agent_executor.run(input)
```
Deploy on Jina AI Cloud with `lc-serve deploy jcloud app`. Once deployed, we can send a POST request to the API endpoint to get a response.
```bash
curl -X 'POST' 'https://<your-app>.wolf.jina.ai/ask' \
-d '{
"input": "Your Question here?",
"envs": {
"OPENAI_API_KEY": "sk-***"
}
}'
```
You can also self-host the app on your infrastructure with Docker-compose or Kubernetes. See [here](https://github.com/jina-ai/langchain-serve#-self-host-llm-apps-with-docker-compose-or-kubernetes) for more details.
Langchain-serve also allows to deploy the apps with WebSocket APIs and Slack Bots both on [Jina AI Cloud](https://cloud.jina.ai/) or self-hosted infrastructure.
For a more detailed walkthrough of this, see [this notebook](/docs/integrations/text_embedding/jina.ipynb)

View File

@@ -0,0 +1,119 @@
# MLflow Deployments for LLMs
>[The MLflow Deployments for LLMs](https://www.mlflow.org/docs/latest/llms/deployments/index.html) is a powerful tool designed to streamline the usage and management of various large
> language model (LLM) providers, such as OpenAI and Anthropic, within an organization. It offers a high-level interface
> that simplifies the interaction with these services by providing a unified endpoint to handle specific LLM related requests.
## Installation and Setup
Install `mlflow` with MLflow Deployments dependencies:
```sh
pip install 'mlflow[genai]'
```
Set the OpenAI API key as an environment variable:
```sh
export OPENAI_API_KEY=...
```
Create a configuration file:
```yaml
endpoints:
- name: completions
endpoint_type: llm/v1/completions
model:
provider: openai
name: text-davinci-003
config:
openai_api_key: $OPENAI_API_KEY
- name: embeddings
endpoint_type: llm/v1/embeddings
model:
provider: openai
name: text-embedding-ada-002
config:
openai_api_key: $OPENAI_API_KEY
```
Start the deployments server:
```sh
mlflow deployments start-server --config-path /path/to/config.yaml
```
## Example provided by `MLflow`
>The `mlflow.langchain` module provides an API for logging and loading `LangChain` models.
> This module exports multivariate LangChain models in the langchain flavor and univariate LangChain
> models in the pyfunc flavor.
See the [API documentation and examples](https://www.mlflow.org/docs/latest/python_api/mlflow.langchain) for more information.
## Completions Example
```python
import mlflow
from langchain.chains import LLMChain, PromptTemplate
from langchain.llms import Mlflow
llm = Mlflow(
target_uri="http://127.0.0.1:5000",
endpoint="completions",
)
llm_chain = LLMChain(
llm=Mlflow,
prompt=PromptTemplate(
input_variables=["adjective"],
template="Tell me a {adjective} joke",
),
)
result = llm_chain.run(adjective="funny")
print(result)
with mlflow.start_run():
model_info = mlflow.langchain.log_model(chain, "model")
model = mlflow.pyfunc.load_model(model_info.model_uri)
print(model.predict([{"adjective": "funny"}]))
```
## Embeddings Example
```python
from langchain.embeddings import MlflowEmbeddings
embeddings = MlflowEmbeddings(
target_uri="http://127.0.0.1:5000",
endpoint="embeddings",
)
print(embeddings.embed_query("hello"))
print(embeddings.embed_documents(["hello"]))
```
## Chat Example
```python
from langchain.chat_models import ChatMlflow
from langchain.schema import HumanMessage, SystemMessage
chat = ChatMlflow(
target_uri="http://127.0.0.1:5000",
endpoint="chat",
)
messages = [
SystemMessage(
content="You are a helpful assistant that translates English to French."
),
HumanMessage(
content="Translate this sentence from English to French: I love programming."
),
]
print(chat(messages))
```

View File

@@ -1,5 +1,11 @@
# MLflow AI Gateway
:::warning
MLflow AI Gateway has been deprecated. Please use [MLflow Deployments for LLMs](./mlflow) instead.
:::
>[The MLflow AI Gateway](https://www.mlflow.org/docs/latest/gateway/index) service is a powerful tool designed to streamline the usage and management of various large
> language model (LLM) providers, such as OpenAI and Anthropic, within an organization. It offers a high-level interface
> that simplifies the interaction with these services by providing a unified endpoint to handle specific LLM related requests.

View File

@@ -7,9 +7,9 @@
"source": [
"# Amazon Kendra\n",
"\n",
"> Amazon Kendra is an intelligent search service provided by Amazon Web Services (AWS). It utilizes advanced natural language processing (NLP) and machine learning algorithms to enable powerful search capabilities across various data sources within an organization. Kendra is designed to help users find the information they need quickly and accurately, improving productivity and decision-making.\n",
"> [Amazon Kendra](https://docs.aws.amazon.com/kendra/latest/dg/what-is-kendra.html) is an intelligent search service provided by `Amazon Web Services` (`AWS`). It utilizes advanced natural language processing (NLP) and machine learning algorithms to enable powerful search capabilities across various data sources within an organization. `Kendra` is designed to help users find the information they need quickly and accurately, improving productivity and decision-making.\n",
"\n",
"> With Kendra, users can search across a wide range of content types, including documents, FAQs, knowledge bases, manuals, and websites. It supports multiple languages and can understand complex queries, synonyms, and contextual meanings to provide highly relevant search results."
"> With `Kendra`, users can search across a wide range of content types, including documents, FAQs, knowledge bases, manuals, and websites. It supports multiple languages and can understand complex queries, synonyms, and contextual meanings to provide highly relevant search results."
]
},
{
@@ -74,11 +74,24 @@
}
],
"metadata": {
"language_info": {
"name": "python"
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"orig_nbformat": 4
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 4
}

View File

@@ -5,13 +5,13 @@
"id": "b6636c27-35da-4ba7-8313-eca21660cab3",
"metadata": {},
"source": [
"# Amazon Bedrock (Knowledge Bases)\n",
"# Bedrock (Knowledge Bases)\n",
"\n",
"> [Knowledge bases for Amazon Bedrock](https://aws.amazon.com/bedrock/knowledge-bases/) is an Amazon Web Services (AWS) offering which lets you quickly build RAG applications by using your private data to customize FM response.\n",
"\n",
"> Implementing RAG requires organizations to perform several cumbersome steps to convert data into embeddings (vectors), store the embeddings in a specialized vector database, and build custom integrations into the database to search and retrieve text relevant to the users query. This can be time-consuming and inefficient.\n",
"> Implementing `RAG` requires organizations to perform several cumbersome steps to convert data into embeddings (vectors), store the embeddings in a specialized vector database, and build custom integrations into the database to search and retrieve text relevant to the users query. This can be time-consuming and inefficient.\n",
"\n",
"> With Knowledge Bases for Amazon Bedrock, simply point to the location of your data in Amazon S3, and Knowledge Bases for Amazon Bedrock takes care of the entire ingestion workflow into your vector database. If you do not have an existing vector database, Amazon Bedrock creates an Amazon OpenSearch Serverless vector store for you. For retrievals, use the Langchain - Amazon Bedrock integration via the Retrieve API to retrieve relevant results for a user query from knowledge bases.\n",
"> With `Knowledge Bases for Amazon Bedrock`, simply point to the location of your data in `Amazon S3`, and `Knowledge Bases for Amazon Bedrock` takes care of the entire ingestion workflow into your vector database. If you do not have an existing vector database, Amazon Bedrock creates an Amazon OpenSearch Serverless vector store for you. For retrievals, use the Langchain - Amazon Bedrock integration via the Retrieve API to retrieve relevant results for a user query from knowledge bases.\n",
"\n",
"> Knowledge base can be configured through [AWS Console](https://aws.amazon.com/console/) or by using [AWS SDKs](https://aws.amazon.com/developer/tools/)."
]
@@ -108,7 +108,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.13"
"version": "3.10.12"
}
},
"nbformat": 4,

View File

@@ -7,7 +7,16 @@
"source": [
"# Bedrock\n",
"\n",
">[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that makes FMs from leading AI startups and Amazon available via an API, so you can choose from a wide range of FMs to find the model that is best suited for your use case.\n"
">[Amazon Bedrock](https://aws.amazon.com/bedrock/) is a fully managed service that offers a choice of \n",
"> high-performing foundation models (FMs) from leading AI companies like `AI21 Labs`, `Anthropic`, `Cohere`, \n",
"> `Meta`, `Stability AI`, and `Amazon` via a single API, along with a broad set of capabilities you need to \n",
"> build generative AI applications with security, privacy, and responsible AI. Using `Amazon Bedrock`, \n",
"> you can easily experiment with and evaluate top FMs for your use case, privately customize them with \n",
"> your data using techniques such as fine-tuning and `Retrieval Augmented Generation` (`RAG`), and build \n",
"> agents that execute tasks using your enterprise systems and data sources. Since `Amazon Bedrock` is \n",
"> serverless, you don't have to manage any infrastructure, and you can securely integrate and deploy \n",
"> generative AI capabilities into your applications using the AWS services you are already familiar with.\n",
"\n"
]
},
{

View File

@@ -0,0 +1,89 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "2c591a6a42ac7f0",
"metadata": {},
"source": [
"# Bookend AI\n",
"\n",
"Let's load the Bookend AI Embeddings class."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d94c62b4",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings import BookendEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "523a09e3",
"metadata": {},
"outputs": [],
"source": [
"embeddings = BookendEmbeddings(\n",
" domain=\"your_domain\",\n",
" api_token=\"your_api_token\",\n",
" model_id=\"your_embeddings_model_id\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b212bd5a",
"metadata": {},
"outputs": [],
"source": [
"text = \"This is a test document.\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "57db66bd",
"metadata": {},
"outputs": [],
"source": [
"query_result = embeddings.embed_query(text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b790fd09",
"metadata": {},
"outputs": [],
"source": [
"doc_result = embeddings.embed_documents([text])"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -0,0 +1,125 @@
{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"id": "59428e05",
"metadata": {},
"source": [
"# Text Embeddings on Cloudflare Workers AI\n",
"\n",
"[Cloudflare AI document](https://developers.cloudflare.com/workers-ai/models/text-embeddings/) listed all text embeddings models available.\n",
"\n",
"Both Cloudflare account ID and API token are required. Find how to obtain them from [this document](https://developers.cloudflare.com/workers-ai/get-started/rest-api/).\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "92c5b61e",
"metadata": {},
"outputs": [],
"source": [
"from langchain.embeddings.cloudflare_workersai import CloudflareWorkersAIEmbeddings"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "f60023b8",
"metadata": {},
"outputs": [],
"source": [
"import getpass\n",
"\n",
"my_account_id = getpass.getpass(\"Enter your Cloudflare account ID:\\n\\n\")\n",
"my_api_token = getpass.getpass(\"Enter your Cloudflare API token:\\n\\n\")"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "062547b9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(384, [-0.033627357333898544, 0.03982774540781975, 0.03559349477291107])"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"embeddings = CloudflareWorkersAIEmbeddings(\n",
" account_id=my_account_id,\n",
" api_token=my_api_token,\n",
" model_name=\"@cf/baai/bge-small-en-v1.5\",\n",
")\n",
"# single string embeddings\n",
"query_result = embeddings.embed_query(\"test\")\n",
"len(query_result), query_result[:3]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "e1dcc4bd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 384)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# string embeddings in batches\n",
"batch_query_result = embeddings.embed_documents([\"test1\", \"test2\", \"test3\"])\n",
"len(batch_query_result), len(batch_query_result[0])"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "52de8b88",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
},
"vscode": {
"interpreter": {
"hash": "7377c2ccc78bc62c2683122d48c8cd1fb85a53850a1b1fc29736ed39852c9885"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}

File diff suppressed because one or more lines are too long

View File

@@ -12,7 +12,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": null,
"id": "d94c62b4",
"metadata": {},
"outputs": [],
@@ -28,7 +28,7 @@
"outputs": [],
"source": [
"embeddings = JinaEmbeddings(\n",
" jina_auth_token=jina_auth_token, model_name=\"ViT-B-32::openai\"\n",
" jina_api_key=\"jina_*\", model_name=\"jina-embeddings-v2-base-en\"\n",
")"
]
},
@@ -52,6 +52,16 @@
"query_result = embeddings.embed_query(text)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aea3ca33-1e6e-499c-8284-b7e26f38c514",
"metadata": {},
"outputs": [],
"source": [
"print(query_result)"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -62,21 +72,15 @@
"doc_result = embeddings.embed_documents([text])"
]
},
{
"cell_type": "markdown",
"id": "6f3607a0",
"metadata": {},
"source": [
"In the above example, `ViT-B-32::openai`, OpenAI's pretrained `ViT-B-32` model is used. For a full list of models, see [here](https://cloud.jina.ai/user/inference/model/63dca9df5a0da83009d519cd)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cd5f148e",
"id": "c2e6b743-768c-4d7e-a331-27d5f0e8e30e",
"metadata": {},
"outputs": [],
"source": []
"source": [
"print(doc_result)"
]
}
],
"metadata": {
@@ -95,7 +99,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.1"
"version": "3.10.11"
}
},
"nbformat": 4,

View File

@@ -76,18 +76,17 @@
"* Pull requests (read and write)\n",
"\n",
"\n",
"\n",
"Once the app has been registered, add it to the repository you wish the bot to act upon.\n",
"Once the app has been registered, you must give your app permission to access each of the repositories you whish it to act upon. Use the App settings on [github.com here](https://github.com/settings/installations).\n",
"\n",
"### 3. Set Environmental Variables\n",
"\n",
"Before initializing your agent, the following environmental variables need to be set:\n",
"\n",
"* **GITHUB_APP_ID**- A six digit number found in your app's general settings\n",
"* **GITHUB_APP_PRIVATE_KEY**- The location of your app's private key .pem file\n",
"* **GITHUB_REPOSITORY**- The name of the Github repository you want your bot to act upon. Must follow the format {username}/{repo-name}. Make sure the app has been added to this repository first!\n",
"* **GITHUB_BRANCH**- The branch where the bot will make its commits. Defaults to 'master.'\n",
"* **GITHUB_BASE_BRANCH**- The base branch of your repo, usually either 'main' or 'master.' This is where pull requests will base from. Defaults to 'master.'\n"
"* **GITHUB_APP_PRIVATE_KEY**- The location of your app's private key .pem file, or the full text of that file as a string.\n",
"* **GITHUB_REPOSITORY**- The name of the Github repository you want your bot to act upon. Must follow the format {username}/{repo-name}. *Make sure the app has been added to this repository first!*\n",
"* Optional: **GITHUB_BRANCH**- The branch where the bot will make its commits. Defaults to `repo.default_branch`.\n",
"* Optional: **GITHUB_BASE_BRANCH**- The base branch of your repo upon which PRs will based from. Defaults to `repo.default_branch`.\n"
]
},
{
@@ -99,7 +98,7 @@
},
{
"cell_type": "code",
"execution_count": 47,
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
@@ -107,13 +106,13 @@
"\n",
"from langchain.agents import AgentType, initialize_agent\n",
"from langchain.agents.agent_toolkits.github.toolkit import GitHubToolkit\n",
"from langchain.llms import OpenAI\n",
"from langchain.chat_models import ChatOpenAI\n",
"from langchain.utilities.github import GitHubAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 53,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -130,16 +129,54 @@
},
{
"cell_type": "code",
"execution_count": 54,
"execution_count": 5,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Available tools:\n",
"\tGet Issues\n",
"\tGet Issue\n",
"\tComment on Issue\n",
"\tList open pull requests (PRs)\n",
"\tGet Pull Request\n",
"\tOverview of files included in PR\n",
"\tCreate Pull Request\n",
"\tList Pull Requests' Files\n",
"\tCreate File\n",
"\tRead File\n",
"\tUpdate File\n",
"\tDelete File\n",
"\tOverview of existing files in Main branch\n",
"\tOverview of files in current working branch\n",
"\tList branches in this repository\n",
"\tSet active branch\n",
"\tCreate a new branch\n",
"\tGet files from a directory\n",
"\tSearch issues and pull requests\n",
"\tSearch code\n",
"\tCreate review request\n"
]
}
],
"source": [
"llm = OpenAI(temperature=0)\n",
"llm = ChatOpenAI(temperature=0, model=\"gpt-4-1106-preview\")\n",
"github = GitHubAPIWrapper()\n",
"toolkit = GitHubToolkit.from_github_api_wrapper(github)\n",
"tools = toolkit.get_tools()\n",
"\n",
"# STRUCTURED_CHAT includes args_schema for each tool, helps tool args parsing errors.\n",
"agent = initialize_agent(\n",
" toolkit.get_tools(), llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
")"
" tools,\n",
" llm,\n",
" agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
" verbose=True,\n",
")\n",
"print(\"Available tools:\")\n",
"for tool in tools:\n",
" print(\"\\t\" + tool.name)"
]
},
{
@@ -219,7 +256,428 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example: Advanced Agent\n",
"## Example: Read an issue, open a pull request\n",
"\n",
"Workflow: \n",
"1. Read issues, either a specific one or just ask it to look at recent ones. \n",
"2. Write code, commit it to a new branch.\n",
"3. Open a PR\n",
"4. \"Request review\" on the PR from the original author of the issue.\n",
"\n",
"\n",
"### Input data and LangSmith Trace\n",
"* LangSmith trace for this run: https://smith.langchain.com/public/fee6643c-b214-42d0-967b-d24dcdd690fe/r\n",
"* Input issue: https://github.com/KastanDay/ML4Bio/issues/33\n",
"* Final PR created by bot: https://github.com/KastanDay/ML4Bio/pull/40"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Please implement these changes by creating or editing the necessary files. \n",
"\n",
"1. First use read_file to read any files in the repo that seem relevant. \n",
"2. Then, when you're ready, start implementing changes by creating and updating files. Implement any and all remaining code to make the project work as the commenter intended. \n",
"2. The last step is to create a PR with a clear and concise title and description, list any concerns or final changes necessary in the PR body.\n",
"3. After opening the PR, comment on the original issue and mention the new PR your just opened, you must comment \"I opened a PR for you to review here #<PR_NUMBER>\" (it'll be something like #30). That hashtag syntax will automatically link to the PR, as necessary. Thanks.\n",
"4. If you feel the PR is satisfactory for completing your assignment, create a review request for the original user that opened the issue. Use their username to tag them.\n",
"\n",
"Feel free to ask for help or leave a comment on the Issue or PR if you're stuck.\n",
"\n",
"Here's your latest assignment: {issue_description}\n"
]
}
],
"source": [
"from langchain import hub\n",
"\n",
"gh_issue_prompt_template = hub.pull(\"kastanday/new-github-issue\")\n",
"print(gh_issue_prompt_template.template)"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Please implement these changes by creating or editing the necessary files. \n",
"\n",
"1. First use read_file to read any files in the repo that seem relevant. \n",
"2. Then, when you're ready, start implementing changes by creating and updating files. Implement any and all remaining code to make the project work as the commenter intended. \n",
"2. The last step is to create a PR with a clear and concise title and description, list any concerns or final changes necessary in the PR body.\n",
"3. After opening the PR, comment on the original issue and mention the new PR your just opened, you must comment \"I opened a PR for you to review here #<PR_NUMBER>\" (it'll be something like #30). That hashtag syntax will automatically link to the PR, as necessary. Thanks.\n",
"4. If you feel the PR is satisfactory for completing your assignment, create a review request for the original user that opened the issue. Use their username to tag them.\n",
"\n",
"Feel free to ask for help or leave a comment on the Issue or PR if you're stuck.\n",
"\n",
"Here's your latest assignment: Title: Create a full command line executable workflow for RNA-Seq on PBMC Samples. Open a new pull request (on a separate branch) and comment the PR number here when you're done..\n",
"Opened by user: KastanDay\n",
"Body: Experiment Type:\n",
"RNA-Seq\n",
"Sequencing of total cellular RNA\n",
"\n",
"Workflow Management:\n",
"Bash/SLURM\n",
"Scripting and job scheduling\n",
"\n",
"Software Stack:\n",
"FastQC\n",
"MultiQC\n",
"STAR\n",
"RSEM\n",
"samtools\n",
"DESeq2\n",
"\n",
"What else to know about the pipeline?\n",
"I am working PBMC samples collected from patients that are undergoing immunotherapy.\n",
"\n",
"Use the data files existing in [Report_WholeBrain](https://github.com/KastanDay/ML4Bio/tree/main/Report_WholeBrain) as input for this workflow.\n",
"\n",
"You should write a series of bash scripts and R scripts that can accomplish this task. Open a PR with those scripts when you're done.\n"
]
}
],
"source": [
"def format_issue(issue):\n",
" title = f\"Title: {issue.get('title')}.\"\n",
" opened_by = f\"Opened by user: {issue.get('opened_by')}\"\n",
" body = f\"Body: {issue.get('body')}\"\n",
" comments = issue.get(\"comments\") # often too long\n",
" return \"\\n\".join([title, opened_by, body])\n",
"\n",
"\n",
"issue = github.get_issue(33) # task to implement a RNA-seq pipeline (bioinformatics)\n",
"final_gh_issue_prompt = gh_issue_prompt_template.format(\n",
" issue_description=format_issue(issue)\n",
")\n",
"print(final_gh_issue_prompt)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"from langchain.memory.summary_buffer import ConversationSummaryBufferMemory\n",
"from langchain_core.prompts.chat import MessagesPlaceholder\n",
"\n",
"summarizer_llm = ChatOpenAI(temperature=0, model=\"gpt-3.5-turbo\") # type: ignore\n",
"chat_history = MessagesPlaceholder(variable_name=\"chat_history\")\n",
"memory = ConversationSummaryBufferMemory(\n",
" memory_key=\"chat_history\",\n",
" return_messages=True,\n",
" llm=summarizer_llm,\n",
" max_token_limit=2_000,\n",
")\n",
"\n",
"agent = initialize_agent(\n",
" tools,\n",
" llm,\n",
" agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
" verbose=True,\n",
" handle_parsing_errors=True, # or pass a function that accepts the error and returns a string\n",
" max_iterations=30,\n",
" max_execution_time=None,\n",
" early_stopping_method=\"generate\",\n",
" memory=memory,\n",
" # trim_intermediate_steps=fancier_trim_intermediate_steps,\n",
" agent_kwargs={\n",
" \"memory_prompts\": [chat_history],\n",
" \"input_variables\": [\"input\", \"agent_scratchpad\", \"chat_history\"],\n",
" \"prefix\": final_gh_issue_prompt,\n",
" },\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m```json\n",
"{\n",
" \"action\": \"Get files from a directory\",\n",
" \"action_input\": \"ML4Bio/tree/main/Report_WholeBrain\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mError: status code 404, None\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe previous action to get files from a directory failed because the path provided does not exist or is not accessible. I need to correct the path to access the files in the `Report_WholeBrain` directory. Let's try to fetch the list of files from the correct directory path. \n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Get files from a directory\",\n",
" \"action_input\": \"Report_WholeBrain\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3m['Report_WholeBrain/MDSclustering_WholeBrain.html', 'Report_WholeBrain/MDSclustering_WholeBrain_RUVremoved.html', 'Report_WholeBrain/Report_Antonson_WholeBrain_2022Mar.Rmd', 'Report_WholeBrain/Report_WholeBrain_files/figure-html/Figure 1-1.png', 'Report_WholeBrain/Report_WholeBrain_files/figure-html/Figure 2-1.png', 'Report_WholeBrain/Report_WholeBrain_files/figure-html/Figure 3-1.png', 'Report_WholeBrain/Report_WholeBrain_files/figure-html/Figure 4-1.png', 'Report_WholeBrain/Report_WholeBrain_files/figure-html/Figure 6-1.png', 'Report_WholeBrain/Report_WholeBrain_files/figure-html/Figure 7-1.png', 'Report_WholeBrain/Report_WholeBrain_files/figure-html/Figure 8-1.png', 'Report_WholeBrain/Report_WholeBrain_files/figure-html/Figure 9-1.png', 'Report_WholeBrain/SalmonSummarizedOutput.RData', 'Report_WholeBrain/SampleInfo_RUVvariables_WholeBrain_2022-05-12.csv', 'Report_WholeBrain/Targets_Final.txt', 'Report_WholeBrain/WholeBrain_GeneResults_2022-05-12.xlsx', 'Report_WholeBrain/WholeBrain_GeneResults_RUV_2022-05-12.xlsx', 'Report_WholeBrain/WholeBrain_Gene_level_counts_2022-05-12.xlsx', 'Report_WholeBrain/WholeBrain_RUV_FDR0.1.html', 'Report_WholeBrain/WholeBrain_logCPMValues_RUVcorrected_2022-05-12.xlsx', 'Report_WholeBrain/WholeBrain_logCPMvalues_2022-05-12.xlsx', 'Report_WholeBrain/WholeBrain_rawP05.html', 'Report_WholeBrain/getGO.R', 'Report_WholeBrain/getPath.R', 'Report_WholeBrain/interactive_plots/css/glimma.min.css', 'Report_WholeBrain/interactive_plots/css/src/images/animated-overlay.gif', 'Report_WholeBrain/interactive_plots/css/src/images/favicon.ico', 'Report_WholeBrain/interactive_plots/css/src/images/sort_asc.png', 'Report_WholeBrain/interactive_plots/css/src/images/sort_asc_disabled.png', 'Report_WholeBrain/interactive_plots/css/src/images/sort_both.png', 'Report_WholeBrain/interactive_plots/css/src/images/sort_desc.png', 'Report_WholeBrain/interactive_plots/css/src/images/sort_desc_disabled.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-bg_flat_0_aaaaaa_40x100.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-bg_flat_75_ffffff_40x100.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-bg_glass_55_fbf9ee_1x400.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-bg_glass_65_ffffff_1x400.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-bg_glass_75_dadada_1x400.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-bg_glass_75_e6e6e6_1x400.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-bg_glass_95_fef1ec_1x400.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-bg_highlight-soft_75_cccccc_1x100.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-icons_222222_256x240.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-icons_2e83ff_256x240.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-icons_454545_256x240.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-icons_888888_256x240.png', 'Report_WholeBrain/interactive_plots/css/src/images/ui-icons_cd0a0a_256x240.png', 'Report_WholeBrain/interactive_plots/js/glimma.min.js', 'Report_WholeBrain/interactive_plots/js/old_MDSclustering_Microglia.js', 'Report_WholeBrain/interactive_plots/js/old_MDSclustering_Microglia_RUV.js', 'Report_WholeBrain/interactive_plots/js/old_MDSclustering_WholeBrain.js', 'Report_WholeBrain/interactive_plots/js/old_MDSclustering_WholeBrain_RUV.js', 'Report_WholeBrain/interactive_plots/js/old_MDSclustering_WholeBrain_noOUT.js', 'Report_WholeBrain/interactive_plots/js/old_Microglia_rawP05.js', 'Report_WholeBrain/interactive_plots/js/old_WholeBrain_RUV_FDR0.1.js', 'Report_WholeBrain/interactive_plots/js/old_WholeBrain_rawP05.js', 'Report_WholeBrain/interactive_plots/old_MDSclustering_Microglia.html', 'Report_WholeBrain/interactive_plots/old_MDSclustering_Microglia_RUV.html', 'Report_WholeBrain/interactive_plots/old_MDSclustering_WholeBrain.html', 'Report_WholeBrain/interactive_plots/old_MDSclustering_WholeBrain_RUV.html', 'Report_WholeBrain/interactive_plots/old_MDSclustering_WholeBrain_noOUT.html', 'Report_WholeBrain/interactive_plots/old_Microglia_rawP05.html', 'Report_WholeBrain/interactive_plots/old_WholeBrain_RUV_FDR0.1.html', 'Report_WholeBrain/interactive_plots/old_WholeBrain_rawP05.html', 'Report_WholeBrain/trx_EGids_combined.txt', 'Report_WholeBrain/⭐____Report_Antonson_WholeBrain_2022Mar.html']\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe files in the `Report_WholeBrain` directory seem to be mostly reports, images, and some R scripts. However, none of these files appear to be raw RNA-Seq data files or scripts directly related to the RNA-Seq workflow requested by the user. To proceed with creating the workflow, I will need to create new bash and R scripts that will handle the RNA-Seq analysis using the software stack mentioned (FastQC, MultiQC, STAR, RSEM, samtools, DESeq2).\n",
"\n",
"Before creating the scripts, I should create a new branch to work on this feature.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Create a new branch\",\n",
" \"action_input\": \"rna-seq-workflow\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[33;1m\u001b[1;3mBranch 'rna-seq-workflow_v1' created successfully, and set as current active branch.\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mNow that the new branch `rna-seq-workflow_v1` is created and active, I can start creating the necessary bash and R scripts for the RNA-Seq workflow. I will begin by creating a bash script for quality control using FastQC and MultiQC.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Create File\",\n",
" \"action_input\": \"rna-seq-workflow/01_quality_control.sh\\n#!/bin/bash\\n#PBS -N quality_control\\n#PBS -l walltime=04:00:00\\n#PBS -l nodes=1:ppn=8\\n#PBS -q batch\\n\\n# Load modules for FastQC and MultiQC\\nmodule load FastQC/0.11.8\\nmodule load MultiQC/1.9\\n\\n# Define input and output directories\\nINPUT_DIR='/path/to/raw_data'\\nOUTPUT_DIR='/path/to/output/quality_control'\\n\\n# Create output directory if it doesn't exist\\nmkdir -p $OUTPUT_DIR\\n\\n# Run FastQC on all files in the input directory\\nfastqc -o $OUTPUT_DIR -t 8 $INPUT_DIR/*.fastq.gz\\n\\n# Run MultiQC to aggregate results\\nmultiqc -o $OUTPUT_DIR $OUTPUT_DIR\\n\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mCreated file rna-seq-workflow/01_quality_control.sh\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe quality control bash script has been created successfully. Next, I will create a bash script for the alignment step using STAR.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Create File\",\n",
" \"action_input\": \"rna-seq-workflow/02_alignment.sh\\n#!/bin/bash\\n#PBS -N alignment\\n#PBS -l walltime=08:00:00\\n#PBS -l nodes=1:ppn=16\\n#PBS -q batch\\n\\n# Load module for STAR\\nmodule load STAR/2.7.3a\\n\\n# Define input and output directories\\nINPUT_DIR='/path/to/quality_control'\\nOUTPUT_DIR='/path/to/output/alignment'\\nGENOME_DIR='/path/to/genome_indices'\\n\\n# Create output directory if it doesn't exist\\nmkdir -p $OUTPUT_DIR\\n\\n# Run STAR for each sample\\nfor sample in $INPUT_DIR/*.fastq.gz\\ndo\\n # Define output file names\\n prefix=$(basename $sample .fastq.gz)\\n output_prefix=$OUTPUT_DIR/$prefix\\n\\n # Run STAR\\n STAR --runThreadN 16 --genomeDir $GENOME_DIR --readFilesIn $sample --readFilesCommand zcat --outFileNamePrefix $output_prefix --outSAMtype BAM SortedByCoordinate\\n\\ndone\\n\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mCreated file rna-seq-workflow/02_alignment.sh\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe alignment bash script has been created successfully. Next, I will create a bash script for transcript quantification using RSEM.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Create File\",\n",
" \"action_input\": \"rna-seq-workflow/03_transcript_quantification.sh\\n#!/bin/bash\\n#PBS -N transcript_quantification\\n#PBS -l walltime=12:00:00\\n#PBS -l nodes=1:ppn=8\\n#PBS -q batch\\n\\n# Load module for RSEM\\nmodule load RSEM/1.3.3\\n\\n# Define input and output directories\\nINPUT_DIR='/path/to/alignment'\\nOUTPUT_DIR='/path/to/output/transcript_quantification'\\nREFERENCE_DIR='/path/to/reference'\\n\\n# Create output directory if it doesn't exist\\nmkdir -p $OUTPUT_DIR\\n\\n# Run RSEM for each sample\\nfor bam in $INPUT_DIR/*.bam\\ndo\\n # Define output file names\\n prefix=$(basename $bam .bam)\\n output_prefix=$OUTPUT_DIR/$prefix\\n\\n # Run RSEM\\n rsem-calculate-expression --bam --no-bam-output --estimate-rspd --append-names --output-genome-bam -p 8 $bam $REFERENCE_DIR $output_prefix\\n\\ndone\\n\"\n",
"}\n",
"```\n",
"\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mCreated file rna-seq-workflow/03_transcript_quantification.sh\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe transcript quantification bash script has been created successfully. Now, I will create a bash script for sorting and indexing BAM files using samtools, which is a common step before differential expression analysis.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Create File\",\n",
" \"action_input\": \"rna-seq-workflow/04_sort_index_bam.sh\\n#!/bin/bash\\n#PBS -N sort_index_bam\\n#PBS -l walltime=02:00:00\\n#PBS -l nodes=1:ppn=4\\n#PBS -q batch\\n\\n# Load module for samtools\\nmodule load samtools/1.10\\n\\n# Define input and output directories\\nINPUT_DIR='/path/to/alignment'\\nOUTPUT_DIR='/path/to/output/sorted_bams'\\n\\n# Create output directory if it doesn't exist\\nmkdir -p $OUTPUT_DIR\\n\\n# Sort and index BAM files\\nfor bam in $INPUT_DIR/*.bam\\ndo\\n # Define output file names\\n sorted_bam=$OUTPUT_DIR/$(basename $bam .bam).sorted.bam\\n\\n # Sort BAM file\\n samtools sort -@ 4 -o $sorted_bam $bam\\n\\n # Index sorted BAM file\\n samtools index $sorted_bam\\ndone\\n\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mCreated file rna-seq-workflow/04_sort_index_bam.sh\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe sorting and indexing of BAM files script has been created successfully. The next step is to create an R script for differential expression analysis using DESeq2. This script will be executed after the previous steps in the workflow have been completed.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Create File\",\n",
" \"action_input\": \"rna-seq-workflow/05_differential_expression.R\\n# Load DESeq2 library\\nlibrary(DESeq2)\\n\\n# Set input directory\\ninput_dir <- '/path/to/transcript_quantification'\\n\\n# Read in the count matrix and the sample information\\n# Assuming countData is a matrix with count data and colData is a DataFrame with sample information\\ncountData <- as.matrix(read.csv(file.path(input_dir, 'count_matrix.csv'), row.names=1))\\ncolData <- read.csv(file.path(input_dir, 'colData.csv'))\\n\\n# Create a DESeqDataSet object\\ndds <- DESeqDataSetFromMatrix(countData = countData,\\n colData = colData,\\n design = ~ condition)\\n\\n# Run the differential expression analysis\\ndds <- DESeq(dds)\\n\\n# Get the results\\nres <- results(dds)\\n\\n# Order the results by the adjusted p-value\\nresOrdered <- res[order(res$padj),]\\n\\n# Write the results to a CSV file\\nwrite.csv(as.data.frame(resOrdered), file.path(input_dir, 'differential_expression_results.csv'))\\n\"\n",
"}\n",
"```\n",
"\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mCreated file rna-seq-workflow/05_differential_expression.R\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe R script for differential expression analysis using DESeq2 has been created successfully. Now that all the necessary scripts for the RNA-Seq workflow have been created, I should create a master bash script that will call each of these scripts in sequence to run the entire workflow.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Create File\",\n",
" \"action_input\": \"rna-seq-workflow/run_workflow.sh\\n#!/bin/bash\\n# Master script to run the RNA-Seq workflow\\n\\n# Run quality control\\nbash 01_quality_control.sh\\n\\n# Run alignment\\nbash 02_alignment.sh\\n\\n# Run transcript quantification\\nbash 03_transcript_quantification.sh\\n\\n# Sort and index BAM files\\nbash 04_sort_index_bam.sh\\n\\n# Run differential expression analysis\\nRscript 05_differential_expression.R\\n\\n# Workflow completed\\necho \\\"RNA-Seq workflow completed successfully.\\\"\\n\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mCreated file rna-seq-workflow/run_workflow.sh\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe master bash script to run the entire RNA-Seq workflow has been created successfully. The next step is to create a pull request with all the scripts that have been created in the `rna-seq-workflow_v1` branch.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Create Pull Request\",\n",
" \"action_input\": \"Create a full command line executable workflow for RNA-Seq on PBMC Samples\\n\\nThis pull request includes a series of bash and R scripts to perform RNA-Seq analysis on PBMC samples. The workflow includes steps for quality control, alignment, transcript quantification, sorting and indexing BAM files, and differential expression analysis using the following software stack: FastQC, MultiQC, STAR, RSEM, samtools, and DESeq2.\\n\\nPlease review the scripts and provide feedback or request any additional changes as necessary.\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mSuccessfully created PR number 40\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe pull request has been successfully created. Now, I need to comment on the original issue to inform the user about the new PR.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Comment on Issue\",\n",
" \"action_input\": \"2\\n\\nI opened a PR for you to review here #40\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mCommented on issue 2\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mThe comment has been posted on the original issue. Now, I will create a review request for the user KastanDay who opened the issue.\n",
"\n",
"Action:\n",
"```json\n",
"{\n",
" \"action\": \"Create review request\",\n",
" \"action_input\": \"KastanDay\"\n",
"}\n",
"```\u001b[0m\n",
"Observation: \u001b[38;5;200m\u001b[1;3mReview request created for user KastanDay on PR #40\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3mAction:\n",
"```json\n",
"{\n",
" \"action\": \"Final Answer\",\n",
" \"action_input\": \"The tasks have been completed: a new branch was created, the necessary scripts for the RNA-Seq workflow were written, a pull request was opened, the original issue was commented on with the PR number, and a review request was sent to the user KastanDay.\"\n",
"}\n",
"```\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n"
]
}
],
"source": [
"from langchain_core.tracers.context import tracing_v2_enabled\n",
"\n",
"# To use langsmith (recommended for these long tasks):\n",
"os.environ[\"LANGCHAIN_TRACING_V2\"] = \"true\"\n",
"os.environ[\"LANGCHAIN_ENDPOINT\"] = \"https://api.smith.langchain.com\"\n",
"os.environ[\"LANGCHAIN_API_KEY\"] = \"ls__......\"\n",
"os.environ[\"LANGCHAIN_PROJECT\"] = \"Github_Demo_PR\"\n",
"os.environ[\"LANGCHAIN_WANDB_TRACING\"] = \"false\"\n",
"\n",
"\n",
"with tracing_v2_enabled(project_name=\"Github_Demo_PR\", tags=[\"PR_bot\"]) as cb:\n",
" agent.run(final_gh_issue_prompt)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Full text of tools\n",
"\n",
"When using or building tools, it's always helpful to inspect what the model sees.\n",
"\n",
"On OpenAI models, tool descriptions are part of the `SystemPrompt`.\n",
"\n",
"The `args` are added to the prompt in structured chats, e.g. `AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION`, but not in `AgentType.ZERO_SHOT_REACT_DESCRIPTION`."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Get Issues: \n",
"This tool will fetch a list of the repository's issues. It will return the title, and issue number of 5 issues. It takes no input., args: {'no_input': {'title': 'No Input', 'description': 'No input required, e.g. `` (empty string).', 'default': '', 'type': 'string'}}\n",
"Get Issue: \n",
"This tool will fetch the title, body, and comment thread of a specific issue. **VERY IMPORTANT**: You must specify the issue number as an integer., args: {'issue_number': {'title': 'Issue Number', 'description': 'Issue number as an integer, e.g. `42`', 'default': 0, 'type': 'integer'}}\n",
"Comment on Issue: \n",
"This tool is useful when you need to comment on a GitHub issue. Simply pass in the issue number and the comment you would like to make. Please use this sparingly as we don't want to clutter the comment threads. **VERY IMPORTANT**: Your input to this tool MUST strictly follow these rules:\n",
"\n",
"- First you must specify the issue number as an integer\n",
"- Then you must place two newlines\n",
"- Then you must specify your comment, args: {'input': {'title': 'Input', 'description': 'Follow the required formatting.', 'type': 'string'}}\n",
"List open pull requests (PRs): \n",
"This tool will fetch a list of the repository's Pull Requests (PRs). It will return the title, and PR number of 5 PRs. It takes no input., args: {'no_input': {'title': 'No Input', 'description': 'No input required, e.g. `` (empty string).', 'default': '', 'type': 'string'}}\n",
"Get Pull Request: \n",
"This tool will fetch the title, body, comment thread and commit history of a specific Pull Request (by PR number). **VERY IMPORTANT**: You must specify the PR number as an integer., args: {'pr_number': {'title': 'Pr Number', 'description': 'The PR number as an integer, e.g. `12`', 'default': 0, 'type': 'integer'}}\n",
"Overview of files included in PR: \n",
"This tool will fetch the full text of all files in a pull request (PR) given the PR number as an input. This is useful for understanding the code changes in a PR or contributing to it. **VERY IMPORTANT**: You must specify the PR number as an integer input parameter., args: {'pr_number': {'title': 'Pr Number', 'description': 'The PR number as an integer, e.g. `12`', 'default': 0, 'type': 'integer'}}\n",
"Create Pull Request: \n",
"This tool is useful when you need to create a new pull request in a GitHub repository. **VERY IMPORTANT**: Your input to this tool MUST strictly follow these rules:\n",
"\n",
"- First you must specify the title of the pull request\n",
"- Then you must place two newlines\n",
"- Then you must write the body or description of the pull request\n",
"\n",
"When appropriate, always reference relevant issues in the body by using the syntax `closes #<issue_number` like `closes #3, closes #6`.\n",
"For example, if you would like to create a pull request called \"README updates\" with contents \"added contributors' names, closes #3\", you would pass in the following string:\n",
"\n",
"README updates\n",
"\n",
"added contributors' names, closes #3, args: {'formatted_pr': {'title': 'Formatted Pr', 'description': 'Follow the required formatting.', 'type': 'string'}}\n",
"List Pull Requests' Files: \n",
"This tool will fetch the full text of all files in a pull request (PR) given the PR number as an input. This is useful for understanding the code changes in a PR or contributing to it. **VERY IMPORTANT**: You must specify the PR number as an integer input parameter., args: {'pr_number': {'title': 'Pr Number', 'description': 'The PR number as an integer, e.g. `12`', 'default': 0, 'type': 'integer'}}\n",
"Create File: \n",
"This tool is a wrapper for the GitHub API, useful when you need to create a file in a GitHub repository. **VERY IMPORTANT**: Your input to this tool MUST strictly follow these rules:\n",
"\n",
"- First you must specify which file to create by passing a full file path (**IMPORTANT**: the path must not start with a slash)\n",
"- Then you must specify the contents of the file\n",
"\n",
"For example, if you would like to create a file called /test/test.txt with contents \"test contents\", you would pass in the following string:\n",
"\n",
"test/test.txt\n",
"\n",
"test contents, args: {'formatted_file': {'title': 'Formatted File', 'description': 'Follow the required formatting.', 'type': 'string'}}\n",
"Read File: \n",
"This tool is a wrapper for the GitHub API, useful when you need to read the contents of a file. Simply pass in the full file path of the file you would like to read. **IMPORTANT**: the path must not start with a slash, args: {'formatted_filepath': {'title': 'Formatted Filepath', 'description': 'The full file path of the file you would like to read where the path must NOT start with a slash, e.g. `some_dir/my_file.py`.', 'type': 'string'}}\n",
"Update File: \n",
"This tool is a wrapper for the GitHub API, useful when you need to update the contents of a file in a GitHub repository. **VERY IMPORTANT**: Your input to this tool MUST strictly follow these rules:\n",
"\n",
"- First you must specify which file to modify by passing a full file path (**IMPORTANT**: the path must not start with a slash)\n",
"- Then you must specify the old contents which you would like to replace wrapped in OLD <<<< and >>>> OLD\n",
"- Then you must specify the new contents which you would like to replace the old contents with wrapped in NEW <<<< and >>>> NEW\n",
"\n",
"For example, if you would like to replace the contents of the file /test/test.txt from \"old contents\" to \"new contents\", you would pass in the following string:\n",
"\n",
"test/test.txt\n",
"\n",
"This is text that will not be changed\n",
"OLD <<<<\n",
"old contents\n",
">>>> OLD\n",
"NEW <<<<\n",
"new contents\n",
">>>> NEW, args: {'formatted_file_update': {'title': 'Formatted File Update', 'description': 'Strictly follow the provided rules.', 'type': 'string'}}\n",
"Delete File: \n",
"This tool is a wrapper for the GitHub API, useful when you need to delete a file in a GitHub repository. Simply pass in the full file path of the file you would like to delete. **IMPORTANT**: the path must not start with a slash, args: {'formatted_filepath': {'title': 'Formatted Filepath', 'description': 'The full file path of the file you would like to delete where the path must NOT start with a slash, e.g. `some_dir/my_file.py`. Only input a string, not the param name.', 'type': 'string'}}\n",
"Overview of existing files in Main branch: \n",
"This tool will provide an overview of all existing files in the main branch of the repository. It will list the file names, their respective paths, and a brief summary of their contents. This can be useful for understanding the structure and content of the repository, especially when navigating through large codebases. No input parameters are required., args: {'no_input': {'title': 'No Input', 'description': 'No input required, e.g. `` (empty string).', 'default': '', 'type': 'string'}}\n",
"Overview of files in current working branch: \n",
"This tool will provide an overview of all files in your current working branch where you should implement changes. This is great for getting a high level overview of the structure of your code. No input parameters are required., args: {'no_input': {'title': 'No Input', 'description': 'No input required, e.g. `` (empty string).', 'default': '', 'type': 'string'}}\n",
"List branches in this repository: \n",
"This tool will fetch a list of all branches in the repository. It will return the name of each branch. No input parameters are required., args: {'no_input': {'title': 'No Input', 'description': 'No input required, e.g. `` (empty string).', 'default': '', 'type': 'string'}}\n",
"Set active branch: \n",
"This tool will set the active branch in the repository, similar to `git checkout <branch_name>` and `git switch -c <branch_name>`. **VERY IMPORTANT**: You must specify the name of the branch as a string input parameter., args: {'branch_name': {'title': 'Branch Name', 'description': 'The name of the branch, e.g. `my_branch`.', 'type': 'string'}}\n",
"Create a new branch: \n",
"This tool will create a new branch in the repository. **VERY IMPORTANT**: You must specify the name of the new branch as a string input parameter., args: {'branch_name': {'title': 'Branch Name', 'description': 'The name of the branch, e.g. `my_branch`.', 'type': 'string'}}\n",
"Get files from a directory: \n",
"This tool will fetch a list of all files in a specified directory. **VERY IMPORTANT**: You must specify the path of the directory as a string input parameter., args: {'input': {'title': 'Input', 'description': 'The path of the directory, e.g. `some_dir/inner_dir`. Only input a string, do not include the parameter name.', 'default': '', 'type': 'string'}}\n",
"Search issues and pull requests: \n",
"This tool will search for issues and pull requests in the repository. **VERY IMPORTANT**: You must specify the search query as a string input parameter., args: {'search_query': {'title': 'Search Query', 'description': 'Natural language search query, e.g. `My issue title or topic`.', 'type': 'string'}}\n",
"Search code: \n",
"This tool will search for code in the repository. **VERY IMPORTANT**: You must specify the search query as a string input parameter., args: {'search_query': {'title': 'Search Query', 'description': 'A keyword-focused natural language search query for code, e.g. `MyFunctionName()`.', 'type': 'string'}}\n",
"Create review request: \n",
"This tool will create a review request on the open pull request that matches the current active branch. **VERY IMPORTANT**: You must specify the username of the person who is being requested as a string input parameter., args: {'username': {'title': 'Username', 'description': 'GitHub username of the user being requested, e.g. `my_username`.', 'type': 'string'}}\n"
]
}
],
"source": [
"from langchain.tools.render import render_text_description_and_args\n",
"\n",
"print(render_text_description_and_args(tools))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example: Agent with Search\n",
"\n",
"If your agent does not need to use all 8 tools, you can build tools individually to use. For this example, we'll make an agent that does not use the create_file, delete_file or create_pull_request tools, but can also use duckduckgo-search."
]
@@ -382,7 +840,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
"version": "3.10.13"
}
},
"nbformat": 4,

View File

@@ -0,0 +1,108 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "e6fd05db-21c2-4227-9900-0840bc62cb31",
"metadata": {},
"source": [
"# NASA\n",
"\n",
"This notebook shows how to use agents to interact with the NASA toolkit. The toolkit provides access to the NASA Image and Video Library API, with potential to expand and include other accessible NASA APIs in future iterations.\n",
"\n",
"**Note: NASA Image and Video Library search queries can result in large responses when the number of desired media results is not specified. Consider this prior to using the agent with LLM token credits.**"
]
},
{
"cell_type": "markdown",
"id": "7d93e6bd-03d7-4d3c-b915-8b73164e2ad8",
"metadata": {},
"source": [
"## Example Use:\n",
"---\n",
"### Initializing the agent"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "648a2cb2-308e-4b2e-9b73-37109be4e258",
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import AgentType, initialize_agent\n",
"from langchain.agents.agent_toolkits.nasa.toolkit import NasaToolkit\n",
"from langchain.llms import OpenAI\n",
"from langchain.utilities.nasa import NasaAPIWrapper\n",
"\n",
"llm = OpenAI(temperature=0, openai_api_key=\"\")\n",
"nasa = NasaAPIWrapper()\n",
"toolkit = NasaToolkit.from_nasa_api_wrapper(nasa)\n",
"agent = initialize_agent(\n",
" toolkit.get_tools(), llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
")"
]
},
{
"cell_type": "markdown",
"id": "71f05fc9-d80d-4614-b9a3-e0a5e43cbbbb",
"metadata": {},
"source": [
"### Querying media assets"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b97409f3-dc87-425d-b555-406cf8466a28",
"metadata": {},
"outputs": [],
"source": [
"agent.run(\n",
" \"Can you find three pictures of the moon published between the years 2014 and 2020?\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "a86ce5ff-de45-4206-86ca-07ae03f36bdf",
"metadata": {},
"source": [
"### Querying details about media assets"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "80e86b49-749e-4026-b025-db32ed0bcc7e",
"metadata": {},
"outputs": [],
"source": [
"output = agent.run(\n",
" \"I've just queried an image of the moon with the NASA id NHQ_2019_0311_Go Forward to the Moon.\"\n",
" \" Where can I find the metadata manifest for this asset?\"\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -6,9 +6,15 @@
"source": [
"# Office365\n",
"\n",
">[Microsoft 365](https://www.office.com/) is a product family of productivity software, collaboration and cloud-based services owned by `Microsoft`.\n",
">\n",
">Note: `Office 365` was rebranded as `Microsoft 365`.\n",
"\n",
"This notebook walks through connecting LangChain to `Office365` email and calendar.\n",
"\n",
"To use this toolkit, you will need to set up your credentials explained in the [Microsoft Graph authentication and authorization overview](https://learn.microsoft.com/en-us/graph/auth/). Once you've received a CLIENT_ID and CLIENT_SECRET, you can input them as environmental variables below."
"To use this toolkit, you need to set up your credentials explained in the [Microsoft Graph authentication and authorization overview](https://learn.microsoft.com/en-us/graph/auth/). Once you've received a CLIENT_ID and CLIENT_SECRET, you can input them as environmental variables below.\n",
"\n",
"You can also use the [authentication instructions from here](https://o365.github.io/python-o365/latest/getting_started.html#oauth-setup-pre-requisite)."
]
},
{
@@ -17,8 +23,8 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade O365 > /dev/null\n",
"!pip install beautifulsoup4 > /dev/null # This is optional but is useful for parsing HTML messages"
"!pip install --upgrade O365\n",
"!pip install beautifulsoup4 # This is optional but is useful for parsing HTML messages"
]
},
{
@@ -27,7 +33,7 @@
"source": [
"## Assign Environmental Variables\n",
"\n",
"The toolkit will read the CLIENT_ID and CLIENT_SECRET environmental variables to authenticate the user so you need to set them here. You will also need to set your OPENAI_API_KEY to use the agent later."
"The toolkit will read the `CLIENT_ID` and `CLIENT_SECRET` environmental variables to authenticate the user so you need to set them here. You will also need to set your `OPENAI_API_KEY` to use the agent later."
]
},
{

View File

@@ -93,12 +93,9 @@
}
],
"source": [
"!wget https://raw.githubusercontent.com/openai/openai-openapi/master/openapi.yaml\n",
"!mv openapi.yaml openai_openapi.yaml\n",
"!wget https://www.klarna.com/us/shopping/public/openai/v0/api-docs\n",
"!mv api-docs klarna_openapi.yaml\n",
"!wget https://raw.githubusercontent.com/APIs-guru/openapi-directory/main/APIs/spotify.com/1.0.0/openapi.yaml\n",
"!mv openapi.yaml spotify_openapi.yaml"
"!wget https://raw.githubusercontent.com/openai/openai-openapi/master/openapi.yaml -O openai_openapi.yaml\n",
"!wget https://www.klarna.com/us/shopping/public/openai/v0/api-docs -O klarna_openapi.yaml\n",
"!wget https://raw.githubusercontent.com/APIs-guru/openapi-directory/main/APIs/spotify.com/1.0.0/openapi.yaml -O spotify_openapi.yaml"
]
},
{

View File

@@ -0,0 +1,147 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Slack\n",
"\n",
"This notebook walks through connecting LangChain to your `Slack` account.\n",
"\n",
"To use this toolkit, you will need to get a token explained in the [Slack API docs](https://api.slack.com/tutorials/tracks/getting-a-token). Once you've received a SLACK_USER_TOKEN, you can input it as an environmental variable below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install --upgrade slack_sdk > /dev/null\n",
"!pip install beautifulsoup4 > /dev/null # This is optional but is useful for parsing HTML messages"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Assign Environmental Variables\n",
"\n",
"The toolkit will read the SLACK_USER_TOKEN environmental variable to authenticate the user so you need to set them here. You will also need to set your OPENAI_API_KEY to use the agent later."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Set environmental variables here"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create the Toolkit and Get Tools\n",
"\n",
"To start, you need to create the toolkit, so you can access its tools later."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents.agent_toolkits import SlackToolkit\n",
"\n",
"toolkit = SlackToolkit()\n",
"tools = toolkit.get_tools()\n",
"tools"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Use within an Agent"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from langchain.agents import AgentType, initialize_agent\n",
"from langchain.llms import OpenAI"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"agent = initialize_agent(\n",
" tools=toolkit.get_tools(),\n",
" llm=llm,\n",
" verbose=False,\n",
" agent=AgentType.STRUCTURED_CHAT_ZERO_SHOT_REACT_DESCRIPTION,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"agent.run(\"Send a greeting to my coworkers in the #general channel.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"agent.run(\"How many channels are in the workspace? Please list out their names.\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"agent.run(\n",
" \"Tell me the number of messages sent in the #introductions channel from the past month.\"\n",
")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

View File

@@ -0,0 +1,105 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Steam Game Recommendation & Game Details Tool"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from langchain.agents import AgentType, initialize_agent\n",
"from langchain.agents.agent_toolkits.steam.toolkit import SteamToolkit\n",
"from langchain.llms import OpenAI\n",
"from langchain.utilities.steam import SteamWebAPIWrapper"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"STEAM_KEY\"] = \"xyz\"\n",
"os.environ[\"STEAM_ID\"] = \"123\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"abc\""
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"llm = OpenAI(temperature=0)\n",
"Steam = SteamWebAPIWrapper()\n",
"toolkit = SteamToolkit.from_steam_api_wrapper(Steam)\n",
"agent = initialize_agent(\n",
" toolkit.get_tools(), llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"\n",
"\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
"\u001b[32;1m\u001b[1;3m I need to find the game details\n",
"Action: Get Games Details\n",
"Action Input: Terraria\u001b[0m\n",
"Observation: \u001b[36;1m\u001b[1;3mThe id is: 105600\n",
"The link is: https://store.steampowered.com/app/105600/Terraria/?snr=1_7_15__13\n",
"The price is: $9.99\n",
"The summary of the game is: Dig, Fight, Explore, Build: The very world is at your fingertips as you fight for survival, fortune, and glory. Will you delve deep into cavernous expanses in search of treasure and raw materials with which to craft ever-evolving gear, machinery, and aesthetics? Perhaps you will choose instead to seek out ever-greater foes to test your mettle in combat? Maybe you will decide to construct your own city to house the host of mysterious allies you may encounter along your travels? In the World of Terraria, the choice is yours!Blending elements of classic action games with the freedom of sandbox-style creativity, Terraria is a unique gaming experience where both the journey and the destination are completely in the players control. The Terraria adventure is truly as unique as the players themselves! Are you up for the monumental task of exploring, creating, and defending a world of your own? Key features: Sandbox Play Randomly generated worlds Free Content Updates \n",
"The supported languages of the game are: English, French, Italian, German, Spanish - Spain, Polish, Portuguese - Brazil, Russian, Simplified Chinese\n",
"\u001b[0m\n",
"Thought:\u001b[32;1m\u001b[1;3m I now know the final answer\n",
"Final Answer: Terraria is a game with an id of 105600, a link of https://store.steampowered.com/app/105600/Terraria/?snr=1_7_15__13, a price of $9.99, a summary of \"Dig, Fight, Explore, Build: The very world is at your fingertips as you fight for survival, fortune, and glory. Will you delve deep into cavernous expanses in search of treasure and raw materials with which to craft ever-evolving gear, machinery, and aesthetics? Perhaps you will choose instead to seek out ever-greater foes to test your mettle in combat? Maybe you will decide to construct your own city to house the host of mysterious allies you may encounter along your travels? In the World of Terraria, the choice is yours!Blending elements of classic action games with the freedom of sandbox-style creativity, Terraria is a unique gaming experience where both the journey and the destination are completely in the players control. The Terraria adventure is truly as unique as the players themselves! Are you up for the monumental task of exploring, creating, and defending a\u001b[0m\n",
"\n",
"\u001b[1m> Finished chain.\u001b[0m\n",
"{'input': 'can you give the information about the game Terraria', 'output': 'Terraria is a game with an id of 105600, a link of https://store.steampowered.com/app/105600/Terraria/?snr=1_7_15__13, a price of $9.99, a summary of \"Dig, Fight, Explore, Build: The very world is at your fingertips as you fight for survival, fortune, and glory. Will you delve deep into cavernous expanses in search of treasure and raw materials with which to craft ever-evolving gear, machinery, and aesthetics? Perhaps you will choose instead to seek out ever-greater foes to test your mettle in combat? Maybe you will decide to construct your own city to house the host of mysterious allies you may encounter along your travels? In the World of Terraria, the choice is yours!Blending elements of classic action games with the freedom of sandbox-style creativity, Terraria is a unique gaming experience where both the journey and the destination are completely in the players control. The Terraria adventure is truly as unique as the players themselves! Are you up for the monumental task of exploring, creating, and defending a'}\n"
]
}
],
"source": [
"out = agent(\"can you give the information about the game Terraria\")\n",
"print(out)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}

View File

@@ -11,11 +11,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
">`Amazon AWS Lambda` is a serverless computing service provided by `Amazon Web Services` (`AWS`). It helps developers to build and run applications and services without provisioning or managing servers. This serverless architecture enables you to focus on writing and deploying code, while AWS automatically takes care of scaling, patching, and managing the infrastructure required to run your applications.\n",
">[`Amazon AWS Lambda`](https://aws.amazon.com/pm/lambda/) is a serverless computing service provided by `Amazon Web Services` (`AWS`). It helps developers to build and run applications and services without provisioning or managing servers. This serverless architecture enables you to focus on writing and deploying code, while AWS automatically takes care of scaling, patching, and managing the infrastructure required to run your applications.\n",
"\n",
"This notebook goes over how to use the `AWS Lambda` Tool.\n",
"\n",
"By including a `awslambda` in the list of tools provided to an Agent, you can grant your Agent the ability to invoke code running in your AWS Cloud for whatever purposes you need.\n",
"By including the `AWS Lambda` in the list of tools provided to an Agent, you can grant your Agent the ability to invoke code running in your AWS Cloud for whatever purposes you need.\n",
"\n",
"When an Agent uses the `AWS Lambda` tool, it will provide an argument of type string which will in turn be passed into the Lambda function via the event parameter.\n",
"\n",

View File

@@ -17,12 +17,12 @@
"metadata": {},
"outputs": [],
"source": [
"# !pip install duckduckgo-search"
"# !pip install -U duckduckgo-search"
]
},
{
"cell_type": "code",
"execution_count": 20,
"execution_count": 1,
"id": "ac4910f8",
"metadata": {},
"outputs": [],
@@ -32,7 +32,7 @@
},
{
"cell_type": "code",
"execution_count": 21,
"execution_count": 2,
"id": "84b8f773",
"metadata": {},
"outputs": [],
@@ -42,17 +42,17 @@
},
{
"cell_type": "code",
"execution_count": 22,
"execution_count": 3,
"id": "068991a6",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'August 4, 1961 (age 61) Honolulu Hawaii Title / Office: presidency of the United States of America (2009-2017), United States United States Senate (2005-2008), United States ... (Show more) Political Affiliation: Democratic Party Awards And Honors: Barack Hussein Obama II (/ b ə ˈ r ɑː k h uː ˈ s eɪ n oʊ ˈ b ɑː m ə / bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African-American president of the United States. Obama previously served as a U.S. senator representing Illinois ... Answer (1 of 12): I see others have answered President Obama\\'s name which is \"Barack Hussein Obama\". President Obama has received many comments about his name from the racists across US. It is worth noting that he never changed his name. Also, it is worth noting that a simple search would have re... What is Barack Obama\\'s full name? Updated: 11/11/2022 Wiki User ∙ 6y ago Study now See answer (1) Best Answer Copy His full, birth name is Barack Hussein Obama, II. He was named after his... Alex Oliveira July 24, 2023 4:57pm Updated 0 seconds of 43 secondsVolume 0% 00:00 00:43 The man who drowned while paddleboarding on a pond outside the Obamas\\' Martha\\'s Vineyard estate has been...'"
"\"Life After the Presidency How Tall is Obama? Books and Grammy Hobbies Movies About Obama Quotes 1961-present Who Is Barack Obama? Barack Obama was the 44 th president of the United States... facts you never knew about Barack Obama is that his immediate family spread out across three continents. Barack, who led America from 2009 to 2017, comes from a large family of seven living half-siblings. His father, Barack Obama Sr., met his mother, Ann Dunham, in 1960 and married her a year after. With a tear running from his eye, President Barack Obama recalls the 20 first-graders killed in 2012 at Sandy Hook Elementary School, while speaking in the East Room of the White House in ... Former first Lady Rosalynn Carter was laid to rest at her family's home in Plains, Ga. on Nov. 29 following three days of memorials across her home state. She passed away on Nov. 19, aged 96 ... Here are 28 of President Obama's biggest accomplishments as President of the United States. 1 - Rescued the country from the Great Recession, cutting the unemployment rate from 10% to 4.7% over ...\""
]
},
"execution_count": 22,
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
@@ -71,7 +71,7 @@
},
{
"cell_type": "code",
"execution_count": 23,
"execution_count": 4,
"id": "95635444",
"metadata": {},
"outputs": [],
@@ -81,7 +81,7 @@
},
{
"cell_type": "code",
"execution_count": 24,
"execution_count": 5,
"id": "0133d103",
"metadata": {},
"outputs": [],
@@ -91,17 +91,17 @@
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 7,
"id": "439efc06",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"[snippet: Barack Hussein Obama II (/ b ə ˈ r ɑː k h uː ˈ s eɪ n oʊ ˈ b ɑː m ə / bə-RAHK hoo-SAYN oh-BAH-mə; born August 4, 1961) is an American politician who served as the 44th president of the United States from 2009 to 2017. A member of the Democratic Party, he was the first African-American president of the United States. Obama previously served as a U.S. senator representing Illinois ..., title: Barack Obama - Wikipedia, link: https://en.wikipedia.org/wiki/Barack_Obama], [snippet: Barack Obama, in full Barack Hussein Obama II, (born August 4, 1961, Honolulu, Hawaii, U.S.), 44th president of the United States (2009-17) and the first African American to hold the office. Before winning the presidency, Obama represented Illinois in the U.S. Senate (2005-08). He was the third African American to be elected to that body ..., title: Barack Obama | Biography, Parents, Education, Presidency, Books ..., link: https://www.britannica.com/biography/Barack-Obama], [snippet: Barack Obama 's tenure as the 44th president of the United States began with his first inauguration on January 20, 2009, and ended on January 20, 2017. A Democrat from Illinois, Obama took office following a decisive victory over Republican nominee John McCain in the 2008 presidential election. Four years later, in the 2012 presidential ..., title: Presidency of Barack Obama - Wikipedia, link: https://en.wikipedia.org/wiki/Presidency_of_Barack_Obama], [snippet: First published on Mon 24 Jul 2023 20.03 EDT. Barack Obama's personal chef died while paddleboarding near the ex-president's home on Martha's Vineyard over the weekend, Massachusetts state ..., title: Obama's personal chef dies while paddleboarding off Martha's Vineyard ..., link: https://www.theguardian.com/us-news/2023/jul/24/tafari-campbell-barack-obama-chef-drowns-marthas-vineyard]\""
"'[snippet: 1:12. Former President Barack Obama, in a CNN interview that aired Thursday night, said he does not believe President Joe Biden will face a serious primary challenge during his 2024 reelection ..., title: Five takeaways from Barack Obama\\'s CNN interview on Biden ... - Yahoo, link: https://www.usatoday.com/story/news/politics/2023/06/23/five-takeaways-from-barack-obama-cnn-interview/70349112007/], [snippet: Democratic institutions in the United States and around the world have grown \"creaky,\" former President Barack Obama warned in an exclusive CNN interview Thursday, and it remains incumbent on ..., title: Obama warns democratic institutions are \\'creaky\\' but Trump ... - CNN, link: https://www.cnn.com/2023/06/22/politics/barack-obama-interview-cnntv/index.html], [snippet: Barack Obama was the 44 th president of the United States and the first Black commander-in-chief. He served two terms, from 2009 until 2017. The son of parents from Kenya and Kansas, Obama was ..., title: Barack Obama: Biography, 44th U.S. President, Politician, link: https://www.biography.com/political-figures/barack-obama], [snippet: Aug. 2, 2023, 5:00 PM PDT. By Mike Memoli and Kristen Welker. WASHINGTON — During a trip to the White House in June, former President Barack Obama made it clear to his former running mate that ..., title: Obama privately told Biden he would do whatever it takes to help in 2024, link: https://www.nbcnews.com/politics/white-house/obama-privately-told-biden-whatever-takes-help-2024-rcna97865], [snippet: Natalie Bookey-Baker, a vice president at the Obama Foundation who worked for then-first lady Michelle Obama in the White House, said about 2,500 alumni are expected. They are veterans of Obama ..., title: Barack Obama team reunion this week in Chicago; 2,500 alumni expected ..., link: https://chicago.suntimes.com/politics/2023/10/29/23937504/barack-obama-michelle-obama-david-axelrod-pod-save-america-jon-batiste-jen-psaki-reunion-obamaworld]'"
]
},
"execution_count": 25,
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
@@ -120,7 +120,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 8,
"id": "21afe28d",
"metadata": {},
"outputs": [],
@@ -130,17 +130,17 @@
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 9,
"id": "2a4beeb9",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"\"[date: 2023-07-26T12:01:22, title: 'My heart is broken': Former Obama White House chef mourned following apparent drowning death in Edgartown, snippet: Tafari Campbell of Dumfries, Va., had been paddle boarding in Edgartown Great Pond when he appeared to briefly struggle, submerged, and did not return to the surface, authorities have said. Crews ultimately found the 45-year-old's body Monday morning., source: The Boston Globe on MSN.com, link: https://www.msn.com/en-us/news/us/my-heart-is-broken-former-obama-white-house-chef-mourned-following-apparent-drowning-death-in-edgartown/ar-AA1elNB8], [date: 2023-07-25T18:44:00, title: Obama's chef drowns paddleboarding near former president's Edgartown vacation home, snippet: Campbell was visiting Martha's Vineyard, where the Obamas own a vacation home. He was not wearing a lifejacket when he fell off his paddleboard., source: YAHOO!News, link: https://news.yahoo.com/obama-chef-drowns-paddleboarding-near-184437491.html], [date: 2023-07-26T00:30:00, title: Obama's personal chef dies while paddleboarding off Martha's Vineyard, snippet: Tafari Campbell, who worked at the White House during Obama's presidency, was visiting the island while the family was away, source: The Guardian, link: https://www.theguardian.com/us-news/2023/jul/24/tafari-campbell-barack-obama-chef-drowns-marthas-vineyard], [date: 2023-07-24T21:54:00, title: Obama's chef ID'd as paddleboarder who drowned near former president's Martha's Vineyard estate, snippet: Former President Barack Obama's personal chef, Tafari Campbell, has been identified as the paddle boarder who drowned near the Obamas' Martha's Vineyard estate., source: Fox News, link: https://www.foxnews.com/politics/obamas-chef-idd-paddleboarder-who-drowned-near-former-presidents-marthas-vineyard-estate]\""
"'[snippet: 1:12. Former President Barack Obama, in a CNN interview that aired Thursday night, said he does not believe President Joe Biden will face a serious primary challenge during his 2024 reelection ..., title: Five takeaways from Barack Obama\\'s CNN interview on Biden ... - Yahoo, link: https://www.usatoday.com/story/news/politics/2023/06/23/five-takeaways-from-barack-obama-cnn-interview/70349112007/], [snippet: Democratic institutions in the United States and around the world have grown \"creaky,\" former President Barack Obama warned in an exclusive CNN interview Thursday, and it remains incumbent on ..., title: Obama warns democratic institutions are \\'creaky\\' but Trump ... - CNN, link: https://www.cnn.com/2023/06/22/politics/barack-obama-interview-cnntv/index.html], [snippet: Barack Obama was the 44 th president of the United States and the first Black commander-in-chief. He served two terms, from 2009 until 2017. The son of parents from Kenya and Kansas, Obama was ..., title: Barack Obama: Biography, 44th U.S. President, Politician, link: https://www.biography.com/political-figures/barack-obama], [snippet: Natalie Bookey-Baker, a vice president at the Obama Foundation who worked for then-first lady Michelle Obama in the White House, said about 2,500 alumni are expected. They are veterans of Obama ..., title: Barack Obama team reunion this week in Chicago; 2,500 alumni expected ..., link: https://chicago.suntimes.com/politics/2023/10/29/23937504/barack-obama-michelle-obama-david-axelrod-pod-save-america-jon-batiste-jen-psaki-reunion-obamaworld], [snippet: Aug. 2, 2023, 5:00 PM PDT. By Mike Memoli and Kristen Welker. WASHINGTON — During a trip to the White House in June, former President Barack Obama made it clear to his former running mate that ..., title: Obama privately told Biden he would do whatever it takes to help in 2024, link: https://www.nbcnews.com/politics/white-house/obama-privately-told-biden-whatever-takes-help-2024-rcna97865]'"
]
},
"execution_count": 27,
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
@@ -159,7 +159,7 @@
},
{
"cell_type": "code",
"execution_count": 28,
"execution_count": 10,
"id": "c7ab3b55",
"metadata": {},
"outputs": [],
@@ -171,27 +171,27 @@
},
{
"cell_type": "code",
"execution_count": 29,
"execution_count": 11,
"id": "adce16e1",
"metadata": {},
"outputs": [],
"source": [
"search = DuckDuckGoSearchResults(api_wrapper=wrapper, backend=\"news\")"
"search = DuckDuckGoSearchResults(api_wrapper=wrapper, source=\"news\")"
]
},
{
"cell_type": "code",
"execution_count": 30,
"execution_count": 12,
"id": "b7e77c54",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'[date: 2023-07-25T12:15:00, title: Barack + Michelle Obama: Sie trauern um Angestellten, snippet: Barack und Michelle Obama trauern um ihren ehemaligen Küchenchef Tafari Campbell. Der Familienvater verunglückte am vergangenen Sonntag und wurde in einem Teich geborgen., source: Gala, link: https://www.gala.de/stars/news/barack---michelle-obama--sie-trauern-um-angestellten-23871228.html], [date: 2023-07-25T10:30:00, title: Barack Obama: Sein Koch (†45) ist tot - diese Details sind bekannt, snippet: Tafari Campbell war früher im Weißen Haus eingestellt, arbeitete anschließend weiter für Ex-Präsident Barack Obama. Nun ist er gestorben. Diese Details sind bekannt., source: T-Online, link: https://www.t-online.de/unterhaltung/stars/id_100213226/barack-obama-sein-koch-45-ist-tot-diese-details-sind-bekannt.html], [date: 2023-07-25T05:33:23, title: Barack Obama: Sein Privatkoch ist bei einem tragischen Unfall gestorben, snippet: Barack Obama (61) und Michelle Obama (59) sind in tiefer Trauer. Ihr Privatkoch Tafari Campbell ist am Montag (24. Juli) ums Leben gekommen, er wurde nur 45 Jahre alt. Laut US-Polizei starb er bei ein, source: BUNTE.de, link: https://www.msn.com/de-de/unterhaltung/other/barack-obama-sein-privatkoch-ist-bei-einem-tragischen-unfall-gestorben/ar-AA1ejrAd], [date: 2023-07-25T02:25:00, title: Barack Obama: Privatkoch tot in See gefunden, snippet: Tafari Campbell kochte für Barack Obama im Weißen Haus - und auch privat nach dessen Abschied aus dem Präsidentenamt. Nun machte die Polizei in einem Gewässer eine traurige Entdeckung., source: SPIEGEL, link: https://www.spiegel.de/panorama/justiz/barack-obama-leibkoch-tot-in-see-gefunden-a-3cdf6377-bee0-43f1-a200-a285742f9ffc]'"
"'[snippet: When Obama left office in January 2017, a CNN poll showed him with a 60% approval rating, landing him near the top of the list of presidential approval ratings upon leaving office., title: Opinion: The real reason Trump is attacking Obamacare | CNN, link: https://www.cnn.com/2023/12/04/opinions/trump-obamacare-obama-repeal-health-care-obeidallah/index.html], [snippet: Buchempfehlung von Barack Obama. Der gut zweistündige Netflix-Film basiert auf dem gleichnamigen Roman \"Leave the World Behind\" des hochgelobten US-Autors Rumaan Alam. 2020 landete er damit unter den Finalisten des \"National Book Awards\". In Deutschland ist das Buch, das auch Barack Obama auf seiner einflussreichen Lese-Empfehlungsliste hatte ..., title: Neu bei Netflix \"Leave The World Behind\": Kritik zum ... - Prisma, link: https://www.prisma.de/news/filme/Neu-bei-Netflix-Leave-The-World-Behind-Kritik-zum-ungewoehnlichen-Endzeit-Film-mit-Julia-Roberts,46563944]'"
]
},
"execution_count": 30,
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
@@ -199,6 +199,14 @@
"source": [
"search.run(\"Obama\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b133e3c1",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -217,7 +225,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.10.1"
},
"vscode": {
"interpreter": {

View File

@@ -6,18 +6,17 @@
"collapsed": false
},
"source": [
"# Azure Cognitive Search\n",
"# Azure AI Search\n",
"\n",
"[Azure Cognitive Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search) (formerly known as `Azure Search`) is a cloud search service that gives developers infrastructure, APIs, and tools for building a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications.\n",
"\n",
"Vector search is currently in public preview. It's available through the Azure portal, preview REST API and beta client libraries. [More info](https://learn.microsoft.com/en-us/azure/search/vector-search-overview) Beta client libraries are subject to potential breaking changes, please be sure to use the SDK package version identified below. azure-search-documents==11.4.0b8"
"[Azure AI Search](https://learn.microsoft.com/azure/search/search-what-is-azure-search) (formerly known as `Azure Search` and `Azure Cognitive Search`) is a cloud search service that gives developers infrastructure, APIs, and tools for building a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Install Azure Cognitive Search SDK"
"# Install Azure AI Search SDK"
]
},
{
@@ -26,7 +25,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install azure-search-documents==11.4.0b8\n",
"!pip install azure-search-documents\n",
"!pip install azure-identity"
]
},

View File

@@ -22,7 +22,7 @@
"metadata": {},
"outputs": [],
"source": [
"#!pip install psycopg2"
"!pip install hologres-vector"
]
},
{

View File

@@ -63,7 +63,7 @@
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 1,
"id": "aac9563e",
"metadata": {},
"outputs": [],
@@ -321,12 +321,30 @@
"id": "5f590d35",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"## Using AOSS (Amazon OpenSearch Service Serverless)"
"## Using AOSS (Amazon OpenSearch Service Serverless)\n",
"\n",
"It is an example of the `AOSS` with `faiss` engine and `efficient_filter`.\n",
"\n",
"\n",
"We need to install several `python` packages."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "279bfc5c-b7f4-4553-ad15-2df7baebec47",
"metadata": {},
"outputs": [],
"source": [
"#!pip install requests requests-aws4auth"
]
},
{
@@ -335,13 +353,16 @@
"id": "de397be7",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%%\n"
}
},
"outputs": [],
"source": [
"# This is just an example to show how to use AOSS with faiss engine and efficient_filter, you need to set proper values.\n",
"from requests_aws4auth import AWS4Auth\n",
"\n",
"service = \"aoss\" # must set the service as 'aoss'\n",
"region = \"us-east-2\"\n",
@@ -374,7 +395,10 @@
"cell_type": "markdown",
"id": "0aa012c8",
"metadata": {
"collapsed": false
"collapsed": false,
"jupyter": {
"outputs_hidden": false
}
},
"source": [
"## Using AOS (Amazon OpenSearch Service)"
@@ -386,6 +410,9 @@
"id": "2c47e408",
"metadata": {
"collapsed": false,
"jupyter": {
"outputs_hidden": false
},
"pycharm": {
"name": "#%%\n"
}
@@ -436,9 +463,9 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.6"
"version": "3.10.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
}

View File

@@ -9,7 +9,7 @@
"\n",
"LangChain provides async support for Agents by leveraging the [asyncio](https://docs.python.org/3/library/asyncio.html) library.\n",
"\n",
"Async methods are currently supported for the following `Tool`s: [`GoogleSerperAPIWrapper`](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/utilities/google_serper.py), [`SerpAPIWrapper`](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/utilities/serpapi.py), [`LLMMathChain`](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chains/llm_math/base.py) and [`Qdrant`](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/qdrant.py). Async support for other agent tools are on the roadmap.\n",
"Async methods are currently supported for the following `Tool`s: [`SearchApiAPIWrapper`](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/utilities/searchapi.py), [`GoogleSerperAPIWrapper`](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/utilities/google_serper.py), [`SerpAPIWrapper`](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/utilities/serpapi.py), [`LLMMathChain`](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chains/llm_math/base.py) and [`Qdrant`](https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/vectorstores/qdrant.py). Async support for other agent tools are on the roadmap.\n",
"\n",
"For `Tool`s that have a `coroutine` implemented (the four mentioned above), the `AgentExecutor` will `await` them directly. Otherwise, the `AgentExecutor` will call the `Tool`'s `func` via `asyncio.get_event_loop().run_in_executor` to avoid blocking the main runloop.\n",
"\n",

View File

@@ -9,7 +9,7 @@
"\n",
"The map reduce documents chain first applies an LLM chain to each document individually (the Map step), treating the chain output as a new document. It then passes all the new documents to a separate combine documents chain to get a single output (the Reduce step). It can optionally first compress, or collapse, the mapped documents to make sure that they fit in the combine documents chain (which will often pass them to an LLM). This compression step is performed recursively if necessary.\n",
"\n",
"![map_reduce_diagram](/img/map_reduce.jpg)"
"![map_reduce_diagram](../../../../static/img/map_reduce.jpg)"
]
},
{

View File

@@ -9,7 +9,7 @@
"\n",
"The map re-rank documents chain runs an initial prompt on each document, that not only tries to complete a task but also gives a score for how certain it is in its answer. The highest scoring response is returned.\n",
"\n",
"![map_rerank_diagram](/img/map_rerank.jpg)"
"![map_rerank_diagram](../../../../static/img/map_rerank.jpg)"
]
},
{

View File

@@ -24,7 +24,7 @@
"The obvious tradeoff is that this chain will make far more LLM calls than, for example, the Stuff documents chain.\n",
"There are also certain tasks which are difficult to accomplish iteratively. For example, the Refine chain can perform poorly when documents frequently cross-reference one another or when a task requires detailed information from many documents.\n",
"\n",
"![refine_diagram](/img/refine.jpg)\n"
"![refine_diagram](../../../../static/img/refine.jpg)\n"
]
},
{

View File

@@ -20,7 +20,7 @@
"\n",
"This chain is well-suited for applications where documents are small and only a few are passed in for most calls.\n",
"\n",
"![stuff_diagram](/img/stuff.jpg)"
"![stuff_diagram](../../../../static/img/stuff.jpg)"
]
},
{

View File

@@ -88,7 +88,7 @@
"# The retriever (empty to start)\n",
"retriever = MultiVectorRetriever(\n",
" vectorstore=vectorstore,\n",
" docstore=store,\n",
" base_store=store,\n",
" id_key=id_key,\n",
")\n",
"import uuid\n",
@@ -143,7 +143,7 @@
{
"data": {
"text/plain": [
"Document(page_content='Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court.', metadata={'doc_id': '455205f7-bb7d-4c36-b442-d1d6f9f701ed', 'source': '../../state_of_the_union.txt'})"
"Document(page_content='Tonight, Id like to honor someone who has dedicated his life to serve this country: Justice Stephen Breyer—an Army veteran, Constitutional scholar, and retiring Justice of the United States Supreme Court. Justice Breyer, thank you for your service. \\n\\nOne of the most serious constitutional responsibilities a President has is nominating someone to serve on the United States Supreme Court.', metadata={'doc_id': '59899493-92a0-41cb-b6ba-a854730ad74a', 'source': '../../state_of_the_union.txt'})"
]
},
"execution_count": 8,
@@ -183,12 +183,12 @@
"id": "cdef8339-f9fa-4b3b-955f-ad9dbdf2734f",
"metadata": {},
"source": [
"The default search type the retriever performs on the vector database is a similarity search. LangChain Vector Stores also support searching via [Max Marginal Relevance](https://api.python.langchain.com/en/latest/schema/langchain.schema.vectorstore.VectorStore.html#langchain.schema.vectorstore.VectorStore.max_marginal_relevance_search) so if you want this instead you can just set the `search_type` property as follows:"
"The default search type the retriever performs on the vector database is a similarity search. LangChain Vector Stores also support searching via [Max Marginal Relevance](https://api.python.langchain.com/en/latest/vectorstores/langchain_core.vectorstores.VectorStore.html#langchain_core.vectorstores.VectorStore.max_marginal_relevance_search) so if you want this instead you can just set the `search_type` property as follows:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 10,
"id": "36739460-a737-4a8e-b70f-50bf8c8eaae7",
"metadata": {},
"outputs": [
@@ -198,7 +198,7 @@
"9875"
]
},
"execution_count": 15,
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
@@ -223,7 +223,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 11,
"id": "1433dff4",
"metadata": {},
"outputs": [],
@@ -238,7 +238,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 12,
"id": "35b30390",
"metadata": {},
"outputs": [],
@@ -253,7 +253,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 13,
"id": "41a2a738",
"metadata": {},
"outputs": [],
@@ -263,7 +263,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 14,
"id": "7ac5e4b1",
"metadata": {},
"outputs": [],
@@ -276,7 +276,7 @@
"# The retriever (empty to start)\n",
"retriever = MultiVectorRetriever(\n",
" vectorstore=vectorstore,\n",
" docstore=store,\n",
" base_store=store,\n",
" id_key=id_key,\n",
")\n",
"doc_ids = [str(uuid.uuid4()) for _ in docs]"
@@ -338,7 +338,7 @@
{
"data": {
"text/plain": [
"Document(page_content=\"The document is a transcript of a speech given by the President of the United States. The President discusses several important issues and initiatives, including the nomination of a Supreme Court Justice, border security and immigration reform, protecting women's rights, advancing LGBTQ+ equality, bipartisan legislation, addressing the opioid epidemic and mental health, supporting veterans, investigating the health effects of burn pits on military personnel, ending cancer, and the strength and resilience of the American people.\", metadata={'doc_id': '79fa2e9f-28d9-4372-8af3-2caf4f1de312'})"
"Document(page_content=\"The document is a speech given by the President of the United States. The President discusses various important issues and goals for the country, including nominating a Supreme Court Justice, securing the border and fixing the immigration system, protecting women's rights, supporting veterans, addressing the opioid epidemic, improving mental health care, and ending cancer. The President emphasizes the unity and strength of the American people and expresses optimism for the future of the nation.\", metadata={'doc_id': '8fdf4009-628c-400d-949c-1d3f4daf1e66'})"
]
},
"execution_count": 19,
@@ -393,7 +393,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 22,
"id": "5219b085",
"metadata": {},
"outputs": [],
@@ -418,7 +418,7 @@
},
{
"cell_type": "code",
"execution_count": 32,
"execution_count": 23,
"id": "523deb92",
"metadata": {},
"outputs": [],
@@ -429,7 +429,7 @@
" {\"doc\": lambda x: x.page_content}\n",
" # Only asking for 3 hypothetical questions, but this could be adjusted\n",
" | ChatPromptTemplate.from_template(\n",
" \"Generate a list of 3 hypothetical questions that the below document could be used to answer:\\n\\n{doc}\"\n",
" \"Generate a list of exactly 3 hypothetical questions that the below document could be used to answer:\\n\\n{doc}\"\n",
" )\n",
" | ChatOpenAI(max_retries=0, model=\"gpt-4\").bind(\n",
" functions=functions, function_call={\"name\": \"hypothetical_questions\"}\n",
@@ -440,19 +440,19 @@
},
{
"cell_type": "code",
"execution_count": 33,
"execution_count": 24,
"id": "11d30554",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[\"What was the author's initial impression of philosophy as a field of study, and how did it change when they got to college?\",\n",
" 'Why did the author decide to switch their focus to Artificial Intelligence (AI)?',\n",
" \"What led to the author's disillusionment with the field of AI as it was practiced at the time?\"]"
"[\"What were the author's initial areas of interest before college?\",\n",
" \"What was the author's experience with programming in his early years?\",\n",
" 'Why did the author switch his focus from AI to Lisp?']"
]
},
"execution_count": 33,
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
@@ -463,7 +463,7 @@
},
{
"cell_type": "code",
"execution_count": 34,
"execution_count": 25,
"id": "3eb2e48c",
"metadata": {},
"outputs": [],
@@ -473,7 +473,7 @@
},
{
"cell_type": "code",
"execution_count": 67,
"execution_count": 26,
"id": "b2cd6e75",
"metadata": {},
"outputs": [],
@@ -488,7 +488,7 @@
"# The retriever (empty to start)\n",
"retriever = MultiVectorRetriever(\n",
" vectorstore=vectorstore,\n",
" docstore=store,\n",
" base_store=store,\n",
" id_key=id_key,\n",
")\n",
"doc_ids = [str(uuid.uuid4()) for _ in docs]"
@@ -496,7 +496,7 @@
},
{
"cell_type": "code",
"execution_count": 68,
"execution_count": 27,
"id": "18831b3b",
"metadata": {},
"outputs": [],
@@ -510,7 +510,7 @@
},
{
"cell_type": "code",
"execution_count": 69,
"execution_count": 28,
"id": "224b24c5",
"metadata": {},
"outputs": [],
@@ -521,7 +521,7 @@
},
{
"cell_type": "code",
"execution_count": 70,
"execution_count": 29,
"id": "7b442b90",
"metadata": {},
"outputs": [],
@@ -531,20 +531,20 @@
},
{
"cell_type": "code",
"execution_count": 71,
"execution_count": 30,
"id": "089b5ad0",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[Document(page_content=\"What is the President's stance on immigration reform?\", metadata={'doc_id': '505d73e3-8350-46ec-a58e-3af032f04ab3'}),\n",
" Document(page_content=\"What is the President's stance on immigration reform?\", metadata={'doc_id': '1c9618f0-7660-4b4f-a37c-509cbbbf6dba'}),\n",
" Document(page_content=\"What is the President's stance on immigration reform?\", metadata={'doc_id': '82c08209-b904-46a8-9532-edd2380950b7'}),\n",
" Document(page_content='What measures is the President proposing to protect the rights of LGBTQ+ Americans?', metadata={'doc_id': '82c08209-b904-46a8-9532-edd2380950b7'})]"
"[Document(page_content='What made Robert Morris advise the author to leave Y Combinator?', metadata={'doc_id': '740e484e-d67c-45f7-989d-9928aaf51c28'}),\n",
" Document(page_content=\"How did the author's mother's illness affect his decision to leave Y Combinator?\", metadata={'doc_id': '740e484e-d67c-45f7-989d-9928aaf51c28'}),\n",
" Document(page_content='What led the author to start publishing essays online?', metadata={'doc_id': '675ccee3-ce0b-4d5d-892c-b8942370babd'}),\n",
" Document(page_content='What measures are being taken to secure the border and fix the immigration system?', metadata={'doc_id': '2d51f010-969e-48a9-9e82-6b12bc7ab3d4'})]"
]
},
"execution_count": 71,
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
@@ -555,7 +555,7 @@
},
{
"cell_type": "code",
"execution_count": 72,
"execution_count": 31,
"id": "7594b24e",
"metadata": {},
"outputs": [],
@@ -565,17 +565,17 @@
},
{
"cell_type": "code",
"execution_count": 73,
"execution_count": 32,
"id": "4c120c65",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"9194"
"9844"
]
},
"execution_count": 73,
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
@@ -583,14 +583,6 @@
"source": [
"len(retrieved_docs[0].page_content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "616cfeeb",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
@@ -609,7 +601,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.16"
"version": "3.10.5"
}
},
"nbformat": 4,

View File

@@ -190,9 +190,9 @@
"id": "a237bbb8-e448-4238-8420-004e046ef84e",
"metadata": {},
"source": [
"We will use the [ChatPromptTemplate](https://api.python.langchain.com/en/latest/prompts/langchain.prompts.chat.ChatPromptTemplate.html) class to set up the chat prompt.\n",
"We will use the [ChatPromptTemplate](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html?highlight=chatprompttemplate) class to set up the chat prompt.\n",
"\n",
"The [from_messages](https://api.python.langchain.com/en/latest/prompts/langchain.prompts.chat.ChatPromptTemplate.html#langchain.prompts.chat.ChatPromptTemplate.from_messages) method creates a `ChatPromptTemplate` from a list of messages (e.g., `SystemMessage`, `HumanMessage`, `AIMessage`, `ChatMessage`, etc.) or message templates, such as the [MessagesPlaceholder](https://api.python.langchain.com/en/latest/prompts/langchain.prompts.chat.MessagesPlaceholder.html#langchain.prompts.chat.MessagesPlaceholder) below.\n",
"The [from_messages](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.ChatPromptTemplate.html#langchain_core.prompts.chat.ChatPromptTemplate.from_messages) method creates a `ChatPromptTemplate` from a list of messages (e.g., `SystemMessage`, `HumanMessage`, `AIMessage`, `ChatMessage`, etc.) or message templates, such as the [MessagesPlaceholder](https://api.python.langchain.com/en/latest/prompts/langchain_core.prompts.chat.MessagesPlaceholder.html#langchain_core.prompts.chat.MessagesPlaceholder) below.\n",
"\n",
"The configuration below makes it so the memory will be injected to the middle of the chat prompt, in the `chat_history` key, and the user's inputs will be added in a human/user message to the end of the chat prompt."
]

Some files were not shown because too many files have changed in this diff Show More