Commit Graph

1478 Commits

Author SHA1 Message Date
Zander Chase
e5f184c7ba Catch all exceptions in autogpt (#3413)
Ought to be more autonomous
2023-04-28 10:11:03 -07:00
Zander Chase
6d07bafda5 Move Generative Agent definition to Experimental (#3245)
Extending @BeautyyuYanli 's #3220 to move from the notebook

---------

Co-authored-by: BeautyyuYanli <beautyyuyanli@gmail.com>
2023-04-28 10:11:03 -07:00
Zander Chase
eb47767e9e Add Sentence Transformers Embeddings (#3409)
Add embeddings based on the sentence transformers library.
Add a notebook and integration tests.

Co-authored-by: khimaros <me@khimaros.com>
2023-04-28 10:11:03 -07:00
Zander Chase
9f40c09c86 Update marathon notebook (#3408)
Fixes #3404
2023-04-28 10:11:03 -07:00
Luke Harris
b7dad1b6bf Several confluence loader improvements (#3300)
This PR addresses several improvements:

- Previously it was not possible to load spaces of more than 100 pages.
The `limit` was being used both as an overall page limit *and* as a per
request pagination limit. This, in combination with the fact that
atlassian seem to use a server-side hard limit of 100 when page content
is expanded, meant it wasn't possible to download >100 pages. Now
`limit` is used *only* as a per-request pagination limit and `max_pages`
is introduced as the way to limit the total number of pages returned by
the paginator.
- Document metadata now includes `source` (the source url), making it
compatible with `RetrievalQAWithSourcesChain`.
- It is now possible to include inline and footer comments.
- It is now possible to pass `verify_ssl=False` and other parameters to
the confluence object for use cases that require it.
2023-04-28 10:11:03 -07:00
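A rough illustration of the `limit` / `max_pages` split described in #3300. This is not the loader's actual code: `paginate`, `fetch_page`, and `fake_fetch` are hypothetical names, and the fake server simply mimics the 100-result cap mentioned in the message.

```python
from typing import Callable, Dict, List, Optional


def paginate(
    fetch_page: Callable[[int, int], List[Dict]],
    limit: int = 50,
    max_pages: Optional[int] = 1000,
) -> List[Dict]:
    """Collect results from a paginated endpoint.

    `limit` is only the per-request batch size; `max_pages` caps the total.
    """
    results: List[Dict] = []
    start = 0
    while max_pages is None or len(results) < max_pages:
        batch = fetch_page(start, limit)
        if not batch:
            break  # the server has nothing more to return
        results.extend(batch)
        start += len(batch)
    return results if max_pages is None else results[:max_pages]


# Hypothetical stand-in for a space with 250 pages and a server-side
# cap of 100 results per request.
def fake_fetch(start: int, limit: int) -> List[Dict]:
    all_pages = [{"id": i} for i in range(250)]
    return all_pages[start : start + min(limit, 100)]


pages = paginate(fake_fetch, limit=100, max_pages=1000)
```

Because the loop advances by the number of results actually returned, a server that silently returns fewer than `limit` items per request does not cut the download short.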
zz
e808444b79 Add support for wikipedia's lang parameter (#3383)
Allow changing the language of the Wikipedia API being requested.

Co-authored-by: zhuohui <zhuohui@datastory.com.cn>
2023-04-28 10:11:03 -07:00
Johann-Peter Hartmann
f400386865 Improve youtube loader (#3395)
Small improvements for the YouTube loader: 
a) use the YouTube API permission scope instead of Google Drive 
b) bugfix: allow transcript loading for single videos 
c) an additional parameter "continue_on_failure" for cases when videos
in a playlist do not have transcription enabled.
d) support automated translation for all languages, if available.

---------

Co-authored-by: Johann-Peter Hartmann <johann-peter.hartmann@mayflower.de>
2023-04-28 10:11:03 -07:00
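The `continue_on_failure` behaviour from item (c) of #3395 could look roughly like the sketch below. Only the parameter name comes from the commit message; `load_transcripts`, `fetch_transcript`, and `fake_fetch` are hypothetical and stand in for the real loader and the transcript API.

```python
from typing import Callable, Iterable, List


def load_transcripts(
    video_ids: Iterable[str],
    fetch_transcript: Callable[[str], str],
    continue_on_failure: bool = False,
) -> List[str]:
    """Fetch a transcript per video, optionally skipping videos that fail."""
    transcripts = []
    for video_id in video_ids:
        try:
            transcripts.append(fetch_transcript(video_id))
        except Exception as exc:
            if not continue_on_failure:
                raise  # default: one bad video aborts the whole playlist
            print(f"Skipping {video_id}: {exc}")
    return transcripts


# Hypothetical fetcher: one video in the playlist has transcripts disabled.
def fake_fetch(video_id: str) -> str:
    if video_id == "no-captions":
        raise RuntimeError("transcripts are disabled")
    return f"transcript of {video_id}"


docs = load_transcripts(["a", "no-captions", "b"], fake_fetch, continue_on_failure=True)
```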
Harrison Chase
f3ab7c2a9f Harrison/hf document loader (#3394)
Co-authored-by: Azam Iftikhar <azamiftikhar1000@gmail.com>
2023-04-28 10:11:03 -07:00
Hadi Curtay
4b071a69d1 Updated incorrect link to Weaviate notebook (#3362)
The detailed walkthrough of the Weaviate wrapper was pointing to the
getting-started notebook. Fixed it to point to the Weaviate notebook in
the examples folder.
2023-04-28 10:11:03 -07:00
Ismail Pelaseyed
0a7ca1014f Add example on deploying LangChain to Cloud Run (#3366)
## Summary

Adds a link to a minimal example of running LangChain on Google Cloud
Run.
2023-04-28 10:11:03 -07:00
Ivan Zatevakhin
326c2c2474 llamacpp wrong default value passed for f16_kv (#3320)
Fixes default f16_kv value in llamacpp; corrects incorrect parameter
passed.

See:
ba3959eafd/llama_cpp/llama.py (L33)

Fixes #3241
Fixes #3301
2023-04-28 10:11:03 -07:00
Harrison Chase
8feb416664 bump version to 147 (#3353) 2023-04-28 10:11:03 -07:00
Harrison Chase
4003a79b35 Harrison/myscale (#3352)
Co-authored-by: Fangrui Liu <fangruil@moqi.ai>
Co-authored-by: 刘 方瑞 <fangrui.liu@outlook.com>
Co-authored-by: Fangrui.Liu <fangrui.liu@ubc.ca>
2023-04-28 10:11:03 -07:00
Harrison Chase
994027771e Harrison/error hf (#3348)
Co-authored-by: Rui Melo <44201826+rufimelo99@users.noreply.github.com>
2023-04-28 10:11:03 -07:00
Honkware
fcf610bf31 Add ChatGPT Data Loader (#3336)
This pull request adds a ChatGPT document loader to the document loaders
module in `langchain/document_loaders/chatgpt.py`. Additionally, it
includes an example Jupyter notebook in
`docs/modules/indexes/document_loaders/examples/chatgpt_loader.ipynb`
which uses fake sample data based on the original structure of the
`conversations.json` file.

The following files were added/modified:
- `langchain/document_loaders/__init__.py`
- `langchain/document_loaders/chatgpt.py`
- `docs/modules/indexes/document_loaders/examples/chatgpt_loader.ipynb`
- `docs/modules/indexes/document_loaders/examples/example_data/fake_conversations.json`

This pull request was made in response to the recent release of ChatGPT
data exports by email:
https://help.openai.com/en/articles/7260999-how-do-i-export-my-chatgpt-history
2023-04-28 10:11:03 -07:00
Zander Chase
a4a4c59e97 Fix Sagemaker Batch Endpoints (#3249)
Add different typing for @evandiewald 's helpful PR

---------

Co-authored-by: Evan Diewald <evandiewald@gmail.com>
2023-04-28 10:11:03 -07:00
Johann-Peter Hartmann
3d762f2b8b Support recursive sitemaps in SitemapLoader (#3146)
A (very) simple addition to support multiple sitemap urls.

---------

Co-authored-by: Johann-Peter Hartmann <johann-peter.hartmann@mayflower.de>
2023-04-28 10:11:03 -07:00
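One way to picture recursive sitemap handling as in #3146: a sitemap index lists further sitemaps, which are followed until plain page URLs are reached. The sketch below uses only the standard library and assumes the standard sitemaps.org namespace; `collect_urls`, `fetch`, and the `FAKE_SITE` documents are hypothetical and not taken from `SitemapLoader`.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def collect_urls(sitemap_url: str, fetch: Callable[[str], str]) -> List[str]:
    """Return page URLs, following nested <sitemap> entries recursively."""
    root = ET.fromstring(fetch(sitemap_url))
    # Page entries listed directly in this document.
    urls = [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]
    # Entries that point at further sitemaps.
    for loc in root.findall("sm:sitemap/sm:loc", NS):
        urls.extend(collect_urls(loc.text.strip(), fetch))
    return urls


XMLNS = 'xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"'
# Hypothetical site: an index pointing at two child sitemaps.
FAKE_SITE = {
    "https://example.com/sitemap.xml": (
        f"<sitemapindex {XMLNS}>"
        "<sitemap><loc>https://example.com/a.xml</loc></sitemap>"
        "<sitemap><loc>https://example.com/b.xml</loc></sitemap>"
        "</sitemapindex>"
    ),
    "https://example.com/a.xml": (
        f"<urlset {XMLNS}><url><loc>https://example.com/1</loc></url></urlset>"
    ),
    "https://example.com/b.xml": (
        f"<urlset {XMLNS}><url><loc>https://example.com/2</loc></url>"
        "<url><loc>https://example.com/3</loc></url></urlset>"
    ),
}

page_urls = collect_urls("https://example.com/sitemap.xml", FAKE_SITE.__getitem__)
```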
Filip Haltmayer
69d60041af Refactor Milvus/Zilliz (#3047)
Refactoring milvus/zilliz to clean up and have a more consistent
experience.

Signed-off-by: Filip Haltmayer <filip.haltmayer@zilliz.com>
2023-04-28 10:11:03 -07:00
Harrison Chase
38c3478be4 Harrison/voice assistant (#3347)
Co-authored-by: Jaden <jaden.lorenc@gmail.com>
2023-04-28 10:11:03 -07:00
Richy Wang
c0fd62cd6f Add a full PostgresSQL syntax database 'AnalyticDB' as vector store. (#3135)
Hi there!
I'm excited to open this PR to add support for using 'AnalyticDB', a
database fully compatible with Postgres syntax, as a vector store.
AnalyticDB has been shown to work with AutoGPT,
ChatGPT-Retrieve-Plugin, and LLama-Index, so I think it would be a good
fit here as well.
AnalyticDB is a distributed, cloud-native vector database from Alibaba
Cloud. It works well when the data grows to a large scale. The PR includes:

- [x]  A new memory: AnalyticDBVector
- [x]  A suite of integration tests verifies the AnalyticDB integration

I have read your [contributing
guidelines](72b7d76d79/.github/CONTRIBUTING.md).
And I have passed the tests below
- [x]  make format
- [x]  make lint
- [x]  make coverage
- [x]  make test
2023-04-28 10:11:03 -07:00
Harrison Chase
471f961c8e Harrison/power bi (#3205)
Co-authored-by: Eduard van Valkenburg <eavanvalkenburg@users.noreply.github.com>
2023-04-28 10:11:03 -07:00
Daniel Chalef
a0f76dd40b args_schema type hint on subclassing (#3323)
per https://github.com/hwchase17/langchain/issues/3297

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-28 10:11:03 -07:00
Zander Chase
3492a93dc4 Fix linting on master (#3327) 2023-04-28 10:11:03 -07:00
Varun Srinivas
bc76e0613c Change in method name for creating an issue on JIRA (#3307)
The awesome JIRA tool created by @zywilliamli calls the `create_issue()`
method to create issues, however, the actual method is `issue_create()`.

Details in the Documentation here:
https://atlassian-python-api.readthedocs.io/jira.html#manage-issues
2023-04-28 10:11:03 -07:00
Davis Chase
692baa797d Update docs api references (#3315) 2023-04-28 10:11:03 -07:00
Paul Garner
15535b913d Add PythonLoader which auto-detects encoding of Python files (#3311)
This PR contributes a `PythonLoader`, which inherits from
`TextLoader` but detects and sets the encoding automatically.
2023-04-28 10:11:03 -07:00
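Encoding detection for Python source, as described in #3311, can be done with the standard library. The sketch below uses `tokenize.detect_encoding`, which honours a BOM or a PEP 263 coding cookie and otherwise falls back to UTF-8. Whether `PythonLoader` uses exactly this call is an assumption, and `read_python_source` is a hypothetical helper name.

```python
import tokenize
from pathlib import Path


def read_python_source(path: str) -> str:
    """Read a .py file using the encoding declared in the file itself."""
    with open(path, "rb") as f:
        # Inspects at most the first two lines for a BOM or coding cookie.
        encoding, _ = tokenize.detect_encoding(f.readline)
    return Path(path).read_text(encoding=encoding)
```

For a file that starts with `# -*- coding: latin-1 -*-`, the helper decodes the remaining bytes as Latin-1 instead of failing on invalid UTF-8.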
Daniel Chalef
6d55489419 Fix example match_documents fn table name, grammar (#3294)
ref
https://github.com/hwchase17/langchain/pull/3100#issuecomment-1517086472

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-28 10:11:03 -07:00
Davis Chase
be958d98ad Cleanup integration test dir (#3308) 2023-04-28 10:11:03 -07:00
leo-gan
0f7d997bb0 added links to the important YouTube videos (#3244)
Added links to the important YouTube videos
2023-04-28 10:11:03 -07:00
Sertaç Özercan
87c046858b fix: handle youtube TranscriptsDisabled (#3276)
handles error when youtube video has transcripts disabled

```
youtube_transcript_api._errors.TranscriptsDisabled: 
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=<URL> This is most likely caused by:

Subtitles are disabled for this video

If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!
```

Signed-off-by: Sertac Ozercan <sozercan@gmail.com>
2023-04-28 10:11:03 -07:00
Alexandre Pesant
eecd5795f4 Do not print openai settings (#3280)
There's no reason to print these settings like that, it just pollutes
the logs :)
2023-04-28 10:11:03 -07:00
Zander Chase
181840dcb4 Handle null action in AutoGPT Agent (#3274)
Handle the case where the command is `null`
2023-04-28 10:11:03 -07:00
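A defensive parser for the `null` command case in #3274 might look like the sketch below. The response shape (`{"command": {"name": ..., "args": ...}}`), the `"ERROR"` sentinel, and the function name `parse_action` are assumptions made for illustration and are not taken from the agent's code.

```python
import json
from typing import Any, Dict, Tuple


def parse_action(response_text: str) -> Tuple[str, Dict[str, Any]]:
    """Return (action name, arguments), tolerating a missing or null command."""
    try:
        parsed = json.loads(response_text)
    except json.JSONDecodeError:
        return "ERROR", {"error": "response was not valid JSON"}
    command = parsed.get("command") if isinstance(parsed, dict) else None
    if not isinstance(command, dict):
        # Covers both `"command": null` and a missing "command" key.
        return "ERROR", {"error": "no command given"}
    name = command.get("name") or "ERROR"
    return name, command.get("args") or {}
```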
Harrison Chase
d2b2772272 bump version 146 (#3272) 2023-04-28 10:11:03 -07:00
Harrison Chase
353e96cb66 gradio tools (#3255) 2023-04-28 10:11:03 -07:00
Naveen Tatikonda
a8b1bb6c4c OpenSearch: Add Support for Lucene Filter (#3201)
### Description
Add Support for Lucene Filter. When you specify a Lucene filter for a
k-NN search, the Lucene algorithm decides whether to perform an exact
k-NN search with pre-filtering or an approximate search with modified
post-filtering. This filter is supported only for approximate search
with the indexes that are created using `lucene` engine.

OpenSearch Documentation -
https://opensearch.org/docs/latest/search-plugins/knn/filter-search-knn/#lucene-k-nn-filter-implementation

Signed-off-by: Naveen Tatikonda <navtat@amazon.com>
2023-04-28 10:11:02 -07:00
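For orientation on #3201, an OpenSearch k-NN request with a Lucene filter places the filter inside the `knn` clause itself, next to `vector` and `k`. The helper below only builds that request body as a plain dictionary. The function name, the field name `vector_field`, and the sample filter are placeholders, and how the LangChain wrapper exposes this option is not shown here.

```python
from typing import Any, Dict, List


def lucene_filter_query(
    query_vector: List[float],
    lucene_filter: Dict[str, Any],
    k: int = 4,
    vector_field: str = "vector_field",
) -> Dict[str, Any]:
    """Build an approximate k-NN query body with a Lucene filter."""
    return {
        "size": k,
        "query": {
            "knn": {
                vector_field: {
                    "vector": query_vector,
                    "k": k,
                    # The filter lives inside the knn clause, so the engine
                    # can choose between exact and approximate search.
                    "filter": lucene_filter,
                }
            }
        },
    }


body = lucene_filter_query(
    [0.1, 0.2, 0.3],
    {"bool": {"must": [{"term": {"metadata.source": "docs"}}]}},
    k=2,
)
```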
Davis Chase
c08f644d6d Hf emb device (#3266)
Make it possible to control the HuggingFaceEmbeddings and HuggingFaceInstructEmbeddings client model kwargs. Additionally, the cache folder was added for HuggingFaceInstructEmbedding as the client inherits from SentenceTransformer (client of HuggingFaceEmbeddings).

This is especially useful for controlling the client device, since sentence_transformers defaults to the GPU when one is available.

---------

Co-authored-by: Yoann Poupart <66315201+Xmaster6y@users.noreply.github.com>
2023-04-28 10:11:02 -07:00
Zach Jones
e12dc1321a Fix type annotation for QueryCheckerTool.llm (#3237)
Currently `langchain.tools.sql_database.tool.QueryCheckerTool` has a
field `llm` with type `BaseLLM`. This breaks initialization for some
LLMs. For example, trying to use it with GPT4:

```python
from langchain.sql_database import SQLDatabase
from langchain.chat_models import ChatOpenAI
from langchain.tools.sql_database.tool import QueryCheckerTool


db = SQLDatabase.from_uri("some_db_uri")
llm = ChatOpenAI(model_name="gpt-4")
tool = QueryCheckerTool(db=db, llm=llm)

# pydantic.error_wrappers.ValidationError: 1 validation error for QueryCheckerTool
# llm
#   Can't instantiate abstract class BaseLLM with abstract methods _agenerate, _generate, _llm_type (type=type_error)
```

Seems like much of the rest of the codebase has switched from `BaseLLM`
to `BaseLanguageModel`. This PR makes the change for QueryCheckerTool as
well

Co-authored-by: Zachary Jones <zjones@zetaglobal.com>
2023-04-28 10:11:02 -07:00
Davis Chase
c31cdc6d8d Contextual compression retriever (#2915)
Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
2023-04-28 10:11:02 -07:00
Matt Robinson
7f4eb81be7 feat: add loader for rich text files (#3227)
### Summary

Adds a loader for rich text files. Requires `unstructured>=0.5.12`.

### Testing

The following test uses the example RTF file from the [`unstructured`
repo](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs).

```python
from langchain.document_loaders import UnstructuredRTFLoader

loader = UnstructuredRTFLoader("fake-doc.rtf", mode="elements")
docs = loader.load()
docs[0].page_content
```
2023-04-28 10:11:02 -07:00
Harrison Chase
ba167800dd add to docs 2023-04-28 10:11:02 -07:00
Albert Castellana
96f33ed3ae Ecosystem/Yeager.ai (#3239)
Added yeagerai.md to ecosystem
2023-04-28 10:11:02 -07:00
Boris Feld
0bfa2c9216 Fixing issue link for Comet callback (#3212)
Sorry I fixed that link once but there was still a typo inside, this
time it should be good.
2023-04-28 10:11:02 -07:00
Daniel Chalef
1a6e8865bf fix error msg ref to beautifulsoup4 (#3242)
Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
2023-04-28 10:11:02 -07:00
Tom Dyson
fd948bef64 Add DuckDB prompt (#3233)
Adds a prompt template for the DuckDB SQL dialect.
2023-04-28 10:11:02 -07:00
Zander Chase
bd9d9412b7 Patch Chat History Formatting (#3236)
While we work on solidifying the memory interfaces, handle common chat
history formats.

This may break linting on anyone who has been passing in
`get_chat_history` .

Somewhat handles #3077

Alternative to #3078 that updates the typing
2023-04-28 10:11:02 -07:00
Harrison Chase
2dbb5261b5 wikibase agent 2023-04-20 15:37:56 -07:00
Harrison Chase
8f22949dc4 update notebook title 2023-04-20 11:53:23 -07:00
leo-gan
130e4b9fcb fixed a link to the youtube page (#3232)
A link to the `YouTube` page was missing on the `index` page.
2023-04-20 10:47:16 -07:00
Peter Stolz
d54b977d4e Fix docstring of RetrievalQA (#3231)
The structure changed, and RetrievalQA now expects a BaseRetriever, not
a VectorStore.
2023-04-20 10:46:51 -07:00
Harrison Chase
b7dea80cba bump version to 145 (#3229) v0.0.145 2023-04-20 08:30:38 -07:00