community: Spider integration (#20937)

Added the [Spider.cloud](https://spider.cloud) document loader.
[Spider](https://github.com/spider-rs/spider) is the
[fastest](https://github.com/spider-rs/spider/blob/main/benches/BENCHMARKS.md)
and cheapest crawler that returns LLM-ready data.

```
- **Description:** Adds Spider data loader
- **Dependencies:** spider-client
- **Twitter handle:** @WilliamEspegren 
```

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: = <=>
Co-authored-by: Chester Curme <chester.curme@gmail.com>
This commit is contained in:
WilliamEspegren
2024-04-27 23:45:03 +02:00
committed by GitHub
parent 6342217b93
commit 804390ba4b
6 changed files with 223 additions and 1 deletions

View File

@@ -143,6 +143,7 @@ EXPECTED_ALL = [
"SitemapLoader",
"SlackDirectoryLoader",
"SnowflakeLoader",
"SpiderLoader",
"SpreedlyLoader",
"StripeLoader",
"SurrealDBLoader",