mirror of
https://github.com/hwchase17/langchain.git
synced 2025-12-07 05:24:25 +00:00
add CoNLL-U document loader (#1297)
I've added a simple [CoNLL-U](https://universaldependencies.org/format.html) document loader. CoNLL-U is a common format for NLP tasks and is used, for example, in the Universal Dependencies treebank corpora. The loader reads a single file in standard CoNLL-U format and returns a document.
This commit is contained in:
@@ -0,0 +1,8 @@
|
||||
# sent_id = 1
|
||||
# text = They buy and sell books.
|
||||
1 They they PRON PRP Case=Nom|Number=Plur 2 nsubj 2:nsubj|4:nsubj _
|
||||
2 buy buy VERB VBP Number=Plur|Person=3|Tense=Pres 0 root 0:root _
|
||||
3 and and CONJ CC _ 4 cc 4:cc _
|
||||
4 sell sell VERB VBP Number=Plur|Person=3|Tense=Pres 2 conj 0:root|2:conj _
|
||||
5 books book NOUN NNS Number=Plur 2 obj 2:obj|4:obj SpaceAfter=No
|
||||
6 . . PUNCT . _ 2 punct 2:punct _
|
||||
Reference in New Issue
Block a user