[Chinese] Youtube Clip Chinese Subtitle Corpus (Youtube節目華語字幕語料庫)
youtube_corpus
Counts |
Tokens | 3673337 |
Words | 3574577 |
Sentences | 623778 |
General info |
Corpus description |
Document |
Language | Chinese |
Encoding | UTF-8 |
Compiled | 03/17/2022 12:46:33 |
Tagset |
Description |
Lexicon sizes |
word | 74361 |
tag | 60 |
Tags legend |
adjective | A|VH |
adverb | D.* |
conjunction | C.* |
determiner | Ne.* |
noun | Na|Nb|Nc|Ncd|Nd|Nf|Nh|Nv |
preposition | P |
pronoun | Nhaa|Nhab|Nhb|Nhc |
verb | V.* |
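The legend entries read as regular expressions over part-of-speech tags. The following is a minimal sketch of how a tag could be checked against them, assuming each pattern must match the whole tag; the page does not say how the corpus tool applies the patterns. `TAGS_LEGEND` and `categories_for` are hypothetical names introduced for illustration, and sample tags such as `Dfa` are assumed rather than taken from this page.

```python
import re

# Patterns copied from the "Tags legend" rows above.
# Assumption: a pattern must match the entire tag (full match).
TAGS_LEGEND = {
    "adjective": r"A|VH",
    "adverb": r"D.*",
    "conjunction": r"C.*",
    "determiner": r"Ne.*",
    "noun": r"Na|Nb|Nc|Ncd|Nd|Nf|Nh|Nv",
    "preposition": r"P",
    "pronoun": r"Nhaa|Nhab|Nhb|Nhc",
    "verb": r"V.*",
}


def categories_for(tag):
    """Return every legend category whose pattern fully matches the tag."""
    return [
        name
        for name, pattern in TAGS_LEGEND.items()
        if re.fullmatch(pattern, tag)
    ]


if __name__ == "__main__":
    for sample in ("Na", "Nhaa", "VH", "Dfa", "P"):
        print(sample, categories_for(sample))
```

Under the full-match assumption, `VH` falls into two categories, because it is listed explicitly under adjective and also matches the verb pattern `V.*`. This is why the sketch returns a list rather than a single label.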
Lempos suffixes |
adjective | -j |
adverb | -a |
conjunction | -c |
noun | -n |
preposition | -p |
pronoun | -d |
verb | -v |
Structures and attributes
Subcorpora statistics
Subcorpus | Tokens | Words | % |
a | 15289 | ~ 14877 | 0.416215555502 |
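The % column appears to be the subcorpus token count as a percentage of the full corpus token count. The sketch below checks that reading against the figures on this page; the interpretation is an assumption, and `subcorpus_share` is a hypothetical helper name.

```python
# Figures taken from the Counts and Subcorpora statistics rows on this page.
CORPUS_TOKENS = 3673337
SUBCORPUS_TOKENS = 15289


def subcorpus_share(sub_tokens, corpus_tokens):
    """Subcorpus size as a percentage of the whole corpus, by token count."""
    return 100.0 * sub_tokens / corpus_tokens


if __name__ == "__main__":
    print(subcorpus_share(SUBCORPUS_TOKENS, CORPUS_TOKENS))
```

If the reading is right, the printed value should agree with the listed 0.416215555502 to within rounding, which means subcorpus `a` holds well under half a percent of the corpus.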