This action may take several minutes for large corpora, please wait.
Word list options
Corpus:
elllo
[Chinese] Youtube Clip Chinese Subtitle Corpus
[Chinese] Academia Sinica Balanced Corpus 4.0
[Chinese] Civil Law Corpus
[Chinese] Shooter Chinese Subtitles Corpus
[Chinese] Chinese Learner Spoken Corpus
[Chinese] Chinese Learner Written Corpus
[Chinese] Chinese Web Corpus of COVID 19
[Chinese] Web Technology Chinese Corpus
[Chinese] Web Travel Chinese Corpus
[Chinese] Web Business Chinese Corpus
[Chinese] Taiwan Web Large-Scale Corpus (TWWAC)
[Chinese] Academic Chinese Corpus
academicspokendata
elementandjuniortextmerged
prea
tai_zhong_ri_han_guo_zhong_ying_wen_ke_ben
a1_level
collocationof5domains
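AL_Master_Thesis_494 (Taiwan English-teaching master's thesis corpus)
BAWEacademicwritttenUK (British academic written corpus)
BNCTXT (British National Corpus)
ETAallproceeding
JCEEonly
SallyCSNNScomputertheses (English master's theses from Taiwanese computer science departments)
TaiwanThesis_Edited
Taiwanlearner (English data from Taiwanese learners)
academic_presentations (English academic presentations)
all_the_efcamdata (EF-Cambridge learner data)
allacademiclecture2015 (English classroom lecture data)
baseacademicspokenUK (British academic spoken corpus)
basketball_english (English basketball web corpus)
bnc_spoken_data (British National Corpus, spoken)
business_english (business English corpus)
chrisbusinessNNS (English corpus of Taiwanese business degree theses)
computerscience-006 (English computer science papers corpus)
computersciencesabstracts (English computer science paper abstracts corpus)
education_opentextbooks (English open textbooks corpus)
efcamdatchineseandtaiwanese (EF-Cambridge corpus of learners whose native language is Chinese)
engineering_corpus (English engineering papers corpus)
japanlearnersEFL (English corpus of Japanese EFL students)
kubuntuchatcorpus (kubuntu chat corpus)
movietvscripts (English film and TV corpus)
nativeappliedlxjournals (English applied linguistics papers corpus)
naturalsciencesabstracts (English natural sciences abstracts corpus)
nnsthesis
nursing (English nursing data)
opensourcesubtitles
physicscorpus-007 (English physics papers corpus)
simplewikidata-008 (Simple English Wikipedia corpus)
social_2020
socialsciences-009 (English social sciences papers corpus)
Susanne
tedcorpus (TED English talks corpus)
biolifesciences2020
engineeringandcs2020
medicalsciences2020
physicalsciences2020
socialsciences2020
the social sciences 10000 articles for the book project (social sciences journal articles)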
Subcorpus:
create new
Search attribute:
word
tag
lemma
word (lowercase)
lemma (lowercase)
archive_file.filename
use n-grams. Value of n:
from: 2, 3, 4, 5, 6
to: 2, 3, 4, 5, 6
hide/nest sub-n-grams
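As a rough illustration of what the "from" and "to" values of n control, the sketch below counts every n-gram whose length falls in that range. The function name `ngram_counts` and the whitespace tokenization are assumptions made for the example, not part of Sketch Engine, and the "hide/nest sub-n-grams" option is not modeled here.

```python
from collections import Counter


def ngram_counts(tokens, n_from=2, n_to=3):
    """Count all n-grams with n_from <= n <= n_to in a list of tokens.

    Hypothetical helper for illustration only; the real word list tool
    works on the indexed corpus attribute, not on a Python list.
    """
    counts = Counter()
    for n in range(n_from, n_to + 1):
        # Slide a window of width n over the token sequence.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


if __name__ == "__main__":
    sample = "a b a b c".split()
    for gram, freq in ngram_counts(sample, 2, 3).most_common():
        print(" ".join(gram), freq)
```

Widening the range (for example from 2 to 6) multiplies the number of distinct items, which is one plausible reason the form warns that large corpora can take several minutes.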
Filter options:
Filter word list by:
Regular expression:
Minimum frequency:
Maximum frequency:
(0 = no maximum frequency)
Whitelist:
Blacklist:
format
Word list whitelists and blacklists must be plain-text (.txt) files encoded in UTF-8, with one item per line. The items must correspond to the selected attribute: for example, if 'lemma' is selected from the attribute menu, the list should be a list of lemmas. File input uses exact matching, not regular-expression matching.
Include non-words
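The filter options above can be pictured as successive tests on a frequency list. The following is a minimal sketch under stated assumptions: the names `load_item_list` and `filter_word_list` are hypothetical, the regular expression is assumed to match the whole item, and the frequency bounds are assumed to be inclusive. It does follow the documented rules that a maximum frequency of 0 means no maximum and that whitelist and blacklist files are UTF-8 text with one item per line, compared by exact matching.

```python
import re
from pathlib import Path


def load_item_list(path):
    """Read a whitelist or blacklist: UTF-8 plain text, one item per line."""
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def filter_word_list(freqs, pattern=None, min_freq=0, max_freq=0,
                     whitelist=None, blacklist=None):
    """Filter a {item: frequency} mapping (hypothetical helper).

    max_freq=0 means "no maximum frequency". Whole-item regex matching
    is an assumption of this sketch.
    """
    regex = re.compile(pattern) if pattern else None
    kept = {}
    for item, freq in freqs.items():
        if regex and not regex.fullmatch(item):
            continue
        if freq < min_freq:
            continue
        if max_freq and freq > max_freq:
            continue
        # Lists are compared by exact string equality, not as patterns.
        if whitelist is not None and item not in whitelist:
            continue
        if blacklist is not None and item in blacklist:
            continue
        kept[item] = freq
    return kept


if __name__ == "__main__":
    freqs = {"run": 10, "running": 3, "walk": 7, "the": 100}
    print(filter_word_list(freqs, pattern="r.*", min_freq=2))
```

Because list matching is exact, an entry such as `run.*` in a whitelist file would only ever match the literal string `run.*`; patterns belong in the "Regular expression" field instead.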
Output options:
Frequency figures:
Hit counts
Document counts
ARF
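The three frequency figures differ in what they count. Hit counts are raw occurrences and document counts are the number of documents containing the item. ARF is commonly expanded as Average Reduced Frequency, a count discounted for items that occur in bursts. The sketch below follows the commonly published definition of ARF; whether this installation computes it in exactly this way is an assumption, and the function name is hypothetical.

```python
def average_reduced_frequency(positions, corpus_size):
    """Average Reduced Frequency from token positions (illustrative sketch).

    positions: 0-based token offsets of each occurrence of the item.
    corpus_size: total number of tokens in the corpus.
    """
    positions = sorted(positions)
    f = len(positions)
    if f == 0:
        return 0.0
    v = corpus_size / f  # average gap if occurrences were evenly spread
    # The first distance wraps around from the last occurrence to the first.
    distances = [positions[0] + corpus_size - positions[-1]]
    distances += [b - a for a, b in zip(positions, positions[1:])]
    # Each gap contributes at most v, so tight clusters add very little.
    return sum(min(d, v) for d in distances) / v


if __name__ == "__main__":
    print(average_reduced_frequency([0, 25, 50, 75], 100))  # evenly spread
    print(average_reduced_frequency([0, 1, 2, 3], 100))     # one tight cluster
```

Under this definition, an item spread evenly through the corpus has an ARF equal to its hit count, while an item confined to a single passage has an ARF close to 1.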
Output type:
Simple
Keywords
Reference (sub)corpus
(whole corpus)
Prefer:
rare words
common words
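The Keywords output compares the selected corpus against the reference (sub)corpus, and the "Prefer" setting shifts the ranking toward rare or common words. One widely described way to obtain this behavior is a ratio of normalized frequencies with a smoothing constant added to both sides. The sketch below assumes that scheme; the page itself does not state the formula, and the function names and the `smoothing` parameter are hypothetical.

```python
def per_million(count, corpus_size):
    """Normalize a raw count to occurrences per million tokens."""
    return count * 1_000_000 / corpus_size


def keyness(focus_count, focus_size, ref_count, ref_size, smoothing=1.0):
    """Smoothed frequency ratio (assumed scoring scheme, not confirmed).

    A small smoothing value favors rare words; a large one favors
    common words.
    """
    focus_fpm = per_million(focus_count, focus_size)
    ref_fpm = per_million(ref_count, ref_size)
    return (focus_fpm + smoothing) / (ref_fpm + smoothing)


if __name__ == "__main__":
    size = 1_000_000
    for smoothing in (1, 1000):
        rare = keyness(10, size, 0, size, smoothing)
        common = keyness(2000, size, 1000, size, smoothing)
        print(smoothing, round(rare, 3), round(common, 3))
```

With a small constant, the rare word that is absent from the reference corpus ranks first; with a large constant, the common word that is twice as frequent in the focus corpus overtakes it.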
Change output attribute(s)
---
word
tag
lemma
word (lowercase)
lemma (lowercase)
You can select one or more output attributes. Please note that this option can be time-consuming.