Corpus Linguistics: Types of Corpora (types-of-corpora.ppt)

Uploaded by: 胜**** | Document ID: 743934 | Upload date: 2024-02-29 | Format: PPT | Pages: 9 | Size: 191.50 KB
Types of corpora

General vs. specialized corpora
Written vs. spoken corpora
Synchronic vs. diachronic corpora
Monolingual vs. multilingual corpora
Comparable vs. parallel corpora
Native vs. learner corpora
Sample vs. monitor corpora
Raw vs. annotated corpora

General vs. specialized corpora

General corpora (通用语料库), or reference corpora (参考语料库): cover a wide range of text categories or registers and represent language for general purposes. They are usually very large, running to millions of words. E.g. British National Corpus (BNC), Bank of English (BOE).

Specialized corpora (专用语料库): texts from a particular variety of a language, e.g. from a particular dialect or from a particular subject area.

Written vs. spoken corpora

Written corpora (笔语语料库): contain only written materials. (More common.)

Spoken corpora (口语语料库): contain transcribed texts of spoken language. (Less common.)

Synchronic vs. diachronic corpora

Synchronic corpora (共时语料库): materials from a specific period of time.

Diachronic corpora (历时语料库): materials collected over a longer period of time.

Monolingual vs. multilingual corpora

Monolingual corpora (单语语料库): texts in one language.

Multilingual corpora (多语语料库): texts in several different languages.

Comparable vs. parallel corpora

Comparable corpora (可比语料库): texts from two or more languages that are similar in genre, topic, register, etc., without containing the same content.

Parallel corpora (平行语料库), also called translation corpora (翻译语料库): a corpus of original texts in one language and their translations into another language (or several other languages). They are used to explore "how the same content is expressed in two languages".

Native vs. learner corpora

Native speaker corpora (本族语语料库): texts from native speakers.

Learner corpora (学习者语料库): texts from language learners.

Sample vs. monitor corpora

Sample corpora (样本语料库): as opposed to a monitor corpus, a sample corpus is of finite size and consists of text segments selected to provide a static picture of the language.

Monitor corpora (监控语料库): track language change. A monitor corpus is regularly updated and open-ended.

Raw vs. annotated corpora

Raw corpora (生语料库): plain text in its raw state, without annotations.

Annotated corpora (标注语料库): some external information is added to the corpus, e.g. information identifying the origin and nature of the text; tagging to show the word class of each word; parsing to show the sentence structure and the function of the different elements in a sentence.

One specific example is "gives", a third person singular present tense verb. In an annotated corpus, the form gives may appear as gives_VVZ, where VVZ marks a third person singular present tense (Z) form of a lexical verb (VV). Such annotation makes it quicker and easier to retrieve and analyze information about the language contained in the corpus.
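
To illustrate why tags such as gives_VVZ speed up retrieval, here is a minimal sketch that reads word_TAG tokens and looks words up by tag. The function names parse_tagged and index_by_tag, the sample sentence, and every tag other than VVZ are assumptions made for illustration only; they do not come from the slides, and real corpora may store annotation in other formats.

```python
from collections import defaultdict


def parse_tagged(text, sep="_"):
    """Split whitespace-separated 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # rpartition splits on the LAST separator, so a word that itself
        # contains the separator keeps everything before the final tag.
        word, _, tag = token.rpartition(sep)
        if not word:
            # No separator found: treat the token as an untagged word.
            word, tag = tag, None
        pairs.append((word, tag))
    return pairs


def index_by_tag(pairs):
    """Group words by tag so all forms with a given tag can be fetched at once."""
    index = defaultdict(list)
    for word, tag in pairs:
        index[tag].append(word)
    return index


# Hypothetical annotated sentence; only the VVZ tag is taken from the slides.
sample = "She_PNP gives_VVZ talks_NN2 and_CJC he_PNP talks_VVZ"
pairs = parse_tagged(sample)
index = index_by_tag(pairs)

# All third person singular present tense lexical verbs in the sample.
print(index["VVZ"])
```

Note that "talks" occurs twice in the sample with different tags. A search over raw text could not tell the noun from the verb, whereas the tag lookup returns only the verb use.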