
Corpus Linguistics: Types of Corpora (types-of-corpora.ppt)

Uploaded by: 胜**** | Document ID: 743934 | Upload date: 2024-02-29 | Format: PPT | Pages: 9 | Size: 191.50 KB

Types of corpora
- General vs. specialized corpora
- Written vs. spoken corpora
- Synchronic vs. diachronic corpora
- Monolingual vs. multilingual corpora
- Comparable vs. parallel corpora
- Native vs. learner corpora
- Sample vs. monitor corpora
- Raw vs. annotated corpora

General vs. specialized corpora
- General corpora (通用语料库), or reference corpora (参考语料库): cover a wide range of text categories or registers and represent language for general purposes. They are usually very large (millions of words). E.g. the British National Corpus (BNC) and the Bank of English (BoE).
- Specialized corpora (专用语料库): texts from a particular variety of a language, e.g. from a particular dialect or a particular subject area.

Written vs. spoken corpora
- Written corpora (笔语语料库): contain only written materials. (More common.)
- Spoken corpora (口语语料库): contain transcribed texts of spoken language. (Less common.)

Synchronic vs. diachronic corpora
- Synchronic corpora (共时语料库): materials from a specific period of time.
- Diachronic corpora (历时语料库): materials from a longer period of time.

Monolingual vs. multilingual corpora
- Monolingual corpora (单语语料库): texts in one language.
- Multilingual corpora (多语语料库): texts in several different languages.

Comparable vs. parallel corpora
- Comparable corpora (可比语料库): texts from two or more languages that are similar in genre, topic, register, etc., but do not contain the same content.
- Parallel corpora (平行语料库), or translation corpora (翻译语料库): original texts in one language together with their translations into another language (or several other languages). They are used to explore how the same content is expressed in two languages.

Native vs. learner corpora
- Native speaker corpora (本族语语料库): texts from native speakers.
- Learner corpora (学习者语料库): texts from language learners.

Sample vs. monitor corpora
- Sample corpora (样本语料库): as opposed to a monitor corpus, a sample corpus is of finite size and consists of text segments selected to provide a static picture of the language.
- Monitor corpora (监控语料库): monitor language change. A monitor corpus is regularly updated and open-ended.

Raw vs. annotated corpora
- Raw corpora (生语料库): plain text in its raw state, without annotations.
- Annotated corpora (标注语料库): some external information is added to the corpus, e.g.
  - information identifying the origin and nature of the text;
  - tagging to show the word class of each word;
  - parsing to show the sentence structure and the function of the different elements in a sentence.

One specific example: "gives" is a third person singular present tense verb. In an annotated corpus, the form gives may appear as gives_VVZ, where VVZ marks it as a third person singular present tense (Z) form of a lexical verb (VV). Such annotation makes it quicker and easier to retrieve and analyze information about the language contained in the corpus.
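To illustrate why a tag such as gives_VVZ speeds up retrieval, here is a minimal Python sketch. It assumes the corpus is stored as whitespace-separated word_TAG tokens. The sample sentence, the tags other than VVZ, and the function names parse_tagged and words_with_tag are illustrative assumptions, not part of the slides or of any particular corpus tool.

```python
from typing import List, Tuple


def parse_tagged(text: str, sep: str = "_") -> List[Tuple[str, str]]:
    """Split whitespace-separated 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in text.split():
        # rpartition splits on the LAST separator, so a word that itself
        # contains '_' still keeps its final segment as the tag.
        word, found, tag = token.rpartition(sep)
        if not found:
            # No separator: treat the token as an untagged (raw) word.
            word, tag = token, ""
        pairs.append((word, tag))
    return pairs


def words_with_tag(pairs: List[Tuple[str, str]], tag: str) -> List[str]:
    """Return every word form carrying exactly the given tag."""
    return [word for word, t in pairs if t == tag]


# Hypothetical annotated sentence; only gives_VVZ comes from the slides.
sample = "She_PNP gives_VVZ lectures_NN2 and_CJC writes_VVZ papers_NN2"
pairs = parse_tagged(sample)
vvz_forms = words_with_tag(pairs, "VVZ")
print(vvz_forms)
```

In a raw corpus the same query would need a guess based on spelling (for example, words ending in -s), which would also match plural nouns such as "lectures"; with the tag, the lookup is a simple equality test.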
