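Literature Retrieval and Utilization
Wu Lin (武琳)
School of Information Management, Sun Yat-sen University
wulinmail
2019/11/5

Teaching Experience
- Associate Professor, School of Information Management
- PhD in Management; master's supervisor (academic and professional programs)
- Visiting scholar, School of Information Studies, Syracuse University, USA
- Academic expertise: information retrieval, information services, management consulting
- Research interests: network information resource management, digital information services, data science management
- Teaching and teaching-reform projects led:
  - Sun Yat-sen University undergraduate teaching-reform project: development of the Information Organization course on the Blackboard platform
  - Lingnan Foundation (USA) and Sun Yat-sen University Liberal Arts Education Program quality-course project: Network Information Retrieval and Utilization (bilingual)
  - Sun Yat-sen University graduate education and teaching reform research project: Academic Information Retrieval and Tracking of Disciplinary Frontiers
  - Sun Yat-sen University undergraduate bilingual course development project: Information Retrieval

About This Course
- What is literature retrieval for?
- Do I need a course like this?
- What benefits will I gain from it?
- We all use Baidu and Google. Is there anything a search engine cannot do?
- Can it bring high efficiency and a high standard to my research?
- Why?

The Half-Life of Knowledge
When I was in primary school I learned that the solar system had nine planets. At that time no planet had yet been found outside the solar system. Today astronomers have found more than 800 planets orbiting other stars (and well over a thousand further "candidate" objects that may be planets), while Pluto has been demoted to a "dwarf planet". Even a cursory look at other fields reveals similar cases. Arbesman named this phenomenon the "half-life of knowledge" and wrote a book with that title. In it he explains that the turnover of knowledge resembles radioactive decay: you cannot predict whether a particular piece of knowledge will decay and be discarded, but you can work out how long it takes for half of the knowledge in a given field to be discarded.
Source: "The Decay of Truth: Facts Also Have a Half-Life", World Science (世界科学), 2012(11).

For example, in medical research on hepatitis and cirrhosis, researchers studied how long it takes for half of the knowledge in these fields to be overturned. They gave a set of papers from 50 years earlier to a panel of experts, who judged which results had since been overturned or were no longer important. Plotting the results produced a very clear, smooth decay curve. From that chart one can predict that every 45 years half of the knowledge in the field becomes outdated.

Which scientific fields "decay" most slowly, and which fastest? Why do they differ?
There is no standard answer, because the decay rate itself changes over time. For example, when medicine first changed from an art into a science, its knowledge turned over much faster than it does now. Even so, medicine still has a very short half-life; it is in fact one of the fields where knowledge changes fastest. One of the slowest-changing fields is mathematics, because once a theorem is proved the result is usually settled; it is generally not overturned unless someone finds an error in the proof.

One phenomenon we observe is that the social sciences decay much faster than the natural sciences, because there is far more experimental "noise" in the social sciences. In physics, for example, if you want to study parabolic trajectories you can fire a cannon 100 times and see where the shells land. You will find that they cluster very neatly around one point. If what you are measuring is people, however, there are many more problems, because people react to a great many external factors; in other words, effect sizes are much smaller.

I want people to see how knowledge changes. At the same time I want to point out that once you know knowledge changes, you have to stay alert, so that you are not shocked when your child comes home from school and tells you that dinosaurs had feathers. You have to consult sources regularly and recognize that much of what you learned when you were young is no longer state of the art.
Medicine, for example, does well at encouraging practitioners to keep learning the latest knowledge. Many medical students are told that everything they learn will be out of date soon after they graduate. There is even a website called "Latest Information" (最新信息) that continuously updates medical textbooks.
To some extent every field can learn from medicine: we must keep working to rediscover the world, even if that only means checking Wikipedia more often. And I am not talking only about dinosaurs and outer space. You will see the same updating of knowledge in fields close to everyday life, such as nutrition and parenting.

The half-life of knowledge keeps getting shorter, from about 100 years at first down to about 3 years today.
Louis Ross, chief technologist of Ford Motor Company (USA), said: "In your career, knowledge is like milk. The expiry date is printed on the side of the carton. The shelf life of engineering knowledge is about three years; if you have not replaced everything you know by then, your career is over."
Research abroad suggests that in the agricultural economy, education between the ages of 7 and 14 was enough to meet the needs of the next 40 years of work and life; in the industrial economy, schooling extended to ages 5 to 22; and in the knowledge economy, learning becomes a lifelong need.

Lifelong Learning
The concept of "lifelong learning" was put forward in developed countries in the mid-1960s. It met with an enthusiastic international response and became a worldwide trend. In the 1970s UNESCO proposed that humanity should move toward a learning society.
Ten years of hard study do not last a lifetime. Continuous learning requires continuously obtaining the information one needs, and obtaining and using information is part of the core of information literacy.

Is Information Literacy Related to the Ability to Innovate?
A writer has to combine the traits of many similar people to create a new, typical character; a mechanical designer usually needs the inspiration of some prototype to design a new machine. Which of the great scientists was not broadly knowledgeable and richly experienced?
Innovation is in essence the recombination of knowledge and the extension of information. Across the creative methods studied in creativity research, creative ability always depends on the human capacity for "association". Association and imagination depend on impressions and memory; without these, they are water without a source, a tree without roots. Impressions and memory are the amount of information a person commands. In other words, only a rich store of knowledge and information provides a basis for innovation.

Information Literacy
Information literacy is the ability of an individual to recognize when information is needed and to search for, evaluate, organize, and use that information effectively. (One of the most widely used definitions originates with the American Library Association.)
"Information literates are people trained to apply information resources to their work. They have learned the techniques and skills for using information tools and information sources, and have built the ability to solve problems with information."

Information Literacy Is a Basic Ability
Information literacy is the ability to adapt to the information society. The fourth-quarter 2001 report of the CEO Forum on Education and Technology (USA) listed the competencies needed in the 21st century: basic learning skills (reading, writing, arithmetic), information literacy, creative thinking, interpersonal communication and teamwork, and practical ability. Information literacy is one of these. It covers information awareness, information ability, and the application of information.
It means being able to judge when information is needed, knowing how to obtain it, and knowing how to evaluate and use it effectively.

Information Literacy Is a Comprehensive Ability
Information literacy draws on knowledge from many areas. It is a special, very broad ability that includes humanistic, technical, economic, and legal factors and is closely tied to many disciplines.
Information technology supports information literacy. Fluency in information technology stresses understanding, awareness, and skilled use of the technology.
Information literacy, by contrast, focuses on content, communication, and analysis, including information retrieval and evaluation, and so covers more ground. It is a body of knowledge for understanding, collecting, evaluating, and using information, and it requires proficient use of information technology as well as sound investigative methods, discrimination, and reasoning. Information literacy is an information ability; information technology is one of its tools.

Eight Abilities
- Using information tools: using a range of information tools proficiently, especially network communication tools.
- Acquiring information: collecting learning materials and information effectively according to one's learning goals, and making skilled use of reading, interviews, discussion, visits, experiments, retrieval, and other ways of obtaining information.
- Processing information: summarizing, classifying, storing and memorizing, verifying, selecting, analyzing and synthesizing, abstracting, and expressing the information collected.
- Generating information: on the basis of the information collected, accurately summarizing, synthesizing, acting on, and expressing the information needed, so that it is concise, clear, fluent, and individual in style.
- Creating information: letting the interaction of many kinds of collected information spark creative thinking and produce the starting points for new information, so that new information is created and the ultimate purpose of collecting information is achieved.
- Realizing the benefits of information: using the information received to solve problems, so that it yields the greatest social and economic benefit.
- Information collaboration: using information and information tools as a "zero-distance" medium for communication and cooperation across time and space, as an efficient means of extending oneself, and building a variety of harmonious cooperative relationships with the outside world.
- Information immunity: the vast body of information resources is of uneven quality. It takes a sound outlook on life, sound values, the ability to discriminate, and self-control, self-discipline, and self-regulation to resist and eliminate the interference and erosion of junk and harmful information, and to develop an information ethic suited to the times.

Main Contents
1. Basic theory and skills of information retrieval
2. Common information tools: general and academic search engines
3. Large comprehensive academic databases: CNKI, VIP, WOS, GS, Scopus, ABI, EBSCO, EMERALD
4. The distribution of academic information on economics and management on the web
5. Case studies: integrated use of academic information; reference management software

Assessment
The course is assessed through a topic-based retrieval report. The examination counts for 60%, regular lab assignments for 30%, and discussion and attendance for 10%.

Types of Knowledge Needs
For academic questions, we usually divide needs into the following two types according to the nature of the knowledge required.
1. Feature-type knowledge needs: feature-type knowledge generally has a fairly definite answer, which can be obtained from factual and numeric databases, search engines, and encyclopedia-style websites.
Examples: finding the exact meaning of a concept ("What is a blog?"); finding background knowledge about a concept ("Who first discovered penicillin?"); or finding numeric values and quantitative indicators ("the physical and chemical properties of diamond").
2. Associative knowledge needs: in study and research, far more of the knowledge we need has no definite answer. We have to find related knowledge and then analyze, summarize, and synthesize it before the need is met. Knowledge of this kind is called associative knowledge, and finding it generally requires the relevant literature databases.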
Examples: finding general knowledge about a discipline ("What monographs are there on molecular biology?"); finding recent progress in a specialist field ("review articles on nanotechnology research"); and, most commonly, finding the specialist literature related to a research topic.

2024/11/28 (Thursday), School of Information Management

Outline
- Introduction to Information Retrieval
- Web Search (search engine)
- Scholar search tools: Scopus, Google Scholar, etc.
- Large integrating databases: CNKI, VIP, etc.; WOK, EBSCO, etc.
- Economic information retrieval: ABI, Emerald, etc.

Information Literacy
Information literacy is a set of abilities requiring individuals to recognize when information is needed and have the ability to locate, evaluate, and use effectively the needed information. - American Library Association
Information literacy also is increasingly important in the contemporary environment of rapid technological change and proliferating information resources. Because of the escalating complexity of this environment, individuals are faced with diverse, abundant information choices: in their academic studies, in the workplace, and in their personal lives.

Information Literacy Tutorial
- https://pilot.library.qut.edu.au/index.jsp
- studysmart.library.qut.edu.au/
- www.library.pitt.edu/services/classes/infoliteracy/teaching.html
- libraries.stjohns.edu/ilt/
- www.library.american.edu/tutorial/index.html
- www.lib.washington.edu/uwill/tutorial.html
- skil.stanford.edu/intro/index.html
- core.lib.purdue.edu/
- www.kyvl.org/kids/homebase.html
- libguides.ust.hk/infoliteracy
- www.webs.uidaho.edu/info_literacy/

Information Literacy Tutorial
- Evaluating Information Found on the Internet. The Sheridan Libraries of the Johns Hopkins University. guides.library.jhu.edu/evaluatinginformation
- TILT. The University of Texas System. Winner of the 1998 ACRL/Instruction Section Innovation in Instruction Award. library.utb.edu/tilt/
- RIO: Research Instruction Online. The University of Arizona. Winner of the 1999 Innovation in Instruction award from the Association of College and Research Libraries, Instruction Section. www.ic.arizona.edu/ic/rio/
- Tutorials. Penn State University Libraries. www.libraries.psu.edu/psul/tutorials.html, www.engr.psu.edu/cde/nuce/library.html
- Online Library Learning Center. University of Georgia System. www.usg.edu/galileo/skills/
- The Virtual Librarian. Monash University, Melbourne, Australia. www.lib.monash.edu.au/vl/
- Information Literacy Tutorial. Minneapolis Community & Technical College. www.mctc.mnscu.edu/Library/tutorials/infolit/index.html
- Quickstudy: Library Research Guide. The University of Minnesota. tutorial.lib.umn.edu/
- CSU Information Competence Project. This site has been constructed to provide a focal point for the Information Competence initiatives within the California State University system. The site is currently maintained by the Kennedy Library at Cal Poly, San Luis Obispo. www.lib.calpoly.edu/infocomp/
- Mission Critical. San Jose State University's Critical Thinking Web Page. www.sjsu.edu:80/depts/itl/
- RDN Virtual Training Suite. ILRT at The University of Bristol, UK. www.vts.rdn.ac.uk/
- InfoTrekk. Curtin University of Technology, Australia. lisweb.curtin.edu.au/finding/infotrekk/index.html
- InfoTrekk Plus. Curtin University of Technology, Australia. lisweb.curtin.edu.au/finding/infotrekkplus/index.html
- Information Literacy Tutorial. Five Colleges of Ohio. www.denison.edu/ohio5/infolit/
- Core: Comprehensive Online Research Education. Purdue University. core.lib.purdue.edu/

Introduction to Information Retrieval
- Information
- Information Retrieval
- Web Search

Information Retrieval
- Information: What is "information"?
- Retrieval: What do we mean by "retrieval"? What are the different types of information needs?

What is Information?
- What do you think?
- There is no "correct" definition
- Cookie Monster's definition: "news or facts about something"
- Different approaches: Philosophy, Psychology, Linguistics, Electrical engineering, Physics, Computer science, Information science

Dictionary says
- Oxford English Dictionary
  - information: informing, telling; thing told, knowledge, items of knowledge, news
  - knowledge: knowing, familiarity gained by experience; person's range of information; a theoretical or practical understanding of; the sum of what is known
- Random House Dictionary
  - information: knowledge communicated or received concerning a particular fact or circumstance; news

Three Views of Information
- Information as process
- Information as communication
- Information as message transmission and reception

One View
- Information = characteristics of the output of a process
- Tells us something about the process and the input
- Information-generating processes do not occur in isolation
- Ibid.
Figure labels: a Process with several Inputs and Outputs; Process 1 and Process 2, each with Input and Output.

Where's the human?
- If a tree falls in the forest, and no one is around to hear it, is information transmitted?
- In the "information as process" view: Yes, but that's not very interesting to us
- We're concerned about information for human consumption
  - Transmission of information from one person to another
  - Recording of information
  - Reconstruction of stored information

Another View
- Information science is characterized by "the deliberate (purposeful) structure of the message by the sender in order to affect the image structure of the recipient"
- This implies that the sender has knowledge of the recipient's structure
- Text = "a collection of signs purposefully structured by a sender with the intention of changing image-structure of a recipient"
- Information = "the structure of any text which is capable of changing the image-structure of a recipient"
- Nicholas J. Belkin and Stephen E. Robertson. (1976) Information Science and the Phenomenon of Information. Journal of the American Society for Information Science, 27(4), 197-204.

Information Hierarchy
- Data, Information, Knowledge, Wisdom (each level more refined and abstract than the one before)

Information Hierarchy
- Data: the raw material of information
- Information: data organized and presented in a particular manner
- Knowledge: "justified true belief"; information that can be acted upon
- Wisdom: distilled and integrated knowledge; demonstrative of high-level "understanding"

A (Facetious) Example
- Data: 98.6 °F, 99.5 °F, 100.3 °F, 101 °F
- Information: Hourly body temperature: 98.6 °F, 99.5 °F, 100.3 °F, 101 °F
- Knowledge: If you have a temperature above 100 °F, you most likely have a fever
- Wisdom: If you don't feel well, go see a doctor

"Retrieval?"
- "Fetch something" that's been stored
- Recover a stored state of knowledge
- Search through stored messages to find some messages relevant to the task at hand
Figure labels: Sender, Recipient, Encoding, Decoding, storage, message, noise, indexing/writing, retrieval/reading.

What is IR?
- Information retrieval is a problem-oriented discipline, concerned with the problem of the effective and efficient transfer of desired information between human generator and human user.
- Nicholas J. Belkin. (1980) Anomalous States of Knowledge as a Basis for Information Retrieval. Canadian Journal of Information Science, 5, 133-143.

Modern History
- The "information overload" problem is much older than you may think
- Origins in the period immediately after World War II
  - Tremendous scientific progress during the war
  - Rapid growth in the amount of scientific publications available
- The "Memex Machine"
  - Conceived by Vannevar Bush, President Roosevelt's science advisor
  - Outlined in a 1945 Atlantic Monthly article titled "As We May Think"
  - Foreshadows the development of hypertext (the Web) and information retrieval systems

The Memex Machine
Bush's Memex is the prototype of the modern personal computer and the first ideal machine designed to help people think and manage information. Bush described the functions of the Memex in detail in "Mechanization and the Record" in 1939; the idea was not written up as "As We May Think" until six years later. That article argues that the Memex would affect four areas: strengthening human control over the information environment, increasing people's ability to control and process information themselves, improving human welfare and support, and strengthening the human capacity to process thought. At the time, Bush regarded the Memex as a personal file and private library, stressed its unique character as a machine resembling the human brain, and saw its functions as far beyond those of the digital computers then under development. The Memex of 1945 was designed to extend human physical abilities and could not enhance intelligence. Later researchers added functions such as indexing and linking to the Memex to expand its capacity to provide information-related services.

Types of Information Needs
- Retrospective
  - "Searching the past"
  - Different queries posed against a static collection
  - Time invariant
- Prospective
  - "Searching the future"
  - Static query posed against a dynamic collection
  - Time dependent

Retrospective Searches (I)
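The two types of information need named above (retrospective: different queries against a static collection; prospective: a static query against a dynamic collection) can be contrasted in a minimal sketch. Everything in it is an illustrative assumption rather than part of the lecture: the function names `retrospective_search` and `prospective_filter` are hypothetical, the sample documents are made up, and "relevance" is reduced to a toy test that every query word occurs in the document.

```python
from typing import Iterable, Iterator, List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase a text and split it on whitespace into a set of words."""
    return set(text.lower().split())


def matches(query: str, document: str) -> bool:
    """Toy relevance test (an assumption): every query word is in the document."""
    return tokens(query) <= tokens(document)


def retrospective_search(query: str, collection: List[str]) -> List[str]:
    """Retrospective: run one ad hoc query against a fixed, stored collection."""
    return [doc for doc in collection if matches(query, doc)]


def prospective_filter(profile: str, stream: Iterable[str]) -> Iterator[str]:
    """Prospective: apply one standing query to documents as they arrive,
    making a binary keep/drop decision for each incoming document."""
    for doc in stream:
        if matches(profile, doc):
            yield doc


# A static collection queried with different questions.
collection = [
    "hubble telescope images of distant galaxies",
    "endangered mammals and their habitats",
    "history of the hubble telescope launch",
]
print(retrospective_search("hubble telescope", collection))
print(retrospective_search("endangered mammals", collection))

# A fixed profile applied to a changing stream of new documents.
incoming = ["new hubble telescope observation released", "local sports results"]
print(list(prospective_filter("hubble telescope", incoming)))
```

The collection stays the same while the queries change in the first two calls; in the last call the query stays the same while the documents change. The searches listed next are all of the retrospective kind.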
- Ad hoc retrieval: find documents "about this"
- Known item search
- Directed exploration
Example queries:
- Identify positive accomplishments of the Hubble telescope since it was launched in 1991.
- Compile a list of mammals that are considered to be endangered, identify their habitat and, if possible, specify what threatens them.
- Find Jimmy Lin's homepage.
- What's the ISBN number of "Modern Information Retrieval"?
- Who makes the best chocolates?
- What video conferencing systems exist for digital reference desk services?

Retrospective Searches (II)
Question answering
- "Factoid": Who discovered Oxygen? When did Hawaii become a state? Where is Ayers Rock located? What team won the World Series in 1992?
- "List": What countries export oil? Name U.S. cities that have a "Shubert" theater.
- "Definition": Who is Aaron Copland? What is a quasar?

Prospective "Searches"
- Filtering: make a binary decision about each incoming document
- Routing: sort incoming documents into different bins
Examples:
- Spam or not spam?
- Categorize news headlines: World? Nation? Metro? Sports?

What about databases?
- What are examples of databases?
  - Banks storing account information
  - Retailers storing inventories
  - Universities storing student grades
- What exactly is a (relational) database?
  - Think of them as a collection of tables
  - They model some aspect of "the world"

A (Simple) Database Example
Figure labels: Student Table, Department Table, Course Table, Enrollment Table.

Database Queries
- What would you want to know from a database?
  - What classes is John Arrow enrolled in?
  - Who has the highest grade in IR206?
  - Who's in the history department?
  - Of all the non-sysu students taking IR206 who have a last name shorter than six characters and were born on a Monday, who has the longest email address?

Databases vs. IR
- What we're retrieving
  - IR: Mostly unstructured. Free text with some metadata.
  - Databases: Structured data. Clear semantics based on a formal model.
- Queries we're posing
  - IR: Vague, imprecise information needs (often expressed in natural language).
  - Databases: Formally (mathematically) defined queries. Unambiguous.
- Results we get
  - IR: Sometimes relevant, often not.
  - Databases: Exact. Always correct in a formal sense.
- Interaction with system
  - IR: Interaction is important.
  - Databases: One-shot queries.
- Other issues
  - IR: Issues downplayed.
  - Databases: Concurrency, recovery, atomicity are all critical.

The Big Picture
- The four components of the information retrieval environment: User, Process, System, Collection
- Slide annotations: "What computer geeks care about!" and "What we care about!"

The Information Retrieval Cycle
Figure labels: Source Selection, Search, Query, Selection, Ranked List, Examination, Documents, Delivery, Documents, Query Formulation, Resource, query reformulation, vocabulary learning, relevance feedback, source reselection.

Supporting the Search Process
Figure labels: Source Selection, Search, Query, Selection, Ranked List, Examination, Documents, Delivery, Documents, Query Formulation, Resource, Indexing, Index, Acquisition, Collection.

Simplification?
Figure labels: Source Selection, Search, Query, Selection, Ranked List, Examination, Documents, Delivery, Documents, Query Formulation, Resource, query reformulation, vocabulary learning, relevance feedback, sou