Artificial Intelligence: Learning

Outline
- Introduction to machine learning
- Supervised learning
  - Decision tree learning
  - Linear predictions
  - Support vector machines
- Unsupervised learning

Learning
- Learning is essential for unknown environments, i.e., when the designer lacks omniscience.
- Learning is useful as a system construction method, i.e., writing a learning agent program works far better than trying to write the function down directly.
- Learning continually modifies the agent's decision mechanisms to improve performance.

Learning agents

Learning element
Design of a learning element is affected by:
- Which components of the performance element are to be learned
- What feedback is available to learn these components
- What representation is used for the components

Introduction to Machine Learning

Machine Learning Everyday
- Search engines
- Spam detection
- Machine translation
- Face detection: now in most digital cameras for auto focusing

Machine Learning
- Grew out of work in artificial intelligence
- A new capability for computers

Why Machine Learning?
- Solve classification problems
- Learn models of data ("data fitting")
- Understand and improve the efficiency of human learning
- Discover new things or structures that are unknown to humans ("data mining")

Why Machine Learning?
- Large amounts of data: Web data, medical data, biological data
- Manual analysis is expensive
- Computers have become cheaper and more powerful

Why Machine Learning?
- Applications that cannot be programmed directly by hand: autonomous driving, handwriting recognition, natural language processing (NLP), computer vision
- Understanding human learning (the brain, real AI)

What is machine learning useful for?
- Automatic speech recognition: most current speech recognition and translation systems keep learning, so the more you use them, the better they become.
- Computer vision: e.g. object, face and handwriting recognition
- Information retrieval: reading, digesting and categorizing large text databases is difficult for humans. For Web pages, the tasks include:
  - Retrieval
  - Categorization
  - Clustering
  - Relations between pages
- Financial prediction
- Medical diagnosis
- Bioinformatics: e.g. modelling gene microarray data, protein structure prediction
- Robotics
- Movie recommendation systems

Machine Learning
Machine learning is an interdisciplinary field focusing on both the mathematical foundations and practical applications of systems that learn, reason and act.
- Other related terms: Pattern Recognition, Neural Networks, Data Mining, Statistical Modeling, ...
- Using ideas from: Statistics, Computer Science, Engineering, Applied Mathematics, Cognitive Science, Psychology, Computational Neuroscience, Economics
- The goal of these lectures: to introduce important concepts, models and algorithms in machine learning.

Machine Learning: Definition
Tom Mitchell (1998), well-posed learning problem: "A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E."

Suppose your email program watches which emails you mark as spam and, based on that, learns how to filter spam better. What is the task T in this setting?
- Classifying emails as spam or not spam.
- Watching you label emails as spam or not spam.
- The number (or fraction) of emails correctly classified as spam/not spam.
- None of the above: this is not a machine learning problem.

Types of learning
Imagine an agent or machine that receives a sequence of sensory inputs $x_1, x_2, x_3, x_4, \ldots$
- Supervised learning: the machine is also given desired outputs $y_1, y_2, \ldots$, and its goal is to learn to produce the correct output given a new input.
- Unsupervised learning: the outputs $y_1, y_2, \ldots$ are not given; the agent still wants to build a model of $x$ that can be used for reasoning, decision making, predicting things, communicating, etc.
- Semi-supervised learning

Representing "objects" in machine learning
- An instance $x$ represents a specific object.
- $x$ is usually represented as a $d$-dimensional feature vector $x = (x_1, \ldots, x_d) \in \mathbb{R}^d$.
- Each dimension is called a feature or attribute.
- Feature values can be continuous or discrete.
- $x$ is a point in the $d$-dimensional feature space.
- The representation is an abstraction of the object that ignores all other aspects (e.g., two people having the same weight and height may be considered identical).

Feature vector representation
Text documents:
- A vocabulary of size $d$ (about 100,000)
- "Bag of words": a count for each vocabulary term
- Stop words (the, of, at, in, ...) are usually ignored.
- A special "out-of-vocabulary" (OOV) entry captures all unknown words.

Feature vector representation
- Images: pixels, color histogram
- Bank accounts: credit rating, balance, number of deposits in the last day, week, month and year, number of withdrawals
- You and me: medical features (test1, test2, test3, ...)

Basic ingredients
- Data: the data set $D$ consists of $N$ data points, $D = \{x_1, x_2, \ldots, x_N\}$.
- Predictions: we are generally interested in predicting something based on the observed data set. Given $D$, what can we say about $x_{N+1}$?
- Model: to make predictions, we need to make some assumptions. We can often express these assumptions in the form of a model with some parameters. Given the data set $D$, we learn the model parameters so that we can make predictions on new data.

Learning framework
Experiment 3:

Learning Problems

Housing price prediction
- Supervised learning: the "right answers" are given in the data.
- Regression: predict a continuous-valued output (price).

Breast cancer (malignant, benign)
- Supervised learning: the "right answers" are given in the data.
- Classification: predict a discrete-valued output.

Supervised Learning

Unsupervised Learning

Next
- Machine learning algorithms
  - Supervised learning
  - Unsupervised learning
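
The housing price example and the data / model / prediction ingredients above can be made concrete with a small sketch. The slides do not say which model is used, so the code below assumes a straight line fitted by ordinary least squares. The data values and the names `fit_line` and `predict` are hypothetical, chosen only for illustration.

```python
# Minimal sketch of supervised regression, assuming a straight-line model
# price = slope * size + intercept. All numbers below are made up.

# Data: N observed points with their "right answers" (the prices).
sizes = [50.0, 70.0, 90.0, 110.0]      # hypothetical house sizes
prices = [150.0, 210.0, 270.0, 330.0]  # hypothetical prices


def fit_line(xs, ys):
    """Learn the two model parameters by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of the inputs, and how inputs and outputs vary together.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(theta, x):
    """Predict a continuous output for a new input x."""
    slope, intercept = theta
    return slope * x + intercept


# Learning: estimate the parameters from the data set.
theta = fit_line(sizes, prices)

# Prediction: apply the learned model to an input that was not in the data.
print(predict(theta, 100.0))
```

In Mitchell's terms, the task T here is predicting a price, the experience E is the list of labeled examples, and a performance measure P could be the squared error on houses held out from training. A classification problem such as the breast cancer example would keep the same structure but predict a discrete label instead of a number.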






