Artificial Intelligence Learning: Lecture Notes (.ppt)


Supervised learning
- Formal parametric representation
- Classification
- Regression

Classification
We are given a set of N observations \( (x_i, y_i) \), \( i = 1, \dots, N \), and need to map \( x \in X \) to a label \( y \in Y \).

Decision Trees
Textbook Section 18.3

Learning decision trees
Problem: decide whether to wait for a table at a restaurant, based on the following attributes:
1. Alternate: is there an alternative restaurant nearby?
2. Bar: is there a comfortable bar area to wait in?
3. Fri/Sat: is today Friday or Saturday?
4. Hungry: are we hungry?
5. Patrons: number of people in the restaurant (None, Some, Full)
6. Price: price range ($, $$, $$$)
7. Raining: is it raining outside?
8. Reservation: have we made a reservation?
9. Type: kind of restaurant (French, Italian, Thai, Burger)
10. Wait Estimate: estimated waiting time in minutes (0-10, 10-30, 30-60, >60)

Attribute-based representations
Twelve examples are described by these 10 attributes; attribute values are Boolean, discrete, or continuous. E.g., situations where I will/won't wait for a table.
Classification of examples is positive (T) or negative (F).

Decision trees
One possible representation for hypotheses, e.g., the "true" tree for deciding whether to wait.

Decision Tree Learning

Expressiveness
Decision trees can express any function of the input attributes. E.g., for Boolean functions, each truth table row corresponds to a path to a leaf.
Trivially, there is a consistent decision tree for any training set, with one path to a leaf for each example (unless f is nondeterministic in x), but it probably won't generalize to new examples.
We need to find a more compact decision tree.

Decision tree learning
Goal: find a small decision tree that is consistent with the training examples.
Idea: (recursively) choose the best attribute as the root of the (sub)tree.

Choosing an attribute
Idea: a good attribute splits the examples into subsets that are (ideally) "all positive" or "all negative".
Patrons? is a better choice.

Using information theory
Used to implement the Choose-Attribute function in the DTL algorithm.
Information content (entropy): for a training set containing p positive examples and n negative examples:
\( I\left(\frac{p}{p+n}, \frac{n}{p+n}\right) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n} \)

Information gain
Any attribute A divides the training set E into subsets \( E_1, \dots, E_v \) according to their values for A, where A has v distinct values.
The information gain (IG) from the attribute test on A is the difference between the original information requirement and the new information requirement:
\( IG(A) = I\left(\frac{p}{p+n}, \frac{n}{p+n}\right) - \sum_{i=1}^{v} \frac{p_i+n_i}{p+n}\, I\left(\frac{p_i}{p_i+n_i}, \frac{n_i}{p_i+n_i}\right) \)
where subset \( E_i \) contains \( p_i \) positive and \( n_i \) negative examples.
Choose the attribute with the largest IG.

Information gain: example
For the training set, p = n = 6, so \( I(6/12, 6/12) = 1 \) bit.
Consider the attributes Patrons and Type (and others too).
Patrons has the highest IG of all attributes and so is chosen by the DTL algorithm as the root.

Example contd.
Decision tree learned from the 12 examples: substantially simpler than the "true" tree shown earlier.

Performance evaluation
How do we know that \( h \approx f \)?
1. Use theorems of computational/statistical learning theory.
2. Try h on a new test set of examples (use the same distribution over the example space as for the training set).
Learning curve = % correct on the test set as a function of training set size.

Comments on decision-tree-based classification
Advantages:
- Easy to construct
- Fast at classifying unknown records
- Easy to interpret for small trees
- Accuracy comparable to other classification algorithms on simple data sets
Example: C4.5. Simple depth-first construction. Uses information gain.

K nearest neighbor classifier
Textbook Section 20.4

Linear predictions

Learning Framework

Focus of this part
- Binary classification (e.g., predicting spam or not spam)
- Regression (e.g., predicting housing price)

Classification
Classification = learning from data with finite discrete labels. It is the dominant problem in machine learning.

Linear classifiers
Binary classification can be viewed as the task of separating classes in feature space.

Roadmap

Linear classifiers
\( h(x) = \mathrm{sign}(w^\top x + b) \)
We need to find a suitable w (direction) and b (location) for the decision boundary.
We want to minimize the expected zero/one loss for the classifier \( h: X \to Y \), which is \( h(x) = \mathrm{sign}(w^\top x + b) \).
Ideally, the classes are perfectly separated.

Linear classifiers: loss minimization
Ideally we want to find a classifier \( h(x) = \mathrm{sign}(w^\top x + b) \) that minimizes the 0/1 loss.
Unfortunately, this is a hard problem, so alternative loss functions are used instead.

Learning as Optimization

Least Squares Classification
Goal: learn a classifier \( h(x) = \mathrm{sign}(w^\top x + b) \) that minimizes the least squares loss function.

Least squares classification: solution

Solution for w

General linear classification

Regression
Regression = learning from continuously labeled data.

Linear regression

General linear/polynomial regression

Model complexity and overfitting
- Underfitting: high bias
- Overfitting: high variance

Prediction Errors
- Training errors (apparent errors): errors committed on the training set
- Test errors: errors committed on the test set
- Generalization errors: expected error of a model over a random selection of records from the same distribution (the expected error on unseen records)

Model complexity and overfitting
- Underfitting: when the model is too simple, both training and test errors are large
- Overfitting: when the model is too complex, training error is small but test error is large

Incorporating Model Complexity
Rationale: Ockham's razor.
Given two models with similar generalization errors, one should prefer the simpler model over the more complex model.
A complex model has a greater chance of being fitted accidentally by errors in the data.
Model complexity should therefore be taken into account when evaluating a model.

Regularization
Intuition: small values for parameters lead to a "simpler" hypothesis, which is less prone to overfitting.

L-2 and L-1 regularization
- L-2: easy to optimize, closed-form solution
- L-1: sparsity

More than two classes?

Comments on least squares classification
Not the best approach for classification problems, but:
- Easy to train (closed-form solution)
- Can be combined with many classical learning principles

Cross-validation
Basic idea: if a model overfits (is sensitive to the training data), then it is unstable; that is, removing part of the data changes the fit significantly.
So we first hold out part of the data, fit the model on the remaining data, and then test it on the held-out data.

Learning Framework

Model/parameter learning paradigm
- Choose a model class: NB, kNN, decision tree, loss/regularization combination
- Model selection: cross-validation
- Training: optimization
- Testing

Summary
Supervised learning
(1) Classification: Naive Bayes model, decision tree, least squares classification
(2) Regression: least squares regression

Exercise
Prove that for any training set containing no conflicting data (i.e., no examples with identical feature vectors but different labels), there exists a decision tree consistent with the training set (i.e., with training error 0).
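
The information gain calculation behind the choice of Patrons as the root can be sketched in a few lines of Python. This is only an illustration: the per-value counts of positive and negative examples for Patrons and Type are not in the text of these slides and are assumed here from the standard textbook version of the restaurant example, and the function names are hypothetical.

```python
from math import log2


def entropy(p, n):
    """Information content I(p/(p+n), n/(p+n)) in bits for p positive, n negative examples."""
    total = p + n
    if total == 0:
        return 0.0
    bits = 0.0
    for count in (p, n):
        if count:  # treat 0 * log2(0) as 0
            q = count / total
            bits -= q * log2(q)
    return bits


def information_gain(splits):
    """Information gain of an attribute.

    splits holds one (p_i, n_i) pair per attribute value: the number of
    positive and negative examples that fall into subset E_i.
    """
    p = sum(p_i for p_i, _ in splits)
    n = sum(n_i for _, n_i in splits)
    remainder = sum((p_i + n_i) / (p + n) * entropy(p_i, n_i) for p_i, n_i in splits)
    return entropy(p, n) - remainder


# ASSUMPTION: counts taken from the textbook restaurant data set, not from these slides.
patrons = [(0, 2), (4, 0), (2, 4)]                  # None, Some, Full
restaurant_type = [(1, 1), (1, 1), (2, 2), (2, 2)]  # French, Italian, Thai, Burger

print(f"I(6/12, 6/12) = {entropy(6, 6):.3f} bits")
print(f"IG(Patrons)   = {information_gain(patrons):.3f} bits")
print(f"IG(Type)      = {abs(information_gain(restaurant_type)):.3f} bits")
```

Under these assumed counts, Patrons gains about 0.541 bits while Type gains nothing, because every Type subset is still half positive and half negative. That matches the statement that Patrons is the better root.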

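The ideas of least squares classification, L-2 regularization, and cross-validation can be combined in one small sketch. Everything specific here is an assumption made for illustration rather than something taken from the slides: the data set, the restriction to a single feature, the choice to penalize only w (not b), the interleaved fold assignment, and all function names. With one feature, minimizing the sum of squared errors plus lam * w^2 has the closed-form solution w = S_xy / (S_xx + lam), b = mean(y) - w * mean(x).

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form L-2 regularized least squares fit of y ~ w*x + b (b is not penalized).

    Assumes S_xx + lam > 0, i.e. the inputs are not all identical when lam == 0.
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    w = s_xy / (s_xx + lam)
    b = y_mean - w * x_mean
    return w, b


def predict(w, b, x):
    """Linear classifier h(x) = sign(w*x + b), with ties mapped to +1."""
    return 1 if w * x + b >= 0 else -1


def cv_error(xs, ys, k, lam):
    """k-fold cross-validation estimate of the 0/1 error rate."""
    n = len(xs)
    mistakes = 0
    for fold in range(k):
        # Hold out every k-th example, fit on the rest, test on the held-out part.
        test_idx = [i for i in range(n) if i % k == fold]
        train_idx = [i for i in range(n) if i % k != fold]
        w, b = fit_ridge_1d([xs[i] for i in train_idx],
                            [ys[i] for i in train_idx], lam)
        mistakes += sum(predict(w, b, xs[i]) != ys[i] for i in test_idx)
    return mistakes / n


# Hypothetical, linearly separable toy data: label is +1 for positive x, -1 otherwise.
xs = [-3.0, -2.5, -2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ys = [-1, -1, -1, -1, -1, -1, 1, 1, 1, 1, 1, 1]

for lam in (0.0, 1.0, 10.0):
    w, b = fit_ridge_1d(xs, ys, lam)
    print(f"lam={lam}: w={w:.3f}, b={b:.3f}, 3-fold CV error={cv_error(xs, ys, 3, lam)}")
```

Increasing lam shrinks w toward zero, which is the "small parameter values" intuition behind regularization. In practice one would compare the cross-validation error across several lam values (and model classes) and keep the setting with the lowest estimate.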