Deep Learning: The Fast-Advancing Frontier of Machine Learning and Artificial Intelligence
Shiguang Shan
Institute of Computing Technology, Chinese Academy of Sciences

Outline
- Deep learning (DL) and its application frontiers
- Lessons from DL applications in computer vision (CV)
- Key algorithms
  - The Perceptron and its learning algorithm
  - MLP and the BP algorithm
  - Auto-Encoder
  - CNN and its major variants
- Reflections and discussion on DL

Basic Tasks of Machine Learning
- Class label (classification): e.g. object recognition -- dog, cat, horse, ...
- Vector (estimation): e.g. super-resolution -- from a low-resolution image to a high-resolution image

Origins: Inspiration from Biological Nervous Systems
- Neurons are connected through synapses; receptive fields are organized hierarchically
- Learning strengthens, weakens, or even eliminates synaptic connections
- Hubel, D. H. & Wiesel, T. N. (1962)
First-Generation Neural Networks
- Frank Rosenblatt (1957), The Perceptron -- a perceiving and recognizing automaton. Report 85-460-1, Cornell Aeronautical Laboratory.
- Limitations of the single-layer perceptron model
  - Minsky & Papert's monograph Perceptrons (1969)
  - It can only classify linearly separable patterns and cannot solve the XOR problem
  - This all but pronounced a death sentence on the model family and triggered years of decline in neural network research

Second-Generation Neural Networks
- Multi-Layer Perceptron (MLP): more than one hidden layer (layers whose correct outputs are unknown)
- The BP algorithm (Rumelhart et al., 1986): compute the error signal at the output, then back-propagate the error signal to obtain derivatives for learning
- David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. (Oct. 1986). Learning representations by back-propagating errors. Nature 323(6088): 533-536.

Error Backpropagation
- W denotes the parameters of the network; J is the objective function
- Feed-forward operation: input layer -> hidden layers -> output layer, compared against target values
- Back error propagation: the error signal is propagated from the output layer back through the hidden layers

Second-Generation Neural Networks
- In theory, deeper is better; in fact, two layers of weights can already approximate any continuous function mapping
- Unfortunately, training is hard:
  - It requires labeled training data, yet almost all data is unlabeled
  - The learning time does not scale well: it is very slow in networks with multiple hidden layers
  - It can get stuck in poor local optima; these are often quite good, but for deep nets they are far from optimal
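The feed-forward and back-error-propagation steps above can be sketched with a tiny MLP in numpy. This is an illustrative sketch, not code from the talk: the layer sizes, learning rate, iteration count, and the XOR toy task are my own choices (XOR is a natural pick, since it is exactly the problem a single-layer perceptron cannot solve).

```python
import numpy as np

# A minimal MLP trained with error backpropagation (sigmoid units,
# squared error). All hyperparameters here are illustrative choices.

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR -- the task a single-layer perceptron provably cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=1.0, size=(2, 8))   # input -> hidden weights
b1 = np.zeros(8)
W2 = rng.normal(scale=1.0, size=(8, 1))   # hidden -> output weights
b2 = np.zeros(1)

lr = 1.0
loss0 = None
for _ in range(5000):
    # Feed-forward operation
    H = sigmoid(X @ W1 + b1)              # hidden activations
    Y = sigmoid(H @ W2 + b2)              # network output
    loss = float(((Y - T) ** 2).mean())   # objective J
    if loss0 is None:
        loss0 = loss
    # Back error propagation: error signal at each layer
    dY = (Y - T) * Y * (1 - Y)            # output layer
    dH = (dY @ W2.T) * H * (1 - H)        # hidden layers
    # Gradient-descent update of the parameters W
    W2 -= lr * H.T @ dY
    b2 -= lr * dY.sum(axis=0)
    W1 -= lr * X.T @ dH
    b1 -= lr * dH.sum(axis=0)

pred = (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2) > 0.5).astype(int)
print("final loss:", loss)
```

The loop illustrates both difficulties named on the slide: it needs labeled targets T, and with a poor initialization it can stall in a bad local optimum.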
What Was More Popular, 1990-2006
- Specific methods for specific tasks
- Hand-crafted features (SIFT, LBP, HOG)
- ML methods: SVM (kernel tricks), Boosting (AdaBoost), kNN, decision trees
- Kruger et al., TPAMI'13

A Breakthrough Back in 2006
- In 2006, layer-wise, unsupervised pre-training finally made it possible to train deep network architectures
- Hinton, G. E., Osindero, S. and Teh, Y. A fast learning algorithm for deep belief nets. Neural Computation 18: 1527-1554, 2006
- Hinton, G. E. and Salakhutdinov, R. R. (2006) Reducing the dimensionality of data with neural networks. Science, Vol. 313, no. 5786, pp. 504-507, 28 July 2006
- Yoshua Bengio, Pascal Lamblin, Dan Popovici and Hugo Larochelle. Greedy Layer-Wise Training of Deep Networks. NIPS 2006
- Marc'Aurelio Ranzato, Christopher Poultney, Sumit Chopra and Yann LeCun. Efficient Learning of Sparse Representations with an Energy-Based Model. NIPS 2006
There Was an Exception: CNN
- The convolutional neural network (CNN) was trained successfully all along:
  - K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, pp. 193-202, 1980
  - Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, "Backpropagation applied to handwritten zip code recognition," Neural Computation, vol. 1, no. 4, pp. 541-551, 1989
  - Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998
- Neocognitron (1980): local connections between layers
- The exception at work: CNN applied to digit recognition
- The exception at work: CNN applied to object detection and recognition
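The "local connection" idea of the Neocognitron, inherited by CNNs, can be sketched as a 2-D convolution with one shared kernel: each output unit sees only a local patch of the input. A minimal numpy sketch; the edge-detector kernel and toy image are my own illustrative choices.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as in most CNN code):
    one shared kernel slides over the image, so every output unit is
    connected only to a local patch -- the local-connection plus
    weight-sharing idea of the Neocognitron and of CNNs."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge detector on a toy image: left half dark, right half bright.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
k = np.array([[-1.0, 1.0]])   # responds to left-to-right intensity jumps
resp = conv2d(img, k)
print(resp)                   # nonzero only at the column with the edge
```

In a learned CNN the kernel values come from BP rather than being hand-set, but the connectivity pattern is exactly this.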
Moreover, the Tailwinds Mattered Just as Much
- Big data: speech, images, video
- Computing power: parallel computing platforms, large-scale GPU deployment
- An open community: open source, open data

Speech Recognition (2011)
- [Timeline figure: 1986 BP -> 2006 DBN (Science) -> 2011 speech recognition]

The Great Leap in Computer Vision, 2012
- ImageNet object classification task: 1000 classes, 1,431,167 images

  Rank  Name         Top-5 error  Description
  1     U. Toronto   0.153        Deep learning
  2     U. Tokyo     0.261        Hand-crafted features and learning models; bottlenecked
  3     U. Oxford    0.270
  4     Xerox/INRIA  0.271

ImageNet with a Deep CNN
- Method: a large-scale CNN network
- A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet Classification with Deep Convolutional Neural Networks," NIPS, 2012.
- 650K neurons, 60M parameters
- Trained with BP on GPUs
- Various tricks plus dropout: ReLU, data augmentation, contrast normalization, ...
- The team was acquired by Google (Jan 2013)
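Two of the tricks named above are easy to sketch in isolation. The dropout shown here is the "inverted" variant common today, which rescales at training time; this is my own minimal formulation, not necessarily the exact 2012 recipe.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: cheap, and its gradient does not
    # saturate for positive inputs.
    return np.maximum(0.0, x)

def dropout(x, p_drop, rng, train=True):
    """Inverted dropout: at training time, randomly zero units with
    probability p_drop and rescale the survivors by 1/(1 - p_drop),
    so nothing changes at test time. (A sketch; the original AlexNet
    formulation rescaled at test time instead.)"""
    if not train or p_drop == 0.0:
        return x
    mask = rng.random(x.shape) >= p_drop
    return x * mask / (1.0 - p_drop)

rng = np.random.default_rng(0)
h = relu(rng.normal(size=(4, 8)))            # a hidden layer's activations
h_train = dropout(h, 0.5, rng)               # about half the units zeroed
h_test = dropout(h, 0.5, rng, train=False)   # identity at test time
```

Dropout acts as a regularizer, which is why a 60M-parameter network could be trained without catastrophic overfitting.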
- Google+ photo tagging (May 2013)

ImageNet Object Classification (2013)
- 1000 classes, 1,431,167 images, Top-5 error rates

  Rank  Name    Top-5 error  Description
  1     NYU     0.11197      Deep learning
  2     NUS     0.12535      Deep learning
  3     Oxford  0.13555      Deep learning

MIT Tech Review Could No Longer Sit Still

ImageNet Object Classification (2014)
- 1000 classes, 1,431,167 images, Top-5 error rates

  Rank  Name    Top-5 error  Description
  1     Google  0.06656      Deep learning
  2     Oxford  0.07325      Deep learning
  3     MSRA    0.08062      Deep learning

ImageNet Object Classification (2014)
- GoogLeNet: 22 convolutional layers
- Szegedy C., Liu W., Jia Y., et al. Going deeper with convolutions. CVPR 2015

ImageNet Object Classification (2010-2014)
- Steady progress on the ImageNet Top-5 error rate
ImageNet Object Detection Task
- 200 classes, 456,567 images, detection rate
- Traditional method: SIFT + BoW + SPM; deep method: R-CNN + GoogLeNet

Rapid Progress in Object Segmentation / Semantic Labeling
- Jonathan Long, Evan Shelhamer, Trevor Darrell. Fully Convolutional Networks for Semantic Segmentation. CVPR 2015

How Hot Is DL?
- Deep learning for vision: of 602 papers, 87 have "Deep" in the title alone, 47 have "Convolution", 40 "Neural", 51 "Network", and 7 "Recurrent"
- Going deeper; optimization; unsupervised and self-taught learning
- Fully convolutional networks (for segmentation, etc.)
- Vision and language (image captioning: Google, Fei-Fei Li, Microsoft, UC Berkeley)
- RNNs with LSTM (for sequence processing)
- Deep learning for X (detection, metric learning, attributes, hashing, ...)

Major Progress in Computer Vision
- Vision and language (Google, Microsoft, UCB)
- Image captioning: the homework Minsky assigned 60 years ago
  - Show and Tell: A Neural Image Caption Generator (a work from Google)
  - From Captions to Visual Concepts and Back (a work from Microsoft)
  - Long-term Recurrent Convolutional Networks for Visual Recognition and Description (a work from UTA/UML/UCB)

Progress in Face Recognition
- Accuracy 95.17%: D. Chen, X. Cao, F. Wen, J. Sun, CVPR'13
- Accuracy 97.35%: Y. Taigman, M. Yang, M. Ranzato, L. Wolf, CVPR'14
- Accuracy 99.47%: Y. Sun, X. Wang, and X. Tang, CVPR'14
- Accuracy 99.63%: F. Schroff, D. Kalenichenko, and J. Philbin, CVPR'15
- On LFW, the error rate fell from 5% to 0.5% in the past two years (from about 300 misclassified pairs down to about 30)

Progress in Face Recognition
- Labeled Faces in the Wild (LFW): face recognition under unconstrained conditions
  - Data collected from the Internet: foreign celebrities, Yahoo news; a widely adopted test protocol
  - Training set: unrestricted
  - Verification task; a test set of 6000 image pairs
- Huang G. B., Ramesh M., Berg T., et al. Labeled faces in the wild: A database for studying face recognition in unconstrained environments. Technical Report, University of Massachusetts, Amherst, 2007.

Progress in Face Recognition
- 2014: DeepFace [1] (Facebook); big data: 4K people, 4.4M images
- [1] Taigman Y., Yang M., Ranzato M. A., et al. DeepFace: Closing the gap to human-level performance in face verification. CVPR, 2014.
- [2] Sun Y., Wang X., Tang X. Deeply learned face representations are sparse, selective, and robust. arXiv preprint, 2014.

Progress in Face Recognition
- DeepID2+ (The Chinese University of Hong Kong)
  - A separate CNN trained on each of 25 face patches (4 convolutional layers, 4 fully connected layers, 4 verification loss signals and 1 identification loss signal)
  - Training data: 10K people, 202K celebrity images
  - Y. Sun, X. Wang, and X. Tang, CVPR'14
Progress in Face Recognition
- Google's latest, FaceNet: a deep network (22 layers) + massive data (8 million people, 200 million images) + triplet loss (which requires no extra GPU memory)
- F. Schroff, D. Kalenichenko, and J. Philbin, CVPR'15
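FaceNet's triplet loss pulls an anchor embedding toward a positive sample (same identity) and pushes it away from a negative sample (different identity) by a margin. A minimal numpy sketch; the margin value and the toy embeddings are illustrative choices, not the paper's.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """FaceNet-style triplet loss on embedding vectors:
    max(0, ||a - p||^2 - ||a - n||^2 + margin),
    with embeddings L2-normalized first."""
    def l2norm(v):
        return v / np.linalg.norm(v)
    a, p, n = l2norm(anchor), l2norm(positive), l2norm(negative)
    d_ap = np.sum((a - p) ** 2)   # anchor-positive squared distance
    d_an = np.sum((a - n) ** 2)   # anchor-negative squared distance
    return max(0.0, d_ap - d_an + margin)

# Same-identity pair close together, different identity far away:
a = np.array([1.0, 0.0])
p = np.array([0.9, 0.1])
n = np.array([0.0, 1.0])
print(triplet_loss(a, p, n))   # 0: this triplet already satisfies the margin
```

The loss is zero once the negative is further from the anchor than the positive by at least the margin, so training focuses on the triplets that still violate it.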
Visual Processing Before DL
- The philosophy behind step-by-step processing: divide and conquer
- Knowledge-driven: hand-crafted features; "I think it should be solved by methods like ..."

Visual Processing With DL and After
- The network learns low-, mid-, and high-level features close to what we hope for

Visual Processing Before DL
- Task: class labels (classification) or vectors (regression/estimation)
- Manually design F (or learn parts of F) with domain knowledge, one step at a time: filters, local features (SIFT), BoW, histograms, max/sum pooling, discriminant analysis, kernel tricks, piecewise-linear models, manifold learning, metric learning
- Pipeline: preprocessing -> feature design -> feature dimensionality reduction -> classification/regression

Visual Processing in the DL Era
- Task: discrete class labels (classification) or continuous vectors (regression/estimation)
- Instead of manually designing F (or learning parts of F), learn F end-to-end, with every step learned: representation learning, feature learning, nonlinear-transform learning
- (Credit to Dr. Xiaogang Wang)

Visual Processing in the DL Era
- Traditional: collect data -> preprocessing 1 -> preprocessing 2 -> feature design -> feature transform -> feature transform -> classifier -> evaluation
- Deep: collect data -> deep neural network -> evaluation
- (Credit to Dr. Xiaogang Wang)

Visual Processing in the DL Era: a Methodological Shift
- From divide-and-conquer to joint learning: multiple steps become end-to-end learning
- More broadly: detection jointly with recognition; segmentation jointly with recognition; ...
The Perceptron
- Frank Rosenblatt (1957), The Perceptron -- a perceiving and recognizing automaton. Report 85-460-1, Cornell Aeronautical Laboratory.

The Perceptron Algorithm
- F. Rosenblatt. The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65: 386-408, 1958.
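Rosenblatt's learning rule updates the weights only on misclassified samples: add the input on a false negative, subtract it on a false positive. A minimal sketch of my own, trained on logical OR, which is linearly separable; on XOR it would never converge, which is exactly Minsky and Papert's point.

```python
import numpy as np

def train_perceptron(X, t, epochs=20):
    """Rosenblatt's rule: for each misclassified sample,
    w <- w + t * x, with targets t in {-1, +1} and the bias folded
    into the weight vector. Converges iff the data are linearly
    separable."""
    Xb = np.hstack([X, np.ones((len(X), 1))])   # append a bias input
    w = np.zeros(Xb.shape[1])
    for _ in range(epochs):
        for x, target in zip(Xb, t):
            if np.sign(x @ w) != target:        # misclassified?
                w += target * x                 # Rosenblatt update
    return w

# Logical OR is linearly separable, so the perceptron can learn it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([-1, 1, 1, 1])
w = train_perceptron(X, t)
pred = np.sign(np.hstack([X, np.ones((4, 1))]) @ w)
print(pred)   # [-1.  1.  1.  1.]
```

Replacing t with the XOR labels makes the inner loop cycle forever without separating the classes, the failure mode that stalled first-generation networks.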
The BP Learning Algorithm for Feed-Forward Networks
- David E. Rumelhart, Geoffrey E. Hinton, and Ronald J. Williams. (Oct. 1986). Learning representations by back-propagating errors. Nature 323(6088): 533-536.
- (covered in separate slides)

Convolutional Neural Networks and Their Variants
- (covered in separate slides)

More Discussion on DL
- DL brings a shift in mindset
- Is DL a brain-like way of processing information?
- Does DL have a theory?
- What can DL not do?
- Does data-driven learning no longer need domain knowledge?
- Is industry stealing academia's lunch?
- Are CV researchers being reduced to lab assistants for ML researchers?
- Future work on DL?

The Shift in Mindset DL Brings
- From learning driven by hand-coded domain knowledge to learning driven by data
  - Small data: control model complexity to avoid overfitting
  - Big data: increase model complexity to avoid underfitting; "big data + simple model" is a mistake!
  - From the curse of dimensionality (dimensionality reduction) to the benefit of high dimensions (dimensionality expansion)
- From step-by-step, divide-and-conquer thinking to joint learning: end-to-end learning of the whole process
- Better co-design of software and hardware

Is DL a Brain-Like Way of Processing Information?
- DL is inspired by how the brain processes information: hierarchical, level-by-level abstraction; primary visual neurons perform "Gabor-wavelet-like coding"
- But it is not truly "brain-like":
  - Fundamentally, the brain's computational "mechanisms" remain unclear
  - The brain's connectivity is more diverse and more complex: top-down, with feedback mechanisms
  - Learning need not require massive data: what is innate is the product of biological evolution (long-term "training" on big data), while postnatal learning relies more on deductive reasoning and transfer learning

Does DL Have a Theory?
- DL theory is scarce: convergence, bounds; local optima (initialization matters a lot); complexity theory
- But it is not entirely a black box:
  - Its relation to the traditional "step-by-step" approach; it is more "explicit" than kernel methods
  - Layer-wise visualization provides many clues
  - Level-by-level abstraction, or layered "nonlinearity"?

What Can DL Not Do?
- DL is most successful as "feature learning" or "nonlinear transformation": the learned features generalize well, and traditional classifiers or regressors still seem usable on top of them
- It leans heavily on big data; deep learning on small data is unreliable, so domain knowledge and transfer learning of deep models must be brought in
- It struggles with deductive reasoning: DL is inductive learning, poor at generalizing from a few examples and even worse at learning without a teacher
- Some simple problems may not need deep learning (the face recognition example)
- Today's DL does not learn its "own structure"; debugging experience matters a great deal

Does Data-Driven Learning No Longer Need Domain Knowledge?
- Big-data-driven learning does reduce the dependence on domain knowledge
- Yet CNN's success in CV is itself evidence of how much domain knowledge matters: the convolution and pooling operations
- Under small-data conditions, domain knowledge matters all the more
- Data is king, and DL is queen?

Is Industry Stealing Academia's Lunch?
- Does industry look down on academia? Industry prizes big-data collection and parallel implementation; academia prizes theory and new models
- CV academia should be smarter and more forward-looking:
  - New model design; learning network structures; optimization methods; faster training
  - Smarter data collection; efficient use of large, dirty, messy data
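The pooling operation named above is a concrete piece of that baked-in domain knowledge: it encodes tolerance to small translations. A minimal max-pooling sketch of my own (2x2 windows, stride 2; it assumes both input dimensions are even):

```python
import numpy as np

def max_pool2x2(x):
    """2x2 max pooling with stride 2: keep the strongest response in
    each window, so a small spatial shift of a feature inside its
    window leaves the output unchanged -- the translation tolerance
    CNNs build in by design. Assumes even dimensions for x."""
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

fmap = np.array([[0., 1., 0., 0.],
                 [0., 0., 0., 2.],
                 [3., 0., 0., 0.],
                 [0., 0., 0., 4.]])
print(max_pool2x2(fmap))
# [[1. 2.]
#  [3. 4.]]
```

Together with the shared convolution kernel, this is the domain knowledge that a generic fully connected network would have to rediscover from data.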
Are CV Researchers Being Reduced to Lab Assistants for ML Researchers?
- The danger is real: CV lacks a theoretical system of its own, and the "step-by-step" approach dominated CV for too many years
- But CV still has its chances:
  - In fact, ML has benefited greatly from CV
  - A learning-based theory of CV? CV researchers should interact more with ML
  - The learnability of "geometry" and "structure"?

Future Work in DL
- DL theory
- Learning the network structure itself
- DL under small-data conditions
- Embedding domain knowledge
- Deep networks with feedback
- DL on large, dirty, messy data
- Transfer and adaptation of deep models
- DL models for video analysis
- More sources of "nonlinearity"
- New optimization and training algorithms

Summary, and a Warning!
- The lesson of neural networks' rise and fall: this is a revival, not an invention, and history often repeats itself
- Compared with the fervor for DL in application fields such as CV, the ML community stays calm: "Let him be strong -- the breeze still brushes the hilltop; let him be fierce -- the bright moon still shines on the great river."
- Advice: learn DL, but do not know only DL; move from knowledge-driven and data-driven to hybrid-driven

Thank you!