The Early History of CNNs

- Convolutional neural networks (CNNs):
- K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, pp. 193-202, 1980
- Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, "Backpropagation applied to handwritten zip code recognition," Neural Computation, vol. 1, no. 4, pp. 541-551, 1989
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, 1998

CNN Extensions in the Deep Learning Era

- A. Krizhevsky, I. Sutskever, G. E. Hinton. ImageNet classification with deep convolutional neural networks. NIPS 2012
- Y. Jia et al. Caffe: Convolutional Architecture for Fast Feature Embedding. ACM MM 2014
- K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich. Going deeper with convolutions. CVPR 2015 (& arXiv:1409.4842, 2014)

Convolution: Example

[Worked convolution example shown as a figure in the original slides; not recoverable from the text.]

Convolution: Formalization

[The slide's formula was lost in extraction; see the sketch below.]

Why Convolution? (1) Sparse Interactions

- Limited (sparse) connectivity: the kernel is smaller than the input
- Local connections mean far fewer weights, so learning is easier and computation is cheaper
- Fully connecting m input nodes to n output nodes costs O(mn) connections
- Restricting each output to k (k ≪ m) inputs reduces this to O(kn)
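The formalization slide's equation did not survive extraction. As a hedged reconstruction, the standard discrete 2-D convolution that such a slide typically states (the deck elsewhere quotes Goodfellow et al., whose notation this follows) is:

```latex
% Convolution (kernel flipped):
S(i,j) = (I * K)(i,j) = \sum_{m}\sum_{n} I(m,n)\, K(i-m,\, j-n)
% Cross-correlation (what most libraries actually implement):
S(i,j) = (K \star I)(i,j) = \sum_{m}\sum_{n} I(i+m,\, j+n)\, K(m,n)
```

Here I is the input and K the kernel; the two variants share all the properties discussed in these slides.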
- Hierarchical receptive fields (biologically inspired): the higher the layer, the larger a neuron's receptive field

Why Convolution? (2) Parameter Sharing, (3) Equivariance

- 2. Parameter sharing: tied weights further shrink the parameter count dramatically
- 3. Equivariant representations: convolution is equivariant to translation; combined with pooling it yields approximate translation invariance
  - It has no such property for scale or rotation

The Basic Structure of a CNN

- Three steps per stage:
  - Convolution: pre-synaptic activation (net)
  - Detector: nonlinear activation
  - Pooling
- Two conventions for what a "layer" is: the complex view (a whole convolution-detector-pooling stage is one layer) and the simple view (each step is its own layer)
- Some layers have no parameters
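To make the O(mn) versus O(kn) claim above concrete, here is a minimal sketch (layer sizes are illustrative, not from the slides) counting weights under full connectivity, local connectivity, and local connectivity with tied weights:

```python
# Weight counts for connecting m inputs to n outputs (1-D case).
m, n, k = 10_000, 10_000, 25     # inputs, outputs, kernel width (k << m)

dense = m * n                    # full connectivity: O(mn) weights
local = k * n                    # sparse/local connectivity: O(kn) weights
tied  = k                        # local + tied weights: one shared kernel

print(f"dense: {dense:,}")       # 100,000,000
print(f"local: {local:,}")       # 250,000
print(f"tied : {tied:,}")        # 25
```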
Pooling

- Definition (no parameters to learn): a pooling function "replaces the output of the net at a certain location with a summary statistic of the nearby outputs"
- Variants: max pooling, (weighted) average pooling

Why Pooling?

- To obtain invariance: small translation invariance, i.e. it matters that a feature is present, not exactly where it is
- This encodes a strong prior: "the function the layer learns must be invariant to small translations"
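A minimal sketch of max pooling, here combined with a downsampling stride (anticipating the "pooling combined with downsampling" slide below); the data and window size are illustrative:

```python
import numpy as np

def max_pool_1d(x, width=3, stride=2):
    """Summarize each window by its maximum; stride > 1 also downsamples,
    which reduces the number of downstream neurons."""
    return np.array([x[i:i + width].max()
                     for i in range(0, len(x) - width + 1, stride)])

x = np.array([0.1, 1.0, 0.2, 0.1, 0.1, 1.0, 0.3, 0.2])
print(max_pool_1d(x))              # [1.  0.2 1. ]
print(max_pool_1d(np.roll(x, 1)))  # [1. 1. 1.]: after a 1-step shift the
                                   # strong activations are still reported,
                                   # just slightly relocated
```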
Why Pooling? (cont.)

- Rotation invariance? Pool over the responses of 9 kernels (templates) at different orientations: whichever orientation is present, the maximum response is large (a numeric sketch follows after these slides)
- [The two slides illustrate this with per-kernel response values for an upright and a rotated input; the figures themselves are not recoverable from the text.]

Pooling Combined with Downsampling

- Better translation invariance
- Higher computational efficiency (fewer neurons)

From Full Connectivity to Limited Connectivity

- Some connection weights are forced to zero
  - Typically, connections between non-adjacent neurons are dropped and only neighboring neurons stay connected
- A convolutional net is thus a special case of a fully connected network in which most weights are zero
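A numeric sketch of the rotation-invariance slides above, reusing the nine response values that survive in the extracted text (their assignment to particular kernels is my assumption):

```python
import numpy as np

# Hypothetical responses of 9 differently oriented templates to the same
# pattern, once upright and once rotated (values from the slide figures).
upright = np.array([0.2, 0.6, 1.0, 0.1, 0.5, 0.3, 0.02, 0.05, 0.1])
rotated = np.array([0.5, 0.3, 0.02, 1.0, 0.4, 0.3, 0.6, 0.3, 0.1])

# Max pooling across the 9 orientation channels: the pooled unit fires
# whenever SOME orientation matches, approximating rotation invariance.
print(upright.max(), rotated.max())   # 1.0 1.0
```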
Why Convolution & Pooling?

- They amount to "a prior probability distribution over the parameters of a model that encodes our beliefs about what models are reasonable, before we have seen any data"
- A prior over model parameters (cf. no free lunch): before seeing any data, our beliefs and experience tell us which model parameters are reasonable
- Local connections, translation invariance, and tied weights are all inspired by biological nervous systems

Origins: Neocognitron (1980)

- Simple → complex, lower order → higher order
- Local connections
- K. Fukushima, "Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position," Biological Cybernetics, vol. 36, pp. 193-202, 1980

Origins: Neocognitron (1980), Training

- Layer-wise self-organization
  - Competitive learning (unsupervised)
- Output layer trained separately (supervised)

LeCun's CNN (1989) for Character Recognition

- Simplified the Neocognitron architecture
- Training: supervised, with the backpropagation (BP) algorithm
  - tanh converges faster than the sigmoid; loss minimized by SGD
- Applied to zip code recognition, with many follow-up applications

LeCun's CNN (1989): Architecture

- Input: a 16x16 image
- L1 (H1): 12 kernels of 5x5, 8x8 neurons per map
- L2 (H2): 12 kernels of 5x5x8, 4x4 neurons per map
- L3 (H3): 30 neurons
- L4 (output): 10 neurons
- Total connections: 5*5*12*64 + 5*5*8*12*16 + 192*30, about 66,000 (checked below)

LeCun's CNN (1989): Tied Weights

- Within a feature map, the same kernel is applied at every position!
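A quick arithmetic check of the connection count quoted above (the interpretation of each term follows the architecture bullets; the output-layer term is not itemized on the slide):

```python
# Connections in LeCun's 1989 zip-code network, per the slide's formula.
h1  = 5 * 5 * 12 * 64      # 12 maps x 8x8 = 64 neurons, each with a 5x5 window
h2  = 5 * 5 * 8 * 12 * 16  # 12 maps x 4x4 = 16 neurons, 5x5x8 input window
h3  = 192 * 30             # 12 * 16 = 192 H2 neurons fully connected to 30 units
out = 30 * 10              # 30 hidden units to the 10 outputs (my addition)

print(h1, h2, h3)          # 19200 38400 5760
print(h1 + h2 + h3 + out)  # 63660, which the slide rounds to "about 66,000"
```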
LeNet (1998) for Digit/Character Recognition

- LeNet-5
- Feature map: "a set of units whose weights are constrained to be identical"
- Example: layer C3 has (3*6 + 4*9 + 6*1)*25 + 16 = 1516 parameters
  - In LeNet-5's connection table, 6 of C3's 16 maps each read 3 of the S2 maps, 9 read 4, and 1 reads all 6; every connection uses a 5x5 = 25 kernel, plus one bias per map

Follow-up: CNNs for Object Detection and Recognition
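Spelling out that C3 count (the 3/4/6 connection pattern is the standard table from LeCun et al., 1998):

```latex
\underbrace{(3\cdot 6 \;+\; 4\cdot 9 \;+\; 6\cdot 1)}_{\text{S2 inputs summed over C3's 16 maps}}
\times \underbrace{25}_{5\times 5 \text{ kernel}}
\;+\; \underbrace{16}_{\text{biases}}
\;=\; 60\cdot 25 + 16 \;=\; 1516
```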
AlexNet for ImageNet (2012)

- A large-scale CNN: 650K neurons, 60M parameters
- Uses a battery of tricks:
  - Dropout
  - Data augmentation
  - ReLU
  - Local Response Normalization
  - Contrast normalization
  - ...
- Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems, 2012

AlexNet for ImageNet (2012): ReLU

- ReLU activation function, f(x) = max(0, x)

AlexNet for ImageNet (2012): Implementation

- Split across 2 GPU cards
- Neurons per layer: 150,528 (input), then 253,440, 186,624, 64,896, 64,896, 43,264, 4,096, 4,096, 1,000

AlexNet for ImageNet (2012): Results

- ImageNet object classification: 1,000 classes, 1,431,167 images

  Rank  Team         Top-5 error  Description
  1     U. Toronto   0.153        Deep learning
  2     U. Tokyo     0.261        Hand-crafted features and learning models. Bottleneck.
  3     U. Oxford    0.270
  4     Xerox/INRIA  0.271

AlexNet: The Importance of Depth

  Network depth     8     7     6     6     4
  Parameters        60M   44M   10M   59M   10M
  Performance loss  0%    1.1%  5.7%  3.0%  33.5%

VGG Net (2014)

- Multiple stages, with several convolutional layers per stage
  - Convolution stride 1x1, kernel size 3x3
  - One 2x2 pooling layer per stage
- 16-19 layers
- Multi-scale fusion
- K. Simonyan, A. Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014
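A back-of-the-envelope sketch of VGG's stacked-3x3 design choice (the receptive-field and parameter argument is the one given in the VGG paper; the channel count C is illustrative):

```python
# Two stacked 3x3 convolutions cover a 5x5 receptive field (three cover 7x7)
# with fewer weights than one big kernel, plus extra nonlinearities between.
C = 256                                       # in/out channels, illustrative

one_5x5, two_3x3   = 5*5*C*C, 2 * 3*3*C*C
one_7x7, three_3x3 = 7*7*C*C, 3 * 3*3*C*C

print(two_3x3 / one_5x5)    # 0.72  -> ~28% fewer weights than a single 5x5
print(three_3x3 / one_7x7)  # ~0.55 -> ~45% fewer weights than a single 7x7
```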
VGG Net (2014): Configurations

- Several configurations
- conv3-64 denotes a 3x3 receptive field with 64 channels
- [Configuration and results tables are figures in the original slides; not recoverable from the text.]

GoogLeNet (2014)

- A very large network: 22 convolutional layers
- Roughly 4x the computational cost of AlexNet
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich. Going deeper with convolutions. CVPR 2015 (& arXiv:1409.4842, 2014)
GoogLeNet (2014): Inception

- A very large, 24-layer network
- Inception module: extract features at different scales, then concatenate them
  - Naive version: 1x1, 3x3, and 5x5 convolutions applied to the previous layer, feeding a filter concatenation
- Adding 1x1 convolutions shrinks the number of response maps
  - 1x1 "reduction" convolutions before the 3x3 and 5x5 branches, and after a 3x3 max-pooling branch; all four branches feed the filter concatenation (see the sketch below)
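A minimal PyTorch sketch of the dimension-reduced Inception module just described (hedged: the original implementation differs in details such as ReLU placement; the branch widths below are chosen to match the paper's inception(3a) stage):

```python
import torch
import torch.nn as nn

class Inception(nn.Module):
    """Four parallel branches whose outputs are concatenated channel-wise."""
    def __init__(self, c_in, c1, c3r, c3, c5r, c5, cp):
        super().__init__()
        self.b1 = nn.Conv2d(c_in, c1, kernel_size=1)        # 1x1 branch
        self.b3 = nn.Sequential(                            # 1x1 reduce, then 3x3
            nn.Conv2d(c_in, c3r, 1), nn.ReLU(),
            nn.Conv2d(c3r, c3, 3, padding=1))
        self.b5 = nn.Sequential(                            # 1x1 reduce, then 5x5
            nn.Conv2d(c_in, c5r, 1), nn.ReLU(),
            nn.Conv2d(c5r, c5, 5, padding=2))
        self.bp = nn.Sequential(                            # 3x3 max pool, then 1x1
            nn.MaxPool2d(3, stride=1, padding=1),
            nn.Conv2d(c_in, cp, 1))

    def forward(self, x):
        return torch.cat([self.b1(x), self.b3(x), self.b5(x), self.bp(x)], dim=1)

x = torch.randn(1, 192, 28, 28)                  # input size at inception(3a)
y = Inception(192, 64, 96, 128, 16, 32, 32)(x)
print(y.shape)                                   # torch.Size([1, 256, 28, 28])
```

The 1x1 reductions are what keep the module affordable: they cut the channel count before the expensive 3x3 and 5x5 spatial convolutions.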
GoogLeNet (2014): Full Architecture

- [The complete architecture diagram is a figure in the original slides.]

GoogLeNet (2014): Results

- Performance on ImageNet 1,000-class object classification
- [Results table shown as a figure in the original slides.]

Thanks!