Introduction to Current Deep Learning Software Packages

Three popular ones:
1. Caffe: http://caffe.berkeleyvision.org/
2. Theano: https://pypi.python.org/pypi/Theano
3. TensorFlow: https://www.tensorflow.org/

These websites explain how to install and run each package.

1. Caffe

1. Overview: Caffe (Convolutional Architecture for Fast Feature Embedding). Created by Yangqing Jia (贾扬清), UC Berkeley. Written in C++, with Python and MATLAB interfaces.
2. GitHub page: https:/
Install method (CUDA+Caffe): Ouxinyu.github.io/Blogs/2014723001.html

Anatomy of Caffe

Blob: stores data and derivatives.
Layer: transforms bottom blobs to top blobs.
Net: many layers; computes gradients via forward/backward.

Blob

A Blob is a wrapper over the actual data being processed and passed along by Caffe; under the hood it also provides synchronization between the CPU and the GPU. The conventional blob dimensions for batches of image data are (number N) x (channel K) x (height H) x (width W). For a convolution layer with 96 filters of 11 x 11 spatial dimension and 3 inputs, the parameter blob is 96 x 3 x 11 x 11. For an inner product (fully-connected) layer with 1000 output channels and 1024 input channels, the parameter blob is 1000 x 1024.
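The blob shapes quoted above can be checked with a few lines of plain Python. This is only a sketch: `blob_count` and `blob_offset` are hypothetical helper names, not part of the Caffe API, and the 64 x 3 x 32 x 32 data blob is an assumed example batch. The offset formula follows the row-major layout that Caffe documents for 4-D blobs, ((n * K + k) * H + h) * W + w.

```python
from math import prod


def blob_count(shape):
    """Total number of values stored in a blob of the given shape."""
    return prod(shape)


def blob_offset(shape, index):
    """Flat row-major offset of `index` inside a blob of `shape`.

    For a 4-D blob this evaluates ((n * K + k) * H + h) * W + w.
    """
    offset = 0
    for dim, i in zip(shape, index):
        if not 0 <= i < dim:
            raise IndexError(f"index {i} out of range for dimension {dim}")
        offset = offset * dim + i
    return offset


# Parameter blob shapes taken from the text.
conv_params = (96, 3, 11, 11)  # 96 filters, 3 input channels, 11 x 11 kernels
fc_params = (1000, 1024)       # 1000 outputs, 1024 inputs

# Assumed data blob: a batch of 64 RGB images of 32 x 32 pixels.
data_blob = (64, 3, 32, 32)

print(blob_count(conv_params))               # weights in the convolution layer
print(blob_count(fc_params))                 # weights in the inner product layer
print(blob_offset(data_blob, (1, 0, 0, 0)))  # where the second image starts
```

The offset of index (1, 0, 0, 0) equals K x H x W, so each image in the batch occupies one contiguous stretch of memory.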

Layer

The layer is the essence of a model and the fundamental unit of computation. Layers convolve filters, pool, take inner products, apply nonlinearities such as rectified-linear and sigmoid and other element-wise transformations, normalize, load data, and compute losses such as softmax and hinge.

Case: Convolution Layer

Net

The net jointly defines a function and its gradient by composition and auto-differentiation. The composition of every layer's output computes the function that performs a given task, and the composition of every layer's backward pass computes the gradient from the loss to learn the task.

name: "LogReg"
layer {
  name: "mnist"
  type: "Data"
  top: "data"
  top: "label"
  data_param {
    source: "input_leveldb"
    batch_size: 64
  }
}
layer {
  name: "ip"
  type: "InnerProduct"
  bottom: "data"
  top: "ip"
  inner_product_param {
    num_output: 2
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip"
  bottom: "label"
  top: "loss"
}

How to use Caffe? Just 4 steps!

1. Convert data (run a script)
2. Define net (edit prototxt)
3. Define solver (edit prototxt)
4. Train (with pretrained weights) (run a script)

Take CIFAR-10 image classification as an example.

Step 1: Convert Data for Caffe

A DataLayer reading from LMDB is the easiest option. Create the LMDB using convert_imageset.
convert_imageset needs a text file where each line is "path/to/image.jpeg label" (the same file can be read directly with ImageDataLayer).
Alternatively, create an HDF5 file yourself using h5py (read it with HDF5Layer).

Convert Data on CIFAR10

Step 2: Define Net (cifar10_quick_train_test.prototxt)

Fields annotated in the prototxt figure: layer name; blob names; learning rate of the weights; learning rate of the bias; number of input images per iteration; training image data; data type; number of output classes; output accuracy during test; output loss during training.

If you fine-tune a pre-trained model, you can set lr_mul = 0 to keep a layer's parameters fixed.

Visualize the defined network: http://ethereon.github.io/netscope/#/editor

Step 3: Define Solver (cifar10_quick_solver.prototxt)

# reduce the learning rate after 8 epochs (4000 iters) by a factor of 10

# The train/test net protocol buffer definition
net: "examples/cifar10/cifar10_quick_train_test.prototxt"
# test_iter specifies how many forward passes the test should carry out.
# In the case of MNIST, we have test batch size 100 and 100 test iterations,
# covering the full 10,000 testing images.
test_iter: 100
# Carry out testing every 500 training iterations.
test_interval: 500
# The base learning rate, momentum and the weight decay of the network.
base_lr: 0.001
momentum: 0.9
weight_decay: 0.004
# The learning rate policy
lr_policy: "fixed"
# Display every 100 iterations
display: 100
# The maximum number of iterations
max_iter: 4000
# snapshot intermediate results
snapshot: 4000
snapshot_prefix: "examples/cifar10/cifar10_quick"
# solver mode: CPU or GPU
solver_mode: GPU

Annotations in the solver figure: defined net file; key parameters; important parameters.

Step 4: Train

Write a shell file (train_quick.sh) and run it. Then enjoy a cup of caffe.

Model Zoo (Pre-trained Model + Finetune)

We can fine-tune these models or use them for feature extraction.

Some tricks/skills about training Caffe [1]

[1] Neural Networks: Tricks of the Trade

1. Data Augmentation to enlarge training samples
2. Image Pre-Processing
3. Network Initializations
4. During Training
5. Activation Functions
6. Regularizations

For more details, refer to [1, 2].

[2] http:/

Data Augmentation

Very useful for face and car recognition!
Helps cope with occlusion and scale change, as in visual tracking.

Image Pre-Processing

Step 1: subtract the dataset mean value in each channel
Step 2: swap channels from RGB to BGR
Step 3: move image channels to the outermost dimension
Step 4: rescale from [0, 1] to [0, 255]

Network Initializations

During Training

Dropout [1] and Batch Normalization [2] help alleviate overfitting during training in Caffe.
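
The four image pre-processing steps listed under Image Pre-Processing can be sketched in plain Python as follows. This is an illustration under stated assumptions, not Caffe's own code: `preprocess` is a hypothetical helper, the input is assumed to be a height x width x 3 nested list in RGB order with values in [0, 1], and the mean is assumed to be given in the same RGB order and [0, 1] scale so that the steps can be applied in the order shown on the slide. Real pipelines would normally use an array library instead of nested lists.

```python
def preprocess(image_hwc, mean_rgb):
    """Apply the four slide steps to one image.

    image_hwc: H x W x 3 nested lists, RGB order, values in [0, 1] (assumed).
    mean_rgb:  per-channel dataset mean, RGB order, same [0, 1] scale (assumed).
    Returns a 3 x H x W nested list in BGR order, scaled by 255.
    """
    height, width = len(image_hwc), len(image_hwc[0])

    # Step 1: subtract the dataset mean value in each channel.
    centered = [[[px[c] - mean_rgb[c] for c in range(3)] for px in row]
                for row in image_hwc]

    # Step 2: swap channels from RGB to BGR.
    bgr = [[px[::-1] for px in row] for row in centered]

    # Step 3: move the channel axis to the outermost dimension (HWC -> CHW).
    chw = [[[bgr[h][w][c] for w in range(width)] for h in range(height)]
           for c in range(3)]

    # Step 4: rescale from the [0, 1] range to the [0, 255] range.
    return [[[value * 255.0 for value in row] for row in plane]
            for plane in chw]


# Tiny 1 x 2 example image: one red-ish pixel and one blue-ish pixel.
example = [[[1.0, 0.5, 0.0], [0.0, 0.5, 1.0]]]
result = preprocess(example, mean_rgb=(0.5, 0.5, 0.5))
print(result)
```

After the call, `result[0]` holds the blue plane, `result[1]` the green plane, and `result[2]` the red plane, each of size 1 x 2.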

[1] Srivastava, Nitish, et al. Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15.1 (2014): 1929-1958.
[2] S. Ioffe and C. Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167, 2015.

Overfitting

Pros and Cons of Caffe

A practical example of Caffe

1. Object detection: RCNN / Fast-RCNN / Faster-RCNN (Caffe + MATLAB)

Figure annotations: lr = 0.1 x base learning rate; lr = base learning rate.

2. Theano

1. Overview: A Python library that lets you define, optimize, and evaluate mathematical expressions. From Yoshua Bengio's group at the University of Montreal. Embraces computation graphs and symbolic computation. High-level wrappers: Keras, Lasagne.
2. GitHub: https:/

Pros and Cons of Theano

3. TensorFlow

1. Overview: Very similar to Theano; all about computation graphs. Easy visualizations (TensorBoard). Multi-GPU and multi-node training.
2. Tutorial: http://terryum.io/ml_practice/2016/05/28/TFIntroSlides/

Basic Flow of TensorFlow

1. Load data
2. Define the NN structure
3. Set optimization parameters
4. Run!

The Pros and Cons of TensorFlow

Overview

                            Caffe                   Theano          TensorFlow
Language                    C++, Python, MATLAB     Python          Python
Pretrained                  Yes+                    Yes (Lasagne)   Inception
Multi-GPU: data parallel    Yes                     Yes             Yes
Multi-GPU: model parallel   No                      Experimental    Yes (best)
Speed                       Very fast               Quick           Quick
Platform                    All operating systems   Linux, OSX      Linux, OSX
Readable source code        Yes                     No              No
Good at RNN                 No                      Yes             Yes (best)

Feature extraction / fine-tuning existing models: use Caffe.
Complex uses of pretrained models: use Lasagne (Theano).
Crazy RNNs: use Theano or TensorFlow.
Huge model, need model parallelism: use TensorFlow.

Other popular deep learning tools

1. MatConvNet: from VGG: http://www.vlfeat.org/matconvnet/
2. Torch7: https:/