Computer Vision (Machine Vision)
  Image by
  4/17/2025

Today's Talk
  What is Computer Vision?
  Why Study Computer Vision?
  How Vision is Used Now?
  Overview of Computer Vision Algorithms
  Challenges of Computer Vision
  Questions

What is computer vision?
  Terminator 2
  Terminator 5
  Every picture tells a story
  Goal of computer vision is to write computer programs that can interpret images
  Can computers match (or beat) human vision?

What is Computer Vision?
  Automatic understanding of images and video
  1. Computing properties of the 3D world from visual data (measurement)
  2. Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)
  3. Algorithms to mine, search, and interact with visual data (search and organization)

1. Vision for measurement
  Real-time stereo: NASA Mars Rover
  Structure from motion: Pollefeys et al.
  Multi-view stereo for community photo collections: Goesele et al.
  Slide credit: L. Lazebnik

2. Vision for perception, interpretation
  Categories: Objects, Activities, Scenes, Locations, Text/writing, Faces, Gestures, Motions, Emotions
  Example image labels: sky, water, Ferris wheel, amusement park, Cedar Point, 12 E, tree, carousel, deck, people waiting in line, ride, umbrellas, pedestrians, maxair, bench, Lake Erie, people sitting on ride, The Wicked Twister

3. Vision for search and organization

Components of a computer vision system
  Lighting, Scene, Camera, Computer, Scene Interpretation
  Srinivasa Narasimhan's slide

Computer vision vs human vision
  What we see
  What a computer sees

Vision is really hard
  Vision is an amazing feat of natural intelligence
  Visual cortex occupies about 50% of brain
  More human brain devoted to vision than anything else
  Is that a queen or a bishop?

Vision is multidisciplinary
  From wiki
  Computer Graphics, HCI

Why computer vision matters
  Safety, Health, Security, Comfort, Access, Fun

A little story about Computer Vision
  In 1966, Marvin Minsky at MIT asked his undergraduate student Gerald Jay Sussman to “spend the summer linking a camera to a computer and getting the computer to describe what it saw”. We now know that the problem is slightly more difficult than that. (Szeliski 2009, Computer Vision)

Ridiculously brief history of computer vision
  1966: Minsky assigns computer vision as an undergraduate summer project
  1960s: interpretation of synthetic worlds
  1970s: some progress on interpreting selected images
  1980s: ANNs come and go; shift toward geometry and increased mathematical rigor
  1990s: face recognition; statistical analysis in vogue
  2000s: broader recognition; large annotated datasets available; video processing starts
  2030s: robot uprising?
  Image credits: Guzman '68; Ohta Kanade '78; Turk and Pentland '91

Why study computer vision?
  Millions of images being captured all the time
  Lots of useful applications
  The next slides show the current state of the art
  Source: S. Lazebnik

Flickr
  Chart scale: 1 billion to 6 billion

Other photo sharing sites
  Chart scale: 10 billion to 50 billion

and growing
  Flickr: 1.7 million photos/day
  Facebook: 100 million photos/day
  YouTube: 35 hours of video every minute
  57 billion photos will be taken (US) in 2010 (compare with 17 billion negatives exposed in 1996)
  (as of November 2010)
  (as of February 2010)

How vision is used now
  Examples of state-of-the-art

1. Optical character recognition (OCR)
  Technology to convert scanned docs to text
  If you have a scanner, it probably came with OCR software
  Digit recognition, AT&T labs
  License plate readers: en.wikipedia.org/wiki/Automatic_number_plate_recognition

2. Face detection
  Many new digital cameras now detect faces: Canon, Sony, Fuji

3. Smile detection
  Sony Cyber-shot T70 Digital Still Camera

4. 3D from thousands of images
  Building Rome in a Day: Agarwal et al. 2009
  The old city of Dubrovnik: 4,619 images; 3,485,717 points

5. Object recognition (in supermarkets)
  LaneHawk by EvolutionRobotics
  “A smart camera is flush-mounted in the checkout lane, continuously watching for items. When an item is detected and recognized, the cashier verifies the quantity of items that were found under the basket, and continues to close the transaction. The item can remain under the basket, and with LaneHawk, you are assured to get paid for it”

6. Vision-based biometrics
  “How the Afghan Girl was Identified by Her Iris Patterns”, National Geographic

7. Forensics
  Source: Nayar and Nishino, “Eyes for Relighting”

8. Login without a password
  Fingerprint scanners on many new laptops, other devices
  Face recognition systems now beginning to appear more widely

9. Object recognition (in mobile phones)
  Point & Find, Nokia
  Google Goggles

10. Vision in space
  Vision systems (JPL) used for several tasks:
    Panorama stitching
    3D terrain modeling
    Obstacle detection, position tracking
  For more, read “Computer Vision on Mars” by Matthies et al.
  NASA's Mars Exploration Rover Spirit captured this westward view from atop a low plateau where Spirit spent the closing months of 2007.

11. Industrial robots
  Vision-guided robots position nut runners on wheels

12. Mobile robots
  www.robocup.org/
  NASA's Mars Spirit Rover: en.wikipedia.org/wiki/Spirit_rover
  Saxena et al. 2008, STAIR at Stanford

13. Medical imaging
  Image guided surgery: Grimson et al., MIT
  3D imaging: MRI, CT

14. Digital cosmetics

15. Inpainting
  Bertalmio et al. SIGGRAPH 00

16. Deblurring
  Fergus et al. SIGGRAPH 06

17. Sports
  Sportvision first down line
  Nice explanation on

18. Smart cars
  Mobileye
  Vision systems currently in high-end BMW, GM, Volvo models
  By 2010: 70% of car manufacturers.

19. Google cars
  Oct 9, 2010. “Google Cars Drive Themselves, in Traffic”. The New York Times. John Markoff
  June 24, 2011. “Nevada state law paves the way for driverless cars”. Financial Post. Christine Dobby
  Aug 9, 2011. “Human error blamed after Google's driverless car sparks five-vehicle crash”. The Star (Toronto)

20. Interactive Games: Kinect
  Object Recognition:

21. Special effects: shape capture
  Matrix movies, ESC Entertainment, XYZRGB, NRC

22. Special effects: motion capture
  Pirates of the Caribbean, Industrial Light and Magic

Computer Vision and Nearby Fields
  Computer Graphics: Models to Images
  Comp. Photography: Images to Images
  Computer Vision: Images to Models

Overview of Computer Vision Algorithms

So what do humans care about?
  Verification: is that a bus? (slide by Fei-Fei, Fergus & Torralba)
  Detection: are there cars? (slide by Fei-Fei, Fergus & Torralba)
  Identification: is that a picture of Mao? (slide by Fei-Fei, Fergus & Torralba)
  Object categorization: sky, building, flag, wall, banner, bus, cars, face, street lamp (slide by Fei-Fei, Fergus & Torralba)
  Scene and context categorization: outdoor, city, traffic (slide by Fei-Fei, Fergus & Torralba)
  Rough 3D layout, depth ordering

Overview of Computer Vision Algorithms
  Image formation
  Features
  Grouping & fitting
  Multi-view geometry
  Recognition & learning
  Motion & tracking

1. Image formation
  How does light in the 3D world project to form 2D images?

2. Features and filters
  Transforming and describing images; textures, colors, edges

3. Grouping & fitting
  Clustering, segmentation, fitting; what parts belong together?
  (fig. from Shi et al.)

4. Multiple views
  Multi-view geometry, matching, invariant features, stereo vision
  (Hartley and Zisserman; Fei-Fei Li)

5. Recognition and learning
  Recognizing objects and categories, learning techniques

6. Motion and tracking
  Tracking objects, video analysis, low level motion, optical flow

Challenge 1: viewpoint variation
  Michelangelo 1475-1564

Challenge 2: illumination
  slide credit: S. Ullman

Challenge 3: occlusion
  Magritte, 1957

Challenge 4: scale
  slide by Fei-Fei, Fergus & Torralba

Challenge 5: deformation
  Xu, Beihong 1943

Challenge 6: background clutter
  Klimt, 1913

Challenge 7: object intra-class variation
  slide by Fei-Fei, Fergus & Torralba

Challenge 8: local ambiguity
  slide by Fei-Fei, Fergus & Torralba

Challenge 9: the world behind the image

Challenge 10: complexity
  Thousands to millions of pixels in an image
  3,000-30,000 human recognizable object categories
  30+ degrees of freedom in the pose of articulated objects (humans)
  Billions of images indexed by Google Image Search
  18 billion+ prints produced from digital camera images in 2004
  295.5 million camera phones sold in 2005

Keep Moving
  OK, clearly the vision problem is deep and challenging. Time to give up?
  Active research area with exciting progress!
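
The image formation question in the algorithm overview (how light in the 3D world projects to form a 2D image) can be illustrated with the ideal pinhole camera model, where a point (X, Y, Z) in camera coordinates maps to pixel (f*X/Z + cx, f*Y/Z + cy). The sketch below is not taken from the slides: the function name project_point, the focal length of 800 pixels, and the principal point (320, 240) are illustrative assumptions, and lens distortion is ignored.

```python
def project_point(point_3d, focal_length, principal_point=(0.0, 0.0)):
    """Project a 3D point in camera coordinates onto the image plane.

    Assumes an ideal pinhole camera looking along +Z, with the focal
    length given in pixels (hypothetical helper, for illustration only).
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    cx, cy = principal_point
    # Perspective division: image coordinates shrink as depth z grows.
    u = focal_length * x / z + cx
    v = focal_length * y / z + cy
    return (u, v)


# The same lateral offset seen at two depths (assumed example values).
near = project_point((1.0, 0.5, 2.0), focal_length=800.0,
                     principal_point=(320.0, 240.0))
far = project_point((1.0, 0.5, 4.0), focal_length=800.0,
                    principal_point=(320.0, 240.0))
print(near)  # (720.0, 440.0)
print(far)   # (520.0, 340.0)
```

Doubling the depth halves the point's offset from the principal point. This is why distant objects look smaller, and it is one source of the scale challenge listed among the challenges.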