Serengeti: Virtualize Your Big Data Applications (Serengeti-虚拟化你的大数据应用.pptx)

Uploaded by: 可**** | Document ID: 10291717 | Uploaded: 2025-05-17 | Format: PPTX | Pages: 42 | Size: 6.37 MB
Serengeti: Virtualize Your Big Data Applications
蔺永华 (Lin Yonghua), VMware, Inc.

Agenda
- Today's big data system
- Why virtualize Hadoop?
- Serengeti introduction
- Common questions about virtualization
- Serengeti solution
- Deep insight into Serengeti
- Summary
- Q&A

Section: Today's big data system

Today's Big Data System
[Diagram labels: ETL; Unstructured Data (HDFS); Real-Time Structured Database; Big SQL; Data-Parallel Batch Processing; Real-Time Streams; Real-Time Processing (S4, Storm); Analytics]

Section: Why virtualize Hadoop?

Challenges of Using Hadoop on Physical Infrastructure
- Deployment
  - Difficult to deploy: takes several people several days, or even months
  - Difficult to tune cluster performance
- Low efficiency
  - Hadoop clusters are typically not 100% utilized across all hardware resources
  - Difficult to share resources safely between different workloads
- Single point of failure
  - NameNode and JobTracker are single points of failure
  - No HA for Hive, HCatalog, etc.

Why Virtualize Hadoop? - Get your Hadoop cluster in minutes
- Manual process (takes days): server preparation, OS installation, network configuration, Hadoop installation and configuration
- Automated by Serengeti on vSphere with best practices:
  - 1/1000 of the human effort
  - Minimal Hadoop operations knowledge required
  - Fully automated process
  - 10 minutes to get a Hadoop/HBase cluster from scratch

Why Virtualize Hadoop? - Consolidate sprawling clusters
- Cluster sprawl: single-purpose clusters for various business applications lead to cluster sprawl
- Cluster consolidation: clusters share servers with strong isolation
- Single hardware infrastructure, unified operations
- Optimized shared resources = higher utilization
- Elastic resources = faster on-demand access
- 30% CAPEX down
[Diagram labels: Hadoop Dev, Hadoop Prod, HBase, Finance, Portal, Simplify, Hadoop Virtualization Platform]

Why Virtualize Hadoop? - Utilize all your resources to solve the priority problem
- 50%+ of resources sit idle while a high-priority job is burning up its cluster
- Utilize all resources from the pool on demand
- Dynamic elastic scaling on a shared resource pool
- 3X faster to get analytic results

vSphere High Availability (HA) - protection against unplanned downtime
Overview
- Protection against host and VM failures
- Automatic failure detection (host, guest OS)
- Automatic virtual machine restart in minutes, on any available host in the cluster
- OS- and application-independent; does not require complex configuration changes

High Availability for the Hadoop Stack
[Diagram labels: ZooKeeper (Coordination); Management Server; HDFS (Hadoop Distributed File System); HBase (Key-Value Store); MapReduce (Job Scheduling/Execution System); Pig (Data Flow); Hive (SQL); BI Reporting; ETL Tools; RDBMS; JobTracker; NameNode; Hive MetaDB; HCatalog; HCatalog MDB; Server; HA markers on the protected components]

vSphere Fault Tolerance (FT) provides continuous protection
Overview
- Single identical VMs running in lockstep on separate hosts
- Zero-downtime, zero-data-loss failover for all virtual machines in case of hardware failures
- Integrated with VMware HA/DRS
- No complex clustering or specialized hardware required
- Single common mechanism for all applications and operating systems
- Zero downtime for NameNode, JobTracker and other components in Hadoop clusters
[Diagram labels: App/OS virtual machines running on two VMware ESX hosts, one of which has failed]

Section: Serengeti introduction

Serengeti: Easy and rapid deployment and management
- Open source project launched in June 2012
- 0.8 was released in Apr.; 0.9 will be released in Jun.
- Toolkit that leverages virtualization to simplify Hadoop deployment and operations
- Deploy a cluster in 10 minutes, fully automated
- Customize Hadoop and HBase clusters
- Automated cluster operation
- Comes with ecosystem components
- Supports all popular Hadoop distributions

Demo: 10 minutes to a Hadoop cluster with Serengeti

Section: Common questions about virtualization

Common questions about virtualization
- Local disk: Can local disk be used in a virtualized environment?
- Flexibility and scalability: How can resources be scheduled flexibly between clusters and different applications, as mentioned above?
- Data stability: In a virtual environment, how can we distribute data across hosts and racks?
- Data locality: Hadoop schedules compute tasks near the data to reduce network I/O for data reads and writes. Can a virtual environment get the same result?
- Performance: How does Hadoop perform in a virtual environment?

Section: Serengeti solution

Can I use local disk easily?
Serengeti extends the virtual storage architecture to include local disk
- Shared storage: SAN or NAS
  - Easy to provision
  - Automated cluster rebalancing
- Hybrid storage
  - SAN for boot images and other workloads
  - Local disk for Hadoop and HDFS
[Diagram labels: hosts running a mix of Hadoop VMs and other VMs]

How to scale in/scale out flexibly? How to schedule resources flexibly between clusters and different applications?
Evolution of Hadoop on VMs
- Combined storage/compute (current Hadoop, Hadoop in a VM)
  - VM lifecycle determined by the DataNode
  - Limited elasticity
- Separate storage
  - Separate compute from data
  - Remove the elasticity constraint imposed by the DataNode
  - Elastic compute
  - Raise utilization
- Separate compute clusters
  - Separate virtual compute
  - Compute cluster per tenant
  - Stronger VM-grade security and resource isolation

Data/Compute separation
[Diagram labels: NameNode, JobTracker, slave node, Serengeti node, scale out / scale in; each host runs one data VM (D) and several compute VMs (C)]

Serengeti Ballooning Enhancement for Java Applications
[Diagram labels: JVM, guest OS, host]

How to keep data stable? How to access data locally if the data node and compute node are located in different VMs?
Distributed and Data/Compute Associated VM Placement
- DataNode and TaskTracker combined cluster
- Data/compute separated cluster
- Compute-only clusters (compute-only cluster 1, compute-only cluster 2) sharing one HDFS cluster
[Diagram labels: master and worker hosts; NameNode, JobTracker, DataNode and TaskTracker VMs placed across hosts in Rack1 and Rack2]

Hadoop Topology Changes for Virtualization
- Hadoop topology awareness: root (/), data center (D1, D2), rack (R1 to R4), host (H1 to H12)
- Serengeti HVE: root (/), data center (D1, D2), rack (R1 to R4), node group (N1 to N8), host (H1 to H12)

Hadoop Virtualization Extensions for Topology (HVE)
- Hadoop Common: Network Topology Extension
- HDFS: Balancer Policy Extension, Replica Choosing Policy Extension, Replica Placement Policy Extension, Replica Removal Policy Extension
- MapReduce: Task Scheduling Policy Extension
- JIRAs: HADOOP-8468 (umbrella JIRA), HADOOP-8469, HADOOP-8470, HADOOP-8472, HDFS-3495, HDFS-3498, MAPREDUCE-4309, MAPREDUCE-4310

Is there significant performance degradation in a virtualized environment? Is there any performance data?
Virtualized Hadoop Performance: Native versus Virtual Platforms, 32 hosts, 16 disks/host
[Chart not recoverable from the text extraction]

Section: Deep insight into Serengeti

Serengeti architecture diagram
[Diagram labels: UI client (Flex UI); CLI client (Spring Shell); REST API; Serengeti Web Service; Spring Batch steps (update MetaDB step, VM placement calculation, VM provision step, software management step, VHM step); Hibernate/DAO; vPostgres; VC adapter; Ironfan service; Thrift service; progress report; Ironfan; Chef server (REST API, cookbook); RabbitMQ; VM runtime manager; package repository; vCenter; hosts on the virtualization platform running Hadoop nodes with Chef client and HA kit]

Customizing your Hadoop/HBase cluster with Serengeti
- Choice of distros
- Storage configuration: choice of shared storage or local disk
- Resource configuration
- High availability option
- Number of nodes
Example spec (quotes and braces were lost in extraction):
    distro: apache
    groups:
      name: master
      roles: hadoop_namenode, hadoop_jobtracker
      storage: type: SHARED, sizeGB: 20
      instance_type: MEDIUM
      instance_num: 1
      ha: true

      name: worker
      roles: hadoop_datanode, hadoop_tasktracker
      instance_type: SMALL
      instance_num: 5
      ha: false

One command to scale out your cluster with Serengeti
    cluster resize --name <name> --nodegroup worker --instanceNum <n>

Configure/reconfigure Hadoop with ease by Serengeti
- Modify the Hadoop cluster configuration from Serengeti
- Use the "configuration" section of the JSON spec file
- Specify Hadoop attributes in core-site.xml, hdfs-site.xml, mapred-site.xml, hadoop-env.sh, log4j.properties
- Apply the new Hadoop configuration using the edited spec file
Example configuration section:
    configuration:
      hadoop:
        core-site.xml:
          // check for all settings at hadoop.apache.org/common/docs/r1.0.0/core-default.html
        hdfs-site.xml:
          // check for all settings at hadoop.apache.org/common/docs/r1.0.0/hdfs-default.html
        mapred-site.xml:
          // check for all settings at hadoop.apache.org/common/docs/r1.0.0/mapred-default.html
          io.sort.mb: 300
        hadoop-env.sh:
          // HADOOP_HEAPSIZE:
          // HADOOP_NAMENODE_OPTS:
          // HADOOP_DATANODE_OPTS:
Apply it with:
    cluster config --name myHadoop --specFile /home/serengeti/myHadoop.json

Freedom of Choice and Open Source
- Distributions: flexibility to choose from major distributions
    cluster create --name myHadoop --distro apache
- Community projects: support for multiple projects
- Open architecture to welcome industry participation
- Contributing Hadoop Virtualization Extensions (HVE) to the open source community

HDFS2 with NameNode Federation and HA
Deploy a CDH4 Hadoop cluster
- NameNode federation
- NameNode HA
- MapReduce v1
- HBase, Pig, Hive and Hive Server
- CDH4 configurations
- Scale out
- Elasticity
- JobTracker HA/FT
[Diagram labels: NameNode group 1 and NameNode group 2, each with an active and a standby NameNode; a ZooKeeper group (three ZK nodes) coordinating each NameNode group; quorum-based metadata store; DataNodes sending block reports]

Proactive monitoring and tuning with VCOPs
- Proactive monitoring through VCOPs
- Gain comprehensive visibility
- Eliminate manual processes with intelligent automation
- Proactively manage operations

Section: Summary

VMware brings Agility, Efficiency and Elasticity to Big Data
- Elasticity
  - Enable full elasticity through separation of data and compute
  - Scale Hadoop in/out under resource constraints
- Agility
  - Deploy, configure and monitor Hadoop clusters on the fly
  - Dynamically reconfigure Hadoop to meet changing business demands
- Efficiency
  - Consolidate Hadoop to achieve higher utilization
  - Pool resources to allow for increased performance and priority job processing

Serengeti Resources
- Download and try Serengeti: projectserengeti.org
- VMware Hadoop site
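The cluster spec from the "Customizing your Hadoop/HBase cluster with Serengeti" slide, together with the io.sort.mb setting from the "configuration" slide, could be assembled programmatically along the following lines. This is only a sketch: the field names (groups, instance_type, instance_num, ha) are copied from the slide text and may not match the spec format a given Serengeti release accepts, and the helper functions node_group, build_spec and total_nodes are hypothetical names introduced for illustration.

```python
import json


def node_group(name, roles, instance_type, instance_num, ha, storage=None):
    """Build one node-group entry. Field names follow the slide text (assumption)."""
    group = {
        "name": name,
        "roles": roles,
        "instance_type": instance_type,
        "instance_num": instance_num,
        "ha": ha,
    }
    if storage is not None:
        group["storage"] = storage
    return group


def build_spec():
    """Assemble the master/worker layout and the io.sort.mb override from the slides."""
    return {
        "distro": "apache",
        "groups": [
            node_group(
                "master",
                ["hadoop_namenode", "hadoop_jobtracker"],
                "MEDIUM",
                1,
                True,
                storage={"type": "SHARED", "sizeGB": 20},
            ),
            node_group(
                "worker",
                ["hadoop_datanode", "hadoop_tasktracker"],
                "SMALL",
                5,
                False,
            ),
        ],
        "configuration": {
            "hadoop": {"mapred-site.xml": {"io.sort.mb": 300}},
        },
    }


def total_nodes(spec):
    """Count the VMs the spec asks for across all node groups."""
    return sum(group["instance_num"] for group in spec["groups"])


if __name__ == "__main__":
    spec = build_spec()
    print(json.dumps(spec, indent=2))
    print("total nodes:", total_nodes(spec))
```

The printed JSON could be saved to a file such as the /home/serengeti/myHadoop.json path used on the slide and passed to the cluster config command shown there. Building the spec in code rather than by hand makes it easy to check simple invariants, such as the total node count, before submitting it.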