Big Data Analytics Storage Solutions: Lecture Slides (大数据分析存储解决方案讲义.ppt)

Uploaded by: 精***   Document ID: 12109822   Uploaded: 2025-09-14   Format: PPT   Pages: 40   Size: 10 MB
Big Data Analytics Storage Solutions

The need for a new style of infrastructure
- Run business-critical applications in a reliable and secure environment.
- Access and process massive volumes of data, both structured and unstructured.
- Respond quickly to business opportunities that can appear at any time, which requires a flexible, real-time infrastructure.
- The dynamics of SoR and SoE: System of Record (SoR) and Systems of Engagement (SoE).
- Increase flexibility and efficiency by optimizing how workloads and resources are deployed.
- Improve IT economics by adopting new technologies, including those based on open standards.
- Big Data & Analytics: the right decision, in the right place, at the right time.

A new infrastructure solution for big data analytics
- IBM Big Data & Analytics Infrastructure: Data Zone, Application Zone.

Case study: Smart Metering (smart electricity billing)
Big data analytics applications can deliver real business value.
- Business areas: Grid Operations, Field Service, Resource Planning, Customer Service/Customer Operations, Regulatory Compliance.
- Achieve genuinely effective regulatory compliance.
- Detect energy losses, electricity theft, and fraud promptly.
- Improve customer satisfaction.
- Forecast electricity usage more accurately.
- Optimize grid operations and maintenance, reducing the number and duration of outages.

Case study: strengthening Smart Metering with big data analytics
- Highly available analytics, so that customer preferences are known at all times.
- Terabyte-scale data requirements across applications.
- A common virtualized storage platform.
- Collect, store, and analyze data in real time, at up to 50,000 data points/sec.
- Process complex queries over historical usage data.
- Cleanse and validate data before it is loaded into the data warehouse; the data may come from many customers, billing systems, or outage-protection systems.
- Relationship management: build and maintain a single view of the grid.
- Navigate structured and unstructured data across the whole enterprise and discover value in it.
- Analyze customer usage; detect electricity theft and meter tampering.
- Predict which customers suit which time-of-use tariffs or demand/response services.
- Price time-of-use tariffs in real time, or provide timely demand/response services.

IBM Big Data & Analytics Reference Architecture
- Big Data Platform Capabilities: Information Ingest, Real-time Analytics, Warehouse & Data Marts, Analytic Appliances.
- All Data Sources: Streaming Data, Text Data, Applications Data, Time Series, Geo Spatial, Relational, Social Network, Video & Image.
- Advanced Analytics/New Insights:
  Cognitive: Learn Dynamically?
  Prescriptive: Best Outcomes?
  Predictive: What Could Happen?
  Descriptive: What Has Happened?
  Exploration and Discovery: What Do You Have?
- New/Enhanced Applications: Automated Process, Case Management, Analytic Applications, Watson, Cloud Services, ISV Solutions, Alerts.

New Infrastructure Leverages Data Types
[Diagram labels] Data in Motion, Data at Rest, Data in Many Forms; Information Ingestion and Operational Information; Decision Management; BI and Predictive Analytics; Navigation and Discovery; Intelligence Analysis; Raw Data, Structured Data, Video/Audio, Network/Sensor; Text Analytics, Data Mining, Entity Analytics, Machine Learning; Landing Area, Analytics Zone and Archive; Real-time Analytics; Exploration, Integrated Warehouse, and Mart Zones; Discovery, Deep Reflection, Operational, Predictive; Stream Processing, Data Integration, Master Data; Information Governance, Security and Business Continuity; BigInsights, Streams, Warehouse.

IBM Big Data Platform
- Hadoop (Data-At-Rest; Variety & Volume): InfoSphere BigInsights, Hadoop-based low-latency analytics for varied, high-volume data at rest.
- MPP Data Warehouse:
  Netezza High Capacity Appliance: queryable archive for structured data.
  Netezza 1000: BI plus custom analytics on structured data.
  Smart Analytics System: operational analytics on structured data.
  Informix Timeseries: time-structured analytics.
  InfoSphere Warehouse: large-volume analytics on structured data.
- Stream Computing (Data-In-Motion; Velocity): InfoSphere Streams, low-latency analytics on streaming data.
- Information Integration: InfoSphere Information Server, high-volume data integration and transformation.
- Apache Hadoop: an open framework for distributed processing of large data sets across clusters of servers, using a simple programming model.

What is Hadoop?
- What: open-source software that distributes computation over data across a cluster of commodity servers and storage.
- Why: traditional computing architectures scale up. They feed data to the CPU through faster SANs, large memory, and multi-level caches, which is comparatively expensive.
- What: Hadoop splits a large data set into small ones and distributes them to many ordinary servers. It is a scale-out model.
- Why: Scalable, Flexible, Cost Effective, Fault Tolerant.
- Components: MapReduce, HDFS.

Hadoop explained: IBM Value for Hadoop!
[Diagram labels] HDFS Cluster: NameNode (Metadata store), Nodes, Operating System. Elastic Storage-SNC Cluster: Nodes, Kernel Level.
- HDFS spreads data across multiple storage nodes.
- HDFS was designed on the assumption that storage nodes can fail, so it keeps several copies of each piece of data (three by default) on different nodes, which makes the system as a whole reliable.
- An HDFS file system is made up of a cluster of server nodes; each server serves block data over the network using an HDFS-specific block protocol.
- The HDFS NameNode is a potential single point of failure.
- IBM improves file system performance and at the same time eliminates the single point of failure: Elastic Storage-SNC (available as beta code).

Hadoop Stack
What does it look like? [Diagram only]

Pain points of typical Hadoop storage
- It is hard to choose the right HDFS components (software, servers, network, storage, and so on).
- Moving from a test environment to production requires too much tuning and adjustment.
- Long-term operations and maintenance are too heavy. Failed components (disks especially) need constant replacement, which makes the expected SLA very hard to guarantee.
- CPU and storage are not decoupled: even when CPU and memory already meet the compute requirement, adding disks for capacity forces the purchase of more, unnecessary CPU and memory.
- Storage options available have clear gaps.
- Local storage utilization is low (25%). Every capacity expansion means adding servers, failed disks must be rebuilt, and the more servers there are, the higher the chance of failure and the worse the performance.

IBM Storage for Hadoop
- Traditional Hadoop clusters use the servers' internal disks. That is acceptable for testing or scientific research, but storage for production business workloads should be enterprise storage.
- The Hadoop cluster itself is responsible for data protection and replication.
- Rebuilding (that is, copying) failed data sets to other nodes seriously affects CPU performance.
- Enterprise-grade RAS cannot be achieved.
- Replicating data has the same problem.
- Scaling adds processors, network, and storage together, so resources cannot be used to the full. The three cannot be separated even when one of them has excess capacity (for example, needing more storage still means adding compute and network).
- Using external storage separates the storage load from the Hadoop compute nodes and also brings the benefits of enterprise storage.
- Sell the value of XIV, V7000, SVC, etc.
- Customers usually deploy with the Hadoop File System; adopting Elastic Storage brings many benefits.

IBM storage benefits for analytics
Data acceleration
- Experience the instant results that come from IBM FlashSystem.
- Drive as much as 45X faster analytics results on certain workloads.
Workload diversity and flexibility
- XIV delivers predictable performance that scales linearly without hotspots, delivering insights from analytics faster with tuning-free data distribution.
- Scale-out parallel processing of Elastic Storage software and integration with FlashSystem dramatically accelerates performance of analytics clusters.
- Virtual Storage Center with SVC automatically optimizes data warehouse performance and cost across Flash and Disk.
Mainframe Data Environments
- Integration with DB2 & specialty analytics "engines" leveraging DS8870 delivers 4x reduction in batch times with new High Performance Flash Enclosures.
- High speed encryption on every drive type secures data.
Data protection and retention
- LTFS EE w/tape provides reduced TCO by up to 90% over disk for long term retention of data at rest with a large open format tape repository.
- Reduce the amount of data to be stored by up to 25 times with ProtecTIER de-duplication.
Results cited
- 12x faster: IBM FlashSystem increased SPLUNK & SAS application efficiency to perform business analytics.
- 20x improvement in actionable supply chain analytics.
- 4x reduction in batch times; virtualization for plug & play.
- 6x time saving: "GPFS allows us to move the metadata from the disk to the FlashSystem online. Once we did that, the backups were reduced down to about an hour."
- 2 hrs becomes 2 minutes: failover time cut sharply.

Mapping Characteristics to IBM Storage Products
- Storage infrastructure requirement, applicable to all 5 use cases: Optimized Multi-Temperature Warehouse.
- All Flash: FlashSystem.
- Hybrid: DS8000 EasyTier; XIV + SSD Caching; Storwize EasyTier; FlashSystem Solution (VSC + FlashSystem).
- PureSystems: PureFlex (XIV or Storwize w/EasyTier); PureData for Transactions (Storwize); PureData for Analytics (Netezza).

IBM Smarter Storage is designed to support big data analytics
An efficient, optimized data infrastructure.
[Diagram labels] Midrange & Entry; Enterprise Offerings; Tier 0 Acceleration; Smarter Storage; Integrated Systems; Open & Extensible; XIV; DS8000; Storwize family; FlashSystem family; zEnterprise Solutions for Analytics with DS8000; PureData System for Operational Analytics with Storwize; PureFlex System with Storwize; Smart Analytics Systems with DS3xxx.

IBM FlashSystem: designed for big data analytics applications
- Brings extreme speed to applications and data.
- The extreme performance of IBM FlashSystem makes real-time business decisions possible.
- Suits Hadoop systems with a modular data storage structure: some or all of the data can be kept on flash and the rest on XIV.

IBM XIV: Optimized data workload diversity for Big Data & Analytics
- High performance: no manual configuration required, and suitable for a wide variety of storage workloads.
- Efficiency: exceptionally efficient, with industry-leading simplicity and a friendly built-in interface.
- Resiliency: enterprise grade, ensuring data availability and business continuity.

XIV: built for analytics
- Unmatched performance: scalable grid storage architecture; supports any read/write workload at any time; on-board flash.
- Unmatched reliability: sophisticated data distribution; outstanding disk rebuild times; enterprise-grade availability.
- Unmatched simplicity: simple planning, provisioning, and flexibility; zero maintenance and zero tuning once in production.
- "What attracted us most to XIV was its outstanding performance. Because XIV gives our sophisticated analytics applications consistently high performance, we can deliver more value to our users."

SAS and the XIV grid architecture: a perfect match
- Massively parallel computing that sustains optimal performance.
- Balanced Performance: no tuning required year after year.
- Unprecedented Scalability: just add SAS nodes and XIV modules together.

IBM SVC: Optimized data workload flexibility for Big Data & Analytics
IBM SVC (and Storwize V7000/U) adds flexibility to the IBM big data product line through:
- Full data virtualization and data mobility.
- Advanced clustering and replication.
- Multi-way mirroring with a read-preferred option.
- Real Time Compression.
- Easy Tier Hot Extent caching.

Design principles: Real-Time Compression (Storwize V7000/U, IBM SVC)
Real-time compression is designed to:
- Work on active primary data.
- Run on a dedicated compression platform: the platform handles ALL heavy lifting associated with compression.
- Leave performance unaffected: a compressed file is modified in place efficiently.
- Leave user applications unchanged: neither users nor admins need to change anything.
- Leave workflows unchanged: compression happens inline, not as a post-process.
- Use industry-standard compression: the algorithms used have been in service for decades.

Stream computing & IBM Flash Systems

Data: just own it and store it? Or analyze it and act on it!
[Diagram labels] Data in, Data at

InfoSphere Streams: big data stream analytics
- Built for analyzing data in motion.
- Multiple concurrent input streams.
- Massive scalability.
- Analyzes and processes diverse data: structured, unstructured, video, audio.
- Advanced analytic operators.
- Adaptive real-time analytics: with Data Warehouses, with Hadoop Systems.

Stream computing represents a shift in the computing model
Traditional Computing: historical fact finding
- Finds and analyzes information stored on disk.
- Batch mode, pull model.
- Query-driven: submits queries to static data.
Stream Computing: current fact finding
- Analyzes data in motion, before it lands on disk.
- Low-latency mode, push model.
- Data-driven: true real-time analytics.

Real Time Analytics
- Imagine trying to drink from a fire hydrant.
- Large volumes of data from many varied input sources.
- Process and filter the data directly, without having to store it.
- Keep only the valuable data.
- Route data only to the users most interested in it.
- Act as the data is generated.

Adaptive Analytics: integrating Data in Motion and Data at Rest
[Diagram labels, in source order] 1. Data Ingest; data integration, data mining, machine learning, statistical modeling; visualization of real-time and historical insight; 3. Adaptive Analytics Model; data ingest, preparation for online analysis, model validation; Data; 2. Bootstrap/Enrich; Control flow; InfoSphere BigInsights; Database & Warehouse; InfoSphere Streams.

Adaptive Real-Time Analytics
- Large volumes of data from many varied input sources.
- A comprehensive view of past, present, and future.
- Real-time analytics: low-latency results.
- Full context for deep analysis: common analytics across data in motion and data at rest.
- Adaptive: changes as circumstances change.
- Adapts when unexpected behavior is detected.
- Analyzes in depth when new meaning is recognized in the data. Meaning that was not noticed at first may only be recognized later.
- Patterns that were not noticed at first can be found later.

How is InfoSphere Streams used?
- Stock market: impact of weather on securities prices; analyze market data at ultra-low latencies; momentum calculator.
- Fraud prevention: detecting multi-party fraud; real time fraud prevention.
- e-Science: space weather prediction; detection of transient events; synchrotron atomic research; genomic research.
- Transportation: intelligent traffic management; automotive telematics.
- Energy & Utilities: transactive control; Phasor Monitoring Unit; down hole sensor monitoring.
- Natural Systems: wildfire management; water management.
- Law Enforcement, Defense & Cyber Security: real-time multimodal surveillance; situational awareness; cyber security detection.
- Health & Life Sciences: ICU monitoring; epidemic early warning system; remote healthcare monitoring.
- Telephony: CDR processing; social analysis; churn prediction; geomapping.
- Other: manufacturing; text analysis; ERP for commodities.

Speeding up the flow of data into analytics systems
- Accelerate toward the transaction.
- An efficient, flexible infrastructure speeds up the flow and balances the needs of different kinds of analytics.
[Diagram labels] Four building blocks, each of Cores, SCM, Storage, Network. Workloads: predictive analytics, data warehouse, text analytics, Hadoop workloads, optimization, sensitivity analysis. Chart of value against time: "trigger event", data complete, insight, transaction; time to acquire data, time to analyze data, time to act; faster flow.

A new infrastructure solution for big data analytics
- IBM Big Data & Analytics Infrastructure: Data Zone, Application Zone.
- Parallel processing, scalability, low latency; optimization of resources and data.

IBM FlashSystem Core Launch Messaging
- Drive a complete paradigm shift in Enterprise Storage with the all new IBM FlashSystem Family.
- Experience real-time analytical insights with up to 50x better performance than enterprise disk systems using IBM FlashCore technology.
- Preserve and protect infrastructure continuity while scaling to over 2 petabytes of effective all-flash capacity under a single integrated interface.
- Deliver agility and data economics with 4x greater capacity in less rack space than competitive all-flash products.
- Synchronized and complementary to overarching storage messaging: accelerate time to insights through data without borders. IBM innovation frees data with agile and simple to use storage solutions delivering superior data economics.

IBM FlashSystem Family 2015 Theme
- Time to insight. Time to value. Time to market.
- IBM FlashSystem, it's about time. Flash Realized!

IBM FlashSystem V9000 Foundational Pillars
- IBM FlashCore Technology is the DNA of the FlashSystem Family.

Introducing the New IBM FlashSystem Family Offerings
IBM FlashSystem 900 (powered by IBM FlashCore Technology)
- Extreme Performance: delivers 100 microsecond response times.
- Macro Efficiency: lowest latency offering with 40% greater capacity at a lower cost per capacity.
- Enterprise Reliability: IBM enhanced Micron MLC flash technology with Flash Wear Guarantee.
IBM FlashSystem V9000
- Scalable Performance: grow capacity and performance with up to 2.2 PB scaling capability.
- Enduring Economics: next generation flash media with lower cost per capacity.
- Agile Integration: fully integrated system management to simplify management and improve workforce productivity under a single name space.

FlashSystem 900
Introducing IBM FlashSystem 900, the next generation in our lowest latency offering.
- IBM MicroLatency with up to 1.1 million IOPS.
- 40% greater capacity at a 10% lower cost per capacity.
- IBM FlashCore technology, our secret sauce.
- Technical collaboration with Micron Technology, our flash chip supplier.
- IBM enhanced flash technology: MLC NAND flash offering with Flash Wear Guarantee.
- VAAI UNMAP and VASA support with IBMSIS for improved cloud storage performance and efficiency.

Performance at-a-glance
  Minimum latency             Write                          90 µs
                              Read                           155 µs
  Maximum IOPS (4 KB)         Read (100%, random)            1,100,000
                              Read/write (70%/30%, random)   800,000
                              Write (100%, random)           600,000
  Maximum bandwidth (256 KB)  Read (100%, sequential)        10 GB/s
                              Write (100%, sequential)       4.5 GB/s

Capacity by IBM MicroLatency module type
  Module type   Modules   RAID 5 capacity (TB)   Raw capacity (TB)
  1.2 TB        4         2.4                    7.1
  1.2 TB        6         4.8                    10.7
  1.2 TB        8         7.2                    14.2
  1.2 TB        10        9.6                    17.8
  1.2 TB        12        12                     21.4
  2.9 TB        6         11.6                   26.3
  2.9 TB        8         17.4                   35.1
  2.9 TB        10        23.2                   43.9
  2.9 TB        12        29.0                   52.7
  5.7 TB        6         22.8                   52.7
  5.7 TB        8         34.2                   70.3
  5.7 TB        10        45.6                   87.9
  5.7 TB        12        57.0                   105.5

FlashSystem V9000
IBM introduces a fully integrated, fully managed, full function all-flash storage system.
- Scalable all-flash architecture with full set of advanced data features.
- Performs at up to 2.5M IOPS with IBM MicroLatency, scalable to 19.2 GB/s.
- Scales to 456 TB usable and up to 2.28 PB effective capacity in only 34U.
- Up to 57 TB usable and up to 285 TB effective capacity in only 6U.
- New licensing structure to simplify ordering and planning for External Data Virtualization, Flash Copy, Metro Mirror, and Real-time Compression.
- Scalable Performance, Agile Integration, Enduring Economics. Powered by FlashCore Technology.

Thank You!
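
Worked example: block replication in HDFS

The Hadoop slides describe HDFS keeping several copies of each piece of data on different nodes so that a node failure does not lose data. The following is a minimal Python sketch of that idea, not HDFS code. It assumes a simple round-robin placement (real HDFS placement is rack-aware), and the function names and node names are hypothetical, chosen only for illustration.

```python
def place_blocks(num_blocks, nodes, replicas=3):
    """Assign each block to `replicas` distinct nodes in round-robin order.

    Simplifying assumption: real HDFS placement is rack-aware; this is not.
    """
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + k) % len(nodes)] for k in range(replicas)]
        for block in range(num_blocks)
    }


def surviving_blocks(placement, failed_nodes):
    """Return the set of blocks that still have at least one live replica."""
    failed = set(failed_nodes)
    return {
        block
        for block, holders in placement.items()
        if any(node not in failed for node in holders)
    }


def usable_fraction(replicas=3):
    """Fraction of raw disk space left for unique data under plain replication."""
    return 1 / replicas


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical names
    placement = place_blocks(num_blocks=10, nodes=nodes)
    # Two failed nodes: every block still has a live copy.
    print(sorted(surviving_blocks(placement, ["node1", "node2"])))
    # Three failed nodes that together hold all copies of a block: that block is lost.
    print(sorted(surviving_blocks(placement, ["node1", "node2", "node3"])))
    print(round(usable_fraction(3), 3))
```

With three replicas, any two node failures leave every block readable, while three failures can lose the blocks whose copies all sat on those nodes. The same replication limits unique data to about one third of raw disk capacity, which is one reason the slides argue for moving protection into external enterprise storage.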