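Reporting Platform (BI) Architecture Design
Focus: reporting / statistical analysis / business intelligence
邓英达, Microsoft Corporation
020-38771111-2217, 13609702242, yindeng

Contents
  Part 1: Understanding the architecture and what it is for
  Part 2: Designing the architecture
    1. Overall structure
    2. Data modeling
    3. Functional modules

Difficulties of traditional reporting technology
  - Platform-based architecture: functional reports
  - Performance: support for both static and dynamic reports
  - Impact on the business (source) systems
  - Business-oriented flexibility
  - Ease of use of the front-end presentation
  - Development and maintenance cost

Platform components for implementing a reporting (business intelligence) system
  Diagram labels: Data Sources, Data Input, Staging Area, Data Warehouse, Data Marts, Data Access, front-end report users

Common implementation patterns for business intelligence projects
  - Source DB + front-end tools (including report/statistics modules)
  - Source DB + OLAP + front-end tools
  - Source DB + ODS + front-end tools
  - Source DB + ODS + OLAP + front-end tools
  - Source DB + DW + front-end tools
  - Source DB + DW + OLAP + front-end tools
  - Source DB + ODS + DW + OLAP + front-end tools
  - Source DB + ODS + DW + DataMart + OLAP + front-end tools

Factors to consider
  - Performance: OLAP and report cache
  - Data integration: an integrated data store
  - Business flexibility: business-oriented design
  - Meeting more requirements: data modeling
  - Ease of use: customer-customizable reports / web-based usage

Part 2: Microsoft BI architecture

Architecture design: 1. Overall structure

Key point of the Microsoft BI architecture: emphasis on integration
  Diagram labels: Source Systems (OLTP), Data Warehouse, Data Marts/OLAP/Cubes, Front-End Portal or Tools

Business Intelligence: related product modules
  - Analysis Services: OLAP & Data Mining
  - Data Transformation Services (SSIS)
  - SQL Server Relational Engine
  - Reporting Services
  - Management Tools
  - Dev tools: Visual Studio.Net
  - Excel, OWC, MapPoint, Data Analyzer, Balanced Scorecard
  - SharePoint Portal Server
  - Windows Server, Windows Client

Key point of the Microsoft BI architecture: emphasis on OLAP
  - The same sales data seen from different angles: the finance manager's view, the product manager's view, the sales manager's view, and ad hoc special views
  - Cube axes: Product, Time, Market

Basic OLAP concepts: dimensions and measures
  - From two dimensions to many
  - Measures
  - Dimensions
  - Hierarchies

Handling massive data volumes: T3
  - Presented as the world's largest multidimensional data set: 1.2 TB of source data in a MOLAP cube
  - DW: 7.7 billion rows
  - RDW loaded into MOLAP (440 GB)
  - 50 concurrent users
  - Median query time = 0.02-0.08 sec
  - Reflects a real company structure, real business processes, and real data

Key point of the Microsoft BI architecture: emphasis on closing the loop
  Diagram labels:
  - External interactive applications: direct contact, call center, web site, other
  - Internal interactive applications: sales, marketing, service
  - Internal end-user access; internal secure access
  - Data warehouse (DW) and/or ODS
  - Back-office systems: ERP, external, operational, e-commerce, other
  - Data analysis and reporting tools: data mining, reporting, analysis
  - Data Marts

Typical Microsoft BI architecture
  (architecture diagram)

Architecture design: 2. Data modeling

The Star Schema
  Fact table
  - Sales_Fact: TimeKey, EmployeeKey, ProductKey, CustomerKey, ShipperKey, Sales Amount, Unit Sales, ...
  Dimension tables
  - Employee_Dim: EmployeeKey, EmployeeID, ...
  - Time_Dim: TimeKey, TheDate, ...
  - Product_Dim: ProductKey, ProductID, ...
  - Customer_Dim: CustomerKey, CustomerID, ...
  - Shipper_Dim: ShipperKey, ShipperID, ...

The value of a multidimensional structure
  - Grocery Transaction: Store Number, Transaction Date, Customer, Product, Quantity, Amount
  - Time: Transaction Date
  - Sales Period: Period Identifier, Sales Period, From Date, To Date
  What the time dimension carries depends on the organization's structure and on how it uses time:
  - Quarter
  - First half / second half of the year
  - Whether the day is a traditional or a Western holiday
  - Month of the fiscal year
  - Whether the day falls in the first, middle, or last ten days of the month
  - Day of the week; week number within the year

Multidimensional models: four schemas
  - Star schema
  - Snowflake schema
  - Constellation schema
  - Snowstorm schema

Multidimensional models: snowflake schema
  - Grocery Transaction: Store Number, Transaction Date, Customer, Product, Quantity, Amount
  - Customer: Customer, First Name, Last Name, Address 1, Address 2, Address 3, City, State, Country, Postal Code, Customer Category
  - Customer Category: Category, Customer Category
  - Time: Transaction Date
  - Sales Period: Period Identifier, Sales Period, From Date, To Date
  - Store: Store Number, Store Name, City, State, Country, Telephone, Region
  - Region: Region, Description
  - Product: Product, Description, Category
  - Product Category: Product Category, Description
  To avoid data redundancy, a complex dimension is described by several tables. The snowflake schema builds on the star schema by giving the dimension tables a multi-level structure.

Multidimensional models: constellation schema
  - Grocery Transaction: Store Number, Transaction Date, Customer, Product, Purchase Quantity, Amount
  - Product Purchases: Product, Purchase Date, Supplying Vendor, Purchase Order, Unit Quantity, Purchase Cost
  - Product Inventory: Product, Warehouse Location, Quantity On Hand, Quantity Back Ordered
  - Customer: Customer, First Name, Last Name, Address 1, Address 2, Address 3, City, State, Country, Postal Code, Customer Category
  - Customer Category: Category, Customer Category
  - Time: Transaction Date
  - Sales Period: Period Identifier, Sales Period, From Date, To Date
  - Store: Store Number, Store Name, City, State, Country, Telephone, Region
  - Product: Product, Description, Category, Product Line
  - Vendor: Vendor, Vendor Name, Address 1, Address 2, Address 3, City, State, Country, Postal Code
  - Warehouse: Warehouse, Address 1, Address 2, Address 3, City, State, Country, Postal Code
  The constellation schema has multiple fact tables.

Multidimensional models: snowstorm schema
  - Grocery Transaction: Store Number, Transaction Date, Customer, Product, Purchase Quantity, Amount
  - Product Purchases: Product, Purchase Date, Supplying Vendor, Purchase Order, Unit Quantity, Purchase Cost
  - Product Inventory: Product, Warehouse Location, Quantity On Hand, Quantity Back Ordered
  - Customer: Customer, First Name, Last Name, Address 1, Address 2, Address 3, City, State, Country, Postal Code, Customer Category
  - Customer Category: Category, Customer Category
  - Time: Transaction Date
  - Sales Period: Period Identifier, Sales Period, From Date, To Date
  - Promotion Period: Promotion Id, Promotion, From Date, To Date
  - Store: Store Number, Store Name, City, State, Country, Telephone, Region
  - Region: Region, Description
  - Product: Product, Description, Category, Product Line
  - Product Category: Product Category, Description
  - Product Line: Product Line ID, Description
  - Vendor: Vendor, Vendor Name, Address 1, Address 2, Address 3, City, State, Country, Postal Code
  - Warehouse: Warehouse, Address 1, Address 2, Address 3, City, State, Country, Postal Code
  The snowstorm schema has multiple fact tables and multi-level dimension tables.

Determining the composition of the fact table
  Dimension tables (sample rows)
  - customer_dim: 201, ALFI, Alfreds
  - product_dim: 25, 123, Chai
  - time_dim: 134, 1/1/2000
  sales_fact table
  - Foreign keys: customer_key, product_key, time_key
  - Measures: quantity_sales, amount_sales
  - Sample row: customer_key = 201, product_key = 25, time_key = 134, quantity_sales = 400, amount_sales = 10,789
  The grain of the sales_fact table is defined by the lowest level of detail stored in each dimension associated with the fact table. Here, the grain of the sales_fact table is sales data by customer ID, product ID, and order date.

Describing dimensions
  - Describing business entities
  - Containing attributes that provide context to numeric data
  - Presenting data organized into hierarchies

Modeling principles
  - Grain design
  - Data stability design
  - Fact table storage design
  - Choice of dimension table primary keys
  - Hierarchy design
  - Dimension tables in a snowflaked model
  - Degenerate dimension design
  - Changing dimension design

Architecture design: 3. Functional modules

SQL Server 2005 management tools
  - SQL 2000: EM 2000, QA 2000, AM 2000
  - SQL 2005 (new): SQL Management Studio, BI Development Studio
  - Target audiences: DBA, Developer

SQL Server 2005 DTS: SQL Server Integration Services (SSIS)

SSIS components
  A complete, rich set of pipeline components (transforms):
  - Source
  - Destination
  - Aggregation
  - Derived column
  - Merge
  - Sort
  - Pivot
  - Sampling
  - Conditional split
  - Multicast
  - Slowly changing dimension
  - Data Mining
  - OLAP
  - Fuzzy lookup

SSIS example
  (screenshot)

SQL Server 2005: OLAP

Multi-Dimension OLAP
  "Hey, I sold 100 million dollars' worth of goods."
  Dimensions: product, region, time
  - Month: 1, 2, 3, 4, 5, 6, 7
  - Product: toothpaste, juice, cola, milk, cream, soap
  - Region: West, South, North
  Hierarchical roll-up paths
    Product     Region    Time
    Industry    Country   Year
    Category    Region    Quarter
    Product     City      Month, Week
                Office    Day

Analysis: OLAP data analysis software
  - Drill
  - Slice
  - Compare

Multidimensional analysis: MOLAP, ROLAP, HOLAP

Balancing performance and storage space
  - The aggregation wizard finds the "80-20" rule in the data: the 20 percent of all possible pre-aggregations that provide 80 percent of the performance gain
  - It analyzes level counts for each dimension and parent-child ratios for each level

Intelligent Caching
  - Query 1: January, February, and March 2000 sales. The result comes from the server and is kept in the client cache.
  - Query 2: Quarter 1 2000 sales. The client calculates the result from the cached months.
  - Query 3: Quarter 1 2000 sales and Quarter 1 1999 sales. Only Quarter 1 1999 is needed from the server.

SQL Server 2005 Reporting Services

Reporting Services components
  Reporting Services is an open, extensible platform that supports authoring, managing, and delivering rich, interactive reports across the enterprise.
  Diagram labels: Report Definition, Managed Report, Delivery Channels, Online Access; Authoring, Management, Delivery

SQL Server 2005 Data Mining

Value of Data Mining
  - 5 new algorithms
  - Graphical tools/wizards
  - 12 embeddable viewers
  - Tightly integrated with AS, DTS, Reporting
  - Integration with Web/Office apps
  Chart (SQL Server 2005): Reports (Static), Reports (Ad hoc), OLAP, and Data Mining plotted by Usability (Easy to Difficult) against Relative Business Value; additional label: Business Knowledge

Lab: feature demonstration

SQL Server 2005: Complete Platform for Business Intelligence

Q&A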
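
The star schema and roll-up slides can be tried out in miniature. The sketch below is only an illustration, not part of the deck: it uses Python's built-in sqlite3 module rather than SQL Server, keeps three of the five dimensions, and assumes the columns Month, Quarter, ProductName, and CustomerName, which the slide elides. The first fact row reuses the deck's sample values (customer 201, product 25, time 134, 400 units, 10,789); every other row is made up.

```python
import sqlite3

# Star schema modeled on the deck's "The Star Schema" slide, trimmed to three
# dimensions. Month, Quarter, ProductName and CustomerName are assumed columns.
SCHEMA = """
CREATE TABLE Time_Dim (TimeKey INTEGER PRIMARY KEY, TheDate TEXT,
                       Month TEXT, Quarter TEXT);
CREATE TABLE Product_Dim (ProductKey INTEGER PRIMARY KEY, ProductID INTEGER,
                          ProductName TEXT);
CREATE TABLE Customer_Dim (CustomerKey INTEGER PRIMARY KEY, CustomerID TEXT,
                           CustomerName TEXT);
CREATE TABLE Sales_Fact (
    TimeKey     INTEGER REFERENCES Time_Dim(TimeKey),
    ProductKey  INTEGER REFERENCES Product_Dim(ProductKey),
    CustomerKey INTEGER REFERENCES Customer_Dim(CustomerKey),
    UnitSales   INTEGER,   -- measure
    SalesAmount INTEGER    -- measure, whole currency units
);
"""

# Hierarchy levels of the time dimension that a query may roll up to.
ROLLUP_LEVELS = {"month": "t.Month", "quarter": "t.Quarter"}


def build_warehouse():
    """Create an in-memory warehouse and load a handful of sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO Time_Dim VALUES (?, ?, ?, ?)", [
        (134, "2000-01-01", "2000-01", "Q1 2000"),
        (165, "2000-02-01", "2000-02", "Q1 2000"),   # made-up row
        (225, "2000-04-01", "2000-04", "Q2 2000"),   # made-up row
    ])
    conn.executemany("INSERT INTO Product_Dim VALUES (?, ?, ?)", [
        (25, 123, "Chai"),
        (26, 124, "Chang"),                          # made-up row
    ])
    conn.execute("INSERT INTO Customer_Dim VALUES (201, 'ALFI', 'Alfreds')")
    # Grain: one row per time key, product key and customer key.
    conn.executemany("INSERT INTO Sales_Fact VALUES (?, ?, ?, ?, ?)", [
        (134, 25, 201, 400, 10789),                  # the deck's sample row
        (165, 25, 201, 100, 2700),                   # made-up row
        (165, 26, 201, 50, 950),                     # made-up row
        (225, 26, 201, 10, 190),                     # made-up row
    ])
    return conn


def rollup_sales(conn, level):
    """Sum both measures at the requested level of the time hierarchy."""
    column = ROLLUP_LEVELS[level]  # whitelist lookup; raises KeyError otherwise
    sql = f"""
        SELECT {column}, SUM(f.UnitSales), SUM(f.SalesAmount)
        FROM Sales_Fact AS f
        JOIN Time_Dim AS t ON t.TimeKey = f.TimeKey
        GROUP BY {column}
        ORDER BY {column}
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse()
    for level in ("month", "quarter"):
        print(level, rollup_sales(warehouse, level))
```

Rolling up from month to quarter only changes the grouping column, because the fact table stores keys and measures while the hierarchy lives in the dimension table. A MOLAP engine would additionally precompute some of these aggregates, which is the trade-off the aggregation wizard slide describes.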