PetroChina Digital Network Server Virtualization: Technical Practice and Planning.pptx

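Slide 1: Server Virtualization for the PetroChina Changqing Oilfield Digital Network
No. 6 Oil Production Plant, Science, Technology and Information Center
February 15 (year missing in the source)

Slide 2: Agenda
- Application scope considerations
- Server procurement considerations
- Virtual machine deployment considerations
- Management and maintenance considerations

Slide 3: Agenda (repeated)

Slide 4: General criteria for application scope
- Applications not suited to virtualization
  - Applications with special hardware access requirements
    - High-performance graphics cards: not suitable
    - Special serial/parallel encryption devices: not suitable
    - USB device connections: possibly unsuitable; an external USB device may serve as a substitute, subject to testing
  - Applications that stay heavily loaded even on high-end servers: possibly unsuitable; analyze the current server configuration first
- Applications that can be virtualized
  - Every application not excluded above
  - Order the work by migration complexity
    - Migrate applications that are easy to P2V first, for example those the Converter tool can migrate directly
    - For applications that are hard or impossible to P2V, reinstall in a virtual machine and then cut over
- Management factors that decide whether to virtualize
  - Impact of the transition on existing business
  - Impact on existing management practice after the transition
  - Difficulty of coordinating between departments

Slide 5: Deployment types for virtualization hosts
- Scale-up versus scale-out deployment models
- Different "quantization" models for resource pools
[Figure: one large physical host whose hypervisor runs nine VMs (each with OS and App), next to three smaller physical hosts whose hypervisors run three VMs each]

Slide 6: Resource pool types
- The scale-up host model provides a larger contiguous resource space
  - It fits workloads of differing shapes more easily
  - It can reach higher resource utilization
- The scale-out cluster model behaves like a set of small pools
  - Multiple small-capacity resource pools
  - More monitoring and management effort
  - Modularity brings both advantages and disadvantages

Slide 7: Parameters that affect virtualization deployment
[Figure: quadrant chart with axes "functional diversity" and "server importance" (not important to important). Quadrants: standalone servers (standalone, local storage, etc.); (clusters, multiple hosts, etc.); server farms (scale-out server clusters, shared servers, etc.); (back office, local applications, etc.). Load constraints, technical constraints, and business constraints are shown in order of increasing difficulty.]

Slide 8: Load constraints
- Consider each resource group separately
  - CPU utilization
  - Disk I/O
  - Network I/O
  - Memory utilization
- Allow for the extra load that virtualization itself creates, for example:
  - Disk and network I/O add CPU load
  - iSCSI storage access also adds CPU load
- Include periodic operational load changes
  - Month-end load changes
  - Year-end load changes

Slide 9: Technical constraints
- Technical constraints mainly mean:
  - Compatibility (system/application compatibility)
  - Dependencies (for example, the system is part of a logical group)
- In most environments these constraints cover:
  - Network connectivity (at subnet level)
  - Interconnection between applications
  - Storage technology in use
  - Hardware and peripherals in use
  - Software support and certification
- The constraints differ depending on whether virtualization is implemented above or below the kernel
  - Shared versus separate OS image models

Slide 10: Business and process constraints
- These constraints are easily overlooked in small or centralized environments
- They can be ignored in a lab test environment but must be considered in production
- Common business and process constraints in virtualization:
  - Maintenance windows and change freezes
  - Geographic location and other physical limits
  - Operations environment, security zones, and application tiering
  - Business organizations, departments, and customers
  - Regulatory and policy restrictions
- Ignoring these constraints can lead to unpredictable results
- Analyze each case on its merits and plan according to the goal

Slide 11: Agenda (repeated)

Slide 12: Hardware used for virtualization must meet the compatibility lists
- All equipment used to build a VMware VI3 virtual infrastructure, including server systems, storage systems, and I/O cards, must meet the VMware VI3 compatibility lists. The current lists are:
  - Servers: Systems Compatibility Guide For ESX Server 3.5 and ESX Server 3i
  - Storage systems: HCL: Storage/SAN Compatibility Guide For ESX Server 3.5 and ESX Server 3i
  - I/O cards (NICs, FC HBAs, iSCSI HBAs): HCL: I/O Compatibility Guide For ESX Server 3.5 and ESX Server 3i

Slide 13: ESX server hardware configuration key points: CPUs
- ESX schedules CPU cycles to serve requests from the virtual machines and the Service Console
- The more CPU targets are available, the better the ESX scheduler works (8 or more cores per server gives the best results)
- Hyper-threading does not deliver the benefits of additional cores; disabling it, where present, is recommended
- CPUs with EM64T plus Intel VT or AMD-V can run 32-bit and 64-bit virtual machines at the same time
- Clusters built from servers with processors of the same vendor, product family, and generation get the best VMotion compatibility
- Enhanced VMotion Compatibility widens the original VMotion compatibility
  - See "Alleviating Constraints with Resource Pools Live Migration with Enhanced VMotion"
- See "Best Practices for Successful VI Design"

Slide 14: ESX server hardware configuration key points: memory
- Memory is more likely than CPU to become the bottleneck
- At times the memory used by the virtual machines may exceed physical memory:
  - Host swap file (use as little as possible for best performance)
  - Transparent Page Sharing (multiple VMs share identical memory pages)
- Check the server's specific memory configuration requirements
  - DIMM sizes, bank pairing, parity, upgrade considerations (mix and match or forklift replacement)
- Configure the server with as much memory as possible and use the largest DIMMs available, especially when not every slot is populated

Slide 15: ESX server hardware configuration key points: network
- Basic network components of the virtual infrastructure
[Figure: three port groups on a virtual switch, labeled (Management virtual machine), (VMotion, iSCSI, NFS), and (VM connectivity)]

Slide 16: ESX server hardware configuration key points: network: virtual switches and port groups
- Configure at least one virtual switch; a test environment can use 2, and production should have at least 3
- A virtual switch can carry all 3 port group types at once (Service Console, VMkernel, VM)
- Give the Service Console, VMkernel, and VM port groups a virtual switch each
- VLANs can be used to separate port groups
- In clusters that use VMotion and DRS, the network configuration must match across hosts (same number of virtual switches and identical network labels)
- Give the ESX Service Console a static IP address and set the correct speed and duplex

Slide 17: ESX server hardware configuration key points: network basic components
- ESX server, virtual switches, physical NICs
  - For redundancy, assign at least two physical NICs to each virtual switch
  - The number of physical NICs/ports per ESX server depends on how many virtual switches are planned
  - If the 3 port group types (SC, VMkernel, VM) each sit on their own virtual switch, production should have at least 6 physical NICs/ports
  - Assigning more physical NICs/ports to the virtual switch that carries VM port groups adds load balancing

Slide 18: ESX server hardware configuration key points: connection to the physical network
- Physical NICs/ports and physical switches
  - Connect the physical NICs/ports of one virtual switch to different physical switches
  - Connect the physical NICs/ports used by the VMotion port groups of every server in a cluster to the same set of physical switches (still following the first rule)

Slide 19: Example 1: Blade Server with 2 NIC Ports
- Candidate design:
  - Team both NIC ports
  - Create one virtual switch
  - Create three port groups, with an Active/Standby policy for each
    - Portgroup1: Service Console (SC), VLAN 10
    - Portgroup2: VMotion, VLAN 20
    - Portgroup3: VM traffic, VLAN 30
  - Use VLAN trunking: trunk VLANs 10, 20, 30 on each uplink
[Figure: one vSwitch with uplinks vmnic0 and vmnic1 (Active/Standby), SC and vmkernel ports, VLAN trunks carrying VLANs 10, 20, 30]

Slide 20: Example 2: Server with 4 NIC Ports
- Candidate design:
  - Create two virtual switches and team two NICs to each
  - vSwitch0 (Active/Standby for each port group)
    - Portgroup1: Service Console (SC), VLAN 10
    - Portgroup2: VMotion, VLAN 20
  - vSwitch1 (Originating Virtual Port ID)
    - Portgroup3: VM traffic #1, VLAN 30
    - Portgroup4: VM traffic #2, VLAN 40
  - Use VLAN trunking
    - vmnic1 and vmnic3: trunk VLANs 10, 20
    - vmnic0 and vmnic2: trunk VLANs 30, 40
[Figure: vSwitch0 with vmnic0 and vmnic1 (Active/Standby, VLANs 10, 20); vSwitch1 with vmnic2 and vmnic3 (VLANs 30, 40)]

Slide 21: Example 3: Server with 4 NIC Ports (Slight Variation)
- Candidate design:
  - Create one virtual switch with two NIC teams
  - vSwitch0, Active/Standby for port groups 1 and 2
    - Portgroup1: Service Console (SC), VLAN 10
    - Portgroup2: VMotion, VLAN 20
  - Originating Virtual Port ID for port groups 3 and 4
    - Portgroup3: VM traffic #1, VLAN 30
    - Portgroup4: VM traffic #2, VLAN 40
  - Use VLAN trunking
    - vmnic1 and vmnic3: trunk VLANs 10, 20
    - vmnic0 and vmnic2: trunk VLANs 30, 40

Slide 22: Servers with More NIC Ports
- More than 4 NIC ports: design considerations
- With trunks (VLAN tagging):
  - Use the previous approach and scale up to meet additional bandwidth and redundancy requirements
  - Add NICs to the NIC team supporting VM traffic
- VLAN tagging is always recommended, but there are options if NICs are available:
  - Dedicated NIC for VMotion: at least one NIC
  - Dedicated NICs for IP storage (NFS and/or iSCSI): usually two teamed NICs (consider IP-hash and EtherChannel if there are multiple destinations and Multi-Chassis EtherChannel is employed on the physical switches)
  - Dedicated NIC(s) for the Service Console: at least two for availability
- Note: it is easy to consume many physical NICs and switch ports if VLAN tagging is not used

Slide 23: ESX server hardware configuration key points: storage
- Store virtual machine files on an external shared disk array wherever possible
- The internal disks of the ESX server need adequate redundancy; RAID 1 is recommended
- Disk requirements of ESX itself, partitioning at install time:
  - Automatic partitioning is not recommended because /, /var, and /home end up on the same file system; when / (root) fills up, the ESX server runs into serious problems. Recommended layout:
    - /boot: 50 to 100 MB (primary partition)
    - /: 8.0 to 18 GB (primary partition)
    - swap: 2 times the Service Console memory; a fixed 1.6 GB is recommended
    - /var: 4 GB or larger
  - 18 GB is the recommended size for the ESX server program space
  - Also allow space for local ISO images and other files

Slide 24: How storage is presented to virtual machines
[Figure: layers from top to bottom: VM layer, datastores (VMFS, NFS), storage arrays (FC, iSCSI, NAS)]
- Virtual disks are presented to the VM through a SCSI controller
- The SCSI controller appears as a BusLogic or LSI Logic disk controller
- A VM can have 1 to 4 virtual LSI Logic or BusLogic SCSI adapters
- Each SCSI adapter holds 1 to 15 virtual SCSI storage devices
- Virtual disks reside in datastores formatted as VMFS, on NFS, or on raw disks
- The file system type is determined by the underlying physical disk drive

Slide 25: Volumes, datastores, and LUNs
[Figure: storage array, LUN 10 (20 GB), volume, datastore]
- A LUN is a logical space
  - It can be created from the whole capacity of a storage array or from part of it
- A LUN mapped to ESX becomes a volume
- A volume formatted with a file system becomes a datastore
- Different file system types cannot be mixed in one LUN
- Each LUN corresponds to one VMFS volume

Slide 26: Virtual machine contents live in datastores
- ESX datastore types:
  - VMware file system (VMFS)
    - RDM, used together with VMFS
  - Network File System (NFS)
- A datastore is formatted with a file system
- Datastores can be handled the way files are handled
- Up to 256 VMFS datastores per system
- Up to 8 NFS datastores per system
- Datastores hold VM contents, ISO images, VM templates, and floppy images

Slide 27: Virtual machine contents (one folder/subdirectory per VM)
  File name                       Description
  .vmx                            VM configuration
  .vmdk                           Virtual disk
  -flat.vmdk                      Preallocated virtual disk file (contains the data)
  .vswp                           Swap file
  nvram                           Non-volatile RAM file
  .vmem                           VM memory
  .vmss                           VM suspend file
  .vmsd                           Snapshot data
  -Snapshot.vmsn                  Snapshot state file
  vmware-0.log, vmware-1.log, ... Log files

Slide 28: Datastore types
[Figure: ESX host 1 and ESX host 2, each running VM 1, VM 2, and VM 3. VM contents sit in VMFS datastores on a Fibre Channel SAN array (through an FC switch), on an iSCSI SAN array (through an IP switch and IP network), and on local SCSI disks (VMDK), and in an NFS datastore on a NAS array.]

Slide 29: Recommended ESX server configuration: new purchases
To get the most from virtualization and make full use of each server, hosts should meet or exceed the following:
  Item                                        2-socket      4-socket               8-socket
  CPU (2 GHz or faster recommended)           2 x quad-core 4 x dual- or quad-core 4 x dual- or quad-core, or more
  Memory                                      16 GB+        32 GB+                 64 GB+
  Gigabit ports, no external storage          4+ / 6+       4+ / 6+                4+ / 6+
  Gigabit ports, FC storage                   4+ / 6+       4+ / 6+                4+ / 6+
  Gigabit ports, IP storage                   6+ / 8+       6+ / 8+                6+ / 8+
  FC HBA ports (4 Gb or 8 Gb recommended)     2             2                      2
  Internal disks (when using external array)  2             2                      2
  Power supplies                              Dual redundant Dual redundant        Dual redundant
For reasons of cost-effectiveness and availability, deploying virtualization on single-socket servers is not recommended.

Slide 30: Recommended virtualization host configuration: existing servers
For the four-socket servers that are currently common in the industry:
- Four-socket single-core: weak compute capacity; keep the VM count under 10; 12 to 16 GB of memory recommended
- Four-socket dual-core: moderate compute capacity; roughly 10 to 15 VMs; 16 to 24 GB of memory recommended
- Four-socket quad-core: strong compute capacity; roughly 15 to 30 VMs; 24 to 32 GB of memory recommended

Slide 31: Recommended configuration for the VC (VirtualCenter) server
- Processor: 2.0 GHz or faster Intel or AMD x86; VC supports multiprocessing, up to 2 CPUs
- Memory: 2 GB minimum; 4 GB recommended if the database runs on the same machine as VC
- Disk space: 560 MB minimum, 2 GB recommended
- NIC: Gigabit recommended
- Minimum hardware: one 2 GHz CPU, 2 GB of memory, Gigabit port
  - Supports 20 concurrent connections and manages about 50 physical hosts and 1000 virtual machines
- Recommended hardware: two CPUs, 4 GB of memory, Gigabit port
  - Supports 50 concurrent connections and manages about 200 physical hosts (the virtual machine count is missing in the source)

Slide 32: Agenda (repeated)

Slide 33: Planning the number of virtual machines
- Factors that decide how many VMs one server can support:
  - Server hardware configuration
    - CPU performance: with multi-core, high-clock CPUs, the CPU is less and less likely to be the bottleneck
    - Memory size: memory is the hard limit; the more memory, the more VMs
    - Network ports: Gigabit is common and bandwidth is mostly adequate, so plan ports mainly from a management point of view
    - HBAs: disk access performance affects the VM count; 4 Gb or 8 Gb HBAs are recommended to reduce the impact of the link
    - Local disks: internal disks are weak in both availability and I/O throughput; do not store VMs on them; use an external high-performance disk array
  - Application load
    - Physical server resources are finite, so the heavier the application load, the fewer VMs can run at once
    - Mix applications with different access patterns on the same physical server
    - Use DRS and VMotion to tune the ratio of physical hosts to VMs
- Allowing for the spare capacity that HA and DRS need, total resource utilization with all VMs under normal load should stay at or below two thirds
- Rule of thumb (for reference only): about 10 VMs on a two-socket quad-core server, 15 to 30 on a four-socket quad-core server

Slide 34: Virtual machine resource allocation: CPU and memory
- CPU allocation guidelines:
  - Use as few vCPUs as possible; do not use virtual SMP for single-threaded applications that cannot use multiple threads
  - The number of vCPUs in a VM should not equal or exceed the number of physical cores; on a two-socket dual-core server, for example, a VM should use at most two vCPUs
  - ESX itself adds overhead; make sure the combined utilization and the total vCPU count of all VMs stay within what the host can supply
  - Watch the "idle loop spin" parameter; some operating systems do not really release the virtual CPU when idle
  - Make sure single-processor VMs use a UP HAL/kernel and multiprocessor VMs use an SMP HAL/kernel
- Memory allocation guidelines:
  - Total memory is the sum of the physical memory that the resource assessment shows the VMs actually need; extra demand from applications can be absorbed by ESX disk-backed memory (swap)
  - For critical applications, consider a fixed memory allocation to keep performance stable

Slide 35: VMware vSphere Client, the virtualization client application
1. Install the vSphere Client on client machines for remote management

Slide 36: vSphere Client: hardware configuration
2. Ensure hosts are CPU compatible
  - Intel vs AMD
  - Similar CPU family / SSE3 status
  - Enhanced VMotion Compatibility (EVC)
    - "VMware VMotion and CPU Compatibility" whitepaper
  - CPU incompatibility = limited DRS VM migration options
- Larger host CPU and memory size preferred for VM placement (if all else is equal)
- Differences in cache or memory architecture = inconsistency in performance

Slide 37: DRS Best Practices: Cluster Configuration
- Higher number of hosts = more DRS balancing options
  - Recommend up to 32 hosts per cluster
  - May vary with VC server configuration and VM/host ratio
- Network configuration on all hosts
  - VMotion network: security policies, VMotion NIC enabled, GigE network, etc.
  - Virtual Machine network present on all hosts
- VM datastore shared across all hosts
- VM floppy/CD connected to host device

Slide 38: DRS Best Practices: VM Resource Settings
- Reservations, Limits, and Shares
  - Shares take effect during resource contention
  - Low limits can lead to wasted resources
  - High VM reservations may limit DRS balancing
    - Overhead memory
  - Use resource pools (RP) for better manageability
- Virtual CPUs and memory size
  - High memory size and many virtual CPUs = fewer migration opportunities
  - Configure VMs based on need

Slide 39: DRS Best Practices: Algorithm Settings
- Aggressiveness threshold
  - Moderate threshold (default) works well for most cases
  - Aggressive thresholds recommended if:
    - the cluster is homogeneous, and
    - VM demand is relatively constant, and
    - there are few affinity/anti-affinity rules
- Use affinity/anti-affinity rules only when needed
  - Affinity rules: closely interacting VMs
  - Anti-affinity rules: I/O intensive workloads, availability
- Automatic DRS mode recommended (cluster-wide)
  - Manual or partially automatic mode for location-critical VMs (per VM)
  - The per-VM setting overrides the cluster-wide setting

Slide 40: HA Best Practices: Setup & Networking
- Proper DNS and network settings are needed for initial configuration
  - After configuration, DNS resolutions are cached to /etc/FT_HOSTS (minimizing the dependency on DNS server availability during an actual failover)
  - DNS on each host is preferred (manual editing of /etc/hosts is error prone)
- Redundancy for ESX Service Console networking is essential (several options)
  - Choose the option that minimizes single points of failure
  - Gateways/isolation addresses should respond via ICMP (ping)
  - Enable PortFast (or equivalent) on network switches to avoid spanning-tree-related isolations
- Network maintenance activities should take into account dependencies on the ESX Service Console network(s)
  - VMware HA can be temporarily disabled through the Cluster > Edit Settings dialog
- Valid VM network label names are required for proper failover
  - Virtual machines use them to re-establish network connectivity upon restart

Slide 41: HA Network Configuration
- Network redundancy between the ESX service consoles is essential for reliable detection of host failures and isolation conditions
- A single service console network with underlying redundancy is usually sufficient:
  - Use a team of 2 NICs connected to different physical switches to avoid a single point of failure
  - Configure the NICs in the vSwitch for Active/Standby (rolling failover = "yes", default load balancing = route based on originating port ID)
  - Consider extending timeout values and adding multiple isolation addresses (*see appendix)
  - Timeouts of 30 to 60 seconds will slightly extend recovery times, but will also allow for intermittent network outages

Slide 42: HA Network Configuration (Continued)
- Beyond NIC teaming, a secondary service console network can be configured to provide redundant heartbeating and isolation detection
- HA will detect and use a secondary service console network
  - Adding a secondary service console port group to an existing VMotion vSwitch avoids having to dedicate an additional subnet and NIC for this purpose
  - Also specify an additional isolation address for the cluster to account for the added redundancy (*see appendix)
  - Continue using the primary service console network and IP address for management purposes
  - Be careful with network maintenance that affects the primary service console network and the secondary/VMotion network

Slide 43: HA Best Practices: Resource Management
- Larger groups of homogeneous servers will allow higher levels of utilization across an HA/DRS enabled cluster (on average)
  - More nodes per cluster (current maximum is 16) can tolerate multiple host failures while still guaranteeing failover capacities
  - Admission control heuristics are conservatively weighted (so that large servers with many VMs can fail over to small servers)
- To define the sizing estimates used for admission control, set reasonable reservations as the minimum resources needed
  - Admission control will exceed failover capacities when reservations are not set; otherwise HA will use the largest reservation specified as the "slot" size
  - At a minimum, set reservations for a few virtual machines considered "average"
- Admission control may be too conservative when host and VM sizes vary widely
  - Perform your own capacity planning by choosing "Allow virtual machines to be powered on even if they violate availability constraints". HA will still try to restart as many virtual machines as it can.

Slide 44: Agenda (repeated)

Impact of VirtualCenter Downtime
  Component                            Impact Experienced
  Virtual Machines                     Unaffected; management requires direct connections to ESX Servers
  ESX Servers                          Unaffected; management requires direct connections to ESX Servers
  Performance & Monitoring Statistics  Historical records will have gaps during outages; still available via ESX Servers
  VMotion                              Unavailable
  VMware DRS                           Unavailable
  VMware HA                            Agents unaffected & provide failov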
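
Two of the sizing rules in the deck are simple arithmetic: two physical NICs per virtual switch (hence 6 ports for 3 virtual switches), and keeping total utilization at or below two thirds to leave room for HA and DRS. The Python sketch below applies them under stated assumptions. The function names are hypothetical, memory is treated as the only limiting resource, and per-VM overhead and Service Console memory are ignored, so the results are rough planning figures rather than VMware sizing guidance.

```python
from fractions import Fraction
import math


def required_physical_nics(vswitch_count: int, nics_per_vswitch: int = 2) -> int:
    """Physical NIC ports needed when every virtual switch gets its own
    redundant uplinks (the deck recommends at least two per vSwitch)."""
    if vswitch_count < 1:
        raise ValueError("at least one virtual switch is required")
    return vswitch_count * nics_per_vswitch


def max_vms_by_memory(host_memory_gb: int, vm_memory_gb: int,
                      utilization_cap: Fraction = Fraction(2, 3)) -> int:
    """Number of equally sized VMs whose memory fits under the utilization cap.

    Assumption: memory is the hard limit and overhead is ignored.
    Fractions keep the two-thirds arithmetic exact.
    """
    if vm_memory_gb <= 0:
        raise ValueError("VM memory must be positive")
    usable_gb = Fraction(host_memory_gb) * utilization_cap
    return math.floor(usable_gb / Fraction(vm_memory_gb))


if __name__ == "__main__":
    # Service Console, VMkernel, and VM port groups on separate vSwitches.
    print(required_physical_nics(3))   # 6
    # A 32 GB host running 2 GB VMs under the two-thirds cap.
    print(max_vms_by_memory(32, 2))    # 10
```

The 32 GB case yields 10 VMs, which sits at the lower edge of the deck's rule-of-thumb ranges; real capacity depends on CPU, I/O, and the workload mix.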

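The HA resource-management slide says that admission control uses the largest reservation as the "slot" size and is weighted conservatively. The sketch below is a deliberately simplified model of that idea, not VMware's actual admission-control algorithm. `Host`, `Reservation`, and all three functions are hypothetical names, and the worst case is assumed to be the loss of the hosts with the most slots.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Host:
    name: str
    cpu_mhz: int
    memory_mb: int


@dataclass(frozen=True)
class Reservation:
    cpu_mhz: int
    memory_mb: int


def slot_size(reservations: Sequence[Reservation]) -> Reservation:
    """Slot = largest CPU reservation combined with largest memory reservation."""
    if not reservations:
        raise ValueError("at least one reservation is needed to size a slot")
    slot = Reservation(max(r.cpu_mhz for r in reservations),
                       max(r.memory_mb for r in reservations))
    if slot.cpu_mhz <= 0 or slot.memory_mb <= 0:
        raise ValueError("slot dimensions must be positive")
    return slot


def slots_per_host(host: Host, slot: Reservation) -> int:
    """A host holds as many slots as its scarcer resource allows."""
    return min(host.cpu_mhz // slot.cpu_mhz, host.memory_mb // slot.memory_mb)


def slots_after_failures(hosts: Sequence[Host], slot: Reservation,
                         host_failures: int) -> int:
    """Slots remaining if the hosts with the most slots fail (assumed worst case)."""
    counts = sorted(slots_per_host(h, slot) for h in hosts)
    surviving = max(len(counts) - host_failures, 0)
    return sum(counts[:surviving])


if __name__ == "__main__":
    cluster = [Host(f"esx{i}", cpu_mhz=16000, memory_mb=32768) for i in range(3)]
    vms = [Reservation(500, 1024), Reservation(1000, 2048), Reservation(250, 4096)]
    slot = slot_size(vms)
    print(slot, slots_per_host(cluster[0], slot), slots_after_failures(cluster, slot, 1))
```

In this model one VM with a large memory reservation sets the slot for every VM, which illustrates why the slide warns that admission control can be too conservative when VM sizes vary widely.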