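Chapter 3: Dynamic Programming

3.1 Basic Concepts

I. Dynamic decision problems
Decision problems whose decision process is staged and sequential (time-dependent); that is, the process can be divided into clearly distinct stages.

II. What is dynamic programming (DP, Dynamic Programming)
A method for optimizing multistage decision problems. It is widely applied in industrial technology, production management, business management, economics, the military, and other fields.

III. Origin of dynamic programming
In 1951 the American mathematician R. Bellman and colleagues proposed the principle of optimality, which established dynamic programming. The classic book Dynamic Programming was published in 1957.

IV. Classification of dynamic decision problems
1. By the form in which the data are given: discrete dynamic decision problems and continuous dynamic decision problems.
2. By the nature of how the decision process evolves: deterministic dynamic decision problems and stochastic dynamic decision problems.

V. Basic elements of a dynamic decision problem

[Figure: staged route network from A to F with node layers A; B1, B2, B3; C1, C2, C3; D1, D2, D3; E1, E2; F.]
Edge lengths: A→B1 3, A→B2 5, A→B3 4; B1→C1 9, B1→C2 5; B2→C1 4, B2→C2 3, B2→C3 5; B3→C2 1, B3→C3 7; C1→D1 1, C1→D2 5; C2→D1 8, C2→D2 4, C2→D3 6; C3→D1 4, C3→D2 4, C3→D3 2; D1→E1 4, D1→E2 2; D2→E1 6, D2→E2 9; D3→E1 7, D3→E2 5; E1→F 1, E2→F 2.

1. Stage n: one of the successive rounds in which a decision is made. Here n = 1, 2, 3, 4, 5.

2. State $S_n$: the starting position of each stage (the start point of the stage). The states of a stage form the state set, also written $S_n$:
$S_1=\{A\}$, $S_2=\{B_1,B_2,B_3\}$, $S_3=\{C_1,C_2,C_3\}$, $S_4=\{D_1,D_2,D_3\}$, $S_5=\{E_1,E_2\}$.

3. Decision $X_n$: the choice of moving from a state of one stage to a state of the next stage (the end point of the stage). The decisions form the decision set $D_n(S_n)$:
$D_1(S_1)=X_1(A)=\{B_1,B_2,B_3\}=S_2$
$D_2(S_2)=X_2(B_1)\cup X_2(B_2)\cup X_2(B_3)=\{C_1,C_2\}\cup\{C_1,C_2,C_3\}\cup\{C_2,C_3\}=\{C_1,C_2,C_3\}=S_3$
$D_3(S_3)=X_3(C_1)\cup X_3(C_2)\cup X_3(C_3)=\{D_1,D_2\}\cup\{D_1,D_2,D_3\}\cup\{D_1,D_2,D_3\}=\{D_1,D_2,D_3\}=S_4$
$D_4(S_4)=X_4(D_1)\cup X_4(D_2)\cup X_4(D_3)=\{E_1,E_2\}\cup\{E_1,E_2\}\cup\{E_1,E_2\}=\{E_1,E_2\}=S_5$
$D_5(S_5)=X_5(E_1)\cup X_5(E_2)=\{F\}\cup\{F\}=\{F\}$

4. Policy: the ordered collection $\{X_n\}$ of the decisions made in every stage of the whole process, for example A→B2→C1→D1→E2→F. In the example there are 38 ways to go from A to F, that is, 38 routes and 38 policies.

5. Sub-policy: the remaining n stages form an n-stage subprocess, and the corresponding sequence of decisions is called an n-stage sub-policy, for example C1→D1→E2→F.

6. State transition equation: the end point (decision) of one stage is the start point (state) of the next stage:
$$X_n = S_{n+1}$$

7. Index function: the quantitative index of each stage, written $r_n(s_n,x_n)$. In the example the index is distance, written $d_n(s_n,x_n)$, e.g. $d_2(B_3,C_2)=1$ and $d_3(C_2,D_3)=6$.

8. Objective function: the quantitative index of a policy, written
$$Z=\operatorname{opt}\{\,r_1(s_1,x_1) * \cdots * r_n(s_n,x_n)\,\}$$
where opt is max or min and $*$ is an operator. In the example,
$$Z=\min\{\,d_1(s_1,x_1)+\cdots+d_n(s_n,x_n)\,\}=\min\{\,d_1+d_2+\cdots+d_n\,\}$$

3.2 The Principle of Optimality

I. R. Bellman's principle of optimality
An optimal policy for the whole process has the following property: whatever the earlier states and decisions were, the remaining decisions must form an optimal policy with respect to the state produced by the earlier decisions.
That is, if M is any point on the optimal route from A to B, then the part of that route from M to B is also an optimal route from M to B.

[Figure: route A → M → B.]

II. Recursive index equation
$$f_n^*(S_n)=\operatorname*{opt}_{x_n\in D_n(S_n)}\{\,r_n(s_n,x_n) * f_{n+1}^*(s_{n+1})\,\}$$
For the example:
$$f_n^*(S_n)=\min_{x_n\in D_n(S_n)}\{\,d_n(s_n,x_n)+f_{n+1}^*(S_{n+1})\,\},\quad n=4,3,2,1$$
$$f_5^*(S_5)=\min_{x_5\in D_5(S_5)} d_5(s_5,x_5)$$

III. Solution procedure
Use backward nested recursion: starting from the last stage, optimize each subprocess in turn until the optimum of the whole process is obtained. This yields the optimal policy and its index value.

3.3 DP Modeling and Solution

I. Modeling condition
Any decision problem whose decision process is inherently sequential, or can be converted into a sequential one, can be modeled and solved by dynamic programming.

II. Modeling and solving typical dynamic decision problems

1. Shortest-route problem

Example 1: find the shortest route from A to F in the network of Section 3.1 and its length.

The model uses the elements defined in Section 3.1:
(1) Stage n = 1, 2, 3, 4, 5.
(2) State sets $S_1=\{A\}$, $S_2=\{B_1,B_2,B_3\}$, $S_3=\{C_1,C_2,C_3\}$, $S_4=\{D_1,D_2,D_3\}$, $S_5=\{E_1,E_2\}$.
(3) Decision sets $D_n(S_n)$ as listed in Section 3.1.
(4) State transition equation: $X_n=S_{n+1}$.
(5) Index function (distance): $d_n(s_n,x_n)$, e.g. $d_2(B_3,C_2)=1$, $d_3(C_2,D_3)=6$.
(6) Recursive index equation:
$$f_n^*(S_n)=\min_{x_n\in D_n(S_n)}\{\,d_n(s_n,x_n)+f_{n+1}^*(S_{n+1})\,\},\quad n=4,3,2,1;\qquad f_5^*(S_5)=\min_{x_5\in D_5(S_5)} d_5(s_5,x_5)$$

Stage 5:

| $S_5$ | $f_5^*(S_5)$ | $X_5^*$ |
|---|---|---|
| E1 | 1 | F |
| E2 | 2 | F |

Stage 4:

| $S_4$ | $X_4=E_1$ | $X_4=E_2$ | $f_4^*(S_4)$ | $X_4^*$ |
|---|---|---|---|---|
| D1 | 4+1=5 | 2+2=4 | 4 | E2 |
| D2 | 6+1=7 | 9+2=11 | 7 | E1 |
| D3 | 7+1=8 | 5+2=7 | 7 | E2 |

Stage 3:

| $S_3$ | $X_3=D_1$ | $X_3=D_2$ | $X_3=D_3$ | $f_3^*(S_3)$ | $X_3^*$ |
|---|---|---|---|---|---|
| C1 | 1+4=5 | 5+7=12 | / | 5 | D1 |
| C2 | 8+4=12 | 4+7=11 | 6+7=13 | 11 | D2 |
| C3 | 4+4=8 | 4+7=11 | 2+7=9 | 8 | D1 |

Stage 2:

| $S_2$ | $X_2=C_1$ | $X_2=C_2$ | $X_2=C_3$ | $f_2^*(S_2)$ | $X_2^*$ |
|---|---|---|---|---|---|
| B1 | 9+5=14 | 5+11=16 | / | 14 | C1 |
| B2 | 4+5=9 | 3+11=14 | 5+8=13 | 9 | C1 |
| B3 | / | 1+11=12 | 7+8=15 | 12 | C2 |

Stage 1:

| $S_1$ | $X_1=B_1$ | $X_1=B_2$ | $X_1=B_3$ | $f_1^*(S_1)$ | $X_1^*$ |
|---|---|---|---|---|---|
| A | 3+14=17 | 5+9=14 | 4+12=16 | 14 | B2 |

The length of the shortest route is $f_1^*(s_1)=14$. The route itself is recovered by following the optimal decisions $X_n^*$ forward from A through the stage tables.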
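The backward recursion of Example 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the names `edges`, `stages`, and `shortest_route` are invented here, and the edge lengths are the ones that appear in the stage-by-stage sums of the example.

```python
# Edge lengths of the example network: edges[u][v] = d(u, v).
edges = {
    "A":  {"B1": 3, "B2": 5, "B3": 4},
    "B1": {"C1": 9, "C2": 5},
    "B2": {"C1": 4, "C2": 3, "C3": 5},
    "B3": {"C2": 1, "C3": 7},
    "C1": {"D1": 1, "D2": 5},
    "C2": {"D1": 8, "D2": 4, "D3": 6},
    "C3": {"D1": 4, "D2": 4, "D3": 2},
    "D1": {"E1": 4, "E2": 2},
    "D2": {"E1": 6, "E2": 9},
    "D3": {"E1": 7, "E2": 5},
    "E1": {"F": 1},
    "E2": {"F": 2},
}

# State sets S_1 .. S_5, in stage order.
stages = [["A"], ["B1", "B2", "B3"], ["C1", "C2", "C3"],
          ["D1", "D2", "D3"], ["E1", "E2"]]


def shortest_route(edges, stages, terminal="F"):
    """Backward recursion f*(s) = min over x of d(s, x) + f*(x)."""
    f = {terminal: 0}   # optimal remaining distance from each state
    best = {}           # optimal decision X* for each state
    for stage in reversed(stages):          # last stage first
        for s in stage:
            x = min(edges[s], key=lambda nxt: edges[s][nxt] + f[nxt])
            f[s] = edges[s][x] + f[x]
            best[s] = x
    # Trace the optimal decisions forward from the start state.
    start = stages[0][0]
    route = [start]
    while route[-1] != terminal:
        route.append(best[route[-1]])
    return f[start], route


length, route = shortest_route(edges, stages)
print(length, " -> ".join(route))
```

The dictionary `f` plays the role of the column $f_n^*(S_n)$ and `best` the role of the column $X_n^*$ in the hand calculation.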
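The shortest route is A → B2 → C1 → D1 → E2 → F.

2. Resource allocation problem

A resource of total amount a is used to produce n products. If an amount $X_i$ is allocated to product i, the return is $g_i(X_i)$. How should the resource be allocated so that the total return is as large as possible?

Example 1. A nonferrous metals company plans to allocate 500,000 yuan to the technical upgrading of its three smelters. With 100,000 yuan as the smallest unit of allocation, the return of each plant as a function of its investment is given in the table below (investment in units of 100,000 yuan, return in units of 10,000 yuan):

| Investment | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Plant 1, $g_1(x_1)$ | 0 | 4.5 | 7 | 9 | 10.5 | 12 |
| Plant 2, $g_2(x_2)$ | 0 | 2 | 4.5 | 7.5 | 11 | 15 |
| Plant 3, $g_3(x_3)$ | 0 | 5 | 7 | 8 | 10 | 13 |

For the purpose of quantitative decision making, the company manager asks the systems analysis group to determine how the 500,000 yuan should be divided among the three plants so that the total return is maximal.

(1) Stage n = 1, 2, 3 (the plants).
(2) State $S_n$ (amount of resource still available): $S_1=5$; $S_2=S_1-X_1\in\{0,1,\dots,5\}$; $S_3=S_2-X_2\in\{0,1,\dots,5\}$.
(3) Decision variable $X_n$ (amount allocated): $X_1\in\{0,1,\dots,5\}$, $X_2\in\{0,1,\dots,5\}$, $X_3=S_3\in\{0,1,\dots,5\}$.
(4) State transition equation: $S_{n+1}=S_n-X_n$.
(5) Index function (return): $g_n(x_n)$, as given in the table above.
(6) Recursive index equation:
$$f_n^*(S_n)=\max_{0\le x_n\le S_n}\{\,g_n(x_n)+f_{n+1}^*(S_{n+1})\,\},\quad n=2,1;\qquad f_3^*(S_3)=\max_{0\le x_3\le S_3} g_3(x_3)$$

[Calculation tables for Plant 3, Plant 2, and Plant 1, linked by $S_3=S_2-x_2$ and $S_2=S_1-x_1$.]

Tracing the optimal decisions: $S_2=S_1-x_1=5-1=4$ and $S_3=S_2-x_2=4-3=1$.
The optimal policy is $P^*=\{x_1^*,x_2^*,x_3^*\}=\{1,3,1\}$ with $Z^*=17$ (in units of 10,000 yuan).

3. Knapsack problem

Example. There are 3 kinds of items, each available in unlimited quantity, with the weights and values shown below. A knapsack can carry a weight of W = 5 kg. How many of each item should be packed so that the value of the contents is as high as possible?

| Item | Weight $W_i$ (kg) | Value (yuan) |
|---|---|---|
| A (item 1) | 2 | 65 |
| B (item 2) | 3 | 80 |
| C (item 3) | 1 | 30 |

The problem can be stated as an integer program. Let $x_i$ be the number of items of kind i packed and Z the total value in the knapsack:
$$\max Z=65x_1+80x_2+30x_3$$
$$\text{s.t.}\quad 2x_1+3x_2+x_3\le 5,\qquad x_j\ge 0 \text{ and integer},\ j=1,2,3$$

Solution by dynamic programming:

(1) Stage n = 1, 2, 3 (the items).
(2) State $S_n$ (weight the knapsack can still take): $S_1=5$; $S_2=S_1-W_1X_1\in\{1,3,5\}$; $S_3=S_2-W_2X_2\in\{0,1,2,3,5\}$.
(3) Decision $X_n$ (number of items packed): $0\le X_1\le S_1/W_1$, $0\le X_2\le S_2/W_2$, $0\le X_3\le S_3/W_3$, so $X_1\in\{0,1,2\}$, $X_2\in\{0,1\}$, $X_3\in\{0,1,2,3,5\}$.
(4) State transition equation: $S_{n+1}=S_n-W_nX_n$.
(5) Stage index function (value): $r_1(x_1)=65x_1$, $r_2(x_2)=80x_2$, $r_3(x_3)=30x_3$.
(6) Recursive equation:
$$f_n^*(S_n)=\max_{0\le X_n\le S_n/W_n}\{\,r_n(X_n)+f_{n+1}^*(S_{n+1})\,\},\quad n=2,1;\qquad f_3^*(S_3)=\max_{0\le X_3\le S_3/W_3} r_3(X_3)$$

The calculation is carried out in tables, starting from the last stage.

n = 3: here $X_3\le S_3/W_3=S_3$, $X_3$ integer, and $f_3(S_3)=r_3(X_3)$.

| $S_3$ | $X_3=0$ | 1 | 2 | 3 | 5 | $f_3^*(S_3)$ | $X_3^*$ |
|---|---|---|---|---|---|---|---|
| 0 | 0 | | | | | 0 | 0 |
| 1 | 0 | 30 | | | | 30 | 1 |
| 2 | 0 | 30 | 60 | | | 60 | 2 |
| 3 | 0 | 30 | 60 | 90 | | 90 | 3 |
| 5 | 0 | 30 | 60 | 90 | 150 | 150 | 5 |

n = 2: here $X_2\le S_2/W_2=S_2/3$, $X_2$ integer, $S_3=S_2-W_2X_2$, and $f_2(S_2)=r_2(X_2)+f_3^*(S_3)$.

| $S_2$ | $X_2=0$ | $X_2=1$ | $f_2^*(S_2)$ | $X_2^*$ |
|---|---|---|---|---|
| 1 | 0+30=30 | | 30 | 0 |
| 3 | 0+90=90 | 80+0=80 | 90 | 0 |
| 5 | 0+150=150 | 80+60=140 | 150 | 0 |

n = 1: here $X_1\le S_1/W_1=S_1/2$, $X_1$ integer, $S_2=S_1-W_1X_1$, and $f_1(S_1)=r_1(X_1)+f_2^*(S_2)$.

| $S_1$ | $X_1=0$ | $X_1=1$ | $X_1=2$ | $f_1^*(S_1)$ | $X_1^*$ |
|---|---|---|---|---|---|
| 5 | 0+150=150 | 65+90=155 | 130+30=160 | 160 | 2 |

Tracing the optimal decisions: $S_2=S_1-W_1X_1^*=5-2\times 2=1$ and $S_3=S_2-W_2X_2^*=1-3\times 0=1$, so $X_3^*=1$.
The optimal policy is $X^*=\{x_1^*,x_2^*,x_3^*\}=\{2,0,1\}$ with $Z^*=f_1^*(S_1)=160$.
That is, packing 2 of the first item, 0 of the second, and 1 of the third gives the highest total value, 160 yuan.

4. Production problem

Example. A factory produces a single product. Demand over the next three months is 3, 4, and 3 (in units of 10,000 items). The setup cost is 30,000 yuan per production run, the unit cost is 1 yuan per item, and the storage cost is 0.7 yuan per item per month. Inventory is 0 at the beginning of month 1 and at the beginning of month 4, and monthly output is unlimited. Find the optimal production plan for the next three months. Quantities below are in units of 10,000 items and costs in units of 10,000 yuan.

[Figure: timeline of months 1, 2, 3, 4 with demands $D_1=3$, $D_2=4$, $D_3=3$.]

(1) Stage (month) n = 1, 2, 3, 4.
(2) State $S_n$ (inventory at the beginning of the month): $S_1=0$; $S_2=S_1+X_1-D_1\in\{0,1,\dots,7\}$; $S_3=S_2+X_2-D_2\in\{0,1,2,3\}$; $S_4=S_3+X_3-D_3=0$.
(3) Decision $X_n$ (production quantity): $X_1\in\{3,4,\dots,10\}$; $X_2\in\{0,1,\dots,7\}$; $X_3\in\{0,1,2,3\}$.
(4) State transition equation: $S_{n+1}=S_n+X_n-D_n$.
(5) Stage index function (cost): cost = production cost + storage cost,
$$r_n(X_n)=\begin{cases}3+1\cdot X_n, & X_n>0\\ 0, & X_n=0\end{cases}\;+\;0.7\,S_n$$
(6) Recursive equation:
$$f_n^*(S_n)=\min_{X_n}\{\,r_n(X_n)+f_{n+1}^*(S_{n+1})\,\},\quad n=2,1;\qquad f_3^*(S_3)=r_3(X_3)\ \text{with}\ X_3=3-S_3$$

n = 3: here $S_3+X_3-D_3=0$, that is, $X_3=3-S_3$, and $f_3(S_3)=r_3(X_3)$.

| $S_3$ | $X_3=0$ | 1 | 2 | 3 | $f_3^*(S_3)$ | $X_3^*$ |
|---|---|---|---|---|---|---|
| 0 | | | | 6+0=6 | 6 | 3 |
| 1 | | | 5+0.7=5.7 | | 5.7 | 2 |
| 2 | | 4+1.4=5.4 | | | 5.4 | 1 |
| 3 | 0+2.1=2.1 | | | | 2.1 | 0 |

n = 2: because $0\le S_3\le 3$ and $S_3=S_2+X_2-D_2$, we have $0\le S_2+X_2-4\le 3$, so $4-S_2\le X_2\le 7-S_2$. Each entry is $f_2(S_2)=r_2(X_2)+f_3^*(S_3)$.

| $S_2$ | $X_2=0$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | $f_2^*(S_2)$ | $X_2^*$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | | | | 7+6 | 8+5.7 | 9+5.4 | 10+2.1 | 12.1 | 7 |
| 1 | | | | 6.7+6 | 7.7+5.7 | 8.7+5.4 | 9.7+2.1 | | 11.8 | 6 |
| 2 | | | 6.4+6 | 7.4+5.7 | 8.4+5.4 | 9.4+2.1 | | | 11.5 | 5 |
| 3 | | 6.1+6 | 7.1+5.7 | 8.1+5.4 | 9.1+2.1 | | | | 11.2 | 4 |
| 4 | 2.8+6 | 6.8+5.7 | 7.8+5.4 | 8.8+2.1 | | | | | 8.8 | 0 |
| 5 | 3.5+5.7 | 7.5+5.4 | 8.5+2.1 | | | | | | 9.2 | 0 |
| 6 | 4.2+5.4 | 8.2+2.1 | | | | | | | 9.6 | 0 |
| 7 | 4.9+2.1 | | | | | | | | 7 | 0 |

n = 1: because $0\le S_2\le 7$ and $S_2=S_1+X_1-D_1$, we have $0\le S_1+X_1-3\le 7$, so $3-S_1\le X_1\le 10-S_1$. Each entry is $f_1(S_1)=r_1(X_1)+f_2^*(S_2)$.

| $S_1$ | $X_1=3$ | 4 | 5 | 6 | 7 | 8 | 9 | 10 | $f_1^*(S_1)$ | $X_1^*$ |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6+12.1 | 7+11.8 | 8+11.5 | 9+11.2 | 10+8.8 | 11+9.2 | 12+9.6 | 13+7 | 18.1 | 3 |

It follows that:
$S_1=0$, so $X_1^*=3$;
$S_2=S_1+X_1^*-3=0+3-3=0$, so $X_2^*=7$;
$S_3=S_2+X_2^*-4=0+7-4=3$, so $X_3^*=0$.

The optimal policy is $X^*=\{x_1^*,x_2^*,x_3^*\}=\{3,7,0\}$ with $Z^*=f_1^*(S_1)=18.1$.
That is, producing 30,000 items in the first month, 70,000 in the second, and none in the third gives the lowest total cost, 181,000 yuan.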
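The recursion of the production example can be checked with a short memoized sketch. The function names and the choice of units are assumptions made here for illustration: costs are expressed in thousands of yuan per 10,000 items (setup 30, unit cost 10, storage 7) so that all arithmetic stays in integers, and months are indexed from 0.

```python
from functools import lru_cache

DEMAND = (3, 4, 3)             # demand per month, in 10,000 items
SETUP, UNIT, HOLD = 30, 10, 7  # thousand yuan: per run, per 10,000 items, per 10,000 items stored


def stage_cost(stock, produced):
    """Production cost (setup + unit cost, only if anything is produced) plus storage on opening stock."""
    production = SETUP + UNIT * produced if produced > 0 else 0
    return production + HOLD * stock


@lru_cache(maxsize=None)
def f(month, stock):
    """Return (minimum cost from this month onward, optimal production quantity)."""
    if month == len(DEMAND):
        # Closing inventory must be zero.
        return (0, None) if stock == 0 else (float("inf"), None)
    remaining = sum(DEMAND[month:])
    best = (float("inf"), None)
    # Produce enough to meet this month's demand, never more than total remaining demand.
    for x in range(max(0, DEMAND[month] - stock), remaining - stock + 1):
        cost = stage_cost(stock, x) + f(month + 1, stock + x - DEMAND[month])[0]
        if cost < best[0]:
            best = (cost, x)
    return best


def plan(stock=0):
    """Trace the optimal production quantities forward from the opening stock."""
    quantities = []
    for month in range(len(DEMAND)):
        x = f(month, stock)[1]
        quantities.append(x)
        stock += x - DEMAND[month]
    return quantities


total_cost, _ = f(0, 0)
print(total_cost / 10, plan())  # total cost in units of 10,000 yuan, then the plan
```

Dividing the result by 10 converts it back to the units of 10,000 yuan used in the tables.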
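The resource allocation example (three smelters, five units of 100,000 yuan) states only its result, so a small tabular sketch of the backward recursion may help in verifying it. The names `G`, `TOTAL`, and `allocate` are hypothetical; the return values are those of $g_1$, $g_2$, $g_3$ in the example, and, as in the example's terminal condition, the last plant is allowed to use any amount up to what remains.

```python
# Returns g_n(x) for x = 0..5 units invested (units of 10,000 yuan), one row per plant.
G = [
    [0, 4.5, 7, 9, 10.5, 12],   # plant 1
    [0, 2, 4.5, 7.5, 11, 15],   # plant 2
    [0, 5, 7, 8, 10, 13],       # plant 3
]
TOTAL = 5  # units of 100,000 yuan available


def allocate(returns, total):
    """Backward recursion f_n(s) = max over 0 <= x <= s of g_n(x) + f_{n+1}(s - x)."""
    n = len(returns)
    f = [[0.0] * (total + 1) for _ in range(n + 1)]   # f[n][s] = 0 after the last plant
    choice = [[0] * (total + 1) for _ in range(n)]    # optimal x for each (plant, s)
    for k in range(n - 1, -1, -1):                    # last plant first
        for s in range(total + 1):
            x_best = max(range(s + 1),
                         key=lambda x: returns[k][x] + f[k + 1][s - x])
            choice[k][s] = x_best
            f[k][s] = returns[k][x_best] + f[k + 1][s - x_best]
    # Trace the optimal allocation forward from the full budget.
    plan, s = [], total
    for k in range(n):
        plan.append(choice[k][s])
        s -= choice[k][s]
    return f[0][total], plan


value, plan = allocate(G, TOTAL)
print(value, plan)
```

The row `f[k]` corresponds to the column $f_n^*(S_n)$ of a hand-built stage table, and `choice[k]` to the column of optimal decisions.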