Static Code Analysis (静态代码分析-PPT.ppt)

Uploaded: 2024-05-09. Format: PPT. Pages: 54. Size: 330 KB.
Static Code Analysis

Outline
- Motivation
- Static program analysis (concepts + examples)
- Program defect analysis (research work)

Motivation
- Characteristics of cloud platforms:
  - Applications are deployed directly on cloud servers, which creates security risks:
    - an application can directly manipulate and damage the server file system;
    - an application with security vulnerabilities gives attackers an entry point.
  - Resources are shared and dynamically allocated:
    - a single poorly performing application can encroach on the resources of other applications.
- One solution: run static code analysis on an application before deploying it.
  - Does it make prohibited calls? (illegal file access)
  - Does it contain inefficient code? (heavy String concatenation without StringBuilder)
  - Does it contain security vulnerabilities? (SQL injection, cross-site scripting, denial of service)
  - Does it contain malicious code or viruses?

Static Code Analysis
- Definition: static program analysis, or static analysis for short, is the technique of analyzing a program without executing it.
- Contrast:
  - Dynamic program analysis requires actually executing the program.
  - Program comprehension: "static analysis" usually refers to analysis performed by automated tools, while manual analysis is usually called program comprehension.
- Uses: program translation and compilation (compilers), program optimization and refactoring, software defect detection, and so on.
- Process: in most cases the input is source code or intermediate code (such as Java bytecode); object code is used only rarely. The analysis results are output in a specific form.

Static Code Analysis: Topics
- Basic Blocks
- Control Flow Graph
- Dataflow Analysis
  - Live Variable Analysis
  - Reaching Definition Analysis
- Lattice Theory

Basic Blocks
- A basic block is a maximal sequence of consecutive three-address instructions with the following properties:
  - The flow of control can only enter the basic block through the first instruction.
  - Control leaves the block without halting or branching, except possibly at the last instruction.
- Basic blocks become the nodes of a flow graph, with edges indicating the order.

Basic Block Example
   1) i = 1
   2) j = 1
   3) t1 = 10 * i
   4) t2 = t1 + j
   5) t3 = 8 * t2
   6) t4 = t3 - 88
   7) a[t4] = 0.0
   8) j = j + 1
   9) if j <= 10 goto (3)
  10) i = i + 1
  11) if i <= 10 goto (2)
  12) i = 1
  13) t5 = i - 1
  14) t6 = 88 * t5
  15) a[t6] = 1.0
  16) i = i + 1
  17) if i <= 10 goto (13)
Leaders (the first instruction of each basic block): 1, 2, 3, 10, 12, 13.
[Figure: the resulting basic blocks]

Control-Flow Graphs
- Control-flow graph:
  - Node: an instruction or a sequence of instructions (a basic block).
    - Two instructions i, j are in the same basic block iff execution of i guarantees execution of j.
  - Directed edge: potential flow of control.
  - Distinguished Entry and Exit nodes: the first and last instructions in the program.

Control-Flow Edges
- Basic blocks = nodes.
- Edges: add a directed edge from B1 to B2 if:
  - there is a branch from the last statement of B1 to the first statement of B2 (B2 is a leader), or
  - B2 immediately follows B1 in program order and B1 does not end with an unconditional branch (goto).
- Definition of predecessor and successor: B1 is a predecessor of B2, and B2 is a successor of B1.

CFG Example
[Figure: control-flow graph of the example]

Dataflow Analysis
- Compile-time reasoning about the run-time values of variables or expressions
- at different program points:
  - Which assignment statements produced the value of a variable at this point?
  - Which variables contain values that are no longer used after this program point?
  - What is the range of possible values of a variable at this program point?

Program Points
- One program point before each node
- One program point after each node
- Join point: a point with multiple predecessors
- Split point: a point with multiple successors

Live Variable Analysis
- A variable v is live at point p if v is used along some path starting at p and there is no definition of v along that path before the use.
- When is a variable v dead at point p?
  - There is no use of v on any path from p to the exit node, or
  - all paths from p redefine v before using v.

What Use is Liveness Information?
- Register allocation: if a variable is dead, its register can be reassigned.
- Dead code elimination: eliminate assignments to variables that are not read later.
  - But the last assignment to a variable visible outside the CFG (such as an instance variable) must not be eliminated.
  - Other dead assignments can be eliminated.
  - Handle this by making all externally visible variables live on exit from the CFG.

Conceptual Idea of Analysis
- Start from the exit and go backwards in the CFG.
- Compute liveness information from the end to the beginning of basic blocks.

Liveness Example
Statements shown in the slide's CFG (the edges are not recoverable from the extracted text):
  a = x+y; t = a; c = a+x; x = 0
  b = t+z;
  c = y+1;
- Assume a, b, c are visible outside the method, so they are live on exit.
- Assume x, y, z, t are not visible.
- Represent liveness using a bit vector; the order is abcxyzt.

Formalizing Analysis
- Each basic block has:
  - IN: the set of variables live at the start of the block
  - OUT: the set of variables live at the end of the block
  - USE: the set of variables with upwards exposed uses in the block (use prior to definition)
  - DEF: the set of variables defined in the block prior to use
- USE[x = z; x = x+1;] = {z} (x is not in USE)
- DEF[x = z; x = x+1; y = 1;] = {x, y}
- The compiler scans each basic block to derive the USE and DEF sets.

Algorithm
  for all nodes n in N - {Exit}
      IN[n] = emptyset;
  OUT[Exit] = emptyset;
  IN[Exit] = use[Exit];
  Changed = N - {Exit};
  while (Changed != emptyset)
      choose a node n in Changed;
      Changed = Changed - {n};
      OUT[n] = emptyset;
      for all nodes s in successors(n)
          OUT[n] = OUT[n] U IN[s];
      IN[n] = use[n] U (OUT[n] - def[n]);
      if (IN[n] changed)
          for all nodes p in predecessors(n)
              Changed = Changed U {p};

Reaching Definitions
- Concept of definition and use: a = x+y is a definition of a and a use of x and y.
- A definition reaches a use if the value written by the definition may be read by the use.

Reaching Definitions Example
Statements shown in the slide's CFG:
  s = 0; a = 4; i = 0; k == 0
  b = 1;
  b = 2;
  i < n
  s = s + a*b; i = i + 1;
  return s

Reaching Definitions and Constant Propagation
- Is a use of a variable a constant?
  - Check all reaching definitions.
  - If all of them assign the variable the same constant, then the use is in fact a constant.
- The variable can then be replaced with the constant.

Is the variable a constant in s = s + a*b?
- Yes! On all reaching definitions, a = 4.

Constant Propagation Transform
- s = s + a*b becomes s = s + 4*b, because a = 4 on all reaching definitions.

Computing Reaching Definitions
- Compute with sets of definitions:
  - represent sets using bit vectors;
  - each definition has a position in the bit vector.
- At each basic block, compute:
  - the definitions that reach the start of the block;
  - the definitions that reach the end of the block.
- Do the computation by simulating execution of the program until a fixed point is reached.
Numbered definitions in the example:
  1: s = 0; 2: a = 4; 3: i = 0; k == 0
  4: b = 1;
  5: b = 2;
  i < n
  6: s = s + a*b; 7: i = i + 1;
  return s

Formalizing Reaching Definitions
- Each basic block has:
  - IN: the set of definitions that reach the beginning of the block
  - OUT: the set of definitions that reach the end of the block
  - GEN: the set of definitions generated in the block
  - KILL: the set of definitions killed in the block
- GEN[s = s + a*b; i = i + 1;] = 0000011
- KILL[s = s + a*b; i = i + 1;] = 1010000
- The compiler scans each basic block to derive the GEN and KILL sets.

Example
[Figure: reaching definitions computed on the example CFG]

Forwards vs. Backwards
- A forwards analysis computes, for each program point, information about past behavior. Examples are available expressions and reaching definitions. Calculation: uses the predecessors of CFG nodes.
- A backwards analysis computes, for each program point, information about future behavior. Examples are liveness and very busy expressions. Calculation: uses the successors of CFG nodes.

May vs. Must
- A may analysis describes information that may possibly be true and thus computes an upper approximation. Examples are liveness and reaching definitions. Calculation: union operator.
- A must analysis describes information that must definitely be true and thus computes a lower approximation. Examples are available expressions and very busy expressions. Calculation: intersection operator.

Lattice Theory: Basic Idea
- Information about the program is represented using values from an algebraic structure called a lattice.
- The analysis produces a lattice value for each program point.
- Two flavors of analysis:
  - forward dataflow analysis
  - backward dataflow analysis

Partial Orders
- Set $P$
- Partial order $\sqsubseteq$ such that $\forall x, y, z \in P$:
  - $x \sqsubseteq x$ (reflexive)
  - $x \sqsubseteq y$ and $y \sqsubseteq x$ implies $x = y$ (antisymmetric)
  - $x \sqsubseteq y$ and $y \sqsubseteq z$ implies $x \sqsubseteq z$ (transitive)
- A partial order can be used to define:
  - upper and lower bounds
  - least upper bound
  - greatest lower bound

Upper Bounds
- If $S \subseteq P$ then:
  - $x \in P$ is an upper bound of $S$ if $\forall y \in S.\ y \sqsubseteq x$
  - $x \in P$ is the least upper bound of $S$ if $x$ is an upper bound of $S$ and $x \sqsubseteq y$ for all upper bounds $y$ of $S$
- $\vee$: join, least upper bound (lub), supremum, sup
  - $\bigvee S$ is the least upper bound of $S$
  - $x \vee y$ is the least upper bound of $\{x, y\}$

Lower Bounds
- If $S \subseteq P$ then:
  - $x \in P$ is a lower bound of $S$ if $\forall y \in S.\ x \sqsubseteq y$
  - $x \in P$ is the greatest lower bound of $S$ if $x$ is a lower bound of $S$ and $y \sqsubseteq x$ for all lower bounds $y$ of $S$
- $\wedge$: meet, greatest lower bound (glb), infimum, inf
  - $\bigwedge S$ is the greatest lower bound of $S$
  - $x \wedge y$ is the greatest lower bound of $\{x, y\}$

Covering
- $x \sqsubset y$ if $x \sqsubseteq y$ and $x \neq y$
- $x$ is covered by $y$ ($y$ covers $x$) if $x \sqsubset y$, and $x \sqsubseteq z \sqsubset y$ implies $x = z$
- Conceptually, $y$ covers $x$ if there are no elements between $x$ and $y$.

Example
- $P = \{000, 001, 010, 011, 100, 101, 110, 111\}$ (the standard Boolean lattice, also called a hypercube)
- $x \sqsubseteq y$ if (x bitwise-and y) = x
[Figure: Hasse diagram with 111 at the top; 011, 101, 110 below it; 010, 001, 100 below those; 000 at the bottom]

Hasse Diagram
- If $y$ covers $x$:
  - draw a line from $y$ to $x$;
  - place $y$ above $x$ in the diagram.

Lattices
- If $x \wedge y$ and $x \vee y$ exist for all $x, y \in P$, then $P$ is a lattice.
- If $\bigwedge S$ and $\bigvee S$ exist for all $S \subseteq P$, then $P$ is a complete lattice.
- All finite lattices are complete.
- Example of a lattice that is not complete: the integers $I$.
  - For any $x, y \in I$, $x \vee y = \max(x, y)$ and $x \wedge y = \min(x, y)$.
  - But $\bigvee I$ and $\bigwedge I$ do not exist.
  - $I \cup \{+\infty, -\infty\}$ is a complete lattice.

Lattice Examples
[Figure: examples of lattices and non-lattices]

Semi-Lattice
- Only one of the two binary operations (meet or join) exists.
  - Meet-semilattice: $x \wedge y$ exists for all $x, y \in P$
  - Join-semilattice: $x \vee y$ exists for all $x, y \in P$

Monotonic Function & Fixed Point
- Let $L$ be a lattice. A function $f: L \to L$ is monotonic if $\forall x, y \in L:\ x \sqsubseteq y \Rightarrow f(x) \sqsubseteq f(y)$.
- Let $A$ be a set, $f: A \to A$ a function, and $a \in A$. If $f(a) = a$, then $a$ is called a fixed point of $f$ on $A$.

Existence of Fixed Points
- The height of a lattice is defined to be the length of the longest path from $\bot$ to $\top$.
- In a complete lattice $L$ with finite height, every monotonic function $f: L \to L$ has a unique least fixed point:
  $\mathrm{fix}(f) = \bigvee_{i \ge 0} f^i(\bot)$

Knaster-Tarski Fixed Point Theorem
- Suppose $(L, \sqsubseteq)$ is a complete lattice and $f: L \to L$ is a monotonic function.
- Then the fixed point $m$ of $f$ can be defined as: [formula missing from the extracted slide text]

Calculating Fixed Point
- The time complexity of computing a fixed point depends on three factors:
  - the height of the lattice, since this provides a bound for $i$;
  - the cost of computing $f$;
  - the cost of testing equality.
- The computation of a fixed point can be illustrated as a walk up the lattice starting at $\bot$.
[Figure: walk up the lattice from $\bot$]

Application to Dataflow Analysis
- Dataflow information will be lattice values.
  - Transfer functions operate on lattice values.
  - The solution algorithm generates an increasing sequence of values at each program point.
  - The ascending chain condition ensures termination.
- $\vee$ is used to combine values at control-flow join points.

Transfer Functions
- There is a transfer function $f: P \to P$ for each node in the control flow graph.
- $f$ models the effect of the node on the program information.
- Each dataflow analysis problem has a set $F$ of transfer functions $f: P \to P$:
  - The identity function $i \in F$.
  - $F$ must be closed under composition: $\forall f, g \in F$, the function $h = \lambda x.\ f(g(x)) \in F$.
  - Each $f \in F$ must be monotone: $x \sqsubseteq y$ implies $f(x) \sqsubseteq f(y)$.
  - Sometimes all $f \in F$ are distributive: $f(x \vee y) = f(x) \vee f(y)$.
  - Distributivity implies monotonicity.

Course Assessment
- Assignment (submitted to the course platform and demonstrated) + course report
- Assignment topics:
  - Code comment extraction and documentation generation
  - Code statistics: total lines, lines of code, number of classes, number of methods, method length, and so on
  - Automatic conversion of LaTeX documents to PDF
  - Online code diff
  - Converting an executable JAR into an exe program with a specific icon
  - Detection of various code defects: memory leaks, null pointer exceptions
  - Automatic test case generation
  - Script defect analysis: JavaScript, Python, Ruby, PHP
  - C# code defect analysis
  - Online compression, decompression, encryption, decryption

Questions? Thank you!
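Worked sketch: leaders, basic blocks, and control-flow edges. The following is a minimal Python sketch of the block-splitting and edge rules from the Basic Blocks and Control-Flow Edges slides, applied to the 17-instruction example. Only the jump structure of that example is encoded (instruction 9 jumps to 3, 11 to 2, 17 to 13). The function names, the `jumps` dictionary encoding, and the 0-based block indices are assumptions made for illustration; they do not come from the slides.

```python
def find_leaders(jumps, n):
    """Return sorted leader instruction numbers (1-based).

    jumps maps the number of a jump instruction to its target.
    A leader is the first instruction, any jump target, or any
    instruction that immediately follows a jump.
    """
    leaders = {1}
    for src, target in jumps.items():
        leaders.add(target)
        if src + 1 <= n:
            leaders.add(src + 1)
    return sorted(leaders)


def basic_blocks(leaders, n):
    """Each block runs from a leader up to just before the next leader."""
    ends = [leader - 1 for leader in leaders[1:]] + [n]
    return list(zip(leaders, ends))


def cfg_edges(blocks, jumps, unconditional=frozenset()):
    """Apply the two edge rules: branch edges and fall-through edges.

    unconditional holds the numbers of unconditional gotos; a block
    ending in one of them gets no fall-through edge.
    """
    block_of_start = {start: i for i, (start, _) in enumerate(blocks)}
    edges = set()
    for i, (_, end) in enumerate(blocks):
        if end in jumps:
            edges.add((i, block_of_start[jumps[end]]))
        if end not in unconditional and i + 1 < len(blocks):
            edges.add((i, i + 1))
    return sorted(edges)


# Jump structure of the slide example; all three jumps are conditional.
JUMPS = {9: 3, 11: 2, 17: 13}
N_INSTRUCTIONS = 17

leaders = find_leaders(JUMPS, N_INSTRUCTIONS)
blocks = basic_blocks(leaders, N_INSTRUCTIONS)
edges = cfg_edges(blocks, JUMPS)
print(leaders)
print(blocks)
print(edges)
```

Self-loops such as the edge from the third block to itself correspond to the inner loop whose body and back-jump sit in a single block.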
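Worked sketch: live variable analysis. The worklist algorithm from the Algorithm slide can be sketched in Python as below. It propagates liveness backwards using IN[n] = USE[n] U (OUT[n] - DEF[n]), with OUT[n] taken as the union of IN over the successors of n. The CFG shape is an assumption, because the extracted slide text does not preserve the edges: here B1 (a = x+y; t = a; c = a+x) branches to B2 (b = t+z) and B3 (c = y+1), B2 falls through to B3, and B3 leads to Exit. The block names, the hand-derived USE/DEF sets, and the helper names are likewise illustrative only.

```python
def liveness(succ, use, defs, exit_node):
    """Backward worklist liveness; returns (IN, OUT) as dicts of sets."""
    pred = {n: set() for n in succ}
    for n, successors in succ.items():
        for s in successors:
            pred[s].add(n)

    live_in = {n: set() for n in succ}
    live_out = {n: set() for n in succ}
    live_in[exit_node] = set(use[exit_node])

    changed = set(succ) - {exit_node}
    while changed:
        n = changed.pop()
        # OUT[n] is the union of IN[s] over all successors s of n.
        live_out[n] = set()
        for s in succ[n]:
            live_out[n] |= live_in[s]
        new_in = use[n] | (live_out[n] - defs[n])
        if new_in != live_in[n]:
            live_in[n] = new_in
            changed |= pred[n]  # predecessors must be revisited
    return live_in, live_out


def bits(variables, order="abcxyzt"):
    """Render a set of variables as a bit vector in the slide's order."""
    return "".join("1" if v in variables else "0" for v in order)


# Assumed CFG shape (see the text above); Exit uses the visible a, b, c.
SUCC = {"B1": ["B2", "B3"], "B2": ["B3"], "B3": ["Exit"], "Exit": []}
USE = {"B1": {"x", "y"}, "B2": {"t", "z"}, "B3": {"y"}, "Exit": {"a", "b", "c"}}
DEF = {"B1": {"a", "t", "c"}, "B2": {"b"}, "B3": {"c"}, "Exit": set()}

live_in, live_out = liveness(SUCC, USE, DEF, "Exit")
for block in ("B1", "B2", "B3"):
    print(block, "IN", bits(live_in[block]), "OUT", bits(live_out[block]))
```

The order in which nodes are taken from the worklist does not affect the result, since the iteration converges to the same fixed point; it only affects how many times nodes are revisited.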
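Worked sketch: reaching definitions and the constant check. The sketch below runs a forward may-analysis with OUT[n] = GEN[n] U (IN[n] - KILL[n]), where IN[n] is the union of OUT over the predecessors of n, on the seven numbered definitions from the Computing Reaching Definitions slide. The CFG edges are assumed (B1 branches to B2 and B3, both join at the loop test B4, B5 is the loop body, B6 returns), and GEN/KILL are derived under the simplifying assumption that a block defines each variable at most once. All names are hypothetical.

```python
# Definition number -> (variable, constant value or None if not constant).
DEFS = {1: ("s", 0), 2: ("a", 4), 3: ("i", 0), 4: ("b", 1),
        5: ("b", 2), 6: ("s", None), 7: ("i", None)}

# Definitions contained in each block, and the assumed predecessor map.
BLOCK_DEFS = {"B1": [1, 2, 3], "B2": [4], "B3": [5],
              "B4": [], "B5": [6, 7], "B6": []}
PRED = {"B1": [], "B2": ["B1"], "B3": ["B1"],
        "B4": ["B2", "B3", "B5"], "B5": ["B4"], "B6": ["B4"]}


def gen_kill(block_defs):
    """GEN = definitions in the block; KILL = other defs of the same variables."""
    gen, kill = {}, {}
    for block, ds in block_defs.items():
        variables = {DEFS[d][0] for d in ds}
        gen[block] = set(ds)
        kill[block] = {d for d, (v, _) in DEFS.items() if v in variables} - gen[block]
    return gen, kill


def reaching_definitions(pred, gen, kill):
    """Iterate the forward equations until nothing changes (fixed point)."""
    rd_in = {n: set() for n in pred}
    rd_out = {n: set(gen[n]) for n in pred}
    changed = True
    while changed:
        changed = False
        for n in pred:
            rd_in[n] = set().union(*(rd_out[p] for p in pred[n]))
            new_out = gen[n] | (rd_in[n] - kill[n])
            if new_out != rd_out[n]:
                rd_out[n] = new_out
                changed = True
    return rd_in, rd_out


def constant_value(var, reaching):
    """Return the constant if every reaching definition of var assigns it."""
    values = {DEFS[d][1] for d in reaching if DEFS[d][0] == var}
    if len(values) == 1 and None not in values:
        return values.pop()
    return None


def bits(definitions):
    """Bit vector with one position per definition 1..7."""
    return "".join("1" if d in definitions else "0" for d in range(1, 8))


gen, kill = gen_kill(BLOCK_DEFS)
rd_in, rd_out = reaching_definitions(PRED, gen, kill)
print(bits(gen["B5"]), bits(kill["B5"]))
print(constant_value("a", rd_in["B5"]), constant_value("b", rd_in["B5"]))
```

Under these assumptions, a is constant at the use in s = s + a*b because only definition 2 reaches it, whereas b is not, because definitions 4 and 5 both reach it with different values.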
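Worked sketch: least fixed point on the Boolean lattice. The walk up the lattice from the Calculating Fixed Point slide can be illustrated on the 3-bit Boolean lattice from the lattice Example slide, where x is below y when (x bitwise-and y) = x. The sketch repeatedly applies a monotone function starting from the bottom element 000 until the value stops changing. The function `f` is an arbitrary monotone function chosen only for illustration, and the helper names are assumptions.

```python
ELEMENTS = range(8)  # 000 .. 111 as integers


def leq(x, y):
    """Lattice order on bit vectors: x is below y iff (x & y) == x."""
    return (x & y) == x


def is_monotone(f):
    """Brute-force check that x <= y implies f(x) <= f(y)."""
    return all(leq(f(x), f(y))
               for x in ELEMENTS for y in ELEMENTS if leq(x, y))


def least_fixed_point(f, bottom=0):
    """Iterate bottom, f(bottom), f(f(bottom)), ... until it stabilizes.

    Termination relies on f being monotone and the lattice having
    finite height; the returned chain records the walk up the lattice.
    """
    chain = [bottom]
    while True:
        nxt = f(chain[-1])
        if nxt == chain[-1]:
            return nxt, chain
        chain.append(nxt)


def f(x):
    """Illustrative monotone function: set bit 0 and shift the rest up."""
    return 0b001 | ((x << 1) & 0b111)


fixed_point, chain = least_fixed_point(f)
print(format(fixed_point, "03b"), [format(c, "03b") for c in chain])
```

The chain here has three upward steps, which matches the height of this lattice and illustrates why the height bounds the number of iterations.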