Algorithms in General Asynchronous Networks
Shen Zhuowei (zwshen)
Room 347, Computer Building, Jiulonghu Campus
Tel: 52090919, 13390922952
March 2011
2025/8/27 (Wednesday)

Leader election in general asynchronous networks
- Undirected graphs.
- Can get an asynchronous version of the synchronous FloodMax algorithm:
  - Simulate rounds with counters.
  - Need to know the diameter for termination.
- FloodMax algorithm:
  - Every round: send the max UID seen so far to all neighbors.
  - Stop after diam rounds.
  - Elect self iff own UID is the max seen.
- We'll see better asynchronous algorithms later:
  - Don't need to know the diameter.
  - Lower message complexity.
- These depend on techniques such as:
  - Breadth-first search
  - Convergecast using a spanning tree
  - Synchronizers to simulate synchronous algorithms
  - Consistent global snapshots to detect termination

Spanning tree and searching
- Spanning trees are used for communication, e.g., broadcast/convergecast.
- Start with the simple task of setting up some (arbitrary) spanning tree with a (given) root i_0.
- Assume:
  - Undirected, connected graph (i.e., bidirectional communication).
  - Root i_0.
  - Size and diameter unknown.
  - UIDs, with comparisons.
  - Can identify in- and out-edges to the same neighbor.
- Require: each process should output its parent in the tree, with a parent output action.
- Starting point: the SynchBFS algorithm:
  - i_0 floods a search message; the parent of a node is the first node from which it receives a search message.
- Try running the same algorithm in an asynchronous network.
- It still yields a spanning tree, but not necessarily a breadth-first tree.

AsynchSpanningTree, process i
[Slide body not preserved in the extracted text.]

Asynchronous spanning tree
[Figures: step-by-step example execution in which search messages (labeled S) spread through the network; images not preserved.]
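The flooding rule described above (the first search message received determines the parent) can be sketched as a small simulation. This is only an illustration under stated assumptions, not the automaton from the slides: the function names `asynch_spanning_tree` and `depth`, the example graph, and the choice to forward search messages to every neighbor except the sender are all assumptions made here, and asynchrony is modeled by delivering pending messages in a seeded random order.

```python
import random

def asynch_spanning_tree(adj, root, seed=0):
    """Flood search messages from root, delivering pending messages in random order.

    adj maps each node to a list of its neighbors (undirected graph).
    Returns (parent, messages_sent).
    """
    rng = random.Random(seed)
    parent = {root: None}
    # Each in-flight search message is a (sender, receiver) pair.
    in_flight = [(root, nbr) for nbr in adj[root]]
    messages_sent = len(in_flight)
    while in_flight:
        # Asynchrony: any pending message may be the next one delivered.
        sender, receiver = in_flight.pop(rng.randrange(len(in_flight)))
        if receiver not in parent:
            # First search message received: adopt the sender as parent.
            parent[receiver] = sender
            # Assumption: forward to all neighbors except the new parent.
            forwarded = [(receiver, nbr) for nbr in adj[receiver] if nbr != sender]
            in_flight.extend(forwarded)
            messages_sent += len(forwarded)
    return parent, messages_sent

def depth(parent, node):
    """Number of tree edges from node up to the root."""
    hops = 0
    while parent[node] is not None:
        node = parent[node]
        hops += 1
    return hops

# Square with one diagonal: node 3 is at distance 2 from the root 0.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2]}
for seed in range(5):
    parent, sent = asynch_spanning_tree(graph, root=0, seed=seed)
    print(seed, parent, "depth of 3:", depth(parent, 3), "messages:", sent)
```

Depending on the delivery order, node 3 can end up at tree depth 3 even though its graph distance from the root is 2, which is the "not necessarily breadth-first" behavior noted above.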
AsynchSpanningTree: complexity
- Messages: O(|E|).
- Time: diam (l + d) + l.
- Anomaly: paths may be longer than the diameter! In asynchronous networks, messages may travel faster along longer paths.

Applications of AsynchSpanningTree
- Similar to synchronous BFS.
- Message broadcast: piggyback on search messages.
- Child pointers: add responses to search messages; easy because of bidirectional communication.
- Use a precomputed tree for bcast/convergecast:
  - Now the timing anomaly arises.
  - O(h (l + d)) time complexity.
  - O(|E|) message complexity.
  - h = height of tree; may be n.

Applications of BFS
- Global computation:
  - Sum, max, or any kind of data aggregation: convergecast on the BFS tree.
  - Complexity: time O(diameter); messages O(n).
- Leader election (without knowing the diameter):
  - Everyone starts a BFS, determines the max UID.
  - Complexity: time O(diam); messages O(n |E|) (actually, O(diam |E|)).
- Compute the diameter:
  - All do BFS.
  - Convergecast to find the height of each BFS tree.
  - Convergecast again to find the max of all heights.

More applications
- Asynchronous broadcast/convergecast:
  - Can also construct the spanning tree while using it to broadcast a message and to collect responses.
  - E.g., to tell the root when the bcast is done, or to collect aggregated data.
  - Complexity:
    - O(|E|) message complexity.
    - O(n (l + d)) time complexity, because of the timing anomaly.
- Elect a leader when nodes have no info about the network (no knowledge of n, diam, etc.; no root, no spanning tree).

Breadth-first spanning tree
- Assume (same as above):
  - Undirected, connected graph (i.e., bidirectional communication).
  - Root i_0.
  - Size and diameter unknown.
  - UIDs, with comparisons.
- Require: each process should output its parent in a breadth-first spanning tree.
- In asynchronous networks, the modified SynchBFS does not guarantee that the spanning tree constructed is breadth-first.
  - Long paths may be traversed faster than short ones.
- Can modify each process to keep track of its distance and change its parent when it hears of a shorter path.
  - Relaxation algorithm (like Bellman-Ford).
  - Must inform neighbors of changes.
  - Eventually, the tree stabilizes to a breadth-first spanning tree.

AsynchBFS
[Figures: step-by-step example execution showing the distance estimates held at each node and carried by messages; images not preserved.]

AsynchBFS: complexity
- Messages: O(n |E|).
  - May send O(n) messages on each link (one for each distance estimate).
- Time: O(diam n (l + d)) (taking pileups into account).
- Can reduce complexity if a bound D on the diameter is known:
  - Allow only distance estimates ≤ D.
  - Messages: O(D |E|); time: O(diam D (l + d)).

AsynchBFS: termination
- No one knows when this is done, so processes can't produce parent outputs.
- Can augment with acks for search messages, convergecast back to i_0.
- i_0 learns when the tree has stabilized, tells everyone else.
- A bit tricky:
  - Tree grows and shrinks.
  - Some processes may participate many times, as they learn improvements.
  - Bookkeeping needed.
- Complexity?

Layered BFS
- Asynchrony leads to many corrections, which lead to lots of communication.
- Idea: slow down communication, grow the tree in synchronized phases.
  - In phase k, incorporate all nodes at distance k from i_0.
  - i_0 synchronizes between incorporating nodes at distance k and k+1.
- Phase 1:
  - i_0 sends search messages to its neighbors.
  - Neighbors set dist := 1, send acks to i_0.
- Phase k+1:
  - Assume phases 1, ..., k are completed: each node at distance ≤ k knows its parent, and each node at distance ≤ k-1 also knows its children.
  - i_0 broadcasts a newphase message along tree edges, to the distance-k processes.
  - Each of these sends a search message to all nbrs except its parent.
  - When any non-i_0 process receives its first search message, it sets parent := sender and sends a positive ack; it sends nacks for subsequent search msgs.
  - When a distance-k process has received acks/nacks for all its search messages, it designates the nodes that sent positive acks as its children.
  - Then the distance-k processes convergecast back to i_0 along the depth-k tree to say that they're done; they include a bit saying whether new nodes were found.
- Terminates: when i_0 learns, in some phase, that no new nodes were found.
- Obviously produces a BFS tree.
- Complexity:
  - Messages: O(|E| + n diam).
    - Each edge explored at most once in each direction by search/ack.
    - Each tree edge traversed at most once in each phase by newphase/convergecast.
  - Time: O(diam^2 d), using a simplified analysis:
    - Neglecting local computation time l.
    - Assuming that every message in a channel is delivered in time d (ignoring congestion delays).

Layered BFS vs AsynchBFS
- Message complexity:
  - AsynchBFS: O(diam |E|), assuming diam is known; O(n |E|) if not.
  - LayeredBFS: O(|E| + n diam).
- Time complexity:
  - AsynchBFS: O(diam d).
  - LayeredBFS: O(diam^2 d).
- Can also define a “hybrid” algorithm:
  - Add m layers in each phase.
  - Within each phase, layers are constructed asynchronously.
  - Intermediate performance.

Shortest paths
- Assumptions:
  - Same as for BFS, plus edge weights.
  - weight(i, j): nonnegative real, same in both directions.
- Require: output the shortest distance and the parent in a shortest-paths tree.
- Use Bellman-Ford asynchronously.
  - Used to establish routes in ARPANET, 1969-1980.
- Can augment with convergecast as for BFS, for termination.
- But worst-case complexity is very bad.

AsynchBellmanFord
[Slide body not preserved in the extracted text.]
- Termination: use convergecast (as for AsynchBFS).
- Complexity:
  - O(n!) simple paths from i_0 to any other node, which is O(n^n).
  - So the number of messages sent on any channel is O(n^n).
  - So message complexity = O(n^n |E|), time complexity = O(n n^n (l + d)).
- Q: Are the message and time complexity really exponential in n?
- A: Yes. In some execution of the network below, i_k sends 2^k messages to i_{k+1}, so the message complexity is Ω(2^{n/2}) and the time complexity is Ω(2^{n/2} d).
[Figure: the example network; image not preserved.]

Exponential time/message complexity
- Possible distance estimates for i_k are 2^k - 1, 2^k - 2, ..., 0.
- Moreover, i_k can take on all these estimates in sequence:
  - First, messages traverse upper links: 2^k - 1.
  - Then the last lower message arrives at i_k: 2^k - 2.
  - Then the lower message i_{k-2} → i_{k-1} arrives, reduces i_{k-1}'s estimate by 2; the message i_{k-1} → i_k arrives on upper links: 2^k - 3.
  - Etc. Count down in binary.
- If this happens quickly, get a pileup of 2^k search messages in C_{k,k+1}.

Shortest paths
- Moral: unrestrained asynchrony can cause problems.
- Return to this problem after we have better synchronization methods.
- Now, another good illustration of the problems introduced by asynchrony:

Minimum spanning tree
- Assumptions:
  - G = (V, E) connected, undirected.
  - Weighted edges, weights known to endpoint processes, weights distinct.
  - UIDs.
  - Processes don't know n, diam.
  - Can identify in- and out-edges to the same neighbor.
  - Input: wakeup actions, occurring at any time at one or more nodes.
  - A process wakes up when it first receives either a wakeup input or a protocol message.
- Requires:
  - Produce the MST, where each process knows which of its incident edges belong to the tree.
  - Guaranteed to be unique, because of unique weights.
- Gallager-Humblet-Spira algorithm.

Recall synchronous algorithm
- Proceeds in phases (levels).
- After each phase, we have a spanning forest, in which each component tree has a leader.
- In each phase, each component finds its min weight outgoing edge (MWOE), then components merge using all MWOEs to get the components for the next phase.
- Complexity is good:
  - Messages: O(n log n + |E|).
  - Time (rounds): O(n log n).
- Low message complexity depends on the way nodes test their incident edges: in order of weight, not retesting the same edge once it's rejected.
- Q: How to run this algorithm asynchronously?

Running the algorithm asynchronously
Problems arise:
- Inaccurate information about outgoing edges:
  - In the synchronous algorithm, when a node tests its edges, it knows that its neighbors are already up to the same level, and have up-to-date information about their component.
  - In the asynchronous version, neighbors could lag behind; they might be in the same component but not yet know this.
- Less “balanced” combination of components:
  - In the synchronous algorithm, level k components have ≥ 2^k nodes, and level k+1 components are constructed from at least two level k components.
  - In the asynchronous version, components at different levels could be combined.
  - Can lead to more messages overall.
- Concurrent overlapping searches/convergecasts:
  - When nodes are out of synch, concurrent searches for MWOEs could interfere with each other (we'll see this).
- Time bound:
  - These problems result from nodes being out of synch, at different levels.
  - We could try to synchronize levels, but this must be done carefully, so as not to hurt the time complexity too much.

GHS algorithm
- Same basic ideas as before:
  - Form components, combine along MWOEs.
  - Within any component, processes cooperate to find the component MWOE.
  - Broadcast from leader, convergecast, etc.
- Introduce synchronization to prevent nodes from getting too far ahead of their neighbors.
  - Associate a “level” with each component, as before.
  - Number of nodes in a level k component ≥ 2^k.
  - Now, each level k+1 component will be (initially) formed from exactly two level k components.
  - Level numbers are used for synchronization, and in determining who is in the same component.
- Complexity:
  - Messages: O(|E| + n log n).
  - Time: O(n log n (d + l)).
- Combine pairs of components in two ways, merging and absorbing.
- Merging:
  - C and C' have the same level k, and have a common MWOE.
  - Result is a new merged component C'', with level k+1.
- Absorbing:
  - level(C) < level(C'), and C's MWOE leads to C'.
  - Result is to absorb C into C'.
  - Not creating a new component, just adding C to the existing C'.
  - C “catches up” with the more advanced C'.
  - Absorbing is cheap, local.
- Merging and absorbing ensure that the number of nodes in any level k component is ≥ 2^k.
- Merging and absorbing are both allowable operations in finding the MST, because they are allowed by the general theory for MSTs.

Liveness
- Q: Why are merging and absorbing sufficient to ensure that the construction is eventually completed?
- Lemma: After any allowable finite sequence of merges and absorbs, either the forest consists of one tree (so we're done), or some merge or absorb is enabled.
- Proof:
  - Consider the current “component digraph”:
    - Nodes = components.
    - Directed edges correspond to MWOEs.
  - Then there must be some pair C, C' whose MWOEs point to each other. (Why?)
  - These MWOEs must be the same edge. (Why?)
  - Can combine, using either merge or absorb:
    - If same level, merge, else absorb.
- So, merging and absorbing are enough.
- Now, how to implement them with a distributed algorithm?

Component names and leaders
- For every component with level ≥ 1, define the core edge of the component's tree.
- Defined in terms of the merge and absorb operations used to construct the component:
  - After merge: use the common MWOE.
  - After absorb: keep the old core edge of the higher-level component.
- “The edge along which the most recent merge occurred.”
- Component name: (core, level).
- Leader: endpoint of the core edge with the higher id.

Determining if an edge is outgoing
- Suppose i wants to know if the edge (i, j) is outgoing from i's current component.
- At that point, i's component name info is up-to-date:
  - The component is in “search mode”.
  - i has received an initiate message from the leader, which carried the component name.
- So i sends j a test message.
- Three cases:
  - If j's current (core, level) is the same as i's, then j knows that j is in the same component as i.
  - If j's (core, level) is different from i's and j's level is ≥ i's, then j knows that j is in a different component from i.
    - A component has only one core per level.
    - No one in the same component currently has a higher level than i does, since the component is still searching for its MWOE.
  - If j's level is < i's, then j doesn't know if it is in the same or a different component. So it doesn't yet respond; it waits to catch up to i's level.

Liveness, again
- Q: Can the extra delays imposed here affect the progress argument?
- No: we can redo the progress argument, this time considering only those components with the lowest current level k.
- All processes in these components must succeed in determining their MWOEs, so these components succeed in determining the component MW
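
The three cases for answering a test message (see "Determining if an edge is outgoing") can be written as a single decision function. This is a minimal sketch rather than code from the GHS algorithm itself: `ComponentName`, `respond_to_test`, and the three result labels are hypothetical names chosen here. It also assumes that a level-0 (single-node) component, which has no core edge, uses a one-element tuple holding its own UID in place of a core.

```python
from typing import NamedTuple, Tuple

class ComponentName(NamedTuple):
    """Component name as a (core, level) pair.

    core: endpoints of the core edge; assumed here to be (uid,) for a
    level-0 single-node component, which has no core edge.
    """
    core: Tuple[int, ...]
    level: int

def respond_to_test(j_name: ComponentName, i_name: ComponentName) -> str:
    """Decide how j reacts to a test message sent by i over edge (i, j)."""
    if j_name == i_name:
        # Same (core, level): j is in the same component as i.
        return "same_component"
    if j_name.level >= i_name.level:
        # Different name and j is at least as advanced: different component.
        return "different_component"
    # j lags behind i, so it cannot tell yet and delays its answer.
    return "defer"

i_name = ComponentName(core=(4, 9), level=2)
print(respond_to_test(ComponentName((4, 9), 2), i_name))
print(respond_to_test(ComponentName((1, 7), 3), i_name))
print(respond_to_test(ComponentName((5,), 0), i_name))
```

Keeping the "defer" outcome separate from the two definite answers mirrors the point made above: a lagging process must not answer until it has caught up to the tester's level.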