资源描述
Network Analysis with igraph
友情提示:
这是在R语言环境下进行社会网络分析或者图论分析的工具,中间有些语句已经在新版本已经不能正常运行了,比如创建网络的语句
> g <- graph( c(0,1, 1,2, 3,4, 5,6) )
在igraph的0.6-2版不能运行,可以改为
> g <- graph( c(1,2, 1,3, 3,4, 5,6) )
也就是新版的顶点id是不能出现0的。
这个教程的作者是igraph包的创始人Gábor Csárdi,虽然他没有写完整,但是这个教程仍具有权威性,结合帮助文件可以助你快速掌握这个分析工具。
Gábor Csárdi
Department of Biophysics, Research Institute for Nuclear and Particle Physics of the Hungarian Academy of Sciences
29-33 Konkoly-Thege Miklós road, Budapest H-1121, Hungary
Center for Complex Systems Studies, Kalamazoo College
1200 Academy st, Kalamazoo, 49006, MI, USA
Copyright (C) 2006 Gábor Csárdi Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.
Table of Contents
Introduction
A short igraph Tour
Installation
The igraph data model
Graphs Objects
Regular Graphs
Creating Graphs
Random Graphs
Importing and Exporting Graphs
Graph Visualization
Graph, Vertex and Edge Attributes
Vertex and Edge Sequences
Connected Components
Analysis of Citation Networks
Centrality Measures
Path Lengths and Similar Measures
Graph Motifs
Network Flows and Related Concepts
Graph Operators
Licenses for igraph and this manual
Bibliography
Index
Installation
The igraph R package can be installed just like any other R packages. Perhaps the easiest way is to use the install.packages() command. This downloads either a source of binary package appropriate for the running system from CRAN and installs it by default to the system directories. The package can be loaded by typing
library(igraph)
after installation.
If the user doesn't have rights to write to system directories the lib argument of the install.packages() command can be used to set an alternate installation directory. It is good R practice to keep the user's privately installed packages in a separate directory (say ~/.R/library), if this is the case igraph can be installed like this:
install.packages("igraph", lib="~/.R/library")
The R package search path needs to be modified as well to let R know about our private package directory:
.libPaths( c("~/.R/library", .libPaths()) )
This command can also be added to the ~/.Rprofile file to set the search path automatically every time you start R.
See ?Startup and ?.libPaths() for more information about R's startup mechanism and the way directories are searched for packages and ?install.packages about installing R packages.
The igraph data model
Directed graphs
Undirected graphs
Examples
This section introduces the igraph data model. It is recommended that the reader should be familiar with this model of computation to understand the semantics of the igraph functions better.
An igraph graph is either directed or undirected, igraph cannot handle mixed graphs with both directed and undirected edges.
Directed graphs
The igraph data model is slightly different for directed and undirected graphs, we start with the directed case. In igraph vertices are identified by vertex ids: integer number between zero and |V|-1, if |V| is the number of vertices in the graph. Thus the numbering of vertex ids is always continual, all igraph operations on graphs keep this important property.
Directed igraph edges are ordered pairs or vertex ids. A directed graph is an ordered multiset of directed edges and some additional meta data: the number of vertices in the graph and a boolean tag to mark the graph as directed. Here is an example of a directed igraph graph:
( vertices: 6,
directed: yes,
(
(0,2),
(2,2),
(2,3),
(3,3),
(3,4),
(3,4),
(4,1)
)
)
Just like the vertices have vertex ids, igraph edges also have ids: edge ids are integer numbers between zero and |E|-1, |E| is the number of edges in the graph.
Undirected graphs
An undirected igraph edge is a two-element or one-element set of vertex ids. The two element set is a non-loop edge, ie. the edge connects different vertices, while loop edges are represented with a single-element set. An undirected graph is an ordered multiset of undirected edges, plus the usual meta data: the number of vertices and the fact that the graph is undirected. Here is an example for an undirected graph:
( vertices: 6,
directed: no,
(
{0,2},
{2},
{2,3},
{3},
{3,4},
{3,4},
{4,1}
)
)
This graph contains both loop edges ({2} and {3}, ie. (2-2) and (3-3)) and multiple edges ({3,4} twice).
Examples
We give some simple examples on how this data model is used in computations. The first example is the are.connected() function, this decides whether two vertices are connected by an edge. Lets assume that G1 and G2 are the graphs from the previous two sections, G1 is the directed, G2 the undirected graph. Now if we check with are.connected() whether vertices 0 and 2 are connectedin G1, the function searches for the (0,2) directed pair, and since it finds it it returns positive result. If the same search is performed on G2, the function searches for the {0,2} set and the result is, again, positive. Now it is perhaps obvious by now that if we suppply the vertices in the opposite order, ie. 2 and 0 then in G1 the (2,0) ordered pair is requested, so the answer is negative. For G2 {0,2}={2,0} is searched again, the answer is still positive.
Another simple example is the neighbors() command, this can be used to find the neighboring vertices of a vertex. neighbors() (like many other functions in igraph) has a parameter called mode. This parameter defines how the search is performed in directed graphs. Let us assume that we are interested in the neighbors of vertex 3. If mode is out then the graph's edge list (which is a multiset) is searched for elements having 3 in the first position of the ordered pair and returns the vertex ids in the second position in these elements. So the answer is (3,4,4). Now if mode is in then the role of the first and second positions is exchanged and (2,3) is returned. The third possible value of mode is all, in this case pairs having 3 at either at the first or second positions are searched and the other vertex id in the pair is returned: (2,3,4,4). Note that 3 is returned only once even if it is found both ways. This of course corresponds to the normal graph theoretic conventions.
The mode argument is always ignored when dealing with undirected graphs, as these graphs store an edge as an unordered pair (ie. a set). For example running neighbors() on graph G2 and vertex 3, the result is (2,3,4,4).
Note that if we set mode to all on a directed graph that essentially means that we use the corresponding undirected graph for calculation. Also note that the corresponding undirected graph always have the same number of edges as the directed counterpart, eg. if the directed graph contains (3,4) and (4,3) these edges make two {3,4} edges in the undirected graph.
Some igraph functions have a mode for other purposes: eg. graph.tree() and graph.star().
Graphs Objects
igraph uses graph objects to represent graphs, a graph object is an opaque type in the sense that users are not expected to manipulate graph objects directly. Instead igraph provides functions to create and manipulate graph objects. This organization has the advantage that the internal representation can be changed any time without breaking any code, assuming the users keep the rules. To facilitate the proper usage of igraph graph objects we do not present the internal graph representation here.
There are several functions which create graph objects, here is a very incomplete list of them: graph(), graph.star(), graph.tree(), graph.adjacency(), barabasi.game().
For example graph() creates a directed graph from a list of edges:
> g <- graph( c(0,1, 1,2, 3,4, 5,6) )
> g
Vertices: 7
Edges: 4
Directed: TRUE
Edges:
[0] 0 -> 1
[1] 1 -> 2
[2] 3 -> 4
[3] 5 -> 6
Graph objects have an R class igraph and the is.igraph function can be used to check that an R object is an igraph graph object:
> class( graph( 1:10 ) )
[1] "igraph"
> is.igraph( graph( 1:10 ) )
[1] TRUE
> is.igraph( 1:10 )
[1] FALSE
Regular Graphs
Empty graphs
Full graphs
Stars and Rings
Lattices
Trees
The Graph Atlas
Empty graphs
igraph privides some ways for generating regular graph strucures: lattices, trees, etc. Let us begin with the simplest case: the empty graph:
> g <- graph.empty()
This command creates an empty graph and stores it in a variable named g. If you type in the name of a variable containing a graph object, R prints some information about the graph, and its edge list:
> g
Vertices: 0
Edges: 0
Directed: TRUE
This small example illustrates a number of things: empty graphs can be created by the graph.empty() command; by default this command creates a graph with no vertices and no edges (well this is what the word empty means); by default this command creates directed graphs. graph.empty() has an optional parameter: n, the number of vertices in the graph. (A graph with non-zero number of vertices and no edges is still called and empty graph.)
> g <- graph.empty(n=10)
> g
Vertices: 10
Edges: 0
Directed: TRUE
This is a graph with ten vertices and zero edges.
Full graphs
A full graph is completely the opposite of empty graphs: it contains (one copy of) all possible edges. A full graph can be created by the graph.full() command. By default it creates undirected graphs with no loop (ie. self) edges:
> f <- graph.full(3)
> f
Vertices: 3
Edges: 3
Directed: FALSE
Edges:
[0] 0 -- 1
[1] 0 -- 2
[2] 1 -- 2
The command has a directed parameter which can be set to TRUE to generated directed graphs:
> f <- graph.full(3, directed=TRUE)
> f
Vertices: 3
Edges: 6
Directed: TRUE
Edges:
[0] 0 -> 1
[1] 0 -> 2
[2] 1 -> 0
[3] 1 -> 2
[4] 2 -> 0
[5] 2 -> 1
Note that igraph prints undirected edges with “--” and directed ones with “->”, stressing that the order of the vertex ids in the edges are important only in directed graphs.
Here is an example of an undirected full graph:
Some igraph functions provide information about the structure of a graph, is.directed() returns a logical constant, TRUE is the argument graph is directed and FALSE otherwise. are.connected() gives TRUE if the given vertices are connected with an edge and FALSE otherwise. It works slightly differently for directed and undirected graphs; for undirected graphs the order of the vertex id parameters doesn't matter; for directed graph it searches for an edge from the first vertex to the second:
> is.directed(graph.full(10))
[1] FALSE
> is.directed(graph.full(10, directed=TRUE))
[1] TRUE
> g <- graph.full(10)
> are.connected(g, 1,4)
[1] TRUE
> g <- graph.empty(n=10)
> are.connected(g, 1,4)
[1] FALSE
If a graph is very big, then it is probably not a good idea to list its edges to the screen. In this case the standard summary() function can be used to print some useful information about the graph:
> big <- graph.full(1000)
> summary(big)
Vertices: 1000
Edges: 499500
Directed: FALSE
graph.full() also has a loops parameter to include loop edges in graph.
graph.full() never creates graphs with multiple edges.
Stars and Rings
In a star graph, a single distinguished vertex is connected to every other vertices and there are no other edges present. In igraph three types of star graphs can be created, the mode argument decides which one to generate: in creates a directed graph in which every edge points to the distinguished vertex, out creates a directed graph with opposite directedness and undirected creates an undirected star.
The graph_ring() command can create various ring graphs (ie. one-dimensional lattices). The directed, mutual and circular parameters give the exact shape of the graph, see the figure for details.
Lattices
An n-dimensional lattice (sometimes also called grid) is a graph in which the vertices are placed at the integer coordinate points of the n-dimensional Euclidean space and each vertex connects to vertices which are exactly one unit away from it. Ie. if the lattice is two dimensional and the length of the lattice is 5 along the first and 3 along the second dimension, then it has 15 vertices and they're placed at coordinates (1,1), (1,2), (1,3), (2,1), … (5,3) and two vertices are connected if the difference of one of their coordinates is one or minus one and all their other coordinates are exactly the same.
In igraph lattices are generated with graph.lattice(). It has two different forms. In the first form, the parameter length, the length of the lattice along every dimension and parameter dim, the dimensionality of the lattice are given. In the second form the parameter dimvector is given, this contains the length of the lattice along each dimensions separately. The default is the second form. The following two are equivalent, they create the same graphs:
> l1 <- graph.lattice(length=5, dim=2)
> l2 <- graph.lattice( c(5,5) )
The second form is obviously more general.
graph.lattice() generates non-circular lattices by defaults. In a circular lattice the difference of the coordinates of the vertices is calculated modulo the size of the lattice along the given dimension so for example in the circular 5x3 two dimensional lattice vertices (1,1) and (1,3) are also connected just like (1,1) and (5,1). The circular parameter of graph.lattice() can be set to TRUE to generate circular graphs.
graph.lattice() generates undirected graphs by default, but the directed parameter can be set to TRUE to generate directed lattices. The direction of the edges is such that the edge points from the vertex with the smaller vertex id to the vertex with the larger vertex id. If the mutual parameter is also set to TRUE then two directed edges are created in both direction.
Trees
igraph has some limited capabilities for creating trees with the graph.tree() function. The only kind of trees this function can generate are almost complete regular trees. In a complete regular tree every vertex has the same number of children except for the leafs which have no children at all. An almost complete regular tree is made from a complete regular tree by removing some of the leafs from it.
The directedness of the edges of tree is determined by the mode parameter. See the figure for the possible values of this parameter.
The Graph Atlas
igraph can generate graphs from the book An Atlas of Graphs by Roland C. Read and Robin J. Wilson. The atlas contains all undirected graphs with up to seven vertices, numbered from 0 up to 1252. The graphs are listed:
1. in increasing
展开阅读全文