graph_tool.collection
#
This module contains an assortment of useful networks.
Interface to the Netzschleuder online network repository#
- graph_tool.collection.ns#
-
Dictionary containing
Graph
objects, indexed by the name of the dataset, fetched from the Netzschleuder repository. For dataset entries with more than one network, they are accessed either via a string"<entry>/<network>"
or a tuple("<entry>", "<network>")
. This is a “lazy” dictionary, i.e. it only downloads the graphs when the items are accessed for the first time. The description and summary information for each graph are given in thens_info
dictionary, or alternatively in the graph properties which accompanies each graph object.Examples#
>>> g = gt.collection.ns["advogato"] >>> print(g.gp.description) A network of trust relationships among users on Advogato, an online community of open source software developers. Edge direction indicates that node i trusts node j, and edge weight denotes one of four increasing levels of declared trust from i to j: observer (0.4), apprentice (0.6), journeyer (0.8), and master (1.0).
- graph_tool.collection.ns_info#
-
Dictionary containing descriptions and other summary information for datasets available in the Netzschleuder repository. For dataset entries with more than one network, they are accessed either via a string
"<entry>/<network>"
or a tuple("<entry>", "<network>")
. The information is downloaded on-the-fly via the available JSON API.
Built-in collection of empirical network data#
- graph_tool.collection.data#
Dictionary containing
Graph
objects, indexed by the name of the graph. This is a “lazy” dictionary, i.e. it only loads the graphs from disk when the items are accessed for the first time. The description for each graph is given in thedescriptions
dictionary, or alternatively in the"description"
graph property which accompanies each graph object.Examples#
>>> g = gt.collection.data["karate"] >>> print(g) <Graph object, undirected, with 34 vertices and 78 edges, 1 internal vertex property, 2 internal graph properties, at 0x...> >>> print(g.gp.readme) The file karate.gml contains the network of friendships between the 34 members of a karate club at a US university, as described by Wayne Zachary in 1977. If you use these data in your work, please cite W. W. Zachary, An information flow model for conflict and fission in small groups, Journal of Anthropological Research 33, 452-473 (1977).
- graph_tool.collection.descriptions#
Dictionary with a short description and source information on each graph contained in
data
.A summary, with some extra information, is available in the following table.
Name
N
E
Directed
Description
adjnoun
112
425
False
Word adjacencies: adjacency network of common adjectives and nouns in the novel David Copperfield by Charles Dickens. Please cite M. E. J. Newman, Phys. Rev. E 74, 036104 (2006). Retrieved from Mark Newman’s website.
as-22july06
22963
48436
False
Internet: a symmetrized snapshot of the structure of the Internet at the level of autonomous systems, reconstructed from BGP tables posted by the University of Oregon Route Views Project. This snapshot was created by Mark Newman from data for July 22, 2006 and is not previously published. Retrieved from Mark Newman’s website.
astro-ph
16706
121251
False
Astrophysics collaborations: weighted network of coauthorships between scientists posting preprints on the Astrophysics E-Print Archive between Jan 1, 1995 and December 31, 1999. Please cite M. E. J. Newman, Proc. Natl. Acad. Sci. USA 98, 404-409 (2001). Retrieved from Mark Newman’s website.
celegansneural
297
2359
True
Neural network: A directed, weighted network representing the neural network of C. Elegans. Data compiled by D. Watts and S. Strogatz and made available on the web here. Please cite D. J. Watts and S. H. Strogatz, Nature 393, 440-442 (1998). Original experimental data taken from J. G. White, E. Southgate, J. N. Thompson, and S. Brenner, Phil. Trans. R. Soc. London 314, 1-340 (1986). Retrieved from Mark Newman’s website.
cond-mat
16726
47594
False
Condensed matter collaborations 1999: weighted network of coauthorships between scientists posting preprints on the Condensed Matter E-Print Archive between Jan 1, 1995 and December 31, 1999. Please cite M. E. J. Newman, The structure of scientific collaboration networks, Proc. Natl. Acad. Sci. USA 98, 404-409 (2001). Retrieved from Mark Newman’s website.
cond-mat-2003
31163
120029
False
Condensed matter collaborations 2003: updated network of coauthorships between scientists posting preprints on the Condensed Matter E-Print Archive. This version includes all preprints posted between Jan 1, 1995 and June 30, 2003. The largest component of this network, which contains 27519 scientists, has been used by several authors as a test-bed for community-finding algorithms for large networks; see for example J. Duch and A. Arenas, Phys. Rev. E 72, 027104 (2005). These data can be cited as M. E. J. Newman, Proc. Natl. Acad. Sci. USA 98, 404-409 (2001). Retrieved from Mark Newman’s website.
cond-mat-2005
40421
175693
False
Condensed matter collaborations 2005: updated network of coauthorships between scientists posting preprints on the Condensed Matter E-Print Archive. This version includes all preprints posted between Jan 1, 1995 and March 31, 2005. Please cite M. E. J. Newman, Proc. Natl. Acad. Sci. USA 98, 404-409 (2001). Retrieved from Mark Newman’s website.
dolphins
62
159
False
Dolphin social network: an undirected social network of frequent associations between 62 dolphins in a community living off Doubtful Sound, New Zealand. Please cite D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, Behavioral Ecology and Sociobiology 54, 396-405 (2003). Retrieved from Mark Newman’s website.
email-Enron
36692
367662
False
Enron email communication network covers all the email communication within a dataset of around half million emails. This data was originally made public, and posted to the web, by the Federal Energy Regulatory Commission during its investigation. Nodes of the network are email addresses and if an address i sent at least one email to address j, the graph contains an undirected edge from i to j. Note that non-Enron email addresses act as sinks and sources in the network as we only observe their communication with the Enron email addresses. The Enron email data was originally released by William Cohen at CMU. This version was retrieved from the SNAP database at http://snap.stanford.edu/data/email-Enron.html. Please cite: J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. Internet Mathematics 6(1) 29–123, 2009, B. Klimmt, Y. Yang. Introducing the Enron corpus. CEAS conference, 2004.
football
115
615
False
American College football: network of American football games between Division IA colleges during regular season Fall 2000. Please cite M. Girvan and M. E. J. Newman, Proc. Natl. Acad. Sci. USA 99, 7821-7826 (2002), and T.S. Evans, “Clique Graphs and Overlapping Communities”, J.Stat.Mech. (2010) P12037 [arXiv:1009.0638]. Retrieved from Mark Newman’s website, with corrections by T. S. Evans, available here.
hep-th
8361
15751
False
High-energy theory collaborations: weighted network of coauthorships between scientists posting preprints on the High-Energy Theory E-Print Archive between Jan 1, 1995 and December 31, 1999. Please cite M. E. J. Newman, Proc. Natl. Acad. Sci. USA 98, 404-409 (2001). Retrieved from Mark Newman’s website.
karate
34
78
False
Zachary’s karate club: social network of friendships between 34 members of a karate club at a US university in the 1970s. Please cite W. W. Zachary, An information flow model for conflict and fission in small groups, Journal of Anthropological Research 33, 452-473 (1977). Retrieved from Mark Newman’s website.
lesmis
77
254
False
Les Miserables: coappearance network of characters in the novel Les Miserables. Please cite D. E. Knuth, The Stanford GraphBase: A Platform for Combinatorial Computing, Addison-Wesley, Reading, MA (1993). Retrieved from Mark Newman’s website.
netscience
1589
2742
False
Coauthorships in network science: coauthorship network of scientists working on network theory and experiment, as compiled by M. Newman in May 2006. A figure depicting the largest component of this network can be found here. These data can be cited as M. E. J. Newman, Phys. Rev. E 74, 036104 (2006). Retrieved from Mark Newman’s website.
pgp-strong-2009
39796
301498
True
Strongly connected component of the PGP web of trust circa November 2009. The full data is available at http://key-server.de/dump/. Please cite: Richters O, Peixoto TP (2011) Trust Transitivity in Social Networks. PLoS ONE 6(4): e18384. DOI: 10.1371/journal.pone.0018384 [sci-hub, @tor].
polblogs
1490
19090
True
Political blogs: A directed network of hyperlinks between weblogs on US politics, recorded in 2005 by Adamic and Glance. Please cite L. A. Adamic and N. Glance, “The political blogosphere and the 2004 US Election”, in Proceedings of the WWW-2005 Workshop on the Weblogging Ecosystem (2005). Retrieved from Mark Newman’s website.
polbooks
105
441
False
Books about US politics: A network of books about US politics published around the time of the 2004 presidential election and sold by the online bookseller Amazon.com. Edges between books represent frequent copurchasing of books by the same buyers. The network was compiled by V. Krebs and is unpublished, but can found on Krebs’ web site. Retrieved from Mark Newman’s website.
power
4941
6594
False
Power grid: An undirected, unweighted network representing the topology of the Western States Power Grid of the United States. Data compiled by D. Watts and S. Strogatz and made available on the web here. Please cite D. J. Watts and S. H. Strogatz, Nature 393, 440-442 (1998). Retrieved from Mark Newman’s website.
serengeti-foodweb
161
592
True
Plant and mammal food web from the Serengeti savanna ecosystem in Tanzania. Please cite: Baskerville EB, Dobson AP, Bedford T, Allesina S, Anderson TM, et al. (2011) Spatial Guilds in the Serengeti Food Web Revealed by a Bayesian Group Model. PLoS Comput Biol 7(12): e1002321. DOI: 10.1371/journal.pcbi.1002321 [sci-hub, @tor]
Functions returning small graphs#
Returns the cubic graph specified in LCF notation. |
|
Returns the Bull Graph |
|
Returns the Chvátal Graph |
|
Returns the 3-regular Platonic Cubical Graph |
|
Returns the Desargues Graph |
|
Returns the Diamond graph |
|
Returns the Platonic Dodecahedral graph. |
|
Returns the Frucht Graph. |
|
Returns the Heawood Graph, a (3,6) cage. |
|
Returns the Hoffman-Singleton Graph. |
|
Returns the House graph (square with triangle on top). |
|
Returns the Platonic Icosahedral graph. |
|
Returns the Krackhardt Kite Social Network. |
|
Returns the Moebius-Kantor graph. |
|
Returns the Platonic Octahedral graph. |
|
Returns the Pappus graph. |
|
Returns the Petersen graph. |
|
Return a small maze with a cycle. |
|
Returns the 3-regular Platonic Tetrahedral graph. |
|
Returns the skeleton of the truncated cube. |
|
Returns the skeleton of the truncated Platonic tetrahedron. |
|
Returns the Tutte graph. |
Small graph atlas#
- graph_tool.collection.atlas#
Lazy list of of all graphs with up to seven nodes named in the Graph Atlas [R3656cabee8a8-atlas].
The graphs are listed in increasing according to
number of nodes,
number of edges,
degree sequence (for example 111223 < 112222),
number of automorphisms,
in that order, with three exceptions:
graphs 55 and 56, with degree sequences 001111 and 000112,
graphs 1007 and 1008, with degree sequences 3333444 and 3333336,
graphs 1012 and 1213, with degree sequences 1244555 and 1244456.
These errors are kept in this list so that the indices match those of [R3656cabee8a8-atlas].
References#