This document provides an introduction to analyzing social graphs using graph databases. It discusses using Neo4j to store social graph data and analyze centrality and clustering coefficients. Sample code is shown for inserting nodes and relationships in Neo4j and traversing the graph. Analyzing a sample social network dataset found centrality values ranged from 1 to over 900, with a median of 10, and clustering coefficients ranged from 0 to 1. Visualization of the social graph and analysis results is also briefly discussed.
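To make the two measures named above concrete, here is a minimal plain-Java sketch. It assumes that "centrality" means degree centrality (the number of distinct neighbours) and that "clustering coefficient" means the local clustering coefficient (the fraction of a node's neighbour pairs that are themselves connected); the summary does not say which variants were used. The class `GraphMetrics` and the tiny example graph are hypothetical and use no Neo4j API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Neo4j: metrics on an undirected graph.
final class GraphMetrics {
    /** Builds an undirected adjacency map from pairs of node ids. */
    static Map<Integer, Set<Integer>> undirected(int[][] edges) {
        Map<Integer, Set<Integer>> adj = new TreeMap<>();
        for (int[] e : edges) {
            if (e[0] == e[1]) continue; // ignore self-loops
            adj.computeIfAbsent(e[0], k -> new TreeSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new TreeSet<>()).add(e[0]);
        }
        return adj;
    }

    /** Degree centrality (assumed variant): number of distinct neighbours. */
    static int degree(Map<Integer, Set<Integer>> adj, int node) {
        return adj.getOrDefault(node, Collections.emptySet()).size();
    }

    /** Local clustering coefficient: links among neighbours / possible links. */
    static double clustering(Map<Integer, Set<Integer>> adj, int node) {
        List<Integer> nb = new ArrayList<>(adj.getOrDefault(node, Collections.emptySet()));
        int k = nb.size();
        if (k < 2) return 0.0; // fewer than two neighbours: defined as 0 here
        int links = 0;
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                if (adj.get(nb.get(i)).contains(nb.get(j))) links++;
            }
        }
        return 2.0 * links / (k * (k - 1));
    }

    public static void main(String[] args) {
        // Triangle 1-2-3 plus a pendant edge 3-4.
        Map<Integer, Set<Integer>> g = undirected(new int[][] {{1, 2}, {1, 3}, {2, 3}, {3, 4}});
        for (int node : g.keySet()) {
            System.out.println(node + ": degree=" + degree(g, node)
                    + " clustering=" + clustering(g, node));
        }
    }
}
```

In this example node 3 has three neighbours (1, 2, 4) of which only the pair 1-2 is linked, so its coefficient is 1/3, while node 1 sits fully inside the triangle and scores 1.0.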
44. GraphDB Neo4j
• True ACID transactions
• High availability
• Scales to billions of nodes and relationships
• High speed querying through traversals
            Single instance (GPLv3)    Multiple instances (AGPLv3)
Embedded    EmbeddedGraphDatabase      HighlyAvailableGraphDatabase
Standalone  Neo4j Server               Neo4j Server in high availability mode
http://neo4j.org/
45. My other favorite Neo4j features
• RESTful APIs
• Query Language(Cypher)
• Full indexing
– Lucene
• Built-in graph algorithms
– A*, Dijkstra
– High-speed traversal
• Gremlin support
– Works like a query language
http://www.tinkerpop.com/post/4633229547/tinkerpop-graph-stack
46. Introduction to simple Neo4j use cases
[Diagram: an analysis system paired with Neo4j in four configurations: Embedded or Server, each on a single node or on multiple nodes]
50. Introduction to simple embedded Neo4j
• Insert vertices & make relationships
• Single node & Embedded
• Traversal sample
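51. Insert vertices, make relationship
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Transaction;
import org.neo4j.kernel.EmbeddedGraphDatabase;

public final class InputVertex {
    public static void main(final String[] args) {
        // Open (or create) an embedded database stored under /tmp/neo4j.
        GraphDatabaseService graphDb = new EmbeddedGraphDatabase("/tmp/neo4j");
        // Every write has to happen inside a transaction.
        Transaction tx = graphDb.beginTx();
        try {
            // Create two vertices and give each a "Name" property.
            Node firstNode = graphDb.createNode();
            firstNode.setProperty("Name", "Kimura");
            Node secondNode = graphDb.createNode();
            secondNode.setProperty("Name", "Kato");
            // Connect them with a directed LIKE relationship.
            firstNode.createRelationshipTo(secondNode,
                    DynamicRelationshipType.withName("LIKE"));
            // Mark the transaction as successful so finish() commits it.
            tx.success();
        } finally {
            // Commits if success() was called, otherwise rolls back.
            tx.finish();
        }
        graphDb.shutdown();
    }
}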
52-57. Insert vertices, make relationship (resulting graph)
Node (ID: 1, Name: Kimura) --[ID: 3, Relation: LIKE]--> Node (ID: 2, Name: Kato)
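The outline on slide 50 lists a traversal sample, but no traversal code appears in this part of the deck. As a rough illustration of what a traversal does, the sketch below walks outgoing LIKE relationships breadth-first up to a maximum depth. It is plain Java over an in-memory map and deliberately uses no Neo4j API; the class `TraversalSketch`, its methods, and the third person "Sato" are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a graph of LIKE relationships.
final class TraversalSketch {
    /** Directed adjacency list: person -> people this person LIKEs. */
    private final Map<String, List<String>> likes = new HashMap<>();

    void addLike(String from, String to) {
        likes.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        likes.computeIfAbsent(to, k -> new ArrayList<>());
    }

    /** Breadth-first traversal; returns nodes in visit order, at most maxDepth hops away. */
    List<String> traverse(String start, int maxDepth) {
        List<String> order = new ArrayList<>();
        Map<String, Integer> depth = new HashMap<>(); // also serves as the visited set
        Deque<String> queue = new ArrayDeque<>();
        depth.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            order.add(current);
            int d = depth.get(current);
            if (d == maxDepth) continue; // do not expand beyond the depth limit
            for (String next : likes.getOrDefault(current, Collections.emptyList())) {
                if (!depth.containsKey(next)) {
                    depth.put(next, d + 1);
                    queue.add(next);
                }
            }
        }
        return order;
    }

    public static void main(String[] args) {
        TraversalSketch g = new TraversalSketch();
        g.addLike("Kimura", "Kato");
        g.addLike("Kato", "Sato"); // "Sato" is a made-up third person
        System.out.println(g.traverse("Kimura", 1)); // [Kimura, Kato]
        System.out.println(g.traverse("Kimura", 2)); // [Kimura, Kato, Sato]
    }
}
```

The depth map doubles as the visited set, so each node is reported once even if the graph contains cycles.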
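58. Batch Insert
• Not thread-safe, not transactional
• But very fast!
import java.util.HashMap;
import java.util.Map;
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

public final class Batch {
    public static void main(final String[] args) {
        // Open the store directly, configured from a properties file.
        BatchInserter inserter = new BatchInserterImpl("/tmp/neo4j",
                BatchInserterImpl.loadProperties("/tmp/neo4j.props"));
        // Properties of the first node.
        Map<String, Object> prop = new HashMap<String, Object>();
        prop.put("Name", "Kimura");
        prop.put("Age", 21);
        long node1 = inserter.createNode(prop);
        // Reuse the map for the second node, overwriting both values.
        prop.put("Name", "Kato");
        prop.put("Age", 21);
        long node2 = inserter.createNode(prop);
        // Nodes are referenced by id; null means no relationship properties.
        inserter.createRelationship(node1, node2,
                DynamicRelationshipType.withName("LIKE"), null);
        // shutdown() is required to flush the data to disk.
        inserter.shutdown();
    }
}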
74. Network Dataset
• Stanford Large Network Dataset Collection (SNAP)
• SNAP has a wide variety of graph data!
– Social networks
– Communication networks
– Citation networks
– Collaboration networks
– Web graphs
– Product co-purchasing networks
– Internet peer-to-peer networks
– Road networks
– Autonomous systems graphs
– Signed networks
– Wikipedia networks and metadata
– Memetracker and Twitter
http://snap.stanford.edu/data/index.html
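Many SNAP datasets are distributed as plain-text edge lists: comment lines start with "#", and every other line holds a source node id and a target node id separated by whitespace. The sketch below, in plain Java, reads such a list and counts how many edges touch each node. The class `SnapEdgeList` and the inline sample data are hypothetical; a real run would read a downloaded file instead of the sample string.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical reader for whitespace-separated edge lists with '#' comments.
final class SnapEdgeList {
    /** Reads "from to" pairs, skipping blank lines and '#' comment lines. */
    static List<long[]> read(BufferedReader in) throws IOException {
        List<long[]> edges = new ArrayList<>();
        String line;
        while ((line = in.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;
            String[] parts = line.split("\\s+");
            edges.add(new long[] {Long.parseLong(parts[0]), Long.parseLong(parts[1])});
        }
        return edges;
    }

    /** Counts the edges touching each node (degree, ignoring direction). */
    static Map<Long, Integer> degrees(List<long[]> edges) {
        Map<Long, Integer> degree = new TreeMap<>();
        for (long[] e : edges) {
            degree.merge(e[0], 1, Integer::sum);
            degree.merge(e[1], 1, Integer::sum);
        }
        return degree;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample in the edge-list layout described above.
        String sample = "# FromNodeId\tToNodeId\n0\t1\n0\t2\n1\t2\n";
        List<long[]> edges = read(new BufferedReader(new StringReader(sample)));
        System.out.println(degrees(edges)); // {0=2, 1=2, 2=2}
    }
}
```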