Mongo Scaling

Easy to Start, Easy to Develop, Easy to Scale
Alvin Richards, Senior Director, Enterprise Engineering
alvin@10gen.com

Sunday, August 21, 2011

Topics we will cover fast!

Vertical Scaling
Horizontal Scaling with MongoDB
- Schema & Index design
- Auto Sharding
- Replication

Scaling
Operations/sec go up
Storage needs go up
- Capacity
- IOPs
Complexity goes up
- Caching

How do you scale now?

Optimization & Tuning
- Schema & Index Design
- O/S tuning
- Hardware configuration
Vertical scaling
- Hardware is expensive
- Hard to scale in cloud
[Figure: cost ($$$) plotted against throughput]

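MongoDB Scaling - Single Node
[Diagram, read axis versus write axis: node_a1]

Read scaling - add Replicas
[Diagram, read axis versus write axis: node_b1, node_a1]

Read scaling - add Replicas
[Diagram, read axis versus write axis: node_c1, node_b1, node_a1]

Write scaling - Sharding
[Diagram, read axis versus write axis: shard1]

Write scaling - add Shards
[Diagram, read axis versus write axis: shard1, shard2]

Write scaling - add Shards
[Diagram, read axis versus write axis: shard1, shard2, shard3]
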
Scaling with MongoDB

Schema & Index Design
Sharding
Replication

Schema
Data model affects performance
- Embedding versus Linking
- Partial versus full document writes
- Partial versus full document reads
Schema and schema usage are critical for scaling and performance
- Roundtrips to database
- Disk seek time
- Size of data to read & write
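
The roundtrip cost of embedding versus linking can be illustrated with a small in-memory sketch. Nothing here talks to MongoDB: `FakeDb`, `fetch`, and the sample documents are hypothetical stand-ins, and each `fetch` call is simply counted as one roundtrip.

```javascript
// In-memory stand-in for a database; every fetch() counts as one roundtrip.
// FakeDb and its collections are hypothetical, for illustration only.
class FakeDb {
  constructor(collections) {
    this.collections = collections;
    this.roundtrips = 0;
  }
  fetch(collection, id) {
    this.roundtrips += 1;
    return this.collections[collection].get(id);
  }
}

// Embedded model: comments live inside the post document.
const embeddedDb = new FakeDb({
  posts: new Map([[1, { title: "My First blog", comments: [{ author: "James" }] }]]),
});

// Linked model: the post stores comment ids that are fetched separately.
const linkedDb = new FakeDb({
  posts: new Map([[1, { title: "My First blog", commentIds: [10] }]]),
  comments: new Map([[10, { author: "James" }]]),
});

function readEmbedded(db, postId) {
  return db.fetch("posts", postId);
}

function readLinked(db, postId) {
  const post = db.fetch("posts", postId);
  const comments = post.commentIds.map((id) => db.fetch("comments", id));
  return { title: post.title, comments };
}

const embeddedPost = readEmbedded(embeddedDb, 1);
const linkedPost = readLinked(linkedDb, 1);
console.log("embedded roundtrips:", embeddedDb.roundtrips);
console.log("linked roundtrips:", linkedDb.roundtrips);
```

The linked read pays one extra roundtrip per referenced comment, which is the trade-off to weigh against the larger documents that embedding produces.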
Indexes
Index common queries
Do not over-index
- Right-balanced indexes keep the working set small
- (A) and (A,B) are equivalent for queries on A alone, because a compound index on (A,B) also serves them; choose one
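
Why an index on (A) is redundant next to one on (A,B) can be sketched with a prefix check. This is a deliberately simplified model, not MongoDB's planner: `indexCanServe` is a hypothetical helper that only considers equality matches and ignores sort order and range predicates.

```javascript
// Returns true when the queried fields are exactly covered by the leading
// keys of the index. Hypothetical helper; a simplification of real planning.
function indexCanServe(indexKeys, queryFields) {
  const prefix = indexKeys.slice(0, queryFields.length);
  return queryFields.every((field) => prefix.includes(field));
}

console.log(indexCanServe(["a", "b"], ["a"]));      // query on a alone
console.log(indexCanServe(["a", "b"], ["a", "b"])); // query on a and b
console.log(indexCanServe(["a", "b"], ["b"]));      // query on b alone
```

Under this model the compound index serves queries on `a` and on `a` plus `b`, but not on `b` alone, so a separate single-field index is only worth its memory for `b`.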
Query for {a: 7}

With Index
[Diagram: top-level index buckets [-∞, 5), [5, 10), [10, ∞); below them, [-∞, 5) buckets, [5, 7), [7, 9), [9, 10), [10, ∞) buckets, pointing to documents {...}]
Without index - Scan
[Diagram: every document {...} is examined]
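
The difference between the two paths can be made concrete by counting how many entries each one examines. The sketch below uses a binary search over a sorted array as a stand-in for descending index buckets; it is an analogy under that assumption, not how MongoDB stores its B-tree, and the function names are hypothetical.

```javascript
// Without an index: every document is examined.
function scan(docs, value) {
  let examined = 0;
  const matches = [];
  for (const doc of docs) {
    examined += 1;
    if (doc.a === value) matches.push(doc);
  }
  return { matches, examined };
}

// With an "index": binary search over sorted keys, standing in for a
// descent through index buckets.
function indexedLookup(sortedKeys, value) {
  let lo = 0;
  let hi = sortedKeys.length - 1;
  let examined = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >> 1;
    examined += 1;
    if (sortedKeys[mid] === value) return { found: true, examined };
    if (sortedKeys[mid] < value) lo = mid + 1;
    else hi = mid - 1;
  }
  return { found: false, examined };
}

const docs = Array.from({ length: 1000 }, (_, i) => ({ a: i }));
const sortedKeys = docs.map((d) => d.a); // already in ascending order
const scanResult = scan(docs, 7);
const indexResult = indexedLookup(sortedKeys, 7);
console.log("scan examined:", scanResult.examined);
console.log("index examined:", indexResult.examined);
```

For 1000 documents the scan touches all of them, while the search needs at most about log2(1000), roughly ten, comparisons.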

Indexing Embedded Documents & Multikeys

db.posts.save({
  title: "My First blog",
  tags: ["mongodb", "cool"],
  comments: [ {author: "James", ts: new Date()} ]
});
db.posts.ensureIndex({tags: 1})
db.posts.ensureIndex({"comments.author": 1})
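
A multikey index holds one entry per array element, each pointing back to the containing document. The sketch below models that idea with a plain `Map`; `buildMultikeyIndex` and the second sample post are hypothetical, and nothing here reflects MongoDB's on-disk format.

```javascript
// Toy multikey index: one entry per array element, mapping the element
// value to the ids of the documents that contain it.
function buildMultikeyIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const values = Array.isArray(doc[field]) ? doc[field] : [doc[field]];
    for (const value of values) {
      if (!index.has(value)) index.set(value, []);
      index.get(value).push(doc._id);
    }
  }
  return index;
}

const posts = [
  { _id: 1, title: "My First blog", tags: ["mongodb", "cool"] },
  { _id: 2, title: "Second post", tags: ["mongodb"] }, // made-up sample
];
const tagIndex = buildMultikeyIndex(posts, "tags");
console.log(tagIndex.get("mongodb"));
console.log(tagIndex.get("cool"));
```

A query on a single tag then becomes one lookup in the map rather than a scan of every document's array.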
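Picking an Index
find({x: 10, y: "foo"})
[Diagram: three candidate plans (scan, index on x, index on y) are tried for the query; the losing plans are terminated and the winning plan is remembered]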
What is Sharding
Ad-hoc partitioning
Consistent hashing
- Amazon Dynamo
Range based partitioning
- Google BigTable
- Yahoo! PNUTS
- MongoDB
MongoDB Sharding
Automatic partitioning and management
Range based
Convert to sharded system with no downtime
Fully consistent
How MongoDB Sharding works

> db.runCommand( { addshard : "shard1" } );
> db.runCommand( { shardCollection : "mydb.blogs", key : { age : 1} } )
Chunks: -∞..+∞
Range keys from -∞ to +∞
Ranges are stored as chunks

> db.posts.save( {age:40} )
Chunks: -∞..+∞ → -∞..40 | 41..+∞
Data is inserted
Ranges are split into more chunks

> db.posts.save( {age:40} )
> db.posts.save( {age:50} )
Chunks: -∞..40 | 41..+∞ → -∞..40 | 41..50 | 51..+∞
More data is inserted
Ranges are split into more chunks

> db.posts.save( {age:40} )
> db.posts.save( {age:50} )
> db.posts.save( {age:60} )
Chunks: -∞..40 | 41..50 | 51..+∞ → -∞..40 | 41..50 | 51..60 | 61..+∞

shard1: -∞..40 | 41..50 | 51..60 | 61..+∞

> db.runCommand( { addshard : "shard2" } );
> db.runCommand( { addshard : "shard3" } );
shard1: -∞..40
shard2: 41..50
shard3: 51..60 | 61..+∞
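
The split-then-balance sequence above can be sketched as a toy simulation. It rests on loud assumptions: chunks split on a key count (`MAX_KEYS_PER_CHUNK`, a made-up threshold) rather than on data size, ranges are half-open, and `balance` simply evens out chunk counts. Real MongoDB uses different thresholds and policies, and the split points will not match the slide's exactly.

```javascript
const MAX_KEYS_PER_CHUNK = 2; // hypothetical threshold for this toy model

// Insert a key into the chunk whose range [min, max) contains it,
// splitting that chunk at its median key when it grows too large.
function insertKey(chunks, key) {
  const i = chunks.findIndex((c) => key >= c.min && key < c.max);
  const chunk = chunks[i];
  chunk.keys.push(key);
  chunk.keys.sort((x, y) => x - y);
  if (chunk.keys.length <= MAX_KEYS_PER_CHUNK) return;
  const mid = chunk.keys[Math.floor(chunk.keys.length / 2)];
  if (mid <= chunk.min) return; // a run of identical keys cannot be split
  const left = { min: chunk.min, max: mid, shard: chunk.shard,
                 keys: chunk.keys.filter((k) => k < mid) };
  const right = { min: mid, max: chunk.max, shard: chunk.shard,
                  keys: chunk.keys.filter((k) => k >= mid) };
  chunks.splice(i, 1, left, right);
}

// Move chunks from the fullest shard to the emptiest one until the
// chunk counts differ by at most one.
function balance(chunks, shards) {
  const count = (s) => chunks.filter((c) => c.shard === s).length;
  for (;;) {
    const sorted = [...shards].sort((x, y) => count(x) - count(y));
    const least = sorted[0];
    const most = sorted[sorted.length - 1];
    if (count(most) - count(least) <= 1) return;
    chunks.find((c) => c.shard === most).shard = least;
  }
}

const chunks = [{ min: -Infinity, max: Infinity, keys: [], shard: "shard1" }];
for (const age of [40, 50, 60, 70, 80]) insertKey(chunks, age);
balance(chunks, ["shard1", "shard2", "shard3"]);
for (const c of chunks) console.log(c.shard, c.min, c.max, c.keys);
```

Splitting keeps every chunk small enough to move cheaply, and balancing is then just a matter of reassigning whole chunks to the newly added shards.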
Sharding Features
Shard data with no downtime
Automatic balancing as data is written
Commands routed (switched) to correct node
- Inserts - must have the Shard Key
- Updates - must have the Shard Key
- Queries
  - With Shard Key - routed to nodes
  - Without Shard Key - scatter gather
- Indexed Queries
  - With Shard Key - routed in order
  - Without Shard Key - distributed sort merge
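
The routing rule for queries can be sketched as a lookup against a chunk map. This is an illustrative model only: `chunkMap`, `targetShards`, and the half-open range boundaries are assumptions, and only equality matches on the shard key are handled.

```javascript
// Hypothetical chunk map: each range [min, max) is owned by one shard.
const chunkMap = [
  { min: -Infinity, max: 41, shard: "shard1" },
  { min: 41, max: 51, shard: "shard2" },
  { min: 51, max: Infinity, shard: "shard3" },
];

// With the shard key in the query, route to the one owning shard;
// without it, every shard has to be asked (scatter gather).
function targetShards(query, shardKey, map) {
  if (shardKey in query) {
    const value = query[shardKey];
    const chunk = map.find((c) => value >= c.min && value < c.max);
    return [chunk.shard];
  }
  return [...new Set(map.map((c) => c.shard))];
}

console.log(targetShards({ age: 40 }, "age", chunkMap));
console.log(targetShards({ name: "James" }, "age", chunkMap));
```

The first query reaches a single shard, while the second fans out to all three, which is why the choice of shard key should follow the most common query pattern.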
MongoDB Replication
MongoDB replication like MySQL replication
- Asynchronous master/slave
Variations:
- Master / slave
- Replica Sets
Replica Set features

A cluster of N servers
Any (one) node can be primary
Consensus election of primary
Automatic failover
Automatic recovery
All writes to primary
Reads can be to primary (default) or a secondary
How MongoDB Replication works

[Diagram: Member 1, Member 2, Member 3]
Set is made up of 2 or more nodes

[Diagram: Member 1, Member 2 (PRIMARY), Member 3]
Election establishes the PRIMARY
Data replication from PRIMARY to SECONDARY

[Diagram: Member 2 (DOWN); Member 1 and Member 3 negotiate new master]
PRIMARY may fail
Automatic election of new PRIMARY

[Diagram: Member 1, Member 2 (DOWN), Member 3 (PRIMARY)]
New PRIMARY elected
Replica Set re-established

[Diagram: Member 1, Member 2 (RECOVERING), Member 3 (PRIMARY)]
Automatic recovery

[Diagram: Member 1, Member 2, Member 3 (PRIMARY)]
Replica Set re-established
Creating a Replica Set

> cfg = {
    _id : "acme_a",
    members : [
      { _id : 0, host : "sf1.acme.com" },
      { _id : 1, host : "sf2.acme.com" },
      { _id : 2, host : "sf3.acme.com" }
    ]
  }
> use admin
> db.runCommand( { replSetInitiate : cfg } )
Replica Set Member Types

Normal {priority:1}
Passive {priority:0}
- Cannot be elected as PRIMARY
Arbiters
- Can vote in an election
- Do not hold any data
Hidden {hidden:true}
Tagging - New in 2.0
- tags : {"dc": "ny", "rack": "r23s5"}
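
How member types interact with failover can be sketched with a simplified election rule. The function below is an assumption-laden model, not MongoDB's election protocol: it only requires that a strict majority of members is reachable and then picks the reachable data-bearing member with the highest priority, ignoring factors such as how up to date each member is. `electPrimary` and the `up`, `arbiter`, and `priority` fields are hypothetical names.

```javascript
// Simplified election: needs a strict majority of members up, then picks
// the highest-priority member that holds data and has priority > 0.
function electPrimary(members) {
  const up = members.filter((m) => m.up);
  if (up.length <= members.length / 2) return null; // no majority, no primary
  const candidates = up.filter((m) => !m.arbiter && m.priority > 0);
  if (candidates.length === 0) return null;
  return candidates.reduce((best, m) => (m.priority > best.priority ? m : best)).name;
}

const threeNodes = [
  { name: "Member 1", up: true, priority: 1 },
  { name: "Member 2", up: false, priority: 1 }, // the failed PRIMARY
  { name: "Member 3", up: true, priority: 1 },
];
const twoNodes = [
  { name: "Member 1", up: true, priority: 1 },
  { name: "Member 2", up: false, priority: 1 },
];
const withArbiter = [...twoNodes, { name: "Arbiter", up: true, arbiter: true, priority: 0 }];

console.log(electPrimary(threeNodes));
console.log(electPrimary(twoNodes));
console.log(electPrimary(withArbiter));
```

In this model a two-node set cannot elect a new primary after losing one member, while adding an arbiter restores the majority without storing a third copy of the data.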
Using Replicas
slaveOk()
- driver will send read requests to Secondaries
- driver will always send writes to Primary
Java examples
- DB.slaveOk()
- Collection.slaveOk()
- find(q).addOption(Bytes.QUERYOPTION_SLAVEOK);
Safe Writes
db.runCommand({getLastError: 1, w : 1})
- ensure write is synchronous
- command returns after primary has written to memory
w=n or w='majority'
- n is the number of nodes data must be replicated to
- driver will always send writes to Primary
w='myTag' [MongoDB 2.0]
- Each member is "tagged", e.g. "US_EAST", "EMEA", "US_WEST"
- Ensure that the write is executed in each tagged "region"
fsync:true
- Ensures changed disk blocks are flushed to disk
j:true
- Ensures changes are flushed to Journal
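
The meaning of `w` can be sketched as a simple threshold check. This models only the numeric and `'majority'` forms, and assumes the primary counts as one acknowledging member; `writeConcernSatisfied` is a hypothetical helper, not a driver or server API, and tag-based values are left out.

```javascript
// acks: number of members (primary included) that have applied the write.
// Returns true once enough members have acknowledged for the given w.
function writeConcernSatisfied(w, acks, totalMembers) {
  const needed = w === "majority" ? Math.floor(totalMembers / 2) + 1 : w;
  return acks >= needed;
}

console.log(writeConcernSatisfied(1, 1, 3));          // primary only
console.log(writeConcernSatisfied(2, 1, 3));          // waiting on a secondary
console.log(writeConcernSatisfied("majority", 2, 3)); // 2 of 3 members
```

Higher values of `w` trade write latency for durability, since the command has to wait for replication before it returns.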

Replication features
Reads from Primary are always consistent
Reads from Secondaries are eventually consistent
Automatic failover if a Primary fails
Automatic recovery when a node joins the set
Scaling Use Case

User profile information
Multiple ways to identify a "user"
- Facebook ID
- Twitter Name
- Email address
- SSN# / National Identifier

What is the best schema, index and sharding strategy?
Schema #1
> db.profiles.save( {
    _id : "...",
    facebook_name : "alvin.j.richards",
    twitter_name : "jonnyeight",
    linkedin_name : "alvinrichards",
    details : { loc: [50.78076,7.181969], ...}
  })
> db.profiles.ensureIndex({facebook_name:1})
> db.runCommand( { shardCollection : "social.profiles", key : { facebook_name : 1} } )
Schema #1
Good:
- Schema is simple to understand
- Easy to add new identifiers, e.g. foursquare name
- Query on the shard key (facebook_name) is routed to a shard
  db.profiles.find({facebook_name: "alvin.j.richards"})
Bad:
- Each identifier needs a separate index
- More indexes means less data in memory
- Memory contention and disk paging
- Query on any other identifier is scatter/gathered across the cluster
  db.profiles.find({linkedin_name:"alvinrichards"})
Schema #2
> db.profiles.save( {
    _id : ObjectId("1234"),
    details : {loc: [50.78076,7.181969], ...}
  })
> db.identifiers.save( {
    _id : {type: "facebook_name", value: "alvin.j.richards"},
    profile: ObjectId("1234")
  })
> db.identifiers.save( {
    _id : {type: "twitter_name", value: "jonnyeight"},
    profile: ObjectId("1234")
  })
> db.runCommand( { shardCollection : "social.identifiers", key : { _id : 1} } )
> db.runCommand( { shardCollection : "social.profiles", key : { _id : 1} } )
Schema #2
Good:
- Easy to add new identifiers, e.g. foursquare name
- All queries are routed to a shard
  > db.identifiers.find(
      {_id : {type: "facebook_name", value: "alvin.j.richards"}})
  > db.identifiers.find(
      {_id : {type: "foursquare_id", value: "alvin10gen"}})
Bad:
- Schema is more complex
- Two lookups are required for each access (but both routed)
- Need to maintain links (data relationships)
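
The two-lookup access path of Schema #2 can be sketched with two in-memory maps standing in for the `identifiers` and `profiles` collections. The maps, `idKey`, and `findProfile` are hypothetical helpers for illustration; the profile id is kept as the plain string "1234" rather than a real ObjectId.

```javascript
// Stand-ins for the two collections; keys play the role of _id.
const identifiers = new Map(); // composite {type, value} -> profile id
const profiles = new Map();    // profile id -> profile document

// Serialize the composite identifier so it can be used as a Map key.
function idKey(type, value) {
  return JSON.stringify({ type, value });
}

profiles.set("1234", { details: { loc: [50.78076, 7.181969] } });
identifiers.set(idKey("facebook_name", "alvin.j.richards"), "1234");
identifiers.set(idKey("twitter_name", "jonnyeight"), "1234");

function findProfile(type, value) {
  const profileId = identifiers.get(idKey(type, value)); // lookup 1
  if (profileId === undefined) return null;
  return profiles.get(profileId);                        // lookup 2
}

console.log(findProfile("twitter_name", "jonnyeight"));
console.log(findProfile("foursquare_id", "alvin10gen"));
```

Both lookups are by `_id`, which is the shard key of each collection, so each one can be routed to a single shard; adding a new identifier type means adding documents, not indexes.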
Summary
Schema & Index design
- Simplest way to scale
Sharding
- Automatically scale writes
Replication
- Automatically scale reads
download at mongodb.org
[email protected] MongoDB Munich, Germany - October 10
conferences, appearances, and meetups
http://www.10gen.com/events
http://bit.ly/mongoW
Facebook | Twitter | LinkedIn

@mongodb
http://linkd.in/joinmongo