What is MongoDB?
          What Makes Mongo Special?

MongoDB
Or how I learned to stop worrying and love the database

                          Mathias Stearn, 10gen


    Bay Area Hadoop Meetup – February 17, 2010

                      Mathias Stearn   MongoDB

    The resulting [MongoDB] application has literally
    changed the way the pharma company conducts
    business. Whereas in the past, patient queries could
    take minutes to hours, results are now essentially
    real-time.

    Why bother [with] memcached for caching HTTP sessions
    when you have an authoritative MongoDB? The
    performance is there.

Carl Byström, @cgbystrom on twitter


    Compared to hadoop, Mongo’s speed and startup time
    make developing new queries much easier; what took
    us two weeks to get working on hadoop was done in
    two days on mongo.
Emmett Shear, CTO (and developer) at

    It took me half a day to go from not touching MongoDB to
    writing some fairly good functionality against it. It makes
    setting up, configuring, and interfacing with MySQL look
    archaic – ridiculously archaic.


1   What is MongoDB?
     Document Oriented
     JavaScript Enabled
     Fast, Scalable, Available, and Reliable
2   What Makes Mongo Special?
     Native Language Integration
     Rich Data Types
     Atomic Modifiers
     Dynamic Queries
3   MapReduce
      Built-In MapReduce
      Easy Hadoop-Mongo Integration
      Better Hadoop-Mongo Integration


   Document Oriented
   Organized into Databases and Collections (like Tables)
   JSON-like (BSON)
   Schemaless
   Dynamic, Strong Typing
   Database can “reach into” objects

db.people.insert({
  _id: "mstearn",
  name: "Mathias Stearn",
  karma: 42,
  active: true,
  birthdate: new Date(517896000000),
  interests: ["MongoDB", "Python", "Üñíçød̄ĕ"],
  subobject: {foo: "bar"}
});
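
The "reach into" bullet refers to addressing nested fields by a dotted path such as subobject.foo. A minimal sketch in plain JavaScript of how such a path can be resolved against a document like the one above; `getPath` is a hypothetical helper written for illustration, not part of MongoDB, and it runs without a server.

```javascript
// Hypothetical helper: resolve a dotted path such as "subobject.foo"
// against a plain object, returning undefined when any step is missing.
function getPath(doc, path) {
  return path.split(".").reduce(function (value, key) {
    return value == null ? undefined : value[key];
  }, doc);
}

var person = {
  _id: "mstearn",
  karma: 42,
  interests: ["MongoDB", "Python"],
  subobject: { foo: "bar" }
};

console.log(getPath(person, "subobject.foo")); // nested field
console.log(getPath(person, "interests.0"));   // array element by index
console.log(getPath(person, "missing.field")); // absent path
```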


JavaScript used for:
    Shell and Documentation
    (Very) Advanced Queries
    “Group By” Queries
    MapReduce

db.users.find({$where: "this.a + this.b >= 42"});

db.posts.group(   // collection name assumed; the extracted slide omits this line
  { key: {user: true}
  , initial: {count:0, comments:0}
  , reduce: function(doc,out){
      out.count++;
      out.comments += doc.comments.length; }
  , finalize: function(out){
      out.avg = out.comments / out.count; }
  });
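
To make the roles of initial, reduce, and finalize concrete, here is a rough plain-JavaScript sketch of "group by" semantics over an in-memory array. The `group` function and the sample posts are hypothetical stand-ins, not the server's implementation, and for brevity the helper takes a single field name as its key.

```javascript
// Sketch of group-by semantics: one accumulator per key value, seeded from
// `initial`, updated by `reduce`, then post-processed by `finalize`.
function group(docs, spec) {
  var buckets = Object.create(null);
  docs.forEach(function (doc) {
    var k = doc[spec.key];
    if (!(k in buckets)) {
      // Copy `initial` so every group gets its own accumulator.
      buckets[k] = Object.assign({}, spec.initial);
      buckets[k][spec.key] = k;
    }
    spec.reduce(doc, buckets[k]);
  });
  return Object.keys(buckets).map(function (k) {
    if (spec.finalize) spec.finalize(buckets[k]);
    return buckets[k];
  });
}

// Made-up sample data.
var posts = [
  { user: "mstearn", comments: ["a", "b", "c"] },
  { user: "mstearn", comments: ["d"] },
  { user: "mdirolf", comments: ["e", "f"] }
];

var stats = group(posts, {
  key: "user",
  initial: { count: 0, comments: 0 },
  reduce: function (doc, out) {
    out.count++;
    out.comments += doc.comments.length;
  },
  finalize: function (out) {
    out.avg = out.comments / out.count;
  }
});

console.log(stats);
```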


Fast, Scalable, Available, and Reliable

       Master-Slave replication for Availability and Reliability
            Replica-Pairs support auto-negotiation for master
       Auto-Sharding for Horizontal Scalability
            Distributes based on specified field
            Currently alpha
       MMAP database files to automatically use available RAM
       Asynchronous modifications



Native Language Integration

   Official
       Java/JVM, Python, Ruby, C/C++, Perl, PHP
   Community Supported
       Clojure, Scala, C#, Haskell, Erlang, and More


Rich Data Types

   JSON
       String (UTF8)
       Double
       Object (hash/map/dict)
       Array
       Bool
       Null / Undefined
   Extras
       Date
       Int32 / Int64
       ObjectID (12 bytes: timestamp + host + pid + counter)
       Binary (with type byte)
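
The ObjectID layout listed above (4-byte timestamp, 3-byte host, 2-byte pid, 3-byte counter) can be taken apart from its 24-digit hex form. A small sketch, assuming the timestamp and counter are big-endian; `parseObjectId` and the sample ID are made up for illustration.

```javascript
// Hypothetical helper: split a 24-hex-digit ObjectID string into the four
// fields named on the slide.
function parseObjectId(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new Error("expected 24 hex digits");
  }
  return {
    seconds: parseInt(hex.slice(0, 8), 16),   // 4 bytes: seconds since the Unix epoch
    host: hex.slice(8, 14),                   // 3 bytes: host identifier (kept as hex)
    pid: hex.slice(14, 18),                   // 2 bytes: process id (kept as hex)
    counter: parseInt(hex.slice(18, 24), 16)  // 3 bytes: incrementing counter
  };
}

// Made-up ID for illustration only.
var parts = parseObjectId("4b7c8f00a1b2c3040000002a");
console.log(parts);
console.log(new Date(parts.seconds * 1000).toISOString());
```

Because the timestamp comes first, ObjectIDs sort roughly by creation time.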


Atomic Modifiers

   $set
   $inc
   $multiply (soon)
   $push / $pushAll
   $pull / $pullAll

db.posts.update({_id:SOMEID}, {$push:{tags:"mongodb"}})
db.tags.update({_id:"mongodb"}, {$inc:{count:1}},
               {upsert:true})
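
What the modifiers and the upsert flag do to a document can be sketched against an in-memory Map. `applyUpdate` is a hypothetical helper covering only $set, $inc, and $push; it illustrates the resulting document and does not model the server-side atomicity that is the point of the real operators.

```javascript
// Hypothetical in-memory sketch of modifier semantics.
function applyUpdate(collection, id, modifiers, options) {
  var doc = collection.get(id);
  if (!doc) {
    if (!(options && options.upsert)) return false; // nothing matched, nothing created
    doc = { _id: id };                               // upsert creates the document
    collection.set(id, doc);
  }
  Object.keys(modifiers.$set || {}).forEach(function (field) {
    doc[field] = modifiers.$set[field];
  });
  Object.keys(modifiers.$inc || {}).forEach(function (field) {
    doc[field] = (doc[field] || 0) + modifiers.$inc[field]; // missing field starts at 0
  });
  Object.keys(modifiers.$push || {}).forEach(function (field) {
    if (!doc[field]) doc[field] = [];
    doc[field].push(modifiers.$push[field]);
  });
  return true;
}

var tags = new Map();
applyUpdate(tags, "mongodb", { $inc: { count: 1 } }, { upsert: true });
applyUpdate(tags, "mongodb", { $inc: { count: 1 } }, { upsert: true });
var missed = applyUpdate(tags, "nosuch", { $inc: { count: 1 } }); // no upsert

console.log(tags.get("mongodb"), missed);
```

The first call creates the tag document and the second increments it, so a counter can be maintained without first checking whether the document exists.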



Simple

     db.posts.findOne({ user: "mstearn" });

     var cursor = db.posts.find({ user: "mstearn" });
     cursor.forEach(function(post){
         doSomething(post.text); });



Sorted

db.posts.find(
       { user: "mstearn" }
).sort({timestamp:-1})



Paginated

db.posts.find(
       { user: "mstearn" }
).sort({timestamp:-1}).skip(10).limit(10);
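
The sort, skip, and limit chain amounts to "order everything, drop the first N, keep the next M". A plain-JavaScript sketch of that behaviour over an array; `page` and the generated posts are hypothetical and only mirror the result, not how the server executes the query.

```javascript
// Hypothetical helper mirroring sort({field: direction}).skip(n).limit(m).
function page(docs, sortField, direction, skip, limit) {
  return docs
    .slice() // copy so the caller's array is not reordered
    .sort(function (a, b) {
      return direction * (a[sortField] - b[sortField]);
    })
    .slice(skip, skip + limit);
}

// 25 made-up posts with timestamps 1..25.
var posts = [];
for (var i = 1; i <= 25; i++) {
  posts.push({ user: "mstearn", timestamp: i });
}

// Newest first, second page of ten.
var secondPage = page(posts, "timestamp", -1, 10, 10);
console.log(secondPage.map(function (p) { return p.timestamp; }));
```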


Simple Tag Search

db.posts.find(
       { user: "mstearn"
       , tags: "mongo"
       }
).sort({timestamp:-1}).skip(10).limit(10);


Complex Tag Search

db.posts.find(
       { user: "mstearn"
       , tags: {$in: ["mongo", "mongodb"]}
       }
).sort({timestamp:-1}).skip(10).limit(10);


Nested Objects

db.posts.find(
       { user: "mstearn"
       , tags: {$in: ["mongo", "mongodb"]}
       , "comments.user": "mdirolf"
       }
).sort({timestamp:-1}).skip(10).limit(10);


Regular Expressions

db.posts.find(
       { user: "mstearn"
       , tags: {$in: ["mongo", "mongodb"]}
       , "comments.user": "mdirolf"
       , text: /windows/i
       }
).sort({timestamp:-1}).skip(10).limit(10);



Ranges

db.posts.find(
      { user: "mstearn"
      , tags: {$in: ["mongo", "mongodb"]}
      , "comments.user": "mdirolf"
      , text: /windows/i
      , points: {$gt: 10, $lt: 100}
      }
).sort({timestamp:-1}).skip(10).limit(10);


Arbitrary JavaScript

db.posts.find(
       { user: "mstearn"
       , tags: {$in: ["mongo", "mongodb"]}
       , "comments.user": "mdirolf"
       , text: /windows/i
       , points: {$gt: 10, $lt: 100}
       , $where: "this.a + this.b >= 42"
       }
).sort({timestamp:-1}).skip(10).limit(10);
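
How the clauses of such a query combine can be sketched with a small in-memory matcher. Everything here (`candidates`, `satisfies`, `matches`, the sample post) is hypothetical and covers only the operators shown on these slides; it is a simplified illustration that does not reproduce every server rule, for example around ranges on array fields.

```javascript
// Collect every value a dotted path can reach, descending into arrays.
function candidates(doc, path) {
  var current = [doc];
  path.split(".").forEach(function (key) {
    var next = [];
    current.forEach(function (v) {
      // An array contributes each of its elements.
      (Array.isArray(v) ? v : [v]).forEach(function (e) {
        if (e != null && e[key] !== undefined) next.push(e[key]);
      });
    });
    current = next;
  });
  // A final array value is matched element by element.
  return current.reduce(function (acc, v) { return acc.concat(v); }, []);
}

// Test one value against one condition: regex, operator object, or literal.
function satisfies(value, cond) {
  if (cond instanceof RegExp) return typeof value === "string" && cond.test(value);
  if (cond !== null && typeof cond === "object") {
    return Object.keys(cond).every(function (op) {
      if (op === "$in") return cond.$in.indexOf(value) !== -1;
      if (op === "$gt") return value > cond.$gt;
      if (op === "$lt") return value < cond.$lt;
      throw new Error("unsupported operator: " + op);
    });
  }
  return value === cond;
}

// All clauses must hold (implicit AND).
function matches(doc, query) {
  return Object.keys(query).every(function (field) {
    if (field === "$where") {
      // Evaluate the expression with `this` bound to the document.
      return Boolean(new Function("return " + query.$where).call(doc));
    }
    return candidates(doc, field).some(function (v) {
      return satisfies(v, query[field]);
    });
  });
}

// Made-up sample document.
var post = {
  user: "mstearn",
  tags: ["mongo", "nosql"],
  comments: [{ user: "mdirolf", text: "nice" }],
  text: "Runs on Windows too",
  points: 42,
  a: 40,
  b: 2
};

var query = {
  user: "mstearn",
  tags: { $in: ["mongo", "mongodb"] },
  "comments.user": "mdirolf",
  text: /windows/i,
  points: { $gt: 10, $lt: 100 },
  $where: "this.a + this.b >= 42"
};

console.log(matches(post, query));
```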


Built-In MapReduce

db.posts.mapReduce(
   function() {
     this.comments.forEach(function(c){
       emit(c.user,
            {count:1, words:c.text.split(/\s+/).length}); }); }
 , function(key, values){
     for (var i=1; i<values.length; i++){
       values[0].count += values[i].count;
       values[0].words += values[i].words; }
     return values[0]; }
 , { finalize: function(key, out){
       out.avg = out.words / out.count;
       return out; }
   , query: {posted: {$gt: new Date(2010,0,1)}}
   , out: 'posts.comment_stats'
   }
);
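
The flow of that job (filter, map with emit, reduce per key, finalize) can be traced with a plain-JavaScript sketch. The `mapReduce` function below is a hypothetical in-memory stand-in: it passes emit as an argument rather than providing it as a global, and takes the query as a predicate function, both assumptions made to keep the example self-contained.

```javascript
// Hypothetical in-memory sketch of the map / reduce / finalize pipeline.
function mapReduce(docs, map, reduce, options) {
  var emitted = new Map();
  function emit(key, value) {
    if (!emitted.has(key)) emitted.set(key, []);
    emitted.get(key).push(value);
  }
  docs
    .filter(options.query || function () { return true; })
    .forEach(function (doc) { map.call(doc, emit); }); // `this` is the document
  var out = [];
  emitted.forEach(function (vals, key) {
    // A key with a single emitted value skips the reduce step.
    var value = vals.length > 1 ? reduce(key, vals) : vals[0];
    if (options.finalize) value = options.finalize(key, value);
    out.push({ _id: key, value: value });
  });
  return out;
}

// Made-up sample data; the 2009 post is excluded by the query.
var posts = [
  { posted: new Date(2010, 1, 1), comments: [
      { user: "mdirolf", text: "looks good to me" },
      { user: "mstearn", text: "thanks" } ] },
  { posted: new Date(2010, 1, 2), comments: [
      { user: "mdirolf", text: "one more" } ] },
  { posted: new Date(2009, 5, 1), comments: [
      { user: "mdirolf", text: "too old to count" } ] }
];

var stats = mapReduce(
  posts,
  function (emit) {
    this.comments.forEach(function (c) {
      emit(c.user, { count: 1, words: c.text.split(/\s+/).length });
    });
  },
  function (key, values) {
    for (var i = 1; i < values.length; i++) {
      values[0].count += values[i].count;
      values[0].words += values[i].words;
    }
    return values[0];
  },
  { query: function (doc) { return doc.posted > new Date(2010, 0, 1); },
    finalize: function (key, out) { out.avg = out.words / out.count; return out; } }
);

console.log(JSON.stringify(stats));
```

The reduce function returns a value of the same shape as the emitted values, which is what allows it to be applied repeatedly to partial results.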


Easy Hadoop-Mongo Integration

      mongoexport can export to JSON/CSV/TSV
          Can also easily use a custom script
      Process in Hadoop
      Use mongoimport to get data back into MongoDB


Better Hadoop-Mongo Integration

      mongodump writes a stream of BSON to a file
      Write an InputFormat and RecordReader to read BSON
      Write a BSONWriter class to directly use the data
          Just added two methods to driver to make this easier
      Process the data with the Java/Scala/Clojure driver
      Write a custom RecordWriter to either:
          Dump to a file and use mongorestore
          Dump the output directly to MongoDB
      Optional: use renameCollection to mimic our MapReduce
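
Reading a stream of BSON relies on the fact that every BSON document starts with a little-endian int32 giving its total size. A sketch of that framing step, the part a RecordReader would need, written here in JavaScript with Node's Buffer rather than Java; `splitBsonStream` and the hand-built sample bytes are illustrative assumptions, and decoding the fields inside each document is left out.

```javascript
// Split a buffer of back-to-back BSON documents into one Buffer per document.
function splitBsonStream(buf) {
  var docs = [];
  var offset = 0;
  while (offset < buf.length) {
    if (offset + 4 > buf.length) throw new Error("truncated length prefix");
    var size = buf.readInt32LE(offset); // total size, including these 4 bytes
    if (size < 5 || offset + size > buf.length) {
      throw new Error("bad document size at offset " + offset);
    }
    docs.push(buf.subarray(offset, offset + size));
    offset += size;
  }
  return docs;
}

// Two hand-built documents: {} (5 bytes) and {a: 1} with an int32 value (12 bytes).
var emptyDoc = Buffer.from([5, 0, 0, 0, 0]);
var aIsOne = Buffer.from([12, 0, 0, 0, 0x10, 0x61, 0, 1, 0, 0, 0, 0]);

var records = splitBsonStream(Buffer.concat([emptyDoc, aIsOne]));
console.log(records.map(function (r) { return r.length; }));
```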


Upcoming events

      NoSQL Live! from Boston (March 11)
      MongoDB Training in San Francisco (March 25)
      San Francisco MySQL Meetup (April 12)



Links

 (Try mongo in your browser)
        #mongodb on
        mongodb-user on google groups
        @mathias_mongo on twitter


10 Gen 20100217 Hadoop Bay Area

  • 1. What is MongoDB? What Makes Mongo Special? MapReduce MongoDB Or how I learned to stop worrying and love the database Mathias Stearn 10gen Bay Area Hadoop Meetup – February 17, 2010 Mathias Stearn MongoDB
  • 2. What is MongoDB? What Makes Mongo Special? MapReduce The resulting [MongoDB] application has literally changed the way the pharma company conducts business. Whereas in the past, patient queries could take minutes to hours, results are now essentially real-time. Why bother [with] memcached for caching HTTP sessions when you have an authoritative MongoDB? The performance is there. Carl Byström, @cgbystrom on twitter Mathias Stearn MongoDB
  • 3. What is MongoDB? What Makes Mongo Special? MapReduce Compared to hadoop, Mongo’s speed and startup time make developing new queries much easier; what took us two weeks to get working on hadoop was done in two days on mongo. Emmett Shear, CTO (and developer) at It took me half a day to go from not touching MongoDB to writing some fairly good functionality against it. It makes setting up, configuring, and interfacing with MySQL look archaic – ridiculously archaic. mongodb-day-geek-austin-data-series Mathias Stearn MongoDB
  • 4. What is MongoDB? What Makes Mongo Special? MapReduce 1 What is MongoDB? Document Oriented JavaScript Enabled Fast, Scalable, Available, and Reliable 2 What Makes Mongo Special? Native Language Integration Rich Data Types Atomic Modifiers Dynamic Queries 3 MapReduce Built-In MapReduce Easy Hadoop-Mongo Integration Better Hadoop-Mongo Integration Mathias Stearn MongoDB
  • 5. What is MongoDB? Document Oriented What Makes Mongo Special? JavaScript Enabled MapReduce Fast, Scalable, Available, and Reliable 1 What is MongoDB? Document Oriented JavaScript Enabled Fast, Scalable, Available, and Reliable 2 What Makes Mongo Special? Native Language Integration Rich Data Types Atomic Modifiers Dynamic Queries 3 MapReduce Built-In MapReduce Easy Hadoop-Mongo Integration Better Hadoop-Mongo Integration Mathias Stearn MongoDB
  • 6. What is MongoDB? Document Oriented What Makes Mongo Special? JavaScript Enabled MapReduce Fast, Scalable, Available, and Reliable Document Oriented Organized into Databases and Collections (like Tables) JSON-like (BSON) Schemaless Dynamic, Strong Typing Database can “reach into” objects db.people.insert({ _id: "mstearn", name: "Mathias Stearn", karma: 42, active: true, birthdate: new Date(517896000000), interests: ["MongoDB", "Python", "Üñíçød̄ĕ"], subobject: {foo: "bar"} }); Mathias Stearn MongoDB
  • 7. What is MongoDB? Document Oriented What Makes Mongo Special? JavaScript Enabled MapReduce Fast, Scalable, Available, and Reliable JavaScript used for: Shell and Documentation (Very) Advanced Queries “Group By” Queries MapReduce db.users.find({$where: "this.a + this.b >= 42"}); { key: "user" , initial: {count:0, comments:0} , reduce: function(doc,out){ out.count++; out.comments += doc.comments.length; } , finalize: function(out){ out.avg = out.comments / out.count; } }); Mathias Stearn MongoDB
  • 8. What is MongoDB? Document Oriented What Makes Mongo Special? JavaScript Enabled MapReduce Fast, Scalable, Available, and Reliable Fast, Scalable, Available, and Reliable Master-Slave replication for Availability and Reliability Replica-Pairs support auto-negotiation for master Auto-Sharding for Horizontal Scalability Distributes based on specified field Currently alpha MMAP database files to automatically use available RAM Asynchronous modifications Mathias Stearn MongoDB
  • 9. What is MongoDB? Document Oriented What Makes Mongo Special? JavaScript Enabled MapReduce Fast, Scalable, Available, and Reliable Mathias Stearn MongoDB
  • 10. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries 1 What is MongoDB? Document Oriented JavaScript Enabled Fast, Scalable, Available, and Reliable 2 What Makes Mongo Special? Native Language Integration Rich Data Types Atomic Modifiers Dynamic Queries 3 MapReduce Built-In MapReduce Easy Hadoop-Mongo Integration Better Hadoop-Mongo Integration Mathias Stearn MongoDB
  • 11. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Official Java/JVM Python Ruby C/C++ Perl PHP Community Supported Clojure, Scala, C#, Haskell, Erlang, and More Mathias Stearn MongoDB
  • 12. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries JSON String (UTF8) Double Object (hash/map/dict) Array Bool Null / Undefined Extras Date Int32 / Int64 ObjectID (12 bytes: timestamp + host + pid + counter) Binary (with type byte) Mathias Stearn MongoDB
  • 13. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries $set $inc $multiply (soon) $push / $pushAll $pull / $pullAll db.posts.update({_id:SOMEID}, {$push:{tags:"mongodb"}}) db.tags.update({_id:"mongodb"}, {$inc:{count:1}}, {upsert:true}) Mathias Stearn MongoDB
  • 14. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Simple db.posts.findOne({ user: "mstearn" }); var cursor = db.posts.find({ user: "mstearn" }); cursor.forEach(function(post){ doSomething(post.text); }); Mathias Stearn MongoDB
  • 15. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Sorted db.posts.find( { user: "mstearn" } ).sort({timestamp:-1}) Mathias Stearn MongoDB
  • 16. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Paginated db.posts.find( { user: "mstearn" } ).sort({timestamp:-1}).skip(10).limit(10); Mathias Stearn MongoDB
  • 17. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Simple Tag Search db.posts.find( { user: "mstearn" , tags: "mongo" } ).sort({timestamp:-1}).skip(10).limit(10); Mathias Stearn MongoDB
  • 18. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Complex Tag Search db.posts.find( { user: "mstearn" , tags: {$in: ["mongo", "mongodb"]} } ).sort({timestamp:-1}).skip(10).limit(10); Mathias Stearn MongoDB
  • 19. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Nested Objects db.posts.find( { user: "mstearn" , tags: {$in: ["mongo", "mongodb"]} , "comments.user": "mdirolf" } ).sort({timestamp:-1}).skip(10).limit(10); Mathias Stearn MongoDB
  • 20. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Regular Expressions db.posts.find( { user: "mstearn" , tags: {$in: ["mongo", "mongodb"]} , "comments.user": "mdirolf" , text: /windows/i } ).sort({timestamp:-1}).skip(10).limit(10); Mathias Stearn MongoDB
  • 21. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Ranges db.posts.find( { user: "mstearn" , tags: {$in: ["mongo", "mongodb"]} , "comments.user": "mdirolf" , text: /windows/i , points: {$gt: 10, $lt: 100} } ).sort({timestamp:-1}).skip(10).limit(10); Mathias Stearn MongoDB
  • 22. Native Language Integration What is MongoDB? Rich Data Types What Makes Mongo Special? Atomic Modifiers MapReduce Dynamic Queries Arbitrary JavaScript db.posts.find( { user: "mstearn" , tags: {$in: ["mongo", "mongodb"]} , "comments.user": "mdirolf" , text: /windows/i , points: {$gt: 10, $lt: 100} , $where: "this.a + this.b >= 42" } ).sort({timestamp:-1}).skip(10).limit(10); Mathias Stearn MongoDB
  • 23. What is MongoDB? Built-In MapReduce What Makes Mongo Special? Easy Hadoop-Mongo Integration MapReduce Better Hadoop-Mongo Integration 1 What is MongoDB? Document Oriented JavaScript Enabled Fast, Scalable, Available, and Reliable 2 What Makes Mongo Special? Native Language Integration Rich Data Types Atomic Modifiers Dynamic Queries 3 MapReduce Built-In MapReduce Easy Hadoop-Mongo Integration Better Hadoop-Mongo Integration Mathias Stearn MongoDB
  • 24. What is MongoDB? Built-In MapReduce What Makes Mongo Special? Easy Hadoop-Mongo Integration MapReduce Better Hadoop-Mongo Integration db.posts.mapReduce( function() { this.comments.forEach(function(c){ emit(c.user, {count:1, words:c.text.split(/\s+/).length}); }); } , function(key, values){ for (var i=1; i<values.length; i++){ values[0].count += values[i].count; values[0].words += values[i].words; } return values[0]; } , { finalize: function(key, out){ out.avg = out.words / out.count; return out; } , query: {posted: {$gt: new Date(2010,0,1)}} , out: 'posts.comment_stats' } ); Mathias Stearn MongoDB
  • 25. What is MongoDB? Built-In MapReduce What Makes Mongo Special? Easy Hadoop-Mongo Integration MapReduce Better Hadoop-Mongo Integration Easy Hadoop-Mongo Integration mongoexport can export to JSON/CSV/TSV Can also easily use a custom script Process in Hadoop Use mongoimport to get data back into MongoDB Mathias Stearn MongoDB
  • 26. What is MongoDB? Built-In MapReduce What Makes Mongo Special? Easy Hadoop-Mongo Integration MapReduce Better Hadoop-Mongo Integration Better Hadoop-Mongo Integration mongodump writes a stream of BSON to a file Write an InputFormat and RecordReader to read BSON Write a BSONWriter class to directly use the data Just added two methods to driver to make this easier Process the data with the Java/Scala/Clojure driver Write a custom RecordWriter to either: Dump to a file and use mongorestore Dump the output directly to MongoDB Optional: use renameCollection to mimic our MapReduce Mathias Stearn MongoDB
  • 27. What is MongoDB? What Makes Mongo Special? MapReduce Upcoming events NoSQL Live! from Boston (March 11) MongoDB Training in San Francisco (March 25) San Francisco MySQL Meetup (April 12) Mathias Stearn MongoDB
  • 28. What is MongoDB? What Makes Mongo Special? MapReduce Links (Try mongo in your browser) #mongodb on mongodb-user on google groups @mathias_mongo on twitter Mathias Stearn MongoDB