SlideShare a Scribd company logo
HadoopとMongoDBを活用したソーシャルアプリのログ解析
• 
      


      


      


      



• 
      


      


      
• 

      



• 

      


      


      




      
 
     


     




 
     


     


     
 
 


 


 
 


 


 
!

 


 
 
     


     




 
     


     
 
     


     




 
     


     


     


     
 
�
�   �
HadoopとMongoDBを活用したソーシャルアプリのログ解析
HadoopとMongoDBを活用したソーシャルアプリのログ解析
HadoopとMongoDBを活用したソーシャルアプリのログ解析
• 
     ‣ 

     ‣ 

     ‣ 

• 
     ‣ 

     ‣ 
• 
     ‣ 

     ‣ 

     ‣ 

     ‣ 

     ‣ 
• 
     ‣ 

     ‣ 

     ‣ 

     ‣ 

     ‣ 
• 
     ‣ 

     ‣ 

     ‣ 

     ‣ 

     ‣ 

     ‣ 
• 
def mapper(key, value):!
      for word in value.split(): yield word,1!
def reducer(key, values):!
      yield key,sum(values)!
if __name__ == "__main__":!
      import dumbo!
      dumbo.run(mapper, reducer)




dumbo start wordcount.py !
-hadoop /path/to/hadoop !
-input wc_input.txt -output wc_output
• 
      ‣ 

      ‣ 

      ‣ 

      ‣ 

python wordcount.py map < wc_input.txt | sort | !
     python wordcount.py red > wc_output.txt
HadoopとMongoDBを活用したソーシャルアプリのログ解析
• 


     ‣ 

     ‣ 

     ‣ 

     ‣ 

     ‣ 
• 
• 
 -----Change------!
 ActionLogger    a{ChangeP}          (Point,1371,1383)       !
 ActionLogger    a{ChangeP}          (Point,2373,2423)!

 ActionLogger    a{ChangeMedal}      (lucky_star,9,10)    !
 ActionLogger    a{ChangeMedal}      (lucky_sea_bream,0,1)!

 ActionLogger    a{ChangeG}                  !

 ActionLogger    a{ChangeSubG} (SubGold,13,16)               !

 ActionLogger    a{ChangeWakuwakuP}       (buy,0,30)!
 ActionLogger    a{ChangeWakuwakuP}       (by gacha,30,0) !

 ------Get------!
 ActionLogger     a{GetMaterial}    (syouhinnomoto,0,-1) !
 ActionLogger     a{GetMaterial}    usesyouhinnomoto !
 ActionLogger     a{GetMaterial}    (omotyanomotoPRO,1,6)!
 ActionLogger     a{GetMaterial}    (sui-tunomoto,5,4)!

 ActionLogger    a{GetInterior}      (bakery_counter,0,1)!

 ActionLogger    a{GetAvatarPart}      (190167,0,1)      !
 ActionLogger    a{GetAvatarPart}      (old_girl_09,0,1) !

 -----Trade-----!
 ActionLogger     a{Trade}              buy 3 itigoke-kis from gree.jp:xxxxx   !
• 
            
           ‣ 
           ‣ 
2010-07-26 00:00:02,446 INFO catalina-exec-483 ActionLogger –
userId a{Make} make item onsenmanjyuu!
2010-07-26 00:00:02,478 INFO catalina-exec-411 ActionLogger –
userId a{LifeCycle} Login



userId 2010-07-26 00:00:02,446 a{Make}        {onsenmanjyuu,1}!
userId 2010-07-26 00:00:02,478 a{LifeCycle}   {Login,1}!
userId 2010-07-26 00:00:02,478 a{GetMaterial} {omotyanomotoPRO,5}!
• 
      
     ‣ 
     ‣ 
     ‣ 
     ‣ 
• 


     • 

     • 
     • 

     • 
     • 

     • 
     • 

     • 
     • 
• 
     { !
        "_id" : "2010-06-27+xxxxx+a{ChangeP}",!
        "lastUpdate" : "2010-09-17",!
        "date" : "2010-06-27" !
        "userId" : “xxxxx",!
        "actionType" : "a{ChangeP}",!
        "actionDetail" : { "Point" : 600 },!
     }!
     { !
        "_id" : "2010-06-27+xxxxx+a{LifeCycle}", !
        "lastUpdate" : "2010-09-17",!
        "date" : "2010-06-27" !
        "userId" : ”xxxxx",!
        "actionType" : "a{LifeCycle}",!
        "actionDetail" : { ”Login" : 3 }!
     }!
• 
     { "_id" : "2010-08-31+group+a{PutOn}", !
       "date" : "2010-08-31", !
       "lastUpdate" : "2010-09-21",!
       "actionType" : "a{PutOn}",!
        "actionDetail" : { "a{PutOn}" : 52050 } !
     }!
     {...!
      "actionType" : "a{Make}",!
      "actionDetail" : { !
                         ”syurijyou” : 11,!
                         ”aisukuri-mu” : 378,!
                         ”kinnokarakuridokei” : 103,!
                         ”puramoderu” : 22,!
                         ”guremurinno_n” : 164,!
                         ”kyodaipenginno_n” : 76,!
                         ”patinko” : 67,!
                         “wakizasi” : 250,!
                         “dendendaiko” : 13651,!
                         ... (over 100 items)!
                        }!
     }!
HadoopとMongoDBを活用したソーシャルアプリのログ解析
HadoopとMongoDBを活用したソーシャルアプリのログ解析
• 
     ‣ 


     ‣ 


     ‣ 

     ‣ 

     ‣ 

     ‣ 
• 
     ‣ 

     ‣ 

     ‣ 


     ‣ 

     ‣ 

     ‣ 
• 
MySQL:   select * from things where x=3 and y="foo"!
MongoDB: db.things.find( { x : 3, y : "foo" } );!


MySQL: select z from things where x=3!
MongoDB: db.things.find( { x : 3 }, { z : 1 } );


db.collection.find({ "field" : { $gt: value } } ); !
//       : field > value !

db.collection.find({ "field" : { $lt: value } } ); !
//       : field < value !

db.collection.find({"field”: {$gt: value1, $lt: value2}});!
 // value1 <= field <= value2
MySQL:   select * from things where x in (b,a,c)!
MongoDB: db.collection.find( { "field" : { $in : array } } ); !

     db.things.find({j:{$in: [2,4,6]}});!



db.customers.find( { name : /acme.*corp/i } ); !


db.myCollection.find().sort( { ts : -1 } ); // ts             !



>   m = function() { emit(this.user_id, 1); } !
>   r = function(k,vals) { return 1; } !
>   res = db.events.mapreduce(m, r, { query : {type:'sale'} }); !
>   db[res.result].find().limit(2) !
{   "_id" : 8321073716060 , "value" : 1 } !
{   "_id" : 7921232311289 , "value" : 1 } !
• 
{ !
   "_id" : "2010-06-27+xxxxx+a{ChangeP}",!
   "lastUpdate" : "2010-09-17",!
   "date" : "2010-06-27" !
   "userId" : “xxxxx",!
   "actionType" : "a{ChangeP}",!
   "actionDetail" : { "Point" : 600 },!
}!
{ !
   "_id" : "2010-06-27+xxxxx+a{LifeCycle}", !
   "lastUpdate" : "2010-09-17",!
   "date" : "2010-06-27" !
   "userId" : ”xxxxx",!
   "actionType" : "a{LifeCycle}",!
   "actionDetail" : { ”Login" : 3 }!
}!
HadoopとMongoDBを活用したソーシャルアプリのログ解析
HadoopとMongoDBを活用したソーシャルアプリのログ解析
• 
     ‣ 

• 
     ‣ 

     ‣ 

     ‣ 

     ‣ 
• 
      




         • 
         • 
         • 


         • 
         • 
         • 


         • 
         • 
         • 
• 

{!
 "_id" : "2010-06-28+xxxx+Charge",!
 "lastUpdate" : "2010-09-20",!
 "userId" : ”xxxx",!
 "date" : "2010-06-28",!
 "actionType" : "Charge",!
 "totalCharge" : 1210,!
 "boughtItem" : { "          EX 5 " : 1,!
                  "           5 " : 1,!
                  "         5 " : 1,!
                  "          " : 1,!
                  "     " : 2 }!
}!
HadoopとMongoDBを活用したソーシャルアプリのログ解析
• 
     ‣ 

     ‣ 

• 
     ‣ 

     ‣ 

     ‣ 
• 
• 
      
• 
      
• 
     {!
          "_id" : "2010-07-11+xxxxx+Registration",!
          "lastUpdate" : "2010-09-25",!
          "actionType" : "Registration",!
          "userId" : ”xxxxx",!
          "date" : "2010-07-11",!
          "firstCharge" : "2010-07-12",!
          "lastCharge" : "2010-09-02",!
          "lastLogin" : "2010-09-02",!
          "firstChargeTerm" : 1,!
          "playTerm" : 50,!
          "totalMonthCharge" : 1000,!
          "totalMonthChargeDetail" : {!
              "1th" : 74.3!
              "2th" : 17.1,!
              "3th" : 8.6,!                             i.e.
              "4th" : 0,!
          },!
          "totalCumlativeCharge" : 10000,!
          "totalCumlativeChargeDetail" : {!
              "1th" : 2,!
              "2th" : 0.5,!
              "3th" : 0.2,!
              "4th" : 0,!
              "5th" : 0.1,!
              "6th" : 27.5,!
              "7th" : 1.2,!
              "8th" : 49!
              "9th" : 19.5,!                     2.7%
          }!
     }!
• 
topMonthCharge = function(n){!
 return db.user_registration.find({},{!
   totalMonthCharge:true,!
   totalMonthChargeDetail:true,!
   userId:true!
 }).sort({totalMonthCharge:-1}).limit(n);!
}!

> topMonthCharge(20)                                          !
{ !
   "_id" : "2010-07-10+9999+Registration",!   Top20
   "userId" : ”9999”,!
   "totalMonthCharge" : 10000,!
   "totalMonthChargeDetail" : { "5th" : 13.7, "4th" : 27.6,
"3th" : 21, "2th" : 16.2, "1th" : 21.5 }!
}!
…!
findUser = function(x){ !
 return db.user_charge.find({userId:x},{!
   userId:true,!
   totalCharge:true,!
   boughtItem:true}).sort({date:-1})!
}!
> findUserCharge("9999")!
{!
     "_id" : "2010-09-08+9223458+Charge",!
     "totalCharge" : 2000,!
     "userId" : ”9999",!
     "boughtItem" : {!                       Top
         "        110 " : 2!
     }!
}!
{!
     "_id" : "2010-09-07+9223458+Charge",!
     "totalCharge" : 5000,!
     "userId" : ”9999",!
     "boughtItem" : {!
         "        350 " : 1,!
         "        110 " : 2!
     }!
}!
…!
HadoopとMongoDBを活用したソーシャルアプリのログ解析
• 
• 
• 
• 
• 
• 
db.user_error!
 db.user_access!    (           )!   db.user_trace!
(from          )!                    (from       )!




                    db.user_attr!
                    (          )!




 db.user_status!                     db.user_charge!
(from Cassandra)!                     (from MySQL)!
 
• 
     ‣ 


     ‣ 


     ‣ 

     ‣ 
• 
     ‣ 


     ‣ 


     ‣ 



     ‣ 
HadoopとMongoDBを活用したソーシャルアプリのログ解析

More Related Content

HadoopとMongoDBを活用したソーシャルアプリのログ解析

  • 2. •          •       
  • 3. •    •         
  • 4.            
  • 5.
  • 9.          
  • 10.              
  • 11.
  • 12. � �
  • 16. •  ‣  ‣  ‣  •  ‣  ‣ 
  • 17. •  ‣  ‣  ‣  ‣  ‣ 
  • 18. •  ‣  ‣  ‣  ‣  ‣ 
  • 19. •  ‣  ‣  ‣  ‣  ‣  ‣ 
  • 20. •  def mapper(key, value):! for word in value.split(): yield word,1! def reducer(key, values):! yield key,sum(values)! if __name__ == "__main__":! import dumbo! dumbo.run(mapper, reducer) dumbo start wordcount.py ! -hadoop /path/to/hadoop ! -input wc_input.txt -output wc_output
  • 21. •  ‣  ‣  ‣  ‣  python wordcount.py map < wc_input.txt | sort | ! python wordcount.py red > wc_output.txt
  • 23. •  ‣  ‣  ‣  ‣  ‣ 
  • 25. •  -----Change------! ActionLogger a{ChangeP} (Point,1371,1383) ! ActionLogger a{ChangeP} (Point,2373,2423)! ActionLogger a{ChangeMedal} (lucky_star,9,10) ! ActionLogger a{ChangeMedal} (lucky_sea_bream,0,1)! ActionLogger a{ChangeG} ! ActionLogger a{ChangeSubG} (SubGold,13,16) ! ActionLogger a{ChangeWakuwakuP} (buy,0,30)! ActionLogger a{ChangeWakuwakuP} (by gacha,30,0) ! ------Get------! ActionLogger a{GetMaterial} (syouhinnomoto,0,-1) ! ActionLogger a{GetMaterial} usesyouhinnomoto ! ActionLogger a{GetMaterial} (omotyanomotoPRO,1,6)! ActionLogger a{GetMaterial} (sui-tunomoto,5,4)! ActionLogger a{GetInterior} (bakery_counter,0,1)! ActionLogger a{GetAvatarPart} (190167,0,1) ! ActionLogger a{GetAvatarPart} (old_girl_09,0,1) ! -----Trade-----! ActionLogger a{Trade} buy 3 itigoke-kis from gree.jp:xxxxx !
  • 26. •    ‣  ‣  2010-07-26 00:00:02,446 INFO catalina-exec-483 ActionLogger – userId a{Make} make item onsenmanjyuu! 2010-07-26 00:00:02,478 INFO catalina-exec-411 ActionLogger – userId a{LifeCycle} Login userId 2010-07-26 00:00:02,446 a{Make} {onsenmanjyuu,1}! userId 2010-07-26 00:00:02,478 a{LifeCycle} {Login,1}! userId 2010-07-26 00:00:02,478 a{GetMaterial} {omotyanomotoPRO,5}!
  • 27. •    ‣  ‣  ‣  ‣ 
  • 28. •  •  •  •  •  •  •  •  •  • 
  • 29. •  { ! "_id" : "2010-06-27+xxxxx+a{ChangeP}",! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : “xxxxx",! "actionType" : "a{ChangeP}",! "actionDetail" : { "Point" : 600 },! }! { ! "_id" : "2010-06-27+xxxxx+a{LifeCycle}", ! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : ”xxxxx",! "actionType" : "a{LifeCycle}",! "actionDetail" : { ”Login" : 3 }! }!
  • 30. •  { "_id" : "2010-08-31+group+a{PutOn}", ! "date" : "2010-08-31", ! "lastUpdate" : "2010-09-21",! "actionType" : "a{PutOn}",! "actionDetail" : { "a{PutOn}" : 52050 } ! }! {...! "actionType" : "a{Make}",! "actionDetail" : { ! ”syurijyou” : 11,! ”aisukuri-mu” : 378,! ”kinnokarakuridokei” : 103,! ”puramoderu” : 22,! ”guremurinno_n” : 164,! ”kyodaipenginno_n” : 76,! ”patinko” : 67,! “wakizasi” : 250,! “dendendaiko” : 13651,! ... (over 100 items)! }! }!
  • 33. •  ‣  ‣  ‣  ‣  ‣  ‣ 
  • 34. •  ‣  ‣  ‣  ‣  ‣  ‣ 
  • 35. •  MySQL: select * from things where x=3 and y="foo"! MongoDB: db.things.find( { x : 3, y : "foo" } );! MySQL: select z from things where x=3! MongoDB: db.things.find( { x : 3 }, { z : 1 } ); db.collection.find({ "field" : { $gt: value } } ); ! // : field > value ! db.collection.find({ "field" : { $lt: value } } ); ! // : field < value ! db.collection.find({"field”: {$gt: value1, $lt: value2}});! // value1 <= field <= value2
  • 36. MySQL: select * from things where x in (b,a,c)! MongoDB: db.collection.find( { "field" : { $in : array } } ); ! db.things.find({j:{$in: [2,4,6]}});! db.customers.find( { name : /acme.*corp/i } ); ! db.myCollection.find().sort( { ts : -1 } ); // ts ! > m = function() { emit(this.user_id, 1); } ! > r = function(k,vals) { return 1; } ! > res = db.events.mapreduce(m, r, { query : {type:'sale'} }); ! > db[res.result].find().limit(2) ! { "_id" : 8321073716060 , "value" : 1 } ! { "_id" : 7921232311289 , "value" : 1 } !
  • 37. •  { ! "_id" : "2010-06-27+xxxxx+a{ChangeP}",! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : “xxxxx",! "actionType" : "a{ChangeP}",! "actionDetail" : { "Point" : 600 },! }! { ! "_id" : "2010-06-27+xxxxx+a{LifeCycle}", ! "lastUpdate" : "2010-09-17",! "date" : "2010-06-27" ! "userId" : ”xxxxx",! "actionType" : "a{LifeCycle}",! "actionDetail" : { ”Login" : 3 }! }!
  • 40. •  ‣  •  ‣  ‣  ‣  ‣ 
  • 41. •    •  •  •  •  •  •  •  •  • 
  • 42. •  {! "_id" : "2010-06-28+xxxx+Charge",! "lastUpdate" : "2010-09-20",! "userId" : ”xxxx",! "date" : "2010-06-28",! "actionType" : "Charge",! "totalCharge" : 1210,! "boughtItem" : { " EX 5 " : 1,! " 5 " : 1,! " 5 " : 1,! " " : 1,! " " : 2 }! }!
  • 44. •  ‣  ‣  •  ‣  ‣  ‣ 
  • 46. • 
  • 47. • 
  • 48. •  {! "_id" : "2010-07-11+xxxxx+Registration",! "lastUpdate" : "2010-09-25",! "actionType" : "Registration",! "userId" : ”xxxxx",! "date" : "2010-07-11",! "firstCharge" : "2010-07-12",! "lastCharge" : "2010-09-02",! "lastLogin" : "2010-09-02",! "firstChargeTerm" : 1,! "playTerm" : 50,! "totalMonthCharge" : 1000,! "totalMonthChargeDetail" : {! "1th" : 74.3! "2th" : 17.1,! "3th" : 8.6,! i.e. "4th" : 0,! },! "totalCumlativeCharge" : 10000,! "totalCumlativeChargeDetail" : {! "1th" : 2,! "2th" : 0.5,! "3th" : 0.2,! "4th" : 0,! "5th" : 0.1,! "6th" : 27.5,! "7th" : 1.2,! "8th" : 49! "9th" : 19.5,! 2.7% }! }!
  • 49. •  topMonthCharge = function(n){! return db.user_registration.find({},{! totalMonthCharge:true,! totalMonthChargeDetail:true,! userId:true! }).sort({totalMonthCharge:-1}).limit(n);! }! > topMonthCharge(20) ! { ! "_id" : "2010-07-10+9999+Registration",! Top20 "userId" : ”9999”,! "totalMonthCharge" : 10000,! "totalMonthChargeDetail" : { "5th" : 13.7, "4th" : 27.6, "3th" : 21, "2th" : 16.2, "1th" : 21.5 }! }! …!
  • 50. findUser = function(x){ ! return db.user_charge.find({userId:x},{! userId:true,! totalCharge:true,! boughtItem:true}).sort({date:-1})! }! > findUserCharge("9999")! {! "_id" : "2010-09-08+9223458+Charge",! "totalCharge" : 2000,! "userId" : ”9999",! "boughtItem" : {! Top " 110 " : 2! }! }! {! "_id" : "2010-09-07+9223458+Charge",! "totalCharge" : 5000,! "userId" : ”9999",! "boughtItem" : {! " 350 " : 1,! " 110 " : 2! }! }! …!
  • 53. db.user_error! db.user_access! ( )! db.user_trace! (from )! (from )! db.user_attr! ( )! db.user_status! db.user_charge! (from Cassandra)! (from MySQL)!
  • 54.
  • 55. •  ‣  ‣  ‣  ‣ 
  • 56. •  ‣  ‣  ‣  ‣