These are the slides I presented at the 2nd NHN Technology Conference.
References: LINE Storage: Storing billions of rows in Sharded-Redis and HBase per Month (http://tech.naver.jp/blog/?p=1420), posted in 2012.3.
1. HBase at LINE
~ How to grow our storage together with service ~
中村 俊介, Shunsuke Nakamura
(LINE, twitter, facebook: sunsuk7tp)
NHN Japan Corp.
2. Self-introduction
Shunsuke Nakamura (中村 俊介)
• 2011.10: joined the pre-merger company as a new graduate (NHN Japan from 2012.1)
• LINE server engineer, storage team
• Master of Science, Shudo Lab, Tokyo Institute of Technology
• Distributed processing, cloud storage, and NoSQL
• MyCassandra [CASSANDRA-2995]: a modular NoSQL store with a pluggable storage engine, based on Cassandra
• Hatena intern / infrastructure part-time staff
3. NHN/NAVER
• NHN = Next Human Network; NAVER = Navigate + er
• NAVER Korea: search portal site, roughly 70% search share in Korea
• Hangame: started as an in-house venture at Samsung
• NAVER Japan: in its third year; after the business integration, NAVER lives on as a service name (LINE, Matome, image search, NDrive)
• Group companies: livedoor, Data Hotel, NHN ST, JLISTING
• Korea headquarters: Green Factory
5. 8.17: 55 million users (25 million in Japan)
AppStore ranking: Top 1 in Japan, Taiwan, Thailand, Hong Kong, Saudi Arabia, Malaysia, Bahrain, Jordan, Qatar, Singapore, Indonesia, Kazakhstan, Kuwait, Israel, Macau, Ukraine, UAE, Switzerland, Australia, Turkey, Vietnam, Germany, and Russia
6. LINE Roadmap
• 2011.6: iPhone first release
• 2011.8: Android first release; WAP
• 2011.10: I joined the LINE team
• Stickers, VoIP
• Bots (news, auto-translation, public accounts, Gurin)
• PC (Win/Mac), multi-device; Sticker Shop
• LINE Card / Camera / Brush
• 2012.6: Windows Phone first release
• Timeline
• 2012.8: BlackBerry first release
• LINE platform
7. Target of LINE Storage
start:
1. Performing well (put < 1 ms, get < 5 ms)
2. A highly scalable, available, and eventually consistent storage system built around NoSQL
future:
3. Geographical distribution
[Chart: user distribution — Global 56.8%, Japan 43.2%]
8. LINE Storage and NoSQL
1. Performing well
2. A highly scalable, available, and eventually consistent storage system
3. Geographical distribution
20. Data and Scalability
• constant
  • DATA: async operations
  • SCALE: thousands to millions per queue
• linear
  • DATA: users' contacts and groups
  • SCALE: millions to billions
• exponential
  • DATA: messages and message inbox
  • SCALE: tens of billions and beyond
[Chart: row counts for constant vs. linear vs. exponential data, y-axis up to 10,000,000,000]
21. Data and Workload
• constant (queue)
  • FIFO
  • fast reads & writes
• linear (Zipfian curve)
  • zipf.
  • fast reads [w3~5 : r95]
• exponential (message timeline)
  • latest-biased
  • fast writes [w50 : r50]
29. Primary key for LINE
• Generated from a Long-type key: e.g. userId, messageId
• simple & random, for single lookups
• no need to preserve locality for range queries
• prefix(key) + key
• prefix(key): ALPHABETS[key % 26] + key % 1000
• implements o.a.h.hbase.util.RegionSplitter.SplitAlgorithm
• region splitting by prefix
[Diagram: example keys bucketed by prefix (2600, 2626, 2756, 2782 → HRegion "a"; 2601 → "b"; 2602 → "c"; …), with regions pre-split at a000, a250, a500, a750, b000, …]
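The prefix scheme above can be sketched roughly as follows. This is a minimal sketch, not LINE's actual code: the three-digit zero padding of the numeric part and the string concatenation are assumptions chosen so that split points like a000/a250/a500/a750 sort correctly.

```python
ALPHABETS = "abcdefghijklmnopqrstuvwxyz"

def prefix(key: int) -> str:
    """Prefix per the slide's formula: ALPHABETS[key % 26] + key % 1000.

    Zero-padding the numeric part to three digits is an assumption; it makes
    the pre-split boundaries (a000, a250, a500, ...) compare correctly as
    byte strings in HBase's lexicographic row-key order.
    """
    return ALPHABETS[key % 26] + format(key % 1000, "03d")

def row_key(key: int) -> str:
    # Final row key: prefix(key) + key
    return prefix(key) + str(key)

# Nearby Long keys land in different letter buckets, spreading writes
# across regions, yet the prefix is recomputable from the key alone,
# so a single lookup never needs a scan:
print(row_key(2600))  # a6002600 (region "a")
print(row_key(2601))  # b6012601 (region "b")
print(row_key(2626))  # a6262626 (region "a" again, different split range)
```

Because the prefix is a pure function of the key, clients can always reconstruct the full row key for a point get, which is why the slide notes that locality for range queries need not be preserved.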
30. Data stored in HBase
• User, Contact, Group
• linear scale
• mutable
• Message, Inbox
• exponential scale
• immutable
41. 1. HDFS NameNode (NN)
• HA frameworks for the HDFS NN (HDFS-1623)
• Backup NN (0.21)
• Avatar NN (Facebook)
• HA NN using Linux-HA
• active/passive configuration deploying two NNs (Cloudera)
42. HA NN using Linux-HA
• Redundancy via DRBD + Heartbeat
• DRBD: disk mirroring (RAID1-like)
• Heartbeat: network monitoring
• Pacemaker: resource management (registering the failover logic)
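The DRBD half of this setup might look like the resource definition below. This is only an illustrative sketch of DRBD 8.x syntax, not the deck's configuration: the resource name, hostnames, IP addresses, and device paths are all assumptions.

```
# drbd.conf sketch (hypothetical names; mirrors the NN metadata directory
# between the active and standby NameNode hosts, RAID1-like over the network)
resource nn-meta {
  protocol C;                 # synchronous replication: write completes on both nodes
  on nn-active {
    device    /dev/drbd0;     # block device the NN metadata filesystem sits on
    disk      /dev/sdb1;
    address   10.0.0.1:7789;
    meta-disk internal;
  }
  on nn-standby {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   10.0.0.2:7789;
    meta-disk internal;
  }
}
```

On failover, Heartbeat/Pacemaker would promote the standby's DRBD device to primary, mount it, and start the NameNode there, which is the active/passive pattern the slide describes.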