2010-02-26       4   Erlang
1/2
    No Erlang
 


 
                 3

 
         Chord
         VectorClocks
         …
2/2
 
      


 
                         

                          
               
          
       Chord
          EpiChord
              
   BASE            Sinfonia:
                   VectorClocks                         
          
       Bloom       
   ZDD
                                         EpiChord
Lookup
[Figure: node 1 holds keys 0-10, node 2 holds keys 11-20, node 3 holds keys 21-30; which node holds key 5?]
 
      
      
 
                   !
         ?
B. Leong et al. http://www.comp.nus.edu.sg/~bleong/slides/icon-epichord-slides.pdf
     


Address Space
[Figure: address space 0-63]
     


Mapping Keys to Nodes
  Keys     Node
  1-6      6
  :        :
  52-57    57
  :        :
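The mapping above can be sketched as a successor lookup on the 0-63 address space: a key belongs to the first node whose ID is greater than or equal to the key, wrapping around at the top. Only nodes 6 and 57 appear on the slide; nodes 0 and 51 and the name `successor` are assumptions, chosen so that the ranges 1-6 and 52-57 come out as shown.

```python
from bisect import bisect_left

RING_SIZE = 64                   # address space 0-63
NODES = sorted([0, 6, 51, 57])   # 6 and 57 are from the slide; 0 and 51 are assumed


def successor(key: int) -> int:
    """Return the node responsible for key: the first node ID >= key, with wraparound."""
    key %= RING_SIZE
    i = bisect_left(NODES, key)
    # If key is larger than every node ID, wrap around to the smallest node.
    return NODES[i % len(NODES)]


if __name__ == "__main__":
    for key in (1, 6, 52, 57, 58):
        print(key, "->", successor(key))
```

Keeping the node list sorted makes each lookup a binary search, but it assumes every node knows the full membership, which is the assumption that Chord's routing table relaxes.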
    


Chord
  Routing table of node 35:
  Keys     Node
  36-40    40
  41-47    47
  48-50    49
  51-2     57
  3-30     6




  • O(log N) routing table entries
  • O(log N) lookup hops
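Read as data, node 35's table is a list of key ranges, one of which (51-2) wraps past 63 back to 0. A minimal sketch of picking the next hop from that table, assuming the ranges are inclusive and that keys 31-35, which the table does not list, are handled by node 35 itself; `ROUTES`, `in_range`, and `next_hop` are hypothetical names.

```python
SELF = 35

# (first key, last key, node) rows copied from the slide; ranges are assumed inclusive.
ROUTES = [
    (36, 40, 40),
    (41, 47, 47),
    (48, 50, 49),
    (51, 2, 57),   # wraps around the top of the 0-63 address space
    (3, 30, 6),
]


def in_range(key: int, first: int, last: int) -> bool:
    """True if key lies in [first, last] on the ring, allowing wraparound."""
    if first <= last:
        return first <= key <= last
    return key >= first or key <= last


def next_hop(key: int) -> int:
    """Return the node to forward a lookup for key to."""
    for first, last, node in ROUTES:
        if in_range(key, first, last):
            return node
    return SELF  # assumption: keys not covered by the table (31-35) are local


if __name__ == "__main__":
    for key in (45, 63, 0, 10, 33):
        print(key, "->", next_hop(key))
```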
     


EpiChord Lookup Algorithm
[Figures: step-by-step walkthrough of the lookup over eight slides]
     EpiChord


Division of Address Space


                   ?
J. Xu et al., On the Fundamental Tradeoffs between Routing Table Size and Network Diameter in Peer-to-Peer Networks
    


Lookup
[Figure: labels "(Dynamo?)" and "EpiChord?"]
 
 
         ACID:
                           1.  SELECT …
                           2.  INSERT …
                           3.     :
          DB

 
         1                               ?
 
                      !
                 ?
S. Shinohara http://www.slideshare.net/shino/conflict-resolution-in-kai-presentation
       VectorClocks


 “happens-before” (->)




BASE                              VectorClocks
       VectorClocks


 “concurrent” (||)




BASE                              VectorClocks
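The two relations on these slides, "happens-before" (->) and "concurrent" (||), can be decided by comparing two clocks entry by entry. A minimal sketch, assuming a vector clock is a dict from node name to counter and that a missing entry counts as 0; the function name `compare` and the node names `a` and `b` are hypothetical and not taken from the cited presentation.

```python
def compare(v1: dict, v2: dict) -> str:
    """Classify the relation between two vector clocks.

    Returns "before" if v1 -> v2, "after" if v2 -> v1,
    "concurrent" if v1 || v2, and "equal" otherwise.
    """
    nodes = set(v1) | set(v2)
    some_less = any(v1.get(n, 0) < v2.get(n, 0) for n in nodes)
    some_greater = any(v1.get(n, 0) > v2.get(n, 0) for n in nodes)
    if some_less and some_greater:
        return "concurrent"   # neither clock dominates the other
    if some_less:
        return "before"       # v1 happens-before v2
    if some_greater:
        return "after"        # v2 happens-before v1
    return "equal"


if __name__ == "__main__":
    print(compare({"a": 1}, {"a": 2}))                    # before
    print(compare({"a": 2, "b": 1}, {"a": 1, "b": 2}))    # concurrent
```

When one version happens-before the other, the older one can be discarded; concurrent versions are the case that needs conflict resolution.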
T.Yamamuro
     


(PostgreSQL’s) 2-phase commit
     Sinfonia


Optimizing 2-phase commit




 
                                      ?
      Sinfonia


mini-transaction primitives
     Sinfonia


Sinfonia’s 2-phase commit
                 =
     Sinfonia


Sinfonia 2-phase commit 5
              Sinfonia


 Evaluation: Scalability

                                    NFS,   




250                             
          2               100
VectorClocks




                Sinfonia
                        2-phase commit

                                              
        BASE
         ACID
 
 


            {apple, banana} + {orange} = {apple, banana, orange}
            Is apple a member of {apple, banana, orange}?
                             
                                
 
      
      
 
                        !
         ?                      Q.          ?
                                 A. BigTable
                                        Bloom
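Bloom filter
  BF: bit array of length m
    0 0 0 0 0 0 0 0
  ZF: array of length m with every bit 0
  h_i: hash functions, 0 < i ≤ k, each with range [0, m-1]
  Construction:
    BF = ZF
    BF[ h_i(x) ] = 1
  After adding x (bits h_i(x)):
    0 1 0 0 1 0 0 0
  After adding y (bits h_i(y)):
    0 1 0 0 1 1 0 0
  Bloom filter with m=8, k=2 after adding x and y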
Bloom filter
  membership query
    F = ZF
    F[ h_i(x) ] = 1
    BF & F == F
  Queries against BF = 0 1 0 0 1 1 0 0:
    h_i(x): included
    h_i(z): not included
    h_i(w): included (false positive!)
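The construction and query rules fit in a few lines. This is a minimal sketch, assuming the k hash functions are derived from SHA-256 with the index i as a salt (the slides do not say which hash functions are used); the class and method names are hypothetical. The filter is held as a Python integer used as an m-bit mask, so the membership test is literally `BF & F == F`.

```python
import hashlib


class BloomFilter:
    def __init__(self, m: int, k: int):
        self.m = m        # number of bits
        self.k = k        # number of hash functions
        self.bits = 0     # BF = ZF: every bit starts at 0

    def _mask(self, item: str) -> int:
        """Build F: the m-bit mask with bit h_i(item) set for i = 1..k."""
        f = 0
        for i in range(1, self.k + 1):
            # Assumed hash family: SHA-256 of "i:item", reduced to [0, m-1].
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            f |= 1 << (int.from_bytes(digest, "big") % self.m)
        return f

    def add(self, item: str) -> None:
        """Set BF[h_i(item)] = 1 for every i."""
        self.bits |= self._mask(item)

    def __contains__(self, item: str) -> bool:
        """Membership query: included iff BF & F == F."""
        f = self._mask(item)
        return (self.bits & f) == f


if __name__ == "__main__":
    bf = BloomFilter(m=256, k=4)
    bf.add("apple")
    bf.add("banana")
    print("apple" in bf, "orange" in bf)
```

A `True` answer may be a false positive, as with w in the query example; a `False` answer is always correct.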
Bloom filter
  Insertion and query take O(k) time, independent of n
      
Bloom filter
  OR, AND (for filters with the same m and the same h_i)
       0 1 0 0 1 1 0 0
    OR 0 0 0 1 1 0 0 1
     = 0 1 0 1 1 1 0 1
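The OR example can be checked directly. A sketch assuming each filter is held as an integer bit mask and that both filters share m and the same h_i; the function names are hypothetical.

```python
M = 8  # filter length used in the slide's example


def bf_union(a: int, b: int) -> int:
    """Bitwise OR: the filter for the union of the two sets."""
    return a | b


def bf_intersection(a: int, b: int) -> int:
    """Bitwise AND: a filter that covers the intersection of the two sets."""
    return a & b


def show(bits: int) -> str:
    """Render a filter as M binary digits."""
    return format(bits, f"0{M}b")


if __name__ == "__main__":
    a = 0b01001100
    b = 0b00011001
    print(show(bf_union(a, b)))          # 01011101, as on the slide
    print(show(bf_intersection(a, b)))   # 00001000
```

The OR result is exactly the filter that would be built from the union of the two sets. The AND result still contains every element of the intersection, but it can have extra bits set compared with a filter built from the intersection alone, so its false positive rate can be higher.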
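Bloom filter: false positive
  p: probability that a given bit is still 0 after n elements are added
    $p = \left(1 - \frac{1}{m}\right)^{kn} \approx \exp\left(-\frac{kn}{m}\right)$
  f: false positive rate
    $f = (1 - p)^k$
  (m = 256, n = 32)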
Bloom filter: false positive
  The k that minimizes f:
    $k = \ln 2 \cdot \frac{m}{n}$
  (k must be an integer, so take the floor)
  The minimum f:
    $f \approx 0.6185^{m/n}$
  [Figure: f plotted against k]
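The formulas for p, f, and the optimal k can be evaluated numerically. A small sketch using the m = 256, n = 32 setting that appears on the slides; the function names are hypothetical.

```python
import math


def zero_bit_probability(m: int, n: int, k: int) -> float:
    """p = (1 - 1/m)^(kn): probability that a given bit is still 0."""
    return (1 - 1 / m) ** (k * n)


def false_positive_rate(m: int, n: int, k: int) -> float:
    """f = (1 - p)^k: probability that all k probed bits are 1."""
    return (1 - zero_bit_probability(m, n, k)) ** k


def optimal_k(m: int, n: int) -> float:
    """k = ln 2 * m / n, before rounding to an integer."""
    return math.log(2) * m / n


if __name__ == "__main__":
    m, n = 256, 32
    print("optimal k (real-valued):", optimal_k(m, n))
    for k in range(1, 11):
        print(k, false_positive_rate(m, n, k))
    print("0.6185^(m/n):", 0.6185 ** (m / n))
```

Printing f for k = 1 to 10 shows the curve falling steeply at first and flattening near the real-valued optimum, so checking the integers on either side of it is cheap and worthwhile.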
Bloom filter
  Given n and f, the required m (with the optimal k):
    $m = -1.44\, n \log_2 f$
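Bloom filter
  b: bits per element (with the optimal k)
    $b = -1.44 \log_2 f$
  Lower bound:
    $b = -\log_2 f$
  A Bloom filter uses 1.44 times the lower bound.
  Bloom filter: b = 9.6 bits per element for f = 1%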
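The sizing rule can be turned into a small calculator. A sketch with hypothetical function names; it uses the exact constant 1/ln 2 ≈ 1.4427 where the slides round to 1.44, and it assumes the optimal k.

```python
import math


def bits_per_element(f: float) -> float:
    """b = -(1 / ln 2) * log2(f), i.e. about -1.44 * log2(f)."""
    return -math.log2(f) / math.log(2)


def lower_bound_bits(f: float) -> float:
    """Lower bound on bits per element: -log2(f)."""
    return -math.log2(f)


def filter_size(n: int, f: float) -> int:
    """Number of bits m needed for n elements at false positive rate f."""
    return math.ceil(n * bits_per_element(f))


if __name__ == "__main__":
    f = 0.01
    print(round(bits_per_element(f), 1))   # 9.6 bits per element at 1%
    print(round(lower_bound_bits(f), 2))
    print(filter_size(1000, f))
```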
Bloom filter
  With the optimal k, p is
    $p = \frac{1}{2}$
  so each bit is 0 or 1 with equal probability, e.g.
    0 1 0 0 1 1 0 1
  [mitzenmacher02]
          
S. Minato http://www-alg.ist.hokudai.ac.jp/~minato/alg2009-j.html
      ZDD
BDD (Binary Decision Diagram)


                          {abc, ab, ac, b, } 
           trie                  



 0 (false)
   1 (true)
BDD
[Figures: BDD diagrams continued over several slides]
BDD
ZDD: Zero-suppressed BDD
ZDD
[Figures: ZDD diagrams continued over several slides]
 
      


 
         EpiChord
              B. Leong et al., “EpiChord: Parallelizing the Chord lookup algorithm with reactive routing state management,” Comput. Commun., vol. 29, no. 9, pp. 1243-1259, 2006.
         Sinfonia
              M.K. Aguilera et al., “Sinfonia: a new paradigm for building scalable distributed systems,” Proc. of SOSP’07, pp. 159-174, 2007.
         ZDD
              S. Minato, “Zero-suppressed BDDs for set manipulation in combinatorial problems,” Proc. of DAC’93, pp. 272-277, 1993.

 

Algorithms That Might Be Usable for Distributed Storage