Porting the "Collective Intelligence" samples to Ruby
Programming Collective Intelligence: Building Smart Web 2.0 Applications
- Author: Toby Segaran
- Publisher: O'Reilly Media
- Release date: 2007/08/26
- Format: Paperback
It's a bit long, but first, here is recommendation.rb.
# Test data: {critic => {movie => score, ...}, ...}
def critics
  {
    'Lisa Rose' => {
      'Lady in the Water' => 2.5, 'Snake on the Plane' => 3.5,
      'Just My Luck' => 3.0, 'Superman Returns' => 3.5,
      'You, Me and Dupree' => 2.5, 'The Night Listener' => 3.0
    },
    'Gene Seymour' => {
      'Lady in the Water' => 3.0, 'Snake on the Plane' => 3.5,
      'Just My Luck' => 1.5, 'Superman Returns' => 5.0,
      'The Night Listener' => 3.0, 'You, Me and Dupree' => 3.5
    },
    'Michael Phillips' => {
      'Lady in the Water' => 2.5, 'Snake on the Plane' => 3.0,
      'Superman Returns' => 3.5, 'The Night Listener' => 4.0
    },
    'Claudia Puig' => {
      'Snake on the Plane' => 3.5, 'Just My Luck' => 3.0,
      'The Night Listener' => 4.5, 'Superman Returns' => 4.0,
      'You, Me and Dupree' => 2.5
    },
    'Mick LaSalle' => {
      'Lady in the Water' => 3.0, 'Snake on the Plane' => 4.0,
      'Just My Luck' => 2.0, 'Superman Returns' => 3.0,
      'The Night Listener' => 3.0, 'You, Me and Dupree' => 2.0
    },
    'Jack Matthews' => {
      'Lady in the Water' => 3.0, 'Snake on the Plane' => 4.0,
      'The Night Listener' => 3.0, 'Superman Returns' => 5.0,
      'You, Me and Dupree' => 3.5
    },
    'Toby' => {
      'Snake on the Plane' => 4.5, 'You, Me and Dupree' => 1.0,
      'Superman Returns' => 4.0
    }
  }
end

# Similarity based on the distance between the two critics' ratings
def sim_distance(prefs, person1, person2)
  # Find the items both critics have rated
  shared = shared_items_a(prefs, person1, person2)

  # No shared items means a similarity of 0
  return 0 if shared.empty?

  # Sum the squared differences over the shared items
  sum_of_squares = shared.inject(0) { |result, item|
    result + (prefs[person1][item] - prefs[person2][item])**2
  }

  # 1.0 keeps the division in floating point even for integer ratings
  1.0 / (1 + sum_of_squares)
end

# Similarity based on how closely the two critics' ratings fall on a straight line
def sim_pearson(prefs, person1, person2)
  # Find the items both critics have rated
  shared = shared_items_a(prefs, person1, person2)

  # No shared items means a similarity of 0
  n = shared.size
  return 0 if n == 0

  # Sum of all ratings over the shared items
  sum1 = shared.inject(0) { |result, si| result + prefs[person1][si] }
  sum2 = shared.inject(0) { |result, si| result + prefs[person2][si] }

  # Sum of the squared ratings over the shared items
  sum1_sq = shared.inject(0) { |result, si| result + prefs[person1][si]**2 }
  sum2_sq = shared.inject(0) { |result, si| result + prefs[person2][si]**2 }

  # Sum of the products
  sum_products = shared.inject(0) { |result, si|
    result + prefs[person1][si] * prefs[person2][si]
  }

  # Compute the Pearson score (to_f avoids integer division)
  num = sum_products - (sum1 * sum2 / n.to_f)
  den = Math.sqrt((sum1_sq - sum1**2 / n.to_f) * (sum2_sq - sum2**2 / n.to_f))
  return 0 if den == 0

  num / den
end

# Return the critics whose tastes are most similar, best match first.
# The number of results and the similarity function can be specified.
def top_matches(prefs, person, n=5, similarity=:sim_pearson)
  scores = []
  prefs.each_key do |other|
    # Score every critic except the person themselves
    next if other == person
    scores << [__send__(similarity, prefs, person, other), other]
  end
  scores.sort.reverse[0, n]
end

# Compute recommended items for a given critic
def get_recommendations(prefs, person, similarity=:sim_pearson)
  totals_h = Hash.new(0)
  sim_sums_h = Hash.new(0)

  prefs.each do |other, ratings|
    next if other == person
    sim = __send__(similarity, prefs, person, other)
    next if sim <= 0

    ratings.each do |item, score|
      # Only consider items the person has not rated (or rated 0)
      next if prefs[person].key?(item) && prefs[person][item] != 0
      # Similarity * score
      totals_h[item] += score * sim
      # Sum of similarities
      sim_sums_h[item] += sim
    end
  end

  # Build the normalized list
  rankings = totals_h.map { |item, total| [total / sim_sums_h[item], item] }
  rankings.sort.reverse
end

# Convert a hash of the form {'name1'=>{item1=>score1,item2=>score2..}...}
# into a hash of the form {'item1'=>{name1=>score1,name2=>score2..}...}
def transform_prefs(prefs)
  result = {}
  prefs.each do |person, score_h|
    score_h.each do |item, score|
      result[item] ||= {}
      result[item][person] = score
    end
  end
  result
end

# Get the shared items
def shared_items(prefs, person1, person2)
  # Find the items both critics have rated
  shared_items_h = {}
  prefs[person1].each_key do |k|
    shared_items_h[k] = 1 if prefs[person2].include?(k)
  end
  shared_items_h
end

# Get the shared items, take 2
# Unlike shared_items, this returns an array of the shared keys
def shared_items_a(prefs, person1, person2)
  prefs[person1].keys & prefs[person2].keys
end

if $0 == __FILE__
  p critics
  p sim_distance(critics, 'Lisa Rose', 'Gene Seymour')
  p sim_pearson(critics, 'Lisa Rose', 'Gene Seymour')
  p top_matches(critics, 'Toby')
  p get_recommendations(critics, 'Toby')
  movies = transform_prefs(critics)
  p movies
  p top_matches(movies, 'Superman Returns')
end
Usage looks like this. First, critics is rating data for testing, in the form {name=>{item1=>score1,..},name2=>{item2=>score2,..}...}. sim_distance and sim_pearson each compute how similar two critics are, using different algorithms (a distance-based score and the Pearson correlation, respectively).
irb(main):002:0> sim_distance(critics, 'Lisa Rose', 'Gene Seymour')
=> 0.148148148148148
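To see where 0.148 comes from, the calculation can be traced by hand. The following is a minimal sketch that hard-codes the two critics' ratings from critics. LISA, GENE, and euclidean_score are names made up for this illustration and are not part of recommendation.rb.

```ruby
# Ratings copied from the critics data for the two critics being compared.
LISA = {
  'Lady in the Water' => 2.5, 'Snake on the Plane' => 3.5,
  'Just My Luck' => 3.0, 'Superman Returns' => 3.5,
  'You, Me and Dupree' => 2.5, 'The Night Listener' => 3.0
}
GENE = {
  'Lady in the Water' => 3.0, 'Snake on the Plane' => 3.5,
  'Just My Luck' => 1.5, 'Superman Returns' => 5.0,
  'The Night Listener' => 3.0, 'You, Me and Dupree' => 3.5
}

# Hypothetical helper using the same formula as sim_distance:
# 1 / (1 + sum of squared rating differences over the shared items).
def euclidean_score(a, b)
  shared = a.keys & b.keys
  return 0.0 if shared.empty?

  sum_of_squares = shared.sum { |item| (a[item] - b[item])**2 }
  1.0 / (1 + sum_of_squares)
end

# Squared differences: 0.25 + 0 + 2.25 + 2.25 + 1.0 + 0 = 5.75
puts euclidean_score(LISA, GENE)  # 1 / 6.75, about 0.148
```

Because the sum of squares is never negative, the score always falls between 0 and 1, and it equals 1 only when every shared rating is identical.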
The score is at its maximum, 1, when the two critics' ratings are most similar. top_matches is a function that finds the other critics most similar to a given critic.
irb(main):003:0> top_matches(critics, 'Toby')
=> [[0.99124070716193, "Lisa Rose"], [0.924473451641905, "Mick LaSalle"], [0.893405147441565, "Claudia Puig"], [0.66284898035987, "Jack Matthews"], [0.381246425831512, "Gene Seymour"]]
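The top score in that output can be reproduced on its own. This is a rough sketch of the Pearson calculation for Toby and Lisa Rose, with the ratings copied from critics. TOBY_RATINGS, LISA_RATINGS, and pearson_score are hypothetical names used only here.

```ruby
# Ratings copied from the critics data.
TOBY_RATINGS = {
  'Snake on the Plane' => 4.5, 'You, Me and Dupree' => 1.0,
  'Superman Returns' => 4.0
}
LISA_RATINGS = {
  'Lady in the Water' => 2.5, 'Snake on the Plane' => 3.5,
  'Just My Luck' => 3.0, 'Superman Returns' => 3.5,
  'You, Me and Dupree' => 2.5, 'The Night Listener' => 3.0
}

# Hypothetical helper: Pearson correlation over the items both hashes share,
# following the same sums that sim_pearson accumulates.
def pearson_score(a, b)
  shared = a.keys & b.keys
  n = shared.size
  return 0.0 if n == 0

  xs = shared.map { |item| a[item].to_f }
  ys = shared.map { |item| b[item].to_f }

  sum_x = xs.sum
  sum_y = ys.sum
  sum_x_sq = xs.sum { |x| x**2 }
  sum_y_sq = ys.sum { |y| y**2 }
  sum_xy = xs.zip(ys).sum { |x, y| x * y }

  num = sum_xy - sum_x * sum_y / n
  den = Math.sqrt((sum_x_sq - sum_x**2 / n) * (sum_y_sq - sum_y**2 / n))
  return 0.0 if den == 0

  num / den
end

# Only the three movies Toby rated are shared, so n = 3 here.
puts pearson_score(TOBY_RATINGS, LISA_RATINGS)  # about 0.991
```

Unlike the distance-based score, the Pearson score ranges from -1 to 1, so a critic who consistently rates lower or higher than Toby can still come out as a close match.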
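The top_matches result shows that Lisa Rose is the critic most similar to Toby. get_recommendations goes through the ratings of the critics who are similar to the given critic and scores the items that critic has not seen yet, weighting each rating by similarity. In other words, it is a "recommendation" feature.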
irb(main):004:0> get_recommendations(critics, 'Toby')
=> [[3.3477895267131, "The Night Listener"], [2.83254991826416, "Lady in the Water"], [2.53098070376556, "Just My Luck"]]
So "The Night Listener" turns out to be the top recommendation for Toby.
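The 3.35 for "The Night Listener" is a similarity-weighted average, and it can be checked with a short sketch. The similarity values below are copied from the top_matches output and the ratings from critics. SIMILARITY_TO_TOBY, NIGHT_LISTENER_RATINGS, and weighted_score are names invented for this example. Michael Phillips also rated the movie, but his Pearson score against Toby comes out negative, so get_recommendations skips him and he is left out here as well.

```ruby
# Pearson similarity to Toby, copied from the top_matches output.
SIMILARITY_TO_TOBY = {
  'Lisa Rose' => 0.99124070716193,
  'Mick LaSalle' => 0.924473451641905,
  'Claudia Puig' => 0.893405147441565,
  'Jack Matthews' => 0.66284898035987,
  'Gene Seymour' => 0.381246425831512
}

# Each critic's rating of 'The Night Listener', copied from critics.
NIGHT_LISTENER_RATINGS = {
  'Lisa Rose' => 3.0,
  'Mick LaSalle' => 3.0,
  'Claudia Puig' => 4.5,
  'Jack Matthews' => 3.0,
  'Gene Seymour' => 3.0
}

# Hypothetical helper: sum(similarity * rating) / sum(similarity),
# the same normalization get_recommendations applies per item.
def weighted_score(similarities, ratings)
  raters = similarities.keys & ratings.keys
  total = raters.sum { |name| similarities[name] * ratings[name] }
  sim_sum = raters.sum { |name| similarities[name] }
  return 0.0 if sim_sum == 0

  total / sim_sum
end

puts weighted_score(SIMILARITY_TO_TOBY, NIGHT_LISTENER_RATINGS)  # about 3.35
```

Dividing by the sum of similarities keeps a movie from ranking higher just because more critics happened to rate it.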