The Anatomy of a Large-Scale Social Search Engine
Preface
A company called Aardvark has built a search engine that, unlike Google, does not choose which document is the right one but instead chooses who is the right person to ask, and it has published the architecture in a paper. The company was founded by former Google employees and was recently acquired by Google. Today I have put together a summary of that paper.
Title and Authors
The title echoes "The Anatomy of a Large-Scale Hypertextual Web Search Engine", published in 1998 by Google founders Larry Page and Sergey Brin. The authors of this paper are Damon Horowitz of Aardvark and Sepandar D. Kamvar of Stanford Univ. Below, each heading corresponds to a section of the paper and each sub-heading to a subsection.
ABSTRACT
Aardvark is both the name of the company and the name of the Social Search Engine it built. Users ask questions by IM, email, web input, text message, or voice. Aardvark then routes each question to the person in the asker's extended social network who it judges most likely to answer it well.
INTRODUCTION
The Library and the Village
Traditionally, the theoretical framework of search has rested on the question of how best to use a library to obtain information. Google itself grew out of the "Stanford Digital Library project". People, however, have long used another way of obtaining information, which Aardvark calls the village paradigm. In a village, information is obtained by asking, and it lives by being passed from person to person. To find the answer to a question you have to find not the right document but the person who has the information.
Aardvark
Aardvark is a social search engine based on the village paradigm.
OVERVIEW
Main Components
- Crawler and Indexer: finds information sources and labels them. Here an information source is a person, not a document.
- Query Analyzer: understands what information the user needs.
- Ranking Function: selects the appropriate information sources.
- UI: presents information in a conversational way that people find easy to accept.
The Initiation of a User
When a user first accesses Aardvark, Aardvark runs a set of indexing steps to determine which questions are appropriate to route to that user. Because this depends on the user's extended social network, Aardvark indexes the user's friends and examines the relationships between the user and those friends. The aim is not simply to build a social network but to let users make use of their extended social networks. In Aardvark, users are connected through a number of groups that reflect their real-world relationships. These groups may be imported automatically from social networks, and users can also edit them.
Aardvark also assesses, from several angles, the level of each user's knowledge and experience per topic, and indexes it in the Forward Index. For each userId the Forward Index stores a list of per-topic levels, together with the questions the user has answered and the quality of those answers. From the Forward Index Aardvark builds an Inverted Index, which stores, for any topicId, the list of users judged to have expertise in that topic along with each user's level, answer quality, and time taken to answer.
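The relationship between the two indexes can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the dictionary layout, the names `forward_index` and `build_inverted_index`, and the sample users and levels are all hypothetical, and the sketch keeps only the per-topic level (not answer quality or response time).

```python
from collections import defaultdict

# Hypothetical forward index: userId -> {topicId: expertise level}.
forward_index = {
    "alice": {"cooking": 0.9, "travel": 0.4},
    "bob": {"travel": 0.8},
    "carol": {"cooking": 0.3, "python": 0.7},
}


def build_inverted_index(forward):
    """Turn userId -> topics into topicId -> [(userId, level)], best first."""
    inverted = defaultdict(list)
    for user_id, topics in forward.items():
        for topic_id, level in topics.items():
            inverted[topic_id].append((user_id, level))
    # Sort each topic's users so the strongest candidates come first.
    for users in inverted.values():
        users.sort(key=lambda pair: pair[1], reverse=True)
    return dict(inverted)


inverted_index = build_inverted_index(forward_index)
print(inverted_index["cooking"])
```

Looking up a topic in `inverted_index` gives the candidate answerers directly, which is why the inverted form is the one consulted when a question arrives.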
The Life of a Query
Aardvark normalizes the user's question into a Message data structure and sends it to the Conversation Manager. When the Conversation Manager decides the message is a question, it forwards it to the Question Analyzer to find out which topics the question belongs to. The Conversation Manager then checks with the user whether the topics returned by the Question Analyzer match what the user had in mind, and asks the user to correct them if not. At the same time, the Conversation Manager issues a Routing Suggestion Request to the Routing Engine. The Routing Engine uses the Inverted Index and the Social Graph to build a list of candidates likely to answer, and ranks them by the candidates' answer quality and by how well they match what the asker wants. The Routing Engine returns the ranked Routing Suggestions to the Conversation Manager. Following the Routing Policy, the Conversation Manager keeps asking candidate answerers whether they can respond to the question. Finally, the Conversation Manager returns the answer to the asker, and relays messages whenever the asker and the answerer need to exchange further information.
ANATOMY
The Model
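The core of Aardvark is a statistical model for routing questions to potential answerers. The model is a network variant of what is known as the aspect model. It has two main features:
- $p(u_i \mid q)$: the probability that user $u_i$ answers a question $q$ belonging to topic $t$
- $p(u_i \mid u_j)$: the probability that user $u_i$ answers a question from user $u_j$, regardless of what the question is
The product of these two probabilities is the scoring function $s(u_i, u_j, q) = p(u_i \mid u_j) \cdot p(u_i \mid q)$.
Stated as a ranking problem, the goal is: given a question $q$ from user $u_j$, produce a list of users $u_i$ that maximizes $s(u_i, u_j, q)$.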
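To make the scoring function concrete, here is a minimal Python sketch. It assumes that the question-dependent probability is obtained by summing over topics, p(u_i|q) = Σ_t p(u_i|t) · p(t|q), which is the usual aspect-model form and my own assumption rather than something spelled out in this summary. The function names and all numbers are hypothetical.

```python
def score(p_user_given_asker, p_user_given_topic, p_topic_given_question):
    """s(u_i, u_j, q) = p(u_i|u_j) * sum_t p(u_i|t) * p(t|q)."""
    expertise = sum(
        p_user_given_topic.get(topic, 0.0) * weight
        for topic, weight in p_topic_given_question.items()
    )
    return p_user_given_asker * expertise


def rank_candidates(candidates, p_topic_given_question):
    """Return (user, score) pairs sorted from highest to lowest score."""
    scored = [
        (user, score(connection, topics, p_topic_given_question))
        for user, (connection, topics) in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic distribution of one question, p(t|q).
p_topic_given_question = {"cooking": 0.75, "travel": 0.25}

# Hypothetical candidates: user -> (p(u_i|u_j), {topic: p(u_i|t)}).
candidates = {
    "alice": (0.5, {"cooking": 0.5}),
    "bob": (0.25, {"cooking": 0.5, "travel": 1.0}),
}

ranking = rank_candidates(candidates, p_topic_given_question)
print(ranking)
```

In this toy data bob covers more of the question's topics, but alice's stronger connection to the asker still puts her first, which shows how the two factors trade off.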
Social Crawling
Because Aardvark's information sources are people, it needs to retain and grow its base of active users. To do that, it has to give users the experience of being glad they used it.
Indexing People
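Aardvark needs to know two things about each user $u_i$:
- the topics $t$ that user $u_i$ can answer questions about, with probability $p(t \mid u_i)$
- the connections between user $u_i$ and other users $u_j$
First, topics.
The value of $p(t \mid u_i)$ is estimated from what the user has said in the past. In other words, the user is viewed as a content generator who responds about a given topic with some probability. Aardvark computes a score for every topic and records it in the user profile. In addition, it tracks:
- topics the user has ignored
- topics the user declined to answer when given the opportunity
- topics on which another user left negative feedback about the user's answer
From these it learns which topics should not be sent to the user.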
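Aardvark also periodically runs a topic strengthening algorithm. The basic idea is that if a user is an expert in a particular topic and most of that user's friends are also experts in it, Aardvark places more confidence in that user's expertise than it would if the user were the only expert in their group of friends.
For a user $u_i$, that user's group of friends $U$, and a particular topic $t$:
if $p(t \mid u_i) \neq 0$ then $s(t \mid u_i) = p(t \mid u_i) + \gamma \sum_{u \in U} p(t \mid u)$
Here $\gamma$ is a small constant.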
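The strengthening rule can be sketched in Python as follows. This is a rough illustration, assuming the rule adds a small multiple of the friends' topic probabilities to the user's own value and only touches topics the user already has a nonzero value for. The names `strengthen_topics`, `friends`, and `gamma`, and the sample data, are hypothetical, and the result is left unnormalized.

```python
def strengthen_topics(p_topic_given_user, friends, gamma=0.1):
    """Boost each user's nonzero topics by gamma times the friends' total."""
    strengthened = {}
    for user, topics in p_topic_given_user.items():
        boosted = {}
        for topic, p in topics.items():
            if p == 0:
                # The rule only applies when the user already has the topic.
                boosted[topic] = p
                continue
            friend_mass = sum(
                p_topic_given_user.get(friend, {}).get(topic, 0.0)
                for friend in friends.get(user, [])
            )
            boosted[topic] = p + gamma * friend_mass
        strengthened[user] = boosted
    return strengthened


# Hypothetical topic probabilities and friend lists.
p_topic_given_user = {
    "alice": {"cooking": 0.5, "travel": 0.5},
    "bob": {"cooking": 1.0},
    "carol": {"cooking": 0.5, "python": 0.5},
}
friends = {"alice": ["bob", "carol"]}

strengthened = strengthen_topics(p_topic_given_user, friends, gamma=0.5)
print(strengthened["alice"])
```

Alice's cooking score rises because both friends know cooking, her travel score is unchanged because neither friend has it, and she does not pick up python, since the rule never creates a topic the user lacks.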
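In addition, Aardvark runs two smoothing algorithms. These are ways of assigning probability to topics the user has not explicitly answered about. One uses basic collaborative filtering techniques, so that similar topics receive similar values. The other uses semantic similarity.
After these steps Aardvark has a score for every topic for a given user, so it normalizes them such that $\sum_{t} p(t \mid u_i) = 1$. Then, by Bayes' theorem, for a topic $t$ and a user $u_i$ it obtains
$p(u_i \mid t) = \dfrac{p(t \mid u_i)\, p(u_i)}{p(t)}$
where $p(u_i)$ is a uniform distribution and $p(t)$ is the observed frequency of topic $t$. Aardvark stores the computed $p(u_i \mid t)$ in the inverted index keyed by topic, ready for when a question arrives.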
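A small Python sketch of the normalization and the Bayes step, under stated assumptions: p(u) is uniform over the known users, and the topic frequencies p(t) are supplied by the caller. In the example they are chosen to equal the marginal Σ_u p(t|u) · p(u) so that the results are easy to check by hand. All names and numbers are hypothetical.

```python
def normalize(scores):
    """Scale one user's topic scores so that they sum to 1."""
    total = sum(scores.values())
    return {topic: value / total for topic, value in scores.items()}


def invert_with_bayes(p_topic_given_user, p_topic):
    """Compute p(u|t) = p(t|u) * p(u) / p(t) with a uniform p(u)."""
    p_user = 1.0 / len(p_topic_given_user)
    inverted = {}
    for user, topics in p_topic_given_user.items():
        for topic, p_t_u in topics.items():
            inverted.setdefault(topic, {})[user] = p_t_u * p_user / p_topic[topic]
    return inverted


# Hypothetical raw topic scores per user.
raw_scores = {
    "alice": {"cooking": 3.0, "travel": 1.0},
    "bob": {"travel": 2.0},
}
p_topic_given_user = {user: normalize(s) for user, s in raw_scores.items()}

# Hypothetical observed topic frequencies p(t).
p_topic = {"cooking": 0.375, "travel": 0.625}

p_user_given_topic = invert_with_bayes(p_topic_given_user, p_topic)
print(p_user_given_topic)
```

The output is already keyed by topic, so it has the same shape as the inverted index described above.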
Next, connections.
Aardvark computes the connection between any pair of users in a variety of ways. Distance in the social network matters, but so does similarity in demographics and behavior. The factors it considers include:
- Social connection (common friends and affiliations)
- Demographic similarity
- Profile similarity (e.g. common favorite movies)
- Vocabulary match (e.g. IM shortcuts)
- Chattiness match (frequency of follow-up messages)
- Verbosity match (the average length of messages)
- Politeness match (e.g. the use of "Thanks")
- Speed match (responsiveness to other users)
The strength of the connection between users is computed using weighted cosine similarity and normalized so that $\sum_{u_i \in U} p(u_i \mid u_j) = 1$.
Both topics and connections are updated continuously.
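As an illustration, weighted cosine similarity followed by normalization might look like this in Python. The feature vectors, the weights, and the function names are all hypothetical; the sketch assumes each user is described by a fixed-length vector of non-negative feature values, one per factor.

```python
import math


def weighted_cosine(a, b, weights):
    """Cosine similarity in which each feature contributes by its weight."""
    dot = sum(w * x * y for w, x, y in zip(weights, a, b))
    norm_a = math.sqrt(sum(w * x * x for w, x in zip(weights, a)))
    norm_b = math.sqrt(sum(w * y * y for w, y in zip(weights, b)))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def connection_strengths(asker, others, weights):
    """Similarity of the asker to every other user, normalized to sum to 1."""
    raw = {name: weighted_cosine(asker, vec, weights) for name, vec in others.items()}
    total = sum(raw.values())
    if total == 0:
        return {name: 0.0 for name in raw}
    return {name: value / total for name, value in raw.items()}


# Hypothetical features: [social connection, demographics, politeness].
weights = [2.0, 1.0, 1.0]
asker = [1.0, 0.0, 1.0]
others = {"bob": [1.0, 0.0, 1.0], "carol": [1.0, 1.0, 0.0]}

strengths = connection_strengths(asker, others, weights)
print(strengths)
```

Giving the social-connection feature a larger weight means that sharing friends counts for more than, say, a similar level of politeness.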
Analyzing Questions
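The purpose of analyzing a question is to derive from question $q$ a list of scores over topics, $p(t \mid q)$. First, the following classifiers are applied to select the questions that should be answered:
- NonQuestionClassifier: whether the message is a question at all
- InappropriateQuestionClassifier: whether the text is appropriate for Q&A
- TrivialQuestionClassifier: whether the question really needs a person to answer it
- LocationSensitiveClassifier: whether knowledge about a specific location is required
Next, the results of the following TopicMapper algorithms are combined to obtain the list of topics for the question:
- KeywordMatchTopicMapper: matches strings in the question against the topics listed in user profiles
- TaxonomyTopicMapper: classifies the question into about 3,000 categories using an SVM
- SalientTermTopicMapper: extracts salient terms from the question
- UserTagTopicMapper: uses the tags attached by the asker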
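How the TopicMapper results are combined is not described here, so the following Python sketch is purely an assumption: it adds up the scores each mapper assigns to a topic and normalizes the totals into a distribution. The function name and the sample mapper outputs are hypothetical.

```python
def merge_topic_scores(mapper_outputs):
    """Sum per-topic scores across mappers, then normalize to sum to 1."""
    combined = {}
    for scores in mapper_outputs:
        for topic, value in scores.items():
            combined[topic] = combined.get(topic, 0.0) + value
    total = sum(combined.values())
    if total == 0:
        return {}
    return {topic: value / total for topic, value in combined.items()}


# Hypothetical outputs of two mappers for one question.
keyword_match = {"cooking": 1.0}
taxonomy = {"cooking": 0.5, "travel": 0.5}

p_topic_given_question = merge_topic_scores([keyword_match, taxonomy])
print(p_topic_given_question)
```

A topic that several mappers agree on ends up with a larger share, which is the behavior one would want from any combination scheme, whatever the real one looks like.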
The Aardvark Ranking Algorithm
Ranking is performed by the RoutingEngine, which uses the asker and the information received from the Question Analyzer to build an ordered list of the people most likely to answer. The factors that determine a user's ranking are Topic Expertise $p(u_i \mid q)$, Connectedness $p(u_i \mid u_j)$, and Availability.
- Topic Expertise: first, the Routing Engine looks for the set of users who are semantic matches for the question. For location-sensitive questions, it considers only users whose profile matches that location.
- Connectedness: next, the Routing Engine assesses how much affinity each candidate has with the asker, independently of the candidate's expertise in the topic.
- Availability: third, the Routing Engine prioritizes the people likely to answer the question right now, using signals such as whether they are online on IM.
Once the list of likely answerers is ready, Aardvark removes anyone who should not be contacted according to its guidelines and passes the remaining list to the Conversation Manager. The Conversation Manager contacts each person on the list in turn, asking whether they will answer the question, until an answer is obtained.
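The last step, filtering the ranked list and then asking candidates one by one, can be sketched as below. This is a simplified illustration: the guideline check is reduced to a hypothetical `do_not_contact` set, and `is_willing` is a stand-in callback for actually contacting a person and waiting for a reply.

```python
def route_question(ranked_candidates, do_not_contact, is_willing):
    """Ask candidates in rank order and return the first one who accepts."""
    for user in ranked_candidates:
        if user in do_not_contact:
            # Excluded by the guidelines, so this user is never contacted.
            continue
        if is_willing(user):
            return user
    return None  # nobody on the list agreed to answer


# Hypothetical ranked list, best candidate first.
ranked = ["alice", "bob", "carol"]
blocked = {"alice"}

answerer = route_question(ranked, blocked, lambda user: user == "carol")
print(answerer)
```

Here alice is skipped without being contacted, bob is asked and declines, and carol ends up answering.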
The rest of the paper continues with:
- User Interface
- EXAMPLES
- EVALUATION
- ACKNOWLEDGMENTS
- REFERENCES
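I find the UI interesting, but since I do not think it is part of the architecture I will not cover it here (which is another way of saying I got tired). For details, see the original paper or the materials posted at "The Anatomy of Large-Scale Social Search Engine: ソーシャル検索エンジンAardvark論文の輪講用資料 - シリコンの谷のゾンビ".
Closing
Several algorithm names came up, and about all I understand is that I do not understand them. I need to study. Social apps are popular these days, and I think an app that recommends someone who knows the answer is an interesting idea, but who knows when I will be able to build one myself. For now, I am just glad that my sleep-deprived days are over.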