What I did to get Japanese search working with Apache Solr 3.3.0
Since this is apparently going to be the era of full-text search services built on Solr, I downloaded Solr and got the sample application to handle Japanese.
I am looking things up as I go, so some of what I say here may be wrong.
Downloading and extracting Apache Solr 3.3.0
From the menu on the left side of http://lucene.apache.org/solr/, follow "Resources" -> "Download" and download the latest version (3.3.0).
This time I downloaded apache-solr-3.3.0.tgz.
$ wget http://ftp.kddilabs.jp/infosystems/apache//lucene/solr/3.3.0/apache-solr-3.3.0.tgz
$ tar xvf apache-solr-3.3.0.tgz
$ # Put it in a suitable directory (here, ~/Public/apache-solr/3.3.0/)
$ mkdir -p ~/Public/apache-solr
$ mv apache-solr-3.3.0 ~/Public/apache-solr/3.3.0
Downloading and placing lucene-gosen
Download lucene-gosen, which is needed for Japanese morphological analysis, and place it in solr/lib.
Download the latest version of lucene-gosen from http://code.google.com/p/lucene-gosen/.
This time I downloaded lucene-gosen-1.1.1-ipadic.jar.
$ # Move to the example directory
$ cd ~/Public/apache-solr/3.3.0/example
$ mkdir solr/lib
$ cd solr/lib
$ wget http://lucene-gosen.googlecode.com/files/lucene-gosen-1.1.1-ipadic.jar
Editing schema.xml
Edit solr/conf/schema.xml so that Solr can use lucene-gosen.
...
<types>
  ...
  <fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.JapaneseTokenizerFactory" />
    </analyzer>
  </fieldType>
  ...
</types>
...
text_ja can now be used as a field type.
Next, edit solr/conf/schema.xml again and change the type of the text field.
...
<fields>
  ...
  <!-- around line 547 -->
  <field name="text" type="text_ja" indexed="true" stored="false" multiValued="true" />
  ...
</fields>
...
While we are at it, add a field of our own and have its contents copied into the text field so that they end up in the search index.
...
<fields>
  ...
  <field name="nihongo" type="text_ja" indexed="true" stored="true" />
  <copyField source="nihongo" dest="text"/>
  ...
</fields>
...
Almost there
Starting Solr
Run start.jar, which is in the example directory.
$ cd ~/Public/apache-solr/3.3.0/example
$ java -jar start.jar
Registering a sample document
Using the XML files in exampledocs as a reference, create a document and add it to the index.
<?xml version="1.0" encoding="UTF-8"?>
<!-- Save this as nihongo.xml -->
<add>
  <doc>
    <field name="id">NIHONGOTEST</field>
    <field name="nihongo">日本語のテストです</field>
  </doc>
</add>
$ cd exampledocs
$ ./post.sh nihongo.xml
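As far as I can tell, post.sh is just a thin wrapper around curl: it POSTs each file to the update URL and then sends a commit so the changes become visible. Below is a minimal sketch of those two requests, assuming the default example URL http://localhost:8983/solr/update and an application/xml content type. The run and post_xml helpers and the DRY_RUN switch are names I made up for illustration; they are not part of Solr.

```shell
#!/bin/sh
# Update endpoint of the example server (assumed default host and port).
URL="http://localhost:8983/solr/update"

# Run a command, or only print it when DRY_RUN=1 (useful without a running Solr).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Send one XML file to Solr, then commit so the document becomes searchable.
post_xml() {
  run curl "$URL" --data-binary "@$1" -H 'Content-type:application/xml'
  run curl "$URL" --data-binary '<commit/>' -H 'Content-type:application/xml'
}

# Dry run: only prints the two curl commands.
DRY_RUN=1
post_xml nihongo.xml
```

Set DRY_RUN=0 (with Solr running) to actually send the requests. The commit matters: without it the added document does not show up in searches.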
Verification
Open http://localhost:8983/solr/admin/ to check the result.
Type "日本語" into the big box in the middle; if the search finds the document, you are done!
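The same check can probably be done from the command line by querying the select handler directly. The query has to be percent-encoded as UTF-8, so here is a rough sketch that builds the URL. It assumes the default port and the /select handler from the example configuration; urlencode and select_url are helper names I made up for illustration.

```shell
#!/bin/sh
# Percent-encode every byte of the argument.
# (This also encodes plain ASCII, which is longer than necessary but still valid.)
urlencode() {
  printf '%s' "$1" | od -An -v -tx1 | tr -d ' \t\n' | sed 's/../%&/g' | tr 'a-f' 'A-F'
}

# Base URL of the example server started with start.jar (assumed default).
SOLR_URL="http://localhost:8983/solr"

# Build the search URL for a query string.
select_url() {
  printf '%s/select?q=%s\n' "$SOLR_URL" "$(urlencode "$1")"
}

select_url "日本語"
# prints http://localhost:8983/solr/select?q=%E6%97%A5%E6%9C%AC%E8%AA%9E

# To run the search against a running Solr:
#   curl "$(select_url "日本語")"
```

If the NIHONGOTEST document comes back in the response, the Japanese tokenizer and the copy into the text field are both working.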
If you want a closer look, follow the [ANALYSIS] link and enter the following to see how the text is analyzed.
Field type : text_ja
Field value : すもももももももものうち
Check the "verbose output" box