Sudachi may be the more widely used choice these days, but from what I could find online there is more information about using kuromoji with Elasticsearch, so this time I went with kuromoji.
Reference URL
https://qiita.com/mserizawa/items/8335d39cacb87f12b678
install analysis-kuromoji
$ bin/elasticsearch-plugin install analysis-kuromoji
-> Installing analysis-kuromoji
-> Downloading analysis-kuromoji from elastic
[=================================================] 100%
-> Installed analysis-kuromoji
-> Please restart Elasticsearch to activate any plugins installed
After the installation above, check the result as shown below
$ bin/elasticsearch-plugin list
analysis-kuromoji

$ curl --cacert config/certs/http_ca.crt -u elastic \
  "https://localhost:9200/_nodes/plugins?pretty"
Enter host password for user 'elastic': mNXX=JAaEr+gVO5zDQhB
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "elasticsearch",
  "nodes" : {
    ... (omitted) ...
      "plugins" : [
        {
          "name" : "analysis-kuromoji",
          "version" : "8.15.2",
          "elasticsearch_version" : "8.15.2",
          "java_version" : "17",
          "description" : "The Japanese (kuromoji) Analysis plugin integrates Lucene kuromoji analysis module into elasticsearch.",
          "classname" : "org.elasticsearch.plugin.analysis.kuromoji.AnalysisKuromojiPlugin",
          "extended_plugins" : [ ],
          "has_native_controller" : false,
          "licensed" : false,
          "is_official" : true
        }
    ... (omitted) ...
}
Temporarily close the index and set kuromoji as the default tokenizer
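The same check can be scripted. The following is a minimal sketch, assuming the `_cat/plugins` API is queried with `h=component` so that each output line carries one plugin name; `has_plugin` is a hypothetical helper name, not part of Elasticsearch.

```shell
# has_plugin (hypothetical helper): reads plugin names, one per line, on stdin
# and succeeds only if the plugin given as $1 is among them.
# Comparing the first whitespace-separated field tolerates column padding.
has_plugin() {
  awk -v p="$1" '$1 == p { found = 1 } END { exit !found }'
}

# Usage against a running cluster (not executed here):
#   curl -s --cacert config/certs/http_ca.crt -u elastic \
#     "https://localhost:9200/_cat/plugins?h=component" \
#     | has_plugin analysis-kuromoji && echo "kuromoji is installed"
```

Keeping the comparison in a small function makes it easy to reuse the check for other plugins.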
$ curl --cacert config/certs/http_ca.crt -u elastic \
  -XPOST https://localhost:9200/test_index/_close
Enter host password for user 'elastic': mNXX=JAaEr+gVO5zDQhB

$ curl --cacert config/certs/http_ca.crt -u elastic \
  -H "Content-Type: application/json" -X PUT \
  "https://localhost:9200/_all/_settings?preserve_existing=true" \
  -d '{"index.analysis.analyzer.default.tokenizer": "kuromoji_tokenizer", "index.analysis.analyzer.default.type" : "custom"}'
Enter host password for user 'elastic': mNXX=JAaEr+gVO5zDQhB

$ curl --cacert config/certs/http_ca.crt -u elastic \
  -XPOST https://localhost:9200/test_index/_open
Enter host password for user 'elastic': mNXX=JAaEr+gVO5zDQhB
Loading Japanese data
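For a brand-new index, the close/update/open cycle can be avoided by declaring the default analyzer when the index is created. The following is a minimal sketch under that assumption; the function names are hypothetical, and the certificate path and host simply mirror the commands above.

```shell
# default_kuromoji_settings (hypothetical helper): prints an index-creation body
# that makes a custom analyzer using kuromoji_tokenizer the index default.
default_kuromoji_settings() {
  printf '%s\n' '{"settings":{"analysis":{"analyzer":{"default":{"type":"custom","tokenizer":"kuromoji_tokenizer"}}}}}'
}

# create_index_with_kuromoji (hypothetical helper): creates the index named in $1
# with the body above. Requires a running cluster, so it is only defined here.
create_index_with_kuromoji() {
  curl --cacert config/certs/http_ca.crt -u elastic \
    -H "Content-Type: application/json" -X PUT \
    "https://localhost:9200/$1" \
    -d "$(default_kuromoji_settings)"
}
```

Calling `create_index_with_kuromoji some_new_index` would then create the index with kuromoji already in place; the index name is up to you.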
$ vi wine.json { "index" : {} } { "name": "ã«ãã«ãã»ã½ã¼ã´ã£ãã¨ã³", "description": "ã«ãã«ãã»ã½ã¼ã´ã£ãã¨ã³ (Cabernet Sauvignon) ã¯ãä¸ççã«æãæåãªèµ¤ã¯ã¤ã³ç¨ã®ä»£è¡¨ã¯ã¤ã³ç¨å種ã®1ã¤ã§ãããåã«ãã«ãã«ãã(Cabernet) ã¨ãå¼ã°ãããã¨ãå¤ãããã©ã³ã¹ã§ã¯ã¡ããã¯å°åºã«ä»£è¡¨ãããããã«ãã«ãã¼ã®æãéè¦ãªå種ã®ä¸ã¤ã§ãããä¸çåå°ã§ãæ ½å¹ããã¦ããããæ¯è¼ç温æãªæ°åã好ããã½ã¼ã´ã£ãã¨ã³ã»ãã©ã³ã¨ã«ãã«ãã»ãã©ã³ã®èªç¶äº¤é ã«ãã£ã¦èªçããã¨ãããã¦ããã æç®ã®ã¿ã³ãã³åãå¤ããå¼·ãæ¸å³ã®ããæ¿åãªã¯ã¤ã³ã¨ãªããéå³ãå¤ããæ¯è¼çé·æã®çæãå¿ è¦ã¨ãããå¼·éããæ¸å³ãç·©åãã¹ããã¡ã«ãã¼çã®ä»ã®å種ã¨ã®æ··é¸ãæ··åãå°ãªããªããæ´å²çã«ã¯ãã´ã£ãã¥ã¼ã¬ããã´ã§ãã¼ã¬ãï¼ã硬ããã®æï¼ã¨ãå¼ã°ãããã½ã¼ã´ã£ãã¨ã³ã»ãã©ã³åæ§ã¡ããã·ãã©ã¸ã³(Methoxypyrazine)ã«ç±æ¥ããã¢ãããããã"} { "index" : {} } { "name": "ã¡ã«ãã¼", "description": "ã¡ã«ãã¼ (Merlot) ã¯ã赤ã¯ã¤ã³ç¨ã®å種ã®ä¸ã§ã¯æ大ã®ä½å°é¢ç©ããã¤ãã¨ãã«ãã©ã³ã¹ã®ãã«ãã¼ãããããçä¼¼ãããã«ãã¼ã»ãã¬ã³ããã«ããã¦é常ã«éè¦ã§ãããã«ãã«ãã»ã½ã¼ã´ã£ãã¨ã³ã¨ãã¬ã³ãããããã¨ããããã«ãã«ãã»ã½ã¼ã´ã£ãã¨ã³ã«æ¯ãç½ããã§ã軽å£ã§ãããã¾ãããã«ãã¼ã®ãµã³ãããªãªã³(Saint-Emilion)ããã ãã¼ã«(Pomerol)ã¨ãã£ãå°åºã§ã¯ãã«ãã«ãã»ã½ã¼ã´ã£ãã¨ã³ãããå¤ãé åãããã¨ãã«ãã ãã¼ã«å°åºã®ãã·ã£ãã¼ã»ãããªã¥ã¹ãã¯ããã°ãã°ãã®å種åç¬ã§é ããããæ¥æ¬ã§ãé·éçã®å¡©å°»å¸æ¡æ¢ã¶åå°åºãªã©ã§æ ½å¹ããã¦ãããåå£ã®å¡©åã«å¼±ãã"} { "index" : {} } { "name": "ããã»ãã¯ã¼ã«", "description": "ããã»ãã¯ã¼ã« (Pinot Noir) ã¯ããã©ã³ã¹ã®ãã«ã´ã¼ãã¥å°æ¹ãåç£ã¨ããä¸ççãªå種ã§ãç´«è²ã帯ã³ãéè²ã®æç®ãæã¤ãå·æ¶¼ãªæ°åã好ã¿ãç¹ã«æ¸©æãªæ°åã§ã¯è²ããã¬ã¼ãã¼ãå®å®ããªãã®ã§æ ½å¹ã¯é£ãããã¤ã¿ãªã¢ã§ã¯ãããã»ããã(Pinot Nero)ããã¤ãã§ã¯ãã·ã¥ãã¼ããã«ã°ã³ãã¼ã(Spätburgunder)ã®åããããéºä¼åçã«ä¸å®å®ã§å¤ç°ç¨®ãå°ãªããªãããã®ä¸ã«ã¯ãç·ã¿ã帯ã³ãé»è²ã®æç®ãæã¤ããã»ãã©ã³(Pinot Blanc)ãè¤è²ã®ããã»ã°ãª(Pinot Gris)ãªã©ããããæã«ã¯åã樹ã«ç°ãªã£ãè²ã®æå®ããªãã¨ããããã¦ããããã©ã³ã¹ä»¥å¤ã§ã¯æè¿ãã¥ã¼ã¸ã¼ã©ã³ãã§ã®æ ½å¹ãçãã§ãå¯å·å°ãä¸å¿ã«æ ½å¹ããããã¯ã¤ã³ã¯ã©ã¤ãããã£ã§ãå¼±ãã®æ¸å³ãç¹ç´°ãªã¢ããã¨ãã¬ã¼ãã¼ãç¹å¾´ã§ãããã·ã£ã³ãã³ã«ãæ¬ ãããªãå種ã§ããã"} { "index" : {} } { "name": "ã·ã©ã¼", "description": 
"ã·ã©ã¼(Syrah)ã¯ãã·ã©ã¼ãºã(Shiraz)ã¨ãå¼ã°ãã赤ã¯ã¤ã³ç¨ã®ä»£è¡¨çãªå種ã®1ã¤ã§ãããã·ã©ã¼ãºã¯ã¤ã©ã³ã®é½å¸åã§ãããããã©ã³ã¹ã»ãã¼ãå°æ¹ãèµ·æºã¨ãããããã¼ãå°æ¹ã®ä»£è¡¨çãªå種ã§ããä»ããªã¼ã¹ãã©ãªã¢ã§ã¯æãéè¦ãªå種ã§ãããåã¢ããªã«ãããªãªã©ã§ãæ ½å¹ããã¦ãããã¯ã¤ã³ã¯ãã«ããã£ã§é¦å³ãå¼·ããã«ãã«ãã»ã½ã¼ã´ã£ãã¨ã³ã«æ¯ã¹ã¿ã³ãã³ããæ°é®®ããªã®ãç¹å¾´ã§ãããä»ã®å種ã¨ã®æ··é¸ãæ··åãè¦ããããæ ½å¹ãããæ°åã風åã«ãã£ã¦å³ãç°ãªãããã¼ãæ¸è°·åé¨ã®ã³ã¼ãã»ããã£ãã¨ã«ãã¿ã¼ã¸ã¥ããªã¼ã¹ãã©ãªã¢ç£ãæåãæå®ã¯çããã¨ããªã³ãããã"} { "index" : {} } { "name": "ãµã³ã¸ã§ã´ã§ã¼ã¼", "description": "ãµã³ã¸ã§ã´ã§ã¼ã¼ (Sangiovese) ã¯ãã¤ã¿ãªã¢ã§æãæ ½å¹é¢ç©ã®å¤ã赤ã¯ã¤ã³ç¨ã®å種ã§ãããæç®ã®è²ã®éããå«ãæ°å¤ãã®äºç¨®ãæã¤ãä¸å¤®ã¤ã¿ãªã¢ã®ãã¹ã«ã¼ãå·ã主ç£å°ã§ãã¤ã¿ãªã¢ã§æãæåãªä¸ã¤ã§ããããã£ã³ãã£(Chianti)ãã¯ãããããã«ãããã»ãã£ã»ã¢ã³ã¿ã«ãã¼ãã(Brunello di Montalcino) ããã´ã£ã¼ãã»ãã¼ãã¬ã»ãã£ã»ã¢ã³ããã«ãã¢ã¼ãã(Vino Nobile di Montepulciano) ããã¢ã¬ããªã¼ãã»ãã£ã»ã¹ã«ã³ãµã¼ãã(Morellino di Scansano)ãªã©ãçç£ããããã³ã«ã·ã«å³¶ã§ã¯ãããã¨ã«ãããªã(Nielluccio)ã¨ãã¦ç¥ãããã"}
$ curl --cacert config/certs/http_ca.crt -u elastic \
  -H "Content-Type: application/json" \
  -X POST https://localhost:9200/test_index/_bulk --data-binary @wine.json
Enter host password for user 'elastic': mNXX=JAaEr+gVO5zDQhB
{"errors":false,"took":200,
 "items":[
  {"index":{"_index":"test_index","_id":"rb8gdZIBNDBl3Nb_TkmR","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":1,"_primary_term":8,"status":201}},
  {"index":{"_index":"test_index","_id":"rr8gdZIBNDBl3Nb_TkmU","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":2,"_primary_term":8,"status":201}},
  {"index":{"_index":"test_index","_id":"r78gdZIBNDBl3Nb_TkmU","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":3,"_primary_term":8,"status":201}},
  {"index":{"_index":"test_index","_id":"sL8gdZIBNDBl3Nb_TkmU","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":4,"_primary_term":8,"status":201}},
  {"index":{"_index":"test_index","_id":"sb8gdZIBNDBl3Nb_TkmU","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"_seq_no":5,"_primary_term":8,"status":201}}
 ]
}
Search test in Japanese
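The `_bulk` endpoint expects newline-delimited JSON: an action line followed by the document on a single line, with every line (including the last) terminated by a newline. The following is a minimal sketch for producing a file in that shape from a list of documents; `to_bulk` and the file name `docs.ndjson` are hypothetical, and the sketch assumes each input line already holds one complete JSON document.

```shell
# to_bulk (hypothetical helper): reads one JSON document per line on stdin and
# emits the action/source line pairs that the _bulk API expects.
to_bulk() {
  while IFS= read -r doc || [ -n "$doc" ]; do
    # Skip blank lines so they do not become empty documents.
    [ -z "$doc" ] && continue
    # Each pair ends with a newline, so the final line is terminated as well.
    printf '%s\n%s\n' '{ "index" : {} }' "$doc"
  done
}

# Usage (not executed here):
#   to_bulk < docs.ndjson > wine.json
```

Sending the result with `--data-binary`, as in the command above, matters because `-d` would strip the newlines that separate the lines.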
$ curl --cacert config/certs/http_ca.crt -u elastic \ -H "Content-Type: application/json" \ http://localhost:9200/test_index/_search -d '{"query":{"match":{"description":"æ¸ã"}}}' Enter host password for user 'elastic': mNXX=JAaEr+gVO5zDQhB {"took":162,"timed_out":false,"_shards":{"total":1,"successful":1,"skipped":0,"failed":0},"hits":{"total":{"value":2,"relation":"eq"},"max_score":1.2804732,"hits":[{"_index":"test_index","_id":"rb8gdZIBNDBl3Nb_TkmR","_score":1.2804732,"_ignored":["description.keyword"],"_source":{ "name": "ã«ãã«ãã»ã½ã¼ã´ã£ãã¨ã³", "description": "ã«ãã«ãã»ã½ã¼ã´ã£ãã¨ã³ (Cabernet Sauvignon) ã¯ãä¸ççã«æãæåãªèµ¤ã¯ã¤ã³ç¨ã®ä»£è¡¨ã¯ã¤ã³ç¨å種ã®1ã¤ã§ãããåã«ãã«ãã«ãã(Cabernet) ã¨ãå¼ã°ãããã¨ãå¤ãããã©ã³ã¹ã§ã¯ã¡ããã¯å°åºã«ä»£è¡¨ãããããã«ãã«ãã¼ã®æãéè¦ãªå種ã®ä¸ã¤ã§ãããä¸çåå°ã§ãæ ½å¹ããã¦ããããæ¯è¼ç温æãªæ°åã好ããã½ã¼ã´ã£ãã¨ã³ã»ãã©ã³ã¨ã«ãã«ãã»ãã©ã³ã®èªç¶äº¤é ã«ãã£ã¦èªçããã¨ãããã¦ããã æç®ã®ã¿ã³ãã³åãå¤ããå¼·ãæ¸å³ã®ããæ¿åãªã¯ã¤ã³ã¨ãªã ãéå³ãå¤ããæ¯è¼çé·æã®çæãå¿ è¦ã¨ãããå¼·éããæ¸å³ãç·©åãã¹ããã¡ã«ãã¼çã®ä»ã®å種ã¨ã®æ··é¸ãæ··åãå°ãªããªããæ´å²çã«ã¯ãã´ã£ãã¥ã¼ã¬ããã´ã§ãã¼ã¬ãï¼ã硬ããã®æï¼ã¨ãå¼ã°ãããã½ã¼ã´ã£ãã¨ã³ã»ãã©ã³åæ§ã¡ããã·ãã©ã¸ã³(Methoxypyrazine)ã«ç±æ¥ããã¢ãããããã"}},{"_index":"test_index","_id":"r78gdZIBNDBl3Nb_TkmU","_score":0.84890795,"_ignored":["description.keyword"],"_source":{ "name": "ããã»ãã¯ã¼ã«", "description": "ããã»ãã¯ã¼ã« (Pinot Noir) ã¯ããã©ã³ã¹ã®ãã«ã´ã¼ãã¥å°æ¹ãåç£ã¨ããä¸ççãªå種ã§ãç´«è²ã帯ã³ãéè²ã®æç®ãæã¤ãå·æ¶¼ãªæ°åã好ã¿ãç¹ã«æ¸©æãªæ°åã§ã¯è²ããã¬ã¼ãã¼ãå®å®ããªãã®ã§æ ½å¹ã¯é£ãããã¤ã¿ãªã¢ã§ã¯ãããã»ããã(Pinot Nero)ããã¤ãã§ã¯ãã·ã¥ãã¼ããã«ã°ã³ãã¼ã(Spätburgunder)ã®åããããéºä¼åçã«ä¸å®å®ã§å¤ç°ç¨®ãå°ãªã ãªãããã®ä¸ã«ã¯ãç·ã¿ã帯ã³ãé»è²ã®æç®ãæã¤ããã»ãã©ã³(Pinot Blanc)ãè¤è²ã®ããã»ã°ãª(Pinot Gris)ãªã©ããããæã«ã¯åã樹ã«ç°ãªã£ãè²ã®æå®ããªãã¨ããããã¦ããããã©ã³ã¹ä»¥å¤ã§ã¯æè¿ã㥠ã¼ã¸ã¼ã©ã³ãã§ã®æ ½å¹ãçãã§ãå¯å·å°ãä¸å¿ã«æ ½å¹ããããã¯ã¤ã³ã¯ã©ã¤ãããã£ã§ãå¼±ãã®æ¸å³ãç¹ç´°ãªã¢ããã¨ãã¬ã¼ãã¼ãç¹å¾´ã§ãããã·ã£ã³ãã³ã«ãæ¬ ãããªãå種ã§ããã"}}]}}
Morphological analysis test
$ curl --cacert config/certs/http_ca.crt -u elastic \ -H "Content-Type: application/json" \ -X POST "http://localhost:9200/test_index/_analyze" -d '{"text":"æ¸ã"}' Enter host password for user 'elastic': mNXX=JAaEr+gVO5zDQhB {"tokens":[{"token":"æ¸","start_offset":0,"end_offset":1,"type":"word","position":0},{"token":" ã","start_offset":1,"end_offset":2,"type":"word","position":1}]}
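The request above goes through the index's default analyzer. To look at the tokenizer on its own, the `_analyze` API also accepts an explicit `tokenizer` in the request body. The following is a minimal sketch of that variant; the function names are hypothetical, and the body builder assumes the text contains no double quotes or backslashes, since it does no JSON escaping.

```shell
# analyze_body (hypothetical helper): builds an _analyze request body that names
# kuromoji_tokenizer explicitly. Assumes $1 needs no JSON escaping.
analyze_body() {
  printf '{"tokenizer":"kuromoji_tokenizer","text":"%s"}\n' "$1"
}

# analyze_with_kuromoji (hypothetical helper): sends the body to the cluster-level
# _analyze endpoint. Requires a running cluster, so it is only defined here.
analyze_with_kuromoji() {
  curl --cacert config/certs/http_ca.crt -u elastic \
    -H "Content-Type: application/json" \
    -X POST "https://localhost:9200/_analyze" \
    -d "$(analyze_body "$1")"
}
```

Comparing this output with the index-level request is a quick way to confirm that the default analyzer setting actually took effect.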