Budou: a machine-learning-based solution to the Japanese line-break problem
Hello! When you build Japanese websites, you sometimes run into problems that are specific to Japanese. The one I want to focus on this time is the Japanese line-break problem. I recently released a library that solves it, so let me introduce it.
What is the Japanese line-break problem, anyway?
When you browse websites written in Japanese, you occasionally see text wrapped at an odd place. For example, the word ソリューション ("solution") ends up split into ソリューショ and ン across two lines. That is hard to read.
In English, words are separated by spaces, but in Asian languages such as Japanese and Chinese they usually are not. As a result, English text is normally never wrapped in the middle of a word, while Japanese text often is. In body text this is tolerable, but when it happens in a heading or a tagline it is quite noticeable, isn't it? Long katakana loanwords make it stand out even more. So I would like to call this phenomenon, Japanese words being broken across lines, the Japanese line-break problem.
Of course, it is not as if there are no countermeasures.
Existing solution 1: use the <br> tag
If a word is going to be broken in the middle, just insert a line break before it! That is the idea here. It is not a bad approach, but if the site uses responsive design, the forced break can end up in an undesirable place.
Existing solution 2: use white-space: nowrap
I think this is also a common approach. Text inside a tag styled with white-space: nowrap is never wrapped, so you enclose the words you do not want broken in such a tag. With this approach, however, a word can overflow the screen at some screen widths.
Existing solution 3: use display: inline-block
I think this is the promising approach. A tag styled with display: inline-block avoids wrapping its content as far as possible, so it prevents line breaks in the middle of a word. What is more, when the screen is extremely narrow, the content wraps instead of overflowing the screen. It seems most common to apply display: inline-block to SPAN tags.
So display: inline-block looks like it can solve the problem. "All right, let's start adding SPAN tags!" Done. Done.
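The manual version of this approach can be sketched in a few lines of Python. This is only an illustration: the helper name wrap_phrases, the class name wordwrap, and the hand-picked phrases are assumptions made for the example, and the CSS rule is kept as a plain string to show what the page would need.

```python
from html import escape


def wrap_phrases(phrases, class_name="wordwrap"):
    """Wrap each hand-picked phrase in a SPAN so it can be styled as inline-block."""
    return "".join(
        '<span class="{}">{}</span>'.format(escape(class_name, quote=True), escape(phrase))
        for phrase in phrases
    )


# CSS the page needs so that each SPAN wraps only as a whole unit.
CSS_RULE = ".wordwrap { display: inline-block; }"

# The phrase boundaries are chosen by a human who can read the sentence.
html = wrap_phrases(["今日も", "元気です"])
print(html)
```

Note that the hard part, deciding where the phrases begin and end, is still done by hand here.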
But is it really solved?
That said, this work is a real chore. You have to go through the text sentence by sentence and work out where the phrases are. That is fine for a small site, but unrealistic for one with a huge number of pages. Above all, you have to be able to read Japanese. Some sites are built in English and then translated into Japanese, so it is by no means rare that neither the engineers nor the designers can read Japanese.
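Enter Budou
Budou is a library that automatically splits text into nicely sized phrases.
For example, applying Budou to the example from the beginning keeps every word in one piece. Nothing is broken!
What it does is very simple: it uses the Cloud Natural Language API to perform wakachigaki (word segmentation)*1, and then builds phrases from the result according to a set of predefined rules.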
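To give a feel for the second step, here is a minimal sketch of rule-based phrase building. It is not Budou's actual rule set: the token list is hand-written rather than returned by the API, the part-of-speech tags are illustrative, and the single rule (particles, auxiliaries, and punctuation attach to the preceding phrase) is an assumption chosen for the example.

```python
# Hand-written (word, part-of-speech) pairs standing in for the output of
# word segmentation; the tags are illustrative assumptions.
TOKENS = [("今日", "NOUN"), ("も", "PRT"), ("元気", "NOUN"), ("です", "AUX")]

# Assumed rule: tokens with these tags never start a new phrase.
ATTACH_TO_PREVIOUS = {"PRT", "AUX", "PUNCT"}


def build_phrases(tokens):
    """Merge dependent tokens into the phrase that precedes them."""
    phrases = []
    for word, pos in tokens:
        if phrases and pos in ATTACH_TO_PREVIOUS:
            phrases[-1] += word   # glue onto the previous phrase
        else:
            phrases.append(word)  # start a new phrase
    return phrases


phrases = build_phrases(TOKENS)
print(phrases)
```

Each resulting phrase is a unit that should stay on one line, which is exactly what the inline-block SPANs express in HTML.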
Let's take a look at how to use it!
Installation
You can install Budou with:
    pip install budou
If you want it to be available to the system Python, add sudo as needed.
Preparing Cloud Natural Language API credentials
To use Budou, you need to set up the Cloud Natural Language API.
1. Create a project in the Google Cloud Console. Clicking the pull-down in the navigation bar shows a "Create project" item; click it.
2. Select "API Manager" from the side menu, select "Library" in the sidebar, and search for Cloud Natural Language API.
3. Select Cloud Natural Language API. There should be a link labeled "Enable"; click it to enable the API.
4. Select "Credentials" in the sidebar, click the "Create credentials" button, and generate a service account key. The key type can be JSON or P12; choose JSON here.
5. Finally, select "Billing" from the side menu and link valid billing information to the project. The API setup is then complete.
Note
Even if the API is enabled, the Cloud Natural Language API cannot be used until the final "Billing" step is completed, so Budou will not work either. The Cloud Natural Language API does have a free tier, though, so for short texts you should be able to use it free of charge up to about 5,000 times a month. For details on pricing, see the link below.
Pricing | Google Cloud Natural Language API Documentation | Google Cloud Platform
Trying out Budou
Once the API is set up, start python and try Budou.
    import budou
    # Authenticate with the JSON service account key downloaded earlier.
    parser = budou.authenticate('/path/to/json')
    # The second argument is the class name given to each SPAN.
    output = parser.parse(u'今日も元気です', 'wordwrap')
    print(output['html_code'])
    # => <span class="wordwrap">今日も</span><span class="wordwrap">元気です</span>
    print(output['chunks'][0])
    # => Chunk(word='今日も', pos='NOUN', label='NN', forward=True)
    print(output['chunks'][1])
    # => Chunk(word='元気です', pos='NOUN', label='ROOT', forward=False)
To use Budou, pass the path of the downloaded JSON credentials file to the authenticate method.
Passing the string you want to process to the parse method of the resulting parser returns HTML code in which each phrase is wrapped in a SPAN, along with the list of phrases.
Optionally, you can also specify the class name of the SPAN tags.
Budou can of course be used from a standalone Python script, and it can also be applied as a custom filter in a template engine.
Don't forget to set display: inline-block on elements with the class name you specified!
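As a rough sketch of the template-filter idea, the function below turns a parser into a plain text-to-HTML callable and caches results so that repeated strings do not trigger repeated API calls. It assumes only what the example in this post shows, namely that parse(text, class_name) returns a dictionary with an 'html_code' entry. The names make_wordwrap_filter and FakeParser are hypothetical; FakeParser is a stand-in that lets the sketch run without credentials and is not part of Budou.

```python
def make_wordwrap_filter(parser, class_name="wordwrap"):
    """Return a function text -> HTML that can serve as a template filter."""
    cache = {}

    def wordwrap_filter(text):
        # Call the parser only the first time a given string is seen.
        if text not in cache:
            cache[text] = parser.parse(text, class_name)["html_code"]
        return cache[text]

    return wordwrap_filter


class FakeParser(object):
    """Hypothetical stand-in for the real parser; wraps the whole text in one SPAN."""

    def __init__(self):
        self.calls = 0

    def parse(self, text, class_name):
        self.calls += 1
        return {"html_code": '<span class="%s">%s</span>' % (class_name, text)}


fake = FakeParser()
wordwrap = make_wordwrap_filter(fake)
first = wordwrap(u"今日も元気です")
second = wordwrap(u"今日も元気です")  # served from the cache
print(first)
print(fake.calls)
```

In real use, the object returned by budou.authenticate would take the place of FakeParser. With Jinja2, for instance, the returned function could be registered in the environment's filters dictionary; how the HTML is then marked as safe for output depends on the template engine's escaping settings.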
With Budou, you can solve the Japanese line-break problem regardless of the width of the device.
Give it a try
It is a simple script, but I hope it makes your day-to-day website development at least a little easier.
Let's wipe out the Japanese line-break problem!
Issues and pull requests on GitHub are welcome too*2.
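*1: Wakachigaki means writing text with its words separated, for example turning 「すもももももももものうち」 into 「すもも　も　もも　も　もも　の　うち」. わかち書き - Wikipedia
*2: Incidentally, Budou means "grape" in Japanese. The name likens each grape in a bunch to a precious phrase that must not be crushed.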