I'm Nakamura (@po3rin) from the AI/Machine Learning Team in M3's Engineering Group. My favorite language is Go, and at work I mainly work on search.
Recently I read the book 医療言語処理 (Medical Language Processing) and learned about absorbing notation variation in medical terms, semantic structure search, and related topics.

医療言語処理 (自然言語処理シリーズ)
- Author: Eiji Aramaki (荒牧 英治)
- Release date: 2017/08/01
- Format: print book
So in this post I ran a small experiment on how semantic structure search using Elasticsearch and the patient expression dictionary could be put into practice, and I'll briefly walk through the idea and the implementation.
- Notation variation in patient text
- Problems with token-based search and possible countermeasures
- Implementing semantic structure search with dependency parsing and the patient expression dictionary
- Remaining issues before production use
- Summary
- Reference
Notation variation in patient text
This section describes the problem of notation variation in patient text in medical NLP and introduces the patient expression dictionary (患者表現辞書) published by MEDNLP.
What is notation variation in patient text?
When you hear "notation variation" in medical NLP, the first thing that comes to mind is probably variation among medical terms themselves. Japanese mixes kanji, katakana, and hiragana (for example がん / 癌 / ガン for cancer, or 片頭痛 / 偏頭痛 for migraine), so the amount of variation is a serious problem.
On the other hand, a further wall for language processing over documents written by patients is notation variation in patient text. Patients use everyday words rather than medical terms, so the gap in patient text is even larger than the variation among medical terms. For example, for the symptom 胸部不快感 (chest discomfort), defined by ICD-10 (International Classification of Diseases) code R098, one patient may report 「胸が苦しい」 ("my chest feels tight") while another says 「胸がムカムカする」 ("my chest feels queasy").
Of course, notation variation is also a major problem for the task of searching patient text. Our company runs AskDoctors, a doctor Q&A service with a feature for searching past Q&As, which is exactly a service that runs searches over patient text.
For example, with basic token-based matching, a search for 「胸が苦しい」 will not hit 「胸がムカムカする」. When a question to a doctor describes the same condition with a different expression, documents matching the search intent are not returned and recall suffers.
For word-level notation variation, synonyms are the usual countermeasure, but symptom descriptions do not vary at the word level, which makes this hard. Registering 「苦しい」 and 「ムカムカ」 as near-synonyms, for instance, feels like going too far. So notation variation in patient text is not a problem that search can solve right away.
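To make the contrast concrete, here is a minimal sketch (my own illustration, not part of the original setup) of a word-level synonym filter created through the Python Elasticsearch client. It handles term variants such as 片頭痛 / 偏頭痛, but there is no reasonable word-level entry that maps 「胸が苦しい」 to 「胸がムカムカする」. The index name is hypothetical and it assumes the kuromoji analysis plugin is installed.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Hypothetical index with a word-level synonym filter: fine for term variants,
# but it cannot express phrase-level patient expressions.
es.indices.create(
    index="synonym-demo",
    body={
        "settings": {
            "analysis": {
                "filter": {
                    "medical_synonyms": {
                        "type": "synonym",
                        "synonyms": ["片頭痛, 偏頭痛"],
                    }
                },
                "analyzer": {
                    "ja_with_synonyms": {
                        "type": "custom",
                        "tokenizer": "kuromoji_tokenizer",
                        "filter": ["medical_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {"title": {"type": "text", "analyzer": "ja_with_synonyms"}}
        },
    },
)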
MEDNLP's patient expression dictionary
A resource that helps with notation variation in patient text is the patient expression dictionary published by MEDNLP. sociocom.jp
The patient expression dictionary is a dictionary of expressions used by patients; each patient expression is listed together with the corresponding standard disease name, body part, ICD-10 code, and so on.
The dictionary cannot handle every user query, but it looks like it can absorb variation such as 「胸が苦しい」 vs 「胸がムカムカ」.
Problems with token-based search and possible countermeasures
This section describes the problems with token-based search and introduces semantic structure search as a possible countermeasure.
It matches even though the subject is different!
With plain token matching, the query 「頭が痛い」 ("my head hurts") is morphologically analyzed into the tokens 「頭」 and 「痛い」, so a document like 「頭は正常だが胃が痛い」 ("my head is fine but my stomach hurts") also matches. This does not reflect the search intent, so it is search noise.
Phrase matching does not fully solve this either: with slop=0, for example, the query 「頭が痛い」 no longer matches 「頭がズキズキと痛い」 ("my head is throbbing").
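As a concrete illustration (my own sketch, assuming a kuromoji-style Japanese analyzer on the title field and a hypothetical index name), a match_phrase query with slop=0 rejects the throbbing variant because of the extra tokens between 頭 and 痛い, while loosening slop re-introduces the subject-mismatch noise described above.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Phrase query: with slop=0 the extra tokens in 「頭がズキズキと痛い」 break the
# phrase, so that document is not returned; raising slop makes it match again
# but also lets unrelated word orders through.
query = {
    "query": {
        "match_phrase": {
            "title": {
                "query": "頭が痛い",
                "slop": 0,
            }
        }
    }
}
res = es.search(index="titles-demo", body=query)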
To solve this problem, semantic structure search based on dependency parsing between words comes in handy.
Semantic structure search
Semantic structure search means running dependency parsing on text, capturing the structure of the sentence, and searching on that structure. Dependency parsing analyzes the relations between the words in a sentence, as shown below.
With dependency parsing we can tell that in 「頭が痛い」 and 「頭は正常だが胃が痛い」 the subject attached to 「痛い」 does not match, so the mismatch can be detected and search noise reduced.
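To see those relations in token form, here is a small GiNZA snippet of my own; it prints each token's dependency label and head, and the point to look for is that 痛い takes 胃, not 頭, as its nominal subject (nsubj) in the second sentence.
import spacy

nlp = spacy.load("ja_ginza")

# Print token, dependency label, and head for both example sentences.
for text in ["頭が痛い", "頭は正常だが胃が痛い"]:
    print(text)
    for token in nlp(text):
        print(f"  {token.text}\t{token.dep_}\t{token.head.text}")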
Implementing semantic structure search with dependency parsing and the patient expression dictionary
So far we have covered notation variation in patient text and the outline of semantic structure analysis. From here on, let's actually implement semantic structure search with dependency parsing on top of Elasticsearch.
The modules and Python version used for this implementation are as follows.
[tool.poetry.dependencies]
python = "^3.9"
spacy = "^2.3.5"
ginza = "^4.0.5"
fastapi = "^0.63.0"
uvicorn = "^0.13.3"
joblib = "^1.0.1"
elasticsearch = "^7.11.0"
pandas = "^1.2.2"
numpy = "^1.20.1"
The overall architecture is: run dependency parsing with GiNZA over the patient expression dictionary, index dependency structures into Elasticsearch, and expose document registration and search through a FastAPI application.
Dependency parsing with the patient expression dictionary
For dependency parsing we use GiNZA. GiNZA is a Japanese NLP library built on the spaCy framework, and it uses the Sudachi morphological analyzer internally.
With GiNZA, Japanese dependency parsing and named entity extraction are easy, as in the code below.
import spacy
from spacy import displacy

# Intended to be run and visualized in a Jupyter Notebook.
class DependencyAnalysis:
    def __init__(self):
        self.nlp = spacy.load('ja_ginza')

    def run(self, text):
        doc = self.nlp(text)
        # Dependency tree and named entity visualizations.
        displacy.render(doc, style="dep", jupyter=True, options={'distance': 90})
        displacy.render(doc, style="ent", jupyter=True)

dependency = DependencyAnalysis()
dependency.run("頭は正常だが胃が痛い")
Running this in a Jupyter Notebook shows both the dependency parse and the extracted named entities.
In the named entity output, body parts are extracted with the ANIMAL_PART tag. Using this, we should be able to pick out only the constructions attached to a body part.
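If you are not using the visualizer, a quick way (my own snippet) to check the exact label string your GiNZA version assigns to body parts is to print doc.ents directly; the class in the next step assumes the label is "Animal_Part".
import spacy

nlp = spacy.load("ja_ginza")
doc = nlp("頭は正常だが胃が痛い")

# Print each extracted entity with its label to confirm the body-part tag.
for ent in doc.ents:
    print(ent.text, ent.label_)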
Now let's build a DependencyAnalysis class that returns only the dependency results related to symptoms of body parts.
import spacy

class DependencyAnalysis:
    def __init__(self) -> None:
        self.nlp = spacy.load("ja_ginza")

    def run(self, text: str) -> list[str]:
        """
        Extract subject/indirect-object constructions that involve a body part.
        """
        doc = self.nlp(text)
        # Body-part entities detected by GiNZA.
        ent_words = [ent.text for ent in doc.ents if ent.label_ == "Animal_Part"]
        deps = []
        for sent in doc.sents:
            for token in sent:
                if token.dep_ in ["nsubj", "iobj"] and token.lemma_ in ent_words:
                    # Format each relation as a "subject->predicate" string.
                    deps.append(f"{token.lemma_}->{token.head.lemma_}")
        return deps
For simplicity, this time we extract only the nominal subject (nsubj) and indirect object (iobj) relations, and handle each relation as a string formatted like 「主語->動詞」 (subject->verb). This avoids turning the problem into graph matching, which would be slow.
Next, using DependencyAnalysis, we build a PatientExpressionsCoupus class that handles the patient expression dictionary. PatientExpressionsCoupus runs the patient expression dictionary through dependency parsing and converts it into a pandas.DataFrame, which is then used as a lookup to get all patient expressions tied to a disease name.
Here is the code for PatientExpressionsCoupus.
import codecs

import numpy as np
import pandas as pd

class PatientExpressionsCoupus:
    def __init__(self) -> None:
        self.df = {}
        self.da = DependencyAnalysis()

    def get_symptom_expressions(self, desease: str) -> list[str]:
        """Get the patient expressions tied to a standard disease name."""
        expressions = self.df[self.df["標準病名"] == desease]["出現形"].tolist()
        return expressions

    def get_symptom_deps(self, desease: str) -> list[str]:
        """Get the dependency constructions tied to a standard disease name."""
        deps = self.df[self.df["標準病名"] == desease]["deps"].tolist()
        return deps

    def get_deseases(self, expression: str) -> list[str]:
        """Get the standard disease names tied to a patient expression."""
        deps = self.da.run(expression)
        deseases = self.df[self.df["deps"] == ",".join(deps)]["標準病名"].tolist()
        return deseases

    def analyze(self, text: str) -> str:
        """Dependency-parse a patient expression and turn it into a string search key."""
        deps = self.da.run(text)
        if len(deps) == 0:
            return np.NaN
        return ",".join(deps)

    def load_from_csv(self, file) -> None:
        """Convert the patient expression dictionary into a pandas.DataFrame."""
        self.da = DependencyAnalysis()
        with codecs.open(file) as f:
            df = pd.read_table(f, delimiter=",")
            df = df.loc[:, ["出現形", "標準病名"]]
            df["deps"] = df.apply(lambda x: self.analyze(x["出現形"]), axis=1)
            df = df.dropna(how="any")
            self.df = df
Let's check that PatientExpressionsCoupus works.
pec = PatientExpressionsCoupus()
pec.load_from_csv("corpus/D3_20190326.csv")

d = DependencyAnalysis()
t = "胸が苦しい"
dep = d.run(t)
print(dep)

deseases = pec.get_deseases(t)
print(deseases)

expression_list = []
for desease in deseases:
    expression_list.extend(pec.get_symptom_expressions(desease))
print(expression_list)
corpus/D3_20190326.csv is the patient expression dictionary converted to CSV.
Running this shows that we can get the dependency result, the disease names tied to a patient expression, and the patient expressions tied to those disease names. From 「胸が苦しい」 we can see patient expressions describing conditions such as 胸痛 (chest pain) and 胸部不快感 (chest discomfort).
[{'dep': 'nsubj', 's': '胸', 'h': '苦しい'}]
['胸痛', '胸内苦悶', '胸部不快感', '胸部不快感']
['右の胸が痛い', '胸がキュッと', '胸がキュンキュン', '胸がヒリヒリ', '胸がズキズキ', '胸がズキズキする', '胸がバクバク', '胸が急に痛くなる', '胸が苦しい', '胸が痛い', '胸が痛い発作', '胸が痛くなる', '胸が締め付けられる', '左の胸が痛い', '心臓が痛い', '突然胸が痛くなる', '胸が圧迫感', '胸が詰まる', '胸がむかむか', '胸がムカムカ', '胸が気持ち悪い', '胸が苦しい感じ', '胸が重い', '胸が不快', '胸部が悪い', '心臓が悪い', ...]
Running the whole patient expression dictionary through dependency parsing every time takes a while, so it is a good idea to save the result with pickle or joblib and load it back later.
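For example, here is a minimal sketch (my own assumption; the original class does not show these methods) of save/load helpers backed by joblib, so that the API below can call pec.load("symptoms_expression_dict.joblib") instead of re-analyzing the CSV at startup.
import joblib

# Hypothetical persistence helpers for PatientExpressionsCoupus; they simply dump
# and reload the analyzed DataFrame (including the "deps" column).
def save(self, file: str) -> None:
    joblib.dump(self.df, file)

def load(self, file: str) -> None:
    self.da = DependencyAnalysis()
    self.df = joblib.load(file)

PatientExpressionsCoupus.save = save
PatientExpressionsCoupus.load = load

# Build once from the CSV, then persist for fast loading.
pec = PatientExpressionsCoupus()
pec.load_from_csv("corpus/D3_20190326.csv")
pec.save("symptoms_expression_dict.joblib")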
An API that expands patient expressions from the dictionary into queries
Once we have come this far, all that is left is to build an API that talks to Elasticsearch. This time we build it with FastAPI, a lightweight Python web framework.
The API has only document registration (/topics) and search (/topics/search). The data indexed into Elasticsearch is assumed to be just the question titles from AskDoctors.
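The post does not show the index mapping, so here is one assumption-laden sketch: deps is mapped as a keyword field so that each subject->verb string is matched as an exact value, and title is an ordinary text field.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Assumed mapping for the "topics" index (not shown in the original post).
es.indices.create(
    index="topics",
    body={
        "mappings": {
            "properties": {
                "id": {"type": "integer"},
                "title": {"type": "text"},
                "deps": {"type": "keyword"},
            }
        }
    },
)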
from elasticsearch import Elasticsearch
from fastapi import Body, FastAPI

app = FastAPI()
es = Elasticsearch("http://localhost:9200")

pec = PatientExpressionsCoupus()
pec.load("symptoms_expression_dict.joblib")
d = DependencyAnalysis()

@app.get("/topics/search")
def search_topics(q: str = None):
    # Expand the query into the dependency constructions of every patient
    # expression that shares a standard disease name with the query.
    deseases = pec.get_deseases(q)
    deps = []
    for desease in deseases:
        deps.extend(pec.get_symptom_deps(desease))
    expression_queries = [{"match": {"deps": e}} for e in deps]
    if len(expression_queries) == 0:
        return {}
    query = {
        "query": {"bool": {"should": expression_queries, "minimum_should_match": 1}}
    }
    return es.search(index="topics", body=query, size=3)

@app.post("/topics")
def create_topic(body: dict = Body(None)):
    # Store the title together with its dependency constructions.
    deps = d.run(body["title"])
    topic = {
        "id": body["id"],
        "title": body["title"],
        "deps": deps,
    }
    return es.create(id=body["id"], index="topics", body=topic)
This completes a semantic structure search that absorbs notation variation in patient text.
Checking the behavior
First, register a document.
POST localhost:8000/topics
{
    "id": 1111,
    "title": "お腹が痛いが首は正常"
}
Looking at the stored data, we can confirm that the document is saved together with its dependency constructions.
{
  // ...
  "hits": {
    "total": { "value": 1, "relation": "eq" },
    "max_score": 0.8630463,
    "hits": [
      {
        "_index": "topics",
        "_type": "_doc",
        "_id": "1113",
        "_score": 0.8630463,
        "_source": {
          "id": 1113,
          "title": "お腹が痛いが首は正常",
          "deps": [
            "お腹->痛い",
            "首->正常"
          ]
        }
      }
    ]
  }
}
Next, let's look at what the search returns. The search picks up the notation variation between 「お腹が痛い」 ("my stomach hurts") and 「お腹がチクチクする」 ("my stomach feels prickly").
curl localhost:8000/topics/search?q=お腹がチクチクする
{
  // ...
  "hits": {
    "total": { "value": 1, "relation": "eq" },
    "max_score": 0.8630463,
    "hits": [
      {
        "_index": "topics",
        "_type": "_doc",
        "_id": "1113",
        "_score": 0.8630463,
        "_source": {
          "id": 1113,
          "title": "お腹が痛いが首は正常",
          "deps": [
            "お腹->痛い",
            "首->正常"
          ]
        }
      }
    ]
  }
}
䏿¹ã§ããé¦ãããçãããtitleã«ããã®ã§ããé¦ çããã¨ãããã¼ã¯ã³ã§ANDæ¤ç´¢ãè¡ã£ãå ´åããè ¹ãçããé¦ã¯æ£å¸¸ããããããã¦ãã¾ãã¾ãããä¿ãåãè§£æãè¡ãªã£ã¦ããã®ã§ãçããã®ä¸»èªãæ£ãã夿ããããããåé¿ãã¦ãã¾ãã
curl localhost:8000/topics/search?q=首が痛い
{
  // ...
  "hits": {
    "total": { "value": 0, "relation": "eq" },
    "max_score": null,
    "hits": []
  }
}
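For contrast, here is a sketch of my own (not an endpoint of this API) of the naive token-based AND search mentioned above; with a typical Japanese analyzer on title, both tokens occur in 「お腹が痛いが首は正常」, so the document would likely be returned even though 首 is not the subject of 痛い.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Token-based AND match on the title field, ignoring dependency structure.
naive_query = {
    "query": {
        "match": {
            "title": {
                "query": "首 痛い",
                "operator": "and",
            }
        }
    }
}
res = es.search(index="topics", body=naive_query, size=3)
print(res["hits"]["total"])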
With this, we have a semantic structure search that absorbs notation variation in patient text.
Remaining issues before production use
Real search queries do not always contain particles the way this example does. In most cases we have to handle queries where particles are omitted, such as space-separated symptom keywords like 「発熱 咳」, and the default GiNZA cannot do accurate dependency parsing without particles.
Also, the patient expression dictionary alone cannot cover every user query, so it would need to be combined with several other pieces of search logic.
Finally, this experiment only targeted simple sentences containing a body part. To widen the supported range, besides tuning, it looks worth considering introducing a medical term dictionary and, eventually, extending the patient expression dictionary ourselves.
Summary
This time I tried semantic structure search on Elasticsearch using GiNZA and the patient expression dictionary. I hope to keep experimenting with techniques that can improve search.
We're hiring !!!
M3 is looking for engineers who want to push medicine forward by developing and improving our search platform! Recently an "Elasticsearch & Lucene code reading club" was started, led by the search team, and discussions about how search works are lively.
If you feel like having a casual chat, reach out here!
Reference
- 患者表現辞書 (patient expression dictionary): sociocom.jp
- GiNZA+Elasticsearchで係り受け検索の第一歩: acro-engineer.hatenablog.com
- 医療言語処理 (自然言語処理シリーズ), 荒牧 英治, 2017
