I'm @himkt from the Business Development Department. My favorite neural network is the BiLSTM-CRF. I usually work on natural language processing in the team that develops the tsukurepo (cooking-report) search feature of the Cookpad app.
In this post, I introduce nerman, a system that extracts cooking terms from recipe text.
The name nerman comes from ner (Named Entity Recognition) + man (〜太郎).
It is a system that automatically extracts cooking-related terms from recipes posted to Cookpad, built with a combination of AllenNLP and Optuna.
(Explaining all of the code would be difficult, so some parts of the actual code are simplified in this post.)
Automatic extraction of cooking terms
All sorts of cooking terms appear in recipes. Besides ingredients and cooking utensils, we also treat things like cooking actions and ingredient quantities as cooking terms. Even for a single cooking action such as "cut" (切る), there are many ways to cut depending on the purpose: rough chopping (ざく切り), slicing into rounds (輪切り), mincing (みじん切り), and so on. If we can extract such cooking terms from recipes, we can apply them to tasks such as information extraction from recipes and question answering.
For the automatic extraction of cooking terms, we use machine learning. Among natural language processing tasks there is a task called named entity recognition (NER). NER is the task of extracting named entities such as person names, place names, and organization names from natural language sentences (often from documents such as newspaper articles). This task can be formulated as a problem called sequence labeling. In sequence labeling based NER, the input sentence is split into words, and each word is assigned a named entity tag. Named entities are then obtained by extracting the spans of tagged words.
In our case, instead of person names and place names, we treat ingredient names, cooking utensil names, names of cooking actions, and so on as named entities, and train a model on them. The detailed definition of the named entity tags is explained in the next section.
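To make the formulation concrete, here is a toy Python sketch of BIOUL-style sequence labeling. The tokens and the tag names F (food) and Ac (action) are illustrative only, not the actual tag set used by nerman:

# Tokens of the sentence 「にんじんをざく切りにする」 with BIOUL tags.
# U-X marks a single-token entity of category X; O marks a non-entity token.
tokens = ["にんじん", "を", "ざく切り", "に", "する"]
tags = ["U-F", "O", "U-Ac", "O", "O"]

# Extracting the tagged tokens yields the named entities.
entities = [(token, tag[2:]) for token, tag in zip(tokens, tags) if tag != "O"]
print(entities)  # => [('にんじん', 'F'), ('ざく切り', 'Ac')]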
Dataset
Training a machine learning model requires labeled data. At Cookpad, we worked with a specialist in language data creation to put together annotation guidelines and build a corpus. Named entity recognition for recipes has also been studied at Kyoto University's Mori laboratory (the paper is here; the link opens a PDF file). Referring to the named entity tags defined in that work, and adapting them to Cookpad's use cases, we defined the following named entity tags as extraction targets.
Based on this definition, we annotated 500 recipes posted to Cookpad with named entities. The data is named the Cookpad Parsed Corpus and is managed in an internal GitHub repository. A version of the data preprocessed for the machine learning model (format changes and so on) is also uploaded to S3.
We are also working on publishing the Cookpad Parsed Corpus as a paper. The paper was accepted at LAW (Linguistic Annotation Workshop), a workshop on language resources held at COLING, an international conference on natural language processing! 🎉
The title is as follows:
Cookpad Parsed Corpus: Linguistic Annotations of Japanese Recipes
Jun Harashima and Makoto Hiramatsu
In addition to named entities, the recipes in the Cookpad Parsed Corpus are annotated with morphological and dependency information, and we are preparing to release the corpus so that researchers who belong to universities and other research institutions can use it.
Preparation: a named entity recognition model with AllenNLP
In nerman, the model is implemented with AllenNLP.
AllenNLP is a natural language processing framework developed by the Allen Institute for Artificial Intelligence (AllenAI).
It is a convenient library that makes it easy to build neural networks for natural language processing based on modern machine learning techniques.
AllenNLP can be installed with pip:

pip install allennlp
In AllenNLP, the model definition and the training settings are written in a file in Jsonnet format.
Below is the configuration file (config.jsonnet) used to train our named entity recognition model.
(The model is a BiLSTM-CRF.)

config.jsonnet
local dataset_base_url = 's3://xxx/nerman/data';

{
  dataset_reader: {
    type: 'cookpad2020',
    token_indexers: {
      word: {
        type: 'single_id',
      },
    },
    coding_scheme: 'BIOUL',
  },
  train_data_path: dataset_base_url + '/cpc.bio.train',
  validation_data_path: dataset_base_url + '/cpc.bio.dev',
  model: {
    type: 'crf_tagger',
    text_field_embedder: {
      type: 'basic',
      token_embedders: {
        word: {
          type: 'embedding',
          embedding_dim: 32,
        },
      },
    },
    encoder: {
      type: 'lstm',
      input_size: 32,
      hidden_size: 32,
      dropout: 0.5,
      bidirectional: true,
    },
    label_encoding: 'BIOUL',
    calculate_span_f1: true,
    dropout: 0.5,
    initializer: {},
  },
  data_loader: {
    batch_size: 10,
  },
  trainer: {
    num_epochs: 20,
    cuda_device: -1,
    optimizer: {
      type: 'adam',
      lr: 5e-4,
    },
  },
}
The configuration specifies the model, the data, and the training setup.
As a dataset path, AllenNLP accepts not only local file paths but also URLs; at the moment the http, https, and s3 schemes appear to be supported (the relevant code is around here).
In nerman, train_data_path and validation_data_path point to the URLs of the preprocessed Cookpad Parsed Corpus training and validation data on S3.
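Remote paths like these are downloaded and cached to a local file before being read. As a rough sketch of this mechanism (the S3 URL is the placeholder used throughout this post, and the snippet assumes valid AWS credentials):

# Resolve a remote dataset path into a locally cached file.
# cached_path is the AllenNLP utility behind http(s)/s3 dataset paths.
from allennlp.common.file_utils import cached_path

local_path = cached_path("s3://xxx/nerman/data/cpc.bio.train")
print(local_path)  # filesystem path of the cached copy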
AllenNLP provides components for loading the datasets of well-known natural language processing tasks.
However, when you want to use a dataset you built yourself, as in our case, you need to write your own class that parses the dataset (a dataset reader).
cookpad2020 is the dataset reader for the Cookpad Parsed Corpus.
How to write a dataset reader is explained in the official tutorial, so please refer to it if you want to know more.
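For intuition, here is a minimal sketch of a reader along the lines of cookpad2020. It assumes a CoNLL-like file with one "token<TAB>tag" pair per line and blank lines between sentences; the registered name is hypothetical, and the real reader differs in its details:

# A sketch of a dataset reader for BIO/BIOUL-tagged recipe sentences.
from typing import Iterable, List, Optional

from allennlp.common.file_utils import cached_path
from allennlp.data import DatasetReader, Instance, Token
from allennlp.data.fields import SequenceLabelField, TextField
from allennlp.data.token_indexers import SingleIdTokenIndexer


@DatasetReader.register("recipe_bio_sketch")  # hypothetical name
class RecipeBioReader(DatasetReader):
    def __init__(self, token_indexers=None, **kwargs):
        super().__init__(**kwargs)
        self._token_indexers = token_indexers or {"word": SingleIdTokenIndexer()}

    def text_to_instance(self, tokens: List[Token], tags: Optional[List[str]] = None) -> Instance:
        text_field = TextField(tokens, self._token_indexers)
        fields = {"tokens": text_field}
        if tags is not None:
            fields["tags"] = SequenceLabelField(tags, text_field)
        return Instance(fields)

    def _read(self, file_path: str) -> Iterable[Instance]:
        tokens, tags = [], []
        with open(cached_path(file_path)) as data_file:
            for line in data_file:
                line = line.strip()
                if not line:  # blank line = sentence boundary
                    if tokens:
                        yield self.text_to_instance(tokens, tags)
                        tokens, tags = [], []
                    continue
                surface, tag = line.split("\t")
                tokens.append(Token(surface))
                tags.append(tag)
        if tokens:  # flush the last sentence
            yield self.text_to_instance(tokens, tags)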
Once the configuration file is ready, training starts by running a command like:

allennlp train config.jsonnet --serialization-dir result

One of AllenNLP's strengths is that all the information needed for training is gathered in the configuration file, which makes experiments easy to manage.
serialization-dir is described later.
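Incidentally, the same run can also be started from Python, which is sometimes convenient in notebooks or tests. A minimal sketch, assuming any custom components (such as the cookpad2020 reader) have already been imported so that they are registered:

# Programmatic equivalent of `allennlp train config.jsonnet --serialization-dir result`.
from allennlp.commands.train import train_model_from_file

model = train_model_from_file(
    parameter_filename="config.jsonnet",
    serialization_dir="result",
)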
Although I won't cover them in this post, the allennlp command also comes with very convenient subcommands such as allennlp predict and allennlp evaluate.
See the official documentation for details.
Overview of nerman

Below is the overall picture of nerman.
The system roughly consists of three batch jobs, whose roles are as follows:

- (1) hyperparameter optimization
- (2) model training
- (3) named entity extraction (prediction) from real data (recipes)

In this post, I will cover them in a different order: model training => prediction on real data => hyperparameter optimization.
Model training

The model training batch runs a shell script like the following:

train
#!/bin/bash

allennlp train \
  config/ner.jsonnet \
  --serialization-dir result \
  --include-package nerman

# Upload the model and metrics
aws s3 cp result/model.tar.gz s3://xxx/nerman/model/$TIMESTAMP/model.tar.gz
aws s3 cp result/metrics.json s3://xxx/nerman/model/$TIMESTAMP/metrics.json
As explained in the Preparation section, the model is trained with the allennlp train command.
The directory specified with --serialization-dir receives the model archive (in tar.gz format), along with data such as the standard output and standard error logs and the metrics of the trained model.
When training finishes, the model archive and the metrics generated by the allennlp train command are uploaded to S3.
(The archive file stores the model weights and related data; with this file alone, the model can be restored immediately.)
Uploading the metrics file at the same time also lets us track the model's performance.

metrics.json
The generated metrics file (it reports not only performance figures but also training time, computation time, and so on):
{ "best_epoch": 19, "peak_worker_0_memory_MB": 431.796, "training_duration": "0:29:38.785065", "training_start_epoch": 0, "training_epochs": 19, "epoch": 19, "training_accuracy": 0.8916963871929718, "training_accuracy3": 0.8938523846944327, "training_precision-overall": 0.8442808607021518, "training_recall-overall": 0.8352005377548734, "training_f1-measure-overall": 0.8397161522865011, "training_loss": 38.08172739275527, "training_reg_loss": 0.0, "training_worker_0_memory_MB": 431.796, "validation_accuracy": 0.8663015463917526, "validation_accuracy3": 0.8688788659793815, "validation_precision-overall": 0.8324965769055226, "validation_recall-overall": 0.7985989492119089, "validation_f1-measure-overall": 0.815195530726207, "validation_loss": 49.37634348869324, "validation_reg_loss": 0.0, "best_validation_accuracy": 0.8663015463917526, "best_validation_accuracy3": 0.8688788659793815, "best_validation_precision-overall": 0.8324965769055226, "best_validation_recall-overall": 0.7985989492119089, "best_validation_f1-measure-overall": 0.815195530726207, "best_validation_loss": 49.37634348869324, "best_validation_reg_loss": 0.0, "test_accuracy": 0.875257568552861, "test_accuracy3": 0.8789031542241242, "test_precision-overall": 0.8318906605922551, "test_recall-overall": 0.8214125056230319, "test_f1-measure-overall": 0.8266183793571253, "test_loss": 48.40180677297164 }
Model training runs on an EC2 instance. In our case the dataset is relatively small (500 recipes in total) and the BiLSTM-CRF network is not that large either, so training is possible on an instance of roughly the same size as ordinary batch jobs. Because the execution environment does not require resources such as GPUs or large amounts of memory, training could ride on the rails of our ordinary batch development flow. This let us build the training batch while keeping the cost of setting up infrastructure low, drawing on the batch operation know-how accumulated in the company.
Also, all of the nerman batches are built to run on spot instances. Spot instances cost less than ordinary instances; in exchange, they may be forcibly terminated during execution (a so-called spot interruption). Model training just needs to be retried if it gets interrupted, so as long as training does not take too long, using spot instances keeps the cost down. (However, the longer training takes, the higher the chance of hitting a spot interruption. If the total running time including retries becomes long compared to the running time on an ordinary instance, it can end up costing more instead, so some care is needed.)
Prediction on real data

Prediction is run with a shell script like the following:

predict
#!/bin/bash

export MODEL_VERSION=${MODEL_VERSION:-2020-07-08}
export TIMESTAMP=${TIMESTAMP:-`date '+%Y-%m-%d'`}
export FROM_IDX=${FROM_IDX:-10000}
export LAST_IDX=${LAST_IDX:-10100}
export KUROKO2_PARALLEL_FORK_INDEX=${KUROKO2_PARALLEL_FORK_INDEX:--1}
export KUROKO2_PARALLEL_FORK_SIZE=${KUROKO2_PARALLEL_FORK_SIZE:--1}

if [ $KUROKO2_PARALLEL_FORK_SIZE = -1 ] || [ $KUROKO2_PARALLEL_FORK_INDEX = -1 ]; then
  echo $FROM_IDX $LAST_IDX ' (without parallel execution)'
else
  if (($KUROKO2_PARALLEL_FORK_INDEX >= $KUROKO2_PARALLEL_FORK_SIZE)); then
    echo '$KUROKO2_PARALLEL_FORK_INDEX'=$KUROKO2_PARALLEL_FORK_INDEX 'must be smaller than $KUROKO2_PARALLEL_FORK_SIZE' $KUROKO2_PARALLEL_FORK_SIZE
    exit
  fi

  # ==============================================================================
  # begin: split the records between FROM_IDX and LAST_IDX into
  #        KUROKO2_PARALLEL_FORK_SIZE equal chunks
  # ==============================================================================
  NUM_RECORDS=$(($LAST_IDX - $FROM_IDX))
  echo 'NUM_RECORDS = ' $NUM_RECORDS
  if (($NUM_RECORDS % $KUROKO2_PARALLEL_FORK_SIZE != 0)); then
    echo '$KUROKO2_PARALLEL_FORK_SIZE = ' $KUROKO2_PARALLEL_FORK_SIZE 'must be multiple of $NUM_RECORDS=' $NUM_RECORDS
    exit
  fi
  DIV=$(($NUM_RECORDS / $KUROKO2_PARALLEL_FORK_SIZE))
  echo 'DIV=' $DIV
  if (($DIV <= 0)); then
    echo 'Invalid DIV=' $DIV
    exit
  fi
  LAST_IDX=$(($FROM_IDX + (($KUROKO2_PARALLEL_FORK_INDEX + 1) * $DIV) ))
  FROM_IDX=$(($FROM_IDX + ($KUROKO2_PARALLEL_FORK_INDEX * $DIV) ))
  echo '$FROM_IDX = ' $FROM_IDX ' $LAST_IDX = ' $LAST_IDX
  # ============================================================================
  # end: split the records between FROM_IDX and LAST_IDX into
  #      KUROKO2_PARALLEL_FORK_SIZE equal chunks
  # ============================================================================
fi

allennlp custom-predict \
  --from-idx $FROM_IDX \
  --last-idx $LAST_IDX \
  --include-package nerman \
  --model-path s3://xxx/nerman/model/$MODEL_VERSION/model.tar.gz

aws s3 cp \
  --recursive \
  --exclude "*" \
  --include "_*.csv" \
  prediction \
  s3://xxx/nerman/output/$TIMESTAMP/prediction/
The prediction batch loads the model produced by the training batch and analyzes recipes that have not yet been annotated with named entities.
The prediction batch also supports parallel execution: more than 3.4 million recipes have been posted to Cookpad, and analyzing them all in one go is not easy, so we split the recipes into several groups and analyze them in parallel.
FROM_IDX and LAST_IDX specify the range of recipes to analyze, and an environment variable called KUROKO2_PARALLEL_FORK_SIZE sets the degree of parallelism.
Each parallelized process receives a variable called KUROKO2_PARALLEL_FORK_INDEX, which tells the process which of the parallel workers it is.
The parallelization itself is realized with the parallel execution feature (parallel_fork) of kuroko2, the job management system used internally.
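The index arithmetic in the script above boils down to a small function: worker index i out of size workers takes the i-th equal slice of the half-open range [from_idx, last_idx). A sketch in Python:

# Mirror of the FROM_IDX / LAST_IDX computation in the predict script.
def partition(from_idx: int, last_idx: int, index: int, size: int):
    num_records = last_idx - from_idx
    assert num_records % size == 0, "the range must divide evenly across workers"
    div = num_records // size
    return from_idx + index * div, from_idx + (index + 1) * div

# Worker 1 of 4 over recipes [10000, 10100) handles [10025, 10050).
print(partition(10000, 10100, 1, 4))  # => (10025, 10050)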
The custom-predict command splits the target recipes using the variables defined above and extracts named entities with the AllenNLP model.
AllenNLP lets you register your own subcommands, which is how all of the processing can be run through the allennlp command.
A subcommand can be defined by writing a Python script (custom_predict.py) as shown below.
(The official documentation on subcommands is here.)

custom_predict.py
import argparse

from allennlp.commands import Subcommand
from allennlp.models.archival import load_archive
from overrides import overrides

from nerman.data.dataset_readers import StreamSentenceDatasetReader
from nerman.predictors import KonohaSentenceTaggerPredictor


def create_predictor(model_path) -> KonohaSentenceTaggerPredictor:
    archive = load_archive(model_path)
    predictor = KonohaSentenceTaggerPredictor.from_archive(archive)
    dataset_reader = StreamSentenceDatasetReader(predictor._dataset_reader._token_indexers)
    return KonohaSentenceTaggerPredictor(predictor._model, dataset_reader)


def _predict(
    from_idx: int,
    last_idx: int,
    model_path: str,
):
    # Create the predictor
    predictor = create_predictor(model_path)
    ...  # fetch data from Redshift, feed it to the model, etc. (omitted in this post)


def predict(args: argparse.Namespace):
    from_idx = args.from_idx
    last_idx = args.last_idx
    model_path = args.model_path
    _predict(from_idx, last_idx, model_path)


@Subcommand.register("custom-predict")
class CustomPrediction(Subcommand):
    @overrides
    def add_subparser(self, parser: argparse._SubParsersAction) -> argparse.ArgumentParser:
        description = "Script to custom predict."
        subparser = parser.add_parser(self.name, description=description, help="Predict entities.")
        subparser.add_argument("--from-idx", type=int, required=True)
        subparser.add_argument("--last-idx", type=int, required=True)
        subparser.add_argument("--model-path", type=str, required=True)
        # Register the function that is actually executed when the subcommand is invoked
        subparser.set_defaults(func=predict)
        return subparser
The model_path variable holds the path to the model archive file.
The archive path is passed to a method called load_archive.
load_archive is a method provided by AllenNLP; with it, restoring a saved trained model is easy.
Also, like dataset paths, load_archive supports the S3 scheme, so the upload destination path used by the training batch can be used as-is.
(The official documentation for load_archive is here.)
To feed strings into the model, we use AllenNLP's Predictor mechanism.
The official documentation is here.
We subclass SentenceTaggerPredictor, which is convenient for handling the predictions of sequence labeling models, and define the KonohaSentenceTaggerPredictor class shown below.
Passing a string to its predict method returns the model's prediction for that string.
from allennlp.common.util import JsonDict
from allennlp.data import Instance
from allennlp.data.dataset_readers.dataset_reader import DatasetReader
from allennlp.models import Model
from allennlp.predictors import SentenceTaggerPredictor
from allennlp.predictors.predictor import Predictor
from konoha.integrations.allennlp import KonohaTokenizer
from overrides import overrides


@Predictor.register("konoha_sentence_tagger")
class KonohaSentenceTaggerPredictor(SentenceTaggerPredictor):
    def __init__(self, model: Model, dataset_reader: DatasetReader) -> None:
        super().__init__(model, dataset_reader)
        self._tokenizer = KonohaTokenizer("mecab")

    def predict(self, sentence: str) -> JsonDict:
        return self.predict_json({"sentence": sentence})

    @overrides
    def _json_to_instance(self, json_dict: JsonDict) -> Instance:
        sentence = json_dict["sentence"]
        tokens = self._tokenizer.tokenize(sentence)
        return self._dataset_reader.text_to_instance(tokens)
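As a usage sketch, the predictor can be restored from the uploaded archive and applied to a sentence. The model path matches the upload location used by the training batch, and the output keys follow AllenNLP's sentence-tagger convention:

# Restore the trained model and tag a single sentence.
from allennlp.models.archival import load_archive

archive = load_archive("s3://xxx/nerman/model/2020-07-08/model.tar.gz")
predictor = KonohaSentenceTaggerPredictor.from_archive(
    archive, predictor_name="konoha_sentence_tagger"
)
result = predictor.predict("にんじんをざく切りにする")
print(result["tags"])  # one BIOUL tag per token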
To handle Japanese recipe data, nerman uses the Japanese NLP tool konoha.
KonohaTokenizer is the AllenNLP integration feature provided by Konoha: given a Japanese string, it performs word segmentation or morphological analysis and outputs a sequence of AllenNLP tokens.
MeCab is used as the morphological analyzer, with the mecab-ipadic dictionary.
Next, we import the new module in __init__.py.
In this project, custom_predict.py is placed in the nerman/commands directory, so we edit nerman/__init__.py and nerman/commands/__init__.py as follows:
nerman/__init__.py
import nerman.commands
nerman/commands/__init__.py
from nerman.commands import custom_predict
Once the command is defined and imported, we create a file named .allennlp_plugins in the repository root so that the allennlp command actually recognizes the subcommand:
.allennlp_plugins
nerman
With these steps, the subcommand becomes available through the allennlp command.
You can run allennlp --help to confirm that the new command is now available.
The resulting predictions are saved as CSV files and uploaded to S3 after prediction finishes.
Next, the predictions uploaded to S3 are inserted into a database.
The data ends up in Amazon Redshift (hereafter Redshift), but we adopted an architecture that routes it through Amazon Aurora (hereafter Aurora).
This is in order to use an Aurora feature called the LOAD DATA FROM S3 statement.
The LOAD DATA FROM S3 statement is used in a SQL query like the following:
load.sql
load data from S3 's3://xxx/nerman/output/$TIMESTAMP/prediction.csv'
into table recipe_step_named_entities
fields terminated by ','
lines terminated by '\n'
(recipe_text_id, start, last, name, category)
set
  created_at = current_timestamp,
  updated_at = current_timestamp
;
Running this query imports the CSV file uploaded to S3 directly into Amazon Aurora.
The AWS official documentation on LOAD DATA FROM S3 is a good reference.
Because there is no need to fiddle with buffer sizes or commit timing, it is very convenient when inserting large amounts of data into a database.

The predictions inserted into the Aurora database are periodically loaded into Redshift using an internal system called pipelined-migrator. With pipelined-migrator, data can be pulled from Aurora into Redshift with just a few configuration steps in an admin console. Combining the S3 load with pipelined-migrator gave us a low-effort flow for inserting and transferring the data.

As an alternative way to let staff use the analysis results, we could have provided a prediction API instead of using a database. However, the goal of this task is to automatically extract cooking terms from recipes that have already been posted, which can be computed in bulk as a batch job. We therefore decided to do prediction in batch, without standing up an API server.

We also wanted staff members other than engineers to try out the predictions. At Cookpad, many non-engineer staff can write SQL, so storing the predictions in a database in a queryable form was a cost-effective choice. An example query over the predictions is shown below.
list_tools.sql
select
    s.recipe_id
  , e.name
  , e.category
from
  recipe_step_named_entities as e
inner join
  recipe_steps as s
on
  e.step_id = s.id
where
  e.category in ('Tg')
  and s.recipe_id = xxxx
By running this query on Redshift, we can now retrieve the list of cooking utensils that appear in a recipe.
Distributed hyperparameter optimization with Optuna

Finally, let me explain hyperparameter optimization.
nerman optimizes its hyperparameters with Optuna.
Optuna is a hyperparameter optimization library developed by Preferred Networks (PFN).
Installation is done by running pip install optuna in a terminal.
With Optuna, you prepare a backend engine (an RDB or Redis) that is reachable from every instance and use it as the storage; this enables hyperparameter optimization in a distributed setting across multiple instances. (The storage is what Optuna uses to persist optimization results; it abstracts over RDBs, Redis, and so on.) For distributed optimization across instances, MySQL or PostgreSQL is the recommended storage backend (Redis is also available as an experimental feature); see the official documentation for details. We use MySQL (Aurora) as the storage.
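The core idea, stripped down to plain Optuna, looks like the sketch below. The storage URL, study name, and objective are illustrative; with load_if_exists=True, every instance that runs the same code joins the same study instead of failing:

# Distributed optimization: N instances run this same script against one storage.
import optuna


def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 5e-3, 5e-1, log=True)
    return 1.0 - (lr - 0.05) ** 2  # stand-in for a real validation metric


study = optuna.create_study(
    study_name="nerman-example",
    storage="mysql://user:password@db-host/optuna",
    direction="maximize",
    load_if_exists=True,
)
study.optimize(objective, n_trials=20)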
Optuna provides an integration module for AllenNLP.
However, with that integration module you still have to write your own Python script to run the optimization.
So, to make the cooperation with AllenNLP smoother, I developed a tool called allennlp-optuna.
With allennlp-optuna installed, users can run Optuna-powered hyperparameter optimization through a command called allennlp tune.
This command is highly compatible with the allennlp train command, so users who are familiar with AllenNLP can try hyperparameter optimization with little friction.
To run the allennlp tune command, first install allennlp-optuna with pip install allennlp-optuna.
Next, edit .allennlp_plugins as follows:
.allennlp_plugins
allennlp_optuna
nerman
Run allennlp --help; if you can see the retrain and tune commands as below, the installation succeeded.
$ allennlp --help
2020-11-05 01:54:24,567 - INFO - allennlp.common.plugins - Plugin allennlp_optuna available
usage: allennlp [-h] [--version]  ...

Run AllenNLP

optional arguments:
  -h, --help     show this help message and exit
  --version      show program's version number and exit

Commands:

    best-params   Export best hyperparameters.
    evaluate      Evaluate the specified model + dataset.
    find-lr       Find a learning rate range.
    predict       Use a trained model to make predictions.
    print-results Print results from allennlp serialization directories to the console.
    retrain       Train a model with hyperparameter found by Optuna.
    test-install  Test AllenNLP installation.
    train         Train a model.
    tune          Optimize hyperparameter of a model.
allennlp-optuna is now installed.
Next, I explain the preparation needed to use allennlp-optuna.

Modifying the configuration file

First, rewrite the config.jsonnet created in the Preparation section as follows:

config.jsonnet (for allennlp-optuna)
// Turn the hyperparameters into variables
local lr = std.parseJson(std.extVar('lr'));
local lstm_hidden_size = std.parseInt(std.extVar('lstm_hidden_size'));
local dropout = std.parseJson(std.extVar('dropout'));
local word_embedding_dim = std.parseInt(std.extVar('word_embedding_dim'));

local cuda_device = -1;

{
  dataset_reader: {
    type: 'cookpad2020',
    token_indexers: {
      word: {
        type: 'single_id',
      },
    },
    coding_scheme: 'BIOUL',
  },
  train_data_path: 'data/cpc.bio.train',
  validation_data_path: 'data/cpc.bio.dev',
  model: {
    type: 'crf_tagger',
    text_field_embedder: {
      type: 'basic',
      token_embedders: {
        word: {
          type: 'embedding',
          embedding_dim: word_embedding_dim,
        },
      },
    },
    encoder: {
      type: 'lstm',
      input_size: word_embedding_dim,
      hidden_size: lstm_hidden_size,
      dropout: dropout,
      bidirectional: true,
    },
    label_encoding: 'BIOUL',
    calculate_span_f1: true,
    dropout: dropout,  // use the variable declared above
    initializer: {},
  },
  data_loader: {
    batch_size: 10,
  },
  trainer: {
    num_epochs: 20,
    cuda_device: cuda_device,
    optimizer: {
      type: 'adam',
      lr: lr,  // use the variable declared above
    },
  },
}
The hyperparameters to be optimized are turned into variables, as in local lr = std.parseJson(std.extVar('lr')).
The return value of std.extVar is a string; since machine learning hyperparameters are usually integers or floating-point numbers, a cast is required.
Casting to a floating-point number is done with the std.parseJson method; for casting to an integer, use std.parseInt.
Defining the search space

Next, we define the search space of the hyperparameters.
In allennlp-optuna, the search space is defined in a JSON file (hparams.json) like the following:
hparams.json
[ { "type": "float", "attributes": { "name": "dropout", "low": 0.0, "high": 0.8 } }, { "type": "int", "attributes": { "name": "lstm_hidden_size", "low": 32, "high": 256 }, }, { "type": "float", "attributes": { "name": "lr", "low": 5e-3, "high": 5e-1, "log": true } } ]
In this example, the learning rate and the dropout ratio are the optimization targets.
For each of them, we set a lower and an upper bound on the value.
Note that the learning rate is sampled from a log-scale distribution, hence "log": true.
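For intuition, each entry in hparams.json corresponds to a suggest call on an Optuna trial, roughly as in this sketch:

# What the search space above means in plain Optuna code.
import optuna


def suggest_hyperparameters(trial: optuna.Trial) -> dict:
    return {
        "dropout": trial.suggest_float("dropout", 0.0, 0.8),
        "lstm_hidden_size": trial.suggest_int("lstm_hidden_size", 32, 256),
        "lr": trial.suggest_float("lr", 5e-3, 5e-1, log=True),  # log-scale sampling
    }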
The optimization batch runs a shell script like the following:
optimize
#!/bin/bash

export N_TRIALS=${N_TRIALS:-20}  # number of Optuna trials
export TIMEOUT=${TIMEOUT:-7200}  # stop the optimization after this many seconds: 60*60*2 => 2h
export TIMESTAMP=${TIMESTAMP:-`date '+%Y-%m-%d'`}
export OPTUNA_STORAGE=${OPTUNA_STORAGE:-mysql://$DB_USERNAME:$DB_PASSWORD@$DB_HOST_NAME/$DB_NAME}
export OPTUNA_STUDY_NAME=${OPTUNA_STUDY_NAME:-nerman-$TIMESTAMP}

# Hyperparameter optimization
allennlp tune \
  config/ner.jsonnet \
  config/hparams.json \
  --serialization-dir result/hpo \
  --include-package nerman \
  --metrics best_validation_f1-measure-overall \
  --study-name $OPTUNA_STUDY_NAME \
  --storage $OPTUNA_STORAGE \
  --direction maximize \
  --n-trials $N_TRIALS \
  --skip-if-exists \
  --timeout $TIMEOUT
Running this command on multiple instances performs distributed hyperparameter optimization.
By specifying the --skip-if-exists option, the instances share the intermediate state of the optimization.
Optuna normally creates a new experiment environment (called a study) on each run and then explores the hyperparameter space; if a study with the same name already exists in the storage, this raises an error.
With --skip-if-exists enabled, however, when the storage contains a study of the same name, Optuna loads that study and resumes the exploration from where it left off.
Thanks to this mechanism, starting the exploration on multiple instances with --skip-if-exists enabled is all it takes to optimize against a shared study.
With the script above, the optimization batch runs at most 20 trials within the given time budget (the value set with --timeout = 2 hours).
In this way, Optuna's remote storage feature gives us distributed optimization simply by running the same command on multiple instances. For the detailed mechanics of Optuna's distributed hyperparameter optimization, and for more advanced usage, the presentation materials by the Optuna developers are a good reference; please have a look if you are interested.
Retraining the model

Finally, we retrain the model with the optimized hyperparameters. The retraining batch runs a shell script like the following:
retrain
#!/bin/bash

export TIMESTAMP=${TIMESTAMP:-`date '+%Y-%m-%d'`}
export OPTUNA_STORAGE=${OPTUNA_STORAGE:-mysql://$DB_USERNAME:$DB_PASSWORD@$DB_HOST_NAME/$DB_NAME}
export OPTUNA_STUDY_NAME=${OPTUNA_STUDY_NAME:-nerman-$TIMESTAMP}

# Retrain the model with the optimized hyperparameters
allennlp retrain \
  config/ner.jsonnet \
  --include-package nerman \
  --include-package allennlp_models \
  --serialization-dir result \
  --study-name $OPTUNA_STUDY_NAME \
  --storage $OPTUNA_STORAGE

# Upload the model and metrics
aws s3 cp result/model.tar.gz s3://xxx/nerman/model/$TIMESTAMP/model.tar.gz
aws s3 cp result/metrics.json s3://xxx/nerman/model/$TIMESTAMP/metrics.json
This shell script uses the retrain command provided by allennlp-optuna.
The allennlp retrain command fetches the optimization results from the storage and hands the obtained hyperparameters to AllenNLP to train the model.
Like tune, the retrain command offers almost the same interface as the train command.
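Under the hood, the first step of retrain can be sketched in a few lines of Optuna: load the shared study from the storage and read out the best hyperparameters (the study name and storage URL below are illustrative):

# Fetch the result of the shared optimization.
import optuna

study = optuna.load_study(
    study_name="nerman-2020-07-08",
    storage="mysql://user:password@db-host/optuna",
)
print(study.best_trial.value)  # best value of the target metric
print(study.best_params)       # e.g. {'dropout': ..., 'lstm_hidden_size': ..., 'lr': ...}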
The metrics of the retrained model are shown below.
metrics.json
{ "best_epoch": 2, "peak_worker_0_memory_MB": 475.304, "training_duration": "0:45:46.205781", "training_start_epoch": 0, "training_epochs": 19, "epoch": 19, "training_accuracy": 0.9903080859981059, "training_accuracy3": 0.9904289830542626, "training_precision-overall": 0.9844266427651112, "training_recall-overall": 0.9843714989917096, "training_f1-measure-overall": 0.9843990701061036, "training_loss": 3.0297666011196327, "training_reg_loss": 0.0, "training_worker_0_memory_MB": 475.304, "validation_accuracy": 0.9096327319587629, "validation_accuracy3": 0.911243556701031, "validation_precision-overall": 0.884530630233583, "validation_recall-overall": 0.8787215411558669, "validation_f1-measure-overall": 0.8816165165824231, "validation_loss": 61.33201414346695, "validation_reg_loss": 0.0, "best_validation_accuracy": 0.9028672680412371, "best_validation_accuracy3": 0.9048002577319587, "best_validation_precision-overall": 0.8804444444444445, "best_validation_recall-overall": 0.867338003502627, "best_validation_f1-measure-overall": 0.873842082046708, "best_validation_loss": 38.57948366800944, "best_validation_reg_loss": 0.0, "test_accuracy": 0.8887303851640513, "test_accuracy3": 0.8904739261372642, "test_precision-overall": 0.8570790531487271, "test_recall-overall": 0.8632478632478633, "test_f1-measure-overall": 0.8601523980277404, "test_loss": 44.22851959539919 }
Compared with the model trained in the Model training section, the F1 score on the test data (test_f1-measure-overall) went from 82.7 to 86.0, an improvement of 3.3 points.
If you define the search space loosely and let Optuna do the optimizing, you get hyperparameters that perform well enough. Very handy!
Optuna does more than optimize hyperparameters: it also provides powerful experiment management features, such as visualizing the progression of metrics during optimization and the importance of each hyperparameter, and exporting optimization results as a pandas DataFrame. If you want to learn how to use AllenNLP and Optuna in more depth, the official AllenNLP guide is also worth reading.
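A sketch of those experiment-management features, run against the same (illustrative) storage as before:

# Inspect a finished study: DataFrame export and built-in visualizations.
import optuna

study = optuna.load_study(
    study_name="nerman-2020-07-08",
    storage="mysql://user:password@db-host/optuna",
)

df = study.trials_dataframe()  # all trials as a pandas DataFrame
optuna.visualization.plot_optimization_history(study).show()  # metric over trials
optuna.visualization.plot_param_importances(study).show()     # hyperparameter importances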
Summary
In this post, I introduced nerman, a named entity recognition system built with AllenNLP and Optuna. nerman combines model training and application to real data with AllenNLP, low-effort data loading built on Amazon Aurora, and scalable hyperparameter search with Optuna. I hope it serves as a useful reference example of a machine learning system built with AllenNLP and Optuna.
Cookpad is looking for colleagues who want to make everyday cooking more fun with natural language processing. For anyone who wants to commit seriously to delivering value, starting all the way from dataset construction, I think it is a very enjoyable environment; if this caught your interest, please apply! And if you would like to have a casual chat about NLP at Cookpad, whether on the R&D side or in service development, feel free to contact @himkt.