Trying activity prediction with Neural Network Console
SONY has released a tool called Neural Network Console (NNC below).
I don't know enough to judge how good or bad it is, but since it's a GUI tool, this post is simply an attempt to give it a try. Setup instructions are in the official documentation, so please refer to that.

So, what to try? Given my line of work, I couldn't think of a familiar problem where image recognition would be useful, so I decided to predict numerical values (inhibitory activity) instead.*1

The tutorial covers image classification (MNIST), but there aren't many examples that use anything other than images as input. I used this article as a reference.
If you follow it, things work well enough to get started. Roughly speaking, all you need are two things:
- a data csv
- a dataset csv

Data csv

In this case, that means one csv per compound containing nothing but 1024 values of 0 or 1. The one shown above is for compound-0. You need a large number of these files collected in one place.

Is this something anyone can prepare easily?
With KNIME it's fairly easy. My earlier posts cover everything needed, so I'll explain briefly later on.

Dataset csv
This is a csv that lists the location of each data csv together with its label.
The label here is the "correct value". For classification it would be the "correct class", such as red, blue, or green. This time it's the activity value (NNC seems to want values in the range -1 to 1, so I divided pIC50 by 10). This list is also easy to build with KNIME.

I took a screenshot of the workflow that creates the two inputs described so far.

It's a bit detailed.
I'll go through it roughly, making use of links to my earlier posts.
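
Fingerprint steps
This is the table right after reading the sdf. It contains the compound ID and pIC50. The structures are in there too, but since it's my own data I've hidden them.*2
I connect a node called Fingerprints to this.
The settings dialog looks like this. For no particular reason, I chose Circular-ECFP4 this time. After running it, connecting a node called Expand Bit Vector splits the 1024 bits of 0 and 1 into individual columns.

Next I normalize the pIC50 values in this table.
This is the normalizer's settings dialog. It may be a bit crude, but I simply apply decimal scaling (bottom left of the screenshot).
Then I use Column Rename to change the column name pIC50 to pIC50_normal, and Column Filter to drop the structure column (no longer needed now that the fingerprint has been generated).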
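To make the normalization step concrete, here is a minimal Python sketch of decimal scaling. It assumes the usual definition (divide the column by the smallest power of ten that brings every absolute value below 1), which may differ in edge cases from what the KNIME node does. The function name `decimal_scale` and the sample pIC50 values are made up for illustration and are not my real data.

```python
def decimal_scale(values):
    """Divide by the smallest power of 10 that makes every |value| < 1."""
    max_abs = max(abs(v) for v in values)
    factor = 1
    while max_abs / factor >= 1:
        factor *= 10
    return [v / factor for v in values], factor


# Hypothetical pIC50 values (not real measurements).
pic50 = [5.2, 6.8, 7.45, 9.1]
scaled, factor = decimal_scale(pic50)
print(factor)
print(scaled)
```

For pIC50 values between 1 and 10 the factor comes out as 10, which is the "divide pIC50 by 10" mentioned earlier.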
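
Creating the data files
Using a Loop, I read the table one row at a time and write each row out to a csv. In the upper branch, the compound ID that will become a variable is turned into "desired location + compound ID + extension". In the lower branch, I drop the compound ID and pIC50_normal so that only the fingerprint (that is, only the input values used for training) remains.
The variable from the upper branch is used as the file name when writing the csv file. When writing, I leave out the column header and also specify that values should not be quoted. The character encoding is UTF-8. All of these can be set on the node's various tabs.
At this point, the pile of files described at the beginning is ready.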
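For anyone without KNIME, the same file-writing step could be sketched in plain Python roughly as follows. This is only an illustration under my own assumptions: the function name `write_fingerprint_files`, the compound IDs, and the bit patterns are hypothetical, and each file is assumed to hold a single row of 1024 values with no header and no quotes.

```python
import csv
import os
import tempfile


def write_fingerprint_files(rows, out_dir):
    """Write one header-less, unquoted CSV per compound.

    rows: iterable of (compound_id, bits), where bits is a list of 0/1 ints.
    Returns the list of file paths written.
    """
    paths = []
    for compound_id, bits in rows:
        # File name = desired location + compound ID + extension.
        path = os.path.join(out_dir, f"{compound_id}.csv")
        with open(path, "w", newline="", encoding="utf-8") as f:
            csv.writer(f, quoting=csv.QUOTE_NONE).writerow(bits)
        paths.append(path)
    return paths


# Hypothetical 1024-bit fingerprints for two compounds.
rows = [
    ("compound-0", [1 if i % 7 == 0 else 0 for i in range(1024)]),
    ("compound-1", [1 if i % 11 == 0 else 0 for i in range(1024)]),
]
out_dir = tempfile.mkdtemp()
paths = write_fingerprint_files(rows, out_dir)
print(paths)
```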

Creating the dataset
I use List Files to point at the folder holding all the data files, then connect URL to File Path. This produces a column called File name, which in this example was created from the compound ID. So I join it (joiner) with the compound ID in the original data, remove the columns I don't need, and the table is mostly done.
This gives a table with just the data location and the activity value (the correct value). Following NNC's conventions, I rename the columns:
- file location → "x:data"
- correct value → "y:label"
Splitting this table into a training set and an evaluation set completes the preparation. The partitioning node can split data in various ways; this time I split it randomly 80:20. I suspect how you split matters for deep learning, but I didn't worry about it much here.
Finally, I wrote each part out as a csv.
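The dataset step above can also be sketched in Python. The column names "x:data" and "y:label" are the ones used in this post. Everything else is an assumption for illustration: the function name `write_dataset_csvs`, the output names `train.csv` and `valid.csv`, the file paths, and the label values.

```python
import csv
import os
import random
import tempfile


def write_dataset_csvs(records, out_dir, train_fraction=0.8, seed=0):
    """Shuffle (path, label) records, split them, and write two dataset CSVs."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_fraction)
    splits = {"train.csv": shuffled[:n_train], "valid.csv": shuffled[n_train:]}
    for name, part in splits.items():
        path = os.path.join(out_dir, name)
        with open(path, "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(["x:data", "y:label"])  # column names used in this post
            writer.writerows(part)
    return splits


# Hypothetical file paths and normalized activities (pIC50 / 10).
records = [(f"./data/compound-{i}.csv", round(0.5 + 0.01 * i, 2)) for i in range(10)]
dataset_dir = tempfile.mkdtemp()
splits = write_dataset_csvs(records, dataset_dir)
print(len(splits["train.csv"]), len(splits["valid.csv"]))
```

With ten records this gives 8 training rows and 2 evaluation rows. A fixed seed keeps the random split reproducible.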

NNC screen
For now I went with 1024→100→1. The optimizer is the default, adam, and the loss is squared error. I'm only just starting to study this, so I apologize if none of this makes sense. I'd appreciate it if someone could teach me more.
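To show what a 1024→100→1 shape with a squared error loss means in code, here is a rough pure-Python sketch of the forward pass only. It is not what NNC runs internally. The tanh hidden activation, the weight initialization, and all function names are my assumptions, and adam training is left out entirely.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, layer):
    """Affine transform: one output per row of the weight matrix."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def predict(bits, hidden, output):
    """Forward pass 1024 -> 100 -> 1 (tanh on the hidden layer is an assumption)."""
    h = [math.tanh(v) for v in dense(bits, hidden)]
    return dense(h, output)[0]


def squared_error(y_pred, y_true):
    return (y_pred - y_true) ** 2


rng = random.Random(0)
hidden = init_layer(1024, 100, rng)
output = init_layer(100, 1, rng)

bits = [rng.randint(0, 1) for _ in range(1024)]  # hypothetical fingerprint
y_pred = predict(bits, hidden, output)
print(squared_error(y_pred, 0.65))  # 0.65 stands in for pIC50 / 10

# Weights plus biases across both layers.
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in (hidden, output))
print(n_params)
```

Counting weights and biases, a network of this shape has 1024×100 + 100 + 100×1 + 1 = 102,601 parameters.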
Training with this setup looks like this.
The evaluation results look like this.
y:label is the correct value (pIC50 divided by 10) and y' is the predicted value.
On this screen you can also right-click to export the contents as a csv.
I plotted the exported csv in KNIME.
I read in the file and connected it to 2D/3DScatterplot.
It's a little hard to read, but the horizontal axis is y:label (correct) and the vertical axis is y' (predicted). r^2 = 0.59. Not a bad correlation, is it? To repeat myself, though, this is training and prediction within a fairly narrow range (the chemotypes are limited), so predicting values for a new chemotype would probably be hard. Still, it's quite fun to be able to get predictions this easily.
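If you want to check an r^2 value outside KNIME, one way is the squared Pearson correlation between the two columns. I'm assuming that definition here; the plotting tool may compute r^2 differently. The function name `r_squared` and the numbers below are hypothetical stand-ins for the y:label and y' columns, not my actual results.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov ** 2 / (var_x * var_y)


# Hypothetical values standing in for the y:label and y' columns.
labels = [0.52, 0.61, 0.68, 0.74, 0.91]
preds = [0.55, 0.58, 0.70, 0.69, 0.85]
print(round(r_squared(labels, preds), 3))
```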
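
Now, about the plot in the screenshot: frankly, it looks shabby. Not pretty, not stylish...

That's the thing. This is just my personal opinion, but I don't think KNIME is very strong at visualization. You get the information, but it doesn't look good. So I load the cleaned-up file into something like spotfire and let that handle the visualization.

spotfire is commercial, so I'd love to find something open source that can draw good-looking plots. Ideally something where you can interact with the graph. (If you know of one, please tell me.)

This post was more of a miscellaneous note, so perhaps there was less about KNIME than usual? Or was it the other way around? What did you think? I'm a rank-and-file synthetic chemist, so I don't know much about data analysis or machine learning, but I'm interested. I'm still working at the bench, so I can't find much time to study...
There aren't many knowledgeable people around me either, so it would be great to have someone who could teach me various things. Corrections to mistakes on the blog, interesting related technologies, any comments are welcome, so please get in touch if you feel like it.
Of course, plain impressions are fine too.

See you next time!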
*1: It would be nice to do something like graph convolution, but NNC at this point isn't the kind of tool where you load libraries such as rdkit or deep chem, so this is a simple setup that uses ECFP4 as the input. It doesn't seem to support 1D conv. yet either, so there's no convolution.
*2: I also tried the delaney solubility data, but the results were so bad that I used my own data. The compound set is limited to a few chemotypes with clear SAR, which is probably why even fingerprints give reasonable results.