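An explanation of reinforcement learning: overview, practical advantages, application examples, basic theory, representative methods, and the techniques needed to apply it.

The text on this page has been adapted for the web from the following survey article.

Hajime Kimura, Kazuteru Miyazaki, Shigenobu Kobayashi: Design Guidelines for Reinforcement Learning Systems, Keisoku to Seigyo (Journal of the Society of Instrument and Control Engineers), Vol.38, No.10, pp.618--623 (1999), Society of Instrument and Control Engineers.
6 pages, postscript file, sice99.ps (1.31MB)
PDF file, sice99.pdf (148KB)

Chapter 1: Overview of Reinforcement Learning
1.1 What is Reinforcement Learning?
1.2 Characteristics of reinforcement learning from a control perspective
1.3 What can be expected in applications
Chapter 2: An application example of reinforcement learning: acquiring walking behavior in a robot
Chapter 3: Basic theory of reinforcement learning
3.1 Markov decision process (Markov decision proc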