We have released ChainerRL, a deep reinforcement learning library built on Chainer

https://github.com/pfnet/chainerrl

This is Fujita, an engineer at PFN. We have collected the deep reinforcement learning algorithms that we had been implementing in-house with Chainer into a library called "ChainerRL" and released it. RL stands for Reinforcement Learning. The library implements recent deep reinforcement learning algorithms, such as the following, so that they can all be used through a common interface.

Deep Q-Network (Mnih et al., 2015)
Double DQN (Hasselt et al., 2016)
Normalized Advantage Function (Gu et al., 2016)
(Persistent) Advantage Learning (Bellemar