Photo via Visual Hunt
A little while ago, AlphaGo, a Go-playing artificial intelligence program, made news by defeating Lee Sedol 9-dan.*1
There was also news that DQN (Deep Q-Network) had learned to play some games better than humans do.*2
This time I looked into whether the "deep reinforcement learning" approach used in those examples could be applied to FX system trading.
Note: I have only just started studying both reinforcement learning and FX, so there may be mistakes here and there. I would appreciate any corrections.
Contents of this post
- 1. About reinforcement learning
- 2. FX with reinforcement learning
- 2-1. Reinforcement learning: FX trading strategies
- 2-2. Looking into reinforcement learning algorithms for algorithmic trading - Qiita
- 2-3. Design of an FX trading system using Adaptive Reinforcement Learning
- 2-4. Algorithm Trading using Q-Learning and Recurrent Reinforcement Learning
- 2-5. An Investigation into the Use of Reinforcement Learning Techniques within the Algorithmic Trading Domain
- 3. About deep reinforcement learning / DQN
- 3-1. Learning reinforcement learning from zero to Deep
- 3-2. Getting started with OpenAI Gym training in Python
- 3-3. Implementing DQN with Keras, TensorFlow, and OpenAI Gym
- 3-4. Deep reinforcement learning: Pong from Pixels
- 3-5. Videos introduced in "Pong from Pixels" above
- 3-6. Implementing DQN in Keras to play FlappyBird
- 4. FX with deep reinforcement learning
- 5. Bonus
Let's get started.
1. About Reinforcement Learning
1-1. 強化学習 (the Japanese edition of Sutton & Barto)
When it comes to textbooks on reinforcement learning, this book seems to be the standard:
- Authors: Richard S. Sutton, Andrew G. Barto (translated by 三上貞芳 and 皆川雅章)
- Publisher: 森北出版 (Morikita Publishing)
- Release date: 2000/12/01
- Format: paperback (softcover)
However, the original English edition of this book is available online for free.
Here it is ⬇️
1-2.Reinforcement Learning: An Introduction (2nd Edition)
This is the original English edition of the book listed above.
A draft of the 2nd edition has been made public, and this one appears to cover deep reinforcement learning as well, so if you can read English it seems better to read this version.
https://webdocs.cs.ualberta.ca/~sutton/book/bookdraft2016sep.pdf
1-3. UCL Course on RL
http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.htm
These are the lecture slides of David Silver of DeepMind, the company acquired by Google.
1-4. "I tried learning about reinforcement learning (summary)" (blog post)
強化学習について学んでみた。（まとめ）
For people who feel that textbooks and papers are a high bar (= me), this blog is ideal.
It is summarized in a very easy-to-follow way.
1-5. Whole Brain Architecture Young Researchers' Group: Reinforcement Learning
全脳アーキテクチャ若手の会 強化学習
A slide deck of 169 slides in total. It is interesting because it goes as far as discussing connections to the brain.
2. FX with Reinforcement Learning
From here on I have collected materials that look worth reading on applying reinforcement learning to FX.
2-1. Reinforcement Learning: FX Trading Strategies (強化学習：為替トレード戦略)
強化学習：為替トレード戦略 – Momentum
強化学習：為替トレード戦略（その2） – Momentum
I have not looked at the details yet, but the source code is public and looks like a useful reference, so I am noting it here.
2-2. Looking into reinforcement learning algorithms for algorithmic trading (Qiita)
アルゴリズムトレードの強化学習アルゴリズムについて調べてみた - Qiita
An article where the author tried this out, inspired by the article above.
It seems to use Direct RL rather than Q-learning (I do not fully understand it yet).
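For contrast with the Direct RL approach mentioned above, here is a generic, textbook-style sketch of the tabular Q-learning update (my own illustration, not code from the linked article). Direct RL methods skip the value table and instead optimize a parameterized trading policy on the return directly.

```python
# Generic tabular Q-learning update (illustrative sketch, not from the linked article):
# move Q[s, a] toward the TD target r + gamma * max_a' Q[s', a'].
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One Q-learning step on a table Q of shape (n_states, n_actions)."""
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```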
2-3.Design of an FX trading system using Adaptive Reinforcement Learning
http://www.optirisk-systems.com/events/carisma2007_files/dayone3.pdf
I will read this later.
2-4.Algorithm Trading using Q-Learning and Recurrent Reinforcement Learning
http://cs229.stanford.edu/proj2009/LvDuZhai.pdf
I will read this later.
2-5.An Investigation into the Use of Reinforcement Learning Techniques within the Algorithmic Trading Domain
http://www.doc.ic.ac.uk/teaching/distinguished-projects/2015/j.cumming.pdf
I will read this later.
3. About Deep Reinforcement Learning / DQN
From here on I have collected materials that cover deep reinforcement learning.
3-1. Learning reinforcement learning from zero to Deep
ゼロからDeepまで学ぶ強化学習 - Qiita
This article comes up as a reference everywhere once you start looking into deep reinforcement learning.
I have only understood it partway through so far, but I am listing it as a useful reference.
3-2. Getting started with OpenAI Gym training in Python
Pythonではじめる OpenAI Gymトレーニング
These are slides by the author of the Qiita article above.
They explain very clearly why reinforcement learning came to be combined with deep learning.
3-3. Implementing DQN with Keras, TensorFlow, and OpenAI Gym
DQNをKerasとTensorFlowとOpenAI Gymで実装する
The technical blog of Elix, which comes up frequently when researching deep learning topics.
As the title says, DQN is implemented with OpenAI Gym, and the source code is public, so it is a very useful reference.
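To give a feel for the moving parts involved (a Q-network, epsilon-greedy exploration, experience replay), here is a minimal sketch of a DQN training loop on CartPole. This is my own sketch, not the Elix code; it assumes the classic Gym step API (four return values) and tf.keras, and it omits the target network that a full DQN would use.

```python
# Minimal DQN sketch for CartPole (illustrative; assumes classic gym API and tf.keras).
import random
from collections import deque

import gym
import numpy as np
from tensorflow import keras

env = gym.make("CartPole-v0")
n_states = env.observation_space.shape[0]
n_actions = env.action_space.n

# Q-network: maps a state to one Q-value per action.
model = keras.Sequential([
    keras.layers.Dense(32, activation="relu", input_shape=(n_states,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(n_actions, activation="linear"),
])
model.compile(optimizer=keras.optimizers.Adam(1e-3), loss="mse")

memory = deque(maxlen=10000)          # experience replay buffer
gamma, epsilon, batch_size = 0.99, 0.1, 32

for episode in range(200):
    state = env.reset()               # classic gym: reset() returns the observation
    done = False
    while not done:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(model.predict(state[None, :], verbose=0)[0]))

        next_state, reward, done, _ = env.step(action)
        memory.append((state, action, reward, next_state, done))
        state = next_state

        # one gradient step on a random minibatch of replayed transitions
        if len(memory) >= batch_size:
            batch = random.sample(memory, batch_size)
            states = np.array([b[0] for b in batch])
            next_states = np.array([b[3] for b in batch])
            targets = model.predict(states, verbose=0)
            next_q = model.predict(next_states, verbose=0)
            for i, (_, a, r, _, d) in enumerate(batch):
                targets[i, a] = r if d else r + gamma * np.max(next_q[i])
            model.fit(states, targets, epochs=1, verbose=0)
```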
3-4. Deep Reinforcement Learning: Pong from Pixels
深層強化学習：ピクセルから『ポン』 – 前編 | プログラミング | POSTD
深層強化学習：ピクセルから『ポン』 – 後編 | プログラミング | POSTD
Source code is published that trains an agent on the ATARI game Pong with a kind of reinforcement learning, the policy gradient (PG) method, in only about 130 lines.
Training a Neural Network ATARI Pong agent with Policy Gradients from raw pixels · GitHub
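The core of the policy-gradient approach is compact: weight each action's log-probability gradient by the discounted return that followed it. Below is a small, generic sketch of the discounted-return computation (my own illustration, not the 130-line Pong code).

```python
# REINFORCE-style discounted returns (illustrative sketch, not Karpathy's Pong code).
import numpy as np

def discounted_returns(rewards, gamma=0.99):
    """Discounted return G_t for every timestep of one episode."""
    returns = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# The per-episode policy-gradient loss is then, conceptually,
#   loss = -sum_t log pi(a_t | s_t) * (G_t - baseline)
# so actions followed by above-baseline returns are made more likely.
```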
3-5. Videos introduced in "Pong from Pixels" above
Lecture videos on deep reinforcement learning:
John Schulman 1: Deep Reinforcement Learning - YouTube
John Schulman 2: Deep Reinforcement Learning - YouTube
John Schulman 3: Deep Reinforcement Learning - YouTube
John Schulman 4: Deep Reinforcement Learning - YouTube
3-6. Implementing DQN in Keras to play FlappyBird
Using Keras and Deep Q-Network to Play FlappyBird | Ben Lau
Most of the other source code listed here uses OpenAI Gym, but this code apparently does not.
4. FX with Deep Reinforcement Learning
Finally, deep reinforcement learning applied to FX. There still seems to be little information on applying deep reinforcement learning here (though that may just be my poor search skills).
4-1. 金融取引戦略獲得のための複利型深層強化学習 (compound-return deep reinforcement learning for acquiring financial trading strategies)
http://sigfin.org/?plugin=attach&refer=SIG-FIN-016-01&openfile=SIG-FIN-016-01.pdf
They report experiments with a compound-return Deep Q-Network, but since it has only one hidden layer, whether it can really be called "Deep" is...?
I will re-read it once I understand this area a bit better.
4-2. Trying FX with Deep Q-Learning
Deep Q-LearningでFXしてみた | GMOインターネット 次世代システム研究室
This is exactly the kind of article I was looking for, but it does not seem to pursue the idea very far.
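To make the state / action / reward framing behind these FX articles concrete, here is a purely hypothetical toy environment in a Gym-like style. The lookback window of returns as the state, the three-position action set, and the raw price-change P&L reward (ignoring spread and slippage) are all illustrative assumptions of mine, not details taken from the linked articles.

```python
# Hypothetical toy FX environment (illustrative only; features and reward are assumptions).
import numpy as np

class ToyFXEnv:
    """Each step: observe recent log-returns, choose short/flat/long, receive P&L as reward."""
    ACTIONS = (-1, 0, 1)  # short, flat, long

    def __init__(self, prices, window=8):
        self.prices = np.asarray(prices, dtype=float)
        self.window = window

    def reset(self):
        self.t = self.window
        return self._state()

    def _state(self):
        # state = log-returns over the lookback window ending at time t
        past = self.prices[self.t - self.window:self.t + 1]
        return np.diff(np.log(past))

    def step(self, action_index):
        position = self.ACTIONS[action_index]
        self.t += 1
        # reward = position * next price change (no spread, slippage, or position costs)
        reward = position * (self.prices[self.t] - self.prices[self.t - 1])
        done = self.t >= len(self.prices) - 1
        return self._state(), reward, done, {}

# Example: a random-action rollout over synthetic prices.
prices = 100 + np.cumsum(np.random.randn(500) * 0.05)
env = ToyFXEnv(prices)
state, done, total = env.reset(), False, 0.0
while not done:
    state, reward, done, _ = env.step(np.random.randint(3))
    total += reward
print("random-policy P&L:", round(total, 4))
```

A DQN like the one sketched in section 3-3 could, in principle, be trained against an environment of this shape, which is roughly the kind of setup the GMO article's title suggests.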
5. Bonus
Machine Learning for Trading
This post focused on reinforcement learning as its theme, but for system trading plus machine learning more broadly, this course looks good.
Machine Learning for Trading | Udacity
A book that was kindly introduced during a live broadcast:
Python3ではじめるシステムトレード (Getting Started with System Trading in Python 3)
Judging from the table of contents, it looks like a good fit for people who are beginners at both Python and system trading.