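Answers to questions about this example have been added to the supplementary section; please refer to it for a better understanding.

1.3 Model structure

For transduction models (models that convert one piece of text into another, as in translation), the mainstream approach is the following encoder-decoder model.

- Encoder: transforms the input sentence $(x_1,\ldots,x_n)$ into $\boldsymbol{z}=(z_1,\ldots,z_n)$.
- Decoder: outputs the words $(y_1,\ldots,y_m)$ from $\boldsymbol{z}$. It emits only one word per time step, and the decoder output from the previous time step is used as the decoder input at the current time step.

The Transformer keeps this basic encoder-decoder framework; its distinguishing feature is the use of self-attention layers and position-wise fully connected layers.

In other words, once the following 3 (+2) points are understood, the model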
