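Hello, I'm 鴨田 from the AI and Machine Learning Team in the Engineering Group.
This post is the day 14 entry in the M3 Advent Calendar.
Our team holds a one-hour tech-sharing session every week, where each member presents the technology behind the product they work on or a paper they have read recently. Because NeurIPS 2024 was being held this week, we turned the session into a reading group for papers from that conference. One person was assigned to each conference session and walked through the details of a paper that caught their attention there. In this post we share part of that: one "favorite paper" per session.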
- You Only Cache Once: Decoder-Decoder Architectures for Language Models
- Decompose, Analyze and Rethink: Solving Intricate Problems with Human-like Reasoning Cycle
- RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
- GIC: Gaussian-Informed Continuum for Physical Property Identification and Simulation
- Convolutional Differentiable Logic Gate Networks
- Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought
- Guiding a Diffusion Model with a Bad Version of Itself
- We are hiring !!
You Only Cache Once: Decoder-Decoder Architectures for Language Models
- Session: 6D: Deep Learning Architecture, Infrastructure
- Authors: Yutao Sun, Li Dong, Yi Zhu, Shaohan Huang, Wenhui Wang, Shuming Ma, Quanlu Zhang, Jianyong Wang, Furu Wei
- Paper link: https://arxiv.org/pdf/2405.05254
- Presenter: 高橋 (Machine Learning Engineer)
Why I recommend it
Now that LLMs are everywhere, Transformer inference speed and the amount of HBM required have a critical effect on operating cost. Attention is a particularly active research area: examples include FlashAttention, which is tailored to the characteristics of GPU architectures, and the many architectures proposed to bring the cost of attention down from O(N^2) in the sequence length N to O(N).
The paper I picked, You Only Cache Once (YOCO), proposes an architecture called decoder-decoder and achieves nearly 10x higher throughput than a Transformer of the same size.
The key idea is to produce a single key-value cache (KV cache) for the whole model, a global KV cache, instead of one per layer. As shown in Figure 2 of the paper, the architecture consists of a self-decoder in the first half and a cross-decoder in the second half. The self-decoder adopts Retentive Networks to produce the global KV cache at O(N) computational cost. The cross-decoder then uses the global KV cache to perform the same computation as the attention mechanism of a standard Transformer.
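To get a feel for why a single global KV cache matters, here is a rough back-of-the-envelope sketch. It assumes the cached key and value vectors each have the full hidden dimension and are stored in 16-bit precision, and it ignores the small constant-size state of the self-decoder. The model size and the function names are hypothetical, chosen only for illustration, and are not taken from the paper.

```python
def per_layer_kv_cache_bytes(seq_len, hidden_dim, num_layers, bytes_per_value=2):
    """Keys and values (factor 2) cached separately for every layer."""
    return 2 * num_layers * seq_len * hidden_dim * bytes_per_value


def global_kv_cache_bytes(seq_len, hidden_dim, bytes_per_value=2):
    """One shared key-value cache that every cross-decoder layer reuses."""
    return 2 * seq_len * hidden_dim * bytes_per_value


# Hypothetical model size (assumption, for illustration only).
seq_len, hidden_dim, num_layers = 32_768, 4_096, 32

per_layer = per_layer_kv_cache_bytes(seq_len, hidden_dim, num_layers)
shared = global_kv_cache_bytes(seq_len, hidden_dim)

print(f"per-layer caches: {per_layer / 2**30:.1f} GiB")  # 16.0 GiB
print(f"global cache:     {shared / 2**30:.1f} GiB")     # 0.5 GiB
```

Under these assumptions the cache shrinks by a factor equal to the number of layers, which is the intuition behind caching only once.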
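The obvious question is how well it performs as an LLM. Across several tasks, including language modeling, needle-in-a-haystack (retrieving information from a long text), and question answering, it matches or slightly outperforms a Transformer of the same size, which shows that the architecture is highly effective.
The AI team is handling a growing number of LLM tasks, and inference-time throughput and cost matter a great deal when LLMs are used in real products. I recommended this paper because I think it is important to keep up with techniques like this.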
Decompose, Analyze and Rethink: Solving Intricate Problems with Human-like Reasoning Cycle
- Session: 3B: Natural Language Processing
- Authors: Shangzi Xue, Zhenya Huang, Jiayu Liu, Xin Lin, Yuting Ning, Binbin Jin, Xin Li, Qi Liu
- Paper link: Decompose, Analyze and Rethink: Solving Intricate Problems with Human-like Reasoning Cycle (OpenReview)
- Presenter: 池嶋 (Machine Learning Engineer)
Why I recommend it
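To improve LLM reasoning on complex questions, this paper proposes DeAR (Decompose-Analyze-Rethink), a framework that solves a problem through the steps of decomposition, reasoning, evaluation, and reconstruction. Existing methods such as Chain-of-Thought and Tree-of-Thought can already decompose a problem and solve intermediate sub-problems, or evaluate reasoning results and dig deeper into the promising ones, but they share a weakness: they cannot correct a reasoning mistake made on an intermediate sub-problem. DeAR addresses this by having the LLM check its own work and by letting the LLM decide, in the reconstruction step, which sub-problem answers to use.
The results are, as you would expect, SoTA: the authors report better performance than existing Chain-of-Thought and Tree-of-Thought approaches. I found the idea of improving performance by imitating human reasoning, decomposing and then reconstructing the problem, interesting, and because it is also easy to reproduce I think it is an impactful paper.
That said, the paper uses GPT-3.5 for its evaluation and argues that with GPT-4 the gap shows up on harder problems. Incidentally, when I tried giving GPT-4 (*1) a problem that the paper presents as hard for LLMs to solve, it got the answer right even with a 0-shot prompt. It is an interesting paper that also shows how quickly LLMs themselves are evolving.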
RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
- Session: 1B: Human-AI Interaction
- Authors: Changli Wu, Qi Chen, Jiayi Ji, Haowei Wang, Yiwei Ma, You Huang, Gen Luo, Hao Fei, Xiaoshuai Sun, Rongrong Ji
- Paper link: https://arxiv.org/pdf/2412.02402
- Presenter: 鴨田 (Machine Learning Engineer)
Why I recommend it
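3D Referring Expression Segmentation is a research area that segments 3D objects in point cloud data based on referring expressions given as text, with expected applications in autonomous robots and self-driving.
The paper I picked, RG-SAN, improves how spatial relationships between instances are taken into account, which was a weakness of prior work, and achieves better performance in complex scenes that contain multiple objects of the same kind.
The key points of the paper are two components: the Text-driven Localization Module (TLM), which identifies and updates the 3D positions of the objects mentioned in the text, and Rule-guided Weak Supervision (RWS), which makes training possible using only the position of the target object. TLM lets the model understand spatial relationships between objects more accurately, and RWS enables efficient and effective training while making the most of limited supervision.
I recommended it because it tackles the problem effectively, and because I liked that it projects feature dimensions step by step, a choice that suits the complexity of the model.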
GIC: Gaussian-Informed Continuum for Physical Property Identification and Simulation
- Session: 4D: Machine Vision
- Authors: Junhao Cai, Yuji Yang, Weihao Yuan, Yisheng He, Zilong Dong, Liefeng Bo, Hui Cheng, Qifeng Chen
- Paper link: https://arxiv.org/pdf/2406.14927
- Presenter: 大垣 (Machine Learning Engineer)
Why I recommend it
人éã¯ä¸åº¦ç©ãè½ä¸ãã¦è¡çªãããããªãè¦ãã°ããã®ç©ãã©ã®ãããªç¹æ§ãæã¤ãäºæ³ã§ãã
Gaussian Splattingã«ãã3D復å ã®æ¡å¼µã¨ãã¦ãç©çç¹æ§ãå«ãã3D復å ãããã®ãã¢ããã¼ã·ã§ã³ãç¹ã«ãã¤ã³ããªã®ã¯ãå¦ç¿å¯è½ãªãã©ã¡ã¿ã«ããããã«ãç©çç¹æ§ããâå¤å½¢ã®äºæ¸¬âã¨ãã¦å®å¼åãã¦ããç¹ãããç©ä½ã®é£ç¶ããåç»ãã移åã¨å¤å½¢ã®ãã©ã¡ã¿ã§è¨è¿°ããåç¾ãããåç»ã¨å®éã®åç»ã®è·é¢ãè¿ããªãããã«å¦ç¿ãã (b)ãã¾ããå®éã®åç»èªä½ãã2Dã«å°å½±ãã¦è·é¢ãåãã®ã§ã¯ãªãããããã3D復å ãããã®ã¨ã®è·é¢ãåããã¨ã§åé¡ãä¸æ®µé解ãããããã¦ã (a)ã
ãããããã¨ã§ãæ¨è«æã«ã¯ãåç»ã®ä¸é¨ãå ¥ããã°ãã®å¾ã®å¤å½¢ãäºæ¸¬ã§ããããããããã¢ã¼ã ã§ä¿æãã¦åãå ¥ããç¶æ ã§ã®å¤å½¢ãäºæ¸¬ã§ãããã¨ãããã®ã Gaussian SplattingãNeRFãªã©ãç©ä½ãç¹ç¾¤ã§ã¯ãªãå ´ã¨ãã¦æããã¨ããç 究ãè³¢ãã¦å¥½ããªã®ã§ãç©çç¹æ§ãå¤å½¢ã¨ããå ´ã®æ¡å¼µãå°å ¥ããã¢ã¤ãã£ã¢ãç¾ããã¨æãã¾ãã ä»å¾ããreal-worldã®ãã¼ã¿ããè¨æ¸¬ã§ããããã«ãªã£ããã2stageãããªãend-to-endã«ãªã£ãããä»å¾ã®çºå±ã®å¯è½æ§ã大ããã¨æãã¾ãã
Convolutional Differentiable Logic Gate Networks
- Session: 5C: Machine Vision
- Authors: Felix Petersen, Hilde Kuehne, Christian Borgelt, Julian Welzel, Stefano Ermon
- Paper link: https://arxiv.org/pdf/2411.04732
- Presenter: 中村伊吹 (Software Engineer)
Why I recommend it
As a new approach to efficient inference for machine learning models, the paper proposes Convolutional Differentiable Logic Gate Networks (LogicTreeNet). The authors develop a method that learns logic gate networks directly and bring the concepts of convolution, pooling, and residual connections into logic gate networks.
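The sketch below illustrates one way a logic gate can be made learnable, assuming the relaxation used in earlier differentiable logic gate network work: each gate is replaced by a real-valued formula, and a neuron holds a softmax-weighted mixture over candidate gates. Only four gates are included for brevity and all names are hypothetical. This is not the paper's implementation, and it leaves out the convolution and pooling parts.

```python
import math

# Real-valued relaxations of two-input logic gates (inputs in [0, 1]).
# For inputs that are exactly 0 or 1 they agree with the Boolean gates.
GATES = {
    "AND": lambda a, b: a * b,
    "OR": lambda a, b: a + b - a * b,
    "XOR": lambda a, b: a + b - 2 * a * b,
    "NAND": lambda a, b: 1 - a * b,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_gate(a, b, logits):
    """Training-time neuron: a mixture of gates, differentiable in the logits."""
    weights = softmax(logits)
    return sum(w * gate(a, b) for w, gate in zip(weights, GATES.values()))


def hard_gate(logits):
    """Inference-time neuron: keep only the most probable gate."""
    names = list(GATES)
    return names[max(range(len(logits)), key=lambda i: logits[i])]


logits = [0.1, 2.0, -1.0, 0.3]  # hypothetical learned parameters
print(hard_gate(logits))        # OR
```

After training, each neuron collapses to a single discrete gate, which is why inference can be so cheap.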
I now work on the AI team as a software engineer, but as a student I did research on deep learning models for edge devices. This paper achieves an accurate model with a simple idea that is easy to understand. It looks applicable to deep learning models that run efficiently on the edge, and I got really excited about it.
Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought
- Session: 3B: Natural Language Processing
- Authors: Qiguang Chen, Libo Qin, Jiaqi Wang, Jinxuan Zhou, Wanxiang Che
- Paper link: https://arxiv.org/abs/2410.05695
- Presenter: 氏家 (Machine Learning Engineer)
Why I recommend it
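In recent years, Chain-of-Thought, a technique that has the model output its thought process and reason step by step, has been proposed as a way to improve LLM reasoning accuracy.
To quantitatively evaluate and optimize Chain-of-Thought (CoT), this paper proposes a framework called Reasoning Boundary (RB; roughly, how difficult a task is for the LLM). When CoT decomposes a task, the RB of the original task is expressed as a weighted harmonic mean of the RBs of the decomposed tasks, and the paper uses this to try to explain CoT theoretically.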
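For readers who want to see how a weighted harmonic mean behaves, here is a minimal sketch of the generic formula, sum(w) / sum(w / x). It is not the paper's exact combination law; the function name and the example numbers are made up for illustration.

```python
def weighted_harmonic_mean(values, weights):
    """Generic weighted harmonic mean: sum(w) / sum(w / x)."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and the same length")
    if any(v <= 0 for v in values) or any(w <= 0 for w in weights):
        raise ValueError("values and weights must be positive")
    return sum(weights) / sum(w / v for w, v in zip(weights, values))


# Hypothetical boundaries of two sub-tasks, equally weighted.
print(weighted_harmonic_mean([2.0, 2.0], [1.0, 1.0]))  # 2.0

# The combined value is pulled toward the weaker sub-task.
combined = weighted_harmonic_mean([1.0, 3.0], [1.0, 1.0])
print(combined < (1.0 + 3.0) / 2)  # True
```

A harmonic mean is dominated by its smallest term, so in this picture the hardest sub-task limits the whole.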
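Not just with this paper, I really like the engineering-style pattern where a new technique appears first and the theoretical work follows, so I recommended it. Some research around LLMs, such as CoT, can also be applied to the prompts we write for the LLMs we use every day, so I want to keep watching this area regularly.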
Guiding a Diffusion Model with a Bad Version of Itself
- Session: 4B: Diffusion-based Models
- Authors: Tero Karras, Miika Aittala, Tuomas Kynkäänniemi, Jaakko Lehtinen, Timo Aila, Samuli Laine
- Paper link: https://arxiv.org/abs/2406.02507
- Presenter: 農見 (Machine Learning Engineer)
Why I recommend it
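Classifier-Free Guidance (CFG), widely used in image generation AI, is a technique that generates high-quality images faithful to the instructions by emphasizing the difference between the image generated with the prompt and the image generated without it.
This paper clarifies the mechanism behind why CFG produces high-quality images. Specifically, CFG looks at the difference between the output of the conditional model (with the prompt), which is relatively high quality, and the output of the unconditional model (without the prompt), which is harder to train and lower quality, and uses that difference to push the result toward the higher-quality model, which improves the quality of the generated image. The paper then develops this idea further and proposes a new method called autoguidance, which uses the difference between a high-quality, fully trained model and a relatively low-quality one, such as a checkpoint from partway through training or a model with fewer parameters. Autoguidance mitigates a known problem of CFG, the tendency for images to become oversimplified because the model follows the prompt too faithfully, and makes it possible to generate more diverse images.
That said, there is a drawback: it was pointed out that today's large-scale image generation models are often trained in stages, so a weak model with a similar output distribution may not be available. Even so, I found it interesting that the paper clarifies how the widely used CFG works and builds a better method on top of it.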
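The guidance idea described above can be written as a one-line extrapolation. The following is a minimal sketch, assuming the two model outputs are plain lists of floats; the function name `guide` and the weight `w` are illustrative and not taken from the paper's code. With `weak` set to the unconditional output this corresponds to CFG, and with `weak` set to the output of a smaller or less-trained version of the same model it corresponds to the autoguidance idea.

```python
def guide(strong, weak, w):
    """Extrapolate from the weak output past the strong one.

    w = 1 returns the strong output unchanged; w > 1 pushes the result
    further away from the weak model's prediction.
    """
    if len(strong) != len(weak):
        raise ValueError("outputs must have the same length")
    return [wk + w * (s - wk) for s, wk in zip(strong, weak)]


# Toy outputs standing in for the two models' predictions (made-up numbers).
strong_out = [1.0, 2.0]
weak_out = [0.0, 1.0]

print(guide(strong_out, weak_out, 1.0))  # [1.0, 2.0]
print(guide(strong_out, weak_out, 2.0))  # [2.0, 3.0]
```

The same function covers both methods; what changes is which model plays the weak role.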
We are hiring !!
If the NeurIPS papers got you excited, come work as a machine learning engineer with us on the M3 AI and Machine Learning Team! For students, we are also recruiting machine learning and MLOps interns. Let's read papers and build services together.
M3 welcomes colleagues who keep a close eye on the latest technology, in computer vision and machine learning and beyond. We are always open to new-graduate and mid-career applications, casual interviews, and internships!
Engineer recruiting page
Casual interviews are welcome too
Internships are always open
*1: The OpenAI o1 series reportedly uses Chain-of-Thought internally. https://platform.openai.com/docs/guides/reasoning