Last updated: Dec 18, 2025. It has been seven years since the original GPT architecture was developed. At first glance, looking back at GPT-2 (2019) and forward to DeepSeek V3 and Llama 4 (2024-2025), one might be surprised at how structurally similar these models still are. Sure, positional embeddings have evolved from absolute to rotational (RoPE), Multi-Head Attention has largely given way to Grouped-Query Attention (GQA)…
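As a minimal sketch of the grouped-query attention the excerpt alludes to: several query heads share each key/value head, which shrinks the KV cache while leaving the attention math unchanged. Plain PyTorch with made-up shapes, an illustration of the general idea rather than any particular model's implementation.

```python
# Sketch of grouped-query attention (GQA); shapes and head counts are illustrative.
import torch
import torch.nn.functional as F

batch, seq_len, d_model = 2, 16, 256
n_q_heads, n_kv_heads = 8, 2            # 4 query heads share each key/value head
head_dim = d_model // n_q_heads

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Each group of query heads attends against the same shared K/V head,
# so K and V are simply repeated along the head dimension before attention.
group_size = n_q_heads // n_kv_heads
k = k.repeat_interleave(group_size, dim=1)   # -> (batch, n_q_heads, seq, head_dim)
v = v.repeat_interleave(group_size, dim=1)

out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # (batch, n_q_heads, seq_len, head_dim)
```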
On Context Engineering. In the LLM (large language model) world, the term "context engineering" has recently come into wide use. It also comes up a lot in discussions of AI agents, and it had been nagging at me for a while, so I tried to sort out my own thinking and am writing it up here. More than half of this is impression and personal opinion rather than rigorous academic definition, so please take it as one practical way of thinking about the term, based on how I actually use it day to day. From "prompt engineering" to "context engineering": I want to start from the questions "what is context engineering in the first place?" and "how is it different from prompt engineering?" Prompt engineering, put simply, can be pictured with the diagram below…
Precise Source Grounding: Maps every extraction to its exact location in the source text, enabling visual highlighting for easy traceability and verification.
Reliable Structured Outputs: Enforces a consistent output schema based on your few-shot examples, leveraging controlled generation in supported models like Gemini to guarantee robust, structured results.
Optimized for Long Documents: Overcomes…
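To make "source grounding" and "structured outputs" concrete, here is a rough sketch in plain Python. The Extraction class and ground helper are hypothetical stand-ins, not the library's actual API, and the model output is hard-coded to stand in for a schema-constrained generation.

```python
# Sketch of schema-shaped extraction with source grounding; illustrative only.
from dataclasses import dataclass

@dataclass
class Extraction:
    field: str   # which schema field this value fills
    value: str   # the extracted string
    start: int   # character offset where the value begins in the source
    end: int     # character offset where it ends

def ground(source: str, field: str, value: str) -> Extraction:
    """Map an extracted value back to its exact location in the source text."""
    start = source.find(value)
    if start == -1:
        raise ValueError(f"extraction {value!r} not found verbatim in source")
    return Extraction(field, value, start, start + len(value))

source = "Ada Lovelace published the first algorithm in 1843."
# Pretend these values came back from a model constrained to a JSON schema.
model_output = {"person": "Ada Lovelace", "year": "1843"}

grounded = [ground(source, k, v) for k, v in model_output.items()]
for e in grounded:
    assert source[e.start:e.end] == e.value   # grounding is directly verifiable
    print(e)
```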
2025-05-02 Stanford CS336 Language Modeling from Scratch: Demystifying the GPU, a complete guide to the optimization techniques leading up to Flash Attention. Note: this article is based on the "GPUs" lecture video from Stanford CS336 Language Modeling from Scratch, Spring 2025. Details about the lecture are available at https://stanford-cs336.github.io/spri... ; see https://stanford.io/ai for Stanford's online AI programs and https://online.stanford.edu/courses/c... for enrolling in the course. This article summarizes the lecture content in detail, but errors may have crept in through summarization and interpretation…
Language models serve as the cornerstone of modern natural language processing (NLP) applications and open up a new paradigm of having a single general-purpose…
TL;DR: Agents need context to perform tasks. Context engineering is the art and science of filling the context window with just the right information at each step of an agent's trajectory. In this post, we break down some common strategies (write, select, compress, and isolate) for context engineering by reviewing various popular agents and papers. We then explain how LangGraph is designed to support…
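The four strategies are easy to picture as plain functions over an agent's message history. The sketch below is purely illustrative, generic Python with a hard-coded summary stand-in for an LLM call, not LangGraph's API.

```python
# Illustrative sketch of the write / select / compress / isolate strategies.
scratchpad: list[str] = []

def write(note: str) -> None:
    """Write: persist information outside the context window (scratchpad/memory)."""
    scratchpad.append(note)

def select(query: str, k: int = 3) -> list[str]:
    """Select: pull only the most relevant notes back into context (toy scoring)."""
    return sorted(scratchpad, key=lambda n: -sum(w in n for w in query.split()))[:k]

def compress(messages: list[str], keep_last: int = 4) -> list[str]:
    """Compress: summarize older turns, keep recent ones verbatim."""
    if len(messages) <= keep_last:
        return messages
    summary = "SUMMARY: " + " / ".join(m[:30] for m in messages[:-keep_last])
    return [summary] + messages[-keep_last:]

def isolate(task: str) -> list[str]:
    """Isolate: give a sub-agent its own fresh context with only what it needs."""
    return [f"You are a sub-agent. Task: {task}"] + select(task)
```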
Context Engineering for AI Agents: Lessons from Building Manus. At the very beginning of the Manus project, my team and I faced a key decision: should we train an end-to-end agentic model using open-source foundations, or build an agent on top of the in-context learning abilities of frontier models? Back in my first decade in NLP, we didn't have the luxury of that choice. In the distant days of BERT…
AI systems that "think" in human language offer a unique opportunity for AI safety: we can monitor their chains of thought (CoT) for the intent to misbehave. Like all other known AI oversight methods, CoT monitoring is imperfect and allows some misbehavior to go unnoticed. Nevertheless, it shows promise and we recommend further research into CoT monitorability and investment in CoT monitoring alongside…
It is interesting to note how views on this topic have shifted with the rise of outcome-based RL applied to LLMs. A couple of years ago, the consensus in the safety community was that process-based RL should be prioritized over outcome-based RL, since it incentivizes choosing actions for reasons that humans endorse. See, for example, Anthropic's Core Views on AI Safety: Learning Processes Rather than…
Introduction: In this article, starting from the AI agent services that everyone is talking about these days, I want to deep-dive into how to actually build one, covering the points to get right when building an AI agent system together with a practical hands-on. The article is in two parts: Part 1 covers the basic concepts of AI agents and a guide to building agent systems; Part 2 implements workflow routing with Azure AI Agent Service. Part 1 explains the basic concepts and the system-building guide with reference to OpenAI's a-practical-guide-to-building-agents. Part 2 walks through the workflow-routing pattern introduced in the Workflow Routing section of Anthropic's blog post Building Effective Agents…
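The workflow-routing pattern referenced there reduces to classifying the incoming request and dispatching it to a specialized handler. A minimal sketch follows, with classify_route as a hypothetical stand-in for the LLM (or Azure AI Agent Service) call rather than a real API.

```python
# Minimal sketch of the workflow-routing pattern: classify, then dispatch.
from typing import Callable

def classify_route(user_message: str) -> str:
    # In a real system this is an LLM call that returns one routing label.
    if "refund" in user_message.lower():
        return "billing"
    if "error" in user_message.lower():
        return "technical"
    return "general"

handlers: dict[str, Callable[[str], str]] = {
    "billing":   lambda m: f"[billing agent] handling: {m}",
    "technical": lambda m: f"[technical agent] handling: {m}",
    "general":   lambda m: f"[general agent] handling: {m}",
}

def route(user_message: str) -> str:
    label = classify_route(user_message)
    return handlers.get(label, handlers["general"])(user_message)

print(route("I got an error when uploading my file"))
```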
Introduction: In this article I try out LLM inference speed-up with Speculative Decoding in vLLM and share the results of a simple benchmark. About Speculative Decoding: first, a brief explanation. Speculative Decoding is a technique that accelerates inference for a large model by enlisting a smaller, lighter model. The large model whose output you actually want is called the Target Model, and the small model used for acceleration is called the Draft Model. Unlike ordinary decoding, in Speculative Decoding the small Draft Model first generates a fixed number of draft tokens and proposes them as a candidate token sequence; the Target Model then takes those draft tokens and, based on its probability distribution…
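A benchmark along these lines needs only a few lines of vLLM. The sketch below is indicative only: the exact way the Draft Model is configured (the speculative_config argument here) has changed across vLLM releases and should be checked against your installed version, and both model names are placeholders for whatever Target/Draft pair you are testing.

```python
# Rough sketch of a speculative-decoding throughput check with vLLM.
import time
from vllm import LLM, SamplingParams

prompts = ["Explain speculative decoding in one paragraph."] * 8
params = SamplingParams(temperature=0.0, max_tokens=256)

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",          # Target Model (placeholder)
    speculative_config={                                # name varies by vLLM version
        "model": "meta-llama/Llama-3.2-1B-Instruct",   # Draft Model (placeholder)
        "num_speculative_tokens": 5,                   # draft tokens proposed per step
    },
)

start = time.perf_counter()
outputs = llm.generate(prompts, params)
elapsed = time.perf_counter() - start

generated = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"{generated} tokens in {elapsed:.1f}s -> {generated / elapsed:.1f} tok/s")
```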
Introduction: A question for everyone. If there were "a way to double inference speed alone, without hurting model accuracy and without adding compute," would that be magic, or real technology? The answer is the latter. Speculative Decoding, developed jointly by Google DeepMind and UC Berkeley, is the inference-acceleration technique that made exactly that seeming impossibility possible. To use a car analogy, it is a clever approach that pre-computes several candidate routes while selecting the optimal path during the actual drive, and it is revolutionizing LLM generation speed. So what is "Speculative Decoding"? In Japanese it is usually rendered as 推測的デコーディング ("inferential decoding"), and the more literal 投機的デコーディング ("speculative decoding") is also sometimes used. Put simply, the technique has a small model (the draft model) generate multiple…
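The verification step this excerpt is leading up to can be written out concretely: the Target Model scores the proposed draft tokens in a single pass and accepts each one with probability min(1, p_target/p_draft), resampling from the residual distribution on the first rejection, which keeps the output distribution identical to decoding with the Target Model alone. A toy NumPy sketch of that accept/reject rule, with made-up distributions and no framework, is below.

```python
# Toy sketch of the speculative-decoding accept/reject rule over a tiny vocabulary.
# p_target / p_draft are made-up distributions standing in for the two models;
# real systems compute them with a single batched forward pass of the Target Model.
import numpy as np

rng = np.random.default_rng(0)
vocab = 8

def accept_or_resample(p_target: np.ndarray, p_draft: np.ndarray, token: int) -> int:
    """Accept the drafted token with prob min(1, p_target/p_draft); else resample."""
    if rng.random() < min(1.0, p_target[token] / p_draft[token]):
        return token
    # On rejection, resample from the residual distribution max(p_target - p_draft, 0).
    residual = np.maximum(p_target - p_draft, 0.0)
    residual /= residual.sum()
    return int(rng.choice(vocab, p=residual))

p_draft = rng.dirichlet(np.ones(vocab))      # Draft Model's next-token distribution
p_target = rng.dirichlet(np.ones(vocab))     # Target Model's next-token distribution
drafted = int(rng.choice(vocab, p=p_draft))  # token proposed by the Draft Model

final = accept_or_resample(p_target, p_draft, drafted)
print(f"drafted token {drafted} -> final token {final}")
```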