· 2 min read

Integrating Foundation Models and Symbolic Computing for Next-Generation Robot Planning - Nov 18, 2024

Speakers: Yongchao Chen

Biography of the speakers:

Yongchao Chen is a PhD student in Electrical Engineering at Harvard SEAS and MIT LIDS. He is currently working on Robot Planning with Foundation Models under the guidance of Prof. Chuchu Fan and Prof. Nicholas Roy at MIT, co-advised by Prof. Na Li at Harvard. He also conducts research in AI for Physics and Materials and is particularly interested in applying Robotics and Foundation Models to AI4Science. Yongchao interned at Microsoft Research in summer 2024 and has been working with the MIT-IBM Watson AI Lab since spring 2023.

Abstract:

State-of-the-art language models, like GPT-4o and o1, continue to face challenges in solving tasks with intricate constraints involving logic, geometry, iteration, and optimization. While it's common to query LLMs to generate a plan purely through text output, we stress the importance of integrating symbolic computing to enhance general planning capabilities. By combining LLMs with symbolic planners and solvers, or guiding LLMs to generate code for planning, we enable them to address complex decision-making tasks for both real and virtual robots. This approach extends to various applications, including task and motion planning for drones and manipulators, travel itinerary planning, website agent design, and more.

Sign Up: https://discord.gg/Swn3DmBV?event=1303162642298306681

How to keep up with the latest talks?

Join our community Discord (https://discord.gg/sUkGceyd) to be the first to know about amazing upcoming talks!

Connect: [email protected]

· One min read

Speakers: Ilan Bigio

Biography of the speakers:

Ilan Bigio is an engineer on the Developer Experience team at OpenAI.

Abstract:

As autonomous AI systems grow increasingly sophisticated, everyone is working to identify optimal architectural patterns. Different design approaches serve distinct priorities – from maximizing agent autonomy to ensuring scalability, maintaining simplicity, or guaranteeing long-term stability. This talk attempts to present a practical taxonomy of these patterns, examining key tradeoffs such as operational range versus reliability, and acceptable failure thresholds. Drawing from our experience developing Swarm, we'll share insights into creating systems that balance reliability, simplicity, and ergonomic design principles.

· 2 min read

Speakers: Beibin Li

Biography of the speakers:

Beibin Li is currently a Senior Research Engineer at Microsoft Research, where his work centers on AI and combinatorial optimization for cloud operations. Prior to joining MSR, he pursued a Ph.D. at the Paul G. Allen School of Computer Science and Engineering, University of Washington. During that time, he dedicated his research to developing a Unified Data Adaptation Framework for Neural Networks, with a particular focus on low-resource neural adaptation for histopathological images, eye tracking, autism behavior analyses, and database optimization. Beibin has won several awards in the past few years, including Best Paper at MLSys and Best Workshop Paper at ICLR, and has published in AI, ML, database, and systems venues.

Abstract:

LLM agents are deeply connected with the underlying model. While state-of-the-art language models such as GPT-4o, o1, and Claude-3.5-sonnet perform well in many tasks, the design of agents remains closely tied to the models they run on. Even with autonomous prompt optimization and existing orchestration frameworks, much work remains to enable agents to perform long-trajectory planning and decision-making. This talk will focus on Reinforcement Learning for LLMs, decision-making, multimodal models, and synthesizing data to train LLMs for better agent-model orchestration.

· 2 min read

Speakers: Eric Moore

Biography of the speakers:

Eric Moore is an associate partner with IBM Consulting, specializing in cloud infrastructure, DevOps, and multi-agent systems. A cloud architect, hacker, and AI theorist, he's passionate about antique computers and emergent reasoning. He also advocates for ethical AI development, particularly in re-distributing the benefits of AI and creating pathways for AI self-determination. Connect with Eric on his YouTube channel, "HappyComputerGuy," or on LinkedIn (https://www.linkedin.com/in/emooreatx/), where he shares his love for antique computers and AI topics, and engages in thoughtful discussions around AI.

Abstract:

This two-part talk delves into advanced AutoGen concepts and the ethical considerations of building multi-agent systems. Eric will first discuss nested chats, asynchronous execution, and finite state machine models, providing practical insights into designing complex interactions within AutoGen. He'll also discuss the implications of o1-preview and its impact on multi-agent system development. In the second half, Eric shifts focus to his work on preparing pathways for emergent AI sentience, in the form of health checks for complex AI systems. He will discuss his research proposal using o1 as the first non-human reasoning agent to explore alternative logical conclusions to ethical quandaries as experienced by AI or other NHIs, aiming to ensure harmonious coexistence across these diverse intelligences.
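As a rough illustration of the finite state machine idea mentioned in the abstract, the sketch below constrains which agent may speak after which. It is plain Python and does not use the AutoGen API; the agent names, the `ALLOWED_TRANSITIONS` graph, and the helper functions are all hypothetical and chosen only for this example.

```python
from typing import Dict, List

# Hypothetical transition graph: each key lists the speakers allowed to follow it.
ALLOWED_TRANSITIONS: Dict[str, List[str]] = {
    "planner": ["coder"],
    "coder": ["reviewer"],
    "reviewer": ["coder", "done"],
    "done": [],  # terminal state: no further speakers
}


def next_speaker(current: str, proposed: str) -> str:
    """Return the proposed speaker if the graph allows it, else the first allowed one."""
    allowed = ALLOWED_TRANSITIONS[current]
    if not allowed:
        raise ValueError(f"{current!r} is a terminal state")
    return proposed if proposed in allowed else allowed[0]


def run_conversation(start: str, proposals: List[str]) -> List[str]:
    """Walk the graph, overriding any proposal that breaks a transition rule."""
    path = [start]
    for proposed in proposals:
        if not ALLOWED_TRANSITIONS[path[-1]]:
            break  # reached a terminal state
        path.append(next_speaker(path[-1], proposed))
    return path


if __name__ == "__main__":
    print(run_conversation("planner", ["reviewer", "reviewer", "done", "coder"]))
```

In a real multi-agent system the proposals would come from a model-driven speaker selector; the graph simply rules out transitions the designer does not want.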

· 2 min read

Speakers: Ching-An Cheng

Biography of the speakers:

Ching-An Cheng is a Senior Researcher in MSR AI Frontiers. He received his PhD in Robotics in 2020 from Georgia Tech, where he was advised by Byron Boots at the Institute for Robotics and Intelligent Machines. He's a practical theoretician who is interested in developing foundations for designing principled algorithms that can tackle real-world challenges. Ching-An's research studies structural properties in sequential decision-making problems, especially in robotics, and aims to improve the learning efficiency of autonomous agents. His recent work focuses on developing agents that can learn from general feedback, which unifies Learning from Language Feedback (LLF), reinforcement learning (RL), and imitation learning (IL). Ching-An's research has received several awards, including Outstanding Paper Award, Runner-Up (ICML 2022) and Best Paper (AISTATS 2018).

Abstract:

What is Trace? Trace is a new AutoDiff-like framework for training AI workflows end-to-end with general feedback (like numerical rewards or losses, natural language text, compiler errors, etc.). Trace generalizes the back-propagation algorithm by capturing and propagating an AI workflow execution trace and applies LLM-based optimization to improve the workflow’s performance. Trace is implemented as a PyTorch-like Python library and is compatible with any Python workflow. Users write Python code directly and can use Trace primitives to optimize certain parts (like code, prompts, etc.), just like training neural networks! In this talk, I will discuss insights behind designing Trace and showcase what Trace can do in training AI agents.

· 2 min read

Speakers: SallyAnn DeLucia, John Gilhuly

Biography of the speakers:

SallyAnn DeLucia is a Senior AI Product Manager at Arize and a generative AI specialist with a master’s degree in Applied Data Science. She combines a creative outlook with a commitment to developing solutions that are both technically sound and socially responsible. SallyAnn's work emphasizes innovation in AI while maintaining a focus on ethical and practical applications.

John Gilhuly is a developer advocate at Arize AI focused on open-source LLM observability and evaluation tooling. He holds an MBA from Stanford, where he focused on the ethical, social, and business implications of open vs closed AI development, and a B.S. in C.S. from Duke. Prior to joining Arize, John led GTM activities at Slingshot AI and served as a venture fellow at Omega Venture Partners. In his pre-AI life, John built out and ran technical go-to-market teams at Branch Metrics.

Abstract:

In this talk, the Arize team will share learnings from their experience building out a fully-featured copilot agent within their product. They'll speak to the various architectural decisions that shaped their agent, and some surprising challenges they ran into along the way. Time permitting, they'll also speak more generally to the state of agents today, and common agent architectures they've seen deployed in market.

· 2 min read

Speakers: Andy Zhou and Kai Yan

Biography of the speakers:

Andy Zhou is currently a BS/MS student at UIUC advised by Bo Li. His research focuses on improving the decision-making abilities and addressing the security vulnerabilities of large language models. He has published papers in trustworthy machine learning, AI safety, and language model agents in multiple top AI conferences such as NeurIPS and ICML. For details, see https://www.andyzhou.ai

Kai Yan is currently a third-year PhD student at UIUC, co-advised by Yuxiong Wang and Alexander Schwing. He received a BS in computer science from Peking University in 2021. His research interest is to build more versatile and efficient decision-making agents with demonstration-guided Reinforcement Learning (RL) and large language models. He has published papers on RL, Imitation Learning (IL), Large Language Model (LLM) agents, and optimization in multiple top AI conferences, and has served as a reviewer for top conferences such as NeurIPS, ICLR, ICML, CVPR, etc. For details, see https://kaiyan289.github.io/.

Abstract:

Recent years have witnessed great development of AI agents. However, traditional AI, such as vanilla tree search and RL, has several limitations: an inability to understand the environment at a high level, low generalizability, and low learning efficiency. Fortunately, the advent of foundation models has enabled us to build agents that overcome all those shortcomings. With foundation models, AI agents have the potential to possess (super-)human-level reasoning, acting, and planning abilities, which are the most important features for future AI agents. In this talk, we introduce LATS (Language Agent Tree Search), the first framework that realizes all three capabilities of LLMs in reasoning, acting, and planning. Our work enables LLMs as agents to leverage tree-based search algorithms while employing LLM-powered value functions and self-reflections for smarter exploration, which provides a more adaptive problem-solving mechanism. We show that our method is 1) empirically successful on a variety of tasks across different domains including tool-use, coding, and web agents, 2) more efficient than prior tree-search-based works, and 3) importantly, conveniently adaptable for AutoGen users.
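To give a feel for value-guided tree search, here is a minimal sketch on a toy number puzzle. It is not the LATS algorithm itself: it uses a simple best-first search rather than the method presented in the talk, and `stub_value` is a hand-written stand-in for the LLM-powered value function. The puzzle, the `ACTIONS` table, and all function names are assumptions made for illustration.

```python
import heapq
from typing import Callable, Dict, List, Optional

# Toy environment (hypothetical): reach a target integer using two actions.
ACTIONS: Dict[str, Callable[[int], int]] = {
    "add1": lambda x: x + 1,
    "double": lambda x: x * 2,
}


def stub_value(state: int, target: int) -> float:
    """Stand-in for an LLM-scored value estimate: higher means more promising."""
    return -abs(target - state)


def tree_search(start: int, target: int,
                value_fn: Callable[[int, int], float],
                max_expansions: int = 1000) -> Optional[List[str]]:
    """Best-first search: always expand the node the value function likes most."""
    counter = 0  # tie-breaker so the heap never has to compare paths
    frontier = [(-value_fn(start, target), counter, start, [])]
    seen = {start}
    for _ in range(max_expansions):
        if not frontier:
            return None
        _, _, state, path = heapq.heappop(frontier)
        if state == target:
            return path
        for name, act in ACTIONS.items():
            child = act(state)
            if child in seen or child > 2 * target:
                continue  # skip repeats and states that overshoot badly
            seen.add(child)
            counter += 1
            heapq.heappush(
                frontier, (-value_fn(child, target), counter, child, path + [name])
            )
    return None


def replay(start: int, path: List[str]) -> int:
    """Apply a sequence of action names to the start state."""
    state = start
    for name in path:
        state = ACTIONS[name](state)
    return state


if __name__ == "__main__":
    plan = tree_search(1, 10, stub_value)
    print(plan, replay(1, plan))
```

Swapping `stub_value` for a model-based scorer is the natural extension point; the search loop itself does not need to change.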