A curated list of awesome papers on Embodied AI and related research/industry-driven resources, inspired by awesome-computer-vision.
Embodied AI has driven new breakthroughs, and this repository tracks and summarizes the related research and industrial progress.
- Contributions are highly welcome; feel free to submit a pull request or contact me.
If you find this repository helpful, please consider starring ⭐ or sharing ⬆️ it.
- CVPR-Workshop
- ICCV-Workshop
- CS539-OregonStateUniversity
- ChatGPT for Robotics: Design Principles and Model Abilities
Please consider this fantastic paper: Agent AI: Surveying the Horizons of Multimodal Interaction
- Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI
- Vision-Language Navigation with Embodied Intelligence: A Survey
- The Rise and Potential of Large Language Model Based Agents: A Survey
- A Survey of Embodied AI: From Simulators to Research Tasks
- A Survey on LLM-based Autonomous Agents
- Mindstorms in Natural Language-Based Societies of Mind
- Data Interpreter: An LLM Agent For Data Science
- Communicative Agents for Software Development
- Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web
- Experiential Co-Learning of Software-Developing Agents
- EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction
- A Survey of Embodied AI: From Simulators to Research Tasks
- Embodied AI in education: A review on the body, environment, and mind
- Agent AI: Surveying the Horizons of Multimodal Interaction
- Learning to Generate Context-Sensitive Backchannel Smiles for Embodied AI Agents with Applications in Mental Health Dialogues
- Alexa Arena: A User-Centric Interactive Platform for Embodied AI
- Artificial intelligence education for young children: A case study of technology‐enhanced embodied learning
- EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought
- Multimodal embodied interactive agent for cafe scene
- The Essential Role of Causality in Foundation World Models for Embodied AI
- A Survey on Robotics with Foundation Models: toward Embodied AI
- Where are we in the search for an artificial visual cortex for embodied intelligence?
- A Comprehensive Survey of Cross-Domain Policy Transfer for Embodied Agents
- The sense of agency in human–AI interactions
- "Don't forget to put the milk back!" Dataset for Enabling Embodied Agents to Detect Anomalous Situations
- Vision-Language Navigation with Embodied Intelligence: A Survey
- Hierarchical Auto-Organizing System for Open-Ended Multi-Agent Navigation
- Spatially-Aware Transformer Memory for Embodied Agents
- VELMA: Verbalization Embodiment of LLM Agents for Vision and Language Navigation in Street View
- Embodied Human Activity Recognition
- LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents
- EDGI: Equivariant diffusion for planning with embodied agents
- Large Multimodal Agents: A Survey
- Egocentric Planning for Scalable Embodied Task Achievement
- EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents
- Human-agent teams in VR and the effects on trust calibration
- Talk with Ted: an embodied conversational agent for caregivers
- MOPA: Modular Object Navigation With PointGoal Agents
- Embodied Conversational Agents for Chronic Diseases: Scoping Review
- Towards anatomy education with generative AI-based virtual assistants in immersive virtual reality environments
- Improving Knowledge Extraction from LLMs for Task Learning through Agent Analysis
- A Survey on Large Language Model-Based Game Agents
- AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents
- Towards Heterogeneous Multi-Agent Systems in Space
- Embodied Machine Learning
- Penetrative AI: Making LLMs Comprehend the Physical World
- WebVLN: Vision-and-Language Navigation on Websites
- Generating meaning: active inference and the scope and limits of passive AI
- RoboHive: A Unified Framework for Robot Learning
- Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents
- Turing Test in the Era of LLM
- Generative Models for Decision Making
- AgentScope: A Flexible yet Robust Multi-Agent Platform
- MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
- MuLan: Multimodal-LLM Agent for Progressive Multi-Object Diffusion
- An Interactive Agent Foundation Model
- UFO: A UI-Focused Agent for Windows OS Interaction
- Thanks to GT-RIPL's repository
- Thanks to Jacob Rintamaki's repository
- Thanks to Jiankai-Sun's repository
- Thanks to Yafei Hu's repository
- Thanks to Changan's repository
- Thanks to Rui's repository
- An Interactive Agent Foundation Model
- AutoGen, EcoOptiGen
- AgentTuning: Enabling Generalized Agent Abilities For LLMs
- AgentBench: Evaluating LLMs as Agents
- The Rise and Potential of Large Language Model Based Agents: A Survey
- An Open-source Framework for Autonomous Language Agents
- MetaGPT: Meta Programming for Multi-Agent Collaborative Framework
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents
- ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models
- Embodied Task Planning with Large Language Models
- Building Cooperative Embodied Agents Modularly with Large Language Models
- State-Maintaining Language Models for Embodied Reasoning
- Embodied Executable Policy Learning with Language-based Scene Summarization
- Voyager: An Open-Ended Embodied Agent with Large Language Models
- Simple Embodied Language Learning as a Byproduct of Meta-Reinforcement Learning
- Vision-Language Tasks
- Exploring Large Language Models for Communication Games: An Empirical Study on Werewolf
- Language Guided Generation of 3D Embodied AI Environments
- CogAgent: Visual Expert for Pretrained Language Models
- ProAgent: from Robotic Process Automation to Agentic Process Automation
- Waymax: An accelerated simulator for autonomous driving research
- How Far Are Large Language Models from Agents with Theory-of-Mind?
- AgentBench: Evaluating LLMs as Agents
- MindAgent: Emergent Gaming Interaction
- Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI
- Emergent Communication for Embodied Control
- Simple but Effective: CLIP Embeddings for Embodied AI
- Embodied AI-Driven Operation of Smart Cities: A Concise Review
- Modeling Dynamic Environments with Scene Graph Memory
- An Open-source Framework for Autonomous Language Agents
- MetaGPT: Meta Programming for Multi-Agent Collaborative Framework