Embodied Navigation Combining Exploration and Imagination

Professor Shuqiang Jiang
Institute of Computing Technology, Chinese Academy of Sciences, China


Embodied AI is a significant manifestation of artificial intelligence in the real physical world and has shown great application potential in dynamic open-world environments. Embodied navigation refers to the ability of an agent to perceive and understand its environment based on task objectives (such as language instructions), then predict and execute movement actions, thereby progressively completing tasks. It is a key technology for embodied intelligent systems to interact with the real world. Existing methods for embodied navigation largely rely on current and past visual observations for short-term, single-step action prediction, and lack the capability to evaluate unobserved environments and conduct long-term action planning. Physiological studies have indicated that humans not only depend on current observations but can also imagine unobserved environments from prior memories, constantly refining and enhancing their understanding of the environment by combining exploration and imagination. Thus, endowing agents with the ability to "imagine", and thereby helping them predict the layout of unobserved environments, assess the long-term value of navigation actions, and make more efficient and accurate navigation decisions, emerges as a significant research challenge. This report will first introduce the research background of embodied AI and embodied navigation, then report on research progress in embodied navigation combining exploration and imagination, including self-supervised generative maps and lookahead exploration with neural radiance representation, and finally introduce the adaptation of embodied navigation from simulators to the real world and provide demonstrations.



Biosketch

Shuqiang Jiang is a professor with the Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS), and a professor at the University of CAS. He is also with the Key Laboratory of Intelligent Information Processing, CAS. His research interests include multimedia analysis and multimodal intelligence. He leads the food computing research group at ICT, CAS. He has authored or coauthored more than 200 papers on related research topics. He was supported by the National Science Fund for Distinguished Young Scholars in 2021. He won the CAS International Cooperation Award for Young Scientists, the CCF Award of Science and Technology, the Wu Wenjun Natural Science Award for Artificial Intelligence, the CSIG Natural Science Award, and the Beijing Science and Technology Progress Award. He is an Associate Editor of ACM ToMM, Vice Chair of the IEEE CASS Beijing Chapter, and Vice Chair of the ACM SIGMM China Chapter. He has served as an organizing committee member of more than 20 academic conferences, including as general chair of ICIMCS 2015 and program chair of ICIMCS 2010, PCM 2017, and ACM Multimedia Asia 2019. He has also served as an area chair or TPC member for many conferences, including ACM Multimedia, CVPR, ICCV, IJCAI, ICME, and ICIP.