Group Introduction
The Multi-modal Intelligence Group (MIG) is a research team based at the Center for Future Media, University of Electronic Science and Technology of China (UESTC). The group is supervised by Prof. Guoqing Wang and co-supervised by Dr. Tianyu Li.
MIG studies vision-driven aerial intelligence: building aerial systems that can see (at night and in hazy, rainy, or cloudy conditions), understand (model the world at the pixel, object, and scene levels), and perform embodied control (interact with and control the aerial agent or other agents automatically). Besides designing learning algorithms, MIG is also keen on deploying them on edge computing devices, which can be integrated into aerial vehicles as subsystems to improve their intelligent capabilities.
Specifically, MIG explores cutting-edge research in artificial intelligence, computer vision, and unmanned aerial vehicles, including:
- Intelligent visual capturing systems: designing low-level vision algorithms for seeing at night and in strong-light, hazy, cloudy, or rainy conditions, and deploying the algorithms on aerial vehicles.
- Intelligent visual perception systems: designing vision and multimodal LLMs for understanding the world at the pixel, object, and scene levels, and deploying the algorithms on aerial vehicles so that they can recognize objects, recognize 3D scenes, and predict the motion trends of objects and scenes.
- Intelligent embodied aerial AI systems: designing vision-and-language navigation (VLN) and vision-language-action (VLA) algorithms and deploying them on aerial vehicles and multi-agent systems so that they can achieve self-control and self-organization.
Highlights

Our Research
MIG focuses on three major research directions that integrate cutting-edge artificial intelligence, computer vision, and aerial systems.