Advancing the spatial reasoning of large multimodal models.
SpatialLMM: A Compound 3D-Informed Design toward Spatially-Intelligent Large Multimodal Models
Wufei Ma, Luoxin Ye, Celso M de Melo, Alan Yuille, and Jieneng Chen
In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
PulseCheck457: A Diagnostic Benchmark for Comprehensive Spatial Reasoning of Large Multimodal Models
Xingrui Wang, Wufei Ma, Tiezheng Zhang, Celso M de Melo, Jieneng Chen, and Alan Yuille
In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025
3DSRBench: A Comprehensive 3D Spatial Reasoning Benchmark
Wufei Ma, Haoyu Chen, Guofeng Zhang, Celso M de Melo, Jieneng Chen, and Alan Yuille
Technical Report, 2024
Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering
Xingrui Wang, Wufei Ma, Angtian Wang, Shuo Chen, Adam Kortylewski, and Alan Yuille
In International Conference on Learning Representations, 2025
3D-Aware Visual Question Answering about Parts, Poses and Occlusions
Xingrui Wang, Wufei Ma, Angtian Wang, Shuo Chen, Adam Kortylewski, and Alan Yuille
In Advances in Neural Information Processing Systems, 2023
Wufei Ma, Yu-Cheng Chou, Xingrui Wang, Qihao Liu, Jieneng Chen, Alan Yuille
If you would like to join our team or collaborate with us, please contact Wufei Ma.