DICEPTION: A Generalist Diffusion Model for Vision Perception
GenDeF: Learning Generative Deformation Field for Video Generation
ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
FreerCustom: Training-Free Multi-Concept Customization for Image and Video Generation
Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration
Revisiting Depth Representations for Feed-Forward 3D Gaussian Splatting