Visual Internal Reasoning
Structured latent image tokens for causally grounded visual reasoning inside language models.
Quick Explanation
This project tests whether language models reason better when they generate internal visual latent states before answering, and measures that effect with causal controls.
Multimodal, Latent Reasoning, Causal Evaluation