Abstract: Multimodal Large Language Models (MLLMs) have shown promising capabilities in mathematical reasoning within visual contexts across various datasets. However, most existing multimodal math ...
Abstract: Large language models (LLMs) can achieve superior results through iterative refinement based on internal or external signals, compared to the unstable outputs from a single pass. However, ...