Abhranil Chandra

Detailed Research & Papers

This page expands on my research vision, ongoing projects, and publications. For a concise overview, see the home page.

Research Vision

Building Foundation-Model Agents that unify internet-scale knowledge with decision-making: offline/unsupervised RL, generative world models, system-2 reasoning, tool-use, and scalable oversight. I aim to move beyond purely autoregressive training toward objectives and representations that support interactive learning, planning, and self-improvement.

Projects & Preprints

Shape of Thought: When Distribution Can Matter More than Correctness
ICLR 2025 (under review) · With R. Agarwal, A. Courville, S. Fischmeister, et al.

Large-scale evidence that training on distribution-matched but incorrect chains of thought (CoTs) can outperform training on correct but distribution-mismatched data. This reframes "reasoning supervision" and motivates RL from unverified trajectories.

paper
VideoAgent: Self-Improving Generative World Model for Embodied Planning
TMLR 2025 (under review) · RLBrew@RLC 2025 Spotlight · With S. Yang, B. Dai, S. Fischmeister, et al.

Iteratively refines visual plans using VLM feedback and a self-consistency objective to improve a video world model for decision-making; shows strong gains on simulated and real-world robot manipulation videos.

paper
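The refinement loop described above can be sketched as follows. This is an illustrative stand-in only: the function names, the stopping rule, and the toy critic are my assumptions, not the VideoAgent implementation, and simple stubs replace the actual VLM and world model.

```python
# Hedged sketch of iterative plan refinement with critic feedback.
# `critique` stands in for the VLM scorer; `revise` stands in for the
# generative world model proposing an improved plan.
from typing import Callable, List

def refine_plan(
    initial_plan: List[str],
    critique: Callable[[List[str]], float],
    revise: Callable[[List[str]], List[str]],
    max_iters: int = 5,
) -> List[str]:
    """Keep revising the plan while the critic's score improves."""
    plan, score = initial_plan, critique(initial_plan)
    for _ in range(max_iters):
        candidate = revise(plan)
        candidate_score = critique(candidate)
        if candidate_score <= score:  # no improvement: stop early
            break
        plan, score = candidate, candidate_score
    return plan

# Toy stand-ins: the critic rewards longer plans, saturating at 3 steps,
# so refinement converges to a 3-step plan.
plan = refine_plan(
    ["reach"],
    critique=lambda p: min(len(p), 3),
    revise=lambda p: p + ["step"],
)
```

In the paper's setting the critic and reviser are learned models rather than toy lambdas, but the control flow of score-gated iterative refinement is the same shape.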
ReFeR: Hierarchical Agents for Reasoning & Evaluation
TMLR 2025 · With Y. Narsupalli, S. Muppirala, M. Gupta, P. Goyal

Small LLMs act as peer reviewers and a larger LLM acts as an area chair, reducing cost while improving alignment and reliability across generation, multimodal evaluation, and reasoning tasks.

paper
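The peer-reviewer/area-chair hierarchy above can be sketched as below. Everything here is an illustrative assumption, not the ReFeR code: stub lambdas stand in for small-LLM reviewers, and a simple mean-score aggregator stands in for the larger area-chair model.

```python
# Hedged sketch of a hierarchical review pipeline: several small
# "reviewer" models score a candidate answer; an "area chair" step
# aggregates their reviews into a final verdict.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Review:
    reviewer: str
    score: float  # e.g. a 1-10 quality rating
    comment: str

def area_chair(reviews: List[Review], threshold: float = 5.0) -> dict:
    """Aggregate peer reviews into a decision (mean score here; the
    real system would prompt a stronger LLM with the reviews)."""
    mean_score = sum(r.score for r in reviews) / len(reviews)
    return {
        "score": mean_score,
        "accept": mean_score >= threshold,
        "evidence": [r.comment for r in reviews],
    }

# Stub reviewers standing in for small-LLM calls.
reviewers: List[Callable[[str], Review]] = [
    lambda ans: Review("r1", 7.0, "clear reasoning"),
    lambda ans: Review("r2", 6.0, "minor gaps"),
    lambda ans: Review("r3", 8.0, "well grounded"),
]

verdict = area_chair([r("candidate answer") for r in reviewers])
```

The design point the paper exploits is that the expensive model only sees short reviews rather than the full task, which is where the cost savings come from.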
MMLU-Pro
NeurIPS Datasets & Benchmarks 2024 (Spotlight) · With Wenhu Chen's lab

A more challenging and robust multi-task language understanding benchmark extending MMLU.

paper

Publications
