Hello! I'm

Abhranil Chandra

I am an AI researcher. My research goal is to build interactive autonomous digital/embodied decision-making agents that can plan, reason and act with limited supervision and in out-of-distribution (OOD) settings.

I am a Research Master's (Thesis) student in Computer Science at the Cheriton School of Computer Science, University of Waterloo, where I work under the supervision of Prof. Wenhu Chen and Prof. Sebastian Fischmeister. Most recently, I have been primarily interested in topics around foundation models for decision making, reinforcement learning and embodied AI, deep generative models, and alignment.

During my master's, I have been supported by a full academic ride from the University of Waterloo, the Mitacs Globalink Graduate Research Fellowship, and the International Master's Award of Excellence. I previously graduated with a B.Tech. in Mechanical Engineering with a minor in Mathematics and Computing and a micro-specialization in Artificial Intelligence from IIT Kharagpur, where I worked on probabilistic machine learning, reinforcement learning and NLP, advised by Prof. Pabitra Mitra from the CSE Department. My undergraduate thesis was on Bayesian Uncertainty Estimation on Sensitive Data for Computer Vision Tasks using Bayesian Deep Learning.

I am thankful to have spent wonderful summers interning: at ThoughtLabs Pvt. Ltd. as a Research Engineer, working with Lokesh Bhatija on scaling multimodal foundation models and improving fine-grained knowledge extraction from extremely long documents using Retrieval-Augmented Generation (RAG); as a Mitacs Globalink Research Intern with Prof. Boyu Wang at the Vector Institute and Western University, working on continual domain generalization; and as a Research Intern at the CLIP Lab, University of Maryland Institute for Advanced Computer Studies (UMIACS), with Prof. Jordan Boyd-Graber, improving neural question answering systems by leveraging self-supervised question generation.

From 2022 to 2023, I also worked as a research assistant in the Intelligent Robot Learning Laboratory (IRL Lab) at the University of Alberta under Prof. Matthew E. Taylor and Manan Tomar, on deep unsupervised representation learning for RL, offline RL, and pre-training in RL to improve generalization and efficiency.

Abhranil Chandra

abhranil[dot]Chandra[at]gmail[dot]com

Research Interests

I actively explore both theoretical frameworks and empirical findings through the lens of foundation models, RL, and interactive learning, with specific research interests in:

  • Reinforcement Learning (robust RL, offline RL, IRL, RLHF)
  • Foundation models for decision making and policy learning (generative models, representation)
  • Improving planning and reasoning of foundation models to create interactive autonomous agents
  • Science of Foundation Models

If you want to collaborate or have any questions, feel free to shoot me an email. I am always interested in connecting with people.

News

  • [Feb '24]
    Started research collaboration with Sherry Yang at Google DeepMind on Interactive Video Generation Models as World Models as part of my thesis research.
  • [Dec '23]
    Our paper DiffClone won best paper and workshop competition winner award at the TOTO Workshop, NeurIPS 2023.
  • [Oct '23]
Our project "A Human-Aligned Automated Evaluation Framework for Natural Language Generation via Large Language Models" got accepted for the Accelerating Foundation Models Research grant by Microsoft Research.
  • [Oct '23]
    Started as a part-time Student Researcher at Palitronica Inc.
  • [Sep '23]
Received the Mitacs Globalink Graduate Research Fellowship and the International Master's Award of Excellence at UWaterloo to support my graduate research, in addition to a full academic ride from UWaterloo.

Publications and Thesis

(* denotes equal contribution, ** denotes equal contribution and co-first authorship)

VideoEval: Training Multimodal Language Models for Multi-aspect Evaluation of Video Generation Models

Xuan He, Dongfu Jiang, Ge Zhang, Max Ku, Abhranil Chandra et al.

In Preparation

TL;DR: We introduce a large-scale video evaluation dataset and benchmark, and present a VideoEval model that significantly outperforms GPT-4o in assessing video generation quality.

You Make me Feel like a Natural Question: Training QA Systems on Transformed Trivia Questions

Tasnim Kabir, Saptarashmi Bandyopadhyay, Yoo Yeon Sung, Hao Zou*, Abhranil Chandra*, Jordan Lee Boyd-Graber

In Preparation

TL;DR: We propose a method to generate concise, web query-style natural questions from trivia data to train QA systems, offering a cost-effective alternative to expensive, hard-to-annotate synthetic datasets.

Self-Improving Generative Simulators

Abhranil Chandra et al.

Under Review

TL;DR: We present SI-GenSim, a framework that enhances video-based policy models with diverse feedback mechanisms, significantly reducing inaccuracies like hallucinations and improving performance in real-world tasks such as robotic manipulation.

Review-Feedback-Reason (ReFeR): A Novel Framework for NLG Evaluation & Reasoning

Yaswanth Narsupalli**, Abhranil Chandra**, Sreevatsa Muppirala, Manish Gupta, Pawan Goyal

Under Review

TL;DR: We introduce ReFeR, a novel NLG evaluation framework inspired by peer review, which significantly enhances evaluation accuracy and reasoning benchmarks, outperforming current methods and enabling smaller models to match or exceed larger models' performance.

MMLU-Pro: A More Robust and Challenging Multi-Task Language Understanding Benchmark

Yubo Wang*, Xueguang Ma*, Ge Zhang, Yuansheng Ni, Abhranil Chandra et al.

Under Review

TL;DR: We introduce MMLU-Pro, an enhanced benchmark that raises the challenge for language models with more complex reasoning questions and expanded choices, offering better discrimination of model capabilities and reducing prompt sensitivity compared to the original MMLU.

DiffClone: Enhanced Behaviour Cloning in Robotics with Diffusion-Driven Policy Learning

Sabariswaran Mani, Abhranil Chandra*, Sreyas Venkataraman*, Adyan Rizvi, Yash Sirvi, Soumojit Bhattacharya, Aritra Hazra

TOTO Workshop @ NeurIPS 2023

TL;DR: We introduce DiffClone, a diffusion-based behavior cloning agent evaluated on real robots, showing its effectiveness with offline training on the TOTO Benchmark and highlighting the superior performance of a MoCo-finetuned ResNet50 for visual representation.

CABACE: Injecting Character Sequence Information and Domain Knowledge for Enhanced Acronym and Long-Form Extraction

Nithish Kannen, Divyanshu Seth, Abhranil Chandra, Subhraneel Pal

SDU Workshop at AAAI 2022

TL;DR: We present CABACE, a Character-Aware BERT framework designed for acronym extraction in scientific and legal texts, outperforming baselines and demonstrating strong zero-shot generalization to non-English languages.

Leveraging recent advances in Pre-Trained Language Models for Eye-Tracking Prediction

Varun Madhavan*, Aditya Girish Pawate*, Shraman Pal*, Abhranil Chandra*

CMCL Workshop at NAACL 2021

TL;DR: We introduce a novel architecture combining RoBERTa and a transformer-based model to predict eye-gaze features for each word in a sentence, using ZuCo datasets to explore cognitive-inspired NLP for better language processing.

Highlights

Summer Research Intern

Mentor(s): Prof. Jordan Boyd-Graber and Saptarashmi Bandopadhyay
Working on improving different state-of-the-art question answering systems like DrQA, RAG and R2D2 by using machine translation and seq-to-seq language generation to augment existing QA datasets and convert one dataset's content into another, thus creating bigger and better datasets.

Research Intern

Mentor(s): Prof. Yi-Zhe Song and Ayan Kumar Bhunia

Selected Projects

  • Long Video Generation In preparation for a paper Code

    CS886: Foundation Models | Supervisor: Prof. Wenhu Chen

Generating long-format videos with high temporal coherence remains a challenging task for current state-of-the-art models. Existing approaches often focus on image quality rather than temporal consistency, resulting in short video sequences with limited coherence. In this work, we propose a novel framework for learning better noise priors for consistent long video generation. Our approach integrates advanced noise conditioning techniques with state-of-the-art video extension methodologies to enhance temporal coherence and fidelity. Leveraging insights from recent advancements in noise modeling and video extension, we introduce a model-agnostic scheme that seamlessly integrates with existing text-to-video diffusion techniques. Specifically, we refine noise priors to capture temporal correlations and extend videos by adaptively conditioning on the generated frames and noise priors. We demonstrate the efficacy of our approach through extensive experimentation, showcasing significant improvements in temporal coherence and video fidelity. Our framework offers a scalable solution for generating long-format videos using diffusion models, contributing to the advancement of video synthesis techniques.

  • Air Quality Forecasting using Neural Networks Report Code

    ML for Earth System Sciences Term Paper | Supervisor: Prof. Adway Mitra

Surveyed multiple papers on deep neural networks for air quality prediction. Methods such as ANNs, LSTMs and GRUs were implemented and tested on standard datasets. In addition, we implemented the state-of-the-art time-series model Prophet Net, which outperformed all the other baselines.

  • Motion Planning of Articulated Robot Arm using Reinforcement Learning Report Code

    AI for Manufacturing Term Project | Supervisor: Prof. Cheruvu Siva Kumar

Designed and implemented the DDPG algorithm with Hindsight Experience Replay (HER) to train the FetchReach agent of OpenAI Gym.

Electric Vehicle Routing and Path-Planning Code

    Artificial Intelligence Foundations and Applications Term Project | Course Instructor: Prof. Partha Pratim Chakrabarti and Prof. Arijit Mondal

Used graph-based path-planning algorithms to route all vehicles from their respective sources to destinations such that the maximum travel time is minimized.

Positions Of Responsibility & Volunteer Experience

Senior Member (Jun '21 - current), Junior Member (Oct '20 - Jun '21)

Kharagpur Data Analytics Group (KDAG), IIT Kharagpur

GitHub Repo for Reading Sessions

Organized research paper-reading sessions for students of IIT Kharagpur and conducted a Data Science and ML workshop for more than 600 registered students. KDAG is a group of students enthusiastic about Data Science and Machine Learning and its applications.

NSS Volunteer (Jun '19 - current)

Institute Wellness Group (IWG), IIT Kharagpur
Teach underprivileged kids in nearby villages of IIT Kharagpur the basics of English, Maths and Computing. Conducted surveys about how to improve the education system in village schools, and worked on cleanliness drives.