I am a Research Master's (Thesis) student in Computer Science at the Cheriton School of Computer Science, University of Waterloo. I work under the supervision of Prof. Wenhu Chen and Prof. Sebastian Fischmeister. Most recently, I have been primarily interested in topics around foundation models for decision making, reinforcement learning and embodied AI, deep generative models, and alignment.
During my master's, I have been supported by a full academic ride from the University of Waterloo, the Mitacs Globalink Graduate Research Fellowship, and the International Master's Award of Excellence. I previously graduated with a B.Tech. in Mechanical Engineering, with a minor in Mathematics and Computing and a micro-specialization in Artificial Intelligence, from IIT Kharagpur, where I worked on probabilistic machine learning, reinforcement learning, and NLP, advised by Prof. Pabitra Mitra from the CSE Department. My undergraduate thesis was on Bayesian Uncertainty Estimation on Sensitive Data for Computer Vision Tasks using Bayesian Deep Learning.
I am thankful to have spent two wonderful summers interning: at ThoughtLabs Pvt. Ltd. as a Research Engineer, working with Lokesh Bhatija on scaling multimodal foundation models and improving fine-grained knowledge extraction from extremely long documents using Retrieval-Augmented Generation (RAG); as a Mitacs Globalink Research Intern working on continual domain generalization with Prof. Boyu Wang at the Vector Institute and Western University; and as a Research Intern at the CLIP Lab, University of Maryland Institute for Advanced Computer Studies (UMIACS), with Prof. Jordan Boyd-Graber, on improving neural question answering systems by leveraging self-supervised question generation.
I have also worked as a research assistant in the Intelligent Robot Learning Laboratory (IRL Lab) at the University of Alberta under Prof. Matthew E. Taylor and Manan Tomar from 2022 to 2023, on deep unsupervised representation learning for RL, offline RL, and pre-training in RL to improve generalization and efficiency.
abhranil[dot]Chandra[at]gmail[dot]com
I actively explore both theoretical frameworks and empirical findings through the lens of foundation models, RL, and interactive learning, with specific research interests in:
If you want to collaborate or have any questions, feel free to shoot me an email. I am always interested in connecting with people.
(* denotes equal contribution, ** denotes equal contribution and co-first authorship)
In Preparation
TL;DR: We introduce a large-scale video evaluation dataset and benchmark, and present a VideoEval model that significantly outperforms GPT-4o in assessing video generation quality.
In Preparation
TL;DR: We propose a method to generate concise, web query-style natural questions from trivia data to train QA systems, offering a cost-effective alternative to expensive, hard-to-annotate synthetic datasets.
Under Review
TL;DR: We present SI-GenSim, a framework that enhances video-based policy models with diverse feedback mechanisms, significantly reducing inaccuracies like hallucinations and improving performance in real-world tasks such as robotic manipulation.
Under Review
TL;DR: We introduce ReFeR, a novel NLG evaluation framework inspired by peer review, which significantly improves evaluation accuracy and reasoning benchmarks, outperforming current methods and enabling smaller models to match or exceed the performance of larger models.
Under Review
TL;DR: We introduce MMLU-Pro, an enhanced benchmark that raises the challenge for language models with more complex reasoning questions and expanded choices, offering better discrimination of model capabilities and reducing prompt sensitivity compared to the original MMLU.
TOTO Workshop @ NeurIPS 2023
TL;DR: We introduce DiffClone, a diffusion-based behavior cloning agent evaluated on real robots, showing its effectiveness with offline training on the TOTO Benchmark and highlighting the superior performance of a MoCo-finetuned ResNet-50 for visual representation.
SDU Workshop at AAAI 2022
TL;DR: We present CABACE, a Character-Aware BERT framework designed for acronym extraction in scientific and legal texts, outperforming baselines and demonstrating strong zero-shot generalization to non-English languages.
CMCL Workshop at NAACL 2021
TL;DR: We introduce a novel architecture combining RoBERTa and a transformer-based model to predict eye-gaze features for each word in a sentence, using ZuCo datasets to explore cognitive-inspired NLP for better language processing.
Generating long-format videos with high temporal coherence remains a challenging task for current state-of-the-art models. Existing approaches often focus on image quality rather than temporal consistency, resulting in short video sequences with limited coherence. In this work, we propose a novel framework for learning better noise priors for consistent long video generation. Our approach integrates advanced noise conditioning techniques with state-of-the-art video extension methodologies to enhance temporal coherence and fidelity. Leveraging insights from recent advancements in noise modeling and video extension, we introduce a model-agnostic scheme that integrates seamlessly with existing text-to-video diffusion techniques. Specifically, we refine noise priors to capture temporal correlations and extend videos by adaptively conditioning on the generated frames and noise priors. We demonstrate the efficacy of our approach through extensive experimentation, showcasing significant improvements in temporal coherence and video fidelity. Our framework offers a scalable solution for generating long-format videos using diffusion models, contributing to the advancement of video synthesis techniques.
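The core idea can be illustrated with a minimal, hypothetical sketch of a temporally correlated noise prior: each frame's noise mixes a shared base tensor with independent per-frame noise, so adjacent frames start denoising from similar latents. The function name, the mixing coefficient alpha, and the latent shape below are illustrative assumptions, not the paper's actual formulation.

```python
import torch

def correlated_noise_prior(num_frames, shape, alpha=0.7, device="cpu"):
    """Hypothetical sketch: temporally correlated noise prior built by mixing
    a shared base noise tensor with independent per-frame noise.
    alpha controls how strongly frames share the same base noise."""
    base = torch.randn(shape, device=device)          # noise shared across frames
    frames = []
    for _ in range(num_frames):
        eps = torch.randn(shape, device=device)       # independent per-frame noise
        # Mixing keeps unit variance: alpha^2 + (1 - alpha^2) = 1
        noise = alpha * base + (1.0 - alpha**2) ** 0.5 * eps
        frames.append(noise)
    return torch.stack(frames)                        # (num_frames, *shape)

# Example: latent noise for 16 frames of a 4x64x64 latent video
prior = correlated_noise_prior(16, (4, 64, 64), alpha=0.7)
```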
Surveyed multiple papers on deep neural networks for air quality prediction. Methods such as ANNs, LSTMs, and GRUs were implemented and tested on standard datasets. We also implemented the state-of-the-art time-series model Prophet Net, which outperformed all the other baselines.
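For context, a minimal sketch of the kind of LSTM baseline described above (hypothetical PyTorch code; the layer sizes, feature count, and target variable are illustrative assumptions, not the project's exact configuration):

```python
import torch
import torch.nn as nn

class AirQualityLSTM(nn.Module):
    """Hypothetical sketch: predict the next pollutant reading from a
    window of past sensor readings."""
    def __init__(self, num_features, hidden_size=64, num_layers=2):
        super().__init__()
        self.lstm = nn.LSTM(num_features, hidden_size, num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)   # regress a single target, e.g. PM2.5

    def forward(self, x):                        # x: (batch, seq_len, num_features)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])             # use the last hidden state

# Example: 24-hour window of 8 sensor features -> next-hour prediction
model = AirQualityLSTM(num_features=8)
pred = model(torch.randn(32, 24, 8))             # (32, 1)
```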
Designed and implemented the DDPG algorithm with Hindsight Experience Replay (HER) to train the Fetch-Reacher agent in OpenAI Gym.
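A minimal sketch of the same setup using stable-baselines3's DDPG with an HER replay buffer (the original project implemented the algorithm itself; the environment ID and hyperparameters below are illustrative assumptions and require gymnasium-robotics / MuJoCo):

```python
import gymnasium as gym
import gymnasium_robotics  # registers the Fetch environments (requires MuJoCo)
from stable_baselines3 import DDPG, HerReplayBuffer

env = gym.make("FetchReach-v2")  # environment ID may differ across versions

model = DDPG(
    "MultiInputPolicy",
    env,
    replay_buffer_class=HerReplayBuffer,
    replay_buffer_kwargs=dict(n_sampled_goal=4, goal_selection_strategy="future"),
    verbose=1,
)
model.learn(total_timesteps=100_000)
```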
Used graph-based path-planning algorithms to route all the vehicles from their respective sources to destinations such that the maximum travel time is minimized.
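A simplified, hypothetical sketch of the routing idea using networkx shortest paths, reporting the worst-case completion time across vehicles (the actual project's conflict handling and scheduling details are not shown; the function and attribute names are illustrative):

```python
import networkx as nx

def route_vehicles(graph, trips):
    """graph: weighted nx.Graph with edge attribute 'time';
    trips: list of (source, destination) pairs, one per vehicle."""
    routes, times = [], []
    for src, dst in trips:
        routes.append(nx.shortest_path(graph, src, dst, weight="time"))
        times.append(nx.shortest_path_length(graph, src, dst, weight="time"))
    return routes, max(times)   # makespan-style objective: the slowest vehicle's time

# Example on a toy grid road network with unit travel times
G = nx.grid_2d_graph(4, 4)
nx.set_edge_attributes(G, 1, "time")
routes, makespan = route_vehicles(G, [((0, 0), (3, 3)), ((1, 0), (0, 3))])
```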