I'm a PhD student at UC Berkeley working with Sergey Levine at the Robotics and AI Lab, a part of Berkeley AI Research. I work at the intersection of machine learning and robotics.
I completed my undergraduate studies at IIT-Kanpur. While I was an undergrad, I spent the summer of 2015 working with Ashutosh Saxena at Cornell University, and the summer of 2016 working with Dhruv Batra & Devi Parikh at Virginia Tech.
Github | Twitter | CV
Update Jan 2021: I am on the job market! I am looking for industry positions at the intersection of machine learning and robotics. Open to both industrial research labs and startups.
To get in touch with me: firstname.lastname@example.org
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning
Avi Singh*, Huihan Liu*, Gaoyue Zhou, Albert Yu, Nicholas Rhinehart, Sergey Levine
Oral Presentation (top 1.8% of all submissions)
International Conference on Learning Representations (ICLR), 2021
Reinforcement learning provides a general framework for flexible decision making and control, but requires extensive data collection for each new task that an agent needs to learn. In other machine learning fields, such as natural language processing or computer vision, pre-training on large, previously collected datasets to bootstrap learning for new tasks has emerged as a powerful paradigm to reduce data requirements when learning a new task. In this paper, we ask the following question: how can we enable similarly useful pre-training for RL agents? We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials from a wide range of previously seen tasks, and we show how this learned prior can be used for rapidly learning new tasks.
COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning
Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine
Conference on Robot Learning (CoRL), 2020
Reinforcement learning methods typically involve collecting a large amount of data for every new task. Since the amount of data we can collect for any single task is limited due to time and cost considerations in the real-world, the learned behavior is usually quite narrow. In this paper, we propose an approach to incorporate a large amount of prior data, either from previously solved tasks or from unsupervised or undirected environment interaction, to extend and generalize learned behavior. This prior data is not specific to any one task, and can be used to extend a variety of downstream skills. We train end-to-end policies, mapping high-dimensional image observations to low-level robot control commands, and present results in both simulated and real world domains. Our hardest experimental setting involves composing four robotic skills in a row: picking, placing, drawer opening, and grasping, where a +1/0 sparse re-ward is provided only on task completion.
Scalable Multi-Task Imitation Learning with Autonomous Improvement
Avi Singh, Eric Jang, Alexander Irpan, Daniel Kappler, Murtaza Dalal, Sergey Levine, Mohi Khansari, Chelsea Finn
International Conference on Robotics and Automation (ICRA), 2020
While robot learning has demonstrated promising results for enabling robots to automatically acquire new skills, a critical challenge in deploying learning-based systems is scale: acquiring enough data for the robot to effectively generalize broadly. Imitation learning, in particular, has remained a stable and powerful approach for robot learning, but critically relies on expert operators for data collection. In this work, we target this challenge, aiming to build an imitation learning system that can continuously improve through autonomous data collection, while simultaneously avoiding the explicit use of reinforcement learning, to maintain the stability, simplicity, and scalability of supervised imitation. We utilize the insight that, in a multi-task setting, a failed attempt at one task might represent a successful attempt at another task. This allows us to leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
The Ingredients of Real World Robotic Reinforcement Learning
Henry Zhu*, Justin Yu*, Abhishek Gupta*, Dhruv Shah, Kristian Hartikainen, Avi Singh, Vikash Kumar, Sergey Levine
International Conference on Learning Representations (ICLR), 2020
The success of reinforcement learning for real world robotics has been, in many cases limited to instrumented laboratory scenarios, often requiring arduous human effort and oversight to enable continuous learning. In this work, we discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world.
End-to-End Robotic Reinforcement Learning without Reward Engineering
Avi Singh, Larry Yang, Kristian Hartikainen, Chelsea Finn, Sergey Levine
Robotics: Science and Systems (RSS), 2019
In this paper, we propose an approach for removing the need for manual engineering of reward specifications by enabling a robot to learn from a modest number of examples of successful outcomes, followed by actively solicited queries, where the robot shows the user a state and asks for a label to determine whether that state represents successful completion of the task. While requesting labels for every single state would amount to asking the user to manually provide the reward signal, our method requires labels for only a tiny fraction of the states seen during training, making it an efficient and practical approach for learning skills without manually engineered rewards. We evaluate our method on real-world robotic manipulation tasks where the observations consist of images viewed by the robot's camera. In our experiments, our method effectively learns to arrange objects, place books, and drape cloth, directly from images and without any manually specified reward functions, and with only 1-4 hours of interaction with the real world.
Variational Inverse Control with Events: A General Framework for Data-Driven Reward Definition
Justin Fu*, Avi Singh*, Dibya Ghosh, Larry Yang, Sergey Levine
Neural Information Processing Systems (NeurIPS), 2018
The design of a reward function often poses a major practical challenge to real-world applications of reinforcement learning. Approaches such as inverse reinforcement learning attempt to overcome this challenge, but require expert demonstrations, which can be difficult or expensive to obtain in practice. We propose variational inverse control with events (VICE), which generalizes inverse reinforcement learning methods to cases where full demonstrations are not needed, such as when only samples of desired goal states are available. Our method is grounded in an alternative perspective on control and reinforcement learning, where an agent's goal is to maximize the probability that one or more events will happen at some point in the future, rather than maximizing cumulative rewards. We demonstrate the effectiveness of our methods on continuous control tasks, with a focus on high-dimensional observations like images where rewards are hard or even impossible to specify.
Few-Shot Goal Inference for Visuomotor Learning and Planning
Annie Xie, Avi Singh, Sergey Levine, Chelsea Finn
Conference on Robot Learning (CoRL), 2018
We aim to find a more general and scalable solution for specifying goals for robot learning in unconstrained environments. To that end, we formulate the few-shot objective learning problem, where the goal is to learn a task objective from only a few example images of successful end states for that task. We propose a simple solution to this problem: meta-learn a classifier that can recognize new goals from a few examples. We show how this approach can be used with both model-free reinforcement learning and visual model-based planning, for manipulating ropes from images in simulation and moving objects into user-specified configurations on a real robot.
Divide-and-Conquer Reinforcement Learning
Dibya Ghosh, Avi Singh, Aravind Rajeswaran, Vikash Kumar, Sergey Levine
International Conference on Learning Representations (ICLR), 2018
Standard model-free deep reinforcement learning (RL) algorithms sample a new initial state for each trial, allowing them to optimize policies that can perform well even in highly stochastic environments. However, problems that exhibit considerable initial state variation typically produce high-variance gradient estimates for model-free RL, making direct policy or value function optimization challenging. In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice. The ensemble is gradually unified into a single policy that can succeed on the whole state space. This approach, which we term divide-and-conquer RL, is able to solve complex tasks where conventional deep RL methods are ineffective.
GPLAC: Generalizing Vision-Based Robotic Skills using Weakly Labeled Images
Avi Singh, Larry Yang, Sergey Levine
International Conference on Computer Vision (ICCV), 2017
We tackle the problem of learning robotic sensorimotor control policies that can generalize to visually diverse and unseen environments. Achieving broad generalization typically requires large datasets, which are difficult to obtain for task-specific interactive processes such as reinforcement learning or learning from demonstration. However, much of the visual diversity in the world can be captured through passively collected datasets of images or videos. In our method, which we refer to as GPLAC (Generalized Policy Learning with Attentional Classifier), we use both interaction data and weakly labeled image data to augment the generalization capacity of sensorimotor policies.
Abhishek Das, Satwik Kottur, Khushi Gupta, Avi Singh, Deshraj Yadav, José M. F. Moura, Devi Parikh, Dhruv Batra
Computer Vision and Pattern Recognition (CVPR), 2017 (Spotlight)
We introduce the task of Visual Dialog, which requires an AI agent to hold a meaningful dialog with humans in natural, conversational language about visual content. Specifically, given an image, a dialog history, and a question about the image, the agent has to ground the question in image, infer context from history, and answer the question accurately. Visual Dialog is disentangled enough from a specific downstream task so as to serve as a general test of machine intelligence, while being grounded in vision enough to allow objective evaluation of individual responses and benchmark progress.
Recurrent Neural Networks for Driver Activity Anticipation via Sensory-Fusion Architecture
Ashesh Jain, Avi Singh, Hema S Koppula, Shane Soh, Ashutosh Saxena
International Conference on Robotics and Automation (ICRA), 2016
Anticipating the future actions of a human is a widely studied problem in robotics that requires spatio-temporal reasoning. In this work we propose a deep learning approach for anticipation in sensory-rich robotics applications. We introduce a sensory-fusion architecture which jointly learns to anticipate and fuse information from multiple sensory streams. Our approach shows significant improvement over the state-of-the-art in maneuver anticipation by increasing the precision from 77.4% to 90.5% and recall from 71.2% to 87.4%.
Brain4Cars: Car That Knows Before You Do via Sensory-Fusion Deep Learning Architecture
Ashesh Jain, Hema S Koppula, Shane Soh, Bharad Raghavan, Avi Singh, Ashutosh Saxena
International Journal on Robotics Research (IJRR), 2016
In this work we propose a vehicular sensor-rich platform and learning algorithms for maneuver anticipation. For this purpose we equip a car with cameras, Global Positioning System (GPS), and a computing device to capture the driving context from both inside and outside of the car. In order to anticipate maneuvers, we propose a sensory-fusion deep learning architecture which jointly learns to anticipate and fuse multiple sensory streams.