I am a postdoctoral fellow in the ACDS Lab at Georgia Tech, supervised by Evangelos Theodorou. My research focuses on using principles from optimal control theory and statistical mechanics to develop new machine learning algorithms with improved generalization capabilities in domains such as generative AI and imitation learning.
I received my Ph.D. from Princeton University in 2023, where I conducted my dissertation research in the IRoM Lab, advised by Anirudha Majumdar. My graduate research explored what kind, and how much, sensory information a robot should use to achieve a task, as well as the fundamental limits of performance afforded by a robot's sensors. Answering these questions theoretically and empirically required designing and analyzing stochastic optimal control algorithms using a wide variety of tools, including information theory, Bayesian inference, differential privacy, and statistical mechanics.
Feedback Schrödinger Bridge Matching
Panagiotis Theodoropoulos, Nikolaos Komianos, Vincent Pacelli, and 2 more authors
In Proc. Intl. Conf. on Learning Representations, 2025
Recent advancements in diffusion bridges for distribution transport problems have heavily relied on matching frameworks, yet existing methods often face a trade-off between scalability and access to optimal pairings during training. Fully unsupervised methods make minimal assumptions but incur high computational costs, limiting their practicality. On the other hand, imposing full supervision of the matching process with optimal pairings improves scalability; however, it is infeasible in most applications. To strike a balance between scalability and minimal supervision, we introduce Feedback Schrödinger Bridge Matching (FSBM), a novel semi-supervised matching framework that incorporates a small portion of pre-aligned pairs as state feedback to guide the transport map of non-coupled samples, thereby significantly improving efficiency. This is achieved by formulating a static Entropic Optimal Transport (EOT) problem with an additional term capturing the semi-supervised guidance. The generalized EOT objective is then recast into a dynamic formulation to leverage the scalability of matching frameworks. Extensive experiments demonstrate that FSBM accelerates training and enhances generalization by leveraging coupled pairs’ guidance, opening new avenues for training matching frameworks with partially aligned datasets.
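The static EOT subproblem at the heart of such matching frameworks can be illustrated with a toy Sinkhorn solver. In the sketch below, the guidance term (a simple cost discount on known pairs) is only a stand-in for the paper's feedback term, and all names and numbers are illustrative:

```python
import numpy as np

def sinkhorn(C, eps=0.05, iters=500):
    # Entropy-regularized OT between uniform marginals with cost matrix C.
    n, m = C.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-C / eps)
    u = np.ones(n)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # coupling P; rows sum to a, cols to b

# Quadratic cost between two 1-D point clouds.
x = np.linspace(0.0, 1.0, 8)
y = np.linspace(0.0, 1.0, 8)
C = (x[:, None] - y[None, :]) ** 2

# Semi-supervised guidance (illustrative): discount the cost of a few
# pre-aligned pairs so the coupling concentrates mass on them.
aligned = [(0, 0), (4, 4), (7, 7)]
C_guided = C.copy()
for i, j in aligned:
    C_guided[i, j] -= 0.5

P = sinkhorn(C_guided)
```

The actual method works with the dynamic (bridge) formulation of this objective rather than a discrete coupling, but the role of the pre-aligned pairs is analogous.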
@inproceedings{Theodoropoulos25,title={Feedback Schr{\"o}dinger Bridge Matching},author={Theodoropoulos, Panagiotis and Komianos, Nikolaos and Pacelli, Vincent and Liu, Guan-Horng and Theodorou, Evangelos A.},booktitle={Proc. Intl. Conf. on Learning Representations},year={2025},url={https://openreview.net/forum?id=k3tbMMW8rH},doi={10.48550/arXiv.2410.14055}}
Deep Distributed Optimization for Large-Scale Quadratic Programming
Augustinos D. Saravanos, Hunter Kuperman, Alex Oshin, and 3 more authors
In Proc. Intl. Conf. on Learning Representations, 2025
Quadratic programming (QP) forms a crucial foundation in optimization, encompassing a broad spectrum of domains and serving as the basis for more advanced algorithms. Consequently, as the scale and complexity of modern applications continue to grow, the development of efficient and reliable QP algorithms is becoming increasingly vital. In this context, this paper introduces a novel deep learning-aided distributed optimization architecture designed for tackling large-scale QP problems. First, we combine the state-of-the-art Operator Splitting QP (OSQP) method with a consensus approach to derive DistributedQP, a new method tailored for network-structured problems, with convergence guarantees to optimality. Subsequently, we unfold this optimizer into a deep learning framework, leading to DeepDistributedQP, which leverages learned policies to accelerate convergence to the desired accuracy within a restricted number of iterations. Our approach is also theoretically grounded through Probably Approximately Correct (PAC)-Bayes theory, providing generalization bounds on the expected optimality gap for unseen problems. Both the proposed framework and its centralized version, DeepQP, significantly outperform their standard optimization counterparts on a variety of tasks such as randomly generated problems, optimal control, linear regression, transportation networks and others. Notably, DeepDistributedQP demonstrates strong generalization by training on small problems and scaling to solve much larger ones (up to 50K variables and 150K constraints) using the same policy. Moreover, it achieves orders-of-magnitude improvements in wall-clock time compared to OSQP. The certifiable performance guarantees of our approach are also demonstrated, ensuring higher-quality solutions over traditional optimizers.
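The OSQP-style operator-splitting iteration that DistributedQP builds on can be sketched in a few lines. This is a minimal single-agent version with fixed penalty parameters; the consensus splitting, learned policies, and PAC-Bayes analysis are beyond this sketch:

```python
import numpy as np

def admm_qp(P, q, A, l, u, rho=1.0, sigma=1e-6, iters=500):
    # Minimize 0.5 x'Px + q'x  s.t.  l <= Ax <= u  via OSQP-style ADMM.
    n, m = P.shape[0], A.shape[0]
    x, z, y = np.zeros(n), np.zeros(m), np.zeros(m)
    M = P + sigma * np.eye(n) + rho * A.T @ A      # condensed system matrix
    for _ in range(iters):
        rhs = sigma * x - q + A.T @ (rho * z - y)
        x = np.linalg.solve(M, rhs)                # x-update (linear solve)
        z = np.clip(A @ x + y / rho, l, u)         # project onto [l, u]
        y = y + rho * (A @ x - z)                  # dual ascent step
    return x

# A small QP instance (the kind of subproblem batched at scale in the paper).
P = np.array([[4.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
l = np.array([1.0, 0.0, 0.0])
u = np.array([1.0, 0.7, 0.7])
x = admm_qp(P, q, A, l, u)   # converges to approximately [0.3, 0.7]
```

The unfolding idea replaces the fixed `rho` and `sigma` above with iteration-dependent values produced by a learned policy, which is what the deep network in DeepDistributedQP supplies.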
@inproceedings{Saravanos25,title={Deep Distributed Optimization for Large-Scale Quadratic Programming},author={Saravanos, Augustinos D. and Kuperman, Hunter and Oshin, Alex and Abdul, Arshiya Taj and Pacelli, Vincent and Theodorou, Evangelos},booktitle={Proc. Intl. Conf. on Learning Representations},year={2025},url={https://openreview.net/forum?id=hzuumhfYSO},doi={10.48550/arXiv.2412.12156}}
Fundamental Limits for Sensor-Based Robot Control
Anirudha Majumdar, Zhiting Mei, and Vincent Pacelli
Intl. J. of Robotics Research, 2023
Our goal is to develop theory and algorithms for establishing fundamental limits on performance imposed by a robot’s sensors for a given task. In order to achieve this, we define a quantity that captures the amount of task-relevant information provided by a sensor. Using a novel version of the generalized Fano’s inequality from information theory, we demonstrate that this quantity provides an upper bound on the highest achievable expected reward for one-step decision-making tasks. We then extend this bound to multi-step problems via a dynamic programming approach. We present algorithms for numerically computing the resulting bounds, and demonstrate our approach on three examples: (i) the lava problem from the literature on partially observable Markov decision processes, (ii) an example with continuous state and observation spaces corresponding to a robot catching a freely-falling object, and (iii) obstacle avoidance using a depth sensor with non-Gaussian noise. We demonstrate the ability of our approach to establish strong limits on achievable performance for these problems by comparing our upper bounds with achievable lower bounds (computed by synthesizing or learning concrete control policies).
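For a small discrete problem, the relevant quantities are easy to compute exactly. The sketch below (toy numbers, not from the paper) computes the task-relevant information I(S;O) provided by a noisy sensor and the exact optimal one-step expected reward, i.e., the quantity the generalized-Fano bound upper-bounds:

```python
import numpy as np

# Toy one-step decision problem with two states, observations, and actions.
p_s = np.array([0.5, 0.5])              # prior over states
p_o_given_s = np.array([[0.8, 0.2],     # sensor channel p(o | s)
                        [0.2, 0.8]])
R = np.array([[1.0, 0.0],               # reward R[s, a]
              [0.0, 1.0]])

joint = p_s[:, None] * p_o_given_s      # p(s, o)
p_o = joint.sum(axis=0)

# Exact optimal expected one-step reward: best action per observation.
best_reward = sum(max(joint[:, o] @ R[:, a] for a in range(R.shape[1]))
                  for o in range(joint.shape[1]))

# Task-relevant information I(S; O) in nats.
mi = np.sum(joint * np.log(joint / (p_s[:, None] * p_o[None, :])))
```

For continuous problems like the catching and obstacle-avoidance examples, these quantities are no longer computable in closed form, which is where the paper's numerical bounding algorithms come in.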
@article{Majumdar23,title={Fundamental Limits for Sensor-Based Robot Control},author={Majumdar, Anirudha and Mei, Zhiting and Pacelli, Vincent},journal={Intl. J. of Robotics Research},volume={42},number={12},pages={1051--1069},year={2023},publisher={SAGE},url={https://journals.sagepub.com/doi/full/10.1177/02783649231190947},doi={10.1177/02783649231190947}}
Robust Control Under Uncertainty via Bounded Rationality and Differential Privacy
Vincent Pacelli and Anirudha Majumdar
In Proc. Intl. Conf. on Robotics and Automation, 2022
The rapid development of affordable and compact high-fidelity sensors (e.g., cameras and LIDAR) allows robots to construct detailed estimates of their states and environments. However, the availability of such rich sensor information introduces two challenges: (i) the lack of analytic sensing models, which makes it difficult to design controllers that are robust to sensor failures, and (ii) the computational expense of processing the high-dimensional sensor information in real time. This paper addresses these challenges using the theory of differential privacy, which allows us to (i) design controllers with bounded sensitivity to errors in state estimates, and (ii) bound the amount of state information used for control (i.e., to impose decision-making under bounded rationality). The resulting framework approximates the separation principle and allows us to derive an upper-bound on the cost incurred with a faulty state estimator in terms of three quantities: the cost incurred using a perfect state estimator, the magnitude of state estimation errors, and the level of differential privacy. We demonstrate the efficacy of our framework numerically on different robotics problems, including nonlinear system stabilization and motion planning.
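A minimal sketch of the core mechanism, assuming a Gaussian mechanism applied to the state estimate and a pre-computed stabilizing feedback gain (all gains and privacy parameters below are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize(x_hat, sensitivity, epsilon, delta):
    # Gaussian mechanism: noise scale calibrated for (epsilon, delta)-DP
    # over estimates differing by at most `sensitivity` in l2 norm.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return x_hat + rng.normal(0.0, sigma, size=x_hat.shape)

# Toy double-integrator stabilization with a fixed stabilizing gain.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
K = np.array([[3.16, 2.77]])     # e.g., from an offline LQR design

x = np.array([1.0, 0.0])
for _ in range(200):
    x_priv = privatize(x, sensitivity=0.1, epsilon=1.0, delta=1e-3)
    u = -K @ x_priv              # controller only sees the private estimate
    x = A @ x + (B @ u).ravel()
```

Because the controller's output is insensitive to perturbations of the estimate (the differential-privacy property), the closed loop degrades gracefully as the injected noise grows, which is the behavior the paper's cost bound quantifies.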
@inproceedings{Pacelli22,title={Robust Control Under Uncertainty via Bounded Rationality and Differential Privacy},author={Pacelli, Vincent and Majumdar, Anirudha},booktitle={Proc. Intl. Conf. on Robotics and Automation},pages={3467--3474},year={2022},organization={IEEE},url={https://ieeexplore.ieee.org/abstract/document/9811557},doi={10.1109/icra46639.2022.9811557}}
Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning
Anoopkumar Sonar, Vincent Pacelli, and Anirudha Majumdar
In Proc. Conf. on Learning for Dynamics and Control, 2021
A fundamental challenge in reinforcement learning is to learn policies that generalize beyond the operating domains experienced during training. In this paper, we approach this challenge through the following invariance principle: an agent must find a representation such that there exists an action-predictor built on top of this representation that is simultaneously optimal across all training domains. Intuitively, the resulting invariant policy enhances generalization by finding causes of successful actions. We propose a novel learning algorithm, Invariant Policy Optimization (IPO), that implements this principle and learns an invariant policy during training. We compare our approach with standard policy gradient methods and demonstrate significant improvements in generalization performance on unseen domains for linear quadratic regulator and grid-world problems, and an example where a robot must learn to open doors with varying physical properties.
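The invariance principle can be illustrated with an IRM-style penalty on a linear regression representation. This is a toy sketch of the principle only, not the IPO algorithm itself, which the paper develops for policy optimization:

```python
import numpy as np

def env_risk_grad(phi, xs, ys):
    # Squared-error risk of the predictor w * (phi . x) at w = 1, and its
    # gradient in the scalar dummy predictor w (the invariance penalty term).
    preds = xs @ phi
    risk = np.mean((preds - ys) ** 2)
    grad_w = np.mean(2.0 * (preds - ys) * preds)   # d(risk)/dw at w = 1
    return risk, grad_w

def invariance_objective(phi, envs, lam=10.0):
    # Sum of per-environment risks plus a penalty that vanishes only when
    # phi is simultaneously optimal across all training environments.
    total = 0.0
    for xs, ys in envs:
        risk, g = env_risk_grad(phi, xs, ys)
        total += risk + lam * g ** 2
    return total

rng = np.random.default_rng(0)

def make_env(corr, n=200):
    y = rng.normal(size=n)
    x_inv = y + 0.1 * rng.normal(size=n)            # invariant feature
    x_spu = corr * y + 0.1 * rng.normal(size=n)     # env-dependent feature
    return np.stack([x_inv, x_spu], axis=1), y

envs = [make_env(0.9), make_env(-0.9)]
obj_invariant = invariance_objective(np.array([1.0, 0.0]), envs)
obj_spurious = invariance_objective(np.array([0.0, 1.0]), envs)
```

The representation using the invariant feature scores much better than the one using the environment-dependent feature, mirroring how the invariant policy avoids latching onto spurious correlates of successful actions.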
@inproceedings{Sonar2021,title={Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning},author={Sonar, Anoopkumar and Pacelli, Vincent and Majumdar, Anirudha},booktitle={Proc. Conf. on Learning for Dynamics and Control},pages={21--33},year={2021},organization={PMLR},url={https://proceedings.mlr.press/v144/sonar21a.html},doi={10.48550/arXiv.2006.01096}}