I am a postdoctoral fellow in the ACDS Lab at Georgia Tech, supervised by Evangelos Theodorou. My research focuses on using principles from optimal control theory and statistical mechanics to develop new machine learning algorithms with improved generalization capabilities in domains such as generative AI and imitation learning.
I received my Ph.D. from Princeton University in 2023, where I conducted my dissertation research in the IRoM Lab, advised by Anirudha Majumdar. My graduate research explored what kind, and how much, sensory information a robot should use to achieve a task, as well as the fundamental limits of performance afforded by a robot's sensors. Answering these questions theoretically and empirically required designing and analyzing stochastic optimal control algorithms using a wide variety of tools, including information theory, Bayesian inference, differential privacy, and statistical mechanics.
Feedback Schrödinger Bridge Matching
Panagiotis Theodoropoulos, Nikolaos Komianos, Vincent Pacelli, and 2 more authors
In Proc. Intl. Conf. on Learning Representations, 2025
Recent advancements in diffusion bridges for distribution transport problems have heavily relied on matching frameworks, yet existing methods often face a trade-off between scalability and access to optimal pairings during training. Fully unsupervised methods make minimal assumptions but incur high computational costs, limiting their practicality. On the other hand, imposing full supervision of the matching process with optimal pairings improves scalability; however, it is infeasible in most applications. To strike a balance between scalability and minimal supervision, we introduce Feedback Schrödinger Bridge Matching (FSBM), a novel semi-supervised matching framework that incorporates a small portion of pre-aligned pairs as state feedback to guide the transport map of non-coupled samples, thereby significantly improving efficiency. This is achieved by formulating a static Entropic Optimal Transport (EOT) problem with an additional term capturing the semi-supervised guidance. The generalized EOT objective is then recast into a dynamic formulation to leverage the scalability of matching frameworks. Extensive experiments demonstrate that FSBM accelerates training and enhances generalization by leveraging coupled pairs’ guidance, opening new avenues for training matching frameworks with partially aligned datasets.
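The static EOT subproblem at the heart of such matching frameworks can be illustrated with a toy Sinkhorn solver. In the sketch below, the guidance term (a simple cost discount on known pairs) is only a stand-in for the paper's feedback term, and all names and numbers are illustrative:

```python
import numpy as np

def sinkhorn(C, eps=0.05, iters=500):
    # Entropy-regularized OT between uniform marginals with cost matrix C.
    n, m = C.shape
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    K = np.exp(-C / eps)
    u = np.ones(n)
    for _ in range(iters):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # coupling P; rows sum to a, cols to b

# Quadratic cost between two 1-D point clouds.
x = np.linspace(0.0, 1.0, 8)
y = np.linspace(0.0, 1.0, 8)
C = (x[:, None] - y[None, :]) ** 2

# Semi-supervised guidance (illustrative): discount the cost of a few
# pre-aligned pairs so the coupling concentrates mass on them.
aligned = [(0, 0), (4, 4), (7, 7)]
C_guided = C.copy()
for i, j in aligned:
    C_guided[i, j] -= 0.5

P = sinkhorn(C_guided)
```

The actual method works with the dynamic (bridge) formulation of this objective rather than a discrete coupling, but the role of the pre-aligned pairs is analogous.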
@inproceedings{Theodoropoulos25,title={Feedback Schr{\"o}dinger Bridge Matching},author={Theodoropoulos, Panagiotis and Komianos, Nikolaos and Pacelli, Vincent and Liu, Guan-Horng and Theodorou, Evangelos A.},booktitle={Proc. Intl. Conf. on Learning Representations},year={2025},url={https://openreview.net/forum?id=k3tbMMW8rH},doi={10.48550/arXiv.2410.14055}}
Deep Distributed Optimization for Large-Scale Quadratic Programming
Augustinos D. Saravanos, Hunter Kuperman, Alex Oshin, and 3 more authors
In Proc. Intl. Conf. on Learning Representations, 2025
Quadratic programming (QP) forms a crucial foundation in optimization, encompassing a broad spectrum of domains and serving as the basis for more advanced algorithms. Consequently, as the scale and complexity of modern applications continue to grow, the development of efficient and reliable QP algorithms is becoming increasingly vital. In this context, this paper introduces a novel deep learning-aided distributed optimization architecture designed for tackling large-scale QP problems. First, we combine the state-of-the-art Operator Splitting QP (OSQP) method with a consensus approach to derive DistributedQP, a new method tailored for network-structured problems, with convergence guarantees to optimality. Subsequently, we unfold this optimizer into a deep learning framework, leading to DeepDistributedQP, which leverages learned policies to accelerate convergence to the desired accuracy within a restricted number of iterations. Our approach is also theoretically grounded through Probably Approximately Correct (PAC)-Bayes theory, providing generalization bounds on the expected optimality gap for unseen problems. Both the proposed framework and its centralized version, DeepQP, significantly outperform their standard optimization counterparts on a variety of tasks such as randomly generated problems, optimal control, linear regression, transportation networks and others. Notably, DeepDistributedQP demonstrates strong generalization by training on small problems and scaling to solve much larger ones (up to 50K variables and 150K constraints) using the same policy. Moreover, it achieves orders-of-magnitude improvements in wall-clock time compared to OSQP. The certifiable performance guarantees of our approach are also demonstrated, ensuring higher-quality solutions over traditional optimizers.
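The OSQP-style operator-splitting iteration that DistributedQP builds on can be sketched in a few lines. This is a minimal single-agent version with fixed penalty parameters; the consensus splitting, learned policies, and PAC-Bayes analysis are beyond this sketch:

```python
import numpy as np

def admm_qp(P, q, A, l, u, rho=1.0, sigma=1e-6, iters=500):
    # Minimize 0.5 x'Px + q'x  s.t.  l <= Ax <= u  via OSQP-style ADMM.
    n, m = P.shape[0], A.shape[0]
    x, z, y = np.zeros(n), np.zeros(m), np.zeros(m)
    M = P + sigma * np.eye(n) + rho * A.T @ A      # condensed system matrix
    for _ in range(iters):
        rhs = sigma * x - q + A.T @ (rho * z - y)
        x = np.linalg.solve(M, rhs)                # x-update (linear solve)
        z = np.clip(A @ x + y / rho, l, u)         # project onto [l, u]
        y = y + rho * (A @ x - z)                  # dual ascent step
    return x

# A small QP instance (the kind of subproblem batched at scale in the paper).
P = np.array([[4.0, 1.0], [1.0, 2.0]])
q = np.array([1.0, 1.0])
A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
l = np.array([1.0, 0.0, 0.0])
u = np.array([1.0, 0.7, 0.7])
x = admm_qp(P, q, A, l, u)   # converges to approximately [0.3, 0.7]
```

The unfolding idea replaces the fixed `rho` and `sigma` above with iteration-dependent values produced by a learned policy, which is what the deep network in DeepDistributedQP supplies.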
@inproceedings{Saravanos25,title={Deep Distributed Optimization for Large-Scale Quadratic Programming},author={Saravanos, Augustinos D. and Kuperman, Hunter and Oshin, Alex and Abdul, Arshiya Taj and Pacelli, Vincent and Theodorou, Evangelos},booktitle={Proc. Intl. Conf. on Learning Representations},year={2025},url={https://openreview.net/forum?id=hzuumhfYSO},doi={10.48550/arXiv.2412.12156}}
Fundamental Limits for Sensor-Based Robot Control
Anirudha Majumdar, Zhiting Mei, and Vincent Pacelli
Intl. J. of Robotics Research, 2023
Our goal is to develop theory and algorithms for establishing fundamental limits on performance imposed by a robot’s sensors for a given task. In order to achieve this, we define a quantity that captures the amount of task-relevant information provided by a sensor. Using a novel version of the generalized Fano’s inequality from information theory, we demonstrate that this quantity provides an upper bound on the highest achievable expected reward for one-step decision-making tasks. We then extend this bound to multi-step problems via a dynamic programming approach. We present algorithms for numerically computing the resulting bounds, and demonstrate our approach on three examples: (i) the lava problem from the literature on partially observable Markov decision processes, (ii) an example with continuous state and observation spaces corresponding to a robot catching a freely-falling object, and (iii) obstacle avoidance using a depth sensor with non-Gaussian noise. We demonstrate the ability of our approach to establish strong limits on achievable performance for these problems by comparing our upper bounds with achievable lower bounds (computed by synthesizing or learning concrete control policies).
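For a small discrete problem, the relevant quantities are easy to compute exactly. The sketch below (toy numbers, not from the paper) computes the task-relevant information I(S;O) provided by a noisy sensor and the exact optimal one-step expected reward, i.e., the quantity the generalized-Fano bound upper-bounds:

```python
import numpy as np

# Toy one-step decision problem with two states, observations, and actions.
p_s = np.array([0.5, 0.5])              # prior over states
p_o_given_s = np.array([[0.8, 0.2],     # sensor channel p(o | s)
                        [0.2, 0.8]])
R = np.array([[1.0, 0.0],               # reward R[s, a]
              [0.0, 1.0]])

joint = p_s[:, None] * p_o_given_s      # p(s, o)
p_o = joint.sum(axis=0)

# Exact optimal expected one-step reward: best action per observation.
best_reward = sum(max(joint[:, o] @ R[:, a] for a in range(R.shape[1]))
                  for o in range(joint.shape[1]))

# Task-relevant information I(S; O) in nats.
mi = np.sum(joint * np.log(joint / (p_s[:, None] * p_o[None, :])))
```

For continuous problems like the catching and obstacle-avoidance examples, these quantities are no longer computable in closed form, which is where the paper's numerical bounding algorithms come in.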
@article{Majumdar23,title={Fundamental Limits for Sensor-Based Robot Control},author={Majumdar, Anirudha and Mei, Zhiting and Pacelli, Vincent},journal={Intl. J. of Robotics Research},volume={42},number={12},pages={1051--1069},year={2023},publisher={SAGE},url={https://journals.sagepub.com/doi/full/10.1177/02783649231190947},doi={10.1177/02783649231190947}}
Robust Control Under Uncertainty via Bounded Rationality and Differential Privacy
Vincent Pacelli and Anirudha Majumdar
In Proc. Intl. Conf. on Robotics and Automation, 2022
The rapid development of affordable and compact high-fidelity sensors (e.g., cameras and LIDAR) allows robots to construct detailed estimates of their states and environments. However, the availability of such rich sensor information introduces two challenges: (i) the lack of analytic sensing models, which makes it difficult to design controllers that are robust to sensor failures, and (ii) the computational expense of processing the high-dimensional sensor information in real time. This paper addresses these challenges using the theory of differential privacy, which allows us to (i) design controllers with bounded sensitivity to errors in state estimates, and (ii) bound the amount of state information used for control (i.e., to impose decision-making under bounded rationality). The resulting framework approximates the separation principle and allows us to derive an upper-bound on the cost incurred with a faulty state estimator in terms of three quantities: the cost incurred using a perfect state estimator, the magnitude of state estimation errors, and the level of differential privacy. We demonstrate the efficacy of our framework numerically on different robotics problems, including nonlinear system stabilization and motion planning.
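A minimal sketch of the core mechanism, assuming a Gaussian mechanism applied to the state estimate and a pre-computed stabilizing feedback gain (all gains and privacy parameters below are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)

def privatize(x_hat, sensitivity, epsilon, delta):
    # Gaussian mechanism: noise scale calibrated for (epsilon, delta)-DP
    # over estimates differing by at most `sensitivity` in l2 norm.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return x_hat + rng.normal(0.0, sigma, size=x_hat.shape)

# Toy double-integrator stabilization with a fixed stabilizing gain.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
K = np.array([[3.16, 2.77]])     # e.g., from an offline LQR design

x = np.array([1.0, 0.0])
for _ in range(200):
    x_priv = privatize(x, sensitivity=0.1, epsilon=1.0, delta=1e-3)
    u = -K @ x_priv              # controller only sees the private estimate
    x = A @ x + (B @ u).ravel()
```

Because the controller's output is insensitive to perturbations of the estimate (the differential-privacy property), the closed loop degrades gracefully as the injected noise grows, which is the behavior the paper's cost bound quantifies.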
@inproceedings{Pacelli22,title={Robust Control Under Uncertainty via Bounded Rationality and Differential Privacy},author={Pacelli, Vincent and Majumdar, Anirudha},booktitle={Proc. Intl. Conf. on Robotics and Automation},pages={3467--3474},year={2022},organization={IEEE},url={https://ieeexplore.ieee.org/abstract/document/9811557},doi={10.1109/icra46639.2022.9811557}}
Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning
Anoopkumar Sonar, Vincent Pacelli, and Anirudha Majumdar
In Proc. Conf. on Learning for Dynamics and Control, 2021
A fundamental challenge in reinforcement learning is to learn policies that generalize beyond the operating domains experienced during training. In this paper, we approach this challenge through the following invariance principle: an agent must find a representation such that there exists an action-predictor built on top of this representation that is simultaneously optimal across all training domains. Intuitively, the resulting invariant policy enhances generalization by finding causes of successful actions. We propose a novel learning algorithm, Invariant Policy Optimization (IPO), that implements this principle and learns an invariant policy during training. We compare our approach with standard policy gradient methods and demonstrate significant improvements in generalization performance on unseen domains for linear quadratic regulator and grid-world problems, and an example where a robot must learn to open doors with varying physical properties.
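The invariance principle can be illustrated with an IRM-style penalty on a linear regression representation. This is a toy sketch of the principle only, not the IPO algorithm itself, which the paper develops for policy optimization:

```python
import numpy as np

def env_risk_grad(phi, xs, ys):
    # Squared-error risk of the predictor w * (phi . x) at w = 1, and its
    # gradient in the scalar dummy predictor w (the invariance penalty term).
    preds = xs @ phi
    risk = np.mean((preds - ys) ** 2)
    grad_w = np.mean(2.0 * (preds - ys) * preds)   # d(risk)/dw at w = 1
    return risk, grad_w

def invariance_objective(phi, envs, lam=10.0):
    # Sum of per-environment risks plus a penalty that vanishes only when
    # phi is simultaneously optimal across all training environments.
    total = 0.0
    for xs, ys in envs:
        risk, g = env_risk_grad(phi, xs, ys)
        total += risk + lam * g ** 2
    return total

rng = np.random.default_rng(0)

def make_env(corr, n=200):
    y = rng.normal(size=n)
    x_inv = y + 0.1 * rng.normal(size=n)            # invariant feature
    x_spu = corr * y + 0.1 * rng.normal(size=n)     # env-dependent feature
    return np.stack([x_inv, x_spu], axis=1), y

envs = [make_env(0.9), make_env(-0.9)]
obj_invariant = invariance_objective(np.array([1.0, 0.0]), envs)
obj_spurious = invariance_objective(np.array([0.0, 1.0]), envs)
```

The representation using the invariant feature scores much better than the one using the environment-dependent feature, mirroring how the invariant policy avoids latching onto spurious correlates of successful actions.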
@inproceedings{Sonar2021,title={Invariant Policy Optimization: Towards Stronger Generalization in Reinforcement Learning},author={Sonar, Anoopkumar and Pacelli, Vincent and Majumdar, Anirudha},booktitle={Proc. Conf. on Learning for Dynamics and Control},pages={21--33},year={2021},organization={PMLR},url={https://proceedings.mlr.press/v144/sonar21a.html},doi={10.48550/arXiv.2006.01096}}