Gabriel Mancino-Ball

Machine Learning Researcher, PhD


I am a machine learning researcher at STR, based in Boston, Massachusetts. Previously, I was a data scientist at Evonik. Broadly, my interests lie at the intersection of optimization, linear algebra, and deep learning; specifically, I work on graph deep learning, first-order optimization methods, and minimax optimization.

I completed my PhD in mathematics at Rensselaer Polytechnic Institute, where I was advised by Yangyang Xu and mentored by Jie Chen. My research focused on complexity analysis of first-order optimization methods in decentralized computing environments.

My resume is available upon request.


Publications

Variance-reduced accelerated methods for decentralized stochastic double-regularized nonconvex strongly-concave minimax problems
Gabriel Mancino-Ball and Yangyang Xu


Jointly Improving the Sample and Communication Complexities in Decentralized Stochastic Minimax Optimization
Xuan Zhang, Gabriel Mancino-Ball, Necdet Serhat Aybat, and Yangyang Xu
Proceedings of the 38th AAAI Conference on Artificial Intelligence, 2024


Proximal stochastic recursive momentum methods for nonconvex composite decentralized optimization
Gabriel Mancino-Ball, Shengnan Miao, Yangyang Xu, and Jie Chen
Proceedings of the 37th AAAI Conference on Artificial Intelligence, 2023

A decentralized primal-dual framework for non-convex smooth consensus optimization
Gabriel Mancino-Ball, Yangyang Xu, and Jie Chen
IEEE Transactions on Signal Processing, 2023

Selected Projects

A PyTorch implementation of the FastGCN method
The goal of this project was to create a PyTorch implementation of the FastGCN method. The codebase is designed for large-scale datasets (e.g., the OGB datasets) and adds new features such as mini-batch inference. All models were built from scratch to maximize learning.
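The core idea behind FastGCN is to replace the full neighborhood aggregation in each GCN layer with layer-wise importance sampling of nodes. A minimal NumPy sketch of that sampling step is below; the function name, shapes, and the choice of squared column norms as the sampling distribution are illustrative assumptions, not the project's actual API.

```python
import numpy as np

def fastgcn_layer_sample(A_hat, H, W, num_samples, rng):
    """One GCN layer evaluated with FastGCN-style importance sampling.

    A_hat: (n, n) normalized adjacency matrix.
    H:     (n, d) input node features.
    W:     (d, d_out) layer weight matrix.

    Instead of computing the full product A_hat @ H @ W, sample
    `num_samples` nodes with probability proportional to the squared
    column norms of A_hat, then rescale so the estimate of A_hat @ H
    is unbiased (a Monte Carlo approximation of the aggregation).
    """
    n = A_hat.shape[0]
    col_norms = np.linalg.norm(A_hat, axis=0) ** 2
    q = col_norms / col_norms.sum()              # importance distribution
    idx = rng.choice(n, size=num_samples, replace=True, p=q)
    scale = 1.0 / (num_samples * q[idx])         # unbiasedness rescaling
    AH = (A_hat[:, idx] * scale) @ H[idx]        # estimate of A_hat @ H
    return np.maximum(AH @ W, 0.0)               # ReLU activation
```

Because only `num_samples` rows of `H` are touched per layer, the cost of a forward pass no longer scales with the full neighborhood size, which is what makes mini-batch training and inference on OGB-scale graphs tractable.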

Decentralized training of graph convolutional networks
The goal of this project was to study the effect of decentralized (distributed) training of graph neural networks. Up to 32 GPUs performed parallel gradient computations on local data, with MPI used to propagate updates among them. This project served as a foundation for later projects requiring multi-GPU training.
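In decentralized training of this kind, each worker typically takes a gradient step on its local data and then averages its parameters with its graph neighbors via a doubly stochastic mixing matrix, rather than synchronizing through a central server. The sketch below simulates one such round in NumPy; the function names, the ring topology, and the specific mixing weights are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def ring_mixing(n):
    """Doubly stochastic mixing matrix for n workers on a ring:
    each worker keeps half its own weight and splits the rest
    between its two neighbors."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 0.5
        W[i, (i - 1) % n] = 0.25
        W[i, (i + 1) % n] = 0.25
    return W

def decentralized_sgd_step(params, grads, mixing, lr):
    """One round of decentralized SGD, simulated on stacked workers.

    params: (num_workers, dim) local parameter vectors.
    grads:  (num_workers, dim) local stochastic gradients.
    mixing: (num_workers, num_workers) doubly stochastic matrix
            encoding the communication graph.

    Each worker takes a local gradient step, then averages its
    parameters with its neighbors' (the consensus step that MPI
    performed across GPUs in the real runs).
    """
    local = params - lr * grads    # local update on each worker
    return mixing @ local          # neighbor averaging (consensus)
```

Because the mixing matrix is doubly stochastic, the consensus step preserves the average of the workers' parameters while repeated rounds drive the local copies toward agreement.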