Title:
Federated Methods to Speedup Reinforcement Learning
Abstract: Reinforcement learning (RL) is a sequential decision-making paradigm in which an agent learns to accomplish certain tasks by interacting with the environment. RL algorithms are known to be data-intensive and require large amounts of data to train. One way to accelerate the learning of RL algorithms is to employ multiple agents to collect data. Furthermore, in certain applications such as medical settings, the local agents' data may be sensitive and must be kept private. In this talk, we consider a federated RL framework where multiple agents collaboratively learn a global model without sharing their sensitive individual data and policies. Although having N agents enables the sampling of N times more data, it is not clear whether this leads to a proportional speedup in convergence. We consider federated versions of on-policy TD, off-policy TD, and Q-learning, and establish that the speedup in learning is linear in the number of agents. In particular, we show this even in the presence of Markovian noise and multiple local updates. We do this by developing a federated stochastic approximation algorithm with Markovian noise (FedSAM) and establishing linear speedup under a very general framework.
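To make the federated pattern in the abstract concrete, here is a minimal sketch of federated on-policy TD(0): N agents each perform K local TD updates on their own Markovian trajectory, then a server averages their value estimates, FedAvg-style. The 3-state Markov reward process, step size, and synchronization schedule are illustrative assumptions, not details from the talk or from FedSAM itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 3-state Markov reward process (assumed for this sketch):
# transition matrix P, reward vector r, discount factor gamma.
P = np.array([[0.5, 0.5, 0.0],
              [0.1, 0.6, 0.3],
              [0.2, 0.2, 0.6]])
r = np.array([1.0, 0.0, 2.0])
gamma = 0.9

N = 8          # number of agents
K = 10         # local TD updates between synchronizations
rounds = 500   # communication rounds
alpha = 0.05   # constant step size (assumed)

V = np.zeros(3)                       # shared global value estimate
states = rng.integers(0, 3, size=N)   # each agent's current state

for _ in range(rounds):
    # Each agent starts the round from the current global model.
    local = np.tile(V, (N, 1))
    for i in range(N):
        s = states[i]
        for _ in range(K):            # K local TD(0) updates on Markovian data
            s_next = rng.choice(3, p=P[s])
            td_err = r[s] + gamma * local[i, s_next] - local[i, s]
            local[i, s] += alpha * td_err
            s = s_next
        states[i] = s                 # trajectories persist across rounds
    V = local.mean(axis=0)            # server averages the local models

# Ground truth for comparison: V* = (I - gamma * P)^{-1} r
V_true = np.linalg.solve(np.eye(3) - gamma * P, r)
print("max error vs. V*:", np.max(np.abs(V - V_true)))
```

Averaging over the N agents reduces the variance of the iterates, which is the mechanism behind the linear speedup the talk establishes; the analysis in the talk additionally handles the bias introduced by Markovian sampling and multiple local updates.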
Bio: Sajad Khodadadian is a fifth-year Operations Research PhD student in the School of Industrial and Systems Engineering at Georgia Tech, working with Prof. Siva Theja Maguluri. Prior to Georgia Tech, he received his B.Sc. degree in Electrical Engineering and Physics from Sharif University of Technology. His research is on theoretical foundations and algorithm design for reinforcement learning in both single-agent and multi-agent settings. He is the recipient of the "Margaret and Stephen Kendrick Research Excellence" award from ISyE, the ARC-ACO and ML@GT fellowships from Georgia Tech, and a bronze medal in the International Physics Olympiad (IPhO 2012).