Deep reinforcement learning for Type 1 Diabetes: Dual PPO controller for personalized insulin management

Marchetti, Alessandro; Sasso, Daniele; D'Antoni, Federico; Morandin, Francesco; Parton, Maurizio; Matarrese, Margherita Anna Grazia; Merone, Mario

doi:10.1016/j.compbiomed.2025.110147

Background: Managing blood glucose levels in Type 1 Diabetes Mellitus (T1DM) is essential to prevent complications. Traditional insulin delivery methods often require significant patient involvement, limiting automation. Reinforcement Learning (RL)-based controllers offer a promising approach for improving automated insulin administration.

Methods: We propose a Dual Proximal Policy Optimization (Dual PPO) controller for personalized insulin delivery in a hybrid closed-loop system. The controller optimizes patient-specific insulin bounds through a grid search on pre-trained models to manage both hyperglycemia and hypoglycemia. A safe-control mechanism prevents insulin administration when glucose levels drop below a predefined threshold. The system was evaluated on 10 in silico adult patients using the UVA/Padova simulator, with five-day randomized meal scenarios.

Results: The Dual PPO controller significantly improved Time in Range (TIR) (69.30% ± 1.61) compared to a single PPO model (61.69% ± 1.54). The system effectively reduced severe hyperglycemia while maintaining a low incidence of severe hypoglycemia. Unlike conventional open-loop methods such as Basal-Bolus (BBC) and Proportional–Integral–Derivative (PIDC) controllers, our system requires minimal patient interaction, eliminating the need for carbohydrate estimation.

Conclusions: The Dual PPO controller enhances personalized insulin delivery in T1DM, improving glycemic control while reducing patient burden. This approach advances precision medicine in diabetes management, with potential for future real-world applications.
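
The abstract names two mechanisms: the Time in Range metric and the safe-control rule that withholds insulin below a glucose threshold. The Python sketch below illustrates both. It is not the authors' implementation. The 70-180 mg/dL range is the commonly used consensus target and is assumed here, since the abstract does not state the range used. The 80 mg/dL suspension threshold, the dose cap, and all function and parameter names are hypothetical.

```python
from typing import Sequence

# Assumed values in mg/dL; the abstract does not specify them.
TIR_LOW = 70.0          # lower edge of the target range (consensus value)
TIR_HIGH = 180.0        # upper edge of the target range (consensus value)
SAFE_THRESHOLD = 80.0   # hypothetical glucose level below which insulin is withheld


def time_in_range(glucose: Sequence[float],
                  low: float = TIR_LOW,
                  high: float = TIR_HIGH) -> float:
    """Return the percentage of readings with low <= glucose <= high."""
    if len(glucose) == 0:
        raise ValueError("glucose trace is empty")
    in_range = sum(1 for g in glucose if low <= g <= high)
    return 100.0 * in_range / len(glucose)


def safe_dose(proposed_dose: float,
              glucose: float,
              max_dose: float,
              threshold: float = SAFE_THRESHOLD) -> float:
    """Gate a controller's proposed dose with a simple safety rule.

    Below the threshold no insulin is delivered; otherwise the dose is
    clipped to the interval [0, max_dose], where max_dose stands in for a
    patient-specific upper insulin bound.
    """
    if glucose < threshold:
        return 0.0
    return min(max(proposed_dose, 0.0), max_dose)


if __name__ == "__main__":
    trace = [60.0, 100.0, 150.0, 200.0]
    print(time_in_range(trace))        # share of readings inside 70-180 mg/dL
    print(safe_dose(1.0, 70.0, 2.0))   # glucose below threshold: dose withheld
    print(safe_dose(3.0, 120.0, 2.0))  # dose clipped to the upper bound
```

In this sketch the upper bound `max_dose` is the kind of quantity a per-patient grid search could tune. How the two PPO policies are combined is not described in the abstract and is left out here.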