Abstract
This study investigates adaptive parameter optimization techniques for Reinforcement Learning-based Apache Spark job scheduling. Traditional Reinforcement Learning-based scheduling approaches are limited by fixed hyperparameter configurations, which require extensive manual tuning and often fail to adapt optimally to diverse workload characteristics. The research develops and evaluates adaptive mechanisms that enhance the effectiveness of Proximal Policy Optimization (PPO) through dynamic parameter adjustment. Four novel adaptive approaches are proposed: adaptive clipping, which dynamically adjusts policy update constraints based on Kullback-Leibler divergence feedback; adaptive learning rate mechanisms, which modulate optimization step sizes according to training progress; a combined approach that leverages both techniques simultaneously; and enhanced Generalized Advantage Estimation for improved value function approximation.
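As a minimal illustration of the first mechanism, the sketch below adjusts the PPO clipping range from observed Kullback-Leibler divergence feedback; the function name, thresholds, and bounds are illustrative assumptions rather than the implementation evaluated in this study.

```python
def adapt_clip_range(clip_eps, observed_kl, kl_target=0.01,
                     scale=1.5, eps_min=0.05, eps_max=0.3):
    # Hypothetical feedback rule: tighten the PPO clipping range when the
    # policy update overshoots the KL-divergence target, and relax it when
    # updates are overly conservative. All constants are illustrative.
    if observed_kl > 2.0 * kl_target:
        clip_eps = max(eps_min, clip_eps / scale)
    elif observed_kl < 0.5 * kl_target:
        clip_eps = min(eps_max, clip_eps * scale)
    return clip_eps
```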
The experimental evaluation is conducted within a comprehensive discrete-event simulator that accurately models Apache Spark execution semantics. The proposed mechanisms are tested on Transaction Processing Performance Council Benchmark H (TPC-H) workloads across multiple random seeds to ensure statistical rigor and reproducibility. The adaptive mechanisms are formulated under the assumptions of policy gradient optimization theory and incorporate feedback-based parameter adjustment strategies. Representative scheduling problems are considered, and the solutions obtained with the adaptive mechanisms are compared with those achieved by the fixed-parameter baseline implementation. The results reveal that, with proper adaptive parameter adjustment, the proposed mechanisms may prove advantageous over traditional fixed-parameter approaches in terms of convergence stability, exploration effectiveness, and optimization quality.