ATLAS: Alibaba Dataset and Benchmark for Learning-Augmented Scheduling

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Scheduling with predictions, Dataset and benchmark, Machine learning, Learning-augmented scheduling, Non-clairvoyant scheduling
Abstract:

Learning-augmented scheduling uses ML predictions to improve decision-making under uncertainty. Many algorithms in this class have been proposed with stronger theoretical guarantees than classic methods. Translating these theoretical results into practice, however, requires an understanding of real workloads. Such an understanding is hard to develop because existing production traces either lack ground-truth processing times or are not publicly available, while synthetic benchmarks fail to capture real-world complexity. We fill this gap by introducing the Alibaba Trace for Learning-Augmented Scheduling (ATLAS), a research-ready dataset derived from the trace of Alibaba's Platform of Artificial Intelligence (PAI) cluster, a production system that processes hundreds of thousands of ML jobs per day. The ATLAS dataset has been cleaned and feature-engineered to represent the inputs and constraints of non-clairvoyant scheduling, including user tags, resource requests (CPU/GPU/memory), and job structures with ground-truth processing times. We develop a prediction benchmark that reports prediction error metrics and feature-importance analysis, and introduce a novel multi-stage ML model. We also provide a scheduling benchmark for minimizing total completion time, max-stretch, and makespan. ATLAS is a reproducible foundation for researchers to study learning-augmented scheduling on real workloads, available at https://anonymous.4open.science/r/non-clairvoyant-with-predictions-7BF8/.

Disclaimer
This report is AI-generated using large language models and WisPaper (a scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies potential overlaps and novel directions, its coverage is not exhaustive and its judgments are approximate. These results are intended to assist human reviewers and should not be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces ATLAS, a production-derived dataset for learning-augmented scheduling, along with a prediction benchmark and a multi-stage ML model. It resides in the 'Benchmarks, Datasets, and Evaluation Frameworks' leaf of the taxonomy, which contains only two papers total. This leaf is notably sparse compared to more crowded branches such as 'Cloud and Data Center Scheduling' (six papers) or 'Task Execution Time and Resource Prediction' (four papers). The scarcity of benchmark resources in this field underscores the potential value of a well-curated dataset, as most prior work has focused on algorithm design or prediction models rather than standardized evaluation infrastructure.

The taxonomy reveals that ATLAS sits at the intersection of multiple research directions. Neighboring leaves include 'Task Execution Time and Resource Prediction' (four papers on ML models for job duration forecasting) and 'Cloud and Data Center Scheduling' (six papers on system implementations). The taxonomy's scope notes clarify that benchmark work should provide empirical infrastructure rather than novel algorithms or prediction techniques. ATLAS connects to these adjacent areas by offering a testbed for evaluating both prediction models and scheduling algorithms, bridging the gap between theoretical frameworks in 'Prediction Quality and Robustness' (four papers) and practical deployment in application domains.

Among twenty-four candidates examined via limited semantic search, none were found to clearly refute any of the three contributions. The ATLAS dataset contribution examined ten candidates with zero refutable matches; the LASched benchmark similarly examined ten candidates with no overlaps; the multi-stage ML model examined four candidates, also with no refutations. This suggests that within the scope of top-K semantic matches, the work occupies a relatively uncontested niche. However, the limited search scale means that more exhaustive exploration of adjacent fields—particularly production trace datasets in cloud computing or ML workload characterization—might reveal closer prior work not captured by this analysis.

Given the sparse benchmark leaf and the absence of refutations among examined candidates, the work appears to address a recognized gap in the field's empirical infrastructure. The limited search scope (twenty-four candidates) and the taxonomy's structure suggest that while the core contributions are distinct within learning-augmented scheduling, broader literature on workload traces or ML system benchmarks may contain related efforts not fully captured here. The analysis reflects what is visible through targeted semantic search rather than exhaustive coverage of all relevant domains.

Taxonomy

- Core-task taxonomy papers: 50
- Claimed contributions: 3
- Contribution candidate papers compared: 24
- Refutable papers: 0

Research Landscape Overview

Core task: learning-augmented scheduling with machine learning predictions. This emerging field integrates predictive models into scheduling algorithms to improve performance beyond worst-case guarantees.

The taxonomy organizes research into five main branches. Theoretical Foundations and Algorithm Design explores robustness-consistency trade-offs and competitive analysis when predictions may be imperfect, as seen in works like Calibrated Predictions[1] and Speed Predictions[2]. Prediction Models and Machine Learning Techniques examines how to generate and refine forecasts of job durations, resource demands, or system states, with contributions such as Learned Weights[16] and Feature Based Jobs[30]. Application Domains and System Implementation translates these ideas into real systems: cloud scheduling (GPU Cluster Scheduling[11], Kubernetes Optimization[45]), energy management (Energy Efficient Predictions[4], Renewable Microgrid[37]), and manufacturing (Smart Manufacturing[31], Master Production Scheduling[43]). Benchmarks, Datasets, and Evaluation Frameworks provides the empirical infrastructure to assess prediction quality and algorithm performance. Finally, Routing and Hybrid Optimization Problems extends learning-augmented ideas to vehicle routing and mixed combinatorial settings, such as Routing Under Uncertainty[46].

Several active lines of work highlight key trade-offs and open questions. One thread investigates how to design algorithms that remain competitive even when predictions are noisy or adversarial, balancing trust in forecasts with worst-case safeguards (Untrusted Predictions[21], Non Clairvoyant Partial[5]). Another focuses on practical deployment in cloud and edge environments, where real-time decisions must incorporate uncertain execution times and dynamic workloads (Real Time Predictions[8], ElasticBatch[20]).
ATLAS[0] sits squarely within the Benchmarks, Datasets, and Evaluation Frameworks branch, providing standardized testbeds and metrics to compare learning-augmented schedulers. Its emphasis on reproducible evaluation complements nearby efforts like Hybrid Prediction[41], which blends multiple forecasting sources, by offering a common ground for assessing how different prediction strategies translate into scheduling gains. This positions ATLAS[0] as an enabling resource that bridges theoretical algorithm design and empirical validation across diverse application domains.
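The robustness-consistency tension described above can be made concrete with a toy single-machine example (illustrative only; not code from the paper or from any cited work): running jobs in order of predicted size minimizes total completion time when predictions are accurate, while round-robin ignores predictions entirely and is robust to arbitrary prediction error.

```python
def total_completion_spjf(true_sizes, predicted_sizes):
    """Shortest-predicted-job-first: trust the predictions fully.

    Runs jobs to completion in increasing *predicted* size and returns
    the total completion time under the *true* sizes.
    """
    order = sorted(range(len(true_sizes)), key=lambda j: predicted_sizes[j])
    t, total = 0.0, 0.0
    for j in order:
        t += true_sizes[j]
        total += t
    return total


def total_completion_round_robin(true_sizes):
    """Idealized round-robin: all unfinished jobs run at equal rates.

    Prediction-free, hence robust: its total completion time does not
    depend on prediction quality at all.
    """
    remaining = sorted(true_sizes)
    t, total, prev = 0.0, 0.0, 0.0
    n = len(remaining)
    for i, p in enumerate(remaining):
        t += (p - prev) * (n - i)  # time until the next-smallest job finishes
        total += t
        prev = p
    return total
```

With true sizes [1, 2, 3], perfect predictions give SPJF a total completion time of 10, round-robin gives 14 regardless of predictions, and fully reversed predictions degrade SPJF to 14 — the blend of the two regimes is exactly what the robustness-consistency literature studies.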

Claimed Contributions

ATLAS dataset for learning-augmented scheduling

The authors introduce ATLAS, a dataset derived from Alibaba's production PAI cluster containing over 730,000 ML jobs with complete ground-truth processing times, submit-time features, and resource profiles. The dataset is specifically engineered for non-clairvoyant scheduling research, excluding post-execution metrics to prevent data leakage.

10 retrieved papers
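The leakage constraint claimed above (submit-time features only, post-execution metrics excluded) can be sketched as a simple split over each trace record. The column names below are hypothetical placeholders, not the actual ATLAS schema:

```python
# Hypothetical submit-time columns: known when the job is enqueued,
# so a non-clairvoyant scheduler may use them as features.
SUBMIT_TIME_FEATURES = {"user_tag", "cpu_request", "gpu_request",
                        "mem_request", "submit_time"}

# Hypothetical post-execution columns: only known after the job runs,
# so they may serve as ground truth but never as model input.
POST_EXECUTION = {"processing_time", "end_time", "actual_cpu_usage"}


def split_row(row):
    """Split a trace record into (features, target), dropping any
    post-execution field from the feature side to prevent leakage."""
    features = {k: v for k, v in row.items() if k in SUBMIT_TIME_FEATURES}
    target = row.get("processing_time")
    return features, target
```

Any pipeline built on the dataset would apply such a split before training, so that the job-size predictor never sees information unavailable at submission time.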
LASched prediction and scheduling benchmark

The authors develop LASched, a standardized benchmark with two components: a prediction task that evaluates ML models for job size prediction using multiple error metrics, and a scheduling task that evaluates learning-augmented algorithms across three objectives (total completion time, max-stretch, makespan) with reproducible evaluation protocols.

10 retrieved papers
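For reference, the three scheduling objectives named above can be computed from a schedule as follows — a minimal sketch assuming a single machine, non-preemptive execution, and all jobs released at time 0; LASched's actual evaluation protocol may differ:

```python
def evaluate_schedule(proc_times, order):
    """Evaluate a single-machine, non-preemptive schedule.

    proc_times: true processing time of each job (release times taken as 0).
    order: permutation of job indices giving the execution sequence.
    Returns (total completion time, max-stretch, makespan).
    """
    t = 0.0
    completion = [0.0] * len(proc_times)
    for j in order:
        t += proc_times[j]
        completion[j] = t
    total_completion = sum(completion)
    # A job's stretch is its flow time divided by its size;
    # with release times 0 this is simply C_j / p_j.
    max_stretch = max(c / p for c, p in zip(completion, proc_times))
    makespan = t  # time the last job finishes
    return total_completion, max_stretch, makespan
```

For example, with sizes [3, 1, 2] scheduled shortest-first, the completion times are 1, 3, 6, giving total completion time 10, max-stretch 2 (the size-3 job finishes at time 6), and makespan 6.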
Novel multi-stage ML prediction model

The authors propose a multi-stage prediction approach that combines classification-first routing with specialized regressors and validation-based calibration methods (including conformal quantile regression, isotonic calibration, and meta-stacking) to achieve superior coverage metrics for job size prediction.

4 retrieved papers
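The classification-first routing idea can be illustrated with a deliberately simplified sketch: a one-feature decision stump stands in for the paper's ML classifier, and a per-bin median stands in for its specialized regressors; the conformal/isotonic calibration and meta-stacking stages are omitted entirely. All names and thresholds here are hypothetical.

```python
import statistics


def fit_multistage(features, sizes, size_cutoff):
    """Two-stage predictor: route jobs to a short/long bin, then apply
    a bin-specialized 'regressor' (here just the bin's median size).

    features: one submit-time feature per job.
    sizes: ground-truth job sizes.
    size_cutoff: size separating 'short' from 'long' jobs.
    """
    labels = [s > size_cutoff for s in sizes]
    # Stage 1: learn the feature threshold that best separates the bins
    # (a one-dimensional decision stump).
    best = None
    for thr in sorted(set(features)):
        acc = sum((f > thr) == y for f, y in zip(features, labels)) / len(labels)
        if best is None or acc > best[1]:
            best = (thr, acc)
    thr = best[0]
    # Stage 2: specialize per bin (assumes both bins are non-empty).
    short_pred = statistics.median(s for f, s in zip(features, sizes) if f <= thr)
    long_pred = statistics.median(s for f, s in zip(features, sizes) if f > thr)
    return thr, short_pred, long_pred


def predict(model, feature):
    thr, short_pred, long_pred = model
    return long_pred if feature > thr else short_pred
```

The routing stage keeps the heavy tail of long jobs from dominating the loss for short jobs, which is the motivation for classification-first designs; the calibration stages the paper adds would then adjust each bin's predictions on held-out validation data.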

Core Task Comparisons

Comparisons with papers in the same taxonomy category.

Contribution Analysis

Detailed comparisons were run for each claimed contribution: the ATLAS dataset was compared against 10 retrieved candidates, the LASched benchmark against 10, and the multi-stage ML prediction model against 4. None of the retrieved candidates refuted the corresponding contribution; the contribution descriptions are as given under Claimed Contributions above.