LAMDA: A Longitudinal Android Malware Benchmark for Concept Drift Analysis
Overview
Overall Novelty Assessment
The paper introduces LAMDA, a large-scale longitudinal Android malware benchmark spanning 12 years and over 1 million samples, designed to facilitate systematic concept drift analysis. Within the taxonomy, it resides in the 'Temporal Evaluation and Benchmarking' leaf under 'Drift Detection and Characterization'. This leaf contains five papers in total, indicating a moderately populated research direction. Its four siblings (the empirical drift evaluation [23], the temporal inconsistency study [24], the TESSERACT reassessment [25], and Aurora [27]) similarly focus on measuring model degradation over time, suggesting that temporal benchmarking is an established but not overcrowded subfield within concept drift research.
The taxonomy reveals that LAMDA's leaf sits within a broader branch dedicated to drift detection and characterization, which also includes 'Drift Detection Mechanisms' and 'Drift Cause Analysis'. Neighboring branches address adaptation strategies (active learning, incremental learning, retraining) and robust representation learning (invariant features, domain adaptation). The scope note for LAMDA's leaf explicitly excludes adaptation methods, clarifying that its contribution lies in providing evaluation infrastructure rather than proposing new model update techniques. This positioning suggests the work complements rather than competes with adaptation-focused research, offering a shared resource for testing drift mitigation approaches.
Among the three contributions analyzed, the dataset itself (Contribution A) was compared against 10 candidate papers, none of which refuted its claimed novelty, suggesting that its scale and temporal scope are genuinely new. The empirical demonstration of concept drift (Contribution B), however, was refuted by 6 of its 10 candidates, indicating that performance degradation under temporal shift is well documented in prior studies. The multi-faceted analysis framework (Contribution C) was compared against only a single candidate, with no refutation, though such a narrow search makes its novelty difficult to assess conclusively. Overall, the dataset contribution appears more distinctive than the empirical findings, which align with established observations in the field.
Based on this limited search of 21 candidates, LAMDA's primary novelty lies in its dataset scale and temporal coverage rather than in demonstrating drift effects, which prior work has extensively characterized. The analysis is restricted to top-K semantic matches rather than an exhaustive literature review, so additional related benchmarks or longitudinal studies may exist outside its scope. The contribution's value likely centers on enabling more rigorous comparative evaluations rather than on introducing fundamentally new insights into concept drift mechanisms.
Taxonomy
Research Landscape Overview
Claimed Contributions
The authors introduce LAMDA, a comprehensive Android malware dataset spanning 12 years (2013–2025, excluding 2015), comprising over 1 million samples across 1,380 malware families plus 150,000 singleton samples. The dataset is specifically structured to enable systematic evaluation of concept drift in malware detection systems.
The authors conduct comprehensive empirical evaluations showing how machine learning-based malware detectors degrade over time due to concept drift. They analyze performance degradation patterns, feature stability, and temporal shifts using multiple evaluation methodologies including supervised learning experiments and distributional analysis.
The authors develop and apply a comprehensive analytical framework for studying concept drift that includes multiple complementary techniques: per-feature distribution analysis, family-wise feature stability assessment, temporal label drift tracking, and SHAP-based explanation drift analysis to reveal how feature importance changes over time.
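The degradation pattern described in the second contribution can be illustrated with a minimal temporal-split experiment: train once on an early window, then score each later year. Everything below is a synthetic sketch with hand-simulated drift, not the paper's actual pipeline, features, or data.

```python
# Minimal sketch of temporal-split evaluation under concept drift.
# Synthetic data only: the discriminative signal decays in feature 0
# and migrates to feature 1 as years pass, mimicking drifting malware.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

def make_year(year, n=500):
    # Drift strength grows linearly with distance from the training year.
    shift = (year - 2013) * 0.4
    y = rng.integers(0, 2, n)
    X = rng.normal(0.0, 1.0, (n, 8))
    X[:, 0] += y * (2.0 - shift)   # signal fades from feature 0...
    X[:, 1] += y * shift           # ...and moves into feature 1
    return X, y

# Train on the earliest window, then evaluate on each subsequent year.
X_train, y_train = make_year(2013, n=2000)
clf = LogisticRegression().fit(X_train, y_train)

scores = {}
for year in range(2014, 2020):
    X_test, y_test = make_year(year)
    scores[year] = f1_score(y_test, clf.predict(X_test))
    print(year, round(scores[year], 3))
```

With this setup the F1 score falls steadily year over year, reproducing in miniature the degradation curve that temporal benchmarks like LAMDA are built to measure at scale.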
Core Task Comparisons
Comparisons with papers in the same taxonomy category
[23] Empirical Evaluation of Concept Drift in ML-Based Android Malware Detection
[24] Revisiting Temporal Inconsistency and Feature Extraction for Android Malware Detection
[25] Breaking Out from the TESSERACT: Reassessing ML-based Malware Detection under Spatio-Temporal Drift
[27] Aurora: Are Android Malware Classifiers Reliable under Distribution Shift?
Contribution Analysis
Detailed comparisons for each claimed contribution
LAMDA: A large-scale longitudinal Android malware benchmark dataset
The authors introduce LAMDA, a comprehensive Android malware dataset spanning 12 years (2013–2025, excluding 2015), comprising over 1 million samples across 1,380 malware families plus 150,000 singleton samples. The dataset is specifically structured to enable systematic evaluation of concept drift in malware detection systems.
[3] Experts still needed: boosting long-term android malware detection with active learning
[4] Continuous Learning for Android Malware Detection
[5] Learning Temporal Invariance in Android Malware Detectors
[7] Temporal-Incremental Learning for Android Malware Detection
[36] Assessing and improving malware detection sustainability through app evolution studies
[45] DRMD: Deep Reinforcement Learning for Malware Detection under Concept Drift
[51] One step forward, two steps back: ML-based malware detection under concept drift
[52] FL-MalDrift: a federated learning framework for malware detection under local concept drift
[53] On the relativity of time: Implications and challenges of data drift on long-term effective android malware detection
[54] LongCGDroid: Android malware detection through longitudinal study for machine learning and deep learning
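For concreteness, a longitudinal dataset of this shape might be handled as sketched below. The column names and example rows are hypothetical assumptions about the schema, used only to show chronological splitting and singleton bookkeeping; they are not LAMDA's documented format.

```python
# Hypothetical LAMDA-style table: hash, first-seen year, family, binary label.
# Column names and values are illustrative assumptions, not the real schema.
import pandas as pd

samples = pd.DataFrame({
    "sha256": ["a1", "b2", "c3", "d4", "e5", "f6"],
    "year":   [2013, 2013, 2014, 2016, 2016, 2025],
    "family": ["airpush", "airpush", "dowgin", "UNKNOWN", "kuguo", "UNKNOWN"],
    "label":  [1, 1, 1, 1, 1, 0],   # 1 = malware, 0 = benign
})

# Chronological split: train strictly on the past, test on the future,
# the evaluation discipline that longitudinal benchmarks exist to enforce.
cutoff = 2015
train = samples[samples["year"] < cutoff]
test = samples[samples["year"] >= cutoff]

# Samples without a resolved family are tracked separately, mirroring the
# singleton count the paper reports.
singletons = samples[samples["family"] == "UNKNOWN"]
print(len(train), len(test), len(singletons))
```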
Empirical demonstration of concept drift and performance degradation
The authors conduct comprehensive empirical evaluations showing how machine learning-based malware detectors degrade over time due to concept drift. They analyze performance degradation patterns, feature stability, and temporal shifts using multiple evaluation methodologies including supervised learning experiments and distributional analysis.
[5] Learning Temporal Invariance in Android Malware Detectors
[9] Transcend: Detecting concept drift in malware classification models
[25] Breaking Out from the TESSERACT: Reassessing ML-based Malware Detection under Spatio-Temporal Drift
[56] Transcending transcend: Revisiting malware classification in the presence of concept drift
[58] Adapting to concept drift in malware detection
[59] On the limitations of continual learning for malware classification
[10] Hybrid multilevel detection of mobile devices malware under concept drift
[18] Towards Explainable Drift Detection and Early Retrain in ML-Based Malware Detection Pipelines
[55] Fesad: Ransomware detection with machine learning using adaption to concept drift
[57] FeSA: Feature selection architecture for ransomware detection under concept drift
Multi-faceted concept drift analysis framework
The authors develop and apply a comprehensive analytical framework for studying concept drift that includes multiple complementary techniques: per-feature distribution analysis, family-wise feature stability assessment, temporal label drift tracking, and SHAP-based explanation drift analysis to reveal how feature importance changes over time.
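The per-feature distribution analysis named above can be sketched as a Jensen-Shannon comparison of each feature's empirical distribution across two time windows. Everything below (binary permission-style features, the window years, the drift rates) is synthetic illustration rather than LAMDA data, and the SHAP-based explanation drift step is omitted since it would additionally require a trained model and the shap library.

```python
# Sketch of per-feature distribution drift via Jensen-Shannon divergence.
# Synthetic binary features for two yearly windows: feature 0 drifts
# (usage rate 0.8 -> 0.2), feature 1 stays stable at 0.5.
import numpy as np
from scipy.spatial.distance import jensenshannon

rng = np.random.default_rng(1)
X_2013 = rng.binomial(1, [0.8, 0.5], size=(1000, 2))
X_2020 = rng.binomial(1, [0.2, 0.5], size=(1000, 2))

def feature_drift(a, b):
    # Per-feature JS divergence between empirical {0,1} distributions.
    drift = []
    for j in range(a.shape[1]):
        pa = np.bincount(a[:, j], minlength=2) / len(a)
        pb = np.bincount(b[:, j], minlength=2) / len(b)
        # jensenshannon returns the distance (a square root), so square it
        # to get the divergence in bits.
        drift.append(jensenshannon(pa, pb, base=2) ** 2)
    return drift

d = feature_drift(X_2013, X_2020)
print([round(x, 3) for x in d])
```

Ranking features by such a drift score is one simple way the framework's "per-feature distribution analysis" could surface which features destabilize over time, with family-wise stability obtained by repeating the comparison within each malware family.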