Foliagen: Framework for Foliage Image Generation from Individual Crop Leaf Images

ICLR 2026 Conference Submission
Anonymous Authors
Keywords: Foliage Dataset, Leaf Disease, Image Classification, Transfer Learning, Image Generation
Abstract:

While machine learning (ML)-based crop disease classifiers have mostly targeted individual leaf images, real-world applications call for disease classification on crop foliage images instead, because such applications usually rely on cameras mounted on unmanned aerial vehicles to capture foliage images across vast crop fields for automated disease identification. We found that known state-of-the-art (SOTA) classifiers all exhibited unsatisfactory performance on the only real-world soybean foliage image dataset, despite the dataset being modest-sized and including just two soybean disease categories (among many). Hence, it is desirable to make available large foliage image datasets with common crop disease categories for better evaluating and possibly improving SOTA crop disease classifiers on foliage images. This paper introduces a framework, termed Foliagen (short for foliage generation), that generates crop foliage images from available datasets of individual leaf images. A generated foliage image dataset can be arbitrarily sized, with each image emulating the natural distribution of diseased leaves at a specified disease rate. Being annotated by design, such generated datasets are valuable for (1) evaluating SOTA classifiers when applied to practical use and (2) pre-training general SOTA classifiers, making it possible to effectively fine-tune them using any real-world foliage image dataset for improved classification performance. The Foliagen framework is exemplified by generating foliage image datasets for soybean and tomato. Our evaluation results indicate that five SOTA classifiers on generated datasets with nine disease categories achieve accuracy up to 87% for soybean and 86% for tomato under γ = 5%, and that they all exhibit less than 92% accuracy in classifying the real soybean foliage image dataset (with just two disease categories). Foliagen makes it possible to generate crop foliage image datasets to evaluate future disease classifiers objectively, aiming at in-field applications.

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (A scholar search engine). It analyzes academic papers' tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces Foliagen, a framework for generating annotated crop foliage images from individual leaf datasets to enable disease classification at the canopy level. Within the taxonomy, it occupies the 'Foliage-Level Synthetic Image Generation' leaf, which currently contains only this work and no sibling papers. This positioning suggests the paper addresses a relatively sparse research direction, distinct from the more populated 'Leaf-Level Synthetic Generation Using GANs' branch that includes DoubleGAN and rice-specific methods. The taxonomy reveals that most synthetic generation efforts focus on individual leaf augmentation rather than compositional foliage synthesis.

The taxonomy structure shows that neighboring work concentrates on leaf-level GAN methods (DoubleGAN, rice leaf generation) and direct disease classification approaches (real-time detection, crop-specific classifiers). The 'Disease Classification Methods' branch contains multiple subcategories addressing eggplant, multi-crop mining, and explainability, but these operate on existing imagery rather than generating foliage-scale datasets. The 'Comprehensive Plant Disorder Detection Frameworks' branch integrates multiple pipeline stages, yet the taxonomy narrative highlights a persistent tension between individual-leaf and whole-plant analysis that Foliagen explicitly targets by bridging controlled datasets and natural canopy structures.

Among 22 candidates examined across three contributions, no prior work that could refute the novelty claims was identified. For the core Foliagen framework, 10 candidates were examined with zero refutations; for the evaluation methodology, 3 candidates with zero refutations; and for the transfer learning approach, 9 candidates with zero refutations. Given this limited search scope, the result means only that no directly overlapping foliage-generation frameworks were found among the top-K semantic matches and citation expansions analyzed. The absence of refutations across all contributions indicates either genuine novelty in this compositional synthesis approach or limitations in the search coverage, particularly given the modest candidate pool size.

Based on the limited literature search of 22 candidates, the work appears to occupy a distinct niche in foliage-level synthesis for disease classification. The taxonomy confirms sparse prior activity in this specific direction, with most related efforts targeting leaf-level augmentation or direct classification. However, the analysis does not exhaustively cover domain-specific venues or non-English publications, leaving open the possibility of relevant work outside the examined scope.

Taxonomy

Core-task Taxonomy Papers: 10
Claimed Contributions: 3
Contribution Candidate Papers Compared: 22
Refutable Papers: 0

Research Landscape Overview

Core task: Generating crop foliage images from individual leaf images for disease classification.

The field organizes around four main branches that reflect different stages and perspectives in automated plant health monitoring. Synthetic Image Generation for Plant Disease Detection focuses on creating realistic training data to overcome scarcity issues, with Foliagen[0] representing foliage-level synthesis approaches. Disease Classification Methods Using Deep Learning encompasses the core recognition algorithms, including works like Plant disease detection using[3] and Image-based rice leaf disease[4] that apply convolutional architectures to identify pathologies. Plant Identification and Classification from Images addresses the broader challenge of species recognition, exemplified by Automatic plant identification from[2] and Plant Classification Using Leaf[7], which establish foundational visual recognition capabilities. Comprehensive Plant Disorder Detection Frameworks integrate multiple components into end-to-end systems, such as MCIP[8] and Location-guided lesions representation learning[10], bridging data generation, feature extraction, and diagnostic reasoning.

A central tension emerges between works that operate on individual leaf images and those targeting whole-plant or foliage-level analysis. Many studies like Disease classification in Solanum[5] and Deep leaning for detection[9] concentrate on isolated leaf samples, which simplifies annotation but may not capture field conditions. Foliagen[0] addresses this gap by synthesizing foliage-scale imagery from individual leaves, positioning itself within the synthetic generation branch but with a distinctive emphasis on compositional realism. This contrasts with Real-time plant disease dataset[1], which prioritizes deployment speed, and Explaining deep learning-based leaf[6], which focuses on interpretability of leaf-level classifiers. The original work thus occupies a niche at the intersection of data augmentation and ecological validity, aiming to bridge controlled laboratory datasets and the complexity of natural canopy structures.

Claimed Contributions

Foliagen framework for generating annotated crop foliage image datasets

The authors propose Foliagen, a framework that synthesizes annotated foliage image datasets from publicly available individual leaf images. Generated datasets can be arbitrarily sized, cover multiple disease categories, and include a specified rate of diseased leaves to emulate early-stage disease conditions for training and evaluation.

10 retrieved papers
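To make the idea of a "specified rate of diseased leaves" concrete, the following is a minimal Python sketch of how annotations for one generated foliage image might be sampled. It is an illustration under stated assumptions, not the authors' method: the function name `sample_foliage_layout`, the `LeafPlacement` record, the canvas and leaf sizes, the example label "rust", the single-disease-per-image rule, and the uniformly random placement are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class LeafPlacement:
    label: str  # "healthy" or the disease category of this leaf
    x: int      # left edge of the leaf patch on the canvas, in pixels
    y: int      # top edge of the leaf patch on the canvas, in pixels


def sample_foliage_layout(num_leaves, disease_rate, disease,
                          canvas=1024, leaf_size=128, seed=None):
    """Sample per-leaf labels and positions for one synthetic foliage image.

    Hypothetical helper: it assumes one disease category per image and
    uniformly random leaf positions; the paper's placement model may differ.
    """
    if not 0.0 <= disease_rate <= 1.0:
        raise ValueError("disease_rate must lie in [0, 1]")
    rng = random.Random(seed)
    # The disease rate fixes how many of the leaves carry the disease label.
    num_diseased = round(num_leaves * disease_rate)
    labels = [disease] * num_diseased + ["healthy"] * (num_leaves - num_diseased)
    # Shuffle so that paste order (and thus occlusion) is not tied to the label.
    rng.shuffle(labels)
    max_offset = canvas - leaf_size  # keep every patch fully inside the canvas
    return [LeafPlacement(label, rng.randint(0, max_offset), rng.randint(0, max_offset))
            for label in labels]


layout = sample_foliage_layout(num_leaves=100, disease_rate=0.05, disease="rust", seed=0)
print(sum(p.label == "rust" for p in layout), "diseased leaves out of", len(layout))
```

Because labels and positions are sampled before any pixels are composited, the resulting dataset is annotated by construction; a separate compositing step (not shown) would paste leaf cutouts at the sampled positions.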

Objective evaluation methodology for SOTA crop disease classifiers on foliage images

The authors demonstrate that generated foliage datasets enable objective evaluation of state-of-the-art classifiers under identical conditions without classifier-specific preprocessing, revealing which models perform best for in-field applications using UAV-captured foliage images rather than individual leaf images.

3 retrieved papers
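Evaluation "under identical conditions" can be pictured as running every classifier over the same labelled images with no per-model preprocessing. The sketch below is a hedged illustration in Python; `evaluate_classifiers`, the toy dataset, and the two stand-in classifiers are hypothetical and do not correspond to the five SOTA models in the paper.

```python
def evaluate_classifiers(classifiers, dataset):
    """Return the accuracy of each classifier on the same (image, label) pairs.

    `classifiers` maps a name to a callable image -> predicted label.
    `dataset` is a list of (image, label) pairs shared by all classifiers.
    """
    if not dataset:
        raise ValueError("dataset must not be empty")
    results = {}
    for name, predict in classifiers.items():
        # Every model sees exactly the same inputs, in the same order.
        correct = sum(1 for image, label in dataset if predict(image) == label)
        results[name] = correct / len(dataset)
    return results


# Toy stand-ins: image identifiers instead of pixels, rules instead of networks.
toy_dataset = [("img_a", "healthy"), ("img_b", "rust"),
               ("img_c", "rust"), ("img_d", "healthy")]
ground_truth = dict(toy_dataset)
toy_classifiers = {
    "always_healthy": lambda image: "healthy",
    "lookup_oracle": lambda image: ground_truth[image],
}
print(evaluate_classifiers(toy_classifiers, toy_dataset))
```

Keeping the dataset fixed and passing classifiers in as plain callables is one simple way to ensure that any accuracy gap reflects the models rather than differences in data handling.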

Transfer learning approach using generated datasets for pre-training general classifiers

The authors show that classifiers pre-trained on generated foliage datasets covering nine disease categories can be effectively fine-tuned with a small fraction of real-world foliage images to achieve improved classification performance on field-specific datasets through transfer learning.

9 retrieved papers
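The report does not state how the "small fraction of real-world foliage images" used for fine-tuning is selected. One plausible option, sketched below in Python as an assumption rather than the authors' procedure, is a class-stratified random draw so that every disease category is represented in the fine-tuning subset. The helper `stratified_fraction` and the example labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_fraction(samples, fraction, seed=None):
    """Draw roughly `fraction` of the samples from each class.

    `samples` is a list of (item, label) pairs. At least one item per class
    is kept so that no category vanishes from the fine-tuning subset.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)
    subset = []
    for label, items in by_label.items():
        k = max(1, round(len(items) * fraction))
        # rng.sample draws k distinct items without replacement.
        subset.extend((item, label) for item in rng.sample(items, k))
    return subset


# Example: keep 10% of a balanced two-class set of 200 real images.
real_images = [(f"img{i}", "rust" if i % 2 else "healthy") for i in range(200)]
fine_tune_set = stratified_fraction(real_images, fraction=0.1, seed=1)
print(len(fine_tune_set), "images selected for fine-tuning")
```

The remaining real images would then serve as the held-out set on which the pre-trained and fine-tuned classifier is scored.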

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current TopK core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape, it appears structurally isolated, which is one partial signal of novelty, but still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1: Foliagen framework for generating annotated crop foliage image datasets
Candidates examined: 10. Refutations: 0.

Contribution 2: Objective evaluation methodology for SOTA crop disease classifiers on foliage images
Candidates examined: 3. Refutations: 0.

Contribution 3: Transfer learning approach using generated datasets for pre-training general classifiers
Candidates examined: 9. Refutations: 0.