LiveMoments: Reselected Key Photo Restoration in Live Photos via Reference-guided Diffusion

ICLR 2026 Conference Submission (Anonymous Authors)
Keywords: Live Photo, Reference-based Image Restoration, Conditional Image Generation, Motion Alignment
Abstract:

A Live Photo captures both a high-quality key photo and a short video clip that preserves the dynamics around the moment of capture. Users may reselect an alternative frame as the key photo to obtain a better expression or timing, but such frames often exhibit noticeable quality degradation, because the photo-capture ISP pipeline delivers significantly higher image quality than the video pipeline. This quality gap calls for dedicated restoration techniques for the reselected key photo. To this end, we propose LiveMoments, a reference-guided image restoration framework tailored to reselected key photos in Live Photos. Our method employs a two-branch neural network: a reference branch that extracts structural and textural information from the original high-quality key photo, and a main branch that restores the reselected frame under the guidance of the reference branch. We further introduce a unified Motion Alignment module that provides motion guidance for spatial alignment at both the latent and image levels. Experiments on real and synthetic Live Photos demonstrate that LiveMoments significantly improves perceptual quality and fidelity over existing solutions, especially in scenes with fast motion or complex structures.
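The abstract describes the Motion Alignment module only at a high level. As an illustrative sketch (not the authors' implementation), image-level alignment can be thought of as warping the high-quality reference frame toward the reselected frame using an estimated motion field. The `warp_nearest` helper and the integer-valued flow representation below are hypothetical simplifications; real systems typically use sub-pixel flow with bilinear sampling.

```python
# Hedged sketch of reference-to-target warping for motion alignment.
# `warp_nearest` and the (dy, dx) integer flow format are illustrative
# assumptions, not taken from the LiveMoments paper.

def warp_nearest(ref, flow):
    """Warp a 2-D grid `ref` toward the target using nearest-neighbor lookup.

    ref:  H x W list of lists holding feature (or pixel) values
    flow: H x W list of lists of (dy, dx) integer offsets; each target
          location (y, x) is filled from source (y + dy, x + dx) in `ref`
    """
    h, w = len(ref), len(ref[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            sy = min(max(y + dy, 0), h - 1)  # clamp source row to bounds
            sx = min(max(x + dx, 0), w - 1)  # clamp source column to bounds
            out[y][x] = ref[sy][sx]
    return out
```

In a diffusion framework like the one described, such a warp would be applied both to the reference image and to its latent features, so that the main branch fuses reference detail that is already spatially registered to the reselected frame.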

Disclaimer
This report is AI-GENERATED using Large Language Models and WisPaper (a scholarly search engine). It analyzes an academic paper's tasks and contributions against retrieved prior work. While this system identifies POTENTIAL overlaps and novel directions, ITS COVERAGE IS NOT EXHAUSTIVE AND ITS JUDGMENTS ARE APPROXIMATE. These results are intended to assist human reviewers and SHOULD NOT be relied upon as a definitive verdict on novelty.
NOTE that some papers exist in multiple, slightly different versions (e.g., with different titles or URLs). The system may retrieve several versions of the same underlying work. The current automated pipeline does not reliably align or distinguish these cases, so human reviewers will need to disambiguate them manually.
If you have any questions, please contact: mingzhang23@m.fudan.edu.cn

Overview

Overall Novelty Assessment

The paper introduces LiveMoments, a reference-guided restoration framework for reselected key photos in Live Photos, addressing quality degradation when users choose alternative frames from the video clip. According to the taxonomy, this work occupies the 'Multi-Frame Photo Restoration with Motion Alignment' leaf under Reference-Guided Image Restoration, where it appears as the sole paper. This positioning suggests the paper targets a relatively sparse and specialized research direction within the broader image restoration landscape, focusing specifically on the photo-video quality gap in live photo capture systems.

The taxonomy reveals that neighboring research directions include Real-Time Facial Reenactment (focusing on expression transfer) and broader Video Quality Enhancement branches (super-resolution for streaming, archival restoration). LiveMoments diverges from these by exploiting the unique structure of live photos: a high-quality reference frame paired with lower-quality video frames. Unlike general video enhancement methods that lack reference guidance, or facial reenactment techniques targeting expression manipulation, this work specifically addresses the ISP pipeline quality disparity between photo and video capture modes, carving out a distinct problem space at the intersection of multi-frame fusion and reference-based restoration.

Among the 30 candidates examined through semantic search, none clearly refute the three main contributions. The reselected key photo restoration task (10 candidates examined, 0 refutable) appears novel within this limited scope, as does the reference-guided diffusion framework (10 candidates, 0 refutable) and the LiveMoments benchmark dataset (10 candidates, 0 refutable). The absence of refutable prior work across all contributions suggests either genuine novelty or limitations in the search scope. The specialized nature of the live photo restoration problem may explain why existing multi-frame restoration or video enhancement methods do not directly overlap with these specific contributions.

Based on the limited literature search covering 30 semantically similar papers, the work appears to address an underexplored problem space with no direct prior solutions identified. However, the small search scope and the paper's isolation within its taxonomy leaf warrant caution: a broader survey of reference-guided restoration, burst photography, or computational photography venues might reveal closer related work. The analysis captures top semantic matches but may not reflect the full landscape of multi-frame image enhancement research.

Taxonomy

Core-task Taxonomy Papers: 6
Claimed Contributions: 3
Contribution Candidate Papers Compared: 30
Refutable Papers: 0

Research Landscape Overview

Core task: Reselected key photo restoration in live photos. This field addresses the challenge of enhancing a user-selected frame from a live photo sequence, leveraging temporal information from neighboring frames to improve quality.

The taxonomy organizes work into four main branches. Reference-Guided Image Restoration focuses on methods that exploit multiple frames or reference images to restore a target photo, often requiring careful motion alignment and feature aggregation. Video Quality Enhancement encompasses techniques for improving temporal sequences, including super-resolution and artifact removal across frames. Microscopy Imaging Enhancement targets specialized scientific imaging where temporal data can reveal finer structural details. Real-Time Visual Processing emphasizes low-latency methods suitable for interactive applications. Representative works such as SuperResolution Streaming[1] and Trajectory SuperResolution[4] illustrate how temporal context can be harnessed for quality improvement, while CellINR[5] demonstrates domain-specific enhancements in microscopy. Several active lines of work explore trade-offs between computational efficiency and restoration quality, particularly when aligning frames with complex motion or handling degraded archival content.

Within Reference-Guided Image Restoration, a small cluster of methods tackles multi-frame photo restoration with motion alignment, where the central challenge is to register and fuse information from temporally adjacent frames without introducing artifacts. LiveMoments[0] sits squarely in this cluster, emphasizing the restoration of a reselected keyframe by aligning and aggregating features from the live photo burst. Compared to approaches like Archival Enhancement[6], which may prioritize static degradation repair, LiveMoments[0] leverages the temporal redundancy inherent in live photo sequences. This positions it closer to video-inspired techniques such as Trajectory SuperResolution[4], yet with a focus on single-frame output rather than continuous playback, reflecting the unique user interaction model of live photos.

Claimed Contributions

Reselected Key Photo Restoration task for Live Photos

The authors define a new problem of restoring a blurry frame that users select as their preferred key photo in Live Photos by leveraging adjacent sharp frames from the same capture sequence as reference guidance.

10 retrieved papers

Reference-guided diffusion framework for key photo restoration

The authors develop a diffusion-based restoration method that incorporates temporal information from neighboring frames in the Live Photo sequence to enhance the quality of the user-selected blurry key frame.

10 retrieved papers

LiveMoments benchmark dataset

The authors create a dedicated benchmark dataset called LiveMoments to facilitate evaluation and research on the task of restoring reselected key photos in Live Photo sequences.

10 retrieved papers

Core Task Comparisons

Comparisons with papers in the same taxonomy category

Within the taxonomy built over the current top-K core-task papers, the original paper is assigned to a leaf with no direct siblings and no cousin branches under the same grandparent topic. In this retrieved landscape it appears structurally isolated, which is a partial signal of novelty, though one still constrained by search coverage and taxonomy granularity.

Contribution Analysis

Detailed comparisons for each claimed contribution

Contribution 1

Reselected Key Photo Restoration task for Live Photos

The authors define a new problem of restoring a blurry frame that users select as their preferred key photo in Live Photos by leveraging adjacent sharp frames from the same capture sequence as reference guidance.

Contribution 2

Reference-guided diffusion framework for key photo restoration

The authors develop a diffusion-based restoration method that incorporates temporal information from neighboring frames in the Live Photo sequence to enhance the quality of the user-selected blurry key frame.

Contribution 3

LiveMoments benchmark dataset

The authors create a dedicated benchmark dataset called LiveMoments to facilitate evaluation and research on the task of restoring reselected key photos in Live Photo sequences.