Motion-Aware Video Editing

AI-powered object replacement in video using generative models with motion-consistent propagation.

Abstract

We present a pipeline for motion-aware object replacement in video sequences. Our approach combines state-of-the-art segmentation (SAM2) with diffusion-based inpainting (Stable Diffusion) and generative object insertion, preserving temporal coherence through explicit motion modeling. We track similarity transforms across frames to warp replacement objects consistently, so that edits respect the original scene dynamics.
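
As a rough illustration, the sketch below shows one way the similarity-transform tracking step could work, assuming OpenCV: sparse features inside the object mask are tracked with Lucas-Kanade optical flow, a 4-degree-of-freedom similarity model is fit with RANSAC, and the replacement object is warped into the current frame. The helper names (`estimate_similarity`, `warp_replacement`) are illustrative, not this repository's actual API.

```python
# Illustrative sketch only -- not the repository's actual API.
import cv2
import numpy as np

def estimate_similarity(prev_gray, curr_gray, obj_mask):
    """Fit a 2x3 similarity transform (rotation + uniform scale + translation)
    between consecutive frames, using features tracked inside the object mask
    (obj_mask is a uint8 mask of the tracked object)."""
    pts_prev = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200,
                                       qualityLevel=0.01, minDistance=7,
                                       mask=obj_mask)
    if pts_prev is None:
        return None
    pts_curr, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray,
                                                   pts_prev, None)
    good = status.ravel() == 1
    if good.sum() < 4:  # too few correspondences for a stable RANSAC fit
        return None
    # estimateAffinePartial2D constrains the fit to a 4-DoF similarity model.
    M, _ = cv2.estimateAffinePartial2D(pts_prev[good], pts_curr[good],
                                       method=cv2.RANSAC)
    return M

def warp_replacement(obj_rgba, M, frame_shape):
    """Warp the RGBA replacement object into the current frame's coordinates;
    the default constant border leaves out-of-bounds pixels fully transparent."""
    h, w = frame_shape[:2]
    return cv2.warpAffine(obj_rgba, M, (w, h), flags=cv2.INTER_LINEAR)
```

Restricting the fit to a similarity model (rather than a full affine or homography) keeps the replacement object rigid while still letting it follow rotation, scale, and translation in the scene.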

Key contributions include:

1. A comparison of SAM2- vs. YOLO-based mask propagation for video object segmentation.
2. Motion-field estimation using optical flow for temporally consistent warping.
3. A quantitative evaluation using IoU, Dice, and SSIM metrics (see the sketch below).
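
For reference, here is a minimal sketch of the three evaluation metrics, assuming binary masks for IoU/Dice and scikit-image's `structural_similarity` for SSIM; the function names are illustrative, not this repository's API.

```python
# Illustrative metric implementations, assuming binary masks and RGB frames.
import numpy as np
from skimage.metrics import structural_similarity

def iou(pred, gt):
    """Intersection over Union between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return float(np.logical_and(pred, gt).sum() / union) if union else 1.0

def dice(pred, gt):
    """Dice coefficient: 2*|A n B| / (|A| + |B|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    total = pred.sum() + gt.sum()
    return float(2 * np.logical_and(pred, gt).sum() / total) if total else 1.0

def frame_ssim(edited, reference):
    """Structural similarity between an edited frame and its reference (RGB)."""
    return structural_similarity(edited, reference, channel_axis=-1)
```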

Quick Links