Existing video colorization methods struggle with temporal flickering or demand extensive manual input. We propose a novel approach automating high-fidelity video colorization using rich semantic guidance derived from language and segmentation. We employ a language-conditioned diffusion model to colorize grayscale frames. Guidance is provided via automatically generated object masks and textual prompts; our primary automatic method uses a generic prompt, achieving state-of-the-art results without specific color input. Temporal stability is achieved by warping color information from previous frames using optical flow (RAFT); a correction step detects and fixes inconsistencies introduced by warping. Evaluations on standard benchmarks (DAVIS30, VIDEVO20) show our method achieves state-of-the-art performance in colorization accuracy (PSNR) and visual realism (Colorfulness, CDC), demonstrating the efficacy of automated prompt-based guidance for consistent video colorization.

Prompt-Based Consistent Video Colorization

Tiberio Uricchio;
2026-01-01

Abstract

Existing video colorization methods struggle with temporal flickering or demand extensive manual input. We propose a novel approach automating high-fidelity video colorization using rich semantic guidance derived from language and segmentation. We employ a language-conditioned diffusion model to colorize grayscale frames. Guidance is provided via automatically generated object masks and textual prompts; our primary automatic method uses a generic prompt, achieving state-of-the-art results without specific color input. Temporal stability is achieved by warping color information from previous frames using optical flow (RAFT); a correction step detects and fixes inconsistencies introduced by warping. Evaluations on standard benchmarks (DAVIS30, VIDEVO20) show our method achieves state-of-the-art performance in colorization accuracy (PSNR) and visual realism (Colorfulness, CDC), demonstrating the efficacy of automated prompt-based guidance for consistent video colorization.
2026
9783032101846
9783032101853
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11568/1345549
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact