Preserving semantic and temporal consistency for unpaired video-to-video translation

Kwanyong Park, Sanghyun Woo, Dahun Kim, Donghyeon Cho, In So Kweon

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

33 Scopus citations

Abstract

In this paper, we investigate the problem of unpaired video-to-video translation. Given a video in the source domain, we aim to learn the conditional distribution of the corresponding video in the target domain, without seeing any pairs of corresponding videos. While significant progress has been made in the unpaired translation of images, directly applying these methods to an input video leads to low visual quality due to the additional time dimension. In particular, previous methods suffer from semantic inconsistency (i.e., semantic label flipping) and temporal flickering artifacts. To alleviate these issues, we propose a new framework that is composed of carefully designed generators and discriminators, coupled with two core objective functions: 1) content preserving loss and 2) temporal consistency loss. Extensive qualitative and quantitative evaluations demonstrate the superior performance of the proposed method against previous approaches. We further apply our framework to a domain adaptation task and achieve favorable results.
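The abstract names the two objectives but does not define them. A common formulation for such losses (a sketch under assumptions, not the authors' exact definitions) penalizes label changes between input and translated frames, and penalizes differences between the current translated frame and the previous translated frame warped into it (e.g., via optical flow), masked where the warp is unreliable:

```python
import numpy as np

def content_preserving_loss(src_labels, out_labels):
    """Hypothetical content loss: fraction of pixels whose semantic label
    flips between the source frame and the translated frame."""
    return float(np.mean(src_labels != out_labels))

def temporal_consistency_loss(curr_out, warped_prev_out, valid_mask):
    """Hypothetical temporal loss: masked L1 distance between the current
    translated frame and the previous translated frame warped into the
    current one; valid_mask zeroes out occluded / unreliable regions."""
    diff = np.abs(curr_out - warped_prev_out) * valid_mask
    return float(diff.sum() / max(valid_mask.sum(), 1e-8))
```

Both quantities are zero when the translation preserves semantics exactly and consecutive outputs agree under the warp, which is the behavior the paper's objectives encourage.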

Original language: English
Title of host publication: MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia
Publisher: Association for Computing Machinery, Inc
Pages: 1248-1257
Number of pages: 10
ISBN (Electronic): 9781450368896
DOIs
State: Published - 15 Oct 2019
Event: 27th ACM International Conference on Multimedia, MM 2019 - Nice, France
Duration: 21 Oct 2019 - 25 Oct 2019

Publication series

Name: MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia

Conference

Conference: 27th ACM International Conference on Multimedia, MM 2019
Country/Territory: France
City: Nice
Period: 21/10/19 - 25/10/19

Keywords

  • Domain adaptation
  • Semantic and temporal consistency
  • Unpaired video-to-video translation
