TY - GEN
T1 - Preserving semantic and temporal consistency for unpaired video-to-video translation
AU - Park, Kwanyong
AU - Woo, Sanghyun
AU - Kim, Dahun
AU - Cho, Donghyeon
AU - Kweon, In So
N1 - Publisher Copyright:
© 2019 Association for Computing Machinery.
PY - 2019/10/15
Y1 - 2019/10/15
N2 - In this paper, we investigate the problem of unpaired video-to-video translation. Given a video in the source domain, we aim to learn the conditional distribution of the corresponding video in the target domain, without seeing any pairs of corresponding videos. While significant progress has been made in the unpaired translation of images, directly applying these methods to an input video leads to low visual quality due to the additional time dimension. In particular, previous methods suffer from semantic inconsistency (i.e., semantic label flipping) and temporal flickering artifacts. To alleviate these issues, we propose a new framework that is composed of carefully-designed generators and discriminators, coupled with two core objective functions: 1) content preserving loss and 2) temporal consistency loss. Extensive qualitative and quantitative evaluations demonstrate the superior performance of the proposed method against previous approaches. We further apply our framework to a domain adaptation task and achieve favorable results.
AB - In this paper, we investigate the problem of unpaired video-to-video translation. Given a video in the source domain, we aim to learn the conditional distribution of the corresponding video in the target domain, without seeing any pairs of corresponding videos. While significant progress has been made in the unpaired translation of images, directly applying these methods to an input video leads to low visual quality due to the additional time dimension. In particular, previous methods suffer from semantic inconsistency (i.e., semantic label flipping) and temporal flickering artifacts. To alleviate these issues, we propose a new framework that is composed of carefully-designed generators and discriminators, coupled with two core objective functions: 1) content preserving loss and 2) temporal consistency loss. Extensive qualitative and quantitative evaluations demonstrate the superior performance of the proposed method against previous approaches. We further apply our framework to a domain adaptation task and achieve favorable results.
KW - Domain adaptation
KW - Semantic and temporal consistency
KW - Unpaired video-to-video translation
UR - http://www.scopus.com/inward/record.url?scp=85074816853&partnerID=8YFLogxK
U2 - 10.1145/3343031.3350864
DO - 10.1145/3343031.3350864
M3 - Conference contribution
AN - SCOPUS:85074816853
T3 - MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia
SP - 1248
EP - 1257
BT - MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 27th ACM International Conference on Multimedia, MM 2019
Y2 - 21 October 2019 through 25 October 2019
ER -