TY - JOUR
T1 - Contrastive learning for unsupervised image-to-image translation
AU - Lee, Hanbit
AU - Seol, Jinseok
AU - Lee, Sang-goo
AU - Park, Jaehui
AU - Shim, Junho
N1 - Publisher Copyright:
© 2023 Elsevier B.V.
PY - 2024/1
Y1 - 2024/1
AB - Image-to-image translation (I2I) aims to learn a mapping function that transforms images into different styles or domains while preserving their key structures. Typically, I2I models require manually defined image domains as a training set to learn the visual differences among the domains and acquire the ability to translate images across them. However, constructing such multi-domain datasets on a large scale requires expensive data collection and annotation processes. Moreover, if the target domain changes or is expanded, a new dataset must be collected and the model retrained. To address these challenges, this article presents a novel unsupervised I2I method that does not require manually defined image domains. The proposed method automatically learns the visual similarity between individual samples and leverages the learned similarity function to transfer a specific style or appearance across images. Therefore, the developed method does not rely on cost-intensive manual domains or unstable clustering results, leading to improved translation accuracy at minimal cost. For quantitative evaluation, we implemented state-of-the-art I2I models and performed image transformation on the same input image using the baselines and our method. Image quality was then assessed using two quantitative metrics: Fréchet inception distance (FID) and translation accuracy. The proposed method exhibited significant improvements in image quality and translation accuracy compared with the latest unsupervised I2I methods. Specifically, the developed technique achieved 25% and 19% improvements over the best-performing unsupervised baseline in terms of FID and translation accuracy, respectively. Furthermore, this approach demonstrated performance nearly comparable to that of supervised learning-based methods trained on manually collected and constructed domains.
KW - Contrastive learning
KW - Generative adversarial networks
KW - Image-to-image translation
KW - Self-supervised learning
KW - Style transfer
UR - http://www.scopus.com/inward/record.url?scp=85180777229&partnerID=8YFLogxK
U2 - 10.1016/j.asoc.2023.111170
DO - 10.1016/j.asoc.2023.111170
M3 - Article
AN - SCOPUS:85180777229
SN - 1568-4946
VL - 151
JO - Applied Soft Computing
JF - Applied Soft Computing
M1 - 111170
ER -