TY - JOUR
T1 - Adaptive adversarial augmentation for molecular property prediction
AU - Cho, Soyoung
AU - Hong, Sungchul
AU - Jeon, Jong June
N1 - Publisher Copyright: © 2025 The Authors
PY - 2025/4/25
Y1 - 2025/4/25
AB - This paper introduces a novel adversarial data augmentation technique that improves performance on molecular property prediction tasks. The proposed method, Adversarial Augmentation to Influential Sample (AAIS), addresses classification problems involving imbalanced data in multitask scenarios, where traditional augmentation techniques often prove ineffective. Augmentation is carried out through distributionally robust optimization, so its performance depends less on the available data and the number of tasks. In particular, we devise an adaptive augmentation that uses the influence function to identify the data points that significantly influence model training. We found that these data points lie near the decision boundary and that the adaptive augmentation locally flattens the decision boundary, which makes predictions more robust. The proposed method was evaluated on benchmark datasets commonly used for property prediction from molecular structures. The adaptive adversarial augmentation approach improved model prediction performance by 1%–15% in AUC and 1%–35% in F1-score. Our method addresses the class imbalance problem and enhances performance in graph-level tasks by augmenting influential graph features. These findings suggest that AAIS offers a flexible tool for improving the generalization ability and accuracy of the model, particularly in complex or imbalanced molecular prediction tasks.
KW - Adversarial augmentation
KW - Graph neural network
KW - Influence function
KW - Molecular properties prediction
UR - https://www.scopus.com/pages/publications/85215429750
U2 - 10.1016/j.eswa.2025.126512
DO - 10.1016/j.eswa.2025.126512
M3 - Article
AN - SCOPUS:85215429750
SN - 0957-4174
VL - 270
JO - Expert Systems with Applications
JF - Expert Systems with Applications
M1 - 126512
ER -