TY - JOUR
T1 - AI-based toxicity prediction models using ToxCast data
T2 - Current status and future directions for explainable models
AU - Kim, Donghyeon
AU - Choi, Jinhee
N1 - Publisher Copyright: © 2025 The Authors
PY - 2025/11
Y1 - 2025/11
N2 - Artificial intelligence (AI) offers new opportunities for developing toxicity prediction models to screen environmental chemicals. U.S. EPA's ToxCast program provides one of the largest toxicological databases and has consequently become the most widely used data source for developing AI-driven models. In this review, we analyzed 93 peer-reviewed papers published since 2015 to provide an overview of ToxCast data-based AI models. We summarized the current landscape in terms of database structure, target endpoints, molecular representations, and learning algorithms. Most models focus on data-rich endpoints and organ-specific toxicity mechanisms, particularly endocrine disruption and hepatotoxicity. While conventional molecular fingerprints and descriptors are still common, recent studies employ alternative representations (graphs, images, and text), leveraging advances in deep learning. Likewise, traditional supervised machine-learning algorithms remain prevalent, but newer work increasingly adopts semi- and unsupervised approaches to tackle data-sparsity challenges. Beyond classical structure-based QSAR, ToxCast data are also being used as biological features to predict in vivo toxicity. We conclude by discussing current limitations and future directions for applying ToxCast-based AI models to accelerate next-generation risk assessment (NGRA).
KW - Artificial intelligence
KW - Next generation risk assessment
KW - ToxCast
KW - Toxicity prediction
UR - https://www.scopus.com/pages/publications/105010489586
U2 - 10.1016/j.tox.2025.154230
DO - 10.1016/j.tox.2025.154230
M3 - Review article
C2 - 40645553
AN - SCOPUS:105010489586
SN - 0300-483X
VL - 517
JO - Toxicology
JF - Toxicology
M1 - 154230
ER -