Part I. Systematic development of machine learning models for predicting mechanism-based toxicity from in vitro ToxCast bioassay data

Research output: Contribution to journal, Article, peer-review

2 Scopus citations

Abstract

Artificial intelligence (AI) for toxicity prediction has gained significant attention as a potential new approach methodology (NAM) for next-generation risk assessment (NGRA). Among the various large toxicity data sources, the ToxCast database is a valuable resource that is frequently used to develop AI models. To facilitate the regulatory adoption of such models, it is essential to identify those that offer both suitable predictive performance and clear relevance to regulatory endpoints. In this study, we systematically developed mechanism-based toxicity-prediction models using ToxCast bioassay data and sought to identify machine-learning models applicable to NGRA. We collected 1,485 bioassay datasets from InvitroDB v4.1 and pre-processed them for model training. Five types of molecular fingerprints (MACCS, Morgan, RDKit, Layered, and Pattern) and five machine-learning algorithms (logistic regression, decision tree, random forest, gradient boosting tree, and XGBoost) were applied to 980 bioassays, yielding 24,500 models. The best-performing model for each assay was selected according to the F1 score. Using annotations from the NTP ICE database, we ultimately selected 311 models that were trained on bioactivity data relevant to regulatory endpoints (including acute toxicity, developmental and reproductive toxicity, carcinogenicity, and endocrine disruption) and that achieved acceptable performance (F1 score ≥ 0.5). Overall, this study provides a cornerstone for incorporating ToxCast-based AI models into NGRA.
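
The model count and the selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes one model per fingerprint, algorithm, and assay (which is consistent with the reported total of 24,500), and the helper names `f1_score`, `select_best`, and `is_acceptable`, as well as the example confusion counts, are hypothetical.

```python
from itertools import product

# Fingerprints and algorithms named in the abstract.
FINGERPRINTS = ["MACCS", "Morgan", "RDKit", "Layered", "Pattern"]
ALGORITHMS = [
    "logistic regression",
    "decision tree",
    "random forest",
    "gradient boosting tree",
    "XGBoost",
]
N_ASSAYS = 980       # bioassays used for modeling, per the abstract
F1_THRESHOLD = 0.5   # acceptance threshold, per the abstract


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def select_best(scores):
    """Return the ((fingerprint, algorithm), F1) pair with the highest F1."""
    return max(scores.items(), key=lambda item: item[1])


def is_acceptable(f1):
    """Apply the F1 >= 0.5 criterion stated in the abstract."""
    return f1 >= F1_THRESHOLD


# Assumption: one model per (fingerprint, algorithm) pair per assay.
combos = list(product(FINGERPRINTS, ALGORITHMS))
n_models = len(combos) * N_ASSAYS

# Hypothetical confusion counts (TP, FP, FN) for a single assay.
example_counts = {
    ("MACCS", "XGBoost"): (30, 30, 30),
    ("Morgan", "random forest"): (40, 10, 30),
}
example_scores = {k: f1_score(*v) for k, v in example_counts.items()}
best_combo, best_f1 = select_best(example_scores)

if __name__ == "__main__":
    print(len(combos), n_models)
    print(best_combo, round(best_f1, 3), is_acceptable(best_f1))
```

In this sketch the per-assay winner is chosen purely by F1, and the threshold is applied afterwards; the additional filter on regulatory relevance (the NTP ICE annotations) is not modeled here.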

Original language: English
Article number: 100371
Journal: Computational Toxicology
Volume: 35
DOIs
State: Published - Sep 2025

Keywords

  • Machine learning
  • Next Generation Risk Assessment
  • Regulatory toxicology
  • ToxCast
  • Toxicity Prediction
