Evaluating In-Context Learning in Large Language Models for Molecular Property Regression

Research output: Contribution to journal › Article › peer-review

Abstract

Large language models (LLMs) demonstrate strong performance in natural language tasks, but their capacity for genuine in-context learning (ICL) in scientific regression remains unclear. We systematically assessed seven LLMs on molecular property prediction using a controlled framework of 56 transformed tasks that isolate shortcut learning and are designed to induce functional out-of-distribution (OOD) behavior. LLMs performed nearly perfectly on raw molecular weight prediction via shortcut cues but deteriorated under nonlinear transformations, whereas machine learning (ML) baselines showed greater robustness, yielding a performance crossover. Meta-analysis revealed that distributional descriptors and structure–activity landscape indices (SALI) predict task favorability, providing a framework for selecting between LLM- and ML-based approaches in chemistry.
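The structure–activity landscape index (SALI) mentioned in the abstract is a standard pairwise measure relating property differences to structural similarity: SALI(i, j) = |A_i − A_j| / (1 − sim(i, j)). A minimal sketch, using illustrative placeholder values rather than data from the study, and assuming a generic similarity score (commonly Tanimoto similarity on molecular fingerprints):

```python
def sali(activity_i, activity_j, similarity, eps=1e-9):
    """Structure-activity landscape index for one molecule pair.

    SALI(i, j) = |A_i - A_j| / (1 - sim(i, j));
    eps guards against division by zero when sim == 1.
    """
    return abs(activity_i - activity_j) / max(1.0 - similarity, eps)

# Toy property values (e.g., molecular weights) and pairwise similarities --
# hypothetical numbers, not taken from the paper.
activities = [180.2, 194.2, 310.5]
pairs = [(0, 1, 0.92), (0, 2, 0.35), (1, 2, 0.40)]

for i, j, sim in pairs:
    print(f"pair ({i},{j}): SALI = {sali(activities[i], activities[j], sim):.1f}")
```

High SALI values flag "activity cliffs" (similar structures, very different properties), which is why such indices can predict how hard a regression task is for a given model class.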

Original language: English
Article number: e70308
Journal: Journal of Computational Chemistry
Volume: 47
Issue number: 2
DOIs
State: Published - 15 Jan 2026

Keywords

  • functional out-of-distribution
  • in-context learning
  • large language models
  • molecular property prediction
  • shortcut learning
  • SMILES representation
  • structure–activity landscape index
