Abstract
Large language models (LLMs) demonstrate strong performance in natural language tasks, but their capacity for genuine in-context learning (ICL) in scientific regression remains unclear. We systematically assessed seven LLMs on molecular property prediction using a controlled framework of 56 transformed tasks that isolate shortcut learning and are designed to induce functional out-of-distribution (OOD) behavior. LLMs performed nearly perfectly on raw molecular weight prediction via shortcut cues but deteriorated under nonlinear transformations, whereas machine learning (ML) baselines showed greater robustness, yielding a performance crossover. Meta-analysis revealed that distributional descriptors and structure–activity landscape indices (SALI) predict task favorability, providing a framework for selecting between LLM- and ML-based approaches in chemistry.
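The structure–activity landscape index (SALI) mentioned in the abstract is commonly defined as the activity difference between two molecules divided by their structural distance, SALI(i, j) = |A_i − A_j| / (1 − sim(i, j)), where sim is typically a Tanimoto fingerprint similarity. The following is a minimal sketch of that standard formula, not the paper's own implementation; the molecule names, activity values, and precomputed similarities are hypothetical stand-ins for fingerprint-based values.

```python
def sali(activity_i, activity_j, similarity):
    """Structure-activity landscape index: activity difference over
    structural distance (1 - similarity). Diverges for identical structures."""
    if similarity >= 1.0:
        return float("inf")
    return abs(activity_i - activity_j) / (1.0 - similarity)

# Hypothetical activities and pairwise Tanimoto similarities (illustrative only;
# in practice sim would come from molecular fingerprints of SMILES strings).
activities = {"mol_a": 5.2, "mol_b": 5.1, "mol_c": 8.0}
similarities = {
    ("mol_a", "mol_b"): 0.90,
    ("mol_a", "mol_c"): 0.85,
    ("mol_b", "mol_c"): 0.30,
}

scores = {
    (i, j): sali(activities[i], activities[j], sim)
    for (i, j), sim in similarities.items()
}

# The highest-SALI pair is an "activity cliff": structurally similar
# molecules with sharply different activity - the kind of rough landscape
# the meta-analysis links to task favorability.
cliff_pair = max(scores, key=scores.get)
```

High average or maximum SALI indicates a rugged structure–activity landscape, which is one of the descriptors the abstract says predicts whether an LLM- or ML-based approach is favorable.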
| Original language | English |
|---|---|
| Article number | e70308 |
| Journal | Journal of Computational Chemistry |
| Volume | 47 |
| Issue number | 2 |
| DOIs | |
| State | Published - 15 Jan 2026 |
Keywords
- functional out-of-distribution
- in-context learning
- large language models
- molecular property prediction
- shortcut learning
- SMILES representation
- structure–activity landscape index