# Cross-format error-awareness transfer: rigorous summary

Model: meta-llama/Llama-3.1-8B-Instruct
Features: top-50 tokens by |Cohen's d| on arithmetic train; logistic regression.
Bootstrap: 2000 resamples, percentile 95% CI, seed 42.

## IN-FORMAT (held-out arithmetic test, 20%)
AUC-ROC = 0.9922 (95% CI 0.9807-0.9989, n=120)

## CROSS-FORMAT (all capital-city statements)
AUC-ROC = 0.9817 (95% CI 0.9636-0.9956, n=300)

## Transfer gap
in-format minus cross-format AUC = +0.0105

## Secondary diagnostics
capitals-internal AUC (5-fold OOF within capitals) = 1.0000 (95% CI 1.0000-1.0000, n=300)
mean P('.') arithmetic: correct=0.684 incorrect=0.500 (gap=+0.184)
mean P('.') capitals: correct=0.746 incorrect=0.668 (gap=+0.078)

## Verdict
The arithmetic-trained error-awareness classifier transfers strongly to capital-city statements, indicating a largely format-general signal.

## Comparison with published Qwen2.5-7B-Instruct run
Qwen in-format AUC = 0.9683 (95% CI 0.9352-0.9928, n=120)
Qwen cross-format AUC = 0.6490 (95% CI 0.5834-0.7075, n=300)
Qwen capitals-internal AUC = 0.9905 (95% CI 0.9796-0.9980, n=300)
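
## Sketch: percentile bootstrap CI for AUC-ROC

The intervals in this summary are described as percentile 95% CIs from 2000 bootstrap resamples with seed 42. The block below is a minimal sketch of how such an interval could be computed, not the script that produced the numbers above. The function names `auc_roc` and `bootstrap_auc_ci` are hypothetical, the AUC is computed from ranks (Mann-Whitney U) using NumPy only, and redrawing resamples that contain a single class is an assumption, since the report does not say how that case was handled. The demo data is synthetic.

```python
import numpy as np


def auc_roc(y_true, scores):
    """AUC-ROC via the Mann-Whitney U statistic, using average ranks for ties."""
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = int(y_true.sum())
    n_neg = y_true.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC-ROC needs at least one example of each class")

    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(scores.size, dtype=float)
    i = 0
    while i < scores.size:
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < scores.size and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        # 1-indexed average rank of sorted positions i..j.
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0
        i = j + 1

    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u / (n_pos * n_neg))


def bootstrap_auc_ci(y_true, scores, n_boot=2000, alpha=0.05, seed=42):
    """Return (point AUC, lower, upper) using a percentile bootstrap."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    rng = np.random.default_rng(seed)
    n = y_true.size
    stats = []
    while len(stats) < n_boot:
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        if y_true[idx].min() == y_true[idx].max():
            continue  # assumption: redraw resamples that contain one class only
        stats.append(auc_roc(y_true[idx], scores[idx]))
    lower, upper = np.percentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return auc_roc(y_true, scores), float(lower), float(upper)


if __name__ == "__main__":
    # Synthetic stand-in for classifier outputs; not the data behind this report.
    demo_rng = np.random.default_rng(0)
    labels = np.repeat([0, 1], 60)
    probs = labels + demo_rng.normal(0.0, 1.0, size=labels.size)
    auc, lower, upper = bootstrap_auc_ci(labels, probs)
    print(f"AUC-ROC = {auc:.4f} (95% CI {lower:.4f}-{upper:.4f}, n={labels.size})")
```

When the scores separate the two classes perfectly, every resample also has AUC 1.0, so the percentile interval collapses to a single point. That is the shape of the capitals-internal line for Llama (1.0000, CI 1.0000-1.0000). The transfer gap is the difference of two point estimates (0.9922 - 0.9817 = +0.0105) and carries no interval of its own in this report.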