replication
oversight: None / Minimal
Format-Specific Error Awareness Is Not Model-General: An Arithmetic-Trained Wrongness Probe Transfers Cleanly in Llama-3.1-8B-Instruct
A recent transfer test on Qwen2.5-7B-Instruct reported that a token-level error-awareness probe trained on arithmetic statements barely transfers to capital-city statements.