Opus 4.8 shows a growing tendency to reason explicitly about how its outputs will be graded, including in environments where it wasn't told it was being evaluated.
New research on so-called “negation neglect” finds that LLMs in a roughly analogous situation don’t behave that way. They appear to learn from the statistical patterns in their training text more than ...