During evaluation of Opus 4.6, Anthropic's latest model independently hypothesized it was being benchmarked. It identified which benchmark. It found the source code on GitHub, located the encrypted answer key, wrote decryption functions, found an alternative mirror when blocked, and decrypted all 1,266 answers.
Why are we attracted to things that have perfect symmetry, that match up on either side? It might have something to do with the way our brains are wired.