Discover how SimpleQA tests the limits of language models by measuring their accuracy on straightforward questions, pushing ...