OpenAI has completed a controversial restructuring of its for-profit arm into a public benefit corporation: the latest gust in a whirlwind that has swept up hundreds of billions of dollars of global investment in artificial intelligence (AI) tools. But even with that long-awaited restructuring finished, a nagging issue with the core offering of the AI company (founded as a nonprofit, now valued at $500 billion) remains unresolved: hallucinations.
Large language models (LLMs), such as those that underpin OpenAI’s popular ChatGPT platform, are prone to confidently spouting factually incorrect statements. These blips are often attributed to bad input data, but in a preprint posted last month, a team from OpenAI and the Georgia Institute of Technology proves that even with flawless training data, LLMs can never be all-knowing, in part because some questions are inherently unanswerable.
The root problem, the researchers say, may lie in how LLMs are trained. They learn to bluff because their performance is ranked using standardized benchmarks that reward confident guesses and penalize honest uncertainty. In response, the team calls for an overhaul of benchmarking so that accuracy and self-awareness count as much as confidence. Although some experts find the preprint technically compelling, reactions to its suggested remedy vary. Some even question how far OpenAI will go in taking its own medicine to train its models to prioritize truthfulness over engagement.
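A rough sketch of that scoring incentive, using an illustrative 0/1 benchmark and hypothetical numbers (not taken from the preprint), shows why confident guessing wins under such rules:

```python
# Illustrative sketch with assumed numbers: under a benchmark that scores
# 1 for a correct answer and 0 otherwise, guessing beats admitting
# uncertainty in expectation, no matter how unsure the model is.

p_correct_guess = 0.3  # assumed chance a confident guess happens to be right

expected_score_guess = p_correct_guess * 1 + (1 - p_correct_guess) * 0    # 0.3
expected_score_abstain = 0.0  # "I don't know" earns nothing under 0/1 grading

# A rule that penalizes wrong answers (here, -1) flips the incentive
# whenever the model's chance of being right is low enough.
penalty_for_wrong = -1
expected_score_guess_penalized = (
    p_correct_guess * 1 + (1 - p_correct_guess) * penalty_for_wrong        # -0.4
)

print(expected_score_guess, expected_score_abstain, expected_score_guess_penalized)
```

Under the first rule, a model maximizes its expected score by always answering, however uncertain it is; only a rule that penalizes wrong answers or credits abstention makes "I don't know" a competitive response.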
