GPT-5.5 hallucinates 3x more than MIT-licensed GLM-5.2

	GPT-5.5 hallucinates 3x more than MIT-licensed GLM-5.2 (arrowtsx.dev)
	527 points by oshrimpton 1 day ago | 254 comments
	tl;dr: Despite being roughly half the size, the MIT-licensed GLM-5.2 scores within 4 points of GPT-5.5 on the Artificial Analysis Intelligence Index while hallucinating far less (28% vs 86%). The author takes this as evidence that raw parameter scaling has plateaued and may actively harm truthfulness. The author argues that massive models like DeepSeek V4 Pro (94% hallucination rate) fail to recognize the limits of their own knowledge and waste compute on confidently wrong answers. Model training and selection should instead balance a trilemma of capability, hallucination rate, and compute efficiency.
	HN Discussion:
	↓ Claims that bigger models hallucinate more contradict observed trends in recent years
	↓ Hallucination rate metrics are conditional and don't reflect real-world user experience
	~ Hallucination is a training/RLVR problem, not fundamentally a model size issue
	↓ The author has undisclosed conflicts of interest and cherry-picks rate over accuracy
	↓ Anecdotal experience shows GLM-5.2 actually hallucinates more than the article claims
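
The "conditional" objection in the discussion can be made concrete with a small sketch. It assumes, for illustration only, that the hallucination rate is computed as wrong answers divided by all questions the model did not get right (wrong plus abstained); the benchmark's exact definition may differ. The `Tally` class and the counts for `model_a` and `model_b` are hypothetical and are not taken from the article or from any benchmark.

```python
from dataclasses import dataclass


@dataclass
class Tally:
    """Hypothetical per-model counts on a closed-book QA benchmark."""
    correct: int
    incorrect: int
    abstained: int

    @property
    def total(self) -> int:
        return self.correct + self.incorrect + self.abstained

    def accuracy(self) -> float:
        # Share of all questions answered correctly.
        return self.correct / self.total

    def conditional_hallucination_rate(self) -> float:
        # ASSUMED definition: of the questions the model did not get right,
        # how often did it answer wrongly instead of abstaining?
        missed = self.incorrect + self.abstained
        return self.incorrect / missed if missed else 0.0

    def overall_error_rate(self) -> float:
        # Share of all questions that received a wrong answer.
        return self.incorrect / self.total


# Made-up counts, chosen only to show how the two metrics can diverge.
model_a = Tally(correct=90, incorrect=8, abstained=2)
model_b = Tally(correct=40, incorrect=15, abstained=45)

for name, tally in (("model_a", model_a), ("model_b", model_b)):
    print(
        f"{name}: accuracy={tally.accuracy():.2f} "
        f"conditional_rate={tally.conditional_hallucination_rate():.2f} "
        f"overall_error={tally.overall_error_rate():.2f}"
    )
```

With these invented counts, `model_a` has the higher conditional rate (0.80 vs 0.25) yet gives wrong answers to fewer questions overall (0.08 vs 0.15), which is the kind of gap between the metric and user experience that commenters point to.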