DSpark: Speculative decoding accelerates LLM inference [pdf]

	DSpark: Speculative decoding accelerates LLM inference [pdf] (github.com)
	770 points by aurenvale 1 day ago | 329 comments
	HN Discussion:
	↑ Praise for DeepSeek's openness and innovation compared to American/Western labs
	↑ Positive user experience reports with DeepSeek models in practice
	↑ Speculation that this technique explains DeepSeek's dramatically lower pricing
	~ Questioning novelty given prior speculative decoding work from 2022
	• Forward-looking prediction about proliferation of specialized small models for speculative decoding