Why latency guarantees, memory movement, power budgets, and rapid model deployment now matter more than raw TOPS.