GitHub repo JuliusBrussee/caveman: a TypeScript skill for Claude Code, Codex, Gemini, and Cursor that asks the agent to talk like a caveman. 74,940 stars and 4,230 forks as of 2026-06-20; MIT license; 15 releases (latest v1.9.0 on 2026-06-12). The project's own benchmark of 10 real Claude API prompts shows a 65% average output-token reduction (range 22–87%), and the caveman-compress sub-skill cuts 46% of tokens from real memory files. The README's own Important box is the lead caveat: caveman only affects output tokens; thinking/reasoning tokens are untouched.
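To make the README's caveat concrete, here is a minimal sketch of how an output-only reduction dilutes once thinking tokens are counted. The function name and the token counts in the usage line are hypothetical and chosen for illustration; only the 65% figure comes from the project's benchmark.

```python
def effective_reduction(output_tokens: int, thinking_tokens: int,
                        output_reduction: float) -> float:
    """Fraction of all generated tokens saved when only output tokens shrink.

    Assumption: thinking tokens are completely unaffected, as the README states.
    """
    if output_tokens < 0 or thinking_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0.0 <= output_reduction <= 1.0:
        raise ValueError("output_reduction must be between 0 and 1")
    total = output_tokens + thinking_tokens
    if total == 0:
        return 0.0
    # Only the visible output shrinks; the thinking budget stays the same.
    saved = output_tokens * output_reduction
    return saved / total


if __name__ == "__main__":
    # Hypothetical response: as many thinking tokens as output tokens.
    print(effective_reduction(1000, 1000, 0.65))
```

With no thinking tokens the saving equals the benchmark's 65%. With thinking equal in size to the output, the saving on total generated tokens is half of that.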
On June 10, 2026, Google DeepMind released DiffusionGemma, a 26B-parameter model (3.8B active) based on Gemma 4 that uses a text-diffusion approach, denoising 256-token blocks in parallel. It reaches up to 1,000+ tokens/sec on a single H100, roughly 4x faster than an equivalent autoregressive (AR) model. It ships under the Apache 2.0 license with day-one support in vLLM, Hugging Face Transformers, Unsloth, and NeMo. It is explicitly labeled 'experimental': quality is below the standard of autoregressive Gemma 4, and speed is the point, not peak quality.
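As a rough back-of-the-envelope sketch of what those figures imply, the snippet below counts 256-token blocks and estimates generation time. It assumes the quoted peak of 1,000 tokens/sec holds for the whole response and treats the ~4x figure as exact, neither of which the announcement guarantees; the function names and the 2,048-token example are hypothetical.

```python
import math

BLOCK_TOKENS = 256                 # block size quoted in the announcement
DIFFUSION_TOKENS_PER_SEC = 1000.0  # quoted peak on one H100 (assumed sustained)
SPEEDUP_OVER_AR = 4.0              # "~4x" treated as exact for this estimate


def blocks_needed(n_tokens: int) -> int:
    """Number of 256-token blocks required to cover n_tokens."""
    return math.ceil(n_tokens / BLOCK_TOKENS)


def estimated_seconds(n_tokens: int, tokens_per_sec: float) -> float:
    """Naive generation time at a constant throughput."""
    return n_tokens / tokens_per_sec


if __name__ == "__main__":
    n = 2048  # hypothetical response length
    ar_rate = DIFFUSION_TOKENS_PER_SEC / SPEEDUP_OVER_AR
    print("blocks:", blocks_needed(n))
    print("diffusion seconds:", estimated_seconds(n, DIFFUSION_TOKENS_PER_SEC))
    print("autoregressive seconds:", estimated_seconds(n, ar_rate))
```

Under these assumptions a 2,048-token response spans 8 blocks and takes about 2 seconds, against about 8 seconds for the autoregressive baseline. Real throughput will vary with batch size, prompt length, and the number of denoising steps.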