kv-cache

1 post connected to this tag.

Jun 25, 2026

KV Cache Memory: The Hidden State That Makes LLM Decode Work

KV cache memory for CPU-native inference. This ShivasNotes deep dive is written for engineers who want to understand the single largest memory consumer in autoregressive LLM inference: the KV...

ShivasNotes

Engineering notes, A.I. workflow, drones, systems programming, and the messy process of building in public.
