Legal IDE
Swiss legal workspace
Technical
Engineering notes on latency, KV cache, and on-prem AI systems.
Engineering notes for building responsive on-prem legal assistants.
How tensor parallelism (TP) and pipeline parallelism (PP) work, when to use each, and how they combine with data parallelism.