DeepSeek Wikipedia
Compared to DeepSeek 67B, DeepSeek-V2 offers better performance while being 42.5% less expensive to train, using a 93.3% smaller KV cache, and generating responses up to 5.76 times faster....
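
To make the percentages concrete, here is a minimal sketch that converts the quoted reductions into ratios relative to the DeepSeek 67B baseline. The function and variable names are illustrative assumptions, not part of any DeepSeek tooling, and only the three figures quoted above are used as input.

```python
def remaining_fraction(percent_reduction: float) -> float:
    """Return the fraction of the baseline left after a percentage reduction."""
    return 1.0 - percent_reduction / 100.0


# Figures quoted in the text (DeepSeek-V2 relative to DeepSeek 67B).
training_cost_ratio = remaining_fraction(42.5)  # training cost vs. baseline
kv_cache_ratio = remaining_fraction(93.3)       # KV cache size vs. baseline
kv_cache_shrink_factor = 1.0 / kv_cache_ratio   # how many times smaller the cache is
generation_speedup = 5.76                       # quoted upper bound, used as-is

print(f"Training cost: {training_cost_ratio:.3f}x the baseline")
print(f"KV cache size: {kv_cache_ratio:.3f}x the baseline "
      f"(about {kv_cache_shrink_factor:.1f}x smaller)")
print(f"Generation speed: up to {generation_speedup}x the baseline")
```

Read this way, a 93.3% reduction leaves a KV cache roughly one fifteenth the size of the baseline, while a 42.5% saving leaves training cost at a little over half.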