Particle News: DeepSeek Open-Sources OCR System That Compresses LLM Contexts With Visual Tokens

Overview

DeepSeek released the code and weights on GitHub and Hugging Face, where the project quickly drew thousands of stars and active developer interest.
The architecture combines a ~380M-parameter DeepEncoder with a 3B-parameter MoE decoder that uses about 570M active parameters.
On OmniDocBench, the team reports surpassing GOT-OCR 2.0 using 100 vision tokens per page and outperforming MinerU 2.0 while staying under 800 tokens.
Throughput is reported at more than 200,000 pages per day on a single Nvidia A100, suggesting substantial processing and cost-efficiency gains.
Reported decoding precision is about 97% when text is compressed by less than 10× (text tokens per vision token) but falls to roughly 60% at 20×, a trade-off that still awaits independent validation.
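
To put the reported figures in more concrete terms, the short Python sketch below works through the arithmetic. It is only an illustration: the helper names are hypothetical, the 1,000-text-token page is an assumed example rather than a figure from DeepSeek, and the compression ratio is assumed to mean text tokens divided by vision tokens.

```python
import math

SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_second(pages_per_day: float) -> float:
    """Convert a daily page throughput into pages per second."""
    return pages_per_day / SECONDS_PER_DAY


def vision_tokens_needed(text_tokens: int, compression_ratio: float) -> int:
    """Vision tokens required to represent `text_tokens` at a given ratio.

    Assumes compression ratio = text tokens / vision tokens, rounded up
    to a whole token.
    """
    return math.ceil(text_tokens / compression_ratio)


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of decoder parameters that are active per token in the MoE."""
    return active_params / total_params


if __name__ == "__main__":
    # Reported throughput: more than 200,000 pages per day on one A100.
    print(f"{pages_per_second(200_000):.2f} pages/s")  # 2.31 pages/s

    # Hypothetical page of 1,000 text tokens (an assumption for illustration).
    print(vision_tokens_needed(1_000, 10))  # 100 vision tokens at 10x
    print(vision_tokens_needed(1_000, 20))  # 50 vision tokens at 20x

    # Roughly 570M active parameters out of a 3B-parameter decoder.
    print(f"{active_fraction(570e6, 3e9):.0%} of decoder parameters active")  # 19%
```

Under these assumptions, halving the vision-token budget from 100 to 50 for the same page corresponds to the move from roughly 97% to roughly 60% reported precision, which is why the 10× mark reads as the practical limit in the reported results.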