zhangxinyue ・ Oct. 22, 2025
Chinese AI Startup DeepSeek Unveils Open-Source Optical Compression Model for LLM Training

TMTPOST -- Chinese artificial intelligence firm DeepSeek has released DeepSeek-OCR, an open-source model designed to extract and compress text from images and PDFs, aiming to provide large-scale, high-quality datasets for training large language models (LLMs) and vision-language models (VLMs) while dramatically reducing computational requirements.

The model was made publicly available on GitHub yesterday, accompanied by a research paper titled "DeepSeek-OCR: Contexts Optical Compression."

The technology behind DeepSeek-OCR leverages optical compression: textual content is rendered as images and encoded into compact visual representations, so a page of text can be stored and processed with far fewer tokens than its plain-text form.

According to the company, this approach addresses the major computational bottlenecks LLMs face when processing long-form content such as research papers, legal contracts, financial reports, and dialogue histories. By converting text into images, the system allows models to process extensive documents more efficiently, simulating a gradual forgetting mechanism similar to human memory.

Performance metrics shared in the research indicate that DeepSeek-OCR retains over 96% decoding accuracy at compression ratios up to 10×, about 90% at 10–12×, and around 60% at 20×.

This demonstrates that compact language models can effectively decode compressed visual text, potentially enabling larger models to adopt similar capabilities with fewer resources. The model is also highly scalable: a single A100-40G GPU can reportedly generate more than 200,000 pages of training data per day.

DeepSeek-OCR’s ability to compress long-form textual content opens new possibilities for LLM training, particularly for scenarios requiring the processing of massive amounts of data. By converting dialogues, research materials, and multi-page documents into images, the approach reduces token counts and computational overhead, potentially allowing models to handle larger datasets without a corresponding spike in GPU demand.
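The token arithmetic behind this can be sketched as follows. This is a back-of-the-envelope illustration, not DeepSeek's actual pipeline; the per-page token counts and vision-token budgets below are assumptions chosen to match the compression ratios reported in the paper:

```python
# Illustrative arithmetic for optical compression's token savings.
# The per-page figures are assumptions, not DeepSeek-OCR's real settings.

def compression_ratio(text_tokens: int, vision_tokens: int) -> float:
    """How many text tokens each vision token stands in for."""
    return text_tokens / vision_tokens

# Suppose a dense document page holds roughly 1,000 text tokens.
page_text_tokens = 1000

# Encoding the rendered page into 100 vision tokens gives 10x compression,
# the regime where the paper reports >96% decoding accuracy.
print(compression_ratio(page_text_tokens, 100))   # 10.0

# Squeezing the same page into 50 vision tokens gives 20x compression,
# where reported accuracy falls to around 60%.
print(compression_ratio(page_text_tokens, 50))    # 20.0

# For a 100-page document at the 10x setting, the model's input shrinks
# from ~100,000 text tokens to ~10,000 vision tokens.
pages = 100
print(pages * page_text_tokens, "->", pages * 100)  # 100000 -> 10000
```

The trade-off the metrics describe is visible here: pushing the vision-token budget lower raises the compression ratio but costs decoding accuracy.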

The open-source release has already attracted attention within the AI community, with DeepSeek-OCR garnering over 1,400 stars on GitHub shortly after its debut.

Analysts note that while the model represents a significant technical advance, DeepSeek has been relatively slow to roll out new flagship models such as R2. Some experts read this as a sign the company is temporarily falling behind in the rapidly evolving AI field.

Others, however, interpret the cautious pace as a deliberate strategy to strengthen internal capabilities and lay the groundwork for a next-generation AI model.
