DeepSeek launches Janus-Pro image generator, claims it outperforms DALL-E 3
DeepSeek, the rapidly rising AI company, has unveiled a new suite of multimodal AI models under the Janus-Pro family, which it claims can outperform OpenAI’s DALL-E 3.
These models, ranging from 1 billion to 7 billion parameters in size, are available for download via the AI development platform Hugging Face.
The Janus-Pro models are released under the MIT license, which permits commercial use.
DeepSeek describes Janus-Pro as a “novel autoregressive framework” that can both analyze and generate images.
According to the company, the largest model in the family, Janus-Pro-7B, outperforms DALL-E 3 as well as PixArt-alpha, Emu3-Gen, and Stability AI’s Stable Diffusion XL on the GenEval and DPG-Bench benchmarks.
While some of these competing models are older, and most Janus-Pro models can only analyze smaller images (up to 384 x 384 resolution), the performance of Janus-Pro remains impressive considering its compact design.
DeepSeek believes that Janus-Pro, with its simple yet powerful framework, surpasses previous unified models and challenges task-specific models. This makes it a strong contender in the field of next-generation unified multimodal models.
The company gained widespread attention after its chatbot app rose to the top of the Apple App Store charts. DeepSeek is funded primarily by High-Flyer Capital Management, a quantitative trading firm. Its language models, developed using compute-efficient methods, have raised questions about whether other nations can challenge U.S. dominance in the AI sector and whether demand for AI chips will hold up.