DeepSeek, the rapidly rising AI company, has unveiled a new suite of multimodal AI models under the Janus-Pro family, which it claims can outperform OpenAI’s DALL-E 3.
These models, ranging from 1 billion to 7 billion parameters in size, are available for download via the AI development platform Hugging Face.
The Janus-Pro models are licensed under the MIT license, which permits commercial use.
DeepSeek describes Janus-Pro as a “novel autoregressive framework” capable of both image analysis and creation.
According to the company, the largest Janus-Pro model, Janus-Pro-7B, outperforms DALL-E 3 and other models like PixArt-alpha, Emu3-Gen, and Stability AI’s Stable Diffusion XL on AI evaluation benchmarks such as GenEval and DPG-Bench.
While some of these competing models are older, and most Janus-Pro models can analyze only small images (up to 384 x 384 pixels), Janus-Pro’s performance remains impressive given the models’ compact size.
DeepSeek believes that Janus-Pro, with its simple yet powerful framework, surpasses previous unified models and challenges task-specific models. This makes it a strong contender in the field of next-generation unified multimodal models.
The company gained widespread attention after its chatbot app rose to the top of the Apple App Store charts. DeepSeek is funded primarily by High-Flyer Capital Management, a quantitative trading firm. Its language models, developed using compute-efficient methods, have raised questions about the future of AI development, about whether other nations can challenge U.S. dominance in the AI sector, and about future demand for AI chips.