Unlocking Multimodal Reasoning with Apriel-1.5-15B-Thinker
The ServiceNow AI Research Lab has introduced a multimodal reasoning model, Apriel-1.5-15B-Thinker. This 15-billion-parameter model is designed to deliver strong reasoning performance while remaining far cheaper to deploy than frontier-scale systems.
What’s New in Apriel-1.5-15B-Thinker?
Frontier-Level Performance
Apriel-1.5-15B-Thinker achieves an Artificial Analysis Intelligence Index (AAI) score of 52, demonstrating competitive performance while being significantly smaller than models such as DeepSeek-R1-0528. The score is derived from ten third-party evaluations, lending it reliability across a range of benchmarks.
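To see how a single index can summarize many evaluations, here is a minimal sketch. It assumes, for illustration only, an equal-weight average — this is not the official AAI formula — and the per-benchmark scores below are hypothetical, chosen only to show how ten results roll up into one number:

```python
# Illustrative sketch only: an index like AAI aggregates scores from ten
# third-party evaluations into a single number. The real weighting is defined
# by Artificial Analysis; an equal-weight average is assumed here.

def intelligence_index(scores: list[float]) -> float:
    """Aggregate per-benchmark scores (0-100) into a single index."""
    if len(scores) != 10:
        raise ValueError("expected scores from ten evaluations")
    return sum(scores) / len(scores)

# Hypothetical per-benchmark scores (not real results) averaging to 52.
example_scores = [88, 71, 62, 68, 73, 30, 25, 20, 42, 41]
print(intelligence_index(example_scores))  # → 52.0
```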
Single-GPU Deployability
Designed for efficiency, the model can be deployed on a single GPU, making it well suited to on-premises and air-gapped environments with fixed memory and latency budgets.
Open Weights and Reproducibility
For transparency and independent verification, Apriel-1.5-15B-Thinker ships with open weights and a reproducible training pipeline. Users can access the weights, training recipes, and evaluation protocols, promoting collaborative advancements in AI research.
Training Mechanism: A Closer Look
Base and Upscaling
Apriel-1.5-15B-Thinker builds on Mistral's Pixtral-12B-Base-2409, a multimodal stack pairing a vision encoder with a text decoder. The research team applied depth upscaling, increasing the number of decoder layers, while keeping the vision encoder largely intact to preserve the model's deployability.
Continual Pretraining (CPT)
Continual pretraining proceeds in two stages:
- Mixed text and image data to establish foundational reasoning and understanding, with sequence lengths of up to 32k tokens.
- Synthetic visual tasks that refine spatial and compositional reasoning, with sequence lengths of up to 16k tokens.
Supervised Fine-Tuning (SFT)
High-quality, reasoning-focused instruction data spanning domains such as math and coding is used during the fine-tuning phase. Notably, no reinforcement learning is applied — including reinforcement learning from AI feedback — keeping the training pipeline simple.
Impressive Benchmark Results
Apriel-1.5-15B-Thinker has demonstrated remarkable performance across various benchmarks, including:
- AIME 2025: 87.5% – 88%
- GPQA Diamond: ~71%
- IFBench: ~62%
- τ²-Bench (Telecom): ~68%
- LiveCodeBench: 72.8%
Evaluated with VLMEvalKit for reproducibility, Apriel posts competitive scores across multiple multimodal understanding tasks and excels at document and diagram processing.
Conclusion
Apriel-1.5-15B-Thinker exemplifies the potential of mid-training techniques to achieve high performance on critical benchmarks while remaining deployable on a single GPU. The model’s open weights and training methods make it an attractive option for enterprises looking to explore effective AI solutions without the overhead of larger, closed systems.
Related Keywords
- Multimodal reasoning
- Artificial Intelligence Index
- Open-source AI models
- GPU deployment
- Continual pretraining
- Supervised fine-tuning
- AI benchmarks