TLDR
- Nebius (NBIS) revealed a definitive agreement to purchase AI inference specialist Eigen AI in a transaction valued at roughly $643 million, structured as cash plus Class A shares.
- Integration of Eigen AI’s optimization solutions into Nebius Token Factory will enhance the managed inference offering for business customers.
- Eigen AI’s founders, who come from MIT’s HAN Lab, will launch Nebius’s first Bay Area engineering and development center.
- Collaborative model optimizations from both organizations have already achieved top-tier performance ratings on Artificial Analysis testing platforms.
- NBIS shares climbed 8.51% following the announcement, reaching $150.00, recovering from a 6.07% weekly downturn.
On May 1, 2026, Nebius (NBIS) disclosed a definitive agreement to purchase Eigen AI in a deal worth approximately $643 million. The consideration combines cash with Nebius Class A shares, with the shares valued at the company’s 30-day volume-weighted average price as of the signing date. NBIS shares rose 8.51% on the announcement, reaching $150.00.
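For readers unfamiliar with the pricing reference, a volume-weighted average price (VWAP) weights each day's price by that day's trading volume. The sketch below is illustrative only: the prices, volumes, and dollar amount are made-up placeholders, not figures from the deal, and the helper names `vwap` and `shares_for_value` are hypothetical. The article does not disclose the split between cash and stock.

```python
def vwap(prices, volumes):
    """Volume-weighted average price: sum(price * volume) / sum(volume)."""
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume <= 0:
        raise ValueError("total volume must be positive")
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


def shares_for_value(stock_value_usd, reference_price):
    """Number of shares that corresponds to a dollar amount at a reference price."""
    if reference_price <= 0:
        raise ValueError("reference price must be positive")
    return stock_value_usd / reference_price


# Placeholder data: three trading days instead of thirty, invented numbers.
sample_prices = [100.0, 110.0, 120.0]
sample_volumes = [2_000_000, 1_000_000, 1_000_000]

reference = vwap(sample_prices, sample_volumes)
print(f"VWAP over sample window: {reference:.2f}")
print(f"Shares for a hypothetical $1,000,000: {shares_for_value(1_000_000, reference):,.0f}")
```

Because high-volume days count for more, a VWAP is harder to move with a single thin-volume price spike than a simple average, which is one reason it is a common reference price in stock-funded deals.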
Completion of the acquisition is anticipated within the coming weeks, subject to antitrust regulatory approval and customary closing requirements.
Eigen AI specializes in inference acceleration and model optimization technology. The company’s solutions let AI development teams run open-source models in production with higher performance and lower costs, without having to build optimization infrastructure in-house.
Nebius intends to integrate Eigen AI’s technology into its Token Factory platform. Token Factory provides auto-scaling API endpoints and fine-tuning capabilities for prominent open-source models such as Llama, DeepSeek, Qwen, and Gemma.
The two companies have already worked together. Before the acquisition was announced, they jointly engineered optimized model deployments that achieved leading performance scores on Artificial Analysis, an independent AI benchmarking platform.
Eigen AI’s Technical Leadership
Eigen AI emerged from MIT’s HAN Lab research group. Co-founders Ryan Hanrui Wang and Wei-Chen Wang developed two foundational technologies now widely adopted in production AI systems.
Ryan’s work on sparse attention (SpAtten) is the most-cited HPCA publication since 2020. Wei-Chen’s work on Activation-aware Weight Quantization (AWQ) earned the MLSys 2024 Best Paper Award and has become a de facto standard for 4-bit model deployment.
Co-founder Di Jin earned his PhD from MIT CSAIL and played a key role in developing Meta’s Llama 3 and Llama 4 post-training processes. He also co-developed the CGPO reinforcement learning from human feedback methodology.
Upon transaction completion, the engineering team will establish operations in the San Francisco Bay Area, marking Nebius’s first United States-based research and development facility.
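To give a sense of what 4-bit deployment involves, here is a minimal sketch of plain round-to-nearest quantization of one group of weights to 4-bit codes. This is not AWQ itself: AWQ additionally rescales weight channels based on activation statistics before quantizing, and that step is omitted here. The function names and sample weights are hypothetical.

```python
def quantize_group_4bit(weights):
    """Map one group of float weights to integer codes in 0..15 (4 bits each).

    Returns the codes plus the scale and offset needed to reconstruct them.
    """
    lo, hi = min(weights), max(weights)
    # 4 bits give 16 levels, so the range is split into 15 steps.
    scale = (hi - lo) / 15 if hi > lo else 1.0
    codes = [min(15, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Reconstruct approximate float weights from 4-bit codes."""
    return [c * scale + lo for c in codes]


# Invented sample weights for illustration.
group = [-1.0, -0.3, 0.0, 0.4, 0.9, 2.0]
codes, scale, lo = quantize_group_4bit(group)
restored = dequantize_group(codes, scale, lo)

worst_error = max(abs(a - b) for a, b in zip(group, restored))
print("codes:", codes)
print(f"step size: {scale:.4f}, worst reconstruction error: {worst_error:.4f}")
```

Each weight is stored in 4 bits instead of 16, at the cost of a rounding error of at most half a step. Methods such as AWQ aim to keep that error small on the weights that matter most to model output.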
Industry Dynamics in AI Inference
Inference is currently the fastest-growing segment of AI computing. Projections indicate it will account for approximately two-thirds of total AI compute demand in 2026.
Optimizing inference is technically demanding. It spans model representation, GPU kernel optimization, and dynamic workload orchestration, capabilities that most organizations lack in-house.
Open-source model architectures, which are generally not optimized out of the box, compound these difficulties. Advanced designs such as Mixture-of-Experts and Compressed Sparse Attention pose distinct memory management and compute efficiency challenges that require specialized expertise.
Eigen AI’s optimization stack covers post-training, fine-tuning, and production inference across leading open-source architectures. Its kernel-level and model-level techniques are designed to improve hardware utilization without requiring additional engineering resources from customers.



