In a move that underscores the intensifying AI arms race, Microsoft has released three new in-house AI models that directly compete with products from its partner and largest AI investment, OpenAI.
The company unveiled MAI-Transcribe-1, MAI-Voice-1, and MAI-Image-2 on April 3rd, making them available through Microsoft Foundry. These models represent Microsoft’s most aggressive push yet to build and ship foundational AI capabilities independently — despite holding an estimated $135 billion stake in OpenAI.
The Models
MAI-Transcribe-1 is a speech-to-text model that achieved a 3.8% average Word Error Rate across 25 languages on the FLEURS benchmark, outperforming OpenAI’s Whisper-large-v3 on all tested languages. This positions it as a strong alternative for enterprises needing high-accuracy transcription in call centers, meetings, and media workflows. Pricing starts at $0.36 per hour.
MAI-Voice-1 is a text-to-speech model capable of generating a full minute of synthetic speech in under a second. This speed advantage could make it attractive for real-time applications like客户服务, content creation, and accessibility tools. Pricing is set at $22 per million characters.
MAI-Image-2 is already rolling out across Microsoft products including Bing and PowerPoint, marking one of the fastest production deployments for an internal model. Microsoft claims it runs at least twice as fast as its predecessor.
Why This Matters Now
The timing is noteworthy. Microsoft has spent years positioning itself as OpenAI’s primary distribution channel — embedding GPT models into Copilot, Azure, and enterprise products. Yet the company has simultaneously been building its own AI research and development infrastructure through its MAI (Microsoft AI) division.
This release signals a strategic pivot: Microsoft is no longer content being solely a customer of OpenAI. By shipping competing products — especially in transcription and voice — Microsoft is hedging its bets while the AI market rapidly consolidates.
For enterprises, this creates options. Organizations concerned about vendor lock-in or seeking more competitive pricing can now choose Microsoft’s internal models over OpenAI’s equivalents — without leaving the Microsoft ecosystem.
The Competitive Landscape
Microsoft joins Google, Anthropic, and Meta in building a full-stack AI portfolio. The company reportedly aims to develop cutting-edge AI models by 2027, according to recent statements. The MAI family now spans reasoning, speech, vision, and transcription — a breadth that mirrors what OpenAI offers, albeit through different architectural approaches.
What remains unclear is how OpenAI views this direct competition from its biggest financial backer. Microsoft has not indicated any change to its OpenAI partnership, but the release of competing transcription and voice products suggests the relationship is evolving from pure partnership toward strategic rivalry.
Sources: Microsoft AI Blog, The Tech User