- Google enterprise AI now spans a full agentic stack from model layer to individual worker inbox, closing a real deployment gap.
- Gemini 3.5 Flash benchmarks suggest Google enterprise AI is ready for serious tool-using agent workflows at speed and low cost.
- Gemini Spark runs 24/7 as a personal agent, connecting Workspace with Salesforce, Zendesk, and ServiceNow autonomously.
- CodeMender, built by Google DeepMind, audits and patches codebases automatically with full developer approval gates.
Google Enterprise AI Finally Gets End-to-End Plumbing
Google enterprise AI took a significant step forward at I/O ’26 this week — not because of a single headline model, but because Google shipped something the enterprise market has been quietly desperate for: a fully integrated agentic stack where the security, governance, and agent behaviour are all designed together rather than bolted on after the fact. That distinction matters more than it might sound.
For the past two years, most large organisations have found themselves stuck in a frustrating middle ground. They have access to capable foundation models — Gemini, GPT-4o, Claude — but the path from “we have the model” to “our teams are actually running less manual work” has required building a lot of custom scaffolding. Separate orchestration layers. Fragile connector work between tools. Homegrown governance that nobody really trusts. The result has been enterprise AI pilots that stay pilots, and real productivity gains that stay theoretical.
What Google announced today is an attempt to collapse that scaffolding into a single platform surface. Five products, each sitting at a different layer of the stack, all of them codesigned to work together. It’s an ambitious bet — and for the first time in a while, it’s the kind of announcement where the enterprise details deserve as much attention as the model numbers. Google enterprise AI is no longer just a capability story; it’s a platform story.
What Gemini 3.5 Flash Actually Signals
Start with the model layer, because the benchmark choices are telling. Gemini 3.5 Flash is the new baseline model for Google enterprise AI, and the claim is that it competes with larger flagship models while staying within Flash’s speed and cost profile. The numbers Google is citing: 76.2% on Terminal-Bench 2.1, 83.6% on MCP Atlas, and 84.2% on CharXiv for multimodal understanding.
The MCP Atlas score is the one worth paying attention to. That benchmark is built specifically around Model Context Protocol task completion — the same protocol that’s quietly become the de facto standard for tool-using agents across the industry. Scoring 83.6% there isn’t just a performance number; it’s Google implicitly endorsing MCP as the right evaluation surface for agentic capability. Given how much momentum MCP has built since Google Cloud Next ’26, that’s a meaningful signal about where the evaluation bar is heading.
Gemini 3.5 Pro is in testing and expected next month. For now, Flash is live today in Gemini Enterprise, Google AI Studio, and Antigravity.
Gemini Omni is the video-first model — it takes text, audio, image, and video inputs and produces dynamic video output. The target use cases are things like post-production automation, e-commerce virtual try-ons, and content localisation at scale. It’s rolling out via the Gemini API in the coming weeks. Video generation for enterprise workflows has been a notoriously hard problem to operationalise, so it’ll be interesting to see how Omni performs outside controlled demos.
Antigravity 2.0: Where Developers Actually Live
Antigravity 2.0 is the coding environment piece of Google enterprise AI, and it’s gotten a meaningful upgrade. It’s now a standalone desktop app that integrates directly with Agent Platform, which means it inherits Google Cloud’s data privacy protections by default rather than requiring developers to configure them separately. There’s also a new Antigravity CLI for teams that want a lighter-weight interface without the full desktop app.
The most concrete proof point Google offered here came from outside Google itself. AirAsia Next’s CTO stated that over half of their production-ready code now comes through Antigravity agentic workflows. That’s not a benchmark number or a controlled demo; it’s a customer shipping production code through the tool. For a developer-facing product, that kind of third-party validation carries real weight.
The product launch use case Google demonstrated is a good illustration of what Antigravity 2.0 is trying to do: simultaneous agent-driven execution across code generation, asset creation, and customer email drafts, all orchestrated from a single workspace. Coordination across those three work streams is normally where things fall apart in enterprise settings — someone’s waiting on someone else, context gets lost between tools, versions get out of sync. Antigravity’s pitch is that agents handle the coordination layer so humans can stay focused on decisions rather than logistics. It’s one of the cleaner examples of how Google enterprise AI is being designed around actual workflow friction rather than abstract capability.
Gemini Spark: The Personal Agent With a Real Isolation Story
Gemini Spark is the layer of Google enterprise AI that’s closest to the individual worker. It runs continuously in the background, connects to Google Workspace and, through a set of connectors, to external systems (Salesforce, Zendesk, ServiceNow, SharePoint), and can take multi-step actions autonomously. Anything high-risk requires explicit human approval before it executes.
The security architecture here is more specific than most personal agent announcements tend to be. Every task runs in an ephemeral VM. Credentials never touch the agent directly. All traffic routes through an Agent Gateway that enforces data loss prevention policies. That’s a concrete isolation design, not a vague promise about enterprise-grade security.
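To make the gateway idea concrete, here is a minimal Python sketch of the general pattern: the agent hands a payload to a gateway, the gateway applies a data loss prevention check, and only the gateway holds the credential. Everything in it is an illustrative assumption, including the `gateway_send` function, the `_SECRETS` store, and the card-number rule. It is not Google’s Agent Gateway API, which the announcement does not detail.

```python
import re

# Hypothetical DLP rule: block anything that looks like a 16-digit card number.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")

# Hypothetical secret store held by the gateway, never by the agent.
_SECRETS = {"salesforce": "token-held-by-gateway"}


def gateway_send(connector: str, payload: str) -> dict:
    """Apply the DLP check, then authenticate on the agent's behalf."""
    if CARD_PATTERN.search(payload):
        return {"status": "blocked", "reason": "dlp_policy"}
    if connector not in _SECRETS:
        return {"status": "blocked", "reason": "unknown_connector"}
    # A real gateway would forward the request here using the stored token.
    # This sketch only reports the outcome and never returns the credential.
    return {"status": "forwarded", "connector": connector, "authenticated": True}
```

The design point is that the agent only ever sees the result dictionary, so a compromised or confused agent has no credential to leak.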
The use cases Google walked through illustrate the practical ambition well. For IT operations: Spark monitors ServiceNow tickets, detects recurring incidents, creates escalated Jira tickets, drafts incident reports, and flags the relevant manager for approval before anything goes out externally. For sales: Spark pulls account history from Salesforce, cross-references support tickets in Zendesk, identifies churn signals, and drafts a retention strategy — sitting in draft until the salesperson approves it. In both cases, the human’s job shifts from doing the research and writing to reviewing and approving the output.
That shift — from AI-assisted work to AI-executed work with human approval — is the actual direction the whole announcement points. It’s not a subtle change in how work gets done. It’s a structural one, and Google enterprise AI is clearly being positioned so that enterprises can adopt it faster because the security story is already built in rather than left to individual teams to construct.
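The approval pattern described above can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not Spark’s implementation: the `Risk`, `Action`, and `ApprovalGate` names are hypothetical, and how Spark actually classifies an action as high-risk is not specified in the announcement.

```python
from dataclasses import dataclass, field
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


@dataclass
class Action:
    description: str
    risk: Risk


@dataclass
class ApprovalGate:
    """Runs low-risk actions immediately and holds high-risk ones as drafts."""
    executed: list = field(default_factory=list)
    pending: list = field(default_factory=list)

    def submit(self, action: Action) -> str:
        # High-risk actions wait in the pending queue until a human approves.
        if action.risk is Risk.HIGH:
            self.pending.append(action)
            return "pending_approval"
        self.executed.append(action)
        return "executed"

    def approve(self, action: Action) -> None:
        # Raises ValueError if the action was never submitted for approval.
        self.pending.remove(action)
        self.executed.append(action)
```

In the sales example, pulling account history would pass straight through, while sending the drafted retention message would sit in `pending` until the salesperson calls `approve`.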
Gemini Spark in the Gemini Enterprise app is rolling out soon, with a Workspace preview for business customers following after that.
The Managed Agents API and CodeMender
For developers who want to build their own agents rather than use Google’s pre-built ones, the Managed Agents API lets teams spin up custom agents via a single API call, running in Google-hosted environments. No infrastructure to manage; governance and security inherit from Agent Platform automatically. The documentation is live at docs.cloud.google.com.
The automatic governance inheritance is the part that matters most here. One of the consistent pain points in enterprise AI deployments today is that security and compliance requirements get handled as a separate workstream from the actual agent development, which means they often get handled late or inconsistently. Making Agent Platform’s data protections automatic for anything built on the Managed Agents API is a specific attempt to close that gap at the platform level rather than leaving it to individual development teams.
CodeMender is the security-focused addition to the stack — an AI security agent built by Google DeepMind, now integrated into Agent Platform. It audits codebases for vulnerabilities, proposes patches, tests them, and can apply fixes across dependent systems with developer sign-off. For teams carrying compliance obligations where every code change needs a documented audit trail, that workflow is particularly relevant. Security tooling that generates its own audit trail as it works is genuinely useful in regulated industries, not just a feature checkbox. CodeMender is a good example of how Google enterprise AI is expanding beyond productivity use cases into security and compliance workflows.
Why the Integrated Stack Bet Makes Sense Right Now
Google’s big structural argument at I/O ’26 is that the market is ready to move past the “assemble your own stack” phase. The status quo (model API plus separate orchestration layer plus custom connector work plus homegrown governance) is expensive, fragile, and slow to iterate on. Google is betting that enterprises will trade some flexibility for the speed and reliability of a platform where all those layers are codesigned.
It’s a similar bet to what Salesforce made with its Agentforce platform and what Microsoft has been building with Copilot Studio and the Azure AI Foundry stack. The competitive dynamic is real: enterprises are going to consolidate around one or two agentic platforms, and the window to establish platform lock-in is open right now. Google enterprise AI’s advantage is the depth of its existing Workspace penetration combined with the breadth of its model portfolio. Its challenge is that Microsoft’s enterprise relationships are still deeper in most Fortune 500 accounts, and Salesforce owns the CRM data layer where a lot of the most valuable agentic workflows actually live.
What’s different about Google enterprise AI at I/O ’26 compared to previous pushes is the specificity of the use cases and the concreteness of the security architecture. Previous announcements have tended toward capability demonstrations. This one reads more like a product that’s been designed around how enterprise IT and security teams actually evaluate these purchases — which suggests Google has been listening to why earlier deployments stalled.
Whether the execution matches the architecture on paper is a different question. But the direction is clear: Google enterprise AI is no longer just about giving companies access to a capable model. It’s about giving them a platform where capable agents can actually run at scale — with the governance story tight enough that the security team doesn’t kill the project before it ships.

