With the announcement of Azure SRE Agent and the new Azure Agent Units, Microsoft has introduced the sixth consumption metric for its AI technologies. This is not a mere technical detail: it is a signal that AI license governance has become one of the most complex – and most expensive – topics that organizations face today.
When discussing Microsoft AI licenses, the most common risk is not paying too much for a single product. It is failing to realize how many different currencies are being used simultaneously, and how each responds to its own logic of consumption, forecasting, and optimization.
With the general availability announcement of the Azure SRE Agent in March 2026, Microsoft formalized a new unit of measurement: Azure Agent Units. This is the sixth main metric by which organizations are required to license AI technologies on the Microsoft platform, and it likely won't be the last.
This article maps them all, explains how they relate to each other, and offers some considerations on what it means to manage them together.
The metrics Microsoft uses today to measure and bill AI technology consumption are not interchangeable, nor are they designed to coexist intuitively. They respond to different logics: some are per user, others per interaction, others for reserved computational capacity or for an agent's execution seconds. The crucial point is that these metrics coexist within the same ecosystem and often within the same contract.
1. Per-seat license (per-user)
This is the most familiar model. Microsoft 365 Copilot is purchased as an add-on for $30/user/month, or included in Microsoft 365 E7 at $99 along with E5, Entra Suite, and Agent 365.
The license covers the use of Copilot in M365 apps and the creation of agents in Copilot Studio. However, agent execution always consumes paid Copilot Credits – even for those with E7. And if custom models are used via Azure AI Foundry, inference costs are billed separately on Azure, outside of Credits. This is precisely where the simplicity of the per-user license ends and the world of variable consumption begins.
2. Copilot Credits
From September 1, 2025, the operational currency for executing agents in Copilot Studio is Copilot Credits. Every operation consumes a variable number of credits depending on the type – a standard response, a generative response, an autonomous action, or grounding in Microsoft Graph have different rates and add up within the same interaction.
Those with an M365 Copilot license benefit from a partial exemption: standard responses, generative responses, and Microsoft Graph grounding do not consume credits when the user interacts via Teams, SharePoint, or Copilot Chat. However, this exemption is not unlimited: autonomous triggers always consume credits, even for these users. And custom models via Azure AI Foundry remain completely outside the Credit counter, billed directly on Azure separately.
Credits are purchased in three ways. Pay-as-you-go via Azure requires no upfront commitment and costs $0.01 per credit consumed: it is the most suitable mode for pilots and seasonal peaks. Prepaid monthly packs of 25,000 credits for $200 offer a predictable cost, but unused credits expire at the end of the month with no carryover; it is better not to overestimate needs. Finally, the annual Pre-Purchase Plan (P3) allows for the advance purchase of a pool of Copilot Credit Commit Units with increasing tier discounts up to 20%, designed for organizations with stable and predictable volumes.
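The trade-off between pay-as-you-go and prepaid packs can be sketched with a few lines of arithmetic. This is a minimal illustration built only on the figures quoted above ($0.01 per credit, 25,000 credits for $200); the function names are hypothetical, and the sketch assumes that packs are bought in whole units and that usage is covered entirely by one mode or the other.

```python
import math

PAYG_RATE_USD = 0.01      # per credit, as quoted in the text
PACK_CREDITS = 25_000     # credits per prepaid pack, as quoted in the text
PACK_PRICE_USD = 200.0    # per pack per month, as quoted in the text


def payg_cost(credits_used: int) -> float:
    """Monthly cost when every credit is billed pay-as-you-go."""
    return credits_used * PAYG_RATE_USD


def pack_cost(credits_used: int) -> float:
    """Monthly cost when usage is covered entirely by prepaid packs.

    Assumption: whole packs only; credits left over at month end expire,
    so they are paid for but lost.
    """
    packs = math.ceil(credits_used / PACK_CREDITS)
    return packs * PACK_PRICE_USD


def break_even_credits() -> int:
    """Monthly usage at which one pack costs the same as pay-as-you-go."""
    return round(PACK_PRICE_USD / PAYG_RATE_USD)
```

Under these assumptions a single pack pays off only above 20,000 credits per month: at 15,000 credits pay-as-you-go costs about $150 against $200 for the pack, which is why overestimating needs is the more expensive mistake.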
It is worth noting that Copilot Credits also cover AI Builder features integrated into Power Apps and Power Automate – with a progressive replacement of traditional AI Builder Credits, which will be completely retired by November 2026.
3. AI Builder Credits (being phased out)
For years, AI Builder Credits represented the metric for artificial intelligence features on the Power Platform: document processing, text recognition, and predictive models. Since November 2025, they have been in a transition phase toward Copilot Credits.
"Seeded" credits – those included in premium licenses like Power Apps Premium or Power Automate Premium – remain available only to those who had them before November 1, 2025: licenses purchased after that date no longer include these entitlements. Regardless of the original purchase date, all seeded credits will be permanently removed by November 1, 2026. For existing customers with AI Builder add-ons, renewal is still possible, but also only until November 2026, after which they will no longer be purchasable or renewable. For new customers, purchasing AI Builder add-ons is no longer possible: the path is through Copilot Credits.
For those who still have active credits, the system works in a cascade: available AI Builder Credits are consumed first, then Copilot Credits. If neither is available, features are blocked.
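The cascade described above can be sketched as follows. The class and function names are hypothetical, and the sketch treats one AI Builder Credit and one Copilot Credit as interchangeable 1:1 purely for illustration: the text does not state a conversion rate, and the real one may differ.

```python
from dataclasses import dataclass


@dataclass
class CreditBalances:
    ai_builder: int  # remaining AI Builder Credits
    copilot: int     # remaining Copilot Credits


def consume(balances: CreditBalances, needed: int) -> bool:
    """Apply the cascade: AI Builder Credits first, then Copilot Credits.

    Returns False and leaves the balances untouched when the two pools
    together cannot cover the request, i.e. the feature is blocked.
    Assumption: both credit types are counted 1:1 (illustrative only).
    """
    if needed > balances.ai_builder + balances.copilot:
        return False
    from_ai_builder = min(needed, balances.ai_builder)
    balances.ai_builder -= from_ai_builder
    balances.copilot -= needed - from_ai_builder
    return True
```

With 100 AI Builder Credits and 50 Copilot Credits, a request for 120 drains the first pool and takes 20 from the second; a later request larger than the remaining 30 is refused.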
4. Security Compute Units (SCU)
Microsoft Security Copilot operates within its own world. Its metric is the Security Compute Unit (SCU), which has no relation to Copilot Credits or other currencies in the ecosystem. SCUs measure the computational capacity needed to perform security analysis: every query, every alert investigation, every action performed by an agent within Defender, Entra, Intune, or Purview consumes SCUs in proportion to the complexity of the operation.
The standalone purchase model is twofold. Provisioned capacity – the primary mode – costs $4 per SCU per hour and is billed monthly based on the number of deployed SCUs, regardless of actual usage. Overage capacity, designed for unexpected peaks, costs $6 per SCU per hour and is paid only for actual consumption.
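Because provisioned capacity is billed for every hour it is deployed, even a small number of SCUs adds up quickly. A rough monthly estimate, using the two hourly rates quoted above, might look like this; the function name is hypothetical and the 730-hour month is an averaging assumption, not a billing rule.

```python
PROVISIONED_USD_PER_SCU_HOUR = 4.0  # as quoted in the text
OVERAGE_USD_PER_SCU_HOUR = 6.0      # as quoted in the text


def monthly_scu_cost(
    provisioned_scus: int,
    overage_scu_hours: float = 0.0,
    hours_in_month: int = 730,  # assumption: average month length
) -> float:
    """Estimate the monthly Security Copilot bill in USD.

    Provisioned SCUs are charged for every hour deployed, used or not;
    overage is charged only for the SCU-hours actually consumed.
    """
    provisioned = provisioned_scus * PROVISIONED_USD_PER_SCU_HOUR * hours_in_month
    overage = overage_scu_hours * OVERAGE_USD_PER_SCU_HOUR
    return provisioned + overage
```

On these assumptions a single provisioned SCU left running all month costs roughly $2,920, whether or not anyone submits a query.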
For organizations on Microsoft 365 E5, an alternative path was announced at Ignite 2025 and is rolling out to all E5 tenants between April and June 2026: Security Copilot is now included with an allocation of 400 SCUs per month for every 1,000 paid licenses, up to a maximum of 10,000 monthly SCUs.
It is worth noting that this model is more efficient than the provisioned one: under the provisioned model, you pay for deployed SCUs every hour, regardless of activity. The E5 inclusion model, in contrast, deducts only what is actually consumed, measured on an hourly basis. The downside is that the monthly pool does not accumulate: unused SCUs at the end of the month expire without carryover.
Once the threshold is exceeded, requests are blocked; a $6/SCU overage option is planned for the future, with 30 days' notice before activation. The same benefit automatically extends to those adopting E7, which includes E5 and provides the same SCU allocation.
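The size of the included pool follows directly from the license count. The helper below is a hypothetical sketch; it assumes the allocation scales linearly with paid licenses and rounds down, which the text does not confirm (Microsoft may instead allocate in steps of 1,000 licenses).

```python
SCUS_PER_1000_LICENSES = 400   # as quoted in the text
MAX_MONTHLY_SCUS = 10_000      # as quoted in the text


def e5_monthly_scu_allocation(paid_licenses: int) -> int:
    """Monthly SCU pool included with E5 (or E7) licenses.

    Assumption: linear pro-rating, rounded down to a whole SCU.
    """
    prorated = paid_licenses * SCUS_PER_1000_LICENSES // 1000
    return min(prorated, MAX_MONTHLY_SCUS)
```

Under this reading the cap is reached at 25,000 paid licenses; beyond that, additional seats add nothing to the pool.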
5. Tokens and PTU — Microsoft Foundry (Azure OpenAI)
Those accessing AI models directly via API – for custom integrations, proprietary applications, RAG solutions, or agents developed in code – move within the perimeter of Microsoft Foundry, the brand under which the Azure OpenAI platform was unified. Here, the main metrics are two, with completely different logics.
The Standard (On-Demand) model is pay-as-you-go per token: you pay separately for input tokens (prompt, context, history) and output tokens (model response). Rates vary by model – GPT-4o has a different cost than GPT-4o mini, which differs from o1 or embedding models. There is no commitment, no reserved capacity: you pay exactly for what you use.
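Since input and output tokens are priced separately, the cost of a workload is a simple weighted sum. The sketch below shows the calculation; the function name is hypothetical, and the rates are parameters precisely because they vary by model and are not given in the text.

```python
def on_demand_cost(
    input_tokens: int,
    output_tokens: int,
    input_usd_per_million: float,
    output_usd_per_million: float,
) -> float:
    """Pay-as-you-go cost in USD for a given volume of tokens.

    The two rates are placeholders supplied by the caller; look up the
    current price of the specific model before relying on any figure.
    """
    input_cost = input_tokens / 1_000_000 * input_usd_per_million
    output_cost = output_tokens / 1_000_000 * output_usd_per_million
    return input_cost + output_cost
```

The split matters in practice: RAG solutions that pass long context and history with every call are usually dominated by input tokens, so the input rate drives the bill more than the length of the answers.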
Provisioned Throughput Units (PTU) respond to an opposite logic: you reserve a computational capacity expressed in PTUs and pay for that capacity at an hourly rate regardless of actual use. In exchange, you get predictable latency, guaranteed throughput, and protection from rate limits. PTUs are technically model-independent – the quota can be used to deploy any model supported by Microsoft in the region – but each model requires a different number of PTUs for the same throughput level, so they are not entirely interchangeable. They are available with monthly or annual commitments, with increasing discounts for longer commitments.
It is worth mentioning a third, often overlooked mode: the Batch API. It allows for sending large volumes of non-urgent requests that are processed within 24 hours, at a 50% discount compared to the Standard price. It is the natural choice for offline processing, massive classifications, or any workload that does not require a real-time response.
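The three modes can be compared on price for a given monthly workload. The function below is a hypothetical sketch: the PTU hourly rate is a caller-supplied placeholder (the text gives none), the 730-hour month is an averaging assumption, and Batch is considered only when the workload tolerates the 24-hour window.

```python
BATCH_DISCOUNT = 0.5  # 50% off the Standard price, as quoted in the text


def compare_modes(
    standard_cost_usd: float,
    ptu_count: int,
    ptu_usd_per_hour: float,     # placeholder rate, not from the text
    realtime_required: bool,
    hours_in_month: int = 730,   # assumption: average month length
) -> tuple[str, dict[str, float]]:
    """Return the cheapest eligible mode and the cost of each option."""
    options = {
        "standard": standard_cost_usd,
        "provisioned": ptu_count * ptu_usd_per_hour * hours_in_month,
    }
    if not realtime_required:
        options["batch"] = standard_cost_usd * (1 - BATCH_DISCOUNT)
    cheapest = min(options, key=options.get)
    return cheapest, options
```

Price is only one axis: provisioned capacity also buys predictable latency and protection from rate limits, so a comparison like this is a starting point for sizing rather than a decision rule.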
6. Azure Agent Units (AAU) — the new entry
With the general availability of Azure SRE Agent, announced in late March 2026, Microsoft introduced a sixth metric: Azure Agent Units. The AAU standardizes the measurement of agentic work for all pre-built agents hosted on Azure – a category distinct from agents built on Copilot Studio and those developed via API on Foundry.
The billing structure has two components. Each active agent is charged 4 AAU per hour as a fixed cost for continuous monitoring (always-on flow): this is the price of keeping the agent on and listening, regardless of whether it is processing anything. When the agent actively works – to investigate an incident, respond to a prompt, or run a remediation – the variable component kicks in: since April 15, 2026, the active flow is no longer measured in execution seconds but in LLM tokens consumed, with rates per million tokens that vary according to the configured model (OpenAI, Anthropic, or others). Time the agent spends waiting for a human response is not counted.
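The two-component structure (a fixed always-on charge plus a token-based active charge) can be sketched as below. Only the 4 AAU per hour figure comes from the text; the function name, the model labels, and the per-million-token rates are hypothetical, and the sketch assumes those rates are expressed in AAU.

```python
ALWAYS_ON_AAU_PER_HOUR = 4  # as quoted in the text


def monthly_aau(
    agent_hours: float,
    tokens_by_model: dict[str, int],
    aau_per_million_tokens: dict[str, float],  # placeholder rates
) -> float:
    """Estimate the monthly AAU consumption of one agent.

    Always-on flow: fixed charge for every hour the agent is active.
    Active flow: LLM tokens consumed, priced per million by model.
    Time spent waiting for a human is not billed, so it simply does not
    appear in tokens_by_model.
    """
    always_on = ALWAYS_ON_AAU_PER_HOUR * agent_hours
    active = sum(
        tokens / 1_000_000 * aau_per_million_tokens[model]
        for model, tokens in tokens_by_model.items()
    )
    return always_on + active
```

An agent left on for a full 730-hour month consumes 2,920 AAU before it handles a single incident, so for lightly used agents the fixed component is likely to dominate.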
This metric is not interchangeable with Copilot Credits. An organization using both Copilot Studio and Azure SRE Agent ends up managing two parallel currencies, each with its own consumption rules and its own monitoring tools.
The multiplication of metrics has created a practical problem for organizations adopting several AI technologies at once. Microsoft responded with the Microsoft Agent Pre-Purchase Plan (P3), launched on November 27, 2025 and updated in February 2026 with the addition of Microsoft Fabric and GitHub.
The plan works like an annual prepaid wallet: you purchase Agent Commit Units (ACU) in one of three predefined pools – 20,000, 100,000, or 500,000 ACU – with increasing discounts as the size grows. ACUs are automatically consumed to cover the use of Copilot Credits (Copilot Studio, Dynamics 365), PTUs and tokens on Microsoft Foundry, Microsoft Fabric, and GitHub. The mechanism is direct: if Copilot Studio generates a retail cost of $100, 100 ACUs are deducted from the pool. There is no need to manually reallocate budgets between different metrics.
However, it is worth noting the plan's limits: Security Copilot SCUs and Azure SRE Agent AAUs remain outside this perimeter, billed separately. The P3 is therefore not a truly universal wallet – it covers the Copilot Studio + Foundry part, but not the entire Microsoft AI surface.
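The drawdown mechanism and its exclusions can be sketched as a single function. The meter labels and the function name are hypothetical; the 1 ACU per $1 of retail cost follows the example above, while billing any shortfall separately once the pool is empty is an assumption, since the text does not say what happens at exhaustion.

```python
# Meters the text lists as covered by the plan (labels are illustrative).
COVERED_METERS = {"copilot_credits", "foundry", "fabric", "github"}
# Security Copilot SCUs and Azure SRE Agent AAUs are NOT covered.


def apply_charge(
    pool_acu: int, meter: str, retail_cost_usd: int
) -> tuple[int, int]:
    """Return (remaining pool in ACU, amount billed outside the pool).

    Covered usage draws 1 ACU per $1 of retail cost. Uncovered meters
    never touch the pool. Assumption: any shortfall is billed separately.
    """
    if meter not in COVERED_METERS:
        return pool_acu, retail_cost_usd
    drawn = min(pool_acu, retail_cost_usd)
    return pool_acu - drawn, retail_cost_usd - drawn
```

For example, a $100 Copilot Studio charge takes a 20,000 ACU pool down to 19,900, while a $500 Security Copilot charge leaves the pool untouched and lands on a separate invoice.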
The main advantage remains financial simplification for those using Copilot Studio and Foundry together: a single contractual commitment, a single pool to monitor, and greater spending predictability compared to distributed pay-as-you-go. The constraint is that it requires a reliable estimate of expected consumption and, as we shall see, this is precisely the hardest part.
The real complexity does not lie in any one of these metrics taken individually. It lies in the fact that they coexist, and a medium-sized organization adopting the Microsoft AI ecosystem typically finds itself managing three or four in parallel: per-user licenses for those using Copilot in M365 apps, Copilot Credits for agents developed on Copilot Studio, PTUs or tokens for custom integrations on Foundry, and SCUs for the security team using Security Copilot.
Each metric has its own billing cycle, its own monitoring tools, and its own reset logic, whether monthly or hourly. Some reset on the first of the month, others do not. Some can be accumulated in a shared pool at the tenant level, while others are tied to a specific environment or Azure subscription. Some have automatic overage mechanisms, while others block the service when capacity is exhausted.
The concrete risk is not so much the cost itself, but the difficulty of estimating aggregate consumption in advance when dealing with technologies whose real usage stabilizes only in the first 6-12 months of adoption. A commitment built on optimistic projections – typically those that sales teams tend to present – can turn into a difficult burden to manage if adoption does not follow the expected curve. And with tools like the P3, where cancellations and exchanges are not supported and renewal is automatic by default, this risk is structural, not contingent.
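The risk can be made concrete with a break-even calculation: a discounted commitment only beats pay-as-you-go if enough of it is actually used. The sketch below is illustrative; the function names are hypothetical, the discount is a parameter, and usage beyond the commitment is assumed to be billed at retail.

```python
def break_even_utilization(discount: float) -> float:
    """Share of the committed value that must be consumed for the
    commitment to cost no more than pay-as-you-go."""
    return 1.0 - discount


def compare_commitment(
    committed_retail_usd: float,
    discount: float,
    actual_retail_usd: float,
) -> dict[str, float]:
    """Annual cost under a prepaid commitment versus pay-as-you-go.

    Assumptions: the commitment is paid in full whether used or not
    (no cancellation or exchange), and usage above it is billed at retail.
    """
    commitment_price = committed_retail_usd * (1.0 - discount)
    overflow = max(0.0, actual_retail_usd - committed_retail_usd)
    return {
        "commitment": commitment_price + overflow,
        "pay_as_you_go": actual_retail_usd,
    }
```

With a 20% discount the break-even sits at 80% utilization: a $100,000 commitment costs $80,000, so an organization that ends the year having consumed only $60,000 of retail value has overpaid by $20,000 compared with pay-as-you-go.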
It is worth widening the view. The fragmentation of AI licensing metrics is not exclusive to Microsoft: it is a sector-wide phenomenon, and SAP is perhaps the most instructive example.
In mid-2025, SAP restructured its cloud ERP packages by removing advanced AI features from the RISE with SAP bundle: Joule and generative AI tools became optional add-ons with separate payment. Those using SAP S/4HANA Cloud today manage the basic ERP license, AI Units for advanced generative AI features, and BTP cloud credits for custom AI solutions built on the Business Technology Platform. Three different metrics for three levels of the same stack – a pattern identical to Microsoft's.
The parallel is direct: both SAP and Microsoft are monetizing AI through overlapping layers of consumption – one for the platform, one for applications, and one for specific agents or automations. The difference is that SAP operates in an ecosystem where contracts have longer cycles and consumption variations are historically more predictable. Microsoft, with the transition toward MCA and the multiplication of AI metrics, is compressing these cycles and increasing variability.
The signal emerging from both cases is the same: AI license management is no longer an administrative task delegated at the time of contract renewal. It is a permanent operational function requiring continuous monitoring, the ability to correlate heterogeneous metrics, and agile financial governance – the same logic that the FinOps framework applies to cloud costs, now extended to AI consumption.
Those who do not organize in this direction risk discovering at the end of the year that they paid for capacity they didn't use, or – an equally problematic scenario – that they consumed resources they hadn't budgeted for without knowing it. In both cases, the impact is not just financial: it is on the credibility of AI adoption plans relative to the business.
There is no universal answer to the question of which metric to choose or how to size consumption. It depends on which products are adopted, the trajectory of usage growth, and how internal IT cost governance is structured.
What can be said with certainty is that complexity is predictable. The six metrics are known, documented, and have consumption logics that can be analyzed before making commitments. Tools for estimating expected consumption exist – Microsoft itself offers calculators for Copilot Studio and Azure OpenAI, even if the estimates they return tend to be optimistic compared to actual adoption in the early months.
The most useful work an organization can do today is to build a consumption baseline based on what it already uses, isolate the variable components (primarily those related to AI agents), and structure any multi-year commitment so that variables remain variables – and are not incorporated into the fixed base of the commitment.
It is not just a technical issue. It is a matter of IT spending governance in a context where vendors have every interest in growing the commitment, and where metric complexity can easily mask optimization opportunities worth seizing before signing.
At WEGG – The Impact Factory, we follow these topics closely as ITAM and FinOps consultants, working with tools like Flexera One to bring concrete visibility to our clients' AI consumption. If this topic concerns you, contact us to learn more.