Autoscaling Microsoft Fabric to enable sensible spend: A practical capacity cost optimization approach

Fabric workloads are mixed and bursty. Over-provisioning wastes money; under-provisioning hurts SLAs. Teams need one simple operating model that works for development, testing, training, demos and light production without manual heroics. NTT DATA Business Solutions built an autoscaling model for Microsoft Fabric capacity that treats demand like changing weather, not a fixed timetable.

The platform monitors real-time utilization, scales up during “storms” such as ELT/ETL (Extract, Load, Transform / Extract, Transform, Load) processes, model refreshes or demos, then scales down when workloads ease. Weekdays run on autopilot with guardrails and audit trails; weekends switch to an on-demand “Start/Stop” control with automatic pause to prevent budget drift.

The result is steadier performance during peaks, lower costs off-hours, and no ticket-driven resizing across regions. Next steps include native eventing, richer cost-and-utilization reporting and self-service guardrails for teams.

Yovcho Ivanov | February 25, 2026 | 6 min.

The challenge: Managing mixed and bursty Microsoft Fabric workloads without overspending

Fabric usage is mixed and changeable. Nightly pipelines collide with morning dashboard traffic. Ad hoc analysis appears without warning. Month-end and go-live periods bring prolonged high pressure. Over-provision and you pay for blue skies. Under-provision and you miss refresh windows or slow key journeys when a front moves in. Teams need a simple operating model with no tickets and no heroics, one pattern that supports development, testing, training, demos and light production across regions.

Our approach: Demand-driven autoscaling for Microsoft Fabric capacity

A fixed timetable is not enough for real workloads. We needed automatic adjustment based on live signals rather than a clock. The first requirement was near-real-time performance monitoring. In Microsoft Fabric, we achieve this by reading the CU % utilisation KPI from the native Capacity Metrics app, then converting it into simple email signals, sent from Power BI Scorecards, that trigger actions through Azure Logic Apps.
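The step from a utilisation reading to an action signal can be sketched in a few lines of Python. This is a minimal illustration, not the production implementation: the Signal names, the classify_utilisation function and the 80 % and 30 % thresholds are all assumptions chosen for the example, and the real solution emits its signals through Power BI Scorecards and Azure Logic Apps rather than code like this.

```python
from enum import Enum


class Signal(Enum):
    """Hypothetical signal names; the real signals are scorecard emails."""
    SCALE_UP = "scale_up"
    HOLD = "hold"
    SCALE_DOWN = "scale_down"


def classify_utilisation(cu_percent: float,
                         high: float = 80.0,
                         low: float = 30.0) -> Signal:
    """Map a CU % utilisation reading to a signal.

    The thresholds are illustrative assumptions, not values from the article.
    """
    if cu_percent < 0:
        raise ValueError("CU % utilisation cannot be negative")
    if cu_percent >= high:
        return Signal.SCALE_UP
    if cu_percent <= low:
        return Signal.SCALE_DOWN
    return Signal.HOLD
```

Keeping a gap between the two thresholds means that a reading hovering around one value does not flip the signal back and forth.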

Weekdays run on autopilot. The platform senses pressure and expands capacity while you build. When the load eases, it settles back to a sensible level. Engineers focus on their work; Fabric adjusts in the background.

Weekends switch to on demand. A lightweight Power App provides a single Start or Stop control. Start brings capacity online immediately for the work window. If someone forgets to stop it, the platform parks idle capacity so the budget does not drift.
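The weekend safety net can be illustrated with a small idle check. This is a sketch under stated assumptions: the function name should_auto_pause, its parameters and the 60-minute idle limit are hypothetical, and how "activity" is detected in the real platform is not described here.

```python
from datetime import datetime, timedelta


def should_auto_pause(started_at: datetime,
                      last_activity_at: datetime,
                      now: datetime,
                      idle_limit: timedelta = timedelta(minutes=60)) -> bool:
    """Return True when a weekend capacity has been idle for too long.

    The idle period starts at the later of the Start press and the last
    observed activity. The 60-minute limit is an assumed example value.
    """
    idle_since = max(started_at, last_activity_at)
    return now - idle_since >= idle_limit
```

A scheduled job could evaluate this check every few minutes and pause the capacity when it returns True, so a forgotten Stop costs at most one idle window.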

Everything sits behind clear guardrails with visible outcomes. Floors and caps prevent both sluggish dashboards and surprise invoices. Each action leaves an audit trail that is easy to read. Humans receive short email confirmations. Operations and audit teams have a detailed run history when they need it.

Why autoscaling Microsoft Fabric matters for performance and cost optimization

When the room fills for a demo, a training session or a POC, or when many individual training workspaces are active at once, performance holds steady. Queries return on time and dashboards behave. Outside the busy windows, spend reflects reality. As usage drops, costs follow. The removal of manual resizing also cuts friction across regions, so teams move faster without waiting in a queue.

From Monday to Friday there is nothing new to learn. Your users do their work and the platform breathes with demand in the background. On weekends, the only action is Start or Stop. The default posture is cautious with cash: idle capacity pauses itself.

Early signals: Real-time monitoring of Fabric capacity utilization

We tune for fast reactions rather than after-the-fact fixes. The controller typically scales within two to three minutes, which keeps utilisation below the throttling point, so reports do not feel slow and queries are not blocked.

Fabric separates work into two broad types. Background work is the scheduled activity, such as pipelines that move and prepare data. Interactive work is what people feel in the moment, such as opening a report, slicing a visual, or running an ad hoc query. If utilisation sits above 100 percent without scaling, interactive work can begin to slow after roughly an hour and background work after roughly twenty-four hours. Because we react quickly, users should not notice this slowdown, or if they do, it should be minimal and short-lived.

How the Microsoft Fabric autoscaling control loop works

The control loop is simple. Sense utilisation and queue depth, scale up when pressure builds, scale down when stability returns. Weekend control rides through the Power App for clarity and accountability. Automation runs with Managed Identity. Environment details are parameterised. Every significant action is logged. There are no secrets in code and no surprises in production.
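One iteration of such a loop might look like the Python sketch below. It is a simplified illustration under several assumptions: it uses utilisation only (queue depth is left out), the ladder from F2 to F64, the 80 % and 30 % thresholds, and the rule of three consecutive calm readings before scaling down are all hypothetical choices, and the actual resize call is omitted.

```python
from dataclasses import dataclass

# Assumed allowed sizes for this environment, smallest to largest.
LADDER = ["F2", "F4", "F8", "F16", "F32", "F64"]


@dataclass(frozen=True)
class LoopState:
    sku: str                 # current capacity size
    calm_readings: int = 0   # consecutive readings at or below the low mark


def step(state: LoopState, cu_percent: float,
         high: float = 80.0, low: float = 30.0,
         calm_needed: int = 3) -> LoopState:
    """Run one control-loop iteration and return the next state.

    Scale up immediately under pressure; scale down only after
    `calm_needed` consecutive calm readings (illustrative values).
    """
    i = LADDER.index(state.sku)
    if cu_percent >= high:
        return LoopState(LADDER[min(i + 1, len(LADDER) - 1)], 0)
    if cu_percent <= low:
        calm = state.calm_readings + 1
        if calm >= calm_needed:
            return LoopState(LADDER[max(i - 1, 0)], 0)
        return LoopState(state.sku, calm)
    # Between the thresholds: hold the size and reset the calm counter.
    return LoopState(state.sku, 0)
```

Scaling up on a single hot reading but down only after sustained calm reflects the priority described above: react quickly to pressure and release capacity cautiously.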

Governance you can trust: Security in Microsoft Fabric capacity management

Security is built in. Actions run under least-privilege identities. Change is visible and reviewable. Guardrails are explicit and adjustable. If policy calls for tighter caps, cost ceilings, or approval hooks, the pattern supports that without harming the developer experience.

Cost impact: Reducing Microsoft Fabric capacity costs with autoscaling

Before introducing autoscaling, we operated a reserved F64 capacity at a fixed monthly cost, even when workloads did not fully utilise the available compute. After implementing demand-driven autoscaling, our monthly spend flexes with actual activity levels and the number of parallel projects. For our current workload mix, spend is roughly one-fifth to one-sixth of what it was under the previous fixed-capacity model.

A reservation can still make sense in some cases. Larger or always-on workloads benefit from predictable capacity and fixed pricing. Power BI licensing is also a factor. A dedicated capacity can reduce the need for individual Pro licences for viewers, while creators may still require the appropriate authoring licences. The right choice depends on your audience size, author count and the hours of peak use.

These figures reflect our tenant, region and recent workload shape. Your costs will vary with concurrency, data volumes, time of day and the level of interactive use. If you are unsure, start with autoscaling and review a month of signals and spend before considering a reservation.