How to calculate IOPS and capacity under load

Author

SERVERMALL

Servermall – trusted server hardware supplier with 10 years of experience.

Updated - March 23, 2026

Reading time 14 minutes

Buying storage “by terabytes” almost always ends in one of two scenarios: either there is still free space, but the system is already hitting latency limits, or performance is sufficient, but usable capacity runs out faster than expected. The reason is that storage sizing is not about choosing a drive by the label; it is a calculation based on the real workload profile, taking into account block size, the read/write ratio, the target latency, the data protection scheme, snapshots, data growth, and reserve capacity for normal and emergency operations.

It is most practical to think in three dimensions at once: how many operations per second the application needs, how much data passes through the storage system per second, and how much usable space remains after all overhead is accounted for. That is why SNIA specifically emphasizes that IOPS, throughput, and latency cannot be substituted for one another, while in Azure and AWS the limits for IOPS and MB/s are explicitly treated together.

Where to Start: What Data to Gather Before Any Calculation

Before you get to formulas, you need to collect the input data. Without it, any sizing exercise turns into a neat-looking but random estimate.

What should be recorded in advance:

current data volume;
growth forecast for 6, 12, and 24 months;
average and peak workload;
read/write ratio;
access pattern: random, sequential, or mixed;
average I/O block size;
target latency;
peak load windows;
whether snapshots, backup, replication, rebuild, and rebalancing are present;
workload type: OLTP, virtualization, VDI, backup repository, file service, object storage, and others.

It is important not to confuse several things here. “Data at rest” is simply the amount of stored information, while “data in motion” is the amount that is actively read, rewritten, cloned, indexed, and backed up. Average workload is also rarely useful in isolation: storage is usually designed at least for the working peak, and ideally with p95 or p99 in mind. Another common mistake is to calculate only raw capacity, even though in practice usable capacity is more important: the space that remains after RAID, replication, snapshots, and reserves.

The more accurate the workload profile, the lower the risk of buying the “right” SSDs and ending up with the wrong system.

Core Concepts You Need for an Accurate Calculation

IOPS is the number of operations per second. Throughput is the amount of data per second. Latency is the response time for an operation. Queue depth is how many requests are simultaneously being processed or waiting. These metrics are related, but they are not interchangeable.

The simplest practical formula looks like this:

MB/s ≈ IOPS × block size

For example, 10,000 IOPS at 4K is about 39 MB/s, while the same 10,000 IOPS at 64K is already around 625 MB/s (block size taken in KiB, result in MiB/s). On paper, the IOPS number is the same, but the workload profile and system requirements are completely different. In the first case, it is a story of small-block random I/O and sensitivity to latency; in the second, it is more about bandwidth.
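The conversion can be sketched in a few lines of Python. This is a minimal illustration rather than a sizing tool: the function name `throughput_mib_s` is hypothetical, and it assumes the block size is given in KiB and the result is wanted in MiB/s (1 MiB = 1024 KiB), which is the convention that produces the 39 and 625 figures.

```python
def throughput_mib_s(iops: float, block_size_kib: float) -> float:
    """Convert IOPS at a given block size (KiB) into throughput in MiB/s."""
    return iops * block_size_kib / 1024


# Same IOPS figure, very different bandwidth depending on block size.
for block_kib in (4, 8, 64, 128):
    mib_s = throughput_mib_s(10_000, block_kib)
    print(f"10,000 IOPS at {block_kib}K is about {mib_s:.1f} MiB/s")
```

Running the same loop with the block sizes seen in your own monitoring data shows quickly whether a workload is likely to be limited by operations or by bandwidth.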

That is why you cannot calculate IOPS “separately from block size.” Likewise, you cannot rely only on a drive’s maximum advertised performance. For transactional systems, a busy virtualization cluster, and VDI, what matters is not how many IOPS a benchmark showed in an ideal test, but at what latency they were achieved. If latency exceeds the limits required by the application, a high IOPS figure will no longer help.
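One way to relate IOPS, latency, and queue depth is Little's law: the average number of outstanding requests equals the completion rate multiplied by the average time per request. The sketch below applies it to storage as a rough bound. The function names are illustrative, and the model assumes a steady state with a stable mean latency, which real devices under bursty load only approximate.

```python
def iops_ceiling(queue_depth: float, latency_ms: float) -> float:
    """Upper bound on IOPS with `queue_depth` outstanding I/Os at a mean latency."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return queue_depth * 1000 / latency_ms


def outstanding_ios(iops: float, latency_ms: float) -> float:
    """Average number of in-flight I/Os needed to sustain `iops` at a mean latency."""
    return iops * latency_ms / 1000


# With 32 requests in flight and 1 ms per request, at most 32,000 IOPS fit.
print(iops_ceiling(32, 1.0))
# Sustaining 20,000 IOPS at 0.5 ms needs about 10 requests in flight.
print(outstanding_ios(20_000, 0.5))
```

Read the other way around, a benchmark that reaches a high IOPS figure only at a very deep queue is also reporting a higher latency, which is why the two numbers should always be quoted together.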

Metric	What it shows	What it affects	Where it most often becomes critical
IOPS	Number of operations per second	Ability to handle small and frequent I/O	OLTP, VDI, virtualization
Throughput / MB/s	Amount of data per second	Streaming transfer speed	Backup, media, analytics
Latency	I/O response time	Application responsiveness	Databases, logs, busy VM workloads
Queue depth	I/O queue depth	Device and controller saturation	Highly parallel workloads
Read/write mix	Read/write ratio	The real cost of operations, especially writes	Mixed workloads, databases, clusters
Block size	Size of a single operation	Relationship between IOPS and MB/s	Any storage sizing exercise

First, define the I/O profile, and only then discuss “how many IOPS are needed.”

Step-by-Step Calculation of Required IOPS

Understand what exactly the workload is doing. Random read-heavy, random write-heavy, a mixed 70/30 or 50/50 profile, streaming sequential reads, short bursts — all of these require different solutions even with the same data volume.

Use the working peak, not the average value. If the system usually consumes 8,000 IOPS but rises to 18,000 for one hour a day, relying on the average is risky. In production, what matters is that storage can sustain the mode in which the application must remain stable, not that it “looks fine on average.”
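The gap between the average and the peak can be made concrete with a small percentile calculation. The sketch below uses the nearest-rank method and synthetic per-minute samples that mirror the example: 8,000 IOPS for 23 hours and 18,000 IOPS for one hour. The sample data and the `percentile` helper are assumptions for illustration, not output from any monitoring tool.

```python
import math
from statistics import mean


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with pct% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range 1..100")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Synthetic per-minute samples for one day (assumption, for illustration only).
samples = [8_000] * (23 * 60) + [18_000] * 60

print(f"mean: {mean(samples):.0f} IOPS")
print(f"p95:  {percentile(samples, 95)} IOPS")
print(f"p99:  {percentile(samples, 99)} IOPS")
print(f"max:  {max(samples)} IOPS")
```

In this synthetic data set the one-hour peak covers only about 4% of the day, so p95 still reports 8,000 IOPS while p99 and the maximum report 18,000. A percentile is only a safe design target if it is high enough to include the window in which the application must stay stable.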

Take block size into account. For small 4K or 8K I/O, latency and random-access efficiency usually matter more. For 128K and above, you more often hit MB/s, network, controller, or instance limits first. This is especially noticeable in the cloud, where the disk and the VM can have separate performance ceilings.

Include write penalty and the protection scheme. Reads and writes cost the system differently. For some RAID configurations and distributed systems, backend write load is significantly higher than the frontend I/O seen by the application. RAID 10 is usually more predictable for a write-heavy profile, while RAID 5/6 saves capacity but is more sensitive to writes and rebuilds. In distributed storage, you need to account not only for replication or erasure coding, but also for rebalancing, recovery, and background activity. In the Ceph documentation, this is directly tied to the practice of benchmarking under load rather than in laboratory isolation.
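The effect of the protection scheme on writes can be illustrated with the commonly cited rule-of-thumb write penalties for classic RAID: 2 backend I/Os per write for RAID 10, 4 for RAID 5, and 6 for RAID 6. These factors are a simplification used here as an assumption; controllers with write-back cache, full-stripe writes, or distributed erasure coding can behave quite differently, so treat the output as a starting estimate.

```python
# Rule-of-thumb backend I/Os per frontend write (assumption; real arrays vary).
WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}


def backend_iops(frontend_iops: float, write_fraction: float, scheme: str) -> float:
    """Estimate backend IOPS: reads pass through 1:1, writes are multiplied."""
    if not 0 <= write_fraction <= 1:
        raise ValueError("write_fraction must be between 0 and 1")
    reads = frontend_iops * (1 - write_fraction)
    writes = frontend_iops * write_fraction
    return reads + writes * WRITE_PENALTY[scheme]


# Compare schemes for 10,000 frontend IOPS at 70/30 and 50/50 read/write mixes.
for write_fraction in (0.3, 0.5):
    for scheme in WRITE_PENALTY:
        estimate = backend_iops(10_000, write_fraction, scheme)
        print(f"writes={write_fraction:.0%} {scheme}: about {estimate:,.0f} backend IOPS")
```

Even in this simplified model, moving from a 70/30 to a 50/50 mix changes the backend load far more on parity RAID than on RAID 10, which is the practical meaning of "more sensitive to writes".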

Add operational headroom. It is needed not “just in case,” but for specific processes:

garbage collection and internal SSD operations;
rebuild or resilver after a failure;
snapshots and clone activity;
background service workload;
growth in capacity and workload;
failover scenarios.

The practical logic looks like this:

required backend IOPS = frontend IOPS × data protection factor × peak factor × operational reserve

There is no universal coefficient for all systems, but the model itself is always the same: first calculate the application’s real workload, then add the cost of resilience and operations.

Example

Suppose a mixed virtualization workload reaches a peak of 20,000 IOPS with an 8K block size and a 70/30 read/write profile. If data protection increases the cost of writes, and the system must survive a rebuild without SLA degradation, the assumption that “20,000 IOPS is enough” will almost certainly be optimistic. In practice, the target backend budget needs to be higher, because part of the resources will be consumed by service operations and by running in degraded mode.
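To put rough numbers on this example, the sketch below plugs it into the backend IOPS formula given earlier. Only the 20,000 IOPS peak and the 70/30 mix come from the example. The write penalty of 4 (a RAID 5-like scheme), the peak factor of 1.0 (the input is already a peak value), and the 30% operational reserve are hypothetical values chosen for illustration; substitute the figures that apply to your own platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IopsBudget:
    frontend_iops: float        # peak IOPS seen by the application
    write_fraction: float       # 0.3 for a 70/30 read/write mix
    write_penalty: float        # backend I/Os per frontend write (assumption)
    peak_factor: float          # 1.0 when frontend_iops is already the peak
    operational_reserve: float  # 1.3 means 30% headroom (assumption)

    @property
    def protection_factor(self) -> float:
        """Blended cost of one frontend I/O once the write penalty is applied."""
        return (1 - self.write_fraction) + self.write_fraction * self.write_penalty

    @property
    def backend_iops(self) -> float:
        """frontend IOPS x protection factor x peak factor x operational reserve."""
        return (self.frontend_iops * self.protection_factor
                * self.peak_factor * self.operational_reserve)


budget = IopsBudget(
    frontend_iops=20_000,
    write_fraction=0.3,
    write_penalty=4,
    peak_factor=1.0,
    operational_reserve=1.3,
)
print(f"protection factor: {budget.protection_factor:.2f}")
print(f"backend budget:    {budget.backend_iops:,.0f} IOPS")
```

Under these assumptions the backend budget comes out at roughly two and a half times the frontend figure. The exact multiplier is less important than the fact that it is nowhere near 1.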

What you need to calculate is not “ideal IOPS under a clean workload,” but IOPS for real operational use.

How to Calculate Capacity: Raw, Usable, and Effective Capacity

The phrase “we have 20 TB of data” says almost nothing about the required array capacity. To the volume of useful data, you must add everything that consumes space in a real system:

RAID, erasure coding, or replication;
filesystem and metadata overhead;
snapshots;
thin provisioning risk;
rebuild reserve;
free-space policy;
SSD overprovisioning;
data growth;
backup and restore windows, if they affect the local storage footprint.

A convenient way to calculate it is:

usable capacity = raw capacity − protection overhead − reserve − snapshots − service overhead

It is more accurate to start from the future state rather than the current one. First, take today’s data volume, then add projected growth, retention, local copies, snapshots, the minimum acceptable free space, and reserve for degraded mode. Only after that should you look at what the required raw capacity becomes.
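Working backwards from the future state can be sketched as a short calculation. The function below is an illustrative model and not a vendor formula: it grows today's data, adds snapshot space, keeps a minimum share of free space, allows for filesystem and metadata overhead, and finally divides by the usable share of the protection scheme (for example 0.5 for RAID 10 or three-quarters for a 6+2 RAID 6 group). All parameter names and the sample percentages are assumptions.

```python
def required_raw_tb(
    data_today_tb: float,
    annual_growth: float,          # 0.30 means 30% per year
    years: int,
    snapshot_fraction: float,      # snapshot space relative to live data
    free_space_fraction: float,    # share of usable space kept free
    metadata_overhead: float,      # filesystem and metadata share
    protection_efficiency: float,  # usable share after RAID / EC / replication
) -> float:
    """Estimate raw capacity needed to hold the projected data set."""
    future_data = data_today_tb * (1 + annual_growth) ** years
    with_snapshots = future_data * (1 + snapshot_fraction)
    with_free_space = with_snapshots / (1 - free_space_fraction)
    formatted = with_free_space / (1 - metadata_overhead)
    return formatted / protection_efficiency


raw = required_raw_tb(
    data_today_tb=20,
    annual_growth=0.30,
    years=2,
    snapshot_fraction=0.20,
    free_space_fraction=0.20,
    metadata_overhead=0.05,
    protection_efficiency=0.75,
)
print(f"20 TB of data today needs about {raw:.0f} TB raw under these assumptions")
```

With these sample inputs, 20 TB of data today turns into roughly 71 TB of raw capacity, which is why the starting figure alone says so little about the array that is actually needed.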

It is also important not to size SSDs “to the brim.” The closer the system is to full, the worse background processes, garbage collection, and data redistribution usually behave. In enterprise SSDs and arrays, this affects not only capacity, but also latency stability. Micron specifically describes overprovisioning as the deliberate reduction of user-available capacity in exchange for more predictable drive behavior.

Component	Affects IOPS / capacity / both	Typical calculation mistake	Should it be planned in advance
RAID / erasure coding	Both	Calculate only usable volume and forget the cost of writes	Yes
Replication	Both	Fail to multiply the storage footprint by the number of copies	Yes
Snapshots	Capacity	Treat them as “almost free”	Yes
Free space reserve	Both	Plan for utilization close to 100%	Yes
Rebuild reserve	Both	Do not account for degraded mode	Yes
Overprovisioning	Both	Use all physical SSD capacity for data	Yes
Data growth	Capacity	Calculate only today’s volume	Yes
Backup / restore window	Both	Ignore temporary spikes in I/O and space	Depends on the scenario

Usable capacity is not what remains after RAID; it is what remains after every routine process in the system's life has been accounted for.

The Main Mistakes in Sizing IOPS and Capacity

The most common mistakes look very familiar:

calculate IOPS without block size;
rely on a read-only benchmark for a mixed or write-heavy workload;
ignore write penalty;
use average workload instead of the working peak;
ignore host, hypervisor, network, or VM limits;
treat deduplication and compression as guaranteed savings in advance;
calculate usable capacity without snapshots and rebuild reserve;
skip validating the calculation with a pilot test.

It is also worth remembering the cloud. Even if the selected disk can deliver the required IOPS and throughput, the instance itself may have a lower ceiling. This is explicitly stated in Azure and AWS documentation: the performance of the storage path is limited not only by the volume, but also by the machine it is attached to.
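A simple way to model this is to take the lower of the two ceilings for each metric. The numbers in the sketch below are invented for illustration and do not correspond to any real Azure or AWS disk or instance type; the `Limits` type and both helper functions are likewise hypothetical.

```python
from typing import NamedTuple


class Limits(NamedTuple):
    iops: float
    mib_s: float


def effective_limits(volume: Limits, instance: Limits) -> Limits:
    """The storage path delivers no more than the lower ceiling per metric."""
    return Limits(min(volume.iops, instance.iops), min(volume.mib_s, instance.mib_s))


def shortfalls(required: Limits, available: Limits) -> list[str]:
    """List the metrics where the requirement exceeds what is available."""
    problems = []
    if required.iops > available.iops:
        problems.append(f"IOPS short by {required.iops - available.iops:,.0f}")
    if required.mib_s > available.mib_s:
        problems.append(f"throughput short by {required.mib_s - available.mib_s:,.0f} MiB/s")
    return problems


# Invented limits: the volume alone would be enough, the instance is not.
volume = Limits(iops=16_000, mib_s=500)
instance = Limits(iops=12_000, mib_s=300)
required = Limits(iops=14_000, mib_s=250)

available = effective_limits(volume, instance)
print(available)
print(shortfalls(required, available) or "requirement fits")
```

Here the volume satisfies the requirement on its own, yet the attached instance caps IOPS below it, so the check reports a shortfall.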

Storage sizing rarely fails on the headline number itself; it almost always fails on a forgotten limitation around it.

Practical Scenarios

For an OLTP database, what usually matters is a small block size, low latency, and write resilience. Here, “choosing a drive with high throughput” is a common mistake that rarely helps: the bottleneck appears in latency and the write path.

For a mixed virtualization workload, the combined profile of all VMs matters more than the specification of each one separately. Noisy neighbors, snapshots, backup activity, and failover quickly consume headroom. Here, it is more useful to size conservatively and account for QoS logic from the start than to rely on “average” metrics.

For a backup repository or media storage, the picture is reversed: throughput and the operation window are often more critical than maximum IOPS. A large block size, sequential access, and high capacity may matter more than extreme performance on small random I/O.

How to Validate That the Calculation Is Realistic

After sizing, the model must be validated. Before purchase or migration, it is worth collecting real metrics: IOPS, MB/s, latency, queue depth, utilization level, and peak behavior. Then reproduce a similar profile with a test, rather than running an abstract synthetic benchmark “to the max.”

You need to verify not only the achieved IOPS, but also the conditions under which they were achieved:

at what latency;
at what block size;
at what queue depth;
at what system fill level;
with what level of background activity;
in normal and degraded modes.
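The conditions in this list can be recorded alongside each test result and checked against the target in one place. The sketch below is a minimal illustration with invented measurements. The `BenchRun` fields, the 20,000 IOPS requirement, and the 5 ms p99 latency limit are all assumptions, not values from a real benchmark.

```python
from dataclasses import dataclass


@dataclass
class BenchRun:
    mode: str              # "normal" or "degraded" (for example, during rebuild)
    iops: float            # achieved IOPS
    p99_latency_ms: float  # tail latency at that IOPS level
    block_size_kib: int
    queue_depth: int
    fill_pct: float        # how full the system was during the test


def meets_target(run: BenchRun, required_iops: float, max_p99_ms: float) -> bool:
    """A run passes only if it delivers the IOPS within the latency limit."""
    return run.iops >= required_iops and run.p99_latency_ms <= max_p99_ms


# Invented results for illustration only.
runs = [
    BenchRun("normal", 52_000, 1.8, block_size_kib=8, queue_depth=32, fill_pct=70),
    BenchRun("degraded", 41_000, 6.5, block_size_kib=8, queue_depth=32, fill_pct=70),
]

for run in runs:
    verdict = "pass" if meets_target(run, required_iops=20_000, max_p99_ms=5.0) else "fail"
    print(f"{run.mode:<9} {run.iops:>7,.0f} IOPS at p99 {run.p99_latency_ms} ms: {verdict}")
```

In this made-up data the degraded run delivers more than twice the required IOPS and still fails, because it does so outside the latency limit.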

If the calculation does not pass this check, the problem is not necessarily in the disks. The bottleneck may be the controller, the CPU, the storage stack, the network, the hypervisor path, or a limit of a specific VM.

Final Validation Checklist for the Calculation

Has the real workload profile been recorded, rather than only the average value?
Are block size and read/write mix known?
Have peak windows and p95/p99 been taken into account, if available?
Has the cost of resilience been included: RAID, EC, replication?
Is there reserve capacity for rebuild, snapshots, and background activity?
Was usable capacity calculated, rather than only raw capacity?
Have the limits of the host, network, controller, and VM/instance been checked?
Has the calculation been confirmed by a test close to the real profile?

Conclusion

IOPS without workload context are almost useless. Capacity also cannot be calculated as “the amount of data today” or as raw capacity after RAID. Real-world sizing is built on the combination of the I/O profile, block size, read/write mix, latency target, the data protection scheme, data growth, and operational reserve. The closer the calculation is to the application’s real behavior, the lower the risk of overpaying for unnecessary characteristics — or, conversely, getting storage that looks great in the specification but performs poorly in production.
