A modern server can no longer be treated as “hardware that works the same way on its own.” Two nodes with identical configurations can deliver noticeably different results simply because one uses a power-saving platform profile while the other uses a performance-oriented one; one preserves explicit NUMA topology while the other partially smooths it out; one allows deep idle states while the other prioritizes predictable response. Server and CPU vendors explicitly treat BIOS/UEFI as part of the overall workload tuning model rather than as a secondary menu “for rare cases.” Intel publishes tuning guides for different workload types, HPE and Lenovo describe platform profiles as a way to tune server behavior for a target use case, Dell links System Profile to the automatic adjustment of a group of low-level parameters, and Red Hat separately ties throughput, latency, and power policy into a single system-tuning task.
The main limitation worth stating right away is this: there is no universal “best set” of BIOS/UEFI parameters. A setting that helps a low-latency service may reduce consolidation density on a virtualization host. A parameter that helps squeeze the most out of CPU-bound computation may deliver almost no benefit on a storage server, where the bottleneck runs through PCIe, networking, memory, and access locality. That is why tuning server BIOS for performance is not about finding a magic checkbox, but about selecting a platform policy for a specific metric: maximum throughput, stable all-core performance, minimum p99 latency, low jitter, performance per watt, predictability of behavior, or even temperature.
What BIOS/UEFI actually determines from a performance perspective
BIOS/UEFI affects more than just hardware initialization. On a server platform, it is the layer that defines the operating boundaries for the CPU, memory, inter-socket links, and I/O even before control passes to the OS. This is where it is decided how aggressively the processor will enter power-saving states, how Turbo/Boost will behave, whether an explicit NUMA model will be preserved, how memory-related and uncore-related policies will be set, and which power and performance profiles will become the baseline for the OS and hypervisor.
In practice, this means the following.
First, BIOS/UEFI sets the processor power-consumption policy. And this is not just a matter of the electricity bill. Power policy affects frequency behavior, the speed at which cores exit idle, the thermal budget, and therefore the real sustained performance under long-running load.
Second, it defines the behavior of P-states and C-states. The former are related to operating frequency and voltage, the latter to idle depth. That is why “energy” options often change not only power draw, but also latency, jitter, and the smoothness of system response to load spikes.
Third, BIOS/UEFI defines the conditions for Turbo/Boost. What matters is not only the peak on one or two cores, but how the server sustains frequencies under a real all-core workload, how quickly it reacts to load changes, and whether it drops into a less favorable mode because of power or thermal limits.
Fourth, BIOS/UEFI determines how system memory and topology will be presented: NUMA, memory interleaving, node interleaving, part of the memory map parameters, and sometimes aspects of uncore/fabric behavior. For servers, this is critical because memory locality and inter-socket traffic are often more important than nominal CPU frequency.
Fifth, this is also where settings related to PCIe power management, sometimes I/O bias, and virtualization features such as IOMMU, VT-d/AMD-Vi, SR-IOV, and related capabilities appear. These options do not always “speed up” the server directly, but they can change latency, I/O-path behavior, and the availability of modes used by hypervisors and devices.
That is why BIOS/UEFI should be treated as the lowest layer of overall system performance. The OS and the application can optimize only what the platform has already given them. If the firmware level sets an unfortunate policy for frequencies, power saving, and topology, later you often do not “fix” it so much as work around it.
The basic principle: tune for the workload type, not for the server
The main mistake in discussions of UEFI in server tuning is looking for “settings for performance in general.” In practice, tuning never starts with the server itself, but with the answer to the question: what exactly counts as success in your scenario?
For one service, success means maximum throughput. For another, it is minimum average latency. For a third, it is specifically p99/p99.9, meaning control over the tails of the latency distribution. For a fourth, it is low jitter and predictable response time. For a fifth, it is performance per watt. For a sixth, it is consolidating as many VMs as possible on one node. For a seventh, it is per-core efficiency due to licensing.
That is why the approaches differ.
On a virtualization host, balance is usually what matters: not just the speed of an individual VM, but placement density, resilience to mixed workloads, correct NUMA locality, and reasonable power consumption. For SQL/OLTP, the priority shifts toward stable frequencies, memory locality, and predictable latency rather than attractive peak values in a short benchmark. For low-latency scenarios, significantly stricter decisions around power saving are acceptable because the cost of micro-latencies is higher there than the cost of extra watts and extra heat. For HPC and memory-bandwidth-sensitive tasks, NUMA, memory channels, memory frequency, fabric/interconnect, and consistency under sustained load become decisive. For storage, Ceph, and NVMe-heavy systems, not just the CPU matters, but the entire I/O chain: PCIe, interrupt locality, network path, memory behavior, and predictability of response.
That is exactly why advice such as “disable all C-states” or “always enable Maximum Performance” is almost always bad as a universal recipe. It may help in one class of workloads and make another worse.
The most important BIOS/UEFI settings groups
Power policy, System Profile, Workload Profile, Operating Mode
This is the first group of parameters you should look at almost every time. Different vendors use different names: Dell uses System Profile, HPE uses Workload Profiles within Intelligent System Tuning, and Lenovo uses Operating Modes and tuning presets. But the logic is the same: a profile is not just a convenient interface preset, but an aggregated platform-policy setting that changes a whole set of dependent parameters at once. Dell documentation explicitly states that when you choose a profile other than Custom, the BIOS automatically sets the other related options; HPE describes Workload Profiles as a mechanism for tuning server resources for a specific workload type; Lenovo separately explains predefined modes ranging from power saving to maximum performance.
The practical takeaway is simple: it is better to start not with manually going through dozens of items, but by checking the vendor profile. It gives you a reasonable baseline from which to proceed. In many cases that is already enough, especially if you have a standard scenario: general virtualization, a mixed enterprise workload, a standard database, or an application server.
But the profile is not the ultimate truth. First, it may be too generic for your specific workload. Second, it is easy to lose visibility into which parameters actually changed. That is why after choosing a profile, you should record the system state and, if necessary, switch to Custom only for targeted changes: power policy, C-states, SMT, NUMA, memory-related options, and PCIe power management. A strong approach is to use the preset as a baseline, not as dogma.
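One way to keep that visibility is to export the settings before and after a profile change and compare the two snapshots. The sketch below assumes the settings have already been exported into plain dictionaries, for example through the vendor's management tooling. The function name and the setting names in the example are hypothetical and are not taken from any vendor's documentation.

```python
def diff_settings(before: dict, after: dict) -> dict:
    """Return {name: (old, new)} for every setting that differs.

    A setting missing from one snapshot is reported with None on that side.
    """
    changed = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


# Hypothetical snapshots; real attribute names depend on the vendor.
baseline = {"SystemProfile": "PerfPerWatt", "CStates": "Enabled", "Turbo": "Enabled"}
custom = {"SystemProfile": "Custom", "CStates": "Disabled", "Turbo": "Enabled"}

for name, (old, new) in diff_settings(baseline, custom).items():
    print(f"{name}: {old} -> {new}")
```

Storing such a diff next to the benchmark results makes it easier to see later which targeted changes were actually layered on top of the preset.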
P-states, Turbo/Boost, Energy Performance Bias, and frequency behavior
When people say “server BIOS performance mode,” they often mean exactly this group of settings, although it is broader than that. This is where it is decided whether the processor will raise frequency more aggressively, how fast it reacts to changes in load, how peak performance will be balanced against energy efficiency, and how predictable system behavior will be under a non-uniform workload.
It is important to distinguish three things.
The first is peak frequency. It looks good in specifications and marketing tables, but it does not always reflect how the server actually behaves under long-running multithreaded load.
The second is sustained all-core performance. For server workloads, this is often more important than brief Turbo on a limited number of cores.
The third is response predictability. Sometimes a server with a slightly lower peak frequency, but a stricter and more stable power-management policy, delivers a better result in tail latency and jitter.
Aggressive power saving can hurt response time because the system needs time to transition between states. But blindly putting “everything to maximum” is not always beneficial either: if the workload is memory-bound or I/O-bound, making frequency policy more aggressive may change almost nothing while worsening the thermal regime and power consumption. On new Intel and AMD platforms, vendor guides explicitly push you to evaluate frequency and power-related settings in the context of a specific workload rather than in isolation.
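The difference between peak frequency, sustained frequency, and predictability can be made visible by sampling the core frequency while a long test is running. The following is a minimal sketch, assuming a Linux host whose cpufreq driver exposes `scaling_cur_freq` (in kHz) in sysfs. The sampling interval, the helper names, and the choice of median and standard deviation as summary values are illustrative assumptions.

```python
import statistics
import time
from pathlib import Path


def read_cur_freq_khz(cpu: int = 0) -> int:
    """Read the current frequency of one CPU in kHz (Linux cpufreq sysfs)."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_cur_freq")
    return int(path.read_text().strip())


def summarize_freq(samples_khz: list[int]) -> dict:
    """Summarize samples: peak, sustained level (median), and spread."""
    return {
        "peak_mhz": max(samples_khz) / 1000,
        "median_mhz": statistics.median(samples_khz) / 1000,
        "stdev_mhz": statistics.pstdev(samples_khz) / 1000,
    }


def sample_freq(cpu: int = 0, count: int = 60, interval_s: float = 1.0) -> list[int]:
    """Collect `count` frequency samples while the workload is running."""
    samples = []
    for _ in range(count):
        samples.append(read_cur_freq_khz(cpu))
        time.sleep(interval_s)
    return samples
```

A large gap between `peak_mhz` and `median_mhz` during an all-core run suggests that the headline Turbo figure is not what the workload actually gets, while a high `stdev_mhz` points to less predictable frequency behavior.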
C-states and package C-states
If there is one parameter surrounded by the greatest number of oversimplified recommendations, it is server C-states. The mechanism is easy to understand: the deeper the idle state, the lower the power consumption, but the more expensive the return to active work may be. For some services, this is almost unnoticeable. For others, it is critical.
That is why deep C-states are often limited or disabled in ultra-low-latency environments, real-time scenarios, some networking and trading systems, and sometimes on hypervisors where stricter predictability of response to load spikes matters. But this cannot be turned into a universal rule. Completely disabling deep power saving increases power consumption, heat output, and cooling requirements; in some cases it can even reduce sustained performance if the system hits thermal or power limits sooner. Red Hat's documentation, including its low-latency materials, directly ties power-state management to the trade-off between latency, throughput, and power consumption rather than recommending one solution for every case.
In practical terms, this means the following: if you are working with a latency-sensitive workload, C-states are one of the first parameters to check. If, however, you have a general virtualization host, a mixed enterprise environment, or a task focused on performance per watt, harshly disabling all deep idle states may prove unjustified.
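Before deciding whether to restrict C-states, it helps to see which idle states the OS actually uses and how expensive their exit is. The sketch below assumes a Linux host with the cpuidle sysfs interface, where each `stateN` directory provides `name`, `latency` (exit latency in microseconds), `time` (accumulated residency in microseconds), and `usage`. The helper names are illustrative.

```python
from pathlib import Path


def read_cpuidle(cpu: int = 0) -> dict:
    """Return {state_name: {"latency_us", "time_us", "usage"}} for one CPU."""
    base = Path(f"/sys/devices/system/cpu/cpu{cpu}/cpuidle")
    states = {}
    for state_dir in sorted(base.glob("state*")):
        name = (state_dir / "name").read_text().strip()
        states[name] = {
            key: int((state_dir / fname).read_text())
            for key, fname in (
                ("latency_us", "latency"),
                ("time_us", "time"),
                ("usage", "usage"),
            )
        }
    return states


def residency_share(before: dict, after: dict, wall_us: int) -> dict:
    """Fraction of the interval spent in each idle state between two snapshots."""
    return {
        name: (after[name]["time_us"] - before[name]["time_us"]) / wall_us
        for name in after
        if name in before
    }
```

Taking one snapshot with `read_cpuidle()` before the test and one after it, then passing both to `residency_share()` with the elapsed time in microseconds, shows whether the cores spend a meaningful share of time in deep states with high exit latency. If they barely enter those states under the real workload, restricting them in firmware is unlikely to change much.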
SMT / Hyper-Threading
SMT, also known as Hyper-Threading on Intel, is useful where the workload can fill compute resources well and benefits from additional logical parallelism. In throughput-oriented scenarios, SMT often provides a gain, especially when the application is not bottlenecked by long single critical sections and does not suffer too much from competition for shared resources within the core.
But SMT has a downside. In low-latency environments, under strict determinism requirements, and also in some per-core licensed configurations or security-sensitive environments, logical threads may become not an advantage, but a source of unwanted contention and variability. That is why the question “should SMT be enabled?” cannot be answered ideologically. It must be checked by measurement on your own workload.
Very often, the mistake looks like this: based on one synthetic test, a conclusion is made that SMT is always beneficial. Then the same system under a real database, JVM service, or network function produces a different result. The reason is that SMT changes not only absolute performance, but also the pattern of contention for cache, execution units, and scheduler resources.
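When comparing runs with SMT on and off, it is useful to know which logical CPUs share a physical core, because that determines where contention can occur. A minimal sketch, assuming a Linux host that exposes `thread_siblings_list` under each CPU's `topology` directory in sysfs; the function names are illustrative.

```python
from pathlib import Path


def parse_cpu_list(text: str) -> list[int]:
    """Expand a kernel CPU list such as "0-3,8,10-11" into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def smt_sibling_groups() -> set[tuple[int, ...]]:
    """Group logical CPUs that share a physical core (Linux sysfs topology)."""
    groups = set()
    for topo in Path("/sys/devices/system/cpu").glob("cpu[0-9]*/topology"):
        siblings = (topo / "thread_siblings_list").read_text()
        groups.add(tuple(parse_cpu_list(siblings)))
    return groups
```

Groups containing more than one CPU indicate that SMT is active. Pinning two latency-sensitive threads to CPUs from different groups, and then to CPUs from the same group, gives a workload-specific view of the contention described above.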
NUMA, memory interleaving, node interleaving, and locality
NUMA in BIOS settings is one of the most underrated topics in all of server performance. Back when servers had fewer cores and a single socket, the consequences of poor locality could go unnoticed. On modern dual-socket servers and high-core-count single-socket servers, the cost of a mistake is too high to ignore.
NUMA means that memory is “closer” to one set of cores and “farther” from another. If the process and its memory are in the same NUMA domain, the path is shorter, latency is lower, and inter-socket traffic is reduced. But if the scheduler, hypervisor, or application regularly moves execution toward remote memory, latency rises and predictability suffers.
This is where the temptation to enable interleaving appears: the system starts to look more “flat,” it seems easier to administer at first glance, and some older software behaves more calmly. But behind that convenience there is often a loss of locality and, along with it, a penalty for databases, analytics, HPC, dense virtualization, and in general any workload sensitive to memory and inter-socket traffic.
NUMA cannot be considered separately from the OS and hypervisor. If in BIOS you hid or flattened the topology, beautiful CPU pinning in the hypervisor will not help later. If NUMA remains explicit, then it must be taken into account when placing VMs, vCPUs, memory, network-device queues, and the storage path. In practice, many “inexplicable” performance problems turn out not to be a matter of Turbo or memory frequency at all, but of poor locality.
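Poor locality can often be spotted from the OS side before touching firmware. The sketch below assumes a Linux host where each node directory under `/sys/devices/system/node` contains a `numastat` file with counters such as `local_node` and `other_node`. The ratio it computes is an illustrative indicator, not an official metric.

```python
from pathlib import Path


def parse_numastat(text: str) -> dict:
    """Parse the "name value" lines of a per-node numastat file."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            counters[fields[0]] = int(fields[1])
    return counters


def remote_alloc_share(counters: dict) -> float:
    """Share of this node's allocations made by tasks running on other nodes."""
    local, other = counters["local_node"], counters["other_node"]
    total = local + other
    return other / total if total else 0.0


def report() -> None:
    """Print the remote-allocation share for every NUMA node."""
    for node in sorted(Path("/sys/devices/system/node").glob("node[0-9]*")):
        counters = parse_numastat((node / "numastat").read_text())
        print(f"{node.name}: remote share {remote_alloc_share(counters):.1%}")
```

The counters are cumulative since boot, so for a specific test it is more informative to compare the values before and after the run than to read them once.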
Memory speed, population, channels, and memory power/performance options
When people discuss BIOS/UEFI for HPC or databases, memory is often reduced to one item only: DIMM frequency. That is far too crude a simplification.
Real memory performance depends not only on nominal memory frequency, but also on the number of active channels, the slot population pattern, ranks, supported topology, processor model, interleaving mode, and sometimes the selected performance/reliability profile. You can enable “Maximum Performance” in the memory menu and gain almost nothing if the node was assembled suboptimally to begin with: some channels are empty, population rules are violated, or frequency is reduced because of a specific module-placement scheme.
One point has to be made clear in any discussion of UEFI server settings: BIOS cannot “overclock” a poorly designed memory topology. It can only stay out of the way so that the platform runs in the most advantageous mode it allows. For memory-bandwidth-sensitive workloads, this is critical. If the system is not getting all memory channels or is losing frequency because of the DIMM configuration, no manual Turbo adjustment will compensate for that.
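The arithmetic behind this is simple: theoretical peak bandwidth is the number of populated channels multiplied by the transfer rate and by the width of the data path. The sketch below assumes a 64-bit (8-byte) data path per channel. The socket with 8 channels and DDR4-3200 DIMMs is a hypothetical example, not a reference to any particular server.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers: int, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth per socket in GB/s (10^9 bytes per second).

    Assumes an 8-byte data path per channel; real sustained bandwidth is lower.
    """
    return channels * mega_transfers * bus_bytes / 1000


# Hypothetical socket with 8 memory channels and DDR4-3200 DIMMs.
full = peak_bandwidth_gbs(channels=8, mega_transfers=3200)
# Same socket with only half of the channels populated.
half = peak_bandwidth_gbs(channels=4, mega_transfers=3200)
# All channels populated, but the DIMM layout forces a lower speed.
slower = peak_bandwidth_gbs(channels=8, mega_transfers=2933)

print(f"full: {full:.1f} GB/s, half: {half:.1f} GB/s, slower: {slower:.1f} GB/s")
```

Leaving half of the channels empty halves the theoretical ceiling, which is a far larger effect than any frequency-policy change is likely to recover.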
Uncore, fabric, interconnect, and I/O bias
On a server, performance is defined not only by CPU cores. Between the core and useful work stand the LLC, memory controllers, buses, inter-socket interconnect, fabric, and I/O controllers. That is why uncore/fabric settings and related profiles sometimes turn out to be more important than another subtle tweak to core-frequency behavior.
This is particularly noticeable in two cases. The first is memory-heavy and HPC scenarios, where the bottleneck is not arithmetic on the core, but data delivery. The second is I/O-sensitive environments: fast NICs, NVMe, distributed storage, and accelerators. Here not only raw compute power matters, but also the platform’s ability to serve the data path predictably through memory, cache, and I/O.
That is exactly why some vendor profiles are aimed not simply at “Performance,” but at different subsystem behavior models. This approach is more useful than a mechanical effort to hold maximum CPU frequency at any cost.
PCIe ASPM and related I/O power-management options
PCIe power management often remains in the shadow of CPU settings, even though its effect can be tangible on storage and networking servers. In general, power saving on PCIe is a normal and reasonable policy. But if your scenario is highly latency-sensitive or has strict requirements for deterministic I/O behavior, additional transitions into power-saving modes along the device path may add the kind of variability you do not want to see.
You should pay especially close attention to these settings on nodes with high-speed network adapters, intensive NVMe usage, accelerators, and distributed storage. At the same time, you should not disable PCIe ASPM “just in case” either. For many standard enterprise workloads, the gain will be minimal while power consumption will be higher. The same principle applies again here: first understand the metric, then test.
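On Linux, the OS-side ASPM policy can be checked before changing anything in firmware. A minimal sketch, assuming the kernel exposes `/sys/module/pcie_aspm/parameters/policy`, where the active policy is shown in square brackets; the helper names are illustrative.

```python
import re
from pathlib import Path
from typing import Optional

ASPM_POLICY_FILE = Path("/sys/module/pcie_aspm/parameters/policy")


def active_aspm_policy(text: str) -> Optional[str]:
    """Return the bracketed (active) entry from the policy file contents.

    Example input: "default performance [powersave] powersupersave".
    """
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def current_policy() -> Optional[str]:
    """Read the kernel-wide ASPM policy, or None if the file is absent."""
    if not ASPM_POLICY_FILE.exists():
        return None
    return active_aspm_policy(ASPM_POLICY_FILE.read_text())
```

This only shows the policy the OS applies. Firmware may restrict or disable ASPM independently, so the value should be read together with the BIOS/UEFI setting rather than instead of it.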
Virtualization-related options
Virtualization options should not be confused with direct performance accelerators. VT-x/AMD-V, VT-d/AMD-Vi, IOMMU, SR-IOV, interrupt remapping, and related functions do not by themselves make the server “faster,” but they define the availability of mechanisms without which a modern hypervisor, passthrough, and part of high-performance scenarios simply cannot be implemented correctly.
That is why this block is important, but secondary in the context of pure BIOS performance tuning. A common mistake is to focus on virtualization flags while paying almost no attention to power policy, NUMA, and memory. For virtualization, correct platform policy and topology come first, and only then specific device acceleration and isolation features. At the same time, for HPC without virtualization, disabling such settings may provide a small performance gain.
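Whether these features are visible to the OS can be checked quickly from Linux. The sketch below assumes an x86 host: it looks for the `vmx` (Intel VT-x) or `svm` (AMD-V) flag in `/proc/cpuinfo` text and counts the entries in `/sys/kernel/iommu_groups`. The function names are illustrative. Whether a CPU flag disappears when the feature is disabled in firmware depends on the kernel version, so this is a first check rather than proof.

```python
from pathlib import Path
from typing import Optional


def cpu_virt_extension(cpuinfo_text: str) -> Optional[str]:
    """Return "vmx", "svm", or None based on the first flags line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            for ext in ("vmx", "svm"):
                if ext in flags:
                    return ext
            return None
    return None


def iommu_group_count() -> int:
    """Number of IOMMU groups; 0 usually means no active IOMMU."""
    root = Path("/sys/kernel/iommu_groups")
    return sum(1 for _ in root.iterdir()) if root.is_dir() else 0
```

A host intended for passthrough or SR-IOV that reports zero IOMMU groups is a sign that either the firmware option or the kernel-side IOMMU configuration needs attention.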
Which settings most often deliver an effect first
If you do not want to drown in hundreds of BIOS items, it is useful to keep a simple priority order in mind.
First comes System Profile, Workload Profile, or Operating Mode. This is the fastest entry point because a profile changes a whole group of related parameters at once.
Second comes CPU power policy, P-states, and performance mode. This is where the basic trade-off between power saving, frequency aggressiveness, and responsiveness often becomes apparent.
Third comes Turbo/Boost behavior. What matters is not only whether it is enabled, but also understanding what exactly your workload benefits from: peak, sustained all-core performance, or predictability.
Fourth come C-states and the package C-state limit. For latency-sensitive environments, this is one of the main levers; for others, it is a parameter that requires caution.
Fifth come NUMA and node interleaving. If you have a database, virtualization, analytics, HPC, or simply a large node with heavy memory traffic, this should not be left at “default” without deliberate validation.
After that usually come SMT, PCIe power management, and memory-related performance options. Their impact is generally more scenario-dependent.
Which BIOS/UEFI settings to check first by workload type
| Scenario | Target metric | Where to start in BIOS/UEFI | What to validate especially carefully |
|---|---|---|---|
| Virtualization | Balance of throughput, consolidation density, and latency | Vendor profile, power policy, NUMA visibility, memory settings | C-states off for the entire node, node interleaving, SMT off without measurement |
| SQL/OLTP | Stable latency, memory locality, predictability | Performance-oriented profile, Turbo, NUMA/locality, memory channels | Aggressive power saving, hidden loss of NUMA locality |
| Low-latency | p99/p99.9, jitter, deterministic behavior | Performance mode, limiting C-states, Turbo behavior, SMT validation | Completely disabling power saving without controlling thermals and power |
| HPC | Memory bandwidth, consistency, inter-socket efficiency | NUMA, memory population, fabric/interconnect, performance profile | Focusing only on core frequency for a memory-bound workload |
| Storage / Ceph / NVMe-heavy | I/O latency, sustained throughput, data-path predictability | PCIe-related options, NUMA, power profile, memory behavior | PCIe power saving, remote memory for network and storage queues |
| Mixed enterprise | Versatility, performance per watt, stability | Preset vendor profile as a baseline, then targeted adjustments | An overly aggressive performance mode without real gain |
Where mistakes are made most often in BIOS/UEFI tuning
Enable Maximum Performance and decide that the work is done. This profile may be a useful starting point, but without measurement it proves nothing. Your system may be bottlenecked not by CPU frequencies, but by memory, I/O, or topology.
Disable all power-saving functions for the sake of “maximum.” The result may be not acceleration, but higher temperature, more noise, increased consumption, earlier encounters with power/thermal limits, and even reduced sustained performance.
Confuse throughput with latency. A system may show a greater total amount of work processed per unit of time and at the same time behave worse at p99. For many production services, that is already a loss, not a win.
Break NUMA locality. This happens surprisingly often: interleaving is enabled “for convenience,” memory is placed far from computation, device queues do not match local cores, and then everything gets blamed on insufficient frequency or a “slow server.”
Change BIOS without taking the OS and hypervisor into account. Firmware sets the rules of the game, but the actual placement of threads, memory, queues, and interrupts is handled by the software layer. The firmware ↔ OS ↔ workload chain cannot be ignored.
Forget that after a BIOS, firmware, or microcode update, some settings may reset or change actual behavior. The documented platform state must be validated again after an update.
Draw a general conclusion from a single synthetic benchmark. This is especially dangerous for mixed workloads and services sensitive to tail latency.
Fail to record the baseline. Without it, you do not know what exactly improved and what merely happened at the same time.
Change 10–20 parameters at once. After that, it becomes impossible to understand what actually worked and what simply added noise.
BIOS tuning without a measurement methodology almost always turns into self-deception. The server may start to “feel faster,” but without the same test scenario, fixed metrics, and a correct comparison, such impressions are worthless.
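The confusion between throughput and tail latency described above is easy to illustrate with numbers. The sketch below uses a nearest-rank percentile and made-up latency samples for two hypothetical configurations. The values are invented for illustration and are not measurements.

```python
import math


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms: list[float], duration_s: float) -> dict:
    """Throughput plus median and tail latency for one test run."""
    return {
        "throughput_rps": len(latencies_ms) / duration_s,
        "p50_ms": percentile(latencies_ms, 50),
        "p99_ms": percentile(latencies_ms, 99),
    }


# Made-up samples for illustration, not measurements.
config_a = [1.0] * 980 + [5.0] * 20      # 1000 requests in 10 s
config_b = [0.9] * 1070 + [40.0] * 30    # 1100 requests in 10 s

print(summarize(config_a, 10.0))
print(summarize(config_b, 10.0))
```

In this example the second configuration processes more requests per second and has a lower median, yet its p99 is several times worse. Judged by throughput alone it wins; judged by tail latency it loses.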
BIOS/UEFI and the OS: why they cannot be considered separately
Firmware does not exist separately from the operating system. BIOS defines the corridor of possibilities: which power states are available, how topology is presented, which device functions are enabled, and what behavioral model the processor receives. On top of that, the OS makes its own decisions: scheduler policy, NUMA placement, drivers, IRQ policy, power profile, hypervisor behavior, queue placement, and process memory layout.
That is why the same BIOS profile can produce different results on Linux, Windows Server, VMware ESXi, or another hypervisor. Not because the firmware has somehow “become different,” but because the software layer uses platform capabilities differently. In its documentation, Red Hat treats throughput, latency, and power consumption as connected goals and recommends optimizing the system according to the scenario rather than viewing power policy solely as a matter of energy efficiency.
An important practical conclusion follows from this: if you are evaluating BIOS/UEFI settings for SQL/OLTP, virtualization, or HPC, you need to measure on the OS, hypervisor, and workload version that are actually used in production. Otherwise, you are tuning an abstract server, not a working system.
How to tune correctly: a safe methodology
A practical tuning methodology looks boring, but that is exactly what gives a reproducible result.
First, you need to record the initial state: the BIOS/UEFI version, firmware, microcode, processor model, memory configuration, OS or hypervisor version, application version, and the BIOS parameter set itself. Without this, comparison after changes quickly loses meaning. If you can make a backup of the settings, that is also worth doing for a quick rollback if the changes cause any unpleasant effects.
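Part of that initial state can be captured from the running OS. The following is a minimal sketch, assuming a Linux x86 host that exposes DMI data under `/sys/class/dmi/id` and a `microcode` field in `/proc/cpuinfo`. It records only what the OS can see; the BIOS parameter set itself still has to be exported through the vendor's own tooling. Function names are illustrative.

```python
import json
import platform
from pathlib import Path

DMI = Path("/sys/class/dmi/id")


def parse_cpuinfo(text: str) -> dict:
    """Extract model name and microcode revision of the first CPU entry."""
    wanted = {"model name", "microcode"}
    found = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in wanted and key not in found:
            found[key] = value.strip()
    return found


def read_or_none(path: Path):
    """Read a small text file; return None if it is missing or unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        return None


def collect_baseline() -> dict:
    """Gather an OS-visible record of the platform state."""
    record = {
        "kernel": platform.release(),
        "bios_vendor": read_or_none(DMI / "bios_vendor"),
        "bios_version": read_or_none(DMI / "bios_version"),
        "bios_date": read_or_none(DMI / "bios_date"),
        "product_name": read_or_none(DMI / "product_name"),
    }
    record.update(parse_cpuinfo(read_or_none(Path("/proc/cpuinfo")) or ""))
    return record


if __name__ == "__main__":
    print(json.dumps(collect_baseline(), indent=2))
```

Saving this record together with every benchmark result makes it obvious later when a firmware or microcode update has changed the platform between two runs.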
Then you take a baseline. Not a single “run by feel,” but a repeatable test scenario. For databases, that means queries and a load profile similar to real ones. For virtualization, consolidation and latency-sensitive scenarios. For low latency, measuring tail latency rather than just the average. For storage, not only a bandwidth benchmark, but also behavior under mixed load.
Next, define the target metric. You need to decide in advance what you are entering BIOS for at all: throughput, average latency, p99, jitter, performance per watt, stability under sustained load, or virtualization density.
It is sensible to start with the vendor profile or recommended mode. This gives you a meaningful reference point. After that, change parameters one or two at a time per iteration: for example, first the platform profile, then C-states, then SMT, then a NUMA-related setting. More than that is already difficult to interpret.
Each test must be repeated under conditions that are as close to identical as possible. In addition to final performance, you should look at thermals, power consumption, signs of throttling, CPU C-state residency, memory behavior, and NUMA effects. Sometimes a setting seems to provide a gain in a short test, but loses under a long workload because of thermal throttling.
All results should be recorded, and rollback should be fast and unambiguous. After every significant BIOS/firmware/microcode update, the selected settings need to be validated again. For a server platform, this is not bureaucracy, but normal operations practice.
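When results are recorded as repeated runs, a simple comparison helps separate a real effect from run-to-run noise. The sketch below assumes a metric where higher is better and uses a crude rule of thumb (the difference must exceed twice the larger standard deviation). It is not a formal statistical test, and the numbers in the usage note are made up.

```python
import statistics


def compare_runs(baseline: list[float], candidate: list[float]) -> dict:
    """Compare repeated runs of one metric for two configurations."""
    base_mean = statistics.mean(baseline)
    cand_mean = statistics.mean(candidate)
    # Run-to-run noise: the larger sample standard deviation of the two sets.
    noise = max(statistics.stdev(baseline), statistics.stdev(candidate))
    delta = cand_mean - base_mean
    return {
        "delta_pct": 100 * delta / base_mean,
        "noise_pct": 100 * noise / base_mean,
        # Crude rule of thumb, not a statistical test.
        "beyond_noise": abs(delta) > 2 * noise,
    }
```

For example, `compare_runs([100, 102, 98, 101, 99], [103, 101, 104, 100, 102])` reports a 2% gain that is not beyond the noise of the runs themselves, which is exactly the kind of "improvement" that should not be attributed to a BIOS change without more repetitions.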
| Setting | What it may improve | Where it may hurt | Comment |
|---|---|---|---|
| Maximum Performance / Performance Profile | Throughput, responsiveness, frequency behavior | Performance per watt, thermals, sometimes mixed workloads | A good starting point, but not a universal end state |
| Turbo / Boost | Peak and part of sustained performance | Thermal regime, stability under sustained load | You need to control not only the peak, but the sustained mode as well |
| C-states off | Low latency, reduced jitter | Power consumption, heat, sometimes stability | Often justified for low latency, not for every case |
| SMT off | Determinism, part of latency-sensitive scenarios | Overall throughput | Must be checked with a workload-specific test |
| NUMA interleaving | Simplifies the memory model for some software | Loss of locality, more remote accesses | Use caution for databases, virtualization, and HPC |
| PCIe ASPM off | I/O predictability, part of latency-sensitive scenarios | Performance per watt without noticeable gain | Relevant first of all for fast NICs, NVMe, and accelerators |
| Memory performance mode | Memory bandwidth, part of memory-bound workloads | Reliability/conservatism of modes in some configurations | The effect is limited by population rules and topology |
What is important to mention by vendor: Dell, HPE, Lenovo, Intel, AMD
There is no need for a separate “brand comparison,” but understanding the difference in terminology is useful.
Dell more often uses System Profile and related parameters such as CPU Power Management, Memory Frequency, Turbo Boost, and C States. HPE more often expresses the logic through Workload Profiles and Intelligent System Tuning. Lenovo uses Operating Modes and presets, where parameter sets for performance or energy efficiency are defined in advance. Intel and AMD, on their side, publish workload-oriented tuning guides in which BIOS parameters are considered not in isolation, but as part of overall platform tuning for a specific class of tasks.
The meaning is the same for all of them: there is a general-policy preset, there is manual customization, and there is the impact of power management, frequencies, memory, NUMA, and I/O on a specific workload. That is why what matters is not so much the name of the menu as the mechanics behind it.
Practical recommendations by scenario
For virtualization, you should not automatically chase the most aggressive performance mode. It is often more important to balance consolidation density, latency, and power consumption, as well as to preserve correct NUMA locality for large VMs. If the node serves a mixed workload, an overly rigid power policy may deliver less benefit than it seems.
For SQL/OLTP, stable frequencies, memory locality, and response predictability are usually important. Here you need to be especially careful with aggressive power saving and with settings that mask or damage the NUMA model. A nice Turbo increase in a short test does not in itself mean the database will perform better.
For low latency, server BIOS settings really do more often include strict limitation of some power-saving mechanisms. This is one of the few scenarios where abandoning deep power saving is often justified. But the price is almost always obvious: more heat, higher consumption, greater cooling requirements, and less node versatility.
For HPC, not only core frequencies matter, but also memory bandwidth, NUMA, fabric/interconnect, and consistency of behavior under sustained load. It is a mistake to reduce BIOS/UEFI performance tuning for HPC to the single idea of “raise the frequency and everything will accelerate.”
For storage, NVMe, and Ceph scenarios, you need to look at the entire I/O path: PCIe, memory locality, network and storage interrupt paths, queue placement, and predictability of device response. In such systems, the CPU is only part of the picture.
Conclusion
Server BIOS/UEFI performance settings are not a set of magical flags and not a competition to see who can disable more power-saving functions. They are the choice of a platform policy for a specific task. Sometimes you need maximum throughput, sometimes stable latency, sometimes low jitter, and sometimes reasonable performance per watt.
You should start not by manually editing dozens of items, but by understanding the workload and the target metric. Then use the vendor profile as a starting point, change one or two parameters per iteration, measure with the same scenario, and record the result. The best outcome usually comes not from extreme tuning, but from tuning that is thoughtful, verified, and documented.
That is exactly why good BIOS/UEFI tuning almost always looks more boring than advice from forums. But it is precisely what works in real server operations.