On-premise Kubernetes Servers: Which Nodes Are Needed for the Control Plane, Worker, and Storage

Author

SERVERMALL

Servermall – trusted server hardware supplier with 10 years of experience.

Updated - May 22, 2026

Kubernetes on-premise requires at least three logical groups of servers: control plane nodes for cluster management, worker nodes for applications, and a well-designed storage subsystem for data. In a test environment, these roles can be combined, but in a production cluster the control plane should be redundant, worker nodes should be sized according to the real application workload, and storage should be designed separately by capacity, latency, input/output operations, replication, and recovery requirements. If networking, monitoring, backups, and updates are not considered from the start, the cluster may be able to run containers, but it will not become a reliable platform for business services.

Kubernetes on-premise differs from cloud Kubernetes because the responsibility for the infrastructure remains inside the company. In the cloud, the provider often takes over part of the work: managed control plane, network load balancers, block disks, updates, and fault tolerance of individual components. In your own infrastructure, all of this has to be designed independently: servers, disks, network, power, racks, switches, backups, monitoring, and maintenance procedures.

That is why choosing servers for Kubernetes cannot be reduced to the question of how many cores and how much memory to buy. It is important to understand node roles, application profiles, data requirements, network traffic patterns, and failure scenarios. An architectural mistake may not appear immediately: the cluster will install, the first pods will start, but problems will emerge during updates, node failure, service growth, or database migration.

What Kubernetes on-premise is and why hardware matters

Kubernetes is a platform for running containerized applications. It distributes applications across nodes, monitors their state, restarts failed containers, and manages access to services, configurations, secrets, and data volumes. Inside Kubernetes, an application usually runs not simply “on a server,” but in a pod — the smallest deployable unit, which may contain one or more containers.

On-premise means that the cluster runs on company servers, in an internal server room, in a corporate data center, or on rented dedicated hardware. This approach gives control over hardware, network, data placement, and security policies. But control also brings responsibility. Kubernetes will not fix a weak network, slow storage, missing redundancy, or a chaotic update process.

If the cluster is only needed for learning, a simplified setup can be used. If it is needed for internal services, production applications, databases, queues, analytics, or CI/CD, the approach must be different. The task is not to design individual servers, but a platform: the control layer, the worker layer, storage, network, observability, backups, and maintenance plans.

Main node roles in a Kubernetes cluster

Kubernetes has a control layer and worker nodes. The control layer makes decisions and stores the state of the cluster. Worker nodes are responsible for actually running applications. The storage system holds persistent data for applications that are not stateless, that is, applications that must keep their data across restarts.

Kubernetes officially describes a cluster as a set of worker nodes and a control plane that manages those nodes and pods. For production environments, the documentation notes that the control plane usually runs across multiple computers, and the cluster uses multiple nodes for fault tolerance and high availability.

Control plane

The control plane is the management part of the cluster. It receives commands from administrators and automation systems, stores the cluster state, schedules pod placement, and ensures that the actual state matches the desired state.

The control plane includes several key components:

the API server receives requests and is the main management entry point;
etcd stores the cluster state;
the scheduler chooses which node should run a pod;
controllers monitor Kubernetes objects and try to bring the system to the desired state, for example by restarting an application if it disappears.

The control plane does not have to be the most powerful part of the cluster in terms of processors. Stability, predictable disks, low network latency between control nodes, and protected access matter more. If the control layer is unstable, the whole cluster suffers: new applications are not scheduled, changes are not applied, and maintenance operations become risky.

Worker nodes

Worker nodes are the servers where applications actually run. They run pods with containers, network components, monitoring agents, logging, ingress controllers, service operators, and sometimes storage system agents.

Worker nodes are sized according to workload: how much CPU and RAM applications need, what network traffic flows between services, whether local disks are required, whether GPU resources are needed, and how many resources system components consume. Mistakes in worker node sizing lead to pod eviction, latency, memory shortage, network overload, and an inability to safely survive server failure.

Storage nodes and the storage subsystem

Storage is not just “disks in a server.” For Kubernetes, it is a separate architectural layer that must provide applications with persistent data volumes. This may be an external storage array, distributed storage on servers, local disks of worker nodes, network file storage, or a combination of several options.

Kubernetes itself does not make data fault-tolerant. It can attach volumes, manage storage requests, and work with different systems through drivers. But reliability, replication, latency, recovery after failure, and data protection depend on the chosen storage system and its configuration.

When roles can be combined

In a lab or test environment, control plane and worker workloads can run on the same servers. This is convenient for learning, CI/CD testing, demos, development, and small experiments. It saves hardware and simplifies the start.

However, moving a lab design into a production environment without changes is dangerous. If the same node manages the cluster, runs applications, and stores data, any maintenance affects several layers at once. When the server needs an update, the control plane, applications, and storage are all affected. When the node fails, the cluster loses not only compute capacity but also part of the management layer or disk subsystem.

For a small production cluster, teams sometimes start with three servers where roles are partially combined. This is acceptable if the load is low, the risks are understood, backups exist, and there is a growth plan. But once critical applications appear, roles should be separated at least logically: control components should not compete with heavy user workloads, and storage should not depend on random pod placement.

In medium and large clusters, the control plane is better separated from worker nodes. Storage also needs to be sized separately, especially if the cluster runs databases, queues, file services, analytics, or applications with persistent data.

Control plane requirements

For a production environment, a single control node is a weak design. It may be enough for a test setup, but in a working cluster it becomes a single point of failure. If that node is unavailable, already running applications may continue to work, but cluster management, changes, new deployments, and part of automatic recovery will be disrupted.

Three control nodes are usually used for fault tolerance. This is related to quorum: etcd, which stores the cluster state, needs a majority of its members to continue consistent operation. Two nodes look better than one, but they provide no reserve: the majority of two is two, so losing either member, or the link between them, stops changes to the cluster state. Three nodes allow the cluster to survive the failure of one control server, because the remaining two still form a majority. A minority partition cannot accept writes, which rules out split-brain.
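
The quorum arithmetic behind this recommendation can be sketched in a few lines. This is a minimal illustration of majority-based quorum; the function names are chosen here for illustration and do not come from any Kubernetes or etcd tool.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given number of voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(f"{n} members: quorum {quorum(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Two members tolerate no failures, and four tolerate no more than three do, which is why odd sizes such as three or five are the usual choice.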

The minimum requirements from kubeadm documentation should be treated as the lower installation boundary, not as a recommendation for serious operation. Kubernetes documentation for kubeadm states a minimum of 2 GB of RAM per machine and at least 2 CPUs for control plane nodes, but these values leave little room for applications and are more suitable for small or educational scenarios.

In a production cluster, control plane nodes need fast and reliable system disks, preferably SSD or NVMe. It is especially important not to place etcd on slow HDDs or overloaded shared storage. Memory and processors should have headroom for API activity, operators, CI/CD, monitoring, frequent object changes, and growth in the number of pods.

Heavy user applications should not run on control nodes unless there is a clear reason. Even if Kubernetes allows restrictions to be removed and workloads to be placed on the control plane, this should be a conscious decision in production. The control layer must remain stable during application peaks.

Why etcd needs special attention

etcd is one of the most sensitive Kubernetes components. It stores the cluster state: information about deployments, services, secrets, configmaps, namespaces, pods, and other objects. If etcd is lost without a working backup, the description of the entire cluster can be lost.

etcd is sensitive to disk and network latency. It needs fast and predictable storage. It is not recommended to place it on slow storage that also serves heavy user applications. Low latency between etcd members is also important when there are several of them.

etcd backups must be regular. But the mere fact that a snapshot exists is not enough. Recovery must be tested: in an emergency, it is important not only to have the file, but also to know how to restore a working control plane from it. The Kubernetes documentation states that all Kubernetes objects are stored in etcd, and that regular etcd backups are needed to recover the cluster after disasters, including the loss of all control nodes.

An etcd backup may contain sensitive data, including secrets. It therefore needs to be stored securely: with access control, encryption, and a clear retention period. It should not be treated as an ordinary technical copy without restrictions.
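
As a rough sketch of what a scheduled backup job might run, the helper below only builds an `etcdctl snapshot save` command with a timestamped file name. The default endpoint and certificate paths are assumptions that match a typical kubeadm layout and may differ in other installations; the function name and the backup directory are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def snapshot_command(target_dir: str,
                     endpoint: str = "https://127.0.0.1:2379",
                     cacert: str = "/etc/kubernetes/pki/etcd/ca.crt",
                     cert: str = "/etc/kubernetes/pki/etcd/server.crt",
                     key: str = "/etc/kubernetes/pki/etcd/server.key",
                     now: Optional[datetime] = None) -> list[str]:
    """Build an `etcdctl snapshot save` command; nothing is executed here.

    The default paths are assumptions (kubeadm-style layout), not a rule.
    """
    now = now or datetime.now(timezone.utc)
    filename = f"{target_dir}/etcd-{now.strftime('%Y%m%dT%H%M%SZ')}.db"
    return [
        "etcdctl", "snapshot", "save", filename,
        f"--endpoints={endpoint}",
        f"--cacert={cacert}",
        f"--cert={cert}",
        f"--key={key}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it; a real job could pass the
    # list to subprocess.run(..., check=True) on a control plane node.
    print(" ".join(snapshot_command("/var/backups/etcd")))
```

The resulting file still has to be copied off the cluster, encrypted, and periodically restored in a test environment, as described above.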

How to size worker nodes

Worker nodes should be sized by real applications, not by the number of containers. One container may be a small service, while another may be a heavy Java system, database, analytics task, or video processing service. The number of pods alone does not describe the workload.

Each worker node spends part of its resources on system components: kubelet, the container runtime, the network plugin, kube-proxy or an alternative, monitoring agents, logging agents, security tools, and sometimes storage agents. That is why 100% of CPU and RAM cannot be allocated to applications. A reserve is needed for system services and peaks.

Applications should declare resource requests and resource limits. The scheduler places pods according to their requests, while limits cap what a container may consume at runtime. Without requests, the scheduler has a poor picture of how many resources pods actually need. As a result, applications whose combined memory or CPU demand exceeds what the server can provide stably may end up on the same node.
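
The reasoning about system reserve and requests can be expressed as a small check. This is a simplified sketch, not the scheduler's actual algorithm: it only compares the sum of pod requests with what remains after a reserve for system components. The class, the function, and all numbers are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class Node:
    cpu_cores: float            # physical capacity of the server
    memory_gib: float
    reserved_cpu: float         # assumed reserve for kubelet, runtime, agents
    reserved_memory_gib: float

    @property
    def allocatable_cpu(self) -> float:
        return self.cpu_cores - self.reserved_cpu

    @property
    def allocatable_memory_gib(self) -> float:
        return self.memory_gib - self.reserved_memory_gib


def fits(node: Node, pod_requests: list[tuple[float, float]]) -> bool:
    """True if the summed (CPU cores, memory GiB) requests fit into
    the resources left after the system reserve."""
    total_cpu = sum(cpu for cpu, _ in pod_requests)
    total_mem = sum(mem for _, mem in pod_requests)
    return (total_cpu <= node.allocatable_cpu
            and total_mem <= node.allocatable_memory_gib)


if __name__ == "__main__":
    node = Node(cpu_cores=32, memory_gib=128, reserved_cpu=2, reserved_memory_gib=8)
    print(fits(node, [(2, 8)] * 12))    # 24 cores, 96 GiB requested
    print(fits(node, [(2, 12)] * 12))   # 24 cores, 144 GiB requested
```

In the second case the CPU requests fit but the memory requests do not, which is the situation that typically ends in pod evictions on a memory-constrained node.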

Worker nodes can be divided by profile.

Universal worker nodes are suitable for web services, APIs, background jobs, queues, lightweight microservices, and most stateless applications. They need a good balance of CPU and RAM, a fast system disk, and reliable networking.

Nodes with a large amount of memory are needed for applications that consume a lot of RAM: Java services, caches, analytics, and backend systems with aggressive in-memory caching. Here, counting only cores is not enough. If memory is insufficient, pods will be evicted, restarted, or run unstably.

Nodes with fast CPUs are needed for computations, builds, data processing, encoding, intensive APIs, and services sensitive to response latency. For such workloads, not only the number of cores matters, but also frequency, thermal behavior, power reserve, and cooling.

GPU nodes are needed for machine learning, inference, video processing, or graphics tasks. They are sized separately: by power, cooling, PCIe slots, drivers, compatibility, and GPU resource allocation rules.

Stateful nodes are used for applications with persistent data: databases, message brokers, queues, and storage services. Disks, latency, backups, pod placement rules, and the expected behavior during server failure are especially important here.

Storage in Kubernetes: what to decide before buying servers

In Kubernetes, persistent data usually works through PersistentVolume and PersistentVolumeClaim. A PersistentVolume is a storage volume available to the cluster, while a PersistentVolumeClaim is an application’s request for such a volume. StorageClass describes a class of storage, such as fast NVMe, regular SSD, file storage, replicated storage, or volumes with a specific backup policy. Kubernetes describes Persistent Volumes as a persistent storage mechanism, and StorageClass as a way for administrators to describe available storage classes.
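
To make the relationship between these objects more concrete, the sketch below builds a StorageClass and a PersistentVolumeClaim as plain Python dictionaries and renders them as JSON, which Kubernetes accepts alongside YAML. The class name, the claim name, and especially the provisioner `csi.example.com` are placeholders invented for this example; a real cluster uses the provisioner name of its own storage driver.

```python
import json

# Hypothetical StorageClass: the name and provisioner are placeholders and
# must be replaced with the values of the storage driver actually in use.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "fast-nvme"},
    "provisioner": "csi.example.com",
    "reclaimPolicy": "Retain",
    "volumeBindingMode": "WaitForFirstConsumer",
    "allowVolumeExpansion": True,
}

# An application's request for 50 GiB from that class.
claim = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "orders-db-data", "namespace": "default"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": storage_class["metadata"]["name"],
        "resources": {"requests": {"storage": "50Gi"}},
    },
}


def render(manifest: dict) -> str:
    """Serialize a manifest to indented JSON."""
    return json.dumps(manifest, indent=2)


if __name__ == "__main__":
    print(render(storage_class))
    print(render(claim))
```

The claim refers to the class only by name, so the same application manifest can be used on clusters whose storage is implemented very differently.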

Before buying servers, you need to decide where data will live. An external storage array provides mature centralized management, clear levels of fault tolerance, and familiar maintenance procedures. But it costs more, requires the right network, and becomes a separate part of the architecture.

Distributed storage on servers allows you to use local disks of nodes and scale horizontally. But it requires a fast network, proper replica placement, monitoring, disk headroom, and an understanding of how recovery will happen after failure.

Local disks of worker nodes provide good speed and low latency. But if a node fails, data may become unavailable unless replication is configured at the application level or storage system level. This option is not suitable for every application.

File storage is convenient for shared files, but it is not always suitable for databases and high input/output workloads. Latency, locking, access rights, and behavior under load need to be tested.

What storage nodes are needed

Storage nodes should not be turned into ordinary worker servers that can receive any application. If storage competes with user pods for CPU, RAM, disks, and network, predictability drops. For production, the storage role is better made explicit: through separate servers, separate disks, separate placement rules, or an external storage system.

Disks must be server-class. For active databases, queues, journals, and services with low latency requirements, NVMe is better. SSDs may be suitable for less critical data. HDDs are acceptable for cold data, archives, and backups, but not as the basis for active production storage with low latency.

When sizing storage, useful capacity alone is not enough. Replication increases disk consumption. For example, with three replicas, one terabyte of useful data occupies three terabytes of physical space. In addition, snapshots, growth reserve, recovery space, and headroom against overfilling are needed. A full storage system is dangerous not only because space runs out: at high utilization, latency often grows, background operations slow down, and recovery becomes more difficult.
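
A back-of-the-envelope version of this calculation might look as follows. The snapshot overhead of 20% and the target fill level of 70% are assumptions chosen only for illustration; real values depend on the storage system and its vendor guidance.

```python
def raw_capacity_tb(useful_tb: float,
                    replicas: int = 3,
                    snapshot_overhead: float = 0.2,
                    max_fill: float = 0.7) -> float:
    """Raw disk space needed so that replicated data plus snapshots
    stays below the target fill level.

    snapshot_overhead and max_fill are illustrative assumptions.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0 < max_fill <= 1:
        raise ValueError("max_fill must be in (0, 1]")
    stored = useful_tb * (1 + snapshot_overhead) * replicas
    return stored / max_fill


if __name__ == "__main__":
    # 10 TB of application data under the default assumptions.
    print(f"{raw_capacity_tb(10):.1f} TB raw")
```

Under these assumptions, 10 TB of useful data calls for roughly 51 TB of raw capacity, before any growth reserve is added.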

Storage consumes not only disks, but also CPU, RAM, and network. A distributed storage system may actively use processors for replication, compression, checksums, recovery, and data balancing. When a disk or node fails, rebuild begins — replicas are restored. At this moment, disk and network load grows, so the storage segment must have headroom.
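
The effect of network speed on rebuild can be estimated roughly. The sketch assumes that recovery traffic flows over a single link and that only part of the line rate is available for it; it ignores disk throughput, protocol overhead, and parallel recovery from several nodes, so the result is an order-of-magnitude figure at best.

```python
def rebuild_hours(data_tb: float, link_gbit: float, usable_share: float = 0.5) -> float:
    """Rough time to re-replicate data_tb terabytes over one link.

    usable_share is an assumed fraction of the line rate left for recovery
    traffic while applications keep running.
    """
    if link_gbit <= 0 or not 0 < usable_share <= 1:
        raise ValueError("link speed and usable share must be positive")
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbit * 1e9 * usable_share)
    return seconds / 3600


if __name__ == "__main__":
    for speed in (1, 10, 25):
        print(f"{speed:>2} GbE: {rebuild_hours(20, speed):.1f} h to restore 20 TB")
```

With these assumptions, restoring 20 TB takes days on 1GbE and hours on 10GbE or 25GbE, and during that whole period the storage segment runs with reduced redundancy.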

Node roles and server requirements

Node role	What it does	CPU	RAM	Disks	Network	What not to forget
Control plane	Manages the cluster, API, scheduling, state	Moderate CPU with headroom	Enough for API, etcd, and operators	Fast SSD/NVMe for the system and etcd	Stable network between control nodes	3 nodes for production, etcd backup, API protection
Universal worker	Runs web services, APIs, background jobs	Balance of cores and frequency	Based on application profile	SSD/NVMe for system and temporary data	10GbE as a reasonable baseline	Requests/limits, reserve for system agents
Memory-heavy worker	Runs heavy backend services, caches, Java, analytics	Medium or high	Large RAM capacity with reserve	SSD/NVMe	10GbE and higher	Avoid memory overcommit, account for pod eviction
GPU worker	ML, inference, video, graphics	CPU with reserve for feeding data	Based on the task	Fast local disks	10/25GbE depending on load	Power, cooling, PCIe, drivers, GPU scheduling
Storage node	Stores data, replicas, application volumes	CPU for the storage system	RAM for cache and service processes	NVMe/SSD, HDD only for cold data	25GbE is desirable for active storage	Replication, disk monitoring, rebuild, free space
Infrastructure node	Ingress, registry, monitoring, logging	Moderate or high depending on services	Based on metrics and log volume	Fast disks for logs and registry	Reliable external and internal traffic	Do not mix chaotically with business workloads

This table does not replace sizing for specific applications. It shows that a Kubernetes cluster consists of different types of workload. One universal server profile rarely fits API services, storage, GPU tasks, and the control layer equally well.

Kubernetes on-premise networking

Networking is one of the most underestimated parts of an on-premise Kubernetes cluster. It consists of several layers: communication between nodes, the pod network, the service network, external access through ingress or a load balancer, the network to storage, the management network, monitoring, and logging.

Kubernetes uses a network model where each pod receives its own IP address inside the cluster, and the pod network provides connectivity between pods. The implementation of this model depends on the network plugin and the chosen addressing scheme.

For a production cluster, 1GbE usually becomes a weak point quickly. 10GbE can be considered a baseline for production clusters. If there is active storage, intensive service-to-service communication, or large volumes of logs and metrics, 25GbE and higher should be considered. This is especially important for distributed storage, where replication and recovery traffic flows through the network.

The network to storage must not compete with user traffic without proper sizing. If the same channel is used for ingress, data replication, logs, metrics, and service-to-service communication, peak load can create latency across the entire cluster.

Before buying servers and switches, the VLAN structure, MTU, routing, DNS, load balancers, pod address ranges, and service address ranges need to be planned. The network plugin should be selected before the hardware purchase, not after it. If network policies and application segmentation are required, the chosen network layer must support them.
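
One part of this planning, checking that the node, pod, and service address ranges do not collide, is easy to automate. The sketch below uses Python's standard `ipaddress` module; the ranges shown are example values, not a recommendation.

```python
import ipaddress
from itertools import combinations


def find_overlaps(ranges: dict[str, str]) -> list[tuple[str, str]]:
    """Return pairs of named ranges whose address space overlaps."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in ranges.items()}
    return [(a, b) for a, b in combinations(nets, 2)
            if nets[a].overlaps(nets[b])]


if __name__ == "__main__":
    # Example address plan; all ranges are illustrative assumptions.
    plan = {
        "nodes": "10.0.0.0/24",
        "pods": "10.244.0.0/16",
        "services": "10.96.0.0/12",
    }
    print(find_overlaps(plan))                            # no conflicts
    print(find_overlaps(dict(plan, pods="10.0.0.0/16")))  # pods collide with nodes
```

Running such a check against the existing corporate networks as well helps catch conflicts that would otherwise appear only after the cluster is connected to them.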

Ingress nodes can also become a bottleneck. If all external traffic passes through them, they must be redundant and placed correctly. The API server needs stable and protected access: losing network access to the control layer makes diagnostics and maintenance more difficult.

Infrastructure services inside the cluster

Kubernetes runs not only business applications. The cluster needs ingress controllers, an internal registry for images, monitoring, logging, event collection, service operators, certificate management, secrets, and policies.

These components also consume resources. Monitoring stores metrics. Logging can quickly accumulate a large amount of data. A registry requires disk space and stable access. Ingress receives external traffic and must handle peaks. Operators monitor applications and create load on the API server.

Infrastructure components should not be mixed chaotically with user applications. Separate nodes can be allocated for them, or at least placement rules can be used so that monitoring, ingress, and logging do not all end up on one server. Otherwise, the failure of one worker node may simultaneously affect external access, observability, and part of the business services.

Monitoring and alerts

For Kubernetes on-premise, it is not enough to check that servers are powered on and respond over the network. The state of the cluster must be visible: API server, etcd, control components, worker nodes, pods, storage, ingress, network, certificates, backups, and updates.

For the control plane, API server availability, etcd state, etcd latency, control component errors, and request frequency are important. For worker nodes, CPU, RAM, disks, network, memory and disk pressure, frequent container restarts, and pod evictions matter. For storage, volume utilization, disk latency, replication errors, replica state, recovery speed, and proximity to limits must be monitored.

PersistentVolume objects and storage claims also need separate monitoring. If a volume fills up, the application may stop working or, worse, corrupt data. If storage latency grows, the problem may look like a “slow application,” although the real cause is one layer below.

Monitoring should be designed so that it is not the first thing to disappear during an incident. If observability depends entirely on the same storage or the same worker nodes that failed, diagnostics become more difficult. For critical clusters, external metric collection, separate log storage, or at least resilient placement of monitoring components should be considered.

Redundancy and fault tolerance

A Kubernetes cluster must be sized not only for normal operation, but also for failure and maintenance. The minimum practical rule is that the cluster should keep working in N-1 mode, that is, with one server unavailable. The cause may be a failure, a planned update, a disk replacement, a power issue, or rack maintenance.

The control plane must survive the failure of one control node. Worker nodes must have enough free resources for applications to move after a failure. Storage must have replication or external fault tolerance. Ingress and load balancers must not exist as a single instance. Network switches, uplinks, and power are also part of fault tolerance, not “external details.”

If a three-node cluster is 90% loaded during normal operation, it will not safely survive a node failure. During the first incident, pods will have nowhere to move, updates will become risky, and the storage system may start recovery on overloaded disks and networks. Resource headroom is therefore not a luxury, but part of the architecture.
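
The arithmetic behind this example is simple enough to write down. The sketch assumes identical nodes and a workload that redistributes evenly after a failure, which real clusters only approximate; the function names are illustrative.

```python
def max_safe_utilization(nodes: int, failures: int = 1) -> float:
    """Highest average utilization at which the surviving nodes can still
    absorb the whole workload after the given number of node failures."""
    if nodes <= failures:
        raise ValueError("no nodes would remain")
    return (nodes - failures) / nodes


def utilization_after_failure(nodes: int, utilization: float, failures: int = 1) -> float:
    """Average utilization of surviving nodes if load is spread evenly."""
    if nodes <= failures:
        raise ValueError("no nodes would remain")
    return utilization * nodes / (nodes - failures)


if __name__ == "__main__":
    print(f"3 nodes: stay below {max_safe_utilization(3):.0%}")
    print(f"6 nodes: stay below {max_safe_utilization(6):.0%}")
    print(f"3 nodes at 90%: {utilization_after_failure(3, 0.9):.0%} after one failure")
```

Three nodes at 90% would need 135% of the remaining capacity after one failure, so some pods simply cannot be rescheduled. Larger clusters can run closer to full load because each node carries a smaller share of the total.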

etcd backups and application data backups are different tasks. An etcd backup helps restore the cluster state. A database or file storage backup helps restore application data. One does not replace the other.

Cluster updates

Kubernetes needs regular updates. But Kubernetes itself is not the only thing being updated. An on-premise cluster also includes the operating system, container runtime, network plugin, storage drivers, ingress, monitoring, logging, server firmware, network card drivers, and sometimes GPU drivers.

If the cluster has been sized without reserve, every update becomes a risk. To update a worker node, it needs to be drained, pods need to move to other nodes, and the server needs to be returned to service. If there is no spare capacity, the update will either stop part of the applications or be postponed until an emergency.

The control plane is updated step by step. Worker nodes are also better updated in batches, not all at once. Storage components require a separate maintenance window and replication checks. Before an update, backups and a rollback plan are needed. After an update, pod state, ingress, persistent volumes, network policies, and monitoring must be checked.

A good architecture allows the cluster to be maintained without full downtime. A bad architecture works only until the first update.

Example of a small cluster

For a lab or a small production cluster, three servers can be a starting point. In this design, roles are sometimes combined: each node can be part of the control plane and also be a worker node. This saves hardware, but requires an understanding of the limitations.

Each server should preferably have fast SSDs or NVMe, sufficient RAM, two network ports or more, 10GbE for a production scenario, and separate backups outside the cluster. If local disks are used for storage, it must be clear how data will survive node failure.

This design is suitable for small services, development, internal tools, CI/CD, and moderate workloads. But it is poorly suited for heavy databases, strict availability requirements, active analytics, or fast growth. Most importantly, three servers should not be treated as a universal answer to every task.

Example of a medium production cluster

For a medium production cluster, roles should be separated. A typical design may include three dedicated control plane nodes, several worker nodes for applications, and a separate storage system or dedicated storage nodes. There may be three, six, or more worker nodes, depending on workload and reserve requirements.

Ingress can be placed on separate infrastructure nodes or on worker nodes with anti-affinity rules so that multiple instances do not end up on the same server. Monitoring and logging also need resilient placement. The network should be at least 10GbE, and with active storage or high internal traffic, 25GbE is better.

This kind of cluster can no longer be built as “three identical servers and we will configure it later.” It is necessary to know in advance where the control plane will be, where applications will run, where data will live, how updates will be performed, and what will happen if one node is lost.

Example of a high-load cluster

A high-load cluster is designed around specific applications. It usually has a dedicated control plane, several groups of worker nodes for different workload types, separate storage nodes or an external storage array, redundant load balancers, and separate networks for management, applications, and storage.

If there is machine learning or inference, GPU nodes are added. If there are databases and queues, fast storage and placement rules are allocated, or these systems are even moved outside the cluster. If services actively communicate with each other, not only external traffic but also internal traffic must be counted. If there are strict availability requirements, failures of nodes, storage, and network components are tested.

There is no universal specification for such clusters. Two projects with the same number of pods may require completely different hardware: one will be limited by RAM, another by disks, a third by network, and a fourth by GPU.

Typical Kubernetes on-premise configurations

Scenario	Control plane	Worker	Storage	Network	Comment
Lab	1–3 nodes, roles can be combined	On the same servers	Local disks or simple external storage	1/10GbE	Suitable for learning, not for critical services
Small production	3 nodes, partial role combination is possible	3 nodes with reserve	External or distributed storage with backup	10GbE	Monitoring, backup, and a growth plan are needed
Medium production	3 dedicated control plane nodes	3–6 or more worker nodes	Separate storage array or storage nodes	10/25GbE	Roles should be separated physically or logically
Stateful workloads	Dedicated control plane	Worker nodes with placement rules	Fast replicated storage	25GbE desirable	Latency, backup, and recovery testing matter
High load	Dedicated and redundant control plane	Several groups of worker nodes	Separate storage subsystem	25GbE and higher	Failure tests and update procedures are needed
GPU cluster	Separate control plane	Separate GPU nodes	Fast disks for data and models	25GbE depending on the task	PCIe, power, cooling, and drivers must be counted

Common mistakes when choosing servers

Putting all roles on one server and treating it as a Kubernetes cluster. This is acceptable for learning, but it does not provide fault tolerance for production.
Using two control plane nodes and considering the management layer reliable. Two nodes cannot tolerate the loss of either one, because quorum requires a majority; the usual design is an odd number of nodes, most often three.
Using slow disks for etcd. The control layer may start slowing down not because of CPU, but because of storage latency.
Forgetting storage during sizing. Applications start quickly, but problems begin when databases, queues, persistent volumes, and failure recovery appear.
Counting only CPU and RAM while ignoring the network. Kubernetes has a lot of internal traffic: services communicate with each other, storage replicates, and logs and metrics are constantly transferred.
Using 1GbE for active storage. It may work for a small test, but in production it quickly becomes a limitation.
Not leaving reserve for node failure. If all resources are occupied in normal operation, applications have nowhere to move during an incident.
Ignoring ingress, monitoring, logging, and registry. These components are not business applications, but without them the cluster will not be a full platform.
Not setting requests and limits. Without them, the scheduler does not understand the real resource demand of applications.
Not backing up etcd and not testing recovery. A backup that has never been restored cannot be considered working.
Buying servers without accounting for growth. Kubernetes often starts with several services and then becomes the main platform. If expansion is not planned, the architecture may need to be rebuilt in a year.

How to choose servers before purchase

First, describe the applications: which services will run in the cluster, how much CPU and RAM they need, and whether there are databases, queues, files, analytics, ML, GPU, or large log volumes. Then divide workloads into stateless and stateful. Stateless workloads are easier to move between nodes. Stateful workloads require careful storage design and backups.

Next, choose the control plane design. For production, it is better to plan three control plane nodes. Then size worker nodes with system components, requests and limits, peak load, and N-1 mode in mind, meaning the cluster must still carry the full workload with one worker node down. After that, design storage separately: external, distributed, local, or mixed.
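
The worker sizing step can be sketched as simple arithmetic. The function below is a rough estimate under stated assumptions, not a sizing standard: `workers_needed`, the per-node system reserve, the `peak_factor` of 1.3, and the example totals are all hypothetical and should be replaced with measured values.

```python
import math


def workers_needed(total_cpu_request, total_mem_request_gib,
                   node_cpu, node_mem_gib,
                   system_reserved_cpu=1.0, system_reserved_mem_gib=4.0,
                   peak_factor=1.3, spare_nodes=1):
    """Estimate the worker count, including spare nodes for N-1 mode.

    All defaults are illustrative assumptions, not recommendations.
    """
    # Capacity left for applications after OS, kubelet, and system pods.
    usable_cpu = node_cpu - system_reserved_cpu
    usable_mem = node_mem_gib - system_reserved_mem_gib
    if usable_cpu <= 0 or usable_mem <= 0:
        raise ValueError("node is too small for the reserved system overhead")

    # Nodes required by each resource at peak; the scarcer resource wins.
    by_cpu = math.ceil(total_cpu_request * peak_factor / usable_cpu)
    by_mem = math.ceil(total_mem_request_gib * peak_factor / usable_mem)

    # Spare nodes let the cluster lose a node and still place every pod.
    return max(by_cpu, by_mem) + spare_nodes


# Hypothetical workload: 60 cores and 200 GiB of total requests,
# on servers with 32 cores and 128 GiB each.
n = workers_needed(total_cpu_request=60, total_mem_request_gib=200,
                   node_cpu=32, node_mem_gib=128)
print(n)  # 4: three nodes carry the peak load, one is the N-1 reserve
```

Running the same calculation for two or three candidate server configurations shows quickly whether CPU or memory is the limiting resource for a given workload.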

The network must be designed before buying servers and switches. For production, 10GbE is a reasonable baseline; for active storage and heavy internal traffic, it is better to plan for 25GbE. Check the number of ports, redundancy, support for the required transceiver modules, VLANs, MTU, and the connection scheme.
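
One way to see the difference between 10GbE and 25GbE is to estimate how long a bulk transfer takes, for example when data has to be copied between storage nodes. The sketch below is an idealized calculation: the function name `transfer_seconds`, the 2 TB volume, and the 0.9 efficiency factor are assumptions for illustration, and real throughput also depends on disks, protocol overhead, and competing traffic.

```python
def transfer_seconds(data_tb, link_gbit_s, efficiency=0.9):
    """Idealized time to move data_tb terabytes over a single link.

    efficiency is an assumed share of the line rate that is actually usable.
    """
    bits = data_tb * 1e12 * 8  # decimal terabytes to bits
    return bits / (link_gbit_s * 1e9 * efficiency)


for link in (10, 25):
    minutes = transfer_seconds(2, link) / 60
    print(f"{link}GbE: about {minutes:.0f} min to move 2 TB")
```

Under these assumptions the 25GbE link finishes in 40% of the time. That matters most when application traffic has to share the same links with the transfer.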

Ingress, load balancers, monitoring, logging, registry, backup, and updates are planned separately. Rack space, power, cooling, and free switch ports also need to be checked. Servers may fit the specifications but still be unsuitable for the site if power, space, or network capacity is insufficient.
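
The site check can be written down as a small budget comparison. This is a minimal sketch with made-up numbers: `SiteBudget`, `ServerSpec`, `check_site`, and every value in the example are hypothetical, and real planning should use measured peak power and the actual rack and switch layout.

```python
from dataclasses import dataclass


@dataclass
class SiteBudget:
    rack_units: int
    power_watts: int
    switch_ports: int


@dataclass
class ServerSpec:
    name: str
    count: int
    rack_units: int     # height of one server in U
    peak_watts: int     # assumed peak draw of one server
    network_ports: int  # switch ports used by one server


def check_site(budget, servers):
    """Return a list of resources where the planned servers exceed the site."""
    need_u = sum(s.count * s.rack_units for s in servers)
    need_w = sum(s.count * s.peak_watts for s in servers)
    need_p = sum(s.count * s.network_ports for s in servers)

    problems = []
    if need_u > budget.rack_units:
        problems.append(f"rack space: need {need_u} U, have {budget.rack_units} U")
    if need_w > budget.power_watts:
        problems.append(f"power: need {need_w} W, have {budget.power_watts} W")
    if need_p > budget.switch_ports:
        problems.append(f"switch ports: need {need_p}, have {budget.switch_ports}")
    return problems


# Hypothetical plan: 3 control plane, 4 worker, and 3 storage servers.
budget = SiteBudget(rack_units=20, power_watts=6000, switch_ports=24)
plan = [
    ServerSpec("control-plane", count=3, rack_units=1, peak_watts=400, network_ports=2),
    ServerSpec("worker", count=4, rack_units=2, peak_watts=800, network_ports=2),
    ServerSpec("storage", count=3, rack_units=2, peak_watts=700, network_ports=2),
]

problems = check_site(budget, plan)
print(problems)  # ['power: need 6500 W, have 6000 W']
```

In this example the servers fit the rack and the switch, but the power budget is exceeded. This is the kind of mismatch that is cheaper to find before the order is placed.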

Buying hardware before architectural sizing is risky. As a result, CPU may be sufficient while storage is weak; memory may be abundant while the network is narrow; servers may be powerful while updating them without downtime is impossible.

What needs to be planned in the end

Kubernetes on-premise requires not just servers for containers, but a well-designed platform. The control plane is responsible for management and must be stable, redundant, and protected. Worker nodes must match the application profile: general-purpose services, memory-heavy workloads, CPU-heavy workloads, GPU tasks, and stateful applications require different configurations. Storage must be designed separately because data safety, recovery, and performance depend on it.

Networking, monitoring, backups, and updates are as essential to the architecture as processors and memory. A good cluster should not only start on the first day, but also survive node failure, updates, load growth, and disaster recovery. The earlier these questions are addressed, the lower the risk of ending up with a system that formally works but handles real workloads and maintenance poorly.
