RKE2 | Squirrelworks

Project Initiative: Next-Gen Cloud Native Infrastructure

Mission Parameters & Architectural Vision

The objective of this engineering initiative is to transition the Squirrelworks lab ecosystem away from legacy virtualization models and establish a hardened, production-grade multi-node Kubernetes cluster. Built on enterprise-tier open source tooling and hosted as virtual machines on a bare-metal Proxmox node, the platform is meant to deliver stable, self-healing runtime behavior within the limits of lab hardware.

Primary Infrastructure Targets

Stripped Kernel Footprint

Deploying the Rancher Kubernetes Engine (RKE2) on minimal Rocky Linux 9 nodes to keep the host footprint small and leave as much CPU and RAM as possible for cluster workloads.

eBPF-Driven Data Highway

Bypassing legacy iptables-based service routing by having Cilium load eBPF bytecode straight into the kernel, giving packets a shorter, lower-overhead path than long iptables chains.

The Entropy Reduction Blueprint

This cluster setup functions as a technical playbook for predictable, scalable operations. Removing intermediate abstractions and decoupling pod networking from the distribution's default CNI is intended to make performance more predictable across the entire compute host layer.

Control Plane Architecture: RKE2 Initialization

1. Provisioning the Rocky Linux 9 Base

To establish a resilient cloud-native control plane, the control plane node was deployed as a minimal Rocky Linux 9 virtual machine within Proxmox. Remote access was locked down to SSH key authentication, providing a stable foundation for the enterprise-grade Rancher Kubernetes Engine (RKE2).

2. Decoupling the Network Layer (CNI Switch)

During initial orchestration, a custom configuration file was staged at /etc/rancher/rke2/config.yaml containing the directive cni: none. This intentional override skips the default Canal network plugin, so the cluster comes up without any pod network and the control plane sits in a deliberate holding pattern until a CNI is installed.
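As a rough illustration of that override, the sketch below prints a minimal server configuration. Only the cni: none line comes from the setup described here; the function name render_server_config is hypothetical, and a real cluster may need further keys that are not shown.

```shell
# Print a minimal RKE2 server config that disables the bundled CNI.
# Only "cni: none" is taken from this write-up; other settings a real
# cluster might need are deliberately left out of the sketch.
render_server_config() {
    cat <<'EOF'
# /etc/rancher/rke2/config.yaml
cni: none
EOF
}
```

One way to use it would be render_server_config | sudo tee /etc/rancher/rke2/config.yaml, run before the first start of rke2-server.service so the default CNI is never deployed.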

The Intentional NotReady Safe-State

Running kubectl get nodes after initialization shows the node in a NotReady state. This is not an error. It is expected behavior: the API server is up and answering requests, but the node stays NotReady until a network plugin, here the eBPF network fabric, is installed and claims its interface.

Cluster Expansion & Host Recovery

1. Static Token Authentication

Expanding the multi-node infrastructure required staging the secondary machine, rocky-worker-01 (192.168.0.198). The join token from the primary server was placed in the worker's RKE2 configuration, and the worker's handshake with the control plane was verified.
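A minimal sketch of the worker-side configuration follows. The function name render_agent_config and the TOKEN variable are hypothetical. The sketch assumes the agent registers against the supervisor port 9345 on the control plane, and that the token is read from /var/lib/rancher/rke2/server/node-token on the server, which is where RKE2 normally stores it.

```shell
# Print an RKE2 agent config that points at the control plane's
# supervisor port (9345), which is used for node registration.
# $1 = control plane IP, $2 = join token copied from the server.
render_agent_config() {
    server_ip=$1
    token=$2
    printf 'server: https://%s:9345\n' "$server_ip"
    printf 'token: %s\n' "$token"
}
```

On the worker this could be applied with render_agent_config 192.168.0.197 "$TOKEN" | sudo tee /etc/rancher/rke2/config.yaml before starting rke2-agent.service.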

2. Remediation: Post-Crash State Alignment

Following a system-level interruption right after package deployment, a review of the service state showed that rke2-agent.service was enabled but had never actually started. Enabling a unit only arranges for it to start at boot, so the fix was to start the service manually from the worker host CLI:

sudo systemctl start rke2-agent.service

The node joined the cluster cleanly, appearing in the control plane's node list with a fresh age and without a single TLS handshake error.
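The enabled-versus-running distinction behind this fix can be sketched as a small helper. The function name ensure_unit_running is hypothetical; it relies only on the standard systemctl subcommands is-active, is-enabled and start.

```shell
# Start a systemd unit only if it is not already running.
# "enabled" means the unit is set to start at boot;
# "active" means it is running right now.
ensure_unit_running() {
    unit=$1
    if systemctl is-active --quiet "$unit"; then
        echo "$unit already active"
        return 0
    fi
    echo "$unit enabled=$(systemctl is-enabled "$unit" 2>/dev/null) but not active; starting"
    sudo systemctl start "$unit"
}
```

Calling ensure_unit_running rke2-agent.service on the worker would report the mismatch and then start the agent, which is the same state change made by hand above.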

Deployment Artifact: Package Management & Tooling

1. Minimal OS Path Constraints

Because RKE2 ships only the binaries it needs to run the cluster, an inspection of its directory structure showed that standard utilities like helm are not bundled. Attempts to create quick symlinks failed because the target paths did not exist within the RKE2 directories.

2. Restoring the Upstream Binary Matrix

To work around the minimal Rocky Linux install without cluttering environment path files, the vanilla upstream binary for Helm v3 was downloaded. Because the minimal OS profile did not include tar, the package manager was used to install it so the archive could be extracted and the binary placed with execute permission:

sudo dnf install -y tar && tar -zxvf helm.tar.gz
sudo install -m 0755 linux-amd64/helm /usr/local/bin/helm  # assumes the linux-amd64 archive layout

Lab Component	Address / Version	Role / Location
rocky-control-01	192.168.0.197	Primary Control Plane Listener
rocky-worker-01	192.168.0.198	Compute Host Agent Client
Package Tooling	Helm v3 Stable Binary	Installed to /usr/local/bin

The Happy Helming Milestone

With the binary installed at /usr/local/bin/helm, the repository index updated without errors. Adding the official Cilium stable repository connects the local Helm client to the charts needed for the eBPF rollout.
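The repository step can be sketched as follows. The function name add_cilium_repo and the HELM override variable are hypothetical; https://helm.cilium.io/ is the chart repository published by the Cilium project, and the default binary path is the one used in this write-up.

```shell
# Register the Cilium chart repository and refresh the local index.
# HELM may be set to override the binary; the default matches the
# install location used in this write-up.
add_cilium_repo() {
    helm_bin=${HELM:-/usr/local/bin/helm}
    "$helm_bin" repo add cilium https://helm.cilium.io/ &&
    "$helm_bin" repo update
}
```

Setting HELM=echo before calling the function prints the two commands instead of running them, which is a cheap way to preview what would be executed.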

CNI Orchestration: API Endpoint Remediation

1. Diagnosing the Port Mismatch

Initial deployment showed the Cilium pods stuck in initialization, reporting an Init:CrashLoopBackOff status. A closer review via kubectl describe pod revealed an endpoint conflict: Cilium was attempting to reach the Kubernetes API through the RKE2 Supervisor Port (9345) instead of the standard Kubernetes API Server Port (6443).

The Supervisor Error Vector

The supervisor on port 9345 handles internal node registration and certificate synchronization; it is not a general-purpose Kubernetes API endpoint. Passing API requests to it fails immediately with: "the server could not find the requested resource (get namespaces kube-system)".
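To make the two ports harder to confuse, a small check along these lines could be run on any endpoint before it is handed to Helm. The function name classify_endpoint is hypothetical, and the sketch only handles plain host:port URLs (no IPv6 literals).

```shell
# Classify an RKE2 endpoint URL by its port.
# 9345 = RKE2 supervisor (node registration), 6443 = Kubernetes API server.
classify_endpoint() {
    url=$1
    hostport=${url#*://}      # drop the scheme
    hostport=${hostport%%/*}  # drop any path
    port=${hostport##*:}      # keep the text after the last colon
    case "$port" in
        6443) echo "kubernetes-api" ;;
        9345) echo "rke2-supervisor" ;;
        *)    echo "unknown" ;;
    esac
}
```

For the addresses in this lab, classify_endpoint https://192.168.0.197:9345 prints rke2-supervisor, while the same host on port 6443 prints kubernetes-api, which is the endpoint Cilium needs.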

2. Executing the Infrastructure Parameters Update

To point Cilium at the valid API endpoint, a Helm upgrade was run from the control plane terminal, setting the API server host and port explicitly:

sudo /usr/local/bin/helm upgrade cilium cilium/cilium --version 1.15.4 \
  --namespace kube-system \
  --set operator.replicas=1 \
  --set kubeProxyReplacement=true \
  --set k8sServiceHost=192.168.0.197 \
  --set k8sServicePort=6443 \
  --kubeconfig /etc/rancher/rke2/rke2.yaml

Helm applied the parameter updates cleanly and reported the release at REVISION: 2.

Hypervisor Hardening: eBPF Routing Storm Remediation

1. The Proxmox Bridge Packet Collision

Activating kubeProxyReplacement=true immediately triggered a critical network outage, severing management access to the Proxmox hypervisor IP (192.168.0.105). Cilium's eBPF datapath attaches directly to the VMs' network interfaces to handle packet routing itself. This clashed head-on with the Proxmox VE firewall, which was still enabled on the VMs' virtual network devices.

With both packet-filtering layers active at once, the result was an intense Layer 2 MAC flapping loop and ARP storm that saturated the Proxmox virtual bridge (vmbr0) and dropped all external management traffic.

2. Emergency Console Isolation & Mitigation

To break the loop without risking a host kernel crash, access was established via the local console of the physical Proxmox node, where the cluster virtual machines were stopped and host networking was restarted:

# Local Hypervisor Emergency Restoration Terminal
qm stop 107 && qm stop 108
systemctl restart networking

3. Decoupling Filtering Rules for Native eBPF Access

To keep the cluster datapath clear going forward, the Proxmox firewall option was unchecked on the network devices of both rocky-control-01 and rocky-worker-01. This hands full Layer 2–4 network policy management for the cluster VMs over to Cilium's own policy engine.

Interface Target	Hypervisor Firewall State	CNI Routing Strategy
VM 107 (Control)	DISABLED	eBPF Direct Host-Routing (ens18)
VM 108 (Worker)	DISABLED	eBPF Direct Host-Routing (ens18)
PVE Bridge (vmbr0)	ENABLED	Hypervisor Management Plane Protection

Verification: Active Cluster Telemetry

1. End-to-End Data Plane Validation

Following host network restoration, both compute nodes rejoined the cluster successfully. Querying the CNI via cilium status verified that the datapath was fully operational.

The eBPF controller reported 2/2 cluster hosts reachable over direct routing lanes with zero memory allocation errors or route tracking dropouts.

2. CoreDNS Status Resolution

With the network fabric now claiming the underlying kernel interfaces, the core cluster DNS pods left their Pending state automatically. They were assigned pod IPs and the CoreDNS instances scaled up without further intervention.

The Deterministic Goal State Attained

Running kubectl get nodes confirms both hosts report a Ready status. The Squirrelworks lab multi-node environment is fully live, secure, and ready to receive production-tier container deployments.
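For a repeatable version of this final check, something like the sketch below could gate later automation. The function name all_nodes_ready is hypothetical; it reads kubectl get nodes --no-headers output on stdin and treats any status other than an exact Ready (including Ready,SchedulingDisabled) as not ready.

```shell
# Succeed only if at least one node is listed and every node's
# STATUS column (second field) is exactly "Ready".
all_nodes_ready() {
    awk '
        { total++ }
        $2 != "Ready" { bad++; print "not ready: " $1 " (" $2 ")" }
        END { exit (total == 0 || bad > 0) ? 1 : 0 }
    '
}
```

A typical invocation would be kubectl get nodes --no-headers | all_nodes_ready && echo "cluster ready", with the kubeconfig at /etc/rancher/rke2/rke2.yaml supplied as in the Helm command earlier.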
