Deploying a High-Availability Kubernetes Cluster with kubeadm

To support later verification of a private (on-premises) deployment, I needed a quick Kubernetes cluster in the internal network. For larger clusters I have typically used Kubeasz or Kubespray, but for a small cluster like this one, kubeadm is more efficient.

Below is the recorded process for deploying with kubeadm:

Cluster Nodes:

192.168.1.206 sd-cluster-206 node
192.168.1.207 sd-cluster-207 master,etcd
192.168.1.208 sd-cluster-208 master,etcd,haproxy,keepalived
192.168.1.209 sd-cluster-209 master,etcd,haproxy,keepalived
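
In this topology, haproxy and keepalived on 208/209 serve a virtual IP in front of the three kube-apiservers. A hedged sketch of bootstrapping the first control-plane node against that VIP (the address 192.168.1.210 and port 16443 are hypothetical placeholders; substitute your own):

```shell
# Run on the first master (sd-cluster-207). The VIP and port below are
# placeholders for the haproxy/keepalived front end.
kubeadm init \
  --control-plane-endpoint "192.168.1.210:16443" \
  --kubernetes-version v1.18.3 \
  --image-repository registry.cn-hangzhou.aliyuncs.com/google_containers \
  --pod-network-cidr 10.244.0.0/16 \
  --upload-certs
```

--upload-certs lets the remaining masters join as control-plane nodes with the kubeadm join command printed at the end; the pod-network CIDR matches flannel's default.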

Image Versions:

docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-controller-manager:v1.18.3
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-proxy:v1.18.3
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-apiserver:v1.18.3
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/kube-scheduler:v1.18.3
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/coredns:1.6.5
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/etcd:3.4.3-0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.2
docker pull registry.cn-shanghai.aliyuncs.com/gcr-k8s/flannel:v0.14.0
docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/nginx-ingress-controller:v0.48.1

I. Basic Environment Setup

1. Install Docker

yum install -y yum-utils device-mapper-persistent-data lvm2 git
yum-config-manager --add-repo http://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
yum install docker-ce -y
systemctl start docker
systemctl enable docker
systemctl status docker
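
kubeadm also expects kubelet and Docker to agree on the cgroup driver; a sketch of the commonly recommended daemon.json (the log options are optional assumptions):

```shell
cat > /etc/docker/daemon.json << EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "log-driver": "json-file",
  "log-opts": { "max-size": "100m" }
}
EOF
systemctl restart docker
```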

2. Configure /etc/hosts

cat >> /etc/hosts << hhhh
192.168.1.207 sd-cluster-207
192.168.1.208 sd-cluster-208
192.168.1.209 sd-cluster-209
hhhh

3. Disable Firewall and Set SELinux

systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/sysconfig/selinux

4. Disable Swap

Kubernetes 1.8+ requires disabling swap. If not disabled, kubelet will fail to start by default.
Option 1: Use --fail-swap-on=false in kubelet startup args.
Option 2: Disable system swap.
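
For option 2, the usual pair of commands is (the fstab edit is a sketch; check the file afterwards):

```shell
swapoff -a                                   # turn swap off immediately
sed -i.bak '/\sswap\s/s/^/#/' /etc/fstab     # comment out swap entries persistently
```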

Implementing Internal DNS with Alibaba Cloud PrivateZone + Bind9 + Dnsmasq

Requirements:

  • Alibaba Cloud cluster can resolve internal domain names
  • Office network resolves internal domain names + internet access resolution

Solution:

  • For the first requirement, directly use Alibaba Cloud PrivateZone for resolution.
  • For the second requirement, configure internal domain zones in PrivateZone, then synchronize them to the office network’s bind9 server using Alibaba Cloud’s synchronization tool. Use Dnsmasq as the DNS entry point for the office network: forward public queries to public DNS servers, and forward internal domain queries to the bind9 server.

Some may wonder: Why not use bind9 alone to handle all internal resolutions? The main reason is that in practice, bind9 exhibits performance issues when forwarding to multiple DNS servers simultaneously—occasional timeouts occur. In contrast, Dnsmasq handles this scenario significantly better.
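
A minimal dnsmasq sketch of that split, assuming a hypothetical internal zone internal.example.com, an office bind9 at 192.168.1.53, and AliDNS as the public upstream:

```shell
# /etc/dnsmasq.d/split.conf
server=/internal.example.com/192.168.1.53   # internal zone -> office bind9
server=223.5.5.5                            # everything else -> public DNS
no-resolv                                   # ignore upstreams from /etc/resolv.conf
```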

Getting Started with Argo Events

Previously, we introduced how to install Argo Workflow and trigger tasks. In this article, we focus on a new tool: Argo Events.

What is Argo Events?

Argo Events is an event-driven workflow automation framework for Kubernetes. It supports more than 20 event sources, such as webhooks, S3 events, schedules, and messaging systems including Kafka, GCP Pub/Sub, SNS, and SQS.

Features:

  • Supports events from over 20 event sources and more than 10 trigger types.
  • Enables customization of business-level constraints for workflow automation.
  • Manages everything from simple, linear, real-time workflows to complex, multi-source event scenarios.
  • Complies with the CloudEvents specification.

Components:

  • EventSource (similar to a gateway; sends messages to the event bus)
  • EventBus (the event message queue, implemented with the high-performance distributed messaging system NATS Streaming; note that NATS Streaming reached end of life in June 2023, so the default implementation can be expected to change, e.g. to NATS JetStream)
  • EventSensor (subscribes to the message queue, parameterizes events, and filters them)

Deploying Argo Events

Deploy argo-events:

kubectl create ns argo-events
kubectl apply -n argo-events -f https://raw.githubusercontent.com/argoproj/argo-events/v1.2.3/manifests/install.yaml

Deploy argo-eventbus:

kubectl apply -n argo-events -f https://raw.githubusercontent.com/argoproj/argo-events/stable/examples/eventbus/native.yaml

RBAC Account Authorization

Create operate-workflow-sa account

Grant operate-workflow-sa permission to create Argo Workflows within the argo-events namespace — required for EventSensor to automatically create workflows later.
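
A hedged sketch of that authorization (the Role and RoleBinding names and the verb list are illustrative assumptions):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: operate-workflow-sa
  namespace: argo-events
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: operate-workflow-role
  namespace: argo-events
rules:
  - apiGroups: ["argoproj.io"]
    resources: ["workflows"]
    verbs: ["create", "get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: operate-workflow-binding
  namespace: argo-events
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: operate-workflow-role
subjects:
  - kind: ServiceAccount
    name: operate-workflow-sa
    namespace: argo-events
```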

Argo Workflow Practice I: Installation and Deployment

Introduction & Architecture

Argo Workflows is an open-source, container-native workflow engine for orchestrating parallel jobs on Kubernetes. It is implemented entirely as Kubernetes Custom Resource Definitions (CRDs), including Workflow, Workflow Template, and Cron Workflow.

What Can Argo Workflows Do?

  • Define workflows where each step is a container.
  • Model multi-step workflows as a sequence of tasks or capture task dependencies using Directed Acyclic Graphs (DAGs).
  • Easily run compute-intensive jobs, such as machine learning or data processing, on Kubernetes in a fraction of the time.
  • Run CI/CD pipelines natively on Kubernetes without configuring complex software development tooling.

Key Features of Argo Workflows:

  • Workflow: Orchestrates multiple workflow templates with customizable execution order.
  • Workflow Template: A reusable template definition for workflows; can be invoked by other workflows or templates within the same namespace or cluster.
  • Cluster Workflow Template: A cluster-scoped workflow template accessible across all namespaces via ClusterRole permissions.
  • Cron Workflow: Scheduled workflow type, equivalent to an advanced version of Kubernetes CronJob.

Installation & Configuration

Install Argo Workflows

We are installing the stable version 2.12.10. The installation process sets up ServiceAccount, Role, ClusterRole, Deployment, and other necessary components.
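
The commands follow the project's usual release pattern; a sketch (verify the manifest URL against the v2.12.10 release page):

```shell
kubectl create ns argo
kubectl apply -n argo -f https://github.com/argoproj/argo/releases/download/v2.12.10/install.yaml
```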

Configure GitLab Runner with Ceph S3

When building frontend projects with npm, it’s common for dependency downloads to take a long time, and reusing artifacts or caches across different jobs is also challenging. Whether using artifacts or cache, we ultimately need persistent reuse of files. Here, we’ll use cache as an example.

Note: The GitLab Runner is deployed to the Kubernetes cluster via Helm chart (deployment details are omitted). You must prepare a Ceph S3 key pair in advance for configuring accesskey and secretkey.
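
With the key pair in hand, the cache can be pointed at Ceph's S3-compatible RGW endpoint. A sketch of the relevant Helm values (endpoint, bucket, and secret name are assumptions; the secret holds the accesskey/secretkey pair):

```yaml
runners:
  cache:
    cacheType: s3
    cacheShared: true
    s3ServerAddress: rgw.example.internal:7480
    s3BucketName: runner-cache
    s3CacheInsecure: true       # plain HTTP inside the cluster
    secretName: s3access        # contains accesskey / secretkey
```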

Automated ECS Creation with Terraform

Quickly Create an Alibaba Cloud ECS Instance

Specify Terraform Version

Here, we specify the Alibaba Cloud provider version and set the required Terraform version.

# mkdir aliyun-ecs-one && cd aliyun-ecs-one
# touch versions.tf
# vim versions.tf
terraform {
  required_providers {
    alicloud = {
      source  = "aliyun/alicloud"
      version = "1.115.1"
    }
  }

  required_version = ">= 0.12"
}

Configure Variables

Here we define key pairs, cloud region, ECS account, and image information.
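
A sketch of variables.tf along those lines (all names and defaults are illustrative):

```hcl
# variables.tf
variable "region" {
  default = "cn-hangzhou"
}

variable "key_name" {
  description = "Key pair bound to the ECS instance"
  default     = "tf-key-one"
}

variable "instance_type" {
  default = "ecs.t5-lc1m2.small"
}

variable "image_id" {
  default = "centos_7_9_x64_20G_alibase_20210128.vhd"
}
```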

Terraform Installation and Command Reference

Installing Terraform

Installing on Mac

brew tap hashicorp/tap
brew install hashicorp/tap/terraform

Installing on Linux

  1. Ubuntu installation
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main"
sudo apt-get update && sudo apt-get install terraform
  2. CentOS installation
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo
sudo yum -y install terraform

Verifying Installation

# terraform -v
Terraform v0.14.3

Your version of Terraform is out of date! The latest version
is 0.14.7. You can update by downloading from https://www.terraform.io/downloads.html
# terraform
Usage: terraform [global options] <subcommand> [args]

The available commands for execution are listed below.
The primary workflow commands are given first, followed by
less common or more advanced commands.

Main commands:
  init          Prepare your working directory for other commands
  validate      Check whether the configuration is valid
  plan          Show changes required by the current configuration
  apply         Create or update infrastructure
  destroy       Destroy previously-created infrastructure

All other commands:
  console       Try Terraform expressions at an interactive command prompt
  fmt           Reformat your configuration in the standard style
  force-unlock  Release a stuck lock on the current workspace
  get           Install or upgrade remote Terraform modules
  graph         Generate a Graphviz graph of the steps in an operation
  import        Associate existing infrastructure with a Terraform resource
  login         Obtain and save credentials for a remote host
  logout        Remove locally-stored credentials for a remote host
  output        Show output values from your root module
  providers     Show the providers required for this configuration
  refresh       Update the state to match remote systems
  show          Show the current state or a saved plan
  state         Advanced state management
  taint         Mark a resource instance as not fully functional
  untaint       Remove the 'tainted' state from a resource instance
  version       Show the current Terraform version
  workspace     Workspace management

Global options (use these before the subcommand, if any):
  -chdir=DIR    Switch to a different working directory before executing the
                given subcommand.
  -help         Show this help output, or the help for a specified subcommand.
  -version      An alias for the "version" subcommand.

Terraform Commands for Resource Management

Initializing Resources

For a Terraform project, I created three basic files: main.tf (entry file), variables.tf (variable definitions), and versions.tf (version information).
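
With those three files in place, the working directory is initialized and the plan/apply cycle follows; a sketch:

```shell
terraform init       # download the providers declared in versions.tf
terraform validate   # check the configuration for errors
terraform plan       # preview the changes to be made
terraform apply      # create or update the resources
```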

Introduction to the Automation Orchestration Tool Terraform

What is Terraform?

Terraform is an open-source infrastructure orchestration tool introduced by HashiCorp around 2014. It is now supported by nearly all major cloud service providers, including Alibaba Cloud, Tencent Cloud, Huawei Cloud, AWS, Azure, Baidu Cloud, and more. Many companies today build their infrastructure using Terraform.

Background: In traditional operations, launching a business required multiple preparatory steps such as hardware procurement, server rack mounting, network setup, and system installation. With the rise of cloud computing, major public cloud providers offer user-friendly graphical interfaces—users can purchase various cloud resources via a browser and quickly set up their architecture. However, as business architectures expand, the scale and variety of cloud resource procurement continue to grow. When users need to rapidly acquire large numbers of diverse cloud resources, the numerous interactive operations across cloud management consoles actually reduce procurement efficiency. For example, initializing a classic VPC network on the Alibaba Cloud console—from creating the VPC and VSwitches to setting up NAT gateways, elastic IPs, and routing configurations—can take 20 minutes or even longer. Moreover, the non-reproducible nature of manual work leads to redundant efforts when operating across regions or multi-cloud environments.
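
As a taste of what that looks like in code, a hedged sketch of the classic VPC setup from the example above, written against the Alibaba Cloud provider (argument names vary between provider versions; the names, CIDRs, and zone are illustrative):

```hcl
resource "alicloud_vpc" "main" {
  name       = "tf-vpc"
  cidr_block = "172.16.0.0/16"
}

resource "alicloud_vswitch" "main" {
  vpc_id            = alicloud_vpc.main.id
  cidr_block        = "172.16.1.0/24"
  availability_zone = "cn-hangzhou-b"
}

resource "alicloud_nat_gateway" "main" {
  vpc_id        = alicloud_vpc.main.id
  specification = "Small"
}
```

One terraform apply reproduces in minutes what the console walkthrough above takes twenty, and the same files can be replayed across regions.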