Deploy PrometheusAlert
Create a WeChat Work Group Robot
After creating a WeChat Work group, right-click the group → “Add Group Robot”. This will generate a webhook URL for the robot. Record this URL for later use.
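Before wiring PrometheusAlert up, the webhook can be sanity-checked with curl. The URL format below is WeChat Work's standard group-robot endpoint; the key is a placeholder for the one you recorded:

```shell
# Placeholder key -- substitute the webhook URL recorded above.
WEBHOOK_URL="https://qyapi.weixin.qq.com/cgi-bin/webhook/send?key=<your-robot-key>"

# Write the test payload to a file so it can be inspected or reused.
printf '%s\n' '{"msgtype": "text", "text": {"content": "PrometheusAlert test message"}}' > /tmp/wechat-payload.json

# The robot replies {"errcode":0,...} on success; --max-time avoids hanging
# on networks that cannot reach WeChat Work.
curl -s --max-time 5 -H "Content-Type: application/json" \
  -d @/tmp/wechat-payload.json "$WEBHOOK_URL" || true
```

If the group receives the test message, the webhook is working and can be configured in PrometheusAlert.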
Backups are a task every internet company's technical team has to handle, and we are no exception. Today I'll share my own strategy for backing up production Kubernetes clusters.
My primary goal for Kubernetes backups is to guard against catastrophic cluster-level failures. Backing up etcd protects against loss of etcd data, which would render the entire cluster unusable; in that case, only a full cluster restore can bring services back.
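For the etcd piece, a snapshot can be taken with `etcdctl`. Below is a sketch for a kubeadm-style cluster; the cert and key paths are kubeadm defaults and the backup path is illustrative, so adjust both for your environment:

```shell
# Sketch: snapshot etcd on a kubeadm-provisioned control-plane node.
# Cert/key paths are kubeadm defaults -- adjust for your cluster.
ETCDCTL_API=3 etcdctl snapshot save /var/backups/etcd-$(date +%Y%m%d).db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key

# Verify the snapshot is readable:
ETCDCTL_API=3 etcdctl snapshot status /var/backups/etcd-$(date +%Y%m%d).db
```

Running this from cron and shipping the snapshot off the node covers the cluster-level failure case described above.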
Recently, our internal project has been supporting a big data initiative, requiring the simulation of customer scenarios using Greenplum (older version 4.2.2.4). Below is a record of the Greenplum cluster setup process—note that the procedure for higher versions of GP remains largely identical.
CentOS 6 Dockerfile:
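A minimal CentOS 6 base image for a Greenplum test cluster might look like the sketch below; the package list, `gpadmin` user, and sshd setup are assumptions for illustration, not the author's original file:

```dockerfile
# Hypothetical sketch of a CentOS 6 base image for Greenplum testing.
# Note: CentOS 6 is EOL; yum mirrors may need to point at vault.centos.org.
FROM centos:6

# Greenplum relies on passwordless SSH between segment hosts.
RUN yum install -y openssh-server openssh-clients which tar ed net-tools && \
    yum clean all

# Conventional admin user for Greenplum installs.
RUN useradd -m gpadmin

EXPOSE 22
CMD ["/usr/sbin/sshd", "-D"]
```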
Build image:
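The build itself is a single command run from the directory containing the Dockerfile; the image tag `centos6-gpdb:base` is illustrative:

```shell
# Tag name is illustrative -- use whatever naming your registry expects.
docker build -t centos6-gpdb:base .
```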
Before deployment, ensure that nvidia-driver and nvidia-docker are installed on your Kubernetes nodes, and Docker’s default runtime has been set to nvidia.
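Setting nvidia as Docker's default runtime is done in `/etc/docker/daemon.json`; the runtime path below is the nvidia-docker2 packaged default, so verify it matches your install, then restart Docker:

```json
{
  "default-runtime": "nvidia",
  "runtimes": {
    "nvidia": {
      "path": "/usr/bin/nvidia-container-runtime",
      "runtimeArgs": []
    }
  }
}
```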
Ensure kubectl is already installed (omitted here).
To verify a private deployment later, I needed to quickly stand up a Kubernetes cluster on the internal network. For larger clusters I have typically used Kubeasz or Kubespray; for a small cluster like this, kubeadm is more efficient.
Below is the recorded process for deploying with kubeadm:
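As a sketch of a typical kubeadm bring-up on the first control-plane node (the pod CIDR is illustrative, and the join command comes from `kubeadm init` output, not from here):

```shell
# Initialize the control plane; pod CIDR shown is a common flannel default.
kubeadm init --pod-network-cidr=10.244.0.0/16

# Configure kubectl for the current user, as kubeadm init instructs.
mkdir -p $HOME/.kube
sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

# Join workers with the exact command kubeadm init prints, e.g.:
# kubeadm join <master-ip>:6443 --token <token> --discovery-token-ca-cert-hash sha256:<hash>
```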
/etc/hosts

Kubernetes 1.8+ requires swap to be disabled; if it is not, kubelet will fail to start by default.
Option 1: Use --fail-swap-on=false in kubelet startup args.
Option 2: Disable system swap.
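Option 2 amounts to turning swap off now and commenting out swap entries in /etc/fstab so it stays off after reboot. The sed edit is demonstrated below against a sample copy; on a real node, run `swapoff -a` and apply the same sed to /etc/fstab as root:

```shell
# Demonstrate the fstab edit on a sample copy (run the real thing as root).
printf '/dev/sda1 / ext4 defaults 0 1\n/dev/sda2 swap swap defaults 0 0\n' > /tmp/fstab.demo

# swapoff -a    # disables swap immediately (requires root)
sed -i '/[[:space:]]swap[[:space:]]/s/^/#/' /tmp/fstab.demo   # comment out swap mounts
cat /tmp/fstab.demo
```

After the edit, only the swap line is commented; regular mounts are untouched.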
Requirements:
- Alibaba Cloud cluster can resolve internal domain names
- The office network resolves internal domain names and still resolves public internet domains
Solution:
Some may wonder: Why not use bind9 alone to handle all internal resolutions? The main reason is that in practice, bind9 exhibits performance issues when forwarding to multiple DNS servers simultaneously—occasional timeouts occur. In contrast, Dnsmasq handles this scenario significantly better.
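As a sketch of the Dnsmasq side of this split (the domain and upstream IPs below are placeholders, not our real topology):

```
# /etc/dnsmasq.conf sketch -- domain and IPs are placeholders.
# Forward the internal zone to the internal DNS (bind9 or cloud DNS):
server=/internal.example.com/10.0.0.2
# Everything else goes to a public resolver:
server=223.5.5.5
# Ignore /etc/resolv.conf; use only the servers listed above.
no-resolv
```

The per-domain `server=/domain/ip` form is what lets Dnsmasq split internal and public resolution cleanly.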
Previously, we introduced how to install Argo Workflows and trigger tasks. In this article, we focus on a new tool: Argo Events.
Argo Events is an event-driven workflow automation framework for Kubernetes. It supports over 20 different event sources, such as webhooks, S3 events, calendars/cron schedules, and message queues like Kafka, GCP Pub/Sub, SNS, and SQS.
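To give a flavor of the CRD-based configuration, a minimal webhook EventSource looks roughly like this (names, port, and endpoint are illustrative, adapted from the upstream examples):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: webhook
spec:
  webhook:
    example:            # illustrative event name
      port: "12000"
      endpoint: /example
      method: POST
```

A Sensor resource then subscribes to these events and triggers workflows in response.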
Create the operate-workflow-sa service account and grant it permission to create Argo Workflows within the argo-events namespace; this is required for the Sensor to automatically create workflows later.
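The grant is plain Kubernetes RBAC. A sketch of the manifests (the role and binding names are illustrative; the verb list is an assumption of what workflow creation needs):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: operate-workflow-sa
  namespace: argo-events
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: operate-workflow-role      # illustrative name
  namespace: argo-events
rules:
  - apiGroups: ["argoproj.io"]
    resources: ["workflows"]
    verbs: ["create", "get", "list", "watch", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: operate-workflow-binding   # illustrative name
  namespace: argo-events
subjects:
  - kind: ServiceAccount
    name: operate-workflow-sa
    namespace: argo-events
roleRef:
  kind: Role
  name: operate-workflow-role
  apiGroup: rbac.authorization.k8s.io
```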
Argo Workflows is an open-source, container-native workflow engine designed to orchestrate parallel jobs on Kubernetes. It leverages Kubernetes Custom Resource Definitions (CRDs) to implement its full architecture, including Workflow, WorkflowTemplate, and CronWorkflow.
We are installing the stable version 2.12.10. The installation process sets up ServiceAccount, Role, ClusterRole, Deployment, and other necessary components.
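The install itself boils down to applying the release manifests; the sketch below follows the upstream manifest layout for v2.12.10, but verify the path against the actual release you use:

```shell
kubectl create namespace argo
# Manifest URL follows the upstream repo layout -- verify before applying.
kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo-workflows/v2.12.10/manifests/install.yaml
```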