I. Background

As the business has grown rapidly, our offline analysis Doris data warehouse cluster has been facing resource contention among diverse workloads: daily data science jobs, offline batch processing, and customer-facing report queries. To address this, we needed to limit the resources that different accounts can request. After evaluation, we adopted the official Doris Workload Group solution.

Doris version: doris-2.1.8-1-834d802457

II. Workload Group Introduction

Workload Group is an in-process resource isolation mechanism provided by Apache Doris that achieves resource isolation between different business loads through fine-grained division of CPU, memory, and IO resources within BE processes.

The principle is as shown in the diagram below:

(figure: workload_group architecture diagram)

Currently supported isolation capabilities include:

- CPU: soft limit / hard limit. The soft limit allocates CPU time by weight; the hard limit is an absolute upper bound.
- Memory: soft limit / hard limit. When the hard limit is exceeded, queries are automatically killed to free memory.
- IO: rate limit. Caps IO throughput when reading local or remote files.
- Concurrency: queuing. Queries beyond the concurrency limit wait in a queue.
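For intuition, the weight-based soft limit can be worked out with simple arithmetic (plain awk, not Doris code): two groups with cpu_share 1024 and 2048 split a saturated CPU 1:2.

```shell
# Under the CPU soft limit, contention is resolved proportionally to cpu_share.
# Two groups weighted 1024 and 2048 split a saturated CPU 1:2.
awk 'BEGIN { g1 = 1024; g2 = 2048; total = g1 + g2
  printf "g1: %.1f%%  g2: %.1f%%\n", 100 * g1 / total, 100 * g2 / total }'
# → g1: 33.3%  g2: 66.7%
```

When the CPU is idle, either group may exceed its proportional share; the weight only matters under contention.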

For more information: https://doris.apache.org/zh-CN/docs/2.1/admin-manual/workload-management/workload-group

III. Doris Cluster Adjustment to Support Cgroup (Manual Deployment)

3.1 Cgroup Environment Configuration

Check the system’s supported CGroup version:

root@doristest:~# cat /proc/filesystems | grep cgroup
nodev   cgroup
nodev   cgroup2

Confirm the effective CGroup version:

root@doristest:~# ls /sys/fs/cgroup/cpu/
ls: cannot access '/sys/fs/cgroup/cpu/': No such file or directory 
root@doristest:~# ls /sys/fs/cgroup/cgroup.controllers
/sys/fs/cgroup/cgroup.controllers # Indicates cgroup v2 is in effect
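The two checks above can be wrapped in a small helper. This is a sketch (the function name is ours, not part of Doris tooling) keyed off the same markers: cgroup.controllers exists only under v2, while v1 exposes per-controller directories such as cpu/.

```shell
# Sketch: detect the effective cgroup version at a given mount point.
# /sys/fs/cgroup/cgroup.controllers exists only under cgroup v2;
# a cpu/ controller directory indicates cgroup v1.
detect_cgroup_version() {
  local mount_point="${1:-/sys/fs/cgroup}"
  if [ -f "${mount_point}/cgroup.controllers" ]; then
    echo "v2"
  elif [ -d "${mount_point}/cpu" ]; then
    echo "v1"
  else
    echo "unknown"
  fi
}
```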

3.2 Create Doris CGroup Directory

# ========== CGroup v1 Operations ==========
mkdir /sys/fs/cgroup/cpu/doris

# Set permissions (root is the user running BE)
chmod 770 /sys/fs/cgroup/cpu/doris
chown -R root:root /sys/fs/cgroup/cpu/doris


# ========== CGroup v2 Operations ==========
mkdir /sys/fs/cgroup/doris

# Set permissions
chmod 770 /sys/fs/cgroup/doris
chown -R root:root /sys/fs/cgroup/doris  # Adjust as needed, currently root

3.3 CGroup v2 Additional Configuration

CGroup v2 has stricter permission control and requires additional configuration:

# Modify root directory cgroup.procs file permissions
chmod a+w /sys/fs/cgroup/cgroup.procs

# Enable the CPU controller
# Enter the doris directory
cd /sys/fs/cgroup/doris

# Enable the CPU controller by writing to the parent's subtree_control
# (a no-op if cpu is already enabled, which is often the default)
echo +cpu > ../cgroup.subtree_control

# Verify: cpu.max should now appear in the doris directory
ls /sys/fs/cgroup/doris/cpu.max

# And cgroup.controllers should contain cpu
cat /sys/fs/cgroup/cgroup.controllers
# cpuset cpu io memory hugetlb pids rdma misc
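The v2 steps in 3.2 and 3.3 can be combined into one idempotent function. This is a sketch, not official tooling; the mount point and owner arguments (defaulting to /sys/fs/cgroup and root:root, matching this deployment) are parameterized so it can be adapted per node.

```shell
#!/usr/bin/env bash
# Sketch: idempotent cgroup v2 setup for Doris BE (steps 3.2 and 3.3 combined).
set -euo pipefail

setup_doris_cgroup_v2() {
  local cg_root="${1:-/sys/fs/cgroup}"    # cgroup v2 mount point
  local be_owner="${2:-root:root}"        # user:group running the BE process
  local doris_cg="${cg_root}/doris"       # path referenced by be.conf

  mkdir -p "${doris_cg}"
  chmod 770 "${doris_cg}"
  chown -R "${be_owner}" "${doris_cg}"

  # v2-specific: writable cgroup.procs, cpu controller delegated to children
  chmod a+w "${cg_root}/cgroup.procs"
  grep -qw cpu "${cg_root}/cgroup.subtree_control" 2>/dev/null \
    || echo '+cpu' > "${cg_root}/cgroup.subtree_control"

  # cpu.max appears in the child directory once the controller is enabled
  if [ -f "${doris_cg}/cpu.max" ]; then
    echo "cpu controller enabled"
  else
    echo "cpu.max not visible; check cgroup.subtree_control"
  fi
}
```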

3.4 Configure BE and Restart

Edit the be.conf file:

# ========== CGroup v1 Configuration ==========
doris_cgroup_cpu_path = /sys/fs/cgroup/cpu/doris

# ========== CGroup v2 Configuration ==========
doris_cgroup_cpu_path = /sys/fs/cgroup/doris

Restart BE service:

# Restart BE
./bin/stop_be.sh
./bin/start_be.sh

Verification: Check the be.INFO log

grep "add thread" log/be.INFO
# Appearance of "add thread xxx to group" indicates successful configuration
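The log check can also be scripted for reuse across nodes; a sketch (function name ours) built around the same "add thread ... to group" marker:

```shell
# Sketch: scan be.INFO for the line indicating BE threads were
# attached to the cgroup ("add thread ... to group").
check_cgroup_attached() {
  local log_file="$1"
  if grep -q "add thread .* to group" "${log_file}"; then
    echo "cgroup configured: BE threads attached"
  else
    echo "no attach record found; check doris_cgroup_cpu_path and permissions"
  fi
}
```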

3.5 Persistent Configuration (Optional)

The CGroup configuration does not survive a machine reboot. It is recommended to create a systemd service that re-applies it at boot:

# /etc/systemd/system/doris-cgroup.service
[Unit]
Description=Doris CGroup Setup
Before=doris-be.service

[Service]
Type=oneshot
# Adjust the owner to the user running BE (currently root) and the path to your CGroup version
ExecStart=/bin/bash -c 'mkdir -p /sys/fs/cgroup/cpu/doris && chmod 770 /sys/fs/cgroup/cpu/doris && chown -R root:root /sys/fs/cgroup/cpu/doris'

[Install]
WantedBy=multi-user.target

Set to auto-start on boot:

systemctl enable doris-cgroup.service

IV. Batch Enable Cgroup on Doris Nodes via Ansible

4.1 Preparation

Before executing batch deployment, confirm the following information:

  1. Confirm CGroup version: CGroup version (v1 or v2) for all Doris BE nodes
  2. Prepare host inventory: Edit Ansible’s hosts file to add all BE nodes
  3. Confirm BE user: User running Doris BE service (currently root)

4.2 Configure Host Inventory

Edit the hosts file:

[doris_dabe_nodes]
# If nodes have different CGroup versions, they can be grouped. Here only specifying offline analysis warehouse BE nodes
10.18.7.160

4.3 Create Ansible Playbook

Create file setup_doris_cgroup.yaml:

---
- name: Setup Doris CGroup Configuration
  hosts: doris_dabe_nodes
  become: true
  vars:
    # CGroup configuration path (automatically selected based on version)
    cgroup_v1_path: "/sys/fs/cgroup/cpu/doris"
    cgroup_v2_path: "/sys/fs/cgroup/doris"
    doris_user: "root"
    doris_group: "root"
    
  tasks:
    - name: Detect CGroup version
      stat:
        path: /sys/fs/cgroup/cpu/
      register: cgroup_v1_check
      
    - name: Set CGroup path based on version
      set_fact:
        cgroup_path: "{{ cgroup_v1_path if cgroup_v1_check.stat.exists else cgroup_v2_path }}"
        cgroup_version: "{{ 'v1' if cgroup_v1_check.stat.exists else 'v2' }}"
        
    - name: Display detected CGroup version
      debug:
        msg: "Node {{ inventory_hostname }} uses CGroup {{ cgroup_version }}, path: {{ cgroup_path }}"

    - name: Create CGroup directory
      file:
        path: "{{ cgroup_path }}"
        state: directory
        mode: '0770'
        owner: "{{ doris_user }}"
        group: "{{ doris_group }}"
      register: cgroup_dir_result
      
    - name: Configure CGroup v2 specific settings
      block:
        - name: Enable write permission on cgroup.procs
          file:
            path: /sys/fs/cgroup/cgroup.procs
            mode: 'a+w'
            
        - name: Enable CPU controller for CGroup v2
          shell: |
            # subtree_control lists enabled controllers without the '+' sign
            if ! grep -qw 'cpu' /sys/fs/cgroup/cgroup.subtree_control 2>/dev/null; then
              echo '+cpu' > /sys/fs/cgroup/cgroup.subtree_control
            fi
          args:
            executable: /bin/bash
          ignore_errors: yes
          
        - name: Verify CPU controller is enabled
          stat:
            path: "{{ cgroup_path }}/cpu.max"
          register: cpu_controller_check
          
        - name: Display CPU controller status
          debug:
            msg: "CPU controller is {{ 'enabled' if cpu_controller_check.stat.exists else 'NOT enabled' }}"
      when: cgroup_version == 'v2'

    - name: Verify CGroup directory structure
      shell: |
        echo "=== CGroup Version: {{ cgroup_version }} ==="
        echo "CGroup Path: {{ cgroup_path }}"
        ls -la {{ cgroup_path }} 2>/dev/null || echo "Directory verified"
        echo ""
        if [ "{{ cgroup_version }}" = "v2" ]; then
          echo "=== CPU Controller Check ==="
          cat /sys/fs/cgroup/cgroup.controllers 2>/dev/null
          test -f {{ cgroup_path }}/cpu.max && echo "cpu.max: EXISTS" || echo "cpu.max: NOT FOUND"
        fi        
      args:
        executable: /bin/bash
      register: verify_result
      
    - name: Display verification results
      debug:
        var: verify_result.stdout_lines

    - name: Create systemd service for CGroup persistence (optional)
      copy:
        dest: /etc/systemd/system/doris-cgroup.service
        content: |
          [Unit]
          Description=Doris CGroup Setup
          Before=doris-be.service

          [Service]
          Type=oneshot
          ExecStart=/bin/bash -c 'mkdir -p {{ cgroup_path }} && chmod 770 {{ cgroup_path }} && chown -R {{ doris_user }}:{{ doris_group }} {{ cgroup_path }}'
          {% if cgroup_version == 'v2' %}
          ExecStartPost=/bin/bash -c 'chmod a+w /sys/fs/cgroup/cgroup.procs'
          ExecStartPost=/bin/bash -c 'echo "+cpu" > /sys/fs/cgroup/cgroup.subtree_control 2>/dev/null || true'
          {% endif %}

          [Install]
          WantedBy=multi-user.target          
        mode: '0644'
      when: cgroup_dir_result is changed
      notify: Enable systemd service

  handlers:   
    - name: Enable systemd service
      systemd:
        name: doris-cgroup
        enabled: yes
        daemon_reload: yes

4.4 Execute Batch Deployment

Syntax check and dry run

# 1. Syntax check
ansible-playbook -i hosts setup_doris_cgroup.yaml --syntax-check

# 2. List all tasks to be executed
ansible-playbook -i hosts setup_doris_cgroup.yaml --list-tasks

# 3. Dry run (not actually executing, only showing changes to be made)
ansible-playbook -i hosts setup_doris_cgroup.yaml --check --diff

4.5 Run the Deployment and Verify Results

Run the playbook

# Method 1: Execute on all BE nodes
ansible-playbook -i hosts setup_doris_cgroup.yaml

# Method 2: Execute only on specific nodes (e.g., test on single node first)
ansible-playbook -i hosts setup_doris_cgroup.yaml --limit 10.18.7.101

# Method 3: Execute step by step, each task requires confirmation
ansible-playbook -i hosts setup_doris_cgroup.yaml --step

Adjust Doris BE configuration via Doris Manager (rolling restart):

# ========== CGroup v1 Configuration ==========
doris_cgroup_cpu_path = /sys/fs/cgroup/cpu/doris

# ========== CGroup v2 Configuration ==========
doris_cgroup_cpu_path = /sys/fs/cgroup/doris

(screenshot: adding the Cgroup configuration to Doris BE)

Verification:

# 1. Check if CGroup directory is created
ansible doris_dabe_nodes -i hosts -m shell -a "ls -la /sys/fs/cgroup/doris 2>/dev/null || ls -la /sys/fs/cgroup/cpu/doris 2>/dev/null"

# 2. Check CGroup initialization in BE logs (adjust according to actual BE log path)
ansible doris_dabe_nodes -i hosts -m shell -a "tail -100 /opt/doris/doris/be/log/be.INFO | grep -i 'cgroup\|add thread' | tail -5"

At this point, Doris BE nodes have enabled Cgroup resource limits.

V. Workload Group Management

5.1 Create Workload Group

Basic example (CPU soft limit)

CREATE WORKLOAD GROUP IF NOT EXISTS g1
PROPERTIES (
    "cpu_share" = "1024"
);

Complete configuration example

CREATE WORKLOAD GROUP IF NOT EXISTS etl_group
PROPERTIES (
    -- CPU configuration (soft limit mode)
    "cpu_share" = "2048",
    "cpu_hard_limit" = "40%",
    
    -- Memory configuration (hard limit mode)
    "memory_limit" = "40%",
    "enable_memory_overcommit" = "false",
    
    -- Concurrency control
    "max_concurrency" = "1",
    "max_queue_size" = "100",
    "queue_timeout" = "10000",
    
    -- IO speed limit
    "read_bytes_per_second" = "104857600",        -- 100MB/s
    "remote_read_bytes_per_second" = "52428800"   -- 50MB/s
);
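The two IO values are simply MB/s targets expressed in bytes (1 MB = 1048576 bytes); a quick arithmetic check:

```shell
# The read_bytes_per_second values above are MB/s expressed in bytes.
awk 'BEGIN { mb = 1024 * 1024
  printf "100 MB/s = %d bytes/s\n", 100 * mb
  printf "50 MB/s  = %d bytes/s\n",  50 * mb }'
# → 100 MB/s = 104857600 bytes/s
#   50 MB/s  = 52428800 bytes/s
```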

5.2 Attribute Details

- cpu_share (INT, default -1, range [1, 10000]): CPU soft-limit weight; a higher value means higher priority
- cpu_hard_limit (INT, default -1, range [1%, 100%]): CPU hard-limit percentage (new in version 2.1)
- memory_limit (FLOAT, default -1, range (0%, 100%]): memory limit percentage
- enable_memory_overcommit (BOOL, default true): true = soft limit (may be exceeded), false = hard limit
- max_concurrency (INT, default 2147483647, range [0, 2147483647]): maximum concurrent queries
- max_queue_size (INT, default 0, range [0, 2147483647]): queue length (0 = no queuing)
- queue_timeout (INT, default 0, range [0, 2147483647]): queue timeout in milliseconds
- scan_thread_num (INT, default -1, range [1, 2147483647]): scan thread count (-1 = use BE configuration)
- max_remote_scan_thread_num (INT, default -1, range [1, 2147483647]): maximum scan threads for external tables
- min_remote_scan_thread_num (INT, default -1, range [1, 2147483647]): minimum scan threads for external tables
- read_bytes_per_second (INT, default -1, range [1, 9223372036854775807]): internal-table read IO limit (bytes/second)
- remote_read_bytes_per_second (INT, default -1, range [1, 9223372036854775807]): external-table read IO limit (bytes/second)

Notes:

  1. Must specify at least one attribute when creating
  2. CGroup v1 cpu_share default is 1024, range 2-262144
  3. CGroup v2 cpu_share default is 100, range 1-10000
  4. Sum of all Workload Group memory_limit cannot exceed 100%
  5. Sum of all Workload Group cpu_hard_limit cannot exceed 100%

5.3 Modify Workload Group

ALTER WORKLOAD GROUP g1 PROPERTIES('cpu_share' = '4096');

-- Change memory limit to hard limit
ALTER WORKLOAD GROUP g1 PROPERTIES(
    'memory_limit' = '30%',
    'enable_memory_overcommit' = 'false'
);

-- Modify concurrency and queuing parameters
ALTER WORKLOAD GROUP g1 PROPERTIES(
    'max_concurrency' = '100',
    'max_queue_size' = '200',
    'queue_timeout' = '30000'
);

-- Add CPU hard limit (hard limit mode must be enabled first)
ALTER WORKLOAD GROUP g1 PROPERTIES('cpu_hard_limit' = '20%');

5.4 Delete Workload Group

-- Delete specified Workload Group
DROP WORKLOAD GROUP g1;

Note: The default normal group cannot be deleted.

VI. User Binding and Authorization

6.1 View Available Workload Groups

-- View Workload Groups that the current user has permission to use
SELECT name FROM information_schema.workload_groups;

6.2 Authorization

-- Grant user permission to use specified Workload Group
GRANT USAGE_PRIV ON WORKLOAD GROUP 'g1' TO 'user_1'@'%';

-- Grant permission to all Workload Groups
GRANT USAGE_PRIV ON WORKLOAD GROUP '*' TO 'user_1'@'%';

-- Revoke permission
REVOKE USAGE_PRIV ON WORKLOAD GROUP 'g1' FROM 'user_1'@'%';

Set default Workload Group for user (persistent):

-- Set user's default Workload Group
SET PROPERTY FOR 'user_1' 'default_workload_group' = 'g1';

-- View user properties
SHOW PROPERTY FOR 'user_1';

VII. Monitoring and Viewing

7.1 View Workload Group List

-- Show statement
SHOW WORKLOAD GROUPS;

-- System table query
SELECT * FROM information_schema.workload_groups;

7.2 View Resource Usage

-- View memory usage of each Workload Group (unit: MB)
SELECT 
    workload_group_id,
    name,
    MEMORY_USAGE_BYTES / 1024 / 1024 AS mem_used_mb,
    CPU_USAGE_PERCENT AS cpu_percent
FROM information_schema.workload_group_resource_usage;

-- View details of specific Workload Group
SELECT * FROM information_schema.workload_groups WHERE name = 'g1';

7.3 System Table Field Description

workload_groups table

- ID: Workload Group ID
- NAME: group name
- CPU_SHARE: CPU soft-limit weight
- MEMORY_LIMIT: memory limit percentage
- ENABLE_MEMORY_OVERCOMMIT: whether memory overcommit is allowed
- MAX_CONCURRENCY: maximum concurrency
- MAX_QUEUE_SIZE: queue length
- QUEUE_TIMEOUT: queue timeout
- CPU_HARD_LIMIT: CPU hard-limit percentage
- SCAN_THREAD_NUM: scan thread count
- READ_BYTES_PER_SECOND: internal-table IO limit
- REMOTE_READ_BYTES_PER_SECOND: external-table IO limit

VIII. CPU Soft/Hard Limit Mode Switching

8.1 Mode Description

- CPU soft limit: CPU time is allocated by weight, and a group may use all idle CPU. Suited to workloads with large fluctuations where you want full resource utilization.
- CPU hard limit: an absolute upper bound that cannot be exceeded even when the CPU is idle. Suited to strict resource isolation and SLA guarantees.

Note: A cluster can run in only one mode at a time; the two cannot be mixed.
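For intuition on the hard limit, a percentage cap ultimately maps onto the cgroup quota mechanism. Under cgroup v2, cpu.max holds "<quota> <period>" in microseconds; the sketch below shows the arithmetic for a 40% cap on a 16-core node (the exact mapping Doris writes is an assumption here):

```shell
# Sketch: a 40% cpu_hard_limit on a 16-core node, expressed as a cgroup v2
# cpu.max quota with the default 100000-microsecond period.
awk 'BEGIN { cores = 16; pct = 40; period = 100000
  quota = cores * pct / 100 * period
  printf "cpu.max: %d %d\n", quota, period }'
# → cpu.max: 640000 100000
```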

8.2 Switch from Soft Limit to Hard Limit

Step 1: Set hard limit values for all Workload Groups

-- Must set cpu_hard_limit for all Groups
ALTER WORKLOAD GROUP g1 PROPERTIES('cpu_hard_limit' = '30%');
ALTER WORKLOAD GROUP g2 PROPERTIES('cpu_hard_limit' = '20%');
ALTER WORKLOAD GROUP normal PROPERTIES('cpu_hard_limit' = '50%');

-- Verify: Sum of all Group's cpu_hard_limit <= 100%
SELECT SUM(cpu_hard_limit) FROM information_schema.workload_groups 
WHERE cpu_hard_limit > 0;

Step 2: Enable cluster hard limit switch

-- Takes effect immediately in memory (not persisted across FE restarts)
ADMIN SET FRONTEND CONFIG ("enable_cpu_hard_limit" = "true");

-- Persistent configuration: Modify fe.conf for all FEs
echo "experimental_enable_cpu_hard_limit = true" >> fe/conf/fe.conf

8.3 Switch from Hard Limit Back to Soft Limit

-- Disable hard limit switch (automatically switches back to soft limit mode)
ADMIN SET FRONTEND CONFIG ("enable_cpu_hard_limit" = "false");

For persistence, modify fe.conf to false or delete the configuration.

The above is the overall solution for adding resource limits to the Doris offline analysis data warehouse.

Reference document: https://doris.apache.org/zh-CN/docs/2.1/admin-manual/workload-management/workload-group