Complete ClickHouse Installation Guide

Windows, Linux, Docker

Ready to get your hands dirty with ClickHouse? This comprehensive guide will walk you through installing ClickHouse on any platform, configuring it for optimal performance, and building a real-world application using Go. Whether you’re setting up a development environment or preparing for production, we’ve got you covered.

Table of Contents

  1. Installation Overview
  2. Docker Installation (Recommended for Development)
  3. Linux Installation (Ubuntu/Debian)
  4. Windows Installation
  5. ClickHouse Configuration Deep Dive
  6. Security and Performance Tuning

Installation Overview

ClickHouse can be installed in several ways, each with its own advantages:

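Method          | Best For             | Pros                                       | Cons
----------------|----------------------|--------------------------------------------|---------------------------------------
Docker          | Development, Testing | Quick setup, Easy cleanup, Version control | Resource overhead, Network complexity
Package Manager | Production Linux     | Native performance, System integration     | OS-specific, Complex upgrades
Binary          | Custom setups        | Maximum control                            | Manual management
Cloud           | Production           | Managed service, Scaling                   | Cost, Vendor lock-in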

System Requirements

Minimum Requirements:

  • CPU: 2 cores (4+ recommended)
  • RAM: 4GB (8GB+ recommended)
  • Storage: 10GB free space (SSD strongly recommended)
  • OS: Linux (preferred), Windows 10+ (through WSL 2 or Docker), macOS 10.15+

Recommended for Production:

  • CPU: 8+ cores with high clock speed
  • RAM: 32GB+ (ClickHouse loves memory)
  • Storage: NVMe SSD with high IOPS
  • Network: 1Gbps+ for distributed setups
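
The thresholds above are easy to check before installing. The following is a minimal sketch, assuming a Linux host with /proc/meminfo; the function names classify_host and detect_ram_gb are made up for this example, and storage is not checked at all.

```shell
# classify_host CORES RAM_GB
# Prints "production", "minimum", or "insufficient" based on the CPU and RAM
# thresholds listed in this guide (2 cores / 4 GB minimum, 8 cores / 32 GB production).
classify_host() {
    ch_cores=$1
    ch_ram_gb=$2
    if [ "$ch_cores" -ge 8 ] && [ "$ch_ram_gb" -ge 32 ]; then
        echo "production"
    elif [ "$ch_cores" -ge 2 ] && [ "$ch_ram_gb" -ge 4 ]; then
        echo "minimum"
    else
        echo "insufficient"
    fi
}

# detect_ram_gb
# Prints total RAM in GiB, rounded to the nearest whole number (Linux only).
detect_ram_gb() {
    awk '/^MemTotal:/ { printf "%d\n", ($2 / 1048576) + 0.5 }' /proc/meminfo
}
```

On the target machine it could be called as: classify_host "$(nproc)" "$(detect_ram_gb)"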

Docker Installation

Docker is the fastest way to get ClickHouse running, perfect for development and testing.

Basic Docker Setup

# Pull the latest ClickHouse image
docker pull clickhouse/clickhouse-server:latest

# Run ClickHouse with basic configuration
docker run -d \
  --name clickhouse-server \
  -p 8123:8123 \
  -p 9000:9000 \
  -p 9009:9009 \
  --ulimit nofile=262144:262144 \
  clickhouse/clickhouse-server:latest

Production-Ready Docker Setup

Create a proper directory structure and configuration:

# Create directory structure
mkdir -p clickhouse-data/{config,data,logs}
cd clickhouse-data

docker-compose.yml:

version: '3.8'

services:
  clickhouse-server:
    image: clickhouse/clickhouse-server:24.8-alpine
    container_name: clickhouse-server
    ports:
      - "8123:8123"    # HTTP interface
      - "9000:9000"    # Native TCP interface
      - "9009:9009"    # Interserver communication
    volumes:
      - ./data:/var/lib/clickhouse/
      - ./logs:/var/log/clickhouse-server/
      - ./config/config.xml:/etc/clickhouse-server/config.xml
      - ./config/users.xml:/etc/clickhouse-server/users.xml
    environment:
      CLICKHOUSE_DB: analytics
      CLICKHOUSE_USER: analytics_user
      CLICKHOUSE_PASSWORD: secure_password_123
      CLICKHOUSE_DEFAULT_ACCESS_MANAGEMENT: 1
    ulimits:
      nofile:
        soft: 262144
        hard: 262144
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:8123/ping"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s

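  clickhouse-client:
    # The standalone clickhouse/clickhouse-client image is deprecated;
    # the server image ships the same clickhouse-client binary
    image: clickhouse/clickhouse-server:24.8-alpine
    container_name: clickhouse-client
    depends_on:
      - clickhouse-server
    entrypoint: ["clickhouse-client"]
    command: ['--host', 'clickhouse-server']
    profiles: ["client"]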

volumes:
  clickhouse-data:
  clickhouse-logs:

Start the services:

# Start ClickHouse server
docker-compose up -d clickhouse-server

# Connect with client (optional)
docker-compose run --rm clickhouse-client

# Or connect via HTTP
curl http://localhost:8123/ping
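
The container can take a few seconds to start accepting connections, so scripts that talk to it right after startup tend to fail intermittently. One way to handle this is a small retry helper; the sketch below is a generic loop (the name wait_until is an assumption, not part of ClickHouse or Docker) that re-runs any command until it succeeds.

```shell
# wait_until TRIES DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0 or TRIES attempts have been made.
# Returns 0 on success, 1 if every attempt failed.
wait_until() {
    wu_tries=$1
    wu_delay=$2
    shift 2
    wu_n=0
    while [ "$wu_n" -lt "$wu_tries" ]; do
        # Pause between attempts, but not before the first one
        if [ "$wu_n" -gt 0 ]; then
            sleep "$wu_delay"
        fi
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        wu_n=$((wu_n + 1))
    done
    return 1
}
```

With the ports published as above, a readiness check could look like: wait_until 30 2 curl -fsS http://localhost:8123/ping && echo "ClickHouse is up"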

Docker Configuration Files

config/config.xml (Basic Production Config):

<?xml version="1.0"?>
<clickhouse>
    <!-- Paths -->
    <path>/var/lib/clickhouse/</path>
    <tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
    <user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
    <format_schema_path>/var/lib/clickhouse/format_schemas/</format_schema_path>

    <!-- Network -->
    <listen_host>0.0.0.0</listen_host>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>

    <!-- Logging -->
    <logger>
        <level>information</level>
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>

    <!-- Performance Settings -->
    <max_connections>4096</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>100</max_concurrent_queries>
    <uncompressed_cache_size>8589934592</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>

    <!-- Memory Management -->
    <max_server_memory_usage_to_ram_ratio>0.8</max_server_memory_usage_to_ram_ratio>
    <max_memory_usage>10000000000</max_memory_usage>

    <!-- Storage Configuration -->
    <storage_configuration>
        <disks>
            <default>
                <path>/var/lib/clickhouse/</path>
            </default>
        </disks>
        <policies>
            <default>
                <volumes>
                    <default>
                        <disk>default</disk>
                    </default>
                </volumes>
            </default>
        </policies>
    </storage_configuration>

    <!-- Default Profile -->
    <profiles>
        <default>
            <max_memory_usage>10000000000</max_memory_usage>
            <use_uncompressed_cache>1</use_uncompressed_cache>
            <load_balancing>random</load_balancing>
        </default>
    </profiles>

    <!-- Default Database -->
    <default_database>default</default_database>
</clickhouse>

config/users.xml:

<?xml version="1.0"?>
<clickhouse>
    <users>
        <!-- Default user (restricted) -->
        <default>
            <password></password>
            <networks>
                <ip>::1</ip>
                <ip>127.0.0.1</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
        </default>

        <!-- Analytics user (for applications) -->
        <analytics_user>
            <password>secure_password_123</password>
            <networks>
                <ip>::/0</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
            <grants>
                <query>GRANT SELECT, INSERT ON analytics.*</query>
            </grants>
        </analytics_user>

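        <!-- Admin user (full access) -->
        <admin>
            <!-- Replace this value before use: it is the SHA-256 of an empty string, i.e. an empty password -->
            <password_sha256_hex>e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855</password_sha256_hex>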
            <networks>
                <ip>127.0.0.1</ip>
                <ip>::1</ip>
            </networks>
            <profile>default</profile>
            <quota>default</quota>
            <access_management>1</access_management>
        </admin>
    </users>

    <quotas>
        <default>
            <interval>
                <duration>3600</duration>
                <queries>0</queries>
                <errors>0</errors>
                <result_rows>0</result_rows>
                <read_rows>0</read_rows>
                <execution_time>0</execution_time>
            </interval>
        </default>
    </quotas>
</clickhouse>
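
The password_sha256_hex field expects the hex SHA-256 digest of the password, not the password itself. A minimal sketch for producing one, assuming sha256sum from GNU coreutils is available (the helper name hash_password is made up for this example):

```shell
# hash_password PASSWORD
# Prints the lowercase hex SHA-256 digest of PASSWORD.
# printf '%s' is used so that no trailing newline is hashed.
hash_password() {
    printf '%s' "$1" | sha256sum | awk '{ print $1 }'
}
```

Running hash_password '' prints e3b0c442...7852b855, the same digest used for the admin user in the sample file, which is a quick way to confirm that value corresponds to an empty password and needs replacing.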

Linux Installation

Ubuntu/Debian Installation

Method 1: Official Repository (Recommended)

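# Add ClickHouse repository (apt-key is deprecated, so the key goes into a dedicated keyring)
sudo apt-get install -y apt-transport-https ca-certificates curl gnupg
curl -fsSL 'https://packages.clickhouse.com/rpm/lts/repodata/repomd.xml.key' | sudo gpg --dearmor -o /usr/share/keyrings/clickhouse-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/clickhouse-keyring.gpg] https://packages.clickhouse.com/deb stable main" | sudo tee /etc/apt/sources.list.d/clickhouse.list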

# Update and install
sudo apt-get update
sudo apt-get install -y clickhouse-server clickhouse-client

# Start ClickHouse
sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server

# Check status
sudo systemctl status clickhouse-server

Method 2: Download and Install Packages

# Download packages
curl -O 'https://packages.clickhouse.com/deb/pool/stable/c/clickhouse-server/clickhouse-server_24.8.1.2797_all.deb'
curl -O 'https://packages.clickhouse.com/deb/pool/stable/c/clickhouse-client/clickhouse-client_24.8.1.2797_all.deb'

# Install packages
sudo dpkg -i clickhouse-server_24.8.1.2797_all.deb
sudo dpkg -i clickhouse-client_24.8.1.2797_all.deb

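# Fix dependencies if needed (both packages depend on clickhouse-common-static,
# which apt can only fetch if a ClickHouse repository is configured)
sudo apt-get install -f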

CentOS/RHEL Installation

# Add repository
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://packages.clickhouse.com/rpm/clickhouse.repo

# Install ClickHouse
sudo yum install -y clickhouse-server clickhouse-client

# Start service
sudo systemctl enable clickhouse-server
sudo systemctl start clickhouse-server

Post-Installation Setup (Linux)

# Create data directories
sudo mkdir -p /var/lib/clickhouse
sudo mkdir -p /var/log/clickhouse-server
sudo chown clickhouse:clickhouse /var/lib/clickhouse
sudo chown clickhouse:clickhouse /var/log/clickhouse-server

# Set up configuration
sudo cp /etc/clickhouse-server/config.xml /etc/clickhouse-server/config.xml.backup
sudo cp /etc/clickhouse-server/users.xml /etc/clickhouse-server/users.xml.backup

# Test connection
clickhouse-client --query "SELECT version()"

Windows Installation

Windows Subsystem for Linux (WSL) – Recommended

# Install WSL2 if not already installed
wsl --install

# Inside WSL, follow the Linux installation steps
curl -O 'https://packages.clickhouse.com/deb/pool/stable/c/clickhouse-server/clickhouse-server_24.8.1.2797_all.deb'
# ... (continue with Linux steps)

Native Windows Installation

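ClickHouse does not publish official native Windows builds; WSL 2 and Docker are the supported routes on Windows. Treat the download URL and service commands in this section as unverified, and confirm the binary exists before relying on them.

Download and Install: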

# Download using PowerShell
Invoke-WebRequest -Uri "https://packages.clickhouse.com/windows/clickhouse-windows-amd64.exe" -OutFile "clickhouse.exe"

# Or download manually from: https://packages.clickhouse.com/windows/

Windows Service Setup:

REM Create service directory
mkdir C:\ClickHouse\data
mkdir C:\ClickHouse\logs
mkdir C:\ClickHouse\config

REM Install as Windows service
sc create ClickHouse binpath= "C:\ClickHouse\clickhouse.exe server --config-file=C:\ClickHouse\config\config.xml"
sc start ClickHouse

REM Or run directly
clickhouse.exe server --config-file=config.xml

Windows Configuration (config.xml):

<?xml version="1.0"?>
<clickhouse>
    <path>C:\ClickHouse\data\</path>
    <tmp_path>C:\ClickHouse\data\tmp\</tmp_path>
    <user_files_path>C:\ClickHouse\data\user_files\</user_files_path>
    
    <listen_host>127.0.0.1</listen_host>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    
    <logger>
        <level>information</level>
        <log>C:\ClickHouse\logs\clickhouse-server.log</log>
        <errorlog>C:\ClickHouse\logs\clickhouse-server.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>
    
    <!-- Include other settings from Linux config -->
</clickhouse>

Configuration Deep Dive

Understanding ClickHouse Configuration

ClickHouse uses XML configuration files with a hierarchical structure:

/etc/clickhouse-server/
├── config.xml          # Main server configuration
├── users.xml           # User accounts and permissions
├── config.d/           # Additional config files
│   ├── storage.xml
│   ├── logging.xml
│   └── custom.xml
└── users.d/            # Additional user files
    └── custom_users.xml
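
Files in config.d/ are merged on top of config.xml, so a single setting can be overridden without editing the main file. The sketch below writes one such drop-in; the helper name write_override, the file name concurrency.xml, and the value 200 are all assumptions chosen for illustration.

```shell
# write_override CONF_ROOT
# Creates CONF_ROOT/config.d/concurrency.xml containing one overridden setting.
# Everything else keeps the value from CONF_ROOT/config.xml.
write_override() {
    wo_root=$1
    mkdir -p "$wo_root/config.d"
    cat > "$wo_root/config.d/concurrency.xml" <<'EOF'
<clickhouse>
    <max_concurrent_queries>200</max_concurrent_queries>
</clickhouse>
EOF
}
```

On a real server this would be run with write access to /etc/clickhouse-server, followed by a restart of clickhouse-server. Keeping overrides in small drop-in files also makes package upgrades less painful, since the stock config.xml stays unmodified.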

Essential Configuration Options

Memory Management:

<clickhouse>
    <!-- Total server memory usage (80% of RAM) -->
    <max_server_memory_usage_to_ram_ratio>0.8</max_server_memory_usage_to_ram_ratio>
    
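    <!-- Per-query memory limit (10 GB). max_memory_usage is a query-level setting,
         so it only takes effect inside a settings profile (users.xml or users.d/) -->
    <max_memory_usage>10000000000</max_memory_usage>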
    
    <!-- Cache sizes -->
    <uncompressed_cache_size>8589934592</uncompressed_cache_size>  <!-- 8GB -->
    <mark_cache_size>5368709120</mark_cache_size>                  <!-- 5GB -->
    <compiled_expression_cache_size>1073741824</compiled_expression_cache_size>  <!-- 1GB -->
</clickhouse>
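
The byte values in this block are hard to read at a glance. The sketch below shows where they come from and how the 0.8 ratio translates into an absolute cap; both helper names are made up for this example, and the arithmetic is integer-only.

```shell
# gib_to_bytes N
# Prints N GiB expressed in bytes (1 GiB = 1073741824 bytes).
gib_to_bytes() {
    echo $(( $1 * 1073741824 ))
}

# server_memory_cap RAM_BYTES
# Prints 80% of RAM_BYTES, which is what a ratio of 0.8 works out to.
server_memory_cap() {
    echo $(( $1 * 8 / 10 ))
}
```

For instance, gib_to_bytes 8 gives the uncompressed cache value and gib_to_bytes 5 the mark cache value used above. Those two caches add up to 13 GiB, so on a host where server_memory_cap returns less than that they should be scaled down.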

Performance Tuning:

<clickhouse>
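    <!-- Connection limits -->
    <max_connections>4096</max_connections>
    <max_concurrent_queries>100</max_concurrent_queries>
    <!-- Query-level setting: effective only inside a profile (users.xml or users.d/) -->
    <max_concurrent_queries_for_user>20</max_concurrent_queries_for_user>
    
    <!-- Thread settings -->
    <max_thread_pool_size>10000</max_thread_pool_size>
    <!-- Query-level setting: effective only inside a profile (users.xml or users.d/) -->
    <max_insert_threads>16</max_insert_threads>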
    
    <!-- Compression -->
    <compression>
        <case>
            <method>zstd</method>
            <level>3</level>
        </case>
    </compression>
</clickhouse>

Storage Configuration:

<clickhouse>
    <storage_configuration>
        <disks>
            <!-- Fast SSD for hot data -->
            <hot>
                <path>/var/lib/clickhouse/hot/</path>
            </hot>
            
            <!-- Slower storage for cold data -->
            <cold>
                <path>/var/lib/clickhouse/cold/</path>
            </cold>
            
            <!-- S3 storage for archival -->
            <s3>
                <type>s3</type>
                <endpoint>https://s3.amazonaws.com/my-bucket/clickhouse/</endpoint>
                <access_key_id>ACCESS_KEY</access_key_id>
                <secret_access_key>SECRET_KEY</secret_access_key>
            </s3>
        </disks>
        
        <policies>
            <tiered>
                <volumes>
                    <hot>
                        <disk>hot</disk>
                        <max_data_part_size_bytes>1073741824</max_data_part_size_bytes>  <!-- 1GB -->
                    </hot>
                    <cold>
                        <disk>cold</disk>
                        <max_data_part_size_bytes>10737418240</max_data_part_size_bytes> <!-- 10GB -->
                    </cold>
                    <archive>
                        <disk>s3</disk>
                    </archive>
                </volumes>
                <move_factor>0.1</move_factor>
            </tiered>
        </policies>
    </storage_configuration>
</clickhouse>
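
To get a feel for how the tiered policy routes data, here is a deliberately simplified model of two of its knobs: max_data_part_size_bytes (a part larger than the limit goes to the next volume) and move_factor (moves start once free space drops below that fraction). It ignores TTL rules and the actual free space on each disk, and the function names are assumptions for this sketch only.

```shell
HOT_MAX=1073741824      # 1 GiB, max_data_part_size_bytes of the hot volume
COLD_MAX=10737418240    # 10 GiB, max_data_part_size_bytes of the cold volume

# pick_volume PART_BYTES
# Prints the first volume whose size limit admits a part of PART_BYTES.
pick_volume() {
    if [ "$1" -le "$HOT_MAX" ]; then
        echo "hot"
    elif [ "$1" -le "$COLD_MAX" ]; then
        echo "cold"
    else
        echo "archive"
    fi
}

# needs_move FREE_BYTES TOTAL_BYTES
# Succeeds when free space is below 10% of the total (move_factor 0.1).
needs_move() {
    [ $(( $1 * 10 )) -lt "$2" ]
}
```

A table only uses this policy if it asks for it, typically with SETTINGS storage_policy = 'tiered' in its CREATE TABLE statement.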

Production Configuration Template

config.d/production.xml:

<?xml version="1.0"?>
<clickhouse>
    <!-- Network Security -->
    <listen_host>0.0.0.0</listen_host>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <https_port>8443</https_port>
    <tcp_port_secure>9440</tcp_port_secure>
    
    <!-- SSL Configuration -->
    <openSSL>
        <server>
            <certificateFile>/etc/clickhouse-server/server.crt</certificateFile>
            <privateKeyFile>/etc/clickhouse-server/server.key</privateKeyFile>
            <dhParamsFile>/etc/clickhouse-server/dhparam.pem</dhParamsFile>
            <verificationMode>relaxed</verificationMode>
            <loadDefaultCAFile>true</loadDefaultCAFile>
            <cacheSessions>true</cacheSessions>
            <disableProtocols>sslv2,sslv3</disableProtocols>
            <preferServerCiphers>true</preferServerCiphers>
        </server>
    </openSSL>
    
    <!-- Monitoring -->
    <prometheus>
        <endpoint>/metrics</endpoint>
        <port>9363</port>
        <metrics>true</metrics>
        <events>true</events>
        <asynchronous_metrics>true</asynchronous_metrics>
    </prometheus>
    
    <!-- Query Logging -->
    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>
    
    <!-- Part Log -->
    <part_log>
        <database>system</database>
        <table>part_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </part_log>
</clickhouse>
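
The openSSL section expects a certificate and key to exist at the configured paths. For a test environment, a self-signed pair is usually enough; the sketch below assumes the openssl command-line tool is installed, and the helper name make_self_signed is made up for this example. Production setups would normally use certificates from a real CA instead.

```shell
# make_self_signed DIR COMMON_NAME
# Writes DIR/server.key and DIR/server.crt: an unencrypted RSA 2048 key and a
# self-signed certificate valid for 365 days with CN=COMMON_NAME.
make_self_signed() {
    ms_dir=$1
    ms_cn=$2
    mkdir -p "$ms_dir"
    openssl req -x509 -new -newkey rsa:2048 -nodes -days 365 \
        -subj "/CN=$ms_cn" \
        -keyout "$ms_dir/server.key" \
        -out "$ms_dir/server.crt"
}
```

The dhparam.pem file referenced in the template can be generated separately with openssl dhparam -out dhparam.pem 4096, which may take several minutes. The key file should be readable by the clickhouse user only.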

Security and Performance Tuning

Security Best Practices

1. User Management:

<!-- users.d/production_users.xml -->
<clickhouse>
    <users>
        <!-- Disable default user in production -->
        <default remove="1"/>
        
        <!-- Application user with limited permissions -->
        <app_user>
            <password_sha256_hex>sha256_hash_here</password_sha256_hex>
            <networks>
                <ip>10.0.0.0/8</ip>
                <ip>172.16.0.0/12</ip>
                <ip>192.168.0.0/16</ip>
            </networks>
            <!-- Uses the default profile: the readonly profile would reject INSERTs -->
            <profile>default</profile>
            <quota>default</quota>
            <grants>
                <query>GRANT SELECT, INSERT ON analytics.*</query>
            </grants>
        </app_user>
        
        <!-- Read-only user for reporting -->
        <readonly_user>
            <password_sha256_hex>another_hash_here</password_sha256_hex>
            <networks>
                <ip>10.0.0.0/8</ip>
            </networks>
            <profile>readonly</profile>
            <quota>readonly_quota</quota>
        </readonly_user>
    </users>
    
    <profiles>
        <readonly>
            <readonly>1</readonly>
            <max_memory_usage>4000000000</max_memory_usage>
            <max_execution_time>300</max_execution_time>
        </readonly>
    </profiles>
    
    <quotas>
        <readonly_quota>
            <interval>
                <duration>3600</duration>
                <queries>1000</queries>
                <errors>100</errors>
                <result_rows>1000000000</result_rows>
                <read_rows>1000000000</read_rows>
                <execution_time>3600</execution_time>
            </interval>
        </readonly_quota>
    </quotas>
</clickhouse>

2. Network Security:

# Firewall rules (Ubuntu/Debian)
sudo ufw allow from 10.0.0.0/8 to any port 8123 comment 'ClickHouse HTTP'
sudo ufw allow from 10.0.0.0/8 to any port 9000 comment 'ClickHouse Native'
sudo ufw deny 8123
sudo ufw deny 9000

# Or use iptables
sudo iptables -A INPUT -s 10.0.0.0/8 -p tcp --dport 8123 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 8123 -j DROP
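
Once the secure ports (8443, 9440) are enabled as well, the list of rules grows and ordering starts to matter: the allow rules have to be added before the deny rules. A small generator keeps that consistent. This is only a sketch, the name ufw_rules is made up, and it prints the commands rather than running them so they can be reviewed first.

```shell
# ufw_rules CIDR PORT [PORT...]
# Prints one "allow" rule per port for CIDR, followed by one "deny" rule per port.
# Nothing is executed; the output is meant to be reviewed and then run with sudo.
ufw_rules() {
    ur_cidr=$1
    shift
    for ur_port in "$@"; do
        echo "ufw allow from $ur_cidr to any port $ur_port proto tcp"
    done
    for ur_port in "$@"; do
        echo "ufw deny $ur_port/tcp"
    done
}
```

For the ports used in this guide: ufw_rules 10.0.0.0/8 8123 9000 8443 9440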

Performance Monitoring

System Monitoring Script:

#!/bin/bash
# clickhouse_monitor.sh

echo "=== ClickHouse Performance Monitor ==="
echo "Time: $(date)"

# CPU and Memory
echo -e "\n--- System Resources ---"
top -bn1 | grep "Cpu(s)" | awk '{print "CPU Usage: " $2}'
free -m | awk '/^Mem:/ {printf "Memory: %d/%d MiB (%.1f%%)\n", $3, $2, $3/$2*100}'

# ClickHouse specific metrics
echo -e "\n--- ClickHouse Metrics ---"
clickhouse-client --query "
SELECT metric, value FROM
(
    SELECT 
        'Active Queries' as metric,
        toString(count()) as value
    FROM system.processes
    UNION ALL
    SELECT 
        'Uptime (hours)' as metric,
        toString(round(uptime() / 3600, 2)) as value
    UNION ALL
    SELECT 
        'Query Memory Usage (GB)' as metric,
        toString(round(sum(memory_usage) / 1024 / 1024 / 1024, 2)) as value
    FROM system.processes
)
ORDER BY metric
"

# Recent slow queries
echo -e "\n--- Slow Queries (>5s in last hour) ---"
clickhouse-client --query "
SELECT 
    query_duration_ms / 1000 as duration_sec,
    read_rows,
    formatReadableSize(read_bytes) as read_bytes,
    substring(query, 1, 100) as query_preview
FROM system.query_log 
WHERE event_time >= now() - INTERVAL 1 HOUR 
  AND query_duration_ms > 5000
  AND type = 'QueryFinish'
ORDER BY query_duration_ms DESC 
LIMIT 5
"

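A percentage derived from free output is only meaningful when both values are in the same unit, so raw values (free -m or free -b) are safer to divide than human-readable ones. The sketch below isolates that calculation in a helper (the name mem_percent is an assumption) so it can be reused for alert thresholds.

```shell
# mem_percent USED TOTAL
# Prints USED as a percentage of TOTAL with one decimal place.
# Both arguments must be plain numbers in the same unit (for example MiB).
mem_percent() {
    awk -v used="$1" -v total="$2" 'BEGIN { printf "%.1f\n", used / total * 100 }'
}
```

On a live system it could be fed directly from free: mem_percent $(free -m | awk '/^Mem:/ { print $3, $2 }')
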