ClickHouse 是一种开源的列式数据库管理系统,专注于处理大规模数据分析工作负载。它在处理海量数据时具有高性能、可扩展、灵活的数据模型、支持 SQL 等特性;适用于需要及时分析最新数据的应用场景,如日志分析、事件跟踪等
在应用过程中,使用 ClickHouse 作为数据源,保存服务上报的秒级指标信息,用于 Grafana 查询秒级监控指标
这里使用的 Grafana 版本 v10.1.1, ClickHouse 版本为 23.9.3.12
启动服务并添加数据
启动 ClickHouse 和 Grafana
使用 Docker Compose 启动
version: "3.7"
services:
grafana:
image: grafana/grafana
ports:
- "3000:3000"
volumes:
- ./data/grafana:/var/lib/grafana
- ./data/plugins/:/var/lib/grafana/plugins
- ./data/provisioning:/etc/grafana/provisioning
networks:
- grafana
clickhouse:
image: "clickhouse/clickhouse-server"
container_name: "grafana-clickhouse-server"
ports:
- "8123:8123"
- "9000:9000"
volumes:
- ./data/data:/var/lib/clickhouse
- ./data/temp:/temp
- ./data/log:/var/log/clickhouse-server
environment:
CLICKHOUSE_SERVER_USER: default
CLICKHOUSE_SERVER_PASSWORD: 123456
networks:
- grafana
networks:
grafana:
添加数据
- 登陆数据库
进入到 docker 容器中,然后使用 clickhouse-client
访问数据库
docker exec -it clickhouse bash
clickhouse-client
ClickHouse client version 23.10.5.20 (official build).
Connecting to localhost:9000 as user default.
Connected to ClickHouse server version 23.10.5 revision 54466.
Warnings:
* Linux transparent hugepages are set to "always". Check /sys/kernel/mm/transparent_hugepage/enabled
clickhouse :)
- 创建测试数据库
CREATE DATABASE IF NOT EXISTS metrics
- 创建测试表
CREATE TABLE IF NOT EXISTS metrics.request_history
(
`id` UUID DEFAULT generateUUIDv4(),
`application` LowCardinality(String),
`env` LowCardinality(String),
`hostname` LowCardinality(String),
`uri` String,
`add_time` DateTime64(3) CODEC(DoubleDelta, LZ4)
)
ENGINE = MergeTree
PARTITION BY toDate(add_time)
ORDER BY (application, env, hostname, uri)
TTL toDate(add_time) + 30;
- 插入随机数据
INSERT INTO metrics.request_history(application, env, hostname, uri) VALUES('app1', 'prod', 'hostname1', '/api/user');
//...
配置 Grafana
安装 ClickHouse 插件
在 Grafana 面板 Connections -> Add new connection 中添加 ClickHouse 插件
ClickHouse 有两个插件,分别是 Grafana 官方提供的 ClickHouse 和社区提供的 Altinity plugin for ClickHouse;
官方的 ClickHouse 仅支持 Grafana >=9.0.0 的版本使用,Altinity plugin for ClickHouse 支持 Grafana 4.6 之后的版本
这两个插件在用法和交互上有所差异,这里使用官方的 ClickHouse 进行查询
配置数据源
在 Grafana 面板 Connections -> Data sources 中添加 ClickHouse 数据源
配置图表
选择新增图表,并选择 ClickHouse 作为数据源
配置 table 图表
创建 Table 图表直接使用 SQL 查询需要的列即可
SELECT env,
hostname,
uri,
value
FROM metrics.request_history
ORDER BY add_time
LIMIT 10
配置时间序列的图表
时间序列的图表需要至少有三列,分别是时间、名称、值;此时列需要聚合
SELECT toDateTime64(add_time, 0) as time ,
uri as label,
sum(value) as value
FROM metrics.request_history
GROUP BY label, time
ORDER BY time
配置查询条件
ClickHouse 插件支持多个宏(Macro),在向 ClickHouse 发送查询前会将 Macro 替换为对应的值
时间范围查询条件
使用 $__timeFilter()
作为时间范围的查询条件,使用的字段是指定的 DateTime 或 Timestamp
对应的列,如 where $__timeFilter(add_time)
会被替换为 where add_time >= xxx and add_time <= xxx
SELECT toDateTime64(add_time, 0) as time ,
uri as label,
sum(value) as value
FROM metrics.request_history
WHERE $__timeFilter(add_time)
GROUP BY label, time
ORDER BY time
最终查询的 SQL 为:
SELECT toDateTime64(add_time, 0) as time ,
uri as label,
sum(value) as value
FROM metrics.request_history
WHERE add_time >= '1699940462' AND add_time <= '1699962062'
GROUP BY label, time
ORDER BY time
使用变量作为查询条件
- 添加变量
变量的定义和 Prometheus 数据源类似,不过是使用 SQL 查询,如配置 env,使用选中的时间范围进行查询;
因为 add_time
类型是 DateTime
,所以需要将当前的时间范围转为秒级的时间戳,然后使用 toDateTime
转为 DateTime
进行比较
SELECT distinct(toString(env))
FROM metrics.request_history
WHERE add_time BETWEEN toDateTime(${__from:type: post
date:seconds}) AND toDateTime(${__to:type: post
date:seconds})
依赖其他变量进行查询,添加到查询条件即可,使用 singlequote
添加引号,避免查询语法错误
SELECT distinct(toString(application))
FROM `logs`.`red_sentinel_metrics`
WHERE env in (${env:singlequote})
AND ActionDate BETWEEN toDateTime(${__from:type: post
date:seconds}) AND toDateTime(${__to:type: post
date:seconds})
- 添加动态查询条件
动态查询条件使用 $__conditionalAll
进行查询;
$__conditionalAll
接收两个参数,第一个参数是 SQL predicate
,第二个参数是 $variable
,当 $variable
不为空的时候会将 SQL predicate
作为查询条件添加到 SQL 中,如果是空或者是 ALL 则会添加 1=1
,
如 AND $__conditionalAll(uri in ('${uri:singlequote}'), '${uri:singlequote}')
,如果 uri 不为空,则会将 AND uri in (xxx)
作为查询条件;如果 uri 为空则是 AND 1=1
SELECT toDateTime64(add_time, 0) as time,
uri as label,
sum(value) as value
FROM metrics.request_history
WHERE $__timeFilter(add_time)
AND $__conditionalAll(uri in ('${uri:singlequote}'), '${uri:singlequote}')
GROUP BY label,time
ORDER BY time
最终 SQL 为:
SELECT add_time as time,
uri as label,
sum(value) as value
FROM metrics.request_history
WHERE add_time >= '1699861790' AND add_time <= '1699883390'
AND uri in ('/api/user')
GROUP BY label,time
ORDER BY time
聚合
可以用 $__conditionalAll
进行聚合,或者将要聚合的字段拼接为一个字段
官方的 ClickHouse 对聚合支持不够完善,$__conditionalAll
如果是空会被替换为 1=1
,会导致语法错误,如果不为空,则字段是固定存在的,事实上并不动态;而 Altinity plugin for ClickHouse 插件的 $conditionalTest
为空时不做任何拼接更能满足动态聚合的场景
- 用 conditionalAll
注意,这种场景下 uri 变量必须要有值,否则会出现 GROUP BY 1=1
这样的错误语法
SELECT $__timeInterval(add_time) as time,
uri as label,
sum(value) as value
FROM metrics.request_history
WHERE $__timeFilter(add_time)
AND $__conditionalAll(uri in (${uri:singlequote}), ${uri:singlequote})
GROUP BY $__conditionalAll(uri, ${uri:singlequote}), $__conditionalAll(hostname, ${hostname:singlequote}), time
ORDER BY time
最终查询为:
SELECT toDateTime64(add_time, 0) as time,
uri as label,
sum(value) as value
FROM metrics.request_history
WHERE add_time >= '1699944714' AND add_time <= '1699966314'
AND uri in ('/status','/actuator/health')
GROUP BY uri, hostname, time
ORDER BY time
- 用字段拼接
拼接后的字段用于做聚合,适用于固定的聚合字段
SELECT toDateTime64(add_time, 0) as time,
concat(application, ': ', uri) as label,
sum(value) as value
FROM metrics.request_history
WHERE $__timeFilter(add_time)
AND $__conditionalAll(uri in (${uri:singlequote}), ${uri:singlequote})
GROUP BY label, time
ORDER BY time
最终查询为:
SELECT toDateTime64(add_time, 0) as time,
concat(application, ': ', uri) as label,
sum(value) as value
FROM metrics.request_history
WHERE add_time >= '1699944327' AND add_time <= '1699965927'
AND uri in ('/status','/actuator/health')
GROUP BY label, time
ORDER BY time