原表:
bw_orders,使用的是ReplaceingMergeTree引擎。
原表有几个重要的字段:
id,spread,pay_time,is_pay,ads_id,ads_at,money,update_time,create_time 等,说白了是一个常规的业务订单表,数据量有点大,现在是接近6千万的样子。
ORDER BY = id, ads_id, spread, create_time
ver = update_time
业务创建一个待支付订单->支付;
支付状态的变更都是通过新增同主键(ORDER BY),新的版本号(update_time)的形式。
我物化视图SQL如下:
CREATE MATERIALIZED VIEW view_ads_aapp
(
`date` Date COMMENT '日期',
`ads_id` Int32 COMMENT '广告ID',
`ads_at` String COMMENT '广告AT',
`pay_orders_count` Int32 COMMENT '支付单数',
`pay_orders_amount` Float64 COMMENT '支付金额'
)
ENGINE = SummingMergeTree
PARTITION BY toYear(date)
ORDER BY (date, ads_id, ads_at)
SETTINGS index_granularity = 8192
AS
select date,
ads_id,
ads_at,
COUNT(*) as pay_orders_count,
SUM(money) as pay_orders_amount
from `bw_orders`
where `is_pay` = 1 and pay_time > 1652775300
group by toDate(pay_time) as date, ads_id, ads_at;
数据跑一段时间,我发现物化视图一整天下来,支付单数比源表多,不知道问题出在哪里,特来求救。