Continuous Benchmarking

One of Databend's design goals is to maintain top performance. To guard that goal, Databend runs Continuous Benchmarking on every nightly release to detect performance regressions, and visualizes the results on the website perf.databend.rs.

The benchmark runner, which runs daily, and its results are defined in the repository datafuselabs/databend-perf.
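As an illustration of what detecting a performance regression between two nightly runs can look like, here is a minimal sketch. It is not the logic used by datafuselabs/databend-perf: the function name `find_regressions`, the 10% threshold, the dictionary layout, and the timings are all hypothetical and chosen only for the example.

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    current: Dict[str, float],
    threshold: float = 0.10,  # hypothetical cutoff: flag slowdowns above 10%
) -> List[str]:
    """Return the query IDs whose run time grew by more than `threshold`."""
    regressed = []
    for query_id, base_seconds in baseline.items():
        new_seconds = current.get(query_id)
        # Skip queries missing from the new run or with an unusable baseline.
        if new_seconds is None or base_seconds <= 0:
            continue
        relative_change = (new_seconds - base_seconds) / base_seconds
        if relative_change > threshold:
            regressed.append(query_id)
    return regressed


# Made-up timings in seconds, for illustration only.
previous = {"Q1": 2.0, "Q2": 2.0, "Q3": 4.0}
latest = {"Q1": 2.1, "Q2": 2.6, "Q3": 3.5}

print(find_regressions(previous, latest))
```

With these sample numbers only Q2 exceeds the threshold: Q1 slows down by 5% and Q3 gets faster. A relative threshold is used here so that short and long queries are judged on the same scale.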

Vectorized Execution Benchmarking

This benchmark mainly targets Databend's vectorized execution: it shows how fast vectorized execution in memory is. The following queries are used to measure it:

| Number | Query |
|--------|-------|
| Q1 | SELECT avg(number) FROM numbers_mt(100000000000) |
| Q2 | SELECT sum(number) FROM numbers_mt(100000000000) |
| Q3 | SELECT min(number) FROM numbers_mt(100000000000) |
| Q4 | SELECT max(number) FROM numbers_mt(100000000000) |
| Q5 | SELECT count(number) FROM numbers_mt(100000000000) |
| Q6 | SELECT sum(number+number+number) FROM numbers_mt(100000000000) |
| Q7 | SELECT sum(number) / count(number) FROM numbers_mt(100000000000) |
| Q8 | SELECT sum(number) / count(number), max(number), min(number) FROM numbers_mt(100000000000) |
| Q9 | SELECT number FROM numbers_mt(10000000000) ORDER BY number DESC LIMIT 10 |
| Q10 | SELECT max(number), sum(number) FROM numbers_mt(10000000000) GROUP BY number % 3, number % 4, number % 5 LIMIT 10 |
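Queries like the ones above can be timed with a small harness. The sketch below is an assumption about how such a measurement could be structured, not the actual databend-perf runner: `time_queries` and `fake_run_query` are hypothetical names, the repeat count is arbitrary, and the stand-in executor does no real work. In practice `run_query` would be replaced by a function that sends the SQL to a Databend instance.

```python
import statistics
import time
from typing import Callable, Dict


def time_queries(
    run_query: Callable[[str], object],
    queries: Dict[str, str],
    repeats: int = 3,  # hypothetical repeat count
) -> Dict[str, float]:
    """Run each query `repeats` times and return the median wall time in seconds."""
    results = {}
    for query_id, sql in queries.items():
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_query(sql)
            samples.append(time.perf_counter() - start)
        # The median is less sensitive to a single slow outlier than the mean.
        results[query_id] = statistics.median(samples)
    return results


QUERIES = {
    "Q1": "SELECT avg(number) FROM numbers_mt(100000000000)",
    "Q2": "SELECT sum(number) FROM numbers_mt(100000000000)",
}

executed = []


def fake_run_query(sql: str) -> None:
    # Stand-in executor: records the SQL instead of contacting a server.
    executed.append(sql)


timings = time_queries(fake_run_query, QUERIES)
for query_id, seconds in timings.items():
    print(f"{query_id}: {seconds:.6f}s")
```

Passing the executor in as a callable keeps the timing logic independent of any particular client library.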

Ontime Benchmarking

This benchmark shows how Databend performs on the Ontime dataset stored on AWS S3. The following queries are used to measure it:

| Number | Query |
|--------|-------|
| Q1 | SELECT DayOfWeek, count(*) AS c FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC; |
| Q2 | SELECT DayOfWeek, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY DayOfWeek ORDER BY c DESC; |
| Q3 | SELECT Origin, count(*) AS c FROM ontime WHERE DepDelay>10 AND Year >= 2000 AND Year <= 2008 GROUP BY Origin ORDER BY c DESC LIMIT 10; |
| Q4 | SELECT IATA_CODE_Reporting_Airline AS Carrier, count() FROM ontime WHERE DepDelay>10 AND Year = 2007 GROUP BY Carrier ORDER BY count() DESC; |
| Q5 | SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(cast(DepDelay>10 as Int8))*1000 AS c3 FROM ontime WHERE Year=2007 GROUP BY Carrier ORDER BY c3 DESC; |
| Q6 | SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(cast(DepDelay>10 as Int8))*1000 AS c3 FROM ontime WHERE Year>=2000 AND Year <=2008 GROUP BY Carrier ORDER BY c3 DESC; |
| Q7 | SELECT IATA_CODE_Reporting_Airline AS Carrier, avg(DepDelay) * 1000 AS c3 FROM ontime WHERE Year >= 2000 AND Year <= 2008 GROUP BY Carrier; |
| Q8 | SELECT Year, avg(DepDelay) FROM ontime GROUP BY Year; |
| Q9 | SELECT Year, count(*) as c1 FROM ontime GROUP BY Year; |
| Q10 | SELECT avg(cnt) FROM (SELECT Year,Month,count(*) AS cnt FROM ontime WHERE DepDel15=1 GROUP BY Year,Month) a; |
| Q11 | SELECT avg(c1) FROM (SELECT Year,Month,count(*) AS c1 FROM ontime GROUP BY Year,Month) a; |
| Q12 | SELECT OriginCityName, DestCityName, count(*) AS c FROM ontime GROUP BY OriginCityName, DestCityName ORDER BY c DESC LIMIT 10; |
| Q13 | SELECT OriginCityName, count(*) AS c FROM ontime GROUP BY OriginCityName ORDER BY c DESC LIMIT 10; |
| Q14 | SELECT count(*) FROM ontime; |