MySQL 窗口函数

MySQL 8.0 引入了 窗口函数(window functions) ,我们看一下都有哪些功能。

NameDescription
CUME_DIST()Cumulative distribution value
DENSE_RANK()Rank of current row within its partition, without gaps
FIRST_VALUE()Value of argument from first row of window frame
LAG()Value of argument from row lagging current row within partition
LAST_VALUE()Value of argument from last row of window frame
LEAD()Value of argument from row leading current row within partition
NTH_VALUE()Value of argument from N-th row of window frame
NTILE()Bucket number of current row within its partition.
PERCENT_RANK()Percentage rank value
RANK()Rank of current row within its partition, with gaps
ROW_NUMBER()Number of current row within its partition

这里我们使用 world 示例数据库进行演示,首先使用 docker 安装 mysql,然后导入 world.sql。

docker run --name db -e MYSQL_ROOT_PASSWORD=123456 -d mysql
docker exec -it db mysql -u root -p < C:\Users\tanghengzhi\Downloads\world.sql




比较常用的有排序:rank(), dense_rank() 和 row_number() 。

mysql> select Name, LifeExpectancy, rank() over (w), dense_rank() over(w), row_number() over(w) 
from country 
where Continent = 'Europe' 
window w as (order by LifeExpectancy desc) 
limit 10;
+---------------+----------------+-----------------+----------------------+----------------------+
| Name          | LifeExpectancy | rank() over (w) | dense_rank() over(w) | row_number() over(w) |
+---------------+----------------+-----------------+----------------------+----------------------+
| Andorra       |           83.5 |               1 |                    1 |                    1 |
| San Marino    |           81.1 |               2 |                    2 |                    2 |
| Switzerland   |           79.6 |               3 |                    3 |                    3 |
| Sweden        |           79.6 |               3 |                    3 |                    4 |
| Iceland       |           79.4 |               5 |                    4 |                    5 |
| Gibraltar     |           79.0 |               6 |                    5 |                    6 |
| Italy         |           79.0 |               6 |                    5 |                    7 |
| Spain         |           78.8 |               8 |                    6 |                    8 |
| France        |           78.8 |               8 |                    6 |                    9 |
| Liechtenstein |           78.8 |               8 |                    6 |                   10 |
+---------------+----------------+-----------------+----------------------+----------------------+
10 rows in set (0.00 sec)




还有第 N 行的值,first_value(), last_value() 和 nth_value()。

mysql> select Name, LifeExpectancy, first_value(Name) over(w) as firth, last_value(Name) over(w) as last, nth_value(Name, 2) over(w) as second, nth_value(Name, 4) over (w) as fourth 
from country 
where Continent = 'Europe' 
window w as (order by LifeExpectancy desc) 
limit 10;
+---------------+----------------+---------+------------+------------+--------+
| Name          | LifeExpectancy | firth   | last       | second     | fourth |
+---------------+----------------+---------+------------+------------+--------+
| Andorra       |           83.5 | Andorra | Andorra    | NULL       | NULL   |
| San Marino    |           81.1 | Andorra | San Marino | San Marino | NULL   |
| Switzerland   |           79.6 | Andorra | Sweden     | San Marino | Sweden |
| Sweden        |           79.6 | Andorra | Sweden     | San Marino | Sweden |
| Iceland       |           79.4 | Andorra | Iceland    | San Marino | Sweden |
| Gibraltar     |           79.0 | Andorra | Italy      | San Marino | Sweden |
| Italy         |           79.0 | Andorra | Italy      | San Marino | Sweden |
| Spain         |           78.8 | Andorra | Monaco     | San Marino | Sweden |
| France        |           78.8 | Andorra | Monaco     | San Marino | Sweden |
| Liechtenstein |           78.8 | Andorra | Monaco     | San Marino | Sweden |
+---------------+----------------+---------+------------+------------+--------+
10 rows in set (0.00 sec)




其他的窗口函数不是很常用,就不一一介绍了,需要用到的请参考官方文档。


接下来是一个在实际使用中遇到的 Bug:

由于涉及到具体业务,这里不方便直接展示相关 SQL 语句。

还是使用演示数据库,查询各大洲的总人口,总人口排名,以及该洲人口最多的国家,按照总人口倒叙排列。

mysql> select Continent, sum(Population), rank() over(order by sum(Population) desc) as `rank`, substring_index(group_concat(Name order by Population desc), ',', 1) as `The most populous country` 
from country 
group by Continent 
order by sum(Population) desc;




MySQL 8.0.16:

+---------------+-----------------+------+----------------------------------------------+
| Continent     | sum(Population) | rank | The most populous country                    |
+---------------+-----------------+------+----------------------------------------------+
| Asia          |      3705025700 |    1 | China                                        |
| Africa        |       784475000 |    2 | Nigeria                                      |
| Europe        |       730074600 |    3 | Russian Federation                           |
| North America |       482993000 |    4 | United States                                |
| South America |       345780000 |    5 | Brazil                                       |
| Oceania       |        30401150 |    6 | Australia                                    |
| Antarctica    |               0 |    7 | South Georgia and the South Sandwich Islands |
+---------------+-----------------+------+----------------------------------------------+
7 rows in set (0.00 sec)




MySQL 8.0.18:

+---------------+-----------------+------+----------------------------------------------+
| Continent     | sum(Population) | rank | The most populous country                    |
+---------------+-----------------+------+----------------------------------------------+
| Asia          |      3705025700 |    1 | South Georgia and the South Sandwich Islands |
| Africa        |       784475000 |    2 | South Georgia and the South Sandwich Islands |
| Europe        |       730074600 |    3 | South Georgia and the South Sandwich Islands |
| North America |       482993000 |    4 | South Georgia and the South Sandwich Islands |
| South America |       345780000 |    5 | South Georgia and the South Sandwich Islands |
| Oceania       |        30401150 |    6 | South Georgia and the South Sandwich Islands |
| Antarctica    |               0 |    7 | South Georgia and the South Sandwich Islands |
+---------------+-----------------+------+----------------------------------------------+
7 rows in set (0.00 sec)




可以看到,同样一个 SQL 语句,在 MySQL 8.0.16 和 MySQL 8.0.18 版本的查询结果不一样的。最新的 MySQL 8.0.22 版本也存在这个问题,目前已经反馈给阿里云,阿里云的小伙伴提交了 MySQL Bug,等待后续解决方案吧。

在这个 Bug 被修复之前,只能先临时修改一下 SQL 语句,如下:

mysql> select Continent, sum(Population), rank() over(w) as `rank`, substring_index(group_concat(Name order by Population desc), ',', 1) as `The most populous country` 
from country 
group by Continent 
window w as (order by sum(Population) desc);

参考:

https://hub.docker.com/_/mysql

https://dev.mysql.com/doc/world-setup/en/

https://dev.mysql.com/doc/refman/8.0/en/mysql-nutshell.html

https://dev.mysql.com/doc/refman/8.0/en/window-functions.html

https://bugs.mysql.com/bug.php?id=101691

 355 total views,  3 views today

Leave a comment

Your email address will not be published. Required fields are marked *