MySQL 8.0 引入了 窗口函数(window functions) ,我们看一下都有哪些功能。
Name | Description |
---|---|
CUME_DIST() | Cumulative distribution value |
DENSE_RANK() | Rank of current row within its partition, without gaps |
FIRST_VALUE() | Value of argument from first row of window frame |
LAG() | Value of argument from row lagging current row within partition |
LAST_VALUE() | Value of argument from last row of window frame |
LEAD() | Value of argument from row leading current row within partition |
NTH_VALUE() | Value of argument from N-th row of window frame |
NTILE() | Bucket number of current row within its partition. |
PERCENT_RANK() | Percentage rank value |
RANK() | Rank of current row within its partition, with gaps |
ROW_NUMBER() | Number of current row within its partition |
这里我们使用 world 示例数据库进行演示,首先使用 docker 安装 mysql,然后导入 world.sql。
docker run --name db -e MYSQL_ROOT_PASSWORD=123456 -d mysql
docker exec -it db mysql -u root -p < C:\Users\tanghengzhi\Downloads\world.sql
比较常用的有排序:rank(), dense_rank() 和 row_number() 。
mysql> select Name, LifeExpectancy, rank() over (w), dense_rank() over(w), row_number() over(w)
from country
where Continent = 'Europe'
window w as (order by LifeExpectancy desc)
limit 10;
+---------------+----------------+-----------------+----------------------+----------------------+
| Name | LifeExpectancy | rank() over (w) | dense_rank() over(w) | row_number() over(w) |
+---------------+----------------+-----------------+----------------------+----------------------+
| Andorra | 83.5 | 1 | 1 | 1 |
| San Marino | 81.1 | 2 | 2 | 2 |
| Switzerland | 79.6 | 3 | 3 | 3 |
| Sweden | 79.6 | 3 | 3 | 4 |
| Iceland | 79.4 | 5 | 4 | 5 |
| Gibraltar | 79.0 | 6 | 5 | 6 |
| Italy | 79.0 | 6 | 5 | 7 |
| Spain | 78.8 | 8 | 6 | 8 |
| France | 78.8 | 8 | 6 | 9 |
| Liechtenstein | 78.8 | 8 | 6 | 10 |
+---------------+----------------+-----------------+----------------------+----------------------+
10 rows in set (0.00 sec)
还有第 N 行的值,first_value(), last_value() 和 nth_value()。
mysql> select Name, LifeExpectancy, first_value(Name) over(w) as firth, last_value(Name) over(w) as last, nth_value(Name, 2) over(w) as second, nth_value(Name, 4) over (w) as fourth
from country
where Continent = 'Europe'
window w as (order by LifeExpectancy desc)
limit 10;
+---------------+----------------+---------+------------+------------+--------+
| Name | LifeExpectancy | firth | last | second | fourth |
+---------------+----------------+---------+------------+------------+--------+
| Andorra | 83.5 | Andorra | Andorra | NULL | NULL |
| San Marino | 81.1 | Andorra | San Marino | San Marino | NULL |
| Switzerland | 79.6 | Andorra | Sweden | San Marino | Sweden |
| Sweden | 79.6 | Andorra | Sweden | San Marino | Sweden |
| Iceland | 79.4 | Andorra | Iceland | San Marino | Sweden |
| Gibraltar | 79.0 | Andorra | Italy | San Marino | Sweden |
| Italy | 79.0 | Andorra | Italy | San Marino | Sweden |
| Spain | 78.8 | Andorra | Monaco | San Marino | Sweden |
| France | 78.8 | Andorra | Monaco | San Marino | Sweden |
| Liechtenstein | 78.8 | Andorra | Monaco | San Marino | Sweden |
+---------------+----------------+---------+------------+------------+--------+
10 rows in set (0.00 sec)
其他的窗口函数不是很常用,就不一一介绍了,需要用到的请参考官方文档。
接下来是一个在实际使用中遇到的 Bug:
由于涉及到具体业务,这里不方便直接展示相关 SQL 语句。
还是使用演示数据库,查询各大洲的总人口,总人口排名,以及该洲人口最多的国家,按照总人口倒叙排列。
mysql> select Continent, sum(Population), rank() over(order by sum(Population) desc) as `rank`, substring_index(group_concat(Name order by Population desc), ',', 1) as `The most populous country`
from country
group by Continent
order by sum(Population) desc;
MySQL 8.0.16:
+---------------+-----------------+------+----------------------------------------------+
| Continent | sum(Population) | rank | The most populous country |
+---------------+-----------------+------+----------------------------------------------+
| Asia | 3705025700 | 1 | China |
| Africa | 784475000 | 2 | Nigeria |
| Europe | 730074600 | 3 | Russian Federation |
| North America | 482993000 | 4 | United States |
| South America | 345780000 | 5 | Brazil |
| Oceania | 30401150 | 6 | Australia |
| Antarctica | 0 | 7 | South Georgia and the South Sandwich Islands |
+---------------+-----------------+------+----------------------------------------------+
7 rows in set (0.00 sec)
MySQL 8.0.18:
+---------------+-----------------+------+----------------------------------------------+
| Continent | sum(Population) | rank | The most populous country |
+---------------+-----------------+------+----------------------------------------------+
| Asia | 3705025700 | 1 | South Georgia and the South Sandwich Islands |
| Africa | 784475000 | 2 | South Georgia and the South Sandwich Islands |
| Europe | 730074600 | 3 | South Georgia and the South Sandwich Islands |
| North America | 482993000 | 4 | South Georgia and the South Sandwich Islands |
| South America | 345780000 | 5 | South Georgia and the South Sandwich Islands |
| Oceania | 30401150 | 6 | South Georgia and the South Sandwich Islands |
| Antarctica | 0 | 7 | South Georgia and the South Sandwich Islands |
+---------------+-----------------+------+----------------------------------------------+
7 rows in set (0.00 sec)
可以看到,同样一个 SQL 语句,在 MySQL 8.0.16 和 MySQL 8.0.18 版本的查询结果不一样的。最新的 MySQL 8.0.22 版本也存在这个问题,目前已经反馈给阿里云,阿里云的小伙伴提交了 MySQL Bug,等待后续解决方案吧。
在这个 Bug 被修复之前,只能先临时修改一下 SQL 语句,如下:
mysql> select Continent, sum(Population), rank() over(w) as `rank`, substring_index(group_concat(Name order by Population desc), ',', 1) as `The most populous country`
from country
group by Continent
window w as (order by sum(Population) desc);
参考:
https://hub.docker.com/_/mysql
https://dev.mysql.com/doc/world-setup/en/
https://dev.mysql.com/doc/refman/8.0/en/mysql-nutshell.html
https://dev.mysql.com/doc/refman/8.0/en/window-functions.html
https://bugs.mysql.com/bug.php?id=101691
354 total views, 2 views today