(图片来源于网络,侵删)
Hive练习题👇
题目1如下👇
我们有如下的用户访问数据
要求使用SQL统计出每个用户的累积访问次数,如下表所示:
数据如下👇
u01 2017/1/21 5
u02 2017/1/23 6
u03 2017/1/22 8
u04 2017/1/20 3
u01 2017/1/23 6
u01 2017/2/21 8
u02 2017/1/23 6
u01 2017/2/22 4
建表语句
create table action
(userId string,
visitDate string,
visitCount int)
row format delimited fields terminated by "\t";
参考答案如下👇
select
userId,
date,
sum_visitCount,
sum(sum_visitCount) over(distribute by userId sort by date)
from
(select
userId,
date_format(regexp_replace(visitDate,"/","-"),"yyyy-MM") date,
sum(visitCount) sum_visitCount
from action group by userId,date_format(regexp_replace(visitDate,"/","-"),"yyyy-MM")) A;
题目2如下👇
有50W个店铺,每个顾客访问任何一个店铺的任何一个商品时都会产生一条访问日志,
访问日志存储的表名为visit,访客的用户id为user_id,被访问的店铺名称为shop
求:
1)每个店铺的UV(访客数)
2)每个店铺访问次数top3的访客信息。输出店铺名称、访客id、访问次数
数据如下👇
u1 a
u2 b
u1 b
u1 a
u3 c
u4 b
u1 a
u2 c
u5 b
u4 b
u6 c
u2 c
u1 b
u2 a
u2 a
u3 a
u5 a
u5 a
u5 a
建表语句
create table visit(user_id string,shop string) row format delimited fields terminated by '\t';
参考答案如下👇
1.
select
shop,
count(*)
from(select
user_id,
shop
from
visit
group by user_id,shop) A group by shop;
2.
select
shop,
user_id,
num
from(select
user_id,
shop,
num,
row_number() over(partition by shop order by num desc) rk
from
(select
shop,
user_id,
count(*) num
from
visit
group by shop,user_id) A) B
where rk <=3;
都看到这里了,点赞评论一下吧!!!
点击查看👇
【Hive】Hive练习题(二)
转载:https://blog.csdn.net/qq_43733123/article/details/105362529
查看评论