MySQL 合并查询join 查询出的不同列合并到一个表中

为了求解问题时思路清晰，建议先分列查询，再将列合并到一个表中，这样相当于将复杂问题拆解为简单问题，一一解决。优点是避免所有问题混在一起，代码逻辑清晰，可迁移性强，下次遇到类似的查询问题能快速求解，缺点是代码看起来不够简洁，存在代码冗余的问题。

一、适用场景和方法

（1）适用场景

考虑查询过程中是否存在以下情况：

查询某些列时需要分组才能得到，某些列不需要分组就能得到；
查询某些列时需要where条件，某些列不需要where条件；
查询这些列时需要多次用到不同的表；
某一个列或几个列的查询过程很复杂。

存在上述情况时候，为了求解问题时思路清晰，建议先分列查询，再将列合并到一个表中，这样相当于将复杂问题拆解为简单问题，一一解决。

（2）方法

MySQL多表查询，将查询到的列合并到一个表中使用join函数

具体包括：

连接类型（四者选一）	连接条件（三者选一）
left join	natural
right join	on <连接条件>
inner join	using(col1,col2,...,coln)
full outer join

根据查询需要使用不同的连接类型和条件。其中col指列名（注意两个表的该列名必须相同）。

二、案例分析

案例来自：SQL135 每个6/7级用户活跃情况

现有用户信息表user_info（uid用户ID，nick_name昵称, achievement成就值, level等级, job职业方向, register_time注册时间）：

id	uid	nick_name	achievement	level	job	register_time
1	1001	牛客1号	3100	7	算法	2020-01-01 10:00:00
2	1002	牛客2号	2300	7	算法	2020-01-01 10:00:00
3	1003	牛客3号	2500	7	算法	2020-01-01 10:00:00
4	1004	牛客4号	1200	5	算法	2020-01-01 10:00:00
5	1005	牛客5号	1600	6	C++	2020-01-01 10:00:00
6	1006	牛客6号	2600	7	C++	2020-01-01 10:00:00

试卷信息表examination_info（exam_id试卷ID, tag试卷类别, difficulty试卷难度, duration考试时长, release_time发布时间）：

id	exam_id	tag	difficulty	duration	release_time
1	9001	SQL	hard	60	2021-09-01 06:00:00
2	9002	C++	easy	60	2021-09-01 06:00:00
3	9003	算法	medium	80	2021-09-01 10:00:00

试卷作答记录表exam_record（uid用户ID, exam_id试卷ID, start_time开始作答时间, submit_time交卷时间, score得分）：

uid	exam_id	start_time	submit_time	score
1001	9001	2021-09-01 09:01:01	2021-09-01 09:31:00	78
1001	9001	2021-09-01 09:01:01	2021-09-01 09:31:00	81
1005	9001	2021-09-01 19:01:01	2021-09-01 19:30:01	85
1005	9002	2021-09-01 12:01:01	2021-09-01 12:31:02	85
1006	9003	2021-09-07 10:01:01	2021-09-07 10:21:59	84
1006	9001	2021-09-07 10:01:01	2021-09-07 10:21:01	81
1002	9001	2020-09-01 13:01:01	2020-09-01 13:41:01	81
1005	9001	2021-09-01 14:01:01	(NULL)	(NULL)

题目练习记录表practice_record（uid用户ID, question_id题目ID, submit_time提交时间, score得分）：

uid	question_id	submit_time	score
1001	8001	2021-08-02 11:41:01	60
1004	8001	2021-08-02 19:38:01	70
1004	8002	2021-08-02 19:48:01	90
1001	8002	2021-08-02 19:38:01	70
1004	8002	2021-08-02 19:48:01	90
1006	8002	2021-08-04 19:58:01	94
1006	8003	2021-08-03 19:38:01	70
1006	8003	2021-08-02 19:48:01	90
1006	8003	2020-08-01 19:38:01	80

请统计每个6/7级用户总活跃月份数、2021年活跃天数、2021年试卷作答活跃天数、2021年答题活跃天数，按照总活跃月份数、2021年活跃天数降序排序。由示例数据结果输出如下：

uid	act_month_total	act_days_2021	act_days_2021_exam	act_days_2021_question
1006	3	4	1	3
1001	2	2	1	1
1005	1	1	1	0
1002	1	0	0	0
1003	0	0	0	0

解释：6/7级用户共有5个，其中1006在202109、202108、202008共3个月活跃过，2021年活跃的日期有20210907、20210804、20210803、20210802共4天，2021年在试卷作答区20210907活跃1天，在题目练习区活跃了3天。

分析是否适用‘列拼接成表’的方法：

根据题目要求可知，总活跃月份数、2021年活跃天数和 2021年试卷作答活跃天数、2021年答题活跃天数，查询所用的表不一样，并且每一个列的查询过程都很复杂，所以采取分列查询再合并列的方法。

分析思路

难点：

1.建立合并列的思想

（1）统计用户总活跃月份数 如果日期重复算一个月份

[使用]：[年月]：date_format(exrp,'%y%m') ; 去重：distinct

（2）统计用户2021年活跃天数 如果日期重复算一天

[使用]：[2021年]: year(exrp) = 2021; [年月日]：date(exrp) ; 去重：distinct；

注意： 判断是否是2021年应该放在select里面而不是where中

（3）统计2021年试卷作答活跃天数

[使用]： [2021年]: year(exrp) = 2021; [年月日]：date(exrp) ;

（4）统计2021年答题活跃天数

[使用]：多表连接使用 join using( )

（5）合并列

[使用]： [2021年]: year(exrp) = 2021; [年月日]：date(exrp) ;

最终结果

select 查询结果 [总活跃月份数; 2021年活跃天数; 2021年试卷作答活跃天数; 2021年答题活跃天数]

from 从哪张表中查询数据[多个join连接的表]

where 查询条件 [level等级是6/7]

order by 对查询结果排序 [按照总活跃月份数、2021年活跃天数降序];

实现过程

（1）需要一个临时表：

with

    main as(

	#试卷作答记录和题目练习记录

        select distinct

            a.uid,

            date(start_time) as days,

            'exam' as tag

        from user_info a

        left join exam_record b

        using(uid)

        union

        select distinct

            a.uid,

            date(submit_time) as days,

            'question' as tag

        from user_info a

        left join practice_record c

        using(uid)

    )

注意：mysql版本在8.0之前不支持with。如需配置mysql的8.0版本参考

（2）求select列

总活跃月份数

#总活跃月份数 attr

select

    uid,

    count(distinct date_format(days,'%y%m')) as act_month_total

from main

group by uid

2021年活跃天数

#2021年试卷作答活跃天数 attr1

select

    uid,

    count(distinct(if(year(start_time) = 2021,start_time,null))) as act_days_2021_exam

from main

group by uid

2021年试卷作答活跃天数
count(distinct(if(year(date(act_date)) = 2021 and tag = 'exam',act_date,null)))

利用tag标记是试卷作答记录还是答题作答记录。

#2021年试卷作答活跃天数 attr2

select

    uid,

    count(distinct(if(year(days) = 2021 and tag = 'exam',days,null))) as act_days_2021_exam

from main1

group by uid

2021年答题活跃天数

#2021年答题活跃天数 attr3

select

    uid,

    count(distinct(if(year(days) = 2021 and tag = 'question', days, null))) as act_days_2021_question

from main1

group by uid

（3）合并列

select

    a.uid,

    act_month_total,

    act_days_2021,

    act_days_2021_exam,

    act_days_2021_question

from user_info a

left join attr using(uid)

left join attr1 using(uid)

left join attr2 using(uid)

left join attr3 using(uid)

where level between 6 and 7

order by act_month_total desc,act_days_2021 desc

求解代码

方法一：

使用 with

with

    main as(

	#试卷作答记录和题目练习记录

        select distinct

            a.uid,

            date(start_time) as days,

            'exam' as tag

        from user_info a

        left join exam_record b

        using(uid)

        union

        select distinct

            a.uid,

            date(submit_time) as days,

            'question' as tag

        from user_info a

        left join practice_record c

        using(uid)

    )

#合并列

select

    a.uid,

    act_month_total,

    act_days_2021,

    act_days_2021_exam,

    act_days_2021_question

from user_info a

left join(

    #总活跃月份数指的是所有年

    select

        uid,

        count(distinct date_format(days,'%y%m')) as act_month_total

    from main

    group by uid

) attr using(uid)

left join(

    #2021年活跃天数

    select

        uid,

        count(distinct if(year(days) = 2021,days,null)) as act_days_2021

    from main

    group by uid

) attr1 using(uid)

left join(

    #2021年试卷作答活跃天数

    select

        uid,

        count(distinct(if(year(days) = 2021 and tag = 'exam',days,null))) as act_days_2021_exam

    from main

    group by uid

) attr2 using(uid)

left join(

    #2021年答题活跃天数

    select

        uid,

        count(distinct(if(year(days) = 2021 and tag = 'question',days,null))) as act_days_2021_question

    from main

    group by uid

) attr3 using(uid)

where level between 6 and 7

order by  act_month_total desc,act_days_2021 desc#按照总活跃月份数、2021年活跃天数降序排序

方法二：

不使用 with

select

    uid,

    count(distinct date_format(days,'%y%m')) as act_month_total,#总活跃月份数指的是所有年

    count(distinct if(year(days) = 2021,days,null)) as act_days_2021,#2021年活跃天数

    count(distinct(if(year(days) = 2021 and tag = 'exam',days,null))) as act_days_2021_exam,#2021年试卷作答活跃天数

    count(distinct(if(year(days) = 2021 and tag = 'question',days,null))) as act_days_2021_question#试卷作答记录和题目练习记录

from user_info

left join(

    select distinct

        uid,

        date(start_time) as days,

        'exam' as tag

    from user_info

    left join exam_record using(uid)

    union

    select distinct

        uid,

        date(submit_time) as days,

        'question' as tag

    from user_info

    left join practice_record using(uid)

) main using(uid)

where level between 6 and 7

group by uid

order by  act_month_total desc,act_days_2021 desc#按照总活跃月份数、2021年活跃天数降序排序

扩展：

前往查看MySQL 嵌套子查询 with子句 from子查询 in子查询 join子查询