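Semi-join optimization was introduced in MySQL 5.6.5 and is mostly used for subqueries of the IN or EXISTS kind. For each key value from the outer row source, the server looks for the first matching key value in the inner row source and returns as soon as one is found; once there is a match, the remaining key values in the inner row source do not need to be checked.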
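The stop-at-first-match behaviour described above can be sketched in a few lines of Python. This only illustrates the semantics and is not how MySQL implements it; the function name `semi_join` and the sample rows are hypothetical.

```python
def semi_join(outer_rows, inner_keys, key):
    """Return each outer row at most once if any inner key matches it."""
    result = []
    for row in outer_rows:
        # any() stops at the first matching inner key, so the remaining
        # inner keys are never examined for this outer row.
        if any(row[key] == inner_key for inner_key in inner_keys):
            result.append(row)
    return result


# Hypothetical sample data: class 3 appears twice on the inner side.
classes = [
    {"class_num": 1, "class_name": "class 1"},
    {"class_num": 2, "class_name": "class 2"},
    {"class_num": 3, "class_name": "class 3"},
]
roster_class_nums = [2, 3, 3]

matched = semi_join(classes, roster_class_nums, "class_num")
```

Even though class 3 has two matching inner keys, it appears only once in `matched`, which is what distinguishes a semi-join from an ordinary inner join.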


mysql> desc class;
| Field | Type | Null | Key | Default | Extra |
| class_num | int(11) | NO | PRI | NULL | |
| class_name | varchar(20) | YES | | NULL | |
2 rows in set (0.00 sec)

mysql> desc roster;
| Field | Type | Null | Key | Default | Extra |
| class_num | int(11) | YES | | NULL | |
| student_num | int(11) | YES | | NULL | |
2 rows in set (0.00 sec)




mysql>  SELECT class.class_num, class.class_name FROM class INNER JOIN roster WHERE class.class_num = roster.class_num;
| class_num | class_name |
| 2 | class 2 |
| 3 | class 3 |
| 3 | class 3 |
3 rows in set (0.00 sec)


mysql>  SELECT class_num, class_name FROM class WHERE class_num IN (SELECT class_num FROM roster);
| class_num | class_name |
| 2 | class 2 |
| 3 | class 3 |
2 rows in set (0.00 sec)
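
The difference between the two result sets above (the inner join repeats class 3, the IN subquery does not) can be reproduced with a small script. This is a sketch that uses Python's built-in `sqlite3` module rather than MySQL, so it illustrates only the result semantics and says nothing about the MySQL optimizer. The rows for classes 1 and 4 and all `student_num` values are made-up assumptions, because the original table contents are not shown.

```python
import sqlite3


def build_demo_db():
    """Create in-memory tables shaped like class and roster (data is assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE class (class_num INTEGER PRIMARY KEY, class_name TEXT);
        CREATE TABLE roster (class_num INTEGER, student_num INTEGER);
        -- Assumed rows: only classes 2 and 3 are confirmed by the output above.
        INSERT INTO class VALUES (1, 'class 1'), (2, 'class 2'),
                                 (3, 'class 3'), (4, 'class 4');
        -- Assumed student numbers; class 3 has two students.
        INSERT INTO roster VALUES (2, 101), (3, 102), (3, 103);
        """
    )
    return conn


# ORDER BY is added only to make the row order deterministic.
INNER_JOIN = """
    SELECT class.class_num, class.class_name
    FROM class INNER JOIN roster ON class.class_num = roster.class_num
    ORDER BY class.class_num
"""
IN_SUBQUERY = """
    SELECT class_num, class_name FROM class
    WHERE class_num IN (SELECT class_num FROM roster)
    ORDER BY class_num
"""
EXISTS_SUBQUERY = """
    SELECT class_num, class_name FROM class
    WHERE EXISTS (SELECT 1 FROM roster WHERE roster.class_num = class.class_num)
    ORDER BY class_num
"""


def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()


conn = build_demo_db()
join_rows = run(conn, INNER_JOIN)         # one row per matching roster entry
in_rows = run(conn, IN_SUBQUERY)          # one row per matching class
exists_rows = run(conn, EXISTS_SUBQUERY)  # same rows as the IN form
```

The join returns a class once per matching roster row, while the IN and EXISTS forms return each class at most once.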


mysql> explain SELECT class_num, class_name FROM class WHERE class_num IN (SELECT class_num FROM roster);
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| 1 | SIMPLE | roster | NULL | ALL | NULL | NULL | NULL | NULL | 3 | 100.00 | Start temporary |
| 1 | SIMPLE | class | NULL | ALL | PRIMARY | NULL | NULL | NULL | 4 | 25.00 | Using where; End temporary; Using join buffer (Block Nested Loop) |
2 rows in set, 1 warning (0.00 sec)

mysql> show warnings;
| Level | Code | Message |
| Note | 1003 | /* select#1 */ select `test`.`class`.`class_num` AS `class_num`,`test`.`class`.`class_name` AS `class_name` from `test`.`class` semi join (`test`.`roster`) where (`test`.`class`.`class_num` = `test`.`roster`.`class_num`) |
1 row in set (0.00 sec)
Start temporary and End temporary in the Extra column indicate that a temporary table is used to remove duplicate values (the Duplicate Weedout strategy).
If select_type is MATERIALIZED and the table column shows <subqueryN>, the temporary table is used for materialization.
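
The temporary-table deduplication between Start temporary and End temporary can be pictured roughly as follows. This is a simplified sketch, not MySQL's actual code: a Python set stands in for the temporary table, list positions stand in for row IDs, and the function name and sample rows are hypothetical.

```python
def duplicate_weedout(inner_rows, outer_rows):
    """Join inner to outer like a plain join, but emit each outer row once."""
    seen_outer_ids = set()  # stands in for the temporary table of row IDs
    result = []
    for inner in inner_rows:  # roster is read first, as in the plan above
        for outer_id, outer in enumerate(outer_rows):
            if outer["class_num"] != inner["class_num"]:
                continue  # join condition not satisfied
            if outer_id in seen_outer_ids:
                continue  # this outer row was already emitted: weed it out
            seen_outer_ids.add(outer_id)
            result.append(outer)
    return result


# Hypothetical sample data; student numbers are assumptions.
roster = [
    {"class_num": 2, "student_num": 101},
    {"class_num": 3, "student_num": 102},
    {"class_num": 3, "student_num": 103},
]
classes = [
    {"class_num": 1, "class_name": "class 1"},
    {"class_num": 2, "class_name": "class 2"},
    {"class_num": 3, "class_name": "class 3"},
    {"class_num": 4, "class_name": "class 4"},
]

weeded = duplicate_weedout(roster, classes)
```

The second roster row for class 3 produces a join match, but the match is discarded because that class row is already recorded in the set.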


The optimizer can handle a semi-join with the following strategies:

  • Convert the subquery to a join, or use table pullout and run the query as an inner join between subquery tables and outer tables. Table pullout pulls a table out from the subquery to the outer query.

  • Duplicate Weedout: Run the semi-join as if it was a join and remove duplicate records using a temporary table.

  • FirstMatch: When scanning the inner tables for row combinations and there are multiple instances of a given value group, choose one rather than returning them all. This "shortcuts" scanning and eliminates production of unnecessary rows.

  • LooseScan: Scan a subquery table using an index that enables a single value to be chosen from each subquery's value group.

  • Materialize the subquery into a temporary table with an index and use the temporary table to perform a join. The index is used to remove duplicates. The index might also be used later for lookups when joining the temporary table with the outer tables; if not, the table is scanned.
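
To make two of these strategies more concrete, here is a rough Python sketch of materialization and LooseScan. It is an illustration under simplifying assumptions, not MySQL's implementation: a set stands in for the indexed temporary table, sorting stands in for reading an index in order, and all function names and sample rows are hypothetical.

```python
from itertools import groupby


def materialize_then_join(outer_rows, inner_rows):
    """Materialization: build a deduplicated lookup structure, then probe it."""
    # The set plays the role of the indexed temporary table: duplicates
    # collapse on insert and membership tests act like index lookups.
    materialized = {row["class_num"] for row in inner_rows}
    return [row for row in outer_rows if row["class_num"] in materialized]


def loose_scan(outer_by_key, inner_rows):
    """LooseScan: take one row per value group from an ordered inner scan."""
    # Sorting stands in for scanning the subquery table in index order.
    ordered = sorted(inner_rows, key=lambda r: r["class_num"])
    result = []
    for class_num, _group in groupby(ordered, key=lambda r: r["class_num"]):
        # Only the group key is used; the rest of the group is skipped.
        outer = outer_by_key.get(class_num)
        if outer is not None:
            result.append(outer)
    return result


# Hypothetical sample data; student numbers are assumptions.
classes = [
    {"class_num": 1, "class_name": "class 1"},
    {"class_num": 2, "class_name": "class 2"},
    {"class_num": 3, "class_name": "class 3"},
]
roster = [
    {"class_num": 3, "student_num": 102},
    {"class_num": 2, "student_num": 101},
    {"class_num": 3, "student_num": 103},
]

by_materialization = materialize_then_join(classes, roster)
by_loose_scan = loose_scan({c["class_num"]: c for c in classes}, roster)
```

Both functions return each matching class once; they differ only in where the duplicates are removed, either when the temporary structure is built or while the ordered inner scan skips over repeated values.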


The semijoin flag of the optimizer_switch system variable controls whether semi-join optimization is available; it is enabled by default in 5.6.

