当两张表进行联结的时候,如果表1中的数据行是否出现在结果集中需要根据表2中出现或不出现至少一个相匹配的数据行来判断,这种情况就会发生半联结;而反联结便是半联结的补集,它们会作为数据库中常见的联结方法如NESTED LOOPS,MERGE SORT JOIN,HASH JOIN的选项出现。

实际上半联结和反联结本身也可以被认同是两种联结方法;在CBO优化模式下,优化器能够根据实际情况灵活的转换执行语句从而实现半联结和反联结方法,毕竟没有什么SQL语法可以显式的调用半联结和反联结,它们只是SQL语句满足某些条件时优化器可以选择的选项而已,不过仍然有必要深入这两种选项在特定情况下带来的性能优势。

半联结

半联结通常都发生在使用含有IN和EXISTS的相关子查询的时候,=ANY的用法与IN相同,所以也会出现发生半联结的情况;不过也是有例外的,在11gR2版本中,优化器不会为任何包含在OR分支中的子查询选择半联结,这也是现在官档中唯一明确标识的限制条件,来看几种发生场景:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
-- 使用IN关键字的相关子查询 => 发生NESTED LOOPS半联结
SQL> select department_name
2 from hr.departments dept
3 where department_id in (select department_id from hr.employees emp);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2605691773
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 190 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 10 | 190 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
-- 使用EXISTS关键字的相关子查询 => 发生NESTED LOOPS半联结
SQL> select department_name
2 from hr.departments dept where exists
3 (select null from hr.employees emp where emp.department_id = dept.department_id);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2605691773
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 190 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 10 | 190 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("EMP"."DEPARTMENT_ID"="DEPT"."DEPARTMENT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
-- 谓语中使用了OR分支中的EXISTS子查询 => 禁用半联结
SQL> select department_name
2 from hr.departments dept
3 where 1=2 OR exists
4 (select null from hr.employees emp where emp.department_id = dept.department_id);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 440241596
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 27 | 432 | 4 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 2 | 6 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS (SELECT 0 FROM "HR"."EMPLOYEES" "EMP" WHERE
"EMP"."DEPARTMENT_ID"=:B1))
3 - access("EMP"."DEPARTMENT_ID"=:B1)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
35 consistent gets
0 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed

从结果集来看,我们很容易联想到内联结,那为什么半联结通常来说会获得更高的性能呢?这实际也是半联结优化的关键,拿NESTED LOOPS来举例,在NESTED LOOPS联结中,驱动表被读取后需要逐个的进入内层循环来进行匹配工作,并且只有当外层循环的数据行与内层循环中的每一行数据匹配运算完成后才会结束一个结果集的获取;而相对而言,半联结的区别在于数据集1中的每一条记录只返回一次,而不管数据集2中有几条匹配的记录,因此,半联结会在找到子查询中匹配到的第一条数据后立即结束处理从而提高性能。

对于某些需要利用半联结来提高性能的场景,可以通过手动的方式控制半联结的执行计划,使用SEMIJOIN和NO_SEMIJOIN提示分别可以指定优化器使用和禁用半联结。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
-- 使用NO_SEMIJOIN提示禁用半联结
SQL> select department_name
2 from hr.departments dept
3 where department_id in (select /*+ no_semijoin */department_id from hr.employees emp);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3372191744
------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 106 | 1802 | 4 (25)| 00:00:01 |
| 1 | VIEW | VM_NWVW_2 | 106 | 1802 | 4 (25)| 00:00:01 |
| 2 | HASH UNIQUE | | 106 | 2544 | 4 (25)| 00:00:01 |
| 3 | NESTED LOOPS | | 106 | 2544 | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 567 | 3 (0)| 00:00:01 |
|* 5 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 4 | 12 | 0 (0)| 00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
506 recursive calls
0 db block gets
188 consistent gets
7 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
10 sorts (memory)
0 sorts (disk)
11 rows processed

除此之外,我们还可以使用always_semi_join隐藏参数选择半联结的联结类型,有关always_semi_join参数的可选值:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
SQL> SELECT
2 PARNO_KSPVLD_VALUES pvalid_par#,
3 NAME_KSPVLD_VALUES pvalid_name,
4 VALUE_KSPVLD_VALUES pvalid_value,
5 DECODE(ISDEFAULT_KSPVLD_VALUES, 'FALSE', '', 'DEFAULT' ) pvalid_default
6 FROM
7 X$KSPVLD_VALUES
8 WHERE
9 LOWER(NAME_KSPVLD_VALUES) LIKE '%'||LOWER(nvl('&pname',name_kspvld_values))||'%'
10 ORDER BY
11 pvalid_par#,
12 pvalid_default,
13 pvalid_Value
14 /
PAR# PARAMETER VALUE DEFAULT
------ -------------------------------------------------- ------------------------------ -------
1705 _always_semi_join CHOOSE
_always_semi_join HASH
_always_semi_join MERGE
_always_semi_join NESTED_LOOPS
_always_semi_join OFF

该参数的默认值为choose,表示选用半联结的类型由优化器来决定,下面来使用_always_semi_join参数将上面的NESTED LOOPS半联结改变为HASH JOIN半联结:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
-- 默认发生NESTED LOOPS SEMI
SQL> select department_name
2 from hr.departments dept
3 where department_id in (select department_id from hr.employees emp);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2605691773
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 190 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS SEMI | | 10 | 190 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
11 consistent gets
0 physical reads
0 redo size
742 bytes sent via SQL*Net to client
524 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
11 rows processed
-- session级别修改参数
SQL> alter session set "_always_semi_join"=merge;
Session altered.
-- 发生MERGE JOIN SEMI
SQL> select department_name
2 from hr.departments dept
3 where department_id in (select department_id from hr.employees emp);
11 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 954076352
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10 | 190 | 4 (25)| 00:00:01 |
| 1 | MERGE JOIN SEMI | | 10 | 190 | 4 (25)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPARTMENTS | 27 | 432 | 2 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | DEPT_ID_PK | 27 | | 1 (0)| 00:00:01 |
|* 4 | SORT UNIQUE | | 107 | 321 | 2 (50)| 00:00:01 |
| 5 | INDEX FULL SCAN | EMP_DEPARTMENT_IX | 107 | 321 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
filter("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5 consistent gets
1 physical reads
0 redo size
742 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
11 rows processed
-- 从trace来看优化器的选择还是非常可靠的

反联结

从本质上来说,反联结和半联结很多相似的因素,反联结的发生通常是在使用含有NOT IN,NOT EXISTS的相关子查询的时候,同样,如果子查询谓语OR分支中,反联结也会被禁用,它和半联结主要的不同点还是在返回数据的匹配方式上,它是会返回在子查询中没有匹配到的数据行,不过其优化的原理是一致的,通过在子查询中找到第一条匹配记录而立即停止处理来提高效率,一下是发生的集中场景:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
SQL> set autotrace traceonly
-- NOT IN 触发反联结
SQL> select department_name
2 from hr.departments
3 where department_id not in
4 (select department_id from hr.employees where department_id is not null);
16 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3082375452
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 323 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS ANTI | | 17 | 323 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
filter("DEPARTMENT_ID" IS NOT NULL)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
6 physical reads
0 redo size
985 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
16 rows processed
-- EXISTS触发反联结
SQL> select department_name
2 from hr.departments dept
3 where not exists
4 (select null from hr.employees emp where emp.department_id = dept.department_id);
16 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3082375452
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 323 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS ANTI | | 17 | 323 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("EMP"."DEPARTMENT_ID"="DEPT"."DEPARTMENT_ID")
Statistics
----------------------------------------------------------
3 recursive calls
0 db block gets
13 consistent gets
0 physical reads
0 redo size
985 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
16 rows processed

在上例中,分别使用了NOT IN和NOT EXISTS两种方法来发生反联结,可以注意到为了获得两个等价的查询,我在NOT IN子查询中进行了去除NULL的操作,这是因为在SQL标准中,如果NOT IN运算中返回一个空值,默认整个查询不会返回任何记录,所以为了获得正确的数据,需要在子查询中增加对空值的处理,我这里使用了IS NOT NULL的判别,同样也可以使用NVL函数;当然这个操作的目的只是为了获得正确的数据结果集,在11g版本中,不处理空值一样能够发生反联结:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
SQL> select department_name
2 from hr.departments
3 where department_id not in (select department_id from hr.employees);
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 4201340344
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 323 | 6 (17)| 00:00:01 |
| 1 | MERGE JOIN ANTI NA | | 17 | 323 | 6 (17)| 00:00:01 |
| 2 | SORT JOIN | | 27 | 432 | 2 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| DEPARTMENTS | 27 | 432 | 2 (0)| 00:00:01 |
| 4 | INDEX FULL SCAN | DEPT_ID_PK | 27 | | 1 (0)| 00:00:01 |
|* 5 | SORT UNIQUE | | 107 | 321 | 4 (25)| 00:00:01 |
| 6 | TABLE ACCESS FULL | EMPLOYEES | 107 | 321 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
5 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
filter("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
1140 recursive calls
0 db block gets
301 consistent gets
13 physical reads
0 redo size
343 bytes sent via SQL*Net to client
513 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
15 sorts (memory)
0 sorts (disk)
0 rows processed

上例中可以看到ANTI NA的选项,这是11g中新的优化方法,它会允许优化器在不知道子查询是否返回NULL的情况下处理NOT IN类型的查询,11g之前的版本中这种情况就不会发生反联结的,不过仍然需要记住,这仅仅是加快了处理速度,对于获得正确的结果集没有任何帮助!

反联结的发生情况很容易让人联想到minus操作,实际上minus确实也可以替代NOT IN和NOT EXISTS操作,同样的还有外联结操作:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
-- minus代替用法
SQL> select department_name
2 from hr.departments
3 where department_id in
4 (
5 select department_id from hr.departments
6 minus
7 select department_id from hr.employees
8 );
DEPARTMENT_NAME
------------------------------
Treasury
Corporate Tax
Control And Credit
Shareholder Services
Benefits
Manufacturing
Construction
Contracting
Operations
IT Support
NOC
DEPARTMENT_NAME
------------------------------
IT Helpdesk
Government Sales
Retail Sales
Recruiting
Payroll
16 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2972564128
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 27 | 783 | 9 (34)| 00:00:01 |
| 1 | MERGE JOIN | | 27 | 783 | 9 (34)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| DEPARTMENTS | 27 | 432 | 2 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | DEPT_ID_PK | 27 | | 1 (0)| 00:00:01 |
|* 4 | SORT JOIN | | 27 | 351 | 7 (43)| 00:00:01 |
| 5 | VIEW | VW_NSO_1 | 27 | 351 | 6 (34)| 00:00:01 |
| 6 | MINUS | | | | | |
| 7 | SORT UNIQUE | | 27 | 108 | | |
| 8 | INDEX FULL SCAN | DEPT_ID_PK | 27 | 108 | 1 (0)| 00:00:01 |
| 9 | SORT UNIQUE | | 107 | 321 | | |
| 10 | TABLE ACCESS FULL | EMPLOYEES | 107 | 321 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("DEPARTMENT_ID"="DEPARTMENT_ID")
filter("DEPARTMENT_ID"="DEPARTMENT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
985 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
16 rows processed
-- outer join的代替用法
SQL> select department_name
2 from hr.departments dept left outer join hr.employees emp
3 on dept.department_id = emp.department_id
4 where emp.department_id is null;
DEPARTMENT_NAME
------------------------------
Treasury
Corporate Tax
Control And Credit
Shareholder Services
Benefits
Manufacturing
Construction
Contracting
Operations
IT Support
NOC
DEPARTMENT_NAME
------------------------------
IT Helpdesk
Government Sales
Retail Sales
Recruiting
Payroll
16 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3082375452
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 323 | 3 (0)| 00:00:01 |
| 1 | NESTED LOOPS ANTI | | 17 | 323 | 3 (0)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 41 | 123 | 0 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPT"."DEPARTMENT_ID"="EMP"."DEPARTMENT_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
12 consistent gets
0 physical reads
0 redo size
985 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
16 rows processed

从上面的例子可以看出minus和outer join操作都可以巧妙的实现同样的结果,不过从执行计划来看,minus操作显然没有反联结操作优化,而使用outer join虽然发生了反联结优化,但是由于使用了带空值的虚拟记录来匹配数据行,不便于理解,因此实际还是不建议使用的。

如果想要手动控制反联结的执行计划,这里也有一些hint和参数可以使用,常用的hint有:

  • ANTIJOIN-进行反联结,优化器决定联结类型
  • USE_ANTI-老版本的提示,和ANTIJOIN功能一致
  • [NL_AJ] | [HASH_AJ] | [MERGE_AJ]-指定发生反联结的类型(10g开始被弃用,不过仍然可以生效)

在参数控制方面,也有个和_always_semi_join非常相同的_always_anti_join参数,用法完全一致;还有参数_optimizer_null_aware_antijoin,_optimizer_outer_to_anti_enable用于控制对含空值和外联结的反联结转换。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
-- 使用hint显式指定反联结类型
SQL> select department_name
2 from hr.departments dept
3 where not exists (select /*+ hash_aj */ null from hr.employees emp
4 where emp.department_id = dept.department_id);
16 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3587451639
----------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 17 | 323 | 5 (20)| 00:00:01 |
|* 1 | HASH JOIN ANTI | | 17 | 323 | 5 (20)| 00:00:01 |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
| 3 | INDEX FULL SCAN | EMP_DEPARTMENT_IX | 107 | 321 | 1 (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("EMP"."DEPARTMENT_ID"="DEPT"."DEPARTMENT_ID")
Statistics
----------------------------------------------------------
566 recursive calls
0 db block gets
193 consistent gets
0 physical reads
0 redo size
985 bytes sent via SQL*Net to client
535 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
12 sorts (memory)
0 sorts (disk)
16 rows processed
-- 使用_optimizer_null_antijoin参数关闭反联结中的空值考虑选项-即返回空值的情况不使用反联结
SQL> alter session set "_optimizer_null_aware_antijoin"=false;
Session altered.
SQL> select department_name
2 from hr.departments
3 where department_id not in (select department_id from hr.employees);
no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 3416340233
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 26 | 416 | 30 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL| DEPARTMENTS | 27 | 432 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| EMPLOYEES | 2 | 6 | 2 (0)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( NOT EXISTS (SELECT 0 FROM "HR"."EMPLOYEES" "EMPLOYEES"
WHERE LNNVL("DEPARTMENT_ID"<>:B1)))
3 - filter(LNNVL("DEPARTMENT_ID"<>:B1))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
172 consistent gets
0 physical reads
0 redo size
343 bytes sent via SQL*Net to client
513 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed