Hash Join: Basic Steps

Joins https://docs.oracle.com/database/121/TGSQL/tgsql_join.htm#TGSQL242

tidb/index_lookup_hash_join.go at master · pingcap/tidb https://github.com/pingcap/tidb/blob/master/executor/index_lookup_hash_join.go

Hash Join: Basic Steps

The optimizer uses the smaller data source to build a hash table on the join key in memory, and then scans the larger table to find the joined rows.

The basic steps are as follows:

The database performs a full scan of the smaller data set, called the build table, and then applies a hash function to the join key in each row to build a hash table in the PGA.

In pseudocode, the algorithm might look as follows:
```
FOR small_table_row IN (SELECT * FROM small_table)

LOOP

  slot_number := HASH(small_table_row.join_key);

  INSERT_HASH_TABLE(slot_number,small_table_row);

END LOOP;
```
The database probes the second data set, called the probe table, using whichever access mechanism has the lowest cost.

Typically, the database performs a full scan of both the smaller and larger data set. The algorithm in pseudocode might look as follows:
```
FOR large_table_row IN (SELECT * FROM large_table)

LOOP

   slot_number := HASH(large_table_row.join_key);

   small_table_row = LOOKUP_HASH_TABLE(slot_number,large_table_row.join_key);

   IF small_table_row FOUND

   THEN

      output small_table_row + large_table_row;

   END IF;

END LOOP;
```
For each row retrieved from the larger data set, the database does the following:
1. Applies the same hash function to the join column or columns to calculate the number of the relevant slot in the hash table.
  
  For example, to probe the hash table for department ID 30, the database applies the hash function to 30, which generates the hash value 4.
2. Probes the hash table to determine whether rows exists in the slot.
  
  If no rows exist, then the database processes the next row in the larger data set. If rows exist, then the database proceeds to the next step.
3. Checks the join column or columns for a match. If a match occurs, then the database either reports the rows or passes them to the next step in the plan, and then processes the next row in the larger data set.
  
  If multiple rows exist in the hash table slot, the database walks through the linked list of rows, checking each one. For example, if department 30 hashes to slot 4, then the database checks each row until it finds 30.

Example 9-4 Hash Joins

An application queries the oe.orders and oe.order_items tables, joining on the order_id column.

SELECT o.customer_id, l.unit_price * l.quantity

FROM   orders o, order_items l

WHERE  l.order_id = o.order_id;

The execution plan is as follows:

--------------------------------------------------------------------------

| Id  | Operation            |  Name        | Rows  | Bytes | Cost (%CPU)|

--------------------------------------------------------------------------

|   0 | SELECT STATEMENT     |              |   665 | 13300 |     8  (25)|

|*  1 |  HASH JOIN           |              |   665 | 13300 |     8  (25)|

|   2 |   TABLE ACCESS FULL  | ORDERS       |   105 |   840 |     4  (25)|

|   3 |   TABLE ACCESS FULL  | ORDER_ITEMS  |   665 |  7980 |     4  (25)|

--------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

   1 - access("L"."ORDER_ID"="O"."ORDER_ID")

Because the orders table is small relative to the order_items table, which is 6 times larger, the database hashes orders. In a hash join, the data set for the build table always appears first in the list of operations (Step 2). In Step 3, the database performs a full scan of the larger order_items later, probing the hash table for each row.

How Hash Joins Work When the Hash Table Does Not Fit in the PGA

The database must use a different technique when the hash table does not fit entirely in the PGA. In this case, the database uses a temporary space to hold portions (called partitions) of the hash table, and sometimes portions of the larger table that probes the hash table.

The basic process is as follows:

The database performs a full scan of the smaller data set, and then builds an array of hash buckets in both the PGA and on disk.

When the PGA hash area fills up, the database finds the largest partition within the hash table and writes it to temporary space on disk. The database stores any new row that belongs to this on-disk partition on disk, and all other rows in the PGA. Thus, part of the hash table is in memory and part of it on disk.
The database takes a first pass at reading the other data set.

For each row, the database does the following:
1. Applies the same hash function to the join column or columns to calculate the number of the relevant hash bucket.
2. Probes the hash table to determine whether rows exist in the bucket in memory.
  
  If the hashed value points to a row in memory, then the database completes the join and returns the row. If the value points to a hash partition on disk, however, then the database stores this row in the temporary tablespace, using the same partitioning scheme used for the original data set.
The database reads each on-disk temporary partition one by one
The database joins each partition row to the row in the corresponding on-disk temporary partition.

Hash Join Controls

The USE_HASH hint instructs the optimizer to use a hash join when joining two tables together.