[Java] A Brief Look at HashMap

HashMap is a commonly used collection class that stores data as key-value pairs. Let's work through its implementation at the source level.

Constructors

It has several constructors, but almost all of them delegate to this one:

    public HashMap(int initialCapacity, float loadFactor) { // initial capacity, load factor
        if (initialCapacity < 0) // reject a negative initial capacity
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY; // clamp to the maximum capacity
        if (loadFactor <= 0 || Float.isNaN(loadFactor)) // validate the load factor
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);

        // Find a power of 2 >= initialCapacity
        int capacity = 1;
        // Round up to the smallest power of two >= initialCapacity; that is the real capacity.
        // Why? I believe it is because computing the bucket index with & is cheaper.
        while (capacity < initialCapacity)
            capacity <<= 1;

        this.loadFactor = loadFactor;
        threshold = (int)Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1); // resize threshold: capacity * load factor
        table = new Entry[capacity]; // allocate the bucket array
        useAltHashing = sun.misc.VM.isBooted() &&
                (capacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
        init(); // hook called once construction is done, before any entry is inserted
    }
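
To make the rounding and the threshold arithmetic concrete, here is a minimal sketch that copies just those two steps out of the constructor. The class name CapacitySketch and the helper names capacityFor and thresholdFor are made up for illustration; only the arithmetic comes from the listing above.

```java
// Illustrative sketch: the class and method names are hypothetical,
// the arithmetic mirrors the constructor shown above.
class CapacitySketch {
    static final int MAXIMUM_CAPACITY = 1 << 30; // same constant value as in HashMap

    // Smallest power of two >= initialCapacity, clamped to MAXIMUM_CAPACITY.
    static int capacityFor(int initialCapacity) {
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        int capacity = 1;
        while (capacity < initialCapacity)
            capacity <<= 1;
        return capacity;
    }

    // Resize threshold: capacity * loadFactor, capped at MAXIMUM_CAPACITY + 1.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

    public static void main(String[] args) {
        for (int requested : new int[] {0, 1, 10, 16, 17, 1000}) {
            int capacity = capacityFor(requested);
            System.out.println("requested " + requested + " -> capacity " + capacity
                    + ", threshold " + thresholdFor(capacity, 0.75f));
        }
    }
}
```

With the default load factor of 0.75, a requested capacity of 10 becomes a table of 16 buckets with a threshold of 12.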

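Inserting an entry: put(K key, V value)

In the version shown here the logic lives in put itself (the putVal method only exists in the JDK 8 rewrite):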

    public V put(K key, V value) {
        if (key == null)
            return putForNullKey(value); // a null key is stored in table[0]
        int hash = hash(key); // compute the hash
        int i = indexFor(hash, table.length); // compute the bucket index
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            // Compare hashes first (different hashes can land in the same bucket),
            // then check reference equality or equals()
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value; // remember the old value
                e.value = value; // store the new value
                e.recordAccess(this); // callback for an overwritten value
                return oldValue; // return the old value
            }
        }

        modCount++; // count a structural modification
        addEntry(hash, key, value, i); // build an Entry and link it into the chain of bucket i
        return null;
    }
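
The return value of put follows directly from the two exits in the listing: the old value when a key is overwritten, null when a new entry is added. A small usage sketch against the public java.util.HashMap API shows this, including the null key; the class name PutReturnDemo and the helper putResults are made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Usage sketch; PutReturnDemo and putResults are illustrative names.
class PutReturnDemo {
    // Runs a few put calls and collects what each one returns.
    static List<Integer> putResults() {
        Map<String, Integer> map = new HashMap<>();
        List<Integer> results = new ArrayList<>();
        results.add(map.put("a", 1));   // null: no previous mapping
        results.add(map.put("a", 2));   // 1: the overwritten value
        results.add(map.put(null, 3));  // null: first mapping for the null key
        results.add(map.put(null, 4));  // 3: the null key is overwritten like any other
        results.add(map.size());        // 2: "a" and the null key
        return results;
    }

    public static void main(String[] args) {
        System.out.println(putResults()); // prints [null, 1, null, 3, 2]
    }
}
```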

Computing the bucket index from the hash and the table length: indexFor(int h, int length)

    static int indexFor(int h, int length) {
        return h & (length-1); // AND the hash with length - 1 to get the index
    }

The overwrite callback, invoked when put(k,v) replaces an existing value: recordAccess()

        /**
         * This method is invoked whenever the value in an entry is
         * overwritten by an invocation of put(k,v) for a key k that's already
         * in the HashMap.
         */
        void recordAccess(HashMap<K,V> m) {
        }

Structural modification count: modCount

This field counts the structural modifications made to the HashMap, such as adding a new entry, rehashing, or removing an entry. Iterators use it to implement their fail-fast behavior.

    /**
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient int modCount;
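
The fail-fast behavior can be observed from the outside. The sketch below iterates over a java.util.HashMap while modifying it: adding a new key is a structural change and trips the iterator, while overwriting the value of an existing key is not. The class name FailFastDemo and the helper throwsCme are made up for the example.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Map;

// Usage sketch; FailFastDemo and throwsCme are illustrative names.
class FailFastDemo {
    // Returns true if modifying the map during iteration raised
    // ConcurrentModificationException.
    static boolean throwsCme(boolean structural) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        try {
            for (String key : map.keySet()) {
                if (structural) {
                    map.put(key + "-copy", 0); // new mapping: structural change
                } else {
                    map.put(key, 0);           // overwrite only: not structural
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("adding keys while iterating throws: " + throwsCme(true));
        System.out.println("overwriting values while iterating throws: " + throwsCme(false));
    }
}
```

Fail-fast detection is best effort, so it is a debugging aid rather than something to rely on for correctness.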

Adding an entry: addEntry()

This method decides whether the table needs to grow and, if so, calls the resize method:

    /**
     * Adds a new entry with the specified key, value and hash code to
     * the specified bucket.  It is the responsibility of this
     * method to resize the table if appropriate.
     *
     * Subclass overrides this to alter the behavior of put method.
     */
    void addEntry(int hash, K key, V value, int bucketIndex) {
        // Resize only when size has reached the threshold and the target bucket is already occupied
        if ((size >= threshold) && (null != table[bucketIndex])) {
            resize(2 * table.length); // double the capacity
            hash = (null != key) ? hash(key) : 0;
            bucketIndex = indexFor(hash, table.length); // recompute the index against the larger table
        }

        createEntry(hash, key, value, bucketIndex);
    }

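The actual entry creation: createEntry()

    void createEntry(int hash, K key, V value, int bucketIndex) {
        Entry<K,V> e = table[bucketIndex]; // current head of the chain
        // The new node's next points at the old head, then the bucket points at the new node (head insertion)
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++; // increment the entry count (size, not capacity)
    }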
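
Head insertion means the most recently added entry of a bucket sits at the front of its chain. The sketch below reproduces only that linking step with a stripped-down node type; HeadInsertSketch, Node and chainAfterInserting are hypothetical names, not part of HashMap.

```java
// Illustrative sketch of head insertion into a single bucket's chain.
class HeadInsertSketch {
    // Hypothetical stand-in for HashMap.Entry: a key and a next pointer.
    static final class Node {
        final String key;
        final Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Inserts each key at the head of the chain, then renders the chain.
    static String chainAfterInserting(String... keys) {
        Node head = null;
        for (String key : keys) {
            head = new Node(key, head); // new node points at the old head
        }
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(chainAfterInserting("A", "B", "C")); // prints C -> B -> A
    }
}
```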

Growing the table: resize()

    void resize(int newCapacity) {
        Entry[] oldTable = table; // keep a reference to the old table
        int oldCapacity = oldTable.length; // old capacity
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        Entry[] newTable = new Entry[newCapacity]; // allocate the new table
        boolean oldAltHashing = useAltHashing;
        useAltHashing |= sun.misc.VM.isBooted() &&
                (newCapacity >= Holder.ALTERNATIVE_HASHING_THRESHOLD);
        boolean rehash = oldAltHashing ^ useAltHashing; // rehash only if the hashing mode just changed
        transfer(newTable, rehash); // move every entry into the new table
        table = newTable; // switch over to the new table
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1); // recompute the threshold
    }

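Moving every entry into the new table

Walk the entries one by one and relink each into the matching chain of the new table:

    void transfer(Entry[] newTable, boolean rehash) {
        int newCapacity = newTable.length;
        for (Entry<K,V> e : table) { // walk every bucket
            while(null != e) { // walk the chain in that bucket
                Entry<K,V> next = e.next;
                if (rehash) {
                    e.hash = null == e.key ? 0 : hash(e.key); // recompute the hash
                }
                int i = indexFor(e.hash, newCapacity); // recompute the index
                e.next = newTable[i]; // point this node at the current head of the new chain (head insertion)
                newTable[i] = e; // make this node the new head
                e = next;
            }
        }
    }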
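
Because the capacity doubles and the index is a bit mask, an entry that does not get rehashed can only end up in one of two places: its old index, or its old index plus the old capacity. The sketch below checks that property; ResizeIndexSketch and staysOrMovesByOldCapacity are illustrative names, and indexFor is copied from the listing earlier in the article.

```java
// Illustrative sketch: where an entry lands after the table doubles.
class ResizeIndexSketch {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True if the index after doubling is unchanged or shifted by oldCapacity.
    static boolean staysOrMovesByOldCapacity(int hash, int oldCapacity) {
        int oldIndex = indexFor(hash, oldCapacity);
        int newIndex = indexFor(hash, oldCapacity * 2);
        return newIndex == oldIndex || newIndex == oldIndex + oldCapacity;
    }

    public static void main(String[] args) {
        // These four hashes all share bucket 5 in a table of 16.
        for (int hash : new int[] {5, 21, 37, 53}) {
            System.out.println("hash " + hash
                    + ": index " + indexFor(hash, 16)
                    + " -> " + indexFor(hash, 32));
        }
    }
}
```

Hashes 5 and 37 stay in bucket 5, while 21 and 53 move to bucket 21, so one crowded chain is split across two buckets.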

Removing an entry: remove()

This is the entry point for removal; it simply delegates to removeEntryForKey:

    public V remove(Object key) {
        Entry<K,V> e = removeEntryForKey(key);
        return (e == null ? null : e.value);
    }

The actual removal: removeEntryForKey()

    final Entry<K,V> removeEntryForKey(Object key) {
        int hash = (key == null) ? 0 : hash(key); // compute the hash
        int i = indexFor(hash, table.length); // compute the index
        Entry<K,V> prev = table[i]; // head of the chain in that bucket
        Entry<K,V> e = prev;

        while (e != null) {
            Entry<K,V> next = e.next;
            Object k;
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k)))) {
                modCount++; // removal is a structural modification too
                size--; // decrement the entry count
                if (prev == e) // the match is the head of the chain
                    table[i] = next; // the bucket now starts at the next node
                else
                    prev.next = next; // unlink: the previous node skips over the match
                e.recordRemoval(this); // post-removal callback
                return e;
            }
            prev = e;
            e = next;
        }

        return e;
    }
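
The unlinking logic is ordinary singly linked list removal with a trailing prev pointer. Here is a minimal standalone sketch of the same walk over one chain; ChainRemoveSketch, Node, chain, remove and render are hypothetical names, and keys are compared with equals only (the real method also compares hashes first).

```java
// Illustrative sketch of unlinking a node from one bucket's chain.
class ChainRemoveSketch {
    // Hypothetical stand-in for HashMap.Entry.
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Builds a chain in the given order: chain("A", "B") is A -> B.
    static Node chain(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    // Removes the first node whose key equals the given key; returns the new head.
    static Node remove(Node head, String key) {
        Node prev = head;
        Node e = head;
        while (e != null) {
            Node next = e.next;
            if (e.key.equals(key)) {
                if (prev == e) {
                    head = next;      // the match was the head
                } else {
                    prev.next = next; // the previous node skips over the match
                }
                return head;
            }
            prev = e;
            e = next;
        }
        return head; // key not found: chain unchanged
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append(" -> ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(remove(chain("A", "B", "C"), "B"))); // prints A -> C
        System.out.println(render(remove(chain("A", "B", "C"), "A"))); // prints B -> C
    }
}
```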

Looking up an entry: get()

    public V get(Object key) {
        if (key == null)
            return getForNullKey(); // a null key is looked up in table[0]
        // Compute the index, then walk the chain comparing keys (the same search as in put and remove)
        Entry<K,V> entry = getEntry(key);
        return null == entry ? null : entry.getValue();
    }
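
One consequence of the last line of get is that a null result is ambiguous: the key may be absent, or it may be mapped to null. A short usage sketch against java.util.HashMap shows how containsKey tells the two apart; GetNullDemo and describe are names made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Usage sketch; GetNullDemo and describe are illustrative names.
class GetNullDemo {
    // Distinguishes "absent" from "present but mapped to null".
    static String describe(Map<String, String> map, String key) {
        String value = map.get(key);
        if (value != null) {
            return "value=" + value;
        }
        return map.containsKey(key) ? "mapped to null" : "absent";
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("x", "1");
        map.put("k", null); // HashMap allows null values
        System.out.println(describe(map, "x"));       // prints value=1
        System.out.println(describe(map, "k"));       // prints mapped to null
        System.out.println(describe(map, "missing")); // prints absent
    }
}
```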

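A few questions

Rounding up to a power of two: roundUpToPowerOf2(int number)

This method returns the smallest power of two that is greater than or equal to number.

Integer.highestOneBit() returns the value of the highest set bit. For a positive argument that is the largest power of two less than or equal to it; for a negative argument it is -2147483648, i.e. Integer.MIN_VALUE.

So why subtract 1 before shifting left and calling highestOneBit()?

Look at what happens in binary when the subtraction is left out:

00001111 = 15 -> 00011110 = 30 -> highestOneBit(30) = 16

00010000 = 16 -> 00100000 = 32 -> highestOneBit(32) = 32

15 correctly rounds up to 16, but 16, which is already a power of two, would jump to 32. Subtracting 1 first fixes that case: (16 - 1) << 1 = 30, and highestOneBit(30) = 16.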

    private static int roundUpToPowerOf2(int number) {
        // assert number >= 0 : "number must be non-negative";
        return number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }
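
The effect of the subtraction is easy to check by running the method next to a variant that omits it. In the sketch below roundUpToPowerOf2 is copied from the listing above, while the class name RoundUpSketch and the comparison method withoutMinusOne are made up for illustration.

```java
// Illustrative sketch comparing the real rounding with a variant that skips "- 1".
class RoundUpSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30; // same constant value as in HashMap

    // Copied from the listing above.
    static int roundUpToPowerOf2(int number) {
        return number >= MAXIMUM_CAPACITY
                ? MAXIMUM_CAPACITY
                : (number > 1) ? Integer.highestOneBit((number - 1) << 1) : 1;
    }

    // Hypothetical variant without the subtraction, for comparison only.
    static int withoutMinusOne(int number) {
        return Integer.highestOneBit(number << 1);
    }

    public static void main(String[] args) {
        for (int n : new int[] {15, 16, 17}) {
            System.out.println(n + ": with -1 = " + roundUpToPowerOf2(n)
                    + ", without -1 = " + withoutMinusOne(n));
        }
    }
}
```

For 15 and 17 both versions agree (16 and 32), but for 16 the variant without the subtraction returns 32 instead of 16.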

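Mapping a hash into a given range: indexFor(int h, int length)

It maps h into the range [0, length), much like a modulo operation.

    return h & (length-1);

All it takes is a bitwise AND of h with length - 1.

For example, with a length of 16:

16 = 00010000

15 = 00001111

ANDing with 15 keeps only the low four bits of h, so the result always falls between 0 and 15.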

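Why is the table length always a power of two?

There are generally two ways to map a hash value into a fixed range: take it modulo the range size, or, when the size is a power of two, AND it with the size minus one. Division and modulo are comparatively slow on modern CPUs, so the second approach is faster, which is why the table length is kept at a power of two.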
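
The equivalence between masking and modulo only holds for power-of-two lengths, which a small check makes visible. In the sketch below indexFor is the method from the article; MaskVsModSketch and matchesFloorMod are illustrative names, and Math.floorMod (available since Java 8) is used as the reference because the plain % operator returns negative results for negative hashes.

```java
// Illustrative sketch: masking equals floor-modulo only for power-of-two lengths.
class MaskVsModSketch {
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    // True if masking agrees with floorMod for every hash in a sample range.
    static boolean matchesFloorMod(int length) {
        for (int h = -1000; h <= 1000; h++) {
            if (indexFor(h, length) != Math.floorMod(h, length)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("length 16 matches floorMod: " + matchesFloorMod(16));
        System.out.println("length 12 matches floorMod: " + matchesFloorMod(12));
        System.out.println("-7 & 15 = " + indexFor(-7, 16) + ", -7 % 16 = " + (-7 % 16));
    }
}
```

With a length of 12 the mask is 1011 in binary, so bit 2 is always cleared and some buckets (4 through 7) can never be selected.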

Further reading: LinkedHashMap

Its declaration shows that it extends HashMap, and HashMap still provides the actual storage logic:

    public class LinkedHashMap<K,V>
        extends HashMap<K,V>
        implements Map<K,V>

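The order maintained by the linked list

LinkedHashMap keeps track of iteration order with a separate doubly linked list. For example, the head of that list:

    /**
     * The head of the doubly linked list.
     */
    private transient Entry<K,V> header;

The links between entries:

        // These fields comprise the doubly linked list used for iteration.
        Entry<K,V> before, after;

This field selects which order the list maintains: true for access order, false for insertion order:

    private final boolean accessOrder;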
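
The difference between the two modes shows up as soon as an existing key is read. The usage sketch below relies on the public three-argument LinkedHashMap constructor, whose last parameter sets accessOrder; AccessOrderDemo and keysAfterGet are names made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Usage sketch; AccessOrderDemo and keysAfterGet are illustrative names.
class AccessOrderDemo {
    // Inserts a, b, c, reads a, and returns the resulting iteration order.
    static String keysAfterGet(boolean accessOrder) {
        Map<String, Integer> map = new LinkedHashMap<>(16, 0.75f, accessOrder);
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        map.get("a"); // only matters in access-order mode
        return map.keySet().toString();
    }

    public static void main(String[] args) {
        System.out.println("insertion order: " + keysAfterGet(false)); // prints [a, b, c]
        System.out.println("access order:    " + keysAfterGet(true));  // prints [b, c, a]
    }
}
```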

Inserting an entry

It overrides HashMap's addEntry and createEntry methods:

    /**
     * This override alters behavior of superclass put method. It causes newly
     * allocated entry to get inserted at the end of the linked list and
     * removes the eldest entry if appropriate.
     */
    void addEntry(int hash, K key, V value, int bucketIndex) {
        super.addEntry(hash, key, value, bucketIndex); // reuse HashMap's logic

        // Remove eldest entry if instructed
        Entry<K,V> eldest = header.after;
        if (removeEldestEntry(eldest)) { // should the eldest entry be evicted? (the hook for LRU-style caches)
            removeEntryForKey(eldest.key); // evict the eldest entry
        }
    }

    /**
     * This override differs from addEntry in that it doesn't resize the
     * table or remove the eldest entry.
     */
    void createEntry(int hash, K key, V value, int bucketIndex) {
        HashMap.Entry<K,V> old = table[bucketIndex];
        Entry<K,V> e = new Entry<>(hash, key, value, old);
        table[bucketIndex] = e;
        e.addBefore(header); // link in just before the header, i.e. at the tail of the list
        size++;
    }

        /**
         * Inserts this entry before the specified existing entry in the list.
         */
        private void addBefore(Entry<K,V> existingEntry) {
            after  = existingEntry; // this entry's successor is the given entry
            before = existingEntry.before; // this entry's predecessor is the given entry's old predecessor
            before.after = this; // the predecessor now points forward to this entry
            after.before = this; // the given entry now points back to this entry
        }
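
The removeEldestEntry hook called in addEntry is what makes LinkedHashMap a convenient base for a small LRU cache. Below is a minimal sketch under the assumption that a fixed entry limit is the only eviction rule; the class name LruCacheSketch and the field maxEntries are made up, while removeEldestEntry and the access-order constructor are the public LinkedHashMap API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative LRU cache sketch built on LinkedHashMap; not thread-safe.
class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries; // hypothetical size limit

    LruCacheSketch(int maxEntries) {
        super(16, 0.75f, true); // true selects access order
        this.maxEntries = maxEntries;
    }

    // Called after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCacheSketch<String, Integer> cache = new LruCacheSketch<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // exceeds the limit, so the eldest entry "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```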

Looking up an entry

    public V get(Object key) {
        Entry<K,V> e = (Entry<K,V>)getEntry(key);
        if (e == null)
            return null;
        e.recordAccess(this); // keep the list order up to date
        return e.value;
    }

        void recordAccess(HashMap<K,V> m) {
            LinkedHashMap<K,V> lm = (LinkedHashMap<K,V>)m;
            if (lm.accessOrder) { // only when ordering by access
                lm.modCount++;
                remove(); // unlink this entry from its current position
                addBefore(lm.header); // re-link it just before the header, i.e. at the tail (most recently accessed)
            }
        }

        /**
         * Removes this entry from the linked list.
         */
        private void remove() {
            before.after = after; // the predecessor now points forward past this entry
            after.before = before; // the successor now points back past this entry
        }
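
To see how remove() and addBefore(header) combine into "move to the tail", here is a minimal standalone sketch of a circular doubly linked list with a sentinel header. It assumes, as the JDK 7 source does, that header initially points at itself in both directions. AccessListSketch, Node, append, touch and render are hypothetical names; addBefore and remove mirror the listings above.

```java
// Illustrative sketch of the circular doubly linked list behind access ordering.
class AccessListSketch {
    static final class Node {
        final String key;
        Node before, after;

        Node(String key) {
            this.key = key;
        }

        // Links this node in just before the given node.
        void addBefore(Node existing) {
            after = existing;
            before = existing.before;
            before.after = this;
            after.before = this;
        }

        // Unlinks this node from the list.
        void remove() {
            before.after = after;
            after.before = before;
        }
    }

    // Sentinel: header.after is the eldest entry, header.before the newest.
    final Node header = new Node(null);

    AccessListSketch() {
        header.before = header.after = header; // empty list points at itself
    }

    Node append(String key) {
        Node n = new Node(key);
        n.addBefore(header); // before the header means at the tail
        return n;
    }

    // What recordAccess does in access-order mode.
    void touch(Node n) {
        n.remove();
        n.addBefore(header);
    }

    String render() {
        StringBuilder sb = new StringBuilder();
        for (Node n = header.after; n != header; n = n.after) {
            if (sb.length() > 0) sb.append(", ");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        AccessListSketch list = new AccessListSketch();
        Node a = list.append("a");
        list.append("b");
        list.append("c");
        list.touch(a);
        System.out.println(list.render()); // prints b, c, a
    }
}
```

After the touch, header.after is "b", which is exactly the entry that addEntry would offer to removeEldestEntry.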