原文地址:https://dev.mysql.com/doc/refman/5.1/en/partitioning-hash.html

HASH Partitioning

Partitioning by HASH is used primarily to ensure an even distribution of data among a predetermined number of partitions. With range or list partitioning, you must specify explicitly into which partition a given column value or set of column values is to be stored; with hash partitioning, MySQL takes care of this for you, and you need only specify a column value or expression based on a column value to be hashed and the number of partitions into which the partitioned table is to be divided.

To partition a table using HASH partitioning, it is necessary to append to the CREATE TABLE statement aPARTITION BY HASH (expr) clause, where expr is an expression that returns an integer. This can simply be the name of a column whose type is one of MySQL's integer types. In addition, you most likely want to follow this with PARTITIONS num, where num is a positive integer representing the number of partitions into which the table is to be divided.

Note

For simplicity, the tables in the examples that follow do not use any keys. You should be aware that, if a table has any unique keys, every column used in the partitioning expression for this this table must be part of every unique key, including the primary key. See Section 18.5.1, “Partitioning Keys, Primary Keys, and Unique Keys”, for more information.

The following statement creates a table that uses hashing on the store_id column and is divided into 4 partitions:

CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
)
PARTITION BY HASH(store_id)
PARTITIONS 4;

If you do not include a PARTITIONS clause, the number of partitions defaults to 1.

Using the PARTITIONS keyword without a number following it results in a syntax error.

You can also use an SQL expression that returns an integer for expr. For instance, you might want to partition based on the year in which an employee was hired. This can be done as shown here:

CREATE TABLE employees (
id INT NOT NULL,
fname VARCHAR(30),
lname VARCHAR(30),
hired DATE NOT NULL DEFAULT '1970-01-01',
separated DATE NOT NULL DEFAULT '9999-12-31',
job_code INT,
store_id INT
)
PARTITION BY HASH( YEAR(hired) )
PARTITIONS 4;

expr must return a nonconstant, nonrandom integer value (in other words, it should be varying but deterministic), and must not contain any prohibited constructs as described in Section 18.5, “Restrictions and Limitations on Partitioning”. You should also keep in mind that this expression is evaluated each time a row is inserted or updated (or possibly deleted); this means that very complex expressions may give rise to performance issues, particularly when performing operations (such as batch inserts) that affect a great many rows at one time.

The most efficient hashing function is one which operates upon a single table column and whose value increases or decreases consistently with the column value, as this allows for “pruning” on ranges of partitions. That is, the more closely that the expression varies with the value of the column on which it is based, the more efficiently MySQL can use the expression for hash partitioning.

For example, where date_col is a column of type DATE, then the expression TO_DAYS(date_col) is said to vary directly with the value of date_col, because for every change in the value of date_col, the value of the expression changes in a consistent manner. The variance of the expression YEAR(date_col) with respect todate_col is not quite as direct as that of TO_DAYS(date_col), because not every possible change in date_colproduces an equivalent change in YEAR(date_col). Even so, YEAR(date_col) is a good candidate for a hashing function, because it varies directly with a portion of date_col and there is no possible change indate_col that produces a disproportionate change in YEAR(date_col).

By way of contrast, suppose that you have a column named int_col whose type is INT. Now consider the expression POW(5-int_col,3) + 6. This would be a poor choice for a hashing function because a change in the value of int_col is not guaranteed to produce a proportional change in the value of the expression. Changing the value of int_col by a given amount can produce by widely different changes in the value of the expression. For example, changing int_col from 5 to 6 produces a change of -1 in the value of the expression, but changing the value of int_col from 6 to 7 produces a change of -7 in the expression value.

In other words, the more closely the graph of the column value versus the value of the expression follows a straight line as traced by the equation y=cx where c is some nonzero constant, the better the expression is suited to hashing. This has to do with the fact that the more nonlinear an expression is, the more uneven the distribution of data among the partitions it tends to produce.

In theory, pruning is also possible for expressions involving more than one column value, but determining which of such expressions are suitable can be quite difficult and time-consuming. For this reason, the use of hashing expressions involving multiple columns is not particularly recommended.

When PARTITION BY HASH is used, MySQL determines which partition of num partitions to use based on the modulus of the result of the user function. In other words, for an expression expr, the partition in which the record is stored is partition number N, where N = MOD(exprnum). Suppose that table t1 is defined as follows, so that it has 4 partitions:

CREATE TABLE t1 (col1 INT, col2 CHAR(5), col3 DATE)
PARTITION BY HASH( YEAR(col3) )
PARTITIONS 4;

If you insert a record into t1 whose col3 value is '2005-09-15', then the partition in which it is stored is determined as follows:

MOD(YEAR('2005-09-01'),4)
= MOD(2005,4)
= 1

MySQL 5.1 also supports a variant of HASH partitioning known as linear hashing which employs a more complex algorithm for determining the placement of new rows inserted into the partitioned table. See Section 18.2.3.1, “LINEAR HASH Partitioning”, for a description of this algorithm.

The user function is evaluated each time a record is inserted or updated. It may also—depending on the circumstances—be evaluated when records are deleted.

Note

If a table to be partitioned has a UNIQUE key, then any columns supplied as arguments to the HASH user function or to the KEY's column_list must be part of that key.

HASH Partitioning--转载的更多相关文章

  1. LINEAR HASH Partitioning

    MySQL :: MySQL 8.0 Reference Manual :: 23.2.4.1 LINEAR HASH Partitioning https://dev.mysql.com/doc/r ...

  2. Mysql 分区(range,list,hash)转载

    MySQL支持RANGE,LIST,HASH和KEY四种分区.其中,每个分区又都有一种特殊的类型.对于RANGE分区,有RANGE COLUMNS分区.对于LIST分区,有LIST COLUMNS分区 ...

  3. hash 分区的用途是什么?

    Hash partitioning enables easy partitioning of data that does not lend itself to rangeor list partit ...

  4. 【转】各种字符串Hash函数比较

    常用的字符串Hash函数还有ELFHash,APHash等等,都是十分简单有效的方法.这些函数使用位运算使得每一个字符都对最后的函数值产生影响.另外还有以MD5和SHA1为代表的杂凑函数,这些函数几乎 ...

  5. 分区实践 A PRIMARY KEY must include all columns in the table's partitioning function

    MySQL :: MySQL 8.0 Reference Manual :: 23 Partitioning https://dev.mysql.com/doc/refman/8.0/en/parti ...

  6. HASH、HASH函数、HASH算法的通俗理解

    之前经常遇到hash函数或者经常用到hash函数,但是hash到底是什么?或者hash函数到底是什么?却很少去考虑.最近同学去面试被问到这个问题,自己看文章也看到hash的问题.遂较为细致的追究了一番 ...

  7. redis该如何分区-译文(原创)

    写在最前,最近一直在研究redis的使用,包括redis应用场景.性能优化.可行性.这是看到redis官网中一个链接,主要是讲解redis数据分区的,既然是官方推荐的,那我就翻译一下,与大家共享. P ...

  8. Android ListView初始化简单分析

    下面是分析ListView初始化的源码流程分析,主要是ListVIew.onLayout过程与普通视图的layout过程完全不同,避免流程交代不清楚,以下是一个流程的思维导图. 思维导图是顺序是从左向 ...

  9. Redis短结构与分片

    本文将介绍两种降低Redis内存占用的方法——使用短结构存储数据和对数据进行分片. 降低Redis内存占用有助于减少创建快照和加载快照所需的时间.提升载入AOF文件和重写AOF文件时的效率.缩短从服务 ...

随机推荐

  1. Spring-statemachine给end状态设置action

    Spring-statemachine版本:当前最新的1.2.3.RELEASE版本 builder.configureStates() .withStates() .initial(generate ...

  2. easyui combobox 设置值 顺序放在最后

    easyui combobox 设置值 顺序放在最后 如果设置函数.又设置选中的值,注意顺序, 设置值需要放到最后,否则会设置了之后又没有了: $('#spanId'+i).combobox(res) ...

  3. Android ImageView设置图片原理(下)

    本文来自http://blog.csdn.net/liuxian13183/ ,引用必须注明出处! 写完上一篇后,总认为介绍的知识点不多,仅仅是一种在UI线程解析载入图片.两种在子线程解析,在UI线程 ...

  4. javascript小白学习指南1---0

    第二章 变量和作用域    在看第二章时我希望,你能够回想一下前一次所讲的内容  假设有所遗忘 点这里    今天我们来说说 变量和作用域的问题 本章主要内容 基本类型和引用类型 运行环境 垃圾回收( ...

  5. ReactNative之Flux框架的使用

    概述 流程图 项目结构 View Components actions Dispatcher Stores 感谢 概述 React Native 能够说非常火,非常多bat的项目都在使用.不用发版就能 ...

  6. 20. idea刷新项目、清除项目缓存

    转自:https://www.cnblogs.com/suiyisuixing/p/7723458.html 点击File -> Invalidate caches ,点击之后在弹出框中点击确认 ...

  7. 学习参考《零基础入门学习Python》电子书PDF+笔记+课后题及答案

    国内编写的关于python入门的书,初学者可以看看. 参考: <零基础入门学习Python>电子书PDF+笔记+课后题及答案 Python3入门必备; 小甲鱼手把手教授Python; 包含 ...

  8. hzwer 模拟题 祖孙询问

    祖孙询问 题目描述 已知一棵n个节点的有根树.有m个询问.每个询问给出了一对节点的编号x和y,询问x与y的祖孙关系. 输入输出格式 输入格式: 输入第一行包括一个整数n表示节点个数. 接下来n行每行一 ...

  9. Floyd-Warshall 算法-- 最短路径(适合节点密集的图)

     由于此算法时间复杂度为O(V³).大多数情况下不如迪杰斯特拉算法的.迪杰斯特拉算法适合于节点疏散的图.  演示样例图例如以下:     Step 1 创建节点与边的最短路径结果表(直接可达关系).数 ...

  10. 发送邮件被退回,提示: Helo command rejected: Invalid name 错误

    我自己配置的 postfix + dovecot server, 配置了outlook 后, 相同的账号. 在有的电脑上能收发成功, 在有的电脑上发送邮件就出现退信.提示 Helo command r ...