Normalization

In creating a database, normalization is the process of organizing it into tables in such a way that the results of using the database are always unambiguous and as intended. Normalization may have the effect of duplicating data within the database and often results in the creation of additional tables. (While normalization tends to increase the duplication of data, it does not introduce redundancy, which is unnecessary duplication.) Normalization is typically a refinement process after the initial exercise of identifying the data objects that should be in the database, identifying their relationships, and defining the tables required and the columns within each table.

A simple example of normalizing data might consist of a table showing:

Customer	Item purchased	Purchase price
Thomas	Shirt	$40
Maria	Tennis shoes	$35
Evelyn	Shirt	$40
Pajaro	Trousers	$25

If this table is used for the purpose of keeping track of the price of items and you want to delete one of the customers, you will also delete a price. Normalizing the data would mean understanding this and solving the problem by dividing this table into two tables, one with information about each customer and a product they bought and the second about each product and its price. Making additions or deletions to either table would not affect the other.

Normalization degrees of relational database tables have been defined and include:

First normal form (1NF). This is the "basic" level of normalization and generally corresponds to the definition of any database, namely:

It contains two-dimensional tables with rows and columns.
Each column corresponds to a sub-object or an attribute of the object represented by the entire table.
Each row represents a unique instance of that sub-object or attribute and must be different in some way from any other row (that is, no duplicate rows are possible).
All entries in any column must be of the same kind. For example, in the column labeled "Customer," only customer names or numbers are permitted.

Second normal form (2NF). At this level of normalization, each column in a table that is not a determiner of the contents of another column must itself be a function of the other columns in the table. For example, in a table with three columns containing customer ID, product sold, and price of the product when sold, the price would be a function of the customer ID (entitled to a discount) and the specific product.

Third normal form (3NF). At the second normal form, modifications are still possible because a change to one row in a table may affect data that refers to this information from another table. For example, using the customer table just cited, removing a row describing a customer purchase (because of a return perhaps) will also remove the fact that the product has a certain price. In the third normal form, these tables would be divided into two tables so that product pricing would be tracked separately.

Domain/key normal form (DKNF). A key uniquely identifies each row in a table. A domain is the set of permissible values for an attribute. By enforcing key and domain restrictions, the database is assured of being freed from modification anomalies. DKNF is the normalization level that most designers aim to achieve.

Normalization的更多相关文章

数据预处理中归一化（Normalization）与损失函数中正则化（Regularization）解惑
背景:数据挖掘/机器学习中的术语较多,而且我的知识有限.之前一直疑惑正则这个概念.所以写了篇博文梳理下摘要: 1.正则化(Regularization) 1.1 正则化的目的 1.2 正则化的L1范 ...
归一化方法 Normalization Method
1. 概要数据预处理在众多深度学习算法中都起着重要作用,实际情况中,将数据做归一化和白化处理后,很多算法能够发挥最佳效果.然而除非对这些算法有丰富的使用经验,否则预处理的精确参数并非显而易见. 2. ...
归一化交叉相关Normalization cross correlation (NCC)
归一化交叉相关Normalization cross correlation (NCC) 相关系数,图像匹配 NCC正如其名字,是用来描述两个目标的相关程度的,也就是说可以用来刻画目标间的相似性.一般 ...
从Bayesian角度浅析Batch Normalization
前置阅读:http://blog.csdn.net/happynear/article/details/44238541——Batch Norm阅读笔记与实现前置阅读:http://www.zhih ...
quantile normalization原理
对于芯片或者其它表达数据来说,最常见的莫过于quantile normalization啦. 那么它到底对我们的表达数据做了什么呢?首先要么要清楚一个概念,表达矩阵的每一列都是一个样本,每一行都是一个 ...
数据标准化 Normalization
数据的标准化(normalization)是将数据按比例缩放,使之落入一个小的特定区间.在某些比较和评价的指标处理中经常会用到,去除数据的单位限制,将其转化为无量纲的纯数值,便于不同单位或量级的指标能 ...
[CS231n-CNN] Training Neural Networks Part 1 : activation functions, weight initialization, gradient flow, batch normalization | babysitting the learning process, hyperparameter optimization
课程主页:http://cs231n.stanford.edu/ Introduction to neural networks -Training Neural Network ________ ...
深度学习网络层之 Batch Normalization
Batch Normalization Ioffe 和 Szegedy 在2015年<Batch Normalization: Accelerating Deep Network Trainin ...
Batch Normalization
一.BN 的作用 1.具有快速训练收敛的特性:采用初始很大的学习率,然后学习率的衰减速度也很大 2.具有提高网络泛化能力的特性:不用去理会过拟合中drop out.L2正则项参数的选择问题 3.不需要 ...

随机推荐

webstorm卡顿问题
近期随着项目开展,文件逐渐增大,webstrom频繁出现卡顿,而且时有崩溃现象,提示没有足够的内存来执行请求的操作,需要增加Xms设置. 解决办法: 1.找到WebStorm.exe.vmoption ...
bootstrap学习笔记--bootstrap排版类的使用
标题 Bootstrap 中定义了所有的 HTML 标题(h1 到 h6)的样式,这个和一般的html没啥区别.请看下面的实例: <h1>测试1 h1</h1> <h2& ...
IntelliJ IDEA 15，16 win 7 64位安装包以及注册码百度云盘
不知道发出来,有用没有,要是官网下载不了的话,可以用我的这个哦,虽然不是最新的. ideaIU-162.1447.21 ideaIU-15.0.2 win7 64系统的安装包链接:http://p ...
机器学习——k-近邻算法
k-近邻算法(kNN)采用测量不同特征值之间的距离方法进行分类. 优点:精度高.对异常值不敏感.无数据输入假定缺点:计算复杂度高.空间复杂度高使用数据范围:数值型和标称型工作原理:存在一个样本数 ...
【先定一个小目标】在Windows下的安装Elasticsearch
ElasticSearch是一个基于Lucene的搜索服务器.它提供了一个分布式多用户能力的全文搜索引擎,基于RESTful web接口.Elasticsearch是用Java开发的,并作为Apach ...
C语言中，while()语句中使用赋值语句
while()语句括号中是一个逻辑表达式,用以判断while循环是否需要继续执行.可以是赋值语句. while循环的一般格式为: while(expr) { ;//body } 其中用来判断循环条件的 ...
(转)EasyUI-datagrid-自动合并单元格
1.目标 1.1表格初始化完成后,已经自动合并好需要合并的行: 1.2当点击字段排序后,重新进行合并: 2.实现 2.1 引入插件 /** * author ____′↘夏悸 * create dat ...
Java界面设计 Swing(1)
Java界面设计的用途开发者可以通过Java SE开发丰富并且强大的具有图形界面的桌面应用程序.也可以设计一些提高效率的工具软件,帮助自己处理机械性工作. Java 的图形界面工具包,可以用于工具类 ...
acm的ubuntu (ubuntu16.04 安装指南,chrome安装,vim配置,git设置和github,装QQ)
日常手贱把ubuntu14.04更新到了16.04,然后就game over了.mdzz,不然泥萌也看不到这篇博客了=.= 然后花了些时间重装了一个16.04版的,原来那个14.04的用可以用,就是动 ...
MFC去掉标题栏
在初始化的时候加入以下函数 //去掉标题栏 ModifyStyle(WS_CAPTION, NULL, SWP_DRAWFRAME );

Normalization

Normalization的更多相关文章

随机推荐

热门专题