Small things are better
Yesterday I had a fun time repairing a 1.5 TB ext3 partition containing many millions of files. Of course it should never have happened: this was a decent PowerEdge 2850 box with a RAID volume, ECC memory and the reliable CentOS 4.4 distribution, but it happened anyway. We had a “journal failed” message in the kernel log, and the filesystem needed to be checked and repaired even though ext3 is a journaling filesystem, which should not need checks in normal use, even after power failures. Checking and repairing took many hours, especially as the automatic check on boot failed and had to be restarted manually.
The same may happen with InnoDB tables. They are designed to never crash, surviving power failures and even partial page writes, but they can still get corrupted because of MySQL bugs, OS bugs, hardware bugs, misconfiguration or hardware failures.
Sometimes the corruption is mild, so an ALTER TABLE to rebuild the table fixes it. Sometimes the table needs to be dropped and recovered from backup, and in certain cases you may need to reimport the whole database, for example if the corruption happens to be in the undo tablespace or the log files.
So do not forget to have a recovery plan for this kind of failure. This is one thing you had better have backups for. Backups, however, take time to restore, especially if you do point-in-time recovery using the binary log to bring the database up to its current state.
A good way to approach this kind of problem is first to have enough redundancy. I always assume that any component, whether a piece of hardware or software, can fail, even if it has some internal redundancy of its own, as RAID or SAN solutions do.
If you can’t afford full redundancy for everything (and probably even if you can), a good idea is to keep your objects smaller, so that any maintenance you need to do on them takes less time. Smaller RAID volumes typically rebuild faster, a smaller database size per system (yet another reason to like medium-end commodity hardware) makes recovery faster, and smaller tables allow per-table backup and recovery to happen faster.
With MySQL and its blocking ALTER TABLE there is yet another reason to keep tables small: you do not have to use complicated scenarios to do simple things. Assume, for example, that you need to add an extra column to a 500GB InnoDB table. It will probably take long hours or even days for ALTER TABLE to complete, and about 500GB of temporary space will be required, which you simply might not have. You can of course use master-master replication and run the statement on one server, switch roles and then run it on the other, but if ALTER TABLE takes several days, can you really afford to have no box to fall back to for such a long time?
On the other hand, if you had 500 tables of 1GB each it would be very easy: you can simply take small pieces of data offline for a minute each and alter them on the live system. The whole process will also be much faster this way, as the indexes of such small tables fit entirely in memory.
Not to mention that splitting 500 tables across several servers will likely be easier than splitting one big one.
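As a rough illustration of the "many small tables" approach, here is a minimal Python sketch that generates one ALTER TABLE statement per small table, so that each can be run separately with only that piece of data unavailable at a time. The table naming scheme (`messages_000` to `messages_499`) and the added column are hypothetical, chosen only for the example.

```python
def alter_statements(base_name, table_count, alter_clause):
    """Yield one ALTER TABLE statement per small table.

    Table names are assumed (hypothetically) to be the base name plus a
    zero-padded numeric suffix, e.g. messages_000 .. messages_499.
    """
    width = len(str(table_count - 1))
    for i in range(table_count):
        yield f"ALTER TABLE {base_name}_{i:0{width}d} {alter_clause};"


# Hypothetical example: add a column to 500 small tables, one at a time.
statements = list(
    alter_statements("messages", 500, "ADD COLUMN flags INT NOT NULL DEFAULT 0")
)

if __name__ == "__main__":
    for sql in statements[:3]:
        print(sql)
```

Each statement touches only about 1GB of data, so a failure part way through leaves a clear list of which tables are done and which remain.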
There are a bunch of complications with many tables, of course. It is not always easy to partition your data appropriately, and the code gets more complicated, but for many applications it is worth the trouble.
At NNSEEK, for example, we have the data split into 256 groups of tables. The current data size is small enough that even a single table would not be a big problem, but it is much easier to write your code to handle the split from the very beginning than to try to add it later on, when there are 100 helper scripts already written and so on.
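A split like this needs a stable rule that maps each piece of data to one of the table groups. The post does not say how NNSEEK does it, so the following Python sketch is only an assumption: it hashes a key with CRC32 and takes the result modulo 256. The key, the base table name and the two-digit hexadecimal suffix are all hypothetical.

```python
import zlib

GROUP_COUNT = 256  # number of table groups, as in the text


def group_for(key: str) -> int:
    """Map a key to a table group in the range 0..255.

    CRC32 is used here only because it is stable across runs and
    machines; any deterministic hash would do (assumption, not the
    scheme described in the post).
    """
    return zlib.crc32(key.encode("utf-8")) % GROUP_COUNT


def table_for(base_name: str, key: str) -> str:
    """Return the hypothetical table name holding rows for this key."""
    return f"{base_name}_{group_for(key):02x}"


if __name__ == "__main__":
    print(table_for("articles", "comp.lang.python"))
```

If every query goes through a helper like `table_for` from day one, the helper scripts never need to know whether there is one table or 256.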
For the same reason I would recommend setting up multiple virtual servers even if you work with a single physical one in the beginning. Different accounts with different permissions will be good enough. Doing so will ensure you do not run into problems once you really need to scale to multiple servers.
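One way to picture the virtual server idea is a small lookup table in the application: each table group belongs to a virtual server, and each virtual server has its own account, even though all of them point at the same physical host at first. The sketch below is a hypothetical configuration in Python. The number of virtual servers, the host addresses and the account names are all assumptions made for the example.

```python
GROUP_COUNT = 256  # table groups, as in the NNSEEK example

# Hypothetical settings: four virtual servers, each with its own account,
# all living on one physical box for now.
VIRTUAL_SERVERS = {
    0: {"host": "127.0.0.1", "user": "app_vs0"},
    1: {"host": "127.0.0.1", "user": "app_vs1"},
    2: {"host": "127.0.0.1", "user": "app_vs2"},
    3: {"host": "127.0.0.1", "user": "app_vs3"},
}


def virtual_server_for(group: int) -> int:
    """Return the id of the virtual server that owns a table group."""
    if not 0 <= group < GROUP_COUNT:
        raise ValueError(f"group out of range: {group}")
    return group % len(VIRTUAL_SERVERS)


def connection_settings(group: int, servers=VIRTUAL_SERVERS) -> dict:
    """Look up connection settings for the server owning this group."""
    return servers[virtual_server_for(group)]


if __name__ == "__main__":
    print(connection_settings(5))
```

Moving a virtual server to a new physical box then means changing one `host` entry. The mapping from groups to virtual servers, and the code that uses it, stays the same.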