Yesterday I had fun time repairing 1.5Tb ext3 partition, containing many millions of files. Of course it should have never happened – this was decent PowerEdge 2850 box with RAID volume, ECC memory and reliable CentOS 4.4 distribution but still it did. We had “journal failed” message in kernel log and filesystem needed to be checked and repaired even though it is journaling file system which should not need checks in normal use, even in case of power failures. Checking and repairing took many hours especially as automatic check on boot failed and had to be manually restarted.

Same may happen with Innodb tables. They are designed to never crash, surviving power failures and even partial page writes but still they can get corrupted because of MySQL bugs, OS Bugs or hardware bugs, misconfiguration or failures.

Sometimes
corruption kind be mild, so ALTER TABLE to rebuild the table fixes it.
Sometimes table needs to be dropped and recovered from backup but in
certain cases you may need to reimport whole database – if corruption is
happens to be in undo tablespace or log files.

So do not forget
to have your recovery plan this kind failures. This is one thing you
better to have backups for. Backups however take time to restore,
especially if you do point in time recovery using binary log to get to
actual database state.

The good practice to approach this kind of
problem is first to have enough redundancy. I always assume any
component, such as piece of hardware or software can fail, even if this
piece of hardware has some internal redundancy by itself, such as RAID
or SAN solutions.

If you can’t afford full redundancy for
everything (and probably even if you do) the good idea is to keep your
objects smaller so if you need to do any maintenance with them it will
take less times. Smaller RAID volumes would typically rebuild faster,
smaller database size per system (yet another reason to like medium
end commodity hardware) makes it faster to recover, smaller tables
allow per table backup and recovery to happen faster.

With MySQL
and blocking ALTER TABLE there is yet another reason to keep tables
small, so you do not have to use complicated scenarios to do simple
things. Assume for example you need to add extra column to 500GB
Innodb table. It will probably take long hours or even days for ALTER
TABLE to complete and about 500GB of temporary space will be required
which you simply might not have. You can of course use MASTER-MASTER
replication and run statement on one server, switch role and then do it
on other, but if alter table takes several days do you really can
afford having no box to fall back to for such a long time ?

On
other hand if you would have 500 of 1GB tables it would be very easy –
you can simply move small pieces of data offline for a minute and alter
them live. Also all process will be much faster this way as whole
indexes will well fit in memory for such small tables.

Not to mention splitting 500 tables to several servers will likely be easy than splitting one big one.

There
are bunch of complications with many tables of course, it is not always
easy to partition your data appropriately, also code gets complicated
but for many applications it is worth the trouble

At NNSEEK
for example we have data split at 256 groups of tables. Current data
size is small enough so even single table would not be big problem but
it is much easier to write your code to handle split from very beginning
rather than try to add in later on when there are 100 helper scripts
written etc.

For the same reason I would recommend setting up
multiple virtual servers even if you work with physical one in the
beginning. Different accounts with different permissions will be good
enough. Doing so will ensure you will not have problems once you will
really need to scale to multiple servers.

 
参考:
http://www.mysqlperformanceblog.com/2006/10/08/small-things-are-better/

随机推荐

  1. Ruby字符串的一些方法

    最近因为公司需求开始看ruby,先从ruby的基本数据类型开始看 看到ruby的字符串类型string,发现ruby中的字符串单双引号是不一样的,这点和Python有那么点不一样 主要是我们对字符串进 ...

  2. Java学习笔记七:Java的流程控制语句之switch

    Java条件语句之 switch 当需要对选项进行等值判断时,使用 switch 语句更加简洁明了.例如:根据考试分数,给予前四名不同的奖品.第一名,奖励笔记本一台:第二名,奖励 IPAD 2 一个: ...

  3. gvim编码配置知识

    配置 .vimrc 解决 Vim / gVim 在中文 Windows 下的字符编码问题     Vim / gVim 在中文 Windows 下的字符编码有两个问题: 默认没有编码检测功能 如果一个 ...

  4. C# 生成机器码

    using System; using System.Collections.Generic; using System.ComponentModel; using System.Data; usin ...

  5. NO-ZERO(空格补全)

    The NO-ZERO command follows the DATA statement REPORT Z_Test123_01. DATA: W_NUR(10) TYPE N. MOVE 50 ...

  6. 【SAPUI5】ODataを構成するもの

    はじめに SAPUI5でアプリケーションを作るにあたり.ODataは避けては通れないトピックです.結構広いテーマなので.5-7回くらいに分けて書きたいと思います.1回目はODataの概要について説明し ...

  7. CC3200模块的内存地址划分和bootloader(一)

    1. CC3200的内存地址划分非常特殊,如果没测试的话,很容易懵逼.我们先看芯片手册里面的内存地址.芯片的RAM是256KB,下图的0x2000 0000-0x2003 FFFF,正好是256KB. ...

  8. jmeter常用的内置变量

    1. vars   API:http://jmeter.apache.org/api/org/apache/jmeter/threads/JMeterVariables.html vars.get(& ...

  9. 今日Linux

    一.复习了vi 三个模式下的一些操作.贴上一些比较常用,个人觉得比较难记的操作.1.一般模式:h  光标向左移动一个字符j   光标向下移动一个字符K  光标向上移动一个字符l    光标向右移动一个 ...

  10. tensorflow Importing Data

    tf.data API可以建立复杂的输入管道.它可以从分布式文件系统中汇总数据,对每个图像数据施加随机扰动,随机选择图像组成一个批次训练.一个文本模型的管道可能涉及提取原始文本数据的符号,使用查询表将 ...