I. LiveJournal's development history

LiveJournal is a project that began on a college campus in 1999. A few people built the application as a hobby, aiming to provide the following features:

  • Blog, forum
  • Social network, find friends
  • Aggregation of friends' articles

LiveJournal uses a lot of open source software, and it is itself open source software.

After launch, LiveJournal grew very rapidly:

  • April 2004: 2.8 million registered users.
  • April 2005: 6.8 million registered users.
  • August 2005: 7.9 million registered users.
  • Handles over a thousand page requests per second.
  • Runs a large number of MySQL servers.
  • Uses many common components.

II. Overview of the current LiveJournal architecture

III. Learning from LiveJournal's development

LiveJournal grew from one server to 100 servers, suffering a great deal of pain along the way but also working out solutions to those problems. By studying LiveJournal we can avoid the mistakes LJ made and design our systems well from the outset, sparing ourselves pain later on.

Let's look at LJ's development step by step.

1. One server

LJ initially ran on a single server donated by someone else, much like the scrap servers Google started out on, which deserves our respect. At this stage LJ learned Unix system administration at an astonishing pace and ran into server performance problems; fortunately, minor fixes and tweaks were enough to get by. During this stage LJ moved from CGI to FastCGI.

Eventually the problems could no longer be avoided: the site got slower and slower, to the point where optimization could not fix it and more servers were needed. LJ then began offering paid services, hoping to use the revenue to buy new servers and get out of the predicament.
There is no doubt that LJ at this point had one huge single point of failure: everything lived in a single server box.

2. Two servers

With the money earned from paid services, LJ bought two servers: a Dell 6U machine named Kenny to serve the Web, and a Dell 6U server named Cartman to serve the database.

LJ now had larger disks and more computing resources. The network layout was still very simple: each machine had two network cards, and Cartman served MySQL to Kenny over the internal network.

The load problem was solved for the time being, but new problems emerged:

  • One single point of failure became two.
  • There were no cold or hot backups.
  • The site started to slow down again; it was simply growing too fast.
  • The Web server hit its CPU limit, so more Web servers were needed.

3. Four servers

Two more servers were bought, Kyle and Stan, both 1U machines used to serve the Web. LJ now had three Web servers and one database server in total. At this point the load had to be balanced across the three Web servers.

LJ made Kenny the external gateway and used mod_backhand for load balancing.

Then problems emerged again:

  • Single points of failure. Both the database and the gateway Web server are single points; if either machine fails, the whole service becomes unavailable. The gateway role could be switched quickly to another Web server by maintaining a heartbeat, but that still does not solve the database single point, and LJ did not do this at the time.
  • The site is slow again, this time because of IO and the database. The question is how to add more databases behind the application.

4. Five servers

Another database server was bought. The two database servers use replication (MySQL supports Master-Slave mode): all writes go to the master (via the binlog, writes on the master are quickly replicated to the slave), while reads are spread over both databases (which is also a form of load balancing).

Replication requires attention to a few things:

  • Reads need a database-selection algorithm, for example choosing the database with the lighter current load.
  • Only perform reads on the slave server.
  • Be prepared for replication lag; handling it badly can interrupt replication. Only writes need this check; reads do not have a synchronization problem.
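The routing rules above can be sketched in a few lines of Python. This is a minimal illustration, not LJ's code: the `DbServer` type, the `load` metric, and the slave name `slave1` are assumptions made for the example (only Cartman is named in the text).

```python
from dataclasses import dataclass


@dataclass
class DbServer:
    name: str
    is_master: bool
    load: float  # hypothetical load metric, e.g. fraction of capacity in use


def pick_server(servers: list[DbServer], is_write: bool) -> DbServer:
    """Route writes to the master and reads to the least-loaded server."""
    if is_write:
        # Every write goes to the master; the slave only receives it via replication.
        return next(s for s in servers if s.is_master)
    # Reads may be served by either machine; prefer the one with the lighter load.
    return min(servers, key=lambda s: s.load)


servers = [
    DbServer("cartman", is_master=True, load=0.8),
    DbServer("slave1", is_master=False, load=0.3),
]
write_target = pick_server(servers, is_write=True)
read_target = pick_server(servers, is_write=False)
```

With these assumed loads, the write lands on the master and the read lands on the less busy slave. A real router would also have to account for replication lag before reading from a slave.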

5. More servers

With money, of course, more servers were bought. Not long after they were deployed, things slowed down again. This time there were more Web servers and more database servers, with contention for both IO and CPU. So BIG-IP was adopted as the load balancing solution.

6. Where are we now

There are basically enough servers, but performance is still a problem, and the cause lies in the architecture.

The database architecture is the biggest problem. Because databases were added to the application in slave mode, the only benefit is that reads are spread across multiple machines. The consequence is that every write is replicated to, and must be executed on, every machine. The more servers there are, the greater the waste: as write volume grows, fewer resources are left to serve reads.
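A small worked calculation makes the waste concrete. The figures below are assumptions chosen for illustration (four servers, 1000 operations per second each), not LJ's measurements; the point is only that every server spends the full write load before it can serve any reads.

```python
def read_capacity(n_servers: int, per_server_ops: int, write_ops: int) -> int:
    """Ops/sec left for reads when every write is replayed on every server."""
    left_per_server = max(per_server_ops - write_ops, 0)
    return n_servers * left_per_server


# Assumed figures: 4 servers, each able to handle 1000 ops/sec.
light = read_capacity(n_servers=4, per_server_ops=1000, write_ops=200)
heavy = read_capacity(n_servers=4, per_server_ops=1000, write_ops=800)

# Capacity spent replaying the same 800 writes on the three extra copies.
wasted = (4 - 1) * 800
```

Under these assumptions, raising the write load from 200 to 800 ops/sec cuts the total read capacity from 3200 to 800 ops/sec, and adding more slaves does not reduce the write cost paid by each one.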

[Figure: distributing from one server to two]

[Figure: the final result]

We now find that we do not need to keep a copy of this data on so many servers. The servers already use RAID and the databases are backed up, so keeping this many copies is a complete waste of resources, redundancy taken to excess. Why not distribute the data storage instead?

With the problem identified, it is time to think about the solution. What needs to be done is to store different users' data on different servers, achieving distributed data storage. Each machine serves only a relatively fixed set of users, which yields a parallel architecture and good scalability.

To group users, each user is assigned a group tag that marks which database server group stores that user's data. Each database group consists of one master and several slaves, with the number of slaves kept at 2-3. This gives the most reasonable allocation of system resources: reads are still spread out, while excessive data redundancy and the system resources consumed by replication are avoided.

User grouping is controlled by a central server (or group of servers). All user-to-group information is stored there; every request first looks up the user's group number on this server and then fetches the data from the corresponding database group.
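The lookup path described here can be sketched as follows. The `GlobalDirectory` class, the function names, and the use of plain dictionaries in place of MySQL are all assumptions made for the example.

```python
class GlobalDirectory:
    """Stands in for the central server that maps each user to a database group."""

    def __init__(self) -> None:
        self._group_of: dict[int, int] = {}

    def assign(self, userid: int, group_id: int) -> None:
        self._group_of[userid] = group_id

    def group_for(self, userid: int) -> int:
        return self._group_of[userid]


# Each database group holds only its own users' rows (dicts stand in for MySQL).
db_groups: dict[int, dict[int, list[str]]] = {1: {}, 2: {}}
directory = GlobalDirectory()
directory.assign(userid=101, group_id=1)
directory.assign(userid=102, group_id=2)


def add_post(userid: int, text: str) -> None:
    # Step 1: ask the central server for the group. Step 2: write to that group only.
    group = db_groups[directory.group_for(userid)]
    group.setdefault(userid, []).append(text)


def posts_of(userid: int) -> list[str]:
    return db_groups[directory.group_for(userid)].get(userid, [])


add_post(101, "hello")
add_post(102, "world")
```

Because each user's rows live in exactly one group, adding a new group adds both read and write capacity, unlike adding another slave.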

This user-partitioned structure is already very similar to LJ's current architecture.

In the concrete implementation, there are a few caveats:

  • Do not use auto-increment IDs inside the database groups, so that users can later be migrated between groups to achieve a more reasonable distribution of I/O, disk space, and load.
  • userid and postid are stored on the global server, where auto-increment can be used; the corresponding values in the database groups must follow the values on the global server. The global server uses the transactional InnoDB engine.
  • Be extremely careful when migrating users between database groups; a user must not be able to write while being migrated.
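These caveats fit together as in the sketch below: IDs come only from the global server, so rows can move between groups without collisions, and a user is blocked from writing for the duration of a migration. The class and function names are hypothetical, and dictionaries again stand in for the databases.

```python
import itertools


class GlobalServer:
    """Stands in for the global server: hands out ids and tracks user placement."""

    def __init__(self) -> None:
        self._postids = itertools.count(1)  # the only place ids are generated
        self.group_of: dict[int, int] = {}
        self.migrating: set[int] = set()

    def new_postid(self) -> int:
        return next(self._postids)


def write_post(g: GlobalServer, groups: dict, userid: int, text: str) -> int:
    if userid in g.migrating:
        raise RuntimeError("user is being migrated; writes are blocked")
    postid = g.new_postid()
    groups[g.group_of[userid]].setdefault(userid, {})[postid] = text
    return postid


def migrate_user(g: GlobalServer, groups: dict, userid: int, dest: int) -> None:
    g.migrating.add(userid)  # block writes for this user first
    try:
        src = g.group_of[userid]
        if src == dest:
            return
        groups[dest][userid] = dict(groups[src].get(userid, {}))  # copy rows
        g.group_of[userid] = dest  # repoint the user
        groups[src].pop(userid, None)  # then drop the old copy
    finally:
        g.migrating.discard(userid)


g = GlobalServer()
groups = {1: {}, 2: {}}
g.group_of[7] = 1
g.group_of[8] = 2
write_post(g, groups, 7, "first")
write_post(g, groups, 8, "other")
migrate_user(g, groups, 7, dest=2)
write_post(g, groups, 7, "second")
```

Had each group generated its own auto-increment IDs, user 7's first post and user 8's post could both have had ID 1, and the migration into group 2 would have collided.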

7. Where are we now

Problems:

  • There is one global master server; if it goes down, all user registration and write operations fail.
  • Each database group has one master; if it goes down, writes for that group's users fail.
  • If a slave in a database group goes down, the load on the remaining servers becomes too heavy.

To address the single point in Master-Slave mode, LJ adopted a Master-Master mode. Master-Master is really a manual arrangement rather than something MySQL provides directly: two machines are each both master and slave, replicating from each other.

Implementing Master-Master requires attention to:

  • Recovery after a replication error on one master, which is best done automatically by the server.
  • ID allocation: because both machines accept writes at the same time, some IDs may conflict.

Solutions:

  • Allocate IDs by parity: one machine writes odd numbers, the other writes even numbers.
  • Allocate IDs from the global server (LJ's practice).
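The parity scheme is easy to illustrate. The helper below is a hypothetical sketch, not part of any MySQL client library: each master starts at a different offset and steps by the number of masters, so the two sequences never overlap.

```python
import itertools


def id_allocator(offset: int, step: int = 2):
    """Yield offset, offset + step, ... so masters with different offsets never collide."""
    return itertools.count(start=offset, step=step)


master_a = id_allocator(offset=1)  # hands out odd ids
master_b = id_allocator(offset=2)  # hands out even ids

ids_a = [next(master_a) for _ in range(3)]
ids_b = [next(master_b) for _ in range(3)]
```

MySQL's `auto_increment_increment` and `auto_increment_offset` server variables apply the same idea on the server side.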

There is another way to use Master-Master mode. As before, the two machines stay synchronized, but only one of them is active (serving reads and writes); the roles are rotated every night, or switched when a problem occurs.

8. Where are we now

Now a short interlude: MyISAM vs InnoDB.

Using InnoDB:

  • Supports transactions.
  • Needs more configuration, but it is worth it: data is stored more safely and access is faster.

Using MyISAM:

  • Logging (LJ uses it for network access logs).
  • Storing read-only static data; it is fast enough for this.
  • Concurrency is poor: data cannot be read and written at the same time (appending data is fine).
  • An abnormal MySQL shutdown or crash can corrupt the indexes, which then have to be repaired with myisamchk; under heavy access this happens very frequently.

9. Cache

Last year I wrote about memcached, a caching tool developed by the LJ team that stores data as key-value pairs in distributed memory. LJ's cache figures:

  • 12 dedicated servers (not donated)
  • 28 instances
  • 30GB total capacity
  • 90-93% hit rate (those who have used squid may know that squid's hit rate with memory plus disk is about 70-80%)

How should a caching strategy be set?

Cache everything? That is not possible. We only need to cache what has become, or may become, a system bottleneck, which gives the largest gain in system efficiency. Analyzing the MySQL logs shows which objects are worth caching.
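A key-value cache spread over several memory nodes can be sketched as follows. This is an illustration under assumptions, not the API of LJ's caching tool: dictionaries stand in for cache servers, keys are mapped to nodes with a simple hash-modulo rule, and `get_or_load` is a hypothetical helper that falls back to the database on a miss.

```python
import hashlib


class CacheNode:
    """One in-memory cache instance (a dict stands in for a real cache server)."""

    def __init__(self) -> None:
        self.data: dict[str, str] = {}


class DistributedCache:
    def __init__(self, n_nodes: int) -> None:
        self.nodes = [CacheNode() for _ in range(n_nodes)]
        self.hits = 0
        self.misses = 0

    def _node_for(self, key: str) -> CacheNode:
        # Assumed placement rule: hash the key and take it modulo the node count.
        digest = hashlib.md5(key.encode()).hexdigest()
        return self.nodes[int(digest, 16) % len(self.nodes)]

    def get_or_load(self, key: str, loader) -> str:
        node = self._node_for(key)
        if key in node.data:
            self.hits += 1
            return node.data[key]
        self.misses += 1
        value = loader(key)  # only a miss reaches the database
        node.data[key] = value
        return value


db_calls: list[str] = []


def slow_query(key: str) -> str:
    db_calls.append(key)  # record each trip to the (pretend) database
    return key.upper()


cache = DistributedCache(n_nodes=3)
first = cache.get_or_load("user:1", slow_query)
second = cache.get_or_load("user:1", slow_query)
```

The second lookup is served from memory, so the database is queried only once. Tracking hits and misses this way is also how a hit rate like the one quoted above would be measured.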

What are the drawbacks of caching?

Nothing is perfect, and caching also has drawbacks:

  • More development work: special code has to be written for the cache.
  • Management becomes harder, and more people are needed for system maintenance.
  • And of course, large amounts of memory cost money.

10. Web access load balancing

BIG-IP is used at the packet level, but BIG-IP knows nothing about our internal processing and cannot decide which server should handle a given request. Existing reverse proxies did not work well either: they were either not fast enough or could not do what we wanted.

So LJ developed Perlbal. Features:

  • A fast, small, manageable HTTP web server / proxy
  • Can forward requests to internal servers
  • Developed in Perl
  • Single-threaded, asynchronous, event-based; uses epoll and kqueue
  • Supports console management and remote management over HTTP, with dynamic loading of configuration
  • Multiple modes: web server, reverse proxy, plugins
  • Plugin support: GIF / PNG conversion?

11. MogileFS

LJ uses the open source MogileFS as its distributed file storage system. MogileFS is very simple to use; its main design ideas are:

  • Files belong to classes (the class is the smallest unit of replication)
  • It tracks where each file is stored
  • Files are stored on different hosts
  • A MySQL cluster stores the distribution information in one place
  • Large, inexpensive disks
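The design ideas above can be illustrated with a toy tracker. This is not the MogileFS client API: the `Tracker` class, its methods, the class names `userpic` and `thumbnail`, and the least-used placement rule are assumptions made for the sketch, with a dictionary standing in for the MySQL-backed location table.

```python
class Tracker:
    """Toy tracker: records which hosts hold each file (a dict stands in for MySQL)."""

    def __init__(self, hosts: list[str], class_copies: dict[str, int]) -> None:
        self.hosts = list(hosts)
        self.class_copies = dict(class_copies)  # file class -> copies to keep
        self.locations: dict[str, list[str]] = {}  # file key -> hosts holding it
        self.used = {h: 0 for h in self.hosts}

    def store(self, key: str, file_class: str) -> list[str]:
        copies = self.class_copies[file_class]
        if copies > len(self.hosts):
            raise ValueError("not enough distinct hosts for this class")
        # Pick the least-used distinct hosts so copies never share a machine.
        chosen = sorted(self.hosts, key=lambda h: (self.used[h], h))[:copies]
        for h in chosen:
            self.used[h] += 1
        self.locations[key] = chosen
        return chosen

    def get_paths(self, key: str) -> list[str]:
        return list(self.locations[key])


tracker = Tracker(
    hosts=["host1", "host2", "host3"],
    class_copies={"userpic": 2, "thumbnail": 1},
)
pic_hosts = tracker.store("pic1", "userpic")
thumb_hosts = tracker.store("thumb1", "thumbnail")
```

The replication count is a property of the class rather than of the individual file, which is what makes the class the unit of replication: important files can keep more copies while easily regenerated ones keep a single copy.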

That is all for now; more documents can be found in the original presentation materials. The people behind this material have presented it at two MySQL Con events, two OS Con events, and numerous other conferences, selflessly sharing their experience so that the rest of us can learn from it. In the web2.0 era, rapid development gets more and more attention, but good design is still the foundation of every application. On the road to becoming a Top500 website, a web2.0 site should not let its architecture hold back its growth.
