Performance optimization methods for large-scale sites, as seen from the development of LiveJournal's backend
First, LiveJournal's course of development
LiveJournal is a project that began on campus in 1999, an application a few people built as a hobby, to provide the following functions:
- Blog, forum
- Social network, find friends
- Aggregation of friends' articles
LiveJournal uses a lot of open source software, and is itself open source.
After going live, LiveJournal grew very rapidly:
- April 2004: 2.8 million registered users.
- April 2005: 6.8 million registered users.
- August 2005: 7.9 million registered users.
- More than a thousand page requests processed per second.
- A large number of MySQL servers.
- Heavy use of common components.
Second, an overview of the current LiveJournal architecture
Third, learning from LiveJournal's development
LiveJournal grew from one server to 100 servers. Along the way it went through a great deal of pain, and it also worked out solutions to those problems. Studying LiveJournal lets us avoid the mistakes LJ made and design a system well from the outset, so as to avoid pain later on.
Let's look at LJ's development step by step.
1, One server
LJ initially ran on a server donated by someone else; like the battered servers Google started out on, it deserves our respect. At this stage, LJ became familiar with Unix operations and administration at an alarming rate. The server had performance problems but, fortunately, small fixes and tweaks were enough to muddle through. At this stage LJ upgraded from CGI to FastCGI.
Eventually the problems caught up: the site got slower and slower, to the point where tuning could no longer help and more servers were needed. LJ then began offering paid services, probably hoping to use the money to buy new servers and get out of the predicament.
There is no doubt that at this point LJ had one huge single point of failure: everything lived inside that one tin box of a server.
2, Two servers
With the money earned from paid services, LJ bought two servers: a Dell 6U machine called Kenny, used to provide Web services, and a Dell 6U machine called Cartman, used to provide database services.
LJ now had larger disks and more computing resources. The network structure was still very simple: each machine had two network cards, and Cartman served MySQL to Kenny over the internal network.
The load problem was solved for the time being, but new problems emerged:
- The one single point of failure became two single points of failure.
- There was no cold or hot backup.
- The site started to slow down again; there was no way around it, growth was too fast.
- The Web server reached its CPU limit, so more Web servers were needed.
3, Four servers
LJ bought two more machines, Kyle and Stan, both 1U, to provide Web services. LJ now had three Web servers and one database server in total, so load had to be balanced across the three Web servers.
LJ used Kenny as the external gateway and used mod_backhand for load balancing.
Then more problems emerged:
- Single points of failure. The database and the gateway Web server were both single points; a problem with either machine made the whole service unavailable. The gateway Web server's single point could have been addressed with heartbeat-based fast failover, though LJ did not do this at the time, but that still would not have solved the single point of the database.
- The site was slow again, this time because of I/O and the database. The problem was how to add database servers behind the application.
4, Five servers
LJ bought another database server and set up replication between the two database servers (MySQL supports a Master-Slave mode). All writes go to the master database (through the binlog, writes on the master are quickly synchronized to the slave), and reads are served by both databases (which can also be regarded as a form of load balancing).
A few things need attention when replicating:
- The algorithm that selects a database for reads should pick the database with the lighter current load.
- Only perform reads on the slave database server.
- Be prepared to handle replication delay; if it is handled badly, replication may be interrupted. Only writes need this check; reads do not have synchronization problems.
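The routing rules in the list above can be sketched roughly as follows. This is a minimal illustration, not LJ's code: the `DbServer` and `ReplicatedPool` names, the `load` metric, and the `max_lag_seconds` threshold are all hypothetical, and skipping lagging slaves for reads is an added assumption about how one might cope with replication delay.

```python
from dataclasses import dataclass


@dataclass
class DbServer:
    name: str
    load: float                 # hypothetical load metric; lower means lighter
    lag_seconds: float = 0.0    # replication delay; always 0 for the master


class ReplicatedPool:
    """Sends writes to the master and reads to the least-loaded usable server."""

    def __init__(self, master, slaves, max_lag_seconds=5.0):
        self.master = master
        self.slaves = list(slaves)
        self.max_lag_seconds = max_lag_seconds  # assumed threshold, not from the text

    def for_write(self):
        # Every write goes to the master; slaves only ever serve reads.
        return self.master

    def for_read(self):
        # Ignore slaves that have fallen too far behind (an assumption),
        # then pick whichever remaining server currently has the lightest load.
        fresh = [s for s in self.slaves if s.lag_seconds <= self.max_lag_seconds]
        return min([self.master] + fresh, key=lambda s: s.load)
```

For example, with a busy master, a lightly loaded slave, and a second slave that is a minute behind, reads would go to the first slave while writes still go to the master.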
5, More servers
With money, of course, LJ bought more servers. Not long after they were deployed, the site began to slow down again. This time there were more Web servers and more database servers, with contention for I/O and CPU. So LJ adopted BIG-IP as its load balancing solution.
6, Where we are now
There are basically enough servers now, but performance is still a problem, and the cause lies in the architecture.
The structure of the database is the biggest problem. Because databases were added to the application in Slave mode, the only benefit is that reads are spread across multiple machines. The consequence is that writes are replicated everywhere and every machine has to execute them: the more servers there are, the greater the waste, and as writes increase, fewer resources are left to serve reads.
Distribution from one to two
The final results
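The cost of replaying every write on every server can be made concrete with a little arithmetic. The sketch below uses invented numbers and a hypothetical helper, and it assumes that a read and a write each cost one unit of a server's capacity, which is a simplification.

```python
def total_read_capacity(servers, capacity_per_server, writes_per_second):
    """Reads/s left across the pool when every server replays every write.

    Assumption: a read and a write cost the same single unit of capacity.
    """
    spare_per_server = max(capacity_per_server - writes_per_second, 0)
    return servers * spare_per_server


if __name__ == "__main__":
    # Illustrative figures only: 4 servers, 1000 operations/s each.
    for writes in (100, 500, 900):
        reads = total_read_capacity(4, 1000, writes)
        print(f"writes/s={writes}: {reads} reads/s left across 4 servers")
```

Under these assumptions, four servers at 900 writes per second have less read capacity left between them than a single server handling only 100 writes per second, which is the waste the text describes.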
We now realize that we do not need to keep a copy of this data on so many servers. The servers already use RAID and the database is already backed up, so this many copies is a complete waste of resources and extremely redundant. Why not distribute the data for storage instead?
With the problem identified, we can think about how to solve it. What needs to be done is to place different users' data on different servers, so that data storage is distributed and each machine serves only a relatively fixed set of users, giving a parallel architecture with good scalability.
To group users, we need to assign each user a group tag that marks which database server group stores that user's data. Each database group consists of one master and several slaves, with the number of slaves kept to 2-3. This allocates system resources most sensibly: reads are still distributed, while excessive data redundancy and excessive consumption of resources by replication are avoided.
User grouping is controlled by a central server (or group of servers). All user grouping information is stored there: every user operation first queries this server for the user's group number and then fetches the data from that database group.
This user-grouped structure is already very similar to LJ's current architecture.
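The two-step lookup (ask the central server for the user's group, then read from that group) might look like the sketch below. `GlobalDirectory`, `DatabaseGroup`, and `load_post` are hypothetical names, and in-memory dictionaries stand in for the real database servers.

```python
class GlobalDirectory:
    """Stand-in for the central server that maps each user to a database group."""

    def __init__(self):
        self._group_of = {}  # userid -> group number

    def assign(self, userid, group_id):
        self._group_of[userid] = group_id

    def group_for(self, userid):
        return self._group_of[userid]


class DatabaseGroup:
    """Stand-in for one master plus its slaves; holds only its own users' rows."""

    def __init__(self, group_id):
        self.group_id = group_id
        self.posts = {}  # (userid, postid) -> text


def load_post(directory, groups, userid, postid):
    # Step 1: ask the central server which group holds this user.
    group = groups[directory.group_for(userid)]
    # Step 2: fetch the data from that group only.
    return group.posts[(userid, postid)]
```

Because each user's rows live in exactly one group, adding a group adds capacity without adding copies of everyone else's data.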
In the actual implementation, there are a few caveats:
- Do not use auto-increment IDs within a database group, so that users can later be migrated between database groups to achieve a more reasonable distribution of I/O, disk space, and load.
- userid and postid are stored on the global server, where auto-increment can be used; the corresponding values in a database group must follow the values on the global server. The global server uses the transactional InnoDB storage engine.
- Be extremely careful when migrating users between database groups: the user must not be able to perform writes while being migrated.
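A rough sketch of how these caveats fit together: IDs come from the global server, so rows keep their IDs when they move, and a user is blocked from writing for the duration of a migration. All names here (`GlobalServer`, `add_post`, `migrate_user`) are hypothetical, and plain dictionaries stand in for the database groups.

```python
import itertools


class GlobalServer:
    """Stand-in for the global server: allocates IDs and records user groups."""

    def __init__(self):
        self._next_postid = itertools.count(1)  # IDs are global, not per group
        self.group_of = {}                      # userid -> group number
        self.read_only = set()                  # users currently being migrated

    def new_postid(self):
        return next(self._next_postid)


def add_post(glob, groups, userid, text):
    if userid in glob.read_only:
        raise RuntimeError("user is being migrated; writes are blocked")
    postid = glob.new_postid()
    groups[glob.group_of[userid]][(userid, postid)] = text
    return postid


def migrate_user(glob, groups, userid, target_group):
    source_group = glob.group_of[userid]
    glob.read_only.add(userid)  # no writes for this user while rows are moving
    try:
        src, dst = groups[source_group], groups[target_group]
        for key in [k for k in src if k[0] == userid]:
            dst[key] = src.pop(key)  # IDs are unchanged, so nothing is renumbered
        glob.group_of[userid] = target_group
    finally:
        glob.read_only.discard(userid)
```

Had each group handed out its own auto-increment IDs, a moved row could collide with an ID already used in the target group; allocating IDs globally avoids that.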
7, Where we are now
Problems:
- If the global master server goes down, registration and writes fail for all users.
- If the master server of a database group goes down, writes fail for that group's users.
- If a slave in a database group goes down, the load on the other servers becomes too high.
For the single point in Master-Slave mode, LJ adopted a Master-Master mode. Master-Master is actually an artificial arrangement, not something MySQL provides directly: two machines are each both a master and a slave, replicating from each other.
Things to watch out for when implementing Master-Master:
- When one master fails, recovery and resynchronization should ideally be done automatically by the server.
- ID allocation: when both machines accept writes at the same time, some IDs may conflict.
Solutions:
- Odd/even allocation: one machine writes odd IDs, the other writes even IDs.
- Allocation by the global server (what LJ does).
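The odd/even solution can be illustrated in a few lines. This is only a sketch of the idea with a hypothetical `id_allocator` helper, not how either machine is actually configured.

```python
import itertools


def id_allocator(start, step=2):
    """Yields start, start + step, start + 2 * step, ...

    Two masters using the same step but different starts never collide.
    """
    return itertools.count(start, step)


def first_ids(allocator, n):
    """Take the next n IDs from an allocator."""
    return [next(allocator) for _ in range(n)]


if __name__ == "__main__":
    master_a = id_allocator(1)  # odd IDs
    master_b = id_allocator(2)  # even IDs
    print(first_ids(master_a, 3), first_ids(master_b, 3))
```

Each master can then accept inserts independently, and rows replicated from the other side never reuse an ID that was generated locally.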
Master-Master mode can also be used another way: as before, the two machines stay synchronized, but only one machine is active (serving reads and writes) at a time, with the roles rotated every night or switched when a problem occurs.
8, Where we are now
Now a short commercial break: MyISAM vs InnoDB.
Use InnoDB:
- Supports transactions.
- Needs more configuration, but it is worth it: data is stored more safely and it is faster.
Use MyISAM:
- For logging (LJ uses it for network access logs).
- For read-only static data, where it is fast enough.
- Concurrency is poor: data cannot be read and written at the same time (appending data is fine).
- An abnormal shutdown or crash of MySQL can corrupt indexes, which then have to be repaired with myisamchk; this happens frequently when access volume is high.
9, Cache
Last year I wrote about a caching tool developed by the LJ team that stores data as key-value pairs in distributed memory. LJ's cache figures:
- 12 dedicated servers (not donated)
- 28 instances
- 30GB total capacity
- 90-93% hit rate (those who have used squid may know that squid's hit rate with memory plus disk is about 70-80%)
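One common way to spread keys over many cache instances is for the client to hash each key and pick an instance from the result. The sketch below shows that general idea only; the instance names are invented, and it is an assumption for illustration rather than a description of how LJ's tool actually maps keys.

```python
import zlib


def instance_for(key, instances):
    """Pick a cache instance for a key by hashing the key (illustrative scheme)."""
    index = zlib.crc32(key.encode("utf-8")) % len(instances)
    return instances[index]


# 28 hypothetical instance names, matching the instance count quoted above.
INSTANCES = [f"cache{n:02d}" for n in range(28)]
```

The same key always maps to the same instance, so any Web server can find a cached value without asking a central directory.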
How should a caching strategy be built?
Should everything be cached? That is not possible. We only need to cache what already is, or may become, a system bottleneck, in order to improve system efficiency. Analyzing the MySQL logs tells us which objects to cache.
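Once the objects worth caching are identified, the application code typically checks the cache first and falls back to the database on a miss. The sketch below shows this read-through pattern with invalidation on write; `CachedStore` is a hypothetical name, and dictionaries stand in for both MySQL and the cache.

```python
class CachedStore:
    """Reads go through a cache; writes update the database and drop the cached copy."""

    def __init__(self, database):
        self.database = database  # dict standing in for MySQL
        self.cache = {}           # dict standing in for the key-value cache
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.cache:
            self.hits += 1
            return self.cache[key]
        self.misses += 1
        value = self.database[key]  # slow path: go to the database
        self.cache[key] = value     # remember the value for the next reader
        return value

    def set(self, key, value):
        self.database[key] = value
        self.cache.pop(key, None)   # invalidate so stale data is never served

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The `hit_rate` counter is the kind of figure quoted earlier: the higher it is, the fewer requests reach the database at all.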
What are the disadvantages of caching?
Nothing is perfect, and caching has drawbacks too:
- More development work: special code has to be written for caching.
- Management is harder: more people are needed for system maintenance.
- And of course, large amounts of memory cost money.
10, Web access load balancing
BIG-IP is used at the packet level, but BIG-IP knows nothing about our internal processing and cannot decide which server should handle a given request. Existing reverse proxies did not help either: they were not fast enough and did not achieve the effect we wanted.
So LJ developed its own. Features:
- A fast, small, manageable HTTP web server / proxy
- Can forward requests internally
- Developed in Perl
- Single-threaded, asynchronous, event-based, using epoll and kqueue
- Supports console management and remote management over HTTP, with dynamic configuration loading
- Multiple modes: web server, reverse proxy, plug-ins
- Plug-in support: GIF / PNG conversion?
11, MogileFS
LJ uses the open source MogileFS as its distributed file storage system. MogileFS is very simple to use; its main design ideas are:
- Files belong to classes (the class is the smallest unit of replication)
- Trackers keep track of where files are stored
- Files are stored on different hosts
- A MySQL cluster stores the distribution information in one place
- Big, cheap disks
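The design ideas above can be pictured with a toy model: each file class says how many copies to keep, a tracker records where each file lives, and copies always land on different hosts. The `Tracker` class below is hypothetical and is not the MogileFS API; a dictionary stands in for the MySQL-backed location store, and the round-robin host choice is an assumption.

```python
class Tracker:
    """Toy tracker: remembers which hosts hold each file (not the MogileFS API)."""

    def __init__(self, hosts, class_replicas):
        self.hosts = list(hosts)
        self.class_replicas = dict(class_replicas)  # class name -> copies to keep
        self.locations = {}                          # file key -> list of hosts
        self._next = 0

    def store(self, key, file_class):
        copies = self.class_replicas[file_class]
        if copies > len(self.hosts):
            raise ValueError("not enough hosts for the requested replica count")
        # Round-robin over hosts; consecutive picks are always distinct hosts.
        chosen = [self.hosts[(self._next + i) % len(self.hosts)] for i in range(copies)]
        self._next = (self._next + copies) % len(self.hosts)
        self.locations[key] = chosen
        return chosen

    def where(self, key):
        return self.locations[key]
```

Replication is decided per class rather than per file, so a class of important files can keep two or three copies while a class of temporary files keeps one.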
That is all for now; there is much more in the documentation itself. Its authors have presented this document at two MySQL Con events, two OS Con events, and numerous other conferences, selflessly sharing their experience so that we can learn from it. In the Web 2.0 era, rapid development gets more and more attention, but good design is still the foundation of every application. Let us hope that Web 2.0 sites on their way to becoming Top 500 websites are not held back by their architecture.