reference:http://blog.couchbase.com/topology-architecture-distributed-systems

You can’t judge a book by its cover, but you can judge the architecture of a distributed system by its topology.

If two distributed systems are equally effective, is the one with the simpler topology the one with the better architecture? This article compares the architecture of two document databases and two wide column stores by looking at their topologies.

Wide Column Store

Topology #1

Wow. There is a lot going on here. There are four nodes types and multiple components per node.

Topology #2

Nice. Simple. There is one node type.

Which wide column store would you choose?

  • Which one is going to be easier to deploy?
  • Which one is going to be easier to maintain?
  • Which one is going to be easier to scale?
  • Which one is going to be more resilient

I believe the less moving parts, the better.

Apache HBase

Apaceh HBase sits on top of Apache Hadoop, so there are a lot of nodes types and components. Apache Hadoop requires name nodes and data nodes for HDFS. It requires job trackers and task trackers for map / reduce.  Apache HBase requires master servers, region servers, and a Zookeeper cluster. The Apache HBase, HDFS, and map / reduce components can be co-located. However, they don’t have to be.

The master server and the name node may be single points of failure. However, multiple name nodes can be deployed, as can multiple master servers. That being said, there will be problems if the name nodes are unavailable, the master servers are unavailable, and / or the Zookeeper cluster is unavailable.

Apache Cassandra

There is one node type. That’s it. Clients communicate directly with the nodes. There are no single points of failure. There are no dependencies on independent nodes or separate clusters.

Document Databases

Topology #1

Wow. There is a lot going on here. There are four node types and two layers of logical groupings.

Topology #2

Nice. Simple. There is one node type.

Which document database would you choose?

  • Which one is going to be easier to deploy?
  • Which one is going to be easier to maintain?
  • Which one is going to be easier to scale?
  • Which one is going to be more resilient?

I believe the less moving parts, the better.

MongoDB

The MongoDB topology is similar to the Apache HBase topology. The difference is that clients to not directly connect to the nodes. THe client requests are proxied by the router nodes. The router nodes retrieve shard information from the config nodes. A shard consists of a replica set. A replica set consists of multiple nodes and an arbiter.

Like Apache HBase, the router node and the config node may be single points of failure. However, like Apache HBase, multiple router nodes and multiple config nodes can be deployed. That being said, there will be problems if the router nodes and / or the config nodes are unavailable.

Couchbase Server

There is one node type. That’s it. Clients communicate directly with the nodes. There are no single points of failure. There are no dependencies on independent nodes or separate clusters.

Summary

A great architecture balances flexibility and simplicity. There is value in a modular architecture. There is value in a simple architecture. However, modularity does not have to be reflected in the topology of a distributed system. Couchbase Server is a modular, distributed system. A single instance is compromised of multiple components and multiple services. However, the modularity is not forced on administrators. It is an aspect of the distributed system itself, not its deployment.

Topology: The Architecture of Distributed Systems--reference的更多相关文章

  1. Scalable Web Architecture and Distributed Systems

    转自:http://aosabook.org/en/distsys.html Scalable Web Architecture and Distributed Systems Kate Matsud ...

  2. 可扩展的Web系统和分布式系统(Scalable Web Architecture and Distributed Systems)

    Open source software has become a fundamental building block for some of the biggest websites. And a ...

  3. Distributed systems theory for the distributed systems engineer

    Gwen Shapira, SA superstar and now full-time engineer at Cloudera, asked a question on Twitter that ...

  4. Scalable, Distributed Systems Using Akka, Spring Boot, DDD, and Java--转

    原文地址:https://dzone.com/articles/scalable-distributed-systems-using-akka-spring-boot-ddd-and-java Whe ...

  5. [翻译] TensorFlow 分布式之论文篇 "TensorFlow : Large-Scale Machine Learning on Heterogeneous Distributed Systems"

    [翻译] TensorFlow 分布式之论文篇 "TensorFlow : Large-Scale Machine Learning on Heterogeneous Distributed ...

  6. Let it crash philosophy for distributed systems

    This past weekend I read Joe Armstrong’s paper on the history of Erlang. Now, HOPL papers in general ...

  7. [分布式系统学习]阅读笔记 Distributed systems for fun and profit 之一 基本概念

    因为工作的原因,最近打算看一些分布式学习的资料.其中这个http://book.mixu.net/distsys/就是一篇非常适合分布式入门的介绍. 这个短小的材料有下面5个小的章节,图文并茂,也没有 ...

  8. Distributed systems

    http://book.mixu.net/distsys/single-page.html

  9. Mit 分布式系统导论,Distributed Systems ,lab1 -lab6 总结,实验一到实验六总结

    终于把Mit的分布式系统导论课的实验1-6写完了 做得有些痛苦,但是收获也很大 http://pdos.csail.mit.edu/6.824-2012/labs/index.html 把实验1-6用 ...

随机推荐

  1. dom 优酷得弹出

    <!doctype html> <html> <head> <meta charset="utf-8"> <title> ...

  2. 第三百四十六天 how can I 坚持

    徐斌的电脑来了,thinkpad,感觉还好,电脑也就这样,联想..不好说,不做评论,末日王者吧. 为什么写博客tab键不管用了呢. 下午又去奥体跑了一圈,好累,刚跑完腿疼,现在还好. 还没洗澡呢,都这 ...

  3. JVM系列二:GC策略&内存申请、对象衰老

    JVM里的GC(Garbage Collection)的算法有很多种,如标记清除收集器,压缩收集器,分代收集器等等,详见HotSpot VM GC 的种类 现在比较常用的是分代收集(generatio ...

  4. HDU 2034 人见人爱A-B 分类: ACM 2015-06-23 23:42 9人阅读 评论(0) 收藏

    人见人爱A-B Time Limit: 2000/1000 MS (Java/Others)    Memory Limit: 65536/32768 K (Java/Others) Total Su ...

  5. HD1000A + B Problem

    Time Limit: 2000/1000 MS (Java/Others) Memory Limit: 65536/32768 K (Java/Others) Total Submission(s) ...

  6. Keil MDK Code、RO-data、RW-data、ZI-data数据段

      Program Size: Code=10848 RO-data=780 RW-data=372 ZI-data=868   Code 表示程序代码指令部分 存放在Flash区 RO-data 表 ...

  7. HDU 5794 A Simple Chess (Lucas + dp)

    题目链接:http://acm.hdu.edu.cn/showproblem.php?pid=5794 多校这题转化一下模型跟cf560E基本一样,可以先做cf上的这个题. 题目让你求一个棋子开始在( ...

  8. Oracle数据库编程:使用PL/SQL编写触发器

    8.使用PL/SQL编写触发器: 触发器存放在数据缓冲区中.        触发器加序列能够实现自动增长.        在触发器中不能使用connit和rollback.        DML触发器 ...

  9. 当WEB站点应用程序池标识为ApplicationPoolIdentity,出现运行错误时的解决方法

    对于数据库文件加Authenticated Users用户,并授予完全权限.

  10. [OAuth2 & OpenID] 1.OAuth2授权

    1 OAuth2解决什么问题的? 举个栗子先.小明在QQ空间积攒了多年的照片,想挑选一些照片来打印出来.然后小明在找到一家提供在线打印并且包邮的网站(我们叫它PP吧(Print Photo缩写