
You can’t judge a book by its cover, but you can judge the architecture of a distributed system by its topology.

If two distributed systems are equally effective, is the one with the simpler topology the one with the better architecture? This article compares the architecture of two document databases and two wide column stores by looking at their topologies.

Wide Column Stores

Topology #1

Wow. There is a lot going on here. There are four node types and multiple components per node.

Topology #2

Nice. Simple. There is one node type.

Which wide column store would you choose?

  • Which one is going to be easier to deploy?
  • Which one is going to be easier to maintain?
  • Which one is going to be easier to scale?
  • Which one is going to be more resilient?

I believe the fewer moving parts, the better.

Apache HBase

Apache HBase sits on top of Apache Hadoop, so there are a lot of node types and components. Apache Hadoop requires name nodes and data nodes for HDFS. It requires job trackers and task trackers for map / reduce. Apache HBase requires master servers, region servers, and a Zookeeper cluster. The Apache HBase, HDFS, and map / reduce components can be co-located. However, they don’t have to be.

The master server and the name node may be single points of failure. However, multiple name nodes can be deployed, as can multiple master servers. That being said, there will be problems if the name nodes are unavailable, the master servers are unavailable, and / or the Zookeeper cluster is unavailable.

Apache Cassandra

There is one node type. That’s it. Clients communicate directly with the nodes. There are no single points of failure. There are no dependencies on independent nodes or separate clusters.
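The contrast between the two wide column stores can be sketched in a few lines of Python. This is only an illustration, not a monitoring tool: the role names are taken from the descriptions above, while the live-instance counts and the `unavailable_roles` helper are made up for the example. It also simplifies by treating a role as available whenever at least one instance is up.

```python
# Role names come from the text above; the counts below are hypothetical.
HBASE_ROLES = [
    "name node", "data node",          # HDFS
    "job tracker", "task tracker",     # map / reduce
    "master server", "region server",  # Apache HBase
    "zookeeper",                       # separate Zookeeper cluster
]
CASSANDRA_ROLES = ["node"]


def unavailable_roles(roles, live_counts):
    """Return the roles that have no live instance, in the order given."""
    return [role for role in roles if live_counts.get(role, 0) == 0]


# Hypothetical cluster states: every master server is down in the first one.
hbase_live = {
    "name node": 2, "data node": 5,
    "job tracker": 1, "task tracker": 5,
    "master server": 0, "region server": 5,
    "zookeeper": 3,
}
cassandra_live = {"node": 5}

print(len(HBASE_ROLES), len(CASSANDRA_ROLES))               # 7 1
print(unavailable_roles(HBASE_ROLES, hbase_live))           # ['master server']
print(unavailable_roles(CASSANDRA_ROLES, cassandra_live))   # []
```

The more roles a topology has, the more entries an administrator has to keep above zero.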

Document Databases

Topology #1

Wow. There is a lot going on here. There are four node types and two layers of logical groupings.

Topology #2

Nice. Simple. There is one node type.

Which document database would you choose?

  • Which one is going to be easier to deploy?
  • Which one is going to be easier to maintain?
  • Which one is going to be easier to scale?
  • Which one is going to be more resilient?

I believe the fewer moving parts, the better.


MongoDB

The MongoDB topology is similar to the Apache HBase topology. The difference is that clients do not connect directly to the nodes. The client requests are proxied by the router nodes. The router nodes retrieve shard information from the config nodes. A shard consists of a replica set. A replica set consists of multiple nodes and an arbiter.

Like Apache HBase, the router node and the config node may be single points of failure. However, like Apache HBase, multiple router nodes and multiple config nodes can be deployed. That being said, there will be problems if the router nodes and / or the config nodes are unavailable.
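The request path described above (client to router, router to config node, then on to a shard's replica set) can be modeled roughly as follows. This is a minimal sketch, not MongoDB's actual implementation: the class names, the split-point lookup, and the node names are all assumptions made for illustration.

```python
import bisect
from dataclasses import dataclass


@dataclass
class ReplicaSet:
    name: str
    members: list   # data-bearing nodes
    arbiter: str    # takes part in elections, holds no data


@dataclass
class ConfigNode:
    # Hypothetical shard map: sorted split points divide the key space.
    split_points: list
    shard_names: list   # always len(split_points) + 1 entries

    def shard_for(self, key):
        """Return the name of the shard that owns the given key."""
        return self.shard_names[bisect.bisect_right(self.split_points, key)]


@dataclass
class Router:
    config: ConfigNode
    shards: dict

    def route(self, key):
        """Ask the config node which shard owns the key, then return it."""
        return self.shards[self.config.shard_for(key)]


shards = {
    "shard-a": ReplicaSet("shard-a", ["a1", "a2"], "a-arbiter"),
    "shard-b": ReplicaSet("shard-b", ["b1", "b2"], "b-arbiter"),
}
router = Router(ConfigNode(["m"], ["shard-a", "shard-b"]), shards)

# The client only ever talks to the router.
print(router.route("alice").name)   # shard-a
print(router.route("zoe").name)     # shard-b
```

Even in this toy model, a request cannot be served unless both the router and the config node are reachable.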

Couchbase Server

There is one node type. That’s it. Clients communicate directly with the nodes. There are no single points of failure. There are no dependencies on independent nodes or separate clusters.


A great architecture balances flexibility and simplicity. There is value in a modular architecture. There is value in a simple architecture. However, modularity does not have to be reflected in the topology of a distributed system. Couchbase Server is a modular, distributed system. A single instance is composed of multiple components and multiple services. However, the modularity is not forced on administrators. It is an aspect of the distributed system itself, not its deployment.
