The Step-by-Step Approach

The goal is to break down a tricky problem and to solve it using what you do know.

Step 1: Make Believe

Pretend that the data can all fit on one machine and there are no memory limitations. Provide the general outline for your solution.

Step 2: Get Real

Figure out how to logically divide the data up across machines, and how one machine would identify where to look up a different piece of data.

Step 3: Solve Problems

  Dividing Up Lots of Data:

By Order of Appearance:

By Hash Value:
1) Pick some sort of key relating to the data.
2) Hash the key.
3) Mod the hash value by the number of machines.
4) Store the data on the machine with that value.

        With this approach, there is no relationship between what the data represents and which machine stores it.
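
The four hashing steps above can be sketched roughly as follows. This is only an illustration: the names `machine_for`, `store`, and `lookup` are hypothetical, each "machine" is modeled as a plain dict, and SHA-256 is an arbitrary choice of hash (Python's built-in `hash()` is avoided because string hashes are randomized per process, so different machines would disagree).

```python
import hashlib


def machine_for(key: str, num_machines: int) -> int:
    """Return the index of the machine that should hold `key`."""
    # Step 2: hash the key with a hash that is stable across processes.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest[:8], "big")
    # Step 3: mod the hash value by the number of machines.
    return hash_value % num_machines


def store(machines: list, key: str, value) -> int:
    """Step 4: store the data on the machine with that index."""
    index = machine_for(key, len(machines))
    machines[index][key] = value
    return index


def lookup(machines: list, key: str):
    """Any machine can recompute the index to find where a key lives."""
    return machines[machine_for(key, len(machines))].get(key)


# Hypothetical cluster: four machines, each modeled as a dict.
machines = [{} for _ in range(4)]
store(machines, "doc-17", "contents of document 17")
print(lookup(machines, "doc-17"))
```

Because the index depends only on the key and the machine count, a lookup needs no central directory. The trade-off in this simple form is that changing the number of machines changes the index of most keys.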

By Actual Value: Reduce system latency by using information about what the data represents.

Arbitrarily:

Good Example: Find all documents that contain a list of words.
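
For the Step 1 ("Make Believe") version of this example, one possible outline is an in-memory index from each word to the set of documents containing it, followed by an intersection over the query words. The sketch below assumes everything fits on one machine; `build_index` and `find_documents` are hypothetical names, and splitting on whitespace is a simplification of real tokenization.

```python
from collections import defaultdict


def build_index(documents: dict) -> dict:
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def find_documents(index: dict, words: list) -> set:
    """Return the ids of documents that contain every word in `words`."""
    result = None
    for word in words:
        ids = index.get(word.lower(), set())
        result = set(ids) if result is None else result & ids
        if not result:
            break  # no document can match once the intersection is empty
    return result if result is not None else set()


# Made-up sample documents, for illustration only.
docs = {
    1: "many books about cats",
    2: "books and more books",
    3: "cats and dogs",
}
index = build_index(docs)
print(find_documents(index, ["books", "cats"]))
```

In Step 2, the same index could be divided across machines, for example by hashing each word to decide which machine holds its document set.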


10.1 Build some sort of service that will be called by up to 1000 client applications to get simple end-of-day stock price information.

Start by thinking about the different aspects to consider in a given proposal:

1. Client Ease of Use: We want the service to be easy for the clients to implement and useful for them.

2. Ease for Ourselves: Consider not only the cost of implementation, but also the cost of maintenance.

3. Flexibility for Future Demands:

4. Scalability and Efficiency: The solution should not overly burden our service.

Database vs. XML (JSON): see p. 343.


10.2 Good problem

10.7 LRU Cache
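
As a reminder of the data structure involved, here is a minimal sketch of an LRU (least recently used) cache, assuming a single machine and Python's `collections.OrderedDict`. The class name `LRUCache` and its `get`/`put` methods are illustrative choices, not the book's solution.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # "a" is now the most recently used entry
cache.put("c", 3)   # over capacity, so "b" is evicted
print(cache.get("b"), cache.get("a"), cache.get("c"))
```

Both `get` and `put` run in constant time here because `OrderedDict` keeps the recency order internally; a hand-written version would typically pair a hash table with a doubly linked list.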
