The Step-by-Step Approach

break down a tricky problem and to solve problems using what you do know.

Step 1: Make Believe

Pretend that the data can all fit on one machine and there are no memory limitations. Provide the general outline for your solution.

Step 2:  Get Real

figure out how to logically divide the data up, and how one machine would identify where to look up a different piece of data.

Step 3: Solve Problems

  Dividing Up Lots of Data:

By Order of Appearance:

By Hash Value: 1)pick some sort of key relating to the data 2)hash the key 3)mod the hash value by the number of machines 4)store data on the machine with that value

        there is no relationship between what the data represents and which machine stores data.

By Acutal Value: reduce system latency by using information about what the data represents.

Arbitrarily:

Good Example: Find all documents that contains a list of words.


10.1 build some sort of service that will be called by up to 1000 client applications to get simple end-of-day stock price information.

We want to start off by thinking about what the different aspects we should consider in a given proposal are:

1. Client Ease of Use: we want the service to be easy for the clients to implement and useful for them

2. Ease for Ourselves: consider in this not only the cost of implementing, but also the cost of maintenance

3. Flexibility for Future Demands:

4. Scalability and Efficiency: not to overly burden our service.

DataBase vs XML(json) P 343


10.2 good problem

10.7 LRU Cache

Chp10: Scalability and Memory Limits的更多相关文章

  1. is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.6 GB of 40 GB virtual memory used

    昨天使用hadoop跑五一的数据,发现报错: Container [pid=,containerID=container_1453101066555_4130018_01_000067] GB phy ...

  2. Memory Limits for Windows and Windows Server Releases

    来源:https://msdn.microsoft.com/en-us/library/windows/desktop/aa366778(v=vs.85).aspx Limits on memory ...

  3. [hadoop] - Container [xxxx] is running beyond physical/virtual memory limits.

    当运行mapreduce的时候,有时候会出现异常信息,提示物理内存或者虚拟内存超出限制,默认情况下:虚拟内存是物理内存的2.1倍.异常信息类似如下: Container [pid=13026,cont ...

  4. hive: insert数据时Error during job, obtaining debugging information 以及beyond physical memory limits

    insert overwrite table canal_amt1...... 2014-10-09 10:40:27,368 Stage-1 map = 100%, reduce = 32%, Cu ...

  5. hadoop is running beyond virtual memory limits问题解决

    单机搭建了2.6.5的伪分布式集群,写了一个tf-idf计算程序,分词用的是结巴分词,使用standalone模式运行没有任何问题,切换到伪分布式模式运行一直报错: hadoop is running ...

  6. hadoop的job执行在yarn中内存分配调节————Container [pid=108284,containerID=container_e19_1533108188813_12125_01_000002] is running beyond virtual memory limits. Current usage: 653.1 MB of 2 GB physical memory used

    实际遇到的真实问题,解决方法: 1.调整虚拟内存率yarn.nodemanager.vmem-pmem-ratio (这个hadoop默认是2.1) 2.调整map与reduce的在AM中的大小大于y ...

  7. [转载]Memory Limits for Windows and Windows Server Releases

    Memory Limits for Windows and Windows Server Releases This topic describes the memory limits for sup ...

  8. Kafka:ZK+Kafka+Spark Streaming集群环境搭建(十三)kafka+spark streaming打包好的程序提交时提示虚拟内存不足(Container is running beyond virtual memory limits. Current usage: 119.5 MB of 1 GB physical memory used; 2.2 GB of 2.1 G)

    异常问题:Container is running beyond virtual memory limits. Current usage: 119.5 MB of 1 GB physical mem ...

  9. Container [pid=6263,containerID=container_1494900155967_0001_02_000001] is running beyond virtual memory limits

    以Spark-Client模式运行,Spark-Submit时出现了下面的错误: User: hadoop Name: Spark Pi Application Type: SPARK Applica ...

随机推荐

  1. ant条件逻辑

    <condition property="sdk-folder" value="E:\android\android-sdk\adt-bundle-windows- ...

  2. 《gpg文件加密的使用》RHEL6

    甲端: 首先是要生成一对密钥: 提示是否要生成2048个字节的密钥对:   下面都是生成密钥对时的步骤: 按“o”键开始生成密钥对: 提示要我给密钥对加个密码: 输入2次 之后密钥对的字符需要我按键盘 ...

  3. exynos 4412 电源管理芯片PMIC 的配置及使用方法

    /** ****************************************************************************** * @author    Maox ...

  4. eclipse 最全快捷键(网络收集)

    Ctrl+1 快速修复(最经典的快捷键,就不用多说了) Ctrl+D: 删除当前行 Ctrl+Alt+↓ 复制当前行到下一行(复制增加) Ctrl+Alt+↑ 复制当前行到上一行(复制增加) Alt+ ...

  5. 解决grub引导错误的一次经历

    我的电脑上一共是两块硬盘,1块固态硬盘(sda)装了win7,另外一块普通硬盘(sdb)装了ubuntu和centos两个系统,系统启动的引导是装在sdb上面的ubuntu的grub2,它负责选择不同 ...

  6. winform INI文件操作辅助类

    using System;using System.Runtime.InteropServices;using System.Text; namespace connectCMCC.Utils{ // ...

  7. Silverlight浮动窗体 floatablewindow 非模态对话框

    1.http://www.cnblogs.com/yinxiangpei/articles/2613913.html 说明:Silverlight的ChildWindow组件给我们的开发带来了便利,比 ...

  8. Firebird数据库相关备忘录

    Firebird数据库中有一些很特别的东西,很好用,但由于平时用的不多,记在这里,以备以后用到时查询. 1.以ADO 的OLE ODBC驱动方式访问 Firebird,可以使用如下连接串: FBCon ...

  9. mysql基础三(视图、触发器、函数、存储过程、事务、防注入)

    一.视图 视图是一个虚拟表(非真实存在),其本质是[根据SQL语句获取动态的数据集,并为其命名],用户使用时只需使用[名称]即可获取结果集,并可以将其当作表来使用. 1.创建视图 -格式:CREATE ...

  10. python学习第三天第一部分

    字典 1.字典的定义和规则: 定义:{key1:value1,key2:value2} key 的定义规则:1.必须是不可变的(数字.字符串.元组):2.必须是唯一的, value的定义规则:任意类型 ...