此书不错,很短,且想打通PYTHON和大数据架构的关系. 先看一次,计划把这个文档作个翻译. 先来一个模拟MAPREDUCE的东东... mapper.py class Mapper: def map(self, data): returnval = [] counts = {} for line in data: words = line.split() for w in words: counts[w] = counts.get(w, 0) + 1 for w, c in counts.it…