Reduction and Grouping - Notes on "Java 8 in Action"

Distinguishing Collection, Collector, and collect
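
The three names are easy to mix up: Collection is the container interface in java.util, Collector is the java.util.stream interface that describes how a stream is reduced, and collect is the terminal Stream operation that runs a Collector. The sketch below is only an illustration; the class name and the sample strings are made up and do not come from the article's repository.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectionCollectorCollect {
    static List<String> upperCased(Collection<String> names) {
        // Collector: a recipe for reducing a stream (here: accumulate into a List)
        Collector<String, ?, List<String>> toList = Collectors.toList();
        // collect: the terminal operation that applies the recipe to the stream
        return names.stream().map(String::toUpperCase).collect(toList);
    }

    public static void main(String[] args) {
        // Collection: the data structure that holds the elements
        Collection<String> names = Arrays.asList("pork", "rice");
        System.out.println(upperCased(names)); // [PORK, RICE]
    }
}
```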

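The classes and methods used in the code are outlined in red; they can be viewed in the git repository.

Collectors as advanced reductions

// Group transactions by currency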

Map<Currency, List<Transaction>> currencyListMap = getTransactions().stream()

    .collect(groupingBy(Transaction::getCurrency));

for (Map.Entry<Currency, List<Transaction>> entry : currencyListMap.entrySet()) {

    System.out.println(entry.getKey() + "\t" + entry.getValue().size());

}

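What the predefined collectors do

Reducing and summarizing stream elements into a single value
Grouping elements
Partitioning elements: a special case of grouping that uses a predicate (a function returning a boolean) as the grouping function

Overview of the static factory methods of the Collectors class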

// import static java.util.stream.Collectors.*;

// import static java.util.Comparator.comparingInt;

// A stream can be consumed only once, so every statement below opens a fresh stream.

// Static factory methods of the Collectors class

List<Dish> dishes1 = getMenu().stream().collect(toList());

Set<Dish> dishes2 = getMenu().stream().collect(toSet());

Collection<Dish> dishes3 = getMenu().stream().collect(toCollection(ArrayList::new));

long howManyDishes = getMenu().stream().collect(counting());

int totalCalories = getMenu().stream().collect(summingInt(Dish::getCalories));

double avgCalories = getMenu().stream().collect(averagingInt(Dish::getCalories));

IntSummaryStatistics menuStatistics = getMenu().stream().collect(summarizingInt(Dish::getCalories));

String shortMenu = getMenu().stream().map(Dish::getName).collect(joining(", "));

Optional<Dish> fattest = getMenu().stream().collect(maxBy(comparingInt(Dish::getCalories)));

Optional<Dish> lightest = getMenu().stream().collect(minBy(comparingInt(Dish::getCalories)));

int totalCalories2 = getMenu().stream().collect(reducing(0, Dish::getCalories, Integer::sum));

int howManyDishes2 = getMenu().stream().collect(collectingAndThen(toList(), List::size));

Map<Dish.Type, List<Dish>> dishesByType = getMenu().stream().collect(groupingBy(Dish::getType));

Map<Boolean, List<Dish>> vegetarianDishes = getMenu().stream().collect(partitioningBy(Dish::isVegetarian));

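Reducing and summarizing

Summarization is a special case of reduction

Summarization

How many dishes are on the menu

// How many dishes are on the menu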

long howManyDishes = getMenu().stream().collect(Collectors.counting());

System.out.println(howManyDishes); // 8

long howManyDishes2 = getMenu().stream().count();

System.out.println(howManyDishes2); // 8

System.out.println(getMenu().size()); // 8; isn't this simpler when the source is already a collection?

Maximum, minimum, and average

// The highest-calorie dish on the menu

Optional<Dish> mostCalaorieDish =

    getMenu().stream().collect(maxBy(comparingInt(Dish::getCalories)));

System.out.println(mostCalaorieDish.orElse(null)); // pork

// The lowest-calorie dish on the menu

Optional<Dish> leastCalaorieDish =

    getMenu().stream().collect(minBy(comparingInt(Dish::getCalories)));

System.out.println(leastCalaorieDish.orElse(null)); // season

// Total calories on the menu

int totalCalories =

    getMenu().stream().collect(summingInt(Dish::getCalories));

System.out.println(totalCalories); // 3850

// Average calories on the menu

OptionalDouble averageCalories =

    getMenu().stream().mapToDouble(Dish::getCalories).average();

System.out.println(averageCalories.orElse(0d)); // 481.25

One combined method: count, sum, min, average, max

// All of the summary figures above can be obtained with a single collector

IntSummaryStatistics menuStatistics = getMenu().stream().collect(summarizingInt(Dish::getCalories));

System.out.println(menuStatistics);

// IntSummaryStatistics{count=8, sum=3850, min=120, average=481.250000, max=800}

Joining strings with joining

// Join strings

String shortMenu = getMenu().stream()

    .map(Dish::getName) // needed: joining() only accepts CharSequence elements, so map each Dish to its name

    .collect(joining());

System.out.println(shortMenu);

// porkchickenfrench friesriceseasonpizzaprawnssalmon

// Comma-separated

String shortMenu2 = getMenu().stream()

    .map(Dish::getName)

    .collect(joining(", "));

System.out.println(shortMenu2);

// pork, chicken, french fries, rice, season, pizza, prawns, salmon

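Generalized summarization: reduction

All the collectors shown so far are just special cases of a reduction process that can be defined with the reducing factory method; Collectors.reducing is the generalization of all of them.

// Collectors.reducing() generalizes the cases above

// Total calories on the menu

int totalCalories2 = getMenu().stream()

    .collect(reducing(0,                 // first argument: the initial value

                      Dish::getCalories, // second argument: the mapping function that extracts the value to reduce

                      (i, j) -> i + j)); // third argument: the accumulation function, here a sum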

System.out.println(totalCalories2); // 3850

// The highest-calorie dish on the menu

Optional<Dish> mostCaloriesDish = getMenu().stream()

    .collect(reducing((d1, d2) -> d1.getCalories() > d2.getCalories() ? d1 : d2));

System.out.println(mostCaloriesDish.orElse(null)); // pork

// collect versus reduce

int totalCalories3 = getMenu().stream()

    .map(Dish::getCalories)

    .reduce(Integer::sum)

    .get();

System.out.println(totalCalories3);
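
To illustrate reducing as the general case once more, counting() can itself be written as a reducing collector, and a sum can also be computed on an IntStream, which avoids boxing every value. This is a minimal sketch with made-up names (ReducingSketch, count, sum) and plain lists instead of the article's Dish menu.

```java
import java.util.Arrays;
import java.util.List;
import static java.util.stream.Collectors.reducing;

class ReducingSketch {
    // counting() expressed with reducing: map every element to 1L, then add up
    static <T> long count(List<T> items) {
        return items.stream().collect(reducing(0L, e -> 1L, Long::sum));
    }

    // Summing on an IntStream works on primitive ints instead of boxed Integers
    static int sum(List<Integer> calories) {
        return calories.stream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        System.out.println(count(Arrays.asList("pork", "rice", "pizza"))); // 3
        System.out.println(sum(Arrays.asList(800, 400, 530)));             // 1730
    }
}
```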

Grouping and partitioning

Grouping dishes by type

// Group by type

Map<Dish.Type, List<Dish>> typeMap = getMenu().stream()

    .collect(groupingBy(Dish::getType));

System.out.println(typeMap);

// {OTHER=[rice, season, pizza], FISH=[prawns, salmon], MEAT=[pork, chicken, french fries]}

// Group by caloric level

Map<CaloricLevel, List<Dish>> dishesByCaloricLevel = getMenu().stream()

    .collect(groupingBy(Dish::getCaloricLevel));

System.out.println(dishesByCaloricLevel);

// {DIET=[french fries, season, prawns], FAT=[pork], NORMAL=[chicken, rice, pizza, salmon]}

Multilevel grouping

Group by type first, then by caloric level

// Group by type first, then by caloric level

Map<Dish.Type, Map<CaloricLevel, List<Dish>>> dishesByTypeCaloriclevel =

    getMenu().stream()

        .collect(groupingBy(Dish::getType, groupingBy(Dish::getCaloricLevel)));

System.out.println(dishesByTypeCaloriclevel);

// {OTHER={DIET=[season], NORMAL=[rice, pizza]},

// FISH={DIET=[prawns], NORMAL=[salmon]},

// MEAT={DIET=[french fries], FAT=[pork], NORMAL=[chicken]}}

Collecting data in subgroups

// How many dishes there are of each type

Map<Dish.Type, Long> typesCount = getMenu().stream()

    .collect(groupingBy(Dish::getType, counting()));

System.out.println(typesCount);

// {OTHER=3, FISH=2, MEAT=3}

// Note: groupingBy(f) is equivalent to groupingBy(f, toList())
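
The equivalence can be checked directly. The sketch below groups a few strings by length both ways and compares the results; the class name, method names, and sample words are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.toList;

class GroupingDefaults {
    // One-argument form: the downstream collector defaults to toList()
    static Map<Integer, List<String>> byLength(List<String> words) {
        return words.stream().collect(groupingBy(String::length));
    }

    // Two-argument form with the downstream collector spelled out
    static Map<Integer, List<String>> byLengthExplicit(List<String> words) {
        return words.stream().collect(groupingBy(String::length, toList()));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pork", "rice", "pizza", "salmon");
        System.out.println(byLength(words).equals(byLengthExplicit(words))); // true
    }
}
```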

Adapting the collector result to a different type

// The highest-calorie dish of each type

Map<Dish.Type, Optional<Dish>> mostCaloricByType = getMenu().stream()

    .collect(groupingBy(Dish::getType, maxBy(comparingInt(Dish::getCalories))));

System.out.println(mostCaloricByType);

// {OTHER=Optional[pizza], FISH=Optional[salmon], MEAT=Optional[pork]}

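// Adapting the collector result to a different type

// The highest-calorie dish of each type

Map<Dish.Type, Dish> mostCaloricByType2 = getMenu().stream()

    .collect(groupingBy(Dish::getType, // classification function

                        collectingAndThen( // the wrapping collector

                            maxBy(comparingInt(Dish::getCalories)), // the wrapped collector

                            Optional::get))); // transformation function; safe because a group exists only if it contains at least one dish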

Other examples of collectors used together with groupingBy

// Other examples of collectors used together with groupingBy

// Total calories for each type

Map<Dish.Type, Integer> totalCaloriesByType = getMenu().stream()

    .collect(groupingBy(Dish::getType,

                        summingInt(Dish::getCalories)));

System.out.println(totalCaloriesByType);

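// Which caloric levels occur within each type

// Using toSet()

Map<Dish.Type, Set<CaloricLevel>> caloricLevelsByType = getMenu().stream()

    .collect(groupingBy(Dish::getType,

                        // mapping applies a function to each input element before accumulation, so a collector

                        // that accepts one element type can be used with objects of a different type

                        mapping(

                            Dish::getCaloricLevel, // transform each element of the stream

                            toSet()))); // collect the transformed results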

System.out.println(caloricLevelsByType);

// {FISH=[NORMAL, DIET], MEAT=[FAT, NORMAL, DIET], OTHER=[NORMAL, DIET]}

// Using toCollection(HashSet::new)

Map<Dish.Type, Set<CaloricLevel>> caloricLevelsByType2 = getMenu().stream()

    .collect(groupingBy(Dish::getType,

                        mapping(Dish::getCaloricLevel,

                                toCollection(HashSet::new))));

System.out.println(caloricLevelsByType2);

// {FISH=[NORMAL, DIET], MEAT=[FAT, NORMAL, DIET], OTHER=[NORMAL, DIET]}

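A special case: partitioning

Partitioning is a special case of grouping in which a predicate (a function returning a boolean) serves as the classification function; it is called the partitioning function.

// Partition into vegetarian and non-vegetarian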

Map<Boolean, List<Dish>> partitionedMenu = getMenu().stream()

    .collect(partitioningBy(Dish::isVegetarian));

System.out.println(partitionedMenu);

// {false=[pork, chicken, french fries, prawns, salmon], true=[rice, season, pizza]}

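// Partition into vegetarian and non-vegetarian, then group by type

Map<Boolean, Map<Dish.Type, List<Dish>>> vegetarianDishesByType = getMenu().stream()

    .collect(partitioningBy(Dish::isVegetarian, // partitioning function

                            groupingBy(Dish::getType))); // downstream collector

// The highest-calorie dish among vegetarian and among non-vegetarian dishes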

Map<Boolean, Dish> mostCaloricPartitionedByVegetarian = getMenu().stream()

    .collect(partitioningBy(Dish::isVegetarian,

                            collectingAndThen(

                                maxBy(comparing(Dish::getCalories)),

                                Optional::get)));

System.out.println(mostCaloricPartitionedByVegetarian);

// {false=pork, true=pizza}
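
One practical difference between partitioningBy and groupingBy with a boolean classifier is that a partition always contains both the true and the false key, even when one side is empty, whereas groupingBy only creates keys that actually occur. A small sketch with integers (all names here are illustrative, not from the article's code):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.groupingBy;
import static java.util.stream.Collectors.partitioningBy;

class PartitionVsGroup {
    // Partition: the result map always has a true and a false entry
    static Map<Boolean, List<Integer>> partitionEven(List<Integer> nums) {
        return nums.stream().collect(partitioningBy(n -> n % 2 == 0));
    }

    // Grouping by the same boolean: only keys that occur are present
    static Map<Boolean, List<Integer>> groupEven(List<Integer> nums) {
        return nums.stream().collect(groupingBy(n -> n % 2 == 0));
    }

    public static void main(String[] args) {
        List<Integer> odds = Arrays.asList(1, 3, 5);
        System.out.println(partitionEven(odds).containsKey(true)); // true
        System.out.println(groupEven(odds).containsKey(true));     // false
    }
}
```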

Partitioning numbers into primes and non-primes

Testing for primality

// Prime test

public boolean isPrime(int candidate) {

    return IntStream.range(2, candidate)

        .noneMatch(i -> candidate % i == 0);

}

// Optimization: test only divisors up to the square root of the candidate

public boolean isPrime2(int candidate) {

    int candidateRoot = (int) Math.sqrt((double) candidate);

    return IntStream.rangeClosed(2, candidateRoot)

        .noneMatch(i -> candidate % i == 0);

}
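
Both versions above report 0, 1, and negative numbers as prime, because the range is then empty and noneMatch returns true on an empty stream. That does not affect the partitioning code in this article, which starts at 2, but a guarded variant might look like the following sketch (the class name PrimeCheck is an assumption for illustration):

```java
import java.util.stream.IntStream;

class PrimeCheck {
    static boolean isPrime(int candidate) {
        if (candidate < 2) {
            return false; // 0, 1 and negative numbers are not prime
        }
        int root = (int) Math.sqrt(candidate);
        // For 2 and 3 the range is empty, so noneMatch returns true
        return IntStream.rangeClosed(2, root)
                .noneMatch(i -> candidate % i == 0);
    }

    public static void main(String[] args) {
        System.out.println(isPrime(1));  // false
        System.out.println(isPrime(97)); // true
        System.out.println(isPrime(91)); // false (7 * 13)
    }
}
```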

Partitioning numbers into primes and non-primes

// Partition numbers into primes and non-primes

public Map<Boolean, List<Integer>> partitionPrimes(int n) {

    return IntStream.rangeClosed(2, n).boxed()

        .collect(partitioningBy(candidate -> isPrime2(candidate)));

}

Custom collectors

Collecting the elements of a Stream into a List

/**

 * Collects all the elements of a Stream<T> into a List<T>

 * Author:   admin

 * Date:     2018/8/15 15:03

 */

public class ToListCollector<T> implements Collector<T, List<T>, List<T>> {

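    // T is the generic type of the items in the stream to be collected

    // A is the type of the accumulator, the object in which partial results are accumulated during collection

    // R is the type of the object produced by the collect operation (usually, but not necessarily, a collection)

    // Creates a new result container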

    @Override

    public Supplier<List<T>> supplier() {

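        // Must return a Supplier of an empty result, that is, a no-argument function

        // that creates an empty accumulator instance when invoked, for use during collection

        // return () -> new ArrayList<T>();

        return ArrayList::new; // starting point of the collection operation

    }

    // Adds an element to the result container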

    @Override

    public BiConsumer<List<T>, T> accumulator() {

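        // Returns the function that performs the reduction step

        // return (list, item) -> list.add(item);

        return List::add; // accumulates each traversed item, modifying the accumulator in place

    }

    // Applies the final transformation to the result container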

    @Override

    public Function<List<T>, List<T>> finisher() {

        return Function.identity(); // identity function

    }

    // Merges two result containers

    @Override

    public BinaryOperator<List<T>> combiner() {

        return (list1, list2) -> { // merge the two accumulators

            list1.addAll(list2);

            return list1;

        };

    }

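    // Returns an immutable set of Characteristics

    @Override

    public Set<Characteristics> characteristics() {

        // IDENTITY_FINISH: the accumulator A can safely be cast to the result R without any check

        // CONCURRENT: the accumulator may be called from several threads on the same container;

        // it is left out here because ArrayList is not thread-safe

        return Collections.unmodifiableSet( // flags describing the collector

                EnumSet.of(Characteristics.IDENTITY_FINISH));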

    }

}

Usage

// A stream can be consumed only once, so each statement opens its own stream

// Using the built-in collector

List<Dish> dishes2 = FakeDb.getMenu().stream().collect(Collectors.toList());

// Using the custom collector

List<Dish> dishes = FakeDb.getMenu().stream().collect(new ToListCollector<Dish>());

// A custom collection without implementing Collector

List<Dish> dishes3 = FakeDb.getMenu().stream().collect(

    ArrayList::new, // supplier

    List::add, // accumulator

    List::addAll // combiner

);
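
Between writing a full Collector class and passing three functions to collect, there is a middle ground: the static factory Collector.of builds a reusable collector from the same functions. When no finisher is given, it marks the collector as IDENTITY_FINISH by itself. The following is a sketch under that assumption; CollectorOfSketch and toListCollector are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorOfSketch {
    // A reusable to-list collector built from supplier, accumulator and combiner
    static <T> Collector<T, List<T>, List<T>> toListCollector() {
        return Collector.of(
                ArrayList::new,                 // supplier: new empty container
                List::add,                      // accumulator: add one element
                (left, right) -> {              // combiner: merge two containers
                    left.addAll(right);
                    return left;
                });
    }

    public static void main(String[] args) {
        List<String> names = Stream.of("pork", "rice")
                .collect(CollectorOfSketch.<String>toListCollector());
        System.out.println(names); // [pork, rice]
    }
}
```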

Partitioning numbers into primes and non-primes

/**

 * Partitions the first n natural numbers into primes and non-primes

 * Author:   admin

 * Date:     2018/8/15 15:28

 */

public class PrimeNumbersCollector implements Collector<Integer,

        Map<Boolean, List<Integer>>,

        Map<Boolean, List<Integer>>> {

    @Override

    public Supplier<Map<Boolean, List<Integer>>> supplier() {

        // Start the collection with a Map holding two empty Lists

        return () -> new HashMap<Boolean, List<Integer>>() {{

           put(true, new ArrayList<Integer>());

           put(false, new ArrayList<Integer>());

        }};

    }

    @Override

    public BiConsumer<Map<Boolean, List<Integer>>, Integer> accumulator() {

        // Pass the list of primes found so far to isPrime

        return (Map<Boolean, List<Integer>> acc, Integer candidate) -> {

            // Use the result of isPrime to pick the prime or non-prime list from the map and add the candidate to it

            acc.get(isPrime(acc.get(true), candidate)).add(candidate);

        };

    }

    @Override

    public BinaryOperator<Map<Boolean, List<Integer>>> combiner() {

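        // Merge the second map into the first (the collector is only meaningful sequentially, since isPrime relies on the primes found so far)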

        return (Map<Boolean, List<Integer>> map1, Map<Boolean, List<Integer>> map2) -> {

            map1.get(true).addAll(map2.get(true));

            map1.get(false).addAll(map2.get(false));

            return map1;

        };

    }

    @Override

    public Function<Map<Boolean, List<Integer>>, Map<Boolean, List<Integer>>> finisher() {

        return Function.identity();

    }

    @Override

    public Set<Characteristics> characteristics() {

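        // Primes are discovered in order, so the collector is neither UNORDERED nor CONCURRENT

        return Collections.unmodifiableSet(EnumSet.of(Characteristics.IDENTITY_FINISH));

    }

    // Further optimization: test only against the primes found before the candidate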

    public static boolean isPrime(List<Integer> primes, int candidate) {

        // return primes.stream().noneMatch(i -> candidate % i == 0);

        int candidateRoot = (int) Math.sqrt((double) candidate);

        return takeWhile(primes, i -> i <= candidateRoot)

                .stream()

                .noneMatch(p -> candidate % p == 0);

    }

    public static <A> List<A> takeWhile(List<A> list, Predicate<A> p) {

        int i = 0;

        for (A item : list) {

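            if (!p.test(item)) { // check whether the current item satisfies the predicate

                return list.subList(0, i); // if not, return the prefix before it

            }

            i++;

        }

        return list; // every item satisfies the predicate, so return the whole list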

    }

}

Usage

// Use the custom prime collector to partition numbers into primes and non-primes

public Map<Boolean, List<Integer>> partitionPrimesWithCustomCollector(int n) {

    return IntStream.rangeClosed(2, n).boxed()

        .collect(new PrimeNumbersCollector());

}
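
The hand-written takeWhile helper is needed on JDK 1.8, which is what this article uses. On Java 9 or later, Stream.takeWhile provides the same behavior directly. The sketch below assumes a Java 9+ runtime and that primes holds, in ascending order, every prime smaller than the candidate; the class name is made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

class PrimeWithTakeWhile {
    // Assumes `primes` is sorted ascending and contains all primes below `candidate`
    static boolean isPrime(List<Integer> primes, int candidate) {
        int root = (int) Math.sqrt(candidate);
        return primes.stream()
                .takeWhile(p -> p <= root)          // Java 9+: stop at the first prime above the root
                .noneMatch(p -> candidate % p == 0);
    }

    public static void main(String[] args) {
        List<Integer> primes = Arrays.asList(2, 3, 5, 7);
        System.out.println(isPrime(primes, 11)); // true
        System.out.println(isPrime(primes, 9));  // false
    }
}
```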

Comparing collector performance

@Test

public void test08() {

    long fastest = Long.MAX_VALUE;

    for (int i=0; i<10; i++) {

        long start = System.nanoTime();

        partitionPrimes(1_000_000);

        long duration = (System.nanoTime() - start) / 1_000_000;

        if (duration < fastest) fastest = duration;

    }

    System.out.println("Fastest execution done in " + fastest + " msecs");

    // Fastest execution done in 371 msecs

}

@Test

public void test09() {

    long fastest = Long.MAX_VALUE;

    for (int i=0; i<10; i++) {

        long start = System.nanoTime();

        partitionPrimesWithCustomCollector(1_000_000);

        long duration = (System.nanoTime() - start) / 1_000_000;

        if (duration < fastest) fastest = duration;

    }

    System.out.println("Fastest execution done in " + fastest + " msecs");

    // Fastest execution done in 294 msecs

}

Environment:

Intel i7-4790 3.60GHz
Windows 10
jdk 1.8

Performance improvement: (371 - 294) / 371 ≈ 20.75%
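
The two tests above differ only in the method being timed, so the loop could be factored into a helper that takes the method as a Consumer. This is a rough sketch and not a rigorous benchmark (there is no JIT warm-up control, for example); the names Benchmark and fastestMillis are assumptions.

```java
import java.util.function.Consumer;

class Benchmark {
    // Runs the task `runs` times and returns the fastest duration in milliseconds
    static long fastestMillis(Consumer<Integer> task, int input, int runs) {
        long fastest = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.accept(input);
            long duration = (System.nanoTime() - start) / 1_000_000;
            if (duration < fastest) {
                fastest = duration;
            }
        }
        return fastest;
    }

    public static void main(String[] args) {
        // Placeholder workload; in the article this would be partitionPrimes or
        // partitionPrimesWithCustomCollector
        long ms = fastestMillis(n -> java.util.stream.IntStream.rangeClosed(2, n).sum(), 1_000_000, 10);
        System.out.println("Fastest execution done in " + ms + " msecs");
    }
}
```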

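Summary

The two roles of collectors

Reduction, with summarization as a special case: reducing and summarizing stream elements into a single value
Grouping, with partitioning as a special case

Code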

https://gitee.com/yysue/tutorials-java/tree/master/java-8
