import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.VoidFunction;
import scala.Tuple2;
import java.util.Arrays;
import java.util.List;

/**
 * The sortByKey([ascending], [numTasks]) operator:
 * sorts a pair RDD by key.
 * The first argument: true sorts ascending, false sorts descending.
 * The second argument sets the number of tasks (result partitions) used for the sort.
 */
public class SortByKeyOperator {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setMaster("local").setAppName("sortByKey");
        JavaSparkContext sc = new JavaSparkContext(conf);

        List<Tuple2<String, Integer>> list = Arrays.asList(
                new Tuple2<String, Integer>("w1", 1),
                new Tuple2<String, Integer>("w2", 2),
                new Tuple2<String, Integer>("w3", 3),
                new Tuple2<String, Integer>("w2", 22),
                new Tuple2<String, Integer>("w1", 11)
        );

        JavaPairRDD<String, Integer> pairRdd = sc.parallelizePairs(list);

        // sort ascending by key, producing 2 result partitions
        JavaPairRDD<String, Integer> result = pairRdd.sortByKey(true, 2);

        // print each pair on the executors
        result.foreach(new VoidFunction<Tuple2<String, Integer>>() {
            @Override
            public void call(Tuple2<String, Integer> stringIntegerTuple2) throws Exception {
                System.err.println(stringIntegerTuple2._1 + ":" + stringIntegerTuple2._2);
            }
        });

        sc.close();
    }
}
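One thing worth noting: foreach runs on the executors, so with numTasks = 2 the printed lines come from two partitions, and even though each partition is internally sorted and the partitions are range-ordered by key, the interleaving of their output is not guaranteed. A minimal sketch of the alternative: pass false to sortByKey for a descending sort and use collect() to bring the results back to the driver, where the global order is visible in one place (the class name SortByKeyDescending is my own, not from the original code):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;
import java.util.Arrays;
import java.util.List;

public class SortByKeyDescending {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setMaster("local").setAppName("sortByKeyDesc");
        JavaSparkContext sc = new JavaSparkContext(conf);

        List<Tuple2<String, Integer>> list = Arrays.asList(
                new Tuple2<String, Integer>("w1", 1),
                new Tuple2<String, Integer>("w2", 2),
                new Tuple2<String, Integer>("w3", 3),
                new Tuple2<String, Integer>("w2", 22),
                new Tuple2<String, Integer>("w1", 11)
        );

        JavaPairRDD<String, Integer> pairRdd = sc.parallelizePairs(list);

        // false => descending order by key; collect() returns the globally
        // sorted list to the driver
        List<Tuple2<String, Integer>> sorted = pairRdd.sortByKey(false).collect();
        for (Tuple2<String, Integer> t : sorted) {
            System.out.println(t._1 + ":" + t._2);
        }

        sc.close();
    }
}

With the five pairs above, the keys should come out as w3, w2, w2, w1, w1; the relative order of values under duplicate keys is not specified by sortByKey.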


