import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext; import java.util.Arrays;
import java.util.List; /**
* sampleTake 算子:
* 先 sample 再 take
* 第一个参数:是否可以重复
* 第二个参数:返回take(n)
* 第三个参数:代表一个随机数种子,就是抽样算法的初始值
*/
public class TakeSampleOperator {
public static void main(String[] args) {
SparkConf conf = new SparkConf().setMaster("local").setAppName("sampleTake");
JavaSparkContext sc = new JavaSparkContext(conf);
List<String> list = Arrays.asList("w1","w2","w3","w4","w5");
JavaRDD<String> listRDD = sc.parallelize(list); List<String> reuslt = listRDD.takeSample(false,2,1);
System.err.println(reuslt); }
}

微信扫描下图二维码加入博主知识星球,获取更多大数据、人工智能、算法等免费学习资料哦!

java实现spark常用算子之TakeSample的更多相关文章

  1. java实现spark常用算子之Union

    import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.a ...

  2. java实现spark常用算子之SaveAsTextFile

    import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.a ...

  3. java实现spark常用算子之Repartitions

    import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.a ...

  4. java实现spark常用算子之mapPartitionsWithIndex

    import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.a ...

  5. java实现spark常用算子之map

    import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.a ...

  6. java实现spark常用算子之intersection

    import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.a ...

  7. java实现spark常用算子之frist

    import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.a ...

  8. java实现spark常用算子之flatmap

    import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.a ...

  9. java实现spark常用算子之filter

    import org.apache.spark.SparkConf;import org.apache.spark.api.java.JavaRDD;import org.apache.spark.a ...

随机推荐

  1. Devops(一):CentOS7 安装Maven3.6.1详解

    https://yq.aliyun.com/topic/78?spm=5176.8290451.656547.7.rMYhAF https://yq.aliyun.com/activity/155?u ...

  2. pytorch-卷积基本网络结构-提取网络参数-初始化网络参数

    基本的卷积神经网络 from torch import nn class SimpleCNN(nn.Module): def __init__(self): super(SimpleCNN, self ...

  3. AB窗体互传参数本质

    一.找了好几个,都不靠谱,不是说不靠谱,自己感觉太繁琐,根本就是本窗体的属性(对象)的传递,1实例化2把实例化后的窗体属性=本窗体的对象 二.传递的的时候都是在互相引用的时候传递,推荐的个人认为最简单 ...

  4. 语法错误 : 缺少“;”(在“<”的前面)

    该错误有可能是由错误所指行“<”附近的某个类型忘记#include <>所造成的

  5. 深度学习之加载VGG19模型获取特征图

    1.加载VGG19获取图片特征图 # coding = utf-8 import tensorflow as tf import numpy as np import matplotlib.pyplo ...

  6. python之scrapy爬取某集团招聘信息

    1.创建工程 scrapy startproject gosuncn 2.创建项目 cd gosuncn scrapy genspider gaoxinxing gosuncn.zhiye.com 3 ...

  7. delphi循环校验数据集

    function XXXXXFrom.CheckData(Sender: TObject): Boolean; var tmpcds:TfwClientDataset; begin Result:=F ...

  8. MySQL数据库锁机制之MyISAM引擎表锁和InnoDB行锁详解

    转 http://blog.csdn.net/hsd2012/article/details/51112009 转 http://blog.csdn.net/e421083458/article/de ...

  9. liunx 中如何删除export设置的环境变量

    1,网上有资料说,export命令添加的环境变量,利用export -p 删除: 例如:export  KUBECONFIG="/etc/kubernetes/admin.conf" ...

  10. PTA (Advanced Level)1077.Kuchiguse

    The Japanese language is notorious for its sentence ending particles. Personal preference of such pa ...