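All operations in this article build on the earlier article on setting up a Spark cluster; steps repeated from that article are abbreviated here. The earlier configuration used master01, slave01, slave02, and slave03. This article adds two more nodes, master02 and CloudDeskTop, and sets up their runtime environment.
I. Procedure
1. High availability has to be configured before the high-availability cluster is built. Start on master01:
[hadoop@master01 ~]$ cd /software/spark-2.1.1/conf/
[hadoop@master01 conf]$ vi s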
// Count the top 10 words
def main(args: Array[String]): Unit = {
  val conf = new SparkConf().setAppName("tst").setMaster("local[3]")
  val sc = new SparkContext(conf)
  // word count
  val res = sc.textFile("D:\\test\\spark\\urlCount").flatMap(_.split("
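The excerpt above breaks off partway through its pipeline, so the original delimiter and the remaining steps are not known. As a minimal sketch of how a top-N word count can be written with the RDD API, the following assumes whitespace-delimited input and uses a small in-memory sample instead of the original input path. The object name `TopWords` and the helper `topWords` are hypothetical names introduced for illustration only.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object TopWords {
  // Hypothetical helper: returns the n most frequent words, most frequent first.
  def topWords(lines: RDD[String], n: Int): Array[(String, Int)] =
    lines
      .flatMap(_.split("\\s+"))          // assumption: words are separated by whitespace
      .filter(_.nonEmpty)                // drop empty tokens produced by leading whitespace
      .map(word => (word, 1))            // pair each word with a count of one
      .reduceByKey(_ + _)                // sum the counts per word
      .sortBy(_._2, ascending = false)   // order by count, highest first
      .take(n)                           // bring only the first n pairs to the driver

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("tst").setMaster("local[3]")
    val sc = new SparkContext(conf)
    try {
      // In-memory sample data; a real job would use sc.textFile(...) instead.
      val sample = sc.parallelize(Seq("a b a", "b a c"))
      topWords(sample, 10).foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

Sorting the full RDD is the simplest way to express the idea. For large vocabularies, `top(n)` with an ordering on the count avoids the full sort, at the cost of slightly less readable code.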
Create a new DataFrame:
val conf = new SparkConf().setAppName("TTyb").setMaster("local")
val sc = new SparkContext(conf)
val spark: SQLContext = new SQLContext(sc)
import org.apache.spark.sql.functions.explode
import org.apache.spark.sql.func
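The excerpt above is cut off before the DataFrame is actually built, so what follows is only a sketch of one way it might continue, not the original author's code. It assumes a two-column table with an array column and uses `explode` to produce one row per array element. The object name `ExplodeSketch`, the helper `exploded`, the column names `id`, `tags`, and `tag`, and the sample rows are all hypothetical.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.functions.{col, explode}

object ExplodeSketch {
  // Hypothetical helper: builds a small DataFrame and flattens its array column.
  def exploded(spark: SQLContext): DataFrame = {
    import spark.implicits._ // brings toDF into scope for local Seq values

    // Illustrative data: each id carries a list of tags.
    val df = Seq(
      ("u1", Seq("x", "y")),
      ("u2", Seq("z"))
    ).toDF("id", "tags")

    // explode emits one output row per element of the "tags" array.
    df.select(col("id"), explode(col("tags")).as("tag"))
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("TTyb").setMaster("local")
    val sc = new SparkContext(conf)
    val spark: SQLContext = new SQLContext(sc)
    try {
      exploded(spark).show()
    } finally {
      sc.stop()
    }
  }
}
```

`SQLContext` is kept here to match the excerpt. On Spark 2.x and later it still works but is deprecated in favor of `SparkSession`.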