1. Wordcount.scala (local mode)

```scala
package com.Mars.spark

import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by Mars on 2017/1/11.
 */
object Wordcount {
  def main(args: Array[String]): Unit = {
    // setMaster takes a string: the original bare Local[*] does not compile.
    val conf: SparkConf = new SparkConf().setMaster("local[*]").setAppName("SparkwordcountApp")
    val sc = new SparkContext(conf)
    sc.textFile("/input")
      .flatMap(_.split(" "))   // flatMap needs a function; passing the bare string " " is a bug
      .map((_, 1))
      .reduceByKey(_ + _)
      .saveAsTextFile("/output")
    sc.stop()
  }
}
```
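To see what each step of the pipeline produces without starting Spark, the same chain can be traced on plain Scala collections. This is only a sketch of the semantics (here `reduceByKey` is emulated with `groupBy` plus a per-key sum; `WordCountDemo` is an illustrative name, not part of Spark):

```scala
// Emulate the RDD pipeline on a local Seq to see what each step computes.
object WordCountDemo {
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // "hello scala" -> Seq("hello", "scala")
      .map(word => (word, 1))  // "hello" -> ("hello", 1)
      .groupBy(_._1)           // collect the pairs for each key, like reduceByKey's shuffle
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) } // fold the 1s per key
}

val result = WordCountDemo.count(Seq("hello scala", "hello world"))
// result == Map("hello" -> 2, "scala" -> 1, "world" -> 1)
```

On a real RDD the grouping and summing happen per partition and across the cluster, but the key/value bookkeeping is the same.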
1. Data used

Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools, including Spark SQL for SQL and structured data processing, MLlib for machine learning, GraphX for graph processing, and Spark Streaming.
The prepared test data file hello.txt:

```
hello scala
hello world
nihao hello
i am scala
this is spark demo
gan jiu wan le
```

Upload the file to HDFS:

```shell
# create the HDFS test directory (-p also creates missing parent directories)
hdfs dfs -mkdir -p /user/spark/input/
# upload the local file hello.txt to HDFS
hdfs dfs -put ./hello.txt /user/spark/input/
```

Code (changed to read the data from HDFS and write the result back to HDFS):
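The HDFS-reading version announced above might look like the following sketch. It is an assumption based on the paths used earlier, not the original author's file: the output directory name is made up, and the program compiles only with Spark's dependencies on the classpath.

```scala
package com.Mars.spark

import org.apache.spark.{SparkConf, SparkContext}

object WordcountHdfs {
  def main(args: Array[String]): Unit = {
    // No setMaster here: when run on a cluster the master is supplied by spark-submit.
    val conf = new SparkConf().setAppName("SparkWordCountHdfsApp")
    val sc = new SparkContext(conf)
    sc.textFile("/user/spark/input/hello.txt") // resolved against the default FS, i.e. HDFS
      .flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .saveAsTextFile("/user/spark/output")    // assumed output dir; must not already exist
    sc.stop()
  }
}
```

Note that `saveAsTextFile` fails if the output directory exists, so an `hdfs dfs -rm -r /user/spark/output` may be needed between runs.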
Attempting to build and run the standalone Scala app from http://spark.apache.org/docs/latest/quick-start.html#a-standalone-app-in-scala from source. This line:

```scala
val wordCounts = textFile.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
```

reports a compile error on `reduceByKey` (the message is truncated in the original; it most likely reads "value reduceByKey is not a member of ..."). On Spark versions before 1.3, pair operations such as `reduceByKey` are added to `RDD[(K, V)]` by an implicit conversion, so the file needs `import org.apache.spark.SparkContext._`; from 1.3 onward that conversion is in scope automatically.
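The mechanism behind that import can be sketched in plain Scala. The names below (`MiniRDD`, `PairOps`, `toPairOps`) are hypothetical stand-ins, not Spark's classes; the point is that the pair operations live in a separate wrapper and only become callable once an implicit conversion is in scope, which is exactly what `import org.apache.spark.SparkContext._` provided before Spark 1.3.

```scala
import scala.language.implicitConversions

// A toy "RDD" with no pair operations of its own (illustrative names, not Spark's).
class MiniRDD[T](val data: Seq[T]) {
  def map[U](f: T => U): MiniRDD[U] = new MiniRDD(data.map(f))
}

// The extra key/value operations live in a separate wrapper class...
class PairOps[K, V](rdd: MiniRDD[(K, V)]) {
  def reduceByKey(f: (V, V) => V): Map[K, V] =
    rdd.data.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).reduce(f)) }
}

// ...and an implicit conversion wires them up. Without this definition in scope,
// calling reduceByKey on a MiniRDD[(K, V)] is exactly the compile error above.
implicit def toPairOps[K, V](rdd: MiniRDD[(K, V)]): PairOps[K, V] = new PairOps(rdd)

val words = new MiniRDD(Seq("hello", "scala", "hello"))
val counts = words.map(w => (w, 1)).reduceByKey(_ + _)
// counts == Map("hello" -> 2, "scala" -> 1)
```

Spark 1.3 moved the real conversion (`rddToPairRDDFunctions`) into the `RDD` companion object, which is why newer code compiles without the explicit import.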