example:
To run in Eclipse:
Before running the above Wordcount program in Eclipse, set up the following configuration:
1. Add the Spark jars to the classpath
2. Add the Scala and Java libraries to the project's libraries
3. Set the arguments used to run the program:
Program arguments:
"C:\Programs\spark-2.1.1-bin-hadoop2.7/README.md" "C:\Users\svm6kor\Documents\Trainings\SparkWithScala\Code\output5"
VM Arguments:
-Xms1336m -Xmx1336m
-Dspark.driver.memory=2g
-Djava.net.preferIPv4Stack=true
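The VM arguments above can be checked from inside the JVM before launching the real job. This is a minimal sketch, assuming it is run from Eclipse with the same VM arguments; `HeapCheck` and `toMiB` are hypothetical names that are not part of the original program.

```scala
object HeapCheck {
  // Convert a byte count to whole mebibytes
  def toMiB(bytes: Long): Long = bytes / (1024L * 1024L)

  def main(args: Array[String]): Unit = {
    // Maximum heap the JVM will try to use, driven by -Xmx
    println(s"Max heap (MiB): ${toMiB(Runtime.getRuntime.maxMemory)}")
    // -D options show up as JVM system properties
    println(s"spark.driver.memory: ${sys.props.getOrElse("spark.driver.memory", "not set")}")
    println(s"java.net.preferIPv4Stack: ${sys.props.getOrElse("java.net.preferIPv4Stack", "not set")}")
  }
}
```

The reported heap may come out a little below the `-Xmx` value, depending on the JVM and garbage collector in use.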
To run from the command prompt:
Using spark-submit:
From outside the project workspace:
C:\Programs\spark-2.1.1-bin-hadoop2.7\bin>spark-submit --class Wordcount --master local C:/Users/userName/Documents/Trainings/SparkWithScala/Code/Wordcount.jar C:/Users/userName/Documents/Trainings/SparkWithScala/Code/WordcountData output
or
From the working folder:
C:\Users\userName\Documents\Trainings\SparkWithScala\Code>c:\Programs\spark-2.1.1-bin-hadoop2.7\bin\spark-submit --class Wordcount --master local Wordcount.jar WordcountData output
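Everything that follows the jar path on the spark-submit command line is handed to `main` as `args`, and the program expects at least two values: an input path and an output path. The sketch below shows one way to validate them without starting Spark. `ArgsCheck`, `parse`, and the placeholder names in the usage text are assumptions for illustration, not the original author's code.

```scala
object ArgsCheck {
  // Hypothetical helper: returns the (input, output) pair, or a usage message
  def parse(args: Array[String]): Either[String, (String, String)] =
    args match {
      case Array(input, output, _*) => Right((input, output))
      case _ => Left("Usage: Wordcount <input> <output>")
    }

  def main(args: Array[String]): Unit =
    parse(args) match {
      case Right((in, out)) => println(s"input=$in output=$out")
      case Left(usage) => println(usage)
    }
}
```

Running it with a single argument such as `output` prints the usage message, which is a quick way to spot a missing input path.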
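import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

object Wordcount {
  def main(args: Array[String]): Unit = {
    // Create the conf object
    val conf = new SparkConf()
      .setMaster("local") // run Spark locally
      .set("spark.driver.memory", "1g") // to resolve the memory issue (Could not reserve enough space for 3145728KB object heap)
      .setAppName("WordCount") // application name
    // Create the Spark context object
    val sc = new SparkContext(conf)
    // Check whether sufficient params are supplied
    if (args.length < 2) {
      println("Usage: ScalaWordCount <input> <output>") // placeholder names are assumed
      sc.stop()
      sys.exit(1)
    }
    // The original listing is cut off here; the rest is the usual word count, reconstructed as an assumption
    sc.textFile(args(0)) // read the input file(s) as lines
      .flatMap(_.split("\\s+")) // split each line into words
      .map(word => (word, 1)) // pair each word with a count of 1
      .reduceByKey(_ + _) // sum the counts per word
      .saveAsTextFile(args(1)) // write the result to the output folder
    sc.stop()
  }
}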
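The counting step of a word count can be tried without Spark at all. The following is a minimal sketch, assuming plain Scala collections stand in for the RDD and `groupBy` plus a sum plays the role of `reduceByKey`; `LocalWordCount` and `count` are hypothetical names used only for illustration.

```scala
object LocalWordCount {
  // Count words in in-memory lines: split, pair each word with 1, then sum per word
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))   // split each line on whitespace
      .filter(_.nonEmpty)         // drop empty tokens from leading spaces or blank lines
      .map(word => (word, 1))     // pair each word with a count of 1
      .groupBy(_._1)              // group the pairs by word
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit =
    count(Seq("to be or not", "to be")).toSeq.sortBy(_._1).foreach(println)
}
```

This is handy for checking the splitting rule on a few sample lines before packaging the jar.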