example:
To run in Eclipse:
Before running the wordcount program in Eclipse, set the following configuration:
1. Add the Spark jars to the classpath
2. Add the Scala and Java libraries to the project libraries
3. Set up the run configuration as follows:
Program arguments:
"C:\Programs\spark-2.1.1-bin-hadoop2.7/README.md" "C:\Users\svm6kor\Documents\Trainings\SparkWithScala\Code\output5"
VM Arguments:
-Xms1336m -Xmx1336m
-Dspark.driver.memory=2g
-Djava.net.preferIPv4Stack=true
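The two program arguments reach `main` as `args(0)` and `args(1)`, presumably the input path and the output path in that order. A minimal sketch of validating them before any Spark work starts, in plain Scala so it runs without Spark on the classpath. The object name `ArgsCheck`, the helper `parseArgs`, and the usage message are hypothetical and not part of the original program:

```scala
object ArgsCheck {
  // Hypothetical helper (not from the original program): returns the
  // (input, output) pair, or a usage message when an argument is missing.
  def parseArgs(args: Array[String]): Either[String, (String, String)] =
    if (args.length < 2) Left("Usage: Wordcount <input path> <output path>")
    else Right((args(0), args(1)))

  def main(args: Array[String]): Unit =
    parseArgs(args) match {
      case Right((input, output)) =>
        println(s"input=$input output=$output")
      case Left(usage) =>
        // Report the problem and stop with a non-zero exit code
        System.err.println(usage)
        sys.exit(1)
    }
}
```

Returning an `Either` keeps the check separate from the exit call, so the same helper can be reused from Eclipse and from spark-submit.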
To run from the command prompt:
Using spark-submit:
From outside the project workspace:
C:\Programs\spark-2.1.1-bin-hadoop2.7\bin>spark-submit --class Wordcount --master local C:/Users/userName/Documents/Trainings/SparkWithScala/Code/Wordcount.jar C:/Users/userName/Documents/Trainings/SparkWithScala/Code/WordcountData output
or
From the working folder:
C:\Users\userName\Documents\Trainings\SparkWithScala\Code>c:\Programs\spark-2.1.1-bin-hadoop2.7\bin\spark-submit --class Wordcount --master local Wordcount.jar WordcountData output
import org.apache.spark.SparkContext
import org.apache.spark.SparkConf

object Wordcount {
  def main(args: Array[String]) {
    // Create conf object
    val conf = new SparkConf()
      .setMaster("local") // to set the environment
      .set("spark.driver.memory", "1g") // to resolve the memory issue (Could not reserve enough space for 3145728KB object heap)
      .setAppName("WordCount") // application name
    // Create Spark context object
    val sc = new SparkContext(conf)
    // Check whether sufficient params are supplied
    if (args.length < 2) {
      println("Usage: ScalaWordCount
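The listing is cut off at the argument check. As a rough illustration of the counting logic that a word count program typically performs next, here is a sketch using plain Scala collections, so it runs without a Spark installation. On an RDD the corresponding steps would usually be `flatMap`, `map` and `reduceByKey`. The object `WordcountSketch` and the function `countWords` are hypothetical names and this is not the original author's code:

```scala
object WordcountSketch {
  // Hypothetical helper: split each line on whitespace, drop empty tokens,
  // and count how often each word occurs.
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))
      .filter(_.nonEmpty)
      .groupBy(identity)
      .map { case (word, occurrences) => word -> occurrences.size }

  def main(args: Array[String]): Unit = {
    // Sample input invented for illustration only
    val sample = Seq("to be or not to be", "that is the question")
    countWords(sample).toSeq.sortBy(_._1).foreach {
      case (word, count) => println(s"$word\t$count")
    }
  }
}
```

Splitting on `\\s+` rather than a single space is a design choice of this sketch; it avoids counting empty strings when a line contains repeated spaces or tabs.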