Spark commands Flashcards
1
Q
Name 6 action commands we learned in class
A
.collect() .take(num) .foreach() .count() .countByValue() .saveAsTextFile()
(num is the number of elements to return)
2
Q
Name 9 transformation commands we learned in class
A
- filter
- map
- flatMap
- sample
- distinct
- union
- intersection
- subtract
- cartesian
3
Q
What is the PySpark syntax to start a SparkContext?
A
from pyspark import SparkConf, SparkContext
conf = SparkConf().setAppName('example').setMaster('local[*]')
sc = SparkContext(conf=conf)
4
Q
What are the different ways to load a file into a Spark environment?
A
df = spark.read.load('FOO', format='FILEFORMAT', inferSchema='true', header='true')
Alternatively, you can load a file with a format-specific reader:
df = spark.read.json('FOO')
df = spark.read.csv('FOO')
5
Q
What are the five components of Spark?
A
1) Spark Core (entry point: SparkContext)
2) Spark SQL (entry point: SparkSession)
3) Spark Streaming
4) MLlib
5) GraphX