Columns Flashcards
What are the different ways you can select columns from customerDf?
.select("column_name") .select('column_name) .select($"column_name") .select(col("column_name")) .select(column("column_name")) .select(customerDf.col("column_name"))
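As a rough illustration of the card above, the sketch below runs each selection style against a tiny DataFrame. The sample rows and the `firstname`/`lastname` column names are made up for the example (the real customerDf is not shown in these cards), and the snippet assumes it is pasted into spark-shell or a Scala script with Spark on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, column}

val spark = SparkSession.builder().master("local[*]").appName("select-columns").getOrCreate()
import spark.implicits._ // enables $"..." and the 'symbol syntax

// Hypothetical sample data standing in for customerDf
val customerDf = Seq(("Ada", "Lovelace"), ("Alan", "Turing")).toDF("firstname", "lastname")

val byString = customerDf.select("firstname")                 // plain string
val bySymbol = customerDf.select('firstname)                  // Scala symbol (deprecated syntax in Scala 2.13)
val byDollar = customerDf.select($"firstname")                // $ string interpolator
val byCol    = customerDf.select(col("firstname"))            // functions.col
val byColumn = customerDf.select(column("firstname"))         // functions.column
val byDfCol  = customerDf.select(customerDf.col("firstname")) // column bound to this DataFrame
```

All six values should hold the same single-column result; which form to use is mostly a matter of taste, although `customerDf.col(...)` is handy when two DataFrames share a column name.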
How do you combine two columns?
.select(expr("concat(firstname, lastname) name"))
Is there an overload that takes a string and a column object?
No. You can mix different kinds of column objects, but you can't mix strings and column objects
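The two cards above can be sketched together. The sample row and column names are hypothetical, and the snippet again assumes spark-shell or a Scala script with Spark available.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, expr}

val spark = SparkSession.builder().master("local[*]").appName("mix-columns").getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for customerDf
val customerDf = Seq(("Ada", "Lovelace")).toDF("firstname", "lastname")

// Different kinds of Column objects mix freely: select(cols: Column*)
val mixedColumns = customerDf.select(
  $"firstname",
  col("lastname"),
  expr("concat(firstname, lastname) name") // combined column, aliased as "name"
)

// Strings only also works: select(col: String, cols: String*)
val stringsOnly = customerDf.select("firstname", "lastname")

// Mixing the two does not compile, because no overload accepts both:
// customerDf.select("firstname", col("lastname"))
```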
How can you select columns using SQL?
.selectExpr("column")
With SQL, how do you get and rename a column?
.selectExpr("birthdate birthday")
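A small sketch of the SQL-style selection, assuming a made-up `birthdate` column stored as a string. The explicit `as` form is shown next to the bare alias from the card, since Spark SQL accepts both.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("select-expr").getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for customerDf
val customerDf = Seq(("Ada", "1815-12-10")).toDF("firstname", "birthdate")

// Each argument is parsed as a SQL expression
val plain         = customerDf.selectExpr("firstname")
val renamed       = customerDf.selectExpr("birthdate birthday")    // alias without AS
val renamedWithAs = customerDf.selectExpr("birthdate as birthday") // same result, explicit AS
```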
How do you see all the columns for a DataFrame?
customerDf.columns
How do you rename a column in the DataFrame?
.withColumnRenamed("old_name", "new_name")
True or false: if you rename a column that does not exist, Spark will fail
False. The call succeeds but does nothing.
Can withColumnRenamed take in column objects?
No, strings only
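The rename cards can be checked with a short sketch. The data and the column name `no_such_column` are invented for the example.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("rename-column").getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for customerDf
val customerDf = Seq(("Ada", "1815-12-10")).toDF("firstname", "birthdate")

// Both arguments are plain strings: existing name, then new name
val renamed = customerDf.withColumnRenamed("birthdate", "birthday")

// Renaming a column that is not there is a no-op rather than an error
val unchanged = customerDf.withColumnRenamed("no_such_column", "whatever")

println(renamed.columns.mkString(", "))
println(unchanged.columns.mkString(", "))
```

Because a missing column is silently ignored, a typo in the old name will not raise an error; comparing `columns` before and after is one cheap way to catch that.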
How do you print the schema of the DataFrame?
.printSchema
How can you change the data type of a column without using Apache Spark types?
.select($"column_object".cast("long"))
How can you change the data type of the column using Apache Spark types?
import org.apache.spark.sql.types._ (the _ is a wildcard that imports every type in the package)
.select($"column_object".cast(StringType))
How can you change the data type of a column using a select expression?
.selectExpr("cast(complex_object.property_in_there[0] as double) rename_if_want")
This example casts a property of a complex type and renames it. The same works with any column
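The three casting cards can be sketched side by side. The DataFrame here is invented: `id` is a string column and `sensor` is a struct whose `readings` field is an array of strings, standing in for the `complex_object.property_in_there` shape mentioned above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.struct
import org.apache.spark.sql.types._

val spark = SparkSession.builder().master("local[*]").appName("cast-columns").getOrCreate()
import spark.implicits._

// Hypothetical data: a string id plus a struct holding an array of strings
val raw = Seq(("1", Seq("1.5", "2.5"))).toDF("id", "readings")
val df  = raw.select($"id", struct($"readings").as("sensor"))

// 1. Cast using a type name given as a string
val castByName = df.select($"id".cast("long"))

// 2. Cast using a Spark type object from org.apache.spark.sql.types
val castByType = df.select($"id".cast(IntegerType))

// 3. Cast inside a SQL expression, reaching into the struct and renaming the result
val castByExpr = df.selectExpr("cast(sensor.readings[0] as double) first_reading")

castByExpr.printSchema()
```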
How can you add a column made of two existing columns joined with a space?
.withColumn("new_column_name", concat_ws(" ", $"first_column", $"second_column"))
How can you remove a column using a string?
.drop("column_name")
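The last two cards can be sketched together: add a combined column, then drop one of its inputs by name. The sample row and the column names `firstname`, `lastname`, and `full_name` are assumptions made for the example.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.concat_ws

val spark = SparkSession.builder().master("local[*]").appName("add-drop-column").getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for customerDf
val customerDf = Seq(("Ada", "Lovelace")).toDF("firstname", "lastname")

// concat_ws takes the separator first, then the columns to join
val withFullName = customerDf.withColumn("full_name", concat_ws(" ", $"firstname", $"lastname"))

// drop accepts the column name as a plain string
val withoutLastName = withFullName.drop("lastname")

withoutLastName.show()
```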