|
Turn narrow dataframe into wide
Just for matter of completeness, here a way of turning the narrow dataframe back into a wide one.
We have this narrow dataframe:
sale_narrow_df.show()
+----------+-------+---+----+
| shop|product| x| y|
+----------+-------+---+----+
|superstore| bread| 1| 341|
|superstore| cheese| 1| 57|
|superstore| razors| 1| 360|
|superstore| soap| 1| 8|
|superstore| milk| 1| 33|
|superstore| nuts| 1|1338|
| megamart| bread| 1| 371|
| megamart| cheese| 1| 51|
| megamart| razors| 1| 599|
| megamart| nuts| 1|1342|
|superstore| bread| 2| 398|
|superstore| cheese| 2| 52|
|superstore| razors| 2| 362|
|superstore| soap| 2| 8|
|superstore| nuts| 2|1369|
| megamart| bread| 2| 432|
| megamart| cheese| 2| 56|
| megamart| soap| 2| 7|
| megamart| milk| 2| 29|
| megamart| nuts| 2|1264|
+----------+-------+---+----+
And here's a way to turn it back into a wide frame:
Step 1: create a list of dataframes, one per month
val months=Array("jan","feb","mar","apr","may","jun","jul","aug","sep","oct","nov","dec")
def sel( mon:Int, colname:String) = {
sale_narrow_df.where(s"x=$mon").select("shop","product","y").
withColumnRenamed("y",colname)
}
val df_ls = for ( i <- 0 until months.length ) yield { sel(i+1, months(i) ) }
Step 2: join the list of dataframes on columns 'shop' and 'product' :
def jn(df1: org.apache.spark.sql.DataFrame, df2:org.apache.spark.sql.DataFrame) = {
df1.join(df2,Seq("shop","product"), "outer")
}
val wide_df=df_ls.reduce( jn(_,_) )
Et voila:
wide_df.orderBy("shop","product").show()
+----------+-------+----+----+----+----+----+----+----+----+----+----+----+----+
| shop|product| jan| feb| mar| apr| may| jun| jul| aug| sep| oct| nov| dec|
+----------+-------+----+----+----+----+----+----+----+----+----+----+----+----+
| megamart| bread| 371| 432| 425| 524| 468| 414|null| 487| 493| 517| 473| 470|
| megamart| cheese| 51| 56| 63|null| 66| 66| 50| 56| 58|null| 48| 50|
| megamart| milk|null| 29| 26| 30| 26| 29| 29| 25| 27|null| 28| 30|
| megamart| nuts|1342|1264|1317|1425|1326|1187|1478|1367|1274|1380|1584|1156|
| megamart| razors| 599|null| 500| 423| 574| 403| 609| 520| 495| 577| 491| 524|
| megamart| soap|null| 7| 8| 9| 9| 8| 9| 9| 9| 6| 6| 8|
|superstore| bread| 341| 398| 427| 344| 472| 370| 354| 406|null| 407| 465| 402|
|superstore| cheese| 57| 52|null| 54| 62|null| 56| 66| 46| 63| 55| 53|
|superstore| milk| 33|null|null| 33| 30| 36| 35| 34| 38| 32| 35| 29|
|superstore| nuts|1338|1369|1157|1305|1532|1231|1466|1148|1298|1059|1216|1231|
|superstore| razors| 360| 362| 366| 352| 365| 361| 361| 353| 317| 335| 290| 406|
|superstore| soap| 8| 8| 7| 8| 6|null| 7| 7| 7| 8| 6|null|
+----------+-------+----+----+----+----+----+----+----+----+----+----+----+----+
| |