Simple Sales Prediction

Difference between each element and the average of all elements
Calculate the difference between each x-value and avg(x), and likewise between each y-value and avg(y):
val d2=d1.withColumn("xd", col("x")-col("avg_x")).
withColumn("yd", col("y")-col("avg_y"))
// apply cosmetics: round off
val d3=d2.withColumn("xdif", udfRound( d2("xd") )).
withColumn("ydif", udfRound( d2("yd") )).
drop("xd").drop("yd")
Result (show() prints only the first 20 rows by default):
d3.orderBy("shop","product","x").show()
+--------+-------+---+---+------+--------+-------+--------+
| shop|product| x| y| avg_x| avg_y| xdif| ydif|
+--------+-------+---+---+------+--------+-------+--------+
|megamart| bread| 1|371|6.4545|461.2727|-5.4545|-90.2727|
|megamart| bread| 2|432|6.4545|461.2727|-4.4545|-29.2727|
|megamart| bread| 3|425|6.4545|461.2727|-3.4545|-36.2727|
|megamart| bread| 4|524|6.4545|461.2727|-2.4546| 62.7273|
|megamart| bread| 5|468|6.4545|461.2727|-1.4546| 6.7273|
|megamart| bread| 6|414|6.4545|461.2727|-0.4546|-47.2727|
|megamart| bread| 8|487|6.4545|461.2727| 1.5454| 25.7273|
|megamart| bread| 9|493|6.4545|461.2727| 2.5454| 31.7273|
|megamart| bread| 10|517|6.4545|461.2727| 3.5455| 55.7273|
|megamart| bread| 11|473|6.4545|461.2727| 4.5455| 11.7273|
|megamart| bread| 12|470|6.4545|461.2727| 5.5455| 8.7273|
|megamart| cheese| 1| 51| 6.4| 56.4| -5.4| -5.4|
|megamart| cheese| 2| 56| 6.4| 56.4| -4.4| -0.4|
|megamart| cheese| 3| 63| 6.4| 56.4| -3.4| 6.6|
|megamart| cheese| 5| 66| 6.4| 56.4|-1.4001| 9.6|
|megamart| cheese| 6| 66| 6.4| 56.4|-0.4001| 9.6|
|megamart| cheese| 7| 50| 6.4| 56.4| 0.5999| -6.4|
|megamart| cheese| 8| 56| 6.4| 56.4| 1.5999| -0.4|
|megamart| cheese| 9| 58| 6.4| 56.4| 2.5999| 1.6|
|megamart| cheese| 11| 48| 6.4| 56.4| 4.6| -8.4|
+--------+-------+---+---+------+--------+-------+--------+
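The deviation step can be cross-checked outside Spark. The following is a minimal sketch in plain Scala, assuming the eleven megamart/bread rows from the table above as input. The object name DeviationCheck and the helper round4 are hypothetical, introduced only for this illustration; round4 is a stand-in for the rounding UDF and not necessarily how udfRound is implemented.

```scala
// Plain-Scala cross-check of the "difference from the average" step.
// Assumption: input is the megamart/bread group copied from the result table.
object DeviationCheck {
  // (x, y) pairs of the bread rows
  val bread: Seq[(Int, Int)] = Seq(
    (1, 371), (2, 432), (3, 425), (4, 524), (5, 468), (6, 414),
    (8, 487), (9, 493), (10, 517), (11, 473), (12, 470)
  )

  // Arithmetic mean of a non-empty sequence
  def mean(values: Seq[Double]): Double = values.sum / values.size

  // Difference between each element and the mean of all elements
  def deviations(values: Seq[Double]): Seq[Double] = {
    val m = mean(values)
    values.map(_ - m)
  }

  // Hypothetical helper: round half-up to 4 decimals (stand-in for udfRound)
  def round4(v: Double): Double =
    BigDecimal(v).setScale(4, BigDecimal.RoundingMode.HALF_UP).toDouble

  val xs: Seq[Double] = bread.map(_._1.toDouble)
  val ys: Seq[Double] = bread.map(_._2.toDouble)
  val avgX: Double = mean(xs)
  val avgY: Double = mean(ys)
  val xdif: Seq[Double] = deviations(xs)
  val ydif: Seq[Double] = deviations(ys)

  def main(args: Array[String]): Unit = {
    println(s"avg_x=${round4(avgX)} avg_y=${round4(avgY)}")
    // One line per row: x, y, rounded deviations
    bread.zip(xdif.zip(ydif)).foreach { case ((x, y), (dx, dy)) =>
      println(s"x=$x y=$y xdif=${round4(dx)} ydif=${round4(dy)}")
    }
  }
}
```

The rounded means should agree with avg_x and avg_y in the table. By construction the deviations of each column sum to zero (up to floating-point error), which is a quick sanity check for this step. Individual deviations may differ from the table in the last digit, depending on how udfRound rounds.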