Simple Sales Prediction

Difference between each element and the average of all elements

Calculate the difference between each x-value and avg(x), and likewise between each y-value and avg(y):

    val d2=d1.withColumn("xd", col("x")-col("avg_x")).
              withColumn("yd", col("y")-col("avg_y"))

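    // apply cosmetics: round off
    val d3=d2.withColumn("xdif", udfRound( d2("xd") )).
              withColumn("ydif", udfRound( d2("yd") )).
              // assumed: drop the unrounded columns, which are absent from the output below
              drop("xd").drop("yd")
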
|    shop|product|  x|  y| avg_x|   avg_y|   xdif|    ydif|
|megamart|  bread|  1|371|6.4545|461.2727|-5.4545|-90.2727|
|megamart|  bread|  2|432|6.4545|461.2727|-4.4545|-29.2727|
|megamart|  bread|  3|425|6.4545|461.2727|-3.4545|-36.2727|
|megamart|  bread|  4|524|6.4545|461.2727|-2.4546| 62.7273|
|megamart|  bread|  5|468|6.4545|461.2727|-1.4546|  6.7273|
|megamart|  bread|  6|414|6.4545|461.2727|-0.4546|-47.2727|
|megamart|  bread|  8|487|6.4545|461.2727| 1.5454| 25.7273|
|megamart|  bread|  9|493|6.4545|461.2727| 2.5454| 31.7273|
|megamart|  bread| 10|517|6.4545|461.2727| 3.5455| 55.7273|
|megamart|  bread| 11|473|6.4545|461.2727| 4.5455| 11.7273|
|megamart|  bread| 12|470|6.4545|461.2727| 5.5455|  8.7273|
|megamart| cheese|  1| 51|   6.4|    56.4|   -5.4|    -5.4|
|megamart| cheese|  2| 56|   6.4|    56.4|   -4.4|    -0.4|
|megamart| cheese|  3| 63|   6.4|    56.4|   -3.4|     6.6|
|megamart| cheese|  5| 66|   6.4|    56.4|-1.4001|     9.6|
|megamart| cheese|  6| 66|   6.4|    56.4|-0.4001|     9.6|
|megamart| cheese|  7| 50|   6.4|    56.4| 0.5999|    -6.4|
|megamart| cheese|  8| 56|   6.4|    56.4| 1.5999|    -0.4|
|megamart| cheese|  9| 58|   6.4|    56.4| 2.5999|     1.6|
|megamart| cheese| 11| 48|   6.4|    56.4|    4.6|    -8.4|
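
As a cross-check of the ydif column, the same subtraction can be sketched in plain Scala without Spark. This is only an illustration: `DeviationSketch`, `deviations` and `round4` are hypothetical names, and `round4` (half-up rounding to 4 decimals) is an assumption. The `udfRound` used above is not defined in this section and may round differently, so the last digit of some xdif values can deviate from what this sketch would give.

```scala
object DeviationSketch {
  // Hypothetical helper: round half-up to 4 decimals.
  // This is an assumption, not the udfRound from the notes.
  def round4(v: Double): Double =
    BigDecimal(v).setScale(4, BigDecimal.RoundingMode.HALF_UP).toDouble

  // Subtract the mean of the sequence from every element, then round.
  def deviations(values: Seq[Double]): Seq[Double] = {
    val mean = values.sum / values.size
    values.map(v => round4(v - mean))
  }

  def main(args: Array[String]): Unit = {
    // y-values of the megamart/bread rows from the table above
    val y = Seq(371, 432, 425, 524, 468, 414, 487, 493, 517, 473, 470).map(_.toDouble)
    deviations(y).foreach(println)
  }
}
```

For the bread rows the mean of y is 5074 / 11, which rounds to the 461.2727 shown in the avg_y column, and the first deviation comes out as -90.2727, in line with the first ydif value.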

Notes by Data Munging Ninja. Generated on nini:/home/willem/sync/20151223_datamungingninja/simplesalesprediction at 2016-06-25 10:02