Terentia Flores

05_Hive_UDF
20151226

# User Defined Functions in Hive

Defining your own function in Hive takes only four steps:

• write some Java code
• wrap it into a JAR
• add the jar in Hive
• create a function that points to your UDF class

The user-defined Hive function, in UdfRoughDistance.java:

```java
import java.lang.Math;
import org.apache.hadoop.hive.ql.exec.UDF;

public class UdfRoughDistance extends UDF {

    /** Calculate the approximate distance between two points */
    public double evaluate(double lat1, double lon1, double lat2, double lon2) {

        // convert to radians
        lat1 = lat1 * Math.PI / 180.0;
        lon1 = lon1 * Math.PI / 180.0;
        lat2 = lat2 * Math.PI / 180.0;
        lon2 = lon2 * Math.PI / 180.0;

        double r = 6371.0; // radius of the earth in kilometer
        double x = (lon2 - lon1) * Math.cos((lat1+lat2)/2.0);
        double y = (lat2 - lat1);
        return r*Math.sqrt(x*x+y*y);
    } // end evaluate

    /* The above formulas are called the "equirectangular approximation",
     * to be used for small distances, if performance is more important
     * than accuracy.
     * See: http://www.movable-type.co.uk/scripts/latlong.html
     */
}
```
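To get a feeling for what "small distances" means here, the sketch below compares the same equirectangular formula with the haversine (great-circle) formula, outside of Hive. The class name `RoughDistanceCheck`, the helper names and the sample coordinates are my own assumptions for illustration; they are not part of the UDF above.

```java
public class RoughDistanceCheck {

    // same earth radius as in the UDF, in kilometer
    static final double R = 6371.0;

    /** Equirectangular approximation, same formula as the UDF. */
    static double rough(double lat1, double lon1, double lat2, double lon2) {
        double p1 = Math.toRadians(lat1);
        double p2 = Math.toRadians(lat2);
        double x = Math.toRadians(lon2 - lon1) * Math.cos((p1 + p2) / 2.0);
        double y = p2 - p1;
        return R * Math.sqrt(x * x + y * y);
    }

    /** Haversine great-circle distance, used here as the reference value. */
    static double haversine(double lat1, double lon1, double lat2, double lon2) {
        double p1 = Math.toRadians(lat1);
        double p2 = Math.toRadians(lat2);
        double dPhi = p2 - p1;
        double dLam = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dPhi / 2) * Math.sin(dPhi / 2)
                 + Math.cos(p1) * Math.cos(p2)
                 * Math.sin(dLam / 2) * Math.sin(dLam / 2);
        return 2 * R * Math.asin(Math.sqrt(a));
    }

    public static void main(String[] args) {
        // illustrative coordinates (approximate city centres)
        double[] brussels = {50.85, 4.35};
        double[] antwerp  = {51.22, 4.40};
        double[] newYork  = {40.71, -74.01};

        System.out.printf("short hop: rough=%.2f km, haversine=%.2f km%n",
            rough(brussels[0], brussels[1], antwerp[0], antwerp[1]),
            haversine(brussels[0], brussels[1], antwerp[0], antwerp[1]));
        System.out.printf("long haul: rough=%.2f km, haversine=%.2f km%n",
            rough(brussels[0], brussels[1], newYork[0], newYork[1]),
            haversine(brussels[0], brussels[1], newYork[0], newYork[1]));
    }
}
```

For a hop of a few tens of kilometers the two results should be practically identical, while for an intercontinental distance the approximation drifts noticeably, which is the trade-off mentioned in the comment of the UDF.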

Once you have set up the proper classpath, just compile your Java file:

``javac UdfRoughDistance.java``

.. and create a jar file:

``jar cvf udf.jar UdfRoughDistance.class``

.. which you register in Hive as follows:

```sql
ADD JAR udf.jar;
CREATE TEMPORARY FUNCTION UDF_ROUGH_DISTANCE as 'UdfRoughDistance';
```

My classpath is defined as follows (from the script compile_jar_udf.sh):

```shell
export HH=/opt/hadoop-2.7.1/share/hadoop
export HI=/opt/apache-hive-1.2.1-bin

export CLASSPATH=$CLASSPATH\
:$HH/common/hadoop-common-2.7.1.jar\
:$HH/hdfs/hadoop-hdfs-2.7.1.jar\
:$HH/mapreduce/lib/*\
:$HH/common/lib/*\
:$HH/tools/lib/*\
:$HI/lib/hive-common-1.2.1.jar\
:$HI/lib/hive-contrib-1.2.1.jar\
:$HI/lib/hive-exec-1.2.1.jar
```

Notes by Data Munging Ninja. Generated on nini:sync/20151223_datamungingninja/terentiaflores at 2016-10-18 07:18