Apache Tez is an extensible framework for building high-performance data processing applications on YARN. Projects such as Hive and Pig can use it to improve performance and reduce response times, which makes them suitable for interactive workloads.
HDInsight 3.1 can run Hive queries on Tez, which provides substantial performance improvements over MapReduce. Tez is not enabled for Hive by default; you can enable it for the current session, as shown in the following code snippet:
set hive.execution.engine=tez;
select flightyear, flightquarter, flightmonth,
       regexp_replace(uniquecarrier,"\"","") as airlinecarrier,
       avg(depdelay) as avgdepdelay
from airline_otp_refined
group by flightyear, flightquarter, flightmonth,
         regexp_replace(uniquecarrier,"\"","");
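To compare the two engines on the same query, it can help to check which engine is active and to switch back and forth within one session. The following is a minimal sketch, assuming the standard Hive property name hive.execution.engine and a script run through the Hive command line; comment handling may differ between clients.

```sql
-- Print the execution engine currently in effect for this session
set hive.execution.engine;

-- Use Tez for the queries that follow in this session
set hive.execution.engine=tez;

-- Revert to MapReduce, for example to compare run times on the same query
set hive.execution.engine=mr;
```

Because the setting applies only to the current session, other sessions keep the cluster default unless they set the property themselves.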