Getting started with apache-pig

Installation or Setup



Requirements (r0.16.0)


According to the current Apache Pig documentation, Pig is supported only on Unix and Windows operating systems.

  • Hadoop 0.23.X, 1.X or 2.X
  • Java 1.6 or later, with the JAVA_HOME environment variable set to the Java installation directory
  • Python 2.7 or later (for Python UDFs)
  • Ant 1.8 (for builds)
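
The requirements above can be checked from a terminal before installing. The following is a minimal sketch, not an official check: `check_cmd` is a hypothetical helper, and it only tests whether each tool is on the PATH, not which version is installed.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# JAVA_HOME should point at the Java installation directory.
if [ -n "${JAVA_HOME:-}" ]; then
    echo "JAVA_HOME=$JAVA_HOME"
else
    echo "JAVA_HOME is not set"
fi

check_cmd java     # Java
check_cmd hadoop   # Hadoop
check_cmd python   # only needed for Python UDFs
check_cmd ant      # only needed for builds
```

Any line reported as missing points at a prerequisite to install or add to the PATH first.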

Download the latest Pig release

Download the latest version of Pig from http://pig.apache.org/releases.html#Download, then extract it and move it into place:


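# Create the install directory (may need elevated permissions)
mkdir /home/Pig
# Unpack the downloaded archive
cd Downloads/
tar zxvf pig-(latest-version).tar.gz
# Move the extracted files into the install directory
mv pig-(latest-version)/* /home/Pig/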


After installing Apache Pig, we have to configure it.

Open the .bashrc file

vim ~/.bashrc

In the .bashrc file, set the following variables (no spaces around the = sign):

export PIG_HOME=/home/Pig
export PATH=$PATH:$PIG_HOME/bin

Save the file and reload .bashrc in the current shell using

. ~/.bashrc
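
Because .bashrc is re-read every time it is sourced, a plain append adds the Pig directory to PATH again on each reload. One possible way to avoid this, shown as a sketch only, is to guard the append. `add_to_path` is a hypothetical helper name, and /home/Pig is the install location assumed throughout this section.

```shell
# Hypothetical helper: append a directory to PATH only if it is not already there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$1" ;;     # append once
    esac
}

export PIG_HOME=/home/Pig
add_to_path "$PIG_HOME/bin"
add_to_path "$PIG_HOME/bin"   # second call is a no-op
export PATH
```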

Verifying Pig version

pig -version

If the installation is successful, the above command displays the installed Pig version number.

Testing Pig Installation

pig -h

This should display all the command-line options available for pig.

Pig is now installed locally, and you can run it in local mode with the -x local parameter:

pig -x local

Connecting to Hadoop

If Hadoop 1.x or 2.x is installed on the cluster and the HADOOP_HOME environment variable is set, you can connect Pig to Hadoop by adding a line to .bashrc, as before.
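
A common way to do this is to point PIG_CLASSPATH at the Hadoop configuration directory. The sketch below assumes the usual layouts (etc/hadoop under HADOOP_HOME for Hadoop 2.x, conf for Hadoop 1.x). The fallback path /usr/local/hadoop is a placeholder, not a value from this guide, so adjust it to your cluster.

```shell
# Assumption: HADOOP_HOME is already set; the fallback path is only a placeholder.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"

# Hadoop 2.x keeps its configuration in etc/hadoop; Hadoop 1.x uses conf.
if [ -d "$HADOOP_HOME/etc/hadoop" ]; then
    HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
else
    HADOOP_CONF_DIR="$HADOOP_HOME/conf"
fi

export HADOOP_HOME HADOOP_CONF_DIR
# Let Pig find the cluster configuration files.
export PIG_CLASSPATH="$HADOOP_CONF_DIR"
```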


Running Pig

Execution Modes

You can run Pig either with the pig command (bin/pig) or by running the jar file (java -cp pig.jar).

Pig scripts can be executed in three different modes:

  • Local Mode

     pig -x local ...
  • Mapreduce Mode (default mode)

     pig -x mapreduce ...
     pig ...
  • Tez Mode

     pig -x tez ...
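
To illustrate how the -x flag selects a mode, here is a small sketch of a wrapper that validates the mode name before building the command. `pig_cmd` is a hypothetical helper, not part of Pig. It only prints the command line and accepts only the three mode names listed above.

```shell
# Hypothetical helper: build the pig command line for one of the modes listed above.
# With no argument it falls back to mapreduce, the default mode.
pig_cmd() {
    mode=${1:-mapreduce}
    case "$mode" in
        local|mapreduce|tez) echo "pig -x $mode" ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
}

pig_cmd local    # prints: pig -x local
pig_cmd          # prints: pig -x mapreduce
```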

Interactive Mode

Pig can be run in interactive mode using the Grunt shell. Pig Latin statements and commands can be entered interactively in this shell.


$ pig -x <mode> <enter>

Here <mode> can be any of the execution modes described in the previous section.

Batch Mode

Pig can also be executed in batch mode. Here, a .pig file containing a list of Pig Latin statements and commands is passed on the command line.


$ pig -x <mode> <script.pig>

As in interactive mode, <mode> can be any of the execution modes described in the previous section.
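
As a concrete illustration of batch mode, the sketch below writes a tiny input file and a Pig Latin script, then runs the script in local mode if pig is on the PATH. The file names people.txt and script.pig, and the sample data, are made up for this example.

```shell
# Work in a scratch directory (all file names here are hypothetical).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample tab-separated input: name, age.
printf 'alice\t30\nbob\t25\n' > people.txt

# A small Pig Latin script: load the file, keep rows with age > 26, print them.
cat > script.pig <<'EOF'
people = LOAD 'people.txt' AS (name:chararray, age:int);
older = FILTER people BY age > 26;
DUMP older;
EOF

# Run in local mode only if pig is on PATH.
if command -v pig >/dev/null 2>&1; then
    pig -x local script.pig
else
    echo "pig not found on PATH; script written to $workdir/script.pig"
fi
```

Local mode is used here so that the relative path people.txt resolves against the local file system rather than HDFS.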