Requirements (r0.16.0)
Mandatory
According to the current Apache Pig documentation, Pig supports only Unix and Windows operating systems.
Optional
Download the latest Pig release
Download the latest version of Pig from http://pig.apache.org/releases.html#Download
Installation
mkdir Pig
cd Downloads/
tar zxvf pig-(latest-version).tar.gz
mv pig-(latest-version)/* /home/Pig/
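The extract-and-move steps above can also be sketched as one small shell function. This is a minimal sketch, not an official installer: install_pig is a hypothetical helper name, the tarball path and target directory are whatever you pass in, and it assumes a tar that supports --strip-components (GNU tar and bsdtar both do).

```shell
# Hypothetical helper: unpack a Pig release tarball straight into a target directory.
# Usage: install_pig <tarball> <target-dir>
install_pig() {
    tarball=$1
    target=$2
    # Fail early if the tarball is missing.
    [ -f "$tarball" ] || { echo "no such file: $tarball" >&2; return 1; }
    mkdir -p "$target" || return 1
    # --strip-components=1 drops the top-level pig-<version>/ directory,
    # so bin/, lib/ and the rest land directly in the target directory.
    tar -xzf "$tarball" -C "$target" --strip-components=1
}
```

For example: install_pig "Downloads/pig-(latest-version).tar.gz" /home/Pig (the quotes matter, because the shell treats unquoted parentheses specially).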
Configuration
After installing Apache Pig, we have to configure it.
Open the .bashrc file
vim ~/.bashrc
In the .bashrc file, set the following variables:
export PIG_HOME=/home/Pig
export PATH=$PATH:/home/Pig/bin
Save the file and reload .bashrc in the current shell using:
. ~/.bashrc
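If you prefer to script the .bashrc change instead of editing the file by hand, the following is one possible sketch. append_once is a hypothetical helper, not part of Pig; it only adds a line when that exact line is not already in the file, so running the setup twice does not duplicate the exports.

```shell
# Hypothetical helper: append a line to a file only if it is not already present.
# Usage: append_once <file> <line>
append_once() {
    file=$1
    line=$2
    touch "$file"
    # -q: quiet, -x: match the whole line, -F: fixed string (no regex)
    grep -qxF -e "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

Assuming the install location /home/Pig used above, it could be called as append_once ~/.bashrc 'export PIG_HOME=/home/Pig' and append_once ~/.bashrc 'export PATH=$PATH:/home/Pig/bin'. The single quotes keep $PATH from being expanded before the line is written.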
Verifying Pig version
pig -version
If the installation is successful, the above command displays the installed Pig version number.
Testing Pig Installation
pig -h
This should display the available Pig command-line options.
Pig is now installed locally, and you can run it in local mode with:
pig -x local
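The checks above fail with a "command not found" error if PATH was not updated correctly. A small pre-flight check can make that case explicit. This is a sketch only; have_cmd is a hypothetical helper name built on the standard command -v builtin.

```shell
# Hypothetical pre-flight check: succeed if a command is reachable via PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd pig; then
    # Pig is reachable, so print its version as in the verification step.
    pig -version
else
    echo "pig not found on PATH; check PIG_HOME and PATH in ~/.bashrc" >&2
fi
```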
Connecting to Hadoop
If Hadoop 1.x or 2.x is installed on the cluster and the HADOOP_HOME environment variable is set, you can connect Pig to Hadoop by adding the following line to .bashrc, as before:
export PIG_CLASSPATH=$HADOOP_HOME/conf
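Where the Hadoop configuration directory lives depends on the Hadoop layout: 1.x releases keep it in $HADOOP_HOME/conf, while 2.x releases typically use $HADOOP_HOME/etc/hadoop. The sketch below picks whichever directory exists. hadoop_conf_dir is a hypothetical helper name, not part of Pig or Hadoop, and your cluster's layout may differ.

```shell
# Hypothetical helper: print the Hadoop configuration directory under a given
# Hadoop home, preferring the 2.x layout (etc/hadoop) over the 1.x one (conf).
hadoop_conf_dir() {
    home=$1
    if [ -d "$home/etc/hadoop" ]; then
        printf '%s\n' "$home/etc/hadoop"
    elif [ -d "$home/conf" ]; then
        printf '%s\n' "$home/conf"
    else
        # Neither layout found under this directory.
        return 1
    fi
}

# Only export PIG_CLASSPATH when HADOOP_HOME is set and a config directory exists.
if [ -n "${HADOOP_HOME:-}" ] && conf_dir=$(hadoop_conf_dir "$HADOOP_HOME"); then
    export PIG_CLASSPATH=$conf_dir
fi
```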
Running Pig
Execution Modes
You can run Pig either with the pig command (bin/pig) or by running the jar file (java -cp pig.jar).
Pig scripts can be executed in 3 different modes:
Local Mode
pig -x local ...
Mapreduce Mode (default mode)
pig -x mapreduce ...
(or)
pig ...
Tez Mode
pig -x tez ...
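When Pig is launched from other scripts, it can help to validate the mode name before calling it. The following is a minimal sketch under that assumption. pig_command is a hypothetical wrapper that accepts only the three -x values listed above and prints the resulting command line rather than running it.

```shell
# Hypothetical wrapper: build the pig command line for a given execution mode,
# defaulting to mapreduce (Pig's default) and rejecting unknown mode names.
pig_command() {
    mode=${1:-mapreduce}
    case $mode in
        local|mapreduce|tez)
            printf 'pig -x %s\n' "$mode"
            ;;
        *)
            echo "unknown execution mode: $mode" >&2
            return 1
            ;;
    esac
}

pig_command local    # prints: pig -x local
pig_command          # prints: pig -x mapreduce
```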
Interactive Mode
Pig can be run in interactive mode using the Grunt shell. Pig Latin statements and commands can be entered interactively in this shell.
Example
$ pig -x <mode> <enter>
grunt>
Here <mode> can be any of the execution modes explained in the previous section.
Batch Mode
Pig can also be executed in batch mode. Here, a .pig file containing a list of Pig statements and commands is provided.
Example
$ pig -x <mode> <script.pig>
Similarly, <mode> can be any of the execution modes explained in the previous section.
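As an illustration of batch mode, the sketch below writes a tiny script file and runs it in local mode. The file names input.txt and script.pig are placeholders chosen for this example, and the run step is skipped when pig is not on PATH.

```shell
# Work in a scratch directory so the example leaves no files behind elsewhere.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input: two lines of text (placeholder data for illustration).
printf 'hello\nworld\n' > input.txt

# The quoted heredoc delimiter keeps the shell from expanding anything in the script.
cat > script.pig <<'EOF'
-- Load each line of input.txt as a single chararray field and print it.
lines = LOAD 'input.txt' AS (line:chararray);
DUMP lines;
EOF

# Run the script in local mode only if pig is actually installed.
if command -v pig >/dev/null 2>&1; then
    pig -x local script.pig
else
    echo "pig not found on PATH; skipping run" >&2
fi
```

Unlike interactive mode, Pig runs the statements in the file and then exits, so no Grunt prompt is shown.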