Pig Filter

The Pig FILTER operator selects only the rows of a relation that satisfy a given condition.

How to use Pig Filter?

Let us filter only the employees whose salary is 9000 from the sample data in EmployeeDetails.txt.

Step 1: Put the sample data file, EmployeeDetails.txt, into HDFS, for example with the hadoop fs -put command.

Step 2: Launch the Pig Grunt shell by typing pig at the command prompt.

Once the Grunt shell opens, we need to load the data from HDFS into a Pig relation called EmployeeDetails.
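As a minimal sketch of the load step: the HDFS path, the comma delimiter, and the column names (id, name, salary) are all assumptions made for illustration, since the actual layout of EmployeeDetails.txt is not shown here. Adjust them to match your file.

```pig
-- Hypothetical path and schema: a comma-delimited file with three columns.
EmployeeDetails = LOAD '/user/hadoop/EmployeeDetails.txt'
    USING PigStorage(',')
    AS (id:int, name:chararray, salary:int);

-- Optional: print the schema to confirm the relation was defined as expected.
DESCRIBE EmployeeDetails;
```

Declaring salary as int in the schema lets later FILTER conditions compare it as a number rather than as raw bytes.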

How to Filter rows?

Now I am going to filter the rows based on the given condition.
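A sketch of the filter for the salary condition might look like the following. It assumes the EmployeeDetails relation has an integer column named salary, and FilteredEmployees is a hypothetical name chosen for the result.

```pig
-- Keep only the rows whose salary equals 9000.
-- Note that Pig uses == for equality comparison, not a single =.
FilteredEmployees = FILTER EmployeeDetails BY salary == 9000;
```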

Let us DUMP the filtered relation to see the result of the FILTER command.
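Assuming the filtered rows were stored in a relation named FilteredEmployees (a hypothetical name, use whatever alias you assigned in the FILTER step), the dump could be written as:

```pig
-- DUMP triggers execution and prints each matching tuple to the console.
DUMP FilteredEmployees;
```

Pig evaluates lazily, so the LOAD and FILTER statements only run when a DUMP or STORE is issued.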


How to Filter NOT NULL values using Pig FILTER?

If you look at EmployeeDetails.txt, you will find NULL values. Let us filter only the rows that are NOT NULL.
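One way to express this is with the IS NOT NULL operator. The sketch below assumes the NULL values occur in a column named salary. The column name is an assumption, so substitute whichever field is actually empty in your data.

```pig
-- Keep only the rows where salary has a value.
EmployeeNotNull = FILTER EmployeeDetails BY salary IS NOT NULL;

-- To require several fields to be present, combine the checks with AND.
-- (id and name are hypothetical column names.)
EmployeeAllPresent = FILTER EmployeeDetails
    BY id IS NOT NULL AND name IS NOT NULL AND salary IS NOT NULL;
```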

Dumping the EmployeeNotNull relation in the Grunt shell shows the rows that remain after the NULL values are filtered out.
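The dump itself is a single statement, assuming EmployeeNotNull was defined by a FILTER as described above:

```pig
-- Print the rows that passed the NOT NULL filter.
DUMP EmployeeNotNull;
```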

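Pig's comparison operators (==, !=, <, >, <=, >=) can be used in FILTER conditions, and the arithmetic operators (+, -, *, /, %) can be used inside the expressions being compared. Several conditions can be combined with AND, OR and NOT.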
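A few illustrative conditions are sketched below. The relation aliases, the threshold values, and the salary column are assumptions made for the example, not values taken from the sample data.

```pig
-- Comparison operators combined with AND: a salary range.
MidRange = FILTER EmployeeDetails BY salary >= 5000 AND salary < 9000;

-- An arithmetic expression inside the condition: yearly pay above a threshold.
HighYearly = FILTER EmployeeDetails BY (salary * 12) > 100000;

-- NOT negates a condition: everyone whose salary is not 9000.
NotNineThousand = FILTER EmployeeDetails BY NOT (salary == 9000);
```

Rows where salary is NULL evaluate to NULL in these comparisons and are dropped by FILTER, so none of the relations above will contain them.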