HIVE PARTITIONING Table

  • Partitioning a table improves the performance of HiveQL queries. Without partitions, Hive has to scan every record in the table even to return a single row. With partitions, a query that filters on the partition columns reads only the matching partitions, so it runs much faster.
  • Partition columns are virtual columns: they must NOT also appear in the table's regular column list. If a column is declared in both places, Hive reports an error.
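The second point can be illustrated with a small sketch. The table name student_bad and its columns are hypothetical and only serve to show the mistake; the statement is expected to fail rather than create a table.

```sql
-- Invalid on purpose: country is declared both as a regular column
-- and as a partition column, so Hive rejects the statement
-- (hypothetical table and column names).
CREATE TABLE student_bad (
  id      INT,
  name    STRING,
  country STRING
)
PARTITIONED BY (country STRING);
```

Removing country from the regular column list and keeping it only in PARTITIONED BY makes the definition valid.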

How to Partition the Hive Table?

To understand this easily, let us first create a table called Student that is partitioned by Country and City using the PARTITIONED BY keyword.
For a partitioned table, Hive stores the data of each partition in a separate folder under the table's directory in the default warehouse location, /user/hive/warehouse/.
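A minimal sketch of such a table definition might look like the following. Only the table name and the two partition columns come from the text above; the regular columns (id, name) and the comma-delimited text format are assumptions made for illustration.

```sql
-- Student table partitioned by country and city.
-- id and name are hypothetical columns; the row format is an assumption.
CREATE TABLE student (
  id   INT,
  name STRING
)
PARTITIONED BY (country STRING, city STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
```

Once partitions exist, each one gets a nested folder named after its key and value, for example /user/hive/warehouse/student/country=USA/city=Chicago/ (the values here are illustrative).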

How to Insert Dynamically into Partitioned Hive Table?

To insert multiple rows into a partitioned table and let Hive work out the partitions dynamically, we first need to set the dynamic partition mode to nonstrict (the property hive.exec.dynamic.partition.mode).
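The session settings would typically look like this. The first line enables dynamic partitioning in case it is not already enabled in your Hive version or configuration.

```sql
-- Enable dynamic partitioning for this session.
SET hive.exec.dynamic.partition=true;
-- nonstrict allows every partition column to be determined dynamically.
SET hive.exec.dynamic.partition.mode=nonstrict;
```

In the default strict mode, Hive requires at least one partition column to be given a static value.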

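In a dynamic insert, the PARTITION clause names the partition columns without values, and the SELECT supplies their values as its last columns, in the same order.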
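As a sketch, a dynamic insert into the Student table could be written as below. It assumes the table has the regular columns id and name, and that the rows come from a hypothetical unpartitioned staging table, student_staging, which holds country and city as ordinary columns.

```sql
-- Dynamic partition insert: Hive creates one partition per distinct
-- (country, city) pair found in the selected rows.
-- student_staging is a hypothetical source table.
INSERT INTO TABLE student PARTITION (country, city)
SELECT id, name, country, city
FROM student_staging;
```

The partition columns must come last in the SELECT list because Hive matches them by position, not by name.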

Selecting rows from Hive Partitioned Table

Let us check whether the dynamic insert query has inserted multiple rows into the partitioned table.
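A few queries that could be used for this check are sketched below. The column names and the filter values ('USA', 'Chicago') are illustrative assumptions.

```sql
-- List the partitions Hive created for the table.
SHOW PARTITIONS student;

-- Partition columns can be selected like ordinary columns.
SELECT id, name, country, city FROM student;

-- Filtering on the partition columns lets Hive read only the
-- matching partition folder instead of the whole table.
SELECT id, name
FROM student
WHERE country = 'USA' AND city = 'Chicago';
```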

How to Insert statically Into Partitioned Table?

To insert statically into a partitioned table, we need to specify the partition value in the PARTITION clause (for example: PARTITION(COUNTRY='USA')), unlike the dynamic insert above, where Hive derives the partition values from the data.
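A static insert into the Student table might look like the sketch below. The row values are made up for illustration, and the VALUES form assumes Hive 0.14 or later; on older versions an INSERT ... SELECT with the same PARTITION clause serves the same purpose.

```sql
-- Static partition insert: both partition values are fixed in the
-- PARTITION clause, so the row supplies only the regular columns.
-- The row values are hypothetical.
INSERT INTO TABLE student PARTITION (country = 'USA', city = 'Chicago')
VALUES (1, 'John');
```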

Loading Data from the Local File System into a Partitioned Table

We can also use LOAD DATA LOCAL INPATH to load data from the local file system into a partitioned table.
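A sketch of such a load is shown below. The file path is hypothetical, and the file is assumed to contain only the regular columns (id, name) in the table's delimited format, since the partition values are given in the statement itself.

```sql
-- Copy a local file into one specific partition of the table.
-- The path is a hypothetical example.
LOAD DATA LOCAL INPATH '/tmp/students_usa_chicago.csv'
INTO TABLE student
PARTITION (country = 'USA', city = 'Chicago');
```

LOAD DATA does not parse or validate the rows; it places the file as-is into the partition's folder, so the file layout has to match the table definition.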

 
