HIVE Query Language (HQL) – HIVE Create Database, Create Table

In this section, we will discuss data definition language parts of HIVE Query Language(HQL), which are used for creating, altering and dropping databases, tables, views, functions, and indexes.

We will also look into SHOW and DESCRIBE commands for listing and describing databases and tables stored in HDFS file system.

Hive Database – HIVE Query

A database in Hive is just a namespace or catalog of tables. If we do not specify database, default database is used.

HIVE CREATE database syntax 

HIVE USE Database syntax

This command is used for using a particular database.

HIVE SHOW all existing databases

For each database, HIVE will create a directory and the tables say “EMP” in that database and say “financial” is stored in sub-directories.

The database directory is created under the directory specified in the parameter “hive.metastore.warehouse.dir”. 

Assuming the default value of the parameter is “/usr/hive/warehouse”, when we create the “financial” database, HIVE will create the sub-directory as “/usr/hive/warehouse/financial.db” where “.db” is the extension.

HIVE DESCRIBE database syntax

This will show the directory location of the “financial” database.

HIVE ALTER Database Syntax

We can associate key-value pairs with a database in the DBPROPERTIES using ALTER DATABASE command.

HIVE CREATE Table Syntax

CREATE table statement in Hive is similar to what we follow in SQL but hive provides lots of flexibilities in terms of where the data files for the table will be stored, the format used, delimiter used etc.

In the above example, the default location of the table will be “/user/hive/warehouse/financial.db/EMP”.

If we want to explicitly override the location then we need to use the “LOCATION” keyword in the above create table statement.

HIVE SHOW TABLES Syntax

This command lists the tables in the current working database.

HIVE DESCRIBE Syntax

The DESCRIBE statement displays metadata about a table, such as the column names and their data types.

HIVE DROP Database Syntax

DROP Database statement deletes all the related tables and then delete the database.

HIVE DROP Table Syntax

HIVE ALTER Table Syntax

HIVE DESCRIBE EXTENDED Syntax

This command is used for describing details about the table. If we replace EXTENDED with FORMATTED then it provides more verbose output.

The output lines in the description that start with LOCATION also shows the full URL path in the HDFS where hive will keep all the data related to a given table.

HIVE Managed and External Tables

HIVE Managed Tables

By default, HIVE tables are the managed tables. HIVE controls metadata and the lifecycle of the data.

HIVE stores all the data related to a given table in the subdirectory under the directory defined by the parameter “hive.metastore.warehouse.dir” which is “/user/hive/warehouse” by default.

Dropping a managed table deletes the data from the table by deleting the sub-directory that has got created for the respective table. “DESCRIBE EXTENDED” command output will tell whether a table is managed or extended.

The drawback of the managed table is less convenient to use with other tools. For

For Example: We have inserted some data into some table by Pig or some other tool. Now, we wanted to run some HIVE queries on the data inserted by Pig but do not want to give the ownership to HIVE. In such case, we create an external table which has access to the data but not the ownership.

HIVE External Tables

Let us suppose we want to analyze the marketing data getting ingested from different sources.

We ingest south and north zone data using many tools such as Pig, HIVE and so on. Let us assume data file for the SALES table resides in directory /data/marketing.

The following table declaration creates an external table that can read all the data files from /data/marketing distributed directory with data stored as comma-delimited data.

The keyword “EXTERNAL” tells HIVE that this table is external and the data is stored in the directory mentioned in “LOCATION” clause.

Since the table is external, HIVE does not assume it owns the data. Therefore, dropping table deletes only the metadata in HIVE Metastore and the actual data remains intact.

Therefore, if the data is shared between tools, then it is always advisable to create an external table to make ownership explicit.

Copy the schema (not the data) of an existing table

Team RCV Academy

About Team RCV Academy

RCV Academy Team is a group of professionals working in various industries and contributing to tutorials on the website and other channels.

View all posts by Team RCV Academy