hive Tutorial => PARQUET

Example

Hive 0.13.0 and later support the Parquet columnar storage format natively. Parquet is built from the ground up with complex nested data structures in mind, and uses the record shredding and assembly algorithm described in the Dremel paper. The Parquet project considers this approach superior to simple flattening of nested namespaces.

Parquet is built to support very efficient compression and encoding schemes. Multiple projects have demonstrated the performance impact of applying the right compression and encoding scheme to the data. Parquet allows compression schemes to be specified on a per-column level, and is future-proofed to allow adding more encodings as they are invented and implemented.

Parquet is the recommended file format for Impala tables in Cloudera distributions.

See: http://parquet.apache.org/documentation/latest/

CREATE TABLE parquet_table_name (x INT, y STRING) STORED AS PARQUET;
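
Because Parquet is designed around nested data and configurable compression, a slightly fuller table definition may be useful. The following is a minimal sketch, not a definitive recipe: the table name, column names, and the 'color' key are hypothetical, and it assumes a Hive release whose Parquet support handles STRUCT, ARRAY, and MAP columns and honors the parquet.compression table property (older releases may only respect the session-level setting). Note that this property selects one codec for the whole table, whereas the Parquet format itself allows per-column choices.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE parquet_nested_example (
  id INT,
  customer STRUCT<given_name:STRING, family_name:STRING>,
  tags ARRAY<STRING>,
  attributes MAP<STRING, STRING>
)
STORED AS PARQUET
-- Assumption: this table property is honored by your Hive version.
TBLPROPERTIES ("parquet.compression"="SNAPPY");

-- Reading nested values: a struct member, an array element, a map value.
SELECT customer.given_name, tags[0], attributes['color']
FROM parquet_nested_example;
```

Keeping the nested columns as STRUCT, ARRAY, and MAP types, rather than flattening them into many scalar columns, is the use case the record shredding approach described above is meant to serve.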
