Package org.apache.iceberg.parquet
Class ParquetSchemaUtil
java.lang.Object
    org.apache.iceberg.parquet.ParquetSchemaUtil
public class ParquetSchemaUtil extends java.lang.Object
-
Nested Class Summary
static class ParquetSchemaUtil.HasIds
-
Method Summary
static org.apache.parquet.schema.MessageType addFallbackIds(org.apache.parquet.schema.MessageType fileSchema)
static org.apache.parquet.schema.MessageType applyNameMapping(org.apache.parquet.schema.MessageType fileSchema, NameMapping nameMapping)
static org.apache.parquet.schema.MessageType convert(Schema schema, java.lang.String name)
static Schema convert(org.apache.parquet.schema.MessageType parquetSchema)
    Converts a Parquet schema to an Iceberg schema.
static Schema convertAndPrune(org.apache.parquet.schema.MessageType parquetSchema)
    Converts a Parquet schema to an Iceberg schema and prunes fields without IDs.
static boolean hasIds(org.apache.parquet.schema.MessageType fileSchema)
static org.apache.parquet.schema.MessageType pruneColumns(org.apache.parquet.schema.MessageType fileSchema, Schema expectedSchema)
static org.apache.parquet.schema.MessageType pruneColumnsFallback(org.apache.parquet.schema.MessageType fileSchema, Schema expectedSchema)
    Prunes columns from a Parquet file schema that was written without field IDs.
-
Method Detail
-
convert
public static org.apache.parquet.schema.MessageType convert(Schema schema, java.lang.String name)
-
convert
public static Schema convert(org.apache.parquet.schema.MessageType parquetSchema)
Converts a Parquet schema to an Iceberg schema. Fields without IDs are kept and assigned fallback IDs.
Parameters:
parquetSchema - a Parquet schema
Returns:
a matching Iceberg schema for the provided Parquet schema
-
convertAndPrune
public static Schema convertAndPrune(org.apache.parquet.schema.MessageType parquetSchema)
Converts a Parquet schema to an Iceberg schema and prunes fields without IDs.
Parameters:
parquetSchema - a Parquet schema
Returns:
a matching Iceberg schema for the provided Parquet schema
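The difference between convert(MessageType) and convertAndPrune can be illustrated with a small self-contained model. This is only a sketch under stated assumptions: the ConvertModel class and both of its methods are hypothetical, a schema is reduced to a flat map from column name to field ID, and the fallback numbering (fresh IDs above the largest existing ID) is an assumption that this page does not specify. It does not use the Iceberg or Parquet classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConvertModel {
    // Hypothetical stand-in for a schema: column name -> field ID,
    // where a null ID means the file carries no ID for that column.

    // Models the "keep and assign fallback IDs" behaviour.
    // Assumption: fallback IDs are fresh integers above the largest existing ID.
    static Map<String, Integer> keepWithFallbackIds(Map<String, Integer> columns) {
        int next = 0;
        for (Integer id : columns.values()) {
            if (id != null) {
                next = Math.max(next, id);
            }
        }
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : columns.entrySet()) {
            if (entry.getValue() != null) {
                result.put(entry.getKey(), entry.getValue());
            } else {
                next += 1;
                result.put(entry.getKey(), next);
            }
        }
        return result;
    }

    // Models the "prune fields without IDs" behaviour: columns with no ID are dropped.
    static Map<String, Integer> pruneWithoutIds(Map<String, Integer> columns) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : columns.entrySet()) {
            if (entry.getValue() != null) {
                result.put(entry.getKey(), entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> file = new LinkedHashMap<>();
        file.put("id", 1);
        file.put("data", 2);
        file.put("legacy", null); // written without a field ID

        System.out.println(keepWithFallbackIds(file)); // all three columns survive
        System.out.println(pruneWithoutIds(file));     // "legacy" is dropped
    }
}
```

In this model the two methods differ only in how they treat the ID-less column, which mirrors the one-sentence distinction in the two descriptions above.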
-
pruneColumns
public static org.apache.parquet.schema.MessageType pruneColumns(org.apache.parquet.schema.MessageType fileSchema, Schema expectedSchema)
-
pruneColumnsFallback
public static org.apache.parquet.schema.MessageType pruneColumnsFallback(org.apache.parquet.schema.MessageType fileSchema, Schema expectedSchema)
Prunes columns from a Parquet file schema that was written without field IDs.
Files that were written without field IDs are read on the assumption that schema evolution preserved column order and that no columns were deleted.
The order of columns in the resulting Parquet schema matches the Parquet file.
Parameters:
fileSchema - schema from a Parquet file that does not have field IDs
expectedSchema - expected schema
Returns:
a Parquet schema pruned using the expected schema
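The position-based reading described above can be sketched with a small self-contained model. The FallbackPruneModel class and its pruneByPosition method are hypothetical and do not use the Iceberg or Parquet classes. The sketch also assumes that the column at position i stands in for field ID i + 1, a numbering this page does not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class FallbackPruneModel {
    // Hypothetical model of pruning a file that has no field IDs.
    // Assumption: the column at position i is treated as field ID i + 1.
    // A column is kept only if the expected schema asks for that ID,
    // and the result keeps the file's column order.
    static List<String> pruneByPosition(List<String> fileColumns, Set<Integer> expectedIds) {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < fileColumns.size(); i++) {
            int fallbackId = i + 1;
            if (expectedIds.contains(fallbackId)) {
                kept.add(fileColumns.get(i));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> fileColumns = List.of("id", "data", "ts");
        // The expected schema projects IDs 3 and 1; the output follows file order.
        System.out.println(pruneByPosition(fileColumns, Set.of(3, 1)));
    }
}
```

Under this model, a column deleted before the file was written would shift every later position and select the wrong columns, which is consistent with the description ruling out deletes.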
-
hasIds
public static boolean hasIds(org.apache.parquet.schema.MessageType fileSchema)
-
addFallbackIds
public static org.apache.parquet.schema.MessageType addFallbackIds(org.apache.parquet.schema.MessageType fileSchema)
-
applyNameMapping
public static org.apache.parquet.schema.MessageType applyNameMapping(org.apache.parquet.schema.MessageType fileSchema, NameMapping nameMapping)
-