Package org.apache.iceberg.parquet
Class ParquetSchemaUtil
java.lang.Object
org.apache.iceberg.parquet.ParquetSchemaUtil
Method Summary
- static org.apache.parquet.schema.MessageType addFallbackIds(org.apache.parquet.schema.MessageType fileSchema)
- static org.apache.parquet.schema.MessageType applyNameMapping(org.apache.parquet.schema.MessageType fileSchema, NameMapping nameMapping)
- static org.apache.parquet.schema.MessageType convert(Schema schema, String name)
  Convert an Iceberg schema to Parquet.
- static org.apache.parquet.schema.MessageType convert(Schema schema, String name, VariantShreddingFunction variantShreddingFunc)
  Convert an Iceberg schema to Parquet.
- static Schema convert(org.apache.parquet.schema.MessageType parquetSchema)
  Converts a Parquet schema to an Iceberg schema.
- static Schema convertAndPrune(org.apache.parquet.schema.MessageType parquetSchema)
  Converts a Parquet schema to an Iceberg schema and prunes fields without IDs.
- static org.apache.parquet.schema.Type determineListElementType(org.apache.parquet.schema.GroupType array)
- static org.apache.parquet.schema.Type fieldType(org.apache.parquet.schema.GroupType group, String name)
  Returns the Type of the named field in the struct/group, or null.
- static boolean hasField(org.apache.parquet.schema.GroupType group, String name)
  Returns true if the name identifies a field in the struct/group.
- static boolean hasIds(org.apache.parquet.schema.MessageType fileSchema)
- static org.apache.parquet.schema.MessageType pruneColumns(org.apache.parquet.schema.MessageType fileSchema, Schema expectedSchema)
- static org.apache.parquet.schema.MessageType pruneColumnsFallback(org.apache.parquet.schema.MessageType fileSchema, Schema expectedSchema)
  Prunes columns from a Parquet file schema that was written without field ids.
- 
Method Details
- 
convert
public static org.apache.parquet.schema.MessageType convert(Schema schema, String name)
Convert an Iceberg schema to Parquet.
- Parameters:
- schema - an Iceberg Schema
- name - name for the Parquet schema
- Returns:
- the schema converted to a Parquet MessageType
 
- 
convert
public static org.apache.parquet.schema.MessageType convert(Schema schema, String name, VariantShreddingFunction variantShreddingFunc)
Convert an Iceberg schema to Parquet.
Variant fields are converted by calling the VariantShreddingFunction with the variant's field ID and name to produce the shredding type as a typed_value field. This field is added to the variant struct alongside the metadata and value fields.
- Parameters:
- schema - an Iceberg Schema
- name - name for the Parquet schema
- variantShreddingFunc - VariantShreddingFunction that produces a shredded type
- Returns:
- the schema converted to a Parquet MessageType
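The variant handling described above can be pictured with a small, library-free sketch. It does not use the Iceberg or Parquet classes: the class name, the method name, and the use of a plain String to stand in for the shredded Parquet type are all hypothetical, and treating a null result as "do not shred" is an assumption of this sketch rather than something this page states.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

/** Simplified model of a shredded variant group; not Iceberg's implementation. */
class VariantShreddingSketch {

  /**
   * Lists the fields of a variant group. The shredding function receives the
   * variant's field ID and name and returns a description of the shredded type.
   * Assumption of this sketch: a null result means no typed_value field is added.
   */
  static List<String> variantGroupFields(
      int fieldId, String name, BiFunction<Integer, String, String> shreddingFunc) {
    List<String> fields = new ArrayList<>();
    fields.add("metadata");
    fields.add("value");
    String shreddedType = shreddingFunc.apply(fieldId, name);
    if (shreddedType != null) {
      // The shredded type sits alongside metadata and value.
      fields.add("typed_value (" + shreddedType + ")");
    }
    return fields;
  }

  public static void main(String[] args) {
    // Hypothetical policy: shred only the variant named "payload".
    BiFunction<Integer, String, String> shredPayloadOnly =
        (id, fieldName) -> "payload".equals(fieldName) ? "int64" : null;
    System.out.println(variantGroupFields(3, "payload", shredPayloadOnly));
    System.out.println(variantGroupFields(4, "extras", shredPayloadOnly));
  }
}
```

In the real API the function produces a Parquet type rather than a String; the sketch only shows where the produced type lands in the variant struct.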
 
- 
convert
public static Schema convert(org.apache.parquet.schema.MessageType parquetSchema)
Converts a Parquet schema to an Iceberg schema. Fields without IDs are kept and assigned fallback IDs.
- Parameters:
- parquetSchema - a Parquet schema
- Returns:
- a matching Iceberg schema for the provided Parquet schema
 
- 
convertAndPrune
public static Schema convertAndPrune(org.apache.parquet.schema.MessageType parquetSchema)
Converts a Parquet schema to an Iceberg schema and prunes fields without IDs.
- Parameters:
- parquetSchema - a Parquet schema
- Returns:
- a matching Iceberg schema for the provided Parquet schema
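The difference between convert(MessageType) and convertAndPrune(MessageType) lies in how fields without IDs are treated: the first keeps them with fallback IDs, the second drops them. The following library-free sketch models only that difference. The Field class, the method names, and the firstFallbackId parameter are hypothetical; the numbering used for fallback IDs here is an assumption of the sketch, not the scheme Iceberg applies.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of ID handling during conversion; not Iceberg's implementation. */
class IdHandlingSketch {

  /** A field name with an optional ID; a null ID models a field written without one. */
  static final class Field {
    final String name;
    final Integer id;

    Field(String name, Integer id) {
      this.name = name;
      this.id = id;
    }

    @Override
    public String toString() {
      return name + ":" + id;
    }
  }

  /** Models convertAndPrune: fields without IDs are dropped. */
  static List<Field> prune(List<Field> fields) {
    List<Field> kept = new ArrayList<>();
    for (Field field : fields) {
      if (field.id != null) {
        kept.add(field);
      }
    }
    return kept;
  }

  /**
   * Models convert: fields without IDs are kept and given a generated ID.
   * The caller must choose firstFallbackId so that it cannot collide with existing IDs.
   */
  static List<Field> keepWithFallbackIds(List<Field> fields, int firstFallbackId) {
    List<Field> kept = new ArrayList<>();
    int nextId = firstFallbackId;
    for (Field field : fields) {
      kept.add(field.id != null ? field : new Field(field.name, nextId++));
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Field> fields =
        List.of(new Field("id", 1), new Field("legacy", null), new Field("data", 2));
    System.out.println(prune(fields)); // [id:1, data:2]
    System.out.println(keepWithFallbackIds(fields, 1000)); // [id:1, legacy:1000, data:2]
  }
}
```

Pruning is the safer choice when a reader must only see columns it can resolve by ID; keeping fields with fallback IDs is useful when the whole file schema should remain visible.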
 
- 
hasField
public static boolean hasField(org.apache.parquet.schema.GroupType group, String name)
Returns true if the name identifies a field in the struct/group.
- Parameters:
- group - a GroupType
- name - a String name
- Returns:
- true if the group contains a field with the given name
 
- 
fieldType
public static org.apache.parquet.schema.Type fieldType(org.apache.parquet.schema.GroupType group, String name)
Returns the Type of the named field in the struct/group, or null.
- Parameters:
- group - a GroupType
- name - a String name
- Returns:
- the Type of the field in the group, or null if it is not present.
 
- 
pruneColumns
public static org.apache.parquet.schema.MessageType pruneColumns(org.apache.parquet.schema.MessageType fileSchema, Schema expectedSchema)
- 
pruneColumnsFallback
public static org.apache.parquet.schema.MessageType pruneColumnsFallback(org.apache.parquet.schema.MessageType fileSchema, Schema expectedSchema)
Prunes columns from a Parquet file schema that was written without field ids.
Files that were written without field ids are read assuming that schema evolution preserved column order. Deleting columns was not allowed. The order of columns in the resulting Parquet schema matches the Parquet file.
- Parameters:
- fileSchema - schema from a Parquet file that does not have field ids.
- expectedSchema - expected schema
- Returns:
- a parquet schema pruned using the expected schema
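Because column order is assumed to be preserved, pruning a file without field ids can be thought of as selecting columns by position. The sketch below is a library-free model of that idea, not the Iceberg implementation: the class and method names are hypothetical, columns are represented by their names only, and the choice of 1-based positions as fallback ids is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Simplified model of position-based pruning; not Iceberg's implementation. */
class FallbackPruneSketch {

  /**
   * Keeps the file columns whose assumed fallback id is among the expected ids.
   * Returns column name mapped to fallback id, in file order.
   */
  static Map<String, Integer> pruneByPosition(List<String> fileColumns, Set<Integer> expectedIds) {
    Map<String, Integer> kept = new LinkedHashMap<>();
    int ordinal = 1; // assumption: fallback ids start at 1 and follow file order
    for (String column : fileColumns) {
      if (expectedIds.contains(ordinal)) {
        kept.put(column, ordinal);
      }
      ordinal += 1;
    }
    return kept;
  }

  public static void main(String[] args) {
    List<String> fileColumns = List.of("id", "name", "ts");
    // The expected schema asks for ids 3 and 1; the result still follows file order.
    System.out.println(pruneByPosition(fileColumns, Set.of(3, 1))); // {id=1, ts=3}
  }
}
```

Iterating over the file columns rather than the expected ids is what keeps the output in file order, matching the behavior described for the returned schema.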
 
- 
hasIds
public static boolean hasIds(org.apache.parquet.schema.MessageType fileSchema)
- 
addFallbackIds
public static org.apache.parquet.schema.MessageType addFallbackIds(org.apache.parquet.schema.MessageType fileSchema)
- 
applyNameMapping
public static org.apache.parquet.schema.MessageType applyNameMapping(org.apache.parquet.schema.MessageType fileSchema, NameMapping nameMapping)
- 
determineListElementType
public static org.apache.parquet.schema.Type determineListElementType(org.apache.parquet.schema.GroupType array)
 
-