Class TypeUtil
java.lang.Object
    org.apache.iceberg.types.TypeUtil
public class TypeUtil extends java.lang.Object
-
-
Nested Class Summary
Nested Classes
static class TypeUtil.CustomOrderSchemaVisitor<T>
static interface TypeUtil.NextID
    Interface for passing a function that assigns column IDs.
static class TypeUtil.SchemaVisitor<T>
-
Method Summary
static Schema assignFreshIds(int schemaId, Schema schema, TypeUtil.NextID nextId)
    Assigns fresh ids from the nextId function for all fields in a schema.
static Schema assignFreshIds(Schema schema, Schema baseSchema, TypeUtil.NextID nextId)
    Assigns ids to match a given schema, and fresh ids from the nextId function for all other fields.
static Schema assignFreshIds(Schema schema, TypeUtil.NextID nextId)
    Assigns fresh ids from the nextId function for all fields in a schema.
static Type assignFreshIds(Type type, TypeUtil.NextID nextId)
    Assigns fresh ids from the nextId function for all fields in a type.
static Schema assignIncreasingFreshIds(Schema schema)
    Assigns strictly increasing fresh ids for all fields in a schema, starting from 1.
static int decimalRequiredBytes(int precision)
static int estimateSize(Types.NestedField field)
    Estimates the number of bytes a value for a given field may occupy in memory.
static Type find(Schema schema, java.util.function.Predicate<Type> predicate)
static java.util.Set<java.lang.Integer> getProjectedIds(Schema schema)
static java.util.Set<java.lang.Integer> getProjectedIds(Type type)
static java.util.Map<java.lang.Integer,Types.NestedField> indexById(Types.StructType struct)
static java.util.Map<java.lang.String,java.lang.Integer> indexByLowerCaseName(Types.StructType struct)
static java.util.Map<java.lang.String,java.lang.Integer> indexByName(Types.StructType struct)
static java.util.Map<java.lang.Integer,java.lang.String> indexNameById(Types.StructType struct)
static java.util.Map<java.lang.Integer,java.lang.Integer> indexParents(Types.StructType struct)
static java.util.Map<java.lang.Integer,java.lang.String> indexQuotedNameById(Types.StructType struct, java.util.function.Function<java.lang.String,java.lang.String> quotingFunc)
static boolean isPromotionAllowed(Type from, Type.PrimitiveType to)
static Schema join(Schema left, Schema right)
static Schema project(Schema schema, java.util.Set<java.lang.Integer> fieldIds)
    Project extracts particular fields from a schema by ID.
static Types.StructType project(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds)
static Schema reassignDoc(Schema schema, Schema docSourceSchema)
    Reassigns doc in a schema from another schema.
static Schema reassignIds(Schema schema, Schema idSourceSchema)
    Reassigns ids in a schema from another schema.
static Schema reassignIds(Schema schema, Schema idSourceSchema, boolean caseSensitive)
    Reassigns ids in a schema from another schema.
static Schema reassignOrRefreshIds(Schema schema, Schema idSourceSchema)
static Schema reassignOrRefreshIds(Schema schema, Schema idSourceSchema, boolean caseSensitive)
static java.util.Set<java.lang.Integer> refreshIdentifierFields(Types.StructType freshSchema, Schema baseSchema)
    Get the identifier fields in the fresh schema based on the identifier fields in the base schema.
static Schema select(Schema schema, java.util.Set<java.lang.Integer> fieldIds)
static Types.StructType select(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds)
static Schema selectNot(Schema schema, java.util.Set<java.lang.Integer> fieldIds)
static Types.StructType selectNot(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds)
static void validateSchema(java.lang.String context, Schema expectedSchema, Schema providedSchema, boolean checkNullability, boolean checkOrdering)
    Validates whether the provided schema is compatible with the expected schema.
static void validateWriteSchema(Schema tableSchema, Schema writeSchema, java.lang.Boolean checkNullability, java.lang.Boolean checkOrdering)
    Check whether we could write the iceberg table with the user-provided write schema.
static <T> T visit(Schema schema, TypeUtil.CustomOrderSchemaVisitor<T> visitor)
static <T> T visit(Schema schema, TypeUtil.SchemaVisitor<T> visitor)
static <T> T visit(Type type, TypeUtil.CustomOrderSchemaVisitor<T> visitor)
    Used to traverse types with traversals other than post-order.
static <T> T visit(Type type, TypeUtil.SchemaVisitor<T> visitor)
-
-
-
Method Detail
-
project
public static Schema project(Schema schema, java.util.Set<java.lang.Integer> fieldIds)
Project extracts particular fields from a schema by ID.
Unlike select(Schema, Set), project will pick out only the fields enumerated. Structs that are explicitly projected are empty unless sub-fields are explicitly projected. Maps and lists cannot be explicitly selected in fieldIds.
- Parameters:
schema - the schema to project fields from
fieldIds - the IDs of the explicit fields to extract
- Returns:
the schema with all fields not selected removed
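The difference between projecting and selecting a struct can be illustrated with a toy model. The sketch below does not use Iceberg classes: `ToyField` and `ProjectSketch` are hypothetical stand-ins. It assumes that selecting a struct brings all of its sub-fields along, and that a struct is retained when only one of its sub-fields is listed. Maps and lists are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a schema field; this is NOT an Iceberg class.
class ToyField {
  final int id;
  final String name;
  final boolean struct;
  final List<ToyField> children;

  private ToyField(int id, String name, boolean struct, List<ToyField> children) {
    this.id = id;
    this.name = name;
    this.struct = struct;
    this.children = children;
  }

  static ToyField primitive(int id, String name) {
    return new ToyField(id, name, false, new ArrayList<ToyField>());
  }

  static ToyField struct(int id, String name, List<ToyField> children) {
    return new ToyField(id, name, true, children);
  }

  @Override
  public String toString() {
    return struct ? name + children : name;
  }
}

class ProjectSketch {
  // Keeps a primitive only if its id is listed. Keeps a struct if its id is
  // listed or a descendant is kept, but only with the children that were kept.
  static List<ToyField> project(List<ToyField> fields, Set<Integer> ids) {
    List<ToyField> kept = new ArrayList<>();
    for (ToyField f : fields) {
      if (f.struct) {
        List<ToyField> sub = project(f.children, ids);
        if (ids.contains(f.id) || !sub.isEmpty()) {
          kept.add(ToyField.struct(f.id, f.name, sub));
        }
      } else if (ids.contains(f.id)) {
        kept.add(f);
      }
    }
    return kept;
  }

  // Assumed select behavior: a listed field is kept whole, sub-fields included.
  static List<ToyField> select(List<ToyField> fields, Set<Integer> ids) {
    List<ToyField> kept = new ArrayList<>();
    for (ToyField f : fields) {
      if (ids.contains(f.id)) {
        kept.add(f);
      } else if (f.struct) {
        List<ToyField> sub = select(f.children, ids);
        if (!sub.isEmpty()) {
          kept.add(ToyField.struct(f.id, f.name, sub));
        }
      }
    }
    return kept;
  }

  // Toy schema: id (1), location (2) { lat (3), lon (4) }.
  private static List<ToyField> schema() {
    return Arrays.asList(
        ToyField.primitive(1, "id"),
        ToyField.struct(2, "location", Arrays.asList(
            ToyField.primitive(3, "lat"), ToyField.primitive(4, "lon"))));
  }

  static String demoProject(Integer... ids) {
    return project(schema(), new HashSet<>(Arrays.asList(ids))).toString();
  }

  static String demoSelect(Integer... ids) {
    return select(schema(), new HashSet<>(Arrays.asList(ids))).toString();
  }
}
```

In this model, `demoProject(2)` yields an empty `location` struct, `demoProject(2, 3)` yields `location` with only `lat`, and `demoSelect(2)` yields `location` with both sub-fields.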
-
project
public static Types.StructType project(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds)
-
select
public static Types.StructType select(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds)
-
getProjectedIds
public static java.util.Set<java.lang.Integer> getProjectedIds(Schema schema)
-
getProjectedIds
public static java.util.Set<java.lang.Integer> getProjectedIds(Type type)
-
selectNot
public static Types.StructType selectNot(Types.StructType struct, java.util.Set<java.lang.Integer> fieldIds)
-
indexByName
public static java.util.Map<java.lang.String,java.lang.Integer> indexByName(Types.StructType struct)
-
indexNameById
public static java.util.Map<java.lang.Integer,java.lang.String> indexNameById(Types.StructType struct)
-
indexQuotedNameById
public static java.util.Map<java.lang.Integer,java.lang.String> indexQuotedNameById(Types.StructType struct, java.util.function.Function<java.lang.String,java.lang.String> quotingFunc)
-
indexByLowerCaseName
public static java.util.Map<java.lang.String,java.lang.Integer> indexByLowerCaseName(Types.StructType struct)
-
indexById
public static java.util.Map<java.lang.Integer,Types.NestedField> indexById(Types.StructType struct)
-
indexParents
public static java.util.Map<java.lang.Integer,java.lang.Integer> indexParents(Types.StructType struct)
-
assignFreshIds
public static Type assignFreshIds(Type type, TypeUtil.NextID nextId)
Assigns fresh ids from the nextId function for all fields in a type.
- Parameters:
type - a type
nextId - an id assignment function
- Returns:
a structurally identical type with new ids assigned by the nextId function
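An id assignment function is typically a small callback that hands out the next unused id each time it is called. The sketch below shows that pattern with a counter. `NextId` here is a hypothetical single-method interface standing in for TypeUtil.NextID, and the flat list of field names is a simplification, not an Iceberg type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class FreshIdSketch {
  // Hypothetical single-method interface playing the role of TypeUtil.NextID.
  interface NextId {
    int get();
  }

  // Returns one fresh id per field name, in order, drawn from nextId.
  static List<Integer> assignFreshIds(List<String> fieldNames, NextId nextId) {
    List<Integer> ids = new ArrayList<>();
    for (String ignored : fieldNames) {
      ids.add(nextId.get());
    }
    return ids;
  }

  // Continues numbering after lastAssignedId, e.g. the highest id already in use.
  static String demo(int lastAssignedId) {
    AtomicInteger counter = new AtomicInteger(lastAssignedId);
    List<String> names = Arrays.asList("id", "name", "ts");
    return assignFreshIds(names, counter::incrementAndGet).toString();
  }
}
```

Passing a method reference to a shared counter keeps the numbering state outside the function that walks the fields, so the same counter can be reused across several calls.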
-
assignFreshIds
public static Schema assignFreshIds(Schema schema, TypeUtil.NextID nextId)
Assigns fresh ids from the nextId function for all fields in a schema.
- Parameters:
schema - a schema
nextId - an id assignment function
- Returns:
a structurally identical schema with new ids assigned by the nextId function
-
assignFreshIds
public static Schema assignFreshIds(int schemaId, Schema schema, TypeUtil.NextID nextId)
Assigns fresh ids from the nextId function for all fields in a schema.
- Parameters:
schemaId - an ID assigned to this schema
schema - a schema
nextId - an id assignment function
- Returns:
a structurally identical schema with new ids assigned by the nextId function
-
assignFreshIds
public static Schema assignFreshIds(Schema schema, Schema baseSchema, TypeUtil.NextID nextId)
Assigns ids to match a given schema, and fresh ids from the nextId function for all other fields.
- Parameters:
schema - a schema
baseSchema - a schema with existing IDs to copy by name
nextId - an id assignment function
- Returns:
a structurally identical schema with new ids assigned by the nextId function
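The copy-by-name rule can be sketched on a flat list of field names. This is a toy model and not Iceberg code: `MatchOrFreshSketch` is hypothetical, the base schema is reduced to a name-to-id map, and nested fields are ignored.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

class MatchOrFreshSketch {
  // Reuses the base id when the name exists in baseIds, otherwise draws a fresh one.
  static Map<String, Integer> assign(
      List<String> names, Map<String, Integer> baseIds, IntSupplier nextId) {
    Map<String, Integer> result = new LinkedHashMap<>();
    for (String name : names) {
      Integer existing = baseIds.get(name);
      result.put(name, existing != null ? existing : nextId.getAsInt());
    }
    return result;
  }

  static String demo() {
    Map<String, Integer> base = new HashMap<>();
    base.put("id", 1);
    base.put("name", 2);
    // Start fresh ids after the highest id used by the base schema.
    AtomicInteger counter = new AtomicInteger(2);
    List<String> names = Arrays.asList("id", "name", "email");
    return assign(names, base, counter::incrementAndGet).toString();
  }
}
```

Here `id` and `name` keep the ids from the base map, and only `email` consumes a fresh id.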
-
refreshIdentifierFields
public static java.util.Set<java.lang.Integer> refreshIdentifierFields(Types.StructType freshSchema, Schema baseSchema)
Get the identifier fields in the fresh schema based on the identifier fields in the base schema.
- Parameters:
freshSchema - fresh schema
baseSchema - base schema
- Returns:
identifier fields in the fresh schema
-
assignIncreasingFreshIds
public static Schema assignIncreasingFreshIds(Schema schema)
Assigns strictly increasing fresh ids for all fields in a schema, starting from 1.
- Parameters:
schema - a schema
- Returns:
a structurally identical schema with new ids assigned strictly increasing from 1
-
reassignIds
public static Schema reassignIds(Schema schema, Schema idSourceSchema)
Reassigns ids in a schema from another schema.
Ids are determined by field names. If a field in the schema cannot be found in the source schema, this will throw IllegalArgumentException.
This will not alter a schema's structure, nullability, or types.
- Parameters:
schema - the schema to have ids reassigned
idSourceSchema - the schema from which field ids will be used
- Returns:
a structurally identical schema with field ids matching the source schema
- Throws:
java.lang.IllegalArgumentException - if a field cannot be found (by name) in the source schema
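The lookup-or-fail contract can be sketched on a flat list of names. This is a toy model, not the library's implementation: `ReassignIdsSketch` is hypothetical, the source schema is reduced to a name-to-id map, and the exception message is made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReassignIdsSketch {
  // Looks up every field name in sourceIds and fails if one is missing.
  static Map<String, Integer> reassignIds(List<String> names, Map<String, Integer> sourceIds) {
    Map<String, Integer> result = new LinkedHashMap<>();
    for (String name : names) {
      Integer id = sourceIds.get(name);
      if (id == null) {
        throw new IllegalArgumentException("Field " + name + " not found in source schema");
      }
      result.put(name, id);
    }
    return result;
  }

  // Toy source schema: id -> 5, name -> 7.
  private static Map<String, Integer> source() {
    Map<String, Integer> ids = new HashMap<>();
    ids.put("id", 5);
    ids.put("name", 7);
    return ids;
  }

  static String demo(String... names) {
    return reassignIds(Arrays.asList(names), source()).toString();
  }

  // Returns true when the lookup is rejected with IllegalArgumentException.
  static boolean fails(String... names) {
    try {
      reassignIds(Arrays.asList(names), source());
      return false;
    } catch (IllegalArgumentException e) {
      return true;
    }
  }
}
```

The order of the input names is preserved and only the ids change, which mirrors the statement that structure is not altered.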
-
reassignDoc
public static Schema reassignDoc(Schema schema, Schema docSourceSchema)
Reassigns doc in a schema from another schema.
Docs are determined by field id. If a field in the schema cannot be found in the source schema, this will throw IllegalArgumentException.
This will not alter a schema's structure, nullability, or types.
- Parameters:
schema - the schema to have doc reassigned
docSourceSchema - the schema from which field doc will be used
- Returns:
a structurally identical schema with field docs matching the source schema
- Throws:
java.lang.IllegalArgumentException - if a field cannot be found (by id) in the source schema
-
reassignIds
public static Schema reassignIds(Schema schema, Schema idSourceSchema, boolean caseSensitive)
Reassigns ids in a schema from another schema.
Ids are determined by field names. If a field in the schema cannot be found in the source schema, this will throw IllegalArgumentException.
This will not alter a schema's structure, nullability, or types.
- Parameters:
schema - the schema to have ids reassigned
idSourceSchema - the schema from which field ids will be used
- Returns:
a structurally identical schema with field ids matching the source schema
- Throws:
java.lang.IllegalArgumentException - if a field cannot be found (by name) in the source schema
-
reassignOrRefreshIds
public static Schema reassignOrRefreshIds(Schema schema, Schema idSourceSchema)
-
reassignOrRefreshIds
public static Schema reassignOrRefreshIds(Schema schema, Schema idSourceSchema, boolean caseSensitive)
-
isPromotionAllowed
public static boolean isPromotionAllowed(Type from, Type.PrimitiveType to)
-
validateWriteSchema
public static void validateWriteSchema(Schema tableSchema, Schema writeSchema, java.lang.Boolean checkNullability, java.lang.Boolean checkOrdering)
Checks whether the Iceberg table can be written with the user-provided write schema.
- Parameters:
tableSchema - the table schema written in Iceberg metadata
writeSchema - the user-provided write schema
checkNullability - if true, writing optional values to a required field is not allowed
checkOrdering - if true, the input schema is not allowed to have a different field ordering than the table schema
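The two flags can be pictured with a toy check over flat column lists. This is only a sketch under assumptions: `WriteSchemaCheckSketch` and `Col` are hypothetical, problems are collected in a list rather than reported the way the library does, and type compatibility and nested fields are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class WriteSchemaCheckSketch {
  // Hypothetical column description; NOT an Iceberg class.
  static class Col {
    final String name;
    final boolean required;

    Col(String name, boolean required) {
      this.name = name;
      this.required = required;
    }
  }

  // Returns the problems found; an empty list means the write schema passes.
  static List<String> check(
      List<Col> table, List<Col> write, boolean checkNullability, boolean checkOrdering) {
    List<String> problems = new ArrayList<>();
    int lastIndex = -1;
    for (Col w : write) {
      int index = indexOf(table, w.name);
      if (index < 0) {
        continue; // columns unknown to the table are out of scope for this sketch
      }
      if (checkNullability && table.get(index).required && !w.required) {
        problems.add("optional values cannot be written to required field " + w.name);
      }
      if (checkOrdering && index < lastIndex) {
        problems.add("field " + w.name + " is out of order");
      }
      lastIndex = Math.max(lastIndex, index);
    }
    return problems;
  }

  private static int indexOf(List<Col> cols, String name) {
    for (int i = 0; i < cols.size(); i += 1) {
      if (cols.get(i).name.equals(name)) {
        return i;
      }
    }
    return -1;
  }

  // Table: id (required), name (optional), ts (required). Write: ts (optional), id (required).
  static int demo(boolean checkNullability, boolean checkOrdering) {
    List<Col> table =
        Arrays.asList(new Col("id", true), new Col("name", false), new Col("ts", true));
    List<Col> write = Arrays.asList(new Col("ts", false), new Col("id", true));
    return check(table, write, checkNullability, checkOrdering).size();
  }
}
```

With both flags off the toy write schema passes; each flag adds one problem, the optional `ts` for nullability and the misplaced `id` for ordering.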
-
validateSchema
public static void validateSchema(java.lang.String context, Schema expectedSchema, Schema providedSchema, boolean checkNullability, boolean checkOrdering)
Validates whether the provided schema is compatible with the expected schema.
- Parameters:
context - the schema context (e.g. row ID)
expectedSchema - the expected schema
providedSchema - the provided schema
checkNullability - whether to check field nullability
checkOrdering - whether to check field ordering
-
estimateSize
public static int estimateSize(Types.NestedField field)
Estimates the number of bytes a value for a given field may occupy in memory.
This method approximates the memory size based on heuristics and the internal Java representation defined by Type.TypeID. The actual size might differ from this estimate. The method handles a variety of data types, including primitive types, strings, and nested types such as structs, maps, and lists.
- Parameters:
field - a field for which to estimate the size
- Returns:
the estimated size in bytes of the field's value in memory
-
visit
public static <T> T visit(Schema schema, TypeUtil.SchemaVisitor<T> visitor)
-
visit
public static <T> T visit(Type type, TypeUtil.SchemaVisitor<T> visitor)
-
visit
public static <T> T visit(Schema schema, TypeUtil.CustomOrderSchemaVisitor<T> visitor)
-
visit
public static <T> T visit(Type type, TypeUtil.CustomOrderSchemaVisitor<T> visitor)
Used to traverse types with traversals other than post-order.
This passes a Supplier to each visitor method that returns the result of traversing child types. Structs are passed an Iterable that traverses child fields during iteration.
An example use is assigning column IDs, which should be done with a pre-order traversal.
- Type Parameters:
T - the type returned by the visitor
- Parameters:
type - a type to traverse with a visitor
visitor - a custom order visitor
- Returns:
the result of traversing the given type with the visitor
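Handing the visitor a Supplier per child lets the visitor decide when, or whether, the child traversal runs. The sketch below shows the idea on a toy tree. `Node`, `Visitor`, and `CustomOrderSketch` are hypothetical and are not the Iceberg visitor classes, and a List of suppliers is used here in place of the Iterable mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

class CustomOrderSketch {
  // Hypothetical tree node; NOT an Iceberg type.
  static class Node {
    final String name;
    final List<Node> children;

    Node(String name, Node... children) {
      this.name = name;
      this.children = Arrays.asList(children);
    }
  }

  // Hypothetical visitor: each Supplier traverses one child only when get() is called.
  interface Visitor<T> {
    T visit(Node node, List<Supplier<T>> childResults);
  }

  static <T> T visit(Node node, Visitor<T> visitor) {
    List<Supplier<T>> futures = new ArrayList<>();
    for (Node child : node.children) {
      futures.add(() -> visit(child, visitor));
    }
    return visitor.visit(node, futures);
  }

  // Pre-order numbering: a node takes its id before its children are traversed.
  static String preOrder(Node root) {
    int[] next = {1};
    return CustomOrderSketch.<String>visit(root, (node, children) -> {
      StringBuilder sb = new StringBuilder(node.name + "=" + next[0]++);
      for (Supplier<String> child : children) {
        sb.append(' ').append(child.get());
      }
      return sb.toString();
    });
  }

  // Post-order numbering: children are traversed first, then the node takes its id.
  static String postOrder(Node root) {
    int[] next = {1};
    return CustomOrderSketch.<String>visit(root, (node, children) -> {
      StringBuilder sb = new StringBuilder();
      for (Supplier<String> child : children) {
        sb.append(child.get()).append(' ');
      }
      return sb.append(node.name).append('=').append(next[0]++).toString();
    });
  }

  // Toy tree: root { a { x }, b }.
  static String demo(boolean pre) {
    Node root = new Node("root", new Node("a", new Node("x")), new Node("b"));
    return pre ? preOrder(root) : postOrder(root);
  }
}
```

The same `visit` driver produces both orders; only the point at which the visitor calls `get()` changes.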
-
decimalRequiredBytes
public static int decimalRequiredBytes(int precision)
-
-