Class TypeUtil

java.lang.Object
org.apache.iceberg.types.TypeUtil

public class TypeUtil extends Object
  • Method Details

    • project

      public static Schema project(Schema schema, Set<Integer> fieldIds)
      Project extracts particular fields from a schema by ID.

      Unlike select(Schema, Set), project will pick out only the fields enumerated. Structs that are explicitly projected are empty unless sub-fields are explicitly projected. Maps and lists cannot be explicitly selected in fieldIds.

      Parameters:
      schema - to project fields from
      fieldIds - set of IDs of the fields to extract
      Returns:
      the schema with all fields not selected removed
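
      The difference between project and select described above can be illustrated with a small model. This is a minimal sketch under stated assumptions, not Iceberg's implementation: ProjectionSketch and its nested Field class are hypothetical stand-ins for a schema, and a struct is simply a field with children. In this simplified model an emptied struct and a primitive field both have no children.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ProjectionSketch {
  // Hypothetical stand-in for a schema field; not an Iceberg class.
  static final class Field {
    final int id;
    final String name;
    final List<Field> children;

    Field(int id, String name, List<Field> children) {
      this.id = id;
      this.name = name;
      this.children = children;
    }
  }

  // project-like: keep a field only if it, or one of its descendants, is listed.
  // A listed struct keeps just the children that are themselves kept.
  static List<Field> project(List<Field> fields, Set<Integer> ids) {
    List<Field> kept = new ArrayList<>();
    for (Field field : fields) {
      List<Field> keptChildren = project(field.children, ids);
      if (ids.contains(field.id) || !keptChildren.isEmpty()) {
        kept.add(new Field(field.id, field.name, keptChildren));
      }
    }
    return kept;
  }

  // select-like: a listed struct is kept whole, including all of its children.
  static List<Field> select(List<Field> fields, Set<Integer> ids) {
    List<Field> kept = new ArrayList<>();
    for (Field field : fields) {
      if (ids.contains(field.id)) {
        kept.add(field);
      } else {
        List<Field> keptChildren = select(field.children, ids);
        if (!keptChildren.isEmpty()) {
          kept.add(new Field(field.id, field.name, keptChildren));
        }
      }
    }
    return kept;
  }
}
```

      In this model, projecting only the ID of a struct yields that struct with no sub-fields, while selecting the same ID keeps every sub-field.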
    • project

      public static Types.StructType project(Types.StructType struct, Set<Integer> fieldIds)
    • select

      public static Schema select(Schema schema, Set<Integer> fieldIds)
    • select

      public static Types.StructType select(Types.StructType struct, Set<Integer> fieldIds)
    • getProjectedIds

      public static Set<Integer> getProjectedIds(Schema schema)
    • getProjectedIds

      public static Set<Integer> getProjectedIds(Type type)
    • selectNot

      public static Types.StructType selectNot(Types.StructType struct, Set<Integer> fieldIds)
    • selectNot

      public static Schema selectNot(Schema schema, Set<Integer> fieldIds)
    • join

      public static Schema join(Schema left, Schema right)
    • indexByName

      public static Map<String,Integer> indexByName(Types.StructType struct)
    • indexNameById

      public static Map<Integer,String> indexNameById(Types.StructType struct)
    • indexQuotedNameById

      public static Map<Integer,String> indexQuotedNameById(Types.StructType struct, Function<String,String> quotingFunc)
    • indexByLowerCaseName

      public static Map<String,Integer> indexByLowerCaseName(Types.StructType struct)
      Creates a mapping from lower-case field names to their corresponding field IDs.

      This method iterates over the fields of the provided struct and maps each field's name (converted to lower-case) to its ID. If two fields have the same lower-case name, an IllegalArgumentException is thrown.

      Parameters:
      struct - the struct type whose fields are to be indexed
      Returns:
      a map where the keys are lower-case field names and the values are field IDs
      Throws:
      IllegalArgumentException - if two fields have the same lower-case name
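
      The duplicate check described above can be sketched in plain Java. This is an illustration only, not Iceberg's code: LowerCaseIndexSketch is a hypothetical class, the input map of field names to IDs stands in for a struct's fields, and lower-casing with Locale.ROOT is an assumption of the sketch.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class LowerCaseIndexSketch {
  // fieldIdsByName is a hypothetical stand-in for the fields of a struct.
  static Map<String, Integer> indexByLowerCaseName(Map<String, Integer> fieldIdsByName) {
    Map<String, Integer> index = new LinkedHashMap<>();
    for (Map.Entry<String, Integer> field : fieldIdsByName.entrySet()) {
      String key = field.getKey().toLowerCase(Locale.ROOT);
      // putIfAbsent returns the existing ID when the lower-case name is already indexed.
      Integer previous = index.putIfAbsent(key, field.getValue());
      if (previous != null) {
        throw new IllegalArgumentException(
            "Duplicate lower-case field name: " + key);
      }
    }
    return index;
  }
}
```

      With this sketch, fields named ID and Name index as id and name, while a struct holding both a and A is rejected.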
    • indexById

      public static Map<Integer,Types.NestedField> indexById(Types.StructType struct)
    • indexParents

      public static Map<Integer,Integer> indexParents(Types.StructType struct)
    • assignFreshIds

      public static Type assignFreshIds(Type type, TypeUtil.NextID nextId)
      Assigns fresh ids from the nextId function for all fields in a type.
      Parameters:
      type - a type
      nextId - an id assignment function
      Returns:
      a structurally identical type with new ids assigned by the nextId function
    • assignFreshIds

      public static Schema assignFreshIds(Schema schema, TypeUtil.NextID nextId)
      Assigns fresh ids from the nextId function for all fields in a schema.
      Parameters:
      schema - a schema
      nextId - an id assignment function
      Returns:
      a structurally identical schema with new ids assigned by the nextId function
    • assignFreshIds

      public static Schema assignFreshIds(int schemaId, Schema schema, TypeUtil.NextID nextId)
      Assigns fresh ids from the nextId function for all fields in a schema.
      Parameters:
      schemaId - an ID assigned to this schema
      schema - a schema
      nextId - an id assignment function
      Returns:
      a structurally identical schema with new ids assigned by the nextId function
    • assignFreshIds

      public static Schema assignFreshIds(Schema schema, Schema baseSchema, TypeUtil.NextID nextId)
      Assigns ids to match a given schema, and fresh ids from the nextId function for all other fields.
      Parameters:
      schema - a schema
      baseSchema - a schema with existing IDs to copy by name
      nextId - an id assignment function
      Returns:
      a structurally identical schema with new ids assigned by the nextId function
    • refreshIdentifierFields

      public static Set<Integer> refreshIdentifierFields(Types.StructType freshSchema, Schema baseSchema)
      Get the identifier fields in the fresh schema based on the identifier fields in the base schema.
      Parameters:
      freshSchema - fresh schema
      baseSchema - base schema
      Returns:
      identifier fields in the fresh schema
    • assignIncreasingFreshIds

      public static Schema assignIncreasingFreshIds(Schema schema)
      Assigns strictly increasing fresh ids for all fields in a schema, starting from 1.
      Parameters:
      schema - a schema
      Returns:
      a structurally identical schema with new ids assigned strictly increasing from 1
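
      A counter-backed id assignment function is one way to obtain strictly increasing ids starting from 1. The following is a minimal sketch, not Iceberg's implementation: FreshIdSketch and its NextId interface are hypothetical stand-ins for an id assignment function, and a flat list of field names stands in for a schema.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

final class FreshIdSketch {
  // Hypothetical stand-in for an id assignment function.
  interface NextId {
    int get();
  }

  // Gives every field a new id taken from nextId, in field order.
  static Map<String, Integer> assignFreshIds(List<String> fieldNames, NextId nextId) {
    Map<String, Integer> assigned = new LinkedHashMap<>();
    for (String name : fieldNames) {
      assigned.put(name, nextId.get());
    }
    return assigned;
  }

  // Strictly increasing ids starting from 1, backed by a counter.
  static Map<String, Integer> assignIncreasingFreshIds(List<String> fieldNames) {
    AtomicInteger lastId = new AtomicInteger(0);
    return assignFreshIds(fieldNames, lastId::incrementAndGet);
  }
}
```

      Because the counter starts at 0 and is incremented before each read, the first field in the sketch receives id 1.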
    • reassignIds

      public static Schema reassignIds(Schema schema, Schema idSourceSchema)
      Reassigns ids in a schema from another schema.

      Ids are determined by field names. If a field in the schema cannot be found in the source schema, this will throw IllegalArgumentException.

      This will not alter a schema's structure, nullability, or types.

      Parameters:
      schema - the schema to have ids reassigned
      idSourceSchema - the schema from which field ids will be used
      Returns:
      a structurally identical schema with field ids matching the source schema
      Throws:
      IllegalArgumentException - if a field cannot be found (by name) in the source schema
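
      The by-name lookup and its failure case can be sketched as follows. This is an illustration under stated assumptions, not Iceberg's code: ReassignIdsSketch is hypothetical, a list of field names stands in for the schema, and a name-to-id map stands in for the source schema.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ReassignIdsSketch {
  // Looks up each field name in the id source and copies its id.
  static Map<String, Integer> reassignIds(
      List<String> fieldNames, Map<String, Integer> idSource) {
    Map<String, Integer> reassigned = new LinkedHashMap<>();
    for (String name : fieldNames) {
      Integer sourceId = idSource.get(name);
      if (sourceId == null) {
        // Mirrors the documented contract: a missing name is an error.
        throw new IllegalArgumentException(
            "Field " + name + " not found in source schema");
      }
      reassigned.put(name, sourceId);
    }
    return reassigned;
  }
}
```

      Only ids change in the sketch; field order and names are carried over untouched, in line with the statement that structure is not altered.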
    • reassignDoc

      public static Schema reassignDoc(Schema schema, Schema docSourceSchema)
      Reassigns field docs in a schema from another schema.

      Docs are determined by field id. If a field in the schema cannot be found in the source schema, this will throw IllegalArgumentException.

      This will not alter a schema's structure, nullability, or types.

      Parameters:
      schema - the schema to have docs reassigned
      docSourceSchema - the schema from which field docs will be used
      Returns:
      a structurally identical schema with field docs matching the source schema
      Throws:
      IllegalArgumentException - if a field cannot be found (by id) in the source schema
    • reassignIds

      public static Schema reassignIds(Schema schema, Schema idSourceSchema, boolean caseSensitive)
      Reassigns ids in a schema from another schema.

      Ids are determined by field names. If a field in the schema cannot be found in the source schema, this will throw IllegalArgumentException.

      This will not alter a schema's structure, nullability, or types.

      Parameters:
      schema - the schema to have ids reassigned
      idSourceSchema - the schema from which field ids will be used
      caseSensitive - whether field names are matched case-sensitively
      Returns:
      a structurally identical schema with field ids matching the source schema
      Throws:
      IllegalArgumentException - if a field cannot be found (by name) in the source schema
    • reassignOrRefreshIds

      public static Schema reassignOrRefreshIds(Schema schema, Schema idSourceSchema)
    • reassignOrRefreshIds

      public static Schema reassignOrRefreshIds(Schema schema, Schema idSourceSchema, boolean caseSensitive)
    • assignIds

      public static Type assignIds(Type type, TypeUtil.GetID getId)
      Assigns fresh ids from the getId function for all fields in a type.
      Parameters:
      type - a type
      getId - an id assignment function
      Returns:
      a structurally identical type with new ids assigned by the getId function
    • find

      public static Type find(Schema schema, Predicate<Type> predicate)
    • find

      public static Type find(Type type, Predicate<Type> predicate)
    • isPromotionAllowed

      public static boolean isPromotionAllowed(Type from, Type.PrimitiveType to)
    • validateWriteSchema

      public static void validateWriteSchema(Schema tableSchema, Schema writeSchema, Boolean checkNullability, Boolean checkOrdering)
      Checks whether the Iceberg table can be written with the user-provided write schema.
      Parameters:
      tableSchema - the table schema stored in the Iceberg metadata
      writeSchema - the user-provided write schema
      checkNullability - if true, writing optional values to a required field is not allowed
      checkOrdering - if true, the write schema must have the same field ordering as the table schema
    • validateSchema

      public static void validateSchema(String context, Schema expectedSchema, Schema providedSchema, boolean checkNullability, boolean checkOrdering)
      Validates whether the provided schema is compatible with the expected schema.
      Parameters:
      context - the schema context (e.g. row ID)
      expectedSchema - the expected schema
      providedSchema - the provided schema
      checkNullability - whether to check field nullability
      checkOrdering - whether to check field ordering
    • estimateSize

      public static int estimateSize(Types.NestedField field)
      Estimates the number of bytes a value for a given field may occupy in memory.

      This method approximates the memory size based on heuristics and the internal Java representation defined by Type.TypeID. It is important to note that the actual size might differ from this estimation. The method is designed to handle a variety of data types, including primitive types, strings, and nested types such as structs, maps, and lists.

      Parameters:
      field - a field for which to estimate the size
      Returns:
      the estimated size in bytes of the field's value in memory
    • visit

      public static <T> T visit(Schema schema, TypeUtil.SchemaVisitor<T> visitor)
    • visit

      public static <T> T visit(Type type, TypeUtil.SchemaVisitor<T> visitor)
    • visit

      public static <T> T visit(Schema schema, TypeUtil.CustomOrderSchemaVisitor<T> visitor)
    • visit

      public static <T> T visit(Type type, TypeUtil.CustomOrderSchemaVisitor<T> visitor)
      Used to traverse types with traversals other than post-order.

      This passes a Supplier to each visitor method that returns the result of traversing child types. Structs are passed an Iterable that traverses child fields during iteration.

      An example use is assigning column IDs, which should be done with a pre-order traversal.

      Type Parameters:
      T - the type returned by the visitor
      Parameters:
      type - a type to traverse with a visitor
      visitor - a custom order visitor
      Returns:
      the result of traversing the given type with the visitor
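
      The Supplier-based traversal described above can be sketched with a small tree model. This is a hedged illustration, not Iceberg's visitor API: CustomOrderSketch, TypeNode and Visitor are hypothetical names, and the example assigns ids in a plain pre-order walk. The point is that a child is traversed only when the visitor invokes its Supplier, so the visitor decides the order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class CustomOrderSketch {
  // Hypothetical type tree: a named node with child nodes.
  static final class TypeNode {
    final String name;
    final List<TypeNode> children;

    TypeNode(String name, List<TypeNode> children) {
      this.name = name;
      this.children = children;
    }
  }

  // Hypothetical visitor: child results are supplied lazily.
  interface Visitor<T> {
    T node(TypeNode node, List<Supplier<T>> childResults);
  }

  // Wraps each child traversal in a Supplier instead of running it eagerly.
  static <T> T visit(TypeNode node, Visitor<T> visitor) {
    List<Supplier<T>> childResults = new ArrayList<>();
    for (TypeNode child : node.children) {
      childResults.add(() -> visit(child, visitor));
    }
    return visitor.node(node, childResults);
  }

  // Pre-order id assignment: a node takes its id before its children are traversed.
  static List<String> assignIdsPreOrder(TypeNode root) {
    int[] nextId = {1};
    List<String> assigned = new ArrayList<>();
    Visitor<Void> assignIds = (node, childResults) -> {
      assigned.add(node.name + "=" + nextId[0]++);
      for (Supplier<Void> child : childResults) {
        child.get(); // the child is traversed only now
      }
      return null;
    };
    visit(root, assignIds);
    return assigned;
  }
}
```

      A post-order visitor could use the same visit method by calling every Supplier before handling the node itself.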
    • decimalRequiredBytes

      public static int decimalRequiredBytes(int precision)