Package org.apache.iceberg.data
Class TableMigrationUtil
- java.lang.Object
  - org.apache.iceberg.data.TableMigrationUtil
public class TableMigrationUtil extends java.lang.Object
Method Summary
Modifier and Type / Method / Description
static java.util.List<DataFile>
listPartition(java.util.Map<java.lang.String,java.lang.String> partition, java.lang.String uri, java.lang.String format, PartitionSpec spec, org.apache.hadoop.conf.Configuration conf, MetricsConfig metricsConfig, NameMapping mapping)
Returns the data files in a partition by listing the partition location.
static java.util.List<DataFile>
listPartition(java.util.Map<java.lang.String,java.lang.String> partition, java.lang.String partitionUri, java.lang.String format, PartitionSpec spec, org.apache.hadoop.conf.Configuration conf, MetricsConfig metricsSpec, NameMapping mapping, int parallelism)
Returns the data files in a partition by listing the partition location.
Method Detail
-
listPartition
public static java.util.List<DataFile> listPartition(java.util.Map<java.lang.String,java.lang.String> partition, java.lang.String uri, java.lang.String format, PartitionSpec spec, org.apache.hadoop.conf.Configuration conf, MetricsConfig metricsConfig, NameMapping mapping)
Returns the data files in a partition by listing the partition location.
For Parquet and ORC partitions, metrics are read from the file footer. For Avro partitions, all metrics other than row count are set to null.
Note: metrics that only Iceberg file writers record and that file footers do not carry, such as NaN counts, are not populated.
- Parameters:
partition - map of column names to column values for the partition
uri - partition location URI
format - partition format, avro, parquet or orc
spec - a partition spec
conf - a Hadoop conf
metricsConfig - a metrics conf
mapping - a name mapping
- Returns:
a List of DataFile
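The partition argument pairs column names with column values, and uri points at the directory holding that partition's files. The sketch below shows one way such a map might be derived from a location, assuming the source table uses a Hive-style key=value directory layout (this page does not state the layout). PartitionPathSketch and partitionFromPath are hypothetical names for illustration, not part of Iceberg, and the example does no URL decoding of escaped values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Iceberg: derives a partition map
// from a Hive-style location such as ".../dt=2024-01-01/region=us".
final class PartitionPathSketch {
  private PartitionPathSketch() {}

  /** Collects every "key=value" path segment into an insertion-ordered map. */
  static Map<String, String> partitionFromPath(String uri) {
    Map<String, String> partition = new LinkedHashMap<>();
    for (String segment : uri.split("/")) {
      int eq = segment.indexOf('=');
      // Skip segments without a key (no '=' or '=' as the first character).
      if (eq > 0) {
        partition.put(segment.substring(0, eq), segment.substring(eq + 1));
      }
    }
    return partition;
  }
}
```

A map built this way has the shape that the partition parameter expects (column name to column value); the remaining arguments (spec, conf, metricsConfig, mapping) still come from the target table and the Hadoop environment.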
-
listPartition
public static java.util.List<DataFile> listPartition(java.util.Map<java.lang.String,java.lang.String> partition, java.lang.String partitionUri, java.lang.String format, PartitionSpec spec, org.apache.hadoop.conf.Configuration conf, MetricsConfig metricsSpec, NameMapping mapping, int parallelism)
Returns the data files in a partition by listing the partition location. Metrics are read from the files, and the files are read in parallel by the specified number of threads.
For Parquet and ORC partitions, metrics are read from the file footer. For Avro partitions, all metrics other than row count are set to null.
Note: metrics that only Iceberg file writers record and that file footers do not carry, such as NaN counts, are not populated.
- Parameters:
partition - map of column names to column values for the partition
partitionUri - partition location URI
format - partition format, avro, parquet or orc
spec - a partition spec
conf - a Hadoop conf
metricsSpec - a metrics conf
mapping - a name mapping
parallelism - number of threads to use for file reading
- Returns:
a List of DataFile
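The parallelism parameter bounds how many threads read files at once. The sketch below illustrates that general pattern with a fixed-size thread pool from the JDK; it is not Iceberg's implementation. ParallelReadSketch, readAll, and the reader function (a stand-in for per-file metrics reading) are hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch, not Iceberg code: applies a reader to each file
// using at most `parallelism` threads and keeps results in input order.
final class ParallelReadSketch {
  private ParallelReadSketch() {}

  static <T, R> List<R> readAll(List<T> files, Function<T, R> reader, int parallelism) {
    if (parallelism < 1) {
      throw new IllegalArgumentException("parallelism must be at least 1");
    }
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      // Submit one task per file; the pool runs at most `parallelism` at a time.
      List<Future<R>> futures = new ArrayList<>();
      for (T file : files) {
        Callable<R> task = () -> reader.apply(file);
        futures.add(pool.submit(task));
      }
      // Collect in submission order so output order matches input order.
      List<R> results = new ArrayList<>();
      for (Future<R> future : futures) {
        results.add(future.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while reading files", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("file read failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

With a parallelism of 1 the files are read one after another on a single worker thread; larger values mainly help when each read is dominated by I/O latency, as footer reads from remote storage often are.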