Package org.apache.iceberg.spark
Class SparkDataFile
- java.lang.Object
-
- org.apache.iceberg.spark.SparkDataFile
-
- All Implemented Interfaces:
ContentFile<DataFile>, DataFile
public class SparkDataFile extends java.lang.Object implements DataFile
-
-
Field Summary
-
Fields inherited from interface org.apache.iceberg.DataFile
COLUMN_SIZES, CONTENT, EQUALITY_IDS, FILE_FORMAT, FILE_PATH, FILE_SIZE, KEY_METADATA, LOWER_BOUNDS, NAN_VALUE_COUNTS, NULL_VALUE_COUNTS, PARTITION_DOC, PARTITION_ID, PARTITION_NAME, RECORD_COUNT, SORT_ORDER_ID, SPEC_ID, SPLIT_OFFSETS, UPPER_BOUNDS, VALUE_COUNTS
-
-
Constructor Summary
Constructors
Constructor | Description
SparkDataFile(Types.StructType type, org.apache.spark.sql.types.StructType sparkType) |
-
Method Summary
All Methods, Instance Methods, Concrete Methods
Modifier and Type | Method | Description
java.util.Map<java.lang.Integer,java.lang.Long> | columnSizes() | Returns if collected, map from column ID to the size of the column in bytes, null otherwise.
DataFile | copy() | Copies this file.
DataFile | copyWithoutStats() | Copies this file without file stats.
long | fileSizeInBytes() | Returns the file size in bytes.
FileFormat | format() | Returns format of the file.
java.nio.ByteBuffer | keyMetadata() | Returns metadata about how this file is encrypted, or null if the file is stored in plain text.
java.util.Map<java.lang.Integer,java.nio.ByteBuffer> | lowerBounds() | Returns if collected, map from column ID to value lower bounds, null otherwise.
java.util.Map<java.lang.Integer,java.lang.Long> | nanValueCounts() | Returns if collected, map from column ID to its NaN value count, null otherwise.
java.util.Map<java.lang.Integer,java.lang.Long> | nullValueCounts() | Returns if collected, map from column ID to its null value count, null otherwise.
StructLike | partition() | Returns partition for this file as a StructLike.
java.lang.CharSequence | path() | Returns fully qualified path to the file, suitable for constructing a Hadoop Path.
java.lang.Long | pos() | Returns the ordinal position of the file in a manifest, or null if it was not read from a manifest.
long | recordCount() | Returns the number of top-level records in the file.
java.lang.Integer | sortOrderId() | Returns the sort order id of this file, which describes how the file is ordered.
int | specId() | Returns id of the partition spec used for partition metadata.
java.util.List<java.lang.Long> | splitOffsets() | Returns list of recommended split locations, if applicable, null otherwise.
java.util.Map<java.lang.Integer,java.nio.ByteBuffer> | upperBounds() | Returns if collected, map from column ID to value upper bounds, null otherwise.
java.util.Map<java.lang.Integer,java.lang.Long> | valueCounts() | Returns if collected, map from column ID to the count of its non-null values, null otherwise.
SparkDataFile | wrap(org.apache.spark.sql.Row row) |
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.iceberg.DataFile
content, equalityFieldIds
-
-
-
-
Constructor Detail
-
SparkDataFile
public SparkDataFile(Types.StructType type, org.apache.spark.sql.types.StructType sparkType)
-
-
Method Detail
-
wrap
public SparkDataFile wrap(org.apache.spark.sql.Row row)
-
pos
public java.lang.Long pos()
Description copied from interface: ContentFile
Returns the ordinal position of the file in a manifest, or null if it was not read from a manifest.
- Specified by:
pos in interface ContentFile<DataFile>
-
specId
public int specId()
Description copied from interface: ContentFile
Returns id of the partition spec used for partition metadata.
- Specified by:
specId in interface ContentFile<DataFile>
-
path
public java.lang.CharSequence path()
Description copied from interface: ContentFile
Returns fully qualified path to the file, suitable for constructing a Hadoop Path.
- Specified by:
path in interface ContentFile<DataFile>
-
format
public FileFormat format()
Description copied from interface: ContentFile
Returns format of the file.
- Specified by:
format in interface ContentFile<DataFile>
-
partition
public StructLike partition()
Description copied from interface: ContentFile
Returns partition for this file as a StructLike.
- Specified by:
partition in interface ContentFile<DataFile>
-
recordCount
public long recordCount()
Description copied from interface: ContentFile
Returns the number of top-level records in the file.
- Specified by:
recordCount in interface ContentFile<DataFile>
-
fileSizeInBytes
public long fileSizeInBytes()
Description copied from interface: ContentFile
Returns the file size in bytes.
- Specified by:
fileSizeInBytes in interface ContentFile<DataFile>
-
columnSizes
public java.util.Map<java.lang.Integer,java.lang.Long> columnSizes()
Description copied from interface: ContentFile
Returns if collected, map from column ID to the size of the column in bytes, null otherwise.
- Specified by:
columnSizes in interface ContentFile<DataFile>
-
valueCounts
public java.util.Map<java.lang.Integer,java.lang.Long> valueCounts()
Description copied from interface: ContentFile
Returns if collected, map from column ID to the count of its non-null values, null otherwise.
- Specified by:
valueCounts in interface ContentFile<DataFile>
-
nullValueCounts
public java.util.Map<java.lang.Integer,java.lang.Long> nullValueCounts()
Description copied from interface: ContentFile
Returns if collected, map from column ID to its null value count, null otherwise.
- Specified by:
nullValueCounts in interface ContentFile<DataFile>
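Because the statistics maps may be null when nothing was collected, callers generally need to distinguish "count is zero" from "count is unknown". The following is a minimal sketch of that distinction using only a plain Map<Integer, Long> of the same shape as the return value; the class StatsDemo and the method hasNoNulls are hypothetical names introduced for illustration and are not part of Iceberg.

```java
import java.util.HashMap;
import java.util.Map;

final class StatsDemo {
    // Returns true only when statistics were collected for the column and
    // report zero nulls. A missing map or a missing entry means "unknown",
    // so the method conservatively returns false in those cases.
    static boolean hasNoNulls(Map<Integer, Long> nullValueCounts, int columnId) {
        if (nullValueCounts == null) {
            return false; // stats were not collected at all
        }
        Long count = nullValueCounts.get(columnId);
        return count != null && count == 0L;
    }

    public static void main(String[] args) {
        Map<Integer, Long> counts = new HashMap<>();
        counts.put(1, 0L); // column 1: no nulls
        counts.put(2, 7L); // column 2: seven nulls

        System.out.println(hasNoNulls(counts, 1)); // true
        System.out.println(hasNoNulls(counts, 2)); // false
        System.out.println(hasNoNulls(counts, 3)); // false (no entry)
        System.out.println(hasNoNulls(null, 1));   // false (no stats)
    }
}
```

Treating "unknown" as false keeps any decision based on this check safe when a file was written without statistics.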
-
nanValueCounts
public java.util.Map<java.lang.Integer,java.lang.Long> nanValueCounts()
Description copied from interface: ContentFile
Returns if collected, map from column ID to its NaN value count, null otherwise.
- Specified by:
nanValueCounts in interface ContentFile<DataFile>
-
lowerBounds
public java.util.Map<java.lang.Integer,java.nio.ByteBuffer> lowerBounds()
Description copied from interface: ContentFile
Returns if collected, map from column ID to value lower bounds, null otherwise.
- Specified by:
lowerBounds in interface ContentFile<DataFile>
-
upperBounds
public java.util.Map<java.lang.Integer,java.nio.ByteBuffer> upperBounds()
Description copied from interface: ContentFile
Returns if collected, map from column ID to value upper bounds, null otherwise.
- Specified by:
upperBounds in interface ContentFile<DataFile>
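The bounds maps hold each value as a java.nio.ByteBuffer, so a caller has to decode the bytes before comparing them. The sketch below shows one way to do that for a long column and to test whether a value could fall within the file's range. It assumes the bound is stored as an 8-byte little-endian long; this page does not state the encoding, so treat that as an assumption to verify against the Iceberg specification. The class BoundsDemo and its methods are hypothetical helpers, not Iceberg API.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.HashMap;
import java.util.Map;

final class BoundsDemo {
    // Assumption: long bounds are serialized as 8 bytes, little-endian.
    static ByteBuffer encodeLong(long value) {
        ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(0, value); // absolute write; position stays at 0
        return buf;
    }

    // Returns the decoded bound, or null when the map or the entry is absent.
    static Long decodeLong(Map<Integer, ByteBuffer> bounds, int columnId) {
        if (bounds == null) {
            return null;
        }
        ByteBuffer buf = bounds.get(columnId);
        if (buf == null) {
            return null;
        }
        // Read through a duplicate so the shared buffer's state is untouched.
        return buf.duplicate().order(ByteOrder.LITTLE_ENDIAN).getLong(buf.position());
    }

    // True when the value may be present; missing bounds never rule a file out.
    static boolean mightContain(Map<Integer, ByteBuffer> lower,
                                Map<Integer, ByteBuffer> upper,
                                int columnId, long value) {
        Long lo = decodeLong(lower, columnId);
        Long hi = decodeLong(upper, columnId);
        if (lo == null || hi == null) {
            return true;
        }
        return value >= lo && value <= hi;
    }

    public static void main(String[] args) {
        Map<Integer, ByteBuffer> lower = new HashMap<>();
        Map<Integer, ByteBuffer> upper = new HashMap<>();
        lower.put(1, encodeLong(10L));
        upper.put(1, encodeLong(20L));

        System.out.println(mightContain(lower, upper, 1, 15L)); // true
        System.out.println(mightContain(lower, upper, 1, 25L)); // false
        System.out.println(mightContain(null, null, 1, 25L));   // true
    }
}
```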
-
keyMetadata
public java.nio.ByteBuffer keyMetadata()
Description copied from interface: ContentFile
Returns metadata about how this file is encrypted, or null if the file is stored in plain text.
- Specified by:
keyMetadata in interface ContentFile<DataFile>
-
copy
public DataFile copy()
Description copied from interface: ContentFile
Copies this file. Manifest readers can reuse file instances; use this method to copy data when collecting files from tasks.
- Specified by:
copy in interface ContentFile<DataFile>
- Returns:
- a copy of this data file
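The note about reused instances is easiest to see with a small model. The sketch below uses a hypothetical ReusableFile class, not the Iceberg class, whose wrap method overwrites its state and hands back the same object. It assumes that a reused wrapper behaves this way, which the description above implies but this page does not spell out for SparkDataFile.wrap.

```java
import java.util.ArrayList;
import java.util.List;

final class CopyDemo {
    // Hypothetical stand-in for a reusable file wrapper.
    static final class ReusableFile {
        private String path; // overwritten on every wrap()

        ReusableFile wrap(String newPath) {
            this.path = newPath;
            return this; // the same instance is returned each time
        }

        ReusableFile copy() {
            return new ReusableFile().wrap(path);
        }

        String path() {
            return path;
        }
    }

    // Collects wrappers without copying: every element aliases one object.
    static List<String> collectWithoutCopy(List<String> rows) {
        ReusableFile wrapper = new ReusableFile();
        List<ReusableFile> files = new ArrayList<>();
        for (String row : rows) {
            files.add(wrapper.wrap(row));
        }
        return paths(files);
    }

    // Collects an independent copy per row, so earlier entries stay intact.
    static List<String> collectWithCopy(List<String> rows) {
        ReusableFile wrapper = new ReusableFile();
        List<ReusableFile> files = new ArrayList<>();
        for (String row : rows) {
            files.add(wrapper.wrap(row).copy());
        }
        return paths(files);
    }

    private static List<String> paths(List<ReusableFile> files) {
        List<String> out = new ArrayList<>();
        for (ReusableFile file : files) {
            out.add(file.path());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = List.of("a.parquet", "b.parquet");
        System.out.println(collectWithoutCopy(rows)); // [b.parquet, b.parquet]
        System.out.println(collectWithCopy(rows));    // [a.parquet, b.parquet]
    }
}
```

Without the copy, both list entries report the last wrapped row, which is the failure mode the description warns about.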
-
copyWithoutStats
public DataFile copyWithoutStats()
Description copied from interface: ContentFile
Copies this file without file stats. Manifest readers can reuse file instances; use this method to copy data without stats when collecting files.
- Specified by:
copyWithoutStats in interface ContentFile<DataFile>
- Returns:
- a copy of this data file, without lower bounds, upper bounds, value counts, null value counts, or NaN value counts
-
splitOffsets
public java.util.List<java.lang.Long> splitOffsets()
Description copied from interface: ContentFile
Returns list of recommended split locations, if applicable, null otherwise.
When available, this information is used for planning scan tasks whose boundaries are determined by these offsets. The returned list must be sorted in ascending order.
- Specified by:
splitOffsets in interface ContentFile<DataFile>
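As a rough illustration of how sorted offsets can determine task boundaries, the sketch below turns a list of offsets and a file size into (start, length) ranges, falling back to a single whole-file range when the offsets are null. It is not Iceberg's planner. The class SplitRanges and its methods are hypothetical, and the sketch assumes "ascending" means strictly increasing so that no range is empty.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitRanges {
    // Assumption: offsets must be strictly increasing.
    static boolean isAscending(List<Long> offsets) {
        for (int i = 1; i < offsets.size(); i++) {
            if (offsets.get(i - 1) >= offsets.get(i)) {
                return false;
            }
        }
        return true;
    }

    // Returns {start, length} pairs. Each range ends where the next offset
    // begins; the last range ends at the end of the file.
    static List<long[]> ranges(List<Long> offsets, long fileSizeInBytes) {
        List<long[]> out = new ArrayList<>();
        if (offsets == null || offsets.isEmpty()) {
            out.add(new long[] {0L, fileSizeInBytes}); // no hints: one range
            return out;
        }
        if (!isAscending(offsets)) {
            throw new IllegalArgumentException("split offsets are not ascending");
        }
        for (int i = 0; i < offsets.size(); i++) {
            long start = offsets.get(i);
            long end = (i + 1 < offsets.size()) ? offsets.get(i + 1) : fileSizeInBytes;
            out.add(new long[] {start, end - start});
        }
        return out;
    }

    public static void main(String[] args) {
        for (long[] range : ranges(List.of(4L, 100L, 250L), 400L)) {
            System.out.println("start=" + range[0] + " length=" + range[1]);
        }
        // start=4 length=96
        // start=100 length=150
        // start=250 length=150
    }
}
```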
-
sortOrderId
public java.lang.Integer sortOrderId()
Description copied from interface: ContentFile
Returns the sort order id of this file, which describes how the file is ordered. This information will be useful for merging data and equality delete files more efficiently when they share the same sort order id.
- Specified by:
sortOrderId in interface ContentFile<DataFile>
-
-