Package org.apache.iceberg.spark
Class SparkDataFile
- java.lang.Object
  - org.apache.iceberg.spark.SparkDataFile
- All Implemented Interfaces:
  ContentFile<DataFile>, DataFile
public class SparkDataFile extends java.lang.Object implements DataFile
-
-
Field Summary
-
Fields inherited from interface org.apache.iceberg.DataFile
COLUMN_SIZES, CONTENT, FILE_FORMAT, FILE_PATH, FILE_SIZE, KEY_METADATA, LOWER_BOUNDS, NULL_VALUE_COUNTS, PARTITION_DOC, PARTITION_ID, PARTITION_NAME, RECORD_COUNT, SPLIT_OFFSETS, UPPER_BOUNDS, VALUE_COUNTS
-
-
Constructor Summary
Constructor, Description
SparkDataFile(Types.StructType type, org.apache.spark.sql.types.StructType sparkType)
-
Method Summary
Modifier and Type, Method, Description
java.util.Map<java.lang.Integer,java.lang.Long>
columnSizes()
DataFile
copy()
Copies this file.
DataFile
copyWithoutStats()
Copies this file without file stats.
long
fileSizeInBytes()
FileFormat
format()
java.nio.ByteBuffer
keyMetadata()
java.util.Map<java.lang.Integer,java.nio.ByteBuffer>
lowerBounds()
java.util.Map<java.lang.Integer,java.lang.Long>
nullValueCounts()
StructLike
partition()
java.lang.CharSequence
path()
long
recordCount()
java.util.List<java.lang.Long>
splitOffsets()
java.util.Map<java.lang.Integer,java.nio.ByteBuffer>
upperBounds()
java.util.Map<java.lang.Integer,java.lang.Long>
valueCounts()
SparkDataFile
wrap(org.apache.spark.sql.Row row)
-
-
-
Constructor Detail
-
SparkDataFile
public SparkDataFile(Types.StructType type, org.apache.spark.sql.types.StructType sparkType)
-
-
Method Detail
-
wrap
public SparkDataFile wrap(org.apache.spark.sql.Row row)
-
path
public java.lang.CharSequence path()
- Specified by: path in interface ContentFile<DataFile>
- Returns:
- fully qualified path to the file, suitable for constructing a Hadoop Path
-
format
public FileFormat format()
- Specified by: format in interface ContentFile<DataFile>
- Returns:
- format of the file
-
partition
public StructLike partition()
- Specified by: partition in interface ContentFile<DataFile>
- Returns:
- partition for this file as a StructLike
-
recordCount
public long recordCount()
- Specified by: recordCount in interface ContentFile<DataFile>
- Returns:
- the number of top-level records in the file
-
fileSizeInBytes
public long fileSizeInBytes()
- Specified by: fileSizeInBytes in interface ContentFile<DataFile>
- Returns:
- the file size in bytes
-
columnSizes
public java.util.Map<java.lang.Integer,java.lang.Long> columnSizes()
- Specified by: columnSizes in interface ContentFile<DataFile>
- Returns:
- if collected, map from column ID to the size of the column in bytes, null otherwise
-
valueCounts
public java.util.Map<java.lang.Integer,java.lang.Long> valueCounts()
- Specified by: valueCounts in interface ContentFile<DataFile>
- Returns:
- if collected, map from column ID to the count of its non-null values, null otherwise
-
nullValueCounts
public java.util.Map<java.lang.Integer,java.lang.Long> nullValueCounts()
- Specified by: nullValueCounts in interface ContentFile<DataFile>
- Returns:
- if collected, map from column ID to its null value count, null otherwise
-
lowerBounds
public java.util.Map<java.lang.Integer,java.nio.ByteBuffer> lowerBounds()
- Specified by: lowerBounds in interface ContentFile<DataFile>
- Returns:
- if collected, map from column ID to value lower bounds, null otherwise
-
upperBounds
public java.util.Map<java.lang.Integer,java.nio.ByteBuffer> upperBounds()
- Specified by: upperBounds in interface ContentFile<DataFile>
- Returns:
- if collected, map from column ID to value upper bounds, null otherwise
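The bounds maps returned by lowerBounds() and upperBounds() hold one serialized value per column ID. The sketch below illustrates how such a map might be used to decide whether a file can be skipped for a predicate like "column > value". It is not part of this class: BoundsSketch, encodeLong, decodeLong, and mightContainGreaterThan are hypothetical names, and the code assumes a long column whose bound is stored as 8 little-endian bytes. Production code would normally rely on Iceberg's own conversion utilities instead of hand-decoding buffers.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.HashMap;
import java.util.Map;

class BoundsSketch {
    // Assumption: a long bound is serialized as 8 little-endian bytes.
    static ByteBuffer encodeLong(long value) {
        ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
        buf.putLong(0, value); // absolute put, position stays at 0
        return buf;
    }

    // Reads the bound without moving the caller's buffer position.
    static long decodeLong(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        return view.getLong(view.position());
    }

    // Returns false only when the upper bound proves no row can satisfy
    // "column > value". Missing stats (null map or null entry) mean the
    // file cannot be ruled out, so the method returns true.
    static boolean mightContainGreaterThan(
            Map<Integer, ByteBuffer> upperBounds, int columnId, long value) {
        if (upperBounds == null) {
            return true;
        }
        ByteBuffer upper = upperBounds.get(columnId);
        if (upper == null) {
            return true;
        }
        return decodeLong(upper) > value;
    }

    public static void main(String[] args) {
        Map<Integer, ByteBuffer> upperBounds = new HashMap<>();
        upperBounds.put(1, encodeLong(100L)); // column 1 has max value 100

        System.out.println(mightContainGreaterThan(upperBounds, 1, 50L));  // true
        System.out.println(mightContainGreaterThan(upperBounds, 1, 100L)); // false
        System.out.println(mightContainGreaterThan(null, 1, 100L));        // true
    }
}
```

The null checks mirror the "null otherwise" contract above: when bounds were not collected, a caller has to treat the file as a possible match.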
-
keyMetadata
public java.nio.ByteBuffer keyMetadata()
- Specified by: keyMetadata in interface ContentFile<DataFile>
- Returns:
- metadata about how this file is encrypted, or null if the file is stored in plain text.
-
copy
public DataFile copy()
Description copied from interface: ContentFile
Copies this file. Manifest readers can reuse file instances; use this method to copy data when collecting files from tasks.
- Specified by: copy in interface ContentFile<DataFile>
- Returns:
- a copy of this data file
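The note about reused instances is easier to see with a small model. The sketch below does not use Iceberg or Spark classes: ReusableFile is a hypothetical stand-in for any wrapper that is re-pointed at new data and handed back as the same object, and it only illustrates why the interface contract asks callers to copy before collecting. Whether a particular implementation reuses instances in this way is an assumption here and should be checked against its source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReusedInstanceSketch {
    // Hypothetical stand-in for a reusable file wrapper; not an Iceberg class.
    static final class ReusableFile {
        private String path;

        // Points this instance at a new value and returns the same instance.
        ReusableFile wrap(String newPath) {
            this.path = newPath;
            return this;
        }

        String path() {
            return path;
        }

        // Returns an independent snapshot of the current state.
        ReusableFile copy() {
            return new ReusableFile().wrap(path);
        }
    }

    // Collects the reused instance directly: every entry aliases one object.
    static List<String> collectWithoutCopy(List<String> paths) {
        ReusableFile reused = new ReusableFile();
        List<ReusableFile> collected = new ArrayList<>();
        for (String p : paths) {
            collected.add(reused.wrap(p));
        }
        return pathsOf(collected);
    }

    // Collects a copy per element: every entry keeps its own state.
    static List<String> collectWithCopy(List<String> paths) {
        ReusableFile reused = new ReusableFile();
        List<ReusableFile> collected = new ArrayList<>();
        for (String p : paths) {
            collected.add(reused.wrap(p).copy());
        }
        return pathsOf(collected);
    }

    private static List<String> pathsOf(List<ReusableFile> files) {
        List<String> out = new ArrayList<>();
        for (ReusableFile f : files) {
            out.add(f.path());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> paths = Arrays.asList("a.parquet", "b.parquet");
        System.out.println(collectWithoutCopy(paths)); // [b.parquet, b.parquet]
        System.out.println(collectWithCopy(paths));    // [a.parquet, b.parquet]
    }
}
```

Without the copy, both collected entries report the last wrapped path, because they are the same object.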
-
copyWithoutStats
public DataFile copyWithoutStats()
Description copied from interface: ContentFile
Copies this file without file stats. Manifest readers can reuse file instances; use this method to copy data without stats when collecting files.
- Specified by: copyWithoutStats in interface ContentFile<DataFile>
- Returns:
- a copy of this data file, without lower bounds, upper bounds, value counts, or null value counts
-
splitOffsets
public java.util.List<java.lang.Long> splitOffsets()
- Specified by: splitOffsets in interface ContentFile<DataFile>
- Returns:
- List of recommended split locations, if applicable, null otherwise. When available, this information is used for planning scan tasks whose boundaries are determined by these offsets. The returned list must be sorted in ascending order.
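As a rough illustration of how sorted split offsets can determine task boundaries, the sketch below turns an offset list and a file size into (start, length) ranges. It is not Iceberg's planner: SplitOffsetsSketch, isAscending, and ranges are hypothetical names, and falling back to a single whole-file range when the list is null or empty is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitOffsetsSketch {
    // True when no offset is smaller than the one before it.
    static boolean isAscending(List<Long> offsets) {
        for (int i = 1; i < offsets.size(); i++) {
            if (offsets.get(i) < offsets.get(i - 1)) {
                return false;
            }
        }
        return true;
    }

    // Returns {start, length} pairs. Each range begins at a split offset and
    // ends at the next offset, or at the end of the file for the last one.
    // Assumption: with no offsets, the whole file is treated as one range.
    static List<long[]> ranges(List<Long> offsets, long fileSize) {
        List<long[]> out = new ArrayList<>();
        if (offsets == null || offsets.isEmpty()) {
            out.add(new long[] {0L, fileSize});
            return out;
        }
        if (!isAscending(offsets)) {
            throw new IllegalArgumentException(
                "split offsets must be sorted in ascending order");
        }
        for (int i = 0; i < offsets.size(); i++) {
            long start = offsets.get(i);
            long end = (i + 1 < offsets.size()) ? offsets.get(i + 1) : fileSize;
            out.add(new long[] {start, end - start});
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> offsets = Arrays.asList(4L, 1000L, 2000L);
        for (long[] r : ranges(offsets, 2500L)) {
            System.out.println("start=" + r[0] + " length=" + r[1]);
        }
        // start=4 length=996
        // start=1000 length=1000
        // start=2000 length=500
    }
}
```

The ascending-order check reflects the requirement stated above; an unsorted list would otherwise produce negative lengths.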
-
-