SparkDataFile

java.lang.Object
- org.apache.iceberg.spark.SparkDataFile

All Implemented Interfaces: ContentFile<DataFile>, DataFile

public class SparkDataFile
extends java.lang.Object
implements DataFile

Field Summary
- Fields inherited from interface org.apache.iceberg.DataFile
  COLUMN_SIZES, CONTENT, EQUALITY_IDS, FILE_FORMAT, FILE_PATH, FILE_SIZE, KEY_METADATA, LOWER_BOUNDS, NAN_VALUE_COUNTS, NULL_VALUE_COUNTS, PARTITION_DOC, PARTITION_ID, PARTITION_NAME, RECORD_COUNT, SORT_ORDER_ID, SPEC_ID, SPLIT_OFFSETS, UPPER_BOUNDS, VALUE_COUNTS

Constructor Summary

Constructors
Constructor and Description
`SparkDataFile(Types.StructType type, org.apache.spark.sql.types.StructType sparkType)`
`SparkDataFile(Types.StructType type, Types.StructType projectedType, org.apache.spark.sql.types.StructType sparkType)`

Method Summary

Modifier and Type	Method and Description
`java.util.Map<java.lang.Integer,java.lang.Long>`	`columnSizes()` Returns a map from column ID to the size of the column in bytes if collected, null otherwise.
`DataFile`	`copy()` Copies this file.
`DataFile`	`copyWithoutStats()` Copies this file without file stats.
`long`	`fileSizeInBytes()` Returns the file size in bytes.
`FileFormat`	`format()` Returns format of the file.
`java.nio.ByteBuffer`	`keyMetadata()` Returns metadata about how this file is encrypted, or null if the file is stored in plain text.
`java.util.Map<java.lang.Integer,java.nio.ByteBuffer>`	`lowerBounds()` Returns a map from column ID to value lower bounds if collected, null otherwise.
`java.util.Map<java.lang.Integer,java.lang.Long>`	`nanValueCounts()` Returns a map from column ID to its NaN value count if collected, null otherwise.
`java.util.Map<java.lang.Integer,java.lang.Long>`	`nullValueCounts()` Returns a map from column ID to its null value count if collected, null otherwise.
`StructLike`	`partition()` Returns partition for this file as a `StructLike`.
`java.lang.CharSequence`	`path()` Returns fully qualified path to the file, suitable for constructing a Hadoop Path.
`java.lang.Long`	`pos()` Returns the ordinal position of the file in a manifest, or null if it was not read from a manifest.
`long`	`recordCount()` Returns the number of top-level records in the file.
`java.lang.Integer`	`sortOrderId()` Returns the sort order id of this file, which describes how the file is ordered.
`int`	`specId()` Returns id of the partition spec used for partition metadata.
`java.util.List<java.lang.Long>`	`splitOffsets()` Returns a list of recommended split locations if applicable, null otherwise.
`java.util.Map<java.lang.Integer,java.nio.ByteBuffer>`	`upperBounds()` Returns a map from column ID to value upper bounds if collected, null otherwise.
`java.util.Map<java.lang.Integer,java.lang.Long>`	`valueCounts()` Returns a map from column ID to the count of its non-null values if collected, null otherwise.
`SparkDataFile`	`wrap(org.apache.spark.sql.Row row)`

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.iceberg.DataFile
content, equalityFieldIds, getType

Methods inherited from interface org.apache.iceberg.ContentFile
copy

- Constructor Detail
  - SparkDataFile
```
public SparkDataFile(Types.StructType type,
                     org.apache.spark.sql.types.StructType sparkType)
```
  - SparkDataFile
```
public SparkDataFile(Types.StructType type,
                     Types.StructType projectedType,
                     org.apache.spark.sql.types.StructType sparkType)
```
- Method Detail
  - wrap
```
public SparkDataFile wrap(org.apache.spark.sql.Row row)
```
  - pos
```
public java.lang.Long pos()
```
    Description copied from interface: ContentFile
    
    Returns the ordinal position of the file in a manifest, or null if it was not read from a manifest.
    
    Specified by:
    
    pos in interface ContentFile<DataFile>
  - specId
```
public int specId()
```
    Description copied from interface: ContentFile
    
    Returns id of the partition spec used for partition metadata.
    
    Specified by:
    
    specId in interface ContentFile<DataFile>
  - path
```
public java.lang.CharSequence path()
```
    Description copied from interface: ContentFile
    
    Returns fully qualified path to the file, suitable for constructing a Hadoop Path.
    
    Specified by:
    
    path in interface ContentFile<DataFile>
  - format
```
public FileFormat format()
```
    Description copied from interface: ContentFile
    
    Returns format of the file.
    
    Specified by:
    
    format in interface ContentFile<DataFile>
  - partition
```
public StructLike partition()
```
    Description copied from interface: ContentFile
    
    Returns partition for this file as a StructLike.
    
    Specified by:
    
    partition in interface ContentFile<DataFile>
  - recordCount
```
public long recordCount()
```
    Description copied from interface: ContentFile
    
    Returns the number of top-level records in the file.
    
    Specified by:
    
    recordCount in interface ContentFile<DataFile>
  - fileSizeInBytes
```
public long fileSizeInBytes()
```
    Description copied from interface: ContentFile
    
    Returns the file size in bytes.
    
    Specified by:
    
    fileSizeInBytes in interface ContentFile<DataFile>
  - columnSizes
```
public java.util.Map<java.lang.Integer,java.lang.Long> columnSizes()
```
    Description copied from interface: ContentFile
    
    Returns a map from column ID to the size of the column in bytes if collected, null otherwise.
    
    Specified by:
    
    columnSizes in interface ContentFile<DataFile>
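    
    Because the map is null when sizes were not collected, callers may want to keep "not collected" distinct from "zero bytes". The following is a minimal sketch of that idea; the class `ColumnSizesSketch` and the method `totalColumnBytes` are hypothetical names for illustration, not part of Iceberg.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical helper for illustration; not part of the Iceberg API.
class ColumnSizesSketch {

  // Sums per-column byte sizes. Returns an empty OptionalLong when the
  // statistics map is null, so "not collected" is never reported as 0.
  static OptionalLong totalColumnBytes(Map<Integer, Long> columnSizes) {
    if (columnSizes == null) {
      return OptionalLong.empty();
    }
    long total = 0L;
    for (Long size : columnSizes.values()) {
      if (size != null) {
        total += size;
      }
    }
    return OptionalLong.of(total);
  }

  public static void main(String[] args) {
    Map<Integer, Long> sizes = new HashMap<>();
    sizes.put(1, 120L);
    sizes.put(2, 80L);
    System.out.println(totalColumnBytes(sizes)); // OptionalLong[200]
    System.out.println(totalColumnBytes(null));  // OptionalLong.empty
  }
}
```

    With a real file, the argument would be the result of `columnSizes()`.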
  - valueCounts
```
public java.util.Map<java.lang.Integer,java.lang.Long> valueCounts()
```
    Description copied from interface: ContentFile
    
    Returns a map from column ID to the count of its non-null values if collected, null otherwise.
    
    Specified by:
    
    valueCounts in interface ContentFile<DataFile>
  - nullValueCounts
```
public java.util.Map<java.lang.Integer,java.lang.Long> nullValueCounts()
```
    Description copied from interface: ContentFile
    
    Returns a map from column ID to its null value count if collected, null otherwise.
    
    Specified by:
    
    nullValueCounts in interface ContentFile<DataFile>
  - nanValueCounts
```
public java.util.Map<java.lang.Integer,java.lang.Long> nanValueCounts()
```
    Description copied from interface: ContentFile
    
    Returns a map from column ID to its NaN value count if collected, null otherwise.
    
    Specified by:
    
    nanValueCounts in interface ContentFile<DataFile>
  - lowerBounds
```
public java.util.Map<java.lang.Integer,java.nio.ByteBuffer> lowerBounds()
```
    Description copied from interface: ContentFile
    
    Returns a map from column ID to value lower bounds if collected, null otherwise.
    
    Specified by:
    
    lowerBounds in interface ContentFile<DataFile>
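    
    The bounds are returned as raw `ByteBuffer` values, so they must be decoded according to the column type. The sketch below assumes a `long` column stored as 8 little-endian bytes, which is how the Iceberg table spec describes single-value serialization for `long`; other types need different decoding. The class `BoundsDecodeSketch` and its methods are hypothetical. In real code, the `Conversions.fromByteBuffer` helper in `org.apache.iceberg.types` is likely the better choice.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the Iceberg API.
class BoundsDecodeSketch {

  // Builds a bound the way this sketch assumes it is stored:
  // a long written as 8 little-endian bytes.
  static ByteBuffer encodeLong(long value) {
    ByteBuffer buf = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
    buf.putLong(0, value);
    return buf;
  }

  // Returns the decoded bound for a column, or null when bounds were
  // not collected at all or are missing for this column.
  static Long decodeLongBound(Map<Integer, ByteBuffer> bounds, int columnId) {
    if (bounds == null) {
      return null;
    }
    ByteBuffer raw = bounds.get(columnId);
    if (raw == null) {
      return null;
    }
    // duplicate() leaves the shared buffer's position and byte order untouched.
    return raw.duplicate().order(ByteOrder.LITTLE_ENDIAN).getLong(raw.position());
  }

  public static void main(String[] args) {
    Map<Integer, ByteBuffer> lower = new HashMap<>();
    lower.put(1, encodeLong(42L));
    System.out.println(decodeLongBound(lower, 1)); // 42
    System.out.println(decodeLongBound(lower, 2)); // null
  }
}
```

    The same decoding applies to the map returned by `upperBounds()`.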
  - upperBounds
```
public java.util.Map<java.lang.Integer,java.nio.ByteBuffer> upperBounds()
```
    Description copied from interface: ContentFile
    
    Returns a map from column ID to value upper bounds if collected, null otherwise.
    
    Specified by:
    
    upperBounds in interface ContentFile<DataFile>
  - keyMetadata
```
public java.nio.ByteBuffer keyMetadata()
```
    Description copied from interface: ContentFile
    
    Returns metadata about how this file is encrypted, or null if the file is stored in plain text.
    
    Specified by:
    
    keyMetadata in interface ContentFile<DataFile>
  - copy
```
public DataFile copy()
```
    Description copied from interface: ContentFile
    
    Copies this file. Manifest readers can reuse file instances; use this method to copy data when collecting files from tasks.
    
    Specified by:
    
    copy in interface ContentFile<DataFile>
    
    Returns:
    
    a copy of this data file
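    
    The note about reused instances can be illustrated with a self-contained sketch. `ReusableFileSketch` is a hypothetical stand-in, not the Iceberg class: it assumes a `wrap`-style method that re-points one instance at new data and returns that same instance, which this page does not state for `SparkDataFile`. Under that assumption, collecting without copying leaves every list element aliasing the last wrapped value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a reusable file wrapper; not the Iceberg class.
class ReusableFileSketch {
  private String path;

  // Re-points this instance at new data and returns the same instance.
  ReusableFileSketch wrap(String newPath) {
    this.path = newPath;
    return this;
  }

  String path() {
    return path;
  }

  // Returns an independent snapshot of the current state.
  ReusableFileSketch copy() {
    return new ReusableFileSketch().wrap(path);
  }

  // Adds the reused instance itself: all elements alias one object.
  static List<String> collectWithoutCopy(List<String> paths) {
    ReusableFileSketch reused = new ReusableFileSketch();
    List<ReusableFileSketch> collected = new ArrayList<>();
    for (String p : paths) {
      collected.add(reused.wrap(p));
    }
    return pathsOf(collected);
  }

  // Adds a copy per element: each element keeps its own state.
  static List<String> collectWithCopy(List<String> paths) {
    ReusableFileSketch reused = new ReusableFileSketch();
    List<ReusableFileSketch> collected = new ArrayList<>();
    for (String p : paths) {
      collected.add(reused.wrap(p).copy());
    }
    return pathsOf(collected);
  }

  private static List<String> pathsOf(List<ReusableFileSketch> files) {
    List<String> result = new ArrayList<>();
    for (ReusableFileSketch f : files) {
      result.add(f.path());
    }
    return result;
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList("a.parquet", "b.parquet");
    System.out.println(collectWithoutCopy(input)); // [b.parquet, b.parquet]
    System.out.println(collectWithCopy(input));    // [a.parquet, b.parquet]
  }
}
```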
  - copyWithoutStats
```
public DataFile copyWithoutStats()
```
    Description copied from interface: ContentFile
    
    Copies this file without file stats. Manifest readers can reuse file instances; use this method to copy data without stats when collecting files.
    
    Specified by:
    
    copyWithoutStats in interface ContentFile<DataFile>
    
    Returns:
    
    a copy of this data file, without lower bounds, upper bounds, value counts, null value counts, or NaN value counts
  - splitOffsets
```
public java.util.List<java.lang.Long> splitOffsets()
```
    Description copied from interface: ContentFile
    
    Returns a list of recommended split locations if applicable, null otherwise.
    When available, this information is used for planning scan tasks whose boundaries are determined by these offsets. The returned list must be sorted in ascending order.
    
    Specified by:
    
    splitOffsets in interface ContentFile<DataFile>
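    
    As a rough illustration of how ascending offsets can define task boundaries, the sketch below turns a list of offsets into (start, length) ranges, with each range ending at the next offset or at the end of the file. This is not Iceberg's scan planner: `SplitOffsetsSketch`, `isAscending`, and `toRanges` are hypothetical names, and falling back to a single whole-file range for missing or unsorted offsets is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not Iceberg's scan planning code.
class SplitOffsetsSketch {

  // Returns true when each offset is greater than or equal to the previous one.
  static boolean isAscending(List<Long> offsets) {
    for (int i = 1; i < offsets.size(); i++) {
      if (offsets.get(i) < offsets.get(i - 1)) {
        return false;
      }
    }
    return true;
  }

  // Converts offsets into {start, length} pairs. Falls back to one range
  // covering the whole file when offsets are null, empty, or unsorted.
  static List<long[]> toRanges(List<Long> offsets, long fileSize) {
    List<long[]> ranges = new ArrayList<>();
    if (offsets == null || offsets.isEmpty() || !isAscending(offsets)) {
      ranges.add(new long[] {0L, fileSize});
      return ranges;
    }
    for (int i = 0; i < offsets.size(); i++) {
      long start = offsets.get(i);
      long end = (i + 1 < offsets.size()) ? offsets.get(i + 1) : fileSize;
      ranges.add(new long[] {start, end - start});
    }
    return ranges;
  }

  public static void main(String[] args) {
    for (long[] r : toRanges(Arrays.asList(4L, 100L, 200L), 300L)) {
      System.out.println("start=" + r[0] + " length=" + r[1]);
    }
    // start=4 length=96
    // start=100 length=100
    // start=200 length=100
  }
}
```

    With a real file, the inputs would be the results of `splitOffsets()` and `fileSizeInBytes()`.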
  - sortOrderId
```
public java.lang.Integer sortOrderId()
```
    Description copied from interface: ContentFile
    
    Returns the sort order id of this file, which describes how the file is ordered. This information will be useful for merging data and equality delete files more efficiently when they share the same sort order id.
    
    Specified by:
    
    sortOrderId in interface ContentFile<DataFile>
