Class SparkDataFile

    • Constructor Detail

      • SparkDataFile

        public SparkDataFile(Types.StructType type,
                             org.apache.spark.sql.types.StructType sparkType)
    • Method Detail

      • wrap

        public SparkDataFile wrap(org.apache.spark.sql.Row row)
      • path

        public java.lang.CharSequence path()
        Specified by:
        path in interface ContentFile<DataFile>
        Returns:
        fully qualified path to the file, suitable for constructing a Hadoop Path
      • recordCount

        public long recordCount()
        Specified by:
        recordCount in interface ContentFile<DataFile>
        Returns:
        the number of top-level records in the file
      • columnSizes

        public java.util.Map<java.lang.Integer, java.lang.Long> columnSizes()
        Specified by:
        columnSizes in interface ContentFile<DataFile>
        Returns:
        if collected, map from column ID to the size of the column in bytes, null otherwise
      • valueCounts

        public java.util.Map<java.lang.Integer, java.lang.Long> valueCounts()
        Specified by:
        valueCounts in interface ContentFile<DataFile>
        Returns:
        if collected, map from column ID to the count of its non-null values, null otherwise
      • nullValueCounts

        public java.util.Map<java.lang.Integer, java.lang.Long> nullValueCounts()
        Specified by:
        nullValueCounts in interface ContentFile<DataFile>
        Returns:
        if collected, map from column ID to its null value count, null otherwise
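Because columnSizes, valueCounts, and nullValueCounts each return null when the statistic was not collected, callers need to guard both the map and the individual entry. The following is a minimal sketch of such a guard using only the JDK; the class name StatLookup and the method stat are hypothetical helpers, not part of the library.

```java
import java.util.Map;
import java.util.OptionalLong;

final class StatLookup {
  private StatLookup() {}

  // Returns the statistic for fieldId, or empty when the map was not
  // collected (null) or has no entry for that column ID.
  static OptionalLong stat(Map<Integer, Long> stats, int fieldId) {
    if (stats == null) {
      return OptionalLong.empty();
    }
    Long value = stats.get(fieldId);
    return value == null ? OptionalLong.empty() : OptionalLong.of(value);
  }

  public static void main(String[] args) {
    // Stand-in for the map a data file might return from nullValueCounts().
    Map<Integer, Long> nullCounts = Map.of(1, 0L, 2, 17L);
    System.out.println(stat(nullCounts, 2)); // entry present
    System.out.println(stat(nullCounts, 3)); // column has no entry
    System.out.println(stat(null, 1));       // statistic not collected
  }
}
```

Returning an OptionalLong keeps the "not collected" case distinct from a legitimate count of zero.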
      • lowerBounds

        public java.util.Map<java.lang.Integer, java.nio.ByteBuffer> lowerBounds()
        Specified by:
        lowerBounds in interface ContentFile<DataFile>
        Returns:
        if collected, map from column ID to value lower bounds, null otherwise
      • upperBounds

        public java.util.Map<java.lang.Integer, java.nio.ByteBuffer> upperBounds()
        Specified by:
        upperBounds in interface ContentFile<DataFile>
        Returns:
        if collected, map from column ID to value upper bounds, null otherwise
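The bounds are returned as raw ByteBuffer values, so a caller has to decode them according to the column type before comparing. The sketch below shows one way a file could be ruled out for an equality lookup. It assumes the column is a long whose bound is stored as 8 little-endian bytes; in real code the library's own conversion utilities would normally do the decoding. LongBounds, decodeLong, encodeLong, and mayContain are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Map;

final class LongBounds {
  private LongBounds() {}

  // Decodes a bound assumed to hold a long as 8 little-endian bytes.
  // Reading from a duplicate leaves the caller's buffer position untouched.
  static long decodeLong(ByteBuffer bound) {
    return bound.duplicate().order(ByteOrder.LITTLE_ENDIAN).getLong();
  }

  // Test helper: encodes a long the same way decodeLong expects it.
  static ByteBuffer encodeLong(long value) {
    ByteBuffer buffer = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
    buffer.putLong(0, value); // absolute put, position stays at 0
    return buffer;
  }

  // Returns false only when the bounds prove that no row can equal value.
  // Missing bounds (null map or absent entry) mean the file may still match.
  static boolean mayContain(
      Map<Integer, ByteBuffer> lowerBounds,
      Map<Integer, ByteBuffer> upperBounds,
      int fieldId,
      long value) {
    ByteBuffer lower = lowerBounds == null ? null : lowerBounds.get(fieldId);
    ByteBuffer upper = upperBounds == null ? null : upperBounds.get(fieldId);
    if (lower != null && value < decodeLong(lower)) {
      return false;
    }
    if (upper != null && value > decodeLong(upper)) {
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    Map<Integer, ByteBuffer> lower = Map.of(1, encodeLong(10L));
    Map<Integer, ByteBuffer> upper = Map.of(1, encodeLong(99L));
    System.out.println(mayContain(lower, upper, 1, 50L)); // within the bounds
    System.out.println(mayContain(lower, upper, 1, 5L));  // below the lower bound
    System.out.println(mayContain(null, null, 1, 5L));    // bounds not collected
  }
}
```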
      • keyMetadata

        public java.nio.ByteBuffer keyMetadata()
        Specified by:
        keyMetadata in interface ContentFile<DataFile>
        Returns:
        metadata about how this file is encrypted, or null if the file is stored in plain text.
      • copy

        public DataFile copy()
        Description copied from interface: ContentFile
        Copies this file. Manifest readers can reuse file instances; use this method to copy data when collecting files from tasks.
        Specified by:
        copy in interface ContentFile<DataFile>
        Returns:
        a copy of this data file
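The reason the interface asks for a copy can be seen in a small stand-alone sketch. It illustrates only the general ContentFile contract described above, under the assumption that a reader hands back the same mutable instance for every row; it does not show how SparkDataFile itself implements copy. ReuseDemo and ReusableFile are hypothetical stand-ins built from JDK types only.

```java
import java.util.ArrayList;
import java.util.List;

final class ReuseDemo {
  private ReuseDemo() {}

  // Hypothetical reusable wrapper: one instance that is re-pointed at each row.
  static final class ReusableFile {
    private String path;

    ReusableFile wrap(String newPath) {
      this.path = newPath;
      return this;
    }

    String path() {
      return path;
    }

    ReusableFile copy() {
      return new ReusableFile().wrap(path);
    }
  }

  // Collects one file per row, optionally copying before storing it.
  static List<String> collectPaths(List<String> rows, boolean copyEach) {
    ReusableFile reused = new ReusableFile();
    List<ReusableFile> collected = new ArrayList<>();
    for (String row : rows) {
      ReusableFile file = reused.wrap(row);
      collected.add(copyEach ? file.copy() : file);
    }
    List<String> paths = new ArrayList<>();
    for (ReusableFile file : collected) {
      paths.add(file.path());
    }
    return paths;
  }

  public static void main(String[] args) {
    List<String> rows = List.of("a.parquet", "b.parquet");
    // Without copying, every collected element is the same instance,
    // so all of them report the path of the last wrapped row.
    System.out.println(collectPaths(rows, false));
    // With copying, each element keeps the path it had when collected.
    System.out.println(collectPaths(rows, true));
  }
}
```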
      • copyWithoutStats

        public DataFile copyWithoutStats()
        Description copied from interface: ContentFile
        Copies this file without file stats. Manifest readers can reuse file instances; use this method to copy data without stats when collecting files.
        Specified by:
        copyWithoutStats in interface ContentFile<DataFile>
        Returns:
        a copy of this data file, without lower bounds, upper bounds, value counts, or null value counts
      • splitOffsets

        public java.util.List<java.lang.Long> splitOffsets()
        Specified by:
        splitOffsets in interface ContentFile<DataFile>
        Returns:
        list of recommended split locations if applicable, null otherwise. When available, scan planning uses these offsets as the boundaries of scan tasks. The returned list must be sorted in ascending order.
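As an illustration of how ascending split offsets can determine task boundaries, the sketch below turns a list of offsets and a file length into (start, length) ranges. It assumes each range runs from one offset to the next and the last range ends at the file length, which is one plausible reading of the description and not necessarily the planner's exact logic. SplitRanges and ranges are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitRanges {
  private SplitRanges() {}

  // Returns {start, length} pairs. With no offsets the whole file is one range.
  static List<long[]> ranges(List<Long> splitOffsets, long fileLength) {
    List<long[]> result = new ArrayList<>();
    if (splitOffsets == null || splitOffsets.isEmpty()) {
      result.add(new long[] {0L, fileLength});
      return result;
    }
    for (int i = 0; i < splitOffsets.size(); i++) {
      long start = splitOffsets.get(i);
      // The next offset closes this range; the last range ends at the file length.
      long end = i + 1 < splitOffsets.size() ? splitOffsets.get(i + 1) : fileLength;
      if (end < start) {
        throw new IllegalArgumentException(
            "split offsets must be ascending and no larger than the file length");
      }
      result.add(new long[] {start, end - start});
    }
    return result;
  }

  public static void main(String[] args) {
    // Offsets and file length are made-up values for illustration.
    for (long[] range : ranges(List.of(4L, 100L, 250L), 400L)) {
      System.out.println("start=" + range[0] + " length=" + range[1]);
    }
  }
}
```

The ascending-order requirement is what makes the single forward pass valid: each range's end is simply the next offset in the list.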