Class HadoopFileIO

java.lang.Object
org.apache.iceberg.hadoop.HadoopFileIO
All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, org.apache.hadoop.conf.Configurable, HadoopConfigurable, DelegateFileIO, FileIO, SupportsBulkOperations, SupportsPrefixOperations

public class HadoopFileIO extends Object implements HadoopConfigurable, DelegateFileIO
  • Constructor Details

    • HadoopFileIO

      public HadoopFileIO()
      Constructor used for dynamic FileIO loading.

      The Hadoop configuration must be set through setConf(Configuration).

    • HadoopFileIO

      public HadoopFileIO(org.apache.hadoop.conf.Configuration hadoopConf)
    • HadoopFileIO

      public HadoopFileIO(SerializableSupplier<org.apache.hadoop.conf.Configuration> hadoopConf)
  • Method Details

    • conf

      public org.apache.hadoop.conf.Configuration conf()
    • initialize

      public void initialize(Map<String,String> props)
      Description copied from interface: FileIO
      Initialize File IO from catalog properties.
      Specified by:
      initialize in interface FileIO
      Parameters:
      props - catalog properties
    • newInputFile

      public InputFile newInputFile(String path)
      Description copied from interface: FileIO
      Get an InputFile instance to read bytes from the file at the given path.
      Specified by:
      newInputFile in interface FileIO
    • newInputFile

      public InputFile newInputFile(String path, long length)
      Description copied from interface: FileIO
      Get an InputFile instance to read bytes from the file at the given path, with a known file length.
      Specified by:
      newInputFile in interface FileIO
    • newOutputFile

      public OutputFile newOutputFile(String path)
      Description copied from interface: FileIO
      Get an OutputFile instance to write bytes to the file at the given path.
      Specified by:
      newOutputFile in interface FileIO
    • deleteFile

      public void deleteFile(String path)
      Description copied from interface: FileIO
      Delete the file at the given path.
      Specified by:
      deleteFile in interface FileIO
    • properties

      public Map<String,String> properties()
      Description copied from interface: FileIO
      Returns the property map used to configure this FileIO.
      Specified by:
      properties in interface FileIO
    • setConf

      public void setConf(org.apache.hadoop.conf.Configuration conf)
      Specified by:
      setConf in interface org.apache.hadoop.conf.Configurable
    • getConf

      public org.apache.hadoop.conf.Configuration getConf()
      Specified by:
      getConf in interface org.apache.hadoop.conf.Configurable
    • serializeConfWith

      public void serializeConfWith(Function<org.apache.hadoop.conf.Configuration,SerializableSupplier<org.apache.hadoop.conf.Configuration>> confSerializer)
      Description copied from interface: HadoopConfigurable
      Take a function that serializes Hadoop configuration into a supplier. An implementation is expected to pass its current Hadoop configuration to this function; the result can then be safely serialized for future use.
      Specified by:
      serializeConfWith in interface HadoopConfigurable
      Parameters:
      confSerializer - A function that takes Hadoop configuration and returns a serializable supplier of it.
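
      The contract above can be illustrated with a minimal, self-contained sketch. It does not use Iceberg or Hadoop classes: FakeConf, SerializableSupplier, SnapshotSupplier, ConfHolder, and roundTrip are all hypothetical stand-ins defined here for illustration, and the way HadoopFileIO stores its configuration internally is not shown on this page. The sketch only demonstrates the idea that a holder passes its current configuration to the supplied function and keeps the serializable supplier that comes back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Supplier;

final class ConfSerializationSketch {

  /** Hypothetical stand-in for a supplier that can be Java-serialized. */
  interface SerializableSupplier<T> extends Supplier<T>, Serializable {}

  /** Hypothetical stand-in for a Hadoop Configuration; deliberately not Serializable. */
  static final class FakeConf {
    final Map<String, String> entries = new HashMap<>();
  }

  /** Serializable supplier that stores a plain-map snapshot and rebuilds the config on demand. */
  static final class SnapshotSupplier implements SerializableSupplier<FakeConf> {
    private static final long serialVersionUID = 1L;
    private final HashMap<String, String> snapshot;

    SnapshotSupplier(FakeConf conf) {
      this.snapshot = new HashMap<>(conf.entries);
    }

    @Override
    public FakeConf get() {
      FakeConf conf = new FakeConf();
      conf.entries.putAll(snapshot);
      return conf;
    }
  }

  /** Hypothetical holder that follows the serializeConfWith contract. */
  static final class ConfHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    private SerializableSupplier<FakeConf> conf;

    ConfHolder(FakeConf initial) {
      // Captures the non-serializable config directly: fine locally, not safe to serialize.
      this.conf = () -> initial;
    }

    void serializeConfWith(Function<FakeConf, SerializableSupplier<FakeConf>> confSerializer) {
      // Pass the current configuration to the caller's function and keep its result.
      this.conf = confSerializer.apply(conf.get());
    }

    FakeConf conf() {
      return conf.get();
    }
  }

  /** Serializes and deserializes the holder in memory, as a task shipped to a worker would be. */
  static ConfHolder roundTrip(ConfHolder holder) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(holder);
      }
      try (ObjectInputStream in =
          new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
        return (ConfHolder) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException("round trip failed", e);
    }
  }
}
```

      In this sketch, calling holder.serializeConfWith(SnapshotSupplier::new) before roundTrip(holder) is what makes the round trip succeed; without it, the holder still references the non-serializable FakeConf.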
    • listPrefix

      public Iterable<FileInfo> listPrefix(String prefix)
      Description copied from interface: SupportsPrefixOperations
      Return an iterable of all files under a prefix.

      Hierarchical file systems (e.g. HDFS) may impose additional restrictions, such as requiring the prefix to fully match a directory, whereas key/value object stores may allow arbitrary prefixes.

      Specified by:
      listPrefix in interface SupportsPrefixOperations
      Parameters:
      prefix - prefix to list
      Returns:
      iterable of file information
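
      The difference between the two prefix interpretations described above can be sketched with plain string lists. This is an illustrative model only, not the HadoopFileIO implementation: PrefixSemanticsSketch and both of its methods are hypothetical names, and the paths are made up.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixSemanticsSketch {

  /** Object-store style: any key that starts with the prefix string matches. */
  static List<String> listByKeyPrefix(List<String> keys, String prefix) {
    List<String> matches = new ArrayList<>();
    for (String key : keys) {
      if (key.startsWith(prefix)) {
        matches.add(key);
      }
    }
    return matches;
  }

  /** Hierarchical style: the prefix names a directory, so only paths beneath it match. */
  static List<String> listUnderDirectory(List<String> paths, String directory) {
    // Normalize to a trailing slash so "table" does not also match "table_old".
    String normalized = directory.endsWith("/") ? directory : directory + "/";
    List<String> matches = new ArrayList<>();
    for (String path : paths) {
      if (path.startsWith(normalized)) {
        matches.add(path);
      }
    }
    return matches;
  }
}
```

      Given the paths warehouse/db/table/data/a.parquet, warehouse/db/table/data/b.parquet, and warehouse/db/table_old/data/c.parquet, the prefix warehouse/db/table matches all three under the key-prefix interpretation but only the first two under the directory interpretation. The same distinction matters for deletePrefix.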
    • deletePrefix

      public void deletePrefix(String prefix)
      Description copied from interface: SupportsPrefixOperations
      Delete all files under a prefix.

      Hierarchical file systems (e.g. HDFS) may impose additional restrictions, such as requiring the prefix to fully match a directory, whereas key/value object stores may allow arbitrary prefixes.

      Specified by:
      deletePrefix in interface SupportsPrefixOperations
      Parameters:
      prefix - prefix to delete
    • deleteFiles

      public void deleteFiles(Iterable<String> pathsToDelete) throws BulkDeletionFailureException
      Description copied from interface: SupportsBulkOperations
      Delete the files at the given paths.
      Specified by:
      deleteFiles in interface SupportsBulkOperations
      Parameters:
      pathsToDelete - The paths to delete
      Throws:
      BulkDeletionFailureException - if at least one file could not be deleted
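
      One way to satisfy a contract like this is to attempt every deletion and report failures once at the end. The sketch below models that approach with stand-ins only: BulkDeleteSketch, BulkDeleteFailed, and the deleteOne callback are hypothetical, and whether HadoopFileIO continues after an individual failure is an assumption that this page does not confirm.

```java
import java.util.function.Consumer;

final class BulkDeleteSketch {

  /** Hypothetical stand-in for a bulk-deletion failure that carries the failure count. */
  static final class BulkDeleteFailed extends RuntimeException {
    private static final long serialVersionUID = 1L;
    final int failedCount;

    BulkDeleteFailed(int failedCount) {
      super("Failed to delete " + failedCount + " file(s)");
      this.failedCount = failedCount;
    }
  }

  /**
   * Tries to delete every path with the supplied callback. A callback that throws
   * counts as one failure; the remaining paths are still attempted.
   */
  static void deleteAll(Iterable<String> pathsToDelete, Consumer<String> deleteOne) {
    int failed = 0;
    for (String path : pathsToDelete) {
      try {
        deleteOne.accept(path);
      } catch (RuntimeException e) {
        failed++;
      }
    }
    if (failed > 0) {
      // Surface a single exception once at least one deletion has failed.
      throw new BulkDeleteFailed(failed);
    }
  }
}
```

      A caller written against this model catches the exception, reads the failure count, and decides whether to retry or to treat the leftover files as orphans to clean up later.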