Class HadoopTableOperations

java.lang.Object
org.apache.iceberg.hadoop.HadoopTableOperations
All Implemented Interfaces:
TableOperations

public class HadoopTableOperations extends Object implements TableOperations
TableOperations implementation for file systems that support atomic rename.

This maintains metadata in a "metadata" folder under the table location.

  • Constructor Details

    • HadoopTableOperations

      protected HadoopTableOperations(org.apache.hadoop.fs.Path location, FileIO fileIO, org.apache.hadoop.conf.Configuration conf, LockManager lockManager)
  • Method Details

    • current

      public TableMetadata current()
      Description copied from interface: TableOperations
      Return the currently loaded table metadata, without checking for updates.
      Specified by:
      current in interface TableOperations
      Returns:
      table metadata
    • refresh

      public TableMetadata refresh()
      Description copied from interface: TableOperations
      Return the current table metadata after checking for updates.
      Specified by:
      refresh in interface TableOperations
      Returns:
      table metadata
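      The difference between current() and refresh() can be pictured with a minimal sketch. This is not the Iceberg implementation: the class name CachedMetadataSketch, the String stand-in for TableMetadata, and the AtomicReference standing in for storage are all assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical stand-in for a TableOperations-like object; not Iceberg code. */
final class CachedMetadataSketch {
  // Simulates the latest committed metadata held in storage (assumption).
  private final AtomicReference<String> storage;
  // Locally loaded copy; only refresh() updates it.
  private String loaded;

  CachedMetadataSketch(AtomicReference<String> storage) {
    this.storage = storage;
    this.loaded = storage.get();
  }

  /** Returns the loaded metadata without checking storage for updates. */
  String current() {
    return loaded;
  }

  /** Re-reads storage, replaces the loaded copy, and returns it. */
  String refresh() {
    loaded = storage.get();
    return loaded;
  }
}
```

      In this sketch, a concurrent update to storage stays invisible to current() until refresh() is called.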
    • commit

      public void commit(TableMetadata base, TableMetadata metadata)
      Description copied from interface: TableOperations
      Replace the base table metadata with a new version.

      This method should implement and document atomicity guarantees.

      Implementations must check that the base metadata is current to avoid overwriting updates. Once the atomic commit operation succeeds, implementations must not perform any operations that may fail, because a failure in this method cannot be distinguished from a commit failure.

      Implementations must throw a CommitStateUnknownException when they cannot determine whether the commit succeeded or failed. For example, if a network partition causes the confirmation of the commit to be lost, the implementation should throw a CommitStateUnknownException. This is important because downstream users of this API need to know whether they can clean up the commit: if the state is unknown, it is not safe to remove any files. All other exceptions will be treated as if the commit has failed.

      Specified by:
      commit in interface TableOperations
      Parameters:
      base - table metadata on which changes were based
      metadata - new table metadata with updates
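      The contract above can be sketched as follows. This is a simplified model under stated assumptions, not the code of this class: AtomicStore, CommitFailed, CommitStateUnknown, and safeToCleanUp are hypothetical names, metadata is reduced to an integer version, and any exception from the atomic step is treated as an unknown outcome.

```java
/** Hypothetical model of the commit contract; not Iceberg code. */
final class CommitSketch {
  /** Hypothetical: the base was stale, so the commit definitely did not happen. */
  static final class CommitFailed extends RuntimeException {
    CommitFailed(String message) {
      super(message);
    }
  }

  /** Hypothetical stand-in for CommitStateUnknownException. */
  static final class CommitStateUnknown extends RuntimeException {
    CommitStateUnknown(Throwable cause) {
      super(cause);
    }
  }

  /** Hypothetical atomic primitive, for example one backed by an atomic rename. */
  interface AtomicStore {
    /** Swaps to newVersion only if the stored version equals expectedVersion. */
    boolean swap(int expectedVersion, int newVersion);
  }

  static void commit(AtomicStore store, int baseVersion, int newVersion) {
    boolean swapped;
    try {
      swapped = store.swap(baseVersion, newVersion);
    } catch (RuntimeException e) {
      // Assumption: an exception here means the outcome cannot be determined.
      throw new CommitStateUnknown(e);
    }
    if (!swapped) {
      // The base was not current, so nothing was overwritten.
      throw new CommitFailed("Base version " + baseVersion + " is stale");
    }
    // No fallible work after this point: the commit has already succeeded.
  }

  /** Caller side: new files may be removed only when the commit definitely failed. */
  static boolean safeToCleanUp(RuntimeException commitError) {
    return !(commitError instanceof CommitStateUnknown);
  }
}
```

      A caller in this model deletes newly written files only when safeToCleanUp returns true, and otherwise leaves them in place.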
    • io

      public FileIO io()
      Description copied from interface: TableOperations
      Returns a FileIO to read and write table data and metadata files.
      Specified by:
      io in interface TableOperations
    • locationProvider

      public LocationProvider locationProvider()
      Description copied from interface: TableOperations
      Returns a LocationProvider that supplies locations for new data files.
      Specified by:
      locationProvider in interface TableOperations
      Returns:
      a location provider configured for the current table state
    • metadataFileLocation

      public String metadataFileLocation(String fileName)
      Description copied from interface: TableOperations
      Given the name of a metadata file, obtain the full path of that file using an appropriate base location of the implementation's choosing.

      The file may not exist yet, in which case the path should be returned as if the file were to be created by, for example, FileIO.newOutputFile(String).

      Specified by:
      metadataFileLocation in interface TableOperations
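      Since this class keeps metadata in a "metadata" folder under the table location, the path resolution might look like the sketch below. It uses plain strings rather than Hadoop Path objects; the class name MetadataLocationSketch, the extra tableLocation parameter, and the example file name are assumptions for illustration only.

```java
/** Hypothetical string-based sketch of metadata path resolution; not Iceberg code. */
final class MetadataLocationSketch {
  /** Resolves fileName under the "metadata" folder of the table location. */
  static String metadataFileLocation(String tableLocation, String fileName) {
    // Drop a trailing slash so the result has no doubled separator.
    String base = tableLocation.endsWith("/")
        ? tableLocation.substring(0, tableLocation.length() - 1)
        : tableLocation;
    // The file does not need to exist; only the path is computed.
    return base + "/metadata/" + fileName;
  }
}
```

      For example, resolving the illustrative name "v3.metadata.json" against "hdfs://nn/warehouse/db/tbl" yields "hdfs://nn/warehouse/db/tbl/metadata/v3.metadata.json" in this sketch.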
    • temp

      public TableOperations temp(TableMetadata uncommittedMetadata)
      Description copied from interface: TableOperations
      Return a temporary TableOperations instance that uses configuration from uncommitted metadata.

      This is called by transactions when uncommitted table metadata should be used; for example, to create a metadata file location based on metadata in the transaction that has not been committed.

      Transactions will not call TableOperations.refresh() or TableOperations.commit(TableMetadata, TableMetadata).

      Specified by:
      temp in interface TableOperations
      Parameters:
      uncommittedMetadata - uncommitted table metadata
      Returns:
      a temporary TableOperations instance that behaves as if the uncommitted metadata were current
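      One way to picture a temporary operations object is the sketch below. It is a minimal model, not the code of this class: the Ops interface and the String stand-in for TableMetadata are hypothetical, and rejecting refresh() and commit() with UnsupportedOperationException is an assumption based on the statement that transactions will not call them.

```java
/** Hypothetical model of temporary table operations; not Iceberg code. */
final class TempOpsSketch {
  /** Hypothetical, reduced stand-in for the TableOperations interface. */
  interface Ops {
    String current();

    String refresh();

    void commit(String base, String updated);
  }

  /** Returns operations that treat the uncommitted metadata as current. */
  static Ops temp(String uncommittedMetadata) {
    return new Ops() {
      @Override
      public String current() {
        // The uncommitted metadata is reported as the current state.
        return uncommittedMetadata;
      }

      @Override
      public String refresh() {
        // Assumption: there is nothing to reload for uncommitted state.
        throw new UnsupportedOperationException("Cannot refresh temporary table operations");
      }

      @Override
      public void commit(String base, String updated) {
        // Assumption: commits go through the real operations, not the temporary ones.
        throw new UnsupportedOperationException("Cannot commit temporary table operations");
      }
    };
  }
}
```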
    • getFileSystem

      protected org.apache.hadoop.fs.FileSystem getFileSystem(org.apache.hadoop.fs.Path path, org.apache.hadoop.conf.Configuration hadoopConf)