Class S3FileIO

java.lang.Object
org.apache.iceberg.aws.s3.S3FileIO
All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, CredentialSupplier, DelegateFileIO, FileIO, SupportsBulkOperations, SupportsPrefixOperations, SupportsRecoveryOperations

public class S3FileIO extends Object implements CredentialSupplier, DelegateFileIO, SupportsRecoveryOperations
FileIO implementation backed by S3.

Locations used must follow the conventions for S3 URIs (e.g. s3://bucket/path...). URIs with schemes s3a, s3n, https are also treated as s3 file paths. Using this FileIO with other schemes will result in ValidationException.
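
The scheme rule above can be illustrated with a small standalone check. This is only a sketch: the class `S3SchemeCheck` and the method `isS3Location` are hypothetical names that are not part of Iceberg, the case-insensitive comparison is an assumption, and S3FileIO itself reports an unsupported scheme with a ValidationException instead of returning a boolean.

```java
import java.util.Locale;
import java.util.Set;

public class S3SchemeCheck {
  // Schemes the class description lists as accepted: s3, plus s3a, s3n and https.
  private static final Set<String> ACCEPTED = Set.of("s3", "s3a", "s3n", "https");

  // Hypothetical helper: true if the location starts with one of the accepted schemes.
  static boolean isS3Location(String location) {
    int idx = location.indexOf("://");
    if (idx <= 0) {
      return false; // no scheme present
    }
    // Assumption: schemes are compared case-insensitively.
    String scheme = location.substring(0, idx).toLowerCase(Locale.ROOT);
    return ACCEPTED.contains(scheme);
  }

  public static void main(String[] args) {
    System.out.println(isS3Location("s3://bucket/path/file.parquet"));
    System.out.println(isS3Location("hdfs://namenode/path/file.parquet"));
  }
}
```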

  • Constructor Details

    • S3FileIO

      public S3FileIO()
      No-arg constructor to load the FileIO dynamically.

      All fields are initialized by calling initialize(Map) later.

    • S3FileIO

      public S3FileIO(SerializableSupplier<software.amazon.awssdk.services.s3.S3Client> s3)
      Constructor with custom s3 supplier.

      Calling initialize(Map) will overwrite information set in this constructor.

      Parameters:
      s3 - s3 supplier
    • S3FileIO

      public S3FileIO(SerializableSupplier<software.amazon.awssdk.services.s3.S3Client> s3, S3FileIOProperties s3FileIOProperties)
      Constructor with custom s3 supplier and S3FileIO properties.

      Calling initialize(Map) will overwrite information set in this constructor.

      Parameters:
      s3 - s3 supplier
      s3FileIOProperties - S3 FileIO properties
  • Method Details

    • newInputFile

      public InputFile newInputFile(String path)
      Description copied from interface: FileIO
      Get an InputFile instance to read bytes from the file at the given path.
      Specified by:
      newInputFile in interface FileIO
    • newInputFile

      public InputFile newInputFile(String path, long length)
      Description copied from interface: FileIO
      Get an InputFile instance to read bytes from the file at the given path, with a known file length.
      Specified by:
      newInputFile in interface FileIO
    • newOutputFile

      public OutputFile newOutputFile(String path)
      Description copied from interface: FileIO
      Get an OutputFile instance to write bytes to the file at the given path.
      Specified by:
      newOutputFile in interface FileIO
    • deleteFile

      public void deleteFile(String path)
      Description copied from interface: FileIO
      Delete the file at the given path.
      Specified by:
      deleteFile in interface FileIO
    • properties

      public Map<String,String> properties()
      Description copied from interface: FileIO
      Returns the property map used to configure this FileIO.
      Specified by:
      properties in interface FileIO
    • deleteFiles

      public void deleteFiles(Iterable<String> paths) throws BulkDeletionFailureException
      Deletes the given paths in a batched manner.

      The paths are grouped by bucket. A delete is triggered for a bucket when its pending paths reach the configured batch size, and any remainder is deleted in a final batch for each bucket.

      Specified by:
      deleteFiles in interface SupportsBulkOperations
      Parameters:
      paths - paths to delete
      Throws:
      BulkDeletionFailureException - if at least one file fails to delete
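
      The grouping and batching described for deleteFiles can be sketched without any S3 dependency. This is an illustration under stated assumptions, not the S3FileIO implementation: `BucketBatcher`, `bucketAndKey` and `batches` are hypothetical names, the batch size is passed in as a plain argument, every path is assumed to be well formed (scheme://bucket/key), and the sketch only returns the batches that would be submitted.

      ```java
      import java.util.ArrayList;
      import java.util.LinkedHashMap;
      import java.util.List;
      import java.util.Map;

      public class BucketBatcher {
        // Hypothetical helper: split "scheme://bucket/key" into {bucket, key}.
        // Assumes the path contains "://" and at least one "/" after the bucket.
        static String[] bucketAndKey(String path) {
          String rest = path.substring(path.indexOf("://") + 3);
          int slash = rest.indexOf('/');
          return new String[] {rest.substring(0, slash), rest.substring(slash + 1)};
        }

        // Groups keys by bucket. A batch is emitted as soon as a bucket has
        // batchSize pending keys; leftovers are emitted once per bucket at the end.
        static List<Map.Entry<String, List<String>>> batches(Iterable<String> paths, int batchSize) {
          Map<String, List<String>> pending = new LinkedHashMap<>();
          List<Map.Entry<String, List<String>>> out = new ArrayList<>();
          for (String path : paths) {
            String[] parts = bucketAndKey(path);
            List<String> keys = pending.computeIfAbsent(parts[0], b -> new ArrayList<>());
            keys.add(parts[1]);
            if (keys.size() == batchSize) {
              out.add(Map.entry(parts[0], new ArrayList<>(keys)));
              keys.clear();
            }
          }
          // Final remainder batch for each bucket that still has pending keys.
          pending.forEach((bucket, keys) -> {
            if (!keys.isEmpty()) {
              out.add(Map.entry(bucket, new ArrayList<>(keys)));
            }
          });
          return out;
        }

        public static void main(String[] args) {
          List<String> paths = List.of("s3://a/1", "s3://b/1", "s3://a/2", "s3://a/3");
          for (Map.Entry<String, List<String>> batch : batches(paths, 2)) {
            System.out.println(batch.getKey() + " -> " + batch.getValue());
          }
        }
      }
      ```

      In this sketch a full batch is emitted while the input is still being read, so a large input never holds more than one partial batch per bucket in memory.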
    • listPrefix

      public Iterable<FileInfo> listPrefix(String prefix)
      Description copied from interface: SupportsPrefixOperations
      Return an iterable of all files under a prefix.

      Hierarchical file systems (e.g. HDFS) may impose additional restrictions like the prefix must fully match a directory whereas key/value object stores may allow for arbitrary prefixes.

      Specified by:
      listPrefix in interface SupportsPrefixOperations
      Parameters:
      prefix - prefix to list
      Returns:
      iterable of file information
    • deletePrefix

      public void deletePrefix(String prefix)
      This method makes a best-effort attempt to delete all objects under the given prefix.

      Bulk delete operations are used and failed deletes are not retried, but any individual objects that are not deleted as part of the bulk operation are logged.

      Specified by:
      deletePrefix in interface SupportsPrefixOperations
      Parameters:
      prefix - prefix to delete
    • client

      public software.amazon.awssdk.services.s3.S3Client client()
    • getCredential

      public String getCredential()
      Description copied from interface: CredentialSupplier
      Returns the credential string.
      Specified by:
      getCredential in interface CredentialSupplier
    • initialize

      public void initialize(Map<String,String> props)
      Description copied from interface: FileIO
      Initialize File IO from catalog properties.
      Specified by:
      initialize in interface FileIO
      Parameters:
      props - catalog properties
    • close

      public void close()
      Description copied from interface: FileIO
      Close File IO to release underlying resources.

      Calling this method is only required when this FileIO instance is no longer expected to be used, and the resources it holds need to be explicitly released to avoid resource leaks.

      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Specified by:
      close in interface FileIO
    • finalize

      protected void finalize() throws Throwable
      Overrides:
      finalize in class Object
      Throws:
      Throwable
    • recoverFile

      public boolean recoverFile(String path)
      Description copied from interface: SupportsRecoveryOperations
      Perform a best-effort recovery of a file at a given path.
      Specified by:
      recoverFile in interface SupportsRecoveryOperations
      Parameters:
      path - Absolute path of file to attempt recovery for
      Returns:
      true if recovery was successful, false otherwise