Package org.apache.iceberg.hadoop
Class HadoopFileIO
java.lang.Object
org.apache.iceberg.hadoop.HadoopFileIO
- All Implemented Interfaces:
Closeable, Serializable, AutoCloseable, org.apache.hadoop.conf.Configurable, HadoopConfigurable, DelegateFileIO, FileIO, SupportsBulkOperations, SupportsPrefixOperations
Constructor Summary
| Constructor | Description |
| --- | --- |
| HadoopFileIO() | Constructor used for dynamic FileIO loading. |
| HadoopFileIO(org.apache.hadoop.conf.Configuration hadoopConf) | |
| HadoopFileIO(SerializableSupplier<org.apache.hadoop.conf.Configuration> hadoopConf) | |
-
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| org.apache.hadoop.conf.Configuration | conf() | |
| void | deleteFile(String path) | Delete the file at the given path. |
| void | deleteFiles(Iterable<String> pathsToDelete) | Delete the files at the given paths. |
| void | deletePrefix(String prefix) | Delete all files under a prefix. |
| org.apache.hadoop.conf.Configuration | getConf() | |
| void | initialize(Map<String, String> props) | Initialize File IO from catalog properties. |
| Iterable<FileInfo> | listPrefix(String prefix) | Return an iterable of all files under a prefix. |
| InputFile | newInputFile(String path) | Get an InputFile instance to read bytes from the file at the given path. |
| InputFile | newInputFile(String path, long length) | Get an InputFile instance to read bytes from the file at the given path, with a known file length. |
| OutputFile | newOutputFile(String path) | Get an OutputFile instance to write bytes to the file at the given path. |
| Map<String, String> | properties() | Returns the property map used to configure this FileIO. |
| void | serializeConfWith(Function<org.apache.hadoop.conf.Configuration, SerializableSupplier<org.apache.hadoop.conf.Configuration>> confSerializer) | Take a function that serializes Hadoop configuration into a supplier. |
| void | setConf(org.apache.hadoop.conf.Configuration conf) | |
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.iceberg.io.FileIO
close, deleteFile, deleteFile, newInputFile, newInputFile, newInputFile
-
Constructor Details
-
HadoopFileIO
public HadoopFileIO()
Constructor used for dynamic FileIO loading. Hadoop configuration must be set through setConf(Configuration).
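The sequence a dynamic loader is expected to follow can be sketched by hand as below. This is only an illustration, not the loader Iceberg itself uses: the class name `DynamicLoadingSketch`, the helper `load`, and the `warehouse` property are hypothetical, and the example assumes the Iceberg and Hadoop libraries are on the classpath.

```java
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.hadoop.HadoopFileIO;

public class DynamicLoadingSketch {
  // Hypothetical helper: no-arg construction followed by explicit configuration.
  static HadoopFileIO load(Map<String, String> catalogProps) {
    HadoopFileIO io = new HadoopFileIO(); // no Hadoop configuration attached yet
    io.setConf(new Configuration());      // supply the Hadoop configuration before any file access
    io.initialize(catalogProps);          // pass the catalog properties
    return io;
  }

  public static void main(String[] args) {
    // "warehouse" is an illustrative key, not one this class is documented to read.
    try (HadoopFileIO io = load(Map.of("warehouse", "/tmp/warehouse"))) {
      System.out.println(io.properties());
    }
  }
}
```

When a Configuration is already at hand, the one-argument constructor avoids the separate setConf call.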
-
HadoopFileIO
public HadoopFileIO(org.apache.hadoop.conf.Configuration hadoopConf)
-
HadoopFileIO
public HadoopFileIO(SerializableSupplier<org.apache.hadoop.conf.Configuration> hadoopConf)
-
Method Details
-
conf
public org.apache.hadoop.conf.Configuration conf()
-
initialize
public void initialize(Map<String, String> props)
Description copied from interface: FileIO
Initialize File IO from catalog properties.
- Specified by:
initialize in interface FileIO
- Parameters:
props - catalog properties
-
newInputFile
public InputFile newInputFile(String path)
Description copied from interface: FileIO
Get an InputFile instance to read bytes from the file at the given path.
- Specified by:
newInputFile in interface FileIO
-
newInputFile
public InputFile newInputFile(String path, long length)
Description copied from interface: FileIO
Get an InputFile instance to read bytes from the file at the given path, with a known file length.
- Specified by:
newInputFile in interface FileIO
-
newOutputFile
public OutputFile newOutputFile(String path)
Description copied from interface: FileIO
Get an OutputFile instance to write bytes to the file at the given path.
- Specified by:
newOutputFile in interface FileIO
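A minimal sketch of how newOutputFile and the two newInputFile overloads might be combined in a write-then-read round trip. The class name `RoundTripSketch`, the helper `roundTrip`, and the temporary file name are hypothetical; the example assumes the default local Hadoop file system and Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.hadoop.HadoopFileIO;
import org.apache.iceberg.io.InputFile;
import org.apache.iceberg.io.OutputFile;

public class RoundTripSketch {
  // Hypothetical helper: writes text to path, then reads it back through the same FileIO.
  static String roundTrip(HadoopFileIO io, String path, String text) throws IOException {
    byte[] bytes = text.getBytes(StandardCharsets.UTF_8);

    OutputFile out = io.newOutputFile(path);
    try (OutputStream stream = out.createOrOverwrite()) {
      stream.write(bytes);
    }

    // The length is already known here, so the two-argument overload is used.
    InputFile in = io.newInputFile(path, bytes.length);
    try (InputStream stream = in.newStream()) {
      return new String(stream.readAllBytes(), StandardCharsets.UTF_8);
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("hadoop-file-io-demo");
    try (HadoopFileIO io = new HadoopFileIO(new Configuration())) {
      System.out.println(roundTrip(io, dir.resolve("demo.txt").toString(), "hello"));
    }
  }
}
```

When the length is not known in advance, newInputFile(String path) can be used instead.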
-
deleteFile
public void deleteFile(String path)
Description copied from interface: FileIO
Delete the file at the given path.
- Specified by:
deleteFile in interface FileIO
-
properties
public Map<String, String> properties()
Description copied from interface: FileIO
Returns the property map used to configure this FileIO.
- Specified by:
properties in interface FileIO
-
setConf
public void setConf(org.apache.hadoop.conf.Configuration conf)
- Specified by:
setConf in interface org.apache.hadoop.conf.Configurable
-
getConf
public org.apache.hadoop.conf.Configuration getConf()
- Specified by:
getConf in interface org.apache.hadoop.conf.Configurable
-
serializeConfWith
public void serializeConfWith(Function<org.apache.hadoop.conf.Configuration, SerializableSupplier<org.apache.hadoop.conf.Configuration>> confSerializer)
Description copied from interface: HadoopConfigurable
Take a function that serializes Hadoop configuration into a supplier. An implementation is expected to pass its current Hadoop configuration into this function; the resulting supplier can then be safely serialized for future use.
- Specified by:
serializeConfWith in interface HadoopConfigurable
- Parameters:
confSerializer - A function that takes Hadoop configuration and returns a serializable supplier of it.
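As a rough illustration of the hook, the sketch below installs a serializer and then pushes the FileIO through plain Java serialization. It assumes that `org.apache.iceberg.hadoop.SerializableConfiguration` is available as the wrapper; engines may well supply their own serializer instead. The class name `SerializeConfSketch`, the helper `javaRoundTrip`, and the `demo.key` setting are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.hadoop.HadoopFileIO;
import org.apache.iceberg.hadoop.SerializableConfiguration;

public class SerializeConfSketch {
  // Hypothetical helper: installs a conf serializer, then copies the FileIO via Java serialization.
  static HadoopFileIO javaRoundTrip(HadoopFileIO io) throws IOException, ClassNotFoundException {
    // The function receives the current Hadoop configuration and returns a serializable supplier of it.
    io.serializeConfWith(conf -> new SerializableConfiguration(conf)::get);

    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
      out.writeObject(io);
    }
    try (ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
      return (HadoopFileIO) in.readObject();
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("demo.key", "demo-value"); // illustrative setting only
    HadoopFileIO copy = javaRoundTrip(new HadoopFileIO(conf));
    System.out.println(copy.getConf().get("demo.key"));
  }
}
```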
-
listPrefix
public Iterable<FileInfo> listPrefix(String prefix)
Description copied from interface: SupportsPrefixOperations
Return an iterable of all files under a prefix.
Hierarchical file systems (e.g. HDFS) may impose additional restrictions, such as requiring that the prefix fully match a directory, whereas key/value object stores may allow arbitrary prefixes.
- Specified by:
listPrefix in interface SupportsPrefixOperations
- Parameters:
prefix - prefix to list
- Returns:
iterable of file information
-
deletePrefix
public void deletePrefix(String prefix)
Description copied from interface: SupportsPrefixOperations
Delete all files under a prefix.
Hierarchical file systems (e.g. HDFS) may impose additional restrictions, such as requiring that the prefix fully match a directory, whereas key/value object stores may allow arbitrary prefixes.
- Specified by:
deletePrefix in interface SupportsPrefixOperations
- Parameters:
prefix - prefix to delete
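The two prefix operations might be used together as in the sketch below, which lists a directory and then removes it. Because a hierarchical file system may require the prefix to match a directory, the example passes a full directory path. The class name `PrefixSketch`, the helper `listLocations`, and the file names are hypothetical; the local file system and Java 11 or later are assumed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.hadoop.HadoopFileIO;
import org.apache.iceberg.io.FileInfo;

public class PrefixSketch {
  // Hypothetical helper: collects the location of every file reported under the prefix.
  static List<String> listLocations(HadoopFileIO io, String prefix) {
    List<String> locations = new ArrayList<>();
    for (FileInfo info : io.listPrefix(prefix)) {
      locations.add(info.location());
    }
    return locations;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("prefix-demo");
    Files.writeString(dir.resolve("a.txt"), "a");
    Files.writeString(dir.resolve("b.txt"), "b");

    try (HadoopFileIO io = new HadoopFileIO(new Configuration())) {
      // The prefix is a whole directory, which a hierarchical file system can always match.
      listLocations(io, dir.toString()).forEach(System.out::println);
      io.deletePrefix(dir.toString());
    }
  }
}
```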
-
deleteFiles
public void deleteFiles(Iterable<String> pathsToDelete)
Description copied from interface: SupportsBulkOperations
Delete the files at the given paths.
- Specified by:
deleteFiles in interface SupportsBulkOperations
- Parameters:
pathsToDelete - The paths to delete
- Throws:
BulkDeletionFailureException - in case of failure to delete at least 1 file
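One way a caller might handle the documented exception is sketched below. The class name `BulkDeleteSketch` and the helper `deleteAll` are hypothetical, and the choice to report failures as a count (through `numberFailedObjects()` on the exception) is just one option; which conditions count as a failed deletion depends on the underlying file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.iceberg.hadoop.HadoopFileIO;
import org.apache.iceberg.io.BulkDeletionFailureException;

public class BulkDeleteSketch {
  // Hypothetical helper: returns how many paths could not be deleted (0 when all succeeded).
  static int deleteAll(HadoopFileIO io, List<String> paths) {
    try {
      io.deleteFiles(paths);
      return 0;
    } catch (BulkDeletionFailureException e) {
      return e.numberFailedObjects();
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("bulk-delete-demo");
    Path first = Files.writeString(dir.resolve("first.txt"), "1");
    Path second = Files.writeString(dir.resolve("second.txt"), "2");

    try (HadoopFileIO io = new HadoopFileIO(new Configuration())) {
      int failed = deleteAll(io, List.of(first.toString(), second.toString()));
      System.out.println("failed deletions: " + failed);
    }
  }
}
```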
-