Package org.apache.iceberg.mr.hive
Class HiveIcebergOutputCommitter
java.lang.Object
  org.apache.hadoop.mapreduce.OutputCommitter
    org.apache.hadoop.mapred.OutputCommitter
      org.apache.iceberg.mr.hive.HiveIcebergOutputCommitter
-
public class HiveIcebergOutputCommitter extends org.apache.hadoop.mapred.OutputCommitter
An Iceberg table committer for adding data files to the Iceberg tables. Currently independent of the Hive ACID transactions.
-
Constructor Summary
Constructors
- HiveIcebergOutputCommitter()
-
Method Summary
All Methods (all are instance, concrete methods)
- void abortJob(org.apache.hadoop.mapred.JobContext originalContext, int status): Removes the generated data files if there is a commit file already generated for them.
- void abortTask(org.apache.hadoop.mapred.TaskAttemptContext originalContext): Removes files generated by this task.
- void commitJob(org.apache.hadoop.mapred.JobContext originalContext): Reads the commit files stored in the temp directories and collects the generated committed data files.
- void commitTask(org.apache.hadoop.mapred.TaskAttemptContext originalContext): Collects the generated data files and creates a commit file storing the data file list.
- boolean needsTaskCommit(org.apache.hadoop.mapred.TaskAttemptContext context)
- void setupJob(org.apache.hadoop.mapred.JobContext jobContext)
- void setupTask(org.apache.hadoop.mapred.TaskAttemptContext taskAttemptContext)
-
Methods inherited from class org.apache.hadoop.mapred.OutputCommitter
abortJob, abortTask, cleanupJob, cleanupJob, commitJob, commitTask, isCommitJobRepeatable, isCommitJobRepeatable, isRecoverySupported, isRecoverySupported, isRecoverySupported, needsTaskCommit, recoverTask, recoverTask, setupJob, setupTask
-
Method Detail
-
setupJob
public void setupJob(org.apache.hadoop.mapred.JobContext jobContext)
- Specified by:
setupJob in class org.apache.hadoop.mapred.OutputCommitter
-
setupTask
public void setupTask(org.apache.hadoop.mapred.TaskAttemptContext taskAttemptContext)
- Specified by:
setupTask in class org.apache.hadoop.mapred.OutputCommitter
-
needsTaskCommit
public boolean needsTaskCommit(org.apache.hadoop.mapred.TaskAttemptContext context)
- Specified by:
needsTaskCommit in class org.apache.hadoop.mapred.OutputCommitter
-
commitTask
public void commitTask(org.apache.hadoop.mapred.TaskAttemptContext originalContext) throws java.io.IOException
Collects the generated data files and creates a commit file storing the data file list.
- Specified by:
commitTask in class org.apache.hadoop.mapred.OutputCommitter
- Parameters:
originalContext - The task attempt context
- Throws:
java.io.IOException - Thrown if there is an error writing the commit file
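The task-level step described above can be illustrated with a minimal sketch using only java.nio. It is not the Iceberg implementation: the class name TaskCommitSketch, the "task-<id>.commit" naming scheme, and the one-path-per-line file format are all assumptions made for illustration. The real committer works with Hadoop task attempt contexts and Iceberg data file metadata.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical sketch of a task-level commit; not the Iceberg implementation. */
final class TaskCommitSketch {

  private TaskCommitSketch() {}

  /**
   * Writes a commit file listing the data files a task produced, one path per line.
   * The file name "task-<id>.commit" is an assumed naming scheme.
   */
  static Path commitTask(Path jobTempDir, int taskId, List<String> dataFiles)
      throws IOException {
    // Make sure the job's temporary directory exists before writing into it.
    Files.createDirectories(jobTempDir);
    Path commitFile = jobTempDir.resolve("task-" + taskId + ".commit");
    // Creates the file, or truncates and rewrites it if it already exists.
    Files.write(commitFile, dataFiles, StandardCharsets.UTF_8);
    return commitFile;
  }
}
```

Nothing is added to a table at this stage. The commit file only records which data files the task produced so that a later job-level step can find them.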
-
abortTask
public void abortTask(org.apache.hadoop.mapred.TaskAttemptContext originalContext) throws java.io.IOException
Removes files generated by this task.
- Specified by:
abortTask in class org.apache.hadoop.mapred.OutputCommitter
- Parameters:
originalContext - The task attempt context
- Throws:
java.io.IOException - Thrown if there is an error closing the writer
-
commitJob
public void commitJob(org.apache.hadoop.mapred.JobContext originalContext) throws java.io.IOException
Reads the commit files stored in the temp directories and collects the generated committed data files. Appends the data files to the tables. At the end removes the temporary directories.
- Overrides:
commitJob in class org.apache.hadoop.mapred.OutputCommitter
- Parameters:
originalContext - The job context
- Throws:
java.io.IOException - if there is a failure accessing the files
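The three phases described above (read the commit files, collect the data files, remove the temporary directory) can be sketched with plain java.nio. The class JobCommitSketch, the "*.commit" glob, and the one-path-per-line format are assumptions made for illustration. The append to the table is only marked with a comment, because it depends on the Iceberg table API, which this sketch does not touch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of a job-level commit; not the Iceberg implementation. */
final class JobCommitSketch {

  private JobCommitSketch() {}

  /** Collects the data files named in all commit files, then removes the temp directory. */
  static List<String> commitJob(Path jobTempDir) throws IOException {
    // Phase 1: find every commit file written by the tasks (assumed "*.commit" naming).
    List<Path> commitFiles = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(jobTempDir, "*.commit")) {
      for (Path commitFile : stream) {
        commitFiles.add(commitFile);
      }
    }
    Collections.sort(commitFiles); // deterministic order for the collected list

    // Phase 2: collect the data file paths, one per line in each commit file.
    List<String> dataFiles = new ArrayList<>();
    for (Path commitFile : commitFiles) {
      dataFiles.addAll(Files.readAllLines(commitFile, StandardCharsets.UTF_8));
    }

    // A real committer would append dataFiles to the target table at this point.

    // Phase 3: remove the temporary directory, deleting children before parents.
    try (Stream<Path> walk = Files.walk(jobTempDir)) {
      List<Path> toDelete = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      for (Path path : toDelete) {
        Files.delete(path);
      }
    }
    return dataFiles;
  }
}
```

Cleanup runs last in this sketch, so a failure while reading the commit files leaves them in place for a retry or an abort.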
-
abortJob
public void abortJob(org.apache.hadoop.mapred.JobContext originalContext, int status) throws java.io.IOException
Removes the generated data files if there is a commit file already generated for them. The cleanup at the end removes the temporary directories as well.
- Overrides:
abortJob in class org.apache.hadoop.mapred.OutputCommitter
- Parameters:
originalContext - The job context
status - The status of the job
- Throws:
java.io.IOException - if there is a failure deleting the files
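The abort path can be sketched the same way: every data file named in a commit file is deleted, then the temporary directory is removed. The class JobAbortSketch, the "*.commit" glob, and the assumption that commit files hold one data file path per line are hypothetical and for illustration only. The job status argument is omitted because this sketch does not use it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical sketch of a job-level abort; not the Iceberg implementation. */
final class JobAbortSketch {

  private JobAbortSketch() {}

  /** Deletes data files listed in commit files and returns how many were removed. */
  static int abortJob(Path jobTempDir) throws IOException {
    int deleted = 0;
    // Only data files that already have a commit file entry are removed here.
    try (DirectoryStream<Path> commitFiles =
        Files.newDirectoryStream(jobTempDir, "*.commit")) {
      for (Path commitFile : commitFiles) {
        for (String dataFile : Files.readAllLines(commitFile, StandardCharsets.UTF_8)) {
          // deleteIfExists tolerates files that were already cleaned up.
          if (Files.deleteIfExists(Paths.get(dataFile))) {
            deleted++;
          }
        }
      }
    }

    // Final cleanup: remove the temporary directory, children before parents.
    try (Stream<Path> walk = Files.walk(jobTempDir)) {
      List<Path> toDelete = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
      for (Path path : toDelete) {
        Files.delete(path);
      }
    }
    return deleted;
  }
}
```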
-