Class RemoveOrphanFilesProcedure
- java.lang.Object
- 
- org.apache.iceberg.spark.procedures.RemoveOrphanFilesProcedure
 
- 
- All Implemented Interfaces:
- Procedure
 
 public class RemoveOrphanFilesProcedure extends java.lang.ObjectA procedure that removes orphan files in a table.- See Also:
- SparkActions.deleteOrphanFiles(Table)
 
- 
- 
Field SummaryFields Modifier and Type Field Description protected static org.apache.spark.sql.types.DataTypeSTRING_ARRAYprotected static org.apache.spark.sql.types.DataTypeSTRING_MAP
 - 
Method SummaryAll Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description protected SparkActionsactions()static SparkProcedures.ProcedureBuilderbuilder()org.apache.spark.sql.catalyst.InternalRow[]call(org.apache.spark.sql.catalyst.InternalRow args)Executes this procedure.protected voidcloseService()Closes this procedure's executor service if a new one was created withBaseProcedure.executorService(int, String).java.lang.Stringdescription()Returns the description of this procedure.protected java.util.concurrent.ExecutorServiceexecutorService(int threadPoolSize, java.lang.String nameFormat)Starts a new executor service which can be used by this procedure in its work.protected ExpressionfilterExpression(org.apache.spark.sql.connector.catalog.Identifier ident, java.lang.String where)protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row>loadRows(org.apache.spark.sql.connector.catalog.Identifier tableIdent, java.util.Map<java.lang.String,java.lang.String> options)protected SparkTableloadSparkTable(org.apache.spark.sql.connector.catalog.Identifier ident)protected <T> TmodifyIcebergTable(org.apache.spark.sql.connector.catalog.Identifier ident, java.util.function.Function<Table,T> func)protected org.apache.spark.sql.catalyst.InternalRownewInternalRow(java.lang.Object... values)org.apache.spark.sql.types.StructTypeoutputType()Returns the type of rows produced by this procedure.ProcedureParameter[]parameters()Returns the input parameters of this procedure.protected voidrefreshSparkCache(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.Table table)protected org.apache.spark.sql.SparkSessionspark()protected org.apache.spark.sql.connector.catalog.TableCatalogtableCatalog()protected Spark3Util.CatalogAndIdentifiertoCatalogAndIdentifier(java.lang.String identifierAsString, java.lang.String argName, org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)protected org.apache.spark.sql.connector.catalog.IdentifiertoIdentifier(java.lang.String identifierAsString, java.lang.String argName)protected <T> TwithIcebergTable(org.apache.spark.sql.connector.catalog.Identifier ident, java.util.function.Function<Table,T> func)
 
- 
- 
- 
Method Detail- 
builderpublic static SparkProcedures.ProcedureBuilder builder() 
 - 
parameterspublic ProcedureParameter[] parameters() Description copied from interface:ProcedureReturns the input parameters of this procedure.
 - 
outputTypepublic org.apache.spark.sql.types.StructType outputType() Description copied from interface:ProcedureReturns the type of rows produced by this procedure.
 - 
callpublic org.apache.spark.sql.catalyst.InternalRow[] call(org.apache.spark.sql.catalyst.InternalRow args) Description copied from interface:ProcedureExecutes this procedure.Spark will align the provided arguments according to the input parameters defined in Procedure.parameters()either by position or by name before execution.Implementations may provide a summary of execution by returning one or many rows as a result. The schema of output rows must match the defined output type in Procedure.outputType().- Parameters:
- args- input arguments
- Returns:
- the result of executing this procedure with the given arguments
 
 - 
descriptionpublic java.lang.String description() Description copied from interface:ProcedureReturns the description of this procedure.
 - 
sparkprotected org.apache.spark.sql.SparkSession spark() 
 - 
actionsprotected SparkActions actions() 
 - 
tableCatalogprotected org.apache.spark.sql.connector.catalog.TableCatalog tableCatalog() 
 - 
modifyIcebergTableprotected <T> T modifyIcebergTable(org.apache.spark.sql.connector.catalog.Identifier ident, java.util.function.Function<Table,T> func)
 - 
withIcebergTableprotected <T> T withIcebergTable(org.apache.spark.sql.connector.catalog.Identifier ident, java.util.function.Function<Table,T> func)
 - 
toIdentifierprotected org.apache.spark.sql.connector.catalog.Identifier toIdentifier(java.lang.String identifierAsString, java.lang.String argName)
 - 
toCatalogAndIdentifierprotected Spark3Util.CatalogAndIdentifier toCatalogAndIdentifier(java.lang.String identifierAsString, java.lang.String argName, org.apache.spark.sql.connector.catalog.CatalogPlugin catalog) 
 - 
loadSparkTableprotected SparkTable loadSparkTable(org.apache.spark.sql.connector.catalog.Identifier ident) 
 - 
loadRowsprotected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadRows(org.apache.spark.sql.connector.catalog.Identifier tableIdent, java.util.Map<java.lang.String,java.lang.String> options)
 - 
refreshSparkCacheprotected void refreshSparkCache(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.Table table)
 - 
filterExpressionprotected Expression filterExpression(org.apache.spark.sql.connector.catalog.Identifier ident, java.lang.String where) 
 - 
newInternalRowprotected org.apache.spark.sql.catalyst.InternalRow newInternalRow(java.lang.Object... values) 
 - 
closeServiceprotected void closeService() Closes this procedure's executor service if a new one was created withBaseProcedure.executorService(int, String). Does not block for any remaining tasks.
 - 
executorServiceprotected java.util.concurrent.ExecutorService executorService(int threadPoolSize, java.lang.String nameFormat)Starts a new executor service which can be used by this procedure in its work. The pool will be automatically shut down ifwithIcebergTable(Identifier, Function)ormodifyIcebergTable(Identifier, Function)are called. If these methods are not used then the service can be shut down withcloseService()or left to be closed when this class is finalized.- Parameters:
- threadPoolSize- number of threads in the service
- nameFormat- name prefix for threads created in this service
- Returns:
- the new executor service owned by this procedure
 
 
- 
 
-