Class RewritePositionDeleteFilesProcedure

java.lang.Object
org.apache.iceberg.spark.procedures.RewritePositionDeleteFilesProcedure
All Implemented Interfaces:
Procedure

public class RewritePositionDeleteFilesProcedure extends Object
A procedure that rewrites position delete files in a table.
See Also:
  • Field Details Link icon

    • STRING_MAP Link icon

      protected static final org.apache.spark.sql.types.DataType STRING_MAP
    • STRING_ARRAY Link icon

      protected static final org.apache.spark.sql.types.DataType STRING_ARRAY
  • Method Details Link icon

    • builder Link icon

      public static SparkProcedures.ProcedureBuilder builder()
    • parameters Link icon

      public ProcedureParameter[] parameters()
      Description copied from interface: Procedure
      Returns the input parameters of this procedure.
    • outputType Link icon

      public org.apache.spark.sql.types.StructType outputType()
      Description copied from interface: Procedure
      Returns the type of rows produced by this procedure.
    • call Link icon

      public org.apache.spark.sql.catalyst.InternalRow[] call(org.apache.spark.sql.catalyst.InternalRow args)
      Description copied from interface: Procedure
      Executes this procedure.

      Spark will align the provided arguments according to the input parameters defined in Procedure.parameters() either by position or by name before execution.

      Implementations may provide a summary of execution by returning one or many rows as a result. The schema of output rows must match the defined output type in Procedure.outputType().

      Parameters:
      args - input arguments
      Returns:
      the result of executing this procedure with the given arguments
    • description Link icon

      public String description()
      Description copied from interface: Procedure
      Returns the description of this procedure.
    • spark Link icon

      protected org.apache.spark.sql.SparkSession spark()
    • actions Link icon

      protected SparkActions actions()
    • tableCatalog Link icon

      protected org.apache.spark.sql.connector.catalog.TableCatalog tableCatalog()
    • modifyIcebergTable Link icon

      protected <T> T modifyIcebergTable(org.apache.spark.sql.connector.catalog.Identifier ident, Function<Table,T> func)
    • withIcebergTable Link icon

      protected <T> T withIcebergTable(org.apache.spark.sql.connector.catalog.Identifier ident, Function<Table,T> func)
    • toIdentifier Link icon

      protected org.apache.spark.sql.connector.catalog.Identifier toIdentifier(String identifierAsString, String argName)
    • toCatalogAndIdentifier Link icon

      protected Spark3Util.CatalogAndIdentifier toCatalogAndIdentifier(String identifierAsString, String argName, org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)
    • loadSparkTable Link icon

      protected SparkTable loadSparkTable(org.apache.spark.sql.connector.catalog.Identifier ident)
    • loadRows Link icon

      protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadRows(org.apache.spark.sql.connector.catalog.Identifier tableIdent, Map<String,String> options)
    • refreshSparkCache Link icon

      protected void refreshSparkCache(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.Table table)
    • filterExpression Link icon

      protected Expression filterExpression(org.apache.spark.sql.connector.catalog.Identifier ident, String where)
    • newInternalRow Link icon

      protected org.apache.spark.sql.catalyst.InternalRow newInternalRow(Object... values)
    • closeService Link icon

      protected void closeService()
      Closes this procedure's executor service if a new one was created with BaseProcedure.executorService(int, String). Does not block for any remaining tasks.
    • executorService Link icon

      protected ExecutorService executorService(int threadPoolSize, String nameFormat)
      Starts a new executor service which can be used by this procedure in its work. The pool will be automatically shut down if withIcebergTable(Identifier, Function) or modifyIcebergTable(Identifier, Function) are called. If these methods are not used then the service can be shut down with closeService() or left to be closed when this class is finalized.
      Parameters:
      threadPoolSize - number of threads in the service
      nameFormat - name prefix for threads created in this service
      Returns:
      the new executor service owned by this procedure