Class AncestorsOfProcedure

  • All Implemented Interfaces:
    Procedure

    public class AncestorsOfProcedure
    extends java.lang.Object
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected static org.apache.spark.sql.types.DataType STRING_ARRAY  
      protected static org.apache.spark.sql.types.DataType STRING_MAP  
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      protected SparkActions actions()  
      static SparkProcedures.ProcedureBuilder builder()  
      org.apache.spark.sql.catalyst.InternalRow[] call​(org.apache.spark.sql.catalyst.InternalRow args)
      Executes this procedure.
      protected void closeService()
      Closes this procedure's executor service if a new one was created with executorService(int, String).
      java.lang.String description()
      Returns the description of this procedure.
      protected java.util.concurrent.ExecutorService executorService​(int threadPoolSize, java.lang.String nameFormat)
      Starts a new executor service which can be used by this procedure in its work.
      protected Expression filterExpression​(org.apache.spark.sql.connector.catalog.Identifier ident, java.lang.String where)  
      protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadRows​(org.apache.spark.sql.connector.catalog.Identifier tableIdent, java.util.Map<java.lang.String,​java.lang.String> options)  
      protected SparkTable loadSparkTable​(org.apache.spark.sql.connector.catalog.Identifier ident)  
      protected <T> T modifyIcebergTable​(org.apache.spark.sql.connector.catalog.Identifier ident, java.util.function.Function<Table,​T> func)  
      protected org.apache.spark.sql.catalyst.InternalRow newInternalRow​(java.lang.Object... values)  
      org.apache.spark.sql.types.StructType outputType()
      Returns the type of rows produced by this procedure.
      ProcedureParameter[] parameters()
      Returns the input parameters of this procedure.
      protected void refreshSparkCache​(org.apache.spark.sql.connector.catalog.Identifier ident, org.apache.spark.sql.connector.catalog.Table table)  
      protected org.apache.spark.sql.SparkSession spark()  
      protected org.apache.spark.sql.connector.catalog.TableCatalog tableCatalog()  
      protected Spark3Util.CatalogAndIdentifier toCatalogAndIdentifier​(java.lang.String identifierAsString, java.lang.String argName, org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)  
      protected org.apache.spark.sql.connector.catalog.Identifier toIdentifier​(java.lang.String identifierAsString, java.lang.String argName)  
      protected <T> T withIcebergTable​(org.apache.spark.sql.connector.catalog.Identifier ident, java.util.function.Function<Table,​T> func)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • STRING_MAP

        protected static final org.apache.spark.sql.types.DataType STRING_MAP
      • STRING_ARRAY

        protected static final org.apache.spark.sql.types.DataType STRING_ARRAY
    • Method Detail

      • parameters

        public ProcedureParameter[] parameters()
        Description copied from interface: Procedure
        Returns the input parameters of this procedure.
      • outputType

        public org.apache.spark.sql.types.StructType outputType()
        Description copied from interface: Procedure
        Returns the type of rows produced by this procedure.
      • call

        public org.apache.spark.sql.catalyst.InternalRow[] call​(org.apache.spark.sql.catalyst.InternalRow args)
        Description copied from interface: Procedure
        Executes this procedure.

        Spark will align the provided arguments according to the input parameters defined in Procedure.parameters() either by position or by name before execution.

        Implementations may provide a summary of execution by returning one or many rows as a result. The schema of output rows must match the defined output type in Procedure.outputType().

        Parameters:
        args - input arguments
        Returns:
        the result of executing this procedure with the given arguments
      • description

        public java.lang.String description()
        Description copied from interface: Procedure
        Returns the description of this procedure.
      • spark

        protected org.apache.spark.sql.SparkSession spark()
      • tableCatalog

        protected org.apache.spark.sql.connector.catalog.TableCatalog tableCatalog()
      • modifyIcebergTable

        protected <T> T modifyIcebergTable​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                           java.util.function.Function<Table,​T> func)
      • withIcebergTable

        protected <T> T withIcebergTable​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                         java.util.function.Function<Table,​T> func)
      • toIdentifier

        protected org.apache.spark.sql.connector.catalog.Identifier toIdentifier​(java.lang.String identifierAsString,
                                                                                 java.lang.String argName)
      • toCatalogAndIdentifier

        protected Spark3Util.CatalogAndIdentifier toCatalogAndIdentifier​(java.lang.String identifierAsString,
                                                                         java.lang.String argName,
                                                                         org.apache.spark.sql.connector.catalog.CatalogPlugin catalog)
      • loadSparkTable

        protected SparkTable loadSparkTable​(org.apache.spark.sql.connector.catalog.Identifier ident)
      • loadRows

        protected org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> loadRows​(org.apache.spark.sql.connector.catalog.Identifier tableIdent,
                                                                                  java.util.Map<java.lang.String,​java.lang.String> options)
      • refreshSparkCache

        protected void refreshSparkCache​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                         org.apache.spark.sql.connector.catalog.Table table)
      • filterExpression

        protected Expression filterExpression​(org.apache.spark.sql.connector.catalog.Identifier ident,
                                              java.lang.String where)
      • newInternalRow

        protected org.apache.spark.sql.catalyst.InternalRow newInternalRow​(java.lang.Object... values)
      • closeService

        protected void closeService()
        Closes this procedure's executor service if a new one was created with executorService(int, String). Does not block for any remaining tasks.
      • executorService

        protected java.util.concurrent.ExecutorService executorService​(int threadPoolSize,
                                                                       java.lang.String nameFormat)
        Starts a new executor service which can be used by this procedure in its work. The pool will be automatically shut down if withIcebergTable(Identifier, Function) or modifyIcebergTable(Identifier, Function) are called. If these methods are not used then the service can be shut down with closeService() or left to be closed when this class is finalized.
        Parameters:
        threadPoolSize - number of threads in the service
        nameFormat - name prefix for threads created in this service
        Returns:
        the new executor service owned by this procedure