Class SparkExecutorCache


  • public class SparkExecutorCache
    extends java.lang.Object
    An executor cache for reducing the computation and IO overhead in tasks.

    The cache is configured and controlled through Spark SQL properties. It supports both limits on the total cache size and maximum size for individual entries. Additionally, it implements automatic eviction of entries after a specified duration of inactivity. The cache will respect the SQL configuration valid at the time of initialization. All subsequent changes to the configuration will have no effect.

    The cache is accessed and populated via getOrLoad(String, String, Supplier, long). If the value is not present in the cache, it is computed using the provided supplier and stored in the cache, subject to the defined size constraints. When a key is added, it must be associated with a particular group ID. Once the group is no longer needed, it is recommended to explicitly invalidate its state by calling invalidate(String) instead of relying on automatic eviction.

    Note that this class employs the singleton pattern to ensure only one cache exists per JVM.

    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      static SparkExecutorCache get()
      Returns the cache if already created or null otherwise.
      static SparkExecutorCache getOrCreate()
      Returns the cache if created or creates and returns it.
      <V> V getOrLoad​(java.lang.String group, java.lang.String key, java.util.function.Supplier<V> valueSupplier, long valueSize)
      Gets the cached value for the key or populates the cache with a new mapping.
      void invalidate​(java.lang.String group)
      Invalidates all keys associated with the given group ID.
      long maxEntrySize()
      Returns the max entry size in bytes that will be considered for caching.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • getOrCreate

        public static SparkExecutorCache getOrCreate()
        Returns the cache if created or creates and returns it.

        Note this method returns null if caching is disabled.

      • get

        public static SparkExecutorCache get()
        Returns the cache if already created or null otherwise.
      • maxEntrySize

        public long maxEntrySize()
        Returns the max entry size in bytes that will be considered for caching.
      • getOrLoad

        public <V> V getOrLoad​(java.lang.String group,
                               java.lang.String key,
                               java.util.function.Supplier<V> valueSupplier,
                               long valueSize)
        Gets the cached value for the key or populates the cache with a new mapping.
        Parameters:
        group - a group ID
        key - a cache key
        valueSupplier - a supplier to compute the value
        valueSize - an estimated memory size of the value in bytes
        Returns:
        the cached or computed value
      • invalidate

        public void invalidate​(java.lang.String group)
        Invalidates all keys associated with the given group ID.
        Parameters:
        group - a group ID