Class SparkExecutorCache
- java.lang.Object
-
- org.apache.iceberg.spark.SparkExecutorCache
-
public class SparkExecutorCache extends java.lang.Object
An executor cache for reducing the computation and IO overhead in tasks.The cache is configured and controlled through Spark SQL properties. It supports both limits on the total cache size and maximum size for individual entries. Additionally, it implements automatic eviction of entries after a specified duration of inactivity. The cache will respect the SQL configuration valid at the time of initialization. All subsequent changes to the configuration will have no effect.
The cache is accessed and populated via
getOrLoad(String, String, Supplier, long)
. If the value is not present in the cache, it is computed using the provided supplier and stored in the cache, subject to the defined size constraints. When a key is added, it must be associated with a particular group ID. Once the group is no longer needed, it is recommended to explicitly invalidate its state by callinginvalidate(String)
instead of relying on automatic eviction.Note that this class employs the singleton pattern to ensure only one cache exists per JVM.
-
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description static SparkExecutorCache
get()
Returns the cache if already created or null otherwise.static SparkExecutorCache
getOrCreate()
Returns the cache if created or creates and returns it.<V> V
getOrLoad(java.lang.String group, java.lang.String key, java.util.function.Supplier<V> valueSupplier, long valueSize)
Gets the cached value for the key or populates the cache with a new mapping.void
invalidate(java.lang.String group)
Invalidates all keys associated with the given group ID.long
maxEntrySize()
Returns the max entry size in bytes that will be considered for caching.
-
-
-
Method Detail
-
getOrCreate
public static SparkExecutorCache getOrCreate()
Returns the cache if created or creates and returns it.Note this method returns null if caching is disabled.
-
get
public static SparkExecutorCache get()
Returns the cache if already created or null otherwise.
-
maxEntrySize
public long maxEntrySize()
Returns the max entry size in bytes that will be considered for caching.
-
getOrLoad
public <V> V getOrLoad(java.lang.String group, java.lang.String key, java.util.function.Supplier<V> valueSupplier, long valueSize)
Gets the cached value for the key or populates the cache with a new mapping.- Parameters:
group
- a group IDkey
- a cache keyvalueSupplier
- a supplier to compute the valuevalueSize
- an estimated memory size of the value in bytes- Returns:
- the cached or computed value
-
invalidate
public void invalidate(java.lang.String group)
Invalidates all keys associated with the given group ID.- Parameters:
group
- a group ID
-
-