Class BroadcastableTokenPartitioner

  • All Implemented Interfaces:
    java.io.Serializable

    public final class BroadcastableTokenPartitioner
    extends java.lang.Object
    implements java.io.Serializable
    Broadcastable wrapper for TokenPartitioner with ZERO transient fields to optimize Spark broadcasting.

    Contains only the partition mappings; executors use them to reconstruct the TokenPartitioner.

    Why having ZERO transient fields matters:
    Spark's SizeEstimator uses reflection to estimate object sizes before broadcasting. A transient field is skipped by Java serialization, but it still forces SizeEstimator to inspect the field's type hierarchy, which is expensive. Logger references are particularly costly because of their deep object graphs (appenders, layouts, contexts). By eliminating ALL transient fields and Logger references, we:

    • Minimize SizeEstimator reflection overhead during broadcast preparation
    • Reduce broadcast variable serialization size
    • Avoid accidental serialization of non-serializable objects
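
    The pattern described above can be illustrated with a minimal sketch. It does not reproduce this class: PartitionMappings, ZeroTransientSketch, countTransientFields, and roundTrip are hypothetical names, and the map keyed by an upper token bound is a simplified stand-in for the real Range<BigInteger> keys. The sketch only shows a holder with plain data fields, no Logger, and no transient fields surviving a Java serialization round trip.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.math.BigInteger;
import java.util.LinkedHashMap;
import java.util.Map;

public class ZeroTransientSketch {

    // Hypothetical stand-in for a broadcastable wrapper: plain data only,
    // with no Logger field and no transient fields.
    public static final class PartitionMappings implements Serializable {
        private static final long serialVersionUID = 1L;

        // Assumption: upper token bound -> partition id (simplified key type).
        private final LinkedHashMap<BigInteger, Integer> upperBoundToPartition;
        private final int numSplits;

        public PartitionMappings(Map<BigInteger, Integer> upperBoundToPartition, int numSplits) {
            // Defensive copy so the holder owns its data.
            this.upperBoundToPartition = new LinkedHashMap<>(upperBoundToPartition);
            this.numSplits = numSplits;
        }

        public Map<BigInteger, Integer> entries() {
            return upperBoundToPartition;
        }

        public int numSplits() {
            return numSplits;
        }
    }

    // Counts the fields declared directly on the class that carry the transient modifier.
    public static int countTransientFields(Class<?> type) {
        int count = 0;
        for (Field field : type.getDeclaredFields()) {
            if (Modifier.isTransient(field.getModifiers())) {
                count++;
            }
        }
        return count;
    }

    // Serializes and deserializes a value with standard Java serialization.
    @SuppressWarnings("unchecked")
    public static <T extends Serializable> T roundTrip(T value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (T) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<BigInteger, Integer> bounds = new LinkedHashMap<>();
        bounds.put(BigInteger.valueOf(100), 0);
        bounds.put(BigInteger.valueOf(200), 1);

        PartitionMappings original = new PartitionMappings(bounds, 2);
        PartitionMappings copy = roundTrip(original);

        System.out.println("transient fields: " + countTransientFields(PartitionMappings.class));
        System.out.println("round trip equal: " + copy.entries().equals(original.entries()));
    }
}
```

    A reflection check like countTransientFields could also serve as a unit-test guard against a transient field or Logger being added to such a holder later.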
    See Also:
    Serialized Form
    • Method Detail

      • from

        public static BroadcastableTokenPartitioner from(@NotNull TokenPartitioner source)
        Creates a BroadcastableTokenPartitioner from a TokenPartitioner. Extracts only the partition mappings, so the source's Logger is not carried into the broadcast.
        Parameters:
        source - the source TokenPartitioner
      • getPartitionEntries

        public java.util.Map<com.google.common.collect.Range<java.math.BigInteger>, java.lang.Integer> getPartitionEntries()
      • numSplits

        public java.lang.Integer numSplits()
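
        As a rough illustration of how an executor might use the range-to-partition mappings, the sketch below resolves a token to a partition id. Everything in it is an assumption: TokenRange is a hypothetical stand-in for Guava's Range<BigInteger>, ranges are assumed to be open-closed, (lower, upper], and partitionFor is an illustrative helper, not a method of this class or of TokenPartitioner.

```java
import java.math.BigInteger;
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionLookupSketch {

    // Hypothetical stand-in for Range<BigInteger>; assumed open-closed: (lower, upper].
    public static final class TokenRange {
        final BigInteger lower;
        final BigInteger upper;

        public TokenRange(BigInteger lower, BigInteger upper) {
            this.lower = lower;
            this.upper = upper;
        }

        boolean contains(BigInteger token) {
            return token.compareTo(lower) > 0 && token.compareTo(upper) <= 0;
        }
    }

    // Linear scan over the entries; returns -1 when no range covers the token.
    public static int partitionFor(Map<TokenRange, Integer> entries, BigInteger token) {
        for (Map.Entry<TokenRange, Integer> entry : entries.entrySet()) {
            if (entry.getKey().contains(token)) {
                return entry.getValue();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        Map<TokenRange, Integer> entries = new LinkedHashMap<>();
        entries.put(new TokenRange(BigInteger.valueOf(-100), BigInteger.ZERO), 0);
        entries.put(new TokenRange(BigInteger.ZERO, BigInteger.valueOf(100)), 1);

        System.out.println(partitionFor(entries, BigInteger.valueOf(42)));
    }
}
```

        A linear scan keeps the sketch short; with many ranges, a sorted structure keyed by range bound would be the more usual choice.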