Class ZOrderByteUtils


  • public class ZOrderByteUtils
    extends java.lang.Object
    Within Z-Ordering the byte representations of objects being compared must be ordered, this requires several types to be transformed when converted to bytes. The goal is to map object's whose byte representation are not lexicographically ordered into representations that are lexicographically ordered. Bytes produced should be compared lexicographically as unsigned bytes, big-endian.

    All types except for String are stored within an 8 Byte Buffer

    Most of these techniques are derived from https://aws.amazon.com/blogs/database/z-order-indexing-for-multifaceted-queries-in-amazon-dynamodb-part-2/

    Some implementation is taken from https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/util/OrderedBytes.java

    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static java.nio.ByteBuffer byteTruncateOrFill​(byte[] val, int length, java.nio.ByteBuffer reuse)
      Return a bytebuffer with the given bytes truncated to length, or filled with 0's to length depending on whether the given bytes are larger or smaller than the given length.
      static java.nio.ByteBuffer doubleToOrderedBytes​(double val, java.nio.ByteBuffer reuse)
      Doubles are treated the same as floats in floatToOrderedBytes(float, ByteBuffer)
      static java.nio.ByteBuffer floatToOrderedBytes​(float val, java.nio.ByteBuffer reuse)
      IEEE 754 : “If two floating-point numbers in the same format are ordered (say, x < y), they are ordered the same way when their bits are reinterpreted as sign-magnitude integers.” Which means floats can be treated as sign magnitude integers which can then be converted into lexicographically comparable bytes
      static byte[] interleaveBits​(byte[][] columnsBinary, int interleavedSize, java.nio.ByteBuffer reuse)
      Interleave bits using a naive loop.
      static java.nio.ByteBuffer intToOrderedBytes​(int val, java.nio.ByteBuffer reuse)
      Signed ints do not have their bytes in magnitude order because of the sign bit.
      static java.nio.ByteBuffer longToOrderedBytes​(long val, java.nio.ByteBuffer reuse)
      Signed longs are treated the same as the signed ints in intToOrderedBytes(int, ByteBuffer)
      static java.nio.ByteBuffer shortToOrderedBytes​(short val, java.nio.ByteBuffer reuse)
      Signed shorts are treated the same as the signed ints in intToOrderedBytes(int, ByteBuffer)
      static java.nio.ByteBuffer stringToOrderedBytes​(java.lang.String val, int length, java.nio.ByteBuffer reuse, java.nio.charset.CharsetEncoder encoder)
      Strings are lexicographically sortable BUT if different byte array lengths will ruin the Z-Ordering.
      static java.nio.ByteBuffer tinyintToOrderedBytes​(byte val, java.nio.ByteBuffer reuse)
      Signed tiny ints are treated the same as the signed ints in intToOrderedBytes(int, ByteBuffer)
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

    • Method Detail

      • intToOrderedBytes

        public static java.nio.ByteBuffer intToOrderedBytes​(int val,
                                                            java.nio.ByteBuffer reuse)
        Signed ints do not have their bytes in magnitude order because of the sign bit. To fix this, flip the sign bit so that all negatives are ordered before positives. This essentially shifts the 0 value so that we don't break our ordering when we cross the new 0 value.
      • longToOrderedBytes

        public static java.nio.ByteBuffer longToOrderedBytes​(long val,
                                                             java.nio.ByteBuffer reuse)
        Signed longs are treated the same as the signed ints in intToOrderedBytes(int, ByteBuffer)
      • shortToOrderedBytes

        public static java.nio.ByteBuffer shortToOrderedBytes​(short val,
                                                              java.nio.ByteBuffer reuse)
        Signed shorts are treated the same as the signed ints in intToOrderedBytes(int, ByteBuffer)
      • tinyintToOrderedBytes

        public static java.nio.ByteBuffer tinyintToOrderedBytes​(byte val,
                                                                java.nio.ByteBuffer reuse)
        Signed tiny ints are treated the same as the signed ints in intToOrderedBytes(int, ByteBuffer)
      • floatToOrderedBytes

        public static java.nio.ByteBuffer floatToOrderedBytes​(float val,
                                                              java.nio.ByteBuffer reuse)
        IEEE 754 : “If two floating-point numbers in the same format are ordered (say, x < y), they are ordered the same way when their bits are reinterpreted as sign-magnitude integers.” Which means floats can be treated as sign magnitude integers which can then be converted into lexicographically comparable bytes
      • doubleToOrderedBytes

        public static java.nio.ByteBuffer doubleToOrderedBytes​(double val,
                                                               java.nio.ByteBuffer reuse)
        Doubles are treated the same as floats in floatToOrderedBytes(float, ByteBuffer)
      • stringToOrderedBytes

        public static java.nio.ByteBuffer stringToOrderedBytes​(java.lang.String val,
                                                               int length,
                                                               java.nio.ByteBuffer reuse,
                                                               java.nio.charset.CharsetEncoder encoder)
        Strings are lexicographically sortable BUT if different byte array lengths will ruin the Z-Ordering. (ZOrder requires that a given column contribute the same number of bytes every time). This implementation just uses a set size to for all output byte representations. Truncating longer strings and right padding 0 for shorter strings.
      • byteTruncateOrFill

        public static java.nio.ByteBuffer byteTruncateOrFill​(byte[] val,
                                                             int length,
                                                             java.nio.ByteBuffer reuse)
        Return a bytebuffer with the given bytes truncated to length, or filled with 0's to length depending on whether the given bytes are larger or smaller than the given length.
      • interleaveBits

        public static byte[] interleaveBits​(byte[][] columnsBinary,
                                            int interleavedSize,
                                            java.nio.ByteBuffer reuse)
        Interleave bits using a naive loop. Variable length inputs are allowed but to get a consistent ordering it is required that every column contribute the same number of bytes in each invocation. Bits are interleaved from all columns that have a bit available at that position. Once a Column has no more bits to produce it is skipped in the interleaving.
        Parameters:
        columnsBinary - an array of ordered byte representations of the columns being ZOrdered
        interleavedSize - the number of bytes to use in the output
        Returns:
        the columnbytes interleaved