Package org.apache.iceberg.util
Class UnicodeUtil
java.lang.Object
    org.apache.iceberg.util.UnicodeUtil

public class UnicodeUtil extends java.lang.Object
Method Summary
Modifier and Type | Method | Description
static boolean | isCharHighSurrogate(char ch) | Determines whether the given character value is a Unicode high-surrogate code unit.
static java.lang.CharSequence | truncateString(java.lang.CharSequence input, int length) | Truncates the input CharSequence so that the result is a valid Unicode string and the number of Unicode characters in it is less than or equal to length.
static Literal<java.lang.CharSequence> | truncateStringMax(Literal<java.lang.CharSequence> input, int length) | Returns a valid Unicode CharSequence that is greater than the given input and whose number of Unicode characters is less than or equal to length.
static Literal<java.lang.CharSequence> | truncateStringMin(Literal<java.lang.CharSequence> input, int length) | Returns a valid Unicode CharSequence that is lower than the given input and whose number of Unicode characters is less than or equal to length.
Method Detail
-
isCharHighSurrogate
public static boolean isCharHighSurrogate(char ch)
Determines whether the given character value is a Unicode high-surrogate code unit. High surrogates occupy the range 0xD800 to 0xDBFF.
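The range check described above can be sketched with the JDK alone. This is an illustration, not the Iceberg source: the class HighSurrogateSketch and the helper isHighSurrogateUnit are hypothetical names introduced for this example.

```java
// Illustrative sketch only; class and method names are hypothetical, not Iceberg's.
final class HighSurrogateSketch {

  // Returns true when ch lies in the high-surrogate range 0xD800 to 0xDBFF.
  static boolean isHighSurrogateUnit(char ch) {
    return ch >= 0xD800 && ch <= 0xDBFF;
  }

  public static void main(String[] args) {
    // 'a' followed by U+1F600, which UTF-16 encodes as a high/low surrogate pair.
    String s = "a\uD83D\uDE00";
    for (int i = 0; i < s.length(); i++) {
      char ch = s.charAt(i);
      System.out.println(Integer.toHexString(ch) + " high surrogate: " + isHighSurrogateUnit(ch));
    }
  }
}
```

The JDK's own Character.isHighSurrogate(char) tests the same range, so a check like this is mainly useful for seeing what the method is expected to report.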
-
truncateString
public static java.lang.CharSequence truncateString(java.lang.CharSequence input, int length)
Truncates the input CharSequence so that the result is a valid Unicode string and the number of Unicode characters in it is less than or equal to length.
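Truncating on code-point boundaries, rather than on char indexes, is what keeps the result valid: a surrogate pair is never split. A minimal JDK-only sketch of that idea follows. The class TruncateSketch and the helper truncateByCodePoints are hypothetical, and rejecting a non-positive length is an assumption of this sketch, not documented behavior of truncateString.

```java
// Illustrative sketch only; not the Iceberg implementation.
final class TruncateSketch {

  // Keeps at most maxCodePoints Unicode code points of input without splitting a surrogate pair.
  static CharSequence truncateByCodePoints(CharSequence input, int maxCodePoints) {
    if (maxCodePoints <= 0) {
      // Assumption of this sketch: a non-positive length is treated as a caller error.
      throw new IllegalArgumentException("maxCodePoints must be positive");
    }
    String s = input.toString();
    if (s.codePointCount(0, s.length()) <= maxCodePoints) {
      return s; // already short enough, nothing to cut
    }
    // offsetByCodePoints maps a code-point count to a char index on a code-point boundary.
    int end = s.offsetByCodePoints(0, maxCodePoints);
    return s.substring(0, end);
  }

  public static void main(String[] args) {
    // 'a', U+1F600 (two chars in UTF-16), 'b': three code points, four chars.
    String s = "a\uD83D\uDE00b";
    System.out.println(truncateByCodePoints(s, 2).length());
  }
}
```

A plain substring(0, 2) on the same input would end with an unpaired high surrogate, which is the case the code-point-aware version avoids.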
-
truncateStringMin
public static Literal<java.lang.CharSequence> truncateStringMin(Literal<java.lang.CharSequence> input, int length)
Returns a valid Unicode CharSequence that is lower than the given input and whose number of Unicode characters is less than or equal to length.
-
truncateStringMax
public static Literal<java.lang.CharSequence> truncateStringMax(Literal<java.lang.CharSequence> input, int length)
Returns a valid Unicode CharSequence that is greater than the given input and whose number of Unicode characters is less than or equal to length.
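One way to obtain such an upper bound is to truncate on a code-point boundary and then increment the last code point that can still be incremented. The sketch below shows that approach using only the JDK. It is not the Iceberg implementation. The class UpperBoundSketch and the helper truncatedUpperBound are hypothetical, the sketch works on String instead of Literal, it assumes values are ordered by code point, and returning null when no bound exists is a choice made here for illustration.

```java
// Illustrative sketch only; names and the null-return convention are assumptions.
final class UpperBoundSketch {

  // Returns a string of at most maxCodePoints code points that is greater than or equal to input
  // under code-point ordering, or null when no such string can be built by incrementing.
  static String truncatedUpperBound(String input, int maxCodePoints) {
    if (maxCodePoints <= 0) {
      throw new IllegalArgumentException("maxCodePoints must be positive");
    }
    if (input.codePointCount(0, input.length()) <= maxCodePoints) {
      return input; // nothing is cut off, so the input bounds itself
    }
    StringBuilder sb =
        new StringBuilder(input.substring(0, input.offsetByCodePoints(0, maxCodePoints)));
    // Walk backwards until a code point is found that can be incremented.
    for (int i = maxCodePoints - 1; i >= 0; i--) {
      int offset = sb.offsetByCodePoints(0, i);
      int next = sb.codePointAt(offset) + 1;
      // Skip the surrogate block so the result never contains an unpaired surrogate.
      if (next >= Character.MIN_SURROGATE && next <= Character.MAX_SURROGATE) {
        next = Character.MAX_SURROGATE + 1;
      }
      if (Character.isValidCodePoint(next)) {
        sb.setLength(offset);     // drop the old code point and everything after it
        sb.appendCodePoint(next); // replace it with its successor
        return sb.toString();
      }
    }
    return null; // every kept code point was already U+10FFFF
  }

  public static void main(String[] args) {
    System.out.println(truncatedUpperBound("hello world", 5));
  }
}
```

For example, truncating "hello world" to 5 code points keeps "hello" and bumps the final 'o' to 'p', giving "hellp", which sorts after every string that starts with "hello".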