Package org.apache.iceberg.util
Class UnicodeUtil
java.lang.Object
org.apache.iceberg.util.UnicodeUtil
Method Summary
Modifier and Type, Method, Description
static boolean isCharHighSurrogate(char ch)
    Determines if the given character value is a Unicode high-surrogate code unit.
static CharSequence truncateString(CharSequence input, int length)
    Truncates the input CharSequence such that the result is a valid Unicode string and the number of Unicode characters in the result is less than or equal to length.
static String truncateStringMax(String input, int length)
    Returns a valid String that is greater than the given input such that the number of Unicode characters in the returned String is less than or equal to length.
static Literal<CharSequence> truncateStringMax(Literal<CharSequence> input, int length)
    Returns a valid Unicode CharSequence that is greater than the given input such that the number of Unicode characters in the returned CharSequence is less than or equal to length.
static String truncateStringMin(String input, int length)
    Returns a valid String that is lower than the given input such that the number of Unicode characters in the returned String is less than or equal to length.
static Literal<CharSequence> truncateStringMin(Literal<CharSequence> input, int length)
    Returns a valid Unicode CharSequence that is lower than the given input such that the number of Unicode characters in the returned CharSequence is less than or equal to length.
Method Details
isCharHighSurrogate
public static boolean isCharHighSurrogate(char ch)
Determines if the given character value is a Unicode high-surrogate code unit. The range of high surrogates is 0xD800 to 0xDBFF.
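The range check described above can be illustrated with the JDK alone. The following is a minimal sketch, not the Iceberg source: the class HighSurrogateSketch and its method isHighSurrogate are hypothetical names, and the comparison with Character.isHighSurrogate is included only as a cross-check.

```java
final class HighSurrogateSketch {
  private HighSurrogateSketch() {}

  // Hypothetical stand-in for the documented check: a char is a high surrogate
  // when it falls in the range 0xD800 to 0xDBFF inclusive.
  static boolean isHighSurrogate(char ch) {
    return ch >= 0xD800 && ch <= 0xDBFF;
  }

  public static void main(String[] args) {
    // 'a' followed by U+1F600, which Java stores as a surrogate pair (two chars).
    String s = "a" + new String(Character.toChars(0x1F600));
    for (int i = 0; i < s.length(); i++) {
      char ch = s.charAt(i);
      // Print our range check next to the JDK's own answer for comparison.
      System.out.printf(
          "U+%04X sketch=%b jdk=%b%n",
          (int) ch, isHighSurrogate(ch), Character.isHighSurrogate(ch));
    }
  }
}
```

A high surrogate is only the first half of a pair, so a string that ends on one is not valid Unicode; that is why truncation needs this kind of check.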
truncateString
public static CharSequence truncateString(CharSequence input, int length)
Truncates the input CharSequence such that the result is a valid Unicode string and the number of Unicode characters in the result is less than or equal to length.
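To show why truncation counts Unicode characters rather than chars, here is a minimal sketch built only on java.lang.String. The class TruncateSketch and the method truncateByCodePoints are hypothetical, and rejecting a non-positive length is an assumption of this sketch rather than documented behavior.

```java
final class TruncateSketch {
  private TruncateSketch() {}

  // Hypothetical helper: keeps at most `length` code points of `input`,
  // so a surrogate pair is never split in the middle.
  static String truncateByCodePoints(String input, int length) {
    if (length <= 0) {
      // Assumption of this sketch: a non-positive length is rejected.
      throw new IllegalArgumentException("length must be positive");
    }
    if (input.codePointCount(0, input.length()) <= length) {
      return input; // already short enough, nothing to cut
    }
    // Translate a code point count into a char index, then cut there.
    int end = input.offsetByCodePoints(0, length);
    return input.substring(0, end);
  }

  public static void main(String[] args) {
    // 'a', then U+1F600 (two chars), then 'b': 3 code points, 4 chars.
    String s = "a" + new String(Character.toChars(0x1F600)) + "b";
    String safe = truncateByCodePoints(s, 2);
    String naive = s.substring(0, 2); // cuts between the two surrogates
    System.out.println(safe.codePointCount(0, safe.length()));
    System.out.println(Character.isHighSurrogate(naive.charAt(naive.length() - 1)));
  }
}
```

The naive substring ends on an unpaired high surrogate, while the code-point-aware version keeps the pair intact.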
truncateStringMin
public static Literal<CharSequence> truncateStringMin(Literal<CharSequence> input, int length)
Returns a valid Unicode CharSequence that is lower than the given input such that the number of Unicode characters in the returned CharSequence is less than or equal to length.
truncateStringMin
public static String truncateStringMin(String input, int length)
Returns a valid String that is lower than the given input such that the number of Unicode characters in the returned String is less than or equal to length.
truncateStringMax
public static Literal<CharSequence> truncateStringMax(Literal<CharSequence> input, int length)
Returns a valid Unicode CharSequence that is greater than the given input such that the number of Unicode characters in the returned CharSequence is less than or equal to length.
truncateStringMax
public static String truncateStringMax(String input, int length)
Returns a valid String that is greater than the given input such that the number of Unicode characters in the returned String is less than or equal to length.
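One way to obtain a short string that sorts above a longer input is to truncate it and then increment the last code point, falling back to earlier code points when the last one cannot be incremented. The sketch below illustrates that idea under stated assumptions and should not be read as the Iceberg implementation: the class UpperBoundSketch and the method upperBound are hypothetical, "greater" is taken to mean code point order, an input that already fits is returned unchanged, and null signals that no bound exists.

```java
final class UpperBoundSketch {
  private UpperBoundSketch() {}

  // Hypothetical helper: returns a string of at most `length` code points that
  // is greater than `input` in code point order, `input` itself when it already
  // fits, or null when no such string exists (assumptions of this sketch).
  static String upperBound(String input, int length) {
    if (length <= 0) {
      throw new IllegalArgumentException("length must be positive");
    }
    if (input.codePointCount(0, input.length()) <= length) {
      return input; // nothing was cut, so the input bounds itself
    }
    StringBuilder sb =
        new StringBuilder(input.substring(0, input.offsetByCodePoints(0, length)));
    // Walk backwards over code points, looking for one that can be incremented.
    int end = sb.length();
    while (end > 0) {
      int cp = sb.codePointBefore(end);
      int start = end - Character.charCount(cp);
      int next = cp + 1;
      // Skip the surrogate block so the result stays a valid Unicode string.
      if (next >= Character.MIN_SURROGATE && next <= Character.MAX_SURROGATE) {
        next = Character.MAX_SURROGATE + 1;
      }
      if (next <= Character.MAX_CODE_POINT) {
        sb.setLength(start);     // drop this code point and everything after it
        sb.appendCodePoint(next); // replace it with its successor
        return sb.toString();
      }
      end = start; // U+10FFFF cannot be incremented: try the previous code point
    }
    return null; // every code point was already the maximum
  }

  public static void main(String[] args) {
    System.out.println(upperBound("abcd", 2));
  }
}
```

For example, upperBound("abcd", 2) truncates to "ab" and increments the last character, giving "ac", which compares greater than "abcd" while holding only two characters.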