Package org.apache.iceberg.util
Class UnicodeUtil
java.lang.Object
org.apache.iceberg.util.UnicodeUtil
Method Summary
Modifier and Type | Method | Description
static boolean | isCharHighSurrogate(char ch) | Determines whether the given char value is a Unicode high-surrogate code unit.
static CharSequence | truncateString(CharSequence input, int length) | Truncates the input CharSequence so that the result is a valid Unicode string containing at most length Unicode characters.
static String | truncateStringMax(String input, int length) | Returns a valid String that is greater than the given input and contains at most length Unicode characters.
static Literal<CharSequence> | truncateStringMax(Literal<CharSequence> input, int length) | Returns a valid Unicode CharSequence that is greater than the given input and contains at most length Unicode characters.
static String | truncateStringMin(String input, int length) | Returns a valid String that is lower than the given input and contains at most length Unicode characters.
static Literal<CharSequence> | truncateStringMin(Literal<CharSequence> input, int length) | Returns a valid Unicode CharSequence that is lower than the given input and contains at most length Unicode characters.
Method Details
isCharHighSurrogate
public static boolean isCharHighSurrogate(char ch)
Determines whether the given char value is a Unicode high-surrogate code unit. High surrogates occupy the range 0xD800 to 0xDBFF.
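The range check described above can be sketched with the standard library alone. This is a minimal illustration, not Iceberg's source: the class HighSurrogateSketch and the method isHighSurrogate are hypothetical names introduced here.

```java
final class HighSurrogateSketch {
  // Hypothetical helper: true when ch lies in the high-surrogate range 0xD800..0xDBFF.
  static boolean isHighSurrogate(char ch) {
    return ch >= 0xD800 && ch <= 0xDBFF;
  }

  public static void main(String[] args) {
    // U+1F600 lies outside the BMP, so UTF-16 encodes it as a high and a low surrogate.
    String s = "a" + new String(Character.toChars(0x1F600));
    for (int i = 0; i < s.length(); i++) {
      System.out.println(i + ": " + isHighSurrogate(s.charAt(i)));
    }
  }
}
```

For the three chars in this string, only the one at index 1 is a high surrogate. The JDK method Character.isHighSurrogate(char) performs the same range test.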
truncateString
public static CharSequence truncateString(CharSequence input, int length)
Truncates the input CharSequence so that the result is a valid Unicode string containing at most length Unicode characters.
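To illustrate why truncation has to count Unicode characters rather than char values, here is a minimal sketch using only java.lang. It assumes "Unicode characters" means code points and that length must be positive. Both are assumptions of this sketch, and the class and method names are hypothetical, not Iceberg's implementation.

```java
final class TruncateByCodePointSketch {
  // Hypothetical helper: keeps at most `length` code points of the input.
  static String truncate(String input, int length) {
    if (length <= 0) {
      throw new IllegalArgumentException("length must be positive: " + length);
    }
    if (input.codePointCount(0, input.length()) <= length) {
      return input; // already short enough, nothing to cut
    }
    // offsetByCodePoints returns the char index that follows `length` code points,
    // so the cut never falls between a high and a low surrogate.
    return input.substring(0, input.offsetByCodePoints(0, length));
  }

  public static void main(String[] args) {
    String grin = new String(Character.toChars(0x1F600)); // one code point, two chars
    String cut = truncate("a" + grin + "b", 2);
    System.out.println(cut.length());                        // prints 3
    System.out.println(cut.codePointCount(0, cut.length())); // prints 2
  }
}
```

A plain substring(0, 2) on the same input would end on an unpaired high surrogate, which is not a valid Unicode string.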
truncateStringMin
public static Literal<CharSequence> truncateStringMin(Literal<CharSequence> input, int length)
Returns a valid Unicode CharSequence that is lower than the given input and contains at most length Unicode characters.
truncateStringMin
public static String truncateStringMin(String input, int length)
Returns a valid String that is lower than the given input and contains at most length Unicode characters.
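One way to obtain a shorter value that does not sort above the input is to take a prefix, since a prefix never compares greater than the full string. The sketch below assumes that approach and counts code points. It is a hypothetical illustration with invented names, not a statement of how Iceberg computes the result.

```java
final class LowerBoundSketch {
  // Hypothetical helper: a prefix of at most `length` code points, usable as a lower bound.
  static String lowerBound(String input, int length) {
    if (length <= 0) {
      throw new IllegalArgumentException("length must be positive: " + length);
    }
    int codePoints = input.codePointCount(0, input.length());
    return codePoints <= length
        ? input
        : input.substring(0, input.offsetByCodePoints(0, length));
  }

  public static void main(String[] args) {
    String value = "iceberg";
    String bound = lowerBound(value, 3);
    // A prefix never compares greater than the string it was cut from.
    System.out.println(bound + " <= " + value + ": " + (bound.compareTo(value) <= 0));
  }
}
```

Such a bound is useful when a full value is too long to store but a value known not to exceed it is sufficient, for example when recording a minimum for a range of strings.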
truncateStringMax
public static Literal<CharSequence> truncateStringMax(Literal<CharSequence> input, int length)
Returns a valid Unicode CharSequence that is greater than the given input and contains at most length Unicode characters.
truncateStringMax
public static String truncateStringMax(String input, int length)
Returns a valid String that is greater than the given input and contains at most length Unicode characters.
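A truncated value that sorts above the input cannot be a plain prefix. One common technique is to truncate and then increment the last code point that can be incremented, dropping everything after it. The sketch below assumes that technique, orders strings by code point, and returns null when no such value exists. These choices and all names are assumptions made for illustration, not Iceberg's documented behavior.

```java
final class UpperBoundSketch {
  // Hypothetical helper: a string of at most `length` code points that sorts above the input.
  static String upperBound(String input, int length) {
    if (length <= 0) {
      throw new IllegalArgumentException("length must be positive: " + length);
    }
    if (input.codePointCount(0, input.length()) <= length) {
      return input; // nothing to truncate, so the input serves as its own bound
    }
    int[] cps = input.codePoints().limit(length).toArray();
    // Walk backwards to find a code point that can be incremented.
    for (int i = cps.length - 1; i >= 0; i--) {
      int next = cps[i] + 1;
      if (next == Character.MIN_SURROGATE) {
        next = Character.MAX_SURROGATE + 1; // skip 0xD800..0xDFFF, which are not valid characters
      }
      if (next <= Character.MAX_CODE_POINT) {
        cps[i] = next;
        return new String(cps, 0, i + 1); // keep the prefix up to the incremented position
      }
    }
    return null; // every kept code point is already U+10FFFF
  }

  public static void main(String[] args) {
    System.out.println(upperBound("abcdef", 3)); // prints abd
  }
}
```

In this sketch the result can be shorter than length, because the code points after the incremented position are discarded.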