Class Character
- java.lang.Object
-
- java.lang.Character
-
- All Implemented Interfaces:
Serializable
,Comparable<Character>
,Modified
public final class Character extends Object implements Serializable, Comparable<Character>, Modified
The wrapper for the primitive type char. This class also provides a number of utility methods for working with characters.
Character data is kept up to date as Unicode evolves. See the Locale data section of the Locale documentation for details of the Unicode versions implemented by current and historical Android releases.
The Unicode specification, character tables, and other information are available at http://www.unicode.org/.
Unicode characters are referred to as code points. The range of valid code points is U+0000 to U+10FFFF. The Basic Multilingual Plane (BMP) is the code point range U+0000 to U+FFFF. Characters above the BMP are referred to as Supplementary Characters. On the Java platform, UTF-16 encoding and char pairs are used to represent code points in the supplementary range. A pair of char values that represents a supplementary character is made up of a high surrogate with a value range of 0xD800 to 0xDBFF and a low surrogate with a value range of 0xDC00 to 0xDFFF.
On the Java platform a char value represents either a single BMP code point or a UTF-16 unit that's part of a surrogate pair. The int type is used to represent all Unicode code points.
Unicode categories
Here's a list of the Unicode character categories and the corresponding Java constants, grouped semantically to provide a convenient overview. This table is also useful in conjunction with \p and \P in regular expressions.
Categories
Cn Unassigned UNASSIGNED
Cc Control CONTROL
Cf Format FORMAT
Co Private use PRIVATE_USE
Cs Surrogate SURROGATE
Lu Uppercase letter UPPERCASE_LETTER
Ll Lowercase letter LOWERCASE_LETTER
Lt Titlecase letter TITLECASE_LETTER
Lm Modifier letter MODIFIER_LETTER
Lo Other letter OTHER_LETTER
Mn Non-spacing mark NON_SPACING_MARK
Me Enclosing mark ENCLOSING_MARK
Mc Combining spacing mark COMBINING_SPACING_MARK
Nd Decimal digit number DECIMAL_DIGIT_NUMBER
Nl Letter number LETTER_NUMBER
No Other number OTHER_NUMBER
Pd Dash punctuation DASH_PUNCTUATION
Ps Start punctuation START_PUNCTUATION
Pe End punctuation END_PUNCTUATION
Pc Connector punctuation CONNECTOR_PUNCTUATION
Pi Initial quote punctuation INITIAL_QUOTE_PUNCTUATION
Pf Final quote punctuation FINAL_QUOTE_PUNCTUATION
Po Other punctuation OTHER_PUNCTUATION
Sm Math symbol MATH_SYMBOL
Sc Currency symbol CURRENCY_SYMBOL
Sk Modifier symbol MODIFIER_SYMBOL
So Other symbol OTHER_SYMBOL
Zs Space separator SPACE_SEPARATOR
Zl Line separator LINE_SEPARATOR
Zp Paragraph separator PARAGRAPH_SEPARATOR
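The two-letter abbreviations in the table above can be used with \p in a regular expression. The following is a minimal sketch of that usage; the class name CategoryRegexDemo and the helper countCategory are hypothetical names chosen for illustration and are not part of this API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CategoryRegexDemo {
    // Counts the matches of \p{category} in text, e.g. category "Lu" for uppercase letters.
    // The category string is assumed to be one of the two-letter abbreviations from the table.
    static int countCategory(String text, String category) {
        Matcher m = Pattern.compile("\\p{" + category + "}").matcher(text);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String text = "Hello World 42";
        System.out.println(countCategory(text, "Lu")); // uppercase letters: H, W
        System.out.println(countCategory(text, "Nd")); // decimal digits: 4, 2
        System.out.println(countCategory(text, "Zs")); // space separators
    }
}
```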
- Since:
- 1.0
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type Field Description
static byte COMBINING_SPACING_MARK Unicode category constant Mc.
static byte CONNECTOR_PUNCTUATION Unicode category constant Pc.
static byte CONTROL Unicode category constant Cc.
static byte CURRENCY_SYMBOL Unicode category constant Sc.
static byte DASH_PUNCTUATION Unicode category constant Pd.
static byte DECIMAL_DIGIT_NUMBER Unicode category constant Nd.
static byte DIRECTIONALITY_ARABIC_NUMBER Unicode bidirectional constant AN.
static byte DIRECTIONALITY_BOUNDARY_NEUTRAL Unicode bidirectional constant BN.
static byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR Unicode bidirectional constant CS.
static byte DIRECTIONALITY_EUROPEAN_NUMBER Unicode bidirectional constant EN.
static byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR Unicode bidirectional constant ES.
static byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR Unicode bidirectional constant ET.
static byte DIRECTIONALITY_LEFT_TO_RIGHT Unicode bidirectional constant L.
static byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING Unicode bidirectional constant LRE.
static byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE Unicode bidirectional constant LRO.
static byte DIRECTIONALITY_NONSPACING_MARK Unicode bidirectional constant NSM.
static byte DIRECTIONALITY_OTHER_NEUTRALS Unicode bidirectional constant ON.
static byte DIRECTIONALITY_PARAGRAPH_SEPARATOR Unicode bidirectional constant B.
static byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT Unicode bidirectional constant PDF.
static byte DIRECTIONALITY_RIGHT_TO_LEFT Unicode bidirectional constant R.
static byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC Unicode bidirectional constant AL.
static byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING Unicode bidirectional constant RLE.
static byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE Unicode bidirectional constant RLO.
static byte DIRECTIONALITY_SEGMENT_SEPARATOR Unicode bidirectional constant S.
static byte DIRECTIONALITY_UNDEFINED Unicode bidirectional constant.
static byte DIRECTIONALITY_WHITESPACE Unicode bidirectional constant WS.
static byte ENCLOSING_MARK Unicode category constant Me.
static byte END_PUNCTUATION Unicode category constant Pe.
static byte FINAL_QUOTE_PUNCTUATION Unicode category constant Pf.
static byte FORMAT Unicode category constant Cf.
static byte INITIAL_QUOTE_PUNCTUATION Unicode category constant Pi.
static byte LETTER_NUMBER Unicode category constant Nl.
static byte LINE_SEPARATOR Unicode category constant Zl.
static byte LOWERCASE_LETTER Unicode category constant Ll.
static byte MATH_SYMBOL Unicode category constant Sm.
static int MAX_CODE_POINT The maximum code point value, U+10FFFF.
static char MAX_HIGH_SURROGATE The maximum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uDBFF'.
static char MAX_LOW_SURROGATE The maximum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDFFF'.
static int MAX_RADIX The maximum radix used for conversions between characters and integers.
static char MAX_SURROGATE The maximum value of a surrogate unit in UTF-16 encoding, '\uDFFF'.
static char MAX_VALUE The maximum Character value.
static int MIN_CODE_POINT The minimum code point value, U+0000.
static char MIN_HIGH_SURROGATE The minimum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uD800'.
static char MIN_LOW_SURROGATE The minimum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDC00'.
static int MIN_RADIX The minimum radix used for conversions between characters and integers.
static int MIN_SUPPLEMENTARY_CODE_POINT The minimum value of a supplementary code point, U+010000.
static char MIN_SURROGATE The minimum value of a surrogate unit in UTF-16 encoding, '\uD800'.
static char MIN_VALUE The minimum Character value.
static byte MODIFIER_LETTER Unicode category constant Lm.
static byte MODIFIER_SYMBOL Unicode category constant Sk.
static byte NON_SPACING_MARK Unicode category constant Mn.
static byte OTHER_LETTER Unicode category constant Lo.
static byte OTHER_NUMBER Unicode category constant No.
static byte OTHER_PUNCTUATION Unicode category constant Po.
static byte OTHER_SYMBOL Unicode category constant So.
static byte PARAGRAPH_SEPARATOR Unicode category constant Zp.
static byte PRIVATE_USE Unicode category constant Co.
static int SIZE The number of bits required to represent a Character value in unsigned form.
static byte SPACE_SEPARATOR Unicode category constant Zs.
static byte START_PUNCTUATION Unicode category constant Ps.
static byte SURROGATE Unicode category constant Cs.
static byte TITLECASE_LETTER Unicode category constant Lt.
static byte UNASSIGNED Unicode category constant Cn.
static byte UPPERCASE_LETTER Unicode category constant Lu.
-
Constructor Summary
Constructors
Constructor Description
Character(char value) Constructs a new Character with the specified primitive char value.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Deprecated Methods
Modifier and Type Method Description
static int charCount(int codePoint) Calculates the number of char values required to represent the specified Unicode code point.
char charValue() Gets the primitive value of this character.
static int codePointAt(char[] seq, int index) Returns the code point at index in the specified array of character units.
static int codePointAt(char[] seq, int index, int limit) Returns the code point at index in the specified array of character units, where index has to be less than limit.
static int codePointBefore(char[] seq, int index) Returns the code point that precedes index in the specified array of character units.
static int codePointBefore(char[] seq, int index, int start) Returns the code point that precedes the index in the specified array of character units and is not less than start.
static int codePointCount(char[] seq, int offset, int count) Counts the number of Unicode code points in the subsequence of the specified char array, as delineated by offset and count.
static int compare(char lhs, char rhs) Compares two char values.
int compareTo(Character c) Compares this object to the specified character object to determine their relative order.
static int digit(char c, int radix) Convenience method to determine the value of the specified character c in the supplied radix.
static int digit(int codePoint, int radix) Convenience method to determine the value of the character codePoint in the supplied radix.
boolean equals(Object object) Compares this object with the specified object and indicates if they are equal.
static char forDigit(int digit, int radix) Returns the character which represents the specified digit in the specified radix.
static int getNumericValue(char c) Returns the numeric value of the specified Unicode character.
static int getNumericValue(int codePoint) Gets the numeric value of the specified Unicode code point.
int hashCode() Returns an integer hash code for this object.
static char highSurrogate(int codePoint) Returns the high surrogate for the given code point.
static boolean isBmpCodePoint(int codePoint) Returns true if the given code point is in the Basic Multilingual Plane (BMP).
static boolean isDigit(char c) Indicates whether the specified character is a digit.
static boolean isDigit(int codePoint) Indicates whether the specified code point is a digit.
static boolean isHighSurrogate(char ch) Indicates whether ch is a high- (or leading-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
static boolean isISOControl(char c) Indicates whether the specified character is an ISO control character.
static boolean isISOControl(int c) Indicates whether the specified code point is an ISO control character.
static boolean isLetter(char c) Indicates whether the specified character is a letter.
static boolean isLetter(int codePoint) Indicates whether the specified code point is a letter.
static boolean isLetterOrDigit(char c) Indicates whether the specified character is a letter or a digit.
static boolean isLetterOrDigit(int codePoint) Indicates whether the specified code point is a letter or a digit.
static boolean isLowerCase(char c) Indicates whether the specified character is a lower case letter.
static boolean isLowerCase(int codePoint) Indicates whether the specified code point is a lower case letter.
static boolean isLowSurrogate(char ch) Indicates whether ch is a low- (or trailing-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
static boolean isSpace(char c) Deprecated. Use isWhitespace(char) instead.
static boolean isSpaceChar(char c) See isSpaceChar(int).
static boolean isSpaceChar(int codePoint) Returns true if the given code point is a Unicode space character.
static boolean isSupplementaryCodePoint(int codePoint) Indicates whether codePoint is within the supplementary code point range.
static boolean isSurrogate(char ch) Returns true if the given character is a high or low surrogate.
static boolean isSurrogatePair(char high, char low) Indicates whether the specified character pair is a valid surrogate pair.
static boolean isUpperCase(char c) Indicates whether the specified character is an upper case letter.
static boolean isUpperCase(int codePoint) Indicates whether the specified code point is an upper case letter.
static boolean isValidCodePoint(int codePoint) Indicates whether codePoint is a valid Unicode code point.
static boolean isWhitespace(char c) See isWhitespace(int).
static boolean isWhitespace(int codePoint) Returns true if the given code point is a Unicode whitespace character.
static char lowSurrogate(int codePoint) Returns the low surrogate for the given code point.
static int offsetByCodePoints(char[] seq, int start, int count, int index, int codePointOffset) Determines the index in a subsequence of the specified character array that is offset codePointOffset code points from index.
static char reverseBytes(char c) Reverses the order of the first and second byte in the specified character.
static char[] toChars(int codePoint) Converts the specified Unicode code point into a UTF-16 encoded sequence and returns it as a char array.
static int toChars(int codePoint, char[] dst, int dstIndex) Converts the specified Unicode code point into a UTF-16 encoded sequence and copies the value(s) into the char array dst, starting at index dstIndex.
static int toCodePoint(char high, char low) Converts a surrogate pair into a Unicode code point.
static char toLowerCase(char c) Returns the lower case equivalent for the specified character if the character is an upper case letter.
static int toLowerCase(int codePoint) Returns the lower case equivalent for the specified code point if it is an upper case letter.
String toString() Returns a string containing a concise, human-readable description of this object.
static String toString(char value) Converts the specified character to its string representation.
static char toUpperCase(char c) Returns the upper case equivalent for the specified character if the character is a lower case letter.
static int toUpperCase(int codePoint) Returns the upper case equivalent for the specified code point if the code point is a lower case letter.
static Character valueOf(char c) Returns a Character instance for the char value passed.
-
-
-
Field Detail
-
MIN_VALUE
public static final char MIN_VALUE
The minimum Character value.
- See Also:
- Constant Field Values
-
MAX_VALUE
public static final char MAX_VALUE
The maximum Character value.
- See Also:
- Constant Field Values
-
MIN_RADIX
public static final int MIN_RADIX
The minimum radix used for conversions between characters and integers.
- See Also:
- Constant Field Values
-
MAX_RADIX
public static final int MAX_RADIX
The maximum radix used for conversions between characters and integers.
- See Also:
- Constant Field Values
-
UNASSIGNED
public static final byte UNASSIGNED
Unicode category constant Cn.
- See Also:
- Constant Field Values
-
UPPERCASE_LETTER
public static final byte UPPERCASE_LETTER
Unicode category constant Lu.
- See Also:
- Constant Field Values
-
LOWERCASE_LETTER
public static final byte LOWERCASE_LETTER
Unicode category constant Ll.
- See Also:
- Constant Field Values
-
TITLECASE_LETTER
public static final byte TITLECASE_LETTER
Unicode category constant Lt.
- See Also:
- Constant Field Values
-
MODIFIER_LETTER
public static final byte MODIFIER_LETTER
Unicode category constant Lm.
- See Also:
- Constant Field Values
-
OTHER_LETTER
public static final byte OTHER_LETTER
Unicode category constant Lo.
- See Also:
- Constant Field Values
-
NON_SPACING_MARK
public static final byte NON_SPACING_MARK
Unicode category constant Mn.
- See Also:
- Constant Field Values
-
ENCLOSING_MARK
public static final byte ENCLOSING_MARK
Unicode category constant Me.
- See Also:
- Constant Field Values
-
COMBINING_SPACING_MARK
public static final byte COMBINING_SPACING_MARK
Unicode category constant Mc.
- See Also:
- Constant Field Values
-
DECIMAL_DIGIT_NUMBER
public static final byte DECIMAL_DIGIT_NUMBER
Unicode category constant Nd.
- See Also:
- Constant Field Values
-
LETTER_NUMBER
public static final byte LETTER_NUMBER
Unicode category constant Nl.
- See Also:
- Constant Field Values
-
OTHER_NUMBER
public static final byte OTHER_NUMBER
Unicode category constant No.
- See Also:
- Constant Field Values
-
SPACE_SEPARATOR
public static final byte SPACE_SEPARATOR
Unicode category constant Zs.
- See Also:
- Constant Field Values
-
LINE_SEPARATOR
public static final byte LINE_SEPARATOR
Unicode category constant Zl.
- See Also:
- Constant Field Values
-
PARAGRAPH_SEPARATOR
public static final byte PARAGRAPH_SEPARATOR
Unicode category constant Zp.
- See Also:
- Constant Field Values
-
CONTROL
public static final byte CONTROL
Unicode category constant Cc.
- See Also:
- Constant Field Values
-
FORMAT
public static final byte FORMAT
Unicode category constant Cf.
- See Also:
- Constant Field Values
-
PRIVATE_USE
public static final byte PRIVATE_USE
Unicode category constant Co.
- See Also:
- Constant Field Values
-
SURROGATE
public static final byte SURROGATE
Unicode category constant Cs.
- See Also:
- Constant Field Values
-
DASH_PUNCTUATION
public static final byte DASH_PUNCTUATION
Unicode category constant Pd.
- See Also:
- Constant Field Values
-
START_PUNCTUATION
public static final byte START_PUNCTUATION
Unicode category constant Ps.
- See Also:
- Constant Field Values
-
END_PUNCTUATION
public static final byte END_PUNCTUATION
Unicode category constant Pe.
- See Also:
- Constant Field Values
-
CONNECTOR_PUNCTUATION
public static final byte CONNECTOR_PUNCTUATION
Unicode category constant Pc.
- See Also:
- Constant Field Values
-
OTHER_PUNCTUATION
public static final byte OTHER_PUNCTUATION
Unicode category constant Po.
- See Also:
- Constant Field Values
-
MATH_SYMBOL
public static final byte MATH_SYMBOL
Unicode category constant Sm.
- See Also:
- Constant Field Values
-
CURRENCY_SYMBOL
public static final byte CURRENCY_SYMBOL
Unicode category constant Sc.
- See Also:
- Constant Field Values
-
MODIFIER_SYMBOL
public static final byte MODIFIER_SYMBOL
Unicode category constant Sk.
- See Also:
- Constant Field Values
-
OTHER_SYMBOL
public static final byte OTHER_SYMBOL
Unicode category constant So.
- See Also:
- Constant Field Values
-
INITIAL_QUOTE_PUNCTUATION
public static final byte INITIAL_QUOTE_PUNCTUATION
Unicode category constant Pi.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
FINAL_QUOTE_PUNCTUATION
public static final byte FINAL_QUOTE_PUNCTUATION
Unicode category constant Pf.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_UNDEFINED
public static final byte DIRECTIONALITY_UNDEFINED
Unicode bidirectional constant.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_LEFT_TO_RIGHT
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT
Unicode bidirectional constant L.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_RIGHT_TO_LEFT
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT
Unicode bidirectional constant R.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
Unicode bidirectional constant AL.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_EUROPEAN_NUMBER
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER
Unicode bidirectional constant EN.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
Unicode bidirectional constant ES.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
Unicode bidirectional constant ET.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_ARABIC_NUMBER
public static final byte DIRECTIONALITY_ARABIC_NUMBER
Unicode bidirectional constant AN.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
public static final byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
Unicode bidirectional constant CS.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_NONSPACING_MARK
public static final byte DIRECTIONALITY_NONSPACING_MARK
Unicode bidirectional constant NSM.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_BOUNDARY_NEUTRAL
public static final byte DIRECTIONALITY_BOUNDARY_NEUTRAL
Unicode bidirectional constant BN.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_PARAGRAPH_SEPARATOR
public static final byte DIRECTIONALITY_PARAGRAPH_SEPARATOR
Unicode bidirectional constant B.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_SEGMENT_SEPARATOR
public static final byte DIRECTIONALITY_SEGMENT_SEPARATOR
Unicode bidirectional constant S.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_WHITESPACE
public static final byte DIRECTIONALITY_WHITESPACE
Unicode bidirectional constant WS.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_OTHER_NEUTRALS
public static final byte DIRECTIONALITY_OTHER_NEUTRALS
Unicode bidirectional constant ON.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
Unicode bidirectional constant LRE.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
Unicode bidirectional constant LRO.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
Unicode bidirectional constant RLE.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
Unicode bidirectional constant RLO.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
public static final byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
Unicode bidirectional constant PDF.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
MIN_HIGH_SURROGATE
public static final char MIN_HIGH_SURROGATE
The minimum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uD800'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MAX_HIGH_SURROGATE
public static final char MAX_HIGH_SURROGATE
The maximum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uDBFF'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MIN_LOW_SURROGATE
public static final char MIN_LOW_SURROGATE
The minimum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDC00'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MAX_LOW_SURROGATE
public static final char MAX_LOW_SURROGATE
The maximum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDFFF'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MIN_SURROGATE
public static final char MIN_SURROGATE
The minimum value of a surrogate unit in UTF-16 encoding, '\uD800'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MAX_SURROGATE
public static final char MAX_SURROGATE
The maximum value of a surrogate unit in UTF-16 encoding, '\uDFFF'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MIN_SUPPLEMENTARY_CODE_POINT
public static final int MIN_SUPPLEMENTARY_CODE_POINT
The minimum value of a supplementary code point, U+010000.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MIN_CODE_POINT
public static final int MIN_CODE_POINT
The minimum code point value, U+0000.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MAX_CODE_POINT
public static final int MAX_CODE_POINT
The maximum code point value, U+10FFFF.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
SIZE
public static final int SIZE
The number of bits required to represent a Character value in unsigned form.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
-
Method Detail
-
charValue
public char charValue()
Gets the primitive value of this character.
- Returns:
- this object's primitive value.
-
compareTo
public int compareTo(Character c)
Compares this object to the specified character object to determine their relative order.
- Specified by:
compareTo in interface Comparable<Character>
- Parameters:
c - the character object to compare this object to.
- Returns:
0 if the value of this character and the value of c are equal; a positive value if the value of this character is greater than the value of c; a negative value if the value of this character is less than the value of c.
- Since:
- 1.2
- See Also:
Comparable
-
compare
public static int compare(char lhs, char rhs)
Compares two char values.
- Parameters:
lhs - First value.
rhs - Second value.
- Returns:
- 0 if lhs = rhs, less than 0 if lhs < rhs, and greater than 0 if lhs > rhs.
- Since:
- 1.7
-
valueOf
public static Character valueOf(char c)
Returns a Character instance for the char value passed.
- Parameters:
c - the char value for which to get a Character instance.
- Returns:
- the Character instance for c.
- Since:
- 1.5
-
isValidCodePoint
public static boolean isValidCodePoint(int codePoint)
Indicates whether codePoint is a valid Unicode code point.
- Parameters:
codePoint - the code point to test.
- Returns:
true if codePoint is a valid Unicode code point; false otherwise.
- Since:
- 1.5
-
isSupplementaryCodePoint
public static boolean isSupplementaryCodePoint(int codePoint)
Indicates whether codePoint is within the supplementary code point range.
- Parameters:
codePoint - the code point to test.
- Returns:
true if codePoint is within the supplementary code point range; false otherwise.
- Since:
- 1.5
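As an illustration of how isValidCodePoint(int) and isSupplementaryCodePoint(int) relate to the ranges described in the class overview, here is a minimal sketch. The class name CodePointRanges, the helper classify and its string labels are hypothetical and exist only for this example.

```java
final class CodePointRanges {
    // Labels a value as "invalid", "BMP" or "supplementary".
    // The labels are illustrative strings, not constants of this API.
    static String classify(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return "invalid";
        }
        return Character.isSupplementaryCodePoint(codePoint) ? "supplementary" : "BMP";
    }

    public static void main(String[] args) {
        System.out.println(classify(0x0041));   // a BMP code point
        System.out.println(classify(0x1F600));  // above U+FFFF
        System.out.println(classify(0x110000)); // above MAX_CODE_POINT
        System.out.println(classify(-1));       // below MIN_CODE_POINT
    }
}
```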
-
isHighSurrogate
public static boolean isHighSurrogate(char ch)
Indicates whether ch is a high- (or leading-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
- Parameters:
ch - the character to test.
- Returns:
true if ch is a high-surrogate code unit; false otherwise.
- Since:
- 1.5
- See Also:
isLowSurrogate(char)
-
isLowSurrogate
public static boolean isLowSurrogate(char ch)
Indicates whether ch is a low- (or trailing-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
- Parameters:
ch - the character to test.
- Returns:
true if ch is a low-surrogate code unit; false otherwise.
- Since:
- 1.5
- See Also:
isHighSurrogate(char)
-
isSurrogate
public static boolean isSurrogate(char ch)
Returns true if the given character is a high or low surrogate.
- Parameters:
ch - the character to test.
- Returns:
true if ch is a surrogate code unit; false otherwise.
- Since:
- 1.7
-
isSurrogatePair
public static boolean isSurrogatePair(char high, char low)
Indicates whether the specified character pair is a valid surrogate pair.
- Parameters:
high - the high surrogate unit to test.
low - the low surrogate unit to test.
- Returns:
true if high is a high-surrogate code unit and low is a low-surrogate code unit; false otherwise.
- Since:
- 1.5
- See Also:
isHighSurrogate(char), isLowSurrogate(char)
-
charCount
public static int charCount(int codePoint)
Calculates the number of char values required to represent the specified Unicode code point. This method checks if the codePoint is greater than or equal to 0x10000, in which case 2 is returned, otherwise 1. To test if the code point is valid, use the isValidCodePoint(int) method.
- Parameters:
codePoint - the code point for which to calculate the number of required chars.
- Returns:
2 if codePoint >= 0x10000; 1 otherwise.
- Since:
- 1.5
- See Also:
isValidCodePoint(int), isSupplementaryCodePoint(int)
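A common use of charCount(int) is to advance an index by whole code points rather than by char units. The sketch below assumes the standard String.codePointAt(int) method; the class name CharCountDemo and the helper countCodePoints are hypothetical.

```java
final class CharCountDemo {
    // Counts code points in s by stepping charCount(cp) chars at a time.
    static int countCodePoints(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // 1 for BMP, 2 for supplementary
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "a\uD83D\uDE00b"; // 'a', U+1F600 as a surrogate pair, 'b'
        System.out.println(s.length());          // number of char units
        System.out.println(countCodePoints(s));  // number of code points
    }
}
```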
-
toCodePoint
public static int toCodePoint(char high, char low)
Converts a surrogate pair into a Unicode code point. This method assumes that the pair are valid surrogates. If the pair are not valid surrogates, then the result is indeterminate. The isSurrogatePair(char, char) method should be used prior to this method to validate the pair.
- Parameters:
high - the high surrogate unit.
low - the low surrogate unit.
- Returns:
- the Unicode code point corresponding to the surrogate unit pair.
- Since:
- 1.5
- See Also:
isSurrogatePair(char, char)
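The validate-then-convert pattern recommended above might look like the following sketch. The class name SurrogateDecodeDemo, the helper decodePair and the -1 sentinel are assumptions made for this example, not part of the API.

```java
final class SurrogateDecodeDemo {
    // Returns the code point for a valid surrogate pair, or -1 otherwise.
    // The -1 sentinel is a choice made for this sketch only.
    static int decodePair(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            return -1;
        }
        return Character.toCodePoint(high, low);
    }

    public static void main(String[] args) {
        // 0xD83D followed by 0xDE00 encodes U+1F600.
        System.out.println(Integer.toHexString(decodePair('\uD83D', '\uDE00')));
        // Two BMP letters are not a surrogate pair.
        System.out.println(decodePair('a', 'b'));
    }
}
```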
-
codePointAt
public static int codePointAt(char[] seq, int index)
Returns the code point at index in the specified array of character units. If the unit at index is a high-surrogate unit, index + 1 is less than the length of the array and the unit at index + 1 is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index is returned.
- Parameters:
seq - the source array of char units.
index - the position in seq from which to retrieve the code point.
- Returns:
- the Unicode code point or char value at index in seq.
- Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if the index is negative or greater than or equal to the length of seq.
- Since:
- 1.5
-
codePointAt
public static int codePointAt(char[] seq, int index, int limit)
Returns the code point at index in the specified array of character units, where index has to be less than limit. If the unit at index is a high-surrogate unit, index + 1 is less than limit and the unit at index + 1 is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index is returned.
- Parameters:
seq - the source array of char units.
index - the position in seq from which to get the code point.
limit - the index after the last unit in seq that can be used.
- Returns:
- the Unicode code point or char value at index in seq.
- Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if index < 0, index >= limit, limit < 0 or if limit is greater than the length of seq.
- Since:
- 1.5
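The effect of limit can be seen by decoding the same array with two different limits. This is a minimal sketch; the class name CodePointAtDemo and the helper codePoints are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointAtDemo {
    // Collects the code points found in seq[0..limit), never reading at or past limit.
    static List<Integer> codePoints(char[] seq, int limit) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < limit; ) {
            int cp = Character.codePointAt(seq, i, limit);
            out.add(cp);
            i += Character.charCount(cp);
        }
        return out;
    }

    public static void main(String[] args) {
        char[] seq = {'a', '\uD83D', '\uDE00', 'b'};
        // Whole array: the surrogate pair is combined into one code point.
        System.out.println(codePoints(seq, seq.length));
        // Limit 2 excludes the low surrogate, so the high surrogate is returned as is.
        System.out.println(codePoints(seq, 2));
    }
}
```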
-
codePointBefore
public static int codePointBefore(char[] seq, int index)
Returns the code point that precedes index in the specified array of character units. If the unit at index - 1 is a low-surrogate unit, index - 2 is not negative and the unit at index - 2 is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index - 1 is returned.
- Parameters:
seq - the source array of char units.
index - the position in seq following the code point that should be returned.
- Returns:
- the Unicode code point or char value before index in seq.
- Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if the index is less than 1 or greater than the length of seq.
- Since:
- 1.5
-
codePointBefore
public static int codePointBefore(char[] seq, int index, int start)
Returns the code point that precedes the index in the specified array of character units and is not less than start. If the unit at index - 1 is a low-surrogate unit, index - 2 is not less than start and the unit at index - 2 is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index - 1 is returned.
- Parameters:
seq - the source array of char units.
index - the position in seq following the code point that should be returned.
start - the index of the first element in seq.
- Returns:
- the Unicode code point or char value before index in seq.
- Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if index <= start, start < 0, index is greater than the length of seq, or if start is equal to or greater than the length of seq.
- Since:
- 1.5
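codePointBefore is the natural building block for walking a char array backwards by code point. The following sketch shows one way to do that; the class name CodePointBeforeDemo and the helper reversed are hypothetical.

```java
import java.util.Arrays;

final class CodePointBeforeDemo {
    // Returns the code points of seq in reverse order.
    static int[] reversed(char[] seq) {
        int[] out = new int[Character.codePointCount(seq, 0, seq.length)];
        int n = 0;
        for (int i = seq.length; i > 0; ) {
            int cp = Character.codePointBefore(seq, i);
            out[n++] = cp;
            i -= Character.charCount(cp); // step back 1 or 2 char units
        }
        return out;
    }

    public static void main(String[] args) {
        char[] seq = {'a', '\uD83D', '\uDE00', 'b'};
        System.out.println(Arrays.toString(reversed(seq)));
        // With start = 2 the high surrogate at index 1 is out of range,
        // so the low surrogate at index 2 is returned on its own.
        System.out.println(Integer.toHexString(Character.codePointBefore(seq, 3, 2)));
    }
}
```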
-
toChars
public static int toChars(int codePoint, char[] dst, int dstIndex)
Converts the specified Unicode code point into a UTF-16 encoded sequence and copies the value(s) into the char array dst, starting at index dstIndex.
- Parameters:
codePoint - the Unicode code point to encode.
dst - the destination array to copy the encoded value into.
dstIndex - the index in dst from where to start copying.
- Returns:
- the number of char value units copied into dst.
- Throws:
IllegalArgumentException - if codePoint is not a valid code point.
NullPointerException - if dst is null.
IndexOutOfBoundsException - if dstIndex is negative, greater than or equal to dst.length or equals dst.length - 1 when codePoint is a supplementary code point.
- Since:
- 1.5
-
toChars
public static char[] toChars(int codePoint)
Converts the specified Unicode code point into a UTF-16 encoded sequence and returns it as a char array.
- Parameters:
codePoint - the Unicode code point to encode.
- Returns:
- the UTF-16 encoded char sequence. If codePoint is a supplementary code point, then the returned array contains two characters, otherwise it contains just one character.
- Throws:
IllegalArgumentException - if codePoint is not a valid code point.
- Since:
- 1.5
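A short sketch of the three possible outcomes (one unit, two units, or an exception) follows. ToCharsDemo and unitCount are hypothetical names, and returning -1 for an invalid code point is a choice made only for this example.

```java
class ToCharsDemo {
    // Returns the number of UTF-16 units for codePoint, or -1 if it is not a valid code point.
    static int unitCount(int codePoint) {
        try {
            return Character.toChars(codePoint).length;
        } catch (IllegalArgumentException e) {
            return -1; // negative values or anything above U+10FFFF
        }
    }

    public static void main(String[] args) {
        System.out.println(unitCount(0x41));     // 1 (BMP)
        System.out.println(unitCount(0x10400));  // 2 (supplementary)
        System.out.println(unitCount(0x110000)); // -1 (out of range)
    }
}
```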
-
codePointCount
public static int codePointCount(char[] seq, int offset, int count)
Counts the number of Unicode code points in the subsequence of the specified char array, as delineated by offset and count. Any unpaired surrogate is counted as one code point.
- Parameters:
seq
- the char array to look through.
offset
- the inclusive index to begin counting at.
count
- the number of char values to look through in seq.
- Returns:
- the number of Unicode code points.
- Throws:
NullPointerException
- if seq is null.
IndexOutOfBoundsException
- if offset < 0, count < 0, or offset + count is greater than the length of seq.
- Since:
- 1.5
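The treatment of unpaired surrogates is easiest to see on a concrete array. In this sketch (CodePointCountDemo is a hypothetical name) the array holds 'a', a valid surrogate pair, and a lone low surrogate.

```java
class CodePointCountDemo {
    // 'a', the pair D83D DE00 (U+1F600), then a lone low surrogate.
    static final char[] SEQ = {'a', '\uD83D', '\uDE00', '\uDC00'};

    public static void main(String[] args) {
        // 'a' + one pair + one unpaired surrogate = 3 code points.
        System.out.println(Character.codePointCount(SEQ, 0, 4)); // 3
        // The range ends between the two surrogates, so the high surrogate counts alone.
        System.out.println(Character.codePointCount(SEQ, 0, 2)); // 2
    }
}
```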
-
offsetByCodePoints
public static int offsetByCodePoints(char[] seq, int start, int count, int index, int codePointOffset)
Determines the index in a subsequence of the specified character array that is offset codePointOffset code points from index. The subsequence is delineated by start and count.
- Parameters:
seq
- the character array to find the index in.
start
- the inclusive index that marks the beginning of the subsequence.
count
- the number of char values to include within the subsequence.
index
- the start index in the subsequence of the char array.
codePointOffset
- the number of code points to look backwards or forwards; may be a negative or positive value.
- Returns:
- the index in seq that is codePointOffset code points away from index.
- Throws:
NullPointerException
- if seq is null.
IndexOutOfBoundsException
- if start < 0, count < 0, index < start, index > start + count, start + count is greater than the length of seq, or there are not enough values in the subsequence to skip codePointOffset code points forward or backward (if codePointOffset is negative) from index.
- Since:
- 1.5
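As a sketch of moving in both directions, the example below walks over an array containing one surrogate pair. OffsetDemo and skip are hypothetical names, and mapping the exception to -1 is only a convenience for the example.

```java
class OffsetDemo {
    // 'a', U+1F600 as the pair D83D DE00, 'b', 'c': five units, four code points.
    static final char[] SEQ = {'a', '\uD83D', '\uDE00', 'b', 'c'};

    // Returns the resulting index, or -1 if there are too few code points to skip.
    static int skip(int index, int codePointOffset) {
        try {
            return Character.offsetByCodePoints(SEQ, 0, SEQ.length, index, codePointOffset);
        } catch (IndexOutOfBoundsException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(skip(0, 2));  // 3: past 'a' and the surrogate pair
        System.out.println(skip(5, -3)); // 1: back over 'c', 'b' and the pair
        System.out.println(skip(0, 5));  // -1: only four code points are available
    }
}
```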
-
digit
public static int digit(char c, int radix)
Convenience method to determine the value of the specified character c in the supplied radix. The value of radix must be between MIN_RADIX and MAX_RADIX.
-
digit
public static int digit(int codePoint, int radix)
Convenience method to determine the value of the character codePoint in the supplied radix. The value of radix must be between MIN_RADIX and MAX_RADIX.
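A typical use of digit is parsing a number one character at a time. The sketch below assumes the behavior of the standard java.lang.Character.digit, which returns -1 when the character is not a digit in the given radix; DigitDemo and parse are hypothetical names, and overflow is deliberately not handled.

```java
class DigitDemo {
    // Parses a non-negative number written in the given radix (no overflow check).
    static int parse(String text, int radix) {
        int value = 0;
        for (int i = 0; i < text.length(); i++) {
            int d = Character.digit(text.charAt(i), radix);
            if (d < 0) { // -1 means "not a digit in this radix"
                throw new NumberFormatException("bad digit at index " + i);
            }
            value = value * radix + d;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parse("ff", 16));         // 255
        System.out.println(parse("1010", 2));        // 10
        System.out.println(Character.digit('8', 8)); // -1: '8' is not an octal digit
    }
}
```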
-
equals
public boolean equals(Object object)
Compares this object with the specified object and indicates if they are equal. In order to be equal, object must be an instance of Character and have the same char value as this object.
- Overrides:
equals in class Object
- Parameters:
object
- the object to compare this character with.
- Returns:
- true if the specified object is equal to this Character; false otherwise.
- See Also:
Object.hashCode()
-
forDigit
public static char forDigit(int digit, int radix)
Returns the character which represents the specified digit in the specified radix. The radix must be between MIN_RADIX and MAX_RADIX inclusive; digit must be non-negative and smaller than radix. If any of these conditions does not hold, the null character (U+0000) is returned.
- Parameters:
digit
- the integer value.
radix
- the radix.
- Returns:
- the character which represents the digit in the radix.
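forDigit is the inverse of digit, so it can be used to render a number in an arbitrary radix. The following is a minimal sketch; ForDigitDemo and toRadixString are hypothetical names, and the helper assumes a non-negative value and a radix within MIN_RADIX to MAX_RADIX.

```java
class ForDigitDemo {
    // Renders a non-negative value in the given radix using forDigit.
    static String toRadixString(int value, int radix) {
        StringBuilder out = new StringBuilder();
        do {
            // value % radix is always in [0, radix), so forDigit never fails here.
            out.append(Character.forDigit(value % radix, radix));
            value /= radix;
        } while (value > 0);
        return out.reverse().toString(); // digits were produced least significant first
    }

    public static void main(String[] args) {
        System.out.println(toRadixString(255, 16)); // ff
        System.out.println(toRadixString(10, 2));   // 1010
        // A digit that does not fit the radix yields the null character.
        System.out.println((int) Character.forDigit(16, 16)); // 0
    }
}
```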
-
getNumericValue
public static int getNumericValue(char c)
Returns the numeric value of the specified Unicode character. See getNumericValue(int).
- Parameters:
c
- the character
- Returns:
- a non-negative numeric integer value if a numeric value for c exists, -1 if there is no numeric value for c, -2 if the numeric value cannot be represented as an integer.
-
getNumericValue
public static int getNumericValue(int codePoint)
Gets the numeric value of the specified Unicode code point. For example, the code point 'Ⅻ' stands for the Roman number XII, which has the numeric value 12.
There are two points of divergence between this method and the Unicode specification. This method treats the letters a-z (in both upper and lower cases, and their full-width variants) as numbers from 10 to 35. The Unicode specification also supports the idea of code points with non-integer numeric values; this method does not (except to the extent of returning -2 for such code points).
- Parameters:
codePoint
- the code point
- Returns:
- a non-negative numeric integer value if a numeric value for codePoint exists, -1 if there is no numeric value for codePoint, -2 if the numeric value cannot be represented as an integer.
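The three kinds of result can be told apart as in the sketch below. NumericValueDemo and describe are hypothetical names; the sample code points are U+216B (Ⅻ), the letters 'a' and 'Z', U+00BD (½) and '!'.

```java
class NumericValueDemo {
    // Turns the three kinds of result into a readable label.
    static String describe(int codePoint) {
        int v = Character.getNumericValue(codePoint);
        if (v == -1) return "none";        // no numeric value at all
        if (v == -2) return "non-integer"; // e.g. a vulgar fraction
        return Integer.toString(v);
    }

    public static void main(String[] args) {
        System.out.println(describe(0x216B)); // 12 (ROMAN NUMERAL TWELVE)
        System.out.println(describe('a'));    // 10 (letters count as 10 to 35)
        System.out.println(describe('Z'));    // 35
        System.out.println(describe(0x00BD)); // non-integer (one half)
        System.out.println(describe('!'));    // none
    }
}
```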
-
hashCode
public int hashCode()
Description copied from class: Object
Returns an integer hash code for this object. By contract, any two objects for which Object.equals(java.lang.Object) returns true must return the same hash code value. This means that subclasses of Object usually override both methods or neither method.
Note that hash values must not change over time unless information used in equals comparisons also changes.
See Writing a correct hashCode method if you intend to implement your own hashCode method.
- Overrides:
hashCode in class Object
- Returns:
- this object's hash code.
- See Also:
Object.equals(java.lang.Object)
-
highSurrogate
public static char highSurrogate(int codePoint)
Returns the high surrogate for the given code point. The result is meaningless if the given code point is not a supplementary character.
- Parameters:
codePoint
- Code point.
- Returns:
- High surrogate.
- Since:
- 1.7
-
lowSurrogate
public static char lowSurrogate(int codePoint)
Returns the low surrogate for the given code point. The result is meaningless if the given code point is not a supplementary character.
- Parameters:
codePoint
- Code point.
- Returns:
- Low surrogate.
- Since:
- 1.7
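Since highSurrogate and lowSurrogate give meaningless results for BMP code points, a caller may want to check the input first. The sketch below does so with Character.isSupplementaryCodePoint; SurrogateDemo and split are hypothetical names, and throwing IllegalArgumentException is a choice made for this example only.

```java
class SurrogateDemo {
    // Splits a supplementary code point into its {high, low} surrogate pair.
    static char[] split(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not supplementary: " + codePoint);
        }
        return new char[] {
            Character.highSurrogate(codePoint),
            Character.lowSurrogate(codePoint)
        };
    }

    public static void main(String[] args) {
        char[] pair = split(0x1F600);
        System.out.println(Integer.toHexString(pair[0])); // d83d
        System.out.println(Integer.toHexString(pair[1])); // de00
        // toCodePoint reverses the split.
        System.out.println(Character.toCodePoint(pair[0], pair[1]) == 0x1F600); // true
    }
}
```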
-
isBmpCodePoint
public static boolean isBmpCodePoint(int codePoint)
Returns true if the given code point is in the Basic Multilingual Plane (BMP). Such code points can be represented by a single char.
- Parameters:
codePoint
- Code point.
- Returns:
- true if in plane.
- Since:
- 1.7
-
isDigit
public static boolean isDigit(char c)
Indicates whether the specified character is a digit.
- Parameters:
c
- the character to check.
- Returns:
- true if c is a digit; false otherwise.
-
isDigit
public static boolean isDigit(int codePoint)
Indicates whether the specified code point is a digit.
- Parameters:
codePoint
- the code point to check.
- Returns:
- true if codePoint is a digit; false otherwise.
-
isISOControl
public static boolean isISOControl(char c)
Indicates whether the specified character is an ISO control character.
- Parameters:
c
- the character to check.
- Returns:
- true if c is an ISO control character; false otherwise.
-
isISOControl
public static boolean isISOControl(int c)
Indicates whether the specified code point is an ISO control character.
- Parameters:
c
- the code point to check.
- Returns:
- true if c is an ISO control character; false otherwise.
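One common use of isISOControl is to filter unprintable characters out of text. The sketch below assumes the standard Java platform definition, where the ISO control ranges are U+0000 to U+001F and U+007F to U+009F; ControlStripDemo and strip are hypothetical names.

```java
class ControlStripDemo {
    // Returns text with every ISO control character removed.
    static String strip(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!Character.isISOControl(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Tab, BEL (U+0007) and newline are all control characters.
        System.out.println(strip("a\tb\u0007c\n")); // abc
    }
}
```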
-
isLetter
public static boolean isLetter(char c)
Indicates whether the specified character is a letter.
- Parameters:
c
- the character to check.
- Returns:
- true if c is a letter; false otherwise.
-
isLetter
public static boolean isLetter(int codePoint)
Indicates whether the specified code point is a letter.
- Parameters:
codePoint
- the code point to check.
- Returns:
- true if codePoint is a letter; false otherwise.
-
isLetterOrDigit
public static boolean isLetterOrDigit(char c)
Indicates whether the specified character is a letter or a digit.
- Parameters:
c
- the character to check.
- Returns:
- true if c is a letter or a digit; false otherwise.
-
isLetterOrDigit
public static boolean isLetterOrDigit(int codePoint)
Indicates whether the specified code point is a letter or a digit.
- Parameters:
codePoint
- the code point to check.
- Returns:
- true if codePoint is a letter or a digit; false otherwise.
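The int overloads matter when text may contain supplementary characters, because a lone surrogate char is neither a letter nor a digit. The sketch below classifies a string one code point at a time; LetterDigitDemo and classify are hypothetical names.

```java
class LetterDigitDemo {
    // Returns {letters, digits, others}, counted per code point.
    static int[] classify(String text) {
        int[] counts = new int[3];
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                counts[0]++;
            } else if (Character.isDigit(cp)) {
                counts[1]++;
            } else {
                counts[2]++;
            }
            i += Character.charCount(cp); // step over one or two char units
        }
        return counts;
    }

    public static void main(String[] args) {
        // 'a', 'b', '1', '_', and U+0663 (ARABIC-INDIC DIGIT THREE).
        int[] c = classify("ab1_\u0663");
        System.out.println(c[0] + " " + c[1] + " " + c[2]); // 2 2 1
    }
}
```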
-
isLowerCase
public static boolean isLowerCase(char c)
Indicates whether the specified character is a lower case letter.
- Parameters:
c
- the character to check.
- Returns:
- true if c is a lower case letter; false otherwise.
-
isLowerCase
public static boolean isLowerCase(int codePoint)
Indicates whether the specified code point is a lower case letter.
- Parameters:
codePoint
- the code point to check.
- Returns:
- true if codePoint is a lower case letter; false otherwise.
-
isSpace
public static boolean isSpace(char c)
Deprecated. Use isWhitespace(char) instead.
- Parameters:
c
- Character
- Returns:
- true if white space.
-
isSpaceChar
public static boolean isSpaceChar(char c)
See isSpaceChar(int).
- Parameters:
c
- Character
- Returns:
- true if space character.
-
isSpaceChar
public static boolean isSpaceChar(int codePoint)
Returns true if the given code point is a Unicode space character. The exact set of characters considered space characters varies with Unicode version. Note that non-breaking spaces are considered space characters. Note also that line separators such as '\n' are not; see isWhitespace(char) for an alternative.
- Parameters:
codePoint
- Code point.
- Returns:
- true if space character.
-
isUpperCase
public static boolean isUpperCase(char c)
Indicates whether the specified character is an upper case letter.
- Parameters:
c
- the character to check.
- Returns:
- true if c is an upper case letter; false otherwise.
-
isUpperCase
public static boolean isUpperCase(int codePoint)
Indicates whether the specified code point is an upper case letter.
- Parameters:
codePoint
- the code point to check.
- Returns:
- true if codePoint is an upper case letter; false otherwise.
-
isWhitespace
public static boolean isWhitespace(char c)
See isWhitespace(int).
- Parameters:
c
- Character
- Returns:
- true if white space.
-
isWhitespace
public static boolean isWhitespace(int codePoint)
Returns true if the given code point is a Unicode whitespace character. The exact set of characters considered as whitespace varies with Unicode version. Note that non-breaking spaces are not considered whitespace. Note also that line separators are considered whitespace; see isSpaceChar(char) for an alternative.
- Parameters:
codePoint
- Code point
- Returns:
- true if white space.
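The difference between isWhitespace and isSpaceChar shows up on the non-breaking space and on the newline character. The sketch below labels a code point by which of the two methods accept it; SpaceKindsDemo and kinds are hypothetical names.

```java
class SpaceKindsDemo {
    // Labels a code point by which of the two predicates accept it.
    static String kinds(int codePoint) {
        boolean ws = Character.isWhitespace(codePoint);
        boolean sc = Character.isSpaceChar(codePoint);
        if (ws && sc) return "both";
        if (ws) return "whitespace";
        if (sc) return "spaceChar";
        return "neither";
    }

    public static void main(String[] args) {
        System.out.println(kinds(' '));    // both
        System.out.println(kinds(0x00A0)); // spaceChar (non-breaking space)
        System.out.println(kinds('\n'));   // whitespace (newline)
        System.out.println(kinds('a'));    // neither
    }
}
```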
-
reverseBytes
public static char reverseBytes(char c)
Reverses the order of the first and second byte in the specified character.
- Parameters:
c
- the character to reverse.
- Returns:
- the character with reordered bytes.
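Swapping the two bytes of a char is what converts a UTF-16 unit between big-endian and little-endian order. A minimal sketch, with ReverseBytesDemo as a hypothetical class name:

```java
class ReverseBytesDemo {
    public static void main(String[] args) {
        // 0x1234 becomes 0x3412: the high and low bytes trade places.
        char swapped = Character.reverseBytes((char) 0x1234);
        System.out.println(Integer.toHexString(swapped)); // 3412

        // A byte order mark read with the wrong endianness appears as 0xFFFE.
        char bom = Character.reverseBytes((char) 0xFFFE);
        System.out.println(Integer.toHexString(bom)); // feff
    }
}
```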
-
toLowerCase
public static char toLowerCase(char c)
Returns the lower case equivalent for the specified character if the character is an upper case letter. Otherwise, the specified character is returned unchanged.
- Parameters:
c
- the character
- Returns:
- if c is an upper case character then its lower case counterpart, otherwise just c.
-
toLowerCase
public static int toLowerCase(int codePoint)
Returns the lower case equivalent for the specified code point if it is an upper case letter. Otherwise, the specified code point is returned unchanged.
- Parameters:
codePoint
- the code point to check.
- Returns:
- if codePoint is an upper case character then its lower case counterpart, otherwise just codePoint.
-
toString
public String toString()
Description copied from class: Object
Returns a string containing a concise, human-readable description of this object. Subclasses are encouraged to override this method and provide an implementation that takes into account the object's type and data.
-
toString
public static String toString(char value)
Converts the specified character to its string representation.
- Parameters:
value
- the character to convert.
- Returns:
- the character converted to a string.
-
toUpperCase
public static char toUpperCase(char c)
Returns the upper case equivalent for the specified character if the character is a lower case letter. Otherwise, the specified character is returned unchanged.
- Parameters:
c
- the character to convert.
- Returns:
- if c is a lower case character then its upper case counterpart, otherwise just c.
-
toUpperCase
public static int toUpperCase(int codePoint)
Returns the upper case equivalent for the specified code point if the code point is a lower case letter. Otherwise, the specified code point is returned unchanged.
- Parameters:
codePoint
- the code point to convert.
- Returns:
- if codePoint is a lower case character then its upper case counterpart, otherwise just codePoint.
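The code point overloads of toUpperCase and toLowerCase also cover cased letters outside the BMP, which the char overloads cannot see. The sketch below upper-cases a string one code point at a time; CaseDemo and upper are hypothetical names, and the mapping is strictly one code point to one code point, so it is not a substitute for String.toUpperCase.

```java
class CaseDemo {
    // Upper-cases text one code point at a time (one-to-one mappings only).
    static String upper(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            out.appendCodePoint(Character.toUpperCase(cp));
            i += Character.charCount(cp); // step over one or two char units
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upper("abc1")); // ABC1 (the digit is returned unchanged)
        // U+10428 (Deseret small letter) maps to U+10400 (Deseret capital letter).
        System.out.println(Integer.toHexString(Character.toUpperCase(0x10428))); // 10400
    }
}
```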
-
-