Class Character
- java.lang.Object
-
- java.lang.Character
-
- All Implemented Interfaces:
Serializable, Comparable<Character>, Modified
public final class Character extends Object implements Serializable, Comparable<Character>, Modified
The wrapper for the primitive type char. This class also provides a number of utility methods for working with characters.
Character data is kept up to date as Unicode evolves. See the Locale data section of the Locale documentation for details of the Unicode versions implemented by current and historical Android releases.
The Unicode specification, character tables, and other information are available at http://www.unicode.org/.
Unicode characters are referred to as code points. The range of valid code points is U+0000 to U+10FFFF. The Basic Multilingual Plane (BMP) is the code point range U+0000 to U+FFFF. Characters above the BMP are referred to as Supplementary Characters. On the Java platform, UTF-16 encoding and char pairs are used to represent code points in the supplementary range. A pair of char values that represents a supplementary character is made up of a high surrogate with a value range of 0xD800 to 0xDBFF and a low surrogate with a value range of 0xDC00 to 0xDFFF.
On the Java platform a char value represents either a single BMP code point or a UTF-16 unit that's part of a surrogate pair. The int type is used to represent all Unicode code points.
Unicode categories
Here's a list of the Unicode character categories and the corresponding Java constant, grouped semantically to provide a convenient overview. This table is also useful in conjunction with \p and \P in regular expressions.
Categories
Cn  Unassigned  UNASSIGNED
Cc  Control  CONTROL
Cf  Format  FORMAT
Co  Private use  PRIVATE_USE
Cs  Surrogate  SURROGATE
Lu  Uppercase letter  UPPERCASE_LETTER
Ll  Lowercase letter  LOWERCASE_LETTER
Lt  Titlecase letter  TITLECASE_LETTER
Lm  Modifier letter  MODIFIER_LETTER
Lo  Other letter  OTHER_LETTER
Mn  Non-spacing mark  NON_SPACING_MARK
Me  Enclosing mark  ENCLOSING_MARK
Mc  Combining spacing mark  COMBINING_SPACING_MARK
Nd  Decimal digit number  DECIMAL_DIGIT_NUMBER
Nl  Letter number  LETTER_NUMBER
No  Other number  OTHER_NUMBER
Pd  Dash punctuation  DASH_PUNCTUATION
Ps  Start punctuation  START_PUNCTUATION
Pe  End punctuation  END_PUNCTUATION
Pc  Connector punctuation  CONNECTOR_PUNCTUATION
Pi  Initial quote punctuation  INITIAL_QUOTE_PUNCTUATION
Pf  Final quote punctuation  FINAL_QUOTE_PUNCTUATION
Po  Other punctuation  OTHER_PUNCTUATION
Sm  Math symbol  MATH_SYMBOL
Sc  Currency symbol  CURRENCY_SYMBOL
Sk  Modifier symbol  MODIFIER_SYMBOL
So  Other symbol  OTHER_SYMBOL
Zs  Space separator  SPACE_SEPARATOR
Zl  Line separator  LINE_SEPARATOR
Zp  Paragraph separator  PARAGRAPH_SEPARATOR
- Since:
- 1.0
- See Also:
- Serialized Form
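As an illustration of how the category table maps onto the API, the sketch below checks a category with Character.getType(int) and matches the Lu category with \p in java.util.regex. It is only a sketch: Character.getType(int) is a standard java.lang.Character method that this excerpt does not list, and the class name CategoryDemo and its helper methods are invented for the example.

```java
import java.util.regex.Pattern;

final class CategoryDemo {
    // Matches a single uppercase letter (Unicode category Lu).
    private static final Pattern UPPER = Pattern.compile("\\p{Lu}");

    // Removes every character whose category is Lu.
    static String stripUppercase(String s) {
        return UPPER.matcher(s).replaceAll("");
    }

    // True if the code point's general category is Nd (decimal digit number).
    static boolean isDecimalDigit(int codePoint) {
        return Character.getType(codePoint) == Character.DECIMAL_DIGIT_NUMBER;
    }

    public static void main(String[] args) {
        System.out.println(stripUppercase("Hello World"));
        System.out.println(isDecimalDigit('7'));
    }
}
```

Using \P{Lu} instead of \p{Lu} would invert the match and strip everything except uppercase letters.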
-
-
Field Summary
Fields
Modifier and Type  Field  Description
static byte  COMBINING_SPACING_MARK  Unicode category constant Mc.
static byte  CONNECTOR_PUNCTUATION  Unicode category constant Pc.
static byte  CONTROL  Unicode category constant Cc.
static byte  CURRENCY_SYMBOL  Unicode category constant Sc.
static byte  DASH_PUNCTUATION  Unicode category constant Pd.
static byte  DECIMAL_DIGIT_NUMBER  Unicode category constant Nd.
static byte  DIRECTIONALITY_ARABIC_NUMBER  Unicode bidirectional constant AN.
static byte  DIRECTIONALITY_BOUNDARY_NEUTRAL  Unicode bidirectional constant BN.
static byte  DIRECTIONALITY_COMMON_NUMBER_SEPARATOR  Unicode bidirectional constant CS.
static byte  DIRECTIONALITY_EUROPEAN_NUMBER  Unicode bidirectional constant EN.
static byte  DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR  Unicode bidirectional constant ES.
static byte  DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR  Unicode bidirectional constant ET.
static byte  DIRECTIONALITY_LEFT_TO_RIGHT  Unicode bidirectional constant L.
static byte  DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING  Unicode bidirectional constant LRE.
static byte  DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE  Unicode bidirectional constant LRO.
static byte  DIRECTIONALITY_NONSPACING_MARK  Unicode bidirectional constant NSM.
static byte  DIRECTIONALITY_OTHER_NEUTRALS  Unicode bidirectional constant ON.
static byte  DIRECTIONALITY_PARAGRAPH_SEPARATOR  Unicode bidirectional constant B.
static byte  DIRECTIONALITY_POP_DIRECTIONAL_FORMAT  Unicode bidirectional constant PDF.
static byte  DIRECTIONALITY_RIGHT_TO_LEFT  Unicode bidirectional constant R.
static byte  DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC  Unicode bidirectional constant AL.
static byte  DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING  Unicode bidirectional constant RLE.
static byte  DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE  Unicode bidirectional constant RLO.
static byte  DIRECTIONALITY_SEGMENT_SEPARATOR  Unicode bidirectional constant S.
static byte  DIRECTIONALITY_UNDEFINED  Unicode bidirectional constant.
static byte  DIRECTIONALITY_WHITESPACE  Unicode bidirectional constant WS.
static byte  ENCLOSING_MARK  Unicode category constant Me.
static byte  END_PUNCTUATION  Unicode category constant Pe.
static byte  FINAL_QUOTE_PUNCTUATION  Unicode category constant Pf.
static byte  FORMAT  Unicode category constant Cf.
static byte  INITIAL_QUOTE_PUNCTUATION  Unicode category constant Pi.
static byte  LETTER_NUMBER  Unicode category constant Nl.
static byte  LINE_SEPARATOR  Unicode category constant Zl.
static byte  LOWERCASE_LETTER  Unicode category constant Ll.
static byte  MATH_SYMBOL  Unicode category constant Sm.
static int  MAX_CODE_POINT  The maximum code point value, U+10FFFF.
static char  MAX_HIGH_SURROGATE  The maximum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uDBFF'.
static char  MAX_LOW_SURROGATE  The maximum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDFFF'.
static int  MAX_RADIX  The maximum radix used for conversions between characters and integers.
static char  MAX_SURROGATE  The maximum value of a surrogate unit in UTF-16 encoding, '\uDFFF'.
static char  MAX_VALUE  The maximum Character value.
static int  MIN_CODE_POINT  The minimum code point value, U+0000.
static char  MIN_HIGH_SURROGATE  The minimum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uD800'.
static char  MIN_LOW_SURROGATE  The minimum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDC00'.
static int  MIN_RADIX  The minimum radix used for conversions between characters and integers.
static int  MIN_SUPPLEMENTARY_CODE_POINT  The minimum value of a supplementary code point, U+010000.
static char  MIN_SURROGATE  The minimum value of a surrogate unit in UTF-16 encoding, '\uD800'.
static char  MIN_VALUE  The minimum Character value.
static byte  MODIFIER_LETTER  Unicode category constant Lm.
static byte  MODIFIER_SYMBOL  Unicode category constant Sk.
static byte  NON_SPACING_MARK  Unicode category constant Mn.
static byte  OTHER_LETTER  Unicode category constant Lo.
static byte  OTHER_NUMBER  Unicode category constant No.
static byte  OTHER_PUNCTUATION  Unicode category constant Po.
static byte  OTHER_SYMBOL  Unicode category constant So.
static byte  PARAGRAPH_SEPARATOR  Unicode category constant Zp.
static byte  PRIVATE_USE  Unicode category constant Co.
static int  SIZE  The number of bits required to represent a Character value in unsigned form.
static byte  SPACE_SEPARATOR  Unicode category constant Zs.
static byte  START_PUNCTUATION  Unicode category constant Ps.
static byte  SURROGATE  Unicode category constant Cs.
static byte  TITLECASE_LETTER  Unicode category constant Lt.
static byte  UNASSIGNED  Unicode category constant Cn.
static byte  UPPERCASE_LETTER  Unicode category constant Lu.
-
Constructor Summary
Constructors
Constructor  Description
Character(char value)  Constructs a new Character with the specified primitive char value.
-
Method Summary
Modifier and Type  Method  Description
static int  charCount(int codePoint)  Calculates the number of char values required to represent the specified Unicode code point.
char  charValue()  Gets the primitive value of this character.
static int  codePointAt(char[] seq, int index)  Returns the code point at index in the specified array of character units.
static int  codePointAt(char[] seq, int index, int limit)  Returns the code point at index in the specified array of character units, where index has to be less than limit.
static int  codePointBefore(char[] seq, int index)  Returns the code point that precedes index in the specified array of character units.
static int  codePointBefore(char[] seq, int index, int start)  Returns the code point that precedes the index in the specified array of character units and is not less than start.
static int  codePointCount(char[] seq, int offset, int count)  Counts the number of Unicode code points in the subsequence of the specified char array, as delineated by offset and count.
static int  compare(char lhs, char rhs)  Compares two char values.
int  compareTo(Character c)  Compares this object to the specified character object to determine their relative order.
static int  digit(char c, int radix)  Convenience method to determine the value of the specified character c in the supplied radix.
static int  digit(int codePoint, int radix)  Convenience method to determine the value of the character codePoint in the supplied radix.
boolean  equals(Object object)  Compares this object with the specified object and indicates if they are equal.
static char  forDigit(int digit, int radix)  Returns the character which represents the specified digit in the specified radix.
static int  getNumericValue(char c)  Returns the numeric value of the specified Unicode character.
static int  getNumericValue(int codePoint)  Gets the numeric value of the specified Unicode code point.
int  hashCode()  Returns an integer hash code for this object.
static char  highSurrogate(int codePoint)  Returns the high surrogate for the given code point.
static boolean  isBmpCodePoint(int codePoint)  Returns true if the given code point is in the Basic Multilingual Plane (BMP).
static boolean  isDigit(char c)  Indicates whether the specified character is a digit.
static boolean  isDigit(int codePoint)  Indicates whether the specified code point is a digit.
static boolean  isHighSurrogate(char ch)  Indicates whether ch is a high- (or leading-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
static boolean  isISOControl(char c)  Indicates whether the specified character is an ISO control character.
static boolean  isISOControl(int c)  Indicates whether the specified code point is an ISO control character.
static boolean  isLetter(char c)  Indicates whether the specified character is a letter.
static boolean  isLetter(int codePoint)  Indicates whether the specified code point is a letter.
static boolean  isLetterOrDigit(char c)  Indicates whether the specified character is a letter or a digit.
static boolean  isLetterOrDigit(int codePoint)  Indicates whether the specified code point is a letter or a digit.
static boolean  isLowerCase(char c)  Indicates whether the specified character is a lower case letter.
static boolean  isLowerCase(int codePoint)  Indicates whether the specified code point is a lower case letter.
static boolean  isLowSurrogate(char ch)  Indicates whether ch is a low- (or trailing-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
static boolean  isSpace(char c)  Deprecated. Use isWhitespace(char) instead.
static boolean  isSpaceChar(char c)  See isSpaceChar(int).
static boolean  isSpaceChar(int codePoint)  Returns true if the given code point is a Unicode space character.
static boolean  isSupplementaryCodePoint(int codePoint)  Indicates whether codePoint is within the supplementary code point range.
static boolean  isSurrogate(char ch)  Returns true if the given character is a high or low surrogate.
static boolean  isSurrogatePair(char high, char low)  Indicates whether the specified character pair is a valid surrogate pair.
static boolean  isUpperCase(char c)  Indicates whether the specified character is an upper case letter.
static boolean  isUpperCase(int codePoint)  Indicates whether the specified code point is an upper case letter.
static boolean  isValidCodePoint(int codePoint)  Indicates whether codePoint is a valid Unicode code point.
static boolean  isWhitespace(char c)  See isWhitespace(int).
static boolean  isWhitespace(int codePoint)  Returns true if the given code point is a Unicode whitespace character.
static char  lowSurrogate(int codePoint)  Returns the low surrogate for the given code point.
static int  offsetByCodePoints(char[] seq, int start, int count, int index, int codePointOffset)  Determines the index in a subsequence of the specified character array that is offset codePointOffset code points from index.
static char  reverseBytes(char c)  Reverses the order of the first and second byte in the specified character.
static char[]  toChars(int codePoint)  Converts the specified Unicode code point into a UTF-16 encoded sequence and returns it as a char array.
static int  toChars(int codePoint, char[] dst, int dstIndex)  Converts the specified Unicode code point into a UTF-16 encoded sequence and copies the value(s) into the char array dst, starting at index dstIndex.
static int  toCodePoint(char high, char low)  Converts a surrogate pair into a Unicode code point.
static char  toLowerCase(char c)  Returns the lower case equivalent for the specified character if the character is an upper case letter.
static int  toLowerCase(int codePoint)  Returns the lower case equivalent for the specified code point if it is an upper case letter.
String  toString()  Returns a string containing a concise, human-readable description of this object.
static String  toString(char value)  Converts the specified character to its string representation.
static char  toUpperCase(char c)  Returns the upper case equivalent for the specified character if the character is a lower case letter.
static int  toUpperCase(int codePoint)  Returns the upper case equivalent for the specified code point if the code point is a lower case letter.
static Character  valueOf(char c)  Returns a Character instance for the char value passed.
-
-
-
Field Detail
-
MIN_VALUE
public static final char MIN_VALUE
The minimum Character value.
- See Also:
- Constant Field Values
-
MAX_VALUE
public static final char MAX_VALUE
The maximum Character value.
- See Also:
- Constant Field Values
-
MIN_RADIX
public static final int MIN_RADIX
The minimum radix used for conversions between characters and integers.
- See Also:
- Constant Field Values
-
MAX_RADIX
public static final int MAX_RADIX
The maximum radix used for conversions between characters and integers.
- See Also:
- Constant Field Values
-
UNASSIGNED
public static final byte UNASSIGNED
Unicode category constant Cn.
- See Also:
- Constant Field Values
-
UPPERCASE_LETTER
public static final byte UPPERCASE_LETTER
Unicode category constant Lu.
- See Also:
- Constant Field Values
-
LOWERCASE_LETTER
public static final byte LOWERCASE_LETTER
Unicode category constant Ll.
- See Also:
- Constant Field Values
-
TITLECASE_LETTER
public static final byte TITLECASE_LETTER
Unicode category constant Lt.
- See Also:
- Constant Field Values
-
MODIFIER_LETTER
public static final byte MODIFIER_LETTER
Unicode category constant Lm.
- See Also:
- Constant Field Values
-
OTHER_LETTER
public static final byte OTHER_LETTER
Unicode category constant Lo.
- See Also:
- Constant Field Values
-
NON_SPACING_MARK
public static final byte NON_SPACING_MARK
Unicode category constant Mn.
- See Also:
- Constant Field Values
-
ENCLOSING_MARK
public static final byte ENCLOSING_MARK
Unicode category constant Me.
- See Also:
- Constant Field Values
-
COMBINING_SPACING_MARK
public static final byte COMBINING_SPACING_MARK
Unicode category constant Mc.
- See Also:
- Constant Field Values
-
DECIMAL_DIGIT_NUMBER
public static final byte DECIMAL_DIGIT_NUMBER
Unicode category constant Nd.
- See Also:
- Constant Field Values
-
LETTER_NUMBER
public static final byte LETTER_NUMBER
Unicode category constant Nl.
- See Also:
- Constant Field Values
-
OTHER_NUMBER
public static final byte OTHER_NUMBER
Unicode category constant No.
- See Also:
- Constant Field Values
-
SPACE_SEPARATOR
public static final byte SPACE_SEPARATOR
Unicode category constant Zs.
- See Also:
- Constant Field Values
-
LINE_SEPARATOR
public static final byte LINE_SEPARATOR
Unicode category constant Zl.
- See Also:
- Constant Field Values
-
PARAGRAPH_SEPARATOR
public static final byte PARAGRAPH_SEPARATOR
Unicode category constant Zp.
- See Also:
- Constant Field Values
-
CONTROL
public static final byte CONTROL
Unicode category constant Cc.
- See Also:
- Constant Field Values
-
FORMAT
public static final byte FORMAT
Unicode category constant Cf.
- See Also:
- Constant Field Values
-
PRIVATE_USE
public static final byte PRIVATE_USE
Unicode category constant Co.
- See Also:
- Constant Field Values
-
SURROGATE
public static final byte SURROGATE
Unicode category constant Cs.
- See Also:
- Constant Field Values
-
DASH_PUNCTUATION
public static final byte DASH_PUNCTUATION
Unicode category constant Pd.
- See Also:
- Constant Field Values
-
START_PUNCTUATION
public static final byte START_PUNCTUATION
Unicode category constant Ps.
- See Also:
- Constant Field Values
-
END_PUNCTUATION
public static final byte END_PUNCTUATION
Unicode category constant Pe.
- See Also:
- Constant Field Values
-
CONNECTOR_PUNCTUATION
public static final byte CONNECTOR_PUNCTUATION
Unicode category constant Pc.
- See Also:
- Constant Field Values
-
OTHER_PUNCTUATION
public static final byte OTHER_PUNCTUATION
Unicode category constant Po.
- See Also:
- Constant Field Values
-
MATH_SYMBOL
public static final byte MATH_SYMBOL
Unicode category constant Sm.
- See Also:
- Constant Field Values
-
CURRENCY_SYMBOL
public static final byte CURRENCY_SYMBOL
Unicode category constant Sc.
- See Also:
- Constant Field Values
-
MODIFIER_SYMBOL
public static final byte MODIFIER_SYMBOL
Unicode category constant Sk.
- See Also:
- Constant Field Values
-
OTHER_SYMBOL
public static final byte OTHER_SYMBOL
Unicode category constant So.
- See Also:
- Constant Field Values
-
INITIAL_QUOTE_PUNCTUATION
public static final byte INITIAL_QUOTE_PUNCTUATION
Unicode category constant Pi.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
FINAL_QUOTE_PUNCTUATION
public static final byte FINAL_QUOTE_PUNCTUATION
Unicode category constant Pf.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_UNDEFINED
public static final byte DIRECTIONALITY_UNDEFINED
Unicode bidirectional constant.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_LEFT_TO_RIGHT
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT
Unicode bidirectional constant L.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_RIGHT_TO_LEFT
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT
Unicode bidirectional constant R.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_ARABIC
Unicode bidirectional constant AL.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_EUROPEAN_NUMBER
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER
Unicode bidirectional constant EN.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_SEPARATOR
Unicode bidirectional constant ES.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
public static final byte DIRECTIONALITY_EUROPEAN_NUMBER_TERMINATOR
Unicode bidirectional constant ET.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_ARABIC_NUMBER
public static final byte DIRECTIONALITY_ARABIC_NUMBER
Unicode bidirectional constant AN.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
public static final byte DIRECTIONALITY_COMMON_NUMBER_SEPARATOR
Unicode bidirectional constant CS.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_NONSPACING_MARK
public static final byte DIRECTIONALITY_NONSPACING_MARK
Unicode bidirectional constant NSM.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_BOUNDARY_NEUTRAL
public static final byte DIRECTIONALITY_BOUNDARY_NEUTRAL
Unicode bidirectional constant BN.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_PARAGRAPH_SEPARATOR
public static final byte DIRECTIONALITY_PARAGRAPH_SEPARATOR
Unicode bidirectional constant B.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_SEGMENT_SEPARATOR
public static final byte DIRECTIONALITY_SEGMENT_SEPARATOR
Unicode bidirectional constant S.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_WHITESPACE
public static final byte DIRECTIONALITY_WHITESPACE
Unicode bidirectional constant WS.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_OTHER_NEUTRALS
public static final byte DIRECTIONALITY_OTHER_NEUTRALS
Unicode bidirectional constant ON.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_EMBEDDING
Unicode bidirectional constant LRE.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
public static final byte DIRECTIONALITY_LEFT_TO_RIGHT_OVERRIDE
Unicode bidirectional constant LRO.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_EMBEDDING
Unicode bidirectional constant RLE.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
public static final byte DIRECTIONALITY_RIGHT_TO_LEFT_OVERRIDE
Unicode bidirectional constant RLO.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
public static final byte DIRECTIONALITY_POP_DIRECTIONAL_FORMAT
Unicode bidirectional constant PDF.
- Since:
- 1.4
- See Also:
- Constant Field Values
-
MIN_HIGH_SURROGATE
public static final char MIN_HIGH_SURROGATE
The minimum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uD800'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MAX_HIGH_SURROGATE
public static final char MAX_HIGH_SURROGATE
The maximum value of a high surrogate or leading surrogate unit in UTF-16 encoding, '\uDBFF'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MIN_LOW_SURROGATE
public static final char MIN_LOW_SURROGATE
The minimum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDC00'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MAX_LOW_SURROGATE
public static final char MAX_LOW_SURROGATE
The maximum value of a low surrogate or trailing surrogate unit in UTF-16 encoding, '\uDFFF'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MIN_SURROGATE
public static final char MIN_SURROGATE
The minimum value of a surrogate unit in UTF-16 encoding, '\uD800'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MAX_SURROGATE
public static final char MAX_SURROGATE
The maximum value of a surrogate unit in UTF-16 encoding, '\uDFFF'.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MIN_SUPPLEMENTARY_CODE_POINT
public static final int MIN_SUPPLEMENTARY_CODE_POINT
The minimum value of a supplementary code point, U+010000.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MIN_CODE_POINT
public static final int MIN_CODE_POINT
The minimum code point value, U+0000.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
MAX_CODE_POINT
public static final int MAX_CODE_POINT
The maximum code point value, U+10FFFF.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
SIZE
public static final int SIZE
The number of bits required to represent a Character value in unsigned form.
- Since:
- 1.5
- See Also:
- Constant Field Values
-
-
Method Detail
-
charValue
public char charValue()
Gets the primitive value of this character.
- Returns:
- this object's primitive value.
-
compareTo
public int compareTo(Character c)
Compares this object to the specified character object to determine their relative order.
- Specified by:
- compareTo in interface Comparable<Character>
- Parameters:
- c - the character object to compare this object to.
- Returns:
- 0 if the value of this character and the value of c are equal; a positive value if the value of this character is greater than the value of c; a negative value if the value of this character is less than the value of c.
- Since:
- 1.2
- See Also:
Comparable
-
compare
public static int compare(char lhs, char rhs)
Compares two char values.
- Parameters:
- lhs - First value.
- rhs - Second value.
- Returns:
- 0 if lhs = rhs, less than 0 if lhs < rhs, and greater than 0 if lhs > rhs.
- Since:
- 1.7
-
valueOf
public static Character valueOf(char c)
Returns a Character instance for the char value passed.
- Parameters:
- c - the char value for which to get a Character instance.
- Returns:
- the Character instance for c.
- Since:
- 1.5
-
isValidCodePoint
public static boolean isValidCodePoint(int codePoint)
Indicates whether codePoint is a valid Unicode code point.
- Parameters:
- codePoint - the code point to test.
- Returns:
- true if codePoint is a valid Unicode code point; false otherwise.
- Since:
- 1.5
-
isSupplementaryCodePoint
public static boolean isSupplementaryCodePoint(int codePoint)
Indicates whether codePoint is within the supplementary code point range.
- Parameters:
- codePoint - the code point to test.
- Returns:
- true if codePoint is within the supplementary code point range; false otherwise.
- Since:
- 1.5
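To make the distinction between valid, BMP, and supplementary code points concrete, here is a small sketch that combines isValidCodePoint(int) and isSupplementaryCodePoint(int). The class name CodePointKind and the returned labels are made up for this example.

```java
final class CodePointKind {
    // Returns "invalid", "bmp", or "supplementary" for the given value.
    static String classify(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            return "invalid"; // outside U+0000..U+10FFFF
        }
        return Character.isSupplementaryCodePoint(codePoint)
                ? "supplementary" // U+10000..U+10FFFF
                : "bmp";          // U+0000..U+FFFF
    }

    public static void main(String[] args) {
        System.out.println(classify('A'));
        System.out.println(classify(0x1F600));
        System.out.println(classify(0x110000));
    }
}
```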
-
isHighSurrogate
public static boolean isHighSurrogate(char ch)
Indicates whether ch is a high- (or leading-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
- Parameters:
- ch - the character to test.
- Returns:
- true if ch is a high-surrogate code unit; false otherwise.
- Since:
- 1.5
- See Also:
isLowSurrogate(char)
-
isLowSurrogate
public static boolean isLowSurrogate(char ch)
Indicates whether ch is a low- (or trailing-) surrogate code unit that is used for representing supplementary characters in UTF-16 encoding.
- Parameters:
- ch - the character to test.
- Returns:
- true if ch is a low-surrogate code unit; false otherwise.
- Since:
- 1.5
- See Also:
isHighSurrogate(char)
-
isSurrogate
public static boolean isSurrogate(char ch)
Returns true if the given character is a high or low surrogate.
- Parameters:
- ch - the character to test.
- Returns:
- true if ch is a high or low surrogate; false otherwise.
- Since:
- 1.7
-
isSurrogatePair
public static boolean isSurrogatePair(char high, char low)
Indicates whether the specified character pair is a valid surrogate pair.
- Parameters:
- high - the high surrogate unit to test.
- low - the low surrogate unit to test.
- Returns:
- true if high is a high-surrogate code unit and low is a low-surrogate code unit; false otherwise.
- Since:
- 1.5
- See Also:
- isHighSurrogate(char), isLowSurrogate(char)
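One possible use of the surrogate tests is scanning a char array for surrogates that are not part of a valid pair. The sketch below is an illustration only; the class name SurrogateScan and the method firstUnpaired are hypothetical.

```java
final class SurrogateScan {
    // Returns the index of the first unpaired surrogate in seq, or -1 if there is none.
    static int firstUnpaired(char[] seq) {
        int i = 0;
        while (i < seq.length) {
            char c = seq[i];
            if (i + 1 < seq.length && Character.isSurrogatePair(c, seq[i + 1])) {
                i += 2; // valid high/low pair, skip both units
            } else if (Character.isSurrogate(c)) {
                return i; // lone high or low surrogate
            } else {
                i++; // ordinary BMP character
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstUnpaired("a\uD83D\uDE00b".toCharArray()));
        System.out.println(firstUnpaired("a\uD83Db".toCharArray()));
    }
}
```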
-
charCount
public static int charCount(int codePoint)
Calculates the number of char values required to represent the specified Unicode code point. This method checks if the codePoint is greater than or equal to 0x10000, in which case 2 is returned, otherwise 1. To test if the code point is valid, use the isValidCodePoint(int) method.
- Parameters:
- codePoint - the code point for which to calculate the number of required chars.
- Returns:
- 2 if codePoint >= 0x10000; 1 otherwise.
- Since:
- 1.5
- See Also:
- isValidCodePoint(int), isSupplementaryCodePoint(int)
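A common use of charCount(int) is stepping through a char array one code point at a time. The following is a minimal sketch that pairs it with codePointAt(char[], int); the class name CodePointWalk and the method codePoints are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class CodePointWalk {
    // Collects the code points of seq in order, advancing by 1 or 2 chars per step.
    static List<Integer> codePoints(char[] seq) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < seq.length; ) {
            int cp = Character.codePointAt(seq, i);
            out.add(cp);
            i += Character.charCount(cp); // 2 for supplementary code points, 1 otherwise
        }
        return out;
    }

    public static void main(String[] args) {
        for (int cp : codePoints("a\uD83D\uDE00".toCharArray())) {
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase());
        }
    }
}
```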
-
toCodePoint
public static int toCodePoint(char high, char low)
Converts a surrogate pair into a Unicode code point. This method assumes that the pair are valid surrogates. If the pair are not valid surrogates, then the result is indeterminate. The isSurrogatePair(char, char) method should be used prior to this method to validate the pair.
- Parameters:
- high - the high surrogate unit.
- low - the low surrogate unit.
- Returns:
- the Unicode code point corresponding to the surrogate unit pair.
- Since:
- 1.5
- See Also:
isSurrogatePair(char, char)
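Because toCodePoint(char, char) does not validate its arguments, a caller might wrap it as sketched below. The class PairDecoder, the -1 sentinel, and the decodeManually helper (which spells out the standard UTF-16 arithmetic for comparison) are assumptions made for this example, not part of the API.

```java
final class PairDecoder {
    // Returns the code point for a valid surrogate pair, or -1 if the pair is not valid.
    static int decode(char high, char low) {
        if (!Character.isSurrogatePair(high, low)) {
            return -1;
        }
        return Character.toCodePoint(high, low);
    }

    // The same conversion written out by hand; only meaningful for a valid pair.
    static int decodeManually(char high, char low) {
        return ((high - Character.MIN_HIGH_SURROGATE) << 10)
                + (low - Character.MIN_LOW_SURROGATE)
                + Character.MIN_SUPPLEMENTARY_CODE_POINT;
    }

    public static void main(String[] args) {
        System.out.println(Integer.toHexString(decode('\uD83D', '\uDE00')));
        System.out.println(decode('a', 'b'));
    }
}
```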
-
codePointAt
public static int codePointAt(char[] seq, int index)
Returns the code point at index in the specified array of character units. If the unit at index is a high-surrogate unit, index + 1 is less than the length of the array and the unit at index + 1 is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index is returned.
- Parameters:
- seq - the source array of char units.
- index - the position in seq from which to retrieve the code point.
- Returns:
- the Unicode code point or char value at index in seq.
- Throws:
- NullPointerException - if seq is null.
- IndexOutOfBoundsException - if the index is negative or greater than or equal to the length of seq.
- Since:
- 1.5
-
codePointAt
public static int codePointAt(char[] seq, int index, int limit)
Returns the code point at index in the specified array of character units, where index has to be less than limit. If the unit at index is a high-surrogate unit, index + 1 is less than limit and the unit at index + 1 is a low-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index is returned.
- Parameters:
- seq - the source array of char units.
- index - the position in seq from which to get the code point.
- limit - the index after the last unit in seq that can be used.
- Returns:
- the Unicode code point or char value at index in seq.
- Throws:
- NullPointerException - if seq is null.
- IndexOutOfBoundsException - if index < 0, index >= limit, limit < 0 or if limit is greater than the length of seq.
- Since:
- 1.5
-
codePointBefore
public static int codePointBefore(char[] seq, int index)
Returns the code point that precedes index in the specified array of character units. If the unit at index - 1 is a low-surrogate unit, index - 2 is not negative and the unit at index - 2 is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index - 1 is returned.
- Parameters:
- seq - the source array of char units.
- index - the position in seq following the code point that should be returned.
- Returns:
- the Unicode code point or char value before index in seq.
- Throws:
- NullPointerException - if seq is null.
- IndexOutOfBoundsException - if the index is less than 1 or greater than the length of seq.
- Since:
- 1.5
-
codePointBefore
public static int codePointBefore(char[] seq, int index, int start)
Returns the code point that precedes the index in the specified array of character units and is not less than start. If the unit at index - 1 is a low-surrogate unit, index - 2 is not less than start and the unit at index - 2 is a high-surrogate unit, then the supplementary code point represented by the pair is returned; otherwise the char value at index - 1 is returned.
- Parameters:
- seq - the source array of char units.
- index - the position in seq following the code point that should be returned.
- start - the index of the first element in seq.
- Returns:
- the Unicode code point or char value before index in seq.
- Throws:
- NullPointerException - if seq is null.
- IndexOutOfBoundsException - if index <= start, start < 0, index is greater than the length of seq, or if start is equal to or greater than the length of seq.
- Since:
- 1.5
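As a sketch of how codePointBefore(char[], int) supports backward traversal, the example below reverses a char array by code point so that surrogate pairs stay intact. The class name ReverseWalk and the method reverse are hypothetical.

```java
final class ReverseWalk {
    // Builds a string holding the code points of seq in reverse order.
    static String reverse(char[] seq) {
        StringBuilder sb = new StringBuilder(seq.length);
        for (int i = seq.length; i > 0; ) {
            int cp = Character.codePointBefore(seq, i);
            sb.appendCodePoint(cp);
            i -= Character.charCount(cp); // step back over 1 or 2 chars
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverse("abc".toCharArray()));
    }
}
```

Reversing char by char instead would swap the high and low surrogates of each pair and produce invalid UTF-16.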
-
toChars
public static int toChars(int codePoint, char[] dst, int dstIndex)
Converts the specified Unicode code point into a UTF-16 encoded sequence and copies the value(s) into the char array dst, starting at index dstIndex.
- Parameters:
- codePoint - the Unicode code point to encode.
- dst - the destination array to copy the encoded value into.
- dstIndex - the index in dst from where to start copying.
- Returns:
- the number of char value units copied into dst.
- Throws:
- IllegalArgumentException - if codePoint is not a valid code point.
- NullPointerException - if dst is null.
- IndexOutOfBoundsException - if dstIndex is negative, greater than or equal to dst.length, or equals dst.length - 1 when codePoint is a supplementary code point.
- Since:
- 1.5
-
toChars
public static char[] toChars(int codePoint)
Converts the specified Unicode code point into a UTF-16 encoded sequence and returns it as a char array.
- Parameters:
- codePoint - the Unicode code point to encode.
- Returns:
- the UTF-16 encoded char sequence. If codePoint is a supplementary code point, then the returned array contains two characters, otherwise it contains just one character.
- Throws:
- IllegalArgumentException - if codePoint is not a valid code point.
- Since:
- 1.5
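The two toChars overloads can be used roughly as follows. This is a sketch, not a prescribed pattern; the class name Encode and both helper methods are made up for the example, and the buffer is sized for the worst case of two chars per code point.

```java
import java.util.Arrays;

final class Encode {
    // Encodes a single code point as a String of one or two chars.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Encodes several code points into one array using the three-argument overload.
    static char[] encodeAll(int[] codePoints) {
        char[] buf = new char[codePoints.length * 2]; // worst case: all supplementary
        int len = 0;
        for (int cp : codePoints) {
            len += Character.toChars(cp, buf, len); // returns the number of chars written
        }
        return Arrays.copyOf(buf, len);
    }

    public static void main(String[] args) {
        System.out.println(fromCodePoint(0x41));
        System.out.println(encodeAll(new int[] {0x61, 0x1F600}).length);
    }
}
```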
-
codePointCount
public static int codePointCount(char[] seq, int offset, int count)
Counts the number of Unicode code points in the subsequence of the specified char array, as delineated by offset and count. Any surrogate values with missing pair values will be counted as one code point.
- Parameters:
- seq - the char array to look through.
- offset - the inclusive index to begin counting at.
- count - the number of char values to look through in seq.
- Returns:
- the number of Unicode code points.
- Throws:
- NullPointerException - if seq is null.
- IndexOutOfBoundsException - if offset < 0, count < 0 or if offset + count is greater than the length of seq.
- Since:
- 1.5
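The counting rule can be sketched as follows. The class CodePointCountSketch and its count helper are hypothetical names used only for this example; the sample array mixes BMP characters, one surrogate pair, and one unpaired high surrogate.

```java
public class CodePointCountSketch {
    // Hypothetical helper: counts code points over the whole array.
    static int count(char[] seq) {
        return Character.codePointCount(seq, 0, seq.length);
    }

    public static void main(String[] args) {
        // 'a', one surrogate pair (U+1F600), 'b', and a lone high surrogate.
        char[] seq = {'a', '\uD83D', '\uDE00', 'b', '\uD83D'};
        System.out.println(seq.length);   // 5 char values
        System.out.println(count(seq));   // 4 code points
        // A window that cuts the pair in half counts the lone half as one.
        System.out.println(Character.codePointCount(seq, 0, 2)); // 2
    }
}
```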
-
offsetByCodePoints
public static int offsetByCodePoints(char[] seq, int start, int count, int index, int codePointOffset)
Determines the index in a subsequence of the specified character array that is offset codePointOffset code points from index. The subsequence is delineated by start and count.
- Parameters:
seq - the character array to find the index in.
start - the inclusive index that marks the beginning of the subsequence.
count - the number of char values to include within the subsequence.
index - the start index in the subsequence of the char array.
codePointOffset - the number of code points to look backwards or forwards; may be a negative or positive value.
- Returns:
- the index in seq that is codePointOffset code points away from index.
- Throws:
NullPointerException - if seq is null.
IndexOutOfBoundsException - if start < 0, count < 0, index < start, index > start + count, start + count is greater than the length of seq, or if there are not enough values in seq to skip codePointOffset code points forward or backward (if codePointOffset is negative) from index.
- Since:
- 1.5
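A small sketch of moving forward and backward by code points may help here. OffsetSketch and its move helper are hypothetical names for this example, and the subsequence is assumed to span the whole array.

```java
public class OffsetSketch {
    // Hypothetical helper: moves `steps` code points from `index`,
    // treating the entire array as the subsequence.
    static int move(char[] seq, int index, int steps) {
        return Character.offsetByCodePoints(seq, 0, seq.length, index, steps);
    }

    public static void main(String[] args) {
        // 'a', one surrogate pair (U+1F600), 'b'
        char[] seq = {'a', '\uD83D', '\uDE00', 'b'};
        System.out.println(move(seq, 0, 2));   // 3: skips 'a' and the pair
        System.out.println(move(seq, 4, -2));  // 1: steps back over 'b' and the pair
    }
}
```

Asking for more code points than the subsequence holds, for example move(seq, 0, 4) on the array above, falls under the IndexOutOfBoundsException case listed in the Throws section.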
-
digit
public static int digit(char c, int radix)
Convenience method to determine the value of the specified character c in the supplied radix. The value of radix must be between MIN_RADIX and MAX_RADIX.
-
digit
public static int digit(int codePoint, int radix)
Convenience method to determine the value of the character codePoint in the supplied radix. The value of radix must be between MIN_RADIX and MAX_RADIX.
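One common use of digit is hand-written number parsing. The sketch below is a minimal example under stated assumptions: the class DigitSketch and its parseHex helper are hypothetical, the input is assumed to be a short non-negative hexadecimal string, and the code relies on digit returning -1 for a character that is not a digit in the given radix (the behavior of java.lang.Character as commonly documented; it is not stated in the text above).

```java
public class DigitSketch {
    // Hypothetical helper: parses a hexadecimal string without overflow checks.
    // An empty string yields 0 in this sketch.
    static int parseHex(String s) {
        int value = 0;
        for (int i = 0; i < s.length(); i++) {
            int d = Character.digit(s.charAt(i), 16);
            if (d < 0) {
                // Assumption: -1 signals "not a digit in this radix".
                throw new NumberFormatException("not a hex digit at index " + i);
            }
            value = value * 16 + d;
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(parseHex("ff"));           // 255
        System.out.println(parseHex("1A"));           // 26
        System.out.println(Character.digit('z', 36)); // 35
        System.out.println(Character.digit('9', 8));  // -1
    }
}
```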
-
equals
public boolean equals(Object object)
Compares this object with the specified object and indicates if they are equal. In order to be equal, object must be an instance of Character and have the same char value as this object.
- Overrides:
equals in class Object
- Parameters:
object - the object to compare this character with.
- Returns:
true if the specified object is equal to this Character; false otherwise.
- See Also:
Object.hashCode()
-
forDigit
public static char forDigit(int digit, int radix)
Returns the character which represents the specified digit in the specified radix. The radix must be between MIN_RADIX and MAX_RADIX inclusive; digit must be non-negative and smaller than radix. If any of these conditions does not hold, the char value 0 (U+0000) is returned.
- Parameters:
digit - the integer value.
radix - the radix.
- Returns:
- the character which represents the digit in the radix.
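forDigit is the inverse of digit, which makes it convenient for formatting numbers by hand. The following is a minimal sketch; ForDigitSketch and toRadix are hypothetical names, and the helper only handles non-negative values.

```java
public class ForDigitSketch {
    // Hypothetical helper: formats a non-negative int in the given radix.
    static String toRadix(int value, int radix) {
        if (value < 0 || radix < Character.MIN_RADIX || radix > Character.MAX_RADIX) {
            throw new IllegalArgumentException("unsupported value or radix");
        }
        if (value == 0) {
            return "0";
        }
        StringBuilder sb = new StringBuilder();
        while (value > 0) {
            // value % radix is always a valid digit for this radix.
            sb.append(Character.forDigit(value % radix, radix));
            value /= radix;
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(toRadix(255, 16)); // ff
        System.out.println(toRadix(5, 2));    // 101
        // Out-of-range digit: the result is the char with value 0.
        System.out.println(Character.forDigit(16, 16) == (char) 0); // true
    }
}
```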
-
getNumericValue
public static int getNumericValue(char c)
Returns the numeric value of the specified Unicode character. See getNumericValue(int).
- Parameters:
c - the character
- Returns:
- a non-negative numeric integer value if a numeric value for c exists, -1 if there is no numeric value for c, -2 if the numeric value cannot be represented as an integer.
-
getNumericValue
public static int getNumericValue(int codePoint)
Gets the numeric value of the specified Unicode code point. For example, the code point 'Ⅻ' stands for the Roman numeral XII, which has the numeric value 12.
There are two points of divergence between this method and the Unicode specification. This method treats the letters a-z (in both upper and lower cases, and their full-width variants) as numbers from 10 to 35. The Unicode specification also supports the idea of code points with non-integer numeric values; this method does not (except to the extent of returning -2 for such code points).
- Parameters:
codePoint - the code point
- Returns:
- a non-negative numeric integer value if a numeric value for codePoint exists, -1 if there is no numeric value for codePoint, -2 if the numeric value cannot be represented as an integer.
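The three kinds of return value can be illustrated with a short sketch. NumericValueSketch and describe are hypothetical names; the sample code points are U+216B (the Roman numeral twelve from the description), the letter a, an exclamation mark, and U+00BD (vulgar fraction one half) as an example of a non-integer value.

```java
public class NumericValueSketch {
    // Hypothetical helper: turns the return value into a readable label.
    static String describe(int codePoint) {
        int v = Character.getNumericValue(codePoint);
        if (v >= 0) {
            return "value " + v;
        }
        return v == -1 ? "no numeric value" : "not representable";
    }

    public static void main(String[] args) {
        System.out.println(describe(0x216B)); // value 12
        System.out.println(describe('a'));    // value 10
        System.out.println(describe('!'));    // no numeric value
        System.out.println(describe(0x00BD)); // not representable
    }
}
```

Because letters map to 10 through 35, code that only wants decimal digits would likely be better served by digit with radix 10 than by this method.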
-
hashCode
public int hashCode()
Description copied from class: Object
Returns an integer hash code for this object. By contract, any two objects for which Object.equals(java.lang.Object) returns true must return the same hash code value. This means that subclasses of Object usually override both methods or neither method.
Note that hash values must not change over time unless information used in equals comparisons also changes.
See Writing a correct hashCode method if you intend to implement your own hashCode method.
- Overrides:
hashCode in class Object
- Returns:
- this object's hash code.
- See Also:
Object.equals(java.lang.Object)
-
highSurrogate
public static char highSurrogate(int codePoint)
Returns the high surrogate for the given code point. The result is meaningless if the given code point is not a supplementary character.
- Parameters:
codePoint - Code point.
- Returns:
- High surrogate.
- Since:
- 1.7
-
lowSurrogate
public static char lowSurrogate(int codePoint)
Returns the low surrogate for the given code point. The result is meaningless if the given code point is not a supplementary character.
- Parameters:
codePoint - Code point.
- Returns:
- Low surrogate.
- Since:
- 1.7
-
isBmpCodePoint
public static boolean isBmpCodePoint(int codePoint)
Returns true if the given code point is in the Basic Multilingual Plane (BMP). Such code points can be represented by a single char.
- Parameters:
codePoint - Code point.
- Returns:
true if the code point is in the BMP.
- Since:
- 1.7
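Since highSurrogate and lowSurrogate are only meaningful for supplementary code points, a caller would normally guard them. The sketch below shows one way to do that; SurrogateSketch and split are hypothetical names, and the guard uses Character.isSupplementaryCodePoint.

```java
public class SurrogateSketch {
    // Hypothetical helper: splits a supplementary code point into its
    // surrogate pair, rejecting anything else.
    static char[] split(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        return new char[] {
            Character.highSurrogate(codePoint),
            Character.lowSurrogate(codePoint)
        };
    }

    public static void main(String[] args) {
        char[] pair = split(0x1F600);
        System.out.println(Integer.toHexString(pair[0])); // d83d
        System.out.println(Integer.toHexString(pair[1])); // de00
        // Recombining the pair gives back the original code point.
        System.out.println(Character.toCodePoint(pair[0], pair[1]) == 0x1F600); // true
        System.out.println(Character.isBmpCodePoint('A'));     // true
        System.out.println(Character.isBmpCodePoint(0x1F600)); // false
    }
}
```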
-
isDigit
public static boolean isDigit(char c)
Indicates whether the specified character is a digit.
- Parameters:
c - the character to check.
- Returns:
true if c is a digit; false otherwise.
-
isDigit
public static boolean isDigit(int codePoint)
Indicates whether the specified code point is a digit.
- Parameters:
codePoint - the code point to check.
- Returns:
true if codePoint is a digit; false otherwise.
-
isISOControl
public static boolean isISOControl(char c)
Indicates whether the specified character is an ISO control character.
- Parameters:
c - the character to check.
- Returns:
true if c is an ISO control character; false otherwise.
-
isISOControl
public static boolean isISOControl(int c)
Indicates whether the specified code point is an ISO control character.
- Parameters:
c - the code point to check.
- Returns:
true if c is an ISO control character; false otherwise.
-
isLetter
public static boolean isLetter(char c)
Indicates whether the specified character is a letter.
- Parameters:
c - the character to check.
- Returns:
true if c is a letter; false otherwise.
-
isLetter
public static boolean isLetter(int codePoint)
Indicates whether the specified code point is a letter.
- Parameters:
codePoint - the code point to check.
- Returns:
true if codePoint is a letter; false otherwise.
-
isLetterOrDigit
public static boolean isLetterOrDigit(char c)
Indicates whether the specified character is a letter or a digit.
- Parameters:
c - the character to check.
- Returns:
true if c is a letter or a digit; false otherwise.
-
isLetterOrDigit
public static boolean isLetterOrDigit(int codePoint)
Indicates whether the specified code point is a letter or a digit.
- Parameters:
codePoint - the code point to check.
- Returns:
true if codePoint is a letter or a digit; false otherwise.
-
isLowerCase
public static boolean isLowerCase(char c)
Indicates whether the specified character is a lower case letter.
- Parameters:
c - the character to check.
- Returns:
true if c is a lower case letter; false otherwise.
-
isLowerCase
public static boolean isLowerCase(int codePoint)
Indicates whether the specified code point is a lower case letter.
- Parameters:
codePoint - the code point to check.
- Returns:
true if codePoint is a lower case letter; false otherwise.
-
isSpace
public static boolean isSpace(char c)
Deprecated. Use isWhitespace(char) instead.
- Parameters:
c - Character
- Returns:
true if white space.
-
isSpaceChar
public static boolean isSpaceChar(char c)
See isSpaceChar(int).
- Parameters:
c - Character
- Returns:
true if space character.
-
isSpaceChar
public static boolean isSpaceChar(int codePoint)
Returns true if the given code point is a Unicode space character. The exact set of characters considered as whitespace varies with Unicode version. Note that non-breaking spaces are considered whitespace. Note also that line separators are not considered whitespace; see isWhitespace(char) for an alternative.
- Parameters:
codePoint - Code point.
- Returns:
true if space character.
-
isUpperCase
public static boolean isUpperCase(char c)
Indicates whether the specified character is an upper case letter.
- Parameters:
c - the character to check.
- Returns:
true if c is an upper case letter; false otherwise.
-
isUpperCase
public static boolean isUpperCase(int codePoint)
Indicates whether the specified code point is an upper case letter.
- Parameters:
codePoint - the code point to check.
- Returns:
true if codePoint is an upper case letter; false otherwise.
-
isWhitespace
public static boolean isWhitespace(char c)
See isWhitespace(int).
- Parameters:
c - Character
- Returns:
true if white space.
-
isWhitespace
public static boolean isWhitespace(int codePoint)
Returns true if the given code point is a Unicode whitespace character. The exact set of characters considered as whitespace varies with Unicode version. Note that non-breaking spaces are not considered whitespace. Note also that line separators are considered whitespace; see isSpaceChar(char) for an alternative.
- Parameters:
codePoint - Code point.
- Returns:
true if white space.
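The difference between isWhitespace and isSpaceChar is easiest to see side by side. The following sketch is illustrative only: WhitespaceSketch and classify are hypothetical names, and the three sample characters (an ordinary space, a line feed, and the non-breaking space U+00A0) were chosen for this example.

```java
public class WhitespaceSketch {
    // Hypothetical helper: reports how both predicates classify a code point.
    static String classify(int codePoint) {
        boolean ws = Character.isWhitespace(codePoint);
        boolean sp = Character.isSpaceChar(codePoint);
        return "isWhitespace=" + ws + ", isSpaceChar=" + sp;
    }

    public static void main(String[] args) {
        // Ordinary space: both predicates agree.
        System.out.println(classify(' '));      // isWhitespace=true, isSpaceChar=true
        // Line feed: whitespace, but not a Unicode space character.
        System.out.println(classify('\n'));     // isWhitespace=true, isSpaceChar=false
        // Non-breaking space: a space character, but not whitespace.
        System.out.println(classify('\u00A0')); // isWhitespace=false, isSpaceChar=true
    }
}
```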
-
reverseBytes
public static char reverseBytes(char c)
Reverses the order of the first and second byte in the specified character.
- Parameters:
c - the character to reverse.
- Returns:
- the character with reordered bytes.
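For comparison, the byte swap can be written out by hand. This is only a sketch of the equivalent bit manipulation, not a statement about how the method is implemented; ReverseBytesSketch and manualSwap are hypothetical names.

```java
public class ReverseBytesSketch {
    // Hypothetical helper: swaps the two bytes of a char with shifts and masks.
    static char manualSwap(char c) {
        return (char) (((c & 0xFF) << 8) | (c >>> 8));
    }

    public static void main(String[] args) {
        char c = (char) 0x1234;
        System.out.println(Integer.toHexString(Character.reverseBytes(c))); // 3412
        System.out.println(manualSwap(c) == Character.reverseBytes(c));     // true
    }
}
```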
-
toLowerCase
public static char toLowerCase(char c)
Returns the lower case equivalent for the specified character if the character is an upper case letter. Otherwise, the specified character is returned unchanged.
- Parameters:
c - the character to convert.
- Returns:
- if c is an upper case character then its lower case counterpart, otherwise just c.
-
toLowerCase
public static int toLowerCase(int codePoint)
Returns the lower case equivalent for the specified code point if it is an upper case letter. Otherwise, the specified code point is returned unchanged.
- Parameters:
codePoint - the code point to convert.
- Returns:
- if codePoint is an upper case character then its lower case counterpart, otherwise just codePoint.
-
toString
public String toString()
Description copied from class: Object
Returns a string containing a concise, human-readable description of this object. Subclasses are encouraged to override this method and provide an implementation that takes into account the object's type and data.
-
toString
public static String toString(char value)
Converts the specified character to its string representation.
- Parameters:
value - the character to convert.
- Returns:
- the character converted to a string.
-
toUpperCase
public static char toUpperCase(char c)
Returns the upper case equivalent for the specified character if the character is a lower case letter. Otherwise, the specified character is returned unchanged.
- Parameters:
c - the character to convert.
- Returns:
- if c is a lower case character then its upper case counterpart, otherwise just c.
-
toUpperCase
public static int toUpperCase(int codePoint)
Returns the upper case equivalent for the specified code point if the code point is a lower case letter. Otherwise, the specified code point is returned unchanged.
- Parameters:
codePoint - the code point to convert.
- Returns:
- if codePoint is a lower case character then its upper case counterpart, otherwise just codePoint.
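The int overloads of toLowerCase and toUpperCase matter for text that contains supplementary characters, where iterating char by char would split surrogate pairs. The sketch below maps a string one code point at a time; CaseMappingSketch and upper are hypothetical names, and the Deseret letters U+10400 and U+10428 serve as a sample upper and lower case pair outside the BMP.

```java
public class CaseMappingSketch {
    // Hypothetical helper: upper-cases a string one code point at a time.
    static String upper(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toUpperCase(cp));
            i += Character.charCount(cp); // advance by 1 or 2 char values
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(upper("a1b")); // A1B
        // Deseret small letter long I (U+10428) maps to the capital (U+10400).
        System.out.println(Integer.toHexString(Character.toUpperCase(0x10428))); // 10400
        System.out.println(Integer.toHexString(Character.toLowerCase(0x10400))); // 10428
    }
}
```

This is a one-to-one mapping per code point, so it can give different results from a full string case conversion for characters whose case mapping expands to several characters.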
-
-