3.5.2 Character Types
Static Semantics
{character type}
An enumeration type is said to be a character type if at least one of
its enumeration literals is a character_literal.
{AI95-00285-01} {Latin-1} {BMP} {ISO/IEC 10646:2003} {Character}
The predefined type Character is a character type
whose values correspond to the 256 code positions of Row 00 (also known
as Latin-1) of the ISO/IEC 10646:2003 Basic Multilingual Plane (BMP).
Each of the graphic characters of Row 00 of the BMP has a corresponding
character_literal in Character. Each of the nongraphic positions of Row 00
(0000-001F and 007F-009F) has a corresponding language-defined name, which
is not usable as an enumeration literal, but which is usable with the
attributes Image, Wide_Image, Wide_Wide_Image, Value, Wide_Value, and
Wide_Wide_Value; these names are given in the definition of type Character
in A.1, “The Package Standard”, but are set in italics.
{italics (nongraphic characters)}
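The distinction between character_literals and language-defined names can be sketched as follows. This is a minimal illustration, not part of the manual: the procedure name Control_Names and the renaming L1 are assumptions chosen for the example, and the assertions are only checked when the compiler has assertion checking enabled.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Characters.Latin_1;

procedure Control_Names is
   --  L1 is a local shorthand introduced for this example only.
   package L1 renames Ada.Characters.Latin_1;

   --  A nongraphic character has no character_literal, so it is
   --  obtained from a Latin_1 constant or through 'Val.
   Nul_Char : constant Character := Character'Val (0);
begin
   pragma Assert (Nul_Char = L1.NUL);

   --  The language-defined name is usable with 'Image and 'Value,
   --  although it cannot be written as an enumeration literal.
   Put_Line (Character'Image (Nul_Char));
   pragma Assert (Character'Value ("NUL") = L1.NUL);

   --  For a graphic character, the image is the character_literal,
   --  including the apostrophes.
   Put_Line (Character'Image ('A'));
end Control_Names;
```

Writing NUL directly as an expression of type Character would not compile unless a declaration such as Ada.Characters.Latin_1.NUL were made visible, which is the point of keeping these names out of the enumeration literal name space.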
{AI95-00285-01} {Wide_Character} {BMP} {ISO/IEC 10646:2003}
The predefined type Wide_Character is a character
type whose values correspond to the 65536 code positions of the ISO/IEC
10646:2003 Basic Multilingual Plane (BMP). Each of the graphic characters
of the BMP has a corresponding character_literal in Wide_Character. The
first 256 values of Wide_Character have the same character_literal
or language-defined name as defined for Character. Each of the
graphic_characters has a corresponding character_literal.
{AI95-00285-01} {Wide_Wide_Character} {BMP} {ISO/IEC 10646:2003}
The predefined type Wide_Wide_Character is a character
type whose values correspond to the 2147483648 code positions of the
ISO/IEC 10646:2003 character set. Each of the graphic_characters
has a corresponding character_literal in Wide_Wide_Character. The first
65536 values of Wide_Wide_Character have the same character_literal
or language-defined name as defined for Wide_Character.
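The nesting of the three predefined character types can be illustrated with position numbers. The following is a sketch only; the procedure name Character_Ranges and the object names are assumptions made for the example, and the assertions take effect only when assertion checking is enabled.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Character_Ranges is
   C   : constant Character := 'A';

   --  The same position number denotes the same character in the
   --  wider types, so a value can be widened through 'Pos and 'Val.
   WC  : constant Wide_Character :=
     Wide_Character'Val (Character'Pos (C));
   WWC : constant Wide_Wide_Character :=
     Wide_Wide_Character'Val (Wide_Character'Pos (WC));
begin
   --  The character_literal is shared by all three types; context
   --  determines which type each 'A' below denotes.
   pragma Assert (WC = 'A');
   pragma Assert (WWC = 'A');

   --  256, 65536, and 2147483648 code positions respectively.
   pragma Assert (Character'Pos (Character'Last) = 16#FF#);
   pragma Assert (Wide_Character'Pos (Wide_Character'Last) = 16#FFFF#);
   pragma Assert
     (Wide_Wide_Character'Pos (Wide_Wide_Character'Last) = 16#7FFF_FFFF#);

   Put_Line ("Position of 'A':" & Integer'Image (Character'Pos (C)));
end Character_Ranges;
```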
{AI95-00285-01}
The characters whose code position is larger than 16#FF# and which are
not graphic_characters have language-defined
names which are formed by appending to the string "Hex_" the
representation of their code position in hexadecimal as eight extended
digits. As with other language-defined names, these names are usable
only with the attributes (Wide_)Wide_Image and (Wide_)Wide_Value; they
are not usable as enumeration literals.
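The naming rule can be made concrete with a small helper that builds such a name from a code position. This is a hypothetical sketch: Hex_Name and Hex_Digits are names invented for the example and are not part of the language, the use of upper case extended digits is an assumption, and the parameter type Natural assumes an Integer of at least 32 bits for larger code positions.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Hex_Names is

   --  Hypothetical helper: forms "Hex_" followed by the code position
   --  written as exactly eight hexadecimal extended digits.
   function Hex_Name (Code : Natural) return String is
      Hex_Digits : constant String (1 .. 16) := "0123456789ABCDEF";
      Result     : String (1 .. 8);
      Rest       : Natural := Code;
   begin
      --  Fill the digits from least significant to most significant;
      --  unused leading positions become '0'.
      for I in reverse Result'Range loop
         Result (I) := Hex_Digits (Rest mod 16 + 1);
         Rest := Rest / 16;
      end loop;
      return "Hex_" & Result;
   end Hex_Name;

begin
   pragma Assert (Hex_Name (16#FFFE#) = "Hex_0000FFFE");
   Put_Line (Hex_Name (16#FFFE#));
   Put_Line (Hex_Name (16#FFFF#));
end Hex_Names;
```

Whether a particular compiler returns exactly these strings from Wide_Character'Wide_Image is not shown here; the helper only restates the rule as written.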
Reason: {AI95-00285-01}
The language-defined names are not usable as enumeration literals to
avoid "polluting" the name space. Since Wide_Character and
Wide_Wide_Character are defined in Standard, if the language-defined
names were usable as enumeration literals, they would hide other
nonoverloadable declarations with the same names in use-d packages.
Implementation Permissions
Implementation Advice
NOTES
25 The language-defined library package Characters.Latin_1 (see A.3.3)
includes the declaration of constants denoting control characters, lower
case characters, and special characters of the predefined type Character.
To be honest: The package ASCII does
the same, but only for the first 128 characters of Character. Hence,
it is an obsolescent package, and we no longer mention it here.
26 A conventional character set such as EBCDIC can be declared as a
character type; the internal codes of the characters can be specified by
an enumeration_representation_clause as explained in clause 13.4.
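As a hedged illustration of this note, the sketch below declares a character type covering a handful of EBCDIC characters and assigns their internal codes with an enumeration_representation_clause. The names Ebcdic_Sketch, Ebcdic_Subset, and Ebcdic_String are assumptions for the example; a complete EBCDIC type would list all of its characters in increasing code order.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Ebcdic_Sketch is
   --  Only six characters are declared to keep the example short.
   --  The literals appear in increasing order of their EBCDIC codes.
   type Ebcdic_Subset is (' ', 'A', 'B', 'C', '0', '1');

   for Ebcdic_Subset use
     (' ' => 16#40#,
      'A' => 16#C1#,
      'B' => 16#C2#,
      'C' => 16#C3#,
      '0' => 16#F0#,
      '1' => 16#F1#);

   --  Any one-dimensional array of a character type accepts
   --  string literals.
   type Ebcdic_String is array (Positive range <>) of Ebcdic_Subset;

   Text : constant Ebcdic_String := "ABC 01";
begin
   --  The representation clause changes the internal codes only;
   --  position numbers still count from zero in declaration order.
   pragma Assert (Ebcdic_Subset'Pos ('A') = 1);
   pragma Assert (Text'Length = 6);

   Put_Line (Ebcdic_Subset'Image (Text (1)));
end Ebcdic_Sketch;
```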
Examples
Example of a character type:
type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D', 'M');
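A short program shows how such a type might be used once declared. It is a sketch only: the procedure name Roman_Demo is an assumption, and the array type Roman and the constant Ninety_Six are declared here purely for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Roman_Demo is
   type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D', 'M');

   --  Because Roman_Digit is a character type, an array of it
   --  can be initialized with a string literal.
   type Roman is array (Positive range <>) of Roman_Digit;

   Ninety_Six : constant Roman := "XCVI";
begin
   --  Position numbers follow declaration order, starting at zero.
   pragma Assert (Roman_Digit'Pos ('X') = 2);
   pragma Assert (Ninety_Six (1) = 'X');

   --  Print the image of each digit in turn.
   for I in Ninety_Six'Range loop
      Put (Roman_Digit'Image (Ninety_Six (I)));
   end loop;
   New_Line;
end Roman_Demo;
```

A literal such as 'A' is not a value of Roman_Digit, so using it where a Roman_Digit is expected is rejected at compile time.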
Inconsistencies With Ada 83
{inconsistencies with Ada 83}
The declaration of Wide_Character in package Standard hides use-visible
declarations with the same defining identifier. In the unlikely event that
an Ada 83 program had depended on such a use-visible declaration, and the
program remains legal after the substitution of Standard.Wide_Character,
the meaning of the program will be different.
Incompatibilities With Ada 83
{incompatibilities with Ada 83}
The presence of Wide_Character in package Standard means that an expression
such as
'a' = 'b'
is ambiguous in Ada 95, whereas in Ada 83 both
literals could be resolved to be of type Character.
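One way to resolve the ambiguity is to qualify one of the literals, as in the following sketch. The procedure name Literal_Ambiguity and the constant Same are assumptions for the example; the commented-out line is the form that a compiler is expected to reject.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Literal_Ambiguity is
   --  Ambiguous: each literal could be of type Character or of a
   --  wider predefined character type, and "=" exists for all of them.
   --  Same : constant Boolean := 'a' = 'b';

   --  A qualified expression fixes the type of the left operand,
   --  which in turn determines the type of the right operand.
   Same : constant Boolean := Character'('a') = 'b';
begin
   pragma Assert (not Same);
   Put_Line (Boolean'Image (Same));
end Literal_Ambiguity;
```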
The change in visibility rules (see 4.2) for character literals means that
additional qualification might be necessary to resolve expressions
involving overloaded subprograms and character literals.
Extensions to Ada 83
{extensions to Ada 83}
The type Character has been extended to have 256 positions, and the type
Wide_Character has been added. Note that this change was already approved
by the ARG for Ada 83 conforming compilers.
The rules for referencing character literals are changed (see 4.2), so
that the declaration of the character type need not be directly visible to
use its literals, similar to null and string literals. Context is used to
resolve their type.
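The effect of this rule can be sketched with a character type declared in a nested package and used without a use clause. The names Literal_Visibility, Numerals, and D are assumptions made for the example.

```ada
with Ada.Text_IO;

procedure Literal_Visibility is
   package Numerals is
      type Roman_Digit is ('I', 'V', 'X', 'L', 'C', 'D', 'M');
   end Numerals;

   --  There is no use clause for Numerals, so 'V' is not directly
   --  visible as a literal of Roman_Digit. The expected type of the
   --  initialization expression is enough to resolve it.
   D : constant Numerals.Roman_Digit := 'V';
begin
   pragma Assert (Numerals.Roman_Digit'Pos (D) = 1);
   Ada.Text_IO.Put_Line (Numerals.Roman_Digit'Image (D));
end Literal_Visibility;
```

Operators of the type, such as "=", still follow the ordinary visibility rules, which is why the assertion compares position numbers rather than writing D = 'V'.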
Inconsistencies With Ada 95
{AI95-00285-01} {inconsistencies with Ada 95}
Ada 95 defined most characters in Wide_Character to be graphic characters,
while Ada 2005 uses the categorizations from ISO/IEC 10646:2003. Ada 2005
also provides language-defined names for all non-graphic characters. That
means that in Ada 2005, Wide_Character'Wide_Value will raise
Constraint_Error for a string representing a character_literal of a
non-graphic character, while Ada 95 would have accepted it. Similarly,
the result of Wide_Character'Wide_Image will change for such non-graphic
characters.
{AI95-00395-01}
The language-defined names FFFE and FFFF were replaced by a consistent
set of language-defined names for all non-graphic characters with positions
greater than 16#FF#. That means that in Ada 2005,
Wide_Character'Wide_Value("FFFE") will raise Constraint_Error while Ada 95
would have accepted it. Similarly, the result of Wide_Character'Wide_Image
will change for the position numbers 16#FFFE# and 16#FFFF#. It is very
unlikely that this will matter in practice, as these names do not
represent usable characters.
{AI95-00285-01} {AI95-00395-01}
Because of the previously mentioned changes to the Wide_Character'Wide_Image
of various character values, the value of attribute Wide_Width will change
for some subtypes of Wide_Character. However, the new language-defined
names were chosen so that the value of Wide_Character'Wide_Width itself
does not change.
{AI95-00285-01}
The declaration of Wide_Wide_Character in package Standard hides use-visible
declarations with the same defining identifier. In the (very) unlikely
event that an Ada 95 program had depended on such a use-visible declaration,
and the program remains legal after the substitution of
Standard.Wide_Wide_Character, the meaning of the program will be different.
Extensions to Ada 95
{AI95-00285-01} {extensions to Ada 95}
The type Wide_Wide_Character is new.
Wording Changes from Ada 95
{AI95-00285-01}
Characters are now defined in terms of the entire ISO/IEC 10646:2003
character set.
{AI95-00285-01}
We dropped the Implementation Advice for non-standard interpretation
of character sets; an implementation can do what it wants in a non-standard
mode, so there isn't much point to any advice.