2.3 Identifiers
Syntax
{AI95-00285-01} {AI95-00395-01}
identifier_start ::=
     letter_uppercase
   | letter_lowercase
   | letter_titlecase
   | letter_modifier
   | letter_other
   | number_letter
{AI95-00285-01} {AI95-00395-01}
identifier_extend ::=
     mark_non_spacing
   | mark_spacing_combining
   | number_decimal
   | punctuation_connector
   | other_format
{AI95-00395-01} After eliminating the characters in category other_format,
an identifier shall not contain two consecutive characters in category
punctuation_connector, or end with a character in that category.
Reason: This rule was stated in the syntax in Ada 95, but expressing it
that way has become too complex in Ada 2005. Since other_format characters
usually do not display, we do not want to count them as separating two
underscores.
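The syntax and the legality rule above can be sketched as a small checker. This is a minimal illustration, not a definitive lexer. It assumes that each grammar category corresponds to a Unicode general category (letter_uppercase to Lu, letter_lowercase to Ll, letter_titlecase to Lt, letter_modifier to Lm, letter_other to Lo, number_letter to Nl, mark_non_spacing to Mn, mark_spacing_combining to Mc, number_decimal to Nd, punctuation_connector to Pc, other_format to Cf), and that an identifier is an identifier_start followed by any mix of identifier_start and identifier_extend characters. The function name is hypothetical, and the reserved-word check is deliberately left out.

```python
import unicodedata

# Assumed mapping from the grammar's category names to Unicode
# general categories (not stated in this section).
IDENTIFIER_START = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
IDENTIFIER_EXTEND = {"Mn", "Mc", "Nd", "Pc", "Cf"}


def is_legal_identifier_shape(text: str) -> bool:
    """Check the syntax and the punctuation_connector rule only.

    Hypothetical helper: it does not check for reserved words.
    """
    if not text:
        return False
    cats = [unicodedata.category(ch) for ch in text]

    # The first character must be an identifier_start.
    if cats[0] not in IDENTIFIER_START:
        return False

    # Every character must be an identifier_start or identifier_extend.
    allowed = IDENTIFIER_START | IDENTIFIER_EXTEND
    if any(cat not in allowed for cat in cats):
        return False

    # Eliminate other_format characters before applying the rule on
    # punctuation_connector, as the legality rule requires.
    visible = [cat for cat in cats if cat != "Cf"]
    if visible[-1] == "Pc":
        return False
    return all(not (a == "Pc" and b == "Pc")
               for a, b in zip(visible, visible[1:]))
```

Under these assumptions, two underscores separated only by an invisible other_format character (for example U+200D ZERO WIDTH JOINER) are rejected, exactly as two adjacent underscores would be.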
Static Semantics
{AI95-00285-01} Two identifiers are considered the same if they consist of
the same sequence of characters after applying the following
transformations (in this order):
{AI95-00285-01} The characters in category other_format are eliminated.
The remaining sequence of characters is converted to upper case.
{AI95-00395-01} After applying these transformations, an identifier shall
not be identical to a reserved word (in upper case).
Implementation Note: We match the reserved words after doing these
transformations so that the rules for identifiers and reserved words are
the same. (This allows other_format characters, which usually don't
display, in a reserved word without changing it to an identifier.) Since a
compiler usually will lexically process identifiers and reserved words the
same way (often with the same code), this will prevent a lot of headaches.
Ramification: The rules for reserved words differ in one way: they define
case conversion on letters rather than sequences. This means that some
unusual sequences are neither identifiers nor reserved words. For instance,
“ıf” and “acceß” have upper case conversions of “IF” and “ACCESS”
respectively. These are not identifiers, because the transformed values are
identical to a reserved word. But they are not reserved words, either,
because the original values do not match any reserved word as defined or
with any number of characters of the reserved word in upper case. Thus,
these odd constructions are just illegal, and should not appear in the
source of a program.
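The comparison rule and the collision described in the Ramification can be sketched as follows. This is an illustration only. It assumes that the transformations are the removal of other_format characters (Unicode general category Cf) followed by a locale-independent conversion to upper case, and it relies on Python's str.upper, which applies full case mappings (so a single character may map to a sequence). RESERVED_SAMPLE is a hypothetical, deliberately partial set of reserved words, and all function names are invented for the example.

```python
import unicodedata

# Hypothetical, partial sample of reserved words in upper case.
RESERVED_SAMPLE = {"IF", "ACCESS", "BEGIN", "END"}


def normalize(identifier: str) -> str:
    """Apply the two assumed transformations, in order."""
    # 1. Eliminate characters in category other_format (Cf).
    stripped = "".join(
        ch for ch in identifier if unicodedata.category(ch) != "Cf"
    )
    # 2. Convert the remaining sequence to upper case,
    #    independently of any locale.
    return stripped.upper()


def same_identifier(a: str, b: str) -> bool:
    """Two identifiers are the same if their normalized forms match."""
    return normalize(a) == normalize(b)


def collides_with_reserved_word(identifier: str) -> bool:
    """True if the normalized form equals a (sample) reserved word."""
    return normalize(identifier) in RESERVED_SAMPLE
```

With this sketch, collides_with_reserved_word("ıf") and collides_with_reserved_word("acceß") both hold, which is why those sequences cannot be identifiers even though neither is spelled like a reserved word.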
Implementation Permissions
In a nonstandard mode, an implementation may support other upper/lower
case equivalence rules for identifiers[, to accommodate local conventions].
Discussion: {AI95-00285-01}
For instance, in most languages, the uppercase equivalent of LATIN SMALL
LETTER I (a lower case letter with a dot above) is LATIN CAPITAL LETTER
I (an upper case letter without a dot above). In Turkish, though, LATIN
SMALL LETTER I and LATIN SMALL LETTER DOTLESS I are two distinct letters,
so the upper case equivalent of LATIN SMALL LETTER I is LATIN CAPITAL
LETTER I WITH DOT ABOVE, and the upper case equivalent of LATIN SMALL
LETTER DOTLESS I is LATIN CAPITAL LETTER I. Take for instance the following
identifier (which is the name of a city on the Tigris river in Eastern
Anatolia):
diyarbakır -- The first i is dotted, the second isn't.
Locale-independent conversion to upper case results in:
DIYARBAKIR -- Both Is are dotless.
This means that the four following sequences of characters represent the
same identifier, even though a speaker of Turkish would probably consider
them distinct words:
diyarbakir
diyarbakır
dıyarbakir
dıyarbakır
An implementation targeting the Turkish market is allowed (in fact,
expected) to provide a nonstandard mode where case folding is appropriate
for Turkish. This would cause the original identifier to be converted to:
DİYARBAKIR -- The first I is dotted, the second isn't.
and the four sequences of characters shown above
would represent four distinct identifiers.
Lithuanian and Azeri are two other languages
that present similar idiosyncrasies.
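The effect described in this Discussion can be reproduced with a short sketch. It compares the default, locale-independent upper-case conversion with a hypothetical Turkish-aware one. The translation table TURKISH_UPPER and the function names are assumptions made for the illustration. They model only the two Turkish rules mentioned above and are not part of any implementation's nonstandard mode.

```python
# The four spellings from the Discussion (dotted and dotless i mixed).
SPELLINGS = ["diyarbakir", "diyarbakır", "dıyarbakir", "dıyarbakır"]


def upper_default(text: str) -> str:
    """Locale-independent conversion: both i and ı become I."""
    return text.upper()


# Hypothetical Turkish rules: dotted i -> İ (U+0130), dotless ı -> I.
TURKISH_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})


def upper_turkish(text: str) -> str:
    """Apply the two Turkish rules first, then the default conversion."""
    return text.translate(TURKISH_UPPER).upper()


# Under the default rules all four spellings collapse to one identifier;
# under the Turkish rules they remain four distinct identifiers.
default_forms = {upper_default(s) for s in SPELLINGS}
turkish_forms = {upper_turkish(s) for s in SPELLINGS}
```

Here default_forms contains the single value "DIYARBAKIR", while turkish_forms contains four different strings, matching the behavior expected of a Turkish nonstandard mode.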
3 {AI95-00285-01} Identifiers differing only in the use of corresponding
upper and lower case letters are considered the same.
Examples
Examples of identifiers:
{AI95-00433-01}
Count      X    Get_Symbol   Ethelyn   Marion
Snobol_4   X1   Page_Count   Store_Next_Item
Πλάτων           -- Plato
Чайковский       -- Tchaikovsky
θ  φ             -- Angles
Wording Changes from Ada 83
We no longer include reserved words as identifiers. This is not a language
change. In Ada 83, identifier included reserved words. However, this
complicated several other rules (for example, those regarding
implementation-defined attributes and pragmas). We now explicitly allow
certain reserved words for attribute designators, to make up for the loss.
Ramification: Because syntax rules are relevant to overload resolution, it
means that if it looks like a reserved word, it is not an identifier. As a
side effect, implementations cannot use reserved words as
implementation-defined attributes or pragma names.
Extensions to Ada 95
{AI95-00285-01} {extensions to Ada 95} An identifier can use any letter
defined by ISO-10646:2003, along with several other categories. This should
ease programming in languages other than English.