Ada Reference Manual

2.1 Character Set

1/3
The character repertoire for the text of an Ada program consists of the entire coding space described by the ISO/IEC 10646:2011 Universal Multiple-Octet Coded Character Set. This coding space is organized in planes, each plane comprising 65536 characters.
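The plane organization described above can be illustrated with a short Python sketch (Python is used here only for illustration; it is not part of the Standard). The function name `plane_and_offset` is hypothetical, and the upper bound 16#10FFFF# is an assumption about the extent of the coding space that this subclause does not state.

```python
PLANE_SIZE = 0x10000          # 65536 cells per plane, as stated in the text
MAX_CODE_POINT = 0x10FFFF     # assumed upper bound of the coding space


def plane_and_offset(code_point):
    """Return (plane number, relative code point within that plane)."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("code point outside the assumed coding space")
    # Integer division gives the plane; the remainder is the position
    # of the cell inside that plane.
    return divmod(code_point, PLANE_SIZE)
```

For example, `plane_and_offset(0x1F600)` yields plane 1 and relative code point 16#F600#.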

Syntax

Paragraphs 2 and 3 were deleted. 
3.1/3
A character is defined by this International Standard for each cell in the coding space described by ISO/IEC 10646:2011, regardless of whether or not ISO/IEC 10646:2011 allocates a character to that cell. 

Static Semantics

4/3
The coded representation for characters is implementation defined (it need not be a representation defined within ISO/IEC 10646:2011). A character whose relative code point in its plane is 16#FFFE# or 16#FFFF# is not allowed anywhere in the text of a program. The only characters allowed outside of comments are those in categories other_format, format_effector, and graphic_character.
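As an illustration of the rule on relative code points 16#FFFE# and 16#FFFF#, the following Python sketch tests whether a code point falls on one of those two cells in any plane. The function name is hypothetical, and the sketch assumes planes of 65536 cells so that the relative code point is the low 16 bits of the code point.

```python
def is_excluded_from_program_text(code_point):
    """True if the relative code point in its plane is 16#FFFE# or 16#FFFF#."""
    # Masking with 0xFFFF keeps the position within the plane and
    # discards the plane number.
    return (code_point & 0xFFFF) in (0xFFFE, 0xFFFF)
```

Under this sketch, 16#FFFE#, 16#1FFFF#, and 16#10FFFE# are all rejected, while 16#FFFD# is not.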
4.1/3
The semantics of an Ada program whose text is not in Normalization Form KC (as defined by Clause 21 of ISO/IEC 10646:2011) is implementation defined.
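A tool could check whether program text is already in Normalization Form KC before relying on portable semantics. The Python sketch below is one way to do so; the function name `is_nfkc` is hypothetical, and the result depends on the Unicode version bundled with the Python interpreter, which may differ from the edition of ISO/IEC 10646 that an Ada implementation uses.

```python
import unicodedata


def is_nfkc(text):
    """True if normalizing the text to NFKC leaves it unchanged."""
    return unicodedata.normalize("NFKC", text) == text
```

For instance, an identifier written with the ligature U+FB01 (LATIN SMALL LIGATURE FI) is not in Normalization Form KC, because NFKC replaces the ligature with the two letters "fi".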
5/3
The description of the language definition in this International Standard uses the character properties General Category, Simple Uppercase Mapping, Uppercase Mapping, and Special Case Condition of the documents referenced by the note in Clause 1 of ISO/IEC 10646:2011. The actual set of graphic symbols used by an implementation for the visual representation of the text of an Ada program is not specified.
6/3
Characters are categorized as follows: 
7/2
This paragraph was deleted.
8/2
letter_uppercase

Any character whose General Category is defined to be “Letter, Uppercase”.
9/2
letter_lowercase

Any character whose General Category is defined to be “Letter, Lowercase”. 
9.1/2
 letter_titlecase

Any character whose General Category is defined to be “Letter, Titlecase”.
9.2/2
 letter_modifier

Any character whose General Category is defined to be “Letter, Modifier”.
9.3/2
 letter_other
Any character whose General Category is defined to be “Letter, Other”.
9.4/2
 mark_non_spacing

Any character whose General Category is defined to be “Mark, Non-Spacing”.
9.5/2
 mark_spacing_combining

Any character whose General Category is defined to be “Mark, Spacing Combining”.
10/2
number_decimal

Any character whose General Category is defined to be “Number, Decimal”.
10.1/2
  number_letter

Any character whose General Category is defined to be “Number, Letter”.
10.2/2
  punctuation_connector

Any character whose General Category is defined to be “Punctuation, Connector”.
10.3/2
  other_format
Any character whose General Category is defined to be “Other, Format”.
11/2
separator_space
Any character whose General Category is defined to be “Separator, Space”.
12/2
separator_line
Any character whose General Category is defined to be “Separator, Line”. 
12.1/2
  separator_paragraph

Any character whose General Category is defined to be “Separator, Paragraph”.
13/3
format_effector

The characters whose code points are 16#09# (CHARACTER TABULATION), 16#0A# (LINE FEED), 16#0B# (LINE TABULATION), 16#0C# (FORM FEED), 16#0D# (CARRIAGE RETURN), 16#85# (NEXT LINE), and the characters in categories separator_line and separator_paragraph.
13.1/2
  other_control
Any character whose General Category is defined to be “Other, Control”, and which is not defined to be a format_effector.
13.2/2
  other_private_use

Any character whose General Category is defined to be “Other, Private Use”.
13.3/2
  other_surrogate

Any character whose General Category is defined to be “Other, Surrogate”.
14/3
graphic_character

Any character that is not in the categories other_control, other_private_use, other_surrogate, format_effector, and whose relative code point in its plane is neither 16#FFFE# nor 16#FFFF#. 
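The categorization above can be sketched in Python using the General Category property exposed by the standard `unicodedata` module. This is an illustrative sketch only: the table and function names are hypothetical, the two-letter abbreviations (such as Lu and Zs) are the usual short forms of the General Category values quoted above, and the results reflect the Unicode version shipped with the Python interpreter.

```python
import unicodedata

# General Category abbreviation -> category name used in this subclause.
NAMED_CATEGORIES = {
    "Lu": "letter_uppercase",
    "Ll": "letter_lowercase",
    "Lt": "letter_titlecase",
    "Lm": "letter_modifier",
    "Lo": "letter_other",
    "Mn": "mark_non_spacing",
    "Mc": "mark_spacing_combining",
    "Nd": "number_decimal",
    "Nl": "number_letter",
    "Pc": "punctuation_connector",
    "Cf": "other_format",
    "Zs": "separator_space",
    "Zl": "separator_line",
    "Zp": "separator_paragraph",
    "Co": "other_private_use",
    "Cs": "other_surrogate",
}

# The six code points listed explicitly for format_effector.
FORMAT_EFFECTOR_CODE_POINTS = frozenset({0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x85})


def is_format_effector(ch):
    """Listed code points, plus separator_line and separator_paragraph."""
    return (ord(ch) in FORMAT_EFFECTOR_CODE_POINTS
            or unicodedata.category(ch) in ("Zl", "Zp"))


def is_other_control(ch):
    """'Other, Control' characters that are not format_effectors."""
    return unicodedata.category(ch) == "Cc" and not is_format_effector(ch)


def is_graphic_character(ch):
    """Everything not excluded by the definition of graphic_character."""
    if (ord(ch) & 0xFFFF) in (0xFFFE, 0xFFFF):
        return False
    if is_format_effector(ch) or is_other_control(ch):
        return False
    return unicodedata.category(ch) not in ("Co", "Cs")
```

Note that, read literally, the definition of graphic_character also covers cells to which no character is allocated, so this sketch returns True for them as well.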
15/3
The following names are used when referring to certain characters (the first name is that given in ISO/IEC 10646:2011):
  graphic symbol   name                    graphic symbol   name

        "          quotation mark                :          colon
        #          number sign                   ;          semicolon
        &          ampersand                     <          less-than sign
        '          apostrophe, tick              =          equals sign
        (          left parenthesis              >          greater-than sign
        )          right parenthesis             _          low line, underline
        *          asterisk, multiply            |          vertical line
        +          plus sign                     /          solidus, divide
        ,          comma                         !          exclamation point
        -          hyphen-minus, minus           %          percent sign
        .          full stop, dot, point

Implementation Requirements

16/3
An Ada implementation shall accept Ada source code in UTF-8 encoding, with or without a BOM (see A.4.11), where every character is represented by its code point. The character pair CARRIAGE RETURN/LINE FEED (code points 16#0D# 16#0A#) signifies a single end of line (see 2.2); every other occurrence of a format_effector other than the character whose code point position is 16#09# (CHARACTER TABULATION) also signifies a single end of line.
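The requirement above can be sketched as two steps: decode UTF-8 with an optional BOM, then split the text into lines. The Python sketch below is illustrative only; the function names are hypothetical, and the choice to drop an empty trailing line after the last end of line is an assumption of the sketch, not something this subclause specifies.

```python
import unicodedata

# format_effector code points other than 16#09# (CHARACTER TABULATION).
END_OF_LINE_CODE_POINTS = frozenset({0x0A, 0x0B, 0x0C, 0x0D, 0x85})


def decode_source(data):
    """Decode UTF-8 bytes, discarding a leading BOM if one is present."""
    return data.decode("utf-8-sig")


def is_end_of_line(ch):
    """True for any format_effector except CHARACTER TABULATION."""
    return (ord(ch) in END_OF_LINE_CODE_POINTS
            or unicodedata.category(ch) in ("Zl", "Zp"))


def split_lines(text):
    """Split text into lines, treating CR LF as a single end of line."""
    lines, current, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\r" and text[i + 1:i + 2] == "\n":
            lines.append("".join(current))
            current = []
            i += 2          # consume the pair as one end of line
        elif is_end_of_line(ch):
            lines.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    if current:             # unterminated final line, if any
        lines.append("".join(current))
    return lines
```

Python's built-in `str.splitlines` is deliberately not used here, because it also breaks lines at some characters that are not format_effectors.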

Implementation Permissions

17/3
The categories defined above, as well as case mapping and folding, may be based on an implementation-defined version of ISO/IEC 10646 (2003 edition or later). 
NOTES
18/2
1  The characters in categories other_control, other_private_use, and other_surrogate are only allowed in comments.

Ada 2005 and 2012 Editions sponsored in part by Ada-Europe