Among the facilities C++ programmers have developed over and over again are those manipulating chunks of text, commonly called strings. The C programming language offers rudimentary string support.
To process text C++ offers the std::string type. In C++ the traditional C library functions manipulating null-terminated byte strings (NTBSs) are discouraged in favor of using string objects. Many problems in C programs are caused by buffer overruns, boundary errors and allocation problems that can be traced back to improperly using these traditional C string library functions. Many of these problems can be prevented by using C++ string objects.
Actually, string objects are class type variables, and in that sense they are comparable to stream objects like cin and cout. In this section the use of string type objects is covered. The focus is on their definition and their use. When using string objects the member function syntax is commonly used:
stringVariable.operation(argumentList)
For example, if string1 and string2 are variables of type std::string, then
string1.compare(string2)
can be used to compare both strings.
In addition to the common member functions the string class also offers a wide variety of operators, like the assignment (=) and the comparison operator (==). Operators often result in code that is easy to understand and their use is generally preferred over the use of member functions offering comparable functionality. E.g., rather than writing
if (string1.compare(string2) == 0)
the following is generally preferred:
if (string1 == string2)
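The difference between the two notations can be sketched as follows. This is merely an illustration: the names sameByCompare, sameByOperator and checkAgreement are made up for this example and are not part of the string class.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: reports equality using the member function syntax.
bool sameByCompare(std::string const &lhs, std::string const &rhs)
{
    return lhs.compare(rhs) == 0;
}

// Hypothetical helper: the same test, written using the operator.
bool sameByOperator(std::string const &lhs, std::string const &rhs)
{
    return lhs == rhs;
}

// Both notations are expected to agree for any pair of strings.
void checkAgreement(std::string const &lhs, std::string const &rhs)
{
    assert(sameByCompare(lhs, rhs) == sameByOperator(lhs, rhs));
}
```

Calling, e.g., checkAgreement("hello", "world") from main runs the check. The operator version states its intent directly, which is why it is usually the better choice.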
To define and use string-type objects, sources must include the header file <string>. To merely declare the string type the header iosfwd can be included.
In addition to std::string, the header file string defines the following string types:
std::wstring, a string type consisting of wchar_t characters;
std::u16string, a string type consisting of char16_t characters;
std::u32string, a string type consisting of char32_t characters.
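When a member function computing an index fails (e.g., because the text that was looked for is not found), string::npos is returned. This value is a symbolic value of type string::size_type, which is an unsigned integral type (in practice std::size_t).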
All string
member functions accepting string
objects as arguments also
accept NTBS arguments. The same usually holds true for operators accepting
string
objects.
Some string
-members use iterators. Iterators are formally introduced
in section 18.2. Member functions using iterators are listed in
the next section (5.2), but the iterator concept itself
is not further covered by this chapter.
Strings support a large variety of members and operators. A short overview
listing their capabilities is provided in this section, with subsequent
sections offering a detailed discussion. The bottom line: C++ strings are
extremely versatile and there is hardly a reason for falling back on the C
library to process text. C++ strings handle all the required memory
management and thus memory related problems, which are the #1 source of
problems in C programs, can be prevented when C++ strings are
used. Strings do come at a price, though. The class's extensive capabilities
have also turned it into a beast. It's hard to learn and master all its
features and in the end you'll find that not all that you expected is actually
there. For example, std::string doesn't offer case-insensitive comparisons. But in the end it isn't even as simple as that. It is there, but it is somewhat hidden and at this point in the C++ Annotations it's too early to look into that hidden corner. Instead, realize that the C library does offer useful functions that can be used as long as we're aware of their limitations and are able to avoid their traps. So for now, to perform a traditional case-insensitive comparison of the content of two std::string objects str1 and str2 the following will do (strcasecmp is a POSIX function declared in <strings.h>; it is not part of the ISO C library):
strcasecmp(str1.c_str(), str2.c_str());
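Where strcasecmp is not available, a comparable test for equality can be sketched using standard facilities only. The function below is a minimal sketch under stated assumptions: the name equalIgnoringCase is hypothetical, and the comparison is only meaningful for single-byte character sets, as it relies on tolower from <cctype>.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: true if both strings are equal when case is ignored.
// Assumption: single-byte characters, interpreted by the current C locale.
bool equalIgnoringCase(std::string const &lhs, std::string const &rhs)
{
    return lhs.size() == rhs.size() &&
           std::equal(lhs.begin(), lhs.end(), rhs.begin(),
                [](unsigned char left, unsigned char right)
                {
                    return std::tolower(left) == std::tolower(right);
                });
}

// The assertions document the expected results.
void caseDemo()
{
    assert(equalIgnoringCase("Hello", "hELLO"));
    assert(not equalIgnoringCase("Hello", "Hello!"));
}
```

The characters are received as unsigned char values because passing negative char values to tolower results in undefined behavior.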
Strings support the following functionality:
initialization
:assignment
:assign
) but a plain assignment operator (i.e.,
=
)may also be used. Furthermore, assignment to a character buffer is
also supported.
conversions
:std::string
s are accepted as well.
breakdown
:[]
) allowing us to either access or
modify information in the middle of a string.
comparisons
:==, !=, <, <=, >
and >=
. There
are also member functions available offering a more fine-grained comparison.
modification:
swapping:
searching:
housekeeping:
stream I/O:

object is always a string-object;
argument is a string const & or a char const * unless indicated otherwise. The content of an argument never is modified by the operation processing the argument;
opos refers to an offset into an object string;
apos refers to an offset into an argument;
on represents a number of characters in an object (starting at opos);
an represents a number of characters in an argument (starting at apos).
Both opos and apos must refer to existing offsets, or an exception (cf. chapter 10) is generated. In contrast, an and on may exceed the number of available characters, in which case only the available characters are considered.
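The different treatment of offsets and counts can be illustrated using the substr member. The following is a small sketch, not an exhaustive description: the function name offsetDemo is made up for this example, and the assertions document what is expected to happen.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

void offsetDemo()
{
    std::string const text{ "hello" };

    // a count (on) exceeding the available characters is simply reduced
    assert(text.substr(3, 100) == "lo");

    // an offset (opos) equal to the string's length is still accepted
    assert(text.substr(5).empty());

    // an offset beyond the string's length results in an exception
    bool thrown = false;
    try
    {
        (void)text.substr(6);
    }
    catch (std::out_of_range const &)
    {
        thrown = true;
    }
    assert(thrown);
}
```

The exception thrown for invalid offsets is of type std::out_of_range, which is declared in the stdexcept header.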
Many members declare default values for on, an and apos. Some members declare default values for opos. Default offset values are 0, the default value of on and an is string::npos, which can be interpreted as `the required number of characters to reach the end of the string'.
With members starting their operations at the end of the string object's content proceeding backwards, the default value of opos is the index of the object's last character, with on by default equal to opos + 1, representing the length of the substring ending at opos.
In the overview of member functions presented below it may be assumed that all these parameters accept default values unless indicated otherwise. Of course, the default argument values cannot be used if a function requires additional arguments beyond the ones otherwise accepting default values.
Some members have overloaded versions expecting an initial argument of type
char const *
. But even if that is not the case the first argument can
always be of type char const *
where a parameter of std::string
is
defined.
Several member functions accept iterators. Section 18.2 covers
the technical aspects of iterators, but these may be ignored at this point
without loss of continuity. Like apos
and opos
, iterators must refer
to existing positions and/or to an existing range of characters within the
string object's content.
All string
-member functions computing indices return the
predefined constant string::npos
on failure.
The s literal suffix can be used to indicate that a std::string constant is intended when a string literal (like "hello world") is used. It can be used after declaring using namespace std or, more specifically, after declaring
using namespace std::literals::string_literals.
When string literals are used when explicitly defining or using std::string objects the s-suffix is hardly ever required, but it may come in handy when using the auto keyword. E.g., auto str = "hello world"s defines std::string str, whereas it would have been a char const * if the literal suffix had been omitted.
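The effect of the suffix on auto can be verified at compile time. The sketch below assumes a C++17 compiler; the function name literalDemo is made up for this example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

void literalDemo()
{
    using namespace std::literals::string_literals;

    auto plain = "hello world";         // a char const *
    auto str = "hello world"s;          // a std::string

    static_assert(std::is_same_v<decltype(plain), char const *>);
    static_assert(std::is_same_v<decltype(str), std::string>);

    // the suffix uses the literal's full length, so embedded 0-bytes
    // are kept, whereas a plain NTBS ends at its first 0-byte
    assert("ab\0cd"s.size() == 5);
    assert(std::string{ "ab\0cd" }.size() == 2);

    assert(str == plain);
}
```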
The following string constructors are available:
string object:
initializes object to an empty string. When defining a string this way no argument list may be specified;
string object(string::size_type count, char ch):
initializes object with count characters ch. Caveat: to initialize a string object using this constructor do not use the curly-braces variant, but use the constructor as shown, to avoid selecting the initializer-list constructor (see below);
string object(string const &argument):
initializes object with argument;
string object(std::string const &argument, string::size_type apos, string::size_type an):
initializes object with argument's content starting at index position apos, using at most an of argument's characters;
string object(InputIterator begin, InputIterator end):
initializes object with the characters in the range of characters defined by the two InputIterators;
string object(std::initializer_list<char> chars):
initializes object with the characters specified in the initializer list. The string may also directly be initialized, using the curly braced initialization. Here is an example showing both forms:
string str1({'h', 'e', 'l', 'l', 'o'});
string str2{ 'h', 'e', 'l', 'l', 'o' };
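The constructors, including the caveat about curly braces, can be tried out using a sketch like the following. The function name constructorDemo and the variable names are made up for this example; the assertions show the expected content of each object.

```cpp
#include <cassert>
#include <string>

void constructorDemo()
{
    std::string empty;                          // no argument list
    std::string stars(5, '*');                  // count, ch: five stars
    std::string braces{ 5, '*' };               // initializer list: 2 chars
    std::string const source{ "hello world" };
    std::string tail(source, 6, 100);           // apos 6, an is reduced
    std::string range(source.begin(), source.begin() + 5);

    assert(empty.empty());
    assert(stars == "*****");
    assert(braces.size() == 2 && braces[0] == 5 && braces[1] == '*');
    assert(tail == "world");
    assert(range == "hello");
}
```

Note how braces ends up containing two characters (the character having value 5, followed by a star) instead of five stars.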
Iterators play an important role in the context of generic algorithms (cf. chapter 19). The class std::string defines the following iterator types:
string::iterator and string::const_iterator:
these iterators move forward through the string (technically they are random access iterators). The const_iterator is returned by string const objects, the plain iterator is returned by non-const string objects. Characters referred to by plain iterators may be modified;
string::reverse_iterator and string::const_reverse_iterator:
when incrementing these iterators the previous character in the string object is reached. Other than that they are comparable to, respectively, string::iterator and string::const_iterator.
The following operators are available for string objects (in the examples `object' and `argument' refer to existing std::string objects).
A character, C or C++ string may be assigned to a string object. The assignment operator returns its left-hand side operand. Example:
object = argument;
object = "C string";
object = 'x';
object = 120;               // same as object = 'x'
The arithmetic additive assignment operator and the addition operator add text to a string object. The compound assignment operator returns its left-hand side operand, the addition operator returns its result in a temporary string object. When using the addition operator either the left-hand side operand or the right-hand side operand must be a std::string object. The other operand may be a char, a C string or a C++ string. Example:
object += argument;
object += "hello";
object += 'x';              // integral expressions are OK
argument + otherArgument;   // two std::string objects
argument + "hello";         // using + at least one
"hello" + argument;         // std::string is required
argument + 'a';             // the other operand may be a char
'a' + argument;
The index operator may be used to retrieve object's individual characters, or to assign new values to individual characters of a non-const string object. There is no range-checking (use the at() member function for that). This operator returns a char & or char const &. Example:
object[3] = argument[5];
The logical comparison operators may be applied to two string objects or to a string object and a C string to compare their content. These operators return a bool value. The ==, !=, >, >=, <, and <= operators are available. The ordering operators perform a lexicographical comparison of their content using the ASCII character collating sequence. Example:
object == object;               // true
object != (object + 'x');       // true
object <= (object + 'x');       // true
The insertion-operator (cf. section 3.1.4) may be used to insert a string object into an ostream, the extraction-operator may be used to extract a string object from an istream. The extraction operator by default first ignores all whitespace characters and then extracts all consecutive non-blank characters from an istream. Instead of a string a character array may be extracted as well, but the advantage of using a string object should be clear: the destination string object is automatically resized to the required number of characters. Example:
cin >> object;
cout << object;
The std::string class offers many member functions as well as additional non-member functions that should be considered part of the string class. All these functions are listed below in alphabetic order.
The symbolic value string::npos is defined by the string class. It represents `index-not-found' when returned by member functions returning string offset positions. Example: when calling `object.find('x')' (see below) on a string object not containing the character 'x', npos is returned, as the requested position does not exist.
The final 0-byte used in C strings to indicate the end of an NTBS is not considered part of a C++ string, and so find returns npos, rather than length(), when looking for 0 in a string object containing the characters of a C string.
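Testing a returned index against npos is the usual way to find out whether a search succeeded. The following sketch illustrates this; the names hasChar and nposDemo are made up for this example.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if ch occurs somewhere in text.
bool hasChar(std::string const &text, char ch)
{
    return text.find(ch) != std::string::npos;
}

void nposDemo()
{
    std::string const text{ "hello" };

    assert(text.find('l') == 2);        // index of the first 'l'
    assert(not hasChar(text, 'x'));     // 'x' does not occur in text

    // the NTBS's final 0-byte is not part of the string's content
    assert(text.find('\0') == std::string::npos);
}
```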
Here are the standard functions that operate on objects of the class
string. When a parameter of size_t
is mentioned it may be interpreted as a
parameter of type string::size_type
, but without defining a default
argument value. The type size_type
should be read as
string::size_type
. With size_type
the default argument values
mentioned in section 5.2 apply. All quoted functions are
member functions of the class std::string
, except where indicated
otherwise.
char &at(size_t opos):
returns a reference to the character at index position opos. With string const objects a char const & is returned. The member function performs range-checking, raising a std::out_of_range exception (which, if not caught, aborts the program) if an invalid index is passed.
string &append(InputIterator begin, InputIterator end)
:begin
and end
are
appended to the current string object.
string &append(string const &argument, size_type apos,
size_type an)
:argument
(or a substring) is appended to the current string
object.
string &append(char const *argument, size_type an)
:an
characters of argument
are appended to the
string object.
string &append(size_type n, char ch)
:n
characters ch
are appended to the current string object.
string &assign(string const &argument, size_type
apos, size_type an)
:argument
(or a substring) is assigned to the string object.
If argument
is of type char const *
and one additional
argument is provided the second argument is interpreted as a value
initializing an
, using 0 to initialize apos
.
string &assign(size_type n, char ch)
:n
characters ch
are assigned to the current string object.
char &back():
returns a reference to the last char stored inside the string object. The result is undefined for empty strings.
string::iterator begin()
:const
string objects a const_iterator
is returned.
size_type capacity() const
:string::const_iterator cbegin()
:const_iterator
referring to the first character of the current
string object is returned.
string::const_iterator cend()
:const_iterator
referring to the end of the current
string object is returned.
void clear()
:int compare(string const &argument) const
:argument
is compared using a lexicographical comparison using the
ASCII character collating sequence. zero is returned if the two
strings have identical content, a negative value is returned if the
text in the current object should be ordered before the text in
argument
; a positive value is returned if the text in the current
object should be ordered beyond the text in argument
.
int compare(size_t opos, size_t on, string const &argument) const
:argument
. At most on
characters
starting at offset opos
are compared to the text in argument
.
int compare(size_t opos, size_t on, string const &argument,
size_type apos, size_type an)
:argument
. At most
on
characters of the current string object, starting at offset
opos
, are compared to at most an
characters of argument
,
starting at offset apos
. In this case argument
must be a
string object.
int compare(size_t opos, size_t on, char const *argument, size_t an)
:argument
. At most
on
characters of the current string object starting at offset
opos
are compared to at most an
characters of
argument
. Argument
must have at least an
characters. The
characters may have arbitrary values: 0-valued characters have no
special meanings.
bool contains(argument) const
:true
if the object contains argument's
characters as a
substring. The argument can be a string_view
(see section
5.3), a char
or an NTBS.
size_t copy(char *argument, size_t on, size_type opos) const
:argument
. The actual number of characters copied is returned. The second argument, specifying the number of characters to copy from the current string object, is required. No 0-valued character is appended to the copied string but can be appended to the copied text using an idiom like the following:
argument[object.copy(argument, string::npos)] = 0;
Of course, the programmer should make sure that argument's size is large enough to accommodate the additional 0-byte.
string::const_reverse_iterator crbegin()
:const_reverse_iterator
referring to the last character of the
current string object is returned.
string::const_reverse_iterator crend()
:const_reverse_iterator
referring to the begin of the current
string object is returned.
char const *c_str() const
:char const *data() const
:c_str
does), it can be used to retrieve any kind of information
stored inside the current string object including, e.g., series of
0-bytes:
string s(2, 0);
cout << static_cast<int>(s.data()[1]) << '\n';
bool empty() const
:true
is returned if the current string object contains no data.
string::iterator end()
:const
string
objects a const_iterator
is returned.
bool ends_with(argument) const
:true
if the object's characters end with argument
. The
argument can be a string_view, a char or an NTBS.
string &erase(size_type opos, size_type on)
:string::iterator erase(string::iterator begin, string::iterator end)
:end
is optional. If omitted the value returned by
the current object's end
member is used. The characters defined by
the begin
and end
iterators are erased. The iterator begin
is returned, which is then referring to the position immediately
following the last erased character.
size_t find(string const &argument, size_type opos) const
:argument
is found.
size_t find(char const *argument, size_type opos, size_type an) const
:argument
is found. When all three arguments are specified the
first argument must be a char const *
.
size_t find(char ch, size_type opos) const
:ch
is found.
size_t find_first_of(string const &argument,
size_type opos) const
:argument
.
size_type find_first_of(char const *argument, size_type opos,
size_type an) const
:argument
. If opos
is
provided it refers to the first index in the current string object
where the search for argument
should start. If omitted, the string
object is completely scanned. If an
is provided it indicates the
number of characters of the char const *
argument that should be
used in the search. It defines a substring starting at the beginning
of argument
. If omitted, all of argument
's characters are
used.
size_type find_first_of(char ch, size_type opos)
:ch
.
size_t find_first_not_of(string const &argument,
size_type opos) const
:argument
.
size_type find_first_not_of(char const *argument, size_type opos,
size_type an) const
:argument
. The opos
and an
parameters are handled as with find_first_of
size_t find_first_not_of(char ch, size_type opos) const
:ch
.
size_t find_last_of(string const &argument,
size_type opos) const
:argument
.
size_type find_last_of(char const *argument, size_type opos,
size_type an) const
:argument
. If opos
is
provided it refers to the last index in the current string object
where the search for argument
should start (searching backward
towards the beginning of the current object). If omitted, the string
object is scanned completely. If an
is provided it indicates the
number of characters of the char const *
argument that should be
used in the search. It defines a substring starting at the beginning
of argument
. If omitted, all of argument
's characters are
used.
size_type find_last_of(char ch, size_type opos)
:ch
.
size_t find_last_not_of(string const &argument,
size_type opos) const
:argument
.
size_type find_last_not_of(char const *argument, size_type opos,
size_type an) const
:argument
. The opos
and an
parameters are handled as with find_last_of
.
size_t find_last_not_of(char ch, size_type opos) const
:ch
.
char &front():
returns a reference to the first char stored inside the string object. The result is undefined for empty strings.
allocator_type get_allocator()
:std::string
istream &std::getline(istream &istr, string &object,
char delimiter = '\n')
:string
.istr
. All characters until
delimiter
(or the end of the stream, whichever comes first) are
read from istr
and are stored in object
. If the delimiter is
encountered it is removed from the stream, but is not stored in
object
.istr.eof
returns true
(see
section 6.3.1). Since streams may be interpreted as bool
values (cf. section 6.3.1) a commonly encountered idiom to
read all lines from a stream successively into a string object
line
looks like this:
while (getline(istr, line))
    process(line);
The content of the last line, whether or not it was terminated by a delimiter, is eventually also assigned to object.
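The idiom can be tried out using a std::istringstream instead of a file. The following is a minimal sketch: the names readLines and getlineDemo are made up for this example, and storing the lines in a vector merely serves as a stand-in for processing them.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects all lines that can be read from istr.
std::vector<std::string> readLines(std::istream &istr)
{
    std::vector<std::string> lines;
    std::string line;

    while (std::getline(istr, line))    // the stream is used as a bool
        lines.push_back(line);

    return lines;
}

void getlineDemo()
{
    std::istringstream in{ "first\nsecond\nlast without newline" };
    std::vector<std::string> lines = readLines(in);

    assert(lines.size() == 3);
    assert(lines[0] == "first");        // the delimiter is not stored
    assert(lines[2] == "last without newline");
}
```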
string &insert(size_t opos, string const &argument,
size_type apos, size_type an)
:argument
is inserted into the current string
object at the current string object's index position
opos
. Arguments for apos
and an
must either both be
provided or they must both be omitted.
string &insert(size_t opos, char const *argument,
size_type an)
:argument
(of type char const *
) is inserted at index opos
into the current string object.
string &insert(size_t opos, size_t count, char ch)
:Count
characters ch
are inserted at index opos
into the
current string object.
string::iterator insert(string::iterator begin, char ch)
:ch
is inserted at the current object's position
referred to by begin
. Begin
is returned.
string::iterator insert(string::iterator begin, size_t count,
char ch)
:Count
characters ch
are inserted at the current object's
position referred to by begin
. Begin
is returned.
string::iterator insert(string::iterator begin, InputIterator
abegin, InputIterator aend)
:InputIterators abegin
and aend
are inserted at the current object's position referred to
by begin
. Begin
is returned.
size_t length() const
:size_t max_size() const
:void pop_back()
:void push_back(char ch)
:ch
is appended to the string object.
string::reverse_iterator rbegin()
:const
string objects a const_reverse_iterator is returned.
string::reverse_iterator rend()
:const
string objects a const_reverse_iterator is returned.
string &replace(size_t opos, size_t on, string const
&argument, size_type apos, size_type an)
:object
are replaced by the (subset
of) characters of argument
. If on
is specified as 0
argument
is inserted into object
at offset opos
.
string &replace(size_t opos, size_t on, char const *argument,
size_type an)
:object
are replaced by the first an
characters of char const *
argument.
string &replace(size_t opos, size_t on, size_type count, char ch)
:on
characters of the current string object, starting at index
position opos
, are replaced by count
characters ch
.
string &replace(string::iterator begin, string::iterator end,
string const &argument)
:begin
and end
are replaced by argument
. If
argument
is a char const *
, an additional argument an
may
be used, specifying the number of characters of argument
that are
used in the replacement.
string &replace(string::iterator begin, string::iterator end,
size_type count, char ch)
:begin
and end
are replaced by count
characters
having values ch
.
string &replace(string::iterator begin, string::iterator end,
InputIterator abegin, InputIterator aend)
:begin
and end
are replaced by the characters in the
range defined by the InputIterators abegin
and aend
.
void reserve(size_t request)
:request
. After calling this member, capacity
's return value
will be at least request
. A request for a smaller size than the
value returned by capacity
is ignored. A
std::length_error
exception is thrown if
request
exceeds the value returned by max_size
(std::length_error
is defined in the stdexcept
header). Calling reserve()
has the effect of redefining a string's
capacity: when enlarging the capacity extra memory is allocated, but
not immediately available to the program. This is illustrated by the
exception thrown by the string's at()
member when trying to access
an element exceeding the string's size
but not the string's
capacity
.
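The difference between a string's capacity and its size, as described for reserve, can be made visible by a sketch like the following. The function name reserveDemo is made up for this example; the exception caught is the std::out_of_range exception thrown by at.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

void reserveDemo()
{
    std::string text{ "abc" };
    text.reserve(100);

    assert(text.capacity() >= 100);     // extra memory has been allocated
    assert(text.size() == 3);           // but the content did not change

    bool thrown = false;
    try
    {
        (void)text.at(50);              // within capacity, beyond size
    }
    catch (std::out_of_range const &)
    {
        thrown = true;
    }
    assert(thrown);
}
```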
void resize(size_t size, char ch = 0)
:size
characters. If the string object is resized to a size larger
than its current size the additional characters will be initialized to
ch
. If it is reduced in size the characters having the highest
indices are chopped off.
size_t rfind(string const &argument, size_type opos) const
:argument
is
found is returned. Searching proceeds from the current object's offset
opos
back to its beginning.
size_t rfind(char const *argument, size_type opos, size_type an) const
:argument
is
found is returned. Searching proceeds from the current object's
offset opos
back to its beginning. The parameter an
specifies
the length of the substring of argument
to look for, starting at
argument
's beginning.
size_t rfind(char ch, size_type opos) const
:ch
is found
is returned. Searching proceeds from the current object's offset
opos
back to its beginning.
void shrink_to_fit()
:string{ stringObject }.swap(stringObject) idiom can be used.
size_t size() const
:length()
.
bool starts_with(argument) const
:true
if the object's character range starts with
argument
. The argument can be a string_view, a char or an NTBS.
string substr(size_type opos, size_type on) const
:on
characters
starting at index opos
is returned.
void swap(string &argument)
:argument
. For this member argument
must be a
string object and cannot be a char const *
.
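Several of the members listed above are often used in combination. As an illustration, the following sketch uses find_first_not_of, find_last_not_of and substr to remove surrounding blanks, and find and replace to substitute text. The functions trim, replaceAll and searchDemo are hypothetical helpers, written for this example only; they are not members of the string class.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns text without leading and trailing blanks.
std::string trim(std::string const &text)
{
    char const blanks[] = " \t\n";

    std::string::size_type first = text.find_first_not_of(blanks);
    if (first == std::string::npos)     // only blanks, or an empty string
        return std::string{};

    std::string::size_type last = text.find_last_not_of(blanks);
    return text.substr(first, last - first + 1);
}

// Hypothetical helper: replaces every occurrence of from by to.
std::string replaceAll(std::string text, std::string const &from,
                       std::string const &to)
{
    if (from.empty())                   // nothing to look for
        return text;

    std::string::size_type pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos)
    {
        text.replace(pos, from.length(), to);
        pos += to.length();             // continue beyond the inserted text
    }
    return text;
}

void searchDemo()
{
    assert(trim("  hello world\t\n") == "hello world");
    assert(trim(" \t ").empty());
    assert(replaceAll("a-b-c", "-", "--") == "a--b--c");
    assert(replaceAll("aaa", "a", "aa") == "aaaaaa");
}
```

In replaceAll the search continues beyond the inserted text, which prevents endless loops when the replacement text itself contains the text that is looked for.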
std::string
objects. These functions are listed below in alphabetic
order. They are not member functions, but class-less (free) functions declared
in the std
namespace. The <string>
header file must be included
before they can be used.
float stof(std::string const &str, size_t *pos = 0)
:str
are ignored. Then the
following sequences of characters are converted to a float
value,
which is returned:
inf
or infinity
(case insensitive words)
nan
or nan(alphanumeric character sequence)
(nan
is a case insensitive word), resulting in a NaN
floating point value
pos != 0
the index of the first character in str
which was
not converted is returned in *pos
. A std::invalid_argument
exception is thrown if the characters in str
could not be
converted to a float
, a std::out_of_range
exception is thrown
if the converted value would have exceeded the range of float
values.
double stod(std::string const &str, size_t *pos = 0)
:stof
is performed, but now to a
value of type double
.
long double stold(std::string const &str, size_t *pos = 0)
:stof
is performed, but now to a
value of type long double
.
int stoi(std::string const &str, size_t *pos = 0,
int base = 10)
:str
are ignored. Then all
characters representing numeric constants of the number system whose
base
is specified are converted to an int
value, which is
returned. An optional + or - character may prefix the numeric characters. If base is specified as 0, values starting with 0 are interpreted as octal values and values starting with 0x or 0X as hexadecimal values; otherwise the specified base is used. The value base must be 0 or between 2 and 36. If pos != 0
the index of the first character in str
which was not converted
is returned in *pos
. A std::invalid_argument
exception is
thrown if the characters in str
could not be converted to an
int
, a std::out_of_range
exception is thrown if the converted
value would have exceeded the range of int
values.
Here is an example of its use:
int value = stoi(" -123"s);         // assigns value -123
value = stoi(" 123"s, 0, 5);        // assigns value 38
long stol(std::string const &str, size_t *pos = 0,
int base = 10)
:stoi
is performed, but now to a
value of type long
.
long long stoll(std::string const &str, size_t *pos = 0,
int base = 10)
:stoi
is performed, but now to a
value of type long long
.
unsigned long stoul(std::string const &str, size_t *pos = 0,
int base = 10)
:stoi
is performed, but now to a
value of type unsigned long
.
unsigned long long stoull(std::string const &str,
size_t *pos = 0, int base = 10)
:stoul
is performed, but now to a
value of type unsigned long long
.
std::string to_string(Type value)
:int, long, long long, unsigned, unsigned
long, unsigned long long, float, double,
or long double
. The
value of the argument is converted to a textual representation, which
is returned as a std::string
value.
std::wstring to_wstring(Type value)
:to_string
is performed, returning
a std::wstring
.
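The pos parameter and the exceptions thrown by the conversion functions can be combined to verify that a complete string was converted. The following is a minimal sketch of that idea: the names toInt and conversionDemo are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts text to an int, accepting the conversion
// only if all of text's characters were used.
bool toInt(std::string const &text, int &dest)
{
    try
    {
        std::size_t pos = 0;
        int value = std::stoi(text, &pos);

        if (pos != text.length())       // unconverted characters remain
            return false;

        dest = value;
        return true;
    }
    catch (std::invalid_argument const &)   // no conversion was possible
    {
        return false;
    }
    catch (std::out_of_range const &)       // value doesn't fit in an int
    {
        return false;
    }
}

void conversionDemo()
{
    int value = 0;

    assert(toInt(" -123", value) && value == -123);
    assert(not toInt("12abc", value));
    assert(not toInt("abc", value));
    assert(not toInt("99999999999999999999", value));

    assert(std::stoi("ff", nullptr, 16) == 255);
    assert(std::stoi("0x1A", nullptr, 0) == 26);    // base 0: prefix decides
    assert(std::to_string(42) == "42");
}
```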
In addition to the class std::string the class std::string_view can be used as a wrapper-class of char arrays. The class string_view can be considered a light-weight string class. Before using std::string_view objects the <string_view> header file must have been included.
In addition to the standard constructors (default, copy, move) it offers the following constructors:
constexpr string_view(char const *src, size_t nChars), constructs a string_view object from the first nChars characters of src. The characters in the range [src, src + nChars) may be 0-valued characters;
constexpr string_view(char const *src), constructs a string_view object from the NTBS starting at src. The argument passed to this constructor may not be a null pointer;
constexpr string_view(Iterator begin, Iterator end), constructs a string_view object from the characters in the iterator-range [begin, end).
A string_view object does not contain its own copy of the initialized data. Instead, it refers to the characters that were used when it was initially constructed. E.g., the following program produces unpredictable output, but when the hello array is defined as a static array it shows hello:
#include <string_view>
#include <iostream>
using namespace std;

string_view fun()
{
    char hello[] = "hello";
    return { hello };           // refers to a local array
}

int main()
{
    string_view obj = fun();    // hello no longer exists at this point
    cout << obj << '\n';
}
The std::string_view class provides the same members as std::string, except for members modifying the characters it refers to: neither appending nor inserting characters is possible, and string_view objects cannot modify their characters (the index operator and the at member only offer read-only access).
The string_view
class also offers some extra members:
remove_prefix(size_t step)
:step
positions.
remove_suffix(size_t step)
:step
positions.
constexpr string_view operator""sv(char const *str, size_t len):
returns a string_view object containing len characters of str.
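Because a string_view merely refers to existing characters, remove_prefix and remove_suffix only adjust the view and never copy or modify the characters themselves. The following sketch illustrates this; the names baseName and viewDemo are made up for this example, and the sketch assumes that the characters referred to outlive the views.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: the part of path following its last '/'.
// The returned view refers to the caller's characters, so those characters
// must remain available for as long as the returned view is used.
std::string_view baseName(std::string_view path)
{
    std::string_view::size_type slash = path.rfind('/');

    if (slash != std::string_view::npos)
        path.remove_prefix(slash + 1);  // skip everything through the '/'

    return path;
}

void viewDemo()
{
    using namespace std::literals::string_view_literals;

    std::string const path{ "/usr/include/string_view" };
    assert(baseName(path) == "string_view"sv);

    std::string_view name = "report.txt"sv;
    name.remove_suffix(4);              // drops ".txt", nothing is copied
    assert(name == "report");

    assert(baseName("plain") == "plain");
}
```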
Like std::string the std::string_view class provides hashing facilities, so string_view objects can be used as keys in, e.g., unordered_map containers (cf. chapter 12).