Module BatString

module BatString: sig .. end

String operations.

Given a string s of length l, we call character number in s the index of a character in s. Indexes start at 0, and we will call a character number valid in s if it falls within the range [0...l-1]. A position is the point between two characters or at the beginning or end of the string. We call a position valid in s if it falls within the range [0...l]. Note that character number n is between positions n and n+1.

Two parameters start and len are said to designate a valid substring of s if len >= 0 and start and start+len are valid positions in s.
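The definitions above translate directly into predicates. The following is a minimal sketch using only the standard library; the helper names valid_position, valid_char_number and designates_valid_substring are hypothetical and are not part of BatString.

```ocaml
(* Hypothetical helpers illustrating the definitions above;
   they are not part of BatString. *)

(* A position is valid in [s] if it lies in the range [0...l]. *)
let valid_position s p = 0 <= p && p <= String.length s

(* A character number is valid in [s] if it lies in the range [0...l-1]. *)
let valid_char_number s n = 0 <= n && n < String.length s

(* [start] and [len] designate a valid substring of [s] if [len >= 0]
   and both [start] and [start + len] are valid positions in [s]. *)
let designates_valid_substring s start len =
  len >= 0 && valid_position s start && valid_position s (start + len)
```

Note that position l (the end of the string) is valid while character number l is not, which is why an empty substring may start at the very end of s.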

This module replaces Stdlib's String module.

If you're going to do a lot of string slicing, BatSubstring might be a useful module to represent slices of strings, as it doesn't allocate new strings on every operation.


val init : int -> (int -> char) -> string

init l f returns the string of length l with the chars f 0, f 1, f 2, ..., f (l-1).

Example: String.init 256 char_of_int

val empty : string

The empty string.

val is_empty : string -> bool

is_empty s returns true if s is the empty string, false otherwise.

Usually a tad faster than comparing s with "".

Example (for some string s):  if String.is_empty s then "(Empty)" else s 

val of_bytes : Stdlib.Bytes.t -> string

Return a new string that contains the same bytes as the given byte sequence.

val to_bytes : string -> Stdlib.Bytes.t

Return a new byte sequence that contains the same bytes as the given string.

val cat : string -> string -> string

cat s1 s2 concatenates s1 and s2 (equivalent to s1 ^ s2).

val for_all : (char -> bool) -> string -> bool

for_all p s checks whether all chars in s satisfy the predicate p.

val length : string -> int

Return the length (number of characters) of the given string.

val get : string -> int -> char

String.get s n returns character number n in string s. You can also write s.[n] instead of String.get s n.

val set : Stdlib.Bytes.t -> int -> char -> unit

String.set s n c modifies the byte sequence s in place, replacing the character number n by c. You can also write s.[n] <- c instead of String.set s n c.

val create : int -> Stdlib.Bytes.t

String.create n returns a fresh byte sequence of length n. The sequence initially contains arbitrary characters.

val make : int -> char -> string

String.make n c returns a fresh string of length n, filled with the character c.

val copy : string -> string

Return a copy of the given string.

val sub : string -> int -> int -> string

String.sub s start len returns a fresh string of length len, containing the substring of s that starts at position start and has length len.

val fill : Stdlib.Bytes.t -> int -> int -> char -> unit

String.fill s start len c modifies the byte sequence s in place, replacing len characters by c, starting at start.

val blit : string -> int -> Stdlib.Bytes.t -> int -> int -> unit

String.blit src srcoff dst dstoff len copies len characters from string src, starting at character number srcoff, to the byte sequence dst, starting at character number dstoff.

val concat : string -> string list -> string

String.concat sep sl concatenates the list of strings sl, inserting the separator string sep between each.

val iter : (char -> unit) -> string -> unit

String.iter f s applies function f in turn to all the characters of s. It is equivalent to f s.[0]; f s.[1]; ...; f s.[String.length s - 1]; ().

val iteri : (int -> char -> unit) -> string -> unit

Same as String.iter, but the function is applied to the index of the element as first argument (counting from 0), and the character itself as second argument.

val map : (char -> char) -> string -> string

String.map f s applies function f in turn to all the characters of s and stores the results in a new string that is returned.

val mapi : (int -> char -> char) -> string -> string

String.mapi f s calls f with each character of s and its index (in increasing index order) and stores the results in a new string that is returned.

val trim : string -> string

Return a copy of the argument, without leading and trailing whitespace (according to BatChar.is_whitespace). The characters regarded as whitespace are: ' ', '\n', '\r', '\t', '\012' and '\026'. If there is no leading nor trailing whitespace character in the argument, return the original string itself, not a copy.

val escaped : string -> string

Return a copy of the argument, with special characters represented by escape sequences, following the lexical conventions of OCaml. If there is no special character in the argument, return the original string itself, not a copy. Its inverse function is Scanf.unescaped.

val index : string -> char -> int

String.index s c returns the character number of the first occurrence of character c in string s.

val index_opt : string -> char -> int option

String.index_opt s c returns the index of the first occurrence of character c in string s, or None if c does not occur in s.

val rindex : string -> char -> int

String.rindex s c returns the character number of the last occurrence of character c in string s.

val rindex_opt : string -> char -> int option

String.rindex_opt s c returns the index of the last occurrence of character c in string s, or None if c does not occur in s.

val index_from : string -> int -> char -> int

String.index_from s i c returns the character number of the first occurrence of character c in string s after or at position i. String.index s c is equivalent to String.index_from s 0 c.

val index_from_opt : string -> int -> char -> int option

String.index_from_opt s i c returns the index of the first occurrence of character c in string s after position i or None if c does not occur in s after position i.

String.index_opt s c is equivalent to String.index_from_opt s 0 c. Raise Invalid_argument if i is not a valid position in s.

val rindex_from : string -> int -> char -> int

String.rindex_from s i c returns the character number of the last occurrence of character c in string s before position i+1. String.rindex s c is equivalent to String.rindex_from s (String.length s - 1) c.

val rindex_from_opt : string -> int -> char -> int option

String.rindex_from_opt s i c returns the index of the last occurrence of character c in string s before position i+1 or None if c does not occur in s before position i+1.

String.rindex_opt s c is equivalent to String.rindex_from_opt s (String.length s - 1) c.

Raise Invalid_argument if i+1 is not a valid position in s.

val index_after_n : char -> int -> string -> int

index_after_n chr n str returns the index of the character that comes immediately after the n-th occurrence of chr in str.

val contains : string -> char -> bool

String.contains s c tests if character c appears in the string s.

val contains_from : string -> int -> char -> bool

String.contains_from s start c tests if character c appears in s after position start. String.contains s c is equivalent to String.contains_from s 0 c.

val rcontains_from : string -> int -> char -> bool

String.rcontains_from s stop c tests if character c appears in s before position stop+1.

val uppercase : string -> string

Return a copy of the argument, with all lowercase letters translated to uppercase, including accented letters of the ISO Latin-1 (8859-1) character set.

val lowercase : string -> string

Return a copy of the argument, with all uppercase letters translated to lowercase, including accented letters of the ISO Latin-1 (8859-1) character set.

val capitalize : string -> string

Return a copy of the argument, with the first character set to uppercase.

val uncapitalize : string -> string

Return a copy of the argument, with the first character set to lowercase.

val uppercase_ascii : string -> string

Return a copy of the argument, with all lowercase letters translated to uppercase, using the US-ASCII character set.

val lowercase_ascii : string -> string

Return a copy of the argument, with all uppercase letters translated to lowercase, using the US-ASCII character set.

val capitalize_ascii : string -> string

Return a copy of the argument, with the first character set to uppercase, using the US-ASCII character set.

val uncapitalize_ascii : string -> string

Return a copy of the argument, with the first character set to lowercase, using the US-ASCII character set.

type t = string 

An alias for the type of strings.

val compare : t -> t -> int

The comparison function for strings, with the same specification as Pervasives.compare. Along with the type t, this function compare allows the module String to be passed as argument to the functors Set.Make and Map.Make.

Conversions
val enum : string -> char BatEnum.t

Returns an enumeration of the characters of a string. The behaviour is unspecified if the string is mutated while it is enumerated.

Examples:
 "foo" |> String.enum |> List.of_enum = ['f'; 'o'; 'o']
 String.enum "a b c" // ((<>) ' ') |> String.of_enum = "abc"

val of_enum : char BatEnum.t -> string

Creates a string from a character enumeration. Example: ['f'; 'o'; 'o'] |> List.enum |> String.of_enum = "foo"

val backwards : string -> char BatEnum.t

Returns an enumeration of the characters of a string, from last to first.

Examples:
 "foo" |> String.backwards |> String.of_enum = "oof"
 let rev s = String.backwards s |> String.of_enum

val of_backwards : char BatEnum.t -> string

Build a string from an enumeration, starting with last character, ending with first.

Examples:
 "foo" |> String.enum |> String.of_backwards = "oof"
 "foo" |> String.backwards |> String.of_backwards = "foo"
 let rev s = String.enum s |> String.of_backwards

val of_list : char list -> string

Converts a list of characters to a string.

Example: ['c'; 'h'; 'a'; 'r'; 's'] |> String.of_list = "chars"

val to_list : string -> char list

Converts a string to the list of its characters.

Example:  String.to_list "string" |> List.interleave ';' |> String.of_list = "s;t;r;i;n;g" 

val of_int : int -> string

Returns the string representation of an int.

Example:  String.of_int 56 = "56" && String.of_int (-1) = "-1" 

val of_float : float -> string

Returns the string representation of a float.

Example:  String.of_float 1.246 = "1.246" 

val of_char : char -> string

Returns a string containing one given character.

Example:  String.of_char 's' = "s" 

val to_int : string -> int

Returns the integer represented by the given string, or raises Failure if the string does not represent an integer.

val to_float : string -> float

Returns the float represented by the given string, or raises Failure if the string does not represent a float.

String traversals
val map : (char -> char) -> string -> string

map f s returns a string where all characters c in s have been replaced by f c.

Example: String.map Char.uppercase "Five" = "FIVE"

val fold_left : ('a -> char -> 'a) -> 'a -> string -> 'a

fold_left f a s is f (... (f (f a s.[0]) s.[1]) ...) s.[n-1]

Examples:
 String.fold_left (fun li c -> c::li) [] "foo" = ['o';'o';'f']
 String.fold_left max 'a' "apples" = 's'

val fold_lefti : ('a -> int -> char -> 'a) -> 'a -> string -> 'a

As fold_left, but with the index of the element as additional argument

val fold_right : (char -> 'a -> 'a) -> string -> 'a -> 'a

fold_right f s b is f s.[0] (f s.[1] (... (f s.[n-1] b) ...))

Examples:
 String.fold_right List.cons "foo" [] = ['f';'o';'o']
 String.fold_right (fun c a -> if c = ' ' then a+1 else a) "a b c" 0 = 2
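The fold_left and fold_right equations can be unrolled into simple loops. The sketch below uses only the standard library; fold_left_sketch and fold_right_sketch are hypothetical names chosen for illustration and are not the BatString implementation.

```ocaml
(* fold_left f a s = f (... (f (f a s.[0]) s.[1]) ...) s.[n-1]:
   the accumulator is threaded through the characters left to right. *)
let fold_left_sketch f init s =
  let acc = ref init in
  for i = 0 to String.length s - 1 do
    acc := f !acc s.[i]
  done;
  !acc

(* fold_right f s b = f s.[0] (f s.[1] (... (f s.[n-1] b) ...)):
   the innermost call uses the last character, so we walk right to left. *)
let fold_right_sketch f s init =
  let acc = ref init in
  for i = String.length s - 1 downto 0 do
    acc := f s.[i] !acc
  done;
  !acc
```

With these definitions, consing characters with the left fold reverses their order, while the right fold preserves it.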

val fold_righti : (int -> char -> 'a -> 'a) -> string -> 'a -> 'a

As fold_right, but with the index of the element as additional argument

val filter : (char -> bool) -> string -> string

filter f s returns a copy of string s in which only characters c such that f c = true remain.

Example: String.filter ((<>) ' ') "a b c" = "abc"

val filter_map : (char -> char option) -> string -> string

filter_map f s calls f a0, f a1, ..., f an where a0..an are the characters of s. It returns the string of characters ci such that f ai = Some ci (when f returns None, the corresponding element of s is discarded).

Example: String.filter_map (function 'a'..'z' as c -> Some (Char.uppercase c) | _ -> None) "a b c" = "ABC"

val iteri : (int -> char -> unit) -> string -> unit

String.iteri f s is equivalent to f 0 s.[0]; f 1 s.[1]; ...; f (len-1) s.[len-1] where len is the length of string s. Example:

 let letter_positions word =
   let positions = Array.make 256 [] in
   let count_letter pos c =
     positions.(int_of_char c) <- pos :: positions.(int_of_char c) in
   String.iteri count_letter word;
   Array.mapi (fun c pos -> (char_of_int c, List.rev pos)) positions
   |> Array.to_list
   |> List.filter (fun (_, pos) -> pos <> [])
 in
 letter_positions "hello" = ['e',[1]; 'h',[0]; 'l',[2;3]; 'o',[4]]
    
Finding
val find : string -> string -> int

find s x returns the starting index of the first occurrence of string x within string s.

Note: this implementation is optimized for short strings.

val find_from : string -> int -> string -> int

find_from s pos x behaves as find s x but starts searching at position pos. find s x is equivalent to find_from s 0 x.

val rfind : string -> string -> int

rfind s x returns the starting index of the last occurrence of string x within string s.

Note: this implementation is optimized for short strings.

val rfind_from : string -> int -> string -> int

rfind_from s pos x behaves as rfind s x but starts searching from the right at position pos + 1. rfind s x is equivalent to rfind_from s (String.length s - 1) x.

Beware: it searches between the beginning of the string and position pos + 1, not between pos + 1 and the end.

val find_all : string -> string -> int BatEnum.t

find_all s x enumerates positions of s at which x occurs. Example: find_all "aabaabaa" "aba" |> List.of_enum will return the list [1; 4].
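The example shows that overlapping occurrences are reported. A naive stand-in that returns a list rather than a BatEnum.t could look like the sketch below; find_all_sketch is a hypothetical name, it uses only the standard library, and its behaviour for an empty x (every position matches) is an assumption rather than documented BatString behaviour.

```ocaml
(* Hypothetical list-returning stand-in for find_all.
   Every start index i is tried in turn, so occurrences may overlap. *)
let find_all_sketch s x =
  let ls = String.length s and lx = String.length x in
  let rec go i acc =
    if i + lx > ls then List.rev acc              (* x no longer fits *)
    else if String.sub s i lx = x then go (i + 1) (i :: acc)
    else go (i + 1) acc
  in
  go 0 []
```

This allocates a substring per candidate index, so it is meant to show the semantics only, not an efficient search.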

val count_string : string -> string -> int

count_string s x counts how many times x is found in s.

val ends_with : string -> string -> bool

ends_with s x returns true if the string s ends with x, false otherwise.

Example: String.ends_with "foobarbaz" "rbaz" = true

val starts_with : string -> string -> bool

starts_with s x returns true if s starts with x, false otherwise.

Example: String.starts_with "foobarbaz" "fooz" = false

val starts_with_stdlib : prefix:string -> string -> bool

Equivalent to starts_with but the prefix is a labelled parameter.

val ends_with_stdlib : suffix:string -> string -> bool

Equivalent to ends_with but the suffix is a labelled parameter.

val exists : string -> string -> bool

exists str sub returns true if sub is a substring of str or false otherwise.

Example: String.exists "foobarbaz" "obar" = true

val exists_stdlib : (char -> bool) -> string -> bool

exists_stdlib p str checks whether at least one char of str satisfies the predicate p.

val count_char : string -> char -> int

count_char str c returns the number of times c is used in str.

Transformations
val lchop : ?n:int -> string -> string

Returns the same string but without the first n characters. By default n is 1. Raise Invalid_argument if n is strictly less than zero.

val rchop : ?n:int -> string -> string

Returns the same string but without the last n characters. By default n is 1. Raise Invalid_argument if n is strictly less than zero.

val chop : ?l:int -> ?r:int -> string -> string

Returns the same string but with the first l characters on the left and the first r characters on the right removed. By default, l and r are both 1.

chop ~l ~r s is equivalent to lchop ~n:l (rchop ~n:r s).
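The relationship between the three chopping functions can be sketched with String.sub. The code below is illustrative only and uses the standard library; the _sketch names are hypothetical, and two behaviours are assumptions: a negative n is rejected with Invalid_argument, and chopping at least as many characters as the string holds yields the empty string.

```ocaml
(* Hypothetical re-implementations for illustration.
   Assumptions: negative n raises Invalid_argument; chopping more
   characters than the string holds yields "". *)
let lchop_sketch ?(n = 1) s =
  if n < 0 then invalid_arg "lchop_sketch: negative n"
  else
    let len = String.length s in
    if n >= len then "" else String.sub s n (len - n)

let rchop_sketch ?(n = 1) s =
  if n < 0 then invalid_arg "rchop_sketch: negative n"
  else
    let len = String.length s in
    if n >= len then "" else String.sub s 0 (len - n)

(* chop ~l ~r s is lchop ~n:l (rchop ~n:r s), as stated above. *)
let chop_sketch ?(l = 1) ?(r = 1) s =
  lchop_sketch ~n:l (rchop_sketch ~n:r s)
```

A typical use is stripping a pair of delimiters, for instance removing the surrounding parentheses from "(x)" with the default arguments.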

val quote : string -> string

Add quotes around a string and escape any quote or escape appearing in that string. This function is used typically when you need to generate source code from a string.

Examples:
 String.quote "foo" = "\"foo\""
 String.quote "\"foo\"" = "\"\\\"foo\\\"\""
 String.quote "\n" = "\"\\n\""
 etc.

More precisely, the returned string conforms to the OCaml syntax: if printed, it outputs a representation of the input string as an OCaml string literal.

val left : string -> int -> string

left r len returns the string containing the len first characters of r. If r contains fewer than len characters, it returns r.

Examples:
 String.left "Weeble" 4 = "Weeb"
 String.left "Weeble" 0 = ""
 String.left "Weeble" 10 = "Weeble"

val right : string -> int -> string

right r len returns the string containing the len last characters of r. If r contains fewer than len characters, it returns r.

Example: String.right "Weeble" 4 = "eble"

val head : string -> int -> string

Same as BatString.left.

val tail : string -> int -> string

tail r pos returns the string containing all but the pos first characters of r

Example: String.tail "Weeble" 4 = "le"
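The behaviour of left, right and tail on the examples above can be reproduced with String.sub. This is a minimal sketch using only the standard library; the _sketch names are hypothetical, len and pos are assumed to be non-negative, and returning "" when pos exceeds the length of r is an assumption.

```ocaml
(* Hypothetical stand-ins; [len] and [pos] are assumed non-negative. *)

(* The [len] first characters of [r], or [r] itself if it is shorter. *)
let left_sketch r len =
  if len >= String.length r then r else String.sub r 0 len

(* The [len] last characters of [r], or [r] itself if it is shorter. *)
let right_sketch r len =
  let n = String.length r in
  if len >= n then r else String.sub r (n - len) len

(* Everything but the [pos] first characters of [r]
   (assumption: "" when [pos] is past the end). *)
let tail_sketch r pos =
  let n = String.length r in
  if pos >= n then "" else String.sub r pos (n - pos)
```

For any non-negative n, left_sketch r n ^ tail_sketch r n rebuilds r, which is a convenient way to remember how the two functions relate.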

val strip : ?chars:string -> string -> string

Returns the string without the chars if they are at the beginning or at the end of the string. By default chars are " \t\r\n".

Examples:
 String.strip " foo " = "foo"
 String.strip ~chars:" ,()" " boo() bar()" = "boo() bar"

val replace_chars : (char -> string) -> string -> string

replace_chars f s returns a string where all chars c of s have been replaced by the string returned by f c.

Example: String.replace_chars (function ' ' -> "(space)" | c -> String.of_char c) "foo bar" = "foo(space)bar"

val replace : str:string -> sub:string -> by:string -> bool * string

replace ~str ~sub ~by returns a tuple consisting of a boolean and a string where the first occurrence of the string sub within str has been replaced by the string by. The boolean is true if a substitution has taken place.

Example: String.replace "foobarbaz" "bar" "rab" = (true, "foorabbaz")

val nreplace : str:string -> sub:string -> by:string -> string

nreplace ~str ~sub ~by returns a string obtained by iteratively replacing each occurrence of sub by by in str, from right to left. It returns a copy of str if sub has no occurrence in str.

Example: nreplace ~str:"bar foo aaa bar" ~sub:"aa" ~by:"foo" = "bar foo afoo bar"
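The right-to-left order matters when occurrences of sub overlap, as in "aaa" above. The sketch below reproduces that behaviour with the standard library only; nreplace_sketch is a hypothetical name, and rejecting an empty sub with Invalid_argument is an assumption made here to keep the loop well defined.

```ocaml
(* Hypothetical right-to-left replacement, for illustration only. *)
let nreplace_sketch ~str ~sub ~by =
  let ls = String.length str and lsub = String.length sub in
  if lsub = 0 then invalid_arg "nreplace_sketch: empty sub";
  let pieces = ref [] in        (* collected pieces, leftmost first *)
  (* Largest index i <= from at which [sub] occurs, if any. *)
  let rec find_back i =
    if i < 0 then None
    else if String.sub str i lsub = sub then Some i
    else find_back (i - 1)
  in
  (* [stop] is the position before which matches are still searched. *)
  let rec go stop =
    match find_back (stop - lsub) with
    | None -> pieces := String.sub str 0 stop :: !pieces
    | Some i ->
      (* keep the text between this match and the previous boundary *)
      pieces := by :: String.sub str (i + lsub) (stop - i - lsub) :: !pieces;
      go i
  in
  go ls;
  String.concat "" !pieces
```

Scanning from the right means that in "aaa" the last two characters are matched first, leaving the leading "a" untouched.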

val repeat : string -> int -> string

repeat s n returns s ^ s ^ ... ^ s

Example: String.repeat "foo" 4 = "foofoofoofoo"

val rev : string -> string

rev s returns the reverse of string s

In-Place Transformations
val rev_in_place : Stdlib.Bytes.t -> unit

rev_in_place s mutates the byte sequence s, so that its new value is the mirror of its old one: for instance if s contained "Example!", after the mutation it will contain "!elpmaxE".

val in_place_mirror : Stdlib.Bytes.t -> unit
Deprecated. Use String.rev_in_place instead.

Splitting around
val split_on_char : char -> string -> string list

String.split_on_char sep s returns the list of all (possibly empty) substrings of s that are delimited by the sep character.

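The function's output is specified by the following invariants: the list is not empty; concatenating its elements using sep as a separator returns a string equal to the input (String.concat (String.make 1 sep) (String.split_on_char sep s) = s); and no string in the result contains the sep character.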

Note: prior to 2.11.0 split_on_char _ "" used to return an empty list.
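The invariants that the standard library documents for split_on_char (a non-empty result, a round trip through String.concat, and no separator inside any piece) can be checked mechanically. The sketch below uses Stdlib's String.split_on_char, on the assumption that it behaves like the current BatString function; check_invariants is a hypothetical helper.

```ocaml
(* Checks the three invariants for one separator and one input,
   using Stdlib's String.split_on_char. *)
let check_invariants sep s =
  let parts = String.split_on_char sep s in
  (* 1. the list is not empty *)
  parts <> []
  (* 2. joining the pieces with sep gives back the input *)
  && String.concat (String.make 1 sep) parts = s
  (* 3. no piece contains the separator *)
  && List.for_all (fun p -> not (String.contains p sep)) parts
```

In particular, splitting the empty string yields a list holding one empty string, which is what keeps the first two invariants true for that input.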

val split : string -> by:string -> string * string

split s sep splits the string s at the first occurrence of sep, and returns the two parts before and after the occurrence (excluded).

val rsplit : string -> by:string -> string * string

rsplit s sep splits the string s at the last occurrence of sep, and returns the two parts before and after the occurrence (excluded).

val nsplit : string -> by:string -> string list
Deprecated. use BatString.split_on_string

nsplit s sep splits the string s into a list of strings which are separated by sep (excluded). nsplit "" _ returns a single empty string. Note: prior to 2.11.0 nsplit "" _ used to return an empty list.

Example: String.nsplit "abcabcabc" "bc" = ["a"; "a"; "a"; ""]

val split_on_string : by:string -> string -> string list

split_on_string sep s splits the string s into a list of strings which are separated by sep (excluded). split_on_string _ "" returns a single empty string. Note: split_on_string sep s is identical to nsplit s sep but for empty strings.

Example: String.split_on_string "bc" "abcabcabc" = ["a"; "a"; "a"; ""]

val cut_on_char : char -> int -> string -> string

Similar to Unix cut. cut_on_char chr n str returns the substring of str located strictly between the n-th occurrence of chr and the n+1-th one.

Remark: cut_on_char can return the empty string. Examples of this behaviour are cut_on_char ',' 1 "foo,,bar" and cut_on_char ',' 0 ",foo".

val join : string -> string list -> string

Same as BatString.concat

val slice : ?first:int -> ?last:int -> string -> string

slice ?first ?last s returns a "slice" of the string which corresponds to the characters s.[first], s.[first+1], ..., s.[last-1]. Note that the character at index last is not included! If first is omitted it defaults to the start of the string, i.e. index 0, and if last is omitted it defaults to point just past the end of s, i.e. length s. Thus, slice s is equivalent to copy s.

Negative indexes are interpreted as counting from the end of the string. For example, slice ~last:(-2) s will return the string s, but without the last two characters.

This function never raises any exceptions. If the indexes are out of bounds they are automatically clipped.

Example: String.slice ~first:1 ~last:(-3) " foo bar baz" = "foo bar "
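The index handling described above (negative indexes counted from the end, out-of-bounds indexes clipped, last excluded) can be sketched as follows. This uses only the standard library; slice_sketch is a hypothetical name and the code is one possible reading of the description, not the BatString source.

```ocaml
(* Hypothetical slice with negative indexes and clipping. *)
let slice_sketch ?(first = 0) ?last s =
  let len = String.length s in
  (* an omitted [last] points just past the end of [s] *)
  let last = match last with Some l -> l | None -> len in
  (* negative indexes count from the end; the result is clipped to [0, len] *)
  let norm i = max 0 (min len (if i < 0 then len + i else i)) in
  let i = norm first and j = norm last in
  (* an empty or inverted range yields "" instead of raising *)
  if i >= j then "" else String.sub s i (j - i)
```

Because both bounds are clipped before String.sub is called, the function never raises, matching the statement above.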

val splice : string -> int -> int -> string -> string

String.splice s off len rep cuts out the section of s indicated by off and len and replaces it by rep

Negative indexes are interpreted as counting from the end of the string. If off+len is greater than length s, the end of the string is used, regardless of the value of len.

If len is zero or negative, rep is inserted at position off without replacing any of s.

Example: String.splice "foo bar baz" 3 5 "XXX" = "fooXXXbaz"

val explode : string -> char list

explode s returns the list of characters in the string s.

Example: String.explode "foo" = ['f'; 'o'; 'o']

val implode : char list -> string

implode cs returns a string resulting from concatenating the characters in the list cs.

Example: String.implode ['b'; 'a'; 'r'] = "bar"

Iterators
val to_seq : t -> char Stdlib.Seq.t

Iterate on the string, in increasing index order. Modifications of the string during iteration will be reflected in the iterator.

val to_seqi : t -> (int * char) Stdlib.Seq.t

Iterate on the string, in increasing order, yielding indices along with chars.

val of_seq : char Stdlib.Seq.t -> t

Create a string from the generator

Binary decoding of integers

The functions in this section binary decode integers from strings.

All following functions raise Invalid_argument if the characters needed at index i to decode the integer are not available.

Little-endian (resp. big-endian) encoding means that least (resp. most) significant bytes are stored first. Big-endian is also known as network byte order. Native-endian encoding is either little-endian or big-endian depending on Sys.big_endian.

32-bit and 64-bit integers are represented by the int32 and int64 types, which can be interpreted either as signed or unsigned numbers.

8-bit and 16-bit integers are represented by the int type, which has more bits than the binary encoding. These extra bits are sign-extended (or zero-extended) for functions which decode 8-bit or 16-bit integers and represent them with int values.
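To make the endianness and sign-extension rules concrete, here is a minimal sketch that decodes 8-bit and 16-bit integers by hand with Char.code. The helper names are hypothetical and deliberately differ from the get_* functions of this section; out-of-range indexes surface as Invalid_argument from the s.[i] accesses.

```ocaml
(* Hypothetical decoders written with Char.code only. *)

(* Zero-extended: the byte value 0..255. *)
let uint8 s i = Char.code s.[i]

(* Sign-extended: bytes 0x80..0xff map to -128..-1. *)
let int8 s i =
  let v = uint8 s i in
  if v >= 0x80 then v - 0x100 else v

(* Big-endian: the most significant byte is stored first. *)
let uint16_be s i = (uint8 s i lsl 8) lor uint8 s (i + 1)

(* Little-endian: the least significant byte is stored first. *)
let uint16_le s i = uint8 s i lor (uint8 s (i + 1) lsl 8)

(* Signed 16-bit: sign-extend the unsigned value. *)
let int16_be s i =
  let v = uint16_be s i in
  if v >= 0x8000 then v - 0x10000 else v
```

The same two bytes "\x01\x02" therefore decode to 0x0102 in big-endian order and to 0x0201 in little-endian order.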

val get_uint8 : string -> int -> int

get_uint8 b i is b's unsigned 8-bit integer starting at character index i.

val get_int8 : string -> int -> int

get_int8 b i is b's signed 8-bit integer starting at character index i.

val get_uint16_ne : string -> int -> int

get_uint16_ne b i is b's native-endian unsigned 16-bit integer starting at character index i.

val get_uint16_be : string -> int -> int

get_uint16_be b i is b's big-endian unsigned 16-bit integer starting at character index i.

val get_uint16_le : string -> int -> int

get_uint16_le b i is b's little-endian unsigned 16-bit integer starting at character index i.

val get_int16_ne : string -> int -> int

get_int16_ne b i is b's native-endian signed 16-bit integer starting at character index i.

val get_int16_be : string -> int -> int

get_int16_be b i is b's big-endian signed 16-bit integer starting at character index i.

val get_int16_le : string -> int -> int

get_int16_le b i is b's little-endian signed 16-bit integer starting at character index i.

val get_int32_ne : string -> int -> int32

get_int32_ne b i is b's native-endian 32-bit integer starting at character index i.

val get_int32_be : string -> int -> int32

get_int32_be b i is b's big-endian 32-bit integer starting at character index i.

val get_int32_le : string -> int -> int32

get_int32_le b i is b's little-endian 32-bit integer starting at character index i.

val get_int64_ne : string -> int -> int64

get_int64_ne b i is b's native-endian 64-bit integer starting at character index i.

val get_int64_be : string -> int -> int64

get_int64_be b i is b's big-endian 64-bit integer starting at character index i.

val get_int64_le : string -> int -> int64

get_int64_le b i is b's little-endian 64-bit integer starting at character index i.

Comparisons
val equal : t -> t -> bool

String equality

val ord : t -> t -> BatOrd.order

Ordering function for strings, see BatOrd

val compare : t -> t -> int

The comparison function for strings, with the same specification as Pervasives.compare. Along with the type t, this function compare allows the module String to be passed as argument to the functors Set.Make and Map.Make.

Example: String.compare "FOO" "bar" = -1 i.e. "FOO" < "bar"

val icompare : t -> t -> int

Compare two strings, case-insensitive.

Example: String.icompare "FOO" "bar" = 1 i.e. "foo" > "bar"

module IString: BatInterfaces.OrderedType  with type t = t

uses icompare as ordering function

val numeric_compare : t -> t -> int

Compare two strings, sorting "abc32def" before "abc210abc".

Algorithm: splits both strings into lists of (strings of digits) or (strings of non digits) (["abc"; "32"; "def"] and ["abc"; "210"; "abc"]). Then both lists are compared lexicographically by comparing elements numerically when both are numbers or lexicographically in other cases.

Example: String.numeric_compare "xx32" "xx210" < 0
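The algorithm described above can be sketched with the standard library alone. All names below are hypothetical. Two details are assumptions of this sketch: digit chunks are compared after stripping leading zeros (so "007" and "7" compare equal as numbers), and a list that is a strict prefix of the other sorts first.

```ocaml
let is_digit c = '0' <= c && c <= '9'

(* Split "abc32def" into ["abc"; "32"; "def"]. *)
let chunks s =
  let n = String.length s in
  let rec go start i acc =
    if i >= n then
      List.rev (if i > start then String.sub s start (i - start) :: acc else acc)
    else if i > start && is_digit s.[i] <> is_digit s.[start] then
      go i (i + 1) (String.sub s start (i - start) :: acc)
    else go start (i + 1) acc
  in
  go 0 0 []

(* Compare two non-empty digit strings numerically without overflow:
   strip leading zeros, then a longer string is a bigger number. *)
let compare_numbers a b =
  let strip x =
    let n = String.length x in
    let rec skip i = if i < n - 1 && x.[i] = '0' then skip (i + 1) else i in
    let i = skip 0 in
    String.sub x i (n - i)
  in
  let a = strip a and b = strip b in
  let c = compare (String.length a) (String.length b) in
  if c <> 0 then c else compare a b

let is_number chunk = chunk <> "" && is_digit chunk.[0]

(* Lexicographic comparison of the two chunk lists. *)
let rec compare_chunks xs ys =
  match xs, ys with
  | [], [] -> 0
  | [], _ -> -1
  | _, [] -> 1
  | x :: xs', y :: ys' ->
    let c =
      if is_number x && is_number y then compare_numbers x y
      else compare x y
    in
    if c <> 0 then c else compare_chunks xs' ys'

let numeric_compare_sketch a b = compare_chunks (chunks a) (chunks b)
```

By contrast, the plain comparison puts "xx32" after "xx210", because it compares '3' with '2' character by character.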

module NumString: BatInterfaces.OrderedType  with type t = t

uses numeric_compare as its ordering function

val edit_distance : t -> t -> int

Edit distance (also known as "Levenshtein distance"). See Wikipedia.
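For reference, the Levenshtein distance counts the minimum number of single-character insertions, deletions and substitutions needed to turn one string into the other. The sketch below is the textbook two-row dynamic programming formulation, written with the standard library only; levenshtein is a hypothetical name and this is not claimed to be the BatString implementation.

```ocaml
(* Textbook Levenshtein distance using two rows of the DP table. *)
let levenshtein a b =
  let la = String.length a and lb = String.length b in
  (* prev.(j): distance between the first i-1 chars of a and the first j of b *)
  let prev = Array.init (lb + 1) (fun j -> j) in
  let cur = Array.make (lb + 1) 0 in
  for i = 1 to la do
    cur.(0) <- i;                       (* delete the first i chars of a *)
    for j = 1 to lb do
      let cost = if a.[i - 1] = b.[j - 1] then 0 else 1 in
      cur.(j) <-
        min
          (min (prev.(j) + 1) (cur.(j - 1) + 1))   (* deletion, insertion *)
          (prev.(j - 1) + cost)                    (* substitution or match *)
    done;
    Array.blit cur 0 prev 0 (lb + 1)
  done;
  prev.(lb)
```

For example, turning "kitten" into "sitting" takes two substitutions and one insertion, giving a distance of 3.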

Boilerplate code
Printing
val print : 'a BatInnerIO.output -> string -> unit

Print a string.

Example: String.print stdout "foo\n"

val println : 'a BatInnerIO.output -> string -> unit

Print a string, end the line.

Example: String.println stdout "foo"

val print_quoted : 'a BatInnerIO.output -> string -> unit

Print a string, with quotes as added by the quote function.

String.print_quoted stdout "foo" prints "foo" (with the quotes).

String.print_quoted stdout "\"bar\"" prints "\"bar\"" (with the quotes).

String.print_quoted stdout "\n" prints "\n" (not the escaped character, but '\' then 'n').

module Exceptionless: sig .. end

Exceptionless counterparts for error-raising operations

module Cap: sig .. end

Capabilities for strings.