Class CharSet

  • All Implemented Interfaces:
    Serializable

    public class CharSet
    extends Object
    implements Serializable

    A set of characters.

    Instances are immutable, but instances of subclasses may not be.

    #ThreadSafe#

    Since:
    1.0
    Version:
    $Id: CharSet.java 1056988 2011-01-09 17:58:53Z niallp $
    See Also:
    Serialized Form
    • Field Detail

      • EMPTY

        public static final CharSet EMPTY
        A CharSet defining no characters.
        Since:
        2.0
      • ASCII_ALPHA

        public static final CharSet ASCII_ALPHA
        A CharSet defining ASCII alphabetic characters "a-zA-Z".
        Since:
        2.0
      • ASCII_ALPHA_LOWER

        public static final CharSet ASCII_ALPHA_LOWER
        A CharSet defining ASCII alphabetic characters "a-z".
        Since:
        2.0
      • ASCII_ALPHA_UPPER

        public static final CharSet ASCII_ALPHA_UPPER
        A CharSet defining ASCII alphabetic characters "A-Z".
        Since:
        2.0
      • ASCII_NUMERIC

        public static final CharSet ASCII_NUMERIC
        A CharSet defining ASCII numeric characters "0-9".
        Since:
        2.0
      • COMMON

        protected static final Map COMMON
        A Map of the common cases used in the factory. Subclasses can add more common patterns if desired.
        Since:
        2.0
    • Constructor Detail

      • CharSet

        protected CharSet(String setStr)

        Constructs a new CharSet using the set syntax.

        Parameters:
        setStr - the String describing the set, may be null
        Since:
        2.0
      • CharSet

        protected CharSet(String[] set)

        Constructs a new CharSet using the set syntax. Each string is merged in with the set.

        Parameters:
        set - Strings to merge into the initial set
        Throws:
        NullPointerException - if set is null
    • Method Detail

      • getInstance

        public static CharSet getInstance(String setStr)

        Factory method to create a new CharSet using a special syntax.

        • null or empty string ("") - set containing no characters
        • Single character, such as "a" - set containing just that character
        • Character range, such as "a-e" - set containing every character from the first to the second, inclusive
        • Negated, such as "^a" or "^a-e" - set containing all characters except those defined
        • Combinations, such as "abe-g" - set containing all the characters from the individual sets

        The matching order is:

        1. Negated multi character range, such as "^a-e"
        2. Ordinary multi character range, such as "a-e"
        3. Negated single character, such as "^a"
        4. Ordinary single character, such as "a"

        Matching works left to right. Once a match is found the search starts again from the next character.
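
        The matching order above can be sketched as a small tokenizer. This is an illustration written for this page, not the library's source: the class MatchOrderSketch and the method tokens are hypothetical names, and the treatment of edge cases such as a trailing "^" or "-" (taken here as ordinary single characters) is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; MatchOrderSketch is a hypothetical name, not part of the library.
class MatchOrderSketch {

    /** Splits a set definition into tokens, trying the four patterns in the documented order. */
    static List<String> tokens(String setStr) {
        List<String> out = new ArrayList<>();
        if (setStr == null) {
            return out; // null describes a set with no characters
        }
        int pos = 0;
        int len = setStr.length();
        while (pos < len) {
            int remaining = len - pos;
            int width;
            if (remaining >= 4 && setStr.charAt(pos) == '^' && setStr.charAt(pos + 2) == '-') {
                width = 4; // 1. negated range, such as "^a-e"
            } else if (remaining >= 3 && setStr.charAt(pos + 1) == '-') {
                width = 3; // 2. ordinary range, such as "a-e"
            } else if (remaining >= 2 && setStr.charAt(pos) == '^') {
                width = 2; // 3. negated single character, such as "^a"
            } else {
                width = 1; // 4. ordinary single character, such as "a"
            }
            out.add(setStr.substring(pos, pos + width));
            pos += width; // the search starts again from the next unconsumed character
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("abe-g")); // prints [a, b, e-g]
        System.out.println(tokens("^a-ez")); // prints [^a-e, z]
    }
}
```

        Because the range patterns are tried before the single-character patterns, "abe-g" yields the two characters "a" and "b" followed by the range "e-g", rather than five separate characters.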

        If the same range is defined twice using the same syntax, only one range will be kept. Thus, "a-ca-c" creates only one range of "a-c".

        If the start and end of a range are in the wrong order, they are reversed. Thus "a-e" is the same as "e-a". As a result, "a-ee-a" would create only one range, as the "a-e" and "e-a" are the same.
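
        The two rules above (duplicates collapse, reversed bounds are swapped) can be modelled with a canonical key per range stored in a set. This is a minimal sketch under assumptions: RangeDedupSketch, normalize and collect are hypothetical names, and ranges are passed as two-character pairs purely for brevity.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Illustrative sketch only; not how the library stores its ranges internally.
class RangeDedupSketch {

    /** Returns a canonical "start-end" key with the bounds in ascending order. */
    static String normalize(char a, char b) {
        char start = (char) Math.min(a, b);
        char end = (char) Math.max(a, b);
        return start + "-" + end;
    }

    /** Collects ranges given as two-character pairs (for example "ae" for a..e), dropping duplicates. */
    static Set<String> collect(String... pairs) {
        Set<String> ranges = new LinkedHashSet<>();
        for (String pair : pairs) {
            // A set keeps a single copy of each canonical key.
            ranges.add(normalize(pair.charAt(0), pair.charAt(1)));
        }
        return ranges;
    }

    public static void main(String[] args) {
        System.out.println(collect("ac", "ac")); // prints [a-c]
        System.out.println(collect("ae", "ea")); // prints [a-e]
    }
}
```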

        The set of characters represented is the union of the specified ranges.
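
        A union of ranges means a character belongs to the set as soon as any one range accepts it. The sketch below models that rule with a hypothetical Range type (not the library's CharRange); one consequence, assuming negated ranges take part in the union like any other range, is that combining "^a" with "^b" accepts every character.

```java
// Illustrative sketch only; UnionSketch and Range are hypothetical names.
class UnionSketch {

    /** An inclusive character range, optionally negated. */
    static final class Range {
        final char start;
        final char end;
        final boolean negated;

        Range(char start, char end, boolean negated) {
            this.start = start;
            this.end = end;
            this.negated = negated;
        }

        boolean contains(char ch) {
            boolean inside = ch >= start && ch <= end;
            return inside != negated; // a negated range accepts everything outside its bounds
        }
    }

    /** A character is in the union if at least one range contains it. */
    static boolean contains(Range[] ranges, char ch) {
        for (Range range : ranges) {
            if (range.contains(ch)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Range[] plain = { new Range('a', 'c', false), new Range('x', 'x', false) };
        System.out.println(contains(plain, 'b')); // prints true
        System.out.println(contains(plain, 'm')); // prints false

        Range[] negations = { new Range('a', 'a', true), new Range('b', 'b', true) };
        System.out.println(contains(negations, 'a')); // prints true, because "^b" accepts 'a'
    }
}
```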

        All CharSet objects returned by this method will be immutable.

        Parameters:
        setStr - the String describing the set, may be null
        Returns:
        a CharSet instance
        Since:
        2.0
      • getInstance

        public static CharSet getInstance(String[] setStrs)

        Constructs a new CharSet using the set syntax. Each string is merged in with the set.

        Parameters:
        setStrs - Strings to merge into the initial set, may be null
        Returns:
        a CharSet instance
        Since:
        2.4
      • add

        protected void add(String str)

        Adds a set definition string to the CharSet.

        Parameters:
        str - set definition string
      • getCharRanges

        public CharRange[] getCharRanges()

        Gets the internal set as an array of CharRange objects.

        Returns:
        an array of immutable CharRange objects
        Since:
        2.0
      • contains

        public boolean contains(char ch)

        Checks whether the CharSet contains the specified character ch.

        Parameters:
        ch - the character to check for
        Returns:
        true if the set contains the character
      • equals

        public boolean equals(Object obj)

        Compares two CharSet objects, returning true if they represent exactly the same set of characters defined in the same way.

        The two sets abc and a-c are not equal according to this method.
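
        The difference between "same definition" and "same characters" can be illustrated with a small model. This is a sketch under assumptions, not the library's equals implementation: DefinitionEqualitySketch and its methods are hypothetical names, and each definition is modelled as a set of "start-end" keys.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch only; models definitions as sets of "start-end" keys.
class DefinitionEqualitySketch {

    static Set<String> definition(String... ranges) {
        return new HashSet<>(Arrays.asList(ranges));
    }

    /** True if ch lies inside at least one "start-end" range of the definition. */
    static boolean member(Set<String> def, char ch) {
        for (String range : def) {
            if (ch >= range.charAt(0) && ch <= range.charAt(2)) {
                return true;
            }
        }
        return false;
    }

    /** Compares two definitions by the characters they accept, ignoring how they were written. */
    static boolean sameMembers(Set<String> x, Set<String> y) {
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            if (member(x, (char) c) != member(y, (char) c)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> abc = definition("a-a", "b-b", "c-c"); // models "abc"
        Set<String> aToC = definition("a-c");              // models "a-c"
        System.out.println(abc.equals(aToC));       // prints false: defined differently
        System.out.println(sameMembers(abc, aToC)); // prints true: same characters
    }
}
```

        Code that needs to know whether two sets accept the same characters therefore cannot rely on equals alone; a membership comparison along the lines of sameMembers above is one option.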

        Overrides:
        equals in class Object
        Parameters:
        obj - the object to compare to
        Returns:
        true if equal
        Since:
        2.0
      • hashCode

        public int hashCode()

        Gets a hashCode compatible with the equals method.

        Overrides:
        hashCode in class Object
        Returns:
        a suitable hashCode
        Since:
        2.0
      • toString

        public String toString()

        Gets a string representation of the set.

        Overrides:
        toString in class Object
        Returns:
        string representation of the set