Package org.fife.io

Class UnicodeReader

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Readable

    public class UnicodeReader
    extends Reader
    A reader capable of identifying Unicode streams by their BOMs. This class will recognize the following encodings:
    • UTF-8
    • UTF-16LE
    • UTF-16BE
    • UTF-32LE
    • UTF-32BE
    If the stream is not found to be any of the above, then a default encoding is used for reading. The user can specify this default encoding, or a system default will be used.
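
    The BOM check described above can be illustrated with a small standalone sketch. BomSniffer and its detect method are hypothetical names, not part of org.fife.io, and the code is only an assumption about how such a check might look, not the class's actual implementation. It maps the leading bytes of a stream to one of the five encodings listed above and falls back to a caller-supplied or system default otherwise.

```java
import java.nio.charset.Charset;

/** Hypothetical helper (not part of org.fife.io): maps leading bytes to an encoding name. */
final class BomSniffer {

    private BomSniffer() {}

    /**
     * Returns the encoding named by a BOM at the start of head, or the
     * fallback (the system default if fallback is null) when no BOM matches.
     */
    static String detect(byte[] head, String fallback) {
        // The 4-byte UTF-32 marks are tested before the 2-byte UTF-16 marks,
        // because FF FE 00 00 (UTF-32LE) begins with FF FE (UTF-16LE).
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(head, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(head, 0xFF, 0xFE)) return "UTF-16LE";
        return fallback != null ? fallback : Charset.defaultCharset().name();
    }

    /** True if data begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] utf8 = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(detect(utf8, null)); // prints UTF-8
    }
}
```

    The ordering of the checks matters in this sketch: testing UTF-16LE first would misclassify every UTF-32LE stream.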

    For optimum performance, it is recommended that you wrap all instances of UnicodeReader with a java.io.BufferedReader.
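
    A minimal sketch of the recommended wrapping follows. To keep it runnable without the library on the classpath, a java.io.StringReader stands in for the UnicodeReader; in real use, the argument to readLines would be something like new UnicodeReader(file). BufferedReadSketch and readLines are hypothetical names introduced for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example class: reads lines from any Reader through a BufferedReader. */
final class BufferedReadSketch {

    private BufferedReadSketch() {}

    /** Wraps source in a BufferedReader, reads every line, and closes both readers. */
    static List<String> readLines(Reader source) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // StringReader is a stand-in here; a UnicodeReader would be passed the same way.
        System.out.println(readLines(new StringReader("first\nsecond"))); // prints [first, second]
    }
}
```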

    This class is largely adapted from the workaround in the description of Java Bug 4508058.

    • Constructor Detail

      • UnicodeReader

        public UnicodeReader​(String file)
                      throws IOException,
                             FileNotFoundException,
                             SecurityException
        This utility constructor is here because you will usually use a UnicodeReader on files.

        Creates a reader using the encoding specified by the BOM in the file; if there is no recognized BOM, then a system default encoding is used.

        Parameters:
        file - The file from which you want to read.
        Throws:
        IOException - If an error occurs when checking for/reading the BOM.
        FileNotFoundException - If the file does not exist, is a directory, or cannot be opened for reading.
        SecurityException - If a security manager exists and its checkRead method denies read access to the file.
      • UnicodeReader

        public UnicodeReader​(File file)
                      throws IOException,
                             FileNotFoundException,
                             SecurityException
        This utility constructor is here because you will usually use a UnicodeReader on files.

        Creates a reader using the encoding specified by the BOM in the file; if there is no recognized BOM, then a system default encoding is used.

        Parameters:
        file - The file from which you want to read.
        Throws:
        IOException - If an error occurs when checking for/reading the BOM.
        FileNotFoundException - If the file does not exist, is a directory, or cannot be opened for reading.
        SecurityException - If a security manager exists and its checkRead method denies read access to the file.
      • UnicodeReader

        public UnicodeReader​(File file,
                             String defaultEncoding)
                      throws IOException,
                             FileNotFoundException,
                             SecurityException
        This utility constructor is here because you will usually use a UnicodeReader on files.

        Creates a reader using the encoding specified by the BOM in the file; if there is no recognized BOM, then a specified default encoding is used.

        Parameters:
        file - The file from which you want to read.
        defaultEncoding - The encoding to use if no BOM is found. If this value is null, a system default is used.
        Throws:
        IOException - If an error occurs when checking for/reading the BOM.
        FileNotFoundException - If the file does not exist, is a directory, or cannot be opened for reading.
        SecurityException - If a security manager exists and its checkRead method denies read access to the file.
      • UnicodeReader

        public UnicodeReader​(InputStream in)
                      throws IOException
        Creates a reader using the encoding specified by the BOM in the stream; if there is no recognized BOM, then a system default encoding is used.
        Parameters:
        in - The input stream from which to read.
        Throws:
        IOException - If an error occurs when checking for/reading the BOM.
      • UnicodeReader

        public UnicodeReader​(InputStream in,
                             String defaultEncoding)
                      throws IOException
        Creates a reader using the encoding specified by the BOM in the stream; if there is no recognized BOM, then defaultEncoding is used.
        Parameters:
        in - The input stream from which to read.
        defaultEncoding - The encoding to use if no recognized BOM is found. If this value is null, a system default is used.
        Throws:
        IOException - If an error occurs when checking for/reading the BOM.
    • Method Detail

      • getEncoding

        public String getEncoding()
        Returns the encoding being used to read this input stream (i.e., the encoding of the file). If a BOM was recognized, then the specific Unicode type is returned; otherwise, either the default encoding passed into the constructor or the system default is returned.
        Returns:
        The encoding of the stream.
      • init

        protected void init​(InputStream in,
                            String defaultEncoding)
                     throws IOException
        Reads ahead four bytes and checks for a BOM. Any extra bytes are pushed back onto the stream; only the BOM bytes are skipped.
        Parameters:
        in - The input stream from which to read.
        defaultEncoding - The encoding to use if no BOM is recognized. If this value is null, then a system default is used.
        Throws:
        IOException - If an error occurs when trying to read a BOM.
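
        The read-ahead-and-unread behavior can be sketched with java.io.PushbackInputStream. This is an illustration under assumptions, not the actual body of init: BomSkippingSketch, open, and readAll are hypothetical names, and the sketch assumes the runtime supports the UTF-32BE and UTF-32LE charset names.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.Charset;

/** Hypothetical sketch: skip a BOM, push back everything else, then decode. */
final class BomSkippingSketch {

    private static final int BOM_SIZE = 4; // longest BOM handled (UTF-32)

    private BomSkippingSketch() {}

    static Reader open(InputStream in, String defaultEncoding) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, BOM_SIZE);
        byte[] bom = new byte[BOM_SIZE];

        // Read up to four bytes; a single read() call may return fewer.
        int n = 0;
        while (n < BOM_SIZE) {
            int r = pushback.read(bom, n, BOM_SIZE - n);
            if (r < 0) break;
            n += r;
        }

        String encoding;
        int bomLength;
        // UTF-32 marks are checked first because FF FE 00 00 starts with FF FE.
        if (matches(bom, n, 0x00, 0x00, 0xFE, 0xFF)) { encoding = "UTF-32BE"; bomLength = 4; }
        else if (matches(bom, n, 0xFF, 0xFE, 0x00, 0x00)) { encoding = "UTF-32LE"; bomLength = 4; }
        else if (matches(bom, n, 0xEF, 0xBB, 0xBF)) { encoding = "UTF-8"; bomLength = 3; }
        else if (matches(bom, n, 0xFE, 0xFF)) { encoding = "UTF-16BE"; bomLength = 2; }
        else if (matches(bom, n, 0xFF, 0xFE)) { encoding = "UTF-16LE"; bomLength = 2; }
        else {
            // No recognized BOM: use the caller's default, or the system default if null.
            encoding = defaultEncoding != null ? defaultEncoding : Charset.defaultCharset().name();
            bomLength = 0;
        }

        // Push back the bytes that were read ahead but are not part of the BOM.
        if (n > bomLength) {
            pushback.unread(bom, bomLength, n - bomLength);
        }
        return new InputStreamReader(pushback, encoding);
    }

    /** True if at least sig.length bytes were read and they equal sig. */
    private static boolean matches(byte[] buf, int n, int... sig) {
        if (n < sig.length) return false;
        for (int i = 0; i < sig.length; i++) {
            if ((buf[i] & 0xFF) != sig[i]) return false;
        }
        return true;
    }

    /** Drains a reader into a String. */
    static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[256];
        int n;
        while ((n = r.read(buf, 0, buf.length)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        try (Reader r = open(new ByteArrayInputStream(data), null)) {
            System.out.println(readAll(r)); // prints hi
        }
    }
}
```

        The pushback buffer is sized to the read-ahead length, so even a stream with no BOM can have all four bytes returned to it.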
      • read

        public int read​(char[] cbuf,
                        int off,
                        int len)
                 throws IOException
        Reads characters into a portion of an array. This method will block until some input is available, an I/O error occurs, or the end of the stream is reached.
        Specified by:
        read in class Reader
        Parameters:
        cbuf - The buffer into which to read.
        off - The offset at which to start storing characters.
        len - The maximum number of characters to read.
        Returns:
        The number of characters read, or -1 if the end of the stream has been reached.
        Throws:
        IOException - If an I/O error occurs.
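
        Because a single call may store fewer than len characters, callers that need a region filled typically loop. The sketch below shows that idiom against the general java.io.Reader contract; ReadLoopSketch and readFully are hypothetical names, and a StringReader stands in for a UnicodeReader so the example runs without the library.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

/** Hypothetical example class: fills part of a char array from any Reader. */
final class ReadLoopSketch {

    private ReadLoopSketch() {}

    /**
     * Stores up to len characters at cbuf[off] onward, calling read repeatedly
     * until the region is full or the stream ends. Returns the number of
     * characters stored (0 at end of stream, unlike read, which returns -1).
     */
    static int readFully(Reader in, char[] cbuf, int off, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(cbuf, off + total, len - total);
            if (n < 0) {
                break; // end of stream reached
            }
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        char[] buf = new char[8];
        int n = readFully(new StringReader("hello"), buf, 2, 5);
        System.out.println(new String(buf, 2, n)); // prints hello
    }
}
```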