Class HTMLTokenMaker
- java.lang.Object
-
- org.fife.ui.rsyntaxtextarea.TokenMakerBase
-
- org.fife.ui.rsyntaxtextarea.AbstractJFlexTokenMaker
-
- org.fife.ui.rsyntaxtextarea.modes.AbstractMarkupTokenMaker
-
- org.fife.ui.rsyntaxtextarea.modes.HTMLTokenMaker
-
- All Implemented Interfaces:
TokenMaker
public class HTMLTokenMaker extends AbstractMarkupTokenMaker
Scanner for HTML 5 files. This implementation was created using JFlex 1.4.1; however, the generated file was modified for performance. Memory allocation needs to be almost completely removed to be competitive with the handwritten lexers (subclasses of AbstractTokenMaker), so this class has been modified so that Strings are never allocated (via yytext()), and the scanner never has to refill its buffer (needlessly copying chars around). This is possible because RText always scans exactly one line of tokens at a time, and hands the scanner this line as an array of characters (a Segment, really). Since tokens contain pointers to char arrays instead of Strings holding their contents, there is no need to allocate new memory for Strings.
The actual algorithm generated for scanning has, of course, not been modified.
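The allocation-free idea described above can be illustrated with a minimal sketch. This is not the RSyntaxTextArea implementation: SpanToken, SpanTokenDemo, and the whitespace-splitting rule are hypothetical names and behavior chosen only for illustration. The sketch shows tokens that record an offset and length into the line's shared char array (taken from a javax.swing.text.Segment) rather than holding their own String.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.Segment;

/** Hypothetical token that points into a shared char array instead of owning a String. */
final class SpanToken {
    final char[] text;  // shared backing array, never copied
    final int offset;   // index of the token's first char in text
    final int length;   // number of chars in the token

    SpanToken(char[] text, int offset, int length) {
        this.text = text;
        this.offset = offset;
        this.length = length;
    }

    /** Compares the token against a keyword without building a String. */
    boolean matches(String keyword) {
        if (keyword.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (text[offset + i] != keyword.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    /** Allocates a String; call this only when a String is really needed. */
    String lexeme() {
        return new String(text, offset, length);
    }
}

final class SpanTokenDemo {
    /** Splits one line on spaces; every token shares the line's array. */
    static List<SpanToken> split(Segment line) {
        List<SpanToken> tokens = new ArrayList<>();
        int end = line.offset + line.count;
        int start = line.offset;
        for (int i = line.offset; i <= end; i++) {
            if (i == end || line.array[i] == ' ') {
                if (i > start) {
                    tokens.add(new SpanToken(line.array, start, i - start));
                }
                start = i + 1;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        char[] chars = "<div class=\"a\">".toCharArray();
        Segment line = new Segment(chars, 0, chars.length);
        for (SpanToken t : split(line)) {
            System.out.println(t.lexeme());
        }
    }
}
```

In this sketch the only per-token cost is the small SpanToken object itself; the characters are never copied unless lexeme() is called.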
If you wish to regenerate this file yourself, keep in mind the following:
- The generated HTMLTokenMaker.java file will contain two definitions of both zzRefill and yyreset. You should hand-delete the second of each definition (the ones generated by the lexer), as these generated methods modify the input buffer, which we'll never have to do.
- You should also change the declaration/definition of zzBuffer to NOT be initialized. This is a needless memory allocation for us since we will be pointing the array somewhere else anyway.
- You should NOT call yylex() on the generated scanner directly; rather, you should use getTokenList as you would with any other TokenMaker instance.
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
static int COMMENT
static int CSS
static int CSS_C_STYLE_COMMENT
static int CSS_CHAR_LITERAL
static int CSS_PROPERTY
static int CSS_STRING
static int CSS_VALUE
static int DTD
static int INATTR_DOUBLE
static int INATTR_DOUBLE_SCRIPT
static int INATTR_DOUBLE_STYLE
static int INATTR_SINGLE
static int INATTR_SINGLE_SCRIPT - Lexical states.
static int INATTR_SINGLE_STYLE
static int INTAG
static int INTAG_CHECK_TAG_NAME
static int INTAG_SCRIPT
static int INTAG_STYLE
static int INTERNAL_ATTR_DOUBLE - Type specific to XMLTokenMaker denoting a line ending with an unclosed double-quote attribute.
static int INTERNAL_ATTR_DOUBLE_QUOTE_SCRIPT - Token type specifying we're in a double-quoted attribute in a script tag.
static int INTERNAL_ATTR_DOUBLE_QUOTE_STYLE - Token type specifying we're in a double-quoted attribute in a style tag.
static int INTERNAL_ATTR_SINGLE - Type specific to XMLTokenMaker denoting a line ending with an unclosed single-quote attribute.
static int INTERNAL_ATTR_SINGLE_QUOTE_SCRIPT - Token type specifying we're in a single-quoted attribute in a script tag.
static int INTERNAL_ATTR_SINGLE_QUOTE_STYLE - Token type specifying we're in a single-quoted attribute in a style tag.
static int INTERNAL_CSS - Internal type denoting a line ending in CSS.
static int INTERNAL_CSS_CHAR - Internal type denoting a line ending in a CSS single-quote string.
static int INTERNAL_CSS_MLC - Internal type denoting a line ending in a CSS multi-line comment.
static int INTERNAL_CSS_PROPERTY - Internal type denoting a line ending in a CSS property.
static int INTERNAL_CSS_STRING - Internal type denoting a line ending in a CSS double-quote string.
static int INTERNAL_CSS_VALUE - Internal type denoting a line ending in a CSS property value.
static int INTERNAL_IN_JS - Token type specifying we're in JavaScript.
static int INTERNAL_IN_JS_CHAR_INVALID - Token type specifying we're in an invalid multi-line JS single-quoted string.
static int INTERNAL_IN_JS_CHAR_VALID - Token type specifying we're in a valid multi-line JS single-quoted string.
static int INTERNAL_IN_JS_MLC - Token type specifying we're in a JavaScript multi-line comment.
static int INTERNAL_IN_JS_STRING_INVALID - Token type specifying we're in an invalid multi-line JS string.
static int INTERNAL_IN_JS_STRING_VALID - Token type specifying we're in a valid multi-line JS string.
static int INTERNAL_INTAG - Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed HTML tag; thus a new line is beginning still inside of the tag.
static int INTERNAL_INTAG_SCRIPT - Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed <script> tag.
static int INTERNAL_INTAG_STYLE - Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed <style> tag.
static int JAVASCRIPT
static int JS_CHAR
static int JS_EOL_COMMENT
static int JS_MLC
static int JS_STRING
static int PI
static int YYEOF - This character denotes the end of file.
static int YYINITIAL
-
Fields inherited from class org.fife.ui.rsyntaxtextarea.AbstractJFlexTokenMaker
offsetShift, s, start
-
Fields inherited from class org.fife.ui.rsyntaxtextarea.TokenMakerBase
currentToken, firstToken, previousToken
-
-
Constructor Summary
Constructors (Constructor, Description):
HTMLTokenMaker() - Constructor.
HTMLTokenMaker(InputStream in) - Creates a new scanner.
HTMLTokenMaker(Reader in) - Creates a new scanner. There is also a java.io.InputStream version of this constructor.
-
Method Summary
Methods (Modifier and Type, Method, Description):
void addToken(char[] array, int start, int end, int tokenType, int startOffset) - Adds the token specified to the current linked list of tokens.
protected OccurrenceMarker createOccurrenceMarker() - Returns the occurrence marker to use for this token maker.
boolean getCompleteCloseTags() - Returns whether markup close tags should be completed.
boolean getCurlyBracesDenoteCodeBlocks(int languageIndex) - Returns whether this programming language uses curly braces ('{' and '}') to denote code blocks.
String[] getLineCommentStartAndEnd(int languageIndex) - Returns the text to place at the beginning and end of a line to "comment" it in this programming language.
boolean getMarkOccurrencesOfTokenType(int type) - Returns whether the given token type is Token.MARKUP_TAG_NAME.
boolean getShouldIndentNextLineAfter(Token token) - Overridden to handle newlines in JS and CSS differently than those in markup.
Token getTokenList(Segment text, int initialTokenType, int startOffset) - Returns the first token in the linked list of tokens generated from text.
static void setCompleteCloseTags(boolean complete) - Sets whether markup close tags should be completed.
void yybegin(int newState) - Enters a new lexical state.
char yycharat(int pos) - Returns the character at position pos from the matched text.
void yyclose() - Closes the input stream.
int yylength() - Returns the length of the matched text region.
Token yylex() - Resumes scanning until the next regular expression is matched, the end of input is encountered, or an I/O error occurs.
void yypushback(int number) - Pushes the specified number of characters back into the input stream.
void yyreset(Reader reader) - Resets the scanner to read from a new input stream.
int yystate() - Returns the current lexical state.
String yytext() - Returns the text matched by the current regular expression.
-
Methods inherited from class org.fife.ui.rsyntaxtextarea.modes.AbstractMarkupTokenMaker
isMarkupLanguage
-
Methods inherited from class org.fife.ui.rsyntaxtextarea.AbstractJFlexTokenMaker
yybegin
-
Methods inherited from class org.fife.ui.rsyntaxtextarea.TokenMakerBase
addNullToken, addToken, addToken, getClosestStandardTokenTypeForInternalType, getInsertBreakAction, getLanguageIndex, getLastTokenTypeOnLine, getOccurrenceMarker, isIdentifierChar, resetTokenList, setLanguageIndex
-
-
-
-
Field Detail
-
YYEOF
public static final int YYEOF
This character denotes the end of file.
- See Also:
- Constant Field Values
-
INATTR_SINGLE_SCRIPT
public static final int INATTR_SINGLE_SCRIPT
Lexical states.
- See Also:
- Constant Field Values
-
JS_CHAR
public static final int JS_CHAR
- See Also:
- Constant Field Values
-
CSS_STRING
public static final int CSS_STRING
- See Also:
- Constant Field Values
-
JS_MLC
public static final int JS_MLC
- See Also:
- Constant Field Values
-
CSS_CHAR_LITERAL
public static final int CSS_CHAR_LITERAL
- See Also:
- Constant Field Values
-
INTAG_SCRIPT
public static final int INTAG_SCRIPT
- See Also:
- Constant Field Values
-
CSS_PROPERTY
public static final int CSS_PROPERTY
- See Also:
- Constant Field Values
-
CSS_C_STYLE_COMMENT
public static final int CSS_C_STYLE_COMMENT
- See Also:
- Constant Field Values
-
CSS
public static final int CSS
- See Also:
- Constant Field Values
-
CSS_VALUE
public static final int CSS_VALUE
- See Also:
- Constant Field Values
-
COMMENT
public static final int COMMENT
- See Also:
- Constant Field Values
-
INATTR_DOUBLE_SCRIPT
public static final int INATTR_DOUBLE_SCRIPT
- See Also:
- Constant Field Values
-
PI
public static final int PI
- See Also:
- Constant Field Values
-
JAVASCRIPT
public static final int JAVASCRIPT
- See Also:
- Constant Field Values
-
INTAG
public static final int INTAG
- See Also:
- Constant Field Values
-
INTAG_CHECK_TAG_NAME
public static final int INTAG_CHECK_TAG_NAME
- See Also:
- Constant Field Values
-
INATTR_SINGLE_STYLE
public static final int INATTR_SINGLE_STYLE
- See Also:
- Constant Field Values
-
DTD
public static final int DTD
- See Also:
- Constant Field Values
-
JS_EOL_COMMENT
public static final int JS_EOL_COMMENT
- See Also:
- Constant Field Values
-
INATTR_DOUBLE_STYLE
public static final int INATTR_DOUBLE_STYLE
- See Also:
- Constant Field Values
-
INATTR_SINGLE
public static final int INATTR_SINGLE
- See Also:
- Constant Field Values
-
YYINITIAL
public static final int YYINITIAL
- See Also:
- Constant Field Values
-
INATTR_DOUBLE
public static final int INATTR_DOUBLE
- See Also:
- Constant Field Values
-
JS_STRING
public static final int JS_STRING
- See Also:
- Constant Field Values
-
INTAG_STYLE
public static final int INTAG_STYLE
- See Also:
- Constant Field Values
-
INTERNAL_ATTR_DOUBLE
public static final int INTERNAL_ATTR_DOUBLE
Type specific to XMLTokenMaker denoting a line ending with an unclosed double-quote attribute.
- See Also:
- Constant Field Values
-
INTERNAL_ATTR_SINGLE
public static final int INTERNAL_ATTR_SINGLE
Type specific to XMLTokenMaker denoting a line ending with an unclosed single-quote attribute.
- See Also:
- Constant Field Values
-
INTERNAL_INTAG
public static final int INTERNAL_INTAG
Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed HTML tag; thus a new line is beginning still inside of the tag.
- See Also:
- Constant Field Values
-
INTERNAL_INTAG_SCRIPT
public static final int INTERNAL_INTAG_SCRIPT
Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed <script> tag.
- See Also:
- Constant Field Values
-
INTERNAL_ATTR_DOUBLE_QUOTE_SCRIPT
public static final int INTERNAL_ATTR_DOUBLE_QUOTE_SCRIPT
Token type specifying we're in a double-quoted attribute in a script tag.
- See Also:
- Constant Field Values
-
INTERNAL_ATTR_SINGLE_QUOTE_SCRIPT
public static final int INTERNAL_ATTR_SINGLE_QUOTE_SCRIPT
Token type specifying we're in a single-quoted attribute in a script tag.
- See Also:
- Constant Field Values
-
INTERNAL_INTAG_STYLE
public static final int INTERNAL_INTAG_STYLE
Token type specific to HTMLTokenMaker; this signals that the user has ended a line with an unclosed <style> tag.
- See Also:
- Constant Field Values
-
INTERNAL_ATTR_DOUBLE_QUOTE_STYLE
public static final int INTERNAL_ATTR_DOUBLE_QUOTE_STYLE
Token type specifying we're in a double-quoted attribute in a style tag.
- See Also:
- Constant Field Values
-
INTERNAL_ATTR_SINGLE_QUOTE_STYLE
public static final int INTERNAL_ATTR_SINGLE_QUOTE_STYLE
Token type specifying we're in a single-quoted attribute in a style tag.
- See Also:
- Constant Field Values
-
INTERNAL_IN_JS
public static final int INTERNAL_IN_JS
Token type specifying we're in JavaScript.
- See Also:
- Constant Field Values
-
INTERNAL_IN_JS_MLC
public static final int INTERNAL_IN_JS_MLC
Token type specifying we're in a JavaScript multi-line comment.
- See Also:
- Constant Field Values
-
INTERNAL_IN_JS_STRING_INVALID
public static final int INTERNAL_IN_JS_STRING_INVALID
Token type specifying we're in an invalid multi-line JS string.
- See Also:
- Constant Field Values
-
INTERNAL_IN_JS_STRING_VALID
public static final int INTERNAL_IN_JS_STRING_VALID
Token type specifying we're in a valid multi-line JS string.
- See Also:
- Constant Field Values
-
INTERNAL_IN_JS_CHAR_INVALID
public static final int INTERNAL_IN_JS_CHAR_INVALID
Token type specifying we're in an invalid multi-line JS single-quoted string.
- See Also:
- Constant Field Values
-
INTERNAL_IN_JS_CHAR_VALID
public static final int INTERNAL_IN_JS_CHAR_VALID
Token type specifying we're in a valid multi-line JS single-quoted string.
- See Also:
- Constant Field Values
-
INTERNAL_CSS
public static final int INTERNAL_CSS
Internal type denoting a line ending in CSS.
- See Also:
- Constant Field Values
-
INTERNAL_CSS_PROPERTY
public static final int INTERNAL_CSS_PROPERTY
Internal type denoting a line ending in a CSS property.
- See Also:
- Constant Field Values
-
INTERNAL_CSS_VALUE
public static final int INTERNAL_CSS_VALUE
Internal type denoting a line ending in a CSS property value.
- See Also:
- Constant Field Values
-
INTERNAL_CSS_STRING
public static final int INTERNAL_CSS_STRING
Internal type denoting a line ending in a CSS double-quote string. The state to return to is embedded in the actual end token type.
- See Also:
- Constant Field Values
-
INTERNAL_CSS_CHAR
public static final int INTERNAL_CSS_CHAR
Internal type denoting a line ending in a CSS single-quote string. The state to return to is embedded in the actual end token type.
- See Also:
- Constant Field Values
-
INTERNAL_CSS_MLC
public static final int INTERNAL_CSS_MLC
Internal type denoting a line ending in a CSS multi-line comment. The state to return to is embedded in the actual end token type.
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
HTMLTokenMaker
public HTMLTokenMaker()
Constructor. This must be here because JFlex does not generate a no-parameter constructor.
-
HTMLTokenMaker
public HTMLTokenMaker(Reader in)
Creates a new scanner. There is also a java.io.InputStream version of this constructor.
- Parameters:
in - the java.io.Reader to read input from.
-
HTMLTokenMaker
public HTMLTokenMaker(InputStream in)
Creates a new scanner. There is also a java.io.Reader version of this constructor.
- Parameters:
in - the java.io.InputStream to read input from.
-
-
Method Detail
-
addToken
public void addToken(char[] array, int start, int end, int tokenType, int startOffset)
Adds the token specified to the current linked list of tokens.
- Specified by:
addToken in interface TokenMaker
- Overrides:
addToken in class TokenMakerBase
- Parameters:
array - The character array.
start - The starting offset in the array.
end - The ending offset in the array.
tokenType - The token's type.
startOffset - The offset in the document at which this token occurs.
-
createOccurrenceMarker
protected OccurrenceMarker createOccurrenceMarker()
Returns the occurrence marker to use for this token maker. Subclasses can override to use different implementations.
- Overrides:
createOccurrenceMarker in class TokenMakerBase
- Returns:
- The occurrence marker to use.
-
getCompleteCloseTags
public boolean getCompleteCloseTags()
Returns whether markup close tags should be completed. You might not want this to be the case, since some tags in standard HTML aren't usually closed.
- Specified by:
getCompleteCloseTags in class AbstractMarkupTokenMaker
- Returns:
- Whether closing markup tags are completed.
- See Also:
setCompleteCloseTags(boolean)
-
getCurlyBracesDenoteCodeBlocks
public boolean getCurlyBracesDenoteCodeBlocks(int languageIndex)
Description copied from class: TokenMakerBase
Returns whether this programming language uses curly braces ('{' and '}') to denote code blocks. The default implementation returns false; subclasses can override this method if necessary.
- Specified by:
getCurlyBracesDenoteCodeBlocks in interface TokenMaker
- Overrides:
getCurlyBracesDenoteCodeBlocks in class TokenMakerBase
- Parameters:
languageIndex - The language index at the offset in question. Since some TokenMakers effectively have nested languages (such as JavaScript in HTML), this parameter tells the TokenMaker what sub-language to look at.
- Returns:
- Whether curly braces denote code blocks.
-
getLineCommentStartAndEnd
public String[] getLineCommentStartAndEnd(int languageIndex)
Returns the text to place at the beginning and end of a line to "comment" it in this programming language.
- Specified by:
getLineCommentStartAndEnd in interface TokenMaker
- Overrides:
getLineCommentStartAndEnd in class AbstractMarkupTokenMaker
- Parameters:
languageIndex - The language index at the offset in question. Since some TokenMakers effectively have nested languages (such as JavaScript in HTML), this parameter tells the TokenMaker what sub-language to look at.
- Returns:
- The start and end strings to add to a line to "comment" it out. A null value for either means there is no string to add for that part. A value of null for the array means this language does not support commenting/uncommenting lines.
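The return contract above (a two-element array whose entries may be null, or a null array) can be handled by a caller roughly as follows. This is a minimal sketch: LineCommentSketch and commentLine are hypothetical helpers, not part of the library, and the delimiter values used in main are assumed for illustration rather than taken from this class.

```java
final class LineCommentSketch {
    /**
     * Wraps a line with a {start, end} pair shaped like the documented return value.
     * A null array leaves the line unchanged; a null entry adds nothing on that side.
     */
    static String commentLine(String line, String[] startAndEnd) {
        if (startAndEnd == null) {
            return line; // line commenting is not supported
        }
        StringBuilder sb = new StringBuilder();
        if (startAndEnd[0] != null) {
            sb.append(startAndEnd[0]);
        }
        sb.append(line);
        if (startAndEnd[1] != null) {
            sb.append(startAndEnd[1]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed delimiter pairs, for illustration only.
        String[] markupStyle = {"<!--", "-->"};
        String[] scriptStyle = {"//", null};
        System.out.println(commentLine("<p>text</p>", markupStyle));
        System.out.println(commentLine("x = 1;", scriptStyle));
        System.out.println(commentLine("x = 1;", null));
    }
}
```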
-
getMarkOccurrencesOfTokenType
public boolean getMarkOccurrencesOfTokenType(int type)
Returns whether the given token type is Token.MARKUP_TAG_NAME.
- Specified by:
getMarkOccurrencesOfTokenType in interface TokenMaker
- Overrides:
getMarkOccurrencesOfTokenType in class TokenMakerBase
- Parameters:
type - The token type.
- Returns:
- Whether tokens of this type should have "mark occurrences" enabled.
-
getShouldIndentNextLineAfter
public boolean getShouldIndentNextLineAfter(Token token)
Overridden to handle newlines in JS and CSS differently than those in markup.
- Specified by:
getShouldIndentNextLineAfter in interface TokenMaker
- Overrides:
getShouldIndentNextLineAfter in class TokenMakerBase
- Parameters:
token - The token the previous line ends with.
- Returns:
- Whether the next line should be indented.
-
getTokenList
public Token getTokenList(Segment text, int initialTokenType, int startOffset)
Returns the first token in the linked list of tokens generated from text. This method must be implemented by subclasses so they can correctly implement syntax highlighting.
- Parameters:
text - The text from which to get tokens.
initialTokenType - The token type we should start with.
startOffset - The offset into the document at which text starts.
- Returns:
- The first Token in a linked list representing the syntax highlighted text.
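Because scanning happens one line at a time, the initialTokenType parameter is what lets a construct that spans lines (an unclosed tag, comment, or string) continue on the next line. The sketch below illustrates that carry-over pattern only. It does not call this class: CarryOverSketch, its two states, and its comment-only scanning rule are hypothetical stand-ins for the INTERNAL_* types listed above.

```java
import javax.swing.text.Segment;

final class CarryOverSketch {
    static final int NORMAL = 0;     // hypothetical: line ended outside any comment
    static final int IN_COMMENT = 1; // hypothetical: line ended inside <!-- ... -->

    /** Scans one line starting in initialState and returns the state at its end. */
    static int scanLine(Segment text, int initialState) {
        int state = initialState;
        int end = text.offset + text.count;
        int i = text.offset;
        while (i < end) {
            if (state == NORMAL && startsWith(text.array, i, end, "<!--")) {
                state = IN_COMMENT;
                i += 4;
            } else if (state == IN_COMMENT && startsWith(text.array, i, end, "-->")) {
                state = NORMAL;
                i += 3;
            } else {
                i++;
            }
        }
        return state;
    }

    private static boolean startsWith(char[] a, int pos, int end, String s) {
        if (pos + s.length() > end) {
            return false;
        }
        for (int k = 0; k < s.length(); k++) {
            if (a[pos + k] != s.charAt(k)) {
                return false;
            }
        }
        return true;
    }

    /** Feeds each line's end state into the next line, as an editor would. */
    static int[] scanDocument(String[] lines) {
        int[] endStates = new int[lines.length];
        int state = NORMAL;
        for (int n = 0; n < lines.length; n++) {
            char[] chars = lines[n].toCharArray();
            state = scanLine(new Segment(chars, 0, chars.length), state);
            endStates[n] = state;
        }
        return endStates;
    }

    public static void main(String[] args) {
        String[] lines = {"<p>a</p>", "<!-- start", "still inside", "end --> <b>"};
        System.out.println(java.util.Arrays.toString(scanDocument(lines)));
    }
}
```

The design point is that each line can be rescanned independently as long as the end state of the previous line is remembered, which is why an editor only needs to re-tokenize lines whose starting state changed.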
-
setCompleteCloseTags
public static void setCompleteCloseTags(boolean complete)
Sets whether markup close tags should be completed. You might not want this to be the case, since some tags in standard HTML aren't usually closed.
- Parameters:
complete - Whether closing markup tags are completed.
- See Also:
getCompleteCloseTags()
-
yyreset
public final void yyreset(Reader reader)
Resets the scanner to read from a new input stream. Does not close the old reader. All internal variables are reset; the old input stream cannot be reused (the internal buffer is discarded and lost). The lexical state is set to YYINITIAL.
- Parameters:
reader - the new input stream
-
yyclose
public final void yyclose() throws IOException
Closes the input stream.
- Throws:
IOException
-
yystate
public final int yystate()
Returns the current lexical state.
-
yybegin
public final void yybegin(int newState)
Enters a new lexical state.
- Specified by:
yybegin in class AbstractJFlexTokenMaker
- Parameters:
newState - the new lexical state
-
yytext
public final String yytext()
Returns the text matched by the current regular expression.
-
yycharat
public final char yycharat(int pos)
Returns the character at position pos from the matched text. It is equivalent to yytext().charAt(pos), but faster.
- Parameters:
pos - the position of the character to fetch. A value from 0 to yylength()-1.
- Returns:
- the character at position pos
-
yylength
public final int yylength()
Returns the length of the matched text region.
-
yypushback
public void yypushback(int number)
Pushes the specified number of characters back into the input stream. They will be read again by the next call of the scanning method.
- Parameters:
number - the number of characters to be read again. This number must not be greater than yylength()!
-
yylex
public Token yylex() throws IOException
Resumes scanning until the next regular expression is matched, the end of input is encountered, or an I/O error occurs.
- Returns:
- the next token
- Throws:
IOException - if any I/O error occurs
-
-