org.lobobrowser.html.parser
public class HtmlParser extends java.lang.Object
The HtmlParser class is an HTML DOM parser. This parser provides the functionality for the standard DOM parser implementation, DocumentBuilderImpl. This parser class may be used directly when a different DOM implementation is preferred.
Modifier and Type | Field and Description |
---|---|
static java.lang.String | MODIFYING_KEY: A node UserData key used to tell nodes that their content may be about to be modified. |
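The UserData mechanism that MODIFYING_KEY relies on is the standard DOM Level 3 `Node.setUserData` / `Node.getUserData` pair. The following is a minimal sketch of how a node could be flagged and checked, using only the JDK's DOM implementation. The key string `SKETCH_MODIFYING_KEY` and the helper names are hypothetical; the actual value of `HtmlParser.MODIFYING_KEY` and the moments at which the parser sets it are not stated on this page.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

final class UserDataFlagSketch {
    // Hypothetical key; stands in for HtmlParser.MODIFYING_KEY, whose value is not shown here.
    static final String SKETCH_MODIFYING_KEY = "sketch.modifying";

    /** Returns true only while the node carries Boolean.TRUE under the key. */
    static boolean isModifying(Node node) {
        return Boolean.TRUE.equals(node.getUserData(SKETCH_MODIFYING_KEY));
    }

    /** Creates an empty document with a single <div> root element. */
    static Element newElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element div = doc.createElement("div");
            doc.appendChild(div);
            return div;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element div = newElement();
        // Flag the node before changing its content...
        div.setUserData(SKETCH_MODIFYING_KEY, Boolean.TRUE, null);
        System.out.println("modifying: " + isModifying(div));
        div.appendChild(div.getOwnerDocument().createTextNode("hello"));
        // ...and clear the flag afterwards.
        div.setUserData(SKETCH_MODIFYING_KEY, Boolean.FALSE, null);
        System.out.println("modifying: " + isModifying(div));
    }
}
```

An element implementation could consult such a flag to skip change notifications while content is being appended, which is the use the field description suggests.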
Constructor and Description |
---|
HtmlParser(org.w3c.dom.Document document, org.xml.sax.ErrorHandler errorHandler, java.lang.String publicId, java.lang.String systemId): Deprecated. A UserAgentContext should be passed to the constructor. |
HtmlParser(UserAgentContext ucontext, org.w3c.dom.Document document): Constructs an HtmlParser. |
HtmlParser(UserAgentContext ucontext, org.w3c.dom.Document document, org.xml.sax.ErrorHandler errorHandler, java.lang.String publicId, java.lang.String systemId): Constructs an HtmlParser. |
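Two of the constructors accept an `org.xml.sax.ErrorHandler`. As a rough sketch, a handler that collects messages instead of printing them might look like the class below. The class name `CollectingErrorHandler` and its `messages()` accessor are hypothetical, and this page does not say which handler methods HtmlParser invokes or under what conditions.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class CollectingErrorHandler implements ErrorHandler {
    private final List<String> messages = new ArrayList<>();

    @Override
    public void warning(SAXParseException e) {
        record("warning", e);
    }

    @Override
    public void error(SAXParseException e) {
        record("error", e);
    }

    @Override
    public void fatalError(SAXParseException e) throws SAXException {
        record("fatal", e);
        throw e; // Rethrow so that the caller stops on fatal errors.
    }

    private void record(String level, SAXParseException e) {
        messages.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
    }

    /** Returns the messages recorded so far, in arrival order. */
    List<String> messages() {
        return messages;
    }

    public static void main(String[] args) {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        handler.warning(new SAXParseException("unclosed tag", null, null, 3, 7));
        System.out.println(handler.messages());
    }
}
```

An instance of such a class could be passed as the `errorHandler` argument, together with the document's public and system IDs.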
Modifier and Type | Method and Description |
---|---|
static boolean | isDecodeEntities(java.lang.String elementName) |
void | parse(java.io.InputStream in): Parses HTML from an input stream, assuming the character set is ISO-8859-1. |
void | parse(java.io.InputStream in, java.lang.String charset): Parses HTML from an input stream, using the given character set. |
void | parse(java.io.LineNumberReader reader) |
void | parse(java.io.LineNumberReader reader, org.w3c.dom.Node parent): This method may be used when the DOM should be built under a given node, such as when innerHTML is used in JavaScript. |
void | parse(java.io.Reader reader): Parses HTML given by a Reader. |
void | parse(java.io.Reader reader, org.w3c.dom.Node parent): This method may be used when the DOM should be built under a given node, such as when innerHTML is used in JavaScript. |
public static final java.lang.String MODIFYING_KEY
A node UserData key used to tell nodes that their content may be about to be modified. Elements could use this to temporarily suspend notifications. The value set will be either Boolean.TRUE or Boolean.FALSE.
public HtmlParser(org.w3c.dom.Document document, org.xml.sax.ErrorHandler errorHandler, java.lang.String publicId, java.lang.String systemId)
Deprecated. A UserAgentContext should be passed to the constructor.
Constructs an HtmlParser.
Parameters:
document - A W3C Document instance.
errorHandler - The error handler.
publicId - The public ID of the document.
systemId - The system ID of the document.
public HtmlParser(UserAgentContext ucontext, org.w3c.dom.Document document, org.xml.sax.ErrorHandler errorHandler, java.lang.String publicId, java.lang.String systemId)
Constructs an HtmlParser.
Parameters:
ucontext - The user agent context.
document - A W3C Document instance.
errorHandler - The error handler.
publicId - The public ID of the document.
systemId - The system ID of the document.
public HtmlParser(UserAgentContext ucontext, org.w3c.dom.Document document)
Constructs an HtmlParser.
Parameters:
ucontext - The user agent context.
document - A W3C Document instance.
public static boolean isDecodeEntities(java.lang.String elementName)
public void parse(java.io.InputStream in) throws java.io.IOException, org.xml.sax.SAXException, java.io.UnsupportedEncodingException
Parses HTML from an input stream, assuming the character set is ISO-8859-1.
Parameters:
in - The input stream.
Throws:
java.io.IOException - Thrown when there are errors reading the stream.
org.xml.sax.SAXException - Thrown when there are parse errors.
java.io.UnsupportedEncodingException
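Because this overload assumes ISO-8859-1, bytes written in another encoding will be misread unless the charset is given explicitly. The sketch below illustrates the effect with the JDK's `InputStreamReader` alone; it does not call HtmlParser, and the class and method names are hypothetical. It also shows where a `java.io.UnsupportedEncodingException` comes from when a charset name is unknown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;

final class CharsetDecodingSketch {
    /** Decodes the whole stream using the named charset. */
    static String readAll(InputStream in, String charset) throws IOException {
        // InputStreamReader throws UnsupportedEncodingException for unknown charset names.
        try (Reader reader = new InputStreamReader(in, charset)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) throws IOException {
        // "é" occupies two bytes in UTF-8, so an ISO-8859-1 decoder sees two characters.
        byte[] utf8 = "<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8);
        String asLatin1 = readAll(new ByteArrayInputStream(utf8), "ISO-8859-1");
        String asUtf8 = readAll(new ByteArrayInputStream(utf8), "UTF-8");
        System.out.println("ISO-8859-1 length: " + asLatin1.length());
        System.out.println("UTF-8 length: " + asUtf8.length());
    }
}
```

When the encoding of the input is known, passing it through the two-argument `parse(InputStream, String)` overload, or handing the parser an already decoded `Reader`, avoids this mismatch.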
public void parse(java.io.InputStream in, java.lang.String charset) throws java.io.IOException, org.xml.sax.SAXException, java.io.UnsupportedEncodingException
Parses HTML from an input stream, using the given character set.
Parameters:
in - The input stream.
charset - The character set.
Throws:
java.io.IOException - Thrown when there is an error reading from the stream.
org.xml.sax.SAXException - Thrown when there is a parse error.
java.io.UnsupportedEncodingException - Thrown if the character set is not supported.
public void parse(java.io.Reader reader) throws java.io.IOException, org.xml.sax.SAXException
Parses HTML given by a Reader. This method appends nodes to the document provided to the parser.
Parameters:
reader - An instance of Reader.
Throws:
java.io.IOException - Thrown if there are errors reading the input stream.
org.xml.sax.SAXException - Thrown if there are parse errors.
public void parse(java.io.LineNumberReader reader) throws java.io.IOException, org.xml.sax.SAXException
Throws:
java.io.IOException
org.xml.sax.SAXException
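No description accompanies the LineNumberReader overload. For orientation, a `java.io.LineNumberReader` wraps another `Reader` and counts line terminators as they are consumed. The sketch below shows that behavior with JDK classes only; the helper `lineOf` is hypothetical, and whether HtmlParser uses the line count for error reporting is an assumption, not something this page states.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class LineNumberReaderSketch {
    /** Returns the 1-based line on which the marker first appears, or -1 if it is absent. */
    static int lineOf(String html, String marker) {
        try (LineNumberReader reader = new LineNumberReader(new StringReader(html))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(marker)) {
                    // After readLine(), getLineNumber() equals the number of lines read so far.
                    return reader.getLineNumber();
                }
            }
            return -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String html = "<html>\n<body>\n<p>hi</p>\n</body>\n</html>";
        System.out.println("line of <p>: " + lineOf(html, "<p>"));
    }
}
```

A caller holding a plain `Reader` can obtain a suitable argument with `new LineNumberReader(reader)`.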
public void parse(java.io.Reader reader, org.w3c.dom.Node parent) throws java.io.IOException, org.xml.sax.SAXException
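This method may be used when the DOM should be built under a given node, such as when innerHTML is used in JavaScript.
Parameters:
reader - A document reader.
parent - The root node for the parsed DOM.
Throws:
java.io.IOException
org.xml.sax.SAXException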
public void parse(java.io.LineNumberReader reader, org.w3c.dom.Node parent) throws java.io.IOException, org.xml.sax.SAXException
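This method may be used when the DOM should be built under a given node, such as when innerHTML is used in JavaScript.
Parameters:
reader - A LineNumberReader for the document.
parent - The root node for the parsed DOM.
Throws:
java.io.IOException
org.xml.sax.SAXException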
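The "build under a given node" pattern behind the two-argument parse overloads can be sketched with JDK classes alone. In the example below the JDK's XML parser is a stand-in for HtmlParser, so it only accepts well-formed markup; the class and method names are hypothetical. Clearing the parent's existing children first mirrors how innerHTML assignment behaves in browsers, and whether HtmlParser or its caller performs that step is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class InnerContentSketch {
    static DocumentBuilder newBuilder() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    static Document newDocument() {
        return newBuilder().newDocument();
    }

    /** Replaces the children of parent with nodes built from well-formed markup. */
    static void setInnerContent(Element parent, String markup) {
        try {
            // Stand-in parse step: wrap the fragment in a temporary root element.
            // With the library, parse(new StringReader(markup), parent) would take this role.
            Document temp = newBuilder().parse(
                    new InputSource(new StringReader("<root>" + markup + "</root>")));
            while (parent.hasChildNodes()) {
                parent.removeChild(parent.getFirstChild());
            }
            Document owner = parent.getOwnerDocument();
            for (Node n = temp.getDocumentElement().getFirstChild(); n != null; n = n.getNextSibling()) {
                // importNode copies the node into the owner document; the original stays in temp.
                parent.appendChild(owner.importNode(n, true));
            }
        } catch (SAXException | IOException e) {
            throw new IllegalArgumentException("markup could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        Document doc = newDocument();
        Element div = doc.createElement("div");
        doc.appendChild(div);
        div.appendChild(doc.createTextNode("old"));
        setInnerContent(div, "<b>new</b><i>text</i>");
        System.out.println("children: " + div.getChildNodes().getLength());
    }
}
```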