Class XMLUtils
- java.lang.Object
-
- org.apache.uima.internal.util.XMLUtils
-
public abstract class XMLUtils extends java.lang.Object
Some utilities for working with XML.
-
-
Constructor Summary
Constructor    Description
XMLUtils()
-
Method Summary
Modifier and Type, Method, Description
static int checkForNonXmlCharacters(char[] ch, int start, int length, boolean xml11)
    Check the input character array for non-XML characters.
static int checkForNonXmlCharacters(java.lang.String s)
    Check the input string for non-XML 1.0 characters.
static int checkForNonXmlCharacters(java.lang.String s, boolean xml11)
    Check the input string for non-XML characters.
static javax.xml.parsers.DocumentBuilderFactory createDocumentBuilderFactory()
static javax.xml.parsers.SAXParserFactory createSAXParserFactory()
static javax.xml.transform.sax.SAXTransformerFactory createSaxTransformerFactory()
static javax.xml.transform.TransformerFactory createTransformerFactory()
static org.xml.sax.XMLReader createXMLReader()
static org.w3c.dom.Element getChildByTagName(org.w3c.dom.Element aElem, java.lang.String aName)
    Gets the first child of the given Element with the given tag name.
static org.w3c.dom.Element getFirstChildElement(org.w3c.dom.Element aElem)
    Gets the first child of the given Element.
static java.lang.String getText(org.w3c.dom.Element aElem)
    Gets the text of this Element.
static java.lang.String getText(org.w3c.dom.Element aElem, boolean aExpandEnvVarRefs)
    Gets the text of this Element.
static void normalize(java.lang.String aStr, java.lang.StringBuffer aResultBuf)
    Normalizes the given string for output to XML.
static void normalize(java.lang.String aStr, java.lang.StringBuffer aResultBuf, boolean aNewlinesToSpaces)
    Normalizes the given string for output to XML.
static java.lang.Object readPrimitiveValue(org.w3c.dom.Element aElem)
    Reads a primitive value from its standard DOM representation.
static void writeNormalizedString(java.lang.String aStr, java.io.Writer aWriter, boolean aNewlinesToSpaces)
    Normalizes the given string for output to XML, and writes the normalized string to the given Writer.
static void writePrimitiveValue(java.lang.Object aObj, java.io.Writer aWriter)
    Writes a standard XML representation of the specified Object, in the form: <className>string value</className>
-
-
-
Method Detail
-
normalize
public static void normalize(java.lang.String aStr, java.lang.StringBuffer aResultBuf)
Normalizes the given string for output to XML. This converts all special characters, e.g. <, >, &, to their XML representations, e.g. &lt;, &gt;, &amp;. The normalized string is appended to the specified StringBuffer.
- Parameters:
aStr - input string
aResultBuf - the StringBuffer to which the normalized string will be appended
-
normalize
public static void normalize(java.lang.String aStr, java.lang.StringBuffer aResultBuf, boolean aNewlinesToSpaces)
Normalizes the given string for output to XML. This converts all special characters, e.g. <, >, &, to their XML representations, e.g. &lt;, &gt;, &amp;. Also may convert newlines to spaces, depending on the aNewlinesToSpaces parameter. The normalized string is appended to the specified StringBuffer.
- Parameters:
aStr - input string
aNewlinesToSpaces - iff true, newlines (\r and \n) will be converted to spaces
aResultBuf - the StringBuffer to which the normalized string will be appended
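The escaping described above can be illustrated with a minimal sketch. This is not the UIMA source: the class name NormalizeSketch is hypothetical, only the three special characters named in the description are handled, and the real method may escape further characters (quotes, for example).

```java
// Sketch only: follows the documented behavior, not the UIMA implementation.
final class NormalizeSketch {

    /** Appends an XML-escaped copy of aStr to aResultBuf. */
    static void normalize(String aStr, StringBuffer aResultBuf, boolean aNewlinesToSpaces) {
        for (int i = 0; i < aStr.length(); i++) {
            char c = aStr.charAt(i);
            switch (c) {
                case '<':
                    aResultBuf.append("&lt;");
                    break;
                case '>':
                    aResultBuf.append("&gt;");
                    break;
                case '&':
                    aResultBuf.append("&amp;");
                    break;
                case '\r':
                case '\n':
                    // Each newline character becomes one space when requested.
                    aResultBuf.append(aNewlinesToSpaces ? ' ' : c);
                    break;
                default:
                    aResultBuf.append(c);
            }
        }
    }

    public static void main(String[] args) {
        StringBuffer buf = new StringBuffer();
        normalize("a < b && c\nd", buf, true);
        System.out.println(buf); // a &lt; b &amp;&amp; c d
    }
}
```

Because the result is appended rather than returned, a caller can build a larger document in one buffer without intermediate strings.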
-
writeNormalizedString
public static void writeNormalizedString(java.lang.String aStr, java.io.Writer aWriter, boolean aNewlinesToSpaces) throws java.io.IOException
Normalizes the given string for output to XML, and writes the normalized string to the given Writer. Normalization converts all special characters, e.g. <, >, &, to their XML representations, e.g. &lt;, &gt;, &amp;. Also may convert newlines to spaces, depending on the aNewlinesToSpaces parameter.
- Parameters:
aStr - input string
aWriter - a Writer to which the normalized string will be written
aNewlinesToSpaces - iff true, newlines (\r and \n) will be converted to spaces
- Throws:
java.io.IOException - if an I/O failure occurs when writing to aWriter
-
writePrimitiveValue
public static void writePrimitiveValue(java.lang.Object aObj, java.io.Writer aWriter) throws java.io.IOException
Writes a standard XML representation of the specified Object, in the form:
<className>string value</className>
where className is the object's Java class name without the package and made lowercase, e.g. "string", "integer", "boolean", and string value is the result of Object.toString().
This is intended to be used for Java Strings and wrappers for primitive value classes (e.g. Integer, Boolean).
- Parameters:
aObj - the object to write
aWriter - a Writer to which the XML will be written
- Throws:
java.io.IOException - if an I/O failure occurs when writing to aWriter
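A rough sketch of the documented output form follows. The class WritePrimitiveSketch and its toXml helper are hypothetical and exist only for illustration. The sketch does not escape the value text; whether the real method does is not stated here.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Locale;

// Sketch only: reproduces the documented <className>value</className> form.
final class WritePrimitiveSketch {

    static void writePrimitiveValue(Object aObj, Writer aWriter) throws IOException {
        // Simple class name, lowercased: Integer -> "integer", String -> "string".
        String tag = aObj.getClass().getSimpleName().toLowerCase(Locale.ROOT);
        aWriter.write("<" + tag + ">" + aObj.toString() + "</" + tag + ">");
    }

    /** Hypothetical convenience wrapper that returns the XML as a String. */
    static String toXml(Object aObj) {
        StringWriter w = new StringWriter();
        try {
            writePrimitiveValue(aObj, w);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return w.toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml(Integer.valueOf(42))); // <integer>42</integer>
        System.out.println(toXml(Boolean.TRUE));        // <boolean>true</boolean>
    }
}
```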
-
getChildByTagName
public static org.w3c.dom.Element getChildByTagName(org.w3c.dom.Element aElem, java.lang.String aName)
Gets the first child of the given Element with the given tag name.
- Parameters:
aElem - the parent element
aName - tag name of the child to retrieve
- Returns:
the first child of aElem with tag name aName, or null if there is no such child.
-
getFirstChildElement
public static org.w3c.dom.Element getFirstChildElement(org.w3c.dom.Element aElem)
Gets the first child of the given Element.
- Parameters:
aElem - the parent element
- Returns:
the first child of aElem, or null if it has no children.
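The two child lookups (getChildByTagName and getFirstChildElement) can be approximated with plain DOM calls. The sketch below rests on two assumptions: only direct children are examined, not deeper descendants, and non-element nodes such as whitespace text are skipped. ChildLookupSketch and its parse helper are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Sketch only: direct-child lookups using the standard DOM API.
final class ChildLookupSketch {

    /** First direct child element whose tag name equals aName, or null. */
    static Element getChildByTagName(Element aElem, String aName) {
        for (Node n = aElem.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element && aName.equals(((Element) n).getTagName())) {
                return (Element) n;
            }
        }
        return null;
    }

    /** First direct child that is an element, or null if there is none. */
    static Element getFirstChildElement(Element aElem) {
        for (Node n = aElem.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                return (Element) n;
            }
        }
        return null;
    }

    /** Hypothetical helper: parses a string and returns the root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = parse("<a> <b><c/></b><c id='x'/></a>");
        System.out.println(getFirstChildElement(root).getTagName());       // b
        System.out.println(getChildByTagName(root, "c").getAttribute("id")); // x
    }
}
```

This differs from Element.getElementsByTagName, which searches all descendants: in the example, the c nested inside b is not returned.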
-
readPrimitiveValue
public static java.lang.Object readPrimitiveValue(org.w3c.dom.Element aElem)
Reads a primitive value from its standard DOM representation. (This is the representation produced by writePrimitiveValue(Object, Writer).)
This is intended to be used for Java Strings and wrappers for primitive value classes (e.g. Integer, Boolean).
- Parameters:
aElem - the element representing the value
- Returns:
the value that was read, or null if a primitive value could not be constructed from the element
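As an illustration of the reverse mapping, the sketch below turns an element back into a value based on its tag name. It assumes only the three tag names given as examples for writePrimitiveValue ("string", "integer", "boolean"); the real method may accept more. ReadPrimitiveSketch and parse are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Sketch only: maps a tag name back to a Java value, returning null on failure.
final class ReadPrimitiveSketch {

    static Object readPrimitiveValue(Element aElem) {
        String text = aElem.getTextContent();
        try {
            switch (aElem.getTagName()) {
                case "string":
                    return text;
                case "integer":
                    return Integer.valueOf(text.trim());
                case "boolean":
                    return Boolean.valueOf(text.trim());
                default:
                    return null; // unknown tag: no value can be constructed
            }
        } catch (NumberFormatException e) {
            return null; // text is not a valid number
        }
    }

    /** Hypothetical helper: parses a string and returns the root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readPrimitiveValue(parse("<integer>42</integer>"))); // 42
        System.out.println(readPrimitiveValue(parse("<widget>1</widget>")));    // null
    }
}
```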
-
getText
public static java.lang.String getText(org.w3c.dom.Element aElem)
Gets the text of this Element. Leading and trailing whitespace is removed.
- Parameters:
aElem - the element
- Returns:
the text of aElem
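A simple way to picture this method is to concatenate the text nodes of the element and trim the result. The sketch below assumes that only text held directly by the element is collected, which this page does not state. GetTextSketch and parse are hypothetical names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

// Sketch only: joins the element's own text nodes and trims the result.
final class GetTextSketch {

    static String getText(Element aElem) {
        StringBuilder sb = new StringBuilder();
        for (Node n = aElem.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Text) { // includes CDATA sections
                sb.append(((Text) n).getData());
            }
        }
        return sb.toString().trim();
    }

    /** Hypothetical helper: parses a string and returns the root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(getText(parse("<p>  hello <b>x</b>world \n</p>"))); // hello world
    }
}
```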
-
getText
public static java.lang.String getText(org.w3c.dom.Element aElem, boolean aExpandEnvVarRefs)
Gets the text of this Element. Leading and trailing whitespace is removed. Environment variable references of the form <envVarRef>PARAM_NAME</envVarRef> may be expanded.
- Parameters:
aElem - the element
aExpandEnvVarRefs - whether to expand environment variable references. Defaults to false.
- Returns:
the text of aElem
-
checkForNonXmlCharacters
public static final int checkForNonXmlCharacters(java.lang.String s)
Check the input string for non-XML 1.0 characters. If non-XML characters are found, return the position of the first offending character. Else, return -1.
From the XML 1.0 spec:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] // any Unicode character, excluding the surrogate blocks, FFFE, and FFFF.
And from the UTF-16 spec:
Characters with values between 0x10000 and 0x10FFFF are represented by a 16-bit integer with a value between 0xD800 and 0xDBFF (within the so-called high-half zone or high surrogate area) followed by a 16-bit integer with a value between 0xDC00 and 0xDFFF (within the so-called low-half zone or low surrogate area).
- Parameters:
s - Input string
- Returns:
The position of the first invalid XML character encountered, or -1 if no invalid XML characters are found.
-
checkForNonXmlCharacters
public static final int checkForNonXmlCharacters(java.lang.String s, boolean xml11)
Check the input string for non-XML characters. If non-XML characters are found, return the position of the first offending character. Else, return -1.
The definition of an XML character is different for XML 1.0 and 1.1. This method will check either version, depending on the value of the xml11 argument.
From the XML 1.0 spec:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] // any Unicode character, excluding the surrogate blocks, FFFE, and FFFF.
From the XML 1.1 spec:
Char ::= [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
And from the UTF-16 spec:
Characters with values between 0x10000 and 0x10FFFF are represented by a 16-bit integer with a value between 0xD800 and 0xDBFF (within the so-called high-half zone or high surrogate area) followed by a 16-bit integer with a value between 0xDC00 and 0xDFFF (within the so-called low-half zone or low surrogate area).
- Parameters:
s - Input string
xml11 - true to check for invalid XML 1.1 characters, false to check for invalid XML 1.0 characters. The default is false.
- Returns:
The position of the first invalid XML character encountered, or -1 if no invalid XML characters are found.
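The two Char productions and the surrogate rule quoted above translate fairly directly into code. The sketch below is one possible reading, not the UIMA implementation: it assumes the returned position is a UTF-16 char index into the string, and it treats any surrogate that is not part of a well-formed high/low pair as invalid. XmlCharCheckSketch and isValidBmpChar are hypothetical names.

```java
// Sketch only: applies the XML 1.0 / 1.1 Char productions to UTF-16 input.
final class XmlCharCheckSketch {

    /** Returns the index of the first invalid char, or -1 if all are valid. */
    static int checkForNonXmlCharacters(String s, boolean xml11) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate is valid only when a low surrogate follows;
                // the pair encodes a code point in [#x10000-#x10FFFF].
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i += 2;
                    continue;
                }
                return i;
            }
            if (!isValidBmpChar(c, xml11)) {
                return i;
            }
            i++;
        }
        return -1;
    }

    /** Checks a single non-paired UTF-16 unit against the Char production. */
    static boolean isValidBmpChar(char c, boolean xml11) {
        if (c >= 0xD800 && c <= 0xDFFF) {
            return false; // unpaired surrogate
        }
        if (c > 0xFFFD) {
            return false; // FFFE and FFFF are excluded
        }
        if (xml11) {
            return c >= 0x1; // XML 1.1: [#x1-#xD7FF] | [#xE000-#xFFFD]
        }
        // XML 1.0: tab, line feed, carriage return, or anything from #x20 up.
        return c == 0x9 || c == 0xA || c == 0xD || c >= 0x20;
    }

    public static void main(String[] args) {
        String withControlChar = "ab" + (char) 0x1 + "c";
        System.out.println(checkForNonXmlCharacters(withControlChar, false)); // 2
        System.out.println(checkForNonXmlCharacters(withControlChar, true));  // -1
    }
}
```

The example shows the practical difference between the two versions: the control character #x1 is rejected under XML 1.0 but accepted under XML 1.1.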
-
checkForNonXmlCharacters
public static final int checkForNonXmlCharacters(char[] ch, int start, int length, boolean xml11)
Check the input character array for non-XML characters. If non-XML characters are found, return the position of the first offending character. Else, return -1.
- Parameters:
ch - Input character array
start - offset of first char to check
length - number of chars to check
xml11 - true to check for invalid XML 1.1 characters, false to check for invalid XML 1.0 characters. The default is false.
- Returns:
The position of the first invalid XML character encountered, or -1 if no invalid XML characters are found.
- See Also:
checkForNonXmlCharacters(String, boolean)
-
createSAXParserFactory
public static javax.xml.parsers.SAXParserFactory createSAXParserFactory()
-
createXMLReader
public static org.xml.sax.XMLReader createXMLReader() throws org.xml.sax.SAXException
- Throws:
org.xml.sax.SAXException
-
createSaxTransformerFactory
public static javax.xml.transform.sax.SAXTransformerFactory createSaxTransformerFactory()
-
createTransformerFactory
public static javax.xml.transform.TransformerFactory createTransformerFactory()
-
createDocumentBuilderFactory
public static javax.xml.parsers.DocumentBuilderFactory createDocumentBuilderFactory()
-
-