Package com.liferay.portal.kernel.util
Class HtmlParserUtil
Object
com.liferay.portal.kernel.util.HtmlParserUtil
Provides utility methods for rendering HTML text.
This class uses XSS recommendations from http://www.owasp.org/index.php/Cross_Site_Scripting#How_to_Protect_Yourself
when escaping HTML text.
- Author:
  Brian Wing Shun Chan, Clarence Shen, Harry Mark, Samuel Kong
Constructor Summary
- HtmlParserUtil()
Method Summary
- static String extractText(String html)
  Extracts the raw text from the HTML input, compressing its whitespace and removing all attributes, scripts, and styles.
- static String findAttributeValue(Predicate<Function<String, String>> findValuePredicate, Function<Function<String, String>, String> returnValueFunction, String html, String startTagName)
- static String render(String html)
  Renders the HTML content into text.
Constructor Details
HtmlParserUtil
public HtmlParserUtil()
Method Details
extractText
Extracts the raw text from the HTML input, compressing its whitespace and removing all attributes, scripts, and styles. For example, raw text returned by this method can be stored in a search index.
- Parameters:
  html - the HTML text
- Returns:
  the raw text from the HTML input, or null if the HTML input is null
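The contract described above (drop scripts and styles, drop tags with their attributes, compress whitespace, pass null through) can be illustrated with a rough standalone approximation. This is not Liferay's implementation: the class name ExtractTextSketch and the regular expressions are assumptions made for illustration, and a regex-based approach does not handle malformed markup or character entities.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in that only mimics the documented contract of extractText.
final class ExtractTextSketch {

    // Matches <script>...</script> and <style>...</style> blocks, contents included.
    private static final Pattern SCRIPT_OR_STYLE = Pattern.compile(
        "<(script|style)\\b[^>]*>.*?</\\1\\s*>",
        Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches any remaining tag together with its attributes.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String extractText(String html) {
        if (html == null) {
            return null; // mirrors the documented null contract
        }
        String text = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        text = TAG.matcher(text).replaceAll(" ");
        // Compress runs of whitespace into single spaces.
        return WHITESPACE.matcher(text).replaceAll(" ").trim();
    }

    public static void main(String[] args) {
        String html = "<div class=\"a\"><style>p{color:red}</style>"
            + "<p>Hello,   <b>world</b></p><script>alert(1)</script></div>";
        System.out.println(extractText(html)); // prints "Hello, world"
    }
}
```

With the real class on the classpath, the equivalent call would be HtmlParserUtil.extractText(html), and its result could then be handed to a search indexer.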
findAttributeValue
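No description is given for this method, so the following is only one plausible reading of its signature: each start tag named startTagName exposes its attributes as a name-to-value lookup (the Function<String, String>), findValuePredicate decides whether a tag is the one wanted, and returnValueFunction picks the value to return from that tag. The sketch below is a hypothetical stand-in built on regular expressions, not the library's code; the null handling, the "no match returns null" behavior, and the double-quoted-attributes-only parsing are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in that mirrors the documented signature only.
final class FindAttributeValueSketch {

    // Assumption: attributes are written as name="value" with double quotes.
    private static final Pattern ATTRIBUTE =
        Pattern.compile("([\\w:\\-]+)\\s*=\\s*\"([^\"]*)\"");

    static String findAttributeValue(
            Predicate<Function<String, String>> findValuePredicate,
            Function<Function<String, String>, String> returnValueFunction,
            String html, String startTagName) {

        if (html == null) {
            return null; // assumption: the source does not state the null behavior
        }
        Pattern startTag = Pattern.compile(
            "<" + Pattern.quote(startTagName) + "\\b([^>]*)>",
            Pattern.CASE_INSENSITIVE);
        Matcher tagMatcher = startTag.matcher(html);

        while (tagMatcher.find()) {
            // Collect this tag's attributes into a name -> value map.
            Map<String, String> attributes = new HashMap<>();
            Matcher attributeMatcher = ATTRIBUTE.matcher(tagMatcher.group(1));
            while (attributeMatcher.find()) {
                attributes.put(attributeMatcher.group(1), attributeMatcher.group(2));
            }
            Function<String, String> lookup = attributes::get;
            if (findValuePredicate.test(lookup)) {
                return returnValueFunction.apply(lookup);
            }
        }
        return null; // assumption: no matching tag yields null
    }

    public static void main(String[] args) {
        String html = "<meta name=\"author\" content=\"Jane\">"
            + "<meta property=\"og:title\" content=\"Release notes\">";
        String title = findAttributeValue(
            attr -> "og:title".equals(attr.apply("property")),
            attr -> attr.apply("content"),
            html, "meta");
        System.out.println(title); // prints "Release notes"
    }
}
```

Passing two functions rather than an attribute name keeps the match condition and the returned attribute independent: the caller can match on one attribute (here property) and return another (here content).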
render
Renders the HTML content into text. This provides a human-readable version of the segment content, modeled on the way Mozilla Thunderbird® and other email clients automatically convert HTML content to text in the alternative MIME encoding of emails.
Using the default settings, the output complies with the Text/Plain; Format=Flowed (DelSp=No) protocol described in RFC 3676.
- Parameters:
  html - the HTML text
- Returns:
  the rendered HTML text, or null if the HTML text is null
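To make the Format=Flowed reference concrete, here is a minimal sketch of the two RFC 3676 generation rules that matter most for output like this: a soft line break is marked by a trailing space, and lines that begin with a space, ">", or "From " are space-stuffed with one extra leading space. It is not the renderer behind render; the class name FormatFlowedSketch, the flow method, and the word-based wrapping strategy are assumptions for illustration. The RFC recommends lines of at most 78 characters, which is what the example passes as maxWidth.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of format=flowed line generation (DelSp=No).
final class FormatFlowedSketch {

    // Space-stuffing: protect lines that could be misread as quotes or mbox headers.
    private static String stuff(String line) {
        if (line.startsWith(" ") || line.startsWith(">") || line.startsWith("From ")) {
            return " " + line;
        }
        return line;
    }

    // Wraps one paragraph; every line except the last ends with a space (soft break).
    static List<String> flow(String paragraph, int maxWidth) {
        List<String> lines = new ArrayList<>();
        StringBuilder line = new StringBuilder();
        for (String word : paragraph.trim().split("\\s+")) {
            // +1 reserves room for the trailing space that follows each word.
            if (line.length() > 0 && line.length() + word.length() + 1 > maxWidth) {
                lines.add(stuff(line.toString())); // still ends with ' ': soft break
                line.setLength(0);
            }
            line.append(word).append(' ');
        }
        // The last line ends the paragraph, so it must not end with a space.
        lines.add(stuff(line.toString().trim()));
        return lines;
    }

    public static void main(String[] args) {
        for (String l : flow("From the rendered HTML, each paragraph is wrapped "
                + "so that a flowed-aware client can rejoin the lines.", 78)) {
            System.out.println("[" + l + "]");
        }
    }
}
```

Because DelSp=No, a client that rejoins the lines keeps the trailing space as the separator between words, which is why each soft-broken line here ends with exactly one space. Space-stuffing is applied after wrapping in this sketch, so a stuffed line can be one character longer than maxWidth.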