module Loofah::TextBehavior

Overrides text in HTML::Document and HTML::DocumentFragment, and mixes in to_text.

Public Instance Methods

inner_text(options = {})
Alias for: text
text(options = {})

Returns a plain-text version of the markup contained by the document or fragment, with HTML entities encoded.

This method is significantly faster than to_text, but isn't clever about whitespace around block elements.

Loofah.document("<h1>Title</h1><div>Content</div>").text
# => "TitleContent"

By default, the returned text will have HTML entities escaped. If you want unescaped entities, and you understand that the result is unsafe to render in a browser, then you can pass an argument as shown:

frag = Loofah.fragment("&lt;script&gt;alert('EVIL');&lt;/script&gt;")
# ok for browser:
frag.text                                 # => "&lt;script&gt;alert('EVIL');&lt;/script&gt;"
# decidedly not ok for browser:
frag.text(:encode_special_chars => false) # => "<script>alert('EVIL');</script>"

# File lib/loofah/instance_methods.rb, line 95
def text(options = {})
  result = if serialize_root
    serialize_root.children.reject(&:comment?).map(&:inner_text).join("")
  else
    ""
  end
  if options[:encode_special_chars] == false
    result # possibly dangerous if rendered in a browser
  else
    encode_special_chars result
  end
end
Also aliased as: inner_text, to_str
to_str(options = {})
Alias for: text
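
The to_str alias matters because Ruby calls to_str whenever it needs to treat an object as a String implicitly, for example as the argument to String#+. The following is a minimal sketch of that mechanism using only core Ruby. PlainTextNode is a hypothetical stand-in class, not part of Loofah; it only mimics the alias arrangement listed above.

```ruby
# Hypothetical stand-in class, not part of Loofah: it only mimics the
# alias arrangement described above (text aliased as inner_text and to_str).
class PlainTextNode
  def initialize(content)
    @content = content
  end

  # Accepts an options hash to match the documented signature; this
  # sketch ignores it.
  def text(options = {})
    @content
  end
  alias_method :inner_text, :text
  alias_method :to_str, :text
end

node = PlainTextNode.new("TitleContent")

# String#+ calls to_str on a non-String argument, so the object can be
# used where Ruby expects a String.
joined = "Heading: " + node
puts joined
```

Implicit conversion calls to_str with no arguments, so with the text method shown above the options hash would be empty and the default entity encoding would presumably apply.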
to_text(options = {})

Returns a plain-text version of the markup contained by the document or fragment, with HTML entities encoded.

This method is slower than text, but is clever about whitespace around block elements and line break elements.

Loofah.document("<h1>Title</h1><div>Content<br>Next line</div>").to_text
# => "\nTitle\n\nContent\nNext line\n"

# File lib/loofah/instance_methods.rb, line 121
def to_text(options = {})
  Loofah.remove_extraneous_whitespace self.dup.scrub!(:newline_block_elements).text(options)
end
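
The source above works in two stages: newlines are placed around block elements, then surplus whitespace is removed. As a rough illustration of the second stage only, here is a sketch using core Ruby. collapse_blank_lines is a hypothetical helper and the input string is made up; neither is taken from Loofah, and the actual behavior of Loofah.remove_extraneous_whitespace may differ.

```ruby
# Hypothetical helper, not Loofah's implementation: replaces any run of
# three or more newlines (optionally separated by other whitespace)
# with exactly two newlines.
def collapse_blank_lines(string)
  string.gsub(/\n\s*\n\s*\n/, "\n\n")
end

# Made-up intermediate string, as if newlines had already been placed
# around the <h1> and <div> elements and in place of the <br>.
padded = "\nTitle\n\n\n\nContent\nNext line\n"
cleaned = collapse_blank_lines(padded)
p cleaned
```

Single newlines, such as the one between "Content" and "Next line", are left alone, so line breaks inside a block survive while the gaps between blocks shrink to one blank line.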