module Kramdown::Parser::Html::Parser

Contains the parsing methods. This module can be mixed into any parser to get HTML parsing functionality. The including class only needs to provide two instance variables: @stack, for storing the needed state, and @src (an instance of StringScanner), for the actual parsing.

Public Instance Methods

handle_html_start_tag(line = nil) { |el, closed, handle_body| ... }

Process the HTML start tag that has already been scanned/checked via @src.

Performs the common processing steps and then yields to the caller for further processing. The first block parameter is the created element. The second is true if the HTML element is already closed, i.e. contains no body. The third specifies whether the body (and the end tag) still need to be handled when closed is false.

    # File lib/kramdown/parser/html.rb
    def handle_html_start_tag(line = nil) # :yields: el, closed, handle_body
      name = @src[1]
      name.downcase! if HTML_ELEMENT[name.downcase]
      # A non-nil fourth capture group marks the tag as already closed.
      closed = !@src[4].nil?
      attrs = parse_html_attributes(@src[2], line, HTML_ELEMENT[name])

      el = Element.new(:html_element, name, attrs, category: :block)
      el.options[:location] = line if line
      @tree.children << el

      # Elements that cannot have a body are always treated as closed.
      if !closed && HTML_ELEMENTS_WITHOUT_BODY.include?(el.value)
        closed = true
      end
      if name == 'script' || name == 'style'
        # The body is consumed as raw text here, so the caller must not handle it.
        handle_raw_html_tag(name)
        yield(el, false, false)
      else
        yield(el, closed, true)
      end
    end
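
The meaning of the two yielded flags can be easier to see in isolation. The following is a minimal sketch, not the library's code: `VOID_ELEMENTS`, `RAW_ELEMENTS`, `start_tag_flags` and `next_step` are hypothetical names, and the element lists are shortened stand-ins for the library's own constants. It only mirrors the decision made at the end of the method above and shows how a caller's block might react to it.

```ruby
# Hypothetical, shortened stand-ins for the library's element lists.
VOID_ELEMENTS = %w[br hr img input].freeze
RAW_ELEMENTS  = %w[script style].freeze

# Returns the [closed, handle_body] pair that would be yielded for a start tag.
# self_closed is true when the tag was written as <name ... />.
def start_tag_flags(name, self_closed)
  closed = self_closed || VOID_ELEMENTS.include?(name)
  # For raw elements the body has already been consumed, so both flags are false.
  RAW_ELEMENTS.include?(name) ? [false, false] : [closed, true]
end

# What a caller's block might decide to do with the two flags.
def next_step(closed, handle_body)
  return :nothing_left_to_do unless handle_body
  closed ? :nothing_left_to_do : :parse_body_until_end_tag
end

p next_step(*start_tag_flags('div', false))
p next_step(*start_tag_flags('br', false))
p next_step(*start_tag_flags('script', false))
```

Under these assumptions, only an open, non-raw element such as `div` leaves a body for the caller to parse.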

handle_raw_html_tag(name)

Handle the raw HTML tag at the current position.

    # File lib/kramdown/parser/html.rb
    def handle_raw_html_tag(name)
      curpos = @src.pos
      # Scan up to, but not including, the matching end tag.
      if @src.scan_until(/(?=<\/#{name}\s*>)/mi)
        add_text(extract_string(curpos...@src.pos, @src), @tree.children.last, :raw)
        @src.scan(HTML_TAG_CLOSE_RE)
      else
        # No end tag found: the rest of the source becomes the raw body.
        add_text(@src.rest, @tree.children.last, :raw)
        @src.terminate
        warning("Found no end tag for '#{name}' - auto-closing it")
      end
    end
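
The lookahead-based scan used above can be tried with a plain StringScanner. This is a minimal sketch under stated assumptions: `read_raw_body` is a hypothetical helper, it returns the body instead of adding it to a tree, and it uses its own closing-tag pattern rather than the library's HTML_TAG_CLOSE_RE.

```ruby
require 'strscan'

# Hypothetical helper: reads everything up to the end tag of +name+ as raw text.
# Returns [body, found_end_tag].
def read_raw_body(src, name)
  tag = Regexp.escape(name)
  # The pattern is a pure lookahead, so the scan stops right before the end tag
  # and the returned string is exactly the raw body.
  if (body = src.scan_until(/(?=<\/#{tag}\s*>)/mi))
    src.scan(/<\/#{tag}\s*>/i) # consume the end tag itself
    [body, true]
  else
    # No end tag: take the remainder and move the scanner to the end.
    body = src.rest
    src.terminate
    [body, false]
  end
end

src = StringScanner.new("var a = '<b>';</SCRIPT>after")
body, found = read_raw_body(src, 'script')
p body, found, src.rest
```

Because the body is never tokenized, markup-like text inside it (such as `'<b>'` here) stays untouched.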

parse_html_attributes(str, line = nil, in_html_tag = true)

Parses the given string for HTML attributes and returns the resulting hash.

If the optional line parameter is supplied, it is used in warning messages.

If the optional in_html_tag parameter is set to false, attribute names are not converted to lowercase.

    # File lib/kramdown/parser/html.rb
    def parse_html_attributes(str, line = nil, in_html_tag = true)
      attrs = {}
      str.scan(HTML_ATTRIBUTE_RE).each do |attr, val, _sep, quoted_val|
        attr.downcase! if in_html_tag
        if attrs.key?(attr)
          warning("Duplicate HTML attribute '#{attr}' on line #{line || '?'} - overwriting previous one")
        end
        # Attributes without a value are stored as an empty string.
        attrs[attr] = val || quoted_val || ""
      end
      attrs
    end
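
To illustrate the resulting hash, here is a minimal standalone sketch. `SIMPLE_ATTRIBUTE_RE` and `parse_attrs` are hypothetical: the pattern is a simplified stand-in for HTML_ATTRIBUTE_RE (which may accept a wider range of names), and warnings are collected in an array instead of being passed to the parser's warning method.

```ruby
# Simplified, hypothetical stand-in for HTML_ATTRIBUTE_RE.
# Captures: name, unquoted value, quote character, quoted value.
SIMPLE_ATTRIBUTE_RE = /([\w:.-]+)(?:\s*=\s*(?:(\w+)|("|')(.*?)\3))?/m

# Returns [attrs, warnings] for the attribute part of a start tag.
def parse_attrs(str, lowercase = true)
  attrs = {}
  warnings = []
  str.scan(SIMPLE_ATTRIBUTE_RE) do |attr, val, _sep, quoted_val|
    attr = attr.downcase if lowercase
    warnings << "duplicate attribute '#{attr}'" if attrs.key?(attr)
    # Later duplicates overwrite earlier ones; valueless attributes become "".
    attrs[attr] = val || quoted_val || ""
  end
  [attrs, warnings]
end

attrs, warnings = parse_attrs(%q( ID="main" class='a b' hidden id=other))
p attrs
p warnings
```

With lowercasing enabled, `ID` and `id` count as the same attribute, so the second value replaces the first and one warning is recorded.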

parse_raw_html(el, &block)

Parse raw HTML from the current source position, storing the found elements in el. Parsing continues until one of the following criteria is fulfilled:

  • The end of the document is reached.

  • The matching end tag for the element el is found (only used if el is an HTML element).

When an HTML start tag is found, processing is delegated to handle_html_start_tag, which receives the block given to this method.

    # File lib/kramdown/parser/html.rb
    def parse_raw_html(el, &block)
      # Make el the current tree; the previous tree is restored at the end.
      @stack.push(@tree)
      @tree = el

      done = false
      while !@src.eos? && !done
        if (result = @src.scan_until(HTML_RAW_START))
          add_text(result, @tree, :text)
          line = @src.current_line_number
          if (result = @src.scan(HTML_COMMENT_RE))
            @tree.children << Element.new(:xml_comment, result, nil, category: :block, location: line)
          elsif (result = @src.scan(HTML_INSTRUCTION_RE))
            @tree.children << Element.new(:xml_pi, result, nil, category: :block, location: line)
          elsif @src.scan(HTML_TAG_RE)
            if method(:handle_html_start_tag).arity.abs >= 1
              handle_html_start_tag(line, &block)
            else
              handle_html_start_tag(&block) # DEPRECATED: method needs to accept line number in 2.0
            end
          elsif @src.scan(HTML_TAG_CLOSE_RE)
            # Only the end tag that matches the current element stops the loop.
            if @tree.value == (HTML_ELEMENT[@tree.value] ? @src[1].downcase : @src[1])
              done = true
            else
              add_text(@src.matched, @tree, :text)
              warning("Found invalidly used HTML closing tag for '#{@src[1]}' on " \
                      "line #{line} - ignoring it")
            end
          else
            # Nothing recognized: consume one character as text so scanning advances.
            add_text(@src.getch, @tree, :text)
          end
        else
          # No further markup: the rest of the source is plain text.
          add_text(@src.rest, @tree, :text)
          @src.terminate
          if @tree.type == :html_element
            warning("Found no end tag for '#{@tree.value}' on line " \
                    "#{@tree.options[:location]} - auto-closing it")
          end
          done = true
        end
      end

      @tree = @stack.pop
    end
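
The stack handling and the two stop conditions can be reproduced in a small standalone parser. The sketch below is an illustration only: `MiniHtmlParser`, `Node` and the three patterns are hypothetical and much simpler than the library's Element class and HTML_* constants. It also recurses directly instead of yielding to a block, and it skips processing instructions and raw `script`/`style` handling.

```ruby
require 'strscan'

# Hypothetical node type and simplified patterns (not the library's own).
Node = Struct.new(:type, :value, :children)

MINI_COMMENT_RE   = /<!--.*?-->/m
MINI_START_TAG_RE = /<([a-zA-Z][\w-]*)[^<>]*?(\/)?>/
MINI_CLOSE_TAG_RE = /<\/([a-zA-Z][\w-]*)\s*>/

class MiniHtmlParser
  attr_reader :warnings

  def initialize(source)
    @src = StringScanner.new(source)
    @stack = []
    @tree = nil
    @warnings = []
  end

  def parse
    root = Node.new(:root, nil, [])
    parse_into(root)
    root
  end

  private

  def add_text(text)
    @tree.children << Node.new(:text, text, []) unless text.empty?
  end

  def parse_into(el)
    @stack.push(@tree) # remember the enclosing tree
    @tree = el
    done = false
    until @src.eos? || done
      if (text = @src.scan_until(/(?=<)/))
        add_text(text)
        if (comment = @src.scan(MINI_COMMENT_RE))
          @tree.children << Node.new(:comment, comment, [])
        elsif @src.scan(MINI_START_TAG_RE)
          child = Node.new(:element, @src[1].downcase, [])
          self_closed = !@src[2].nil?
          @tree.children << child
          parse_into(child) unless self_closed
        elsif @src.scan(MINI_CLOSE_TAG_RE)
          if @tree.type == :element && @tree.value == @src[1].downcase
            done = true # matching end tag found
          else
            add_text(@src.matched)
            @warnings << "stray closing tag '#{@src[1]}'"
          end
        else
          add_text(@src.getch) # lone '<': keep it as text
        end
      else
        add_text(@src.rest) # end of document reached
        @src.terminate
        @warnings << "no end tag for '#{@tree.value}'" if @tree.type == :element
        done = true
      end
    end
    @tree = @stack.pop # restore the enclosing tree
  end
end

parser = MiniHtmlParser.new('<div>Hi <b>there</b><br/></div> tail </i>')
root = parser.parse
p root.children.map(&:type)
p parser.warnings
```

Each nested call pushes the current tree before descending and pops it on return, so after a child's end tag is found the loop of the enclosing element continues where it left off.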