class Ronn::Document

The Document class can be used to load and inspect a ronn document and to convert a ronn document into other formats, like roff or HTML.

Ronn files may optionally follow the naming convention: “<name>.<section>.ronn”. The <name> and <section> are used in generated documentation unless overridden by the information extracted from the document's name section.

Attributes

data[R]

The raw input data, read from path or stream and unmodified.

date[W]

The date the document was published; displayed centered in the document footer.

encoding[RW]

The encoding of the Ronn document.

index[RW]

The index used to resolve man and file references.

manual[RW]

The manual this document belongs to; displayed centered in the header.

name[W]

The man page's name: usually a single-word name of a program or filename; displayed along with the section in the left and right portions of the header as well as the bottom right section of the footer.

organization[RW]

The name of the group, organization, or individual responsible for this document; displayed in the left portion of the footer.

outdir[RW]

Output directory to write files to.

path[R]

Path to the Ronn document. This may be '-' or nil when the Ronn::Document object is created with a stream, in which case stdin will be read.

section[W]

The man page's section: a string whose first character is numeric; displayed in parentheses along with the name.

styles[R]

Array of style modules to apply to the document.

tagline[RW]

Single sentence description of the thing being described by this man page; displayed in the NAME section.

Public Class Methods

new(path = nil, attributes = {}, &block)

Create a Ronn::Document given a path or with the data returned by calling the block. The document is loaded and preprocessed before the initialize method returns. The attributes hash may contain values for any writable attributes defined on this class.

# File lib/ronn/document.rb, line 71
def initialize(path = nil, attributes = {}, &block)
  @path = path
  @basename = path.to_s =~ /^-?$/ ? nil : File.basename(path)
  @reader = block ||
            lambda do |f|
              if ['-', nil].include?(f)
                STDIN.read
              else
                File.read(f, encoding: @encoding)
              end
            end
  @data = @reader.call(path)
  @name, @section, @tagline = sniff

  @styles = %w[man]
  @manual, @organization, @date = nil
  @markdown, @input_html, @html = nil
  @index = Ronn::Index[path || '.']
  @index.add_manual(self) if path && name

  attributes.each { |attr_name, value| send("#{attr_name}=", value) }
end

Public Instance Methods

basename(type = nil)

Generate a file basename of the form “<name>.<section>.<type>” for the given file extension. Uses the name and section from the source file path but falls back on the name and section defined in the document.

# File lib/ronn/document.rb, line 98
def basename(type = nil)
  type = nil if ['', 'roff'].include?(type.to_s)
  [path_name || @name, path_section || @section, type]
    .compact.join('.')
end
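
The join logic above can be tried in isolation. The following is a minimal sketch, assuming a hypothetical standalone helper `build_basename` whose `name` and `section` arguments stand in for the values the document would supply; it is not part of Ronn::Document.

```ruby
# Hypothetical standalone helper mirroring the join logic of #basename.
# name and section stand in for the values the document would supply.
def build_basename(name, section, type = nil)
  # An empty type or 'roff' contributes no extension, as in the source above.
  type = nil if ['', 'roff'].include?(type.to_s)
  # nil parts are dropped before joining with dots.
  [name, section, type].compact.join('.')
end

build_basename('ronn', '1', 'html') # => "ronn.1.html"
build_basename('ronn', '1', 'roff') # => "ronn.1"
build_basename('ronn', nil, :html)  # => "ronn.html"
```

Because nil parts are compacted away, a document without a section still yields a usable file name.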
convert(format)

Convert the document to :roff, :html, or :html_fragment and return the result as a string.

# File lib/ronn/document.rb, line 240
def convert(format)
  send "to_#{format}"
end
date()

The date the man page was published. If not set explicitly, this is the file's modified time or, if no file is given, the current time. Displayed centered in the document footer.

# File lib/ronn/document.rb, line 184
def date
  return @date if @date

  return File.mtime(path) if File.exist?(path)

  Time.now
end
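
The fallback order described above (explicit date, then file modification time, then current time) can be sketched on its own. The helper name `resolve_date` and its arguments are hypothetical. The sketch also adds a nil guard before the filesystem check, on the assumption that a nil path (stream input) should fall through to the current time; that guard is not in the source above.

```ruby
require 'tempfile'

# Hypothetical helper: explicit_date and path stand in for @date and path.
def resolve_date(explicit_date, path)
  return explicit_date if explicit_date
  # Assumption of this sketch: skip the filesystem check when path is nil,
  # since File.exist? expects a path-like argument.
  return File.mtime(path) if path && File.exist?(path)

  Time.now
end

# With a real file, the modification time is used.
Tempfile.create('example') do |file|
  resolve_date(nil, file.path) == File.mtime(file.path) # => true
end

# With no date and no file, the current time is used.
resolve_date(nil, nil).class # => Time
```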
html()

A Nokogiri DocumentFragment for the manual content fragment.

# File lib/ronn/document.rb, line 234
def html
  @html ||= process_html!
end
markdown()

Preprocessed markdown input text.

# File lib/ronn/document.rb, line 229
def markdown
  @markdown ||= process_markdown!
end
name()

Returns the manual page name based first on the document's contents and then on the path name. Usually a single word name of a program or filename; displayed along with the section in the left and right portions of the header as well as the bottom right section of the footer.

# File lib/ronn/document.rb, line 140
def name
  @name || path_name
end
name?()

Truthful when the name was extracted from the name section of the document.

# File lib/ronn/document.rb, line 146
def name?
  !@name.nil?
end
path_for(type = nil)

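Construct a path for a file near the source file. Uses the #basename method to generate the basename part and appends it to the dirname of the source document. When outdir is set, the file is placed in that directory instead; when the document has no source path, only the basename is returned.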

# File lib/ronn/document.rb, line 107
def path_for(type = nil)
  if @outdir
    File.join(@outdir, basename(type))
  elsif @basename
    File.join(File.dirname(path), basename(type))
  else
    basename(type)
  end
end
path_name()

Returns the <name> part of the path, or nil when no path is available. This is used as the manual page name when the file contents do not include a name section.

# File lib/ronn/document.rb, line 120
def path_name
  return unless @basename

  parts = @basename.split('.')
  parts.pop if parts.length > 1 && parts.last =~ /^\w+$/
  parts.pop if parts.last =~ /^\d+$/
  parts.join('.')
end
path_section()

Returns the <section> part of the path, or nil when no path is available.

# File lib/ronn/document.rb, line 131
def path_section
  $1 if @basename.to_s =~ /\.(\d\w*)\./
end
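
To see how the “<name>.<section>.ronn” convention is taken apart, here is a minimal sketch with two hypothetical standalone functions, `name_from_basename` and `section_from_basename`. They take the file basename as an argument instead of reading @basename, and otherwise follow the same steps as #path_name and #path_section.

```ruby
# Hypothetical standalone version of #path_name.
def name_from_basename(basename)
  return unless basename

  parts = basename.split('.')
  # Drop the trailing extension (for example "ronn" or "md").
  parts.pop if parts.length > 1 && parts.last =~ /^\w+$/
  # Drop a purely numeric section if one remains at the end.
  parts.pop if parts.last =~ /^\d+$/
  parts.join('.')
end

# Hypothetical standalone version of #path_section.
def section_from_basename(basename)
  # The section is a digit-led word enclosed by dots, or nil when absent.
  basename.to_s[/\.(\d\w*)\./, 1]
end

name_from_basename('grep.1.ronn')    # => "grep"
section_from_basename('grep.1.ronn') # => "1"
name_from_basename('README.md')      # => "README"
section_from_basename('README.md')   # => nil
```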
reference_name()

The name used to reference this manual.

# File lib/ronn/document.rb, line 164
def reference_name
  name + (section && "(#{section})").to_s
end
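
The expression above relies on `nil.to_s` being an empty string, so a missing section adds nothing to the name. A minimal sketch, using a hypothetical free function `build_reference_name` in place of the instance method:

```ruby
# Hypothetical free-function version of #reference_name.
def build_reference_name(name, section)
  # When section is nil, the parenthesized part evaluates to nil.to_s == "".
  name + (section && "(#{section})").to_s
end

build_reference_name('ronn', '1') # => "ronn(1)"
build_reference_name('ronn', nil) # => "ronn"
```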
section()

Returns the manual page section based first on the document's contents and then on the path name. A string whose first character is numeric; displayed in parentheses along with the name.

# File lib/ronn/document.rb, line 153
def section
  @section || path_section
end
section?()

True when the section number was extracted from the name section of the document.

# File lib/ronn/document.rb, line 159
def section?
  !@section.nil?
end
section_heads()
Alias for: toc
sniff()

Sniff the document header and extract basic document metadata. Return a tuple of the form: [name, section, description], where missing information is represented by nil and any element may be missing.

# File lib/ronn/document.rb, line 210
def sniff
  html = Kramdown::Document.new(data[0, 512], auto_ids: false, smart_quotes: ['apos', 'apos', 'quot', 'quot'], typographic_symbols: { hellip: '...', ndash: '--', mdash: '--' }).to_html
  heading, html = html.split("</h1>\n", 2)
  return [nil, nil, nil] if html.nil?

  case heading
  when /([\w_.\[\]~+=@:-]+)\s*\((\d\w*)\)\s*-+\s*(.*)/
    # name(section) -- description
    [$1, $2, $3]
  when /([\w_.\[\]~+=@:-]+)\s+-+\s+(.*)/
    # name -- description
    [$1, nil, $2]
  else
    # description
    [nil, nil, heading.sub('<h1>', '')]
  end
end
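
The three heading shapes handled by the case statement can be exercised without rendering any Markdown. The sketch below assumes a hypothetical helper `parse_heading` that receives the already-rendered heading text (everything before the closing h1 tag) and applies the same two patterns; the constant names are also made up for illustration.

```ruby
# Patterns copied from the source above; constant names are hypothetical.
NAME_SECTION_HEADING = /([\w_.\[\]~+=@:-]+)\s*\((\d\w*)\)\s*-+\s*(.*)/
NAME_ONLY_HEADING    = /([\w_.\[\]~+=@:-]+)\s+-+\s+(.*)/

# Hypothetical helper: classify one rendered heading string.
def parse_heading(heading)
  if (m = NAME_SECTION_HEADING.match(heading))
    [m[1], m[2], m[3]]                  # name(section) -- description
  elsif (m = NAME_ONLY_HEADING.match(heading))
    [m[1], nil, m[2]]                   # name -- description
  else
    [nil, nil, heading.sub('<h1>', '')] # description only
  end
end

parse_heading('<h1>ronn(1) -- convert markdown files to manpages')
# => ["ronn", "1", "convert markdown files to manpages"]
parse_heading('<h1>ronn -- build manuals')
# => ["ronn", nil, "build manuals"]
parse_heading('<h1>Installation Guide')
# => [nil, nil, "Installation Guide"]
```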
styles=(styles)

Styles to insert in the generated HTML output. This is a simple Array of string module names or file paths.

# File lib/ronn/document.rb, line 203
def styles=(styles)
  @styles = (%w[man] + styles).uniq
end
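
As the source above shows, the setter always keeps the man style first and removes duplicates. A minimal sketch with a hypothetical free function `merge_styles`:

```ruby
# Hypothetical free-function version of the styles= merge.
def merge_styles(styles)
  # 'man' is always first; uniq keeps the first occurrence of each entry.
  (%w[man] + styles).uniq
end

merge_styles(%w[toc dark]) # => ["man", "toc", "dark"]
merge_styles(%w[dark man]) # => ["man", "dark"]
```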
title()

The document's title when no name section was defined. When a name section exists, this value is nil.

# File lib/ronn/document.rb, line 177
def title
  @tagline unless name?
end
title?()

Truthful when the document started with an h1 but did not follow the “<name>(<sect>) – <tagline>” convention. We assume this is some kind of custom title.

# File lib/ronn/document.rb, line 171
def title?
  !name? && tagline
end
to_h()
# File lib/ronn/document.rb, line 285
def to_h
  %w[name section tagline manual organization date styles toc]
    .each_with_object({}) { |name, hash| hash[name] = send(name) }
end
to_html()

Convert the document to HTML and return the result as a string. The returned string is a complete HTML document.

# File lib/ronn/document.rb, line 255
def to_html
  layout = ENV['RONN_LAYOUT']
  layout_path = nil
  if layout
    layout_path = File.expand_path(layout)
    unless File.exist?(layout_path)
      warn "warn: can't find #{layout}, using default layout."
      layout_path = nil
    end
  end

  template = Ronn::Template.new(self)
  template.context.push html: to_html_fragment(nil)
  template.render(layout_path || 'default')
end
to_html_fragment(wrap_class = 'mp')

Convert the document to HTML and return the result as a string. The HTML does not include <html>, <head>, or <style> tags.

# File lib/ronn/document.rb, line 274
def to_html_fragment(wrap_class = 'mp')
  frag_nodes = html.at('body').children
  out = frag_nodes.to_s.rstrip
  out = "<div class='#{wrap_class}'>#{out}\n</div>" unless wrap_class.nil?
  out
end
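
The effect of the wrap_class argument can be shown on a plain string. This sketch assumes a hypothetical helper `wrap_fragment` that takes the already-serialized fragment as a string instead of reading it from the parsed document.

```ruby
# Hypothetical helper mirroring the wrapping step of #to_html_fragment.
def wrap_fragment(fragment, wrap_class = 'mp')
  out = fragment.rstrip
  # Passing nil skips the wrapper div entirely.
  out = "<div class='#{wrap_class}'>#{out}\n</div>" unless wrap_class.nil?
  out
end

wrap_fragment("<p>hi</p>\n")      # => "<div class='mp'><p>hi</p>\n</div>"
wrap_fragment("<p>hi</p>\n", nil) # => "<p>hi</p>"
```

In the listings on this page, both to_html and to_roff pass nil, so they work with the unwrapped fragment.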
to_json(*_args)
# File lib/ronn/document.rb, line 295
def to_json(*_args)
  require 'json'
  to_h.merge('date' => date.iso8601).to_json
end
to_markdown()
# File lib/ronn/document.rb, line 281
def to_markdown
  markdown
end
to_roff()

Convert the document to roff and return the result as a string.

# File lib/ronn/document.rb, line 245
def to_roff
  RoffFilter.new(
    to_html_fragment(nil),
    name, section, tagline,
    manual, organization, date
  ).to_s
end
to_yaml()
# File lib/ronn/document.rb, line 290
def to_yaml
  require 'yaml'
  to_h.to_yaml
end
toc()

Retrieve a list of top-level section headings in the document and return as an array of [id, text] tuples, where id is the element's generated id and text is the inner text of the heading element.

# File lib/ronn/document.rb, line 195
def toc
  @toc ||=
    html.search('h2[@id]').map { |h2| [h2.attributes['id'].content.upcase, h2.inner_text] }
end
Also aliased as: section_heads

Protected Instance Methods

html_filter_angle_quotes()

Perform angle quote (<THESE>) post filtering.

# File lib/ronn/document.rb, line 381
def html_filter_angle_quotes
  # convert all angle quote vars nested in code blocks
  # back to the original text
  code_nodes = @html.search('code')
  code_nodes.search('.//text() | text()').each do |node|
    next unless node.to_html.include?('var&gt;')

    new =
      node.to_html
          .gsub('&lt;var&gt;', '&lt;')
          .gsub('&lt;/var&gt;', '>')
    node.swap(new)
  end
end
html_filter_definition_lists()

Convert special format unordered lists to definition lists.

# File lib/ronn/document.rb, line 397
def html_filter_definition_lists
  # process all unordered lists depth-first
  @html.search('ul').to_a.reverse_each do |ul|
    items = ul.search('li')
    next if items.any? { |item| item.inner_text.strip.split("\n", 2).first !~ /:$/ }

    dl = Nokogiri::XML::Node.new 'dl', html
    items.each do |item|
      # This processing is specific to how Markdown generates definition lists
      term, definition = item.inner_html.strip.split(":\n", 2)
      term = term.sub(/^<p>/, '')

      dt = Nokogiri::XML::Node.new 'dt', html
      dt.children = Nokogiri::HTML.fragment(term)
      dt.attributes['class'] = 'flush' if dt.inner_text.length <= 7

      dd = Nokogiri::XML::Node.new 'dd', html
      dd_contents = Nokogiri::HTML.fragment(definition)
      dd.children = dd_contents

      dl.add_child(dt)
      dl.add_child(dd)
    end
    ul.replace(dl)
  end
end
html_filter_heading_anchors()

Add URL anchors to all HTML heading elements.

# File lib/ronn/document.rb, line 444
def html_filter_heading_anchors
  h_nodes = @html.search('//*[self::h1 or self::h2 or self::h3 or self::h4 or self::h5 and not(@id)]')
  h_nodes.each do |heading|
    heading.set_attribute('id', heading.inner_text.gsub(/\W+/, '-'))
  end
end
html_filter_inject_name_section()
# File lib/ronn/document.rb, line 424
def html_filter_inject_name_section
  markup =
    if title?
      "<h1>#{title}</h1>"
    elsif name
      "<h2>NAME</h2>\n" \
        "<p class='man-name'>\n  <code>#{name}</code>" +
        (tagline ? " - <span class='man-whatis'>#{tagline}</span>\n" : "\n") +
        "</p>\n"
    end
  return unless markup

  if html.at('body').first_element_child
    html.at('body').first_element_child.before(Nokogiri::HTML.fragment(markup))
  else
    html.at('body').add_child(Nokogiri::HTML.fragment(markup))
  end
end
input_html()
# File lib/ronn/document.rb, line 312
def input_html
  @input_html ||= strip_heading(Kramdown::Document.new(markdown, auto_ids: false, smart_quotes: ['apos', 'apos', 'quot', 'quot'], typographic_symbols: { hellip: '...', ndash: '--', mdash: '--' }).to_html)
end
markdown_filter_angle_quotes(markdown)

Convert <WORD> to <var>WORD</var> but only if WORD isn't an HTML tag.

# File lib/ronn/document.rb, line 367
def markdown_filter_angle_quotes(markdown)
  markdown.gsub(/<([^:.\/]+?)>/) do |match|
    contents = $1
    tag, attrs = contents.split(' ', 2)
    if attrs =~ /\/=/ || html_element?(tag.sub(/^\//, '')) ||
       data.include?("</#{tag}>") || contents =~ /^!/
      match.to_s
    else
      "<var>#{contents}</var>"
    end
  end
end
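
The following sketch illustrates the substitution rule on a plain string. The html_element? check is not shown on this page, so the sketch substitutes a small hypothetical tag list (`KNOWN_TAGS`); the function name `filter_angle_quotes` is also made up, and the closing-tag check looks at the string being filtered where the source above consults the raw document data.

```ruby
# Assumption: a small stand-in for the html_element? check, which is not
# shown here. The real check may recognize a different set of tags.
KNOWN_TAGS = %w[a b code em i p pre span strong var].freeze

# Hypothetical standalone version of the angle-quote filter.
def filter_angle_quotes(markdown)
  markdown.gsub(/<([^:.\/]+?)>/) do |match|
    contents = Regexp.last_match(1)
    tag, attrs = contents.split(' ', 2)
    tag = tag.to_s
    # Leave the text alone when it looks like real markup: a known tag,
    # a tag that is closed elsewhere in the text, or a comment/declaration.
    if attrs.to_s.include?('/=') || KNOWN_TAGS.include?(tag) ||
       markdown.include?("</#{tag}>") || contents.start_with?('!')
      match
    else
      "<var>#{contents}</var>"
    end
  end
end

filter_angle_quotes('Pass <file> to <code>ronn</code>.')
# => "Pass <var>file</var> to <code>ronn</code>."
```

Because the pattern excludes `:`, `.`, and `/` inside the brackets, autolinks such as `<http://example.com>` and closing tags never reach the block at all.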
markdown_filter_heading_anchors(markdown)

Add [id]: #ANCHOR elements to the markdown source text for all sections. This lets us use the [SECTION-REF][] syntax.

# File lib/ronn/document.rb, line 354
def markdown_filter_heading_anchors(markdown)
  first = true
  markdown.split("\n").grep(/^[#]{2,5} +[\w '-]+[# ]*$/).each do |line|
    markdown << "\n\n" if first
    first = false
    title = line.gsub(/[^\w -]/, '').strip
    anchor = title.gsub(/\W+/, '-').gsub(/(^-+|-+$)/, '')
    markdown << "[#{title}]: ##{anchor} \"#{title}\"\n"
  end
  markdown
end
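
A non-mutating sketch of the same idea is given below. The function name `append_heading_anchors` is hypothetical; unlike the source above, it builds and returns a new string rather than appending to its argument, which also lets it accept frozen strings.

```ruby
# Hypothetical non-mutating variant of the heading-anchor filter.
def append_heading_anchors(markdown)
  # Level 2 to 5 ATX headings made of word characters, spaces, ' and -.
  headings = markdown.split("\n").grep(/^[#]{2,5} +[\w '-]+[# ]*$/)
  return markdown if headings.empty?

  references = headings.map do |line|
    title = line.gsub(/[^\w -]/, '').strip
    anchor = title.gsub(/\W+/, '-').gsub(/(^-+|-+$)/, '')
    "[#{title}]: ##{anchor} \"#{title}\"\n"
  end
  "#{markdown}\n\n#{references.join}"
end

puts append_heading_anchors("## SYNOPSIS\n\ntext\n\n## SEE ALSO\n")
```

With these reference definitions appended, a link written as `[SEE ALSO][]` elsewhere in the document resolves to `#SEE-ALSO`.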
preprocess!()

Parse the document and extract the name, section, and tagline from its contents. This is called while the object is being initialized.

# File lib/ronn/document.rb, line 307
def preprocess!
  input_html
  nil
end
process_html!()
# File lib/ronn/document.rb, line 327
def process_html!
  wrapped_html = "<html>\n  <body>\n#{input_html}\n  </body>\n</html>"
  @html = Nokogiri::HTML.parse(wrapped_html)
  html_filter_angle_quotes
  html_filter_definition_lists
  html_filter_inject_name_section
  html_filter_heading_anchors
  html_filter_annotate_bare_links
  html_filter_manual_reference_links
  @html
end
process_markdown!()
# File lib/ronn/document.rb, line 321
def process_markdown!
  md = markdown_filter_heading_anchors(data)
  md = markdown_filter_link_index(md)
  markdown_filter_angle_quotes(md)
end
strip_heading(html)
# File lib/ronn/document.rb, line 316
def strip_heading(html)
  heading, html = html.split("</h1>\n", 2)
  html || heading
end