class String
Constants
- BRA2KET
- ROMAN
Taken from O'Reilly's Perl Cookbook 6.23. Regular Expression Grabbag.
- ROMAN_VALUES
Public Class Methods
Interpolate. Provides a means of externally using Ruby's string interpolation mechanism.
try = "hello"
str = "\#{try}!!!"
String.interpolate{ str } #=> "hello!!!"
NOTE: The block is necessary in order to get the binding of the caller.
CREDIT: Trans
# File lib/facets/string/interpolate.rb, line 14 def self.interpolate(&str) eval "%{#{str.call}}", str.binding end
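The binding trick can be tried outside Facets with a small standalone sketch. The method name `interpolate_via_block` is hypothetical and not part of the library; it only mirrors the idea of evaluating the block's return value against the block's own binding.

```ruby
# Hypothetical standalone helper (not the Facets method itself).
# The block carries the caller's binding, so local variables visible
# where the block is written can be interpolated later.
def interpolate_via_block(&blk)
  eval("%{#{blk.call}}", blk.binding)
end

name = "hello"
template = '#{name}!!!' # single quotes: nothing is interpolated yet
puts interpolate_via_block { template }
```

Because the text is handed to `eval`, an approach like this is only suitable for trusted templates.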
Generate a random binary string of n_bytes size.
CREDIT: Guido De Rosa
# File lib/facets/string/random_binary.rb, line 6 def self.random_binary(n_bytes) ( Array.new(n_bytes){ rand(0x100) } ).pack('c*') end
Public Instance Methods
Removes occurrences of a string or regexp.
("HELLO HELLO" - "LL") #=> "HEO HEO"
CREDIT: Benjamin David Oakes
# File lib/facets/string/op_sub.rb, line 9 def -(pattern) self.gsub(pattern, '') end
Treats self and path as representations of pathnames, joining them together as a single path.
('home' / 'trans') #=> 'home/trans'
# File lib/facets/string/op_div.rb, line 9 def /(path) File.join(self, path.to_s) end
Binary XOR of two strings.
a = "\000\000\001\001" ^ "\000\001\000\001"
b = "\003\003\003" ^ "\000\001\002"
a #=> "\000\001\001\000"
b #=> "\003\002\001"
# File lib/facets/string/xor.rb, line 11 def ^(aString) a = self.unpack('C'*(self.length)) b = aString.unpack('C'*(aString.length)) if (b.length < a.length) (a.length - b.length).times { b << 0 } end xor = "" 0.upto(a.length-1) { |pos| x = a[pos] ^ b[pos] xor << x.chr() } return(xor) end
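The same byte-wise XOR can be sketched without Facets. The helper name `xor_bytes` is an assumption made for illustration; like the method above, it treats missing bytes in a shorter right-hand operand as zero.

```ruby
# Hypothetical helper, not the Facets implementation: XOR two strings
# byte by byte, padding the right operand with zero bytes if it is shorter.
def xor_bytes(left, right)
  right_bytes = right.bytes
  left.bytes.each_with_index.map { |byte, i| byte ^ (right_bytes[i] || 0) }.pack('C*')
end

p xor_bytes("\x03\x03\x03", "\x00\x01\x02").bytes
```

The result has the same length as the left operand and is returned as a binary string, so comparing `bytes` is the safest way to inspect it.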
Transform a string into an acronym.
CREDIT: Robert Fey
# File lib/facets/string/acronym.rb, line 7 def acronym gsub(/(([a-zA-Z0-9])([a-zA-Z0-9])*)./,"\\2") end
Alignment method dispatches to align_right, align_left or align_center, according to the first parameter, direction.
s = <<-EOS This is a test and so on EOS s.align(:right, 14)
produces …
This is a test and so on
Returns a String aligned right, left or center.
# File lib/facets/string/align.rb, line 21
def align(direction, n, sep="\n", c=' ')
  case direction
  when :right
    align_right(n, sep, c) # pass the caller's sep and c through
  when :left
    align_left(n, sep, c)
  when :center
    align_center(n, sep, c)
  else
    raise ArgumentError
  end
end
Centers each line of a string.
The default alignment separator is a newline ("\n"). This can be changed, as can the padding string, which defaults to a single space (' ').
s = <<-EOS This is a test and so on EOS s.align_center(14)
produces …
This is a test and so on
CREDIT: Trans
# File lib/facets/string/align.rb, line 116 def align_center(n, sep="\n", c=' ') return center(n.to_i,c.to_s) if sep==nil q = split(sep.to_s).collect { |line| line.center(n.to_i,c.to_s) } q.join(sep.to_s) end
Align a string to the left.
The default alignment separator is a newline ("\n"). This can be changed, as can the padding string, which defaults to a single space (' ').
s = <<-EOS This is a test and so on EOS s.align_left(20, "\n", '.')
produces …
This is a test......
and.................
so on...............
CREDIT: Trans
# File lib/facets/string/align.rb, line 86 def align_left(n, sep="\n", c=' ') return ljust(n.to_i,c.to_s) if sep==nil q = split(sep.to_s).map do |line| line.strip.ljust(n.to_i,c.to_s) end q.join(sep.to_s) end
Align a string to the right.
The default alignment separator is a newline ("\n"). This can be changed, as can the padding string, which defaults to a single space (' ').
s = <<-EOS This is a test and so on EOS s.align_right(14)
produces …
This is a test and so on
CREDIT: Trans
# File lib/facets/string/align.rb, line 56 def align_right(n, sep="\n", c=' ') return rjust(n.to_i,c.to_s) if sep==nil q = split(sep.to_s).map do |line| line.rjust(n.to_i,c.to_s) end q.join(sep.to_s) end
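The per-line alignment idea shared by the three align methods can be sketched in plain Ruby. The name `right_align_lines` is hypothetical; the sketch splits on the separator, pads each line with `String#rjust`, and joins the lines back together.

```ruby
# Hypothetical helper illustrating per-line right alignment.
# Each line is padded on the left until it is `width` characters wide.
def right_align_lines(text, width, sep = "\n", pad = ' ')
  text.split(sep).map { |line| line.rjust(width, pad) }.join(sep)
end

puts right_align_lines("This is a test\nand\nso on", 14)
```

Swapping `rjust` for `ljust` or `center` gives the left and centered variants.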
Is this string just whitespace?
"abc".blank? #=> false " ".blank? #=> true
# File lib/facets/kernel/blank.rb, line 74 def blank? /\S/ !~ self end
Return a new string embraced by given brackets. If only one bracket char is given it will be placed on either side.
"wrap me".bracket('{') #=> "{wrap me}" "wrap me".bracket('--','!') #=> "--wrap me!"
CREDIT: Trans
# File lib/facets/string/bracket.rb, line 14
def bracket(bra, ket=nil)
  #ket = String.bra2ket[$&] if ! ket && /^[\[({<]$/ =~ bra
  ket = BRA2KET[bra] unless ket
  "#{bra}#{self}#{ket ? ket : bra}"
end
Inplace version of bracket.
CREDIT: Trans
# File lib/facets/string/bracket.rb, line 24 def bracket!(bra, ket=nil) self.replace(bracket(bra, ket)) end
Transform a string into a sentence like form.
"This Is A String".briefcase #=> "This is a string"
# File lib/facets/string/titlecase.rb, line 22 def briefcase titlecase.capitalize end
Unpacks string into bytes.
Note, this is not 100% compatible with 1.8.7+ which returns an enumerator instead of an array.
# File lib/facets/string/bytes.rb, line 10 def bytes(&blk) if block_given? self.unpack('C*').each(&blk) else self.unpack('C*') end end
Converts a string to camelcase. This method leaves the first character as given. This allows other methods to be used first, such as uppercase and lowercase.
"camel_case".camelcase #=> "camelCase" "Camel_case".camelcase #=> "CamelCase"
Custom separators can be used to specify the patterns used to determine where capitalization should occur. By default these are underscores (`_`) and space characters (`\s`).
"camel/case".camelcase('/') #=> "camelCase"
If the first separator is a symbol, either `:lower` or `:upper`, then the first characters of the string will be downcased or upcased respectively.
"camel_case".camelcase(:upper) #=> "CamelCase"
Note that this implementation is different from ActiveSupport's. If that is what you are looking for you may want {#modulize}.
# File lib/facets/string/camelcase.rb, line 24 def camelcase(*separators) case separators.first when Symbol, TrueClass, FalseClass, NilClass first_letter = separators.shift end separators = ['_', '\s'] if separators.empty? str = self.dup separators.each do |s| str = str.gsub(/(?:#{s}+)([a-z])/){ $1.upcase } end case first_letter when :upper, true str = str.gsub(/(\A|\s)([a-z])/){ $1 + $2.upcase } when :lower, false str = str.gsub(/(\A|\s)([A-Z])/){ $1 + $2.downcase } end str end
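The separator handling can be illustrated with a reduced sketch. The name `camelize_sketch` is hypothetical, and the sketch leaves out the `:upper`/`:lower` first-letter handling; it only shows how each separator pattern is removed and the letter that follows it is upcased.

```ruby
# Hypothetical, reduced version of the separator logic: for every
# separator pattern, drop the separator run and upcase the next letter.
def camelize_sketch(str, *separators)
  separators = ['_', '\s'] if separators.empty?
  separators.reduce(str) do |result, sep|
    result.gsub(/(?:#{sep}+)([a-z])/) { $1.upcase }
  end
end

puts camelize_sketch("camel_case")
puts camelize_sketch("camel/case", '/')
```

Separators are interpolated into a regular expression as-is, so characters with special meaning in a regexp would need escaping by the caller.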
Return true if the string is capitalized, otherwise false.
"This".capitalized? #=> true "THIS".capitalized? #=> false "this".capitalized? #=> false
Note Ruby's strange concept of capitalized. See capitalcase for the more common conception.
CREDIT: Phil Tomson
# File lib/facets/string/capitalized.rb, line 14 def capitalized? capitalize == self end
Returns an array of characters.
"abc".characters.to_a #=> ["a","b","c"]
TODO: Probably should make this an enumerator. With scan?
# File lib/facets/string/characters.rb, line 8 def characters split(//) end
Returns an Enumerator for iterating over each line of the string, stripped of whitespace on either side.
"this\nthat\nother\n".cleanlines.to_a #=> ['this', 'that', 'other']
# File lib/facets/string/cleanlines.rb, line 11 def cleanlines(&block) if block scan(/^.*?$/) do |line| block.call(line.strip) end else str = self Enumerator.new do |output| str.scan(/^.*?$/) do |line| output.yield(line.strip) end end end end
Cleave a string. Break a string in two parts at the nearest whitespace.
CREDIT: Trans
# File lib/facets/string/cleave.rb, line 8
def cleave(threshold=nil, len=nil)
  l = (len || size / 2)
  t = threshold || size
  h1 = self[0...l]
  h2 = self[l..-1]
  i1 = h1.rindex(/\s/) || 0
  d1 = (i1 - l).abs
  d2 = h2.index(/\s/) || l
  i2 = d2 + l
  d1 = (i1-l).abs
  d2 = (i2-l).abs
  if [d1, d2].min > t
    i = t
  elsif d1 < d2
    i = i1
  else
    i = i2
  end
  #dup.insert(l, "\n").gsub(/^\s+|\s+$/, '')
  return self[0..i].to_s.strip, self[i+1..-1].to_s.strip
end
Compare method that takes length into account. Unlike #<=>, this is compatible with succ.
"abc".cmp("abc") #=> 0 "abcd".cmp("abc") #=> 1 "abc".cmp("abcd") #=> -1 "xyz".cmp("abc") #=> 1
CREDIT: Peter Vanbroekhoven
TODO: Move #cmp to string/ directory.
# File lib/facets/comparable/cmp.rb, line 34 def cmp(other) return -1 if length < other.length return 1 if length > other.length self <=> other # alphabetic compare end
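Why a length-first comparison agrees with succ can be seen in a small sketch. The function name `length_cmp` is hypothetical; it applies the same two rules as the method above: shorter strings sort first, and equal-length strings fall back to `<=>`.

```ruby
# Hypothetical standalone version of a length-aware comparison.
def length_cmp(a, b)
  return -1 if a.length < b.length
  return 1 if a.length > b.length
  a <=> b # same length: ordinary alphabetic compare
end

# "z".succ is "aa", yet plain <=> orders "aa" before "z".
p "z" <=> "aa"
p length_cmp("z", "aa")
```

With the length check in place, a string always compares as smaller than its successor, which plain `<=>` does not guarantee.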
Matches any whitespace (including newline) and replaces with a single space
string = " SELECT name FROM users ".compress_lines string #=> "SELECT name FROM users"
# File lib/facets/string/compress_lines.rb, line 12 def compress_lines(spaced = true) split($/).map{ |line| line.strip }.join(spaced ? ' ' : '') end
Common Unix cryptography method. This adds a default salt to the built-in crypt method.
NOTE: This method is not a common core extension and is not loaded automatically when using require 'facets'.
@uncommon
require 'facets/string/crypt'
# File lib/facets/string/crypt.rb, line 14 def crypt(salt=nil) salt ||= ( (rand(26) + (rand(2) == 0 ? 65 : 97) ).chr + (rand(26) + (rand(2) == 0 ? 65 : 97) ).chr ) _crypt(salt) end
Breaks a string up into an array based on a regular expression. Similar to scan, but includes the matches.
s = "<p>This<b>is</b>a test.</p>" s.divide( /\<.*?\>/ ) #=> ["<p>This", "<b>is", "</b>a test.", "</p>"]
CREDIT: Trans
# File lib/facets/string/divide.rb, line 12
def divide( re )
  re2 = /#{re}.*?(?=#{re}|\Z)/
  scan(re2) #{re}(?=#{re})/)
end
Return true if the string is lowercase (downcase), otherwise false.
"THIS".downcase? #=> false "This".downcase? #=> false "this".downcase? #=> true
CREDIT: Phil Tomson
# File lib/facets/string/capitalized.rb, line 26 def downcase? downcase == self end
Yields a single-character string for each character in the string. When $KCODE = 'UTF8', multi-byte characters are yielded appropriately.
a = '' "HELLO".each_char{ |c| a << "#{c.downcase}" } a #=> 'hello'
# File lib/facets/string/each_char.rb, line 14
def each_char # :yield:
  scanner, char = StringScanner.new(self), /./mu
  loop{ yield(scanner.scan(char) || break) }
end
Iterate through each word of a string.
a = [] "list of words".each_word { |word| a << word } a #=> ['list', 'of', 'words']
# File lib/facets/string/each_word.rb, line 13 def each_word(&block) words.each(&block) end
Levenshtein distance algorithm implementation for Ruby, with UTF-8 support.
The Levenshtein distance is a measure of how similar two strings s and t are, calculated as the number of deletions/insertions/substitutions needed to transform s into t. The greater the distance, the more the strings differ.
The Levenshtein distance is also sometimes referred to as the easier-to-pronounce-and-spell 'edit distance'.
Calculate the Levenshtein distance between two strings self and str2. self and str2 should be ASCII, UTF-8, or a one-byte-per-character encoding such as ISO-8859-*.
The strings will be treated as UTF-8 if $KCODE is set appropriately (i.e. 'u'). Otherwise, the comparison will be performed byte-by-byte. There is no specific support for Shift-JIS or EUC strings.
When using Unicode text, be aware that this algorithm does not perform normalisation. If there is a possibility of different normalised forms being used, normalisation should be performed beforehand.
CREDIT: Paul Battley
# File lib/facets/string/edit_distance.rb, line 25
def edit_distance(str2)
  str1 = self
  if $KCODE =~ /^U/i
    unpack_rule = 'U*'
  else
    unpack_rule = 'C*'
  end
  s = str1.unpack(unpack_rule)
  t = str2.unpack(unpack_rule)
  n = s.length
  m = t.length
  return m if (0 == n)
  return n if (0 == m)
  d = (0..m).to_a
  x = nil
  (0...n).each do |i|
    e = i+1
    (0...m).each do |j|
      cost = (s[i] == t[j]) ? 0 : 1
      x = [
        d[j+1] + 1, # insertion
        e + 1,      # deletion
        d[j] + cost # substitution
      ].min
      d[j] = e
      e = x
    end
    d[m] = x
  end
  return x
end
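For readers on a Ruby without `$KCODE`, the same dynamic-programming idea can be sketched with code points. The name `levenshtein` is hypothetical, and comparing code points rather than bytes is an assumption of this sketch, not something the method above does by default.

```ruby
# Hypothetical two-row Levenshtein sketch operating on code points.
def levenshtein(s, t)
  s = s.codepoints
  t = t.codepoints
  return t.length if s.empty?
  return s.length if t.empty?

  prev = (0..t.length).to_a # distances from the empty prefix of s
  s.each_with_index do |sc, i|
    curr = [i + 1]
    t.each_with_index do |tc, j|
      cost = sc == tc ? 0 : 1
      curr << [
        prev[j + 1] + 1, # deletion
        curr[j] + 1,     # insertion
        prev[j] + cost   # substitution
      ].min
    end
    prev = curr
  end
  prev.last
end

p levenshtein("kitten", "sitting")
```

As with the method above, no Unicode normalisation is performed, so differently normalised forms of the same text count as different.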
Does a string end with the given suffix?
"hello".end_with?("lo") #=> true "hello".end_with?("to") #=> false
Note: This definition is better than standard Ruby's because it handles regular expressions.
CREDIT: Juris Galang
# File lib/facets/string/start_with.rb, line 49 def end_with?(suffix) suffix = Regexp.escape(suffix.to_s) unless Regexp===suffix /#{suffix}$/.match(self) ? true : false end
The inverse of include?.
# File lib/facets/string/exclude.rb, line 5 def exclude?(str) !include?(str) end
Expands tabs to n spaces. Non-destructive. If n is 0, then tabs are simply removed. Raises an exception if n is negative.
"\t\tHey".expand_tabs(2) #=> "    Hey"
Thanks to GGaramuno for a more efficient algorithm. Very nice.
CREDIT: Gavin Sinclair, Noah Gibbs, GGaramuno
TODO: Don't much care for the name #expand_tabs. What about a more concise name like detab?
# File lib/facets/string/expand_tab.rb, line 16 def expand_tabs(n=8) n = n.to_int raise ArgumentError, "n must be >= 0" if n < 0 return gsub(/\t/, "") if n == 0 return gsub(/\t/, " ") if n == 1 str = self.dup while str.gsub!(/^([^\t\n]*)(\t+)/) { |f| val = ( n * $2.size - ($1.size % n) ) $1 << (' ' * val) } end str end
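Tab expansion to tab stops can also be written as a plain column-tracking loop. The sketch below is a hypothetical alternative, not the algorithm credited above; it assumes n is positive and pads each tab up to the next multiple of n.

```ruby
# Hypothetical column-tracking sketch of tab expansion (n must be > 0).
def expand_tabs_sketch(text, n = 8)
  text.each_line.map do |line|
    col = 0
    line.each_char.map do |ch|
      if ch == "\t"
        pad = n - (col % n) # spaces needed to reach the next tab stop
        col += pad
        ' ' * pad
      else
        col += 1
        ch
      end
    end.join
  end.join
end

p expand_tabs_sketch("a\tb", 4)
```

This trades the regular-expression approach for a simpler loop that visits every character once.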
Use fluent notation for making file directives.
For instance, if we had a file 'foo.txt',
'foo.txt'.file.mtime
# File lib/facets/string/file.rb, line 11 def file Functor.new(&method(:file_send).to_proc) end
Returns a new string with all new lines removed from adjacent lines of text.
s = "This is\na test.\n\nIt clumps\nlines of text." s.fold
produces
"This is a test.\n\nIt clumps lines of text. "
One arguable flaw with this, that might need a fix: if the given string ends in a newline, it is replaced with a single space.
CREDIT: Trans
# File lib/facets/string/fold.rb, line 19 def fold(ignore_indented=false) ns = '' i = 0 br = self.scan(/(\n\s*\n|\Z)/m) do |m| b = $~.begin(1) e = $~.end(1) nl = $& tx = slice(i...b) if ignore_indented and slice(i...b) =~ /^[ ]+/ ns << tx else ns << tx.gsub(/[ ]*\n+/,' ') end ns << nl i = e end ns end
Equivalent to #indent, but modifies the receiver in place.
# File lib/facets/string/indent.rb, line 18 def indent!(n, c=' ') replace(indent(n,c)) end
Like index but returns an array of all index locations. The reuse flag allows the trailing portion of a match to be reused for subsequent matches.
"abcabcabc".index_all('a') #=> [0,3,6]
"bbb".index_all('bb', false) #=> [0]
"bbb".index_all('bb', true) #=> [0,1]
TODO: Could probably be defined for Indexable in general too.
# File lib/facets/string/index_all.rb, line 14 def index_all(s, reuse=false) s = Regexp.new(Regexp.escape(s)) unless Regexp===s ia = []; i = 0 while (i = index(s,i)) ia << i i += (reuse ? 1 : $~[0].size) end ia end
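The effect of the reuse flag is easiest to see in a standalone sketch. The name `index_all_sketch` is hypothetical; it uses `Regexp#match` with a start position and, as an added assumption, advances by at least one character so that empty matches cannot loop forever.

```ruby
# Hypothetical sketch: collect every match position, optionally letting
# matches overlap (reuse = true restarts one character after each hit).
def index_all_sketch(str, pattern, reuse = false)
  re = pattern.is_a?(Regexp) ? pattern : Regexp.new(Regexp.escape(pattern))
  found = []
  pos = 0
  while (m = re.match(str, pos))
    found << m.begin(0)
    pos = m.begin(0) + (reuse ? 1 : [m[0].size, 1].max)
  end
  found
end

p index_all_sketch("bbb", "bb", true)
```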
Left chomp.
"help".lchomp("h") #=> "elp" "help".lchomp("k") #=> "help"
CREDIT: Trans
# File lib/facets/string/lchomp.rb, line 10 def lchomp(match) if index(match) == 0 self[match.size..-1] else self.dup end end
In-place left chomp.
"help".lchomp!("h") #=> "elp"
"help".lchomp!("k") #=> nil
CREDIT: Trans
# File lib/facets/string/lchomp.rb, line 25 def lchomp!(match) if index(match) == 0 self[0...match.size] = '' self end end
Line wrap at width.
s = "1234567890".line_wrap(5) s #=> "12345\n67890\n"
CREDIT: Trans
# File lib/facets/string/line_wrap.rb, line 11
def line_wrap(width, tabs=4)
  s = gsub(/\t/,' ' * tabs) # tabs default to 4 spaces
  s = s.gsub(/\n/,' ')
  r = s.scan( /.{1,#{width}}/ )
  r.join("\n") << "\n"
end
Returns an Enumerator for iterating over each line of the string, retaining the terminating newline.
"abc\n123".lines.to_a #=> ["abc\n","123"]
# File lib/facets/string/lines.rb, line 9
def lines(&blk)
  if block_given?
    each_line(&blk) #scan(/$.*?\n/).each(&blk)
  else
    Enumerator.new(self, :lines) #.split(/\n/)
  end
end
Same as #camelcase but converts first letter to lowercase.
"camel_case".lower_camelcase #=> "camelCase" "Camel_case".lower_camelcase #=> "camelCase"
@deprecated
Use `#camelcase(:lower)` instead.
# File lib/facets/string/camelcase.rb, line 68 def lower_camelcase(*separators) camelcase(:lower, *separators) end
Downcase first letter.
# File lib/facets/string/uppercase.rb, line 17 def lowercase str = to_s str[0,1].downcase + str[1..-1] end
Provides a margin controlled string.
x = %Q{ | This | is | margin controlled! }.margin
NOTE: This may still need a bit of tweaking.
TODO: describe its limits and caveats and edge cases
CREDIT: Trans
# File lib/facets/string/margin.rb, line 17
def margin(n=0)
  #d = /\A.*\n\s*(.)/.match( self )[1]
  #d = /\A\s*(.)/.match( self)[1] unless d
  d = ((/\A.*\n\s*(.)/.match(self)) || (/\A\s*(.)/.match(self)))[1]
  return '' unless d
  if n == 0
    gsub(/\n\s*\Z/,'').gsub(/^\s*[#{d}]/, '')
  else
    gsub(/\n\s*\Z/,'').gsub(/^\s*[#{d}]/, ' ' * n)
  end
end
Translate a class or module name to a suitable method name.
"My::CoolClass".methodize #=> "my__cool_class"
# File lib/facets/string/methodize.rb, line 7 def methodize gsub(/([A-Z]+)([A-Z])/,'\1_\2'). gsub(/([a-z])([A-Z])/,'\1_\2'). gsub('/' ,'__'). gsub('::','__'). downcase end
Converts a string to module name representation.
This is essentially camelcase, but it also converts '/' to '::' which is useful for converting paths to namespaces.
Examples
"method_name".modulize #=> "MethodName" "method/name".modulize #=> "Method::Name"
# File lib/facets/string/modulize.rb, line 14
def modulize
  #gsub('__','/'). # why was this ever here?
  gsub(/__(.?)/){ "::#{$1.upcase}" }.
  gsub(/\/(.?)/){ "::#{$1.upcase}" }.
  gsub(/(?:_+|-+)([a-z])/){ $1.upcase }.
  gsub(/(\A|\s)([a-z])/){ $1 + $2.upcase }
end
Like scan but returns MatchData ($~) rather than matched string ($&).
CREDIT: Trans
# File lib/facets/string/mscan.rb, line 8
def mscan(re) #:yield:
  if block_given?
    scan(re) { yield($~) }
  else
    m = []
    scan(re) { m << $~ }
    m
  end
end
'Natural order' comparison of strings, e.g. …
"my_prog_v1.1.0" < "my_prog_v1.2.0" < "my_prog_v1.10.0"
which does not follow alphabetically. A secondary parameter, if set to true, makes the comparison case insensitive.
"Hello.1".natcmp("Hello.10") #=> -1
TODO: Invert case flag?
@author Alan Davies @author Martin Pool
# File lib/facets/string/natcmp.rb, line 47
def natcmp(str2, caseInsensitive=false)
  str1 = self.dup
  str2 = str2.dup
  compareExpression = /^(\D*)(\d*)(.*)$/
  if caseInsensitive
    str1.downcase!
    str2.downcase!
  end
  # -- remove all whitespace
  str1.gsub!(/\s*/, '')
  str2.gsub!(/\s*/, '')
  while (str1.length > 0) or (str2.length > 0) do
    # -- extract non-digits, digits and rest of string
    str1 =~ compareExpression
    chars1, num1, str1 = $1.dup, $2.dup, $3.dup
    str2 =~ compareExpression
    chars2, num2, str2 = $1.dup, $2.dup, $3.dup
    # -- compare the non-digits
    case (chars1 <=> chars2)
    when 0 # Non-digits are the same, compare the digits...
      # If either number begins with a zero, then compare alphabetically,
      # otherwise compare numerically
      if (num1[0] != 48) and (num2[0] != 48)
        num1, num2 = num1.to_i, num2.to_i
      end
      case (num1 <=> num2)
      when -1 then return -1
      when 1 then return 1
      end
    when -1 then return -1
    when 1 then return 1
    end # case
  end # while
  # -- strings are naturally equal
  return 0
end
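A common way to get a similar ordering in plain Ruby is to sort by a key that separates digit runs from other text. The helper `natural_key` below is hypothetical and simpler than natcmp: it ignores the leading-zero and whitespace rules and only shows the chunking idea.

```ruby
# Hypothetical sort key: split into digit and non-digit chunks, and tag
# each chunk so that numbers compare numerically and text alphabetically.
def natural_key(str)
  str.scan(/\d+|\D+/).map do |chunk|
    chunk.match?(/\A\d/) ? [1, chunk.to_i, ''] : [0, 0, chunk]
  end
end

versions = %w[my_prog_v1.10.0 my_prog_v1.2.0 my_prog_v1.1.0]
puts versions.sort_by { |v| natural_key(v) }
```

Tagging every chunk with the same three-element shape keeps the arrays comparable with `<=>` even when a number lines up against text.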
Returns n characters of the string. If n is positive the characters are from the beginning of the string. If n is negative from the end of the string.
str = "this is text" str.nchar(4) #=> "this" str.nchar(-4) #=> "text"
Alternatively a replacement string can be given, which will replace the n characters.
str.nchar(4, 'that') #=> "that is text"
The original string remains unaffected.
str #=> "this is text"
# File lib/facets/string/nchar.rb, line 21 def nchar(n, replacement=nil) if replacement s = self.dup n > 0 ? (s[0...n] = replacement) : (s[n..-1] = replacement) s else n > 0 ? self[0...n] : self[n..-1] end end
Returns an Enumerator for iterating over each line of the string, void of the terminating newline character, in contrast to lines which retains it.
"a\nb\nc".newlines.class.assert == Enumerator "a\nb\nc".newlines.to_a.assert == %w{a b c} a = [] "a\nb\nc".newlines{|nl| a << nl} a.assert == %w{a b c}
# File lib/facets/string/newlines.rb, line 16 def newlines(&block) if block scan(/^.*?$/) do |line| block.call(line.chomp) end else str = self Enumerator.new do |output| str.scan(/^.*?$/) do |line| output.yield(line.chomp) end end end end
# File lib/facets/object/object_state.rb, line 54 def object_state(data=nil) data ? replace(data) : dup end
Transforms a namespace, i.e. a class or module name, into a viable file path.
"ExamplePathize".pathize #=> "example_pathize" "ExamplePathize::Example".pathize #=> "example_pathize/example"
Compare this method to {String#modulize} and {String#methodize}.
# File lib/facets/string/pathize.rb, line 11
def pathize
  gsub(/([A-Z]+)([A-Z])/,'\1_\2').
  gsub(/([a-z])([A-Z])/,'\1_\2').
  gsub('__','/').
  gsub('::','/').
  gsub(/\s+/, '').          # spaces are bad form
  gsub(/[?%*:|"<>.]+/, ''). # reserved characters
  downcase
end
Return a new string embraced by given type and count of quotes. The arguments can be given in any order.
If no type is given, double quotes are assumed.
"quote me".quote #=> '"quote me"'
If no type but a count is given then :mixed is assumed.
"quote me".quote(1) #=> %q{'quote me'} "quote me".quote(2) #=> %q{"quote me"} "quote me".quote(3) #=> %q{'"quote me"'}
Symbols can be used to describe the type.
"quote me".quote(:single) #=> %q{'quote me'} "quote me".quote(:double) #=> %q{"quote me"} "quote me".quote(:back) #=> %q{`quote me`} "quote me".quote(:bracket) #=> %q{`quote me'}
Or the character itself.
"quote me".quote("'") #=> %q{'quote me'} "quote me".quote('"') #=> %q{"quote me"} "quote me".quote("`") #=> %q{`quote me`} "quote me".quote("`'") #=> %q{`quote me'}
CREDIT: Trans
# File lib/facets/string/quote.rb, line 32 def quote(type=:double, count=nil) if Integer === type tmp = count count = type type = tmp || :mixed else count ||= 1 end type = type.to_s unless Integer===type case type when "'", 'single', 's', 1 f = "'" * count b = f when '"', 'double', 'd', 2 f = '"' * count b = f when '`', 'back', 'b', -1 f = '`' * count b = f when "`'", 'bracket', 'sb' f = "`" * count b = "'" * count when "'\"", 'mixed', "m", Integer c = (count.to_f / 2).to_i f = '"' * c b = f if count % 2 != 0 f = "'" + f b = b + "'" end else raise ArgumentError, "unrecognized quote type -- #{type}" end "#{f}#{self}#{b}" end
Like index but returns a Range.
"This is a test!".range('test') #=> (10..13)
CREDIT: Trans
# File lib/facets/string/range.rb, line 9 def range(pattern, offset=0) unless Regexp === pattern pattern = Regexp.new(Regexp.escape(pattern.to_s)) end string = self[offset..-1] if md = pattern.match(string) return (md.begin(0)+offset)..(md.end(0)+offset-1) end nil end
Like index_all but returns an array of Ranges.
"abc123abc123".range_all('abc') #=> [0..2, 6..8]
TODO: Add offset ?
CREDIT: Trans
# File lib/facets/string/range.rb, line 28 def range_all(pattern, reuse=false) r = []; i = 0 while i < self.length rng = range(pattern, i) if rng r << rng i += reuse ? 1 : rng.end + 1 else break end end r.uniq end
Returns an array of ranges mapping the characters per line.
"this\nis\na\ntest".range_of_line #=> [0..4, 5..7, 8..9, 10..13]
CREDIT: Trans
# File lib/facets/string/range.rb, line 50 def range_of_line offset=0; charmap = [] each_line do |line| charmap << (offset..(offset + line.length - 1)) offset += line.length end charmap end
Apply a set of rules in the form of regular expression matches to the string.
- rules - The array containing rule-pairs (match, write).
Keep in mind that the order of rules is significant.
Returns the rewritten String.
CREDIT: George Moschovitis
# File lib/facets/string/rewrite.rb, line 14 def rewrite(rules) raise ArgumentError.new('The rules parameter is nil') unless rules rewritten_string = dup rules.each do |(match,write)| rewritten_string.gsub!(match,write) end return rewritten_string end
Considers string a Roman numeral, and converts it to the corresponding integer.
NOTE: This method is not a common core extension and is not loaded automatically when using require 'facets'.
@uncommon
require 'facets/string/roman'
# File lib/facets/roman.rb, line 62 def roman roman = upcase raise unless roman? last = roman[-1,1] roman.reverse.split('').inject(0) do |result, c| if ROMAN_VALUES[c] < ROMAN_VALUES[last] result -= ROMAN_VALUES[c] else last = c result += ROMAN_VALUES[c] end end end
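The right-to-left subtraction rule can be sketched on its own. The name `roman_to_i` and the digit table inside it are assumptions made for illustration (the contents of ROMAN_VALUES are not shown in this document), and the sketch does not validate the numeral the way roman? does.

```ruby
# Hypothetical sketch: walk the numeral from right to left; a digit
# smaller than the largest digit seen so far is subtracted, otherwise added.
def roman_to_i(numeral)
  values = { 'I' => 1, 'V' => 5, 'X' => 10, 'L' => 50,
             'C' => 100, 'D' => 500, 'M' => 1000 }
  largest = 0
  numeral.upcase.reverse.each_char.sum do |c|
    value = values.fetch(c) # raises KeyError for non-Roman characters
    if value < largest
      -value
    else
      largest = value
      value
    end
  end
end

p roman_to_i("MCMXCIV")
```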
Returns true if and only if the subject is a valid Roman numeral.
NOTE: This method is not a common core extension and is not loaded automatically when using require 'facets'.
@uncommon
require 'facets/string/roman'
# File lib/facets/roman.rb, line 84 def roman? ROMAN =~ upcase end
Breaks a string up into an array based on a regular expression. Similar to scan, but includes the matches.
s = "<p>This<b>is</b>a test.</p>" s.shatter( /\<.*?\>/ )
produces
["<p>", "This", "<b>", "is", "</b>", "a test.", "</p>"]
CREDIT: Trans
# File lib/facets/string/shatter.rb, line 15 def shatter( re ) r = self.gsub( re ){ |s| "\1" + s + "\1" } while r[0,1] == "\1" ; r[0] = '' ; end while r[-1,1] == "\1" ; r[-1] = '' ; end r.split("\1") end
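A similar result can be obtained with `String#split` and a capture group, since split keeps captured delimiters in its output. The helper name `split_keeping_matches` is hypothetical, and the sketch assumes the pattern contains no capture groups of its own.

```ruby
# Hypothetical alternative to the marker-character approach: wrapping the
# pattern in a capture group makes split return the delimiters as well.
def split_keeping_matches(text, pattern)
  text.split(/(#{pattern})/).reject(&:empty?)
end

p split_keeping_matches("<p>This<b>is</b>a test.</p>", /<.*?>/)
```

Unlike the implementation above, this does not rely on "\1" never appearing in the input.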
A fuzzy matching mechanism. Returns a score from 0-1, based on the number of shared edges. To be effective, the strings must be of length 2 or greater.
"Alexsander".similarity("Aleksander") #=> 0.9
The way it works:
- Converts each string into a “graph like” object, with edges …
"alexsander" -> [ alexsander, alexsand, alexsan ... lexsand ... san ... an, etc ]
"aleksander" -> [ aleksander, aleksand ... etc. ]
- Performs the match, then removes any subsets from this matched set (i.e. a hit on “san” is a subset of a hit on “sander”) …
Above example, once reduced -> [ ale, sander ]
- Sees how many of the matches remain, and calculates a score based on how many matches there are and their length, compared to the length of the larger of the two words.
Still a bit rough. Any suggestions for improvement are welcome.
CREDIT: Derek Lewis.
# File lib/facets/string/similarity.rb, line 28
def similarity(str_in)
  return 0 if str_in == nil
  return 1 if self == str_in
  # -- make a graph of each word (okay, its not a true graph, but is similar)
  graph_A = Array.new
  graph_B = Array.new
  # -- "graph" self
  last = self.length
  (0..last).each do |ff|
    loc = self.length
    break if ff == last - 1
    wordB = (1..(last-1)).to_a.reverse!
    if (wordB != nil)
      wordB.each do |ss|
        break if ss == ff
        graph_A.push( "#{self[ff..ss]}" )
      end
    end
  end
  # -- "graph" input string
  last = str_in.length
  (0..last).each{ |ff|
    loc = str_in.length
    break if ff == last - 1
    wordB = (1..(last-1)).to_a.reverse!
    wordB.each do |ss|
      break if ss == ff
      graph_B.push( "#{str_in[ff..ss]}" )
    end
  }
  # -- count how many of these "graph edges" we have that are the same
  matches = graph_A & graph_B
  #--
  #matches = Array.new
  #graph_A.each{ |aa| matches.push(aa) if( graph_B.include?(aa) ) }
  #++
  # -- for eliminating subsets, we want to start with the smallest hits.
  matches.sort!{|x,y| x.length <=> y.length}
  # -- eliminate any subsets
  mclone = matches.dup
  mclone.each_index do |ii|
    reg = Regexp.compile( Regexp.escape(mclone[ii]) )
    count = 0.0
    matches.each{|xx| count += 1 if xx =~ reg}
    matches.delete(mclone[ii]) if count > 1
  end
  score = 0.0
  matches.each{ |mm| score += mm.length }
  self.length > str_in.length ? largest = self.length : largest = str_in.length
  return score/largest
end
Underscore a string such that camelcase, dashes and spaces are replaced by underscores. This is the reverse of {#camelcase}, albeit not an exact inverse.
"SnakeCase".snakecase #=> "snake_case" "Snake-Case".snakecase #=> "snake_case" "Snake Case".snakecase #=> "snake_case" "Snake - Case".snakecase #=> "snake_case"
Note, this method no longer converts `::` to `/`, in that case use the {#pathize} method instead.
# File lib/facets/string/snakecase.rb, line 15
def snakecase
  #gsub(/::/, '/').
  gsub(/([A-Z]+)([A-Z][a-z])/,'\1_\2').
  gsub(/([a-z\d])([A-Z])/,'\1_\2').
  tr('-', '_').
  gsub(/\s/, '_').
  gsub(/__+/, '_').
  downcase
end
String#splice is essentially the same as store.
a = "HELLO" a.splice(1, "X") a #=> "HXLLO"
But it acts like slice! when given a single argument.
a = "HELLO" a.splice(1) #=> "E" a #=> "HLLO"
CREDIT: Trans
# File lib/facets/string/splice.rb, line 19 def splice(idx, sub=nil) if sub store(idx, sub) else case idx when Range slice!(idx) else slice!(idx,1) end end end
Returns the string, first removing all whitespace on both ends of the string, and then changing remaining consecutive whitespace groups into one space each.
%Q{ Multi-line string }.squish # => "Multi-line string" " foo bar \n \t boo".squish # => "foo bar boo"
# File lib/facets/string/squish.rb, line 11 def squish dup.squish! end
Performs a destructive squish. See #squish.
# File lib/facets/string/squish.rb, line 16 def squish! strip! gsub!(/\s+/, ' ') self end
Does a string start with the given prefix?
"hello".start_with?("he") #=> true "hello".start_with?("to") #=> false
Note: This definition is better than standard Ruby's because it handles regular expressions.
CREDIT: Juris Galang
# File lib/facets/string/start_with.rb, line 30 def start_with?(prefix) prefix = Regexp.escape(prefix.to_s) unless Regexp===prefix /^#{prefix}/.match(self) ? true : false end
Aligns each line n spaces.
CREDIT: Gavin Sinclair
# File lib/facets/string/tab.rb, line 9 def tab(n) gsub(/^ */, ' ' * n) end
Preserves relative tabbing. The first non-empty line ends up with n spaces before nonspace.
CREDIT: Gavin Sinclair
# File lib/facets/string/tabto.rb, line 10 def tabto(n) if self =~ /^( *)\S/ indent(n - $1.length) else self end end
Transform a string into a form that makes for an acceptable title.
"this is a string".titlecase #=> "This Is A String"
@author Eliazar Parra @author Angelo Lakra (apostrophe fix)
# File lib/facets/string/titlecase.rb, line 11 def titlecase tr('_', ' '). gsub(/\s+/, ' '). gsub(/\b\w/){ $`[-1,1] == "'" ? $& : $&.upcase } end
Interpret common affirmative string meanings as true, otherwise nil or false. Blank space and case are ignored. The following strings will return true …
true yes on t 1 y ==
The following strings will return nil …
nil null
All other strings return false.
Here are some examples.
"true".to_b #=> true "yes".to_b #=> true "no".to_b #=> false "123".to_b #=> false
# File lib/facets/boolean.rb, line 96 def to_b case self.downcase.strip when 'true', 'yes', 'on', 't', '1', 'y', '==' return true when 'nil', 'null' return nil else return false end end
Turns a string into a regular expression.
"a?".to_re #=> /a?/
CREDIT: Trans
# File lib/facets/string/to_re.rb, line 9 def to_re(esc=false) Regexp.new((esc ? Regexp.escape(self) : self)) end
Turns a string into a regular expression. By default it will escape all characters. Pass false as the argument to turn off escaping.
"[".to_rx #=> /\[/
CREDIT: Trans
# File lib/facets/string/to_re.rb, line 21 def to_rx(esc=true) Regexp.new((esc ? Regexp.escape(self) : self)) end
Return a new string with the given brackets removed. If only one bracket char is given it will be removed from either side.
"{unwrap me}".unbracket('{') #=> "unwrap me" "--unwrap me!".unbracket('--','!') #=> "unwrap me"
CREDIT: Trans
# File lib/facets/string/bracket.rb, line 37
def unbracket(bra=nil, ket=nil)
  if bra
    ket = BRA2KET[bra] unless ket
    ket = ket ? ket : bra
    s = self.dup
    s.gsub!(%r[^#{Regexp.escape(bra)}], '')
    s.gsub!(%r[#{Regexp.escape(ket)}$], '')
    return s
  else
    if m = BRA2KET[ self[0,1] ]
      return self.slice(1...-1) if self[-1,1] == m
    end
  end
  return self.dup # if nothing else
end
Inplace version of unbracket.
CREDIT: Trans
# File lib/facets/string/bracket.rb, line 57 def unbracket!(bra=nil, ket=nil) self.replace( unbracket(bra, ket) ) end
Unfold paragraphs.
FIXME: Sometimes adds one too many blank lines. TEST!!!
# File lib/facets/string/unfold.rb, line 7 def unfold blank = false text = '' split(/\n/).each do |line| if /\S/ !~ line text << "\n\n" blank = true else if /^(\s+|[*])/ =~ line text << (line.rstrip + "\n") else text << (line.rstrip + " ") end blank = false end end text = text.gsub(/(\n){3,}/,"\n\n") text.rstrip end
Remove excessive indentation. Useful for multi-line strings embedded in already indented code.
<<-END.unindent ohaie wurld END
Outputs …
ohaie wurld
Instead of …
ohaie wurld
CREDIT: Noah Gibbs, mynyml
# File lib/facets/string/indent.rb, line 42 def unindent(size=nil) if size indent(-size) else char = ' ' self.scan(/^[\ \t]*\S/) do |m| if size.nil? || m.size < size size = m.size char = m[0...-1] end end size -= 1 indent(-size, char) end end
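The core of unindenting, finding the smallest indentation among non-blank lines and stripping that much from every line, can be sketched without the indent helper. The name `unindent_sketch` is hypothetical, and the sketch counts spaces and tabs alike as one column each.

```ruby
# Hypothetical sketch: measure the leading whitespace of every non-blank
# line, take the minimum, and remove that many characters from each line.
def unindent_sketch(text)
  indent = text.scan(/^[ \t]*(?=\S)/).map(&:size).min || 0
  text.gsub(/^[ \t]{#{indent}}/, '')
end

puts unindent_sketch("    ohaie\n      wurld\n")
```

Relative indentation between lines is preserved, because the same amount is removed everywhere.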
Equivalent to #unindent, but modifies the receiver in place.
CREDIT: mynyml
# File lib/facets/string/indent.rb, line 62 def unindent! self.replace(self.unindent) end
Remove quotes from string.
"'hi'".unquote #=> "hi"
CREDIT: Trans
# File lib/facets/string/quote.rb, line 76 def unquote s = self.dup case self[0,1] when "'", '"', '`' s[0] = '' end case self[-1,1] when "'", '"', '`' s[-1] = '' end return s end
Is the string upcase/uppercase?
"THIS".upcase? #=> true "This".upcase? #=> false "this".upcase? #=> false
CREDIT: Phil Tomson
# File lib/facets/string/capitalized.rb, line 41 def upcase? upcase == self end
Same as #camelcase but converts first letter to uppercase.
"camel_case".upper_camelcase #=> "CamelCase" "Camel_case".upper_camelcase #=> "CamelCase"
@deprecated
Use `#camelcase(:upper)` instead.
# File lib/facets/string/camelcase.rb, line 56 def upper_camelcase(*separators) camelcase(:upper, *separators) end
Upcase first letter.
NOTE: One might argue that this method should behave the same as #upcase and rather this behavior should be in place of #capitalize. Probably so, but since Matz has already defined #capitalize the way it is, this name seems most fitting to the missing behavior.
# File lib/facets/string/uppercase.rb, line 10 def uppercase str = to_s str[0,1].upcase + str[1..-1] end
Prepend an “@” to the beginning of a string to make an instance variable name. This also replaces non-valid characters with underscores.
# File lib/facets/string/variablize.rb, line 7 def variablize v = gsub(/\W/, '_') "@#{v}" end
Word wrap a string not exceeding max width.
"this is a test".word_wrap(4)
produces …
this
is a
test
This is a basic implementation of word wrap, but smart enough to suffice for most use cases.
CREDIT: Gavin Kistner, Dayne Broderson
# File lib/facets/string/word_wrap.rb, line 18 def word_wrap( col_width=80 ) self.dup.word_wrap!( col_width ) end
As with word_wrap, but modifies the string in place.
CREDIT: Gavin Kistner, Dayne Broderson
# File lib/facets/string/word_wrap.rb, line 26 def word_wrap!( col_width=80 ) self.gsub!( /(\S{#{col_width}})(?=\S)/, '\1 ' ) self.gsub!( /(.{1,#{col_width}})(?:\s+|$)/, "\\1\n" ) self end
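Word wrapping can also be done greedily, word by word, instead of with regular expressions. The function `wrap_words` below is a hypothetical sketch of that approach; unlike word_wrap! it does not break words longer than the column width and does not append a trailing newline.

```ruby
# Hypothetical greedy wrapper: keep adding words to the current line
# while the line, plus a space and the next word, still fits in `width`.
def wrap_words(text, width = 80)
  lines = ['']
  text.split.each do |word|
    if lines.last.empty?
      lines[-1] = word
    elsif lines.last.length + 1 + word.length <= width
      lines[-1] = "#{lines.last} #{word}"
    else
      lines << word
    end
  end
  lines.join("\n")
end

puts wrap_words("this is a test", 4)
```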
Returns an array of words.
"abc 123".words #=> ["abc","123"]
# File lib/facets/string/words.rb, line 7 def words self.split(/\s+/) end
Private Instance Methods
# File lib/facets/string/file.rb, line 17 def file_send(op, *a, &b) File.send(op, self, *a, &b) end