class FriendlyId::SlugString

This class provides some string-manipulation methods specific to slugs. Its Unicode support is provided by ActiveSupport::Multibyte::Chars; this is needed primarily for Unicode encoding normalization and proper calculation of string lengths.

Note that this class includes many “bang methods” such as {#clean!} and {#normalize!} that perform actions on the string in-place. Each of these methods has a corresponding “bangless” method (e.g., +SlugString#clean+ for +SlugString#clean!+), which does not appear in the documentation because it is generated dynamically.

All of the bang methods return an instance of String, while the bangless versions return an instance of FriendlyId::SlugString, so that calls to methods specific to this class can be chained:

string = SlugString.new("hello world")
string.with_dashes! # => "hello-world"
string.with_dashes  # => <FriendlyId::SlugString:0x000001013e1590 @wrapped_string="hello-world">
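
One way such bangless variants could be generated is sketched below. This is an illustration only, not the library's implementation: the MiniSlug class is a hypothetical stand-in, and the sketch assumes each bangless method copies the receiver, calls the bang method on the copy, and returns the copy.

```ruby
# Hypothetical stand-in for SlugString; not the library's real code.
class MiniSlug
  attr_reader :wrapped_string

  def initialize(string)
    @wrapped_string = string.to_s
  end

  # In-place bang method: changes the receiver and returns a plain String.
  def with_dashes!
    @wrapped_string = @wrapped_string.gsub(/[\s\-]+/, '-')
  end

  # Assumption: generate a bangless twin for every public method ending in "!".
  public_instance_methods(false).grep(/!\z/).each do |bang|
    define_method(bang.to_s.chomp("!")) do |*args|
      copy = self.class.new(@wrapped_string)
      copy.send(bang, *args)
      copy
    end
  end
end

slug = MiniSlug.new("hello world")
chained = slug.with_dashes    # a new MiniSlug wrapping "hello-world"
in_place = slug.with_dashes!  # the String "hello-world"; slug itself is now changed
```

Because the bangless call returns the wrapper rather than a String, further class-specific methods can be called on its result.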

@see www.utf8-chartable.de/unicode-utf8-table.pl?utf8=dec Unicode character table
@see ::dump_approximations

Constants

APPROXIMATIONS: All values are Unicode decimal characters or character arrays.

Public Class Methods

dump_approximations()

This method can be used by developers wishing to debug the {APPROXIMATIONS} hashes, which are written in a hard-to-read format.

@return Hash
@example

> ruby -rrubygems -rlib/friendly_id -e 'p FriendlyId::SlugString.dump_approximations'

{:common => {“À”=>“A”, “Á”=>“A”, “Â”=>“A”, “Ã”=>“A”, “Ä”=>“A”, “Å”=>“A”, “Æ”=>“AE”, “Ç”=>“C”, “È”=>“E”, “É”=>“E”, “Ê”=>“E”, “Ë”=>“E”, “Ì”=>“I”, “Í”=>“I”, “Î”=>“I”, “Ï”=>“I”, “Ð”=>“D”, “Ñ”=>“N”, “Ò”=>“O”, “Ó”=>“O”, “Ô”=>“O”, “Õ”=>“O”, “Ö”=>“O”, “×”=>“x”, “Ø”=>“O”, “Ù”=>“U”, “Ú”=>“U”, “Û”=>“U”, “Ü”=>“U”, “Ý”=>“Y”, “Þ”=>“Th”, “ß”=>“ss”, “à”=>“a”, “á”=>“a”, “â”=>“a”, “ã”=>“a”, “ä”=>“a”, “å”=>“a”, “æ”=>“ae”, “ç”=>“c”, “è”=>“e”, “é”=>“e”, “ê”=>“e”, “ë”=>“e”, “ì”=>“i”, “í”=>“i”, “î”=>“i”, “ï”=>“i”, “ð”=>“d”, “ñ”=>“n”, “ò”=>“o”, “ó”=>“o”, “ô”=>“o”, “õ”=>“o”, “ö”=>“o”, “ø”=>“o”, “ù”=>“u”, “ú”=>“u”, “û”=>“u”, “ü”=>“u”, “ý”=>“y”, “þ”=>“th”, “ÿ”=>“y”, “Ā”=>“A”, “ā”=>“a”, “Ă”=>“A”, “ă”=>“a”, “Ą”=>“A”, “ą”=>“a”, “Ć”=>“C”, “ć”=>“c”, “Ĉ”=>“C”, “ĉ”=>“c”, “Ċ”=>“C”, “ċ”=>“c”, “Č”=>“C”, “č”=>“c”, “Ď”=>“D”, “ď”=>“d”, “Đ”=>“D”, “đ”=>“d”, “Ē”=>“E”, “ē”=>“e”, “Ĕ”=>“E”, “ĕ”=>“e”, “Ė”=>“E”, “ė”=>“e”, “Ę”=>“E”, “ę”=>“e”, “Ě”=>“E”, “ě”=>“e”, “Ĝ”=>“G”, “ĝ”=>“g”, “Ğ”=>“G”, “ğ”=>“g”, “Ġ”=>“G”, “ġ”=>“g”, “Ģ”=>“G”, “ģ”=>“g”, “Ĥ”=>“H”, “ĥ”=>“h”, “Ħ”=>“H”, “ħ”=>“h”, “Ĩ”=>“I”, “ĩ”=>“i”, “Ī”=>“I”, “ī”=>“i”, “Ĭ”=>“I”, “ĭ”=>“i”, “Į”=>“I”, “į”=>“i”, “İ”=>“I”, “ı”=>“i”, “Ĳ”=>“IJ”, “ĳ”=>“ij”, “Ĵ”=>“J”, “ĵ”=>“j”, “Ķ”=>“K”, “ķ”=>“k”, “ĸ”=>“k”, “Ĺ”=>“L”, “ĺ”=>“l”, “Ļ”=>“L”, “ļ”=>“l”, “Ľ”=>“L”, “ľ”=>“l”, “Ŀ”=>“L”, “ŀ”=>“l”, “Ł”=>“L”, “ł”=>“l”, “Ń”=>“N”, “ń”=>“n”, “Ņ”=>“N”, “ņ”=>“n”, “Ň”=>“N”, “ň”=>“n”, “ŉ”=>“'n”, “Ŋ”=>“NG”, “ŋ”=>“ng”, “Ō”=>“O”, “ō”=>“o”, “Ŏ”=>“O”, “ŏ”=>“o”, “Ő”=>“O”, “ő”=>“o”, “Œ”=>“OE”, “œ”=>“oe”, “Ŕ”=>“R”, “ŕ”=>“r”, “Ŗ”=>“R”, “ŗ”=>“r”, “Ř”=>“R”, “ř”=>“r”, “Ś”=>“S”, “ś”=>“s”, “Ŝ”=>“S”, “ŝ”=>“s”, “Ş”=>“S”, “ş”=>“s”, “Š”=>“S”, “š”=>“s”, “Ţ”=>“T”, “ţ”=>“t”, “Ť”=>“T”, “ť”=>“t”, “Ŧ”=>“T”, “ŧ”=>“t”, “Ũ”=>“U”, “ũ”=>“u”, “Ū”=>“U”, “ū”=>“u”, “Ŭ”=>“U”, “ŭ”=>“u”, “Ů”=>“U”, “ů”=>“u”, “Ű”=>“U”, “ű”=>“u”, “Ų”=>“U”, “ų”=>“u”, “Ŵ”=>“W”, “ŵ”=>“w”, “Ŷ”=>“Y”, “ŷ”=>“y”, “Ÿ”=>“Y”, “Ź”=>“Z”, “ź”=>“z”, “Ż”=>“Z”, “ż”=>“z”, “Ž”=>“Z”, “ž”=>“z”}, :german => {“ü”=>“ue”, “ö”=>“oe”, “ä”=>“ae”}, :spanish => {“Ñ”=>“Nn”, 
“ñ”=>“nn”}}

# File lib/friendly_id/slug_string.rb, line 102
def self.dump_approximations
  Hash[APPROXIMATIONS.map do |name, approx|
    [name, Hash[approx.map {|key, value| [[key].pack("U*"), [value].flatten.pack("U*")]}]]
  end]
end
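
The conversion performed by this method can be tried on a few entries using only core Ruby. In the sketch below, the sample hash is an illustrative copy written in the code-point format described above (its values are taken from the dump output shown earlier); it is not the library's constant.

```ruby
# Illustrative copy of two maps in code-point form (not the real constant).
sample = {
  :common  => { 192 => 65 },                           # À => "A"
  :spanish => { 209 => [78, 110], 241 => [110, 110] }  # Ñ => "Nn", ñ => "nn"
}

# Same transformation as dump_approximations: pack each code point (or
# array of code points) into a UTF-8 string.
readable = Hash[sample.map do |name, approx|
  [name, Hash[approx.map { |key, value| [[key].pack("U*"), [value].flatten.pack("U*")] }]]
end]

p readable
```

A value may be a single integer or an array of integers, which is why each value is wrapped in an array and flattened before packing.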

new(string)

@param string [String] The string to use as the basis of the SlugString.

Calls superclass method

# File lib/friendly_id/slug_string.rb, line 110
def initialize(string)
  super string.to_s
end

Public Instance Methods

approximate_ascii!(*args)

Approximate an ASCII string. This works only for Western strings using characters that are Roman-alphabet characters + diacritics. Non-letter characters are left unmodified.

string = SlugString.new "Łódź, Poland"
string.approximate_ascii                 # => "Lodz, Poland"
string = SlugString.new "日本"
string.approximate_ascii                 # => "日本"

You can pass any key(s) from {APPROXIMATIONS} as arguments. This allows for contextual approximations. By default, :spanish and :german are provided:

string = SlugString.new "Jürgen Müller"
string.approximate_ascii                 # => "Jurgen Muller"
string.approximate_ascii :german        # => "Juergen Mueller"
string = SlugString.new "¡Feliz año!"
string.approximate_ascii                 # => "¡Feliz ano!"
string.approximate_ascii :spanish       # => "¡Feliz anno!"

You can modify the built-in approximations, or add your own:

# Make Spanish use "nh" rather than "nn"
FriendlyId::SlugString::APPROXIMATIONS[:spanish] = {
  # Ñ => "Nh"
  209 => [78, 104],
  # ñ => "nh"
  241 => [110, 104]
}

It's also possible to use a custom approximation for all strings:

FriendlyId::SlugString.approximations << :german

Notice that this method does not simply convert to ASCII; if you want to remove non-ASCII characters such as “¡” and “¿”, use {#to_ascii!}:

string.approximate_ascii!(:spanish)       # => "¡Feliz anno!"
string.to_ascii!                          # => "Feliz anno!"

@param *args <Symbol>
@return String

# File lib/friendly_id/slug_string.rb, line 155
def approximate_ascii!(*args)
  @maps = (self.class.approximations + args + [:common]).flatten.uniq
  @wrapped_string = normalize_utf8(:c).unpack("U*").map { |char| approx_char(char) }.flatten.pack("U*")
end
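
The lookup order built in @maps (requested keys first, :common last) can be illustrated with a standalone sketch. The approx_char helper is not shown in this documentation, so the sketch assumes that the first map containing a character wins; the two-entry maps hash is a hypothetical cut-down copy, not the library's constant.

```ruby
# Hypothetical cut-down maps in code-point form.
maps = {
  :common => { 252 => 117 },        # ü => "u"
  :german => { 252 => [117, 101] }  # ü => "ue"
}

# Assumption: the first map (in order) that knows the character wins.
approximate = lambda do |string, *keys|
  order = (keys + [:common]).uniq
  string.unpack("U*").map { |char|
    name = order.find { |key| maps[key].key?(char) }
    name ? maps[name][char] : char
  }.flatten.pack("U*")
end

plain  = approximate.call("J\u00FCrgen M\u00FCller")           # "Jurgen Muller"
german = approximate.call("J\u00FCrgen M\u00FCller", :german)  # "Juergen Mueller"
```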

clean!()

Removes leading and trailing spaces or dashes, and replaces multiple whitespace characters with a single space.

@return String

# File lib/friendly_id/slug_string.rb, line 163
def clean!
  @wrapped_string = @wrapped_string.gsub(/\A\-|\-\z/, '').gsub(/\s+/u, ' ').strip
end
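
The effect of the expression above can be checked on a plain String. The clean_sketch helper below is a hypothetical name that applies the same three calls outside the class.

```ruby
# Same chain as clean!, applied to an ordinary String.
def clean_sketch(string)
  string.gsub(/\A\-|\-\z/, '').gsub(/\s+/u, ' ').strip
end

tidy   = clean_sketch("-  hello   world -")  # "hello world"
double = clean_sketch("--hello")             # "-hello"
```

As the second call shows, the pattern as written strips at most one dash from each end of the string.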

downcase!()

Lowercases the string. Note that this works for Unicode strings, though your mileage may vary with Greek and Turkic strings.

@return String

# File lib/friendly_id/slug_string.rb, line 170
def downcase!
  @wrapped_string = apply_mapping :lowercase_mapping
end

normalize!()

Normalize the string for use as a FriendlyId. Note that in this context, normalize means stripping, removing non-letters/numbers, downcasing, and converting whitespace to dashes. ActiveSupport::Multibyte::Chars#normalize is aliased to normalize_utf8 in this subclass.

@return String

# File lib/friendly_id/slug_string.rb, line 217
def normalize!
  clean!
  word_chars!
  downcase!
  with_dashes!
end
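
The four steps can be followed on a plain String with the sketch below. It is an approximation under stated assumptions: normalize_sketch is a hypothetical helper, core String#downcase stands in for the class's own lowercase mapping, and the normalize_utf8(:c) call is omitted. The dropped code-point ranges are copied from the word_chars! source in this document.

```ruby
def normalize_sketch(string)
  # Code-point ranges that word_chars! discards.
  dropped = [0..31, 33..44, 46..47, 58..64, 91..96, 123..191]

  s = string.gsub(/\A\-|\-\z/, '').gsub(/\s+/u, ' ').strip   # clean!
  s = s.unpack("U*").reject { |c| dropped.any? { |r| r.cover?(c) } }.pack("U*")  # word_chars!
  s = s.downcase                                             # downcase! (stand-in)
  s.gsub(/[\s\-]+/u, '-')                                    # with_dashes!
end

slug = normalize_sketch("  \u00A1Hola, Se\u00F1or!  ")  # "hola-señor"
```

The "¡" (code point 161) falls in the 123..191 range and is removed, while "ñ" (241) lies outside every dropped range and is kept.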

normalize_for!(config)

Normalize the string for a given {FriendlyId::Configuration}.

@param config [FriendlyId::Configuration]
@return String

# File lib/friendly_id/slug_string.rb, line 199
def normalize_for!(config)
  if config.normalizer?
    @wrapped_string = config.normalizer.call(to_s)
  else
    approximate_ascii! if config.approximate_ascii?
    to_ascii! if config.strip_non_ascii?
    normalize!
  end
end

to_ascii!()

Delete any non-ASCII characters.

@return String

# File lib/friendly_id/slug_string.rb, line 232
def to_ascii!
  @wrapped_string = normalize_utf8(:c).unpack("U*").reject {|char| char > 127}.pack("U*")
end
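
The filtering step can be reproduced on a plain String. The to_ascii_sketch helper is a hypothetical name, and the normalize_utf8(:c) call from the source is left out.

```ruby
# Keep only code points in the 7-bit ASCII range (0..127).
def to_ascii_sketch(string)
  string.unpack("U*").reject { |char| char > 127 }.pack("U*")
end

ascii = to_ascii_sketch("\u00A1Feliz anno!")  # "Feliz anno!"
kanji = to_ascii_sketch("\u65E5\u672C")       # "" (both characters are removed)
```

A string made up entirely of non-ASCII characters becomes empty, so it would presumably be rejected as blank by {#validate_for!} afterwards.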

truncate!(max)

Truncate the string to max length.

@return String

# File lib/friendly_id/slug_string.rb, line 226
def truncate!(max)
  @wrapped_string = self[0...max].to_s if length > max
end
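
Truncation counts characters rather than bytes. The sketch below shows the difference on a plain String in a Ruby version with native encoding support; truncate_sketch is a hypothetical helper.

```ruby
# Slice by character position, only when the string is too long.
def truncate_sketch(string, max)
  string.length > max ? string[0...max] : string
end

text  = "\u65E5\u672C\u8A9E"       # "日本語": 3 characters, 9 bytes in UTF-8
short = truncate_sketch(text, 2)   # "日本"
same  = truncate_sketch(text, 5)   # unchanged, already within the limit
```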

upcase!()

Upper-cases the string. Note that this works for Unicode strings, though your mileage may vary with Greek and Turkic strings.

@return String

# File lib/friendly_id/slug_string.rb, line 239
def upcase!
  @wrapped_string = apply_mapping :uppercase_mapping
end

validate_for!(config)

Validate that the slug string is not blank or reserved, and truncate it to the max length if necessary.

@param config [FriendlyId::Configuration]
@return String
@raise FriendlyId::BlankError
@raise FriendlyId::ReservedError

# File lib/friendly_id/slug_string.rb, line 249
def validate_for!(config)
  truncate!(config.max_length)
  raise FriendlyId::BlankError if blank?
  raise FriendlyId::ReservedError if config.reserved?(self)
  self
end

with_dashes!()

Replaces each run of whitespace and/or dashes with a single dash (“-”).

@return String

# File lib/friendly_id/slug_string.rb, line 258
def with_dashes!
  @wrapped_string = @wrapped_string.gsub(/[\s\-]+/u, '-')
end
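
The substitution can be tried on a plain String. The with_dashes_sketch helper is a hypothetical name for the same gsub call.

```ruby
# Collapse every run of whitespace and/or dashes into one dash.
def with_dashes_sketch(string)
  string.gsub(/[\s\-]+/u, '-')
end

joined  = with_dashes_sketch("hello   world - again")  # "hello-world-again"
leading = with_dashes_sketch(" hello world")           # "-hello-world"
```

Leading whitespace turns into a leading dash, which is one reason {#normalize!} calls {#clean!} before this step.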

word_chars!()

Remove any non-word characters.

@return String

# File lib/friendly_id/slug_string.rb, line 176
def word_chars!
  @wrapped_string = normalize_utf8(:c).unpack("U*").map { |char|
    case char
    # control chars
    when 0..31
    # punctuation; 45 is "-" (HYPHEN-MINUS) and allowed
    when 33..44
    # more punctuation
    when 46..47
    # more punctuation and other symbols
    when 58..64
    # brackets and other symbols
    when 91..96
    # braces, pipe, tilde, etc.
    when 123..191
    else char
    end
  }.compact.pack("U*")
end