Ruby Enumerable.slice_before: Usage and Code Examples

This article briefly introduces the usage of Enumerable.slice_before in the Ruby language.

Usage

slice_before(pattern) → enumerator

slice_before {|array| ... } → enumerator

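With argument pattern, returns an enumerator that uses the pattern to partition elements into arrays ("slices"). An element begins a new slice if pattern === element is true (or if it is the first element). Note that the pattern is the receiver of ===, which is why a Regexp works as a pattern against String elements.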

a = %w[foo bar fop for baz fob fog bam foy]
e = a.slice_before(/ba/) # => #<Enumerator: ...>
e.each {|array| p array }

Output:

["foo"]
["bar", "fop", "for"]
["baz", "fob", "fog"]
["bam", "foy"]
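Because the comparison is made with ===, the pattern does not have to be a regular expression. As a small sketch (the sample array is made up for illustration), a class can serve as the pattern, so that every String element starts a new slice:

```ruby
# Hypothetical mixed data; each String element starts a new slice
# because String === element is true for it.
mixed = [1, "a", 2, 3, "b", 4]
slices = mixed.slice_before(String).to_a
p slices # => [[1], ["a", 2, 3], ["b", 4]]
```

The leading 1 forms its own slice because the first element always opens a slice, whether or not it matches.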

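With a block, returns an enumerator that uses the block to partition elements into arrays. An element begins a new slice if the block returns a truthy value for it (or if it is the first element):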

e = (1..20).slice_before {|i| i % 4 == 2 } # => #<Enumerator: ...>
e.each {|array| p array }

Output:

[1]
[2, 3, 4, 5]
[6, 7, 8, 9]
[10, 11, 12, 13]
[14, 15, 16, 17]
[18, 19, 20]

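Other methods of the Enumerator class and the Enumerable module, such as to_a and map, can also be used on the result.

For instance, iteration over ChangeLog entries can be implemented as follows: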
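A minimal sketch of chaining such methods onto the returned enumerator (the range and the block condition are chosen only for illustration):

```ruby
# A new slice starts at 1, 4, 7 and 10.
enum = (1..10).slice_before { |i| i % 3 == 1 }

# to_a materializes the slices themselves.
slices = enum.to_a
p slices # => [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10]]

# map transforms each slice; here every slice is summed.
sums = enum.map(&:sum)
p sums # => [6, 15, 24, 10]
```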

# iterate over ChangeLog entries.
open("ChangeLog") { |f|
  f.slice_before(/\A\S/).each { |e| pp e }
}

# same as above.  block is used instead of pattern argument.
open("ChangeLog") { |f|
  f.slice_before { |line| /\A\S/ === line }.each { |e| pp e }
}
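To try the same idea without a ChangeLog file on disk, the sketch below uses StringIO, which is also Enumerable and yields lines. The log text is invented for illustration and only assumes that entry headers start in column 0 while detail lines are indented:

```ruby
require "stringio"

# Made-up ChangeLog text: header lines start with a non-space character,
# detail lines start with a tab.
text = "Mon Jan  1  Alice\n" \
       "\t* foo.c: fix typo.\n" \
       "Tue Jan  2  Bob\n" \
       "\t* bar.c: add check.\n" \
       "\t* baz.c: remove dead code.\n"

# Each line matching /\A\S/ (a header) opens a new entry.
entries = StringIO.new(text).slice_before(/\A\S/).to_a
entries.each { |e| pp e }
```

With this input, entries holds two arrays: one with two lines and one with three.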

The output of "svn proplist -R" spans multiple lines per file. These lines can be chunked as follows:

IO.popen([{"LC_ALL"=>"C"}, "svn", "proplist", "-R"]) { |f|
  # each_line replaces the deprecated IO#lines, which newer Ruby versions no longer provide.
  f.each_line.slice_before(/\AProp/).each { |lines| p lines }
}
#=> ["Properties on '.':\n", "  svn:ignore\n", "  svk:merge\n"]
#   ["Properties on 'goruby.c':\n", "  svn:eol-style\n"]
#   ["Properties on 'complex.c':\n", "  svn:mime-type\n", "  svn:eol-style\n"]
#   ["Properties on 'regparse.c':\n", "  svn:eol-style\n"]
#   ...

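If the block needs to maintain state across multiple elements, local variables can be used. For example, runs of three or more consecutive increasing numbers can be compressed as follows (see chunk_while for a better way):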

a = [0, 2, 3, 4, 6, 7, 9]
prev = a[0]
p a.slice_before { |e|
  prev, prev2 = e, prev
  prev2 + 1 != e
}.map { |es|
  es.length <= 2 ? es.join(",") : "#{es.first}-#{es.last}"
}.join(",")
#=> "0,2-4,6,7,9"
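For comparison, here is a sketch of the chunk_while alternative mentioned above. It compares each adjacent pair directly, so no external prev variable is needed; the formatting step is copied unchanged from the slice_before version:

```ruby
a = [0, 2, 3, 4, 6, 7, 9]

# chunk_while keeps x and y in the same chunk while the block is true,
# i.e. while the numbers are consecutive.
result = a.chunk_while { |x, y| x + 1 == y }.map { |es|
  es.length <= 2 ? es.join(",") : "#{es.first}-#{es.last}"
}.join(",")
p result # => "0,2-4,6,7,9"
```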

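However, local variables should be used with care if the resulting enumerator is enumerated twice or more, because the variables must be initialized anew for each enumeration. Enumerator.new can be used to do this.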

# Word wrapping.  This assumes all characters have same width.
def wordwrap(words, maxwidth)
  Enumerator.new {|y|
    # cols is initialized in Enumerator.new.
    cols = 0
    words.slice_before { |w|
      cols += 1 if cols != 0
      cols += w.length
      if maxwidth < cols
        cols = w.length
        true
      else
        false
      end
    }.each {|ws| y.yield ws }
  }
end
text = (1..20).to_a.join(" ")
enum = wordwrap(text.split(/\s+/), 10)
puts "-"*10
enum.each { |ws| puts ws.join(" ") } # first enumeration.
puts "-"*10
enum.each { |ws| puts ws.join(" ") } # second enumeration generates same result as the first.
puts "-"*10
#=> ----------
#   1 2 3 4 5
#   6 7 8 9 10
#   11 12 13
#   14 15 16
#   17 18 19
#   20
#   ----------
#   1 2 3 4 5
#   6 7 8 9 10
#   11 12 13
#   14 15 16
#   17 18 19
#   20
#   ----------
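To see why the per-enumeration initialization matters, the following sketch contrasts a counter kept outside the enumerator with one initialized inside Enumerator.new. The names count, shared and fresh are made up for this illustration:

```ruby
letters = %w[a b c d e]

# Shared state: count lives outside the enumerator and is never reset,
# so the second enumeration continues counting from 5.
count = 0
shared = letters.slice_before { count += 1; count.odd? }
first  = shared.to_a
second = shared.to_a
p first  # => [["a", "b"], ["c", "d"], ["e"]]
p second # => [["a"], ["b", "c"], ["d", "e"]]

# Per-enumeration state: n is re-initialized every time the
# Enumerator.new block runs, so each enumeration gives the same slices.
fresh = Enumerator.new { |y|
  n = 0
  letters.slice_before { n += 1; n.odd? }.each { |s| y << s }
}
p fresh.to_a == fresh.to_a # => true
```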

An mbox file contains a series of mails, each starting with a Unix From line. Each mail can therefore be extracted by slicing before the Unix From line.

# parse mbox
open("mbox") { |f|
  f.slice_before { |line|
    line.start_with? "From "
  }.each { |mail|
    unix_from = mail.shift
    i = mail.index("\n")
    header = mail[0...i]
    body = mail[(i+1)..-1]
    body.pop if body.last == "\n"
    fields = header.slice_before { |line| !" \t".include?(line[0]) }.to_a
    p unix_from
    pp fields
    pp body
  }
}
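The inner slice_before in the parser above groups folded header lines: a line that starts with a space or tab continues the previous field. A small sketch of that step on its own, using invented header lines:

```ruby
# Made-up header lines; the second line is a continuation of the first.
header = [
  "Subject: a rather long\n",
  "  subject line\n",
  "To: someone@example.com\n"
]

# A line opens a new field unless its first character is a space or tab.
fields = header.slice_before { |line| !" \t".include?(line[0]) }.to_a
pp fields
```

Here fields contains two arrays: the Subject line together with its continuation, and the To line alone.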

# split mails in mbox (slice before Unix From line after an empty line)
open("mbox") { |f|
  emp = true
  f.slice_before { |line|
    prevemp = emp
    emp = line == "\n"
    prevemp && line.start_with?("From ")
  }.each { |mail|
    mail.pop if mail.last == "\n"
    pp mail
  }
}

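Note: This article was selected and compiled by 纯净天空 from the original English documentation Enumerable.slice_before on ruby-lang.org. Unless otherwise stated, copyright in the original code belongs to the original author. Please do not reproduce or copy this translation without permission or authorization.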