Ruby String.unpack用法及代码示例

本文简要介绍ruby语言中 String.unpack 的用法。

用法

unpack(format) → anArray

unpack(format, offset: anInteger) → anArray

根据格式字符串解码str(可能包含二进制数据)，返回每个提取值的数组。格式字符串由一系列单字符指令组成，总结在本条目末尾的表中。每个指令后面可以跟一个数字，表示使用该指令重复的次数。星号(“*”)将用完所有剩余的元素。 sSiIlL 指令后面可以跟一个下划线(“_”)或感叹号(“!”)，以使用指定类型的底层平台的本机大小；否则，它使用与平台无关的一致大小。格式字符串中的空格被忽略。

另见 String#unpack1 、 Array#pack 。

"abc \0\0abc \0\0".unpack('A6Z6')   #=> ["abc", "abc "]
"abc \0\0".unpack('a3a3')           #=> ["abc", " \000\000"]
"abc \0abc \0".unpack('Z*Z*')       #=> ["abc ", "abc "]
"aa".unpack('b8B8')                 #=> ["10000110", "01100001"]
"aaa".unpack('h2H2c')               #=> ["16", "61", 97]
"\xfe\xff\xfe\xff".unpack('sS')     #=> [-2, 65534]
"now=20is".unpack('M*')             #=> ["now is"]
"whole".unpack('xax2aX2aX1aX2a')    #=> ["h", "e", "l", "l", "o"]

此表总结了各种格式和每种格式返回的 Ruby 类。

Integer       |         |
Directive     | Returns | Meaning
------------------------------------------------------------------
C             | Integer | 8-bit unsigned (unsigned char)
S             | Integer | 16-bit unsigned, native endian (uint16_t)
L             | Integer | 32-bit unsigned, native endian (uint32_t)
Q             | Integer | 64-bit unsigned, native endian (uint64_t)
J             | Integer | pointer width unsigned, native endian (uintptr_t)
              |         |
c             | Integer | 8-bit signed (signed char)
s             | Integer | 16-bit signed, native endian (int16_t)
l             | Integer | 32-bit signed, native endian (int32_t)
q             | Integer | 64-bit signed, native endian (int64_t)
j             | Integer | pointer width signed, native endian (intptr_t)
              |         |
S_ S!         | Integer | unsigned short, native endian
I I_ I!       | Integer | unsigned int, native endian
L_ L!         | Integer | unsigned long, native endian
Q_ Q!         | Integer | unsigned long long, native endian (ArgumentError
              |         | if the platform has no long long type.)
J!            | Integer | uintptr_t, native endian (same with J)
              |         |
s_ s!         | Integer | signed short, native endian
i i_ i!       | Integer | signed int, native endian
l_ l!         | Integer | signed long, native endian
q_ q!         | Integer | signed long long, native endian (ArgumentError
              |         | if the platform has no long long type.)
j!            | Integer | intptr_t, native endian (same with j)
              |         |
S> s> S!> s!> | Integer | same as the directives without ">" except
L> l> L!> l!> |         | big endian
I!> i!>       |         |
Q> q> Q!> q!> |         | "S>" is the same as "n"
J> j> J!> j!> |         | "L>" is the same as "N"
              |         |
S< s< S!< s!< | Integer | same as the directives without "<" except
L< l< L!< l!< |         | little endian
I!< i!<       |         |
Q< q< Q!< q!< |         | "S<" is the same as "v"
J< j< J!< j!< |         | "L<" is the same as "V"
              |         |
n             | Integer | 16-bit unsigned, network (big-endian) byte order
N             | Integer | 32-bit unsigned, network (big-endian) byte order
v             | Integer | 16-bit unsigned, VAX (little-endian) byte order
V             | Integer | 32-bit unsigned, VAX (little-endian) byte order
              |         |
U             | Integer | UTF-8 character
w             | Integer | BER-compressed integer (see Array#pack)

Float        |         |
Directive    | Returns | Meaning
-----------------------------------------------------------------
D d          | Float   | double-precision, native format
F f          | Float   | single-precision, native format
E            | Float   | double-precision, little-endian byte order
e            | Float   | single-precision, little-endian byte order
G            | Float   | double-precision, network (big-endian) byte order
g            | Float   | single-precision, network (big-endian) byte order

String       |         |
Directive    | Returns | Meaning
-----------------------------------------------------------------
A            | String  | arbitrary binary string (remove trailing nulls and ASCII spaces)
a            | String  | arbitrary binary string
Z            | String  | null-terminated string
B            | String  | bit string (MSB first)
b            | String  | bit string (LSB first)
H            | String  | hex string (high nibble first)
h            | String  | hex string (low nibble first)
u            | String  | UU-encoded string
M            | String  | quoted-printable, MIME encoding (see RFC2045)
m            | String  | base64 encoded string (RFC 2045) (default)
             |         | base64 encoded string (RFC 4648) if followed by 0
P            | String  | pointer to a structure (fixed-length string)
p            | String  | pointer to a null-terminated string

Misc.        |         |
Directive    | Returns | Meaning
-----------------------------------------------------------------
@            | ---     | skip to the offset given by the length argument
X            | ---     | skip backward one byte
x            | ---     | skip forward one byte

跳过指定字节数后，可以给出关键字offset开始解码：

"abc".unpack("C*") # => [97, 98, 99]
"abc".unpack("C*", offset: 2) # => [99]
"abc".unpack("C*", offset: 4) # => offset outside of string (ArgumentError)

HISTORY

杰，杰！ j, 和 j!从 Ruby 2.3 开始可用。
Q_、Q!、q_ 和 q!从 Ruby 2.1 开始可用。
I!<、i!<、I!> 和 i!> 从 Ruby 1.9.3 开始可用。

相关用法

注：本文由纯净天空筛选整理自ruby-lang.org大神的英文原创作品 String.unpack。非经特殊声明，原始代码版权归原作者所有，本译文未经允许或授权，请勿转载或复制。