Ruby String.unpack用法及代碼示例

本文簡要介紹ruby語言中 String.unpack 的用法。

用法

unpack(format) → anArray

unpack(format, offset: anInteger) → anArray

根據格式字符串解碼str(可能包含二進製數據)，返回每個提取值的數組。格式字符串由一係列單字符指令組成，總結在本條目末尾的表中。每個指令後麵可以跟一個數字，表示使用該指令重複的次數。星號(“*”)將用完所有剩餘的元素。 sSiIlL 指令後麵可以跟一個下劃線(“_”)或感歎號(“!”)，以使用指定類型的底層平台的本機大小；否則，它使用與平台無關的一致大小。格式字符串中的空格被忽略。

另見 String#unpack1 、 Array#pack 。

"abc \0\0abc \0\0".unpack('A6Z6')   #=> ["abc", "abc "]
"abc \0\0".unpack('a3a3')           #=> ["abc", " \000\000"]
"abc \0abc \0".unpack('Z*Z*')       #=> ["abc ", "abc "]
"aa".unpack('b8B8')                 #=> ["10000110", "01100001"]
"aaa".unpack('h2H2c')               #=> ["16", "61", 97]
"\xfe\xff\xfe\xff".unpack('sS')     #=> [-2, 65534]
"now=20is".unpack('M*')             #=> ["now is"]
"whole".unpack('xax2aX2aX1aX2a')    #=> ["h", "e", "l", "l", "o"]

此表總結了各種格式和每種格式返回的 Ruby 類。

Integer       |         |
Directive     | Returns | Meaning
------------------------------------------------------------------
C             | Integer | 8-bit unsigned (unsigned char)
S             | Integer | 16-bit unsigned, native endian (uint16_t)
L             | Integer | 32-bit unsigned, native endian (uint32_t)
Q             | Integer | 64-bit unsigned, native endian (uint64_t)
J             | Integer | pointer width unsigned, native endian (uintptr_t)
              |         |
c             | Integer | 8-bit signed (signed char)
s             | Integer | 16-bit signed, native endian (int16_t)
l             | Integer | 32-bit signed, native endian (int32_t)
q             | Integer | 64-bit signed, native endian (int64_t)
j             | Integer | pointer width signed, native endian (intptr_t)
              |         |
S_ S!         | Integer | unsigned short, native endian
I I_ I!       | Integer | unsigned int, native endian
L_ L!         | Integer | unsigned long, native endian
Q_ Q!         | Integer | unsigned long long, native endian (ArgumentError
              |         | if the platform has no long long type.)
J!            | Integer | uintptr_t, native endian (same with J)
              |         |
s_ s!         | Integer | signed short, native endian
i i_ i!       | Integer | signed int, native endian
l_ l!         | Integer | signed long, native endian
q_ q!         | Integer | signed long long, native endian (ArgumentError
              |         | if the platform has no long long type.)
j!            | Integer | intptr_t, native endian (same with j)
              |         |
S> s> S!> s!> | Integer | same as the directives without ">" except
L> l> L!> l!> |         | big endian
I!> i!>       |         |
Q> q> Q!> q!> |         | "S>" is the same as "n"
J> j> J!> j!> |         | "L>" is the same as "N"
              |         |
S< s< S!< s!< | Integer | same as the directives without "<" except
L< l< L!< l!< |         | little endian
I!< i!<       |         |
Q< q< Q!< q!< |         | "S<" is the same as "v"
J< j< J!< j!< |         | "L<" is the same as "V"
              |         |
n             | Integer | 16-bit unsigned, network (big-endian) byte order
N             | Integer | 32-bit unsigned, network (big-endian) byte order
v             | Integer | 16-bit unsigned, VAX (little-endian) byte order
V             | Integer | 32-bit unsigned, VAX (little-endian) byte order
              |         |
U             | Integer | UTF-8 character
w             | Integer | BER-compressed integer (see Array#pack)

Float        |         |
Directive    | Returns | Meaning
-----------------------------------------------------------------
D d          | Float   | double-precision, native format
F f          | Float   | single-precision, native format
E            | Float   | double-precision, little-endian byte order
e            | Float   | single-precision, little-endian byte order
G            | Float   | double-precision, network (big-endian) byte order
g            | Float   | single-precision, network (big-endian) byte order

String       |         |
Directive    | Returns | Meaning
-----------------------------------------------------------------
A            | String  | arbitrary binary string (remove trailing nulls and ASCII spaces)
a            | String  | arbitrary binary string
Z            | String  | null-terminated string
B            | String  | bit string (MSB first)
b            | String  | bit string (LSB first)
H            | String  | hex string (high nibble first)
h            | String  | hex string (low nibble first)
u            | String  | UU-encoded string
M            | String  | quoted-printable, MIME encoding (see RFC2045)
m            | String  | base64 encoded string (RFC 2045) (default)
             |         | base64 encoded string (RFC 4648) if followed by 0
P            | String  | pointer to a structure (fixed-length string)
p            | String  | pointer to a null-terminated string

Misc.        |         |
Directive    | Returns | Meaning
-----------------------------------------------------------------
@            | ---     | skip to the offset given by the length argument
X            | ---     | skip backward one byte
x            | ---     | skip forward one byte

跳過指定字節數後，可以給出關鍵字offset開始解碼：

"abc".unpack("C*") # => [97, 98, 99]
"abc".unpack("C*", offset: 2) # => [99]
"abc".unpack("C*", offset: 4) # => offset outside of string (ArgumentError)

HISTORY

傑，傑！ j, 和 j!從 Ruby 2.3 開始可用。
Q_、Q!、q_ 和 q!從 Ruby 2.1 開始可用。
I!<、i!<、I!> 和 i!> 從 Ruby 1.9.3 開始可用。

相關用法

注：本文由純淨天空篩選整理自ruby-lang.org大神的英文原創作品 String.unpack。非經特殊聲明，原始代碼版權歸原作者所有，本譯文未經允許或授權，請勿轉載或複製。