Python tf.keras.preprocessing.image.ImageDataGenerator usage and code examples

Generate batches of tensor image data with real-time data augmentation.

Usage

tf.keras.preprocessing.image.ImageDataGenerator(
    featurewise_center=False, samplewise_center=False,
    featurewise_std_normalization=False, samplewise_std_normalization=False,
    zca_whitening=False, zca_epsilon=1e-06, rotation_range=0, width_shift_range=0.0,
    height_shift_range=0.0, brightness_range=None, shear_range=0.0, zoom_range=0.0,
    channel_shift_range=0.0, fill_mode='nearest', cval=0.0,
    horizontal_flip=False, vertical_flip=False, rescale=None,
    preprocessing_function=None, data_format=None, validation_split=0.0, dtype=None
)

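Arguments

featurewise_center Boolean. Set the input mean to 0 over the dataset, feature-wise.
samplewise_center Boolean. Set each sample mean to 0.
featurewise_std_normalization Boolean. Divide inputs by the std of the dataset, feature-wise.
samplewise_std_normalization Boolean. Divide each input by its std.
zca_epsilon Epsilon for ZCA whitening. Default is 1e-6.
zca_whitening Boolean. Apply ZCA whitening.
rotation_range Int. Degree range for random rotations.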
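width_shift_range
Float, 1-D array-like or int
- float: fraction of total width if < 1, or pixels if >= 1.
- 1-D array-like: random elements from the array.
- int: integer number of pixels from the interval (-width_shift_range, +width_shift_range).
- With width_shift_range=2, the possible values are the integers [-1, 0, +1], the same as with width_shift_range=[-1, 0, +1], while with width_shift_range=1.0 the possible values are floats in the interval [-1.0, +1.0).
height_shift_range
Float, 1-D array-like or int
- float: fraction of total height if < 1, or pixels if >= 1.
- 1-D array-like: random elements from the array.
- int: integer number of pixels from the interval (-height_shift_range, +height_shift_range).
- With height_shift_range=2, the possible values are the integers [-1, 0, +1], the same as with height_shift_range=[-1, 0, +1], while with height_shift_range=1.0 the possible values are floats in the interval [-1.0, +1.0).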
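
The three accepted forms of width_shift_range and height_shift_range can be sketched in NumPy as follows. The helper sample_shift is hypothetical and not part of Keras; it only mirrors the description above, assumes a positive range, and the library's actual sampling code may differ in detail.

```python
import numpy as np

def sample_shift(shift_range, size, rng):
    """Hypothetical helper: draw one shift, in pixels, for an axis of `size` pixels.

    Assumes shift_range is positive (or a non-empty 1-D sequence).
    """
    if isinstance(shift_range, (int, np.integer)):
        # int: integer pixel offsets strictly inside (-shift_range, +shift_range)
        return int(rng.integers(-shift_range + 1, shift_range))
    if np.ndim(shift_range) == 1:
        # 1-D array-like: pick one of the listed elements
        return rng.choice(np.asarray(shift_range))
    # float: uniform draw; a range below 1 is a fraction of the axis size
    shift = rng.uniform(-shift_range, shift_range)
    return shift * size if shift_range < 1 else shift

rng = np.random.default_rng(0)
print({sample_shift(2, 100, rng) for _ in range(50)})   # drawn from -1, 0, 1
print(sample_shift(0.2, 100, rng))                      # between -20 and 20 pixels
print(sample_shift(1.0, 100, rng))                      # between -1 and 1 pixel
```

Note that 2 (an int) and 2.0 (a float) are therefore not interchangeable: the first yields whole-pixel offsets, the second a continuous range of pixel offsets.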
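brightness_range Tuple or list of two floats. Range from which a brightness shift value is picked.
shear_range Float. Shear intensity (shear angle in the counter-clockwise direction, in degrees).
zoom_range Float or [lower, upper]. Range for random zoom. If a float, [lower, upper] = [1-zoom_range, 1+zoom_range].
channel_shift_range Float. Range for random channel shifts.
fill_mode
One of {"constant", "nearest", "reflect" or "wrap"}. Default is 'nearest'. Points outside the boundaries of the input are filled according to the given mode:
- 'constant': kkkkkkkk|abcd|kkkkkkkk (cval=k)
- 'nearest': aaaaaaaa|abcd|dddddddd
- 'reflect': abcddcba|abcd|dcbaabcd
- 'wrap': abcdabcd|abcd|abcdabcd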
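
The four patterns can be reproduced on a 1-D array with numpy.pad. This is only an analogy: the generator fills pixels exposed by a transformation rather than padding arrays, and NumPy's mode names differ ('edge' plays the role of 'nearest', and 'symmetric' produces the 'reflect' pattern shown above). The mapping dict is an illustrative assumption, not a Keras API.

```python
import numpy as np

row = np.array([1, 2, 3, 4])  # stands for the pixels a, b, c, d

# Keras fill_mode name -> numpy.pad keyword arguments giving the same pattern
numpy_equivalent = {
    "constant": dict(mode="constant", constant_values=0),  # corresponds to cval=0
    "nearest": dict(mode="edge"),
    "reflect": dict(mode="symmetric"),
    "wrap": dict(mode="wrap"),
}

# Pad four values on each side so one full repetition of the pattern is visible
padded = {name: np.pad(row, 4, **kwargs) for name, kwargs in numpy_equivalent.items()}
for name, values in padded.items():
    print(f"{name:>8}: {values.tolist()}")
```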
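cval Float or Int. Value used for points outside the boundaries when fill_mode = "constant".
horizontal_flip Boolean. Randomly flip inputs horizontally.
vertical_flip Boolean. Randomly flip inputs vertically.
rescale Rescaling factor. Defaults to None. If None or 0, no rescaling is applied; otherwise the data is multiplied by the value provided (after applying all other transformations).
preprocessing_function Function that will be applied to each input. The function runs after the image is resized and augmented. It should take one argument, a single image (a Numpy tensor of rank 3), and should output a Numpy tensor with the same shape.
data_format Image data format, either "channels_first" or "channels_last". "channels_last" mode means that the images should have shape (samples, height, width, channels); "channels_first" mode means that the images should have shape (samples, channels, height, width). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, it will be "channels_last".
validation_split Float. Fraction of images reserved for validation (strictly between 0 and 1).
dtype Dtype to use for the generated arrays.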
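
The difference between the feature-wise and sample-wise options can be sketched in plain NumPy. This is a minimal illustration, assuming "feature-wise" means one statistic per channel computed over the whole dataset and "sample-wise" means one statistic per image; the small epsilon added to the std is also an assumption, used here only to avoid division by zero.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy dataset in channels_last layout: (samples, height, width, channels)
x = rng.uniform(0, 255, size=(8, 4, 4, 3))

# Feature-wise: one mean/std per channel, computed across the whole dataset
channel_mean = x.mean(axis=(0, 1, 2), keepdims=True)
channel_std = x.std(axis=(0, 1, 2), keepdims=True)
x_featurewise = (x - channel_mean) / (channel_std + 1e-6)

# Sample-wise: one mean/std per image
sample_mean = x.mean(axis=(1, 2, 3), keepdims=True)
sample_std = x.std(axis=(1, 2, 3), keepdims=True)
x_samplewise = (x - sample_mean) / (sample_std + 1e-6)

print(x_featurewise.mean(axis=(0, 1, 2)))  # close to 0 for every channel
print(x_samplewise.mean(axis=(1, 2, 3)))   # close to 0 for every image
```

Feature-wise statistics depend on the dataset, which is why the generator needs a call to fit on sample data before those options take effect; sample-wise statistics need no such step.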

Raises

ValueError If the value of the argument data_format is other than "channels_last" or "channels_first".
ValueError If the value of the argument validation_split is > 1 or < 0.

The data will be looped over (in batches) indefinitely.
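
In practice, looping means the iterator never stops on its own, so the consumer decides how many batches make up an epoch. The generator below is a hypothetical stand-in written with the standard library only; it is not the Keras iterator and it does not model shuffling.

```python
import itertools
import math

def endless_batches(samples, batch_size):
    """Hypothetical stand-in: yield batches forever, wrapping around the data."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]

samples = list(range(10))
batch_size = 4
steps_per_epoch = math.ceil(len(samples) / batch_size)  # 3 batches cover the data once

# Take exactly two epochs' worth of batches; without a limit the loop never ends
two_epochs = list(
    itertools.islice(endless_batches(samples, batch_size), 2 * steps_per_epoch)
)
print(two_epochs)
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9], [0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```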

Examples:

Example of using .flow(x, y):

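from tensorflow.keras import utils
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# `model` (a compiled Keras model) and `epochs` are assumed to be defined elsewhere
num_classes = 10  # CIFAR-10 has 10 classes
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
y_train = utils.to_categorical(y_train, num_classes)
y_test = utils.to_categorical(y_test, num_classes)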
datagen = ImageDataGenerator(
    featurewise_center=True,
    featurewise_std_normalization=True,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    validation_split=0.2)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(x_train)
# fits the model on batches with real-time data augmentation:
model.fit(datagen.flow(x_train, y_train, batch_size=32,
         subset='training'),
         validation_data=datagen.flow(x_train, y_train,
         batch_size=8, subset='validation'),
         steps_per_epoch=len(x_train) // 32, epochs=epochs)
# here's a more "manual" example
for e in range(epochs):
    print('Epoch', e)
    batches = 0
    for x_batch, y_batch in datagen.flow(x_train, y_train, batch_size=32):
        model.fit(x_batch, y_batch)
        batches += 1
        if batches >= len(x_train) / 32:
            # we need to break the loop by hand because
            # the generator loops indefinitely
            break
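
The subset='training' and subset='validation' arguments in the example above draw from the same array, partitioned by validation_split. The sketch below shows one way such a partition can work for a dataset of 50,000 samples (the size of the CIFAR-10 training set). The helper split_for_subset is hypothetical, and the choice that validation takes the leading samples with no shuffling beforehand is an assumption about the behavior, not something stated on this page.

```python
import math
import numpy as np

def split_for_subset(x, validation_split, subset):
    """Hypothetical helper: return the part of x used for the given subset."""
    if not 0 < validation_split < 1:
        raise ValueError("validation_split must be strictly between 0 and 1")
    split_idx = int(len(x) * validation_split)
    # Assumption: validation takes the leading samples, training the rest
    return x[:split_idx] if subset == "validation" else x[split_idx:]

x = np.arange(50000)
train = split_for_subset(x, 0.2, "training")
val = split_for_subset(x, 0.2, "validation")

print(len(train), len(val))          # 40000 10000
print(math.ceil(len(train) / 32))    # 1250 batches of 32 cover the training subset once
```

Under these assumptions, one pass over the training subset takes fewer batches than one pass over the full array, which is worth keeping in mind when choosing steps_per_epoch.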

Example of using .flow_from_directory(directory):

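from tensorflow.keras.preprocessing.image import ImageDataGenerator

# `model` is assumed to be a compiled Keras model defined elsewhere
train_datagen = ImageDataGenerator(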
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
        'data/train',
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')
validation_generator = test_datagen.flow_from_directory(
        'data/validation',
        target_size=(150, 150),
        batch_size=32,
        class_mode='binary')
model.fit(
        train_generator,
        steps_per_epoch=2000,
        epochs=50,
        validation_data=validation_generator,
        validation_steps=800)
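
The example above assumes that 'data/train' and 'data/validation' each contain one subdirectory per class. The standard-library sketch below builds such a layout in a temporary directory and derives a name-to-index mapping in alphanumeric order of the subdirectory names. The class names "cats" and "dogs" are hypothetical, and empty files stand in for real images; nothing here calls Keras.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    train_dir = Path(tmp) / "data" / "train"
    # Hypothetical class names; empty files stand in for real images
    for class_name, count in [("dogs", 3), ("cats", 2)]:
        class_dir = train_dir / class_name
        class_dir.mkdir(parents=True)
        for i in range(count):
            (class_dir / f"{class_name}_{i}.jpg").touch()

    # One subdirectory per class; label indices follow alphanumeric order
    class_names = sorted(p.name for p in train_dir.iterdir() if p.is_dir())
    class_indices = {name: index for index, name in enumerate(class_names)}
    files_per_class = {
        name: len(list((train_dir / name).glob("*.jpg"))) for name in class_names
    }

print(class_indices)    # {'cats': 0, 'dogs': 1}
print(files_per_class)  # {'cats': 2, 'dogs': 3}
```

With exactly two class subdirectories, class_mode='binary' as used above is a natural fit.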

Example of transforming images and masks together.

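from tensorflow.keras.preprocessing.image import ImageDataGenerator

# `images`, `masks` and `model` are assumed to be defined elsewhere
# we create two instances with the same arguments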
data_gen_args = dict(featurewise_center=True,
                     featurewise_std_normalization=True,
                     rotation_range=90,
                     width_shift_range=0.1,
                     height_shift_range=0.1,
                     zoom_range=0.2)
image_datagen = ImageDataGenerator(**data_gen_args)
mask_datagen = ImageDataGenerator(**data_gen_args)
# Provide the same seed and keyword arguments to the fit and flow methods
seed = 1
image_datagen.fit(images, augment=True, seed=seed)
mask_datagen.fit(masks, augment=True, seed=seed)
image_generator = image_datagen.flow_from_directory(
    'data/images',
    class_mode=None,
    seed=seed)
mask_generator = mask_datagen.flow_from_directory(
    'data/masks',
    class_mode=None,
    seed=seed)
# combine generators into one which yields image and masks
train_generator = zip(image_generator, mask_generator)
model.fit(
    train_generator,
    steps_per_epoch=2000,
    epochs=50)
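
The shared seed is what keeps each image aligned with its mask: two random streams started from the same seed produce the same sequence of transformation parameters. The sketch below illustrates that idea with the standard library only. The helper random_transform_params is hypothetical, and its parameter ranges are merely chosen to echo data_gen_args; it does not reproduce how Keras draws its parameters.

```python
import random

def random_transform_params(seed, n):
    """Hypothetical helper: draw n (rotation, shift, zoom) parameter triples."""
    rng = random.Random(seed)
    return [
        (rng.uniform(-90, 90), rng.uniform(-0.1, 0.1), rng.uniform(0.8, 1.2))
        for _ in range(n)
    ]

image_params = random_transform_params(seed=1, n=5)
mask_params = random_transform_params(seed=1, n=5)
other_params = random_transform_params(seed=2, n=5)

print(image_params == mask_params)   # True: same seed, same transformations
print(image_params == other_params)  # False: a different seed breaks the pairing
```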


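Note: This article was selected and compiled by 純淨天空 from the original English work tf.keras.preprocessing.image.ImageDataGenerator on tensorflow.org. Unless otherwise stated, the copyright of the original code belongs to the original author; please do not reproduce or copy this translation without permission or authorization.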