This article collects typical usage examples of keras.preprocessing in Python. If you are wondering what keras.preprocessing is for, how to call it, or what real examples of it look like, the curated code samples below may help. You can also browse further usage examples from the parent package keras, to which this module belongs.
Two code examples of keras.preprocessing are shown below, sorted by popularity by default. You can upvote the examples you like or find useful; your ratings help the system recommend better Python code examples.
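Before the examples, here is a minimal sketch of typical keras.preprocessing usage: tokenizing raw text and padding the resulting sequences. The sample sentences and the num_words/maxlen values are illustrative assumptions, and the sketch presumes a Keras version (e.g. Keras 2.x) in which the keras.preprocessing submodule is still available:

import keras

# Illustrative data; any list of strings works here.
texts = ["the cat sat on the mat", "the dog ate my homework"]

# Build a word-index vocabulary and convert the texts to integer sequences.
tokenizer = keras.preprocessing.text.Tokenizer(num_words=1000)
tokenizer.fit_on_texts(texts)
seqs = tokenizer.texts_to_sequences(texts)

# Pad/truncate every sequence to a fixed length for batching.
padded = keras.preprocessing.sequence.pad_sequences(seqs, maxlen=10)
print(padded.shape)  # (2, 10)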
Example 1: clean_module_name
# Required import: import keras [as alias]
# or: from keras import preprocessing [as alias]
def clean_module_name(name):
    # Map standalone keras_applications / keras_preprocessing module paths
    # back to their dotted keras.* equivalents.
    if name.startswith('keras_applications'):
        name = name.replace('keras_applications', 'keras.applications')
    if name.startswith('keras_preprocessing'):
        name = name.replace('keras_preprocessing', 'keras.preprocessing')
    assert name[:6] == 'keras.', 'Invalid module name: %s' % name
    return name
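A quick usage sketch for the function above (the module names passed in are illustrative):

print(clean_module_name('keras_preprocessing.text'))     # keras.preprocessing.text
print(clean_module_name('keras_applications.resnet50'))  # keras.applications.resnet50
print(clean_module_name('keras.layers'))                  # unchanged: keras.layers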
Example 2: fit
# Required import: import keras [as alias]
# or: from keras import preprocessing [as alias]
# Note: this method assumes numpy (np), keras, keras.preprocessing.sequence
# (as `sequence`), Sequential, Embedding, GlobalAveragePooling1D and Dense
# are imported, and that the helpers create_ngram_set / add_ngram are defined
# (see the sketch after this example).
def fit(self, X, Y, ngram_range=1, max_features=20000, maxlen=400,
        batch_size=32, embedding_dims=50, epochs=5):
    self.tokenizer = keras.preprocessing.text.Tokenizer(
        num_words=max_features, split=" ", char_level=False)
    self.tokenizer.fit_on_texts(X)
    x_train = self.tokenizer.texts_to_sequences(X)
    self.ngram_range = ngram_range
    self.maxlen = maxlen
    self.add_ngrams = lambda x: x
    if ngram_range > 1:
        ngram_set = set()
        for input_list in x_train:
            for i in range(2, ngram_range + 1):
                set_of_ngram = create_ngram_set(input_list, ngram_value=i)
                ngram_set.update(set_of_ngram)
        # Dictionary mapping n-gram token to a unique integer.
        # Integer values are greater than max_features in order
        # to avoid collision with existing features.
        start_index = max_features + 1
        self.token_indice = {v: k + start_index for k, v in enumerate(ngram_set)}
        indice_token = {self.token_indice[k]: k for k in self.token_indice}
        # max_features is the highest integer that could be found in the dataset.
        max_features = np.max(list(indice_token.keys())) + 1
        self.add_ngrams = lambda x: add_ngram(x, self.token_indice,
                                              self.ngram_range)
    x_train = self.add_ngrams(x_train)
    print('Average train sequence length: {}'.format(
        np.mean(list(map(len, x_train)), dtype=int)))
    x_train = sequence.pad_sequences(x_train, maxlen=self.maxlen)
    self.model = Sequential()
    # we start off with an efficient embedding layer which maps
    # our vocab indices into embedding_dims dimensions
    self.model.add(Embedding(max_features,
                             embedding_dims,
                             input_length=self.maxlen))
    # we add a GlobalAveragePooling1D, which will average the embeddings
    # of all words in the document
    self.model.add(GlobalAveragePooling1D())
    # We project onto a single unit output layer, and squash via sigmoid:
    self.model.add(Dense(1, activation='sigmoid'))
    self.model.compile(loss='binary_crossentropy',
                       optimizer='adam',
                       metrics=['accuracy'])
    self.model.fit(x_train, Y, batch_size=batch_size, epochs=epochs, verbose=2)
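The fit method above also depends on two helper functions, create_ngram_set and add_ngram, which are not part of the excerpt. A minimal sketch of the two helpers, assuming they follow the standard Keras FastText IMDB example rather than the original repository's exact code:

def create_ngram_set(input_list, ngram_value=2):
    # Collect all contiguous n-grams of length ngram_value from a sequence of
    # token ids, e.g. create_ngram_set([1, 4, 9], 2) -> {(1, 4), (4, 9)}.
    return set(zip(*[input_list[i:] for i in range(ngram_value)]))


def add_ngram(sequences, token_indice, ngram_range=2):
    # Append the integer id of every known n-gram (lengths 2..ngram_range)
    # found in each sequence, so the embedding layer can learn n-gram features.
    new_sequences = []
    for input_list in sequences:
        new_list = list(input_list)
        for ngram_value in range(2, ngram_range + 1):
            for i in range(len(new_list) - ngram_value + 1):
                ngram = tuple(new_list[i:i + ngram_value])
                if ngram in token_indice:
                    new_list.append(token_indice[ngram])
        new_sequences.append(new_list)
    return new_sequences

With helpers like these, the n-gram ids handed out in fit start above max_features (start_index = max_features + 1), so they never collide with the word ids produced by the tokenizer.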