Python tf.estimator.classifier_parse_example_spec用法及代码示例

为要与分类器一起使用的 tf.parse_example 生成解析规范。

用法

tf.estimator.classifier_parse_example_spec(
    feature_columns, label_key, label_dtype=tf.dtypes.int64, label_default=None,
    weight_column=None
)

参数

feature_columns 包含所有特征列的迭代。所有项目都应该是派生自 FeatureColumn 的类的实例。
label_key 标识标签的字符串。这意味着 tf.Example 使用此键存储标签。
label_dtype tf.dtype 标识标签的类型。默认情况下它是 tf.int64 。如果用户定义了 label_vocabulary ，则应将其设置为 tf.string 。 tf.float32 标签仅支持二进制分类。
label_default 如果 label_key 在给定的 tf.Example 中不存在，则用作标签。示例用法：假设 label_key 是 'clicked' 并且 tf.Example 仅包含以下格式的正示例的点击数据 key:clicked, value:1 。这意味着如果没有键为 'clicked' 的数据，则应通过设置 label_deafault=0 将其视为反例。此值的类型应与 label_dtype 兼容。
weight_column 由 tf.feature_column.numeric_column 创建的字符串或 NumericColumn 定义表示权重的特征列。它用于在训练期间减轻重量或增加示例。它将乘以示例的损失。如果它是一个字符串，它被用作从 features 中获取权重张量的键。如果是 NumericColumn ，则通过键 weight_column.key 获取原始张量，然后对其应用 weight_column.normalizer_fn 以获得权重张量。

将每个函数键映射到 FixedLenFeature 或 VarLenFeature 值的字典。

抛出

ValueError 如果在 feature_columns 中使用标签。
ValueError 如果在 feature_columns 中使用 weight_column 。
ValueError 如果任何给定的 feature_columns 不是 _FeatureColumn 实例。
ValueError 如果 weight_column 不是 NumericColumn 实例。
ValueError 如果label_key 为无。

如果用户以 tf.Example 格式保存数据，他们需要使用适当的函数规范调用 tf.parse_example。此实用程序有两个主要帮助：

用户需要将特征的解析规范与标签和权重(如果有的话)结合起来，因为它们都是从同一个 tf.Example 实例中解析出来的。该实用程序结合了这些规范。
很难将诸如DNNClassifier 之类的分类器的预期标签映射到相应的 tf.parse_example 规范。该实用程序通过从用户获取相关信息(key、dtype)对其进行编码。

解析规范的示例输出：

# Define features and transformations
feature_b = tf.feature_column.numeric_column(...)
feature_c_bucketized = tf.feature_column.bucketized_column(
  tf.feature_column.numeric_column("feature_c"), ...)
feature_a_x_feature_c = tf.feature_column.crossed_column(
    columns=["feature_a", feature_c_bucketized], ...)

feature_columns = [feature_b, feature_c_bucketized, feature_a_x_feature_c]
parsing_spec = tf.estimator.classifier_parse_example_spec(
    feature_columns, label_key='my-label', label_dtype=tf.string)

# For the above example, classifier_parse_example_spec would return the dict:
assert parsing_spec == {
  "feature_a":parsing_ops.VarLenFeature(tf.string),
  "feature_b":parsing_ops.FixedLenFeature([1], dtype=tf.float32),
  "feature_c":parsing_ops.FixedLenFeature([1], dtype=tf.float32)
  "my-label":parsing_ops.FixedLenFeature([1], dtype=tf.string)
}

分类器的示例用法：

feature_columns = # define features via tf.feature_column
estimator = DNNClassifier(
    n_classes=1000,
    feature_columns=feature_columns,
    weight_column='example-weight',
    label_vocabulary=['photos', 'keep', ...],
    hidden_units=[256, 64, 16])
# This label configuration tells the classifier the following:
# * weights are retrieved with key 'example-weight'
# * label is string and can be one of the following ['photos', 'keep', ...]
# * integer id for label 'photos' is 0, 'keep' is 1, ...


# Input builders
def input_fn_train(): # Returns a tuple of features and labels.
  features = tf.contrib.learn.read_keyed_batch_features(
      file_pattern=train_files,
      batch_size=batch_size,
      # creates parsing configuration for tf.parse_example
      features=tf.estimator.classifier_parse_example_spec(
          feature_columns,
          label_key='my-label',
          label_dtype=tf.string,
          weight_column='example-weight'),
      reader=tf.RecordIOReader)
   labels = features.pop('my-label')
   return features, labels

estimator.train(input_fn=input_fn_train)

相关用法

注：本文由纯净天空筛选整理自tensorflow.org大神的英文原创作品 tf.estimator.classifier_parse_example_spec。非经特殊声明，原始代码版权归原作者所有，本译文未经允许或授权，请勿转载或复制。

用法

参数

返回

抛出