Java AvroJob.setOutputKeySchema Method Code Examples

This article collects typical usage examples of the Java method org.apache.avro.mapreduce.AvroJob.setOutputKeySchema. If you are wondering what AvroJob.setOutputKeySchema is for, how to call it, or what real-world uses look like, the curated examples here may help. You can also explore further usage examples of the enclosing class, org.apache.avro.mapreduce.AvroJob.


A total of 15 code examples of the AvroJob.setOutputKeySchema method are shown below, sorted by popularity by default.

Example 1: setSchema

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
/** Hacked method */
private void setSchema(Job job, Schema keySchema, Schema valSchema) {

  boolean isMaponly = job.getNumReduceTasks() == 0;
  if (keySchema != null) {
    if (isMaponly){
      AvroJob.setMapOutputKeySchema(job, keySchema);
    }
    AvroJob.setOutputKeySchema(job, keySchema);
  }
  if (valSchema != null) {
    if (isMaponly){
      AvroJob.setMapOutputValueSchema(job, valSchema);
    }
    AvroJob.setOutputValueSchema(job, valSchema);
  }

}
 
Developer ID: openaire, Project: iis, Lines of code: 19, Source: AvroMultipleOutputs.java

Example 2: testMapReduce

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
@Test
public void testMapReduce() throws IOException {
  MyAvroReducer reducer = new MyAvroReducer();

  // Configure a job.
  Job job = new Job();
  // We've got to do a little hacking here since mrunit doesn't run exactly like
  // the real hadoop mapreduce framework.
  AvroJob.setMapOutputKeySchema(job, Node.SCHEMA$);
  AvroJob.setOutputKeySchema(job, reducer.getAvroKeyWriterSchema());
  AvroSerialization.setValueWriterSchema(job.getConfiguration(), Node.SCHEMA$);

  // Run the reducer.
  ReduceDriver<Text, AvroValue<Node>, AvroKey<Node>, NullWritable> driver
      = new ReduceDriver<Text, AvroValue<Node>, AvroKey<Node>, NullWritable>();
  driver.setReducer(reducer);
  driver.withConfiguration(job.getConfiguration());
  driver.withInput(new Text("foo"),
      Collections.singletonList(new AvroValue<Node>(new NodeBuilder("bar", 1.0).build())));
  List<Pair<AvroKey<Node>, NullWritable>> output = driver.run();
  assertEquals(1, output.size());
  assertEquals("bar", output.get(0).getFirst().datum().getLabel().toString());
}
 
Developer ID: kijiproject, Project: kiji-mapreduce-lib, Lines of code: 24, Source: TestAvroReducer.java

Example 3: testMapReduce

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
@Test
public void testMapReduce() throws IOException {
  MyNodeReducer reducer = new MyNodeReducer();

  // Configure a job.
  Job job = new Job();
  // We've got to do a little hacking here since mrunit doesn't run exactly like
  // the real hadoop mapreduce framework.
  AvroJob.setMapOutputKeySchema(job, Node.SCHEMA$);
  AvroJob.setOutputKeySchema(job, reducer.getAvroKeyWriterSchema());
  AvroSerialization.setValueWriterSchema(job.getConfiguration(), Node.SCHEMA$);

  ReduceDriver<Text, AvroValue<Node>, AvroKey<Node>, NullWritable> driver
      = new ReduceDriver<Text, AvroValue<Node>, AvroKey<Node>, NullWritable>();
  driver.setReducer(reducer);
  driver.withConfiguration(job.getConfiguration());
  driver.withInput(
      new Text("foo"),
      Collections.singletonList(new AvroValue<Node>(new NodeBuilder("bar", 1.0).build())));
  List<Pair<AvroKey<Node>, NullWritable>> output = driver.run();
  assertEquals(1, output.size());
  assertEquals("bar", output.get(0).getFirst().datum().getLabel().toString());
}
 
Developer ID: kijiproject, Project: kiji-mapreduce-lib, Lines of code: 24, Source: TestNodeReducer.java

Example 4: runMapReduce

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
public boolean runMapReduce(final Job job, Path inputPath, Path outputPath) throws Exception {
    FileInputFormat.setInputPaths(job, inputPath);
    job.setInputFormatClass(AvroKeyInputFormat.class);
    AvroJob.setInputKeySchema(job, Weather.SCHEMA$);

    job.setMapperClass(SortMapper.class);
    AvroJob.setMapOutputValueSchema(job, Weather.SCHEMA$);
    job.setMapOutputKeyClass(WeatherSubset.class);

    job.setReducerClass(SortReducer.class);
    AvroJob.setOutputKeySchema(job, Weather.SCHEMA$);

    job.setOutputFormatClass(AvroKeyOutputFormat.class);
    FileOutputFormat.setOutputPath(job, outputPath);

    job.setPartitionerClass(WeatherPartitioner.class);
    job.setGroupingComparatorClass(WeatherSubsetGroupingComparator.class);
    job.setSortComparatorClass(WeatherSubsetSortComparator.class);

    return job.waitForCompletion(true);
}
 
Developer ID: alexholmes, Project: avro-sorting, Lines of code: 22, Source: AvroWritableKeySort.java

Example 5: runMapReduce

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
public boolean runMapReduce(final Job job, Path inputPath, Path outputPath) throws Exception {
    FileInputFormat.setInputPaths(job, inputPath);
    job.setInputFormatClass(AvroKeyInputFormat.class);
    AvroJob.setInputKeySchema(job, WeatherNoIgnore.SCHEMA$);

    job.setMapperClass(SortMapper.class);
    AvroJob.setMapOutputKeySchema(job, WeatherNoIgnore.SCHEMA$);
    AvroJob.setMapOutputValueSchema(job, WeatherNoIgnore.SCHEMA$);

    job.setReducerClass(SortReducer.class);
    AvroJob.setOutputKeySchema(job, WeatherNoIgnore.SCHEMA$);

    job.setOutputFormatClass(AvroKeyOutputFormat.class);
    FileOutputFormat.setOutputPath(job, outputPath);

    return job.waitForCompletion(true);
}
 
Developer ID: alexholmes, Project: avro-sorting, Lines of code: 18, Source: AvroSortDefault.java

Example 6: runMapReduce

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
public boolean runMapReduce(final Job job, Path inputPath, Path outputPath) throws Exception {
    FileInputFormat.setInputPaths(job, inputPath);
    job.setInputFormatClass(AvroKeyInputFormat.class);
    AvroJob.setInputKeySchema(job, Weather.SCHEMA$);

    job.setMapperClass(SortMapper.class);
    AvroJob.setMapOutputKeySchema(job, Weather.SCHEMA$);
    AvroJob.setMapOutputValueSchema(job, Weather.SCHEMA$);

    job.setReducerClass(SortReducer.class);
    AvroJob.setOutputKeySchema(job, Weather.SCHEMA$);

    job.setOutputFormatClass(AvroKeyOutputFormat.class);
    FileOutputFormat.setOutputPath(job, outputPath);

    return job.waitForCompletion(true);
}
 
Developer ID: alexholmes, Project: avro-sorting, Lines of code: 18, Source: AvroSortWithIgnores.java

Example 7: configureSchema

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
private void configureSchema(Job job) throws IOException {
  Schema newestSchema = getNewestSchemaFromSource(job);
  AvroJob.setInputKeySchema(job, newestSchema);
  AvroJob.setMapOutputKeySchema(job, this.shouldDeduplicate ? getKeySchema(job, newestSchema) : newestSchema);
  AvroJob.setMapOutputValueSchema(job, newestSchema);
  AvroJob.setOutputKeySchema(job, newestSchema);
}
 
Developer ID: Hanmourang, Project: Gobblin, Lines of code: 8, Source: MRCompactorAvroKeyDedupJobRunner.java

Example 8: createAndSubmitJob

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
public boolean createAndSubmitJob() throws IOException, ClassNotFoundException, InterruptedException {
    Configuration configuration = new Configuration(yarnUnit.getConfig());
    configuration.setBoolean("mapred.mapper.new-api", true);
    configuration.setBoolean("mapred.reducer.new-api", true);
    Job job = Job.getInstance(configuration);
    job.setJobName(this.getClass().getSimpleName() + "-job");

    job.setNumReduceTasks(1);

    job.setMapperClass(AvroMapReduce.AvroMapper.class);

    Schema inputSchema = new Schema.Parser().parse(
            MapreduceAvroTest.class.getClassLoader().getResourceAsStream("mapreduce-avro/input.avsc"));
    FileInputFormat.addInputPath(job, new Path(inputPath));
    job.setInputFormatClass(AvroKeyInputFormat.class);
    AvroJob.setInputKeySchema(job, inputSchema);
    job.setMapOutputKeyClass(IntWritable.class);
    job.setMapOutputValueClass(Text.class);

    job.setReducerClass(AvroMapReduce.AvroReducer.class);

    FileOutputFormat.setOutputPath(job, new Path(outputPath));
    job.setOutputFormatClass(AvroKeyOutputFormat.class);
    AvroJob.setOutputKeySchema(job, new Schema.Parser().parse(
            MapreduceAvroTest.class.getClassLoader().getResourceAsStream("mapreduce-avro/output.avsc")));
    job.setOutputKeyClass(AvroKey.class);
    job.setOutputValueClass(NullWritable.class);

    job.setSpeculativeExecution(false);
    job.setMaxMapAttempts(1); // speed up failures
    return job.waitForCompletion(true);
}
 
Developer ID: intropro, Project: prairie, Lines of code: 33, Source: MapreduceAvroTest.java

Example 9: process

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
@Override
public void process(Annotation annotation, Job job, Object target)
		throws ToolException {

	AvroJobInfo avroInfo = (AvroJobInfo)annotation;
	if (avroInfo.inputKeySchema() != AvroDefault.class) {
		AvroJob.setInputKeySchema(job, getSchema(avroInfo.inputKeySchema()));
	}
	if (avroInfo.inputValueSchema() != AvroDefault.class) {
		AvroJob.setInputValueSchema(job, getSchema(avroInfo.inputValueSchema()));
	}

	if (avroInfo.outputKeySchema() != AvroDefault.class) {
		AvroJob.setOutputKeySchema(job, getSchema(avroInfo.outputKeySchema()));
	}
	if (avroInfo.outputValueSchema() != AvroDefault.class) {
		AvroJob.setOutputValueSchema(job, getSchema(avroInfo.outputValueSchema()));
	}

	if (avroInfo.mapOutputKeySchema() != AvroDefault.class) {
		AvroJob.setMapOutputKeySchema(job, getSchema(avroInfo.mapOutputKeySchema()));
	}
	if (avroInfo.mapOutputValueSchema() != AvroDefault.class) {
		AvroJob.setMapOutputValueSchema(job, getSchema(avroInfo.mapOutputValueSchema()));
	}

	AvroSerialization.addToConfiguration(job.getConfiguration());
}
 
Developer ID: conversant, Project: mara, Lines of code: 29, Source: AvroJobInfoAnnotationHandler.java

Example 10: getJob

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
private Job getJob(Schema avroSchema) {
    
    Job job;
    
    try {
        job = Job.getInstance();
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    
    AvroJob.setOutputKeySchema(job, avroSchema);

    return job;
}
 
Developer ID: CeON, Project: spark-utils, Lines of code: 15, Source: SparkAvroSaver.java

Example 11: writeAvro

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
private static <T> TransformEvaluator<AvroIO.Write.Bound<T>> writeAvro() {
  return new TransformEvaluator<AvroIO.Write.Bound<T>>() {
    @Override
    public void evaluate(AvroIO.Write.Bound<T> transform, EvaluationContext context) {
      Job job;
      try {
        job = Job.getInstance();
      } catch (IOException e) {
        throw new IllegalStateException(e);
      }
      AvroJob.setOutputKeySchema(job, transform.getSchema());
      @SuppressWarnings("unchecked")
      JavaPairRDD<AvroKey<T>, NullWritable> last =
          ((JavaRDDLike<WindowedValue<T>, ?>) context.getInputRDD(transform))
          .map(WindowingHelpers.<T>unwindowFunction())
          .mapToPair(new PairFunction<T, AvroKey<T>, NullWritable>() {
            @Override
            public Tuple2<AvroKey<T>, NullWritable> call(T t) throws Exception {
              return new Tuple2<>(new AvroKey<>(t), NullWritable.get());
            }
          });
      ShardTemplateInformation shardTemplateInfo =
          new ShardTemplateInformation(transform.getNumShards(),
          transform.getShardTemplate(), transform.getFilenamePrefix(),
          transform.getFilenameSuffix());
      writeHadoopFile(last, job.getConfiguration(), shardTemplateInfo,
          AvroKey.class, NullWritable.class, TemplatedAvroKeyOutputFormat.class);
    }
  };
}
 
Developer ID: shakamunyi, Project: spark-dataflow, Lines of code: 31, Source: TransformTranslator.java

Example 12: run

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
public int run(String[] args) throws Exception {
  org.apache.log4j.BasicConfigurator.configure();

  if (args.length != 2) {
    System.err.println("Usage: MapReduceAgeCount <input path> <output path>");
    return -1;
  }

  Job job = Job.getInstance(getConf());
  job.setJarByClass(MapReduceAgeCount.class);
  job.setJobName("Age Count");

  // RECORDSERVICE:
  // To read from a table instead of a path, the FileInputFormat.setInputPaths()
  // call below is commented out and RecordServiceConfig.setInputTable() is used instead.
  // FileInputFormat.setInputPaths(job, new Path(args[0]));
  RecordServiceConfig.setInputTable(job.getConfiguration(), null, args[0]);

  // RECORDSERVICE:
  // Use the RecordService version of the AvroKeyValueInputFormat
  job.setInputFormatClass(
      com.cloudera.recordservice.avro.mapreduce.AvroKeyValueInputFormat.class);
  FileOutputFormat.setOutputPath(job, new Path(args[1]));

  job.setMapperClass(AgeCountMapper.class);
  // Set schema for input key and value.
  AvroJob.setInputKeySchema(job, UserKey.getClassSchema());
  AvroJob.setInputValueSchema(job, UserValue.getClassSchema());

  job.setMapOutputKeyClass(Text.class);
  job.setMapOutputValueClass(IntWritable.class);

  job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
  job.setReducerClass(AgeCountReducer.class);
  AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING));
  AvroJob.setOutputValueSchema(job, Schema.create(Schema.Type.INT));

  return (job.waitForCompletion(true) ? 0 : 1);
}
 
Developer ID: cloudera, Project: RecordServiceClient, Lines of code: 40, Source: MapReduceAgeCount.java

Example 13: run

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
@Override
public int run(String[] args) throws Exception {
  org.apache.log4j.BasicConfigurator.configure();

  if (args.length != 2) {
    System.err.println("Usage: MapReduceColorCount <input path> <output path>");
    return -1;
  }

  Job job = Job.getInstance(getConf());
  job.setJarByClass(MapReduceColorCount.class);
  job.setJobName("Color Count");

  // RECORDSERVICE:
  // To read from a table instead of a path, the FileInputFormat.setInputPaths()
  // call below is commented out and RecordServiceConfig.setInputTable() is used instead.
  // FileInputFormat.setInputPaths(job, new Path(args[0]));
  RecordServiceConfig.setInputTable(job.getConfiguration(), "rs", "users");

  // RECORDSERVICE:
  // Use the RecordService version of the AvroKeyInputFormat
  job.setInputFormatClass(
      com.cloudera.recordservice.avro.mapreduce.AvroKeyInputFormat.class);
  //job.setInputFormatClass(AvroKeyInputFormat.class);

  FileOutputFormat.setOutputPath(job, new Path(args[1]));

  job.setMapperClass(ColorCountMapper.class);
  AvroJob.setInputKeySchema(job, User.getClassSchema());
  job.setMapOutputKeyClass(Text.class);
  job.setMapOutputValueClass(IntWritable.class);

  job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
  job.setReducerClass(ColorCountReducer.class);
  AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING));
  AvroJob.setOutputValueSchema(job, Schema.create(Schema.Type.INT));

  return (job.waitForCompletion(true) ? 0 : 1);
}
 
Developer ID: cloudera, Project: RecordServiceClient, Lines of code: 40, Source: MapReduceColorCount.java

Example 14: countColors

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
/**
 * Run the MR2 color count with generic records, and return a map of favorite colors to
 * the number of users.
 */
public static java.util.Map<String, Integer> countColors() throws IOException,
    ClassNotFoundException, InterruptedException {
  String output = TestUtil.getTempDirectory();
  Path outputPath = new Path(output);
  JobConf conf = new JobConf(ColorCount.class);
  conf.setInt("mapreduce.job.reduces", 1);

  Job job = Job.getInstance(conf);
  job.setJarByClass(ColorCount.class);
  job.setJobName("MR2 Color Count With Generic Records");

  RecordServiceConfig.setInputTable(job.getConfiguration(), "rs", "users");
  job.setInputFormatClass(
      com.cloudera.recordservice.avro.mapreduce.AvroKeyInputFormat.class);
  FileOutputFormat.setOutputPath(job, outputPath);

  job.setMapperClass(Map.class);
  job.setMapOutputKeyClass(Text.class);
  job.setMapOutputValueClass(IntWritable.class);

  job.setOutputFormatClass(AvroKeyValueOutputFormat.class);
  job.setReducerClass(Reduce.class);
  AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING));
  AvroJob.setOutputValueSchema(job, Schema.create(Schema.Type.INT));

  job.waitForCompletion(false);

  // Read the result and return it. Since we set the number of reducers to 1,
  // there is always just one file containing the value.
  SeekableInput input = new FsInput(new Path(output + "/part-r-00000.avro"), conf);
  DatumReader<GenericRecord> reader = new GenericDatumReader<GenericRecord>();
  java.util.Map<String, Integer> colorMap = new HashMap<String, Integer>();
  // try-with-resources closes the file reader once all records have been read.
  try (FileReader<GenericRecord> fileReader = DataFileReader.openReader(input, reader)) {
    for (GenericRecord datum : fileReader) {
      colorMap.put(datum.get(0).toString(), Integer.parseInt(datum.get(1).toString()));
    }
  }
  return colorMap;
}
 
Developer ID: cloudera, Project: RecordServiceClient, Lines of code: 43, Source: ColorCount.java

Example 15: afterPropertiesSet

import org.apache.avro.mapreduce.AvroJob; // import the package/class that the method depends on
@Override
public void afterPropertiesSet() throws Exception {

    if (avroInputKey != null) {
        AvroJob.setInputKeySchema(job, resolveClass(avroInputKey).newInstance().getSchema());
    }

    if (avroInputValue != null) {
        AvroJob.setInputValueSchema(job, resolveClass(avroInputValue).newInstance().getSchema());
    }

    if (avroMapOutputKey != null) {
        AvroJob.setMapOutputKeySchema(job, resolveClass(avroMapOutputKey).newInstance().getSchema());
    }

    if (avroMapOutputValue != null) {
        Class<? extends IndexedRecord> c = resolveClass(avroMapOutputValue);
        IndexedRecord o = c.newInstance();
        AvroJob.setMapOutputValueSchema(job, o.getSchema());
    }

    if (avroOutputKey != null) {
        AvroJob.setOutputKeySchema(job, resolveClass(avroOutputKey).newInstance().getSchema());
    }

    if (avroOutputValue != null) {
        AvroJob.setOutputValueSchema(job, resolveClass(avroOutputValue).newInstance().getSchema());
    }
}
 
Developer ID: ch4mpy, Project: hadoop2, Lines of code: 30, Source: AvroJobInitializingBean.java


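Note: The org.apache.avro.mapreduce.AvroJob.setOutputKeySchema examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as Github and MSDocs. The snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code belongs to the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.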