Java JobConf.setMapperClass方法代码示例

本文整理汇总了Java中org.apache.hadoop.mapred.JobConf.setMapperClass方法的典型用法代码示例。如果您正苦于以下问题：Java JobConf.setMapperClass方法的具体用法？Java JobConf.setMapperClass怎么用？Java JobConf.setMapperClass使用的例子？那么, 这里精选的方法代码示例或许可以为您提供帮助。您也可以进一步了解该方法所在类org.apache.hadoop.mapred.JobConf的用法示例。

在下文中一共展示了JobConf.setMapperClass方法的15个代码示例，这些例子默认根据受欢迎程度排序。您可以为喜欢或者感觉有用的代码点赞，您的评价将有助于系统推荐出更棒的Java代码示例。

示例1: runTests

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
/**
 * Run the test
 * 
 * @throws IOException on error
 */
public static void runTests() throws IOException {
  config.setLong("io.bytes.per.checksum", bytesPerChecksum);
  
  JobConf job = new JobConf(config, NNBench.class);

  job.setJobName("NNBench-" + operation);
  FileInputFormat.setInputPaths(job, new Path(baseDir, CONTROL_DIR_NAME));
  job.setInputFormat(SequenceFileInputFormat.class);
  
  // Explicitly set number of max map attempts to 1.
  job.setMaxMapAttempts(1);
  
  // Explicitly turn off speculative execution
  job.setSpeculativeExecution(false);

  job.setMapperClass(NNBenchMapper.class);
  job.setReducerClass(NNBenchReducer.class);

  FileOutputFormat.setOutputPath(job, new Path(baseDir, OUTPUT_DIR_NAME));
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(Text.class);
  job.setNumReduceTasks((int) numberOfReduces);
  JobClient.runJob(job);
}

开发者ID:naver，项目名称:hadoop，代码行数:30，代码来源:NNBench.java

示例2: testInputFormat

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
void testInputFormat(Class<? extends InputFormat> clazz) throws IOException {
  final JobConf job = MapreduceTestingShim.getJobConf(mrCluster);
  job.setInputFormat(clazz);
  job.setOutputFormat(NullOutputFormat.class);
  job.setMapperClass(ExampleVerifier.class);
  job.setNumReduceTasks(0);
  LOG.debug("submitting job.");
  final RunningJob run = JobClient.runJob(job);
  assertTrue("job failed!", run.isSuccessful());
  assertEquals("Saw the wrong number of instances of the filtered-for row.", 2, run.getCounters()
      .findCounter(TestTableInputFormat.class.getName() + ":row", "aaa").getCounter());
  assertEquals("Saw any instances of the filtered out row.", 0, run.getCounters()
      .findCounter(TestTableInputFormat.class.getName() + ":row", "bbb").getCounter());
  assertEquals("Saw the wrong number of instances of columnA.", 1, run.getCounters()
      .findCounter(TestTableInputFormat.class.getName() + ":family", "columnA").getCounter());
  assertEquals("Saw the wrong number of instances of columnB.", 1, run.getCounters()
      .findCounter(TestTableInputFormat.class.getName() + ":family", "columnB").getCounter());
  assertEquals("Saw the wrong count of values for the filtered-for row.", 2, run.getCounters()
      .findCounter(TestTableInputFormat.class.getName() + ":value", "value aaa").getCounter());
  assertEquals("Saw the wrong count of values for the filtered-out row.", 0, run.getCounters()
      .findCounter(TestTableInputFormat.class.getName() + ":value", "value bbb").getCounter());
}

开发者ID:fengchen8086，项目名称:ditb，代码行数:23，代码来源:TestTableInputFormat.java

示例3: runIOTest

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
private void runIOTest(
        Class<? extends Mapper<Text, LongWritable, Text, Text>> mapperClass, 
        Path outputDir) throws IOException {
  JobConf job = new JobConf(config, TestDFSIO.class);

  FileInputFormat.setInputPaths(job, getControlDir(config));
  job.setInputFormat(SequenceFileInputFormat.class);

  job.setMapperClass(mapperClass);
  job.setReducerClass(AccumulatingReducer.class);

  FileOutputFormat.setOutputPath(job, outputDir);
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(Text.class);
  job.setNumReduceTasks(1);
  JobClient.runJob(job);
}

开发者ID:naver，项目名称:hadoop，代码行数:18，代码来源:TestDFSIO.java

示例4: getJob

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
/**
 * Sets up a job conf for the given job using the given config object. Ensures
 * that the correct input format is set, the mapper and and reducer class and
 * the input and output keys and value classes along with any other job
 * configuration.
 * 
 * @param config
 * @return JobConf representing the job to be ran
 * @throws IOException
 */
private JobConf getJob(ConfigExtractor config) throws IOException {
  JobConf job = new JobConf(config.getConfig(), SliveTest.class);
  job.setInputFormat(DummyInputFormat.class);
  FileOutputFormat.setOutputPath(job, config.getOutputPath());
  job.setMapperClass(SliveMapper.class);
  job.setPartitionerClass(SlivePartitioner.class);
  job.setReducerClass(SliveReducer.class);
  job.setOutputKeyClass(Text.class);
  job.setOutputValueClass(Text.class);
  job.setOutputFormat(TextOutputFormat.class);
  TextOutputFormat.setCompressOutput(job, false);
  job.setNumReduceTasks(config.getReducerAmount());
  job.setNumMapTasks(config.getMapAmount());
  return job;
}

开发者ID:naver，项目名称:hadoop，代码行数:26，代码来源:SliveTest.java

示例5: joinAs

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
private static void joinAs(String jointype,
    Class<? extends SimpleCheckerBase> c) throws Exception {
  final int srcs = 4;
  Configuration conf = new Configuration();
  JobConf job = new JobConf(conf, c);
  Path base = cluster.getFileSystem().makeQualified(new Path("/"+jointype));
  Path[] src = writeSimpleSrc(base, conf, srcs);
  job.set("mapreduce.join.expr", CompositeInputFormat.compose(jointype,
      SequenceFileInputFormat.class, src));
  job.setInt("testdatamerge.sources", srcs);
  job.setInputFormat(CompositeInputFormat.class);
  FileOutputFormat.setOutputPath(job, new Path(base, "out"));

  job.setMapperClass(c);
  job.setReducerClass(c);
  job.setOutputKeyClass(IntWritable.class);
  job.setOutputValueClass(IntWritable.class);
  JobClient.runJob(job);
  base.getFileSystem(job).delete(base, true);
}

开发者ID:naver，项目名称:hadoop，代码行数:21，代码来源:TestDatamerge.java

示例6: testEmptyJoin

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
public void testEmptyJoin() throws Exception {
  JobConf job = new JobConf();
  Path base = cluster.getFileSystem().makeQualified(new Path("/empty"));
  Path[] src = { new Path(base,"i0"), new Path("i1"), new Path("i2") };
  job.set("mapreduce.join.expr", CompositeInputFormat.compose("outer",
      Fake_IF.class, src));
  job.setInputFormat(CompositeInputFormat.class);
  FileOutputFormat.setOutputPath(job, new Path(base, "out"));

  job.setMapperClass(IdentityMapper.class);
  job.setReducerClass(IdentityReducer.class);
  job.setOutputKeyClass(IncomparableKey.class);
  job.setOutputValueClass(NullWritable.class);

  JobClient.runJob(job);
  base.getFileSystem(job).delete(base, true);
}

开发者ID:naver，项目名称:hadoop，代码行数:18，代码来源:TestDatamerge.java

示例7: createCopyJob

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
/**
 * Creates a simple copy job.
 * 
 * @param indirs List of input directories.
 * @param outdir Output directory.
 * @return JobConf initialised for a simple copy job.
 * @throws Exception If an error occurs creating job configuration.
 */
static JobConf createCopyJob(List<Path> indirs, Path outdir) throws Exception {

  Configuration defaults = new Configuration();
  JobConf theJob = new JobConf(defaults, TestJobControl.class);
  theJob.setJobName("DataMoveJob");

  FileInputFormat.setInputPaths(theJob, indirs.toArray(new Path[0]));
  theJob.setMapperClass(DataCopy.class);
  FileOutputFormat.setOutputPath(theJob, outdir);
  theJob.setOutputKeyClass(Text.class);
  theJob.setOutputValueClass(Text.class);
  theJob.setReducerClass(DataCopy.class);
  theJob.setNumMapTasks(12);
  theJob.setNumReduceTasks(4);
  return theJob;
}

开发者ID:naver，项目名称:hadoop，代码行数:25，代码来源:JobControlTestUtils.java

示例8: initMultiTableSnapshotMapperJob

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
/**
 * Sets up the job for reading from one or more multiple table snapshots, with one or more scans
 * per snapshot.
 * It bypasses hbase servers and read directly from snapshot files.
 *
 * @param snapshotScans     map of snapshot name to scans on that snapshot.
 * @param mapper            The mapper class to use.
 * @param outputKeyClass    The class of the output key.
 * @param outputValueClass  The class of the output value.
 * @param job               The current job to adjust.  Make sure the passed job is
 *                          carrying all necessary HBase configuration.
 * @param addDependencyJars upload HBase jars and jars for any of the configured
 *                          job classes via the distributed cache (tmpjars).
 */
public static void initMultiTableSnapshotMapperJob(Map<String, Collection<Scan>> snapshotScans,
    Class<? extends TableMap> mapper, Class<?> outputKeyClass, Class<?> outputValueClass,
    JobConf job, boolean addDependencyJars, Path tmpRestoreDir) throws IOException {
  MultiTableSnapshotInputFormat.setInput(job, snapshotScans, tmpRestoreDir);

  job.setInputFormat(MultiTableSnapshotInputFormat.class);
  if (outputValueClass != null) {
    job.setMapOutputValueClass(outputValueClass);
  }
  if (outputKeyClass != null) {
    job.setMapOutputKeyClass(outputKeyClass);
  }
  job.setMapperClass(mapper);
  if (addDependencyJars) {
    addDependencyJars(job);
  }

  org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil.resetCacheConfig(job);
}

开发者ID:fengchen8086，项目名称:ditb，代码行数:34，代码来源:TableMapReduceUtil.java

示例9: runJob

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
@Override
protected void runJob(String jobName, Configuration c, List<Scan> scans)
    throws IOException, InterruptedException, ClassNotFoundException {
  JobConf job = new JobConf(TEST_UTIL.getConfiguration());

  job.setJobName(jobName);
  job.setMapperClass(Mapper.class);
  job.setReducerClass(Reducer.class);

  TableMapReduceUtil.initMultiTableSnapshotMapperJob(getSnapshotScanMapping(scans), Mapper.class,
      ImmutableBytesWritable.class, ImmutableBytesWritable.class, job, true, restoreDir);

  TableMapReduceUtil.addDependencyJars(job);

  job.setReducerClass(Reducer.class);
  job.setNumReduceTasks(1); // one to get final "first" and "last" key
  FileOutputFormat.setOutputPath(job, new Path(job.getJobName()));
  LOG.info("Started " + job.getJobName());

  RunningJob runningJob = JobClient.runJob(job);
  runningJob.waitForCompletion();
  assertTrue(runningJob.isSuccessful());
  LOG.info("After map/reduce completion - job " + jobName);
}

开发者ID:fengchen8086，项目名称:ditb，代码行数:25，代码来源:TestMultiTableSnapshotInputFormat.java

示例10: run

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
public int run(String[] argv) throws IOException {
  if (argv.length < 2) {
    System.out.println("ExternalMapReduce <input> <output>");
    return -1;
  }
  Path outDir = new Path(argv[1]);
  Path input = new Path(argv[0]);
  JobConf testConf = new JobConf(getConf(), ExternalMapReduce.class);
  
  //try to load a class from libjar
  try {
    testConf.getClassByName("testjar.ClassWordCount");
  } catch (ClassNotFoundException e) {
    System.out.println("Could not find class from libjar");
    return -1;
  }
  
  
  testConf.setJobName("external job");
  FileInputFormat.setInputPaths(testConf, input);
  FileOutputFormat.setOutputPath(testConf, outDir);
  testConf.setMapperClass(MapClass.class);
  testConf.setReducerClass(Reduce.class);
  testConf.setNumReduceTasks(1);
  JobClient.runJob(testConf);
  return 0;
}

开发者ID:naver，项目名称:hadoop，代码行数:28，代码来源:ExternalMapReduce.java

示例11: runJobFail

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
public static void runJobFail(JobConf conf, Path inDir, Path outDir)
       throws IOException, InterruptedException {
  conf.setJobName("test-job-fail");
  conf.setMapperClass(FailMapper.class);
  conf.setJarByClass(FailMapper.class);
  conf.setReducerClass(IdentityReducer.class);
  conf.setMaxMapAttempts(1);
  
  boolean success = runJob(conf, inDir, outDir, 1, 0);
  Assert.assertFalse("Job expected to fail succeeded", success);
}

开发者ID:naver，项目名称:hadoop，代码行数:12，代码来源:TestMROldApiJobs.java

示例12: runJobSucceed

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
public static void runJobSucceed(JobConf conf, Path inDir, Path outDir)
       throws IOException, InterruptedException {
  conf.setJobName("test-job-succeed");
  conf.setMapperClass(IdentityMapper.class);
  //conf.setJar(new File(MiniMRYarnCluster.APPJAR).getAbsolutePath());
  conf.setReducerClass(IdentityReducer.class);
  
  boolean success = runJob(conf, inDir, outDir, 1 , 1);
  Assert.assertTrue("Job expected to succeed failed", success);
}

开发者ID:naver，项目名称:hadoop，代码行数:11，代码来源:TestMROldApiJobs.java

示例13: testCombinerShouldUpdateTheReporter

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
@Test
public void testCombinerShouldUpdateTheReporter() throws Exception {
  JobConf conf = new JobConf(mrCluster.getConfig());
  int numMaps = 5;
  int numReds = 2;
  Path in = new Path(mrCluster.getTestWorkDir().getAbsolutePath(),
      "testCombinerShouldUpdateTheReporter-in");
  Path out = new Path(mrCluster.getTestWorkDir().getAbsolutePath(),
      "testCombinerShouldUpdateTheReporter-out");
  createInputOutPutFolder(in, out, numMaps);
  conf.setJobName("test-job-with-combiner");
  conf.setMapperClass(IdentityMapper.class);
  conf.setCombinerClass(MyCombinerToCheckReporter.class);
  //conf.setJarByClass(MyCombinerToCheckReporter.class);
  conf.setReducerClass(IdentityReducer.class);
  DistributedCache.addFileToClassPath(TestMRJobs.APP_JAR, conf);
  conf.setOutputCommitter(CustomOutputCommitter.class);
  conf.setInputFormat(TextInputFormat.class);
  conf.setOutputKeyClass(LongWritable.class);
  conf.setOutputValueClass(Text.class);

  FileInputFormat.setInputPaths(conf, in);
  FileOutputFormat.setOutputPath(conf, out);
  conf.setNumMapTasks(numMaps);
  conf.setNumReduceTasks(numReds);
  
  runJob(conf);
}

开发者ID:naver，项目名称:hadoop，代码行数:29，代码来源:TestMRAppWithCombiner.java

示例14: addInputPath

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
/**
 * Add a {@link Path} with a custom {@link InputFormat} and
 * {@link Mapper} to the list of inputs for the map-reduce job.
 * 
 * @param conf The configuration of the job
 * @param path {@link Path} to be added to the list of inputs for the job
 * @param inputFormatClass {@link InputFormat} class to use for this path
 * @param mapperClass {@link Mapper} class to use for this path
 */
public static void addInputPath(JobConf conf, Path path,
    Class<? extends InputFormat> inputFormatClass,
    Class<? extends Mapper> mapperClass) {

  addInputPath(conf, path, inputFormatClass);

  String mapperMapping = path.toString() + ";" + mapperClass.getName();
  String mappers = conf.get("mapreduce.input.multipleinputs.dir.mappers");
  conf.set("mapreduce.input.multipleinputs.dir.mappers", mappers == null ? mapperMapping
     : mappers + "," + mapperMapping);

  conf.setMapperClass(DelegatingMapper.class);
}

开发者ID:naver，项目名称:hadoop，代码行数:23，代码来源:MultipleInputs.java

示例15: initTableMapJob

import org.apache.hadoop.mapred.JobConf; //导入方法依赖的package包/类
/**
 * Use this before submitting a TableMap job. It will
 * appropriately set up the JobConf.
 *
 * @param table  The table name to read from.
 * @param columns  The columns to scan.
 * @param mapper  The mapper class to use.
 * @param outputKeyClass  The class of the output key.
 * @param outputValueClass  The class of the output value.
 * @param job  The current job configuration to adjust.
 * @param addDependencyJars upload HBase jars and jars for any of the configured
 *           job classes via the distributed cache (tmpjars).
 */
public static void initTableMapJob(String table, String columns,
  Class<? extends TableMap> mapper,
  Class<?> outputKeyClass,
  Class<?> outputValueClass, JobConf job, boolean addDependencyJars,
  Class<? extends InputFormat> inputFormat) {

  job.setInputFormat(inputFormat);
  job.setMapOutputValueClass(outputValueClass);
  job.setMapOutputKeyClass(outputKeyClass);
  job.setMapperClass(mapper);
  job.setStrings("io.serializations", job.get("io.serializations"),
      MutationSerialization.class.getName(), ResultSerialization.class.getName());
  FileInputFormat.addInputPaths(job, table);
  job.set(TableInputFormat.COLUMN_LIST, columns);
  if (addDependencyJars) {
    try {
      addDependencyJars(job);
    } catch (IOException e) {
      e.printStackTrace();
    }
  }
  try {
    initCredentials(job);
  } catch (IOException ioe) {
    // just spit out the stack trace?  really?
    ioe.printStackTrace();
  }
}

开发者ID:fengchen8086，项目名称:ditb，代码行数:42，代码来源:TableMapReduceUtil.java

注：本文中的org.apache.hadoop.mapred.JobConf.setMapperClass方法示例由纯净天空整理自Github/MSDocs等开源代码及文档管理平台，相关代码片段筛选自各路编程大神贡献的开源项目，源码版权归原作者所有，传播和使用请参考对应项目的License；未经允许，请勿转载。