

Java FileInputFormat.getInputPaths Method Code Examples

This article collects typical usage examples of the Java method org.apache.hadoop.mapred.FileInputFormat.getInputPaths. If you are wondering what FileInputFormat.getInputPaths does, how to use it, or where to find examples of it, the curated code samples below may help. You can also explore further usage examples of org.apache.hadoop.mapred.FileInputFormat itself.


The following shows 10 code examples of the FileInputFormat.getInputPaths method, sorted by popularity by default.

Example 1: validateInput

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
public void validateInput(JobConf job) throws IOException {
  // expecting exactly one path
  Path [] tableNames = FileInputFormat.getInputPaths(job);
  if (tableNames == null || tableNames.length > 1) {
    throw new IOException("expecting one table name");
  }

  // connected to table?
  if (getHTable() == null) {
    throw new IOException("could not connect to table '" +
      tableNames[0].getName() + "'");
  }

  // expecting at least one column
  String colArg = job.get(COLUMN_LIST);
  if (colArg == null || colArg.length() == 0) {
    throw new IOException("expecting at least one column");
  }
}
 
Developer: fengchen8086 | Project: ditb | Lines: 20 | Source: TableInputFormat.java

Example 2: setInputPaths

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
/**
 * setInputPaths adds all of the paths in the provided list to the JobConf
 * object as input paths for the job.
 *
 * @param job the job configuration to update
 * @param pathsToAdd the paths to add as job inputs
 */
public static void setInputPaths(JobConf job, List<Path> pathsToAdd) {

  Path[] addedPaths = FileInputFormat.getInputPaths(job);
  if (addedPaths == null) {
    addedPaths = new Path[0];
  }

  Path[] combined = new Path[addedPaths.length + pathsToAdd.size()];
  System.arraycopy(addedPaths, 0, combined, 0, addedPaths.length);

  int i = 0;
  for(Path p: pathsToAdd) {
    combined[addedPaths.length + (i++)] = p;
  }
  FileInputFormat.setInputPaths(job, combined);
}
 
Developer: mini666 | Project: hive-phoenix-handler | Lines: 24 | Source: Utilities.java
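The merge idiom in Example 2 (copy the already-set paths, then append the new ones after them) can be sketched without Hadoop on the classpath. The `combinePaths` helper below is a hypothetical stand-in that works on plain strings instead of `Path` objects:

```java
import java.util.Arrays;
import java.util.List;

public class CombineSketch {
    // Hypothetical stand-in for Example 2's merge: copy the existing
    // entries, then append each new path after them, preserving order.
    static String[] combinePaths(String[] existing, List<String> toAdd) {
        if (existing == null) {
            existing = new String[0];
        }
        // copyOf replicates the System.arraycopy + new-array step in one call
        String[] combined = Arrays.copyOf(existing, existing.length + toAdd.size());
        int i = existing.length;
        for (String p : toAdd) {
            combined[i++] = p;
        }
        return combined;
    }

    public static void main(String[] args) {
        String[] combined = combinePaths(new String[] {"/data/a"},
                                         Arrays.asList("/data/b", "/data/c"));
        System.out.println(String.join(",", combined));
    }
}
```

As in the original, a null result from the getter is normalized to an empty array first, so the method is safe to call before any input path has been set.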

Example 3: getInputPaths

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
Path[] getInputPaths(JobConf job) throws IOException {
  Path[] dirs = FileInputFormat.getInputPaths(job);
  if (dirs.length == 0) {
    // on Tez we avoid duplicating the file info in FileInputFormat.
    if (HiveConf.getVar(job, HiveConf.ConfVars.HIVE_EXECUTION_ENGINE).equals("tez")) {
      try {
        List<Path> paths = Utilities.getInputPathsTez(job, mrwork);
        dirs = paths.toArray(new Path[paths.size()]);
      } catch (Exception e) {
        throw new IOException("Could not create input files", e);
      }
    } else {
      throw new IOException("No input paths specified in job");
    }
  }
  return dirs;
}
 
Developer: mini666 | Project: hive-phoenix-handler | Lines: 18 | Source: HiveInputFormat.java

Example 4: getSplits

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
/**
 * Provide the required splits from the specified configuration. By default this
 *   method executes a query (via function execution) on the region with the
 *   `_meta' suffix, so make sure the region name is passed accordingly.
 *
 * @param conf the job configuration
 * @param numSplits the required number of splits
 * @return the required splits to read/write the data
 * @throws IOException if the table does not exist.
 */
public static InputSplit[] getSplits(final JobConf conf, final int numSplits) throws IOException {
  final Path[] tablePaths = FileInputFormat.getInputPaths(conf);
  // initialize cache if not done yet
  final AmpoolClient aClient = MonarchUtils.getConnectionFromConf(conf);
  String tableName = conf.get(MonarchUtils.REGION);
  boolean isFTable = MonarchUtils.isFTable(conf);
  Table table = null;
  if (isFTable) {
    table = aClient.getFTable(tableName);
  } else {
    table = aClient.getMTable(tableName);
  }
  if (table == null) {
    throw new IOException("Table " + tableName + " does not exist.");
  }
  int totalnumberOfSplits = table.getTableDescriptor().getTotalNumOfSplits();
  Map<Integer, Set<ServerLocation>> bucketMap = new HashMap<>(numSplits);
  final AtomicLong start = new AtomicLong(0L);
  MonarchSplit[] splits = MTableUtils
    .getSplitsWithSize(tableName, numSplits, totalnumberOfSplits, bucketMap)
    .stream().map(e -> {
      MonarchSplit ms = convertToSplit(tablePaths, start.get(), e, bucketMap);
      start.addAndGet(e.getSize());
      return ms;
    }).toArray(MonarchSplit[]::new);
  logger.info("numSplits= {}; MonarchSplits= {}", numSplits, Arrays.toString(splits));
  return splits;
}
 
Developer: ampool | Project: monarch | Lines: 39 | Source: MonarchSplit.java

Example 5: initialize

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
@Override
protected void initialize(JobConf job) throws IOException {
  Path[] tableNames = FileInputFormat.getInputPaths(job);
  String colArg = job.get(COLUMN_LIST);
  String[] colNames = colArg.split(" ");
  byte [][] m_cols = new byte[colNames.length][];
  for (int i = 0; i < m_cols.length; i++) {
    m_cols[i] = Bytes.toBytes(colNames[i]);
  }
  setInputColumns(m_cols);
  Connection connection = ConnectionFactory.createConnection(job);
  initializeTable(connection, TableName.valueOf(tableNames[0].getName()));
}
 
Developer: fengchen8086 | Project: ditb | Lines: 14 | Source: TableInputFormat.java

Example 6: getSplits

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
@Override
public InputSplit[] getSplits(JobConf job, int numSplits)
        throws IOException {
    Path[] paths = FileInputFormat.getInputPaths(job);

    return FluentIterable.from(BaseInputFormat.getSplits(job, paths))
            .transform(_fromSplit)
            .toArray(InputSplit.class);
}
 
Developer: bazaarvoice | Project: emodb | Lines: 10 | Source: EmoInputFormat.java

Example 7: getInputPath

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
@Override
protected Path getInputPath(JobConf conf) {
  Path path = null;

  Path[] paths = FileInputFormat.getInputPaths(conf);
  if ((paths != null) && (paths.length > 0)) {
    path = paths[0];
  }

  return path;
}
 
Developer: awslabs | Project: emr-dynamodb-connector | Lines: 12 | Source: HiveDynamoDBSplitGenerator.java

Example 8: getStatistics

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
@Override
public BaseStatistics getStatistics(BaseStatistics cachedStats) throws IOException {
	// only gather base statistics for FileInputFormats
	if (!(mapredInputFormat instanceof FileInputFormat)) {
		return null;
	}

	final FileBaseStatistics cachedFileStats = (cachedStats instanceof FileBaseStatistics) ?
			(FileBaseStatistics) cachedStats : null;

	try {
		final org.apache.hadoop.fs.Path[] paths = FileInputFormat.getInputPaths(this.jobConf);

		return getFileStats(cachedFileStats, paths, new ArrayList<FileStatus>(1));
	} catch (IOException ioex) {
		if (LOG.isWarnEnabled()) {
			LOG.warn("Could not determine statistics due to an io error: "
					+ ioex.getMessage());
		}
	} catch (Throwable t) {
		if (LOG.isErrorEnabled()) {
			LOG.error("Unexpected problem while getting the file statistics: "
					+ t.getMessage(), t);
		}
	}

	// no statistics available
	return null;
}
 
Developer: axbaretto | Project: flink | Lines: 30 | Source: HadoopInputFormatBase.java

Example 9: getStatistics

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
@Override
public BaseStatistics getStatistics(BaseStatistics cachedStats) throws IOException {
	// only gather base statistics for FileInputFormats
	if (!(mapredInputFormat instanceof FileInputFormat)) {
		return null;
	}

	final FileBaseStatistics cachedFileStats = (cachedStats != null && cachedStats instanceof FileBaseStatistics) ?
			(FileBaseStatistics) cachedStats : null;

	try {
		final org.apache.hadoop.fs.Path[] paths = FileInputFormat.getInputPaths(this.jobConf);

		return getFileStats(cachedFileStats, paths, new ArrayList<FileStatus>(1));
	} catch (IOException ioex) {
		if (LOG.isWarnEnabled()) {
			LOG.warn("Could not determine statistics due to an io error: "
					+ ioex.getMessage());
		}
	} catch (Throwable t) {
		if (LOG.isErrorEnabled()) {
			LOG.error("Unexpected problem while getting the file statistics: "
					+ t.getMessage(), t);
		}
	}

	// no statistics available
	return null;
}
 
Developer: axbaretto | Project: flink | Lines: 30 | Source: HadoopInputFormatBase.java

Example 10: testInputPath

import org.apache.hadoop.mapred.FileInputFormat; // import the class the method depends on
public void testInputPath() throws Exception {
  JobConf jobConf = new JobConf();
  Path workingDir = jobConf.getWorkingDirectory();
  
  Path path = new Path(workingDir, 
      "xx{y"+StringUtils.COMMA_STR+"z}");
  FileInputFormat.setInputPaths(jobConf, path);
  Path[] paths = FileInputFormat.getInputPaths(jobConf);
  assertEquals(1, paths.length);
  assertEquals(path.toString(), paths[0].toString());
   
  StringBuilder pathStr = new StringBuilder();
  pathStr.append(StringUtils.ESCAPE_CHAR);
  pathStr.append(StringUtils.ESCAPE_CHAR);
  pathStr.append(StringUtils.COMMA);
  pathStr.append(StringUtils.COMMA);
  pathStr.append('a');
  path = new Path(workingDir, pathStr.toString());
  FileInputFormat.setInputPaths(jobConf, path);
  paths = FileInputFormat.getInputPaths(jobConf);
  assertEquals(1, paths.length);
  assertEquals(path.toString(), paths[0].toString());
    
  pathStr.setLength(0);
  pathStr.append(StringUtils.ESCAPE_CHAR);
  pathStr.append("xx");
  pathStr.append(StringUtils.ESCAPE_CHAR);
  path = new Path(workingDir, pathStr.toString());
  Path path1 = new Path(workingDir,
      "yy"+StringUtils.COMMA_STR+"zz");
  FileInputFormat.setInputPaths(jobConf, path);
  FileInputFormat.addInputPath(jobConf, path1);
  paths = FileInputFormat.getInputPaths(jobConf);
  assertEquals(2, paths.length);
  assertEquals(path.toString(), paths[0].toString());
  assertEquals(path1.toString(), paths[1].toString());

  FileInputFormat.setInputPaths(jobConf, path, path1);
  paths = FileInputFormat.getInputPaths(jobConf);
  assertEquals(2, paths.length);
  assertEquals(path.toString(), paths[0].toString());
  assertEquals(path1.toString(), paths[1].toString());

  Path[] input = new Path[] {path, path1};
  FileInputFormat.setInputPaths(jobConf, input);
  paths = FileInputFormat.getInputPaths(jobConf);
  assertEquals(2, paths.length);
  assertEquals(path.toString(), paths[0].toString());
  assertEquals(path1.toString(), paths[1].toString());
  
  pathStr.setLength(0);
  String str1 = "{a{b,c},de}";
  String str2 = "xyz";
  String str3 = "x{y,z}";
  pathStr.append(str1);
  pathStr.append(StringUtils.COMMA);
  pathStr.append(str2);
  pathStr.append(StringUtils.COMMA);
  pathStr.append(str3);
  FileInputFormat.setInputPaths(jobConf, pathStr.toString());
  paths = FileInputFormat.getInputPaths(jobConf);
  assertEquals(3, paths.length);
  assertEquals(new Path(workingDir, str1).toString(), paths[0].toString());
  assertEquals(new Path(workingDir, str2).toString(), paths[1].toString());
  assertEquals(new Path(workingDir, str3).toString(), paths[2].toString());

  pathStr.setLength(0);
  String str4 = "abc";
  String str5 = "pq{r,s}";
  pathStr.append(str4);
  pathStr.append(StringUtils.COMMA);
  pathStr.append(str5);
  FileInputFormat.addInputPaths(jobConf, pathStr.toString());
  paths = FileInputFormat.getInputPaths(jobConf);
  assertEquals(5, paths.length);
  assertEquals(new Path(workingDir, str1).toString(), paths[0].toString());
  assertEquals(new Path(workingDir, str2).toString(), paths[1].toString());
  assertEquals(new Path(workingDir, str3).toString(), paths[2].toString());
  assertEquals(new Path(workingDir, str4).toString(), paths[3].toString());
  assertEquals(new Path(workingDir, str5).toString(), paths[4].toString());
}
 
Developer: aliyun-beta | Project: aliyun-oss-hadoop-fs | Lines: 82 | Source: TestInputPath.java
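Example 10 exercises the comma handling behind setInputPaths/getInputPaths: the paths are stored in a single comma-separated configuration value, with literal commas protected by an escape character (the test uses Hadoop's `StringUtils.ESCAPE_CHAR`, a backslash). As a minimal sketch, assuming backslash escaping of commas and of the escape character itself, the round trip looks like this (a simplified stand-in, not Hadoop's actual `StringUtils` implementation):

```java
import java.util.ArrayList;
import java.util.List;

public class PathEscapeSketch {
    static final char ESCAPE = '\\';
    static final char COMMA = ',';

    // Escape literal commas and backslashes so a path can sit safely
    // inside a single comma-separated configuration value.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == COMMA || c == ESCAPE) {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Split on unescaped commas and undo the escaping.
    static List<String> split(String joined) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < joined.length(); i++) {
            char c = joined.charAt(i);
            if (c == ESCAPE && i + 1 < joined.length()) {
                cur.append(joined.charAt(++i)); // keep the escaped char literally
            } else if (c == COMMA) {
                out.add(cur.toString());        // unescaped comma: path boundary
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        out.add(cur.toString());
        return out;
    }

    public static void main(String[] args) {
        // "xx{y,z}" contains a literal comma, like the glob in Example 10
        String joined = escape("xx{y,z}") + COMMA + escape("abc");
        System.out.println(split(joined));
    }
}
```

This is why the test builds paths out of `StringUtils.ESCAPE_CHAR` and `StringUtils.COMMA` and then asserts that getInputPaths returns them unchanged: the escape/split round trip must be lossless even for paths containing commas and backslashes.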


Note: the org.apache.hadoop.mapred.FileInputFormat.getInputPaths examples in this article were collected by 純淨天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective authors, and copyright in the source code remains with the original authors. Refer to each project's license before distributing or using the code; do not republish without permission.