Java SequenceFile.Sorter Code Examples

This article collects typical usage examples of org.apache.hadoop.io.SequenceFile.Sorter in Java. If you are wondering how SequenceFile.Sorter is used in practice, the selected examples below may help. You can also look at usage examples for the enclosing class, org.apache.hadoop.io.SequenceFile.


Three code examples of SequenceFile.Sorter are shown below, sorted by popularity by default.

Example 1: sortListing

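import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile; // the class this example depends on
import org.apache.hadoop.io.Text;
// CopyListingFileStatus is defined by the enclosing project.
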
/**
 * Sort a sequence file containing Text keys and CopyListingFileStatus values.
 *
 * @param fs File System
 * @param conf Configuration
 * @param sourceListing Source listing file
 * @return Path of the sorted file: the source path with "_sorted" appended to the name
 * @throws IOException Any exception during sort.
 */
private static Path sortListing(FileSystem fs, Configuration conf, Path sourceListing) throws IOException {
  SequenceFile.Sorter sorter = new SequenceFile.Sorter(fs, Text.class, CopyListingFileStatus.class, conf);
  Path output = new Path(sourceListing.toString() + "_sorted");

  if (fs.exists(output)) {
    fs.delete(output, false);
  }

  sorter.sort(sourceListing, output);
  return output;
}
 
Developer: HotelsDotCom, Project: circus-train, Lines of code: 21, Source: CopyListing.java

Example 2: sortListing

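import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile; // the class this example depends on
import org.apache.hadoop.io.Text;
// CopyListingFileStatus is defined by the enclosing project.
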
/**
 * Sort a sequence file containing Text keys and CopyListingFileStatus values.
 *
 * @param fs - File System
 * @param conf - Configuration
 * @param sourceListing - Source listing file
 * @return Path of the sorted file: the source path with "_sorted" appended to the name
 * @throws IOException - Any exception during sort.
 */
public static Path sortListing(FileSystem fs, Configuration conf, Path sourceListing)
    throws IOException {
  SequenceFile.Sorter sorter = new SequenceFile.Sorter(fs, Text.class,
    CopyListingFileStatus.class, conf);
  Path output = new Path(sourceListing.toString() +  "_sorted");

  if (fs.exists(output)) {
    fs.delete(output, false);
  }

  sorter.sort(sourceListing, output);
  return output;
}
 
Developer: naver, Project: hadoop, Lines of code: 23, Source: DistCpUtils.java

Example 3: deleteNonexisting

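import java.io.IOException;
import java.util.Stack;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.Trash;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.SequenceFile; // the class this example depends on
import org.apache.hadoop.io.SequenceFile.Reader;
import org.apache.hadoop.io.SequenceFile.Writer;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.JobConf;
// Options, makeRelative and isAncestorPath are defined by the enclosing class.
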
/**
 * Delete the files and directories under dst that do not exist in src.
 *
 * @return total count of files and directories deleted from destination
 * @throws IOException if dst is a file, or if a path cannot be deleted
 */
static private long deleteNonexisting(
    FileSystem dstfs, FileStatus dstroot, Path dstsorted,
    FileSystem jobfs, Path jobdir, JobConf jobconf, Configuration conf
    ) throws IOException {
  if (dstroot.isFile()) {
    throw new IOException("dst must be a directory when option "
        + Options.DELETE.cmd + " is set, but dst (= " + dstroot.getPath()
        + ") is not a directory.");
  }

  // write the recursive listing (lsr) of dst to a sequence file
  final Path dstlsr = new Path(jobdir, "_distcp_dst_lsr");
  try (final SequenceFile.Writer writer = SequenceFile.createWriter(jobconf,
      Writer.file(dstlsr), Writer.keyClass(Text.class),
      Writer.valueClass(NullWritable.class), Writer.compression(
      SequenceFile.CompressionType.NONE))) {
    // walk dstroot depth-first with an explicit stack to collect every file status
    final Stack<FileStatus> lsrstack = new Stack<FileStatus>();
    for(lsrstack.push(dstroot); !lsrstack.isEmpty(); ) {
      final FileStatus status = lsrstack.pop();
      if (status.isDirectory()) {
        for(FileStatus child : dstfs.listStatus(status.getPath())) {
          String relative = makeRelative(dstroot.getPath(), child.getPath());
          writer.append(new Text(relative), NullWritable.get());
          lsrstack.push(child);
        }
      }
    }
  }

  //sort lsr results
  final Path sortedlsr = new Path(jobdir, "_distcp_dst_lsr_sorted");
  SequenceFile.Sorter sorter = new SequenceFile.Sorter(jobfs,
      new Text.Comparator(), Text.class, NullWritable.class, jobconf);
  sorter.sort(dstlsr, sortedlsr);

  // compare the lsr list with the dst list
  long deletedPathsCount = 0;
  try (SequenceFile.Reader lsrin =
           new SequenceFile.Reader(jobconf, Reader.file(sortedlsr));
       SequenceFile.Reader  dstin =
           new SequenceFile.Reader(jobconf, Reader.file(dstsorted))) {
    //compare sorted lsr list and sorted dst list
    final Text lsrpath = new Text();
    final Text dstpath = new Text();
    final Text dstfrom = new Text();
    final Trash trash = new Trash(dstfs, conf);
    Path lastpath = null;

    boolean hasnext = dstin.next(dstpath, dstfrom);
    while (lsrin.next(lsrpath, NullWritable.get())) {
      int dst_cmp_lsr = dstpath.compareTo(lsrpath);
      while (hasnext && dst_cmp_lsr < 0) {
        hasnext = dstin.next(dstpath, dstfrom);
        dst_cmp_lsr = dstpath.compareTo(lsrpath);
      }
      
      if (dst_cmp_lsr == 0) {
        //lsrpath exists in dst, skip it
        hasnext = dstin.next(dstpath, dstfrom);
      } else {
        //lsrpath does not exist, delete it
        final Path rmpath = new Path(dstroot.getPath(), lsrpath.toString());
        ++deletedPathsCount;
        if ((lastpath == null || !isAncestorPath(lastpath, rmpath))) {
          if (!(trash.moveToTrash(rmpath) || dstfs.delete(rmpath, true))) {
            throw new IOException("Failed to delete " + rmpath);
          }
          lastpath = rmpath;
        }
      }
    }
  }
  return deletedPathsCount;
}
 
Developer: naver, Project: hadoop, Lines of code: 82, Source: DistCpV1.java
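
The comparison step in Example 3 depends on both inputs being sorted, which is why the listing is passed through SequenceFile.Sorter first. The following is a minimal sketch of that two-cursor comparison using plain Java lists in place of SequenceFile readers. The class name SortedListingDiff and the method onlyInListing are hypothetical and are not part of Hadoop or DistCp. It also assumes natural String ordering, whereas Text compares UTF-8 bytes; the two orderings agree for ASCII paths but can differ otherwise.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hadoop.
class SortedListingDiff {

  /**
   * Returns the entries of sortedListing that do not appear in sortedExpected.
   * Both lists must already be sorted in natural String order.
   */
  static List<String> onlyInListing(List<String> sortedExpected, List<String> sortedListing) {
    List<String> missing = new ArrayList<>();
    int i = 0; // cursor into sortedExpected
    for (String path : sortedListing) {
      // Advance past expected entries that sort before the current listing entry.
      while (i < sortedExpected.size() && sortedExpected.get(i).compareTo(path) < 0) {
        i++;
      }
      if (i < sortedExpected.size() && sortedExpected.get(i).equals(path)) {
        i++; // present on both sides, so it is kept
      } else {
        missing.add(path); // only in the listing: a deletion candidate
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    List<String> expected = List.of("a", "a/x", "c");
    List<String> listing = List.of("a", "a/x", "b", "b/y", "c", "d");
    System.out.println(onlyInListing(expected, listing));
  }
}
```

Because each cursor only moves forward, the comparison makes a single pass over both inputs and never needs to hold either one fully in memory when they are read from files, as in Example 3.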


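Note: The org.apache.hadoop.io.SequenceFile.Sorter examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets are taken from open-source projects, and copyright remains with the original authors; consult each project's License before distributing or using the code. Do not reproduce without permission.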