

Java SequenceFile.Sorter code examples

This article collects typical usage examples of the org.apache.hadoop.io.SequenceFile.Sorter class in Java. They may help if you are wondering how SequenceFile.Sorter is used in practice or are looking for working examples of it. You can also look further into usage examples of its enclosing class, org.apache.hadoop.io.SequenceFile.


Three code examples of SequenceFile.Sorter are shown below, ordered by popularity by default.

Example 1: sortListing

import org.apache.hadoop.io.SequenceFile; // import the package/class this method depends on
/**
 * Sort a sequence file whose keys are Text and whose values are CopyListingFileStatus.
 *
 * @param fs File System
 * @param conf Configuration
 * @param sourceListing Source listing file
 * @return Path of the sorted file: the source path with _sorted appended to the name
 * @throws IOException Any exception during sort.
 */
private static Path sortListing(FileSystem fs, Configuration conf, Path sourceListing) throws IOException {
  SequenceFile.Sorter sorter = new SequenceFile.Sorter(fs, Text.class, CopyListingFileStatus.class, conf);
  Path output = new Path(sourceListing.toString() + "_sorted");

  if (fs.exists(output)) {
    fs.delete(output, false);
  }

  sorter.sort(sourceListing, output);
  return output;
}
 
Developer ID: HotelsDotCom, project: circus-train, lines of code: 21, source: CopyListing.java
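
The pattern in the sortListing method above (derive an output path by appending `_sorted`, remove any stale output, then sort by key) can be tried without a cluster. What follows is a minimal local-filesystem sketch, not Hadoop code. It assumes a plain-text listing with one tab-separated `key<TAB>value` record per line, and the class `LocalSortListing` with its methods `keyOf`, `sortByKey`, and `sortListing` are hypothetical stand-ins for `SequenceFile.Sorter`.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical local analogue of sortListing; it does not use any Hadoop API.
public class LocalSortListing {

  // Returns the key of a "key<TAB>value" record (the whole line if there is no tab).
  public static String keyOf(String line) {
    int tab = line.indexOf('\t');
    return tab < 0 ? line : line.substring(0, tab);
  }

  // Returns a copy of the records ordered by key only; List.sort is stable,
  // so records with equal keys keep their relative order.
  public static List<String> sortByKey(List<String> records) {
    List<String> sorted = new ArrayList<>(records);
    sorted.sort(Comparator.comparing(LocalSortListing::keyOf));
    return sorted;
  }

  // Sorts the records of sourceListing by key and writes them to a sibling
  // file whose name is the source file name with "_sorted" appended.
  public static Path sortListing(Path sourceListing) throws IOException {
    Path output = sourceListing.resolveSibling(sourceListing.getFileName() + "_sorted");
    // Remove stale output from an earlier run, mirroring the exists/delete step above.
    Files.deleteIfExists(output);

    List<String> records = Files.readAllLines(sourceListing, StandardCharsets.UTF_8);
    Files.write(output, sortByKey(records), StandardCharsets.UTF_8);
    return output;
  }

  public static void main(String[] args) throws IOException {
    Path listing = Files.createTempFile("listing", ".txt");
    Files.write(listing, List.of("b/file\t2", "a/file\t1", "c/file\t3"), StandardCharsets.UTF_8);
    Path sorted = sortListing(listing);
    System.out.println(sorted.getFileName());
    System.out.println(Files.readAllLines(sorted, StandardCharsets.UTF_8));
  }
}
```

This sketch reads the whole listing into memory, so it is only reasonable for small inputs; it illustrates the call pattern rather than how `SequenceFile.Sorter` sorts internally.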

Example 2: sortListing

import org.apache.hadoop.io.SequenceFile; // import the package/class this method depends on
/**
 * Sort a sequence file whose keys are Text and whose values are CopyListingFileStatus.
 *
 * @param fs - File System
 * @param conf - Configuration
 * @param sourceListing - Source listing file
 * @return Path of the sorted file: the source path with _sorted appended to the name
 * @throws IOException - Any exception during sort.
 */
public static Path sortListing(FileSystem fs, Configuration conf, Path sourceListing)
    throws IOException {
  SequenceFile.Sorter sorter = new SequenceFile.Sorter(fs, Text.class,
    CopyListingFileStatus.class, conf);
  Path output = new Path(sourceListing.toString() + "_sorted");

  if (fs.exists(output)) {
    fs.delete(output, false);
  }

  sorter.sort(sourceListing, output);
  return output;
}
 
Developer ID: naver, project: hadoop, lines of code: 23, source: DistCpUtils.java

Example 3: deleteNonexisting

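import org.apache.hadoop.io.SequenceFile; // import the package/class this method depends on
import org.apache.hadoop.io.SequenceFile.Reader; // for the unqualified Reader.file(...) calls below
import org.apache.hadoop.io.SequenceFile.Writer; // for the unqualified Writer.file(...) calls below
/**
 * Delete the dst files/dirs which do not exist in src
 *
 * @return total count of files and directories deleted from destination
 * @throws IOException if dst is a file, a path cannot be deleted, or an I/O error occurs
 */
private static long deleteNonexisting(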
    FileSystem dstfs, FileStatus dstroot, Path dstsorted,
    FileSystem jobfs, Path jobdir, JobConf jobconf, Configuration conf
    ) throws IOException {
  if (dstroot.isFile()) {
    throw new IOException("dst must be a directory when option "
        + Options.DELETE.cmd + " is set, but dst (= " + dstroot.getPath()
        + ") is not a directory.");
  }

  //write dst lsr results
  final Path dstlsr = new Path(jobdir, "_distcp_dst_lsr");
  try (final SequenceFile.Writer writer = SequenceFile.createWriter(jobconf,
      Writer.file(dstlsr), Writer.keyClass(Text.class),
      Writer.valueClass(NullWritable.class), Writer.compression(
      SequenceFile.CompressionType.NONE))) {
    //do lsr to get all file statuses in dstroot
    final Stack<FileStatus> lsrstack = new Stack<FileStatus>();
    for(lsrstack.push(dstroot); !lsrstack.isEmpty(); ) {
      final FileStatus status = lsrstack.pop();
      if (status.isDirectory()) {
        for(FileStatus child : dstfs.listStatus(status.getPath())) {
          String relative = makeRelative(dstroot.getPath(), child.getPath());
          writer.append(new Text(relative), NullWritable.get());
          lsrstack.push(child);
        }
      }
    }
  }

  //sort lsr results
  final Path sortedlsr = new Path(jobdir, "_distcp_dst_lsr_sorted");
  SequenceFile.Sorter sorter = new SequenceFile.Sorter(jobfs,
      new Text.Comparator(), Text.class, NullWritable.class, jobconf);
  sorter.sort(dstlsr, sortedlsr);

  //compare lsr list and dst list  
  long deletedPathsCount = 0;
  try (SequenceFile.Reader lsrin =
           new SequenceFile.Reader(jobconf, Reader.file(sortedlsr));
       SequenceFile.Reader  dstin =
           new SequenceFile.Reader(jobconf, Reader.file(dstsorted))) {
    //compare sorted lsr list and sorted dst list
    final Text lsrpath = new Text();
    final Text dstpath = new Text();
    final Text dstfrom = new Text();
    final Trash trash = new Trash(dstfs, conf);
    Path lastpath = null;

    boolean hasnext = dstin.next(dstpath, dstfrom);
    while (lsrin.next(lsrpath, NullWritable.get())) {
      int dst_cmp_lsr = dstpath.compareTo(lsrpath);
      while (hasnext && dst_cmp_lsr < 0) {
        hasnext = dstin.next(dstpath, dstfrom);
        dst_cmp_lsr = dstpath.compareTo(lsrpath);
      }
      
      if (dst_cmp_lsr == 0) {
        //lsrpath exists in dst, skip it
        hasnext = dstin.next(dstpath, dstfrom);
      } else {
        //lsrpath does not exist, delete it
        final Path rmpath = new Path(dstroot.getPath(), lsrpath.toString());
        ++deletedPathsCount;
        if ((lastpath == null || !isAncestorPath(lastpath, rmpath))) {
          if (!(trash.moveToTrash(rmpath) || dstfs.delete(rmpath, true))) {
            throw new IOException("Failed to delete " + rmpath);
          }
          lastpath = rmpath;
        }
      }
    }
  }
  return deletedPathsCount;
}
 
Developer ID: naver, project: hadoop, lines of code: 82, source: DistCpV1.java
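
The core of deleteNonexisting is a merge-style pass over two sorted lists: each path from the destination listing (`lsr`) that has no equal entry in the other sorted list (`dst`) is treated as removable, and a path lying under the most recently removed directory is not removed again. A minimal in-memory sketch of that idea follows. It assumes both inputs are lists of relative path strings sorted in natural String order; the class `SortedListingDiff`, the method `pathsToDelete`, and the string-prefix check `isUnder` are hypothetical simplifications standing in for the sequence-file readers and `isAncestorPath`. Unlike the original, it returns the top-level paths that would be removed instead of a count, and it deletes nothing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical in-memory illustration of the sorted-merge comparison; no Hadoop API is used.
public class SortedListingDiff {

  // Simplified stand-in for isAncestorPath: true if child equals ancestor or lies below it.
  public static boolean isUnder(String ancestor, String child) {
    return child.equals(ancestor) || child.startsWith(ancestor + "/");
  }

  // Returns the entries of sortedLsr that are absent from sortedDst, skipping
  // entries that lie under the most recently selected path.
  // Both lists must be sorted in the same (natural String) order.
  public static List<String> pathsToDelete(List<String> sortedLsr, List<String> sortedDst) {
    List<String> toDelete = new ArrayList<>();
    String last = null; // most recent path selected for removal
    int d = 0;          // cursor into sortedDst
    for (String lsrPath : sortedLsr) {
      // Advance the dst cursor past entries that sort before the current lsr path.
      while (d < sortedDst.size() && sortedDst.get(d).compareTo(lsrPath) < 0) {
        d++;
      }
      if (d < sortedDst.size() && sortedDst.get(d).equals(lsrPath)) {
        d++; // the path is expected at the destination: keep it
      } else if (last == null || !isUnder(last, lsrPath)) {
        toDelete.add(lsrPath); // not expected and not covered by the previous removal
        last = lsrPath;
      }
    }
    return toDelete;
  }

  public static void main(String[] args) {
    List<String> lsr = List.of("a", "a/x", "b", "b/y", "b/z", "c");
    List<String> dst = List.of("a", "a/x", "c");
    System.out.println(pathsToDelete(lsr, dst)); // prints [b]
  }
}
```

Because both lists are sorted, a single forward pass with one cursor per list is enough, which is why the original sorts the `lsr` output with `SequenceFile.Sorter` before comparing. Like the original, the sketch remembers only the most recent removal when skipping descendants.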


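Note: The org.apache.hadoop.io.SequenceFile.Sorter examples in this article were compiled by 純淨天空 from open-source code and documentation platforms such as Github/MSDocs. The code snippets were selected from open-source projects contributed by their respective developers, and copyright in the source code remains with the original authors. For distribution and use, refer to the License of the corresponding project; do not reproduce without permission.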