This article collects typical usage examples of the Java method org.apache.parquet.hadoop.ParquetFileWriter.start. If you are wondering what ParquetFileWriter.start does, how to call it, or what it looks like in real code, the curated examples below may help. You can also learn more about its containing class, org.apache.parquet.hadoop.ParquetFileWriter.
The following shows 2 code examples of ParquetFileWriter.start, sorted by popularity by default.
Example 1: mergeOutput
import org.apache.parquet.hadoop.ParquetFileWriter; // import the class this method depends on
@Override
protected boolean mergeOutput(FileSystem fs, String sourceFolder, String targetFile) {
    try {
        // Collect all part files under the source folder
        FileStatus[] sourceStatuses = FileSystemUtil.listSubFiles(fs, sourceFolder);
        List<Path> sourceFiles = new ArrayList<>();
        for (FileStatus sourceStatus : sourceStatuses) {
            sourceFiles.add(sourceStatus.getPath());
        }
        // Merge the footers of all source files into one schema and metadata set
        FileMetaData mergedMeta = ParquetFileWriter.mergeMetadataFiles(sourceFiles, fs.getConf()).getFileMetaData();
        ParquetFileWriter writer = new ParquetFileWriter(fs.getConf(), mergedMeta.getSchema(), new Path(targetFile),
            ParquetFileWriter.Mode.CREATE);
        writer.start();
        for (Path input : sourceFiles) {
            // appendFile copies row groups as-is, without rewriting them
            writer.appendFile(fs.getConf(), input);
        }
        writer.end(mergedMeta.getKeyValueMetaData());
    } catch (Exception e) {
        // Pass the exception as the last argument so the full stack trace is logged
        LOG.error("Error when merging files in {}.", sourceFolder, e);
        return false;
    }
    return true;
}
Example 2: execute
import org.apache.parquet.hadoop.ParquetFileWriter; // import the class this method depends on
@Override
public void execute(CommandLine options) throws Exception {
    // Prepare arguments: every argument but the last is an input file
    List<String> args = options.getArgList();
    List<Path> inputFiles = getInputFiles(args.subList(0, args.size() - 1));
    Path outputFile = new Path(args.get(args.size() - 1));
    // Merge schema and extra key/value metadata
    FileMetaData mergedMeta = mergedMetadata(inputFiles);
    PrintWriter out = new PrintWriter(Main.out, true);
    // Merge data
    ParquetFileWriter writer = new ParquetFileWriter(conf,
        mergedMeta.getSchema(), outputFile, ParquetFileWriter.Mode.CREATE);
    writer.start();
    boolean tooSmallFilesMerged = false;
    for (Path input : inputFiles) {
        long length = input.getFileSystem(conf).getFileStatus(input).getLen();
        if (length < TOO_SMALL_FILE_THRESHOLD) {
            out.format("Warning: file %s is too small, length: %d%n", input, length);
            tooSmallFilesMerged = true;
        }
        // appendFile copies row groups as-is, without rewriting them
        writer.appendFile(HadoopInputFile.fromPath(input, conf));
    }
    if (tooSmallFilesMerged) {
        out.println("Warning: small files were merged. " +
            "Although the merged file is larger, it still contains small row groups, " +
            "so you do not get the benefit of large row groups, " +
            "which usually leads to bad query performance!");
    }
    writer.end(mergedMeta.getKeyValueMetaData());
}