This article collects typical usage examples of the Java class org.apache.parquet.avro.AvroParquetInputFormat.
The AvroParquetInputFormat class belongs to the org.apache.parquet.avro package. Two code examples of the class are shown below.
Example 1: ParquetHdfsFileSource
import org.apache.parquet.avro.AvroParquetInputFormat; // import the required package/class
private ParquetHdfsFileSource(UgiDoAs doAs, String filepattern, LazyAvroCoder<IndexedRecord> lac,
        ExtraHadoopConfiguration extraConfig, SerializableSplit serializableSplit) {
    // Read Parquet through AvroParquetInputFormat: keys are Void, values are Avro IndexedRecords.
    super(doAs, filepattern, (Class) AvroParquetInputFormat.class, Void.class, IndexedRecord.class, extraConfig,
            serializableSplit);
    this.lac = lac;
    // Keys carry no data, so they use VoidCoder; values use the lazy Avro coder.
    setDefaultCoder(VoidCoder.of(), (LazyAvroCoder) lac);
}
Example 2: getADAMReads
import org.apache.parquet.avro.AvroParquetInputFormat; // import the required package/class
/**
 * Loads ADAM reads stored as Parquet.
 * @param inputPath path to the Parquet data
 * @param traversalParameters parameters used to filter the reads (passed to samRecordOverlaps)
 * @param header SAM file header for the reads; may be null
 * @return RDD of (ADAM-backed) GATKReads from the file.
 */
public JavaRDD<GATKRead> getADAMReads(final String inputPath, final TraversalParameters traversalParameters, final SAMFileHeader header) throws IOException {
    Job job = Job.getInstance(ctx.hadoopConfiguration());
    // Read the Parquet data using the AlignmentRecord Avro schema.
    AvroParquetInputFormat.setAvroReadSchema(job, AlignmentRecord.getClassSchema());
    // Broadcast the header (possibly null) for use inside the map below.
    Broadcast<SAMFileHeader> bHeader;
    if (header == null) {
        bHeader = ctx.broadcast(null);
    } else {
        bHeader = ctx.broadcast(header);
    }
    // Keys are Void for this input format, so only the values are kept.
    @SuppressWarnings("unchecked")
    JavaRDD<AlignmentRecord> recordsRdd = ctx.newAPIHadoopFile(
            inputPath, AvroParquetInputFormat.class, Void.class, AlignmentRecord.class, job.getConfiguration())
            .values();
    // Wrap each ADAM record as a GATKRead, then keep only the reads accepted by samRecordOverlaps.
    JavaRDD<GATKRead> readsRdd = recordsRdd.map(record -> new BDGAlignmentRecordToGATKReadAdapter(record, bHeader.getValue()));
    JavaRDD<GATKRead> filteredRdd = readsRdd.filter(record -> samRecordOverlaps(record.convertToSAMRecord(header), traversalParameters));
    return putPairsInSamePartition(header, filteredRdd);
}