This page collects typical usage examples of the Java class org.apache.parquet.avro.AvroParquetInputFormat, for readers who want to see how the class is used in practice.
AvroParquetInputFormat belongs to the org.apache.parquet.avro package. Two code examples are shown below, ordered by popularity by default.
Example 1: ParquetHdfsFileSource
import org.apache.parquet.avro.AvroParquetInputFormat; // import the required package/class
private ParquetHdfsFileSource(UgiDoAs doAs, String filepattern, LazyAvroCoder<IndexedRecord> lac,
        ExtraHadoopConfiguration extraConfig, SerializableSplit serializableSplit) {
    // AvroParquetInputFormat is generic, so its class literal is passed through a raw (unchecked) cast.
    super(doAs, filepattern, (Class) AvroParquetInputFormat.class, Void.class, IndexedRecord.class, extraConfig,
            serializableSplit);
    this.lac = lac;
    // Keys are Void; values are Avro IndexedRecords encoded with the lazy coder.
    setDefaultCoder(VoidCoder.of(), (LazyAvroCoder) lac);
}
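The raw `(Class)` cast in the constructor above is probably there because `AvroParquetInputFormat` is itself a generic class: its class literal has a raw type argument, which a parameter such as `Class<? extends SomeFormat<K, V>>` does not accept without an unchecked conversion. The sketch below reproduces that situation with the JDK alone. `InputFormat`, `RecordFormat`, and `describe` are hypothetical stand-ins invented for illustration, not Hadoop, Parquet, or Beam types, and the real superclass constructor signature is not shown in the source.

```java
final class ClassTokenCastDemo {

    /** Hypothetical stand-in for an input format with key type K and value type V. */
    interface InputFormat<K, V> {
    }

    /** Hypothetical generic format, similar in shape to a format parameterized by its record type. */
    static class RecordFormat<T> implements InputFormat<Void, T> {
    }

    /** Accepts a class token whose type argument must be a parameterized InputFormat subtype. */
    static <K, V> String describe(Class<? extends InputFormat<K, V>> formatClass,
                                  Class<K> keyClass, Class<V> valueClass) {
        return formatClass.getSimpleName() + "<" + keyClass.getSimpleName()
                + ", " + valueClass.getSimpleName() + ">";
    }

    @SuppressWarnings({"unchecked", "rawtypes"})
    static String describeRecordFormat() {
        // Passing RecordFormat.class directly is expected to be rejected by the compiler,
        // because Class<RecordFormat> (raw type argument) is not a
        // Class<? extends InputFormat<Void, String>>. The raw cast compiles with an
        // unchecked warning instead, which is suppressed here.
        return describe((Class) RecordFormat.class, Void.class, String.class);
    }

    public static void main(String[] args) {
        System.out.println(describeRecordFormat()); // prints "RecordFormat<Void, String>"
    }
}
```

The cast trades compile-time checking for convenience, so the key and value class tokens passed alongside it are the only thing keeping the types consistent.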
Example 2: getADAMReads
import org.apache.parquet.avro.AvroParquetInputFormat; // import the required package/class
/**
 * Loads ADAM reads stored as Parquet.
 * @param inputPath path to the Parquet data
 * @param traversalParameters parameters used to keep only the overlapping reads
 * @param header SAM file header for the reads; may be null
 * @return RDD of (ADAM-backed) GATKReads from the file.
 */
public JavaRDD<GATKRead> getADAMReads(final String inputPath, final TraversalParameters traversalParameters, final SAMFileHeader header) throws IOException {
    Job job = Job.getInstance(ctx.hadoopConfiguration());
    // Tell Parquet which Avro schema to materialize records with.
    AvroParquetInputFormat.setAvroReadSchema(job, AlignmentRecord.getClassSchema());
    // Broadcast the header (possibly null) so each executor receives it once.
    Broadcast<SAMFileHeader> bHeader;
    if (header == null) {
        bHeader = ctx.broadcast(null);
    } else {
        bHeader = ctx.broadcast(header);
    }
    // Keys are Void for this input format, so only the values are kept.
    @SuppressWarnings("unchecked")
    JavaRDD<AlignmentRecord> recordsRdd = ctx.newAPIHadoopFile(
            inputPath, AvroParquetInputFormat.class, Void.class, AlignmentRecord.class, job.getConfiguration())
            .values();
    JavaRDD<GATKRead> readsRdd = recordsRdd.map(record -> new BDGAlignmentRecordToGATKReadAdapter(record, bHeader.getValue()));
    JavaRDD<GATKRead> filteredRdd = readsRdd.filter(record -> samRecordOverlaps(record.convertToSAMRecord(header), traversalParameters));
    return putPairsInSamePartition(header, filteredRdd);
}
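The filter step in the method above keeps only reads that overlap the requested traversal regions. The source does not show `samRecordOverlaps`, so the following is only a minimal sketch of the kind of closed-interval overlap test such a filter might apply. `Interval` and `keepOverlapping` are hypothetical names using only the JDK; they are not GATK, htsjdk, or Spark types, and the sketch makes no claim about how GATK handles unmapped reads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class OverlapFilterSketch {

    /** Hypothetical closed interval [start, end] on a named contig. */
    static final class Interval {
        final String contig;
        final int start;
        final int end;

        Interval(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }

        /** Two closed intervals overlap when they share a contig and neither ends before the other starts. */
        boolean overlaps(Interval other) {
            return contig.equals(other.contig) && start <= other.end && other.start <= end;
        }
    }

    /** Keeps each read that overlaps at least one target interval, preserving input order. */
    static List<Interval> keepOverlapping(List<Interval> reads, List<Interval> targets) {
        List<Interval> kept = new ArrayList<>();
        for (Interval read : reads) {
            for (Interval target : targets) {
                if (read.overlaps(target)) {
                    kept.add(read);
                    break; // one overlapping target is enough
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Interval> reads = Arrays.asList(
                new Interval("chr1", 100, 150),
                new Interval("chr1", 500, 550),
                new Interval("chr2", 120, 170));
        List<Interval> targets = Arrays.asList(new Interval("chr1", 140, 200));
        System.out.println(keepOverlapping(reads, targets).size()); // prints 1
    }
}
```

In a Spark job the same predicate would run inside `filter`, so it should depend only on serializable values captured by the lambda.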