

Java BucketUtils.isHadoopUrl Method Code Examples

This article collects typical usage examples of the Java method org.broadinstitute.hellbender.utils.gcs.BucketUtils.isHadoopUrl. If you are unsure what BucketUtils.isHadoopUrl does or how to call it, the selected code examples below should help. You can also explore further usage examples of the containing class, org.broadinstitute.hellbender.utils.gcs.BucketUtils.


The six code examples below show BucketUtils.isHadoopUrl in use, ordered by popularity by default.
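
The common thread across all six examples is scheme-based dispatch: BucketUtils.isCloudStorageUrl and BucketUtils.isHadoopUrl decide whether a path points to Google Cloud Storage, HDFS, or the local filesystem, and the caller picks a different reader or writer accordingly. The following is a minimal sketch of that pattern; only the two BucketUtils calls come from the examples below, while the surrounding class, paths, and messages are illustrative.

import org.broadinstitute.hellbender.utils.gcs.BucketUtils;

public final class PathDispatchSketch {
    // Hypothetical helper: classify a path by scheme the same way the examples below do.
    public static String describe(final String path) {
        if (BucketUtils.isCloudStorageUrl(path)) {
            return "Google Cloud Storage path (gs://...)";
        } else if (BucketUtils.isHadoopUrl(path)) {
            return "Hadoop/HDFS path (hdfs://...)";
        } else {
            return "local file path";
        }
    }

    public static void main(String[] args) {
        System.out.println(describe("hdfs://namenode/data/sample.bam")); // Hadoop/HDFS path
        System.out.println(describe("gs://my-bucket/sample.bam"));       // Google Cloud Storage path
        System.out.println(describe("/tmp/sample.bam"));                 // local file path
    }
}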

Example 1: ReadsDataflowSource

import org.broadinstitute.hellbender.utils.gcs.BucketUtils; // import the class that provides this method
/**
 * @param bam a local file path or a Google Cloud Storage (gs://) identifier of the BAM file to read
 * @param p the pipeline object for the job; this is needed to read a BAM from a bucket.
 *          The options inside the pipeline MUST BE GCSOptions (to get the secret file).
 */
public ReadsDataflowSource(String bam, Pipeline p){
    this.bam = Utils.nonNull(bam);
    this.pipeline = p;

    cloudStorageUrl = BucketUtils.isCloudStorageUrl(bam);
    hadoopUrl = BucketUtils.isHadoopUrl(bam);
    if(cloudStorageUrl) {
        // The options used to create the pipeline must be GCSOptions to get the secret file.
        try {
            options = p.getOptions().as(GCSOptions.class);
        } catch (ClassCastException e) {
            throw new GATKException("The pipeline options was not GCSOptions.", e);
        }
        GenomicsOptions.Methods.validateOptions(options);
        auth = getAuth(options);
    }
}
 
Developer: broadinstitute | Project: gatk-dataflow | Lines: 23 | Source: ReadsDataflowSource.java
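
A hypothetical call site for this constructor, based on the pipeline setup shown in Example 3; the bucket path is a placeholder.

// Create a pipeline whose options ARE GCSOptions, as the constructor requires for gs:// inputs.
final GCSOptions options = PipelineOptionsFactory.as(GCSOptions.class);
final Pipeline pipeline = Pipeline.create(options);
final ReadsDataflowSource source = new ReadsDataflowSource("gs://my-bucket/sample.bam", pipeline);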

Example 2: writeToFile

import org.broadinstitute.hellbender.utils.gcs.BucketUtils; // import the class that provides this method
/**
 * Takes a few reads and writes them to a BAM file.
 * The reads don't have to be sorted initially; the resulting BAM file will be sorted.
 * All the reads must fit into a single worker's memory, so this won't go well if there are too many.
 *
 * @param pipeline the pipeline to add this operation to.
 * @param reads  the reads to write (they don't need to be sorted).
 * @param header the header that corresponds to the reads.
 * @param destPath the GCS or local path to write to (must start with "gs://" if writing to GCS).
 * @param parquet whether to write out BAM or Parquet data (BDG AlignmentRecords); only applies when writing to Hadoop
 */
public static void writeToFile(
        Pipeline pipeline, PCollection<GATKRead> reads, final SAMFileHeader header, final String destPath,
        final boolean parquet) {
    if ( BucketUtils.isHadoopUrl(destPath) ||
            pipeline.getRunner().getClass().equals(SparkPipelineRunner.class)) {
        writeToHadoop(pipeline, reads, header, destPath, parquet);
    } else {
        PCollectionView<Iterable<GATKRead>> iterableView =
                reads.apply(View.<GATKRead>asIterable());

        PCollection<String> dummy = pipeline.apply("output file name", Create.<String>of(destPath));

        dummy.apply(ParDo.named("save to BAM file")
                        .withSideInputs(iterableView)
                        .of(new SaveToBAMFile(header, iterableView))
        );
    }
}
 
Developer: broadinstitute | Project: gatk-dataflow | Lines: 30 | Source: SmallBamWriter.java
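
A hypothetical caller of writeToFile might look like the sketch below; the variable names and the hdfs:// destination are placeholders, and SmallBamWriter is the class named in the source attribution above. Because the destination is an hdfs:// URL, BucketUtils.isHadoopUrl(destPath) returns true and the call routes through writeToHadoop.

// Sketch only: pipeline, reads, and header are assumed to come from earlier pipeline setup.
static void saveSmallBam(final Pipeline pipeline, final PCollection<GATKRead> reads, final SAMFileHeader header) {
    // hdfs:// destination -> isHadoopUrl(destPath) is true -> writeToHadoop branch; parquet = false writes BAM.
    SmallBamWriter.writeToFile(pipeline, reads, header, "hdfs://namenode/output/small.bam", false);
}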

Example 3: setupPipeline

import org.broadinstitute.hellbender.utils.gcs.BucketUtils; // import the class that provides this method
private Pipeline setupPipeline(final String inputPath, final String outputPath, boolean enableGcs, boolean enableCloudExec) {
    final GATKGCSOptions options = PipelineOptionsFactory.as(GATKGCSOptions.class);
    if (enableCloudExec) {
        options.setStagingLocation(getGCPTestStaging());
        options.setProject(getGCPTestProject());
        options.setRunner(BlockingDataflowPipelineRunner.class);
    } else if (BucketUtils.isHadoopUrl(inputPath) || BucketUtils.isHadoopUrl(outputPath)) {
        options.setRunner(SparkPipelineRunner.class);
    } else {
        options.setRunner(DirectPipelineRunner.class);
    }
    if (enableGcs) {
        options.setApiKey(getGCPTestApiKey());
    }
    final Pipeline p = Pipeline.create(options);
    DataflowUtils.registerGATKCoders(p);
    return p;
}
 
Developer: broadinstitute | Project: gatk-dataflow | Lines: 19 | Source: SmallBamWriterTest.java

Example 4: ReferenceMultiSource

import org.broadinstitute.hellbender.utils.gcs.BucketUtils; // import the class that provides this method
/**
 * @param referenceURL the name of the reference (if using the Google Genomics API), or a path to the reference file
 * @param referenceWindowFunction the custom reference window function used to map reads to desired reference bases
 */
public ReferenceMultiSource(final String referenceURL,
                            final SerializableFunction<GATKRead, SimpleInterval> referenceWindowFunction) {
    Utils.nonNull(referenceWindowFunction);
    if (ReferenceTwoBitSource.isTwoBit(referenceURL)) {
        try {
            referenceSource = new ReferenceTwoBitSource(referenceURL);
        } catch (IOException e) {
            throw new UserException("Failed to create a ReferenceTwoBitSource object: " + e.getMessage());
        }
    } else if (isFasta(referenceURL)) {
        if (BucketUtils.isHadoopUrl(referenceURL)) {
            referenceSource = new ReferenceHadoopSource(referenceURL);
        } else {
            referenceSource = new ReferenceFileSource(referenceURL);
        }
    } else { // use the Google Genomics API
        referenceSource = new ReferenceAPISource(referenceURL);
    }
    this.referenceWindowFunction = referenceWindowFunction;
}
 
Developer: broadinstitute | Project: gatk | Lines: 25 | Source: ReferenceMultiSource.java
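
A hypothetical construction of ReferenceMultiSource with an HDFS FASTA path is sketched below. The lambda window function, which maps each read to exactly the bases it spans via SimpleInterval's Locatable-based constructor, is an assumption about the GATK utility classes and is not shown in the example above.

// Sketch only: an hdfs:// FASTA path is routed to ReferenceHadoopSource by BucketUtils.isHadoopUrl.
final ReferenceMultiSource reference = new ReferenceMultiSource(
        "hdfs://namenode/ref/hg38.fasta",
        read -> new SimpleInterval(read)); // assumed identity-style reference window function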

Example 5: SubdivideAndFillReadsIterator

import org.broadinstitute.hellbender.utils.gcs.BucketUtils; // import the class that provides this method
public SubdivideAndFillReadsIterator(String bam, int outputShardSize, int margin, final ReadFilter optFilter, ContextShard shard) throws IOException, GeneralSecurityException, ClassNotFoundException {
    this.bam = bam;
    this.shard = shard;
    this.optFilter = optFilter;
    // it's OK if this goes beyond the contig boundaries.
    lastValidPos = shard.interval.getEnd() + margin;
    firstValidPos = Math.max(shard.interval.getStart() - margin, 1);
    ArrayList<SimpleInterval> ints = new ArrayList<>();
    ints.add(shard.interval);
    subshards = IntervalUtils.cutToShards(ints, outputShardSize);
    currentSubShardIndex = 0;
    currentSubShard = subshards.get(currentSubShardIndex);

    if (BucketUtils.isCloudStorageUrl(bam)) {
        reader = SamReaderFactory.make()
            .validationStringency(ValidationStringency.SILENT)
            .open(IOUtils.getPath(bam));
    } else if (BucketUtils.isHadoopUrl(bam)) {
        throw new RuntimeException("Sorry, Hadoop paths aren't yet supported");
    } else {
        // read from local file (this only makes sense if every worker sees the same thing, e.g. if we're running locally)
        reader = SamReaderFactory.make().validationStringency(ValidationStringency.SILENT).open(new File(bam));
    }
    query = reader.queryOverlapping(shard.interval.getContig(), shard.interval.getStart(), shard.interval.getEnd());

}
 
Developer: broadinstitute | Project: gatk | Lines: 27 | Source: AddContextDataToReadSparkOptimized.java

Example 6: serializeSingleObject

import org.broadinstitute.hellbender.utils.gcs.BucketUtils; // import the class that provides this method
/**
 * Serializes the collection's single object to the specified file.
 *
 * Of course if you run on the cloud and specify a local path, the file will be saved
 * on a cloud worker, which may not be very useful.
 *
 * @param collection A collection with a single serializable object to save.
 * @param fname the name of the destination, starting with "gs://" to save to GCS, or "hdfs://" to save to HDFS.
 * @return SaveDestination.CLOUD if saved to GCS, SaveDestination.HDFS if saved to HDFS,
 * SaveDestination.LOCAL_DISK otherwise.
 */
public static <T> SaveDestination serializeSingleObject(PCollection<T> collection, String fname) {
    if ( BucketUtils.isCloudStorageUrl(fname)) {
        saveSingleResultToRemoteStorage(collection, fname);
        return SaveDestination.CLOUD;
    } else if (BucketUtils.isHadoopUrl(fname)) {
        saveSingleResultToRemoteStorage(collection, fname);
        return SaveDestination.HDFS;
    } else {
        saveSingleResultToLocalDisk(collection, fname);
        return SaveDestination.LOCAL_DISK;
    }
}
 
Developer: broadinstitute | Project: gatk-dataflow | Lines: 24 | Source: DataflowUtils.java
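
A hypothetical caller could branch on the returned SaveDestination; DataflowUtils is the class named in the source attribution above, and the collection and destination path are placeholders.

// Sketch only: an hdfs:// destination makes isHadoopUrl(fname) true, so HDFS is returned.
final SaveDestination dest = DataflowUtils.serializeSingleObject(metricsCollection, "hdfs://namenode/out/metrics.bin");
switch (dest) {
    case CLOUD:      System.out.println("Result saved to GCS");                   break;
    case HDFS:       System.out.println("Result saved to HDFS");                  break;
    case LOCAL_DISK: System.out.println("Result saved on a worker's local disk"); break;
}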


Note: The org.broadinstitute.hellbender.utils.gcs.BucketUtils.isHadoopUrl examples in this article were compiled by 纯净天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The code snippets were selected from open-source projects contributed by many developers; copyright remains with the original authors, and any use or redistribution should follow the corresponding project's license. Do not reproduce without permission.