本文整理汇总了Python中CGAT.BamTools.estimateTagSize方法的典型用法代码示例。如果您正苦于以下问题:Python BamTools.estimateTagSize方法的具体用法?Python BamTools.estimateTagSize怎么用?Python BamTools.estimateTagSize使用的例子?那么, 这里精选的方法代码示例或许可以为您提供帮助。您也可以进一步了解该方法所在类CGAT.BamTools
的用法示例。
在下文中一共展示了BamTools.estimateTagSize方法的1个代码示例,这些例子默认根据受欢迎程度排序。您可以为喜欢或者感觉有用的代码点赞,您的评价将有助于系统推荐出更棒的Python代码示例。
示例1: loadZinba
# 需要导入模块: from CGAT import BamTools [as 别名]
# 或者: from CGAT.BamTools import estimateTagSize [as 别名]
def loadZinba(infile, outfile, bamfile,
tablename=None,
controlfile=None):
'''load Zinba results in *tablename*
This method loads only positive peaks. It filters peaks by p-value,
q-value and fold change and loads the diagnostic data and
re-calculates peakcenter, peakval, ... using the supplied bamfile.
If *tablename* is not given, it will be :file:`<track>_intervals`
where track is derived from ``infile`` and assumed to end
in :file:`.zinba`.
If no peaks were predicted, an empty table is created.
This method creates :file:`<outfile>.tsv.gz` with the results
of the filtering.
This method uses the refined peak locations.
Zinba peaks can be overlapping. This method does not merge
overlapping intervals.
Zinba calls peaks in regions where there are many reads inside
the control. Thus this method applies a filtering step
removing all intervals in which there is a peak of
more than readlength / 2 height in the control.
.. note:
Zinba calls peaks that are overlapping.
'''
track = P.snip(os.path.basename(infile), ".zinba")
folder = os.path.dirname(infile)
infilename = infile + ".peaks"
outtemp = P.getTempFile(".")
tmpfilename = outtemp.name
outtemp.write("\t".join((
"interval_id",
"contig", "start", "end",
"npeaks", "peakcenter",
"length",
"avgval",
"peakval",
"nprobes",
"pvalue", "fold", "qvalue",
"macs_summit", "macs_nprobes",
)) + "\n")
counter = E.Counter()
if not os.path.exists(infilename):
E.warn("could not find %s" % infilename)
elif IOTools.isEmpty(infilename):
E.warn("no data in %s" % infilename)
else:
# filter peaks
shift = getPeakShiftFromZinba(infile)
assert shift is not None, \
"could not determine peak shift from Zinba file %s" % infile
E.info("%s: found peak shift of %i" % (track, shift))
samfiles = [pysam.Samfile(bamfile, "rb")]
offsets = [shift / 2]
if controlfile:
controlfiles = [pysam.Samfile(controlfile, "rb")]
readlength = BamTools.estimateTagSize(controlfile)
control_max_peakval = readlength // 2
E.info("removing intervals in which control has peak higher than %i reads" %
control_max_peakval)
else:
controlfiles = None
id = 0
# get thresholds
max_qvalue = float(PARAMS["zinba_fdr_threshold"])
with IOTools.openFile(infilename, "r") as ins:
for peak in WrapperZinba.iteratePeaks(ins):
# filter by qvalue
if peak.fdr > max_qvalue:
counter.removed_qvalue += 1
continue
assert peak.refined_start < peak.refined_end
# filter by control
if controlfiles:
npeaks, peakcenter, length, avgval, peakval, nreads = countPeaks(peak.contig,
peak.refined_start,
peak.refined_end,
#.........这里部分代码省略.........