

Python Helper.getDomainsFromTsv Method Code Examples

This article collects typical usage examples of the Python method Helper.getDomainsFromTsv. If you are wondering how Helper.getDomainsFromTsv is called in practice, or are looking for working examples of it, the selected snippets below may help.


Two code examples of Helper.getDomainsFromTsv are shown below, sorted by popularity by default.

Example 1: initGeneLevelProteins

# Required import: import Helper [as alias]
# Or: from Helper import getDomainsFromTsv [as alias]
import ConfigParser  # Python 2 module name; provides RawConfigParser used below
    def initGeneLevelProteins(filename, tsvfileA, tsvfileB, useDomains):
        proteinsA = {}
        proteinsB = {}
        orthologGroups = {}
        groupsStarted = False

        rcp = ConfigParser.RawConfigParser()
        rcp.read("orthology.cfg")
        cutoff = rcp.getint("Options", "domainlengthcutoff")

        if useDomains:
            domainsA, shortA = Helper.getDomainsFromTsv(tsvfileA, cutoff)
            domainsB, shortB = Helper.getDomainsFromTsv(tsvfileB, cutoff)
        handle = open(filename, "r")
        ort = None

        lineStarts = ["Group", "Score", "Boots", "_____"]
        for line in handle.readlines():
            if groupsStarted:
                if line[0:5] not in lineStarts:
                    hasA = not line.startswith(" ")
                    splittedLine = line.split()
                    temp = ort.getBasicProteins(splittedLine)

                    if hasA:
                        temp[0].__class__ = GeneLevelProtein
                        proteinsA[temp[0].accession] = temp[0]
                        if useDomains:
                            temp[0].domains = domainsA[temp[0].accession]
                        score = float(splittedLine[1].split("%")[0])
                        ort.inparalogsA[temp[0].accession] = score

                    if not hasA or len(temp) > 1:
                        temp[-1].__class__ = GeneLevelProtein
                        proteinsB[temp[-1].accession] = temp[-1]
                        if useDomains:
                            temp[-1].domains = domainsB[temp[-1].accession]
                        score = float(splittedLine[-1].split("%")[0])
                        ort.inparalogsB[temp[-1].accession] = score

                elif line.startswith("Group"):
                    ort = OrthologyGroup.getBasicOrthologyGroup(line, True, orthologGroups)

                elif line.startswith("Bootstrap"):
                    ort.addSeeds(line)

            else:
                if line.startswith("_"):
                    groupsStarted = True

        pairsCount = 0
        for g in orthologGroups:
            pairsCount += len(orthologGroups[g].inparalogsA) * len(orthologGroups[g].inparalogsB)

        # Parenthesized single-argument form prints the same text on Python 2 and 3
        print("%d should be the amount of pairs" % pairsCount)
        print("%d ortholog groups read from the file" % len(orthologGroups))
        handle.close()
        if useDomains:
            return proteinsA, proteinsB, orthologGroups, shortA, shortB
        else:
            return proteinsA, proteinsB, orthologGroups
Developer: expectopatronum, Project: orth-scripts, Lines of code: 64, Source: GeneLevelProtein.py
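
Example 1 only shows the call site: `Helper.getDomainsFromTsv(path, cutoff)` returns a pair, the first element is indexed by protein accession, and the cutoff comes from the `domainlengthcutoff` option. The implementation of the helper itself is not part of this page. The following is a minimal Python 3 sketch of what a function with that shape might do, not the project's actual code. The function name `get_domains_from_tsv`, the column layout (accession, domain name, start, end), the tuple representation of a domain, and the meaning of the second return value (domains shorter than the cutoff) are all assumptions made for illustration. Unlike the original, it takes an open file handle instead of a path so that the example is self-contained.

```python
import csv
import io
from collections import defaultdict


def get_domains_from_tsv(handle, cutoff, acc_col=0, name_col=1, start_col=2, end_col=3):
    """Hypothetical stand-in for Helper.getDomainsFromTsv.

    Reads tab-separated rows and groups domains by protein accession.
    Domains shorter than `cutoff` residues are collected separately.
    The column indices are assumptions, not the project's real layout.
    """
    domains = defaultdict(list)  # accession -> domains at or above the cutoff
    short = defaultdict(list)    # accession -> domains below the cutoff
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        accession = row[acc_col]
        start, end = int(row[start_col]), int(row[end_col])
        length = end - start + 1  # inclusive coordinates assumed
        entry = (row[name_col], start, end)
        if length < cutoff:
            short[accession].append(entry)
        else:
            domains[accession].append(entry)
    return domains, short


# Made-up sample data in the assumed four-column layout
sample = io.StringIO(
    "P12345\tPF00001\t10\t120\n"
    "P12345\tPF00002\t130\t140\n"
    "Q67890\tPF00003\t5\t80\n"
)
domains, short = get_domains_from_tsv(sample, cutoff=30)
print(dict(domains))
print(dict(short))
```

A `defaultdict` is used here so that looking up an accession without any domains yields an empty list rather than a `KeyError`. Example 1 indexes `domainsA` with every accession it reads, but whether the real helper returns a plain dictionary or something more forgiving cannot be told from the snippet.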

Example 2: in

# Required import: import Helper [as alias]
# Or: from Helper import getDomainsFromTsv [as alias]
        if opt in ("-t", "--tsvfile"):
            tsvfile = arg
        elif opt in ("-o", "--ofile"):
            outputfile = arg
        elif opt in ("-p", "--proteomefile"):
            proteomefile = arg
        else:
            ok = False
    if ok:
        
        tsvpath = rcp.get("Filepaths", "tsvpath")
        outpath = rcp.get("Filepaths", "retrieveddomainspath")
        cutoff = rcp.getint("Options", "domainlengthcutoff")
        idlistpath = rcp.get("Filepaths", "idlistpath")
        idlistsuffix = rcp.get("Fileendings", "idlistsuffix")
        domains, short = Helper.getDomainsFromTsv(tsvpath+tsvfile, cutoff)
        output = getDomainsForSequences(domains, idlistpath+proteomefile + idlistsuffix)
        with open(outpath+outputfile, 'w') as outFile:
            outFile.write(str(output))
    else:
        print(__doc__)
        sys.exit(0)

def getDomainsForSequences(domains, proteomefile):   
    handle = open(proteomefile, "rU")
    result = ''
    accessions = []
    for record in handle.readlines():
        accessions.append(record.split('\n')[0])
Developer: expectopatronum, Project: orth-scripts, Lines of code: 32, Source: getDomainSequencesForProteome.py
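
Example 2 starts in the middle of an option-parsing loop and uses a config parser `rcp` whose setup is not shown. The sketch below, in Python 3, illustrates how the option flags and config keys that do appear in the snippet could fit together. The option names and the section and key names are taken from the snippet. Everything else is an assumption: the `getopt` short-option string `"t:o:p:"`, the helper names `parse_args` and `load_settings`, and all file names and config values, which are invented for illustration. The config is read from a string here rather than from a file on disk.

```python
import configparser
import getopt

# Invented values; only the section and key names come from the snippet
CONFIG_TEXT = """
[Filepaths]
tsvpath = data/tsv/
retrieveddomainspath = out/
idlistpath = data/ids/

[Fileendings]
idlistsuffix = .ids

[Options]
domainlengthcutoff = 30
"""


def parse_args(argv):
    """Return (tsvfile, outputfile, proteomefile), or None if arguments are invalid."""
    try:
        opts, _ = getopt.getopt(argv, "t:o:p:", ["tsvfile=", "ofile=", "proteomefile="])
    except getopt.GetoptError:
        return None
    values = {}
    for opt, arg in opts:
        if opt in ("-t", "--tsvfile"):
            values["tsvfile"] = arg
        elif opt in ("-o", "--ofile"):
            values["outputfile"] = arg
        elif opt in ("-p", "--proteomefile"):
            values["proteomefile"] = arg
    if len(values) != 3:
        return None  # all three options are treated as required in this sketch
    return values["tsvfile"], values["outputfile"], values["proteomefile"]


def load_settings(text):
    """Read the config keys that Example 2 looks up."""
    rcp = configparser.RawConfigParser()
    rcp.read_string(text)
    return {
        "tsvpath": rcp.get("Filepaths", "tsvpath"),
        "outpath": rcp.get("Filepaths", "retrieveddomainspath"),
        "idlistpath": rcp.get("Filepaths", "idlistpath"),
        "idlistsuffix": rcp.get("Fileendings", "idlistsuffix"),
        "cutoff": rcp.getint("Options", "domainlengthcutoff"),
    }


args = parse_args(["-t", "human.tsv", "-o", "human_domains.txt", "-p", "human"])
settings = load_settings(CONFIG_TEXT)
# Paths are built by plain string concatenation, as in the snippet
tsv_input = settings["tsvpath"] + args[0]
id_list = settings["idlistpath"] + args[2] + settings["idlistsuffix"]
print(tsv_input, id_list, settings["cutoff"])
```

Because the paths are joined by concatenation, the directory values in the config need a trailing separator. `os.path.join` would remove that requirement, at the cost of departing from what the snippet does.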


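Note: The Helper.getDomainsFromTsv examples in this article were compiled by 纯净天空 from GitHub, MSDocs, and other open-source code and documentation platforms. The snippets were selected from open-source projects contributed by their respective developers, and copyright of the source code remains with the original authors. For distribution and use, refer to the License of the corresponding project. Do not reproduce without permission.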