This article collects typical usage examples of the Python method dipper.models.Dataset.Dataset.set_citation. If you are wondering what Dataset.set_citation does, how to call it, or want to see it used in context, the curated code samples below should help. You can also explore the containing class, dipper.models.Dataset.Dataset, for related usage.
Two code examples of the Dataset.set_citation method are shown below.
Example 1: MPD
# Required import: from dipper.models.Dataset import Dataset [as alias]
# Or: from dipper.models.Dataset.Dataset import set_citation [as alias]
class MPD(Source):
    """
    From the [MPD](http://phenome.jax.org/) website:
    This resource is a collaborative, standardized collection of measured
    data on laboratory mouse strains and populations. It includes baseline
    phenotype data sets as well as studies of drug, diet, disease, and
    aging effects. It also includes protocols, projects and publications,
    and SNP, variation, and gene expression studies.

    Here, we pull the data and model the genotypes using GENO and
    the genotype-to-phenotype associations using the OBAN schema.

    MPD provides measurements for particular assays for several strains.
    Each of these measurements is itself mapped to an MP or VT term
    as a phenotype. Therefore, we can create a strain-to-phenotype
    association for those strains that lie outside the "normal" range
    for a given measurement. We compute the mean of the measurements
    across all strains tested, and then flag any extreme measurement
    that falls beyond some threshold from that mean.
    Our default threshold is +/- 2 standard deviations from the mean.

    Because the measurements are made and recorded at the level of
    a specific sex of each strain, we associate the MP/VT phenotype with
    the sex-qualified genotype/strain.
    """
    MPDDL = 'http://phenomedoc.jax.org/MPD_downloads'
    files = {
        'ontology_mappings': {
            'file': 'ontology_mappings.csv',
            'url': MPDDL + '/ontology_mappings.csv'},
        'straininfo': {
            'file': 'straininfo.csv',
            'url': MPDDL + '/straininfo.csv'},
        'assay_metadata': {
            'file': 'measurements.csv',
            'url': MPDDL + '/measurements.csv'},
        'strainmeans': {
            'file': 'strainmeans.csv.gz',
            'url': MPDDL + '/strainmeans.csv.gz'},
        # 'mpd_datasets_metadata': {  # TEC: does not seem to be used
        #     'file': 'mpd_datasets_metadata.xml.gz',
        #     'url': MPDDL + '/mpd_datasets_metadata.xml.gz'},
    }
    # the following are strain ids for testing
    # test_ids = [
    #     "MPD:2", "MPD:3", "MPD:5", "MPD:6", "MPD:9", "MPD:11", "MPD:18",
    #     "MPD:20", "MPD:24", "MPD:28", "MPD:30", "MPD:33", "MPD:34",
    #     "MPD:36", "MPD:37", "MPD:39", "MPD:40", "MPD:42", "MPD:47",
    #     "MPD:66", "MPD:68", "MPD:71", "MPD:75", "MPD:78", "MPD:122",
    #     "MPD:169", "MPD:438", "MPD:457", "MPD:473", "MPD:481", "MPD:759",
    #     "MPD:766", "MPD:770", "MPD:849", "MPD:857", "MPD:955", "MPD:964",
    #     "MPD:988", "MPD:1005", "MPD:1017", "MPD:1204", "MPD:1233",
    #     "MPD:1235", "MPD:1236", "MPD:1237"]
    test_ids = [
        'MPD:6', 'MPD:849', 'MPD:425', 'MPD:569', "MPD:10", "MPD:1002",
        "MPD:39", "MPD:2319"]

    mgd_agent_id = "MPD:db/q?rtn=people/allinv"
    mgd_agent_label = "Mouse Phenotype Database"
    mgd_agent_type = "foaf:organization"
    def __init__(self, graph_type, are_bnodes_skolemized):
        Source.__init__(self, graph_type, are_bnodes_skolemized, 'mpd')
        # @N: not sure if this step is required
        self.stdevthreshold = 2

        # update the dataset object with details about this resource
        # @N: note that there is no license as far as I can tell
        self.dataset = Dataset(
            'mpd', 'MPD', 'http://phenome.jax.org', None, None)
        # TODO: add a citation for the MPD dataset as a whole
        self.dataset.set_citation('PMID:15619963')

        self.assayhash = {}
        self.idlabel_hash = {}
        # to store the mean/zscore of each measure by strain+sex
        self.score_means_by_measure = {}
        # to store the mean value for each measure by strain+sex
        self.strain_scores_by_measure = {}

        return

    def fetch(self, is_dl_forced=False):
        self.get_files(is_dl_forced)
        return
    def parse(self, limit=None):
        """
        MPD data is delivered in four separate csv files and one xml file,
        which we process iteratively and write out as one large graph.

        :param limit:
        :return:

        """
        # ... (the rest of this method is omitted here) ...
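The +/- 2 standard deviation thresholding described in the MPD docstring can be sketched as follows. This is a hypothetical illustration, not dipper's implementation: the function and variable names are my own, and it assumes each input maps a strain id to the mean measurement value for one assay and sex.

```python
import statistics


def flag_outlier_strains(measures, sd_threshold=2.0):
    """Flag strains whose z-score exceeds the threshold for one assay.

    ``measures`` maps strain id -> mean measurement value for a single
    assay (one sex). Returns a dict of {strain: z-score} for the strains
    that lie outside the "normal" range.
    """
    values = list(measures.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        return {}  # all strains identical; nothing is an outlier

    flagged = {}
    for strain, value in measures.items():
        z = (value - mean) / sd
        if abs(z) > sd_threshold:
            flagged[strain] = z
    return flagged
```

A strain whose value sits far from the mean of all tested strains is flagged, which is the precondition for creating a strain-to-phenotype association in the source above.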
Example 2: GeneReviews
# Required import: from dipper.models.Dataset import Dataset [as alias]
# Or: from dipper.models.Dataset.Dataset import set_citation [as alias]
class GeneReviews(Source):
    """
    Here we process the GeneReviews mappings to OMIM,
    plus inspect the GeneReviews (html) books to pull the clinical
    descriptions in order to populate the definitions of the terms
    in the ontology.
    We define the GeneReviews items as classes that are either grouping
    classes over OMIM disease ids (gene ids are filtered out),
    or are made as subclasses of DOID:4 (generic disease).

    Note that the GeneReviews
    [copyright policy](http://www.ncbi.nlm.nih.gov/books/NBK138602/)
    (as of 2015.11.20) says:

    GeneReviews® chapters are owned by the University of Washington,
    Seattle, © 1993-2015. Permission is hereby granted to reproduce,
    distribute, and translate copies of content materials provided that
    (i) credit for source (www.ncbi.nlm.nih.gov/books/NBK1116/)
    and copyright (University of Washington, Seattle)
    are included with each copy;
    (ii) a link to the original material is provided whenever the material
    is published elsewhere on the Web; and
    (iii) reproducers, distributors, and/or translators comply with this
    copyright notice and the GeneReviews Usage Disclaimer.

    This script doesn't pull the GeneReviews books from the NCBI Bookshelf
    directly; scripting this task is expressly prohibited by
    [NCBI Bookshelf policy](http://www.ncbi.nlm.nih.gov/books/NBK45311/).
    However, assuming you have acquired the books (in html format) via
    permissible means, a parser for those books is provided here to extract
    the clinical descriptions to define the NBK-identified classes.
    """
    # GRDL (the GeneReviews download base URL) is assumed to be defined
    # elsewhere in the module
    files = {
        'idmap': {
            'file': 'NBKid_shortname_OMIM.txt',
            'url': GRDL + '/NBKid_shortname_OMIM.txt'},
        'titles': {
            'file': 'GRtitle_shortname_NBKid.txt',
            'url': GRDL + '/GRtitle_shortname_NBKid.txt'}
    }
    def __init__(self):
        Source.__init__(self, 'genereviews')
        self.load_bindings()
        self.dataset = Dataset(
            'genereviews', 'Gene Reviews', 'http://genereviews.org/',
            None, 'http://www.ncbi.nlm.nih.gov/books/NBK138602/')
        self.dataset.set_citation('GeneReviews:NBK1116')

        self.gu = GraphUtils(curie_map.get())
        self.book_ids = set()
        self.all_books = {}

        if 'test_ids' not in config.get_config() or \
                'disease' not in config.get_config()['test_ids']:
            logger.warning("not configured with disease test ids.")
            self.test_ids = list()
        else:
            # select only those test ids that are OMIM ids
            self.test_ids = config.get_config()['test_ids']['disease']

        return
    def fetch(self, is_dl_forced=False):
        """
        We fetch the GeneReviews id-label map and id-OMIM mapping files
        from NCBI.
        :return: None

        """
        self.get_files(is_dl_forced)
        return
    def parse(self, limit=None):
        """
        :return: None
        """
        if self.testOnly:
            self.testMode = True

        self._get_titles(limit)
        self._get_equivids(limit)

        self.create_books()
        self.process_nbk_html(limit)

        self.load_bindings()

        # no test subset for now; test == full graph
        self.testgraph = self.graph
        logger.info("Found %d nodes", len(self.graph))

        return

    def _get_equivids(self, limit):
        """
        # ... (the rest of this method is omitted here) ...
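Both examples follow the same pattern: construct a Dataset describing the source, then attach a citation via set_citation. The stand-in below mimics that interface so the pattern can be run without dipper installed. It is a hypothetical sketch, not the real dipper.models.Dataset class; the constructor arguments are inferred from the two examples above.

```python
class Dataset:
    """Hypothetical stand-in for dipper.models.Dataset.Dataset."""

    def __init__(self, identifier, title, url,
                 license_url=None, data_rights=None):
        self.identifier = identifier
        self.title = title
        self.url = url
        self.license_url = license_url
        self.data_rights = data_rights
        self.citation = set()

    def set_citation(self, ref):
        # the real method records the reference on the dataset's RDF
        # node; here we simply collect the CURIE
        self.citation.add(ref)
        return


# same calls as in Example 1 above
dataset = Dataset('mpd', 'MPD', 'http://phenome.jax.org', None, None)
dataset.set_citation('PMID:15619963')
```

The citation is a CURIE (e.g. a PubMed id or a GeneReviews NBK id), so the same call works for both sources shown above.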