Python AttributeCache.get_column_names Method Code Examples

This article collects typical usage examples of the opus_core.store.attribute_cache.AttributeCache.get_column_names method in Python. If you are wondering how AttributeCache.get_column_names is used in practice, the selected example here may help. You can also look into usage examples of the enclosing class, opus_core.store.attribute_cache.AttributeCache.

One code example of the AttributeCache.get_column_names method is shown below. Examples are sorted by popularity by default.

Example 1: DataStructureModel

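# Required import: from opus_core.store.attribute_cache import AttributeCache [as alias]
# get_column_names is a method of that class and is called on an AttributeCache instance.
# The imports below are the ones this class appears to rely on; the opus_core
# module paths are assumed from the opus_core package layout.
import os
import numpy as np
import pandas as pd
from opus_core.logger import logger
from opus_core.model import Model
from opus_core.simulation_state import SimulationState
from opus_core.store.attribute_cache import AttributeCache
from opus_core.store.flt_storage import flt_storage
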
class DataStructureModel(Model):
    """
    Checks the structure of datasets in a given cache (or run cache) against a reference cache.
    It reports all columns that are missing from the checked cache as well as those that are
    not present in the reference cache.
    It can also compare the sizes of the datasets.
    """
    def __init__(self, reference_location=None):
        """
        "reference_location" is the directory of the reference cache and should include the year.
        If it is None, the simulation directory in its start year is taken. 
        """
        if reference_location is None:
            reference_location = os.path.join(SimulationState().get_cache_directory(), "%s" % SimulationState().get_start_time())
        self.reference_storage =  flt_storage(reference_location)
    
    def run(self, directory=None, check_size=True):
        """
        "directory" is the cache to be compared to the reference. It should not include the year
        as the model checks all years.
        Set "check_size" to False if no size check of the datasets is required.
        """
        if directory is None:
            directory = SimulationState().get_cache_directory()        
        self.cache = AttributeCache(directory)
        year_orig = SimulationState().get_current_time()
        years = self.years_in_cache()
        SimulationState().set_current_time(years[0])
        storages = {}
        for year in years:
            storages[year] = flt_storage(os.path.join(self.cache.get_storage_location(), '%s' % year))
        df = pd.DataFrame(columns=["Table", "Less-than-ref", "More-than-ref", "Year", "Size", "Size-ref"])
        tables = self.cache.get_table_names() 
        for table in tables:
            columns_list = self.cache.get_column_names(table)
            columns = set(columns_list)
            ref_columns_list = self.reference_storage.get_column_names(table, lowercase=True)
            ref_columns = set(ref_columns_list)
            more = columns.difference(ref_columns)
            less = ref_columns.difference(columns)
            samesize = True
            if check_size:
                table_size = self.cache.load_table(table, columns_list[0])[columns_list[0]].size
                reftable_size = self.reference_storage.load_table(table, ref_columns_list[0])[ref_columns_list[0]].size
                if table_size != reftable_size:
                    samesize = False
            if len(more) == 0 and len(less) == 0 and samesize:
                continue
            df.loc[df.shape[0]] = [table, ', '.join(less), ', '.join(more), '', 0, 0]
            if len(more) == 0 and samesize:
                continue
            # if there are columns in the "more" column, write out the corresponding years
            columns_and_years = self.cache._get_column_names_and_years(table)
            more_years = []
            for col, year in columns_and_years:
                if col in more:
                    more_years.append(year)
            df.loc[df.shape[0]-1, "Year"] = ', '.join(np.unique(np.array(more_years).astype("str")))
            if not samesize:  # there is difference in table sizes
                df.loc[df.shape[0]-1, "Size"] = table_size
                df.loc[df.shape[0]-1, "Size-ref"] = reftable_size
           
        if not check_size or (df['Size'].sum()==0 and df['Size-ref'].sum()==0):
            # remove the size columns if not used
            del df['Size']
            del df['Size-ref']
        if df.shape[0] > 0:
            logger.log_status("Differences in data structure relative to %s:" % self.reference_storage.get_storage_location())
            logger.log_status(df)
        else:
            logger.log_status("Data structure corresponds to the one in %s" % self.reference_storage.get_storage_location())
        return df
    
    def years_in_cache(self):
        return self.cache._get_sorted_list_of_years(start_with_current_year=False)
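
The column comparison in run() boils down to two set differences plus a lookup of the years in which the extra columns occur. Below is a minimal standalone sketch of that logic, assuming plain Python lists in place of the cache and storage objects. The helper names compare_columns and years_with_extra_columns and the sample data are hypothetical and not part of opus_core.

```python
def compare_columns(columns, ref_columns):
    """Return (less, more) for one table.

    less: names in the reference that are missing from `columns`.
    more: names in `columns` that are absent from the reference.
    Both lists are sorted so the output is deterministic.
    """
    current = set(columns)
    reference = set(ref_columns)
    less = sorted(reference - current)
    more = sorted(current - reference)
    return less, more


def years_with_extra_columns(columns_and_years, more):
    """Return the sorted, de-duplicated years in which any 'more' column occurs."""
    extra = set(more)
    return sorted({year for col, year in columns_and_years if col in extra})


# Hypothetical data standing in for what get_column_names() and
# _get_column_names_and_years() might return for a single table.
cache_columns = ["building_id", "parcel_id", "sqft", "year_built"]
reference_columns = ["building_id", "parcel_id", "land_value"]
columns_and_years = [
    ("building_id", 2010),
    ("sqft", 2011),
    ("sqft", 2012),
    ("year_built", 2012),
]

less, more = compare_columns(cache_columns, reference_columns)
years = years_with_extra_columns(columns_and_years, more)

print("Less-than-ref:", ", ".join(less))
print("More-than-ref:", ", ".join(more))
print("Year:", ", ".join(str(y) for y in years))
```

Using the built-in set type keeps the comparison independent of column order, and sorting the results makes the report rows stable between runs. Note that this sketch sorts years numerically, whereas the class above converts them to strings before de-duplicating.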


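Note: The opus_core.store.attribute_cache.AttributeCache.get_column_names example in this article was compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The code snippets were selected from open-source projects contributed by their respective developers, and copyright of the source code remains with the original authors. For distribution and use, refer to the corresponding project's License. Do not reproduce without permission.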