

Python Workflow.batch_read_csv Method Code Examples

This article collects typical usage examples of the Python method workflow.Workflow.batch_read_csv. If you are wondering how Workflow.batch_read_csv is used in practice, the curated examples below should help. You can also explore further usage examples of its containing class, workflow.Workflow.


Three code examples of Workflow.batch_read_csv are shown below, sorted by popularity by default.
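
Before the examples, here is a minimal sketch of the call pattern shared by all three snippets. It assumes a running SparkContext plus the Workflow helper used by these projects; the app name and the "input.csv" path are placeholders, and the shape of the returned RDD is only inferred from the partitionBy calls in the examples below.

from pyspark import SparkContext
from py4j.java_gateway import java_import
from workflow import Workflow           # Workflow helper used by these projects

sc = SparkContext(appName="batch_read_csv-demo")   # placeholder app name
java_import(sc._jvm, "edu.isi.karma")              # make the Karma classes visible to the JVM

workflow = Workflow(sc)
# batch_read_csv reads a CSV file into an RDD of records; the examples below
# treat the result as a pair RDD and repartition it with partitionBy
inputRDD = workflow.batch_read_csv("input.csv")    # "input.csv" is a placeholder path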

Example 1: SparkContext

# Required import: from workflow import Workflow [as alias]
# Or: from workflow.Workflow import batch_read_csv [as alias]
    sc = SparkContext(appName="TEST")

    java_import(sc._jvm, "edu.isi.karma")

    inputFilename = argv[1]
    outputFilename = argv[2]


    fileUtil = FileUtil(sc)
    workflow = Workflow(sc)
    contextUrl = "https://raw.githubusercontent.com/american-art/aac-alignment/master/karma-context.json"

    #1. Read the input

    #test big file
    inputRDD = workflow.batch_read_csv(inputFilename).partitionBy(1)

    #test small file
    # inputRDD = workflow.batch_read_csv(inputFilename)


    #2. Apply the karma Model
    outputRDD = workflow.run_karma(inputRDD,
                                   "https://raw.githubusercontent.com/american-art/autry/master/AutryMakers/AutryMakers-model.ttl",
                                   "http://dig.isi.edu/AutryMakers/",
                                   "http://www.cidoc-crm.org/cidoc-crm/E22_Man-Made_Object1",
                                   "https://raw.githubusercontent.com/american-art/aac-alignment/master/karma-context.json",
                                   data_type="csv",
                                   additional_settings={"karma.input.delimiter":","})

    #3. Save the output
Developer: dingyi567 | Project: American_Art | Lines: 33 | Source file: AutryWorkflowCSV.py
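
Example 1 is cut off at the save step. As a hedged completion, step 3 could reuse the fileUtil created earlier, following the fileUtil.save_file call that appears (commented out) in example 2 below; the "text" and "json" format arguments are carried over from that example as an assumption.

    # 3. Save the output -- sketch only; mirrors the save call commented out in example 2
    fileUtil.save_file(outputRDD, outputFilename, "text", "json")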

Example 2: SparkContext

# Required import: from workflow import Workflow [as alias]
# Or: from workflow.Workflow import batch_read_csv [as alias]
    sc = SparkContext(appName="TEST")

    java_import(sc._jvm, "edu.isi.karma")

    inputFilename = argv[1]
    outputFilename = argv[2]
    numPartitions = 1000
    numFramerPartitions = max(10, numPartitions / 10)

    fileUtil = FileUtil(sc)
    workflow = Workflow(sc)
    contextUrl = "https://raw.githubusercontent.com/american-art/aac-alignment/master/karma-context.json"

    #1. Read the input
    inputRDD = workflow.batch_read_csv(inputFilename)

    #2. Apply the karma Model
    outputRDD = workflow.run_karma(inputRDD,
                                   "https://raw.githubusercontent.com/american-art/npg/master/NPGConstituents/NPGConstituents-model.ttl",
                                   "http://americanartcollaborative.org/npg/",
                                   "http://www.cidoc-crm.org/cidoc-crm/E39_Actor1",
                                   "https://raw.githubusercontent.com/american-art/aac-alignment/master/karma-context.json",
                                   num_partitions=numPartitions,
                                   data_type="csv",
                                   additional_settings={"karma.input.delimiter":","})

    #3. Save the output
    # fileUtil.save_file(outputRDD, outputFilename, "text", "json")

    #4. Reduce rdds
Developer: american-art | Project: aac-alignment | Lines: 32 | Source file: npgWorkflowCSV.py
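
One portability note on this excerpt: these scripts target Python 2 (note the print statement in example 3), so numPartitions / 10 is integer division. A Python 3 port would need floor division to keep an integer; this is a sketch of the adjusted line, assuming an integer partition count is what the later (truncated) steps expect.

    numFramerPartitions = max(10, numPartitions // 10)   # 100 when numPartitions is 1000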

Example 3: str

# Required import: from workflow import Workflow [as alias]
# Or: from workflow.Workflow import batch_read_csv [as alias]
            line = line.rstrip()
            params = line.split("\t")
            data_file_URL = str(params[0])
            num_partitions = int(params[1])
            model_file_URL = str(params[2])
            base = str(params[3])
            root = str(params[4])
            context = str(params[5])
            output_folder = str(params[6])
            output_zip_path = str(params[7])

            #0. Download data file
            dataFileName = download_file(data_file_URL)

            #1. Read the input
            inputRDD = workflow.batch_read_csv(dataFileName).partitionBy(num_partitions)

            #2. Apply the karma Model
            outputRDD = workflow.run_karma(inputRDD,
                                            model_file_URL,
                                            base,
                                            root,
                                            context,
                            data_type="csv",
                            additional_settings={"karma.input.delimiter":",", "karma.output.format": "n3"})

            #3. Save the output
            outputPath = outputFilename + "/" + output_folder
            outputRDD.map(lambda x: x[1]).saveAsTextFile(outputPath)
            print("Successfully applied Karma!")
Developer: dingyi567 | Project: American_Art | Lines: 32 | Source file: npgBatchWorkflowCSV.py
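
The excerpt in example 3 is the body of a loop over a job-description file; sc, workflow, and outputFilename are defined outside the shown lines. Below is a hedged reconstruction of that surrounding driver, modelled on examples 1 and 2; everything outside the excerpted lines is an assumption, not taken from the source.

from sys import argv
from pyspark import SparkContext
from py4j.java_gateway import java_import
from workflow import Workflow

sc = SparkContext(appName="TEST")
java_import(sc._jvm, "edu.isi.karma")
workflow = Workflow(sc)

jobListFilename = argv[1]    # assumed: one tab-separated job description per line
outputFilename = argv[2]     # base output directory, as in examples 1 and 2

with open(jobListFilename) as jobList:
    for line in jobList:
        # Each line carries, in order: data_file_URL, num_partitions, model_file_URL,
        # base, root, context, output_folder, output_zip_path
        pass                 # the loop body is the excerpt shown above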


Note: The workflow.Workflow.batch_read_csv examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by their respective authors, and copyright of the source code remains with the original authors; please consult each project's license before redistributing or using the code. Do not reproduce this article without permission.