This article collects typical usage examples of the Python method workflow.Workflow.batch_read_csv. If you are wondering how Workflow.batch_read_csv is used in practice, the selected examples here may help. You can also look further into usage examples of the class that contains the method, workflow.Workflow.
Three code examples of Workflow.batch_read_csv are shown below, sorted by popularity by default.
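Example 1: SparkContext
# Required import: from workflow import Workflow [as alias]
# Or: from workflow.Workflow import batch_read_csv [as alias]
from sys import argv
from pyspark import SparkContext
from py4j.java_gateway import java_import
# FileUtil comes from the project's own utility module, which the source does not show.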
sc = SparkContext(appName="TEST")
java_import(sc._jvm, "edu.isi.karma")
inputFilename = argv[1]
outputFilename = argv[2]
fileUtil = FileUtil(sc)
workflow = Workflow(sc)
contextUrl = "https://raw.githubusercontent.com/american-art/aac-alignment/master/karma-context.json"
#1. Read the input
#test big file
inputRDD = workflow.batch_read_csv(inputFilename).partitionBy(1)
#test small file
# inputRDD = workflow.batch_read_csv(inputFilename)
#2. Apply the karma Model
outputRDD = workflow.run_karma(inputRDD,
                               "https://raw.githubusercontent.com/american-art/autry/master/AutryMakers/AutryMakers-model.ttl",
                               "http://dig.isi.edu/AutryMakers/",
                               "http://www.cidoc-crm.org/cidoc-crm/E22_Man-Made_Object1",
                               "https://raw.githubusercontent.com/american-art/aac-alignment/master/karma-context.json",
                               data_type="csv",
                               additional_settings={"karma.input.delimiter": ","})
#3. Save the output
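Example 1 takes its input and output paths from argv[1] and argv[2] without checking that they were supplied. A minimal sketch of validating them before any Spark work starts is given here; the helper name parse_paths and the usage message are assumptions for illustration and do not come from the original script.

```python
def parse_paths(argv):
    """Return (input_path, output_path) from an argv-style list.

    Hypothetical helper: the original script indexes argv directly.
    """
    if len(argv) < 3:
        # argv[0] is the script name, so two real arguments mean length 3.
        raise SystemExit("usage: %s <input.csv> <output_dir>" % argv[0])
    return argv[1], argv[2]


# Example call with a literal argument list instead of sys.argv.
input_filename, output_filename = parse_paths(["job.py", "in.csv", "out"])
print(input_filename, output_filename)
```

In a real job the list passed in would be sys.argv, and the two returned values would play the role of inputFilename and outputFilename.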
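Example 2: SparkContext
# Required import: from workflow import Workflow [as alias]
# Or: from workflow.Workflow import batch_read_csv [as alias]
from sys import argv
from pyspark import SparkContext
from py4j.java_gateway import java_import
# FileUtil comes from the project's own utility module, which the source does not show.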
sc = SparkContext(appName="TEST")
java_import(sc._jvm, "edu.isi.karma")
inputFilename = argv[1]
outputFilename = argv[2]
numPartitions = 1000
numFramerPartitions = max(10, numPartitions // 10)  # floor division keeps the count an integer
fileUtil = FileUtil(sc)
workflow = Workflow(sc)
contextUrl = "https://raw.githubusercontent.com/american-art/aac-alignment/master/karma-context.json"
#1. Read the input
inputRDD = workflow.batch_read_csv(inputFilename)
#2. Apply the karma Model
outputRDD = workflow.run_karma(inputRDD,
                               "https://raw.githubusercontent.com/american-art/npg/master/NPGConstituents/NPGConstituents-model.ttl",
                               "http://americanartcollaborative.org/npg/",
                               "http://www.cidoc-crm.org/cidoc-crm/E39_Actor1",
                               "https://raw.githubusercontent.com/american-art/aac-alignment/master/karma-context.json",
                               num_partitions=numPartitions,
                               data_type="csv",
                               additional_settings={"karma.input.delimiter": ","})
#3. Save the output
# fileUtil.save_file(outputRDD, outputFilename, "text", "json")
#4. Reduce rdds
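The partition arithmetic in Example 2 (one tenth of numPartitions, with a floor of 10) can be isolated in a small function. The sketch below uses floor division so the result is an integer under both Python 2 and Python 3; the function name framer_partitions and its minimum parameter are assumptions for illustration.

```python
def framer_partitions(num_partitions, minimum=10):
    """One tenth of num_partitions, but never fewer than `minimum`."""
    return max(minimum, num_partitions // 10)


print(framer_partitions(1000))  # 100
print(framer_partitions(50))    # 10
```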
Example 3: str
# Required import: from workflow import Workflow [as alias]
# Or: from workflow.Workflow import batch_read_csv [as alias]
line = line.rstrip()
params = line.split("\t")
data_file_URL = str(params[0])
num_partitions = int(params[1])
model_file_URL = str(params[2])
base = str(params[3])
root = str(params[4])
context = str(params[5])
output_folder = str(params[6])
output_zip_path = str(params[7])
#0. Download data file
dataFileName = download_file(data_file_URL)
#1. Read the input
inputRDD = workflow.batch_read_csv(dataFileName).partitionBy(num_partitions)
#2. Apply the karma Model
outputRDD = workflow.run_karma(inputRDD,
                               model_file_URL,
                               base,
                               root,
                               context,
                               data_type="csv",
                               additional_settings={"karma.input.delimiter": ",", "karma.output.format": "n3"})
#3. Save the output
outputPath = outputFilename + "/" + output_folder
outputRDD.map(lambda x: x[1]).saveAsTextFile(outputPath)
print("Successfully applied karma!")
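Example 3 unpacks eight tab-separated fields by position. A minimal sketch of the same parsing with named fields follows; JobParams and parse_job_line are hypothetical names, the field order is taken from the example above, and the strict field-count check is an added assumption that the original fragment does not make.

```python
from collections import namedtuple

# Field order mirrors params[0] .. params[7] in Example 3.
JobParams = namedtuple("JobParams", [
    "data_file_url", "num_partitions", "model_file_url", "base",
    "root", "context", "output_folder", "output_zip_path",
])


def parse_job_line(line):
    """Parse one tab-separated job description into a JobParams tuple."""
    # Strip only the line ending so an empty trailing field is preserved.
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != len(JobParams._fields):
        raise ValueError("expected %d tab-separated fields, got %d"
                         % (len(JobParams._fields), len(fields)))
    params = JobParams(*fields)
    # num_partitions is the only non-string field.
    return params._replace(num_partitions=int(params.num_partitions))


job = parse_job_line(
    "http://example.org/data.csv\t4\tmodel.ttl\thttp://example.org/base/\t"
    "http://example.org/root\tcontext.json\tout\tout.zip\n"
)
print(job.num_partitions, job.output_folder)
```

Named fields make a call such as partitionBy(job.num_partitions) easier to read than positional indexing, at the cost of rejecting lines with extra columns.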