

Python HiveContext.setConf Method Code Examples

This article collects typical usage examples of the Python method pyspark.sql.HiveContext.setConf. If you have been wondering what exactly HiveContext.setConf does, how to call it, or where to find working examples, the curated code samples below should help. You can also explore further usage examples of pyspark.sql.HiveContext, the class this method belongs to.


The following presents 4 code examples of the HiveContext.setConf method, sorted by popularity by default. You can upvote the examples you like or find useful; your ratings help the system recommend better Python code examples.
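Before the examples, a minimal sketch of the general pattern may help (my illustration, not drawn from the projects below): setConf(key, value) sets a Spark SQL configuration property at runtime on a HiveContext, the pre-Spark-2.0 entry point for Hive-backed SQL. The table name in the final query is hypothetical.

from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = SparkConf()
sc = SparkContext("local[1]", conf=conf)
sqlContext = HiveContext(sc)

# setConf(key, value) updates this context's runtime SQL configuration,
# equivalent to running "SET key=value" in a SQL statement.
sqlContext.setConf("spark.sql.shuffle.partitions", "8")

# Subsequent shuffling queries now use 8 partitions; "my_table" is a
# hypothetical table used only for illustration.
counts = sqlContext.sql("SELECT COUNT(*) FROM my_table").collect()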

Example 1: get_context_test

# Required import: from pyspark.sql import HiveContext [as alias]
# Or: from pyspark.sql.HiveContext import setConf [as alias]
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

def get_context_test():
    conf = SparkConf()
    sc = SparkContext('local[1]', conf=conf)
    sql_context = HiveContext(sc)
    sql_context.sql("""use fex_test""")
    # Keep shuffles to a single partition for small test data.
    sql_context.setConf("spark.sql.shuffle.partitions", "1")
    return sc, sql_context
Developer: hongbin0908, Project: bintrade, Lines: 9, Source: index.py

Example 2: get_context

# Required import: from pyspark.sql import HiveContext [as alias]
# Or: from pyspark.sql.HiveContext import setConf [as alias]
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

def get_context():
    conf = SparkConf()
    conf.set("spark.executor.instances", "4")
    conf.set("spark.executor.cores", "4")
    conf.set("spark.executor.memory", "8g")
    # Use the module path as the app name (the original passed the
    # literal string "__file__", almost certainly unintentionally).
    sc = SparkContext(appName=__file__, conf=conf)
    sql_context = HiveContext(sc)
    sql_context.sql("""use fex""")
    sql_context.setConf("spark.sql.shuffle.partitions", "32")
    return sc, sql_context
Developer: hongbin0908, Project: bintrade, Lines: 12, Source: index.py
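One point worth separating out in Example 2 (my gloss, not the original author's): executor resources such as spark.executor.memory must be set on the SparkConf before the SparkContext is created, whereas spark.sql.shuffle.partitions is a Spark SQL runtime property that setConf can change at any point between queries. A hedged illustration:

sc, sql_context = get_context()
# Runtime SQL properties take effect for subsequent queries immediately;
# the values below are illustrative, not from the original project.
sql_context.setConf("spark.sql.shuffle.partitions", "200")  # wide shuffle for a big join
# ... run a heavy query here ...
sql_context.setConf("spark.sql.shuffle.partitions", "8")    # shrink again for small aggregates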

Example 3: SparkContext

# Required import: from pyspark.sql import HiveContext [as alias]
# Or: from pyspark.sql.HiveContext import setConf [as alias]
"""
Fails after 2+ hours. The problem seems to be "(Too many open files)";
likely several thousand files are open at one time.
"""


from pyspark import SparkContext
from pyspark.sql import HiveContext

sc = SparkContext()
sqlContext = HiveContext(sc)

# snappy compression is recommended for Arrow.
# Interestingly, snappy output is slightly smaller than gzip for the 10 test rows.
sqlContext.setConf("spark.sql.parquet.compression.codec", "snappy")

# Testing
#pems = sqlContext.sql("SELECT * FROM pems LIMIT 10")

# This works
# pems = sqlContext.sql("SELECT * FROM pems WHERE station IN (402265, 402264, 402263, 402261, 402260)")

pems = sqlContext.sql("SELECT * FROM pems ORDER BY station")

# No options visible here for file chunk sizes; that probably comes from
# an environment variable.
# Later Spark versions support:
# pems.write.parquet("pems_sorted", compression = "snappy")

#pems.write.parquet("pems_station", partitionBy="station")
Developer: clarkfitzg, Project: phd_research, Lines: 33, Source: pems_to_parquet.py
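A hedged aside on the "(Too many open files)" failure described in this example's docstring: when writing with partitionBy over a high-cardinality column such as station, each task can hold one open Parquet writer per partition value it encounters. A common mitigation (my suggestion, not something the original author verified) is to cluster rows by the partition column first, so each task writes only a few partition directories:

# Sketch: repartition by the partition column before writing, so each task
# holds far fewer open writers at once. Column-based repartition requires
# Spark 1.6+.
pems_clustered = pems.repartition("station")
pems_clustered.write.parquet("pems_station", partitionBy="station")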

Example 4: SparkConf

# Required import: from pyspark.sql import HiveContext [as alias]
# Or: from pyspark.sql.HiveContext import setConf [as alias]
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

# The excerpt below is truncated by the source site: it picks up mid-way
# through a CREATE TABLE statement inside main(sc, sqlContext).
            adjclose float
            )
            ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    """)

    sqlContext.sql(""" use fex """)
    df = sqlContext.sql("""
    SELECT
        *
    FROM
        eod_spx
    WHERE
        symbol = "SPX"
        AND date >= "2010-01-01"
        AND date <= "2010-06-30"
    """)
    sqlContext.sql(""" use fex_test """)
    df.repartition(1).insertInto("eod_spx", True)


if __name__ == "__main__":
    conf = SparkConf()
    conf.set("spark.executor.instances", "4")
    conf.set("spark.executor.cores", "4")
    conf.set("spark.executor.memory", "8g")
    sc = SparkContext(appName=__file__, conf=conf)
    sqlContext = HiveContext(sc)
    sqlContext.setConf("spark.sql.shuffle.partitions", "1")
    main(sc, sqlContext)
    sc.stop()
Developer: hongbin0908, Project: bintrade, Lines: 32, Source: eod_testdata.py
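A closing note on Example 4 (my reading of the API, not from the source): the bare True passed to insertInto is the overwrite flag, so the selected 2010 rows replace the contents of fex_test.eod_spx, and repartition(1) collapses the output to a single file. In Spark 1.4+ the same write can be phrased through the DataFrameWriter API:

# Equivalent writer-API form, assuming the same df from main():
df.repartition(1).write.insertInto("eod_spx", overwrite=True)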


Note: The pyspark.sql.HiveContext.setConf examples in this article were compiled by 纯净天空 from open-source code and documentation platforms such as GitHub and MSDocs. The snippets are selected from open-source projects contributed by their respective authors, and copyright of the source code remains with those authors. For distribution and use, consult the corresponding project's license; do not republish without permission.