R SparkR read.stream: Usage and Code Examples

Description:

Loads a streaming data source and returns its dataset as a SparkDataFrame.

Usage:

read.stream(source = NULL, schema = NULL, ...)

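Arguments:

source The name of the external data source.
schema The data schema, defined as a structType or as a DDL-formatted string. It is required for file-based streaming data sources.
... Additional named options specific to the external data source, for example path for file-based streaming data sources. timeZone sets the time zone used to parse timestamps in JSON/CSV data sources or in partition values; if it is not set, the session-local time zone is used by default.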

Details:

The data source is specified by source and a set of options (...). If source is not specified, the default data source configured by "spark.sql.sources.default" is used.
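As a minimal sketch of how source and the named options fit together, the snippet below reads JSON files from a directory as a stream and checks that the result is a streaming SparkDataFrame. The directory name, the column names, and the console sink are illustrative assumptions, not part of the original page, and a working local Spark installation is assumed.

```r
library(SparkR)

sparkR.session()

# Hypothetical input directory (an assumption for illustration).
jsonDir <- file.path(tempdir(), "stream-input")
dir.create(jsonDir, showWarnings = FALSE)

# Illustrative schema; file-based streaming sources need one up front.
schema <- structType(structField("name", "string"),
                     structField("age", "integer"))

# "json" is the source; path, schema and maxFilesPerTrigger are
# passed as named arguments, the latter two as source-specific options.
df <- read.stream("json", path = jsonDir, schema = schema,
                  maxFilesPerTrigger = 1)

isStreaming(df)   # TRUE for a SparkDataFrame created by read.stream
printSchema(df)

# Start a query that prints new rows to the console, then stop it.
q <- write.stream(df, "console", outputMode = "append")
stopQuery(q)
```

Nothing is read until a query is started with write.stream; read.stream only describes the source.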

Returns:

SparkDataFrame

Note:

read.stream since 2.2.0

Experimental

Examples:

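library(SparkR)
sparkR.session()

# Read lines from a socket and write them out as text files.
df <- read.stream("socket", host = "localhost", port = 9999)
q <- write.stream(df, "text", path = "/home/user/out", checkpointLocation = "/home/user/cp")

# jsonDir and schema are not defined in the original example; the values
# below are placeholders chosen for illustration.
jsonDir <- "/home/user/json"
schema <- structType(structField("name", "string"), structField("info", "map<string,double>"))
df <- read.stream("json", path = jsonDir, schema = schema, maxFilesPerTrigger = 1)

# The same schema given as a DDL-formatted string.
stringSchema <- "name STRING, info MAP<STRING, DOUBLE>"
df1 <- read.stream("json", path = jsonDir, schema = stringSchema, maxFilesPerTrigger = 1)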

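Note: This article was selected and compiled by 純淨天空 from the original English documentation "Load a streaming SparkDataFrame" at spark.apache.org. Unless otherwise stated, copyright of the original code belongs to the original author. Do not reproduce or copy this translation without permission or authorization.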
