Package org.rumbledb.api
Class SequenceWriter
java.lang.Object
org.rumbledb.api.SequenceWriter
Helper class to configure and materialize the output of a
SequenceOfItems.
This class is effectively immutable: all configuration methods, such as mode(),
format(), and option(), return a new SequenceWriter instance
instead of mutating the current one.
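The copy-on-configure behaviour described above can be sketched as follows. This is a minimal illustration, not the RumbleDB implementation: the class ImmutableWriterSketch, its fields, and its getters are hypothetical names introduced only for this example, and the mode is modelled as a plain String rather than a Spark SaveMode.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the RumbleDB class.
final class ImmutableWriterSketch {
    private final String mode;                 // e.g. "overwrite"; null until set
    private final Map<String, String> options; // unmodifiable snapshot of the options

    ImmutableWriterSketch() {
        this(null, Collections.emptyMap());
    }

    private ImmutableWriterSketch(String mode, Map<String, String> options) {
        this.mode = mode;
        // Defensive copy so later changes to the caller's map cannot leak in.
        this.options = Collections.unmodifiableMap(new HashMap<>(options));
    }

    // Returns a copy with the new mode; 'this' is left unchanged.
    ImmutableWriterSketch mode(String newMode) {
        return new ImmutableWriterSketch(newMode, this.options);
    }

    // Returns a copy with one extra option; 'this' is left unchanged.
    ImmutableWriterSketch option(String key, String value) {
        Map<String, String> copy = new HashMap<>(this.options);
        copy.put(key, value);
        return new ImmutableWriterSketch(this.mode, copy);
    }

    String getMode() {
        return mode;
    }

    Map<String, String> getOptions() {
        return options;
    }
}
```

Because every call returns a fresh instance, a partially configured writer can be kept and reused as a base for several differently configured writers without one configuration affecting another.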
There are two mutually exclusive internal modes:
- DataFrame mode: dataFrameWriter != null and mode == null.
In this case, the sequence can be represented as a Spark Dataset and is
written through a DataFrameWriter using Spark's native writers (json, csv, parquet, ...).
- RDD mode: dataFrameWriter == null and mode != null.
In this case, the sequence is serialized item-by-item via Serializer
and saved as text files.
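The two conditions above amount to an invariant: exactly one of dataFrameWriter and mode is set. A minimal sketch of that check follows. The class WriterModeSketch, the enum Kind, and the method classify are hypothetical names for this example, and both fields are typed as Object so that the sketch does not depend on Spark. How SequenceWriter itself enforces the invariant is not stated here.

```java
// Hypothetical illustration of the two-mode invariant; not the RumbleDB implementation.
final class WriterModeSketch {
    enum Kind { DATAFRAME, RDD }

    // Classifies a writer state from the two fields named in the description.
    // Object is used for both so the sketch does not depend on Spark types.
    static Kind classify(Object dataFrameWriter, Object mode) {
        if (dataFrameWriter != null && mode == null) {
            return Kind.DATAFRAME; // written with Spark's native writers
        }
        if (dataFrameWriter == null && mode != null) {
            return Kind.RDD; // serialized item by item and saved as text files
        }
        // Both set or both unset: neither documented mode applies.
        throw new IllegalStateException(
            "exactly one of dataFrameWriter and mode must be set");
    }
}
```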
The serialization method (json, tyson, xml-json-hybrid, yaml, delta, ...) is always taken from
SerializationParameters.getMethod(), which is the single source of truth for the output
format.
Method Summary
Method
insertInto(String tableName)
mode(org.apache.spark.sql.SaveMode saveMode)
options(org.apache.spark.sql.util.CaseInsensitiveStringMap options)
partitionBy(String... colNames)
save()
saveAsTable(String tableName)
Method Details
mode
mode
format
option
option
option
option
options
options
partitionBy
bucketBy
sortBy
save
getSerializer
save
public void save()
insertInto
saveAsTable
tyson
yaml
json
parquet
text
orc
csv