A JSONiq engine to query large JSON datasets. Built on Spark.

Made for JSON

Rumble uses the JSONiq language, which was tailor-made for heterogeneous, nested JSON data.

Simple to learn

JSONiq is a declarative and functional language. It is user-friendly and easy to read and write, because it looks a lot like JSON.

Just like SQL

JSONiq's most powerful construct, the FLWOR expression, works much like SQL's SELECT-FROM-WHERE, but with the enhanced flexibility that JSON needs.
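
As a rough illustration of the SQL analogy, here is a minimal sketch in standard JSONiq. The three inline objects and their field names are made up for the example; in practice the input would come from your own dataset rather than from a literal sequence.

```jsoniq
(: Hypothetical sample data, written inline so the query is self-contained. :)
for $p in (
  { "name": "Ada", "age": 36, "city": "Zurich" },
  { "name": "Bob", "age": 24, "city": "Bern" },
  { "name": "Eve", "age": 41, "city": "Zurich" }
)
(: "where" plays the role of SQL's WHERE :)
where $p.age gt 30
order by $p.name
(: "return" plays the role of SQL's SELECT, but can build any JSON value :)
return { "name": $p.name, "city": $p.city }
```

The for clause corresponds to FROM, the where clause to WHERE, and the return clause to SELECT. Unlike a SELECT list, the return clause constructs arbitrary JSON, so the shape of the output is not tied to a flat row.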

Large-scale JSON querying

If you have JSON data that does not fit on your machine, or that takes too long to process, Rumble is for you. It leverages Spark to spread the I/O workload across multiple machines and parallelize as much as it can.

No pre-loading time

If you have your JSON files ready, one object per line, all you need to do is copy them over to your HDFS cluster and you are ready to go.

Focus on your data, not on the code

With Rumble, you do not need to write Java or Scala code. Simply write what you want and run your query.

Excellent at tree-like data

JSONiq understands nested data natively. It is more than just SQL with dot syntax on top. Nested objects and arrays are a walk in the park.
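
To give an idea of what navigating nested data looks like, here is a small sketch in standard JSONiq. The order object and all of its field names are invented for the example, and whether every construct shown is available in a given Rumble release is an assumption to check against that release.

```jsoniq
(: Hypothetical nested object, defined inline for illustration. :)
let $order := {
  "id": 1,
  "customer": { "name": "Ada", "address": { "city": "Zurich" } },
  "items": [ { "sku": "A1", "qty": 2 }, { "sku": "B7", "qty": 1 } ]
}
return {
  (: dots walk down nested objects :)
  "city": $order.customer.address.city,
  (: [] unboxes an array into a sequence of its members :)
  "skus": [ $order.items[].sku ],
  (: [[n]] picks the n-th member of an array, counting from 1 :)
  "first-sku": $order.items[[1]].sku
}
```

Object lookup, array unboxing, and array lookup compose freely, so reaching a value several levels deep stays a single expression.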

Heterogeneous data

Not all data fits into highly structured DataFrames. JSONiq is a NoSQL language that has the flexibility to deal with heterogeneous, missing, or extra fields.
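
The following sketch, in standard JSONiq, shows one way missing, extra, and differently typed fields can be handled in a single query. The records are hypothetical, and support for each construct in a particular Rumble release is an assumption rather than something stated here.

```jsoniq
(: Hypothetical records: one lacks "age", one has a string "age" and an extra "email". :)
for $r in (
  { "name": "Ada", "age": 36 },
  { "name": "Bob" },
  { "name": "Eve", "age": "unknown", "email": "eve@example.com" }
)
return {
  "name": $r.name,
  (: looking up an absent field yields the empty sequence, not an error :)
  "has-age": exists($r.age),
  (: the type of a value can be tested per record :)
  "age-is-integer": $r.age instance of integer
}
```

No schema is declared up front: each record is inspected as it comes, and fields the query does not mention, such as "email", are simply ignored.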

Try it now!

This is our first beta release.