Class WhereClauseSparkIterator

java.lang.Object
org.rumbledb.runtime.RuntimeTupleIterator
org.rumbledb.runtime.flwor.clauses.WhereClauseSparkIterator
All Implemented Interfaces:
com.esotericsoftware.kryo.KryoSerializable, Serializable, RuntimeTupleIteratorInterface

public class WhereClauseSparkIterator extends RuntimeTupleIterator
See Also:
  • Constructor Details

  • Method Details

    • open

      public void open(DynamicContext context)
      Specified by:
      open in interface RuntimeTupleIteratorInterface
      Overrides:
      open in class RuntimeTupleIterator
    • close

      public void close()
      Specified by:
      close in interface RuntimeTupleIteratorInterface
      Overrides:
      close in class RuntimeTupleIterator
    • reset

      public void reset(DynamicContext context)
      Specified by:
      reset in interface RuntimeTupleIteratorInterface
      Overrides:
      reset in class RuntimeTupleIterator
    • next

      public sparksoniq.jsoniq.tuple.FlworTuple next()
      Specified by:
      next in interface RuntimeTupleIteratorInterface
      Specified by:
      next in class RuntimeTupleIterator
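The open/next/reset/close protocol listed above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `WhereLifecycleSketch`, its `hasNext` method, and the use of `Map<String, Integer>` as a tuple are hypothetical stand-ins, not RumbleDB classes, and the real iterator works with `DynamicContext` and `FlworTuple` instead.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

// Hypothetical stand-in for a where-clause tuple iterator: it pulls tuples
// from a child source and only emits those that satisfy the condition.
final class WhereLifecycleSketch {
    private final List<Map<String, Integer>> childTuples;
    private final Predicate<Map<String, Integer>> condition;
    private Iterator<Map<String, Integer>> child; // null while closed
    private Map<String, Integer> lookahead;       // next matching tuple, or null

    WhereLifecycleSketch(List<Map<String, Integer>> childTuples,
                         Predicate<Map<String, Integer>> condition) {
        this.childTuples = childTuples;
        this.condition = condition;
    }

    // Starts iterating over the child tuples and pre-fetches the first match.
    void open() {
        if (child != null) {
            throw new IllegalStateException("iterator is already open");
        }
        child = childTuples.iterator();
        advance();
    }

    boolean hasNext() {
        return child != null && lookahead != null;
    }

    // Returns the current matching tuple and pre-fetches the following one.
    Map<String, Integer> next() {
        if (!hasNext()) {
            throw new NoSuchElementException("no more tuples");
        }
        Map<String, Integer> result = lookahead;
        advance();
        return result;
    }

    // Restarts the iteration from the first child tuple.
    void reset() {
        close();
        open();
    }

    // Releases the iteration state; hasNext() is false until open() is called.
    void close() {
        child = null;
        lookahead = null;
    }

    // Skips child tuples until one satisfies the where condition.
    private void advance() {
        lookahead = null;
        while (child.hasNext()) {
            Map<String, Integer> candidate = child.next();
            if (condition.test(candidate)) {
                lookahead = candidate;
                return;
            }
        }
    }
}
```

In this model, tuples that fail the condition are skipped eagerly, so a caller never observes them through `next()`.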
    • getDataFrame

      public FlworDataFrame getDataFrame(DynamicContext context)
      Description copied from class: RuntimeTupleIterator
      Obtains the dataframe from the child clause. It is possible, with the second parameter, to specify the variables it needs, so that the others are projected away, or to indicate that only a count is needed for a specific variable, which allows projecting away the actual items.
      Specified by:
      getDataFrame in class RuntimeTupleIterator
      Parameters:
      context - the dynamic context in which to evaluate the child clause's dataframe.
      Returns:
      the DataFrame with the tuples returned by the child clause.
    • getDynamicContextVariableDependencies

      public Map<Name,DynamicContext.VariableDependency> getDynamicContextVariableDependencies()
      Description copied from class: RuntimeTupleIterator
      Variable dependencies are variables that MUST be provided by the parent clause in the dynamic context for successful execution of this clause. These variables are:
      1. All variables that the expression of the clause depends on (recursive call of getVariableDependencies on the expression).
      2. Except those variables bound in the current FLWOR (obtained from the auxiliary method getVariablesBoundInCurrentFLWORExpression), because those are provided in the tuples.
      3. Plus (recursively calling getVariableDependencies) all the variable dependencies of the child clause, if it exists.
      Overrides:
      getDynamicContextVariableDependencies in class RuntimeTupleIterator
      Returns:
      a map of variable names to dependencies (FULL, COUNT, ...) that this clause needs to obtain from the dynamic context.
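The three-step rule described for this method can be sketched as plain map and set operations. The class `VariableDependencySketch`, its `Dependency` enum, and the use of `String` for variable names are hypothetical; the policy that FULL takes precedence over COUNT when both are requested for the same variable is also an assumption made for illustration, not something this page states.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class VariableDependencySketch {
    // Hypothetical mirror of the dependency kinds mentioned above (FULL, COUNT, ...).
    enum Dependency { FULL, COUNT }

    // Assumption: when two dependencies meet, FULL subsumes COUNT.
    static Dependency stronger(Dependency a, Dependency b) {
        return (a == Dependency.FULL || b == Dependency.FULL)
            ? Dependency.FULL
            : Dependency.COUNT;
    }

    static Map<String, Dependency> dynamicContextDependencies(
            Map<String, Dependency> expressionDependencies,
            Set<String> boundInCurrentFlwor,
            Map<String, Dependency> childDependencies) {
        // Step 1: start from everything the clause's expression depends on.
        Map<String, Dependency> result = new TreeMap<>(expressionDependencies);
        // Step 2: drop variables bound in the current FLWOR; tuples provide them.
        result.keySet().removeAll(boundInCurrentFlwor);
        // Step 3: add the dependencies of the child clause.
        childDependencies.forEach(
            (name, dependency) -> result.merge(name, dependency, VariableDependencySketch::stronger));
        return result;
    }
}
```

For example, if the where expression reads `x` (bound by an earlier for clause) and an external variable `ext`, only `ext` and whatever the child clause itself needs remain in the result.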
    • getOutputTupleVariableNames

      public Set<Name> getOutputTupleVariableNames()
      Description copied from class: RuntimeTupleIterator
      Returns the output tuple variable names. These variables can be removed from the dependencies of expressions in ascendant (subsequent) clauses, because their values are provided in the tuples rather than the dynamic context object.
      Overrides:
      getOutputTupleVariableNames in class RuntimeTupleIterator
      Returns:
      the set of variable names that are bound by descendant clauses.
    • print

      public void print(StringBuffer buffer, int indent)
      Overrides:
      print in class RuntimeTupleIterator
    • getInputTupleVariableDependencies

      public Map<Name,DynamicContext.VariableDependency> getInputTupleVariableDependencies(Map<Name,DynamicContext.VariableDependency> parentProjection)
      Description copied from class: RuntimeTupleIterator
      Builds the DataFrame projection that this clause needs to receive from its child clause. The intent is that the result of this method is forwarded to the child clause in getDataFrame() so it can optimize some values away. Invariant: all keys in getInputTupleVariableDependencies(...) MUST be output tuple variables, i.e., appear in this.child.getOutputTupleVariableNames()
      Specified by:
      getInputTupleVariableDependencies in class RuntimeTupleIterator
      Parameters:
      parentProjection - the projection needed by the parent clause.
      Returns:
      the projection needed by this clause.
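One way a where clause could build this projection is sketched below: take what the parent needs, add what the filtering condition reads, and keep only keys that the child actually outputs, which is the invariant stated above. All names here (`InputProjectionSketch`, `Dependency`, the `String` variable names) are hypothetical, and the merge policy (FULL wins over COUNT) is an assumption rather than documented behavior.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class InputProjectionSketch {
    // Hypothetical dependency kinds, modelled on FULL and COUNT.
    enum Dependency { FULL, COUNT }

    // Assumption: a FULL request subsumes a COUNT request for the same variable.
    static Dependency stronger(Dependency a, Dependency b) {
        return (a == Dependency.FULL || b == Dependency.FULL)
            ? Dependency.FULL
            : Dependency.COUNT;
    }

    static Map<String, Dependency> inputProjection(
            Map<String, Dependency> parentProjection,
            Map<String, Dependency> conditionDependencies,
            Set<String> childOutputVariables) {
        // Start from what the parent clause needs from the tuples.
        Map<String, Dependency> projection = new TreeMap<>(parentProjection);
        // Add what the where condition itself reads.
        conditionDependencies.forEach(
            (name, dependency) -> projection.merge(name, dependency, InputProjectionSketch::stronger));
        // Enforce the invariant: every key must be an output variable of the child.
        projection.keySet().retainAll(childOutputVariables);
        return projection;
    }
}
```

Variables that the condition reads but the child does not output (for instance, external variables) are dropped here because they come from the dynamic context rather than from the tuples.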
    • tryNativeQuery

      public static FlworDataFrame tryNativeQuery(FlworDataFrame dataFrame, RuntimeIterator iterator, DynamicContext context, ExceptionMetadata metadata)
      Tries to generate the native query for the where clause and run it. If successful, it returns the resulting dataframe; otherwise it returns null.
      Parameters:
      dataFrame - input dataframe for the query
      iterator - where filtering expression iterator
      context - current dynamic context of the dataframe
      metadata -
      Returns:
      the resulting dataframe of the where clause if successful, null otherwise
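Because tryNativeQuery signals failure by returning null, a caller has to check the result and fall back to another evaluation strategy. The sketch below shows that pattern in isolation. `NativeFallbackSketch`, `tryNative`, and the string-based condition are hypothetical placeholders; they do not reproduce how RumbleDB translates a filtering expression.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class NativeFallbackSketch {
    // Hypothetical native path: it only "understands" one condition and
    // returns null for anything else, mimicking the null-on-failure contract.
    static List<Integer> tryNative(List<Integer> rows, String condition) {
        if ("x > 3".equals(condition)) {
            return rows.stream().filter(x -> x > 3).collect(Collectors.toList());
        }
        return null;
    }

    // Tries the native path first and falls back to a row-by-row predicate.
    static List<Integer> filter(List<Integer> rows,
                                String condition,
                                Predicate<Integer> fallback) {
        List<Integer> nativeResult = tryNative(rows, condition);
        if (nativeResult != null) {
            return nativeResult;
        }
        return rows.stream().filter(fallback).collect(Collectors.toList());
    }
}
```

The null check keeps the fast path optional: any condition the native path cannot handle still yields a correct result through the fallback.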
    • containsClause

      public boolean containsClause(FLWOR_CLAUSES kind)
      Description copied from class: RuntimeTupleIterator
      Says whether or not the clause and its descendants include a clause of the specified kind.
      Specified by:
      containsClause in class RuntimeTupleIterator
      Parameters:
      kind - the kind of clause to test for.
      Returns:
      true if there is one, false otherwise.
    • isSparkJobNeeded

      public boolean isSparkJobNeeded()
      Says whether this expression evaluation triggers a Spark job.
      Specified by:
      isSparkJobNeeded in class RuntimeTupleIterator
      Returns:
      true if the execution triggers a Spark job, false otherwise.
    • generateNativeQuery

      public NativeClauseContext generateNativeQuery(NativeClauseContext nativeClauseContext)
      Description copied from class: RuntimeTupleIterator
      This function generates (if possible) a native Spark SQL query that mirrors the inner workings of the iterator.
      Overrides:
      generateNativeQuery in class RuntimeTupleIterator
      Parameters:
      nativeClauseContext - context information to generate the native query
      Returns:
      a native clause context with the spark-sql native query to get an equivalent result of the iterator, or [NativeClauseContext.NoNativeQuery] if it is not possible
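The "context in, context out, or a sentinel when translation is impossible" contract can be sketched as follows. `NativeContextSketch`, its `query` field, and `generateWhere` are hypothetical and only model the shape of the contract; the actual NativeClauseContext carries more state, and this page does not describe how the where condition is translated.

```java
final class NativeContextSketch {
    // Hypothetical sentinel playing the role of NativeClauseContext.NoNativeQuery.
    static final NativeContextSketch NO_NATIVE_QUERY = new NativeContextSketch(null);

    final String query; // query text accumulated so far; null only in the sentinel

    NativeContextSketch(String query) {
        this.query = query;
    }

    // Appends a WHERE condition if one is available; otherwise propagates the
    // sentinel so that callers can detect failure with an identity comparison.
    static NativeContextSketch generateWhere(NativeContextSketch input, String sqlCondition) {
        if (input == NO_NATIVE_QUERY || sqlCondition == null) {
            return NO_NATIVE_QUERY;
        }
        return new NativeContextSketch(input.query + " WHERE " + sqlCondition);
    }
}
```

Returning a sentinel rather than throwing lets a chain of clauses give up on native translation cheaply: once any step yields the sentinel, every later step passes it through unchanged.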