Class JoinClauseSparkIterator

java.lang.Object
org.rumbledb.runtime.RuntimeTupleIterator
org.rumbledb.runtime.flwor.clauses.JoinClauseSparkIterator
All Implemented Interfaces:
com.esotericsoftware.kryo.KryoSerializable, Serializable, RuntimeTupleIteratorInterface

public class JoinClauseSparkIterator extends RuntimeTupleIterator

  • Method Details

    • joinInputTupleWithSequenceOnPredicate

      public static FlworDataFrame joinInputTupleWithSequenceOnPredicate(
          DynamicContext context,
          org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> leftInputTuple,
          org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> rightInputTuple,
          Map<Name,DynamicContext.VariableDependency> outputTupleVariableDependencies,
          List<Name> variablesInLeftInputTuple,
          List<Name> variablesInRightInputTuple,
          RuntimeIterator predicateIterator,
          boolean isLeftOuterJoin,
          Name newRightSideVariableName,
          ExceptionMetadata metadata,
          RumbleRuntimeConfiguration conf)
      Joins two input tuples. Warning: if the two tuples have colliding column names, the behavior is currently undefined.
      Parameters:
      context - the dynamic context for the evaluation of the predicate expression.
      leftInputTuple - the left tuple.
      rightInputTuple - the right tuple.
      outputTupleVariableDependencies - the necessary and sufficient variable dependencies that the output tuple should contain.
      variablesInLeftInputTuple - a list of the variables in the left tuple.
      variablesInRightInputTuple - a list of the variables in the right tuple.
      predicateIterator - the predicate iterator.
      isLeftOuterJoin - true if it is a left outer join, false otherwise.
      newRightSideVariableName - the new variable name under which the context item appears in the output (null if no renaming).
      metadata - the metadata.
      conf - the Rumble runtime configuration.
      Returns:
      the joined tuple.
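      A minimal invocation sketch (variable names are hypothetical; the DynamicContext, the two Datasets, the dependency map, the variable lists, the predicate iterator, the metadata, and the configuration are assumed to have been built by the enclosing clause, with the relevant org.rumbledb imports in place):

          // Hypothetical call site; all arguments are assumed to be supplied
          // by the surrounding FLWOR clause.
          FlworDataFrame joined = JoinClauseSparkIterator.joinInputTupleWithSequenceOnPredicate(
              context,            // DynamicContext for evaluating the predicate
              leftTuples,         // Dataset<Row> holding the left input tuples
              rightTuples,        // Dataset<Row> holding the right input tuples
              dependencies,       // Map<Name, DynamicContext.VariableDependency> for the output
              leftVariables,      // List<Name> of variables in the left tuple
              rightVariables,     // List<Name> of variables in the right tuple
              predicateIterator,  // RuntimeIterator evaluating the join predicate
              false,              // inner join; true would request a left outer join
              null,               // do not rename the context item in the output
              metadata,           // ExceptionMetadata for error reporting
              conf                // RumbleRuntimeConfiguration
          );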
    • next

      public sparksoniq.jsoniq.tuple.FlworTuple next()
      Specified by:
      next in interface RuntimeTupleIteratorInterface
      Specified by:
      next in class RuntimeTupleIterator
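      A sketch of local tuple consumption; the open/hasNext/close lifecycle is assumed from RuntimeTupleIterator, and joinIterator and context are assumed given:

          // Iterate the joined tuples locally, outside the DataFrame path.
          joinIterator.open(context);
          while (joinIterator.hasNext()) {
              FlworTuple tuple = joinIterator.next();  // sparksoniq.jsoniq.tuple.FlworTuple
              // ... consume the joined tuple ...
          }
          joinIterator.close();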
    • getDataFrame

      public FlworDataFrame getDataFrame(DynamicContext context)
      Description copied from class: RuntimeTupleIterator
      Obtains the DataFrame from the child clause. The second parameter makes it possible to specify which variables are needed, so that the others can be projected away, or to indicate that only a count is needed for a specific variable, which allows the actual items to be projected away.
      Specified by:
      getDataFrame in class RuntimeTupleIterator
      Parameters:
      context - the dynamic context in which to evaluate the child clause's DataFrame.
      Returns:
      the DataFrame with the tuples returned by the child clause.
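      A minimal sketch of materializing the clause's output as a DataFrame (joinIterator and context are assumed given; how the wrapped Dataset<Row> is accessed afterwards is not shown on this page):

          // Obtain the tuples produced by this clause as a FlworDataFrame.
          FlworDataFrame output = joinIterator.getDataFrame(context);
          // The wrapped Dataset<Row> would then be handed to the parent clause.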
    • getInputTupleVariableDependencies

      protected Map<Name,DynamicContext.VariableDependency> getInputTupleVariableDependencies(Map<Name,DynamicContext.VariableDependency> parentProjection)
      Description copied from class: RuntimeTupleIterator
      Builds the DataFrame projection that this clause needs to receive from its child clause. The intent is that the result of this method is forwarded to the child clause in getDataFrame() so it can optimize some values away. Invariant: all keys in getInputTupleVariableDependencies(...) MUST be output tuple variables, i.e., appear in this.child.getOutputTupleVariableNames().
      Specified by:
      getInputTupleVariableDependencies in class RuntimeTupleIterator
      Parameters:
      parentProjection - the projection needed by the parent clause.
      Returns:
      the projection needed by this clause.
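      A sketch of how a parent's projection is narrowed before being forwarded to the child clause. The method is protected, so this would run inside the clause hierarchy; the variable name and the FULL dependency constant are assumptions:

          // Build the projection requested by the parent, then derive what this
          // clause must request from its child.
          Map<Name, DynamicContext.VariableDependency> parentProjection = new HashMap<>();
          parentProjection.put(someVariable, DynamicContext.VariableDependency.FULL);
          Map<Name, DynamicContext.VariableDependency> childProjection =
              this.getInputTupleVariableDependencies(parentProjection);
          // Invariant: every key of childProjection appears in
          // this.child.getOutputTupleVariableNames().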
    • containsClause

      public boolean containsClause(FLWOR_CLAUSES kind)
      Description copied from class: RuntimeTupleIterator
      Says whether the clause or any of its descendants is a clause of the specified kind.
      Specified by:
      containsClause in class RuntimeTupleIterator
      Parameters:
      kind - the kind of clause to test for.
      Returns:
      true if there is one, false otherwise.
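      A short sketch (the specific FLWOR_CLAUSES constant used here is an assumption):

          // Check whether this join clause or any descendant is a group-by clause.
          if (joinIterator.containsClause(FLWOR_CLAUSES.GROUP_BY)) {
              // ... adapt the execution plan accordingly ...
          }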
    • isSparkJobNeeded

      public boolean isSparkJobNeeded()
      Says whether this expression evaluation triggers a Spark job.
      Specified by:
      isSparkJobNeeded in class RuntimeTupleIterator
      Returns:
      true if the execution triggers a Spark job, false otherwise.
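      A sketch of choosing between local and Spark-backed evaluation (joinIterator and context are assumed given):

          // Pick the execution path depending on whether a Spark job is triggered.
          if (joinIterator.isSparkJobNeeded()) {
              FlworDataFrame df = joinIterator.getDataFrame(context);  // DataFrame path
              // ... continue with DataFrame-based execution ...
          } else {
              // Local path: open(context), then hasNext()/next(), then close().
          }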