ROSE  0.11.96.11
Public Member Functions | List of all members
Rose::EditDistance::TreeEditDistance::Analysis Class Reference

Description

Analysis object for tree edit distance.

The Analysis object holds the settings and state for performing tree edit distance. See TreeEditDistance for details and examples.

Definition at line 95 of file TreeEditDistance.h.

#include <TreeEditDistance.h>

Public Member Functions

 Analysis ()
 Construct an analysis with default values.
 
Analysisclear ()
 Forget calculated results. More...
 
double cost () const
 Total cost for making one tree the same shape as the other. More...
 
double relativeCost () const
 Relative cost. More...
 
Edits edits () const
 Edit operations to make one path look like another.
 
std::pair< SgNode *, SgNode * > trees () const
 The two trees that were compared.
 
std::pair< size_t, size_t > graphSize () const
 Number of vertices and edges in the graph. More...
 
void emitGraphViz (std::ostream &) const
 Emit a GraphViz file. More...
 
double insertionCost () const
 Property: insertion cost. More...
 
AnalysisinsertionCost (double weight)
 Property: insertion cost. More...
 
double deletionCost () const
 Property: deletion cost. More...
 
AnalysisdeletionCost (double weight)
 Property: deletion cost. More...
 
double substitutionCost () const
 Property: substitution cost. More...
 
AnalysissubstitutionCost (double weight)
 Property: substitution cost. More...
 
SubstitutionPredicatesubstitutionPredicate () const
 Property: substitution predicate. More...
 
AnalysissubstitutionPredicate (SubstitutionPredicate *predicate)
 Property: substitution predicate. More...
 
Analysiscompute (SgNode *source, SgNode *target, SgFile *sourceFile=NULL, SgFile *targetFile=NULL)
 Compute tree edit distances. More...
 
Analysiscompute (SgNode *target, SgFile *targetFile=NULL)
 Compute tree edit distances. More...
 
Analysiscompute ()
 Compute tree edit distances. More...
 
const std::vector< SgNode * > & sourceTreeNodes () const
 List of nodes in the trees.
 
const std::vector< SgNode * > & targetTreeNodes () const
 List of nodes in the trees.
 
AnalysissetTree1 (SgNode *ast, SgFile *file=NULL)
 Change one tree or the other. More...
 
AnalysissetTree2 (SgNode *ast, SgFile *file=NULL)
 Change one tree or the other. More...
 

Member Function Documentation

◆ clear()

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::clear ( )
inline

Forget calculated results.

Causes the analysis to forget the trees being compared and previous results. Does not modify properties that affect the analysis operation (like the various edit costs).

Definition at line 129 of file TreeEditDistance.h.

◆ insertionCost() [1/2]

double Rose::EditDistance::TreeEditDistance::Analysis::insertionCost ( ) const
inline

Property: insertion cost.

The non-negative cost of performing an insertion when editing one tree to make it look like another.

Definition at line 143 of file TreeEditDistance.h.

◆ insertionCost() [2/2]

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::insertionCost ( double  weight)
inline

Property: insertion cost.

The non-negative cost of performing an insertion when editing one tree to make it look like another.

Definition at line 146 of file TreeEditDistance.h.

◆ deletionCost() [1/2]

double Rose::EditDistance::TreeEditDistance::Analysis::deletionCost ( ) const
inline

Property: deletion cost.

The non-negative cost of performing a deletion when editing one tree to make it look like another.

Definition at line 158 of file TreeEditDistance.h.

◆ deletionCost() [2/2]

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::deletionCost ( double  weight)
inline

Property: deletion cost.

The non-negative cost of performing a deletion when editing one tree to make it look like another.

Definition at line 161 of file TreeEditDistance.h.

◆ substitutionCost() [1/2]

double Rose::EditDistance::TreeEditDistance::Analysis::substitutionCost ( ) const
inline

Property: substitution cost.

The non-negative cost of performing a substitution when editing one tree to make it look like another.

Definition at line 173 of file TreeEditDistance.h.

◆ substitutionCost() [2/2]

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::substitutionCost ( double  weight)
inline

Property: substitution cost.

The non-negative cost of performing a substitution when editing one tree to make it look like another.

Definition at line 176 of file TreeEditDistance.h.

◆ substitutionPredicate() [1/2]

SubstitutionPredicate* Rose::EditDistance::TreeEditDistance::Analysis::substitutionPredicate ( ) const
inline

Property: substitution predicate.

A substitution can only occur when the source and destination AST nodes are at the same depth in their respective subtrees, and when the substitution predicate returns true. The default predicate (when the pointer is null) always returns true, thus taking only tree shape into account when making one tree look like another.

Definition at line 190 of file TreeEditDistance.h.

◆ substitutionPredicate() [2/2]

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::substitutionPredicate ( SubstitutionPredicate predicate)
inline

Property: substitution predicate.

A substitution can only occur when the source and destination AST nodes are at the same depth in their respective subtrees, and when the substitution predicate returns true. The default predicate (when the pointer is null) always returns true, thus taking only tree shape into account when making one tree look like another.

Definition at line 193 of file TreeEditDistance.h.

◆ compute() [1/3]

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::compute ( SgNode source,
SgNode target,
SgFile sourceFile = NULL,
SgFile targetFile = NULL 
)

Compute tree edit distances.

Computes edit distances and stores them in this analysis. Most of the other methods simply query the results computed from this call.

Given two trees, source and target, compute and store within this object information about the edit distance from source to target. That is, the cost for editing source to make it the same shape as target. Applying edits to build such a tree is most likely nonsensical, but it gives a measure of similarity between two trees. If files sourceFile and/or targetFile are non-null, then the Sg_File_Info of each node under source and target, respectively, are compared with the specified files and contribute to the edit distance only when that node belongs to the specified file.

Returns this analysis object so that queries can be chained, as in

double diff = TreeEditDistance::Analysis().compute(t1,t2).cost();

There are three versions of this function:

  • A version where both trees are specified.
  • A version where only the target tree is specified and the source tree is re-used from a previous calculation. This is useful when comparing one tree against many trees.
  • A version that re-uses both trees from a previous calculation. This is useful when one changes only the edit costs or other properties that might influence the result.

This analysis uses Dijkstra's shortest path and takes a total of $O(V_s V_t)$ memory and $O(V_s V_t log(V_s V_t) + E)$ where $V_s$ and $V_t$ are the number of nodes in the source and target trees and $E$ is the number of edges representing possible insertions, deletions, and substitutions.

◆ compute() [2/3]

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::compute ( SgNode target,
SgFile targetFile = NULL 
)

Compute tree edit distances.

Computes edit distances and stores them in this analysis. Most of the other methods simply query the results computed from this call.

Given two trees, source and target, compute and store within this object information about the edit distance from source to target. That is, the cost for editing source to make it the same shape as target. Applying edits to build such a tree is most likely nonsensical, but it gives a measure of similarity between two trees. If files sourceFile and/or targetFile are non-null, then the Sg_File_Info of each node under source and target, respectively, are compared with the specified files and contribute to the edit distance only when that node belongs to the specified file.

Returns this analysis object so that queries can be chained, as in

double diff = TreeEditDistance::Analysis().compute(t1,t2).cost();

There are three versions of this function:

  • A version where both trees are specified.
  • A version where only the target tree is specified and the source tree is re-used from a previous calculation. This is useful when comparing one tree against many trees.
  • A version that re-uses both trees from a previous calculation. This is useful when one changes only the edit costs or other properties that might influence the result.

This analysis uses Dijkstra's shortest path and takes a total of $O(V_s V_t)$ memory and $O(V_s V_t log(V_s V_t) + E)$ where $V_s$ and $V_t$ are the number of nodes in the source and target trees and $E$ is the number of edges representing possible insertions, deletions, and substitutions.

◆ compute() [3/3]

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::compute ( )

Compute tree edit distances.

Computes edit distances and stores them in this analysis. Most of the other methods simply query the results computed from this call.

Given two trees, source and target, compute and store within this object information about the edit distance from source to target. That is, the cost for editing source to make it the same shape as target. Applying edits to build such a tree is most likely nonsensical, but it gives a measure of similarity between two trees. If files sourceFile and/or targetFile are non-null, then the Sg_File_Info of each node under source and target, respectively, are compared with the specified files and contribute to the edit distance only when that node belongs to the specified file.

Returns this analysis object so that queries can be chained, as in

double diff = TreeEditDistance::Analysis().compute(t1,t2).cost();

There are three versions of this function:

  • A version where both trees are specified.
  • A version where only the target tree is specified and the source tree is re-used from a previous calculation. This is useful when comparing one tree against many trees.
  • A version that re-uses both trees from a previous calculation. This is useful when one changes only the edit costs or other properties that might influence the result.

This analysis uses Dijkstra's shortest path and takes a total of $O(V_s V_t)$ memory and $O(V_s V_t log(V_s V_t) + E)$ where $V_s$ and $V_t$ are the number of nodes in the source and target trees and $E$ is the number of edges representing possible insertions, deletions, and substitutions.

◆ cost()

double Rose::EditDistance::TreeEditDistance::Analysis::cost ( ) const

Total cost for making one tree the same shape as the other.

This is the same value returned by the previous call to compute and also available by querying for the actual list of edits and summing their costs.

◆ relativeCost()

double Rose::EditDistance::TreeEditDistance::Analysis::relativeCost ( ) const

Relative cost.

This function returns an edit distance normalized to the size of the larger tree. It does so by simply dividing the total edit cost by the number of selected nodes in the larger tree.

◆ graphSize()

std::pair<size_t, size_t> Rose::EditDistance::TreeEditDistance::Analysis::graphSize ( ) const

Number of vertices and edges in the graph.

The graph is used to compute Dijkstra's shortest path and minimize the cost of the edits. This function returns the number of vertices and edges in the graph.

◆ emitGraphViz()

void Rose::EditDistance::TreeEditDistance::Analysis::emitGraphViz ( std::ostream &  ) const

Emit a GraphViz file.

Emits a GraphViz file of the graph used to compute the edit distance and the edit distance path. The rows of the graph represent nodes in the source AST and the columns represent nodes in the target AST. Note that the columns are not in the same order for each row because of limitations with the GraphViz "dot" program. Red edges represent deletion of a node in the source tree; green edges represent insertion of a node from the target tree; blue edges represent substitution of a node in the source tree with a node from the target tree; and black edges represent the edit path that was computed by the compute method.

The output file also contains invisible edges in an attempt to coerce "dot" into making a layout that is as close to being a matrix as possible.

◆ setTree1()

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::setTree1 ( SgNode ast,
SgFile file = NULL 
)

Change one tree or the other.

This function sets one of the two trees that are being compared without affecting the other.

◆ setTree2()

Analysis& Rose::EditDistance::TreeEditDistance::Analysis::setTree2 ( SgNode ast,
SgFile file = NULL 
)

Change one tree or the other.

This function sets one of the two trees that are being compared without affecting the other.


The documentation for this class was generated from the following file: