### Solution Graph

A subgoal hierarchy is represented as a graph, in which nodes are clusters of similar subgoal labels and edges represent the sequential order of subgoals in a solution. A solution procedure is represented as the path along directed edges from the source node. Branching edges represent different solution strategies, and the transitive edges represent the substructure of a goal in a hierarchy.

### Computing the Similarity and Optimal Alignment

The similarity between two sequences of subgoal labels is determined by the optimal alignment of the sequences. The optimal alignment is computed by the global sequence alignment algorithm with the scoring scheme of {Match: the textual similarity score between subgoal labels, Insertion/deletion: 0, Mismatch: -0.2}. The textual similarity is calculated based on the Sørensen-Dice coefficient and is normalized to a 0-1 range.

### Applications

There are two applications of this computation: merging an input sequence to the right position in a solution graph and generating feedback based on alignments. The similarity between sequences provides the cue for the system to perform a graph union operation to merge the sequence with the graph. The misalignments in two sequences of labels imply structural differences, which can produce useful feedback on structure.