public class StringEditDistance extends Object
Constructor and Description |
---|
StringEditDistance() |
Modifier and Type | Method and Description |
---|---|
static double[][] | calculateDerivative(AlignmentAlgorithm<Node,Node,? extends DerivableAlignmentDistance<Node,Node>> algorithm, Sequence a, Sequence b) Calculates the alignment derivative between the two given input sequences using the given algorithm. |
static <R extends DerivableAlignmentDistance<Node,Node>> double[][][][] | calculateDerivatives(AlignmentAlgorithm<Node,Node,R> algorithm, Sequence[] dataSpace, int threadNum) Calculates the pairwise alignment derivative between all given input sequences using the given algorithm. |
static AlignmentSpecification | setUpSpecification(Sequence[] dataSpace) Sets up a default AlignmentSpecification for the simple StringEditDistance problem. |
static AlignmentSpecification | setUpSpecification(Sequence[] dataSpace, double[][] scoringScheme) Sets up an AlignmentSpecification for the simple StringEditDistance problem. |
static AlignmentSpecification | setUpSpecification(Sequence[] dataSpace, double matchCost, double mismatchCost, double gapCost, double skipCost) Sets up an AlignmentSpecification for the simple StringEditDistance problem. |
static Sequence[] | toSequences(Collection<String> strings) Transforms the given strings to the TCSAlignmentToolbox Sequence format. |
static Sequence[] | toSequences(Collection<String> strings, Alphabet alphabet) Transforms the given strings to the TCSAlignmentToolbox Sequence format using the given alphabet. |
static Sequence[] | toSequences(String[] strings) Transforms the given strings to the TCSAlignmentToolbox Sequence format. |
static Sequence[] | toSequences(String[] strings, Alphabet alphabet) Transforms the given strings to the TCSAlignmentToolbox Sequence format using the given alphabet. |
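
The toSequences methods accept strings in two formats: a plain string in which every character is one symbol, or a string of symbols separated by a vertical bar '|'. The following is a minimal sketch of that distinction only. The class SymbolFormatSketch and the method splitSymbols are hypothetical names that are not part of the TCSAlignmentToolbox, and the rule "a '|' anywhere in the string selects the bar-separated format" is an assumption; the toolbox may decide this differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; NOT part of the TCSAlignmentToolbox API.
public class SymbolFormatSketch {

    // Splits one input string into its symbols.
    // Assumption: if the string contains '|', it is treated as a list of
    // bar-separated symbols; otherwise every character is its own symbol.
    static List<String> splitSymbols(String input) {
        List<String> symbols = new ArrayList<>();
        if (input.indexOf('|') >= 0) {
            // The bar is a regex metacharacter and has to be escaped.
            for (String symbol : input.split("\\|")) {
                symbols.add(symbol);
            }
        } else {
            for (int i = 0; i < input.length(); i++) {
                symbols.add(String.valueOf(input.charAt(i)));
            }
        }
        return symbols;
    }

    public static void main(String[] args) {
        System.out.println(splitSymbols("abc"));      // prints [a, b, c]
        System.out.println(splitSymbols("ab|c|def")); // prints [ab, c, def]
    }
}
```

In the bar-separated format a symbol may span several characters, which is presumably why the overloads that take an Alphabet expect it to contain every symbol of the data space.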
public static Sequence[] toSequences(String[] strings)
strings
- your data. The input format can be either a plain string,
where each character is interpreted as a distinct symbol or a string of
symbols separated by a vertical bar '|' character.

public static Sequence[] toSequences(Collection<String> strings)
strings
- your data. The input format can be either a plain string,
where each character is interpreted as a distinct symbol or a string of
symbols separated by a vertical bar '|' character.

public static Sequence[] toSequences(String[] strings, Alphabet alphabet)
strings
- your data. The input format can be either a plain string,
where each character is interpreted as a distinct symbol or a string of
symbols separated by a vertical bar '|' character.
alphabet
- the alphabet containing all symbols of your data space.

public static Sequence[] toSequences(Collection<String> strings, Alphabet alphabet)
strings
- your data. The input format can be either a plain string,
where each character is interpreted as a distinct symbol or a string of
symbols separated by a vertical bar '|' character.
alphabet
- the alphabet containing all symbols of your data space.

public static AlignmentSpecification setUpSpecification(Sequence[] dataSpace)
dataSpace
- the sequences of your data space.

public static AlignmentSpecification setUpSpecification(Sequence[] dataSpace, double matchCost, double mismatchCost, double gapCost, double skipCost)
dataSpace
- the sequences of your data space.
matchCost
- the cost for aligning two equal symbols.
mismatchCost
- the cost for aligning two unequal symbols.
gapCost
- the cost for deleting or inserting a symbol.
skipCost
- the cost for skip-deleting or skip-inserting a symbol.

public static AlignmentSpecification setUpSpecification(Sequence[] dataSpace, double[][] scoringScheme)
dataSpace
- the sequences of your data space.
scoringScheme
- a scoring scheme that defines individual replacement
and gap costs for each symbol. More formally: for each combination of
symbols a,b in Sigma u {_,skip} you have to specify
lambda(a,b), where lambda(a,b) is stored at point [i][j] in the matrix,
where i is the index of a in the alphabet and j is the index of b in the
alphabet (see also the Alphabet documentation).
Deletion costs lambda(a,_) are stored at point [i][|Sigma|] and insertion costs lambda(_,b)

public static double[][] calculateDerivative(@NonNull AlignmentAlgorithm<Node,Node,? extends DerivableAlignmentDistance<Node,Node>> algorithm, @NonNull Sequence a, @NonNull Sequence b)
algorithm
- an algorithm that returns enough information to
calculate the alignment.
a
- a sequence.
b
- a sequence.

public static <R extends DerivableAlignmentDistance<Node,Node>> double[][][][] calculateDerivatives(@NonNull AlignmentAlgorithm<Node,Node,R> algorithm, @NonNull Sequence[] dataSpace, int threadNum)
R
- the class of DerivableAlignmentDistance which the given AlignmentAlgorithm computes.
algorithm
- an algorithm that returns enough information to
calculate the alignment.
dataSpace
- an array of sequences.
threadNum
- the number of threads that shall be used for parallel
computation.

Copyright (C) 2016 Benjamin Paaßen, AG Theoretical Computer Science, Centre of Excellence Cognitive Interaction Technology (CITEC), University of Bielefeld, licensed under the AGPL v. 3: http://openresearch.cit-ec.de/projects/tcs . This documentation is licensed under the conditions of CC-BY-SA 4.0: https://creativecommons.org/licenses/by-sa/4.0/