public class StringEditDistance extends Object
Constructor and Description |
---|
StringEditDistance() |
Modifier and Type | Method and Description |
---|---|
static double[][] | calculateDerivative(AlignmentAlgorithm<? extends AlignmentDerivativeAlgorithm> algorithm, Sequence a, Sequence b) Calculates the alignment derivative between the two given input sequences using the given algorithm. |
static <R extends AlignmentDerivativeAlgorithm> double[][][][] | calculateDerivatives(AlignmentAlgorithm<R> algorithm, Sequence[] dataSpace, int threadNum) Calculates the pairwise alignment derivative between all given input sequences using the given algorithm. |
static AlignmentSpecification | setUpSpecification(Sequence[] dataSpace) Sets up a default AlignmentSpecification for the simple StringEditDistance problem. |
static AlignmentSpecification | setUpSpecification(Sequence[] dataSpace, double[][] scoringScheme) Sets up an AlignmentSpecification for the simple StringEditDistance problem. |
static AlignmentSpecification | setUpSpecification(Sequence[] dataSpace, double matchCost, double mismatchCost, double gapCost) Sets up an AlignmentSpecification for the simple StringEditDistance problem. |
static Sequence[] | toSequences(Collection<String> strings) Transforms the given strings to the TCSAlignmentToolbox Sequence format. |
static Sequence[] | toSequences(Collection<String> strings, Alphabet alphabet) Transforms the given strings to the TCSAlignmentToolbox Sequence format using the given alphabet. |
static Sequence[] | toSequences(String[] strings) Transforms the given strings to the TCSAlignmentToolbox Sequence format. |
static Sequence[] | toSequences(String[] strings, Alphabet alphabet) Transforms the given strings to the TCSAlignmentToolbox Sequence format using the given alphabet. |
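The matchCost, mismatchCost and gapCost parameters of setUpSpecification correspond to the costs of the classic string edit distance. The following is a minimal standalone sketch of that cost model, not the toolbox's implementation: it does not call the TCSAlignmentToolbox API, and the class EditDistanceSketch with its methods tokenize and editDistance are hypothetical names introduced only for illustration. It also assumes that the presence of a '|' character is what selects the bar-separated input format documented for toSequences.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone illustration only; none of these names belong to the TCSAlignmentToolbox.
final class EditDistanceSketch {

    // Hypothetical helper mirroring the documented input format:
    // a string containing '|' is split into symbols at each bar,
    // otherwise every character is treated as its own symbol (assumption).
    static List<String> tokenize(String s) {
        List<String> symbols = new ArrayList<>();
        if (s.indexOf('|') >= 0) {
            for (String part : s.split("\\|")) {
                symbols.add(part);
            }
        } else {
            for (char c : s.toCharArray()) {
                symbols.add(String.valueOf(c));
            }
        }
        return symbols;
    }

    // Classic dynamic-programming edit distance with configurable costs.
    static double editDistance(List<String> a, List<String> b,
                               double matchCost, double mismatchCost, double gapCost) {
        int m = a.size();
        int n = b.size();
        double[][] d = new double[m + 1][n + 1];
        // Aligning a prefix with the empty sequence costs one gap per symbol.
        for (int i = 1; i <= m; i++) {
            d[i][0] = i * gapCost;
        }
        for (int j = 1; j <= n; j++) {
            d[0][j] = j * gapCost;
        }
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                double replacementCost =
                        a.get(i - 1).equals(b.get(j - 1)) ? matchCost : mismatchCost;
                double replace = d[i - 1][j - 1] + replacementCost;
                double delete = d[i - 1][j] + gapCost;
                double insert = d[i][j - 1] + gapCost;
                d[i][j] = Math.min(replace, Math.min(delete, insert));
            }
        }
        return d[m][n];
    }

    public static void main(String[] args) {
        List<String> a = tokenize("kitten");
        List<String> b = tokenize("sitting");
        // Match cost 0, mismatch cost 1, gap cost 1: prints 3.0
        System.out.println(editDistance(a, b, 0.0, 1.0, 1.0));
    }
}
```

With a match cost of 0 and mismatch and gap costs of 1, this reduces to the Levenshtein distance. Raising gapCost relative to mismatchCost makes replacements cheaper than insertions and deletions.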
public static Sequence[] toSequences(String[] strings)
strings
- your data. The input format can be either a plain string,
where each character is interpreted as a distinct symbol, or a string of
symbols separated by a vertical bar '|' character.
public static Sequence[] toSequences(Collection<String> strings)
strings
- your data. The input format can be either a plain string,
where each character is interpreted as a distinct symbol, or a string of
symbols separated by a vertical bar '|' character.
public static Sequence[] toSequences(String[] strings, Alphabet alphabet)
strings
- your data. The input format can be either a plain string,
where each character is interpreted as a distinct symbol, or a string of
symbols separated by a vertical bar '|' character.
alphabet
- the alphabet containing all symbols of your data space.
public static Sequence[] toSequences(Collection<String> strings, Alphabet alphabet)
strings
- your data. The input format can be either a plain string,
where each character is interpreted as a distinct symbol, or a string of
symbols separated by a vertical bar '|' character.
alphabet
- the alphabet containing all symbols of your data space.
public static AlignmentSpecification setUpSpecification(Sequence[] dataSpace)
dataSpace
- the sequences of your data space.
public static AlignmentSpecification setUpSpecification(Sequence[] dataSpace, double matchCost, double mismatchCost, double gapCost)
dataSpace
- the sequences of your data space.
matchCost
- the cost for aligning two equal symbols.
mismatchCost
- the cost for aligning two unequal symbols.
gapCost
- the cost for deleting or inserting a symbol.
public static AlignmentSpecification setUpSpecification(Sequence[] dataSpace, double[][] scoringScheme)
dataSpace
- the sequences of your data space.
scoringScheme
- a scoring scheme that defines individual replacement
and gap costs for each symbol. More formally: for each combination of
symbols a,b in Sigma u {_,skip} you have to specify
lambda(a,b), where lambda(a,b) is stored at position [i][j] in the matrix,
i is the index of a in the alphabet, and j is the index of b in the
alphabet (see also the Alphabet documentation).
Deletion costs lambda(a,_) are stored at position [i][|Sigma|] and insertion costs lambda(_,b)
public static double[][] calculateDerivative(AlignmentAlgorithm<? extends AlignmentDerivativeAlgorithm> algorithm, Sequence a, Sequence b)
algorithm
- an algorithm that returns enough information to
calculate the alignment.
a
- a sequence.
b
- a sequence.
public static <R extends AlignmentDerivativeAlgorithm> double[][][][] calculateDerivatives(AlignmentAlgorithm<R> algorithm, Sequence[] dataSpace, int threadNum)
algorithm
- an algorithm that returns enough information to
calculate the alignment.
dataSpace
- an array of sequences.
threadNum
- the number of threads that shall be used for parallel
computation.
Copyright (C) 2013, 2014 Benjamin Paaßen, Charlie Krüger, Georg Zentgraf, AG Theoretical Computer Science, Centre of Excellence Cognitive Interaction Technology (CITEC), University of Bielefeld, licensed under the AGPL v. 3: http://openresearch.cit-ec.de/projects/tcs