Class MotifTools
MotifTools contains utility methods for sequence motifs.
- Author:
- Keith James
Constructor Summary
Constructors
MotifTools()
Method Summary
Modifier and Type: static String
Method: createRegex(SymbolList motif)
Description: createRegex creates a regular expression which matches the SymbolList.
Constructor Details
MotifTools
public MotifTools()
Method Details
createRegex
createRegex creates a regular expression which matches the SymbolList. Ambiguous Symbols are simply transformed into character classes. For example, the nucleotide sequence "AAGCTT" becomes "A{2}GCT{2}" and "CTNNG" is expanded to "CT[ABCDGHKMNRSTVWY]{2}G". The character class is generated by calling the getMatches method of an ambiguity symbol to obtain the alphabet of AtomicSymbols it matches, then calling getAllSymbols on this alphabet, removing any gap symbols and tokenizing the remainder. The tokens in a character class are ordered by ascending numerical value, as determined by Arrays.sort(char[]).
The Alphabet of the SymbolList must be finite and must have a character token type. Regular expressions may be generated for any such SymbolList, not just DNA, RNA and protein.
- Parameters:
- motif - a SymbolList.
- Returns:
- a String regular expression.
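The transformation described above can be illustrated with a small standalone sketch. It is not the library's implementation and does not use SymbolList or Alphabet. The class MotifRegexSketch and its methods classTokens and toRegex are hypothetical names, and the lookup for 'N' is hard-coded from the expansion quoted in the description, standing in for the getMatches and getAllSymbols steps. The sketch assumes every token is a plain letter that needs no regex escaping.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Illustrative sketch only: this is NOT the library's implementation.
final class MotifRegexSketch {

    // Hypothetical stand-in for the getMatches/getAllSymbols steps: returns
    // the tokens that belong in the character class of an ambiguity token,
    // or null for an unambiguous token. Only 'N' is filled in, using the
    // expansion quoted in the documentation (deliberately unsorted here).
    static String classTokens(char token) {
        if (token == 'N') {
            return "TGCAYRWSKMBDHVN";
        }
        return null;
    }

    // Collapses runs of identical tokens into token{n} and replaces
    // ambiguity tokens with sorted character classes.
    static String toRegex(String motif) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < motif.length()) {
            char token = motif.charAt(i);

            // Count how many times the same token repeats from position i.
            int run = 1;
            while (i + run < motif.length() && motif.charAt(i + run) == token) {
                run++;
            }

            String members = classTokens(token);
            if (members == null) {
                regex.append(token);
            } else {
                // Order the class members by ascending char value.
                char[] sorted = members.toCharArray();
                Arrays.sort(sorted);
                regex.append('[').append(sorted).append(']');
            }

            if (run > 1) {
                regex.append('{').append(run).append('}');
            }
            i += run;
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        System.out.println(toRegex("AAGCTT"));
        String regex = toRegex("CTNNG");
        System.out.println(regex);
        // The resulting string can be handed to java.util.regex as usual.
        System.out.println(Pattern.compile(regex).matcher("AACTAGGTT").find());
    }
}
```

The String returned by createRegex can presumably be used the same way, by passing it to Pattern.compile and matching it against the string form of a sequence.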