Package org.biojava.bio.seq.io
Class SeqIOConstants
java.lang.Object
org.biojava.bio.seq.io.SeqIOConstants
SeqIOConstants contains constants used to identify
sequence formats, alphabets, etc., in the context of reading and
writing sequences.
An int used to specify symbol alphabet and
sequence format type is derived thus:
- The two least significant bytes are reserved for format types such as RAW, FASTA, EMBL etc.
- The two most significant bytes are reserved for alphabet and symbol information such as AMBIGUOUS, DNA, RNA, AA etc.
- Bitwise OR combinations of each component int are used to specify combinations of format type and symbol information. To derive an int identifier for DNA with ambiguity codes in Fasta format, bitwise OR the AMBIGUOUS, DNA and FASTA values.
- Author: Keith James
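The layout described above can be sketched as follows. This is a minimal, self-contained illustration, not the library's source: the constant values are stand-ins chosen to match the documented layout (alphabet flags in the most significant word, format code in the least significant word), the value 102 for FASTA is a placeholder, and FORMAT_MASK, ALPHABET_MASK, formatPart and alphabetPart are hypothetical helpers that are not part of SeqIOConstants.

```java
// Illustrative stand-ins only: the real values are defined in
// org.biojava.bio.seq.io.SeqIOConstants and may differ.
class SeqIdComposeSketch {
    // Alphabet flags live in the most significant word (bit 16 and up).
    static final int AMBIGUOUS = 1 << 16; // first bit of the high word
    static final int DNA       = 1 << 17; // second bit of the high word

    // Format codes live in the least significant word; 102 is a placeholder.
    static final int FASTA = 102;

    // Hypothetical masks for splitting an identifier into its two halves.
    static final int FORMAT_MASK   = 0x0000FFFF;
    static final int ALPHABET_MASK = 0xFFFF0000;

    static int formatPart(int id)   { return id & FORMAT_MASK; }
    static int alphabetPart(int id) { return id & ALPHABET_MASK; }

    public static void main(String[] args) {
        // DNA with ambiguity codes in Fasta format.
        int id = AMBIGUOUS | DNA | FASTA;

        System.out.println(Integer.toHexString(id));               // prints 30066
        System.out.println(formatPart(id));                        // prints 102
        System.out.println(Integer.toHexString(alphabetPart(id))); // prints 30000
    }
}
```

With the real class on the classpath, the equivalent expression would be SeqIOConstants.AMBIGUOUS | SeqIOConstants.DNA | SeqIOConstants.FASTA.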
Field Summary
Fields
Modifier and Type                   Field             Description
static final int                    AA                indicates that a sequence contains AA (amino acid) symbols.
static final int                    AMBIGUOUS         indicates that a sequence contains ambiguity symbols.
static final int                    DNA               indicates that a sequence contains DNA (deoxyribonucleic acid) symbols.
static final int                    EMBL              indicates that the sequence format is EMBL.
static final int                    EMBL_AA           premade EMBL | AA.
static final int                    EMBL_DNA          premade EMBL | DNA.
static final int                    EMBL_RNA          premade EMBL | RNA.
static final int                    FASTA             indicates that the sequence format is Fasta.
static final int                    FASTA_AA          premade FASTA | AA.
static final int                    FASTA_DNA         premade FASTA | DNA.
static final int                    FASTA_RNA         premade FASTA | RNA.
static final int                    GCG               indicates that the sequence format is GCG.
static final int                    GENBANK           indicates that the sequence format is GENBANK.
static final int                    GENBANK_AA        premade GENBANK | AA.
static final int                    GENBANK_DNA       premade GENBANK | DNA.
static final int                    GENBANK_RNA       premade GENBANK | RNA.
static final int                    GENPEPT           indicates that the sequence format is GENPEPT.
static final int                    GFF               indicates that the sequence format is GFF.
static final int                    IG                indicates that the sequence format is IG.
static final int                    INTEGER           indicates that a sequence contains integer alphabet symbols, such as used to describe sequence quality data.
static final LifeScienceIdentifier  LSID_EMBL_AA      sequence format LSID for EMBL AA.
static final LifeScienceIdentifier  LSID_EMBL_DNA     sequence format LSID for EMBL DNA.
static final LifeScienceIdentifier  LSID_EMBL_RNA     sequence format LSID for EMBL RNA.
static final LifeScienceIdentifier  LSID_FASTA_AA     sequence format LSID for Fasta AA.
static final LifeScienceIdentifier  LSID_FASTA_DNA    sequence format LSID for Fasta DNA.
static final LifeScienceIdentifier  LSID_FASTA_RNA    sequence format LSID for Fasta RNA.
static final LifeScienceIdentifier  LSID_GENBANK_AA   sequence format LSID for Genbank AA.
static final LifeScienceIdentifier  LSID_GENBANK_DNA  sequence format LSID for Genbank DNA.
static final LifeScienceIdentifier  LSID_GENBANK_RNA  sequence format LSID for Genbank RNA.
static final LifeScienceIdentifier  LSID_SWISSPROT    sequence format LSID for Swissprot.
static final int                    NBRF              indicates that the sequence format is NBRF.
static final int                    PDB               indicates that the sequence format is PDB.
static final int                    PHRED             indicates that the sequence format is PHRED.
static final int                    RAW               indicates that the sequence format is raw (symbols only).
static final int                    REFSEQ            indicates that the sequence format is REFSEQ.
static final int                    REFSEQ_AA         premade REFSEQ | AA.
static final int                    REFSEQ_DNA        premade REFSEQ | DNA.
static final int                    REFSEQ_RNA        premade REFSEQ | RNA.
static final int                    RNA               indicates that a sequence contains RNA (ribonucleic acid) symbols.
static final int                    SWISSPROT         indicates that the sequence format is SWISSPROT.
static final int                    UNKNOWN           indicates that the sequence format is unknown.
-
Constructor Summary
Constructors -
Field Details
-
AMBIGUOUS
AMBIGUOUS indicates that a sequence contains ambiguity symbols. The first bit of the most significant word of the int is set.
-
DNA
DNA indicates that a sequence contains DNA (deoxyribonucleic acid) symbols. The second bit of the most significant word of the int is set.
-
RNA
RNA indicates that a sequence contains RNA (ribonucleic acid) symbols. The third bit of the most significant word of the int is set.
-
AA
AA indicates that a sequence contains AA (amino acid) symbols. The fourth bit of the most significant word of the int is set.
-
INTEGER
INTEGER indicates that a sequence contains integer alphabet symbols, such as used to describe sequence quality data. The fifth bit of the most significant word of the int is set.
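Because each alphabet flag occupies its own bit, an identifier can be tested for a flag with a bitwise AND. The sketch below assumes the bit positions stated above (first to fifth bit of the most significant word, taken here as bits 16 to 20); the constants are local stand-ins, and hasFlag and describe are hypothetical helpers, not methods of SeqIOConstants.

```java
// Stand-in constants mirroring the documented bit positions; the real
// fields are in org.biojava.bio.seq.io.SeqIOConstants.
class AlphabetFlagSketch {
    static final int AMBIGUOUS = 1 << 16; // first bit of the high word
    static final int DNA       = 1 << 17; // second bit
    static final int RNA       = 1 << 18; // third bit
    static final int AA        = 1 << 19; // fourth bit
    static final int INTEGER   = 1 << 20; // fifth bit

    // True when every bit of 'flag' is set in 'id'.
    static boolean hasFlag(int id, int flag) {
        return (id & flag) == flag;
    }

    // Lists the alphabet flags present in an identifier, in bit order.
    static String describe(int id) {
        StringBuilder sb = new StringBuilder();
        if (hasFlag(id, AMBIGUOUS)) sb.append("AMBIGUOUS ");
        if (hasFlag(id, DNA))       sb.append("DNA ");
        if (hasFlag(id, RNA))       sb.append("RNA ");
        if (hasFlag(id, AA))        sb.append("AA ");
        if (hasFlag(id, INTEGER))   sb.append("INTEGER ");
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        int id = AMBIGUOUS | DNA | 102; // 102 is a placeholder format code
        System.out.println(describe(id)); // prints AMBIGUOUS DNA
    }
}
```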
-
UNKNOWN
UNKNOWN indicates that the sequence format is unknown.
-
RAW
RAW indicates that the sequence format is raw (symbols only).
-
FASTA
FASTA indicates that the sequence format is Fasta.
-
NBRF
NBRF indicates that the sequence format is NBRF.
-
IG
IG indicates that the sequence format is IG.
-
EMBL
EMBL indicates that the sequence format is EMBL.
-
SWISSPROT
SWISSPROT indicates that the sequence format is SWISSPROT. Always protein, so the AA bit is already set.
-
GENBANK
GENBANK indicates that the sequence format is GENBANK.
-
GENPEPT
GENPEPT indicates that the sequence format is GENPEPT. Always protein, so the AA bit is already set.
-
REFSEQ
REFSEQ indicates that the sequence format is REFSEQ.
-
GCG
GCG indicates that the sequence format is GCG.
-
GFF
GFF indicates that the sequence format is GFF.
-
PDB
PDB indicates that the sequence format is PDB. Always protein, so the AA bit is already set.
-
PHRED
PHRED indicates that the sequence format is PHRED. Always DNA, so the DNA bit is already set. The INTEGER bit is also set for quality data.
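Some format constants above (SWISSPROT, GENPEPT, PDB, PHRED) already carry alphabet bits, which has two practical consequences: OR-ing the same alphabet flag again changes nothing, and such a constant is not equal to its bare low-word format code. The following sketch illustrates both points under stated assumptions: the numeric codes 111 and 130 are placeholders, and formatCode and isProtein are hypothetical helpers rather than library methods.

```java
// Stand-in values for illustration; consult SeqIOConstants for the real ones.
class PresetBitsSketch {
    static final int DNA     = 1 << 17;
    static final int AA      = 1 << 19;
    static final int INTEGER = 1 << 20;

    // Placeholder format codes occupying the least significant word.
    static final int SWISSPROT_CODE = 111;
    static final int PHRED_CODE     = 130;

    // Formats whose alphabet is fixed have the matching bits preset.
    static final int SWISSPROT = SWISSPROT_CODE | AA;
    static final int PHRED     = PHRED_CODE | DNA | INTEGER;

    // Strips the alphabet bits, leaving only the format code.
    static int formatCode(int id) { return id & 0xFFFF; }

    static boolean isProtein(int id) { return (id & AA) != 0; }

    public static void main(String[] args) {
        System.out.println(PHRED == (PHRED | DNA)); // prints true: OR is idempotent
        System.out.println(formatCode(PHRED));      // prints 130
        System.out.println(PHRED == PHRED_CODE);    // prints false: high bits are set
        System.out.println(isProtein(SWISSPROT));   // prints true
    }
}
```

When comparing an identifier against one of these formats, masking off the high word first, as formatCode does here, avoids a mismatch caused by the preset alphabet bits.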
-
EMBL_DNA
EMBL_DNA premade EMBL | DNA.
-
EMBL_RNA
EMBL_RNA premade EMBL | RNA.
-
EMBL_AA
EMBL_AA premade EMBL | AA.
-
GENBANK_DNA
GENBANK_DNA premade GENBANK | DNA.
-
GENBANK_RNA
GENBANK_RNA premade GENBANK | RNA.
-
GENBANK_AA
GENBANK_AA premade GENBANK | AA.
-
REFSEQ_DNA
REFSEQ_DNA premade REFSEQ | DNA.
-
REFSEQ_RNA
REFSEQ_RNA premade REFSEQ | RNA.
-
REFSEQ_AA
REFSEQ_AA premade REFSEQ | AA.
-
FASTA_DNA
FASTA_DNA premade FASTA | DNA.
-
FASTA_RNA
FASTA_RNA premade FASTA | RNA.
-
FASTA_AA
FASTA_AA premade FASTA | AA.
-
LSID_FASTA_DNA
LSID_FASTA_DNA sequence format LSID for Fasta DNA.
-
LSID_FASTA_RNA
LSID_FASTA_RNA sequence format LSID for Fasta RNA.
-
LSID_FASTA_AA
LSID_FASTA_AA sequence format LSID for Fasta AA.
-
LSID_EMBL_DNA
LSID_EMBL_DNA sequence format LSID for EMBL DNA.
-
LSID_EMBL_RNA
LSID_EMBL_RNA sequence format LSID for EMBL RNA.
-
LSID_EMBL_AA
LSID_EMBL_AA sequence format LSID for EMBL AA.
-
LSID_GENBANK_DNA
LSID_GENBANK_DNA sequence format LSID for Genbank DNA.
-
LSID_GENBANK_RNA
LSID_GENBANK_RNA sequence format LSID for Genbank RNA.
-
LSID_GENBANK_AA
LSID_GENBANK_AA sequence format LSID for Genbank AA.
-
LSID_SWISSPROT
LSID_SWISSPROT sequence format LSID for Swissprot.
-
Constructor Details
-
SeqIOConstants
public SeqIOConstants()
-