Class EmblCDROMIndexStore
- All Implemented Interfaces:
IndexStore
EmblCDROMIndexStores implement a read-only
IndexStore backed by EMBL CD-ROM format binary
indices. The required index files are typically named
"division.lkp" and "entrynam.idx". As an IndexStore
performs lookups by sequence ID, the index files "acnum.trg" and
"acnum.hit" (which store additional accession number data) are not
used.
The sequence IDs are found using a binary search via a pointer
into the index file. The whole file is not read unless a request
for all the IDs is made using the getIDs() method. The set of IDs
is then cached after the first pass. This class also has a
close() method to free resources associated with the
underlying RandomAccessFile.
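The lookup strategy described above can be illustrated with a minimal sketch of a binary search over fixed-width records in a RandomAccessFile. The record layout used here (a 12-byte space-padded ASCII ID followed by an 8-byte offset) and the names FixedRecordLookupSketch, write and find are assumptions made for illustration; they are not the EMBL CD-ROM layout and not part of the BioJava API.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: a hypothetical record layout, not the real entrynam.idx format.
class FixedRecordLookupSketch {
    static final int ID_WIDTH = 12;                // bytes reserved for the ID (assumed)
    static final int RECORD_WIDTH = ID_WIDTH + 8;  // ID plus an 8-byte offset (assumed)

    /** Writes one record per ID; the IDs must already be sorted. */
    static void write(File file, String[] sortedIds, long[] offsets) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "rw")) {
            raf.setLength(0);
            for (int i = 0; i < sortedIds.length; i++) {
                byte[] raw = sortedIds[i].getBytes(StandardCharsets.US_ASCII);
                if (raw.length > ID_WIDTH) {
                    throw new IllegalArgumentException("ID too long: " + sortedIds[i]);
                }
                byte[] padded = new byte[ID_WIDTH];
                Arrays.fill(padded, (byte) ' ');   // pad the ID field with spaces
                System.arraycopy(raw, 0, padded, 0, raw.length);
                raf.write(padded);
                raf.writeLong(offsets[i]);
            }
        }
    }

    /** Returns the offset stored for id, or -1 if the ID is not present. */
    static long find(File file, String id) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
            long lo = 0;
            long hi = raf.length() / RECORD_WIDTH - 1;
            byte[] buf = new byte[ID_WIDTH];
            while (lo <= hi) {
                long mid = (lo + hi) >>> 1;
                raf.seek(mid * RECORD_WIDTH);      // jump straight to the middle record
                raf.readFully(buf);
                String candidate = new String(buf, StandardCharsets.US_ASCII).trim();
                int cmp = candidate.compareTo(id);
                if (cmp == 0) {
                    return raf.readLong();         // the offset follows the ID field
                }
                if (cmp < 0) {
                    lo = mid + 1;
                } else {
                    hi = mid - 1;
                }
            }
            return -1;
        }
    }
}
```

Only the records probed by the search are read, so a single lookup touches roughly log2(n) records rather than the whole file. The try-with-resources blocks play the role that close() plays for the real store.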
The binary index files may be created using the EMBOSS programs
dbifasta, dbiblast, dbiflat or dbigcg. The least useful from the
BioJava perspective is dbigcg because we do not have a
SequenceFormat implementation for GCG format
files.
The Index instances returned by this class do not
have the record length set because this information is not
available in the binary index. The value -1 is used instead, as
described in the Index interface.
- Since:
- 1.2
- Author:
- Keith James
-
Constructor Summary
Constructors
- EmblCDROMIndexStore(File pathPrefix, File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser): Creates a new EmblCDROMIndexStore backed by a random access binary index.
- EmblCDROMIndexStore(File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser): Creates a new EmblCDROMIndexStore backed by a random access binary index.
-
Method Summary
Methods (modifier and type, method, description)
- void close(): close closes the underlying EntryNamRandomAccess which in turn closes the lower level RandomAccessFile.
- void commit(): commit commits changes.
- Index fetch(String id): Fetch an Index based upon an ID.
- Set getFiles(): Retrieve the Set of files that are currently indexed.
- SequenceFormat getFormat(): Retrieve the format of the index file.
- Set getIDs(): Retrieve the set of all current IDs.
- String getName(): getName returns the database name as defined within the EMBL CD-ROM index.
- File getPathPrefix(): getPathPrefix returns the abstract path currently being appended to the raw sequence database filenames extracted from the binary index.
- SequenceBuilderFactory getSBFactory(): Retrieve the SequenceBuilderFactory used to build Sequence instances.
- SymbolTokenization getSymbolParser(): Retrieve the symbol parser used to turn the sequence characters into Symbol objects.
- void rollback(): rollback rolls back changes made since the last commit.
- void setPathPrefix(File pathPrefix): setPathPrefix sets the abstract path to be appended to sequence database filenames retrieved from the binary index.
- void store(Index index): store adds an Index to the store.
-
Constructor Details
-
EmblCDROMIndexStore
public EmblCDROMIndexStore(File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser) throws IOException
Creates a new EmblCDROMIndexStore backed by a random access binary index.
- Parameters:
- divisionLkp - a File containing the master index.
- entryNamIdx - a File containing the sequence IDs and offsets.
- format - a SequenceFormat.
- factory - a SequenceBuilderFactory.
- parser - a SymbolTokenization.
- Throws:
- IOException - if an error occurs.
-
EmblCDROMIndexStore
public EmblCDROMIndexStore(File pathPrefix, File divisionLkp, File entryNamIdx, SequenceFormat format, SequenceBuilderFactory factory, SymbolTokenization parser) throws IOException
Creates a new EmblCDROMIndexStore backed by a random access binary index.
- Parameters:
- pathPrefix - a File containing the abstract path to be appended to sequence database filenames retrieved from the binary index.
- divisionLkp - a File containing the master index.
- entryNamIdx - a File containing the sequence IDs and offsets.
- format - a SequenceFormat.
- factory - a SequenceBuilderFactory.
- parser - a SymbolTokenization.
- Throws:
- IOException - if an error occurs.
-
-
Method Details
-
getPathPrefix
getPathPrefix returns the abstract path currently being appended to the raw sequence database filenames extracted from the binary index. This value defaults to the empty abstract path.
- Returns:
- a File.
-
setPathPrefix
setPathPrefix sets the abstract path to be appended to sequence database filenames retrieved from the binary index. E.g. if the binary index refers to the database as 'SWALL' and the pathPrefix is set to "/usr/local/share/data/seq/", then the IndexStore will know the database path as "/usr/local/share/data/seq/swall" and any Index instances produced by the store will return the latter path when their getFile() method is called. This value defaults to the empty abstract path.
- Parameters:
- pathPrefix - a File prefix specifying the abstract path to append.
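How a prefix might be combined with a filename taken from the index can be sketched with java.io.File alone. PathPrefixSketch and resolve are hypothetical names, and the handling of the empty abstract path is an assumption about a sensible default, not a description of what EmblCDROMIndexStore does internally. The sketch performs no case conversion on the filename.

```java
import java.io.File;

// Illustrative only: not the BioJava implementation.
class PathPrefixSketch {
    /** Resolves a database filename from the index against an optional prefix. */
    static File resolve(File pathPrefix, String dbFileName) {
        // Treat a missing or empty abstract path as "no prefix". Passing an
        // empty parent to new File(parent, child) would instead resolve the
        // child against a system-dependent default directory.
        if (pathPrefix == null || pathPrefix.getPath().isEmpty()) {
            return new File(dbFileName);
        }
        return new File(pathPrefix, dbFileName);
    }
}
```

With this sketch, resolve(new File("/usr/local/share/data/seq/"), "swall") yields the same abstract path as new File("/usr/local/share/data/seq/swall"), while an empty prefix leaves the filename unchanged.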
-
getName
getName returns the database name as defined within the EMBL CD-ROM index.
- Specified by:
- getName in interface IndexStore
- Returns:
- a String value.
-
store
store adds an Index to the store. As EMBL CD-ROM indices are read-only, this implementation throws a BioException.
- Specified by:
- store in interface IndexStore
- Parameters:
- index - an Index.
- Throws:
- IllegalIDException - if an error occurs.
- BioException - if an error occurs.
-
commit
commit commits changes. As EMBL CD-ROM indices are read-only, this implementation throws a BioException.
- Specified by:
- commit in interface IndexStore
- Throws:
- BioException - if an error occurs.
-
rollback
rollback rolls back changes made since the last commit. As EMBL CD-ROM indices are read-only, this implementation does nothing.
- Specified by:
- rollback in interface IndexStore
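The read-only contract shared by store, commit and rollback can be summarised in a small self-contained sketch. ReadOnlyStoreSketch and its nested StoreException are hypothetical stand-ins (StoreException plays the role of BioException here), and the in-memory map replaces the binary index purely for illustration.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: mirrors the documented behaviour, not the BioJava source.
class ReadOnlyStoreSketch {
    /** Hypothetical stand-in for BioException. */
    static class StoreException extends Exception {
        StoreException(String message) {
            super(message);
        }
    }

    private final Map<String, Long> offsets;

    ReadOnlyStoreSketch(Map<String, Long> offsets) {
        // Defensive copy, so later changes to the caller's map are not seen
        this.offsets = Collections.unmodifiableMap(new TreeMap<>(offsets));
    }

    long fetch(String id) throws StoreException {
        Long offset = offsets.get(id);
        if (offset == null) {
            throw new StoreException("Unknown ID: " + id);
        }
        return offset;
    }

    void store(String id, long offset) throws StoreException {
        // Mutation is rejected outright
        throw new StoreException("Store is read-only: cannot add " + id);
    }

    void commit() throws StoreException {
        // There can never be pending changes to commit
        throw new StoreException("Store is read-only: nothing to commit");
    }

    void rollback() {
        // Nothing can have changed, so there is nothing to undo
    }
}
```

Making rollback a no-op while store and commit throw means that generic cleanup code, which typically calls rollback after a failed store, can run against a read-only store without raising a second error.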
-
fetch
Description copied from interface: IndexStore
Fetch an Index based upon an ID.
- Specified by:
- fetch in interface IndexStore
- Parameters:
- id - The ID of the sequence Index to retrieve
- Throws:
- IllegalIDException - if the ID couldn't be found
- BioException - if the fetch fails in the underlying storage mechanism
-
getIDs
Description copied from interface: IndexStore
Retrieve the set of all current IDs.
This set should either be immutable, or modifiable totally separately from the IndexStore.
- Specified by:
- getIDs in interface IndexStore
- Returns:
- a Set of all legal IDs
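One way to satisfy the "immutable, or modifiable totally separately" requirement while also caching the result of the first full pass is sketched below. CachedIdSetSketch, the fullScan supplier and the scans counter are hypothetical names introduced for this example; the real class reads the IDs from the binary index rather than from a Supplier.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Supplier;

// Illustrative only: a lazily cached, unmodifiable ID set.
class CachedIdSetSketch {
    private final Supplier<Set<String>> fullScan; // stands in for reading the whole index
    private Set<String> cached;                   // null until the first request
    int scans = 0;                                // counts full passes, for demonstration

    CachedIdSetSketch(Supplier<Set<String>> fullScan) {
        this.fullScan = fullScan;
    }

    synchronized Set<String> getIDs() {
        if (cached == null) {
            // Copy first, so the cache is detached from whatever the scan returned,
            // then wrap the copy so that callers cannot modify it
            cached = Collections.unmodifiableSet(new LinkedHashSet<>(fullScan.get()));
            scans++;
        }
        return cached;
    }
}
```

Repeated calls return the same set without another pass, and any attempt to add to or remove from the returned set fails with an UnsupportedOperationException.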
-
getFiles
Description copied from interface: IndexStore
Retrieve the Set of files that are currently indexed.
This set should either be immutable, or modifiable totally separately from the IndexStore.
- Specified by:
- getFiles in interface IndexStore
- Returns:
- a Set of all indexed files
-
getFormat
Description copied from interface: IndexStore
Retrieve the format of the index file.
- Specified by:
- getFormat in interface IndexStore
- Returns:
- the format of the index file
-
getSBFactory
Description copied from interface: IndexStore
Retrieve the SequenceBuilderFactory used to build Sequence instances.
- Specified by:
- getSBFactory in interface IndexStore
- Returns:
- the associated SequenceBuilderFactory
-
getSymbolParser
Description copied from interface: IndexStore
Retrieve the symbol parser used to turn the sequence characters into Symbol objects.
- Specified by:
- getSymbolParser in interface IndexStore
- Returns:
- the associated SymbolParser
-
close
close closes the underlying EntryNamRandomAccess which in turn closes the lower level RandomAccessFile. This frees the resources associated with the file.
- Throws:
- IOException - if an error occurs.
-