|
| 1 | +--- |
| 2 | +title: BioJava3 Design |
| 3 | +--- |
| 4 | + |
| 5 | +References |
| 6 | +---------- |
| 7 | + |
| 8 | +This document was based on comments made on the following pages: |
| 9 | + |
| 10 | +- <http://biojava.org/wiki/BioJava3_Proposal> |
| 11 | +- <http://biojava.org/wiki/Talk:BioJava3_Proposal> |
| 12 | +- <http://biojava.org/wiki/UsageAnalysis> |
| 13 | +- <http://www.derkholm.net/svn/repos/bjv2/website/docs/index.html> |
| 14 | + |
| 15 | +Basic principles |
| 16 | +---------------- |
| 17 | + |
| 18 | +- BioJava3 (BJ3) will freely incorporate features from Java 6. |
| 19 | +- Maven will be used to build the project. |
| 20 | +- Full unit testing for every aspect from the ground up using JUnit. |
| 21 | +- Modular design without any cyclic dependencies, with separate JARs |
| 22 | + for key components (IO, databases, genetic algorithms, sequence |
| 23 | + manipulation, etc.) |
| 24 | +- Separation of APIs from implementation code by means of packages. |
| 25 | +- Base package name: org.biojava3 (to prevent clashes with org.biojava |
| 26 | + and org.biojavax, both of which will have backwards-compatibility |
| 27 | + extensions to BJ3 in order to make old code reusable). |
| 28 | +- Use of JavaBeans concepts wherever possible, e.g. getters/setters. |
| 29 | + This would enhance Java EE compliance and ease integration into |
| 30 | + larger applications. |
| 31 | +- Fully commented code in LOTS of detail INCLUDING package-level docs |
| 32 | + AND wiki-docs such as the cookbook. |
| 33 | +- Use of annotations for things like database mappings. |
| 34 | +- A consistent coding style to be developed and applied. |
| 35 | +- No Swing code to be included, but graphics code is OK for obviously |
| 36 | + useful things such as protein structures or sequence traces. Swing |
| 37 | + code cannot be written in a way that integrates fully with every |
| 38 | + user's own program requirements. |
| 39 | + |
| 40 | +Compromises and unfinished bits |
| 41 | +------------------------------- |
| 42 | + |
| 43 | +- TestNG was suggested instead of JUnit, but knowledge of this tool is |
| 44 | + less widespread, which may affect the quality of testing. |
| 45 | +- A tool for analysing comment coverage and coding style was |
| 46 | + suggested, but none has been identified. Please amend this document |
| 47 | + with the names of any good ones you know. |
| 48 | + |
| 49 | +Priorities |
| 50 | +---------- |
| 51 | + |
| 52 | +Andreas' very useful Usage Analysis page shows the most frequently |
| 53 | +requested documentation. In the absence of any real usage statistics, we |
| 54 | +must assume that the things people most often want to read about are the |
| 55 | +things that people most often use. (It could also be said that the |
| 56 | +things that people most read about are the things that work least well |
| 57 | +in the present code... but we shall ignore that for now...). |
| 58 | + |
| 59 | +Here are the priorities based on Andreas' work: |
| 60 | + |
| 61 | +- How to get an Alphabet |
| 62 | +- How to make a Sequence Object from a String or make a Sequence |
| 63 | + Object back into a String |
| 64 | +- How to parse BLAST output |
| 65 | +- How to read sequences from a FASTA file |
| 66 | +- How to read a GenBank, SwissProt or EMBL file |
| 67 | +- How to generate a global or local alignment with the |
| 68 | + Needleman-Wunsch or Smith-Waterman algorithm |
| 69 | +- How to read a protein structure from a PDB file |
| 70 | +- How to export a sequence to FASTA |
| 71 | +- How to view a sequence in a GUI |
| 72 | +- How to parse a FASTA database search output file |
| 73 | + |
| 74 | +These can be broken down into the following modules: |
| 75 | + |
| 76 | +- Plain sequence \<-\> Enriched sequence |
| 77 | +- Sequence similarity -\> Sequence similarity IO (BLAST, FASTA, etc.) |
| 78 | +- Plain sequence -\> Plain sequence IO (GenBank, FASTA, etc.) |
| 79 | +- Enriched sequence -\> Sequence alignments |
| 80 | +- Enriched sequence -\> Protein structures |
| 81 | + |