Commit 1db8441

dicknetherlands authored and andreasprlic committed
New page: ==References== This document was based on comments made on the following pages: * http://biojava.org/wiki/BioJava3_Proposal * http://biojava.org/wiki/Talk:BioJava3_Proposal * http://biojav...
1 parent 1f417d8 commit 1db8441

2 files changed: +128 -0 lines changed
_wikis/BioJava3_Design.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
---
title: BioJava3 Design
---

References
----------

This document was based on comments made on the following pages:

- <http://biojava.org/wiki/BioJava3_Proposal>
- <http://biojava.org/wiki/Talk:BioJava3_Proposal>
- <http://biojava.org/wiki/UsageAnalysis>
- <http://www.derkholm.net/svn/repos/bjv2/website/docs/index.html>

Basic principles
----------------

- BioJava3 (BJ3) will freely incorporate features from Java 6.
- Maven will be used to build the project.
- Full unit testing for every aspect from the ground up using JUnit.
- Modular design without any cyclic dependencies, with separate JARs
  for key components (IO, databases, genetic algorithms, sequence
  manipulation, etc.).
- Separation of APIs from implementation code by means of packages.
- Base package name: org.biojava3 (to prevent clashes with org.biojava
  and org.biojavax, both of which will have backwards-compatibility
  extensions to BJ3 in order to make old code reusable).
- Use of JavaBeans concepts wherever possible, e.g. getters/setters.
  This would enhance Java EE compliance and improve integration into
  larger systems.
- Fully commented code in LOTS of detail INCLUDING package-level docs
  AND wiki-docs such as the cookbook.
- Use of annotations for things like database mappings.
- A consistent coding style to be developed and applied.
- No Swing code to be included, but graphics code is OK for obviously
  useful things such as protein structures or sequence traces. Swing
  code is impossible to write in a way that will integrate fully with
  each different individual's own program requirements.

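As an illustration of the JavaBeans and annotation principles listed above, here is a minimal sketch in plain Java. The names `DbColumn`, `SequenceRecord` and `BeanMappingSketch` are assumptions made up for this example; they are not part of BioJava or of any persistence library, and a real design might well use an existing mapping framework instead.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical annotation naming the database column a getter maps to.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface DbColumn {
    String value();
}

// Hypothetical JavaBean: private fields, no-arg constructor, getters/setters.
class SequenceRecord {
    private String accession;
    private String residues;

    public SequenceRecord() {
    }

    @DbColumn("accession")
    public String getAccession() {
        return accession;
    }

    public void setAccession(String accession) {
        this.accession = accession;
    }

    @DbColumn("seq_text")
    public String getResidues() {
        return residues;
    }

    public void setResidues(String residues) {
        this.residues = residues;
    }
}

class BeanMappingSketch {

    // Collects column name -> value for every getter tagged with @DbColumn.
    // A TreeMap keeps the result ordered by column name, because the order
    // of Class.getMethods() is unspecified.
    static Map<String, Object> toRow(Object bean) {
        Map<String, Object> row = new TreeMap<String, Object>();
        for (Method method : bean.getClass().getMethods()) {
            DbColumn column = method.getAnnotation(DbColumn.class);
            if (column == null) {
                continue;
            }
            try {
                row.put(column.value(), method.invoke(bean));
            } catch (Exception e) {
                throw new IllegalStateException("Cannot read " + method.getName(), e);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        SequenceRecord record = new SequenceRecord();
        record.setAccession("X1");
        record.setResidues("ACGT");
        System.out.println(toRow(record));
    }
}
```

Because the mapping lives in annotations on the getters, the bean itself stays free of database code, which fits the separation of API from implementation described in the list.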
Compromises and Unfinished bits
-------------------------------

- TestNG was suggested instead of JUnit, but it is less widely known,
  which may affect the quality of testing.
- A tool for analysing comment coverage and coding style was
  suggested, but none has been identified yet. Please amend this
  document with the names of any good ones you know.

Priorities
----------

Andreas' very useful Usage Analysis page shows the most frequently
requested documentation. In the absence of any real usage statistics, we
must assume that the things people most often want to read about are the
things that people most often use. (It could also be said that the
things people read about most are the things that work least well in
the present code, but we shall ignore that for now.)

Here are the priorities based on Andreas' work:

- How to get an Alphabet
- How to make a Sequence object from a String, or turn a Sequence
  object back into a String
- How to parse BLAST output
- How to read sequences from a FASTA file
- How to read a GenBank, SwissProt or EMBL file
- How to generate a global or local alignment with the
  Needleman-Wunsch or Smith-Waterman algorithm
- How to read a protein structure from a PDB file
- How to export a sequence to FASTA
- How to view a sequence in a GUI
- How to parse a FASTA database search output file

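To make two of the priorities above concrete (reading sequences from a FASTA file and exporting a sequence to FASTA), here is a rough sketch in plain Java. `FastaSketch` and its methods are hypothetical names invented for this example. It deliberately uses no BioJava classes and says nothing about how the BJ3 API will eventually look; it only handles the simplest form of the format.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper written in plain Java; it is not BioJava API.
class FastaSketch {

    // Reads FASTA records into an ordered map of header -> residues.
    // In this sketch a repeated header overwrites the earlier record.
    static Map<String, String> read(Reader in) throws IOException {
        Map<String, String> records = new LinkedHashMap<String, String>();
        BufferedReader reader = new BufferedReader(in);
        String header = null;
        StringBuilder residues = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            if (line.startsWith(">")) {
                // A new header closes the previous record, if any.
                if (header != null) {
                    records.put(header, residues.toString());
                }
                header = line.substring(1).trim();
                residues.setLength(0);
            } else if (header != null) {
                residues.append(line);
            }
        }
        if (header != null) {
            records.put(header, residues.toString());
        }
        return records;
    }

    // Convenience wrapper for in-memory text.
    static Map<String, String> parse(String text) {
        try {
            return read(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException("Unexpected I/O error on a String", e);
        }
    }

    // Writes one record, wrapping the residues at the given line width.
    static String write(String header, String residues, int width) {
        StringBuilder out = new StringBuilder(">").append(header).append('\n');
        for (int i = 0; i < residues.length(); i += width) {
            int end = Math.min(i + width, residues.length());
            out.append(residues, i, end).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> records = parse(">seq1 demo\nACGT\nTTGA\n>seq2\nMKV\n");
        for (Map.Entry<String, String> entry : records.entrySet()) {
            System.out.print(write(entry.getKey(), entry.getValue(), 60));
        }
    }
}
```

Keeping the reader and writer independent of any sequence class mirrors the module split between plain sequences and plain sequence IO.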
These can be broken down into the following modules:

- Plain sequence \<-\> Enriched sequence
- Sequence similarity -\> Sequence similarity IO (BLAST, FASTA, etc.)
- Plain sequence -\> Plain sequence IO (GenBank, FASTA, etc.)
- Enriched sequence -\> Sequence alignments
- Enriched sequence -\> Protein structures

_wikis/BioJava3_Design.mediawiki

Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
==References==
This document was based on comments made on the following pages:
* http://biojava.org/wiki/BioJava3_Proposal
* http://biojava.org/wiki/Talk:BioJava3_Proposal
* http://biojava.org/wiki/UsageAnalysis
* http://www.derkholm.net/svn/repos/bjv2/website/docs/index.html

==Basic principles==
* BioJava3 (BJ3) will freely incorporate features from Java 6.
* Maven will be used to build the project.
* Full unit testing for every aspect from the ground up using JUnit.
* Modular design without any cyclic dependencies, with separate JARs for key components (IO, databases, genetic algorithms, sequence manipulation, etc.).
* Separation of APIs from implementation code by means of packages.
* Base package name: org.biojava3 (to prevent clashes with org.biojava and org.biojavax, both of which will have backwards-compatibility extensions to BJ3 in order to make old code reusable).
* Use of JavaBeans concepts wherever possible, e.g. getters/setters. This would enhance Java EE compliance and improve integration into larger systems.
* Fully commented code in LOTS of detail INCLUDING package-level docs AND wiki-docs such as the cookbook.
* Use of annotations for things like database mappings.
* A consistent coding style to be developed and applied.
* No Swing code to be included, but graphics code is OK for obviously useful things such as protein structures or sequence traces. Swing code is impossible to write in a way that will integrate fully with each different individual's own program requirements.

==Compromises and Unfinished bits==
* TestNG was suggested instead of JUnit, but it is less widely known, which may affect the quality of testing.
* A tool for analysing comment coverage and coding style was suggested, but none has been identified yet. Please amend this document with the names of any good ones you know.

==Priorities==
Andreas' very useful Usage Analysis page shows the most frequently requested documentation. In the absence of any real usage statistics, we must assume that the things people most often want to read about are the things that people most often use. (It could also be said that the things people read about most are the things that work least well in the present code, but we shall ignore that for now.)

Here are the priorities based on Andreas' work:

* How to get an Alphabet
* How to make a Sequence object from a String, or turn a Sequence object back into a String
* How to parse BLAST output
* How to read sequences from a FASTA file
* How to read a GenBank, SwissProt or EMBL file
* How to generate a global or local alignment with the Needleman-Wunsch or Smith-Waterman algorithm
* How to read a protein structure from a PDB file
* How to export a sequence to FASTA
* How to view a sequence in a GUI
* How to parse a FASTA database search output file

These can be broken down into the following modules:

* Plain sequence <-> Enriched sequence
* Sequence similarity -> Sequence similarity IO (BLAST, FASTA, etc.)
* Plain sequence -> Plain sequence IO (GenBank, FASTA, etc.)
* Enriched sequence -> Sequence alignments
* Enriched sequence -> Protein structures
