A Terminal User Interface (TUI) for browsing, visualizing, and analyzing bacteriophage genetic data. Features color-coded DNA/amino acid sequences, rotating 3D ASCII phage models, and instant navigation between genomes—no browser, no cloud, just a fast local binary.
One-liner install:
curl -fsSL "https://raw.githubusercontent.com/Dicklesworthstone/phage_explorer/main/install.sh?$(date +%s)" | bash
Main dashboard — Phage selector with 24 genomes, Lambda phage illustration, color-coded sequence grid, 3D protein structure viewer, and Analysis Tools panel
Sequence + Structure view — Color-coded amino acid grid with gene map, alongside WebGL 3D protein structure (8ZVI, 28,618 atoms) fetched from RCSB PDB
Virtual Gel Electrophoresis [Alt+G] — Simulate restriction enzyme digestion with EcoRI, HindIII, BamHI and more; compare predicted band patterns against experimental gels
Simulation Hub [Shift+S] — Interactive phage biology simulations: Lysogeny Decision Circuit, Plaque Growth Automata, Ribosome Traffic, Burst Kinetics, and more
What phages are (in engineering terms): self-assembling nanosyringes that package code (DNA/RNA) in a protein shell, land on bacteria, and inject that code to hijack the host. They’re the most abundant biological “edge devices” on Earth (≈10³¹ particles), continuously stress‑testing bacterial firmware.
- Medicine: design phage cocktails to beat antibiotic‑resistant infections.
- Biotech: workhorse enzymes (T7 RNA polymerase, Phi29 DNA pol) are phage gifts used in PCR, in‑vitro transcription, and genome amplification.
- Evolution & ecology: phages rewrite bacterial genomes, drive the ocean carbon cycle, and shape your gut microbiome.
- Engineering playground: modular genomes swap parts (tail fibers, lysis cassettes) like microservice plugins.
- Hershey–Chase blender experiment (1952) used phage T2 to prove DNA is the hereditary material.
- Crick–Brenner frameshift experiment (1961) in phage T4 rII genes showed the code is read in triplets—delete/insert three bases and the protein still works.
- Early codon assignments relied on phage RNA/DNA templates (e.g., poly‑U/UGG) to map base triplets to amino acids, paving the way for the 64‑codon table we use today.
- What they are: Bacteriophages (“phages”) are bacteria‑infecting viruses. They self‑assemble into nanomachines with a head (capsid) that stores DNA/RNA and a tail that injects it like a syringe.
- Why they matter today
- Medicine: design phage cocktails to beat antibiotic‑resistant infections.
- Biotech: workhorse enzymes (T7 RNA polymerase, Phi29 DNA pol) are phage gifts used in PCR, in‑vitro transcription, and genome amplification.
- Evolution & ecology: phages are the most abundant biological entities on Earth. They rewrite bacterial genomes, drive the ocean carbon cycle, and shape your gut microbiome.
- Modular by design: Phage genomes behave like Lego kits—swapping tail fibers, lysis cassettes, and regulatory modules across lineages. That modularity is exactly what Phage Explorer visualizes.
- Historical impact: Foundational experiments (Hershey–Chase blender test for DNA as genetic material; Luria–Delbrück fluctuation test revealing random mutation) used phages. They were also the playground for cracking the genetic code.
- Genome = source code (A/C/G/T chars) with strongly typed 3-char opcodes (codons).
- Promoter/RBS = function entry points that recruit ribosomes (the runtime) to start decoding.
- Reading frame = instruction pointer alignment; shift by one base and every downstream opcode changes.
- Stop codons = return; terminate translation and hand control back to the host.
- Phage lifecycle = deployment strategy: lytic =
rm -rf host; lysogenic =git cloneinto host genome and wait.
- DNA alphabet: A, C, G, T (RNA swaps T→U). Proteins are chains of 20 amino acids.
- Translation reads DNA in codons—non‑overlapping triplets of bases (64 possible combos). Every three letters → one amino acid or a stop signal.
- The mapping is fixed and mostly redundant (“degenerate”): several codons can encode the same amino acid. Three codons encode stop, and ATG usually marks start.
- Reading frames matter: shift by one base and every downstream codon changes. Phage Explorer lets you flip frames live to see amino acid strings reflow.
- Triplets are proven by phage genetics: three-base indels restore the reading frame; doublet/singlet changes scramble everything.
- Quick lookup:
| Codon | Amino acid | Notes |
|---|---|---|
| ATG | M (Methionine) | Start codon |
| TTT / TTC | F (Phenylalanine) | Aromatic |
| GAA / GAG | E (Glutamic acid) | Acidic |
| TGG | W (Tryptophan) | Unique single-codon amino acid |
| TAA / TAG / TGA | * (Stop) | Terminates translation |
- Chemistry buckets: Amino acids group into hydrophobic, polar, acidic, basic, and “special” (glycine, proline, cysteine, histidine). Themes color by these groups so chemical patterns pop while you scroll.
- Full-Screen HUD Interface — Navigate between phages instantly with arrow keys
- Color-Coded Sequences — DNA (ACTG) and amino acid views with distinct, beautiful colors
- 5 Color Themes — Classic, Ocean, Matrix, Sunset, Forest (cycle with
T) - 3D Structure Viewer — Real PDB structures from RCSB with cartoon/ball-and-stick/surface modes (web), ASCII wireframe (TUI)
- Gene Map Navigation — Visual gene bar with position tracking and snap-to-gene
- Layer-1 Quick Overlays —
GGC skew,Xcomplexity,Bbendability,Ppromoter/RBS motifs,Rrepeats/palindromes - Diff Mode — Compare sequences between phages visually
- Search — Fuzzy search by name, host, family, or accession
- 24 Real Phages — Lambda, T4, T7, PhiX174, MS2, M13, P22, Phi29, Mu, Phi6, SPbeta, T5, P1, P2, N4, Felix O1, D29, L5, PhiC31, PhiKZ, PRD1, PM2, Qβ, T1
- WASM-Accelerated — Rust-compiled spatial algorithms for instant analysis of large structures
- Zero Dependencies at Runtime — Single binary, no Bun/Node required
# Install latest release (includes database)
curl -fsSL "https://raw.githubusercontent.com/Dicklesworthstone/phage_explorer/main/install.sh?$(date +%s)" | bash -s -- --with-database
# Run
phage-explorer# Clone and install
git clone https://github.com/Dicklesworthstone/phage_explorer.git
cd phage_explorer
bun install
# Build the phage database (fetches from NCBI, ~1 minute)
bun run build:db
# Run the TUI
bun run devThe full web experience is live at https://phage-explorer.org. It includes:
- 3D Structure Viewer — Real PDB structures from RCSB with cartoon, ball-and-stick, and surface rendering modes
- Interactive Sequence Grid — Color-coded DNA/amino acid display with smooth scrolling
- 30+ Analysis Overlays — GC skew, dot plots, Hilbert curves, HGT detection, synteny, and more
- WASM-Accelerated — Rust-compiled spatial algorithms for instant loading of large structures (50K+ atoms)
- Mobile-Friendly — Touch gestures, bottom sheets, haptic feedback
Deployment details:
- Hosting: Vercel (prod alias:
phage-explorer.org) - Build command:
bun run build:web - Output:
packages/web/dist - SQLite database bundled as static asset, queried via sql.js in-browser
- Zero telemetry, works offline after initial load
The TUI remains available for terminal enthusiasts:
- Local SQLite DB with instant queries
- ASCII 3D wireframe models
- Real-time keyboard controls, diff mode, command palette
- Build locally:
bun install && bun run build:db && bun run dev(or use the prebuilt binary)
| Key | Action |
|---|---|
↑ / ↓ |
Navigate between phages |
← / → |
Scroll sequence left/right |
PgUp / PgDn |
Scroll by one page |
Home / End |
Jump to start/end of genome |
N / C |
Toggle DNA / Amino Acid (codon) view |
F |
Cycle reading frame (1, 2, 3) |
T |
Cycle color theme |
D |
Toggle diff mode vs reference phage |
M |
Toggle 3D model display |
K |
Toggle amino acid key legend |
S / / |
Search phages |
[ / ] |
Jump to previous/next gene |
? |
Show help overlay |
Q |
Quit |
| Phage | Genome | Type | Host | Notes |
|---|---|---|---|---|
| Lambda (λ) | 48,502 bp | dsDNA | E. coli K-12 | Classic temperate phage, lysogenic |
| T4 | 168,903 bp | dsDNA | E. coli B | Large lytic phage, contractile tail |
| T7 | 39,937 bp | dsDNA | E. coli | Famous for its RNA polymerase |
| PhiX174 | 5,386 bp | ssDNA | E. coli C | First DNA genome sequenced (1977) |
| MS2 | 3,569 bp | ssRNA | E. coli | First genome ever sequenced (1976) |
| M13 | 6,407 bp | ssDNA | E. coli | Filamentous, used in phage display |
| P22 | 41,724 bp | dsDNA | Salmonella | Transducing phage |
| Phi29 | 19,282 bp | dsDNA | B. subtilis | DNA polymerase for amplification |
| Mu | 36,717 bp | dsDNA | E. coli | Transposable phage |
| Phi6 | 2,948 bp | dsRNA | P. syringae | Rare dsRNA phage with envelope |
| SPbeta | 134,416 bp | dsDNA | B. subtilis | Large temperate phage |
| T5 | 121,750 bp | dsDNA | E. coli | Two-step DNA injection |
| P1 | 94,800 bp | dsDNA | E. coli | Plasmid prophage, Cre-lox origin |
| P2 | 33,593 bp | dsDNA | E. coli | Founding Peduovirus |
| N4 | 70,153 bp | dsDNA | E. coli | Injects its own RNA polymerase |
| Felix O1 | 86,155 bp | dsDNA | Salmonella | Classic Salmonella-typing phage |
| D29 | 49,136 bp | dsDNA | Mycobacterium | SEA-PHAGES workhorse, TB research |
| L5 | 52,297 bp | dsDNA | Mycobacterium | First sequenced mycobacteriophage |
| PhiC31 | 41,491 bp | dsDNA | Streptomyces | Serine integrase for gene therapy |
| PhiKZ | 280,334 bp | dsDNA | Pseudomonas | Jumbo phage, forms nucleus-like shell |
| PRD1 | 14,927 bp | dsDNA | E. coli | Tailless, internal lipid membrane |
| PM2 | 10,079 bp | dsDNA | Pseudoalteromonas | Marine, first lipid-containing phage |
| Qβ | 4,217 bp | ssRNA | E. coli | RNA replicase, isothermal amplification |
| T1 | 48,836 bp | dsDNA | E. coli | Notorious lab contaminant |
Cycle through themes with T:
- Classic — Traditional bioinformatics colors (green A, blue C, amber G, red T)
- Ocean — Cool blue/teal palette
- Matrix — Green terminal aesthetic
- Sunset — Warm orange/coral tones
- Forest — Natural earth greens and browns
Each theme provides distinct colors for:
- 4 nucleotides (A, C, G, T) + N
- 20 amino acids grouped by property (hydrophobic, polar, acidic, basic, special) + stop codon
phage-explorer/
├── packages/
│ ├── core/ # Domain logic: codons, theming, grid virtualization
│ ├── db-schema/ # Drizzle ORM schema (phages, sequences, genes, etc.)
│ ├── db-runtime/ # Repository implementations over SQLite
│ ├── state/ # Zustand store for UI state management
│ ├── renderer-3d/ # ASCII 3D rendering engine with Z-buffering
│ ├── wasm-compute/ # Rust/WASM for performance-critical algorithms
│ ├── data-pipeline/ # NCBI fetcher and database builder
│ ├── web/ # React web app with WebGL 3D viewer
│ └── tui/ # Ink/React TUI components
├── phage.db # SQLite database (generated)
└── install.sh # One-liner installer script
To keep 31+ planned features coherent, Phage Explorer follows a lightweight design system tailored for text UIs:
-
Color semantics
- Green
#22c55e: good / conserved / similar - Yellow
#eab308: caution / neutral - Red
#ef4444: warning / divergent - Blue
#3b82f6: informational metrics - Purple
#a855f7: notable / special - Gray
#6b7280: background / reference / inactive
- Green
-
Graph & sparkline standards
- Height: 3–5 text rows; Width: terminal width where possible
- Auto-normalize values; show min/max in labels
- Prefer Braille dots for dense sparklines; fall back to ASCII if needed
-
Panels, modals, overlays
- Single-line box borders, 1-char padding
- Headers bold, centered, include close hint (e.g., “Esc to close”)
- Max width 80 chars or
(terminal width - 4), whichever is smaller
-
Interaction “depth layers” (progressive disclosure)
- Layer 0: Core always-on controls (navigation, view toggles, help, search)
- Layer 1: Quick overlays (single-key toggles; limit 3 active)
- Layer 2: Analysis menu (
A) with numbered shortcuts + fuzzy search - Layer 3: Simulation hub (
Shift+S) for interactive models - Layer 4: Command palette (
:orCtrl+P) that fuzzy-searches all actions
Use these conventions for any new component: pick colors from the palette above, respect padding/borders, and place advanced features in deeper layers so newcomers see a stable core UI.
Phage Explorer serves both newcomers and power users by stacking features in “depth layers”:
-
Layer 0 – Sacred Surface (always available, never changed)
Navigation:↑↓←→PgUpPgDnHomeEnd
View:N/C/Space(DNA/AA),F(frame),T(theme),D(diff),M(3D model)
Meta:?(help),Sor/(search),Q(quit)
These keys are stable forever; everything else builds on top. -
Layer 1 – Quick Overlays (single-key toggles; max 3 active)
Reserved keys:G(GC skew),X(complexity),B(bendability),P(promoter/RBS),R(repeats).
Data is precomputed on phage selection so toggles feel instant. (Feature hooks coming next.) -
Layer 2 – Analysis Menu (
A)
Categorized, numbered actions (e.g., GC skew analysis, dot plots, codon dashboards). Supports fuzzy search. -
Layer 3 – Simulation Hub (
Shift+S)
Full-screen interactive simulations (lysogeny circuits, ribosome traffic, plaque automata, etc.). -
Layer 4 – Command Palette (
:orCtrl+P)
Fuzzy-search every action; shows the shortcut path (e.g., “GC skew [G] Overlay” or “Dot plot [A→4] Analysis”).
Keystroke budget rule: Layer 0 keys are immutable; new features must live in higher layers. This keeps discoverability for beginners and speed for experts.
# Install to custom directory
curl -fsSL .../install.sh | bash -s -- --dest ~/bin
# Install system-wide
curl -fsSL .../install.sh | bash -s -- --system
# Install specific version
curl -fsSL .../install.sh | bash -s -- --version v1.0.0
# Build from source instead of downloading binary
curl -fsSL .../install.sh | bash -s -- --from-source
# Include phage database
curl -fsSL .../install.sh | bash -s -- --with-database
# Auto-add to PATH
curl -fsSL .../install.sh | bash -s -- --easy-mode| Variable | Default | Description |
|---|---|---|
VERSION |
latest | Pin specific release tag |
DEST |
~/.local/bin |
Install directory |
OWNER |
Dicklesworthstone |
GitHub owner |
REPO |
phage_explorer |
GitHub repository |
| Script | Description |
|---|---|
bun run dev |
Run TUI in development mode |
bun run build:db |
Build phage database from NCBI |
bun run build |
Compile to single binary |
bun run build:all |
Build for all platforms |
bun run lint |
ESLint (zero warnings) |
bun run typecheck |
TypeScript checks |
bun run check |
Lint + typecheck |
bun run test |
Run unit tests |
- Sequence Storage: Chunked in 10kb segments for efficient virtualized rendering
- Virtualized Rendering: Only visible sequence portion rendered, smooth 60fps scrolling
- 3D Engine: Custom ASCII renderer (TUI) and WebGL/Three.js viewer (web) with real PDB structures
- WASM Acceleration: Rust-compiled WebAssembly for compute-intensive operations (spatial-hash bond detection, dot plots, k-mer analysis) with automatic JS fallback
- 3D Structure Loading: Fetches PDB/mmCIF from RCSB, parses in Web Worker, O(N) spatial-hash bond detection for structures with 50,000+ atoms
- State Management: Zustand for reactive UI updates
- Database: SQLite with Drizzle ORM, ~6MB for 24 phages with PDB structure references
| Issue | Solution |
|---|---|
| Database missing | Run bun run build:db or use --with-database flag |
| NCBI fetch errors | Retry build:db; NCBI has rate limits |
| Binary not in PATH | Use --easy-mode or manually add install directory |
| Terminal too small | Resize to at least 80x24 for best experience |
| Colors not showing | Ensure terminal supports 256 colors / truecolor |
- Network calls: Only to NCBI during database build
- No telemetry: Zero analytics or tracking
- Local data: All data stored in local SQLite database
- Open source: Full source code available for audit
- Single-binary distribution, no runtime dependencies
- SQLite database for instant local queries
- Virtualized sequence rendering for genomes up to 500kb+
- Chunked fetching avoids loading entire genomes into memory
- 3D model rendering at ~20fps with minimal CPU usage
- WASM spatial-hash: O(N) bond detection vs O(N²) naive—1000x faster for large structures (50K atoms: <1s vs 60s+)
- Web Worker isolation keeps UI responsive during heavy computation
- Automatic fallback to pure JS when WASM unavailable
- Lint + Typecheck: Every push and PR
- Cross-platform builds: macOS (arm64, x64), Linux (x64, arm64), Windows (x64)
- Automated releases: Tagged versions trigger binary builds and GitHub release
- Database artifacts: Pre-built
phage.dbincluded in releases
bun installto set up dependenciesbun run build:dbto create the databasebun run devto run the TUI- Run
bun run checkbefore committing changes
Phage Explorer implements a comprehensive suite of genomic analysis algorithms, from classic bioinformatics to cutting-edge machine learning approaches.
| Algorithm | Description | Complexity |
|---|---|---|
| Dot Plot | Downsampled self-homology comparison with direct & inverted (reverse-complement) scoring | O(bins² × window) |
| Chaos Game Representation | Fractal IFS-based genomic fingerprinting; maps DNA to 2D coordinates using corner mappings (A→0,0 T→1,0 C→0,1 G→1,1) | O(N) |
| Recombination Radar | k-mer Jaccard similarity against reference panel; detects mosaic segments & HGT breakpoints | O(N × refs) |
| Hilbert Curve Atlas | Maps 1D sequence to 2D via space-filling curve; preserves locality for visual pattern recognition | O(N) |
| Algorithm | Description |
|---|---|
| Codon Bias (RSCU) | Relative Synonymous Codon Usage; identifies preferred/rare codons per amino acid family |
| Ribosome Traffic | Particle-based translation simulation with collision detection; bottleneck identification based on codon rarity |
| Periodicity Analysis | Spectral analysis of GC% and purine runs; identifies repeating motifs (tandem repeats, period-3 codon patterns) |
| K-mer Frequency | 256-element tetranucleotide vectors for phylogenetic profiling; base-4 encoding for O(1) lookup |
| Structure | Algorithm |
|---|---|
| G-quadruplex | G4Hunter scoring: +1 to +4 per G-run length, -1 to -4 per C-run; identifies potential G4-forming regions |
| Z-DNA | Dinucleotide propensity scoring; CG/GC repeats favor left-handed helix formation |
| Cruciform | Inverted repeat detection with stem-loop energy estimation |
| Feature | Method |
|---|---|
| CRISPR Pressure | Spacer matching against mock host CRISPR arrays; rates evolutionary pressure 0-10 scale |
| Anti-CRISPR (ACR) | Heuristic scoring for defense-suppressing genes; flags known families (AcrIIA4, Ocr, etc.) |
| Tail Fiber Tropism | ESM2 protein embeddings + HDBSCAN clustering; confidence scores for receptor binding predictions |
| Prophage Excision | Detects integrase genes; searches for attL/attR direct repeats; models excision products |
| Feature | Description |
|---|---|
| Protein Domains | Pfam/InterPro/SMART domain annotations with E-values and boundaries |
| Fold Embeddings | ESM2-derived vectors stored as Float32 blobs; enables novelty/neighbor queries via cosine distance |
| AMG Detection | Auxiliary Metabolic Genes linked to KEGG pathways (photosynthesis, carbon, nucleotide metabolism) |
The Simulation Hub (Shift+S) provides interactive models of phage biology with adjustable parameters and real-time visualization.
Model: ODE system of Lambda lysis-lysogeny bistable switch State variables: CI and Cro repressor concentrations Parameters:
- Hill cooperativity (n=1-4) — controls switch steepness
- MOI (multiplicity of infection) — tips balance toward lysogeny
- UV damage — triggers lytic induction
Output: Phase classification (lysogenic/lytic/undecided) with time-series history
Model: Particle-based translation with stochastic elongation Mechanics:
- 9-codon ribosome footprint
- Elongation rates derived from phage codon usage bias
- Rare codons create traffic jams → bottlenecks
Output: Protein production rate, stall events, density heatmap
Model: Cellular automaton with reaction-diffusion dynamics Cell states: Empty → Bacteria → Infected → Lysed → Phage → Lysogen Parameters: Burst size, latent period, diffusion probability, adsorption rate, lysogeny probability Grid: Toroidal wrap-around, ~100 steps/frame
Model: SIR-like ODE system (dB/dt, dI/dt, dP/dt) Features:
- Adsorption kinetics: k [mL/(phage·min)]
- Exponential latent period decay
- Logistic bacterial growth with carrying capacity
Model: Gillespie stochastic simulation with tau-leaping States: Sensitive → Partially resistant (per phage) → Fully resistant Key insight: Mono-phage resistance emerges faster than cocktail resistance (synergy effect) Tracks: Sensitive/resistant populations, emergence time, individual phage counts
Model: Physics-informed DNA packaging Factors:
- Intrinsic pressure from DNA bending: L/(R²) energy model
- Ionic strength effects (Debye screening ~0.304/√I nm)
- Morphology-specific baselines (headful/cos/phi29)
Output: Fill fraction, internal pressure (atm), motor force (pN)
Model: Molecular clock with mutations and genetic drift Mechanics:
- Poisson-distributed mutations per generation
- Selection coefficients from normal distribution
- Fitness clamped 0.6-1.5, effective population (Ne) tracks drift
Performance-critical algorithms are implemented in Rust and compiled to WebAssembly for near-native speed in the browser.
| Algorithm | Threshold | Speedup | Fallback |
|---|---|---|---|
| PCA (Power Iteration) | 20K+ elements | ~10-50x | Pure JS eigendecomposition |
| Spatial Hash Bond Detection | Always | 1000x vs O(N²) | — |
| K-mer Counting | Large genomes | ~5x | TypedArray JS |
Traditional bond detection is O(N²)—checking every atom pair. For a 50,000-atom structure, that's 2.5 billion comparisons.
The WASM spatial hash:
- Divides 3D space into grid cells (cell size = max bond length)
- Each atom only checks neighbors in adjacent cells
- Reduces to O(N) on average
Result: 50K atoms in <1s vs 60s+ naive. This enables real-time structure exploration.
When WASM isn't available (older browsers, restricted environments), Phage Explorer gracefully falls back to pure JavaScript implementations. Users get the same features, just slower on large datasets.
The SQLite database includes extensive precomputed annotations, enabling instant analysis with zero latency.
| Table | Contents |
|---|---|
phages |
24 reference phages with metadata (accession, family, genus, host, morphology, lifecycle, genome stats) |
sequences |
Chunked 10kb segments for virtualized streaming |
genes |
Full annotations (position, strand, type, qualifiers as JSON) |
codonUsage |
Per-phage amino acid and codon frequency counts |
proteinDomains |
Pfam/SMART/CDD domains with E-values and boundaries |
foldEmbeddings |
ESM2 protein embeddings as Float32 blobs |
amgAnnotations |
Auxiliary metabolic genes linked to KEGG pathways |
defenseSystems |
Anti-CRISPR/anti-RM/anti-Abi predictions with confidence |
tropismPredictions |
Tail fiber receptor binding (ML-derived) |
codonAdaptation |
CAI, tAI, CPB, Nc' per phage-host pair |
Annotations are precomputed via GitHub Actions:
- NCBI fetch → Raw sequences and gene annotations
- Pfam/InterPro scan → Protein domain identification
- KEGG mapping → Metabolic pathway linkage
- ESM2 inference → Protein fold embeddings
- SQLite build → Indexed, compressed database (~6MB)
Users get instant results because all heavy computation happens at build time.
Traditional sequence comparison requires alignment (BLAST, etc.), which is slow and fails for distant homologs. Phage Explorer includes alignment-free methods that work on raw k-mer statistics.
Maps a genome to a 2D fractal image:
- Start at center (0.5, 0.5)
- For each nucleotide, move halfway toward its corner
- Plot the point
Result: A unique "fingerprint" where similar genomes produce similar fractals. Visually identifies sequence biases, repeat structures, and phylogenetic relationships without alignment.
- Compute tetranucleotide (4-mer) frequency vectors (256 dimensions)
- Run PCA to reduce to 2-3D
- Cluster phages by genomic signature
Why it works: Oligonucleotide frequencies are remarkably stable within a genome but differ between species. This captures "genomic dialect" independent of gene order or alignment.
Detects horizontal gene transfer and mosaic genomes:
- Slide a window across the query genome
- Compute k-mer Jaccard similarity to each reference
- Plot "donor" assignment along the genome
Output: Identifies breakpoints where the genome switches from one "parent" to another—direct evidence of recombination events.
Most phage genomics tools are either:
- Too specialized — Built for one analysis type, require pipeline assembly
- Too heavyweight — Need cloud infrastructure, accounts, or complex installation
- Too dated — Command-line tools from the 2000s with no visualization
Phage Explorer combines:
- Breadth — 25+ analyses in one interface
- Speed — Precomputed annotations + WASM = instant results
- Accessibility — Single binary or web app, zero configuration
- Education — Interactive simulations teach the biology, not just display data
It's designed for the researcher who wants to explore a phage genome, not run a production pipeline.
Phage Explorer handles genomes up to 500kb+ without loading the entire sequence into memory.
How it works:
- Sequences stored in 10kb chunks in SQLite
- Virtual window tracks visible region + 500bp overscan buffer
- Only visible chunks loaded and rendered
- Row tiles cached with content hashing—scroll reuses existing tiles
- Incremental blitting renders only newly visible rows
GlyphAtlas optimization:
- Pre-renders all nucleotides and amino acids to a texture atlas
- 10x faster than calling
fillText()per character - O(1) array-based glyph lookup (not
Map.get()) - Automatic device pixel ratio scaling (1x to 3x)
Zoom levels:
| Preset | Cell Size | Use Case |
|---|---|---|
| Genome | 1px | Full genome overview |
| Micro | 4px | Regional patterns |
| Region | 8px | Gene-level navigation |
| Codon | 12px | Reading codons |
| Base | 18px | Individual nucleotides |
ASCII Renderer (TUI):
- 70-character gradient for smooth shading:
.'-",:;Il!i><~+_-?][}{1)(|\/tfjrxnuvczXYUJCLQ0OZmwqpdbkhao*#MW&8%B@$ - Float32 Z-buffer for proper occlusion
- Cohen-Sutherland line clipping with z-interpolation
- Front-to-back edge sorting for optimal depth writes
- Brightness mapping: depth → 0.2-1.0 intensity range
WebGL Renderer (Web):
- Real PDB structures fetched from RCSB
- Three render modes: cartoon ribbons, ball-and-stick, surface mesh
- CRT post-processing: chromatic aberration, scanlines, bloom, vignette, film grain
- O(N) spatial-hash bond detection for 50K+ atom structures
8 scientifically-grounded themes, each with 40+ color tokens.
Colors map to biochemical properties so structural patterns are visible while scrolling:
| Property | Amino Acids | Color Family | Rationale |
|---|---|---|---|
| Hydrophobic | V, L, I, M, F, W | Blues/Purples | Nonpolar, buried in core |
| Polar | S, T, N, Q | Cyans/Teals | OH/NH₂ side chains |
| Positive | K, R, H | Blues | Basic, cationic |
| Negative | D, E | Reds | Acidic, anionic |
| Special | C, P, G, A | Yellows/Silvers | Unique structural roles |
| Stop | * | Bright Red | Termination signal |
- Holographic (default) — Glassmorphic cyan/magenta, electric accents
- Cyberpunk — Neon grid aesthetic (cyan/magenta/yellow)
- Classic — Traditional bioinformatics (green A, blue C, amber G, red T)
- Ocean — Cool water-inspired blues and teals
- Matrix — Green monochrome terminal aesthetic
- Sunset — Warm orange/red/gold gradient
- Forest — Natural greens and earth tones
- Monochrome — Accessibility-focused grayscale
Phage Explorer is fully touch-optimized with gesture navigation and haptic feedback.
| Gesture | Action |
|---|---|
| Swipe left/right | Navigate between phages (spring physics animation) |
| Pinch | Zoom sequence grid between presets |
| Long press | Context menu with progressive haptic buildup |
| Pull down | Refresh/reload data |
9 distinct vibration patterns for tactile feedback:
| Pattern | Duration | Use |
|---|---|---|
light |
10ms | Selections, toggles |
medium |
25ms | Button presses |
heavy |
50ms | Important actions |
success |
15-50-25ms | Task completion |
warning |
30-80-30ms | Destructive action alert |
error |
50-100-50-100-50ms | Triple buzz for errors |
tick |
3ms | Slider/picker scrubbing |
Haptics respect prefers-reduced-motion and gracefully degrade on unsupported devices.
Mobile overlays use iOS-style bottom sheets:
- Snap points: 25%, 50%, 90% of screen height
- Drag handle with visual feedback
- Swipe-to-dismiss with velocity detection
- Safe area insets for notched devices
30+ analysis overlays, architected for performance and maintainability.
Eager-loaded (instant): Search, Help, Settings, Command Palette Lazy-loaded (on-demand): All analysis overlays via React Suspense
This keeps initial bundle small (~200KB) while supporting 30+ feature-rich overlays.
| Category | Overlays |
|---|---|
| Sequence | GC Skew, Complexity, Bendability, Promoter, Repeats, K-mer Anomaly |
| Visualization | CGR Fractal, Hilbert Curve, Dot Plot, Phase Portrait, Logo Plot |
| Genomic | HGT Detection, CRISPR Systems, Non-B DNA, Prophage Excision |
| Protein | Codon Bias, Domains, Fold Viewer, RNA Structure |
| Ecology | Host Tropism, Defense Arms Race, AMG Pathways, Cocktail Compatibility |
| Simulation | All 7 interactive simulations |
Overlays share 12+ reusable components:
OverlaySection/OverlaySectionHeader— Consistent section stylingOverlayGrid/OverlayRow— Responsive layoutsOverlayStatCard/OverlayStatGrid— Metric displaysOverlayKeyValue/OverlayBadge— Data presentationOverlayLegend— Color key explanations
Zustand powers all UI state, shared between TUI and web platforms.
The PhageExplorerStore manages ~170 state properties:
| Domain | Properties |
|---|---|
| Navigation | Current phage, scroll position, zoom level |
| View | DNA/AA mode, reading frame, active theme |
| Overlays | Overlay stack (max 3), overlay-specific data |
| Simulations | Active simulation, parameters, history |
| User | Experience level, completed tours, preferences |
Overlay stack limit:
overlays: [...filtered, newOverlay].slice(-3) // Max 3 activeView mode scroll preservation:
- DNA→AA:
scrollPosition / 3(codons are 3bp) - AA→DNA:
scrollPosition * 3
LocalStorage persistence: Beginner mode, tour completion, theme preference
The data pipeline fetches sequences from NCBI's Entrez API:
- ESearch — Query nucleotide database with filters
- ESummary — Batch metadata retrieval (50 items/request)
- EFetch — Full GenBank records with gene annotations
Filters applied:
viruses[filter] OR phage[Title]- Sequence length constraints
- Collection date for temporal analysis
| Step | Input | Output |
|---|---|---|
| Normalization | Raw FASTA | Uppercase, N-padded |
| Chunking | Full sequence | 10kb segments |
| Translation | DNA chunks | 6-frame amino acids |
| Annotation | GenBank features | Structured gene records |
| Domain scan | Protein sequences | Pfam/InterPro hits |
| Embedding | Protein sequences | ESM2 fold vectors |
For temporal analysis, the pipeline:
- Parses collection dates (ISO, "May-2020", "2020", etc.)
- Bins sequences by month/year
- Extracts geographic metadata
- Computes temporal coverage histograms
Comparative genomics:
- Load two phages (e.g., Lambda and P22)
- Toggle Diff mode (
D) to highlight sequence differences - Use Dot Plot overlay to visualize synteny
- Identify conserved vs. variable regions
Gene function prediction:
- Navigate to gene of interest with
[/] - Open Protein Domains overlay
- Check fold embedding neighbors for structural homologs
- Cross-reference with AMG Pathway overlay
Phage therapy candidate screening:
- Use Host Tropism overlay to verify target range
- Check Defense Arms Race for anti-CRISPR genes
- Run Cocktail Compatibility for multi-phage therapy
- Simulate Resistance Evolution to predict durability
Teaching the genetic code:
- Toggle reading frames (
F) to show frameshift effects - Compare DNA vs. amino acid views (
N/C) - Run Ribosome Traffic simulation to visualize translation
Demonstrating phage biology:
- Lysogeny Decision Circuit shows Lambda switch mechanics
- Plaque Growth Automata visualizes infection spread
- Packaging Motor demonstrates DNA condensation physics
| Feature | Phage Explorer | NCBI Viewer | Geneious | SnapGene |
|---|---|---|---|---|
| Startup time | <100ms | 5-10s | 30s+ | 10s |
| 3D structures | Real PDB | None | Plugin | None |
| Analysis tools | 30+ built-in | 3-5 | 20+ (paid) | 10+ |
| Offline | Full | No | Yes | Yes |
| Mobile | Full touch UI | Limited | No | No |
| Simulations | 7 interactive | None | None | None |
| Price | Free | Free | $2,500/yr | $695 |
| Open source | Yes | No | No | No |
Best for: Rapid exploration, education, phage therapy research Not ideal for: High-throughput pipelines (use command-line tools), primer design (use SnapGene)
MIT License (with OpenAI/Anthropic Rider) — see LICENSE for details.
- Sequence data from NCBI GenBank/RefSeq
- TUI framework: Ink (React for CLIs)
- Database: Drizzle ORM + SQLite
- Runtime: Bun