Analyze protein structures, predict binding pockets with P2Rank, and visualize them in 3D.
- 🧬 Fetch PDB structures from RCSB PDB
- 🤖 Encode proteins with ESM2 (protein language model)
- 🔍 Predict binding pockets using P2Rank (AlphaFold model)
- 📊 Interactive 3D visualization with py3Dmol
- 🌐 Clean web interface
python --version # Should be 3.12 or higher
Download and install P2Rank:
# Create tools directory
mkdir -p ~/tools
cd ~/tools
# Download P2Rank 2.5.1
wget https://github.com/rdk/p2rank/releases/download/2.5.1/p2rank_2.5.1.tar.gz
tar -xzf p2rank_2.5.1.tar.gz
# Verify installation
~/tools/p2rank_2.5.1/prank -h
Add to your ~/.zshrc or ~/.bashrc:
export P2RANK_BIN="$HOME/tools/p2rank_2.5.1/prank"
Then reload:
source ~/.zshrc # or source ~/.bashrc
Create and activate a virtual environment:
cd <BASE>/protVizEnc
python -m venv .env
source .env/bin/activate
Install the Python dependencies:
pip install --upgrade pip
pip install fastapi uvicorn aiohttp biopython py3Dmol transformers torch
Start the server:
cd <BASE>/protVizEnc
source .env/bin/activate
uvicorn backend.api:app --reload
You should see:
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [XXXX] using WatchFiles
INFO: Application startup complete.
Navigate to:
http://127.0.0.1:8000/
- Enter a PDB ID (e.g., 1ubq)
- Click "Analyze"
- Wait for the analysis to finish (it may take 10-30 seconds the first time)
- View results:
  - Left panel: Predicted pockets with scores
  - Right panel: Interactive 3D visualization
Example PDB IDs:
- 1ubq - Ubiquitin (small, fast, ~4 pockets)
- 3aid - HIV-1 Protease
- 1fbl - Fibroblast collagenase (MMP-1)
- 2src - Tyrosine kinase
Request:
{
  "source": "1ubq"
}
Response:
{
  "html": "<div>...</div>",
  "pockets": [
    {
      "id": 1,
      "name": "pocket1",
      "score": 4.71,
      "probability": 0.207,
      "combined_score": 0.445,
      "rank": 1,
      "center": [34.08, 38.75, 19.20],
      "residues": [
        {"chain": "A", "resnum": 11},
        {"chain": "A", "resnum": 34}
      ]
    }
  ]
}
curl -X POST http://127.0.0.1:8000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{"source": "1ubq"}' | jq '.pockets | length'
protVizEnc/
├── backend/
│   ├── api.py           # FastAPI server + CORS
│   ├── structures.py    # PDB fetching and parsing
│   ├── encoders.py      # ESM2 protein encoding
│   ├── pockets.py       # P2Rank integration
│   ├── annotations.py   # Structure annotations
│   └── viz.py           # py3Dmol visualization
├── frontend/
│   └── index.html       # Web interface
├── .env/                # Virtual environment
└── README.md
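For scripted use, the request and response shown above can be handled from Python with only the standard library. This is a minimal sketch, not part of the project: the helper names `build_analyze_request`, `rank_pockets`, and `analyze` are hypothetical, and it assumes the server is running at http://127.0.0.1:8000 and returns the fields shown in the example response.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # assumed local development server


def build_analyze_request(source, base_url=BASE_URL):
    """Build the POST /api/analyze request for a PDB ID (hypothetical helper)."""
    body = json.dumps({"source": source}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def rank_pockets(response):
    """Return (name, combined_score) pairs, highest combined_score first."""
    pockets = response.get("pockets", [])
    ordered = sorted(pockets, key=lambda p: p["combined_score"], reverse=True)
    return [(p["name"], p["combined_score"]) for p in ordered]


def analyze(source):
    """Send the request and decode the JSON reply (requires a running server)."""
    with urllib.request.urlopen(build_analyze_request(source), timeout=120) as resp:
        return json.load(resp)


# Usage, with the server running:
#     for name, score in rank_pockets(analyze("1ubq")):
#         print(name, score)
```

The generous timeout is a guess that allows for the slower first analysis mentioned in the usage steps; adjust it to your setup.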
- Fetch Structure: Download PDB from RCSB or load local file
- Encode Protein: Run ESM2 transformer model for embeddings
- Predict Pockets: Run P2Rank with AlphaFold model
- Parse Results: Extract pocket scores, probabilities, residues
- Generate Viz: Create py3Dmol HTML with annotations
- Return JSON: Send HTML + pocket data to frontend
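The fetch and predict steps above can be sketched as follows. This illustrates the idea and is not the code in backend/: the helper names are hypothetical, the RCSB download URL pattern and the `prank predict -f ... -c alphafold -o ...` invocation are assumptions based on the public RCSB file service and the P2Rank command line, and the name of the predictions file should be checked against your P2Rank version.

```python
import os
from pathlib import Path


def rcsb_pdb_url(pdb_id):
    """URL of the PDB-format file for an entry (assumed RCSB download pattern)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def p2rank_command(pdb_path, out_dir):
    """Argument list for a P2Rank prediction using the alphafold config."""
    # Falls back to a bare "prank" (resolved via PATH) if P2RANK_BIN is unset.
    prank = os.environ.get("P2RANK_BIN", "prank")
    return [prank, "predict", "-f", str(pdb_path), "-c", "alphafold", "-o", str(out_dir)]


def predictions_csv(pdb_path, out_dir):
    """Expected predictions file, assuming P2Rank names it after the input file."""
    return Path(out_dir) / f"{Path(pdb_path).name}_predictions.csv"
```

The command list can be passed to `subprocess.run(cmd, check=True)`; building it as a list rather than a shell string avoids quoting problems with paths that contain spaces.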
- Model: alphafold (works without conservation data)
- CSV Parsing: Strips whitespace from column names
- Residue Format: Space-separated, e.g. A_11 A_34 A_36
- Output: Predictions CSV with scores, probabilities, centers
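The parsing rules above can be sketched with the standard `csv` module. This is an illustration under assumptions, not the code in backend/pockets.py: the function name `parse_predictions` is hypothetical, and the column names (`name`, `rank`, `score`, `probability`, `center_x`, `center_y`, `center_z`, `residue_ids`) are assumed to match the P2Rank predictions file. The sample row is made up from the values in the example response.

```python
import csv
import io


def parse_predictions(csv_text):
    """Parse a P2Rank-style predictions CSV into a list of pocket dictionaries."""
    reader = csv.reader(io.StringIO(csv_text))
    # Column names are padded with spaces, so strip them before use.
    header = [col.strip() for col in next(reader)]
    pockets = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        rec = dict(zip(header, (cell.strip() for cell in row)))
        residues = []
        for token in rec["residue_ids"].split():  # e.g. "A_11 A_34"
            chain, resnum = token.split("_", 1)
            residues.append({"chain": chain, "resnum": int(resnum)})
        pockets.append({
            "name": rec["name"],
            "rank": int(rec["rank"]),
            "score": float(rec["score"]),
            "probability": float(rec["probability"]),
            "center": [float(rec[f"center_{axis}"]) for axis in "xyz"],
            "residues": residues,
        })
    return pockets


# Made-up sample in the assumed layout (padded header and cells).
SAMPLE = (
    "name     ,  rank,   score, probability,   center_x,   center_y,   center_z, residue_ids\n"
    "pocket1  ,     1,    4.71,       0.207,      34.08,      38.75,      19.20, A_11 A_34\n"
)
pockets = parse_predictions(SAMPLE)
print(pockets[0]["residues"])
```

Note that `int(resnum)` would fail on residue numbers with insertion codes (for example `52A`); a production parser would need to keep those as strings.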
# Find and kill process on port 8000
lsof -ti:8000 | xargs kill -9
# Check environment variable
echo $P2RANK_BIN
# Should output the absolute path to prank, e.g. /Users/<username>/tools/p2rank_2.5.1/prank
# If not, add to ~/.zshrc and reload
If no pockets are found:
- P2Rank may not find pockets for all proteins
- Try a different PDB ID
- Check server logs for errors
If the frontend cannot reach the API:
- Ensure server is running on http://127.0.0.1:8000
- CORS is enabled for all origins in development
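The P2RANK_BIN check from earlier in this section can also be run from Python, for example before starting the server. This is a minimal sketch; `check_p2rank_bin` is a hypothetical helper and is not part of the backend.

```python
import os
from pathlib import Path


def check_p2rank_bin(env=None):
    """Return 'ok' or a short description of what is wrong with P2RANK_BIN."""
    env = os.environ if env is None else env
    value = env.get("P2RANK_BIN")
    if not value:
        return "P2RANK_BIN is not set"
    path = Path(value).expanduser()
    if not path.is_file():
        return f"{path} is not a file"
    if not os.access(path, os.X_OK):
        return f"{path} is not executable"
    return "ok"


print(check_p2rank_bin())
```

Passing a plain dictionary as `env` makes the check easy to exercise without touching the real environment.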
# Test pocket detection
python -c "
import asyncio
from backend.pockets import run_fpocket_and_collect
from backend.structures import load_structure_and_pdb_text

async def test():
    _, pdb = await load_structure_and_pdb_text('1ubq')
    pockets = await run_fpocket_and_collect(pdb)
    print(f'Found {len(pockets)} pockets')

asyncio.run(test())
"
# Press Ctrl+C in terminal where uvicorn is running
# Or kill all uvicorn processes:
pkill -f uvicorn
- P2Rank: Krivak & Hoksza (2018) - https://github.com/rdk/p2rank
- ESM2: Meta AI - https://github.com/facebookresearch/esm
- py3Dmol: 3Dmol.js team - https://3dmol.csb.pitt.edu/
License: MIT