This repository implements the Perturbed Saddle-escape Descent (PSD) algorithm for escaping saddle points in non-convex optimisation, as described in Alpay and Alakkad (2025). It contains reference NumPy implementations, framework-specific optimisers for PyTorch and TensorFlow, and utilities for reproducing the synthetic experiments reported in the accompanying manuscript.
- Reference implementations of PSD, PSD-Probe and baseline gradient-descent variants in pure NumPy (a toy sketch of the perturbation idea follows this list).
- A suite of analytic test functions with gradients and Hessians.
- A synthetic data generator producing the tables and figures used in the paper (`experiments.py`).
- Framework-specific optimisers: `PSDTorch`, `PSDTensorFlow` and a `PSDOptimizer`/`PerturbedAdam` package for PyTorch.
- Example training scripts for MNIST and CIFAR-10.
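To convey the flavour of the method, here is a minimal NumPy sketch of the perturbed-descent idea; the constants, curvature calibration and probing logic of the actual `algorithms.py` are simplified away:

```python
import numpy as np

def psd_sketch(x, grad, lr=1e-2, g_tol=1e-3, radius=1e-2, steps=1000, seed=0):
    """Toy perturbed descent: ordinary gradient steps, plus a random
    perturbation whenever the gradient is small (a candidate saddle).

    Illustrative only; see psd.algorithms for the real PSD/PSD-Probe code.
    """
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        g = grad(x)
        if np.linalg.norm(g) < g_tol:
            # Small gradient: perturb inside a ball to escape a possible saddle.
            x = x + radius * rng.standard_normal(x.shape)
        else:
            x = x - lr * g
    return x
```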
The core project depends on the following libraries:
| Library | Purpose |
|---|---|
| `numpy` | Numerical routines for the reference implementations |
| `torch`, `torchvision` | Deep-learning framework and datasets |
| `optuna` | Hyper-parameter search utilities |
| `matplotlib` | Visualisation in notebooks |
Python 3.8 or later is required.
Install the published optimiser package:

```bash
pip install psd-optimizer
```

Or install the repository in editable mode for development:

```bash
git clone https://github.com/farukalpay/PSD.git
cd PSD
pip install -e ".[dev]"
```

A minimal quick-start:

```python
import numpy as np
from psd import algorithms, functions

x0 = np.array([1.0, -1.0])
x_star, _ = algorithms.gradient_descent(x0, functions.SEPARABLE_QUARTIC.grad)
```

Further examples are available in the `examples/` directory and the documentation.
The core PSD routines and test functions can be imported from the `psd` package, as in the quick-start example above. This structure allows you to experiment with the reference NumPy implementations directly in your own projects.
The PyTorch optimisers `PSDOptimizer` and `PerturbedAdam` are also available directly via `from psd import ...`.
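For instance, a one-line sketch, assuming the top-level re-exports described above:

```python
# Import the PyTorch optimisers from the top-level package rather than a submodule.
from psd import PSDOptimizer, PerturbedAdam
```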
For rapid experimentation without navigating submodules, import the aggregated `psd.monster` module. It re-exports the core algorithms, analytic test functions and framework-specific optimisers in a single namespace:

```python
import numpy as np
from psd import monster

x0 = np.array([1.0, -1.0])
x_star, _ = monster.gradient_descent(x0, monster.SEPARABLE_QUARTIC.grad)
```

This unified view aims to be approachable for both humans and language models exploring the project.
To reproduce the synthetic experiments, run:

```bash
python experiments.py
```

The command writes CSV summaries to `results/` and training curves to `data/`.
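A hypothetical way to inspect one of the generated curves; the exact file names and formats under `results/` and `data/` depend on `experiments.py`:

```python
# Illustrative sketch: load and plot one training curve written by experiments.py.
# The file name below is an assumption, not a guaranteed output path.
import numpy as np
import matplotlib.pyplot as plt

curve = np.loadtxt("data/psd_training_curve.csv", delimiter=",")
plt.plot(curve)
plt.xlabel("Iteration")
plt.ylabel("Objective value")
plt.show()
```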
Profiling identified `rosenbrock_hess`, the routine that computes the Rosenbrock Hessian, as a hot path. Vectorising the computation removed explicit Python loops and yielded the following improvements (dimension 1000):
| Version | Mean time (ms) | Peak memory (MB) |
|---|---|---|
| Before | 3.52 | 8.00 |
| After | 1.01 | 8.04 |
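For reference, a sketch of one way to vectorise the tridiagonal Rosenbrock Hessian; the repository's `rosenbrock_hess` may differ in detail:

```python
import numpy as np

def rosenbrock_hess_vectorised(x):
    """Tridiagonal Hessian of the Rosenbrock function without Python loops.

    Illustrative sketch; the repository's rosenbrock_hess may differ.
    """
    n = x.size
    idx = np.arange(n)
    diag = np.zeros(n)
    diag[:-1] += 1200.0 * x[:-1] ** 2 - 400.0 * x[1:] + 2.0  # d2f/dx_i2 from term i
    diag[1:] += 200.0                                        # contribution of term i-1 to x_i
    off = -400.0 * x[:-1]                                    # mixed derivative in x_i and x_{i+1}
    H = np.zeros((n, n))
    H[idx, idx] = diag
    H[idx[:-1], idx[1:]] = off
    H[idx[1:], idx[:-1]] = off
    return H
```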
Benchmarking is automated via `pytest-benchmark` using a fixed NumPy seed.
Hard time and memory thresholds guard against major regressions.
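A minimal sketch of such a benchmark, assuming a `ROSENBROCK` entry in the function registry; the actual test names and thresholds live in the repository's suite:

```python
# Illustrative pytest-benchmark test; the ROSENBROCK registry entry and the
# test name are assumptions, not the repository's exact suite.
import numpy as np
from psd import functions

def test_rosenbrock_hess_benchmark(benchmark):
    rng = np.random.default_rng(0)            # fixed seed, per the benchmarking policy
    x = rng.standard_normal(1000)             # dimension 1000, matching the table above
    benchmark(functions.ROSENBROCK.hess, x)   # pytest-benchmark times repeated calls
```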
A typical training step supplies a closure to `opt.step`:

```python
from psd_optimizer import PSDOptimizer

model = ...      # any torch.nn.Module
criterion = ...  # e.g. torch.nn.CrossEntropyLoss()
opt = PSDOptimizer(model.parameters(), lr=1e-3)

def closure():
    opt.zero_grad()
    output = model(x)            # x, y: one batch of inputs and targets
    loss = criterion(output, y)
    loss.backward()
    return loss

opt.step(closure)
```

Example scripts using this API are available in the `notebooks/` directory.
An illustrative example of fine-tuning a compact transformer with `PSDOptimizer` is provided in `scripts/train_small_language_model.py`. The script downloads a tiny GPT-style model from the Hugging Face Hub and optimises it on a short dummy corpus.

Run the example with default settings:

```bash
python scripts/train_small_language_model.py
```

Specify a different pretrained model and number of epochs:

```bash
python scripts/train_small_language_model.py --model distilgpt2 --epochs 5
```

Full API documentation and guides are available in the `docs/` directory.
Additional materials include:

- `notebooks/10_minute_start.ipynb` – an interactive notebook showcasing the optimiser.
- `docs/section_1_5_extension.md` – theoretical notes on extending PSD to stochastic settings.
- `notebooks/navigation.ipynb` – links to all example notebooks, including `advanced_usage.ipynb`.
After installing the repository in editable mode, run the test suite to verify that everything works:

```bash
pytest
```

The current suite is small but helps prevent regressions.
The project is laid out as follows:

```
psd/                 # Reference implementations and framework-specific optimisers
    algorithms.py    # PSD and baseline algorithms
    functions.py     # Analytic test functions and registry
psd_optimizer/       # PyTorch optimiser package
experiments.py       # Synthetic data generation
```
Contributions are welcome! Please open an issue or pull request on GitHub and see `CONTRIBUTING.md` for guidelines. By participating you agree to abide by the `CODE_OF_CONDUCT.md`.
If you use PSD in your research, please cite the following:
```bibtex
@misc{alpay2025escapingsaddlepointscurvaturecalibrated,
      title={Escaping Saddle Points via Curvature-Calibrated Perturbations: A Complete Analysis with Explicit Constants and Empirical Validation},
      author={Faruk Alpay and Hamdi Alakkad},
      year={2025},
      eprint={2508.16540},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2508.16540},
}
```

This project is released under the MIT License. See `LICENSE` for details.