A place to collect write-ups, notes, and tutorials.
This is a web app that computes several metrics for comparing two texts.
You can compare two texts at the character level (Levenshtein distance), at the word level (BLEU score), or with a metric that takes meaning into account (BERTScore, based on embeddings). You can also ask a language model to compare the two texts (this requires a valid OpenAI API key).
The website is available at textcomparison.pythonanywhere.com (it may take some time to load), and the repository is available here.
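As an illustration of the character-level metric, here is a minimal sketch of the Levenshtein distance as a standard dynamic program (the app itself may compute it differently, e.g. via a library):

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # Dynamic programming over prefixes, keeping only two rows:
    # prev[j] = distance(a[:i-1], b[:j]), curr[j] = distance(a[:i], b[:j])
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i] + [0] * len(b)
        for j, cb in enumerate(b, start=1):
            curr[j] = min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitution (free if characters match)
            )
        prev = curr
    return prev[len(b)]

print(levenshtein("kitten", "sitting"))  # 3
```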
A simple two-layer feed forward network with manual backpropagation
This is a short, expository Python implementation of a tiny neural network, written from scratch in NumPy with manual backpropagation. The gradients are then verified against PyTorch.
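A minimal sketch of the idea, with made-up shapes and random data (the write-up's own architecture and details may differ): a two-layer network with a ReLU hidden layer, gradients derived by hand via the chain rule, then checked against PyTorch autograd.

```python
import numpy as np
import torch

# Toy setup (illustrative only): y_hat = relu(x @ W1) @ W2, mean-squared-error loss
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))    # batch of 4 inputs with 3 features
y = rng.normal(size=(4, 2))    # targets
W1 = rng.normal(size=(3, 5))
W2 = rng.normal(size=(5, 2))

# Forward pass in NumPy
h_pre = x @ W1                  # hidden pre-activation, shape (4, 5)
h = np.maximum(h_pre, 0.0)      # ReLU
y_hat = h @ W2                  # output, shape (4, 2)
loss = ((y_hat - y) ** 2).mean()

# Manual backward pass (chain rule)
d_y_hat = 2.0 * (y_hat - y) / y_hat.size  # dL/dy_hat for the mean over all entries
d_W2 = h.T @ d_y_hat
d_h = d_y_hat @ W2.T
d_h_pre = d_h * (h_pre > 0)               # ReLU gate
d_W1 = x.T @ d_h_pre

# Verify against PyTorch autograd
tW1 = torch.tensor(W1, requires_grad=True)
tW2 = torch.tensor(W2, requires_grad=True)
tx, ty = torch.tensor(x), torch.tensor(y)
t_loss = ((torch.relu(tx @ tW1) @ tW2 - ty) ** 2).mean()
t_loss.backward()
print(np.allclose(d_W1, tW1.grad.numpy()), np.allclose(d_W2, tW2.grad.numpy()))
```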
Ask questions to your PDF documents - a tutorial
In this notebook, we use a large language model to ask and answer questions about PDF documents. You may want to do this with documents that are not publicly available, or with documents that appeared after the language model was trained.
Additionally, we want to run the language model locally. This has advantages for privacy and confidentiality (for example, if you cannot send your data to a remote service).
We will use Llama 7B as the language model and LangChain to tie together the other necessary pieces.
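A rough sketch of this kind of pipeline, using class names from an older LangChain release; the module paths, parameters, and the local model file name are assumptions and differ between versions (newer releases move these classes into langchain_community).

```python
# Sketch of a local question-answering pipeline over a PDF (names from an older
# LangChain release; the model path is a placeholder for a locally stored Llama 7B).
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.llms import LlamaCpp
from langchain.chains import RetrievalQA

# 1. Load the PDF and split it into overlapping chunks
docs = PyPDFLoader("my_document.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100).split_documents(docs)

# 2. Embed the chunks and index them in a local vector store
db = FAISS.from_documents(chunks, HuggingFaceEmbeddings())

# 3. Run a local Llama model and wire it into a retrieval QA chain
llm = LlamaCpp(model_path="llama-7b.gguf", n_ctx=2048)
qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever())

print(qa.run("What is the main conclusion of the document?"))
```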
Machine learning model validation
The use of machine learning models is expanding rapidly into strongly regulated business areas such as finance and medicine. Machine learning models and their validation differ substantially from classical models and classical model validation. Regulators, model validation teams, and model developers need to adjust to different requirements for model development and validation.
These notes aim to describe in detail which topics the validation of machine learning models should address.
A tiny introduction to a Poisson Generalized Linear Model (GLM), spelled-out in code
This is an introduction to a very simple example of a Poisson GLM in R. The goal is to understand and manually reproduce all of the output of summary() for this special case. This includes:
- how the GLM is fitted to the data,
- how to predict from the GLM,
- how categorical vs. numerical variables influence the fit,
- the derivation and interpretation of quality of fit indicators (deviance and AIC),
- one possibility to test for statistical significance of including an additional parameter in the model (likelihood ratio test),
- the meaning and derivation of parameter standard errors and p-values.
The emphasis is on expository code rather than on mathematical derivations or on intuition for the quantities introduced.
The R code is also available.
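For reference, the model behind all of this and the two fit statistics mentioned in the list above, in a standard formulation (not taken verbatim from the write-up): a Poisson GLM with log link, its log-likelihood, the residual deviance, and the AIC.

```latex
% Poisson GLM with log link: responses y_i, covariate vectors x_i, coefficients beta
y_i \sim \mathrm{Poisson}(\mu_i), \qquad \log \mu_i = x_i^\top \beta

% Log-likelihood maximized by the fit
\ell(\beta) = \sum_i \left( y_i \log \mu_i - \mu_i - \log y_i! \right)

% Residual deviance and AIC as reported by summary(), with p the number of fitted parameters
D = 2 \sum_i \left( y_i \log \frac{y_i}{\hat\mu_i} - (y_i - \hat\mu_i) \right),
\qquad \mathrm{AIC} = -2\,\ell(\hat\beta) + 2p
```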
These are the slides (in German) that I prepared for the final exam (colloquium) to become an actuary (member of the Swiss Association of Actuaries, SAV). The topic was Steigende Teuerungsraten: Chancen und Risiken im Reserving und Risk Management aus aktuarieller Sicht, which roughly translates as Rising inflation rates: opportunities and risks in reserving and risk management from an actuarial perspective. The slides are also available in the collection of colloquium slides published by the SAV young actuaries.
Least-Squares Monte-Carlo for an American Put Option
These notes reproduce the first example on least-squares Monte-Carlo from the paper Valuing American Options by Simulation: A Simple Least-Squares Approach by Longstaff and Schwartz, available here.
The R code is also available.
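For orientation, a generic Python sketch of the least-squares Monte-Carlo idea for an American put, with illustrative parameters rather than the paper's example: simulate risk-neutral GBM paths, then step backwards in time, regressing the discounted continuation value on the stock price over in-the-money paths to decide where early exercise is optimal.

```python
import numpy as np

# Illustrative parameters (assumptions, not the paper's example)
S0, K, r, sigma, T = 36.0, 40.0, 0.06, 0.2, 1.0
n_steps, n_paths = 50, 100_000
dt = T / n_steps
disc = np.exp(-r * dt)

# Simulate geometric Brownian motion paths under the risk-neutral measure
rng = np.random.default_rng(0)
z = rng.standard_normal((n_paths, n_steps))
log_paths = np.cumsum((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z, axis=1)
S = S0 * np.exp(np.hstack([np.zeros((n_paths, 1)), log_paths]))

payoff = np.maximum(K - S, 0.0)
cash = payoff[:, -1]                 # cashflow if held to maturity
for t in range(n_steps - 1, 0, -1):
    cash *= disc                     # discount continuation cashflow back one step
    itm = payoff[:, t] > 0           # regress only on in-the-money paths
    if itm.any():
        # Least-squares fit of the continuation value on a quadratic polynomial in S
        X = S[itm, t]
        coefs = np.polyfit(X, cash[itm], deg=2)
        continuation = np.polyval(coefs, X)
        exercise = payoff[itm, t]
        ex_now = exercise > continuation
        idx = np.where(itm)[0][ex_now]
        cash[idx] = exercise[ex_now]  # exercise early on these paths

print("LSMC American put estimate:", disc * cash.mean())
```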
Proof of Thiele's Differential Equation
We prove Thiele's differential equation in continuous time from scratch (ignoring some technical details). This write-up tries to spell out all steps in the derivation of this equation, following M. Koller's lecture notes on "Selected Topics in Life Insurance" (not publicly available).
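For orientation, one common textbook form of the equation for the prospective reserve V(t) of a single-life policy with force of interest δ(t), force of mortality μ_{x+t}, continuous premium rate π(t), and death benefit c(t); the precise setting and notation in Koller's notes may differ.

```latex
% Thiele's differential equation for the prospective reserve V(t):
% interest and premiums increase the reserve, expected death claims
% net of the released reserve decrease it.
\frac{d}{dt} V(t) = \pi(t) + \delta(t)\, V(t) - \mu_{x+t}\, \bigl( c(t) - V(t) \bigr)
```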