CS 4644 Final Project: Git Re-Basin on 2L MLP trained on Modular Addition

Purpose/Goal

The purpose of this final project is to replicate the Git Re-Basin paper on a 2L MLP trained on modular addition. The goal is to build on Nanda et al.'s work on interpreting networks that learn modular addition via "grokking", the phenomenon described in Power et al.'s work. With Git Re-Basin, we wish to explore how basin phenomena interact with particular architectures and tasks, in this case modular addition, and whether linear mode connectivity exists between models that have "grokked" a task. Another goal is to replicate the paper itself, since previous attempts by others have failed and no single codebase implements all three algorithms. We use Stanislav Fort's replication, the Git Re-Basin codebase, and Neel Nanda's codebase as starting points for this project.
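
For concreteness, here is a minimal sketch of the kind of model and task involved; this is not the repo's actual code. It assumes a prime modulus of 113 (the value used in Nanda et al.'s grokking setup), a shared embedding for both operands, and arbitrary layer widths.

import torch
import torch.nn as nn

P = 113  # assumed modulus; Nanda et al. use p = 113, this repo's value may differ

class ModAddMLP(nn.Module):
    """Sketch of a two-layer MLP for computing (a + b) mod P."""
    def __init__(self, d_embed=128, d_hidden=256):
        super().__init__()
        self.embed = nn.Embedding(P, d_embed)        # shared embedding for both operands
        self.fc1 = nn.Linear(2 * d_embed, d_hidden)  # hidden layer
        self.unembed = nn.Linear(d_hidden, P)        # logits over the P residues

    def forward(self, a, b):
        x = torch.cat([self.embed(a), self.embed(b)], dim=-1)
        return self.unembed(torch.relu(self.fc1(x)))

# The full dataset is every ordered pair (a, b) labelled with (a + b) % P.
a, b = torch.meshgrid(torch.arange(P), torch.arange(P), indexing="ij")
pairs = torch.stack([a.flatten(), b.flatten()], dim=1)
labels = pairs.sum(dim=1) % P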

Setup

If you wish to run this project out of interest or to contribute, you can set up your machine using Miniconda or a similar tool to create a virtual environment. On macOS or Linux you can use the following:

ENV_PATH=~/cs4644_final/.env/
cd ~/cs4644_final
conda create -p $ENV_PATH python=3.10 -y
conda install -p $ENV_PATH pytorch=2.0.0 torchtext torchdata torchvision -c pytorch -y
conda run -p $ENV_PATH pip install -r requirements.txt

If you are on Windows, you can run the following in PowerShell:

$env:ENV_PATH='c:\users\<user_name>\cs4644_final\.env'
cd cs4644_final
conda create -p $env:ENV_PATH python=3.10 -y
conda install -p $env:ENV_PATH pytorch=1.12.0 torchtext torchdata torchvision -c pytorch -y
conda run -p $env:ENV_PATH pip install -r requirements.txt

Results

Below are plots from two experiments: an MLP trained on MNIST, used to verify that all three algorithms work, and an MLP trained on modular addition. The modular addition models are notable in that each "grokked" the task, reaching 100% test accuracy after initially overfitting the training data; more on the phenomenon can be found here.
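
The interpolation curves in these plots follow the standard linear mode connectivity recipe: linearly interpolate the parameters of two independently trained models and evaluate at each point along the line. A minimal sketch, assuming two state dicts with identical keys and a user-supplied evaluate function (both are placeholder names, not this repo's API):

import torch

def interpolate_state_dicts(sd_a, sd_b, lam):
    """Return the element-wise interpolation (1 - lam) * sd_a + lam * sd_b."""
    return {k: (1.0 - lam) * sd_a[k] + lam * sd_b[k] for k in sd_a}

@torch.no_grad()
def interpolation_curve(model, sd_a, sd_b, evaluate, num_points=25):
    """Evaluate `model` along the straight line between two sets of weights.

    `evaluate(model) -> float` is a user-supplied test-loss (or accuracy) function.
    For the re-basin curves, sd_b is first permuted to align with sd_a.
    """
    values = []
    for lam in torch.linspace(0.0, 1.0, num_points):
        model.load_state_dict(interpolate_state_dicts(sd_a, sd_b, lam.item()))
        values.append(evaluate(model))
    return values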

MNIST Plots

  • Activation Matching
  • Weight Matching
  • Straight Through Estimation
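
The Weight Matching curves above come from permuting one model's hidden units so that its weights best match the other model's before interpolating. For a network with a single hidden layer this reduces to one linear assignment problem; the sketch below illustrates only that special case under the weight shapes stated in the docstring (the repo's implementation handles multiple permutations and iterates to convergence):

import torch
from scipy.optimize import linear_sum_assignment

def match_one_hidden_layer(W1_a, b1_a, W2_a, W1_b, b1_b, W2_b):
    """Permute model B's hidden units to best match model A's (one hidden layer).

    W1_* has shape (hidden, in), b1_* has shape (hidden,), W2_* has shape (out, hidden).
    Entry (i, j) of the similarity matrix scores matching A's unit i to B's unit j;
    maximizing the matched similarity over permutations is a linear assignment problem.
    """
    sim = W1_a @ W1_b.T + torch.outer(b1_a, b1_b) + W2_a.T @ W2_b
    _, col = linear_sum_assignment(sim.detach().cpu().numpy(), maximize=True)
    perm = torch.as_tensor(col)

    # Applying the permutation re-orders B's hidden units but leaves its function unchanged.
    return W1_b[perm], b1_b[perm], W2_b[:, perm]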

Modular Addition Plots

(Note: Activation Matching did not work here due to the nature of the Embedding layer.)

  • Weight Matching
  • Straight Through Estimation

Takeaways

The performance of the re-basin algorithms on modular addition is terrible, to the point that any permutation destroys model performance. However, if we choose not to permute the embedding weights, we see curves similar to the other re-basin curves, and we notice that even when we permute the remaining weights in the model, the original grokking performance still holds (a sketch of this selective permutation follows the plots below):

  • Weight Matching (No Embedding Permutation)
  • Straight Through Estimation (No Embedding Permutation)
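
Here is a minimal sketch of how such a selective permutation can be applied, assuming the per-layer permutations have already been computed (e.g. by weight matching); the parameter names and the perms structure are hypothetical, not this repo's:

import torch

def apply_permutations(state_dict, perms, skip_prefixes=("embed",)):
    """Apply precomputed hidden-unit permutations, skipping named parameters.

    `perms` maps a parameter name to a list of (dim, index_tensor) pairs, e.g.
    {"fc1.weight": [(0, p1)], "fc1.bias": [(0, p1)], "unembed.weight": [(1, p1)]}.
    Parameters whose names start with one of `skip_prefixes` are left untouched.
    """
    out = {}
    for name, tensor in state_dict.items():
        if name.startswith(skip_prefixes):
            out[name] = tensor.clone()   # embedding weights keep their original order
            continue
        for dim, idx in perms.get(name, []):
            tensor = torch.index_select(tensor, dim, idx)
        out[name] = tensor
    return out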

As shown, however, naive interpolation still outperforms the re-basin techniques, leading me to believe that each basin a grokked model can end up in is invariant under permutation, or that any permutation of the weights other than the embedding still lies in the original basin, which is why there is no re-basin benefit for these models.

As an addendum, these are interpolation plots where neither the embedding nor the unembedding was permuted:

  • Weight Matching
  • Straight Through Estimation
