import os
import sys
with open(sys.argv[0]) as f:
    code = f.read() # read the code of this file ASAP, for logging
import uuid
import time
import glob
import subprocess
import contextlib
from dataclasses import dataclass
# NOTE: Resume originally formatted to fit a standard terminal screen (80 px)
Fern Balsam (she/they)
community-funded open-source ml researcher / part-time outdoor adventurer
9 years of experience developing and aggressively optimizing neural networks
[ primary ] goal: societal impact via applied research and surrounding work
[ secondary ] goal: pure research extending theory of deep learning
-------------------------------------------------------------------------------
| | | |
| Achievements | |
# Note: The one change we need to make if we're in Colab is to uncomment this below block.
# If we are in an ipython session or a notebook, clear the state to avoid bugs
"""
try:
    _ = get_ipython().__class__.__name__
    ## we set -f below to avoid prompting the user before clearing the notebook state
    %reset -f
except NameError:
    pass ## we're still good
"""
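The quoted block above relies on the `%reset -f` magic, which only parses inside IPython, which is presumably why it is kept inside a string when the file runs as a plain script. A minimal sketch of a variant that is valid Python in both settings follows. The helper name `clear_notebook_state` is hypothetical and not part of the original file. It uses IPython's `run_line_magic` in place of the `%` syntax and is otherwise meant to behave the same way.

```python
def clear_notebook_state() -> bool:
    """Reset the IPython namespace if one exists; return True if a reset ran.

    Hypothetical helper, not from the original file.
    """
    try:
        # get_ipython is only injected into the namespace by IPython/Jupyter/Colab.
        shell = get_ipython()  # noqa: F821
    except NameError:
        return False  # plain `python script.py`: there is no notebook state to clear
    # Programmatic form of `%reset -f`; -f skips the confirmation prompt.
    shell.run_line_magic("reset", "-f")
    return True


was_reset = clear_notebook_state()
print("notebook state cleared" if was_reset else "not running under IPython")
```

Because the magic is invoked through a method call, the file no longer needs to be edited when moving between Colab and a terminal.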
# Sketch-specific note: a battery of roughly 25 runs of this code estimated ~93.11% accuracy in the same number of steps as the baseline network, at ~1.7x runtime overhead (much of which goes to the torch.randn allocations and extra layer calculations).
# Note: The one change we need to make if we're in Colab is to uncomment this below block.
# If we are in an ipython session or a notebook, clear the state to avoid bugs
#"""
try:
    _ = get_ipython().__class__.__name__
    ## we set -f below to avoid prompting the user before clearing the notebook state
    %reset -f
except NameError:
import os
import time
import queue
import threading
import random
import torch
import torchvision
from torchvision.transforms import v2
# [IN-DEV currently]
# Maintained/Initially created by Fern. Say hi to me and feel free to ask any questions as needed! <3 :'))))
# If anything here is self-cited or has no citation, it's a conclusion I arrived at over time or derived
# from the basics. There may still be work elaborating on it in further detail (feel free to comment if there's an especially relevant link).
# Misc
- LayerNorm/RMSNorm might be acting as lateral inhibition, a paradigm attempted in many ML papers from the 2000s and surrounding years (Fern, {relevant sources needed})
- 'Soft' (pre-determined or pre-compiled) architectures in the weights of your network can greatly improve convergence times and/or generalization.
- Downcasting dtypes to a lower bit depth in your dot products can be a 'free' efficiency improvement in some circumstances.
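
To make the last point concrete, here is a small standard-library sketch of what downcasting the inputs of a dot product does numerically. It is an illustration under stated assumptions, not the author's method: the inputs are rounded to IEEE 754 half precision with `struct`, while accumulation stays in Python's float64, and the helper names `to_half` and `dot` are hypothetical. It only measures the rounding error. The actual speed or memory benefit depends on hardware and framework support, which this sketch does not measure.

```python
import random
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def dot(a, b):
    """Dot product accumulated in full (float64) precision."""
    return sum(x * y for x, y in zip(a, b))


random.seed(0)
n = 1024
a = [random.uniform(-1.0, 1.0) for _ in range(n)]
b = [random.uniform(-1.0, 1.0) for _ in range(n)]

# Reference: inputs kept at full precision.
reference = dot(a, b)
# Downcast: inputs rounded to half precision, accumulation still float64.
downcast = dot([to_half(x) for x in a], [to_half(y) for y in b])

abs_err = abs(downcast - reference)
# Worst-case bound for inputs in [-1, 1]: each product is off by at most
# 2*eps + eps**2, where eps = 2**-11 is the half-precision rounding error.
eps = 2.0 ** -11
bound = n * (2 * eps + eps ** 2)

print(f"full-precision inputs: {reference:.6f}")
print(f"half-precision inputs: {downcast:.6f}")
print(f"absolute error:        {abs_err:.6f} (worst-case bound {bound:.4f})")
```

Whether an error of this size is acceptable is the "in some circumstances" part of the note: it depends on the layer and on how the result is used downstream.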