kalomaze

@kalomaze
kalomaze / r1_paper_mario_test.md
Last active January 23, 2025 01:11
DeepSeek r1 MIPS decompilation test

What the original function actually looked like:

f32 atan2(f32 startX, f32 startZ, f32 endX, f32 endZ) {
    f32 xDiff = endX - startX;
    f32 zDiff = endZ - startZ;
    f32 absXDiff = fabsf(xDiff);
    f32 absZDiff = fabsf(zDiff);
    f32 ret;
class RescaleDescentTrainer(Trainer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # Initialize all buffers
        self.tokens_buffer = []  # for raw token loss
        self.weighted_tokens_buffer = []  # for entropy weighted token loss
        self.unigram_rate_buffer = []
        self.bigram_rate_buffer = []
        self.trigram_rate_buffer = []
        self.weighted_unigram_buffer = []
@kalomaze
kalomaze / qwen_tokenize_test.py
Created January 20, 2025 22:22
qwen tokenizer test
from transformers import AutoTokenizer
from huggingface_hub import snapshot_download
import os
def add_token_boundaries(tokenizer, tokens):
    """Add brackets around token boundaries"""
    text = ""
    for token in tokens:
        decoded = tokenizer.decode([token])
        text += f"[{decoded}] "
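The preview above is cut off before the function returns anything. As a minimal, self-contained sketch of the same bracketing idea, the block below uses a hypothetical `StubTokenizer` (not the Qwen tokenizer and not part of `transformers`) and assumes the function returns the accumulated string with the trailing space stripped; the original gist may do something different.

```python
class StubTokenizer:
    """Hypothetical stand-in for a real tokenizer: maps integer ids to text pieces."""

    def __init__(self, vocab):
        self.vocab = vocab

    def decode(self, token_ids):
        # Concatenate the text piece for each id, like a real decode() would.
        return "".join(self.vocab[i] for i in token_ids)


def add_token_boundaries(tokenizer, tokens):
    """Add brackets around token boundaries."""
    text = ""
    for token in tokens:
        # Decode one token at a time so each bracket holds exactly one token.
        decoded = tokenizer.decode([token])
        text += f"[{decoded}] "
    # Assumption: return the marked-up string without the trailing space.
    return text.strip()


stub = StubTokenizer({0: "Hello", 1: ",", 2: " world"})
marked = add_token_boundaries(stub, [0, 1, 2])
print(marked)
```

Decoding tokens one by one makes leading spaces inside a token visible, which is usually the point of this kind of tokenizer inspection.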
import sys
import random
import numpy as np
import string
from datetime import datetime
from PIL import Image, ImageEnhance, ImageOps
from PyQt5.QtWidgets import (QApplication, QMainWindow, QWidget, QVBoxLayout,
                             QHBoxLayout, QTextEdit, QPushButton, QCheckBox,
                             QLabel, QSpinBox, QComboBox, QSlider, QFileDialog,
                             QFrame)
@kalomaze
kalomaze / scrambled_text.md
Created January 6, 2025 21:28
scrambled text example

Original Text ("Reading a Sign 43 Times Heals Your Axe Durability" by Hunter R.)

We all know that the axe in Animal Crossing will usually break after using it too much. Of course, the axe is intentionally designed to break like this in order to make the unbreakable Golden Axe an appealing item to unlock. And yet what if I told you that by simply reading a sign over and over you can actually prevent your standard axe from ever breaking? And no, I'm not joking—you can actually sit here and read this sign over and over to heal the durability on your axe, making it theoretically invincible. I'm sure a lot of you are wondering how or why this even works, so let's take a closer look.

Creating an unbreakable axe is a really funny glitch that was recently discovered by Animal Crossing spreadsheet owner Phil. To understand how interacting with a sign heals your axe, let's discuss how axe durability works.

Normally an axe can withstand 72 hits on normal trees before breaking. Since trees take three hits to cut
@kalomaze
kalomaze / branching_completion_prototype.js
Created December 31, 2024 06:18
Branching LLM frontend react widget prototype
import React, { useState } from 'react';
import { Settings, Bookmark, Download, Library, HelpCircle, RefreshCw, ArrowLeft } from 'lucide-react';
const STORY_BRANCHES = {
  root: {
    text: `The darkness grew absolute, not that the hyperstitioner could see in the first place. His ears pricked up, however; he could hear the skittering, the mechanical hum as the machine followed him invisibly...`,
    continuations: [
      {
        id: 'a1',
        text: " The mechanical tendrils wrapped tighter around his shoulder, its grip a cold reminder of their symbiosis...",
datasets:
  - path: anthracite-core/c2_logs_8k_llama3_v1.2
    # contents of this dataset were filtered for quality, but not safety or safe for work-ness. be advised
    type: sharegpt
    conversation: llama3
  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
    type: sharegpt
    conversation: llama3
  - path: lodrick-the-lafted/kalo-opus-instruct-3k-filtered
    type: sharegpt
@kalomaze
kalomaze / tp_latency.txt
Created August 25, 2024 02:03
Tensor Parallel latency
== Results torch.int8 meta-llama/Llama-2-7b-hf-TP1 ====
[--------------------------------------- scaled-torch.int8-gemm --------------------------------------]
                          |  pytorch_bf16_bf16_bf16_matmul-no-scales  |  cutlass_i8_i8_bf16_scaled_mm
1 threads: --------------------------------------------------------------------------------------------
      MKN=(1x4096x12288)  |                   195.3                   |             142.4
      MKN=(1x4096x4096)   |                    64.5                   |              47.5
      MKN=(1x4096x22016)  |                   322.9                   |             235.6
      MKN=(1x11008x4096)  |                   162.6                   |             112.9
      MKN=(16x4096x12288) |                   187.5                   |             142.6
      MKN=(16x4096x4096)  |                    66.2                   |              47.6
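To make the comparison in the rows above easier to read, here is a small sketch that computes the ratio between the two columns for each shape. It assumes the values are per-call timings, so lower is better; the excerpt does not show the units, and the names `rows` and `speedup` are illustrative only.

```python
# (shape, bf16 matmul, int8 scaled_mm) copied from the rows shown above.
# Assumption: values are timings, so a smaller number means a faster kernel.
rows = [
    ("1x4096x12288", 195.3, 142.4),
    ("1x4096x4096", 64.5, 47.5),
    ("1x4096x22016", 322.9, 235.6),
    ("1x11008x4096", 162.6, 112.9),
    ("16x4096x12288", 187.5, 142.6),
    ("16x4096x4096", 66.2, 47.6),
]


def speedup(baseline, candidate):
    """Return how many times faster the candidate is than the baseline."""
    return baseline / candidate


speedups = {shape: speedup(bf16, int8) for shape, bf16, int8 in rows}

for shape, ratio in speedups.items():
    print(f"MKN=({shape}): {ratio:.2f}x")
```

Under that assumption, the int8 kernel comes out between about 1.3x and 1.45x faster than the bf16 matmul on every shape listed here.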
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, AutoConfig
import random
import os
import shutil
# Set a seed for reproducibility
random.seed(42)
# Load the model, tokenizer, and configuration
@kalomaze
kalomaze / modeling_mixtral.py
Created May 5, 2024 03:38
Fixed Mixtral training code for HF Transformers
# coding=utf-8
# Copyright 2023 Mixtral AI and the HuggingFace Inc. team. All rights reserved.
#
# This code is based on EleutherAI's GPT-NeoX library and the GPT-NeoX
# and OPT implementations in this library. It has been modified from its
# original forms to accommodate minor architectural differences compared
# to GPT-NeoX and OPT used by the Meta AI team that trained the model.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.