Commit

when measuring BPC, divide by ln(2) to get into the right base
proger committed Apr 26, 2023
1 parent 0dc8cf7 commit 1c5dc38
Showing 2 changed files with 5 additions and 5 deletions.
8 changes: 4 additions & 4 deletions examples/exp/ppl/BPC
@@ -1,12 +1,12 @@
 data/flair-uk-forward.ppl.tsv sentences 28643
 data/flair-uk-forward.ppl.tsv nll_mean 139.6773665773955
-data/flair-uk-forward.ppl.tsv bpc 0.8214119255908473
+data/flair-uk-forward.ppl.tsv bpc 1.7096613025528762
 exp/ppl/small.tsv sentences 28643
 exp/ppl/small.tsv nll_mean 118.76219278015107
-exp/ppl/small.tsv bpc 0.7222614857594851
+exp/ppl/small.tsv bpc 1.503292652634816
 exp/ppl/medium.tsv sentences 28643
 exp/ppl/medium.tsv nll_mean 115.0707072714907
-exp/ppl/medium.tsv bpc 0.6998114303527089
+exp/ppl/medium.tsv bpc 1.4565658036892946
 exp/ppl/large.tsv sentences 28643
 exp/ppl/large.tsv nll_mean 113.01209246430402
-exp/ppl/large.tsv bpc 0.6872915942213289
+exp/ppl/large.tsv bpc 1.4305074051181672
2 changes: 1 addition & 1 deletion examples/scripts/evaluate_nll.py
@@ -24,7 +24,7 @@
 df = df.loc[df.index.intersection(idf.index)]
 
 nll = np.log(df.ppl.to_numpy()) * df.sentence_len.to_numpy()
-nll2 = nll / np.log2(np.e)
+nll2 = nll / np.log(2)
 
 char_len = df.text.str.len().to_numpy()
 N = np.sum(char_len)
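
For reference, a minimal sketch of the unit conversion this commit fixes. `np.log` returns nats, so dividing by ln(2) converts to bits. The previous divisor, `np.log2(np.e)`, equals 1/ln(2), so it multiplied by ln(2) instead and left each reported BPC too small by a factor of ln(2)^2 (about 0.48), which matches the old and new values in the BPC file. The function name `bits_per_char` and the final division by the total character count are assumptions for illustration; the rest of `evaluate_nll.py` is not shown in this diff.

```python
import numpy as np


def bits_per_char(ppl, sentence_len, char_len):
    """Hypothetical helper: bits per character from per-sentence perplexity.

    Assumes `ppl` is a per-unit perplexity, `sentence_len` is the number of
    scored units per sentence, and `char_len` is the character count per
    sentence. These meanings are assumed, not confirmed by the diff.
    """
    # Per-sentence negative log-likelihood in nats (natural log).
    nll_nats = np.log(ppl) * sentence_len
    # Convert nats to bits: log2(x) = ln(x) / ln(2).
    nll_bits = nll_nats / np.log(2)
    # Normalize by the total number of characters (assumed final step).
    return np.sum(nll_bits) / np.sum(char_len)


# Sanity check: a perplexity of 2 per character is one bit per character.
ppl = np.array([2.0, 2.0])
lengths = np.array([3, 5])
print(bits_per_char(ppl, lengths, lengths))  # approximately 1.0

# The old divisor is the reciprocal of the correct one.
print(np.isclose(np.log2(np.e), 1.0 / np.log(2)))
```

A perplexity of exactly 2 is a convenient test case because the correct conversion must then give one bit per scored unit.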