A PyTorch implementation of 'AUTOMATIC SPEECH EMOTION RECOGNITION USING RECURRENT NEURAL NETWORKS WITH LOCAL ATTENTION'
-
Python 3.6.4
-
PyTorch 0.4.1
-
fileutils.readHtk (from a GitHub repo; I modified htk.py to work with Python 3)
The IEMOCAP database has 5531 utterances covering 4 emotions:
A: Anger, H: Excited + Happiness, N: Neutral, S: Sadness
#head -2 iemocap/wav_cat.list
/your/path/Ses01F_impro01_F000.wav N
/your/path/Ses01F_impro01_F001.wav N
#head -2 iemocap/utt.list
Ses01F_impro01_F000
Ses01F_impro01_F001
The MSP-IMPROV database has 7798 utterances covering 4 emotions.
#head -2 msp_improv/wav_cat.list
/your/path/MSP-IMPROV-S01A-F01-P-FM01.wav N
/your/path/MSP-IMPROV-S01A-F01-P-FM02.wav H
#head -2 msp_improv/utt.list
MSP-IMPROV-S01A-F01-P-FM01
MSP-IMPROV-S01A-F01-P-FM02
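
Before running the pipeline it can help to sanity-check a `wav_cat.list` file. The sketch below is not part of this repo; it only assumes the `path label` line format shown above, and `read_wav_cat` is a hypothetical helper name. It derives each utterance ID from the wav file name and counts the labels.

```python
from collections import Counter
from pathlib import Path


def read_wav_cat(lines):
    """Parse 'wav_path label' lines into (utterance_id, label) pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the last whitespace run so a path containing spaces survives.
        wav_path, label = line.rsplit(None, 1)
        # The utterance ID is assumed to be the file name without '.wav'.
        pairs.append((Path(wav_path).stem, label))
    return pairs


# The two sample lines from iemocap/wav_cat.list shown above.
sample = [
    "/your/path/Ses01F_impro01_F000.wav N",
    "/your/path/Ses01F_impro01_F001.wav N",
]
pairs = read_wav_cat(sample)
label_counts = Counter(label for _, label in pairs)
print(pairs)
print(label_counts)
```

For a real file, pass `open("iemocap/wav_cat.list")` instead of `sample`; the resulting IDs should match the entries of `utt.list`.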
./add_opensmile_conf.sh your_opensmile_dir
./prepare_list.sh iemocap/wav_cat.list \
iemocap/lld.htk.list iemocap/utt.list iemocap/lld/
./extract_lld.sh your_opensmile_dir/ iemocap/wav_cat.list \
iemocap/lld.htk.list
./make_utt_lld_pair.py iemocap/utt.list iemocap/lld.htk.list \
iemocap/utt_lld.pk
./iemocap/make_csv.sh iemocap/utt.list iemocap/wav_cat.list iemocap/ \
iemocap/full_dataset.csv
# Modify make_dataset.py parameters as you want!
#
### Default setting ###
#
# devfrac=0.2
# session=1
# prelabel="gender"
#
# e.g.
# sed 's/"gender"/"speaker"/' iemocap/make_dataset.py > new_script.py
# sed 's/devfrac=0.2/devfrac=0.1/' iemocap/make_dataset.py > new_script.py
./iemocap/make_dataset.py iemocap/full_dataset.csv iemocap/utt_lld.pk iemocap/your_dataset_path
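
To illustrate what `session` and `devfrac` could control, here is a rough sketch of a leave-one-session-out split. This is an assumption about the intent of the defaults, not the actual code of `make_dataset.py`: the chosen session is held out as the test set and a `devfrac` share of the remaining utterances becomes the dev set. `session_of` and `split_by_session` are hypothetical names, the utterance IDs are made up for the demo, and `prelabel` is not modeled.

```python
import random
import re


def session_of(utt_id):
    """Return the session number encoded in an ID such as 'Ses01F_impro01_F000'."""
    match = re.match(r"Ses(\d+)", utt_id)
    if match is None:
        raise ValueError("unexpected utterance id: " + utt_id)
    return int(match.group(1))


def split_by_session(utt_ids, session=1, devfrac=0.2, seed=0):
    """Hold out one session for test, then carve a dev set from the rest."""
    test = [u for u in utt_ids if session_of(u) == session]
    rest = [u for u in utt_ids if session_of(u) != session]
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(rest)
    n_dev = int(len(rest) * devfrac)
    return rest[n_dev:], rest[:n_dev], test


# Hypothetical IDs: 5 sessions with 5 utterances each.
utt_ids = ["Ses%02dF_impro01_F%03d" % (s, i) for s in range(1, 6) for i in range(5)]
train, dev, test = split_by_session(utt_ids, session=1, devfrac=0.2)
print(len(train), len(dev), len(test))
```

Splitting by session keeps the test speakers out of training, which is why the results are reported per session.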
# Modify make_expcase.py params as you want!
#
### Default setting ###
#
# lr=0.00005
# bsz=64
# ephs=200
./iemocap/make_expcase.py iemocap/your_dataset_path iemocap/your_dataset_path/your_expcase
#ls iemocap/your_dataset_path/your_expcase
# log
# param.json
# premodel.pth
# model.pth
./run.py --propjs iemocap/your_dataset_path/your_expcase/param.json
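
For readers new to the paper, the "local attention" in the title is a weighted pooling over the RNN outputs: each frame output y_t gets a weight alpha_t = softmax(u . y_t), and the utterance representation is the weighted sum of the frames. The pure-Python sketch below shows only that pooling step; it is not the model code of this repo, `attention_pool` is a hypothetical name, and in the real model `u` is learned and the frames come from the recurrent layer.

```python
import math


def attention_pool(frames, u):
    """Pool T frame vectors into one vector using softmax(u . y_t) weights."""
    # One scalar score per frame: the inner product with the attention vector.
    scores = [sum(ui * yi for ui, yi in zip(u, y)) for y in frames]
    # Numerically stable softmax over time.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the frames, one output value per feature dimension.
    dim = len(frames[0])
    pooled = [sum(w * y[d] for w, y in zip(weights, frames)) for d in range(dim)]
    return weights, pooled


# Toy input: 3 frames with 2 features each (made-up numbers).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, pooled = attention_pool(frames, u=[2.0, 2.0])
print(weights)
print(pooled)
```

With `u` set to all zeros the weights are uniform and the result reduces to mean pooling over time; a non-zero `u` lets the model emphasize the frames that carry emotional content.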
# Note: the hyperparameters were not tuned.
# grep test iemocap/sess?/exp/log
# iemocap/sess1/exp/log:[test] score: 0.459, loss: 1.278
# iemocap/sess2/exp/log:[test] score: 0.542, loss: 1.190
# iemocap/sess3/exp/log:[test] score: 0.542, loss: 1.195
# iemocap/sess4/exp/log:[test] score: 0.521, loss: 1.214
# iemocap/sess5/exp/log:[test] score: 0.513, loss: 1.226
# grep test msp_improv/sess?/exp/log
# msp_improv/sess1/exp/log:[test] score: 0.493, loss: 1.238
# msp_improv/sess2/exp/log:[test] score: 0.485, loss: 1.249
# msp_improv/sess3/exp/log:[test] score: 0.526, loss: 1.208
# msp_improv/sess4/exp/log:[test] score: 0.502, loss: 1.225
# msp_improv/sess5/exp/log:[test] score: 0.474, loss: 1.261
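
To summarize the five held-out sessions as a single number, the per-session test scores can be averaged. The sketch below parses the `[test] score: ..., loss: ...` lines exactly as they appear in the grep output above; `mean_test_score` is a hypothetical helper and not part of this repo.

```python
import re

LOG_LINE = re.compile(r"\[test\] score: ([0-9.]+), loss: ([0-9.]+)")


def mean_test_score(lines):
    """Average the '[test] score' values found in grep-style log lines."""
    scores = [float(m.group(1)) for m in map(LOG_LINE.search, lines) if m]
    if not scores:
        raise ValueError("no [test] lines found")
    return sum(scores) / len(scores)


# Lines copied from the grep output above.
iemocap_lines = [
    "iemocap/sess1/exp/log:[test] score: 0.459, loss: 1.278",
    "iemocap/sess2/exp/log:[test] score: 0.542, loss: 1.190",
    "iemocap/sess3/exp/log:[test] score: 0.542, loss: 1.195",
    "iemocap/sess4/exp/log:[test] score: 0.521, loss: 1.214",
    "iemocap/sess5/exp/log:[test] score: 0.513, loss: 1.226",
]
msp_improv_lines = [
    "msp_improv/sess1/exp/log:[test] score: 0.493, loss: 1.238",
    "msp_improv/sess2/exp/log:[test] score: 0.485, loss: 1.249",
    "msp_improv/sess3/exp/log:[test] score: 0.526, loss: 1.208",
    "msp_improv/sess4/exp/log:[test] score: 0.502, loss: 1.225",
    "msp_improv/sess5/exp/log:[test] score: 0.474, loss: 1.261",
]
iemocap_mean = mean_test_score(iemocap_lines)
msp_improv_mean = mean_test_score(msp_improv_lines)
print("IEMOCAP mean test score: %.3f" % iemocap_mean)        # 0.515
print("MSP-IMPROV mean test score: %.3f" % msp_improv_mean)  # 0.496
```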