-
Notifications
You must be signed in to change notification settings - Fork 7.6k
Modify EmbeddingDropout and adapt AWD_LSTM to trigger hooks in Embedding layers #2906
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Thanks @arnaujc91 . Can you please fix the test failure shown here in CI? It's saying |
Ok, I have created a notebook where I tried to explain as good as possible why I am doing this PR and what are the possible solutions that I see to the problem. See the following notebook: EmbeddingDropout P.S. I updated again the notebook. Let me know what you think. Thanks a lot! |
Thanks @arnaujc91 - I'm still seeing the test failure however. Would you be able to fix that? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please fix test failure:
ModuleAttributeError: 'AWD_LSTM' object has no attribute 'encoder_dp'
I modified the PR to pass the previous test, but now is failing in another test:
This is because the class As I explained in the previous Notebook there are two possibilities to fix the ActivationStats problem:
I decided to modify EmbeddingDropout. The reason was twofold: it simplified the code and it solved the issue with the hooks. My current modification just affects If the tests can't be modified then I will need to close this PR and maybe check if i can modify flatten_model instead. Can you or me modify the tests for the layer EmbeddingDropout? Thanks a lot Jeremy. |
Yes please do go ahead and modify the tests. Many thanks!
|
Please before accepting the modifications of this PR check the Notebook I created explaining how to make full sense of this PR. Right now everything works, but the code can be further cleaned. Here is the notebook: encoder and encoder_dp are redundant Thanks a lot! |
Fix for: #2850
@sutt already fixed the
awd_lstm.WeightDropout
class to be able to run the ActivationStats callback for the model AWD_LSTM. Still the Embedding layers were not recording any statistics. Furthermore the Hooks class had duplicated layers due to howflatten_model
works.In this PR I propose a modification to solve both issues:
An example of this can be found here: example EmbeddingDropout.
To solve the issue I just modified slightly the class
EmbeddingDropout
and adaptedAWD_LSTM
to work exactly the same as before for these changes. I think now the code is more compact than before.I by no means have your expertise, nevertheless I tried to do my best refactoring the code. If there is anything that I did wrong just let me know.
Thanks a lot!