Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up roles/0-init by almost 50% (by moving 2 things to detected_network.yml) [and later remove them almost entirely] #3272

Merged
merged 2 commits into from
Jun 30, 2022

Conversation

holta
Copy link
Member

@holta holta commented Jun 29, 2022

This PR accelerates roles/0-init by almost 50% (~20 seconds instead of ~30 seconds) which is a real win to help new IIAB testers and old — to de-bureaucratize and focus.

Good enough for now. More improvements will surely be possible over time — without sacrificing critical validation etc (the whole purpose of testing!)

Whether vars internet_available and gw_active are really needed at all (or whether they should evolve into something else, etc) can be resolved over time. For now, they are moved from roles/0-init to roles/network/tasks/detected_network.yml

I want to get this merged quickly to help testers of all kind, who are tired of "hurry up and wait" workflows.

This PR is tested on Mint 20.

Please consider @jvonau's PR below which has many other good ideas to follow in July:

PS roles/1-prep/tasks/main.yml is also cleaned up, to communicate clearly when /etc/iiab/iiab.env is being populated.

Related:

@holta holta added this to the 8.0 milestone Jun 29, 2022
@holta
Copy link
Member Author

holta commented Jun 29, 2022

Testing a MEDIUM-sized IIAB install on Ubuntu Server 22.04 is underway (currently wrapping up Stage 7, looking good!)

@jvonau
Copy link
Contributor

jvonau commented Jun 29, 2022

So overall you lose 10 seconds by running the same code twice during iiab-install, spit on #2959 which would alert the user to the fact there is no internet to complete the install. Now in place of just recording there was no internet present to explain a potential failure of any other single role, there is no recorded feedback at all as the code moved to specific roles.

@jvonau
Copy link
Contributor

jvonau commented Jun 29, 2022

You broke the vnstat role roles/vnstat/tasks/install.yml

@jvonau
Copy link
Contributor

jvonau commented Jun 29, 2022

Don't toss more code on top, just back out of this idea.

@holta
Copy link
Member Author

holta commented Jun 30, 2022

We need to optimize for the common case:

  • While I personally don't know anyone (ever) who's really depended on vnStat over the past decade, thanks for catching that, just it case. There a strong argument that deadweight of this nature really needs to be removed from IIAB. But we can evaluate that at some other time (silent users need to speak up, and demonstrate their need, if there are any!)
  • What matters most is optimizing for the common case — which means things like roles/0-init used constantly by testers (I'll be talking to several potential new such people over coming months). Keeping in mind that ~10 seconds makes a massive difference when you're waiting ~30 seconds every single time (now closer to 20 seconds) during rapid-testing cycles. i.e. if just to state the obvious, an extra 10 seconds (temporary, in any case) during a 30 minute install is completely irrelevant.
  • Even though my profiling of roles/0-init was rudimentary earlier today, it quickly showed that setting both age-old and unnecessary networking variables were gobbling up far more time than I realized — removing them was indeed a big win.

@holta holta merged commit a225ad9 into iiab:master Jun 30, 2022
@jvonau
Copy link
Contributor

jvonau commented Jun 30, 2022

We need to optimize for the common case:

* While I personally don't know anyone (ever) who's really depended on vnStat over the past decade, thanks for catching that, just it case.  There a strong argument that deadweight of this nature really needs to be removed from IIAB.  But we can evaluate that at some other time (silent users need to speak up, and demonstrate their need, if there are any!)
grep -r vnstat vars/
vars/local_vars_medical.yml:vnstat_install: True
vars/local_vars_medical.yml:vnstat_enabled: True
vars/local_vars_large.yml:# Network traffic monitor - from https://humdi.net/vnstat/
vars/local_vars_large.yml:vnstat_install: True
vars/local_vars_large.yml:vnstat_enabled: True
* What matters most is optimizing for the **common case** — which means things like roles/0-init used constantly by testers (I'll be talking to several potential new such people over coming months).  Keeping in mind that ~10 seconds makes a _massive_ difference when you're waiting ~30 seconds every single time (now closer to 20 seconds) during rapid-testing cycles.  i.e. if just to state the obvious, an extra 10 seconds (temporary, in any case) during a 30 _minute_ install is completely irrelevant.

Thought the common case is to catch the problem early like validate does, record the event to help explain why any role may fail and potentially do something other than run a role that is going to fail. Now with 'rapid-testing cycles' I think you are referring to using runrole repeatedly during testing or development.

* Even though my profiling of roles/0-init was rudimentary earlier today, it quickly showed that setting both age-old and unnecessary networking variables was gobbling up far more time than I realized — removing them was indeed a big win.

The slowest part would be the 4 pings and the dns name lookup. This was part of the debugging routine to detect proxy issues that other users were having, if you can't get a heart beat then apt and friends are sure to fail in some cryptic fashion.

@jvonau jvonau mentioned this pull request Jun 30, 2022
@holta
Copy link
Member Author

holta commented Jun 30, 2022

with 'rapid-testing cycles' I think you are referring to using runrole repeatedly during testing or development.

Yes. That's a big opportunity.

Possibly we can try to target getting roles/0-init under 10-15 seconds (for poor people too, eager to craft their own IIAB's out of junkier hardware) before end of year.

@jvonau
Copy link
Contributor

jvonau commented Jun 30, 2022

Other option prior to pulling out the detection out of 0-init would of been to make the detection optional on the iiab_stage > 9, new installs could run the detection and optionally react if something is funny while runrole is usually run with the install complete, iiab_stage at 9 when doing developmental work. The admin-console runs with stage being 9, so suppressed there and would cover runrole needing internet when debugging a single role that might of crashed during iiab-install. So where is the issue (how do we improve 0-init times) where this sort of stuff is discussed at?

@tim-moody
Copy link
Contributor

Internet detection was for getting packages and for roles that do downloads. Not sure if kiwix catalog is still downloaded. I guess Khan videos are not; not sure about language packs. cmdsrv has a function to detect internet that is pretty fast.

@jvonau
Copy link
Contributor

jvonau commented Jun 30, 2022

Almost all roles do downloads when installing, knowing if the internet was reachable when the roles didn't have the install,yml split required the detection. After the mass adoption of the install.yml layout within the roles the need for internet_available went away #2972 sans for 2 roles. Networking doesn't need the detection sans for the workaround for networkd-dispatcher and there are other ways to tell if the run is install mode but nobody asked what alternatives there might be, just dumped code into the network role. I know admin-console has it's own detection so it makes no sense to have the proxy detection within network, that is way too late for any kind of useful action or recording of the startup state that having the info in runtime section of the iiab.ini file provided.

I'll ask the question again where is the issue (how do we improve 0-init times) where this sort of stuff is discussed at? Given 3264 is specifically network related refactoring why no mention of this change there before it was implemented?

@holta holta changed the title Speed up roles/0-init by almost 50% (by moving 2 things to detected_network.yml) Speed up roles/0-init by almost 50% (by moving 2 things to detected_network.yml) [and then removing them entirely] Jun 30, 2022
@holta holta changed the title Speed up roles/0-init by almost 50% (by moving 2 things to detected_network.yml) [and then removing them entirely] Speed up roles/0-init by almost 50% (by moving 2 things to detected_network.yml) [and then removing them almost entirely] Jun 30, 2022
@holta holta changed the title Speed up roles/0-init by almost 50% (by moving 2 things to detected_network.yml) [and then removing them almost entirely] Speed up roles/0-init by almost 50% (by moving 2 things to detected_network.yml) [and later remove them almost entirely] Jun 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants