Given the Tailscale founder’s posts on how he thinks about engineering and culture, I was expecting something interesting when I clicked this link. I didn’t get it.
For example: https://apenwarr.ca/log/20190926
I think the technical content is “if you use Wireguard, you don’t have VPN issues, so it’s easy to have a small team.” That’s different than the kind of cultural thing Apenwarr was writing about.
Thanks for sharing, it’s an interesting blog I’d not run across before.
3 people ain’t enough to run a 24x7 on call environment with rotations, time off to recover, sick leave, holidays, and enough time for personal development. The minimum size is 7 in countries with sensible labor laws, and/or at companies that respect them.
It’s startup-sized, so probably others in the org can step in and cover these gaps if needed.
Not sure if this is something to be proud of. The engineering ofc is cool.
You can do 24x7 on call with 1 person if the call never happens.
If this is legal you don’t have sensible labor laws.
Even in countries with sensible labor laws there are many small shops that only have a single person who is “on call” in the sense of being able to solve technical problems. If being on call just means that you’re who people would call if the site went down, but the site never goes down, it’s not a big deal. I have been the sole person “on call” at my company for years. Sometimes I get texts from my boss if something is down, but that doesn’t happen very often and my SLO is “I will look at it whenever I get back to a computer”.
I don’t think replying to “having a proper on-call rotation implies” with “assuming the system is not mission critical and there is no on-call, having a single person is better than nothing” works.
It undermines the premise of the discussion.
“You can do 24x7 on call with 1 person if the call never happens” is also reductive of the situation: you don’t have “24x7 on-call” in your example to begin with.
I hope you get compensated accordingly.
It can be mission critical but so reliable that it doesn’t take a toll on the operators.
This is beside the point of whether you have a functional on-call rotation.
It sounds like we’re having a semantic battle about the definition of “on call.” I think it’s valid for the definition of “on call” to include best effort and not a certain small number of minutes for MTTR.
Going back to Tailscale, if they have fewer than let’s say 5 incidents that happen outside of normal working hours per year, I think it’s totally feasible to have three people on call.
Well yes, being on-call has a specific meaning. That meaning matters, and is not the same as “being generally available to my employer outside normal working hours because I feel like it’s fine”.
Well, this is the point. Depending on the labor laws of the country, and whether or not the company actually follows them, you can’t.
I agree. The linked article however says nothing about oncall rotations. It’s describing a team that develops and maintains infrastructure, not an oncall rotation.
Source: am one of the three engineers, am also not oncall.
It’s a marketing post but there’s enough interesting content in it IMO. Only 3 people work on infrastructure at Tailscale!