I wish I had written this article. I have an article I’ve been working on that root causes this behavior as “Software Engineering is not Computer Science”. This is not an anti-intellectual position, but an acknowledgement that when we do things like design cars we don’t hire exclusively theoretical physicists even though we fully expect mechanical engineers to know and apply a large amount of physics.
Also, our industry's cargo-culting of certain CS101 keywords while disdaining things on the implementation side has directly led to some ridiculously slow applications, such as the database performance issues mentioned in the article.
As an industry we've somehow assumed that people smart enough to understand CS concepts are smart enough to apply those concepts in new areas. Perhaps that statement is true, but the reality is that all too often they simply don't. And that's how we end up with people who simultaneously understand b-trees well enough to implement them but don't think to review their database's documentation on "CREATE INDEX".
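As a throwaway illustration of that CREATE INDEX point (hypothetical table and column names, using Python's built-in sqlite3 only to keep the sketch self-contained):

```python
# Hypothetical sketch (made-up schema, not from the article): the same lookup
# goes from a full table scan to an index search once we let the database do
# the indexing -- no hand-rolled B-tree required.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
conn.executemany("INSERT INTO users (email) VALUES (?)",
                 [(f"user{i}@example.com",) for i in range(10_000)])

query = "SELECT id FROM users WHERE email = ?"

# Without an index: the plan is a scan over the whole table.
for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("user42@example.com",)):
    print(row)  # detail column reads something like 'SCAN users'

# One line of DDL later, the planner switches to an index search.
conn.execute("CREATE INDEX idx_users_email ON users (email)")
for row in conn.execute("EXPLAIN QUERY PLAN " + query, ("user42@example.com",)):
    print(row)  # ... 'SEARCH users USING INDEX idx_users_email (email=?)'
```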
There's a fun flip side to this phenomenon, which is that you can often solve 3SAT or even the halting problem in practice. Obviously one can't push this too far (and by "solve the halting problem" I obviously mean solve specific instances of whether a program will halt), but CS undergrads seem to assign mythical difficulty to concrete instances of these problems that are actually surmountable in the real world. There's a reason 3SAT isn't used as the basis of a cryptographic protocol.
One thing I liked about my own undergrad (~20 yrs ago) is that the juxtaposition of two classes helped me avoid this particular pitfall. In Algorithms, SAT was a canonical example of a "hard" problem, and you proved other problems hard through reduction from SAT. But in Artificial Intelligence, SAT was a canonical example of a tractable-in-practice problem, and you could show other problems tractable through… reduction to SAT.
Alas, symbolic planning is pretty out of favor in 2023, so undergrads are unlikely to run across that particular example in an intro AI class. Maybe a programming languages class introducing formal methods could serve that role.
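To make the "reduction to SAT" direction concrete, here's a toy sketch of my own (not anything from those classes): encode graph 3-coloring as CNF clauses, then check satisfiability. I've used a brute-force loop so the snippet stays dependency-free; the same clauses could be handed to a real SAT solver, which is the whole point of the reduction.

```python
# Toy sketch: reduce graph 3-coloring to SAT. Variables are numbered
# DIMACS-style (1-based); a positive literal means "node v has color c".
from itertools import product

def coloring_to_cnf(edges, n_nodes, n_colors=3):
    """Return CNF clauses (lists of signed ints) encoding a proper coloring."""
    var = lambda node, color: node * n_colors + color + 1
    clauses = []
    for v in range(n_nodes):
        clauses.append([var(v, c) for c in range(n_colors)])        # at least one color
        for c1 in range(n_colors):
            for c2 in range(c1 + 1, n_colors):
                clauses.append([-var(v, c1), -var(v, c2)])          # at most one color
    for u, v in edges:
        for c in range(n_colors):
            clauses.append([-var(u, c), -var(v, c)])                # endpoints differ
    return clauses, n_nodes * n_colors

def brute_force_sat(clauses, n_vars):
    """Try every assignment; fine for toy sizes, exponential in general."""
    for bits in product([False, True], repeat=n_vars):
        value = lambda lit: bits[abs(lit) - 1] == (lit > 0)
        if all(any(value(lit) for lit in clause) for clause in clauses):
            return bits
    return None

# A 4-cycle is 2-colorable, so the encoding is satisfiable.
clauses, n_vars = coloring_to_cnf([(0, 1), (1, 2), (2, 3), (3, 0)], n_nodes=4)
print(brute_force_sat(clauses, n_vars) is not None)  # True
```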
This is something that CS undergraduate education covers (randomized algorithms, which can be used to solve many problems where, as you say, there are pathological cases, but those cases are only a small fraction of the sample space)?
What a CS undergrad curriculum covers and what CS undergrads remember & understand a year or two out of college are overlapping but distinct. Like who the hell remembers how the pumping lemma works without looking it up?
A lot of computer science degrees seem to labour under the misapprehension that they are there to teach computer science. This is not what a university course is for. A degree is supposed to give you a guided tour of your ignorance such that you understand how little you know and how to learn the things that you later need.
Sounds like a very PhD thing to say ;)
The pumping lemma is a lot easier to remember when you realize it boils down to “all sufficiently-long paths on a finite directed graph contain a cycle.”
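For anyone who wants the formal statement next to that picture: if a DFA for the language has p states, then any accepted word of length at least p revisits some state within its first p symbols (pigeonhole), and the loop between the two visits is exactly the part you can pump:

$$w = xyz, \qquad |xy| \le p, \qquad |y| \ge 1, \qquad xy^i z \in L \ \text{for all } i \ge 0.$$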
The article had been sitting in my drafts folder for two months, so you had a chance to beat me to it ;) In any case, it'd be worthwhile to have a parallel take on it too!
Joking aside, I agree with your response and like the “Software Engineering is not CS” claim. I think it’s spot on. Also, it’s probably true that people smart enough to understand CS concepts can learn the rest… but the question is whether they are actually put in a position where they can learn (proper reviews, mentoring, etc.) and whether they want to learn. I’ve seen too many cases of “two inexperienced people in a corner bouncing code reviews off of each other” where, before you know it, the code is already in prod causing issues. The support structures are rarely there :-/
Great article @jmmv, I really liked it!
One thing I am slowly learning/realizing about performance and Big O is that I’m less worried about an accidentally quadratic algorithm than the drip-drip-drip of small performance losses.
Accidentally-quadratic algorithms are absolutely horrible for performance, but on the positive side they are extremely visible in a flamegraph. In the best-case scenario, someone on the team sends a much larger payload than what is "normal", and the system slows down to a crawl. Fortunately, a flamegraph will point us to where the problem is, and in the case of accidentally quadratic problems the issue is often a single function, which is relatively easy to fix.
On the other hand, small performance losses (e.g., unnecessary copies, bad branch prediction, calling a slow regular function from an async function, etc.) are not quite as visible in flamegraphs, but they just sap performance all over the place, and making the program faster suddenly starts to look like it might require a full re-architecture, which is way scarier and riskier than fixing the accidentally quadratic function.
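A made-up example of the first kind (hypothetical functions, not from the article): the quadratic version shows up as one hot frame in a flamegraph, and the fix stays local.

```python
# Hypothetical "accidentally quadratic" deduplication: a linear membership test
# inside a linear loop. Invisible on small inputs, a single hot frame in the
# flamegraph once a large payload arrives.
def dedupe_quadratic(items):
    seen = []
    out = []
    for x in items:
        if x not in seen:   # O(n) list scan inside an O(n) loop -> O(n^2)
            seen.append(x)
            out.append(x)
    return out

# The local fix: swap the list for a set and the function becomes O(n).
def dedupe_linear(items):
    seen = set()
    out = []
    for x in items:
        if x not in seen:   # O(1) average-case set lookup
            seen.add(x)
            out.append(x)
    return out
```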
I like this post. I'm into testing and verification, which are usually focused on functional behavior vs. any kind of performance, but I've also been on DevOps / SRE / on-call rotations for several years at this point. So I know full well that logically correct code can still lead to a page in the middle of the night.
My view is that there’s a false dichotomy going on - you can test for functional correctness as well as non-functional concerns like performance / scalability. In fact, you most certainly should do both. This post goes even further and talks about many different dimensions, including development time. That’s ultimately the best way to think. There are dozens and dozens of important dimensions to a software project. We should consider all of them.
One nit:
Having a fast and responsive app is orthogonal to "knowing your big Os".
I wouldn’t call them completely orthogonal. Using an algorithm with poor big O characteristics will also lead to poor performance. This is where I see the false dichotomy appearing in practice. The logical characteristics of your actual application code have to be in order before you even begin to worry about generic system performance. They’re intertwined, not orthogonal.
Great article. So pragmatic and rooted in the real world.
I think, in the extreme case, the general problem can be reduced to this form: an engineer learns about some technology or principle, gets charmed and feels empowered, and proceeds to do their work guided by it… while ignoring all other variables.
My theory for this is that humans really can only handle so many variables. That means mental models for human beings have to be oversimplified, and have to exclude other important variables.
We will never be perfect, and we will never have perfect technology for that reason. This is also the “squeezing the balloon” effect. The new technology which captures your attention seems strictly better in all areas, but it just has some downsides in some new variables that you probably aren’t even aware of.
My theory for this is that humans really can only handle so many variables.
It's quite disappointing that even in the software industry, which arguably drove the biggest societal and technological changes in the past couple of decades, that value is very close to one. One would have hoped for more from smart people, from thinking minds.
The reason why big O is more prevalent in computer science is that it doesn’t depend on a specific architecture (aside from the question of which operations are considered atomic).
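That machine-independence falls straight out of the definition: the constant absorbs whatever a particular architecture charges per (atomic) operation, so only the growth rate survives.

$$f(n) = O(g(n)) \iff \exists\, c > 0,\ \exists\, n_0 \ \text{such that}\ f(n) \le c \cdot g(n) \ \text{for all } n \ge n_0.$$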