Dolt has been written in Go since before generics existed. It seems like the current team has marketing strategy of making blog posts with Go (or “Golang”) in the title to increase product awareness. But they also don’t seem to be very good programmers, e.g. https://www.dolthub.com/blog/2024-10-04-reflecting-on-reflect/ where Copilot wrote bad code, it panicked and they blamed Go. I dunno, maybe the original CTO who picked Go left a while ago and the new recruits don’t like it? I don’t really understand it.
This is a bit unkind, but this thought has been bubbling up for the last several posts that hit here from Dolt, and this kind of set me off about it… You can look through their recent history and decide for yourself: https://lobste.rs/domains/dolthub.com
type Stack[T any] struct {
data []T
}
func (s *Stack[T]) Push(v T) {
s.data = append(s.data, v)
}
What Go is missing having methods with their own set of generic type parameters distinct from the parameters introduced by the receiver. See https://go.dev/doc/faq#generic_methods.
Setting aside the issue of whether Go’s generics are incomplete or not, the author might have made a stronger case by showing one or more implementations in alternative languages that have “complete” generics support to provide contrast.
As it stands, they just show that the interface-based approach in Go requires runtime-checked downcasts.
We’d really prefer not to have to add typecasts everywhere we’re calling these methods. We’d also really prefer to not have to verify the runtime type within the SetMap function. Even if we’re 100% confident that these checks and casts will never fail, the whole purpose of having a type system is so that the compiler checks this for us.
This is precisely where the conventional type safety wisdom conflicts with the apparent Go philosophy. The “whole purpose of having a type system” is not for having the compiler check things for you. That’s one purpose. But it certainly isn’t the only one and isn’t necessarily the most important one either. There is a balance between what’s worth checking or not vs other concerns.
The “simplest” approach to solving the problem discussed in this article is to add some type casts and move on. Avoid attempting to solve a puzzle you’ve created for yourself.
I’m curious what you think the point of a type system is? Because when I use a dynamically type language, what I miss is the automatic checking. When I’ve seen people just try to use type annotations as documentation in say Python and then later switch to actually checking them mypy it turns out often they got the annotation wrong and mypy is effectively catching a documentation bug. So I really don’t see the value without the checking part.
I would emphasize two of those points as being more significant than they sound from that list.
First, function type signatures aren’t merely a useful thing to read, they’re computer verified documentation. For example:
If a function in JS takes a string as an argument, its docs need to state that that argument is a string. If the same function is written in Java, the docs don’t need to say that because the type signature already does.
If a function in C++ takes two pointers and returns a pointer, the docs need to say how long the returned pointer lives for: does it live for the lifetime of the first argument, or the second argument, or some other duration? If the same function is written in Rust, the docs don’t need to say that because the type signature already does.
And you can’t forget to update these docs because the compiler checks them for you.
Second, the sorts of errors that type systems eliminate aren’t just null pointers and trivial errors like typos. The sorts of bugs that a type system prevents can catch can be arbitrarily complex. It only seems like it’s preventing local bugs because the type system makes them local. Without type signatures, they aren’t local.
The sorts of bugs that a type system prevents can catch can be arbitrarily complex.
What I am saying here is that there are diminishing returns here: most value in detecting simple bugs fast, not at preventing complex bugs.
I really need to get to writing a proper post here, but, generally, type-system as bug prevention mechanism only works for code that you haven’t run even once. So, error handling and concurrency. For normal code, the added safety is marginal, and is dwarfed by dev tooling improvements.
I would also consider type checking of “ownership patterns” whether through monads, substructural types, lifetime analysis, or whatever else to also be a very important property of type systems, just one most currently type systems don’t give you much help with. Bugs in this area (especially in the case of concurrency) are demonstrably common and difficult to diagnose and repair.
On the other hand, I also think there are diminishing returns to trying to encode increasingly complex correctness contracts into the type system. Quickly it seems to me the difficulty of understanding how the contract has been encoded and how to use it properly outweighs the difficulty of avoiding the error yourself. The cost can be worth it in safety critical systems, I assume.
Based on the most common use of generics in the most commonly used language with generics (Java) I would say this is empirically incorrect. The top two motivations are clearly: tooling assistance, and preventing putting values of the wrong type inside generic collections.
But Java got generics only in version 6! Hence it follows that the thing that Java-style generics are useful for are not the top priorities for a type-system!
Also, your other list of priorities seems rather subjective. I feel that checking many different properties at compile time is by far the biggest benefit of a type system, far above performance under most circumstances.
Counterargument: Typescript, Mypy, Sorbet, even Gleam all provide type systems that don’t and cannot make the code faster. I think there’s also an important case between “detect nulls” and “units of measure” which is features like ADTs that allow for the whole “make invalid states impossible” thing to happen.
More fundamentally, I think this is a roughly accurate list of how much different features of a type system are used, but I think it’s not necessarily right to call that importance. For example, Typescript is definitely designed to support LSP features in Javascript even when the type system isn’t fully there, and it often gets sold on the typo handling stuff — these are key features to Typescript’s success. But Mypy can do this stuff as well, yet my impression is Mypy hasn’t found half as much success in Python codebases as Typescript has in Javascript codebases.
I suspect this is because Typescript does a much better job of modelling the implicit types that already existed in Javascript codebases than Mypy does for Python. (This is partly because Typescript is more powerful, but I suspect also because Javascript is less dynamic, and so easier to model.) This, then, is the more important feature of a type system: it can model the problems that developers are using it for. If it can’t do that, then it is not fit for purpose and will be adjusted or replaced. But if it is sufficient for that purpose, then people will use it in roughly the order your describe in your list.
That said, I do think “the problems that developers are using it for” is such a broad statement that it will look differently for different languages. For example, you probably don’t want to model complex concepts in C, so C’s type system can remain relatively simple. Whereas modelling functions of complex states in a way that prevents errors feels like a definitional statement for functional programming, so MLs and friends will typically have much more complex type systems.
How this applies to Go, though, I’m not sure. Go’s designers definitely want their types and modelled concepts to be as simple as possible, but the way they keep on adding more complex features to the language suspects that they’ve not found quite the right local maxima yet.
Yes, they go after the second priority — dev tooling.
But the extent to which they are successful at it depends — in my experience — more on the later priorities than the earlier ones. That is to say: dev tooling by itself is a high priority. But to get that dev tooling, you need to be able to correctly model the concepts that your users want to model. Be that complex dynamic types as in Python or Typescript, data types as in many functional languages, or lifetimes as @withoutboats points out in a sibling comment. If you can’t model that (and I don’t think e.g. mypy does model that very well), then the type system isn’t very useful.
This is why I think your list matches what users of a type system want to use, but doesn’t necessarily match the priorities from a language designer perspective.
You don’t need types here! Erlang has sum types, and it doesn’t have a null problem, but it also doesn’t have types.
I’m a bit sceptical here, but I admit I have almost no practical experience with Erlang & friends. I’ve used sum types plenty in Javascript, though, and in my experience they work okay up to a point, but they’re so much more usable with a type system to validate things.
Erlang’s type annotations support union types but not sum types. It’s a dynamic language so you can pass nil to a function that expects a tuple, which is just like the null problem – tho I suspect that Erlang’s usual coding style makes it less of a problem than in other languages. I don’t know if dializer is strict enough that you can use it to ensure invalid states are unrepresentable.
I periodically attempt to take some of my Python libraries – which do use type hints as documentation – and get mypy to run on them.
I have never had mypy catch an actual type error when doing this (meaning, a situation where an attempt was made to use a value of a type incompatible with the expected one described in the annotation). I have, however, gone down far more rabbit holes than I’d care to in an attempt to figure out how to express things in a way mypy will understand.
My most recent attempt at this involved a situation where mypy’s type-narrowing facilities left a lot to be desired. I am going to describe this here so you can see what I mean, and so you can get a sense of the frustration I felt. And to be clear: I am not a newbie to either Python (which I’ve been doing professionally for decades) or static typing (I first learned Java and C back in the mid-2000s).
So. The real code here isn’t particularly relevant. What you need to know is that it’s a list of values each of which (because they’re being read from environment variables which might or might not be set) is initially str | None. The code then did an if not all(the_list): check and would bail out with an exception in that branch. Which, crucially, means that all code past that point can safely assume all the values have been narrowed to str (since if any of them were None, the all() check would have failed).
Later code would start checking to see if the values were URL-like, because ultimately that’s what they’re supposed to be. So imagine some code like this for a simplified example:
items: list[str | None]
# Now imagine some code that fills in the list...
if all(items):
print(item.startswith("http://") for item in items)
But mypy looks at this perfectly idiomatic and perfectly safe Python code and says error: Item "None" of "str | None" has no attribute "startswith" [union-attr]. Because although we humans can clearly see that the type of items must have been narrowed, mypy can’t. OK, mypy’s documentation suggests an is not None check will narrow an optional type:
if all(item is not None for item in items):
print(item.startswith("http://") for item in items)
But no, that gets the same error from mypy. So does this, though mypy says isinstance() checks can be used for narrowing:
if all(isinstance(item, str) for item in items):
print(item.startswith("http://") for item in items)
The actual problem, of course, is mypy doesn’t understand that all() would return False if any of the values actually were None, and so cannot infer from the return value of all() that the type has in fact been narrowed from str | None to just str. We have to help it. If you’re actually reading mypy’s type-narrowing docs, the next thing it will suggest is writing a guard function with TypeGuard. OK:
from typing import TypeGuard
def guard_str(value: list[str | None]) -> TypeGuard[list[str]]:
"""
Narrowing type guard which indicates whether the given value
is a list of strings.
"""
return all(isinstance(v, str) for v in value)
if guard_str(items):
print(item.startswith("http://") for item in items)
And mypy actually accepts this! But there’s a problem: remember I wanted to do an if not all(items): to bail out with an error, and then have a clean path beyond that where the type has been narrowed from str | None to str? Well, turns out TypeGuard can only narrow the “true” branch of a conditional. To narrow both branches, you need to use TypeIs instead. OK, so here’s the TypeIs version:
from typing import TypeIs
def typeis_str(value: list[str | None]) -> TypeIs[list[str]]:
"""
Narrowing type guard which indicates whether the given value
is a list of strings.
"""
return all(isinstance(v, str) for v in value)
if typeis_str(items):
print(item.startswith("http://") for item in items)
So naturally mypy accepts that, right?
Haha, just kidding:
typeis.py:9: error: Narrowed type "list[str]" is not a subtype of input type "list[str | None]" [narrowed-type-not-subtype]
typeis.py:19: error: "Never" has no attribute "__iter__" (not iterable) [attr-defined]
You see, TypeGuard doesn’t care about generic type variance, but TypeIsdoes! And it turns out list is defined by the bolted-on Python “static type” system to be invariant. So now we have to go redefine everything to use a different generic type. Probably the best choice here is Sequence, which is covariant.
from collections.abc import Sequence
from typing import TypeIs
items: Sequence[str | None]
# Now imagine some code that fills in the list...
def typeis_str(value: Sequence[str | None]) -> TypeIs[Sequence[str]]:
"""
Narrowing type guard which indicates whether the given value
is a sequence of strings.
"""
return all(isinstance(v, str) for v in value)
if typeis_str(items):
print(item.startswith("http://") for item in items)
This, finally, will do the correct thing. It satisfies the type narrowing problem in both branches of a conditional, which is what’s ultimately wanted, and does so in a way that makes the narrowing obvious to mypy. And it only took, what, half a dozen tries and a bunch of frustrating errors? Again, I’m not new to Python or to static typing, and even though I actually understood the reason for most of the errors after a quick skim, this was still an incredible amount of pointless and frustrating busy-work just to satisfy mypy of something that mypy should have been able to figure out from the initial idiomatic implementation. And, worse, this has introduced expensive runtimeisinstance() checks as a cost of making the “static” type checker happy!
All of which is just the most recent of many examples of why I continue to add type hints to my code as documentation, because I do find them useful for that purpose, but also continue not to run mypy as part of my linting suite and why I do not include a py.typed file in my own packages.
My experience is with Typescript vs Javascript, but I’ve converted two or three codebases over to Typescript at this point, and each time the act of defining types for functions where the types weren’t entirely clear has helped me find bugs. It’s also allowed me to remove overly-defensive code and make the code easier to read overall.
I think part of this is that Typescript has a more powerful type system that better models real-world Javascript code (whereas Mypy’s always feels like Java with some light sugar on top). But I also suspect that Javascript is a lot easier to model than Python, as there are fewer opportunities for wild, dynamic code, and even when there are, most Javascript developers tend to avoid that style unless it’s really useful.
For your example specifically, Array#every, which is roughly the equivalent of all(...), does include the requisite type information to correctly handle this case:
declare const items: Array<string | null>;
// `.every(...)` always requires a predicate to be passed in, unlike `all(...)`, which
// means this is probably exactly how you'd write this in Javascript without types
if (items.every(items => items !== null)) {
for (const item of items) {
console.log(item.startsWith("http://"));
}
}
JavaScript is kind of an interesting example to bring up, because people love to talk about making Python “strongly typed” when it already is. You could make a sort of grid of languages like this to show the difference:
I can see how the number of implicit type conversion behaviors in JavaScript, which you mostly have to just know and remember not to trip over, would lead to a desire to work in something a bit stricter, and how doing so could yield benefits in code quality.
And TypeScript is also kind of a different example because it’s not required to remain syntactically compatible with JavaScript. “Typed Python”, on the other hand, does have to maintain syntactic compatibility with plain Python (which is why several things from the static-type-checking Python world have had to be imported into Python itself).
But I also stand by the fact that mypy has never uncovered an actual bug in Python code I’ve run it on. It’s only ever uncovered weird limitations of mypy requiring workarounds like the ones described in my comment above.
I absolutely agree with your last part. My experience with Python has been very similar, and I’ve stopped using Mypy much these days, even when working with Python, because there are too many cases that it found but weren’t actual bugs, and too many actual bugs that it didn’t catch for one reason or another.
But I think that’s largely because Mypy isn’t very good at typechecking idiomatic Python code, and not because the concept as a whole is flawed.
I do wonder, though, if Mypy would have worked better from the start if annotations had always been lazy — or even if they’d only been available at runtime as strings. This would have given the type checkers more chances to experiment with syntax without needing changes to the runtime.
I think it is important to remember that Mypy is one of the implementations of PEP-484, and while it is probably the most famous and popular, it is not the only one. Also important to note that PEP-484 does not define how the types should be enforced (it defines a few things to allow interoperability, but leave the majority of behaviors for the implementors). Heck, they’re essentially comment strings for the interpreter (especially after PEP-563 2).
For example, Pyright explicitly tries to infer more things than Mypy 3, while the fact that Mypy doesn’t is a design choice 4.
This is true, but the fact that until PEP563 the type annotations were interpreted and therefore needed to be valid, and the way that after 563 they’re still semi-interpretable (IIRC the __future__ annotations import is now discouraged because it has other weird side effects and they’re looking in a different direction, but I’ve not hugely been following the discussion) — all that means that annotations are still very much tied to Python-defined semantics. You couldn’t easily, for example, define a custom mapped type syntax using dict expressions (similar to Typescript’s mapped types), because a lot of stuff would break in weird and wonderful ways.
Like I say, I’ve given up on this stuff for now, and I’m hoping that it might get better at some point, but last time I used it Pyright was more strict, but only in the sense that I needed to contort my code more aggressively to make it work. (IIRC, the last problem I ran into with Pyright was that it inferred a value as having type Unknown, and then started raising angry type errors, even though I never interacted with the value, and therefore it being Unknown was of no consequence.)
I am using Pyright as my LSP while I used mypy --strict in this project and they never disagree. But to be clear, this is a small project with ~2000 lines of code, and also doesn’t interact with external APIs so I think this avoid the weird corner cases of Mypy.
I have never had mypy catch an actual type error when doing this (meaning, a situation where an attempt was made to use a value of a type incompatible with the expected one described in the annotation). I have, however, gone down far more rabbit holes than I’d care to in an attempt to figure out how to express things in a way mypy will understand.
My experience is quite the opposite, mypy did catch lots of typing errors that would otherwise just cause issues in runtime (or maybe not, but it was definitely doing something not the way I wanted to express). One example from yesterday, I was using subprocess.run (actually a wrapper above it, more details below) and wanted to pass an env parameter (that is a Mapping), and I did something like:
subprocess.run(["foo"], env={"ENV", "foo"}
Do you see the error? Probably yes, since this is only one line of the code that I was working for, but I didn’t, and mypy gladly got the issue. And even better, this was before I tested the code, and once I run for the first time (after fixing all mypy errors), the code Just Worked (TM).
The other day I was refactoring the code of the same project and I was not running mypy in the tests yet. I decided to setup mypy in the tests just to be sure and boom, I forgot to change the caller in the tests. Since they’re mocked, they still passed, but the tests didn’t made any sense. While it was annoying to type the test methods itself (that I don’t think makes much sense), after that experience I decided that I definitely need mypy running in my tests too.
I concur that mypy is weird sometimes, and far from perfect. For example, I got some really strange issues when trying to write a wrapper for another method (again, subprocess.run) and typing its keyword arguments. Something like:
class RunArgs(TypedDict, total=False):
...
env: dict[str, str] | None # actually a Mapping[StrOrBytesPath] | None, but decided to simplify here
def run_wrapper(args: list[str], extra_env: dict[str, str] | None = None, **kwargs: Unpack[RunArgs]):
env = kwargs.get("env")
if extra_env:
env = (env or os.environ) | extra_env
subprocess.run(args, env=env)
And this caused some weird issues. Moving env to the method parameter instead worked and even simplified the code, but yes, far from ideal.
But even with the somewhat strange issue, I still thing that mypy was a huge plus in this project. The fact that I can refactor code and be confident that if the tests and mypy pass it is very likely that the code is still correct is worth the work that I have to do sometimes to “please” mypy.
I’ve definitely found errors when converting python programs to have type hints, but not as many afterwards; perhaps it’s just that having the type annotations makes it harder for me (and other people working on it) to write incorrect code? Either way, I fully agree that mypy/pyright work in ways that are just too annoying and I disable them in IDE and linting unless I’m required to have them.
One thing that I found valuable in mypy is during refactoring. I just refactored a function parameter from bool to str | None and while this didn’t broke any tests (because in the function I was doing something like if v: something, so it doesn’t really matter if I am passing False or None as v), mypy correctly identified the issue.
You could argue that it doens’t matter since this didn’t break the logic, but I think it does. Having tests with incorrect parameters, even if they works, are confusing because you lose the context of what the tests are actually trying to test.
The author kind of undermines their complaint by solving the problem with a clever and (I think) tolerably neat hack: using a “contract” type to describe the relationship between the various types in play and to break the recursion loop. They also note that C++ needs a clever trick (the curiously recurring template pattern) to cope with situations like this. I agree with the sentiment that modern languages should be held to much higher standards than C++ but I wish the author’s clever solution was more prominent – I fear some readers might bounce off the article and miss the best bit.
Dolt has been written in Go since before generics existed. It seems like the current team has marketing strategy of making blog posts with Go (or “Golang”) in the title to increase product awareness. But they also don’t seem to be very good programmers, e.g. https://www.dolthub.com/blog/2024-10-04-reflecting-on-reflect/ where Copilot wrote bad code, it panicked and they blamed Go. I dunno, maybe the original CTO who picked Go left a while ago and the new recruits don’t like it? I don’t really understand it.
This is a bit unkind, but this thought has been bubbling up for the last several posts that hit here from Dolt, and this kind of set me off about it… You can look through their recent history and decide for yourself: https://lobste.rs/domains/dolthub.com
Well, Go generics are deliberately incomplete, eg methods can’t be generic. That was a design decision.
Interestingly, one of the predecessor languages to influence it (Alef) actually had generic methods.
From its reference manual, which offers:
So at least two of the authors were (or should have been) aware of that form.
Go does have this level of generics
What Go is missing having methods with their own set of generic type parameters distinct from the parameters introduced by the receiver. See https://go.dev/doc/faq#generic_methods.
Setting aside the issue of whether Go’s generics are incomplete or not, the author might have made a stronger case by showing one or more implementations in alternative languages that have “complete” generics support to provide contrast.
As it stands, they just show that the interface-based approach in Go requires runtime-checked downcasts.
It’s as easy as:
This is precisely where the conventional type safety wisdom conflicts with the apparent Go philosophy. The “whole purpose of having a type system” is not for having the compiler check things for you. That’s one purpose. But it certainly isn’t the only one and isn’t necessarily the most important one either. There is a balance between what’s worth checking or not vs other concerns.
The “simplest” approach to solving the problem discussed in this article is to add some type casts and move on. Avoid attempting to solve a puzzle you’ve created for yourself.
I’m curious what you think the point of a type system is? Because when I use a dynamically type language, what I miss is the automatic checking. When I’ve seen people just try to use type annotations as documentation in say Python and then later switch to actually checking them mypy it turns out often they got the annotation wrong and mypy is effectively catching a documentation bug. So I really don’t see the value without the checking part.
Purposes of the type system, in order of importance:
I would emphasize two of those points as being more significant than they sound from that list.
First, function type signatures aren’t merely a useful thing to read, they’re computer verified documentation. For example:
And you can’t forget to update these docs because the compiler checks them for you.
Second, the sorts of errors that type systems eliminate aren’t just null pointers and trivial errors like typos. The sorts of bugs that a type system prevents can catch can be arbitrarily complex. It only seems like it’s preventing local bugs because the type system makes them local. Without type signatures, they aren’t local.
What I am saying here is that there are diminishing returns here: most value in detecting simple bugs fast, not at preventing complex bugs.
I really need to get to writing a proper post here, but, generally, type-system as bug prevention mechanism only works for code that you haven’t run even once. So, error handling and concurrency. For normal code, the added safety is marginal, and is dwarfed by dev tooling improvements.
I would also consider type checking of “ownership patterns” whether through monads, substructural types, lifetime analysis, or whatever else to also be a very important property of type systems, just one most currently type systems don’t give you much help with. Bugs in this area (especially in the case of concurrency) are demonstrably common and difficult to diagnose and repair.
On the other hand, I also think there are diminishing returns to trying to encode increasingly complex correctness contracts into the type system. Quickly it seems to me the difficulty of understanding how the contract has been encoded and how to use it properly outweighs the difficulty of avoiding the error yourself. The cost can be worth it in safety critical systems, I assume.
Based on the most common use of generics in the most commonly used language with generics (Java) I would say this is empirically incorrect. The top two motivations are clearly: tooling assistance, and preventing putting values of the wrong type inside generic collections.
But Java got generics only in version 6! Hence it follows that the thing that Java-style generics are useful for are not the top priorities for a type-system!
The top priorities of an (uninvented) type system are the properties desired by the inventors of the type system.
The top priorities of users wrt to type systems are the revealed preferences of the users of type systems.
Which was 18(!) years ago.
Also, your other list of priorities seems rather subjective. I feel that checking many different properties at compile time is by far the biggest benefit of a type system, far above performance under most circumstances.
Counterargument: Typescript, Mypy, Sorbet, even Gleam all provide type systems that don’t and cannot make the code faster. I think there’s also an important case between “detect nulls” and “units of measure” which is features like ADTs that allow for the whole “make invalid states impossible” thing to happen.
More fundamentally, I think this is a roughly accurate list of how much different features of a type system are used, but I think it’s not necessarily right to call that importance. For example, Typescript is definitely designed to support LSP features in Javascript even when the type system isn’t fully there, and it often gets sold on the typo handling stuff — these are key features to Typescript’s success. But Mypy can do this stuff as well, yet my impression is Mypy hasn’t found half as much success in Python codebases as Typescript has in Javascript codebases.
I suspect this is because Typescript does a much better job of modelling the implicit types that already existed in Javascript codebases than Mypy does for Python. (This is partly because Typescript is more powerful, but I suspect also because Javascript is less dynamic, and so easier to model.) This, then, is the more important feature of a type system: it can model the problems that developers are using it for. If it can’t do that, then it is not fit for purpose and will be adjusted or replaced. But if it is sufficient for that purpose, then people will use it in roughly the order your describe in your list.
That said, I do think “the problems that developers are using it for” is such a broad statement that it will look differently for different languages. For example, you probably don’t want to model complex concepts in C, so C’s type system can remain relatively simple. Whereas modelling functions of complex states in a way that prevents errors feels like a definitional statement for functional programming, so MLs and friends will typically have much more complex type systems.
How this applies to Go, though, I’m not sure. Go’s designers definitely want their types and modelled concepts to be as simple as possible, but the way they keep on adding more complex features to the language suspects that they’ve not found quite the right local maxima yet.
Yes, they go after the second priority — dev tooling.
You don’t need types here! Erlang has sum types, and it doesn’t have a null problem, but it also doesn’t have types.
But the extent to which they are successful at it depends — in my experience — more on the later priorities than the earlier ones. That is to say: dev tooling by itself is a high priority. But to get that dev tooling, you need to be able to correctly model the concepts that your users want to model. Be that complex dynamic types as in Python or Typescript, data types as in many functional languages, or lifetimes as @withoutboats points out in a sibling comment. If you can’t model that (and I don’t think e.g. mypy does model that very well), then the type system isn’t very useful.
This is why I think your list matches what users of a type system want to use, but doesn’t necessarily match the priorities from a language designer perspective.
I’m a bit sceptical here, but I admit I have almost no practical experience with Erlang & friends. I’ve used sum types plenty in Javascript, though, and in my experience they work okay up to a point, but they’re so much more usable with a type system to validate things.
Erlang’s type annotations support union types but not sum types. It’s a dynamic language so you can pass
nil
to a function that expects a tuple, which is just like the null problem – tho I suspect that Erlang’s usual coding style makes it less of a problem than in other languages. I don’t know if dializer is strict enough that you can use it to ensure invalid states are unrepresentable.No, Erlang is qualitatively different to Java and Python with respect to null.
If someting is nullable in Erlang, then non-null is represented as
{ok, Value}
rather than justValue
.That is, if you plug a nullable result of a function into a non-nullable argument, the thing blows up even if non-null is returned!
In contrast, Java/Python only blow up when null/None actually happen.
I periodically attempt to take some of my Python libraries – which do use type hints as documentation – and get mypy to run on them.
I have never had mypy catch an actual type error when doing this (meaning, a situation where an attempt was made to use a value of a type incompatible with the expected one described in the annotation). I have, however, gone down far more rabbit holes than I’d care to in an attempt to figure out how to express things in a way mypy will understand.
My most recent attempt at this involved a situation where mypy’s type-narrowing facilities left a lot to be desired. I am going to describe this here so you can see what I mean, and so you can get a sense of the frustration I felt. And to be clear: I am not a newbie to either Python (which I’ve been doing professionally for decades) or static typing (I first learned Java and C back in the mid-2000s).
So. The real code here isn’t particularly relevant. What you need to know is that it’s a list of values each of which (because they’re being read from environment variables which might or might not be set) is initially
str | None
. The code then did anif not all(the_list):
check and would bail out with an exception in that branch. Which, crucially, means that all code past that point can safely assume all the values have been narrowed tostr
(since if any of them wereNone
, theall()
check would have failed).Later code would start checking to see if the values were URL-like, because ultimately that’s what they’re supposed to be. So imagine some code like this for a simplified example:
But mypy looks at this perfectly idiomatic and perfectly safe Python code and says
error: Item "None" of "str | None" has no attribute "startswith" [union-attr]
. Because although we humans can clearly see that the type ofitems
must have been narrowed, mypy can’t. OK, mypy’s documentation suggests anis not None
check will narrow an optional type:But no, that gets the same error from mypy. So does this, though mypy says
isinstance()
checks can be used for narrowing:The actual problem, of course, is mypy doesn’t understand that
all()
would returnFalse
if any of the values actually wereNone
, and so cannot infer from the return value ofall()
that the type has in fact been narrowed fromstr | None
to juststr
. We have to help it. If you’re actually reading mypy’s type-narrowing docs, the next thing it will suggest is writing a guard function withTypeGuard
. OK:And mypy actually accepts this! But there’s a problem: remember I wanted to do an
if not all(items):
to bail out with an error, and then have a clean path beyond that where the type has been narrowed fromstr | None
tostr
? Well, turns outTypeGuard
can only narrow the “true” branch of a conditional. To narrow both branches, you need to useTypeIs
instead. OK, so here’s theTypeIs
version:So naturally mypy accepts that, right?
Haha, just kidding:
You see,
TypeGuard
doesn’t care about generic type variance, butTypeIs
does! And it turns outlist
is defined by the bolted-on Python “static type” system to be invariant. So now we have to go redefine everything to use a different generic type. Probably the best choice here isSequence
, which is covariant.This, finally, will do the correct thing. It satisfies the type narrowing problem in both branches of a conditional, which is what’s ultimately wanted, and does so in a way that makes the narrowing obvious to mypy. And it only took, what, half a dozen tries and a bunch of frustrating errors? Again, I’m not new to Python or to static typing, and even though I actually understood the reason for most of the errors after a quick skim, this was still an incredible amount of pointless and frustrating busy-work just to satisfy mypy of something that mypy should have been able to figure out from the initial idiomatic implementation. And, worse, this has introduced expensive runtime
isinstance()
checks as a cost of making the “static” type checker happy!All of which is just the most recent of many examples of why I continue to add type hints to my code as documentation, because I do find them useful for that purpose, but also continue not to run mypy as part of my linting suite and why I do not include a
py.typed
file in my own packages.My experience is with Typescript vs Javascript, but I’ve converted two or three codebases over to Typescript at this point, and each time the act of defining types for functions where the types weren’t entirely clear has helped me find bugs. It’s also allowed me to remove overly-defensive code and make the code easier to read overall.
I think part of this is that Typescript has a more powerful type system that better models real-world Javascript code (whereas Mypy’s always feels like Java with some light sugar on top). But I also suspect that Javascript is a lot easier to model than Python, as there are fewer opportunities for wild, dynamic code, and even when there are, most Javascript developers tend to avoid that style unless it’s really useful.
For your example specifically,
Array#every
, which is roughly the equivalent ofall(...)
, does include the requisite type information to correctly handle this case:JavaScript is kind of an interesting example to bring up, because people love to talk about making Python “strongly typed” when it already is. You could make a sort of grid of languages like this to show the difference:
I can see how the number of implicit type conversion behaviors in JavaScript, which you mostly have to just know and remember not to trip over, would lead to a desire to work in something a bit stricter, and how doing so could yield benefits in code quality.
And TypeScript is also kind of a different example because it’s not required to remain syntactically compatible with JavaScript. “Typed Python”, on the other hand, does have to maintain syntactic compatibility with plain Python (which is why several things from the static-type-checking Python world have had to be imported into Python itself).
But I also stand by the fact that mypy has never uncovered an actual bug in Python code I’ve run it on. It’s only ever uncovered weird limitations of mypy requiring workarounds like the ones described in my comment above.
I absolutely agree with your last part. My experience with Python has been very similar, and I’ve stopped using Mypy much these days, even when working with Python, because there are too many cases that it found but weren’t actual bugs, and too many actual bugs that it didn’t catch for one reason or another.
But I think that’s largely because Mypy isn’t very good at typechecking idiomatic Python code, and not because the concept as a whole is flawed.
I do wonder, though, if Mypy would have worked better from the start if annotations had always been lazy — or even if they’d only been available at runtime as strings. This would have given the type checkers more chances to experiment with syntax without needing changes to the runtime.
I think it is important to remember that Mypy is one of the implementations of PEP-484, and while it is probably the most famous and popular, it is not the only one. Also important to note that PEP-484 does not define how the types should be enforced (it defines a few things to allow interoperability, but leave the majority of behaviors for the implementors). Heck, they’re essentially comment strings for the interpreter (especially after PEP-563 2).
For example, Pyright explicitly tries to infer more things than Mypy 3, while the fact that Mypy doesn’t is a design choice 4.
This is true, but the fact that until PEP563 the type annotations were interpreted and therefore needed to be valid, and the way that after 563 they’re still semi-interpretable (IIRC the
__future__
annotations import is now discouraged because it has other weird side effects and they’re looking in a different direction, but I’ve not hugely been following the discussion) — all that means that annotations are still very much tied to Python-defined semantics. You couldn’t easily, for example, define a custom mapped type syntax using dict expressions (similar to Typescript’s mapped types), because a lot of stuff would break in weird and wonderful ways.Like I say, I’ve given up on this stuff for now, and I’m hoping that it might get better at some point, but last time I used it Pyright was more strict, but only in the sense that I needed to contort my code more aggressively to make it work. (IIRC, the last problem I ran into with Pyright was that it inferred a value as having type
Unknown
, and then started raising angry type errors, even though I never interacted with the value, and therefore it beingUnknown
was of no consequence.)I am using Pyright as my LSP while I used
mypy --strict
in this project and they never disagree. But to be clear, this is a small project with ~2000 lines of code, and also doesn’t interact with external APIs so I think this avoid the weird corner cases of Mypy.My experience is quite the opposite,
mypy
did catch lots of typing errors that would otherwise just cause issues in runtime (or maybe not, but it was definitely doing something not the way I wanted to express). One example from yesterday, I was usingsubprocess.run
(actually a wrapper above it, more details below) and wanted to pass anenv
parameter (that is aMapping
), and I did something like:Do you see the error? Probably yes, since this is only one line of the code that I was working for, but I didn’t, and
mypy
gladly got the issue. And even better, this was before I tested the code, and once I run for the first time (after fixing allmypy
errors), the code Just Worked (TM).The other day I was refactoring the code of the same project and I was not running
mypy
in the tests yet. I decided to setupmypy
in the tests just to be sure and boom, I forgot to change the caller in the tests. Since they’re mocked, they still passed, but the tests didn’t made any sense. While it was annoying to type the test methods itself (that I don’t think makes much sense), after that experience I decided that I definitely needmypy
running in my tests too.I concur that
mypy
is weird sometimes, and far from perfect. For example, I got some really strange issues when trying to write a wrapper for another method (again,subprocess.run
) and typing its keyword arguments. Something like:And this caused some weird issues. Moving
env
to the method parameter instead worked and even simplified the code, but yes, far from ideal.But even with the somewhat strange issue, I still thing that
mypy
was a huge plus in this project. The fact that I can refactor code and be confident that if the tests andmypy
pass it is very likely that the code is still correct is worth the work that I have to do sometimes to “please”mypy
.I’ve definitely found errors when converting python programs to have type hints, but not as many afterwards; perhaps it’s just that having the type annotations makes it harder for me (and other people working on it) to write incorrect code? Either way, I fully agree that mypy/pyright work in ways that are just too annoying and I disable them in IDE and linting unless I’m required to have them.
One thing that I found valuable in
mypy
is during refactoring. I just refactored a function parameter frombool
tostr | None
and while this didn’t broke any tests (because in the function I was doing something likeif v: something
, so it doesn’t really matter if I am passingFalse
orNone
asv
),mypy
correctly identified the issue.You could argue that it doens’t matter since this didn’t break the logic, but I think it does. Having tests with incorrect parameters, even if they works, are confusing because you lose the context of what the tests are actually trying to test.
The author kind of undermines their complaint by solving the problem with a clever and (I think) tolerably neat hack: using a “contract” type to describe the relationship between the various types in play and to break the recursion loop. They also note that C++ needs a clever trick (the curiously recurring template pattern) to cope with situations like this. I agree with the sentiment that modern languages should be held to much higher standards than C++ but I wish the author’s clever solution was more prominent – I fear some readers might bounce off the article and miss the best bit.
I think this is overcomplicated. They have two types and want to convert the concrete types. They don’t need to be methods on the types!!
Why not embed the map struct in specialized types, and skip the generics?
Because they need to write 5 blog posts a week.