In this series looking at features introduced by every version of Python 3, this is the second looking at Python 3.5. In it we examine another of the significant new features in this release, type hinting.
This is the 9th of the 34 articles that currently make up the “Python 3 Releases” series.
This is the second article looking at features added in Python 3.5. It was quite a milestone release in a number of ways, so I’m trying to give the more major features proper coverage. Last time we looked at coroutines; this time it’s the turn of type hinting.
The syntax for annotating function arguments was added way back in Python 3.0, but the syntax was all that was specified: the semantics of the annotations were left for each programmer to define themselves. Probably unsurprisingly, the use to which most people put this syntax was type annotations. Since Python’s dynamic typing can sometimes make it a little tricky to work out what type is expected in a particular context, type annotations are a helpful form of documentation. Furthermore, they enable static analysis tools to perform correctness checking which wouldn’t be possible without annotations. For these reasons, type information is an obvious candidate for annotations.
In response to this, the Python maintainers decided that type annotations made sense as an addition to the standard library, so everyone could benefit from a standard method to apply these annotations. This allows tooling to develop around this standard, which is much less likely to happen if everyone uses subtly different ways to achieve the same end, and it’s also just less work for everyone.
There are two important points to stress here, however. The first is that although this release adds features to the standard library to add type annotations in a standardised way, it doesn’t actually add any type-checking features to the language or the library. Fortunately when 3.5 was released there was already the mypy utility to perform this static analysis, and it makes full use of the new syntax.
The second point to stress is that this is still, and probably always will be, an optional feature. Nobody is obliged to perform type annotation, either by language requirements or by convention — it’s simply a feature available for anyone who wants to use it. Things have been carefully set up so that code using type hints can be freely mixed with code that doesn’t and all of the runtime behaviour is identical.
Before we jump in and see some code, I wanted to say a few words about subtypes. Subtyping underpins all of the rules implemented by the Python type-checkers, so it’s important to be familiar with it. Some of you may already be experts, so you might like to skim this section.
You’re probably quite used to hearing about subtyping as a synonym for subclassing in inheritance hierarchies. One key point to note, however, is that subtyping is a relationship between any types, however they’re declared.
Colloquially, declaring that `TypeSub` is a subtype of `TypeSuper` is essentially saying that any code which expects `TypeSuper` would also work successfully with `TypeSub`. But how do we define this more rigorously? It boils down to two requirements:

- Every value of `TypeSub` must also be a possible value for `TypeSuper`.
- Every function that can be called on `TypeSuper` must also be callable on `TypeSub`.

If both of these requirements hold true then `TypeSub` is indeed a subtype of `TypeSuper`. An example of this is that `int` is a subtype of `float`, since integers are a subset of the real numbers[1]. The canonical example is that a subclass is a subtype of all its parent classes.
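To make the two requirements a little more concrete, here is a minimal sketch. The class and function names (`Animal`, `Dog`, `describe`, `halve`) are hypothetical and chosen purely for illustration; the point is only that code written against the supertype keeps working when handed the subtype.

```python
class Animal:
    def speak(self) -> str:
        return "..."


class Dog(Animal):
    # Dog inherits everything Animal offers, so any function that can
    # be called on an Animal can also be called on a Dog.
    def speak(self) -> str:
        return "Woof"


def describe(animal: Animal) -> str:
    # Relies only on operations that Animal provides.
    return "It says " + animal.speak()


def halve(value: float) -> float:
    # Type checkers treat int as acceptable wherever float is expected.
    return value / 2


print(describe(Dog()))
print(halve(3))
```

Both calls pass a subtype where the supertype is declared, which is exactly the substitution that the subtype relationship permits.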
Right, that’s enough theory, let’s look at some practice.
Let’s kick off by considering an extremely simple example:
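A listing along these lines can be sketched roughly as follows. Only the signature of `my_function()` and the fact that the calls on lines 5 and 6 pass a `float` come from the text; the function body and the argument values are assumptions made for illustration.

```python
def my_function(arg: int) -> float:  # takes an int, returns a float
    return arg / 2  # assumed body, purely illustrative

my_function(4)    # fine: an int, as declared
my_function(4.5)  # runs, but a checker should flag the float argument
my_function(7.0)  # likewise: 7.0 is a float, not an int
```

All three calls execute without complaint, which is the point: the hints only matter to a static checker.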
As is hopefully obvious, we’re declaring `my_function()` here to take a single `int` argument and return a `float` result. Then we call this function a few times, including a couple of times that break the type-checking rules. Python will execute this fine and no errors or warnings will be emitted. If we run it under `mypy`, however, we see the problems:
type-hints.py:5: error: Argument 1 to "my_function" has incompatible type "float"; expected "int"
type-hints.py:6: error: Argument 1 to "my_function" has incompatible type "float"; expected "int"
Found 2 errors in 1 file (checked 1 source file)
So far so simple, looks like this is going to be a very short article!
Slightly less simple is the issue of container types such as `list` and `dict`. These are more complicated not just because they contain other types, but also because they can contain heterogeneous types (i.e. values within them can have different types to each other). Let’s ignore the issue of heterogeneity for now (we’ll come back to that a little later) and just consider homogeneous cases, where every contained item has the same type.

To represent these we have our first encounter with the new `typing` module. This provides a number of utility classes which are useful for declaring type hints. It’s important to note that these generic classes are not equivalent to the types themselves: you can’t construct an instance of them, you can only use them in type annotations.

Taking the case of `list` as an example, there is a class `typing.List` to represent this. On its own, that would indicate an unrestricted heterogeneous list, which can contain any types at all. This isn’t a particularly useful type hint, however, so you can use square brackets to indicate the type of the contained item, as in `typing.List[int]`. This indicates that you expect every item in that list to be an `int`.
The code snippet below illustrates various combinations:
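The combinations in question can be sketched as below. The function names `one()` and `three()` and the `List[int]` hint come from the text; the name `two()`, the function bodies and the argument values are assumptions, and the line numbers will not line up with the `mypy` output quoted afterwards.

```python
from typing import List


def one(values: List) -> int:
    # Bare List: no restriction on the item types.
    return len(values)


def two(values: List[int]) -> int:
    # Every item is expected to be an int.
    return sum(values)


def three(values: List[float]) -> float:
    # Every item is expected to be a float; int is accepted too.
    return float(sum(values))


# All of these run without complaint at runtime.
print(one([1, 2, 3]), one([1.1, 2.2, 3.3]), one([1, 2, 3.3]))
print(two([1, 2, 3]))
print(two([1.1, 2.2, 3.3]))  # a checker should reject all three items
print(two([1, 2, 3.3]))      # a checker should reject only the last item
print(three([1, 2, 3]), three([1.1, 2.2, 3.3]), three([1, 2, 3.3]))
```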
The output of running `mypy` is as follows:
type-hints-containers.py:17: error: List item 0 has incompatible type "float"; expected "int"
type-hints-containers.py:17: error: List item 1 has incompatible type "float"; expected "int"
type-hints-containers.py:17: error: List item 2 has incompatible type "float"; expected "int"
type-hints-containers.py:21: error: List item 2 has incompatible type "float"; expected "int"
Found 4 errors in 1 file (checked 1 source file)
All the calls to `one()` are fine, because the type hint indicates the items in the list can be of any type. Similarly, all the calls to `three()` are fine because both `int` and `float` are subtypes of `float`. The problems are on line 17, where all three `float` values are a mismatch for the `List[int]` argument, and line 21 where the single `float` in the list is a mismatch.
Within the `typing` module are classes to represent the builtin container types, including those from modules such as `collections`. These include:

- `DefaultDict`
- `Deque`
- `Dict`
- `FrozenSet`
- `List`
- `Set`

You’ll notice `Tuple` is missing from the list. This is because it’s covered below in the Typing Primitives section. The reason it’s different is that all of the above are typed homogeneously: every item in the container has the same type specifier. As you’ll see later, however, `Tuple` has a different specifier for each element[2].
There’s also a typed version of `collections.namedtuple` called `typing.NamedTuple`. This is actually a concrete rather than generic type, and it’s used to declare the class itself rather than just as a type hint. These two declarations are functionally equivalent, aside from the addition of the type hints:
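
    import collections
    import typing

    # Typed declaration: each field is given as a (name, type) pair.
    Student = typing.NamedTuple(
        "Student",
        [("name", str), ("address", str), ("age", int)]
    )

    # Untyped equivalent using collections.namedtuple.
    Student = collections.namedtuple(
        "Student",
        ["name", "address", "age"]
    )
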
As an aside, some additional features in Python 3.6 make these rather easier to define, which I’ll cover in a future article.
That still leaves us with some questions, however: how would we indicate that we want our function to take an iterable, but we don’t really care about the specific type, whether it’s `list`, `tuple`, or anything else?
Thankfully `typing` still has us covered, and provides classes that correspond to the different abstract container types provided by `collections.abc`. For example, if we just want any object that’s a read-only `Sequence` (i.e. provides `__len__()` and `__getitem__()`) of `int` then we can declare the parameter as `typing.Sequence[int]`.
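As a small illustration, the sketch below declares a parameter as `typing.Sequence[int]` and uses only `len()` and indexing on it. The function name `total()` is hypothetical.

```python
from typing import Sequence


def total(values: Sequence[int]) -> int:
    # Uses only __len__() and __getitem__(), which any Sequence provides.
    result = 0
    for index in range(len(values)):
        result += values[index]
    return result


print(total([1, 2, 3]))   # a list is a Sequence
print(total((4, 5, 6)))   # so is a tuple
print(total(range(4)))    # and so is a range
```

Because the hint names the abstract type, callers are free to pass whichever concrete sequence they happen to have.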
Here’s a list of the classes based on those container abstract base classes. Most of the names are the same, but a few of them differ so I’ve included the corresponding classes in `collections.abc` as well. If you want details of the supported operations, check out the `collections.abc` module documentation.

`typing` | `collections.abc` |
---|---|
`AbstractSet` | `Set` |
`AsyncGenerator` | `AsyncGenerator` [3] |
`AsyncIterable` | `AsyncIterable` |
`AsyncIterator` | `AsyncIterator` |
`Awaitable` | `Awaitable` |
`ByteString` | `ByteString` [4] |
`Container` | `Container` |
`Coroutine` | `Coroutine` [5] |
`Generator` | `Generator` [6] |
`Hashable` | `Hashable` |
`ItemsView` | `ItemsView` |
`Iterable` | `Iterable` |
`Iterator` | `Iterator` |
`KeysView` | `KeysView` |
`Mapping` | `Mapping` |
`MappingView` | `MappingView` |
`MutableMapping` | `MutableMapping` |
`MutableSequence` | `MutableSequence` |
`MutableSet` | `MutableSet` |
`Reversible` | `Reversible` [7] |
`Sequence` | `Sequence` |
`Sized` | `Sized` |
`ValuesView` | `ValuesView` |

There are also some additional classes that don’t correspond to equivalents in `collections.abc`, but still represent abstract types:

- `SupportsAbs` for any type that provides `__abs__()`.
- `SupportsFloat` for any type that provides `__float__()`.
- `SupportsInt` for any type that provides `__int__()`.
- `SupportsRound` for any type that provides `__round__()`.

To finish off, a few oddments that didn’t fit into the earlier sections:

- `IO` for IO stream types, although most of the time you probably want to use one of the aliases:
    - `TextIO` is an alias for `IO[str]`.
    - `BinaryIO` is an alias for `IO[bytes]`.
- `Pattern` and `Match` for the objects used by the `re` module.

For more complicated situations, accepting a single type, or a homogeneous container containing a single type, is not sufficient. For these cases, the `typing` module offers some more facilities.
Let’s look at something more complicated. The code below introduces using our own classes with type hints, and also the use of `typing.Union`:
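A condensed sketch of such code is shown here. The class and function names are those mentioned in the surrounding text, but the method name `greet()`, all of the bodies, the return values and the example arguments are assumptions, and the line numbers will not match the `mypy` output quoted further down. The call with a missing argument is left out because it would fail at runtime.

```python
from typing import Union


class Named:
    def __init__(self, name: str) -> None:
        self.name = name


class EnglishGreeter(Named):
    def greet(self, salutation: str) -> str:
        return "Hello, {} {}".format(salutation, self.name)


class FrenchGreeter(Named):
    def greet(self, salutation: str) -> str:
        return "Bonjour, {} {}".format(salutation, self.name)


class GermanGreeter(Named):
    def greet(self, salutation: str) -> str:
        return "Guten Tag, {} {}".format(salutation, self.name)


def show_name(named: Named) -> str:
    # Any subclass of Named is acceptable here.
    return "Name: " + named.name


def show_greeting(greeter: Union[FrenchGreeter, GermanGreeter],
                  salutation: str) -> str:
    # Only the two listed classes (or their subclasses) satisfy the hint.
    return greeter.greet(salutation)


print(show_name(FrenchGreeter("Dupont")))
print(show_greeting(GermanGreeter("Schmidt"), "Herr"))
# Runs fine, but a checker should reject EnglishGreeter here.
print(show_greeting(EnglishGreeter("Smith"), "Mr"))
```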
There are a few minor points to note first. First, note that `self` parameters don’t need to be hinted. Second, note that `__init__()` is annotated as returning `None` (as it does return nothing). Third, note that `None` can be used directly even though really this is a value; this is a special case and can be taken to mean `type(None)` in type hints. Finally, this snippet illustrates subclasses being subtypes, with the subclasses of `Named` being passed to `show_name()`, which expects a `Named` instance. This is exactly as we’d expect, of course, but it’s nice to see it demonstrated.
The significant new feature here is on line 29, where we use `typing.Union` to represent a group of types. A type is a subtype of this union if it’s a subtype of any of the types listed. This means that `show_greeting()` will accept only a `FrenchGreeter` or `GermanGreeter`, or a subclass of either of them, but no other types.
If you run this through `mypy`, it confirms this:
type-hints-unions.py:40: error: Argument 1 to "show_greeting" has incompatible type "EnglishGreeter"; expected "Union[FrenchGreeter, GermanGreeter]"
type-hints-unions.py:46: error: Missing positional argument "salutation" in call to "show_greeting"
type-hints-unions.py:46: error: Argument 1 to "show_greeting" has incompatible type "Named"; expected "Union[FrenchGreeter, GermanGreeter]"
Found 3 errors in 1 file (checked 1 source file)
The issue on line 40 is that we’re passing an `EnglishGreeter`, which doesn’t match anything within the union. It runs fine when executed, mind you; it’s only because of our type hinting that `mypy` raises that error.

There are two issues on line 46, the first of which is simply that we’re missing the second required argument to `show_greeting()`. The second issue is once again because `Named` is not a subtype of any of the classes mentioned in the `Union`.
Now we’ve seen `Union` in action, let’s briefly look at the semantics of this and the other primitives that `typing` offers for defining types.
`Any`

First up is `Any`, which matches any type whether assigned or being assigned to. This is subtly different to using `object`, which is the supertype for all other types. Let’s say you define two functions `accept_any(arg: typing.Any)` and `accept_obj(arg: object)`. They will both accept a parameter of any type, since anything is compatible with either of them. However, if you then attempt to pass that value into another function you’ll find a difference: the value passed into `accept_obj()` can only be used with another function taking `object` or `Any`, whereas the parameter to `accept_any()` can be passed into any other function regardless of the type.

The main function of the `Any` type is to act as the default of every parameter or return type which isn’t otherwise annotated. This is the mechanism which allows source code to be incrementally annotated and still benefit from type-checking, instead of getting no benefit until every piece of code is annotated.
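The difference between the two can be sketched as follows. The functions `accept_any()` and `accept_obj()` are the ones named above, while `takes_int()`, the return types and the bodies are assumptions added for illustration.

```python
from typing import Any


def takes_int(value: int) -> int:
    return value + 1


def accept_any(arg: Any) -> int:
    # Any is compatible in both directions, so a checker allows passing
    # arg straight on to a function that expects an int.
    return takes_int(arg)


def accept_obj(arg: object) -> int:
    # Passing arg directly to takes_int() should be rejected by a checker,
    # because object is not a subtype of int. Narrowing the type with
    # isinstance() first is one way to satisfy it.
    if isinstance(arg, int):
        return takes_int(arg)
    raise TypeError("expected an int")


print(accept_any(1))
print(accept_obj(2))
```

In other words, `object` forces you to prove what you have before using it, whereas `Any` switches the checking off for that value.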
`Union[t1, t2, …]`

Matches any one of the listed types, as we saw above. If one of those types is `object` then you’ll find the whole thing just evaluates to `object`, since any other types are by definition subtypes of it.

`Optional[t]`

A shorthand for `Union[t, None]`.

`Tuple[t1, t2, …]`

Matches a `tuple` whose values correspond to the types in the order specified. For example, `Tuple[int, str]` matches `(123, "hello")`. The number of arguments is always fixed, although there is a special case for `Tuple[t, ...]` using the ellipsis token (i.e. three dots). This specifies a variadic homogeneous tuple, i.e. any number of items, but they’re all of the same specified type.

`Callable[[t1, t2, …], tr]`

Matches a callable whose parameters have the types `t1`, `t2` and so on, and where `tr` specifies the return type. There’s no way to specify optional or keyword arguments, and no support for specifying variadic functions, but you can skip checking the parameter list by using the ellipsis token: `Callable[..., tr]`.
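A brief sketch of these primitives in use follows. All of the function names (`to_text`, `find`, `label`, `apply`) are hypothetical and exist only to give each hint somewhere to live.

```python
from typing import Callable, Optional, Tuple, Union


def to_text(value: Union[int, str]) -> str:
    # Accepts either of the two listed types.
    return str(value)


def find(items: Tuple[str, ...], wanted: str) -> Optional[int]:
    # Variadic homogeneous tuple in, index or None out.
    for index, item in enumerate(items):
        if item == wanted:
            return index
    return None


def label(record: Tuple[int, str]) -> str:
    # Fixed-length tuple with a different type for each position.
    number, name = record
    return "{}: {}".format(number, name)


def apply(func: Callable[[int, int], int], left: int, right: int) -> int:
    # func must take two ints and return an int.
    return func(left, right)


print(to_text(5), to_text("five"))
print(find(("a", "b", "c"), "b"), find(("a",), "z"))
print(label((123, "hello")))
print(apply(lambda a, b: a + b, 2, 3))
# Optional[t] really is just another spelling of Union[t, None].
print(Optional[int] == Union[int, None])
```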
Because the objects in `typing` are just general classes (albeit with some restrictions), they can be assigned to variables. This allows you to create aliases for specific types, rather like `typedef` in C/C++.
Here’s some code to calculate the total length of a path of line segments in 3D space. Consider how verbose the signatures would be without being able to declare `Point` and `Path`, even in this extremely simple example:
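One way such code might look is sketched below. The alias names `Point` and `Path` come from the text; the function names `distance()` and `path_length()`, their bodies and the sample path are assumptions.

```python
import math
from typing import List, Tuple

# Aliases keep the signatures below short and readable.
Point = Tuple[float, float, float]
Path = List[Point]


def distance(a: Point, b: Point) -> float:
    # Euclidean distance between two points in 3D space.
    return math.sqrt(sum((q - p) ** 2 for p, q in zip(a, b)))


def path_length(path: Path) -> float:
    # Sum the lengths of the segments joining consecutive points.
    return sum(distance(a, b) for a, b in zip(path, path[1:]))


path = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
print(path_length(path))
```

Without the aliases, the second signature would have to spell out `List[Tuple[float, float, float]]` in full.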
The final feature of type hints that I’m going to discuss is generic functions and type variables. To illustrate these, let’s consider a generic function for concatenating strings. Let’s say we want to make it work for both `str` and `bytes`. We could do something like this using the machinery we’ve learned so far:
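A union-based version might be sketched like this. The names `concatenate`, `first` and `second` come from the text; the body, the example values and the `try`/`except` wrapper around the mixed call are assumptions, and the line numbers will not match the `mypy` output that follows.

```python
from typing import Union


def concatenate(first: Union[str, bytes],
                second: Union[str, bytes]) -> Union[str, bytes]:
    # Nothing here ties the type of first to the type of second.
    return first + second


print(concatenate("abc", "def"))
print(concatenate(b"abc", b"def"))
try:
    # The hints on the signature allow this call, yet it fails when run.
    concatenate("abc", b"def")
except TypeError as exc:
    print("TypeError:", exc)
```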
This isn’t too bad, except that the third call to `concatenate()` demonstrates a problem: based on that specification, there’s nothing to constrain `first` and `second` to be the same type as each other, since they’re all independent unions.
Just for giggles, let’s see what `mypy` tells us for this code:
type-hints-type-vars.py:6: error: Unsupported operand types for + ("str" and "bytes")
type-hints-type-vars.py:6: error: Unsupported operand types for + ("bytes" and "str")
type-hints-type-vars.py:6: note: Both left and right operands are unions
Found 2 errors in 1 file (checked 1 source file)
So even though our type hinting wasn’t up to scratch, `mypy` has our back and lets us know we run the risk of mixing `str` and `bytes`. However, it would be better to clarify that both parameters must be the same type using hinting, and we can do so using a type variable. This is illustrated below:
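A sketch of the type-variable version is given here, laid out so that the type variable sits on line 3 and the addition on line 6, matching the line numbers the text refers to. The body and example values are assumptions, and the mixed call is shown only as a comment because it would raise `TypeError` when executed.

```python
from typing import TypeVar

T = TypeVar("T")  # unconstrained type variable

def concatenate(first: T, second: T) -> T:
    return first + second  # a checker cannot know that T supports +

print(concatenate("abc", "def"))
print(concatenate(b"abc", b"def"))
# concatenate("abc", b"def") mixes types and fails at runtime
```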
At this point all we’ve said is that `concatenate()` takes two parameters of the same type, and returns that type also. Any C++ programmers among you might be finding this somewhat similar to templating and it’s definitely got similarities, but of course in Python it’s just for type checking purposes. This function is known as a generic function.
So what does `mypy` make of this code now?
type-hints-type-vars.py:6: error: Unsupported left operand type for + ("T")
Found 1 error in 1 file (checked 1 source file)
It still has an issue with line 6, which is that since the type of `T` is unconstrained there’s nothing to stop us passing variables of a type that doesn’t support the `+` operator, such as `set`. OK, so let’s update line 3 to constrain this variable to `str` or `bytes`:
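The constrained declaration would presumably take the form shown in this sketch, which repeats the function so that it runs on its own. `TypeVar` accepts the permitted types as extra positional arguments; the function body and example values are, as before, assumptions.

```python
from typing import TypeVar

# T may only be str or bytes, and is the same type throughout one call.
T = TypeVar("T", str, bytes)


def concatenate(first: T, second: T) -> T:
    # Both permitted types support +, so a checker can now accept this.
    return first + second


print(concatenate("abc", "def"))
print(concatenate(b"abc", b"def"))
# concatenate("abc", b"def") should now be rejected at the call site.
```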
Now re-running `mypy` yields a slightly cryptic error message, but it has now latched onto the error on the correct line:
type-hints-type-vars.py:10: error: Value of type variable "T" of "concatenate" cannot be "object"
Found 1 error in 1 file (checked 1 source file)
I think that more or less covers the basics of the type hints added in Python 3.5. There are some additional details I’ve glossed over, such as covariance and contravariance of types. If you want to know more of the gory details then I suggest PEP 483 as a starting point for some discussion of the theory behind type checking, and then PEP 484 for more specifics on the implementation.
Type hinting is something I’ve always danced around the edges of with Python since I never took the time to get a proper grounding in it, but now I’ve gone through in more detail it’s definitely something I’ll be looking to make more use of. I must admit I don’t often run into type mismatch bugs in my own code, since I generally find if there’s type ambiguity then it’s often a sign of sloppy code structure that should be tidied up. That said, of course it’s happened that I’ve used, say, a `date` here and a `datetime` there and it’s led to some annoying issues that don’t always show up in simple unit test cases.
Even ignoring the value of type hints to find bugs, however, there’s also a huge value in expressing the programmer’s expectations to anyone reading the code. This helps with understanding new code, as well as identifying bugs at code review time. In my opinion that’s a bigger benefit than enabling the static type checks, although clearly it’s best to have both.
That’s it from me. Next time I’ll be looking at the remaining smaller syntax enhancements for matrix multiplication and iterable unpacking, and some other additions to the standard library as well.

[1] OK, so before I get lots of angry comments I know this isn’t strictly true. The mathematical statement that integers are a subset of the real numbers is true, but in Python you can represent numbers with `int` that you can’t with `float`. For example, try doing `float(int(sys.float_info.max) * 10)`. However, in my defense, mypy does allow `int` to be used anywhere where `float` or `complex` is expected, which is essentially treating `int` as a subtype of these other types.

[2] Randomly shuffling topics around is my cunning plan to present the appearance that I think about the structure of my articles in advance. Clever, eh? As long as I’m not daft enough to tell you that’s what I’m doing, it’s pretty convincing.

[3] Strictly speaking this wasn’t added to `collections.abc` until Python 3.6, but it’s in `typing` in 3.5 so I’m still covering it here. It uses the same type specification as `Generator` except that async generators cannot return a value, so there are only two types to specify: the type to yield and the type to send.

[4] This represents `bytes`, `bytearray` and `memoryview`, and as a shorthand `bytes` can be used for any argument of those types.

[5] Uses the same type specification as `Generator`, see the footnote for that for details.

[6] A generator needs up to three types specified: the type that’s yielded from the generator, the type that’s expected to be sent to the generator, and the type that can be returned from the generator. The syntax for specifying the generator type is `Generator[type_yield, type_send, type_return]`. A generator which is expected to yield integers and not expected to receive any sent values or to return a value would be specified with `Generator[int, None, None]`.

[7] If you want to get technical, `Reversible` wasn’t added to `collections.abc` until Python 3.6, but it was added to `typing` in 3.5 so I’m including it here.