What’s New in Python 3.12 - Type Hint Improvements

3 Dec 2023 at 2:17PM in Software | Photo by Dids on Pexels

In this series looking at features introduced by every version of Python 3, we take a look at the new features added in Python 3.12 related to type hinting.

This is the 30th of the 31 articles that currently make up the “Python 3 Releases” series.


After a bit of a break looking into other languages and technologies, it’s that time again — Python 3.12 was officially released a couple of weeks ago, so I thought it was time to take a look and see what goodies the Python developers have for us.

As usual I’ll run over the changes in 3.12 over a few articles — in this one we’re going to take a look at a few improvements to type hints. The specific changes we’ll look at are:

  • New syntax for defining generic classes and methods.
  • More flexible type hints for **kwargs arguments.
  • The use of the @override decorator for specifying overridden methods.
  • Enhancements to the definition and use of run-time checkable Protocol objects.

In the remainder of this article, I’ll run through each of these in turn.

New Generics Syntax

In Python 3.11, you could use something like the following to define a generic function:

Python 3.11
from typing import TypeVar

T = TypeVar('T', bound=str|bytes)
def strip_list(x: T, sep: T) -> T:
    return sep.join(i.strip() for i in x.split(sep))

As of Python 3.12, however, you can instead use the following more concise syntax, as defined by PEP 695.

Python 3.12
def strip_list[T: str|bytes](x: T, sep: T) -> T:
    return sep.join(i.strip() for i in x.split(sep))

This brings Python’s basic syntax for generics into a similar sort of orbit to Go’s, although of course there are some significant differences in the way it all works. It’s certainly both more convenient and readable than having to declare a separate TypeVar all the time.

You can also use this with classes:

Python 3.12
class MyListStripper[T: str|bytes]:
    def __init__(self, sep: T) -> None:
        self.sep = sep

    def stripList(self, x: T) -> T:
        return self.sep.join(i.strip() for i in x.split(self.sep))

There are also updates to declaring type aliases. In Python 3.11, you might do something like this:

Python 3.11
from typing import TypeAlias, TypeVar

T = TypeVar("T")
ListOrSet: TypeAlias = list[T] | set[T]

But in Python 3.12 you can instead use the new keyword type for this purpose, which removes the need for either TypeVar or TypeAlias:

Python 3.12
type ListOrSet[T] = list[T] | set[T]

Note that type is a soft keyword, which is a concept that was added in Python 3.10 and indicates something which the tokeniser treats as a general identifier, but which the parser recognises as having special meaning. Other such soft keywords are match, case and _. The reason for this is to prevent syntactical errors when running older code which might use these as, say, variable names. This means that if you want to, you can continue to use it in other parts of the grammar where it won’t be confused — but personally I’d stay away from using any keyword if you possibly can help it.

Self-Reference

One detail that’s interesting about using type is that these definitions can be self-referential:

type RecursiveList[T] = T | list[RecursiveList[T]]

This is probably only really of use to people implementing tree- or list-like data structures, and in all honesty this sort of pointer-like usage is often (not always) a bit of an antipattern in Python. That said, it’s certainly worthwhile to have it available for those niche occasions where it’s just the right thing to use.

Scoping

Another interesting point is that when you use the new generic syntax, the identifier for the type (i.e. T in the examples above) is only valid to use within the scope of the definition to which it’s attached. This is unlike the old use of TypeVar, which had standard Python scoping.

type ListOrSet[T] = list[T] | set[T]
print(T)    # Causes NameError as T isn't defined here

This is actually a new kind of lexical scope, which didn’t exist previously, called annotation scope. It works mostly like function scope, but names in an immediately enclosing class scope are still visible, which ordinary function scopes don’t allow. This lets type parameters defined on a method, say, supplement those for the class. At the completion of PEP 649, expected in Python 3.13, annotations (i.e. type hints) will use this new scope as well so they no longer need to be converted to strings to allow self-reference.

This feels like progress because although previously the type variable itself could be reused across multiple classes or functions, the semantic meaning of it was always local to each definition. With this new syntax the scope of the type variable matches the scope of the semantic meaning, which is less confusing.

Variance

An advantage of this syntax which is quite subtle is that the developer no longer needs to worry about specifying the variance of type parameters. If you’re not familiar with this term, and I certainly wouldn’t blame you, it determines the relationship between generic types when the types on which they’re parameterised are subtypes.

If you want a full discussion you can read the Wikipedia page, but briefly let’s say we have types Base and Derived, the latter being a subtype of the former, and a generic class list[T]. The question of variance comes into play when we consider the subtype relationships of specialisations of list[T]. If list[Derived] can be used wherever list[Base] is expected, then it’s covariant. If list[Base] can instead be used where list[Derived] is expected, then it’s contravariant. If neither holds and only the same type will do, then it’s invariant.
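
As a rough illustration of why the distinction matters, here’s a sketch using hypothetical Base and Derived classes. Type checkers actually treat the real built-in list as invariant because it’s mutable, whereas the read-only Sequence is covariant. None of this is enforced at runtime, which is exactly why the second call below is able to cause trouble.

```python
from collections.abc import Sequence

class Base:
    def name(self) -> str:
        return "base"

class Derived(Base):
    def extra(self) -> str:
        return "derived-only"

def read_names(items: Sequence[Base]) -> list[str]:
    # Read-only access is safe for a sequence of any subtype (covariance).
    return [item.name() for item in items]

def add_base(items: list[Base]) -> None:
    # Mutation is what makes passing a list[Derived] here unsafe (invariance).
    items.append(Base())

deriveds: list[Derived] = [Derived()]
names = read_names(deriveds)   # fine for a type checker
add_base(deriveds)             # a type checker should reject this call
print(names, hasattr(deriveds[-1], "extra"))
```

After the second call the list that claims to hold only Derived objects contains a plain Base, so calling extra() on its last element would fail.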

This syntax removes the need for developers to specify type parameter variance themselves, instead relying on type checkers to use variance inference to determine it based on their usage. The rules to do this are:

  1. Any variadic type parameter (TypeVarTuple that I covered in 3.11) is always invariant.
  2. Any parameter specification (ParamSpec that I covered in 3.10) is always invariant.
  3. Old-style TypeVar declarations maintain the same semantics, unless you add the new keyword parameter infer_variance=True.
  4. If none of the above apply then proceed with variance inference: create an Upper specialisation where the type parameter is specialised as object, and a Lower specialisation where it’s specialised as the type parameter itself. All other type parameters are given the same dummy type in both cases.
  5. If Lower can be assigned to Upper, it’s covariant.
  6. If it’s not covariant, and Upper can be assigned to Lower, it’s contravariant.
  7. If neither covariant nor contravariant, it’s invariant.

If this sort of thing makes your head hurt then you should be glad of the fact that you don’t need to worry about it any more and go your merry way.

Support

One final thing that’s worth noting is that support for all this outside the Python interpreter itself will probably be limited for a while. For example, mypy support is still in development at time of writing.

Better Typing of **kwargs

As per the original introduction of type hints with PEP 484, the **kwargs construction could only be hinted if every argument was the same type.

def print_percentages(**kwargs: float):
    total = sum(kwargs.values())
    for k, v in kwargs.items():
        print(f"{k}: {v * 100 / total:.2f}%")

Python 3.12, however, has introduced the ability to use TypedDict to specify types more flexibly. This reuses the typing.Unpack which was introduced with variadic generics, but probably not heavily used for that purpose as the * operator also worked and is clearer.

In this case, if you define a TypedDict then you can pass that to Unpack and use the whole lot as the annotation of **kwargs — this is probably most clearly illustrated with an example.

from typing import NotRequired, TypedDict, Unpack

class First(TypedDict):
    one: int
    two: str|None
    three: NotRequired[float]

def func(**kwargs: Unpack[First]):
    print(f"one plus 1 is {kwargs['one'] + 1}")
    if kwargs["two"] is not None:
        print(f"two says {kwargs['two']}")
    if "three" in kwargs:
        print(f"three is {kwargs['three']}")
    else:
        print("three is missing")

func(one=10, two="hello")

This seems like a simple change, but there are a surprising number of subtleties. I won’t go through them here, but PEP 692 explains everything very clearly. It’s worth noting that the original plan was to use the ** operator where Unpack is used, but since this required a grammar change the authors decided to remove it from scope and implement this with Unpack in Python 3.12, and may propose the ** syntax in a future PEP.

Override Decorator

A simple one — typing.override() has been added as a decorator which allows a derived class definition to indicate that a particular method is intended to override a method in the base. This allows type checkers to flag as errors cases where methods are expected to be overriding, but there is, in fact, no such method in the base class.

from typing import override

class Base:
    def method(self, arg: int) -> int:
        return arg + 1

class Derived(Base):
    @override
    def method(self, arg: int) -> int:
        return arg * 2

One other detail that’s worth being aware of is that at runtime this decorator also adds attribute __override__ with value True to the object so decorated[1].

This is similar to the override keyword in C++, C#, TypeScript, Scala and Swift, and the @Override annotation in Java. It’s a simple change and I don’t think there’s a great deal else to say about it — once type-checkers have support for this, I’m sure it’ll help catch some potentially puzzling errors earlier in the development cycle.

You can see PEP 698 for more details, including some interesting rejected alternatives.

Static Runtime Checkable Protocols Checks

We’ll finish off with a bit more of an esoteric one. Back in Python 3.8, protocols were introduced, which define interfaces that objects must implement in order to satisfy a type hint. There is also a decorator @runtime_checkable which allows objects to be checked against these protocols using isinstance(). I covered these in a past article.

This release has some changes to this functionality. Consider that hasattr() is used to compare types against the protocol, and also consider that the implementation of hasattr() is to call getattr() and check whether it raises AttributeError. This can lead to issues when using classes whose attributes are returned by calling a function, such as attributes using property, for example. The getters for the properties of a class instance end up being invoked when you check that instance against a protocol using isinstance() — this could be an expensive call or raise an exception, which is a fairly unexpected side-effect of using isinstance().

This may be a little confusing, so consider the example below.

import typing

@typing.runtime_checkable
class Interface(typing.Protocol):
    @property
    def myproperty(self):
        pass

class ConcreteClass:
    @property
    def myproperty(self):
        raise Exception("oops")

isinstance(ConcreteClass(), Interface)

If you run this under Python 3.11 or earlier, you’ll see that the Exception("oops") is raised from within the call to isinstance().

In Python 3.12, however, the call to hasattr() has been replaced by inspect.getattr_static(), which queries the __dict__ attribute directly instead of triggering the usual __getattr__() or __getattribute__() dunder methods. In most cases this will just avoid issues such as that demonstrated above, and also improve performance. However, it’s possible that there are also cases which previously would work but which will now behave differently. One example would be a property getter which manually raises AttributeError in some cases: hasattr() would have reported the attribute as missing, whereas getattr_static() finds the property regardless.

One consequence of this initial change was apparently a drop in performance, since getattr_static() was significantly slower. However, a couple of further changes addressed this, first by a specialisation of __instancecheck__() for protocols, and second by some performance improvements to inspect.getattr_static().

Conclusions

The major change here is, of course, the new syntax for declaring generics. This seems like a big step in the right direction, making things not only more consistent with other languages, but more convenient and concise as well. In my opinion they’re also quite a lot more readable, although that’s more subjective.

I’m a little split on how many developers will find this a major help, however, simply because Python has such a comprehensive library already that I strongly suspect a large proportion of developers will have no real need to declare their own generics. Having used Python in conjunction with statically typed languages for many years, however, I’m very familiar with the benefits of increased compile-time checking in terms of catching bugs earlier and more reliably, so I’m definitely a fan of anything which makes it easier for people to use type hints more comprehensively in their code.

The replacement of typing.TypeAlias with the keyword type is also a nice change to see, as it makes type aliases more convenient and readable for most people. It’s interesting to see that despite type hints being completely optional, they are being seen as important enough to justify new keywords being added, and I do think in the longer term there is a risk of this trend becoming problematic. Python has always benefited from being a compact core language, and it would be upsetting to see it sliding towards Perl or PHP levels of builtin functionality. It’s premature to have such concerns yet, but I do think it’s worth keeping an eye on the trend.

The @override decorator in typing is a nice touch, as it’s a simple and non-invasive change which will definitely catch the odd bug here and there for anyone making significant use of class-based polymorphism in their code. This is particularly helpful when extending library-provided base classes, where it’s particularly upsetting to, say, make a typo when overriding an error-handling method and not notice this until you actually get some rare error in production use.

Typing of **kwargs is potentially useful, although honestly the only time I tend to use this is in cases where you don’t know the types at development time anyway, rendering the change less useful. If I wanted to type these parameters, I’d tend to wrap them in my own class and pass an instance of that instead. But that’s a personal preference, and in any case is little use for supporting existing code where the decision has already been taken and cannot be changed without breaking backwards compatibility — the ability to layer in type hints may well help in these cases.

The change to runtime checkable protocols is a bit of a niche one, but it’s nonetheless sensible and makes things more intuitively predictable in more edge cases, in my view.

So that’s it for this article — when I get chance I’ll carry on in the next article by looking at some improvements to f-strings and the Python interpreter implementation.


  1. This is set on a best-effort basis, as the object may not accept the attribute — for example, if it’s using __slots__. In these cases, the object is returned from the decorator unchanged — as a result, you probably don’t want your code to depend on this attribute too heavily. 

The next article in the “Python 3 Releases” series is What’s New in Python 3.12 - F-Strings and Interpreter Changes
Tue 6 Feb, 2024