Python 2to3: What’s New in 3.6 - Part 1, New Features

25 Jul 2021 at 6:10PM in Software

In this series looking at features introduced by every version of Python 3, we now move on to look at new features added in Python 3.6. This first article looks at some of the most significant new features added to the language in this release.

This is the 12th of the 15 articles that currently make up the “Python 2to3” series.


Now we’re on to Python 3.6. From first glance, this release isn’t quite as huge as 3.5, but there’s a good deal to get our teeth into. In this article we’ll kick off by looking at a set of syntactic changes for string interpolation, type hinting and async support. First up is a handy new string formatting mechanism.

Formatted String Literals

Prior to 3.6, Python only had three methods for string formatting. As we all know, the thirteenth aphorism in the Zen of Python states that “there should be four — and preferably only four — obvious ways to do it”, so it was clearly important for Python 3.6 to add another one. Before we get on to the new one, however, I’ll just briefly refresh your memory on the previous three.

Firstly, the % operator has been a formatting operator on strings since prior to Python 2.0. It’s the one that operates very much like sprintf() in libc. It’s a flexible and very succinct system that’s stood the test of time, but unless you’re coming from a place of familiarity with the Xprintf() family then the formatting syntax can be a little obscure. Its positional argument style also harms readability, although it does also support a mapping-based approach.

>>> "sprintf() is great for %.7s and floats like %-10.3f" % ("stringstring", 123.456789)
'sprintf() is great for strings and floats like 123.457   '

Next, Python 2.4 added the string.Template class as part of PEP 292. This provides a more friendly approach for simpler cases, but doesn’t have the flexibility for specifying the formatting of values.

>>> import string
>>> tmpl = string.Template("Hello $name, you're $age year${s} old")
>>> tmpl.substitute(name="Andy", age=43, s="s")
"Hello Andy, you're 43 years old"

Hot[1] on the heels of that text formatting revolution came a more powerful one courtesy of PEP 3101: Python 3.0 added “advanced string formatting”, which was also quickly backported to Python 2.6. I’m not going to talk about this in detail since I already covered it in the first article in this series. For reference, here’s a snippet to demonstrate it.

>>> # Exponent notation with precision 3.
>>> "{0:.3e}".format(123456789)
'1.235e+08'
>>> # General float format, min. length 8, precision 4.
>>> "{0:8.4}".format(12.345678)
'   12.35'
>>> # Hex format, centred in 10 chars using '\' as padding.
>>> "{0:\^+#10x}".format(1023)
'\\\\+0x3ff\\\\'

So that brings us to the mechanism that’s been added in Python 3.6, f-strings, which are specified in PEP 498.

This introduces new syntax which harks back to the old PEP 215, never implemented, which proposed a syntax change to allow strings of the form $"..." which would have similar interpolation semantics to a unix shell. The prefix has now become f"..." and {...} is used to indicate expressions to interpolate, but the principle is otherwise similar.

The rationale for the new syntax is simplicity. Formatting with % is concise but has traps for the unwary; the str.format() method is quite safe but also verbose; and string.Template is just generally a bit too simple to be useful in a lot of circumstances. This new syntax is intended to combine the safety and flexibility of str.format() with a more concise syntax, and what we’ve ended up with is demonstrated in the snippet below.

>>> name = "Andy"
>>> age = 43
>>> f"Hello {name}, you're {age} year{'s' if age != 1 else ''} old"
"Hello Andy, you're 43 years old"
>>> import math
>>> f"Pi is around {math.pi:^+10.3f} and e is {math.e:10.4g}"
'Pi is around   +3.142   and e is      2.718'

As you can see in the example above, the syntax is pretty flexible. You can specify any Python expression, from simple variable names to function calls and more. You can also use the same formatting specifiers as with str.format(), by appending them after a colon. You can even substitute values into these.

>>> width = 6
>>> f"{name:.>{width}}"
'..Andy'
>>> width = 10
>>> f"{name:.>{width}}"
'......Andy'

The new syntax also shares the use of the __format__() method for customising formatting, which is handy for anyone that’s already supporting that as they won’t need to rewrite their code. It also allows the !s, !r and !a suffixes on values as str.format() does, although these are now actually rather superfluous since there’s nothing to stop you explicitly calling str(), repr() or ascii() yourself within the expression. The suffixes are still slightly more concise, though.

>>> import datetime
>>> now = datetime.datetime.now()
>>> f"{now} / {now!r}"
'2021-07-16 00:01:24.472536 / datetime.datetime(2021, 7, 16, 0, 1, 24, 472536)'
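To make the point about __format__() concrete, here is a minimal sketch. The Temperature class and its "F" suffix are invented purely for illustration and are not part of any library; the only thing being shown is that str.format() and f-strings both route through the same method.

```python
class Temperature:
    """Hypothetical value type; not part of any library."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, spec):
        # A trailing "F" selects Fahrenheit; the rest of the spec is
        # handed to float's own fixed-point formatting.
        if spec.endswith("F"):
            value, unit, spec = self.celsius * 9 / 5 + 32, "F", spec[:-1]
        else:
            value, unit = self.celsius, "C"
        return format(value, spec + "f") + unit

temp = Temperature(21.0)
via_fstring = f"{temp:.1} / {temp:.1F}"
via_format = "{0:.1} / {0:.1F}".format(temp)
print(via_fstring)
print(via_fstring == via_format)
```

Both expressions hand the text after the colon to Temperature.__format__() unchanged, so an existing class that already supports str.format() should need no changes to work in an f-string.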

Overall I’m impressed with the new f-strings feature, and as long as I can assume at least Python 3.6 I think it’ll become my default string format option except for a few cases. The formatting is as powerful as any of the other options, especially since you can substitute in values for the formatting values, and the ability to execute any arbitrary Python expression makes for very flexible logging.

That said, there are a couple of gotchas I would suggest watching out for with these strings. The first is in access to data — because you don’t explicitly specify what you expose to the string formatting logic, you may find yourself creating leaky abstractions. The second is side effects — because these strings can evaluate arbitrary expressions, some of which may have side-effects, they’re not necessarily idempotent and they may raise exceptions. If you’ve already, say, extracted some values from an iterator but a later expression in the same f-string raises an exception, you’ll probably lose those values you’ve already extracted.

>>> x = iter(range(1, 9))
>>> f"{next(x)} {next(x)} {next(x)}"
'1 2 3'
>>> f"{next(x)} {next(x)} {next(x)}"
'4 5 6'
>>> # What happens to 7 & 8?
>>> f"{next(x)} {next(x)} {next(x)}"
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

In addition to those general concerns, there are a few specific examples where I would suggest alternative mechanisms should be considered.

User-supplied format template strings
For all its simplicity, string.Template is pretty safe with user-specified format strings. You control the variables that you expose to it, and since it’s only evaluating values it’s highly unlikely to raise any exceptions, unless you have a __str__() method that does something dubious or similar.
User-facing formatted output
Another case where I’d reach for a more capable templating system is generating user-facing output where formatting matters, such as a web page or email. I’d strongly suggest going for something like Jinja which decouples the business logic from the presentation.
The logging module
Our old friend the logging module is another case to at least consider sticking with the %-style formatting, as this is the only style of formatting where it supports deferring of evaluation of message arguments. To change this would break backwards-compatibility. Although you might be able to do this safely within your application, it’ll potentially break any of the uses of logging in libraries you use, which is a pretty knotty problem to untangle. There’s more discussion of this in the “Optimization” and “Use of Alternative Formatting Styles” sections of the logging documentation.
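The deferred evaluation mentioned for the logging module can be seen in a small experiment. This is only a sketch: the Expensive class and the logger name "fstring_demo" are made up for the purpose, and the counter simply records how often the object gets rendered to a string.

```python
import logging

class Expensive:
    """Hypothetical object that counts how often it is rendered."""
    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive value"

logger = logging.getLogger("fstring_demo")
logger.setLevel(logging.WARNING)      # DEBUG messages are discarded
logger.propagate = False
logger.addHandler(logging.NullHandler())

item = Expensive()

# %-style: the argument is only formatted if the record is emitted.
logger.debug("Value is %s", item)
deferred_calls = Expensive.calls

# f-string: the string is built before logging ever sees it.
logger.debug(f"Value is {item}")
eager_calls = Expensive.calls

print(deferred_calls, eager_calls)
```

With the level set to WARNING, the %-style call never renders the object, whereas the f-string version pays the cost even though the message is thrown away.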

Variable Type Hints

Building on the type hinting for function parameters and return values that we looked at in the previous article on type hints in Python 3.5, Python 3.6 introduces further syntax to also add annotations to variables. This is described in PEP 526 and frankly works more or less as you’d expect.

You can specify annotations as an expression on its own, or as part of an assignment operation — the ability to specify the type without an assignment is useful where a variable is initialised differently in different branches of a conditional. You can annotate local and global variables, as well as class and instance variables. For class variables, there’s typing.ClassVar to allow type-checkers to disambiguate between defining the type of class and instance variables.

To illustrate this, I’ve written a rather self-indulgent little class below whose sole purpose is to track all instances of itself and announce every creation or deletion to all other extant instances.

import sys
from typing import ClassVar, Iterable, Optional, Set
from weakref import ref, ReferenceType

class InstanceTracker:

    instances: ClassVar[Set[ReferenceType]] = set()
    instance_name: str

    def __init__(self, name: str) -> None:
        print(f"Creating {name}")
        self.instance_name = name
        instance: "InstanceTracker"
        for instance in InstanceTracker.get_instances():
            instance.handle_new_peer(self)
        InstanceTracker.instances.add(ref(self))

    def __del__(self) -> None:
        print(f"Destroying {self.instance_name}")
        InstanceTracker.instances.discard(ref(self))
        instance: "InstanceTracker"
        for instance in InstanceTracker.get_instances():
            instance.handle_peer_gone(self)
        # No re-add here: a weak reference to a dying instance must stay out of the set.

    @classmethod
    def get_instances(cls) -> Iterable["InstanceTracker"]:
        weak_instance: ReferenceType
        for weak_instance in cls.instances:
            ret: Optional[InstanceTracker] = weak_instance()
            if ret is not None:
                yield ret

    def __str__(self) -> str:
        return self.instance_name

    def handle_new_peer(self, new_instance: "InstanceTracker") -> None:
        print(f"[{self.instance_name}] New: {new_instance.instance_name}")

    def handle_peer_gone(self, old_instance: "InstanceTracker") -> None:
        print(f"[{self.instance_name}] Gone: {old_instance.instance_name}")


def main() -> int:
    one: InstanceTracker = InstanceTracker("one")
    two: InstanceTracker = InstanceTracker("two")
    three: InstanceTracker = InstanceTracker("three")

    print("----")
    instance: InstanceTracker
    for instance in InstanceTracker.get_instances():
        print(f"Instance {instance.instance_name}")
    print("----")

    del two
    print("----")
    for instance in InstanceTracker.get_instances():
        print(f"Instance {instance.instance_name}")
    print("----")

    return 0

if __name__ == "__main__":
    sys.exit(main())

You can see how the variable annotations follow the same pattern as for function parameters. On line 7 you can see a class variable instances being defined, and this explicit syntax allows checkers to raise a warning if you ever attempt to assign to it via self.instances, as that would hide the class variable within that instance. You can also see from this line that weakrefs are represented by an untyped weakref.ReferenceType — this type doesn’t support subscripting until Python 3.9 so there’s no way to make an assertion about the type of value it references.

Line 8 sets the type of the instance variable instance_name, but without an initialiser as it’s set up in __init__(). Lines 45-47, in main(), show simple local variables being annotated.

On line 13, in __init__(), you can see that for variables defined by a for loop, you have to use a bare type hint on the previous line, as there’s no valid syntax for embedding it in the for loop statement itself. This has a bit of a caveat because the scope of this declaration extends across the entire current function. This means that you can’t reuse the same variable name for a different type without triggering errors in a type checker. Consider the simple script below.

changetype.py
value: int
for value in (1,2,3):
    print(value)

value: str
for value in ("one", "two", "three"):
    print(value)

This will execute fine with the interpreter, but when you run that through mypy you get this error:

changetype.py:5: error: Name "value" already defined on line 1
changetype.py:6: error: Incompatible types in assignment (expression has type "str", variable has type "int")
Found 2 errors in 1 file (checked 1 source file)

I don’t think there’s much else to say, really; it’s quite a natural extension to the type hint syntax. It’s worth noting that adding an annotation is sufficient to mark a variable as local in that context, even if you don’t assign to it. You can see this in the different types of exception you get from accessing an uninitialised variable in the snippet below.

>>> def func1():
...     print(x)
...
>>> def func2():
...     x: str
...     print(x)
...
>>> func1()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in func1
NameError: name 'x' is not defined
>>> func2()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in func2
UnboundLocalError: local variable 'x' referenced before assignment

I also haven’t mentioned the __annotations__ attribute of modules and classes, which stores all the annotations specified at that level. This seems to me a rather esoteric detail, but you can read the original PEP 3107 as well as the extensions in PEP 526 if you want to know more.
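For the curious, here is a minimal sketch of what that attribute holds for a class. The Job class is hypothetical and exists only to carry a couple of annotations; the sketch sticks to the class-level case.

```python
from typing import ClassVar, List

class Job:
    registry: ClassVar[List[str]] = []   # annotated and assigned
    name: str                            # annotated only, no value

annotations = Job.__annotations__
print(annotations)

# The bare annotation is recorded, but no attribute is created for it.
print(hasattr(Job, "registry"), hasattr(Job, "name"))
```

The annotation for name shows up in the mapping even though the class has no such attribute until an instance assigns one.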

Underscores in Numeric Literals

This one is nice and simple — it’s now possible to use underscores in numeric literals as, for example, thousand separators. You can use single underscores freely within any numeric literal of any base, although literals cannot start or end with underscores and you cannot use more than one consecutively. You can also use them in the constructors for int, float, complex and Decimal.

>>> 10_000_000
10000000
>>> 1_23_456_7_8_9
123456789
>>> _123
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name '_123' is not defined
>>> 1__23
  File "<stdin>", line 1
    1__23
     ^
SyntaxError: invalid token
>>> 123_
  File "<stdin>", line 1
    123_
       ^
SyntaxError: invalid token
>>> 0x_ff_ffff
16777215
>>> 0o12_34
668
>>> 1_2.3_4
12.34
>>> import decimal
>>> decimal.Decimal("1_2.3_4")
Decimal('12.34')
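The same rules apply when the built-in constructors parse strings. A short sketch, with variable names chosen only for illustration:

```python
as_int = int("10_000_000")
as_float = float("1_2.3_4")
as_hex = int("ff_ff", 16)       # underscores work with an explicit base too

# Doubled underscores are rejected, just as they are in literals,
# although here the failure is a ValueError rather than a SyntaxError.
try:
    int("1__000")
    doubled_ok = True
except ValueError:
    doubled_ok = False

print(as_int, as_float, as_hex, doubled_ok)
```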

In addition there’s a small change to the string formatting specification for both str.format() and the new f-strings to use underscores as thousands separators. The existing format allowed a , character to use commas, and this simple extension now allows a _ character to be used in its place to specify that underscores should be used instead. Usefully this can also apply to other number bases (unlike ,) where an underscore is inserted every 4 characters for improved readability.

>>> value = 2_898_270_942
>>> f"{value:16,}"
'   2,898,270,942'
>>> f"{value:16_}"
'   2_898_270_942'
>>> f"{value:#16_x}"
'     0xacc0_1ade'

PEP 515 has some more discussion.

Asynchronous Generators

As I covered in a previous article, Python 3.5 introduced the async and await keywords for defining coroutines. This release builds on that by adding more capabilities for implementing coroutines.

The first of these changes is to allow asynchronous generators. In Python 3.5 it wasn’t valid to use both yield and await in the same function — your code was either a coroutine or a generator, but not both. This means that there was no way to use the compact generator syntax with async for, instead you’d always need to implement your own class with __aiter__() and __anext__() methods.

The good news is that this restriction has been removed in Python 3.6 as per PEP 525 and now you can both yield and await within a coroutine to make it an async generator. As you might expect, you can’t use this with a regular for loop because it follows the asynchronous iterator protocol, it isn’t a standard iterable.

Here’s a very simple example of an async generator.

import asyncio

async def slow_range(*args, delay, **kwargs):
    for i in range(*args, **kwargs):
        yield i
        await asyncio.sleep(delay)

async def countdown():
    async for i in slow_range(10, 0, -1, delay=1):
        print(i)
    print("Blast off!")

loop = asyncio.get_event_loop()
loop.run_until_complete(countdown())
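For comparison, this is roughly what a Python 3.5 version would have needed: a class implementing __aiter__() and __anext__() by hand. It is a sketch rather than an exact equivalent (it sleeps before returning each value instead of after yielding it), and the SlowRange and collect names are invented for the example. It uses asyncio.new_event_loop() so that it runs unchanged on newer Python releases.

```python
import asyncio

class SlowRange:
    """Hand-written async iterator, roughly equivalent to slow_range()."""

    def __init__(self, *args, delay):
        self._values = iter(range(*args))
        self._delay = delay

    def __aiter__(self):
        return self

    async def __anext__(self):
        try:
            value = next(self._values)
        except StopIteration:
            # Async iterators signal exhaustion with StopAsyncIteration.
            raise StopAsyncIteration from None
        await asyncio.sleep(self._delay)
        return value

async def collect():
    results = []
    async for i in SlowRange(3, 0, -1, delay=0.01):
        results.append(i)
    return results

loop = asyncio.new_event_loop()
collected = loop.run_until_complete(collect())
loop.close()
print(collected)
```

The generator version expresses the same thing in four lines, which is the whole appeal of PEP 525.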

This should be fairly intuitive to anyone familiar with both generators and coroutines. There’s a bit of a wrinkle that isn’t immediately obvious, however, and that’s what to do about context managers and try...finally blocks.

Imagine you write an asynchronous iterator which uses something like a database and which uses the context manager approach to track transactions. I’ve written a sort of rough approximation to this in the script below, but I haven’t bothered to actually implement any of the database side. The important aspect is the use of async with on line 22, inside result_generator().

import asyncio

class FakeDBTransaction:
    """Pretend this is a DB transaction context manager."""

    async def __aenter__(self):
        print("Starting transaction")
        await asyncio.sleep(3)
        return self

    async def __aexit__(self, exc_type, exc_value, traceback):
        print("Completing transaction")
        await asyncio.sleep(3)

    async def cursor(self):
        for i in range(10):
            row = [i, "somevalue"]
            yield row

async def result_generator():
    print(">>> result_generator()")
    async with FakeDBTransaction() as db_conn:
        cursor = db_conn.cursor()
        async for i in cursor:
            yield i
    print("<<< result_generator()")

async def get_results():
    print("Iterating through DB results")
    rows = result_generator()
    async for row in rows:
        print(f"Got row {row}")
    print("Done with results")

loop = asyncio.get_event_loop()
loop.run_until_complete(get_results())
loop.close()

If you run this script, you’ll see that it works more or less as you’d expect:

Iterating through DB results
>>> result_generator()
Starting transaction
Got row [0, 'somevalue']
Got row [1, 'somevalue']
...
Got row [9, 'somevalue']
Completing transaction
<<< result_generator()
Done with results

So far so good. But what if we amend get_results() so that it interrupts the iteration?

async def get_results():
    print("Iterating through DB results")
    async for row in result_generator():
        print(f"Got row {row}")
        if row[0] > 3:
            print("I've had enough")
            break
    print("Done with results")

Now we see that the exit clause of the context manager is never actually invoked, because we never return processing to it. Also, the generator itself is left hanging, which causes an error when we call loop.close().

Iterating through DB results
>>> result_generator()
Starting transaction
Got row [0, 'somevalue']
Got row [1, 'somevalue']
Got row [2, 'somevalue']
Got row [3, 'somevalue']
Got row [4, 'somevalue']
I've had enough
Done with results
Task was destroyed but it is pending!
task: <Task pending coro=<async_generator_athrow()>>

To resolve this we need to use the aclose() method which was added to abort an async generator midway through. Since we’re asynchronous, this returns an awaitable which, when you await it, throws a GeneratorExit exception into your generator and yields from it until it’s exhausted. The generator can catch this exception, raised from the yield statement, or just let it propagate out to terminate the generator.

See the amendments to our code snippet below to terminate both the generators with aclose().

async def result_generator():
    print(">>> result_generator()")
    async with FakeDBTransaction() as db_conn:
        cursor = db_conn.cursor()
        try:
            async for i in cursor:
                yield i
        except GeneratorExit:
            print("--- Early exit")
            await cursor.aclose()
    print("<<< result_generator()")

async def get_results():
    print("Iterating through DB results")
    rows = result_generator()
    async for row in rows:
        print(f"Got row {row}")
        if row[0] > 3:
            print("I've had enough")
            await rows.aclose()
            break
    print("Done with results")

This generates the output below.

Iterating through DB results
>>> result_generator()
Starting transaction
Got row [0, 'somevalue']
Got row [1, 'somevalue']
Got row [2, 'somevalue']
Got row [3, 'somevalue']
Got row [4, 'somevalue']
I've had enough
--- Early exit
Completing transaction
<<< result_generator()
Done with results

As a final note, you can also use the new shutdown_asyncgens() method on the event loop. This arranges for aclose() to be called on all remaining async generators.

loop.run_until_complete(loop.shutdown_asyncgens())

The way it does this is by registering a hook with a new method sys.set_asyncgen_hooks(), which is called when an async generator is about to be garbage collected. But this is pretty deep magic that I’m not going to talk any more about here, since unless you’re planning to work on the internals of asyncio I doubt it’ll be too relevant.
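Without going into those internals, the practical pattern is simply to call shutdown_asyncgens() before closing the loop. Here is a minimal sketch, assuming a made-up ticker() generator whose finally block stands in for real cleanup, and an events list used only to record what happened. A reference to the generator is deliberately kept alive so that it is still suspended when the shutdown runs.

```python
import asyncio

events = []

async def ticker():
    try:
        n = 0
        while True:
            yield n
            n += 1
    finally:
        # Runs when the generator is closed, however that happens.
        events.append("ticker cleaned up")

async def take_three(gen):
    async for n in gen:
        events.append(n)
        if n == 2:
            break   # leaves the generator suspended at its yield

gen = ticker()
loop = asyncio.new_event_loop()
try:
    loop.run_until_complete(take_three(gen))
finally:
    # Close any async generators that are still suspended.
    loop.run_until_complete(loop.shutdown_asyncgens())
    loop.close()
print(events)
```

The cleanup entry is appended during the shutdown_asyncgens() call rather than at the break, which is the behaviour that distinguishes this from calling aclose() explicitly.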

Asynchronous Comprehensions

The second change relating to coroutines is to implement asynchronous comprehensions, described in PEP 530. This is a fairly straightforward change to allow async for to be used in list, set and dict comprehensions, as well as generator expressions.

Here’s a quick example using the slow_range() async generator from earlier.

import asyncio

async def slow_range(*args, delay, **kwargs):
    for i in range(*args, **kwargs):
        yield i
        await asyncio.sleep(delay)

async def func():
    result = [i*3 async for i in slow_range(1, 10, delay=2) if i % 3 == 0]
    print(result)

loop = asyncio.get_event_loop()
loop.run_until_complete(func())
loop.close()
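The other comprehension forms follow the same pattern. In this sketch, letters() is a hypothetical async generator invented for the example, and the sleep(0) call only hands control back to the event loop.

```python
import asyncio

async def letters(word):
    # Trivial async generator; sleep(0) just yields control to the loop.
    for ch in word:
        await asyncio.sleep(0)
        yield ch

async def demo():
    as_set = {ch async for ch in letters("banana")}
    as_dict = {ch: ord(ch) async for ch in letters("abc")}
    # An async generator expression must itself be consumed with async for.
    upper = (ch.upper() async for ch in letters("abc"))
    as_list = [ch async for ch in upper]
    return as_set, as_dict, as_list

loop = asyncio.new_event_loop()
as_set, as_dict, as_list = loop.run_until_complete(demo())
loop.close()
print(as_set, as_dict, as_list)
```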

As well as this, you can also use await in any comprehension, whether async or not. This can be in the main expression or the if section or both, but this is only valid within the body of a coroutine as usual for async features.

>>> import asyncio
>>> async def one_two_three():
...     return 123
...
>>> async def func():
...     funcs = [one_two_three, one_two_three, one_two_three]
...     print([await i() for i in funcs])
...
>>> loop = asyncio.get_event_loop()
>>> loop.run_until_complete(func())
[123, 123, 123]

Custom Class Creation & Descriptor Enhancements

Python has contained metaclasses for some time, which enable all sorts of customisations to the class creation process. I’d wager many Python programmers have never had a need to touch them, but there are some cases where they’re fairly useful, particularly if you’re building some sort of framework where you want consistency of behaviour.

Here’s an example of a Final metaclass which is used to prevent classes inheriting from the one using this metaclass — the name comes from the fact that this is similar in function to the final keyword in Java and C++ (from C++11 onwards).

class Final(type):

    def __new__(mcl, name, bases, attrs):
        for base_cls in bases:
            if isinstance(base_cls, Final):
                raise TypeError(f"{base_cls.__name__} is final")
        return super().__new__(mcl, name, bases, attrs)

class MyBaseClass(metaclass=Final):
    pass

class MyDerivedClass(MyBaseClass):
    pass

The flexibility is great, but for many common cases it’s a bit convoluted to get your head around. Well, in Python 3.6 there’s now another option which can avoid the need for metaclasses in some of the more common cases, described in PEP 487. This is to define an __init_subclass__() method on a base class — this will be invoked on each subclass which is created of this base class.

A key point to emphasise here is that this initialisation happens when the subclass is defined as opposed to when an instance is created. The method is implicitly treated as a class method, with the subclass object being passed as the first parameter.

Here’s the same example as above but implemented with __init_subclass__() instead of a metaclass.

class FinalBaseClass:

    def __init_subclass__(cls, **kwargs):
        for base_cls in cls.__bases__:
            if base_cls.__name__ == "FinalBaseClass":
                raise TypeError(f"{base_cls.__name__} is final")
        super().__init_subclass__(**kwargs)

class MyDerivedClass(FinalBaseClass):
    pass

In this example, the cls parameter represents the MyDerivedClass class object, and the code accesses the __bases__ attribute to see whether FinalBaseClass is one of them. Note that this will raise an error where MyDerivedClass is defined, no instantiation required.

You’ll notice that __init_subclass__() also takes keyword parameters. These can be specified in the same bracketed expression as the base classes, as in the example below.

class MyBaseClass:

    def __init_subclass__(cls, message, **kwargs):
        print(f"[1] {message} {cls.__name__}")
        super().__init_subclass__(**kwargs)

class MyDerivedClass(MyBaseClass, message="one"):

    def __init_subclass__(cls, another_message, **kwargs):
        print(f"[2] {another_message} {cls.__name__}")
        super().__init_subclass__(**kwargs)

class SecondDerivedClass(MyDerivedClass,
                         message="two",
                         another_message="three"):
    pass

Before you read on, take a look and see if you can work out what that’ll print if you run it as a script. Then take a look at the output below and see if you were right.

[1] one MyDerivedClass
[2] three SecondDerivedClass
[1] two SecondDerivedClass
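One place where this tends to replace a metaclass is a simple plugin registry. The sketch below is an assumption about how you might use the feature rather than anything from the PEP: the Exporter classes and the fmt keyword are invented for illustration.

```python
class Exporter:
    """Hypothetical plugin base class that records its subclasses."""

    registry = {}

    def __init_subclass__(cls, fmt=None, **kwargs):
        super().__init_subclass__(**kwargs)
        # Subclasses that pass fmt= are registered at definition time.
        if fmt is not None:
            Exporter.registry[fmt] = cls

class JsonExporter(Exporter, fmt="json"):
    pass

class CsvExporter(Exporter, fmt="csv"):
    pass

class AbstractExporter(Exporter):
    pass   # no fmt keyword, so it is not registered

print(sorted(Exporter.registry))
```

Nothing needs to be instantiated or imported specially; defining the subclass is enough to register it.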

As well as __init_subclass__(), there is actually another new method introduced called __set_name__(). This is generally used when your own classes are used as descriptors — that is, an object that’s used to customise attribute lookup. Here’s a code snippet that illustrates its use:

class LoggingAccessor:

    def __set_name__(self, owner, name):
        print(f"__set_name__() for {name}")
        self.attr_name = name
        self.internal_name = "_" + name

    def __get__(self, obj, objtype=None):
        value = getattr(obj, self.internal_name, None)
        print(f"GET {obj!r}.{self.attr_name} = {value!r}")
        return value

    def __set__(self, obj, value):
        print(f"SET {obj!r}.{self.attr_name} = {value!r}")
        if value is None:
            delattr(obj, self.internal_name)
        else:
            setattr(obj, self.internal_name, value)

print("About to define MyClass")

class MyClass:

    first_attr = LoggingAccessor()
    second_attr = LoggingAccessor()

print("Defined MyClass")

x = MyClass()
x.first_attr = "hello"
print(f"--> {x.first_attr!r} / {x.second_attr!r}")
x.second_attr = "world"
print(f"--> {x.first_attr!r} / {x.second_attr!r}")

The LoggingAccessor class is a standard descriptor, with __get__() and __set__() methods — nothing new in Python 3.6 there. The __set_name__() method is the new feature, which is called after the LoggingAccessor instances have been constructed (and after any __init__() has been called) and is passed the name of the attribute to which the descriptor has been assigned.

As you can see there are some extra print() statements to annotate what’s going on. Here’s the output of running the code above:

About to define MyClass
__set_name__() for first_attr
__set_name__() for second_attr
Defined MyClass
SET <__main__.MyClass object at 0x1030b05f8>.first_attr = 'hello'
GET <__main__.MyClass object at 0x1030b05f8>.first_attr = 'hello'
GET <__main__.MyClass object at 0x1030b05f8>.second_attr = None
--> 'hello' / None
SET <__main__.MyClass object at 0x1030b05f8>.second_attr = 'world'
GET <__main__.MyClass object at 0x1030b05f8>.first_attr = 'hello'
GET <__main__.MyClass object at 0x1030b05f8>.second_attr = 'world'
--> 'hello' / 'world'

Bear in mind the accessors are in the scope of the class and not per instance — so __init__() and __set_name__() will only be called twice (once for each of the two attributes) for MyClass above no matter how many instances we construct. This is why the __get__() and __set__() methods need to store any values on the class instance rather than using self.

If you want to know any more about descriptors, the Python documentation has an excellent howto on them.

As a closing point on this topic, here’s the order of events when these two mechanisms are combined:

  1. type.__new__() is called to create the new class object for the subclass.
  2. This then calls __set_name__() on each descriptor, after the descriptor objects have been initialised.
  3. Then it calls any __init_subclass__() on the base class of the new class, so by the time that’s called any descriptors will have already been fully set up. This allows __init_subclass__() to make further modifications to them if it wishes.
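The ordering in the steps above can be checked with a small experiment. The Field, Base and Model classes here are hypothetical, and the order list just records which hook fired when.

```python
order = []

class Field:
    def __set_name__(self, owner, name):
        order.append(f"__set_name__({name})")

class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        order.append(f"__init_subclass__({cls.__name__})")

class Model(Base):
    first = Field()
    second = Field()

print(order)
```

Both descriptors are named, in definition order, before the base class hook runs for Model.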

All in all, perhaps a bit esoteric, but in the small set of cases where these changes are useful, I think they’ll be really invaluable in avoiding the complexity of metaclasses for comparatively simple customisations.

Conclusions

That wraps it up for this article. Some handy stuff here, especially the f-strings syntax, which will be convenient in all sorts of small ways. Overall this release is shaping up to feel rather less earth-shattering than 3.5 was, but that’s fine. It’s nice to get big new features, but it’s also nice to see those spaced out by a few releases so people’s use of the language can stabilise.

In the next article we’ll look at the new secrets module as well as other new features, such as the filesystem path protocol and a new dict implementation.


  1. Well, more sort of mildly lukewarm, really. 

Photo by David Clode on Unsplash