☑ What’s New in Python 3.11 - New and Improved Modules

17 Jan 2023 at 8:30PM in Software
 |  | 

In this series looking at features introduced by every version of Python 3, we continue our look at the new Python 3.11 release, looking at some smaller new features, two new modules and some of the library changes.

This is the 27th of the 32 articles that currently make up the “Python 3 Releases” series.

python 311

So far in this series we’ve looked at the major new features in Python 3.11, specifically the performance improvements, exception enhancements and new type hint features. In this article we’ll take a brief look at a few other smaller features, a couple of new modules that have been added, and a few of the standard library module changes.

Minor New Features

We’ll kick off with a selection of smaller changes which stood out as being of interest to me.

Starred Unpacking in for loop

Unparenthesized unpacking expressions now work in for loops, as a consequence of the new PEG parser, which could lead to some more concise code.

>>> one = (11, 22, 33)
>>> two = (400, 500)
>>> for i in *one, *two:
...     print(i)
...
11
22
33
400
500

Pickling Subclasses with Slots

It was discovered that subclasses of some builtins don’t pickle and copy properly if additional added attributes use __slots__. This snippet is from Python 3.10, showing how an attribute isn’t copied on a subclass of bytearray in Python 3.10.

Python 3.10.0 (default, Dec  3 2021, 01:57:53) [Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import pickle
>>> class MyArray(bytearray):
...     __slots__ = ("some_attr",)
...
>>> orig = MyArray(b'123')
>>> orig.some_attr = "hello"
>>> pickled = pickle.loads(pickle.dumps(orig))
>>> pickled.some_attr
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MyArray' object has no attribute 'some_attr'

The same example in Python 3.11 shows that the attribute is correctly pickled and unpickled.

Python 3.11.1 (main, Dec 12 2022, 08:56:30) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import pickle
>>> class MyArray(bytearray):
...     __slots__ = ("some_attr",)
...
>>> orig = MyArray(b'123')
>>> orig.some_attr = "hello"
>>> pickled = pickle.loads(pickle.dumps(orig))
>>> pickled.some_attr
'hello'

For some time the pickling of objects could be customised by adding a __getstate__() method to invoke as a serialisation function, and then implementing __setstate__() to re-initialise the object instance from the serialised state. The serialisation of __slots__ has been implemented by adding a default __getstate__() to object, which you can see by calling yourself.

>>> orig.__getstate__()
(None, {'some_attr': 'hello'})

Safe Import Path

Normally Python will append the current directory to sys.path so modules can always be imported from it. However, this means that standard library or other module names can be shadowed, unintentionally of maliciously, by a conflicting module in the current directory. To prevent this there is a new option in Python 3.11 which disables this implicit inclusion.

You can trigger this mode by passing -P on the Python interpreter command-line, or by setting the PYTHONSAFEPATH environment variable. This just removes the current directory from the initial sys.path value, as you can see below (output reformatted slightly for readability).

$ python --version
Python 3.11.1
$ python -c 'import sys; print(sys.path)'
['', '/Users/andy/.pyenv/versions/3.11.1/lib/python311.zip',
 '/Users/andy/.pyenv/versions/3.11.1/lib/python3.11',
 '/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/lib-dynload',
 '/Users/andy/.pyenv/versions/python-3.11/lib/python3.11/site-packages']
$ python -P -c 'import sys; print(sys.path)'
['/Users/andy/.pyenv/versions/3.11.1/lib/python311.zip',
 '/Users/andy/.pyenv/versions/3.11.1/lib/python3.11',
 '/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/lib-dynload',
 '/Users/andy/.pyenv/versions/python-3.11/lib/python3.11/site-packages']

Switch to SipHash-1-3

As I mentioned briefly in an earlier article in this series, Python moved to SipHash in Python 3.4 from the previous Fowler-Noll-Vo hash function — this was to address concerns which I discussed in an old article.

SipHash family functions are described as “SipHash-x-y“. where x and y are two parameters which determine the level of security:

  • x is the number of hashing rounds which are performed on each message block.
  • y is the number of finalisation rounds which are then performed to compute the final hash.

As first implemented in Python 3.4, SipHash-2-4 was chosen as the variant to use. In Python 3.11, however, the SipHash-1-3 variant has been added, which trades some of the security for better performance on larger inputs. This new variant has also been set as the algorithm used in the standard build, but the original SipHash-2-4 is still available and can be chosen at compile-time, so some vendors may choose to do that in their builds.

$ python3.10 -c "import sys; print(sys.hash_info.algorithm)"
siphash24
$ python3.11 -c "import sys; print(sys.hash_info.algorithm)"
siphash13

The consensus in the Python development team seems to follow similar discussions among Rust and Ruby developers that the new variant is secure enough for conceivable current use-cases, and results in a nice performance boost in at least some cases.

I very much agree with this assessment. It’s debatable this may leave things more open to as-yet unknown DoS attacks in the future, but these hash values tend to be used only transiently — to persist them for long periods would lead to terrible headaches in the future if the algorithm is changed again, anyway, so that would be a terrible idea for application and library authors regardless of this issue. As a result, the long-term security of the values isn’t really a significant concern, so if it’s good enough for known attacks now then it’s probably not worth burning CPU cycles to add additional security of highly questionable value.

Length Limit on str to int

You may well be aware that the int() constructor, Python can convert a string of digits in any base from 2 to 36 (inclusive) into an int value.

>>> int("1111011", 2)
123
>>> int("173", 8)
123
>>> int("123", 10)
123
>>> int("7b", 16)
123
>>> int("3f", 36)
123

However, what you might not be aware is that to do this in any base which isn’t a power of 2 is a comparatively painful operation. In the example below, I’ve constructed some very long numbers and you can see that converting bases 8 or 16 are extremely quick, but base 10 is significantly slower.

>>> timeit.timeit("int(x, 8)", setup="x='67' * 1000", number=100000)
0.6754992799833417
>>> timeit.timeit("int(x, 10)", setup="x='67' * 1000", number=100000)
3.333605546038598
>>> timeit.timeit("int(x, 16)", setup="x='67' * 1000", number=100000)
0.6724714860320091

Given that these are unusually long strings, none of them are what you could call slow, but the difference is very noticeable. The powers of 2 bases are both taking around 6.7 µs per iteration, whereas base 10 conversions take 33.3 µs, an order of magnitude slower. If an attacker can find a way to feed extremely large integers into your code, therefore, they can perform a DoS attack on the host it’s on. This was logged as CVE-2020-10735, and it was addressed by limiting the number of digits that you an convert from or to a base which isn’t a power of 2.

The limit is applied on both input and output, so printing is also covered. It defaults to 4300 digits, so the limit isn’t exactly restrictive for the vast majority of use-cases.

But is this a fuss about nothing? Well, you can alter the limit using sys.set_int_max_str_digits(), so let’s see how bad things get with 100,000 digits.

>>> import sys
>>> import timeit
>>> sys.set_int_max_str_digits(100000)
>>> timeit.timeit("int(x, 16)", setup="x='1' * 100000", number=1000)
0.3209301750175655
>>> timeit.timeit("int(x, 10)", setup="x='1' * 100000", number=1000)
60.41181180800777

Here we can see that the base 16 case takes 321 µs per iteration, whereas the base 10 one takes 60 ms, which is two orders of magnitude longer. Considering that attacks could potentially feed code orders of magnitude more digits than this, the concern seems justified. The behaviour is also rather non-linear — the same thing for a million digits takes a massive 6 seconds for a single iteration.

>>> sys.set_int_max_str_digits(1000000)
>>> timeit.timeit("int(x, 10)", setup="x='1' * 1000000", number=1)
6.158579874027055

The potential for abuse is fairly obvious here, so adding the limitation certainly seems like a very sensible step.

New Modules

Next up we’ll take a quick look at the two new modules added to the standard library in Python 3.11, namely tomllib and wsgiref.types.

tomllib

If you’re looking for a markup language for configuration or other data, there are quite a few options — it’s not particularly difficult to invent such a format, and there’s always going to be someone who thinks they can roll their own to improve on flaws, perceived or genuine, within the existing options. Formats which have a enjoyed a decent level of popularity include INI, XML, JSON, YAML and TOML.

The last of these is the baby of the bunch, just coming up for its 10th birthday. Whilst it’s not without its critics, it received something of a boost in the Python community when it was chosen by PEP 518 for storing build dependencies and other metadata. The only significant argument against it at that time was the lack of inclusion in the standard library of a module for parsing it. Now, nearly seven years after that PEP was published, that has finally been addressed.

The new tomllib library that’s been added to Python 3.11 is a simple affair, and provides the facility only to parse TOML data, but not serialise it back in the direction.

It provides two functions:

  • load() takes a binary file object and parses from it.
  • loads() takes a string and parses from that.

There’s not a great deal to configure here. The TOML syntax encodes the type as well as the value, so you get structured information back in the native data types you’d expect — for example, a table becomes a dict and an array becomes a list. The one type worth calling out is float as it’s possible to provide a different function to parse these. The default is float(), as you’d expect, but you could, for example, pass parse_float=decimal.Decimal to produce Python Decimal for each TOML float instead.

>>> from pprint import pprint
>>> import tomllib
>>>
>>> pprint(tomllib.loads("""
... some_attr = "some string"
... another_attr = 123.456
...
... # Now a table
... [mytable]
... one = ["un", "ein"]
... two = ["deux", "zwei"]
... three = ["trois", "drei"]
... """))
{'another_attr': 123.456,
 'mytable': {'one': ['un', 'ein'],
             'three': ['trois', 'drei'],
             'two': ['deux', 'zwei']},
 'some_attr': 'some string'}

So that’s about it for tomllib, short and sweet. It’s definitely nice that you can parse pyproject.toml without requiring third party libraries, but you’ll still need one to edit or create TOML files in code — perhaps a future PEP might also add this functionality to tomllib.

wsgiref.types

The other new module in Python 3.11 is a simple extension to wsgiref, the reference implementation of WSGI in the standard library, to add types that can be used for type hints. The types referred to all correspond to the usage defined in PEP 3333, and are listed below.

StartResponse
A typing.Protocol describing the type of the start_response() callable, which is invoked to start the HTTP response. It’s responsible for returning a callable to send data to the client.
WSGIEnvironment
A WSGI environment dictionary (essentially dict[str, Any]).
WSGIApplication
The type of the application callable, which is invoked by a WSGI server for each request.
InputStream and ErrorStream
A pair of typing.Protocol describing the types of the input and error streams, as described by PEP 3333.
FileWrapper
A typing.protocol describing the interface provided by a file wrapper, an abstraction which iterates blocks of data from a file.

Text Processing Services

To finish off this article, I’m going to make a start looking at some of the updates to a few of the standard library modules, with the rest covered in the final article in this series. I’m going to kick off with some changes to text processing modules, comprising:

  • A small change to the re regular expression matching library, adding support for atomic grouping.
  • An even smaller change to the string module to validate instances of string.Template.

re

An update has been made to the regular expression parser which adds support for atomic grouping. An atomic group is a bit like a nested regular expression, which is matched but then throws away any backtracking positions stored against any of the tokens within the group. To put it another way, the atomic group’s first match is always locked in, even if the rest of the string fails to match then there won’t be any backtracking to see whether the atomic group could have matched something different.

Atomic Groups

If you take /a(bc|b)c/ as an example regular expression, then this would normally match both abcc, with the parenthesised group matching bc, and also abc, with the parenthesised group matching b.

However, if we used the atomic group syntax (?>...), making the pattern /a(?>bc|b)c/ then this will still match abcc but abc will not match. This is because the parenthesised group matches the bc of abc, and since it’s an atomic group this is “locked in” so when the final c fails to match then there’s no backtracking to try other possibilities.

>>> import re
>>>
>>> re.match("a(bc|b)c", "abc")
<re.Match object; span=(0, 3), match='abc'>
>>> re.match("a(bc|b)c", "abcc")
<re.Match object; span=(0, 4), match='abcc'>
>>>
>>> re.match("a(?>bc|b)c", "abcc")
<re.Match object; span=(0, 4), match='abcc'>
>>> re.match("a(?>bc|b)c", "abc")
>>>

Possessive Quantifiers

A simpler form of this are possessive quantifiers, which have also been added in this release. You may already be aware that quantifiers (i.e. things like * and +) default to being greedy, which means they consume as much of the input string as possible before moving on within the pattern. If the match fails, they’ll then backtrack, but they start off on that basis. You can modify that to be lazy, however, which means they’ll match as little as possible before moving on. Again, they’ll still match more on backtracks, but the priority order of matching is reversed. To do this you add an additional ? after the quantifier.

As of Python 3.11 you can instead append an additional + after the quantifier to make it possessive. This acts just like a greedy quantifier, in that it matches as much as it can, but it doesn’t then backtrack to try to match less.

Discussion

So why is this useful? Well, in principle one application may be in optimising the performance of expressions which might incur significant backtracking effort before finally failing to match. Let’s take the example of /\b(return|retry|re)\b/ matched against the string returns. Once the parser has matched return, but then failed because it’s not followed by a word boundary, then logically we know that neither of the others can match — they’re both shorter than return and aren’t prefixes of it. However, the regex parser lacking such logic would try to backtrack and retry all the possibilities unnecessarily.

A note of caution, however: I’ve not found this always borne out by experience. I can only assume that either some overhead of using atomic groups outweighs the performance benefits in simple cases, or the matcher has some optimisation which does something similar automatically and more efficiently behind the scenes. You can see below the supposedly more efficient atomic grouping version is actually slower.

>>> setup_str = "import re; regex = re.compile(r'\b(return|retry|re)\b')"
>>> timeit("regex.search('returns')", setup=setup_str)
0.10913390596397221
>>> setup_str = "import re; regex = re.compile(r'\b(?>return|retry|re)\b')"
>>> timeit("regex.search('returns')", setup=setup_str)
0.19278872106224298

I was able to find cases where possessive quantifiers made a positive difference, though, so I think there’s potential value here — but you probably have to be fairly experienced to reliably tell the cases where it’ll help vs hinder. Personally I tend to use regular expressions as a tool of last resort for parsing anyway, so I’m doubtful that I’ll ever use these particular features — but there’s certainly no harm in being aware of them just in case.

>>> setup_str = "import re; regex = re.compile(r'<[^>]*>')"
>>> timeit("regex.match('<aaaaaaaaaaaaaaaaaaaaaaaaaaaaa')", setup=setup_str)
0.36812841705977917
>>> setup_str = "import re; regex = re.compile(r'<[^>]*+>')"
>>> timeit("regex.match('<aaaaaaaaaaaaaaaaaaaaaaaaaaaaa')", setup=setup_str)
0.32666016498114914

string

When people need string templating, they’ll often turn to third party modules such as Jinja. As I’ve mentioned in a previous article I’m a big fan of Jinja, but its power and complexity are overkill in simple cases. For these cases it’s easy to forget that there’s string.Template sitting right there in the standard library.

>>> from string import Template
>>> template = Template("$first's on first, $second's on second, $third's on third.")
>>> template.substitute({"first": "Who", "second": "What", "third": "I Don't Know"})
"Who's on first, What's on second, I Don't Know's on third."

It’s always been a little light on features, as a matter of design, but in Python 3.11 it has a couple of additional methods which could be useful. The first of these is get_identifiers() which lists the parameters to the template — i.e. those keys which the dict passed to substitute() would be expected to supply.

>>> template.get_identifiers()
['first', 'second', 'third']

The second method is is_valid(), which checks to see whether the template has any errors which would cause a call to substitute() to raise a ValueError. The snippet below illustrates an example of an invalid template.

>>> bad_template = Template("$person owes me $100")
>>> bad_template.is_valid()
False
>>> bad_template.substitute({"person": "Andy"})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/string.py", line 121, in substitute
    return self.pattern.sub(convert, self.template)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/string.py", line 118, in convert
    self._invalid(mo)
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/string.py", line 101, in _invalid
    raise ValueError('Invalid placeholder in string: line %d, col %d' %
ValueError: Invalid placeholder in string: line 1, col 17

These methods will likely prove particularly useful where templates are defined in one part of the code but substituted in another. In particular, if a library accepts a template as a parameter then these functions will allow the library to do some basic sanity checks of the template that’s been passed before it’s used. For example, if you use a template to define the format of a warning email to send to customers, it’s nice to know it has a bug in it before a situation where such emails need to be sent.

Data Types

The second, and final, area of the standard library changes we’ll look at in this article are a few changes around data types.

  • datetime has a new convenience alias for UTC zone, and more flexible parsing of ISO 86001 formats.
  • enum has a whole raft of assorted changes including a new StrEnum class, improvements to string representations, and decorators for validation and conversion.

datetime

The changes to datetime are fairly simple, particularly the first one — you can now use datetime.UTC as an alias for datetime.timezone.utc. Because UTC is used so commonly, this simple change is actually quite convenient.

The second change is around the fromisoformat() methods of date, time and datetime classes. Previously these methods had a fairly limited scope which was to parse any format that the isoformat() methods would generate. As of Python 3.11, they should accept any of the formats defined in ISO 8601, with the exception of those which use fractional hours and minutes — those if you who hate the idea you might need to say “15 seconds” instead of “0.25 minutes” are in for disappointment, I’m afraid.

One aspect of the parsing that’s a little more flexible than the ISO standard is that the T separator can be any non-numeric character — in particular a space will also work, which is pretty helpful. However, it must be a single character, so if you’re passing strings supplied by a user, you may still want to normalise them a little.

>>> import datetime
>>> datetime.UTC
datetime.timezone.utc
>>> datetime.date.fromisoformat("2023-01-15")
datetime.date(2023, 1, 15)
>>> datetime.time.fromisoformat("13:15:25.112")
datetime.time(13, 15, 25, 112000)
>>> datetime.datetime.fromisoformat("2023-01-15T13:15:25.112Z")
datetime.datetime(2023, 1, 15, 13, 15, 25, 112000, tzinfo=datetime.timezone.utc)
>>> datetime.datetime.fromisoformat("2023-01-15T13:15+0100")
datetime.datetime(2023, 1, 15, 13, 15, tzinfo=datetime.timezone(datetime.timedelta(seconds=3600)))
>>> datetime.datetime.fromisoformat("2023-01-15 13:15")
datetime.datetime(2023, 1, 15, 13, 15)
>>> datetime.datetime.fromisoformat("2023-01-15  13:15")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: Invalid isoformat string: '2023-01-15  13:15'

enum

The enum module has had a lot of love in this release, with a number of improvements in different areas. It’s rather hard to summarise them in some unifying manner, so I’ll just jump right in.

StrEnum and ReprEnum

The class hierarchy within enum has changed a little in this release, with the specialisations using a new ReprEnum class. Whereas Enum uses its own representations for both the __repr__() and __str__() of instances, ReprEnum only uses __repr__() from the base Enum — it leaves the __str__() representation the same as the specialised type it’s representing (e.g. int for IntEnum). This allows enumeration values to behave more like their real types in more situations.

Python 3.11 enum class hierarchy

There’s also a new StrEnum specialisation for when enumeration values should be treated as strings. Whereas a normal Enum will allow its values to be usable as strings, they’re not stored as such natively. StrEnum, however, always makes its values strings, and will raise TypeError if any of its values are not strings or trivially coerced to strings.

The snippet below shows various ways in which Enum and StrEnum differ in their behaviour.

>>> import enum
>>> class NormalEnum(enum.Enum):
...     ONE = "one"
...     TWO = ("two",)
...
>>> class StringEnum(enum.StrEnum):
...     ONE = "one"
...     TWO = ("two",)
...
>>> NormalEnum.TWO
<NormalEnum.TWO: ('two',)>
>>> StringEnum.TWO
<StringEnum.TWO: 'two'>
>>> str(NormalEnum.ONE)
'NormalEnum.ONE'
>>> str(StringEnum.ONE)
'one'
>>> NormalEnum("two")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/enum.py", line 715, in __call__
    return cls.__new__(cls, value)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/enum.py", line 1131, in __new__
    raise ve_exc
ValueError: 'two' is not a valid NormalEnum
>>> StringEnum("two")
<StringEnum.TWO: 'two'>
>>>
>>> class Foo(enum.Enum):
...     ONE = 123
...
>>> class Foo(enum.StrEnum):
...     ONE = 123
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/enum.py", line 558, in __new__
    raise exc
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/enum.py", line 259, in __set_name__
    enum_member = enum_class._new_member_(enum_class, *args)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/enum.py", line 1276, in __new__
    raise TypeError('%r is not a string' % (values[0], ))
TypeError: 123 is not a string

The values of StrEnum are usable in most situations where str is accepted, provided code is well-behaved and uses isinstance() to check for subclasses of str. If code checks specifically for str itself, of course, it will fail.

>>> isinstance(StringEnum.ONE, str)
True
>>> type(StringEnum.ONE) == str
False

Flag and IntFlag Boundaries

The Flag enumeration is used where you need a single value which represents the state of multiple binary flags — this is what you’d use a bit field for in some other languages.

The change for this class in Python 3.11 is the addition of a boundary class parameter to indicate how out-of-range values should be handled — i.e. if bits are set which don’t correspond to any of the items defined in the enumeration. The valid values for this parameter are defined by the enum.FlagBoundary enumeration, and are as follows:

STRICT
If the provided value has any bits set which don’t correspond to valid enumeration members, a ValueError is raised.
CONFORM
Any bits which don’t correspond to valid enumeration members are ignored, but any bits which do are respected. This is the default for a Flag enumeration.
EJECT
If any bits don’t correspond to valid enumeration members, the whole value is converted to an int and returned as such.
KEEP
Invalid values are maintained, even if they don’t correspond to enumeration members. This is the default for IntFlag.

The snippet below demonstrates these cases.

>>> class StrictEnum(enum.Flag, boundary=enum.STRICT):
...     ONE = enum.auto()
...
>>> StrictEnum(2**2)
Traceback (most recent call last):
# (Traceback removed for brevity)
ValueError: <flag 'StrictEnum'> invalid value 4
    given 0b0 100
  allowed 0b0 001
>>>
>>> class ConformEnum(enum.Flag, boundary=enum.CONFORM):
...     ONE = enum.auto()
...
>>> ConformEnum(1 + 2**2 + 2**3)
<ConformEnum.ONE: 1>  
>>>
>>> class EjectEnum(enum.Flag, boundary=enum.EJECT):
...     ONE = enum.auto()
...
>>> EjectEnum(1 + 2**2)
5
>>>
>>> class KeepEnum(enum.Flag, boundary=enum.KEEP):
...     ONE = enum.auto()
...
>>> KeepEnum(1 + 2**2 + 2**3)
<KeepEnum.ONE|12: 13>

@verify

There’s a new @verify decorator which can impose specific constraints on the enumeration you declare. The constraint you impose is controlled by passing a parameter to the decorator which is a member of the EnumCheck enumeration. The values, and the conditions they impose, are:

UNIQUE
Ensure that the value of each enumeration member is unique — i.e. no two names refer to the same underlying value.
CONTINUOUS
For use with integer-valued enumerations, ensures that there are no missing numbers between the lowest and highest value.
NAMED_FLAGS
For use with Flag and IntFlag, ensures that for any values which refer to multiple flags (i.e. a value with multiple bits set), all of the values correspond to valid members of the enumeration.

These three cases are demonstrated below by showing cases which fail their validations.

>>> @enum.verify(enum.UNIQUE)
... class MyEnum(enum.Enum):
...     ONE = 1
...     TWO = 2
...     EIN = 1
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/enum.py", line 1827, in __call__
    raise ValueError('aliases found in %r: %s' %
ValueError: aliases found in <enum 'MyEnum'>: EIN -> ONE
>>>
>>> @enum.verify(enum.CONTINUOUS)
... class MyEnum(enum.Enum):
...     ONE = 1
...     TWO = 2
...     FOUR = 4
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/enum.py", line 1848, in __call__
    raise ValueError(('invalid %s %r: missing values %s' % (
ValueError: invalid enum 'MyEnum': missing values 3
>>>
>>> @enum.verify(enum.NAMED_FLAGS)
... class MyEnum(enum.Flag):
...    ONE = 1
...    TWO = 2
...    THREE = 4
...    FOUR = 8
...    ONE_AND_TWO = 3
...    UNKNOWN_AND_FOUR = 24
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/andy/.pyenv/versions/3.11.1/lib/python3.11/enum.py", line 1881, in __call__
    raise ValueError(
ValueError: invalid Flag 'MyEnum': alias UNKNOWN_AND_FOUR is missing value 0x10 [use enum.show_flag_values(value) for details]

@member and @nonmember

Typically, every attribute of an enumeration subclass is converted to a member of the enumeration. However, there may be circumstances where you want to define, say, nested classes or other attributes.

I’ll be honest, I struggled to think of concrete cases where this would be useful, since I don’t tend to pile additional functionality into my enumerations — they’re almost always just bare classes which are members of another class or module which provides the related functionality. But I think it’s good to broaden your horizons, so in Python 3.11 it’s possible to specify whether or not a given attribute should be a member of the enumeration using the member() and nonmember() functions.

This would probably take five times as long to explain as demonstrate, so hopefully the snippet below will make things a little clearer. But even if it doesn’t, I suspect the chances are good that you’ll never need this anyway.

>>> class MyEnum(enum.Enum):
...     MEMBER_ITEM = 100
...     NON_MEMBER_ITEM = enum.nonmember(200)
...
>>> MyEnum.MEMBER_ITEM
<MyEnum.MEMBER_ITEM: 100>
>>> MyEnum.NON_MEMBER_ITEM
200
>>> MyEnum.__members__
mappingproxy({'MEMBER_ITEM': <MyEnum.MEMBER_ITEM: 100>})

@property

Similar to the builtin @property decorator, there’s a new @enum.property decorator specific for enumeration classes. The purpose of this is to define properties in a way which won’t clash with enumeration members, even if the names are the same. The only requirement is that the two definitions don’t occur in the same class — the property is defined in a base class.

Similar to @nonmember, I struggle a little to think of real world cases where you’d want to merge enumeration functionality into another class, but if that sort of thing floats your boat then you can see an example of its use below.

>>> class MyBaseEnum(enum.Enum):
...     @enum.property
...     def ATTR(self):
...         return 123
...
>>> class MyDerivedEnum(MyBaseEnum):
...     ATTR = 456
...
>>> MyDerivedEnum.ATTR
<MyDerivedEnum.ATTR: 456>
>>> x = MyDerivedEnum(456)
>>> x.ATTR
123

@global_enum

Another change in this release is the addition of the @global_enum decorator. This is intended for cases where the enumeration values will be promoted to module-level names, and it modifies the repr() and str() results to be consistent with this. You can see the difference in the output in the snippet below.

>>> class MyNormalEnum(enum.Enum):
...     ONE = 1
...     TWO = 2
...
>>> repr(MyNormalEnum.ONE)
'<MyNormalEnum.ONE: 1>'
>>> str(MyNormalEnum.TWO)
'MyNormalEnum.TWO'
>>>
>>> @enum.global_enum
... class MyGlobalEnum(enum.Enum):
...     ONE = 1
...     TWO = 2
...
>>> repr(MyGlobalEnum.ONE)
'__main__.ONE'
>>> str(MyGlobalEnum.TWO)
'TWO'

Flag Enumeration Membership

When members of an enum.Flag class cover multiple bits, it’s reasonable to assume you might be able to treat them a bit like a frozenset of enumeration members. In support of this, Python 3.11 adds support for len(), list(), set() and membership tests on these values, as demonstrated in the snippet below.

>>> class MyFlagEnum(enum.Flag):
...     FOO = 1
...     BAR = 2
...     BAZ = 4
...     ALL_ITEMS = 7
...
>>> len(MyFlagEnum.ALL_ITEMS)
3
>>> set(MyFlagEnum.ALL_ITEMS)
{<MyFlagEnum.BAZ: 4>, <MyFlagEnum.BAR: 2>, <MyFlagEnum.FOO: 1>}
>>> list(MyFlagEnum.ALL_ITEMS)
[<MyFlagEnum.FOO: 1>, <MyFlagEnum.BAR: 2>, <MyFlagEnum.BAZ: 4>]
>>> MyFlagEnum.FOO in MyFlagEnum.ALL_ITEMS
True
>>> MyFlagEnum.BAR not in MyFlagEnum.BAZ
True

Conclusion

Phew, that was a strong finish on the enum module changes — I was beginning to wonder if that should have had its own whole article…!

Still, despite some of the enum changes being a little obscure, there’s some useful changes buried in there. Looking back over the rest of the article, I must admit the feature that I’m most pleased to see are the improvements to datetime — writing code to parse dates is really dull, so I’m happy to see the need for that becoming less likely.

Next time I’ll finish off the look at Python 3.11 by covering the remaining changes of interest in the standard library.

The next article in the “Python 3 Releases” series is What’s New in Python 3.11 - Improved Modules II
Sun 22 Jan, 2023
17 Jan 2023 at 8:30PM in Software
 |  |