In this series looking at features introduced by every version of Python 3, we finish our look at the updates in Python 3.6. This third and final article looks at the updates to library modules in this release. These include some asyncio improvements, new enumeration types and some new options for use with sockets and SSL.
This is the 14th of the 32 articles that currently make up the “Python 3 Releases” series.
As well as all the more significant changes in Python 3.6, which were outlined in the previous two articles, there are also the usual slew of standard library updates and additions. We’ll look at most of these in this article.
We’ll kick off with some small changes to the zipfile
module.
First up, the ZipInfo
class, which represents the metainformation about a member of a zip file archive, has a new class method from_file()
which allows creation of a ZipInfo
instance from the specified filename. This allows code to create an instance from a file, but then override specific fields (e.g. last modified time) and then add it to an archive.
There’s also a ZipInfo.is_dir()
method which is the equivalent of os.path.isdir()
for a zip file entry, and ZipFile.open()
now allows data to be added / updated in a zipfile, instead of just being used to extract it.
Together these changes make for a more flexible interface for manipulating ZIP files.
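As a rough sketch of how these pieces fit together, the snippet below builds a ZipInfo from a file on disk, overrides its modification time and then writes the member through ZipFile.open() in write mode. The filename, contents and timestamp are purely illustrative assumptions.

```python
import io
import os
import tempfile
import zipfile

buffer = io.BytesIO()

with tempfile.TemporaryDirectory() as tmpdir:
    # Create a throwaway file to archive (name and contents are illustrative).
    source_path = os.path.join(tmpdir, "notes.txt")
    with open(source_path, "wb") as f:
        f.write(b"Hello, zip\n")

    # Build the metadata from the file, then override a field before adding it.
    info = zipfile.ZipInfo.from_file(source_path, arcname="notes.txt")
    info.date_time = (2017, 1, 1, 0, 0, 0)

    with zipfile.ZipFile(buffer, mode="w") as archive:
        # ZipFile.open() in write mode returns a writable file-like object.
        with open(source_path, "rb") as src, archive.open(info, mode="w") as member:
            member.write(src.read())

# Read the archive back to check what was stored.
with zipfile.ZipFile(buffer) as archive:
    stored = archive.getinfo("notes.txt")
    is_directory = stored.is_dir()
    stored_date = stored.date_time
    contents = archive.read("notes.txt")

print(is_directory, stored_date, contents)
```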
There’s a minor change to concurrent.futures
where the ThreadPoolExecutor
constructor now accepts a thread_name_prefix
parameter, which allows you to insert some uniquely identifying string as a prefix of the names of the threads created by it. This could be extremely helpful when debugging issues in heavily multithreaded applications, where it can become quite confusing which threads were created where.
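A minimal sketch of the new parameter is shown below. The prefix "downloader" and the helper function are hypothetical names chosen for illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def current_thread_name(_item):
    # Each task simply reports the name of the worker thread running it.
    return threading.current_thread().name


with ThreadPoolExecutor(max_workers=2, thread_name_prefix="downloader") as pool:
    names = list(pool.map(current_thread_name, range(4)))

# Every name starts with the chosen prefix, followed by a worker index.
print(sorted(set(names)))
```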
There are also some asyncio
changes substantial enough to warrant their own section.
There are a whole host of smaller changes to asyncio
, whose API is now considered stable as of this release. Many of these were backported to Python 3.5.1, and hence I discussed them in the earlier article on changes in 3.5, and there was also some discussion in the section on asynchronous generators in the first Python 3.6 article.
There are a few notable changes I didn’t cover1 in any earlier articles, however:
get_event_loop() always returns current loop: When called from within coroutines and callbacks, get_event_loop() now always returns the currently executing loop. Outside these contexts the previous behaviour still applies, which is to return whatever loop has been set within the current thread by calling set_event_loop(), or to raise RuntimeError, except in the main thread where an event loop is created on demand.
Transport.is_closing() added: Returns True if the specified transport is closed or in the process of closing.
loop.stop() behaviour changed: There’s a change to loop.stop() which alters the behaviour around the execution of scheduled callbacks. Previously, calling stop() scheduled a callback which raised a private exception which stopped the loop. This has changed to simply set a flag in the loop which is checked every time a complete cycle of the loop is run. This means that if stop() is called by a callback, only the remainder of the current loop cycle executes: any further callbacks scheduled by the existing callbacks being run will not run until the loop is started again later. If stop() is called prior to the loop running, it’ll run one complete cycle and then exit. You can find more discussion in this Python issue report if you want the details.
loop.connect_accepted_socket() added: This is for servers which accept connections outside asyncio, but want to use asyncio to handle data transfer on the resulting connection.
TCP_NODELAY by default: TCP transports now set the TCP_NODELAY option by default to disable Nagle’s algorithm, which can badly impact performance in modern TCP/IP stacks for some workloads, particularly when combined with delayed ACKs. If you have a use-case where Nagling would actually be preferable, you can always set the option yourself using transport.get_extra_info("socket") to obtain the socket.socket instance and calling setsockopt() on it.
Future and Task now have fast C implementations, improving performance by up to 30%.

The hashlib
module has acquired support for some new algorithms in this release.
Firstly, there are two new functions which implement the BLAKE2 hashing algorithm, defined in RFC 7693. The creators claim that it’s both fast and highly secure, and they make some persuasive arguments.
The algorithm comes in two variants: BLAKE2b is optimised for 64-bit platforms, producing digests of up to 64 bytes, and BLAKE2s is optimised for smaller platforms down to 8-bit, producing digests of up to 32 bytes. The two variants are supported with the functions blake2b()
and blake2s()
respectively. I’m guessing most users of Python can assume a 64-bit platform these days, but it’s useful to have both variants in case you’re interoperating with smaller platforms.
Unlike some hashing algorithms, BLAKE2 allows any digest size up to the variant’s maximum to be generated, which needs to be selected when the hashing object is created. As you can see from the snippet below, this is an integral part of the algorithm as opposed to simply being a truncation operation at the end.
>>> import hashlib
>>> hasher = hashlib.blake2b()
>>> hasher.update(b"Hello, world\n")
>>> hasher.hexdigest()
'3028a38d034e6e5ef7bda22013d4fa20ca5cfb1fc48f8ef0984fba2cbcf9650dec54be93f51ea3f6fdc39e68473abf00a1dca08672f4dd8201b171bb01ad3129'
>>> hasher = hashlib.blake2b(digest_size=32)
>>> hasher.update(b"Hello, world\n")
>>> hasher.hexdigest()
'202e210b86fa26eb50f3ca8cc268db9f1bd83687a4d91ec37f16e062b6a7362e'
Both of these variants support a salt
parameter for salted hashing, as well as a personal
parameter for an additional “personalisation string” — this is like an additional salt which is set according to the context in which the hash is generated, to reduce the chances that a hash generated in one part of the code can be reused by an attacker to bypass a different protection which is using the same algorithm. The functions also take a key
parameter, so BLAKE2 can be used to implement a form of HMAC3.
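To illustrate the keyed mode and personalisation string together, here’s a small sketch. The sign() helper, the key and the two context strings are all hypothetical, and this isn’t intended as a recommendation over the hmac module.

```python
import hashlib
import hmac


def sign(message, key, context):
    # Keyed BLAKE2b with a personalisation string (at most 16 bytes for
    # blake2b) so that tags from one context are useless in another.
    hasher = hashlib.blake2b(key=key, person=context, digest_size=16)
    hasher.update(message)
    return hasher.hexdigest()


key = b"illustrative secret key"
tag = sign(b"user=andy", key, b"session-cookie")
other_tag = sign(b"user=andy", key, b"password-reset")

# Same message and key, but different contexts give unrelated tags.
print(tag != other_tag)
# Use a constant-time comparison when verifying a tag.
print(hmac.compare_digest(tag, sign(b"user=andy", key, b"session-cookie")))
```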
SHA-3 support has also been added at various hash sizes with the functions sha3_224()
, sha3_256()
, sha3_384()
and sha3_512()
. There’s also support for the SHAKE extendable-output function (XOF) to generate arbitrary length digests, and this is provided by the shake_128()
and shake_256()
functions. Although the digest length is arbitrary, the functions only offer 128- and 256-bit security against collisions and other attacks. Unlike the earlier BLAKE2 functions, the SHAKE functions allow you to select the digest length in bytes at generation time.
>>> import hashlib
>>> hasher = hashlib.shake_256()
>>> hasher.update(b"Hello, world\n")
>>> hasher.hexdigest(8)
'd1be3aba87b7379f'
>>> hasher.hexdigest(32)
'd1be3aba87b7379f90f9b3014ef68cd940ed41dd2088e878ebdc9866bba9e254'
Finally, there’s also a new scrypt()
function which, unsurprisingly, implements the scrypt password-based key derivation function. This is designed to require far more memory than earlier approaches like PBKDF2, which makes it expensive to implement in custom hardware and thus harder to crack. If you want to use this function, I’d recommend doing some reading around it to select appropriate values for the mandatory n
, r
and p
parameters — in particular if you choose an insufficient n
then you’ll impair the security.
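For reference, a call looks something like the sketch below. The parameter values here are illustrative assumptions only, not a security recommendation, and the function is only available if Python was built against a sufficiently recent OpenSSL.

```python
import hashlib
import os

password = b"correct horse battery staple"
salt = os.urandom(16)  # store this alongside the derived key

# n is the CPU/memory cost, r the block size and p the parallelisation
# factor. These values are illustrative, so check current guidance.
derived = hashlib.scrypt(
    password,
    salt=salt,
    n=2**14,
    r=8,
    p=1,
    maxmem=64 * 1024 * 1024,
    dklen=32,
)

print(len(derived))
```

The maxmem argument is raised here so that the memory required by the chosen n and r fits comfortably within the limit.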
It’s good to see hashlib
keeping up with newer algorithms, especially scrypt()
given the advances in the ability of attackers to crack password hashes.
There are some new abstract base classes and a couple of changes to namedtuple
in collections
, as well as some new enum
base types.
There are some new abstract base classes in collections.abc
to round out some gaps:
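Collection: A sized iterable container, inheriting from Container, Sized and Iterable2, but not indexable or reversible as a Sequence is.
Reversible: An Iterable which also provides __reversed__().
AsyncGenerator: An abstract base class for asynchronous generators.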
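A quick sketch of how the first two of these behave with isinstance() and issubclass() checks is below. The Bag class is a hypothetical example which provides only the three methods that Collection requires.

```python
from collections.abc import Collection, Reversible, Sequence


class Bag:
    """A minimal container: sized and iterable, but not indexable."""

    def __init__(self, items):
        self._items = list(items)

    def __contains__(self, item):
        return item in self._items

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


bag = Bag([1, 2, 3])
print(isinstance(bag, Collection))    # has __contains__, __iter__, __len__
print(isinstance(bag, Sequence))      # not registered as a Sequence
print(isinstance(bag, Reversible))    # no __reversed__ method
print(issubclass(list, Reversible))   # list defines __reversed__
```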
There are also a couple of changes to collections.namedtuple
. Firstly, you can pass a module
keyword parameter to set the __module__
attribute of the constructed class. Secondly, the verbose
and rename
parameters are now keyword-only — this change is not backwards compatible, but the risk of breakage in real code was deemed acceptably low.
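Both changes can be seen in this short sketch, where the type names and the module name "geometry" are made up for illustration.

```python
from collections import namedtuple

# The module keyword sets __module__ on the generated class.
Point = namedtuple("Point", ["x", "y"], module="geometry")
print(Point.__module__)

# rename must now be passed by keyword; invalid field names such as the
# keyword "class" are replaced with positional names.
Row = namedtuple("Row", ["id", "class"], rename=True)
print(Row._fields)

# Passing it positionally is rejected.
try:
    namedtuple("Row", ["id", "class"], True)
    positional_allowed = True
except TypeError:
    positional_allowed = False
print(positional_allowed)
```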
Finally a bit of an obscure one, but it’s now possible to pickle
a recursive collections.deque
— that is, if the deque
contains a reference to itself. Here’s what used to happen in Python 3.5:
>>> import collections
>>> import pickle
>>> x = collections.deque()
>>> x.append(x)
>>> pickle.dumps(x)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded while pickling an object
Python 3.6, on the other hand, handles this fine:
>>> x = collections.deque()
>>> x.append(x)
>>> pickle.dumps(x)
b'\x80\x03ccollections\ndeque\nq\x00)Rq\x01h\x01a.'
Back in the first Python 3.4 article we saw the addition of the enum
module, with the base classes Enum
and IntEnum
. In Python 3.6 these have been joined by two new base classes, Flag
and IntFlag
.
These are similar in use to Enum
and IntEnum
, except that they support bitwise operations and still remain the same type. This is easiest to explain with an example — in the code snippet below you can see usage of both the original IntEnum
class and the new IntFlag
class.
>>> import enum
>>> class NormalIntEnum(enum.IntEnum):
...     ONE = enum.auto()
...     TWO = enum.auto()
...     THREE = enum.auto()
...
>>> class FlagIntEnum(enum.IntFlag):
...     ONE = enum.auto()
...     TWO = enum.auto()
...     THREE = enum.auto()
...
>>> NormalIntEnum.THREE.value
3
>>> FlagIntEnum.THREE.value
4
>>> NormalIntEnum.ONE | NormalIntEnum.TWO | NormalIntEnum.THREE
3
>>> FlagIntEnum.ONE | FlagIntEnum.TWO | FlagIntEnum.THREE
<FlagIntEnum.THREE|TWO|ONE: 7>
>>> (FlagIntEnum.ONE | FlagIntEnum.TWO | FlagIntEnum.THREE).value
7
>>> (FlagIntEnum.ONE | FlagIntEnum.TWO | FlagIntEnum.THREE) & 3
<FlagIntEnum.TWO|ONE: 3>
The enumeration values are automatically numbered with enum.auto()
, but note that the THREE
value differs between the two. That’s because in a standard enumeration, consecutive integers 1
, 2
, 3
, 4
, etc. will be used. To support bitwise operations, however, the values become bits, so the values are powers of two 1
, 2
, 4
, 8
, etc. That’s why NormalIntEnum.THREE
is 3
whereas FlagIntEnum.THREE
is 4
.
When we apply the bitwise “or” operator |
to the NormalIntEnum
values, it’s successful but the result is a normal int
. When we do it to the FlagIntEnum
values, however, the result is another FlagIntEnum
which represents the union of those values. You can also mask off values using int
directly as you can see from the final line.
These new base classes are pretty handy for efficiently storing flags, and potentially for representing more readably the file and network protocol formats which use bitfields for flags. The fact that they interoperate gracefully with int
values is a nice touch in particular.
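The plain Flag base class works in much the same way but without the int interoperability. The sketch below uses a hypothetical Permission enumeration to show combining, membership tests and masking.

```python
import enum


class Permission(enum.Flag):
    READ = enum.auto()
    WRITE = enum.auto()
    EXECUTE = enum.auto()


granted = Permission.READ | Permission.WRITE

# Membership tests check whether all bits of the left operand are set.
can_write = Permission.WRITE in granted
can_execute = Permission.EXECUTE in granted

# Bits can be cleared by masking with the inverse.
without_write = granted & ~Permission.WRITE

print(can_write, can_execute, without_write == Permission.READ)

# Unlike IntFlag, a plain Flag refuses to mix with bare integers.
try:
    granted | 1
    mixes_with_int = True
except TypeError:
    mixes_with_int = False
print(mixes_with_int)
```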
As well as the changes for local time disambiguation we already covered in the previous article, there are some other improvements to the datetime
module.
New strftime() directives: datetime.strftime() and date.strftime() now support some new directives for ISO 8601 formatted dates and times.
%G appears very similar to %Y in that it produces a four-digit year, but apply caution here: the year is the ISO year, which is the one that contains most of the current ISO week. This article has a great discussion of why this probably isn’t what you want unless you’re really sure.
%u is the ISO weekday in the range [1, 7] where 1 is Monday.
%V is the current ISO week as a 2-digit number. ISO defines week 01 as being that which contains 4th January of that year.
datetime.isoformat() accepts timespec parameter: The isoformat() function formats the specified datetime in the ISO format, for example 2021-08-01T13:54:06. The T separator can be customised with the sep parameter, and if the datetime object has a non-zero microsecond attribute then that’s appended, e.g. 2021-08-01T13:54:06.015661. So far no change since Python 3.5.
What’s new is that in Python 3.6 there’s a new timespec parameter to enable you to truncate the string at different points. For example, hours will construct a timestamp like 2021-08-01T13. There are also minutes, seconds, milliseconds and microseconds values. The previous behaviour can be selected with auto, which is also the default, and which selects between seconds and microseconds depending on whether the microsecond attribute is zero.
datetime.combine() accepts tzinfo parameter: This class method combines a date object and a time object into a datetime. It now accepts a tzinfo parameter which can be used to override the tzinfo from the time object.

The timeit
module has a new autorange()
method which you can use when you’re not sure how many iterations would be most appropriate to test your code. It calls timeit()
with increasing numbers of iterations until the runtime exceeds 200ms.
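A minimal sketch of using it from code is shown below. The statement being timed is just an arbitrary example.

```python
import timeit

timer = timeit.Timer(
    "sorted(data)",
    setup="data = list(range(1000, 0, -1))",
)

# autorange() returns the loop count it settled on and the total time
# taken for that many loops.
loops, elapsed = timer.autorange()
per_loop = elapsed / loops

print(f"{loops} loops, {per_loop * 1e6:.1f} usec per loop")
```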
Another improvement is that if multiple repetitions (not iterations!) have times that differ by more than a factor of four, a warning is emitted using the standard warnings
module.
$ python -m timeit -s 'import time; import random' -n 1 -r 5 'time.sleep(random.random())'
1 loops, best of 5: 67.8 msec per loop
:0: UserWarning: The test results are likely unreliable. The worst
time (798 msec) was more than four times slower than the best time.
There are also a couple of improvements to unittest.mock
:
assert_called() and assert_called_once() methods have been added to Mock.
reset_mock() now accepts two optional keyword-only parameters, return_value and side_effect, which default to False but can be set to True to additionally reset those behaviours to their default state.

First up, some smaller changes in the email
and http.client
modules that don’t warrant their own sections, and then some socket-related changes which are numerous enough that they do.
The email
policy framework which we covered in a previous article on Python 3.3 is no longer provisional, and the documentation has been updated to focus on it.
Also, the email.mime
classes all now accept a policy
keyword to specify the policy to use, as does the email.generator.DecodedGenerator
constructor. In addition there’s a new message_factory
attribute of policies which specifies the callable used to construct new messages. For the compatibility compat32
policy the default is Message
, for everything else it’s the newer EmailMessage
class.
The http.client
module has a small but useful change: HTTPConnection.request()
and endheaders()
now support chunked encoding of request bodies. I’ve written in the past about how it’s always been a source of frustration to me that chunked encoding of HTTP requests isn’t better supported, so this is a welcome change.
In a related change, urllib.request
also supports this in the AbstractHTTPHandler
class which is the base class for both HTTPHandler
and HTTPSHandler
. If no Content-Length
header is added, and the request body isn’t a fixed-sized bytes
object, then chunked encoding will be used.
The socket
module has support for new option constants for use with getsockopt()
on systems which support them (at least Linux but not, for example, macOS).
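SO_DOMAIN: Returns the domain (address family) of the socket, e.g. AF_INET. This is read-only.
SO_PROTOCOL: Returns the protocol of the socket, e.g. IPPROTO_IPV4 (see footnote 6). This is read-only.
SO_PEERSEC: A read-only option which returns the security context of the peer socket.
SO_PASSSEC: Applies to Unix domain sockets (AF_UNIX) and controls whether the SELinux security label of the peer socket can be received in an ancillary message of type SCM_SECURITY. If that means nothing to you, don’t let it worry you: it’s another esoteric one that only a few people will likely care about.

There’s also a new form of setsockopt()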
which is apparently required in some cases. The existing options take a level
parameter, often SOL_SOCKET
, and an optname
to specify which option. There’s then a value
parameter which is either an int
or a bytes
object specifying a buffer in a format which is specific to each option. It seems that this isn’t sufficient for some cases, however, which require a NULL
to be passed for the buffer, with the length of the buffer being used by the option in question instead of any data in the buffer. The new form supports this by allowing a None
as the value and an optlen
parameter which is mandatory if None
is used.
Here’s an example of this from the standard library:
algo.setsockopt(socket.SOL_ALG, socket.ALG_SET_AEAD_AUTHSIZE, None, taglen)
Another enhancement to socket
is support for the AF_ALG
family for interfacing with the Linux Kernel crypto API. Needless to say, this is only available on Linux. The SOL_ALG
and ALG_*
constants were added to socket
, as well as sendmsg_afalg()
which is a specialised form of sendmsg()
which sets various parameters of an AF_ALG
socket.
Sticking with Linux-specific changes, there are two more constants added. These are both used with level
as socket.IPPROTO_TCP
.
TCP_USER_TIMEOUT: Specifies the maximum time in milliseconds that transmitted data may remain unacknowledged before the connection is forcibly closed and ETIMEDOUT is returned. Specifying zero will use the system default. Setting a low value is helpful where “fail fast” behaviour is desirable, whereas large values are useful where connections should persist even in the presence of extended periods of disconnection.
TCP_CONGESTION: Sets the congestion control algorithm used by the socket. Unprivileged processes are restricted to the algorithms listed in tcp_allowed_congestion_control under /proc, but privileged processes with the CAP_NET_ADMIN capability can use any setting. On Linux, to see what’s available (not just allowed) take a look at tcp_available_congestion_control under /proc.

As well as socket
changes, there are also a couple of changes in the higher-level socketserver
module.
First up, the server classes in socketserver
now support the context-manager protocol, to make it easier to ensure server_close()
is called:
with socketserver.TCPServer(("", 1234), MyHandler) as server:
    server.serve_forever()
Secondly, the wfile
attribute of StreamRequestHandler
has been updated to implement the io.BufferedIOBase
interface. Prior to this change, it was a simple wrapper around an underlying filehandle and therefore could perform partial writes, which library code was often ill equipped to handle. With this change, a write()
call will use the sendall()
method on the socket, and so blocks until all data has been successfully written.
The ssl
module has also received some more improvements this release to improve security and add support for OpenSSL 1.1.0. The insecure Triple DES cipher has been dropped from the default cipher suites, and the ChaCha20 cipher and Poly1305 MAC, which were standardised in RFC 8439, have been added.
The SSLContext
class now has some more secure defaults, which used to be overridden in ssl.create_default_context()
instead. PROTOCOL_TLS
, added in this release, is now the default and it always selects the highest version supported by both client and server. Additionally, SSLv2 and v3 are explicitly disabled by default due to their insecurity. Only cipher suites that OpenSSL classifies as HIGH
encryption are included, with no MD5-based ciphers. There are some other more secure default settings which are a bit more involved so I’ll pass over them here.
As well as the new PROTOCOL_TLS
, there are also two more specific protocols that can be selected, PROTOCOL_TLS_CLIENT
and PROTOCOL_TLS_SERVER
, which are only suitable for use with the client or server end of the connection respectively. The reason for the difference is that the sensible and secure defaults differ between these two ends. For example, PROTOCOL_TLS_CLIENT
enables CERT_REQUIRED
and check_hostname
by default.
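You can check the differing defaults for yourself without opening any connections, as in this small sketch.

```python
import ssl

client_context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
server_context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)

# The client-side protocol verifies certificates and hostnames by default.
print(client_context.verify_mode == ssl.CERT_REQUIRED)
print(client_context.check_hostname)

# The server-side protocol does not ask for client certificates by default.
print(server_context.verify_mode == ssl.CERT_NONE)
print(server_context.check_hostname)
```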
To see which ciphers are in use, the SSLContext
class also has a new get_ciphers()
method which returns details of all the ciphers that are available and enabled, in priority order.
>>> pprint.pprint(context.get_ciphers()[0])
{'aead': True,
'alg_bits': 256,
'auth': 'auth-any',
'description': 'TLS_AES_256_GCM_SHA384 TLSv1.3 Kx=any Au=any '
'Enc=AESGCM(256) Mac=AEAD',
'digest': None,
'id': 50336514,
'kea': 'kx-any',
'name': 'TLS_AES_256_GCM_SHA384',
'protocol': 'TLSv1.3',
'strength_bits': 256,
'symmetric': 'aes-256-gcm'}
>>> print("\n".join(i["name"] for i in context.get_ciphers()))
TLS_AES_256_GCM_SHA384
TLS_CHACHA20_POLY1305_SHA256
TLS_AES_128_GCM_SHA256
ECDHE-ECDSA-AES256-GCM-SHA384
# ... lots of entries skipped for brevity ...
CAMELLIA128-SHA256
CAMELLIA256-SHA
CAMELLIA128-SHA
There’s a new SSLSession
object which has been added to support copying of an existing session from one client-side connection to another. This allows TLS session resumption which improves performance. It can be even more important for repeated connections to FTP servers as RFC 4217 §10.2 explicitly mentions the possibility that FTP servers may refuse to waste CPU cycles by repeatedly performing TLS negotiation with the same client, returning a 522 error code instead.
The session is exposed as the session
attribute of the SSLSocket
object, so you can retrieve it from your first connection and pass it using a new session
parameter to the SSLContext.wrap_socket()
method for your second and subsequent connections. See the simplified example below:
>>> import socket
>>> import ssl
>>>
>>> context = ssl.create_default_context()
>>> with socket.create_connection(("www.andy-pearce.com", 443)) as sock:
...     with context.wrap_socket(sock,
...                              server_hostname="www.andy-pearce.com",
...                              ) as ssl_sock:
...         session = ssl_sock.session
...         ssl_sock.send(b"GET https://www.andy-pearce.com/blog/ HTTP/1.1\r\n"
...                       b"Host: andy-pearce.com\r\n\r\n")
...         print(ssl_sock.read(8192).splitlines()[0])
...
73
b'HTTP/1.1 200 OK'
>>> session
<_ssl.Session object at 0x10b12b940>
>>>
>>> with socket.create_connection(("www.andy-pearce.com", 443)) as sock:
...     with context.wrap_socket(sock,
...                              server_hostname="www.andy-pearce.com",
...                              session=session
...                              ) as ssl_sock:
...         ssl_sock.send(b"GET https://www.andy-pearce.com/blog/ HTTP/1.1\r\n"
...                       b"Host: andy-pearce.com\r\n\r\n")
...         print(ssl_sock.read(8192).splitlines()[0])
...
73
b'HTTP/1.1 200 OK'
It’s a bit of a shame that the library can’t implement session reuse automatically behind the scenes, but I suppose there are probably cases where that would break things. In any case, at least applications have the option now.
There’s a new contextlib.AbstractContextManager
abstract base class for context managers — that is, objects which implement __enter__()
and __exit__()
methods. For simple cases, the __enter__()
method in the base class returns self
, while __exit__()
is left abstract for subclasses to implement. There’s also a corresponding typing.ContextManager
class for type hints.
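Here’s a minimal sketch of a subclass which relies on the inherited __enter__(). The Stopwatch class is a hypothetical example, not something from the standard library.

```python
import time
from contextlib import AbstractContextManager


class Stopwatch(AbstractContextManager):
    """Records how long the body of the with statement took."""

    def __init__(self):
        self.start = time.perf_counter()
        self.elapsed = None

    # Only __exit__() needs implementing; __enter__() returns self.
    def __exit__(self, exc_type, exc_value, traceback):
        self.elapsed = time.perf_counter() - self.start
        return False  # don't suppress any exception


with Stopwatch() as watch:
    total = sum(range(100000))

print(isinstance(watch, Stopwatch), watch.elapsed >= 0)
```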
On the subject of typing
there are a few more improvements there:
Generic type aliases: Support for generic type aliases is much improved (e.g. Dict[str, Tuple[S, T]]).
typing.Collection added: A generic version of collections.abc.Collection.
typing.TYPE_CHECKING added: A constant which is assumed to be True when type checking, but is False at runtime. This allows conditional behaviours which are only required for type-checking, such as importing additional modules.
typing.NewType() added: Creates lightweight distinct types. For example, JSON = typing.NewType("JSON", str) creates a new type JSON which is considered a subclass of str but is treated as a distinct type. With appropriate annotations, this means the type checker can detect you passing the result of a function that returns (say) XML into a function that expects JSON, even though at runtime it’s all just str.

The decimal.Decimal()
class has a potentially useful as_integer_ratio()
method which converts the decimal into the simplest fraction with the same value. The result is returned as a 2-tuple of (numerator, denominator)
.
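A couple of quick examples, with values chosen arbitrarily:

```python
from decimal import Decimal

three_quarters = Decimal("0.75").as_integer_ratio()
negative = Decimal("-3.14").as_integer_ratio()

# The fraction is in lowest terms, with the sign carried by the numerator.
print(three_quarters)
print(negative)
```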
The constant math.tau
has been added. This is simply defined as \( \tau = 2 \pi \), but this is a more useful constant in many situations. You can read PEP 628 for more details4.
The random
module has also acquired a new choices()
function as a more general form of the existing choice()
. This version selects a specified number of items from the population, with replacement5. There’s also an optional weights
parameter if you want a biased choice.
>>> import random
>>> random.choices(range(10), weights=range(10, 0, -1), k=10)
[3, 4, 5, 6, 0, 4, 2, 5, 4, 6]
Outside of purely statistical analysis, I can see this being useful in applications such as load balancing, where you want to direct a set of requests to servers balanced according to their current available capacity.
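The load-balancing idea might look something like this sketch, in which the server names and capacity figures are invented for illustration and a seeded Random instance is used only to make the run repeatable.

```python
import random
from collections import Counter

# Hypothetical servers mapped to their currently available capacity.
capacity = {"server-a": 8, "server-b": 3, "server-c": 1}

rng = random.Random(42)
assignments = rng.choices(
    list(capacity),
    weights=list(capacity.values()),
    k=1000,
)

# Servers with more capacity should receive proportionally more requests.
counts = Counter(assignments)
print(counts.most_common())
```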
The statistics
module, added back in Python 3.4, has acquired a new harmonic_mean()
function which, unsurprisingly, calculates the harmonic mean of a set of data points. I seem to recall that this mean is often the most appropriate when calculating the average of a set of rates or ratios, but it’s been a very long time since A-Level maths so I’d suggest turning elsewhere to learn about the Pythagorean means.
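As a small worked example of the rates case, assume a journey where the same distance is covered once at 40 km/h and once at 60 km/h. The overall average speed is the harmonic mean of the two, which is lower than the arithmetic mean.

```python
import math
import statistics

speeds = [40, 60]  # km/h over two legs of equal distance (illustrative)

average_speed = statistics.harmonic_mean(speeds)
arithmetic = statistics.mean(speeds)

# Harmonic mean: 2 / (1/40 + 1/60), which works out at 48 km/h.
print(average_speed, arithmetic)
print(math.isclose(average_speed, 2 / (1 / 40 + 1 / 60)))
```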
The iterator os.scandir()
now offers a close()
method to make sure associated resources are freed in the case that iteration wasn’t completed for some reason. More usefully, it also now supports the context manager protocol to make sure this cleanup happens. If you don’t either exhaust the iterator or call close()
, explicitly or implicitly via the context manager, its destructor will emit a ResourceWarning
to let you know.
>>> import os
>>> import warnings
>>> warnings.resetwarnings()
>>> x = os.scandir("/tmp")
>>> del x
__main__:1: ResourceWarning: unclosed scandir iterator <posix.ScandirIterator object at 0x105042b70>
You can see an example of using scandir()
as a context manager in the small script below.
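A minimal sketch of the pattern is given here. The list_files() helper and the temporary directory contents are illustrative assumptions.

```python
import os
import tempfile


def list_files(path):
    """Return the sorted names of regular files directly inside path."""
    # The iterator is closed when the with block exits, even if the
    # iteration is abandoned early or raises an exception.
    with os.scandir(path) as entries:
        return sorted(entry.name for entry in entries if entry.is_file())


with tempfile.TemporaryDirectory() as tmpdir:
    for name in ("b.txt", "a.txt"):
        with open(os.path.join(tmpdir, name), "w"):
            pass
    os.mkdir(os.path.join(tmpdir, "subdir"))
    found = list_files(tmpdir)

print(found)
```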
There’s also an improvement to the shlex
module, which provides simple shell-like parsing facilities for strings. This takes the form of a new punctuation_chars
parameter to shlex.shlex()
which defaults to False
for pre-3.6 behaviour, but can be set to a string of characters to be regarded as punctuation meaningful to the shell. If specified, a consecutive block with any of these characters is treated as a single token. Alternatively, passing a value of True
causes it to use an internal default of common shell punctuation.
>>> import shlex
>>> command_string = "/opt/bin/app_status && /opt/bin/app_shutdown >$LOGFILE 2>&1"
>>> list(shlex.shlex(command_string))
['/', 'opt', '/', 'bin', '/', 'app_status', '&', '&', '/', 'opt', '/', 'bin', '/', 'app_shutdown', '>', '$', 'LOGFILE', '2', '>', '&', '1']
>>> list(shlex.shlex(command_string, punctuation_chars=True))
['/opt/bin/app_status', '&&', '/opt/bin/app_shutdown', '>', '$', 'LOGFILE', '2', '>&', '1']
>>> list(shlex.shlex(command_string, punctuation_chars="&>12"))
['/opt/bin/app_status', '&&', '/opt/bin/app_shutdown', '>', '$', 'LOGFILE', '2>&1']
This falls short of full shell parsing, but comes pretty close for simple cases. The collapsing of paths into a single token is particularly helpful, although unfortunately this doesn’t extend to proper support for backslashed spaces.
>>> list(shlex.shlex(r"/opt/bin/app\ status", punctuation_chars=True))
['/opt/bin/app', '\\', 'status']
A small change to the subprocess
module is that it now emits a ResourceWarning
if the child process is still running when the Popen
object destructor is called. You shouldn’t ever see this if you use it as a context manager or explicitly call wait()
, but it’s a useful hint for more complicated cases. Also, there’s a new encoding
parameter to specify the encoding to use for stdin
, stdout
and stderr
, if used.
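As a sketch of the encoding parameter, the snippet below runs a child Python interpreter and receives its output as str rather than bytes. The command being run is just an arbitrary example.

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "print('hello from the child')"],
    stdout=subprocess.PIPE,
    encoding="utf-8",  # decode the pipe as text using this encoding
)
output = result.stdout

print(type(output).__name__, repr(output.strip()))
```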
The regular expression language supported by the re
module has once again been extended by allowing flags to be specified inline for a particular subset of a pattern. For example, in Python 3.5 the re.IGNORECASE
or re.I
flags can be passed via the flags
parameter to several of the functions (e.g. re.compile()
) or the sequence (?i)
can be specified within the pattern itself — both of these have the effect of specifying case-insensitive matching across the entire pattern. The addition within Python 3.6 is that you can specify (?i:subpattern)
to ignore case only for subpattern
, without affecting the flag for the remainder of the pattern. You can also specify (?-i:subpattern)
to add case-sensitivity to an otherwise case-insensitive pattern.
>>> import re
>>> pattern = re.compile(r"(?i)IT(?-i:aa)S")
>>> pattern.match("itaas") is None
False
>>> pattern.match("ITaas") is None
False
>>> pattern.match("ITAAS") is None
True
Another handy change is not to the regular expression language, but to the Match
objects returned on a successful match. These now support indexing as an alias for calling group()
— examples in the code snippet below. Rather more esoterically, now any object that supports an __index__()
method can be passed instead of just an int
.
>>> import re
>>> pattern = re.compile(r"(?i)my name is ([a-z]*)")
>>> match = pattern.match("My name is Andy")
>>> match[0]
'My name is Andy'
>>> match[1]
'Andy'
>>> pattern = re.compile(r"(?i)my name is (?P<name>[a-z]*)")
>>> match = pattern.match("My name is Guido")
>>> match["name"]
'Guido'
Finally, the following things have been deprecated in this release, and will be removed in a future release.
asynchat and asyncore are deprecated in favour of asyncio.
In distutils, the extra_path parameter to the Distribution constructor is considered deprecated.
In grp.getgrgid(), support for non-int arguments is deprecated.
Passing bytes-like objects as paths in the os module, which was never documented behaviour, is now explicitly deprecated.
The SSL-related parameters certfile, keyfile and check_hostname, which used to be specified directly in modules such as ftplib, http.client, imaplib, poplib and smtplib, are now deprecated in favour of using an SSLContext.

As usual some useful changes. The asyncio
updates are welcome, and it’s not surprising that the interface to this fairly recently-added module is still evolving more than many others.
There are also some great security-related enhancements, including the new algorithms supported by hashlib
, especially scrypt, and the more secure defaults used by the ssl
module.
So that’s it for Python 3.6. In the next article I’ll continue my upward climb by looking at major new features added in Python 3.7.
At least as far as I remember, which frankly isn’t ever as far as I’d like. As you may have noticed I tend to cover things in the order they’re mentioned in the release notes, to try and avoid duplication, but if they’re mentioned twice in there then there’s a fair chance I’ll cover them twice here as well, despite my best efforts! ↩
In other words, they support the __contains__()
, __iter__()
and __len__()
methods. ↩
I’m not in a position to comment on the relative security of using BLAKE2’s keyed mode directly, as opposed to the more established hmac
module based on RFC 2104. As a hashing algorithm, however, BLAKE2 can be used with hmac
if you wish, although using it directly will almost certainly be noticeably faster. ↩
Although as you might imagine, there’s a limit to how much you can say about this change. With the exception of PEP 20, this has got to be a good candidate for the shortest PEP ever! ↩
The existing random.sample()
already offers selection without replacement. ↩
I’m not sure if this will work for anything except AF_INET
or AF_INET6
, because the protocol constants aren’t necessarily defined. This will be OS-dependent, however. ↩