After many years of separation I was recently reunited with the venerable old FTP protocol. The years haven’t been kind to it.
Happy New Year!
Right, that’s the jollity out of the way.
I recently had cause to have some dealings with the File Transfer Protocol, which is something I honestly never thought I’d be using again. During the process I was reminded what an antiquated protocol it is. It’s old, it has a lot of wrinkles and, frankly, it’s really starting to smell.
But perhaps I’m getting ahead of myself, what is FTP anyway?
It’s been around for decades, but in these days of cloud storage and everything being web-based, it’s not something that people typically come across very often. This article is really an effort to convince you that this is something for which we should all be thankful.
Wikipedia says of FTP:
The File Transfer Protocol (FTP) is the standard network protocol used for the transfer of computer files between a client and server on a computer network.
Twenty years ago, perhaps this was even true, but this Wikipedia page was last edited a month ago! Surely the world has moved on these days? At least I certainly hope so, and this post is my attempt to explain some of the reasons why.
“What’s so bad about FTP?” I hear you cry! Oh wait, my mistake, it wasn’t you but the painfully transparent literary device sitting two rows back. Well, regardless, I’m glad you asked. First, however, a tiny bit of history and a brief explanation of how FTP works.
FTP is a protocol with a long history, and even predates TCP/IP. It first put in an appearance in 1971 in RFC 114, a standard so old it wasn’t even put into machine-readable form until 2001. At this point it was built on the Network Control Program (NCP), which was a unidirectional precursor of TCP/IP. The simplex nature of NCP may well explain some of FTP’s quirks, but I’m getting ahead of myself.
In 1980 the first version of what we’d recognise as FTP today was defined in RFC 765. In this version the client opens a TCP connection (thereafter known as the command connection) to port 21 on a server. It then sends requests to transfer files across this connection, but the file data itself is transferred across a separate TCP connection, the data connection. This is the main aspect of FTP which doesn’t play well with modern network topologies as we’ll find out later.
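To make this concrete, here's what a hypothetical exchange over the command connection might look like (`>` is client to server, `<` is server to client; the numeric reply codes follow RFC 959, but the filename and credentials are invented):

```
< 220 Service ready for new user
> USER anonymous
< 331 User name okay, need password
> PASS guest@example.com
< 230 User logged in, proceed
> RETR readme.txt
< 150 File status okay; about to open data connection
  ... file contents flow over the separate data connection ...
< 226 Closing data connection; transfer complete
```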
Given that TCP connections are full-duplex, why didn’t they take the opportunity to remove the need for a second connection when they moved off NCP? Well, the clues are in RFC 327, from a time when people were still happy to burn RFC numbers for the minutes of random meetings. I won’t rehash it here, but suffice to say it was a different time and the designers of the protocol had very different considerations.
Whatever the reasons, once the command connection is open and a transfer is requested, the server connects back to the client machine on a TCP port specified by the FTP `PORT` command. This is known as active mode. Once this connection is established, the sending end can throw data down this connection.
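The `PORT` argument packs the client's IPv4 address and listening port into six comma-separated decimal bytes, with the port split into high and low octets. A minimal sketch in Python (the address and port below are made up):

```python
def encode_port_command(host: str, port: int) -> str:
    """Build the FTP PORT command: four address octets followed by the
    port's high and low bytes, all comma-separated (RFC 959)."""
    h1, h2, h3, h4 = host.split(".")
    return f"PORT {h1},{h2},{h3},{h4},{port >> 8},{port & 0xFF}"

# The client listens on 192.168.0.10:50069 and tells the server:
print(encode_port_command("192.168.0.10", 50069))
# PORT 192,168,0,10,195,149
```

Note that 50069 = 195 × 256 + 149, which is why the last two bytes are 195 and 149.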
Even back in 1980 they anticipated that this strategy may not always be ideal, however, so they also added a `PASV` command to use passive mode instead. In this mode, the server passively listens on a port and sends its IP address and port to the client. The client then makes a second outgoing connection to this point and thus the data connection is formed. This works a lot better than active mode when you’re behind a firewall, or especially a NAT gateway. As NAT gateways became more popular, as the IPv4 address space became increasingly crowded, this form of FTP transfer became more or less entirely dominant.
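The server's reply to `PASV` carries its address and port in the same six-byte format used by `PORT`. A small sketch of how a client might decode it (the reply string shown is invented):

```python
import re

def parse_pasv(reply: str) -> tuple[str, int]:
    """Extract (ip, port) from a 227 reply such as
    '227 Entering Passive Mode (192,168,0,10,195,149)'."""
    nums = re.search(r"\((\d+,\d+,\d+,\d+,\d+,\d+)\)", reply).group(1)
    h1, h2, h3, h4, p1, p2 = map(int, nums.split(","))
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2

print(parse_pasv("227 Entering Passive Mode (192,168,0,10,195,149)"))
# ('192.168.0.10', 50069)
```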
There were a few later revisions of the RFC to tighten up some of the definitions and provide more clarity. There was a final change that is relevant to this article, however, which was made in 1998 when adding IPv6 support to the protocol, as part of RFC 2428. One change this made was to add the `EPSV` command to enter extended passive mode. The intended use of this was to work around the fact that the original protocol was tied to using 4-byte addresses, and they couldn’t change this without breaking existing clients. As a simple change, the `EPSV` reply omits the IP address that the server would send to the client in response to `PASV`; instead the client reuses the same address as it used to create the command connection[1].
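An `EPSV` reply carries only the port, delimited by `|` characters as defined in RFC 2428. A client-side decoding sketch (the reply string is invented):

```python
import re

def parse_epsv(reply: str) -> int:
    """Extract the port from a 229 reply such as
    '229 Entering Extended Passive Mode (|||12345|)' (RFC 2428).
    No address is included: the client simply reuses the address
    it already used for the command connection."""
    return int(re.search(r"\(\|\|\|(\d+)\|\)", reply).group(1))

print(parse_epsv("229 Entering Extended Passive Mode (|||12345|)"))
# 12345
```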
Not only is extended passive mode great for IPv6, it also works in the increasingly common case where the server is behind a NAT gateway. This causes problems with standard passive mode because the FTP server doesn’t necessarily know its own external IP address, and hence typically sends a response to the client asking it to connect to an address in a private range which, unsurprisingly, doesn’t work[2].
It’s important to note that EPSV mode isn’t the only solution to the NATed server problem—some FTP servers allow the external address they send to be configured instead of the server simply using the local address. There are still some pitfalls to this approach, which I’ll mention later.
Given all that, what, then, are the problems with FTP?
Well, some of them we’ve covered already, in that it’s quite awkward to run FTP through any kind of firewall or NAT gateway. Active mode requires the client to be able to accept incoming connections to an arbitrary port, which is typically painful as most gateways are built on the assumption of outwards connections only and require fiddly configuration to support inbound.
Passive mode makes life easier for the client, but for security-conscious administrators it can be frustrating to have to enable a large range of ports on which to allow outbound connections. It’s also more painful for the server due to the dynamic ports involved, as we’ve already touched on. The server can’t use only a single port for its data connections since that would only allow it to support a single client concurrently. This is because the port number is the only thing linking the command and data connections—if two clients opened data connections at the same time, the server would have no other way to tell them apart.
Extended passive mode makes life easier all round, as long as you can live with opening the largish range of ports required. But even given all this there’s still one major issue which I haven’t yet mentioned, which crops up with the way that modern networks tend to be architected.
Anyone who’s familiar with architecting resilient systems will know that servers are often organised into clusters. This makes it simple to tolerate failures of a particular system, and is also the only practical way to handle more load than a single server can manage.
When you have a cluster of servers, it’s important to find a way to direct incoming connections to the right machine in the cluster. One way to do this is with a hardware load balancer, but a simpler approach is simply to use DNS. In this approach you have a domain name which resolves to multiple IP addresses, sorted into a random order each time, and each address represents one member of the pool. As clients connect they’ll typically use the first address and hence incoming connections will tend to be balanced across the available servers.
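This round-robin behaviour can be sketched in a few lines of Python. The addresses below are illustrative (they're from the RFC 5737 documentation range), and the shuffle stands in for what the DNS server does on each query:

```python
import random
from collections import Counter

# Hypothetical A records for a pool of three servers.
records = ["203.0.113.1", "203.0.113.2", "203.0.113.3"]

def resolve(records: list[str]) -> list[str]:
    """Mimic a round-robin DNS server: same records, random order."""
    shuffled = records[:]
    random.shuffle(shuffled)
    return shuffled

# Each client connects to the first address it receives, so over many
# clients the load spreads roughly evenly across the pool.
chosen = Counter(resolve(records)[0] for _ in range(9000))
print(chosen)  # roughly 3000 connections per address
```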
This works really well for protocols like HTTP which are stateless because every time the client connects back in it doesn’t matter which of the servers it gets connected to, any of them are equally capable of handling any request. If a server gets overloaded or gets taken down for maintenance, the DNS record is updated and no new connections go to it. Simple.
This approach works fine for making the FTP command connection. However, when it comes to something that requires a data connection (e.g. transferring a file), things are not necessarily so rosy. In some cases it might work fine, but it’s a lot more dependent on the network topology involved.
Let’s illustrate a potential problem with an example. Let’s say there’s a public FTP site that’s served by a cluster of three machines, and those have public IP addresses 203.0.113.1, 203.0.113.2 and 203.0.113.3. These are hidden behind the hostname `ftp.example.com`, which will resolve to all three addresses. This can either be in the form of returning multiple A records in one response, or returning different addresses each time. We can see examples of both of these if we look at the DNS records for Facebook:

```
$ host -t A facebook.com
facebook.com has address 184.108.40.206
$ host -t A facebook.com
facebook.com has address 220.127.116.11
```

… and for Twitter:

```
$ host -t A twitter.com
twitter.com has address 18.104.22.168
twitter.com has address 22.214.171.124
```

When the FTP client initiates a connection to `ftp.example.com` it first performs a DNS lookup—let’s say that it gets address 203.0.113.1. It then connects to 203.0.113.1:21 to form the command connection. Let’s say the FTP client and server are both well behaved and then negotiate the recommended `EPSV` mode, and the server returns port 12345 for the client to connect on.

At this point the client must make a new connection to the specified port. Since it needs to reuse the original address it connected to, let’s say that it repeats the DNS lookup and this time gets IP address 203.0.113.2, and so makes its outgoing data connection to that address. However, since that’s a physically separate server it won’t be listening on port 12345 and the data connection will fail.
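This failure mode can be simulated in a few lines. The addresses and port are illustrative (RFC 5737 documentation range), and `resolve` stands in for a client (or proxy) that repeats the DNS lookup before opening the data connection:

```python
import random

# Three cluster members; each listens for data connections only on the
# ports it has itself negotiated with its own clients.
data_ports = {
    "203.0.113.1": {12345},   # negotiated EPSV port 12345 with our client
    "203.0.113.2": set(),
    "203.0.113.3": set(),
}

def resolve(name: str) -> str:
    """Round-robin DNS: any member of the pool may come back first."""
    return random.choice(list(data_ports))

control_addr = "203.0.113.1"      # first lookup happened to give us .1
epsv_port = 12345                 # server .1 says: connect here for data

# A naive client (or an intervening proxy) repeats the DNS lookup...
data_addr = resolve("ftp.example.com")

# ...and two times out of three reaches a server that was never
# expecting the connection.
if epsv_port in data_ports[data_addr]:
    print("data connection established")
else:
    print(f"connection refused: {data_addr} is not listening on {epsv_port}")
```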
OK, so you can argue that’s a broken FTP client—instead of repeating the DNS lookup it could just reconnect to the same address it got last time. However, in the case where you’re connecting through a proxy this is much less clear-cut—the proxy server has no way to know that the two connections the FTP client is making through it should go to the same IP address, and so it’s more than likely to repeat the DNS resolution and risk resolving to a different IP address as a result. This is particularly likely for sites using DNS for load balancing, since they’re very likely to have set a very short TTL to prevent DNS caches from spoiling the effect.
We could use regular passive mode to work around the inconsistent DNS problem, because the FTP server returns its IP address explicitly. However, this could still cause an issue with the proxy if it’s whitelisting outgoing connections—we would likely have just included the domain name in the whitelist, so the bare IP address would be blocked. Leaving that issue aside, there’s still another potential pitfall if the FTP server has had its externally visible IP address configured by an administrator. If that administrator has configured it via a domain name, the FTP server itself could resolve the name to the wrong IP address, and so actually instruct the client to connect back incorrectly. Each server could be configured with its external IP address directly, but this is going to make centralised configuration management quite painful.
As well as all the potential connectivity issues, FTP also suffers from a pretty poor security model. This is fairly well known and there’s even an RFC discussing many of the issues.
One of the most fundamental weaknesses is that it involves sending the username and password in plaintext across the channel. One easy way to solve this is to tunnel the FTP connection over something more secure, such as an SSL connection. This setup, usually known as FTPS, works fairly well, but still suffers from the same issues around the separate data and command connections. Another alternative is to tunnel FTP connections over SSH.
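For what it’s worth, Python’s standard library supports FTPS directly via `ftplib.FTP_TLS`. A minimal sketch (the host and credentials are placeholders—nothing here connects to a real server):

```python
from ftplib import FTP_TLS

def fetch_listing(host: str, user: str, password: str) -> list[str]:
    """List a directory over FTPS: TLS on the command connection,
    then PROT P to protect the data connection too."""
    ftps = FTP_TLS(host)
    ftps.login(user, password)   # credentials now travel over TLS...
    ftps.prot_p()                # ...and so does the data connection
    lines: list[str] = []
    ftps.retrlines("LIST", lines.append)
    ftps.quit()
    return lines

# fetch_listing("ftp.example.com", "user", "secret")
```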
None of these options should be confused with SFTP which, despite the similarity in name, is a completely different protocol developed by the IETF[3]. It’s also different from SCP, just for extra confusion[4]. This protocol assumes only a previously authenticated and secure channel, so is applicable over SSH but more generally anywhere where a secure connection has been created.
Overall, then, I strongly recommend sticking to SFTP wherever you can, as the world of FTP is, as we’ve seen, by and large a world of pain if you care about more or less any aspect of security at all, or indeed ability to work in any but the most trivial network architectures.
In conclusion, then, I think that far from FTP being the “standard” network protocol used for the transfer of computer files, we should instead be hammering the last few nails in its coffin and putting it out of our misery.
I guess by 1998 they’d given up on those crazy ideas from the 80’s of transferring between remote systems without taking a local copy—you know, the thing that absolutely nobody ever used ever. I wonder why they dropped it? ↩
Even with extended passive mode NAT can still cause problems, as you also need to redirect the full port range that you plan to use for data connections to the right server. It solves part of the problem, however. ↩
Just for extra bonus confusion there’s a really old protocol called the Simple File Transfer Protocol defined in RFC 913 which could also reasonably be called “SFTP”. But it never really caught on so probably this isn’t likely to cause confusion unless some pedantic sod reminds everyone about it in a blog post or similar. ↩