Currently at work I’m designing a large-scale system that will be susceptible to a certain kind of denial-of-service attack. By way of analogy, imagine that Gmail didn’t bother to prevent robots from creating accounts. By the time the first human went to create an account, all the reasonable combinations of the top 10,000 human names would already have been taken, by robots. This would be very irritating to all actual human users.

Our problem is much more serious than simply losing human-preferred free email addresses. But it is still a case of preventing robots from soaking up a finite resource and depriving real humans of the chance to use it.

My approach to large system design is to always get security right first: you can never effectively retrofit it later. And the central question we keep coming back to on security is how to defend ourselves against robots. Our thinking has typically followed certain lines:

  1. To acquire a resource, a user must prove they are human.
  2. All users must have a registered account, so we can identify who is consuming the resource and only have to verify their humanity once: on registration.
  3. The user’s account must be protected with a password to avoid a bot misusing a real human’s account.
  4. Each account has a threshold of resource acquisition. If the threshold is exceeded, then that account is temporarily blocked in some way (a minimal sketch of this check follows the list).
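
To make step 4 concrete, here is a minimal sketch of a per-account threshold with a temporary block, in Python. The window size, threshold, and block duration are illustrative assumptions rather than values from our system, and a real implementation would keep this state in shared storage rather than in process memory.

    import time
    from collections import defaultdict

    # Illustrative numbers only; a real system would tune these and keep the
    # state in a database or cache shared across servers, not in memory.
    WINDOW_SECONDS = 3600        # how far back we look at an account's activity
    MAX_ACQUISITIONS = 20        # acquisitions allowed per account per window
    BLOCK_SECONDS = 6 * 3600     # length of the temporary block

    acquisitions = defaultdict(list)   # account_id -> timestamps of acquisitions
    blocked_until = {}                 # account_id -> time when the block lifts

    def try_acquire(account_id):
        """Return True if this account may acquire one unit of the resource."""
        now = time.time()

        # Refuse while a temporary block is in force.
        if blocked_until.get(account_id, 0) > now:
            return False

        # Keep only the acquisitions inside the sliding window.
        recent = [t for t in acquisitions[account_id] if now - t < WINDOW_SECONDS]
        acquisitions[account_id] = recent

        # Over the threshold: block the account temporarily.
        if len(recent) >= MAX_ACQUISITIONS:
            blocked_until[account_id] = now + BLOCK_SECONDS
            return False

        acquisitions[account_id].append(now)
        return True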

At this point in our thinking we’re pretty confident that we’ve dealt with the risk of a robot creating an account and using that single account to soak up all our resources. We’re also pretty certain we’ve dealt with the issue of a robot creating many, many accounts, using those accounts to soak up resources while staying under the threshold for each.

But. What about bot nets? And by restricting single accounts like this, haven’t we just forced attackers to use a bot net? An attacker would want to distribute a bot across the Internet. Each bot would not use its own account; instead it would use the account of the human who owns the computer the bot had infected. Once the bot is on the human’s computer it can easily grab the credentials, with a key logger or by sniffing around in the browser cookies. In this situation our threshold control hasn’t really stopped the attacker, but it has hurt the human: the bot’s usage counts against the human’s account, so the effective threshold for the human is now much lower.

And it is on this point that our discussions tend to go around and around. How can we prevent bots (which may have acquired a human’s account) without negatively affecting the human’s experience and without erecting prohibitive barriers to use?

Thinking about this issue tonight, I wonder if we’re not completely wrong in this argument. If a user’s computer has been compromised and is now part of a bot net, should we be trying to give that user a smooth experience at all? They’ve been compromised; shouldn’t we identify that, inform the user, and then attempt to lock them out completely? There’s a question there about when we can let them back in, but I’ll leave that for now.
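
For what it’s worth, that “identify, inform, lock out” reaction could look roughly like the sketch below. The compromise signal itself, the account fields, and the notification channel are all assumptions made up for illustration; the point is simply that correct credentials are no longer enough once the account is flagged.

    from dataclasses import dataclass, field
    from enum import Enum, auto

    class AccountState(Enum):
        ACTIVE = auto()
        LOCKED_COMPROMISED = auto()

    @dataclass
    class Account:
        owner_email: str
        state: AccountState = AccountState.ACTIVE
        sessions: set = field(default_factory=set)

    def notify_owner(account):
        # Stand-in for an out-of-band notification (email, SMS, a phone call).
        print(f"notify {account.owner_email}: your account appears to be compromised")

    def handle_suspected_compromise(account):
        """React to a compromise signal: lock the account, revoke access, tell the human."""
        account.state = AccountState.LOCKED_COMPROMISED
        account.sessions.clear()       # invalidate anything the bot already holds
        notify_owner(account)

    def try_login(account, credentials_ok):
        # Correct credentials are not enough while the lock is in force;
        # how the human eventually gets back in is deliberately left open here.
        return credentials_ok and account.state is AccountState.ACTIVE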

My central question is: should web applications aggressively make the experience worse for users who have been compromised? In the case of a bank the answer seems obvious. I suspect our situation is actually similar.