Authentication vs. Sybil Attacks

2021/12/30

... why is everyone using Login with Facebook or stupid phone numbers?

This is post no. 60 for Kev Quirk's #100DaysToOffload challenge. The point is to write many things, not to write good ones. Please adjust quality expectations accordingly :)

"This stupid service only lets me log in with Facebook or Google, which are evil monopolies. They surely just hate Freedom."

As in... sure, more and more of your potential users are domesticated enough to be okay with this, but... still. Why not just let them log in with an email address? Using phone numbers as your identity, or requiring one to log in, is not great either.

Except, all of these are just suboptimal solutions to a problem that does in fact exist and is not to be ignored entirely.

Spam and Sybil Attacks

The main point being... how do you prevent bad actors from signing up to your service 87,000 times and spamming everything with "add 2 more inches" promotions? Or from just messing with your statistics / etc?

Yes, you can ban a user for doing this. They can just sign up another time. They can generate an infinite number of email addresses to sign up with. (This is what "sybil attacks" are: one person / entity pretending that it's a lot of distinct entities.) IP blocking doesn't work against anyone but the least sophisticated attackers; it's somewhat of an arms race, actually.

It is, however, relatively harder to obtain 87,000 working phone numbers. Or sign up for Google / Facebook 87,000 times: they have actual teams working on ensuring that you don't. (Methods involve, not surprisingly, asking for phone numbers. Or credit cards.)

If your only login methods are what Facebook / Google / telcos provide, you still might get spammers, but it's sufficient to only ban them once / just a couple of times. All you need to do is set up Facebook / Google / telcos as a gatekeeper to your service. Feudalism at work: "we'll protect you from spam if all your users log in via our systems".

Is everything doomed?

So, as it seems, in order to combat spam, we need to give up on the freedom of choosing identity providers, and tie all our accounts to our real phone number / Facebook account. Which... is not great for privacy (even if we ignore the part where not-that-many megacorps get to dictate a lot of terms). Is there an alternative solution though?

The main point, though, is that all that sites using Google / FB logins want to know is that you're a person who has not signed up yet. A Facebook login / phone number proves this because you haven't used this number / FB identity with them yet, and these are hard to obtain. However, they wouldn't need to care about the actual identity itself!

Decentralized anti-sybil services

We could, instead, set up a set of "identity uniqueness providers". What one of these would do is give you a token saying:

for the domain bobs-spam-resistant-web-forum.com, the holder of this token (number 26348) is a different person from the holders of any other token I've handed out.
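To make that a bit more concrete, here is a minimal sketch of what such a provider could look like. Everything in it is made up for illustration: the UniquenessProvider class, the "applicant" string (standing in for whatever the provider uses internally to recognise a person), and the choice of an HMAC as the signature. With an HMAC, a site has to ask the provider to verify a token; a real design would probably use public-key signatures instead.

```python
import hashlib
import hmac
import secrets


class UniquenessProvider:
    """Toy anti-sybil provider: at most one token per applicant per domain."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # signing key, known only to the provider
        self._issued = {}                    # (domain, applicant) -> token number
        self._next_number = {}               # domain -> next unused token number

    def _sign(self, domain, number):
        message = f"{domain}:{number}".encode()
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def issue(self, domain, applicant):
        """Return (domain, number, signature); repeat requests get the same token."""
        key = (domain, applicant)
        if key not in self._issued:
            number = self._next_number.get(domain, 1)
            self._next_number[domain] = number + 1
            self._issued[key] = number
        number = self._issued[key]
        return domain, number, self._sign(domain, number)

    def verify(self, domain, number, signature):
        """Let a site check that a token really came from this provider."""
        return hmac.compare_digest(self._sign(domain, number), signature)


provider = UniquenessProvider()
forum = "bobs-spam-resistant-web-forum.com"
alice_first = provider.issue(forum, "alice")
alice_again = provider.issue(forum, "alice")  # same person asking twice
bob = provider.issue(forum, "bob")
```

Asking twice gets Alice the same token back, so she can't turn one identity into two forum accounts, while the forum itself only ever sees a number and a signature.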

Implementations could range from "I give out a token to anyone who has spent 2 minutes of CPU on a random task" (which is not a very strong proof), through "I have established that the applicant has a gmail address I haven't seen yet", all the way up to "I have checked their government ID".
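The weakest of these, the "spend some CPU on a random task" variant, could look roughly like a hashcash-style puzzle. This is only a sketch under my own assumptions: the function names, the challenge string format and the difficulty value are invented here, and a real provider would tune the difficulty to whatever "2 minutes of CPU" means on typical hardware.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def check(challenge: str, nonce: int, difficulty: int) -> bool:
    """Cheap for the provider: one hash to verify the applicant's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


def solve(challenge: str, difficulty: int) -> int:
    """Expensive for the applicant: about 2**difficulty hashes on average."""
    for nonce in count():
        if check(challenge, nonce, difficulty):
            return nonce


# Hypothetical challenge handed out by the provider; difficulty kept tiny for the demo.
challenge = "bobs-spam-resistant-web-forum.com:demo-challenge"
nonce = solve(challenge, 12)
```

Each extra bit of difficulty doubles the expected work for the applicant, while checking stays a single hash. It only makes 87,000 signups expensive, not impossible, which is why it sits at the weak end of the range.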

Then, sites needing spam-free registration can decide which of such services they trust, and ask for some of these tokens at registration. Assuming we establish an actual protocol for exchanging these tokens, you can now swap them out, or even aggregate / chain them (e.g. a service that gives out a token to anyone proving they have a unique gmail OR facebook login).
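On the site's side, the registration check could be as small as the sketch below. The Token fields, the accept_registration function and the provider names are all hypothetical, and the sketch assumes the tokens' signatures have already been verified elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    provider: str  # who issued it
    domain: str    # which site it was issued for
    number: int    # provider-assigned token number


def accept_registration(tokens, domain, trusted, seen, required=1):
    """Accept if enough distinct trusted providers vouch with unused tokens."""
    usable = {
        t for t in tokens
        if t.provider in trusted and t.domain == domain and t not in seen
    }
    if len({t.provider for t in usable}) < required:
        return False
    seen.update(usable)  # each token is good for exactly one registration
    return True


FORUM = "bobs-spam-resistant-web-forum.com"
trusted = {"govid.example", "unique-mail.example"}  # this site's own choice
seen = set()
good = Token("unique-mail.example", FORUM, 26348)
weak = Token("cpu-puzzle.example", FORUM, 7)

first = accept_registration([good], FORUM, trusted, seen)      # accepted
replay = accept_registration([good], FORUM, trusted, seen)     # token already used
untrusted = accept_registration([weak], FORUM, trusted, seen)  # provider not trusted
```

Swapping providers is then just a change to the trusted set, and a pickier site can raise required to demand tokens from several providers at once.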

Benefits

First of all, even if Alice already requested a token for bobs-spam-resistant-web-forum.com, she can still get another, completely different and unrelated one for some-other-service-needing-login.com from the same anti-sybil service. So even if the two sites collude, they are not able to establish who these users are, or even that it's the same user, keeping complete anonymity while still preventing multiple registrations! (Compare that to any of the other, existing methods, where these services get phone numbers or gmail addresses.)
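One way to get tokens that are stable per site but unrelated across sites is to derive the token identifier from a provider-side secret, the domain and the applicant. This is my own illustration, not part of any existing protocol: token_id and provider_secret are invented names, and the provider itself can of course still link the tokens, since it knows who asked.

```python
import hashlib
import hmac
import secrets

# Known only to the anti-sybil provider; never shared with sites.
provider_secret = secrets.token_bytes(32)


def token_id(domain: str, applicant: str) -> str:
    """Derive a per-domain identifier that reveals nothing about the applicant."""
    message = f"{domain}\n{applicant}".encode()
    return hmac.new(provider_secret, message, hashlib.sha256).hexdigest()[:16]


forum_id = token_id("bobs-spam-resistant-web-forum.com", "alice")
forum_id_again = token_id("bobs-spam-resistant-web-forum.com", "alice")
other_id = token_id("some-other-service-needing-login.com", "alice")
```

Here forum_id and forum_id_again are equal, so a second signup attempt on the forum is caught, while other_id looks unrelated to both, so the two sites comparing notes learn nothing.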

As a result, you're now free to use identity providers (e.g. an email address on your own domain or your own domain itself) that are weaker in terms of providing uniqueness than would otherwise be required, while also being able to ditch your uniqueness (anti-sybil) provider later. Compare that to "log in with Google / FB", where you'd be dependent on FB / Google every time you log in.

... comments welcome, either in email or on the (eventual) Mastodon post on Fosstodon.