For a few crazy moments there I thought I had come up with a solution to spam email.

Alright, here's my reasoning. Spammers get your email address by two major methods. One is being given a list of working email addresses by some third party who shouldn't have done so - perhaps some internet site you signed up with was less than ethical in how they dealt with the information you gave them when you registered. There are well-known ways and means of dealing with this, starting with being extremely careful about who gets your "real" email address and descending from there. That's not what this idea is about.

The other major method is to use bots to scan the internet for anything of the form "*@*.*", assume that that is an email address and send spam to it.

Now, the traditional way to avoid the bots is by the practice of email address munging, but this has the drawback that not everybody knows that address munging is a standard practice. As a result, people who want to contact you may end up clicking that "mailto:" link and sending an email to nobody, because their email client (correctly) doesn't recognise "wossname[ at ]example[ dot ]com" as a valid email address.
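To make the bot's side of this concrete, here is a rough sketch in Python. The pattern and the function name harvest are my own illustration, not any real harvester's code; it is just one plausible reading of "anything of the form *@*.*".

```python
import re

# A loose "something@something.something" pattern (illustrative, not RFC-complete).
ADDRESS_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def harvest(page_text):
    """Return every distinct address-shaped string found in the page text."""
    return sorted(set(ADDRESS_RE.findall(page_text)))


page = "Mail wossname@example.com or wossname[ at ]example[ dot ]com today"
print(harvest(page))
```

The plain address is picked up and the hand-munged one slips through, which is the whole appeal of munging, and also the reason a mail client can't do anything useful with the munged form.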

Therefore, it would be better if there were some standard way to munge an email address so that it is no longer easily distinguishable from regular plain text, markup, misspelt text or just plain gibberish. What we want, then, is a public, reversible algorithm for converting an email address into something which doesn't look like an email address. By standardising the munging system, an email client can simply have an extra piece of code which goes "hang on a second. This isn't an ordinary email address. Let's try demunging it. Ah! Now it's okay," before sending the email. (As you can see, the usual username@example.com system is still in place, so we don't have to suddenly change the whole way we do email - it's just that this demunged form of the address is kept nominally secret.)
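The client-side "hang on a second" step might look something like the following Python sketch. Everything here is hypothetical: munge and demunge are a trivial stand-in transform (hex of the reversed address) chosen only so the example runs, and resolve_recipient is a name I made up for the check a mail client would do before sending.

```python
import re

# Very rough test for "looks like an ordinary address" (an assumption, not a standard).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def munge(address):
    """Stand-in reversible transform: hex of the reversed address."""
    return address[::-1].encode("utf-8").hex()


def demunge(token):
    """Undo munge(); return None if the token cannot be decoded at all."""
    try:
        return bytes.fromhex(token).decode("utf-8")[::-1]
    except ValueError:  # not hex, or not valid UTF-8 once decoded
        return None


def resolve_recipient(text):
    """Return a usable address, demunging first if the text isn't one already."""
    if EMAIL_RE.match(text):
        return text  # ordinary address, nothing to do
    candidate = demunge(text)
    if candidate is not None and EMAIL_RE.match(candidate):
        return candidate  # "Ah! Now it's okay."
    return None  # neither an address nor a munged address


print(resolve_recipient(munge("username@example.com")))
```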

"Wait!" I hear you say. "What's to stop spammers doing the exact same thing? They can simply program their bot to take all the text on a given website and test-demunge each piece to see if it's an email address in disguise."

Here's the clever bit. We choose our reversible algorithm to be extremely processor-intensive - so that on a modern computer, it takes, say, 1 whole second to run.
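I never settled on an actual algorithm, so the following Python sketch is just one way it could be done, not a worked-out proposal. It assumes a random salt stored in front of the munged address and a keystream stretched out of that salt with PBKDF2 (hashlib.pbkdf2_hmac from the standard library), which is XORed with the address. PUBLIC_LABEL, ITERATIONS and SALT_LEN are made-up values; the iteration count would have to be tuned until one demunging costs about a second on the hardware you care about.

```python
import hashlib
import os
import re

PUBLIC_LABEL = b"public-munge-v1"  # hypothetical constant everyone knows
ITERATIONS = 200_000               # hypothetical cost knob; raise it to slow things down
SALT_LEN = 8
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def _keystream(salt, length, iterations):
    # The deliberately expensive step: it depends on the salt, so it has to be
    # redone for every candidate token.
    return hashlib.pbkdf2_hmac("sha256", PUBLIC_LABEL, salt, iterations, dklen=length)


def munge(address, iterations=ITERATIONS):
    """Turn an address into hex gibberish: salt followed by address XOR keystream."""
    data = address.encode("utf-8")
    salt = os.urandom(SALT_LEN)
    stream = _keystream(salt, len(data), iterations)
    return (salt + bytes(a ^ b for a, b in zip(data, stream))).hex()


def demunge(token, iterations=ITERATIONS):
    """Return the hidden address, or None if the token isn't a munged address."""
    try:
        raw = bytes.fromhex(token)
    except ValueError:
        return None  # not even hex
    if len(raw) <= SALT_LEN:
        return None
    salt, body = raw[:SALT_LEN], raw[SALT_LEN:]
    stream = _keystream(salt, len(body), iterations)
    try:
        text = bytes(a ^ b for a, b in zip(body, stream)).decode("utf-8")
    except UnicodeDecodeError:
        return None
    return text if EMAIL_RE.match(text) else None
```

Anyone can run demunge, since nothing in it is secret; the only thing standing between a bot and the address is the cost of the PBKDF2 call.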

You have no problem waiting one second while your client goes through the demunging process (obviously this would all be done client-side). How many legitimate emails do you send a day? Maybe a hundred at most? On the other hand, the spam bots have to run that demunging process for every single word on every single web page. Ordinarily they can download a whole 20kB web page in, what, a fraction of a second? But now scanning that page for fresh meat will take minutes at a time!
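Here is the back-of-envelope arithmetic as a few lines of Python. The one-second cost and the 20kB page come from the paragraph above; the figure of about six bytes per word (including the space) is purely my guess, and the function names are made up for the illustration.

```python
def scan_seconds(page_bytes, seconds_per_attempt=1.0, bytes_per_word=6):
    """Estimated time for a bot to test-demunge every word on one page."""
    candidate_words = page_bytes // bytes_per_word  # assumed average word size
    return candidate_words * seconds_per_attempt


def sender_seconds(emails_per_day, seconds_per_attempt=1.0):
    """Total daily demunging time for an ordinary user."""
    return emails_per_day * seconds_per_attempt


print(scan_seconds(20_000) / 60)  # minutes for the bot, per 20kB page
print(sender_seconds(100) / 60)   # minutes for you, per day
```

Under those assumptions the bot spends the best part of an hour on a single page, while a hundred emails a day costs you less than two minutes in total.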

"And what if home computers get faster and faster?" Well, if you happen to have the time on your hands, munge your address twice - even a single pass won't find it. Processor speeds have doubled? Three times. Four times. As many times as you can be bothered with. The email client will just try to demunge as many times as necessary (up to a user-adjustable limit).

Sounds good, right? Spammers stopped dead in their tracks!

Why it doesn't work

First of all, you still want people to be able to click a "mailto:" link to email you. Which means the bots can still get what they are very certain is a valid (possibly munged) email address by searching for anything inside a mailto: link. So we still have the hassle of people having to manually copy and paste email addresses into their emails before they can be sent.

Secondly. This will not slow down the rate at which spammers can send spam email. If they had to demunge an address every time they sent a spam email to it, one second per demunging would limit them to at most 86,400 spams per day (compared to millions, usually). But they don't have to. They can just keep a lookup table of already-demunged addresses. All this does is slow down their address harvesting.

Thirdly. This will not slow down their harvesting significantly either. Demunging everything in sight can be made significantly faster by - again - creating a lookup table of words which have already been demunged and confirmed not to be email addresses in disguise, then simply skipping these when they are encountered. And after all the plain text is eliminated from the possibility space, I strongly suspect that there is NOT quite enough gibberish on the internet to make test-demunging it all time-consuming enough.
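A sketch of that lookup table, in Python. The class name CachingHarvester is hypothetical, and the demunging function is again a cheap stand-in (hex of the reversed address) rather than anything slow; the point is only to count how often the expensive step would actually have to run.

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def stand_in_munge(address):
    """Placeholder for the real (slow) munging algorithm."""
    return address[::-1].encode("utf-8").hex()


def stand_in_demunge(token):
    """Placeholder for the real (slow) demunging algorithm."""
    try:
        text = bytes.fromhex(token).decode("utf-8")[::-1]
    except ValueError:
        return None
    return text if EMAIL_RE.match(text) else None


class CachingHarvester:
    """Remembers every token already tested, hits and misses alike."""

    def __init__(self, demunge):
        self._demunge = demunge
        self._seen = {}           # token -> address, or None for "not an address"
        self.expensive_calls = 0  # how many times the slow step really ran

    def check(self, token):
        if token not in self._seen:
            self.expensive_calls += 1
            self._seen[token] = self._demunge(token)
        return self._seen[token]

    def scan(self, page_text):
        found = set()
        for token in page_text.split():
            address = self.check(token)
            if address is not None:
                found.add(address)
        return found


bot = CachingHarvester(stand_in_demunge)
hidden = stand_in_munge("wossname@example.com")
bot.scan("the quick brown fox " + hidden)
bot.scan("the quick lazy dog " + hidden)
print(bot.expensive_calls)  # distinct tokens tested, not total words read
```

Ten words are read across the two pages, but only the distinct ones cost anything, and ordinary vocabulary runs out quickly. The same table also answers the second point: once an address is in it, sending to that address again is free.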

You could slow the bots down significantly further by making it so the munged email address is not a string of gibberish but actually a list of, say, ten to fifty real plaintext words - then the bot would need to test many combinations from each page, slowing it down by another order of magnitude. But this makes the whole system even MORE unwieldy to use and unappealing for the legitimate user.

Fourthly. Spammers have armies of worm-ridden Windows boxes at their disposal. They are not in any sense short of processing power. And the harvesting will continue, even if it continues more slowly than before.

Conclusions

I believe some good could come from this general idea of abandoning the use (by humans, at least) of a standard, easily machine-recognisable format for email addresses. But clearly the idea needs work. I guess the main lesson here is - DON'T think that smarter folks than you haven't already come up with much more sophisticated solutions to spam, found flaws in them, and abandoned them.
