A couple of things I’ve read this week have only added to my disdain for Google.

First, Google quietly dropped its previous commitment not to use AI for developing weapons and surveillance tools. Those Pentagon contracts are obviously too juicy to ignore. This one doesn’t bother me so much. The use of AI is inevitable. Palantir is already leading the way.

The next bit of news, which is actually based on research published in 2023 (so not really news), relates to those seriously annoying pop-ups you get when you visit many sites across the web. You know the ones I mean: the ones where you have to prove you’re a human, and not a bot, by deciphering letters from a blurry image or clicking on squares in an image to pick out bicycles.

These tests are called CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart).

In 2007, Luis von Ahn had the foresight to use CAPTCHAs to help digitise scanned text from books and newspapers that computers struggled to read. He called this reCAPTCHA. The New York Times even used it to digitise its archive of 13 million articles dating back to 1851.

Google acquired reCAPTCHA in 2009 and used it to digitise Google Books and improve Google Street View by processing photos of street signs and house numbers.

reCAPTCHAv2 example

It turns out Google has since repurposed it into a mass surveillance tool that benefits nobody but itself.

Over 512 billion reCAPTCHAs have been solved historically, costing an estimated 819 million human hours and US$6.1 billion in lost labour, whilst consuming 134 petabytes of bandwidth and 7.5 million kWh of energy, contributing significantly to CO2 emissions.

All this for what? Certainly not for security. Security professionals have known for a while that bot developers have learnt how to get around them.

You can read the analysis yourself but I’ll summarise what it says about Google’s use of reCAPTCHAv2.

reCAPTCHAv2 collects user behavioural data that goes well beyond what’s needed to solve the CAPTCHA itself. It tracks users before, during, and after CAPTCHA interactions. It uses a risk analysis system that considers factors like:

  • User’s browsing history
  • Cookies stored on the browser
  • Browser environment details (e.g., screen resolution, user agent, canvas rendering)
  • Mouse movements and other behavioural cues

It basically works as a covert user tracking mechanism. All of this data is sent to Google’s servers, where it feeds Google’s advertising ecosystem as yet another way to track you and serve ads.
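
To get a feel for how much of this any script running in a page can see, here’s a minimal TypeScript sketch of the kinds of signals listed above, using only standard browser APIs. The function and field names are mine, not Google’s; what reCAPTCHA actually collects and sends is obfuscated and undocumented.

```ts
// Minimal sketch of the browser-environment signals any embedded script
// (reCAPTCHA included) can read via standard web APIs. The names here are
// illustrative; Google's actual payload is obfuscated and undocumented.

function collectEnvironmentSignals() {
  // Browser environment details
  const userAgent = navigator.userAgent;
  const screenResolution = `${screen.width}x${screen.height}`;
  const timezoneOffset = new Date().getTimezoneOffset();
  const language = navigator.language;

  // Canvas rendering: draw some text and read the pixels back. Subtle
  // rendering differences between machines make this a surprisingly
  // stable fingerprint.
  const canvas = document.createElement("canvas");
  const ctx = canvas.getContext("2d");
  let canvasFingerprint = "";
  if (ctx) {
    ctx.textBaseline = "top";
    ctx.font = "16px Arial";
    ctx.fillText("fingerprint-test", 2, 2);
    canvasFingerprint = canvas.toDataURL();
  }

  // Cookies visible to this origin (the reCAPTCHA iframe runs on a Google
  // domain, so it sees Google's own cookies, not the host site's).
  const cookies = document.cookie;

  return { userAgent, screenResolution, timezoneOffset, language, canvasFingerprint, cookies };
}

// Mouse movements and other behavioural cues can be captured the same way.
const mouseTrail: Array<{ x: number; y: number; t: number }> = [];
document.addEventListener("mousemove", (e) => {
  mouseTrail.push({ x: e.clientX, y: e.clientY, t: performance.now() });
});
```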

If you’re a web developer, you might want to think twice before embedding a reCAPTCHA in a form.
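
For context, here’s roughly what that integration involves: the page loads Google’s script from www.google.com on every visit, and your server then verifies each submission against Google’s siteverify endpoint, so Google sees both the visitor and the verification. Below is a minimal server-side sketch in TypeScript (assuming Node 18+ for the global fetch; the endpoint and parameter names follow the publicly documented siteverify API, but this is an illustration, not production code).

```ts
// Sketch of the server-side half of a reCAPTCHAv2 integration (Node 18+,
// which provides a global fetch). The client side loads Google's script on
// every page containing the form, and each submission is then verified
// against Google's siteverify endpoint.

interface SiteVerifyResponse {
  success: boolean;
  challenge_ts?: string;    // timestamp of the challenge
  hostname?: string;        // site the CAPTCHA was solved on
  "error-codes"?: string[];
}

async function verifyCaptcha(token: string, secret: string): Promise<boolean> {
  const params = new URLSearchParams({ secret, response: token });
  const res = await fetch("https://www.google.com/recaptcha/api/siteverify", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: params,
  });
  const data = (await res.json()) as SiteVerifyResponse;
  return data.success;
}
```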

For everyone else, your mileage will vary if you try to block these automatically.

You could disable JavaScript in your web browser, but that breaks the functionality of most sites across the web.

I’ve had varied success with content blockers that you can install as browser extensions or plugins. A lot of content/ad blockers claim they can block it, but it’s not so easy: the reCAPTCHA is loaded in an iframe served from Google’s own domain, and iframes are awkward to block. If one is embedded in a form, you may be able to block the entire form automatically with a content blocker, or manually with a feature such as ‘Hide Distracting Items’, which is now available in Apple’s Safari browser. Invoking Reader mode doesn’t always work either.

What about using a VPN? A VPN (I use Private Relay, which comes with iCloud+ and is essentially a VPN) will hide your IP address from Google. All this achieves, though, is making it more likely you’ll be repeatedly served reCAPTCHAs.

Maybe one day regulatory authorities will get their act together and stop this crap. GDPR certainly didn’t. All the EU’s flagship privacy regulation did was serve us annoying cookie banners, so annoying that 99.9% of users just automatically click ‘Accept Cookies’.

Remember, Google quietly dropped “Don’t be evil” from the preamble of its corporate code of conduct in 2018.

The battle continues.