Ad block and you don't stop

Who will win, a multibillion dollar industry or a couple of txt files?

Art: Advertisement for Pope Manufacturing Co., "We are Having a Heavenly Time" Cassius Marcellus Coolidge, 1890

Something I worry about a lot is that I don’t understand how the internet works. And by that, I mean the entire web, the whole shebang, from when I press “Send Newsletter”, to how Normcore makes its way up and through WiFi and then thick cables somewhere in the darkness of the Atlantic, through seas of malware, bypassing the dark web, and finally finds its way to your inbox, all of it.

I swim in the broad, rolling waters of our internet every single day, for hours and hours on end. I earn my salary building complicated, distributed products that rely on understanding machine learning research knowledge assembled from experts around the world and traverse millions of network requests. I write tweets and newsletters and read hundreds of thousands of words on news sites and blogs and academic journals. I like memes. And yet I understand what makes the entire thing work no better than I understand what happens when I get in my car and turn on the ignition.

In previous Normcores, I’ve explored bits and pieces of it: web servers, networks and packets, how the internet got started and data centers. But it’s still an enormous system, a living, breathing organism with layers of history, nuance, subcultures, and features that make it all somehow work together every minute and second of the day.

In the tradition of continuing to explore various pieces of online life, something that I’ve been wanting to understand a lot lately is adblockers. I’ve been using one since probably 2005, but have never given a second thought to how it works. I just know that adblockers make my online experience bearable.

For example, here are some modern sites, loaded with adblockers on the left and without on the right. Using the internet without adblockers is intolerable. 

And just for fun, check out Forbes.com, which shows you a video roll instead of the home page if you’re not blocking the ad. 

What’s so amazing to me about ad blockers is that,  like the data powering neural networks, they’ve been built up pretty much by hand by a few people over years and years and now impact the browsing experience of hundreds of millions of people around the world. 

New Kids on the Block(er)

In 1994, the first banner ad was launched on hotwired.com, Wired’s website, in the same way most tech things are launched: with a few people in a room trying to guess how an entire online experience for millions of people might work.

Jonathan Steuer, online tsar, HotWired: I was in the room when it happened. The conversation was about, “How do we come up with money to pay for this thing on the internet?”

Louis Rossetto, co-founder, Wired: People told us if you put ads online, the internet would throw up on us. I thought the opposition was ridiculous. There is hardly an area of human activity that isn’t commercial. Why should the internet be the exception? So we said, “Fuck it,” and just went ahead and did it.

Soon afterwards, a guy called James Howard, 23 years old at the time, got together with three other dudes to create Internet Fast Forward, the first ad blocker. 

Howard has been a computer hacker since he was 10 and on the Internet since he was 14. Now he’s chief executive of PrivNet Inc., the private company set up to market the IFF program. His partners are Gene Hoffman, a 20-year-old UNC communications major; Jeff Harrell, a 22-year-old English major; and Mark Elrod, a 22-year-old computer science major.

IFF allows users to block advertisements, blinking text, Web graphics and “cookies"--Web software that tracks a visitor’s movements through a Web site. The IFF device, which works as a “plug-in” to Netscape’s browser, can also circumvent the lucrative advertising sites that direct surfers to popular “search engines” such as Yahoo and Infoseek.

Howard said his gripe is not with advertising per se, but with the time it takes to view a page with advertising. “It can take 4 to 6 seconds to download each ad, and if you are on the Web a lot, that really gets annoying,” he said. “If the advertisers want to pay for a high-speed Net connection to my house, then I would take the ads, but right now it is costing me money to look at their ads.”

Here’s a video about IFF, which is pretty incredible. (I love that around 3:19, where the narrator says they’re in for a “tough battle”,  you can clearly see all of them playing Wolfenstein.)

Ad blockers really took off after that, and soon an adblocker called Adblock was created by Henrik Aasted Sørensen, at the time a Danish university student who was procrastinating instead of studying:

"I suppose some people expect Adblock to have been created in a fit of anti-capitalist rage, or as an idealistic effort to return the internet to its less commercial roots," Sørensen said. "What actually happened is I was supposed to be cramming for an upcoming exam at university [in Copenhagen, where he studied internet technology and computer science.] As a procrastination project, I decided to try out the relatively new possibility of creating extensions for the Phoenix browser — which is the browser that eventually got renamed Firefox. The idea was primarily to try out a new development environment and move a bit out of my development comfort zone."

Whomst among us has not created an iconic technology when they didn’t want to be studying for Biochem? From there, Adblock became the premier ad blocking tool and browser extension for a long while, always maintained by a small group of very opinionated people. Offshoots and alternatives, such as uBlock Origin and Ghostery, continue to be pretty popular. (I use mostly uBlock these days because of the controversy around Adblock Plus and their new ad acceptance policy.)

What’s interesting in all of this is how few people were involved in making these changes that would eventually impact the way the rest of us interact with the internet. In hindsight, though, it’s not that different from the average opinion of 10k people in San Francisco, only at a much smaller scale. 

But how do all these programs work anyway?

The Tangled Web of Ads We Weave

To understand adblockers, it helps to understand a bit about how modern adtech works. 

Let’s say you go to your favorite technology publication, The Normcore New Times, to read all about how companies are removing distributed systems from their tech stacks.

Now, The Normcore New Times, unlike its humble predecessor newsletter, is aggressively ambitious about making money, and wants to monetize the everloving crap out of every page element. What they’ll do is sell space on their site.  

And not just any space: the places where researchers have determined people’s eyeballs travel the most. So maybe they’ll do something like this:

And then, after they figure this out, they might go to an advertiser directly, like, say, Adidas, and ask if they want to place ads in these places. This is how it started out:

In the early days of online ads, a brand would strike a deal with a website owner to host a paid banner. The onscreen space for that image, known as the ad inventory, would be sold by the publisher directly. (The magazine you’re reading right now made the first such transaction, back in 1994.) 

But, let’s say that you’re Adidas, the official sponsor of the Normcore New Times, and you’re serving ads like this: 

That’s all well and good. But you, the advertiser,  also have maybe a hundred other sites that need ads, and you have ads that are constantly changing based on which site they’re on, which styles you’d like to advertise that day, and a million other variables that are hard to keep track of when you scale. 

And another thing: you can also customize which ads you serve based on cookies. So maybe you’d like people who have looked at Adidas or searched for Adidas to be exposed to your ads across the internet.

Now you have to keep track of not only your ad inventory, but which sites it’s stored on, how quickly those ads need to change, and users across sites. What you’ve built now is not just an advertising relationship, but a tech stack. 

What arose in trying to maximize revenue from ads is an entire ad pipeline that now looks something like this, where there is an enormous disconnect between the marketer, Adidas, and the publisher, Normcore New Times, in the name of “efficiency.”

That ad is now being served to you programmatically, in a process that puts a new ad for you to look at every time you load a web page.  

Today, the process has grown far more complicated, and humans are barely involved. “As they do in modern-day capital markets, machines dominate the modern-day ecosystem of advertising on the web,” Hwang writes. Now, whenever you load a website, scroll on social media, or hit Enter on a Google search, hundreds or thousands of companies compete in a cascade of auctions to show you their ad. The process, known as “programmatic” advertising, occurs in milliseconds, tens of billions of times each day. Only automated software can manage it.

There are a lot more implementation details, but the bottom line is that sites collect demographic information about you and bundle that information. They then send it to third parties, which, in turn, pick from a collection of ads sent to them by ad agencies, and decide which ad should be shown to you on the webpage. All of this passes through 3 or 4 layers of technology and happens extremely quickly, although not quickly enough to avoid annoying users.
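To make that flow a little more concrete, here is a toy sketch in Python of a single auction. Everything in it is an assumption for illustration only: the `Bid` and `run_auction` names, the made-up `ads.example` URLs, the one-interest targeting, and the simple highest-bid-wins rule. Real exchanges are far more elaborate than this.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str
    creative_url: str    # where the ad image would be fetched from (made up)
    price_cpm: float     # offered price per thousand impressions
    wants_interest: str  # the audience interest this advertiser is targeting


def run_auction(profile_interests, bids):
    """Return the highest bid whose targeting matches the visitor, or None."""
    eligible = [b for b in bids if b.wants_interest in profile_interests]
    if not eligible:
        return None
    return max(eligible, key=lambda b: b.price_cpm)


# A bundled profile the publisher might pass along (hypothetical).
visitor = {"sneakers", "running"}

bids = [
    Bid("Adidas", "https://ads.example/adidas.png", 2.50, "sneakers"),
    Bid("SomeBank", "https://ads.example/bank.png", 4.00, "mortgages"),
    Bid("ShoeStore", "https://ads.example/shoes.png", 1.75, "running"),
]

winner = run_auction(visitor, bids)
print(winner.advertiser if winner else "no ad")
```

Note that the highest bidder overall (the bank) loses here because its targeting doesn’t match the visitor. That is the part of the pipeline that depends on all the bundled information about you.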

Blocklists to the Rescue

Note that Normcore would never run ads because I’m really bad at selling out, BUT IF IT DID, you would want to block that sucker. 

What do you block? In the old days, you could basically determine which images on a page were “ad-sized” and easily block those image sizes. Today, it’s a bit more complicated, because the ad could be hosted at any one of a number of URLs. It could be coming from adidas.com, but that’s very unlikely, given the extremely complicated ecosystem.

Usually, what you want to do is keep track of all the URLs connected to the adtech ecosystem. What adblockers do is check the requests your browser is sending. If a request matches the blocklist, the adblocker doesn’t load that content.
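In spirit, that check can be as small as the Python sketch below. This is a simplification under stated assumptions, not how any particular adblocker is implemented: the `BLOCKED_DOMAINS` set, the `is_blocked` and `filter_requests` helpers, and all the `.example` hosts are made up, and real blockers match against much richer patterns than plain domains.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist entries; real lists hold tens of thousands of rules.
BLOCKED_DOMAINS = {"ads.example", "tracker.example"}


def is_blocked(request_url, blocked_domains=BLOCKED_DOMAINS):
    """True if the request's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(request_url).hostname or "").lower()
    parts = host.split(".")
    # For "cdn.ads.example", check "cdn.ads.example", "ads.example", "example".
    return any(".".join(parts[i:]) in blocked_domains for i in range(len(parts)))


def filter_requests(urls):
    """Keep only the requests the browser would still be allowed to make."""
    return [u for u in urls if not is_blocked(u)]


page_requests = [
    "https://normcore.example/article.html",
    "https://cdn.ads.example/banner.js",
    "https://tracker.example/pixel.gif",
    "https://normcore.example/style.css",
]
print(filter_requests(page_requests))
```

Only the article and its stylesheet survive. The banner script and the tracking pixel are never requested, which is also why pages tend to load faster with a blocker on.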

Let’s take a look at what’s blocked on Wired.com, where that first ad ran. This is what shows up in uBlock for Firefox. The links highlighted in red match known regular expressions for URLs that serve ads.

How does the adblocker know what to add to the blocklist? 

This is the crazy part. There are about 4-5 guys in the world (and some one-off open-source contributors) managing these lists. The biggest blocklist, historically and today, is EasyList.

EasyList was originally launched in 2005, as a kind of add-on to the Adblock browser extension. Several different people have overseen it since then, and today, a group of four people, led by a man named Ryan Brown, is authorized to change EasyList’s rules.

Over time, its list of rules (and exceptions to those rules) has grown sprawling. Analysis conducted by Brave last summer found over 70,000 rules in EasyList, a mixture of network rules, which determine whether a site fetches sites or code from web addresses that match a certain kind of pattern; element rules, which dictate whether certain page elements, such as banners, can be displayed; and exceptions to the element and network rules.
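To get a feel for those three kinds of rules, here is a rough Python sketch that sorts filter lines into categories. It assumes the common Adblock Plus-style conventions (`!` for comments, `@@` and `#@#` for exceptions, `##` for element hiding) and ignores the many other options real lists use. The `classify_rule` helper and the sample rules are made up for illustration and are not taken from EasyList.

```python
from collections import Counter


def classify_rule(line):
    """Sort one filter-list line into a rough category."""
    line = line.strip()
    if not line or line.startswith("!") or line.startswith("["):
        return "comment"    # blank lines, comments, and the list header
    if line.startswith("@@") or "#@#" in line:
        return "exception"  # allow-rules that override a block
    if "##" in line:
        return "element"    # hide page elements matching a CSS selector
    return "network"        # block requests whose URL matches a pattern


sample = """\
[Adblock Plus 2.0]
! Title: tiny made-up sample, not real EasyList content
||ads.example^
/banner/ads/*
##.ad-banner
normcore.example##.sponsored
@@||ads.example/allowed.js
normcore.example#@#.sponsored
"""

counts = Counter(classify_rule(row) for row in sample.splitlines())
print(counts)
```

Running the same function over a downloaded copy of a real list would give a rough version of the breakdown described above, though a careful count would need a proper parser.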

The other amazing thing about EasyList is that it literally is just a group of text files that you can browse on GitHub, and if you’re even more curious, you can dig in and look at everything that was added. And, if you want, you can even contribute.

You can also see the commits, and see Ryan and a handful of others dutifully committing new changes in the dynamic, shifting landscape of trying to keep a step ahead of advertisers and providing a good browsing experience for the general population of internet users. 

Block On

What’s the impact of having such a small group of people take on the entire adtech ecosystem? Well, there are the usual development problems: bad, outdated data, for example. 

A recent study found a lot of outdated rules that slow down the blocking logic. It should be mentioned, however, that the study’s author, Brave, the browser company run by Brendan Eich, the creator of JavaScript and former CEO of Mozilla, ships its own adblocker and so has a vested interest in uncovering these inefficiencies and exploiting them. (In a way, it’s kind of ironic that Brendan developed JavaScript, which powers the modern front-end internet, including making it possible to dynamically serve ads, the very thing his new company is now trying to fight. Oh, and there’s also cryptocurrency involved, which makes me think that there should probably be a separate Normcore on Brave. But I digress.)

The other thing is that legitimate sites sometimes get blocked by accident, and there’s no immediate recourse:

While the purpose of the list has always been to keep advertising out of web experiences, EasyList’s rules regularly break normal editorial features on sites. In the past six months, EasyList changes have broken the buy buttons on commerce site The Inventory, the video player on Animal Planet, disrupted site navigation on Fandom, and disrupted the style and CSS loading process on job search site Indeed.

Though most of these issues were resolved quickly, as publisher sites continue to evolve, they have to contend with the possibility that they might run afoul of one of the most important crowd-maintained documents on the internet.

But, as usual, the problem is not in the list itself, but the continued growth of the online advertising ecosystem, and the attempts of adblockers to stay one step ahead. 

It’s kind of amazing when you think about it: this small group of volunteers, working by hand, up against a multibillion dollar industry outfitted with algorithms to the hilt, whose entire goal is to thwart them. But if that’s not normcore, I don’t know what is.

What I’m reading lately: 

  1. A short history of Flash

  2. TIL there is a Burger King in Germany that’s housed in a former Nazi power station

  3. Just watched Darmok, still thinking about this episode. Strong recommend

  4. Storium

  5. Dads, commit to your family at home and at work

  6. How I Git

  7. Guido at Microsoft

  8. When data disappears

  9. Writing a technical book: from idea to print

  10. The great data debate


The Newsletter:

This newsletter’s M.O. is takes on tech news that are rooted in humanism, nuance, context, rationality, and a little fun. It goes out once or twice a week. If you like it, forward it to friends and tell them to subscribe!

Swag: Stickers. Mug. Notepad.

The Author:

I’m a machine learning engineer. Most of my free time is spent wrangling a kindergartner and a toddler, reading, and writing bad tweets. Find out more here or follow me on Twitter.