<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Layman's Guide to Computing - Season 06</title><link href="https://ngjunsiang.github.io/laymansguide/" rel="alternate"></link><link href="https://ngjunsiang.github.io/laymansguide/feeds/season-06.atom.xml" rel="self"></link><id>https://ngjunsiang.github.io/laymansguide/</id><updated>2020-07-04T08:00:00+08:00</updated><entry><title>Issue 78: uMatrix: voyeuring the voyeurs</title><link href="https://ngjunsiang.github.io/laymansguide/issue078.html" rel="alternate"></link><published>2020-07-04T08:00:00+08:00</published><updated>2020-07-04T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-07-04:/laymansguide/issue078.html</id><summary type="html">&lt;p&gt;Modern webpages rely on many third-party resources for their functionality. Blocking access to some domains may cause these webpages to break and stop&amp;nbsp;working.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; The default settings of most browsers expose a lot of information to scripts that request it. To prevent such scripts from running, we need services that can filter &lt;strong&gt;the source&lt;/strong&gt; of these scripts. These services generally work by matching browser requests against a blacklist, and blocking the request if it comes from a domain known to host malicious&amp;nbsp;scripts.&lt;/p&gt;
&lt;p&gt;Many existing solutions to blocking scripts—let’s call them script-blockers—rely on manually managing a blacklist. That is to be expected, but few of them make it easy to see which domains the scripts are coming&amp;nbsp;from.&lt;/p&gt;
&lt;h2&gt;uMatrix&lt;/h2&gt;
&lt;p&gt;As part of my research for this season, I installed &lt;a href="https://github.com/gorhill/uMatrix"&gt;uMatrix&lt;/a&gt;, a browser extension for &lt;a href="https://addons.mozilla.org/firefox/addon/umatrix/"&gt;Firefox&lt;/a&gt;, &lt;a href="https://chrome.google.com/webstore/detail/%C2%B5matrix/ogfcmafjalglgifnmanfmnieipoejdcf"&gt;Chrome&lt;/a&gt;, and &lt;a href="https://addons.opera.com/en-gb/extensions/details/umatrix/"&gt;Opera&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Once installed, it adds a button beside the address bar. When clicked, this button pops up a matrix showing the number of resources loaded from each&amp;nbsp;domain:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of uMatrix in Firefox browser, showing default settings." src="https://ngjunsiang.github.io/laymansguide/issue078_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;uMatrix in Firefox showing default settings.&lt;br /&gt;Items highlighted in green are permitted to load, items in red are&amp;nbsp;blocked.&lt;/em&gt;    &lt;/p&gt;
&lt;h2&gt;Understanding the&amp;nbsp;Matrix&lt;/h2&gt;
&lt;p&gt;Along the top row, the column headers tell us what kind of resources are being requested by the page. A quick&amp;nbsp;refresher:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;cookies are little bits of information that scripts attach to a domain in the browser (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue069.html"&gt;Issue 69&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;CSS&lt;/span&gt; (Cascading Style Sheet) files describe the styling to be applied to the&amp;nbsp;page&lt;/li&gt;
&lt;li&gt;image files need no explanation I&amp;nbsp;hope&lt;/li&gt;
&lt;li&gt;media covers any rich/animated media e.g.&amp;nbsp;videos&lt;/li&gt;
&lt;li&gt;scripts are JavaScript files containing code to be executed when the page has loaded&amp;nbsp;them&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XHR&lt;/span&gt; (XMLHttpRequest) calls are requests for other resources, for example to verify a Captcha, get a winning ad bid (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue073.html"&gt;Issue 73&lt;/a&gt;) … or something as innocent as getting the weather&amp;nbsp;forecast&lt;/li&gt;
&lt;li&gt;frame refers to iframes (inline frames), which are a way of embedding a webpage inside another. You see this often on sites which display &lt;span class="caps"&gt;PDF&lt;/span&gt; files within their pages. But this can also be used to embed Captcha puzzles within a login box, for&amp;nbsp;instance.&lt;/li&gt;
&lt;li&gt;other: I won’t go into the other esoteric means of loading data onto a webpage; we won’t need that for this&amp;nbsp;issue&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;At a glance, I can see that just to load the login, the Dropbox webpage is pulling resources not only from dropbox.com, but also&amp;nbsp;from:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;dropboxcaptcha.com&lt;/li&gt;
&lt;li&gt;dropboxstatic.com&lt;/li&gt;
&lt;li&gt;google.com&lt;/li&gt;
&lt;li&gt;fonts.googleapis.com&lt;/li&gt;
&lt;li&gt;gstatic.com&lt;/li&gt;
&lt;li&gt;googletagmanager.com&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These are represented as row&amp;nbsp;labels.&lt;/p&gt;
&lt;p&gt;The numbers in each cell represent how many resources of each type are being loaded from each&amp;nbsp;domain.&lt;/p&gt;
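&lt;p&gt;To make the idea of the matrix concrete, here is a minimal Python sketch of how such a table of counts could be built. It is not how uMatrix itself is written; the function name &lt;code&gt;build_matrix&lt;/code&gt; and the sample URLs are made up for illustration.&lt;/p&gt;

```python
from collections import Counter
from urllib.parse import urlparse


def build_matrix(requests):
    """Count requests per (hostname, resource type) pair."""
    matrix = Counter()
    for url, resource_type in requests:
        host = urlparse(url).hostname
        matrix[(host, resource_type)] += 1
    return matrix


# Hypothetical requests made by one page; the paths are invented.
sample_requests = [
    ("https://www.dropbox.com/logo.png", "image"),
    ("https://cfl.dropboxstatic.com/static/a.js", "script"),
    ("https://cfl.dropboxstatic.com/static/b.js", "script"),
    ("https://fonts.googleapis.com/css", "css"),
]

matrix = build_matrix(sample_requests)
for (host, resource_type), count in sorted(matrix.items()):
    print(host, resource_type, count)
```

&lt;p&gt;Each row label in the screenshot corresponds to a hostname here, each column header to a resource type, and each cell to one of the counts.&lt;/p&gt;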
&lt;p&gt;&lt;span class="caps"&gt;CSS&lt;/span&gt; and images are considered important and quite harmless, and are thus allowed by default. First-party resources (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue076.html"&gt;Issue 76&lt;/a&gt;) are allowed too: the website itself has full control over them, so they are considered “secure”, assuming you trust that website enough to be there in the first&amp;nbsp;place.&lt;/p&gt;
&lt;h2&gt;Blacklisting or whitelisting&amp;nbsp;domains&lt;/h2&gt;
&lt;p&gt;By default, some domains known to host scripts for tracking are already blacklisted. googletagmanager.com (highlighted in bold red) is the domain for Google’s Tag Manager platform for measuring and analysing browsing data. It is how their ads can get personalised data on you, so it is on uMatrix’s blacklist once you install&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;Other third-party domains are blacklisted by default (highlighted in light red) for your safety, but I can choose to whitelist them by clicking on them until they are highlighted in light&amp;nbsp;green.&lt;/p&gt;
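&lt;p&gt;The default behaviour described above can be sketched as a small decision function. This is only my simplified reading of those defaults, not uMatrix’s actual rule engine; the names &lt;code&gt;BLACKLIST&lt;/code&gt;, &lt;code&gt;WHITELIST&lt;/code&gt; and &lt;code&gt;is_allowed&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
# Assumed starting state: one tracker on the built-in blacklist,
# and nothing whitelisted by the user yet.
BLACKLIST = {"googletagmanager.com"}
WHITELIST = set()
ALLOWED_TYPES = {"css", "image"}  # considered harmless by default


def is_allowed(page_domain, request_domain, resource_type):
    """Decide whether a request is let through under the default settings."""
    if request_domain in WHITELIST:
        return True   # the user clicked it green
    if request_domain in BLACKLIST:
        return False  # known tracker (bold red)
    if request_domain == page_domain or request_domain.endswith("." + page_domain):
        return True   # first-party resources are trusted
    return resource_type in ALLOWED_TYPES  # third parties: CSS and images only


print(is_allowed("dropbox.com", "gstatic.com", "script"))
print(is_allowed("dropbox.com", "gstatic.com", "css"))
```

&lt;p&gt;Whitelisting a domain in this sketch is just adding it to &lt;code&gt;WHITELIST&lt;/code&gt;, after which all of its resource types are allowed.&lt;/p&gt;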
&lt;h2&gt;Dissecting page&amp;nbsp;functionality&lt;/h2&gt;
&lt;p&gt;That’s interesting … blocking all third-party resources does not stop the page from loading at all! So what are those resources doing (especially the 63 scripts from cfl.dropboxstatic.com)? Let’s continue using the webpage to find&amp;nbsp;out.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of Error (405) when logging in with all third-party resources blocked." src="https://ngjunsiang.github.io/laymansguide/issue078_02.png" /&gt;&lt;br /&gt;
&lt;em&gt;&lt;code&gt;Error (405)&lt;/code&gt; means &lt;code&gt;Method Not Allowed&lt;/code&gt;, implying that something is missing from the webpage resulting in it not understanding what to do.&amp;nbsp;Oops.&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;Error 405. Looks like I broke something. This is the tedious part: I whitelist one domain at a time, reloading the page each time to see if anything&amp;nbsp;changes.&lt;/p&gt;
&lt;p&gt;It turns out the Dropbox webpage is doing a surprising number of things behind the scenes! By the time I managed to get a login, uMatrix looked like&amp;nbsp;this:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of uMatrix in Firefox browser, showing some domains whitelisted." src="https://ngjunsiang.github.io/laymansguide/issue078_03.png" /&gt;&lt;br /&gt;
&lt;em&gt;uMatrix in Firefox showing settings that got Dropbox working.&lt;br /&gt;I had to allow embedded frames from dropboxcaptcha.com and google.com as&amp;nbsp;well.&lt;/em&gt;    &lt;/p&gt;
&lt;h2&gt;Spotting the&amp;nbsp;patterns&lt;/h2&gt;
&lt;p&gt;If you are thinking of trying this, be warned: this will frustrate your browsing experience for the first week or so (after you take a couple of days to figure out how the uMatrix interface works) while you build up a custom whitelist of domains on your usual online haunts. There is an “off” button for times when you really don’t have the brainspace to be figuring this out (e.g. when you are just trying to get some ibanking done quickly), but it shouldn’t be the default&amp;nbsp;setting.&lt;/p&gt;
&lt;p&gt;I did this because I wanted to know what my web browser is doing. And here are some things I’ve figured out through this&amp;nbsp;exercise:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Big websites often load their unchanging (static) resources, such as images, &lt;span class="caps"&gt;CSS&lt;/span&gt; files, script files, etc, from a separate domain.&lt;br /&gt;
  Presumably they do this so that this other domain can be set up for caching (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue039.html"&gt;Issue 39&lt;/a&gt;). Having static files cached on the browser makes the browsing experience much smoother, as static parts such as the icons and stylesheets can be rendered (put on screen) first while waiting for dynamic data to load.&lt;br /&gt;
  Dropbox loads their static resources from&amp;nbsp;dropboxstatic.com.&lt;/li&gt;
&lt;li&gt;Big websites may load their dynamic data from a &lt;span class="caps"&gt;CDN&lt;/span&gt; (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue073.html"&gt;Issue 73&lt;/a&gt;).&lt;br /&gt;
  Once traffic gets large enough that a single server might not be able to handle peak load, many online services (such as Squarespace) switch to delivering their content through a &lt;span class="caps"&gt;CDN&lt;/span&gt;. These resources will appear to be loaded from a third party. So anything with a “cdn” in the domain is &lt;em&gt;probably&lt;/em&gt;&amp;nbsp;safe.&lt;/li&gt;
&lt;li&gt;ReCaptchas don’t always need a pop-up.&lt;br /&gt;
  Some of them run in the background, checking to see if you have already been verified human somewhere else, or verifying you by other means.&lt;br /&gt;
  Dropbox loads its captchas from dropboxcaptcha.com &lt;strong&gt;and&lt;/strong&gt; google.com (for Google’s reCaptcha service). Two layers of&amp;nbsp;captchas!&lt;/li&gt;
&lt;li&gt;There are many websites out there that rely on google.com being whitelisted.&lt;br /&gt;
  This is what happens when you have a single company providing so many critical services that their domain has to be whitelisted. If blocked, the webpage will no longer&amp;nbsp;work.&lt;/li&gt;
&lt;li&gt;Some websites rely on “daisy-chaining”, where script A loads script B which loads script C, and so on.&lt;br /&gt;
  You know this because when using uMatrix, you whitelist a domain and reload the page, and another domain appears. You whitelist that domain, and another one appears&amp;nbsp;…&lt;/li&gt;
&lt;/ul&gt;
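&lt;p&gt;The daisy-chaining pattern in the last point can be simulated with a toy example. The sketch below assumes a made-up chain of domains (&lt;code&gt;LOADS&lt;/code&gt;) and a hypothetical helper &lt;code&gt;reloads_needed&lt;/code&gt;; it only counts how many whitelist-and-reload rounds it takes before no new domain shows up.&lt;/p&gt;

```python
# Hypothetical chain: once a domain is allowed, its script asks for the next one.
LOADS = {
    "site.example": ["a.example"],
    "a.example": ["b.example"],
    "b.example": ["c.example"],
    "c.example": [],
}


def reloads_needed(page, loads):
    """Whitelist every newly revealed domain, reload, and repeat until stable."""
    allowed = {page}
    reloads = 0
    while True:
        # Domains requested by already-allowed scripts but not yet allowed.
        newly_seen = {
            domain
            for host in allowed
            for domain in loads.get(host, [])
            if domain not in allowed
        }
        if not newly_seen:
            return reloads, allowed
        allowed |= newly_seen
        reloads += 1


reloads, allowed = reloads_needed("site.example", LOADS)
print(reloads, sorted(allowed))
```

&lt;p&gt;Each extra link in the chain costs one more reload, which is why this part of the exercise feels so tedious.&lt;/p&gt;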
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Modern webpages rely on many third-party resources for their functionality. Blocking access to some domains may cause these webpages to break and stop&amp;nbsp;working.&lt;/p&gt;
&lt;p&gt;This was fun, in a masochistic sort of way. Most of what I learnt here is not really newsletter-worthy: how prevalent Google is, what a clean webpage looks like in the backend (very few domains), what a massive webpage looks like (lots of domains! E.g. Trello), what the most popular CDNs are, and some dead giveaways of a webpage quickly spiralling out of control (large numbers on a single domain, slow loading with no static domain or &lt;span class="caps"&gt;CDN&lt;/span&gt;) … maybe I’ll figure out the layman-worthy parts of it someday and put it in another&amp;nbsp;season.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S7] Issue 79: A Base for&amp;nbsp;Data&lt;/p&gt;
&lt;p&gt;Next season, we go back to data again. Specifically, we look at how data is stored and managed for most of the internet: in a&amp;nbsp;database.&lt;/p&gt;
&lt;p&gt;What is a database and why do we need&amp;nbsp;one?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category><category term="cache"></category></entry><entry><title>Issue 77: Wearing clothes on the Internet</title><link href="https://ngjunsiang.github.io/laymansguide/issue077.html" rel="alternate"></link><published>2020-06-20T08:00:00+08:00</published><updated>2020-06-20T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-06-20:/laymansguide/issue077.html</id><summary type="html">&lt;p&gt;The default settings of most browsers expose a lot of information to scripts that request it. To prevent such scripts from running, we need services that can filter &lt;strong&gt;the source&lt;/strong&gt; of these scripts. These services generally work by matching browser requests against a blacklist, and blocking the request if it comes from a domain known to host malicious&amp;nbsp;scripts.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Cookies with the same domain as the site are first-party cookies, while cookies with domains different from the site are third-party cookies. Cookies are used for all kinds of purposes, from remembering browsing sessions, to logging users in, to tracking their identity across websites. Blocking all third-party cookies indiscriminately can result in most if not all of these functions breaking. And yet, not blocking them at all means that you are being tracked across all your browsing sessions, likely without your explicit&amp;nbsp;permission.&lt;/p&gt;
&lt;p&gt;I apologise for the titillating title, though I believe it is apt. After all, your choice of clothing is not about ensuring not a single square centimetre of skin is seen, nor is it about covering the absolute bare minimum. It is not about everybody having to follow the exact same dress code. It is about giving you &lt;em&gt;choices&lt;/em&gt; about how far along the spectrum you want to be, from totally uncovered at one end to totally covered at the other end. It is about giving you &lt;em&gt;options&lt;/em&gt; in deciding where to cover and where not to&amp;nbsp;cover.&lt;/p&gt;
&lt;p&gt;But I’m getting ahead of myself. Cover yourself from what? From scripts that seek to see things they shouldn’t. And what are they trying to see? Your &lt;em&gt;information&lt;/em&gt;.&lt;/p&gt;
&lt;h2&gt;What a script&amp;nbsp;sees&lt;/h2&gt;
&lt;p&gt;There are websites online (such as &lt;a href="https://privacy.net"&gt;privacy.net&lt;/a&gt;) which can tell you what information is exposed by your browser (and other settings). They do so by, of course, actually extracting this information by any means possible. Go on, give it a try if you’re not&amp;nbsp;paranoid.&lt;/p&gt;
&lt;p&gt;If you are, I did it for you so you don’t have to. Here’s what it can see, in decreasing order of control (I skip privacy hacks/cheats here because the list would be almost&amp;nbsp;endless):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span class="caps"&gt;IP&lt;/span&gt; address&lt;br /&gt;
   Your &lt;span class="caps"&gt;IP&lt;/span&gt; address is just one &lt;span class="caps"&gt;IP&lt;/span&gt; lookup away from revealing your &lt;strong&gt;&lt;span class="caps"&gt;ISP&lt;/span&gt;&lt;/strong&gt; and your approximate &lt;strong&gt;location&lt;/strong&gt; (using geoIP&amp;nbsp;services)&lt;/li&gt;
&lt;li&gt;Browser (and probably &lt;span class="caps"&gt;OS&lt;/span&gt;), with version information&lt;br /&gt;
   This lets scripts know if you are using a (possibly outdated) browser version. Since most browser vulnerabilities are published online (to help security researchers patch them), you should keep your browser updated to benefit from these security patches.&lt;br /&gt;
   &lt;span class="caps"&gt;OS&lt;/span&gt; information can provide some demographic information (e.g. if you are an Apple user or Linux user), and also whether you are on a mobile browser or laptop browser. With many data points, a data aggregator can learn if you are on the move often (mostly on mobile browser) or generally static (about 50/50 between mobile and&amp;nbsp;laptop).&lt;/li&gt;
&lt;li&gt;Screen resolution (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue044.html"&gt;Issue 44&lt;/a&gt;)&lt;br /&gt;
   This can provide enough info to put you in an income bracket; cheaper devices generally have lower resolution. A mid-range or high-end phone usually has a resolution of 1080×1920 or&amp;nbsp;higher.&lt;/li&gt;
&lt;li&gt;Autofill information&lt;br /&gt;
   Any information you save in your browser, to be autofilled in forms, can be extracted by a script. It creates a hidden input field that the browser detects and autofills. The script can then send this information as an &lt;span class="caps"&gt;HTTP&lt;/span&gt; request back to the originating&amp;nbsp;server.&lt;/li&gt;
&lt;li&gt;Accounts you are logged in to.&lt;br /&gt;
   A script can sniff other cookies on your browser session and match them against known cookies to do this. These cookies may also contain other info, such as your username, last accessed timestamp, last search term,&amp;nbsp;etc.&lt;/li&gt;
&lt;li&gt;Information that you have given permission to access&lt;br /&gt;
   If you run browser plugins and third-party services on your accounts (e.g. Google Drive Addons), you may have granted additional permissions that give these services permission to access your contact list, location, microphone, camera, etc. Needless to say, they now have access to that&amp;nbsp;information.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All this, before a script even lays a single cookie on you! Then there’s all the information it can get through the tracking pixels and cookie IDs on the webpage, when it looks up those IDs in its own database. And if it takes a step further and attempts to exploit some common vulnerabilities, it may also&amp;nbsp;know:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Your browsing history&lt;br /&gt;
   A script can check whether each link on a page has been visited before (this is why links you have visited before can appear in a different style; if you’re a millennial or Gen-Xer, remember the blue links and purple links?). By applying this check on every &lt;span class="caps"&gt;URL&lt;/span&gt; it comes across, it is able to build up a browsing history of your device, albeit in a limited&amp;nbsp;way.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Okay, I’ll stop scaring you here, although I am by no means done with all the things a script can do once it has been loaded by a webpage. But I hope I’ve made my point: you need to limit what scripts can see about you. In other words, &lt;em&gt;you need to wear clothes on the Internet&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;Let’s talk about some broadly useful strategies (note: this is a newsletter, not a howto guide. I won’t walk you through the steps here, just outline the strategies&amp;nbsp;available):&lt;/p&gt;
&lt;h2&gt;&lt;span class="caps"&gt;DNS&lt;/span&gt;&amp;nbsp;blocking&lt;/h2&gt;
&lt;p&gt;A quick refresher on &lt;span class="caps"&gt;DNS&lt;/span&gt; (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue028.html"&gt;Issue 28&lt;/a&gt;): each time the browser is given a &lt;span class="caps"&gt;URL&lt;/span&gt; to load, it first figures out the &lt;span class="caps"&gt;IP&lt;/span&gt; address associated with the domain name of the &lt;span class="caps"&gt;URL&lt;/span&gt;&amp;nbsp;(e.g. &lt;code&gt;facebook.com&lt;/code&gt; is the domain name of a &lt;span class="caps"&gt;URL&lt;/span&gt;&amp;nbsp;like &lt;code&gt;https://www.facebook.com/&amp;lt;username&amp;gt;/posts/17-digit-number&lt;/code&gt;). It does this through a &lt;span class="caps"&gt;DNS&lt;/span&gt; lookup request to a &lt;span class="caps"&gt;DNS&lt;/span&gt;&amp;nbsp;server.&lt;/p&gt;
&lt;p&gt;Your default &lt;span class="caps"&gt;DNS&lt;/span&gt; server is usually run by your &lt;span class="caps"&gt;ISP&lt;/span&gt;. This allows your &lt;span class="caps"&gt;ISP&lt;/span&gt; to do some content filtering for you (e.g. if you signed up for a parental control service by them), by simply &lt;em&gt;blocking all requests&lt;/em&gt; to a particular &lt;span class="caps"&gt;IP&lt;/span&gt; address or domain. For example, if you have &lt;span class="caps"&gt;ISP&lt;/span&gt; parental controls enabled, and the &lt;span class="caps"&gt;ISP&lt;/span&gt; detects a &lt;span class="caps"&gt;DNS&lt;/span&gt; lookup request to resolve a blacklisted domain&amp;nbsp;like &lt;code&gt;www.xxxchicksxxx.com&lt;/code&gt; to its &lt;span class="caps"&gt;IP&lt;/span&gt; address, it will simply block the request by not returning any result: stopped at the source! (Note: that domain is probably fictional, I have not tested&amp;nbsp;it!)&lt;/p&gt;
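&lt;p&gt;Here is a toy Python sketch of that idea: a lookup function that refuses to answer for blacklisted domains. It does no real networking. The lookup table, the blacklist and the name &lt;code&gt;filtered_lookup&lt;/code&gt; are all invented for illustration, and the &lt;span class="caps"&gt;IP&lt;/span&gt; addresses come from ranges reserved for documentation.&lt;/p&gt;

```python
# Made-up data standing in for a real DNS server's records and blocklist.
BLOCKED_DOMAINS = {"tracker.example"}
DNS_TABLE = {
    "example.com": "192.0.2.10",
    "ads.tracker.example": "203.0.113.7",
}


def filtered_lookup(domain):
    """Return the IP address for a domain, or None if it is blocked or unknown."""
    name = domain.lower().rstrip(".")
    parts = name.split(".")
    # Check the domain itself and every parent domain against the blocklist.
    for i in range(len(parts)):
        if ".".join(parts[i:]) in BLOCKED_DOMAINS:
            return None  # blocked: no result is returned
    return DNS_TABLE.get(name)


print(filtered_lookup("example.com"))
print(filtered_lookup("ads.tracker.example"))
```

&lt;p&gt;Because the browser never learns the &lt;span class="caps"&gt;IP&lt;/span&gt; address of a blocked domain, it cannot even begin to request anything from it.&lt;/p&gt;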
&lt;p&gt;What if you don’t want to pay for that service? You could use other alternatives, such as &lt;a href="https://www.opendns.com/"&gt;OpenDNS&lt;/a&gt;. You will need&amp;nbsp;to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Register an account. You need an account for OpenDNS to remember your&amp;nbsp;settings.&lt;/li&gt;
&lt;li&gt;Change your &lt;span class="caps"&gt;DNS&lt;/span&gt; server &lt;span class="caps"&gt;IP&lt;/span&gt; address to OpenDNS’s&amp;nbsp;servers: &lt;code&gt;208.67.222.222&lt;/code&gt; and &lt;code&gt;208.67.220.220&lt;/code&gt;&lt;br /&gt;
   If you do this on your wireless router, anyone using that wifi connection will use the same &lt;span class="caps"&gt;DNS&lt;/span&gt; server—benefits for&amp;nbsp;all!&lt;/li&gt;
&lt;li&gt;Decide the level of filtering you want. You can customise the blocked domain names, or whitelist some that you need (the higher levels can be pretty aggressive and cause some services to stop&amp;nbsp;working)&lt;/li&gt;
&lt;li&gt;Register your &lt;span class="caps"&gt;IP&lt;/span&gt; address with your account, so OpenDNS can apply your settings to requests from your &lt;span class="caps"&gt;IP&lt;/span&gt; address. Since your &lt;span class="caps"&gt;ISP&lt;/span&gt; may change your &lt;span class="caps"&gt;IP&lt;/span&gt; address periodically, you may need to enable a &lt;span class="caps"&gt;DDNS&lt;/span&gt; service (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue031.html"&gt;Issue 31&lt;/a&gt;), again best done on your router. Some modern routers may have this built-in for you to&amp;nbsp;configure.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;Script filtering with a browser&amp;nbsp;addon&lt;/h2&gt;
&lt;p&gt;Some browser addons can help you detect script sources, and &lt;strong&gt;block the script from loading&lt;/strong&gt; if the originating domain is blacklisted. The blacklist is full of tracking companies and data aggregators, and is updated by volunteers on a regular&amp;nbsp;basis.&lt;/p&gt;
&lt;p&gt;This currently works on laptop browsers only, as most mobile browsers do not support&amp;nbsp;addons.&lt;/p&gt;
&lt;h2&gt;Web-filtering mobile&amp;nbsp;apps&lt;/h2&gt;
&lt;p&gt;Although mobile browsers do not support addons, some mobile apps are able to help you do this blocking. They do so by setting up an app-controlled &lt;span class="caps"&gt;VPN&lt;/span&gt; on your phone, routing all internet traffic through that &lt;span class="caps"&gt;VPN&lt;/span&gt;, and filtering blacklisted &lt;span class="caps"&gt;DNS&lt;/span&gt; lookup&amp;nbsp;requests.&lt;/p&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;The options are not many, and they often don’t leave you with many configuration options. Adding a domain to a blacklist/whitelist is tedious, and most users end up not enabling it at&amp;nbsp;all.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; The default settings of most browsers expose a lot of information to scripts that request it. To prevent such scripts from running, we need services that can filter &lt;strong&gt;the source&lt;/strong&gt; of these scripts. These services generally work by matching browser requests against a blacklist, and blocking the request if it comes from a domain known to host malicious&amp;nbsp;scripts.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 78: uMatrix: voyeuring the&amp;nbsp;voyeurs&lt;/p&gt;
&lt;p&gt;In my virtual travels, I have found an addon that actually makes it easy for you to see what domains the scripts on a page are coming from. It even makes it easy for you to decide if you want to block them in future. It is by no means easy to use, as it requires some background knowledge of what the different kinds of requests are and what they do, but it makes it really easy to experiment and learn about privacy at the same&amp;nbsp;time!&lt;/p&gt;
&lt;p&gt;I’ll reserve the last issue of this season to show you some screenshots from it&amp;nbsp;:)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry><entry><title>Issue 76: Third-parties and cross-site resources</title><link href="https://ngjunsiang.github.io/laymansguide/issue076.html" rel="alternate"></link><published>2020-06-13T08:00:00+08:00</published><updated>2020-06-13T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-06-13:/laymansguide/issue076.html</id><summary type="html">&lt;p&gt;Cookies with the same domain as the site are first-party cookies, while cookies with domains different from the site are third-party cookies. Cookies are used for all kinds of purposes, from remembering browsing sessions, to logging users in, to tracking their identity across websites. Blocking all third-party cookies indiscriminately can result in most if not all of these functions&amp;nbsp;breaking.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; By not enforcing strict cookie policies on their own sites, publishers allowed advertisers to sneakily set cookies on their site audience. This allowed advertisers to reach the same audience via their advertising slots on other websites, which could be bought more cheaply. The publishers were cut out of the value chain and were no longer “gatekeepers” to their own site readers. They could not sell their advertising slots at a&amp;nbsp;premium.&lt;/p&gt;
&lt;h2&gt;First-party&amp;nbsp;cookies&lt;/h2&gt;
&lt;p&gt;Almost every site that needs to “remember” who you are will set cookies on your browser. The reasons for doing so can range from simply remembering that you are not new to the site and don’t need to be reminded to subscribe to their promotional newsletter, to giving you a login cookie so that the site knows you are logged in. (This cookie gets removed when you log out, which is why clearing cookies automatically logs you out of most&amp;nbsp;sites.)&lt;/p&gt;
&lt;p&gt;The site publisher sets these cookies via scripts that are often hosted on the same domain as the site. Since cookies are tagged by domain, these cookies will have the same domain as the site. These are &lt;strong&gt;first-party cookies&lt;/strong&gt;. Blocking these will result in internet-wide breakage, particularly of the large majority of login&amp;nbsp;mechanisms.&lt;/p&gt;
&lt;h2&gt;Third-party&amp;nbsp;cookies&lt;/h2&gt;
&lt;p&gt;On the other hand, if the site uses a script from another domain, and this other-domain script sets a cookie, that cookie will have a domain tag that is not the same as the site &lt;span class="caps"&gt;URL&lt;/span&gt; domain (e.g. huffpost.com using a script from an advertiser that sets an advertising cookie, which will not have huffpost.com as its domain). These cookies are known as &lt;strong&gt;third-party cookies&lt;/strong&gt;.&lt;/p&gt;
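&lt;p&gt;The distinction boils down to a domain comparison, which can be sketched in a few lines of Python. This is a deliberately simplified check: the function name &lt;code&gt;cookie_party&lt;/code&gt; and the advertiser domain are hypothetical, and real browsers use more careful rules for deciding which domains belong to the same site.&lt;/p&gt;

```python
def cookie_party(site_domain, cookie_domain):
    """Label a cookie as first-party or third-party relative to the site visited."""
    site = site_domain.lower().lstrip(".")
    cookie = cookie_domain.lower().lstrip(".")
    # Same domain, or the site is a subdomain of the cookie's domain.
    if site == cookie or site.endswith("." + cookie):
        return "first-party"
    return "third-party"


print(cookie_party("www.huffpost.com", "huffpost.com"))
print(cookie_party("www.huffpost.com", "adnetwork.example"))
```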
&lt;p&gt;These cookies enable advertisers and data-mining companies to track you across websites. Any website you visit which is running their script can retrieve these cookies and send the cookie information back to their servers. This is known as &lt;strong&gt;cross-site tracking&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;A simple way to block pretty much all cross-site tracking is to block third-party cookies. But this also causes other problems, as I will explain&amp;nbsp;below.&lt;/p&gt;
&lt;h2&gt;Software-as-a-Service needs third-party&amp;nbsp;scripts&lt;/h2&gt;
&lt;p&gt;There was once a time when sites took it upon themselves to run all the services they needed. Login, authentication, database management … everything was handled on the server, by scripts that originated on the same server. First-party&amp;nbsp;everything.&lt;/p&gt;
&lt;p&gt;As sites grew more complex, Software-as-a-Service (SaaS) companies grew to provide more specialised services involved in the running of such sites. Companies cropped up offering off-site databases, login servers, and all manner of services. That means that when users visit your site, the browser downloads the SaaS company’s script, which carries out the task for&amp;nbsp;you.&lt;/p&gt;
&lt;p&gt;For example, Google’s reCAPTCHA service lets you add a &lt;span class="caps"&gt;CAPTCHA&lt;/span&gt; to your site. A &lt;span class="caps"&gt;CAPTCHA&lt;/span&gt; is a test that humans are supposed to pass and bots (automated scripts) are supposed to fail: usually some image recognition-based task such as “identify all buses” or “identify all traffic lights” or “type the letters you see”. The code involved in carrying this out is not simple, and most sites are not capable of running the full backend required to make it work. So they embed a reCAPTCHA script from Google on their site, let the script verify that the user is a human, and then carry on as&amp;nbsp;usual.&lt;/p&gt;
&lt;p&gt;However, the Google reCAPTCHA script sets and retrieves cookies. (I am guessing it probably sends your Google cookie to its servers to look up your online history and determine if you are malicious or not.) Since the script originates from Google and not from the website itself, the browser treats the cookies it sets as third-party cookies. Disabling third-party cookies will therefore also cause reCAPTCHA to fail, resulting in a non-functional login for the&amp;nbsp;site.&lt;/p&gt;
&lt;h2&gt;Cookie&amp;nbsp;categories&lt;/h2&gt;
&lt;p&gt;For this reason, cookie policies often differentiate between cookie&amp;nbsp;categories:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Session cookies&lt;br /&gt;
   Cookies that are set &lt;strong&gt;for that browsing session only&lt;/strong&gt;. These cookies are removed when the browser window is closed. These cookies may be used to remember your progress in a multi-step transaction, e.g. doing a multi-page&amp;nbsp;survey.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Persistent cookies&lt;br /&gt;
   These cookies last beyond the current browsing session, and normally terminate after a pre-defined period of time (I often see “1 year” as a default value of sorts, although it can even be set to 30 years!)  Such cookies are used to remember the state you left a service in (e.g. what you have in your shopping cart, even if you didn’t log in or create an&amp;nbsp;account).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Strictly necessary cookies&lt;br /&gt;
   (Subjectively) necessary cookies for legal compliance or other reasons, for example implementing parental controls, or internal analytics (tracking most-visited pages, or visit&amp;nbsp;frequency).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Functional cookies&lt;br /&gt;
   Cookies set for the intent of enabling site functionality, e.g. remembering preferences and&amp;nbsp;settings.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Performance cookies&lt;br /&gt;
   Cookies that enhance the website’s performance, though that is not always what you think it means. For example, if the website is trying out a new feature, they may do A/B testing, giving one cohort of users the “A” interface and another cohort the “B” interface. Which cohort you are in is decided at random, and remembered with a performance&amp;nbsp;cookie.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Advertising cookies&lt;br /&gt;
   Just explained in Issues 73 and&amp;nbsp;74.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
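&lt;p&gt;The difference between the first two categories comes down to whether the cookie carries an expiry. Here is a rough sketch using Python's standard http.cookies module; the cookie names, values, and the helper function are hypothetical. A cookie with no Max-Age (or Expires) attribute is treated by browsers as a session cookie, while one with Max-Age persists for that many seconds.&lt;/p&gt;

```python
from http.cookies import SimpleCookie

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def build_set_cookie(name: str, value: str, max_age_seconds=None) -> str:
    """Build a Set-Cookie header line.

    With no max_age_seconds the cookie has no expiry attribute, so a browser
    treats it as a session cookie. With max_age_seconds it becomes persistent.
    """
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    if max_age_seconds is not None:
        cookie[name]["max-age"] = max_age_seconds
    return cookie.output(header="Set-Cookie:")


# Hypothetical cookies: survey progress (session) and a shopping cart (persistent).
session_header = build_set_cookie("survey_page", "3")
persistent_header = build_set_cookie("cart_id", "abc123", ONE_YEAR_SECONDS)
print(session_header)
print(persistent_header)
```

&lt;p&gt;The other four categories describe what a cookie is used for, not how long it lasts, so nothing in the cookie itself marks it as “functional” or “advertising”.&lt;/p&gt;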
&lt;h2&gt;Caveats&lt;/h2&gt;
&lt;p&gt;I think it is only responsible for me to point out here that the above categorisation is not exactly enforced by law, and nothing stops a company from miscategorising their cookies so as to mislead a user into enabling them. For instance, some sites may categorise a cookie for tracking identity as a functional cookie, justifying it by claiming it as part of their security measures, and thereby require the user to enable third-party functional cookies before they are able to use the&amp;nbsp;site.&lt;/p&gt;
&lt;h2&gt;Objections to internet-wide disabling of third-party&amp;nbsp;cookies&lt;/h2&gt;
&lt;p&gt;It would come as no surprise that ad companies object to such measures, claiming it will “hurt the user experience”, “sabotage the economic model for the Internet”, and “disrupt the valuable digital advertising ecosystem that funds much of today’s digital content and services”. (The quoted parts come from an &lt;a href="https://www.patentlyapple.com/patently-apple/2017/09/ad-groups-send-an-open-letter-to-apple-objecting-to-the-new-intelligent-tracking-prevention-setting-in-safari.html"&gt;open letter from the Digital Advertising Community to Apple Inc.&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;Other websites have chimed in with the above concerns about disrupted provision of third-party services (X-as-a-Service providers e.g. Software-as-a-Service especially). Right now the shakeout is happening, with the browsers working out an alternative to third-party cookies, software service providers working out alternatives to cookies for providing services, and ad companies finding more subtle ways to track users. It remains to be seen what the Internet will be using in the next 5&amp;nbsp;years.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Cookies with the same domain as the site are first-party cookies, while cookies with domains different from the site are third-party cookies. Cookies are used for all kinds of purposes, from remembering browsing sessions, to logging users in, to tracking their identity across websites. Blocking all third-party cookies indiscriminately can result in most if not all of these functions&amp;nbsp;breaking.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 77: Wearing clothes on the&amp;nbsp;Internet&lt;/p&gt;
&lt;p&gt;Today we wear clothes for all kinds of reasons: to look cool, to cover ourselves up, to feel comfy … but I suppose in more prehistoric times, the primary purposes of clothing were more basic: to protect one from the elements, and to hide&amp;nbsp;information.&lt;/p&gt;
&lt;p&gt;What kind of information? Wounds, vulnerabilities, illnesses, recognisable features (e.g. tattoos), sometimes even sex … all of these are information that people sought to hide from each other in popular fiction, and presumably in real life as&amp;nbsp;well.&lt;/p&gt;
&lt;p&gt;Whether you believe in sharing about yourself openly or only sharing what is necessary, nobody today goes around naked (with the exception of nudist communities). Yet, as recently as ten years ago, we were doing the equivalent on the Internet: any information that websites requested about us was given freely, with few restrictions if&amp;nbsp;any.&lt;/p&gt;
&lt;p&gt;Today, with advertisers and other data-mining companies tracking you everywhere you go, with malicious hackers, phishers, and scammers waiting to snare unsuspecting users, and with more at stake being tied to your personal digital identity, we have to do&amp;nbsp;better.&lt;/p&gt;
&lt;p&gt;We have to wear clothes on the&amp;nbsp;Internet.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry><entry><title>Issue 75: The Costs of Data Leakage</title><link href="https://ngjunsiang.github.io/laymansguide/issue075.html" rel="alternate"></link><published>2020-06-06T08:00:00+08:00</published><updated>2020-06-06T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-06-06:/laymansguide/issue075.html</id><summary type="html">&lt;p&gt;By not enforcing strict cookie policies on their own sites, publishers allowed advertisers to sneakily set cookies on their site audience. This allowed advertisers to reach the same audience via their advertising slots on other websites, which could be bought more cheaply. The publishers were cut out of the value chain and were no longer “gatekeepers” to their own site readers. They could not sell their advertising slots at a&amp;nbsp;premium.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Data companies use the data they have gathered to determine what ads to serve you when you visit sites that load their cookie-setting scripts. This data is sent from your browser via a document request, or via a tracking pixel&amp;nbsp;request.&lt;/p&gt;
&lt;h2&gt;Content&amp;nbsp;adjacency&lt;/h2&gt;
&lt;p&gt;Ads used to be much more discriminating: You would publish only certain kinds of ads in Playboy magazine, another kind of ad in The New Yorker, and yet another kind in The New York Times. This, of course, was seldom dictated by the publishers themselves; mostly, the advertisers self-selected where they would like their ads to appear. &lt;span class="caps"&gt;IBM&lt;/span&gt; wouldn’t publish their ads in Playboy; they wouldn’t reach their target group this way, and their ad spending would be&amp;nbsp;wasted.&lt;/p&gt;
&lt;p&gt;This idea was known as content adjacency: to reach your target group, you want to place your ads next to content that they would read. Content adjacency gave publishers a lot of power, since they were the gatekeepers to published&amp;nbsp;ads.&lt;/p&gt;
&lt;p&gt;But today, that power has mostly leaked away, to ad exchanges. The ads on HuffPost, &lt;span class="caps"&gt;NYT&lt;/span&gt;, and just about any newspaper look largely similar. These advertising slots are sold to ad exchanges, which decide (through automated bidding) which ads to display to the viewer; no two viewers see the same set of ads. Content adjacency is irrelevant here. The power of ad filtering now lies not with the publishers, but with the ad&amp;nbsp;exchanges.&lt;/p&gt;
&lt;h2&gt;The danger of advertising: cookie&amp;nbsp;leakage&lt;/h2&gt;
&lt;p&gt;In &lt;a href="https://ngjunsiang.github.io/laymansguide/issue071.html"&gt;Issue 71&lt;/a&gt;, I mentioned that part of the value QuantCast brought to the table was that in exchange for letting them put a cookie on your site, they would also tell you more about your audience—far more than you could ever know collecting information on your&amp;nbsp;own.&lt;/p&gt;
&lt;p&gt;But here’s the thing: it is very hard for a website’s publisher to know when an advertiser is setting a cookie. When an advertiser is allowed to put advertisements on a website, you are tacitly allowing them to put in a script that is &lt;strong&gt;supposed&lt;/strong&gt; to request an ad from the ad server (after getting a winning bid from the ad exchange). This script could easily, at the same time, set a cookie and return cookie data along with that&amp;nbsp;request.&lt;/p&gt;
&lt;p&gt;The only way to catch this is to load the page yourself, compare the site data before and after, and see if any cookies are being set. You could automate this, but you’ll need resources to run that regularly on every webpage you publish—resources that publishers were loath to spend to protect their data and their&amp;nbsp;readers.&lt;/p&gt;
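&lt;p&gt;The before-and-after comparison itself is simple; the expensive part is loading every page regularly. As a minimal sketch, assume you have somehow captured two snapshots of the browser's cookies as name-to-value mappings (how you capture them is left out here, and the cookie names are made up):&lt;/p&gt;

```python
def cookies_set_by_page(before: dict, after: dict) -> dict:
    """Return cookies that are new or changed after loading a page."""
    return {
        name: value
        for name, value in after.items()
        if before.get(name) != value
    }


# Hypothetical snapshots of the cookie jar (cookie name -> value).
before_load = {"session": "s1"}
after_load = {"session": "s1", "adtrack_id": "u-98765"}
print(cookies_set_by_page(before_load, after_load))
```

&lt;p&gt;Anything that shows up in the result and was not set by your own code deserves a closer look.&lt;/p&gt;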
&lt;h2&gt;The danger of cookie leakage: audience&amp;nbsp;leakage&lt;/h2&gt;
&lt;p&gt;Why would advertisers want to sneak cookies like this? Let me put it this way: nobody ever uses the Internet just for reading The &lt;span class="caps"&gt;NYT&lt;/span&gt;. &lt;span class="caps"&gt;NYT&lt;/span&gt; readers might head to Facebook to see how their friends are doing (and view Facebook ads), they might send out some angry tweets on Twitter (and see Twitter ads), they might head to Amazon or Barnes &lt;span class="amp"&gt;&amp;amp;&lt;/span&gt; Noble or any number of sites to do the&amp;nbsp;necessaries.&lt;/p&gt;
&lt;p&gt;And these readers can be reached on these other sites if the advertisers buy advertising slots with them. Advertisers no longer need to rely on The &lt;span class="caps"&gt;NYT&lt;/span&gt; to reach a particular class of consumers. If The &lt;span class="caps"&gt;NYT&lt;/span&gt; thought they could charge a premium for their advertising slots because of their exclusive reach to upper-class readers, they no longer have that advantage. Those readers are tied to a cookie &lt;span class="caps"&gt;ID&lt;/span&gt; now, not to a website &lt;span class="caps"&gt;URL&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;The publishers were being cut out of the value&amp;nbsp;chain.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; By not enforcing strict cookie policies on their own sites, publishers allowed advertisers to sneakily set cookies on their site audience. This allowed advertisers to reach the same audience via their advertising slots on other websites, which could be bought more cheaply. The publishers were cut out of the value chain and were no longer “gatekeepers” to their own site readers. They could not sell their advertising slots at a&amp;nbsp;premium.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 76: Third-parties and cross-site&amp;nbsp;resources&lt;/p&gt;
&lt;p&gt;One way that web browsers and privacy advocates are trying to protect users is by pushing for stricter third-party cookie restrictions. Firefox started blocking third-party cookies by default in Sep 2019, Safari started doing so in Apr 2020, and Chrome intends to do so from&amp;nbsp;2022.&lt;/p&gt;
&lt;p&gt;Many sites are against this, arguing that it will break some “basic internet functionality”. What is this furore about? I’ll explain what third-party cookies and resources are in the next issue, and summarise some of the objections that sites are&amp;nbsp;raising.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry><entry><title>Issue 74: The Walls Have Pixels</title><link href="https://ngjunsiang.github.io/laymansguide/issue074.html" rel="alternate"></link><published>2020-05-30T08:00:00+08:00</published><updated>2020-05-30T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-05-30:/laymansguide/issue074.html</id><summary type="html">&lt;p&gt;There are two ways your browser can send cookies back to the&amp;nbsp;server:&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; When a page loads advertisements through header bidding, it sends your cookie along with other information to an ad exchange. The ad exchange conducts automated bidding among the ad-buyers, determines the winner(s), and sends the winning code(s) back to your browser. Your browser then sends these codes to the &lt;strong&gt;&lt;span class="caps"&gt;CDN&lt;/span&gt;&lt;/strong&gt;, which sends back the winning ads for your page to render in your&amp;nbsp;browser.&lt;/p&gt;
&lt;p&gt;So how does Facebook know what you just bought on Amazon? I hope the previous post sheds some light on that. But not everything is a web browser, and not everything uses cookies (especially apps). This post is about another way that your data gets shuttled along to whoever has a data-sharing agreement with the site you are&amp;nbsp;on.&lt;/p&gt;
&lt;h2&gt;Tracking pixels: another way of sending&amp;nbsp;information&lt;/h2&gt;
&lt;p&gt;Even if you disable third-party tracking cookies and javascript that didn’t originate from the same page, information about where you went can still be sent to these servers. Can you guess&amp;nbsp;how?&lt;/p&gt;
&lt;p&gt;Obviously when you loaded the page, some information already went to the server to tell it what your browser wants. But beyond that, have you ever wondered about the images that get&amp;nbsp;loaded?&lt;/p&gt;
&lt;p&gt;Let’s revisit HuffPost again, this time filtering only for image&amp;nbsp;loads:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of DevTools in Vivaldi browser, filtered to show only image loads." src="https://ngjunsiang.github.io/laymansguide/issue074_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;Chrome DevTools showing filtered image requests.&lt;br /&gt;A request for a tracking pixel is highlighted in&amp;nbsp;blue.&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;Hmm … why does an image request need to be so long? Anytime you see a long &lt;span class="caps"&gt;URL&lt;/span&gt; like that, with&amp;nbsp;a &lt;code&gt;?&lt;/code&gt; after the &lt;span class="caps"&gt;URL&lt;/span&gt; proper, and peppered&amp;nbsp;with &lt;code&gt;&amp;amp;&lt;/code&gt;s&amp;nbsp;and &lt;code&gt;=&lt;/code&gt;s, alarm bells should be going off in your head: data is being sent to the server (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue070.html"&gt;Issue 70&lt;/a&gt;)!&lt;/p&gt;
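&lt;p&gt;If you want to see what such a URL is carrying, you can split it apart yourself. The sketch below uses Python's standard urllib.parse module; the tracker address and parameter names are invented for illustration and are not taken from the screenshot.&lt;/p&gt;

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def extract_sent_data(url: str) -> dict:
    """Return the key=value pairs that follow the ? in a URL."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; keep the first value for readability.
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical tracking-pixel URL; urlencode joins the pairs with ampersands.
pixel_url = "https://tracker.example/pixel.gif?" + urlencode(
    {"uid": "abc123", "page": "/news/article-1", "ref": "search"}
)
print(pixel_url)
print(extract_sent_data(pixel_url))
```

&lt;p&gt;Each key and value recovered this way is a piece of data your browser handed to that server just by asking for the image.&lt;/p&gt;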
&lt;p&gt;Let’s see what this image looks&amp;nbsp;like:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Vivaldi browser tab showing a tracking pixel." src="https://ngjunsiang.github.io/laymansguide/issue074_02.png" /&gt;&lt;br /&gt;
&lt;em&gt;This is a tracking pixel.&lt;br /&gt;You can’t see it. The image info sidebar shows that its dimensions are 1×1&amp;nbsp;pixels.&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;Wha—?!&lt;/p&gt;
&lt;p&gt;What is your browser doing, loading a useless 1×1 image? If it appears to be doing something useless, you’re not looking in the right place. The image itself is clearly useless; it’s just a way to get your browser to send information to a&amp;nbsp;server.&lt;/p&gt;
&lt;h2&gt;Tracking pixels work hand in hand with&amp;nbsp;cookies&lt;/h2&gt;
&lt;p&gt;This request for the tracking pixel was sent from a script. My cookie information was embedded in the request &lt;span class="caps"&gt;URL&lt;/span&gt; when it was sent. So a tracking pixel is another mechanism for sending cookies, besides sending a generic document request via the script like we saw in &lt;a href="https://ngjunsiang.github.io/laymansguide/issue070.html"&gt;Issue 70&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you have a popular website, ad exchanges will ask to pay you to put their ads on your website. These ads are served after the user’s browser sends the user’s cookie to the ad exchange, which triggers an automated bidding process. The winning bid gets sent to the &lt;span class="caps"&gt;CDN&lt;/span&gt; (content delivery network), which serves the ads (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue073.html"&gt;Issue 73&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;On the other hand, data companies don’t serve ads. They usually ask to put a tracking pixel on your website, which means they ask you to put in their script. This script will scrape whatever data it can about the page the user is on and related user activity, and embed it in the pixel request along with the user’s&amp;nbsp;cookie.&lt;/p&gt;
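&lt;p&gt;What such a script does can be sketched in a few lines. Real tracking scripts run as javascript in the browser and are usually obfuscated, so this Python version is only an illustration of the idea; the endpoint, parameter names, and page data are all hypothetical.&lt;/p&gt;

```python
from urllib.parse import urlencode


def build_pixel_url(endpoint: str, cookie_id: str, page_data: dict) -> str:
    """Embed a cookie ID and scraped page data in an image request URL."""
    params = {"cid": cookie_id, **page_data}
    return endpoint + "?" + urlencode(params)


# Hypothetical values a script might collect from the page the user is on.
url = build_pixel_url(
    "https://data-company.example/p.gif",
    cookie_id="abc123",
    page_data={"url": "https://shop.example/cart", "cart_items": 2},
)
print(url)
```

&lt;p&gt;Requesting that address as an image is all it takes: the server logs the parameters and replies with a 1×1 pixel.&lt;/p&gt;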
&lt;p&gt;When you visit Facebook, it looks up your cookie and sees if you have been visiting any websites recently, or left any shopping carts un-checked-out. Then it knows what ads to serve you&amp;nbsp;:)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; There are two ways your browser can send cookies back to the&amp;nbsp;server:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;By sending an &lt;span class="caps"&gt;HTTP&lt;/span&gt; &lt;em&gt;document&lt;/em&gt; request (known as an &lt;strong&gt;&lt;span class="caps"&gt;XHR&lt;/span&gt;&lt;/strong&gt;, short for XMLHttpRequest) which usually returns a chunk of text&amp;nbsp;data,&lt;/li&gt;
&lt;li&gt;By sending an &lt;span class="caps"&gt;HTTP&lt;/span&gt; &lt;em&gt;image&lt;/em&gt; request which usually returns a 1×1 pixel, known as a &lt;strong&gt;tracking pixel&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Data companies use the data they have gathered to determine what ads to serve you when you visit sites that load their cookie-setting&amp;nbsp;scripts.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 75: The Costs of Data&amp;nbsp;Leakage&lt;/p&gt;
&lt;p&gt;Notice at this point that ad and data companies are still more concerned with &lt;strong&gt;what you are doing&lt;/strong&gt;, not &lt;strong&gt;who you are&lt;/strong&gt;. That’s right; they don’t gather names, credit card numbers, and the like; that is useless for serving&amp;nbsp;ads!&lt;/p&gt;
&lt;p&gt;I kind of did some time travel in the past few issues. One moment, it was 2006 and ad companies were still just serving static images with some request tags and QuantCast had just discovered the power of the cookie. The next moment, there are a gazillion ad companies and a billion ad exchanges all bidding to serve ads before your eyeballs. How did this happen so&amp;nbsp;abruptly?&lt;/p&gt;
&lt;p&gt;It didn’t. Not abruptly anyway, but quite rapidly. The costs of data leakage have already been paid, not by us but by the websites. They have been greatly diminished in the value chain, replaced by ad exchanges which have sopped up most of the profit of advertising like a&amp;nbsp;sponge.&lt;/p&gt;
&lt;p&gt;Next issue, I’ll describe how this&amp;nbsp;happened.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category><category term="document"></category></entry><entry><title>Issue 73: The Heart of Darkness (Header Bidding)</title><link href="https://ngjunsiang.github.io/laymansguide/issue073.html" rel="alternate"></link><published>2020-05-23T08:00:00+08:00</published><updated>2020-05-23T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-05-23:/laymansguide/issue073.html</id><summary type="html">&lt;p&gt;When a page loads advertisements through header bidding, it sends your cookie along with other information to an &lt;strong&gt;ad exchange&lt;/strong&gt;. The ad exchange conducts automated bidding among the ad-buyers, determines the winner(s), and sends the winning code(s) back to your browser. Your browser then sends these codes to the &lt;strong&gt;&lt;span class="caps"&gt;CDN&lt;/span&gt;&lt;/strong&gt;, which sends back the winning ads for your page to render in your&amp;nbsp;browser.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; QuantCast gathers a large amount of data on internet users directly through its &lt;strong&gt;cookie&lt;/strong&gt; (which other publishers serve through their websites), and also by cross-checking it against data which it purchases from other &lt;strong&gt;data brokers&lt;/strong&gt; who gather their information through other means, such as internet activity and credit card&amp;nbsp;transactions.&lt;/p&gt;
&lt;p&gt;What exactly does QuantCast do with all this&amp;nbsp;data?&lt;/p&gt;
&lt;p&gt;I’ll take a classic ad-infested website as an example: &lt;a href="https://www.huffpost.com/"&gt;HuffPost&lt;/a&gt;. HuffPost may not &lt;em&gt;look&lt;/em&gt; ad-infested, but peek under the hood and it will look different to those who know what to look for. If you just take a quick skim through the &lt;a href="view-source:https://www.huffpost.com/"&gt;website’s source code&lt;/a&gt;, you will see that almost a third of the website is just javascript&amp;nbsp;loading!&lt;/p&gt;
&lt;p&gt;Do a search&amp;nbsp;for &lt;code&gt;&amp;lt;head&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;/head&amp;gt;&lt;/code&gt;, then&amp;nbsp;for &lt;code&gt;&amp;lt;body&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;/body&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;Webpage loading: header, followed by&amp;nbsp;body&lt;/h2&gt;
&lt;p&gt;The section flanked&amp;nbsp;by &lt;code&gt;&amp;lt;head&amp;gt;&amp;lt;/head&amp;gt;&lt;/code&gt; is the page header. This is the most important section of the page for everyone else besides the reader. When a page is requested by the browser, the &lt;span class="caps"&gt;HTML&lt;/span&gt; code for the entire page is retrieved. But it is not rendered all at&amp;nbsp;once.&lt;/p&gt;
&lt;p&gt;The browser starts processing the page header first. It looks at all the file requests: &lt;span class="caps"&gt;CSS&lt;/span&gt; files (for styling the page), fonts (for formatting text), javascript code (for running code to make the page responsive and for loading cookies (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue070.html"&gt;Issue 70&lt;/a&gt;)), etc. It sends off another round of requests for each of these resources. The rest of the page (flanked by&amp;nbsp;the &lt;code&gt;&amp;lt;body&amp;gt;&amp;lt;/body&amp;gt;&lt;/code&gt; tags) does not start rendering until critical files have been&amp;nbsp;retrieved.&lt;/p&gt;
&lt;p&gt;Often, the javascript code is considered critical, because some of it actually changes the page body or affects what is loaded. It is therefore placed in the page header and loaded first before the body is&amp;nbsp;rendered.&lt;/p&gt;
&lt;p&gt;Normally, on a non-advertising page, the page header is very short: just the page title, some metadata (to tell Google’s bot what the page is about for ranking in searches), some fonts, some &lt;span class="caps"&gt;CSS&lt;/span&gt;, and a bit of javascript to spice up page interactions. That’s it. A fancy photo carousel or other features will involve a bit more code, but still not a whole&amp;nbsp;lot.&lt;/p&gt;
&lt;h2&gt;How ad bidding&amp;nbsp;works&lt;/h2&gt;
&lt;p&gt;When advertising comes into the mix, the information flow gets much more complicated. The header loads an ad script that passes the cookie (embedded in the page), along with any other relevant information (type of website, device info, etc&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt;) to the advertising&amp;nbsp;exchange.&lt;/p&gt;
&lt;p&gt;What does an exchange do? It matches this cookie to its huge database of cookies, and then it conducts an auction. “Here’s a user browsing New York Times! *Looks up user in database* Probably a woke young twenty-something, good credit history, into yoga, and health-fad-ish.” So it’s pretty much like a marketplace, but one that you cannot participate directly in. It’s actually automated&amp;nbsp;bidding.&lt;/p&gt;
&lt;p&gt;The ad-buyers bid. These bids are not placed on the spot, but set in advance (through the advertisers’ dashboards, or &lt;a href="https://ngjunsiang.github.io/laymansguide/issue004.html"&gt;through an &lt;span class="caps"&gt;API&lt;/span&gt;&lt;/a&gt;). Higher bids win over lower bids, but more relevant bids win over less relevant&amp;nbsp;bids.&lt;/p&gt;
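&lt;p&gt;I don't know how any particular exchange weighs price against relevance, but one simple way to picture it is to score each bid as amount times relevance and pick the highest score. That scoring rule is my assumption for illustration, as are the advertisers and numbers below.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str
    amount: float     # price the ad-buyer set in advance
    relevance: float  # 0.0 to 1.0, how well the ad matches the user's profile


def pick_winner(bids: list) -> Bid:
    """Pick the bid with the highest amount-times-relevance score.

    The scoring rule is an illustrative assumption, not a real exchange's formula.
    """
    return max(bids, key=lambda bid: bid.amount * bid.relevance)


# Hypothetical bids for one page view.
bids = [
    Bid("car-insurance", amount=3.00, relevance=0.3),
    Bid("yoga-mats", amount=2.00, relevance=0.9),
]
winner = pick_winner(bids)
print(winner.advertiser)
```

&lt;p&gt;Here the lower but more relevant bid wins, which matches the idea that relevance can outweigh price.&lt;/p&gt;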
&lt;p&gt;The advertiser’s server sends the winning bid code back to your browser. Then another piece of the advertiser’s javascript code kicks in, sending this code to the advertiser’s &lt;strong&gt;content delivery network (&lt;span class="caps"&gt;CDN&lt;/span&gt;)&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Yup, online ads need specialised servers to do different things. The &lt;strong&gt;ad exchange&lt;/strong&gt; carries out the bidding and determines the winner (much like a stock exchange); this requires intensive &lt;span class="caps"&gt;CPU&lt;/span&gt; calculations and low latency connections. The &lt;strong&gt;&lt;span class="caps"&gt;CDN&lt;/span&gt;&lt;/strong&gt;, on the other hand, is a global network of servers that keep the content ready to deliver. Servers in the &lt;span class="caps"&gt;US&lt;/span&gt; can get content to &lt;span class="caps"&gt;US&lt;/span&gt; web browsers most quickly, while servers in South-east Asia are better placed to serve Southeast Asian&amp;nbsp;browsers.&lt;/p&gt;
&lt;p&gt;These servers continually talk to each other or to a coordinating server, which determines what content should be on each server depending on the demand from each region. Each regional server caches the most frequently requested ads and cat images in the server memory (which is quick to access), leaving the rest in hard disk or solid state storage (which is slower to&amp;nbsp;access).&lt;/p&gt;
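&lt;p&gt;The memory-versus-disk split can be sketched as a popularity count. This is a toy model assuming a fixed number of memory slots and a plain request counter; real CDNs use far more elaborate caching policies, and the class and file names here are hypothetical.&lt;/p&gt;

```python
from collections import Counter


class RegionalCache:
    """Toy model: the most requested items live in memory, the rest on disk."""

    def __init__(self, memory_slots: int):
        self.memory_slots = memory_slots
        self.request_counts = Counter()

    def record_request(self, item: str) -> None:
        """Count one request for an item."""
        self.request_counts[item] += 1

    def tier_for(self, item: str) -> str:
        """Return 'memory' if the item is among the most requested, else 'disk'."""
        hottest = {name for name, _ in self.request_counts.most_common(self.memory_slots)}
        return "memory" if item in hottest else "disk"


# Hypothetical request log for one regional server with a single memory slot.
cache = RegionalCache(memory_slots=1)
for item in ["ad-banner.png", "cat.jpg", "ad-banner.png", "ad-banner.png"]:
    cache.record_request(item)
print(cache.tier_for("ad-banner.png"))
print(cache.tier_for("cat.jpg"))
```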
&lt;p&gt;These servers are configured for high bandwidth (to serve as many images as quickly as possible) and with large memory + storage&amp;nbsp;space.&lt;/p&gt;
&lt;p&gt;This is what that invisible one-third of the page is&amp;nbsp;doing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; When a page loads advertisements through header bidding, it sends your cookie along with other information to an &lt;strong&gt;ad exchange&lt;/strong&gt;. The ad exchange conducts automated bidding among the ad-buyers, determines the winner(s), and sends the winning code(s) back to your browser. Your browser then sends these codes to the &lt;strong&gt;&lt;span class="caps"&gt;CDN&lt;/span&gt;&lt;/strong&gt;, which sends back the winning ads for your page to render in your&amp;nbsp;browser.&lt;/p&gt;
&lt;p&gt;Phew, that’s as briefly as I can describe ad exchanges and CDNs. (One more long-running question answered, yay!) You may or may not be surprised at what is going on at the backend, but often people don’t expect that so much of the internet backend is actually dedicated to just serving ads. But it’s true. The services you have come to rely on—this is the price we pay for them to be&amp;nbsp;“free”.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 74: The Walls Have&amp;nbsp;Pixels&lt;/p&gt;
&lt;p&gt;It gets worse … after ad exchanges came about in the mid-2010s, second-order effects were responsible for much of the data leakage and privacy concerns that hog the headlines of some publications today. I’ll explore a couple of them in the next two&amp;nbsp;issues.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;del&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue 8]&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="footnote"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;It’s hard to know what exactly is going on because the javascript is often obfuscated with all kinds of codes and renaming. Only folks in the industry will be able to tell you what exactly is going on in their backend, and even then they might not be able to tell you what exactly a competitor is doing.&amp;#160;&lt;a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</content><category term="Season 06"></category><category term="cache"></category></entry><entry><title>Issue 72: The Data Brokers</title><link href="https://ngjunsiang.github.io/laymansguide/issue072.html" rel="alternate"></link><published>2020-05-16T08:00:00+08:00</published><updated>2020-05-16T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-05-16:/laymansguide/issue072.html</id><summary type="html">&lt;p&gt;QuantCast gathers a large amount of data on internet users directly through its &lt;strong&gt;cookie&lt;/strong&gt; (which other publishers serve through their websites), and also by cross-checking it against data which it purchases from other &lt;strong&gt;data brokers&lt;/strong&gt; who gather their information through other means, such as internet activity and credit card&amp;nbsp;transactions.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; In 2006, Quantcast offered complete audience analytics for any site that puts &lt;em&gt;their&lt;/em&gt; cookie on the site. In this way, they managed to gather information on a wider audience than they, or any single website, could reach on their&amp;nbsp;own.&lt;/p&gt;
&lt;p&gt;I’m almost about to begin talking about Quantcast’s proposition to advertisers, and how that led to the ad exchange, and what an ad exchange is, but to avoid confusing you, I had better talk about data brokers and what they are&amp;nbsp;first.&lt;/p&gt;
&lt;h2&gt;The demand for data, and its&amp;nbsp;players&lt;/h2&gt;
&lt;p&gt;There is a huge market out there for data. In a way reminiscent of the slave trade of the 16th to 19th centuries, in which people were being auctioned and sold and shipped to countries far from their homeland, data today is being sold in data markets, copied to places far from its point of origin, and used to put together profiles of consumers. Who are these data&amp;nbsp;brokers?&lt;/p&gt;
&lt;p&gt;Some are sources of information: subscription lists of email addresses to free journals and magazines, (anonymised) credit card activity (how much money spent where by what income bracket), your social media clicks and likes and other activity, your web browsing history, even your mobile device telemetry data (coming from a data-mining app disguised as a mobile game which you unwittingly downloaded). They sell this data to other third parties, or to advertisers directly&amp;nbsp;(rare).&lt;/p&gt;
&lt;p&gt;Some are middlemen: third-party brokers who offer a consultancy-like service: they buy information, recompile it into profiles that are more legible to advertisers, and then resell this&amp;nbsp;information.&lt;/p&gt;
&lt;p&gt;Some are end-buyers: insurance and other risk-management companies, investigation firms, fraud detection services, … just about any company that may need information on a person or category of&amp;nbsp;consumer.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://www.fastcompany.com/90310803/here-are-the-data-brokers-quietly-buying-and-selling-your-personal-information"&gt;FastCo has a (non-exhaustive) A–W list&lt;/a&gt; of some of these companies, if you’d like a more detailed&amp;nbsp;sampling.&lt;/p&gt;
&lt;p&gt;QuantCast, in effect, was &lt;em&gt;acting&lt;/em&gt; like a data broker (though it didn’t buy or sell this information; it gathered it directly through its&amp;nbsp;cookie).&lt;/p&gt;
&lt;h2&gt;The data QuantCast&amp;nbsp;gathers&lt;/h2&gt;
&lt;p&gt;The end result looks like what a soulless Santa Claus would have managed to gather on its own. A Privacy International journalist sent a Data Subject Access Request to QuantCast for the data it has gathered on her[1]. By her own analysis, QuantCast has “amassed […] more than 46 columns worth of data including URLs, time stamps, &lt;span class="caps"&gt;IP&lt;/span&gt; addresses, cookies IDs, browser information and much more.” Furthermore, the data she received “suggest that [it was obtained through] data brokers like Acxiom and Oracle, but also MasterCard and credit referencing agencies like&amp;nbsp;Experian.”&lt;/p&gt;
&lt;p&gt;Interestingly enough, first name, last name, Social Security identification, and other personally identifying information are hardly collected. Such information is of little interest to advertisers; it is too specific and tells them nothing about whether an ad can be served to you, to extract another&amp;nbsp;click.&lt;/p&gt;
&lt;p&gt;[1]: QuantCast is legally obligated to fulfill &lt;a href="https://www.quantcast.com/privacy/data-subject-rights/"&gt;such requests&lt;/a&gt; under the terms of the &lt;span class="caps"&gt;GDPR&lt;/span&gt; legislation which was implemented in&amp;nbsp;2018.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; QuantCast gathers a large amount of data on internet users directly through its &lt;strong&gt;cookie&lt;/strong&gt; (which other publishers serve through their websites), and also by cross-checking it against data which it purchases from other &lt;strong&gt;data brokers&lt;/strong&gt; who gather their information through other means, such as internet activity and credit card&amp;nbsp;transactions.&lt;/p&gt;
&lt;p&gt;That’s … a lot of information, but how does it help advertisers? How does the engine of ad customisation work? All this and more in the next few&amp;nbsp;issues.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 73: The Heart of Darkness (Header&amp;nbsp;Bidding)&lt;/p&gt;
&lt;p&gt;What we have come to know as “targeted advertisements” are known in the advertising industry by other terms, depending on the mode of operation: sponsored search auction, real-time bidding, etc. Generally, they are known as &lt;strong&gt;header bidding&lt;/strong&gt;, because the code tag that triggers it is embedded in the header section of a&amp;nbsp;webpage.&lt;/p&gt;
&lt;p&gt;An entire cascade of bidding operations, reminiscent of eBay bidding but entirely automated, is completed in mere milliseconds; the smorgasbord of ads vying for your attention are shaken out, winners emerge, and are served into your browser view while you wait for the page to&amp;nbsp;load.&lt;/p&gt;
&lt;p&gt;Stay tuned next issue as we super-slo-mo this process to a speed you can&amp;nbsp;grasp.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry><entry><title>Issue 71: The Rise of Audience Analytics</title><link href="https://ngjunsiang.github.io/laymansguide/issue071.html" rel="alternate"></link><published>2020-05-09T08:00:00+08:00</published><updated>2020-05-09T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-05-09:/laymansguide/issue071.html</id><summary type="html">&lt;p&gt;In 2006, Quantcast offered complete audience analytics for any site that puts their cookie on the site. Websites would know more about their audience than they could otherwise gather through their site alone. But Quantcast would make most of their money through their offering to&amp;nbsp;advertisers.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; A tracking script retrieves the existing cookie on a web domain if there is one, or sets a cookie on a webpage if there isn’t an existing one. The tracking script sends the cookie information back to the originating server, along with many other fragments of&amp;nbsp;information.&lt;/p&gt;
&lt;p&gt;A quick refresher (from &lt;a href="https://ngjunsiang.github.io/laymansguide/issue068.html"&gt;Issue 68&lt;/a&gt;): it is 2006. The market had just recovered and shaken itself out from the dot-com bust, which started at the turn of the century and lasted about two&amp;nbsp;years.&lt;/p&gt;
&lt;p&gt;Post-bust, the remaining companies quickly realised that throwing money blindly was not the way. They needed to target audiences more specifically. Google led the charge with their &lt;span class="caps"&gt;IPO&lt;/span&gt; in 2004, demonstrating that targeted search actually brought more users. (They introduced similar ideas to their advertising arm, AdWords.) The race began: Facebook, YouTube, Twitter, and many more. Even the news was going online. And then the iPhone launched in 2007, sparking off the mobile Internet wave, and the rise of mobile&amp;nbsp;apps.&lt;/p&gt;
&lt;p&gt;These companies all had the same problem: they only knew what users did on &lt;em&gt;their&lt;/em&gt; site, but not what these users did &lt;em&gt;around the Internet&lt;/em&gt;. Each company set its own cookie and tracked its own cookie, and managed its own analytics (or an analytics company did it for&amp;nbsp;them).&lt;/p&gt;
&lt;p&gt;Then Quantcast got thinking: what if we could get these cookies synchronised? Better yet, what if we could get all these companies to load our tracking script on their sites (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue070.html"&gt;Issue 70&lt;/a&gt;) and thereby put &lt;em&gt;our cookie&lt;/em&gt; on their sites? We would be able to gather cross-site data and build a more complete profile of the&amp;nbsp;audience!&lt;/p&gt;
&lt;p&gt;Now, why would these companies agree to that? There has to be some upside for them. The only thing Quantcast had to offer them was the very information it had gathered: in return for putting our cookie on your site, Quantcast would offer you demographic analytics on your site audience, more complete than you could ever hope to build by&amp;nbsp;yourself.&lt;/p&gt;
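&lt;p&gt;The cross-site idea can be sketched in a few lines of Python. This is only an illustration, not Quantcast’s actual system: the cookie IDs and site names below are made up, and the only point is that one shared cookie lets a single party see the same visitor on many sites.&lt;/p&gt;

```python
from collections import defaultdict

# Hypothetical page-view log: (shared cookie ID, site that embedded the script).
# Both the IDs and the site names are invented for this sketch.
page_views = [
    ("visitor-a", "news.example"),
    ("visitor-a", "shop.example"),
    ("visitor-b", "news.example"),
    ("visitor-a", "forum.example"),
]


def sites_per_visitor(views):
    """Group the sites on which each cookie ID was seen."""
    profile = defaultdict(set)
    for cookie_id, site in views:
        profile[cookie_id].add(site)
    return dict(profile)


profiles = sites_per_visitor(page_views)
print(sorted(profiles["visitor-a"]))
```

&lt;p&gt;Each site on its own would only see its own row; the party that owns the shared cookie sees all of them.&lt;/p&gt;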
&lt;p&gt;Demographic analytics can help these companies know if their website design and other features are helping them reach their desired target audience. But this alone would not have catapulted Quantcast into the&amp;nbsp;limelight.&lt;/p&gt;
&lt;p&gt;The true value of Quantcast’s cookie came when it was coupled to targeted online&amp;nbsp;advertising.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; In 2006, Quantcast offered complete audience analytics for any site that puts their cookie on the site. Websites would know more about their audience than they could otherwise gather through their site alone. But Quantcast would make most of their money through their offering to&amp;nbsp;advertisers.&lt;/p&gt;
&lt;p&gt;Another very short issue (phew!), which I hope explains how the unification of user data began. It is important to note that nobody was forced into this arrangement, at least not by the usual anti-competitive practices. Quantcast offered a product, companies that hopped on the bandwagon became highly successful at targeting specific audiences, and soon any company not doing that found itself unable to compete in the same&amp;nbsp;space.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 72: The Data&amp;nbsp;Brokers&lt;/p&gt;
&lt;p&gt;QuantCast does not do &lt;em&gt;all&lt;/em&gt; its own data gathering; it also gets information from other data providers, known as &lt;strong&gt;data brokers&lt;/strong&gt;. Let’s visit them next&amp;nbsp;issue.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry><entry><title>Issue 70: The Cookie Factory</title><link href="https://ngjunsiang.github.io/laymansguide/issue070.html" rel="alternate"></link><published>2020-05-02T08:00:00+08:00</published><updated>2020-05-02T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-05-02:/laymansguide/issue070.html</id><summary type="html">&lt;p&gt;When browsing a webpage, a tracking script retrieves the browser&amp;#8217;s existing cookie, if there is one, or sets a cookie for the browser if there isn’t one. The tracking script sends the cookie information back to the originating server, along with many other fragments of&amp;nbsp;information.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Cookies are little fragments of information with a name and a value, and associated with a domain address. They are most commonly used to identify new or returning users. This cookie is issued by a website upon the first visit, stored in the browser, and returned to the issuing server whenever the server requests&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;This issue is a short one, just to put one more piece in place. Last issue, I said&amp;nbsp;that &lt;code&gt;analytics.js&lt;/code&gt; loaded&amp;nbsp;a &lt;code&gt;_gid&lt;/code&gt; cookie with a value&amp;nbsp;of &lt;code&gt;GA1.2.1807773255.1584140066&lt;/code&gt;. At that point, the cookie only existed in my web browser. How did it get sent back to Google Analytics for&amp;nbsp;counting?&lt;/p&gt;
&lt;p&gt;Let’s watch what is happening with Chrome&amp;nbsp;DevTools:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of DevTools in Vivaldi browser, with a request by analytics.js highlighted." src="https://ngjunsiang.github.io/laymansguide/issue070_01.png" /&gt;&lt;br /&gt;&lt;small&gt;Chrome DevTools showing the (filtered) sequence of requests made by the webpage I loaded.&lt;br /&gt;
The request made&amp;nbsp;by &lt;code&gt;analytics.js&lt;/code&gt; (third-last line) is highlighted in gray. The Initiator column tells us this request was initiated&amp;nbsp;by &lt;code&gt;analytics.js&lt;/code&gt; on line 25 of the script.&lt;/small&gt;&lt;/p&gt;
&lt;p&gt;The full &lt;span class="caps"&gt;URL&lt;/span&gt; of the highlighted request&amp;nbsp;is &lt;code&gt;http://www.google-analytics.com/collect?v=1&amp;amp;_v=j81&amp;amp;a=227860763&amp;amp;t=pageview&amp;amp;_s=1&amp;amp;dl=http%3A%2F%2Fwww.adopsinsider.com%2Fad-serving%2Fhow-does-ad-serving-work%2F&amp;amp;ul=en-us&amp;amp;de=UTF-8&amp;amp;dt=How%20Ad%20Serving%20Works&amp;amp;sd=24-bit&amp;amp;sr=3840x2160&amp;amp;vp=1319x1284&amp;amp;je=0&amp;amp;_u=QACAAAAB~&amp;amp;jid=&amp;amp;gjid=&amp;amp;cid=184706471.1584140066&amp;amp;tid=UA-13115681-1&amp;amp;_gid=1807773255.1584140066&amp;amp;gtm=2wg340NLT927&amp;amp;z=1600454420&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;That’s unreadable for&amp;nbsp;humans!&lt;/p&gt;
&lt;p&gt;In layman’s&amp;nbsp;terms, &lt;code&gt;analytics.js&lt;/code&gt; sends a request to http://www.google-analytics.com (yup, unsecured transmission since it does not use &lt;span class="caps"&gt;HTTPS&lt;/span&gt;) with the following&amp;nbsp;information:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;v: 1
_v: j81
a: 227860763
t: pageview
_s: 1
dl: http://www.adopsinsider.com/ad-serving/how-does-ad-serving-work/
ul: en-us
de: UTF-8
dt: How Ad Serving Works
sd: 24-bit
sr: 3840x2160
vp: 1319x1284
je: 0
_u: QACAAAAB~
jid:
gjid:
cid: 184706471.1584140066
tid: UA-13115681-1
_gid: 1807773255.1584140066
gtm: 2wg340NLT927
z: 1600454420
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
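&lt;p&gt;If you want to do this decoding yourself, here is a minimal Python sketch using only the standard library. It takes a handful of the fields listed above (the rest are left out to keep it short), packs them into a query string the way a browser would, and then unpacks them again. It is an illustration of the encoding, not the code that &lt;code&gt;analytics.js&lt;/code&gt; actually runs.&lt;/p&gt;

```python
from urllib.parse import urlencode, urlsplit, parse_qsl

# A few of the fields from the request above; the others are omitted for brevity.
fields = {
    "t": "pageview",
    "dl": "http://www.adopsinsider.com/ad-serving/how-does-ad-serving-work/",
    "dt": "How Ad Serving Works",
    "_gid": "1807773255.1584140066",
}

# urlencode percent-encodes each value and joins the name=value pairs.
# (It writes spaces as "+", whereas the captured request used "%20";
# both decode back to a space.)
url = "http://www.google-analytics.com/collect?" + urlencode(fields)

# Reversing the process recovers the readable name/value pairs.
decoded = dict(parse_qsl(urlsplit(url).query))
for name, value in decoded.items():
    print(f"{name}: {value}")
```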

&lt;p&gt;See anything interesting there? Here, let me highlight it for&amp;nbsp;you:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;_gid: 1807773255.1584140066&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;Yup, &lt;code&gt;analytics.js&lt;/code&gt; sets a cookie if there isn’t one, or retrieves the existing cookie if there is one. It sends the cookie back&amp;nbsp;to &lt;code&gt;google-analytics.com&lt;/code&gt; with your cookie &lt;span class="caps"&gt;ID&lt;/span&gt;, so Google Analytics knows who is visiting the page and can count visitor stats for the&amp;nbsp;webpage.&lt;/p&gt;
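&lt;p&gt;The “set it if it isn’t there, otherwise reuse it” logic is simple enough to sketch in Python. Everything here is hypothetical: the function name, the dictionary standing in for the browser’s cookie store, and the way the value is generated (a random number plus a timestamp, chosen only because it resembles the value seen above).&lt;/p&gt;

```python
import random
import time


def get_or_set_cookie(cookies, name="_gid"):
    """Return the existing cookie value, or create and store a new one."""
    if name not in cookies:
        # Assumed format: random number, dot, Unix timestamp.
        cookies[name] = f"{random.randrange(10**9, 10**10)}.{int(time.time())}"
    return cookies[name]


browser_cookies = {}                           # first visit: no cookie yet
first = get_or_set_cookie(browser_cookies)     # a new value is created
second = get_or_set_cookie(browser_cookies)    # return visit: same value comes back
print(first == second)
```

&lt;p&gt;Because the same value comes back on every later visit, the server receiving it can tell a returning browser from a new one.&lt;/p&gt;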
&lt;p&gt;It makes sense for a webpage to&amp;nbsp;embed &lt;code&gt;analytics.js&lt;/code&gt; so that Google Analytics can help it count page visits. But why would a webpage allow Facebook and other ad services to put their cookies on a reader’s browser and then send it back to their own servers? Doesn’t that worsen the site experience? What is the benefit to&amp;nbsp;them?&lt;/p&gt;
&lt;p&gt;That is the key insight that Quantcast arrived&amp;nbsp;at.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; When browsing a webpage, a tracking script retrieves the browser&amp;#8217;s existing cookie, if there is one, or sets a cookie for the browser if there isn’t one. The tracking script sends the cookie information back to the originating server, along with many other fragments of&amp;nbsp;information.&lt;/p&gt;
&lt;p&gt;Short issue just to close the loop on cookie setting and returning. Enjoy the mental break!&amp;nbsp;:)&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 71: The Rise of Audience&amp;nbsp;Analytics&lt;/p&gt;
&lt;p&gt;When it comes to ad networks, there is the How aspect, and the Why aspect. The How aspect is almost hopelessly complicated, an ever-evolving race of advertisers vs ad-blockers, each trying to outdo the other. I will focus less on this aspect, and more on the Why aspect. I think it is more critical to understanding what information advertisers actually extract, and why it does not make any sense for them to want to know your personal&amp;nbsp;details.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry><entry><title>Issue 69: The Cookie Monster</title><link href="https://ngjunsiang.github.io/laymansguide/issue069.html" rel="alternate"></link><published>2020-04-25T08:00:00+08:00</published><updated>2020-04-25T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-04-25:/laymansguide/issue069.html</id><summary type="html">&lt;p&gt;Cookies are little fragments of information with a name and a value, and associated with a domain address. They are most commonly used to identify new or returning users. This cookie is issued by a website upon the first visit, stored in the browser, and returned to the issuing server whenever the server requests&amp;nbsp;it.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; The old &lt;span class="caps"&gt;CPM&lt;/span&gt; model (cost per thousand impressions) in the early Internet was replaced by the &lt;span class="caps"&gt;CPC&lt;/span&gt; model (cost per click) after the dot-com bust. But &lt;span class="caps"&gt;CPC&lt;/span&gt; only works well if publishers and advertisers could get users to click; they need to target advertisements accurately to users. QuantCast figured out a way to do so in&amp;nbsp;2006.&lt;/p&gt;
&lt;p&gt;How to do that? The key, it turns out, centres around&amp;nbsp;cookies.&lt;/p&gt;
&lt;h2&gt;Wait, what’s a&amp;nbsp;cookie?&lt;/h2&gt;
&lt;p&gt;When you visit any website in Chrome or Firefox, if you click on the icon to the left of the address&amp;nbsp;bar:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of website info popup in Vivaldi browser" src="https://ngjunsiang.github.io/laymansguide/issue069_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;Clicking the icon to the left of the address bar shows basic site&amp;nbsp;information&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;It shows you some basic information, including the cookies loaded by the&amp;nbsp;website.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of cookies in use popup in Vivaldi browser" src="https://ngjunsiang.github.io/laymansguide/issue069_02.png" /&gt;&lt;br /&gt;
&lt;em&gt;You can view the content of cookies through that window in Chrome or Vivaldi. This information is also available in other web browsers through a different menu&amp;nbsp;option.&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;The cookies themselves are just little fragments of information. They are identified with a name, they have a bunch of content (usually gibberish to humans), and they are associated with a website. Above, you can see that this website has a cookie&amp;nbsp;named &lt;code&gt;_gid&lt;/code&gt; with a value&amp;nbsp;of &lt;code&gt;GA1.2.1807773255.1584140066&lt;/code&gt;.&lt;/p&gt;
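&lt;p&gt;To make the “name, value, website” structure concrete, here is a small sketch using Python’s standard &lt;code&gt;http.cookies&lt;/code&gt; module. The name and value are the ones from the screenshot; the domain is an assumption made for illustration.&lt;/p&gt;

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["_gid"] = "GA1.2.1807773255.1584140066"   # name and value from the screenshot
cookie["_gid"]["domain"] = ".adopsinsider.com"    # assumed domain, for illustration only

morsel = cookie["_gid"]
print(morsel.key, morsel.value, morsel["domain"])
```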
&lt;p&gt;&lt;img alt="Screenshot of website source in Vivaldi browser. The script line that loads analytics.js is highlighted" src="https://ngjunsiang.github.io/laymansguide/issue069_03.png" /&gt;&lt;br /&gt;
&lt;em&gt;The script code used by Google Analytics is&amp;nbsp;named &lt;code&gt;analytics.js&lt;/code&gt;.&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;Little snippets of Javascript create and delete cookies. These snippets of Javascript are usually loaded as a script, with&amp;nbsp;a &lt;code&gt;.js&lt;/code&gt; file extension. The script code used by Google Analytics is&amp;nbsp;named &lt;code&gt;analytics.js&lt;/code&gt;.&lt;/p&gt;
&lt;h2&gt;What do cookies&amp;nbsp;do?&lt;/h2&gt;
&lt;p&gt;This cookie was loaded by &lt;a href="https://developers.google.com/analytics/devguides/collection/analyticsjs/cookies-user-id"&gt;analytics.js&lt;/a&gt; after the web browser ran the script. It is how Google Analytics identifies users on the website. The value stored in&amp;nbsp;the &lt;code&gt;_gid&lt;/code&gt; cookie is the client &lt;span class="caps"&gt;ID&lt;/span&gt; assigned by Google Analytics to identify a unique&amp;nbsp;user.&lt;/p&gt;
&lt;p&gt;Many bloggers and website owners rely on Google Analytics to tell them how much internet traffic their website is getting every month, which countries their visitors are from, what time of day they are most active, which search results are bringing these visitors to the site, and so&amp;nbsp;on.&lt;/p&gt;
&lt;p&gt;But each visit represents one browser loading the page; how do we know that’s not the same user repeatedly refreshing the page waiting for something to happen? (It happens on auction sites, or game sites, and many other&amp;nbsp;places).&lt;/p&gt;
&lt;p&gt;Whenever the webpage is loaded, the cookie information gets sent to the Google Analytics server. That is how Google Analytics knows it’s the same fella on the same browser doing it. The cookie associates each client (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue007.html"&gt;Issue 7&lt;/a&gt;) with&amp;nbsp;a &lt;code&gt;_gid&lt;/code&gt; id. But if the user is using two different web browsers, or using a smartphone browser and doing it on their laptop, that actually gets classified under two different identifiers, even though it’s the same&amp;nbsp;person!&lt;/p&gt;
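&lt;p&gt;Here is a tiny Python sketch of that counting problem. The list of hits is made up (the second ID is invented), and real analytics services do far more than this; the point is only that counting distinct cookie IDs collapses repeated refreshes into one visitor, while a second browser shows up as a second visitor.&lt;/p&gt;

```python
# Hypothetical hits: each entry is the cookie ID sent along with one page load.
hits = [
    "1807773255.1584140066",  # laptop browser
    "1807773255.1584140066",  # same browser, page refreshed
    "1807773255.1584140066",  # refreshed again
    "5550001111.1584140500",  # same person, now on a phone (made-up ID)
]

page_loads = len(hits)             # every load counts
unique_visitors = len(set(hits))   # duplicates of the same ID collapse into one
print(page_loads, unique_visitors)
```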
&lt;h2&gt;Plain cookies are not&amp;nbsp;enough&lt;/h2&gt;
&lt;p&gt;Before 2006, this wasn’t a big issue. Users mostly browsed the internet on their desktops and browsers, and they seldom used more than one as their regular device. The famous Intel Core series processors had not even arrived yet—they would come a year later, in July 2007—and the first iPhone would arrive a month before Intel&amp;nbsp;Core.&lt;/p&gt;
&lt;p&gt;That meant the average user was using a Pentium-based computer to browse the internet, and that was probably their only internet-enabled device. At most, they had a desktop at home and a laptop at work. If you got a website visit with a user’s cookie, you knew it was not coming from a smartphone or their Amazon Alexa or any other smart device; those did not exist yet. One or two cookie identifiers were&amp;nbsp;enough.&lt;/p&gt;
&lt;p&gt;In a year, this would&amp;nbsp;change.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Cookies are little fragments of information with a name and a value, and associated with a domain address. They are most commonly used to identify new or returning users. This cookie is issued by a website upon the first visit, stored in the browser, and returned to the issuing server whenever the server requests&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;Time to dispel some myths: cookies don’t actually contain any information about you. At least, in the context of advertising, what gives you away is not the cookie information. Think of cookies as queue numbers or collection slips that you get when you go shopping. They are impersonal identifiers simply used to ensure that a product gets delivered to the person who actually paid for&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;So what’s actually leaking your information, and helping Facebook know what you bought on Amazon? We’ll get there, patience please. The pieces are not yet in&amp;nbsp;place.&lt;/p&gt;
&lt;p&gt;Earlier in this issue, I&amp;nbsp;said&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Whenever the webpage is loaded, the cookie information gets sent to the Google Analytics&amp;nbsp;server.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;How does this actually happen? In &lt;a href="https://ngjunsiang.github.io/laymansguide/issue038.html"&gt;Issue 38&lt;/a&gt;, I showed you a graphic from Chrome’s Developer Tools that represented the loading sequence a webpage goes through. With that same feature, we can find out when and how the Google Analytics cookie gets returned to the&amp;nbsp;server.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 70: The Cookie&amp;nbsp;Factory&lt;/p&gt;
&lt;p&gt;We’ve seen how cookies are served; next issue we’ll get a bit closer and see how information from the cookie is returned. And then in the subsequent issue, you’ll understand Quantcast’s genius insight, and how it led to the ad landscape we have&amp;nbsp;today.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;&lt;del&gt;a cookie? [Issue 8]&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry><entry><title>Issue 68: The Age of Bloat</title><link href="https://ngjunsiang.github.io/laymansguide/issue068.html" rel="alternate"></link><published>2020-04-18T08:00:00+08:00</published><updated>2020-04-18T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-04-18:/laymansguide/issue068.html</id><summary type="html">&lt;p&gt;Advertising was sold on a &lt;span class="caps"&gt;CPM&lt;/span&gt; model (cost per thousand impressions) in the early Internet, until the dot-com bust forced companies to reconsider their ad-buying strategy. The &lt;span class="caps"&gt;CPC&lt;/span&gt; model (cost per click) became more popular, but was still not very user-targeted. It would take QuantCast, founded in 2006, to figure out a way to gather data on users and build a coherent profile of each&amp;nbsp;demographic.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Each click on a link, or even an ad, sends data to the server. This information can include an &lt;span class="caps"&gt;ID&lt;/span&gt; for the link you clicked, or the category of ad you clicked. But without Javascript, the webpage can’t know very much about&amp;nbsp;you.&lt;/p&gt;
&lt;h2&gt;The dot-com&amp;nbsp;bust&lt;/h2&gt;
&lt;p&gt;Once Javascript was made available … surprisingly little happened on the ad front. Javascript could animate your pages and make buttons that &lt;em&gt;changed image&lt;/em&gt; or even &lt;em&gt;changed colour&lt;/em&gt; when you clicked them. But it was doing little else for&amp;nbsp;now.&lt;/p&gt;
&lt;p&gt;Three years after Javascript was announced, the online advertising industry had achieved revenue of $4.6 billion. It’s hard to imagine that this was largely achieved through banner ads alone … many new companies were being founded, there was lots of capital in the market, and it looked like the Internet was the new growing industry, with stock prices continually soaring beyond what people could&amp;nbsp;imagine.&lt;/p&gt;
&lt;p&gt;On March 10, 2000, the &lt;span class="caps"&gt;NASDAQ&lt;/span&gt; Composite index reached its peak, and then it all went downhill from there. It was the dot-com bust which welcomed the 21st&amp;nbsp;century.&lt;/p&gt;
&lt;h2&gt;The old model of advertising: cost-per-mille (&lt;span class="caps"&gt;CPM&lt;/span&gt;)&lt;/h2&gt;
&lt;p&gt;Past online publishers (who displayed ads on their sites) primarily used a &lt;span class="caps"&gt;CPM&lt;/span&gt; model of pricing (“cost per mille”, where “mille” is Latin for thousand, so “cost per thousand ads served”). You paid for a certain number of ads to be served on a certain number of pageloads. You could pay more to have your ad served in a more prominent slot, or to have more ads served, and that was&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;You would often have little idea who saw it or who clicked it, and you just sat and waited for the clicks to come through. Sometimes they did, and often they didn’t. It was cheaper than highway banner ads and huge posters on buildings, but it was still&amp;nbsp;expensive.&lt;/p&gt;
&lt;h2&gt;Recovery and&amp;nbsp;restrategising&lt;/h2&gt;
&lt;p&gt;As freeflow cash quickly shrank during the dot-com bust, many companies began to rethink their advertising campaigns. They could no longer just spend freely on banner ads that online users were getting accustomed to. The pop-up ad, invented in 1997, was being blocked by non-Microsoft-owned major browsers (Netscape, Firefox, and Opera) around the time the economy started to recover from the bust. New services were needed, new value needed to be&amp;nbsp;created.&lt;/p&gt;
&lt;p&gt;The dot-com low lasted until early 2002, when stock prices finally started to pick up again. Google led the rise with its revamped&amp;nbsp;AdWords.&lt;/p&gt;
&lt;p&gt;Google AdWords, revamped after its premature introduction 2 years earlier, offered a &lt;span class="caps"&gt;CPC&lt;/span&gt; model: cost-per-click. You only had to pay if somebody clicked through the ad to your site, not if they ignored the&amp;nbsp;ad.&lt;/p&gt;
&lt;p&gt;This was not a new idea: Yahoo already offered a similar model back in 1998. That was a flop, because Yahoo didn’t know enough about its users to optimise the click-through&amp;nbsp;rate.&lt;/p&gt;
&lt;p&gt;Google innovated over the old model in one unique way&amp;nbsp;though.&lt;/p&gt;
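&lt;p&gt;To make the difference between the two pricing models concrete, here is a small Python sketch. The campaign numbers and rates are invented purely for the arithmetic; they are not historical prices.&lt;/p&gt;

```python
def cpm_cost(impressions, rate_per_thousand):
    """CPM: pay for every thousand ads served, clicked or not."""
    return impressions / 1000 * rate_per_thousand


def cpc_cost(clicks, rate_per_click):
    """CPC: pay only when someone clicks through."""
    return clicks * rate_per_click


# Made-up campaign: 200,000 ads served, of which 300 were clicked.
print(cpm_cost(200_000, 5.00))   # 1000.0
print(cpc_cost(300, 0.50))       # 150.0
```

&lt;p&gt;Under CPM the advertiser pays for all 200,000 impressions regardless of the outcome; under CPC only the 300 clicks are billed.&lt;/p&gt;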
&lt;h2&gt;The new model: cost-per-click (&lt;span class="caps"&gt;CPC&lt;/span&gt;)&lt;/h2&gt;
&lt;p&gt;Early &lt;span class="caps"&gt;CPC&lt;/span&gt; models literally just counted clicks on a link and invoiced you accordingly. As the number of advertisers buying ads rocketed, the publishers switched to an auction model: highest bidder wins. This model disadvantaged smaller companies, who had much smaller advertising budgets, and could not out-compete the big ad-buyers on&amp;nbsp;price.&lt;/p&gt;
&lt;p&gt;Google (back then still a tiny company) saw this and, inspired by its search engine algorithm, introduced one change to it: if an ad with a lower bid got more clicks than ads with higher bids, it could climb the ranking&amp;nbsp;ladder.&lt;/p&gt;
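&lt;p&gt;One simple way to picture that change is to rank ads by bid multiplied by click-through rate instead of by bid alone. That formula is an assumption for the sake of illustration (the exact ranking Google used is not described here), and the ads below are made up.&lt;/p&gt;

```python
# Made-up ads: bid is the price offered per click; ctr is clicks / impressions.
ads = [
    {"name": "big-budget", "bid": 2.00, "ctr": 0.01},
    {"name": "small-shop", "bid": 1.00, "ctr": 0.05},
]

# Pure auction: the highest bid wins.
by_bid = sorted(ads, key=lambda ad: ad["bid"], reverse=True)

# Assumed ranking: weigh each bid by how often the ad actually gets clicked.
by_bid_and_clicks = sorted(ads, key=lambda ad: ad["bid"] * ad["ctr"], reverse=True)

print([ad["name"] for ad in by_bid])
print([ad["name"] for ad in by_bid_and_clicks])
```

&lt;p&gt;With these numbers the smaller bidder comes out on top once clicks are taken into account, which is the effect described above.&lt;/p&gt;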
&lt;p&gt;Now the race was on to grab every user click, with new services and web media. Facebook launched in 2004, YouTube in 2005, Twitter in&amp;nbsp;2006.&lt;/p&gt;
&lt;h2&gt;The search for unified user&amp;nbsp;data&lt;/h2&gt;
&lt;p&gt;There was just one problem: these companies still didn’t know very much about the market. Every company had a piece of the puzzle. Online publishers knew a bit about their users: what time they visited most often and their approximate locations. But they didn’t know what kind of ads their users wanted, and they had to balance the annoyance their users experienced with the revenue that could be brought in by online&amp;nbsp;advertising.&lt;/p&gt;
&lt;p&gt;Ad buyers, on the other hand, mostly knew who their target market was, but had little idea how to reach them. They had to make a guess, or talk to online publishers to see if there was a fit&amp;nbsp;somewhere.&lt;/p&gt;
&lt;p&gt;Analytics companies such as comScore and Nielsen quickly saw this need, and started researching demographic behaviours online. But this didn’t work for niche markets, or when data was&amp;nbsp;lacking.&lt;/p&gt;
&lt;p&gt;Ad servers (such as Doubleclick, whom you already met in &lt;a href="https://ngjunsiang.github.io/laymansguide/issue066.html"&gt;Issue 66&lt;/a&gt;) helped to aggregate advertising slots from online publishers. But they were not in a place to gather data on the users; users were not visiting their site. Nor were they in a place to gather the disparate information from online publishers and ad buyers to build coherent profiles of&amp;nbsp;users.&lt;/p&gt;
&lt;p&gt;That piece of the puzzle would come later. Konrad Feldman and Paul Sutter had noticed the surge of interest in search advertising after Google’s &lt;span class="caps"&gt;IPO&lt;/span&gt; in 2004, and were working on an interesting puzzle: “How would we get direct data on users of sites that we don’t&amp;nbsp;own?”&lt;/p&gt;
&lt;p&gt;They figured it out two years later, and founded a company called&amp;nbsp;QuantCast.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Advertising was sold on a &lt;span class="caps"&gt;CPM&lt;/span&gt; model (cost per thousand impressions) in the early Internet, until the dot-com bust forced companies to reconsider their ad-buying strategy. The &lt;span class="caps"&gt;CPC&lt;/span&gt; model (cost per click) became more popular, but was still not very user-targeted. It would take QuantCast, founded in 2006, to figure out a way to gather data on users and build a coherent profile of each&amp;nbsp;demographic.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 69: The Cookie&amp;nbsp;Monster&lt;/p&gt;
&lt;p&gt;We will take a short detour next week so that I can explain what cookies are, how they came about, and what they do. It’s the linchpin for understanding how online advertising works&amp;nbsp;today.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry><entry><title>Issue 67: The Innocent Times</title><link href="https://ngjunsiang.github.io/laymansguide/issue067.html" rel="alternate"></link><published>2020-04-11T08:00:00+08:00</published><updated>2020-04-11T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-04-11:/laymansguide/issue067.html</id><summary type="html">&lt;p&gt;Each click on a link, or even an ad, sends data to the server. This information can include an &lt;span class="caps"&gt;ID&lt;/span&gt; for the link you clicked, or the category of ad you clicked. But without Javascript, the webpage can’t know very much about&amp;nbsp;you.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; DoubleClick, the first commercially successful ad server, launched in 1996. It ran a system that tracked the performance of banner ads across 30 sites, working to optimise their return on investment. This was made possible by standardisation of the web (thanks to the &lt;span class="caps"&gt;HTTP&lt;/span&gt; specification), and the birth of Javascript, a scripting language integrated into the webpage rather than being a separate module from it. All of this happened in&amp;nbsp;1995–1996.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://archive.org/about/"&gt;The Internet Archive is a 501(c)(3) non-profit&lt;/a&gt; that aims to achieve nothing less than a digital library of the Internet and its artifacts. &lt;a href="https://archive.org/web/"&gt;The Wayback Machine&lt;/a&gt; is your Google portal to the past. This is where you can type in any &lt;span class="caps"&gt;URL&lt;/span&gt; and see how it looked in the past (as long as The Wayback Machine has a saved copy of it from that&amp;nbsp;time).&lt;/p&gt;
&lt;h2&gt;Advertising in&amp;nbsp;1996&lt;/h2&gt;
&lt;p&gt;Back &lt;a href="https://web.archive.org/web/19961022175643/http://www10.yahoo.com:80/"&gt;on Oct 22, 1996, Yahoo! already had advertising&lt;/a&gt; front and centre, right above its search bar. (Google had not even been founded&amp;nbsp;yet.)&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of Yahoo! from Oct 22, 1996" src="https://ngjunsiang.github.io/laymansguide/issue067_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;Yahoo! in 1996 already had advertising right above the search&amp;nbsp;bar&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;The &lt;span class="caps"&gt;URL&lt;/span&gt; of that page was http://www10.yahoo.com:80/, and we can see a few things from&amp;nbsp;that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span class="caps"&gt;HTTP&lt;/span&gt; 1.0 had not fully taken effect yet. When it did, port 80 would be standardised as the default port for web traffic. Before that happened, though, you sometimes had to specify the port (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue033.html"&gt;Issue 33&lt;/a&gt;) for your web browser to send the request&amp;nbsp;through.&lt;/li&gt;
&lt;li&gt;The Internet was small, but it was big enough for Yahoo! to need more than 1 server to serve its homepage. Yahoo had one domain name, yahoo.com, to route all internet traffic through, but it had to somehow direct this traffic to multiple servers. 1 such server was www.yahoo.com, the others were named www2.yahoo.com, www3.yahoo.com, &amp;#8230; you get the&amp;nbsp;idea.&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;HTTPS&lt;/span&gt; was not yet a thing. Privacy was the last thing on people’s minds. Who cares what you were searching for? There wasn’t much on the Internet to implicate people with yet. You couldn’t book hotels or buy stuff online or send a tweet. The Internet was an interesting place, far removed from real&amp;nbsp;life.&lt;/li&gt;
&lt;/ol&gt;
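&lt;p&gt;If you want to pick that address apart yourself, here is a small sketch using Python’s standard library. The helper name and the fallback constant are my own choices for illustration; the fallback simply mirrors what browsers do today when a plain http address leaves out the port.&lt;/p&gt;

```python
from urllib.parse import urlsplit

DEFAULT_HTTP_PORT = 80  # what a browser assumes when no port is given


def host_and_port(url):
    """Return the server name and the port a request would be sent to."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_HTTP_PORT
    return parts.hostname, port


print(host_and_port("http://www10.yahoo.com:80/"))
print(host_and_port("http://www.yahoo.com/"))
```

&lt;p&gt;Both addresses end up on port 80; the first one just spells it out. The server name is the part that differs, which is how the numbered servers were told&amp;nbsp;apart.&lt;/p&gt;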
&lt;h2&gt;What’s in an ad&amp;nbsp;link?&lt;/h2&gt;
&lt;p&gt;The &lt;span class="caps"&gt;URL&lt;/span&gt; that the ad points to is http://www.yahoo.com/homet/SpaceID=0/AdID=2754/?http://la.yahoo.com. Why does yahoo.com appear twice? What’s going&amp;nbsp;on?&lt;/p&gt;
&lt;p&gt;That link is doing quite a number of things: it is sending an &lt;span class="caps"&gt;HTTP&lt;/span&gt; request to Yahoo’s servers with some information&amp;nbsp;attached:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;SpaceID = 0&lt;/strong&gt;&lt;br /&gt;
  Website owners categorise ad slots into different “spaces”. The primary, busiest parts of the webpage might have ads categorised as SpaceID 0. Pages with less traffic might have ads categorised as SpaceID 1, and so on. This allows for some limited form of ad targeting, and different pricing tiers: SpaceID 0 would be more expensive, SpaceID 1 less so, and so&amp;nbsp;on.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;AdID = 2754&lt;/strong&gt;&lt;br /&gt;
  In the table of customers, AdID 2754 would belong to the Yahoo! Los Angeles&amp;nbsp;page.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;http://la.yahoo.com&lt;/strong&gt;&lt;br /&gt;
  This is the page that users should be redirected&amp;nbsp;to.&lt;/li&gt;
&lt;/ul&gt;
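&lt;p&gt;The three pieces above can be pulled out of the link with a few lines of Python. This is only a sketch of the idea: we don’t know how Yahoo’s servers actually processed these links, and the function name and the in-memory click counter are hypothetical.&lt;/p&gt;

```python
from collections import Counter
from urllib.parse import urlsplit

clicks_per_ad = Counter()  # hypothetical click log, kept in memory


def handle_ad_click(url):
    """Split an ad link into its ID fields and its redirect target."""
    parts = urlsplit(url)
    fields = {}
    for segment in parts.path.split("/"):
        if "=" in segment:
            key, value = segment.split("=", 1)
            fields[key] = value
    # Count the click before sending the user on their way.
    clicks_per_ad[fields.get("AdID")] += 1
    # Everything after the question mark is the page to redirect to.
    return fields, parts.query


fields, redirect_to = handle_ad_click(
    "http://www.yahoo.com/homet/SpaceID=0/AdID=2754/?http://la.yahoo.com"
)
print(fields)
print(redirect_to)
print(clicks_per_ad)
```

&lt;p&gt;Notice what the sketch never touches: anything about the person clicking. All it can record is which ad was clicked, in which space, and how many&amp;nbsp;times.&lt;/p&gt;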
&lt;p&gt;Back in 1996, websites like Yahoo! could already track how many times an ad was clicked before redirecting users to the actual page. But they had no way of knowing anything about the user who clicked it. The only information they would have was the user’s &lt;span class="caps"&gt;IP&lt;/span&gt; address (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue027.html"&gt;Issue 27&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;You might find it surprising that none of this requires Javascript; in fact, that page doesn’t have a single scrap of Javascript in&amp;nbsp;it!&lt;/p&gt;
&lt;p&gt;So what does Javascript do for&amp;nbsp;ads?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Each click on a link, or even an ad, sends data to the server. This information can include an &lt;span class="caps"&gt;ID&lt;/span&gt; for the link you clicked, or the category of ad you clicked. But without Javascript, the webpage can’t know very much about&amp;nbsp;you.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 68: The Age of&amp;nbsp;Bloat&lt;/p&gt;
&lt;p&gt;Still starting slow &amp;#8230; because the picture of online advertising is not complete&amp;nbsp;yet.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry><entry><title>Issue 66: Before the Cloud</title><link href="https://ngjunsiang.github.io/laymansguide/issue066.html" rel="alternate"></link><published>2020-04-04T08:00:00+08:00</published><updated>2020-04-04T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-04-04:/laymansguide/issue066.html</id><summary type="html">&lt;p&gt;DoubleClick, the first commercially successful ad server, launched in 1996. It ran a system that tracked the performance of banner ads across 30 sites, working to optimise their return on investment. This was made possible by standardisation of the web (thanks to the &lt;span class="caps"&gt;HTTP&lt;/span&gt; specification), and the birth of Javascript, a scripting language integrated into the webpage rather than being a separate module from it. All of this happened in&amp;nbsp;1995–1996.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Shared memory helps to reduce the amount of memory needed by all the applications running on an operating system. It also allows applications to send data to each other, and to&amp;nbsp;communicate.&lt;/p&gt;
&lt;p&gt;Season 5 focused on the vulnerabilities that arise from optimising CPUs for speed. Speed means sharing; the more easily data is made available to the &lt;span class="caps"&gt;CPU&lt;/span&gt; without all kinds of permission checks, the more quickly the processing takes&amp;nbsp;place.&lt;/p&gt;
&lt;p&gt;This season, Season 6, I will finally go back to the topic that got me started writing Layman’s Guide to Computing in the first place: online data privacy. But this is a huge, complex topic, and I’ve spent two weeks so far trying to build a timeline of key events, identifying key moments, and chasing interesting connections down deep rabbit-holes. Where do I even&amp;nbsp;start?&lt;/p&gt;
&lt;p&gt;Part of the difficulty of getting started is trying to definitively find out when it all started. Today, when you dig into a website’s code, it is mostly a gobbledygook of interacting code, advertising tags, accessibility declarations, and more. A mere 25 years ago, early in 1995, websites were still only static content! How did it turn out like&amp;nbsp;this?&lt;/p&gt;
&lt;p&gt;Did it start, maybe, in 1993? That was when Tim O’Reilly, who had already founded what would later be O’Reilly Media, started the first online information project, Global Network Navigator. (Yahoo! would follow suit with Yahoo! Directory a year later, trying to create the world’s biggest index of websites. By hand.) The site had to be funded somehow, since online commerce had not been born yet; the web was still static content, remember? The enterprising O’Reilly, taking a page from the huge highway banner ads, sold the first clickable ad to a Silicon Valley law firm. 5 months later, Hotwired, a commercial web magazine (which would later be renamed to just &lt;span class="caps"&gt;WIRED&lt;/span&gt;), started doing just that in large&amp;nbsp;quantities.&lt;/p&gt;
&lt;p&gt;That seems to be a reasonable starting point … except that was not the same as the online advertising we know today. People emailing each other image files and signing off advertising contracts on paper is not the same as online ad space being sold to the highest bidder within milliseconds while your page&amp;nbsp;loads.&lt;/p&gt;
&lt;h2&gt;The birth of&amp;nbsp;Javascript&lt;/h2&gt;
&lt;p&gt;No. I think it started in mid-1995, when Netscape hired Brendan Eich to create a scripting language for the web. They already had Java, a language which the web didn’t understand; it had to be compiled (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue054.html"&gt;Issue 54&lt;/a&gt;) to a Java application (which you might know as a Java applet) and put into its own little box so it wouldn’t hurt the rest of the webpage. But they wanted a &lt;strong&gt;scripting language&lt;/strong&gt;, which could be run directly in the browser without compilation, in real time, &lt;em&gt;as part of the page&lt;/em&gt;. &lt;a href="https://www.infoworld.com/article/2653798/javascript-creator-ponders-past--future.html"&gt;In Eich’s words&lt;/a&gt;, “The idea was to make something that Web designers, people who may or may not have much programming training, could use to add a little bit of animation or a little bit of smarts to their Web forms and their Web&amp;nbsp;pages.”&lt;/p&gt;
&lt;p&gt;Mr Eich created a prototype for the language, Mocha, in 10 days, just in time to be included in Netscape Navigator 2.0 beta 3 when it was released in November that year. Its name had been changed to LiveScript. But in December, when his prototype language was announced to the world by Netscape Communications and Sun Microsystems, it would be known as&amp;nbsp;Javascript.&lt;/p&gt;
&lt;p&gt;The same year, Internet Explorer 2.0 was also released to the world. Work on it had also started early that year. Both Netscape Navigator and Internet Explorer were based on very similar codebases: both originated from &lt;span class="caps"&gt;NCSA&lt;/span&gt;’s Mosaic browser, which began development by Eric Bina and Marc Andreessen three years earlier, at the end of 1992. (Andreessen would later be best known as co-founder of Andreessen-Horowitz Capital&amp;nbsp;Management.)&lt;/p&gt;
&lt;p&gt;By Spring 1996, things were heating up. Before this point, web browsers were only working with &lt;a href="https://www.w3.org/Protocols/HTTP/HTTP2.html"&gt;&lt;span class="caps"&gt;HTTP&lt;/span&gt; v0.9&lt;/a&gt;, a protocol so simple I probably wouldn’t need to laymanise it for you. But a new standard was needed to support all the new things that Web 2.0 was supposed to be able to do. That new standard, &lt;span class="caps"&gt;HTTP&lt;/span&gt; v1.0, was published in 1996. (See &lt;a href="https://ngjunsiang.github.io/laymansguide/issue007.html"&gt;Issue 7&lt;/a&gt; if you’re still wondering what &lt;span class="caps"&gt;HTTP&lt;/span&gt;&amp;nbsp;is.)&lt;/p&gt;
&lt;p&gt;What else happened in that magical year of&amp;nbsp;1996?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;As if to signal a shift in the zeitgeist, Global Network Navigator was bought by &lt;span class="caps"&gt;AOL&lt;/span&gt; that year; by year-end it was shuttered, its subscribers moved to &lt;span class="caps"&gt;AOL&lt;/span&gt;. Static banner ads would go the way of the&amp;nbsp;dinosaur.&lt;/li&gt;
&lt;li&gt;The Internet Advertising Bureau was founded to streamline industry standards and provide legal support—instead of stunting growth through regulation like today, this was meant to help growth by standardising things when most things were non-standard, such as the pixel dimensions of online&amp;nbsp;ads.&lt;/li&gt;
&lt;li&gt;Macromedia introduced Flash (Adobe would acquire the company in 2005). It would have a good run for 15 years until Apple decided not to support it in their iOS devices, and it would see browser support removed entirely at the end of 2020, less than 25 years after its&amp;nbsp;beginnings.&lt;/li&gt;
&lt;li&gt;While Google was the first to successfully monetise putting ads in your search, Yahoo! was the first to &lt;a href="https://www.youtube.com/watch?time_continue=17&amp;amp;v=Aa0WaSSVeIw&amp;amp;feature=emb_logo"&gt;put their search engine in an ad&lt;/a&gt;. They launched their &lt;span class="caps"&gt;IPO&lt;/span&gt; in April of&amp;nbsp;1996.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And one more thing. Slow as the internet seemed to be growing, people quickly ran into the limits of static banner ads. You couldn’t do very much on static websites. You couldn&amp;#8217;t track clicks, for instance, and you couldn’t quickly deploy different ads to different websites to see which ones did better. To do something like that, you had to work with different websites—talking to them over phone or email(!)—and work out performance metrics and tracking arrangements with them. In an era when it was hard to know precisely how much a &lt;span class="caps"&gt;TV&lt;/span&gt; ad, poster, or radio ad contributed to your campaign’s success, many companies were hoping to change things with an online presence through ads. It was unsurprisingly turning out to be harder than&amp;nbsp;expected.&lt;/p&gt;
&lt;p&gt;But right then in 1995, one company figured out how to do just that. Instead of serving their own ads, they decided to run their banner ad system, deployed across 30 sites, and sell ad space to other companies. By early 1996, they decided to launch their business. DoubleClick, an ad server, was&amp;nbsp;born.&lt;/p&gt;
&lt;p&gt;In 2007, about 11 years later, Google would announce a deal to acquire them for &lt;span class="caps"&gt;US&lt;/span&gt;$3.1&amp;nbsp;billion.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; DoubleClick, the first commercially successful ad server, launched in 1996. It ran a system that tracked the performance of banner ads across 30 sites, working to optimise their return on investment. This was made possible by standardisation of the web (thanks to the &lt;span class="caps"&gt;HTTP&lt;/span&gt; specification), and the birth of Javascript, a scripting language integrated into the webpage rather than being a separate module from it. All of this happened in&amp;nbsp;1995–1996.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 67: The Innocent&amp;nbsp;Times&lt;/p&gt;
&lt;p&gt;The more astute by this point could have imagined the portentous future that Javascript would herald. But this was still an age of innocence, still enchanted by the immense untapped potential of the desktop and still-new&amp;nbsp;laptop.&lt;/p&gt;
&lt;p&gt;Online advertising already existed even then. Visually, it would look familiar. But at the backend, ads today work very differently from how they did in&amp;nbsp;1996.&lt;/p&gt;
&lt;p&gt;In the next issue, I will try to trace how online ads developed, as the industry changed and grew and shifted, to show you how they became what they are&amp;nbsp;today.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 06"></category></entry></feed>