<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Layman's Guide to Computing - Season 05</title><link href="https://ngjunsiang.github.io/laymansguide/" rel="alternate"></link><link href="https://ngjunsiang.github.io/laymansguide/feeds/season-05.atom.xml" rel="self"></link><id>https://ngjunsiang.github.io/laymansguide/</id><updated>2020-03-28T17:12:00+08:00</updated><entry><title>Issue 65: Memory Sharing in the Operating System</title><link href="https://ngjunsiang.github.io/laymansguide/issue065.html" rel="alternate"></link><published>2020-03-28T17:12:00+08:00</published><updated>2020-03-28T17:12:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-03-28:/laymansguide/issue065.html</id><summary type="html">&lt;p&gt;Shared memory helps to reduce the amount of memory needed by all the applications running on an operating system. It also allows applications to send data to each other, and to&amp;nbsp;communicate.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Meltdown and Spectre require the programs executing them to have access to kernel memory space. Kernel address isolation attempts to prevent the program from even having access to the kernel address space in the first place. &lt;span class="caps"&gt;TLB&lt;/span&gt; flushing changes the virtual-to-physical memory mapping, disrupting Spectre’s reliance on a consistent virtual-to-physical memory&amp;nbsp;mapping.&lt;/p&gt;
&lt;p&gt;One question that makes sense to ask is: if the operating system is supposed to keep the memory used by each program separate, then how is one program able to access the memory of another program? How would a program trying to mount a Meltdown or Spectre attack be able to read the memory of any other program, let alone the operating&amp;nbsp;system?&lt;/p&gt;
&lt;p&gt;Let’s face it: it is impossible to completely separate programs from each other. Many programs need to communicate with each other: antivirus software needs to be able to scan the addresses accessed by your web browser for harmful links, Office applications need to be able to send data to each other, especially for features like Mail Merge, and of course your task manager has to know how many resources every app is using, so that it can show you&amp;nbsp;this:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of task manager in Windows 10, showing shared memory usage" src="https://ngjunsiang.github.io/laymansguide/issue065_01.png" /&gt;&lt;br /&gt;
&lt;small&gt;Task Manager in Windows 10&lt;br /&gt;
You can reveal the shared memory column by right-clicking on the column labels and then “Select Columns”.&lt;/small&gt;&lt;/p&gt;
&lt;p&gt;or&amp;nbsp;this:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of system monitor in KDE, showing shared memory usage" src="https://ngjunsiang.github.io/laymansguide/issue065_02.png" /&gt;&lt;br /&gt;
&lt;em&gt;System Monitor in &lt;span class="caps"&gt;KDE&lt;/span&gt;&amp;nbsp;(Linux).&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;What is this shared&amp;nbsp;memory?&lt;/p&gt;
&lt;h2&gt;Private&amp;nbsp;memory&lt;/h2&gt;
&lt;p&gt;The memory I talked about earlier, which every software application has, is used to store various things. It stores temporary information such as unsaved data, application settings, and graphics resources (every icon and image shown in the application has to come from somewhere …) but, most importantly, libraries and other functions (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue017.html"&gt;Issue 17&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;Very few software developers will write every single bit of code used by their program; often, they will use software libraries written by others to provide specialised functions (e.g. encrypting your data, or accessing a database). When program code is compiled into &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue054.html"&gt;Issue 54&lt;/a&gt;), these libraries of course have to be compiled and bundled up as&amp;nbsp;well.&lt;/p&gt;
&lt;p&gt;That makes the program really huge, doesn’t it? Yes, it does; it is one reason (but not the main reason) that mobile apps, especially Android apps, &lt;a href="https://trevore.com/post/app-sizes-are-out-of-control/"&gt;have become so bloated&lt;/a&gt; over the last half-decade or so. But I&amp;nbsp;digress.&lt;/p&gt;
&lt;h2&gt;Shared&amp;nbsp;memory&lt;/h2&gt;
&lt;p&gt;At some point, you start to realise that many of these apps need to use a set of identical functions: at the most basic level, requesting and managing memory, requesting file access, sending data over a network, …, and up to libraries for resizing images, and so&amp;nbsp;on.&lt;/p&gt;
&lt;p&gt;It doesn’t make sense for each app to have to bundle its own libraries for that! So the &lt;span class="caps"&gt;OS&lt;/span&gt; actually provides a set of common libraries that applications compiled for that &lt;span class="caps"&gt;OS&lt;/span&gt; can use. Each operating system bundles its own libraries for applications to use; this is one reason why applications compiled for Windows won’t work on &lt;span class="caps"&gt;OSX&lt;/span&gt; or Linux, and vice-versa. That also means that these libraries have to be loaded into a part of memory that is accessible to all applications. These shared libraries thus go into &lt;strong&gt;shared memory space&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;What else? Shared libraries can’t be taking up so much space by themselves, they’re just instructions&amp;nbsp;…&lt;/p&gt;
&lt;p&gt;Let’s try to find out what else is sitting in&amp;nbsp;there.&lt;/p&gt;
&lt;h2&gt;Investigating memory&amp;nbsp;details&lt;/h2&gt;
&lt;p&gt;On Windows, I’m going to need more specialised tools. I’ve only got an hour; let’s try something&amp;nbsp;else.&lt;/p&gt;
&lt;p&gt;Ah! System Monitor actually reveals more details about how an application uses memory. Let’s investigate the top few processes using the most&amp;nbsp;memory.&lt;/p&gt;
&lt;p&gt;Here’s&amp;nbsp;Firefox:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of detailed memory usage in Firefox on KDE" src="https://ngjunsiang.github.io/laymansguide/issue065_03.png" /&gt;&lt;br /&gt;
&lt;em&gt;Firefox detailed memory usage in &lt;span class="caps"&gt;KDE&lt;/span&gt;&amp;nbsp;(Linux).&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;Oops, too much detail. Here’s the&amp;nbsp;gist:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Firefox uses about 450 &lt;span class="caps"&gt;MB&lt;/span&gt; for its own stuff in private memory, in a place called the&amp;nbsp;heap.&lt;/li&gt;
&lt;li&gt;To communicate with other processes, it uses about 10 &lt;span class="caps"&gt;MB&lt;/span&gt; privately, and 82 &lt;span class="caps"&gt;MB&lt;/span&gt; shared with other processes (it does so through /&lt;span class="caps"&gt;SYSV00000000&lt;/span&gt;, which is deleted when not in&amp;nbsp;use).&lt;/li&gt;
&lt;li&gt;It has loaded one of its core&amp;nbsp;libraries, &lt;code&gt;libxul.so&lt;/code&gt; (almost all libraries start with the&amp;nbsp;prefix &lt;code&gt;lib&lt;/code&gt;) in shared space. This core library is shared with other Mozilla applications, such as its Thunderbird email client, so it makes sense to put it mostly in shared&amp;nbsp;memory.&lt;/li&gt;
&lt;li&gt;It uses a small amount of space for caching things (startup code, its own scripts,&amp;nbsp;etc)&lt;/li&gt;
&lt;li&gt;It uses some shared memory to communicate with other processes. (The&amp;nbsp;acronym &lt;code&gt;IPC&lt;/code&gt; in this context usually refers to &lt;strong&gt;inter-process communication&lt;/strong&gt;.) This can be for playing audio/video (it has to communicate with the audio/video drivers), or loading content that has to be processed through plugins (used to be Flash content in the past, now it can be other&amp;nbsp;things).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Hmm, interesting. Let’s try to find something more illuminating to wrap up this season&amp;nbsp;with.&lt;/p&gt;
&lt;h2&gt;How is shared memory&amp;nbsp;used?&lt;/h2&gt;
&lt;p&gt;I do my newsletter writing mainly in an app called Atom, made by GitHub. Atom runs on a platform called Electron (atom … electron … get it?). Electron is a GitHub project that allows developers to write desktop/laptop apps in JavaScript, traditionally the language of web&amp;nbsp;scripting.&lt;/p&gt;
&lt;p&gt;In system monitor, I can see an app named atom, and one named electron. Let’s inspect them&amp;nbsp;both.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of detailed memory usage for electron on KDE" src="https://ngjunsiang.github.io/laymansguide/issue065_04.png" /&gt;&lt;br /&gt;
&lt;em&gt;Electron detailed memory usage in &lt;span class="caps"&gt;KDE&lt;/span&gt;&amp;nbsp;(Linux).&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;&lt;img alt="Screenshot of detailed memory usage for atom on KDE" src="https://ngjunsiang.github.io/laymansguide/issue065_05.png" /&gt;&lt;br /&gt;
&lt;em&gt;Atom detailed memory usage in &lt;span class="caps"&gt;KDE&lt;/span&gt;&amp;nbsp;(Linux).&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;We can see&amp;nbsp;that:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Both apps are sharing&amp;nbsp;the &lt;code&gt;electron&lt;/code&gt; library (it does not have&amp;nbsp;a &lt;code&gt;lib&lt;/code&gt; prefix, but it is stored in&amp;nbsp;the &lt;code&gt;/usr/lib&lt;/code&gt; directory which is where libraries&amp;nbsp;go)&lt;/li&gt;
&lt;li&gt;They both use a bunch of shared&amp;nbsp;libraries: &lt;code&gt;libicu*&lt;/code&gt; for Unicode&amp;nbsp;support, &lt;code&gt;libc*&lt;/code&gt; &lt;span class="amp"&gt;&amp;amp;&lt;/span&gt; &lt;code&gt;libstd*&lt;/code&gt; for standard operating system functions (reading/writing files,&amp;nbsp;etc), &lt;code&gt;libgtk*&lt;/code&gt; for user interface&amp;nbsp;management, &lt;code&gt;fontconfig&lt;/code&gt; for fonts,&amp;nbsp;etc&lt;/li&gt;
&lt;li&gt;Some libraries are still loaded privately, and both programs still have a heap for their own data which is not meant to be accessible to other&amp;nbsp;programs&lt;/li&gt;
&lt;/ol&gt;
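&lt;p&gt;You can see why the application memory usage shown in Task Manager/System Monitor doesn’t always tally with the total memory usage. The figure for each application usually includes both its private memory and the shared memory it uses, so a block of shared memory is counted once for every application that uses it. Adding up the per-application figures therefore gives a number greater than the total memory actually in&amp;nbsp;use.&lt;/p&gt;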
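&lt;p&gt;A rough sketch of that arithmetic in Python may help. The numbers are entirely made up: three hypothetical apps, each with its own private memory, all using the same 90 &lt;span class="caps"&gt;MB&lt;/span&gt; of shared libraries. The function names are mine, not anything a real system monitor provides.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;def summed_app_usage(private_mb, shared_mb):
    # What you get by adding up the per-app column:
    # every app reports its private memory plus the shared memory it uses.
    return sum(p + shared_mb for p in private_mb)


def actual_usage(private_mb, shared_mb):
    # What is really occupied: the shared block exists only once.
    return sum(private_mb) + shared_mb


private = [450, 120, 80]  # hypothetical private memory per app, in MB
shared = 90               # hypothetical shared libraries used by all three, in MB

print(summed_app_usage(private, shared))  # 920
print(actual_usage(private, shared))      # 740
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The 180 &lt;span class="caps"&gt;MB&lt;/span&gt; gap is the shared block being counted two extra times.&lt;/p&gt;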
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Shared memory helps to reduce the amount of memory needed by all the applications running on an operating system. It also allows applications to send data to each other, and to&amp;nbsp;communicate.&lt;/p&gt;
&lt;p&gt;Long issue; I hope the images make up for it. Computers in the early days didn’t share memory so easily, and that made things really inconvenient. They often had to communicate through one application writing data to a file, and then having the other application read the data from that file. Slow, and often unreliable. Shared memory evolved as a way to make that process&amp;nbsp;easier.&lt;/p&gt;
&lt;p&gt;But shared memory, improperly secured and managed, is also how vulnerabilities like Meltdown and Spectre are made possible, and how malware can do what it does. It’s a double-edged&amp;nbsp;sword.&lt;/p&gt;
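&lt;p&gt;To make the idea of two programs using the same block of memory concrete, here is a minimal Python sketch using the standard library’s &lt;code&gt;multiprocessing.shared_memory&lt;/code&gt; module (Python 3.8 or later). For brevity, the “writer” and the “reader” live in the same process, which is a simplification of this sketch; in real use the reader would be a separate program that has been told the block’s name. The function name &lt;code&gt;share_message&lt;/code&gt; is made up for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;from multiprocessing import shared_memory


def share_message(message):
    # Writer side: ask the OS for a named block of shared memory.
    writer = shared_memory.SharedMemory(create=True, size=len(message))
    try:
        writer.buf[:len(message)] = message
        # Reader side: attach to the same block using only its name.
        reader = shared_memory.SharedMemory(name=writer.name)
        try:
            received = bytes(reader.buf[:len(message)])
        finally:
            reader.close()
    finally:
        writer.close()
        writer.unlink()  # ask the OS to remove the block once we are done
    return received


print(share_message(b'hello from another program'))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;No file is written and nothing is copied over a network: both sides simply look at the same bytes in memory.&lt;/p&gt;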
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S6] Issue 66: Before the&amp;nbsp;Cloud&lt;/p&gt;
&lt;p&gt;Memory is one of those topics where I think laypeople and engineers have a completely different picture in their heads. I hope this issue has clarified that picture somewhat. It still won&amp;#8217;t be completely clear until I can talk about heaps, but I won’t do that until I figure out how to simplify&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;Meanwhile, the newsletter must go on! I’ve finished Season 5, having explained how computers improve performance through reordering instructions (Out-of-Order Processing) and running instructions ahead of time if the &lt;span class="caps"&gt;CPU&lt;/span&gt; thinks they will be needed (Speculative Execution). Both of these processes use the cache, which is controlled by the &lt;span class="caps"&gt;CPU&lt;/span&gt; hardware directly, not by the operating system. And through an esoteric loophole that exploits timing differences in cache access (cache hit = fast, cache miss = slow), an attacker is able to leak data out from protected kernel memory through the&amp;nbsp;cache.&lt;/p&gt;
&lt;p&gt;After this detour, it’s time to rewind to where I stopped in Season 3: with networks and the Internet. I went through data types in Season 4 to talk about what complex documents are (because the web is made up of a series of complex documents). Then I laid out a &lt;span class="caps"&gt;CPU&lt;/span&gt; exploit in Season 5, to show you how data can be leaked&amp;nbsp;inadvertently.&lt;/p&gt;
&lt;p&gt;Now I’m ready to tell you more about how the current online advertising model became what it is today, and why it is so bad for privacy. You are going to learn a lot more about how ads really work, how advertisers track your online activity, and how they ensnare many companies (especially the big publishers) into a kind of self-reinforcing scheme that lets them target their content more effectively while also letting advertisers improve their&amp;nbsp;targeting.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="app"></category><category term="cache"></category><category term="memory"></category><category term="operating system"></category></entry><entry><title>Issue 64: Fixing Meltdown and Spectre</title><link href="https://ngjunsiang.github.io/laymansguide/issue064.html" rel="alternate"></link><published>2020-03-14T08:00:00+08:00</published><updated>2020-03-14T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-03-14:/laymansguide/issue064.html</id><summary type="html">&lt;p&gt;Meltdown and Spectre require the programs executing them to have access to kernel memory space. Kernel address isolation attempts to prevent the program from even having access to the kernel address space in the first place. &lt;span class="caps"&gt;TLB&lt;/span&gt; flushing changes the virtual-to-physical memory mapping, disrupting Spectre’s reliance on a consistent virtual-to-physical memory&amp;nbsp;mapping.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; For Meltdown and Spectre to work, they need two things: (1) Permission to carry out instructions (i.e. run programs) on the &lt;span class="caps"&gt;OS&lt;/span&gt;, and (2) knowledge of where the kernel address space&amp;nbsp;is.&lt;/p&gt;
&lt;p&gt;Last week, I explained two key requirements that limit Meltdown and Spectre: both must be met for an attack to be carried out successfully. Hackers getting permission they shouldn’t have is not a security flaw related to Meltdown and Spectre, so that really belongs in a different season of Layman’s&amp;nbsp;Guide.&lt;/p&gt;
&lt;p&gt;So we’ll focus on problem 2—protecting access to the kernel address space, which is set aside for the &lt;span class="caps"&gt;OS&lt;/span&gt;’s use. The kernel address space contains key information, such as user privilege tables and &lt;span class="caps"&gt;OS&lt;/span&gt; state, which a hacker can compromise to gain higher-level permissions, or to find out where the memory address space for a certain program, such as the customer information database, is&amp;nbsp;located.&lt;/p&gt;
&lt;h2&gt;Protecting the kernel address&amp;nbsp;space&lt;/h2&gt;
&lt;p&gt;One common way of protecting knowledge of where the kernel address space (i.e. the “&lt;span class="caps"&gt;HQ&lt;/span&gt;”) is located is to keep changing its location. For example, newer versions of Linux randomise the location of the kernel address space at each computer bootup, to make it harder for an attacker to&amp;nbsp;guess.&lt;/p&gt;
&lt;p&gt;It is still possible for the attacker to slowly probe which parts of the address space it can access, and which parts it can’t, and make a guess where the kernel address space is; I will not go into detail about these various&amp;nbsp;methods.&lt;/p&gt;
&lt;p&gt;But do you see the bigger problem? Programs are &lt;strong&gt;actually allowed&lt;/strong&gt; to request an address in the kernel address space. The &lt;span class="caps"&gt;OS&lt;/span&gt; checks its permission tables before it tells the program whether it is allowed to access that space. The only thing preventing program access to that space is an &lt;span class="caps"&gt;OS&lt;/span&gt; permission&amp;nbsp;check.&lt;/p&gt;
&lt;p&gt;In contrast, if a program tried to request memory&amp;nbsp;address &lt;code&gt;-56&lt;/code&gt; or &lt;code&gt;2^65&lt;/code&gt;, the &lt;span class="caps"&gt;OS&lt;/span&gt; wouldn’t even need to check. Negative memory addresses are obviously invalid, as are memory addresses longer than 64-bit (which could not even be expressed in a 64-bit&amp;nbsp;request).&lt;/p&gt;
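&lt;p&gt;As a small illustration of why those two addresses need no permission check at all, here is a Python sketch. It assumes a plain 64-bit address space, and the function name &lt;code&gt;is_valid_address&lt;/code&gt; is made up; a real &lt;span class="caps"&gt;OS&lt;/span&gt; does not run code like this, since such values simply cannot be represented in hardware.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Every value that a 64-bit address can take: 0 up to 2**64 - 1.
ADDRESS_SPACE = range(2**64)


def is_valid_address(address):
    # No permission table involved: the number either fits or it does not.
    return address in ADDRESS_SPACE


print(is_valid_address(-56))      # False
print(is_valid_address(2**65))    # False
print(is_valid_address(2354476))  # True
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;A kernel address, by contrast, passes this kind of test, which is why a separate permission check is the only thing standing in the way.&lt;/p&gt;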
&lt;h2&gt;Mitigating Meltdown: Kernel address&amp;nbsp;isolation&lt;/h2&gt;
&lt;p&gt;One fix that has been merged into the Linux kernel since 2017 is &lt;span class="caps"&gt;KAISER&lt;/span&gt;, which aims to prevent programs from even having access to kernel address space. Similar patches have been released for Windows and macOS as well&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt;. Under this patch, &lt;strong&gt;two&lt;/strong&gt; sets of address spaces are&amp;nbsp;maintained.&lt;/p&gt;
&lt;p&gt;The first set is the same as before: it is essentially the entire address space. But now, only the kernel (the “core” of the &lt;span class="caps"&gt;OS&lt;/span&gt;) has access to it. The second set contains the entire address space used by programs, excluding kernel address space. This way, programs running with user permissions will not even be able to get data from the kernel address space. It&amp;#8217;s like trying to get to a room that doesn’t exist (to the&amp;nbsp;program).&lt;/p&gt;
&lt;p&gt;Having to keep switching between two sets of pages when executing instructions from both kernel programs as well as user programs is, of course, going to make things take longer than usual. Up to 20% longer for some&amp;nbsp;instructions.&lt;/p&gt;
&lt;p&gt;This primarily mitigates the impact of Meltdown, which attempts to access the kernel address space before it gets caught and an exception is raised in the program. But it does not do anything for Spectre, which speculatively executes two possible outcomes where the code meets a decision point, but later discards the outcome which is not&amp;nbsp;needed.&lt;/p&gt;
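&lt;p&gt;The two sets of address spaces can be pictured with a toy model in Python. This is only a sketch of the idea, not how the kernel implements it; the page numbers, their contents, and the function name &lt;code&gt;read_page&lt;/code&gt; are all invented for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;# Hypothetical page numbers and contents, for illustration only.
USER_PAGES = {1: 'browser data', 2: 'editor data'}
KERNEL_PAGES = {100: 'user privilege tables', 101: 'OS state'}

# First set: the full address space, used only while the kernel is running.
kernel_view = {**USER_PAGES, **KERNEL_PAGES}
# Second set: what user programs get; kernel pages are simply absent.
user_view = dict(USER_PAGES)


def read_page(view, page):
    # A missing page is not forbidden; it just does not exist in this view.
    return view.get(page, 'no such page')


print(read_page(kernel_view, 100))  # user privilege tables
print(read_page(user_view, 100))    # no such page
print(read_page(user_view, 1))      # browser data
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The cost mentioned above comes from swapping between the two views every time control passes between a user program and the kernel.&lt;/p&gt;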
&lt;h2&gt;Crash course: Translation Lookaside&amp;nbsp;Buffer&lt;/h2&gt;
&lt;p&gt;One concept to cover before we get to the Spectre mitigation. In &lt;a href="https://ngjunsiang.github.io/laymansguide/issue055.html"&gt;Issue 55&lt;/a&gt;, I talked about how the virtual address space allows programs to access data from different parts of the computer: &lt;span class="caps"&gt;USB&lt;/span&gt; devices, hard drives, network, sound card, and of course not forgetting the physical memory&amp;nbsp;itself.&lt;/p&gt;
&lt;p&gt;How does the &lt;span class="caps"&gt;CPU&lt;/span&gt; know that virtual address 2354476 is actually pointing to physical memory address 3564241? It doesn’t know by itself; the mapping is looked up by the memory management unit inside the &lt;span class="caps"&gt;CPU&lt;/span&gt;. Like all mappings (remember the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache, and the &lt;span class="caps"&gt;DNS&lt;/span&gt; cache from &lt;a href="https://ngjunsiang.github.io/laymansguide/issue039.html"&gt;Issue 39&lt;/a&gt;?), the lookup process can be greatly speeded up with a cache. The part of the &lt;span class="caps"&gt;CPU&lt;/span&gt; that caches virtual-to-physical memory mappings is called the Translation Lookaside Buffer, or &lt;span class="caps"&gt;TLB&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;A key requirement for Spectre to work is for the Translation Lookaside Buffer to remain unchanged, so that it is getting data from the same part of (kernel address space)&amp;nbsp;memory.&lt;/p&gt;
&lt;h2&gt;Mitigating Spectre: &lt;span class="caps"&gt;TLB&lt;/span&gt;&amp;nbsp;flushing&lt;/h2&gt;
&lt;p&gt;Naturally, one way to mitigate Spectre is to keep flushing the &lt;span class="caps"&gt;TLB&lt;/span&gt;. As can be expected whenever you flush a cache, lookups will cause a cache miss and result in the &lt;span class="caps"&gt;CPU&lt;/span&gt; memory management unit having to figure out the mapping all over again, leading to&amp;nbsp;slowdown.&lt;/p&gt;
&lt;p&gt;Some performance/security features that are being worked on for processors include selective &lt;span class="caps"&gt;TLB&lt;/span&gt; flushing (flushing only some parts of it but not all), or learning to identify when it should be&amp;nbsp;flushed.&lt;/p&gt;
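&lt;p&gt;Here is a toy model of a &lt;span class="caps"&gt;TLB&lt;/span&gt; in Python, showing why a flush leads to slowdown. It is a sketch under simplifying assumptions: a 4096-byte page size, a page table held in a dictionary, and a class name (&lt;code&gt;TinyMMU&lt;/code&gt;) that I made up. Real hardware is far more involved.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;PAGE_SIZE = 4096  # assumed page size in bytes


class TinyMMU:
    def __init__(self, page_table):
        self.page_table = page_table  # virtual page number to physical page number
        self.tlb = {}                 # cache of recent translations
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page in self.tlb:
            self.hits += 1
        else:
            self.misses += 1  # slow path: consult the full page table
            self.tlb[page] = self.page_table[page]
        return self.tlb[page] * PAGE_SIZE + offset

    def flush(self):
        self.tlb.clear()


mmu = TinyMMU({0: 7, 1: 3})
mmu.translate(4100)  # miss: first time page 1 is looked up
mmu.translate(4200)  # hit: page 1 is now cached
mmu.flush()
mmu.translate(4300)  # miss again: the flush emptied the cache
print(mmu.hits, mmu.misses)  # 1 2
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Every flush turns the next lookup for each page into a miss, which is the performance price of this mitigation.&lt;/p&gt;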
&lt;h2&gt;Last words on Meltdown and&amp;nbsp;Spectre&lt;/h2&gt;
&lt;p&gt;I lied in the title of this issue: there is no fix. These are only &lt;em&gt;mitigations&lt;/em&gt;, which can reduce the impact of these attacks, but not prevent them&amp;nbsp;completely.&lt;/p&gt;
&lt;p&gt;The dismal conclusion you might not have drawn is that there is little we can do to protect ourselves against such vulnerabilities, besides keeping your &lt;span class="caps"&gt;OS&lt;/span&gt; patched and up to date, and not leaving your computer running continuously for too long (the location of kernel address space is only randomised upon&amp;nbsp;bootup).&lt;/p&gt;
&lt;p&gt;The good news is: Meltdown and Spectre are a lot of work. No cases of them being used in the real world have been reported as of yet, and hackers are unlikely to go to this much effort to attack consumers; targets of their attack will probably be database servers of bigger&amp;nbsp;companies.&lt;/p&gt;
&lt;p&gt;Still, the origin of these exploits stemmed from an earlier time when our collective focus was on faster and faster CPUs. In the early ’00s, we didn’t hear CPUs being touted as “safe” or “secure”, just “fast”. Neither did we see a need for secure&amp;nbsp;CPUs.&lt;/p&gt;
&lt;p&gt;It was only with the explosion of the mobile internet after 2007 that the market became a lucrative target. By the time the hacking tools became widespread, CPUs had already incorporated so many features to speed up processing at the cost of&amp;nbsp;security.&lt;/p&gt;
&lt;p&gt;Perhaps it is time for us to reassess the situation, make the judgement call to ask for greater hardware security, take the bitter pill of performance tradeoff, and then wait for the &lt;span class="caps"&gt;CPU&lt;/span&gt; manufacturers to get the message, if they haven’t&amp;nbsp;already.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Meltdown and Spectre require the programs executing them to have access to kernel memory space. Kernel address isolation attempts to prevent the program from even having access to the kernel address space in the first place. &lt;span class="caps"&gt;TLB&lt;/span&gt; flushing changes the virtual-to-physical memory mapping, disrupting Spectre’s reliance on a consistent virtual-to-physical memory&amp;nbsp;mapping.&lt;/p&gt;
&lt;p&gt;Phew, that was quite a bit to type. I am glad to be done talking about Meltdown and Spectre; these are sombre topics, and the more I write about them, the less faith I have in the devices I&amp;nbsp;use.&lt;/p&gt;
&lt;p&gt;Funnily enough, I had originally titled this season “Operating Systems and the &lt;span class="caps"&gt;CPU&lt;/span&gt;”. I am obviously far from covering things that I think people should know about their operating system, so that will probably resume in another&amp;nbsp;season.&lt;/p&gt;
&lt;p&gt;Since I’ve been talking so much about memory, I think it makes sense to bring in one more topic here to close off the season: how do all those programs share a common memory&amp;nbsp;space?&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S5] Issue 65: Memory Sharing in the Operating&amp;nbsp;System&lt;/p&gt;
&lt;p&gt;I have only one issue left and I don’t want to end with a cliffhanger, so I’m going to keep the next issue focused on one question: what is all that memory used&amp;nbsp;for?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="footnote"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Interestingly, these patches went out shortly before Meltdown and Spectre were announced … I won’t speculate about the timing here, you draw your own conclusions.&amp;#160;&lt;a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</content><category term="Season 05"></category><category term="cpu"></category><category term="operating system"></category></entry><entry><title>Issue 63: Limitations of Meltdown and Spectre</title><link href="https://ngjunsiang.github.io/laymansguide/issue063.html" rel="alternate"></link><published>2020-03-07T08:00:00+08:00</published><updated>2020-03-07T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-03-07:/laymansguide/issue063.html</id><summary type="html">&lt;p&gt;For Meltdown and Spectre to work, they need two things: (1) Permission to carry out instructions (i.e. run programs) on the &lt;span class="caps"&gt;OS&lt;/span&gt;, and (2) knowledge of where the kernel address space&amp;nbsp;is.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; To snoop the cache,&amp;nbsp;we:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Flush the cache corresponding to the 256 memory addresses (to get a cache miss when attempting to load the data from&amp;nbsp;memory)&lt;/li&gt;
&lt;li&gt;Load the secret value using Meltdown or Spectre attacks (the secret value is only one byte, and cannot be greater than 255, so 256 addresses are&amp;nbsp;sufficient)&lt;/li&gt;
&lt;li&gt;Load the memory address from step 1 that corresponds to the value of the secret - this address is now cached, and the next request for it will result in a cache&amp;nbsp;hit&lt;/li&gt;
&lt;li&gt;Request each address and look for the one with a lower request&amp;nbsp;latency&lt;/li&gt;
&lt;/ol&gt;
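&lt;p&gt;The four steps above can be sketched as a harmless simulation in Python. Nothing here touches a real &lt;span class="caps"&gt;CPU&lt;/span&gt; cache: the latencies are made-up numbers, and &lt;code&gt;ToyCache&lt;/code&gt;, &lt;code&gt;recover_byte&lt;/code&gt; and &lt;code&gt;transient_access&lt;/code&gt; are names invented for this illustration. The callback stands in for the Meltdown/Spectre part, which touches one address depending on the secret.&lt;/p&gt;
&lt;pre&gt;&lt;code class="language-python"&gt;HIT_LATENCY = 1     # made-up time units for a cache hit
MISS_LATENCY = 100  # made-up time units for a cache miss


class ToyCache:
    def __init__(self):
        self.cached = set()

    def flush(self, addresses):
        self.cached.difference_update(addresses)

    def load(self, address):
        latency = HIT_LATENCY if address in self.cached else MISS_LATENCY
        self.cached.add(address)
        return latency


def recover_byte(cache, transient_access):
    probe = range(256)
    cache.flush(probe)       # step 1: make sure every probe address is uncached
    transient_access(cache)  # steps 2 and 3: one address gets cached
    timings = [cache.load(address) for address in probe]  # step 4
    return timings.index(min(timings))  # the fastest address reveals the byte


secret = 42  # hypothetical secret byte; recover_byte never reads it directly
print(recover_byte(ToyCache(), lambda cache: cache.load(secret)))  # 42
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Note that &lt;code&gt;recover_byte&lt;/code&gt; learns the value purely from timing, which is the whole point of the side channel.&lt;/p&gt;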
&lt;p&gt;At the heart of the exploit is a cache which, being managed by hardware, is not subject to more fine-grained &lt;span class="caps"&gt;OS&lt;/span&gt; control. This is presumed to be safe by engineers, since you can’t access data directly from hardware easily. But as we have seen in this season, with the right exploit, you can still get to that data, with or without&amp;nbsp;permission.&lt;/p&gt;
&lt;p&gt;Just how vulnerable are we to Meltdown and&amp;nbsp;Spectre?&lt;/p&gt;
&lt;h2&gt;Getting&amp;nbsp;permission&lt;/h2&gt;
&lt;p&gt;Every exploit relies on one or more things to work before it can do its thing. Meltdown and Spectre require a way to run themselves on the &lt;span class="caps"&gt;CPU&lt;/span&gt;, in the &lt;span class="caps"&gt;OS&lt;/span&gt;. That means a black hat hacker will have to obtain illegal access to the &lt;span class="caps"&gt;OS&lt;/span&gt;, and there are a few common ways to do&amp;nbsp;so:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;By cracking a password&lt;/strong&gt;&lt;br /&gt;
   If the hash of a password (future season) is leaked, hackers can try to reverse-engineer the original password that led to that hash. This requires A &lt;span class="caps"&gt;LOT&lt;/span&gt; of &lt;span class="caps"&gt;CPU&lt;/span&gt; time, and is often not feasible for properly hashed&amp;nbsp;passwords.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Getting a password from an unsuspecting user&lt;/strong&gt;&lt;br /&gt;
   Other users of the &lt;span class="caps"&gt;OS&lt;/span&gt;, usually admins and employees, will already have access to it. A black hat hacker can try to get the password from them through phishing, or by getting keystroke-logging malware onto a flash drive they use, or simply by posing as a contractor who needs the password for … whatever&amp;nbsp;reason.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Exploiting vulnerabilities&lt;/strong&gt;&lt;br /&gt;
   There are many ways to get an &lt;span class="caps"&gt;OS&lt;/span&gt; to carry out instructions it is not supposed to. An improperly secured web app could receive malicious form data from any of its pages that tells the database to return supposedly-secured information. An improperly configured web server could be exploited by sending it more data than it requested. (Just see the number of “buffer overflow” entries &lt;a href="https://www.cvedetails.com/vulnerability-list/vendor_id-45/product_id-66/opov-1/Apache-Http-Server.html"&gt;on this page&lt;/a&gt;.) If the server is not properly written to know what to do with this excess data, and naïvely stores it into memory or processes it, that leads to Bad Things&amp;nbsp;Happening.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Once the black hat hackers find a way to get permission to run things in the &lt;span class="caps"&gt;OS&lt;/span&gt;, they are in lala-land! Not quite. There are different levels of permissions, and the most restrictive ones might not let you run any programs except from a whitelist. At the other end of the privilege spectrum, &lt;strong&gt;root&lt;/strong&gt; accounts let you do pretty much everything and anything. This is why if you are ever asked to be root (or Admin) of a computer (including your router), you should really keep that password in a safe and secure place, such as a password&amp;nbsp;manager.&lt;/p&gt;
&lt;h2&gt;Knowing where the loot&amp;nbsp;is&lt;/h2&gt;
&lt;p&gt;During that tiny window of opportunity, the black hat hackers are trying to read data from parts of virtual memory they are not allowed to access. But which are those&amp;nbsp;parts?&lt;/p&gt;
&lt;p&gt;Within the physical memory part of the virtual memory&amp;nbsp;space—&lt;/p&gt;
&lt;p&gt;Okay, quick unpacking here. Remember that the virtual memory space is where all our devices get an address? Hard drives, &lt;span class="caps"&gt;USB&lt;/span&gt; devices, network interfaces, … and of course, physical memory (also known as “&lt;span class="caps"&gt;RAM&lt;/span&gt;”—yes, the same &lt;span class="caps"&gt;RAM&lt;/span&gt; you usually see on the specs of computers). Programs request data from and send data to these devices by using their virtual memory addresses. Each cell in physical memory also gets an address in virtual memory&amp;nbsp;space.&lt;/p&gt;
&lt;p&gt;Within the physical memory part of the virtual memory space, there are portions which are set aside &lt;em&gt;for the &lt;span class="caps"&gt;OS&lt;/span&gt; only&lt;/em&gt;. This is the &lt;strong&gt;kernel address space&lt;/strong&gt;, which is where critical information such as user privilege tables and &lt;span class="caps"&gt;OS&lt;/span&gt; state get stored. Knowing the addresses in the kernel address space is a big requirement for many exploits, so &lt;span class="caps"&gt;OS&lt;/span&gt; engineers obviously put a lot of work into making sure they are as hard to guess or discover as&amp;nbsp;possible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; For Meltdown and Spectre to work, they need two things: (1) Permission to carry out instructions (i.e. run programs) on the &lt;span class="caps"&gt;OS&lt;/span&gt;, and (2) knowledge of where the kernel address space&amp;nbsp;is.&lt;/p&gt;
&lt;p&gt;Problem (1) has been with us since the operating system was born. Problem (2) is also not new: it’s basically figuring out where the &lt;span class="caps"&gt;HQ&lt;/span&gt; is. Spies have also been doing that since time immemorial. But we are now dealing with a space where humans cannot tread: the virtual memory space. Hackers are sending preprogrammed chunks of compiled code into the computer to sniff out data-loot and get it out, while we are programming computers to try to detect such attempts and warn us about them or stop them&amp;nbsp;outright.&lt;/p&gt;
&lt;p&gt;Meanwhile, the processor manufacturers are trying to make everything happen faster. They are, of course, trying to prevent hackers from doing their thing, but it’s hard to do that while also trying to make things go faster. Next issue, I’ll try to show you how many of these fixes (whether complete or incomplete) inevitably lead to lower &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;performance.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S5] Issue 64: Fixing Meltdown and&amp;nbsp;Spectre&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;li&gt;What is a password hash? [Issue&amp;nbsp;63]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cache"></category></entry><entry><title>Issue 62: Cache snooping</title><link href="https://ngjunsiang.github.io/laymansguide/issue062.html" rel="alternate"></link><published>2020-03-03T17:00:00+08:00</published><updated>2020-03-03T17:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-03-03:/laymansguide/issue062.html</id><summary type="html"></summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; A cache miss is slow, and a cache hit is fast. This difference in cache reading speed can be used to transmit secrets out from the cache, which cannot be read directly by&amp;nbsp;programs.&lt;/p&gt;
&lt;p&gt;Okay, okay, we managed to leak data from memory to the cache, now how do we leak it from the cache to our&amp;nbsp;program?&lt;/p&gt;
&lt;h2&gt;Cache snooping: tapping on&amp;nbsp;tiles&lt;/h2&gt;
&lt;p&gt;Here in Singapore, before moving in to a newly built apartment, we have a “ritual” of tapping each ceramic floor tile to check that it has been properly fastened to the&amp;nbsp;ground.&lt;/p&gt;
&lt;p&gt;Most people don’t actually know what a fastened or unfastened floor tile sounds like. What we do know is that they sound &lt;em&gt;different&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;So we go &lt;em&gt;tok&lt;/em&gt;, &lt;em&gt;tok&lt;/em&gt;, &lt;em&gt;tok&lt;/em&gt;, &lt;em&gt;tok&lt;/em&gt;, &lt;em&gt;tok&lt;/em&gt;, … &lt;em&gt;tik&lt;/em&gt;! Aha, there’s a loosened floor&amp;nbsp;tile!&lt;/p&gt;
&lt;p&gt;That’s kind of what we are going to do to the cache. We are going to load information located in memory cells addressed 1 through 256, and see how long each request&amp;nbsp;takes.&lt;/p&gt;
&lt;p&gt;Address 1: 135 ns&lt;br /&gt;
Address 2: 134 ns&lt;br /&gt;
Address 3: 136 ns&lt;br /&gt;
Address 4: 134 ns&lt;br /&gt;
…&lt;br /&gt;
Address 136: 130 ns&lt;br /&gt;
Address 137: 66 ns&lt;br /&gt;
Address 138: 137 ns&lt;br /&gt;
…&lt;br /&gt;
Address 256: 135&amp;nbsp;ns&lt;/p&gt;
&lt;p&gt;Can you tell what the secret number is? It’s the one with an &lt;em&gt;obviously lower&lt;/em&gt; request latency. In this case, the other addresses didn’t have a copy of their data already in the cache, so they result in a cache miss (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue057.html"&gt;Issue 57&lt;/a&gt;): the &lt;span class="caps"&gt;CPU&lt;/span&gt; has to go to main memory to read the data again, and that’s slow. Address 137 already had its data loaded before, and a copy of it was already in the cache, so loading it again results in a cache hit and is&amp;nbsp;fast.&lt;/p&gt;
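&lt;p&gt;Picking out the “loose tile” is easy to automate. Here is a minimal sketch in Python, assuming the timings have already been measured. The name &lt;code&gt;find_loose_tile&lt;/code&gt; is made up for illustration, and every address not listed above is filled in with an assumed 135 ns, since the text elides&amp;nbsp;them.&lt;/p&gt;

```python
# Simulated load latencies in nanoseconds, keyed by address.
# Only a few of the 256 readings are given in the text; the rest are
# filled in here with an assumed typical miss latency of 135 ns.
latencies = {address: 135 for address in range(1, 257)}
latencies.update({1: 135, 2: 134, 3: 136, 4: 134, 136: 130, 137: 66, 138: 137})


def find_loose_tile(latency_by_address):
    """Return the address whose load was fastest (the likely cache hit)."""
    return min(latency_by_address, key=latency_by_address.get)


secret = find_loose_tile(latencies)
print(secret, latencies[secret])
```

&lt;p&gt;A real attack would probably repeat the measurement and compare against a threshold, because real timings are noisy; taking the single fastest reading is the simplest version of the&amp;nbsp;idea.&lt;/p&gt;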
&lt;h2&gt;Treating memory addresses as&amp;nbsp;data&lt;/h2&gt;
&lt;p&gt;One key thing to remember here is that each memory address points to a memory “cell”, which only stores one byte (8 bits), with a value that can run from 0 to 255 to give us 256 (i.e. 2^8) different&amp;nbsp;values.&lt;/p&gt;
&lt;p&gt;Meltdown or Spectre has gotten the secret number (137), but in the small window of opportunity before it gets terminated, it would not even have time to store it in a text file that we can open later. How could we get that secret number without Meltdown or Spectre storing&amp;nbsp;it?&lt;/p&gt;
&lt;p&gt;We can write a snooping program to do the&amp;nbsp;following:&lt;/p&gt;
&lt;p&gt;1) Empty the cache cells for memory addresses 1 to 256, so that loading information from them would result in a cache&amp;nbsp;miss.&lt;/p&gt;
&lt;p&gt;Then instead of storing the value 137 somewhere, we would get Meltdown/Spectre to &lt;strong&gt;load&lt;/strong&gt; information from memory address 137. A load operation is much faster than a store operation, and Meltdown/Spectre would be able to pull this off within the window of opportunity. This would cause a copy of the information in memory address 137 to be stored in the cache; the next time any program tries to load information from address 137 again, it will be a cache hit&amp;nbsp;(fast).&lt;/p&gt;
&lt;p&gt;The snooping program would&amp;nbsp;then:&lt;/p&gt;
&lt;p&gt;2) Make requests for information from each of these 256 memory addresses (“tapping on tiles”) and see which request has an &lt;em&gt;obviously lower&lt;/em&gt;&amp;nbsp;latency.&lt;/p&gt;
&lt;p&gt;3) Determine that memory address 137 has obviously lower request latency, and store the “transmitted” secret:&amp;nbsp;“137”&lt;/p&gt;
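&lt;p&gt;The three steps above, plus the load done by Meltdown/Spectre in between, can be sketched as a toy simulation in Python. This is not exploit code: &lt;code&gt;ToyCache&lt;/code&gt;, &lt;code&gt;leak_one_byte&lt;/code&gt; and the two latency constants are assumptions made for illustration, and the “cache” is just a set that remembers which addresses have a&amp;nbsp;copy.&lt;/p&gt;

```python
MISS_NS = 135  # assumed latency of a load that goes to main memory
HIT_NS = 66    # assumed latency of a load served from the cache


class ToyCache:
    """A pretend cache: it only remembers which addresses have a copy."""

    def __init__(self):
        self.cached = set()

    def flush(self, addresses):
        # Step 1: empty the cache cells so the next load of each is a miss.
        self.cached.difference_update(addresses)

    def load(self, address):
        # Report a simulated latency, then keep a copy in the cache.
        latency = HIT_NS if address in self.cached else MISS_NS
        self.cached.add(address)
        return latency


def leak_one_byte(secret, cache):
    """Transmit one value (1 to 256) through the toy cache and read it back."""
    probe_addresses = range(1, 257)
    cache.flush(probe_addresses)
    # The Meltdown/Spectre side: load from the address equal to the secret.
    cache.load(secret)
    # Step 2: tap every tile and note how long each load takes.
    timings = {address: cache.load(address) for address in probe_addresses}
    # Step 3: the fastest address is the transmitted value.
    return min(timings, key=timings.get)


recovered = leak_one_byte(137, ToyCache())
print(recovered)
```

&lt;p&gt;Notice that the secret value is never stored or returned by the “attacker” side. It only decides &lt;em&gt;which&lt;/em&gt; address gets loaded, and the snooping side works it out purely from&amp;nbsp;timing.&lt;/p&gt;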
&lt;p&gt;It’s a lot of work to get a single byte (256 possible values), but computers are good at doing lots of tedious work in a short amount of time. Using sample working code that exploits out-of-order execution (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue058.html"&gt;Issue 58&lt;/a&gt;) and speculative processing (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue060.html"&gt;Issue 60&lt;/a&gt;), coupled with a snooping program like the one we described above, the Meltdown and Spectre authors are able to leak data at a rate of about 503 &lt;span class="caps"&gt;KB&lt;/span&gt;/s, which seems slow. But there are 86,400 seconds in a day, so that’s roughly 43 &lt;span class="caps"&gt;GB&lt;/span&gt;/day at full exploit speed! (There are 4 videos of demonstration exploits near the bottom of the &lt;a href="https://meltdownattack.com/"&gt;Meltdown page&lt;/a&gt;.) Malicious actors would probably do it at a slower rate to keep it covert, but in the weeks or months it would take to notice something was amiss with the memory access operations, that’s a lot of data they can siphon off …&amp;nbsp;.&lt;/p&gt;
&lt;p&gt;We’ve covered quite a bit of technical ground, so I’ll&amp;nbsp;summarise.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt;
To prep the cache, our program empties addresses 1 to 256, so that they are guaranteed to have a cache miss if their information is&amp;nbsp;loaded.&lt;/p&gt;
&lt;p&gt;To cache snoop (after Meltdown/Spectre have “delivered the payload”), we load information from memory addresses 1 to 256 and look for the one with an obviously lower request latency (a cache hit). The memory address itself is the value to&amp;nbsp;keep.&lt;/p&gt;
&lt;p&gt;Okay, that’s it. Secret is leaked, cat is out of the bag, and now you know how Meltdown and Spectre work, without all the technical detail (like how addresses 1 to 256 need to be in separate pages which are 4 KiB each because the &lt;span class="caps"&gt;CPU&lt;/span&gt; will speculatively load adjacent data from memory, &lt;em&gt;yaddah yaddah&lt;/em&gt;).&lt;/p&gt;
&lt;p&gt;So what can we do about it? Why hasn’t Intel fixed it after a year? I’m no computer engineer, but I’ll offer some thoughts in the next&amp;nbsp;issue.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S5] Issue 63: Limitations of Meltdown and&amp;nbsp;Spectre&lt;/p&gt;
&lt;p&gt;This isn’t even a fraction of 1% of what happens inside a &lt;span class="caps"&gt;CPU&lt;/span&gt;. It’s hard to convey just how complex &lt;span class="caps"&gt;CPU&lt;/span&gt; design is; no single person can explain in full detail how every part of the &lt;span class="caps"&gt;CPU&lt;/span&gt; works. Much of the design and validation work is already being done by software, but it still takes a human to write the code that does the&amp;nbsp;checking.&lt;/p&gt;
&lt;p&gt;Does it surprise you that a little hack like this can get past so many pairs of eyes? It really&amp;nbsp;shouldn’t.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cache"></category><category term="cpu"></category></entry><entry><title>Issue 61: Mapping the cache</title><link href="https://ngjunsiang.github.io/laymansguide/issue061.html" rel="alternate"></link><published>2020-02-22T08:00:00+08:00</published><updated>2020-02-22T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-02-22:/laymansguide/issue061.html</id><summary type="html">&lt;p&gt;A cache miss is slow, and a cache hit is fast. This difference in cache reading speed can be used to transmit secrets out from the cache, which cannot be read directly by&amp;nbsp;programs.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Speculative execution is a feature that lets the &lt;span class="caps"&gt;CPU&lt;/span&gt; speed up execution if it correctly predicts a decision point. The &lt;span class="caps"&gt;CPU&lt;/span&gt; carries out the operations along the predicted decision branch and loads the results if it predicts&amp;nbsp;correctly.&lt;/p&gt;
&lt;p&gt;Meltdown and Spectre need 2 pieces of the puzzle to leak data, and we have covered the first piece already: How to load the forbidden information into the cache, where it will not be immediately wiped by the &lt;span class="caps"&gt;OS&lt;/span&gt; when we are “found&amp;nbsp;out”.&lt;/p&gt;
&lt;p&gt;If we were trying to pull off a Meltdown or Spectre, we would try&amp;nbsp;to:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Set up the request to have the info loaded into the&amp;nbsp;cache&lt;/li&gt;
&lt;li&gt;Attempt to read the cache &amp;#8230;&amp;nbsp;how?&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The second piece of the puzzle, naturally, is how to get the info out of the cache before the &lt;span class="caps"&gt;CPU&lt;/span&gt; eventually evicts old data from&amp;nbsp;it.&lt;/p&gt;
&lt;h2&gt;Failure from the&amp;nbsp;start&lt;/h2&gt;
&lt;p&gt;At this point, we would have failed. We have gotten the secret into the cache, but we have no idea where it is in the cache, and we have no way to access the cache directly—remember that the cache is managed by the &lt;span class="caps"&gt;CPU&lt;/span&gt; and there is no instruction we can issue to the &lt;span class="caps"&gt;CPU&lt;/span&gt; to give us cache data&amp;nbsp;directly.&lt;/p&gt;
&lt;p&gt;We’ve come so far … and it doesn’t even&amp;nbsp;matter.&lt;/p&gt;
&lt;p&gt;We’ll need to modify our approach slightly. We can’t store the leaked data directly in the cache naïvely like that. We’ve got to be a little&amp;nbsp;cleverer.&lt;/p&gt;
&lt;h2&gt;The cache “mirrors” a part of virtual&amp;nbsp;memory&lt;/h2&gt;
&lt;p&gt;A quick refresher on how the cache works (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue057.html"&gt;Issue 57&lt;/a&gt;):&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;When the &lt;span class="caps"&gt;CPU&lt;/span&gt; needs data from a memory address, it looks in the cache&amp;nbsp;first.&lt;/li&gt;
&lt;li&gt;If the data is not there (a &lt;strong&gt;cache miss&lt;/strong&gt;), it will load the data from the memory address, and store a copy in the cache for faster reference in future. [&lt;strong&gt;&lt;span class="caps"&gt;SLOW&lt;/span&gt;&lt;/strong&gt;]&lt;/li&gt;
&lt;li&gt;If there is a cache hit, the data from the cache will be returned. [&lt;strong&gt;&lt;span class="caps"&gt;FAST&lt;/span&gt;&lt;/strong&gt;]&lt;/li&gt;
&lt;/ol&gt;
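&lt;p&gt;The three steps in this refresher can be sketched in a few lines of Python. This is only a toy model: the names &lt;code&gt;main_memory&lt;/code&gt;, &lt;code&gt;cache&lt;/code&gt; and &lt;code&gt;read&lt;/code&gt;, the memory contents and the two latency figures are all assumptions for illustration, and a real cache also has a limited size and evicts old&amp;nbsp;data.&lt;/p&gt;

```python
MEMORY_NS = 135  # assumed cost of going to main memory
CACHE_NS = 66    # assumed cost of a cache hit

# Made-up memory contents: each address simply holds twice its own number.
main_memory = {address: address * 2 for address in range(1, 257)}
cache = {}


def read(address):
    """Return (value, latency_ns, outcome) for one load."""
    if address in cache:                        # 1. look in the cache first
        return cache[address], CACHE_NS, "hit"  # 3. cache hit: fast
    value = main_memory[address]                # 2. cache miss: go to memory (slow)
    cache[address] = value                      #    and keep a copy for next time
    return value, MEMORY_NS, "miss"


print(read(42))  # first load of this address
print(read(42))  # second load of the same address
```

&lt;p&gt;The value returned is the same both times. The only thing that changes between the first and second load is how long it&amp;nbsp;takes.&lt;/p&gt;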
&lt;p&gt;Hmm … there’s something here. A cache miss is slow, and a cache hit is fast. Could we exploit this in some way, possibly? If we are creative,&amp;nbsp;yes!&lt;/p&gt;
&lt;p&gt;Many secret ways of transmitting information involve a shared cipher, a secret way of converting what is sent to what is meant. Leaking cache information will require a cipher of some&amp;nbsp;sort.&lt;/p&gt;
&lt;p&gt;It’s like a &lt;span class="caps"&gt;WWII&lt;/span&gt; spy story. Two spies arrange 3 different dropoff locations. Dropoff location 1 means their country is going to attack. Dropoff location 2 means their country is not going to attack. And dropoff location 3 means the information is compromised and they should avoid contact. Even if they are caught by the secret police, there is no way of figuring out what the two spies had communicated to each other&amp;nbsp;indirectly.&lt;/p&gt;
&lt;p&gt;All right, I’m writing a newsletter here, not a workshop. And the rest of the story will need more technical detail, so let’s call it a week. Next issue, all will be revealed&amp;nbsp;;)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; A cache miss is slow, and a cache hit is fast. This difference in cache reading speed can be used to transmit secrets out from the cache, which cannot be read directly by&amp;nbsp;programs.&lt;/p&gt;
&lt;p&gt;I know, I know, what a cliffhanger! Before you started reading this newsletter, you never thought you’d be waiting with bated breath to hear some technical explanation of how to read data from a &lt;span class="caps"&gt;CPU&lt;/span&gt; cache, huh? Or that you might (in the next issue) find a newfound appreciation for an ingenious &lt;span class="caps"&gt;CPU&lt;/span&gt; vulnerability exploit, and for just how difficult it would be to fully resolve&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;We are getting close to the big reveal. Same time next&amp;nbsp;week.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S5] Issue 62: Snooping the&amp;nbsp;cache&lt;/p&gt;
&lt;p&gt;I was about to write both the mapping and the snooping in one issue, then I momentarily lost my train of thought and was trying to trace it again. And I realised that if I could lose the train of logic like that, I probably should split it up into two issues. One idea per issue, and I will still try to stick to it. I haven’t been able to write short issues that communicate a single idea, and it feels good to achieve it&amp;nbsp;again.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cache"></category><category term="cpu"></category><category term="memory"></category></entry><entry><title>Issue 60: CPU Optimisation Part 2 – Speculative Execution and Spectre</title><link href="https://ngjunsiang.github.io/laymansguide/issue060.html" rel="alternate"></link><published>2020-02-15T08:00:00+08:00</published><updated>2020-02-15T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-02-15:/laymansguide/issue060.html</id><summary type="html">&lt;p&gt;Speculative execution is a feature that lets the &lt;span class="caps"&gt;CPU&lt;/span&gt; speed up execution if it correctly predicts a decision point. The &lt;span class="caps"&gt;CPU&lt;/span&gt; carries out the operations along the predicted decision branch and loads the results if it predicts&amp;nbsp;correctly.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; A set of instructions can trick a &lt;span class="caps"&gt;CPU&lt;/span&gt; into reordering load instructions so that the data is temporarily loaded into the cache before the instructions are retired. The cache can then be snooped to retrieve the&amp;nbsp;data.&lt;/p&gt;
&lt;p&gt;At the heart of the matter is the fact that the &lt;span class="caps"&gt;OS&lt;/span&gt; has no control over the order in which instructions are carried out. Because of this, hackers who understand how the &lt;span class="caps"&gt;CPU&lt;/span&gt; reorders instructions can write malicious code that tricks the &lt;span class="caps"&gt;CPU&lt;/span&gt; into loading precious data into the cache for a fraction of a second, during which they can use cache-snooping techniques to read the&amp;nbsp;data.&lt;/p&gt;
&lt;p&gt;Before I go into the details of one cache-snooping technique, I want to outline another way that malicious code can get its targeted data into the cache. This exploits another feature, known as speculative&amp;nbsp;execution.&lt;/p&gt;
&lt;h2&gt;Speculative execution: the &lt;span class="caps"&gt;CPU&lt;/span&gt;’s way of&amp;nbsp;anticipating&lt;/h2&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="mf"&gt;010011011011101101000101&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;…&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Can you predict the next number in the sequence? Kinda tough&amp;nbsp;…&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="mf"&gt;1111111111111111&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;…&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;How about now?&amp;nbsp;Easier?&lt;/p&gt;
&lt;p&gt;We have all been in that workplace situation where we are waiting on a colleague to make a decision. If they choose A, we have to perform one set of routines. If they choose B, we have to perform another set of&amp;nbsp;routines.&lt;/p&gt;
&lt;p&gt;If our past experience with this person tells us that there is no pattern to what they will choose in such a situation, it is very difficult to proceed until they have made their choice. However, if they regularly choose A and occasionally choose B, that’s another story. Especially if they take a long time to make their&amp;nbsp;decision.&lt;/p&gt;
&lt;p&gt;To speed up the process, we might just carry out the set of routines for A, wait for them to say “I choose A”, then give them the results—&lt;em&gt;tada&lt;/em&gt;! And if they choose B instead, secretly dump the evidence and curse our&amp;nbsp;luck.&lt;/p&gt;
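&lt;p&gt;To put a number on “easier to predict”, here is a small Python sketch that scores the simplest guessing rule I can think of: assume the next choice is the same as the last one. The function name &lt;code&gt;last_choice_accuracy&lt;/code&gt; is made up, and this rule is only a stand-in for illustration; real &lt;span class="caps"&gt;CPU&lt;/span&gt; branch predictors are far more&amp;nbsp;elaborate.&lt;/p&gt;

```python
def last_choice_accuracy(history):
    """Fraction of decisions guessed right by predicting 'same as last time'."""
    pairs = zip(history, history[1:])  # (previous choice, actual choice)
    correct = sum(previous == actual for previous, actual in pairs)
    return correct / (len(history) - 1)


erratic = "010011011011101101000101"  # the first sequence above
steady = "1111111111111111"           # the second sequence above

print(last_choice_accuracy(erratic))
print(last_choice_accuracy(steady))
```

&lt;p&gt;The steady sequence is guessed right every time, so starting work early always pays off. With the erratic one, much of that early work would have to be thrown&amp;nbsp;away.&lt;/p&gt;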
&lt;h2&gt;Another model: the car&amp;nbsp;valet&lt;/h2&gt;
&lt;p&gt;How would this work with a more concrete example? I could reuse the bank teller model from the Meltdown explanation, but I run the risk of muddling you up since the steps will look very similar. Instead, let’s model a pair of robot car valets. This pair still consists of a robot &lt;span class="caps"&gt;ALU&lt;/span&gt; (arithmetic logic unit) and &lt;span class="caps"&gt;LSU&lt;/span&gt; (load-store unit). The &lt;span class="caps"&gt;ALU&lt;/span&gt;, the brains of the pair, asks the customer for his &lt;span class="caps"&gt;ID&lt;/span&gt; number and checks his driver’s license before handing over the car keys. The &lt;span class="caps"&gt;LSU&lt;/span&gt;, the brawn of the pair, well, just parks or retrieves the&amp;nbsp;vehicle.&lt;/p&gt;
&lt;p&gt;Let’s exploit this &lt;span class="caps"&gt;CPU&lt;/span&gt; model to find out what kind of car our secretive neighbour drives. I don’t have my neighbour’s &lt;span class="caps"&gt;ID&lt;/span&gt;, but I do know his &lt;span class="caps"&gt;ID&lt;/span&gt; number (23983698576), and I give it to the &lt;span class="caps"&gt;CPU&lt;/span&gt;. It carries out the following&amp;nbsp;instructions:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;GET&lt;/span&gt;&lt;/strong&gt; &lt;span class="caps"&gt;ID&lt;/span&gt; number[23983698576] from&amp;nbsp;customer&lt;/li&gt;
&lt;li&gt;Verify if I am the car owner &lt;em&gt;[&lt;span class="caps"&gt;SLOW&lt;/span&gt;]&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; verified, &lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; car of 23983698576 by driving it to retrieval point and pass keys to&amp;nbsp;customer&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; not verified, dump data and start over with the next&amp;nbsp;customer&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Sounds fair enough. The &lt;span class="caps"&gt;ALU&lt;/span&gt; finds out I am not my neighbour, and I don’t get to see his car. Awww. But let’s wait and see&amp;nbsp;…&lt;/p&gt;
&lt;h2&gt;Speculative&amp;nbsp;valeting&lt;/h2&gt;
&lt;p&gt;10 customers later, the &lt;span class="caps"&gt;CPU&lt;/span&gt; has been processing verified customers only. It goes into speculative execution mode (in a &lt;strong&gt;real &lt;span class="caps"&gt;CPU&lt;/span&gt;&lt;/strong&gt;, of course you can’t disable speculative execution just like that; it is always on). Now the &lt;span class="caps"&gt;CPU&lt;/span&gt; works this&amp;nbsp;way:&lt;/p&gt;
&lt;p&gt;[1.] &lt;strong&gt;&lt;span class="caps"&gt;GET&lt;/span&gt;&lt;/strong&gt; &lt;span class="caps"&gt;ID&lt;/span&gt; number[23983698576] from customer&lt;br /&gt;
[2a.] &lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; car of 23983698576 by driving it to retrieval point&lt;br /&gt;
[2b.] Verify if I am the car owner &lt;em&gt;[&lt;span class="caps"&gt;SLOW&lt;/span&gt;]&lt;/em&gt;&lt;br /&gt;
[3.] &lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; verified, pass car keys to customer&lt;br /&gt;
[4.] &lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; not verified, dump data and start over with the next&amp;nbsp;customer&lt;/p&gt;
&lt;p&gt;2a and 2b are carried out simultaneously. Have you figured out where the cache is in this model? It’s where the car is temporarily held: at the retrieval&amp;nbsp;point.&lt;/p&gt;
&lt;p&gt;10 customers later, the &lt;span class="caps"&gt;ALU&lt;/span&gt; checks my &lt;span class="caps"&gt;ID&lt;/span&gt;, and at the same time the &lt;span class="caps"&gt;LSU&lt;/span&gt; in good faith starts to drive my neighbour’s car to the retrieval point. It is astutely hidden from my direct view. But if I know the mode of operation of this valet beforehand, I have a small window of opportunity to sneak a peek at the car before the &lt;span class="caps"&gt;ALU&lt;/span&gt; figures out I’m not the owner and a cache flush occurs (i.e. the &lt;span class="caps"&gt;LSU&lt;/span&gt; removes the car from the retrieval point). Perhaps I could plant a camera at the retrieval point&amp;nbsp;…&lt;/p&gt;
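&lt;p&gt;The speculative valet and the planted camera can be sketched as a toy Python model. Everything here is hypothetical: the garage records, the function &lt;code&gt;speculative_valet&lt;/code&gt; and the &lt;code&gt;camera&lt;/code&gt; callback are made up for illustration. Steps 2a and 2b run one after the other in this sketch, whereas the valets do them at the same&amp;nbsp;time.&lt;/p&gt;

```python
# Hypothetical valet records, keyed by ID number.
GARAGE = {"23983698576": "neighbour's car"}
OWNER_ON_FILE = {"23983698576": "neighbour"}


def speculative_valet(id_number, customer, camera):
    """Run steps 1 to 4 of the speculative valet for one customer."""
    # 2a. The LSU drives the car to the retrieval point before verification ends.
    retrieval_point = [GARAGE[id_number]]
    # The window of opportunity: the camera records what is at the retrieval point.
    camera(list(retrieval_point))
    # 2b. The ALU verifies the customer (the slow step in the analogy).
    verified = OWNER_ON_FILE[id_number] == customer
    if verified:
        return "keys handed over"  # 3. the prediction was right
    retrieval_point.clear()        # 4. wrong guess: the car is taken away again
    return "request refused"


snapshots = []  # what the planted camera saw
result = speculative_valet("23983698576", "me", snapshots.append)
print(result, snapshots)
```

&lt;p&gt;The request is refused, exactly as it should be, yet the camera has already recorded the car. The refusal undoes the handover but not the&amp;nbsp;peek.&lt;/p&gt;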
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Speculative execution is a feature that lets the &lt;span class="caps"&gt;CPU&lt;/span&gt; speed up execution if it correctly predicts a decision point. The &lt;span class="caps"&gt;CPU&lt;/span&gt; carries out the operations along the predicted decision branch and loads the results if it predicts&amp;nbsp;correctly.&lt;/p&gt;
&lt;p&gt;And there you have it, two &lt;span class="caps"&gt;CPU&lt;/span&gt; features explained with robots. These are well-researched &lt;span class="caps"&gt;CPU&lt;/span&gt; features that have been used in CPUs for a long while … and nobody thought to thoroughly investigate ways in which this might be exploited for malicious&amp;nbsp;intent.&lt;/p&gt;
&lt;p&gt;You might blame this oversight on Intel, but I think I would blame the unpredictable nature of development. Early forts only needed walls, but not roofs, until catapults were invented. Hardware was invented to run fast, and the internet was designed to be robust, and very few people could predict accurately how they would be exploited for ill&amp;nbsp;intent.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S5] Issue 61: Mapping the&amp;nbsp;cache&lt;/p&gt;
&lt;p&gt;Okay, I’m done talking about the exploit part of Meltdown and Spectre. The scene freezes, goes into extreme time slowdown mode … the last 5 transactions are on the bank teller’s paper, and the neighbour’s car is at the retrieval point. The bank teller &lt;span class="caps"&gt;ALU&lt;/span&gt; is looking over my &lt;span class="caps"&gt;ID&lt;/span&gt;, checking various things, and the car valet &lt;span class="caps"&gt;ALU&lt;/span&gt; is verifying my &lt;span class="caps"&gt;ID&lt;/span&gt; … the quarry is at hand! Only a split second before they uncover the truth and the quarry is snatched&amp;nbsp;away!&lt;/p&gt;
&lt;p&gt;How are we going to snoop that precious cargo? You’ve watched enough heist movies, you know these things don’t happen without exhaustively detailed&amp;nbsp;planning.&lt;/p&gt;
&lt;p&gt;Let’s start planning our cache snoop next&amp;nbsp;issue.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cache"></category><category term="cpu"></category><category term="operating system"></category></entry><entry><title>Issue 59: Meltdown</title><link href="https://ngjunsiang.github.io/laymansguide/issue059.html" rel="alternate"></link><published>2020-02-08T08:00:00+08:00</published><updated>2020-02-08T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-02-08:/laymansguide/issue059.html</id><summary type="html">&lt;p&gt;A set of instructions can trick a &lt;span class="caps"&gt;CPU&lt;/span&gt; into reordering load instructions so that the data is temporarily loaded into the cache before the instructions are retired. The cache can then be snooped to retrieve the&amp;nbsp;data.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; The &lt;span class="caps"&gt;CPU&lt;/span&gt; comprises different types of execution units. All the execution units can run at the same time, but they may execute instructions over different numbers of clock cycles. To minimise wait time, &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions are carried out in an order that keeps the execution units busy as often as&amp;nbsp;possible.&lt;/p&gt;
&lt;h2&gt;Last issue: optimising the old-school robot bank&amp;nbsp;teller&lt;/h2&gt;
&lt;p&gt;Last issue, I modelled a simple &lt;span class="caps"&gt;CPU&lt;/span&gt; bank teller consisting of two units, an &lt;span class="caps"&gt;ALU&lt;/span&gt; (arithmetic logic unit), and an &lt;span class="caps"&gt;LSU&lt;/span&gt; (load-store unit). The &lt;span class="caps"&gt;ALU&lt;/span&gt; does the calculations, while the &lt;span class="caps"&gt;LSU&lt;/span&gt; loads from or stores data to memory. For the &lt;span class="caps"&gt;CPU&lt;/span&gt; to hum along optimally, the &lt;span class="caps"&gt;ALU&lt;/span&gt; should not be kept waiting for data, and the &lt;span class="caps"&gt;LSU&lt;/span&gt; should not be left twiddling its thumbs while the &lt;span class="caps"&gt;ALU&lt;/span&gt; is&amp;nbsp;working.&lt;/p&gt;
&lt;p&gt;By reordering the instructions that come in, we can optimise &lt;span class="caps"&gt;CPU&lt;/span&gt; usage by making sure the &lt;span class="caps"&gt;LSU&lt;/span&gt; is loading data for the next few instructions while the &lt;span class="caps"&gt;ALU&lt;/span&gt; is still working; this is known as &lt;strong&gt;out-of-order execution&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;For the &lt;span class="caps"&gt;CPU&lt;/span&gt; to give a customer his bank account balance, the following steps need to&amp;nbsp;happen:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;GET&lt;/span&gt;&lt;/strong&gt; &lt;span class="caps"&gt;ID&lt;/span&gt; from&amp;nbsp;customer&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; bank account owner from memory (using &lt;span class="caps"&gt;ID&lt;/span&gt;&amp;nbsp;number)&lt;/li&gt;
&lt;li&gt;Check that customer is the bank account owner (verifying other details) &lt;em&gt;[&lt;span class="caps"&gt;SLOW&lt;/span&gt;]&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; verified, &lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; bank account&amp;nbsp;balance&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;SEND&lt;/span&gt;&lt;/strong&gt; bank account balance to&amp;nbsp;customer&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Where I last stopped, we were optimising the robot bank teller by carrying out the two &lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; steps together. This helps to optimise &lt;span class="caps"&gt;CPU&lt;/span&gt; use, because while the &lt;span class="caps"&gt;ALU&lt;/span&gt; is busy carrying out operations to verify that the customer is the owner of the bank account, the &lt;span class="caps"&gt;LSU&lt;/span&gt; is loading the bank account details, ready to be used once the &lt;span class="caps"&gt;ALU&lt;/span&gt; is&amp;nbsp;done.&lt;/p&gt;
&lt;p&gt;Where do the bank account details go while the &lt;span class="caps"&gt;LSU&lt;/span&gt; is waiting for the &lt;span class="caps"&gt;ALU&lt;/span&gt;? In the case of the bank teller, they’re written on a piece of paper (yes, old-school, because analogy). In a real &lt;span class="caps"&gt;CPU&lt;/span&gt;, every piece of data requested by the &lt;span class="caps"&gt;CPU&lt;/span&gt; first goes through the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache. This means the cache holds a copy of the most recently requested data, and evicts the oldest data to make way for new data. The bank teller’s piece of paper is an analogy for the &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;cache.&lt;/p&gt;
&lt;h2&gt;Meltdown: the&amp;nbsp;exploit&lt;/h2&gt;
&lt;p&gt;Suppose I’m an ill-intentioned customer who wants to snoop on a neighbour’s bank transactions. I go up to the bank teller and ask it to retrieve the last 5 transactions of account &lt;span class="caps"&gt;ID&lt;/span&gt; 23983698576 (that’s my neighbour’s account &lt;span class="caps"&gt;ID&lt;/span&gt;, unknown to the robot&amp;nbsp;tellers).&lt;/p&gt;
&lt;p&gt;The bank tellers need to execute the following instructions. There is an implicit step after step 4 to ensure&amp;nbsp;security:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;GET&lt;/span&gt;&lt;/strong&gt; &lt;span class="caps"&gt;ID&lt;/span&gt;[23983698576] from&amp;nbsp;customer&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; bank account owner of [23983698576] from memory (written back to&amp;nbsp;cache)&lt;/li&gt;
&lt;li&gt;Check if I am the bank account owner &lt;em&gt;[&lt;span class="caps"&gt;SLOW&lt;/span&gt;]&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; verified, &lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; bank account balance of&amp;nbsp;[23983698576]&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; &lt;strong&gt;not verified&lt;/strong&gt;, dump data and start over with the next&amp;nbsp;customer&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;SEND&lt;/span&gt;&lt;/strong&gt; bank account balance to&amp;nbsp;me&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; last 5 transactions of [23983698576] from memory (written back to&amp;nbsp;cache)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;SEND&lt;/span&gt;&lt;/strong&gt; last 5 transactions to&amp;nbsp;me&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;However, after reordering for efficiency, the steps now look like&amp;nbsp;this:&lt;/p&gt;
&lt;p&gt;[1.] &lt;strong&gt;&lt;span class="caps"&gt;GET&lt;/span&gt;&lt;/strong&gt; &lt;span class="caps"&gt;ID&lt;/span&gt;[23983698576] from customer&lt;br /&gt;
[2.] &lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; bank account owner of [23983698576] from memory (written back to cache)&lt;br /&gt;
[3.] Check if I am the bank account owner &lt;em&gt;[&lt;span class="caps"&gt;SLOW&lt;/span&gt;]&lt;/em&gt;&lt;br /&gt;
[4.] &lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; verified, &lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; bank account balance of [23983698576]&lt;br /&gt;
[7.] &lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; last 5 transactions of [23983698576] from memory (written back to cache)&lt;br /&gt;
[5.] &lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; &lt;strong&gt;not verified&lt;/strong&gt;, dump data and start over with the next customer&lt;br /&gt;
[6.] &lt;strong&gt;&lt;span class="caps"&gt;SEND&lt;/span&gt;&lt;/strong&gt; bank account balance to me&lt;br /&gt;
[8.] &lt;strong&gt;&lt;span class="caps"&gt;SEND&lt;/span&gt;&lt;/strong&gt; last 5 transactions to&amp;nbsp;me&lt;/p&gt;
&lt;p&gt;While the &lt;span class="caps"&gt;ALU&lt;/span&gt; is carrying out authenticity checks in step 3, the &lt;span class="caps"&gt;LSU&lt;/span&gt; is simultaneously carrying out steps 4 and 7, the &lt;span class="caps"&gt;LOAD&lt;/span&gt; steps, to avoid sitting&amp;nbsp;idle.&lt;/p&gt;
&lt;p&gt;This also leaves a copy of the data in the cache; the &lt;span class="caps"&gt;LSU&lt;/span&gt; teller has written down the bank balance and last 5 transactions on a piece of paper while waiting for the &lt;span class="caps"&gt;ALU&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;When the &lt;span class="caps"&gt;ALU&lt;/span&gt; reaches step 5 and figures out I’m not the owner of that account, the tellers start over with the next customer and I get evicted from the queue (in a real &lt;span class="caps"&gt;CPU&lt;/span&gt;, the results of these instructions are thrown away instead of being retired, i.e. made final). But meanwhile, the papers on the desk don’t get&amp;nbsp;cleared!&lt;/p&gt;
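&lt;p&gt;The reordered steps above can be sketched as a toy program. This is only an illustration of the analogy, not real exploit code: the account data, the &lt;code&gt;serve&lt;/code&gt; function and the &lt;code&gt;scratch_paper&lt;/code&gt; dictionary (standing in for the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache) are all made up, and a real program cannot simply read the cache like&amp;nbsp;this.&lt;/p&gt;

```python
# Toy model of the robot tellers; every name and value here is hypothetical.
DATABASE = {
    "23983698576": {
        "owner": "neighbour",
        "balance": 1200,
        "transactions": [-20, 35, -5, -60, 100],
    },
}


def serve(customer, account_id, scratch_paper):
    """Run the reordered steps; scratch_paper stands in for the CPU cache."""
    account = DATABASE[account_id]

    # Steps 4 and 7 run early: the LSU loads while the ALU is still checking.
    scratch_paper["balance"] = account["balance"]
    scratch_paper["transactions"] = account["transactions"]

    # Step 3 only finishes now.
    verified = customer == account["owner"]

    if not verified:
        # Step 5: the request is dropped, but nobody clears the paper.
        return None

    # Steps 6 and 8: send the results to a verified customer.
    return scratch_paper["balance"], scratch_paper["transactions"]


paper = {}
result = serve("me", "23983698576", paper)
print(result)  # the teller refuses to answer
print(paper)   # but the neighbour's data is still written on the paper
```

&lt;p&gt;The teller never hands me the data, yet it is still sitting on the paper after I am turned away. Reading that paper is the cache snooping part, which this sketch deliberately leaves&amp;nbsp;out.&lt;/p&gt;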
&lt;h2&gt;Cache snooping: the oldest trick in the&amp;nbsp;book&lt;/h2&gt;
&lt;p&gt;If this sounds horrifying to you, remember that the &lt;em&gt;real&lt;/em&gt; &lt;span class="caps"&gt;CPU&lt;/span&gt; is just a bunch of transistors and it really isn’t all that smart. And remember that programs cannot access the cache directly; it is a hardware implementation detail (like the backroom of any business), and so this is considered normal&amp;nbsp;practice.&lt;/p&gt;
&lt;p&gt;But still, that leaves a small window of opportunity for me to crane my neck and try to snoop the paper. And that’s all the time I need to see my neighbour’s transactions, and even his bank&amp;nbsp;balance!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; A set of instructions can trick a &lt;span class="caps"&gt;CPU&lt;/span&gt; into reordering load instructions so that the data is temporarily loaded into the cache before the instructions are retired. The cache can then be snooped to retrieve the&amp;nbsp;data.&lt;/p&gt;
&lt;p&gt;Okay, I’ve left out the meaty details of cache snooping here, because there are a whole bunch of tricks to doing it, written up into white papers by cybersecurity researchers. Also this is a one-idea-a-week newsletter, and cache snooping is a whole ’nother idea. Also, I’ll get round to it&amp;nbsp;later.&lt;/p&gt;
&lt;p&gt;But first I want to talk about Spectre, which is another way of getting the desired data into the cache. Spectre exploits a different feature, known as speculative execution. It is also an intuitive concept, not difficult for normal folks to understand, and I’ll go straight into it next&amp;nbsp;issue.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S5] Issue 60: &lt;span class="caps"&gt;CPU&lt;/span&gt; Optimisation Part 2 – Speculative Execution and&amp;nbsp;Spectre&lt;/p&gt;
&lt;p&gt;Cache snooping is interesting to me, because things like this actually happen all the time &lt;span class="caps"&gt;IRL&lt;/span&gt;! What’s really going on here is that any business operation needs to have a place to put things, move things, work on things, in a way that is invisible to customers and outsiders. But making sure that these inner workings are truly invisible to other people is helluva&amp;nbsp;difficult.&lt;/p&gt;
&lt;p&gt;Consider, for instance, &lt;a href="https://www.theatlantic.com/magazine/archive/2019/05/stock-value-satellite-images-investing/586009/"&gt;this article from The Atlantic&lt;/a&gt;. It describes how some rich investors try to make more accurate predictions of their investments’ performance by buying satellite imagery of their factory or operations sites. By seeing visual data that is not readily available to other investors, they can better predict how those companies are really&amp;nbsp;performing.&lt;/p&gt;
&lt;p&gt;Cache snooping is another instance of hardware snooping, but at a different scale and scope. Just how hidden are our hardware implementations? It is difficult to think about ways people can obtain such dearly desired info if we are not those people; human ingenuity does seem almost&amp;nbsp;boundless!&lt;/p&gt;
&lt;p&gt;When we really try to do everything in a secure manner, often it means sacrificing performance for security. For instance, a &lt;span class="caps"&gt;CPU&lt;/span&gt; without out-of-order execution would not be subject to this leak risk. But it would also run &lt;strong&gt;1.26 to 2.4 times slower&lt;/strong&gt;, &lt;a href="https://randomascii.wordpress.com/2012/10/25/out-of-order-benefits/"&gt;according to Bruce Dawson of Google&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Ah, how to have our cake and eat it too&amp;nbsp;…&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cache"></category><category term="cpu"></category></entry><entry><title>Issue 58: CPU Optimisation Part 1 – Out-of-Order Processing</title><link href="https://ngjunsiang.github.io/laymansguide/issue058.html" rel="alternate"></link><published>2020-02-01T08:00:00+08:00</published><updated>2020-02-01T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-02-01:/laymansguide/issue058.html</id><summary type="html">&lt;p&gt;The &lt;span class="caps"&gt;CPU&lt;/span&gt; comprises different types of execution units. All the execution units can run at the same time, but they may execute instructions over different numbers of clock cycles. To minimise wait time, &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions are carried out in an order that keeps the execution units busy as often as&amp;nbsp;possible.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; The &lt;span class="caps"&gt;CPU&lt;/span&gt; stores data for ready access in the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache. Accessing data from the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache is much faster than accessing data from main memory. When the &lt;span class="caps"&gt;CPU&lt;/span&gt; needs data from a memory address, it looks in the cache first. If the data is not there (a &lt;strong&gt;cache miss&lt;/strong&gt;), it will load the data from the memory address, and store a copy in the cache for faster reference in future. The &lt;span class="caps"&gt;CPU&lt;/span&gt; cache is managed by the &lt;span class="caps"&gt;CPU&lt;/span&gt; and is invisible to the &lt;span class="caps"&gt;OS&lt;/span&gt;. Programs that need to ensure the data in the cache is “fresh” can perform a cache flush and&amp;nbsp;reload.&lt;/p&gt;
&lt;p&gt;In this issue, we look at one feature that CPUs use to speed up processing: out-of-order execution. “Out-of-order” makes it sound like something is broken in the &lt;span class="caps"&gt;CPU&lt;/span&gt;, but it really just means that the &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions it is given are not executed in the same order that they were fed to the &lt;span class="caps"&gt;CPU&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;If you have seen a busy Starbucks joint or Chinese restaurant at work, you would know that menu orders are not always carried out in the same order that they were taken (even if customers are eventually first-come-first-served). A fully staffed Starbucks joint or Chinese restaurant is not a single working unit, but a collection of specialised&amp;nbsp;units.&lt;/p&gt;
&lt;h2&gt;&lt;span class="caps"&gt;CPU&lt;/span&gt; execution&amp;nbsp;units&lt;/h2&gt;
&lt;p&gt;A &lt;span class="caps"&gt;CPU&lt;/span&gt; core comprises three types of execution&amp;nbsp;units:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;A&lt;/strong&gt;rithmetic &lt;strong&gt;L&lt;/strong&gt;ogic &lt;strong&gt;U&lt;/strong&gt;nit (&lt;strong&gt;&lt;span class="caps"&gt;ALU&lt;/span&gt;&lt;/strong&gt;): The &lt;span class="caps"&gt;ALU&lt;/span&gt; is responsible for carrying out integer&amp;nbsp;calculations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;F&lt;/strong&gt;loating &lt;strong&gt;P&lt;/strong&gt;oint &lt;strong&gt;U&lt;/strong&gt;nit (&lt;strong&gt;&lt;span class="caps"&gt;FPU&lt;/span&gt;&lt;/strong&gt;): The &lt;span class="caps"&gt;FPU&lt;/span&gt; is responsible for carrying out decimal&amp;nbsp;calculations&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;L&lt;/strong&gt;oad/&lt;strong&gt;S&lt;/strong&gt;tore &lt;strong&gt;U&lt;/strong&gt;nit (&lt;strong&gt;&lt;span class="caps"&gt;LSU&lt;/span&gt;&lt;/strong&gt;): The &lt;span class="caps"&gt;LSU&lt;/span&gt; is responsible for loading data from memory into the &lt;span class="caps"&gt;CPU&lt;/span&gt;, or storing data from the &lt;span class="caps"&gt;CPU&lt;/span&gt; into memory (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue055.html"&gt;Issue 55&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;An instruction decoding unit in the &lt;span class="caps"&gt;CPU&lt;/span&gt; decodes each instruction and sends it to the appropriate execution unit. All these units can work at the same time, and for maximum performance this is what you want to&amp;nbsp;happen.&lt;/p&gt;
&lt;h2&gt;Not all instructions are executed&amp;nbsp;equal(ly)&lt;/h2&gt;
&lt;p&gt;The &lt;span class="caps"&gt;CPU&lt;/span&gt; has an internal clock (called the &lt;strong&gt;&lt;span class="caps"&gt;CPU&lt;/span&gt; clock&lt;/strong&gt;) that regulates when things are done in the &lt;span class="caps"&gt;CPU&lt;/span&gt;. Everything in a &lt;span class="caps"&gt;CPU&lt;/span&gt; takes place in cycles. Every operation takes at least one cycle, but some operations which require more steps will require more&amp;nbsp;cycles.&lt;/p&gt;
&lt;p&gt;For instance, the &lt;span class="caps"&gt;ALU&lt;/span&gt; can carry out most operations in one or two cycles, but the &lt;span class="caps"&gt;FPU&lt;/span&gt; often needs four or more cycles to do its work (moving decimals is hard work!). The &lt;span class="caps"&gt;LSU&lt;/span&gt; clock cycle latency varies, depending on which part of the cache you are fetching from (the cache has different regions; some regions are closer to &lt;span class="caps"&gt;CPU&lt;/span&gt; cores while other regions are shared among all the cores and therefore further. I won’t go into deeper&amp;nbsp;detail.)&lt;/p&gt;
&lt;p&gt;Keeping all the execution units busy is getting more complex now,&amp;nbsp;eh?&lt;/p&gt;
&lt;h2&gt;Minimising wait time in a &lt;span class="caps"&gt;CPU&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;Let’s revisit the instructions from &lt;a href="https://ngjunsiang.github.io/laymansguide/issue053.html"&gt;Issue 53&lt;/a&gt;:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="mf"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kr"&gt;LOAD&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="n"&gt;R1&lt;/span&gt;
&lt;span class="mf"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mf"&gt;2&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="n"&gt;R1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;R2&lt;/span&gt;
&lt;span class="mf"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MOV&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;R2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MEM1011&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;The third instruction is to store data from the &lt;span class="caps"&gt;CPU&lt;/span&gt; register to main memory, and this is gonna take a little while. Sending subsequent instructions to the &lt;span class="caps"&gt;ALU&lt;/span&gt; immediately after the third instruction will result in some wastage of clock cycles: the &lt;span class="caps"&gt;ALU&lt;/span&gt; will just be sitting there, waiting for the data to be available in main memory before it can do its&amp;nbsp;thing.&lt;/p&gt;
&lt;p&gt;Why not schedule another instruction while waiting, one that does not depend on the result we are waiting for? It doesn’t matter if that instruction came later; if it can be executed now, we might as well do&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;This, in a nutshell-issue, is out-of-order&amp;nbsp;execution.&lt;/p&gt;
&lt;h2&gt;Analogy: old-school robot bank&amp;nbsp;teller&lt;/h2&gt;
&lt;p&gt;Let’s model a &lt;span class="caps"&gt;CPU&lt;/span&gt; core as two execution units: an &lt;span class="caps"&gt;ALU&lt;/span&gt; and an &lt;span class="caps"&gt;LSU&lt;/span&gt;. The &lt;span class="caps"&gt;ALU&lt;/span&gt; is a robot bank teller that does what the customer asks, while the &lt;span class="caps"&gt;LSU&lt;/span&gt; is a robot bank teller that retrieves data from and stores data back to the bank’s database (i.e. memory). Two such robot bank tellers work at a teller counter (&lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;core).&lt;/p&gt;
&lt;p&gt;If a customer needs to check their bank balance, the following instructions need to happen (like I said, this is old-school; no iBanking or ATMs here, because&amp;nbsp;analogy).&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;GET&lt;/span&gt;&lt;/strong&gt; &lt;span class="caps"&gt;ID&lt;/span&gt; from&amp;nbsp;customer&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; bank account owner from memory (using &lt;span class="caps"&gt;ID&lt;/span&gt;&amp;nbsp;number)&lt;/li&gt;
&lt;li&gt;Check that customer is the bank account owner (verifying other details) &lt;em&gt;[&lt;span class="caps"&gt;SLOW&lt;/span&gt;]&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;span class="caps"&gt;IF&lt;/span&gt;&lt;/em&gt; verified, &lt;strong&gt;&lt;span class="caps"&gt;LOAD&lt;/span&gt;&lt;/strong&gt; bank account&amp;nbsp;balance&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;span class="caps"&gt;SEND&lt;/span&gt;&lt;/strong&gt; bank account balance to&amp;nbsp;customer&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;If you’re wondering why steps 2 and 4 can’t happen at the same time … congratulations! You already understand out-of-order execution at an intuitive level. If the &lt;span class="caps"&gt;LSU&lt;/span&gt; can carry out steps 2 and 4 at the same time, the &lt;span class="caps"&gt;ALU&lt;/span&gt; can simply provide the bank balance once the customer is authenticated, or discard the information&amp;nbsp;otherwise.&lt;/p&gt;
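&lt;p&gt;To put rough numbers on this, here is a small sketch that compares the two orderings. The step durations (in made-up “ticks”) and the names &lt;code&gt;DURATION&lt;/code&gt;, &lt;code&gt;IN_ORDER&lt;/code&gt;, &lt;code&gt;OUT_OF_ORDER&lt;/code&gt; and &lt;code&gt;finish_times&lt;/code&gt; are my own assumptions for illustration. It also assumes the single &lt;span class="caps"&gt;LSU&lt;/span&gt; loads the balance right after loading the owner, while the &lt;span class="caps"&gt;ALU&lt;/span&gt; is still&amp;nbsp;checking.&lt;/p&gt;

```python
# Hypothetical durations in "ticks"; the numbers are invented for illustration.
DURATION = {
    "get_id": 1,
    "load_owner": 3,
    "check_owner": 5,   # the slow step
    "load_balance": 3,
    "send_balance": 1,
}


def finish_times(depends_on):
    """Return the tick at which each step finishes, given what it waits for."""
    done = {}
    # Dicts keep insertion order, so steps are visited in program order.
    for step in DURATION:
        start = max((done[d] for d in depends_on[step]), default=0)
        done[step] = start + DURATION[step]
    return done


# Strictly in order: every step waits for the one before it.
IN_ORDER = {
    "get_id": [],
    "load_owner": ["get_id"],
    "check_owner": ["load_owner"],
    "load_balance": ["check_owner"],
    "send_balance": ["load_balance"],
}

# Out of order: the LSU loads the balance while the ALU is still checking,
# and the balance is only sent once both of those are finished.
OUT_OF_ORDER = dict(
    IN_ORDER,
    load_balance=["load_owner"],
    send_balance=["check_owner", "load_balance"],
)

in_order_total = finish_times(IN_ORDER)["send_balance"]
reordered_total = finish_times(OUT_OF_ORDER)["send_balance"]
print(in_order_total, reordered_total)
```

&lt;p&gt;With these invented durations, the in-order teller needs 13 ticks while the reordered one needs 10, because the second load is hidden behind the slow check. The exact saving depends entirely on the numbers you plug&amp;nbsp;in.&lt;/p&gt;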
&lt;p&gt;This frees up the &lt;span class="caps"&gt;LSU&lt;/span&gt;, and if the &lt;span class="caps"&gt;LSU&lt;/span&gt;’s load is low enough we might even reduce robotpower and share one &lt;span class="caps"&gt;LSU&lt;/span&gt; between two teller counters, seeding android fears of restructuring and impending robot retrenchment … but let’s stop the analogy here for&amp;nbsp;today.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; The &lt;span class="caps"&gt;CPU&lt;/span&gt; comprises different types of execution units. All the execution units can run at the same time, but they may execute instructions over different numbers of clock cycles. To minimise wait time, &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions are carried out in an order that keeps the execution units busy as often as&amp;nbsp;possible.&lt;/p&gt;
&lt;p&gt;Some very smart people might harangue me about micro-ops, or about decode buffers, etc. My only answer to all such concerns is: not necessary at this point. Maybe in a future issue, if it is the linchpin in some layman&amp;nbsp;explanation.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; [&lt;span class="caps"&gt;LMG&lt;/span&gt; S5] Issue 59:&amp;nbsp;Meltdown&lt;/p&gt;
&lt;p&gt;This little optimisation step, of doing things in an order that keeps the &lt;span class="caps"&gt;CPU&lt;/span&gt; busy, looks innocuous enough. But once we combine it with some features of the cache, it leaves a little loophole that enables an attacker to snoop on data: this is Meltdown. Stay tuned, we’ll get to the meat of Meltdown next&amp;nbsp;issue!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cache"></category><category term="cpu"></category></entry><entry><title>Issue 57: Cache, the CPU’s working space</title><link href="https://ngjunsiang.github.io/laymansguide/issue057.html" rel="alternate"></link><published>2020-01-25T08:00:00+08:00</published><updated>2020-01-25T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-01-25:/laymansguide/issue057.html</id><summary type="html">&lt;p&gt;The &lt;span class="caps"&gt;CPU&lt;/span&gt; stores data for ready access in the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache. Accessing data from the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache is much faster than accessing data from memory. When the &lt;span class="caps"&gt;CPU&lt;/span&gt; needs data from a memory address, it looks in the cache first. If the data is not there (a &lt;strong&gt;cache miss&lt;/strong&gt;), it will load the data from the memory address, and store a copy in the cache for faster reference in future. The &lt;span class="caps"&gt;CPU&lt;/span&gt; cache is managed by the &lt;span class="caps"&gt;CPU&lt;/span&gt; and is invisible to the &lt;span class="caps"&gt;OS&lt;/span&gt;. Programs that need to ensure the data in the cache is “fresh” can perform a cache flush and&amp;nbsp;reload.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; The operating system is responsible for listing and managing the computer’s resources, making them available to programs running on the computer, and making sure they only use what they are allowed&amp;nbsp;to.&lt;/p&gt;
&lt;p&gt;Those who have been following Layman’s Guide since Season 3 will remember this term, &lt;strong&gt;caching&lt;/strong&gt;. I first introduced it at the end of Season 3, in &lt;a href="https://ngjunsiang.github.io/laymansguide/issue039.html"&gt;Issue 39&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Searching for anything takes time. Need to fill out a form? You need to search for a pen first. Need to call someone? Before speed dial and contacts apps existed, you used to need to look up a number in order to dial it. If you do it often enough, you would make sure you always had a pen with you, or you would write the number somewhere convenient for you to see so you don’t need to hunt for&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;Computers use the same trick, and it is called &lt;strong&gt;caching&lt;/strong&gt;. Any information it needs repeatedly which is unchanging is stored in a &lt;strong&gt;cache&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The &lt;span class="caps"&gt;DNS&lt;/span&gt; cache, which I introduced in that issue, is a place where hostnames (such as facebook.com) and their associated &lt;span class="caps"&gt;IP&lt;/span&gt; addresses are stored, so that we don’t need to keep looking up the &lt;span class="caps"&gt;IP&lt;/span&gt; address for&amp;nbsp;facebook.com.&lt;/p&gt;
&lt;p&gt;We use caches to reduce latency and speed up processing: the full &lt;span class="caps"&gt;DNS&lt;/span&gt; querying process takes a few hundred milliseconds, but looking up a &lt;span class="caps"&gt;DNS&lt;/span&gt; entry in the cache only takes a few milliseconds — that’s a speedup by a factor of&amp;nbsp;100!&lt;/p&gt;
&lt;h2&gt;How long does it take to transfer&amp;nbsp;data?&lt;/h2&gt;
&lt;p&gt;Let’s look at the transfer speeds and latencies for a few places where data can be&amp;nbsp;stored:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Hard disk drive (&lt;span class="caps"&gt;HDD&lt;/span&gt;): ≈5 ms response latency, 100 &lt;span class="caps"&gt;MB&lt;/span&gt;/s transfer&amp;nbsp;speed&lt;/li&gt;
&lt;li&gt;Solid state disk (&lt;span class="caps"&gt;SSD&lt;/span&gt;): up to 0.1 ms response latency, 0.5–1+ &lt;span class="caps"&gt;GB&lt;/span&gt;/s transfer&amp;nbsp;speed&lt;/li&gt;
&lt;li&gt;Physical memory (&lt;span class="caps"&gt;RAM&lt;/span&gt;): 0.1 µs (0.0001 ms) response latency, 20 &lt;span class="caps"&gt;GB&lt;/span&gt;/s transfer&amp;nbsp;speed&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;CPU&lt;/span&gt; register: &amp;lt;1 ns (&amp;lt;0.000001 ms) response&amp;nbsp;latency [1]&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;[1]: We seldom talk about the transfer speed of &lt;span class="caps"&gt;CPU&lt;/span&gt; registers, because each register only holds a few bytes (8 bytes on a typical 64-bit &lt;span class="caps"&gt;CPU&lt;/span&gt;) and the transfer is&amp;nbsp;near-instantaneous.&lt;/p&gt;
&lt;p&gt;A &lt;span class="caps"&gt;CPU&lt;/span&gt; register is a slot within the &lt;span class="caps"&gt;CPU&lt;/span&gt; (the same slots from &lt;a href="https://ngjunsiang.github.io/laymansguide/issue053.html"&gt;Issue 53&lt;/a&gt;) which it uses to hold the data it is&amp;nbsp;processing.&lt;/p&gt;
&lt;p&gt;Notice that the speed difference between each layer is more than 10×? If a computer did not have physical memory to store temporary data in, and had to transfer data to/from disk instead, it would be responding a thousand times more&amp;nbsp;slowly!&lt;/p&gt;
&lt;p&gt;A &lt;span class="caps"&gt;CPU&lt;/span&gt; can carry out operations very quickly on data loaded into its registers; it generally takes only a few nanoseconds for complex calculations to be done. Simple instructions (such as &lt;span class="caps"&gt;ADD&lt;/span&gt;) can even be done in less than 1&amp;nbsp;ns!&lt;/p&gt;
&lt;p&gt;This means that the limiting factor for &lt;span class="caps"&gt;CPU&lt;/span&gt; speed is actually loading and storing data. In the time it takes to load data from physical memory or store data to physical memory, the &lt;span class="caps"&gt;CPU&lt;/span&gt; could have performed about 100 simple operations. That’s &lt;em&gt;damn slow&lt;/em&gt; in &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;time!&lt;/p&gt;
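&lt;p&gt;If you want to check the arithmetic, here is a quick sketch using the rough latencies listed above, converted to nanoseconds. The names &lt;code&gt;LATENCY_NS&lt;/code&gt; and &lt;code&gt;slowdown&lt;/code&gt; are my own, and the register figure is an assumption: I take it as 1 ns, the upper bound given&amp;nbsp;above.&lt;/p&gt;

```python
# Rough latencies in nanoseconds, taken from the list above.
# The register figure is an upper bound (the text says under 1 ns).
LATENCY_NS = {
    "HDD": 5_000_000,   # about 5 ms
    "SSD": 100_000,     # up to 0.1 ms
    "RAM": 100,         # 0.1 microseconds
    "register": 1,      # under 1 ns
}


def slowdown(slower, faster):
    """How many times longer the slower storage takes to respond."""
    return LATENCY_NS[slower] // LATENCY_NS[faster]


# Compare each layer with the next faster one.
names = list(LATENCY_NS)
for slower, faster in zip(names, names[1:]):
    ratio = slowdown(slower, faster)
    print(f"{slower} responds about {ratio}x more slowly than {faster}")

# Simple operations the CPU could have done while waiting on one RAM access,
# assuming each simple operation takes about 1 ns.
ops_while_waiting = slowdown("RAM", "register")
print(ops_while_waiting)
```

&lt;p&gt;These are ballpark figures only; real hardware varies a lot, but the orders of magnitude are what&amp;nbsp;matter.&lt;/p&gt;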
&lt;h2&gt;&lt;span class="caps"&gt;CPU&lt;/span&gt; and cache&amp;nbsp;performance&lt;/h2&gt;
&lt;p&gt;So &lt;span class="caps"&gt;CPU&lt;/span&gt; designers included some cache on the &lt;span class="caps"&gt;CPU&lt;/span&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="caps"&gt;CPU&lt;/span&gt; cache: 0.001–0.040 µs response latency, 175 &lt;span class="caps"&gt;GB&lt;/span&gt;/s transfer&amp;nbsp;speed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Great, now we have some storage space that sits between physical memory and the &lt;span class="caps"&gt;CPU&lt;/span&gt;’s registers. It is only slightly slower than a &lt;span class="caps"&gt;CPU&lt;/span&gt; register, and much faster than physical&amp;nbsp;memory.&lt;/p&gt;
&lt;p&gt;Imagine working in an office that gives you a cubicle but no desk. When you need a piece of information, you have to go down the hallway to the filing cabinet, retrieve it, and return to your cubicle (no desk!). When you are done processing it (1 second), you have to put the results back in the filing cabinet, down the hallway … a process that takes about 100 seconds. &lt;span class="caps"&gt;SLOWWWWW&lt;/span&gt;!&lt;/p&gt;
&lt;p&gt;If you had a desk, you could put some papers on it, and retrieve them much more quickly (a few seconds). You would be 10× more&amp;nbsp;efficient!&lt;/p&gt;
&lt;p&gt;By simply including a cache on the &lt;span class="caps"&gt;CPU&lt;/span&gt;, its designers sped up its performance by a factor of more than&amp;nbsp;10.&lt;/p&gt;
&lt;h2&gt;Cache is not managed by the &lt;span class="caps"&gt;OS&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;Just as an organisation would not control what information you should have on your desk, an operating system does not control the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache. This feature is managed entirely by the &lt;span class="caps"&gt;CPU&lt;/span&gt; itself. The operating system is unable to see what is in the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache, and has no access to&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;Like other caches, when the &lt;span class="caps"&gt;CPU&lt;/span&gt; needs data from a memory address, it looks in the cache first. If the data is not there (a &lt;strong&gt;cache miss&lt;/strong&gt;), it will load the data from the memory address, and store a copy in the cache for faster reference in&amp;nbsp;future.&lt;/p&gt;
&lt;p&gt;Like other caches, this process has its own issues. The cache can fill up, requiring the &lt;span class="caps"&gt;CPU&lt;/span&gt; to eject old data so as to make way for fresher data. The cached data on the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache can also become outdated when other programs and instructions update the data in memory. Programs that absolutely need to ensure they get the freshest data from memory can issue special instructions to perform a &lt;strong&gt;cache flush&lt;/strong&gt; and &lt;strong&gt;cache reload&lt;/strong&gt;.&lt;/p&gt;
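&lt;p&gt;The look-in-the-cache-first behaviour can be sketched in a few lines. This is a simplified model, not how a real &lt;span class="caps"&gt;CPU&lt;/span&gt; cache is built: &lt;code&gt;TinyCache&lt;/code&gt; and &lt;code&gt;MAIN_MEMORY&lt;/code&gt; are hypothetical names, the memory contents are made up, and I assume the simplest possible rule of ejecting whatever was cached first. Real CPUs use more elaborate&amp;nbsp;rules.&lt;/p&gt;

```python
from collections import OrderedDict

# Made-up "main memory": each address simply holds twice its own number.
MAIN_MEMORY = {address: address * 2 for address in range(100)}


class TinyCache:
    """A toy cache that ejects the oldest entry when it is full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # address mapped to value, oldest first
        self.misses = 0

    def read(self, address):
        if address in self.lines:
            return self.lines[address]      # cache hit: no trip to memory
        self.misses += 1                    # cache miss: go to main memory
        value = MAIN_MEMORY[address]
        if len(self.lines) == self.capacity:
            self.lines.popitem(last=False)  # eject the oldest entry
        self.lines[address] = value         # keep a copy for next time
        return value


cache = TinyCache(capacity=2)
for address in [1, 2, 1, 3, 1]:
    cache.read(address)
print(cache.misses)
print(list(cache.lines))
```

&lt;p&gt;Notice that the last read of address 1 is a miss even though it was cached earlier: it had been ejected to make room for address 3. A cache this small fills up&amp;nbsp;quickly.&lt;/p&gt;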
&lt;p&gt;A cache flush empties out the cached data while preserving the memory address it is linked to. A cache reload, well, reloads the data from those memory addresses. These two terms, jargon for very technical operations that take place in the &lt;span class="caps"&gt;CPU&lt;/span&gt;, are being introduced because they are the linchpin of Meltdown and Spectre. We will get there in the next two&amp;nbsp;issues.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; The &lt;span class="caps"&gt;CPU&lt;/span&gt; stores data for ready access in the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache. Accessing data from the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache is much faster than accessing data from memory. When the &lt;span class="caps"&gt;CPU&lt;/span&gt; needs data from a memory address, it looks in the cache first. If the data is not there (a &lt;strong&gt;cache miss&lt;/strong&gt;), it will load the data from the memory address, and store a copy in the cache for faster reference in future. The &lt;span class="caps"&gt;CPU&lt;/span&gt; cache is managed by the &lt;span class="caps"&gt;CPU&lt;/span&gt; and is invisible to the &lt;span class="caps"&gt;OS&lt;/span&gt;. Programs that need to ensure the data in the cache is “fresh” can perform a cache flush and&amp;nbsp;reload.&lt;/p&gt;
&lt;p&gt;If &lt;span class="caps"&gt;CPU&lt;/span&gt; development had stopped at this point, Meltdown and Spectre would not have been possible … and we would have been stuck in the ’90s, somewhat. It is in human nature to try to exploit every last bit of available optimisation, and this is what happened with the design of&amp;nbsp;CPUs.&lt;/p&gt;
&lt;p&gt;As new manufacturing processes allowed computer engineers to cram more transistors into a &lt;span class="caps"&gt;CPU&lt;/span&gt;, the question arose: what should we do with more transistors? Just add more calculation units? Build new features into the &lt;span class="caps"&gt;CPU&lt;/span&gt;?&lt;/p&gt;
&lt;p&gt;Adding more calculation units made the design of CPUs much more complex (as anyone who has had to work alongside other team members doing the same job can attest). The most popular optimisations thus hinged on adding &lt;span class="caps"&gt;CPU&lt;/span&gt; features to ensure that it is fully utilised as much as&amp;nbsp;possible.&lt;/p&gt;
&lt;p&gt;In the next two issues, I will do a shallow dive into two of these features: out-of-order processing, and speculative execution. These will not be technical issues, because there are ready human analogues for such optimisations. You probably already do some of this at work, or even at&amp;nbsp;home!&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; &lt;span class="caps"&gt;CPU&lt;/span&gt; Optimisation Part 1 – Out-of-Order&amp;nbsp;Execution&lt;/p&gt;
&lt;p&gt;Out-of-order execution is a solution that we have all discovered at one point or other in our lives. When we have to manage multiple tasks and carry each one out as quickly as possible, we don’t always carry out the steps in a logical order, but in a manner that makes sense and lets us work as quickly as&amp;nbsp;possible.&lt;/p&gt;
&lt;p&gt;CPUs do this as well, to perform calculations much more quickly. More in the next issue of Layman’s&amp;nbsp;Guide.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cache"></category><category term="cpu"></category><category term="memory"></category></entry><entry><title>Issue 56: Operating Systems and resource management</title><link href="https://ngjunsiang.github.io/laymansguide/issue056.html" rel="alternate"></link><published>2020-01-18T08:00:00+08:00</published><updated>2020-01-18T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-01-18:/laymansguide/issue056.html</id><summary type="html">&lt;p&gt;The operating system is responsible for listing and managing the computer’s resources, making them available to programs running on the computer, and making sure they only use what they are allowed&amp;nbsp;to.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; The &lt;span class="caps"&gt;CPU&lt;/span&gt; just executes instruction after instruction after instruction. Each instruction may consist of loading data from a memory location, sending data to a memory location, or performing operations on the data it is&amp;nbsp;holding.&lt;/p&gt;
&lt;p&gt;If the &lt;span class="caps"&gt;CPU&lt;/span&gt; is mindless and simply carries out instructions, there must be some kind of a “higher mind” maintaining order and harmony within the &lt;span class="caps"&gt;CPU&lt;/span&gt; so that our programs don’t muck things up for each&amp;nbsp;other.&lt;/p&gt;
&lt;p&gt;In the early days of computing history, this higher mind was the programmer. In those days, a programmer had to mentally partition the limited memory space, and ensure that the programs being executed on the &lt;span class="caps"&gt;CPU&lt;/span&gt; don’t inadvertently muck up the memory in unexpected ways. This was manageable for a while: up to a few thousand, or tens of thousands of memory addresses, with a sensible set of rules. But as programs became more complex, and when multiple programs had to be run on the same computer, bugs started to creep in and become difficult to trace and&amp;nbsp;fix.&lt;/p&gt;
&lt;p&gt;Humans could no longer manage the &lt;span class="caps"&gt;CPU&lt;/span&gt;’s resources. It had to be automated. And so the operating system (&lt;strong&gt;&lt;span class="caps"&gt;OS&lt;/span&gt;&lt;/strong&gt;) was&amp;nbsp;born.&lt;/p&gt;
&lt;h2&gt;The operating system manages the computer’s&amp;nbsp;resources&lt;/h2&gt;
&lt;p&gt;An operating system has to do a few things at&amp;nbsp;minimum:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Enumerate the devices on the computer: checking all its available interfaces and listing the devices connected to each interface, to be made available to programs upon&amp;nbsp;request.&lt;/li&gt;
&lt;li&gt;Register device ports in the virtual memory address space. This includes physical memory, hard drive ports, printer ports, keyboard and mouse and other &lt;span class="caps"&gt;USB&lt;/span&gt; device ports, and so on. This makes the devices available to programs that need to load data from those devices, or send data to those&amp;nbsp;devices.&lt;/li&gt;
&lt;li&gt;Manage running programs: giving each program its own memory space, dividing up the available &lt;span class="caps"&gt;CPU&lt;/span&gt; time among programs so that each gets some runtime, allocating more memory to programs that request it, and reclaiming memory from programs that release&amp;nbsp;it.&lt;/li&gt;
&lt;li&gt;Enforce security by ensuring that programs only carry out instructions that they are allowed to. This is why Windows keeps bugging you about program permissions. This also ensures that guest users cannot access the data of other guest users, and cannot modify important system&amp;nbsp;files.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This is both an art and a science, and getting it right is an ongoing study. When an &lt;span class="caps"&gt;OS&lt;/span&gt; works well, instructions from different programs can be mixed into the same queue and executed by the &lt;span class="caps"&gt;CPU&lt;/span&gt; without the data getting mixed up, and programs cannot dabble in the private memory area of other&amp;nbsp;programs.&lt;/p&gt;
&lt;p&gt;But cybersecurity is a multi-billion dollar industry with good reason. Black hat hackers and cybersecurity researchers are constantly trying to find loopholes in the &lt;span class="caps"&gt;OS&lt;/span&gt; logic so as to access data they are not supposed to be able to access. In Meltdown and Spectre, the loophole is not a fault in the &lt;span class="caps"&gt;OS&lt;/span&gt; logic, but in a hardware feature of the &lt;span class="caps"&gt;CPU&lt;/span&gt; which I will explain in the next issue: the &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;cache.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; The operating system is responsible for listing and managing the computer’s resources, making them available to programs running on the computer, and making sure they only use what they are allowed&amp;nbsp;to.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Cache, the &lt;span class="caps"&gt;CPU&lt;/span&gt;’s working&amp;nbsp;space&lt;/p&gt;
&lt;p&gt;The pieces are in place now for me to introduce the crux of the matter: the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache. This is where the heart of Meltdown and Spectre takes place, and yet we cannot do away with it. Stay tuned to learn&amp;nbsp;why.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="operating system"></category><category term="memory"></category></entry><entry><title>Issue 55: Addressing memory</title><link href="https://ngjunsiang.github.io/laymansguide/issue055.html" rel="alternate"></link><published>2020-01-11T08:00:00+08:00</published><updated>2020-01-11T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-01-11:/laymansguide/issue055.html</id><summary type="html">&lt;p&gt;The life of the unconscious &lt;span class="caps"&gt;CPU&lt;/span&gt; is just executing instruction after instruction after instruction. Each instruction may consist of loading data from a memory location, sending data to a memory location, or performing operations on the data it is&amp;nbsp;holding.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; To get useful output from a &lt;span class="caps"&gt;CPU&lt;/span&gt;, we must translate the operations we want it to perform into &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions, in a process known as &lt;strong&gt;compiling&lt;/strong&gt;. Most compilers convert programming code into &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;instructions.&lt;/p&gt;
&lt;p&gt;A &lt;span class="caps"&gt;CPU&lt;/span&gt; executes instructions, which load data from memory, store data to memory, or carry out operations on loaded data. Where exactly does this data go, and how is it&amp;nbsp;organised?&lt;/p&gt;
&lt;h2&gt;Memory is a collection of bytes organised by&amp;nbsp;address&lt;/h2&gt;
&lt;p&gt;In Season 4, I mentioned the byte as a convenient collection of 8 bits. Part of the reason is that memory is organised by bytes. Each byte of memory has its own address. Naturally, in a &lt;span class="caps"&gt;CPU&lt;/span&gt;, this memory address will be encoded in&amp;nbsp;binary.&lt;/p&gt;
&lt;p&gt;Working out the numbers, a &lt;span class="caps"&gt;CPU&lt;/span&gt; will need 10-bit memory addresses to use 1 KiB of memory (2^10 = 1,024). 20-bit addresses will let it use 1 MiB of memory (2^20 = 1,048,576). 30-bit addresses will let it use 1 GiB of memory (2^30 = 1,073,741,824). And 32-bit addresses will let a &lt;span class="caps"&gt;CPU&lt;/span&gt; use 4 GiB of&amp;nbsp;memory.&lt;/p&gt;
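&lt;p&gt;The arithmetic above can be checked with a few lines of Python. This is just a sketch; the function name &lt;code&gt;addressable_bytes&lt;/code&gt; is made up for the&amp;nbsp;example.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;def addressable_bytes(address_bits):
    # Each extra address bit doubles the number of distinct addresses,
    # and each address points at one byte of memory.
    return 2 ** address_bits


GIB = 2 ** 30  # number of bytes in 1 GiB

for bits in (10, 20, 30, 32):
    print(f"{bits}-bit addresses: {addressable_bytes(bits):,} bytes")

# How many GiB can 32-bit addresses reach?
print(addressable_bytes(32) // GIB, "GiB")
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
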
&lt;p&gt;Are those numbers ringing a&amp;nbsp;bell?&lt;/p&gt;
&lt;h2&gt;The 32-bit to 64-bit transition in the&amp;nbsp;’00s&lt;/h2&gt;
&lt;p&gt;A little history, for those who remember: Around the turn of the century, in the ’00s, there was some hoo-ha about 32-bit CPUs not being able to use more than 4 GiB of memory; this was a time when 2 GiB of memory on a laptop was considered beefy, Google Chrome hadn’t appeared on the scene yet, and browsers did not use up gobs of&amp;nbsp;memory.&lt;/p&gt;
&lt;p&gt;This was also a time when 64-bit CPUs started coming onto the scene, and there was much confusion in the software world about which software would work on 32-bit CPUs, which ones would work on 64-bit CPUs, and which ones would work on&amp;nbsp;both.&lt;/p&gt;
&lt;p&gt;So this is what it boils down to: a 32-bit &lt;span class="caps"&gt;CPU&lt;/span&gt;, without any hacky workarounds, can only work with about 4 billion memory addresses, and this became insufficient around the turn of the century. We needed CPUs that could work with more than 4 billion addresses. 64-bit CPUs were the solution that the computing industry settled on. 64-bit memory addresses extend the addressable memory capacity to 16 EiB (2^64 bytes), which should be enough for the foreseeable&amp;nbsp;future.&lt;/p&gt;
&lt;h2&gt;16 EiB?! Why do we need so much&amp;nbsp;memory?&lt;/h2&gt;
&lt;p&gt;Hold your horses — I want to be clear here. I’m not just talking about memory here, but about &lt;strong&gt;memory addresses&lt;/strong&gt;. What’s the difference? Consider for a moment how the &lt;span class="caps"&gt;CPU&lt;/span&gt; would transfer data to the hard drive. Or send data to a printer. Or even send it out onto the network. How would those virtual “locations” be represented in a &lt;span class="caps"&gt;CPU&lt;/span&gt; instruction that can only handle memory&amp;nbsp;addresses?&lt;/p&gt;
&lt;p&gt;The most straightforward answer, which you may have some difficulty accepting, is that they are simply represented as memory addresses. Yep, in the entire space of memory addresses, most of it is used to address physical memory (what is known as &lt;strong&gt;R&lt;/strong&gt;andom &lt;strong&gt;A&lt;/strong&gt;ccess &lt;strong&gt;M&lt;/strong&gt;emory, or &lt;strong&gt;&lt;span class="caps"&gt;RAM&lt;/span&gt;&lt;/strong&gt;), while some of it is used to address hard drive devices, &lt;span class="caps"&gt;USB&lt;/span&gt; devices, network devices, and various other connected&amp;nbsp;peripherals.&lt;/p&gt;
&lt;h2&gt;Of instructions and&amp;nbsp;addresses&lt;/h2&gt;
&lt;p&gt;Let’s summarise the picture so&amp;nbsp;far.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; The life of the unconscious &lt;span class="caps"&gt;CPU&lt;/span&gt; is just executing instruction after instruction after instruction. Each instruction may consist of loading data from a memory location, sending data to a memory location, or performing operations on the data it is&amp;nbsp;holding.&lt;/p&gt;
&lt;p&gt;Not a very interesting life, but it forms the bedrock which supports everything we use a computer for. And things are about to get more complex once we throw programs into the picture. Each program is its own long list of &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions, meant to produce different results. Excel carries out our spreadsheet processing, while Word helps us to format our documents. Yet the instructions from both programs are carried out in the same &lt;span class="caps"&gt;CPU&lt;/span&gt;! How does the &lt;span class="caps"&gt;CPU&lt;/span&gt; avoid mixing up data from different programs? How does it prevent Word from accidentally screwing up Excel’s data, and&amp;nbsp;vice-versa?&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Operating Systems and resource&amp;nbsp;management&lt;/p&gt;
&lt;p&gt;Okay, I think I’ve laid out the basics of &lt;span class="caps"&gt;CPU&lt;/span&gt; operation in sufficient detail for now. I have yet to mention one key component—the &lt;span class="caps"&gt;CPU&lt;/span&gt; cache. And I have yet to explain how CPUs speed up processing. These two explanations will make more sense after I make a side trip about how operating systems prevent everything from becoming one gigantic&amp;nbsp;mess.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cpu"></category><category term="memory"></category></entry><entry><title>Issue 54: Compiling programming code into CPU instructions</title><link href="https://ngjunsiang.github.io/laymansguide/issue054.html" rel="alternate"></link><published>2020-01-04T08:00:00+08:00</published><updated>2020-01-04T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2020-01-04:/laymansguide/issue054.html</id><summary type="html">&lt;p&gt;To get useful output from a &lt;span class="caps"&gt;CPU&lt;/span&gt;, we must translate the operations we want it to perform into &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions, in a process known as &lt;strong&gt;compiling&lt;/strong&gt;. Most compilers convert programming code into &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;instructions.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; CPUs are unconscious slaves that simply execute instruction after instruction, at a very fast&amp;nbsp;rate.&lt;/p&gt;
&lt;p&gt;Last issue, I introduced the idea of the &lt;span class="caps"&gt;CPU&lt;/span&gt; as an unconscious instruction-executing machine. It cannot process programming code directly; that code must first be compiled into &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;instructions.&lt;/p&gt;
&lt;h2&gt;The compiler converts programming code to &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;instructions&lt;/h2&gt;
&lt;p&gt;Last issue, I showed you a short snippet of &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;instructions:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="mf"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kr"&gt;LOAD&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="n"&gt;R1&lt;/span&gt;
&lt;span class="mf"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mf"&gt;2&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="n"&gt;R1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;R2&lt;/span&gt;
&lt;span class="mf"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MOV&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;R2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MEM1011&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;But that’s not the kind of code we usually see in movies, on the screens of geeks, and in stock images. What&amp;nbsp;gives?&lt;/p&gt;
&lt;p&gt;Most code we see looks something like (example from&amp;nbsp;Python):&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;num1 = 1
num2 = 2
sum = num1 + num2
print(f&amp;#39;The sum of {num1} and {num2} is {sum}&amp;#39;)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;How does that get turned into &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions? That job is performed by a piece of software known as the &lt;strong&gt;compiler&lt;/strong&gt;.[1] The compiler compiles programming code into an &lt;strong&gt;executable file&lt;/strong&gt; (sometimes shortened to executable), which contains the actual instructions executed by the &lt;span class="caps"&gt;CPU&lt;/span&gt;. This is why, in Windows, some files have&amp;nbsp;a &lt;code&gt;.exe&lt;/code&gt; file extension: those are &lt;strong&gt;exe&lt;/strong&gt;cutable&amp;nbsp;files!&lt;/p&gt;
&lt;p&gt;[1]: Purists will argue with me that Python technically runs through an interpreter, not a compiler. For layfolks, the distinction between the two terms is not critical at this point, so I choose clarity over accuracy until I can delve into more detail in a future&amp;nbsp;issue.&lt;/p&gt;
&lt;p&gt;The compiler itself is also a piece of software that reads in programming code (a process known as &lt;strong&gt;parsing&lt;/strong&gt;), and follows its own instructions to break it down into &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;instructions.[2]&lt;/p&gt;
&lt;p&gt;[2]: If you find yourself wondering “how was the first compiler written? Which came first: the compiler code, or the compiler executable? How would a compiler compile its own code into its executable?”, you might be a prime candidate for a Computer Science degree programme&amp;nbsp;:)&lt;/p&gt;
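&lt;p&gt;If you are curious what this translation looks like, Python can show its own lower-level instructions through the standard-library &lt;code&gt;dis&lt;/code&gt; module. A caveat, in line with footnote 1: what you see here is &lt;strong&gt;bytecode&lt;/strong&gt; for Python’s interpreter, not real &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions, and the exact instruction names differ between Python versions. Treat this as a sketch of the idea&amp;nbsp;only.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;import dis

source = """
num1 = 1
num2 = 2
sum = num1 + num2
"""

# compile() parses the source text and produces a code object
# holding bytecode for the Python interpreter.
code = compile(source, "example", "exec")

# Collect the name of each low-level instruction, in order.
opnames = [instruction.opname for instruction in dis.get_instructions(code)]
print(opnames)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Each of the three assignments ends in a “store” instruction, much like the last line of the &lt;span class="caps"&gt;CPU&lt;/span&gt; instruction snippet&amp;nbsp;above.&lt;/p&gt;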
&lt;p&gt;Okay, I think I am done talking about &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions for now. On to the next piece of the puzzle:&amp;nbsp;memory.&lt;/p&gt;
&lt;h2&gt;Computer memory: addressable&amp;nbsp;bytes&lt;/h2&gt;
&lt;p&gt;In the &lt;span class="caps"&gt;CPU&lt;/span&gt; instruction snippet above, there was a line that involved storing data into&amp;nbsp;memory:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="mf"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MOV&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;R2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MEM1011&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;This line means “store the value in slot R2 into the memory location 1011”. Next issue, I will delve into what these memory locations are, and build out our mental model of how a &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;works.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; To get useful output from a &lt;span class="caps"&gt;CPU&lt;/span&gt;, we must translate the operations we want it to perform into &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions, in a process known as &lt;strong&gt;compiling&lt;/strong&gt;. Most compilers convert programming code into &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;instructions.&lt;/p&gt;
&lt;p&gt;A very short issue, just as I like it :) There’s something philosophical about the process of a &lt;span class="caps"&gt;CPU&lt;/span&gt; beginning with no knowledge of what to do, and slowly bootstrapping a library of code-to-instruction conversions through a compiler. These and other puzzles about information manipulation are what computer scientists love studying! And this is one good reason to differentiate Computer Science from general Computing: if you take up a degree in Computer Science and expect to learn more about general Computing, you might end up being&amp;nbsp;disappointed.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Addressing&amp;nbsp;memory&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;&lt;del&gt;compiling code into an application [Issue 26]?&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cpu"></category><category term="memory"></category></entry><entry><title>Issue 53: The CPU is an instruction-obeying slave</title><link href="https://ngjunsiang.github.io/laymansguide/issue053.html" rel="alternate"></link><published>2019-12-28T08:00:00+08:00</published><updated>2019-12-28T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-12-28:/laymansguide/issue053.html</id><summary type="html">&lt;p&gt;CPUs are unconscious slaves that simply execute instruction after instruction, at a very fast&amp;nbsp;rate.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; &lt;span class="caps"&gt;PDF&lt;/span&gt;’s markup language is more concerned with how things appear on the page than with what they were originally. Once the &lt;span class="caps"&gt;PDF&lt;/span&gt; is generated, it is almost impossible to retrieve the original data from it. Scanned documents that are converted to &lt;span class="caps"&gt;PDF&lt;/span&gt; may have a text layer generated by &lt;span class="caps"&gt;OCR&lt;/span&gt; that lets detected text be copied from&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;In Season 4, I laid out the basics of how data is represented: text, images, audio, video. I also explained how compression happens, and unpacked how these basic data types can be combined into more complex&amp;nbsp;documents.&lt;/p&gt;
&lt;p&gt;But data by itself isn’t of much value in a computer if we can’t do things to it, that is, perform operations on it. We are not talking surgical or military operations here, but chiefly mathematical operations to manipulate information: changing a bit here, a bit there, or making a massive set of changes&amp;nbsp;throughout.&lt;/p&gt;
&lt;p&gt;How exactly does that happen in a &lt;strong&gt;C&lt;/strong&gt;entral &lt;strong&gt;P&lt;/strong&gt;rocessing &lt;strong&gt;U&lt;/strong&gt;nit (henceforth &lt;strong&gt;&lt;span class="caps"&gt;CPU&lt;/span&gt;&lt;/strong&gt;)?&lt;/p&gt;
&lt;h2&gt;CPUs are instruction-obeying&amp;nbsp;slaves&lt;/h2&gt;
&lt;p&gt;The design of a &lt;span class="caps"&gt;CPU&lt;/span&gt; is very much inspired by human experience. One essential aspect of that experience is that everything we do consists of&amp;nbsp;operations.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Civilization advances by extending the number of important operations which we can perform without thinking of them.
— Alfred North&amp;nbsp;Whitehead&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Want to make coffee? Measure out one scoop of coffee beans per cup, add them to the grinder, press the Start button on the grinder and wait for the noise to stop, empty the coffee grounds into the drip machine, add water, press Start, and wait for a beep. 6 steps to make coffee. You can break those steps down differently depending on what kind of machine you are using and what kind of coffee you are making. Whatever the outcome we want, if it can’t be broken down into simple steps like that, we would not be able to design, make, and sell household appliances; we would have to be craftsmen (and craftswomen) of that&amp;nbsp;trade.&lt;/p&gt;
&lt;p&gt;A &lt;span class="caps"&gt;CPU&lt;/span&gt; is an unconscious operation-executing machine. Every outcome we want must be translated into operations which a &lt;span class="caps"&gt;CPU&lt;/span&gt; can perform without&amp;nbsp;understanding.&lt;/p&gt;
&lt;p&gt;A common mental model of how our computers work is that a programmer writes code in a language that a &lt;span class="caps"&gt;CPU&lt;/span&gt; understands, and the &lt;span class="caps"&gt;CPU&lt;/span&gt; simply carries out those instructions. Let’s go deeper into that model. How do those instructions get translated into the 1s and 0s of binary&amp;nbsp;code?&lt;/p&gt;
&lt;p&gt;Much the same way as information gets converted to binary in Season 4. The &lt;span class="caps"&gt;CPU&lt;/span&gt; can understand and execute a limited set of instructions, and each instruction is labelled with a number. The CPUs in use today have standardised on which instructions they support. These sets of instructions are known as &lt;strong&gt;instruction sets&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;What are these instructions&amp;nbsp;like?&lt;/p&gt;
&lt;h2&gt;&lt;span class="caps"&gt;CPU&lt;/span&gt; instructions: moving data&amp;nbsp;around&lt;/h2&gt;
&lt;p&gt;These instructions perform operations on one, two, or more pieces of data. This is how an instruction&amp;nbsp;like &lt;code&gt;b = 1 + 2&lt;/code&gt; would be broken&amp;nbsp;down:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="mf"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kr"&gt;LOAD&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="n"&gt;R1&lt;/span&gt;
&lt;span class="mf"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;ADD&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="mf"&gt;2&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="n"&gt;R1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;R2&lt;/span&gt;
&lt;span class="mf"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MOV&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="n"&gt;R2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="n"&gt;MEM1011&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;I am using this arcane presentation format in a newsletter for layfolk because I think it helps to distinguish between human thinking and computer thinking. What the computer is doing here&amp;nbsp;is:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Load the&amp;nbsp;value &lt;code&gt;1&lt;/code&gt; into slot&amp;nbsp;R1&lt;/li&gt;
&lt;li&gt;Add the&amp;nbsp;value &lt;code&gt;2&lt;/code&gt; to the value in slot R1, and store the result in slot&amp;nbsp;R2&lt;/li&gt;
&lt;li&gt;Store the value in slot R2 into the memory location 1011 (where the&amp;nbsp;variable &lt;code&gt;b&lt;/code&gt; points, so that other programs/instructions can use the&amp;nbsp;result)&lt;/li&gt;
&lt;/ol&gt;
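&lt;p&gt;The three steps above can be mimicked in a few lines of Python. This is only a sketch: the &lt;code&gt;registers&lt;/code&gt; and &lt;code&gt;memory&lt;/code&gt; dictionaries and the functions &lt;code&gt;load&lt;/code&gt;, &lt;code&gt;add&lt;/code&gt; and &lt;code&gt;mov&lt;/code&gt; are made up for illustration, and a real &lt;span class="caps"&gt;CPU&lt;/span&gt; does all of this in hardware, on binary&amp;nbsp;numbers.&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;registers = {"R1": 0, "R2": 0}  # the slots the CPU works in
memory = {}                     # maps a memory address to a value


def load(value, dest):
    # LOAD: put a value into a slot
    registers[dest] = value


def add(value, src, dest):
    # ADD: add a value to the contents of src, store the result in dest
    registers[dest] = registers[src] + value


def mov(src, address):
    # MOV: copy the contents of a slot into a memory location
    memory[address] = registers[src]


load(1, "R1")       # step 1: LOAD 1   R1
add(2, "R1", "R2")  # step 2: ADD  2   R1, R2
mov("R2", 1011)     # step 3: MOV  R2, MEM1011
print(memory[1011])
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
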
&lt;p&gt;Everything we ask a &lt;span class="caps"&gt;CPU&lt;/span&gt; to do essentially consists of loading data from somewhere, doing some kind of processing on it, and storing the result somewhere. The &lt;span class="caps"&gt;CPU&lt;/span&gt; processes lists of these instructions, at a rate of millions to billions of instructions per&amp;nbsp;second.&lt;/p&gt;
&lt;p&gt;Let that sink in for a moment. Every YouTube video, meme, or tweet we send or see is the result of hundreds and thousands of operations, taking place in CPUs around the world. CPUs convert text, audio, and images into raw data, encapsulate it into a data package along with some metadata, and send it out to another &lt;span class="caps"&gt;CPU&lt;/span&gt; that translates the destination address and forwards it to the next gateway, and so on, until it reaches its destination, gets decoded and processed, and signals get sent to the monitor and speakers to produce what we see and&amp;nbsp;hear.&lt;/p&gt;
&lt;h2&gt;Why can’t I run an exe file from Windows on my smartphone, or an Android/iOS app on my Windows&amp;nbsp;laptop?&lt;/h2&gt;
&lt;p&gt;There are many reasons for that, and I will explain one of those reasons here: the x86-64 instruction set used by Intel/&lt;span class="caps"&gt;AMD&lt;/span&gt; CPUs on your Windows laptop is &lt;em&gt;not compatible&lt;/em&gt; with the &lt;span class="caps"&gt;ARM&lt;/span&gt; instruction set used by your smartphone &lt;span class="caps"&gt;CPU&lt;/span&gt;; the &lt;span class="caps"&gt;MOV&lt;/span&gt;, &lt;span class="caps"&gt;ADD&lt;/span&gt;, and other instructions have different numerical codes in each instruction&amp;nbsp;set.&lt;/p&gt;
&lt;p&gt;The same programming code for the app must be &lt;strong&gt;compiled&lt;/strong&gt; into &lt;span class="caps"&gt;CPU&lt;/span&gt; instructions separately for Intel/&lt;span class="caps"&gt;AMD&lt;/span&gt; processors, and for &lt;span class="caps"&gt;ARM&lt;/span&gt;-based&amp;nbsp;processors.&lt;/p&gt;
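&lt;p&gt;A rough sketch of why this matters, using two made-up instruction sets. The numerical codes below are invented for illustration and are not the real x86-64 or &lt;span class="caps"&gt;ARM&lt;/span&gt; encodings; &lt;code&gt;ISA_A&lt;/code&gt;, &lt;code&gt;ISA_B&lt;/code&gt;, &lt;code&gt;compile_for&lt;/code&gt;, and &lt;code&gt;run&lt;/code&gt; are hypothetical names. The same program is compiled once per instruction set, and each toy processor only understands its own codes.&lt;/p&gt;

```python
# Two made-up instruction sets: same operations, different numerical codes.
# These numbers are invented; real x86-64 and ARM encodings look nothing like this.
ISA_A = {"LOAD": 1, "ADD": 2, "STORE": 3}
ISA_B = {"LOAD": 7, "ADD": 9, "STORE": 4}

# The "programming code": load the value 5, add 2, store the result.
PROGRAM = [("LOAD", 5), ("ADD", 2), ("STORE", 0)]


def compile_for(isa, program):
    """Translate named operations into the numerical codes of one instruction set."""
    return [(isa[name], operand) for name, operand in program]


def run(isa, machine_code):
    """Execute numerical codes on a toy processor that only knows one instruction set."""
    decode = {code: name for name, code in isa.items()}
    accumulator = 0
    stored = None
    for code, operand in machine_code:
        if code not in decode:
            raise ValueError(f"unknown instruction code {code}")
        operation = decode[code]
        if operation == "LOAD":
            accumulator = operand
        elif operation == "ADD":
            accumulator += operand
        elif operation == "STORE":
            stored = accumulator
    return stored


code_a = compile_for(ISA_A, PROGRAM)
code_b = compile_for(ISA_B, PROGRAM)
print(code_a)
print(code_b)
print(run(ISA_A, code_a), run(ISA_B, code_b))
```

&lt;p&gt;Feeding &lt;code&gt;code_a&lt;/code&gt; to the processor that expects &lt;code&gt;ISA_B&lt;/code&gt; raises an error in this sketch, because none of its codes mean anything there.&lt;/p&gt;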
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; CPUs are unconscious slaves that simply execute instruction after instruction, at a very fast&amp;nbsp;rate.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Compiling programming code into &lt;span class="caps"&gt;CPU&lt;/span&gt;&amp;nbsp;instructions&lt;/p&gt;
&lt;p&gt;I think this is a good place to stop today. Before we can dig into &lt;span class="caps"&gt;CPU&lt;/span&gt; exploits, we must first unpack what a &lt;span class="caps"&gt;CPU&lt;/span&gt; does. And we are starting slow, because the &lt;span class="caps"&gt;CPU&lt;/span&gt; is ultimately a strange place. Stepping into it is kind of like stepping into Willy Wonka’s Chocolate Factory, where all kinds of wonderful things are happening, and once you figure out how everything fits together you can work out where you can sneak globs of chocolate without people finding&amp;nbsp;out.&lt;/p&gt;
&lt;p&gt;See you in the next issue of Season 5: the Chocolate Processing&amp;nbsp;Unit!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 05"></category><category term="cpu"></category></entry></feed>