<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Layman's Guide to Computing - Season 04</title><link href="https://ngjunsiang.github.io/laymansguide/" rel="alternate"></link><link href="https://ngjunsiang.github.io/laymansguide/feeds/season-04.atom.xml" rel="self"></link><id>https://ngjunsiang.github.io/laymansguide/</id><updated>2019-12-21T08:00:00+08:00</updated><entry><title>Issue 52: PDFs part 2 – Text and images</title><link href="https://ngjunsiang.github.io/laymansguide/issue052.html" rel="alternate"></link><published>2019-12-21T08:00:00+08:00</published><updated>2019-12-21T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-12-21:/laymansguide/issue052.html</id><summary type="html">&lt;p&gt;&lt;span class="caps"&gt;PDF&lt;/span&gt;’s markup language is more concerned with how things appear on the page than with what they were originally. Once the &lt;span class="caps"&gt;PDF&lt;/span&gt; is generated, it is almost impossible to retrieve the original data from it. Scanned documents that are converted to &lt;span class="caps"&gt;PDF&lt;/span&gt; may have a text layer generated by &lt;span class="caps"&gt;OCR&lt;/span&gt; that lets detected text be copied from&amp;nbsp;it.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; &lt;span class="caps"&gt;PDF&lt;/span&gt; is the gold standard for universal compatibility (supported by most software and platforms) and visual fidelity (displays exactly the same way). When you need things to appear on a different device in exactly the same way you created it, without having to install additional software, use &lt;span class="caps"&gt;PDF&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;I mentioned earlier that &lt;span class="caps"&gt;PDF&lt;/span&gt; is an incredibly complex and powerful format. You can do so much with it once you have digested the approximately 900 pages of &lt;a href="https://www.adobe.com/devnet/pdf/pdf_reference.html"&gt;its format specification&lt;/a&gt;, which are available for free. To support older versions of Acrobat and other readers, you may have to cross-reference &lt;a href="https://mpdf.github.io/reference/pdf-files-adobe/pdf-reference.html"&gt;the reference manuals of older versions&lt;/a&gt;. It’s not impossible, but I hope this helps you understand why many apps and services are reluctant to provide &lt;span class="caps"&gt;PDF&lt;/span&gt; support unless there are already libraries available for them to use in their own application. This is time-consuming&amp;nbsp;stuff!&lt;/p&gt;
&lt;h2&gt;Nope, I’m not going to read&amp;nbsp;that.&lt;/h2&gt;
&lt;p&gt;Sure, that’s why I’m writing this newsletter :) Now if you flip to page 238 of the reference manual (I’m just kidding, don’t go download the reference manual now!) and look at Example 1, you see&amp;nbsp;this:&lt;/p&gt;
&lt;div class="codehilite"&gt;&lt;pre&gt;&lt;span&gt;&lt;/span&gt;&lt;code&gt;&lt;span class="nv"&gt;This&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;example&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;illustrates&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;most&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;straightforward&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;use&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;font&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;text&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;ABC&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;is&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;placed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;inches&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;
&lt;span class="nv"&gt;bottom&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;page&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;inches&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;from&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;left&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;edge&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;using&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="nv"&gt;point&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Helvetica&lt;/span&gt;.

&lt;span class="nv"&gt;BT&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nv"&gt;F13&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Tf&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="mi"&gt;288&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;720&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Td&lt;/span&gt;
&lt;span class="w"&gt;    &lt;/span&gt;&lt;span class="ss"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;ABC&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Tj&lt;/span&gt;
&lt;span class="nv"&gt;ET&lt;/span&gt;

&lt;span class="nv"&gt;The&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;five&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;lines&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;example&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;perform&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;these&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;steps&lt;/span&gt;:

&lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Begin&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;text&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;object&lt;/span&gt;.
&lt;span class="nv"&gt;b&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Set&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;and&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;size&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;to&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;use&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;installing&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;them&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;parameters&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;text&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;state&lt;/span&gt;.&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;In&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;case&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;font&lt;/span&gt;
&lt;span class="nv"&gt;resource&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;identified&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;by&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;name&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;F13&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;specifies&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;font&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;externally&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;known&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;as&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Helvetica&lt;/span&gt;.
&lt;span class="nv"&gt;c&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Specify&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;starting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;position&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;on&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;page&lt;/span&gt;,&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;setting&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;parameters&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;text&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;object&lt;/span&gt;.
&lt;span class="nv"&gt;d&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;Paint&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;glyphs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;a&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;string&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;of&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;characters&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;at&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;that&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;position&lt;/span&gt;.
&lt;span class="nv"&gt;e&lt;/span&gt;&lt;span class="ss"&gt;)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="k"&gt;End&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;the&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;text&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nv"&gt;object&lt;/span&gt;.
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;

&lt;p&gt;Remember when I showed you some markup languages in &lt;a href="https://ngjunsiang.github.io/laymansguide/issue050.html"&gt;Issue 50&lt;/a&gt;? Here’s another one, but much more concise and much more specific: it lets you specify font and position for each string of characters. There are additional formatting codes for changing the colour, changing the text format to an outlined version, and making various other kinds of&amp;nbsp;changes.&lt;/p&gt;
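&lt;p&gt;Where do the numbers 288 and 720 in that example come from? &lt;span class="caps"&gt;PDF&lt;/span&gt; measures positions in points, and there are 72 points to an inch, so 4 inches from the left is 288 and 10 inches from the bottom is 720. Here is a rough sketch in Python that rebuilds the same five lines. The function name &lt;code&gt;text_object&lt;/code&gt; and its parameters are my own invention for illustration; they are not part of any &lt;span class="caps"&gt;PDF&lt;/span&gt;&amp;nbsp;library.&lt;/p&gt;

```python
POINTS_PER_INCH = 72  # PDF's default unit is the point: 1/72 of an inch


def text_object(font_name: str, size_pt: int, x_in: float, y_in: float, text: str) -> str:
    """Build a PDF text object that paints `text` at a position given in inches.

    Hypothetical helper for illustration only.
    """
    x_pt = round(x_in * POINTS_PER_INCH)
    y_pt = round(y_in * POINTS_PER_INCH)
    # Parentheses delimit PDF strings and backslash escapes them, so escape all three.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    lines = [
        "BT",                              # a) begin a text object
        f"    /{font_name} {size_pt} Tf",  # b) set the font and font size
        f"    {x_pt} {y_pt} Td",           # c) specify the starting position
        f"    ({escaped}) Tj",             # d) paint the glyphs for the string
        "ET",                              # e) end the text object
    ]
    return "\n".join(lines)


print(text_object("F13", 12, x_in=4, y_in=10, text="ABC"))
```

&lt;p&gt;Running this prints the same &lt;code&gt;BT&lt;/code&gt; … &lt;code&gt;ET&lt;/code&gt; block as the manual’s example. Notice that nothing in it says “this is a heading” or “this is a paragraph”; it only says which glyphs go&amp;nbsp;where.&lt;/p&gt;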
&lt;h2&gt;How &lt;span class="caps"&gt;PDF&lt;/span&gt; documents display this&amp;nbsp;text&lt;/h2&gt;
&lt;p&gt;In the reference manual, there is a long and complicated way of putting a text block into a &lt;span class="caps"&gt;PDF&lt;/span&gt; document and specifying the line spacing and character spacing and how to insert line breaks and all that, so that it appears nicely. In practice, it is rather difficult for developers to convert their own format used in their app into what the &lt;span class="caps"&gt;PDF&lt;/span&gt; format fully requires (see my point at the start of this issue about having &lt;span class="caps"&gt;PDF&lt;/span&gt; libraries&amp;nbsp;available).&lt;/p&gt;
&lt;p&gt;If it is done properly, copying text from a &lt;span class="caps"&gt;PDF&lt;/span&gt; document is rather easy, and you may on rare occasion have experienced this. Apps that do not use, or do not have access to, high-quality &lt;span class="caps"&gt;PDF&lt;/span&gt; libraries may end up generating PDFs that simply display the text word by word, or line by line. If you’ve ever copied a paragraph of text from a &lt;span class="caps"&gt;PDF&lt;/span&gt; and had it appear in multiple lines instead of a single line, or with some word spaces missing, this could be the reason&amp;nbsp;why.&lt;/p&gt;
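&lt;p&gt;To see why this happens, imagine a &lt;span class="caps"&gt;PDF&lt;/span&gt; where every word was painted separately at its own position. The sketch below uses made-up fragments and a deliberately naive copy routine of my own; real &lt;span class="caps"&gt;PDF&lt;/span&gt; viewers are more sophisticated, but they face the same&amp;nbsp;problem.&lt;/p&gt;

```python
from collections import defaultdict

# Each fragment is (x, y, text): where a PDF generator painted a piece of text.
# Hypothetical data: one sentence painted word by word across two lines.
fragments = [
    (72, 720, "Copying"), (120, 720, "text"), (146, 720, "from"),
    (72, 706, "a"), (82, 706, "PDF"),
]


def naive_copy(frags):
    """Join fragments the way a simple viewer might: one output line per baseline."""
    rows = defaultdict(list)
    for x, y, text in frags:
        rows[y].append((x, text))
    lines = []
    # A larger y is nearer the top of the page, so visit baselines in descending order.
    for y in sorted(rows, reverse=True):
        words = [text for _, text in sorted(rows[y])]  # left to right
        lines.append(" ".join(words))
    return "\n".join(lines)


print(naive_copy(fragments))
```

&lt;p&gt;The sentence comes back split over two lines, because the file only records where each word sits, not that the words belong to one paragraph. This sketch always puts a space between fragments; a real viewer has to guess from the horizontal gaps whether two fragments are separate words, which is where missing spaces can creep&amp;nbsp;in.&lt;/p&gt;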
&lt;h2&gt;What about&amp;nbsp;images?&lt;/h2&gt;
&lt;p&gt;Again, the one thing you need to remember is that &lt;span class="caps"&gt;PDF&lt;/span&gt; is concerned primarily with &lt;em&gt;how things look&lt;/em&gt;, not with &lt;em&gt;what things are&lt;/em&gt;. To display a &lt;span class="caps"&gt;JPG&lt;/span&gt; or &lt;span class="caps"&gt;GIF&lt;/span&gt; in a &lt;span class="caps"&gt;PDF&lt;/span&gt;, the app’s &lt;span class="caps"&gt;PDF&lt;/span&gt; library often has to convert it from its compressed format into an array of pixels. The image’s pixel dimensions will seldom match those of the frame it must go into; often you may find yourself trying to fit an 800×600px image into a 400×300px space. The &lt;span class="caps"&gt;PDF&lt;/span&gt; encodes that stream of pixels, and you may not be able to get the original image back from that stream, especially after it has gone through some resizing and&amp;nbsp;cropping.&lt;/p&gt;
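&lt;p&gt;Why can’t the original come back after resizing? Here is a small sketch, assuming the simplest possible shrinking method (averaging each 2×2 block of pixels). I am not claiming any particular &lt;span class="caps"&gt;PDF&lt;/span&gt; library resizes this way; the function and the tiny test images are made up for&amp;nbsp;illustration.&lt;/p&gt;

```python
def downsample_2x(pixels):
    """Halve width and height by averaging each 2x2 block of grey values.

    Assumes the image has an even number of rows and columns.
    """
    out = []
    for r in range(0, len(pixels), 2):
        row = []
        for c in range(0, len(pixels[r]), 2):
            block = [pixels[r][c], pixels[r][c + 1],
                     pixels[r + 1][c], pixels[r + 1][c + 1]]
            row.append(sum(block) // 4)
        out.append(row)
    return out


# Two different 2x2 images (hypothetical grey values, 0 = black, 255 = white).
checkerboard = [[0, 255], [255, 0]]
stripes = [[0, 0], [255, 255]]

print(downsample_2x(checkerboard))
print(downsample_2x(stripes))
print(800 * 600, "pixels before,", 400 * 300, "pixels after")
```

&lt;p&gt;Both images shrink to the same single grey pixel, so there is no way to tell afterwards which one you started with. Going from 800×600 to 400×300 keeps only a quarter of the pixel&amp;nbsp;count.&lt;/p&gt;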
&lt;h2&gt;Why can’t I copy text from a scanned&amp;nbsp;document?&lt;/h2&gt;
&lt;p&gt;Ah, a common question, and one I have been dying to&amp;nbsp;answer.&lt;/p&gt;
&lt;p&gt;When you scan a document, your scanner does not produce text; it produces an image. When the scanning software lets you save your scan as a &lt;span class="caps"&gt;PDF&lt;/span&gt;, it basically puts the image into a full-page &lt;span class="caps"&gt;PDF&lt;/span&gt; and calls it a day. There is no text content in the &lt;span class="caps"&gt;PDF&lt;/span&gt; at&amp;nbsp;all!&lt;/p&gt;
&lt;p&gt;If the software is a bit smarter, or if you have Adobe Acrobat, you might have access to &lt;strong&gt;O&lt;/strong&gt;ptical &lt;strong&gt;C&lt;/strong&gt;haracter &lt;strong&gt;R&lt;/strong&gt;ecognition software (&lt;strong&gt;&lt;span class="caps"&gt;OCR&lt;/span&gt;&lt;/strong&gt;). This is a feature in some apps that recognises text in images and recreates it for you: the app checks your scan for recognisable characters and produces a text stream from them. It can then put this text into an additional layer in the &lt;span class="caps"&gt;PDF&lt;/span&gt;, below the&amp;nbsp;image.&lt;/p&gt;
&lt;p&gt;It takes some additional trickery to ensure the text appears at exactly the same position where it was detected in the image (remember from above that the text position must be specified). If the &lt;span class="caps"&gt;PDF&lt;/span&gt; library gets the font size and positioning right, this &lt;em&gt;simulates&lt;/em&gt; the experience of selecting text on the image and having it appear to be&amp;nbsp;highlighted.&lt;/p&gt;
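&lt;p&gt;Here is a sketch of that positioning trickery for a single detected word. It assumes the &lt;span class="caps"&gt;OCR&lt;/span&gt; step reports a bounding box in pixels measured from the top-left of the scan, and it uses text rendering mode 3 (the &lt;code&gt;3 Tr&lt;/code&gt; line), which tells the viewer to place the text without painting it. That is one common way to hide the text layer; putting it below the image, as described above, needs the same position arithmetic. The function name, the font name &lt;code&gt;F1&lt;/code&gt;, and the sample numbers are all&amp;nbsp;hypothetical.&lt;/p&gt;

```python
POINTS_PER_INCH = 72


def ocr_word_to_pdf(word, left_px, bottom_px, height_px, page_height_px, dpi):
    """Place one OCR-detected word as invisible text over the scanned image.

    Pixel coordinates are measured from the top-left of the scan, whereas
    PDF coordinates start at the bottom-left of the page, so y must be flipped.
    Assumes `word` contains no parentheses or backslashes.
    """
    x_pt = left_px * POINTS_PER_INCH / dpi
    y_pt = (page_height_px - bottom_px) * POINTS_PER_INCH / dpi  # flip the vertical axis
    size_pt = height_px * POINTS_PER_INCH / dpi  # rough font size from the box height
    return "\n".join([
        "BT",
        "    3 Tr",                    # text rendering mode 3: invisible text
        f"    /F1 {size_pt:g} Tf",     # hypothetical font resource name
        f"    {x_pt:g} {y_pt:g} Td",
        f"    ({word}) Tj",
        "ET",
    ])


# A word found on a 300 dpi scan that is 3300 pixels (11 inches) tall.
print(ocr_word_to_pdf("Invoice", left_px=300, bottom_px=450,
                      height_px=50, page_height_px=3300, dpi=300))
```

&lt;p&gt;If the font size or the position is even slightly off, the highlight you see when selecting will drift away from the word in the image, which is why getting this right takes some&amp;nbsp;care.&lt;/p&gt;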
&lt;p&gt;However, the state of &lt;span class="caps"&gt;OCR&lt;/span&gt; technology is such that you will often still get typos or missing/extra spaces in the text, so do be sure to check any text that you copy from a &lt;span class="caps"&gt;PDF&lt;/span&gt;!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; &lt;span class="caps"&gt;PDF&lt;/span&gt;’s markup language is more concerned with how things appear on the page than with what they were originally. Once the &lt;span class="caps"&gt;PDF&lt;/span&gt; is generated, it is almost impossible to retrieve the original data from it. Scanned documents that are converted to &lt;span class="caps"&gt;PDF&lt;/span&gt; may have a text layer generated by &lt;span class="caps"&gt;OCR&lt;/span&gt; that lets detected text be copied from&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;… and Season 4’s a wrap! Phew, I hope Season 4 increased your understanding of how text, images, audio, and video are represented and stored in a computer, of how lossy and lossless compression work and why the former leads to a decrease in quality, of what a file is and how OSes tell them apart, and lastly of documents and other complex file types, and how they are put&amp;nbsp;together.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next season:&lt;/strong&gt; The &lt;span class="caps"&gt;CPU&lt;/span&gt; - where it all&amp;nbsp;happens&lt;/p&gt;
&lt;p&gt;I was going to start Season 5 continuing where I last left off in Season 3. From networking I would have gone on to talk about the internet and its history, how it became the cloud, and how we ended up with the advertising network we have today. But I realised that (1) I still need to do more research on some areas (particularly ad exchanges), and (2) Meltdown and Spectre are apparently not fully fixed&amp;nbsp;yet.&lt;/p&gt;
&lt;p&gt;If you remember, Meltdown and Spectre are the &lt;span class="caps"&gt;CPU&lt;/span&gt; vulnerabilities that can potentially allow attackers to access protected data in your computer’s memory. Most of us don’t have much on our computers that we need to worry about, but banks and other corporations that we rely on certainly&amp;nbsp;do!&lt;/p&gt;
&lt;p&gt;One year on, that vulnerability is still not fully fixed. Some people seem to be flabbergasted by the inability of the huge &lt;span class="caps"&gt;CPU&lt;/span&gt; companies (actually just mainly Intel) to figure this out. But once you understand what Meltdown and Spectre are and how they work, even at a layperson level, I think it is easier to see that there is no straightforward fix that will make everyone happy. With media outlets everywhere citing Moore’s Law uncritically and expecting performance to increase in accordance with it, I am disappointed that such a vulnerability had not been conceptualised earlier and prevented, but I am not&amp;nbsp;surprised.&lt;/p&gt;
&lt;p&gt;Security and privacy are the hot-button topics of the day, but there are many pundits and analysts talking with little idea of how they are implemented and why they are such a difficult challenge. With Season 5 I hope to lay out the basics of operating system security and &lt;span class="caps"&gt;CPU&lt;/span&gt; operation, and attempt to explain in simple terms how Meltdown and Spectre work and why they are so difficult to&amp;nbsp;fix.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category><category term="document"></category></entry><entry><title>Issue 51: PDFs part 1 – Compatibility and fidelity</title><link href="https://ngjunsiang.github.io/laymansguide/issue051.html" rel="alternate"></link><published>2019-12-14T08:00:00+08:00</published><updated>2019-12-14T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-12-14:/laymansguide/issue051.html</id><summary type="html">&lt;p&gt;&lt;span class="caps"&gt;PDF&lt;/span&gt; is the gold standard for universal compatibility (supported by most software and platforms) and visual fidelity (displays exactly the same way). When you need things to appear on a different device in exactly the same way you created it, without having to install additional software, use &lt;span class="caps"&gt;PDF&lt;/span&gt;.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; An &lt;span class="caps"&gt;HTML&lt;/span&gt; file contains markup tags that tell the browser how to interpret and format the text within the tags. Other document formats usually use tags in a similar way. These tags constitute a markup language that any app can use to mark up its own text&amp;nbsp;too.&lt;/p&gt;
&lt;p&gt;If you were old enough (or perhaps lucky enough) to remember the old days of document layout, you may remember a time when such software was non-existent. You typed the text using a typewriter, being &lt;em&gt;very careful&lt;/em&gt; to do a carriage return and line break where the pictures were supposed to go. Then you &lt;em&gt;literally&lt;/em&gt; cut the pictures and pasted them in. Not the right size? You’re outta&amp;nbsp;luck.&lt;/p&gt;
&lt;p&gt;And then computers came along. But in the days of dot matrix printers, which printed on paper with those holey tearaway strips on both sides, it was the same process, just digital. You still printed only text, and added the pictures&amp;nbsp;later.&lt;/p&gt;
&lt;h2&gt;The early days of&amp;nbsp;publishing&lt;/h2&gt;
&lt;p&gt;If you were working for a professional publisher, you formatted text by inserting &lt;strong&gt;control codes&lt;/strong&gt; (including the formatting commands mentioned in &lt;a href="https://ngjunsiang.github.io/laymansguide/issue041.html"&gt;Issue 41&lt;/a&gt;) using a special keyboard. But computers back then weren’t powerful enough to show you the effects of that formatting instantaneously. You would just see the formatting codes on the display and have to imagine in your head what the result would look like. Which is easy to do, after many years of&amp;nbsp;experience.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Formatting codes revealed in WordPerfect 5.1" src="https://ngjunsiang.github.io/laymansguide/issue051_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;WordPerfect 5.1 (1989), with formatting codes revealed&lt;br /&gt;From &lt;a href="https://anthology.hypotheses.org/254"&gt;Anthology&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;And then desktop publishing software arrived on the scene in the mid-1980s, when Aldus released PageMaker. You could see &lt;em&gt;how the pages actually looked&lt;/em&gt;! This feature was called What You See Is What You Get, or &lt;strong&gt;&lt;span class="caps"&gt;WYSIWYG&lt;/span&gt;&lt;/strong&gt;. PageMaker was quickly overshadowed by QuarkXPress, which had extensions (whoa!), and Aldus languished and was bought by Adobe in late 1994. Yup, &lt;em&gt;that&lt;/em&gt; Adobe. And then Adobe released InDesign in&amp;nbsp;1999.&lt;/p&gt;
&lt;h2&gt;Publishing vs word&amp;nbsp;processing&lt;/h2&gt;
&lt;p&gt;Why didn’t I mention Microsoft Word, even though it was first released even earlier, back in 1983 (Word for Windows followed in 1989)? That’s because Word is a &lt;strong&gt;word processor&lt;/strong&gt;, not a &lt;strong&gt;page layout application&lt;/strong&gt;. A word processor is focused on helping you to produce nicely formatted reports that are still primarily text-based. You wouldn’t design a professional magazine in Microsoft Word; it doesn’t give you enough fine-grained control over the positioning of the various elements. For that you need a proper page layout application, like&amp;nbsp;InDesign.&lt;/p&gt;
&lt;p&gt;I just mentioned fine-grained control. That’s something you are going to hear a lot in the world of graphic design. Designers and publishers want control, lots of it. They not only want to control where things go on the page (down to sub-millimetre precision), they even want to control exactly how the colour&amp;nbsp;looks.&lt;/p&gt;
&lt;p&gt;Going into more detail here would betray my principle of writing for the layperson, but I think it is important to present this perspective because it explains the need for a format many of us love and hate: the &lt;span class="caps"&gt;PDF&lt;/span&gt;&amp;nbsp;format.&lt;/p&gt;
&lt;h2&gt;Ensuring print fidelity: the PostScript&amp;nbsp;language&lt;/h2&gt;
&lt;p&gt;When you design something on the screen, how do you know that it will look &lt;em&gt;exactly the same&lt;/em&gt; when printed? Short answer: you won’t, unless you have a markup language that is understood the same way by both the desktop software and your printer. That language is called &lt;strong&gt;PostScript&lt;/strong&gt;, and it can handle text, images, shapes, and additional info (or metadata, i.e. data about data) that comes with&amp;nbsp;them.&lt;/p&gt;
&lt;p&gt;But people soon wanted to include even more things in their documents: forms, videos, 3D artwork, … many of which PostScript did not support natively. And that’s where &lt;span class="caps"&gt;PDF&lt;/span&gt;&amp;nbsp;shines.&lt;/p&gt;
&lt;h2&gt;&lt;span class="caps"&gt;PDF&lt;/span&gt;: the standard for compatible&amp;nbsp;fidelity&lt;/h2&gt;
&lt;p&gt;Today, it is easy to take for granted that when I create a &lt;span class="caps"&gt;DOCX&lt;/span&gt; document in Word on my iPad and upload it to Google Drive, it should open on my laptop and look the same. To an accurate enough degree,&amp;nbsp;anyway.&lt;/p&gt;
&lt;p&gt;But two decades ago, such compatibility was still a dream. You could not take for granted that a complex document produced by one piece of software would open correctly (if it opened at all) in another, or even in the same software written for a different machine (think of Word for Windows, Mac, and other&amp;nbsp;OSes).&lt;/p&gt;
&lt;p&gt;Needless to say, this was incredibly frustrating for industry. If you were running an ad campaign and your ad agency was trying to send poster designs to you, but you each used different software in your workflow … well, how was that going to happen? Or what if two different government departments were trying to collaborate on a form that citizens would use to file&amp;nbsp;taxes?&lt;/p&gt;
&lt;p&gt;A lot of engineering and coordination went into ensuring that &lt;span class="caps"&gt;PDF&lt;/span&gt; would work everywhere (universal compatibility), and display exactly the same way on every device (visual fidelity), and that is why it is a gold standard for the printing and publishing industry. If you want to ensure your T-shirt design will appear &lt;strong&gt;exactly&lt;/strong&gt;&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt; the way you want, send it as a &lt;span class="caps"&gt;PDF&lt;/span&gt; file, not as an image file. Have your magazine cover all set up with the fonts, sizes, colours, and everything else absolutely correct?&lt;sup id="fnref:2"&gt;&lt;a class="footnote-ref" href="#fn:2"&gt;2&lt;/a&gt;&lt;/sup&gt; Send it to the printers as a &lt;span class="caps"&gt;PDF&lt;/span&gt;&amp;nbsp;file.&lt;/p&gt;
&lt;p&gt;There is just one issue with &lt;span class="caps"&gt;PDF&lt;/span&gt;: because of the way it was designed to &lt;em&gt;display correctly&lt;/em&gt;, editing it is a big pain compared to text-based formats like &lt;span class="caps"&gt;DOCX&lt;/span&gt; or even &lt;span class="caps"&gt;HTML&lt;/span&gt;. I’ll explain why in the next&amp;nbsp;issue.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; &lt;span class="caps"&gt;PDF&lt;/span&gt; is the gold standard for universal compatibility (supported by most software and platforms) and visual fidelity (displays exactly the same way). When you need things to appear on a different device in exactly the same way you created it, without having to install additional software, use &lt;span class="caps"&gt;PDF&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;I am &lt;em&gt;sooo&lt;/em&gt; glad I don’t have to go into technical detail here. &lt;span class="caps"&gt;PDF&lt;/span&gt; is an incredibly, amazingly, mind-blowingly complex specification. All the words written about it would fill tomes. I am not surprised that Adobe charged so much for the initial versions of Adobe Reader and Acrobat; the immense amount of work that went into it would have made that price feel justified. (But luckily for all of us, more enterprising minds&amp;nbsp;prevailed.)&lt;/p&gt;
&lt;p&gt;I hope this issue sheds some light on the uses of &lt;span class="caps"&gt;PDF&lt;/span&gt;. We don’t get taught these things by our parents, in school, or anywhere really; the only folks who know this are usually publishing industry professionals. But with more and more programs being able to handle and produce &lt;span class="caps"&gt;PDF&lt;/span&gt; files, if we hope to continue enjoying its benefits and avoiding the consequences of using it inappropriately, then it is time that such knowledge became more&amp;nbsp;commonplace.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For more:&lt;/strong&gt; &lt;a href="https://tedium.co/2018/02/27/pdf-file-format-history/"&gt;Pretty Darn Fascinating: The story of the &lt;span class="caps"&gt;PDF&lt;/span&gt;, the portable document format that’s become one of the internet’s defining information&amp;nbsp;formats.&lt;/a&gt;&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; PDFs Part&amp;nbsp;2&lt;/p&gt;
&lt;p&gt;Next issue, I’ll try to explain why &lt;span class="caps"&gt;PDF&lt;/span&gt; files are the idiosyncratic beasts you hate to edit. While still avoiding technical jargon as much as I&amp;nbsp;can.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="footnote"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;This is harder than it appears; for one, you have to ensure you get the image size and resolution correct, or the printer will have to modify it for you.&amp;#160;&lt;a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li id="fn:2"&gt;
&lt;p&gt;Again, this is harder than it appears; the way to specify the exact colour you want is not something a layperson would know.&amp;#160;&lt;a class="footnote-backref" href="#fnref:2" title="Jump back to footnote 2 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</content><category term="Season 04"></category><category term="document"></category></entry><entry><title>Issue 50: Complex file formats and the Document</title><link href="https://ngjunsiang.github.io/laymansguide/issue050.html" rel="alternate"></link><published>2019-12-07T08:00:00+08:00</published><updated>2019-12-07T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-12-07:/laymansguide/issue050.html</id><summary type="html">&lt;p&gt;An &lt;span class="caps"&gt;HTML&lt;/span&gt; file contains markup tags that tell the browser how to interpret and format the text within the tags. Other document formats usually use tags in a similar way. These tags constitute a markup language that any app can use to mark up its own text&amp;nbsp;too.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; A file consists of data, preceded by a file header which describes the data. Software (including operating systems) detect the kind of data contained in a file by 1) glancing at the file extension, 2) looking at its declared &lt;span class="caps"&gt;MIME&lt;/span&gt; type (if any), and 3) checking the file&amp;nbsp;header.&lt;/p&gt;
&lt;p&gt;I took a small detour in Issue 49 to talk about how files are stored and how the operating system identifies them. This issue, let’s pick up where we left off in Issue 48 about complex data types, and encapsulated data (data in a shell of metadata in a shell of metadata&amp;nbsp;…).&lt;/p&gt;
&lt;p&gt;Video files can contain multiple data streams: video, audio, and text. That makes them a pretty complex type of file in which we can embed other types of data. But they are not the only complex file type. We deal with complex files every time we create a new Microsoft Office document, be it in Word, PowerPoint, or Excel. You can embed images, videos, fonts, and even stranger objects in Microsoft Word. How does a simple &lt;span class="caps"&gt;DOCX&lt;/span&gt; or &lt;span class="caps"&gt;PPTX&lt;/span&gt; document keep it all&amp;nbsp;together?&lt;/p&gt;
&lt;p&gt;We are going to dig into a webpage document and a Word document and see what it looks like in&amp;nbsp;there.&lt;/p&gt;
&lt;h2&gt;Webpage: An &lt;span class="caps"&gt;HTML&lt;/span&gt;&amp;nbsp;document&lt;/h2&gt;
&lt;p&gt;It may be 2019 now, when URLs can end with all kinds of extensions&amp;nbsp;like &lt;code&gt;.aspx&lt;/code&gt; and &lt;code&gt;.php&lt;/code&gt;, or even no extension at all, but a decade or two ago they almost always ended&amp;nbsp;in &lt;code&gt;.html&lt;/code&gt;. That’s because, as I mentioned back in &lt;a href="https://ngjunsiang.github.io/laymansguide/issue003.html"&gt;Issue 3&lt;/a&gt;, the basic format of any web document is &lt;span class="caps"&gt;HTML&lt;/span&gt;. I apologise for leaving that acronym untranslated up till&amp;nbsp;now.&lt;/p&gt;
&lt;p&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt; stands for &lt;strong&gt;H&lt;/strong&gt;yper&lt;strong&gt;t&lt;/strong&gt;ext &lt;strong&gt;M&lt;/strong&gt;arkup &lt;strong&gt;L&lt;/strong&gt;anguage. We’ve seen this word “Hypertext” before, when I explained the Hypertext Transfer Protocol (&lt;span class="caps"&gt;HTTP&lt;/span&gt;, &lt;a href="https://ngjunsiang.github.io/laymansguide/issue007.html"&gt;Issue 7&lt;/a&gt;), the set of rules that our web browsers use to request Hypertext Markup Language documents. See a link&amp;nbsp;now?&lt;/p&gt;
&lt;p&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt; is not a programming language. You can’t write code and tell a computer to make different decisions just by writing &lt;span class="caps"&gt;HTML&lt;/span&gt;. You can create a button using &lt;span class="caps"&gt;HTML&lt;/span&gt;, but you can’t use &lt;span class="caps"&gt;HTML&lt;/span&gt; to tell the computer to send your credit card details to another server on the Internet when you click that button. And that is why we refer to it by another term: a markup&amp;nbsp;language.&lt;/p&gt;
&lt;h2&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt; Markup&amp;nbsp;tags&lt;/h2&gt;
&lt;p&gt;This is (a snippet of) the previous issue, as an &lt;span class="caps"&gt;HTML&lt;/span&gt;&amp;nbsp;file:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Snippet of HTML from Issue 49" src="https://ngjunsiang.github.io/laymansguide/issue050_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;Issue 49 as an &lt;span class="caps"&gt;HTML&lt;/span&gt;&amp;nbsp;file&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;Thank goodness we have syntax highlighting, which should make it easier to notice all the little tags that start with an opening angle&amp;nbsp;bracket &lt;code&gt;&amp;lt;&lt;/code&gt; and end with a closing angle&amp;nbsp;bracket &lt;code&gt;&amp;gt;&lt;/code&gt;. These are called &lt;span class="caps"&gt;HTML&lt;/span&gt; tags, and they signify the start and end of segments in the&amp;nbsp;document.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;html&amp;gt;&lt;/code&gt; starts the&amp;nbsp;document, &lt;code&gt;&amp;lt;/html&amp;gt;&lt;/code&gt; ends&amp;nbsp;it.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;head&amp;gt;&amp;lt;/head&amp;gt;&lt;/code&gt; contains information about the page: the page title (which will appear in the title bar of your web browser) and the styles to apply to the&amp;nbsp;document.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;style type="text/css"&amp;gt;…&amp;lt;/style&amp;gt;&lt;/code&gt; holds those styles, which I have hidden here and will show&amp;nbsp;later.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;body class="app"&amp;gt;…&amp;lt;/body&amp;gt;&lt;/code&gt; contains the main part of the document, which is what we will&amp;nbsp;see.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;h1&amp;gt;&lt;/code&gt; and &lt;code&gt;&amp;lt;h3&amp;gt;&lt;/code&gt; signify different levels of headers, which can all be formatted&amp;nbsp;separately.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; (for &amp;#8220;division&amp;#8221;) is a generic container, within which you can embed images or other&amp;nbsp;text.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;p&amp;gt;…&amp;lt;/p&amp;gt;&lt;/code&gt; (for paragraph) indicates to a web browser that the content is to be treated like a text&amp;nbsp;paragraph.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;&amp;lt;strong&amp;gt;…&amp;lt;/strong&amp;gt;&lt;/code&gt; indicates that it is to be formatted in strong fashion (which is usually treated as bold text … but you can change that in the styles section in&amp;nbsp;the &lt;code&gt;&amp;lt;head&amp;gt;&lt;/code&gt;).&lt;/li&gt;
&lt;/ul&gt;
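&lt;p&gt;If you’d like to see that nesting for yourself, here is a minimal sketch in Python. It assumes we can borrow the standard library’s &lt;code&gt;xml.etree.ElementTree&lt;/code&gt; module (an &lt;span class="caps"&gt;XML&lt;/span&gt; tool, not a real &lt;span class="caps"&gt;HTML&lt;/span&gt; one) to build a tiny made-up page; the sample text and the &lt;code&gt;outline&lt;/code&gt; helper are my own inventions for illustration.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET

# Build a small, hypothetical page: html contains body, body contains h1 and p.
html = ET.Element("html")
body = ET.SubElement(html, "body", {"class": "app"})
heading = ET.SubElement(body, "h1")
heading.text = "Issue 49"
para = ET.SubElement(body, "p")
para.text = "A file consists of "
strong = ET.SubElement(para, "strong")
strong.text = "data"
strong.tail = ", preceded by a header."


def outline(element, depth=0):
    """Return one indented line per element, showing how the tags nest."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# Serialise the structure back into markup text, tags and all.
markup = ET.tostring(html, encoding="unicode")
print(markup)
print("\n".join(outline(html)))
```

&lt;p&gt;The first print shows the markup as one long line of tags; the second shows the same thing as an indented outline, which is roughly how a browser thinks about the page.&lt;/p&gt;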
&lt;p&gt;What are&amp;nbsp;those &lt;code&gt;class="…"&lt;/code&gt; attributes in the tags? The web browser creates a content element for each tag, and styles it according to the predefined style class in the document, defined&amp;nbsp;inside &lt;code&gt;&amp;lt;style&amp;gt;…&amp;lt;/style&amp;gt;&lt;/code&gt;. This is what that section looks like when&amp;nbsp;expanded:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Styles for the Issue 49 HTML file" src="https://ngjunsiang.github.io/laymansguide/issue050_02.png" /&gt;&lt;br /&gt;
&lt;em&gt;Element styles for Issue&amp;nbsp;49&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;I don’t need to explain the specifications for you to notice&amp;nbsp;that &lt;code&gt;&amp;lt;h1&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;h2&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;h3&amp;gt;&lt;/code&gt; etc all have a slightly different style defined for&amp;nbsp;them. &lt;code&gt;.app&lt;/code&gt; is a little different; it starts with a period&amp;nbsp;(&lt;code&gt;.&lt;/code&gt;) and is applied to everything that has&amp;nbsp;the &lt;code&gt;class="app"&lt;/code&gt; attribute (&lt;em&gt;psst&lt;/em&gt; … that’s&amp;nbsp;the &lt;code&gt;&amp;lt;body&amp;gt;&lt;/code&gt; element from the earlier&amp;nbsp;image!).&lt;/p&gt;
&lt;p&gt;Yet at the same time, there are also other styles defined&amp;nbsp;for &lt;code&gt;&amp;lt;body&amp;gt;…&amp;lt;/body&amp;gt;&lt;/code&gt;. The browser has rules for how it chooses which styles override which. Those rules are like the bible for web programmers and web designers, which thankfully we are not (*waves to any web folks in this mailing&amp;nbsp;list*).&lt;/p&gt;
&lt;p&gt;Okay, just two more tags to illustrate embedding other&amp;nbsp;content:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Another part of the Issue 49 HTML file showing the &amp;lt;a&amp;gt; tag" src="https://ngjunsiang.github.io/laymansguide/issue050_03.png" /&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;&amp;lt;a&amp;gt;&lt;/code&gt; tag (for &amp;#8220;anchor&amp;#8221;; don’t ask) is used to define links (those clickable things in a webpage) and the place it links to is defined as&amp;nbsp;a &lt;code&gt;href="…"&lt;/code&gt; attribute.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Another part of the Issue 49 HTML file showing the &amp;lt;img&amp;gt; tag" src="https://ngjunsiang.github.io/laymansguide/issue050_04.png" /&gt;&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; tag (for &amp;#8220;image&amp;#8221;) is used to insert images. Alternative text, which is shown if the image cannot be loaded and is read aloud by screen readers, is defined in&amp;nbsp;the &lt;code&gt;alt="…"&lt;/code&gt; attribute, while the &lt;span class="caps"&gt;URL&lt;/span&gt; of the image is defined in&amp;nbsp;the &lt;code&gt;src="…"&lt;/code&gt; attribute.&lt;/p&gt;
&lt;p&gt;(Embedding an image in a webpage is also possible, but I don’t want to go into depth here because I would have to explain many more concepts before&amp;nbsp;that.)&lt;/p&gt;
&lt;h2&gt;Word document: An &lt;span class="caps"&gt;XML&lt;/span&gt;&amp;nbsp;document&lt;/h2&gt;
&lt;p&gt;I probably didn’t need to explain so much in an issue that’s not Introduction to &lt;span class="caps"&gt;HTML&lt;/span&gt;, but I think it will help make the next part easier to&amp;nbsp;grasp.&lt;/p&gt;
&lt;p&gt;Last issue I said&amp;nbsp;this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;That also means you can spoof a lot of software into thinking you have a zip file when you in fact have an .epub ebook file. This is a pretty common way to unpack files that use the zip archive format to pack their&amp;nbsp;files!&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Suppose we do that with a &lt;span class="caps"&gt;DOCX&lt;/span&gt; file … heck, let’s convert Issue 49 into a &lt;span class="caps"&gt;DOCX&lt;/span&gt;, rename it to&amp;nbsp;a &lt;code&gt;.zip&lt;/code&gt; file and see what&amp;nbsp;happens.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Issue 49 DOCX file opened as a .zip file" src="https://ngjunsiang.github.io/laymansguide/issue050_05.png" /&gt;&lt;/p&gt;
&lt;p&gt;&lt;img alt="WHOA cat meme" src="https://i.imgflip.com/2i7zhl.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;Don’t run! Most of it is unimportantly technical, so we’ll just jump right into the interesting part, which&amp;nbsp;is &lt;code&gt;document.xml&lt;/code&gt;. Take a deep breath&amp;nbsp;…&lt;/p&gt;
&lt;p&gt;&lt;img alt="document.xml" src="https://ngjunsiang.github.io/laymansguide/issue050_06.png" /&gt;&lt;br /&gt;
&lt;em&gt;document.xml&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;Okay, ouch. That’s a different tag language, called e&lt;strong&gt;X&lt;/strong&gt;tensible &lt;strong&gt;M&lt;/strong&gt;arkup &lt;strong&gt;L&lt;/strong&gt;anguage (&lt;span class="caps"&gt;XML&lt;/span&gt;). Interestingly enough, each of those tags starts&amp;nbsp;with &lt;code&gt;w:&lt;/code&gt;, followed by some familiar&amp;nbsp;phrases: &lt;code&gt;body&lt;/code&gt;, &lt;code&gt;p&lt;/code&gt;, and others that are not so&amp;nbsp;familiar.&lt;/p&gt;
&lt;p&gt;But look, there’s also &amp;#8220;Heading1&amp;#8221; and &amp;#8220;Heading3&amp;#8221;! Other than the fact that the tags look completely different, it still uses tags in similar&amp;nbsp;fashion.&lt;/p&gt;
&lt;h2&gt;Documents are just another kind of complex&amp;nbsp;file&lt;/h2&gt;
&lt;p&gt;So, that’s a Word document demystified. When you save a Word document, it just converts whatever you were working on into tags, like this, and zips it all up into a zip file. And any other program that knows how to read these &lt;span class="caps"&gt;XML&lt;/span&gt; files and edit them correctly can then open and edit a &lt;span class="caps"&gt;DOCX&lt;/span&gt; file&amp;nbsp;too.&lt;/p&gt;
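&lt;p&gt;For the curious, the same peek inside can be done without renaming anything. This is only a sketch: it uses Python’s standard &lt;code&gt;zipfile&lt;/code&gt; module, and because I can’t assume you have a &lt;span class="caps"&gt;DOCX&lt;/span&gt; lying around, it builds a tiny stand-in archive in memory with placeholder contents. The &lt;code&gt;list_parts&lt;/code&gt; helper is a hypothetical name of my own.&lt;/p&gt;

```python
import io
import zipfile

# Build a tiny stand-in archive in memory so the sketch runs anywhere.
# A real DOCX holds many more parts; the contents here are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("[Content_Types].xml", "placeholder")
    archive.writestr("word/document.xml", "placeholder for the tagged text")


def list_parts(file_like):
    """Return the names of the files packed inside a zip-based document."""
    with zipfile.ZipFile(file_like) as archive:
        return archive.namelist()


buffer.seek(0)
print(zipfile.is_zipfile(buffer))  # the zip library accepts it as a zip
buffer.seek(0)
print(list_parts(buffer))
```

&lt;p&gt;With a real document, you would pass the path of the &lt;span class="caps"&gt;DOCX&lt;/span&gt; file to &lt;code&gt;list_parts&lt;/code&gt; instead of the in-memory buffer.&lt;/p&gt;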
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; An &lt;span class="caps"&gt;HTML&lt;/span&gt; file contains markup tags that tell the browser how to interpret and format the text within the tags. Other document formats usually use tags in a similar way. These tags constitute a markup language that any app can use to mark up its own text&amp;nbsp;too.&lt;/p&gt;
&lt;p&gt;Okay, I hope I’ve demystified webpages, text documents, and just about any place where you see formatted text &lt;em&gt;just a little bit&lt;/em&gt;. Just about any place where you see formatting being done to text, there’s some kind of markup language working in the background. Of course, it’s often going to be much more complicated and messy than a little newsletter, but that is why we get computers to handle&amp;nbsp;it.&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; PDFs Part&amp;nbsp;1&lt;/p&gt;
&lt;p&gt;I’ll round up Season 4 with two issues on everyone’s favourite hated format: PDFs. I think a lot of the reasons people love &lt;span class="caps"&gt;PDF&lt;/span&gt; are spot on, and were how PDFs were sort of intended to be used. And a lot of the reasons people hate &lt;span class="caps"&gt;PDF&lt;/span&gt; occur in cases that &lt;span class="caps"&gt;PDF&lt;/span&gt; was never meant to be used for. They still ended up being used because no better format came along to serve that purpose. More on this in Issue&amp;nbsp;51.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;&lt;del&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue 38]&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category><category term="document"></category></entry><entry><title>Issue 49: What is a File?</title><link href="https://ngjunsiang.github.io/laymansguide/issue049.html" rel="alternate"></link><published>2019-11-30T08:00:00+08:00</published><updated>2019-11-30T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-11-30:/laymansguide/issue049.html</id><summary type="html">&lt;p&gt;A file consists of data, preceded by a file header which describes the data. Software (including operating systems) detect the kind of data contained in a file by 1) glancing at the file extension, 2) looking at its declared &lt;span class="caps"&gt;MIME&lt;/span&gt; type (if any), and 3) checking the file header, in order of difficulty and&amp;nbsp;accuracy.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; A video container can hold one or more audio, video, or text data streams. To encode or decode a data stream, you need to have the necessary codec installed. Most video runs at 25 or 30 fps, with high-quality video going up to 60 fps. You can use a program like MediaInfo to help you decipher the streams inside a video container&amp;nbsp;file.&lt;/p&gt;
&lt;p&gt;Images, audio, video, and more … we are so used to thinking of them as different kinds of files. But within the computer’s binary world, how does it tell that one file is a different type from another? In the human world, if you ran across a bunch of unlabelled boxes of various types and sizes, you would have no way of telling what is in each box. And you know this is a terrible way to move house—you would have to at least label the boxes by colour, by room, or by type of&amp;nbsp;contents.&lt;/p&gt;
&lt;p&gt;You would also have encountered this if you bought anything online. Your packages arrive with a shipping label, which is a quick and convenient way for the shipping companies to identify the package, type of contents, origin, and&amp;nbsp;destination.&lt;/p&gt;
&lt;p&gt;The box labels, and the shipping labels, tell us &lt;em&gt;about&lt;/em&gt; the contents, but not the contents itself. We refer to such data as &lt;strong&gt;metadata&lt;/strong&gt;. Metadata is data about&amp;nbsp;data.&lt;/p&gt;
&lt;p&gt;For a computer to be able to handle so many files without inspecting them individually, it must also have metadata about each of these&amp;nbsp;files.&lt;/p&gt;
&lt;h1&gt;The file&amp;nbsp;header&lt;/h1&gt;
&lt;p&gt;Files generally have a file header. The &lt;span class="caps"&gt;GIF&lt;/span&gt; file format begins with a header (“GIF87a” or “GIF89a”), so anytime a piece of software (e.g. an image editor) starts to read a file header and detects that label, it knows it’s dealing with a &lt;span class="caps"&gt;GIF&lt;/span&gt; file and not a &lt;span class="caps"&gt;JPG&lt;/span&gt;&amp;nbsp;file.&lt;/p&gt;
&lt;p&gt;When the software opens a &lt;span class="caps"&gt;GIF&lt;/span&gt; file, and before it has read anything beyond this header signature (that’s what the &lt;a href="https://ngjunsiang.github.io/laymansguide/issue023.html"&gt;&lt;span class="caps"&gt;GIF&lt;/span&gt; file specification&lt;/a&gt; calls the above label), it doesn’t know anything about this &lt;span class="caps"&gt;GIF&lt;/span&gt; file. Before doing anything else, it will at least need to know the width and height of this image, and in the case of &lt;span class="caps"&gt;GIF&lt;/span&gt;, some information about its colour palette (which can vary from &lt;span class="caps"&gt;GIF&lt;/span&gt; to &lt;span class="caps"&gt;GIF&lt;/span&gt;). All this information is stored within the file header, and the software will have to know how to read it from the&amp;nbsp;header.&lt;/p&gt;
&lt;p&gt;If for any reason you wish to start writing software that can edit &lt;span class="caps"&gt;GIF&lt;/span&gt; files, you can find out its &lt;a href="https://www.w3.org/Graphics/GIF/spec-gif87.txt"&gt;detailed specifications&lt;/a&gt; online. This is because when CompuServe came up with the format in the early days of the internet, they meant it to be widely used. Companies who design a file format to be used internally and not for public use will come up with &lt;strong&gt;proprietary&lt;/strong&gt; file formats, which are inscrutable to most people. Anyone coming across such a file would have no idea how to open&amp;nbsp;it.&lt;/p&gt;
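&lt;p&gt;To make that a little more concrete, here is a rough sketch in Python of how a program might read the start of a &lt;span class="caps"&gt;GIF&lt;/span&gt; file. The function name &lt;code&gt;read_gif_header&lt;/code&gt; and the hand-made sample bytes are my own assumptions for illustration; real software checks a good deal more than this.&lt;/p&gt;

```python
def read_gif_header(data: bytes) -> dict:
    """Read the signature and canvas size from the first bytes of a GIF."""
    signature = data[:6].decode("ascii", errors="replace")
    if signature not in ("GIF87a", "GIF89a"):
        raise ValueError("not a GIF file: " + repr(signature))
    # Width and height follow the signature as 2-byte little-endian integers.
    width = int.from_bytes(data[6:8], "little")
    height = int.from_bytes(data[8:10], "little")
    # The next byte packs several flags; its top bit says whether a
    # global colour table (the palette) follows the header.
    has_palette = data[10] >= 0x80
    return {
        "signature": signature,
        "width": width,
        "height": height,
        "has_palette": has_palette,
    }


# A hand-made 13-byte header for a hypothetical 320 by 200 image.
sample = (
    b"GIF89a"
    + (320).to_bytes(2, "little")
    + (200).to_bytes(2, "little")
    + bytes([0xF7, 0, 0])
)
print(read_gif_header(sample))
```

&lt;p&gt;Feed it the first few bytes of anything that does not begin with one of the two signatures and it refuses to go further, which is exactly the check described above.&lt;/p&gt;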
&lt;p&gt;If you want to figure out such a file format, you would have to &lt;strong&gt;reverse-engineer&lt;/strong&gt; it, like &lt;a href="https://reverseengineering.stackexchange.com/questions/261/how-to-reverse-engineer-a-proprietary-data-file-format-e-g-smartboard-notebook"&gt;this guy on StackExchange&lt;/a&gt;. Since typical engineering means starting with a blueprint and coming up with a product, reverse-engineering means starting with the product and trying to figure out its blueprint. Here’s Julia Evans having a go at &lt;a href="https://jvns.ca/blog/2018/03/31/reverse-engineering-notability-format/"&gt;reverse-engineering the Notability file format&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Another example: The &lt;span class="caps"&gt;MP3&lt;/span&gt; file format is simpler (although not easier to decode). Audio data is organised into frames, each frame having its own header followed by data. What about the artist name, record label, genre, date of release, and other information that comes with the file? All that is stored within the &lt;span class="caps"&gt;ID3&lt;/span&gt; portion of the file&amp;nbsp;metadata.&lt;/p&gt;
&lt;p&gt;&lt;img alt="MP3 file structure showing internal structure" src="https://ngjunsiang.github.io/laymansguide/issue049_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;The &lt;span class="caps"&gt;MP3&lt;/span&gt; file structure&lt;br /&gt;Image from &lt;a href="https://en.wikipedia.org/wiki/MP3#/media/File:Mp3filestructure.svg"&gt;Wikipedia&lt;/a&gt;&lt;/em&gt;    &lt;/p&gt;
&lt;h1&gt;File&amp;nbsp;extension&lt;/h1&gt;
&lt;p&gt;That seems like an awfully complicated way for operating systems to detect what type of files they have. They would have to open each file individually, even if just to read the header, and then figure out which complicated set of patterns the header matches. When you open a folder, save a file, or download something from the internet, the computer seems to do that detection much&amp;nbsp;faster.&lt;/p&gt;
&lt;p&gt;That’s because when speed is a concern, software will often attempt to detect the filetype simply by detecting the &lt;strong&gt;file extension&lt;/strong&gt;. File extensions are the ending characters in the filename, after the period (“.”). A file named sound.mp3 has a .mp3 file extension, and one named image.gif has a .gif file extension. That’s a much faster way to detect a whole bunch of filetypes, it’s quick-and-dirty, and it mostly&amp;nbsp;works.&lt;/p&gt;
&lt;p&gt;That also means you can spoof a lot of software into thinking you have a zip file when you in fact have an .epub ebook file. This is a pretty common way to unpack files that use the zip archive format to pack their files! So if you write software that absolutely needs to be sure it has the right filetype, you should double-check the file header instead of jumping to assumptions from the file extension&amp;nbsp;alone.&lt;/p&gt;
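&lt;p&gt;Here is a small sketch of the two approaches side by side, in Python. The function names are hypothetical, and the only fact I am leaning on is that a non-empty zip archive begins with the bytes &lt;code&gt;PK&lt;/code&gt; followed by the values 3 and 4.&lt;/p&gt;

```python
from pathlib import Path

ZIP_SIGNATURE = b"PK\x03\x04"  # first bytes of a non-empty zip archive


def guess_by_extension(filename: str) -> str:
    """Quick and dirty: trust whatever follows the last period."""
    return Path(filename).suffix.lower()


def looks_like_zip(first_bytes: bytes) -> bool:
    """Slower but safer: check the start of the actual data."""
    return first_bytes.startswith(ZIP_SIGNATURE)


print(guess_by_extension("book.epub"))                  # .epub
print(looks_like_zip(b"PK\x03\x04rest of the archive"))  # True
print(looks_like_zip(b"GIF89a..."))                      # False
```

&lt;p&gt;The extension says “ebook”, the header says “zip archive”, and both are right, which is why the renaming trick works.&lt;/p&gt;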
&lt;h2&gt;What about the&amp;nbsp;internet?&lt;/h2&gt;
&lt;p&gt;Guessing from file extensions might work in a computer, but on the internet it’s the Wild West. Images sent as data packets over the internet do not come with filenames; notice how some apps (e.g. WhatsApp) rename your image with a different name when you upload or download them? And on some web platforms, especially those that handle huge volumes of images, the filenames are just semi-random&amp;nbsp;characters.&lt;/p&gt;
&lt;p&gt;That is why we rely on what are known as &lt;strong&gt;&lt;span class="caps"&gt;MIME&lt;/span&gt; types&lt;/strong&gt;. &lt;span class="caps"&gt;MIME&lt;/span&gt; stands for &lt;strong&gt;M&lt;/strong&gt;ultipurpose &lt;strong&gt;I&lt;/strong&gt;nternet &lt;strong&gt;M&lt;/strong&gt;ail &lt;strong&gt;E&lt;/strong&gt;xtensions, and yes there is an &lt;span class="caps"&gt;RFC&lt;/span&gt; for it, &lt;a href="https://tools.ietf.org/html/rfc6838"&gt;&lt;span class="caps"&gt;RFC6838&lt;/span&gt;&lt;/a&gt;. This is a much more standardised way of declaring what type of file you have. &lt;a href="https://www.iana.org/assignments/media-types/media-types.xhtml#examples"&gt;The exhaustive list&lt;/a&gt; of &lt;span class="caps"&gt;MIME&lt;/span&gt; types, maintained by &lt;span class="caps"&gt;IANA&lt;/span&gt; (whom we first met in &lt;a href="https://ngjunsiang.github.io/laymansguide/issue027.html"&gt;Issue 27&lt;/a&gt;), has &lt;span class="caps"&gt;MIME&lt;/span&gt; types for application files, audio, font, image, messages, model, multipart formats, text, and&amp;nbsp;video.&lt;/p&gt;
&lt;p&gt;If you plan to come up with a file format that is intended to be used widely, you can &lt;a href="https://www.iana.org/form/media-types"&gt;apply for it to be included&lt;/a&gt; in the&amp;nbsp;list.&lt;/p&gt;
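&lt;p&gt;If you want to see what a few of these registered types look like, Python ships with a small lookup table in its standard &lt;code&gt;mimetypes&lt;/code&gt; module. A quick sketch follows; note that this module works from the file extension, so it is a lookup of the declared type and not an inspection of the data. The &lt;code&gt;declared_type&lt;/code&gt; helper and the filenames are made up for illustration.&lt;/p&gt;

```python
import mimetypes


def declared_type(filename):
    """Look up the registered MIME type for a filename; None if unknown."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type


for name in ("sound.mp3", "image.gif", "page.html"):
    print(name, declared_type(name))
```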
&lt;h2&gt;&lt;span class="caps"&gt;MIME&lt;/span&gt; types and &lt;span class="caps"&gt;HTTP&lt;/span&gt;&amp;nbsp;headers&lt;/h2&gt;
&lt;p&gt;Remember this &lt;span class="caps"&gt;HTTP&lt;/span&gt; header from Issue&amp;nbsp;8?&lt;/p&gt;
&lt;p&gt;&lt;img alt="An HTTP request header" src="https://ngjunsiang.github.io/laymansguide/issue008_01.png" /&gt;&lt;/p&gt;
&lt;p&gt;See that label in the third row, with the Content-Type label, “application/json”? That’s the &lt;span class="caps"&gt;MIME&lt;/span&gt; type for the &lt;a href="https://ngjunsiang.github.io/laymansguide/issue005.html"&gt;&lt;span class="caps"&gt;JSON&lt;/span&gt; data format&lt;/a&gt;. When the server returns data, my browser (the client) has no idea what format it is. It might be nicely formatted &lt;span class="caps"&gt;HTML&lt;/span&gt; meant for human consumption, but it might also be plain text, &lt;span class="caps"&gt;JSON&lt;/span&gt; data (like in this case), &lt;span class="caps"&gt;XML&lt;/span&gt;, or any of the various data formats that people use. Declaring the &lt;span class="caps"&gt;MIME&lt;/span&gt; type properly makes it easier for the browser to know what to do with the&amp;nbsp;data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; A file consists of data, preceded by a file header which describes the data. Software (including operating systems) detects the kind of data contained in a file by 1) glancing at the file extension, 2) looking at its declared &lt;span class="caps"&gt;MIME&lt;/span&gt; type (if any), and 3) checking the file header, in order of difficulty and&amp;nbsp;accuracy.&lt;/p&gt;
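&lt;p&gt;As a rough illustration of what “knowing what to do with the data” might look like, here is a sketch in Python. It borrows the standard &lt;code&gt;email.message&lt;/code&gt; module to pull the &lt;span class="caps"&gt;MIME&lt;/span&gt; type out of a Content-Type value; the &lt;code&gt;describe&lt;/code&gt; function and its little table of actions are purely hypothetical, and a real browser is far more elaborate.&lt;/p&gt;

```python
from email.message import Message


def media_type(content_type_header: str) -> str:
    """Extract the bare MIME type from a Content-Type header value."""
    message = Message()
    message["Content-Type"] = content_type_header
    return message.get_content_type()


def describe(content_type_header: str) -> str:
    """Pick a handling strategy based on the declared MIME type."""
    handlers = {
        "text/html": "render as a web page",
        "application/json": "parse as JSON data",
        "text/plain": "show as plain text",
    }
    return handlers.get(media_type(content_type_header), "offer to download")


print(describe("application/json; charset=utf-8"))
print(describe("image/x-unknown"))
```

&lt;p&gt;The client never has to look inside the data to make this first decision; the label on the box is enough.&lt;/p&gt;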
&lt;p&gt;I almost started writing a long post about filesystems, but stopped myself in time. I hoped with this issue to continue emphasising the theme of data encapsulation: data locked in shells upon shells upon shells of metadata. I’ll be back to describing other types of data again for the rest of the issue, but I thought file headers would be good to introduce at this&amp;nbsp;point.&lt;/p&gt;
&lt;p&gt;After this season I won’t be digging into complex data types, but when I move on to operating systems I’ll cycle back to filesystems and what you need to know about them. Before I get to that season, though, here’s something for you to ponder: if all data is ultimately binary, how would an app know where one file ends and where another starts? Does the file header for mydocument.doc start at this 0, or another 0, or actually at this&amp;nbsp;1?&lt;/p&gt;
&lt;hr /&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Complex file formats and the&amp;nbsp;Document&lt;/p&gt;
&lt;p&gt;Two issues ago, I just talked about video formats, which include multiple types of data: video, audio, and even text&amp;nbsp;(subtitles).&lt;/p&gt;
&lt;p&gt;Next issue, we’ll pick up where we left off to look at another format that includes multiple data types: the&amp;nbsp;document.&lt;/p&gt;
&lt;p&gt;See you again next week, next&amp;nbsp;issue.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;li&gt;How do apps know where a file starts and ends? [Issue&amp;nbsp;49]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category></entry><entry><title>Issue 48: Of containers and codecs</title><link href="https://ngjunsiang.github.io/laymansguide/issue048.html" rel="alternate"></link><published>2019-11-23T08:00:00+08:00</published><updated>2019-11-23T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-11-23:/laymansguide/issue048.html</id><summary type="html">&lt;p&gt;A video container can hold one or more audio, video, or text data streams. To encode or decode a data stream, you need to have the necessary codec installed[^1]. Most video runs at 25 or 30 fps, with high-quality video going up to 60 fps. You can use a program like MediaInfo to help you decipher the streams inside a video container&amp;nbsp;file.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Data cannot be compressed beyond its predictability limit in a lossless fashion. Lossless compression does not discard any information. It spots patterns in the data and represents them with fewer bits, through a combination of predictive coding, run-length encoding, and entropy&amp;nbsp;coding.&lt;/p&gt;
&lt;p&gt;In past issues this season, I went into some detail about how images and sound are represented as data in computers. I also went into a little detail about lossy compression, in which imperceptible information is discarded, and lossless compression, in which the original information can be&amp;nbsp;reconstructed.&lt;/p&gt;
&lt;p&gt;That progression finally brings me to this issue, where I introduce the first complex data representation: the video&amp;nbsp;file.&lt;/p&gt;
&lt;p&gt;A video file, as we like to think about it, actually is not a simple form of data. It can have one or more of the&amp;nbsp;following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;video&amp;nbsp;data&lt;/li&gt;
&lt;li&gt;audio&amp;nbsp;data&lt;/li&gt;
&lt;li&gt;subtitles&lt;/li&gt;
&lt;li&gt;annotations (e.g. on YouTube&amp;nbsp;videos)&lt;/li&gt;
&lt;li&gt;chapters (which let you jump to certain points in the video, like a&amp;nbsp;bookmark)&lt;/li&gt;
&lt;li&gt;miscellaneous files (e.g. embedded copyright&amp;nbsp;information)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;These various types of information, if they are time-sensitive (video, audio, and subtitles), have to be presented in synchrony. It’s not like you can just throw them into a simple zip file or folder and the computer knows what to do with them! How does a computer know how to put them together into an engaging&amp;nbsp;movie?&lt;/p&gt;
&lt;h1&gt;The video&amp;nbsp;container&lt;/h1&gt;
&lt;p&gt;What we usually understand as a video file is actually a &lt;strong&gt;video container&lt;/strong&gt; format. The common ones we encounter online today are &lt;span class="caps"&gt;MP4&lt;/span&gt;&amp;nbsp;(&lt;code&gt;.mp4&lt;/code&gt;) and QuickTime&amp;nbsp;(&lt;code&gt;.mov&lt;/code&gt;). In the not-so-distant past, you would have commonly encountered &lt;span class="caps"&gt;AVI&lt;/span&gt;&amp;nbsp;(&lt;code&gt;.avi&lt;/code&gt;), &lt;span class="caps"&gt;3GPP&lt;/span&gt;&amp;nbsp;(&lt;code&gt;.3gp&lt;/code&gt;), and Flash Video&amp;nbsp;(&lt;code&gt;.flv&lt;/code&gt;). And if you’re a video techie who dives into DVDs and Blu-ray discs, you would also have seen Video Objects&amp;nbsp;(&lt;code&gt;.vob&lt;/code&gt;) and &lt;span class="caps"&gt;MPEG&lt;/span&gt; Transport Streams&amp;nbsp;(&lt;code&gt;.ts&lt;/code&gt;) while digging through their contents on a&amp;nbsp;computer.&lt;/p&gt;
&lt;p&gt;The audio, image, and text data in the video container are referred to as &lt;strong&gt;streams&lt;/strong&gt;. At the binary level, it’s all 1s and 0s; how does the computer know which part of the file contains audio, image, or text data? This information is in the video container metadata, along with more details on how to load the correct part of the video, audio, or text &lt;em&gt;at the right time&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;If you have come across poorly formed video where the image and audio data is not in sync, or the subtitles come too early/late, you know how critical it is to get this right: the human eye and ear can be pretty sensitive to even slight discrepancies in&amp;nbsp;timing.&lt;/p&gt;
&lt;h1&gt;From still image to&amp;nbsp;video&lt;/h1&gt;
&lt;p&gt;I’ve talked about how pixels are perceived in still image data, now I’ll introduce one more aspect of psychovisuals: how the human eye perceives &lt;em&gt;motion&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;The eye interacts with the brain in strange ways. Over millions of years of evolution, the brain has evolved &lt;a href="https://www.eurekalert.org/pub_releases/2006-07/uops-prc072606.php"&gt;a ‘high-power’ and a ‘low-power’ way&lt;/a&gt; to receive information from the eye. Under everyday conditions, the brain is able to connect separate frames of image data into a coherent picture and interpretation without being confused by the differences between each&amp;nbsp;frame.&lt;/p&gt;
&lt;p&gt;Film has traditionally been shot at 24 frames per second (fps), which is already enough for the brain to perceive continuous motion, while 60 fps has become the benchmark for very smooth, high-quality video. That’s a lot of images per second, and a lot of corresponding video&amp;nbsp;data!&lt;/p&gt;
&lt;p&gt;For everyday purposes, such as online streaming, it is more common to encounter 30 fps, or even 25 fps for older videos. In certain types of video entertainment, such as hand-drawn animation, the human eye can make do with 15 fps and the brain can still piece together an enjoyable&amp;nbsp;performance!&lt;/p&gt;
&lt;h1&gt;Data&amp;nbsp;streams&lt;/h1&gt;
&lt;p&gt;How about the data streams? How are they&amp;nbsp;stored?&lt;/p&gt;
&lt;p&gt;To start with the obvious, they are not stored uncompressed; we saw that a single image of 1920×1080 pixels (that’s the 1080p video standard, with 1080 pixels vertically) already requires about 6 MiB (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue043.html"&gt;Issue 43&lt;/a&gt;), while one second of audio requires 86 KiB (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue045.html"&gt;Issue 45&lt;/a&gt;).&lt;/p&gt;
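&lt;p&gt;To get a feel for those numbers, here is a rough back-of-envelope sketch. It assumes 3 bytes per pixel (8 bits each for red, green, and blue), a frame rate of 30 fps, and a single channel of &lt;span class="caps"&gt;CD&lt;/span&gt;-style audio; the variable names are only for illustration.&lt;/p&gt;

```python
# Back-of-envelope sizes for uncompressed video and audio.
WIDTH, HEIGHT = 1920, 1080      # one 1080p frame
BYTES_PER_PIXEL = 3             # assumption: 8 bits each for red, green, blue
FPS = 30                        # assumption: a common streaming frame rate

frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
video_bytes_per_second = frame_bytes * FPS

# CD-style audio, one channel: 44,100 samples per second, 16 bits (2 bytes) each
audio_bytes_per_second = 44_100 * 16 // 8

MIB = 1024 * 1024
print(f"one frame:    {frame_bytes / MIB:.2f} MiB")
print(f"1 s of video: {video_bytes_per_second / MIB:.1f} MiB")
print(f"1 s of audio: {audio_bytes_per_second / 1024:.1f} KiB")
```

&lt;p&gt;Under these assumptions, one second of uncompressed video is roughly 178 MiB, which is why nobody stores or streams it that&amp;nbsp;way.&lt;/p&gt;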
&lt;p&gt;In addition to the lossy compression techniques I covered in &lt;a href="https://ngjunsiang.github.io/laymansguide/issue046.html"&gt;Issue 46&lt;/a&gt;, software that creates these streams can also compare video frames at different points in time and throw away the parts that are identical (if there’s no scene change, or if the camera is panning slowly, for&amp;nbsp;instance).&lt;/p&gt;
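&lt;p&gt;A toy sketch of that frame-comparison idea, assuming each frame is just a short list of brightness values. Real video codecs are far more sophisticated (they track motion and tolerate small differences), and the function names here are made up for illustration.&lt;/p&gt;

```python
# Two tiny "frames", each a flat list of pixel brightness values.
frame_a = [10, 10, 10, 200, 200, 10, 10, 10]
frame_b = [10, 10, 10, 10, 200, 200, 10, 10]   # the bright patch moved right

def changed_pixels(previous, current):
    """Return (position, new_value) pairs for the pixels that differ."""
    return [(i, new) for i, (old, new) in enumerate(zip(previous, current))
            if old != new]

def apply_changes(previous, changes):
    """Rebuild the current frame from the previous frame plus the changes."""
    frame = list(previous)
    for position, value in changes:
        frame[position] = value
    return frame

delta = changed_pixels(frame_a, frame_b)
print(delta)                                       # [(3, 10), (5, 200)]
print(apply_changes(frame_a, delta) == frame_b)    # True
```

&lt;p&gt;Only two of the eight pixels need to be stored for the second frame; everything else is reused from the first.&lt;/p&gt;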
&lt;p&gt;Various video stream formats exist to carry out this lossy compression of video&amp;nbsp;data.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;h264 (a.k.a. &lt;span class="caps"&gt;AVC&lt;/span&gt;, for &lt;strong&gt;A&lt;/strong&gt;dvanced &lt;strong&gt;V&lt;/strong&gt;ideo &lt;strong&gt;C&lt;/strong&gt;oding) is still the most common video stream format in use&amp;nbsp;today.&lt;/li&gt;
&lt;li&gt;h265 (a.k.a. &lt;span class="caps"&gt;HEVC&lt;/span&gt;, for &lt;strong&gt;H&lt;/strong&gt;igh &lt;strong&gt;E&lt;/strong&gt;fficiency &lt;strong&gt;V&lt;/strong&gt;ideo &lt;strong&gt;C&lt;/strong&gt;oding) is slated to replace it and is set to become more and more&amp;nbsp;popular.&lt;/li&gt;
&lt;li&gt;Google’s &lt;span class="caps"&gt;VP9&lt;/span&gt; is attempting to compete with it (with companies such as Netflix already on&amp;nbsp;board).&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;FLV&lt;/span&gt; (as a video stream format, not a container; I know it’s confusing) is becoming less and less&amp;nbsp;common.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What about audio? We used to encounter &lt;span class="caps"&gt;MP3&lt;/span&gt; pretty often, but today most audio stream data is stored as &lt;span class="caps"&gt;AAC&lt;/span&gt; (for &lt;strong&gt;A&lt;/strong&gt;dvanced &lt;strong&gt;A&lt;/strong&gt;udio &lt;strong&gt;C&lt;/strong&gt;oding, the standard that’s meant to replace &lt;span class="caps"&gt;MP3&lt;/span&gt;), Dolby (often on DVDs and Blu-rays), and sometimes Vorbis&amp;nbsp;(&lt;code&gt;.ogg&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Confused yet? Just remember that the video file you have (carrying&amp;nbsp;the &lt;code&gt;.mp4&lt;/code&gt;, &lt;code&gt;.mov&lt;/code&gt;, etc. file extension) is only the container, and it contains one or more streams of actual&amp;nbsp;data.&lt;/p&gt;
&lt;h1&gt;Encoding and&amp;nbsp;decoding&lt;/h1&gt;
&lt;p&gt;To use these streams, you need a piece of software on your computer. This piece of software en&lt;strong&gt;co&lt;/strong&gt;des or &lt;strong&gt;dec&lt;/strong&gt;odes the data stream, so it is called a &lt;strong&gt;codec&lt;/strong&gt;. If you don’t have the required codecs, you will get an error when you attempt to open a video container file that has one or more streams in that&amp;nbsp;format.&lt;/p&gt;
&lt;p&gt;The operating system you use comes bundled with support for the most common formats, although for free-and-open-source OSes (like some flavours of Linux) this may be hampered by patent and licensing&amp;nbsp;restrictions.&lt;/p&gt;
&lt;p&gt;About a decade ago, when video formats proliferated like a tropical ecosystem, codec packs containing just about every codec you need were a common sight online. Today, with most video moved to online streaming platforms, you no longer need&amp;nbsp;them.&lt;/p&gt;
&lt;h1&gt;MediaInfo: a program to decipher containers and&amp;nbsp;streams&lt;/h1&gt;
&lt;p&gt;You can use a program like &lt;a href="https://mediaarea.net/en/MediaInfo"&gt;MediaInfo&lt;/a&gt; to help you read the metadata and figure out the container and stream formats. Here’s an example of the information it shows about the only video file on my laptop at the&amp;nbsp;moment:&lt;/p&gt;
&lt;p&gt;&lt;img alt="MediaInfo screenshot showing container, video stream, and audio stream information" src="https://ngjunsiang.github.io/laymansguide/issue048_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;MediaInfo screenshot showing metadata for an &lt;span class="caps"&gt;MP4&lt;/span&gt; file containing an h264 (a.k.a. &lt;span class="caps"&gt;AVC&lt;/span&gt;) video stream and an &lt;span class="caps"&gt;AAC&lt;/span&gt; audio&amp;nbsp;stream.&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; A video container can hold one or more audio, video, or text data streams. To encode or decode a data stream, you need to have the necessary codec installed&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt;. Most video runs at 25 or 30 fps, with high-quality video going up to 60 fps. You can use a program like MediaInfo to help you decipher the streams inside a video container&amp;nbsp;file.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;The key part of this issue I really wanted to get to was about codecs. “Why can’t I open this video file?” was a much more popular question in the recent past, but it has gradually faded as more and more video gets moved to YouTube. Today, I suppose the only people who still run into this problem are teachers who come across archives of old videos while hunting for teaching&amp;nbsp;resources.&lt;/p&gt;
&lt;p&gt;But still, I anticipate that I need a gentle introduction to data encapsulation. That’s a complex way of talking about data being nested in a series of shells, like a Matryoshka doll. We’ve seen some examples from the previous season on networking: data stored in an &lt;span class="caps"&gt;HTTP&lt;/span&gt; request, which is encapsulated in a &lt;span class="caps"&gt;TCP&lt;/span&gt; packet, which is encapsulated in an &lt;span class="caps"&gt;IP&lt;/span&gt; packet before it is sent over the&amp;nbsp;Internet.&lt;/p&gt;
&lt;p&gt;Today, I can have video stream information stored in an &lt;span class="caps"&gt;MP4&lt;/span&gt; container, placed in a folder in a losslessly-compressed &lt;span class="caps"&gt;ZIP&lt;/span&gt; file (for whatever strange reason), and sent over the Internet to somebody else. Data surrounded by shells and more shells. It’s like opening a delivery box: your tiny item inside, surrounded by cardboard packaging, surrounded by bubble wrap, surrounded by a cardboard box, which was probably placed on a pallet and shipped in a shipping&amp;nbsp;container.&lt;/p&gt;
&lt;p&gt;The next few issues will continue to be about encapsulated data, but I’ll start with something simple first: what is a&amp;nbsp;file?&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; What is a&amp;nbsp;file?&lt;/p&gt;
&lt;p&gt;Sometimes, the hardest questions are deceptively simple. We all have an intuitive idea of what a file is. But what actually goes on under the&amp;nbsp;hood?&lt;/p&gt;
&lt;p&gt;See you again next week, next&amp;nbsp;issue.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;involved in installing a piece of software? [Issue&amp;nbsp;48]&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="footnote"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Come to think of it, that’s a good topic for a future issue: what goes on when a piece of software is installed on your computer?&amp;#160;&lt;a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</content><category term="Season 04"></category><category term="compression"></category></entry><entry><title>Issue 47: Lossless compression</title><link href="https://ngjunsiang.github.io/laymansguide/issue047.html" rel="alternate"></link><published>2019-11-16T08:00:00+08:00</published><updated>2019-11-16T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-11-16:/laymansguide/issue047.html</id><summary type="html">&lt;p&gt;Data cannot be compressed beyond its predictability limit (Shannon entropy) in a lossless fashion. Lossless compression does not discard any information. It generally tries to spot patterns in the data, and represent those patterns with fewer bits, through a combination of predictive coding, run-length encoding, and entropy&amp;nbsp;coding.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Computers compress image and audio data through a process similar to summarising: it analyses the data using algorithms that use brightness and colour instead of &lt;span class="caps"&gt;RGB&lt;/span&gt; values for images, and different frequencies of sound rather than samples at different points in time for audio. These algorithms then discard parts of the information that human senses do not perceive easily, and reduce the resolution of other parts that human senses are not as sensitive&amp;nbsp;to.&lt;/p&gt;
&lt;p&gt;I went into quite a bit of technical detail in the last issue, and am loath to do so again this issue. Let’s see how much math I can avoid explaining this&amp;nbsp;issue.&lt;/p&gt;
&lt;p&gt;Lossless compression is necessary in cases where the information must be stored verbatim. For example, if you are sending a &lt;span class="caps"&gt;26MB&lt;/span&gt; PowerPoint file to a friend, but Gmail’s attachment limit is only &lt;span class="caps"&gt;25MB&lt;/span&gt;, one thing you might try to do is put it into a compressed &lt;span class="caps"&gt;ZIP&lt;/span&gt; file to see if you can bring the size below &lt;span class="caps"&gt;25MB&lt;/span&gt;. However, you would not want any information to be lost when it reaches your friend; they must be able to decode the &lt;span class="caps"&gt;ZIP&lt;/span&gt; file to retrieve the original PowerPoint&amp;nbsp;file.&lt;/p&gt;
&lt;p&gt;While lossy compression depends very much on how our senses (particularly sight and hearing) work and on their deficiencies, lossless compression only depends on the characteristics of the information. Accordingly, a wide variety of lossless compression techniques have been developed, each suited for a particular domain. I will attempt to give a very brief overview of some common techniques before I explain some common things people try to do in&amp;nbsp;compression.&lt;/p&gt;
&lt;h1&gt;Lossless audio&amp;nbsp;compression&lt;/h1&gt;
&lt;p&gt;Your brain works in interesting ways. If it sees two images that are near-identical (like a game of Spot The Difference), it won’t remember them as two separate images, but as one image plus the difference between the two. So when people try to recall the two images, you hear things like “this photo had a cat and a dog staring each other down and it also had [blahblah], the other photo is exactly the same except the cat’s ears were furled back and the dog was drooling”. Certainly a lot faster than describing the second image in full all over again, with the additional&amp;nbsp;detail!&lt;/p&gt;
&lt;p&gt;Lossless audio compressors work in a similar way. They process the audio in short segments, and try to see how lazy they can get in describing the next sample. This is known as &lt;strong&gt;predictive coding&lt;/strong&gt;, because it is a little similar to the process of trying to “predict” the next sample. For example, based on the past 10 samples, a predictive algorithm might say “the next sample will be 0.09% of sample 1, plus 1.02% of sample 2, plus 5.63% of sample 3, …”. Storing those percentages once for a whole segment uses a lot less space than storing every sample in it; when decompressing, the algorithm multiplies the percentages by the respective samples to reconstruct its guess for each&amp;nbsp;sample.&lt;/p&gt;
&lt;p&gt;In lossless compression, the compressor already knows what the next sample is, so most of the work is in calculating exactly what those percentages should be. It does so by making an initial guess, then refining that guess in successive stages of calculation, each stage bringing the prediction closer to the original waveform. This requires a lot of computation time; if a faster setting is available, the algorithm can cut the process short and settle for a poorer guess. Either way, it then calculates the difference between its best guess and the original sample, and stores that difference as well. A poorer guess leaves larger differences to store, so the file is bigger, but storing the difference is what makes the result lossless rather than&amp;nbsp;lossy.&lt;/p&gt;
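&lt;p&gt;Here is a minimal sketch of the guess-plus-difference idea. It assumes a very simple fixed predictor (continue the straight line through the last two samples) rather than the tuned percentages a real lossless audio compressor would calculate; the sample values and function names are made up for illustration.&lt;/p&gt;

```python
samples = [0, 3, 6, 10, 13, 15, 16, 16, 15, 13]

def predict(a, b):
    # Guess the next sample by continuing the straight line through the last two.
    return 2 * b - a

def encode(samples):
    """Keep the first two samples as-is, then store only guess errors."""
    stored = list(samples[:2])
    for i in range(2, len(samples)):
        guess = predict(samples[i - 2], samples[i - 1])
        stored.append(samples[i] - guess)      # the difference (residual)
    return stored

def decode(stored):
    """Repeat the same guesses and add back the stored differences."""
    samples = list(stored[:2])
    for difference in stored[2:]:
        samples.append(predict(samples[-2], samples[-1]) + difference)
    return samples

stored = encode(samples)
print(stored)                       # [0, 3, 0, 1, -1, -1, -1, -1, -1, -1]
print(decode(stored) == samples)    # True
```

&lt;p&gt;The stored differences are small numbers clustered around zero, which take fewer bits to write down than the original samples, and decoding gets back exactly what went&amp;nbsp;in.&lt;/p&gt;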
&lt;h1&gt;Lossless image&amp;nbsp;compression&lt;/h1&gt;
&lt;p&gt;The most common image formats that use lossless compression are &lt;span class="caps"&gt;GIF&lt;/span&gt; (yes, really) and &lt;span class="caps"&gt;PNG&lt;/span&gt;. Some kinds of images, such as screenshots, have patterns that are repeated. The algorithms used in these formats (&lt;span class="caps"&gt;LZW&lt;/span&gt; in &lt;span class="caps"&gt;GIF&lt;/span&gt;, and &lt;span class="caps"&gt;DEFLATE&lt;/span&gt;, which builds on &lt;span class="caps"&gt;LZ77&lt;/span&gt;, in &lt;span class="caps"&gt;PNG&lt;/span&gt;) attempt to spot these patterns and replace each repeat with a short reference to an earlier occurrence. The simplest version of this idea, where a repeat is reduced to 1) the repeating portion, and 2) the number of repetitions, is known as &lt;strong&gt;run-length encoding&lt;/strong&gt;. The nature of images makes the process easier, as each colour channel of a pixel only has 256 possible values, rather than the 65536 of a 16-bit audio&amp;nbsp;sample.&lt;/p&gt;
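&lt;p&gt;Run-length encoding in its simplest form can be sketched in a few lines. The row of pixel values below is made up (think of one row of a mostly white screenshot), and the function names are only for illustration; this is not the exact scheme any particular image format uses.&lt;/p&gt;

```python
from itertools import groupby

def rle_encode(pixels):
    """Collapse each run of identical values into a (value, run length) pair."""
    return [(value, len(list(run))) for value, run in groupby(pixels)]

def rle_decode(pairs):
    """Expand (value, run length) pairs back into the original list."""
    return [value for value, count in pairs for _ in range(count)]

row = [255] * 12 + [0] * 3 + [255] * 9     # white, a bit of black, white again
encoded = rle_encode(row)
print(encoded)                              # [(255, 12), (0, 3), (255, 9)]
print(rle_decode(encoded) == row)           # True
```

&lt;p&gt;Twenty-four pixel values shrink to three pairs. On data without long runs (a photograph, say) this approach does not help and can even make things&amp;nbsp;bigger.&lt;/p&gt;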
&lt;p&gt;Those patterns are stored in a table, and &lt;em&gt;references&lt;/em&gt; to them are used instead. So instead of saying “Pattern 0101011101110110”, the algorithm will store a list of these patterns, and refer to them as Pattern 0, Pattern 1, Pattern 10, Pattern 11, … (these are 0, 1, 2, and 3 respectively, in binary representation; see &lt;a href="https://ngjunsiang.github.io/laymansguide/issue040.html"&gt;Issue 40&lt;/a&gt;).&lt;/p&gt;
&lt;p&gt;This is known as &lt;strong&gt;entropy coding&lt;/strong&gt;. By linking the most frequently used pattern with the shortest reference number (i.e. Pattern 0), the next most frequent pattern with the next-shortest reference number (Pattern 1, 10, 11, 100, 101, 110, …), and so on, you can reduce quite significantly the number of bits needed to represent the&amp;nbsp;image.&lt;/p&gt;
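&lt;p&gt;One widely used entropy coder is Huffman coding, which hands out shorter codes to patterns that occur more often. The sketch below only works out how many bits each symbol’s code would get; the input string stands in for a list of patterns, and the function name is made up for illustration.&lt;/p&gt;

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code (non-empty input)."""
    counts = Counter(symbols)
    # Heap entries: (total count, tie-breaker, {symbol: depth in the tree so far})
    heap = [(count, i, {sym: 0}) for i, (sym, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) != 1:
        # Merge the two rarest groups; every symbol inside moves one level deeper.
        count_a, _, depths_a = heapq.heappop(heap)
        count_b, _, depths_b = heapq.heappop(heap)
        merged = {sym: d + 1 for sym, d in {**depths_a, **depths_b}.items()}
        heapq.heappush(heap, (count_a + count_b, tie, merged))
        tie += 1
    return heap[0][2]

data = "aaaaaaaabbbccd"             # 'a' is common, 'd' is rare
lengths = huffman_code_lengths(data)
total_bits = sum(lengths[sym] for sym in data)
print(lengths)
print(total_bits, "bits, versus", 2 * len(data), "bits at a fixed 2 bits per symbol")
```

&lt;p&gt;Here the common symbol ends up with a 1-bit code and the rare ones with 3-bit codes, for 23 bits in total instead of&amp;nbsp;28.&lt;/p&gt;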
&lt;h1&gt;Text&amp;nbsp;compression&lt;/h1&gt;
&lt;p&gt;Text lends itself very well to compression, since there are so many repeated words and phrases. In general, text compression algorithms will use a combination of entropy coding and run-length encoding to reduce a document of text into repeating patterns, and use shorter references to those patterns rather than the full patterns&amp;nbsp;themselves.&lt;/p&gt;
&lt;h1&gt;What is the maximum possible&amp;nbsp;compression?&lt;/h1&gt;
&lt;p&gt;Excellent question. Shannon’s source coding theorem&lt;sup id="fnref:1"&gt;&lt;a class="footnote-ref" href="#fn:1"&gt;1&lt;/a&gt;&lt;/sup&gt; defines a compression limit for each block of information, called Shannon entropy (unbolded, don’t worry!). The source coding theorem says it is impossible to compress data beyond its Shannon&amp;nbsp;entropy.&lt;/p&gt;
&lt;p&gt;So what is the Shannon entropy of the data? That depends on its predictability. A block of text that only consists of the letter ‘e’ would be highly predictable, and therefore have a low Shannon entropy (I will stop using this term and use &lt;strong&gt;predictability limit&lt;/strong&gt; instead). A block of text that is just completely random characters would be unpredictable and would therefore have a high Shannon&amp;nbsp;entropy.&lt;/p&gt;
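&lt;p&gt;For the curious, a rough estimate of this limit can be calculated from how often each character appears. The sketch below treats every character as independent of its neighbours, which is a simplifying assumption (real text has patterns across characters too), and the function name is made up for illustration.&lt;/p&gt;

```python
import math
from collections import Counter

def entropy_bits_per_symbol(text):
    """Estimate Shannon entropy in bits per character from character frequencies."""
    counts = Counter(text)
    total = len(text)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

print(entropy_bits_per_symbol("eeeeeeee"))    # 0.0  (completely predictable)
print(entropy_bits_per_symbol("abcdefgh"))    # 3.0  (8 equally likely characters)
```

&lt;p&gt;A result of 0 bits per character means there is nothing left to say once you know the length; 3 bits per character is the most you can need when choosing among 8 equally likely&amp;nbsp;characters.&lt;/p&gt;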
&lt;p&gt;&lt;em&gt;tl;dr&lt;/em&gt; higher predictability = higher (lossless) compression, lower predictability = lower (lossless)&amp;nbsp;compression&lt;/p&gt;
&lt;p&gt;And now it is myth-busting time! Well, not really, since most observant folks would have noticed this by&amp;nbsp;now.&lt;/p&gt;
&lt;h1&gt;When I put a zip file in another zip file, why is the second zip file no smaller in size than the&amp;nbsp;first?&lt;/h1&gt;
&lt;p&gt;When the first zip file compressed its contents, the predictability of the resulting data decreased (ever tried compressing shorthand?). You won’t get very far trying to compress unpredictable&amp;nbsp;data.&lt;/p&gt;
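&lt;p&gt;You can try this yourself with Python’s built-in &lt;code&gt;zlib&lt;/code&gt; module, which implements &lt;span class="caps"&gt;DEFLATE&lt;/span&gt;, the compression method most zip files use. The text being compressed is a made-up jumble of words, assumed here to stand in for an ordinary document.&lt;/p&gt;

```python
import random
import zlib

random.seed(0)   # fixed seed so every run builds the same text
words = ["video", "audio", "stream", "codec", "container", "frame", "pixel", "sample"]
text = " ".join(random.choice(words) for _ in range(20000)).encode("ascii")

once = zlib.compress(text, 9)     # compress the original text
twice = zlib.compress(once, 9)    # compress the already-compressed result

print("original:        ", len(text), "bytes")
print("compressed once: ", len(once), "bytes")
print("compressed twice:", len(twice), "bytes")
```

&lt;p&gt;The first pass should shrink the text considerably, while the second pass has almost nothing predictable left to work with, so it typically ends up about the same size or slightly&amp;nbsp;larger.&lt;/p&gt;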
&lt;p&gt;If you want greater compression, use a higher compression setting on the original file&amp;nbsp;instead.&lt;/p&gt;
&lt;p&gt;&lt;img alt="7zip archive settings, showing options for compression level, compression method, and dictionary size" src="https://ngjunsiang.github.io/laymansguide/issue047_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;7zip archive settings for zip files.&lt;br /&gt;Image from &lt;a href="https://en.wikipedia.org/wiki/File:Colorcomp.jpg"&gt;Wikimedia&amp;nbsp;Commons&lt;/a&gt;&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;A higher &lt;em&gt;compression level&lt;/em&gt; generally causes the algorithm to try more combinations and iterations of compression, while a larger &lt;em&gt;dictionary size&lt;/em&gt; enables the algorithm to use more pattern references. Play with these two settings to find the best tradeoff between compression time and compression ratio (the ratio of final filesize to original&amp;nbsp;filesize).&lt;/p&gt;
&lt;h1&gt;Why do PowerPoint files sometimes compress very well and sometimes not at&amp;nbsp;all?&lt;/h1&gt;
&lt;p&gt;PowerPoint is already a compressed file format, so the only filesize gains you will get are from compressing embedded media, such as videos or images. If you used any uncompressed images, you might be able to achieve some filesize gains. But it is better to have PowerPoint handle the compression instead; it offers a &lt;a href="https://highspark.co/how-to-compress-powerpoint/"&gt;Compress Pictures&lt;/a&gt;&amp;nbsp;option.&lt;/p&gt;
&lt;h1&gt;You talk about your highfalutin Shannon entropy, but I can find so many tiny video and image files online! How do they achieve&amp;nbsp;that?&lt;/h1&gt;
&lt;p&gt;Shannon’s source coding theorem does not claim that you cannot compress data beyond its predictability limit. It only claims that you cannot do so losslessly. Which means you can compress data beyond its predictability limit, &lt;em&gt;lossily&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;You are getting video and image files from those sources with lots of information thrown away. If you can’t tell the difference, good for&amp;nbsp;you.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Data cannot be compressed beyond its predictability limit (Shannon entropy) in a lossless fashion. Lossless compression does not discard any information. It generally tries to spot patterns in the data, and represent those patterns with fewer bits, through a combination of predictive coding, run-length encoding, and entropy&amp;nbsp;coding.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Predictive coding:&lt;/strong&gt; express samples as a combination of past samples&lt;br /&gt;
&lt;strong&gt;Run-length encoding:&lt;/strong&gt; spot repetitions of patterns in the data&lt;br /&gt;
&lt;strong&gt;Entropy coding:&lt;/strong&gt; Store the list of patterns, using a shorter symbol as reference to the&amp;nbsp;pattern&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;If the lossy compression articles are hard to read, the lossless compression articles are even worse, because so much of it is math theory. I got the gist of it as best I&amp;nbsp;could.&lt;/p&gt;
&lt;p&gt;I don’t like the way most layman explanations in the media completely skip over the details; before I understood lossless compression, these explanations were often no help to me. I think at least knowing what kind of patterns can be found in the data would help with imagining the process, hence the crash-course introductions to predictive coding, run-length encoding, and entropy&amp;nbsp;coding.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Of containers and&amp;nbsp;codecs&lt;/p&gt;
&lt;p&gt;Why have we been talking so much about images and audio and compression? Because I want to get to the meat, which is: video formats! This is probably the single biggest source of confusion for most people who come to look for me regarding file types: “What kind of video file is this? How do I open it? Why can’t it open?” Next issue: a simple way to understand video formats and what they&amp;nbsp;need.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;/ul&gt;
&lt;div class="footnote"&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id="fn:1"&gt;
&lt;p&gt;Yes, that’s the same Shannon from Nyquist-Shannon sampling theorem. Claude Shannon is lauded as “the father of information theory” with good reason.&amp;#160;&lt;a class="footnote-backref" href="#fnref:1" title="Jump back to footnote 1 in the text"&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;</content><category term="Season 04"></category><category term="compression"></category></entry><entry><title>Issue 46: Lossy compression</title><link href="https://ngjunsiang.github.io/laymansguide/issue046.html" rel="alternate"></link><published>2019-11-09T08:00:00+08:00</published><updated>2019-11-09T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-11-09:/laymansguide/issue046.html</id><summary type="html">&lt;p&gt;Computers compress image and audio data through a process similar to summarising: it analyses the data using algorithms that use brightness and colour instead of &lt;span class="caps"&gt;RGB&lt;/span&gt; values for images, and different frequencies of sound rather than samples at different points in time for audio. These algorithms then discard parts of the information that human senses do not perceive easily, and reduce the resolution of other parts that human senses are not as sensitive&amp;nbsp;to.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Humans can distinguish 120 dB of loudness, which means the loudest perceivable sound is a million times louder than the softest perceivable sound. &lt;span class="caps"&gt;CD&lt;/span&gt; audio provides 16 bits of information per sample, sufficient to provide 96 dB. Humans have a hearing range from 20 Hz to 20 kHz. &lt;span class="caps"&gt;CD&lt;/span&gt; audio is sampled at 44.1 kHz. Uncompressed audio thus requires 705,600 bits per second, or 86&amp;nbsp;kB/s.&lt;/p&gt;
&lt;p&gt;Lots of numbers in the last issue, and you don’t need to memorise any of them, but those numbers were necessary to demonstrate some fundamental facts about data and information: we need a heck of a lot of data to produce images that don’t look distorted and audio that doesn’t sound distorted! And this is closely related to the limits of our eyes and&amp;nbsp;ears.&lt;/p&gt;
&lt;h2&gt;Why are the images and audio files on the internet so much&amp;nbsp;smaller?&lt;/h2&gt;
&lt;p&gt;Because they are compressed, that’s&amp;nbsp;why.&lt;/p&gt;
&lt;p&gt;We all have that one friend (or maybe more) who can just drone on and on about their day, or about something that happened, giving a detailed account with every little thing that happened, and all the things that it reminds them of, and finally in their entire speech there’s that piece of information you are looking&amp;nbsp;for!&lt;/p&gt;
&lt;p&gt;Or maybe you’ve been in an hour-long meeting and your colleague missed it and asked you what they missed. Would it take you an hour to recount the key points? Probably not. You’d give a summary, highlighting only the key bits that would make a&amp;nbsp;difference.&lt;/p&gt;
&lt;p&gt;Computers do something similar using &lt;strong&gt;compression algorithms&lt;/strong&gt; that analyse the data and figure out which parts can be safely discarded without affecting the gist of what’s being transferred. Because information is being discarded, this is known as &lt;strong&gt;lossy compression&lt;/strong&gt;—you can never get back &lt;em&gt;all of&lt;/em&gt; the original information once it has been lossily&amp;nbsp;compressed.&lt;/p&gt;
&lt;p&gt;If you’re thinking “this part is going to be incredibly math-ey”, you are right, but I have only an hour for this issue so I’ll see how I can further summarise the theory for you readers&amp;nbsp;:)&lt;/p&gt;
&lt;h2&gt;Lossy image compression: luma and&amp;nbsp;chroma&lt;/h2&gt;
&lt;p&gt;In &lt;a href="https://ngjunsiang.github.io/laymansguide/issue044.html"&gt;Issue 44&lt;/a&gt;, I mentioned that the human eye has 3 types of cones that sense red, green, and blue light. What I didn’t mention then is that partly due to the way these cones are distributed, the human eye is more sensitive to differences in brightness (or “&lt;strong&gt;luma&lt;/strong&gt;”) than differences in colour (“&lt;strong&gt;chroma&lt;/strong&gt;”).&lt;/p&gt;
&lt;p&gt;A black-and-white image has only luma information (brightness), while a colour image has both luma and chroma information—you can mathematically separate the data of a colour image into the brightness component (which looks just like a black-and-white photo), and a colour component, which looks like nothing you have ever seen. The closest thing to chroma information would be analog colour photo negatives, if you were born early enough to get to see&amp;nbsp;those.&lt;/p&gt;
&lt;p&gt;So that’s another way of representing image information: you can either represent it as &lt;span class="caps"&gt;RGB&lt;/span&gt; (red-green-blue) colour values, or &lt;span class="caps"&gt;YUV&lt;/span&gt; (1 luma value, Y, and 2 chroma values, U &lt;span class="amp"&gt;&amp;amp;&lt;/span&gt; V). In &lt;span class="caps"&gt;RGB&lt;/span&gt;, all 3 colour components are equally important and you can’t treat them differently, but in &lt;span class="caps"&gt;YUV&lt;/span&gt; you &lt;em&gt;can&lt;/em&gt; process them differently to achieve lossy&amp;nbsp;compression.&lt;/p&gt;
&lt;h2&gt;Lossy image compression:&amp;nbsp;chroma&lt;/h2&gt;
&lt;p&gt;Since the human eye is less sensitive to chroma (colour) information, in the &lt;span class="caps"&gt;JPEG&lt;/span&gt; image format, the chroma components are compressed by averaging each 2×2 group of pixels into 1 value for U and V each. (This process is known as subsampling.) Theoretically that halves the amount of data required for the same image! (Each 2×2 group originally needs 4 Y + 4 U + 4 V = 12 values; after subsampling it needs 4 Y + 1 U + 1 V = 6 values, or 6/12 of the original&amp;nbsp;information.)&lt;/p&gt;
&lt;p&gt;&lt;img alt="4 images with different chroma subsampling" src="https://ngjunsiang.github.io/laymansguide/issue046_01.jpg" /&gt;&lt;br /&gt;
&lt;em&gt;Compare the image without chroma compression (4:4:4) to the image with chroma compression (4:2:0).&lt;br /&gt;Without scrutiny, the human eye is not very sensitive to lower resolution in chroma.&lt;br /&gt;Image from &lt;a href="https://en.wikipedia.org/wiki/File:Colorcomp.jpg"&gt;Wikimedia&amp;nbsp;Commons&lt;/a&gt;&lt;/em&gt;    &lt;/p&gt;
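&lt;p&gt;A small sketch of that 2×2 averaging, assuming the image has already been split into Y, U, and V planes and that its width and height are even. The pixel values and the function name are made up for illustration.&lt;/p&gt;

```python
def subsample_2x2(channel):
    """Average each 2x2 block of a channel (a list of equal-length rows)."""
    out = []
    for r in range(0, len(channel), 2):
        row = []
        for c in range(0, len(channel[0]), 2):
            block = [channel[r][c], channel[r][c + 1],
                     channel[r + 1][c], channel[r + 1][c + 1]]
            row.append(sum(block) / 4)
        out.append(row)
    return out

# One chroma plane of a 4x4 image (made-up values)
U = [[100, 102, 50, 52],
     [104, 106, 54, 56],
     [10, 12, 200, 202],
     [14, 16, 204, 206]]

print(subsample_2x2(U))               # [[103.0, 53.0], [13.0, 203.0]]

pixels = 4 * 4
full = 3 * pixels                     # Y, U and V all at full resolution
subsampled = pixels + 2 * (pixels // 4)   # full Y, quarter-size U and V
print(subsampled, "values instead of", full)   # 24 values instead of 48
```

&lt;p&gt;The luma plane is left untouched, so edges and detail stay sharp; only the colour information becomes&amp;nbsp;coarser.&lt;/p&gt;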
&lt;h2&gt;Lossy image compression:&amp;nbsp;luma&lt;/h2&gt;
&lt;p&gt;Furthermore, even within the luma channel (i.e. looking at luma information only), the human eye is more sensitive to gradual changes in brightness across a region than to rapid, fine-grained changes between adjacent pixels. Through a Discrete Cosine Transform (&lt;span class="caps"&gt;DCT&lt;/span&gt;) algorithm, a computer can separate the luma information into parts with gradual changes and parts with rapid&amp;nbsp;changes.&lt;/p&gt;
&lt;p&gt;As the compression level increases (this is the quality setting you often play with in Photoshop and other image-editing software), the computer discards more and more information, starting from the rapid-change (fine detail) information. For photographic images, you will generally hit diminishing returns below 85%: each 1% decrease in quality brings you less and less savings on&amp;nbsp;filesize.&lt;/p&gt;
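&lt;p&gt;To see what the &lt;span class="caps"&gt;DCT&lt;/span&gt; does, here is a one-dimensional sketch applied to a single row of 8 brightness values. &lt;span class="caps"&gt;JPEG&lt;/span&gt; itself works on 8×8 blocks in two dimensions and adds further steps, so treat this as an illustration only; the two rows of values are made up.&lt;/p&gt;

```python
import math

def dct(values):
    """Unnormalised DCT-II of a list of numbers.

    Output 0 is the sum of the inputs (overall brightness); higher outputs
    measure faster and faster variation along the row.
    """
    n = len(values)
    return [sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                for i, v in enumerate(values))
            for k in range(n)]

smooth = [100, 101, 102, 103, 104, 105, 106, 107]   # gentle ramp
sharp = [100, 100, 100, 100, 200, 200, 200, 200]    # hard edge

print([round(c, 1) for c in dct(smooth)])
print([round(c, 1) for c in dct(sharp)])
```

&lt;p&gt;For the gentle ramp, nearly everything lands in the first couple of outputs and the rest are close to zero, so they can be stored very coarsely or dropped. The hard edge spreads noticeably into the higher outputs, which is why heavily compressed images look smudgy around sharp&amp;nbsp;edges.&lt;/p&gt;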
&lt;p&gt;And that, in a nutshell, is how most lossy image compression works, and how the &lt;span class="caps"&gt;JPEG&lt;/span&gt; format works (well, okay, I’ve explained the main 30% of it&amp;nbsp;maybe).&lt;/p&gt;
&lt;h2&gt;Lossy audio compression: discarding what we can’t&amp;nbsp;hear&lt;/h2&gt;
&lt;p&gt;What about&amp;nbsp;audio?&lt;/p&gt;
&lt;p&gt;If you are all about the bass, or like tweaking sound settings, or have worked with audio systems before (e.g. for a performance or for your school’s events), you would have used an equaliser at some point. An equaliser is a device (or software application) that lets you adjust how much bass (low pitch), mids (middle pitch), and treble (high pitch) you want from the sound. How is the system able to do&amp;nbsp;that?&lt;/p&gt;
&lt;p&gt;Through transforms! &lt;span class="caps"&gt;DCT&lt;/span&gt;, mentioned earlier, is one such transform; audio formats often use another one, known as the Fast Fourier Transform (&lt;span class="caps"&gt;FFT&lt;/span&gt;). (Aren’t you glad this is a newsletter about computing and not about math?) Anyway, a transform lets us transform information organised by position (e.g. in images) or by time (e.g. in audio) into information organised by other properties, such as&amp;nbsp;frequency.&lt;/p&gt;
&lt;p&gt;The &lt;span class="caps"&gt;FFT&lt;/span&gt; algorithm organises audio information (for a certain time length) by frequency. Depending on your equaliser settings, it increases or decreases the weightage of different frequencies to produce the sound you want, be it bass-heavy rock or medium-light&amp;nbsp;jazz.&lt;/p&gt;
&lt;p&gt;But the &lt;span class="caps"&gt;FFT&lt;/span&gt; algorithm can do much more! It is known that most sounds we hear are typically in the 40 Hz to 19 kHz range, so it is usually a safe bet to discard frequency information below 40 Hz and above 19 kHz. If we lower that upper cutoff to 16 kHz and discard everything above it, we can reduce the amount of audio information even&amp;nbsp;more.&lt;/p&gt;
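&lt;p&gt;If you are curious what “discarding frequency information” might look like, here is a minimal sketch. It assumes a toy signal of my own invention (a 4 Hz tone mixed with a 20 Hz tone) and uses a slow, naive version of the Fourier transform rather than a real &lt;span class="caps"&gt;FFT&lt;/span&gt;; the function names are made up for illustration, and real audio formats do considerably more than this.&lt;/p&gt;

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform: samples over time in, frequency bins out."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def inverse_dft(bins):
    """Turn frequency bins back into (real-valued) samples over time."""
    n = len(bins)
    return [
        (sum(bins[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

def discard_above(samples, sample_rate, cutoff_hz):
    """Zero every frequency bin above cutoff_hz, then rebuild the samples."""
    n = len(samples)
    kept = []
    for k, value in enumerate(dft(samples)):
        # Bins k and n - k are mirror images describing the same frequency.
        frequency = min(k, n - k) * sample_rate / n
        kept.append(0 if frequency > cutoff_hz else value)
    return inverse_dft(kept)

# Toy signal: a 4 Hz tone plus a 20 Hz tone, 64 samples taken over one second.
RATE = 64
low = [math.sin(2 * math.pi * 4 * t / RATE) for t in range(RATE)]
high = [math.sin(2 * math.pi * 20 * t / RATE) for t in range(RATE)]
mixed = [a + b for a, b in zip(low, high)]

filtered = discard_above(mixed, RATE, cutoff_hz=10)
# filtered now matches the 4 Hz tone alone, to within rounding error.
```

&lt;p&gt;The 20 Hz tone is gone, and the bins that were zeroed no longer need to be stored. An actual &lt;span class="caps"&gt;FFT&lt;/span&gt; produces the same bins as this sketch, just far more&amp;nbsp;quickly.&lt;/p&gt;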
&lt;h2&gt;Lossy audio compression:&amp;nbsp;masking&lt;/h2&gt;
&lt;p&gt;It is also known that the human ear, when it hears a very loud sound around one frequency, will not process much softer sounds at nearby frequencies. This is known as masking. With the help of the &lt;span class="caps"&gt;FFT&lt;/span&gt; algorithm, it’s easy to identify which frequencies will be masked for each range of time samples, and can therefore be&amp;nbsp;discarded.&lt;/p&gt;
&lt;p&gt;Furthermore, because of the way the cochlea works, right after hearing a very loud sound, the ear will not be able to hear softer sounds for a fraction of a second (maybe the fluid in the cochlea of the ear needs some time to settle? I don’t know). So softer sounds occurring right after a loud sound are &lt;strong&gt;masked&lt;/strong&gt;. We can discard that audio information&amp;nbsp;too.&lt;/p&gt;
&lt;p&gt;Lastly, long periods of silence (a couple of seconds, for example) are not worth the space they take up either, and can be compressed&amp;nbsp;further.&lt;/p&gt;
&lt;h2&gt;Lossy audio compression: lowering dynamic&amp;nbsp;range&lt;/h2&gt;
&lt;p&gt;We don’t always need to record audio with the full dynamic range of human hearing. For an orchestra concert, maybe that is important, but if you are just recording an interview, you don’t need to hear every tiny detail of how that person speaks (unless maybe you’re a doctor who can pick up telltale signs of cancer from the way a person speaks? That would be&amp;nbsp;amazing.).&lt;/p&gt;
&lt;p&gt;The fundamental frequency of the human voice typically ranges from 85 to 255 Hz, with the overtones that make speech intelligible reaching a few kHz, and speech only covers a loudness range of up to 65 dB. That’s about 30 dB less than the 96 dB of &lt;span class="caps"&gt;CD&lt;/span&gt; audio, which means we don’t need 16-bit audio to store that; about 11 or 12 bits would be sufficient. And you won’t need a 44.1 kHz sampling rate either; 11.025 kHz, which can represent frequencies up to about 5.5 kHz, is&amp;nbsp;sufficient.&lt;/p&gt;
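&lt;p&gt;For those who like to check the arithmetic, here is a rough sketch of where “about 11 bits” comes from and how much it saves. It assumes the usual rule of thumb that each bit adds about 6 dB of range, and the function name is mine; treat it as a back-of-envelope calculation rather than how any particular voice format actually&amp;nbsp;works.&lt;/p&gt;

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # each extra bit adds about 6.02 dB of range

def bits_for_range(db_range):
    """Smallest whole number of bits per sample that covers db_range."""
    return math.ceil(db_range / DB_PER_BIT)

voice_bits = bits_for_range(65)      # 11
cd_bits = bits_for_range(96)         # 16

voice_rate = voice_bits * 11025      # 121275 bits per second
cd_rate = cd_bits * 44100            # 705600 bits per second
```

&lt;p&gt;So before any of the cleverer tricks are applied, a voice recording already needs less than a fifth of the bits per second that one channel of &lt;span class="caps"&gt;CD&lt;/span&gt; audio&amp;nbsp;does.&lt;/p&gt;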
&lt;p&gt;That, in a nutshell, is how we get such small images and audio files on the internet. If you’re particularly sensitive you can often make out the difference caused by this lost information. But most of the time, we’re not listening or looking closely, and it’s easy to overlook such minor&amp;nbsp;differences.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Computers compress image and audio data through a process similar to summarising: they analyse the data using algorithms that use brightness and colour instead of &lt;span class="caps"&gt;RGB&lt;/span&gt; values for images, and different frequencies of sound rather than samples at different points in time for audio. These algorithms then discard parts of the information that human senses do not perceive easily, and reduce the resolution of other parts that human senses are not as sensitive&amp;nbsp;to.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;It took me a long while to understand the lossy compression algorithms well enough to explain them simply, and even longer to summarise them still further without using terms like &lt;span class="caps"&gt;RLE&lt;/span&gt;, high- and low-frequency components, and subsampling. If you found the previous two issues overly technical, I hope this issue makes up for that by helping you understand compression in less time than detailed technical articles elsewhere, yet in more depth than your mainstream internet&amp;nbsp;sources.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Lossless compression: like repacking but for&amp;nbsp;data&lt;/p&gt;
&lt;p&gt;If you’ve bought anything online before, you know how much of the space is taken up by packing peanuts or styrofoam or recycled cardboard or crumpled brown paper or those airbag things. You might also know about how some third-party shipping services help you cut down on shipping costs by repacking your items together before shipping so as to reduce the volumetric weight that you have to pay for. In all cases, you’re still getting the same thing, just in a smaller&amp;nbsp;package.&lt;/p&gt;
&lt;p&gt;Computers can also do something similar: give you the exact same information but in a smaller filesize. This is lossless compression, in contrast with what you learnt this issue on lossy compression (which keeps the gist of things but does not give the exact same information). How do computers do this? And when will you want to use lossless vs lossy&amp;nbsp;compression?&lt;/p&gt;
&lt;p&gt;I didn’t manage to get into what happens when you save, edit, and re-save a &lt;span class="caps"&gt;JPEG&lt;/span&gt; image repeatedly in this issue, so I’ll see how I can work it into the next issue&amp;nbsp;:)&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;What is &lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;li&gt;What is OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;&lt;del&gt;What is compression? [Issue 43]&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Why are music files so large when a voice call over internet uses so little data? [Issue 45]&lt;/del&gt;&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category><category term="compression"></category></entry><entry><title>Issue 45: Audio, a sampling of values</title><link href="https://ngjunsiang.github.io/laymansguide/issue045.html" rel="alternate"></link><published>2019-11-02T08:00:00+08:00</published><updated>2019-11-02T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-11-02:/laymansguide/issue045.html</id><summary type="html">&lt;p&gt;Humans can distinguish 120 dB of loudness, which means the loudest perceivable sound is a million times louder than the softest perceivable sound. &lt;span class="caps"&gt;CD&lt;/span&gt; audio provides 16 bits of information per sample, sufficient to provide 96 dB. Humans have a hearing range from 20 Hz to 20 kHz. &lt;span class="caps"&gt;CD&lt;/span&gt; audio is sampled at 44.1 kHz. Uncompressed audio thus requires 705,600 bits per second, or 86&amp;nbsp;KiB/s.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; An image’s resolution describes its dimensions. Its pixel resolution gives an indication of its physical size (if printed or displayed on a screen), and thus its sharpness. A display with imperceptibly small pixels is often referred to as a Retina display (Apple’s branding) or as a high-&lt;span class="caps"&gt;PPI&lt;/span&gt; display; this requires at least 220 &lt;span class="caps"&gt;PPI&lt;/span&gt; (pixels per inch) nominally. For an image to be printed sharply, it needs at least 300 &lt;span class="caps"&gt;DPI&lt;/span&gt; (dots per inch) on&amp;nbsp;paper.&lt;/p&gt;
&lt;p&gt;I uncovered some of the complexity of image display last issue, and I hope it has helped you to see that computers and humans influence each other very closely. The design of computers and the way they store information is inextricably linked to the way humans store and display information too. Colours are stored as &lt;span class="caps"&gt;RGB&lt;/span&gt; (red-green-blue) values because that is how humans perceive colour as well. And the monitors we buy have a pixel density that is just high enough for us to not perceive individual pixels&amp;nbsp;easily.&lt;/p&gt;
&lt;p&gt;This issue, we will explore the limits of human sensory perception again, and how it influences the way computers store information. This issue, we talk about audio. And let’s start with a question that I pondered a few years&amp;nbsp;back:&lt;/p&gt;
&lt;h2&gt;Why are music files so large (a few &lt;span class="caps"&gt;MB&lt;/span&gt;) when a voice call over internet uses so little&amp;nbsp;data?&lt;/h2&gt;
&lt;p&gt;This is not a question I can answer within one issue; you will have to wait until next issue for a complete answer :) But let’s start&amp;nbsp;here.&lt;/p&gt;
&lt;p&gt;We’ll answer the first part, why audio files are so large, by looking at just how much information we need to provide an undistracting audio&amp;nbsp;experience.&lt;/p&gt;
&lt;h2&gt;The human&amp;nbsp;ear&lt;/h2&gt;
&lt;p&gt;Humans detect sound through vibrations of the eardrum which are transmitted through the cochlea of the inner ear. These vibrations are caused by variations of air pressure in the ear, which in turn are caused by vibrations in the&amp;nbsp;air.&lt;/p&gt;
&lt;p&gt;These vibrations can be produced by computers through speakers. The cones of a speaker, which are the movable rubber parts, are connected to electromagnets which control the movement of the cone. The computer sends a signal that causes the cones to move in a particular pattern that produces &amp;#8230; sound, or sometimes&amp;nbsp;music!&lt;/p&gt;
&lt;h2&gt;Converting sound to&amp;nbsp;data&lt;/h2&gt;
&lt;p&gt;If we plot the vibrations of air on a graph known as a waveform, they look something like&amp;nbsp;this:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Audio waveform" src="https://ngjunsiang.github.io/laymansguide/issue045_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;An audio waveform&lt;br /&gt;Image by &lt;a href="https://pixabay.com/users/GDJ-1086657/?utm_source=link-attribution&amp;amp;utm_medium=referral&amp;amp;utm_campaign=image&amp;amp;utm_content=1781570"&gt;Gordon Johnson&lt;/a&gt; from &lt;a href="https://pixabay.com/?utm_source=link-attribution&amp;amp;utm_medium=referral&amp;amp;utm_campaign=image&amp;amp;utm_content=1781570"&gt;Pixabay&lt;/a&gt;.&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;This waveform is converted into numeric values through a process called &lt;strong&gt;Pulse Code Modulation&lt;/strong&gt; (&lt;span class="caps"&gt;PCM&lt;/span&gt;). If you see the acronym &lt;span class="caps"&gt;PCM&lt;/span&gt; or &lt;span class="caps"&gt;LPCM&lt;/span&gt; in any audio-related file, this is likely what it is referring&amp;nbsp;to.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Audio waveform" src="https://ngjunsiang.github.io/laymansguide/issue045_02.png" /&gt;
&lt;small&gt;Pulse code modulation to convert a waveform into numeric values&lt;br /&gt;
Image from &lt;a href="https://en.wikipedia.org/wiki/File:Pcm.svg"&gt;Wikimedia Commons&lt;/a&gt;.&lt;/small&gt;&lt;/p&gt;
&lt;p&gt;These numeric values can then be stored digitally as bits (&lt;a href="https://ngjunsiang.github.io/laymansguide/issue040.html"&gt;Issue 40&lt;/a&gt;).&lt;/p&gt;
&lt;h2&gt;How high do these values go? Of decibels and&amp;nbsp;dB&lt;/h2&gt;
&lt;p&gt;Well … how many different values would we need? That depends on how loud we need the sound to be … or does&amp;nbsp;it?&lt;/p&gt;
&lt;p&gt;The maximum loudness actually depends on your speaker, not on the signal. The number of levels we can represent in the sound should depend on the range between the loudest and softest sound, shouldn’t it? If we have sixteen levels, we can represent a range of sound where the softest sound is no softer than 16 times below the loudest sound. Any sounds softer than that can’t be represented on the&amp;nbsp;waveform.&lt;/p&gt;
&lt;p&gt;So just how many levels can the ear make out? Welcome to the field of psychoacoustics, the study of how sound is processed in the ear and perceived in the&amp;nbsp;brain.&lt;/p&gt;
&lt;p&gt;Loudness is measured in &lt;strong&gt;decibels&lt;/strong&gt; (dB). The softest sound the human ear can hear corresponds to 20 microPa (microPascals) of pressure; this is taken to be 0 dB, a reference point. A sound with 10 times that pressure (200 microPa) is 20 dB, so every increase of 20 dB represents a tenfold increase in sound pressure. A jet liner taking off (120 dB) produces 10^6 times the pressure of the softest audible sound: a million times more! That is generally the limit of human hearing: from 0 to 120 dB, or a range of 120&amp;nbsp;dB.&lt;/p&gt;
&lt;p&gt;&lt;span class="caps"&gt;CD&lt;/span&gt;-quality audio uses 16 bits to store a single sample of sound; that provides 65,536 (2^16) different levels, which corresponds to a 96 dB range of loudness. I doubt we will find speakers that can produce close to jet engine levels of sound, and if we do, they probably won’t be using &lt;span class="caps"&gt;CD&lt;/span&gt; Audio as a sound format, so this is pretty much sufficient for most quality audio you’ll find on the&amp;nbsp;internet.&lt;/p&gt;
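&lt;p&gt;The decibel figures above all come from one formula: 20 times the base-10 logarithm of a ratio. Here is a small sketch that reproduces them; the function names are my own, and the 20 Pa value for the jet liner is simply a million times the 20 microPa reference&amp;nbsp;point.&lt;/p&gt;

```python
import math

REFERENCE_PRESSURE = 20e-6  # 20 microPa in pascals, the 0 dB reference point

def pressure_to_db(pascals):
    """Decibels of a sound pressure relative to the softest audible sound."""
    return 20 * math.log10(pascals / REFERENCE_PRESSURE)

def levels_to_db(levels):
    """Decibel range between the smallest step and the full set of levels."""
    return 20 * math.log10(levels)

ten_times = round(pressure_to_db(200e-6))   # 20
jet_liner = round(pressure_to_db(20))       # 120
cd_range = round(levels_to_db(2 ** 16))     # 96
```

&lt;p&gt;Swap in &lt;code&gt;2 ** 24&lt;/code&gt; and you will see why audiophiles like 24-bit audio: the range grows to about 144&amp;nbsp;dB.&lt;/p&gt;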
&lt;p&gt;Today, 16-bit audio is pretty much standard on all computers. Audiophiles will tout the benefits of 24-bit audio, but we won’t go into detail on that in a layman’s guide to&amp;nbsp;computing.&lt;/p&gt;
&lt;p&gt;So each point produced from pulse code modulation (&lt;span class="caps"&gt;PCM&lt;/span&gt;, above) of sound contains 16 bits (2 bytes) of information. How many samples do we&amp;nbsp;need?&lt;/p&gt;
&lt;h2&gt;Sampling and&amp;nbsp;frequency&lt;/h2&gt;
&lt;p&gt;You probably can’t make out the individual waves in the waveform shown earlier in this issue; that’s because the waveform is visually squeezed horizontally. But if we expanded it, you would be able to make out individual&amp;nbsp;waves.&lt;/p&gt;
&lt;p&gt;A sound with higher pitch has higher frequency; it has more waves per second. A sound with lower pitch has lower frequency; it has fewer waves per second. It is the upper limit we need to worry about: we must have enough samples per second to be able to represent so many waves. To be able to see a complete wave, we need at least two points: one for the peak, and one for the&amp;nbsp;valley.&lt;/p&gt;
&lt;p&gt;This agrees with what signal engineers learn from the &lt;a href="https://en.wikipedia.org/wiki/Nyquist–Shannon_sampling_theorem"&gt;Nyquist-Shannon sampling theorem&lt;/a&gt;: to store a 1 Hz sound (1 wave per second), you need at least 2 samples per second (to distinguish the peak and valley of the&amp;nbsp;wave).&lt;/p&gt;
&lt;p&gt;Human hearing ranges from 20 Hz to 20 kHz (that’s 20,000 Hz). To store a 20 kHz sound, you need at least 40,000 samples per second. &lt;span class="caps"&gt;CD&lt;/span&gt;-quality audio is sampled at 44,100 samples per second (enough for up to 22.05 kHz), which is sufficient to cover the human hearing range of&amp;nbsp;frequencies.&lt;/p&gt;
&lt;p&gt;So for 1 second of uncompressed audio (on a single channel), we will need 16 bits × 44,100 samples = 705,600 bits, or about 86 KiB. 1 minute of uncompressed audio would be 5.05&amp;nbsp;MiB!&lt;/p&gt;
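&lt;p&gt;Here is that calculation as a short sketch you can play with. The function name and the &lt;code&gt;channels&lt;/code&gt; parameter are my own additions; the figures in this issue are for one channel, while a stereo recording has two.&lt;/p&gt;

```python
def pcm_bits(bits_per_sample, sample_rate, seconds, channels=1):
    """Bits needed to store uncompressed PCM audio."""
    return bits_per_sample * sample_rate * seconds * channels

one_second = pcm_bits(16, 44100, 1)                          # 705600
kib_per_second = one_second / 8 / 1024                       # about 86.13
mib_per_minute = pcm_bits(16, 44100, 60) / 8 / 1024 / 1024   # about 5.05
stereo_second = pcm_bits(16, 44100, 1, channels=2)           # 1411200
```

&lt;p&gt;Dividing by 8 turns bits into bytes, and each division by 1024 moves up one unit, from bytes to KiB to&amp;nbsp;MiB.&lt;/p&gt;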
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Humans can distinguish 120 dB of loudness, which means the loudest perceivable sound is a million times louder than the softest perceivable sound. &lt;span class="caps"&gt;CD&lt;/span&gt; audio provides 16 bits of information per sample, sufficient to provide 96 dB. Humans have a hearing range from 20 Hz to 20 kHz. &lt;span class="caps"&gt;CD&lt;/span&gt; audio is sampled at 44.1 kHz. Uncompressed audio thus requires 705,600 bits per second, or 86&amp;nbsp;KiB/s.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;This issue and the past 2 issues set the stage for the next issue, which is the first milestone for this season when I can finally introduce compression! And then we will finally get to answering the question: Why are music files so large (a few &lt;span class="caps"&gt;MB&lt;/span&gt;) when a voice call over internet uses so little&amp;nbsp;data?&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Lossy compression: a computer’s attempt to&amp;nbsp;summarise&lt;/p&gt;
&lt;p&gt;We’ve all done this before. “What were you talking about with X?” “Oh, we were just talking about Y. I said &lt;em&gt;blah&lt;/em&gt; and X said &lt;em&gt;blah&lt;/em&gt; and that was about all that was important.” It’s called summarising, and if we didn’t do it, 75% of our lives would just be&amp;nbsp;talking.&lt;/p&gt;
&lt;p&gt;How do computers summarise and attempt to convey only the important parts of all the information we store and transmit? This and more in the next issue on lossy&amp;nbsp;compression!&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;li&gt;OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is compression? [Issue&amp;nbsp;43]&lt;/li&gt;
&lt;li&gt;Why are music files so large when a voice call over internet uses so little data? [Issue&amp;nbsp;45]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category></entry><entry><title>Issue 44: Image resolution</title><link href="https://ngjunsiang.github.io/laymansguide/issue044.html" rel="alternate"></link><published>2019-10-26T08:00:00+08:00</published><updated>2019-10-26T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-10-26:/laymansguide/issue044.html</id><summary type="html">&lt;p&gt;An image’s resolution describes its dimensions. Its pixel resolution gives an indication of its physical size (if printed or displayed on a screen), and thus its sharpness. A display with imperceptibly small pixels is often referred to as a Retina display (Apple’s branding) or as a high-&lt;span class="caps"&gt;PPI&lt;/span&gt; display; this requires at least 220 &lt;span class="caps"&gt;PPI&lt;/span&gt; nominally. For an image to be printed sharply, it needs at least 300 &lt;span class="caps"&gt;DPI&lt;/span&gt;.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Colour is stored as a combination of red, green, and blue. In a computer system, each
colour is stored as one byte (8 bits), allowing for 256 different levels. An image is made up of many such pixels of&amp;nbsp;colour.&lt;/p&gt;
&lt;p&gt;An image is two-dimensional and certainly much larger than a single pixel. How do we talk about its&amp;nbsp;size?&lt;/p&gt;
&lt;h2&gt;Image&amp;nbsp;resolution&lt;/h2&gt;
&lt;p&gt;It is common to hear people refer to an image’s size as its pixel size. When we say that an image has a resolution of 1000×3000 pixels, that means it is 1000 pixels wide by 3000 pixels high. In other words, the image is made up of 3 million pixels of colour, arranged in a grid 1000 pixels wide by 3000 pixels&amp;nbsp;tall.&lt;/p&gt;
&lt;p&gt;But how large is this image &lt;em&gt;physically&lt;/em&gt;? Well, that’s a harder question to answer&amp;nbsp;…&lt;/p&gt;
&lt;h2&gt;Resizing&lt;/h2&gt;
&lt;p&gt;You see, on a computer, you can resize an image as you like. I’m sure you have done it many times, preparing for a presentation or just creating a document. So you can make that image 1cm×3cm, or 10cm×30cm. But how large is an image originally meant to&amp;nbsp;be?&lt;/p&gt;
&lt;h2&gt;Image resolution: a ratio between dots and&amp;nbsp;inches&lt;/h2&gt;
&lt;p&gt;In more finicky circles, the term “resolution” is used in another way: to refer to the ratio of pixels to a physical dimension, usually in inches (this is a legacy thing, I can’t explain why it’s imperial and not&amp;nbsp;metric).&lt;/p&gt;
&lt;p&gt;For example, if that 1000×3000 image was meant to be displayed as a 10cm×30cm image on screen (approx. 4 inches by 12 inches), it would have a resolution of 250 pixels per inch (&lt;strong&gt;&lt;span class="caps"&gt;PPI&lt;/span&gt;&lt;/strong&gt;): 1000 pixels ÷ 4 inches. If you could see individual pixels, and you took out a ruler to count the number of pixels in a 1-inch line across or down the image, there would be&amp;nbsp;250.&lt;/p&gt;
&lt;p&gt;If it was displayed as a 100cm×300cm image instead, it would have a resolution of 25 pixels per inch (1000 pixels ÷ 40 inches). And it would look 10 times blurrier; each image pixel would be about 1mm&amp;nbsp;wide!&lt;/p&gt;
&lt;p&gt;So image resolution, as pixels per inch, also gives a measure of sharpness of the&amp;nbsp;image.&lt;/p&gt;
&lt;p&gt;For printed images, the same idea applies: a 1000×3000 image printed as a 10cm×30cm image has a resolution of 250 dots per inch (&lt;strong&gt;&lt;span class="caps"&gt;DPI&lt;/span&gt;&lt;/strong&gt;) — 1000 ÷ 4 inches. It’s dots instead of pixels because a printer lays down dots of colour rather than displaying pixels (I’ll go into more detail in a future season on computer accessories and&amp;nbsp;peripherals).&lt;/p&gt;
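&lt;p&gt;The same division works for any image and any size. Here is a minimal sketch (the function name is mine) that takes the size in centimetres and converts to inches exactly, at 2.54 cm per inch, so the answers come out slightly higher than the rounded figures&amp;nbsp;above.&lt;/p&gt;

```python
CM_PER_INCH = 2.54

def pixels_per_inch(pixels, size_cm):
    """Pixels (or printed dots) per inch along one side of an image."""
    return pixels / (size_cm / CM_PER_INCH)

sharp = pixels_per_inch(1000, 10)     # about 254
blurry = pixels_per_inch(1000, 100)   # about 25.4
```

&lt;p&gt;Ten times the physical size, one tenth the pixels per&amp;nbsp;inch.&lt;/p&gt;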
&lt;h2&gt;Monitor&amp;nbsp;resolution&lt;/h2&gt;
&lt;p&gt;When you buy or browse computer monitors, you would have heard the monitor’s pixel dimensions (number of pixels across and down) referred to as its resolution. Its measure of sharpness is usually listed under a label like &lt;span class="caps"&gt;DPI&lt;/span&gt;, &lt;span class="caps"&gt;PPI&lt;/span&gt;, or pixel density. If it isn’t listed, you can calculate the &lt;span class="caps"&gt;PPI&lt;/span&gt; of a monitor yourself: just take the horizontal pixel dimension (number of pixels in the screen horizontally) and divide it by the display width in inches, or take the vertical pixel dimension and divide by the height in&amp;nbsp;inches.&lt;/p&gt;
&lt;p&gt;Your &lt;span class="caps"&gt;OS&lt;/span&gt; might have a setting for fixing blurry apps, or making small text appear larger. These are typical problems faced on a high-&lt;span class="caps"&gt;PPI&lt;/span&gt; screen. But how high does the &lt;span class="caps"&gt;PPI&lt;/span&gt; need to be for us to get a reasonably sharp&amp;nbsp;image?&lt;/p&gt;
&lt;h2&gt;Retina: a brand name for high pixel density&amp;nbsp;displays&lt;/h2&gt;
&lt;p&gt;In 2010, the late Steve Jobs first used the term Retina referring to the iPhone 4. I suppose he meant to describe a class of devices with a display so sharp that the pixels were practically imperceptible; it wasn’t that long ago that if you squinted a little, you could make out the pixels on your monitor or laptop. High pixel density displays are a lot more common today, so you would probably have to visit the budget section of the computer monitor department in a store to see the low-pixel-density effect&amp;nbsp;again.&lt;/p&gt;
&lt;p&gt;So what’s the minimum &lt;span class="caps"&gt;PPI&lt;/span&gt; required to have a Retina display? Apple doesn’t specifically designate a number, but it appears that &lt;a href="https://en.wikipedia.org/wiki/Retina_display"&gt;the minimum &lt;span class="caps"&gt;PPI&lt;/span&gt; of their Retina devices is 218&lt;/a&gt;. Devices that will be farther from your eyes can get away with about 220 &lt;span class="caps"&gt;PPI&lt;/span&gt;, while those held closer to your eyes will need a higher &lt;span class="caps"&gt;PPI&lt;/span&gt; (up to about 400 on the iPhone 6&amp;nbsp;Plus).&lt;/p&gt;
&lt;p&gt;But all of that is useless if you scale up an image and still view it at a low &lt;em&gt;image &lt;span class="caps"&gt;PPI&lt;/span&gt;&lt;/em&gt;!&lt;/p&gt;
&lt;h2&gt;Why do my printed images come out&amp;nbsp;blurry?&lt;/h2&gt;
&lt;p&gt;Here’s a problem I think some of you might have encountered: You are editing a picture on your laptop or computer monitor, and it looks just fine. You send it to the printer and it comes out really blurry. What&amp;nbsp;happened?&lt;/p&gt;
&lt;p&gt;What happened is that the image was presented in two different ways. On a screen, it appears as a grid of pixels. A 14&amp;#8221; laptop with a 1920×1080 screen resolution actually only has a screen &lt;span class="caps"&gt;PPI&lt;/span&gt; of 157. An image at 100% zoom (1 image pixel displayed as 1 screen pixel) on such a screen would appear fine, because it would be displayed alongside other screen elements (such as the application window) that appear&amp;nbsp;sharp.&lt;/p&gt;
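&lt;p&gt;Screen sizes are quoted along the diagonal, so the quickest way to check a figure like 157 is to work out how many pixels lie along the diagonal (good old Pythagoras) and divide by the diagonal length in inches. A small sketch, with a function name of my own&amp;nbsp;choosing:&lt;/p&gt;

```python
import math

def screen_ppi(width_px, height_px, diagonal_inches):
    """Pixels per inch of a screen, measured along its diagonal."""
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_px / diagonal_inches

laptop = round(screen_ppi(1920, 1080, 14))   # 157
```

&lt;p&gt;Try it with your own monitor’s numbers; the result should agree with the width-based calculation as long as the pixels are&amp;nbsp;square.&lt;/p&gt;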
&lt;p&gt;But once it is printed, it appears as a collection of ink dots on paper. These dots are a lot finer than the pixels on a screen, so any blurriness is immediately apparent. Your computer or laptop screen is a poor device for assessing print sharpness! To get a better sense of print sharpness, you will want to view the image on a high-&lt;span class="caps"&gt;PPI&lt;/span&gt; display (such as an iPad) and adjust the zoom such that the image on screen is the same size as it will be when&amp;nbsp;printed.&lt;/p&gt;
&lt;p&gt;For printing images, you will want to make sure your image has a resolution of at least 300 &lt;span class="caps"&gt;DPI&lt;/span&gt;; 600 &lt;span class="caps"&gt;DPI&lt;/span&gt; or more is ideal. You can calculate this by taking the horizontal pixel dimension of the image and dividing it by the horizontal size (in inches) you intend to print it&amp;nbsp;at.&lt;/p&gt;
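&lt;p&gt;You can also run the calculation the other way round: start from the print size you want and work out how many pixels the image needs. Here is a sketch under the 300 &lt;span class="caps"&gt;DPI&lt;/span&gt; rule of thumb; the function name and the 100cm×300cm banner are just illustrative&amp;nbsp;assumptions.&lt;/p&gt;

```python
import math

CM_PER_INCH = 2.54

def pixels_needed(size_cm, dpi=300):
    """Pixels required along one side to print size_cm at the given DPI."""
    return math.ceil(size_cm / CM_PER_INCH * dpi)

a4_width = pixels_needed(21)                          # 2481
banner = (pixels_needed(100), pixels_needed(300))     # (11812, 35434)
```

&lt;p&gt;That banner works out to over 400 million pixels, which is far more than most cameras or stock images will give&amp;nbsp;you.&lt;/p&gt;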
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; An image’s resolution describes its dimensions. Its pixel resolution gives an indication of its physical size (if printed or displayed on a screen), and thus its sharpness. A display with imperceptibly small pixels is often referred to as a Retina display (Apple’s branding) or as a high-&lt;span class="caps"&gt;PPI&lt;/span&gt; display; this requires at least 220 &lt;span class="caps"&gt;PPI&lt;/span&gt; nominally. For an image to be printed sharply, it needs at least 300 &lt;span class="caps"&gt;DPI&lt;/span&gt;.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;It took a lot of discipline this time to not burrow down rabbit holes (like image-to-screen pixel grid alignment); that would have taken a lot longer than an hour to&amp;nbsp;write.&lt;/p&gt;
&lt;p&gt;Pixels and dots are an abstraction that anyone working with computers has to think in terms of, and the relationship between them and physical size can be really tricky to articulate clearly. I hope in this issue I have at least introduced you, my dear readers, to &lt;span class="caps"&gt;PPI&lt;/span&gt; and &lt;span class="caps"&gt;DPI&lt;/span&gt;. And if you work with printers, I think knowing what is going on is a big relief, and takes away the stress from guesswork. Many times I have saved myself the stress of trying to get a sharp banner printed by doing the &lt;span class="caps"&gt;DPI&lt;/span&gt; calculations and realising that there is no way that is possible; I would need too large an&amp;nbsp;image!&lt;/p&gt;
&lt;p&gt;Okay, I think we’re done with basic colour and pixel theory! Next up, basic sound theory, and then we can move on to compression&amp;nbsp;:)&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Audio, a sampling of&amp;nbsp;values&lt;/p&gt;
&lt;p&gt;Sound is so easily taken for granted, but how exactly is it represented in the computer, and how much information is required to store sound? Stay&amp;nbsp;tuned.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application? [Issue&amp;nbsp;26]&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;What is &lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;li&gt;What is OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is compression? [Issue&amp;nbsp;43]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category></entry><entry><title>Issue 43: Images, a mosaic of 3 colours</title><link href="https://ngjunsiang.github.io/laymansguide/issue043.html" rel="alternate"></link><published>2019-10-19T08:00:00+08:00</published><updated>2019-10-19T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-10-19:/laymansguide/issue043.html</id><summary type="html">&lt;p&gt;Colour is stored as a combination of red, green, and blue. In a computer system,&amp;nbsp;each&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Unicode is an encoding format which is meant to support every language, ever. Most websites, apps, and interfaces support it&amp;nbsp;today.&lt;/p&gt;
&lt;p&gt;In the last two issues, I explained how text is stored as numbers through the use of lookup tables, whether &lt;span class="caps"&gt;ASCII&lt;/span&gt; or Unicode. The more total characters we want to store in the lookup table, the more bits we need for each&amp;nbsp;character.&lt;/p&gt;
&lt;p&gt;This is going to be a recurring theme: If we want to be able to differentiate more shades of colour, or more degrees of loudness in sound, we will need more and more bits for each &lt;strong&gt;sample&lt;/strong&gt;, and that means our file—whether text, image, or sound—is going to have a larger&amp;nbsp;filesize.&lt;/p&gt;
&lt;p&gt;How many bits is good enough? In the case of text, that is determined largely by the upper limit on the number of symbols we might possibly need to communicate. But how do we decide that for colour? The number of different shades of colours is possibly infinite, and yet we can’t possibly differentiate between really fine shades, nor can our screens possibly produce all of them&amp;nbsp;…&lt;/p&gt;
&lt;p&gt;In this issue, I’ll be summarising and oversimplifying decades of colour theory and colour vision research. Buckle&amp;nbsp;up!&lt;/p&gt;
&lt;h2&gt;The human&amp;nbsp;eye&lt;/h2&gt;
&lt;p&gt;Any effective colour system must take into account how the human eye is structured, and how vision occurs. Today, we understand that humans are trichromatic: there are 3 types of cone cells in the eye (and also 1 type of rod cell, which I won’t be explaining here), and each one recognises a different shade of colour: red, green, blue. Each type of cone cell can differentiate roughly 100 different shades, which theoretically enables us to distinguish 1 million shades of colour&amp;nbsp;(100^3).&lt;/p&gt;
&lt;p&gt;So it makes good sense that our colour systems in computers evolved similarly, to store single dots of colour as a combination of red, green, and blue. To be able to store 100 different shades, we will need at least 7 bits (2^7 = 128), but &lt;a href="https://ngjunsiang.github.io/laymansguide/issue040.html"&gt;computer systems like things in 8s&lt;/a&gt;. For this and other historical reasons, 1 byte (8 bits) is used for each shade, giving us 256 shades each of red, green, and blue. That’s over 16 million (256^3) shades of&amp;nbsp;colour!&lt;/p&gt;
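&lt;p&gt;If you would like to check those numbers yourself, here is a tiny sketch. The function name is my own; it simply asks how many bits are needed before there are enough combinations to give each level its own&amp;nbsp;number.&lt;/p&gt;

```python
def bits_needed(levels):
    """Smallest number of bits that can tell this many levels apart."""
    return (levels - 1).bit_length()

for_the_eye = bits_needed(100)        # 7
for_one_byte = bits_needed(256)       # 8

eye_colours = 100 ** 3                # 1000000
screen_colours = 256 ** 3             # 16777216
```

&lt;p&gt;So a byte per colour comfortably covers what each type of cone cell can tell apart, with room to&amp;nbsp;spare.&lt;/p&gt;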
&lt;h2&gt;Colour&amp;nbsp;encoding&lt;/h2&gt;
&lt;p&gt;Since one byte stores one colour value, three bytes are needed for a single spot of colour combining red, green, and blue, a combination commonly called &lt;strong&gt;&lt;span class="caps"&gt;RGB&lt;/span&gt;&lt;/strong&gt;. In a computer, each byte represents the level of that colour; 0 means minimum level (none of that colour) while 255 means maximum level (complete saturation of that colour). So any of those 16 million colours can be stored as a number triplet, representing the red, green, and blue values&amp;nbsp;respectively.&lt;/p&gt;
&lt;p&gt;(0,0,0) is black
(255,255,255) is&amp;nbsp;white&lt;/p&gt;
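&lt;p&gt;As a rough illustration of these triplets, the Python sketch below packs a colour into 3 bytes and also writes it in the hexadecimal form that many colour pickers display. The helper names and the orange value are my own examples, not part of any particular application.&lt;/p&gt;

```python
def rgb_to_bytes(red, green, blue):
    """Pack three 0-255 levels into the 3 bytes that store one colour."""
    return bytes([red, green, blue])


def rgb_to_hex(red, green, blue):
    """Write the same triplet the way many colour pickers show it."""
    return "#{:02X}{:02X}{:02X}".format(red, green, blue)


black = (0, 0, 0)
white = (255, 255, 255)
orange = (255, 165, 0)            # illustrative value, not from the text

print(len(rgb_to_bytes(*orange)))  # 3 bytes per colour
print(rgb_to_hex(*black))          # #000000
print(rgb_to_hex(*white))          # #FFFFFF
print(rgb_to_hex(*orange))         # #FFA500
```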
&lt;p&gt;So now you know what to do with colour pickers in applications: just find the combination of red, green, and blue that is closest to the colour you&amp;nbsp;want!&lt;/p&gt;
&lt;p&gt;&lt;img alt="The Microsoft Paint colour picker" src="https://ngjunsiang.github.io/laymansguide/issue043_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;A colour picker, common in graphics applications. This one is from Microsoft&amp;nbsp;Paint.&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;You can play with a simple colour wheel on &lt;a href="https://www.colorspire.com/rgb-color-wheel/"&gt;colorspire.com&lt;/a&gt;, or if you’re feeling more adventurous, try the more technical one on &lt;a href="https://www.rapidtables.com/web/color/RGB_Color.html"&gt;rapidtables.com&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Colour&amp;nbsp;production&lt;/h2&gt;
&lt;p&gt;On a screen, colours are produced by millions of liquid crystals (in LCDs) or light-emitting diodes (in &lt;span class="caps"&gt;LED&lt;/span&gt; displays). These are arranged in a rectangular grid pattern, and each one is known as a &lt;strong&gt;pixel&lt;/strong&gt; (shortened from &lt;em&gt;picture element&lt;/em&gt;). Each pixel is capable of producing 256 shades each of red, green, and&amp;nbsp;blue.&lt;/p&gt;
&lt;p&gt;It is extremely difficult to manufacture pixels that can produce any colour; this would require that the crystal or diode can emit light of different frequencies. Instead, the display industry has settled on combining 3 sub-pixels into a pixel. Each sub-pixel produces—you guessed it—either red, green, or blue&amp;nbsp;light.&lt;/p&gt;
&lt;p&gt;&lt;img alt="Close-up of LCD/LED pixels from various displays" src="https://ngjunsiang.github.io/laymansguide/issue043_02.png" /&gt;&lt;br /&gt;
&lt;small&gt;Extreme close-up shots of pixels.&lt;br /&gt;
Taken from &lt;a href="http://lcdtech.info/en/tests/lcd.pixels.structure.htm"&gt;lcdtech.info&lt;/a&gt;.&lt;/small&gt;&lt;/p&gt;
&lt;p&gt;When colour information is sent from the computer to the display (through the video cable) and decoded in the display, it also uses &lt;span class="caps"&gt;RGB&lt;/span&gt; values. In this sense there is remarkable consistency in computer systems in how colour is stored, sent, and displayed. That minimises the amount of time spent by computers converting from one format to&amp;nbsp;another.&lt;/p&gt;
&lt;h2&gt;Colour&amp;nbsp;storage&lt;/h2&gt;
&lt;p&gt;In a computer, combinations of &lt;strong&gt;image pixels&lt;/strong&gt; are stored as image files, but you already know that. I’m on the verge of exceeding my one-idea-per-week promise, so I’ll end this issue with a short comparison of common image formats. Each image format is labelled below by its file extension, the part of the filename that comes at the&amp;nbsp;end.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span class="caps"&gt;BMP&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;
&lt;span class="caps"&gt;BMP&lt;/span&gt; is short for “bitmap”. The bitmap format commonly encountered in computer systems stores pixels uncompressed. This means that each pixel requires 3 bytes of space, so a full-screen image on a typical modern laptop (1920 pixels horizontally, 1080 pixels vertically) would require about 6 &lt;span class="caps"&gt;MB&lt;/span&gt;&amp;nbsp;(1920×1080×3)!&lt;/p&gt;
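&lt;p&gt;The filesize estimate can be reproduced in a few lines of Python. This is only a sketch of the pixel data: it ignores the small header a real &lt;span class="caps"&gt;BMP&lt;/span&gt; file also carries, and the function name is hypothetical.&lt;/p&gt;

```python
def uncompressed_size_bytes(width, height, bytes_per_pixel=3):
    """Bytes needed to store every pixel of an image without compression."""
    return width * height * bytes_per_pixel


size = uncompressed_size_bytes(1920, 1080)
print(size)               # 6220800 bytes
print(size / 1_000_000)   # 6.2208, i.e. about 6 MB
```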
&lt;p&gt;&lt;strong&gt;&lt;span class="caps"&gt;GIF&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;
&lt;span class="caps"&gt;GIF&lt;/span&gt; (Graphics Interchange Format) is one of the earliest image formats, and is rather more restricted in its capabilities as a result. Each &lt;span class="caps"&gt;GIF&lt;/span&gt; pixel is only 8 bits, so a &lt;span class="caps"&gt;GIF&lt;/span&gt; image is limited to using only 256 colours. One of those colours can be “transparent”, allowing &lt;span class="caps"&gt;GIF&lt;/span&gt; to produce images with transparent&amp;nbsp;parts.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span class="caps"&gt;JPEG&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;
&lt;span class="caps"&gt;JPEG&lt;/span&gt; stands for Joint Photographic Experts Group, so it wouldn’t surprise you to learn that it was designed to store photographs with as small a filesize as possible. Today, it is in use for a variety of image types. &lt;span class="caps"&gt;JPEG&lt;/span&gt; can store pixels in 24 bits (i.e. 8 bits each for red, green, and blue), but does not store them uncompressed like &lt;span class="caps"&gt;BMP&lt;/span&gt;. Instead, it applies compression to reduce the filesize by “discarding information” from the image in a way that usually does not affect the final image&amp;nbsp;visibly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;&lt;span class="caps"&gt;PNG&lt;/span&gt;&lt;/strong&gt;&lt;br /&gt;
&lt;span class="caps"&gt;PNG&lt;/span&gt; (Portable Network Graphics) was designed as a replacement for &lt;span class="caps"&gt;GIF&lt;/span&gt;. It supports 24-bit image pixels, with an additional 8 bits per pixel for transparency information. That means &lt;span class="caps"&gt;PNG&lt;/span&gt; pixels have 256 different levels of transparency, allowing for blending effects where one image overlaps another. &lt;span class="caps"&gt;PNG&lt;/span&gt; files support image compression, allowing them to be stored with smaller filesizes than &lt;span class="caps"&gt;BMP&lt;/span&gt;.&lt;/p&gt;
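&lt;p&gt;To give a feel for what 256 levels of transparency allow, here is a simplified Python sketch of blending one pixel over another. It assumes the common “weighted average” way of mixing a see-through foreground with a solid background; the function names are made up, and real image software may do this with more care.&lt;/p&gt;

```python
def blend_channel(foreground, background, alpha):
    """Mix one 0-255 level; alpha 0 is fully transparent, 255 fully opaque."""
    return round((foreground * alpha + background * (255 - alpha)) / 255)


def blend_pixel(fg, bg, alpha):
    """Blend a foreground RGB triplet over a solid background triplet."""
    return tuple(blend_channel(f, b, alpha) for f, b in zip(fg, bg))


red = (255, 0, 0)
blue = (0, 0, 255)
print(blend_pixel(red, blue, 255))   # (255, 0, 0): fully opaque red
print(blend_pixel(red, blue, 0))     # (0, 0, 255): background shows through
print(blend_pixel(red, blue, 128))   # (128, 0, 127): roughly half and half
```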
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Colour is stored as a combination of red, green, and blue. In a computer system, each of these three values is stored as one byte (8 bits), allowing for 256 different levels, and together they describe one pixel. An image is made up of many such pixels of&amp;nbsp;colour.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;I get carried away easily explaining colour, and it took incredible discipline to rein that exploratory instinct in and stick to the most essential parts. There’s so much to go into, even for laypeople! But, I know, one idea a week, and I’ve sort of worked out where the other ideas should go, so we’ll have a nice and gradual introduction to colour over the course of several&amp;nbsp;seasons.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Image&amp;nbsp;resolution&lt;/p&gt;
&lt;p&gt;&lt;img alt="Meme: One does not simply resize an image" src="https://ngjunsiang.github.io/laymansguide/issue043_03.jpg" /&gt;&lt;/p&gt;
&lt;p&gt;After examining a single pixel, I’ll look at a whole image: what does it take to trick our brains into seeing an image instead of a collection of&amp;nbsp;pixels?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application [Issue&amp;nbsp;26]?&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;What is &lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;li&gt;What is OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;li&gt;What is compression? [Issue&amp;nbsp;43]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category></entry><entry><title>Issue 42: Unicode, computers go international</title><link href="https://ngjunsiang.github.io/laymansguide/issue042.html" rel="alternate"></link><published>2019-10-15T22:22:00+08:00</published><updated>2019-10-15T22:22:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-10-15:/laymansguide/issue042.html</id><summary type="html">&lt;p&gt;Unicode is an encoding format which is meant to support every language, ever. Most websites, apps, and interfaces support it&amp;nbsp;today.&lt;/p&gt;</summary><content type="html">
&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; In &lt;span class="caps"&gt;ASCII&lt;/span&gt; encoding, text is stored as a 7-bit sequence. Text consists of letters, numbers, symbols, and control codes. Control codes instruct the computer how to format the text so that it looks the way we&amp;nbsp;intended.&lt;/p&gt;
&lt;p&gt;Last issue, I explained what &lt;span class="caps"&gt;ASCII&lt;/span&gt; is and what it does: it allows us to &lt;strong&gt;encode&lt;/strong&gt; letters, numbers, symbols, and control codes into bits (0s and 1s) to be sent digitally to another computer, where they can be &lt;strong&gt;decoded&lt;/strong&gt;&amp;nbsp;again.&lt;/p&gt;
&lt;p&gt;That still does not explain accented characters (such as á), umlauts (like ö), and emojis. Where are those represented in &lt;span class="caps"&gt;ASCII&lt;/span&gt;? And what about glyphs (symbols) used in Greek, Cyrillic, Chinese, Japanese, and other&amp;nbsp;languages?&lt;/p&gt;
&lt;h2&gt;Again, some&amp;nbsp;history&lt;/h2&gt;
&lt;p&gt;In short, other countries and cultures were not happy with &lt;span class="caps"&gt;ASCII&lt;/span&gt;. It did not allow them to communicate effectively in their own&amp;nbsp;languages.&lt;/p&gt;
&lt;p&gt;The first thing that happened was that the European Computer Manufacturers Association (&lt;span class="caps"&gt;ECMA&lt;/span&gt;) extended &lt;span class="caps"&gt;US&lt;/span&gt;-&lt;span class="caps"&gt;ASCII&lt;/span&gt; into &lt;span class="caps"&gt;ISO&lt;/span&gt; 8859-1. In &lt;span class="caps"&gt;ISO&lt;/span&gt; 8859-1, each character is represented by 8 bits. Let’s look at some&amp;nbsp;numbers:&lt;/p&gt;
&lt;p&gt;Characters needed minimally (lower- + upper-case, and numerals): 26+26+10 = 62&lt;br /&gt;
Common symbols: 30&lt;br /&gt;
7 bits can encode 2^7 = 128 different characters&lt;br /&gt;
8 bits can encode 2^8 = 256 different&amp;nbsp;characters&lt;/p&gt;
&lt;p&gt;8 bits was enough to provide for a number of additional glyphs seen below. But very quickly it ran into limitations as well. 256 characters just aren’t&amp;nbsp;enough!&lt;/p&gt;
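&lt;p&gt;Here is a small Python sketch of what the extra bit buys. It uses Python’s built-in &lt;code&gt;latin-1&lt;/code&gt; codec, which is its name for &lt;span class="caps"&gt;ISO&lt;/span&gt; 8859-1; the sample word is my own choice.&lt;/p&gt;

```python
text = "caf\u00e9"                 # the word "café", with an accented e

latin1_bytes = text.encode("latin-1")
print(list(latin1_bytes))          # [99, 97, 102, 233]: 233 needs the 8th bit

try:
    text.encode("ascii")           # 7-bit ASCII has no slot for the accent
except UnicodeEncodeError:
    ascii_ok = False
else:
    ascii_ok = True
print(ascii_ok)                    # False
```

&lt;p&gt;Every value up to 127 is shared with &lt;span class="caps"&gt;ASCII&lt;/span&gt;; the accented letter lands in the upper half of the table, which only exists with 8 bits.&lt;/p&gt;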
&lt;p&gt;&lt;img alt="ISO 8859-1" src="https://upload.wikimedia.org/wikipedia/commons/thumb/a/ac/Latin-1-infobox.svg/800px-Latin-1-infobox.svg.png" /&gt;&lt;br /&gt;
&lt;em&gt;The &lt;span class="caps"&gt;ISO&lt;/span&gt; 8859-1&amp;nbsp;characters&lt;/em&gt;    &lt;/p&gt;
&lt;h2&gt;Encoding&amp;nbsp;hell&lt;/h2&gt;
&lt;p&gt;Computer systems in these other countries soon came up with their own ways of representing the huge number of glyphs they needed. There were other &lt;span class="caps"&gt;ISO&lt;/span&gt; 8859-* encoding systems which I do not want to list. The Chinese had &lt;span class="caps"&gt;GB&lt;/span&gt; encoding on the mainland, Big5 in Taiwan, and numerous extensions on that. The Japanese used Shift-&lt;span class="caps"&gt;JIS&lt;/span&gt;.&lt;/p&gt;
&lt;p&gt;It was encoding&amp;nbsp;hell.&lt;/p&gt;
&lt;p&gt;If you remember the internet circa the ’90s and early ’00s, you may recall pages full of what looked like gibberish. Because webpages then did not include information about their encoding, most of the time web browsers simply had to guess. If your page wasn’t encoded in &lt;span class="caps"&gt;ASCII&lt;/span&gt; (or &lt;span class="caps"&gt;ISO&lt;/span&gt; 8859-1), it was anybody’s guess what encoding you were using. You just tried each encoding until you got a page that made&amp;nbsp;sense!&lt;/p&gt;
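&lt;p&gt;The gibberish is easy to recreate. In this Python sketch (the sample word is my own), the same bytes are decoded once with the table they were written in and once with the wrong one.&lt;/p&gt;

```python
original = "\u65e5\u672c\u8a9e"          # the Japanese word for "Japanese language"
raw = original.encode("shift_jis")       # the bytes a Shift-JIS page would contain

right_guess = raw.decode("shift_jis")    # decoded with the matching table
wrong_guess = raw.decode("latin-1")      # decoded with the ISO 8859-1 table

print(len(raw))                  # 6 bytes: two per character
print(right_guess == original)   # True
print(wrong_guess == original)   # False: same bytes, different characters
```

&lt;p&gt;Nothing is damaged in the file itself; the bytes are identical in both cases. Only the lookup table used to read them differs.&lt;/p&gt;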
&lt;p&gt;That simply would not&amp;nbsp;do.&lt;/p&gt;
&lt;h2&gt;Origins of&amp;nbsp;Unicode&lt;/h2&gt;
&lt;p&gt;In 1988, a bunch of engineers from Xerox and Apple started thinking about a universal encoding that can encompass all languages. The first volume of this encoding was published in 1991, with extensions added&amp;nbsp;subsequently.&lt;/p&gt;
&lt;p&gt;At that point, a Unicode character was represented using 16 bits (for a possible 65,536 characters!). In 1996, a method of extending the Unicode scheme was added, so that Unicode could easily represent over a million different&amp;nbsp;characters!&lt;/p&gt;
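&lt;p&gt;For the curious, here is a Python sketch of where “over a million” comes from. I am assuming the extension referred to above is the surrogate-pair mechanism used by &lt;span class="caps"&gt;UTF&lt;/span&gt;-16, in which a character outside the original 16-bit range is written as two 16-bit units; the function name is made up.&lt;/p&gt;

```python
BASIC_RANGE = 2 ** 16        # 65536 values fit in 16 bits
PLANES = 17                  # Unicode today defines 17 ranges of that size
print(BASIC_RANGE * PLANES)  # 1114112 possible code points


def surrogate_pair(code_point):
    """Split a code point beyond 16 bits into two 16-bit units (UTF-16)."""
    offset = code_point - 0x10000
    high = 0xD800 + offset // 0x400
    low = 0xDC00 + offset % 0x400
    return high, low


grinning_face = 0x1F600      # an emoji outside the 16-bit range
high, low = surrogate_pair(grinning_face)
print(hex(high), hex(low))   # 0xd83d 0xde00
```

&lt;p&gt;Python’s own &lt;code&gt;utf-16-be&lt;/code&gt; encoder produces the same two units for this emoji, which is a handy way to check the arithmetic.&lt;/p&gt;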
&lt;h2&gt;Unicode&amp;nbsp;today&lt;/h2&gt;
&lt;p&gt;Today, the global significance of the internet has resulted in Unicode being the standard encoding on any interface a user interacts&amp;nbsp;with.&lt;/p&gt;
&lt;p&gt;If something you try to submit in a form (such as your name) or view on a page does not display properly, chances are the service you are interacting with has not updated itself with proper Unicode support yet. Write a support request to them and ask for it to be&amp;nbsp;done!&lt;/p&gt;
&lt;p&gt;One big reason for the increased support in Unicode is the space that was set aside for emoji … more evidence that war may drive the development of technology, but it is social factors that lead to its widespread adoption&amp;nbsp;:)&lt;/p&gt;
&lt;h2&gt;Cool things about&amp;nbsp;Unicode&lt;/h2&gt;
&lt;p&gt;Aside from the fact that it could include encodings for just about any character in any language, here are some things about Unicode which &lt;em&gt;may not be entirely relevant&lt;/em&gt; for the layperson, but I think are good to know. Feel free to skip this&amp;nbsp;section.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Unicode is able to “craft” characters by combining multiple&amp;nbsp;glyphs.
&lt;ul&gt;
&lt;li&gt;For instance, a&amp;#773; is not represented with a single character, but can be printed through combining the &amp;#8216;a&amp;#8217; glyph with the ◌̅ (Combining Overline)&amp;nbsp;glyph.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Unicode has an area set aside for alternate character&amp;nbsp;representations.
&lt;ul&gt;
&lt;li&gt;For instance, “fl” is sometimes stylistically combined into an “ﬂ” ligature; there is room for this ligature in Unicode. (Try to select the ‘fl’s above if you can’t see the&amp;nbsp;difference.)&lt;/li&gt;
&lt;li&gt;Some high-quality fonts provide such alternate glyph representations, and with the right software (such as Adobe InDesign) you can make use of&amp;nbsp;them.&lt;/li&gt;
&lt;li&gt;Some languages (e.g. Arabic) actually require ligatures for combining adjacent glyphs, so this is a pretty big&amp;nbsp;deal.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;With the right font type (i.e. &lt;a href="https://en.wikipedia.org/wiki/OpenType"&gt;OpenType&lt;/a&gt;), you can actually include programmatic features through&amp;nbsp;Unicode.
&lt;ul&gt;
&lt;li&gt;&lt;a href="https://typographica.org/typeface-reviews/chartwell/"&gt;&lt;span class="caps"&gt;FF&lt;/span&gt; Chartwell&lt;/a&gt; is a font for creating mini-charts just by&amp;nbsp;typing!&lt;/li&gt;
&lt;li&gt;The font uses ligatures to turn numbers into a mini&amp;nbsp;chart.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Unicode has a &amp;#8220;Private Use Area&amp;#8221; that you can use for your own private purposes. You can insert symbols from this area for use in a&amp;nbsp;webpage.
&lt;ul&gt;
&lt;li&gt;I have seen websites use this to create custom icons that can scale in size and change colour easily, just like&amp;nbsp;text.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
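&lt;p&gt;A few of the points in this list can be poked at with Python’s standard &lt;code&gt;unicodedata&lt;/code&gt; module. This is only a quick sketch; the variable names are my own.&lt;/p&gt;

```python
import unicodedata

overlined_a = "a\u0305"                    # "a" followed by a combining mark
print(len(overlined_a))                    # 2 code points, drawn as one character
print(unicodedata.name("\u0305"))          # COMBINING OVERLINE

ligature = "\ufb02"                        # the single-character "fl" ligature
print(unicodedata.name(ligature))          # LATIN SMALL LIGATURE FL
plain = unicodedata.normalize("NFKC", ligature)
print(plain, len(plain))                   # fl 2 (two ordinary letters)

private_use = "\ue000"                     # first slot of the Private Use Area
print(unicodedata.category(private_use))   # Co, the "private use" category
```

&lt;p&gt;The normalisation step is how software can treat the ligature and the two plain letters as the same text, for example when searching.&lt;/p&gt;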
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; Unicode is an encoding format which is meant to support every language, ever. Most websites, apps, and interfaces support it&amp;nbsp;today.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;That was really short, thank goodness. I’ve of course skipped over Unicode complexity, because the average layperson does not need to know that. But people need to know that it is possible, and actually &lt;em&gt;easy&lt;/em&gt;, to represent different languages on the same page, and there is no excuse not to do&amp;nbsp;so.&lt;/p&gt;
&lt;p&gt;What’s really interesting is that it took 20 years or more for a format like Unicode to be conceptualised, born, and finally reach the mainstream. Many ideas in computing are like that. When you see something really novel hit the market, it has probably been brewing in somebody’s head for over a&amp;nbsp;decade!&lt;/p&gt;
&lt;p&gt;I think we’re as done with text as we need to be. I’ll start going into other types of data in the next issue, starting with colours and&amp;nbsp;images.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Images, a tri-colour&amp;nbsp;mosaic&lt;/p&gt;
&lt;p&gt;Coming up: a highly compressed crash course in psychovisual theory, colour theory, and how an &lt;span class="caps"&gt;LCD&lt;/span&gt; screen works! All condensed into layperson language, of&amp;nbsp;course.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;del&gt;Unicode? And what does it have to do with emoji? [Issue 8]&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;compiling code into an application [Issue&amp;nbsp;26]?&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;What is &lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;li&gt;What is OpenType? And what are fonts anyway? [Issue&amp;nbsp;42]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category></entry><entry><title>Issue 41: ASCII, the typewriter digitised</title><link href="https://ngjunsiang.github.io/laymansguide/issue041.html" rel="alternate"></link><published>2019-10-05T08:00:00+08:00</published><updated>2019-10-05T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-10-05:/laymansguide/issue041.html</id><summary type="html">&lt;p&gt;In computers that can encode and decode &lt;span class="caps"&gt;ASCII&lt;/span&gt;, text is stored as a 7-bit sequence. Text consists of letters, numbers, symbols, and control&amp;nbsp;codes.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; 8 bits comprise 1 byte. Humans count bytes in multiples of thousands, while computers count bytes in multiples of&amp;nbsp;1,024.&lt;/p&gt;
&lt;p&gt;It’s still difficult to wrap our minds around how computers do everything with exactly two symbols: 0 and 1. Let’s start simple: How do computers represent&amp;nbsp;text?&lt;/p&gt;
&lt;p&gt;The simple answer is that text can be represented as numbers. In the simplest scheme we know of, A=1, B=2, C=3, and so on. A computer does something slightly more complicated: it keeps a table of characters and the numbers that represent them, called an &lt;strong&gt;encoding table&lt;/strong&gt;. One encoding table that is commonly used for plain text is the American Standard Code for Information Interchange, or&amp;nbsp;&lt;span class="caps"&gt;ASCII&lt;/span&gt;.&lt;/p&gt;
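&lt;p&gt;You can peek at this table from Python, whose built-in &lt;code&gt;ord&lt;/code&gt; and &lt;code&gt;chr&lt;/code&gt; functions look characters up in both directions. The sample message is my own.&lt;/p&gt;

```python
message = "Hi!"

codes = [ord(character) for character in message]
print(codes)                               # [72, 105, 33]

decoded = "".join(chr(code) for code in codes)
print(decoded)                             # Hi!

# Encoding to ASCII gives the same numbers, one byte per character.
print(list(message.encode("ascii")))       # [72, 105, 33]
```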
&lt;h2&gt;Some &lt;span class="caps"&gt;ASCII&lt;/span&gt; background and&amp;nbsp;history&lt;/h2&gt;
&lt;p&gt;To put things in some context, keep in mind that &lt;span class="caps"&gt;ASCII&lt;/span&gt;, which dates from the 1960s, actually predates the internet! (We had computers way longer than we had the internet, after all.) Before that, Morse code was the standard in telegraph transmission until the 1900s, when the &lt;a href="https://en.wikipedia.org/wiki/Baudot_code#Murray_code"&gt;Murray code&lt;/a&gt; came into use instead (itself derived from the earlier Baudot code). The Murray code employed a keyboard much like a typewriter’s. This was an improvement over Morse code, because instead of tapping a single control key (like you see in classic movies), you could now use &lt;strong&gt;all five fingers&lt;/strong&gt; of the hand to&amp;nbsp;type.&lt;/p&gt;
&lt;p&gt;In the 1920s, the Murray code was developed into the International Telegraph Alphabet No. 2 code (&lt;span class="caps"&gt;ITA2&lt;/span&gt; code).&amp;nbsp;Behold:&lt;/p&gt;
&lt;p&gt;&lt;img alt="ITA2 table" src="https://ngjunsiang.github.io/laymansguide/issue041_01.jpg" /&gt;&lt;br /&gt;
&lt;em&gt;Image from &lt;a href="https://en.wikipedia.org/wiki/File:International_Telegraph_Alphabet_2.jpg"&gt;Wikimedia&amp;nbsp;Commons&lt;/a&gt;&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;But the Murray code actually used more bits to transmit the same information! In Morse code, every letter or numeral is represented with between 1 and 5 symbols, whereas the Murray code and &lt;span class="caps"&gt;ITA2&lt;/span&gt; always use 5. Each Morse symbol is either a dash or a&amp;nbsp;dot:&lt;/p&gt;
&lt;p&gt;&lt;img alt="Morse Code table" src="https://ngjunsiang.github.io/laymansguide/issue041_02.png" /&gt;&lt;br /&gt;
&lt;em&gt;Image from &lt;a href="https://en.wikipedia.org/wiki/File:International_Morse_Code.svg"&gt;Wikimedia&amp;nbsp;Commons&lt;/a&gt;&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;What do we gain from using more symbols to transmit each number or letter? If you compare the two, you see that &lt;span class="caps"&gt;ITA2&lt;/span&gt; has some things that are missing in Morse&amp;nbsp;code:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Spaces&lt;/li&gt;
&lt;li&gt;Carriage&amp;nbsp;return&lt;/li&gt;
&lt;li&gt;Line&amp;nbsp;feed&lt;/li&gt;
&lt;li&gt;Symbols&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Symbols and spaces are easy enough to understand, and very welcome; if you’ve ever tried reading early telegrams (or using Morse code) you’ll appreciate their addition. But what are carriage return and line&amp;nbsp;feed?&lt;/p&gt;
&lt;h2&gt;The&amp;nbsp;typewriter&lt;/h2&gt;
&lt;p&gt;With the advent of the typewriter, people had access to nicely formatted text. You could type text on multiple rows instead of one long row! But you had to remember to perform a specific action at the end of each line for the text to be formatted&amp;nbsp;properly.&lt;/p&gt;
&lt;p&gt;&lt;img alt="A typewriter on a table" src="https://ngjunsiang.github.io/laymansguide/issue041_03.jpg" /&gt;&lt;br /&gt;
&lt;small&gt;The Underwood Five typewriter&lt;br /&gt;
Image from &lt;a href="https://en.wikipedia.org/wiki/File:Underwoodfive.jpg"&gt;Wikimedia Commons&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;
&lt;p&gt;There were two separate actions involved as you pulled the leftmost lever to the right: (1) The carriage, which holds the paper and moves a bit to the right after each letter is typed, now resets its position so you can start typing from the left again, and (2) the paper is moved up so you can begin typing on the next&amp;nbsp;line.&lt;/p&gt;
&lt;p&gt;(1) is called a carriage return, (2) is called a line&amp;nbsp;feed.&lt;/p&gt;
&lt;p&gt;&lt;span class="caps"&gt;ITA2&lt;/span&gt; could not only send letters and symbols, it could send formatting&amp;nbsp;commands!&lt;/p&gt;
&lt;h2&gt;&lt;span class="caps"&gt;ASCII&lt;/span&gt;&amp;nbsp;proper&lt;/h2&gt;
&lt;p&gt;The &lt;span class="caps"&gt;ASCII&lt;/span&gt; code chart expands the capabilities of &lt;span class="caps"&gt;ITA2&lt;/span&gt;, while requiring 7 bits for each character. Each character is situated in a specific row and column, out of 8 columns and 16 rows which are numbered starting from 0. (Note that 8 is 2^3 and requires 3 bits, 16 is 2^4 and requires 4&amp;nbsp;bits.)&lt;/p&gt;
&lt;p&gt;&lt;img alt="ASCII code chart" src="https://ngjunsiang.github.io/laymansguide/issue041_04.png" /&gt;&lt;br /&gt;
&lt;small&gt;(An early version of) The &lt;span class="caps"&gt;US&lt;/span&gt; &lt;span class="caps"&gt;ASCII&lt;/span&gt; code chart. Each row number is represented by 4 bits, while each column number is represented by 3 bits.&lt;br /&gt;
Image from &lt;a href="https://en.wikipedia.org/wiki/File:USASCII_code_chart.png"&gt;Wikimedia Commons&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;
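&lt;p&gt;The row-and-column layout can be checked with a short Python sketch. It assumes the chart is read as “code = column × 16 + row”, which matches the 3-bit column and 4-bit row described above; the function names are hypothetical.&lt;/p&gt;

```python
def chart_position(character):
    """Return (column, row) of a character in the 8-column, 16-row chart."""
    return divmod(ord(character), 16)


def seven_bits(character):
    """The 7-bit pattern: 3 bits of column followed by 4 bits of row."""
    return format(ord(character), "07b")


for character in ["1", "A", "a"]:
    print(character, chart_position(character), seven_bits(character))
# 1 (3, 1) 0110001
# A (4, 1) 1000001
# a (6, 1) 1100001

print(chart_position("\r"))   # (0, 13): carriage return sits in column 0
```

&lt;p&gt;Notice that “A” and “a” share a row and differ only in their column, which is part of what makes the chart tidy.&lt;/p&gt;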
&lt;p&gt;Technical details aside, look at what &lt;span class="caps"&gt;ASCII&lt;/span&gt;&amp;nbsp;has:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Symbols and numbers (mainly columns 2 and 3, but also scattered&amp;nbsp;elsewhere)&lt;/li&gt;
&lt;li&gt;Upper &lt;em&gt;and&lt;/em&gt; lowercase letters (columns 4 to&amp;nbsp;7)&lt;/li&gt;
&lt;li&gt;Lots and lots of control codes! (columns 0 and&amp;nbsp;1)&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What do these control codes&amp;nbsp;mean?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;NUL&lt;/code&gt; stands for null, a placeholder code for when the machine wasn’t&amp;nbsp;transmitting.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;SOH&lt;/code&gt;: start of header, to indicate the portion of the transmission that contained information about the&amp;nbsp;message.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;STX&lt;/code&gt; and &lt;code&gt;ETX&lt;/code&gt;: start of text and end of text, to indicate the message&amp;nbsp;portion.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;EOT&lt;/code&gt;: end of&amp;nbsp;transmission.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;DEL&lt;/code&gt;: to delete the previous character (hello,&amp;nbsp;backspace).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;CR&lt;/code&gt; and &lt;code&gt;LF&lt;/code&gt;: we just met them, carriage return and line&amp;nbsp;feed.&lt;/li&gt;
&lt;/ul&gt;
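&lt;p&gt;For reference, here are the numbers behind the control codes listed above, as a small Python sketch. The dictionary is my own arrangement of the standard &lt;span class="caps"&gt;ASCII&lt;/span&gt; values.&lt;/p&gt;

```python
CONTROL_CODES = {
    "NUL": 0,
    "SOH": 1,
    "STX": 2,
    "ETX": 3,
    "EOT": 4,
    "LF": 10,
    "CR": 13,
    "DEL": 127,
}

for name, code in CONTROL_CODES.items():
    # Show each code as a number and as its 7-bit pattern.
    print(name, code, format(code, "07b"))

# None of these codes draws anything on screen.
print(any(chr(code).isprintable() for code in CONTROL_CODES.values()))   # False
```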
&lt;p&gt;I won’t explain the rest in this supposedly-short newsletter, but if you’re interested the full list is &lt;a href="https://en.wikipedia.org/wiki/ASCII#Control_characters"&gt;on Wikipedia&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;&lt;span class="caps"&gt;ASCII&lt;/span&gt;&amp;nbsp;today&lt;/h2&gt;
&lt;p&gt;In a basic text file, text is still stored using &lt;span class="caps"&gt;ASCII&lt;/span&gt; (although it has seen some modifications since). Some of the control codes are obsolete, while some are still in use today. Remember this image from Issue&amp;nbsp;12?&lt;/p&gt;
&lt;p&gt;&lt;img alt="An HTTP request captured in Wireshark showing my developer API key" src="https://ngjunsiang.github.io/laymansguide/issue012_01.png" /&gt;&lt;br /&gt;
&lt;em&gt;An &lt;span class="caps"&gt;HTTP&lt;/span&gt; request captured in&amp;nbsp;Wireshark.&lt;/em&gt;    &lt;/p&gt;
&lt;p&gt;The &lt;code&gt;\r&lt;/code&gt; and &lt;code&gt;\n&lt;/code&gt; you see there are control codes. They stand for &amp;#8216;return&amp;#8217; and &amp;#8216;newline&amp;#8217;, the modern equivalent of &amp;#8216;carriage return&amp;#8217; and &amp;#8216;line&amp;nbsp;feed&amp;#8217;.&lt;/p&gt;
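&lt;p&gt;To see these two codes at work, here is a Python sketch that builds a tiny request by hand and splits it back into lines. The path and host name are placeholders, not the ones in the screenshot.&lt;/p&gt;

```python
# Each line of the request ends with carriage return + line feed.
request = "GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"

print([ord(character) for character in "\r\n"])   # [13, 10]

lines = request.split("\r\n")
print(lines)
# ['GET /index.html HTTP/1.1', 'Host: example.com', '', '']
```

&lt;p&gt;The empty strings at the end come from the blank line that marks the end of the request’s header section.&lt;/p&gt;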
&lt;p&gt;Formatting codes are alive and well today, and they are more prosperous than ever! Without formatting codes, all our files would be stored only in the same boring format, represented only as letters and numbers and punctuation&amp;nbsp;marks.&lt;/p&gt;
&lt;p&gt;And there you have&amp;nbsp;it.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; In computers that can encode and decode &lt;span class="caps"&gt;ASCII&lt;/span&gt;, text is stored as a 7-bit sequence. Text consists of letters, numbers, symbols, and control&amp;nbsp;codes.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;A rather long issue, but that’s what it takes to explain carriage return and line feed, which don’t make sense to folks who have never seen or used a typewriter before (why can’t you just have a single control code that moves to the start of the line &lt;em&gt;and&lt;/em&gt; moves to the next line? Well, sit down and let me tell you a story&amp;nbsp;…)&lt;/p&gt;
&lt;p&gt;Many of the idiosyncrasies of computers and technology are this way: accumulated from decades of historical developments, forming legacy baggage in some cases, and interesting bits of history in&amp;nbsp;others.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; Unicode, computers go&amp;nbsp;international&lt;/p&gt;
&lt;p&gt;These days, most of the text you encounter is not encoded in &lt;span class="caps"&gt;ASCII&lt;/span&gt;. It is rather limited, after all, and we need a lot more than just letters, numbers, and symbols today. Next issue, we’ll go into modern-day text encoding, using&amp;nbsp;Unicode.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;Unicode? And what does it have to do with emoji? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;del&gt;those &amp;#8216;\r\n’s in the &lt;span class="caps"&gt;HTTP&lt;/span&gt; request packet [Issue 12,17]?&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;&lt;del&gt;&lt;span class="caps"&gt;ASCII&lt;/span&gt;? [Issue 23]&lt;/del&gt;&lt;/li&gt;
&lt;li&gt;compiling code into an application [Issue&amp;nbsp;26]?&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;What is &lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category></entry><entry><title>Issue 40: Bits and bytes</title><link href="https://ngjunsiang.github.io/laymansguide/issue040.html" rel="alternate"></link><published>2019-09-28T08:00:00+08:00</published><updated>2019-09-28T08:00:00+08:00</updated><author><name>J S Ng</name></author><id>tag:ngjunsiang.github.io,2019-09-28:/laymansguide/issue040.html</id><summary type="html">&lt;p&gt;A bit is a unit of measurement for information. 1 bit of information is enough to reduce the uncertainty by 50%. 8 bits comprise 1 byte. Humans count bytes in multiples of thousands, while computers count bytes in multiples of&amp;nbsp;1,024.&lt;/p&gt;</summary><content type="html">&lt;p&gt;&lt;strong&gt;Previously:&lt;/strong&gt; Networks enable data packets to get from one computer in the network to another through gateways that forward the data packets according to fixed rules. These rules are encoded in the various protocols followed by network systems, and all computers on the network agree to follow the same&amp;nbsp;protocol.&lt;/p&gt;
&lt;p&gt;But what kind of data gets transmitted over the network? And why do strange file-related things happen on my computer? I’ll unpack some of these gradually over the course of this 13-issue&amp;nbsp;season.&lt;/p&gt;
&lt;p&gt;Let’s start Season 4 slow, with a simple question: when I buy a &lt;span class="caps"&gt;1TB&lt;/span&gt; hard drive, why does my computer say it has only 930GiB&amp;nbsp;available?&lt;/p&gt;
&lt;h2&gt;A bit: the littlest bit of&amp;nbsp;data&lt;/h2&gt;
&lt;p&gt;You know the game Animal, Plant or Mineral, where you ask yes/no questions to guess what the other person has chosen (from the Animal, Plant, or Mineral category)? Each yes/no question narrows down the range of options until you are finally reasonably certain you know what they have in&amp;nbsp;mind.&lt;/p&gt;
&lt;p&gt;It seems everybody knows that all the way down, computers work with 0s and 1s. They work kind of like Yes and No, too, with each digit acting like the answer to a Yes/No question, to narrow down the available information. Quick&amp;nbsp;example:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Animal&lt;/strong&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Does it have more than two legs? → Yes&amp;nbsp;(1)&lt;/li&gt;
&lt;li&gt;Does it have four legs? → No&amp;nbsp;(0)&lt;/li&gt;
&lt;li&gt;Does it crawl on the ground? → No&amp;nbsp;(0)&lt;/li&gt;
&lt;li&gt;Can it jump? → Yes&amp;nbsp;(1)&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With this question sequence, a grasshopper would be represented as &lt;span class="caps"&gt;YNNY&lt;/span&gt;,&amp;nbsp;or &lt;code&gt;1001&lt;/code&gt;. A millipede would be represented&amp;nbsp;as &lt;code&gt;1010&lt;/code&gt;. A dog would be represented&amp;nbsp;as &lt;code&gt;1101&lt;/code&gt;, but so would a cat. 4 digits can help us categorise different animals, but not all. The more questions we can ask, the better we can categorise&amp;nbsp;them.&lt;/p&gt;
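&lt;p&gt;For readers who like to see it run, the question sequence above can be sketched in a few lines of Python. This is only an illustration: the function name &lt;code&gt;encode&lt;/code&gt; and the table of answers are made up for this example, with the answers taken from the animals mentioned&amp;nbsp;above.&lt;/p&gt;

```python
# The four yes/no questions from the list above, in the same order.
QUESTIONS = [
    "more than two legs",
    "four legs",
    "crawls on the ground",
    "can jump",
]


def encode(answers):
    """Turn a sequence of yes/no (True/False) answers into a string of 1s and 0s."""
    return "".join("1" if answer else "0" for answer in answers)


# Hypothetical answer table, following the examples in the text.
animals = {
    "grasshopper": [True, False, False, True],
    "millipede": [True, False, True, False],
    "dog": [True, True, False, True],
    "cat": [True, True, False, True],
}

for name, answers in animals.items():
    print(name, encode(answers))
```

&lt;p&gt;The dog and the cat come out with the same four digits, which is the limitation described above: telling them apart takes at least one more&amp;nbsp;question.&lt;/p&gt;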
&lt;p&gt;The answer to each question has 2 possible outcomes, and gives us a little &lt;em&gt;bit&lt;/em&gt; more information. Claude Shannon, the father of modern Information Theory, introduced the &lt;strong&gt;bit&lt;/strong&gt; (short for “binary digit”, a name he credited to John Tukey) as the unit for it. What is a bit? It’s a unit of measure for information. Just as we measure weight in units of kilograms, height in units of centimetres, or time in units of seconds, we measure information in&amp;nbsp;bits.&lt;/p&gt;
&lt;p&gt;1 bit of information is enough to reduce the uncertainty by 50%, that is, to rule out half of the equally likely possibilities. Ideally, each question you ask in Animal, Plant, or Mineral halves the remaining possibilities, until there are few enough left to&amp;nbsp;guess.&lt;/p&gt;
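&lt;p&gt;Here is a rough Python sketch of that halving idea. It assumes every question splits the remaining options as evenly as possible, which real questions rarely manage; the function name &lt;code&gt;questions_needed&lt;/code&gt; is invented for this&amp;nbsp;example.&lt;/p&gt;

```python
def questions_needed(possibilities):
    """Count how many perfect yes/no questions it takes to get down to one option.

    Assumes each question halves the remaining options (rounding up when the
    number is odd, since the unlucky half may be the bigger one).
    """
    remaining = max(1, possibilities)
    questions = 0
    while remaining != 1:
        remaining = (remaining + 1) // 2  # keep the larger half
        questions += 1
    return questions


for options in (2, 16, 1000, 1_000_000):
    print(options, "options:", questions_needed(options), "questions")
```

&lt;p&gt;Under that assumption, 16 options take 4 questions, and even a million options take only 20. Each of those questions is worth 1 bit of&amp;nbsp;information.&lt;/p&gt;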
&lt;p&gt;So in a computer, a single digit—0 or 1—is a&amp;nbsp;bit.&lt;/p&gt;
&lt;h2&gt;A byte: a convenient cluster of 8&amp;nbsp;bits&lt;/h2&gt;
&lt;p&gt;In the 1970s, 8-bit microprocessors were all the rage. These were processors that handled data in clusters of 8 bits. The word &lt;strong&gt;byte&lt;/strong&gt;, which had been around since the 1950s as a name for a small cluster of bits, came to mean exactly 8 bits, and it has stuck since. The term didn’t die off because so many things still use clusters of 8 bits to represent&amp;nbsp;information.&lt;/p&gt;
&lt;p&gt;8 bits can store 256 (2^8) unique values, and that turns out to be enough for many purposes. I won’t list examples here, since those examples will come in subsequent issues. If you need greater precision, you can always use 2&amp;nbsp;bytes.&lt;/p&gt;
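&lt;p&gt;To see where 256 comes from, and what a second byte buys, here is a small Python sketch. The function name &lt;code&gt;unique_values&lt;/code&gt; is made up for illustration; the arithmetic is just 2 multiplied by itself once per&amp;nbsp;bit.&lt;/p&gt;

```python
def unique_values(bits):
    """Number of distinct patterns that a given number of bits can form."""
    return 2 ** bits


print(unique_values(8))    # 1 byte
print(unique_values(16))   # 2 bytes

# The smallest and largest 8-bit patterns, written out in full.
smallest = format(0, "08b")
largest = format(255, "08b")
print(smallest, largest)
```

&lt;p&gt;One byte gives 256 values (numbered 0 to 255), while two bytes give 65,536, so doubling the bits squares the number of&amp;nbsp;values.&lt;/p&gt;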
&lt;h2&gt;It’s all Greek (prefixes): kilo, mega, giga,&amp;nbsp;tera&lt;/h2&gt;
&lt;p&gt;The metric system gave us nice prefixes to count in thousands (kilo-), millions (mega-), billions (giga-), or trillions (tera-), neatly represented by the letters &lt;em&gt;k&lt;/em&gt;, &lt;em&gt;M&lt;/em&gt;, &lt;em&gt;G&lt;/em&gt;, and &lt;em&gt;T&lt;/em&gt; respectively&amp;nbsp;(case-sensitive).&lt;/p&gt;
&lt;p&gt;So a kilobyte is 1,000 bytes, a megabyte is 1,000,000 bytes, a gigabyte is 1,000,000,000 bytes, and a terabyte is 1,000,000,000,000&amp;nbsp;bytes.&lt;/p&gt;
&lt;h2&gt;Uh oh&amp;nbsp;…&lt;/h2&gt;
&lt;p&gt;Here we run into a little bit of a problem. Computers like to count in powers of two, because increasing the number of bits by one gives us double the number of possible&amp;nbsp;values.&lt;/p&gt;
&lt;p&gt;Computers keep track of every byte in memory by giving it a number, and that number is itself written in bits. 1 bit gives us two numbers (0 or 1), so it can keep track of two bytes. 2 bits can keep track of four bytes, since the 2 bits can be 00, 01, 10, or&amp;nbsp;11.&lt;/p&gt;
&lt;p&gt;3 bits: 8 bytes (000, 001, 010, 011, 100, 101, 110, 111)&lt;br /&gt;
4 bits: 16 bytes (I won’t list them from this point onwards; I think you can see the pattern)&lt;br /&gt;
5 bits: 32 bytes&lt;br /&gt;
6 bits: 64 bytes&lt;br /&gt;
7 bits: 128 bytes&lt;br /&gt;
8 bits: 256 bytes&lt;br /&gt;
9 bits: 512 bytes&lt;br /&gt;
10 bits: 1024&amp;nbsp;bytes&lt;/p&gt;
&lt;p&gt;1024 (2^10) is the power of two closest to 1000, so 1024 bytes is the closest a computer can comfortably come to 1000&amp;nbsp;bytes.&lt;/p&gt;
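&lt;p&gt;The doubling pattern is easy to check with a few lines of Python. This sketch simply lists the first dozen powers of two and picks the one nearest to 1000; the variable names are arbitrary choices for this&amp;nbsp;example.&lt;/p&gt;

```python
TARGET = 1000

# Each extra bit doubles the count: 2, 4, 8, 16, and so on.
powers = [2 ** bits for bits in range(1, 13)]

# Pick the power of two with the smallest distance from the target.
closest = min(powers, key=lambda value: abs(value - TARGET))

print(powers)
print("closest to", TARGET, "is", closest)
```

&lt;p&gt;The neighbours on either side are 512 and 2048, both much further from 1000 than 1024&amp;nbsp;is.&lt;/p&gt;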
&lt;h2&gt;Can’t be&amp;nbsp;unseen&lt;/h2&gt;
&lt;p&gt;If you’re on a Windows computer, go to My Computer. If you’re on another &lt;span class="caps"&gt;OS&lt;/span&gt;, go to whichever app shows you available disk space. Look carefully at the units for free&amp;nbsp;space.&lt;/p&gt;
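&lt;p&gt;Depending on your &lt;span class="caps"&gt;OS&lt;/span&gt;, disk space may not be counted in the &lt;span class="caps"&gt;MB&lt;/span&gt;, &lt;span class="caps"&gt;GB&lt;/span&gt;, or &lt;span class="caps"&gt;TB&lt;/span&gt; that drive makers use. It’s counted in MiB (mebibytes), GiB (gibibytes), or TiB (tebibytes)! Some systems spell those units out; Windows labels them &lt;span class="caps"&gt;MB&lt;/span&gt;, &lt;span class="caps"&gt;GB&lt;/span&gt;, and &lt;span class="caps"&gt;TB&lt;/span&gt; but still counts in multiples of 1,024. Those units are not the decimal units we are used to. We count bytes differently from&amp;nbsp;computers.&lt;/p&gt;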
&lt;p&gt;&lt;strong&gt;Humans:&lt;/strong&gt; Since 10^3 is 1000, a kilobyte is 1000 bytes, a megabyte is 1000 kilobytes, a gigabyte is 1000 megabytes, and a terabyte is 1000 gigabytes.&lt;br /&gt;
&lt;strong&gt;Computers:&lt;/strong&gt; Since 2^10 is 1024, a kibibyte (KiB, or kilo binary byte) is 1024 bytes, a mebibyte (MiB, or mega binary byte) is 1024 kibibytes, a gibibyte (GiB, or giga binary byte) is 1024 mebibytes, and a tebibyte (TiB, or tera binary byte) is 1024&amp;nbsp;gibibytes.  &lt;/p&gt;
&lt;p&gt;When you buy a &lt;span class="caps"&gt;1TB&lt;/span&gt; hard drive, you are buying a 1,000,000,000,000-byte&amp;nbsp;drive.&lt;/p&gt;
&lt;p&gt;1,000,000,000,000 bytes ÷ 1,024 = 976,562,500 kibibytes (KiB)&lt;br /&gt;
976,562,500 kibibytes ÷ 1,024 ≈ 953,674 mebibytes (MiB)&lt;br /&gt;
953,674 mebibytes ÷ 1,024 ≈ 931 gibibytes&amp;nbsp;(GiB)&lt;/p&gt;
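&lt;p&gt;If you would rather not punch this into a calculator, the same three divisions can be sketched in Python. The function name &lt;code&gt;to_binary_units&lt;/code&gt; is invented for this example, and the 1,000,000,000,000-byte figure is the nominal &lt;span class="caps"&gt;1TB&lt;/span&gt; from above (a real drive’s usable space will differ&amp;nbsp;slightly).&lt;/p&gt;

```python
def to_binary_units(num_bytes):
    """Divide by 1,024 three times: bytes to KiB, KiB to MiB, MiB to GiB."""
    kib = num_bytes / 1024
    mib = kib / 1024
    gib = mib / 1024
    return kib, mib, gib


# A nominal "1 TB" drive, counted the way the box counts it.
kib, mib, gib = to_binary_units(1_000_000_000_000)

print(f"{kib:,.0f} KiB")
print(f"{mib:,.0f} MiB")
print(f"{gib:,.0f} GiB")
```

&lt;p&gt;Only the first division comes out as a whole number; the other two are rounded, which is why the result is about 931 GiB rather than an exact&amp;nbsp;figure.&lt;/p&gt;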
&lt;p&gt;So your computer isn’t lying; it’s just counting in different&amp;nbsp;units.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Issue summary:&lt;/strong&gt; A bit is a unit of measurement for information. 1 bit of information is enough to reduce the uncertainty by 50%. 8 bits comprise 1 byte. Humans count bytes in multiples of thousands, while computers count bytes in multiples of&amp;nbsp;1,024.&lt;/p&gt;
&lt;hr /&gt;
&lt;p&gt;There are much shorter versions of this explanation on the Internet, but I found none of them satisfying, because they try to paper over the mathematical detail. While this newsletter is intended for layfellas, the math is something that can be worked out with a calculator, and I found that showing the detail makes it easier to&amp;nbsp;understand.&lt;/p&gt;
&lt;p&gt;There may be a social-construct argument to be made here for units of measurement, but I won’t go into that here. I wanted Issue 40 to start with an example of how things work differently between a human mind and a computer’s “computational mind”, and I hope I’ve achieved&amp;nbsp;that.&lt;/p&gt;
&lt;h2&gt;What I’ll be covering&amp;nbsp;next&lt;/h2&gt;
&lt;p&gt;&lt;strong&gt;Next issue:&lt;/strong&gt; &lt;span class="caps"&gt;ASCII&lt;/span&gt;, the typewriter&amp;nbsp;digitised&lt;/p&gt;
&lt;p&gt;We started with a (figurative) bit of bean-counting; now let’s get right into how computers work with text in Issue 41, so that I can finally answer one of the sometime-in-the-future questions below: What is &lt;span class="caps"&gt;ASCII&lt;/span&gt;? And I’ll answer another one in Issue 42: What is&amp;nbsp;Unicode?&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Sometime in the future:&lt;/strong&gt; What&amp;nbsp;is:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;booting up? [Issue&amp;nbsp;15]&lt;/li&gt;
&lt;li&gt;a cookie? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;XSS&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;a &lt;span class="caps"&gt;CDN&lt;/span&gt;? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;Unicode? And what does it have to do with emoji? [Issue&amp;nbsp;8]&lt;/li&gt;
&lt;li&gt;those &amp;#8216;\r\n’s in the &lt;span class="caps"&gt;HTTP&lt;/span&gt; request packet [Issue&amp;nbsp;12,17]?&lt;/li&gt;
&lt;li&gt;a good reason developers write code and give it away for free online? [Issue&amp;nbsp;21]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;ASCII&lt;/span&gt;? [Issue&amp;nbsp;23]&lt;/li&gt;
&lt;li&gt;compiling code into an application [Issue&amp;nbsp;26]?&lt;/li&gt;
&lt;li&gt;firmware? [Issue&amp;nbsp;34]&lt;/li&gt;
&lt;li&gt;&lt;span class="caps"&gt;HTML&lt;/span&gt;? [Issue&amp;nbsp;38]&lt;/li&gt;
&lt;/ul&gt;</content><category term="Season 04"></category></entry></feed>