Layman's Guide to Computing - Articles by J S Ng

HTTP is a set of rules for sending and receiving webpages that link to other webpages. According to the rules, if you want a webpage, you (the client) must send an HTTP request to a server, which will return a response.

Issue 8: HTTP error codes—How does a server let the client know if there’s something wrong with their HTTP request?

A request or a response consists of a header and a body. The response header contains information about the response. The status code in the response header determines if the request was successful or unsuccessful.
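As a rough illustration of how a client might read the status code, the sketch below groups codes by their first digit using Python's standard `http.HTTPStatus`. The `describe` helper is a name made up for this example, not part of any library.

```python
from http import HTTPStatus

def describe(code: int) -> str:
    """Classify an HTTP status code by its range (hypothetical helper)."""
    status = HTTPStatus(code)  # raises ValueError for codes Python does not know
    if 200 <= code < 300:
        outcome = "success"
    elif 300 <= code < 400:
        outcome = "redirect"
    elif 400 <= code < 500:
        outcome = "client error"
    elif 500 <= code < 600:
        outcome = "server error"
    else:
        outcome = "informational"
    return f"{code} {status.phrase}: {outcome}"

print(describe(200))
print(describe(404))
```

Codes in the 400s mean the request itself was at fault, while codes in the 500s mean the server failed to handle it.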

Issue 9: How do I make an HTTP request?

Issue 10: How do websites actually know if you are really you?

Issue 11: How does wifi work?

Issue 12: What is HTTPS? How is it different from HTTP?

Issue 13: How do I use HTTPS?

Issue 14: What do developers do?

Issue 15: Sysadmins and the command line

A command line is a way of giving commands to the computer in the form of text. A command consists of the name of the program to be run, and the options that it needs to use. Command lines provide a fallback mechanism when graphical interfaces break down, and are a much more remote-friendly interface.

Issue 16: Shell scripts and automation

Everybody has a simpleton in their pocket, and maybe one at home on the desk. These simpletons are able to run sets of instructions that they are carrying. You can give them more sets of instructions, often obtained through a Store. Some of the really good instruction sets will cost some money, though. And almost none of them will get the simpleton to do exactly what you want.

Issue 17: Libraries

Libraries make it easier to do common tasks in a programming language, or enable advanced functionality that wasn’t available previously. Libraries are usually specific to a particular programming language, and can’t be used in another programming language.

Issue 18: Frameworks

While libraries make it easier to do common tasks in a programming language, a framework makes it easier to make a particular kind of app. Like libraries, frameworks are usually specific to a particular programming language, and can’t be used in another programming language.

Issue 19: Version control and git

A version control system (VCS) tracks changes to documents. Git is a version control system for source code. Keeping a change history enables a VCS to roll back code to a previous point in time. A change history can also be used in areas other than coding.

Issue 20: Testing

Testing is the only way to know your app really works. Tests can be set up for the different parts of your app, from the basic building blocks to the main code and finally even the interface, including web pages.

Issue 21: Forking and merging

In git, forking a repository creates a copy of it for you to work on. Merging a repository with the original combines the commits from both so that they become one repository again. Conflicts arising from commits from both codepaths that affect the same part of the code will need to be resolved manually. Developers do most of this forking and merging in GitHub, an online platform for working on and talking about code.

Issue 22: Continuous Integration in software

Continuous Integration means merging changes back to the main branch as often as possible. This means keeping code changes as small as possible, and using automated testing to speed up the development process.

Issue 23: Specifications in software

Specifications describe the details of how a piece of hardware/software should work in order to meet a set of requirements. When well written and well implemented, they aid the coordination of the multitude of devices across the world, enabling them to communicate seamlessly, unambiguously, and unnoticeably with each other. Your devices work because they follow specifications.

Issue 24: Issue trackers, Bug trackers

A bug tracker, or issue tracker, is where users can submit problems they encounter with the software. To submit a helpful issue, users should understand the project’s philosophy and purpose, and read the contributing guidelines.

Issue 25: Text Editors and Integrated Development Environments

A text editor helps programmers to edit their code. An IDE (integrated development environment) helps programmers to see what’s going on in their code and test their code’s performance, and provides almost all the necessary tools in one package.

Issue 26: Software distribution

A developer can simply put up a download on a webpage and hope people download it and figure out how to get the app on their devices. But often, the more popular way is to publish the app to a repository (also known as an app store, for mobile devices). Before this can be done, the code or application needs to be packaged according to the requirements of the repository.

Issue 27: What is an IP address?

IP addresses are a string of four numbers. A list of reserved IP addresses is managed by IANA, and all Internet registries agree to forward data packets according to that list. A data packet sent from a client goes to its gateway. At the gateway, the destination IP address is checked against the gateway’s forwarding tables. If the IP address is found in the forwarding table, it gets sent along that route, otherwise it gets forwarded to the next gateway, … until it reaches its destination.

Issue 28: Domain Names and DNS

Domain names consist of an optional subdomain, the domain name, and the top-level domain. The top-level domains are managed by a registry, which receives registration requests from domain name registrars, and maintains registrant information for each domain under their TLD in a WHOIS database. The domain name registrars let you configure which IP address to forward data packets to, and propagate that information through their DNS servers so that data packets will be routed accordingly.

Issue 29: How to resolve a hostname

Resolving a hostname means answering the question “which IP address does this hostname point to?”. Your web browser seeks this answer by sending a DNS query to the gateway. If the gateway is unable to provide a satisfactory answer, you can configure your network interface to send the DNS query to a different DNS server.

Issue 30: Private IP Addresses

Private IP addresses are special IP addresses that routers will treat as belonging to devices within the private network, and not outside it. Data packets sent to private IP addresses will never make it past the gateway into the internet. This system allows multiple devices within a private network to share a public IP address.
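A small sketch of how software can tell the two kinds of address apart, using Python's standard `ipaddress` module. The sample addresses and the `is_private` wrapper are chosen only for illustration.

```python
import ipaddress

def is_private(address: str) -> bool:
    """Return True if the address belongs to a reserved private range."""
    return ipaddress.ip_address(address).is_private

for addr in ["192.168.1.10", "10.0.0.5", "8.8.8.8"]:
    print(addr, "private" if is_private(addr) else "public")
```

A router applies the same kind of check before deciding whether a packet may leave the private network.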

Issue 31: Getting a private IP address: DHCP (and DDNS)

DHCP is a protocol by which a router assigns IP addresses to devices that connect to it. Static IP addresses are IP addresses that are reserved for a device, so that the device always gets the same IP address when it connects.

Issue 32: Sharing a public IP address: Network Address Traversal

When a request from a device on the network is to be forwarded to the gateway, it has to traverse different networks. The router helps it by rewriting the source IP and port number, keeping track of the originating IP and port. When a response is received, it rewrites the destination IP and port so that the response will reach the originating device.
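The rewriting described above can be sketched as a lookup table kept by the router. This is a simplified model under stated assumptions: the `NatRouter` class, the starting port 50000, and the example addresses are all invented for illustration, and real routers also reuse and expire entries.

```python
from dataclasses import dataclass, field

@dataclass
class NatRouter:
    public_ip: str
    next_port: int = 50000                      # assumed starting port
    table: dict = field(default_factory=dict)   # public port -> (private ip, private port)

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the source of an outgoing packet and remember the original."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (src_ip, src_port)
        return (self.public_ip, public_port, dst_ip, dst_port)

    def inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Rewrite the destination of a response back to the originating device."""
        private_ip, private_port = self.table[dst_port]
        return (src_ip, src_port, private_ip, private_port)

router = NatRouter("203.0.113.7")
sent = router.outbound("192.168.1.10", 40000, "198.51.100.20", 443)
reply = router.inbound("198.51.100.20", 443, sent[0], sent[1])
print(sent)
print(reply)
```

The server only ever sees the router's public address; the table is what lets the reply find its way back to the right device.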

Issue 33: Port numbers

When an app makes a network request through the OS, the OS adds the source and destination port number to the query in accordance with TCP. When the OS receives the response, it forwards the data to the app which is mapped to the destination port number. Port numbers 1-1023 are registered to standard Internet services, port numbers 1024 to 49151 may be registered to other services, and port numbers 49152 to 65535 may be used by anyone.
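The three ranges above can be written as a short function. The name `port_range` and the wording of the labels are assumptions made for this sketch.

```python
def port_range(port: int) -> str:
    """Name the range a port number falls into, following the ranges in the text."""
    if not 1 <= port <= 65535:
        raise ValueError("expected a port number from 1 to 65535")
    if port <= 1023:
        return "standard Internet services"
    if port <= 49151:
        return "may be registered to other services"
    return "may be used by anyone"

for port in (443, 8080, 50000):
    print(port, port_range(port))
```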

Issue 34: Firewalls

Firewalls block data packets that match certain rules. They unwrap the data packet layer by layer, dropping those that match their programmed rules without allowing them to be forwarded to the next point in their journey. The type of filtering that can be applied depends on the processing power available to the router, since some information is hidden more deeply in the data packet than others. Such filtering is typically circumvented by the use of VPNs, or other means of encrypting the data.

Issue 35: Virtual Private Networks (VPNs)

VPNs link devices that are not within the same network, such that they can behave as though they are. By encrypting the packet data before it is sent between devices, the VPN software hides these packets from being snooped (i.e. spied upon), effectively forming an encrypted tunnel for information to travel between devices. This enables devices to circumvent firewalls and protect the privacy of information in the data packets.

Issue 36: Latency

Latency is the time duration between a ping packet being sent out and its response being received. It is an indication of how far away a target server is.

Issue 37: Traceroute–Google Maps for data packets

The process of forwarding data packets from server to server takes time. Each hop a data packet takes adds to the latency. The more hops a packet must undergo, the longer the latency. The slower the servers along the route, the longer the latency as well.

Issue 38: Loading a web page

When a webpage document loads (Stage 1), it is processed by the web browser, which then

Issue 39: Caches and caching

Your computer and browser speed up a lot of lookups by caching information that is unlikely to change from the last view. When the same information is requested, your computer or browser will first look in the cache to find that information, and retrieve it from cache if it is there, otherwise it will load the information (and store it in cache if allowed to). There are usually ways to bypass a cache if the information is stale or no longer correct.
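The lookup order described above (cache first, load on a miss, optional bypass) might look like this in miniature. `CachedLoader` and its `bypass_cache` flag are hypothetical names for this sketch, and the "slow lookup" is a stand-in function rather than a real network request.

```python
class CachedLoader:
    """Tiny cache sketch: look in the cache first, load only on a miss."""

    def __init__(self, load):
        self.load = load    # the slow lookup, e.g. a network request
        self.cache = {}
        self.loads = 0      # how many times the slow lookup actually ran

    def get(self, key, bypass_cache=False):
        if not bypass_cache and key in self.cache:
            return self.cache[key]      # cache hit: no slow lookup needed
        self.loads += 1
        value = self.load(key)          # cache miss, or a forced refresh
        self.cache[key] = value
        return value

loader = CachedLoader(lambda name: name.upper())  # stand-in for a slow lookup
loader.get("logo.png")
loader.get("logo.png")                      # served from the cache
print(loader.loads)
loader.get("logo.png", bypass_cache=True)   # forces a fresh load
print(loader.loads)
```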

Issue 40: Bits and bytes

A bit is a unit of measurement for information. 1 bit of information is enough to reduce the uncertainty by 50%. 8 bits comprise 1 byte. Humans count bytes in multiples of thousands, while computers count bytes in multiples of 1,024.
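To see why the two counting systems give different numbers, here is a quick calculation. The 500 GB drive is just an assumed example.

```python
size_in_bytes = 500 * 1000**3          # a drive sold as "500 GB" (assumed example)

bits = size_in_bytes * 8               # 8 bits per byte
gb = size_in_bytes / 1000**3           # counting in thousands
gib = size_in_bytes / 1024**3          # counting in multiples of 1,024

print(bits, "bits")
print(gb, "GB")
print(round(gib, 1), "GiB")
```

The drive holds the same number of bytes either way; only the unit used to report it changes.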

Issue 41: ASCII, the typewriter digitised

In computers that can encode and decode ASCII, each character of text is stored as a 7-bit sequence. Text consists of letters, numbers, symbols, and control codes.
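A short sketch showing the 7-bit pattern behind a few characters. The helper name `to_ascii_bits` is made up for this example.

```python
def to_ascii_bits(text: str) -> list[str]:
    """Return the 7-bit ASCII pattern for each character in the text."""
    bits = []
    for ch in text:
        code = ord(ch)
        if code > 127:
            raise ValueError(f"{ch!r} is not an ASCII character")
        bits.append(format(code, "07b"))
    return bits

print(to_ascii_bits("Hi!"))
print(to_ascii_bits("\n"))   # a control code: line feed
```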

Issue 42: Unicode, computers go international

Unicode is an encoding format which is meant to support every language, ever. Most websites, apps, and interfaces support it today.
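As a hedged illustration, the sketch below prints the Unicode code point of a few characters and how many bytes each takes when stored as UTF-8, one common way of storing Unicode text. The `describe_char` helper and the sample characters are chosen only for this example.

```python
def describe_char(ch: str) -> tuple[str, int]:
    """Return the code point (as U+XXXX) and the UTF-8 byte count of one character."""
    return (f"U+{ord(ch):04X}", len(ch.encode("utf-8")))

for ch in ["A", "é", "中", "😀"]:
    print(ch, *describe_char(ch))
```

Characters that were already in ASCII still take a single byte, while others take two to four.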

Issue 43: Images, a mosaic of 3 colours

Colour is stored as a combination of red, green, and blue. In a computer system, each

Issue 44: Image resolution

An image’s resolution describes its dimensions. Its pixel resolution gives an indication of its physical size (if printed or displayed on a screen), and thus its sharpness. A display with imperceptibly small pixels is often referred to as a Retina display (Apple’s branding) or as a high-PPI display; this requires at least 220 PPI nominally. For an image to be printed sharply, it needs at least 300 DPI.

Issue 45: Audio, a sampling of values

Humans can distinguish 120 dB of loudness, which means the loudest perceivable sound has a million times the sound pressure of the softest perceivable sound. CD audio provides 16 bits of information per sample, sufficient to provide 96 dB. Humans have a hearing range from 20 Hz to 20 kHz. CD audio is sampled at 44.1 kHz. Uncompressed audio thus requires 705,600 bits per second per channel, or 86 KiB/s.
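The figures above can be reproduced with a few lines of arithmetic. This sketch assumes a single audio channel and uses the usual 20·log10 rule for converting an amplitude ratio to decibels.

```python
import math

bits_per_sample = 16
samples_per_second = 44_100

bits_per_second = bits_per_sample * samples_per_second    # one channel
kib_per_second = bits_per_second / 8 / 1024               # bits -> bytes -> KiB

# 16 bits can represent 2**16 distinct levels
dynamic_range_db = 20 * math.log10(2 ** bits_per_sample)

# 120 dB expressed as a ratio of sound pressures
pressure_ratio = 10 ** (120 / 20)

print(bits_per_second)
print(round(kib_per_second))
print(round(dynamic_range_db))
print(round(pressure_ratio))
```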

Issue 46: Lossy compression

Computers compress image and audio data through a process similar to summarising: they analyse the data using algorithms that use brightness and colour instead of RGB values for images, and different frequencies of sound rather than samples at different points in time for audio. These algorithms then discard parts of the information that human senses do not perceive easily, and reduce the resolution of other parts that human senses are not as sensitive to.

Issue 47: Lossless compression

Data cannot be compressed beyond its predictability limit (Shannon entropy) in a lossless fashion. Lossless compression does not discard any information. It generally tries to spot patterns in the data, and represent those patterns with fewer bits, through a combination of predictive coding, run-length encoding, and entropy coding.
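Of the techniques named above, run-length encoding is the simplest to show. The sketch below is a minimal version, assuming the input is a plain string; real compressors combine it with the other techniques.

```python
from itertools import groupby

def rle_encode(text: str) -> list[tuple[str, int]]:
    """Replace each run of repeated characters with (character, run length)."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original text exactly; nothing was discarded."""
    return "".join(ch * count for ch, count in pairs)

encoded = rle_encode("aaaabbbcc")
print(encoded)
print(rle_decode(encoded))
```

Decoding gives back exactly the original text, which is what makes the scheme lossless. Data with few repeated runs would not shrink this way.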

Issue 48: Of containers and codecs

A video container can hold one or more audio, video, or text data streams. To encode or decode a data stream, you need to have the necessary codec installed[^1]. Most video runs at 25 or 30 fps, with high-quality video going up to 60 fps. You can use a program like MediaInfo to help you decipher the streams inside a video container file.

Issue 49: What is a File?

A file consists of data, preceded by a file header which describes the data. Software (including operating systems) detects the kind of data contained in a file by 1) glancing at the file extension, 2) looking at its declared MIME type (if any), and 3) checking the file header, in order of difficulty and accuracy.
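Checking the file header, the third method above, can be sketched by comparing the first few bytes against known signatures. The `sniff` function and the short signature table are assumptions for this example; real tools know many more formats.

```python
# The opening bytes of a few well-known file formats
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
    b"\xff\xd8\xff": "JPEG image",
}

def sniff(data: bytes) -> str:
    """Guess the kind of data from the bytes at the start of a file."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    return "unknown"

print(sniff(b"%PDF-1.7\n"))
print(sniff(b"just some text"))
```

Unlike the file extension, these bytes do not change when a file is renamed.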

Issue 50: Complex file formats and the Document

An HTML file contains markup tags that tell the browser how to interpret and format the text within the tags. Other document formats usually use tags in a similar way. These tags constitute a markup language that any app can use to mark up its own text too.

Issue 51: PDFs part 1 – Compatibility and fidelity

PDF is the gold standard for universal compatibility (supported by most software and platforms) and visual fidelity (displays exactly the same way). When you need things to appear on a different device in exactly the same way you created it, without having to install additional software, use PDF.

Issue 52: PDFs part 2 – Text and images

PDF’s markup language is more concerned with how things appear on the page than with what they were originally. Once the PDF is generated, it is almost impossible to retrieve the original data from it. Scanned documents that are converted to PDF may have a text layer generated by OCR that lets detected text be copied from it.

Issue 53: The CPU is an instruction-obeying slave

CPUs are unconscious slaves that simply execute instruction after instruction, at a very fast rate.

Issue 54: Compiling programming code into CPU instructions

To get useful output from a CPU, we must translate the operations we want it to perform into CPU instructions, in a process known as compiling. Most compilers convert programming code into CPU instructions.

Issue 55: Addressing memory

The life of the unconscious CPU is just executing instruction after instruction after instruction. Each instruction may consist of loading data from a memory location, sending data to a memory location, or performing operations on the data it is holding.

Issue 56: Operating Systems and resource management

The operating system is responsible for listing and managing the computer’s resources, making them available to programs running on the computer, and making sure they only use what they are allowed to.

Issue 57: Cache, the CPU’s working space

The CPU stores data for ready access in the CPU cache. Accessing data from the CPU cache is much faster than accessing data from memory. When the CPU needs data from a memory address, it looks in the cache first. If the data is not there (a cache miss), it will load the data from the memory address, and store a copy in the cache for faster reference in future. The CPU cache is managed by the CPU and is invisible to the OS. Programs that need to ensure the data in the cache is “fresh” can perform a cache flush and reload.

Issue 58: CPU Optimisation Part 1 – Out-of-Order Processing

The CPU comprises different types of execution units. All the execution units can run at the same time, but they may execute instructions over different numbers of clock cycles. To minimise wait time, CPU instructions are carried out in an order that keeps the execution units busy as often as possible.

Issue 59: Meltdown

A set of instructions can trick a CPU into reordering load instructions so that the data is temporarily loaded into the cache before the instructions are retired. The cache can then be snooped to retrieve the data.

Issue 60: CPU Optimisation Part 2 – Speculative Execution and Spectre

Speculative execution is a feature that lets the CPU speed up execution if it correctly predicts a decision point. The CPU carries out the operations along the predicted decision branch and loads the results if it predicts correctly.

Issue 61: Mapping the cache

A cache miss is slow, and a cache hit is fast. This difference in cache reading speed can be used to transmit secrets out from the cache, which cannot be read directly by programs.

Issue 62: Cache snooping

Issue 63: Limitations of Meltdown and Spectre

For Meltdown and Spectre to work, they need two things: (1) Permission to carry out instructions (i.e. run programs) on the OS, and (2) knowledge of where the kernel address space is.

Issue 64: Fixing Meltdown and Spectre

Meltdown and Spectre require the programs executing them to have access to kernel memory space. Kernel address isolation attempts to prevent the program from even having access to the kernel address space in the first place. TLB flushing changes the virtual-to-physical memory mapping, disrupting Spectre’s reliance on a consistent virtual-to-physical memory mapping.

Issue 65: Memory Sharing in the Operating System

Shared memory helps to reduce the amount of memory needed by all the applications running on an operating system. It also allows applications to send data to each other, and to communicate.

Issue 66: Before the Cloud

DoubleClick, the first commercially successful ad server, launched in 1996. It ran a system that tracked the performance of banner ads across 30 sites, working to optimise their return on investment. This was made possible by standardisation of the web (thanks to the HTTP specification), and the birth of JavaScript, a scripting language integrated into the webpage rather than being a separate module from it. All of this happened in 1995–1996.

Issue 67: The Innocent Times

Each click on a link, or even an ad, sends data to the server. This information can include an ID for the link you clicked, or the category of ad you clicked. But without JavaScript, the webpage can’t know very much about you.

Issue 68: The Age of Bloat

Advertising was sold on a CPM model (cost per thousand impressions) in the early Internet, until the dot-com bust forced companies to reconsider their ad-buying strategy. The CPC model (cost per click) became more popular, but was still not very user-targeted. It would take Quantcast, founded in 2006, to figure out a way to gather data on users and build a coherent profile of each demographic.

Issue 69: The Cookie Monster

Cookies are little fragments of information with a name and a value, associated with a domain address. They are most commonly used to identify new or returning users. Such a cookie is issued by a website upon the first visit, stored in the browser, and returned to the issuing server with later requests to it.
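A minimal sketch of a cookie's name, value, and domain, using Python's standard `http.cookies` module. The cookie name `visitor_id`, its value, and the domain `example.com` are invented for illustration.

```python
from http.cookies import SimpleCookie

# What a website might issue on the first visit
issued = SimpleCookie()
issued["visitor_id"] = "abc123"
issued["visitor_id"]["domain"] = "example.com"
header_value = issued["visitor_id"].OutputString()
print(header_value)

# What the browser sends back later: just the name and value
returned = SimpleCookie()
returned.load("visitor_id=abc123")
print(returned["visitor_id"].value)
```

Seeing the same `visitor_id` again is how the server recognises a returning user.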

Issue 70: The Cookie Factory

When you browse a webpage, a tracking script retrieves the browser's existing cookie, if there is one, or sets a cookie for the browser if there isn’t one. The tracking script sends the cookie information back to the originating server, along with many other fragments of information.

Issue 71: The Rise of Audience Analytics

In 2006, Quantcast offered complete audience analytics for any site that puts their cookie on the site. Websites would know more about their audience than they could otherwise gather through their site alone. But Quantcast would make most of their money through their offering to advertisers.

Issue 72: The Data Brokers

Quantcast gathers a large amount of data on internet users directly through its cookie (which other publishers serve through their websites), and also by cross-checking it against data which it purchases from other data brokers who gather their information through other means, such as internet activity and credit card transactions.

Issue 73: The Heart of Darkness (Header Bidding)

When a page loads advertisements through header bidding, it sends your cookie along with other information to an ad exchange. The ad exchange conducts automated bidding among the ad-buyers, determines the winner(s), and sends the winning code(s) back to your browser. Your browser then sends these codes to the CDN, which sends back the winning ads for your page to render in your browser.

Issue 74: The Walls Have Pixels

There are two ways your browser can send cookies back to the server:

Issue 75: The Costs of Data Leakage

By not enforcing strict cookie policies on their own sites, publishers allowed advertisers to sneakily set cookies on their site audience. This allowed advertisers to reach the same audience via their advertising slots on other websites, which could be bought more cheaply. The publishers were cut out of the value chain and were no longer “gatekeepers” to their own site readers. They could not sell their advertising slots at a premium.

Issue 76: Third-parties and cross-site resources

Cookies with the same domain as the site are first-party cookies, while cookies with domains different from the site are third-party cookies. Cookies are used for all kinds of purposes, from remembering browsing sessions, to logging users in, to tracking their identity across websites. Blocking all third-party cookies indiscriminately can result in most if not all of these functions breaking.

Issue 77: Wearing clothes on the Internet

The default settings of most browsers expose a lot of information to scripts that request it. To prevent such scripts from running, we need services that can filter the source of these scripts. These services generally work by matching browser requests against a blacklist, and blocking the request if it comes from a domain known to host malicious scripts.

Issue 78: uMatrix: voyeuring the voyeurs

Modern webpages rely on many third-party resources for their functionality. Blocking access to some domains may cause these webpages to break and stop working.

Issue 79: A Base for Data

Comma-separated value (CSV) files store all data in text form. Within each row, a separator divides each chunk of data, and rows are separated by a line delimiter. To keep the data compact and read it more quickly, we have to decide beforehand what data type each chunk should be, and how much space it is allowed to take up. Such a data form can no longer be opened in a simple text editor program like Notepad.
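To make the contrast concrete, the sketch below reads a small CSV text and then packs the same rows into a fixed binary layout with Python's `struct` module. The column names, the sample rows, and the chosen layout (a 4-byte integer, an 8-byte name, a 1-byte age) are all assumptions for this example.

```python
import csv
import io
import struct

text = "id,name,age\n1,Ann,34\n2,Bob,27\n"
rows = list(csv.DictReader(io.StringIO(text)))   # every value arrives as text

# Decide the type and size of each chunk beforehand:
# 4-byte unsigned integer, 8-byte name, 1-byte unsigned integer
record = struct.Struct("<I8sB")

packed = b"".join(
    record.pack(int(r["id"]), r["name"].encode("ascii"), int(r["age"]))
    for r in rows
)

print(record.size, len(packed))
print(record.unpack_from(packed, record.size))   # jump straight to the second row
```

Because every record has the same size, a program can jump to any row by multiplying, but the result is no longer readable as plain text.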

Issue 80: Indexing

An index is a separate table containing key terms in the database (usually names, IDs, or some other key identifier), alongside the row numbers where they are found. An index greatly speeds up row lookups, but slows down the writing of new rows.
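A toy version of the idea, assuming a table held as a Python list and an index kept as a dictionary from ID to row number. The names `rows`, `index`, `find`, and `insert` are made up for this sketch.

```python
rows = [
    {"id": 17, "name": "Ann"},
    {"id": 42, "name": "Bob"},
    {"id": 99, "name": "Cy"},
]

# The index: key identifier -> row number where it is found
index = {row["id"]: position for position, row in enumerate(rows)}

def find(row_id):
    """Look up a row through the index instead of scanning every row."""
    return rows[index[row_id]]

def insert(row):
    """Writing a new row now takes two steps: the table, then the index."""
    rows.append(row)
    index[row["id"]] = len(rows) - 1

print(find(42))
insert({"id": 7, "name": "Dee"})
print(find(7))
```

The extra step in `insert` is the cost the summary mentions: every write must also update the index.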

Issue 81: Data Normalisation

Putting all data into one table results in unnecessary duplication of data. Making data atomic by splitting it up into multiple tables makes the data easier to work with, but requires multiple lookups and joins to get the required data. A standard database language, SQL, makes it possible to write queries that are supported by multiple databases.

Issue 82: Multiplayer databases

A database system follows rules that enable multiple users to send commands to the database at the same time. The system attempts to execute each action one at a time, locking data that is in use by other users, and ensuring that each user does not carry out actions that they are not permitted to. Such systems are better able to prevent data corruption compared to a text-based system.

Issue 83: Structured Query Language

Structured Query Language (SQL) is a computer language for managing data in databases. It has keywords and keyphrases that let you filter rows and columns, group and order data, perform basic arithmetic on data, and more. It is complex and powerful, but using it in an astute and efficient manner requires specialised training.

Issue 84: JOIN – supercharged VLOOKUP

SQL queries let you join multiple tables based on specified conditions using the JOIN keyword. This enables crafting complex queries to return only the specific data that is required.
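A small runnable sketch of a JOIN, using Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and their contents are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, item TEXT);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (10, 1, 'kettle'), (11, 1, 'teapot'), (12, 2, 'mug');
""")

# Match each order to its customer, then keep only Ann's orders
joined = db.execute("""
    SELECT customers.name, orders.item
    FROM orders
    JOIN customers ON customers.id = orders.customer_id
    WHERE customers.name = 'Ann'
    ORDER BY orders.item
""").fetchall()

print(joined)
```

The `ON` condition plays the role of the lookup column in a spreadsheet VLOOKUP.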

Issue 85: SQL Injections

Forms that naïvely inject user-submitted data into a SQL query template may end up sending valid (but otherwise unauthorised) SQL commands to the database, with disastrous consequences.
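The difference between injecting text into a query and passing it as a parameter can be shown with Python's built-in `sqlite3` module. The `users` table, its rows, and the malicious input below are made up for this sketch.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (name TEXT, secret TEXT)")
db.executemany("INSERT INTO users VALUES (?, ?)", [("ann", "a1"), ("bob", "b2")])

user_input = "nobody' OR '1'='1"   # what an attacker might type into a form

# Unsafe: the input is pasted into the query and becomes part of the SQL
unsafe_sql = f"SELECT name FROM users WHERE name = '{user_input}'"
leaked = db.execute(unsafe_sql).fetchall()

# Safer: the input is passed separately and is only ever treated as a value
safe = db.execute("SELECT name FROM users WHERE name = ?", (user_input,)).fetchall()

print(len(leaked), "rows leaked")
print(len(safe), "rows returned safely")
```

In the unsafe version the quote marks in the input close the string early, so the `OR '1'='1'` part runs as SQL and matches every row.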

Issue 86: Distributed databases

To increase the performance of a distributed database, we can scale up/scale vertically by increasing the computers’ performance, or scale out/scale horizontally by adding more computers. Distributed databases can only prioritise two of the following three factors: consistency, availability, partition tolerance (CAP theorem).

Issue 87: Relational Databases

Relational databases are designed to maintain a well-structured set of data tables through constraint rules. This makes them very useful for preventing accidental inconsistencies in data, but makes any changes to the data schema difficult to implement. Changing from one schema to another involves downtime and a migration.

Issue 88: Document Databases

Document databases organise data into documents, each containing a number of field-value pairs. Each value can itself be a document, and multiple values/documents can be grouped under a field. Document databases do not enforce data consistency across documents, so those rules need to be managed by the application which is using the database. This allows document databases to continue operating even when partitioned, at the cost of some consistency.

Issue 89: Graph Databases

Graph databases treat the details of things as secondary, and optimise for managing the network of relationships. A graph database can quickly look up how things are related to each other, and return the results.

Issue 90: Using a database

A URI (Uniform Resource Identifier) is required to connect to a database. This URI can be provided by a hosting service provider that runs your own database for you, or by a cloud service provider that runs your database on their platform.

Issue 91: Commercial database alternatives

Depending on what you need a database for, there may be online database platforms that can manage and automate much of the work for you. Airtable, Smartsheet, Knack, and Zoho Creator are just 4 of many options that offer an easier way to set up and input your data, then access them through apps or other means.

Issue 92: All about apps

Sandboxing is a catch-all term for the concept of ensuring apps don’t have access to resources outside of their privileges. Sandboxed apps are generally safer than non-sandboxed apps in terms of security, and easier to manage, terminate, and uninstall.

Issue 93: What's in a web app?

Web apps have limited access to the device’s storage, and can only store data in browser-managed databases. Progressive Web Apps (PWAs) can additionally register service workers that run in the background. Because they are so cleanly sandboxed, they can be easily removed by clearing the browser cache and storage, and deregistering any service workers manually.

Issue 94: Why do web browsers take up so much memory?

Web apps require the browser to request memory on their behalf, and thus their memory usage shows up under the browser process in the OS Task Manager. Web apps use this memory to store a more convenient (but larger) representation of the webpage document, and to store the data needed by the app.

Issue 95: What’s in a mobile app?

Mobile apps, unlike web apps, can bundle resources and libraries to be installed to a mobile device. They can also request access to storage, and typically have a higher memory limit than web apps.

Issue 96: Why are mobile apps so large in size?

Mobile apps are sandboxed by the operating system. As a result, they have to bundle all the libraries they need, and are not allowed to share libraries with other apps. This results in mobile apps with huge filesizes.

Issue 97: Laptop apps

A laptop app can do practically anything, if it is running through the Administrator/root account. Sandboxing is carried out through permission control.

Issue 98: Temporary files

Apps generally handle three categories of files: their own (permanent) app files, (shared) user files, and (ephemeral) temporary files.

Issue 99: Where does all the app data go? A look at Mac-like systems

macOS, Linux, and other similar systems treat everything as a file, organised into appropriate subfolders.

Issue 100: Where does all the app data go? A look at Windows systems

Windows systems categorise data into two types: files, and settings. Files are stored under an appropriate subfolder in C:\, while other storage devices and network locations are stored elsewhere or given their own drive letters. Settings are managed through the Windows Registry, which is stored in C:\Windows\System32\Config\ and C:\Users\Name\.

Issue 101: Why apps crash

An app crashes when it encounters a situation it can’t handle, or when it attempts to perform an operation that is disallowed by the operating system.

Issue 102: Threading

Applications are assigned a thread by the OS for running a sequence of instructions. The instructions are executed sequentially, and the app cannot proceed if it gets stuck on any instruction.

Issue 103: Why apps hang even with multiple threads

A race condition happens when threads depend on instructions happening with coincidental timing for success. When instructions are not executed with appropriate timing, one or more threads can get stuck waiting on a response that never comes.

Issue 104: Storing sensitive data

Issue 105: Operating Systems

The OS takes care of booting up, login and user management, window management, memory allocation, storage interfaces, background services, peripheral management, and much more. Access to these services, where allowed, is provided in the form of software libraries that developers can use.

Issue 106: Organising storage

A hard disk is organised into sectors, which are the smallest unit of storage. The OS’s filesystem determines how and where to store each file on the hard disk. The filesystem manages the file metadata in a file table, separate from the actual contents of the file.

Issue 107: The challenges of storage

When write operations are interrupted prematurely, filesystem corruption often results.

Issue 108: Safeguarding data operations

Safe writes ensure that all the data is written to disk sectors properly first before updating the file table. The result is that write operations take a longer time to complete.

Issue 109: Speeding up data operations

Fast writes dump the data to a write cache (in computer memory), then update the file table to look like the file is already written to disk. However, if power is cut before all data is properly moved from the write cache to disk, the data in memory is lost, and file corruption usually results.

Issue 110: Safeguarding against data corruption with a journal

Filesystem journals are a record of changes made to the disk, so as to enable those changes to be rolled back, or to be completed properly in case of sudden interruption.

Issue 111: Copying, moving, and deleting files

Moving a file (within the same disk region) merely updates its file table record, and this happens really quickly. Copying a file, or moving it to a different disk/region, involves copying the contents and then updating the file table record, and is considerably slower. Deleting a file only requires that its file table record be removed, and is a very fast operation (if it does not involve the Recycle Bin).

Issue 112: Bootstrapping into existence (bootup)

When a computer is booted up, it runs the BIOS from a chip on the motherboard. The chip checks that core parts are present, checks for a storage disk containing a bootloader, loads it into memory, and hands over control. The bootloader loads the operating system kernel. The operating system kernel then does whatever it needs to do to get the system ready for use.

Issue 113: A computer’s existential crisis (boot failure)

If you can’t get to a BIOS screen, it is likely a hardware problem and has to be solved by a technician. If you can’t get to the OS loading screen, it’s a bootloader problem and needs to be solved with more geekery. If something goes wrong with OS loading, and fails to fix itself on subsequent reboots, it’s probably time for a system refresh or reinstall.

Issue 114: In the beginning (firmware)

Embedded operating systems are unlike user operating systems. They are designed to run the software needed for an appliance’s operation, and are not meant to be used by users directly. Since they are considered somewhere between software and hardware, they are usually referred to as firmware.

Issue 115: Shutdown & standby

When you shut a computer down, it sends an exit signal to all running programs to get them to do their exit routine. This process can sometimes take a long time. To preserve the data configuration in memory while minimising power draw, a computer can go into standby mode: all hardware except the memory gets powered down, until the computer is woken up from standby.

Issue 116: Hibernation

Hibernation mode causes the computer to store the data configuration into a hibernation file on disk. When powered up, the OS reads the data configuration from the file back into memory. This lets the system avoid having to do a full shutdown and bootup; it performs a shorter version of these two sequences instead.

Issue 117: Swap space

Operating systems use a page file on the storage disk as a complement to physical memory. This allows OSes to perform better than they would without a page file. Data that is rarely accessed is moved to the page file (“paged out”), and can be paged in when it is needed later, albeit with a performance hit.

Issue 118: When I run two file-copy processes at the same time, why are they much slower?

A hard disk consists of a read arm, and a set of magnetic platters which store data. To read or write data, the read arm must move to the appropriate track of the rotating platter, and detect the magnetic field (for reading), or attempt to magnetise the domains on the platter (for writing). Operations that require the read arm to access different parts of the magnetic platters intermittently result in slower read speeds.

Issue 119: Solid-state disks, an upgrade from hard disks

Solid-state disks are much faster than hard disks because they have no moving parts, so no time is wasted waiting for parts to get into the right position. However, they are more expensive than hard disk drives.

Issue 120: Drivers, the glue between hardware and firmware

Driver files provide information about the driver, and instructions on how to receive information from the device, and encode information to be passed to the device. The operating system may come with generic driver files for the device, but custom driver files might provide better performance or additional features.

Issue 121: In graphic detail

3D models are represented with vertices (points), edges (line segments between points), and faces in a computer. Images known as textures can be mapped to faces to give the impression of detail.
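
A minimal sketch of such a representation in Python, using a made-up model (a flat unit square split into two triangles). Storing faces as tuples of vertex indices is one common convention, assumed here for illustration.

```python
# A unit square in the z = 0 plane, split into two triangular faces.
vertices = [
    (0.0, 0.0, 0.0),  # index 0
    (1.0, 0.0, 0.0),  # index 1
    (1.0, 1.0, 0.0),  # index 2
    (0.0, 1.0, 0.0),  # index 3
]

# Each face lists the indices of the three vertices at its corners.
faces = [
    (0, 1, 2),
    (0, 2, 3),
]


def edges_of(faces):
    """Collect the unique edges (pairs of vertex indices) around every face."""
    edges = set()
    for face in faces:
        for i in range(len(face)):
            a, b = face[i], face[(i + 1) % len(face)]
            edges.add((min(a, b), max(a, b)))
    return sorted(edges)


print(edges_of(faces))  # [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
```

The two triangles share the diagonal edge (0, 2), which is why there are five edges rather than six.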

Issue 122: The great flattening

Computers are general-purpose machines that usually process integer calculations. The graphics pipeline requires more specialised hardware that can process decimal number calculations. This is why high-performance graphics usually requires a graphics card.

Issue 123: Graphics cards: The Pixel Factory

Graphics cards contain lots of tiny cores that are much better at performing the same calculation for lots of decimal numbers. These cores are organised into compute units; a graphics card with more compute units can perform more calculations every second. Graphics cards have their own onboard memory, separate from the CPU. GPU memory is different from computer memory; it is configured for much higher data throughput. Integrated graphics are GPUs that are integrated into a CPU chip; these do not have their own onboard memory, and share memory with the CPU.

Issue 124: Video formats

The VGA video format originated in the time of cathode-ray tube (CRT) displays. It was superseded by HDMI, a video format standardised by consumer electronics companies. DisplayPort, on the other hand, is a video format standardised by computer display companies.

Issue 125: Analog and digital conversion

Analog formats such as VGA mostly contain the control signals that the CRT needs to operate, while digital formats such as HDMI and DisplayPort contain image data that the device must convert to control signals. Converting between analog and digital signals requires a dedicated conversion chip (a DAC for digital-to-analog, an ADC for analog-to-digital), hence VGA-HDMI adapters tend to be more costly than DisplayPort-HDMI adapters. Dedicated graphics cards generally support more simultaneous output video streams than integrated graphics.

Issue 126: USB Type-C

USB is a (licensed) technical standard that describes how devices connect to each other through a cable. USB Type-C is a new connector standard that supports USB 3, DisplayPort, HDMI, and Thunderbolt. It is able to carry multiple types of data simultaneously, in limited combinations. In a USB connection, one device acts as the host while the other acts as the device; the host initiates all communication.

Issue 127: USB Type-C Power Delivery

USB Power Delivery is a specification that describes how much voltage and current can be supplied by different categories of USB cables. It allows power delivery at different levels for all kinds of connected devices, up to 100W. This should help to simplify cable setups that otherwise require multiple kinds of cables between two closely interconnected devices (such as a laptop and an external monitor).

Issue 128: Upgradeability

Upgradable parts need a slot or socket to be inserted into; these slots/sockets need to be made robust enough, causing them to take up more space than a soldered part. Devices which were designed to be small and portable generally eliminate these as far as possible, opting to have parts directly soldered to the board instead.

Issue 129: Cooling

The larger the surface area, the faster an object loses heat. The larger the temperature difference between object and surroundings, the faster the object loses heat. Heat is bad for computers, and CPUs need cooling to be able to process computations quickly. A mobile phone thus typically uses no more than 4 W of power, a laptop can use 25–45 W, and a desktop can usually use 65 W and more. Two popular ways of increasing the cooling capacity of a device are to attach a larger piece of metal (a heatsink) to the chip (passive cooling), or to use a fan to force air over the heatsink (active cooling).

Issue 130: Power limits

AC power from the wall uses electric current that alternates directions, while DC power from batteries uses electric current that flows in one direction only. All electronics are DC-only, and require an AC-DC adapter to be powered from the wall. The AC-DC conversion produces a significant amount of heat; AC-DC adapters are usually external unless the device has sufficient space or cooling capacity for it.

Issue 131: What do early CPUs and startup founders have in common?

CPUs have limited throughput, since there is a max frequency they can operate at, and a limit to the number of wires they can be connected to (throughput = no. of wires × frequency). Later designs of early computers increased the capability of computers by delegating more work to secondary chips.
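
The throughput formula can be tried out with some numbers. The figures below (16 wires at 8 MHz) are hypothetical, and the sketch assumes each wire carries one bit per cycle.

```python
def throughput_bits_per_second(wires: int, frequency_hz: float) -> float:
    """Throughput = number of wires x frequency, assuming one bit per wire per cycle."""
    return wires * frequency_hz


# Hypothetical example: a 16-wire connection clocked at 8 MHz.
bits_per_second = throughput_bits_per_second(16, 8_000_000)
print(bits_per_second / 8 / 1_000_000, "MB/s")  # 16.0 MB/s
```

Doubling either the number of wires or the frequency doubles the throughput, which is why hitting a ceiling on both limits what a single chip can do.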

Issue 132: the AT form factor (pre-1995)

Chipsets served as go-betweens in the AT form factor by IBM.

Issue 133: the ATX form factor (post-1995)

The ATX form factor also brought with it a new breed of computers with more specialised chipsets: the memory controller hub (MCH) and peripheral controller hub (PCH). The MCH coordinates high-throughput components, such as computer memory and graphics. The PCH specialises in lower-throughput needs.

Issue 134: Part 1 – the Intel Core i-series launches!

Light takes 0.3 ns to travel 10 cm, approximately the distance by wire between the CPU and the MCH. This potentially causes operations between the CPU and MCH to slow down by one cycle, at frequencies above 3 GHz. One way the Intel Core i-series resolves this conundrum is to move the memory controller into the CPU.
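
The arithmetic behind this can be checked in a few lines of Python. The sketch uses the speed of light in a vacuum as an upper bound (signals in real wires travel somewhat slower); the function names are made up for illustration.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in a vacuum


def travel_time_ns(distance_m: float) -> float:
    """Time for light to cover the distance, in nanoseconds."""
    return distance_m / SPEED_OF_LIGHT_M_PER_S * 1e9


def cycle_time_ns(frequency_hz: float) -> float:
    """Duration of one clock cycle, in nanoseconds."""
    return 1 / frequency_hz * 1e9


one_way = travel_time_ns(0.10)  # 10 cm between CPU and MCH
cycle = cycle_time_ns(3e9)      # one cycle at 3 GHz

print(f"signal travel time: {one_way:.3f} ns")
print(f"clock cycle:        {cycle:.3f} ns")
```

Both values come out at roughly a third of a nanosecond, so at about 3 GHz a one-way trip already takes as long as a whole clock cycle.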

Issue 135: Part 2 – Unifying the CPU and MCH (post-2008)

A modern CPU is manufactured through a process called photolithography, by which the CPU components are etched onto the silicon substrate by successive layers of chemicals, masking, and laser exposure. When the CPU components could be made small enough, the MCH and CPU were designed onto the same chip, and this is the design used by the Intel Core i7 (1st-gen).

Issue 136: The mobile workstation – laptops

Slim laptops have been undergoing a gradual transition: more and more of their chips are no longer available as a replaceable card, but instead soldered directly to the mainboard. Since 2017/2018, most slim laptops pretty much have CPU, memory, storage, and network chips all soldered directly to the mainboard.

Issue 137: The M1 Macbook Air

The M1 goes one step further: not only does it make do with fewer chips, it does so with passive cooling!

Issue 138: System-on-Chip (SoC)

A system-on-chip (SoC) combines the core functionality of a system—processing, graphics, memory, and control—into a single chip package.

Issue 139: What’s before this line is mine, what’s after this line is yours

Around 2015, the high-performance computer industry quickly realised that processing would be much more efficient if the CPU and GPU could share the same memory.

Issue 140: The shared memory dream

Shared memory is easier to implement when a company has control over the designs of both CPU and GPU.

Issue 141: The Apple A14 and M1

The Apple A14 and Apple M1 are essentially the same chip architecture: they use almost the same building blocks, just with different numbers of them. On top of that, the Apple M1 implements unified memory, allowing the CPU and GPU (and other SoC components) to share the same system memory, greatly facilitating intra-chip communication.

Issue 142: Implications (Part 1) - Software

Using the same hardware for both smartphones and laptops would make it much easier to write apps for both platforms. The closer they are in features, hardware, and software support, the easier things will be for developers.

Issue 143: Implications (Part 2) – Future Goals

Issue 144: Programs-in-a-vat

In 1999, VMware launched VMware Workstation, which allowed multiple operating systems to run on a single machine.

Issue 145: What an app wants, what an app needs

Programs do not usually deal with the gnarly details of hardware, but instead access it through an interface. They access storage devices through a filesystem, and access hardware through drivers.

Issue 146: Virtual hardware

Virtual hardware can be created in the form of drivers that respond to a program’s requests for hardware resources. If a bootup program enumerates hardware devices and receives a response, then as long as it continues to receive valid and correct responses, it can work with the virtual hardware to run an operating system.

Issue 147: Operating systems on virtual hardware

Running a virtual machine is like running a physical machine, but within a window in your OS.

Issue 148: History of commercial computing - cohosting

Renting out virtual hardware instead of physical hardware meant that instead of having to move hardware around and manage it, you could send the data for running an OS to the hosting company and have them be responsible for hardware operations.

Issue 149: History of commercial computing - containerisation

Containers are one layer of virtualisation above virtual machines: containerisation systems virtualise access to the operating system, presenting a virtual interface that provides software with the resources it needs, without that software being aware of software running in other containers on the same system.

Issue 150: System VMs vs Process VMs

System VMs provide a set of virtualised hardware that the OS interacts with. Process VMs provide a set of libraries that a program (written in the VM’s programming language) interacts with.

Issue 151: the Java VM

The Java Runtime Environment (JRE) bundles the Java VM and supporting libraries. The JRE has to be installed on the user’s system for Java programs to work, unless the program bundles the supporting libraries. Solo programmers can start programming with OpenJDK for free with fewer features and less support, while commercial companies can license Oracle JDK for better support and features.

Issue 152: Getting started with programming

Actually making a web application requires you to set up lots of supporting software and carry out lots of steps to create a suitable app environment.

Issue 153: Using the cloud

The cloud offers standard digital business services, accessible through a web interface and API, which any developer (with a credit card) can use. Developers don’t have to reinvent the wheel, so long as they know how to use web APIs.

Issue 154: Emulation

Programs that were not compiled for the instruction set of the host processor have to go through an emulation layer program. The emulation layer translates the program’s instructions into compatible instructions that the host processor can execute.

Issue 155: Emulation performance

Translating a set of instructions before executing it will always lead to a slowdown, although sometimes this may not be noticeable to users.

Issue 156: Translation

To speed up execution and avoid translation overhead, some systems employ ahead-of-time translation, storing the translated instructions to be executed in future. But many systems employ a mix of just-in-time (JIT) and ahead-of-time (AOT) techniques.

Issue 157: NTP and time-syncing

Time is synchronised from higher-precision sources through a protocol called Network Time Protocol (NTP). A public pool of time servers is available for synchronisation at pool.ntp.org.
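
At the heart of NTP is a small calculation based on four timestamps from one request and reply. The sketch below shows that calculation only, with made-up timestamps; it does not contact a time server, and the function name is an assumption for illustration.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """
    t0: client clock when the request was sent
    t1: server clock when the request arrived
    t2: server clock when the reply was sent
    t3: client clock when the reply arrived
    Returns (estimated clock offset, round-trip network delay), in seconds.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2
    delay = (t3 - t0) - (t2 - t1)
    return offset, delay


# Made-up scenario: the client clock is 5 s behind the server, each network
# direction takes 0.1 s, and the server spends 0.01 s handling the request.
offset, delay = ntp_offset_and_delay(100.0, 105.1, 105.11, 100.21)
print(f"offset: {offset:.2f} s, delay: {delay:.2f} s")
```

The offset estimate is only exact when the request and the reply take equally long to travel, which is the simplifying assumption built into the formula.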

Issue 158: GPS

To get your location using GPS, your phone receives information from at least four overhead GPS satellites: their location, and the time at which each signal was sent, from which the phone works out the distance between each satellite and itself. With this information, your phone can calculate its location.
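
Once the distances are known, finding the position is a geometry problem. The following is a simplified sketch in two dimensions with three made-up beacons at known positions; real GPS works in three dimensions and also has to correct for the phone's imprecise clock, which is why a fourth satellite is needed. The function name `locate_2d` is an assumption for illustration.

```python
import math


def locate_2d(beacons, distances):
    """Find the (x, y) point at the given distances from three known beacons."""
    (x1, y1), (x2, y2), (x3, y3) = beacons
    r1, r2, r3 = distances
    # Subtracting the first circle equation from the other two leaves
    # two linear equations in x and y: a*x + b*y = c.
    a1, b1 = 2 * (x1 - x2), 2 * (y1 - y2)
    c1 = r2**2 - r1**2 - x2**2 - y2**2 + x1**2 + y1**2
    a2, b2 = 2 * (x1 - x3), 2 * (y1 - y3)
    c2 = r3**2 - r1**2 - x3**2 - y3**2 + x1**2 + y1**2
    # Solve the 2x2 system with Cramer's rule.
    det = a1 * b2 - a2 * b1
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y


beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position = (3.0, 4.0)
distances = [math.dist(true_position, b) for b in beacons]

print(locate_2d(beacons, distances))
```

The beacons must not lie on a single straight line, otherwise the two equations carry the same information and `det` becomes zero.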

Issue 159: Wifi & cell tower location tracking

Instead of GPS satellites, smartphones can also use wifi points and cell towers to determine their position (if enabled in the OS).

Issue 160: CDNs and content distribution

A content delivery network comprises multiple servers around the world that are able to quickly distribute static content (typically images and video) to viewers that request it. This avoids overloading the hosting server, which would otherwise have to serve data over the network, possibly through many intermediary hops.

Issue 161: Security and XSS

Cross-site scripting attacks occur when a webpage loads malicious code from a third party, usually by way of a script in the page. Today, websites can be protected from loading unauthorised scripts through a Content Security Policy (CSP) enforced by browsers, which only allows a website to load scripts from authorised domains.

Issue 162: Fonts

Typeface families consist of multiple fonts, one for each style in the typeface. Each font consists of glyphs, which are mathematical shapes described by curves joining points. These shapes need to be rasterised for display on a computer screen, or for printing on paper. Font files usually come in .ttf, .otf, or .woff formats.

Issue 163: System & software ecosystems

Software that we use usually comes from the OS makers, or from third-party developers. These two groups of developers are not the same, and might even have conflicting intentions and goals.

Issue 164: Linux, the universal operating system

Linux software is distributed through Linux distros. The maintainers of distros maintain repositories of software that have been tested with the distro. Most users will access software in the distro’s repositories through a program called a package manager. This gives users full control over when updates and new software are installed.

Issue 165: The myths of system slowdown

There are easy and quick ways to check the validity of the most common advice for resolving system slowdown. But slowdown still seems to happen even after these tips have been tried.

Issue 166: A cause of system slowdown: caches

Caches speed up app operations by storing temporary data on the device’s storage. This assumes that access to storage is much faster than access to the data’s original source. On Android, users can clear an app’s cache, but not the system cache.

Issue 167: Database fragmentation

Fragmentation is likely a contributor to system slowdown, particularly for mobile devices: the databases used by most mobile apps tend to store data in many small chunks rather than fewer big chunks, which slows down data search operations. The most effective measure for improving device responsiveness is usually to clear the app cache, so the app does not attempt to read previous data from storage.

Issue 168: Search engines

A search engine uses bots to build up a database of URLs and their contents. The search engine uses various algorithms to determine the most relevant results for a search request.
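
A toy version of such a database is an index that maps each word to the pages containing it. The sketch below uses three made-up pages and a deliberately crude relevance score (the number of query words a page contains); real search engines use far more sophisticated ranking.

```python
from collections import defaultdict

# Hypothetical pages, as if already collected by a bot.
pages = {
    "https://example.com/a": "how to bake bread at home",
    "https://example.com/b": "bread and butter pudding recipe",
    "https://example.com/c": "how wifi works",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Rank pages by how many of the query's words they contain."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))


index = build_index(pages)
print(search(index, "how to bake bread"))
```

Page `a` contains all four query words and comes first; pages `b` and `c` each match one word and follow in alphabetical order.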

Issue 169: Search engine optimisation

By better understanding how search bots categorise pages, a website owner can use keywords and other techniques to optimise the ranking of their page for specific search terms.