The Epitome of Nerd-dom

A short while ago, I did a series of posts on computing based on the fact that I had done a lot of related research when studying the installation of Linux. I feel that I should now come clean and point out that between the time of that first post being written and now, I have tried and failed to install Ubuntu on an old laptop six times already, which has served to teach me even more about exactly how it works, and how it differs from is more mainstream competitors. So, since I don’t have any better ideas, I thought I might dedicate this post to Linux itself.

Linux is named after both its founder, Linus Torvalds, a Finnish programmer who finished compiling the Linux kernel in 1992, and Unix, the operating system that could be considered the grandfather of all modern OSs and which Torvalds based his design upon (note- whilst Torvald’s first name has a soft, extended first syllable, the first syllable of the word Linux should be a hard, short, sharp ‘ih’ sound). The system has its roots in the work of Richard Stallman, a lifelong pioneer and champion of the free-to-use, open source movement, who started the GNU project in 1983. His ultimate goal was to produce a free, Unix-like operating system, and in keeping with this he wrote a software license allowing anyone to use and distribute software associated with it so long as they stayed in keeping with the license’s terms (ie nobody can use the free software for personal profit). The software compiled as part of the GNU project was numerous (including a still widely-used compiler) and did eventually come to fruition as an operating system, but it never caught on and the project was, in regards to its achieving of its final aims, a failure (although the GNU General Public License remains the most-used software license of all time).

Torvalds began work on Linux as a hobby whilst a student in April 1991, using another Unix clone MINIX to write his code in and basing it on MINIX’s structure. Initially, he hadn’t been intending to write a complete operating system at all, but rather a type of display interface called a terminal emulator- a system that tries to emulate a graphical terminal, like a monitor, through a more text-based medium (I don’t really get it either- it’s hard to find information a newbie like me can make good sense of). Strictly speaking a terminal emulator is a program, existing independent of an operating system and acting almost like one in its own right, directly within the computer’s architecture. As such, the two are somewhat related and it wasn’t long before Torvalds ‘realised’ he had written a kernel for an operating system and, since the GNU operating system had fallen through and there was no widespread, free-to-use kernel out there, he pushed forward with his project. In August of that same year he published a now-famous post on a kind of early internet forum called Usenet, saying that he was developing an operating system that was “starting to get ready”, and asking for feedback concerning where MINIX was good and where it was lacking, “as my OS resembles it somewhat”. He also, interestingly,  said that his OS “probably never will support anything other than AT-harddisks”. How wrong that statement has proved to be.

When he finally published Linux, he originally did so under his own license- however, he borrowed heavily from GNU software in order to make it run properly (so to have a proper interface and such), and released later versions under the GNU GPL. Torvalds and his associates continue to maintain and update the Linux kernel (Version 3.0 being released last year) and, despite some teething troubles with those who have considered it old-fashioned, those who thought MINIX code was stolen (rather than merely borrowed from), and Microsoft (who have since turned tail and are now one of the largest contributors to the Linux kernel), the system is now regarded as the pinnacle of Stallman’s open-source dream.

One of the keys to its success lies in its constant evolution, and the interactivity of this process. Whilst Linus Torvalds and co. are the main developers, they write very little code themselves- instead, other programmers and members of the Linux community offer up suggestions, patches and additions to either the Linux distributors (more on them later) or as source code to the kernel itself. All the main team have to do is pick and choose the features they want to see included, and continually prune what they get to maximise the efficiency and minimise the vulnerability to viruses of the system- the latter being one of the key features that marks Linux (and OS X) over Windows. Other key advantages Linux holds includes its size and the efficiency with which it allocates CPU usage; whilst Windows may command a quite high percentage of your CPU capacity just to keep itself running, not counting any programs running on it, Linux is designed to use your CPU as efficiently as possible, in an effort to keep it running faster. The kernel’s open source roots mean it is easy to modify if you have the technical know-how, and the community of followers surrounding it mean that any problem you have with a standard distribution is usually only a few button clicks away. Disadvantages include a certain lack of user-friendliness to the uninitiated or not computer-literate user since a lot of programs require an instruction typed into the command bar, far fewer  programs, especially commercial, professional ones, than Windows, an inability to process media as well as OS X (which is the main reason Apple computers appear to exist), and a tendency to go wrong more frequently than commercial operating systems. Nonetheless, many ‘computer people’ consider this a small price to pay and flock to the kernel in their thousands.

However, the Linux kernel alone is not enough to make an operating system- hence the existence of distributions. Different distributions (or ‘distros’ as they’re known) consist of the Linux kernel bundled together with all the other features that make up an OS: software, documentation, window system, window manager, and desktop interface, to name but some. A few of these components, such as the graphical user interface (or GUI, which covers the job of several of the above components), or the package manager (that covers program installation, removal and editing), tend to be fairly ubiquitous (GNOME or KDE are common GUIs, and Synaptic the most typical package manager), but different people like their operating system to run in slightly different ways. Therefore, variations on these other components are bundled together with the kernel to form a distro, a complete package that will run as an operating system in exactly the same fashion as you would encounter with Windows or OS X. Such distros include Ubuntu (the most popular among beginners), Debian (Ubuntu’s older brother), Red Hat, Mandriva and Crunchbang- some of these, such as Ubuntu, are commercially backed enterprises (although how they make their money is a little beyond me), whilst others are entirely community-run, maintained solely thanks to the dedication, obsession and boundless free time of users across the globe.

If you’re not into all this computer-y geekdom, then there is a lot to dislike about Linux, and many an average computer user would rather use something that will get them sneered at by a minority of elitist nerds but that they know and can rely upon. But, for all of our inner geeks, the spirit, community, inventiveness and joyous freedom of the Linux system can be a wonderful breath of fresh air. Thank you, Mr. Torvalds- you have made a lot of people very happy.

Practical computing

This looks set to be my final post of this series about the history and functional mechanics of computers. Today I want to get onto the nuts & bolts of computer programming and interaction, the sort of thing you might learn as a budding amateur wanting to figure out how to mess around with these things, and who’s interested in exactly how they work (bear in mind that I am not one of these people and am, therefore, likely to get quite a bit of this wrong). So, to summarise what I’ve said in the last two posts (and to fill in a couple of gaps): silicon chips are massive piles of tiny electronic switches, memory is stored in tiny circuits that are either off or on, this pattern of off and on can be used to represent information in memory, memory stores data and instructions for the CPU, the CPU has no actual ability to do anything but automatically delegates through the structure of its transistors to the areas that do, the arithmetic logic unit is a dumb counting machine used to do all the grunt work and is also responsible, through the CPU, for telling the screen how to make the appropriate pretty pictures.

OK? Good, we can get on then.

Programming languages are a way of translating the medium of computer information and instruction (binary data) into our medium of the same: words and language. Obviously, computers do not understand that the buttons we press on our screen have symbols on them, that these symbols mean something to us and that they are so built to produce the same symbols on the monitor when we press them, but we humans do and that makes computers actually usable for 99.99% of the world population. When a programmer brings up an appropriate program and starts typing instructions into it, at the time of typing their words mean absolutely nothing. The key thing is what happens when their data is committed to memory, for here the program concerned kicks in.

The key feature that defines a programming language is not the language itself, but the interface that converts words to instructions. Built into the workings of each is a list of ‘words’ in binary, each word having a corresponding, but entirely different, string of data associated with it that represents the appropriate set of ‘ons and offs’ that will get a computer to perform the correct task. This works in one of two ways: an ‘interpreter’ is an inbuilt system whereby the programming is stored just as words and is then converted to ‘machine code’ by the interpreter as it is accessed from memory, but the most common form is to use a compiler. This basically means that once you have finished writing your program, you hit a button to tell the computer to ‘compile’ your written code into an executable program in data form. This allows you to delete the written file afterwards, makes programs run faster, and gives programmers an excuse to bum around all the time (I refer you here)

That is, basically how computer programs work- but there is one last, key feature, in the workings of a modern computer, one that has divided both nerds and laymen alike across the years and decades and to this day provokes furious debate: the operating system.

An OS, something like Windows (Microsoft), OS X (Apple) or Linux (nerds), is basically the software that enables the CPU to do its job of managing processes and applications. Think of it this way: whilst the CPU might put two inputs through a logic gate and send an output to a program, it is the operating system that will set it up to determine exactly which gate to put it through and exactly how that program will execute. Operating systems are written onto the hard drive, and can, theoretically, be written using nothing more than a magnetized needle, a lot of time and a plethora of expertise to flip the magnetically charged ‘bits’ on the hard disk. They consist of many different parts, but the key feature of all of them is the kernel, the part that manages the memory, optimises the CPU performance and translates programs from memory to screen. The precise translation and method by which this latter function happens differs from OS to OS, hence why a program written for Windows won’t work on a Mac, and why Android (Linux-powered) smartphones couldn’t run iPhone (iOS) apps even if they could access the store. It is also the cause of all the debate between advocates of different operating systems, since different translation methods prioritise/are better at dealing with different things, work with varying degrees of efficiency and are more  or less vulnerable to virus attack. However, perhaps the most vital things that modern OS’s do on our home computers is the stuff that, at first glance seems secondary- moving stuff around and scheduling. A CPU cannot process more than one task at once, meaning that it should not be theoretically possible for a computer to multi-task; the sheer concept of playing minesweeper whilst waiting for the rest of the computer to boot up and sort itself out would be just too outlandish for words. However, a clever piece of software called a scheduler in each OS which switches from process to process very rapidly (remember computers run so fast that they can count to a billion, one by one, in under a second) to give the impression of it all happening simultaneously. Similarly, a kernel will allocate areas of empty memory for a given program to store its temporary information and run on, but may also shift some rarely-accessed memory from RAM (where it is accessible) to hard disk (where it isn’t) to free up more space (this is how computers with very little free memory space run programs, and the time taken to do this for large amounts of data is why they run so slowly) and must cope when a program needs to access data from another part of the computer that has not been specifically allocated a part of that program.

If I knew what I was talking about, I could witter on all day about the functioning of operating systems and the vast array of headache-causing practicalities and features that any OS programmer must consider, but I don’t and as such won’t. Instead, I will simply sit back, pat myself on the back for having actually got around to researching and (after a fashion) understanding all this, and marvel at what strange, confusing, brilliant inventions computers are.

What we know and what we understand are two very different things…

If the whole Y2K debacle over a decade ago taught us anything, it was that the vast majority of the population did not understand the little plastic boxes known as computers that were rapidly filling up their homes. Nothing especially wrong or unusual about this- there’s a lot of things that only a few nerds understand properly, an awful lot of other stuff in our life to understand, and in any case the personal computer had only just started to become commonplace. However, over 12 and a half years later, the general understanding of a lot of us does not appear to have increased to any significant degree, and we still remain largely ignorant of these little feats of electronic witchcraft. Oh sure, we can work and operate them (most of us anyway), and we know roughly what they do, but as to exactly how they operate, precisely how they carry out their tasks? Sorry, not a clue.

This is largely understandable, particularly given the value of ‘understand’ that is applicable in computer-based situations. Computers are a rare example of a complex system that an expert is genuinely capable of understanding, in minute detail, every single aspect of the system’s working, both what it does, why it is there, and why it is (or, in some cases, shouldn’t be) constructed to that particular specification. To understand a computer in its entirety, therefore, is an equally complex job, and this is one very good reason why computer nerds tend to be a quite solitary bunch, with quite few links to the rest of us and, indeed, the outside world at large.

One person who does not understand computers very well is me, despite the fact that I have been using them, in one form or another, for as long as I can comfortably remember. Over this summer, however, I had quite a lot of free time on my hands, and part of that time was spent finally relenting to the badgering of a friend and having a go with Linux (Ubuntu if you really want to know) for the first time. Since I like to do my background research before getting stuck into any project, this necessitated quite some research into the hows and whys of its installation, along with which came quite a lot of info as to the hows and practicalities of my computer generally. I thought, then, that I might spend the next couple of posts or so detailing some of what I learned, building up a picture of a computer’s functioning from the ground up, and starting with a bit of a history lesson…

‘Computer’ was originally a job title, the job itself being akin to accountancy without the imagination. A computer was a number-cruncher, a supposedly infallible data processing machine employed to perform a range of jobs ranging from astronomical prediction to calculating interest. The job was a fairly good one, anyone clever enough to land it probably doing well by the standards of his age, but the output wasn’t. The human brain is not built for infallibility and, not infrequently, would make mistakes. Most of these undoubtedly went unnoticed or at least rarely caused significant harm, but the system was nonetheless inefficient. Abacuses, log tables and slide rules all aided arithmetic manipulation to a great degree in their respective fields, but true infallibility was unachievable whilst still reliant on the human mind.

Enter Blaise Pascal, 17th century mathematician and pioneer of probability theory (among other things), who invented the mechanical calculator aged just 19, in 1642. His original design wasn’t much more than a counting machine, a sequence of cogs and wheels so constructed as to able to count and convert between units, tens, hundreds and so on (ie a turn of 4 spaces on the ‘units’ cog whilst a seven was already counted would bring up eleven), as well as being able to work with currency denominations and distances as well. However, it could also subtract, multiply and divide (with some difficulty), and moreover proved an important point- that a mechanical machine could cut out the human error factor and reduce any inaccuracy to one of simply entering the wrong number.

Pascal’s machine was both expensive and complicated, meaning only twenty were ever made, but his was the only working mechanical calculator of the 17th century. Several, of a range of designs, were built during the 18th century as show pieces, but by the 19th the release of Thomas de Colmar’s Arithmometer, after 30 years of development, signified the birth of an industry. It wasn’t a large one, since the machines were still expensive and only of limited use, but de Colmar’s machine was the simplest and most reliable model yet. Around 3,000 mechanical calculators, of various designs and manufacturers, were sold by 1890, but by then the field had been given an unexpected shuffling.

Just two years after de Colmar had first patented his pre-development Arithmometer, an Englishmen by the name of Charles Babbage showed an interesting-looking pile of brass to a few friends and associates- a small assembly of cogs and wheels that he said was merely a precursor to the design of a far larger machine: his difference engine. The mathematical workings of his design were based on Newton polynomials, a fiddly bit of maths that I won’t even pretend to understand, but that could be used to closely approximate logarithmic and trigonometric functions. However, what made the difference engine special was that the original setup of the device, the positions of the various columns and so forth, determined what function the machine performed. This was more than just a simple device for adding up, this was beginning to look like a programmable computer.

Babbage’s machine was not the all-conquering revolutionary design the hype about it might have you believe. Babbage was commissioned to build one by the British government for military purposes, but since Babbage was often brash, once claiming that he could not fathom the idiocy of the mind that would think up a question an MP had just asked him, and prized academia above fiscal matters & practicality, the idea fell through. After investing £17,000 in his machine before realising that he had switched to working on a new and improved design known as the analytical engine, they pulled the plug and the machine never got made. Neither did the analytical engine, which is a crying shame; this was the first true computer design, with two separate inputs for both data and the required program, which could be a lot more complicated than just adding or subtracting, and an integrated memory system. It could even print results on one of three printers, in what could be considered the first human interfacing system (akin to a modern-day monitor), and had ‘control flow systems’ incorporated to ensure the performing of programs occurred in the correct order. We may never know, since it has never been built, whether Babbage’s analytical engine would have worked, but a later model of his difference engine was built for the London Science Museum in 1991, yielding accurate results to 31 decimal places.

…and I appear to have run on a bit further than intended. No matter- my next post will continue this journey down the history of the computer, and we’ll see if I can get onto any actual explanation of how the things work.