“Lies, damn lies, and statistics”

Ours is the age of statistics; of number-crunching, of quantifying, of defining everything by what it means in terms of percentages and comparisons. Statistics crop up in every walk of life, to some extent or other, in fields as widespread as advertising and sport. Many people’s livelihoods now depend on their ability to crunch the numbers, to come up with data and patterns, and much of our society’s increasing ability to do awesome things can be traced back to someone making the numbers dance.

In fact, most of what we think of as ‘statistics’ is not really statistics at all, but merely numbers; to a pedantic mathematician, a statistic is a mathematical function of a sample of data, not of the whole ‘population’ we are considering. We use statistics when it would be impractical to measure the whole population, usually because it’s too large, and we instead try to mathematically model that population based on a small sample of it. Thus, next to no sporting ‘statistics’ are true statistics, as they tend to cover the whole game; if I heard during a rugby match that “Leicester had 59% of the possession”, that is nothing more than a number, or, to use the mathematical term, a parameter. A statistic would be to say “from our sample [of one game] we can conclude that Leicester control an average of 59% of the possession when they play rugby”, but that would be a rash conclusion, since we can’t extrapolate Leicester’s normal behaviour from a single match. It is for this reason that complex mathematical formulae are used to determine the uncertainty of a conclusion drawn from a statistical test, based on the size of the sample we are testing compared to the overall size of the population we are trying to model. These uncertainty levels are often brushed under the carpet when pseudoscientists try to make dramatic, sweeping claims about something, but they are possibly the most important feature of modern statistics.
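To make that idea a little more concrete, here is a minimal sketch in Python; the possession figures are entirely invented, and the standard error formula (spread divided by the square root of the sample size) is just one simple way of putting a number on that uncertainty.

```python
# A rough sketch of the sample-versus-population idea. All the 'match'
# possession figures below are invented purely for illustration.
import math
import statistics

# Invented possession percentages for every match in a season (the 'population')
season_possession = [59, 48, 62, 51, 55, 47, 60, 53, 58, 49, 64, 52]

for n in (1, 4, 12):
    sample = season_possession[:n]     # pretend we only watched n matches
    mean = statistics.mean(sample)
    # with a sample of one we cannot even estimate the spread, let alone the error
    spread = statistics.stdev(sample) if n > 1 else float("nan")
    error = spread / math.sqrt(n)      # standard error of the mean: shrinks as n grows
    print(f"sample of {n}: estimated possession {mean:.1f}%, uncertainty ±{error:.1f}")
```

The point of the sketch is simply that the bigger the sample, the smaller the uncertainty attached to any conclusion drawn from it.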

Another weapon for the poor statistician is the misapplication of the idea of correlation. Correlation is basically what you get when you take two variables, plot them against one another on a graph, and find a nice neat line joining them, suggesting that the two are in some way related. Correlation tends to get scientists very excited, since if two things are linked then it suggests you can make one thing happen by doing another, which is often a very useful thing; such a link is known as a causal relationship. However, whilst correlation and causation frequently do go hand in hand, the first lesson every statistician learns is this: correlation DOES NOT imply causation.
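For a quick sense of what a correlation coefficient actually looks like, here is a tiny Python sketch (assuming numpy is available; the figures are made up):

```python
# Two invented variables that rise together give a correlation coefficient
# close to +1 - the numerical version of the 'nice neat line' on a graph.
import numpy as np

hours_of_rain = np.array([1, 2, 3, 4, 5, 6])
umbrellas_sold = np.array([12, 19, 33, 38, 52, 60])   # invented figures

r = np.corrcoef(hours_of_rain, umbrellas_sold)[0, 1]
print(f"correlation coefficient: {r:.2f}")   # close to 1: strongly correlated
```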

Imagine, for instance, you have a cold. You feel like crap, your head is spinning, you’re dehydrated and you can’t breathe through your nose. If we were, during the period before, during and after your cold, to plot a graph of your relative ability to breathe through your nose against the severity of your headache (yeah, not very scientific, I know), these two variables would correlate, since they happen at the same time due to the cold. However, if I were to decide that this correlation implies causation, then I would draw the conclusion that all I need to do to give you a terrible headache is to plug your nose with tissue paper so you can’t breathe through it. In this case, I have ignored the possibility (and, as it transpires, the eventuality) of there being a third variable (the cold virus) that causes both of the other two variables, and this is very hard to investigate without poking our heads out of the numbers and looking at the real world. There are statistical techniques that enable us to do this, but they are for another time.
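The ‘third variable’ problem is easy to reproduce numerically; here is a hedged sketch in Python (again assuming numpy, with all the numbers invented) where an unseen ‘cold severity’ drives both symptoms:

```python
# An invented 'cold severity' drives both a blocked-up nose and a headache,
# so the two symptoms correlate strongly even though neither causes the other.
import numpy as np

rng = np.random.default_rng(1)
cold_severity = rng.uniform(0, 10, size=200)           # the hidden common cause

blocked_nose = cold_severity + rng.normal(0, 1, 200)   # both symptoms track the cold
headache     = cold_severity + rng.normal(0, 1, 200)

r = np.corrcoef(blocked_nose, headache)[0, 1]
print(f"blocked nose vs headache correlation: {r:.2f}")   # high, but not causal
```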

Whilst this example was more childish than anything, mis-extrapolation of a correlation can have deadly consequences. One example, explored in Ben Goldacre’s Bad Science, concerns beta-carotene, an antioxidant found in carrots. In 1981 an epidemiologist called Richard Peto published a meta-analysis (a post for another time) of a series of scientific studies suggesting that people with high beta-carotene levels showed a reduced risk of cancer. At the time, antioxidants were considered the wonder-substance of the nutrition world, and everyone got on board with the idea that beta-carotene was awesome stuff. However, all of the studies examined were observational ones: taking a lot of different people, seeing what their beta-carotene levels were and then examining whether or not they had cancer or developed it in later life. None of the studies actually gave their subjects beta-carotene and then saw if that affected their cancer risk, and this prompted the editor of Nature (the scientific journal in which Peto’s paper was published) to include a footnote reading:

Unwary readers (if such there are) should not take the accompanying article as a sign that the consumption of large quantities of carrots (or other dietary sources of beta-carotene) is necessarily protective against cancer.

The editor’s footnote quickly proved a well-judged one; a study conducted in Finland some time afterwards actually gave participants at high risk of lung cancer beta-carotene, and found that their risk both of getting the cancer and of dying was higher than for the ‘placebo’ control group. A later study, named CARET (Carotene And Retinol Efficacy Trial), also tested groups at a high risk of lung cancer, giving half of them a mixture of beta-carotene and vitamin A and the other half placebos. The idea was to run the trial for six years and see how many illnesses and deaths each group ended up with; but after preliminary data showed that those taking the antioxidant tablets were 46% more likely to die from lung cancer, it was decided that it would be unethical to continue and the trial was terminated early. Had the hype around the Nature article been allowed to get out of hand before this research was done, it could have put at risk thousands of people who hadn’t read the article properly; and all because of the dangers of assuming correlation = causation.

This wasn’t really the gentle ramble through statistics I originally intended it to be, but there you go; stats. Next time, something a little less random. Maybe.

Practical computing

This looks set to be my final post of this series about the history and functional mechanics of computers. Today I want to get onto the nuts & bolts of computer programming and interaction, the sort of thing you might learn as a budding amateur wanting to figure out how to mess around with these things, and who’s interested in exactly how they work (bear in mind that I am not one of these people and am, therefore, likely to get quite a bit of this wrong). So, to summarise what I’ve said in the last two posts (and to fill in a couple of gaps): silicon chips are massive piles of tiny electronic switches; memory is stored in tiny circuits that are either off or on; this pattern of off and on can be used to represent information in memory; memory stores data and instructions for the CPU; the CPU has no actual ability to do anything itself but automatically delegates, through the structure of its transistors, to the areas that do; and the arithmetic logic unit is a dumb counting machine used to do all the grunt work, which is also responsible, through the CPU, for telling the screen how to make the appropriate pretty pictures.

OK? Good, we can get on then.

Programming languages are a way of translating the medium of computer information and instruction (binary data) into our medium of the same: words and language. Obviously, computers do not understand that the buttons we press on our keyboard have symbols on them, that these symbols mean something to us, or that they are built to produce the same symbols on the monitor when we press them; but we humans do, and that makes computers actually usable for 99.99% of the world’s population. When a programmer brings up an appropriate program and starts typing instructions into it, at the time of typing their words mean absolutely nothing to the machine. The key thing is what happens when their data is committed to memory, for here the program concerned kicks in.

The key feature that defines a programming language is not the language itself, but the interface that converts words to instructions. Built into the workings of each is a list of ‘words’ in binary, each word having a corresponding, but entirely different, string of data associated with it that represents the appropriate set of ‘ons and offs’ that will get a computer to perform the correct task. This works in one of two ways: an ‘interpreter’ is an inbuilt system whereby the program is stored just as words and is converted to ‘machine code’ by the interpreter as it is accessed from memory, but the more common approach is to use a compiler. This basically means that once you have finished writing your program, you hit a button to tell the computer to ‘compile’ your written code into an executable program in data form. This allows you to delete the written file afterwards, makes programs run faster, and gives programmers an excuse to bum around all the time (I refer you here).
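As a rough, hands-on illustration of that ‘translation’ step, here is a minimal sketch using Python, which happens to compile the words you type into an intermediate ‘bytecode’ before running them; the one-line program being translated is invented for the example.

```python
# A rough illustration of the 'translation' step: Python compiles source text
# into lower-level instructions before running it.
import dis

source = "answer = 6 * 7"                  # the 'words' a programmer writes
code = compile(source, "<demo>", "exec")   # translate them into instructions

dis.dis(code)    # show the low-level instructions the machine-ish layer sees
exec(code)       # actually run them
print(answer)    # -> 42
```

A compiled language does something conceptually similar, except that the translated instructions are saved as a stand-alone executable rather than produced each time the program runs.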

That is, basically, how computer programs work- but there is one last key feature in the workings of a modern computer, one that has divided nerds and laymen alike across the years and decades and to this day provokes furious debate: the operating system.

An OS, something like Windows (Microsoft), OS X (Apple) or Linux (nerds), is basically the software that enables the CPU to do its job of managing processes and applications. Think of it this way: whilst the CPU might put two inputs through a logic gate and send an output to a program, it is the operating system that will set it up to determine exactly which gate to put it through and exactly how that program will execute. Operating systems are written onto the hard drive, and could, theoretically, be written using nothing more than a magnetized needle, a lot of time and a plethora of expertise to flip the magnetically charged ‘bits’ on the hard disk. They consist of many different parts, but the key feature of all of them is the kernel, the part that manages the memory, optimises the CPU’s performance and translates programs from memory to screen. The precise method by which this latter function happens differs from OS to OS, hence why a program written for Windows won’t work on a Mac, and why Android (Linux-powered) smartphones couldn’t run iPhone (iOS) apps even if they could access the store. It is also the cause of all the debate between advocates of different operating systems, since different translation methods prioritise (or are better at dealing with) different things, work with varying degrees of efficiency and are more or less vulnerable to virus attack.

However, perhaps the most vital things that modern OSs do on our home computers are the things that at first glance seem secondary: moving stuff around and scheduling. A (single-core) CPU cannot process more than one task at once, meaning that it should not theoretically be possible for a computer to multi-task; the sheer concept of playing minesweeper whilst waiting for the rest of the computer to boot up and sort itself out would be just too outlandish for words. However, each OS contains a clever piece of software called a scheduler, which switches from process to process very rapidly (remember computers run so fast that they can count to a billion, one by one, in under a second) to give the impression of it all happening simultaneously. Similarly, a kernel will allocate areas of empty memory for a given program to store its temporary information and run on, but may also shift some rarely-accessed memory from RAM (where it is quickly accessible) to hard disk (where it isn’t) to free up more space (this is how computers with very little free memory run programs, and the time taken to do this for large amounts of data is why they run so slowly), and must cope when a program needs to access data from another part of the computer that has not been specifically allocated to that program.
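To show the scheduling trick in miniature, here is a toy sketch in Python; the task names and time slices are made up, and real schedulers are vastly more sophisticated, but the round-robin idea is the same.

```python
# A toy scheduler: the 'CPU' only ever runs one task at a time, but by
# switching between them very quickly everything appears to run at once.
from collections import deque

tasks = deque([("minesweeper", 3), ("boot_sequence", 5), ("music_player", 2)])

while tasks:
    name, work_left = tasks.popleft()    # pick the next task in the queue
    work_left -= 1                       # give it one 'time slice' of CPU
    print(f"ran {name} for one slice, {work_left} slices remaining")
    if work_left > 0:
        tasks.append((name, work_left))  # not finished: back of the queue
```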

If I knew what I was talking about, I could witter on all day about the functioning of operating systems and the vast array of headache-causing practicalities and features that any OS programmer must consider, but I don’t and as such won’t. Instead, I will simply sit back, pat myself on the back for having actually got around to researching and (after a fashion) understanding all this, and marvel at what strange, confusing, brilliant inventions computers are.

Up one level

In my last post (well, last excepting Wednesday’s little topical deviation), I talked about the real nuts and bolts of a computer, detailing the function of the transistors that are so vital to the workings of a computer. Today, I’m going to take one step up and study a slightly broader picture, this time concerned with the integrated circuits that utilise such components to do the real grunt work of computing.

An integrated circuit is simply a circuit that is not made up of multiple, separate electronic components- in effect, whilst a standard circuit might consist of a few bits of metal and plastic connected to one another by wires, in an IC they are all stuck in the same place and all assembled as one. The main advantage of this is that since all the components don’t have to be manually stuck to one another, but are built in circuit form from the start, there is no worrying about the fiddliness of assembly and they can be mass-produced quickly and cheaply with components on a truly microscopic scale. They generally consist of several layers on top of the silicon itself, simply to allow space for all of the metal connecting tracks and insulating materials to run over one another (this pattern is usually, perhaps ironically, worked out on a computer), and the sheer detail required of their manufacture surely makes them one of the marvels of the engineering world.

But… how do they make a computer work? Well, let’s start by looking at a computer’s memory, which in all modern computers takes the form of semiconductor memory. This consists of millions upon millions of microscopically small circuits known as memory circuits, each of which contains one or more transistors. Computers are electronic, meaning the only thing they understand is electricity- for the sake of simplicity and reliability, this takes the form of whether the current flowing in a given memory circuit is ‘on’ or ‘off’. If the switch is on, then the circuit is represented as a 1, or a 0 if it is switched off. These memory circuits are generally grouped together, and so each group will consist of an ordered pattern of ones and zeroes, of which there are many different permutations. This method of counting in ones and zeroes is known as binary arithmetic, and is sometimes thought of as the simplest form of counting. On a hard disk, patches of magnetically charged material represent binary information rather than memory circuits.

Each little memory circuit, with its simple on/off value, represents one bit of information. 8 bits grouped together forms a byte, and there may be billions of bytes in a computer’s memory. The key task of a computer programmer is, therefore, to ensure that all the data that a computer needs to process is written in binary form- but this pattern of 1s and 0s might be needed to represent any information from the content of an email to the colour of one pixel of a video. Clearly, memory on its own is not enough, and the computer needs some way of translating the information stored into the appropriate form.
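A small, hedged illustration of the bits-and-bytes idea in Python: here we peek at the single byte that, in the standard ASCII/UTF-8 encoding, stands for the letter ‘A’.

```python
# Every piece of data ends up as a pattern of 1s and 0s. Here is the byte
# that represents the letter 'A' in ASCII/UTF-8.
letter = "A"
byte_value = letter.encode("utf-8")[0]     # the byte as a number: 65
bit_pattern = format(byte_value, "08b")    # the same byte as 8 bits

print(byte_value)    # 65
print(bit_pattern)   # 01000001 - one byte, eight on/off 'switches'
```

The same eight switches could equally well be part of a pixel colour or a sound sample; which is exactly why the computer needs something to tell it how to interpret them.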

A computer’s tool for doing this is known as a logic gate, a simple electronic device consisting of (you guessed it) yet more transistor switches. This takes one or two binary inputs, either ‘on’ or ‘off’, and translates them into another value. There are three basic types: AND gates (if both inputs equal 1, output equals 1- otherwise, output equals 0), OR gates (if either input equals 1, output equals 1- if both inputs equal 0, output equals 0), and NOT gates (if the input equals 1, output equals 0; if the input equals 0, output equals 1). The NOT gate is the only one of these with a single input, and combinations of these gates can perform other functions too, such as NAND (not-and) or XOR (exclusive OR: if exactly one input equals 1, output equals 1, but if both inputs are the same, output equals 0) gates. A computer’s CPU (central processing unit) will contain vast numbers of these, connected up in such a way as to link various parts of the computer together appropriately, translate the instructions in memory into what function a given program should be performing, and thus cause the relevant bit (if you’ll pardon the pun) of information to translate into the correct process for the computer to perform.
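The gates are easy to mimic in software; here is a minimal Python sketch of the basic gates, plus an XOR built out of the simpler ones to show how combining gates gives new behaviour (in a real chip this is all done with transistors, of course).

```python
# The basic gates as functions on 1s and 0s, plus XOR built from them.
def AND(a, b): return 1 if a == 1 and b == 1 else 0
def OR(a, b):  return 1 if a == 1 or b == 1 else 0
def NOT(a):    return 0 if a == 1 else 1

def NAND(a, b): return NOT(AND(a, b))
def XOR(a, b):  return AND(OR(a, b), NAND(a, b))  # 1 only when the inputs differ

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", XOR(a, b))
```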

For example, if you click on an icon on your desktop, your computer will put the position of your mouse and the input of the clicking action through an AND gate to determine that it should first highlight that icon. To do this, it orders the three different parts of each of the many pixels of that symbol to change their shade by a certain degree, and the part of the computer responsible for the monitor’s colour sends a message to the Arithmetic Logic Unit (ALU), the computer’s counting department, to ask what the numerical values of the old shades plus the highlighting are, to give it the new shades of colour for the various pixels. Oh, and the CPU should also open the program. To do this, its connections send a signal off to the memory to say that program X should open now. Another bit of the computer then searches through the memory to find program X, giving it the master ‘1’ signal that causes it to open. Now that it is open, this program routes a huge amount of data back through the CPU to tell it to change the pattern of pretty colours on the screen again, requiring another slew of data to go through the ALU, and flags that areas of the screen A, B and C are now all buttons, so if you click there then we’re going to have to go through this business all over again. Basically, the CPU’s logical function consists of ‘IF this AND/OR this happens, which signal do I send off to ask the right part of the memory what to do next?’. And it will do all this in a minuscule fraction of a second. Computers are amazing.

Obviously, nobody in their right mind is going to go through the whole business of telling the computer exactly what to do with each individual piece of binary data manually, because if they did nothing would ever get done. For this purpose, therefore, programmers have invented programming languages to translate their wishes into binary, and for a little more detail about them, tune in to my final post on the subject…

A Continued History

This post looks set to at least begin by following on directly from my last one- that dealt with the story of computers up to Charles Babbage’s difference and analytical engines, whilst this one will try to follow the history along from there until as close to today as I can manage, hopefully getting in a few of the basics of the workings of these strange and wonderful machines.

After Babbage’s death as a relatively unknown and unloved mathematician in 1871, the progress of the science of computing continued to tick over. A Dublin accountant named Percy Ludgate, working independently of Babbage, did develop his own programmable, mechanical computer at the turn of the century, but his design fell into a similar degree of obscurity and hardly added anything new to the field. Mechanical calculators had become viable commercial enterprises, getting steadily cheaper, and, as technological exercises, were becoming ever more sophisticated with the invention of the analogue computer. These were, basically, a less programmable version of the difference engine- mechanical devices whose various cogs and wheels were so connected up that they would perform one specific mathematical function on a set of data. James Thomson built the first in 1876, a machine that could solve differential equations by integration (a fairly simple but undoubtedly tedious mathematical task), and later developments were widely used to process military data and to solve problems concerning numbers too large to handle by human numerical methods. For a long time, analogue computers were considered the future of modern computing, but since they solved and modelled problems using physical phenomena rather than data they were restricted in capability to their original setup.

A perhaps more significant development came in the late 1880s, when an American named Herman Hollerith invented a method of machine-readable data storage in the form of cards punched with holes. Punched holes had been around for a while as a way of storing something rather like programs, such as the holed paper reels of a pianola or the punched cards used to automate the workings of a loom, but this was the first example of such devices being used to store data (although Babbage had theorised such an idea for the memory systems of his analytical engine). The cards were cheap, simple, could be both produced and read easily by a machine, and were even simple to dispose of. Hollerith’s team later went on to process the data of the 1890 US census, and his company would eventually become part of IBM. The pattern of holes on these cards could be ‘read’ by a mechanical device with a set of levers that would pass through a hole if one was present, turning the appropriate cogs to tell the machine to count up one. This system carried on being used on IBM systems right up until the 1980s, and could be argued to be the first programming language.

However, to see the story of the modern computer truly progress we must fast forward to the 1930s. Three interesting people and achievements came to the fore here: in 1937 George Stibitz, an American working at Bell Labs, built an electromechanical calculator that was the first to process data digitally using on/off binary electrical signals, making it arguably the first digital computer. In 1936, a bored German engineering student called Konrad Zuse dreamt up a method for processing his tedious design calculations automatically rather than by hand- to this end he devised the Z1, a table-sized calculator that could be programmed to a degree via perforated film and also operated in binary. His parts couldn’t be engineered well enough for it to ever work properly, but he kept at it, eventually building three more models and devising the first programming language. However, perhaps the most significant figure of 1930s computing was a young, homosexual, English maths genius called Alan Turing.

Turing’s first contribution to the computing world came in 1936, when he published a revolutionary paper showing that certain computing problems cannot be solved by one general algorithm. A key feature of this paper was his description of a ‘universal computer’, a machine capable of executing programs based on reading and manipulating a set of symbols on a strip of tape. The symbol on the tape currently being read would determine whether the machine would move up or down the strip, how far, and what it would change the symbol to, and Turing proved that one of these machines could replicate the behaviour of any computer algorithm- and since computers are just devices running algorithms, such a machine could replicate any modern computer too. Thus, if a Turing machine (as they are now known) could theoretically solve a problem, then so could a general algorithm, and vice versa if it couldn’t. These machines not only laid the foundations for computability and computation theory, on which nearly all of modern computing is built, but were also revolutionary as they were the first theorised to use the same medium for both data storage and programs, as nearly all modern computers do. This concept is known as a von Neumann architecture, after the man who first pointed out and explained the idea in response to Turing’s work.
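It is surprisingly easy to sketch the bare bones of such a machine; here is a minimal Python toy whose states and rules are made up purely for illustration (they simply flip every 0 to a 1 and vice versa, then halt).

```python
# A minimal Turing machine sketch: a tape of symbols, a read/write head, and
# a table of rules saying what to write, which way to move and what state to
# enter next. The rules below just invert a strip of 0s and 1s, then halt.
tape = list("0110")
head, state = 0, "flip"

# (state, symbol read) -> (symbol to write, move, next state)
rules = {
    ("flip", "0"): ("1", +1, "flip"),
    ("flip", "1"): ("0", +1, "flip"),
}

while state != "halt":
    symbol = tape[head] if head < len(tape) else "_"   # '_' marks blank tape
    if (state, symbol) not in rules:
        state = "halt"                                 # no matching rule: stop
    else:
        write, move, state = rules[(state, symbol)]
        tape[head] = write
        head += move

print("".join(tape))   # -> 1001
```

Swap in a different rule table and the same machinery computes something entirely different, which is precisely the point Turing was making.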

Turing machines contributed one further, vital concept to modern computing- that of Turing-completeness. A Turing-complete system is one capable of replicating the behaviour of any other theoretically possible Turing machine, and thus of running any possible algorithm or computable sequence (Turing himself described a single ‘Universal Turing machine’ able to do exactly this). Charles Babbage’s analytical engine would have fallen into that class had it ever been built, in part because it was capable of the ‘if X then do Y’ logical reasoning that characterises a computer rather than a calculator. Ensuring the Turing-completeness of a computer system or programming language is a key part of its design, guaranteeing its versatility and that it is capable of performing all the tasks that could be required of it.

Turing’s work had laid the foundations for nearly all the theoretical science of modern computing- now all the world needed was machines capable of performing the practical side of things. However, by 1942 there was a war on, and Turing was employed by the government’s code-breaking unit at Bletchley Park, Buckinghamshire. They had already cracked the Germans’ Enigma code, but that had been a comparatively simple task since they knew the structure and internal layout of the Enigma machine. They were then faced by a new and more daunting prospect: the Lorenz cipher, encoded by an even more complex machine for which they had no blueprints. The team at Bletchley eventually worked out its logical functioning, and from this a method for deciphering it was formulated; but it required an iterative process that took hours of mind-numbing calculation to get a result out. A faster method of processing these messages was needed, and to this end an engineer named Tommy Flowers designed and built Colossus.

Colossus was a landmark of the computing world- the first electronic, digital, and partially programmable computer ever to exist. Its mathematical operation was not highly sophisticated: it used vacuum tubes alongside light-based reading and detection systems, all state-of-the-art electronics at the time, to read the pattern of holes on a paper tape containing the encoded messages, and then compared these to another pattern of holes generated internally from a simulation of the Lorenz machine in different configurations. If there were enough similarities (the machine could obviously not get a precise match, since it didn’t know the original message content) it flagged up that setup as a potential one for the message’s encryption, which could then be tested, saving many hundreds of man-hours. But despite its inherent simplicity, its legacy is simply one of proving a point to the world- that electronic, programmable computers were both possible and viable bits of hardware- and of paving the way for modern-day computing to develop.

What we know and what we understand are two very different things…

If the whole Y2K debacle over a decade ago taught us anything, it was that the vast majority of the population did not understand the little plastic boxes known as computers that were rapidly filling up their homes. Nothing especially wrong or unusual about this- there are a lot of things that only a few nerds understand properly, an awful lot of other stuff in our lives to understand, and in any case the personal computer had only just started to become commonplace. However, over 12 and a half years later, the general understanding of a lot of us does not appear to have increased to any significant degree, and we still remain largely ignorant of these little feats of electronic witchcraft. Oh sure, we can work and operate them (most of us, anyway), and we know roughly what they do, but as to exactly how they operate, precisely how they carry out their tasks? Sorry, not a clue.

This is largely understandable, particularly given the sense of ‘understand’ that applies in computer-based situations. Computers are a rare example of a complex system of which an expert is genuinely capable of understanding, in minute detail, every single aspect of the workings: what each part does, why it is there, and why it is (or, in some cases, shouldn’t be) constructed to that particular specification. To understand a computer in its entirety is, therefore, an equally complex job, and this is one very good reason why computer nerds tend to be a rather solitary bunch, with relatively few links to the rest of us and, indeed, the outside world at large.

One person who does not understand computers very well is me, despite the fact that I have been using them, in one form or another, for as long as I can comfortably remember. Over this summer, however, I had quite a lot of free time on my hands, and part of that time was spent finally relenting to the badgering of a friend and having a go with Linux (Ubuntu, if you really want to know) for the first time. Since I like to do my background research before getting stuck into any project, this necessitated quite some research into the hows and whys of its installation, along with which came quite a lot of info on the workings and practicalities of my computer generally. I thought, then, that I might spend the next couple of posts or so detailing some of what I learned, building up a picture of a computer’s functioning from the ground up, and starting with a bit of a history lesson…

‘Computer’ was originally a job title, the job itself being akin to accountancy without the imagination. A computer was a number-cruncher, a supposedly infallible data processing machine employed to perform a range of jobs ranging from astronomical prediction to calculating interest. The job was a fairly good one, anyone clever enough to land it probably doing well by the standards of his age, but the output wasn’t. The human brain is not built for infallibility and, not infrequently, would make mistakes. Most of these undoubtedly went unnoticed or at least rarely caused significant harm, but the system was nonetheless inefficient. Abacuses, log tables and slide rules all aided arithmetic manipulation to a great degree in their respective fields, but true infallibility was unachievable whilst still reliant on the human mind.

Enter Blaise Pascal, 17th century mathematician and pioneer of probability theory (among other things), who invented the mechanical calculator aged just 19, in 1642. His original design wasn’t much more than a counting machine, a sequence of cogs and wheels so constructed as to be able to count and convert between units, tens, hundreds and so on (i.e. a turn of 4 spaces on the ‘units’ cog whilst a seven was already counted would bring up eleven), and it could work with currency denominations and distances as well. It could also subtract, multiply and divide (with some difficulty), and moreover proved an important point- that a mechanical machine could cut out the human error factor and reduce any inaccuracy to one of simply entering the wrong number.

Pascal’s machine was both expensive and complicated, meaning only twenty were ever made, but his was the only working mechanical calculator of the 17th century. Several, of a range of designs, were built during the 18th century as show pieces, but by the 19th the release of Thomas de Colmar’s Arithmometer, after 30 years of development, signified the birth of an industry. It wasn’t a large one, since the machines were still expensive and only of limited use, but de Colmar’s machine was the simplest and most reliable model yet. Around 3,000 mechanical calculators, of various designs and manufacturers, were sold by 1890, but by then the field had been given an unexpected shuffling.

Just two years after de Colmar had first patented his pre-development Arithmometer, an Englishman by the name of Charles Babbage showed an interesting-looking pile of brass to a few friends and associates- a small assembly of cogs and wheels that he said was merely a precursor to the design of a far larger machine: his difference engine. The mathematical workings of his design were based on Newton polynomials, a fiddly bit of maths that I won’t even pretend to understand, but one that could be used to closely approximate logarithmic and trigonometric functions. However, what made the difference engine special was that the original setup of the device, the positions of the various columns and so forth, determined what function the machine performed. This was more than just a simple device for adding up; this was beginning to look like a programmable computer.

Babbage’s machine was not the all-conquering revolutionary design the hype about it might have you believe. Babbage was commissioned to build one by the British government for military purposes, but since he was often brash (once claiming that he could not fathom the idiocy of the mind that would think up a question an MP had just asked him) and prized academia above fiscal matters & practicality, the idea fell through. After investing £17,000 in his machine, the government realised that he had switched to working on a new and improved design known as the analytical engine, pulled the plug, and the machine never got made. Neither did the analytical engine, which is a crying shame; this was the first true computer design, with separate inputs for both data and the required program, which could be a lot more complicated than just adding or subtracting, and an integrated memory system. It could even print results on one of three printers, in what could be considered the first human interfacing system (akin to a modern-day monitor), and had ‘control flow systems’ incorporated to ensure that programs were performed in the correct order. We may never know, since it has never been built, whether Babbage’s analytical engine would have worked, but a later model of his difference engine was built for the London Science Museum in 1991, yielding accurate results to 31 decimal places.

…and I appear to have run on a bit further than intended. No matter- my next post will continue this journey down the history of the computer, and we’ll see if I can get onto any actual explanation of how the things work.