Artful Computing

In the present Covid-19 pandemic many governments are using mathematical modelling (implemented in computer programs) to inform their critical decisions. Those from non-mathematical backgrounds hear these words, but may not be clear about what we can and cannot reasonably do with computer models. At present we do not wish to put our faith in false prophets, but we also need to try to foresee the future course of events as best we can.

The current pandemic has brought computer modelling of one kind into sharp daily-news focus, but modelling is actually pervasive in modern life. My own background involves nearly 40 years of computational physics in the nuclear industry, where I constructed and used highly complex mathematical models, implemented in computer programs, to support both the daily operation and the safety assessment of nuclear reactors. I have many peers working in other areas of science and engineering, such as climate modelling or aircraft design, who are doing essentially similar and vital jobs. I developed both a great respect and a healthy scepticism for computational modelling, and I know that you need both sophisticated software and sophisticated processes for using such powerful tools: they can inform you, but they can also seriously mislead you.

A `model’ is something that displays certain selected characteristics of the appearance and behaviour of part of the real world. The `selected’ part of this definition is significant: we choose to capture in the model just those characteristics that are in some sense important for a particular purpose, and ignore features that are irrelevant to that purpose. Mathematical models are equations which constrain the relationships between their algebraic variables, such that the values these variables can take behave together like the selected real, measurable quantities. We can, for example, model the behaviour of an aircraft in a flight simulator by taking measurements of the pilot’s control inputs (e.g. stick, rudder and throttle) and then using our model to calculate the speed, direction and attitude that a real aircraft would take up. It will probably do so correctly, as long as we stay within the proven limits of validity of the model. Situations where such models are of high practical importance are frequently complicated, because the interesting parts of the world are complicated, and we have to solve the mathematics on a computer. Nevertheless, even when we can deploy extremely powerful computers we often have to simplify the mathematics (that is, ignore some of the finer detail in the world) in order to get answers in a reasonable time. (A weather forecast that arrives a day late, for example, is not very useful. One that is 24 hours early may be less accurate, but can still give useful warnings.)

So even our most sophisticated models (such as those used to predict the weather) still involve considerable simplifications, because the real world works at many different levels of detail and it is usually difficult to handle all of them at the same time. In the case of the weather, for example, we ultimately need to understand everything from the way tiny water drops start to form in clouds right up to global circulation patterns. A great deal of ingenuity and hard work goes into bridging these differences in scale, often very successfully, but sometimes we are thrown into situations where the approximations we use betray us. We have to be ready to recognise and deal with those failures.

There is a sense, therefore, in which none of these computer models can ever produce 100% accurate and reliable predictions, but that does not mean they are of no use. They can still be used effectively and safely, as long as you know what you are doing. The professionals spend a lot of time validating the accuracy of their models, learning just how far they may depart from reality, and learning how to recognise when this may be occurring. They learn the signs that show when models can be trusted and when they must be regarded with scepticism. (We often spend far more time doing this than building the computer programs in the first place.)

If we are asked for guidance to support a critical decision, we do not just `press the button’ and pass on the predictions. We worry about whether the outputs are very sensitive to the exact values of the inputs. If we are feeding real-world data into the models (data which is often intrinsically uncertain) then we had better give a range of predictions, based on the possible variations in that data, and try to work out which of those predictions we can consider improbable and which are more likely. The models, however, do not make the decisions for you: that requires judgement about the amount of risk people are prepared to accept from the potential adverse consequences that cannot be ruled out, the costs of mitigating those risks, and whether those costs are acceptable. These are human and perhaps political issues.
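To make that concrete, here is a minimal sketch in Python of how such a range of predictions can be produced. The `peak_demand` function is a made-up stand-in for a real model, and the input ranges are invented, purely to illustrate the workflow of sampling uncertain inputs and reporting a spread of outputs rather than a single number:

```python
import random

# Hypothetical stand-in for the real model: in practice this would be a
# full simulation run. The formula is invented purely to illustrate the
# workflow of propagating input uncertainty through to the outputs.
def peak_demand(transmission_rate, daily_contacts):
    return 1000.0 * transmission_rate * daily_contacts

N_SAMPLES = 10_000
outputs = []
for _ in range(N_SAMPLES):
    # Each uncertain input is drawn from a plausible range rather than
    # fixed at a single best estimate (ranges invented for illustration).
    beta = random.uniform(0.2, 0.4)
    contacts = random.uniform(8.0, 12.0)
    outputs.append(peak_demand(beta, contacts))

outputs.sort()
low, central, high = (outputs[int(N_SAMPLES * q)] for q in (0.05, 0.5, 0.95))
print(f"central estimate {central:.0f}, 90% range {low:.0f} to {high:.0f}")
```

The deliverable is then a central estimate with a range, which is what a decision maker actually needs in order to weigh the risks.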

Even highly oversimplified mathematical representations can be useful, because they give valuable qualitative insights into the way complicated systems might behave in a broad sense. In particular, we can often quickly gain a good understanding of the way the answers would change when we modify the inputs to the model (i.e. adjust this parameter down and that output goes up). In epidemic modelling, for example, even the simplest schemes suggest that infection rates build up and then die away as more of the population becomes immune and it becomes less likely that an infected person will contact new hosts who have no immunity. They will tell you clearly that reducing the probability of person-to-person transmission may not always reduce the total number who eventually become infected (though it might), but it will certainly spread those infections out in time, and that may be helpful in protecting an overstretched health service. If, however, we want to know exactly when the peak will occur and how high it will be, that requires a more sophisticated approach, looking in more detail at how different groups of people in the population interact with each other.
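As an illustration of that simplest kind of scheme, here is a minimal sketch of the classic SIR (susceptible-infected-recovered) compartment model in Python. The parameter values are invented for illustration, not fitted to any real epidemic, but the qualitative behaviour is exactly as described above: the epidemic peaks and dies away, and a lower transmission rate gives a later, lower peak.

```python
def run_sir(beta, gamma=0.1, days=300, n=1_000_000, i0=10):
    """Daily-step SIR model: beta = transmission rate per day,
    gamma = recovery rate per day (both invented for illustration).
    Returns (day of peak, number infected at peak, total ever infected)."""
    s, i, r = n - i0, float(i0), 0.0
    peak_day, peak_i = 0, float(i0)
    for day in range(1, days + 1):
        new_infections = beta * s * i / n
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        if i > peak_i:
            peak_day, peak_i = day, i
    return peak_day, peak_i, n - s

# Lowering beta stands in for reducing person-to-person transmission.
for beta in (0.3, 0.2):
    day, peak, total = run_sir(beta)
    print(f"beta={beta}: peak on day {day}, about {peak:,.0f} infected at "
          f"the peak, {total:,.0f} infected in total")
```

Comparing the two runs shows the point about spreading infections out in time; exactly where and how high the peak falls in reality, though, depends on the finer structure this simple scheme ignores.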

Physics modelling is easy compared with trying to predict the way people will behave. In order to model epidemics we need to represent the way people interact, and we cannot know how every individual is going to behave at any particular time, so we have to divide the population into broad groups (perhaps children, teenagers, young singles, older married people with children, the elderly, people who live in high-density environments, and so on) and then make broad assumptions, based on empirical observations, about the way each group on average will behave and interact, both within itself and with the other groups.
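To show what those broad groups and assumptions can look like inside a program, here is a sketch of the bookkeeping involved, again in Python. The group names, sizes, contact matrix and per-contact transmission probability are all invented assumptions for illustration; a real model would estimate them from survey and surveillance data:

```python
# Illustrative population groups and an assumed contact matrix: every name,
# size and rate below is an invented assumption, not real data.
groups = ["children", "working_age", "elderly"]
population = [12_000_000, 40_000_000, 10_000_000]

# contacts[a][b]: average daily contacts a member of group a has with group b.
contacts = [
    [8.0, 4.0, 1.0],   # children
    [2.0, 6.0, 1.5],   # working_age
    [0.5, 2.0, 2.5],   # elderly
]

transmission_per_contact = 0.03   # assumed probability of infection per contact

def force_of_infection(group, infected):
    """Daily rate at which a susceptible member of `group` gets infected,
    given the current number of infected people in each group."""
    rate = 0.0
    for other, n_infected in enumerate(infected):
        prevalence = n_infected / population[other]
        rate += transmission_per_contact * contacts[group][other] * prevalence
    return rate

# Example: 100,000 people currently infected, all in the working-age group.
infected_now = [0, 100_000, 0]
for g, name in enumerate(groups):
    print(f"{name}: daily infection risk {force_of_infection(g, infected_now):.4f}")
```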

In principle, you might think that by adding more detail (such as more finely distinguished population groups) we would get better predictions, but that is not necessarily the case, because the finer the detail the less we may know about the average behaviour of any particular group. One may, for example, have reasonable data on the average number of personal interactions single 20-somethings have each day, but if we further divide that group by, say, educational attainment or social background, we soon find that only a few observed individuals contribute to each average: it becomes statistically unreliable.

Furthermore, as an epidemic progresses, the original data about population behaviour becomes less reliable, because the epidemic itself changes behaviour. We have little empirical data on responses to pandemics in our own time and our own culture. Reasonable assumptions will need to be made, and then tuned to reality as we learn more during the progress of the epidemic.

In fact, as we add more complexity to our models we also raise the risk of making more mistakes in the mathematics and introducing more errors into the ever more complex software implementation. More complicated computer models also become much harder to test and validate because they have much more complicated behaviour, and it will cost far more in money and time to collect the evidence confirming that all the nuances of behaviour are correctly represented. Indeed, testing and validating a computer model to the point where it is sensible to base important decisions on its predictions may take far more effort than developing and programming the model in the first place. So while complicated models may potentially produce higher-fidelity representations of the real world, they may also go so badly wrong that they are highly misleading.

When we need to be really sure that we are not being misled, it is sometimes better to stay with simpler but transparently understandable approaches, and to build in extra, clearly defensible safety margins. If I cannot constrain the magnitude of potential errors in a more sophisticated approach then, although its predictions may be potentially more accurate, I may find that the larger safety margins I must apply actually leave less scope for manoeuvre.

When avoiding errors is really important we often use independent computer models, written by different teams, to study the same problem. In fact, systematic comparisons of this type suggest that many computer models (even those produced after high investment) may be rather less dependable than their authors might like to believe (see Hatton, 1997).

Even then, it is usually not the scientist or engineer who has to make the high-impact decisions, and communicating advice to senior managers or politicians in a way that does not mislead requires care. The decision makers rightly claim that balancing risks and benefits through their choices is their particular province and responsibility. Our responsibility as modellers is to communicate correctly not only our projections but also the uncertainties in those projections, so that the decision makers can make a fair assessment of the real risks they may need to accept.

The modelling exercises carried out to inform the pronouncements of the Intergovernmental Panel on Climate Change are an excellent example of the way things should be done. Not only do we see contributions from many independently developed models; we also see many variations of each model being studied, in which the controlling parameters (some of which represent uncertain aspects of the real world) are varied to examine the effect this has on the outputs.

In the case of the current climate emergency you may, nevertheless, be forgiven for thinking that political leaders have so far guided their decision making by the more optimistic outcomes amongst the predicted climate scenarios, but they have certainly not been misled. I have noted that in the current Covid-19 pandemic (as with the climate emergency) journalists have often focussed on either particularly optimistic or particularly pessimistic predictions: neither on its own is especially informative.

We should not, however, be hypocritical: as most safety professionals realise, the public understanding of risk is not particularly sophisticated. We all like to think that we put a high value on life, but many of us still climb into cars and accept the not insignificant risk that we will kill or injure ourselves, our family members or even strangers who happen to get in the way. Safety professionals and their political masters, working on our behalf, have the unenviable job of working out how much inconvenience and cost we are really prepared to accept for improved safety, while still allowing us to believe one thing and do another.

None of this is easy or quick, and it often involves a good deal of informed insight, which is why it is important that such complicated models are operated by `suitably qualified and experienced’ engineers and scientists. In the end it is the scientist or engineer who makes a judgement as to which predictions should be trusted. If the computational models have been developed, built, tested and validated by suitably qualified and experienced people, using appropriate techniques compatible with accepted professional standards, and they are operated by suitably qualified and experienced people within the scope of the validation database, it is usually possible to justify a claim that any remaining errors still in the software probably have an insignificant effect on the outcomes. (Conversely, we would probably also claim that a modelling defect that is having a serious undesirable influence on outcomes would most likely be noticed by our highly qualified users.) If any of these conditions are compromised then there must be a corresponding decrease in the amount of trust that we should give to the results. The ability to communicate these subtleties successfully to decision makers is also part of what it means to be suitably qualified and experienced.

Given the importance of computational modelling in the modern world you might think that there is a consensus about the skills and qualifications required, and the professional standards that ought to be enforced. This is a surprisingly grey area, though regulated industries, such as nuclear, have to establish processes and standards and show that they comply with them to the satisfaction of the appropriate auditing authorities. (I once had responsibilities with my former employer both for defining appropriate working practices and for the training requirements that ensured our `suitably qualified and experienced’ staff were able to operate them successfully. I spent a good deal of my time looking at what engineers in similar industries doing similar work considered reasonable… and constantly worrying whether I had found the correct balance between regulation and trust in the competence of the often highly innovative people who looked to me for advice.)

In a wider context there are a number of reasons why it is very difficult to reach agreement, ranging from genuine, well-founded disagreements about choosing appropriate techniques in very disparate contexts with very different risk spectra, to the real difficulty of finding enough of the rare people who can exhibit the full range of required skills and knowledge: for example, high levels of ability in physics and maths, combined with extensive knowledge and experience of computer science and engineering, and the essential interpersonal skills to understand and communicate the human context of everything we do. We also have to deal with sufferers from the Dunning-Kruger effect (see Wikipedia!): incompetent people often fail to recognise their own incompetence, and it is not unknown for their undoubted abilities in self-promotion to push them into positions of responsibility where they can be a block to raising standards. Those of us who have had extensive training and experience in both physics and computing all too frequently recognise the signs.

It is about finding out who you can trust as well as which models you can trust.

The author is a Fellow of the Institute of Physics, a Fellow of the Royal Astronomical Society, a Member of the British Computer Society and also a Chartered Engineer.

References

Hatton, L., 1997. The T-experiments: errors in scientific software. In Quality of Numerical Software (pp. 12-31). Springer, Boston, MA.
