Statistical models are a lot like beauty pageants. You can philosophically discuss what truth/beauty really is or rant about how models don’t represent the general population. Both have curves, tight fits and ways to cover up unfortunate data points. Both are slightly obsessed with perfection, while both actually know this is unattainable. The names of statistical models can challenge any Miss World contest presenter: Miss Poisson (pronounced “pwa-saun”), Miss Chi (pronounced “kai”), Miss Least Likelihood of winning…
You may soon believe The Conversation is a sister magazine of Cosmopolitan.
The stylish pageant contestants make the rest of us very aware of our own normal distributions. Similarly, statistical models make us feel below average in intelligence, because it is hard to understand what they really are.
Do you know whether “residual”, “dimensionality reduction” and “k-folds” are plastic surgery or statistical terms? The similarities are not so strange, as the word “model” refers to the same concept in both statistics and fashion – an ideal or generalisable representation of the population of interest. In fashion, models represent our ideal pictures of beautiful people and in health statistics, a model tries to represent a population of typical people with or without a certain disease.
(At this point you should be imagining a comedy of errors of a statistician attending a modelling workshop of the wrong kind, and the ensuing confusion of terms. The Devil wears Data? The Data wears Prada?)
Statistical models also “compete” with each other to be crowned the best for the data, and they are also judged by specific criteria. The statistical model swimwear section judges how well the model fits the data. Imagine measuring 5 proteins in the blood of 10 people, and one metabolite in their urine. These are “biomarkers”, because they are measurable markers of some biological state inside the body that we cannot easily see from the outside. A model can explain the relationship between the proteins in the blood and the metabolite in the urine. It takes the values of the proteins and combines them in an equation, trying different combinations, until it comes as close as possible to producing the true value of the urine metabolite for each individual. These are the predictions. The differences between each prediction and the corresponding measured urine value are called residuals. The model’s aim is to minimise these differences, in the same way that a swimsuit generally fits tightly around one’s body. We usually try a number of these statistical swimsuits on a dataset, and choose the one that fits the best.
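For readers who want to try on a statistical swimsuit themselves, here is a minimal sketch of the fitting described above. It assumes one particular kind of model, an ordinary least-squares linear fit, and every number in it is simulated rather than measured; names such as `proteins`, `metabolite` and `true_weights` are made up for illustration.

```python
import numpy as np

# Made-up data: 10 people, 5 blood proteins, 1 urine metabolite.
rng = np.random.default_rng(0)
n_people, n_proteins = 10, 5
proteins = rng.normal(size=(n_people, n_proteins))

# Hypothetical "true" relationship, used only to simulate the metabolite.
true_weights = np.array([1.0, -0.5, 0.3, 0.0, 2.0])
metabolite = proteins @ true_weights + rng.normal(scale=0.5, size=n_people)

# Add a column of ones so the equation can include a baseline (intercept).
X = np.column_stack([np.ones(n_people), proteins])

# Least squares picks the weights that make the squared residuals smallest.
coef, *_ = np.linalg.lstsq(X, metabolite, rcond=None)

predictions = X @ coef
residuals = metabolite - predictions      # measured value minus prediction
rss = float(np.sum(residuals ** 2))       # the "tightness" of the fit

print("residuals:", np.round(residuals, 3))
print("residual sum of squares:", round(rss, 3))
```

A smaller residual sum of squares means a tighter fit. With only 10 people and 6 weights to choose, a tight fit is easy to achieve, which is exactly why fit alone cannot win the pageant.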
Beauty pageants also have a personality section: good looks alone will not do. Will the model also bring about world peace? Statistical models bring about world peace by being generalisable. Because we can never measure our 5 proteins in all the people with a specific disease, we take a sample of people that we think are good representatives of most other people who have that disease. A statistical model uses the sample data to get clues about all the other unmeasured people and tries to make a picture of that population, not only a picture of the people in the sample. If it is too much in fashion only with the sample data, it will be irrelevant in a different set of people with the same disease. So the model must do well in both the swimsuit and the personality sections.
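The personality section can be rehearsed in code too. The sketch below, again with entirely simulated data and a least-squares linear model chosen only for illustration, fits a model on a sample of 10 people and then checks how far off its predictions are for 1000 new people it has never seen. The helper names (`simulate`, `add_intercept`, `mean_squared_error`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical "true" relationship, used only to simulate data.
true_weights = np.array([1.0, -0.5, 0.3, 0.0, 2.0])

def simulate(n_people):
    """Simulate 5 protein values and one metabolite value per person."""
    proteins = rng.normal(size=(n_people, 5))
    metabolite = proteins @ true_weights + rng.normal(scale=0.5, size=n_people)
    return proteins, metabolite

def add_intercept(proteins):
    """Prepend a column of ones so the model can learn a baseline."""
    return np.column_stack([np.ones(len(proteins)), proteins])

def mean_squared_error(X, y, coef):
    """Average squared residual of the model `coef` on the data (X, y)."""
    return float(np.mean((y - X @ coef) ** 2))

# The sample we measured, and a larger group we never measured.
sample_proteins, sample_metabolite = simulate(10)
new_proteins, new_metabolite = simulate(1000)

# Fit on the sample only.
coef, *_ = np.linalg.lstsq(add_intercept(sample_proteins),
                           sample_metabolite, rcond=None)

sample_error = mean_squared_error(add_intercept(sample_proteins),
                                  sample_metabolite, coef)
new_error = mean_squared_error(add_intercept(new_proteins),
                               new_metabolite, coef)

print("error on the sample:    ", round(sample_error, 3))
print("error on new people:    ", round(new_error, 3))
```

A model that is too much in fashion with its own sample will typically show a much larger error on the new people than on the sample. Splitting one dataset into several such fit-and-check rounds is the idea behind the “k-folds” mentioned earlier.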
Arguably the most adored part of beauty pageants is the glamorous evening wear section. The statistical counterpart is the show-stopping significant p-value. If the model does not display a noticeable difference between people with the disease and those without, it will not make an impression. A p-value describes how likely it is that a difference at least as large as the one observed would turn up by sheer luck if there were really no difference at all, like the comedy of errors statistician (in the movie, she will obviously win the contest) who manages to stay upright in stilettos for the whole evening by luck alone. If a model predicts a difference in disease versus non-disease biomarker values and has a small p-value, a difference that large would rarely arise by chance alone. This suggests the difference in biomarkers may also be found in other diseased and non-diseased people, outside the sample.
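One way to see what “by sheer luck” means is a permutation test, sketched below. It is only one of many ways to obtain a p-value and is not necessarily what any particular study uses. The biomarker values are invented, and the function name `permutation_p_value` is hypothetical. The idea: shuffle the disease labels many times and count how often a shuffled difference is at least as large as the real one.

```python
import numpy as np

def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = np.random.default_rng(seed)
    group_a = np.asarray(group_a, dtype=float)
    group_b = np.asarray(group_b, dtype=float)
    observed = abs(group_a.mean() - group_b.mean())

    pooled = np.concatenate([group_a, group_b])
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Shuffling pretends the disease labels carry no information.
        shuffled = rng.permutation(pooled)
        diff = abs(shuffled[:n_a].mean() - shuffled[n_a:].mean())
        if diff >= observed - 1e-12:
            at_least_as_large += 1

    # Count the observed labelling as one of the arrangements,
    # so the p-value can never be exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)

# Invented biomarker values for illustration only.
diseased = [10.0, 11.0, 12.0, 13.0, 14.0]
healthy = [0.0, 1.0, 2.0, 3.0, 4.0]

p_value = permutation_p_value(diseased, healthy)
print("p-value:", p_value)
```

When the two groups barely overlap, as in these invented numbers, very few shuffles reproduce such a large difference, so the p-value is small. When the groups look alike, shuffling makes little difference and the p-value is large.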
Statistics is also beautiful, but like human models, not perfect, and not likely to bring about world peace. It tries to show a picture of what all people with a certain disease may be like, if they were a lot like the people in the sample we measured. We can never know whether our models are right, but measures like p-values help us gauge how easily chance alone could have led us to a wrong conclusion. And thanks to Miss Universe 2015, wrong conclusions are just another thing statistical models have in common with beauty pageants.