An introduction to the Akaike information criterion
The Akaike information criterion (AIC) is a mathematical method for evaluating how well a model fits the data it was generated from. In statistics, AIC is used to compare different possible models and determine which one is the best fit for the data. AIC is calculated from:
 the number of independent variables used to build the model.
 the maximum likelihood estimate of the model (how well the model reproduces the data).
The bestfit model according to AIC is the one that explains the greatest amount of variation using the fewest possible independent variables.
When to use AIC
In statistics, AIC is most often used for model selection. By calculating and comparing the AIC scores of several possible models, you can choose the one that is the best fit for the data.
When testing a hypothesis, you might gather data on variables that you aren’t certain about, especially if you are exploring a new idea. You want to know which of the independent variables you have measured explain the variation in your dependent variable.
A good way to find out is to create a set of models, each containing a different combination of the independent variables you have measured. These combinations should be based on:
 Your knowledge of the study system – avoid using parameters that are not logically connected, since you can find spurious correlations between almost anything!
 Your experimental design – for example, if you have split two treatments up among test subjects, then there is probably no reason to test for an interaction between the two treatments.
Once you’ve created several possible models, you can use AIC to compare them. Lower AIC scores are better, and AIC penalizes models that use more parameters. So if two models explain the same amount of variation, the one with fewer parameters will have a lower AIC score and will be the betterfit model.
How to compare models using AIC
AIC determines the relative information value of the model using the maximum likelihood estimate and the number of parameters (independent variables) in the model. The formula for AIC is:
K is the number of independent variables used and L is the loglikelihood estimate (a.k.a. the likelihood that the model could have produced your observed yvalues). The default K is always 2, so if your model uses one independent variable your K will be 3, if it uses two independent variables your K will be 4, and so on.
To compare models using AIC, you need to calculate the AIC of each model. If a model is more than 2 AIC units lower than another, then it is considered significantly better than that model.
You can easily calculate AIC by hand if you have the loglikelihood of your model, but calculating loglikelihood is complicated! Most statistical software will include a function for calculating AIC. We will use R to run our AIC analysis.
AIC in R
To compare several models, you can first create the full set of models you want to compare and then run aictab()
on the set.
For the sugarsweetened beverage data, we’ll create a set of models that include the three predictor variables (age, sex, and beverage consumption) in various combinations. Download the dataset and run the lines of code in R to try it yourself.
Create the models
First, we can test how each variable performs separately.
Next, we want to know if the combination of age and sex are better at describing variation in BMI on their own, without including beverage consumption.
We also want to know whether the combination of age, sex, and beverage consumption is better at describing the variation in BMI than any of the previous models.
Finally, we can check whether the interaction of age, sex, and beverage consumption can explain BMI better than any of the previous models.
Compare the models
To compare these models and find which one is the best fit for the data, you can put them together into a list and use the aictab() command to compare all of them at once. To use aictab(), first load the library AICcmodavg.
Then put the models into a list (‘models’) and name each of them so the AIC table is easier to read (‘model.names’).
Finally, run aictab()
to do the comparison.
Interpreting the results
The code above will produce the following output table:
The bestfit model is always listed first. The model selection table includes information on:
 K: The number of parameters in the model. The default K is 2, so a model with one parameter will have a K of 2 + 1 = 3.
 AICc: The information score of the model (the lowercase ‘c’ indicates that the value has been calculated from the AIC test corrected for small sample sizes). The smaller the AIC value, the better the model fit.
 Delta_AICc: The difference in AIC score between the best model and the model being compared. In this table, the nextbest model has a deltaAIC of 6.69 compared with the top model, and the thirdbest model has a deltaAIC of 15.96 compared with the top model.
 AICcWt: AICc weight, which is the proportion of the total amount of predictive power provided by the full set of models contained in the model being assessed. In this case, the top model contains 97% of the total explanation that can be found in the full set of models.
 Cum.Wt: The sum of the AICc weights. Here the top two models contain 100% of the cumulative AICc weight.
 LL: Loglikelihood. This is the value describing how likely the model is, given the data. The AIC score is calculated from the LL and K.
From this table we can see that the best model is the combination model – the model that includes every parameter but no interactions (bmi ~ age + sex + consumption).
The model is much better than all the others, as it carries 96% of the cumulative model weight and has the lowest AIC score. The nextbest model is more than 2 AIC units higher than the best model (6.33 units) and carries only 4% of the cumulative model weight.
Based on this comparison, we would choose the combination model to use in our data analysis.
Reporting the results
If you are using AIC model selection in your research, you can state this in your methods section. Report that you used AIC model selection, briefly explain the bestfit model you found, and state the AIC weight of the model.
After finding the bestfit model you can go ahead and run the model and evaluate the results. The output of your model evaluation can be reported in the results section of your paper.
Frequently asked questions about AIC
 What is the Akaike information criterion?

The Akaike information criterion is a mathematical test used to evaluate how well a model fits the data it is meant to describe. It penalizes models which use more independent variables (parameters) as a way to avoid overfitting.
AIC is most often used to compare the relative goodnessoffit among different models under consideration and to then choose the model that best fits the data.
 What is a model?

In statistics, a model is the collection of one or more independent variables and their predicted interactions that researchers use to try to explain variation in their dependent variable.
You can test a model using a statistical test. To compare how well different models fit your data, you can use Akaike’s information criterion for model selection.
 What is meant by model selection?

In statistics, model selection is a process researchers use to compare the relative value of different statistical models and determine which one is the best fit for the observed data.
The Akaike information criterion is one of the most common methods of model selection. AIC weights the ability of the model to predict the observed data against the number of parameters the model requires to reach that level of precision.
AIC model selection can help researchers find a model that explains the observed variation in their data while avoiding overfitting.
 How is AIC calculated?

The Akaike information criterion is calculated from the maximum loglikelihood of the model and the number of parameters (K) used to reach that likelihood. The AIC function is 2K – 2(loglikelihood).
Lower AIC values indicate a betterfit model, and a model with a deltaAIC (the difference between the two AIC values being compared) of more than 2 is considered significantly better than the model it is being compared to.
1 comment
Thije Jim Zuidewind
October 13, 2020 at 5:29 PMGreat example, I finally have an "easy" way to analyse my models. I do have a question though. I ran a sample of 6 models, four of which had a very low weight so I could easily discard those. However my top model has a weight of 0.49 and my second 0.33, is this difference big enough to say the first model is more accurate?