Case study

Load Factor Analysis


Forecast developed a model that uses a Bayesian approach to predict a flight's load factor, to improve resource allocation and streamline passenger flow at a client's Australian airport, which serves over 2 million passengers annually.


Our client is one of Australia’s leading international airport operators, serving over 2 million passengers a year.


Because an airport environment changes rapidly, the client wanted to understand how changes in demand affect its operations, the cost/benefit trade-offs of reducing passenger queue lengths and wait times by increasing staffing, and the impact of large capital expansions of the airport.

As part of this overarching aim, we have developed a system to calculate the load factors of flights. Load factor refers to the percentage of the seats on a flight that are occupied. From the perspective of the airport, it is important to understand the load factor, and therefore the total number of passengers that will be in the airport at a given time, in order to ensure the airport is staffed at an appropriate level.


The model we developed uses a Bayesian approach to predict the likely load factor of a flight. A beta-binomial model was chosen: the occupancy of each seat is assumed to be an independent Bernoulli trial with a common probability, so the number of seats occupied on a plane can be modelled as a binomial distribution. This binomial distribution forms the likelihood function, with a beta distribution as the conjugate prior on the occupancy probability. The posterior is then also a beta distribution, and the resulting predictive distribution for the number of occupied seats is beta-binomial.
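As a minimal sketch of the conjugate update described above, assuming a single flight group: the prior parameters and seat counts below are hypothetical illustration values, not client data, and the names `BetaParams`, `update`, and `expected_passengers` are our own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaParams:
    """Parameters of a beta distribution over the seat-occupancy probability."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Mean of Beta(alpha, beta), i.e. the expected load factor.
        return self.alpha / (self.alpha + self.beta)


def update(prior: BetaParams, occupied: int, seats: int) -> BetaParams:
    """Conjugate update: a beta prior with a binomial likelihood gives a beta posterior."""
    if not 0 <= occupied <= seats:
        raise ValueError("occupied must be between 0 and seats")
    return BetaParams(prior.alpha + occupied, prior.beta + (seats - occupied))


def expected_passengers(posterior: BetaParams, seats: int) -> float:
    """Mean of the beta-binomial predictive distribution for a flight with `seats` seats."""
    return seats * posterior.mean


# Hypothetical numbers for illustration only.
prior = BetaParams(alpha=8.0, beta=2.0)            # prior mean load factor of 0.8
posterior = update(prior, occupied=150, seats=180)
forecast = expected_passengers(posterior, seats=180)
```

The update only adds observed occupied and empty seats to the prior parameters, which is why the approach is cheap to run once the prior is known.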

Flights are categorised based on day of week, month, route, direction, and airline, with the same approach used for each group. Historical data from 2010-18 was used to create the prior (beta) distribution, and 2019 data to form the (binomial) likelihood; 2020-21 was excluded due to the pandemic. The model will be updated to use more recent data as it becomes available.
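The grouping and prior-fitting step could look roughly like the sketch below. It is an assumption-laden illustration: the records, route and airline labels are placeholders, and the prior is fitted by the method of moments using only the standard library, which is a stand-in for whatever fitting routine was actually used.

```python
from collections import defaultdict
from statistics import mean, pvariance

# Hypothetical historical records:
# (day_of_week, month, route, direction, airline, load_factor)
history = [
    ("Mon", 1, "ROUTE_A", "arrival", "AIRLINE_X", 0.7),
    ("Mon", 1, "ROUTE_A", "arrival", "AIRLINE_X", 0.8),
    ("Mon", 1, "ROUTE_A", "arrival", "AIRLINE_X", 0.9),
]


def fit_beta_moments(load_factors):
    """Fit Beta(alpha, beta) by matching the sample mean and variance."""
    m = mean(load_factors)
    v = pvariance(load_factors)
    if v <= 0 or v >= m * (1 - m):
        raise ValueError("variance is incompatible with a beta distribution")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


# Group load factors by (day of week, month, route, direction, airline).
groups = defaultdict(list)
for *key, load_factor in history:
    groups[tuple(key)].append(load_factor)

# One prior per flight group, fitted with the same procedure.
priors = {key: fit_beta_moments(values) for key, values in groups.items()}
```

Groups with very few historical flights would give unstable estimates, so a real pipeline would likely need a fallback such as pooling across similar groups.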

Once completed, the code was deployed as an AWS Lambda function as part of the Digital Twin system. As memory in Lambda is very limited, the code was restricted to the numpy and pandas modules. To accommodate this, the prior distributions are calculated offline using the scipy module, and their parameters are uploaded to the Digital Twin backend.
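To illustrate how a prediction can be produced without scipy, the sketch below draws from a beta-binomial predictive distribution using numpy alone. The `PARAMS` lookup, its key, the parameter values, and the function name are hypothetical; in practice the parameters would come from the offline step described above.

```python
import numpy as np

# Hypothetical lookup of offline-fitted beta parameters, keyed by flight group.
PARAMS = {
    ("Mon", 1, "ROUTE_A", "arrival", "AIRLINE_X"): (158.0, 32.0),
}


def predict_load_factor(group, seats, n_draws=10_000, seed=0):
    """Summarise the beta-binomial predictive load factor by Monte Carlo sampling."""
    alpha, beta = PARAMS[group]
    rng = np.random.default_rng(seed)
    # Draw an occupancy probability, then a seat count given that probability.
    p = rng.beta(alpha, beta, size=n_draws)
    occupied = rng.binomial(seats, p)
    load = occupied / seats
    return {
        "mean": float(load.mean()),
        "p05": float(np.quantile(load, 0.05)),
        "p95": float(np.quantile(load, 0.95)),
    }


result = predict_load_factor(
    ("Mon", 1, "ROUTE_A", "arrival", "AIRLINE_X"), seats=180
)
```

Sampling in two stages avoids needing the beta-binomial probability mass function, which is the part that would otherwise call for scipy.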


The system produces load factor predictions for use in the Digital Twin system, allowing the client to allocate resources more efficiently and to identify potential bottlenecks in passenger flow through the airport.
