Skip to contents

Transform AEC distribution of preferences from long to wide format, with optional scaling and normalization. This function is useful for converting all distribution of preference data with similar format into format ready for ternary plots.

Usage

dop_transform(
  data,
  key_cols,
  value_col,
  item_col,
  normalize = TRUE,
  scale = 1,
  fill_value = 0,
  winner_col = NULL,
  winner_identifier = "Y"
)

Arguments

data

A data frame containing preference or vote distribution data, with format similar to AEC Distribution of Preferences 2022

key_cols

Columns that identify unique observations, e.g., DivisionNm, CountNumber

value_col

Numeric and non-negative. Column containing the numeric values to aggregate, e.g., CalculationValue, Votes.

item_col

Column name containing the items (candidates/parties) of the election, e.g., Party, Candidate. This column will become column names in the output wide format.

normalize

Logical. If TRUE (default), normalizes values within each group to sum to 1. If FALSE, returns raw aggregated values.

scale

Numeric. If normalize = FALSE, divides all values by this scale factor. Default is 1 (no scaling).

fill_value

Numeric. Value to use for missing combinations after pivoting. Default is 0.

winner_col

Optional character string specifying a column that indicates the winner/elected party. If provided, this column will be joined back to the output based on key columns. Useful for preserving election outcome information. Default is NULL.

winner_identifier

Optional character string specifying the value in winner_col that identifies winning candidates (e.g., "Y", "Elected"). Only used if winner_col is specified. Default is "Y".

Value

A data frame in wide format with:

  • Key columns identifying each observation

  • Columns for each item (candidate/party) containing aggregated/normalized values

  • Winner column (if winner_col was specified)

Examples

library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union
# Convert AEC 2025 Distribution of Preference data to wide format
data(aecdop_2025)

# We are interested in the preferences of Labor, Coalition, Greens and Independent. 
# The rest of the parties are aggregated as Other.
aecdop_2025 <- aecdop_2025 |>
 filter(CalculationType == "Preference Percent") |> 
 mutate(Party = case_when(
    !(PartyAb %in% c("LP", "ALP", "NP", "LNP", "LNQ")) ~ "Other",
    PartyAb %in% c("LP", "NP", "LNP", "LNQ") ~ "LNP",
   TRUE ~ PartyAb))

dop_transform(
  data = aecdop_2025,
  key_cols = c(DivisionNm, CountNumber),
  value_col = CalculationValue,
  item_col = Party,
  winner_col = Elected
)
#> # A tibble: 976 × 6
#>    DivisionNm CountNumber   ALP   LNP Other Winner
#>    <chr>            <dbl> <dbl> <dbl> <dbl> <chr> 
#>  1 Adelaide             0 0.465 0.242 0.294 ALP   
#>  2 Adelaide             1 0.467 0.242 0.291 ALP   
#>  3 Adelaide             2 0.476 0.244 0.279 ALP   
#>  4 Adelaide             3 0.483 0.249 0.268 ALP   
#>  5 Adelaide             4 0.493 0.285 0.222 ALP   
#>  6 Adelaide             5 0.691 0.309 0     ALP   
#>  7 Aston                0 0.373 0.377 0.251 ALP   
#>  8 Aston                1 0.373 0.378 0.249 ALP   
#>  9 Aston                2 0.376 0.380 0.244 ALP   
#> 10 Aston                3 0.378 0.384 0.238 ALP   
#> # ℹ 966 more rows