Transform AEC distribution of preferences from long to wide format
Source: R/input_wrapper.R
Transform AEC distribution of preferences data from long to wide format, with optional scaling and normalization. This function is useful for converting any distribution of preferences data in a similar format into a form ready for ternary plots.
Usage
dop_transform(
  data,
  key_cols,
  value_col,
  item_col,
  normalize = TRUE,
  scale = 1,
  fill_value = 0,
  winner_col = NULL,
  winner_identifier = "Y"
)
Arguments
- data
A data frame containing preference or vote distribution data, in a format similar to the AEC Distribution of Preferences 2022 data.
- key_cols
Columns that identify unique observations, e.g., DivisionNm, CountNumber
- value_col
Numeric and non-negative. Column containing the numeric values to aggregate, e.g., CalculationValue, Votes.
- item_col
Column name containing the items (candidates/parties) of the election, e.g., Party, Candidate. This column will become column names in the output wide format.
- normalize
Logical. If TRUE (default), normalizes values within each group to sum to 1. If FALSE, returns raw aggregated values.
- scale
Numeric. If normalize = FALSE, divides all values by this scale factor. Default is 1 (no scaling).
- fill_value
Numeric. Value to use for missing combinations after pivoting. Default is 0.
- winner_col
Optional character string specifying a column that indicates the winner/elected party. If provided, this column will be joined back to the output based on key columns. Useful for preserving election outcome information. Default is NULL.
- winner_identifier
Optional character string specifying the value in winner_col that identifies winning candidates (e.g., "Y", "Elected"). Only used if winner_col is specified. Default is "Y".
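The interplay of normalize, scale and fill_value can be illustrated on a toy data frame. The following is a minimal sketch using dplyr and tidyr, not the package's actual implementation; the data values and the object names long, wide_raw, wide_norm and wide_scaled are made up for illustration, and the scale value of 100 is an arbitrary choice.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format data (not real AEC figures)
long <- tibble(
  DivisionNm  = c("A", "A", "A", "B", "B"),
  CountNumber = c(0, 0, 0, 0, 0),
  Party       = c("ALP", "LNP", "Other", "ALP", "LNP"),
  Votes       = c(50, 30, 20, 60, 40)
)

# Aggregate per key/item, pivot to wide, and fill missing
# combinations with 0 (the role played by fill_value)
wide_raw <- long |>
  group_by(DivisionNm, CountNumber, Party) |>
  summarise(Votes = sum(Votes), .groups = "drop") |>
  pivot_wider(names_from = Party, values_from = Votes, values_fill = 0)

# Analogue of normalize = TRUE: each row sums to 1.
# The row total is computed first so that it is not affected
# by the column-by-column update inside across().
wide_norm <- wide_raw |>
  mutate(total = ALP + LNP + Other) |>
  mutate(across(c(ALP, LNP, Other), \(x) x / total)) |>
  select(-total)

# Analogue of normalize = FALSE with scale = 100:
# every value is divided by the same constant
wide_scaled <- wide_raw |>
  mutate(across(c(ALP, LNP, Other), \(x) x / 100))
```

Division "B" has no "Other" row in the long data, so its Other column is filled with 0 before any normalization or scaling is applied.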
Value
A data frame in wide format with:
- Key columns identifying each observation
- Columns for each item (candidate/party) containing aggregated/normalized values
- Winner column (if winner_col was specified)
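One way a winner column could be joined back onto the wide output is sketched below. This is an assumption about the general approach, not the function's confirmed internals; the toy data and the names wide, long, winners and with_winner are hypothetical, while the Winner column name follows the example output on this page.

```r
library(dplyr)

# Hypothetical wide output: one row per (DivisionNm, CountNumber)
wide <- tibble(
  DivisionNm  = c("A", "A", "B"),
  CountNumber = c(0, 1, 0),
  ALP         = c(0.5, 0.6, 0.4),
  LNP         = c(0.5, 0.4, 0.6)
)

# Hypothetical long input carrying an Elected flag per candidate
long <- tibble(
  DivisionNm  = c("A", "A", "A", "A", "B", "B"),
  CountNumber = c(0, 0, 1, 1, 0, 0),
  Party       = c("ALP", "LNP", "ALP", "LNP", "ALP", "LNP"),
  Elected     = c("Y", "N", "Y", "N", "N", "Y")
)

# Keep rows matching the winner identifier ("Y" here),
# then reduce to one winner per key combination
winners <- long |>
  filter(Elected == "Y") |>
  select(DivisionNm, CountNumber, Winner = Party) |>
  distinct()

# Join the winner back onto the wide table by the key columns
with_winner <- wide |>
  left_join(winners, by = c("DivisionNm", "CountNumber"))
```

A left join keeps every row of the wide table, so a key combination without a matching winner row would receive NA in Winner rather than being dropped.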
Examples
library(dplyr)
#>
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#>
#> filter, lag
#> The following objects are masked from ‘package:base’:
#>
#> intersect, setdiff, setequal, union
# Convert AEC 2025 Distribution of Preference data to wide format
data(aecdop_2025)
# We are interested in the preferences for Labor (ALP) and the Coalition (recoded as LNP).
# All other parties are aggregated as Other.
aecdop_2025 <- aecdop_2025 |>
  filter(CalculationType == "Preference Percent") |>
  mutate(Party = case_when(
    !(PartyAb %in% c("LP", "ALP", "NP", "LNP", "LNQ")) ~ "Other",
    PartyAb %in% c("LP", "NP", "LNP", "LNQ") ~ "LNP",
    TRUE ~ PartyAb))
dop_transform(
  data = aecdop_2025,
  key_cols = c(DivisionNm, CountNumber),
  value_col = CalculationValue,
  item_col = Party,
  winner_col = Elected
)
#> # A tibble: 976 × 6
#> DivisionNm CountNumber ALP LNP Other Winner
#> <chr> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 Adelaide 0 0.465 0.242 0.294 ALP
#> 2 Adelaide 1 0.467 0.242 0.291 ALP
#> 3 Adelaide 2 0.476 0.244 0.279 ALP
#> 4 Adelaide 3 0.483 0.249 0.268 ALP
#> 5 Adelaide 4 0.493 0.285 0.222 ALP
#> 6 Adelaide 5 0.691 0.309 0 ALP
#> 7 Aston 0 0.373 0.377 0.251 ALP
#> 8 Aston 1 0.373 0.378 0.249 ALP
#> 9 Aston 2 0.376 0.380 0.244 ALP
#> 10 Aston 3 0.378 0.384 0.238 ALP
#> # ℹ 966 more rows