Ten great R functions #2

This is a follow-up post to my previous post on great R functions. Some of these functions I use a lot, while others have been very helpful at least once.

11. dplyr::coalesce()

I have been working with data where the relevant information was spread across two columns but needed to be in one. For example, there might be an outcome for the treatment group and an outcome for the control group in a survey experiment, but each observation only has a value on one of these variables.

To create one variable with all of the information, we can use the coalesce() function. This function finds the first non-missing value across several columns and returns it. In the example below we create a new variable (var3) merged from two other variables.

library(dplyr)

df <- tibble(id = 1:4,
             var1 = c(1, 2, NA, NA),
             var2 = c(NA, NA, 3, 4))

df %>% 
  mutate(var3 = coalesce(var1, var2))

The new variable will have the values 1, 2, 3 and 4.

12. fs::dir_ls()

If you need a character vector with the files in a folder, optionally matching a specific regular expression, the dir_ls() function in the fs package has you covered. The example below will return all *.csv files in your working directory (you can also specify a path if it should not be your working directory).

fs::dir_ls(regexp = "\\.csv$")
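
For example, to search another folder, you can point the function at a path (the folder name below is just an illustration):

fs::dir_ls(path = "data", regexp = "\\.csv$")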

13. janitor::clean_names()

The clean_names() function does exactly what it promises: clean names. When I get Excel datasets to work with, the first row often has names that are not ideal variable names, including spaces and special characters.

In the example below I create an empty dataset with two variables with the horrible names Annual sales (USD) and Growth rate (%). Then I use the clean_names() function to get clean names from the data frame. Specifically, the function takes the variable names and edits them into snake_case names.

df <- data.frame("Annual sales (USD)" = NA,
                 "Growth rate (%)" = NA)

janitor::clean_names(df)

The variable names returned from the function are annual_sales_usd and growth_rate. Much better!

14. dplyr::add_count()

Yet another function that does exactly what it promises. add_count() adds a count of the variable of interest, i.e., the number of observations sharing that specific value. The example below counts how many observations have each value on the gear variable in mtcars and adds that information to a new variable (gear_n).

mtcars %>% 
  add_count(gear, name = "gear_n")

The function is similar to the count() function, but it will not collapse all observations sharing a value into a single row. Accordingly, you should only use count() if you want to summarise your data (without having to use group_by()), and add_count() if you want to keep all observations.

15. performance::check_collinearity()

When you estimate a regression model, you often need to check whether certain assumptions hold or not. The performance package has a lot of relevant functions that make this easy, such as check_collinearity(). This function lets you easily examine the potential multicollinearity in your model.
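
A minimal sketch of how this looks in practice (the model is just an illustration):

library(performance)

# a linear model with predictors that are likely to be correlated
m <- lm(mpg ~ wt + disp + hp, data = mtcars)

check_collinearity(m)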

You can read more about the function and see examples here.

16. dplyr::across()

If you need to apply a function (or functions) across multiple columns, across() is a great function to use. In one of my scripts, I had to create confidence intervals for poll estimates, and I used the function to create new variables with the maximum and minimum estimates.

polls %>% 
  mutate(across(starts_with("party"), 
                ~ .x + 1.96 * sqrt((.x * (100 - .x)) / n), 
                .names = "ci_max_{.col}"),
         across(starts_with("party"), 
                ~ .x - 1.96 * sqrt((.x * (100 - .x)) / n), 
                .names = "ci_min_{.col}")
  )

As you can see, the function takes all variables that start with “party”, calculates the lower and upper estimates and saves the information in new variables.

17. RVerbalExpressions::rx()

Writing regular expressions can be difficult and involve a lot of frustration. The rx() function lets you easily write code that returns the regular expression you want. You can see several good examples of how to use the function here.
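
To give a flavour, here is a sketch based on the URL-matching example from the package documentation:

library(magrittr)
library(RVerbalExpressions)

# build a regular expression that matches a URL, step by step
pattern <- rx() %>% 
  rx_start_of_line() %>% 
  rx_find("http") %>% 
  rx_maybe("s") %>% 
  rx_find("://") %>% 
  rx_maybe("www.") %>% 
  rx_anything_but(" ") %>% 
  rx_end_of_line()

grepl(pattern, "https://www.google.com")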

18. lubridate::make_date()

make_date() is a great function that easily creates a date variable when you have the information on year, month and day in three separate variables. For example:

df %>% 
  mutate(date = make_date(year, month, day))
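
To make this self-contained, here is a minimal sketch with a made-up data frame:

library(dplyr)
library(lubridate)

df <- tibble(year  = c(2020, 2021),
             month = c(1, 9),
             day   = c(15, 28))

df %>% 
  mutate(date = make_date(year, month, day))

The new date variable will contain 2020-01-15 and 2021-09-28.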

19. dplyr::pull()

If you want to extract a single column from a data frame, you can use the pull() function. The example below pulls the gear variable from the data frame and then returns the summary of the variable.

mtcars %>% 
  pull(gear) %>% 
  summary()

Similarly, if you want to extract an element from a list, you can use the pluck() function.
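
For example (the list here is made up):

x <- list(first = 1:3, second = letters)

purrr::pluck(x, "first")

This returns the vector 1, 2, 3.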

20. scales::show_col()

This was a function I was not familiar with until I saw Andrew Heiss mentioning it on Twitter. It is an amazing function to explore different colour schemes. Do check it out.
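
For example, to display a few arbitrary hex colours in a grid:

scales::show_col(c("#1b9e77", "#d95f02", "#7570b3", "#e7298a"))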

Potpourri: Statistics #79

Bayes Rules! An Introduction to Bayesian Modeling with R
A friendly introduction to machine learning compilers and optimizers
A History of Polar Area / Coxcomb / Rose charts & how to make them in R’s ggplot2
A Dataset of Cryptic Crossword Clues
Survival Analysis: Part I: Basic concepts and first analyses, Part II: Multivariate data analysis – an introduction to concepts and methods, Part III: Multivariate data analysis – choosing a model and assessing its adequacy and fit, Part IV: Further concepts and methods in survival analysis
Dataviz Accessibility Resources
RegExplain
A Succinct Intro to R
Deep Learning’s Diminishing Returns
Working with Google Sheets from R
The Rise of the Pandemic Dashboard
Predicting FT Trending Topics
The Art of Linear Algebra: Graphic Notes on “Linear Algebra for Everyone”
Modeling Possibly Nonlinear Confounders
ggHoriPlot: build horizon plots in ggplot2
Finding the Eras of MTV’s The Challenge Through Clustering
Why data scientists shouldn’t need to know Kubernetes
Creating a Dataset from an Image in R Markdown using reticulate
Neural Networks from scratch
plotDK: Plot Summary Statistics as Choropleth Maps of Danish Administrative Areas
The Power of Parameterized Reports With Plumber
Riding tables with {gt} and {gtExtras}
How to explain gradient boosting
How to visualize decision trees
Speech and Language Processing
Sexy up your logistic regression model with logit dotplots
AI’s Islamophobia problem
Possession Is The Puzzle Of Soccer Analytics. These Models Are Trying To Solve It.


Previous posts: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35 #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 #54 #55 #56 #57 #58 #59 #60 #61 #62 #63 #64 #65 #66 #67 #68 #69 #70 #71 #72 #73 #74 #75 #76 #77 #78

Potpourri: Statistics #78

Investigation of Data Irregularities in Doing Business 2018 and Doing Business 2020
Dyadic Clustering in International Relations
Forecasting: Principles and Practice
Data Disasters
A Quick How-to on Labelling Bar Graphs in ggplot2
Data visualisation using R, for researchers who don’t use R
Easy access to high-resolution daily climate data for Europe
Put R Models in Production
Machine learning, explained
Three ways to visualize binary survey data
In defense of simple charts
Modern Statistics with R
How to avoid machine learning pitfalls: a guide for academic researchers
Tune xgboost models with early stopping to predict shelter animal status
Machine-learning on dirty data in Python: a tutorial
I saw your RCT and I have some worries! FAQs
Up and running with officedown
Use racing methods to tune xgboost models and predict home runs
The 5-minute learn: Create pretty and geographically accurate transport maps in R
R’s Internal Data Formats: .Rda, .RData, .rds
Improve Your Code – Best Practices for Durable Code
An educator’s perspective of the tidyverse
Estimating regression coefficients using a Neural Network (from scratch)
Let users choose which plot you want to show
A look into ANOVA. The long way.
3 alternatives to a discrete color scale legend in ggplot2
Downloading the Census Household Pulse Survey in R
The Stata Guide
The Four Pipes of magrittr
Introducing {facetious} – alternate facets for ggplot2
Alternatives to Simple Color Legends in ggplot2
Top 3 Coding Best Practices from the Shiny Contest
Visualizing ordinal variables
Making Shiny apps mobile friendly
Climate circles
Elegant and informative maps with tmap
Exploring R² and regression variance with Euler/Venn diagrams
Exploring Pamela Jakiela’s simple TWFE diagnostics with R
The marginaleffects package for R
A lightweight data validation ecosystem with R, GitHub, and Slack
Create spatial square/hexagon grids and count points inside in R with sf
A daily updated JSON dataset of all the Open House London venues, events, and metadata
Animating Network Evolutions with gganimate
Beyond Bar and Box Plots
Causal Inference in R Workshop
Odds != Probability
How to visualize polls and results of the German election with Datawrapper
Irreproducibility in Machine Learning
tidybundestag
A collection of themes for RStudio
Shiny, Tableau, and PowerBI: Better Business Intelligence
Automate PowerPoint Production Using R
Estimating graph dimension with cross-validated eigenvalues
Understanding text size and resolution in ggplot2
Introduction to linear mixed models


Previous posts: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35 #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 #54 #55 #56 #57 #58 #59 #60 #61 #62 #63 #64 #65 #66 #67 #68 #69 #70 #71 #72 #73 #74 #75 #76 #77

Replace equations with code

Here is a suggestion: In empirical research, academics should move equations from the methods section to the appendix and, if anything, show the few lines of code used to estimate the model(s) in the software being used (ideally with citations to the software and statistical packages). Preferably, it should be possible to understand the estimation strategy without having to read any equations.

Of course, I am talking about the type of work that is not primarily interested in developing a new estimator or a formal theory that can be applied to a few case studies (or shed light on the limitations of empirical models). I am not against the use of equations or abstractions of any kind to communicate clearly and without ambiguity. I am, however, skeptical towards how empirical research often includes equations for the sake of … including equations.

I have a theory that academics, and in particular political scientists, put more equations in their research to show off their skills rather than to help the reader understand what is going on. In most cases, equations are not needed and are often there only to impress reviewers and peers, who of course are the same people (hence, peer review). The use of equations excludes readers rather than including them.

I am confident that most researchers spend more time in their favourite statistical IDE than they do writing and reading equations. For that reason, I also believe that most researchers will find it easier to read actual code instead of equations. Take this example of the equation and code for a binomial regression model (estimated with glmer()) from Twitter:
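
The gist is a few lines along these lines (the data and variable names here are purely illustrative, not the model from the tweet):

library(lme4)

# binomial regression with a random intercept for each group
m <- glmer(outcome ~ treatment + covariate + (1 | group),
           data = df,
           family = binomial(link = "logit"))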

Personally, I find it much easier to understand what is going on when I look at the R code instead of the extracted equation. Not only that, I also find it easier to think of potential alternatives to the regression model, e.g., that I can easily change the functional form and see how such changes will affect the results. This is something I rarely consider when I only look at equations.

The example above is from R, and not all researchers use or understand R. However, I am quite certain that everybody who understands the equation above will also be able to understand the few lines of code. And when people use Stata, it is often even easier to read the code (even if you are not an avid Stata user). SPSS syntax is much more difficult to read, but that says more about why you should not use SPSS in the first place.

I am not against the use of equations in research papers. However, I do believe empirical research would be much better off by showing and citing code instead of equations. Accordingly, please replace equations with code.

Potpourri: Statistics #75

Introducing pewmethods: An R package for working with survey data
Exploring survey data with the pewmethods R package
Weighting survey data with the pewmethods R package
Analyzing international survey data with the pewmethods R package
autumn: Fast, Modern, and Tidy Raking
Data science for economists
Papers about Causal Inference and Language
Yale Applied Empirical Methods PHD Course
Spreadsheet Munging Strategies
Visual Vocabulary: Designing with data
What can we learn from a country’s diplomatic gifts?
Map, Walk, Pivot
The Epidemiologist R Handbook
Machine learning with {tidymodels}
Choose your own tidymodels adventure
Applied Spatial Statistics with R
ggplot: the placing and order of aesthetics matters
Introduction to Functional Data Analysis with R
Visualizing Distributions with Raincloud Plots with ggplot2
A Chat with Andrew on MLOps: From Model-centric to Data-centric AI
ISLR tidymodels Labs
Plotting maps with ggplot2
R instructions for our research projects
A gentle introduction to deep learning in R using Keras
Everything You Always Wanted to Know About ANOVA
Replication Materials for “The Flying Bomb and the Actuary” (Shaw and Shaw, 2019)
Colors and emotions in data visualization
Rookie R mistakes
10 Tips to Customize Text Color, Font, Size in ggplot2 with element_text()
Writing unit tests in R
The Good, the Bad and the Ugly: how to visualize Machine Learning data
A curated list of APIs, open data and ML/AI projects on climate change
R for SEO
Using Geospatial Data in R
Good Data Scientist, Bad Data Scientist
The Evolution of a ggplot (Ep. 1)
Do Wide and Deep Networks Learn the Same Things?


Previous posts: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35 #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 #54 #55 #56 #57 #58 #59 #60 #61 #62 #63 #64 #65 #66 #67 #68 #69 #70 #71 #72 #73 #74

Potpourri: Statistics #72 (Monty Hall problem)

Monty Hall Simulations
Making the Monty Hall problem weirder but obvious
The Intuitive Monty Hall Problem
The psychology of the Monty Hall problem: Discovering psychological mechanisms for solving a tenacious brain teaser
The Collider Principle in Causal Reasoning: Why the Monty Hall Dilemma Is So Hard
Rationality, the Bayesian standpoint, and the Monty-Hall problem
Josh Miller’s alternative, more intuitive, formulation of Monty Hall problem
Monty Hall problem solved in tidyverse


Previous posts: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35 #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 #54 #55 #56 #57 #58 #59 #60 #61 #62 #63 #64 #65 #66 #67 #68 #69 #70 #71

Ten great R functions

Here are ten R functions that have saved me a lot of time over the years.

1. forcats::fct_reorder()

The forcats package has a lot of great functions. The one I use the most is the fct_reorder() function. I have also seen David Robinson using it a lot in his YouTube videos (I recommend his videos in this post).

The function is good for changing the order of values in a factor variable, e.g. if you want to make sure there is some structure to the values you present in a bar chart:
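
A minimal sketch of the idea (using mtcars purely for illustration):

library(dplyr)
library(forcats)
library(ggplot2)

# order the bars by their height instead of by the raw factor levels
mtcars %>% 
  count(cyl) %>% 
  mutate(cyl = fct_reorder(factor(cyl), n)) %>% 
  ggplot(aes(x = cyl, y = n)) +
  geom_col()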

2. countrycode::countrycode()

I have lost count of the number of times I have used the countrycode package. If you are doing comparative research and not using the countrycode() function, you are in for a treat.

In a lot of datasets you will not have the full country name (e.g. Denmark), but something like ISO 3166-1 alpha-2 codes (e.g. DK). The countrycode() function can easily return country names based on ISO codes (or vice versa). Here is an example:

library(countrycode)

countrycode(c("DK", "SE"), 
            origin = "iso2c", 
            destination = "country.name")

This code will return Denmark and Sweden. As you can see, you simply provide the “origin” (i.e. the type of data you have) and the “destination” (i.e. the type of data you would like). I especially find this function useful when I need to merge datasets with different country variables and when I want to present full country names in a visualisation instead of ISO codes.

Last, if you are working on a country-level dataset, make sure that it is easy to match the countries with any of the variables available in the countrycode package.

3. tidyr::separate_rows()

I recently had to work with a dataset where each country had several priorities in relation to the Sustainable Development Goals (SDGs). However, there was only one SDG variable with information on the relevant SDGs for each country. The separate_rows() function is great to turn such data into multiple rows.

library(dplyr)
library(tidyr)

df <- tibble(
  country = c(1, 2),
  SDG = c("SDG 5,SDG 17,SDG 3", "SDG 1,SDG 2,SDG 3")
)

df %>% separate_rows(SDG,
                     sep = ",",
                     convert = TRUE)

The sep argument specifies the separator used to split the information (in this case a comma). The code returns a tibble with two variables and six observations.

4. tidyr::crossing()

I often use the crossing() function when I need to create a data frame from scratch. For example, if you need to create a country-year data frame for a few countries from 1965 to 2021, you can create a data frame where each country has a row for each year. Here is an example:

crossing(country = c("Denmark", "Sweden"),
         year = 1965:2021,
         value = NA_real_)

5. stringi::stri_reverse()

I had to scrape a PDF file, but the text I got from the document was reversed, e.g. ‘Agriculture’ was ‘erutlucirgA’. There may be easier ways to do this, but the stri_reverse() function in the stringi package did the trick. Here is a simple example:

x <- "snoitcnuf R taerg neT"

stringi::stri_reverse(x)

And what we get is: “Ten great R functions”.

6. purrr::reduce()

The reduce() function is great for collapsing repetitive piping. There is a good blog post on the function here. To illustrate: where I used to merge several data frames into one large data frame with multiple lines of left_join(), I can now write:

reduce(list(df_1, df_2,
            df_3, df_4), 
       left_join, 
       by = c("iso2c", "year"))

The code will left join all the data frames on the iso2c and year variables.

7. dplyr::distinct()

If you have multiple rows in a data frame for the same unit, e.g. multiple rows per country, but want a unique row for each, you can use the distinct() function to get distinct rows. In the example below we have four rows, but we turn them into a data frame with rows that are distinct on the variable x.

df <- tibble(
  x = c(1, 1, 2, 2),
  y = c(1, 1, 2, 4)
) 

df %>% dplyr::distinct(x, .keep_all = TRUE)

8. fuzzyjoin::regex_left_join()

The regex_left_join() function from the fuzzyjoin package is great if you need to merge data frames based on a regular expression. I found this useful when I had to join data frames with different country names.

Here is a simple example where we join two data frames so that the rows for both “Denmark” and “denmark” are matched by the same regular expression.

df_1 <- data.frame(
  country = c("Denmark", "denmark"),
  year = 2020:2021
)

df_2 <- data.frame(regex_country = c("[Dd]enmark"),
                   type = 1:2)

df_1 %>%
  fuzzyjoin::regex_left_join(df_2, by = c(country = "regex_country"))

9. ggplot2::labs()

I used to look up the theme() function when I had to remove the title of a legend, or use scale_x_continuous() if I had to change the title of the x-axis. Not anymore. The labs() function is an easy way to change the labels in your figure. You can also use it to change the title and subtitle of your figure. Highly recommended.
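
A quick sketch of how it might look (the labels are arbitrary):

library(ggplot2)

ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(gear))) +
  geom_point() +
  labs(x = "Weight (1,000 lbs)",
       y = "Miles per gallon",
       colour = "Gears",
       title = "Fuel economy and car weight")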

10. tidyr::drop_na()

When I check some of my old code, I often see lines like this:

df %>% 
  filter(!is.na(var1))

However, there is a much easier way to do this, namely using the drop_na() function.

df %>% 
  drop_na(var1)

This is not only much easier to write than having to rely on two functions, but also a lot easier to read.

Potpourri: Statistics #71

SDS 375/395 Data Visualization in R
Demystifying the coalesce function
Data Viz Bookmarks
Data Science: A First Introduction
Crime by the Numbers
The value of p
The Tidyverse in a Table
Sample Size Justification
Learn tidytext with my new learnr course
Using random effects in GAMs with mgcv
Public Policy Analytics: Code & Context for Data Science in Government
How to run 100 regressions without loops in R
Spreadsheet mistakes – news stories
Weights in statistics
Importing Multiple Files Quickly and Efficiently
Making Sense of Sensitivity: Extending Omitted Variable Bias
Microsoft365R: an R interface to the Microsoft 365 suite
fixest: Fast Fixed-Effects Estimations
Grab World Bank Data in R with {WDI}
Lists are my secret weapon for reporting stats with knitr
Building a team of internal R packages
Tidyverse Skills for Data Science in R
Practical Applications in R for Psychologists
Transform List into Dataframe with tidyr and purrr
Main terms and concepts in R
A complete guide to scales
Computational Thinking for Social Scientists
A Crash Course in Good and Bad Controls
Causal design patterns for data analysts
Modern Data Science with R
Generating SQL with {dbplyr} and sqlfluff
Hypothesis test by hand
How to Use Git/GitHub with R
Testing for normality
Scrape Hundreds of PDF Documents From the Web with R and rvest
Radial Patterns in ggplot2
a gRadual intRoduction to Shiny
Reading tables from images with magick
ggplot Wizardry Hands-On


Previous posts: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17 #18 #19 #20 #21 #22 #23 #24 #25 #26 #27 #28 #29 #30 #31 #32 #33 #34 #35 #36 #37 #38 #39 #40 #41 #42 #43 #44 #45 #46 #47 #48 #49 #50 #51 #52 #53 #54 #55 #56 #57 #58 #59 #60 #61 #62 #63 #64 #65 #66 #67 #68 #69 #70

How to improve your figures #3: Don’t show variable names

When you plot a figure in your favourite statistical software, you will most likely see the name of the variable(s) you are plotting. If your income variable is called inc, your software will label the axis inc, not income. In most cases variable names are not sufficient, and you should, for that reason, not show variable names in your figures.

Good variable names are easy to read and write – and follow specific naming conventions. For example, you cannot (and should not) include spaces in your variable names. That is why we use underscores (_) to separate words in variable names. However, R, SPSS and Stata will happily show such underscores in your figures – and you need to fix that.
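
In ggplot2, for example, fixing a label is one labs() call (the data frame and variable here are hypothetical):

library(ggplot2)

# 'df' and 'anti_eu' are made up; the point is the labs() call
ggplot(df, aes(x = anti_eu)) +
  geom_histogram() +
  labs(x = "Anti-EU sentiment")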

I believe this is data visualisation 101 but it is something I see a lot, including in published research. For example, take a look at this figure (Figure 1 from this paper):

As you can see, we have Exitfree, Anti_EU and some GDP* variables. The good thing about this paper is that the variable names are mentioned in the main text as well: “Individuals and parties may have ideological objections to European integration and hence desire a free exit right irrespective of whether their country is peripheral. To control for this, a variable ‘Anti_EU’ is constructed based on the variable ‘eu_anti_pro’ in the ParlGov database”. However, I would still recommend that you do not show the raw variable names in the figures but use proper labels (with spaces and everything).

Let’s look at another few examples from this paper. Here is the first figure:

The important thing is not what the figure is about, but the labels. You will see labels such as PID_rep_dem and age_real. These are not good labels to have in a figure in a paper. age_real is not mentioned anywhere in the paper (only age is mentioned, as a covariate).

Let us take a look at Figure 3 from the same paper:

Here you will see a variable called form2. What was form 1? Is there a form 3? When we rely on variable names instead of clear labels, we introduce ambiguity and make it difficult for the reader to understand what is going on. Notice also the difference between Figure 1 and Figure 3 for age, i.e. age_real and real_age. Are those variables the same (i.e. a correlation of 1)? And if that is the case, why have two age variables?

Okay, next example. Look at Figure 6 from this paper:

Here we see a variable on the x-axis called yrs_since1920 (years since 1920). It would be better to have a label for this axis simply saying “Years since 1920”. Or even better: just “Year”, with the actual years on the axis. Notice also the 1.sønderjylland_ny label. Sønderjylland is not mentioned in the paper, and it is not clear how ny (“new” in Danish) should be understood here (most likely it was not the first Sønderjylland variable created in the data).

Let’s take another example, specifically Figure 3 from this paper:

Here we see the good old underscores en masse. anti_elite, immigrant_blame, ring_wing_complete_populism, rich_blame and left_wing_complete_populism. There are 29 authors on the article in question. Too many cooks spoil the broth? Nahh, I am sure most of the authors on the manuscript didn’t even bother looking at the figures (also, if you want to have fun, take a critical look at the results provided in the appendix!).

And now I notice that all of the examples I have provided above are from Stata. I promise it is a coincidence. However, let’s take one last example from R just to confirm that it is not only an issue in Stata. Specifically, look at Figure 3 in this paper (or Figure 4, Figure 5 and Figure 6):

The figure shows trends in public opinion on economic issues in the United States from 1972 to 2016. There are too many dots in the labels here. guar.jobs.n.income, FS.aid.4.college etc. are not ideal labels for your figure.

In sum, I like most of the papers above (there is a reason I found the examples in the first place). However, it is a major turn-off that the figures do not show actual labels but simply rely on the variable names or weird abbreviations to show crucial information.

Calculate seats in the Folketing with R

In many opinion polls, party support is reported not only as vote shares in percent but also as seat counts. As is well known, the D'Hondt method is used to allocate constituency seats in Danish general elections, which, together with compensatory seats, ensures a proportional relationship between votes and seats.

If you want to estimate how many seats the respective parties are set to win, I can warmly recommend the seatdist package for R. It is developed by Juraj Medzihorsky and can be found here. Once you have installed the package, you can easily load it into R and use the giveseats() function to calculate seats:

library(seatdist)

giveseats(c(33, 6, 10, 7, 8, 1, 3, 1, 5, 17, 8, 1), 
          ns = 175, 
          thresh = 0.02,
          method = "dh")

The first thing we give the function is a vector with the parties' support in percent (I have omitted decimals here just to make it easier to read). 33 is, for example, the support for Socialdemokratiet in percent. ns specifies how many seats to allocate (number of seats, in this case 175), thresh specifies the electoral threshold (2% in this case), and method is our allocation method (dh for D'Hondt).

Here we can see that Socialdemokratiet would get around 61 seats at the next general election. This is of course an estimate, as 1) there is uncertainty in the poll and 2) not all seats are allocated this simply at the election. We also do not take the four North Atlantic seats into account. Nevertheless, it is relatively easy to get an estimate of party support expressed in seats. The package also offers a range of options for examining the parties' seat counts under other seat allocation methods.