Statistical Research Methods: Providing Researchers with the Confidence to Understand Data

Guest post by Dr James Abdey, Associate Professor (Education) in the Department of Statistics at the London School of Economics and Political Science.


Dr James Abdey – Lead Advisor and Author for Epigeum’s Statistical Research Methods – discusses the importance of statistical literacy and applications of methodologies to assist researchers in understanding their data and presenting their results clearly, objectively, and with confidence. 

James is Associate Professor (Education) in the Department of Statistics at the London School of Economics and Political Science.

Many learners feel significant trepidation when faced with a large dataset, and ask some (if not all) of the following common questions:

  • Where do I start?

  • What should I do?

  • Which statistical test is relevant – z, t, or F?

  • What do these different letters even mean?

It is important to acknowledge that such statistical anxiety is normal, and it was a driving factor behind creating this Epigeum resource. We developed this new Statistical Research Methods programme to guide learners through the art of data analysis, and to inspire them to have confidence in their data – a confidence level of at least 95%!

The course is designed in a logical way to navigate learners through the full journey of statistical discovery. However, we expect learners with varying levels of prior knowledge to enrol, and for this reason we have clearly structured the course to allow individuals to pick and choose the modules most relevant to their research objectives. In addition, learners can complete a pre-programme diagnostic tool which highlights their needs and signposts recommended content. This makes the online resource suitable both for those wishing to gain a foundation in statistics and for more experienced researchers who would like a refresher on selected topics.

Researchers can be eager in their efforts to calculate results for projects, but before being overly enthusiastic, it is essential to clearly define your research questions and the corresponding hypotheses to be tested. After all, you need to be clear about what questions you want to answer before you can actually answer them! However, when faced with multiple rows and columns of raw data, this might initially appear a somewhat overwhelming task. Fear not! Within this new Statistical Research Methods course great emphasis is put on how to define your research questions based on the different levels of measurement of your variables – for example, distinguishing between categorical and quantitative variables.

Many people are visual learners, and data visualisation is indispensable for storytelling with data – stories of data fact, not fiction. However, not all charts are created equal. Do you know your histogram from your boxplot? Your scatterplot from your bar chart? The number and type of variables you have dictate which plot types are (and are not) suitable. Such exploratory data analysis serves as a useful starting point for identifying interesting features of your data – whether x and y exhibit a relationship, for example. Such exploration can itself help with establishing your research priorities. Is such a relationship causal? How strong is the relationship? This is precisely why we include a module on data visualisation, with interactive graphs and activities, so that learners can build up significant knowledge in this field.

A researcher should never underestimate the importance of simple descriptive statistics. The clue is in the name, i.e. how to ‘describe’ datasets. But should you focus on the mean or the median? Does it matter? Can you distinguish a variance from a standard deviation? How does central tendency differ from dispersion? Rest assured; all becomes clear in Statistical Research Methods.
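As a small illustration of why these distinctions matter, here is a minimal sketch using Python's built-in statistics module. The data values are hypothetical and chosen only to include one extreme observation; this is not taken from the course materials.

```python
import statistics

# Hypothetical sample with one extreme value (illustrative data only).
values = [2, 3, 3, 4, 5, 6, 40]

# Central tendency: the mean is pulled towards the extreme value,
# whereas the median is not.
mean = statistics.mean(values)
median = statistics.median(values)

# Dispersion: the sample variance is measured in squared units,
# and the standard deviation is its square root (original units).
variance = statistics.variance(values)
std_dev = statistics.stdev(values)

print(f"mean={mean}, median={median}")
print(f"variance={variance:.2f}, standard deviation={std_dev:.2f}")
```

With skewed data like this, the mean and median tell rather different stories, which is one reason the choice between them deserves some thought.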

Our interest as academics and statisticians lies in application and scenarios in the real world. To make sense of such complexity we must first simplify the real world, by focusing on the stylised facts of variables, and approximating them using tried-and-tested probability distributions. Which to use – the binomial, the Poisson, the normal? Each has its use and the circumstances for implementing each are explained within the Statistical Research Methods course.

Much of the data that learners will work with will form samples drawn (hopefully at random) from larger populations. Researchers often seek to infer unobserved attributes of populations using observed samples. Is your sample sufficiently representative of your target population? Exact representativeness is often elusive, such that uncertainties often exist in the world of statistical inference, which is why researchers are taught within this course a defined process to implement when they work with sample data.

Generalising statistical results from samples to broader populations is common, but it should be done with due caution. Opinion polls can be wrong, biases can distort data, and samples may not adequately represent the population. To address these issues, among others, this course focuses on various aspects of sampling, estimation and the generalisation of results. This meets the needs of researchers who must understand the uncertainty in their parameter estimates and be able to communicate likely estimation errors via confidence intervals.
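To make the idea of a confidence interval concrete, the sketch below computes an approximate interval for a population mean using only Python's standard library. The function name mean_confidence_interval and the sample values are hypothetical. It also assumes a normal (z) critical value, which is only an approximation; for a small sample such as this one, a t critical value would give a somewhat wider interval.

```python
import math
import statistics


def mean_confidence_interval(sample, level=0.95):
    """Approximate confidence interval for a population mean.

    Uses a normal (z) critical value, which is an approximation;
    for small samples a t critical value would give a wider interval.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error: sample standard deviation divided by sqrt(n).
    std_error = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * std_error, mean + z * std_error


# Hypothetical measurements, for illustration only.
sample = [4.1, 5.3, 4.8, 5.0, 4.6, 5.2, 4.9, 5.1, 4.7, 5.3]
lower, upper = mean_confidence_interval(sample)
print(f"95% CI for the mean: ({lower:.2f}, {upper:.2f})")
```

Raising the level (say, to 0.99) widens the interval: greater confidence comes at the price of less precision.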

Learners enrolled on Statistical Research Methods may need to perform hypothesis testing. With this in mind, we give this area significant treatment, with an emphasis not just on the mechanics of statistical testing, but also on recognising the possibilities of false positives and false negatives. Both types of error are “bad” – but depending on the testing situation they may be asymmetric in their severity. The statistical power of tests is also explored within this topic, including the impact of sample size and effect size on the ability of a test to detect a genuine effect.
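The link between sample size, effect size and power can be sketched in a few lines. The example below assumes the simplest textbook setting, a one-sided, one-sample z-test with a known population standard deviation; the function name z_test_power and the chosen effect size of 0.5 are hypothetical. Here the significance level alpha is the probability of a false positive, and one minus the power is the probability of a false negative.

```python
import math
from statistics import NormalDist


def z_test_power(effect_size, n, alpha=0.05):
    """Power of a one-sided, one-sample z-test.

    effect_size is the standardised effect (true mean shift divided by
    the known population standard deviation); n is the sample size.
    """
    std_normal = NormalDist()
    # Reject the null hypothesis when the test statistic exceeds this value.
    critical_value = std_normal.inv_cdf(1 - alpha)
    # Under the alternative, the statistic is shifted by effect_size * sqrt(n).
    return 1 - std_normal.cdf(critical_value - effect_size * math.sqrt(n))


# Power grows with the sample size for a fixed (hypothetical) effect size.
for n in (10, 25, 50, 100):
    print(n, round(z_test_power(0.5, n), 3))
```

When the effect size is zero, the function returns alpha itself, since any rejection is then a false positive.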

Researchers regularly employ statistical software packages to crunch the numbers. A variety of these tools is available, each with its respective merits and limitations, such as ease of use. The course includes extensive screencast demonstrations of a selection of popular software packages, such as Excel, Python and R, and learners are encouraged to experiment with each to determine their preferred choice.

We conclude the core modules of the programme by explaining the source(s) of variation in data. Analysis of variance and linear regression are popular tools for understanding the drivers of observed variation in so-called dependent variables of interest. These are essential tools in an empirical researcher’s toolbox which necessitates their inclusion in this programme.

Finally, in addition to the eight core modules within Statistical Research Methods, we have also created two companion materials on data management, data integrity, research ethics and open data sources. All learners are strongly encouraged to become familiar with this content, as it touches on important areas regarding the handling of data.

I wish all learners every success in their journey through Statistical Research Methods, after which I very much hope everyone will feel confident and competent to apply their acquired statistical knowledge to their own research. Death and taxes, allegedly, are the only certainties in life. With a high degree of confidence, I trust the discovery of the “joy of stats” is another to add to that expression! 


Statistical Research Methods was published in December 2024. To learn more about how this programme can offer your researchers confidence in statistics and their research projects, visit the course page below.

Discover Statistical Research Methods  and Watch a Recorded Video Demo 
