In this exam, which is more like a practice exercise, you’ll solve several applied tasks. In some parts you’ll use jamovi to analyze data sets; in others, you’ll answer theoretical questions based on the slides provided in class.
If you find a more creative way to answer an exercise, please use it; don’t limit yourself! There are multiple ways to solve a statistical dilemma, and your creativity is encouraged!
You may also use R to answer this “exam”.
Part 1: Let’s check some important concepts
Please copy each question along with your answer into a Word document. You may also use Quarto to earn extra points. Each question is worth 4 points.
Note
You may watch this video to complete the second part.
A true experiment selects participants randomly and assigns them randomly to different conditions:
Qualitative research collects numerical values to reject the null hypothesis:
Mario needs to know if there is a causal relationship between sugar intake and high BMI (Body Mass Index). What type of design should Mario conduct?
Select the option that will help you to collect counterfactual evidence in an experiment:
If you estimate a correlation between two variables, you can also assume causation:
Parameters in a statistical model are unknown quantities; we collect data to reduce our uncertainty about them:
The following expression represents a probabilistic model: (25 points)
\[ Y \sim p(y) \]
The reduction in uncertainty about model parameters that you achieve when you collect data is called statistical inference:
The following histogram shows a continuous distribution:
library(palmerpenguins)
hist(penguins$body_mass_g, breaks = 50, main = "Penguins' body mass", xlab = "grams")
Diane needs to estimate a Classical Regression Model. Diane believes that ethnicity has an effect on income. Diane asked for four ethnicity categories in her survey: white, black, Hispanic/Latino, and other. How many dummy coded variables does she need to compute to add ethnicity as a predictor of income?
We can consider the t-test and the Classical Regression Model as part of the General Linear Model:
In the following correlation matrix, is the correlation (\(r\) = -0.053) between attention and rumZ explained by chance alone? (25 points)
What type of plot is the following example?
According to the following table, is the mean difference between groups explained by chance alone? If not, how do you know?
Second Part: Hands on real data…as always.
In this second part, you may use jamovi or R to answer each question. I’ll provide data sets that you will open in jamovi or R.
In this exercise we will estimate a One-way ANOVA. One-way means that we will use only one predictor, or grouping variable, to test the null hypothesis. The data set that we will use is the file named ruminationClean.csv. You may download the file from this link: CLICK HERE
You can also copy the following code to open the file from my personal repository. With this method you don’t need to download any file, but it only works if you are using R:
In our One-way ANOVA we will use the variable “Booklet” as our grouping variable. Booklet takes three values: 1, 2, and 3. These numbers correspond to different versions of the same survey; in each version the depression items were presented in a different location. In booklet 1, the depression items were presented at the beginning of the survey; in booklet 2, in the middle; and in booklet 3, at the end.
Why did I sort the depression items into three positions? This is appropriate when your survey is very long and you believe people will be exhausted by the end of it. When participants are tired, they tend to give quick answers without thinking carefully. To check for this, you can create different versions of the same survey, assign the versions randomly, and then analyze whether tiredness had an effect on the depression score.
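The random assignment of survey versions described above can be sketched in R. This is only an illustration: the sample size, the seed, and the object names (`n_participants`, `assignment`) are assumptions made for the example and are not taken from the original study.

```r
# Hypothetical sketch: randomly assign three booklet versions to participants.
# The sample size and seed are arbitrary choices for illustration only.
set.seed(123)
n_participants <- 12

# Repeat versions 1, 2, 3 so the groups are balanced, then shuffle the order
booklet <- sample(rep(1:3, length.out = n_participants))

assignment <- data.frame(id = seq_len(n_participants), Booklet = booklet)

# Each version should appear the same number of times
table(assignment$Booklet)
```

Shuffling a balanced vector keeps the three groups the same size, which drawing each participant's version independently would not guarantee.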
Important
I used the word “booklet” to name the column because this study was done a long time ago. At that time, online surveys were not often used in Costa Rica. Each participant had to answer a paper-and-pencil version of the survey.
Follow the next steps:
Open the file ruminationClean.csv in jamovi.
Click on ANOVA in the top bar.
Select the “Booklet” variable as your grouping variable.
Select “depreZ” (depression) as your dependent variable.
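If you prefer R, the steps above can be sketched roughly as follows. The data frame here is simulated so that the code runs on its own; it only borrows the column names `Booklet` and `depreZ`, and its values are not the real data. With the real file you would first read it yourself, for example with `read.csv("ruminationClean.csv")`, assuming the file is in your working directory.

```r
# Simulated stand-in data: NOT the real ruminationClean.csv values.
# Only the column names (Booklet, depreZ) follow the exercise.
set.seed(2024)
rumination_sim <- data.frame(
  Booklet = rep(1:3, each = 30),
  depreZ  = rnorm(90)
)

# Booklet is stored as numbers, so convert it to a factor;
# otherwise R would treat it as a continuous predictor.
rumination_sim$Booklet <- factor(rumination_sim$Booklet)

# Classical one-way ANOVA: depreZ explained by Booklet
anova_fit <- aov(depreZ ~ Booklet, data = rumination_sim)
summary(anova_fit)
```

This is the classical equal-variance ANOVA; jamovi's output may differ slightly depending on the options you select there.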
After following the previous steps, answer the following questions:
What is the null hypothesis in this ANOVA analysis? (10 points)
Create a box plot of depreZ by Booklet and interpret the plot. (10 points)
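In R, a box plot by group can be drawn with base graphics. The sketch below again uses simulated values under the exercise's column names, so the picture it produces says nothing about the real data; the object names `rumination_sim` and `bp` are made up for the example.

```r
# Simulated stand-in data: NOT the real ruminationClean.csv values.
set.seed(2024)
rumination_sim <- data.frame(
  Booklet = factor(rep(1:3, each = 30)),
  depreZ  = rnorm(90)
)

# One box per booklet version; boxplot() also returns the plotted statistics
bp <- boxplot(
  depreZ ~ Booklet,
  data = rumination_sim,
  xlab = "Booklet",
  ylab = "depreZ",
  main = "depreZ by Booklet (simulated data)"
)
```

The returned object `bp` stores the five summary values drawn for each box in `bp$stats`, which can help when you write your interpretation.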