Hypothesis Testing

Hypothesis testing using computer simulation, based on examples from the infer package. Code for Quiz 13.

Question: t-test

Read the data file into R and assign it to hr

Note: col_types = "fddfff" defines the column types as factor-double-double-factor-factor-factor
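
To illustrate the compact col_types string, here is a minimal sketch using readr. The file name is not given in this write-up, so the example writes a tiny hypothetical CSV to a temporary file; the column order (gender, age, hours, evaluation, salary, status) and the three rows are assumptions chosen only to match the "fddfff" pattern and the column names in the summary below.

```r
library(readr)

# Hypothetical stand-in file: three made-up rows written to a temp CSV so the
# example runs on its own. The column order is an assumption.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "gender,age,hours,evaluation,salary,status",
  "female,34.2,49.6,good,level 3,ok",
  "male,51.0,39.2,fair,level 1,fired",
  "female,28.7,63.2,very good,level 5,promoted"
), csv_path)

# One letter per column, left to right: f = factor, d = double
hr <- read_csv(csv_path, col_types = "fddfff")

# Inspect the resulting column types
str(hr)
```

Spelling out the types this way avoids readr guessing character columns where factors are wanted for grouping later on.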

Table 1: Data summary
Name hr
Number of rows 500
Number of columns 6
_______________________
Column type frequency:
factor 4
numeric 2
________________________
Group variables None

Variable type: factor

skim_variable n_missing complete_rate ordered n_unique top_counts
gender 0 1 FALSE 2 fem: 253, mal: 247
evaluation 0 1 FALSE 4 bad: 148, fai: 138, goo: 122, ver: 92
salary 0 1 FALSE 6 lev: 98, lev: 87, lev: 87, lev: 86
status 0 1 FALSE 3 fir: 196, pro: 172, ok: 132

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
age 0 1 39.41 11.33 20 29.9 39.35 49.1 59.9 ▇▇▇▇▆
hours 0 1 49.68 13.24 35 38.2 45.50 58.8 79.9 ▇▃▃▂▂

The mean hours worked per week is: 49.7

Q: Is the mean number of hours worked per week 48?

Response: hours (numeric)
# A tibble: 500 x 1
   hours
   <dbl>
 1  49.6
 2  39.2
 3  63.2
 4  42.2
 5  54.7
 6  54.3
 7  37.3
 8  45.6
 9  35.1
10  53  
# … with 490 more rows

Hypothesize that the mean number of hours worked per week is 48

Response: hours (numeric)
Null Hypothesis: point
# A tibble: 500 x 1
   hours
   <dbl>
 1  49.6
 2  39.2
 3  63.2
 4  42.2
 5  54.7
 6  54.3
 7  37.3
 8  45.6
 9  35.1
10  53  
# … with 490 more rows
Response: hours (numeric)
Null Hypothesis: point
# A tibble: 500,000 x 2
# Groups:   replicate [1,000]
   replicate hours
       <int> <dbl>
 1         1  49.9
 2         1  37.0
 3         1  33.4
 4         1  45.9
 5         1  50.4
 6         1  36.6
 7         1  52.7
 8         1  49.6
 9         1  52.7
10         1  35.7
# … with 499,990 more rows

The output has 500,000 rows

# A tibble: 1,000 x 2
   replicate     stat
 *     <int>    <dbl>
 1         1  1.96   
 2         2  0.532  
 3         3 -1.04   
 4         4 -0.00975
 5         5  1.32   
 6         6  0.177  
 7         7  0.550  
 8         8  0.517  
 9         9  0.492  
10        10 -0.821  
# … with 990 more rows

null_t_distribution has 1000 t-stats

# A tibble: 1 x 1
   stat
  <dbl>
1  2.83
# A tibble: 1 x 1
  p_value
    <dbl>
1   0.008
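
As a rough cross-check on the observed statistic, the one-sample t-statistic can be recomputed by hand from the rounded summary values in Table 1 (mean 49.68, sd 13.24, n = 500). This is only a sketch from rounded inputs, so it can differ from the reported value in the last digit.

```r
# Rounded summary values taken from the skim output for hours
x_bar <- 49.68   # sample mean
s     <- 13.24   # sample standard deviation
n     <- 500     # number of rows
mu_0  <- 48      # hypothesized mean under the null

# t = (sample mean - hypothesized mean) / standard error of the mean
standard_error <- s / sqrt(n)
t_stat <- (x_bar - mu_0) / standard_error

round(t_stat, 2)  # 2.84 from the rounded inputs, close to the reported 2.83
```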

shade_p_value on the simulated null distribution
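
The steps whose output appears above (specify, hypothesize, generate, calculate, then the p-value and the shaded plot) could be sketched with infer roughly as follows. The original code is not shown in this write-up, so this is an assumption about how it was written: the bootstrap type, the two-sided direction, and the name observed_t_statistic are guesses, and hr here is synthetic stand-in data, not the quiz data.

```r
library(dplyr)
library(infer)

# Synthetic stand-in for hr (NOT the quiz data): 500 made-up hours values
set.seed(13)
hr <- tibble(hours = round(rnorm(500, mean = 49.7, sd = 13.2), 1))

# Simulated null distribution of t-statistics under H0: mu = 48
null_t_distribution <- hr %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 48) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "t")

# Observed t-statistic from the sample itself (hypothetical object name)
observed_t_statistic <- hr %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 48) %>%
  calculate(stat = "t")

# Share of simulated statistics at least as extreme as the observed one
p_value <- null_t_distribution %>%
  get_p_value(obs_stat = observed_t_statistic, direction = "two-sided")

# Histogram of the null distribution with the p-value region shaded
null_plot <- visualize(null_t_distribution) +
  shade_p_value(obs_stat = observed_t_statistic, direction = "two-sided")
```

With 500 rows and 1,000 replicates, generate() returns the 500,000-row tibble shown earlier, and calculate() collapses it to one statistic per replicate.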

Is the p-value < 0.05? (yes)

Does your analysis support the null hypothesis that the true mean number of hours worked was 48? (no)

Question: 2 sample t-test

Note: col_types = "fddfff" defines the column types as factor-double-double-factor-factor-factor

Use skim to summarize the data in hr_2 by gender
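
A grouped summary like the one below ("Piped data", grouped by gender) is typically produced by piping the data through group_by() before skim(). The sketch assumes that approach and uses a small synthetic hr_2 with only two columns, not the quiz data.

```r
library(dplyr)
library(skimr)

# Synthetic stand-in for hr_2 (NOT the quiz data)
set.seed(13)
hr_2 <- tibble(
  gender = factor(sample(c("female", "male"), 500, replace = TRUE)),
  hours  = round(runif(500, min = 35, max = 80), 1)
)

# skim() respects dplyr groups, so each variable is summarized once per gender
summary_by_gender <- hr_2 %>%
  group_by(gender) %>%
  skim()

summary_by_gender
```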

Table 2: Data summary
Name Piped data
Number of rows 500
Number of columns 6
_______________________
Column type frequency:
factor 3
numeric 2
________________________
Group variables gender

Variable type: factor

skim_variable gender n_missing complete_rate ordered n_unique top_counts
evaluation male 0 1 FALSE 4 bad: 79, fai: 68, goo: 61, ver: 48
evaluation female 0 1 FALSE 4 bad: 75, fai: 74, ver: 48, goo: 47
salary male 0 1 FALSE 6 lev: 49, lev: 48, lev: 48, lev: 44
salary female 0 1 FALSE 6 lev: 47, lev: 46, lev: 41, lev: 39
status male 0 1 FALSE 3 fir: 93, pro: 90, ok: 73
status female 0 1 FALSE 3 fir: 101, pro: 89, ok: 54

Variable type: numeric

skim_variable gender n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
age male 0 1 38.63 11.57 20.3 28.50 37.85 49.52 59.6 ▇▇▆▆▆
age female 0 1 41.14 11.43 20.3 31.30 41.60 50.90 59.9 ▆▅▇▇▇
hours male 0 1 49.30 13.24 35.0 37.35 46.00 59.23 79.9 ▇▃▂▂▂
hours female 0 1 49.49 13.08 35.0 37.68 45.05 58.73 78.4 ▇▃▃▂▂

Females worked an average of 49.5 hours per week

Males worked an average of 49.3 hours per week

Response: hours (numeric)
Explanatory: gender (factor)
# A tibble: 500 x 2
   hours gender
   <dbl> <fct> 
 1  78.1 male  
 2  35.1 female
 3  36.9 female
 4  38.5 male  
 5  36.1 male  
 6  78.1 female
 7  76   female
 8  35.6 female
 9  35.6 male  
10  56.8 male  
# … with 490 more rows
Response: hours (numeric)
Explanatory: gender (factor)
Null Hypothesis: independence
# A tibble: 500 x 2
   hours gender
   <dbl> <fct> 
 1  78.1 male  
 2  35.1 female
 3  36.9 female
 4  38.5 male  
 5  36.1 male  
 6  78.1 female
 7  76   female
 8  35.6 female
 9  35.6 male  
10  56.8 male  
# … with 490 more rows
Response: hours (numeric)
Explanatory: gender (factor)
Null Hypothesis: independence
# A tibble: 500,000 x 3
# Groups:   replicate [1,000]
   hours gender replicate
   <dbl> <fct>      <int>
 1  60.8 male           1
 2  36.4 female         1
 3  62.6 female         1
 4  61.9 male           1
 5  48.2 male           1
 6  35.2 female         1
 7  64.1 female         1
 8  42.9 female         1
 9  47.2 male           1
10  65.8 male           1
# … with 499,990 more rows

The output has 500,000 rows

# A tibble: 1,000 x 2
   replicate   stat
 *     <int>  <dbl>
 1         1  1.30 
 2         2  1.56 
 3         3  1.46 
 4         4  0.884
 5         5  0.145
 6         6 -0.684
 7         7 -1.20 
 8         8 -0.378
 9         9  0.642
10        10  0.288
# … with 990 more rows

# A tibble: 1 x 1
   stat
  <dbl>
1 0.160
# A tibble: 1 x 1
  p_value
    <dbl>
1   0.862
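
For the two-sample comparison, the null of independence is usually simulated by permuting the group labels. The sketch below assumes that approach; the object names, the two-sided direction, and order = c("female", "male") are assumptions (the positive observed statistic is consistent with female minus male), and hr_2 is synthetic stand-in data.

```r
library(dplyr)
library(infer)

# Synthetic stand-in for hr_2 (NOT the quiz data)
set.seed(13)
hr_2 <- tibble(
  gender = factor(sample(c("female", "male"), 500, replace = TRUE)),
  hours  = round(rnorm(500, mean = 49.4, sd = 13.2), 1)
)

# Null distribution: shuffle gender labels, recompute the two-sample t each time
null_distribution_2_sample <- hr_2 %>%
  specify(response = hours, explanatory = gender) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "t", order = c("female", "male"))

# Observed two-sample t-statistic (female minus male)
observed_t_2_sample <- hr_2 %>%
  specify(response = hours, explanatory = gender) %>%
  calculate(stat = "t", order = c("female", "male"))

# Two-sided p-value from the permutation distribution
p_value <- null_distribution_2_sample %>%
  get_p_value(obs_stat = observed_t_2_sample, direction = "two-sided")
```

A large p-value here means the observed difference in means is well within what label shuffling alone produces, so the null hypothesis is not rejected.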

Is the p-value < 0.05? (no)

Does your analysis support the null hypothesis that the true mean number of hours worked by female and male employees was the same? (yes: with a p-value of 0.862 the null hypothesis is not rejected)

Question: ANOVA

Table 3: Data summary
Name Piped data
Number of rows 500
Number of columns 6
_______________________
Column type frequency:
factor 3
numeric 2
________________________
Group variables status

Variable type: factor

skim_variable status n_missing complete_rate ordered n_unique top_counts
gender fired 0 1 FALSE 2 fem: 96, mal: 89
gender ok 0 1 FALSE 2 fem: 77, mal: 76
gender promoted 0 1 FALSE 2 fem: 87, mal: 75
evaluation fired 0 1 FALSE 4 bad: 65, fai: 63, goo: 31, ver: 26
evaluation ok 0 1 FALSE 4 bad: 69, fai: 59, goo: 15, ver: 10
evaluation promoted 0 1 FALSE 4 ver: 63, goo: 60, fai: 20, bad: 19
salary fired 0 1 FALSE 6 lev: 41, lev: 37, lev: 32, lev: 32
salary ok 0 1 FALSE 6 lev: 40, lev: 37, lev: 29, lev: 23
salary promoted 0 1 FALSE 6 lev: 37, lev: 35, lev: 29, lev: 23

Variable type: numeric

skim_variable status n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
age fired 0 1 38.64 11.43 20.2 28.30 38.30 47.60 59.6 ▇▇▇▅▆
age ok 0 1 41.34 12.11 20.3 31.00 42.10 51.70 59.9 ▆▆▆▆▇
age promoted 0 1 42.13 10.98 21.0 33.40 42.95 50.98 59.9 ▆▅▆▇▇
hours fired 0 1 41.67 7.88 35.0 36.10 38.90 43.90 75.5 ▇▂▁▁▁
hours ok 0 1 48.05 11.65 35.0 37.70 45.60 56.10 78.2 ▇▃▃▂▁
hours promoted 0 1 59.27 12.90 35.0 51.12 60.10 70.15 79.7 ▆▅▇▇▇

Response: hours (numeric)
Explanatory: status (factor)
# A tibble: 500 x 2
   hours status  
   <dbl> <fct>   
 1  36.5 fired   
 2  55.8 ok      
 3  35   fired   
 4  52   promoted
 5  35.1 ok      
 6  36.3 ok      
 7  40.1 promoted
 8  42.7 fired   
 9  66.6 promoted
10  35.5 ok      
# … with 490 more rows
Response: hours (numeric)
Explanatory: status (factor)
Null Hypothesis: independence
# A tibble: 500 x 2
   hours status  
   <dbl> <fct>   
 1  36.5 fired   
 2  55.8 ok      
 3  35   fired   
 4  52   promoted
 5  35.1 ok      
 6  36.3 ok      
 7  40.1 promoted
 8  42.7 fired   
 9  66.6 promoted
10  35.5 ok      
# … with 490 more rows
Response: hours (numeric)
Explanatory: status (factor)
Null Hypothesis: independence
# A tibble: 500,000 x 3
# Groups:   replicate [1,000]
   hours status   replicate
   <dbl> <fct>        <int>
 1  46.2 fired            1
 2  65.1 ok               1
 3  40   fired            1
 4  48   promoted         1
 5  56.4 ok               1
 6  40.5 ok               1
 7  39.6 promoted         1
 8  59.7 fired            1
 9  56.7 promoted         1
10  35.2 ok               1
# … with 499,990 more rows

The output has 500,000 rows

# A tibble: 1,000 x 2
   replicate   stat
 *     <int>  <dbl>
 1         1 0.365 
 2         2 2.30  
 3         3 0.166 
 4         4 2.00  
 5         5 0.496 
 6         6 0.0308
 7         7 1.18  
 8         8 0.394 
 9         9 0.0437
10        10 1.23  
# … with 990 more rows

null_distribution_anova has 1000 F-stats

# A tibble: 1 x 1
   stat
  <dbl>
1  115.
# A tibble: 1 x 1
  p_value
    <dbl>
1       0
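
The ANOVA steps can be sketched the same way, with an F-statistic in place of t. The data frame name hr_anova, the object name observed_f_statistic, and the "greater" direction are assumptions, and the data are synthetic with group means loosely modeled on Table 3, not the quiz data.

```r
library(dplyr)
library(infer)

# Synthetic stand-in data (NOT the quiz data); hr_anova is a hypothetical name
set.seed(13)
status <- factor(sample(c("fired", "ok", "promoted"), 500, replace = TRUE))
group_means <- c(fired = 42, ok = 48, promoted = 59)
hr_anova <- tibble(
  status = status,
  hours  = rnorm(500, mean = group_means[as.character(status)], sd = 11)
)

# Null distribution: permute status labels, recompute the F-statistic each time
null_distribution_anova <- hr_anova %>%
  specify(response = hours, explanatory = status) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "F")

# Observed F-statistic from the unshuffled data
observed_f_statistic <- hr_anova %>%
  specify(response = hours, explanatory = status) %>%
  calculate(stat = "F")

# F is one-sided: only large values count as evidence against the null
p_value <- null_distribution_anova %>%
  get_p_value(obs_stat = observed_f_statistic, direction = "greater")
```

A simulated p-value of 0 means that none of the 1,000 permuted F-statistics reached the observed value, so it is better read as "less than 0.001" than as exactly zero; infer issues a warning to this effect.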

Is the p-value < 0.05? (yes)

Does your analysis support the null hypothesis that the true means of the number of hours worked for those that were “fired”, “ok” and “promoted” were the same? (no)
