Hypothesis Testing

Hypothesis testing using computer simulation, based on examples from the infer package. Code for Quiz 13.

Question: t-test

Read the data and assign it to hr

Note: col_types = "fddfff" defines the column types as factor-double-double-factor-factor-factor
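
As a small illustration of the compact col_types string, the sketch below writes three made-up rows to a temporary CSV and reads them back. The file contents, the column order and the name hr_demo are assumptions for illustration only; the real data file is not shown in this post.

```r
library(readr)

# Hypothetical stand-in for the course data: three invented rows written
# to a temporary CSV so the example runs on its own.
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("gender,age,hours,evaluation,salary,status",
    "female,32.5,40.1,good,level 2,ok",
    "male,45.0,52.3,bad,level 1,fired",
    "female,28.1,61.0,very good,level 3,promoted"),
  tmp
)

# One letter per column, in order: f = factor, d = double
hr_demo <- read_csv(tmp, col_types = "fddfff")

# Check the class that each column was given
sapply(hr_demo, class)
```

The string must contain exactly one letter per column, so a six-column file needs six letters.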

Table 1: Data summary
Name hr
Number of rows 500
Number of columns 6
_______________________
Column type frequency:
factor 4
numeric 2
________________________
Group variables None

Variable type: factor

skim_variable n_missing complete_rate ordered n_unique top_counts
gender 0 1 FALSE 2 fem: 253, mal: 247
evaluation 0 1 FALSE 4 bad: 148, fai: 138, goo: 122, ver: 92
salary 0 1 FALSE 6 lev: 98, lev: 87, lev: 87, lev: 86
status 0 1 FALSE 3 fir: 196, pro: 172, ok: 132

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
age 0 1 39.41 11.33 20 29.9 39.35 49.1 59.9 ▇▇▇▇▆
hours 0 1 49.68 13.24 35 38.2 45.50 58.8 79.9 ▇▃▃▂▂

The mean hours worked per week is: 49.7

Q: Is the mean number of hours worked per week 48?

Response: hours (numeric)
# A tibble: 500 x 1
   hours
   <dbl>
 1  49.6
 2  39.2
 3  63.2
 4  42.2
 5  54.7
 6  54.3
 7  37.3
 8  45.6
 9  35.1
10  53  
# … with 490 more rows

hypothesize that the average hours worked is 48

Response: hours (numeric)
Null Hypothesis: point
# A tibble: 500 x 1
   hours
   <dbl>
 1  49.6
 2  39.2
 3  63.2
 4  42.2
 5  54.7
 6  54.3
 7  37.3
 8  45.6
 9  35.1
10  53  
# … with 490 more rows
Response: hours (numeric)
Null Hypothesis: point
# A tibble: 500,000 x 2
# Groups:   replicate [1,000]
   replicate hours
       <int> <dbl>
 1         1  49.9
 2         1  37.0
 3         1  33.4
 4         1  45.9
 5         1  50.4
 6         1  36.6
 7         1  52.7
 8         1  49.6
 9         1  52.7
10         1  35.7
# … with 499,990 more rows

The output has 500,000 rows

# A tibble: 1,000 x 2
   replicate     stat
 *     <int>    <dbl>
 1         1  1.96   
 2         2  0.532  
 3         3 -1.04   
 4         4 -0.00975
 5         5  1.32   
 6         6  0.177  
 7         7  0.550  
 8         8  0.517  
 9         9  0.492  
10        10 -0.821  
# … with 990 more rows

null_t_distribution has 1000 t-stats
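
To make the row counts above concrete, here is a base R sketch of what the bootstrap step amounts to under a point null, as I understand it: the sample is shifted so that its mean equals 48, then resampled with replacement 1,000 times, and one t statistic is computed per replicate. The hours vector is simulated stand-in data, not the hr data, and this is not infer's internal code.

```r
set.seed(13)

# Made-up stand-in for hr$hours: 500 right-skewed values starting at 35
hours <- 35 + rexp(500, rate = 1 / 14.7)
mu0 <- 48     # hypothesized mean
reps <- 1000  # number of bootstrap replicates

# Shift the sample so that its mean is exactly mu0 (the null is then true)
hours_null <- hours - mean(hours) + mu0

# Stack 1,000 resamples of size 500, each drawn with replacement
resamples <- data.frame(
  replicate = rep(seq_len(reps), each = length(hours)),
  hours = as.vector(replicate(reps, sample(hours_null, replace = TRUE)))
)
nrow(resamples)  # reps * 500 rows

# One t statistic per replicate: (mean - mu0) / standard error
t_stats <- tapply(resamples$hours, resamples$replicate, function(x) {
  (mean(x) - mu0) / (sd(x) / sqrt(length(x)))
})
length(t_stats)  # one value per replicate
```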

# A tibble: 1 x 1
   stat
  <dbl>
1  2.83
# A tibble: 1 x 1
  p_value
    <dbl>
1   0.008

shade_p_value on the simulated null distribution

Is the p-value < 0.05? Yes (0.008 < 0.05)

Does your analysis support the null hypothesis that the true mean number of hours worked was 48? No, the null hypothesis is rejected at the 5% level
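
The whole one-sample workflow can be sketched with infer as follows. The data frame hr_demo is simulated stand-in data and the object names null_t, obs_t and p are made up for this example, so the numbers will not match the output above.

```r
library(infer)

set.seed(13)
# Simulated stand-in for the hr data (not the real file)
hr_demo <- data.frame(hours = 35 + rexp(500, rate = 1 / 14.7))

# Simulated null distribution of t statistics under H0: mean hours = 48
null_t <- hr_demo %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 48) %>%
  generate(reps = 1000, type = "bootstrap") %>%
  calculate(stat = "t")

# Observed t statistic from the original sample
obs_t <- hr_demo %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 48) %>%
  calculate(stat = "t")

# Two-sided p-value: share of simulated statistics at least as extreme
p <- null_t %>%
  get_p_value(obs_stat = obs_t, direction = "two-sided")
p

# Histogram of the null distribution with the p-value region shaded
visualize(null_t) +
  shade_p_value(obs_stat = obs_t, direction = "two-sided")
```

A p-value below 0.05 leads to rejecting the null hypothesis at the 5% level; otherwise the null is not rejected.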

Question: 2 sample t-test

Note: col_types = "fddfff" defines the column types as factor-double-double-factor-factor-factor

use skim to summarize the data in hr_2 by gender
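
A grouped summary of this kind can be produced by grouping before calling skim. The sketch below uses simulated stand-in data named hr_demo (an assumption, since hr_2 itself is not available here) with only two columns.

```r
library(dplyr)
library(skimr)

set.seed(13)
# Simulated stand-in for hr_2 (not the real data)
hr_demo <- data.frame(
  gender = factor(sample(c("female", "male"), 500, replace = TRUE)),
  hours  = 35 + rexp(500, rate = 1 / 14.5)
)

# Group first, then skim: each variable is summarized once per group
by_gender <- hr_demo %>%
  group_by(gender) %>%
  skim()

by_gender
```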

Table 2: Data summary
Name Piped data
Number of rows 500
Number of columns 6
_______________________
Column type frequency:
factor 3
numeric 2
________________________
Group variables gender

Variable type: factor

skim_variable gender n_missing complete_rate ordered n_unique top_counts
evaluation male 0 1 FALSE 4 bad: 79, fai: 68, goo: 61, ver: 48
evaluation female 0 1 FALSE 4 bad: 75, fai: 74, ver: 48, goo: 47
salary male 0 1 FALSE 6 lev: 49, lev: 48, lev: 48, lev: 44
salary female 0 1 FALSE 6 lev: 47, lev: 46, lev: 41, lev: 39
status male 0 1 FALSE 3 fir: 93, pro: 90, ok: 73
status female 0 1 FALSE 3 fir: 101, pro: 89, ok: 54

Variable type: numeric

skim_variable gender n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
age male 0 1 38.63 11.57 20.3 28.50 37.85 49.52 59.6 ▇▇▆▆▆
age female 0 1 41.14 11.43 20.3 31.30 41.60 50.90 59.9 ▆▅▇▇▇
hours male 0 1 49.30 13.24 35.0 37.35 46.00 59.23 79.9 ▇▃▂▂▂
hours female 0 1 49.49 13.08 35.0 37.68 45.05 58.73 78.4 ▇▃▃▂▂

Females worked an average of 49.5 hours per week

Males worked an average of 49.3 hours per week

Response: hours (numeric)
Explanatory: gender (factor)
# A tibble: 500 x 2
   hours gender
   <dbl> <fct> 
 1  78.1 male  
 2  35.1 female
 3  36.9 female
 4  38.5 male  
 5  36.1 male  
 6  78.1 female
 7  76   female
 8  35.6 female
 9  35.6 male  
10  56.8 male  
# … with 490 more rows
Response: hours (numeric)
Explanatory: gender (factor)
Null Hypothesis: independence
# A tibble: 500 x 2
   hours gender
   <dbl> <fct> 
 1  78.1 male  
 2  35.1 female
 3  36.9 female
 4  38.5 male  
 5  36.1 male  
 6  78.1 female
 7  76   female
 8  35.6 female
 9  35.6 male  
10  56.8 male  
# … with 490 more rows
Response: hours (numeric)
Explanatory: gender (factor)
Null Hypothesis: independence
# A tibble: 500,000 x 3
# Groups:   replicate [1,000]
   hours gender replicate
   <dbl> <fct>      <int>
 1  60.8 male           1
 2  36.4 female         1
 3  62.6 female         1
 4  61.9 male           1
 5  48.2 male           1
 6  35.2 female         1
 7  64.1 female         1
 8  42.9 female         1
 9  47.2 male           1
10  65.8 male           1
# … with 499,990 more rows

The output has 500,000 rows

# A tibble: 1,000 x 2
   replicate   stat
 *     <int>  <dbl>
 1         1  1.30 
 2         2  1.56 
 3         3  1.46 
 4         4  0.884
 5         5  0.145
 6         6 -0.684
 7         7 -1.20 
 8         8 -0.378
 9         9  0.642
10        10  0.288
# … with 990 more rows
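
Under the independence null, each replicate is a reshuffling of hours across the fixed gender labels. A base R sketch of that idea, using simulated stand-in data, is shown below. The p-value here uses absolute values of the statistics, which is one common convention and may differ slightly from the definition infer uses.

```r
set.seed(13)

# Simulated stand-in data (not the hr_2 data)
n <- 500
gender <- factor(sample(c("female", "male"), n, replace = TRUE))
hours <- 35 + rexp(n, rate = 1 / 14.5)

# Observed two-sample t statistic (Welch, the t.test default)
obs_t <- t.test(hours ~ gender)$statistic

# 1,000 permutations: shuffle hours, keep gender fixed, recompute t
perm_t <- replicate(1000, {
  shuffled <- sample(hours)
  t.test(shuffled ~ gender)$statistic
})

# Share of permuted statistics at least as extreme as the observed one
p_value <- mean(abs(perm_t) >= abs(obs_t))
p_value
```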

# A tibble: 1 x 1
   stat
  <dbl>
1 0.160
# A tibble: 1 x 1
  p_value
    <dbl>
1   0.862

Is the p-value < 0.05? No (0.862 > 0.05)

Does your analysis support the null hypothesis that the true mean number of hours worked by female and male employees was the same? Yes, in the sense that the data are consistent with it: we fail to reject the null hypothesis
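
A sketch of the two-sample workflow with infer follows. The data are simulated, the names hr_demo, null_t, obs_t and p are invented for this example, and the order female minus male is an assumption chosen to match the sign of the observed statistic above.

```r
library(infer)

set.seed(13)
# Simulated stand-in for hr_2 (not the real data)
hr_demo <- data.frame(
  gender = factor(sample(c("female", "male"), 500, replace = TRUE)),
  hours  = 35 + rexp(500, rate = 1 / 14.5)
)

# Null distribution: permute hours across gender 1,000 times
null_t <- hr_demo %>%
  specify(response = hours, explanatory = gender) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "t", order = c("female", "male"))

# Observed t statistic from the original data
obs_t <- hr_demo %>%
  specify(response = hours, explanatory = gender) %>%
  calculate(stat = "t", order = c("female", "male"))

# Two-sided p-value
p <- null_t %>%
  get_p_value(obs_stat = obs_t, direction = "two-sided")
p
```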

Question: ANOVA

Table 3: Data summary
Name Piped data
Number of rows 500
Number of columns 6
_______________________
Column type frequency:
factor 3
numeric 2
________________________
Group variables status

Variable type: factor

skim_variable status n_missing complete_rate ordered n_unique top_counts
gender fired 0 1 FALSE 2 fem: 96, mal: 89
gender ok 0 1 FALSE 2 fem: 77, mal: 76
gender promoted 0 1 FALSE 2 fem: 87, mal: 75
evaluation fired 0 1 FALSE 4 bad: 65, fai: 63, goo: 31, ver: 26
evaluation ok 0 1 FALSE 4 bad: 69, fai: 59, goo: 15, ver: 10
evaluation promoted 0 1 FALSE 4 ver: 63, goo: 60, fai: 20, bad: 19
salary fired 0 1 FALSE 6 lev: 41, lev: 37, lev: 32, lev: 32
salary ok 0 1 FALSE 6 lev: 40, lev: 37, lev: 29, lev: 23
salary promoted 0 1 FALSE 6 lev: 37, lev: 35, lev: 29, lev: 23

Variable type: numeric

skim_variable status n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
age fired 0 1 38.64 11.43 20.2 28.30 38.30 47.60 59.6 ▇▇▇▅▆
age ok 0 1 41.34 12.11 20.3 31.00 42.10 51.70 59.9 ▆▆▆▆▇
age promoted 0 1 42.13 10.98 21.0 33.40 42.95 50.98 59.9 ▆▅▆▇▇
hours fired 0 1 41.67 7.88 35.0 36.10 38.90 43.90 75.5 ▇▂▁▁▁
hours ok 0 1 48.05 11.65 35.0 37.70 45.60 56.10 78.2 ▇▃▃▂▁
hours promoted 0 1 59.27 12.90 35.0 51.12 60.10 70.15 79.7 ▆▅▇▇▇

Response: hours (numeric)
Explanatory: status (factor)
# A tibble: 500 x 2
   hours status  
   <dbl> <fct>   
 1  36.5 fired   
 2  55.8 ok      
 3  35   fired   
 4  52   promoted
 5  35.1 ok      
 6  36.3 ok      
 7  40.1 promoted
 8  42.7 fired   
 9  66.6 promoted
10  35.5 ok      
# … with 490 more rows
Response: hours (numeric)
Explanatory: status (factor)
Null Hypothesis: independence
# A tibble: 500 x 2
   hours status  
   <dbl> <fct>   
 1  36.5 fired   
 2  55.8 ok      
 3  35   fired   
 4  52   promoted
 5  35.1 ok      
 6  36.3 ok      
 7  40.1 promoted
 8  42.7 fired   
 9  66.6 promoted
10  35.5 ok      
# … with 490 more rows
Response: hours (numeric)
Explanatory: status (factor)
Null Hypothesis: independence
# A tibble: 500,000 x 3
# Groups:   replicate [1,000]
   hours status   replicate
   <dbl> <fct>        <int>
 1  46.2 fired            1
 2  65.1 ok               1
 3  40   fired            1
 4  48   promoted         1
 5  56.4 ok               1
 6  40.5 ok               1
 7  39.6 promoted         1
 8  59.7 fired            1
 9  56.7 promoted         1
10  35.2 ok               1
# … with 499,990 more rows

The output has 500,000 rows

# A tibble: 1,000 x 2
   replicate   stat
 *     <int>  <dbl>
 1         1 0.365 
 2         2 2.30  
 3         3 0.166 
 4         4 2.00  
 5         5 0.496 
 6         6 0.0308
 7         7 1.18  
 8         8 0.394 
 9         9 0.0437
10        10 1.23  
# … with 990 more rows

null_distribution_anova has 1000 F-stats

# A tibble: 1 x 1
   stat
  <dbl>
1  115.
# A tibble: 1 x 1
  p_value
    <dbl>
1       0

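Is the p-value < 0.05? Yes. A simulated p-value of 0 means that none of the 1,000 permuted F statistics was as large as the observed value of 115, so it is best read as p < 0.001

Does your analysis support the null hypothesis that the true means of the number of hours worked for those that were "fired", "ok" and "promoted" were the same? No, the null hypothesis is rejected at the 5% level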
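
The ANOVA workflow can be sketched in the same way, with the F statistic and a one-sided p-value. The data below are simulated, with group differences built in on purpose, and the object names are made up for this example.

```r
library(infer)

set.seed(13)
# Simulated stand-in data: group shifts are invented for illustration
status <- factor(sample(c("fired", "ok", "promoted"), 500, replace = TRUE))
shift <- c(fired = 0, ok = 6, promoted = 17)
hr_demo <- data.frame(
  status = status,
  hours  = 35 + unname(shift[as.character(status)]) + rexp(500, rate = 1 / 7)
)

# Null distribution of F statistics: permute hours across status
null_f <- hr_demo %>%
  specify(response = hours, explanatory = status) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 1000, type = "permute") %>%
  calculate(stat = "F")

# Observed F statistic
obs_f <- hr_demo %>%
  specify(response = hours, explanatory = status) %>%
  calculate(stat = "F")

# F is never negative, so only large values count against the null
p <- null_f %>%
  get_p_value(obs_stat = obs_f, direction = "greater")
p
```

When no simulated statistic reaches the observed one, infer reports 0 and warns that this is an approximation limited by the number of replicates.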