
This dataset contains responses from the General Social Survey (GSS) for the years 1976 and 1977, focusing on social status and tolerance towards minorities. Latent class models fitted to this dataset can replicate the analyses carried out in McCutcheon (1985) and Bakk et al. (2014).
The data include several covariates: year of the interview, age, sex, race, degree, and income of the respondents. The indicators of social status are the father's occupation and education level and the mother's education level. The indicators of tolerance towards minorities are derived from agreement with three related questions: (1) allowing public speaking, (2) allowing teaching, and (3) allowing literature.

Format

A data frame with 2942 rows and 14 variables:

YEAR

Interview year (1976, 1977)

COHORT

Respondent's age
levels: (1)YOUNG, (2)YOUNG-MIDDLE, (3)MIDDLE, (4)OLD

SEX

Respondent's sex
levels: (1)MALE, (2)FEMALE

RACE

Respondent's race
levels: (1)WHITE, (2)BLACK, (3)OTHER

DEGREE

Respondent's degree
levels: (1)LT HS, (2)HIGH-SCH, (3)HIGHER

REALRINC

Respondent's income

PAPRES

Father's prestige (occupation)
levels: (1)LOW, (2)MIDIUM, (3)HIGH

PADEG

Father's degree
levels: (1)LT-HS, (2)HIGH-SCH, (3)COLLEGE, (4)BACHELOR, (5)GRADUATE

MADEG

Mother's degree
levels: (1)LT-HS, (2)HIGH-SCH, (3)COLLEGE, (4)BACHELOR, (5)GRADUATE

TOLRAC

Tolerance towards racists

TOLCOM

Tolerance towards communists

TOLHOMO

Tolerance towards homosexuals

TOLATH

Tolerance towards atheists

TOLMIL

Tolerance towards militarists
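
The format described above can be checked with a short sketch like the one below. It assumes the slca package is installed and that `gss7677` can be loaded through `data()`. The `requireNamespace()` guard and the names `status_items` and `tol_items` are illustrative choices, not part of the package.

```r
# Sketch: inspect the gss7677 data, assuming the slca package is installed.
# Indicator groupings follow the description of the dataset.
status_items <- c("PAPRES", "PADEG", "MADEG")
tol_items <- c("TOLRAC", "TOLCOM", "TOLHOMO", "TOLATH", "TOLMIL")

if (requireNamespace("slca", quietly = TRUE)) {
  data("gss7677", package = "slca")

  # Dimensions should match the documented format (2942 rows, 14 variables).
  print(dim(gss7677))

  # Frequency of each response category, keeping missing values visible.
  print(lapply(gss7677[status_items], table, useNA = "ifany"))
  print(lapply(gss7677[tol_items], table, useNA = "ifany"))
}
```

Keeping `useNA = "ifany"` makes the amount of missing data in each indicator visible before any model is fitted.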

Source

General Social Survey (GSS) 1976, 1977

References

Bakk, Z., & Kuha, J. (2021). Relating latent class membership to external variables: An overview. British Journal of Mathematical and Statistical Psychology, 74(2), 340–362.

McCutcheon, A. L. (1985). A latent class analysis of tolerance for nonconformity in the American public. Public Opinion Quarterly, 49, 474–488.

Examples

library(slca)      # provides gss7677, slca(), estimate(), param(), regress()
library(magrittr)  # provides the %>% pipe
data <- gss7677[gss7677$RACE == "BLACK",]
model_stat <- slca(status(3) ~ PAPRES + PADEG + MADEG) %>%
   estimate(data = data, control = list(verbose = FALSE))
summary(model_stat)
#> Structural Latent Class Model
#> 
#> Summary of analysis
#>                                      
#>  Number of observations           287
#>  Number of manifest variables       3
#>  Number of latent class variables   1
#> 
#> 
#> Summary of model structure
#> 
#>  Latent variables (Root*):                
#>   Label: status*
#>  nclass: 3      
#> 
#>  Measurement model:                                       
#>   status -> { PAPRES, PADEG, MADEG }  a
#> 
#> 
#> Summary of manifest variables
#> 
#>  Categories for each variable:
#>          response
#>            1      2         3        4         5       
#>    PAPRES  LOW    MIDIUM    HIGH                       
#>    PADEG   LT-HS  HIGH-SCH  COLLEGE  BACHELOR  GRADUATE
#>    MADEG   LT-HS  HIGH-SCH  COLLEGE  BACHELOR  GRADUATE
#> 
#>  Frequencies for each categories:
#>          response
#>              1    2  3  4  5  <NA>
#>    PAPRES   80  114  8          85
#>    PADEG   155   27  1     2   102
#>    MADEG   183   44  4  4  1    51
#> 
#> 
#> Summary of model fit
#>                                                                                         
#>  Number of free parameters                                                            32
#>  Log-likelihood                                                                 -387.769
#>  Information criteria                                                                   
#>    Akaike (AIC)                                                                  839.537
#>    Bayesian (BIC)                                                                956.641
#>  Chi-squared Tests                                                                      
#>    Residual degree of freedom (df)                                                    42
#>    Pearson Chi-squared (X-squared) 4531545579066822527179116041846117901058120351744.000
#>      P(>Chi)                                                                       0.000
#>    Likelihood Ratio (G-squared)                                                   31.612
#>      P(>Chi)                                                                       0.879
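
The information criteria and the likelihood-ratio p-value in the summary above can be reproduced by hand from the reported log-likelihood. The sketch below uses the textbook definitions of AIC and BIC, which are assumed (not confirmed from the package source) to be the ones slca reports. The input values are copied from the printed output, so small rounding differences are expected.

```r
# Values copied from the summary(model_stat) output.
loglik <- -387.769
npar   <- 32     # number of free parameters
nobs   <- 287    # number of observations
g2     <- 31.612 # likelihood-ratio statistic (G-squared)
df     <- 42     # residual degrees of freedom

# Textbook definitions (assumed to match the reported criteria).
aic <- -2 * loglik + 2 * npar
bic <- -2 * loglik + log(nobs) * npar

# Upper-tail chi-squared probability for the likelihood-ratio statistic.
p_g2 <- pchisq(g2, df = df, lower.tail = FALSE)

round(c(AIC = aic, BIC = bic, p_G2 = p_g2), 3)
```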
param(model_stat)
#> PI :
#> (status)
#>   class
#>          1       2       3
#>     0.1220  0.7684  0.1095
#> 
#> RHO :
#> (a)
#>         class
#> response       1       2       3
#>    1(V1)  0.0000  0.4526  0.4220
#>    2      0.7896  0.5474  0.4422
#>    3      0.2104  0.0000  0.1358
#>    1(V2)  0.5849  1.0000  0.0000
#>    2      0.4151  0.0000  0.8546
#>    3      0.0000  0.0000  0.0485
#>    4      0.0000  0.0000  0.0000
#>    5      0.0000  0.0000  0.0969
#>    1(V3)  0.9152  0.8642  0.0000
#>    2      0.0000  0.1358  0.7504
#>    3      0.0000  0.0000  0.1539
#>    4      0.0848  0.0000  0.0572
#>    5      0.0000  0.0000  0.0385
#> 
#>        V1     V2    V3   
#> status PAPRES PADEG MADEG
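
The PI and RHO estimates printed above can be combined with Bayes' rule to obtain posterior class membership probabilities for a given response pattern. The sketch below does this by hand for a hypothetical respondent. The helper `posterior()` and the chosen response pattern are illustrative and are not part of slca.

```r
# Estimates copied from the param(model_stat) output (columns = classes 1-3).
pi_hat <- c(0.1220, 0.7684, 0.1095)
rho <- list(
  PAPRES = rbind(c(0.0000, 0.4526, 0.4220),
                 c(0.7896, 0.5474, 0.4422),
                 c(0.2104, 0.0000, 0.1358)),
  PADEG  = rbind(c(0.5849, 1.0000, 0.0000),
                 c(0.4151, 0.0000, 0.8546),
                 c(0.0000, 0.0000, 0.0485),
                 c(0.0000, 0.0000, 0.0000),
                 c(0.0000, 0.0000, 0.0969)),
  MADEG  = rbind(c(0.9152, 0.8642, 0.0000),
                 c(0.0000, 0.1358, 0.7504),
                 c(0.0000, 0.0000, 0.1539),
                 c(0.0848, 0.0000, 0.0572),
                 c(0.0000, 0.0000, 0.0385))
)

# Hypothetical helper: posterior class probabilities for one complete response.
posterior <- function(resp) {
  joint <- pi_hat
  for (item in names(resp)) {
    # Multiply by the class-specific probability of the observed category.
    joint <- joint * rho[[item]][resp[[item]], ]
  }
  joint / sum(joint)
}

# Hypothetical respondent: PAPRES = MIDIUM, PADEG = LT-HS, MADEG = LT-HS.
post <- posterior(c(PAPRES = 2, PADEG = 1, MADEG = 1))
round(post, 3)
```

Because the inputs are rounded to four decimals, the result is only an approximation of what the fitted model object would give.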

model_tol <- slca(tol(4) ~ TOLRAC + TOLCOM + TOLHOMO + TOLATH + TOLMIL) %>%
   estimate(data = data, control = list(verbose = FALSE))
summary(model_tol)
#> Structural Latent Class Model
#> 
#> Summary of analysis
#>                                      
#>  Number of observations           287
#>  Number of manifest variables       5
#>  Number of latent class variables   1
#> 
#> 
#> Summary of model structure
#> 
#>  Latent variables (Root*):             
#>   Label: tol*
#>  nclass: 4   
#> 
#>  Measurement model:                                                       
#>   tol -> { TOLRAC, TOLCOM, TOLHOMO, TOLATH, TOLMIL }  a
#> 
#> 
#> Summary of manifest variables
#> 
#>  Categories for each variable:
#>           response
#>             1         2         
#>    TOLRAC   TOLERANT  INTOLERANT
#>    TOLCOM   TOLERANT  INTOLERANT
#>    TOLHOMO  TOLERANT  INTOLERANT
#>    TOLATH   TOLERANT  INTOLERANT
#>    TOLMIL   TOLERANT  INTOLERANT
#> 
#>  Frequencies for each categories:
#>           response
#>               1    2  <NA>
#>    TOLRAC    64  216     7
#>    TOLCOM    83  184    20
#>    TOLHOMO  103  164    20
#>    TOLATH    67  212     8
#>    TOLMIL    74  203    10
#> 
#> 
#> Summary of model fit
#>                                            
#>  Number of free parameters               23
#>  Log-likelihood                    -644.804
#>  Information criteria                      
#>    Akaike (AIC)                    1335.609
#>    Bayesian (BIC)                  1419.777
#>  Chi-squared Tests                         
#>    Residual degree of freedom (df)        8
#>    Pearson Chi-squared (X-squared)    6.773
#>      P(>Chi)                          0.561
#>    Likelihood Ratio (G-squared)       7.099
#>      P(>Chi)                          0.526
param(model_tol)
#> PI :
#> (tol)
#>   class
#>          1       2       3       4
#>     0.1248  0.5747  0.2272  0.0733
#> 
#> RHO :
#> (a)
#>         class
#> response       1       2       3       4
#>    1(V1)  1.0000  0.0000  0.4018  0.1747
#>    2      0.0000  1.0000  0.5982  0.8253
#>    1(V2)  0.9714  0.0654  0.3509  1.0000
#>    2      0.0286  0.9346  0.6491  0.0000
#>    1(V3)  0.9003  0.1342  0.5254  0.9531
#>    2      0.0997  0.8658  0.4746  0.0469
#>    1(V4)  1.0000  0.0084  0.4540  0.1852
#>    2      0.0000  0.9916  0.5460  0.8148
#>    1(V5)  0.8657  0.1057  0.2190  0.6944
#>    2      0.1343  0.8943  0.7810  0.3056
#> 
#>     V1     V2     V3      V4     V5    
#> tol TOLRAC TOLCOM TOLHOMO TOLATH TOLMIL

model_lta <- slca(
   status(3) ~ PAPRES + PADEG + MADEG,
   tol(4) ~ TOLRAC + TOLCOM + TOLHOMO + TOLATH + TOLMIL,
   status ~ tol
) %>% estimate(data = data, control = list(verbose = FALSE))
summary(model_lta)
#> Structural Latent Class Model
#> 
#> Summary of analysis
#>                                      
#>  Number of observations           287
#>  Number of manifest variables       8
#>  Number of latent class variables   2
#> 
#> 
#> Summary of model structure
#> 
#>  Latent variables (Root*):                    
#>   Label: status* tol
#>  nclass: 3       4  
#> 
#>  Measurement model:                                                          
#>   status -> { PAPRES, PADEG, MADEG }                     a
#>   tol    -> { TOLRAC, TOLCOM, TOLHOMO, TOLATH, TOLMIL }  b
#> 
#>  Structural model:                   
#>   status -> { tol }
#> 
#>  Dependency constraints:
#>   A            
#>   status -> tol
#> 
#>  Tree of structural model:                
#>   status  -> tol
#> 
#> 
#> Summary of manifest variables
#> 
#>  Categories for each variable:
#>           response
#>             1         2           3        4         5       
#>    PAPRES   LOW       MIDIUM      HIGH                       
#>    PADEG    LT-HS     HIGH-SCH    COLLEGE  BACHELOR  GRADUATE
#>    MADEG    LT-HS     HIGH-SCH    COLLEGE  BACHELOR  GRADUATE
#>    TOLRAC   TOLERANT  INTOLERANT                             
#>    TOLCOM   TOLERANT  INTOLERANT                             
#>    TOLHOMO  TOLERANT  INTOLERANT                             
#>    TOLATH   TOLERANT  INTOLERANT                             
#>    TOLMIL   TOLERANT  INTOLERANT                             
#> 
#>  Frequencies for each categories:
#>           response
#>               1    2  3  4  5  <NA>
#>    PAPRES    80  114  8          85
#>    PADEG    155   27  1     2   102
#>    MADEG    183   44  4  4  1    51
#>    TOLRAC    64  216              7
#>    TOLCOM    83  184             20
#>    TOLHOMO  103  164             20
#>    TOLATH    67  212              8
#>    TOLMIL    74  203             10
#> 
#> 
#> Summary of model fit
#>                                                                                                                                                                           
#>  Number of free parameters                                                                                                                                              61
#>  Log-likelihood                                                                                                                                                  -1023.309
#>  Information criteria                                                                                                                                                     
#>    Akaike (AIC)                                                                                                                                                   2168.617
#>    Bayesian (BIC)                                                                                                                                                 2391.845
#>  Chi-squared Tests                                                                                                                                                        
#>    Residual degree of freedom (df)                                                                                                                                    2338
#>    Pearson Chi-squared (X-squared) 73439981827502899854237373222465416178615306877158727700413671110533637342545223540916196699758629013844421674480676704619484676096.000
#>      P(>Chi)                                                                                                                                                         0.000
#>    Likelihood Ratio (G-squared)                                                                                                                                    217.236
#>      P(>Chi)                                                                                                                                                         1.000
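
The number of free parameters and the residual degrees of freedom in the three summaries above follow from the model layout. The sketch below counts them under the usual latent class parameterization (class prevalences, transition probabilities, and item response probabilities, each set summing to one). The function `count_lcm()` is a hypothetical helper written for this illustration, not an slca function.

```r
# Count free parameters and residual df of a latent class model.
# n_class: number of classes of each latent variable (root first)
# n_cat:   list with the category counts of the items measuring each variable
# n_tau:   number of free transition parameters between latent variables
count_lcm <- function(n_class, n_cat, n_tau = 0) {
  n_pi  <- n_class[1] - 1                           # root class prevalences
  n_rho <- sum(mapply(function(k, m) k * sum(m - 1), n_class, n_cat))
  npar  <- n_pi + n_tau + n_rho
  cells <- prod(unlist(n_cat))                      # cells of the full table
  c(npar = npar, df = cells - 1 - npar)
}

status_cat <- c(3, 5, 5)  # PAPRES, PADEG, MADEG
tol_cat    <- rep(2, 5)   # five binary tolerance items

count_lcm(3, list(status_cat))  # model_stat
count_lcm(4, list(tol_cat))     # model_tol
# model_lta: tol depends on status, giving 3 * (4 - 1) free TAU parameters.
count_lcm(c(3, 4), list(status_cat, tol_cat), n_tau = 3 * (4 - 1))
```

With 75 x 32 = 2400 possible response patterns and only 287 observations, the contingency table for `model_lta` is very sparse, which is worth keeping in mind when reading its chi-squared statistics.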
param(model_lta)
#> PI :
#> (status)
#>   class
#>          1       2       3
#>     0.3302  0.1641  0.5057
#> 
#> TAU :
#> (A)
#>      parent
#> child       1       2       3
#>     1  0.0000  0.3554  0.4029
#>     2  0.2370  0.2294  0.0046
#>     3  0.2182  0.0000  0.5187
#>     4  0.5448  0.4152  0.0738
#>              
#> parent status
#> child  tol   
#> 
#> RHO :
#> (a)
#>         class
#> response       1       2       3
#>    1(V1)  0.6363  0.2849  0.2726
#>    2      0.3637  0.5246  0.7077
#>    3      0.0000  0.1905  0.0196
#>    1(V2)  1.0000  0.0000  1.0000
#>    2      0.0000  0.9000  0.0000
#>    3      0.0000  0.0333  0.0000
#>    4      0.0000  0.0000  0.0000
#>    5      0.0000  0.0667  0.0000
#>    1(V3)  0.7069  0.3148  0.9719
#>    2      0.2931  0.4839  0.0192
#>    3      0.0000  0.1015  0.0000
#>    4      0.0000  0.0745  0.0089
#>    5      0.0000  0.0254  0.0000
#> 
#>        V1     V2    V3   
#> status PAPRES PADEG MADEG
#> (b)
#>         class
#> response       1       2       3       4
#>    1(V1)  0.0001  0.9708  0.0203  0.3774
#>    2      0.9999  0.0292  0.9797  0.6226
#>    1(V2)  0.0603  0.9863  0.0380  0.5803
#>    2      0.9397  0.0137  0.9620  0.4197
#>    1(V3)  0.3006  0.9002  0.0000  0.6755
#>    2      0.6994  0.0998  1.0000  0.3245
#>    1(V4)  0.0454  1.0000  0.0146  0.3932
#>    2      0.9546  0.0000  0.9854  0.6068
#>    1(V5)  0.0000  0.8963  0.1824  0.3582
#>    2      1.0000  0.1037  0.8176  0.6418
#> 
#>     V1     V2     V3      V4     V5    
#> tol TOLRAC TOLCOM TOLHOMO TOLATH TOLMIL
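
In the output above, PI gives the prevalence of the root variable `status`, while TAU gives the distribution of `tol` conditional on `status` (each column sums to one). The marginal prevalence of the `tol` classes is therefore a weighted sum of the TAU columns. The sketch below computes it from the printed estimates. The object names are illustrative.

```r
# Estimates copied from the param(model_lta) output.
pi_status <- c(0.3302, 0.1641, 0.5057)

# Rows = tol classes (child), columns = status classes (parent).
tau <- rbind(c(0.0000, 0.3554, 0.4029),
             c(0.2370, 0.2294, 0.0046),
             c(0.2182, 0.0000, 0.5187),
             c(0.5448, 0.4152, 0.0738))

# Marginal prevalence: P(tol = c) = sum over s of P(tol = c | status = s) P(status = s).
pi_tol <- as.vector(tau %*% pi_status)
round(pi_tol, 3)
```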

regress(model_lta, status ~ SEX, data)
#> Coefficients:     
#> class  (Intercept)  SEXFEMALE
#>   1/3  -0.430       -0.192   
#>   2/3  -1.147       -0.573   
# \donttest{
regress(model_lta, status ~ SEX, data, method = "BCH")
#> Coefficients:     
#> class  (Intercept)  SEXFEMALE
#>   1/3  -0.431       -0.299   
#>   2/3  -1.154       -0.767   
regress(model_lta, status ~ SEX, data, method = "ML")
#> Coefficients:     
#> class  (Intercept)  SEXFEMALE
#>   1/3  -0.651       -0.316   
#>   2/3  -1.626       -0.834   
# }
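
The regression coefficients above can be turned into class membership probabilities for each sex. The sketch below assumes that the rows labelled `1/3` and `2/3` are log-odds of classes 1 and 2 relative to class 3 of `status`, as in a baseline-category multinomial logit. This reading of the labels is an assumption, and `class_prob()` is a hypothetical helper.

```r
# Coefficients copied from the first regress() output (default method).
coefs <- rbind("1/3" = c(intercept = -0.430, female = -0.192),
               "2/3" = c(intercept = -1.147, female = -0.573))

# Assumption: class 3 is the baseline, so its linear predictor is 0.
class_prob <- function(female) {
  eta <- c(coefs[, "intercept"] + coefs[, "female"] * female, 0)
  p <- exp(eta) / sum(exp(eta))
  setNames(p, c("class1", "class2", "class3"))
}

probs <- rbind(MALE = class_prob(0), FEMALE = class_prob(1))
round(probs, 3)
```

The same conversion can be applied to the BCH and ML coefficients to compare the three methods on the probability scale.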