A dataset of substance use behaviors from the National Longitudinal Survey of Youth 1997 (NLSY97), measured in three years: 1998, 2003, and 2008. The dataset focuses on youth born in 1984 and tracks three substance use behaviors: tobacco/cigarette smoking, alcohol drinking, and marijuana use.

Format

A data frame with 1004 rows and 38 columns:

SEX

Respondent's sex

RACE

Respondent's race

ESMK_98, ESMK_03, ESMK_08

(Ever smoked) Ever smoked in 1998, 2003, and 2008 (0: No, 1: Yes)

FSMK_98, FSMK_03, FSMK_08

(Frequent smoker) Smoked monthly in 1998, 2003, and 2008 (0: No, 1: Yes)

DSMK_98, DSMK_03, DSMK_08

(Daily smoker) Smoked daily in 1998, 2003, and 2008 (0: No, 1: Yes)

HSMK_98, HSMK_03, HSMK_08

(Heavy smoker) Smoked 10+ cigarettes per day in 1998, 2003, and 2008 (0: No, 1: Yes)

EDRK_98, EDRK_03, EDRK_08

(Ever drank) Ever drank alcohol in 1998, 2003, and 2008 (0: No, 1: Yes)

CDRK_98, CDRK_03, CDRK_08

(Current drinker) Monthly drinking in 1998, 2003, and 2008 (0: No, 1: Yes)

WDRK_98, WDRK_03, WDRK_08

(Weekly drinker) Drank on 5+ days in a month in 1998, 2003, and 2008 (0: No, 1: Yes)

BDRK_98, BDRK_03, BDRK_08

(Binge drinker) 5+ drinks on the same day at least once in the last 30 days in 1998, 2003, and 2008 (0: No, 1: Yes)

EMRJ_98, EMRJ_03, EMRJ_08

(Ever used marijuana) Ever used marijuana in 1998, 2003, and 2008 (0: No, 1: Yes)

CMRJ_98, CMRJ_03, CMRJ_08

(Current marijuana user) Monthly marijuana use in 1998, 2003, and 2008 (0: No, 1: Yes)

OMRJ_98, OMRJ_03, OMRJ_08

(Occasional marijuana user) Used marijuana on 10+ days in a month in 1998, 2003, and 2008 (0: No, 1: Yes)

SMRJ_98, SMRJ_03, SMRJ_08

(School/work marijuana user) Marijuana use before/during school or work in 1998, 2003, and 2008 (0: No, 1: Yes)

The suffix of each variable name identifies the survey year: '98' for 1998, '03' for 2003, and '08' for 2008.
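Because every indicator name is a four-letter prefix plus a two-digit year suffix, the full set of column names can be generated rather than typed out. The following is a minimal sketch in base R; `wave_cols()` is a hypothetical helper introduced only for illustration, and the column order it produces is not assumed to match the order in the data frame.

```r
# Indicator prefixes as listed in the Format section.
prefixes <- c(
  "ESMK", "FSMK", "DSMK", "HSMK",  # smoking
  "EDRK", "CDRK", "WDRK", "BDRK",  # drinking
  "EMRJ", "CMRJ", "OMRJ", "SMRJ"   # marijuana
)
years <- c("98", "03", "08")

# Cross every prefix with every year suffix, e.g. "ESMK_98".
grid <- expand.grid(prefix = prefixes, year = years, stringsAsFactors = FALSE)
indicator_cols <- paste(grid$prefix, grid$year, sep = "_")

# Hypothetical helper (not part of any package): the 12 indicators of one wave.
wave_cols <- function(year) paste(prefixes, year, sep = "_")

# Two demographic columns plus 12 indicators x 3 years.
all_cols <- c("SEX", "RACE", indicator_cols)
length(all_cols)
```

The resulting length is 38, which agrees with the column count stated above. Assuming the dataset has been loaded, an expression such as `nlsy97[, wave_cols("98")]` would then select the 1998 indicators.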

Source

National Longitudinal Survey of Youth 1997 (NLSY97)

References

Bureau of Labor Statistics, U.S. Department of Labor. National Longitudinal Survey of Youth 1997 cohort, 1997-2017 (rounds 1-18). Produced and distributed by the Center for Human Resource Research (CHRR), The Ohio State University. Columbus, OH: 2019.

Examples

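library(slca)      # provides slca(), estimate(), param(), and the nlsy97 data
library(magrittr)  # provides the %>% pipe
# LCA: 3-class model of smoking behavior in 1998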
nlsy_smoke <- slca(smk98(3) ~ ESMK_98 + FSMK_98 + DSMK_98 + HSMK_98) %>%
   estimate(data = nlsy97, control = list(verbose = FALSE))
summary(nlsy_smoke)
#> Structural Latent Class Model
#> 
#> Summary of analysis
#>                                       
#>  Number of observations           1004
#>  Number of manifest variables        4
#>  Number of latent class variables    1
#> 
#> 
#> Summary of model structure
#> 
#>  Latent variables (Root*):               
#>   Label: smk98*
#>  nclass: 3     
#> 
#>  Measurement model:                                                    
#>   smk98 -> { ESMK_98, FSMK_98, DSMK_98, HSMK_98 }  a
#> 
#> 
#> Summary of manifest variables
#> 
#>  Categories for each variable:
#>           response
#>             1    2 
#>    ESMK_98  Yes  No
#>    FSMK_98  Yes  No
#>    DSMK_98  Yes  No
#>    HSMK_98  Yes  No
#> 
#>  Frequencies for each categories:
#>           response
#>               1    2  <NA>
#>    ESMK_98  558  446     0
#>    FSMK_98  413  591     0
#>    DSMK_98  179  825     0
#>    HSMK_98  115  889     0
#> 
#> 
#> Summary of model fit
#>                                             
#>  Number of free parameters                14
#>  Log-likelihood                    -1536.221
#>  Information criteria                       
#>    Akaike (AIC)                     3100.442
#>    Bayesian (BIC)                   3169.207
#>  Chi-squared Tests                          
#>    Residual degree of freedom (df)         1
#>    Pearson Chi-squared (X-squared)   118.589
#>      P(>Chi)                           0.000
#>    Likelihood Ratio (G-squared)      125.712
#>      P(>Chi)                           0.000

# \donttest{
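# JLCA (joint latent class analysis): three 3-class substance-specific
# variables for 1998 measured jointly under a 4-class root variable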
model_jlca <- slca(
   smk98(3) ~ ESMK_98 + FSMK_98 + DSMK_98 + HSMK_98,
   drk98(3) ~ EDRK_98 + CDRK_98 + WDRK_98 + BDRK_98,
   mrj98(3) ~ EMRJ_98 + CMRJ_98 + OMRJ_98 + SMRJ_98,
   substance(4) ~ smk98 + drk98 + mrj98
) %>% estimate(data = nlsy97, control = list(verbose = FALSE))
summary(model_jlca)
#> Structural Latent Class Model
#> 
#> Summary of analysis
#>                                       
#>  Number of observations           1004
#>  Number of manifest variables       12
#>  Number of latent class variables    4
#> 
#> 
#> Summary of model structure
#> 
#>  Latent variables (Root*):                                     
#>   Label: smk98 drk98 mrj98 substance*
#>  nclass: 3     3     3     4         
#> 
#>  Measurement model:                                                    
#>   smk98 -> { ESMK_98, FSMK_98, DSMK_98, HSMK_98 }  a
#>   drk98 -> { EDRK_98, CDRK_98, WDRK_98, BDRK_98 }  b
#>   mrj98 -> { EMRJ_98, CMRJ_98, OMRJ_98, SMRJ_98 }  c
#> 
#>  Structural model:                                      
#>   substance -> { smk98, drk98, mrj98 }
#> 
#>  Dependency constraints:
#>   A                  B                  C                 
#>   substance -> smk98 substance -> drk98 substance -> mrj98
#> 
#>  Tree of structural model:                     
#>   substance  -> smk98
#>              -> drk98
#>              -> mrj98
#> 
#> 
#> Summary of manifest variables
#> 
#>  Categories for each variable:
#>           response
#>             1    2 
#>    ESMK_98  Yes  No
#>    FSMK_98  Yes  No
#>    DSMK_98  Yes  No
#>    HSMK_98  Yes  No
#>    EDRK_98  Yes  No
#>    CDRK_98  Yes  No
#>    WDRK_98  Yes  No
#>    BDRK_98  Yes  No
#>    EMRJ_98  Yes  No
#>    CMRJ_98  Yes  No
#>    OMRJ_98  Yes  No
#>    SMRJ_98  Yes  No
#> 
#>  Frequencies for each categories:
#>           response
#>               1    2  <NA>
#>    ESMK_98  558  446     0
#>    FSMK_98  413  591     0
#>    DSMK_98  179  825     0
#>    HSMK_98  115  889     0
#>    EDRK_98  735  269     0
#>    CDRK_98  521  483     0
#>    WDRK_98  218  786     0
#>    BDRK_98  288  716     0
#>    EMRJ_98  383  621     0
#>    CMRJ_98  226  778     0
#>    OMRJ_98   92  912     0
#>    SMRJ_98   98  906     0
#> 
#> 
#> Summary of model fit
#>                                             
#>  Number of free parameters                63
#>  Log-likelihood                    -4069.738
#>  Information criteria                       
#>    Akaike (AIC)                     8265.476
#>    Bayesian (BIC)                   8574.916
#>  Chi-squared Tests                          
#>    Residual degree of freedom (df)      4032
#>    Pearson Chi-squared (X-squared)   290.581
#>      P(>Chi)                           1.000
#>    Likelihood Ratio (G-squared)      304.295
#>      P(>Chi)                           1.000
param(model_jlca)
#> PI :
#> (substance)
#>   class
#>          1       2       3       4
#>     0.2768  0.1218  0.4320  0.1693
#> 
#> TAU :
#> (A)
#>      parent
#> child       1       2       3       4
#>     1  0.3726  0.4380  0.0000  0.9863
#>     2  0.1008  0.2749  0.9330  0.0136
#>     3  0.5265  0.2871  0.0670  0.0000
#>                 
#> parent substance
#> child  smk98    
#> (B)
#>      parent
#> child       1       2       3       4
#>     1  0.0447  0.9695  0.2330  0.4482
#>     2  0.0539  0.0100  0.6512  0.1996
#>     3  0.9014  0.0206  0.1158  0.3523
#>                 
#> parent substance
#> child  drk98    
#> (C)
#>      parent
#> child       1       2       3       4
#>     1  0.5345  0.0000  0.0200  0.0000
#>     2  0.2876  0.7671  0.0000  0.2250
#>     3  0.1780  0.2329  0.9800  0.7750
#>                 
#> parent substance
#> child  mrj98    
#> 
#> RHO :
#> (a)
#>         class
#> response       1       2       3
#>    1(V1)  1.0000  0.0484  1.0000
#>    2      0.0000  0.9516  0.0000
#>    1(V2)  0.6234  0.0000  1.0000
#>    2      0.3766  1.0000  0.0000
#>    1(V3)  0.0000  0.0000  0.8504
#>    2      1.0000  1.0000  0.1496
#>    1(V4)  0.0000  0.0000  0.5463
#>    2      1.0000  1.0000  0.4537
#> 
#>       V1      V2      V3      V4     
#> smk98 ESMK_98 FSMK_98 DSMK_98 HSMK_98
#> (b)
#>         class
#> response       1       2       3
#>    1(V1)  1.0000  0.1912  1.0000
#>    2      0.0000  0.8088  0.0000
#>    1(V2)  0.5120  0.0000  1.0000
#>    2      0.4880  1.0000  0.0000
#>    1(V3)  0.0000  0.0000  0.6003
#>    2      1.0000  1.0000  0.3997
#>    1(V4)  0.0000  0.0000  0.7930
#>    2      1.0000  1.0000  0.2070
#> 
#>       V1      V2      V3      V4     
#> drk98 EDRK_98 CDRK_98 WDRK_98 BDRK_98
#> (c)
#>         class
#> response       1       2       3
#>    1(V1)  1.0000  1.0000  0.0217
#>    2      0.0000  0.0000  0.9783
#>    1(V2)  1.0000  0.3244  0.0000
#>    2      0.0000  0.6756  1.0000
#>    1(V3)  0.5851  0.0000  0.0000
#>    2      0.4149  1.0000  1.0000
#>    1(V4)  0.6233  0.0000  0.0000
#>    2      0.3767  1.0000  1.0000
#> 
#>       V1      V2      V3      V4     
#> mrj98 EMRJ_98 CMRJ_98 OMRJ_98 SMRJ_98

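# JLCPA (joint latent class profile analysis): the 1998 structure repeated
# for 2003 and 2008, with parameters constrained equal across years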
nlsy_jlcpa <- slca(
   smk98(3) ~ ESMK_98 + FSMK_98 + DSMK_98 + HSMK_98,
   drk98(3) ~ EDRK_98 + CDRK_98 + WDRK_98 + BDRK_98,
   mrj98(3) ~ EMRJ_98 + CMRJ_98 + OMRJ_98 + SMRJ_98,
   use98(5) ~ smk98 + drk98 + mrj98,
   smk03(3) ~ ESMK_03 + FSMK_03 + DSMK_03 + HSMK_03,
   drk03(3) ~ EDRK_03 + CDRK_03 + WDRK_03 + BDRK_03,
   mrj03(3) ~ EMRJ_03 + CMRJ_03 + OMRJ_03 + SMRJ_03,
   use03(5) ~ smk03 + drk03 + mrj03,
   smk08(3) ~ ESMK_08 + FSMK_08 + DSMK_08 + HSMK_08,
   drk08(3) ~ EDRK_08 + CDRK_08 + WDRK_08 + BDRK_08,
   mrj08(3) ~ EMRJ_08 + CMRJ_08 + OMRJ_08 + SMRJ_08,
   use08(5) ~ smk08 + drk08 + mrj08,
   prof(4) ~ use98 + use03 + use08,
   constraints = list(
      c("smk98", "smk03", "smk08"),
      c("drk98", "drk03", "drk08"),
      c("mrj98", "mrj03", "mrj08"),
      c("use98 ~ smk98", "use03 ~ smk03", "use08 ~ smk08"),
      c("use98 ~ drk98", "use03 ~ drk03", "use08 ~ drk08"),
      c("use98 ~ mrj98", "use03 ~ mrj03", "use08 ~ mrj08")
   )
) %>% estimate(data = nlsy97, control = list(verbose = FALSE))
# }
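The free-parameter counts, residual degrees of freedom, and information criteria printed in the two summaries above can be checked by hand. The sketch below assumes the usual counting for unconstrained latent class models with binary items; `n_rho()`, `aic()`, and `bic()` are hypothetical helpers written for this check, not functions from slca, and the log-likelihoods are copied from the printed output.

```r
# Each class has (categories - 1) free response probabilities per item.
n_rho <- function(nclass, nitem, ncat = 2) nclass * nitem * (ncat - 1)

# LCA: smk98 with 3 classes and 4 binary items.
k_lca  <- (3 - 1) + n_rho(3, 4)   # class prevalences + item-response parameters
df_lca <- 2^4 - 1 - k_lca         # response patterns - 1 - free parameters

# JLCA: 4-class root, three 3-class children, 4 binary items per child.
k_jlca <- (4 - 1) +               # PI: root class prevalences
  3 * 4 * (3 - 1) +               # TAU: 3 children x 4 parent classes x 2 free
  3 * n_rho(3, 4)                 # RHO: 3 measurement models
df_jlca <- 2^12 - 1 - k_jlca

# Information criteria from a log-likelihood, parameter count, and sample size.
aic <- function(ll, k) -2 * ll + 2 * k
bic <- function(ll, k, n) -2 * ll + k * log(n)

c(k_lca = k_lca, df_lca = df_lca, k_jlca = k_jlca, df_jlca = df_jlca)
aic(-1536.221, k_lca)
bic(-4069.738, k_jlca, 1004)
```

The counts come out to 14 and 63 free parameters with 1 and 4032 residual degrees of freedom, as in the printed summaries. The information criteria agree with the printed values up to rounding of the log-likelihood.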