Research Article
Open Access Peer-reviewed

Key Predictors of COVID-19 Seropositivity among Adults in New York City; Community Health Survey 2020

Apeksha Mewani, Vincent Jones II, Erin Jacques
American Journal of Public Health Research. 2025, 13(3), 103-116. DOI: 10.12691/ajphr-13-3-2
Received February 13, 2025; Revised April 29, 2025; Accepted May 08, 2025

Abstract

This study identifies common key predictors of COVID-19 seropositivity among a sample of New York City (NYC) adults by comparing regression models built with a hierarchical regression method, using the New York City Community Health Survey (NYC CHS) 2020 dataset. An exploratory approach is used to understand the social, environmental, and individual determinants of health in NYC’s population at the peak of the pandemic and their effects on COVID-19 seropositivity. Hierarchical logistic regression was carried out on a sample of 928 participants. The findings suggest that age (65-75 years), race (Black and Hispanic), and birthplace (US) were significant when only socioeconomic factors were considered. When health behaviors were included, tobacco use and physical activity became significant. In the full model, BMI, asthma prevalence, and suicidal thoughts were significantly associated with COVID-19 seropositivity. The findings are consistent with public health literature highlighting the importance of healthy behaviors and public health efforts in maintaining overall health and immunity.

1. Introduction

The novel coronavirus was first identified in Wuhan, China, in December 2019, prompting the World Health Organization to declare a global health emergency by January 30, 2020 1. By May 1, 2022, an estimated 10.5 million children worldwide had lost parents or caregivers due to COVID-19, with around 7.5 million orphaned 2. In the US, 204,000 children and teens lost caregivers 3, and 115,000 healthcare workers died from the virus 4. The impact of the pandemic continues to affect vulnerable populations, especially those who experienced sensitive life stages 5.

Epidemiologic trends showed how vulnerable groups were negatively impacted during the pandemic. Daily wage workers faced significant challenges: a study of Bangladeshi households reported severe food insecurity and considerable economic loss due to lockdowns, estimated at US$64.2 million 6. In Australia, vulnerable children and young people faced compounded health risks and barriers to accessing support 7. Occupational risk disparities likewise highlighted the plight of workers in the health, tourism, and retail industries 8.

In the US, the COVID-19 pandemic led to significant job losses, disproportionately affecting minorities 9 10. Losing a job often means losing health insurance, which creates barriers to treatment. Support programs for employment, childcare, and healthy food access are critical for vulnerable populations. Thus, socio-economic conditions and other determinants, which often influence health, significantly impacted COVID-19 morbidity and mortality, exacerbating health disparities in underserved populations 10. These factors are crucial when addressing health inequities during emergencies, when access to healthcare is essential. Strategies like increasing insurance coverage can help individuals obtain necessary services. Despite initiatives like Healthy People 2030, which aim to improve access to high-quality healthcare, 14% of US adults, particularly non-white adults and those with lower education and income, indicated reluctance to seek medical care for COVID-19 symptoms 11. Economic instability, including unemployment, exacerbates health issues and is linked to the occurrence of cardiovascular disease and diabetes, which increased the risk of contracting COVID-19 12. Stress also negatively affects immune function, especially in pregnancy, highlighting the need for a psychoneuroimmunology approach to address the stress-immune interaction.

Healthy People 2030 aims to enhance community safety and health through policy changes and interventions, including creating sidewalks and bike lanes to promote physical activity. Health literacy is crucial in preventing communicable diseases like COVID-19, helping individuals understand transmission and the importance of following health recommendations during pandemics 10. The pandemic has highlighted the interconnection between economic, social, and health status. Rezaei et al. 13 emphasize the need for accurate data on health equity and social determinants of health (SDOH) to ensure transparency and accountability in addressing these issues, particularly during crises like COVID-19. SDOH factors, including water sanitation, food security, and access to healthcare, significantly impact health outcomes. Economic strategies like direct payments and food support can help vulnerable populations cope with disparities.

Biological vulnerabilities, like pre-existing conditions and age, further influence COVID-19 outcomes, with significant disparities observed among racial and ethnic minority groups 14. The CDC reports that African Americans account for a higher share of hospitalizations than their share of the population.

Robertson et al. 14 argue that public health interventions can inadvertently increase health disparities. For example, social distancing may have disproportionately affected certain racial and ethnic groups due to their employment situations and living conditions. Upon analyzing the SDOH's impact through indices focused on social distancing ability, susceptibility to COVID-19 complications, and healthcare access, the researchers suggest that effective public health strategies must consider SDOH to prevent exacerbating existing disparities in exposure and access to treatment.

1.1. Rationale for Selecting Variables in This Paper

The NYC CHS 2020 dataset, with over 200 variables, influenced the rationale for selecting variables in this research. Existing literature on the social, environmental, and individual determinants of health highlights gaps in analyses regarding disease seropositivity and health disparities among NYC adults during the peak of the COVID-19 pandemic. The study employs exploratory analyses aligned with relevant literature to address these gaps.

The variables are categorized into four groups: general health and individual, environmental, and social determinants. The key variables include access to health insurance, medical care over the past year, prescription medication use, prevalence of hypertension, diabetes, and other such diseases, mental health indices, substance use, living arrangements, vaccination status, physical activity, and demographics. COVID-19 seropositivity is the primary outcome measured.

Individual-level factors, such as tobacco and alcohol usage, physical activity and dietary habits, are important indicators of overall health and chronic disease risk and influence long-term health trajectories. Thus, monitoring them offers insights into public health trends. Environmental factors like the locale of residence, household size, and social determinants like education and access to healthcare will help assess health inequities. Additionally, studying the prevalence of conditions like asthma and diabetes, vaccination rates and psychological distress, will elucidate population health characteristics.

COVID-19 seropositivity was chosen as the primary outcome due to the absence of severity data in the NYC CHS 2020 dataset. While seropositivity and severity are essential to measure disease burden, focusing on the former provides a more comprehensive understanding of the epidemic.

1.2. Frameworks Guiding this Research

The conceptual frameworks described below provide a robust foundation for identifying key predictors of COVID-19 seropositivity among New York City adults by integrating social, environmental, and individual determinants of health. Using the Community Health Survey (NYC CHS) 2020 dataset, the study employs a hierarchical regression method to compare various regression models to explore how these determinants interact and contribute to health outcomes. By adopting an exploratory approach, the research delves into the influences of factors such as access to healthcare, demographic characteristics, and individual health behaviors at the pandemic's peak. This link between conceptual frameworks and empirical analysis provides a comprehensive understanding of the complex influences on seropositivity, ultimately informing and improving public health interventions to address health disparities among NYC’s residents.

Solar and Irwin 15 said, “the World Health Organization Commission on Social Determinants of Health published a conceptual framework for action on the social determinants of health” in 2010.

The National Academies of Sciences, Engineering, and Medicine 16 describe the conceptual framework succinctly below:

This framework [displayed below] is divided into structural and intermediary determinants. The structural determinants comprise the societal, economic, and political context in which a person is born and lives, which dictates one's socioeconomic position. One's socioeconomic position then sets the stage for the intermediary determinants (material circumstances, psychosocial circumstances, behavioral and biological factors, and the health system itself) and the likelihood of exposure to health-compromising conditions. Illnesses caused by poor living conditions can then return to the structural determinants if, for example, a person experiences a loss of employment or income, lowering their socioeconomic status 16.

The intermediary and structural factors were incorporated into the framework to bridge the gap between social cohesion and social capital, despite the minimal political gain. This was done to develop approaches to public health and the social determinants of health that were less politically inclined, as well as systems that would promote equity, making it easier for citizens and institutions to cooperate. The effects of these structural and intermediary determinants on equity in health and well-being are shown in the figure’s final box to the right. These provide feedback into structural determinants and the positive, negative, or neutral effects on subsequent generations.

Frieden 17 developed a five-tier pyramid to improve public health (Figure 2). The interventions with the highest potential to influence social determinants of health (including reduced poverty and increased education) are at the base of the pyramid. Interventions that are labor-intensive but would benefit many people are at the following levels: fluoridated water treatments, immunization, and vaccination programs. Next come the direct clinical interventions that prevent specific conditions, including cardiovascular diseases, and significantly impact an individual’s health. Finally, health education is the most labor-intensive intervention with negligible impact on public health. Frieden 17 states that similar public health frameworks have been developed since 1994, emphasizing the effects of clinical health services and health care delivery. However, in contrast to these frameworks, Frieden’s model relies on the social determinants of health.

2. Methods

As discussed above, determinants of health can be divided into three major categories: individual characteristics and behaviors, physical environment, social and economic environment. The order of each variable’s entry into the hierarchical analysis is based on the determinants of health theory. The first model includes the socio-economic factors; the second model comprises socio-economic and health behavior factors; the third model includes socio-economic, health behavior, and built environment factors; and the full model includes all the above along with existing co-morbidities and health conditions.

The variables of interest are: access to health insurance; access to a personal care provider; the ability to receive medical care (12 months); prescription medication use; the prevalence of diseases such as hypertension, asthma, and diabetes (either type 1 or 2); Kessler-6 index score for mental health; the prevalence of non-specific psychological distress; use of mental health treatments; tobacco use; alcohol use; home ownership; rent payment delays; the location of residence; flu vaccination status; physical activity engagement (30 days); consumption of fruits and vegetables; consumption of sugar-sweetened beverages; difficulty in performing daily activities; assisted device utilization; engagement in HIV test; the incidence of interpersonal violence; the incidence of suicidal thoughts; and lastly demographics such as age, race, orientation, gender, marital status, education levels, employment status, number of adults in the household, imputed poverty level, citizenship, and BMI. The outcome variable will be COVID-19 seropositivity.

The data for this study were collected from the New York City Community Health Survey (CHS), an annual telephone survey conducted by the Bureau of Epidemiology Services of the New York City Department of Health and Mental Hygiene. This annual cross-sectional survey provides substantial insight into the health of New Yorkers, with neighborhood, borough, and citywide estimates. The CHS uses a disproportionate stratified random sample to help ensure geographic representativeness across the city 18. “Participation goals are set for 42 United Hospital Fund neighborhoods, defined by contiguous zip codes” (p. 188).

The survey recruits approximately 10,000 randomly selected adults every year from all five boroughs in New York City (Manhattan, Brooklyn, Queens, Bronx, and Staten Island). The data is collected from participants via landline and mobile phone through a computer-assisted telephone interviewing (CATI) system. The target population of the CHS includes adults in non-group quarters aged 18 and above who live in New York City and have a mobile phone or live in a household with a landline telephone. (Before 2009, the CHS only included those living in households with a landline telephone.) In addition, the survey is translated into languages other than English, like Spanish, Russian, Mandarin, and Cantonese. In 2017, Bengali and Haitian Creole were also added as language options.

The CHS COVID-19 module also collected data on children. A total of 5,305 adults were interviewed with this module from March to August 2020. These adults were also questioned about the experiences of an additional 1,472 children, so 6,777 New Yorkers were included in the sample. In 75% of homes with minors, the adult respondent was asked to answer the COVID-19 module questions on behalf of a child in the home from March through May 2020, provided the adult knew enough about the child. Participants under 18 were not included in this study. Participants for the serosurvey were recruited from the ongoing NYC CHS. From June to October 2020, 1,074 respondents completed the survey; 497 provided whole blood, and 577 provided only self-reported antibody test results 18 (p. 190).

Missing data were handled with a complete case analysis (listwise deletion): cases with missing values were discarded, and the analysis proceeded with standard methods. CID links the CHS and Population Health Survey responses to results from the NYC Serosurvey. COVID-19 data may also be linked to the Serosurvey dataset using CID. The serosurvey dataset includes the results of SARS-CoV-2 antibody tests for the subset of the adult survey population that agreed to provide blood samples. The 2020 serosurvey was linked with the Population Health Survey dataset to run analyses by demographic groups or other variables of interest.

The analysis was carried out in R using RStudio v2022, with the haven and dplyr libraries loaded in the R script. After the combined dataset was loaded, its dimensions were assessed, and the sort command was used to list the column names (variables). The combined dataset without missing cases has 5,305 observations and 202 variables. This dataset was used to describe the characteristics of the population. The filter command was then used to keep only the cases with a value for ‘sero1_result’. This removed the cases without serosurvey results, leaving a dataset of 928 observations and 202 variables. The interpersonal violence variables were combined into a single indicator using a logical OR.

Furthermore, the categorical variables were converted into factors using the mutate function. COVID-19 seropositivity was coded 0 for negative and 1 for positive. All other binary variables were coded similarly, with 0 being negative and 1 being positive.
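The preparation steps above (keeping only cases with a serosurvey result, combining indicators with a logical OR, and recoding to 0/1) can be sketched as follows. This is a minimal illustration in Python rather than the R/dplyr code used in the study. Only the name sero1_result comes from the text; the toy records, the other field names, and the prepare function are hypothetical.

```python
# Toy records standing in for survey rows. Only 'sero1_result' is named in
# the text; every other field name and value here is hypothetical.
records = [
    {"cid": 1, "sero1_result": "positive", "ipv_physical": 0, "ipv_emotional": 1},
    {"cid": 2, "sero1_result": None, "ipv_physical": 0, "ipv_emotional": 0},
    {"cid": 3, "sero1_result": "negative", "ipv_physical": 0, "ipv_emotional": 0},
    {"cid": 4, "sero1_result": "negative", "ipv_physical": 1, "ipv_emotional": 1},
]


def prepare(rows):
    """Drop rows without a serosurvey result and build 0/1 indicators."""
    prepared = []
    for row in rows:
        if row["sero1_result"] is None:
            continue  # filter step: no serosurvey result, so the case is dropped
        prepared.append({
            "cid": row["cid"],
            # outcome coding: 1 = positive, 0 = negative
            "seropositive": 1 if row["sero1_result"] == "positive" else 0,
            # logical OR across the (hypothetical) interpersonal violence items
            "ipv_any": int(bool(row["ipv_physical"] or row["ipv_emotional"])),
        })
    return prepared


analytic = prepare(records)
print(len(analytic), "rows kept")
```

In dplyr terms, the same three steps would correspond to a filter on the result column followed by a mutate that builds the indicators.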

2.1. Statistical Analysis

Logistic regression is used to analyze the effects of explanatory (independent) variables, which may be categorical, numerical, or both, on a categorical dependent variable (e.g., presence or absence). The outcome variable in this analysis, COVID-19 seropositivity (positive or negative), is binary, so logistic regression is an appropriate method.
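For reference, a binary logistic regression of this kind can be written in generic notation as follows. The symbols are not taken from the paper: p_i is assumed to denote the probability that participant i is seropositive, x_ij the value of explanatory variable j for that participant, and beta_j the corresponding coefficient, whose exponential is the odds ratio.

```latex
\operatorname{logit}(p_i) = \ln\frac{p_i}{1 - p_i}
  = \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij},
\qquad
\mathrm{OR}_j = e^{\beta_j}
```

An odds ratio above 1 indicates higher odds of seropositivity for a one-unit increase in (or the presence of) variable j, holding the other variables fixed, and an odds ratio below 1 indicates lower odds.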

The analyses investigate which variables have the most significant effects on COVID-19 seropositivity using a hierarchical logistic regression model-building method. As discussed in Section 2, determinants of health can be divided into three major categories: individual characteristics and behaviors, physical environment, and social and economic environment. The order of each variable’s entry into the hierarchical analysis is based on the determinants of health theory. The first model includes the socio-economic factors, the second model includes socio-economic and health behavior factors, the third model includes socio-economic, health behavior, and built environment factors, and the full model includes all the above and existing co-morbidities and health conditions.

The sample size for the hierarchical regression is 928. Before the models were created, the libraries required for data transformation (haven and dplyr) were installed and loaded in RStudio v2022. All categorical variables were converted into factors with the mutate function, and an analytic subset was created. The four models were then specified in advance. The logistic regression models were fitted, and each model summary and coefficient table was extracted and assessed. Each variable’s effect and significance were identified from the z statistics and p values in the regression coefficient table. Model fit was assessed using the Akaike Information Criterion (AIC) and the Hosmer-Lemeshow goodness-of-fit chi-square test.
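As a reminder of how some of the fit statistics reported for each model are defined, the sketch below computes the AIC, McFadden's pseudo R², and the likelihood-ratio chi-square from log-likelihoods. The log-likelihood values and parameter count are hypothetical placeholders, not values from the study, and the Hosmer-Lemeshow test is not reproduced here.

```python
def aic(log_lik, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_lik


def mcfadden_r2(log_lik_model, log_lik_null):
    """McFadden's pseudo R-squared: 1 - ln L(model) / ln L(intercept only)."""
    return 1 - log_lik_model / log_lik_null


def lr_chi_square(log_lik_model, log_lik_null):
    """Likelihood-ratio chi-square of the model against the intercept-only model."""
    return 2 * (log_lik_model - log_lik_null)


# Hypothetical log-likelihoods, chosen only to make the arithmetic easy to follow.
ll_null = -500.0   # intercept-only model
ll_model = -460.0  # fitted model
n_params = 11      # intercept plus ten coefficients (hypothetical)

model_aic = aic(ll_model, n_params)
model_r2 = mcfadden_r2(ll_model, ll_null)
model_chi2 = lr_chi_square(ll_model, ll_null)
print(model_aic, round(model_r2, 4), model_chi2)
```

Because the AIC penalizes each additional parameter, it is mainly useful for comparing the four nested models against one another rather than as an absolute measure of fit.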

The random forest model was used to identify influential variables affecting COVID-19 seropositivity, based on the variable importance plot. For the random forest modeling, the data were not randomly split into training and test datasets with a 7:3 ratio, since the aim was not prediction but identifying the most important variables. The “randomForest” library was installed and used for this model. First, the random forest model formula was specified and the model was fitted. Then, the variable importance plot was produced with only the top ten variables according to the random forest model. This plot ranks the variables by their influence on the model.
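The idea behind a variable importance ranking can be illustrated with permutation importance: shuffle one variable at a time and measure how much the accuracy of a fitted model drops. The sketch below is a simplified stand-in, not the computation performed by the randomForest package (which derives its importance measures internally from the fitted trees). The toy data, the fixed predict rule used in place of a fitted forest, and all names are hypothetical.

```python
import random


def accuracy(predict, rows, labels):
    """Share of rows whose predicted label matches the observed label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # rebuild each row with feature j replaced by its shuffled value
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def predict(row):
    """Hypothetical stand-in for a fitted model: uses feature 0 only."""
    return int(row[0] > 0.5)


# Toy data: feature 0 determines the label, feature 1 is unrelated noise.
rows = [(0.9, 3), (0.8, 1), (0.7, 4), (0.6, 1), (0.4, 5), (0.3, 9), (0.2, 2), (0.1, 6)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

importances = permutation_importance(predict, rows, labels, n_features=2)
ranking = sorted(range(2), key=lambda j: importances[j], reverse=True)
print(importances, ranking)
```

Shuffling the informative feature lowers accuracy, while shuffling the noise feature leaves it unchanged, so the informative feature ranks first. A variable importance plot displays this kind of ranking for the top variables.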

3. Results

In model 1 (Table 3), a logistic regression was performed to ascertain the effects of socio-economic factors such as age, race, marital status, and access to healthcare and insurance, among others, on the likelihood of testing positive or negative for COVID-19 antibodies. The logistic regression model was statistically significant, χ²(34, N = 928) = 67.24, p = 0.001. The Hosmer-Lemeshow test indicated a good fit (p = 0.20). The model explained 7.16% (McFadden R²) of the variance in seropositivity and correctly classified 79.63% of cases. Age group 65-75 (p = 0.04), Black (p = 0.01) and Hispanic (p = 0.02) race, and the US as the birthplace (p = 0.04) were statistically significant factors. Participants aged 65-75 had 0.38 times the odds of testing seropositive compared with 18-24-year-olds (OR = 0.38, 95% CI [0.15, 0.95]). Black (OR = 1.99, 95% CI [1.19, 3.3]) and Hispanic (OR = 1.82, 95% CI [1.12, 2.98]) participants had almost twice the odds of testing seropositive compared with White participants. Being born in the US was associated with lower odds of testing seropositive (OR = 0.66, 95% CI [0.44, 0.98]).
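The odds ratios and confidence intervals reported here are conventionally obtained by exponentiating a logistic regression coefficient and its Wald interval. The sketch below assumes that convention (the paper does not state how its intervals were computed) and uses it to back out an approximate coefficient and standard error from the interval reported for Black participants in model 1. The function names are hypothetical.

```python
import math


def odds_ratio_ci(beta, se, z=1.96):
    """Odds ratio and Wald 95% CI from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def log_scale_from_ci(lower, upper, z=1.96):
    """Approximate coefficient and standard error implied by a reported Wald CI."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Interval reported in the text for Black participants in model 1: [1.19, 3.3].
beta, se = log_scale_from_ci(1.19, 3.30)
odds_ratio, ci_low, ci_high = odds_ratio_ci(beta, se)
print(round(beta, 3), round(se, 3), round(odds_ratio, 2))
```

The implied odds ratio is about 1.98, which agrees with the reported 1.99 up to rounding of the published interval. A Wald interval is symmetric on the log-odds scale, which is why the reported interval looks asymmetric around the odds ratio itself.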

In model 2 (Table 2), a logistic regression was performed to ascertain the effects of socio-economic factors together with health behavior factors on the likelihood of testing positive or negative for COVID-19 antibodies. The logistic regression model was statistically significant, χ²(42, N = 928) = 87.12, p < 0.001. The Hosmer-Lemeshow test indicated a good fit (p = 0.78). The model explained 9.28% (McFadden R²) of the variance in seropositivity and correctly classified 79.63% of cases. With the inclusion of health behaviors in this model, the Black race (p = 0.04), being a tobacco smoker (p = 0.03), and exercising or engaging in physical activity (p < 0.01) were statistically significant. Smokers had 0.46 times the odds of testing seropositive compared with those who never smoked (OR = 0.46, 95% CI [0.22, 0.89]). Additionally, those who exercised recently had 0.53 times the odds of testing seropositive (OR = 0.53, 95% CI [0.35, 0.81]).

Model 3 also entails a logistic regression to ascertain the effects of socio-economic, health behavior, and built environment factors on the likelihood of testing positive or negative for COVID-19 antibodies. The logistic regression model was statistically significant, χ²(47, N = 928) = 92.62, p < 0.001 (Table 5). The Hosmer-Lemeshow test indicated a good fit (p = 0.42). The model explained 9.87% (McFadden R²) of the variance in seropositivity and correctly classified 81.14% of cases. With the inclusion of built environment factors such as borough of residence and number of adults in the household, the same variables remained statistically significant: Black race (p = 0.04), being a current tobacco smoker (p = 0.03), and exercising or engaging in physical activity (p < 0.01).

In the last model (Table 6), which is the complete model, a logistic regression was performed to ascertain the effects of socio-economic, health behavior, and built environment factors, together with comorbidities and health conditions, on the likelihood of testing positive or negative for COVID-19 antibodies. The logistic regression model was statistically significant, χ²(61, N = 928) = 116.78, p < 0.001. The Hosmer-Lemeshow test indicated a good fit (p = 0.90). The model explained 12.45% (McFadden R²) of the variance in seropositivity and correctly classified 80.60% of cases. With the inclusion of existing comorbidities and health conditions in this full model, exercising or engaging in physical activity (p < 0.01), having asthma currently or in the past (p = 0.04), BMI (p = 0.01), and experiencing suicidal thoughts recently (p = 0.04) were found to be statistically significant factors. Those who have or have had asthma had 0.57 times the odds of testing seropositive (OR = 0.57, 95% CI [0.33, 0.96]). Each one-point increase in BMI was associated with 4% higher odds of testing seropositive (OR = 1.04, 95% CI [1.01, 1.08]). Individuals who experienced suicidal thoughts recently had 2.65 times the odds of testing seropositive (OR = 2.65, 95% CI [1.02, 6.55]).

After controlling for demographic and socio-economic characteristics, introducing health behavior characteristics in model 2 increased the explained variance by 2.12 percentage points, with both physical activity and smoking behavior reaching significance. The predictive power of exercise behavior was higher, indicating that less exercise was significantly associated with greater odds of seropositivity. Adding the built environment factors in model 3, alongside the socio-economic and health behavior characteristics, raised the total explained variance by only 0.59 percentage points. In model 4, with the addition of comorbidities, the explained variance was 12.45%, an increase of 2.58 percentage points. The comparison of all models and the change in significance of variables are presented in Table 3. Goodness of fit was tested for each regression model using the Hosmer-Lemeshow test. The results indicate that all four models fit the data well and that regression models with these variables were appropriate for predicting the outcome.
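The increments quoted above can be checked directly against the figures reported for each model. The sketch below recomputes the change in McFadden R², in chi-square, and in degrees of freedom between successive models. It also back-calculates the intercept-only log-likelihood implied by each model, under the assumption that each reported chi-square is a likelihood-ratio statistic against the intercept-only model. That assumption and the helper name are not from the paper.

```python
# Values as reported in the text: (model, chi-square, df, McFadden R-squared).
reported = [
    ("model 1", 67.24, 34, 0.0716),
    ("model 2", 87.12, 42, 0.0928),
    ("model 3", 92.62, 47, 0.0987),
    ("model 4", 116.78, 61, 0.1245),
]


def implied_null_log_lik(chi_square, r2):
    """Null log-likelihood implied by chi2 = 2(LL_m - LL_0) and R2 = 1 - LL_m / LL_0."""
    return -chi_square / (2 * r2)


# If the assumption holds, every model should imply the same null log-likelihood.
implied = [implied_null_log_lik(chi2, r2) for _, chi2, _, r2 in reported]

# Change between successive nested models (each step adds one block of variables).
steps = []
for prev, curr in zip(reported, reported[1:]):
    steps.append({
        "block": prev[0] + " -> " + curr[0],
        "delta_r2_points": round((curr[3] - prev[3]) * 100, 2),
        "delta_chi_square": round(curr[1] - prev[1], 2),
        "delta_df": curr[2] - prev[2],
    })

for value in implied:
    print(round(value, 1))
for step in steps:
    print(step)
```

All four models imply a null log-likelihood of roughly -469, so the reported chi-square and R² values are mutually consistent, and the R² increments match the 2.12, 0.59, and 2.58 percentage points given in the text. Under the same assumption, each chi-square difference is the likelihood-ratio statistic for the block of variables added at that step, with the difference in degrees of freedom as its reference distribution.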

When using the random forest method, the variable importance plot (Figure 3) suggested ten variables that had the highest effect on COVID-19 seropositivity: BMI, Kessler-6 index score for mental health, consumption of fruits and vegetables, imputed poverty level, age group, race, the borough of residence, marital status, and the number of adults in the household.

4. Discussion

The statistical analyses, which took the exploratory route, entailed creating four logistic regression models in a hierarchical regression method to gain a comprehensive understanding of how such varied determinants of health affect COVID-19 seropositivity in the New York City population.

The complete model found significant relationships between COVID-19 seropositivity and asthma, BMI, physical activity, and the incidence of suicidal thoughts. Those who have or have had asthma had 0.57 times the odds of testing seropositive (OR = 0.57, 95% CI [0.33, 0.96]). Each one-point increase in BMI was associated with 4% higher odds of testing seropositive (OR = 1.04, 95% CI [1.01, 1.08]). Individuals who experienced suicidal thoughts had 2.65 times the odds of testing seropositive (OR = 2.65, 95% CI [1.02, 6.55]). Exercise is a protective factor, whereas suicidal thoughts, high BMI, and a higher number of adults in a household are found to be risk factors; asthma, with an odds ratio below 1, was associated with lower odds of seropositivity in this sample.

The finding that physical activity is associated with lower COVID-19 seropositivity in this NYC sample is consistent with public health literature highlighting the importance of regular exercise in maintaining overall health and immunity. A recent study analyzed data from 48,440 adults in the UK and found that those who were physically active before the pandemic were less likely to be hospitalized or die from COVID-19, as physical activity may improve immune function and reduce the severity of symptoms 19. Similarly, the association between BMI and COVID-19 seropositivity is consistent with known risk factors, including obesity, that lead to severe outcomes 20. Asthma is generally a risk factor for severe respiratory infections, and Ko et al. 8 suggest that asthma patients contracting SARS-CoV-2 are more likely to have severe outcomes and require intensive care. In this sample, however, asthma was associated with lower odds of seropositivity. Individuals with asthma may be more likely to practice social distancing and other protective measures, which could explain this association. We also know from the literature that the odds of COVID-19 seropositivity were significantly higher among individuals living in households with more than four people than among those living in smaller households. This finding holds true in this NYC sample as well. The inferences are generally consistent with the literature, which identified several risk factors for COVID-19, including age, underlying health conditions, and social determinants of health 10.

The finding that Black individuals have a higher risk of COVID-19 seropositivity in models 1, 2, and 3 is consistent with the public health literature. The group has been disproportionately affected by COVID-19 in many settings, including New York City in 2020 10. Concurrently, the associations between BMI, physical activity, difficulty in doing daily activities, and the prevalence of asthma with COVID-19 seropositivity are consistent with previous research on the risk factors 20. Lippi et al. 21 found that smoking is associated with an increased risk of contracting COVID-19, as well as more severe outcomes and a higher mortality rate. Including the incidence of suicidal behaviors as a predictor of COVID-19 seropositivity is novel and warrants further investigation.

The Hosmer-Lemeshow test results indicate good model fit, and the variable selection across the different models provides some confidence in the results. The high p-values for the Hosmer-Lemeshow tests suggest a good fit for all models, and the AIC value indicates that this model is a good fit for the data. The results from this method provide further evidence for the several known risk factors for COVID-19 seropositivity and highlight the potential of machine learning methods in identifying complex and nonlinear relationships between predictors and outcomes.

The backward and forward stepwise regression models were deemed to have the best performance. While the backward stepwise model had slightly lower AIC than the forward stepwise model, the forward stepwise model had a higher p-value in the Hosmer-Lemeshow test and, thus, showed lower error between the predicted model and the actual values.

However, a model's performance may vary depending on the specific research question and dataset. Using multiple regression models with different variable selection techniques allowed for a comprehensive examination of the data and helped identify consistent predictors of COVID-19 seropositivity.

4.1. Implications

To date, no exploratory study has been conducted that comprehensively examines the impact of various determinants of health on COVID-19 seropositivity in the NYC population. Specifically, there is minimal research investigating the effects of approximately 40 types of determinants of health on the prevalence of COVID-19 antibodies in this population. This analysis helps us gain a more nuanced understanding of the determinants of health and their effects during the biggest pandemic we have lived through. A comprehensive view of the factors shared among the various regression models (the prevalence of asthma, BMI, physical activity levels, the incidence of suicidal thoughts, and tobacco use) provides insights into the key factors that drive transmission and spread of the virus and should guide the development of targeted interventions and policies to improve health outcomes in vulnerable populations. For instance, policies could be implemented to provide targeted support and resources for individuals and households with a high prevalence of asthma or larger household sizes, such as access to medication, home-based oxygen therapy, or financial assistance for housing.

Public health policies can promote healthy lifestyles and behaviors by providing education and resources to improve physical activity levels and reduce tobacco use. Such policies could include initiatives that increase access to affordable nutritious food options, create safe spaces for physical activity, and promote smoking cessation programs. Furthermore, policies could be developed to address the incidence of suicidal thoughts, encourage access to mental health services, and improve HIV testing rates to reduce the burden of comorbidities associated with COVID-19. Identifying critical predictors of COVID-19 seropositivity in an exploratory study offers valuable insights into the potential focus areas for public health policies and interventions to reduce disparities and improve health outcomes during future pandemics.

Identifying these critical predictors of COVID-19 seropositivity also offers opportunities to generate new research questions and explore the relationships between these factors and COVID-19 outcomes. Some possible future research questions and ideas include: A) Can interventions targeting the prevention and management of asthma help reduce the risk of COVID-19 infection and severity in such cases? B) What is the relationship between suicidal thoughts and COVID-19 outcomes, and what interventions could be implemented to support mental health and reduce the risk of such thoughts during a pandemic? C) What factors contribute to lower rates of HIV testing among specific populations, and what interventions are needed to increase testing rates and reduce the burden of comorbidities associated with the two diseases? By exploring these research questions and ideas, we can better understand the complex relationships between various determinants of health and COVID-19 outcomes, and thus develop effective public health interventions to reduce the burden of the pandemic on vulnerable populations.

It is also important to note the novel methodology used in this study. Variable importance plots generated by the random forest method identify the predictors that contribute most to prediction accuracy and can be used to validate findings in public health research. Comparing these results with those from logistic regression can help identify discrepancies or biases. This approach provides a comprehensive understanding of the factors contributing to negative and positive health outcomes and helps prioritize interventions and policies to improve population health. Overall, comparing results from the random forest and logistic regression methods using variable importance plots can contribute to public health by providing new insights and more accurate predictions of health outcomes.
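One simple way to quantify how closely the two methods agree is to rank the predictors under each method and compute a rank correlation. The sketch below is only an illustration and is not the procedure used in this study: the predictor names and scores are hypothetical placeholders rather than study results, and the helper names (`ranks`, `spearman_rho`, `rank_gaps`) are invented for this example. It assumes that random forest importance scores and absolute logistic regression coefficients on standardized predictors have already been obtained and contain no tied values.

```python
# Hypothetical random forest importance scores; NOT results from this study.
rf_importance = {"x1": 0.30, "x2": 0.25, "x3": 0.20, "x4": 0.15, "x5": 0.10}
# Hypothetical absolute logistic regression coefficients (standardized predictors).
logit_abs_coef = {"x1": 0.80, "x2": 0.35, "x3": 0.60, "x4": 0.05, "x5": 0.20}


def ranks(scores):
    """Rank predictors from most (1) to least important; assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


def spearman_rho(scores_a, scores_b):
    """Spearman rank correlation for two score dicts with identical keys and no ties."""
    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    n = len(rank_a)
    d_squared = sum((rank_a[k] - rank_b[k]) ** 2 for k in rank_a)
    return 1 - 6 * d_squared / (n * (n**2 - 1))


def rank_gaps(scores_a, scores_b):
    """Predictors ranked differently by the two methods, largest gap first."""
    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    gaps = {k: abs(rank_a[k] - rank_b[k]) for k in rank_a}
    return sorted((k for k in gaps if gaps[k] > 0), key=lambda k: -gaps[k])


rho = spearman_rho(rf_importance, logit_abs_coef)
discordant = rank_gaps(rf_importance, logit_abs_coef)
print(f"Spearman rho = {rho:.2f}; predictors ranked differently: {discordant}")
```

A value of rho near 1 would indicate that both methods order the predictors similarly, while the list of discordant predictors points to variables worth inspecting for nonlinear effects or interactions that a logistic model may not capture.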

4.2. Limitations and Future Directions

This study has several limitations, some of which stem from the self-reported nature of the NYC Community Health Survey (NYC CHS). The cross-sectional design inherently limits the ability to establish causal relationships between the identified predictors and COVID-19 seropositivity. The data represent a single time point, making it difficult to assess changes in exposure, health behaviors, or seropositivity status over time. For example, the 2020 survey did not adequately capture the evolving nature of COVID-19 and measured only seropositivity rather than severity, which was a significant limitation.

Additionally, the reliance on self-reported data introduces potential biases, including recall bias and social desirability bias, which may lead to misclassification or underreporting of key health behaviors and conditions. Misinterpretation of survey questions and incomplete responses further reduce data accuracy. These factors could influence the observed associations, leading to overestimation or underestimation of the strength of relationships between socioeconomic, behavioral, and health-related factors and seropositivity outcomes. Future studies using longitudinal designs and objective health measures would be valuable in confirming these findings and clarifying the directionality of these associations.
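The direction of this bias depends on whether reporting errors differ between seropositive and seronegative respondents. The following sketch is a worked illustration only: the counts, sensitivities, and specificities are hypothetical and are not drawn from the NYC CHS, and the function name `observed_odds_ratio` is invented for this example. It computes the odds ratio that would be expected when a binary self-reported exposure is misreported with a given sensitivity and specificity in each group.

```python
def observed_odds_ratio(table, se_pos, sp_pos, se_neg, sp_neg):
    """Expected odds ratio after a binary exposure is misreported.

    table = (a, b, c, d): truly exposed seropositives, unexposed seropositives,
    truly exposed seronegatives, unexposed seronegatives.
    se_* and sp_* are the sensitivity and specificity of the self-report
    among seropositive (pos) and seronegative (neg) respondents.
    """
    a, b, c, d = table
    # Reported "exposed" = correctly reported exposed + falsely reported exposed.
    a_obs = a * se_pos + b * (1 - sp_pos)
    b_obs = a * (1 - se_pos) + b * sp_pos
    c_obs = c * se_neg + d * (1 - sp_neg)
    d_obs = c * (1 - se_neg) + d * sp_neg
    return (a_obs * d_obs) / (b_obs * c_obs)


# Hypothetical true counts; NOT data from the NYC CHS.
true_table = (100, 100, 50, 150)
true_or = (true_table[0] * true_table[3]) / (true_table[1] * true_table[2])

# Nondifferential error: both groups report with the same accuracy.
nondifferential_or = observed_odds_ratio(true_table, 0.8, 0.9, 0.8, 0.9)

# Differential error: seronegative respondents underreport the exposure more.
differential_or = observed_odds_ratio(true_table, 0.95, 0.9, 0.6, 0.9)

print(f"true OR = {true_or:.2f}")
print(f"OR with nondifferential error = {nondifferential_or:.2f}")
print(f"OR with differential error = {differential_or:.2f}")
```

Under these assumed inputs, equal reporting accuracy in both groups pulls the expected odds ratio toward 1, whereas unequal accuracy pushes it above the true value; other combinations of error rates could push it in the opposite direction.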

While this study provides valuable insights into COVID-19 seropositivity predictors among NYC adults, its generalizability to the broader NYC population and similar urban settings should be considered cautiously. The NYC CHS employs a stratified random-digit-dialing method, which may underrepresent certain demographic groups, particularly individuals without stable housing, those with limited access to telecommunications, and non-English speakers who may face language barriers despite survey translation efforts. Additionally, the study sample may not fully capture the diversity of experiences across all boroughs, particularly in neighborhoods with historically lower survey response rates. Although the findings align with broader public health research on health disparities, caution is warranted when extrapolating results to populations outside NYC, as regional differences in healthcare access, socioeconomic conditions, and public health policies may influence seropositivity trends. Future research incorporating more diverse sampling methods and longitudinal designs would enhance the applicability of these findings to other urban populations.

The random-digit-dialing methodology of the NYC CHS could introduce sampling bias, and participation in the survey is voluntary. Assessments of disease severity can also vary significantly among individuals, which complicates data standardization. The overrepresentation of white males in the NYC CHS sample may have skewed results and affected the accuracy and generalizability of findings related to health behaviors. The presence or absence of specific demographic groups may have influenced the patterns observed in the data.

Future research should address this study's limitations by incorporating longitudinal designs that track individuals over time to better establish causal relationships between socioeconomic, behavioral, and health-related factors and disease seropositivity. Longitudinal studies can help determine whether specific risk factors precede infection or result from broader health and social conditions. Additionally, integrating objective health data, such as clinical records or biomarker assessments, could reduce reliance on self-reported measures and improve data accuracy. Expanding research to include more comprehensive data on disease severity, rather than seropositivity alone, would provide a clearer picture of the health impact of a future pandemic across different demographic and socioeconomic groups. Finally, conducting similar studies in diverse urban and rural settings could enhance the generalizability of findings and inform targeted public health interventions.

ACKNOWLEDGEMENTS

We thank the Bureau of Epidemiology and DOHMH in NYC for their support with data access.

References

[1] CDC, 2020.
[2] Bellandi, D., “Estimate: 10.5 Million Children Lost a Parent, Caregiver to COVID-19,” JAMA, 328(15), 1490, 2022.
[3] Rady Children's Hospital. COVID collaborative for children [Internet]. San Diego: Rady Children's Hospital. [Cited 2023 March 17]. Available from: https://www.rchsd.org/health-safety/health-alerts/covid-collaborative-for-children/.
[4] ICN. ICN says 115,000 healthcare worker deaths from COVID-19 exposes collective failure of leaders to protect global workforce [Internet]. Place unknown; Oct 2021. [Cited 2023 March 17]. Available from: https://www.icn.ch/news/icn-says-115000-healthcare-worker-deaths-covid-19-exposes-collective-failure-leaders-protect.
[5] Coker, T.R., Cheng, T.L., and Ybarra, M., “Addressing the Long-term Effects of the COVID-19 Pandemic on Children and Families: A Report from the National Academies of Sciences, Engineering, and Medicine,” JAMA, March 2023. [Online].
[6] Mottaleb, K.A., Mainuddin, M., and Sonobe, T., “Covid-19 induced economic loss and ensuring food security for vulnerable groups: Policy implications from Bangladesh,” PLOS ONE, 15(10), Oct. 2020.
[7] Jones, B., Woolfenden, S., Pengilly, S., Breen, C., Cohn, R., Biviano, L., Johns, A., Worth, A., Lamb, R., Lingam, R., Silove, N., Marks, S., Tzioumi, D., and Zwi, K., “COVID‐19 pandemic: The impact on vulnerable children and young people in Australia,” Journal of Paediatrics and Child Health, 56(12), 1851–1855, Sep. 2020.
[8] Koh, D., “Occupational risks for COVID-19 infection,” Occupational Medicine, 70(1), 3–5, Jan. 2020.
[9] Couch, K.A., Fairlie, R.W., and Xu, H., “Early evidence of the impacts of COVID-19 on minority unemployment,” Journal of Public Economics, 192, 104287, 2020.
[10] Singu, S., Acharya, A., Challagundla, K., and Byrareddy, S.N., “Impact of social determinants of health on the emerging COVID-19 pandemic in the United States,” Frontiers in Public Health, 8. [Online]. Available from: https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2020.00406.
[11] Witters, D. In U.S., 14% with likely COVID-19 to avoid care due to cost [Internet]. Gallup; April 28, 2020. [Cited 2022 November 29]. Available from: https://news.gallup.com/poll/309224/avoid-care-likely-covid-due-cost.aspx.
[12] Abrams, E.M., and Szefler, S.J., “COVID-19 and the impact of Social Determinants of Health,” The Lancet Respiratory Medicine, 8(7), 659–661, Jul. 2020.
[13] Rezaei, N., Moghaddam, S.S., Farzadfar, F., and Larijani, B., “Social Determinants of health inequity in Iran: A narrative review,” Journal of Diabetes and Metabolic Disorders, 22, 5–12, Jun. 2023.
[14] Robertson, M.M., Shamsunder, M.G., Brazier, E., Mantravadi, M., Zimba, R., Rane, M.S., Westmoreland, D.A., Parcesepe, A.M., Maroko, A.R., Kulkarni, S.G., Grov, C., and Nash, D., “Racial/ethnic disparities in exposure, disease susceptibility, and clinical outcomes during COVID-19 pandemic in National Cohort of adults, United States,” Emerging Infectious Diseases, 28(11), 2171–2180, Nov. 2022.
[15] Solar, O., and Irwin, A., “A conceptual framework for action on the social determinants of health. Social determinants of health discussion paper 2 (policy and practice),” Geneva, Switzerland: WHO; 2010. [Cited 2016 September 22]. Available from: http://www.who.int/sdhconference/resources/ConceptualframeworkforactiononSDH_eng.pdf.
[16] Committee on Educating Health Professionals to Address the Social Determinants of Health, A framework for educating health professionals to address the social determinants of health. The National Academies Press: National Academies of Sciences, Engineering, and Medicine, Institute of Medicine, Board on Global Health. Oct. 14, 2016.
[17] Frieden, T.R., “A framework for public health action: The health impact pyramid,” American Journal of Public Health, 100(4), 590–595, Apr. 2010.
[18] Parrott, J.C., Maleki, A.N., Vassor, V.E., Osahan, S., Hsin, Y., Sanderson, M., Fernandez, S., Levanon Seligson, A., Hughes, S., Wu, J., DeVito, A.K., LaVoie, S.P., Rakeman, J.L., Gould, L.H., and Alroy, K.A., “Prevalence of SARS-COV-2 antibodies in New York City adults, June–October 2020: A population-based survey,” The Journal of Infectious Diseases, 224(2), 188–195, Jul. 2021.
[19] Sallis, J.F., Adlakha, D., Oyeyemi, A., Salvo, D., and Anjana, R.M., “Adapting to the COVID-19 Pandemic: Physical Activity, Situational Awareness, and Resilience,” Translational Journal of the American College of Sports Medicine, 6(12), 251–258, 2021.
[20] Mänty, M., Heinonen, O.J., Viljanen, A., Pajala, S., Koskenvuo, M., Kaprio, J., and Rantanen, T., “Body mass index and disability in adulthood: A 20-year panel study,” American Journal of Public Health, 101(2), 260–266, 2011.
[21] Lippi, G., Henry, B.M., and Sanchis-Gomar, F., “Active smoking and COVID-19: a double-edged sword,” European Journal of Internal Medicine, 77, 123–124, Jul. 2020.

Published with license by Science and Education Publishing, Copyright © 2025 Apeksha Mewani, Vincent Jones II and Erin Jacques

This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

Cite this article:

Normal Style
Apeksha Mewani, Vincent Jones II, Erin Jacques. Key Predictors of COVID-19 Seropositivity among Adults in New York City; Community Health Survey 2020. American Journal of Public Health Research. Vol. 13, No. 3, 2025, pp 103-116. https://pubs.sciepub.com/ajphr/13/3/2
  • Table 3. Results of Logistic Regression Model with Socioeconomic, Health Behavior, and Built Environment Factors