This paper compares the effects of several random sampling methods on the maximum likelihood estimates of a simple logistic regression model. The study uses simulated data: logistic populations with pre-defined parameter values, generated by Monte Carlo methods. The sampling techniques are Simple Random Sampling (SRS) and six variations of Stratified Sampling, of which two are single-stage and four are choice-based (two-phase). Parameter estimates obtained under each sampling technique were compared on three performance measures: bias, standard error, and the percentage of models that are feasibly estimated. The simulation-based analysis found that choice-based sampling with proportional allocation in both phases is the best-suited sampling technique for parameter estimation of a simple logistic regression model.
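A minimal Python sketch of the kind of Monte Carlo comparison described here may help make the setup concrete. It is not the authors' code, and every specific in it is an assumption made for illustration: the function names, the standard normal covariate, the parameter values (-1, 1), the population and sample sizes, and the number of replications. The stratified design shown is a simplified single-phase stand-in that stratifies on the outcome with proportional allocation, not the two-phase designs studied in the paper. "Standard error" is taken as the empirical standard deviation of the estimates across replications, and a model counts as feasibly estimated when Newton-Raphson converges.

```python
import math
import random
import statistics


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def simulate_population(size, beta0, beta1, rng):
    # Assumed design: x ~ N(0, 1), y ~ Bernoulli(sigmoid(beta0 + beta1 * x)).
    pop = []
    for _ in range(size):
        x = rng.gauss(0.0, 1.0)
        y = 1 if rng.random() < sigmoid(beta0 + beta1 * x) else 0
        pop.append((x, y))
    return pop


def fit_logit(sample, max_iter=50, tol=1e-8):
    # Newton-Raphson MLE for a one-covariate logit model.
    # Returns (b0, b1), or None when the fit is not feasible
    # (singular information matrix, divergence, or no convergence).
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in sample:
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 1e-12:
            return None
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(b0) > 30 or abs(b1) > 30:
            return None  # diverging, e.g. under complete separation
        if max(abs(d0), abs(d1)) < tol:
            return b0, b1
    return None


def srs(pop, n, rng):
    # Simple random sampling without replacement.
    return rng.sample(pop, n)


def outcome_stratified(pop, n, rng):
    # Stratify on the outcome y; allocate n proportionally to stratum size.
    ones = [u for u in pop if u[1] == 1]
    zeros = [u for u in pop if u[1] == 0]
    n1 = round(n * len(ones) / len(pop))
    return rng.sample(ones, n1) + rng.sample(zeros, n - n1)


def evaluate(pop, sampler, true_beta, n, reps, rng):
    # Bias, empirical standard error, and percentage of feasible fits.
    fits = [fit_logit(sampler(pop, n, rng)) for _ in range(reps)]
    ok = [f for f in fits if f is not None]
    summary = {"feasible_pct": 100.0 * len(ok) / reps}
    for j, name in enumerate(("beta0", "beta1")):
        est = [f[j] for f in ok]
        summary[name + "_bias"] = statistics.mean(est) - true_beta[j]
        summary[name + "_se"] = statistics.stdev(est)
    return summary


if __name__ == "__main__":
    rng = random.Random(0)
    true_beta = (-1.0, 1.0)  # illustrative values, not from the paper
    pop = simulate_population(10000, *true_beta, rng)
    for name, sampler in (("SRS", srs), ("outcome-stratified", outcome_stratified)):
        print(name, evaluate(pop, sampler, true_beta, n=200, reps=200, rng=rng))
```

Proportional allocation is used in this sketch because it keeps the sample outcome share equal to the population share, so the unweighted intercept needs no correction; a non-proportional choice-based design would typically require an offset or weighting step that is not shown here.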