Department of Chemistry, College of Basic Sciences, Shahrood Branch, Islamic Azad University, Shahrood, Iran
A robust linear quantitative structure-property relationship (QSPR) model was constructed to predict the refractivity indices of 101 organic compounds, common halo-derivatives of normal paraffins, by combining structural descriptors with the multiple linear regression (MLR) method. Theoretical molecular descriptors were selected from the original pool by stepwise feature selection. The resulting model is simple, with low standard errors and high correlation coefficients, and MLR captured the relationship between refractivity and the structural descriptors well. The accuracy of the proposed model was assessed by cross-validation, validation on an external test set, and Y-randomization. These results indicate that linear techniques such as MLR, combined with a successful variable selection procedure, can generate an efficient QSPR model for predicting the refractivity indices of different compounds. The constructed model, with high statistical significance (R² (train) = 0.926; F (train) = 240.675; R² (test) = 0.947; F (test) = 52.978; REP (%) = 1.219; Q² (LOO) = 0.914 and Q² (LGO) = 0.914), can be used to predict refractivity and to describe the parameters affecting the refractivity behavior of similar or even unknown compounds.
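The validation steps named above (MLR fit, leave-one-out cross-validation, and Y-randomization) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic descriptor matrix, and the coefficients are assumptions introduced only for illustration, the stepwise descriptor selection is omitted, and none of the numbers it produces correspond to the statistics reported in this study.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least-squares fit; returns [intercept, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Predict responses from fitted coefficients."""
    return coef[0] + X @ coef[1:]


def r_squared(y, y_hat):
    """1 - SS_res / SS_tot, with SS_tot taken about the mean of y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def q2_loo(X, y):
    """Leave-one-out Q2: each compound is predicted by a model fitted without it."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    return r_squared(y, preds)


def y_randomization(X, y, n_rounds=50, seed=0):
    """Mean training R2 after shuffling y; it should be far below the real R2."""
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_rounds):
        y_shuffled = rng.permutation(y)
        coef = fit_mlr(X, y_shuffled)
        scores.append(r_squared(y_shuffled, predict_mlr(coef, X)))
    return float(np.mean(scores))


def make_synthetic(n=80, seed=0):
    """Hypothetical data: 4 made-up descriptors and a noisy linear response."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 4))
    y = 1.4 + X @ np.array([0.5, -0.3, 0.2, 0.1]) + rng.normal(scale=0.05, size=n)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    coef = fit_mlr(X, y)
    print("R2 (train):", r_squared(y, predict_mlr(coef, X)))
    print("Q2 (LOO):", q2_loo(X, y))
    print("mean R2 after Y-randomization:", y_randomization(X, y))
```

In such a check, a Q² (LOO) close to the training R² suggests the model is not overfitted, while a low mean R² after Y-randomization suggests the original correlation is unlikely to be a chance result.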