parameterizations [39,44–50] and modifications [47,51,52] of EEM are still under development. Its accuracy is comparable to that of the QM charge calculation approach for which it was parameterized. Moreover, EEM is very fast, as its computational complexity is O(N^3), where N is the number of atoms in the molecule. Therefore, in the present study, we focus on pKa prediction using QSPR models which employ EEM charges. Specifically, we created and evaluated QSPR models based on EEM charges computed using 18 EEM parameter sets. We also compared these QSPR models with corresponding QSPR models which employ QM charges computed by the same charge calculation schemes used for EEM parameterization.

Methods

EEM parameter sets

In our study, we used all EEM parameters published to date. Specifically, we found 18 different EEM parameter sets, published in eight different articles [39,44–50]. The parameters cover two QM theory levels (HF and B3LYP), two basis sets (STO-3G and 6-31G*) and six population analyses (MPA, NPA, Hirshfeld, MK, CHELPG, AIM). However, only some combinations of QM theory level, basis set and population analysis are available. On the other hand, several parameter sets have been published for some combinations (e.g., six parameter sets for HF/STO-3G/MPA). All of the parameter sets include parameters for C, O, N and H. Some sets also include parameters for S, P, halogens and metals. Most of the sets do not include parameters for C and N bonded by a triple bond. Summary information about all these parameter sets is given in Table 1.

EEM charge calculation

a model as possible, with the risk that the accuracy of such a model may not be high. The second approach is to develop more models, each of them dedicated to a specific class of compounds.
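To make the O(N^3) cost of EEM noted above more concrete, the following minimal sketch sets up and solves the standard EEM linear system, in which the effective electronegativity of every atom, A_i + B_i q_i + kappa * sum_{j != i} q_j / R_ij, is required to equal a common molecular value while the charges sum to the total charge Q. The cubic cost comes from solving this (N+1)-dimensional system by elimination. The function names and the parameter values (A, B, kappa) are placeholders chosen for illustration only; they are not taken from any of the 18 published parameter sets, and this is not the implementation used in EEM SOLVER.

```python
import math

# Hypothetical (A, B) parameters per element and a hypothetical kappa.
# These are illustrative placeholders, NOT values from any published EEM set.
DEMO_PARAMS = {"H": (2.0, 1.0), "C": (2.5, 0.9), "N": (2.8, 1.0), "O": (3.0, 1.2)}
DEMO_KAPPA = 0.5


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting.

    The three nested loops are the source of the O(n^3) cost.
    """
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        partial = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - partial) / aug[row][row]
    return x


def eem_charges(elements, coords, params=DEMO_PARAMS, kappa=DEMO_KAPPA,
                total_charge=0.0):
    """Return (charges, molecular_electronegativity) for one molecule.

    Unknowns are the N atomic charges plus the equalized electronegativity.
    """
    n = len(elements)
    matrix = [[0.0] * (n + 1) for _ in range(n + 1)]
    rhs = [0.0] * (n + 1)
    for i in range(n):
        a_i, b_i = params[elements[i]]
        for j in range(n):
            if i == j:
                matrix[i][j] = b_i
            else:
                matrix[i][j] = kappa / math.dist(coords[i], coords[j])
        matrix[i][n] = -1.0   # moves the common electronegativity to the left side
        rhs[i] = -a_i
        matrix[n][i] = 1.0    # last row: charges sum to the total charge
    rhs[n] = total_charge
    solution = solve_linear(matrix, rhs)
    return solution[:n], solution[n]


if __name__ == "__main__":
    # A symmetric, water-like toy geometry (coordinates in arbitrary units).
    atoms = ["O", "H", "H"]
    xyz = [(0.0, 0.0, 0.0), (0.76, 0.59, 0.0), (-0.76, 0.59, 0.0)]
    charges, chi = eem_charges(atoms, xyz)
    print(charges, chi)
```

With a real parameter set, only `params` and `kappa` would change; the structure of the linear system stays the same, which is why EEM charges can be recomputed cheaply for each of several parameter sets.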
Here we took the second approach, following a methodology similar to that of previous studies [21–24]. Specifically, we focus on substituted phenols, because they are the most common test set molecules used in the evaluation of novel pKa prediction approaches [21–24,56–58]. Our data set contains the 3D structures of 74 distinct phenol molecules. This data set is of high structural diversity and covers molecules with pKa values from 0.38 to 11.1. The molecules were obtained from the NCI Open Database Compounds [59], and their 3D structures were generated by CORINA 2.6 [60], without any further geometry optimization. Our data set is a subset of the phenol data set used in our previous work on pKa prediction from QM atomic charges [24]. The subset consists of phenols which contain only C, O, N and H, and none of the molecules contain triple bonds. This limitation is necessary, because the EEM parameters of all 18 studied EEM parameter sets are available only for such molecules (see Table 1). For each phenol molecule from our data set, we also prepared the structure of the dissociated form, in which the hydrogen is missing from the phenolic OH group. This dissociated molecule was created by removing the hydrogen from the original structure without subsequent geometry optimization. The list of the molecules, including their names, NSC numbers, CAS numbers and experimental pKa values, can be found in Additional file 1: Table S1a. The SDF files with the 3D structures of the molecules and their dissociated forms are in Additional file 2: Molecules.

Data set for carboxylic acids

The EEM charges were calculated by the program EEM SOLVER [53] using each of the 18 EEM parameter sets.

QM cha.