Abstract
Background. Dapsone (DAP) is an anti-inflammatory and antimicrobial active pharmaceutical ingredient used to treat, e.g., AIDS-related diseases. However, low solubility is a feature hampering its efficient use.
Objectives. First, deep eutectic solvents (DES) were used as solubilizing agents for DAP as an alternative to traditional solvents. Second, intermolecular interactions in the systems were described and quantified. Finally, the solubility prediction model, previously created using the machine learning protocol, was extended and improved using new data obtained for eutectic systems.
Material and methods. New DES were created by blending choline chloride (ChCl) with 6 selected polyols. The solubility of DAP in these solvents was measured spectrophotometrically. The impact of water dilution on the solubility curve was investigated. Experimental research was enriched with theoretical interpretations of intermolecular interactions, identifying the most probable pairs in the systems. Dapsone self-association and its ability to interact with components of the analyzed systems were considered. Thermodynamic characteristics of pairs were utilized as molecular descriptors in the machine learning process, predicting solubility in both traditional organic solvents and the newly designed DES.
Results. The newly formulated solvents demonstrated significantly higher efficiency compared to traditional organic solvents, and a small addition of water increased solubility, indicating its role as a co-solvent. The interpretation of the mechanism of DAP solubility highlighted the competitive nature of self-association and pair formation. Thermodynamic parameters characterizing affinity were instrumental in developing an efficient model for theoretical screening across diverse solvent classes. The study emphasized the necessity of retraining models when introducing new experimental data, as exemplified by enriching the model with data from DES.
Conclusions. The research showcased the efficacy of developing new DES for enhancing solubility and creating environmentally and pharmaceutically viable systems, using DAP as an example. Molecular interactions proved valuable in understanding solubility mechanisms and formulating predictive models through machine learning processes.
Keywords: solubility, dapsone, machine learning, intermolecular interactions, deep eutectic solvents
Summary
Background. Dapsone is an active substance with anti-inflammatory and antibacterial activity, used, among others, in the treatment of AIDS-related conditions. However, its low solubility hampers its efficient use.
Objectives. First, deep eutectic solvents (DES) were used to dissolve dapsone as an alternative to traditional solvents. Second, intermolecular interactions in these systems were described and calculated. Finally, the solubility prediction model, previously developed using a machine learning protocol, was extended and improved by including solvents from the group of deep eutectics.
Material and methods. New solvents belonging to the group of deep eutectics (DES) were prepared by mixing choline chloride with 1 of 6 selected polyols. The solubility of dapsone in each of the designed solvents was measured spectrophotometrically. The effect of dilution with water on solubility was examined. The experimental work was enriched with a theoretical interpretation of intermolecular interactions, and the most probable pairs were identified. Both the self-association of dapsone and its ability to interact with the components of the analyzed systems were taken into account. Thermodynamic characteristics of the pairs were used as molecular descriptors in the machine learning process. The resulting model was used to predict solubility both in traditional organic solvents and in the new deep eutectics.
Results. The new solvents were confirmed to be considerably more effective than traditional organic solvents, and a small addition of water increased solubility, which indicates its role as a co-solvent. The interpretation of the mechanism of dapsone solubility revealed the competitive nature of self-association and pair formation with the components of the analyzed solutions. Thermodynamic parameters characterizing affinity were used in the machine learning process, and a highly effective model was developed that allows theoretical screening across a very broad domain of solvent classes. Attention was drawn to the necessity of retraining the model when the pool of experimental data is extended with systems having new features, as exemplified by enriching the model with new values for DES.
Conclusions. Using dapsone as an example, it was documented that designing new deep eutectic solvents is a highly effective way of increasing solubility, leading to environmentally and pharmaceutically acceptable systems. Intermolecular interactions are a valuable source of information that enables not only an understanding of the solubility mechanism but also the formulation of theoretical models based on machine learning.
Keywords: dapsone, solubility, machine learning, intermolecular interactions, deep eutectic solvents.
Background
Dapsone (IUPAC name: 4,4’-diaminodiphenyl sulfone, abbreviated here as DAP) is a synthetic sulfone known for its anti-inflammatory and antimicrobial properties.1 Both systemic and topical forms of DAP are used in medicine, and the diseases addressed by this drug include dermatitis herpetiformis, leprosy, acne, malaria, and conditions related to AIDS.2, 3, 4, 5 Dapsone exerts its antibacterial activity by competitively inhibiting dihydropteroate synthetase, thereby impeding the biosynthesis of folic acid. This disruption hampers the production of nucleic acids, which are crucial for the survival and multiplication of the affected bacteria.2, 6 Dapsone’s ability to reduce inflammation is associated with its modulation of cytokine production.7, 8 Moreover, its ability to bind to NADPH oxidase inhibits the production of reactive oxygen species (ROS) and superoxide radicals, which accounts for its antioxidant capacity.9 Dapsone is metabolized in the liver via acetylation and hydroxylation, and excreted primarily in urine.10, 11 Among its adverse effects, hematologic issues and peripheral neuropathy are considered the most important, with the DAP hypersensitivity syndrome being identified as a life-threatening drug reaction.12 Dapsone is included in class II of the Biopharmaceutics Classification System (BCS), which reflects its limited water solubility combined with high permeability.13 Because of the above features, the transdermal delivery route, instead of the oral one, is the preferred way of administering this drug.14 These limitations have inspired various strategies aimed at resolving the poor aqueous solubility and low bioavailability of DAP.15, 16
Solubility is a fundamental characteristic of great importance in the pharmaceutical field and plays a crucial role in drug design and formulation.17, 18 The reactivity, stability and bioavailability of a chemical compound are significantly influenced by its capacity for dissolution in a specific solvent. Investigations into solubility, encompassing both equilibrium and kinetic methodologies, play a vital role in various critical domains, including enhancing liquid dosage,19 improving bioavailability,20 facilitating crystallization,21 aiding pre-formulation,22 and enabling thermodynamic modeling.23 It follows that solvents are extremely important for the pharmaceutical industry. In fact, it can be estimated that they account for as much as 90% of all chemicals used in this industry.24 Because of the amount of solvents used, it is crucial to focus not only on their effectiveness in dissolving the desired compounds but also on their environmental impact. This is why the framework of “green chemistry”25, 26 has been widely adopted in the context of solvents.27, 28, 29, 30 Various properties can be attributed to these green solvents, with non-toxicity, non-flammability and low environmental impact being the most important ones.
When considering the above properties, a particular group of designed solvents comes to mind, namely deep eutectic solvents (DES). They are formed by mixing at least 2 compounds, and their key feature is a melting point lower than that of the individual constituents of the eutectic mixture.31, 32 This characteristic enables them to remain in a liquid state even at low temperatures. The DES share many aspects with more traditional ionic liquids; they differ mainly by including non-ionic constituents in their structure.33 Frequently, DES employ compounds that can be termed primary metabolites derived from plants, including alcohols, sugars, organic acids, and amino acids.34, 35 The properties of DES are in line with the general requirements for green solvents, since these systems are neither readily volatile nor flammable, they are sustainable and biodegradable, their preparation is simple and cost-efficient, and they can be tailored for user-specific requirements.36, 37, 38, 39 These desirable properties have resulted in the widespread use of DES, and the pharmaceutical industry is one of the beneficiaries of this approach.40, 41, 42 The DES have proven able to significantly increase the solubility and bioavailability of many active pharmaceutical ingredients.43, 44, 45 Our research group has also successfully demonstrated the usefulness of DES for the dissolution of many active substances, including sulfonamides,46 curcumin,47 caffeine,48 and edaravone.49 Therefore, it seems natural to apply DES in the case of DAP.
Selecting the right solvent to enhance the solubility of a specific active pharmaceutical ingredient is a challenging and labor-intensive process. The number of experiments is constrained by various factors, including time, financial limitations and the contemporary emphasis on eco-friendly chemical practices. Hence, preliminary screening using diverse computational methods seems imperative before conducting actual experiments. Of notable importance is the use of machine learning to determine pharmaceutical solubility limits, an area that has seen a surge in neural network and deep learning applications.50, 51, 52 Our previous studies have demonstrated that combining COSMO-RS with machine learning techniques yields highly accurate predictions,53, 54, 55 also in the case of dapsone.56
The current study had 3 primary objectives. First, DAP solubility data was augmented using several designed DES, which were expected to have a greater solubilization potential than traditional organic solvents. Furthermore, the intermolecular interactions within the considered systems were studied in order to gain insight into the observed solubility phenomena. Finally, the model for predicting solubility, previously created using the machine learning protocol, was extended and improved using new data obtained for eutectic systems.
Materials and methods
Materials
Dapsone (CAS No. 80-08-0) was acquired from Sigma-Aldrich (St. Louis, USA) with ≥99% purity. The constituents of DES were also obtained from Sigma-Aldrich with the same purity. These constituents include choline chloride (ChCl, CAS No. 67-48-1), as well as 6 polyols used as hydrogen bond donors (HBDs), namely glycerol (GLY, CAS No. 56-81-5), ethylene glycol (ETG, CAS No. 107-21-1), diethylene glycol (DEG, CAS No. 111-46-6), triethylene glycol (TEG, CAS No. 112-27-6), 1,2-propanediol (P2D, CAS No. 57-55-6), and 1,3-butanediol (B3D, CAS No. 107-88-0). Methanol (CAS No. 67-56-1), used as a solvent throughout the study, was supplied by Avantor Performance Materials (Gliwice, Poland) and had a purity of at least 99%. Prior to use, ChCl was dried, while all the other compounds were used without any initial procedures.
Preparation of the samples and solubility measurements
For DAP solubility determination in the investigated DES, the initial step involved the preparation of a calibration curve. For this purpose, a stock solution of DAP was prepared in methanol and subsequently diluted in 10-mL volumetric flasks. The concentration of solutions used for the preparation of the calibration curve ranged from 0.007 mg/mL to 0.017 mg/mL. An A360 spectrophotometer from AOE Instruments (Shanghai, China) was employed for spectrophotometric measurements of the solutions. The wavelength corresponding to the maximum absorbance value was found to be 295 nm. Three separate calibration curves were averaged in order to obtain the final curve. The linear regression equation was found to be A = 113.87∙C + 0.0015 (A – absorbance, C – concentration expressed in mg/mL). The degree of linearity was found to be satisfactory, with R² = 0.999.
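The calibration line can be inverted to convert a measured absorbance into a concentration. The sketch below is illustrative only: the slope and intercept are those reported above, whereas the function name and the `dilution_factor` argument are assumptions introduced here and are not part of the described protocol.

```python
# Reported calibration line at 295 nm: A = 113.87 * C + 0.0015 (C in mg/mL).
SLOPE = 113.87
INTERCEPT = 0.0015


def concentration_from_absorbance(absorbance, dilution_factor=1.0):
    """Return the DAP concentration (mg/mL) in the undiluted sample.

    dilution_factor is a hypothetical parameter: the ratio of the final
    volume to the sampled volume when the filtrate is diluted with methanol.
    """
    if dilution_factor <= 0:
        raise ValueError("dilution_factor must be positive")
    # Invert A = SLOPE * C + INTERCEPT to get the concentration of the
    # diluted solution, then scale back to the original sample.
    diluted = (absorbance - INTERCEPT) / SLOPE
    return diluted * dilution_factor
```

The inversion is only meaningful for absorbance values inside the calibrated range, which is why the samples are diluted before measurement.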
In the investigation of various DES, ChCl consistently served as one of the constituents. The 2nd component varied and was GLY, ETG, DEG, TEG, P2D, or B3D. To formulate the DES, ChCl and the 2nd component were mixed together in sealed test tubes in a 1:2 molar ratio. A water bath at 90°C was used to obtain homogeneous solutions. The resulting DES were used either in their pure form or combined with water to create binary systems with varying water proportions. To obtain saturated solutions of DAP in the studied systems, excess amounts of DAP were added to the test tubes containing both pure DES and binary mixtures with water.
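As an illustration of the 1:2 molar ratio, the masses to be weighed can be estimated from molar masses. This is a minimal sketch: the molar masses are standard literature values entered here as assumptions, and the function name and batch size are hypothetical.

```python
# Standard molar masses in g/mol (assumed values, not taken from the study).
M_CHCL = 139.62
M_HBD = {
    "GLY": 92.09,   # glycerol
    "ETG": 62.07,   # ethylene glycol
    "DEG": 106.12,  # diethylene glycol
    "TEG": 150.17,  # triethylene glycol
    "P2D": 76.09,   # 1,2-propanediol
    "B3D": 90.12,   # 1,3-butanediol
}


def des_masses(hbd, n_chcl_mol, hbd_per_chcl=2.0):
    """Return (mass of ChCl, mass of HBD) in grams for a ChCl:HBD mixture.

    hbd_per_chcl=2.0 corresponds to the 1:2 molar ratio used in the text.
    """
    mass_chcl = n_chcl_mol * M_CHCL
    mass_hbd = n_chcl_mol * hbd_per_chcl * M_HBD[hbd]
    return mass_chcl, mass_hbd


# Example: a hypothetical batch based on 0.1 mol of ChCl with glycerol.
chcl_g, gly_g = des_masses("GLY", 0.1)
```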
The prepared samples were incubated for 24 h at 25°C in an Orbital Shaker Incubator ES-20/60 from Biosan (Riga, Latvia). The temperature was maintained with a precision of 0.1°C, with a variation of ±0.5°C observed over the 24-h cycle. During the mixing process, all samples were agitated at a speed of 60 rpm. Afterward, the samples were filtered using a syringe combined with a PTFE syringe filter of 0.22 µm pore size. In order to prevent precipitation, the test tubes, syringes, pipette tips, and filters were initially warmed to match the temperature of the sample.
Ultimately, fixed volumes of the filtered solution were placed in test tubes filled with methanol, and the samples diluted in this way were subjected to spectrophotometric measurements. Additionally, to determine the mole fractions of DAP, 1 mL of each solution was precisely weighed in a 10 mL volumetric flask in order to obtain the density of the sample. Throughout the study, an Eppendorf (Hamburg, Germany) Reference 2 pipette was used with a systematic error of 0.6 µL. The RADWAG (Radom, Poland) AS 110 R2.PLUS analytical balance with a precision of 0.1 mg was also utilized.
The solubility of DAP in the considered solvents was determined based on spectrophotometric measurements of the prepared samples. The wavelength range from 190 nm to 500 nm was used during solubility measurements, and the corresponding resolution was 1 nm. Initially, methanol was used for spectrophotometer calibration, and it was also utilized to dilute the measured samples. Dilution was necessary to ensure that the absorbance values remained within the linear range. Based on the calibration curve, the absorbance values measured at 295 nm were used to calculate DAP solubility, expressed both as its concentration and mole fraction. These values were obtained by averaging the results from 3 separate measurements.
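The conversion from concentration to mole fraction can be sketched as follows. This is an illustration under stated assumptions rather than the authors' exact procedure: the molar mass of DAP is the standard value for C12H12N2O2S, each DES constituent is counted as a separate species when averaging the solvent molar mass, and all function names are hypothetical.

```python
M_DAP = 248.30  # g/mol, standard molar mass of dapsone (assumed value)


def mean_molar_mass(composition):
    """Mole-fraction-weighted molar mass of a solvent mixture.

    composition: iterable of (mole_fraction, molar_mass) pairs whose
    mole fractions sum to 1.
    """
    return sum(x * m for x, m in composition)


def mole_fraction_dap(conc_mg_per_ml, density_g_per_ml, solvent_molar_mass):
    """Mole fraction of DAP in a saturated solution.

    conc_mg_per_ml: DAP concentration from the calibration curve.
    density_g_per_ml: density of the saturated solution (weighed 1 mL).
    solvent_molar_mass: mean molar mass of the DAP-free solvent in g/mol.
    """
    mass_dap = conc_mg_per_ml / 1000.0          # g of DAP in 1 mL
    mass_solvent = density_g_per_ml - mass_dap  # g of solvent in 1 mL
    if mass_solvent <= 0:
        raise ValueError("concentration exceeds the solution density")
    n_dap = mass_dap / M_DAP
    n_solvent = mass_solvent / solvent_molar_mass
    return n_dap / (n_dap + n_solvent)


# Mean molar mass of a 1:2 ChCl:TEG mixture (molar masses are assumptions).
m_des = mean_molar_mass([(1 / 3, 139.62), (2 / 3, 150.17)])
```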
Computational details
Solubility predictions were made using a custom Python model designed to tune the hyperparameters of 36 regressors using various algorithms, which include, among many others, boosting, nearest neighbors, linear models, ensembles, and neural networks. The methodology was already described in our previous studies,56, 57, 58 and only a brief description is presented here. The exploration of the hyperparameter space was aimed at finding their optimal values throughout 5000 minimization trials with the help of the Optuna framework. The performance of the considered regression models was assessed based on a custom score function combining metrics accounting for model accuracy and generalizability. The ultimate performance assessment of all models relied on the loss values representing both test and validation subsets. The ensemble model definition incorporated the subset of regression models characterized by the lowest values corresponding to both criteria. The final predictions were the results of the averaging of the chosen models. The solubility data from the previous work56 was used for model training, and was supplemented with the new measurements presented in this work. The same types of molecular descriptors were used to characterize the intermolecular interactions in the system, expressed in terms of solute–solvent affinities. These properties were characterized by the Gibbs free energy (ΔGr) values associated with reactions involving the formation of pairs, namely X + Y = XY, with X and Y representing solute and solvent molecules, respectively. For the purpose of machine learning, both the enthalpic and entropic contributions included in the affinities were additionally incorporated into the set of molecular descriptors. 
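The final two steps of the protocol, selecting the best regressors by both criteria and averaging their predictions, can be sketched as below. This is a simplified illustration and not the authors' implementation: hyperparameter tuning with Optuna and the custom score function are not reproduced, and the threshold-based selection rule, the function names and the toy numbers are all assumptions.

```python
def select_ensemble(losses, max_test_loss, max_validation_loss):
    """Pick models whose test AND validation losses are both low enough.

    losses: dict mapping model name -> (test_loss, validation_loss).
    The two thresholds are hypothetical cut-offs.
    """
    return sorted(
        name
        for name, (test_loss, val_loss) in losses.items()
        if test_loss <= max_test_loss and val_loss <= max_validation_loss
    )


def ensemble_predict(predictions, selected):
    """Unweighted average of the predictions of the selected models.

    predictions: dict mapping model name -> list of predicted values.
    """
    if not selected:
        raise ValueError("no model satisfies both criteria")
    n_points = len(predictions[selected[0]])
    return [
        sum(predictions[name][i] for name in selected) / len(selected)
        for i in range(n_points)
    ]


# Toy placeholder values, for illustration only.
toy_losses = {"model_a": (0.10, 0.12), "model_b": (0.50, 0.11), "model_c": (0.09, 0.10)}
toy_predictions = {"model_a": [1.0, 2.0], "model_b": [9.0, 9.0], "model_c": [3.0, 4.0]}
chosen = select_ensemble(toy_losses, max_test_loss=0.2, max_validation_loss=0.2)
averaged = ensemble_predict(toy_predictions, chosen)
```

Requiring both losses to be low discards models that fit one subset well but generalize poorly to the other, which is the motivation given in the text for using two criteria.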
Each compound in the form of monomers, dimers or heteromolecular pairs was represented by the set of the most representative conformations selected from a large number of potential geometries optimized using RI-DFT BP86 (B88-VWN-P86) in Turbomole v. 7.5.1 (Turbomole GmbH, Frankfurt am Main, Germany). Highly similar clusters and those exceeding the 2.5 kcal/mol threshold for relative energy were not included in the pool of conformations. All thermodynamic properties were computed using the COSMOtherm program v. 22.0.0 (Dassault Systèmes, Biovia, San Diego, USA).
Results and Discussion
Experimental solubility of DAP in designed solvents
The solubility of DAP was studied in several DES comprising ChCl and 1 of 6 considered polyols, namely GLY, ETG, DEG, TEG, P2D, and B3D. These systems were studied previously for other active pharmaceutical ingredients and proved to be very effective in enhancing their solubility.49 Also, based on earlier experiences, the 1:2 molar ratio of DES constituents (i.e., a twofold excess amount of the polyol) was used. The studies encompassed the determination of DAP solubility in neat DES, as well as in binary solvent mixtures of DES and water. In the latter case, different molar proportions of the eutectic and water were used. The results are presented in Figure 1.
The analysis of the results leads to several interesting observations. First of all, the studied neat DES follow a decreasing trend in terms of DAP dissolution effectiveness, namely: TEG > DEG > ETG > GLY > B3D > P2D. At 25°C, the mole fraction solubility of DAP in the neat DES comprising ChCl and TEG equals xDAP = 107.50·10⁻³. Using DEG as a DES constituent yields a slightly smaller solubility of xDAP = 96.48·10⁻³. The next 2 systems, involving ETG and GLY, show a considerably lower solubility of xDAP = 70.96·10⁻³ and xDAP = 65.1·10⁻³, respectively. Finally, DAP is the least soluble in DES utilizing B3D and P2D, with xDAP = 49.36·10⁻³ and xDAP = 46.91·10⁻³, respectively.
The studied DES systems differ slightly in their behavior when introduced as a part of an aqueous binary solvent. For all studied DES, a cosolvency effect can be observed, which means that at a specific molar composition of the aqueous binary solvent, the solubility of DAP is higher than for the neat DES. However, the solubility profiles presented in Figure 1 are not identical among the studied DES. The most important difference is the binary mixture composition that results in the highest DAP solubility. In the case of eutectics comprising ETG, DEG and TEG, the x*w = 0.2 composition gives the highest solubility of DAP. Meanwhile, for DES utilizing P2D, B3D and GLY, this composition is x*w = 0.3. The shape of the solubility profiles also differs slightly between the studied systems, although the solubility increase is rather similar, with the solubility at the optimal binary composition being 1.07 to 1.15 times that in the neat DES. For example, the most efficient composition for the DES comprising TEG results in a DAP solubility equal to xDAP = 120.96·10⁻³ at 25°C. Interestingly, there is a deviation from the solubility effectiveness trend observed among the studied neat DES. At the optimal composition, the eutectic involving GLY turns out to be more effective than the one with ETG, which is the opposite of the order found for neat DES.
It is necessary to put the obtained results in the context of DAP solubility in the neat solvents studied previously.56 All the studied DES, as well as most water-DES mixtures, outperform classical organic solvents, including dimethyl sulfoxide (DMSO), DEG and diethylene glycol bis(3-aminopropyl) ether (B3APE), as evidenced in Figure 1. Even DMSO, a compound known to be a very efficient solubilizer, yielded a DAP solubility of xDAP = 18.95·10⁻³, which is only about 18% of the solubility in the neat eutectic involving TEG.
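The ratios quoted in this section follow directly from the reported mole fractions. The short sketch below only reproduces that arithmetic; the variable and function names are introduced here for illustration.

```python
# Mole fraction solubilities of DAP at 25°C reported in the text.
X_TEG_NEAT = 107.50e-3     # neat ChCl:TEG eutectic
X_TEG_OPTIMAL = 120.96e-3  # ChCl:TEG with water at the optimal composition
X_DMSO = 18.95e-3          # neat DMSO


def solubility_ratio(numerator, denominator):
    """Ratio of two mole fraction solubilities."""
    return numerator / denominator


# Cosolvency gain for the TEG-based eutectic (about 1.13).
cosolvency_gain = solubility_ratio(X_TEG_OPTIMAL, X_TEG_NEAT)

# DMSO relative to the neat TEG-based eutectic (about 0.18).
dmso_vs_teg = solubility_ratio(X_DMSO, X_TEG_NEAT)
```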
Intermolecular interactions of DAP in the studied DES
The intermolecular interactions in the saturated solutions studied here are characterized by the collection of ΔGr values for the formation reactions of homo- and hetero-molecular pairs. This, however, requires a proper representation of the structural diversity of potential intermolecular contacts in the considered systems. Hence, an extensive conformational search for potential pair formation was conducted as an initial step. According to the procedure described in the methodology section, each complex is represented by the set of the most stable conformations used for the determination of the affinity values. It is important to note that 2 types of thermodynamic characteristics can be derived using COSMOtherm, depending on how the pair formation reaction is interpreted. To characterize the overall stability of the systems, the values of concentration-independent ΔGr were used. This way of representing affinity uses the product of activity coefficients to correct the equilibrium constants determined from the mole fraction distribution in a given system. The resulting activity-based equilibrium constants are directly related to the values of ΔGr; they depend on temperature but remain the same regardless of the system composition. These values are collected in Figure 2, quantifying all possible binary contacts of DAP in the studied DES. The graphical representation of the most stable complexes is provided in Figure 3.
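For the pair formation reaction X + Y = XY, the relation described above can be written in the standard thermodynamic form below. This is a textbook sketch: the symbols K_x, K_γ and K_a are notation introduced here, and the exact expressions evaluated internally by COSMOtherm are not specified in the text.

```latex
K_a \;=\; K_x \, K_\gamma
    \;=\; \frac{x_{XY}}{x_X \, x_Y} \cdot \frac{\gamma_{XY}}{\gamma_X \, \gamma_Y},
\qquad
\Delta G_r \;=\; -RT \ln K_a ,
\qquad
\Delta G_r \;=\; \Delta H_r - T \Delta S_r
```

Here x denotes mole fractions, γ activity coefficients, R the gas constant, and T the absolute temperature. The last identity shows how the enthalpic and entropic contributions used as additional descriptors combine into the affinity.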
The plots presented in Figure 2 lead to the conclusion that the self-association of DAP is the predominant factor in system stability. This suggests that dimer formation might be considered the driving force behind solubility restrictions in any solvent. Indeed, the affinities of DAP to any individual DES component, as characterized by the heteromolecular complexes, are lower than the affinity associated with DAP-DAP formation. It is worth emphasizing that ChCl is a very important component of the system, playing a crucial stabilization role. Indeed, the DAP-ChCl pairs are very stable in all cases. The 2nd factor contributing to the overall system stability comes from DAP-HBD pairs, which are slightly less favorable compared to DAP-ChCl. The only exception is DAP-TEG, as this pair has stability comparable to that of DAP-ChCl. As one might expect, the hydration of DAP has the smallest contribution to the overall affinity, which might be attributed to the low polarity of the solute. In Figure 2, the relative solute–solvent affinity is drawn as a thick red line, representing the difference between DAP self-affinity and the sum of affinities to DES components. It is very interesting to observe that there is a correspondence between the order of DAP solubility in the studied DES and the relative affinity. Indeed, the highest DAP solubility was observed in the ChCl mixture with TEG. The sequence of solubility reduction in TEG, DEG and ETG is associated with the parallel decrease of ∆Gr(AB) values. For the rest of the studied HBDs, the trend is not so clear, but solute–solvent interactions are comparable to DAP self-affinity. This conclusion not only qualitatively reveals the mechanism of DAP dissolution in DES but is also a good prognosis for the utilization of intermolecular interactions for machine learning purposes.
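The relative solute–solvent affinity described above can be expressed as a one-line calculation. The sketch below is hypothetical: the sign convention (self-affinity minus the sum of solvent affinities) is inferred from the wording, the function name is introduced here, and the numerical values are arbitrary placeholders, not results from Figure 2.

```python
def relative_affinity(dg_self, dg_components):
    """Difference between DAP self-affinity and the summed affinities
    of DAP to the individual solvent components.

    dg_self: Gibbs free energy of DAP-DAP pair formation.
    dg_components: dict mapping component name -> Gibbs free energy of
    DAP-component pair formation. Any consistent energy unit may be used.
    """
    return dg_self - sum(dg_components.values())


# Arbitrary placeholder values chosen only to show the arithmetic.
example = relative_affinity(
    dg_self=-5.0,
    dg_components={"ChCl": -2.0, "HBD": -1.5, "water": -0.5},
)
```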
It is interesting to note that the structure of DAP allows for 3 types of interactions. The oxygen centers bound to the sulfur atom act as hydrogen bond acceptors. The amino groups bound to aromatic rings can act as potential donors of hydrogen bonds. Finally, the apolar aromatic rings can be a source of stabilization via non-covalent dispersive interactions. This general overview suggests that DAP should be easily solvated and soluble in a variety of solvents. When this simple picture is confronted with the measured data, however, a more complex behavior of DAP emerges.
Indeed, 3 separate classes of solvents interacting with DAP can be distinguished.56 The 1st class comprises systems in which DAP acts as a proton acceptor via the sulfonyl group. This class of solvents encompasses water and aliphatic alcohols. It is anticipated that solubility in these proton-donating solvents will generally be at a moderate level due to the potential self-association of DAP. The 2nd group includes systems in which DAP is a proton donor in conjunction with solvents like acetone or DEG. The likelihood of DAP dissolving in these solvents is relatively high, as the non-polar region of DAP is less inclined to self-associate thanks to the obstruction caused by solvent molecules. The 3rd and final class contains solvents that interact with DAP through non-hydrogen bond interactions. This includes such solvents as DMSO and N-methyl-2-pyrrolidone, and the potential solubility range of DAP in these compounds is quite broad.
It is important to add that another aspect should be taken into account when describing the dissolution mechanism, namely the self-affinity of DAP. As documented in Figure 2, this type of interaction is the strongest among all pairs potentially present in DAP solutions. This holds not only for DES but also for many neat and binary solvents.56 The dispersive forces are the predominant factor stabilizing the 2 DAP molecules. However, 2 distinct classes of structures were identified in the conformational search, as documented in Figure 4. Interestingly, the structure of the most stable pair (on the left) lacks any type of hydrogen bonding. However, for some stable dimers, the active role of both donor and acceptor centers of DAP is evident (on the right). The interplay of DAP self-association and the formation of DAP complexes with solvent molecules can be regarded as the main factor determining solubility. Solvents whose molecules compete with DAP dimerization, such as TEG or DEG, are expected to be the better ones. These 2 molecules strongly interact with the non-polar region of DAP, consequently reducing self-association effectively. Solvents that can form hydrogen-bonded complexes typically do not impair dimerization as seriously, which makes self-association more probable and eventually leads to aggregation and sedimentation.
Extended model for solubility prediction of DAP
In one of our previous projects,56 a model for predicting DAP solubility was developed and validated. The ensemble of regressors with tuned hyperparameters was found to be very accurate in back-computations of the DAP mole fraction at saturated conditions in neat solvents and binary solvent mixtures at varying compositions and temperatures. Also, the detailed analysis of its predictive potential proved the ability of the formulated model to reliably predict new cases. The studied solvent space was quite extended and encompassed major types of solvents, including polar-protic, polar-aprotic and non-polar ones. It is interesting to see if such a broad range of solvents representatively covers the possible variety of dissolution media. This aspect is especially important from the perspective of the type of solvents studied here, which are very polar, with diverse hydrogen bonding capabilities. Hence, the plots of selected descriptors were prepared for comparison of the coherence of their distributions. The previously developed model,56 which optimized the set of the most suitable regressors, utilized intermolecular interactions as molecular descriptors. The detailed analysis of the importance of given descriptors revealed that for the majority of regressors, 2 contributions are dominant. These 2 descriptors characterize the solute–solute (∆Sr(AA)) and solute–solvent (∆Sr(AB)) entropic contributions to the affinity in the studied systems. The distributions of these values provided in Figure 5 are separated into 3 sets, namely pure solvents, binary mixtures and ternary DES systems. The most important conclusion is that the new set of data adds distinct features to the overall characteristics. Although the new results are within the range of solubility data used in the previous machine learning, the molecular descriptors have quite distinct distributions.
It is worth mentioning that the solubility measurements in DES were done only at room temperature, while the previous dataset included a quite extended range of temperatures from 5°C up to 80°C. It is not surprising that at elevated temperatures, the DAP solubility has been found to be very high, even higher than in the case of the studied DES. Hence, it is not the solubility range that prevents direct utilization of the developed model, but the distinct interactions in new systems. Consequently, it is very unlikely that the previous model can be used for solubility predictions in DES, and the utilization of such a model is not methodologically acceptable, as the new pool of data is outside its applicability domain. Hence, it is necessary to extend the model for reliable predictions. The retraining of the model is characterized in Figure 6. First of all, the list of the most efficient regressors provided in the caption defines the new ensemble. No additional tuning of weights was performed, as the simple addition of their contributions was quite effective, as documented in the right panel of Figure 6. The accuracy of back-computed DAP solubility is very high, which is a good prognosis for potential screening for alternative solvents. Because the applicability domain is significantly extended by the inclusion of solubility in DES, the range of potential applications is considerably broader than that of the previously formulated ensemble.56
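A very simple way to flag samples that fall outside the descriptor space seen during training is a bounding-box check. The sketch below uses this criterion as a stand-in for the distribution comparison described above; it is not the authors' analysis, and the descriptor names and numerical values are hypothetical placeholders.

```python
def descriptor_ranges(training_rows):
    """Per-descriptor (min, max) over the training set.

    training_rows: non-empty list of dicts mapping descriptor name -> value.
    """
    names = training_rows[0].keys()
    return {
        name: (
            min(row[name] for row in training_rows),
            max(row[name] for row in training_rows),
        )
        for name in names
    }


def inside_domain(row, ranges):
    """True if every descriptor of the row lies within the training range."""
    return all(low <= row[name] <= high for name, (low, high) in ranges.items())


# Placeholder descriptor values, for illustration only.
training = [
    {"dS_AA": -30.0, "dS_AB": -25.0},
    {"dS_AA": -20.0, "dS_AB": -15.0},
]
ranges = descriptor_ranges(training)
known_like = inside_domain({"dS_AA": -25.0, "dS_AB": -20.0}, ranges)
new_like = inside_domain({"dS_AA": -25.0, "dS_AB": -40.0}, ranges)
```

A sample can lie within the range of the target property (here, solubility) and still fail such a check on its descriptors, which is the situation described for the DES data.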
Conclusions
The performed multi-aspect analysis of DAP solubility in DES led to a number of conclusions regarding their effectiveness, dissolution mechanism and future screening for new solvents. First of all, the considered eutectic systems outperformed the classical organic solvents studied earlier, including DMSO, with the DES comprising ChCl and TEG being the most effective. It was also found that the addition of a specific amount of water to the DES systems increases DAP solubility even further. Second, the interplay of DAP self-association and the formation of complexes between DAP and solvent molecules can be regarded as the main factor influencing solubility. The self-association of DAP seems to be the most important factor influencing the stability of the systems and thus limiting its solubility. Conversely, the reason for increased solubility in some solvents is the ability of some molecules to strongly interact with the non-polar region of DAP, thus reducing self-association and increasing solubility. Finally, some important considerations have to be made regarding the usage of machine learning for solubility predictions. The general problem of the applicability domain restrictions imposed on any non-linear model developed using machine learning protocols is stressed and exemplified in the case of DAP solubility in DES. According to chemical intuition, the extension of solubility space by the inclusion of new dissolution media can affect the predictive potential of any non-linear model. However, the problem is whether the actual new pool of data resulting from new measurements requires re-formulating the model from scratch.
This paper unequivocally illustrates the necessity of augmenting the dataset with measurements in DES. The initial model was formulated for a quite extended and diverse set of solvents, which is not a common situation for active pharmaceutical ingredients. Hence, it might seem that the major portion of solvent structural and energetic diversity was already included in the original formulation, and that the model could be applied for prediction in a variety of solvents not included in the pool used for model training. However, DES differ so substantially from organic solvents that retraining was indispensable. Fortunately, after the extension of the solubility data with 6 different solvents, the newly designed and retrained ensemble of regressors was able to capture all the new features of the system and very accurately describe the whole dataset. Since this is also associated with the extension of the applicability domain, its scope of application is broadened. This positive statement is partly tempered by the fact that retraining is an unavoidable aspect of ensemble development if new data extend the solvent space.
Data availability
The datasets generated and/or analyzed during the current study are available from the corresponding author on reasonable request.
Consent for publication
All the authors give their consent for the publication of their identifiable details in the Polymers in Medicine journal.