4th International Conference on Data Science and Applications (ICONDATA’21), Yalova, Turkey, 4-6 June 2021, p. 31
Coal, the most common fossil fuel, has been the subject of numerous geochemical and mineralogical studies because it contains potentially toxic elements. In these studies, statistical methods (e.g., correlation coefficients, cluster analysis, and factor analysis) are mainly used to determine the affinities of toxic elements in coal. From an environmental perspective, these statistical methods have recently come under discussion because they can yield correlations between elements and minerals that cannot be chemically associated. Instead, microanalytical methods such as scanning electron microscopy with energy-dispersive spectrometry (SEM-EDX) and electron probe microanalysis (EPMA), together with machine learning algorithms, are increasingly applied to determine the affinities of some toxic elements in coal. This study aims to correlate geochemical and mineralogical data from the lower (kM2) coal seam in the Soma coalfield using Bray-Curtis, cosine, and Tanimoto similarities alongside other similarity measures such as the Pearson correlation coefficient. The results of the similarity measures were evaluated using an agglomerative hierarchical clustering algorithm (average linkage), and the elements were grouped into several clusters. With a few exceptions, the elemental groups identified with the Canberra, Chebyshev, Bray-Curtis, and Tanimoto measures do not appear to agree with the mineralogical and geochemical data. In contrast, with the Pearson correlation coefficient and cosine similarity, elements affiliated with aluminosilicate minerals (e.g., Al, K, B, and Cs) are grouped together, and elements related to redox conditions in the coal-forming environment (e.g., S, As, Mo, and U) fall in the same group. These groupings also agree with the SEM-EDX and XRD data of the studied coal samples. These results imply that cosine similarity could be an alternative to the Pearson correlation coefficient in coal studies.
Consequently, more detailed studies using both similarity measures should be conducted in the future, and the resulting element groupings should always be checked against SEM-EDX and XRD data.
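The workflow summarized in this abstract (pairwise similarity between elements, followed by average-linkage agglomerative clustering) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the concentration matrix is synthetic and hypothetical, the helper names `tanimoto_distance` and `cluster_elements` are invented for the example, the continuous Tanimoto form used here is an assumption, and each similarity is converted to a distance so that SciPy's `linkage` can consume it.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist


def tanimoto_distance(x, y):
    """1 - Tanimoto similarity for real-valued vectors (assumed continuous form)."""
    dot = np.dot(x, y)
    return 1.0 - dot / (np.dot(x, x) + np.dot(y, y) - dot)


# Label -> SciPy metric name or callable. SciPy's "correlation" distance
# is 1 - Pearson r, and "cosine" is 1 - cosine similarity.
METRICS = {
    "pearson": "correlation",
    "cosine": "cosine",
    "bray-curtis": "braycurtis",
    "canberra": "canberra",
    "chebyshev": "chebyshev",
    "tanimoto": tanimoto_distance,
}


def cluster_elements(data, names, metric, n_clusters):
    """Group elements (rows of data) by average-linkage hierarchical clustering."""
    dists = pdist(data, metric=metric)        # condensed pairwise distances
    tree = linkage(dists, method="average")   # average linkage, as in the abstract
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    groups = {}
    for name, label in zip(names, labels):
        groups.setdefault(int(label), []).append(name)
    return sorted(groups.values())


# Hypothetical data: rows are elements, columns are coal samples.
# The first four elements follow one made-up trend, the last four another.
names = ["Al", "K", "B", "Cs", "S", "As", "Mo", "U"]
f1 = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
f2 = np.array([5, 1, 7, 3, 8, 2, 6, 4], dtype=float)
loadings = [2.0, 1.0, 0.5, 0.8, 1.5, 0.6, 1.0, 0.7]
rng = np.random.default_rng(0)
data = np.array([w * (f1 if i < 4 else f2) for i, w in enumerate(loadings)])
data += rng.normal(0.0, 0.05, size=data.shape)  # small synthetic noise

results = {
    label: cluster_elements(data, names, metric, n_clusters=2)
    for label, metric in METRICS.items()
}
for label, groups in results.items():
    print(label, groups)
```

Pearson and cosine distances ignore the overall magnitude of each element's concentration profile, whereas the other measures here are sensitive to it, so the groupings they produce can differ on the same data. In a real study the rows would hold measured concentrations, and the number of clusters would be chosen from the dendrogram rather than fixed in advance.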