SOSYOEKONOMI, vol.29, no.49, pp.11-32, 2021 (ESCI)
Cluster analysis is often used to determine housing submarkets. However, commonly used methods cannot handle mixed-mode data, in which variables of different types and units are combined. We propose new similarity measures that handle both continuous and categorical variables through normalization and discretization steps and partial-match criteria. These measures are used in agglomerative hierarchical clustering, in a formulation that determines the optimal number of clusters automatically, without a priori information about the number of submarkets. Experiments on housing sales data show that the proposed measures perform better than the commonly used standardized Euclidean distance in identifying submarkets.
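To make the idea of a mixed-mode similarity concrete, the following is a minimal Python sketch. It does not reproduce the measures proposed in the paper, whose exact definitions are not given in this abstract. It assumes a Gower-style average instead: continuous variables are min-max normalized and compared by absolute difference, and categorical variables are compared by exact match. The stopping rule (a fixed `min_similarity` threshold under average linkage) is likewise an assumption, standing in for the paper's automatic choice of the number of clusters. All function names and the four housing records are hypothetical.

```python
from itertools import combinations


def min_max_normalize(column):
    """Scale a sequence of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in column]


def prepare(raw, numeric_idx):
    """Return records whose numeric fields have been min-max normalized."""
    columns = [list(col) for col in zip(*raw)]
    for i in numeric_idx:
        columns[i] = min_max_normalize(columns[i])
    return list(zip(*columns))


def mixed_similarity(a, b, numeric_idx):
    """Gower-style similarity in [0, 1] (an assumption, not the paper's measure).

    Normalized numeric fields contribute 1 - |a - b|;
    categorical fields contribute 1 on an exact match, else 0.
    """
    scores = []
    for i, (x, y) in enumerate(zip(a, b)):
        if i in numeric_idx:
            scores.append(1.0 - abs(x - y))
        else:
            scores.append(1.0 if x == y else 0.0)
    return sum(scores) / len(scores)


def agglomerate(records, numeric_idx, min_similarity):
    """Average-linkage merging that stops when no pair is similar enough."""
    clusters = [[i] for i in range(len(records))]

    def linkage(c1, c2):
        total = sum(
            mixed_similarity(records[i], records[j], numeric_idx)
            for i in c1
            for j in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the most similar pair of clusters (p < q by construction).
        (p, q), best = max(
            (
                ((p, q), linkage(clusters[p], clusters[q]))
                for p, q in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < min_similarity:
            break  # the number of clusters is whatever remains at this point
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical records: (price, floor area, heating type, district).
raw = [
    (100, 80, "gas", "A"),
    (110, 85, "gas", "A"),
    (300, 200, "central", "B"),
    (310, 210, "central", "B"),
]
numeric_idx = {0, 1}
records = prepare(raw, numeric_idx)
clusters = agglomerate(records, numeric_idx, min_similarity=0.5)
print(clusters)  # [[0, 1], [2, 3]]
```

In this toy example the two cheap gas-heated homes in district A and the two expensive centrally heated homes in district B end up in separate clusters, and merging stops on its own because the average similarity between the two groups falls below the threshold. The sketch omits the discretization step and the partial-match criteria mentioned in the abstract.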