Although citation counts are often considered a measure of academic impact, they are criticized for failing to evaluate impact as intended. In this paper we propose that software engineering citations may be classified according to how the citation is used by the author of the citing paper, and that through this classification of citation behaviour it is possible to achieve a more refined understanding of the cited paper's impact. Our objective in this work is to conduct an initial evaluation using the citation behaviour taxonomy proposed by Bornmann and Daniel. We independently classified citations to ten highly cited papers published at the International Symposium on Empirical Software Engineering and Measurement (ESEM). The degree to which classifications were consistent between researchers was analyzed in order to assess the clarity of Bornmann and Daniel's taxonomy. We found poor to fair agreement between researchers even though the taxonomy was perceived as relatively easy to apply for the majority of citations. We were nevertheless able to identify clear differences in the profile of citation behaviours between the cited papers. We conclude that an improved taxonomy is required if classification is to be reliable, and that a degree of automation would improve reliability as well as reduce the time taken to make a classification.
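Consistency between two researchers who classify the same citations is commonly quantified with a chance-corrected statistic such as Cohen's kappa. The sketch below is illustrative only: it assumes two raters and nominal categories, and it is an assumption that kappa (rather than another agreement measure) is the relevant statistic. The function name `cohens_kappa`, the single-letter category labels, and the example classifications are hypothetical and are not data from this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning nominal categories to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must classify the same non-empty set of items.")

    n = len(rater_a)

    # Observed agreement: share of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Expected chance agreement, from each rater's marginal category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category throughout; kappa is undefined,
        # so perfect agreement is reported by convention in this sketch.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical classifications of ten citations (placeholder category labels).
rater_a = ["P", "P", "M", "A", "P", "M", "A", "P", "M", "P"]
rater_b = ["P", "M", "M", "A", "P", "P", "A", "A", "M", "P"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which matters when a few categories dominate the classifications. Verbal labels such as "poor" or "fair" come from conventional interpretation scales whose thresholds vary between authors.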