Cost Sensitive Learning and SMOTE Methods for Imbalanced Data

Akbar Khan, Faizullah Khan, Surat Khan, Ishtiaq Ahmed Khan, Muhammad Saeed


Class imbalance is one of the main problems encountered when applying machine learning algorithms. In imbalanced classification, the number of false negatives is typically high. Researchers have introduced many methods to deal with this problem; the purpose of this paper is to apply machine learning algorithms under the SMOTE and cost-sensitive learning approaches and to compare the results of different experiments in order to identify which approach gives suitable results for imbalanced data.


Cost Sensitive Learning; Machine Learning; WEKA; SMOTE; Imbalanced Data
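
The two approaches named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code (the keywords indicate the experiments were run in WEKA). The function names, the toy data, and the cost values are all hypothetical. The SMOTE part is a simplified variant that picks base points at random and interpolates toward one of their nearest minority-class neighbours. The cost-sensitive part assumes that correct predictions cost nothing, in which case the expected cost is minimised by predicting the positive class whenever its estimated probability reaches cost_fp / (cost_fp + cost_fn).

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Simplified SMOTE sketch: create synthetic minority points by
    interpolating between a minority sample and one of its k nearest
    minority-class neighbours. Requires at least two minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean distance)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability threshold that minimises expected cost, assuming
    correct predictions have zero cost (an assumption of this sketch)."""
    return cost_fp / (cost_fp + cost_fn)


def predict(prob_positive, cost_fp, cost_fn):
    """Label a case positive (1) if its probability reaches the threshold."""
    return int(prob_positive >= cost_sensitive_threshold(cost_fp, cost_fn))


if __name__ == "__main__":
    minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # toy minority class
    print(smote(minority, n_synthetic=5))
    print(predict(0.2, cost_fp=1, cost_fn=9))
```

With a false negative assumed to be nine times as costly as a false positive, the threshold drops from 0.5 to 0.1, so a case with an estimated positive probability of 0.2 is labelled positive. This is how cost-sensitive learning trades extra false positives for fewer false negatives, while SMOTE addresses the same problem by rebalancing the training data.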





© 2002-2014 BUITEMS