Data Mining in Business Domains: A Conceptual Model of Recommender Systems for Classifier Selection

Abstract

Data mining is a valuable business tool that companies can utilize to understand their customers and attain competitive advantage. A critical component of data mining is classifier selection; companies must meticulously select an appropriate classifier, as it impacts the accuracy of the results. In order to select an appropriate classifier, a company's knowledge discovery team must master a lot of background information on the dataset, the model and the algorithms in question. We suggest that recommender systems can ease this complex process by searching the knowledge stored in the result repository and recommending an appropriate classifier to be used for a particular dataset. In this study we propose such a system and take a first look at how it can be done. We compare various classifiers against different datasets and then come up with the most appropriate classifier for a particular dataset based on its unique characteristics. The results of our experiments indicate that AdaBoost is a relatively stable performer compared to other algorithms. Other findings and managerial implications are also discussed in our study.
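
The recommendation idea summarized above, searching a result repository and suggesting a classifier for a new dataset, can be illustrated with a minimal sketch. The abstract does not say how the repository is searched, so the nearest-neighbour lookup over dataset characteristics used here is an assumption, as are the names `ResultRecord`, `REPOSITORY` and `recommend`. The characteristic vectors and accuracy figures are placeholder values for illustration only, not results from the study.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class ResultRecord:
    """One entry in a hypothetical result repository."""
    dataset: str
    features: tuple[float, ...]   # dataset characteristics, scaled to [0, 1]
    accuracy: dict[str, float]    # classifier name -> measured accuracy


def distance(a, b):
    """Euclidean distance between two characteristic vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend(repository, features, k=3):
    """Suggest the classifier with the highest mean accuracy on the
    k stored datasets whose characteristics are closest to `features`."""
    if not repository:
        raise ValueError("the result repository is empty")
    nearest = sorted(repository, key=lambda r: distance(r.features, features))[:k]
    scores = {}
    for record in nearest:
        for name, acc in record.accuracy.items():
            scores.setdefault(name, []).append(acc)
    return max(scores, key=lambda name: sum(scores[name]) / len(scores[name]))


# Placeholder repository: made-up characteristics and accuracies.
REPOSITORY = [
    ResultRecord("small_numeric", (0.1, 0.9), {"AdaBoost": 0.80, "NaiveBayes": 0.85}),
    ResultRecord("large_mixed", (0.9, 0.4), {"AdaBoost": 0.88, "NaiveBayes": 0.75}),
    ResultRecord("large_nominal", (0.8, 0.1), {"AdaBoost": 0.86, "NaiveBayes": 0.78}),
]

if __name__ == "__main__":
    print(recommend(REPOSITORY, (0.85, 0.3), k=2))
```

In this sketch each stored record pairs a dataset's characteristics with the accuracies observed for the classifiers tried on it, so adding new experimental results to the repository changes future recommendations without changing the lookup code.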


RESEARCH ARTICLE