
Machine Learning in Java

By: Bostjan Kaluza

Overview of this book

As the amount of data continues to grow at an almost incomprehensible rate, being able to understand and process data is becoming a key differentiator for competitive organizations. Machine learning applications are everywhere, from self-driving cars, spam detection, document search, and trading strategies to speech recognition. This makes machine learning well-suited to the present-day era of Big Data and Data Science. The main challenge is how to transform data into actionable knowledge.

Machine Learning in Java will provide you with the techniques and tools you need to quickly gain insight from complex data. You will start by learning how to apply machine learning methods to a variety of common tasks, including classification, prediction, forecasting, market basket analysis, and clustering.

Moving on, you will discover how to detect anomalies and fraud, and ways to perform activity recognition, image recognition, and text analysis. By the end of the book, you will explore related web resources and technologies that will help you take your learning to the next level.

By applying the most effective machine learning methods to real-world problems, you will gain hands-on experience that will transform the way you think about data.
Table of Contents (19 chapters)
Machine Learning in Java
Credits
About the Author
About the Reviewers
www.PacktPub.com
Preface
References
Index

Appendix A. References

The following are the references for all the citations throughout the book:

  • Adomavicius, G. and Tuzhilin, A. Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6), 734-749. 2005.

  • Bengio, Y. Learning Deep Architectures for AI. Foundations and Trends in Machine Learning, 2(1), 1-127. 2009. Retrieved from http://www.iro.umontreal.ca/~bengioy/papers/ftml.pdf.

  • Blei, D. M., Ng, A. Y., and Jordan, M. I. Latent dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022. 2003. Retrieved from http://www.jmlr.org/papers/volume3/blei03a/blei03a.pdf.

  • Bondu, A., Lemaire, V., and Boullé, M. Exploration vs. exploitation in active learning: A Bayesian approach. The 2010 International Joint Conference on Neural Networks (IJCNN), Barcelona, Spain. 2010.

  • Breunig, M. M., Kriegel, H.-P., Ng, R. T., and Sander, J. LOF: Identifying Density-based Local Outliers. Proceedings from the 2000 ACM SIGMOD International Conference on Management of Data, 29(2), 93–104. 2000.

  • Campbell, A. T. (n.d.). Lecture 21 - Activity Recognition. Retrieved from http://www.cs.dartmouth.edu/~campbell/cs65/lecture22/lecture22.html.

  • Chandra, N.S. Unraveling the Customer Mind. 2012. Retrieved from http://www.cognizant.com/InsightsWhitepapers/Unraveling-the-Customer-Mind.pdf.

  • Dror, G., Boullé, M., Guyon, I., Lemaire, V., and Vogel, D. The 2009 Knowledge Discovery in Data Competition (KDD Cup 2009). Challenges in Machine Learning, Volume 3. Massachusetts, US. Microtome Publishing. 2009.

  • Gelman, A. and Nolan, D. Teaching Statistics: A Bag of Tricks. Cambridge, MA. Oxford University Press. 2002.

  • Goshtasby, A. A. Image Registration: Principles, Tools and Methods. London, Springer. 2012.

  • Greene, D. and Cunningham, P. Practical Solutions to the Problem of Diagonal Dominance in Kernel Document Clustering. Proceedings from the 23rd International Conference on Machine Learning, Pittsburgh, PA. 2006. Retrieved from http://www.autonlab.org/icml_documents/camera-ready/048_Practical_Solutions.pdf.

  • Gupta, A. Learning Apache Mahout Classification. Birmingham, UK. Packt Publishing. 2015.

  • Gutierrez, N.. Demystifying Market Basket Analysis. 2006. Retrieved from http://www.information-management.com/specialreports/20061031/1067598-1.html.

  • Hand, D., Mannila, H., and Smyth, P. Principles of Data Mining. USA. MIT Press. 2001. Retrieved from ftp://gamma.sbin.org/pub/doc/books/Principles_of_Data_Mining.pdf.

  • Intel. What Happens in an Internet Minute? 2013. Retrieved from http://www.intel.co.uk/content/www/uk/en/communications/internet-minute-infographic.html.

  • Kaluža, B. Instant Weka How-To. Birmingham, UK. Packt Publishing. 2013.

  • Karkera, K. R. Building Probabilistic Graphical Models with Python. Birmingham, UK. Packt Publishing. 2014.

  • KDD (n.d.). KDD Cup 2009: Customer relationship prediction. Retrieved from http://www.kdd.org/kdd-cup/view/kdd-cup-2009.

  • Koller, D. and Friedman, N. Probabilistic Graphical Models: Principles and Techniques. Cambridge, MA. MIT Press. 2012.

  • Kurucz, M., Siklósi, D., Bíró, I., Csizsek, P., Fekete, Z., Iwatt, R., Kiss, T., and Szabó, A. KDD Cup 2009 @ Budapest: feature partitioning and boosting. JMLR W&CP 7, 65–75. 2009.

  • Laptev, N., Amizadeh, S., and Billawala, Y. (n.d.). A Benchmark Dataset for Time Series Anomaly Detection. Retrieved from http://yahoolabs.tumblr.com/post/114590420346/a-benchmark-dataset-for-time-series-anomaly.

  • Kurgan, L. A. and Musilek, P. A survey of Knowledge Discovery and Data Mining process models. The Knowledge Engineering Review, 21(1), 1–24. 2006.

  • Lo, H.-Y., Chang, K.-W., Chen, S.-T., Chiang, T.-H., Ferng, C.-S., Hsieh, C.-J., Ko, Y.-K., Kuo, T.-T., Lai, H.-C., Lin, K.-Y., Wang, C.-H., Yu, H.-F., Lin, C.-J., Lin, H.-T., and Lin, S.-de. An Ensemble of Three Classifiers for KDD Cup 2009: Expanded Linear Model, Heterogeneous Boosting, and Selective Naive Bayes. JMLR W&CP 7, 57–64. 2009.

  • Magalhães, P. Incorrect information provided by your website. 2010. Retrieved from http://www.best.eu.org/aboutBEST/helpdeskRequest.jsp?req=f5wpxc8&auth=Paulo.

  • Mariscal, G., Marban, O., and Fernandez, C. A survey of data mining and knowledge discovery process models and methodologies. The Knowledge Engineering Review, 25(2), 137–166. 2010.

  • Mew, K. Android 5 Programming by Example. Birmingham, UK. Packt Publishing. 2015.

  • Miller, H., Clarke, S., Lane, S., Lonie, A., Lazaridis, D., Petrovski, S., and Jones, O. Predicting customer behavior: The University of Melbourne's KDD Cup report. JMLR W&CP 7, 45–55. 2009.

  • Niculescu-Mizil, A., Perlich, C., Swirszcz, G., Sindhwani, V., Liu, Y., Melville, P., Wang, D., Xiao, J., Hu, J., Singh, M., Shang, W. X., and Zhu, Y. F. Winning the KDD Cup Orange Challenge with Ensemble Selection. JMLR W&CP, 7, 23–34. 2009. Retrieved from http://jmlr.org/proceedings/papers/v7/niculescu09/niculescu09.pdf.

  • Oracle (n.d.). Anomaly Detection. Retrieved from http://docs.oracle.com/cd/B28359_01/datamine.111/b28129/anomalies.htm.

  • Osugi, T., Deng, K., and Scott, S.. Balancing exploration and exploitation: a new algorithm for active machine learning. Fifth IEEE International Conference on Data Mining, Houston, Texas. 2005.

  • Power, D. J. (ed.). DSS News. DSSResources.com, 3(23). 2002. Retrieved from http://www.dssresources.com/newsletters/66.php.

  • Quinlan, J. R. C4.5: Programs for Machine Learning. San Francisco, CA. Morgan Kaufmann Publishers. 1993.

  • Rajak, A.. Association Rule Mining-Applications in Various Areas. 2008. Retrieved from https://www.researchgate.net/publication/238525379_Association_rule_mining-_Applications_in_various_areas.

  • Ricci, F., Rokach, L., Shapira, B., and Kantor, P. B. (eds.). Recommender Systems Handbook. New York, Springer. 2010.

  • Rumsfeld, D. H. and Myers, G. DoD News Briefing – Secretary Rumsfeld and Gen. Myers. 2002. Retrieved from http://archive.defense.gov/transcripts/transcript.aspx?transcriptid=2636.

  • Stevens, S. S. On the Theory of Scales of Measurement. Science, 103(2684), 677–680. 1946.

  • Sutton, R. S. and Barto, A. G. Reinforcement Learning: An Introduction. Cambridge, MA. MIT Press. 1998.

  • Tiwary, C. Learning Apache Mahout. Birmingham, UK. Packt Publishing. 2015.

  • Tsai, J., Kaminka, G., Epstein, S., Zilka, A., Rika, I., Wang, X., Ogden, A., Brown, M., Fridman, N., Taylor, M., Bowring, E., Marsella, S., Tambe, M., and Sheel, A. ESCAPES - Evacuation Simulation with Children, Authorities, Parents, Emotions, and Social comparison. Proceedings from the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2(6), 457–464. 2011. Retrieved from http://www.aamas-conference.org/Proceedings/aamas2011/papers/D3_G57.pdf.

  • Tsanas, A. and Xifara, A. Accurate quantitative estimation of energy performance of residential buildings using statistical machine learning tools. Energy and Buildings, 49, 560-567. 2012.

  • Utts, J. What Educated Citizens Should Know About Statistics and Probability. The American Statistician, 57(2), 74-79. 2003.

  • Wallach, H. M., Murray, I., Salakhutdinov, R., and Mimno, D. Evaluation Methods for Topic Models. Proceedings from the 26th International Conference on Machine Learning, Montreal, Canada. 2009. Retrieved from http://mimno.infosci.cornell.edu/papers/wallach09evaluation.pdf.

  • Witten, I. H. and Frank, E. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. USA. Morgan Kaufmann Publishers. 2000.

  • Xie, J., Rojkova, V., Pal, S., and Coggeshall, S. A Combination of Boosting and Bagging for KDD Cup 2009. JMLR W&CP, 7, 35–43. 2009.

  • Zhang, H. The Optimality of Naive Bayes. Proceedings from the FLAIRS 2004 conference. 2004. Retrieved from http://www.cs.unb.ca/~hzhang/publications/FLAIRS04ZhangH.pdf.

  • Ziegler, C.-N., McNee, S. M., Konstan, J. A., and Lausen, G. Improving Recommendation Lists Through Topic Diversification. Proceedings from the 14th International World Wide Web Conference (WWW '05), Chiba, Japan. 2005. Retrieved from http://www2.informatik.uni-freiburg.de/~cziegler/papers/WWW-05-CR.pdf.