Java Data Analysis

By: John R. Hubbard

Overview of this book

Data analysis is the process of inspecting, cleansing, transforming, and modeling data with the aim of discovering useful information. Java is one of the most popular languages for data analysis tasks. This book will help you learn the tools and techniques for conducting data analysis in Java. After a quick overview of what data science is and the steps involved in the process, you’ll learn statistical data analysis techniques and implement them using popular Java APIs and libraries. Through practical examples, you will also learn machine learning concepts such as classification and regression. In the process, you’ll familiarize yourself with tools such as RapidMiner and WEKA and see how these Java-based tools can be used effectively for analysis. You will also learn how to analyze text and other types of multimedia, and how to work with relational, NoSQL, and time-series data. This book will also show you how to use different Java-based libraries to create insightful and easy-to-understand plots and graphs. By the end of this book, you will have a solid understanding of the various data analysis techniques and how to implement them using Java.

The Apache Commons Math Library


The Apache Software Foundation is an American non-profit corporation that supports open source software projects written by volunteers. It includes a vast number of projects in widely diverse fields (see https://en.wikipedia.org/wiki/List_of_Apache_Software_Foundation_projects). One part of this collection is called Apache Commons, consisting of various Java libraries. These can be downloaded from http://commons.apache.org/downloads/.

We used the Apache Commons Math Library in Chapter 6, Regression Analysis. Here are the steps to follow to use it in your NetBeans projects:

  1. Download either the .tar.gz file or the .zip file of the most recent version (3.6.1, as of August 2017) of the commons-math archive from: http://commons.apache.org/proper/commons-math/download_math.cgi.

  2. Expand the archive (double-click or right-click on it), and then copy the resulting folder to a location where other Java libraries are kept on your machine (for example, Library/Java/Extensions).

  3. In NetBeans, select Libraries from the Tools menu.

  4. In the Ant Library Manager window, click on the New Library… button.

  5. In the New Library dialogue, enter Apache Commons Math for the Library Name (see Figure A-32), and click OK. This adds that library name to your NetBeans Libraries list.

    Figure A-32. Creating a new library in NetBeans

  6. Next, with the Classpath tab selected, click the Add JAR/Folder… button and then navigate to the folder that you copied in step 2. Select the commons-math3-3.6.1.jar file inside that folder (see Figure A-33) and click on the Add JAR/Folder button.

    Figure A-33. Locating the JAR file for a NetBeans library

  7. Switch to the Javadoc tab and repeat step 6, except this time, click the Add ZIP/Folder… button and then select the commons-math3-3.6.1-javadoc.jar file from that same folder.

  8. Then click OK. You have now defined a NetBeans library named Apache Commons Math, containing both the compiled JAR and the Javadoc JAR for the commons-math3-3.6.1 release that you downloaded. Henceforth, you can easily add this library to any NetBeans project that needs it.

  9. To designate that library for use by your current (or any other) project, right-click on the project icon and select Properties. Then select Libraries in the Categories list on the left.

  10. Click on the Add Library… button, and then select your Apache Commons Math library from the list. Click on the Add Library button and then OK (see Figure A-34). Now, you can use all the items in that library in any source code that you write in that project. Moreover, the Javadocs should work the same for those packages, interfaces, classes, and members as they do for standard Java code.

    Figure A-34. Adding the Apache Commons Math Library to a specific NetBeans project
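
Steps 6 and 7 rely on the expanded folder containing both commons-math3-3.6.1.jar and commons-math3-3.6.1-javadoc.jar. If NetBeans cannot find one of them, a minimal sketch like the following can confirm which files are actually present. The class name LibraryFolderCheck and the method missingJars are hypothetical names chosen for this illustration, and the folder path is assumed to be passed as the first command-line argument.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of NetBeans or Apache Commons Math.
class LibraryFolderCheck {

    // The two files selected in steps 6 and 7.
    static final String[] EXPECTED = {
        "commons-math3-3.6.1.jar",
        "commons-math3-3.6.1-javadoc.jar"
    };

    // Returns the names of the expected JAR files that are not in the given folder.
    static List<String> missingJars(Path folder) {
        List<String> missing = new ArrayList<>();
        for (String name : EXPECTED) {
            if (!Files.isRegularFile(folder.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Assumption: the folder copied in step 2 is given as the first argument;
        // the current directory is used otherwise.
        Path folder = Paths.get(args.length > 0 ? args[0] : ".");
        List<String> missing = missingJars(folder);
        if (missing.isEmpty()) {
            System.out.println("Both JAR files found in " + folder);
        } else {
            System.out.println("Missing from " + folder + ": " + missing);
        }
    }
}
```

Run it with the path of the folder from step 2 as its argument. An empty result from missingJars means both files are where the Classpath and Javadoc tabs expect them.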

To test your installation, do this:

  1. In your main class, add this code:

    SummaryStatistics stats;

  2. Since the SummaryStatistics class is not part of the standard Java API and has not yet been imported, NetBeans will mark that line as erroneous.

  3. Click on the tiny red ball in the current line's margin. A drop-down list appears (see Figure A-35).

    Figure A-35. Adding the correct import statement in NetBeans

  4. With the Add import item selected (the first line in the list), press Enter. That will cause the import statement to be inserted in your source code (line 7, Figure A-36).

    Figure A-36. Getting the import statement inserted automatically in NetBeans

  5. Next, click on the SummaryStatistics class name where the stats variable is being declared, and then select Show Javadoc from the drop-down menu that appears. That should bring up the Javadoc page for that class in your default web browser.

  6. Once you have any Javadoc page displayed from a library (such as org.apache.commons.math3), you can easily investigate all the other subpackages, interfaces, and classes in that library. Just click on the Frames link at the top of the page and then use the sliding lists in the two frames on the left (Figure A-37).

    Figure A-37. Surveying the other Javadocs in a library
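
The import that NetBeans inserts in step 4 refers to the class org.apache.commons.math3.stat.descriptive.SummaryStatistics. As a check that does not depend on the editor's hints, a minimal sketch like the following reports whether that class is visible on the project's classpath at run time. The class name ClasspathCheck and the method isOnClasspath are hypothetical names chosen for this illustration. The sketch uses only the standard Java API, so it compiles whether or not the library has been added.

```java
// Hypothetical helper for illustration; not part of NetBeans or Apache Commons Math.
class ClasspathCheck {

    // Returns true if the named class can be located by this class's class loader.
    static boolean isOnClasspath(String className) {
        try {
            // initialize = false: only locate the class, do not run its static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = "org.apache.commons.math3.stat.descriptive.SummaryStatistics";
        System.out.println(name + (isOnClasspath(name) ? " found" : " NOT found"));
    }
}
```

If the program reports that the class was not found when run from inside the project, the library has probably not been added to that project's Libraries category (steps 9 and 10 of the installation).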