Julia for Data Science

By: Anshul Joshi

Overview of this book

Julia is a fast, high-performance language that is well suited to data science, with a mature package ecosystem, and it is now feature complete. It is a good tool for a data science practitioner. A famous Harvard Business Review article called data scientist the sexiest job of the 21st century (https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century). This book will help you get familiar with Julia's rich, continuously evolving ecosystem, allowing you to stay on top of your game. It covers the essentials of data science and gives a high-level overview of advanced statistics and techniques. You will generate insights by performing inferential statistics and reveal hidden patterns and trends using data mining. The book offers practical coverage of statistics and machine learning. You will develop the knowledge to build statistical models and machine learning systems in Julia with attractive visualizations. You will then delve into the world of deep learning in Julia and learn the Mocha.jl framework, with which you can create artificial neural networks and implement deep learning. This book addresses the challenges of real-world data science problems, including data cleaning, data preparation, inferential statistics, statistical modeling, building high-performance machine learning systems, and creating effective visualizations using Julia.
Table of Contents (17 chapters)
Julia for Data Science
Credits
About the Author
About the Reviewer
www.PacktPub.com
Preface

Counting functions


In data exploration, we often need to count how many times each value in a range occurs. This helps to identify the most and least frequent values. Julia's StatsBase package provides the counts function to count over a range. Let's say we have an array of values. For convenience, we will use the rand function to create an array:
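The original listing is not reproduced here, so the following is only a minimal sketch of how such an array might be created. The variable name x is an assumption, and because no seed is set the values differ on every run.

```julia
# Draw 30 integers uniformly at random from the range 1 to 5.
# No seed is set, so the contents of x change from run to run.
x = rand(1:5, 30)

println(length(x))   # number of elements in the array
println(extrema(x))  # smallest and largest values actually drawn
```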

We have created an array of 30 values ranging from 1 to 5. Now we want to know how many times each value occurs in the dataset:

Using the counts function, we found that 1 occurs 7 times, 2 occurs once, 3 occurs 5 times, 4 occurs 11 times, and 5 occurs 6 times. The counts function accepts different arguments to suit the use case.
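As an illustration, the tallies quoted above can be reproduced with a hand-built array. This is a sketch under assumptions: the array below is hypothetical data constructed to match those tallies (the book's array was random), and StatsBase must be installed separately.

```julia
using StatsBase  # third-party package; install with: import Pkg; Pkg.add("StatsBase")

# Hypothetical data, built by hand so the tallies match those quoted in the text.
x = vcat(fill(1, 7), fill(2, 1), fill(3, 5), fill(4, 11), fill(5, 6))

# Count occurrences of each integer in 1:5; position i holds the count for value i.
c = counts(x, 1:5)
println(c)  # [7, 1, 5, 11, 6]

# Because the levels are 1:5, the index of the largest/smallest count is the value itself.
most_common = argmax(c)   # 4
least_common = argmin(c)  # 2
println((most_common, least_common))
```

Passing the range explicitly keeps the result aligned with the values 1 to 5 even if one of them never appears in the data.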

The proportions function computes the proportion of each value in the dataset:

We calculated proportions on the same dataset that we used in the previous examples. It shows that the proportion of the value 1 in the dataset is 0.23333 (7 of the 30 values). This can also be read as the empirical probability of picking that value from the dataset at random.
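A minimal sketch of the same calculation, again using a hypothetical hand-built array whose tallies match the ones quoted earlier (7, 1, 5, 11, and 6 occurrences of the values 1 to 5) rather than the book's random data:

```julia
using StatsBase  # third-party package; install with: import Pkg; Pkg.add("StatsBase")

# Hypothetical data matching the tallies quoted in the text.
x = vcat(fill(1, 7), fill(2, 1), fill(3, 5), fill(4, 11), fill(5, 6))

# Proportion of each integer in 1:5, i.e. its count divided by the array length.
p = proportions(x, 1:5)

println(round(p[1], digits=5))  # 0.23333, which is 7/30
println(sum(p) ≈ 1.0)           # the proportions add up to one
```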

Other count functions include:

  • countmap(arr): This is a map function that maps the values to the number...