Julia Programming Projects

By: Adrian Salceanu

Overview of this book

Julia is a new programming language that offers a unique combination of performance and productivity. Its powerful features, friendly syntax, and speed are attracting a growing number of adopters from Python, R, and Matlab, effectively raising the bar for modern general and scientific computing. After six years in the making, Julia has reached version 1.0. Now is the perfect time to learn it, due to its large-scale adoption across a wide range of domains, including fintech, biotech, education, and AI. Beginning with an introduction to the language, Julia Programming Projects goes on to illustrate how to analyze the Iris dataset using DataFrames. You will explore functions and the type system, methods, and multiple dispatch while building a web scraper and a web app. Next, you'll delve into machine learning, where you'll build a books recommender system. You will also see how to apply unsupervised machine learning to perform clustering on the San Francisco business database. After metaprogramming, the final chapters will discuss dates and time, time series analysis, visualization, and forecasting. We'll close with package development, documenting, testing and benchmarking. By the end of the book, you will have gained the practical knowledge to build real-world applications in Julia.
Table of Contents (19 chapters)
Title Page
Copyright and Credits
Dedication
About Packt
Contributors
Preface
Index

Clustering


As you've probably come to realize by now, in data science there are almost always multiple ways to attack a problem. At the algorithmic level, depending on the particularities of the data and the specific problem we're trying to solve, we'll usually have more than one option. A wealth of choices is usually good news, as some algorithms produce better results than others on a given dataset. Clustering is no exception: a few well-known algorithms are available, but we must understand their strengths and their limitations in order to avoid ending up with irrelevant clusters.
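To make the idea of an algorithm's limitations concrete, here is a minimal sketch that is not taken from the book. It assumes a hand-rolled version of Lloyd's k-means, using the hypothetical helper names `ring` and `kmeans_lloyd` rather than any package API, and a synthetic dataset of two concentric rings chosen purely for illustration. A human would probably group the points by ring, but k-means separates its two clusters with a straight line, so it cannot isolate the outer ring.

```julia
using Statistics  # mean

# Evenly spaced points on a circle of radius r, as a 2×n matrix (one column per point).
# Hypothetical helper, used only to build the illustrative dataset.
ring(r, n) = hcat(([r * cos(2π * i / n), r * sin(2π * i / n)] for i in 0:n-1)...)

# Plain Lloyd's k-means (illustrative, not a library implementation):
# assign each point to its nearest centroid, then move each centroid to the
# mean of its assigned points, until the assignments stop changing.
function kmeans_lloyd(X, centroids; maxiter = 100)
    centroids = copy(centroids)
    k = size(centroids, 2)
    labels = zeros(Int, size(X, 2))
    for _ in 1:maxiter
        new_labels = [argmin([sum(abs2, X[:, j] .- centroids[:, c]) for c in 1:k])
                      for j in 1:size(X, 2)]
        new_labels == labels && break
        labels = new_labels
        for c in 1:k
            members = X[:, labels .== c]
            # Keep the previous centroid if a cluster ends up empty.
            isempty(members) || (centroids[:, c] = vec(mean(members, dims = 2)))
        end
    end
    return labels, centroids
end

n = 100
inner, outer = ring(1.0, n), ring(5.0, n)
X = hcat(inner, outer)

# Assumption: start from one point on each ring, so the run is deterministic.
labels, _ = kmeans_lloyd(X, X[:, [1, n + 1]])

inner_labels = labels[1:n]
outer_labels = labels[n+1:end]
println("inner ring cluster counts: ", [count(==(c), inner_labels) for c in 1:2])
println("outer ring cluster counts: ", [count(==(c), outer_labels) for c in 1:2])
```

The printed counts show the outer ring's points spread across both clusters. A density-based or connectivity-based algorithm would be a more natural fit for data shaped like this, which is why it matters to match the algorithm to the data.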

Scikit-learn, the famous Python machine learning library, drives the point home by using a few toy datasets. The datasets produce easily recognizable plots, making it easy for a human to identify the clusters. However, applying unsupervised learning algorithms will lead to strikingly different results—some of them in clear contradiction of what our human pattern recognition abilities...