Julia Programming Projects

By: Adrian Salceanu

Overview of this book

Julia is a new programming language that offers a unique combination of performance and productivity. Its powerful features, friendly syntax, and speed are attracting a growing number of adopters from Python, R, and Matlab, effectively raising the bar for modern general and scientific computing. After six years in the making, Julia has reached version 1.0. Now is the perfect time to learn it, due to its large-scale adoption across a wide range of domains, including fintech, biotech, education, and AI. Beginning with an introduction to the language, Julia Programming Projects goes on to illustrate how to analyze the Iris dataset using DataFrames. You will explore functions and the type system, methods, and multiple dispatch while building a web scraper and a web app. Next, you'll delve into machine learning, where you'll build a books recommender system. You will also see how to apply unsupervised machine learning to perform clustering on the San Francisco business database. After metaprogramming, the final chapters will discuss dates and time, time series analysis, visualization, and forecasting. We'll close with package development, documenting, testing and benchmarking. By the end of the book, you will have gained the practical knowledge to build real-world applications in Julia.

Building our Wikipedia crawler - take two

Our code runs as expected, refactored and neatly packed into a module. However, there's one more thing I'd like us to refactor before moving on. I'm not especially fond of our extractlinks function.

First of all, it naively iterates over all the HTML elements. For example, say that we also want to extract the title of the page—every time we want to process something that's not a link, we'll have to iterate over the whole document again. That's going to be resource-hungry and slow to run.

Secondly, we're reinventing the wheel. In Chapter 3, Setting Up the Wiki Game, we said that CSS selectors are the lingua franca of DOM parsing. We'd benefit massively from using the concise syntax of CSS selectors with the underlying optimizations provided by specialized libraries.

Fortunately, we don't need to look too far for this kind of functionality. Julia's Pkg system provides access to Cascadia, a native CSS selector library. And, the great thing about it is...
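To make the two complaints above concrete, here is a minimal sketch of the selector-based approach, assuming the Gumbo and Cascadia packages are installed. The sample HTML, the `firstHeading` id, and the variable names are made up for illustration and are not the code from this chapter. The document is parsed once, and both the links and the heading are then pulled out of the same parsed tree with CSS selectors.

```julia
using Gumbo     # HTML parser: provides parsehtml and getattr
using Cascadia  # CSS selectors: provides Selector, eachmatch methods, and nodeText

# Hypothetical sample page standing in for a downloaded Wikipedia article.
const SAMPLE_HTML = """
<html>
  <head><title>Julia (programming language)</title></head>
  <body>
    <h1 id="firstHeading">Julia</h1>
    <p>
      <a href="/wiki/Multiple_dispatch">Multiple dispatch</a>
      <a href="https://julialang.org">Official site</a>
      <a href="/wiki/LLVM">LLVM</a>
    </p>
  </body>
</html>
"""

# Parse once; every query below reuses this parsed document.
doc = parsehtml(SAMPLE_HTML)

# Internal links only: anchors whose href attribute starts with /wiki/.
link_nodes = eachmatch(Selector("a[href^='/wiki/']"), doc.root)
links = [getattr(node, "href") for node in link_nodes]

# Something that is not a link (the heading), taken from the same document.
heading = nodeText(first(eachmatch(Selector("#firstHeading"), doc.root)))

println(links)
println(heading)
```

The attribute selector does the filtering that would otherwise be hand-written inside the traversal, so extracting another piece of the page is just one more selector instead of another manual pass over every element.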