Mathematics of Machine Learning

By: Tivadar Danka

Overview of this book

Mathematics of Machine Learning provides a rigorous yet accessible introduction to the mathematical underpinnings of machine learning, designed for engineers, developers, and data scientists ready to elevate their technical expertise. With this book, you’ll explore the core disciplines of linear algebra, calculus, and probability theory essential for mastering advanced machine learning concepts. PhD mathematician turned ML engineer Tivadar Danka, known for his intuitive teaching style that has attracted 100k+ followers, guides you through complex concepts with clarity, providing the structured guidance you need to deepen your theoretical knowledge and enhance your ability to solve complex machine learning problems.

Balancing theory with application, this book offers clear explanations of mathematical constructs and their direct relevance to machine learning tasks. Through practical Python examples, you’ll learn to implement and use these ideas in real-world scenarios, such as training machine learning models with gradient descent or working with vectors, matrices, and tensors. By the end of this book, you’ll have gained the confidence to engage with advanced machine learning literature and tailor algorithms to meet specific project requirements.
Table of Contents (36 chapters)

Part 1: Linear Algebra
References
Part 2: Calculus
References
Part 3: Multivariable Calculus
References
Part 4: Probability Theory
References
Part 5: Appendix
Other Books You May Enjoy
Index

What this book covers

Chapter 1, Vectors and vector spaces, covers what vectors are and how to work with them. We’ll travel from concrete examples through precise mathematical definitions to implementations, understanding vector spaces and NumPy arrays, which are used to represent vectors efficiently.
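
For a flavor of how this looks in code, here is a minimal sketch (an illustration, not an excerpt from the chapter) of vectors as NumPy arrays and the two defining vector space operations:

```python
import numpy as np

# Vectors represented as one-dimensional NumPy arrays
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# The two vector space operations: addition and scalar multiplication
print(x + y)     # [5. 7. 9.]
print(2.5 * x)   # [2.5 5.  7.5]
```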

Chapter 2, The geometric structure of vector spaces moves forward by studying the concept of norms, distances, inner products, angles, and orthogonality, enhancing the algebraic definition of vector spaces with some much-needed geometric structure. These are not just tools for visualization; they play a crucial role in machine learning. We’ll also encounter our first algorithm, the Gram-Schmidt orthogonalization method, turning any set of vectors into an orthonormal basis.
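
As an illustrative sketch of these geometric notions (not the chapter’s own code), NumPy makes inner products, norms, angles, and orthogonality checks one-liners:

```python
import numpy as np

x = np.array([1.0, 0.0])
y = np.array([1.0, 1.0])

# Inner product, norms, and the angle enclosed by the two vectors
dot = np.dot(x, y)
norm_x, norm_y = np.linalg.norm(x), np.linalg.norm(y)
angle = np.arccos(dot / (norm_x * norm_y))   # pi/4 radians, i.e. 45 degrees
print(angle)

# Two vectors are orthogonal when their inner product is zero
print(np.isclose(np.dot(x, np.array([0.0, 1.0])), 0.0))  # True
```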

In Chapter 3, Linear algebra in practice, we break out NumPy once more, and implement everything that we’ve learned so far. Here, we learn how to work with the high-performance NumPy arrays in practice: operations, broadcasting, functions, culminating in the from-scratch implementation of the Gram-Schmidt algorithm. This is also the first time we encounter matrices, the workhorses of linear algebra.
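
To give a flavor of where the chapter is headed, here is a compact from-scratch Gram-Schmidt sketch in NumPy; the book develops its own, more careful implementation:

```python
import numpy as np

def gram_schmidt(vectors):
    """Turn a list of linearly independent vectors into an orthonormal basis."""
    basis = []
    for v in vectors:
        # Subtract the projections onto the already-computed basis vectors
        w = v - sum(np.dot(v, b) * b for b in basis)
        basis.append(w / np.linalg.norm(w))
    return np.array(basis)

vectors = [np.array([1.0, 1.0, 0.0]),
           np.array([1.0, 0.0, 1.0]),
           np.array([0.0, 1.0, 1.0])]
Q = gram_schmidt(vectors)
print(np.allclose(Q @ Q.T, np.eye(3)))   # True: the rows form an orthonormal basis
```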

Chapter 4, Linear transformations, is about the true nature of matrices; that is, structure-preserving transformations between vector spaces. This way, seemingly arcane things – such as the definition of matrix multiplication – suddenly make sense. Once more, we take the leap from algebraic structures to geometric ones, allowing us to study matrices as transformations that distort their underlying space. We’ll also look at one of the most important descriptors of matrices: the determinant, which describes how the underlying linear transformation affects the volume of the space.
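
A tiny illustration of this viewpoint (not taken from the chapter): a 2×2 matrix maps the unit square to the parallelogram spanned by its columns, and the determinant measures how the area scales:

```python
import numpy as np

# A 2x2 matrix viewed as a transformation of the plane
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# The standard basis vectors are mapped to the columns of A ...
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(A @ e1, A @ e2)      # [2. 0.] [1. 3.]

# ... and the determinant tells us how areas scale under the transformation
print(np.linalg.det(A))    # 6.0: the unit square's area is scaled by 6
```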

Chapter 5, Matrices and equations, presents the third (and for us, the final) face of matrices: systems of linear equations. In this chapter, we first learn how to solve systems of linear equations by hand using Gaussian elimination, then supercharge it via our newfound knowledge of linear algebra, obtaining the mighty LU decomposition. With the help of the LU decomposition, we go hard and achieve a roughly 70,000× speedup on computing determinants.
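
As a rough sketch of the idea (using SciPy’s lu_factor as a stand-in for the decomposition the chapter builds by hand), the determinant falls out of the LU factors almost for free:

```python
import numpy as np
from scipy.linalg import lu_factor

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))

# LU factorization with partial pivoting: P A = L U, L and U triangular
lu, piv = lu_factor(A)

# The determinant is the product of U's diagonal, up to the sign of the
# row interchanges recorded in piv
sign = (-1.0) ** np.sum(piv != np.arange(len(piv)))
det_via_lu = sign * np.prod(np.diag(lu))

print(np.isclose(det_via_lu, np.linalg.det(A)))   # True
```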

Chapter 6 introduces two of the most important descriptors of matrices: eigenvalues and eigenvectors. Why do we need them?
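
In code, the defining property Av = λv is easy to check; here is an illustrative NumPy snippet, not an excerpt from the chapter:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Eigenvalues and eigenvectors: A v = lambda v
eigenvalues, eigenvectors = np.linalg.eig(A)

for lam, v in zip(eigenvalues, eigenvectors.T):
    print(np.allclose(A @ v, lam * v))   # True for every eigenpair
print(eigenvalues)                        # 3 and 1 (possibly in a different order)
```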

Because in Chapter 7, Matrix factorizations, we are able to reach the pinnacle of linear algebra with their help. First, we show that real symmetric matrices can be written in diagonal form by constructing a basis from their eigenvectors, a result known as the spectral decomposition theorem. In turn, a clever application of the spectral decomposition leads to the singular value decomposition, the single most important result of linear algebra.
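
Both decompositions are a few lines in NumPy; the following sketch (illustrative only) verifies them numerically:

```python
import numpy as np

# Spectral decomposition of a real symmetric matrix: A = Q diag(lambda) Q^T
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
eigenvalues, Q = np.linalg.eigh(A)
print(np.allclose(Q @ np.diag(eigenvalues) @ Q.T, A))   # True

# Singular value decomposition of an arbitrary (even non-square) matrix
B = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0]])
U, S, Vt = np.linalg.svd(B, full_matrices=False)
print(np.allclose(U @ np.diag(S) @ Vt, B))               # True
```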

Chapter 8, Matrices and graphs closes the linear algebra part of the book by studying the fruitful connection between linear algebra and graph theory. By representing matrices as graphs, we are able to show deep results such as the Frobenius normal form, or even talk about the eigenvalues and eigenvectors of graphs.
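
As a small illustration of the dictionary between graphs and matrices (not the chapter’s example), the eigenvalues of a graph are simply the eigenvalues of its adjacency matrix:

```python
import numpy as np

# A graph on four vertices, represented by its adjacency matrix:
# entry (i, j) is 1 if vertices i and j are connected
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

# The "eigenvalues of the graph" are the eigenvalues of this matrix
eigenvalues = np.linalg.eigvalsh(A)   # eigvalsh, since A is symmetric
print(eigenvalues)
```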

In Chapter 9, Functions, we take a detailed look at functions, a concept that we have used intuitively so far. This time, we make the intuition mathematically precise, learning that functions are essentially arrows between dots.

Chapter 10, Numbers, sequences, and series continues down the rabbit hole, looking at the concept of numbers. Each step from natural numbers towards real numbers represents a conceptual jump, peaking at the study of sequences and series.
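
For a concrete taste (an illustrative sketch, not the chapter’s example), the partial sums of a geometric series settle down to a limit:

```python
# Partial sums of the geometric series 1 + q + q^2 + ... converge to 1 / (1 - q)
q = 0.5
total = 0.0
for n in range(20):
    total += q ** n

print(total)   # ~2.0, approaching 1 / (1 - 0.5) = 2
```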

With Chapter 11, Topology, limits, and continuity, we are almost at the really interesting parts. However, in calculus, the objects, concepts, and tools are most often described in terms of limits and continuous functions. So, we take a detailed look at what they are.

Chapter 12 is about the single most important concept in calculus: Differentiation. In this chapter, we learn that the derivative of a function describes 1) the slope of the tangent line, and 2) the best local linear approximation to a function. From a practical side, we also look at how derivatives behave with respect to operations, most importantly the function composition, yielding the essential chain rule, the bread and butter of backpropagation.
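
Both points are easy to see numerically; the following sketch (illustrative, not from the chapter) checks the tangent-line approximation and the chain rule with finite differences:

```python
import numpy as np

# f(x) = x^2 and its derivative f'(x) = 2x
f = lambda x: x ** 2
df = lambda x: 2 * x

# Near x0, the tangent line f(x0) + f'(x0)(x - x0) is the best linear approximation
x0, h = 1.0, 1e-4
tangent = f(x0) + df(x0) * h
print(abs(f(x0 + h) - tangent))          # ~1e-8: the error shrinks quadratically in h

# Chain rule: d/dx f(g(x)) = f'(g(x)) * g'(x), checked against a finite difference
g, dg = np.sin, np.cos
composed = lambda x: f(g(x))
x = 0.7
analytic = df(g(x)) * dg(x)
numeric = (composed(x + h) - composed(x - h)) / (2 * h)
print(np.isclose(analytic, numeric))     # True
```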

After all the setup, Chapter 13, Optimization introduces the algorithm that is used to train virtually every neural network: gradient descent. For that, we learn how the derivative describes the monotonicity of functions and how local extrema can be characterized with the first and second order derivatives.
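
In its simplest, single-variable form, gradient descent fits in a few lines; here is an illustrative sketch, not the book’s implementation:

```python
# Minimal gradient descent sketch for f(x) = (x - 3)^2, whose minimum is at x = 3
df = lambda x: 2 * (x - 3)   # the derivative of f

x, learning_rate = 0.0, 0.1
for _ in range(100):
    x -= learning_rate * df(x)

print(x)   # ~3.0
```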

Chapter 14, Integration, wraps up our study of univariate functions. Intuitively speaking, integration describes the (signed) area under the function’s graph, but upon closer inspection, it also turns out to be the inverse of differentiation. In machine learning (and throughout all of mathematics, really), integrals describe various probabilities, expected values, and other essential quantities.
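
As a quick numerical illustration (not from the chapter), a Riemann sum approximates the signed area under a graph:

```python
import numpy as np

# Approximate the integral of f(x) = x^2 on [0, 1] with a Riemann sum
f = lambda x: x ** 2
n = 10_000
x = np.linspace(0.0, 1.0, n, endpoint=False)
riemann_sum = np.sum(f(x)) * (1.0 / n)

print(riemann_sum)   # ~0.3333, the exact value is 1/3
```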

Now that we understand how calculus is done in single variables, Chapter 15 leads us to the world of Multivariable functions, where machine learning is done. There, we have an entire zoo of functions: scalar-vector, vector-scalar, and vector-vector ones.
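
Roughly, and without committing to the chapter’s exact terminology, the three kinds of functions look like this in code (an illustrative sketch):

```python
import numpy as np

# Scalars to vectors: for example, a curve in the plane
curve = lambda t: np.array([np.cos(t), np.sin(t)])

# Vectors to scalars: for example, a loss-like function
loss = lambda x: np.sum(x ** 2)

# Vectors to vectors: for example, a linear layer without an activation
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
layer = lambda x: A @ x

print(curve(0.0), loss(np.array([1.0, 2.0])), layer(np.array([1.0, 1.0])))
```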

In Chapter 16, Derivatives and gradients, we continue our journey, overcoming the difficulties of generalizing differentiation to multivariable functions. Here, we have three kinds of derivatives: partial, total, and directional; resulting in the gradient vector and the Jacobian and Hessian matrices.
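
As an illustrative sketch (not the chapter’s code), partial derivatives, and hence the gradient, can be approximated with finite differences:

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Approximate the gradient of a vector-to-scalar function f at x."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        step = np.zeros_like(x)
        step[i] = h
        grad[i] = (f(x + step) - f(x - step)) / (2 * h)   # partial derivative in x_i
    return grad

# f(x, y) = x^2 + 3xy, with gradient (2x + 3y, 3x)
f = lambda v: v[0] ** 2 + 3 * v[0] * v[1]
print(numerical_gradient(f, np.array([1.0, 2.0])))   # ~[8. 3.]
```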

As expected, optimization is also slightly more complicated in multiple variables. This issue is cleared up by Chapter 17, Optimization in multiple variables, where we learn the analogue of the univariate second-derivative test, and implement the almighty gradient descent in its final form, concluding our study of calculus.
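
Gradient descent in multiple variables looks just like the univariate version, with the gradient in place of the derivative; here is an illustrative sketch, not the book’s final implementation:

```python
import numpy as np

# Gradient descent on f(x, y) = (x - 1)^2 + 10 (y + 2)^2, minimized at (1, -2)
grad = lambda v: np.array([2 * (v[0] - 1), 20 * (v[1] + 2)])

x = np.array([5.0, 5.0])
learning_rate = 0.05
for _ in range(500):
    x -= learning_rate * grad(x)

print(x)   # ~[ 1. -2.]
```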

Now that we have a mechanistic understanding of machine learning, Chapter 18, What is probability? shows us how to reason and model under uncertainty. In mathematical terms, probability spaces are defined by the Kolmogorov axioms, and we’ll also learn the tools that allow us to work with probabilistic models.
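
As a toy illustration (not the chapter’s example), a fair die gives a finite probability space on which the axioms are easy to verify:

```python
from fractions import Fraction

# A finite probability space: a fair die, with every outcome equally likely
omega = {1, 2, 3, 4, 5, 6}
P = lambda event: Fraction(len(event & omega), len(omega))

# The Kolmogorov axioms, checked on this example
assert P(omega) == 1                      # the whole sample space has probability 1
assert all(P({x}) >= 0 for x in omega)    # probabilities are non-negative
A, B = {1, 2}, {5, 6}                     # disjoint events
assert P(A | B) == P(A) + P(B)            # additivity for disjoint events

print(P({2, 4, 6}))   # 1/2, the probability of rolling an even number
```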

Chapter 19 introduces Random variables and distributions, allowing us not only to bring the tools of calculus into probability theory, but to compact probabilistic models into sequences or functions.
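
An illustrative sketch (not from the chapter): simulating a random variable and reading off its empirical distribution with NumPy:

```python
import numpy as np

rng = np.random.default_rng(42)

# A random variable: the sum of two fair dice, simulated 100,000 times
samples = rng.integers(1, 7, size=(100_000, 2)).sum(axis=1)

# Its empirical distribution: a probability assigned to each possible value
values, counts = np.unique(samples, return_counts=True)
empirical_pmf = counts / len(samples)

print(dict(zip(values.tolist(), empirical_pmf.round(3))))   # e.g. P(sum = 7) ~ 1/6
```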

Finally, in Chapter 20, we learn the concept of The expected value, quantifying probabilistic models and distributions with averages, variances, covariances, and entropy.
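
The following sketch (illustrative only) estimates these quantities from samples and computes the entropy of a small discrete distribution:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=100_000)
y = 0.5 * x + rng.normal(size=100_000)

# Expected value and variance, estimated from samples
print(x.mean())   # ~2.0
print(x.var())    # ~9.0

# Covariance between two related random variables
print(np.cov(x, y)[0, 1])   # ~4.5, since Cov(x, 0.5x + noise) = 0.5 Var(x)

# Entropy of a discrete distribution, in bits
p = np.array([0.5, 0.25, 0.25])
print(-np.sum(p * np.log2(p)))   # 1.5
```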
