Notes, examples, and Python demos for the 2nd edition of the textbook "Machine Learning Refined" (published by Cambridge University Press).

Below you will find a range of resources that complement the 2nd edition of Machine Learning Refined (published by Cambridge University Press).

- Sample chapters from the 2nd edition
- A sampler of widgets / pedagogy
- Online notes (jupyter notebooks)
- What is new in the second edition?
- How to use the book
- Technical prerequisites
- Coding exercises
- Slides and additional instructor resources
- Errata
- Get a copy of the book
- Reviews and Endorsements
- Software installation and dependencies
- Contact

We believe mastery of a certain machine learning concept/topic is achieved only when the answer to each of the following three questions is affirmative.

Can you describe the idea with a simple picture? `Intuition`

Can you express your intuition in mathematical notation and derive underlying models/cost functions? `Mathematical derivation`

Can you code up your derivations in a programming language, say Python, without using high-level libraries? `Implementation`

**Intuition comes first.** Intuitive leaps precede intellectual ones, and because of this we have included over 300 color illustrations in the book, meticulously designed to enable an intuitive grasp of technical concepts. Many of those illustrations are snapshots of animations that show the convergence of certain algorithms, the evolution of certain models from underfitting all the way to overfitting, etc. Concepts of this sort are best illustrated and intuited using animations (as opposed to static figures). You'll find a large number of such animations in this repository, which you can also modify yourself via the raw Jupyter notebook versions of these notes. Here are just a few examples:

- Cross-validation (regression)
- Cross-validation (two-class classification)
- Cross-validation (multi-class classification)
- K-means clustering
- Feature normalization
- Normalized gradient descent
- Rotation
- Convexification
- Dogification!
- A nonlinear transformation
- Weighted classification
- The moving average
- Batch normalization
- Logistic regression
- Polynomials vs. NNs vs. Trees (regression)
- Polynomials vs. NNs vs. Trees (classification)
- Changing gradient descent's steplength (1d)
- Changing gradient descent's steplength (2d)
- Convex combination of two functions
- Taylor series approximation
- Feature selection via regularization
- Secant planes
- Function approximation with a neural network
- A regression tree

**Mathematical optimization: the workhorse of machine learning.** We strongly emphasize the importance of mathematical optimization in our treatment of machine learning. Optimization is fundamental at many levels, from the tuning of individual models to the general selection of appropriate nonlinearities via cross-validation. Because of this, a strong understanding of mathematical optimization is requisite if one wishes to deeply understand machine learning and to implement fundamental algorithms. Part I of the book provides a complete introduction to mathematical optimization, covering the zero-, first-, and second-order methods that are relied upon later in deriving and tuning machine learning models.
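As a rough illustration of the three families of methods named above, here is a minimal sketch in plain Python. It is not code from the book: the test function `g`, its hand-derived derivatives, the steplengths, the starting point, and all function names are assumptions chosen only to keep the example small.

```python
import random


def g(w):
    """Illustrative convex test function (an assumption, not from the book)."""
    return w**4 + w**2 - 4.0 * w


def dg(w):
    """First derivative of g, derived by hand."""
    return 4.0 * w**3 + 2.0 * w - 4.0


def d2g(w):
    """Second derivative of g, derived by hand; always positive, so g is convex."""
    return 12.0 * w**2 + 2.0


def random_search(w, steps=200, alpha=0.1, seed=0):
    """Zero-order: uses only evaluations of g. Try a step in a random
    direction and keep it only if it lowers the function value."""
    rng = random.Random(seed)
    for _ in range(steps):
        candidate = w + alpha * rng.choice([-1.0, 1.0])
        if g(candidate) < g(w):
            w = candidate
    return w


def gradient_descent(w, steps=100, alpha=0.05):
    """First-order: step against the first derivative with a fixed steplength."""
    for _ in range(steps):
        w = w - alpha * dg(w)
    return w


def newtons_method(w, steps=20):
    """Second-order: scale the first derivative by the inverse second derivative."""
    for _ in range(steps):
        w = w - dg(w) / d2g(w)
    return w


w0 = 2.0  # hypothetical starting point
w_zero = random_search(w0)
w_first = gradient_descent(w0)
w_second = newtons_method(w0)
print(w_zero, w_first, w_second)
```

With a fixed steplength, the random search in this sketch can only get within roughly `alpha` of the minimizer, whereas the two derivative-based methods drive the first derivative to (near) zero.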

**Learning by doing.** We place significant emphasis on the design and implementation of algorithms throughout the text, with implementations of fundamental algorithms given in Python. These fundamental examples can then be used as building blocks to help the reader complete the text's programming exercises, allowing them to "get their hands dirty" and "learn by doing," practicing the concepts introduced in the body of the text. While in principle any programming language can be used to complete the text's coding exercises, we highly recommend Python for its ease of use and large support community. We also recommend the open-source Python libraries NumPy, autograd, and matplotlib, as well as the Jupyter notebook editor, to make implementing and testing code easier. A complete set of installation instructions, datasets, and starter notebooks can be found in this repository.

A select number of Chapters/Sections are highlighted below and are linked to HTML notes that served as *early drafts* for the second edition of the textbook. You can find these HTML files, as well as the Jupyter notebooks that created them, in the `notes` subdirectory.

1.1 Introduction

1.2 Distinguishing Cats from Dogs: a Machine Learning Approach

1.3 The Basic Taxonomy of Machine Learning Problems

1.4 Mathematical Optimization

1.5 Conclusion

2.1 Introduction

2.2 The Zero-Order Optimality Condition

2.3 Global Optimization Methods

2.4 Local Optimization Methods

2.5 Random Search

2.6 Coordinate Search and Descent

2.7 Conclusion

2.8 Exercises

3.1 Introduction

3.2 The First-Order Optimality Condition

3.3 The Geometry of First-Order Taylor Series

3.4 Computing Gradients Efficiently

3.5 Gradient Descent

3.6 Two Natural Weaknesses of Gradient Descent

3.7 Conclusion

3.8 Exercises

4.1 The Second-Order Optimality Condition

4.2 The Geometry of Second-Order Taylor Series

4.3 Newton’s Method

4.4 Two Natural Weaknesses of Newton’s Method

4.5 Conclusion

4.6 Exercises

5.1 Introduction

5.2 Least Squares Linear Regression

5.3 Least Absolute Deviations

5.4 Regression Quality Metrics

5.5 Weighted Regression

5.6 Multi-Output Regression

5.7 Conclusion

5.8 Exercises

5.9 Endnotes

6.1 Introduction

6.2 Logistic Regression and the Cross Entropy Cost

6.3 Logistic Regression and the Softmax Cost

6.4 The Perceptron

6.5 Support Vector Machines

6.6 Which Approach Produces the Best Results?

6.7 The Categorical Cross Entropy Cost

6.8 Classification Quality Metrics

6.9 Weighted Two-Class Classification

6.10 Conclusion

6.11 Exercises

7.1 Introduction

7.2 One-versus-All Multi-Class Classification

7.3 Multi-Class Classification and the Perceptron

7.4 Which Approach Produces the Best Results?

7.5 The Categorical Cross Entropy Cost Function

7.6 Classification Quality Metrics

7.7 Weighted Multi-Class Classification

7.8 Stochastic and Mini-Batch Learning

7.9 Conclusion

7.10 Exercises

8.1 Introduction

8.2 Fixed Spanning Sets, Orthonormality, and Projections

8.3 The Linear Autoencoder and Principal Component Analysis

8.4 Recommender Systems

8.5 K-Means Clustering

8.6 General Matrix Factorization Techniques

8.7 Conclusion

8.8 Exercises

8.9 Endnotes

9.1 Introduction

9.2 Histogram Features

9.3 Feature Scaling via Standard Normalization

9.4 Imputing Missing Values in a Dataset

9.5 Feature Scaling via PCA-Sphering

9.6 Feature Selection via Boosting

9.7 Feature Selection via Regularization

9.8 Conclusion

9.9 Exercises

10.1 Introduction

10.2 Nonlinear Regression

10.3 Nonlinear Multi-Output Regression

10.4 Nonlinear Two-Class Classification

10.5 Nonlinear Multi-Class Classification

10.6 Nonlinear Unsupervised Learning

10.7 Conclusion

10.8 Exercises

11.1 Introduction

11.2 Universal Approximators

11.3 Universal Approximation of Real Data

11.4 Naive Cross-Validation

11.5 Efficient Cross-Validation via Boosting

11.6 Efficient Cross-Validation via Regularization

11.7 Testing Data

11.8 Which Universal Approximator Works Best in Practice?

11.9 Bagging Cross-Validated Models

11.10 K-Fold Cross-Validation

11.11 When Feature Learning Fails

11.12 Conclusion

11.13 Exercises

12.1 Introduction

12.2 Fixed-Shape Universal Approximators

12.3 The Kernel Trick

12.4 Kernels as Measures of Similarity

12.5 Optimization of Kernelized Models

12.6 Cross-Validating Kernelized Learners

12.7 Conclusion

12.8 Exercises

13.1 Introduction

13.2 Fully Connected Neural Networks

13.3 Activation Functions

13.4 The Backpropagation Algorithm

13.5 Optimization of Neural Network Models

13.6 Batch Normalization

13.7 Cross-Validation via Early Stopping

13.8 Conclusion

13.9 Exercises

14.1 Introduction

14.2 From Stumps to Deep Trees

14.3 Regression Trees

14.4 Classification Trees

14.5 Gradient Boosting

14.6 Random Forests

14.7 Cross-Validation Techniques for Recursively Defined Trees

14.8 Conclusion

14.9 Exercises

A.1 Introduction

A.2 Momentum-Accelerated Gradient Descent

A.3 Normalized Gradient Descent

A.4 Advanced Gradient-Based Methods

A.5 Mini-Batch Optimization

A.6 Conservative Steplength Rules

A.7 Newton’s Method, Regularization, and Nonconvex Functions

A.8 Hessian-Free Methods

B.1 Introduction

B.2 The Derivative

B.3 Derivative Rules for Elementary Functions and Operations

B.4 The Gradient

B.5 The Computation Graph

B.6 The Forward Mode of Automatic Differentiation

B.7 The Reverse Mode of Automatic Differentiation

B.8 Higher-Order Derivatives

B.9 Taylor Series

B.10 Using the autograd Library

C.1 Introduction

C.2 Vectors and Vector Operations

C.3 Matrices and Matrix Operations

C.4 Eigenvalues and Eigenvectors

C.5 Vector and Matrix Norms

The second edition of this text is a complete revision of our first endeavor, with virtually every chapter of the original rewritten from the ground up and eight new chapters of material added, doubling the size of the first edition. Topics from the first edition, from expositions on gradient descent to those on One-versus-All classification and Principal Component Analysis, have been reworked and polished. A swath of new topics has been added throughout the text, from derivative-free optimization to weighted supervised learning, feature selection, nonlinear feature engineering, boosting-based cross-validation, and more. While heftier in size, the text keeps the intent of our original attempt unchanged: to explain machine learning, from first principles to practical implementation, in the simplest possible terms.

Example "roadmaps" shown below provide suggested paths for navigating the text based on a variety of learning outcomes and university courses taught using the present book.

To make full use of the text, one needs only a basic understanding of vector algebra (mathematical functions, vector arithmetic, etc.) and computer programming (for example, basic proficiency with a dynamically typed language like Python). We provide complete introductory treatments of other prerequisite topics, including linear algebra, vector calculus, and automatic differentiation, in the appendices of the text.

In the `mlrefined_exercises` directory you can find starter wrappers for coding exercises from the first and second editions of the text.

Slides for the 2nd edition of the text are available in pptx, Jupyter, and reveal.js formats. Slides for the 1st edition of the text are also available.

Instructors may request a copy of this text for examination from the publisher's website. Cambridge University Press can also provide you with the **solution manual** for both editions of the text.

Here you can find a regularly updated errata sheet for the second edition of the text. Please report any typos, bugs, broken links, etc., in the **Issues Section** of this repository or by contacting us directly via email (see contact section for more info).

An excellent book that treats the fundamentals of machine learning from basic principles to practical implementation. The book is suitable as a text for senior-level and first-year graduate courses in engineering and computer science. It is well organized and covers basic concepts and algorithms in mathematical optimization methods, linear learning, and nonlinear learning techniques. The book is nicely illustrated in multiple colors and contains numerous examples and coding exercises using Python.

**John G. Proakis**, University of California, San Diego

Some machine learning books cover only programming aspects, often relying on outdated software tools; some focus exclusively on neural networks; others, solely on theoretical foundations; and yet more books detail advanced topics for the specialist. This fully revised and expanded text provides a broad and accessible introduction to machine learning for engineering and computer science students. The presentation builds on first principles and geometric intuition, while offering real-world examples, commented implementations in Python, and computational exercises. I expect this book to become a key resource for students and researchers.

**Osvaldo Simeone**, King's College, London

This book is great for getting started in machine learning. It builds up the tools of the trade from first principles, provides lots of examples, and explains one thing at a time at a steady pace. The level of detail and runnable code show what's really going on when we run a learning algorithm.

**David Duvenaud**, University of Toronto

This book covers various essential machine learning methods (e.g., regression, classification, clustering, dimensionality reduction, and deep learning) from a unified mathematical perspective of seeking the optimal model parameters that minimize a cost function. Every method is explained in a comprehensive, intuitive way, and mathematical understanding is aided and enhanced with many geometric illustrations and elegant Python implementations.

**Kimiaki Shirahama**, Kindai University, Japan

Books featuring machine learning are many, but those which are simple, intuitive, and yet theoretical are extraordinary 'outliers'. This book is a fantastic and easy way to launch yourself into the exciting world of machine learning, grasp its core concepts, and code them up in Python or Matlab. It was my inspiring guide in preparing my 'Machine Learning Blinks' on my BASIRA YouTube channel for both undergraduate and graduate levels.

**Islem Rekik**, Director of the Brain And SIgnal Research and Analysis (BASIRA) Laboratory

After cloning this repository and entering the directory, we recommend one of three methods for successfully running the Jupyter notebooks contained therein.

After installing Docker and docker-compose on your machine, navigate to this repo in your terminal and type

`docker-compose up -d`

When this command is run for the first time, an associated Docker image is pulled from Docker Hub.

Then in any web browser go to

`localhost:8888`

to view the repository contents, including the Jupyter notebooks.

After installing the Anaconda Python 3 distribution on your machine, cd into this repo's directory and follow these steps to create a conda virtual environment for viewing its contents and notebooks.

First, create the environment

`conda create python=3.6 --name mlr2 --file requirements.txt`

Then activate it

`conda activate mlr2`

Run Jupyter via the command below

`jupyter notebook --port=8888 --ip=0.0.0.0 --allow-root --NotebookApp.token=''`

And finally, open any web browser and navigate to

`localhost:8888`

to view the repository contents, including the Jupyter notebooks.

Using Python 3 and pip3 on your machine, cd into this repo's directory and follow these steps to install the required packages.

First, install the Python requirements

`pip3 install -r requirements.txt`

Run Jupyter via the command below

`jupyter notebook --port=8888 --ip=0.0.0.0 --allow-root --NotebookApp.token=''`

And finally, open any web browser and navigate to

`localhost:8888`

to view the repository contents, including the Jupyter notebooks.
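Whichever method you use, a quick sanity check like the sketch below may help confirm that the environment can see the libraries recommended earlier (NumPy, autograd, and matplotlib). The helper name `missing_packages` and the package list are assumptions for illustration; the authoritative list of dependencies is whatever `requirements.txt` contains.

```python
import importlib.util

# Libraries the text recommends; adjust if your requirements.txt differs.
RECOMMENDED = ["numpy", "autograd", "matplotlib"]


def missing_packages(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(RECOMMENDED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All recommended packages were found.")
```

Run it with the same interpreter that launches Jupyter (for example, inside the activated `mlr2` environment), since a different interpreter may see a different set of installed packages.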

This repository is in active development by Jeremy Watt and Reza Borhani. Please do not hesitate to reach out with comments, questions, typos, etc.