Posts

Neural Network Framework

Project 6
From First Principles to Your First Real Neural Network Framework

In Section 1, you built every piece of a neural network by hand: dot products, activations, gradients, multi-feature regression, logistic regression, and even multi-layer backprop for XOR. You've seen how neural networks stack layers and how gradients flow backward through the entire system. Now we're ready for the next step. In the real world, nobody trains neural networks by manually writing out gradients or wiring layers together. Instead, we use frameworks like PyTorch and TensorFlow, tools that package all the math you've learned into clean, modular components. Before we jump into those frameworks, we're going to build our own. Why? Because once you understand how a framework works on the inside, learning PyTorch becomes effortless. You'll recognize every concept (layers, modules, activations, loss functions, training loops) because you built them yourself. Project 6 is where every...
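To make "clean, modular components" concrete, here is a minimal sketch of what one such building block might look like. The class name, methods, and initialization below are illustrative assumptions, not the actual Project 6 code.

import numpy as np

class Dense:
    # Illustrative fully connected layer: forward computes X @ W + b.
    def __init__(self, in_features, out_features):
        # small random weights and zero biases (assumed initialization)
        self.W = np.random.randn(in_features, out_features) * 0.01
        self.b = np.zeros(out_features)

    def forward(self, X):
        # cache the input so backward() can compute gradients
        self.X = X
        return X @ self.W + self.b

    def backward(self, grad_out, lr=0.1):
        # gradients of the loss w.r.t. W, b, and the layer input
        grad_W = self.X.T @ grad_out
        grad_b = grad_out.sum(axis=0)
        grad_X = grad_out @ self.W.T
        # simple in-place gradient descent update
        self.W -= lr * grad_W
        self.b -= lr * grad_b
        return grad_X

A framework is essentially a collection of such objects (layers, activations, losses) that all expose the same forward/backward interface, so they can be stacked and trained by one generic loop.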

Neural Network Activation Functions

A Practical Guide to Neural Network Activation Functions (with Code + Intuition)

Activation functions are the non-linear heart of neural networks. Without them, a network would collapse into a simple linear transformation, no matter how many layers you stack. In this post, we'll walk through a complete set of activation functions implemented in NumPy, explain what each one does, and discuss when and why you'd choose it. All examples below come from this file:

import numpy as np

def _clip(z, min_val=-500, max_val=500):
    return np.clip(z, min_val, max_val)

The _clip helper prevents numerical overflow in functions like sigmoid and softplus.

1. Linear (Identity)

def linear(z):
    return z

def linear_deriv(z):
    return np.ones_like(z)

What it does
Linear activation returns the input unchanged.

When to use it
Output layer of regression models (predicting continuous values like price, temperature, etc.)
Hidden layers almost never use it; it adds no nonlinearity. ...
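The preview cuts off before the later functions, but as an illustration of why _clip matters, here is a plausible sketch of sigmoid and its derivative written in the same style. The exact implementations in the file are not shown in this excerpt, so treat the bodies below as assumptions.

def sigmoid(z):
    # clip first so np.exp never overflows for very negative z
    z = _clip(z)
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_deriv(z):
    # derivative expressed through the function's own output: s * (1 - s)
    s = sigmoid(z)
    return s * (1.0 - s)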

Core Concepts of Machine Learning

Core Concepts of Machine Learning: A First-Principles Glossary

Machine learning can feel like a maze of formulas and jargon, but underneath it all, the field is built from a small set of core ideas. These ideas repeat across linear regression, logistic regression, neural networks, and even modern deep learning. This glossary collects the essential concepts from the first four projects of the curriculum. Each entry focuses on intuition first, with just enough math to make the idea clear. Think of this as the conceptual map that ties everything together.

The Design Matrix (X)

The design matrix is the standard way machine learning represents data.
Each row is one example
Each column is one feature
Shape: number of samples by number of features

Why it matters:
Turns many dot products into one matrix multiplication
Enables vectorized gradient descent
Makes batching easy (just slice rows)

In neural networks, the basic operation is: X times W plus b. The design matrix is the bridge bet...
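To ground the "X times W plus b" operation, here is a small illustrative NumPy example; the values and shapes below are made up for demonstration only.

import numpy as np

X = np.array([[1.0, 2.0],      # 3 samples (rows), 2 features (columns)
              [3.0, 4.0],
              [5.0, 6.0]])
W = np.array([[0.5],           # 2 features in, 1 output unit
              [-1.0]])
b = np.array([0.1])

out = X @ W + b                # shape (3, 1): one output per sample
print(out)                     # [[-1.4], [-2.4], [-3.4]]

Because X holds every example as a row, one matrix multiplication replaces a loop of per-example dot products, which is exactly what makes vectorized gradient descent and batching cheap.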

Project 5: The Design Matrix

Project 5: The Design Matrix
The Structure of Neural Networks
Kaggle Notebook | GitHub Repo

Why This Project Exists

Projects 1–4 taught you the operations of machine learning: dot products, gradients, MSE, cross-entropy, logistic regression, hidden layers. But they didn't yet reveal the structural object that unifies all of them: the design matrix X.

This project shows:
why ML always uses a matrix
how neural networks generalize it
how this simplifies the system

What Is the Design Matrix?

A design matrix is how machine learning represents data.
Each row = one example; n = number of examples (rows of X)
Each column = one feature; d = number of features (columns of X)

When we say X ∈ R^(n×d), we mean:
X has n rows (one per data point)
X has d columns (one per feature)

EX.

import numpy as np

# 3 samples, 2 features
X = np.array([
    [1.0, 2.0],
    ...
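The example above is cut off in this preview. As a self-contained stand-in (with made-up values, not necessarily those from the original notebook), constructing a small design matrix and reading n and d off its shape might look like this:

import numpy as np

# 3 samples (n = 3), 2 features (d = 2); values are illustrative
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

n, d = X.shape
print(n, d)        # 3 2
print(X[0])        # first example (row):      [1. 2.]
print(X[:, 1])     # second feature (column):  [2. 4. 6.]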

Project 3: Binary Classification
Kaggle Notebook | GitHub repo

Logistic Regression, Sigmoid, Cross Entropy, and the Geometry of the Decision Boundary

This project continues the progression from Project 1 and Project 2. Project 1 introduced the learning loop using a single feature. Project 2 expanded the model to multiple features and introduced the dot product and matrix view. Project 3 now introduces classification. The structure of the model stays almost the same. The only new ingredients are the sigmoid function and a new loss function called binary cross entropy. The goal of this project is to show that logistic regression is simply linear regression passed through a nonlinear squashing function. The learning loop is the same. The gradients simplify beautifully. The geometry becomes a separating line or plane. The entire model can be written cleanly in matrix form.

1. The Model

In regression, the model was: y_hat = w dot x + b
For classification, the model becomes: z = w dot ...
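The preview stops mid-formula, but based on the pieces it names (sigmoid, binary cross entropy, the matrix form), a minimal NumPy sketch of the forward pass and loss might look like this; the variable names and toy data are illustrative assumptions, not the project's actual code.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# toy data: 4 samples, 2 features, binary labels (illustrative)
X = np.array([[0.5, 1.2],
              [1.0, 0.3],
              [2.0, 1.5],
              [0.2, 0.1]])
y = np.array([0, 0, 1, 0])

w = np.zeros(2)
b = 0.0

# forward pass: linear score, then squash to a probability
z = X @ w + b                # z = w dot x + b, vectorized over all rows
y_hat = sigmoid(z)           # predicted probability of class 1

# binary cross entropy loss, averaged over samples
eps = 1e-12                  # avoid log(0)
loss = -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))
print(loss)                  # ~0.693 with zero weights (log 2)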