Project 6, Part 2:

Saving, Loading, and Running Inference

How Neural Networks Become Real Systems


In Part 1 of Project 6, we built a fully modular neural network from scratch. 


We created the following components:

  • Layer class

  • SequentialModel class

  • activation functions

  • training loop


This gave us a complete learning system, but only inside a single Python session. Real machine learning systems must be able to save what they have learned and load it later without retraining.

Part 2 introduces model persistence. We add the ability to save the learned parameters of the network, load them into a new model instance, and run inference on new data. This is the final step in turning our hand‑built neural network into a usable system.

To verify that saving and loading work correctly, we use the XOR dataset. XOR is small, deterministic, and nonlinear, making it an ideal test case for a simple neural network.
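To see why XOR is a nonlinear problem, a quick brute-force check (an illustrative sketch, not one of the project files) confirms that no single linear threshold unit can reproduce the truth table — which is exactly why the network needs a hidden layer:

```python
import itertools

# XOR truth table: (x1, x2, target)
POINTS = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

def linearly_separable(grid):
    """Return True if some rule w1*x1 + w2*x2 + b > 0 matches XOR over the grid."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        if all((w1 * x1 + w2 * x2 + b > 0) == bool(t) for x1, x2, t in POINTS):
            return True
    return False

# Weights and bias from -3.0 to 3.0 in steps of 0.5
grid = [v / 2 for v in range(-6, 7)]
print(linearly_separable(grid))  # False: XOR needs at least one hidden layer
```

The coarse grid is only for illustration; XOR is provably not linearly separable for any real-valued weights.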

This part introduces four files:

  1. saved_model.py – saving and loading parameters

  2. data.py – the XOR dataset

  3. test.py – training, evaluation, and saving

  4. inference.py – loading the saved model and making predictions

Below is the full code for each file.


saved_model.py

Saving and Loading Model Parameters

This file stores and restores the weights and biases of each layer. Only the parameters are saved, not the architecture. When loading, the architecture must match the saved model.

Code:


import pickle


def save_model(model, filepath='model.pkl'):
    """
    Saves only the parameters (weights and biases) of each layer.
    """
    state = []

    for layer in model.layers:
        layer_params = {}

        if hasattr(layer, "w"):
            layer_params["w"] = layer.w

        if hasattr(layer, "b"):
            layer_params["b"] = layer.b

        state.append(layer_params)

    with open(filepath, 'wb') as f:
        pickle.dump(state, f)

    print(f"Model saved to {filepath}")


def load_model(model, filepath='model.pkl'):
    """
    Loads parameters into an existing model instance.
    The architecture must match the saved model.
    """
    with open(filepath, 'rb') as f:
        state = pickle.load(f)

    for layer, layer_state in zip(model.layers, state):
        if hasattr(layer, "w") and "w" in layer_state:
            layer.w = layer_state["w"]

        if hasattr(layer, "b") and "b" in layer_state:
            layer.b = layer_state["b"]

    print(f"Model loaded from {filepath}")
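The save/load pair can be sanity-checked without the full network. The sketch below uses hypothetical `DummyLayer` and `DummyModel` stand-ins (not the project's real classes) together with condensed versions of the two functions above, and verifies that parameters survive a round trip through the pickle file:

```python
import os
import pickle
import tempfile

# Hypothetical stand-ins exposing the .layers / .w / .b attributes
# that save_model and load_model expect.
class DummyLayer:
    def __init__(self, w, b):
        self.w = w
        self.b = b

class DummyModel:
    def __init__(self, layers):
        self.layers = layers

# Condensed save/load (same logic as saved_model.py, prints omitted)
def save_model(model, filepath='model.pkl'):
    state = [{"w": l.w, "b": l.b} for l in model.layers]
    with open(filepath, 'wb') as f:
        pickle.dump(state, f)

def load_model(model, filepath='model.pkl'):
    with open(filepath, 'rb') as f:
        state = pickle.load(f)
    for layer, layer_state in zip(model.layers, state):
        layer.w = layer_state["w"]
        layer.b = layer_state["b"]

path = os.path.join(tempfile.mkdtemp(), "model.pkl")
trained = DummyModel([DummyLayer([1.5, -2.0], 0.3)])
fresh = DummyModel([DummyLayer([0.0, 0.0], 0.0)])

save_model(trained, path)
load_model(fresh, path)
print(fresh.layers[0].w, fresh.layers[0].b)  # [1.5, -2.0] 0.3
```

Note that `zip` silently stops at the shorter of the two sequences, which is why the loading code relies on the caller to rebuild the exact same architecture first.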



data.py

The XOR Dataset

This dataset contains the four points of the XOR truth table. Because the dataset is so small, we return the same data for both training and testing.

Code:

import numpy as np


def get_data():
    X = np.array([
        [0, 0],
        [0, 1],
        [1, 0],
        [1, 1]
    ], dtype=np.float32)

    y = np.array([
        [0],
        [1],
        [1],
        [0]
    ], dtype=np.float32)

    # The dataset has only four points, so the same data
    # serves as both the training set and the test set.
    return X, X, y, y
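A quick look at what `get_data` returns (the function is inlined here so the snippet is self-contained):

```python
import numpy as np

def get_data():
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=np.float32)
    y = np.array([[0], [1], [1], [0]], dtype=np.float32)
    return X, X, y, y  # same data for train and test

X_train, X_test, y_train, y_test = get_data()
print(X_train.shape, y_train.shape)  # (4, 2) (4, 1)
print(X_train is X_test)             # True: train and test are the same array
```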



test.py

Training, Evaluating, and Saving the Model

This script builds the model, loads the XOR data, trains the network, evaluates accuracy, and saves the learned parameters.

Code:

import numpy as np

from layers import DenseLayer
from activations import relu, relu_deriv, sigmoid, sigmoid_deriv
from model import SequentialModel
from trainer import Trainer
from losses import binary_cross_entropy, binary_cross_entropy_deriv
from saved_model import save_model
from data import get_data


# Build the network: 2 inputs -> 2 hidden ReLU units -> 1 sigmoid output
model = SequentialModel([
    DenseLayer(2, 2, relu, relu_deriv),
    DenseLayer(2, 1, sigmoid, sigmoid_deriv)
])

model.summary()

X_train, X_test, y_train, y_test = get_data()

X_train = np.array(X_train)
y_train = np.array(y_train)
X_test = np.array(X_test)
y_test = np.array(y_test)

trainer = Trainer(
    model=model,
    loss_fn=binary_cross_entropy,
    loss_deriv=binary_cross_entropy_deriv,
    lr=0.01
)

trainer.train(
    X_train,
    y_train,
    epochs=2000,
    batch_size=4,
    log_interval=200
)

loss, accuracy = trainer.evaluate(X_test, y_test, classification=True)
print(f"\nFinal Test Loss: {loss:.6f}")
print(f"Final Test Accuracy: {accuracy * 100:.2f}%")

save_model(model, filepath="model.pkl")
print("\nModel saved successfully.")



inference.py

Loading the Saved Model and Running Predictions

This script loads the saved parameters into a new model instance and performs inference on the XOR inputs.

Code:

import numpy as np

from layers import DenseLayer
from activations import relu, relu_deriv, sigmoid, sigmoid_deriv
from model import SequentialModel
from saved_model import load_model


# Define the SAME model architecture used during training
model = SequentialModel([
    DenseLayer(2, 2, relu, relu_deriv),
    DenseLayer(2, 1, sigmoid, sigmoid_deriv)
])

# Load saved parameters
load_model(model, "model.pkl")


# Inference function
def predict(input_vector):
    x = np.array(input_vector, dtype=np.float32)
    pred = model.predict([x])[0]  # model.predict returns an array of predictions
    pred = np.squeeze(pred)       # ensure scalar
    return float(pred)


# Example usage
if __name__ == "__main__":
    samples = [[0, 0], [0, 1], [1, 0], [1, 1]]
    for s in samples:
        prediction = predict(s)
        print(f"Input: {s} -> Prediction: {prediction:.4f}")




Summary

Part 2 completes the neural network built from scratch by adding model persistence. The network can now be trained once, saved, loaded later, and used for inference without retraining. The XOR dataset provides a clear demonstration that the saved model behaves identically to the trained model.

This completes Project 6. 

The next step is Project 7, where we introduce PyTorch. 

In PyTorch, all of the components we built manually are provided automatically: layers, activations, autograd, optimizers, and model saving. Because we have built these pieces ourselves, the transition into PyTorch will be straightforward and intuitive.

PyTorch gives you all of this for free.

Your Implementation       PyTorch Equivalent
-----------------------   ----------------------
DenseLayer                nn.Linear
SequentialModel           nn.Sequential
Manual gradients          Autograd
Manual update step        optimizer.step()
save_model()              torch.save(state_dict)
load_model()              load_state_dict()


In the next project, we will rebuild this same XOR network using PyTorch and compare the two implementations side by side.

