💻 AI predictions in an online store - 01
This post focuses on building and training a machine learning model that predicts whether a given instance is likely to accept third-party advertising. The dataset is loaded from a CSV file, and several preprocessing steps are applied before the model is created, trained, and evaluated.
Import Required Libraries
The necessary libraries are imported, including TensorFlow for machine learning tasks and Pandas for data manipulation. A TensorBoard callback is also configured so the training progress can be visualized.
# Suppress TensorFlow optimization warnings
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'  # '3' hides INFO, WARNING, and ERROR; use '2' to keep ERROR messages
import tensorflow as tf
# Configure TensorBoard logging for model.fit (the log directory name here is an assumption)
import datetime
log_dir = 'logs/fit/' + datetime.datetime.now().strftime('%Y%m%d-%H%M%S')
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
Load Dataset
The dataset is loaded from a CSV file named ‘dataset.csv’ using Pandas. The dataset is then copied to ensure the original data remains intact. For demonstration purposes, the dataset is truncated to the first one million rows.
import pandas as pd
raw_dataset = pd.read_csv('dataset.csv')
raw_dataset_copy = raw_dataset.copy()
# Truncate the dataset to the first 1,000,000 rows for demonstration
raw_dataset_copy = raw_dataset_copy[:1000000]
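Before splitting, a quick inspection of the loaded frame (a sanity check, not part of the original listing) confirms the row count, column names, and dtypes:
# Quick look at what was loaded
print(raw_dataset_copy.shape)
print(raw_dataset_copy.dtypes)
print(raw_dataset_copy.head())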
Split Dataset
The dataset is split into training, validation, and test subsets using the train_test_split function from scikit-learn. The target variable ‘accept_third_party_advertising’ is first separated from the input features; 20% of the rows are then held out, and that holdout is split evenly into validation and test sets, giving roughly an 80/10/10 split.
from sklearn.model_selection import train_test_split
import numpy as np
x_raw = raw_dataset_copy
# pop removes the target column from the features in place and returns it
y_raw = x_raw.pop('accept_third_party_advertising')
# 80% train; the remaining 20% is split evenly into validation and test
x_train, x_temp, y_train, y_temp = train_test_split(x_raw, y_raw, test_size=0.2, random_state=24)
x_val, x_test, y_val, y_test = train_test_split(x_temp, y_temp, test_size=0.5, random_state=24)
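A small check of the subset sizes (again, a sketch rather than part of the original code) confirms the roughly 80/10/10 proportions:
# Confirm the 80/10/10 split proportions
for name, subset in (('train', x_train), ('val', x_val), ('test', x_test)):
    print(f'{name}: {len(subset)} rows ({len(subset) / len(x_raw):.0%})')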
Normalize Data
Data normalization is performed using TensorFlow’s Normalization layer. The layer is adapted to the training set and then applied to the training, validation, and test sets. Because the same layer is also included as the first layer of the model below, the raw features can be passed directly to fit later; the explicitly normalized tensors are kept here for inspection.
# Normalize the data using TensorFlow's Normalization layer
normalize = tf.keras.layers.Normalization()
normalize.adapt(x_train)
# Apply normalization to the training set
x_train_normalized = normalize(x_train)
# Apply normalization to the validation set
x_val_normalized = normalize(x_val)
# Apply normalization to the test set
x_test_normalized = normalize(x_test)
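As a sanity check (not in the original post), the adapted layer should map the training features to roughly zero mean and unit variance:
# The normalized training features should have ~0 mean and ~1 standard deviation
print(tf.reduce_mean(x_train_normalized, axis=0))
print(tf.math.reduce_std(x_train_normalized, axis=0))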
Create Model
A simple neural network model is created using TensorFlow’s Sequential API. It consists of a normalization layer, a dense layer with 16 units and ReLU activation, and a final dense layer with a single unit and sigmoid activation, suitable for binary classification.
The model is compiled with the Adam optimizer, binary crossentropy loss function, and accuracy and precision metrics.
model = tf.keras.Sequential([
    normalize,  # reuse the adapted normalization layer as the input stage
    tf.keras.layers.Dense(16, input_shape=(9,), activation='relu'),  # assumes 9 input features
    tf.keras.layers.Dense(1, activation='sigmoid')  # outputs a probability in [0, 1]
])
optimizer = tf.optimizers.Adam(learning_rate=0.001)
loss = tf.losses.BinaryCrossentropy()
model.compile(
    optimizer=optimizer,
    loss=loss,
    metrics=['accuracy', tf.keras.metrics.Precision()])
model.summary()
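Assuming nine input features, as input_shape=(9,) suggests, the summary should report 160 trainable parameters for the hidden layer (9 × 16 weights plus 16 biases) and 17 for the output layer (16 weights plus 1 bias), along with the non-trainable mean and variance statistics stored by the normalization layer.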
Train Model
The model is trained on the training set for ten epochs. Progress is logged to TensorBoard via the callback configured earlier, and the recorded accuracy and loss are visualized with matplotlib.
import matplotlib.pyplot as plt
history = model.fit(
    x=x_train,
    y=y_train,
    batch_size=None,  # None falls back to the default batch size of 32
    epochs=10,
    callbacks=[tensorboard_callback],
    validation_data=(x_val, y_val))
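The curves mentioned above can be drawn from the returned History object; a minimal sketch of the accuracy and loss plots:
# Plot training vs. validation accuracy and loss per epoch
for metric in ('accuracy', 'loss'):
    plt.figure()
    plt.plot(history.history[metric], label=f'train {metric}')
    plt.plot(history.history[f'val_{metric}'], label=f'val {metric}')
    plt.xlabel('epoch')
    plt.ylabel(metric)
    plt.legend()
plt.show()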
Evaluate Model
The model is evaluated on the test set, and the loss, accuracy, and precision are printed. Additionally, a confusion matrix is generated and displayed using scikit-learn’s confusion_matrix and ConfusionMatrixDisplay.
# Evaluate loss, accuracy, and precision on the held-out test set
test_loss, test_accuracy, test_precision = model.evaluate(x_test, y_test)
print(f'loss: {test_loss:.4f}, accuracy: {test_accuracy:.4f}, precision: {test_precision:.4f}')
predicted = model.predict(x_test)
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
squeezed_pred = tf.squeeze(predicted)
# Threshold the sigmoid outputs at 0.5 to obtain hard 0/1 labels
filtered_pred = np.array([1 if x >= 0.5 else 0 for x in squeezed_pred])
actual = np.array(y_test)
conf_mat = confusion_matrix(actual, filtered_pred)
displ = ConfusionMatrixDisplay(confusion_matrix=conf_mat)
displ.plot()
plt.show()
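As a cross-check (not in the original listing), precision can be recomputed directly from the confusion matrix and should match the Keras Precision metric up to rounding:
# precision = TP / (TP + FP); scikit-learn ravels a binary matrix as (tn, fp, fn, tp)
tn, fp, fn, tp = conf_mat.ravel()
print(f'precision from confusion matrix: {tp / (tp + fp):.4f}')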
In summary, this post demonstrates the end-to-end process of loading a dataset, preprocessing the data, creating a neural network model, training it, and evaluating its performance with accuracy and precision metrics and a confusion matrix. The visualizations help show how well the model performs on both the training and test data.