Neural networks are a powerful tool in machine learning that can be trained to perform a wide range of tasks, from image classification to natural language processing. In this blog post, we’ll explore how to teach a neural network to add two numbers together. You can also think of this article as a short introductory TensorFlow tutorial.
Let’s start by importing some dependencies:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import tensorflow as tf
Our neural network will be a small feed-forward model: a normalization layer, two hidden layers with ReLU activation functions, and an output layer with no activation function.
We’ll start by defining our input and output data. We’ll generate a dataset of 100,000 random pairs of numbers between -100 and 100. We’ll use these pairs as our input data and the sum of each pair as our output data.
r = np.random.rand(100000, 2) * 200 - 100  # 100,000 uniform pairs in [-100, 100)
df = pd.DataFrame({'x': r[:, 0], 'y': r[:, 1], 'res': np.sum(r, axis=1)})  # 'res' holds the target sum
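A quick look at the first few rows is a cheap sanity check that the frame has the shape we expect:

print(df.head())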
Our model has two hidden layers with 64 neurons each, both using ReLU activation functions. The input consists of two numbers, one for each element of the pair, which first pass through a normalization layer. The output layer has a single neuron, which will output the sum of the two input numbers.
normalizer = tf.keras.layers.Normalization(axis=-1)  # standardizes each feature to zero mean, unit variance
numeric_features = df[["x", "y"]]
normalizer.adapt(np.array(numeric_features))  # computes the mean and variance from the data
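If you’re curious what adapt() computed, recent TensorFlow versions expose the fitted statistics on the layer (attribute names may differ across versions):

print(normalizer.mean.numpy(), normalizer.variance.numpy())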
model = tf.keras.Sequential([
    normalizer,                                    # scale the raw inputs
    tf.keras.layers.Dense(64, activation='relu'),  # first hidden layer
    tf.keras.layers.Dense(64, activation='relu'),  # second hidden layer
    tf.keras.layers.Dense(1)                       # single output: the predicted sum
])
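Before training, calling the model on a sample batch is a quick way to confirm the output shape; the value itself is meaningless at this point because the weights are still random:

print(model(np.array([[1.0, 2.0]])))  # tensor of shape (1, 1) with an arbitrary value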
We’ll compile our model using the mean absolute error loss function and the Adam optimizer. Then we’re ready to train it on our dataset: we’ll use 80% of the data for training and 20% for validation, training for 20 epochs with a batch size of 32.
# Accuracy is a classification metric, so it is left out of this regression task.
model.compile(optimizer='adam', loss='mean_absolute_error')
history = model.fit(numeric_features, df[["res"]], epochs=20, verbose=1, batch_size=32, validation_split=0.2)
model.summary()
Let’s plot the model’s loss over the epochs of training.
plt.plot(history.history['loss'])
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.yscale('log')
plt.show()
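Since we held out 20% of the data for validation, it’s also worth overlaying the validation loss; if the two curves diverge, the model is overfitting:

plt.plot(history.history['loss'], label='training')
plt.plot(history.history['val_loss'], label='validation')
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.yscale('log')
plt.legend()
plt.show()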

Our trained model should be able to somewhat accurately predict the sum of two numbers it has never seen before. Let’s test it with a few input pairs, including one far outside the training range.
model.predict(np.array([[1, 1], [3.2, 1.1], [1024, 1024]]))  # pass a NumPy array so Keras doesn't mistake the nested list for multiple inputs
> 1/1 [==============================] - 0s 92ms/step
array([[   2.0295484],
       [   4.3189883],
       [2006.8055   ]], dtype=float32)
The first two predictions land close to the true sums of 2 and 4.3. The last one misses 2048 by a wide margin: 1024 lies far outside the [-100, 100] range the network was trained on, and neural networks generally extrapolate poorly beyond their training distribution.
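To go beyond spot checks, we can measure the mean absolute error on a fresh batch drawn from the same range as the training data (a quick sketch, not part of the original walkthrough):

test = np.random.rand(1000, 2) * 200 - 100  # fresh pairs in [-100, 100)
mae = model.evaluate(test, test.sum(axis=1), verbose=0)
print(f"MAE on in-range pairs: {mae:.4f}")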