Training: Classical case
In this section, we present a complete use case of lambeq's training module, implementing a classical pipeline on the meaning classification dataset introduced in [Lea2021]. The goal is to classify simple sentences (such as “skillful programmer creates software” and “chef prepares delicious meal”) into two categories, food or IT. The dataset consists of 130 sentences created using a simple context-free grammar.

We will use a SpiderAnsatz to split large tensors into chains of smaller ones. The pipeline uses PyTorch as a backend.
Preparation
We start by importing PyTorch and specifying some training hyperparameters.
[1]:
import torch
BATCH_SIZE = 30
EPOCHS = 30
LEARNING_RATE = 3e-2
SEED = 0
Input data
Let’s read the data and print some example sentences.
[2]:
def read_data(filename):
    labels, sentences = [], []
    with open(filename) as f:
        for line in f:
            t = float(line[0])              # first character is the class label
            labels.append([t, 1 - t])
            sentences.append(line[1:].strip())
    return labels, sentences
train_labels, train_data = read_data('../examples/datasets/mc_train_data.txt')
val_labels, val_data = read_data('../examples/datasets/mc_dev_data.txt')
test_labels, test_data = read_data('../examples/datasets/mc_test_data.txt')
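For reference, each line in these files starts with the class label (1 for food, 0 for IT) followed by the sentence itself, so the raw data looks roughly like this (illustrative layout; any separator whitespace is removed by the strip() call above):

1 skillful man prepares sauce .
0 skillful woman debugs program .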
[3]:
train_data[:5]
[3]:
['skillful man prepares sauce .',
'skillful man bakes dinner .',
'woman cooks tasty meal .',
'man prepares meal .',
'skillful woman debugs program .']
Targets are represented as 2-dimensional one-hot arrays:
[4]:
train_labels[:5]
[4]:
[[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
Creating and parameterising diagrams
The first step is to convert sentences into string diagrams.
[5]:
from lambeq import BobcatParser
parser = BobcatParser(verbose='text')
train_diagrams = parser.sentences2diagrams(train_data)
val_diagrams = parser.sentences2diagrams(val_data)
test_diagrams = parser.sentences2diagrams(test_data)
Tagging sentences.
Parsing tagged sentences.
Turning parse trees to diagrams.
Tagging sentences.
Parsing tagged sentences.
Turning parse trees to diagrams.
Tagging sentences.
Parsing tagged sentences.
Turning parse trees to diagrams.
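To verify the parser's output, we can draw one of the resulting diagrams (a minimal sketch; the figsize and fontsize keyword arguments are optional cosmetic tweaks):

train_diagrams[0].draw(figsize=(9, 5), fontsize=12)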
The produced diagrams need to be parameterised by a specific ansatz. For this experiment we will use a SpiderAnsatz.
[6]:
from discopy import Dim
from lambeq import AtomicType, SpiderAnsatz
ansatz = SpiderAnsatz({AtomicType.NOUN: Dim(2),
                       AtomicType.SENTENCE: Dim(2)})
train_circuits = [ansatz(diagram) for diagram in train_diagrams]
val_circuits = [ansatz(diagram) for diagram in val_diagrams]
test_circuits = [ansatz(diagram) for diagram in test_diagrams]
train_circuits[0].draw()
(Output: string diagram for the first training circuit.)
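The circuits are tensor diagrams whose boxes hold the trainable parameters. As a quick sanity check (a sketch, assuming the free_symbols property exposed by the underlying DisCoPy diagrams), we can list the symbols of the first circuit:

print(train_circuits[0].free_symbols)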
Training
Instantiate model
We can now initialise the model by importing the PytorchModel class and passing all diagrams to the class method PytorchModel.from_diagrams().
[7]:
from lambeq import PytorchModel
all_circuits = train_circuits + val_circuits + test_circuits
model = PytorchModel.from_diagrams(all_circuits)
Note
The model can also be instantiated by using the PytorchModel.from_checkpoint() method, if an existing checkpoint is available.
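For example (a minimal sketch; the file name model.lt is arbitrary, and we assume the model's save() method for writing the checkpoint to disk):

model.save('model.lt')                            # write a checkpoint
model = PytorchModel.from_checkpoint('model.lt')  # restore the weights later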
Define evaluation metric
Optionally, we can provide a dictionary of callable evaluation metrics with the signature metric(y_hat, y).
[8]:
sig = torch.sigmoid

def accuracy(y_hat, y):
    return torch.sum(torch.eq(torch.round(sig(y_hat)), y)) / len(y) / 2  # half due to double-counting

eval_metrics = {"acc": accuracy}
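As a quick sanity check (illustrative values only), the metric returns 1 when the rounded sigmoid outputs match the one-hot targets exactly:

y_hat = torch.tensor([[2.0, -2.0], [-1.0, 1.0]])  # raw model outputs (logits)
y = torch.tensor([[1.0, 0.0], [0.0, 1.0]])        # one-hot targets
print(accuracy(y_hat, y))                         # tensor(1.)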
Initialise trainer
The next step is to initialise a PytorchTrainer object. Because this is a binary classification task, we will use binary cross-entropy as the loss. As an optimizer, we choose AdamW (Adam with weight decay).
[9]:
from lambeq import PytorchTrainer
trainer = PytorchTrainer(
    model=model,
    loss_function=torch.nn.BCEWithLogitsLoss(),
    optimizer=torch.optim.AdamW,
    learning_rate=LEARNING_RATE,
    epochs=EPOCHS,
    evaluate_functions=eval_metrics,
    evaluate_on_train=True,
    verbose='text',
    seed=SEED)
Create datasets
To facilitate batching and data shuffling, lambeq provides a Dataset interface. Shuffling is enabled by default, and if not specified, the batch size is set to the length of the dataset. In our example we will use the BATCH_SIZE we have set above.
[10]:
from lambeq import Dataset
train_dataset = Dataset(
    train_circuits,
    train_labels,
    batch_size=BATCH_SIZE)
val_dataset = Dataset(val_circuits, val_labels, shuffle=False)
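A Dataset can also be iterated over directly; each iteration yields one shuffled batch (a minimal sketch, assuming the iterator returns (circuits, labels) tuples):

for batch_circuits, batch_labels in train_dataset:
    print(len(batch_circuits))  # at most BATCH_SIZE items per batch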
Train
Now we can pass the datasets to the fit() method of the trainer to start the training.
[11]:
trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=5)
Epoch 1: train/loss: 0.7098 valid/loss: 0.6910 train/acc: 0.5000 valid/acc: 0.5000
Epoch 5: train/loss: 0.6114 valid/loss: 0.6300 train/acc: 0.6714 valid/acc: 0.6500
Epoch 10: train/loss: 0.4002 valid/loss: 0.5396 train/acc: 0.8286 valid/acc: 0.7833
Epoch 15: train/loss: 0.2053 valid/loss: 0.3814 train/acc: 0.8786 valid/acc: 0.7833
Epoch 20: train/loss: 0.1330 valid/loss: 0.3323 train/acc: 0.9071 valid/acc: 0.8167
Epoch 25: train/loss: 0.0926 valid/loss: 0.2886 train/acc: 0.9571 valid/acc: 0.8833
Epoch 30: train/loss: 0.0354 valid/loss: 0.0874 train/acc: 0.9929 valid/acc: 0.9500
Training completed!
Note
The evaluation_step argument controls the interval at which the model is evaluated on the validation dataset; the default is 1. If evaluation on the validation dataset is expensive, we recommend setting it to a higher value.
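For example, to evaluate on the validation set only every 5 epochs while keeping the same logging interval:

trainer.fit(train_dataset, val_dataset, evaluation_step=5, logging_step=5)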
Results
Finally, we visualise the results and evaluate the model on the test data.
[12]:
import matplotlib.pyplot as plt
fig1, ((ax_tl, ax_tr), (ax_bl, ax_br)) = plt.subplots(2, 2, sharey='row', figsize=(10, 6))
ax_tl.set_title('Training set')
ax_tr.set_title('Development set')
ax_bl.set_xlabel('Epochs')
ax_br.set_xlabel('Epochs')
ax_bl.set_ylabel('Accuracy')
ax_tl.set_ylabel('Loss')
colours = iter(plt.rcParams['axes.prop_cycle'].by_key()['color'])
ax_tl.plot(trainer.train_epoch_costs, color=next(colours))
ax_bl.plot(trainer.train_results['acc'], color=next(colours))
ax_tr.plot(trainer.val_costs, color=next(colours))
ax_br.plot(trainer.val_results['acc'], color=next(colours))
# print test accuracy
test_acc = accuracy(model(test_circuits), torch.tensor(test_labels))
print('Test accuracy:', test_acc.item())
Test accuracy: 0.9833333492279053
(Figure: loss and accuracy curves for the training and development sets.)
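We can also inspect individual test predictions; rounding the sigmoid of the model outputs yields one-hot predictions (a minimal sketch):

with torch.no_grad():                      # no gradients needed at test time
    logits = model(test_circuits[:4])
print(torch.round(torch.sigmoid(logits)))  # one-hot class predictions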
Adding custom layers to the model
In the default setting, the forward pass of a PytorchModel performs a simple tensor contraction of the tensorised diagrams. However, to add custom layers, one can create a custom model that inherits from PytorchModel and overrides the PytorchModel.forward() method.
[13]:
class MyCustomModel(PytorchModel):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(2, 2)

    def forward(self, input):
        """Define a custom forward pass here."""
        preds = self.get_diagram_output(input)  # tensor contraction of the diagrams
        preds = self.net(preds)                 # extra trainable linear layer
        return preds
The rest follows the same procedure as explained above: we initialise a trainer, fit the model, and visualise the results.
[14]:
custom_model = MyCustomModel.from_diagrams(all_circuits)
custom_model_trainer = PytorchTrainer(
    model=custom_model,
    loss_function=torch.nn.BCEWithLogitsLoss(),
    optimizer=torch.optim.AdamW,
    learning_rate=LEARNING_RATE,
    epochs=EPOCHS,
    evaluate_functions=eval_metrics,
    evaluate_on_train=True,
    verbose='text',
    seed=SEED)
custom_model_trainer.fit(train_dataset, val_dataset, logging_step=5)
Epoch 1: train/loss: 0.7148 valid/loss: 0.6871 train/acc: 0.4143 valid/acc: 0.5833
Epoch 5: train/loss: 0.4211 valid/loss: 0.5144 train/acc: 0.8714 valid/acc: 0.8167
Epoch 10: train/loss: 0.2201 valid/loss: 0.3550 train/acc: 0.9714 valid/acc: 0.9000
Epoch 15: train/loss: 0.1201 valid/loss: 0.3603 train/acc: 1.0000 valid/acc: 0.9667
Epoch 20: train/loss: 0.0426 valid/loss: 0.4961 train/acc: 1.0000 valid/acc: 0.9500
Epoch 25: train/loss: 0.0073 valid/loss: 0.6688 train/acc: 1.0000 valid/acc: 0.9000
Epoch 30: train/loss: 0.0018 valid/loss: 0.8141 train/acc: 1.0000 valid/acc: 0.9000
Training completed!
[15]:
import matplotlib.pyplot as plt
fig1, ((ax_tl, ax_tr), (ax_bl, ax_br)) = plt.subplots(2, 2, sharey='row', figsize=(10, 6))
ax_tl.set_title('Training set')
ax_tr.set_title('Development set')
ax_bl.set_xlabel('Epochs')
ax_br.set_xlabel('Epochs')
ax_bl.set_ylabel('Accuracy')
ax_tl.set_ylabel('Loss')
colours = iter(plt.rcParams['axes.prop_cycle'].by_key()['color'])
ax_tl.plot(custom_model_trainer.train_epoch_costs, color=next(colours))
ax_bl.plot(custom_model_trainer.train_results['acc'], color=next(colours))
ax_tr.plot(custom_model_trainer.val_costs, color=next(colours))
ax_br.plot(custom_model_trainer.val_results['acc'], color=next(colours))
# print test accuracy
test_acc = accuracy(custom_model(test_circuits), torch.tensor(test_labels))
print('Test accuracy:', test_acc.item())
Test accuracy: 0.9833333492279053
(Figure: loss and accuracy curves for the training and development sets of the custom model.)
See also: