{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# A complete use case" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "In this section we present a complete use case of manual training (without using the :py:mod:`~lambeq.training` package), based on the meaning classification dataset introduced in [Lea2021]_. The goal is to classify simple sentences (such as \"skillful programmer creates software\" and \"chef prepares delicious meal\") into two categories, food or IT. The dataset consists of 130 sentences created using a simple context-free grammar.\n", "\n", "We will use a :py:class:`.SpiderAnsatz` to split large tensors into chains of smaller ones. For differentiation we will use JAX, and we will apply simple gradient-descent optimisation to train the tensors.\n", "\n", ":download:`Download code <../_code/training-usecase.ipynb>`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Preparation\n", "\n", "We start with a few essential imports." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "warnings.filterwarnings('ignore') # Ignore warnings\n", "\n", "from jax import numpy as np\n", "import numpy\n", "\n", "from lambeq.backend.numerical_backend import set_backend\n", "set_backend('jax')\n", "\n", "numpy.random.seed(0) # Fix the seed\n", "np.random = numpy.random " ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ ".. note::\n", "\n", " Note the ``set_backend('jax')`` call in the above code. 
This is required to let :term:`lambeq` know that from now on we use JAX's version of ``numpy``.\n", "\n", "Let's read the datasets:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Input data" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Read data\n", "def read_data(fname):\n", "    with open(fname, 'r') as f:\n", "        lines = f.readlines()\n", "    data, targets = [], []\n", "    for ln in lines:\n", "        t = int(ln[0])\n", "        data.append(ln[1:].strip())\n", "        targets.append(np.array([t, not(t)], dtype=np.float32))\n", "    return data, np.array(targets)\n", "\n", "train_data, train_targets = read_data('../examples/datasets/mc_train_data.txt')\n", "test_data, test_targets = read_data('../examples/datasets/mc_test_data.txt')" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbsphinx": "hidden" }, "outputs": [], "source": [ "import os\n", "\n", "TESTING = int(os.environ.get('TEST_NOTEBOOKS', '0'))\n", "\n", "if TESTING:\n", "    train_targets, train_data = train_targets[:2], train_data[:2]\n", "    test_targets, test_data = test_targets[:2], test_data[:2]\n", "    EPOCHS = 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first few lines of the train dataset:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "['skillful man prepares sauce .',\n", " 'skillful man bakes dinner .',\n", " 'woman cooks tasty meal .',\n", " 'man prepares meal .',\n", " 'skillful woman debugs program .',\n", " 'woman prepares tasty meal .',\n", " 'person runs program .',\n", " 'person runs useful application .',\n", " 'woman prepares sauce .',\n", " 'woman prepares dinner .']" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train_data[:10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Targets are represented as 2-dimensional arrays:" ] }, { "cell_type": "code", "execution_count": 4, "metadata": 
{}, "outputs": [ { "data": { "text/plain": [ "Array([[1., 0.],\n", " [1., 0.],\n", " [1., 0.],\n", " [1., 0.],\n", " [0., 1.],\n", " [1., 0.],\n", " [0., 1.],\n", " [0., 1.],\n", " [1., 0.],\n", " [1., 0.]], dtype=float32)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "train_targets[:10]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Creating and parameterising diagrams" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "The first step is to convert sentences into :term:`string diagrams `:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Parse sentences to diagrams\n", "\n", "from lambeq import BobcatParser\n", "\n", "parser = BobcatParser(verbose='suppress')\n", "train_diagrams = parser.sentences2diagrams(train_data)\n", "test_diagrams = parser.sentences2diagrams(test_data)\n", "\n", "train_diagrams[0].draw(figsize=(8,4), fontsize=13)" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "The produced diagrams need to be parameterised by a specific :term:`ansatz `. For this experiment we will use a :py:class:`.SpiderAnsatz`." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "image/png": "", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Create ansatz and convert to tensor diagrams\n", "\n", "from lambeq import AtomicType, SpiderAnsatz\n", "from lambeq.backend.tensor import Dim\n", "\n", "N = AtomicType.NOUN\n", "S = AtomicType.SENTENCE\n", "\n", "# Create an ansatz by assigning 2 dimensions to both\n", "# noun and sentence spaces\n", "ansatz = SpiderAnsatz({N: Dim(2), S: Dim(2)})\n", "\n", "train_circuits = [ansatz(d) for d in train_diagrams]\n", "test_circuits = [ansatz(d) for d in test_diagrams]\n", "\n", "all_circuits = train_circuits + test_circuits\n", "\n", "all_circuits[0].draw(figsize=(8,4), fontsize=13)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Creating a vocabulary\n", "\n", "We are now ready to create a vocabulary." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0.5488135 , 0.71518937])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create vocabulary\n", "\n", "from sympy import default_sort_key\n", "\n", "vocab = sorted(\n", " {sym for circ in all_circuits for sym in circ.free_symbols},\n", " key=default_sort_key\n", ")\n", "tensors = [np.random.rand(w.size) for w in vocab]\n", "\n", "tensors[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Training\n", "\n", "### Define loss function\n", "\n", "This is a binary classification task, so we will use binary cross entropy as the loss." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [], "source": [ "def sigmoid(x):\n", "    return 1 / (1 + np.exp(-x))\n", "\n", "def loss(tensors):\n", "    # Lambdify\n", "    np_circuits = [c.lambdify(*vocab)(*tensors) for c in train_circuits]\n", "    # Compute predictions\n", "    predictions = sigmoid(np.array([c.eval(dtype=float) for c in np_circuits]))\n", "\n", "    # Binary cross-entropy loss (base-2 logarithm)\n", "    cost = -np.sum(train_targets * np.log2(predictions)) / len(train_targets)\n", "    return cost" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "The loss function follows the steps below:\n", "\n", "1. The :term:`symbols ` in the training diagrams are replaced with concrete ``numpy`` arrays.\n", "2. The resulting :term:`tensor networks ` are evaluated to produce predictions.\n", "3. Based on the predictions, the average loss over the training set is computed.\n", "\n", "We use JAX to obtain a gradient function for the loss, and \"just-in-time\" compile it to improve speed:" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "from jax import jit, grad\n", "\n", "training_loss = jit(loss)\n", "gradient = jit(grad(loss))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Train" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "We are now ready to start training. The following loop computes gradients and uses them to update the tensors associated with the :term:`symbols `." 
] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Epoch 10 - loss 0.1838509440422058\n", "Epoch 20 - loss 0.029141228646039963\n", "Epoch 30 - loss 0.014427061192691326\n", "Epoch 40 - loss 0.009020495228469372\n", "Epoch 50 - loss 0.006290055345743895\n", "Epoch 60 - loss 0.004701168276369572\n", "Epoch 70 - loss 0.0036874753423035145\n", "Epoch 80 - loss 0.0029964144341647625\n", "Epoch 90 - loss 0.0025011023972183466\n" ] } ], "source": [ "training_losses = []\n", "\n", "epochs = 90\n", "\n", "for i in range(epochs):\n", "\n", "    gr = gradient(tensors)\n", "    # Gradient-descent update with a learning rate of 1.0\n", "    for k in range(len(tensors)):\n", "        tensors[k] = tensors[k] - gr[k] * 1.0\n", "\n", "    training_losses.append(float(training_loss(tensors)))\n", "\n", "    if (i + 1) % 10 == 0:\n", "        print(f\"Epoch {i + 1} - loss {training_losses[-1]}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Evaluate\n", "\n", "Finally, we use the trained model on the test dataset:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Accuracy on test set: 0.8666666666666667\n" ] } ], "source": [ "# Testing\n", "\n", "np_test_circuits = [c.lambdify(*vocab)(*tensors) for c in test_circuits]\n", "test_predictions = sigmoid(np.array([c.eval(dtype=float) for c in np_test_circuits]))\n", "\n", "hits = 0\n", "for i in range(len(np_test_circuits)):\n", "    target = test_targets[i]\n", "    pred = test_predictions[i]\n", "    if np.argmax(target) == np.argmax(pred):\n", "        hits += 1\n", "\n", "print(\"Accuracy on test set:\", hits / len(np_test_circuits))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Working with quantum circuits" ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "The process when working with :term:`quantum circuits ` is very similar, with two important differences:" ] }, { "cell_type": 
"markdown", "metadata": {}, "source": [ "1. The parameterisable part of the circuit is an array of parameters, as described in Section [Circuit Symbols](training-symbols.ipynb#Circuit-symbols), instead of tensors associated to words.\n", "2. If optimisation takes place on quantum hardware, standard automatic differentiation cannot be used. An alternative is to use a gradient-approximation technique, such as [Simultaneous Perturbation Stochastic Approximation](https://en.wikipedia.org/wiki/Simultaneous_perturbation_stochastic_approximation) (SPSA)." ] }, { "cell_type": "raw", "metadata": { "raw_mimetype": "text/restructuredtext" }, "source": [ "More information can be also found in [Mea2020]_ and [Lea2021]_, the papers that describe the first NLP experiments on quantum hardware.\n", "\n", ".. rubric:: See also:\n", "\n", "- `Classical pipeline with Pytorch <../examples/classical-pipeline.ipynb>`_\n", "- `Quantum pipeline with tket <../examples/quantum-pipeline.ipynb>`_\n", "- `Quantum pipeline with JAX <../examples/quantum-pipeline-jax.ipynb>`_" ] } ], "metadata": { "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 4 }