Advanced: Extending lambeq

In this tutorial you will find examples of how to extend lambeq to add more readers, rewrite rules and ansätze, so you can start making your own contributions to the toolkit.

Creating readers

The Reader class is an abstract base class for converting sentences to diagrams. Each reader can be seen as a different compositional model, and lambeq can accommodate any compositional model that represents sentences in a string diagram/tensor network form.

A concrete subclass of Reader should implement the Reader.sentence2diagram() method, which converts a single sentence into a rigid diagram.

Reader example: “Comb” reader

In this example we will create a reader that, given a sentence, generates the following tensor network:

[Figure: the “comb” tensor network, in which all word states feed into a single layer box]

Note that this particular compositional model is not appropriate for classical experiments, since the tensor that implements the layer has one index per word and can therefore become very large for long sentences. However, the model can be implemented without problems on a quantum computer.

[1]:
import warnings
warnings.filterwarnings('ignore')

from lambeq import AtomicType, Reader
from lambeq.backend.grammar import Box, Id, Word

N = AtomicType.NOUN

class CombReader(Reader):
    def sentence2diagram(self, sentence):
        words = Id().tensor(*[Word(w, N) for w in sentence.split()])
        layer = Box('LAYER', words.cod, N)
        return words >> layer

diagram = CombReader().sentence2diagram('John gave Mary a flower')
diagram.draw()

Note that, in the above code, the method tensor() refers to the monoidal product and not to a physical tensor object. Starting from the monoidal identity Id(), the line tensors the word boxes together one by one, from left to right, accumulating them into a single diagram, as in a standard fold operation.

[2]:
Id().tensor(*[Word(w, N) for w in ['John', 'gave', 'Mary', 'a', 'flower']]).draw(figsize=(5,1))

This diagram is then combined with the layer box to create the final output of the reader.
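
The fold described above can be pictured without lambeq at all. The following is a minimal sketch in plain Python, where `toy_tensor`, `toy_identity` and `word_boxes` are hypothetical stand-ins (a "diagram" is just a tuple of word names) and are not part of the lambeq API:

```python
from functools import reduce
from itertools import accumulate

# Toy stand-ins (not lambeq objects): a "diagram" is a tuple of word names,
# and the monoidal product simply places two diagrams side by side.
def toy_tensor(left, right):
    return left + right

toy_identity = ()  # plays the role of Id(), the empty diagram
word_boxes = [(w,) for w in 'John gave Mary a flower'.split()]

# Left fold: ((((Id @ John) @ gave) @ Mary) @ a) @ flower
folded = reduce(toy_tensor, word_boxes, toy_identity)

# The intermediate diagrams built along the way, starting from the identity
steps = list(accumulate(word_boxes, toy_tensor, initial=toy_identity))

print(folded)    # ('John', 'gave', 'Mary', 'a', 'flower')
print(steps[2])  # ('John', 'gave')
```

Because the identity is the unit of the product, starting the fold from it leaves the result unchanged and also makes the empty sentence well defined.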

Note

In an actual implementation, the layer box should be shared among all sentences so it can be trained properly.

Creating rewrite rules

[3]:
from lambeq import BobcatParser

parser = BobcatParser(verbose='text')
d = parser.sentence2diagram('The food is fresh')

SimpleRewriteRule example: Negation functor

The SimpleRewriteRule class contains functionality that facilitates the creation of simple rewrite rules, without the need to define a new RewriteRule class from scratch. A SimpleRewriteRule finds words with codomain cod and name in list words, then replaces their boxes with the diagram in template.

Here is an example of a negation functor using SimpleRewriteRule. The functor adds a “NOT” box to the wire of certain auxiliary verbs:

[4]:
from lambeq import AtomicType, SimpleRewriteRule

N = AtomicType.NOUN
S = AtomicType.SENTENCE
adj = N @ N.l

NOT = Box('NOT', S, S)

negation_rewrite = SimpleRewriteRule(
    cod=N.r @ S @ S.l @ N,
    template=SimpleRewriteRule.placeholder(N.r @ S @ S.l @ N) >> Id(N.r) @ NOT @ Id(S.l @ N),
    words=['is', 'was', 'has', 'have'])

Note

The placeholder SimpleRewriteRule.placeholder(t) in the template above will be replaced by a box with the same name as the original box and type t.

A list of RewriteRules can be passed to Rewriter to create a rewriting functor. If no list is provided, then the default rewriting rules are used (see Diagram Rewriting).

[5]:
from lambeq import Rewriter
from lambeq.backend import draw_equation

not_d = Rewriter([negation_rewrite])(d)
draw_equation(d, not_d, symbol='->', figsize=(14, 4))

RewriteRule example: “Past” functor

Sometimes a rewrite rule is too complicated to be implemented using the SimpleRewriteRule class, so the more general RewriteRule class should be used instead. A concrete subclass of RewriteRule should implement the methods matches() and rewrite().

A rewriter uses the matches() methods of its RewriteRules to detect if a rule can be applied. If there is a match, then the matching box is replaced with the result of rewrite(box).
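
As a rough illustration of this contract, here is a plain-Python sketch that does not use lambeq. `ToyBox`, `ToySpellingRule` and `toy_rewriter` are hypothetical names, and the policy of applying only the first matching rule to each box is an assumption made for the sketch, not a statement about how Rewriter is implemented:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToyBox:
    """Hypothetical stand-in for a box: a name plus a codomain label."""
    name: str
    cod: str

class ToySpellingRule:
    """Hypothetical rule that follows the matches()/rewrite() contract."""
    mapping = {'colour': 'color', 'flavour': 'flavor'}

    def matches(self, box):
        return box.name in self.mapping

    def rewrite(self, box):
        # Keep the codomain, change only the name
        return ToyBox(self.mapping[box.name], box.cod)

def toy_rewriter(boxes, rules):
    """Replace each box using the first rule that matches it (assumed policy)."""
    result = []
    for box in boxes:
        for rule in rules:
            if rule.matches(box):
                box = rule.rewrite(box)
                break
        result.append(box)
    return result

sentence = [ToyBox('the', 'n @ n.l'), ToyBox('colour', 'n')]
rewritten = toy_rewriter(sentence, [ToySpellingRule()])
print([box.name for box in rewritten])  # ['the', 'color']
```

Boxes that no rule matches pass through unchanged, which is why a rule only needs to describe the cases it cares about.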

In the following example, a functor that changes the tense of certain auxiliary verbs is implemented by directly subclassing RewriteRule:

[6]:
from lambeq import RewriteRule

class PastRewriteRule(RewriteRule):
    mapping = {
        'is': 'was',
        'are': 'were',
        'has': 'had'
    }
    def matches(self, box):
        return box.name in self.mapping

    def rewrite(self, box):
        new_name = self.mapping[box.name]
        return type(box)(name=new_name, cod=box.cod)
[7]:
past_d = Rewriter([PastRewriteRule()])(d)
draw_equation(d, past_d, symbol='->', figsize=(9, 2))

Creating ansätze

[8]:
d = parser.sentence2diagram('We will go')

Ansätze for the quantum pipeline are implemented by extending the CircuitAnsatz class, while ansätze for the classical pipeline need to extend the TensorAnsatz class. Both classes extend BaseAnsatz, sharing a common interface. Once an ansatz is instantiated, it can be used as a functor to convert diagrams to either a circuit or a tensor diagram.

An ansatz should be initialised with an ob_map argument, a dictionary which maps a rigid type to the number of qubits in the quantum case, or to a dimension size (e.g. Dim(2, 2)) for the classical case. Some ansätze may require additional arguments (see the API documentation for more details).

In lambeq, a functor is defined by specifying the mappings for objects ob and arrows ar. The CircuitAnsatz and TensorAnsatz classes already implement methods which extend ob_map to map not just base (atomic) types, but also compound types, into qubits and dimensions respectively. Therefore, to complete a new ansatz class, you only need to provide the mapping from rigid boxes to diagrams. This typically involves the following steps:

  1. Obtain the label of the box using the _summarise_box method. This provides a unique token which can be used to parameterise the box.

  2. Apply the functor to the domain and the codomain of the box.

  3. Construct and return an ansatz with the new domain and codomain (see the lambeq documentation on constructing diagrams with the low-level backend).
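
To make the extension of ob_map to compound types more concrete, the sketch below mimics the idea in plain Python: qubit counts add up across the atomic types of a compound type, while dimensions are placed side by side. The dictionaries and helper names are hypothetical, and adjoints are ignored on the assumption that an adjoint type occupies the same space as its base type:

```python
import math

# Hypothetical ob_map values for the two atomic types
qubit_map = {'n': 1, 's': 1}  # quantum case: qubits per atomic type
dim_map = {'n': 2, 's': 2}    # classical case: dimension per atomic type

# Toy encoding of the compound type N.r @ S @ S.l @ N (adjoints dropped)
aux_verb = ['n', 's', 's', 'n']

def toy_qubits(compound):
    """Total number of qubits assigned to a compound type."""
    return sum(qubit_map[atom] for atom in compound)

def toy_dims(compound):
    """Dimensions of the tensor legs assigned to a compound type."""
    return tuple(dim_map[atom] for atom in compound)

print(toy_qubits(aux_verb))           # 4
print(toy_dims(aux_verb))             # (2, 2, 2, 2)
print(math.prod(toy_dims(aux_verb)))  # 16 tensor entries
```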

CircuitAnsatz example: “Real-valued” ansatz

This ansatz always returns a tensor with real-valued entries, since it is constructed using only CNOT and Y rotation gates, both of which implement real-valued unitaries. The CircuitAnsatz class provides functionality to add postselections or discards to ensure that the domains and codomains of the boxes match. Extending CircuitAnsatz to create a new ansatz thus only involves providing a function that generates the circuit within a box.
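
The claim about real-valued unitaries can be checked numerically. The sketch below uses the textbook matrix for a Y rotation by angle θ (lambeq's own parameter convention may scale the angle differently, which does not affect the argument) and verifies that a layer of Y rotations followed by a CNOT is a real orthogonal matrix, that is, a unitary with real entries. All helper names are hypothetical:

```python
import math

def ry(theta):
    """Textbook Y-rotation matrix; every entry is real."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c, -s], [s, c]]

CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

def kron(a, b):
    """Kronecker product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

# One layer on two qubits: Y rotations first, then a CNOT
layer = matmul(CNOT, kron(ry(0.3), ry(1.1)))

# A real matrix is unitary exactly when U @ U^T is the identity
gram = matmul(layer, transpose(layer))
is_orthogonal = all(
    math.isclose(gram[i][j], float(i == j), abs_tol=1e-12)
    for i in range(4) for j in range(4)
)
print(is_orthogonal)  # True
```

Since products of real orthogonal matrices are again real orthogonal, stacking more such layers keeps every entry of the resulting unitary real.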

[9]:
from lambeq.backend.quantum import CX, Id, Ry

from lambeq import CircuitAnsatz

def real_ansatz_circuit(n_qubits, params):

    circuit = Id(n_qubits)
    n_layers = params.shape[0] - 1

    for i in range(n_layers):
        syms = params[i]

        # adds a layer of Y rotations
        circuit >>= Id().tensor(*[Ry(sym) for sym in syms])

        # adds a ladder of CNOTs
        for j in range(n_qubits - 1):
            circuit >>= Id(j) @ CX @ Id(n_qubits - j - 2)

    # adds a final layer of Y rotations
    circuit >>= Id().tensor(*[Ry(sym) for sym in params[-1]])

    return circuit


class RealAnsatz(CircuitAnsatz):
    def __init__(self, ob_map, n_layers, n_single_qubit_params = 1, discard = False):

        super().__init__(ob_map,
                         n_layers,
                         n_single_qubit_params,
                         real_ansatz_circuit,
                         discard,
                         [Ry, ])

    def params_shape(self, n_qubits):
        return (self.n_layers + 1, n_qubits)
[10]:
real_d = RealAnsatz({N: 1, S: 1}, n_layers=2)(d)
real_d.draw(figsize=(12, 10))
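
To see how params_shape relates to the circuit, the following plain-Python sketch lists the gates that the loop in real_ansatz_circuit lays down for a box spanning several qubits. The function `real_ansatz_layout` is a hypothetical helper written for this illustration and only mirrors the loop structure shown above:

```python
def real_ansatz_layout(n_qubits, n_layers):
    """List the gates in the order the loop above would add them."""
    gates = []
    for layer in range(n_layers):
        # a layer of Y rotations, one parameter per qubit
        gates += [('Ry', q, (layer, q)) for q in range(n_qubits)]
        # a ladder of CNOTs between neighbouring qubits
        gates += [('CX', q, q + 1) for q in range(n_qubits - 1)]
    # the final layer of Y rotations
    gates += [('Ry', q, (n_layers, q)) for q in range(n_qubits)]
    return gates

layout = real_ansatz_layout(n_qubits=3, n_layers=2)
n_rotations = sum(1 for gate in layout if gate[0] == 'Ry')
n_cnots = sum(1 for gate in layout if gate[0] == 'CX')

print(n_rotations)  # 9, i.e. (n_layers + 1) * n_qubits
print(n_cnots)      # 4, i.e. n_layers * (n_qubits - 1)
```

The number of rotations matches the product of the two entries returned by params_shape, (n_layers + 1, n_qubits), so each rotation receives exactly one parameter.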

TensorAnsatz example: “Positive” ansatz

This ansatz returns a positive tensor, since the individual tensors are squared element-wise before being contracted.

[11]:
from lambeq import TensorAnsatz, Symbol
from lambeq.backend import tensor
import math

class PositiveAnsatz(TensorAnsatz):

    def _ar(self, functor, box):
        # step 1: obtain label
        name = self._summarise_box(box)

        # step 2: map domain and codomain
        dom, cod = functor(box.dom), functor(box.cod)

        # step 3: construct and return ansatz
        syms = Symbol(name, math.prod(dom.dim), math.prod(cod.dim))

        return tensor.Box(box.name, dom, cod, syms ** 2)
[12]:
from lambeq.backend.tensor import Dim

ansatz = PositiveAnsatz({N: Dim(2), S: Dim(2)})
positive_d = ansatz(d)
positive_d.draw()
[13]:
import numpy as np
from sympy import default_sort_key


syms = sorted(positive_d.free_symbols, key=default_sort_key)
sym_dict = {k: -np.ones(k.size) for k in syms}
subbed_diagram = positive_d.lambdify(*syms)(*sym_dict.values())

subbed_diagram.eval()
[13]:
array([8., 8.])
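
The value 8 can be reproduced by hand. The sketch below is a plain-Python contraction that assumes the parser assigned the types N, N.r @ S @ S.l @ N and N.r @ S to “We”, “will” and “go” respectively (an assumption, since the parse is not printed above), with every wire of dimension 2 and every symbol set to -1 before squaring:

```python
from itertools import product

DIM = 2
IDX = range(DIM)

def squared(value):
    """Element-wise squaring, as done by the positive ansatz."""
    return value ** 2

# Every tensor entry is (-1) ** 2 == 1 after squaring
we = {idx: squared(-1.0) for idx in product(IDX, repeat=1)}    # N
will = {idx: squared(-1.0) for idx in product(IDX, repeat=4)}  # N.r, S, S.l, N
go = {idx: squared(-1.0) for idx in product(IDX, repeat=2)}    # N.r, S

# Cups contract: We.N with will.N.r, will.N with go.N.r, will.S.l with go.S.
# The only open wire is the S output of "will".
result = [
    sum(we[(n,)] * will[(n, s, b, c)] * go[(c, b)]
        for n, b, c in product(IDX, repeat=3))
    for s in IDX
]
print(result)  # [8.0, 8.0]
```

Three wires of dimension 2 are summed over, giving 2 ** 3 = 8 terms that each equal 1, and the negative inputs leave no trace in the result because of the squaring.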

Contributions

We encourage you to implement your own readers, rewrite rules and ansätze and contribute to lambeq – detailed guidelines are available here. Below you can find some sources of inspiration:

  • rewrites for relative pronouns: [SCC2014a] [SCC2014b]

  • rewrites to deal with coordination: [Kar2016]

  • rewrites to reduce the dimension size of verbs: [Kea2014]

  • rewrites to language circuits (DisCoCirc): [CW2021]

  • ansätze benchmarked by their expressibility: [SJA2019]

  • high-level examples of ansätze: [link]

See also: