Step 2. Diagram rewriting

Syntactic derivations in pregroup form can become extremely complicated, which may lead to excessive use of hardware resources and prohibitively long training times. The purpose of the rewrite module is to give the user a means of addressing some of these problems, via rewriters and rewriting rules that simplify the string diagram. lambeq provides two kinds of rewriters:

  • Box-level rewriters apply a sequence of functorial transformations to the diagram through rewriting rules. Each rewriting rule accesses one box at a time and sees only the information within that box, with no access to the broader diagram context.

  • Diagram-level rewriters work procedurally and can apply unrestricted transformations across the entire diagram.

lambeq’s rewriters are explained in detail in the following sections.
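
The difference between the two kinds of rewriters can be sketched with a toy model in plain Python. This is only an illustration under simplifying assumptions: a diagram is reduced to a list of (word, type) pairs, and every name below (`apply_box_rule`, `determiner_rule`, `number_boxes`) is hypothetical and not part of lambeq.

```python
from typing import Callable, List, Tuple

# Toy model: a diagram is a list of (word, type) boxes.
# All names here are hypothetical and are not part of lambeq.
Box = Tuple[str, str]


def apply_box_rule(diagram: List[Box],
                   rule: Callable[[Box], List[Box]]) -> List[Box]:
    """Box-level rewriting: the rule is called once per box, in isolation."""
    rewritten: List[Box] = []
    for box in diagram:
        rewritten.extend(rule(box))
    return rewritten


def determiner_rule(box: Box) -> List[Box]:
    """Replace a determiner box with a placeholder standing for a cap."""
    word, typ = box
    if word == 'the' and typ == 'n @ n.l':
        return [('CAP', typ)]
    return [box]


def number_boxes(diagram: List[Box]) -> List[Box]:
    """Diagram-level rewriting: uses each box's position in the diagram."""
    return [(f'{word}_{i}', typ) for i, (word, typ) in enumerate(diagram)]


sentence = [('John', 'n'), ('walks', 'n.r @ s @ n.l'),
            ('the', 'n @ n.l'), ('dog', 'n')]

box_level = apply_box_rule(sentence, determiner_rule)
diagram_level = number_boxes(sentence)
```

The box-level rule never learns where its box sits or what surrounds it, whereas `number_boxes` depends on the position of every box, which is information that only a procedure with access to the whole diagram can use.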


Box-level rewrite rules

We will demonstrate box-level rewriting rules using, once again, the sentence “John walks in the park”.

from lambeq import BobcatParser

# Parse the sentence
parser = BobcatParser(verbose='suppress')
diagram = parser.sentence2diagram("John walks in the park")

diagram.draw(figsize=(11,5), fontsize=13)

Note that the representation of the preposition is a tensor of order 5 in the “classical” case, or a state of 5 quantum systems in the quantum case. Applying the prepositional_phrase rewriting rule to the diagram takes advantage of the underlying compact-closed monoidal structure by using a “cap” to bridge the discontinuous subject noun wire within the preposition tensor. Furthermore, the determiner rewriting rule applies a cap on type \(n \cdot n^l\), completely eliminating the determiner “the”.

from lambeq import Rewriter

# Apply rewrite rules for prepositional phrases and determiners
rewriter = Rewriter(['prepositional_phrase', 'determiner'])
rewritten_diagram = rewriter(diagram)

rewritten_diagram.draw(figsize=(11,5), fontsize=13)

We will now ask lambeq to normalise the diagram, by “stretching” the wires and re-arranging the boxes if required:

normalised_diagram = rewritten_diagram.normal_form()
normalised_diagram.draw(figsize=(9,4), fontsize=13)

In the simplified diagram, the order of the preposition tensor is reduced by 2, which, at least for a classical experiment, is a substantial improvement. Note also that the determiner is now eliminated, equating the meaning of the noun phrase “the park” with that of the noun “park”.
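
To get a feel for what reducing the order from 5 to 3 means in the classical case, the following sketch counts tensor entries. It assumes, purely for illustration, that every wire carries the same dimension; in practice noun and sentence wires may be given different dimensions. The helper `tensor_entries` is a hypothetical name, not a lambeq function.

```python
def tensor_entries(order: int, dim: int) -> int:
    """Number of entries in a tensor with `order` indices, each of size `dim`."""
    return dim ** order


# Assumption: all wires share the same dimension `dim`.
for dim in (2, 4, 8):
    before = tensor_entries(5, dim)  # preposition before rewriting
    after = tensor_entries(3, dim)   # preposition after rewriting
    print(f'dim={dim}: {before} -> {after} entries ({before // after}x fewer)')
```

Under this assumption the saving is a factor of the wire dimension squared, so it grows quickly as the wires get larger.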

Another very useful rewrite rule is the CurryRewriteRule, which allows us to convert adjoint output wires into input wires using map-state duality. For example:

curry_functor = Rewriter(['curry'])
curried_diagram = curry_functor(normalised_diagram)
curried_diagram.draw(figsize=(9,4), fontsize=13)

After normalisation, the resulting diagram no longer contains any cups, which eliminates post-selection and allows for faster execution.

curried_diagram.normal_form().draw(figsize=(5,4), fontsize=13)
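
As a rough illustration of map-state duality in generic notation (a sketch, not lambeq's internal definition), consider a word state \(w\) of type \(n^r \cdot s\) connected by a cup to a noun state \(v\) of type \(n\). The symbols \(\epsilon_n\) (the cup on \(n \cdot n^r\)) and \(f_w\) are introduced here only for this example. Bending the \(n^r\) output wire of \(w\) into an input turns the state into a map \(f_w : n \to s\), and the cup disappears:

\[
(\epsilon_n \otimes 1_s) \circ (v \otimes w) \;=\; f_w \circ v,
\qquad \text{where } f_w := (\epsilon_n \otimes 1_s) \circ (1_n \otimes w).
\]

In components, this is the familiar statement that contracting a vector with a matrix is the same as applying the matrix as a linear map: \(\sum_i v_i \, w_{ij} = (f_w v)_j\).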

These examples clearly demonstrate the flexibility of string diagrams compared to simple tensor networks, which was one of the main reasons for choosing them as lambeq’s representation format. lambeq comes with a number of standard rewrite rules covering auxiliary verbs, connectors, coordinators, adverbs, determiners, relative pronouns, and prepositional phrases.

Rewrite rule                               Description

auxiliary                                  Removes auxiliary verbs (such as “do”) by replacing them with caps.
connector                                  Removes sentence connectors (such as “that”) by replacing them with caps.
coordination                               Simplifies “and” by replacing it with a layer of interleaving spiders.
curry                                      Uses map-state duality to reduce the number of cups in the diagram.
determiner                                 Removes determiners (such as “the”) by replacing them with caps.
object_rel_pronoun, subject_rel_pronoun    Simplifies relative pronouns (such as “that”) using cups, spiders and a loop.
postadverb, preadverb                      Simplifies adverbs by passing through the noun wire transparently using a cap.
prepositional_phrase                       Simplifies prepositions by passing through the noun wire using a cap.

Diagram-level rewriters

While box-level rewriters and rewrite rules access one box at a time, some more general transformations require knowledge of the broader context in the diagram. For example, consider the following derivation:

from lambeq.backend.grammar import Diagram, Word, Ty, Cup
from lambeq import AtomicType

n = AtomicType.NOUN
s = AtomicType.SENTENCE

words = [Word('do', n.r @ s @ n.l), Word('your', n @ n.l), 
         Word('homework', n), Word('now', s.r @ s)]
morphisms = [(Cup, 2, 3), (Cup, 4, 5), (Cup, 1, 6)]
diagram = Diagram.create_pregroup_diagram(words, morphisms)
diagram.draw(figsize=(5,2))

Note that this diagram has two free wires, coming from different boxes. In cases like these, if your loss function expects a single wire, you are going to get a dimension mismatch error. The problem can be addressed by merging the two free wires into one, so that you are able to train your model in a consistent way. In lambeq, this can be done with the UnifyCodomainRewriter, which acts at the diagram level. Specifically, the rewriter looks at the codomain of the diagram (in the above case \(n^r \cdot s\)), and if this consists of more than one wire, it merges these wires with an extra box.

from lambeq import UnifyCodomainRewriter

rewriter = UnifyCodomainRewriter(output_type=s)

rewriter(diagram).draw(figsize=(5,4))

This type of transformation would not be possible with box-level rewriters, since it requires knowledge that is spread across more than one box.
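
The codomain of the example can be checked by hand with some simple bookkeeping. The sketch below is a toy calculation on plain strings, not lambeq's implementation; the helper `free_wires` is a hypothetical name, and the cups are written as index pairs matching the morphisms used above.

```python
# Toy bookkeeping with plain strings; not lambeq's implementation.
words = [
    ('do', ['n.r', 's', 'n.l']),
    ('your', ['n', 'n.l']),
    ('homework', ['n']),
    ('now', ['s.r', 's']),
]
cups = [(2, 3), (4, 5), (1, 6)]  # wire indices joined by each cup


def free_wires(words, cups):
    """Return (word, type) for every wire that is not consumed by a cup."""
    wires = [(name, typ) for name, types in words for typ in types]
    used = {index for cup in cups for index in cup}
    return [wire for index, wire in enumerate(wires) if index not in used]


codomain = free_wires(words, cups)
print(codomain)  # [('do', 'n.r'), ('now', 's')]

# More than one free wire means the codomain needs to be merged.
needs_merge = len(codomain) > 1
```

The two remaining wires belong to “do” and “now”, so no rule that inspects a single box could detect that the codomain has more than one wire.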

Other diagram-level rewriters that are available in lambeq:

More diagram-level rewriters can be created by extending the class DiagramRewriter.

See also: