lambeq.text2diagram

exception lambeq.text2diagram.BobcatParseError(sentence: str)[source]

Bases: Exception

__init__(sentence: str) None[source]
args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class lambeq.text2diagram.BobcatParser(model_name_or_path: str = 'bert', root_cats: Optional[Iterable[str]] = None, device: int = -1, cache_dir: Optional[StrPathT] = None, force_download: bool = False, verbose: str = 'progress', **kwargs: Any)[source]

Bases: lambeq.text2diagram.ccg_parser.CCGParser

CCG parser using Bobcat as the backend.

__init__(model_name_or_path: str = 'bert', root_cats: Optional[Iterable[str]] = None, device: int = -1, cache_dir: Optional[StrPathT] = None, force_download: bool = False, verbose: str = 'progress', **kwargs: Any) None[source]

Instantiate a BobcatParser.

Parameters
model_name_or_pathstr, default: ‘bert’
Can be either:
  • The path to a directory containing a Bobcat model.

  • The name of a pre-trained model. By default, it uses the “bert” model. See also: BobcatParser.available_models()

root_catsiterable of str, optional

A list of the categories allowed at the root of the parse tree.

deviceint, default: -1

The GPU device ID on which to run the model, if positive. If negative (the default), run on the CPU.

cache_dirstr or os.PathLike, optional

The directory to which a downloaded pre-trained model should be cached instead of the standard cache ($XDG_CACHE_HOME or ~/.cache).

force_downloadbool, default: False

Force the model to be downloaded, even if it is already available locally.

verbosestr, default: ‘progress’

See VerbosityLevel for options.

**kwargsdict, optional

Additional keyword arguments to be passed to the underlying parsers (see Other Parameters). By default, they are set to the values in the pipeline_config.json file in the model directory.
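The cache_dir description above can be illustrated with a minimal sketch of how the standard cache location ($XDG_CACHE_HOME or ~/.cache) might be resolved. The function names and the env argument are hypothetical and exist only for this illustration; this is not the library's own code, and any model-specific subdirectory inside the cache is deliberately left out.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def standard_cache_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return $XDG_CACHE_HOME if set and non-empty, otherwise ~/.cache."""
    env = os.environ if env is None else env
    xdg = env.get('XDG_CACHE_HOME')
    return Path(xdg) if xdg else Path.home() / '.cache'


def effective_cache_dir(cache_dir: Optional[str] = None,
                        env: Optional[Mapping[str, str]] = None) -> Path:
    """An explicit cache_dir is used instead of the standard cache."""
    if cache_dir is not None:
        return Path(cache_dir)
    return standard_cache_root(env)
```

Passing the environment as an argument keeps the sketch easy to check without touching the real process environment.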

Other Parameters
Tagger parameters:
batch_sizeint, optional

The number of sentences per batch.

tag_top_kint, optional

The maximum number of tags to keep. If 0, keep all tags.

tag_prob_thresholdfloat, optional

The probability multiplier used for the threshold to keep tags.

tag_prob_threshold_strategy{‘relative’, ‘absolute’}

If “relative”, the probability threshold is relative to the highest scoring tag. Otherwise, the probability is an absolute threshold.

span_top_kint, optional

The maximum number of entries to keep per span. If 0, keep all entries.

span_prob_thresholdfloat, optional

The probability multiplier used for the threshold to keep entries for a span.

span_prob_threshold_strategy{‘relative’, ‘absolute’}

If “relative”, the probability threshold is relative to the highest scoring entry. Otherwise, the probability is an absolute threshold.

Chart parser parameters:
eisner_normal_formbool, default: True

Whether to use Eisner normal form.

max_parse_treesint, optional

A safety limit to the number of parse trees that can be generated per parse before automatically failing.

beam_sizeint, optional

The beam size to use in the chart cells.

input_tag_score_weightfloat, optional

A scaling multiplier to the log-probabilities of the input tags. This means that a weight of 0 causes all of the input tags to have the same score.

missing_cat_scorefloat, optional

The default score for a category that is generated but not part of the grammar.

missing_span_scorefloat, optional

The default score for a category that is part of the grammar but has no score, due to being below the threshold kept by the tagger.
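The difference between the ‘relative’ and ‘absolute’ threshold strategies, together with a top-k limit, can be sketched in plain Python. This is only an illustration of the idea described in the parameters above: prune_tags is a hypothetical helper, the probabilities are made up, and details such as using ">=" for the comparison are assumptions rather than the behaviour of the actual tagger.

```python
from typing import Dict


def prune_tags(probs: Dict[str, float], threshold: float,
               strategy: str = 'relative', top_k: int = 0) -> Dict[str, float]:
    """Keep the tags that pass the threshold, best first (illustrative only)."""
    if strategy not in ('relative', 'absolute'):
        raise ValueError(f'unknown strategy: {strategy!r}')
    if not probs:
        return {}
    best = max(probs.values())
    # 'relative': the threshold multiplies the best probability.
    # 'absolute': the threshold is used as the cutoff directly.
    cutoff = threshold * best if strategy == 'relative' else threshold
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = [(tag, p) for tag, p in ranked if p >= cutoff]
    if top_k > 0:  # top_k == 0 means "keep all"
        kept = kept[:top_k]
    return dict(kept)


probs = {'N': 0.6, 'NP': 0.3, 'S/NP': 0.1}        # made-up tag distribution
relative_kept = prune_tags(probs, 0.4, 'relative')  # cutoff is 0.4 * 0.6
absolute_kept = prune_tags(probs, 0.4, 'absolute')  # cutoff is 0.4
```

With the same multiplier, the relative strategy is more permissive here because the cutoff scales with the best tag's probability.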

static available_models() list[str][source]

List the available models.

sentence2diagram(sentence: Union[str, List[str]], tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False) Optional[discopy.rigid.Diagram]

Parse a sentence into a DisCoPy diagram.

Parameters
sentencestr or list of str

The sentence to be parsed.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
discopy.Diagram or None

The parsed diagram, or None on failure.

sentence2tree(sentence: Union[str, List[str]], tokenised: bool = False, suppress_exceptions: bool = False) Optional[lambeq.text2diagram.ccg_tree.CCGTree]

Parse a sentence into a CCGTree.

Parameters
sentencestr, list[str]

The sentence to be parsed, passed either as a string, or as a list of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
CCGTree or None

The parsed tree, or None on failure.

sentences2diagrams(sentences: SentenceBatchType, tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[Diagram]]

Parse multiple sentences into a list of DisCoPy diagrams.

Parameters
sentenceslist of str, or list of list of str

The sentences to be parsed.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if a sentence fails to parse, instead of raising an exception, its return entry is None.

tokenisedbool, default: False

Whether each sentence has been passed as a list of tokens.

verbosestr, optional

See VerbosityLevel for options. Not all parsers implement all three levels of progress reporting; see the respective documentation for each parser. If set, takes priority over the verbose attribute of the parser.

Returns
list of discopy.Diagram or None

The parsed diagrams. May contain None if exceptions are suppressed.
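Because the returned list may contain None when exceptions are suppressed, callers usually need to separate successes from failures. A minimal sketch, assuming only that the result list is aligned with the input list; split_results is a hypothetical helper, and the placeholder strings stand in for real diagrams.

```python
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar('T')


def split_results(sentences: Sequence[str],
                  results: Sequence[Optional[T]]
                  ) -> Tuple[List[Tuple[str, T]], List[str]]:
    """Pair each sentence with its result and collect the failed sentences."""
    if len(sentences) != len(results):
        raise ValueError('sentences and results must have the same length')
    parsed = [(s, r) for s, r in zip(sentences, results) if r is not None]
    failed = [s for s, r in zip(sentences, results) if r is None]
    return parsed, failed


sentences = ['Alice runs', '???', 'Bob sleeps']
# Placeholder output, shaped like a batch call with suppress_exceptions=True.
results = ['<diagram 0>', None, '<diagram 2>']
parsed, failed = split_results(sentences, results)
```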

sentences2trees(sentences: SentenceBatchType, tokenised: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[CCGTree]][source]

Parse multiple sentences into a list of CCGTree objects.

Parameters
sentenceslist of str, or list of list of str

The sentences to be parsed, passed either as strings or as lists of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if a sentence fails to parse, instead of raising an exception, its return entry is None.

tokenisedbool, default: False

Whether each sentence has been passed as a list of tokens.

verbosestr, optional

See VerbosityLevel for options. If set, takes priority over the verbose attribute of the parser.

Returns
list of CCGTree or None

The parsed trees. May contain None if exceptions are suppressed.

class lambeq.text2diagram.CCGAtomicType(value)

Bases: lambeq.text2diagram.ccg_types._CCGAtomicTypeMeta

Standard CCG atomic types mapping to their biclosed type.

CONJUNCTION = Ty('conj')
NOUN = Ty('n')
NOUN_PHRASE = Ty('n')
PREPOSITIONAL_PHRASE = Ty('p')
PUNCTUATION = Ty('punc')
SENTENCE = Ty('s')

exception lambeq.text2diagram.CCGBankParseError(sentence: str = '', message: str = '')[source]

Bases: Exception

Error raised if parsing fails in CCGBank.

__init__(sentence: str = '', message: str = '')[source]
args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class lambeq.text2diagram.CCGBankParser(root: Union[str, os.PathLike[str]], verbose: str = 'suppress')[source]

Bases: lambeq.text2diagram.ccg_parser.CCGParser

A parser for CCGBank trees.

__init__(root: Union[str, os.PathLike[str]], verbose: str = 'suppress')[source]

Initialise a CCGBank parser.

Parameters
rootstr or os.PathLike

Path to the root of the corpus. The sections must be located in <root>/data/AUTO.

verbosestr, default: ‘suppress’

See VerbosityLevel for options.

section2diagrams(section_id: int, planar: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) dict[str, Optional[Diagram]][source]

Parse a CCGBank section into diagrams.

Parameters
section_idint

The section to parse.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Stop exceptions from being raised, instead returning None for a diagram.

verbosestr, optional

See VerbosityLevel for options. If set, takes priority over the verbose attribute of the parser.

Returns
diagramsdict

A dictionary of diagrams labelled by their ID in CCGBank. If a diagram cannot be produced and exceptions are suppressed, that entry is None.

Raises
CCGBankParseError

If parsing fails and exceptions are not suppressed.

section2diagrams_gen(section_id: int, planar: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) Iterator[tuple[str, Optional[Diagram]]][source]

Parse a CCGBank section into diagrams, given as a generator.

The generator only reads data when it is accessed, providing the user with control over the reading process.

Parameters
section_idint

The section to parse.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Stop exceptions from being raised, instead returning None for a diagram.

verbosestr, optional

See VerbosityLevel for options. If set, takes priority over the verbose attribute of the parser.

Yields
ID, diagramtuple of str and Diagram

ID in CCGBank and the corresponding diagram. If a diagram cannot be produced and exceptions are suppressed, that entry is None.

Raises
CCGBankParseError

If parsing fails and exceptions are not suppressed.
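The practical benefit of the generator form is that entries are only read when they are requested. The sketch below uses a hypothetical stand-in generator (fake_section_gen) with made-up IDs and placeholder diagrams to show that consuming the first two items does not touch the rest of the section.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Optional, Tuple

reads: List[str] = []  # records which entries were actually read


def fake_section_gen(ids: Iterable[str]
                     ) -> Iterator[Tuple[str, Optional[str]]]:
    """Hypothetical stand-in yielding (ID, diagram) pairs lazily."""
    for entry_id in ids:
        reads.append(entry_id)  # work happens only when an item is requested
        yield entry_id, f'<diagram for {entry_id}>'


ids = ['wsj_0001.1', 'wsj_0001.2', 'wsj_0002.1', 'wsj_0002.2']  # made-up IDs
gen = fake_section_gen(ids)
first_two = dict(islice(gen, 2))  # pulls exactly two items from the generator
```

The dictionary-returning variant corresponds to exhausting the generator, so it is the better fit when the whole section is needed anyway.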

section2trees(section_id: int, suppress_exceptions: bool = False, verbose: Optional[str] = None) dict[str, Optional[CCGTree]][source]

Parse a CCGBank section into trees.

Parameters
section_idint

The section to parse.

suppress_exceptionsbool, default: False

Stop exceptions from being raised, instead returning None for a tree.

verbosestr, optional

See VerbosityLevel for options. If set, takes priority over the verbose attribute of the parser.

Returns
treesdict

A dictionary of trees labelled by their ID in CCGBank. If a tree fails to parse and exceptions are suppressed, that entry is None.

Raises
CCGBankParseError

If parsing fails and exceptions are not suppressed.

section2trees_gen(section_id: int, suppress_exceptions: bool = False, verbose: Optional[str] = None) Iterator[tuple[str, Optional[CCGTree]]][source]

Parse a CCGBank section into trees, given as a generator.

The generator only reads data when it is accessed, providing the user with control over the reading process.

Parameters
section_idint

The section to parse.

suppress_exceptionsbool, default: False

Stop exceptions from being raised, instead returning None for a tree.

verbosestr, optional

See VerbosityLevel for options. If set, takes priority over the verbose attribute of the parser.

Yields
ID, treetuple of str and CCGTree

ID in CCGBank and the corresponding tree. If a tree fails to parse and exceptions are suppressed, that entry is None.

Raises
CCGBankParseError

If parsing fails and exceptions are not suppressed.

sentence2diagram(sentence: Union[str, List[str]], tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False) Optional[discopy.rigid.Diagram]

Parse a sentence into a DisCoPy diagram.

Parameters
sentencestr or list of str

The sentence to be parsed.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
discopy.Diagram or None

The parsed diagram, or None on failure.

sentence2tree(sentence: Union[str, List[str]], tokenised: bool = False, suppress_exceptions: bool = False) Optional[lambeq.text2diagram.ccg_tree.CCGTree]

Parse a sentence into a CCGTree.

Parameters
sentencestr, list[str]

The sentence to be parsed, passed either as a string, or as a list of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
CCGTree or None

The parsed tree, or None on failure.

sentences2diagrams(sentences: SentenceBatchType, tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[Diagram]]

Parse multiple sentences into a list of DisCoPy diagrams.

Parameters
sentenceslist of str, or list of list of str

The sentences to be parsed.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if a sentence fails to parse, instead of raising an exception, its return entry is None.

tokenisedbool, default: False

Whether each sentence has been passed as a list of tokens.

verbosestr, optional

See VerbosityLevel for options. Not all parsers implement all three levels of progress reporting; see the respective documentation for each parser. If set, takes priority over the verbose attribute of the parser.

Returns
list of discopy.Diagram or None

The parsed diagrams. May contain None if exceptions are suppressed.

sentences2trees(sentences: SentenceBatchType, tokenised: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[CCGTree]][source]

Parse CCGBank sentence derivations into a list of CCGTree objects.

Each sentence must be in the format outlined in section D.2 of the CCGBank manual, not just a list of words.

Parameters
sentenceslist of str

List of sentences to parse.

suppress_exceptionsbool, default: False

Stop exceptions from being raised, instead returning None for a tree.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens. For CCGBankParser, it should be kept False.

verbosestr, optional

See VerbosityLevel for options. If set, takes priority over the verbose attribute of the parser.

Returns
treeslist of CCGTree

A list of trees. If a tree fails to parse and exceptions are suppressed, that entry is None.

Raises
CCGBankParseError

If parsing fails and exceptions are not suppressed.

ValueError

If tokenised flag is True (not valid for CCGBankParser).
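To make clear why the input is a derivation string rather than a list of words, here is a small sketch that extracts the words from a bracketed derivation. It assumes the commonly described AUTO layout, in which leaves look like (<L category modified-POS original-POS word predicate-argument-category>) and internal nodes like (<T category head children>). The example string is hand-written for illustration, not taken from the corpus, and leaf_words is a hypothetical helper, not part of the library.

```python
import re
from typing import List

# Five whitespace-separated fields inside a leaf node; the fourth is the word.
LEAF = re.compile(r'<L (\S+) (\S+) (\S+) (\S+) (\S+)>')


def leaf_words(derivation: str) -> List[str]:
    """Return the words of a bracketed derivation, left to right."""
    return [match.group(4) for match in LEAF.finditer(derivation)]


derivation = (
    r'(<T S[dcl] 0 2> (<L NP PRP PRP I NP>) '
    r'(<T S[dcl]\NP 0 2> '
    r'(<L (S[dcl]\NP)/NP VBP VBP love (S[dcl]\NP)/NP>) '
    r'(<L NP NNS NNS cats NP>)))'
)
words = leaf_words(derivation)
```

Since each input is a single string carrying its own tokens and categories, the tokenised flag has nothing to add, which matches the requirement to keep it False.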

class lambeq.text2diagram.CCGParser(root_cats: Optional[Iterable[str]] = None, verbose: str = 'suppress')[source]

Bases: lambeq.text2diagram.base.Reader

Base class for CCG parsers.

abstract __init__(root_cats: Optional[Iterable[str]] = None, verbose: str = 'suppress') None[source]

Initialise the CCG parser.

sentence2diagram(sentence: Union[str, List[str]], tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False) Optional[discopy.rigid.Diagram][source]

Parse a sentence into a DisCoPy diagram.

Parameters
sentencestr or list of str

The sentence to be parsed.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
discopy.Diagram or None

The parsed diagram, or None on failure.

sentence2tree(sentence: Union[str, List[str]], tokenised: bool = False, suppress_exceptions: bool = False) Optional[lambeq.text2diagram.ccg_tree.CCGTree][source]

Parse a sentence into a CCGTree.

Parameters
sentencestr, list[str]

The sentence to be parsed, passed either as a string, or as a list of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
CCGTree or None

The parsed tree, or None on failure.

sentences2diagrams(sentences: SentenceBatchType, tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[Diagram]][source]

Parse multiple sentences into a list of DisCoPy diagrams.

Parameters
sentenceslist of str, or list of list of str

The sentences to be parsed.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if a sentence fails to parse, instead of raising an exception, its return entry is None.

tokenisedbool, default: False

Whether each sentence has been passed as a list of tokens.

verbosestr, optional

See VerbosityLevel for options. Not all parsers implement all three levels of progress reporting; see the respective documentation for each parser. If set, takes priority over the verbose attribute of the parser.

Returns
list of discopy.Diagram or None

The parsed diagrams. May contain None if exceptions are suppressed.

abstract sentences2trees(sentences: SentenceBatchType, tokenised: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[CCGTree]][source]

Parse multiple sentences into a list of CCGTree objects.

Parameters
sentenceslist of str, or list of list of str

The sentences to be parsed, passed either as strings or as lists of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if a sentence fails to parse, instead of raising an exception, its return entry is None.

tokenisedbool, default: False

Whether each sentence has been passed as a list of tokens.

verbosestr, optional

See VerbosityLevel for options. Not all parsers implement all three levels of progress reporting; see the respective documentation for each parser. If set, takes priority over the verbose attribute of the parser.

Returns
list of CCGTree or None

The parsed trees. May contain None if exceptions are suppressed.
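Since sentences2trees is the only abstract parsing method, a subclass supplies the batch method and can inherit the single-sentence one. The sketch below shows that pattern with hypothetical classes (BatchParser, ToyParser) and placeholder string "trees". It is an assumption about the general design, not a copy of the base class.

```python
from abc import ABC, abstractmethod
from typing import List, Optional


class BatchParser(ABC):
    """Hypothetical base: subclasses implement only the batch method."""

    @abstractmethod
    def sentences2trees(self, sentences: List[str],
                        suppress_exceptions: bool = False
                        ) -> List[Optional[str]]:
        ...

    def sentence2tree(self, sentence: str,
                      suppress_exceptions: bool = False) -> Optional[str]:
        # The single-sentence call is a batch of one.
        return self.sentences2trees(
            [sentence], suppress_exceptions=suppress_exceptions)[0]


class ToyParser(BatchParser):
    def sentences2trees(self, sentences, suppress_exceptions=False):
        trees: List[Optional[str]] = []
        for sentence in sentences:
            if sentence.strip():
                trees.append(f'tree({sentence})')  # placeholder "tree"
            elif suppress_exceptions:
                trees.append(None)  # failure is reported as None
            else:
                raise ValueError('cannot parse an empty sentence')
        return trees
```

For example, ToyParser().sentence2tree('', suppress_exceptions=True) yields None instead of raising.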

class lambeq.text2diagram.CCGRule(value)[source]

Bases: str, enum.Enum

An enumeration of the available CCG rules.

BACKWARD_APPLICATION = 'BA'
BACKWARD_COMPOSITION = 'BC'
BACKWARD_CROSSED_COMPOSITION = 'BX'
BACKWARD_TYPE_RAISING = 'BTR'
CONJUNCTION = 'CONJ'
FORWARD_APPLICATION = 'FA'
FORWARD_COMPOSITION = 'FC'
FORWARD_CROSSED_COMPOSITION = 'FX'
FORWARD_TYPE_RAISING = 'FTR'
GENERALIZED_BACKWARD_COMPOSITION = 'GBC'
GENERALIZED_BACKWARD_CROSSED_COMPOSITION = 'GBX'
GENERALIZED_FORWARD_COMPOSITION = 'GFC'
GENERALIZED_FORWARD_CROSSED_COMPOSITION = 'GFX'
LEXICAL = 'L'
REMOVE_PUNCTUATION_LEFT = 'LP'
REMOVE_PUNCTUATION_RIGHT = 'RP'
UNARY = 'U'
UNKNOWN = 'UNK'

__call__(dom: discopy.biclosed.Ty, cod: discopy.biclosed.Ty) discopy.biclosed.Diagram[source]

Produce a DisCoPy diagram for this rule.

If it is not possible to produce a valid diagram with the given parameters, the domain may be rewritten.

Parameters
domdiscopy.biclosed.Ty

The expected domain of the diagram.

coddiscopy.biclosed.Ty

The expected codomain of the diagram.

Returns
discopy.biclosed.Diagram

The resulting diagram.

Raises
CCGRuleUseError

If a diagram cannot be produced.

check_match(left: discopy.biclosed.Ty, right: discopy.biclosed.Ty) None[source]

Raise an exception if left does not match right.

classmethod infer_rule(dom: discopy.biclosed.Ty, cod: discopy.biclosed.Ty) lambeq.text2diagram.ccg_rule.CCGRule[source]

Infer the CCG rule that admits the given domain and codomain.

Return CCGRule.UNKNOWN if no other rule matches.

Parameters
domdiscopy.biclosed.Ty

The domain of the rule.

coddiscopy.biclosed.Ty

The codomain of the rule.

Returns
CCGRule

A CCG rule that admits the required domain and codomain.

property symbol: str

The standard CCG symbol for the rule.
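Because the enumeration derives from both str and enum.Enum, its members can be looked up by their string code and compared directly with plain strings. The sketch below reproduces this with a hypothetical three-member MiniRule, using values copied from the list above. The parse_rule fallback to UNKNOWN is a convenience added for this example, not documented behaviour of CCGRule.

```python
from enum import Enum


class MiniRule(str, Enum):
    """Hypothetical cut-down copy of the enumeration, for illustration."""
    FORWARD_APPLICATION = 'FA'
    BACKWARD_APPLICATION = 'BA'
    UNKNOWN = 'UNK'


def parse_rule(code: str) -> MiniRule:
    """Look a rule up by its code, falling back to UNKNOWN (an assumption)."""
    try:
        return MiniRule(code)  # lookup by value
    except ValueError:
        return MiniRule.UNKNOWN


rule = parse_rule('FA')
same_as_string = MiniRule.BACKWARD_APPLICATION == 'BA'  # str mixin comparison
```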

exception lambeq.text2diagram.CCGRuleUseError(rule: lambeq.text2diagram.ccg_rule.CCGRule, message: str)[source]

Bases: Exception

Error raised when a CCGRule is applied incorrectly.

__init__(rule: lambeq.text2diagram.ccg_rule.CCGRule, message: str) None[source]
args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class lambeq.text2diagram.CCGTree(text: Optional[str] = None, *, rule: Union[str, CCGRule] = CCGRule.UNKNOWN, biclosed_type: Ty, children: Optional[Sequence[CCGTree]] = None)[source]

Bases: object

Derivation tree for a CCG.

This provides a standard derivation interface between the parser and the rest of the model.

__init__(text: Optional[str] = None, *, rule: Union[str, CCGRule] = CCGRule.UNKNOWN, biclosed_type: Ty, children: Optional[Sequence[CCGTree]] = None) None[source]

Initialise a CCG tree.

Parameters
textstr, optional

The word or phrase associated to the whole tree. If None, it is inferred from its children.

ruleCCGRule, default: CCGRule.UNKNOWN

The final CCGRule used in the derivation.

biclosed_typediscopy.biclosed.Ty

The type associated to the derived phrase.

childrenlist of CCGTree, optional

A list of subtrees. The types of these subtrees can be combined with the rule to produce the output type. A leaf node has an empty list of children.

property child: lambeq.text2diagram.ccg_tree.CCGTree

Get the child of a unary tree.

deriv(word_spacing: int = 2, use_slashes: bool = True, use_ascii: bool = False, vertical: bool = False) str[source]

Produce a string representation of the tree.

Parameters
word_spacingint, default: 2

The minimum number of spaces between the words of the diagram. Only used for horizontal diagrams.

use_slashes: bool, default: True

Whether to use slashes in the CCG types instead of arrows. Automatically set to True when use_ascii is True.

use_ascii: bool, default: False

Whether to draw using ASCII characters only.

vertical: bool, default: False

Whether to create a vertical tree representation, instead of the standard horizontal one.

Returns
str

A string that contains the graphical representation of the CCG tree.

classmethod from_json(data: None) None[source]
classmethod from_json(data: Union[str, Dict[str, Any]]) lambeq.text2diagram.ccg_tree.CCGTree

Create a CCGTree from a JSON representation.

A JSON representation of a derivation contains the following fields:

textstr or None

The word or phrase associated to the whole tree. If None, it is inferred from its children.

ruleCCGRule

The final CCGRule used in the derivation.

typediscopy.biclosed.Ty

The type associated to the derived phrase.

childrenlist or None

A list of JSON subtrees. The types of these subtrees can be combined with the rule to produce the output type. A leaf node has an empty list of children.
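The four fields above describe a recursive structure that can be given either as a dictionary or as a JSON string. The following sketch walks such a structure with the standard json module only. The type strings are placeholders (how types are actually serialised is not stated here), and leaf_texts is a hypothetical helper rather than a library function.

```python
import json
from typing import Any, Dict, List, Union


def leaf_texts(data: Union[str, Dict[str, Any]]) -> List[str]:
    """Collect the text of every leaf, left to right."""
    node = json.loads(data) if isinstance(data, str) else data
    children = node.get('children') or []  # a leaf has no children
    if not children:
        return [node['text']]
    return [word for child in children for word in leaf_texts(child)]


# Root text is None: it can be inferred from the children.
tree = {
    'text': None, 'rule': 'BA', 'type': '<type of sentence>',
    'children': [
        {'text': 'Alice', 'rule': 'L', 'type': '<type of noun>',
         'children': []},
        {'text': 'runs', 'rule': 'L', 'type': '<type of verb>',
         'children': []},
    ],
}
inferred_text = ' '.join(leaf_texts(tree))
```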

property left: lambeq.text2diagram.ccg_tree.CCGTree

Get the left child of a binary tree.

property right: lambeq.text2diagram.ccg_tree.CCGTree

Get the right child of a binary tree.

property text: str

The word or phrase associated to the tree.

to_biclosed_diagram(planar: bool = False) discopy.biclosed.Diagram[source]

Convert tree to a derivation in DisCoPy form.

Parameters
planarbool, default: False

Force the diagram to be planar. This only affects trees using crossed composition.

to_diagram(planar: bool = False) discopy.rigid.Diagram[source]

Convert tree to a DisCoCat diagram.

Parameters
planarbool, default: False

Force the diagram to be planar. This only affects trees using crossed composition.

to_json() Dict[str, Any][source]

Convert tree into JSON form.

without_trivial_unary_rules() lambeq.text2diagram.ccg_tree.CCGTree[source]

Create a new CCGTree from the current tree, with all trivial unary rules (i.e. rules that map X to X) removed.

This might happen because there is no exact correspondence between CCG types and pregroup types, e.g. both CCG types NP and N are mapped to the same pregroup type n.

Returns
lambeq.text2diagram.CCGTree

A new tree free of trivial unary rules.
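The idea of dropping unary steps that map a type to itself can be sketched on a toy tree. Node and strip_trivial_unary are hypothetical names, and types are represented as plain strings. This illustrates the transformation described above under those assumptions and is not the library's implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    type: str
    rule: str = 'L'
    text: str = ''
    children: List['Node'] = field(default_factory=list)


def strip_trivial_unary(node: Node) -> Node:
    """Return a new tree without unary steps that map a type to itself."""
    children = [strip_trivial_unary(child) for child in node.children]
    is_unary = node.rule == 'U' and len(children) == 1
    if is_unary and children[0].type == node.type:
        return children[0]  # X -> X adds nothing, so skip this node
    return Node(node.type, node.rule, node.text, children)


tree = Node('s', 'BA', children=[
    # NP -> N style step: both sides ended up with the same type 'n'.
    Node('n', 'U', children=[Node('n', 'L', 'Alice')]),
    Node('n >> s', 'L', 'runs'),
])
clean = strip_trivial_unary(tree)
```

The original tree is left unchanged, mirroring the documented behaviour of returning a new tree.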

exception lambeq.text2diagram.DepCCGParseError(sentence: str)[source]

Bases: Exception

__init__(sentence: str) None[source]
args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class lambeq.text2diagram.DepCCGParser(*, lang: str = 'en', model: Optional[str] = None, use_model_unary_rules: bool = False, annotator: str = 'janome', tokenize: Optional[bool] = None, device: int = -1, root_cats: Optional[Iterable[str]] = None, verbose: str = 'progress', **kwargs: Any)[source]

Bases: lambeq.text2diagram.ccg_parser.CCGParser

CCG parser using depccg as the backend.

__init__(*, lang: str = 'en', model: Optional[str] = None, use_model_unary_rules: bool = False, annotator: str = 'janome', tokenize: Optional[bool] = None, device: int = -1, root_cats: Optional[Iterable[str]] = None, verbose: str = 'progress', **kwargs: Any) None[source]

Instantiate a parser based on depccg.

Parameters
lang{ ‘en’, ‘ja’ }

The language to use: ‘en’ for English, ‘ja’ for Japanese.

modelstr, optional

The name of the model variant to use, if any. depccg only has English model variants, namely ‘elmo’, ‘rebank’ and ‘elmo_rebank’.

use_model_unary_rulesbool, default: False

Use the unary rules supplied by the model instead of the ones by lambeq.

annotatorstr, default: ‘janome’

The annotator to use, if any. depccg supports ‘candc’ and ‘spacy’ for English, and ‘janome’ and ‘jigg’ for Japanese. By default, no annotator is used for English, and ‘janome’ is used for Japanese.

tokenizebool, optional

Whether to tokenise the input when annotating. This option should only be specified when using the ‘spacy’ annotator.

deviceint, optional

The ID of the GPU to use. By default, uses the CPU.

root_catsiterable of str, optional

A list of categories allowed at the root of the parse. By default, the English categories are:

  • S[dcl]

  • S[wq]

  • S[q]

  • S[qem]

  • NP

and the Japanese categories are:
  • NP[case=nc,mod=nm,fin=f]

  • NP[case=nc,mod=nm,fin=t]

  • S[mod=nm,form=attr,fin=t]

  • S[mod=nm,form=base,fin=f]

  • S[mod=nm,form=base,fin=t]

  • S[mod=nm,form=cont,fin=f]

  • S[mod=nm,form=cont,fin=t]

  • S[mod=nm,form=da,fin=f]

  • S[mod=nm,form=da,fin=t]

  • S[mod=nm,form=hyp,fin=t]

  • S[mod=nm,form=imp,fin=f]

  • S[mod=nm,form=imp,fin=t]

  • S[mod=nm,form=r,fin=t]

  • S[mod=nm,form=s,fin=t]

  • S[mod=nm,form=stem,fin=f]

  • S[mod=nm,form=stem,fin=t]

verbosestr, default: ‘progress’

Controls the command-line output of the parser. Only the ‘progress’ option is available for this parser.

**kwargsdict, optional

Optional arguments passed to depccg.

sentence2diagram(sentence: Union[str, List[str]], tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False) Optional[discopy.rigid.Diagram][source]

Parse a sentence into a DisCoPy diagram.

Parameters
sentencestr, list[str]

The sentence to be parsed, passed either as a string, or as a list of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
discopy.Diagram or None

The parsed diagram, or None on failure.

Raises
ValueError

If tokenised does not match the input type.

sentence2tree(sentence: Union[str, List[str]], tokenised: bool = False, suppress_exceptions: bool = False) Optional[lambeq.text2diagram.ccg_tree.CCGTree][source]

Parse a sentence into a CCGTree.

Parameters
sentencestr, list[str]

The sentence to be parsed, passed either as a string, or as a list of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
CCGTree or None

The parsed tree, or None on failure.

Raises
ValueError

If tokenised does not match the input type.

sentences2diagrams(sentences: SentenceBatchType, tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[Diagram]]

Parse multiple sentences into a list of DisCoPy diagrams.

Parameters
sentenceslist of str, or list of list of str

The sentences to be parsed.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if a sentence fails to parse, instead of raising an exception, its return entry is None.

tokenisedbool, default: False

Whether each sentence has been passed as a list of tokens.

verbosestr, optional

See VerbosityLevel for options. Not all parsers implement all three levels of progress reporting; see the respective documentation for each parser. If set, takes priority over the verbose attribute of the parser.

Returns
list of discopy.Diagram or None

The parsed diagrams. May contain None if exceptions are suppressed.

sentences2trees(sentences: SentenceBatchType, tokenised: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[CCGTree]][source]

Parse multiple sentences into a list of CCGTree objects.

Parameters
sentenceslist of str, or list of list of str

The sentences to be parsed, passed either as strings or as lists of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if a sentence fails to parse, instead of raising an exception, its return entry is None.

tokenisedbool, default: False

Whether each sentence has been passed as a list of tokens.

verbosestr, optional

Controls the form of progress tracking. If set, takes priority over the verbose attribute of the parser. This class only supports ‘progress’ verbosity level - a progress bar.

Returns
list of CCGTree or None

The parsed trees. May contain None if exceptions are suppressed.

Raises
ValueError

If tokenised does not match the input type, or if verbosity is set to an unsupported value.

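The ValueError for a mismatch between tokenised and the input type can be pictured as a simple consistency check. The helper below is hypothetical and only illustrates the rule that strings go with tokenised=False and token lists with tokenised=True. The real parser's check and its error message may differ.

```python
from typing import List, Optional, Sequence, Union


def check_tokenised(sentences: Sequence[Union[str, List[str]]],
                    tokenised: bool) -> None:
    """Raise ValueError if any sentence disagrees with the tokenised flag."""
    for sentence in sentences:
        is_token_list = not isinstance(sentence, str)
        if is_token_list != tokenised:
            raise ValueError(
                f'tokenised={tokenised} does not match input {sentence!r}')


check_tokenised(['Alice runs'], tokenised=False)           # consistent
check_tokenised([['Alice', 'runs']], tokenised=True)       # consistent

error_message: Optional[str] = None
try:
    check_tokenised(['Alice runs'], tokenised=True)        # inconsistent
except ValueError as exc:
    error_message = str(exc)
```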
class lambeq.text2diagram.LinearReader(combining_diagram: discopy.rigid.Diagram, word_type: discopy.rigid.Ty = Ty('s'), start_box: discopy.rigid.Diagram = Id(Ty()))[source]

Bases: lambeq.text2diagram.base.Reader

A reader that combines words linearly using a stair diagram.

__init__(combining_diagram: discopy.rigid.Diagram, word_type: discopy.rigid.Ty = Ty('s'), start_box: discopy.rigid.Diagram = Id(Ty())) None[source]

Initialise a linear reader.

Parameters
combining_diagramDiagram

The diagram that is used to combine two word boxes. It is continuously applied on the left-most wires until a single output wire remains.

word_typeTy, default: core.types.AtomicType.SENTENCE

The type of each word box. By default, it uses the sentence type from core.types.AtomicType.

start_boxDiagram, default: Id()

The start box used as a sentinel value for combining. By default, the empty diagram is used.
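The linear combination can be read as a left fold over the word boxes. In the sketch below, nested tuples stand in for applications of the combining diagram, and start=None models the default empty start box. Both are simplifying assumptions, and combine_linearly is a hypothetical name rather than part of the reader.

```python
from functools import reduce
from typing import Any, Optional, Sequence


def combine_linearly(words: Sequence[str], start: Optional[str] = None) -> Any:
    """Fold from the left: combine(combine(combine(start, w1), w2), w3)."""
    items = list(words) if start is None else [start, *words]
    if not items:
        raise ValueError('nothing to combine')
    # Each tuple (left, right) represents one use of the combining diagram.
    return reduce(lambda left, right: (left, right), items)


without_start = combine_linearly(['Alice', 'loves', 'Bob'])
with_start = combine_linearly(['Alice', 'loves', 'Bob'], start='START')
```

The nesting always grows on the left, which is what gives the resulting diagram its stair shape.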

sentence2diagram(sentence: Union[str, List[str]], tokenised: bool = False) discopy.rigid.Diagram[source]

Parse a sentence into a DisCoPy diagram.

If tokenised is True, sentence is treated as a list of tokens; otherwise it is split into tokens by whitespace. This method creates a box for each token, and combines them linearly.

Parameters
sentencestr or list of str

The input sentence, passed either as a string or as a list of tokens.

tokenisedbool, default: False

Set to True, if the sentence is passed as a list of tokens instead of a single string. If set to False, words are split by whitespace.

Raises
ValueError

If sentence does not match the tokenised flag, or if an invalid mode or parser is passed to the initialiser.

sentences2diagrams(sentences: SentenceBatchType, tokenised: bool = False) list[Optional[Diagram]]

Parse multiple sentences into a list of DisCoPy diagrams.

class lambeq.text2diagram.Reader[source]

Bases: abc.ABC

Base class for readers and parsers.

abstract sentence2diagram(sentence: Union[str, List[str]], tokenised: bool = False) Optional[discopy.rigid.Diagram][source]

Parse a sentence into a DisCoPy diagram.

sentences2diagrams(sentences: SentenceBatchType, tokenised: bool = False) list[Optional[Diagram]][source]

Parse multiple sentences into a list of DisCoPy diagrams.

class lambeq.text2diagram.TreeReader(ccg_parser: typing.Union[lambeq.text2diagram.ccg_parser.CCGParser, typing.Callable[[], lambeq.text2diagram.ccg_parser.CCGParser]] = <class 'lambeq.text2diagram.bobcat_parser.BobcatParser'>, mode: lambeq.text2diagram.tree_reader.TreeReaderMode = TreeReaderMode.NO_TYPE, word_type: discopy.rigid.Ty = Ty('s'))[source]

Bases: lambeq.text2diagram.base.Reader

A reader that combines words according to a parse tree.

__init__(ccg_parser: typing.Union[lambeq.text2diagram.ccg_parser.CCGParser, typing.Callable[[], lambeq.text2diagram.ccg_parser.CCGParser]] = <class 'lambeq.text2diagram.bobcat_parser.BobcatParser'>, mode: lambeq.text2diagram.tree_reader.TreeReaderMode = TreeReaderMode.NO_TYPE, word_type: discopy.rigid.Ty = Ty('s')) None[source]

Initialise a tree reader.

Parameters
ccg_parserCCGParser or callable, default: BobcatParser

A CCGParser object or a function that returns it. The parse tree produced by the parser is used to generate the tree diagram.

modeTreeReaderMode, default: TreeReaderMode.NO_TYPE

Determines what boxes are used to combine the tree. See TreeReaderMode for options.

word_typeTy, default: core.types.AtomicType.SENTENCE

The type of each word box. By default, it uses the sentence type from core.types.AtomicType.

classmethod available_modes() list[str][source]

The list of modes for initialising a tree reader.

sentence2diagram(sentence: Union[str, List[str]], tokenised: bool = False, suppress_exceptions: bool = False) Optional[discopy.rigid.Diagram][source]

Parse a sentence into a Diagram.

This produces a tree-shaped diagram based on the output of the CCG parser.

Parameters
sentencestr or list of str

The sentence to be parsed.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

Returns
discopy.rigid.Diagram or None

The parsed diagram, or None on failure.

sentences2diagrams(sentences: SentenceBatchType, tokenised: bool = False) list[Optional[Diagram]]

Parse multiple sentences into a list of DisCoPy diagrams.

static tree2diagram(tree: lambeq.text2diagram.ccg_tree.CCGTree, mode: lambeq.text2diagram.tree_reader.TreeReaderMode = TreeReaderMode.NO_TYPE, word_type: discopy.rigid.Ty = Ty('s'), suppress_exceptions: bool = False) Optional[discopy.rigid.Diagram][source]

Convert a CCGTree into a Diagram.

This produces a tree-shaped diagram based on the output of the CCG parser.

Parameters
treeCCGTree

The CCG tree to be converted.

modeTreeReaderMode, default: TreeReaderMode.NO_TYPE

Determines what boxes are used to combine the tree. See TreeReaderMode for options.

word_typeTy, default: core.types.AtomicType.SENTENCE

The type of each word box. By default, it uses the sentence type from core.types.AtomicType.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the conversion fails, instead of raising an exception, returns None.

Returns
discopy.rigid.Diagram or None

The parsed diagram, or None on failure.
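
To illustrate what a tree-shaped combination looks like, here is a rough sketch that walks a toy parse tree bottom-up and emits one rule box per internal node. It does not use CCGTree or DisCoPy: nested tuples stand in for the parse tree, and the helpers height and rule_boxes are hypothetical. The box names follow the layer_N pattern that this page gives for the HEIGHT mode.

```python
from typing import List, Tuple, Union

# A toy tree: a leaf is a word, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def height(tree: Tree) -> int:
    """Height of a subtree; a single word has height 0."""
    if isinstance(tree, str):
        return 0
    return 1 + max(height(child) for child in tree)


def rule_boxes(tree: Tree) -> List[str]:
    """Names of the rule boxes, children first (post-order)."""
    if isinstance(tree, str):
        return []
    names: List[str] = []
    for child in tree:
        names.extend(rule_boxes(child))
    names.append(f"layer_{height(tree)}")
    return names


tree = (("the", "cat"), "sleeps")
print(rule_boxes(tree))  # ['layer_1', 'layer_2']
```

The box that combines "the" and "cat" directly sits at height 1, and the box that combines that result with "sleeps" sits at height 2.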

class lambeq.text2diagram.TreeReaderMode(value)[source]

Bases: enum.Enum

An enumeration for TreeReader.

The words in the tree diagram can be combined using four modes:

NO_TYPE

The ‘no type’ mode names every rule box UNIBOX.

RULE_ONLY

The ‘rule name’ mode names every rule box based on the name of the original CCG rule. For example, for the forward application rule FA(N << N), the rule box will be named FA.

RULE_TYPE

The ‘rule type’ mode names every rule box based on the name and type of the original CCG rule. For example, for the forward application rule FA(N << N), the rule box will be named FA(N << N).

HEIGHT

The ‘height’ mode names every rule box based on the tree height of its subtree. For example, a rule box directly combining two words will be named layer_1.

HEIGHT = 3
NO_TYPE = 0
RULE_ONLY = 1
RULE_TYPE = 2
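
The naming scheme for the four modes can be summarised in a short sketch. The enum below mirrors the member values listed above but is a local stand-in, not lambeq's TreeReaderMode, and rule_box_name is a hypothetical helper written only to show the naming rules from the descriptions.

```python
from enum import Enum


class Mode(Enum):
    """Local stand-in mirroring the documented member values."""
    NO_TYPE = 0
    RULE_ONLY = 1
    RULE_TYPE = 2
    HEIGHT = 3


def rule_box_name(mode: Mode, rule: str, rule_type: str, height: int) -> str:
    """Name of a rule box under each mode, per the descriptions above."""
    if mode is Mode.NO_TYPE:
        return "UNIBOX"
    if mode is Mode.RULE_ONLY:
        return rule
    if mode is Mode.RULE_TYPE:
        return f"{rule}({rule_type})"
    if mode is Mode.HEIGHT:
        return f"layer_{height}"
    raise ValueError(f"unsupported mode: {mode!r}")


# Forward application FA(N << N), directly combining two words (height 1).
for mode in Mode:
    print(mode.name, "->", rule_box_name(mode, "FA", "N << N", 1))
```
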
exception lambeq.text2diagram.WebParseError(sentence: str)[source]

Bases: OSError

__init__(sentence: str) None[source]
args
characters_written
errno

POSIX exception code

filename

exception filename

filename2

second exception filename

strerror

exception strerror

with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class lambeq.text2diagram.WebParser(parser: str = 'depccg', verbose: str = 'suppress')[source]

Bases: lambeq.text2diagram.ccg_parser.CCGParser

Wrapper that allows passing parser queries to an online service.

__init__(parser: str = 'depccg', verbose: str = 'suppress') None[source]

Initialise a web parser.

Parameters
parserstr, optional

The web parser to use. By default, this is the depccg parser.

verbosestr, default: ‘suppress’

See VerbosityLevel for options.

sentence2diagram(sentence: Union[str, List[str]], tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False) Optional[discopy.rigid.Diagram]

Parse a sentence into a DisCoPy diagram.

Parameters
sentencestr or list of str

The sentence to be parsed.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
discopy.Diagram or None

The parsed diagram, or None on failure.

sentence2tree(sentence: Union[str, List[str]], tokenised: bool = False, suppress_exceptions: bool = False) Optional[lambeq.text2diagram.ccg_tree.CCGTree]

Parse a sentence into a CCGTree.

Parameters
sentencestr, list[str]

The sentence to be parsed, passed either as a string, or as a list of tokens.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if the sentence fails to parse, instead of raising an exception, returns None.

tokenisedbool, default: False

Whether the sentence has been passed as a list of tokens.

Returns
CCGTree or None

The parsed tree, or None on failure.

sentences2diagrams(sentences: SentenceBatchType, tokenised: bool = False, planar: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[Diagram]]

Parse multiple sentences into a list of DisCoPy diagrams.

Parameters
sentenceslist of str, or list of list of str

The sentences to be parsed.

planarbool, default: False

Force diagrams to be planar when they contain crossed composition.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if a sentence fails to parse, instead of raising an exception, its return entry is None.

tokenisedbool, default: False

Whether each sentence has been passed as a list of tokens.

verbosestr, optional

See VerbosityLevel for options. Not all parsers implement all three levels of progress reporting; see the respective documentation for each parser. If set, it takes priority over the verbose attribute of the parser.

Returns
list of discopy.Diagram or None

The parsed diagrams. May contain None if exceptions are suppressed.
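
Because the returned list may contain None when exceptions are suppressed, callers usually need to pair each result with its input and filter out failures. The sketch below shows that pattern with a toy parser; parse_batch and toy_parse are hypothetical helpers and do not contact any web service or call lambeq.

```python
from typing import Callable, List, Optional


def parse_batch(sentences: List[str],
                parse_one: Callable[[str], str],
                suppress_exceptions: bool = False) -> List[Optional[str]]:
    """Parse each sentence; on failure, re-raise or record None."""
    results: List[Optional[str]] = []
    for sentence in sentences:
        try:
            results.append(parse_one(sentence))
        except Exception:
            if not suppress_exceptions:
                raise
            results.append(None)  # keep positions aligned with the input
    return results


def toy_parse(sentence: str) -> str:
    # Hypothetical parser: rejects blank sentences, otherwise returns a label.
    if not sentence.strip():
        raise ValueError("blank sentence")
    return f"diagram<{sentence}>"


sentences = ["John walks", "   ", "Mary sleeps"]
results = parse_batch(sentences, toy_parse, suppress_exceptions=True)

# Keep only the sentences that parsed, together with their results.
parsed = [(s, d) for s, d in zip(sentences, results) if d is not None]
print(parsed)
```

Keeping None in place rather than dropping it preserves the one-to-one alignment between inputs and outputs, which is what makes the zip above safe.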

sentences2trees(sentences: SentenceBatchType, tokenised: bool = False, suppress_exceptions: bool = False, verbose: Optional[str] = None) list[Optional[CCGTree]][source]

Parse multiple sentences into a list of CCGTree objects.

Parameters
sentenceslist of str, or list of list of str

The sentences to be parsed.

suppress_exceptionsbool, default: False

Whether to suppress exceptions. If True, then if a sentence fails to parse, instead of raising an exception, its return entry is None.

verbosestr, optional

See VerbosityLevel for options. If set, it takes priority over the verbose attribute of the parser.

Returns
list of CCGTree or None

The parsed trees. May contain None if exceptions are suppressed.

Raises
URLError

If the service URL is not well formed.

ValueError

If a sentence is blank, or if the type of the sentence does not match the tokenised flag.

WebParseError

If the parser fails to obtain a parse tree from the server.

lambeq.text2diagram.cups_reader = <lambeq.text2diagram.linear_reader.LinearReader object>

A reader that combines words linearly using a stair diagram.

lambeq.text2diagram.spiders_reader = <lambeq.text2diagram.spiders_reader.SpidersReader object>

A reader that combines words using a spider.

lambeq.text2diagram.stairs_reader = <lambeq.text2diagram.linear_reader.LinearReader object>

A reader that combines words linearly using a stair diagram.

lambeq.text2diagram.word_sequence_reader = <lambeq.text2diagram.linear_reader.LinearReader object>

A reader that combines words linearly using a stair diagram.

lambeq.text2diagram.bag_of_words_reader = <lambeq.text2diagram.spiders_reader.SpidersReader object>

A reader that combines words using a spider.