edsnlp.pipes.trainable.biaffine_dep_parser.biaffine_dep_parser

TrainableBiaffineDependencyParser [source]

Bases: TorchComponent[BatchOutput, BatchInput]

The eds.biaffine_dep_parser component is a trainable dependency parser based on a biaffine model (Dozat and Manning, 2017). For each token, the model predicts a score for each possible head in the document, and a score for each possible label for each head. The results are then decoded either greedily by picking the best scoring head for each token independently, or holistically by computing the Maximum Spanning Tree (MST) over the graph of token → head scores.
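
To make the difference between the two decoding modes concrete, here is a minimal sketch in plain Python. It is not the component's implementation. It assumes a score matrix laid out as in the chuliu_edmonds_one_root documentation (scores[i][j] is the score of token j as head of token i, and node 0 is an artificial root). The helper names greedy_heads and is_tree are hypothetical.

```python
def greedy_heads(scores):
    """Pick the best-scoring head for each token independently.

    Assumed layout: scores[i][j] is the score of node j being the head
    of node i, and node 0 is an artificial root (heads[0] is set to 0).
    """
    n = len(scores)
    heads = [0]
    for i in range(1, n):
        # A token cannot be its own head
        candidates = [j for j in range(n) if j != i]
        heads.append(max(candidates, key=lambda j: scores[i][j]))
    return heads


def is_tree(heads):
    """Check that every token reaches the root (node 0) without looping."""
    for start in range(1, len(heads)):
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False  # we came back to a visited node: cycle
            seen.add(node)
            node = heads[node]
    return True


# Toy scores (illustrative values only): tokens 1 and 2 each prefer
# the other as head, so independent choices form a cycle.
scores = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 5.0],
    [1.0, 5.0, 0.0],
]
heads = greedy_heads(scores)
print(heads, is_tree(heads))  # [0, 2, 1] False
```

Greedy decoding is cheap, but as the toy matrix shows it may return a graph that is not a tree. MST decoding trades extra computation for a guarantee that the output is a valid tree.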

Experimental

This component is experimental. In particular, it expects sentences rather than full documents as input: it has not yet been optimized for memory efficiency and computes the full matrix of scores for all pairs of tokens in a document.

At the moment, it is mostly used for benchmarking and research purposes.

Examples

import edsnlp, edsnlp.pipes as eds

nlp = edsnlp.blank("eds")
nlp.add_pipe(
    eds.biaffine_dep_parser(
        embedding=eds.transformer(model="hf-internal-testing/tiny-random-bert"),
        hidden_size=128,
        dropout_p=0.1,
        # labels unset, will be inferred from the data in `post_init`
        decoding_mode="mst",
    ),
    name="dep_parser"
)

Dependency parsers are typically trained on CoNLL-formatted Universal Dependencies corpora, which you can load using the edsnlp.data.read_conll function.

To train the model, you can adapt the Training NER tutorial.

Parameters

PARAMETER DESCRIPTION
nlp

The pipeline object

TYPE: Optional[PipelineProtocol] DEFAULT: None

name

Name of the component

TYPE: str DEFAULT: 'biaffine_dep_parser'

embedding

The word embedding component

TYPE: WordEmbeddingComponent

context_getter

What context to use when computing the span embeddings (defaults to the whole document). For example {"section": "conclusion"} to predict dependencies in the conclusion section of documents.

TYPE: Optional[SpanGetterArg] DEFAULT: None

use_attrs

The attributes to use as features for the model (e.g. ["pos_"] to use the POS tag). By default, no attributes are used.

Note that if you train a model with attributes, you will need to provide the same attributes during inference, and the model might not work well if the attributes were not annotated accurately on the test data.

TYPE: Optional[List[str]] DEFAULT: None

attr_size

The size of the attribute embeddings.

TYPE: int DEFAULT: 32

hidden_size

The size of the hidden layer in the MLP.

TYPE: int DEFAULT: 128

dropout_p

The dropout probability to use in the MLP.

TYPE: float DEFAULT: 0.0

labels

The labels to predict. The labels can also be inferred from the data during nlp.post_init(...).

TYPE: List[str] DEFAULT: ['root']

decoding_mode

Whether to decode the dependencies greedily or using the Maximum Spanning Tree algorithm.

TYPE: Literal['greedy', 'mst'] DEFAULT: 'mst'

Authors and citation

The eds.biaffine_dep_parser trainable pipe was developed by AP-HP's Data Science team, and heavily inspired by the implementation of Grobol and Crabbé, 2021. The biaffine architecture is based on the biaffine parser of Dozat and Manning, 2017.

chuliu_edmonds_one_root [source]

Shamelessly copied from https://github.com/hopsparser/hopsparser/blob/main/hopsparser/mst.py#L63. All credit goes to Loic Grobol at Université Paris Nanterre, France, the original author of this implementation. The license of the hopsparser software is reproduced below:

Copyright 2020 Benoît Crabbé benoit.crabbe@linguist.univ-paris-diderot.fr

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


Repeatedly use the Chu‑Liu/Edmonds algorithm to find a maximum spanning dependency tree from the weight matrix of a rooted weighted directed graph.

ATTENTION: this modifies scores in place.

Input

  • scores: A 2d numeric array such that scores[i][j] is the weight of the $j→i$ edge in the graph and the 0-th node is the root.

Output

  • tree: A 1d integer array such that tree[i] is the head of the i-th node
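
The input and output conventions can be illustrated with a brute-force reference written in plain Python. This is a sketch for tiny graphs only, not the Chu‑Liu/Edmonds algorithm and not the code of this function. It assumes that "one root" means exactly one node has the root (node 0) as its head, and it sets tree[0] to 0 as a placeholder. The names brute_force_one_root_tree and reaches_root are hypothetical.

```python
from itertools import product


def reaches_root(tree):
    """True if following heads from every node ends at node 0 without a cycle."""
    for start in range(1, len(tree)):
        seen = set()
        node = start
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = tree[node]
    return True


def brute_force_one_root_tree(scores):
    """Exhaustively search the best tree; scores[i][j] weighs the j -> i edge.

    Exponential in the number of nodes, so only usable on toy inputs.
    Unlike the documented function, this sketch does not modify `scores`.
    """
    n = len(scores)
    best_tree, best_total = None, float("-inf")
    for assignment in product(range(n), repeat=n - 1):
        tree = [0, *assignment]  # tree[0] = 0 is a placeholder (assumption)
        if any(tree[i] == i for i in range(1, n)):
            continue  # no self-loops
        if sum(1 for i in range(1, n) if tree[i] == 0) != 1:
            continue  # assumed constraint: exactly one child of the root
        if not reaches_root(tree):
            continue  # not a tree
        total = sum(scores[i][tree[i]] for i in range(1, n))
        if total > best_total:
            best_tree, best_total = tree, total
    return best_tree


# Toy weights (illustrative only). Picking the best head per node would
# attach both node 1 and node 2 to the root; the constraint forbids that.
scores = [
    [0.0, 0.0, 0.0, 0.0],
    [9.0, 0.0, 1.0, 1.0],
    [8.0, 5.0, 0.0, 1.0],
    [1.0, 2.0, 6.0, 0.0],
]
print(brute_force_one_root_tree(scores))  # [0, 0, 1, 2]
```

Here tree[1] == 0 means node 1 attaches to the root, tree[2] == 1 means node 1 is the head of node 2, and so on. Since the documented function modifies scores in place, pass a copy if the matrix is needed afterwards.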

  1. Dozat T. and Manning C.D., 2017. Deep Biaffine Attention for Neural Dependency Parsing. https://arxiv.org/abs/1611.01734

  2. Grobol L. and Crabbé B., 2021. Analyse en dépendances du français avec des plongements contextualisés. https://hal.archives-ouvertes.fr/hal-03223424