Training a New Pipeline

This section describes how to train a new coreference resolution pipeline from scratch.

Dataset Preparation

First, you need an annotated dataset.

This dataset should contain:

The raw text files
The annotations, with at least the following information for each one:
- The start and end of the annotation (character indexes in the raw text)
- The label of the annotation (type of entity)
- The ID of the coreference chain the annotation belongs to
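
As an illustration of the minimal information listed above, here is a small Python sketch of one way to represent annotations in memory and group mentions by coreference chain. The class name `Annotation`, its field names, the helper `group_chains`, the `PERSON` label, and the convention that `end` is exclusive are all assumptions made for this example. They are not the file format or API expected by the pipeline.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record holding the minimal fields of one annotation."""
    start: int     # character index of the first character in the raw text
    end: int       # character index one past the last character (assumed exclusive)
    label: str     # type of entity
    chain_id: str  # ID of the coreference chain this mention belongs to


def group_chains(text, annotations):
    """Check offsets against the raw text and group mention strings by chain ID."""
    chains = defaultdict(list)
    for ann in annotations:
        if not 0 <= ann.start < ann.end <= len(text):
            raise ValueError(f"Offsets out of range: {ann}")
        chains[ann.chain_id].append(text[ann.start:ann.end])
    return dict(chains)


raw_text = "Alice met Bob. She greeted him."
annotations = [
    Annotation(0, 5, "PERSON", "c1"),    # "Alice"
    Annotation(10, 13, "PERSON", "c2"),  # "Bob"
    Annotation(15, 18, "PERSON", "c1"),  # "She"
    Annotation(27, 30, "PERSON", "c2"),  # "him"
]

for chain_id, mentions in group_chains(raw_text, annotations).items():
    print(chain_id, mentions)
```

Checking the offsets against the raw text before training is a cheap way to catch annotations that were exported against a different version of the text.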

Ready-to-use annotated datasets can be downloaded directly from the datasets section.