Evaluation¶
There are two interfaces for evaluating your predicted units:
- High level with discophon.benchmark, which runs the complete evaluation suite on all available units.
- Low level with discophon.evaluate, where you have fine-grained control over the metrics.
High level¶
To run the complete benchmark evaluation, you first need to save your predicted units to JSONL files organized like this:
units/
├── units-cmn-dev.jsonl
├── units-cmn-test.jsonl
├── units-deu-dev.jsonl
├── ...
├── units-wol-dev.jsonl
└── units-wol-test.jsonl
The filenames should be in the format units-{language}-{split}.jsonl, where language is the language code [1] and split is the dataset split (test or dev).
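Before launching the benchmark, it can help to check that the units directory follows this naming scheme. The sketch below is only an illustration: the helper names `expected_unit_files` and `missing_unit_files` are hypothetical and not part of discophon, and it assumes that every language has both a dev and a test file, as the tree above suggests.

```python
from pathlib import Path

# Language codes as listed in this page's footnote. Whether this is the exact
# set shipped with the benchmark data is an assumption.
DEV_LANGUAGES = ["deu", "swa", "tam", "tha", "tur", "ukr"]
TEST_LANGUAGES = ["cmn", "eng", "eus", "fra", "jpn", "wol"]
SPLITS = ["dev", "test"]


def expected_unit_files(languages, splits=SPLITS):
    """Return units-{language}-{split}.jsonl for every language/split pair."""
    return [f"units-{lang}-{split}.jsonl" for lang in languages for split in splits]


def missing_unit_files(units_dir, languages, splits=SPLITS):
    """Return the expected file names that do not exist in units_dir."""
    root = Path(units_dir)
    return [
        name
        for name in expected_unit_files(languages, splits)
        if not (root / name).is_file()
    ]


if __name__ == "__main__":
    # Hypothetical path: replace with your own units directory.
    missing = missing_unit_files("/path/to/units", DEV_LANGUAGES + TEST_LANGUAGES)
    for name in missing:
        print("missing:", name)
```

An empty result means every expected file is present. The check only looks at file names, not at the content of the JSONL files.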
You can run the benchmark evaluation on phoneme discovery with a many-to-one mapping on all available languages and splits like this:
from discophon.benchmark import benchmark_discovery
df = benchmark_discovery("/path/to/discophon_data", "/path/to/units", kind="many-to-one")
print(df) # pl.DataFrame with the results for each language and split
Use the functions benchmark_abx_continuous or benchmark_abx_discrete for ABX evaluation.
Via the CLI:
❯ python -m discophon.benchmark --help
usage: discophon.benchmark [-h] [--benchmark {discovery,abx-discrete,abx-continuous}] [--kind {many-to-one,one-to-one}] [--step-units STEP_UNITS]
dataset predictions output
Phoneme Discovery benchmark
positional arguments:
dataset Path to the benchmark dataset
predictions Path to the directory with the discrete units or the features
output Path to the output file
options:
-h, --help show this help message and exit
--benchmark {discovery,abx-discrete,abx-continuous}
Which benchmark (default: discovery)
--kind {many-to-one,one-to-one}
Kind of assignment (either many-to-one, or one-to-one) (default: many-to-one)
--step-units STEP_UNITS
Step in ms between units or features. 'frequency' is then set to 1000 // step_units. (default: 20)
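If you want to drive this CLI from a script, the command line can be assembled from the options shown in the help text above. This is a minimal sketch: `build_benchmark_command` is a hypothetical helper, not a discophon function, and the output path in the usage lines is a placeholder because the expected output format is not specified here.

```python
import sys

# Choices copied from the help text above.
BENCHMARKS = ("discovery", "abx-discrete", "abx-continuous")
KINDS = ("many-to-one", "one-to-one")


def build_benchmark_command(dataset, predictions, output,
                            benchmark="discovery", kind="many-to-one", step_units=20):
    """Build the argument list for `python -m discophon.benchmark`."""
    if benchmark not in BENCHMARKS:
        raise ValueError(f"benchmark must be one of {BENCHMARKS}")
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {KINDS}")
    return [
        sys.executable, "-m", "discophon.benchmark",
        "--benchmark", benchmark,
        "--kind", kind,
        "--step-units", str(step_units),
        str(dataset), str(predictions), str(output),
    ]


if __name__ == "__main__":
    # Placeholder paths: replace with your own dataset, units, and output locations.
    cmd = build_benchmark_command("/path/to/discophon_data", "/path/to/units", "/path/to/output")
    print(" ".join(cmd))
    # To actually run it (requires discophon to be installed):
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Building the list first and printing it makes it easy to inspect the exact invocation before running it.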
Low level¶
Phoneme discovery¶
Use the phoneme_discovery function with units of type Units and phones of type Phones. You also need to set the kind of assignment (kind), the number of units (n_units), and either the language (language) or the number of phonemes (n_phonemes).
Example:
from discophon.data import read_gold_annotations, read_submitted_units
from discophon.evaluate import phoneme_discovery
phones = read_gold_annotations("/path/to/discophon_data/alignment/alignment-eng-test.txt")
units = read_submitted_units("/path/to/units/units-eng-test.jsonl")
result = phoneme_discovery(units, phones, kind="many-to-one", n_units=256, language="eng")
print(result)
Or via the CLI:
❯ python -m discophon.evaluate --help
usage: discophon.evaluate [-h] [--language LANGUAGE] [--n-phonemes N_PHONEMES] --n-units N_UNITS
[--kind {many-to-one,one-to-one}] [--step-units STEP_UNITS]
units phones
Evaluate predicted units on phoneme discovery
positional arguments:
units Path to predicted units
phones Path to gold alignments
options:
-h, --help show this help message and exit
--language LANGUAGE Evaluated language. Either use this or `--n-phonemes` (default: None)
--n-phonemes N_PHONEMES
Number of phonemes. Either use this or `--language` (default: None)
--n-units N_UNITS Required. Number of units (default: None)
--kind {many-to-one,one-to-one}
Kind of assignment (either many-to-one, or one-to-one) (default: many-to-
one)
--step-units STEP_UNITS
Step between units (in ms) (default: 20)
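To give some intuition for the many-to-one kind of assignment, the sketch below shows one common reading of it: each unit is mapped to the phoneme it co-occurs with most often, so several units may share a phoneme. This is an illustration on toy frame-aligned lists, not discophon's implementation; `many_to_one_mapping` is a hypothetical name, and the inputs are plain Python lists rather than the Units and Phones types.

```python
from collections import Counter, defaultdict


def many_to_one_mapping(units, phones):
    """Map each unit to its most frequently co-occurring phoneme.

    `units` and `phones` are frame-aligned sequences of equal length.
    """
    if len(units) != len(phones):
        raise ValueError("units and phones must have the same length")
    counts = defaultdict(Counter)
    for unit, phone in zip(units, phones):
        counts[unit][phone] += 1
    # For every unit, keep the phoneme with the highest co-occurrence count.
    return {unit: counter.most_common(1)[0][0] for unit, counter in counts.items()}


if __name__ == "__main__":
    # Toy data: units 0 and 1 both align with "a", unit 2 mostly with "b".
    toy_units = [0, 0, 1, 2, 2, 2]
    toy_phones = ["a", "a", "a", "b", "b", "c"]
    print(many_to_one_mapping(toy_units, toy_phones))
```

In a one-to-one assignment, by contrast, each phoneme could be claimed by at most one unit, which requires solving a matching problem rather than taking a per-unit majority vote.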
ABX¶
The ABX evaluation is done separately. First, install this package with the abx optional dependencies.
Then, either run it in Python:
from discophon.abx import discrete_abx, continuous_abx
result_discrete = discrete_abx(
"/path/to/discophon_data/item/triphone-eng-test.item",
"/path/to/units/units-eng-test.jsonl",
frequency=50,
)
print("Discrete: ", result_discrete)
result_continuous = continuous_abx(
"/path/to/discophon_data/item/triphone-eng-test.item",
"/path/to/units/units-eng-test.jsonl",
frequency=50,
)
print("Continuous: ", result_continuous)
Or via the CLI:
❯ python -m discophon.abx --help
usage: discophon.evaluate.abx [-h] --frequency FREQUENCY [--kind {triphone,phoneme}] item root
Continuous or discrete ABX
positional arguments:
item Path to the item file
root Path to the JSONL with units or directory with continuous features
options:
-h, --help show this help message and exit
--frequency FREQUENCY
Required. Units frequency in Hz (default: None)
--kind {triphone,phoneme}
Triphone- or phoneme-based ABX (default: triphone)
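The --frequency value here is in Hz, while the benchmark CLI takes a step in milliseconds and derives the frequency as 1000 // step_units. The small sketch below mirrors that conversion; `frequency_from_step` is a hypothetical helper written for illustration, not a discophon function.

```python
def frequency_from_step(step_ms):
    """Convert a step between units (in ms) to a frequency in Hz.

    Uses integer division, matching the `1000 // step_units` rule quoted
    in the benchmark CLI help.
    """
    if step_ms <= 0:
        raise ValueError("step_ms must be a positive integer")
    return 1000 // step_ms


if __name__ == "__main__":
    # The default step of 20 ms corresponds to the frequency=50 used above.
    print(frequency_from_step(20))  # 50
```

Because the division is truncating, steps that do not divide 1000 evenly (for example 30 ms) give a rounded-down frequency.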
1. Dev languages: deu, swa, tam, tha, tur, ukr. Test languages: cmn, eng, eus, fra, jpn, wol.