API reference
discophon.benchmark
Run the DiscoPhon benchmark on your predictions.
Compute the scores for all languages and splits for which units or features have been extracted.
benchmark_discovery
benchmark_discovery(
path_dataset: str | Path,
path_units: str | Path,
*,
kind: Literal["many-to-one", "one-to-one"],
step_units: int = STEP_UNITS,
) -> DataFrame
Benchmark phoneme discovery. Evaluate all languages and splits with available units.
The units should be saved in the directory path_units, in JSONL files
named units-{code}-{split}.jsonl with keys file (str) and units (list[int]).
Parameters:
-
path_dataset(str | Path) –Path to the DiscoPhon dataset
-
path_units(str | Path) –Path to the directory with the predicted units
-
kind(Literal['many-to-one', 'one-to-one']) –Kind of assignment. If it is many-to-one, the number of units is set to the default (DEFAULT_N_UNITS). Otherwise, it is set to the number of phonemes plus one.
-
step_units(int, default:STEP_UNITS) –Step between consecutive units (in ms).
Returns:
-
DataFrame–DataFrame with the results
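As a minimal sketch of the expected input layout, the following writes a units file with the naming scheme and keys described above. The language code `xx`, the split name `dev`, the file identifiers, and the helper `write_units` are hypothetical and only serve the illustration.

```python
import json
import tempfile
from pathlib import Path


def write_units(path_units: Path, code: str, split: str, units: dict[str, list[int]]) -> Path:
    """Write one JSON object per line, with the keys `file` and `units`."""
    path_units.mkdir(parents=True, exist_ok=True)
    path = path_units / f"units-{code}-{split}.jsonl"
    with path.open("w", encoding="utf-8") as f:
        for file_id, file_units in units.items():
            f.write(json.dumps({"file": file_id, "units": file_units}) + "\n")
    return path


# Hypothetical language code, split name, and file identifiers.
example_dir = Path(tempfile.mkdtemp())
example_path = write_units(
    example_dir, "xx", "dev", {"utt-001": [3, 3, 7, 7, 7, 1], "utt-002": [5, 5, 2]}
)
records = [json.loads(line) for line in example_path.read_text(encoding="utf-8").splitlines()]

# With DiscoPhon installed and the dataset downloaded, the benchmark call would then be:
# from discophon.benchmark import benchmark_discovery
# df = benchmark_discovery("path/to/dataset", example_dir, kind="many-to-one")
```

The final call is left commented out because it needs the prepared dataset and real language codes.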
benchmark_abx_discrete
benchmark_abx_discrete(
path_dataset: str | Path,
path_units: str | Path,
*,
kind: Literal["triphone", "phoneme"] = "triphone",
step_units: int = STEP_UNITS,
) -> DataFrame
ABX on all discrete units available.
The units should be saved in the directory path_units, in JSONL files
named units-{code}-{split}.jsonl with keys file (str) and units (list[int]).
Parameters:
-
path_dataset(str | Path) –Path to the DiscoPhon dataset
-
path_units(str | Path) –Path to the directory with the predicted units
-
kind(Literal['triphone', 'phoneme'], default:'triphone') –Kind of representations to use for ABX computation.
-
step_units(int, default:STEP_UNITS) –Step between consecutive units (in ms).
Returns:
-
DataFrame–DataFrame with the results
benchmark_abx_continuous
benchmark_abx_continuous(
path_dataset: str | Path,
path_features: str | Path,
*,
kind: Literal["triphone", "phoneme"] = "triphone",
step_units: int = STEP_UNITS,
) -> DataFrame
ABX on all continuous features available.
The features should be saved in the directory path_features, in subfolders path_features/{code}/{split}.
Parameters:
-
path_dataset(str | Path) –Path to the DiscoPhon dataset
-
path_features(str | Path) –Path to the directory with the extracted features
-
kind(Literal['triphone', 'phoneme'], default:'triphone') –Kind of representations to use for ABX computation.
-
step_units(int, default:STEP_UNITS) –Step between consecutive features (in ms). The feature frequency will be set to 1_000 // step_units.
Returns:
-
DataFrame–DataFrame with the results
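The relation between the step and the feature frequency can be sketched as follows. The helper name `feature_frequency` is hypothetical; only the integer division `1_000 // step_units` comes from the description above.

```python
def feature_frequency(step_units_ms: int) -> int:
    """Feature frequency in Hz for a step given in ms, using integer division."""
    if step_units_ms <= 0:
        raise ValueError("the step must be a positive number of milliseconds")
    return 1_000 // step_units_ms


# A 20 ms step corresponds to a 50 Hz model.
frequencies = {step: feature_frequency(step) for step in (10, 20, 40)}
```

Because the division is integral, a step that does not divide 1000 evenly (for example 30 ms) is rounded down to the nearest whole frequency.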
discophon.evaluate
DiscoPhon evaluation module.
coocurrence_matrix
coocurrence_matrix(
units: Units,
phones: Phones,
*,
n_units: int,
n_phonemes: int | None = None,
step_units: int = STEP_UNITS,
step_phones: int = STEP_PHONES,
language: str | Language | None = None,
) -> DataArray
Build the 2D co-occurrence matrix of shape (n_phonemes, n_units) as a DataArray.
Parameters:
-
units(Units) –Predicted discrete units
-
phones(Phones) –Gold phone annotations
-
n_units(int) –Number of distinct discrete units in the evaluated system
-
n_phonemes(int | None, default:None) –Number of phonemes in the language under consideration. Either use this argument or language.
-
step_units(int, default:STEP_UNITS) –Step between consecutive units (in ms)
-
step_phones(int, default:STEP_PHONES) –Step between consecutive phones (in ms)
-
language(str | Language | None, default:None) –Evaluated language. Used to infer the number of phonemes if n_phonemes is not set. Do not set both at the same time.
Returns:
-
DataArray–2D array in which element (i, j) is the number of times unit j appeared where the underlying phoneme is i. The phonemes are sorted by frequency.
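The counting behind such a matrix can be sketched in plain Python. This sketch assumes that units and phones share the same step, so that frames align one to one; the library function accepts separate `step_units` and `step_phones` and returns a DataArray, neither of which is reproduced here. The helper names are hypothetical.

```python
from collections import Counter


def cooccurrence_counts(units: dict[str, list[int]], phones: dict[str, list[str]]) -> Counter:
    """Count (phoneme, unit) pairs over frame-aligned sequences."""
    counts: Counter = Counter()
    for file_id, file_units in units.items():
        # Assumption: same step for units and phones; zip stops at the shorter sequence.
        for unit, phone in zip(file_units, phones[file_id]):
            counts[(phone, unit)] += 1
    return counts


def to_matrix(counts: Counter, n_units: int) -> tuple[list[str], list[list[int]]]:
    """Return the phonemes sorted by decreasing frequency and the count matrix."""
    totals: Counter = Counter()
    for (phone, _unit), count in counts.items():
        totals[phone] += count
    # Ties are broken alphabetically to keep the order deterministic.
    phonemes = sorted(totals, key=lambda p: (-totals[p], p))
    matrix = [[counts[(p, u)] for u in range(n_units)] for p in phonemes]
    return phonemes, matrix


example_units = {"utt": [0, 0, 1, 1, 1, 2]}
example_phones = {"utt": ["a", "a", "b", "b", "b", "a"]}
phonemes, matrix = to_matrix(cooccurrence_counts(example_units, example_phones), n_units=3)
```

Rows follow the phoneme order and columns follow the unit indices, matching the (n_phonemes, n_units) shape described above.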
phone_assignments
phone_assignments(
units: Units, coocurrence: DataArray, *, kind: Literal["many-to-one", "one-to-one"]
) -> Phones
Compute the assigned sequences of phones from units, the co-occurrence matrix, and the kind of assignment.
Parameters:
-
units(Units) –Predicted discrete units
-
coocurrence(DataArray) –Co-occurrence matrix between units and the underlying phones, computed with coocurrence_matrix
-
kind(Literal['many-to-one', 'one-to-one']) –Kind of assignment.
Returns:
-
Phones–Assigned phones with this kind of mapping
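A many-to-one assignment is sketched below under the assumption that each unit is mapped to the phoneme it co-occurs with most often; this page does not state how the library implements it. The one-to-one case requires an optimal matching between units and phonemes and is not shown. The helper names and the small matrix are hypothetical.

```python
def many_to_one_mapping(phonemes: list[str], matrix: list[list[int]]) -> dict[int, str]:
    """Map every unit index to the phoneme with the highest count in its column."""
    mapping = {}
    for j in range(len(matrix[0])):
        column = [row[j] for row in matrix]
        # max() keeps the first index on ties, i.e. the most frequent phoneme.
        best = max(range(len(phonemes)), key=lambda i: column[i])
        mapping[j] = phonemes[best]
    return mapping


def assign(units: dict[str, list[int]], mapping: dict[int, str]) -> dict[str, list[str]]:
    """Replace every unit by its assigned phoneme."""
    return {file_id: [mapping[u] for u in file_units] for file_id, file_units in units.items()}


# Rows are phonemes "a" and "b"; columns are units 0, 1 and 2.
mapping = many_to_one_mapping(["a", "b"], [[2, 0, 1], [0, 3, 0]])
assigned = assign({"utt": [0, 0, 1, 1, 1, 2]}, mapping)
```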
phoneme_discovery
phoneme_discovery(
units: Units,
phones: Phones,
*,
kind: Literal["many-to-one", "one-to-one"],
n_units: int,
n_phonemes: int | None = None,
step_units: int = STEP_UNITS,
step_phones: int = STEP_PHONES,
language: str | Language | None = None,
) -> PhonemeDiscoveryEvaluation
Full evaluation of phoneme discovery: PNMI, PER, F1 and R-value boundary detection.
Parameters:
-
units(Units) –Predicted discrete units
-
phones(Phones) –Gold phone annotations
-
kind(Literal['many-to-one', 'one-to-one']) –Kind of assignment
-
n_units(int) –Number of distinct discrete units in the evaluated system
-
n_phonemes(int | None, default:None) –Number of phonemes in the language under consideration. Either use this argument or language.
-
step_units(int, default:STEP_UNITS) –Step between consecutive units (in ms)
-
step_phones(int, default:STEP_PHONES) –Step between consecutive phones (in ms)
-
language(str | Language | None, default:None) –Evaluated language. Used to infer the number of phonemes if n_phonemes is not set. Do not set both at the same time.
Returns:
-
PhonemeDiscoveryEvaluation–Phoneme discovery results in a dictionary with keys "pnmi", "per", "f1", and "r_val".
pnmi
Compute PNMI.
Parameters:
-
coocurrence(DataArray) –Co-occurrence matrix between units and the underlying phones, computed with coocurrence_matrix
Returns:
-
float–Phone-normalized mutual information (between 0 and 1)
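PNMI is commonly defined as the mutual information between phones and units divided by the entropy of the phones. The sketch below follows that definition on a plain list-of-lists count matrix; it assumes the library uses the same definition, which this page does not spell out, and the function name is hypothetical.

```python
import math


def pnmi_from_counts(matrix: list[list[int]]) -> float:
    """I(phone; unit) / H(phone), with rows as phonemes and columns as units."""
    total = sum(sum(row) for row in matrix)
    p_phone = [sum(row) / total for row in matrix]
    p_unit = [sum(row[j] for row in matrix) / total for j in range(len(matrix[0]))]
    mutual_information = 0.0
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if count:  # empty cells contribute nothing
                p_joint = count / total
                mutual_information += p_joint * math.log(p_joint / (p_phone[i] * p_unit[j]))
    entropy = -sum(p * math.log(p) for p in p_phone if p > 0)
    if entropy == 0:
        raise ValueError("the phone entropy is zero, so PNMI is undefined")
    return mutual_information / entropy


perfect = pnmi_from_counts([[2, 0], [0, 2]])  # units identify the phoneme exactly
uninformative = pnmi_from_counts([[1, 1], [1, 1]])  # units carry no information
```

The two toy matrices give the extremes of the range: close to 1 when every unit identifies one phoneme, and 0 when units and phonemes are independent.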
phone_error_rate
phone_error_rate(
predicted_phones_from_units: Phones, gold_phones: Phones, *, n_jobs: int = -1
) -> float
Phone error rate.
Total edit distances divided by the total length of the target annotations.
Parameters:
-
predicted_phones_from_units(Phones) –Predicted phones obtained with phone_assignments
-
gold_phones(Phones) –Gold phone annotations
-
n_jobs(int, default:-1) –The maximum number of concurrently running jobs, passed to joblib.Parallel
Returns:
-
float–Phone error rate. Multiply it by 100 to get a percentage.
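The definition above (total edit distance divided by total target length) can be sketched with a plain Levenshtein distance. The sketch assumes that both inputs are already phone sequences rather than frame-level labels, and it runs sequentially instead of using `joblib.Parallel`. The function names are hypothetical.

```python
def edit_distance(predicted: list[str], gold: list[str]) -> int:
    """Levenshtein distance with unit costs for insertion, deletion and substitution."""
    previous = list(range(len(gold) + 1))
    for i, p in enumerate(predicted, start=1):
        current = [i]
        for j, g in enumerate(gold, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete p
                    current[j - 1] + 1,  # insert g
                    previous[j - 1] + (p != g),  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def phone_error_rate_sketch(predicted: dict[str, list[str]], gold: dict[str, list[str]]) -> float:
    """Sum of edit distances over all files, divided by the total gold length."""
    total_distance = sum(edit_distance(predicted[file_id], gold[file_id]) for file_id in gold)
    total_length = sum(len(sequence) for sequence in gold.values())
    return total_distance / total_length


gold = {"a": ["k", "a", "t"], "b": ["d", "o", "g", "s"]}
predicted = {"a": ["k", "a", "t"], "b": ["d", "o", "k"]}
per = phone_error_rate_sketch(predicted, gold)  # 2 edits over 7 gold phones
```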
phone_segmentation
phone_segmentation(
predicted_phones_from_units: Phones,
gold_phones: Phones,
*,
margin_in_ms: int = 20,
step_units: int = STEP_UNITS,
step_phones: int = STEP_PHONES,
) -> SegmentationEvaluation
Phone segmentation evaluation.
Parameters:
-
predicted_phones_from_units(Phones) –Predicted phones obtained with phone_assignments
-
gold_phones(Phones) –Gold phone annotations
-
margin_in_ms(int, default:20) –Left and right margin around each gold boundary (in ms). Predicted boundaries that fall in the resulting windows are considered correct. If two windows overlap, they are cut at the midpoint.
-
step_units(int, default:STEP_UNITS) –Step between consecutive units (in ms)
-
step_phones(int, default:STEP_PHONES) –Step between consecutive phones (in ms)
Returns:
-
SegmentationEvaluation–Instance of a dataclass containing the segmentation results in attributes recall, precision, f1, os, and r_val. Use its describe method to get a summary of the segmentation evaluation.
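The boundary scores can be sketched from two lists of boundary times in ms. The sketch makes several assumptions that this page does not confirm: each gold boundary matches at most one predicted boundary, over-segmentation is the ratio of predicted to gold boundaries minus one, and the R-value follows its usual definition. It returns a plain dictionary instead of the `SegmentationEvaluation` dataclass, and the function name is hypothetical.

```python
import math


def segmentation_scores(predicted_ms: list[float], gold_ms: list[float], margin_ms: float = 20) -> dict[str, float]:
    """Boundary precision, recall, F1, over-segmentation and R-value."""
    if not predicted_ms or not gold_ms:
        raise ValueError("both boundary lists must be non-empty")
    gold = sorted(gold_ms)
    used = set()  # indices of predicted boundaries that are already matched
    hits = 0
    for k, g in enumerate(gold):
        left, right = g - margin_ms, g + margin_ms
        # Cut overlapping windows at the midpoint between neighbouring gold boundaries.
        if k > 0:
            left = max(left, (gold[k - 1] + g) / 2)
        if k + 1 < len(gold):
            right = min(right, (g + gold[k + 1]) / 2)
        for idx, p in enumerate(predicted_ms):
            if idx not in used and left <= p <= right:
                used.add(idx)
                hits += 1
                break
    precision = hits / len(predicted_ms)
    recall = hits / len(gold)
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    over_segmentation = len(predicted_ms) / len(gold) - 1
    r1 = math.hypot(1 - recall, over_segmentation)
    r2 = (-over_segmentation + recall - 1) / math.sqrt(2)
    r_val = 1 - (abs(r1) + abs(r2)) / 2
    return {"precision": precision, "recall": recall, "f1": f1, "os": over_segmentation, "r_val": r_val}


# Three of the four gold boundaries have a prediction within 20 ms.
scores = segmentation_scores(predicted_ms=[110, 190, 350, 405], gold_ms=[100, 200, 300, 400])
```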
discophon.abx
ABX discriminability.
We split this part of the evaluation into a separate module because it is optional
and takes more time to compute. If you want to use it, install fastabx either
with pip install discophon[abx] or pip install fastabx.
discrete_abx
discrete_abx(
path_item: str | Path,
path_units: str | Path,
*,
frequency: int,
kind: Literal["triphone", "phoneme"] = "triphone",
) -> TriphoneABX | PhonemeABX
ABX on discrete units.
Parameters:
-
path_item(str | Path) –Path to the ABX item file
-
path_units(str | Path) –Path to the predicted units: JSONL file with keys file (str) and units (list[int]).
-
frequency(int) –Feature frequency in Hz. It is the inverse of the step_units parameter used in other functions.
-
kind(Literal['triphone', 'phoneme'], default:'triphone') –Kind of representations to consider. If phoneme, we also compute the ABX in the "any" context condition, in addition to the "within" context condition.
Returns:
-
TriphoneABX | PhonemeABX–Dictionary of ABX discriminabilities with keys "within_speaker" and "across_speaker" if kind is "triphone", and with keys "within_speaker_within_context", "across_speaker_within_context", "within_speaker_any_context", and "across_speaker_any_context" if kind is "phoneme".
continuous_abx
continuous_abx(
path_item: str | Path,
path_features: str | Path,
*,
frequency: int,
kind: Literal["triphone", "phoneme"] = "triphone",
) -> TriphoneABX | PhonemeABX
ABX on continuous representations.
Parameters:
-
path_item(str | Path) –Path to the ABX item file
-
path_features(str | Path) –Path to the extracted features: folder of .pt files with names corresponding to the file ids.
-
frequency(int) –Feature frequency in Hz. It is the inverse of the step_units parameter used in other functions.
-
kind(Literal['triphone', 'phoneme'], default:'triphone') –Kind of representations to consider. If phoneme, we also compute the ABX in the "any" context condition, in addition to the "within" context condition.
Returns:
-
TriphoneABX | PhonemeABX–Dictionary of ABX discriminabilities with keys "within_speaker" and "across_speaker" if kind is "triphone", and with keys "within_speaker_within_context", "across_speaker_within_context", "within_speaker_any_context", and "across_speaker_any_context" if kind is "phoneme".
discophon.prepare
Download and prepare the DiscoPhon benchmark dataset.
download_benchmark
prepare_commonvoice_datasets
Prepare the Common Voice datasets needed for DiscoPhon by resampling and copying the audio files.
The specific Common Voice data should exist in path_dataset/raw: the audio files are expected to be
in path_dataset/raw/${cv_code}/clips where cv_code is the Common Voice specific language code of language.
Parameters:
discophon.data
Data loading and writing utilities.
STEP_PHONES
module-attribute
Constant step in ms between consecutive phone annotations. Override it in function parameters only if you use new annotations built differently.
STEP_UNITS
module-attribute
Default step in ms between consecutive units. Corresponds to a 50 Hz model. Can be overridden with the step_units parameter of the functions that accept it.
DEFAULT_N_UNITS
module-attribute
Default number of distinct units in the many-to-one evaluation.
Units
Type of the discrete units: dictionary mapping file identifiers to lists of integers.
Phones
Type of the gold or predicted phones: dictionary mapping file identifiers to lists of strings.
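The two descriptions translate to the following shapes. The aliases, file identifiers, phone labels, and the `is_units` checker below are local illustrations written from the descriptions; the library's actual definitions may differ.

```python
# Local aliases mirroring the descriptions above (not imported from the library).
Units = dict[str, list[int]]
Phones = dict[str, list[str]]

# Hypothetical file identifier and labels.
example_units: Units = {"utt-001": [3, 3, 7, 7, 1]}
example_phones: Phones = {"utt-001": ["a", "a", "b", "b", "a"]}


def is_units(obj: object) -> bool:
    """Check that obj maps string identifiers to lists of integers."""
    return isinstance(obj, dict) and all(
        isinstance(key, str) and isinstance(value, list) and all(isinstance(u, int) for u in value)
        for key, value in obj.items()
    )
```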