Perplexity

class ignite.metrics.Perplexity(output_transform=<function Perplexity.<lambda>>, device=device(type='cpu'), ignore_index=-100)

Calculates the Perplexity of a language model.

$$\text{PPL}(W) = \exp \left( -\frac{1}{N} \sum_{i=1}^{N} \log P(w_i | w_1, \ldots, w_{i-1}) \right)$$

where $N$ is the total number of tokens and $P(w_i | w_1, \ldots, w_{i-1})$ is the conditional probability of token $w_i$ given the preceding tokens.

Perplexity is computed as $\exp(\text{NLL})$ where NLL is the mean negative log-likelihood over all tokens. Lower perplexity indicates a better language model.
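As a worked instance of the formula above, the following minimal sketch evaluates it in plain Python. The helper name `perplexity` and the probability values are made up for illustration and are not part of the library.

```python
import math


def perplexity(token_probs):
    """Perplexity from the conditional probability of each token."""
    n = len(token_probs)
    # Mean negative log-likelihood over all N tokens.
    mean_nll = -sum(math.log(p) for p in token_probs) / n
    return math.exp(mean_nll)


# Hypothetical conditional probabilities for a 4-token sequence.
probs = [0.5, 0.25, 0.5, 0.25]
ppl = perplexity(probs)
print(round(ppl, 4))  # 2.8284, i.e. 2 ** 1.5
```

A model that assigns every token probability 1/k has perplexity exactly k, which is why perplexity is often read as an effective branching factor.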

update must receive output of the form (y_pred, y) or {'y_pred': y_pred, 'y': y}.
y_pred must be a floating-point tensor of shape (batch_size, vocab_size, seq_len) containing the unnormalized log-probabilities (logits).
y must be a long tensor of shape (batch_size, seq_len) containing the target token indices.
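The shape convention above, with the vocabulary dimension at index 1, is the same layout that `torch.nn.functional.cross_entropy` accepts. The sketch below uses that function as a stand-in to compute perplexity by hand from tensors of these shapes. Whether the metric calls it internally is an assumption, not something stated here.

```python
import math

import torch
import torch.nn.functional as F

batch_size, vocab_size, seq_len = 2, 5, 3

# All-zero logits correspond to a uniform distribution over the vocabulary.
y_pred = torch.zeros(batch_size, vocab_size, seq_len)
y = torch.randint(0, vocab_size, (batch_size, seq_len))

# cross_entropy expects the class (vocabulary) dimension at index 1.
total_nll = F.cross_entropy(y_pred, y, reduction="sum").item()
total_tokens = y.numel()

ppl_uniform = math.exp(total_nll / total_tokens)
print(ppl_uniform)  # approximately 5.0, the vocabulary size
```

If a model emits logits as (batch_size, seq_len, vocab_size), `y_pred.permute(0, 2, 1)` rearranges them into the expected layout.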

Note

Perplexity uses token-weighted accumulation rather than batch-average to avoid bias towards shorter sequences. The total NLL and total token count are accumulated across all batches, and the final perplexity is computed as exp(total_nll / total_tokens).
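To see why the two accumulation strategies differ, the sketch below compares them on two batches of unequal size. The per-token NLL values are hypothetical and the code is plain Python, not the library's implementation.

```python
import math

# Hypothetical per-token NLL values for two batches of different sizes.
batches = [
    [4.0],            # 1 token
    [1.0, 1.0, 1.0],  # 3 tokens
]

# Token-weighted: accumulate the NLL sum and the token count, divide once.
total_nll = sum(sum(b) for b in batches)
total_tokens = sum(len(b) for b in batches)
token_weighted_ppl = math.exp(total_nll / total_tokens)  # exp(7 / 4)

# Batch-average: average the per-batch mean NLLs, so the single token in
# the short batch weighs as much as the three tokens in the long one.
batch_means = [sum(b) / len(b) for b in batches]
batch_averaged_ppl = math.exp(sum(batch_means) / len(batch_means))  # exp(2.5)

print(token_weighted_ppl, batch_averaged_ppl)
```

The two results agree only when every batch contributes the same number of tokens.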

Parameters:

output_transform (Callable) – a callable that is used to transform the Engine’s process_function’s output into the form expected by the metric. This can be useful if, for example, you have a multi-output model and you want to compute the metric with respect to one of the outputs. By default, metrics require the output as (y_pred, y) or {'y_pred': y_pred, 'y': y}.
device (str | device) – specifies which device updates are accumulated on. Setting the metric’s device to be the same as your update arguments ensures the update method is non-blocking. By default, CPU.
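ignore_index (int) – target index that is ignored and does not contribute to the accumulated NLL or token count. By default, -100.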
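The default of -100 matches the `ignore_index` default of `torch.nn.functional.cross_entropy`. Assuming the metric follows that convention (an assumption based on the parameter name and default), positions whose target equals `ignore_index`, such as padding, would be left out of both the NLL sum and the token count. A minimal sketch of that convention using the PyTorch loss directly:

```python
import math

import torch
import torch.nn.functional as F

ignore_index = -100
vocab_size = 5

# Uniform logits: batch_size=2, vocab_size=5, seq_len=3.
y_pred = torch.zeros(2, vocab_size, 3)
# The last position of the second sequence is treated as padding.
y = torch.tensor([[1, 4, 2], [0, 3, ignore_index]])

# Ignored positions contribute nothing to the summed loss.
total_nll = F.cross_entropy(
    y_pred, y, ignore_index=ignore_index, reduction="sum"
).item()
# Count only the positions that carry a real target.
num_tokens = int((y != ignore_index).sum())

ppl_masked = math.exp(total_nll / num_tokens)
print(num_tokens, ppl_masked)  # 5 tokens, perplexity approximately 5.0
```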

Examples

For more information on how a metric works with Engine, see Attach Engine API.

from ignite.metrics.nlp import Perplexity
import torch

ppl = Perplexity()

# batch_size=2, vocab_size=5, seq_len=3
y_pred = torch.randn(2, 5, 3)
y = torch.randint(0, 5, (2, 3))

ppl.update((y_pred, y))

print(type(ppl.compute()))

<class 'float'>

New in version 0.5.5.

Methods

`compute`	Computes the metric based on its accumulated state.
`reset`	Resets the metric to its initial state.
`update`	Updates the metric's state using the passed batch output.

compute()

Computes the metric based on its accumulated state.

By default, this is called at the end of each epoch.

Returns:

the actual quantity of interest. However, if a Mapping is returned, it will be (shallow) flattened into engine.state.metrics when completed() is called.

Return type:

Any

Raises:

NotComputableError – raised when the metric cannot be computed.

reset()

Resets the metric to its initial state.

By default, this is called at the start of each epoch.

Return type: None

update(output)

Updates the metric’s state using the passed batch output.

By default, this is called once for each batch.

Parameters: output (tuple[torch.Tensor, torch.Tensor]) – this is the output from the engine’s process function.
Return type: None
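The reset, update, and compute methods form a per-epoch lifecycle: reset at the start, update once per batch, compute at the end. The sketch below mirrors that lifecycle with a hypothetical stand-in class, `RunningPerplexity`, which takes per-token NLL values directly. It is not the library's implementation, and `ValueError` stands in for `NotComputableError`.

```python
import math


class RunningPerplexity:
    """Hypothetical stand-in mirroring the reset/update/compute lifecycle."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Called at the start of each epoch.
        self._total_nll = 0.0
        self._total_tokens = 0

    def update(self, token_nlls):
        # Called once per batch with that batch's per-token NLL values.
        self._total_nll += sum(token_nlls)
        self._total_tokens += len(token_nlls)

    def compute(self):
        # Called at the end of each epoch.
        if self._total_tokens == 0:
            raise ValueError("at least one token is required before compute()")
        return math.exp(self._total_nll / self._total_tokens)


metric = RunningPerplexity()
for batch in ([4.0], [1.0, 1.0, 1.0]):
    metric.update(batch)
epoch_ppl = metric.compute()  # exp(7 / 4)
metric.reset()
print(epoch_ppl)
```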
