InContextLearningQAAccuracy#
- class composer.metrics.InContextLearningQAAccuracy(dist_sync_on_step=False)[source]#
Computes accuracy for In-context learning (ICL) question answering (QA) tasks.
ICL QA tasks consist of some number of example question answering tasks (referred to as the "context"), followed by a test task where the model must match one of the possible answer aliases (referred to as the "continuation").
For example, the model may be provided the context below and evaluated on its ability to correctly predict the continuation.
Context: `Question: Who was president of the United States in 2012?\nAnswer: Barack Obama\nQuestion: Is water wet?\nAnswer:` Continuation: [`yes`, `no`]
Both predictions and answers will be normalized before comparison.
- Adds metric state variables:
correct (float): The number of instances where the prediction was a prefix for any of the answer aliases.
total (float): The number of total instances that were predicted.
- Parameters
dist_sync_on_step (bool, optional) – Synchronize metric state across processes at each forward() before returning the value at the step. Default: False.
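The prefix-match logic the metric tracks can be sketched in plain Python. This is an illustrative stand-in, not Composer's actual internals: the `normalize` and `qa_accuracy` names are hypothetical, and it assumes "prefix" means the normalized prediction begins with a normalized answer alias.

```python
import re
import string

def normalize(text):
    # stand-in for the metric's full normalization: lowercase,
    # strip punctuation, drop articles, collapse whitespace
    text = "".join(ch for ch in text.lower() if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def qa_accuracy(predictions, answer_aliases):
    # correct / total, mirroring the metric state described above
    correct = 0
    for pred, aliases in zip(predictions, answer_aliases):
        norm_pred = normalize(pred)
        # count as correct if the prediction starts with any normalized alias
        if any(norm_pred.startswith(normalize(a)) for a in aliases):
            correct += 1
    return correct / len(predictions)

print(qa_accuracy(
    ["Barack Obama.", "It is not wet"],
    [["Barack Obama", "Obama"], ["yes"]],
))  # 0.5
```

The first prediction matches the alias "Barack Obama" after normalization; the second matches no alias, giving an accuracy of 0.5.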
- normalize_answer(answer)[source]#
Lower text and remove punctuation, articles and extra whitespace.
Copied from https://github.com/mandarjoshi90/triviaqa/blob/master/evaluation/triviaqa_evaluation.py
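The linked TriviaQA evaluation script implements this normalization as a chain of four steps; a self-contained sketch of that scheme:

```python
import re
import string

def normalize_answer(answer):
    """Lower text and remove punctuation, articles and extra whitespace."""
    def remove_articles(text):
        return re.sub(r"\b(a|an|the)\b", " ", text)

    def white_space_fix(text):
        return " ".join(text.split())

    def remove_punc(text):
        return "".join(ch for ch in text if ch not in set(string.punctuation))

    def lower(text):
        return text.lower()

    # lowercase first so article removal sees "The"/"A"/"An" uniformly
    return white_space_fix(remove_articles(remove_punc(lower(answer))))

print(normalize_answer("The U.S. President, Barack Obama!"))  # us president barack obama
```

Because both predictions and answer aliases pass through the same normalization, superficial differences in case, punctuation, and articles do not affect the match.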