Whisper Large V2

Speech-to-text model for transcribing audio files.

LG Gram

Technical details

Model type: Audio
Train dataset: Common Voice Corpus 17.0
Test dataset: Common Voice Corpus 17.0

Hardware

Intel® Core™ Ultra 5 125H (GPU)

Size (GB)

Original: 6.17
CLIKA: 1.71 (72% smaller)

Accuracy (word error rate %)

Original: 14.40
CLIKA: 13.80 (0.6 percentage points lower; lower is better)
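
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words, expressed as a percentage. The sketch below illustrates that standard definition only. The function name `word_error_rate` is hypothetical, and the page does not say how the reported figures were computed or how the text was normalized (casing, punctuation), so this should not be read as the evaluation code behind the numbers above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent using word-level Levenshtein distance.

    Assumption: words are split on whitespace with no further
    normalization; real evaluations usually normalize text first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance to an empty hypothesis: i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One substituted word out of six reference words.
example = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
print(f"{example:.2f}")
```

Because the denominator is the reference length, a lower value means fewer errors per reference word, which is why the smaller figure is the better one here.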

Speed (tokens/sec)

Original: 0.77
CLIKA: 3.48 (4.5x faster)
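
The headline ratios follow directly from the figures reported on this page. The short sketch below reproduces that arithmetic. The helper names `percent_smaller` and `speedup` are hypothetical, introduced only for illustration, and the inputs are the values listed above.

```python
def percent_smaller(original: float, compressed: float) -> float:
    """Size reduction relative to the original, in percent."""
    return 100.0 * (1.0 - compressed / original)


def speedup(original_rate: float, new_rate: float) -> float:
    """Throughput ratio of the new model over the original."""
    return new_rate / original_rate


# Figures as reported on this page.
size_reduction = percent_smaller(6.17, 1.71)   # about 72.3 (%)
wer_delta = 14.40 - 13.80                      # about 0.6 (percentage points)
wer_relative = 100.0 * wer_delta / 14.40       # about 4.2 (% relative)
speed_ratio = speedup(0.77, 3.48)              # about 4.5 (x)

print(f"size reduction: {size_reduction:.1f}%")
print(f"WER change: {wer_delta:.1f} points ({wer_relative:.1f}% relative)")
print(f"speedup: {speed_ratio:.1f}x")
```

Note that the accuracy gain is an absolute difference of 0.6 percentage points in word error rate, which corresponds to roughly a 4.2% relative reduction.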