Dataset: farabi-lab/kazakh-stt
A version of OpenAI's Whisper-small model trained on the full dataset for Kazakh (`kk`).

| Parameter | Value |
|---|---|
| Epochs | 3 |
| Batch size | 8 |
| Learning rate | 1e-5 |
| GPU | NVIDIA RTX A5000 |
| Training time | 4 hours 43 minutes |
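
For readers who want to reproduce a similar run, the hyperparameters above could be expressed as Hugging Face `Seq2SeqTrainingArguments`. This is only a sketch under assumptions: the card does not say which trainer was used, `output_dir` is a hypothetical path, and treating the batch size as per-device is a guess. Settings the table does not list (warmup, precision, evaluation schedule) are left at library defaults.

```python
# Hyperparameters copied from the table above.
HPARAMS = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 8,  # assumption: the listed batch size is per device
    "learning_rate": 1e-5,
}


def build_training_args(output_dir="whisper-small-kk"):  # hypothetical output path
    """Build Seq2SeqTrainingArguments from HPARAMS.

    Requires `transformers` and `torch` to be installed; the import is done
    lazily so the hyperparameter dict can be inspected without them.
    """
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HPARAMS)
```
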
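
    from transformers import pipeline

    # Load the checkpoint as a speech-recognition pipeline.
    pipe = pipeline(
        "automatic-speech-recognition",
        model="Musa505/kazakh-tts",
        generate_kwargs={"language": "kk"},  # decode as Kazakh
    )

    # Transcribe a local audio file and print the recognized text.
    result = pipe("audio.wav")
    print(result["text"])
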

    import torch
    from transformers import WhisperForConditionalGeneration, WhisperProcessor

    # Load the model and its processor (feature extractor + tokenizer).
    model = WhisperForConditionalGeneration.from_pretrained("Musa505/kazakh-tts")
    processor = WhisperProcessor.from_pretrained("Musa505/kazakh-tts")
    model.eval().cuda()  # inference mode, on the GPU

    # audio_array: 1-D float waveform sampled at 16 kHz, supplied by the caller.
    inputs = processor(audio_array, sampling_rate=16000, return_tensors="pt")
    inputs = {k: v.cuda() for k, v in inputs.items()}

    with torch.no_grad():
        predicted_ids = model.generate(**inputs, language="kk")

    text = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
    print(text)

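
The manual example above uses `audio_array` without defining it. One possible way to obtain it, using only the Python standard library, is sketched below. The helper name `load_wav_mono_16k` is hypothetical, and the sketch assumes a 16-bit PCM WAV file that is already sampled at 16 kHz; it does not resample, so other rates are rejected rather than silently mis-transcribed.

```python
import array
import sys
import wave

TARGET_RATE = 16_000  # sampling rate passed to the processor in the example above


def load_wav_mono_16k(path):
    """Read a 16-bit PCM WAV file and return mono samples as floats in [-1, 1)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        width = wav.getsampwidth()
        frames = wav.readframes(wav.getnframes())

    if width != 2:
        raise ValueError(f"expected 16-bit PCM, got {8 * width}-bit samples")
    if rate != TARGET_RATE:
        raise ValueError(f"expected {TARGET_RATE} Hz audio, got {rate} Hz")

    pcm = array.array("h")
    pcm.frombytes(frames)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian

    # Average the interleaved channels of each frame down to one mono sample.
    mono = [sum(pcm[i:i + channels]) / channels for i in range(0, len(pcm), channels)]

    # Scale 16-bit integers to floats.
    return [sample / 32768.0 for sample in mono]
```

The returned list can be passed as `audio_array`, for example `audio_array = load_wav_mono_16k("audio.wav")`. Files at other sampling rates would need to be resampled first with an audio library of your choice.
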
| Epoch | Eval Loss |
|---|---|
| 1 | 0.1019 |
| 2 | 0.0817 |
| 3 | 0.0776 |
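
To make the trend in the table easier to read, the relative drop in eval loss from one epoch to the next can be computed directly from the three reported values. The snippet is only illustrative arithmetic on the numbers above; the helper name `relative_drops` is made up for this example.

```python
# Eval loss per epoch, copied from the table above.
eval_loss = {1: 0.1019, 2: 0.0817, 3: 0.0776}


def relative_drops(losses):
    """Map each epoch (from the second on) to its fractional loss decrease."""
    epochs = sorted(losses)
    return {
        cur: (losses[prev] - losses[cur]) / losses[prev]
        for prev, cur in zip(epochs, epochs[1:])
    }


drops = relative_drops(eval_loss)
for epoch, drop in drops.items():
    print(f"epoch {epoch}: eval loss down {drop:.1%} vs. previous epoch")
```

The loss falls by roughly 19.8% between epochs 1 and 2 and by roughly 5.0% between epochs 2 and 3, so most of the improvement comes from the second epoch.
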

License: Apache 2.0