Myanmar CRNN OCR Model

A lightweight CRNN + CTC model for recognising text lines in scanned Myanmar documents. mmpdfkit uses it to convert scanned Myanmar PDFs to Unicode Markdown.

Model Details

| Property | Value |
|---|---|
| Architecture | CRNN (Conv + BiLSTM + CTC) |
| Input | Grayscale line crop, height = 32 px, variable width |
| Output | Myanmar / English Unicode text |
| Vocabulary size | 272 classes (Myanmar U+1000–U+109F, Myanmar Extended-A U+AA60–U+AA7F, English, punctuation) |
| Format | INT8 quantised ONNX (~8.9 MB) |
| Training data | 7.6M synthetic + real scanned Myanmar document line images |
| Best val CER | 4.17% (epoch 50) |

Files

| File | Size | Description |
|---|---|---|
| myanmar-crnn-ocr.onnx | ~8.9 MB | INT8 quantised; production model used by mmpdfkit |

Usage with mmpdfkit

pip

pip install "mmpdfkit[ocr]"
mmpdfkit your_scanned_document.pdf

uv (recommended — faster installs, isolated environment)

uv tool install "mmpdfkit[ocr]"
mmpdfkit your_scanned_document.pdf

The model is downloaded automatically on first OCR run and cached at ~/.cache/mmpdfkit/.

Direct ONNX Usage

import cv2
import numpy as np
import onnxruntime as ort

# Load the model on CPU
session = ort.InferenceSession("myanmar-crnn-ocr.onnx", providers=["CPUExecutionProvider"])

# Read a grayscale line crop as an (H, W) uint8 array
crop = cv2.imread("line.png", cv2.IMREAD_GRAYSCALE)
if crop is None:  # cv2.imread returns None instead of raising on an unreadable path
    raise FileNotFoundError("could not read line.png")

# Scale to height 32, keeping the aspect ratio
h, w = crop.shape
if h != 32:
    crop = cv2.resize(crop, (max(1, round(w * 32 / h)), 32), interpolation=cv2.INTER_LANCZOS4)
# Double the width before inference
crop = cv2.resize(crop, (crop.shape[1] * 2, 32), interpolation=cv2.INTER_LANCZOS4)

# Normalise to [0, 1] and add batch and channel axes
x = (crop.astype(np.float32) / 255.0)[np.newaxis, np.newaxis]  # (1, 1, 32, W)
logits = session.run(["output"], {"input": x})[0]               # (1, W', 272)
indices = logits[0].argmax(axis=1).tolist()                     # best class per time step
# Apply greedy CTC decode (collapse repeats, remove blank index 22)
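
The greedy CTC decode mentioned in the last comment can be sketched as follows. This is a minimal illustration, not mmpdfkit's own decoder: the blank index 22 comes from the comment above, while `greedy_ctc_decode`, `indices_to_text`, and the toy vocabulary are hypothetical names and data introduced here. The model's real index-to-character table is not given in this card, so it has to be supplied by the caller.

```python
from itertools import groupby

BLANK_INDEX = 22  # blank class index, as stated in the usage snippet above


def greedy_ctc_decode(indices, blank=BLANK_INDEX):
    """Collapse consecutive repeated indices, then drop the blank class."""
    collapsed = [idx for idx, _ in groupby(indices)]
    return [idx for idx in collapsed if idx != blank]


def indices_to_text(class_ids, vocab):
    """Map class indices to characters using a caller-supplied vocabulary.

    `vocab` is an assumption: any mapping (dict or list) from class index
    to character. The model's actual table is not documented here.
    """
    return "".join(vocab[i] for i in class_ids)


# Toy example with made-up indices and a made-up vocabulary
toy_indices = [5, 5, 22, 5, 7, 7, 22, 22, 9]
toy_vocab = {5: "a", 7: "b", 9: "c"}

decoded = greedy_ctc_decode(toy_indices)
print(decoded)                              # [5, 5, 7, 9]
print(indices_to_text(decoded, toy_vocab))  # aabc
```

The blank between the two runs of 5 is what lets a genuinely doubled character survive the collapse step, which is why repeats are merged before blanks are removed and not after.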

Preprocessing

  • Binarise page with Otsu threshold
  • Deskew up to ~5° using cv2.minAreaRect
  • Horizontal dilation to merge characters within a line
  • Row projection to segment line bands
  • Each line crop resized to height=32, then width doubled before inference
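
The row-projection step in the list above can be illustrated with a small sketch. It assumes the page has already been binarised and dilated so that ink pixels are non-zero and background is 0 (for example after an inverted threshold). The function name `segment_line_bands` and the `min_ink` and `min_height` thresholds are hypothetical and are not taken from mmpdfkit.

```python
def segment_line_bands(binary, min_ink=1, min_height=2):
    """Return (top, bottom) row ranges, bottom exclusive, one per text line.

    `binary` is a 2D array-like (nested lists or a NumPy array) in which
    ink pixels are non-zero and background pixels are 0.
    """
    bands = []
    start = None  # top row of the band currently being tracked
    for y, row in enumerate(binary):
        # Row projection: count the ink pixels in this row
        has_ink = sum(1 for px in row if px) >= min_ink
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            if y - start >= min_height:  # discard bands too thin to be text
                bands.append((start, y))
            start = None
    # Close a band that runs to the bottom edge of the page
    if start is not None and len(binary) - start >= min_height:
        bands.append((start, len(binary)))
    return bands


# Tiny synthetic "page": two ink bands separated by an empty row
page = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(segment_line_bands(page))  # [(1, 3), (4, 7)]
```

Each returned band could then be sliced out of the grayscale page and passed through the height-32 resize and width doubling described in the last step.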

Training

Trained with CTC loss on 7.6M Myanmar document line images (synthetic renders plus real scans). Optimiser: Adam. Training ran for 50 epochs, and the best checkpoint was selected by validation CER.
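
For readers unfamiliar with the metric, character error rate (CER) is commonly computed as the total edit distance between predictions and references, divided by the total number of reference characters. The sketch below shows that standard definition. It assumes characters are counted as Unicode code points; this card does not say whether the reported CER uses code points or grapheme clusters, and `edit_distance` and `cer` are illustrative names, not part of the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references, hypotheses):
    """Corpus-level CER: total edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars if total_chars else 0.0


# One substitution across six reference characters
refs = ["abcd", "xy"]
hyps = ["abed", "xy"]
print(f"{cer(refs, hyps):.2%}")  # 16.67%
```

Selecting the best checkpoint then amounts to evaluating this value on the validation set after each epoch and keeping the weights with the lowest score.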
