Java-based Tibetan translation and text-processing application focused on
Buddhist terminology workflows. It combines dictionary lookup,
tokenization, transliteration, ranking/model services,
and optional OCR-related utilities.

Deep learning research project comparing lightweight CNN and
CNN+Transformer hybrid architectures for Tibetan character OCR.
Focused on replacing hand-crafted OCR pipelines with learned
convolutional and attention-based feature extraction.
Tech: PyTorch, CNNs, Transformers, ONNX, INT8 quantization,
Gradio, Hugging Face Spaces, mixed-precision training
95.52% validation accuracy on 1,020 Tibetan character classes
Lightweight CNN and CNN+Transformer hybrid comparison
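
The CNN+Transformer hybrid idea can be illustrated with a minimal PyTorch sketch. This is not the project's actual model: the class name HybridOCR, the 32x32 grayscale input size, the channel widths, the number of encoder layers, and the learned positional embedding are all assumptions chosen for illustration. Only the 1,020-class output size comes from the description above. A convolutional stem turns the glyph image into a grid of local features, and a small Transformer encoder then applies self-attention across that grid before classification.

```python
import torch
from torch import nn


class HybridOCR(nn.Module):
    """Hypothetical CNN stem followed by a small Transformer encoder."""

    def __init__(self, num_classes: int = 1020, d_model: int = 64):
        super().__init__()
        # Convolutional stem: (B, 1, 32, 32) glyph -> (B, d_model, 8, 8) feature map.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, d_model, kernel_size=3, padding=1),
            nn.BatchNorm2d(d_model),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        # Learned positional embedding for the 8 * 8 = 64 spatial tokens
        # (assumes the fixed 32x32 input size used in this sketch).
        self.pos = nn.Parameter(torch.zeros(1, 64, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=4, dim_feedforward=128, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.cnn(x)                         # (B, d_model, 8, 8)
        tokens = feats.flatten(2).transpose(1, 2)   # (B, 64, d_model)
        tokens = self.encoder(tokens + self.pos)    # self-attention over the grid
        return self.head(tokens.mean(dim=1))        # (B, num_classes)


# Usage: classify a batch of two random 32x32 grayscale images.
model = HybridOCR().eval()
with torch.no_grad():
    logits = model(torch.randn(2, 1, 32, 32))
```

In a sketch like this, a lightweight CNN baseline for the comparison could be obtained by skipping the encoder and average-pooling the convolutional features directly into the linear head, so both variants share the same stem.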