Scholarly Documentation

Pallava Script Unicode Initiative

Systematic epigraphic evidence corpus supporting the official Unicode encoding of the Pallava script — building on Anshuman Pandey's 2018 proposal (L2/18-083) and contributing to the Unicode Technical Committee's active script work.

Compiled by Sidda Jagadeesh Donthi Siddappa  ·  Independent Researcher
Unicode Status: Pending
Ref: L2/18-083 (Pandey 2018)
Five-Layer Encoding Architecture
Layer 1
Glyph Images
Primary epigraphic evidence. Stone inscription photographs and handwritten script images.
*.jpg / *.png
Layer 2
PUA Encoding
Stable internal codepoints in the Unicode Private Use Area. The range U+E800–U+E8FF is set aside for Pallava characters.
U+E820 = ka
Layer 3
Grantha Unicode
Interoperability proxy using Grantha, the closest encoded descendant script (U+11300–U+1137F).
𑌕 U+11315
Layer 4
IAST Transliteration
IAST romanisation, largely compatible with ISO 15919. Scholarly standard and LLM training backbone.
ka, śrī, namas
Layer 5
Translation
Sanskrit and Telugu semantic translations generated via Claude Vision API.
Sanskrit / Telugu
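The five layers can be pictured as one record per character. The following is a minimal Python sketch, not the project's actual schema: the class name PallavaCharacter, its field names, and the image path are assumptions made for illustration. Only the U+E820 = ka mapping, the U+E800–U+E8FF range, and Grantha U+11315 come from the layer descriptions above.

```python
import unicodedata
from dataclasses import dataclass

# Ranges stated in the layer descriptions.
PUA_START, PUA_END = 0xE800, 0xE8FF            # Layer 2: internal PUA range
GRANTHA_START, GRANTHA_END = 0x11300, 0x1137F  # Layer 3: Grantha block


@dataclass(frozen=True)
class PallavaCharacter:
    """One character across the five layers (hypothetical field names)."""
    glyph_image: str       # Layer 1: path to an inscription image
    pua: int               # Layer 2: internal PUA codepoint
    grantha: int           # Layer 3: Grantha proxy codepoint
    iast: str              # Layer 4: romanisation
    translation: str = ""  # Layer 5: usually empty for a single letter

    def __post_init__(self):
        # Reject codepoints that fall outside the documented ranges.
        if not PUA_START <= self.pua <= PUA_END:
            raise ValueError(f"PUA codepoint out of range: U+{self.pua:04X}")
        if not GRANTHA_START <= self.grantha <= GRANTHA_END:
            raise ValueError(f"Not a Grantha codepoint: U+{self.grantha:04X}")

    @property
    def pua_label(self) -> str:
        return f"U+{self.pua:04X}"

    @property
    def grantha_label(self) -> str:
        return f"U+{self.grantha:04X}"


# The one mapping given in the layer examples; the image path is a placeholder.
ka = PallavaCharacter("glyphs/ka_001.png", 0xE820, 0x11315, "ka")

print(ka.pua_label, unicodedata.category(chr(ka.pua)))      # "Co" = private use
print(ka.grantha_label, unicodedata.name(chr(ka.grantha)))
```

Keeping the codepoints as integers and formatting them only for display makes range checks simple and avoids mixing "U+E820" and "E820" spellings in the data.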
Character Inventory & Evidence
PUA Code · IAST · Glyph · Grantha · Telugu · Devanagari · Evidence · Status · Notes
Unicode Submission Checklist
🔤 Complete character inventory (vowels, consonants, diacritics, conjuncts) · In Progress (~50 of ~100)
📜 Epigraphic evidence per character (minimum 3 attestations each) · Pending (building corpus)
📊 Character frequency analysis across corpus · Pending (needs larger corpus)
🔬 Comparative paleography (Pallava vs Grantha vs Brahmi) · Pending
📚 Scholarly citations compiled (Pandey 2018, Lockwood 2015, others) · Partial (3 references)
✍️ Formal Unicode proposal document drafted · Pending
🌐 Submitted to Unicode Technical Committee (UTC) · Pending
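The attestation and frequency items in the checklist reduce to counting evidence instances per character. The sketch below shows one way to do that, assuming each evidence instance is stored as an (IAST, source) pair. The function names, the pair layout, and the sample data are illustrative placeholders, not real corpus entries.

```python
from collections import Counter

MIN_ATTESTATIONS = 3  # threshold named in the checklist


def attestation_counts(evidence):
    """Count evidence instances per character (keyed by IAST)."""
    return Counter(iast for iast, _source in evidence)


def under_attested(evidence, inventory, minimum=MIN_ATTESTATIONS):
    """Return characters in the inventory with fewer than `minimum` attestations."""
    counts = attestation_counts(evidence)
    return {ch: counts[ch] for ch in inventory if counts[ch] < minimum}


# Placeholder data: source names are invented for the example.
evidence = [
    ("ka", "inscription_A"),
    ("ka", "inscription_B"),
    ("ka", "inscription_C"),
    ("ma", "inscription_A"),
]
inventory = ["ka", "ma", "śa"]

gaps = under_attested(evidence, inventory)
print(gaps)  # characters that still need more evidence
```

The same Counter doubles as a raw frequency table once the corpus is large enough for the frequency analysis item.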
Future-Proof Migration Plan
When Pallava script receives official Unicode approval:

1. The Unicode Consortium will publish official codepoints for each Pallava character.
2. Update the unicode_official field in pallava_pua_chart.json for each character.
3. Run the migration script: py -3 migrate_to_official_unicode.py
4. The script updates all corpus entries, knowledge base chunks, and training data automatically.
5. PUA codepoints remain valid as aliases, so no data is lost.

The five-layer architecture guards against data loss: IAST transliterations and glyph images are encoding-independent and remain valid regardless of which Unicode codepoints are eventually assigned.
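The core of such a migration can be sketched as a codepoint-to-codepoint translation. This is not the contents of migrate_to_official_unicode.py, which is not shown here. It assumes the chart is a JSON list of objects with a hex "pua" field alongside the unicode_official field; the "pua" and "iast" field names, the U+E821 entry, and the target value F0000 are placeholders, and F0000 is not a real Pallava assignment.

```python
import json


def build_migration_table(chart):
    """Map PUA ordinals to official ordinals, skipping unassigned characters."""
    table = {}
    for entry in chart:
        official = entry.get("unicode_official")
        if official:  # None or "" means no official codepoint yet
            table[int(entry["pua"], 16)] = int(official, 16)
    return table


def migrate_text(text, table):
    """Replace every mapped PUA character; leave everything else untouched."""
    return text.translate(table)


# Assumed chart layout; F0000 is a stand-in value, not a real assignment.
chart = json.loads("""
[
  {"pua": "E820", "iast": "ka",  "unicode_official": "F0000"},
  {"pua": "E821", "iast": "kha", "unicode_official": null}
]
""")

table = build_migration_table(chart)
aliases = {official: pua for pua, official in table.items()}  # official -> PUA alias

migrated = migrate_text("\ue820\ue821", table)
print([f"U+{ord(c):04X}" for c in migrated])
```

Because unmapped characters pass through unchanged, the migration can be rerun as further codepoints are filled in, and the alias table keeps the old PUA values recoverable.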
Scholarly References
Proposal to Encode the Pallava Script in Unicode
Anshuman Pandey · 2018 · Document L2/18-083 · unicode.org
The Creation of the Pallava Grantha Tamil Script
Michael Lockwood · 2015 · Primary source — ingested into corpus
SEI Liaison Report — Script Encoding Initiative (Pallava active)
Unicode Technical Committee · January 2025 · Document L2/25-014 · unicode.org
Grantha Unicode Block (U+11300–U+1137F)
Unicode Standard 7.0+ · Used as proxy encoding for Pallava characters · Wikipedia