Rule 1:
Transliterate Tibetan characters in a syllable from left to right and in stacks from top to bottom with the vowel being transliterated after the final consonant of the root letter or stack. Equivalents for characters are in the charts below.
Example: བསྒྲིབས་ becomes bsgribs.
Rule 2:
If there is no explicit vowel mark, the implicit vowel is transliterated as “a” and placed after the final consonant of the root letter or stack.
Example: མཁན་ becomes mkhan.
Rule 3:
Use the period to horizontally display two consonants that would normally be stacked.
Example: གྱོན་ becomes gyon but གཡོན་ becomes g.yon.
Rule 4:
The use of the plus-sign (“+”) is required between consonants in a non-standard Tibetan stack. (View list of Standard Tibetan Stacks.)
Example: སཏྟྭ་ becomes sat+t+wa.
Rule 5:
Use the plus-sign (“+”) between transliteration equivalents for multiple vowel signs above and/or below the same Tibetan stack. In such cases, the vowels should be transliterated from bottom to top even though this may contradict the logical order of the expanded phrase.
Example: བྲེུ་ becomes bru+e, and རྡོེ་, which is short for rdo rje, becomes rdo+e.
Rule 6:
The transliteration of a standard Tibetan stack that uses the plus-sign (“+”) is equivalent to the transliteration that does not.
Example: For རྟ་, the transliterations rta and r+ta are equivalent, though the former is preferable.
Rule 7:
For Tibetan transliterations of multi-syllable Sanskrit words that fall within a single tsheg bar (Tibetan “syllable”), the implicit vowel, “a,” should be inserted after each cluster consonant without an explicit vowel mark except when the virama (Tib. srog med) is subscribed to that cluster. If the word ends in an anusvara (“M”) or a visarga (“H”) the final “a” is inserted before their transliteration.
Example: སརྦ་མངྒལཾ་ becomes sarba mang+galaM.
Rule 8:
All characters can be represented by the escape sequence “\u” plus their 4-digit hexadecimal code for standard Unicode characters. For surrogate pairs, the escape sequence “\U” plus the 8-digit hexadecimal code should be used. In either case, the full 4 or 8 hexadecimal code must be used without dropping leading zeros. The characters in the list of those not found in Unicode 4.0 have been assigned values in the Private Use Area, so that the standard escape sequence, “\uXXXX,” can be used.
Example: ཀ་ can be represented by either “ka” or “\u0F40.”
Rule 9:
To insert a run of non-Tibetan characters within Tibetan transliteration: the whole string, encoded in UTF-8, must be enclosed in brackets. Pairs of opening and closing brackets may be nested with the final closing bracket indicating the resumption of Tibetan transliteration. The escape sequences “\uXXXX” and “\UXXXXXXXX” can be used within brackets to refer to Tibetan or non-Tibetan characters.
Example: ཁོང་New York་ལ་ཕེབས་སོང། becomes khong [New York] la phebs song /
Rule 10:
To insert a single non-Tibetan character, numeral, or punctuation mark within a run of transliterated Tibetan, prefix it with a backslash. (Note: The upper or lowercase “u” cannot be inserted through this method, since “\u” and “\U” trigger the insertion of Unicode characters by their hexadecimal value. Brackets must be used to insert a single letter “u” or “U,” e.g. [u] or [U].)
Example: དེ་ལ་3་ཡོད། becomes de la \3 yod/
Rule 11:
When the a-chen (ཨ) is found at the beginning of a word and lacks a vowel sign, it is transliterated as “a.” Otherwise, it is transliterated according to the vowel sign attached to it. If it is found in the middle of a stack, transliterate it as “+a”; if it is found in the middle of a syllable (tsheg bar), transliterate it as “.a”.
Example: ཨ་ཁུ་ becomes a khu, but ཨུག་པ་ becomes ug pa. Also, ཨཾ་ becomes aM.
Rule 12:
Capitals are used to denote the following Sanskrit-based Tibetan characters: the long vowels – A, I, U, -I; the anusvara – M; the visarga – H; the retroflex letters – T, Th, D, D+h, N, and Sh.
Example: མཱ་ becomes mA (diacritic transliteration is mā). དུཿ་ becomes duH (duḥ). ཕཊ་ becomes phaT (phaṭ).
Rule 13:
Capital R is used to indicate the full-form of ra when it is the top letter of a non-standard Tibetan stack (equivalent to U+0F6A). (View list of Standard Tibetan Stacks.)
Example: ཪྣ་ becomes R+na. ཪྻ་ becomes R+Ya, while ཪྱ་ becomes R+ya.
Rule 14:
The full-formed ra in the standard Tibetan stacks is transliterated as the lower-case “r”.
Example: རྙ་ becomes rnya, རླ་ becomes rla, and རྭ་ becomes rwa.
Rule 15:
Capital W, Y, and R are used to transliterate the full form of wa, ya, and ra respectively, when they are in any position except the top-most.
Rule 16:
In non-standard Tibetan stacks, the lower-case r, y, and w are used to represent the superscribed ra (ra mgo), the subscribed ra (ra btags), the subscribed ya (ya btags), and the subscribed wa (wa zur) respectively. (View list of Standard Tibetan Stacks.)