Jun 12 2006

Tamil Unicode ?! -6

Posted at 1:38 pm under Tamil Unicode

Tamil will be visible in the future. Tamil will be sortable, processable, and everything else in the future too. But getting there will take a few years. MY WHOLE POINT IS: this drawback would not have existed IF Tamil had been encoded CORRECTLY in Unicode.

As I have already pointed out, Tamil letters are given Hindi [or other] names. For example, Aytham is named “TAMIL SIGN VISARGA”, puLLi is named “TAMIL SIGN ANUSVARA”, and there is a 2nd puLLi named “TAMIL SIGN VIRAMA” [The official version: http://www.unicode.org/charts/PDF/U0B80.pdf ]
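
For anyone who wants to check these names without opening the PDF, here is a minimal Python sketch using the standard `unicodedata` module. The three code points are taken from the chart linked above; the helper name `sign_names` is only for illustration.

```python
import unicodedata

# Code points from the Tamil block chart (U+0B80..U+0BFF).
SIGNS = ["\u0b82", "\u0b83", "\u0bcd"]

def sign_names(chars=SIGNS):
    """Return (code point, official Unicode character name) pairs."""
    return [(f"U+{ord(c):04X}", unicodedata.name(c)) for c in chars]

for code, name in sign_names():
    print(code, name)
```

The names printed come from the Unicode character database that ships with Python, so they should match the chart.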

Tamil mey letters [k, ng, ch, N, th, etc.] are not there. Each one is separated into “ka” and “puLLi”, which is grammatically wrong in Tamil. It also creates problems in sorting and other lexical processing. Moreover, even the Tamil letters in Unicode are NOT in order. The order goes like this:
ka, nga, cha, ja, nja, da, Na, tha, -na, na, pa, ma, ya, ra, Ra, la, La, zha, va, sa, sha, Sa, ha

The positions of zha, La, Ra, na and ja are not in the correct order.
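
To make the difference concrete, here is a small Python sketch that sorts a few consonants once by raw code point and once by a hand-written ranking table. The table is an assumption: it uses one common convention (the 18 Tamil consonants in their traditional order, then the Grantha letters), and the names `TRADITIONAL`, `by_code_point` and `by_tradition` are made up for this example.

```python
# Traditional order: ka nga cha nja da Na tha -na pa ma ya ra la va zha La Ra na,
# followed by the Grantha letters ja sha Sa ha (assumed placement).
TRADITIONAL = (
    "\u0b95\u0b99\u0b9a\u0b9e\u0b9f\u0ba3\u0ba4\u0ba8\u0baa"
    "\u0bae\u0baf\u0bb0\u0bb2\u0bb5\u0bb4\u0bb3\u0bb1\u0ba9"
    "\u0b9c\u0bb7\u0bb8\u0bb9"
)
RANK = {ch: i for i, ch in enumerate(TRADITIONAL)}

def by_code_point(letters):
    """Sort by Unicode code point, which is what a plain sort does."""
    return sorted(letters)

def by_tradition(letters):
    """Sort by the ranking table; unknown characters go last."""
    return sorted(letters, key=lambda ch: RANK.get(ch, len(RANK)))

# na, zha, Ra, La, va, ja
sample = ["\u0ba9", "\u0bb4", "\u0bb1", "\u0bb3", "\u0bb5", "\u0b9c"]
print(by_code_point(sample))
print(by_tradition(sample))
```

The two lists come out in different orders, which is exactly the extra step a program has to carry around.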

Also, Aytham comes before the vowels, which is also wrong. Tamil digits are positioned after the letters and vowel modifiers. On top of these, some Tamil letters can be represented in 2 ways, like kee, koo, etc., and kau. This specifically causes problems in sorting. Which representation should you choose? Now you have the headache of accommodating 2 different representations in your programming code.
Yes, you can write programming code to put the characters in order, treat ka+puLLi as one unit every time, and make a rule that everybody should follow this pattern. You can also write code to put Aytham after the vowels and to sort Tamil numbers before Tamil vowels. These are not impossible, but they divide people. Different Tamil vendors will argue that there is more advantage in the other way and not this way. This again splits the Tamil standpoint, as with TAM & TAB & TSCII. This time it's not about the FONT, it's about the ALGORITHM. Should I put the “k” in front of “ka” or not when ordering Tamil letters? Should I place Aytham after the last vowel or in front? Some will choose one way, and others will choose the other way. That is a sample of the split in the Tamil diaspora.
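
Here is a minimal Python sketch of the "2 ways" problem and one possible way around it. It assumes the two-way letters are the ones whose vowel sign can be stored either as one code point or as two parts (the chart names these signs O, OO and AU), and it uses standard NFC normalization from `unicodedata` as the rule everybody follows. The helper `same_text` is a made-up name.

```python
import unicodedata

KA = "\u0b95"
# The same visible letter stored two ways:
ONE_PART = KA + "\u0bca"        # ka + vowel sign O (single code point)
TWO_PART = KA + "\u0bc6\u0bbe"  # ka + vowel sign E + vowel sign AA

def same_text(a, b):
    """Compare two strings after bringing both to NFC form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

print(ONE_PART == TWO_PART)            # raw comparison of code points
print(same_text(ONE_PART, TWO_PART))   # comparison after normalization
```

This only settles the two-representation question. Where to put Aytham, or whether “k” goes before “ka”, still has to be decided by whoever writes the sorting rule.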

If you know programming, you will understand this:
“The simple computer programs library routines that are possible in English such as counting the number of characters in a sentence or reversing a string become ridiculously complex in Tamil Unicode.”
Manivannan from tamil-ulagam in yahoogroups.
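
A minimal Python sketch of what that quote means in practice. It counts and reverses a Tamil word by letters rather than by code points. The grouping rule here is a simplification (every combining mark is attached to the letter before it), not a full text-segmentation algorithm, and the function names are invented for this example.

```python
import unicodedata

def letters(text):
    """Group code points so that each mark stays with its base letter."""
    groups = []
    for ch in text:
        # Categories Mn/Mc cover the Tamil vowel signs and the puLLi.
        if groups and unicodedata.category(ch).startswith("M"):
            groups[-1] += ch
        else:
            groups.append(ch)
    return groups

def reverse_letters(text):
    """Reverse by letters, so marks are not torn away from their base."""
    return "".join(reversed(letters(text)))

word = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"  # the word "Tamil": tha, mi, zh
print(len(word))             # number of code points
print(len(letters(word)))    # number of letters as a reader counts them
print(reverse_letters(word))
```

In English, `len(word)` and `word[::-1]` would already be the whole program.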

To add to that: substring finding, database storage, sorting, storing and retrieving data, and displaying it all take more time [remember, 100 ms is too long for a computer].

Yes, computer systems will become faster & more efficient. Hence in the future it may take less time than it does now. But DON’T FORGET, Hindi’s time will also shorten. So let's say, for argument's sake, that Tamil needs 100 ms now [store, retrieve, and display] & 10 ms 20 years later, while Hindi needs 10 ms now and 1 ms 20 years later. So there will ALWAYS be a lag for Tamil, no matter what, BECAUSE THE FOUNDATION IS INCORRECT.

AGAIN, I’m NOT saying these things are impossible [please, I don’t know how to say this any more plainly]. They can be corrected in programming. But why this headache for Tamil?

The ISCII encoding was not designed for the best representation of Tamil. There are fundamental flaws for Tamil in Unicode, which is the same as ISCII.

The reason for all these fundamental flaws for the Tamil language, and hence THE PERMANENT LAG in computer processing of Tamil compared to Hindi, is that the Indian government gave a wrong representation of the Tamil language.

Please don’t try to say that there are no names like “anusvara”, “visarga”, and “virama” in Unicode for Tamil, as I have already proved twice.

