
Everything is Unicode, until the exploits started rolling in

14 2021-01-23 02:36

>>10
You can only sort of do this; English doesn't have standardized pronunciation, and you still have to deal with ambiguity in the language itself.

>>11
Half of the abomination is the lack of inflection. Why not Russian, Latin, or Ancient Greek? (and this is why we can't have a standard...)

>>12
ASCII and Big5 are both similar to Shift-JIS for the point I was trying to make with regard to CJK. If you were referring to encoding language as words in nearly isolating languages: interestingly, Chinese characters aren't each words. Compounds are so common that most words are composed of a couple of characters, so in a 16-bit encoding the efficiency is often fairly close to that of English in ASCII. Verisimilitude in http://verisimilitudes.net/2018-06-06 proposes that English be encoded using a 24-bit encoding, allowing 16,777,216 words to be stored in the dictionary and out-competing 8-bit ASCII for word lengths larger than two (this is not counting the advantage with regard to encoding word separators and punctuation in the higher-level encoding).
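
To put rough numbers on that comparison, here is a minimal sketch, not the scheme from the linked page. It assumes every word maps to one fixed 24-bit code with separators implied by the encoding, and that the ASCII version spends one byte per letter plus one space between words. The function names and the sample sentence are made up for illustration.

```python
# 24 bits give 2**24 distinct codes, i.e. the 16,777,216 dictionary slots.
DICTIONARY_SIZE = 2 ** 24

def ascii_bytes(words):
    # One byte per letter, plus one space byte between adjacent words.
    return sum(len(w) for w in words) + max(len(words) - 1, 0)

def word_code_bytes(words, bits_per_word=24):
    # One fixed-width code per word; separators are assumed to be implicit.
    return len(words) * bits_per_word // 8

sample = "the quick brown fox jumps over the lazy dog".split()
print(DICTIONARY_SIZE)          # 16777216
print(ascii_bytes(sample))      # 43
print(word_code_bytes(sample))  # 27
```

Under these assumptions a three-letter word only ties with its 24-bit code until you count the space after it, which is where the word-level encoding starts to pull ahead.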
