Subj : Re: What is a text character in a computer?
To   : comp.programming
From : Thomas G. Marshall
Date : Tue Oct 04 2005 12:53 am

Roger Willcocks coughed up:
> "Thomas G. Marshall" wrote in
> message news:Sja0f.15031$J03.13398@trndny05...
>> Roger Willcocks coughed up:
>>> "Thomas G. Marshall" wrote in
>>> message news:ip10f.14973$J03.1423@trndny05...
>>>> Roger Willcocks coughed up:
>>>
>>>>> Much confusion comes from the (mostly American) assumption that
>>>>> the mapping from binary number to character name is essentially
>>>>> fixed,
>>>>
>>>> No, "essentially fixed" is true. AFAICT, most of the computing
>>>> universe still does seem to revolve around 7 or 8 bit ASCII.
>>>>
>>>
>>> Case proven, I believe. ASCII = _American_ standard code for
>>> information interchange.
>>
>> Ok.
>>
>> Ironically, I helped develop a PostScript interpreter early on when
>> there were hardly any others than Adobe, and I'm also a Java
>> engineer. Both lend themselves to far broader notions than ASCII.
>>
>> IMO I think you're probably very correct regarding such bias and
>> misunderstanding; mine as well. I'll not lay as much gasoline as I
>> certainly could for the likely ensuing flamewar regarding /why/ that
>> exists except to say that much of that bias is for very good
>> historical reason. Like it or not, America always drove the
>> computing landscape, and this causes ire among engineers of all the
>> other countries. Arrogance? Sure, I suppose.
>
> I agree that there are good historical reasons for the bias within
> computing, but given that the ASCII character set doesn't include
> anything even mildly exotic, it's simply not suitable for typesetting
> (in any language). So while the lack of Unicode support in new
> applications, such as PHP, could be perceived as arrogance, it's more
> likely to be due to a lack of understanding of the whole character
> encoding issue.

Or a lack of caring, which I think is more likely. Or at least, a lack
of short-term caring.

One of the things about thinking only about American markets is that it
isn't that bad a place to start. From there you can wander to whatever
level of character flexibility you want, but it just might be that one
octet (one byte) per character is good enough. In the same way, nearly
everything is produced for English first and then made table-driven to
support other languages. Already established software vendors excepted.

-- 
http://www.allexperts.com is a nifty way to get an answer to just
about /anything/.