Java’s Lack of Unsigned Types Considered Harmful
I have been aware of Java’s lack of unsigned types for a long time, but it was never really an issue for me — until today, that is. Enter arrays of bytes, which Java considers signed and only signed and nothing but signed. The only way around it is a massively ugly hack that goes something like this:
buf[i] & 0x000000FF
This promotes the byte to an int, then masks off the sign-extended bits, giving you a signed int that is big enough to hold every value an unsigned byte can.
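To see why the mask is needed, here is a minimal sketch. The class and helper names are mine, not from any library; the point is just that promoting a byte like 0xFF to int sign-extends it to -1, and the mask recovers 255:

```java
// Demonstrates the sign-extension problem with Java's byte
// and the masking workaround described above.
public class UnsignedByteDemo {
    // Hypothetical helper: interpret a byte as an unsigned value in [0, 255].
    static int toUnsigned(byte b) {
        // Promotion to int sign-extends, so (byte) 0xFF becomes -1;
        // the mask clears the high 24 bits, leaving 255.
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte[] buf = { (byte) 0x00, (byte) 0x7F, (byte) 0x80, (byte) 0xFF };
        for (byte b : buf) {
            System.out.println(b + " -> " + toUnsigned(b));
        }
        // prints:
        // 0 -> 0
        // 127 -> 127
        // -128 -> 128
        // -1 -> 255
    }
}
```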
I found an excellent piece delving into the reasons for this nonsense; it ends on this sad but all-too-true remark:
In the sidebar it says: “unsigned isn’t implemented yet; it might never be.” How right you were.
September 1st, 2006 at 12:59:24 pm
Back in the early days of C (“classic” C, or K&R C), that language had double precision floating point (double) but no single precision (float). The “char” type was permitted to be implemented as either signed or unsigned, depending on what was most convenient for the underlying processor. For example, the Digital Equipment Corp. PDP-11 series (which hosted many of the early UNIX systems) would load a byte from memory into a register and do sign extension — so “char” on that machine was most easily implemented as signed. There was no similar instruction that would not sign-extend. To do “unsigned char” would have required the compiler writer to generate an extra instruction to zero out the high-order bits. Most of the time programmers preferred to keep as much speed as possible and handle the sign extension themselves, usually by algorithm design. There were other machines with compilers that implemented “char” as unsigned. To write portable code you had to make sure it didn’t matter which flavor of machine you were running on.
Today, in C, you still don’t know whether “char” is signed or not. But you have “unsigned char” and “signed char”, which you can use if it matters to you.