Tuesday, October 17, 2006

Java and unsigned numbers

Recently I had a requirement for some binary data read/write work. I started out with some Java code, using the NIO ByteBuffer to read and write data.

Something was not right. Whenever I read some sample data, the values were right at times and wrong (negative) at other times.

Then a senior consultant colleague reminded me that Java's integer types (byte, short, int and long) are all signed and stored in 2's complement. In short, the MSB (most significant bit) indicates whether the number is positive or negative. For further details on 2's complement, please read http://en.wikipedia.org/wiki/Two's_complement .

That sent me on a trail to figure out how I could read/write an unsigned number using Java.

The basic idea is to store the value in the next larger signed type. The larger type has enough room to hold the full unsigned range of the smaller type as positive numbers.

For example, a byte is 8 bits. Its range in Java is -128 to +127. (Please read up on 2's complement to understand how we get the negative number.)

0111 1111 = 127
1000 0000 = -128 (2's complement)
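
The two bit patterns above can be printed from Java itself. This is only a small sketch; the class name TwosComplementDemo and the helper bits() are made up for illustration and are not part of any library.

```java
class TwosComplementDemo {
    // Hypothetical helper: returns the 8 bits of b as text, most significant bit first.
    static String bits(byte b) {
        StringBuilder sb = new StringBuilder(8);
        for (int i = 7; i >= 0; i--) {
            sb.append((b >> i) & 1); // pick out bit i as 0 or 1
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte max = 127;
        byte min = (byte) (max + 1); // one past the maximum wraps around to the minimum
        System.out.println(bits(max) + " = " + max);
        System.out.println(bits(min) + " = " + min);
    }
}
```

Adding 1 to 127 sets the MSB, and as a signed byte that pattern reads as -128.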

So in order to convert this byte to an unsigned byte, we will widen the byte to a short.

A short in Java is 16 bits

0000 0000 0111 1111 = 127
0000 0000 1000 0000 = 128

Here is some code to test this:

public class Test {
    public static void main(String[] args) {
        byte b = -128;
        // b is sign-extended to an int before the AND; the 0x00FF mask keeps only its low 8 bits.
        short s = (short) (0x00FF & b);
        System.out.println("Byte b = " + b);
        System.out.println("Short s = " + s);
    }
}

The result is:
D:\>java Test
Byte b = -128
Short s = 128

Please note that I have not cast directly to a short; instead I have done a bitwise AND with 0x00FF, whose low 8 bits are all 1's (F in hex is 1111 in binary). Java first widens the byte to an int by copying the sign bit into the upper bits, and the mask then clears those upper bits, so only the original 8 bits of the byte survive. If I had just cast it to a short, Java would have treated it as a negative number, i.e. it would still be -128.
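
To make the difference between the plain cast and the mask concrete, here is a rough sketch that applies the same trick to byte, short and int. The class UnsignedWidening and its method names are my own invention for this example.

```java
class UnsignedWidening {
    // Each helper widens to a larger signed type and masks off the sign-extended bits.
    static short unsignedByte(byte b) {
        return (short) (b & 0xFF);
    }

    static int unsignedShort(short s) {
        return s & 0xFFFF;
    }

    static long unsignedInt(int i) {
        // The L suffix matters: without it the mask is the int -1 and clears nothing.
        return i & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        byte b = -128;
        System.out.println("Plain cast  = " + (short) b);         // sign is kept
        System.out.println("Masked      = " + unsignedByte(b));   // sign bits cleared
        System.out.println("Short -1    = " + unsignedShort((short) -1));
        System.out.println("Int -1      = " + unsignedInt(-1));
    }
}
```

The plain cast still gives -128, while the masked versions give 128, 65535 and 4294967295. On Java 8 and later the standard library offers Byte.toUnsignedInt, Short.toUnsignedInt and Integer.toUnsignedLong for the same purpose.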

Here is a link that clearly explains how to use binary operations to achieve this for all number types - http://www.darksleep.com/player/JavaAndUnsignedTypes.html and also Java's integer sizes - http://www.scism.sbu.ac.uk/jfl/Appa/appa1.html

Why does Java have no unsigned numbers?

Sean R. Owens, at http://www.darksleep.com/player/JavaAndUnsignedTypes.html, digs up old emails and interviews about Java and Oak in which James Gosling explains this:

http://www.gotw.ca/publications/c_family_interview.htm

Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up.

To sum up, Java has no unsigned numbers, and one has to keep this in mind when dealing with binary data that might contain unsigned values.
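
Coming back to the ByteBuffer I started with, reading and writing unsigned values could look roughly like the sketch below. The helper names (getUnsignedByte, putUnsignedInt and so on) are hypothetical and do not exist on ByteBuffer itself; only get, put, getShort, putShort, getInt, putInt, allocate and flip are real ByteBuffer methods.

```java
import java.nio.ByteBuffer;

class UnsignedBufferDemo {
    // Readers: fetch the signed value, then mask it into a wider signed type.
    static short getUnsignedByte(ByteBuffer buf) {
        return (short) (buf.get() & 0xFF);
    }

    static int getUnsignedShort(ByteBuffer buf) {
        return buf.getShort() & 0xFFFF;
    }

    static long getUnsignedInt(ByteBuffer buf) {
        return buf.getInt() & 0xFFFFFFFFL;
    }

    // Writers: take the value in a wider type and store only its low bits.
    static void putUnsignedByte(ByteBuffer buf, int value) {
        buf.put((byte) (value & 0xFF));
    }

    static void putUnsignedShort(ByteBuffer buf, int value) {
        buf.putShort((short) (value & 0xFFFF));
    }

    static void putUnsignedInt(ByteBuffer buf, long value) {
        buf.putInt((int) (value & 0xFFFFFFFFL));
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(7); // 1 + 2 + 4 bytes
        putUnsignedByte(buf, 200);
        putUnsignedShort(buf, 60000);
        putUnsignedInt(buf, 4000000000L);
        buf.flip(); // switch the buffer from writing to reading

        System.out.println(getUnsignedByte(buf));
        System.out.println(getUnsignedShort(buf));
        System.out.println(getUnsignedInt(buf));
    }
}
```

All three values are too large for their signed types, yet they come back unchanged because the bits in the buffer are the same either way; only the interpretation on reading differs.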

1 Comment:

Blogger Olo said...

There's some community pressure on Sun to implement unsigned types, but they completely ignore it, although it hurts code clarity and performance in lots of cases (especially cryptography and graphics processing).

See the bug database:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4504839

3:21 PM  
