Monday 19 January 2015

Fiddle and Bits and Bits and Bits

My last post didn't fully answer the question: "What are the possible values a variable can be?" I will complete the answer here.

I will use 8-bit bytes for the entirety of this article.

Least Significant Bit


The least significant bit is the right most bit. It is so named because it represents the smallest possible number in the binary system: zero or one. It is the least significant value of the eight.

Don't let that fool you. It can be quite useful. More on that at the end of the post.

Byte of Bits


Lets look at an empty, zeroed byte: 00000000 (eight zeros).

The least significant bit is the right-most bit. So let's switch that bit on: 00000001.

We now have the number one. Move the bit up one slot and the value becomes two: 00000010.

How did that happen? Maths!

Math


Each bit in the byte is 2's compliment (see my previous post).

Each bit's value is 2 raised to a power. Starting at zero, incrementing by one for each step, move right to left and raise 2 to that power. The least significant bit is 2^0. The second least significant bit is 2^1, then 2^2 and so on. Sum the bits turned on for the value the byte represents.

Check it out:

Binary
Equation
Decimal
00
(0 x 2^1) + (0 x 2^0) = 0 + 0
0
01
(0 x 2^1) + (1 x 2^0) = 0 + 1
1
10
(1 x 2^1) + (0 x 2^0) = 2 + 0
2
11
(1 x 2^1) + (1 x 2^0) = 2 + 1
3

The value of each bit, from left to right, looks like this: 128, 64, 32, 16, 8, 4, 2, 1.

If you sum all these values, you get the value: 255. That's the maximum value of an unsigned byte!

Signedness


Ok, time to makes things weird. Well, not finding out your life-partner is your cousin weird but weird enough.

On the opposite side of the byte from the least significant bit is the most significant bit.

In many languages a type may be signed or unsigned. An unsigned byte can only be positive, whole numbers. A signed byte can be positive or negative whole numbers.

Most Significant Bit


A computer doesn't have a notion of a negation sign. It can represent a negative number, sure, but all it sees are bits. Using two’s compliment, how can a single bit be a sign when it must be a number?

The most significant bit does determine whether the value is negative or not. However, the eighth bit always has a value of 128.

Confused, yet? Good!

More Maths


When a computer interprets a signed byte the most significant bit becomes a negative value not a negative sign. Rather than being the value 128, it becomes -128.

Consider this: can you have a value of negative zero? Can zero ever be negative or is zero just zero?

If zero can’t be negative and the most significant bit merely represented the sign of the value, then what does 1000000 mean? This would be -0 and that is not a number.

To make the value -1, we need to add a value to -128 to get there. If we remove the eighth bit from the byte, the remaining seven bits consist of the values: 64, 32, 16, 8, 4, 2 and 1. Sum them all together and you get the value 127. -128 + 127 = -1.

Negative one is 11111111. There's a trick with this at the bottom of this post.

Let’s do another example of a negative number. Lets shoot for -54. What number do we need to add to -128 create this number in binary?

We just need to solve this equation: x = y + z; where x = -54 (the number we want), y = -128 (most significant bit) and z is the value we need.

-54 = -128 + z
-54 + 128 = -128 + z + 128
74 = z

Set the balance of the bits to 74 and the actual value will be -54: 11001010

Signed Values


We can now fully answer the question, “What values can a given C type have?”

A single byte has 256 different values. A C type that consists of a single byte is the char type.

An unsigned char can consist of the values 0 to 255.
A signed char can consist of the values -128 to 127.

Not a Dummy


Don’t feel bad if you have to read things over a few times for it to sink in. It took me a bit to really wrap my head around this but once I did, things really fell into place.

You can cheat, too. If an 8 bit byte has 256 values then divide it by two. That value negated is the lower bound. Reduce that same value by one for the upper bound.

256 / 2 = 128
Lower bound = -128
Upper Bound = 128 - 1 = 127

Fun Fact


Binary is how numerical permissions work for the ‘chmod’ command in Linux.

Bits
Value
Permission
000
0
None - Denied!
001
1
Execute
010
2
Write
100
4
Read

Combining the bits together we get the following table.

011
3
Write (2) + Execute (1) = 3
101
5
Read (4) + Execute (1) = 5
110
6
Read (4) + Write (2) = 6
111
7
Read (4) + Write (2) + Execute (1) = 7

This is the beauty of binary and two’s compliment. You can combine any number of unique options together without overlap.

Cool Tricks


Now that you’re a pro at understanding binary, I thought I might show you a pair of nifty tricks before I leave you.

1. Want to test for whether a number is odd or even?

if (n & 1 == 0) { /* even */ } else { /* odd */ }

2. Want to test whether a signed type is negative or positive?

const char MSB_BYTE = 1 << 8;
if (n & MSB_BYTE == MSB_BYTE) { /*negative */ } else { /* positive */ }


3.  Want to set an unsigned variable to it's maximum value but don't know what it might be on a particular machine? Many languages will let you do this little trick.

unsigned char c = -1

It sets all the bits on, which is always the maximum value. Go back to the "More Maths" section to see why this works.

There are no guarantees these tricks will work everywhere so read the specification for your language and check the compiler implementation to know before using them!

If you have never done bit shifting or used logical operators than this tricks probably make no sense. Once you learn about them come back here and think about these tricks and see if you can figure out why they work.

Friday 16 January 2015

Type Sizes and Max Values

Today, I want to take a step back and talk about something a bit more elementary.

If you've just started learning to program, understanding the value ranges of types can be hard. I hope that I can maybe impart some knowledge to you that will help you get a leg up.

In computing, we use a system called binary to count. Binary is a sequence of ones and zeros that are used to represent numbers.

We'll begin with...

Bits


I like to think of a bit like a gate in a yard; a gate that can only be in one of two states: open or closed.

An open bit is represented by the number 1. When open it allows an electrical current to flow through it. A closed bit, which does not allow electricity to flow through it is represented by the number zero.

This is the essence of binary. A grouping of numbers which have only two states.

Note: When a gate is open, it is in the “high state” (as it would be called in electrical engineering), allowing maximum electron flow through the gate. When a gate is closed, it is in the "low state".

Two’s Complement


As I just said, if something consists of a single bit, it has two states. 1 (one) or 0 (zero). So, if you add a second bit into the mix you have four possible states. Two states for the first bit and two states for the second.

All possible combinations of these two bits looks like this:

00 - bit one and bit two are both closed
01 - bit one is closed and bit two is open
10 - bit one is open and bit two is closed
11 - bit one and bit two are both open

If you add a third, you get 8 possible combinations. They are: 000, 001, 010, 100, 011, 110, 101, and 111.

Each bit you add has exponential growth for the maximum possible combinations. For each bit you have, you raise two to that power. One bit is two to the power of one. Two bits is two to the power of two. Three bits is two to the power of three and so on.

This is two’s complement.

Bytes


A byte is a specific number of bits grouped together.

Different computer architectures can have a different number of bits per byte. The Intel 4004 CPU, for example, had a 4 bit word size. The ASCII character set is based on a 7 bit byte. An 8 bit byte is most common and the remainder of this article will be based on this byte size.

We must raise two to the power of eight to find out the maximum value a single byte can represent. Two to the power of eight is two hundred and fifty-six.

Want proof? Enter the number two into any calculator. Multiply it by two seven more times. 2x2x2x2x2x2x2x2 = 256.

Looking at the progression, one step at a time: 2, 4, 8, 16, 32, 64, 128, 256.

A byte, then, can hold 256 different values but a single byte can’t actually hold the value 256. 

So, why not?

Zero is a Value Too!


Zero, is one of the 256 values that a byte can be. Consider a single bit. What are it's values? Zero or one.

When all the bits in a byte are closed it can only be one thing. Zero.

If zero is one of the 256 values a byte can be it means that the maximum value that can be held in a single byte is 255. 

Multiple Bytes


Bytes can be paired up, too, using two’s compliment. Two, four and eight are common multi-byte pairings.

Each time, we raise the power up further. A single byte is 2^8. Two bytes is 2^16. Four bytes is 2^32,. Eight bytes is 2^64.

Value Ranges of C Types


Lets look at a C type of each typical byte size.

  • char = 1 byte
  • short = 2 bytes
  • long (int) = 4 bytes
  • long long = 8 bytes

So, based on the above pairings, can you deduce what the value ranges are for each of these types?

Well, we already know how high a byte goes since it has 256 values including the zero value, so it has a range of 0 to 255.

  • A short is 2 bytes (16 bits). 2^16 is 65536 for a value range of 0 to 65535.
  • A long or int is typically 4 bytes (32 bits). 2^32 is 4,294,967,296 for a value range of 0 to 4,294,967,295.
  • A long long is typically 8 bytes (64 bits). 2^64 is 18,446,744,073,709,551,616 for a value range of 0 to 18,446,744,073,709,551,615.

All you need to know is how many bytes a type consists of and you can work out the value yourself!

Stay tuned for more...