Tuesday, November 20, 2012

Instruction Sets



A computer. Without an instruction set. Is nothing.

Possibly because even the techies don’t know much about them, and you know the old saw: better to keep your mouth shut, and all that.

What’s an instruction set? Well, it’s the language your processor speaks, and its ‘words’ are strings of binary digits, commonly referred to as machine code. At its most basic level, this code switches paths on and off to control where pulses are directed inside the processor. Back in the good old days, memory was expensive and clock cycles were valuable. The only way to program a computer was explicitly in machine code; there were no compilers or high-level languages. Even operating systems were rare, because they sucked up valuable resources.

Adding A to B and printing the result might look something like this…

10000001             Data in following addr to register A
00001010             Addr for A
10000010             Data in following addr to register B
00001011             Addr for B
00100001             ADD A to B; Result into C
11011000             Print Register C
11000000             Halt

The text to the right is simply comments; the machine, of course, only understands the binary code. While this is actually pretty compact, writing anything more than a simple program this way becomes a frustrating and error-prone process. Imagine debugging a program like this without any comments. And programs could easily exceed 100,000 lines.
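To make the listing above a little more concrete, here is a minimal sketch of a simulator for it in Python. The opcode meanings come straight from the comments in the listing. Everything else is an assumption made for illustration: the 16-byte memory, the program starting at address 0, the 8-bit wraparound on the add, and the values 5 and 7 sitting at the two data addresses. It is not a model of any real machine.

```python
# Toy simulator for the seven-byte program in the listing.
# Opcode meanings follow the listing's comments; the memory size,
# start address, 8-bit wraparound and data values are assumptions.

LOAD_A = 0b10000001   # data at the following address goes to register A
LOAD_B = 0b10000010   # data at the following address goes to register B
ADD = 0b00100001      # add A to B, result into C
PRINT_C = 0b11011000  # print register C
HALT = 0b11000000     # stop


def run(memory):
    """Execute from address 0 and return the list of printed values."""
    reg = {"A": 0, "B": 0, "C": 0}
    pc = 0            # program counter
    printed = []
    while True:
        opcode = memory[pc]
        if opcode == LOAD_A:
            reg["A"] = memory[memory[pc + 1]]
            pc += 2   # skip the opcode and its address byte
        elif opcode == LOAD_B:
            reg["B"] = memory[memory[pc + 1]]
            pc += 2
        elif opcode == ADD:
            reg["C"] = (reg["A"] + reg["B"]) & 0xFF
            pc += 1
        elif opcode == PRINT_C:
            printed.append(reg["C"])
            pc += 1
        elif opcode == HALT:
            return printed
        else:
            raise ValueError(f"unknown opcode {opcode:08b} at address {pc}")


program = [
    0b10000001, 0b00001010,
    0b10000010, 0b00001011,
    0b00100001,
    0b11011000,
    0b11000000,
]
memory = program + [0] * (16 - len(program))
memory[0b00001010] = 5   # assumed value stored at the address for A
memory[0b00001011] = 7   # assumed value stored at the address for B

print(run(memory))  # [12]
```

Flip a single bit in any of those bytes and the sketch either raises an error or does something quite different, which is a fair taste of what debugging raw machine code is like.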

If only there were a way to make the computer understand a more human-like language and automatically convert... or... compile... its own program. And so came Fortran (formula translation) for scientific work and COBOL (Common Business-Oriented Language) for business uses like invoicing and payroll. And, of course, a myriad of other languages. For example, BASIC. The same instructions to the computer would look like this:

A=5
B=7
C=A+B
Print C

Now that’s programming! The downside is that the computer first has to examine all of our code and interpret it according to its rules. This means it might not always interpret what we want correctly. It also means our program can end up larger than intended, padded with non-value-added code, and because it’s larger and has more instructions, it will tend to run slower. It sees characters that aren’t in quotation marks and so it assumes these are variables. It sees the plus sign and so it assumes an addition will be performed. It looks at whatever follows the word Print to determine if the output will be a variable or literal text. And then it writes its own binary code (just like we did above) to perform this. But no bets on whether the compiler will end up with the same code as we did.
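The rule-following described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real BASIC works: the function name `classify`, the tuple labels, and the restriction to single-letter variables are all made up for the example, and the addition is written with the result on the left of the equals sign, as BASIC expects.

```python
import re


def classify(line):
    """Return a tuple describing what one line of the toy language asks for."""
    line = line.strip()

    # Print followed by quoted text is literal output; anything else is a variable.
    match = re.fullmatch(r"print\s+(.+)", line, flags=re.IGNORECASE)
    if match:
        argument = match.group(1)
        if len(argument) >= 2 and argument[0] == '"' and argument[-1] == '"':
            return ("print_text", argument[1:-1])
        return ("print_variable", argument)

    # A single letter, an equals sign, then digits: store a constant.
    match = re.fullmatch(r"([A-Za-z])\s*=\s*(\d+)", line)
    if match:
        return ("store_constant", match.group(1), int(match.group(2)))

    # A plus sign between two letters: an addition, result named left of '='.
    match = re.fullmatch(r"([A-Za-z])\s*=\s*([A-Za-z])\s*\+\s*([A-Za-z])", line)
    if match:
        return ("add", match.group(1), match.group(2), match.group(3))

    raise SyntaxError(f"cannot interpret: {line!r}")


for source_line in ["A=5", "B=7", "C=A+B", "Print C", 'Print "C"']:
    print(source_line, "->", classify(source_line))
```

Note how `Print C` and `Print "C"` come out differently: the first is taken as a variable, the second as literal text. Anything the rules don't cover is rejected outright, which is the "might not interpret what we want" problem in miniature.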

Granted, there’s not much open to interpretation in something like A+B. But imagine a complex algorithm that uses multiple constants. The programmer would know which storage locations were no longer needed once one-time-use variables had served their purpose. But the compiler wouldn’t. It might just keep assigning new locations as variables are called. It may end up using more memory, and more cycles accessing it, than is necessary, but today we don’t care. We have gigs of RAM and GHz processors. In fact, the whole GUI environment is a valid parallel. What would take one line typed in DOS to launch a program now requires a complete graphical environment with a pointing device, video mapping, a graphical file hierarchy, etc. But again we don’t care; we have the horsepower to compensate.
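As a rough sketch of the storage point, here are the two strategies side by side in Python: one hands every new value a fresh location, the other frees a location as soon as its value has been read for the last time. The five-step calculation, the names `t1` to `t5`, and both function names are invented for the example; this says nothing about what any particular compiler actually does.

```python
def naive_slots(steps):
    """Give every new name its own storage location."""
    return len({target for target, _ in steps})


def reused_slots(steps):
    """Free a location once its value has been read for the last time."""
    # Record the last step at which each name is read.
    last_use = {}
    for index, (_, inputs) in enumerate(steps):
        for name in inputs:
            last_use[name] = index

    free = []        # locations available for reuse
    slot_of = {}     # name -> location currently holding it
    highest = 0      # number of distinct locations handed out so far
    for index, (target, inputs) in enumerate(steps):
        # Release inputs that are never read again after this step.
        for name in inputs:
            if last_use[name] == index and name in slot_of:
                free.append(slot_of.pop(name))
        if free:
            slot_of[target] = free.pop()
        else:
            slot_of[target] = highest
            highest += 1
    return highest


# Hypothetical calculation: each entry is (result, values it reads).
steps = [
    ("t1", []),            # a constant
    ("t2", []),            # another constant
    ("t3", ["t1", "t2"]),  # t3 = t1 + t2; t1 and t2 are done after this
    ("t4", []),            # a third constant
    ("t5", ["t3", "t4"]),  # t5 = t3 * t4
]

print(naive_slots(steps))   # 5
print(reused_slots(steps))  # 2
```

Five locations versus two for the same arithmetic. On a machine with a handful of bytes to spare that difference mattered; with gigs of RAM it disappears into the noise.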

Imagine if we couldn’t make B=7. If all our variables had to go through A first.
First Set A=7. Then Move A to B. Then Set A=5. Now Add.
Imagine the bottleneck this would create if we had a list of numbers to assign!
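To put a number on that bottleneck, here is a small Python sketch that writes out the steps both ways for a list of variables. The `SET` and `MOVE` step names and the two function names are made up for illustration and don't belong to any real instruction set.

```python
def direct(values):
    """One step per variable when any variable can be set directly."""
    return [f"SET {name}={value}" for name, value in values.items()]


def through_a(values):
    """Every value is first set in A, then moved to where it belongs."""
    steps = []
    for name, value in values.items():
        if name == "A":
            continue  # A itself is set last so it isn't overwritten
        steps.append(f"SET A={value}")
        steps.append(f"MOVE A TO {name}")
    if "A" in values:
        steps.append(f"SET A={values['A']}")
    return steps


small = {"A": 5, "B": 7}
print(through_a(small))  # ['SET A=7', 'MOVE A TO B', 'SET A=5']

larger = {"A": 5, "B": 7, "C": 9, "D": 11}
print(len(direct(larger)), len(through_a(larger)))  # 4 7
```

For the two-variable case this reproduces the three steps listed above. Every extra variable costs two steps instead of one, so the gap keeps widening as the list grows.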

Nobody says an instruction set is going to be easy to work with. But a good instruction set translates into faster throughput. A faster machine. More flexible programming. Anyone who went through the Mac vs. PC wars of the 1990s will remember how the PC with the faster Intel chip was getting smoked by the Mac’s PowerPC. I won’t go into those details here, but it’s an interesting story. And as computers evolved and compilers became popular, instruction sets took on a decided bent towards accommodating those languages.

So while guys are beating their chests and boasting about their latest processor speeds, hard drive capacities and RAM figures, you’d do well to ask them about the efficiency of their instruction set.

[Incidentally, the binary code in the example above is real. It’s part of the instruction set of my computer.]

