A brief introduction to Assembly

Assembly, who needs that anyway?

Speed has always been a top priority among programmers. Who can build the fastest program that does this and that?! Assembly was once THE programming language to use when it came to speed. Carefully hand-optimized assembly code could often beat compiler-generated code from a higher-level programming language like C. Today, compilers are more clever than ever before. Computers are also faster, so shorter development times usually matter far more than a small gain in execution speed.

'What exactly is assembly?', you might ask. It's a very machine-dependent, low-level programming language. To illustrate the difference between a single line in C and assembly, I'll take a variable increase as an example. In C, you would write 'i++' and it would increase the value of your variable i. Translated step by step for an x86, the 'i++' becomes three lines of assembly (the x86 can also do it in one, 'inc dword [i]', but the long form shows what really happens):

mov eax, [i]
add eax, 1
mov [i], eax

What it does is copy the value at the memory position i into the register EAX. Then it adds 1 to EAX and copies the increased value back into the memory position. Many of the assembly instructions are three characters long. Some common instructions are:

Instruction      Short description
mov dst,src      Copies the value of src into dst
add dst,src      dst = dst + src
sub dst,src      dst = dst - src
mul src          EDX:EAX = EAX * src (unsigned, takes one operand only)
div src          EAX = EDX:EAX / src, remainder in EDX (unsigned, one operand only)
call address     Saves the return address on the stack and jumps to a memory address; the called code uses 'ret' to return
jmp address      As above, but without the return capability
jz address       Conditional jump. Jumps only if the Zero-flag has been set
cmp a,b          Compares a and b. Sets the Zero-flag if they are equal
push a           Saves 'a' on the stack (pile of data, LIFO - Last in, first out)
pop dst          Copies the top value on the stack into 'dst' and moves the stack pointer
int a            Calls an interrupt (small program at BIOS or OS-level)
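
To make the 'Last in, first out' part of push and pop a bit more concrete, here is a minimal sketch in C (not assembly, so it can be compiled anywhere). The names Stack, stack_push and stack_pop and the size of 16 slots are made up for this illustration. It also ignores the fact that the real x86 stack grows downwards in memory; it only shows the order in which values come back.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a small stack; names and size are assumptions
   made for this illustration, not part of any real API. */
#define STACK_SLOTS 16

typedef struct {
    uint32_t slots[STACK_SLOTS];
    size_t   top;                 /* number of values currently stored */
} Stack;

/* Models 'push a': puts the value on top of the pile.
   Returns 0 on success, -1 if the stack is full. */
int stack_push(Stack *s, uint32_t value)
{
    if (s->top == STACK_SLOTS)
        return -1;
    s->slots[s->top++] = value;
    return 0;
}

/* Models 'pop dst': copies the top value into *dst and removes it.
   Returns 0 on success, -1 if the stack is empty. */
int stack_pop(Stack *s, uint32_t *dst)
{
    if (s->top == 0)
        return -1;
    *dst = s->slots[--s->top];
    return 0;
}
```

If you push 1 and then 2, the first pop gives you 2 and the second gives you 1: the value saved last comes back first.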


Binary numbers and such

The decimal number system is based on 10. A number can be written as a sum of digits, each multiplied by a power of ten: 'A*10^0 + B*10^1 + C*10^2 + ...'. Example: 123 can be written '3*10^0 + 2*10^1 + 1*10^2'. The base is 10. In binary, the base is 2, so 123 would be written '1*2^0 + 1*2^1 + 0*2^2 + 1*2^3 + 1*2^4 + 1*2^5 + 1*2^6'. A little longer, but it still represents the same number. A commonly used system amongst programmers is hexadecimal, which uses a base of 16. 'What comes after 9?', you ask. After 9 comes A, after that comes B, and so on, until F (F = 15). The number 16 is written as 10 in hex. 123 would be represented as 'B*16^0 + 7*16^1'.
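
The sums above all follow the same recipe, whatever the base is. As a rough sketch in C (the function name digits_to_value is my own invention), the recipe could look like this. The digits are given with the least valuable one first, so digits[0] belongs to base^0.

```c
#include <stddef.h>

/* Hypothetical helper: adds up digit * base^position.
   digits[0] is the digit for base^0, digits[1] for base^1, and so on.
   No overflow checking; meant for small examples only. */
unsigned long digits_to_value(const unsigned *digits, size_t count, unsigned base)
{
    unsigned long value  = 0;
    unsigned long weight = 1;     /* base^0 */

    for (size_t i = 0; i < count; i++) {
        value  += digits[i] * weight;
        weight *= base;           /* next power of the base */
    }
    return value;
}
```

Feeding it the digits {3, 2, 1} with base 10, {1, 1, 0, 1, 1, 1, 1} with base 2, or {11, 7} with base 16 gives 123 every time.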

So, when I write 10, no one actually knows if I mean hex, decimal or binary... Normally I would mean decimal, but to be certain, an additional character is set to the right of the number. If I write 10h, then I mean hex. The same applies to binary: 1111011b = 7Bh = 123. An additional style of writing hex can also be used, especially when dealing with memory addresses. Put '0x' in front of the number: 7Bh = 0x7B. An additional '0' is sometimes added if the 'h'-style is used: 7Bh = 07Bh. The extra '0' is needed when the number would otherwise start with a letter (0FFh, not FFh), so that the assembler doesn't take it for a name.
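
If you want to play with these notations, here is a small sketch in C that reads a number written in any of the styles above. The function parse_number is hypothetical and has no real error handling; it leans on the standard strtol function, which takes the base as its third argument.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses "123", "7Bh", "1111011b" or "0x7B".
   Assumes valid input shorter than 64 characters; returns 0 otherwise. */
long parse_number(const char *text)
{
    size_t len = strlen(text);
    char digits[64];

    if (len == 0 || len >= sizeof digits)
        return 0;

    /* '0x' prefix: strtol with base 16 accepts the prefix directly. */
    if (len > 2 && text[0] == '0' && (text[1] == 'x' || text[1] == 'X'))
        return strtol(text, NULL, 16);

    /* 'h' or 'b' suffix: strip it and parse the rest in base 16 or 2. */
    char last = text[len - 1];
    if (last == 'h' || last == 'H' || last == 'b' || last == 'B') {
        memcpy(digits, text, len - 1);
        digits[len - 1] = '\0';
        return strtol(digits, NULL, (last == 'h' || last == 'H') ? 16 : 2);
    }

    /* No marker at all: plain decimal. */
    return strtol(text, NULL, 10);
}
```

The '0x' check has to come first, because a number like 0x7B ends in 'B' and would otherwise be mistaken for binary.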


Bytes, words and double words

A Byte is a collection of 8 bits. The bits are referred to as bits 0-7. Bit 0 is called the LSB, the Least Significant Bit, and is the least 'valuable'. Bit 7 is the MSB, the Most Significant Bit.
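
To see what 'valuable' means here, a short sketch in C can pick out a single bit and tell you what it is worth. The names get_bit and bit_weight are assumptions of mine, and both expect n to be between 0 and 7.

```c
#include <stdint.h>

/* Returns bit n of a byte as 0 or 1 (n = 0 is the LSB, n = 7 the MSB). */
unsigned get_bit(uint8_t byte, unsigned n)
{
    return (byte >> n) & 1u;
}

/* Returns the value a set bit n contributes: 1 for the LSB, 128 for the MSB. */
unsigned bit_weight(unsigned n)
{
    return 1u << n;
}
```

For the byte 7Bh (1111011b), get_bit gives 1 for bit 0, 0 for bit 2 and 0 for bit 7.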

A Word is two bytes, that is, 16 bits.

A Double Word is, as the name indicates, two words, that is, four bytes or 32 bits.
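
As a rough illustration of how the sizes fit together, this C sketch glues two bytes into a word and two words into a double word. It assumes the fixed-width types from stdint.h (uint8_t, uint16_t, uint32_t) as stand-ins for byte, word and double word; make_word and make_dword are made-up names.

```c
#include <stdint.h>

/* Builds a 16-bit word from a high and a low byte. */
uint16_t make_word(uint8_t high, uint8_t low)
{
    return (uint16_t)((high << 8) | low);
}

/* Builds a 32-bit double word from a high and a low word. */
uint32_t make_dword(uint16_t high, uint16_t low)
{
    return ((uint32_t)high << 16) | low;
}
```

The bytes 12h and 34h make the word 1234h, and the words 1234h and 5678h make the double word 12345678h.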

To specify a variable in NASM, which I use, you can choose either a byte, a word, a double word (can be a float), a 64-bit float or an 80-bit float (80 bits is the extended-precision format used by the x87 floating-point unit). Look at this example:

my_byte          db 0
my_word          dw 0
my_double_word   dd 0
my_double_float  dq 0.0
my_ten_byte      dt 0.0

To use a variable in NASM, just use the mov instruction with the variable name in square brackets, as described in the example at the top of the page.

That was about it... Hope it helped you understand my tutorials a little better =). If you want a more detailed assembly tutorial, read the Art of Assembly. It has pretty much everything you'll need to know to get started. Maybe even better explained than here...

Move on to tutorial 1