Some words about my source code style

TAD/Hugi

Introduction

Here are some thoughts about the 'style' of my own 80x86 source code. No doubt there are a million other ways to name macros, define symbolic items, write comments and document your source code, but this is my own rather chaotic method. While learning to code I found very little help concerning the topic of symbolic names, so this explains the code chaos below.

This is NOT a set of rules I'm attempting to impose on every coder (as many books like to do.. damn those paper dictators!!), it's just my random ASCII madness that I like to call 'code'.

Two word method

I construct most symbolic names using two words joined together to create a 6 to 8 letter symbol. The reason for these (some would say) short symbol names comes from the fact that I learnt to code using some old assemblers which had very limited memory and/or a limited number of significant characters for symbol names. So for a procedure to draw a sprite I would use the name "DrawSpr" or "drawspr".

Here are some common abbreviations:

 scr    - screen 
 spr    - sprite 
 key    - keyboard 
 snd    - sound 
 rat    - mouse 
 dot    - point/pixel 
 cnt    - count 
 tick   - ticker/countdown value 
 rate   - reload value for a ticker 
 pnt    - a pointer 
 map    - background map/array 
 stk    - stack 
 top    - start of something 
 end    - end of something 
 test   - test for something 
 get    - fetch some variables 
 put    - place some variables 
 wrt    - write to memory/screen 
 prt    - print 
 calc   - calculate (usually screen co-ordinates or an item address) 
 find   - search for something 
 zap    - clear memory 
 clr    - same as zap 
 kill   - delete 
 wipe   - erase something (e.g. sprites) 
 move   - move something 
 sort   - sort items 
 pro    - process/update items 
 go     - start something (e.g. a sound or animation) 
 stop   - stop something 

Here are some regularly used examples of two-word procedure names.

 waitkey        - wait for a key press 
 testkey        - test for a key press 
 clrscr         - clear entire screen 
 calcpixl       - calculate screen pixel from (x,y) 
 plotpixl       - plot a pixel 
 drawline       - Hmmm.. guess.. 
 copyscr        - block copy a screen to somewhere else 
 flippage       - flip the screen/video page 
 waitfly        - same as vsync (force of habit..) 
 prosnd         - process sound fxs 
 gosnd          - begin a sound 
 loadmod        - load a sound module 
 playmod        - play it 
 stopmod        - stop it.  
 loadfile       - loads a file from disk 
 savefile       - saves a file to disk 
 sortpoly       - sort polygons 
 scanedge       - scan convert a polygon edge 

And of course the classic inits:

 initmem        - initialise memory 
 termmem        - terminate/cleanup memory 
 initsnd        - init sound... 
 termsnd        - etc.. 

Casing and Capitals

Some people like to use purely lowercase for all their source code. This is usually because they are writing for pmode where instructions, operands and labels seem to mysteriously get very long by themselves. The advantage is that your typing speed is increased because you never need to hit the [SHIFT] or [CAPSLOCK] keys (apart from the special shift characters of course). The disadvantage is that it can be difficult to break a long label symbol back into its component parts.

Lots of other coders prefer to use mixed lower and uppercase characters. So instead of 'drawspr' they would use 'DrawSpr' or perhaps 'DrawSprite'. It does make it far easier for other people to read your source code (especially if you are working on a team project or intend to publish it later on).

Underscore

Another popular way to visually break up neighbouring words is to use the underscore character ( '_' , ASCII 95 or 5F hex). This allows purely lowercase symbolic names to be used while keeping the source highly readable. For example, 'alloc_dma_buffer'.

There is another useful advantage of using the underscore. It helps you to search and replace items at a later date. If you have used a common scheme throughout your source then you could quickly search for all the dma related stuff just by typing '_dma_', easy eh?

I tend to only use the underscore for equates, offsets and structures and sometimes not at all (I like to break my own rules from time to time.. heheh).

Local labels

Thankfully the bad old days of using labels like 'DS16LP2' are gone. This is due to the widespread use of local labels. I would guess many coders would simply stick to the normal @@1, @@2, @@3 ... @@nn labelling system for their local labels. But I personally find it better to use a one or two word local label such as these:

 @@find:        - marks the start of a search loop 
 @@next:        - jump to the next item in the list 
 @@prev:        - .... the previous item ... 
 @@skip:        - ignore an item  
 @@zero:        - zero memory pointer/undefined 
 @@clip:        - completely clipped, reject it entirely 
 @@undef:       - undefined item 
 @@done:        - main work done, jump to cleanup code 
 @@quit:        - quit immediately from procedure 
 @@exit:        - exit from procedure 

Using plain numbers is great for short procedures, but for long ones it can get very confusing trying to remember what @@19 does. Is it the main loop or a conditional step-over label??
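
As a rough sketch of how named local labels read in practice, here is a small made-up routine. The 'cntitems' name, the zero terminated byte list and the 0FFh 'unused item' marker are all invented for this example, and it assumes TASM with the LOCALS directive switched on so that the '@@' labels stay local to the procedure.

        LOCALS                          ; enable @@ local labels (TASM)

;* CX=Count used items in list[DS:SI] *
cntitems PROC
        cld                             ; make lodsb step forwards
        xor     cx, cx                  ; count = 0
@@next:
        lodsb                           ; al = next item, si = si + 1
        or      al, al
        jz      @@done                  ; zero marks the end of the list
        cmp     al, 0FFh
        je      @@next                  ; unused item (assumed marker), ignore it
        inc     cx                      ; one more used item
        jmp     @@next
@@done:
        ret
cntitems ENDP

Even without the comments, '@@next' and '@@done' say more about the flow than '@@1' and '@@2' would.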

Numbers and expressions

I have a (some would say, bad) habit of using brackets far too often. Even when the precedence is obvious I still like to break the expression into smaller parts. This not only makes it easier for me to read, but it ensures that the assembler calculates things in the correct order. E.g.:

        add     di, 10+(30*320) 

For a standard 320x200 mode 13 hex screen the above would move the DI register right 10 pixels and down 30 pixels. Please notice that I keep the delta (x,y) values in the same order as the co-ordinates.

        add     di, x+(y*320) 

This means less mental translation, so hopefully fewer bugs and typos too.
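
The same idea works with a named screen width. In this little sketch the 'SCR_WIDTH' equate is a made-up name, not something from my real sources. The assembler folds the whole bracketed expression into one constant at assembly time, so the brackets cost nothing.

SCR_WIDTH            equ     320                ; bytes per line in mode 13h (assumed name)

        xor     di, di                          ; di = top left of the screen
        add     di, 10+(30*SCR_WIDTH)           ; right 10, down 30 (offset 9610)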

I write hexadecimal numbers in uppercase, with a leading '0' and a trailing lowercase 'h' symbol. This keeps any confusion with the ah, bh, ch and dh registers to a minimum because most registers are ALWAYS lowercase. Remember 0ah, 0bh, 0ch and 0dh are all valid hex numbers, so be careful with those leading zeros.
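
A tiny illustration of why the leading zero and the casing matter (the values are picked only for the example):

        mov     al, 0Ah                 ; al = 10 decimal, a hex constant
        mov     al, ah                  ; al = contents of the ah register
        mov     bl, 0FFh                ; without the leading zero, 'FFh' would be read as a symbol name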

Segment registers

Hopefully these are those strange, dusty registers which no-one uses anymore (cause you are all using flat mode, right?). But for those of us who like to spend a few hours swearing at segment registers....

I like to use uppercase for the CS, DS, ES, FS, GS and SS segment registers. This not only makes them stand out from the cx, di, dx, si and sp registers but makes it easier to see segment overrides, e.g.:

        mov     ax, DS 
        mov     ES, ax 
        ... etc ... 
        mov     bx, FS:[si] 
        mov     CS:[counter], al 

Equates

Hmmmm.. never found a nice method for equates. Some people use all lowercase, some use mixed case, some use all uppercase and some use underscores.

 MaxPolygonCount      equ     10000 
 max_polygon_count    equ     10000 
 maxpolygoncount      equ     10000 
 MAXPOLYGONCOUNT      equ     10000 

In my experience the symbolic names themselves are often enough to distinguish between memory variables and equates. Usually symbols with 'size', 'max', 'min' or 'length' in them are equates.

Bits and Flags

This is an old naming method borrowed from the old 680x0 Devpac assembler. You add '_B' at the end of a symbol to form a bit equate or '_F' to form a flag bit mask (which is 1 shifted left by the bit position).

VGA_SR1_VR_B         equ     3  
VGA_SR1_VR_F         equ     1 SHL 3 

ACTIVE_B             equ     7 
ACTIVE_F             equ     1 SHL ACTIVE_B 

Another popular way is to use either binary or hexadecimal to define equates.

BIT0_F               equ     00000001b 
BIT1_F               equ     00000010b 
BIT2_F               equ     00000100b 
BIT3_F               equ     00001000b 
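
Here is a short sketch of how a '_B' and '_F' pair might be used, taking the ACTIVE_B and ACTIVE_F equates from above. The '@@skip' label and the choice of the al register are just for illustration, and the 'bt' instruction needs a 386 or better.

        or      al, ACTIVE_F                    ; set the flag
        test    al, ACTIVE_F                    ; is it set?
        jz      @@skip                          ; no, step over
        and     al, (NOT ACTIVE_F) AND 0FFh     ; clear it again (mask kept to 8 bits)
@@skip:
        bt      ax, ACTIVE_B                    ; 386+: copy bit 7 of ax into CF

The '_F' form suits the or/and/test style instructions, while the '_B' form suits instructions which take a bit number.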

Padding numbers

When defining equates or using hexadecimal numbers I find that padding the number up to the correct number of digits is a nice, extra touch. It helps to quickly see the size of the constant value and relate the source code to the actual disassembled code which you'll see when debugging your code.

BYTE_CONSTANT        equ     02h 
WORD_CONSTANT        equ     0002h 
DWORD_CONSTANT       equ     00000002h 
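
As a small illustration (assuming a 16-bit code segment), the comments below show the machine code bytes a debugger would typically list next to each instruction. The padded constant has the same width as the immediate value in the opcode bytes, which are stored low byte first.

        mov     al, 02h                 ; B0 02
        mov     ax, 0002h               ; B8 02 00
        mov     eax, 00000002h          ; 66 B8 02 00 00 00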

Structures

Most of my structures use a 2 or 3 letter lowercase id and mixed case member names. I try to keep structure names down to a minimum because they will probably be used many, many times.

snd STRUC 
  Flags      db      ? 
  ToneEnv    env     <> 
  NoiseEnv   env     <> 
  VolumeEnv  env     <> 
snd ENDS 

Out of habit I define the usual SIZEOF, NUMOF and SIZE equates too (the old Devpac assembler didn't have the now standard SIZE and LENGTH functions).

sndSIZEOF    equ     size snd             ; size of 1 structure 

sndNUMOF     equ     100                  ; maximum of 100 snd structures 
sndSIZE      equ     sndSIZEOF*sndNUMOF   ; total bytes needed 
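
A rough sketch of how these three equates might then be used. The 'sndtable' name and the use of bx as a structure number are assumptions made up for this example.

sndtable     db      sndSIZE dup (?)      ; room for all sndNUMOF structures

        ; point si at structure number bx (0 to sndNUMOF-1)
        mov     ax, sndSIZEOF
        mul     bx                        ; dx:ax = bx * sndSIZEOF
        mov     si, offset sndtable
        add     si, ax                    ; si = address of the wanted snd structure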

Stick 'em together

Here is a recent habit which seems to make code a little easier to read (well, the junk that I like to call 'code' heheh). The idea is simple: define any bit/mask fields or equates right next to the structure element itself.

voice STRUC 
  Flags      db      ? 
             ACTIVE_F        equ     10000000b 
             LOOPED_F        equ     01000000b 
             LFO_F           equ     00001111b 
  
  Volume     db      ? 
  
  Period     dw      ? 
             MIN_PERIOD      equ     0071h 
             MAX_PERIOD      equ     0258h 
  
  Panning    dw      ? 
voice ENDS 

So looking at the above it is obvious that the 'ACTIVE_F' equate is associated with the 'Flags' element of the 'voice' structure.
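
To show how this reads at the point of use, here is a hedged sketch which assumes si already points at one 'voice' structure and uses the MASM style '[si].Member' syntax. The clamping logic and the '@@skip' label are invented for the example.

        test    [si].Flags, ACTIVE_F            ; is this voice active?
        jz      @@skip                          ; no, ignore it
        cmp     [si].Period, MIN_PERIOD
        jae     @@skip                          ; period is not below the minimum
        mov     [si].Period, MIN_PERIOD         ; clamp it to the minimum
@@skip:

Each equate appears beside the element it was defined next to, so a quick look back at the structure tells you what values are legal.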

Inputs and Outputs

For procedures you often need to add some comments about the input and output parameters. Most of the time 1 or 2 lines will do; you don't need a huge page full of ASCII characters for every procedure.

;* CF=Load file[DS:EDX] length(ECX) into[DS:EDI] * 

The above single line of comment describes all the input and output parameters for a 'LoadFile' procedure. Any output parameters are written before an '=' equals sign and input parameters are enclosed in (..) brackets for registers, or [...] square brackets for memory based variables/pointers or filenames.

;* Draw box at(BX,DI) width(DX) height(CX) * 

Nice and easy, eh?

Closing words

Oh well, I hope this hasn't come across as an 'I command thee...' kind of article. Source code style is a personal thing. Don't believe all the crap in those lousy books. My philosophy is to keep everything to a minimum. Don't get bogged down under a mountain of documentation, comments or line indentation, otherwise you may spend all your time updating the comments rather than coding. Do you know any professional programmers who actually use those f***ing stoopid flow chart stencils?? Learning to program is difficult enough without having to waste your time with pointless details like what colour the diamond box should be drawn in!!

So don't blindly follow everyone else, find your own format, your own style that works for YOU and stick with it. Whether you choose long labels/short labels, upper/lower case or underscores is up to YOU.

I can still remember working with someone who chose enormously long labels while working on the old Amiga. This person soon ran into problems with assembling his code, it was horrendously slow and often failed due to lack of memory. The moral of the story?

Having long descriptive labels and pages of comments is great...
...but a program you can actually compile is even better!!

Happy assembling.

TAD/Hugi