Crafting a Vintage CRPG: Part 9

Part 9: Sound and Music

Wow, it's been over a year since I started the site... I'll admit, it's had the effect I wanted. In particular, to keep pushing myself to finish the project rather than circle endlessly in a design-only cycle.

In this installment, we'll look at how sound effects are created and played in TMS9900 assembly language. Music is also a part of this, but a music engine is not part of the core game.

And yeah, I changed the scheme to white text on a dark background. Easier to read, harder to read? Visit my development blog and let me know!

Sound Chip Design

It's interesting to note that sound is one of the last things to get done with most computer projects. The main reason for this is that it is used for special effects and dramatic presentation. This automatically makes it a lower priority. Some games may make a puzzle or game out of sounds (such as Myst did), but this is a rare occurrence. I think that most people are tone-deaf, so sound-based puzzles are more frustrating than fun.

Sound on the TI is generated from the built-in sound chip. It was created by TI in the mid-to-late 1970s. It has three separate voice channels, each capable of a square-wave tone with a frequency range of 110 Hz to 55,938 Hz. It also has a noise generator capable of periodic (buzzing) or white (static) noise. Each generator is capable of attenuation (volume) of 0-28 decibels.

A curious element of the TI sound chip is the relatively high minimum frequency of 110 Hz. This is a low A on a piano keyboard, but it's still an octave or two above the "bass" octaves. As a result, most TI music has to be pushed up an octave or more, making it sound "higher" than on other computer systems. The reason? The TI engineers had to sync the sound chip to the 3.58 MHz clock signal. Since this is at least twice what most of the other 8-bit systems of the time were running at, the sound chip is actually being forced to play higher tones than it's capable of.

The TI sound chip is not an even competitor to the SID chip that was featured in the Commodore 64. (A statement that I'm sure will get me a lot of angry protests from my fellow 99'ers. Sorry guys, it's the truth.) The TI sound chip IS capable of a lot of the techniques the SID can do, but to do it, the developer must replicate them in software. This includes elements such as different waveforms, decaying sound, and so forth.

Keep in mind, though, that the SID was developed several years later, and it was created by engineers who were also audiophiles. So they designed a chip specifically to produce better sound and music than had been heard on any other home computer at the time. So, full points to them, I'd say. Also, I'd say the TI still ranks high on sound quality, compared to the Color Computer (100% CPU-driven and rather soft), the Apple II (hair-raising screeches and buzzes that made your teeth grind), and the Atari 8-bit series. (Hm, well, the POKEY chip WAS pretty decent...)

Sound Generation

Unlike some other vintage systems, there are a few different ways to produce sound on the TI, including some exotic ones, which I won't go into here! But I'll cover the two main ones that game developers would be interested in.

ISR (Interrupt Service Routine)

This is the easiest and most common method, primarily because it's supported by a ROM-based ISR, which means it doesn't consume any extra code space to use. It's also what TI BASIC and Extended BASIC use to produce sound through the SOUND subprogram.

You construct byte-length data tables in either VDP or GROM memory, and then feed the table's starting address into a memory-mapped port. Each entry begins with a length byte, which indicates how many bytes are going to be fed into the sound chip. It ends with a duration byte, which indicates how long to play the current "set" before moving on to the next entry. A duration of 0 indicates the end of the sound.

The first nybble (4 bits) of each command byte indicates what operation to do. The values >8, >A, >C, and >E indicate a frequency setting for voices 1-4, the fourth being the noise generator. Each of the generators requires two bytes to change or set a frequency, except the noise generator, which only takes one. The noise generator can be set to emit a low, medium, or high tone, or derive its frequency from the third voice's settings.

The values >9,>B,>D, and >F indicate an attenuation setting of voices 1-4. Only one byte is needed to change or set the attenuation of a single voice.

Here's a short example of a sound table entry:

       BYTE 7,>9F,>BF,>DF,>FF,>80,>05,>99,10

This says there are seven bytes to feed into the sound chip. The first four mute generators 1-4. The fifth and sixth set the first generator's frequency to >050. The seventh sets the first generator's attenuation to 9 (12 decibels). The final byte, which is not fed to the sound chip, indicates that 10/60, or 1/6, of a second should pass before processing the next line in the sound table.

Note that the frequency bytes for the voices always send the lowest nybble (4 bits) of the value in the first byte, right after the command nybble. This can be misleading at times. You read the frequency as the second byte followed by the first byte's second nybble, so >80,>05 reads as >050.
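To make the nybble shuffling a little more concrete, here's a minimal Python sketch (a PC-side helper, not TI code) that decodes the chip bytes from a sound table entry using the rules above. The function name and the tuple layout are my own invention for illustration.

```python
# Sketch of a decoder for sound chip command bytes, as described in the text.
# decode_chip_bytes and its tuple layout are illustrative names, not a TI API.

def decode_chip_bytes(data):
    """Return a list of (kind, voice, value) tuples for a run of chip bytes."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        op = b >> 4                 # first nybble: which operation
        low = b & 0x0F              # second nybble: data
        if op < 8:
            raise ValueError("byte %d is not a command byte" % i)
        voice = (op - 8) // 2 + 1   # >8/>9 = 1, >A/>B = 2, >C/>D = 3, >E/>F = 4
        if op % 2 == 1:
            # >9, >B, >D, >F: one-byte attenuation setting
            out.append(("attenuation", voice, low))
            i += 1
        elif voice == 4:
            # >E: one-byte noise generator setting
            out.append(("noise", voice, low))
            i += 1
        else:
            # >8, >A, >C: two-byte frequency; low nybble first, rest follows
            if i + 1 >= len(data):
                raise ValueError("frequency command is missing its second byte")
            value = ((data[i + 1] & 0x3F) << 4) | low
            out.append(("frequency", voice, value))
            i += 2
    return out


# The seven chip bytes from the example entry above.
example = [0x9F, 0xBF, 0xDF, 0xFF, 0x80, 0x05, 0x99]
for kind, voice, value in decode_chip_bytes(example):
    print(kind, voice, hex(value))
```

Feeding it the example entry should list four mutes, a frequency of >050 on voice 1, and an attenuation of 9 on voice 1.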

If you send a length-byte value of 0 or 255, it indicates an address change for the sound data, and the two bytes that follow give the new address. A zero means you stay in the present memory; a 255 means you switch to GROM or VDP, whichever one you're not in at present. This allows you to chain together sound effects or loop them for "continuous" music.
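Putting the length byte, duration byte, and address-change rule together, a table walker might look something like the Python sketch below. It's only a model of how I understand the ISR to step through the data; `parse_sound_table`, `mem`, and `start` are hypothetical names, and it stops at a branch rather than following it.

```python
# Sketch of walking a sound table the way the text describes the ISR doing it.
# parse_sound_table is an illustrative helper, not part of any TI toolchain.

def parse_sound_table(mem, start=0):
    """Return a list of entries found in `mem` beginning at index `start`."""
    entries = []
    pos = start
    while pos < len(mem):
        length = mem[pos]
        if length in (0x00, 0xFF):
            # Address change: the next two bytes are the target, high byte first.
            if pos + 2 >= len(mem):
                break
            target = (mem[pos + 1] << 8) | mem[pos + 2]
            where = "same" if length == 0x00 else "switch"
            entries.append(("branch", where, target))
            break
        end = pos + 1 + length          # index of the duration byte
        if end >= len(mem):
            break                       # truncated entry, stop quietly
        chip_bytes = list(mem[pos + 1:end])
        duration = mem[end]
        entries.append(("play", chip_bytes, duration))
        if duration == 0:
            break                       # duration 0 ends the sound
        pos = end + 1
    return entries


table = bytes([7, 0x9F, 0xBF, 0xDF, 0xFF, 0x80, 0x05, 0x99, 10,
               1, 0x9F, 0])
for entry in parse_sound_table(table):
    print(entry)
```

The second entry in `table` is a made-up mute I tacked on so the walk has a proper end.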

All in all, this is a very good method for sound production. The biggest limitation with it is that it's very data-centric; you can't tell it to "decay" the volume over time; each "change" must be accounted for in the table. So very complex sounds burn a lot of memory. You also have to be sure and clean up your generators after a sound; most of the sound arrangements I've seen have a "silence" sound that just mutes all the generators which is branched to after every other sound is completed. In regards to addressing, you also have to make certain your data addresses match what they would be in GROM or VDP memory, which means a certain degree of pre-planning.
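To get a feel for how quickly a simple fade eats memory, here's a minimal Python sketch that unrolls a volume decay into table bytes, one entry per attenuation step. It's a PC-side generator I'd use to produce BYTE data, under the assumption that a plain linear fade is what you want; `decay_table` and its parameters are names I made up.

```python
# Sketch: unroll a volume fade into ISR sound table bytes.
# decay_table is an illustrative helper; the fade shape is an assumption.

def decay_table(voice, value, start_atten=0, ticks_per_step=2):
    """Return table bytes for one tone on `voice` fading out to silence."""
    if not 1 <= voice <= 3:
        raise ValueError("voice must be 1-3")
    if not 1 <= value <= 0x3FF:
        raise ValueError("frequency value must fit in 10 bits")
    if not 0 <= start_atten <= 14:
        raise ValueError("start_atten must be 0-14")
    freq_cmd = 0x80 + (voice - 1) * 0x20    # >8, >A, or >C in the top nybble
    atten_cmd = freq_cmd + 0x10             # >9, >B, or >D in the top nybble
    # First entry: set the frequency (two bytes) and the starting volume.
    table = [3, freq_cmd | (value & 0x0F), value >> 4,
             atten_cmd | start_atten, ticks_per_step]
    # One entry per remaining attenuation level; nothing here can be a loop.
    for atten in range(start_atten + 1, 15):
        table += [1, atten_cmd | atten, ticks_per_step]
    # Final entry mutes the voice; duration 0 ends the sound.
    table += [1, atten_cmd | 0x0F, 0]
    return table


fade = decay_table(1, 0x050)
print(len(fade), "bytes")
print(", ".join(">%02X" % b for b in fade))
```

A full fade from the loudest setting costs 50 bytes for what would be a three-line loop in a higher language.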

Direct Sound Mapping

This method is one level lower than the ISR method. In fact, the ISR method is essentially doing this for you. Instead of letting the ISR read a sound table automatically, you directly feed the sound commands into the generator yourself, at CPU address >8400. You don't need the length byte at the start or the duration byte at the end. Unless you want to use them, since you'll be writing your own management system.

This method has one real strength, which is that it is no longer tied to the interrupts. This means you're no longer limited to a minimum of 1/60 of a second for sound-length. Instead, your timing becomes dependent on your code cycles, which are MUCH smaller. You can create sound lengths so small the human ear won't pick up the changes. You can make much smoother sound decays and possibly create sound effects that are completely unique and impossible to implement in BASIC.

The drawback is that unlike ISR, you're now involving the CPU in sound processing. This means it's a better method for stand-alone sound applications, such as music playing, as opposed to games, which usually need sound to be a "fire and forget" system. You could always set it up as a user-driven interrupt, but since there's already a ROM-based system in place, it's rather a waste unless you had a very special system in mind.

Creating Sound Effects

Well, there's no easy way to do this. If you're not particularly creative about it, you would be best served by studying how other games and programs on the TI have created sound effects. This is how I started myself.

BASIC sounds

BASIC and Extended BASIC games are always a good place to begin. For one thing, you can easily list programs and find the sound effects, so long as the program isn't protected. (And there are ways around that...)

The one thing to remember is that the SOUND subprogram in both languages, while superficially the same as assembly, has one real drawback: timing. Because the commands are run through the interpreter, you have a lot of overhead lag that makes piecing sound together a tricky business.

Voice control is also lacking. Unlike assembly, if you want to use more than one voice, you have to use them all in a single SOUND command. This wouldn't be a big deal except that in order to generate noise of varied frequency you have to use the third voice. So you end up with SOUND statements with muted first and second voices just to do that, a real waste of code space.

You'll also find that translating sound effects into assembly isn't straight-forward because there's a lot of approximation going on with the floating-point values used in BASIC. In fact, there's only 1024 different individual 'tones' stored in a 10-bit value, >000 to >3FF. And the higher the value, the lower the tone, because this value is used as a divider. So >000 is out of human hearing range, and >3FF is the lowest possible tone you can get.

In a similar fashion, the durations are not, in fact, in milliseconds. They're rounded to the nearest sixtieth of a second. And finally, the volume only ranges from 0-14 (each level is 2 decibels, 0 is 30, 1 is 28, etc.), with 15 being the command to turn off the sound generator. So low is louder than high.
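Here's a rough Python sketch of what translating a single-voice BASIC SOUND call into a table entry involves. Loud assumptions: I'm taking the chip clock as 3,579,545 Hz and the tone as clock / (32 x value), and I'm assuming BASIC's 0-30 volume simply halves into the chip's 0-15 attenuation. `basic_sound_to_entry` and the constants are my own names, not anything from TI.

```python
# Sketch: convert CALL SOUND(duration, frequency, volume) into table bytes.
# The clock value, the divide-by-32, and the volume halving are assumptions.

CLOCK_HZ = 3579545   # assumed chip clock (about 3.58 MHz)


def hz_to_value(hz):
    """Nearest 10-bit divider value for a frequency in Hz (higher = lower tone)."""
    value = round(CLOCK_HZ / (32 * hz))
    return max(1, min(0x3FF, value))


def ms_to_ticks(ms):
    """Duration in sixtieths of a second, clamped to what one byte can hold."""
    return max(1, min(255, round(ms * 60 / 1000)))


def basic_sound_to_entry(duration_ms, hz, volume, voice=1):
    """Return [length, chip bytes..., duration] for one voice."""
    if not 0 <= volume <= 30:
        raise ValueError("BASIC volume runs from 0 to 30")
    if not 1 <= voice <= 3:
        raise ValueError("voice must be 1-3")
    value = hz_to_value(hz)
    freq_cmd = 0x80 + (voice - 1) * 0x20
    atten_cmd = freq_cmd + 0x10
    return [3,
            freq_cmd | (value & 0x0F),   # low nybble goes out first
            value >> 4,                  # remaining six bits
            atten_cmd | (volume // 2),   # 30 becomes 15, which is "off"
            ms_to_ticks(duration_ms)]


entry = basic_sound_to_entry(1000, 440, 0)
print("BYTE " + ",".join(str(entry[0:1][0]) if i == 0 else ">%02X" % b
                         for i, b in enumerate(entry)))
```

Note how 440 Hz lands on value 254, not exactly on the tone BASIC asked for; that rounding is the "approximation" mentioned above.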

On a side note based on personal observations, I would avoid using the built-in high/middle/low set frequencies for the noise generators. While they take only a single byte to use as opposed to three, they're also frequently used (and over-used) in BASIC.

Cartridge/Assembly sounds

Mining your favorite cartridge or assembly game for sounds is a more difficult prospect than a BASIC program, but it can be done. The trick is to use a hex editor to study the raw code dumps or object files and find the distinctive pattern that sound data makes.

A good way to start is to look for instances of >E3 and >E7 bytes. These are the periodic and white noise generators, using the third voice to determine frequency. If these are followed with >C#,>##,>F# in some varied order, you have probably found a bit of sound data. Searching for the length and duration bytes is also a good technique; usually, finding very low byte values (1-9) adjacent to each other is indicative of sound data.
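If you'd rather not eyeball a hex dump, the search can be roughed out in a few lines of Python. This is only a heuristic sketch of the pattern described above; `find_noise_candidates` and the `window` size are my own guesses, and it will happily report false positives in ordinary code.

```python
# Sketch: flag >E3/>E7 bytes that sit near a >C# byte and a >F# byte.
# find_noise_candidates is an illustrative helper; window=4 is a guess.

def find_noise_candidates(dump, window=4):
    """Return offsets in `dump` that look like noise-generator sound data."""
    hits = []
    for i, b in enumerate(dump):
        if b not in (0xE3, 0xE7):
            continue
        nearby = dump[max(0, i - window):i + window + 1]
        has_voice3 = any(x >> 4 == 0xC for x in nearby)     # >C# frequency byte
        has_noise_vol = any(x >> 4 == 0xF for x in nearby)  # >F# attenuation byte
        if has_voice3 and has_noise_vol:
            hits.append(i)
    return hits


# A made-up dump: padding around one plausible sound entry.
dump = bytes([0x00, 0x00, 0x00,
              7, 0x9F, 0xBF, 0xDF, 0xF2, 0xCC, 0x01, 0xE7, 1,
              0x20, 0x20])
for offset in find_noise_candidates(dump):
    print("candidate at offset >%04X" % offset)
```

From each hit you can back up to look for a plausible length byte, then check that the byte after the run looks like a duration.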

I've used this technique to decode a few games from TI cartridges. For example, I really liked the "smack" sound that is made in Munchman when you eat a Hoonoo. I found the sound to be thus:

       BYTE 7,>9F,>BF,>DF,>F2,>CC,>01,>E7,1
       BYTE 2,>CC,>03,1
       BYTE 2,>CC,>05,1
       BYTE 1,>FF,0

First it mutes all the generators except the noise generator, which it sets to 2 (26 decibels). Then it sets the third voice to frequency >01C, and sets the noise generator to draw its frequency from the third voice. It plays for only one sixtieth of a second. Then it changes the frequency to >03C (+32) for another sixtieth of a second, and >05C (+32) for another sixtieth of a second. It then mutes the noise generator and returns a duration of 0, which indicates to the ISR routine that the sound table is at an end.

Remember, lower values mean a higher tone. So the effect is high-mid-low static, but so small in duration that it makes a nice "smack" sound.

I think an important aspect of sound development in assembly is to try and work with the system as it is, rather than try and translate things from BASIC. Unless you have to have EXACT tones, because you want a high C or a low A here, use your mathematical sense instead. Consider that the "smack" sound uses 32-tone intervals. There's a nice power-of-two number! The main thing is to try and see what happens. Experiment! Maybe you'll find sounds that no one has ever heard before. But you won't know until you try.

Let's look at another example, the opening theme to Munchman, when the titles come alive:

       BYTE 3,>9E,>BE,>DF,1
       BYTE 6,>87,>03,>A8,>03,>C0,>05,1
       BYTE 2,>C0,>05,10
       BYTE 4,>88,>03,>A7,>03,1
       BYTE 4,>89,>03,>A6,>03,1
       BYTE 4,>8A,>03,>A5,>03,1
       BYTE 4,>8B,>03,>A4,>03,1
       BYTE 4,>8C,>03,>A3,>03,1
       BYTE 4,>8D,>03,>A2,>03,1
       BYTE 0,>17,>8E

The first line sets up the voice generators for the three voices. The second line sets the three voice frequencies, which are all fairly high. Notice that most of the lines only have a 1/60 second duration, except one, which is 1/6 of a second. Why? Well, this creates the "pulse" effect you hear, a continuous long tone followed by a short burst of shifting frequency. Notice that the first voice's value is increased on each line, which lowers its tone VERY slightly, while the second voice's value is decreased by the same amount, which raises its tone. The final line is a branch command, which loops back up to the address where the second line is. This means the sound effects will process forever until the sound ISR is halted by the application.

With address looping, it's incredibly important to know WHERE sounds are going to be located. The sound ISR only supports two locations: VDP memory and GROM memory. Most of the time, you'll be using VDP, unless you're a cartridge developer. In any case, you still need to know the EXACT address of where sounds are located. That way, when you branch, you know exactly where you're going.
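Since the branch needs a literal address, I find it easiest to let a script do the arithmetic. The Python sketch below adds up entry lengths to get each entry's offset and then builds the three branch bytes. The load address >1789 is purely hypothetical; I picked it because it makes the second entry of the Munchman theme land on >178E, matching the branch in the listing. The helper names are mine.

```python
# Sketch: work out where each table entry lands and build a branch entry.
# BASE_ADDRESS is a hypothetical load address chosen for illustration.

BASE_ADDRESS = 0x1789


def entry_offsets(entries):
    """Return the byte offset of each entry from the start of the table."""
    offsets = []
    pos = 0
    for entry in entries:
        offsets.append(pos)
        pos += len(entry)
    return offsets


def branch_bytes(address, switch_memory=False):
    """Length byte 0 (stay) or 255 (switch), then the address, high byte first."""
    return [0xFF if switch_memory else 0x00,
            (address >> 8) & 0xFF,
            address & 0xFF]


# First two entries of the theme, exactly as listed above.
theme = [
    [3, 0x9E, 0xBE, 0xDF, 1],
    [6, 0x87, 0x03, 0xA8, 0x03, 0xC0, 0x05, 1],
]
loop_target = BASE_ADDRESS + entry_offsets(theme)[1]
print("BYTE " + ",".join(">%02X" % b for b in branch_bytes(loop_target)))
```

If you move the table, only `BASE_ADDRESS` changes and the branch bytes follow along.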

With my own sounds, I haven't been starting sound effects with mutes. I assume the sounds are already muted by branching every sound to a "mute" sound, which just shuts off all generators. That way I save memory and space. If I want to stop the ISR, I'll just change the sound address to the mute sound.

One last thing on addressing: because sound tables are constructed with BYTE commands, trying to use reference addresses for your sound looping is probably not going to work. Remember that DATA statements always start at an even address, so you can end up with "gap" bytes. These are very bad news, as the ISR sound system will go berserk when it hits them. You could try and ensure every sound begins and ends on an even address, but it's not the most efficient way to go about it, even if it is easier in code.

One sound effect that caught me off guard was the "lightning bolt" sound effect from Tunnels of Doom. It is, in my opinion, an absolutely incredible sound that has a majesty and power that quite fits the effect of the scroll that you use it with. (one-hit kills in Tunnels of Doom being a rare thing.) I found it in the Tunnels of Doom cartridge, and was quite surprised to see how huge it is. Have a look:

       BYTE 5,>9F,>A5,>01,>B0,>DF,2
       BYTE 2,>A9,>01,2
       BYTE 2,>AE,>01,2
       BYTE 2,>A4,>02,2
       BYTE 2,>AA,>02,2
       BYTE 2,>A2,>03,2
       BYTE 2,>AC,>03,2
       BYTE 2,>A7,>04,2
       BYTE 2,>A5,>05,2
       BYTE 2,>A5,>06,2
       BYTE 2,>A8,>07,2
       BYTE 2,>AF,>08,2
       BYTE 2,>AA,>0A,2
       BYTE 2,>AA,>0C,2
       BYTE 2,>A0,>0F,2
       BYTE 2,>A5,>01,1
       BYTE 2,>A9,>01,1
       BYTE 2,>AE,>01,1
       BYTE 2,>A4,>02,1
       BYTE 2,>AA,>02,1
       BYTE 2,>A2,>03,1
       BYTE 2,>AC,>03,1
       BYTE 2,>A7,>04,1
       BYTE 2,>A5,>05,1
       BYTE 2,>AA,>0C,1
       BYTE 2,>A0,>0F,1
       BYTE 2,>AD,>11,1
       BYTE 2,>A3,>15,1
       BYTE 2,>A4,>19,1
       BYTE 2,>A0,>1E,1
       BYTE 2,>AB,>23,1
       BYTE 2,>A7,>2A,1
       BYTE 2,>A7,>32,1
       BYTE 2,>A0,>3C,2
       BYTE 3,>A5,>01,>B4,1
       BYTE 2,>A9,>01,1
       BYTE 2,>AE,>01,1
       BYTE 2,>A4,>02,1
       BYTE 2,>AA,>02,1
       BYTE 2,>A2,>03,1
       BYTE 2,>AC,>03,1
       BYTE 2,>A7,>04,1
       BYTE 2,>A5,>05,1
       BYTE 2,>AA,>0C,1
       BYTE 2,>A0,>0F,1
       BYTE 2,>AD,>11,1
       BYTE 2,>A3,>15,1
       BYTE 2,>A4,>19,1
       BYTE 2,>A0,>1E,1
       BYTE 2,>AB,>23,1
       BYTE 2,>A7,>2A,1
       BYTE 2,>A7,>32,1
       BYTE 2,>A0,>3C,2
       BYTE 3,>A5,>01,>B8,1
       BYTE 2,>A9,>01,1
       BYTE 2,>AE,>01,1
       BYTE 2,>A4,>02,1
       BYTE 2,>AA,>02,1
       BYTE 2,>A2,>03,1
       BYTE 2,>AC,>03,1
       BYTE 2,>A7,>04,1
       BYTE 2,>A5,>05,1
       BYTE 2,>AA,>0C,1
       BYTE 2,>A0,>0F,1
       BYTE 2,>AD,>11,1
       BYTE 2,>A3,>15,1
       BYTE 2,>A4,>19,1
       BYTE 2,>A0,>1E,1
       BYTE 2,>AB,>23,1
       BYTE 2,>A7,>2A,1
       BYTE 2,>A7,>32,1
       BYTE 2,>A0,>3C,2
       BYTE 3,>A5,>01,>BC,1
       BYTE 2,>A9,>01,1
       BYTE 2,>AE,>01,1
       BYTE 2,>A4,>02,1
       BYTE 2,>AA,>02,1
       BYTE 2,>A2,>03,1
       BYTE 2,>AC,>03,1
       BYTE 2,>A7,>04,1
       BYTE 2,>A5,>05,1
       BYTE 2,>AA,>0C,1
       BYTE 2,>A0,>0F,1
       BYTE 2,>AD,>11,1
       BYTE 2,>A3,>15,1
       BYTE 2,>A4,>19,1
       BYTE 2,>A0,>1E,1
       BYTE 2,>AB,>23,1
       BYTE 2,>A7,>2A,1
       BYTE 2,>A7,>32,1
       BYTE 2,>A0,>3C,2
       BYTE 1,>BF,0

Yeah... that is LONG... it's over 300 bytes of sound data. Which doesn't sound like a lot if you're a modern-day coder, but for a vintage system this is VERY big. It takes up about 1/5 of the space that the game devoted to sound effects. For ONE sound.

So why is it so long? Well, the reason is that it's performing slow frequency and attenuation changes in an unrolled form. If you were doing a sound like this in a higher language, you'd use a control structure like FOR/NEXT to achieve the same effect in much less code space. The problem is, the sound ISR doesn't have control structures.

If you notice, the pattern of changes isn't cut and dried either. The interval changes in frequency get larger over time, and the start isn't repeated, only the three reverbs. Volume decreases in a predictable fashion, at least. It would be possible to convert the sound effect to an algorithm, but you would have to either write your own ISR to handle additional control structures, or have the CPU do sound processing.
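As a taste of what "converting to an algorithm" could look like, here's a Python sketch that regenerates part of the sweep from a loop. Playing with the numbers, the values in the listing appear to match 15 x 2^(k/4) rounded to the nearest whole number, i.e. four steps per octave. That's my own observation from the data, not anything documented by the game's authors, and `sweep_entries` with its parameters is a name I made up.

```python
# Sketch: rebuild a run of the lightning bolt sweep from a formula.
# The base of 15 and four steps per octave are inferred from the listing.

def sweep_entries(voice, first_step, last_step, duration,
                  base=15, steps_per_octave=4):
    """Return one [length, cmd, data, duration] entry per sweep step."""
    freq_cmd = 0x80 + (voice - 1) * 0x20
    entries = []
    for k in range(first_step, last_step + 1):
        value = round(base * 2 ** (k / steps_per_octave))
        entries.append([2, freq_cmd | (value & 0x0F), value >> 4, duration])
    return entries


# Steps 3 through 16 on voice 2, two ticks each.
for entry in sweep_entries(2, 3, 16, 2):
    print("       BYTE %d,>%02X,>%02X,%d" % tuple(entry))
```

Those fourteen generated lines should line up with the second through fifteenth lines of the listing. The reverbs skip a few steps in the middle, so they'd need a little extra logic.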

Music

So what about music? Well, it can be done with either method. The ISR method has one small disadvantage: any breaks in sound (like the natural pause between notes) must be simulated by muting the generator. However, the ISR system would also allow you to have continuously playing music in the background with very little effort.

I decided early on, however, not to have background music in the game proper. My reasons were:

The ISR method of music is okay, but it grates on the ear after a while
Music takes up memory that could be devoted to other more game-oriented elements
I'm not a musician personally, and I would want music that makes the game unique
Unfortunately, I can't just visit DeWolfe and get something...

A music playing engine would best be done using the direct sound method. This would let you have complete control over duration, have very fine-tuned sound decay, and even let you experiment with trying to reproduce different wave-forms and instruments.

Also, you can store the data in a different form than the ISR routine. You don't really need all 1024 frequencies with music. In fact, you could easily create a byte-based look-up table for different notes. This would let you compact the data into a smaller form, and even introduce control structure commands like increase/decrease tempo and so forth.
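A byte-based note table could be sketched like this in Python. Assumptions again: an equal-tempered scale starting at the low A (110 Hz), a chip clock of 3,579,545 Hz, and a tone of clock / (32 x value). The table size of 48 notes and the helper names are my own choices for illustration.

```python
# Sketch: one-byte note numbers mapped to 10-bit chip values.
# CLOCK_HZ, the scale, and the table size are assumptions for illustration.

CLOCK_HZ = 3579545


def build_note_table(lowest_hz=110.0, count=48):
    """Return chip values for `count` semitones starting at `lowest_hz`."""
    table = []
    for i in range(count):
        hz = lowest_hz * 2 ** (i / 12)        # equal temperament
        table.append(round(CLOCK_HZ / (32 * hz)))
    return table


def note_to_chip_bytes(table, note, voice=1):
    """Return the two frequency bytes for note number `note` on `voice`."""
    value = table[note]
    freq_cmd = 0x80 + (voice - 1) * 0x20
    return [freq_cmd | (value & 0x0F), value >> 4]


notes = build_note_table()
# Note 24 is two octaves above the low A, i.e. the A at 440 Hz.
print([">%02X" % b for b in note_to_chip_bytes(notes, 24)])
```

With a table like this, a tune is just a string of note numbers plus durations, and values 48 and up are free to act as commands such as tempo changes.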

There WILL be music for the opening title sequence, which is outside the scope of the demo, so don't be looking for it anytime soon. I found a piece of music I truly enjoyed for it, that seemed to capture the essence of the game. Interestingly enough, the piece had been written for an independent CRPG some years ago that never saw release.

A few months ago, I decided I had better contact the composer and secure permission, just to be safe. To my shock and dismay, the composer had died a year or so before I had found the piece. His daughter, however, extended me permission to use the piece, saying he was a big vintage computer enthusiast and would have enjoyed having it in a game.

Well, I mean... wow. I'm very happy to be able to use the music... Reducing it to three voices seems nearly a crime, mind you. But now it's like, I HAVE to finish the game. I don't dare NOT.

Samples

Unfortunately, I've had some trouble locating a way to record sound live in an emulator in a quick, clean, and easy fashion. (I'm open to suggestions!) So instead I'll provide some emulation-ready images.

In order to test sound effects, I had to write a small program that would store the sound data and also let me play them. I also used these to listen to the sound effects I extracted from some game cartridges.

So, below you can download the V9T9/MESS disk images and Classic99-ready files for the test programs for Tunnels of Doom and Munchman.

Please note that the sound selection system is not dummy-proof. You can easily navigate to a sound that doesn't exist, as I didn't take the time to code in checks against literal values. Munchman has 10 sounds (0-9) and Tunnels of Doom has 30 sounds (0-29). You've been warned!

And no, I'm not sharing my sound effects. Not yet anyway...

To load, use the Editor/Assembler cartridge option #3 to load and run TEST. It will start automatically.

Munchman Sound FX (Classic99 files in ZIP file)
Munchman Sound FX (90k SSSD V9T9/MESS disk-image)

Tunnels of Doom Sound FX (Classic99 files in ZIP file)
Tunnels of Doom Sound FX (90k SSSD V9T9/MESS disk-image)

Conclusion

Well, that was a bit of fun. And yes, I do plan on integrating sound into my present demo soon; I want to have a listen as much as everyone else! Then I will be focusing on the combat engine. Until next time!