Thursday, February 6, 2014

Tesla Coil Musings: Digital Polyphony

Not to be confused with Polyphony Digital, creators of the Gran Turismo series... Speaking of which, the sixth edition of the game is out (for PS3, so no need to buy a next-gen cloud-based facebook-tweeting console) and it's pretty awesome:

By the way, we're going widescreen in the main text frame from now on. Welcome to the future!

Anyway, despite GT6 and Kerbal (and stabilized camera systems), it's time to revisit one of my arch nemeses, MIDI. My last encounter was with MIDI Scooter, a rideable three-track MIDI instrument that also happens to be Pneu Scooter, which is still alive and well. MIDI Scooter's physical implementation was relatively simple: each of three notes was played on one of the motor drive's half bridges. (Three total for a three-phase permanent magnet motor controller.) The PWM frequency was the note frequency. The PWM duty cycle was controlled by the normal sensored FOC. So it could both play notes and properly drive the motor at the same time:


Knowing what I know now about how XBees work, I would say its one failing would be in attempting to stream the note data wirelessly to the controller. For the most part it worked, but it would spontaneously drop notes and had little in the way of error checking to prevent that. But that's more on the parsing, data handling, and communications stack side; the physical method of playing three notes at the same time was relatively simple and worked as expected.

For my DRSSTC, which survived a cross-country trip only to be neglected for a year, I've "solved" the streaming problem by just getting a microcontroller with enough flash memory to store an entire MIDI song's worth of data. This has the added benefit of not relying on wireless or USB in a harsh electrical environment. (Although theoretically the driver board thinks it's just controlling some LEDs...)

However, the physical implementation of multi-track MIDI note playing on a DRSSTC is a much more interesting challenge than MIDI Scooter. To begin with, it's probably important to understand how the Tesla coil generates sound in the first place. The PWM frequency for my coil is in the 150kHz range (to nearly but not necessarily exactly match the resonant frequency of the primary and the secondary), so playing sound by varying it is out of the question. Instead, the resonance-driving high frequency PWM is turned on and off in short pulses at a lower, audible frequency. Each pulse might look something like this:


Voltage builds up and at some point there is a discharge through the air which creates the sound. The pulse is very short: on the order of 100μs PWM drive time plus some decay time that depends on a lot of things. The actual sound-producing part of the pulse is probably even shorter - only the very peak of the voltage where an arc is generated.

How that pulsed arc is transformed into a sound pressure wave is a bit beyond my ability to simulate. For the purpose of this discussion I just assume that it creates a short square pulse of sound equal to the driving pulse length. If I had to guess, I would say the real thing is more of a triangle, somewhat narrower than the driving pulse width, and centered around the peak voltages where an arc is actually generated. The further you get from the coil, the more it gets spread out, kind of like thunder but on a much faster time scale. But unless you take it down into a canyon or something, there will be sharp edges on the time scale of the note frequency, giving it the characteristic cracking sound.

So, most of the note will be empty space - off time. This is done by necessity - the coil operates at very high peak kVA but very low overall duty cycle so as not to overload the components. (Here, I mean audible-frequency duty cycle, not resonant frequency duty cycle.) Since the note is mostly off time, and assuming the overall duty cycle is not exceeded, it is certainly possible to play two or more notes at the same time by superimposing the pulse chains. In a hardware sense, this would be like OR-ing two signals intended to turn on the high frequency PWM driver.
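
Here's a rough sketch of that superposition in Python, with time measured in integer sample ticks. The function names and the example periods and pulse widths are made up for illustration; they are not values from my coil.

```python
def pulse_train(period, width, length):
    """0/1 samples: `width` ticks on at the start of every `period` ticks."""
    return [1 if (t % period) < width else 0 for t in range(length)]


def or_trains(*trains):
    """Superimpose pulse chains the way a hardware OR gate would."""
    return [int(any(samples)) for samples in zip(*trains)]


def duty_cycle(train):
    """Fraction of ticks during which the driver is enabled."""
    return sum(train) / len(train)


# Hypothetical example: two notes with 5-tick pulses, periods of 100 and 150 ticks.
a = pulse_train(period=100, width=5, length=3000)
b = pulse_train(period=150, width=5, length=3000)
combined = or_trains(a, b)
print(duty_cycle(a), duty_cycle(b), duty_cycle(combined))
```

The combined duty cycle comes out slightly lower than the sum of the two, because the pulses coincide every 300 ticks. In this particular example they coincide exactly, so the OR happens to be harmless.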

There is one problem though: overlapping pulses. In the worst case, if pulses are simply OR-ed, there is the possibility of N pulse lengths adding together, where N is the number of notes. Since it's a resonant circuit, the resulting long pulse will ring up to much higher voltages and currents than intended, leading to certain destruction. So to ensure this doesn't happen, there is a simple rule to obey: A new pulse cannot start while an earlier pulse is ongoing, or for a certain holdoff period afterward. The holdoff period is to allow for ring-down after the driving waveform has stopped.

Driving pulse followed by a possibly exaggerated no-load ring down (red).
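
To make the worst case and the rule concrete, here is a small Python sketch. The helper names `longest_on_run` and `may_start` are my own inventions for illustration, and time is in arbitrary integer ticks.

```python
def longest_on_run(train):
    """Length of the longest uninterrupted run of on-samples."""
    longest = current = 0
    for sample in train:
        current = current + 1 if sample else 0
        longest = max(longest, current)
    return longest


def may_start(now, last_start, width, holdoff):
    """The rule: no new pulse during an earlier pulse or its holdoff."""
    return last_start is None or now >= last_start + width + holdoff


# Worst case for plain OR-ing: three 5-tick pulses that start back to back.
width, length = 5, 40
pulses = []
for start in (0, 5, 10):
    pulses.append([1 if start <= t < start + width else 0 for t in range(length)])
merged = [int(any(samples)) for samples in zip(*pulses)]
print(longest_on_run(merged))  # 15 ticks, i.e. N * width with N = 3
```

With a 10-tick holdoff, `may_start(15, 0, 5, 10)` is the earliest moment a second pulse would be allowed after one that started at tick 0.
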
Even though the statement of this rule is simple, there are many ways to actually implement it. Since I couldn't say for certain which way would most accurately produce the sound of three independent notes played simultaneously, I decided to do what I always do and write a VB program to solve the problem for me.

One of the ugliest and most annoying VB programs I have written.
The program just generates a WAV file of short square wave pulses at up to three note frequencies. The notes are selected by MIDI number and pulse widths can be varied independently. Holdoff is varied globally although there is no reason why this must be the case. It uses one of several methods to handle overlapping pulses:

Simple Drops: Notes are associated with up-counters that roll over after the note's period has elapsed. A pulse is generated if a note counter rolls over and no other note counter is below the pulse length plus holdoff time. Otherwise, the pulse is partially or completely skipped (dropped). It's a brute force method that I didn't expect to work well but was good for comparative testing. Unlike all the other methods, there is no priority given to lower-number notes.
Simple Drops
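
A Python sketch of how I read the Simple Drops rule, evaluated tick by tick. This is an illustration under my own assumptions (all counters start at zero, integer ticks, invented function names), not the VB code.

```python
def simple_drops(periods, width, holdoff, length):
    """Return 0/1 output samples for free-running note counters."""
    busy_range = width + holdoff
    out = []
    for t in range(length):
        counters = [t % p for p in periods]  # up-counters that roll over each period
        on = 0
        for i, c in enumerate(counters):
            others_busy = any(
                other < busy_range for j, other in enumerate(counters) if j != i
            )
            # A note only gets through while every other counter is clear.
            if c < width and not others_busy:
                on = 1
        out.append(on)
    return out


def pulse_starts(train):
    """Tick indices where the output goes from off to on."""
    return [t for t, s in enumerate(train) if s and (t == 0 or not train[t - 1])]


print(pulse_starts(simple_drops([100, 100], 5, 10, 1000)))  # same note: []
print(pulse_starts(simple_drops([200, 100], 5, 10, 1000)))  # octave: [100, 300, 500, 700, 900]
```

Because the check runs every tick, a pulse that is already underway gets cut short as soon as another counter rolls over, which is where the partial drops come from.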

Counter Hold: If a pulse attempts to start during the on time or holdoff period of an earlier pulse, the counter that generates the later pulse is stopped until the end of the earlier holdoff period. After the end of the earlier holdoff period, the counter is restarted and allowed to generate its pulse. If two or more pulses try to start at exactly the same time, a fixed priority is implemented (lower number notes first).
Counter Hold
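
Counter Hold can be sketched the same way in Python. The assumptions here are mine: every counter starts at rollover (so all notes want to fire at tick 0), time is in integer ticks, and the names are invented.

```python
def counter_hold(periods, width, holdoff, length):
    """Return 0/1 output samples; blocked counters are stopped at rollover."""
    counters = list(periods)  # start at rollover: every note wants to fire at t = 0
    on_left = 0    # ticks left in the current pulse
    busy_left = 0  # ticks left in the current pulse plus holdoff
    out = []
    for _ in range(length):
        if busy_left == 0:
            # The lowest-numbered note waiting at rollover gets the slot.
            for i, p in enumerate(periods):
                if counters[i] >= p:
                    counters[i] = 0
                    on_left = width
                    busy_left = width + holdoff
                    break
        for i, p in enumerate(periods):
            if counters[i] < p:  # a counter sitting at rollover stays stopped
                counters[i] += 1
        out.append(1 if on_left else 0)
        on_left = max(on_left - 1, 0)
        busy_left = max(busy_left - 1, 0)
    return out


def pulse_starts(train):
    """Tick indices where the output goes from off to on."""
    return [t for t, s in enumerate(train) if s and (t == 0 or not train[t - 1])]


print(pulse_starts(counter_hold([100, 100], 5, 10, 400)))
print(pulse_starts(counter_hold([100, 94], 5, 10, 400)))
```

Both calls print the same start times. With periods of 100 and 94 ticks, the second note is held every cycle and ends up locked to the first note's period.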


Random Phase Shift: If a pulse attempts to start during the on time or holdoff period of an earlier pulse, the counter that generates the later pulse is shifted to a random position outside of the pulse-generating range of that counter and allowed to continue running from there. (Equivalent to inserting a random delay of less than one counter period minus one pulse+holdoff time.) If two or more pulses try to start at exactly the same time, a fixed priority is implemented (lower number notes first). Inspired by packet collision avoidance methods in digital communication: If the channel is not clear, each waiting transmitter can delay a random amount to minimize the chance of getting stuck in a four-way stop sign standoff.
Random Phase Shift
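
An event-based Python sketch of Random Phase Shift. The exact range of the random delay is my assumption (1 up to the period minus pulse+holdoff, in integer ticks), every period must be longer than pulse+holdoff, and the function name is invented.

```python
import random


def random_phase_shift(periods, width, holdoff, length, rng):
    """Return (start_tick, note_index) for every pulse actually played."""
    n = len(periods)
    due = [0] * n   # next tick at which each counter rolls over
    free_at = 0     # end of the current pulse + holdoff
    starts = []
    while True:
        # Earliest rollover goes first; ties go to the lower-numbered note.
        i = min(range(n), key=lambda k: due[k])
        t = due[i]
        if t >= length:
            break
        if t < free_at:
            # Blocked: push this counter to a random later phase and carry on.
            due[i] = t + rng.randint(1, periods[i] - (width + holdoff))
        else:
            starts.append((t, i))
            free_at = t + width + holdoff
            due[i] = t + periods[i]
    return starts


# Octave test with hypothetical periods of 200 and 100 ticks.
for start, note in random_phase_shift([200, 100], 5, 10, 600, random.Random(1)):
    print(start, note)
```

Passing in a seeded `random.Random` makes a run repeatable, which helps when comparing how different first shifts change the sound.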


ISR Priority: If a pulse attempts to start during the on time or holdoff period of an earlier pulse, the later pulse and its holdoff period are delayed until the end of the earlier holdoff period, similar to Counter Hold, but the counter that triggered the later pulse attempt is allowed to continue running. If two or more pulses try to start at exactly the same time, a fixed priority is implemented (lower number notes first). This mimics the behavior of a prioritized and unnested Interrupt Service Routine on timer compare matches. This is probably the most common method, and it is certainly the method used by the oneTesla interrupter for two notes (interrupter source available here).
ISR Priority
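
An event-based Python sketch of ISR Priority. It is a simplified model under my own assumptions: integer ticks, every compare match is eventually serviced (no lost interrupts), and the names are invented. It is not the oneTesla code.

```python
def isr_priority(periods, width, holdoff, length):
    """Return (start_tick, note_index) for every pulse actually played."""
    n = len(periods)
    due = [0] * n   # next compare match of each free-running counter
    free_at = 0     # end of the current pulse + holdoff
    starts = []
    while True:
        # Anything already pending when the slot frees up is served by priority.
        pending = [i for i in range(n) if due[i] <= free_at]
        if pending:
            i = pending[0]
        else:
            i = min(range(n), key=lambda k: due[k])
        start = max(due[i], free_at)
        if start >= length:
            break
        starts.append((start, i))
        free_at = start + width + holdoff
        due[i] += periods[i]  # the counter itself never stops or shifts
    return starts


# Half-tone-like test with hypothetical periods of 100 and 94 ticks.
for start, note in isr_priority([100, 94], 5, 10, 300):
    print(start, note)
```

The only difference from a Counter Hold model is the last line of the loop: the next compare match is scheduled from the original due time rather than from the delayed start, so a delayed note keeps its average frequency.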


For fun and science, I also threw in two "control" methods that are not realistic to implement but are interesting for the purpose of generating comparison sounds. One simply adds pulses together. (Adds, not ORs, so 1 + 1 = 2 instead of 1 | 1 = 1.) My thinking here was that this would be the purest representation of playing two or more notes simultaneously where each note is defined as a short square wave pulse. The other is the fundamental sinusoidal case, which is useful for listening to expected beat frequencies.
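
Both controls are easy to sketch in Python with the standard `wave` module. The sample rate, the 16-bit mono format, and the function names are my assumptions for illustration; the note-number-to-frequency formula is the usual equal-tempered one with A4 = MIDI 69 = 440 Hz.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 44100  # assumed; any audio rate works


def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (69 -> 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def pulse_sum(notes, width_s, seconds):
    """1+1=2 control: add (not OR) one short square pulse train per note."""
    out = [0] * int(SAMPLE_RATE * seconds)
    for note in notes:
        period = 1.0 / midi_to_hz(note)
        for k in range(len(out)):
            if (k / SAMPLE_RATE) % period < width_s:
                out[k] += 1
    return out


def sine_sum(notes, seconds):
    """Pure sine control: sum of the fundamentals."""
    freqs = [midi_to_hz(note) for note in notes]
    return [
        sum(math.sin(2.0 * math.pi * f * k / SAMPLE_RATE) for f in freqs)
        for k in range(int(SAMPLE_RATE * seconds))
    ]


def wav_bytes(samples):
    """Normalize to full scale and encode as 16-bit mono WAV data."""
    peak = max(abs(s) for s in samples) or 1
    frames = b"".join(struct.pack("<h", int(32767 * s / peak)) for s in samples)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(frames)
    return buf.getvalue()


# Octave test: A4 and A5 with 100 us pulses for a tenth of a second.
data = wav_bytes(pulse_sum([69, 81], 100e-6, 0.1))
print(len(data), "bytes")
```

Writing `data` to a file with a `.wav` extension gives something playable; swapping in `sine_sum([69, 81], 0.1)` gives the sinusoidal control.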

And what would an algorithm test be without test cases? The test cases were chosen to best bring out the differences in the methods:

1. Same Note: Two channels attempt to play the same note. Simple enough, right??

Control: Pure Sine (.wav)
Control: 1+1=2 (.wav)
Simple Drops (.wav)
Counter Hold (.wav)
Random Phase Shift (.wav)
ISR Priority (.wav)

Loser: Simple Drops. It simply drops all the pulses.

Winner: Any of the others, really. None replicate the exact sound of 1+1=2 (twice the amplitude) but they at least play the note more loudly with two channels than with only one. ISR and Counter Hold should be identical in this test case. Random Phase Shift is interesting because it sounds different depending on what random shift is implemented on the first collision.

2. Octave: Two channels attempt to play two notes that are in the exact ratio 2:1.

Control: Pure Sine (.wav)
Control: 1+1=2 (.wav)
Simple Drops (.wav)
Counter Hold (.wav)
Random Phase Shift (.wav)
ISR Priority (.wav)

Loser: Simple Drops. It simply drops the higher frequency. Actually, it's dropping the lower frequency and every other pulse of the higher frequency, leaving half the higher frequency which is the lower frequency. It's effed, is what I'm saying.

Winner: This one is tough to call. I think all three other methods produce both frequencies but favor the lower-frequency note more compared to 1+1=2. Random Phase Shift again varies in sound depending on the first random shift. Once it's shifted the f and 2f pulses out of a conflicting phase, they stay where they are after that. Sometimes this sounds more like the 1+1=2 sound than the other two methods. Other times, it sounds like it's creating entirely new, even higher frequencies.

3. Half Tone: Two channels attempt to play notes that are one half-tone apart. This should illustrate a worst-case scenario for beat frequency effects.

Control: Pure Sine (.wav)
Control: 1+1=2 (.wav)
Simple Drops (.wav)
Counter Hold (.wav)
Random Phase Shift (.wav)
ISR Priority (.wav)

Loser: Counter Hold. It's hard to explain what exactly is going on, but it results in dropping the higher frequency note. The shorter period pulse doesn't fit in the off-time gap left by the longer period pulse, so it gets delayed every cycle. (Except at the very start, before the first beat period elapses, when you can still hear both notes.)

Winner: ISR Priority or Random Phase. Simple Drops works but has obvious clicks where there are drops. ISR Priority has a more pronounced beat frequency (like Pure Sine), while Random Phase disguises it with noise. At high frequencies, Random Phase sounds worse (lots of random shifts lead to a very noisy sound), but at lower frequencies it might be the winner.
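
For a sense of scale, the expected beat frequency of two notes is just the difference of their fundamentals. A small Python sketch, using A4 and A#4 as an example pair (my choice, not necessarily the notes in the test files):

```python
def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (69 -> 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def beat_hz(note_a, note_b):
    """Beat frequency between two notes: the difference of their fundamentals."""
    return abs(midi_to_hz(note_a) - midi_to_hz(note_b))


print(round(beat_hz(69, 70), 2))  # A4 against A#4: 26.16
```

For pulse trains this is also the rate at which the two pulses line up and collide.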

4. Perfect Fifth: Two notes seven half-tones apart. Very close to 3:2 ratio. Other than the octave test, this is a worst-case scenario for a small integer relationship between notes. This is the case illustrated in the sample waveforms above.

Control: Pure Sine (.wav)
Control: 1+1=2 (.wav)
Simple Drops (.wav)
Counter Hold (.wav)
Random Phase Shift (.wav)
ISR Priority (.wav)

Loser: Simple Drops. It should be obvious from the above waveforms that this won't work. It creates all kinds of mess, dropping almost every other low-frequency pulse and every third high-frequency pulse. I don't know why it sounds the way it does, but it doesn't match 1+1=2 well at all.

Winner: Counter Hold or Random Phase. Both are able to shift the 3:2 periods into non-colliding phase for the duration of the 1-second test notes. ISR Priority still suffers from collisions, although the distortion they create is not nearly as bad as Simple Drops.
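
How close is "very close to 3:2"? A quick Python check, assuming standard twelve-tone equal temperament (the function names are mine):

```python
import math


def equal_tempered_ratio(half_tones):
    """Frequency ratio of an interval of `half_tones` half-tones."""
    return 2.0 ** (half_tones / 12.0)


def cents_from(ratio, reference):
    """Signed distance from `reference` to `ratio` in cents (1200 per octave)."""
    return 1200.0 * math.log2(ratio / reference)


fifth = equal_tempered_ratio(7)
print(round(fifth, 4))                   # 1.4983
print(round(cents_from(fifth, 1.5), 3))  # -1.955
```

Because the ratio is not exactly 3:2, the relative phase of the two pulse trains drifts slowly rather than staying fixed.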

5. Three-Note Chord:
A worst-case duty cycle test, attempting to play three relatively high frequency notes at the same time.

Control: Pure Sine (.wav)
Control: 1+1=2 (.wav)
Simple Drops (.wav)
Counter Hold (.wav)
Random Phase Shift (.wav)
ISR Priority (.wav)

Loser: Everything except ISR Priority. Simple Drops has ... a lot of drops, which sound like clicks. And everywhere there would be a drop, Random Phase instead inserts a phase shift; cumulatively, these make it sound extremely noisy. And Counter Hold fails for similar reasons to Test Case 3, I think.

Winner: ISR Priority. It's not perfect, but for 10% overall duty cycle, it is pretty damn good.

These are certainly not all the possible test cases, and in fact each one could be tested with a wide range of note frequencies, pulse lengths, and holdoff periods, with possibly different results for different permutations. However, I've now got a good feel for the strengths and weaknesses of the methods. ISR Priority seems to be the most robust (it wasn't a "Loser" in any test and was only not the Winner in the Perfect Fifth test case). It's proven, easy to implement, extensible to more than two notes, and ideal for a coil with a relatively low resonant frequency and long pulse lengths, where note-playing duty cycles are likely to be high (like mine).

However, there is something to be said for Random Phase Shift. With extremely low overall duty cycles, it might be the better choice. It can shift octaves, same notes, and 3:2 perfect fifths (and possibly other ratios) into non-colliding phases, making them more independent-sounding and disguising beat frequencies better. Its noisy sound quality at higher duty cycles outweighs these benefits overall, but it was certainly worth the exploration.

I also think that, if the real-time constraint is lifted (the pulses needn't be ORed in real time by a microcontroller using some kind of simple timer-based algorithm), it is possible to devise even more interesting methods that satisfy the holdoff constraint but produce a better match to the 1+1=2 sounds. I don't know exactly what these methods are, but there is a path of optimization to go down for sure.

For now, though, I'm eager to a) implement this on actual hardware and b) test it in conjunction with a MIDI parser on actual MIDI tracks. Not sure in which order. Stay tuned.