performance_matters
need for speed: realtime dsp on most modern computers
Here is how sound works on most modern computers:
- 1. A piece of software owned by the operating system (kernel software) is responsible for sending data to the sound card. (On Linux this would be ALSA or PipeWire. On Windows this would be WASAPI, with ASIO as a common third-party option for pro audio. On macOS this would be CoreAudio.)
- 2. This kernel software will call the program you write and ask it for some number of samples. It hands you a buffer of numbers and you hand it back a buffer of numbers.
- 3. The kernel software then sends that chunk of data to the sound card and asks your program for the next chunk.
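The callback in step 2 might look something like this. This is a hedged sketch: real APIs like CoreAudio and ALSA each have their own callback signatures, and the 440 Hz sine tone is just an assumption to give the function something to compute.

```cpp
#include <cstddef>
#include <cmath>

// Hypothetical callback shape: the OS hands us an output buffer and a
// frame count, and we must fill it before the deadline.
void process(float* out, std::size_t num_frames, double sample_rate) {
    static double phase = 0.0;                      // state kept between calls
    const double freq = 440.0;                      // assumption: a simple test tone
    const double two_pi = 6.283185307179586;
    const double inc = two_pi * freq / sample_rate; // phase step per sample
    for (std::size_t i = 0; i < num_frames; ++i) {
        out[i] = static_cast<float>(std::sin(phase));
        phase += inc;
        if (phase >= two_pi) phase -= two_pi;       // wrap to avoid drift
    }
}
```

The static `phase` variable is the important part: the callback gets invoked over and over, so any state that has to survive between chunks lives outside the loop.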
The kernel software is on a tight deadline. The sound card is consuming those numbers at the sample rate, so at 48k it's churning through 48000 samples per second. If you miss the kernel software's deadline, it moves on without you. This results in nasty clicks and pops coming out of your computer.
This means realtime DSP algorithms have to be really fast. At a sample rate of 48k, a realtime DSP algorithm has about 20 microseconds per sample (1/48000 of a second, roughly 20.8 microseconds) to do everything it's going to do.
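The back-of-envelope math is worth internalizing. Here's a small sketch; the 128-frame buffer size is an assumption, since real hosts commonly use anything from 32 to 1024 frames per callback:

```cpp
// Time budget math for a realtime audio callback.
double microsecondsPerSample(double sample_rate) {
    return 1e6 / sample_rate;       // ~20.83 us at 48k
}

// The kernel usually asks for a whole buffer at a time, so the actual
// deadline is the per-sample budget times the buffer size.
double bufferDeadlineMs(int buffer_frames, double sample_rate) {
    return buffer_frames * microsecondsPerSample(sample_rate) / 1000.0;
}
```

At 48k with a 128-frame buffer, the whole callback has about 2.67 milliseconds to run, and everything it does per sample has to fit inside that ~20.8 microsecond average.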
Of course, your program typically isn't the only thing making sound, so practically you have even less time than that! That 20 microsecond window is shared by everything on your computer that is processing audio.
This is why audio programs are typically written in a fast, compiled language like C, C++, or Rust. We need every ounce of speed we can muster.
Thankfully compilers are pretty good these days and computers are really good at doing math. However, this does mean that we need a feel for what things are slow and what things are fast on a computer.
don't worry about optimization, worry about pessimization
So how do we make sure our dsp code will be performant enough to not cause problems? Thankfully, for most common dsp algorithms, no advanced optimization is needed. Writing fast enough software is more about not doing things that pessimize your code.
in favor of procedural programming
Many of us learn an Object Oriented style of programming (OOP for short). It's a style of programming that is commonly taught in universities and used all over the industry. In OOP we conceive of a program as messages passed between "objects". These objects are structures that combine data and functionality together. They are often organized into taxonomies (ex. a Truck is a kind of Vehicle). We can then write methods that operate on all Vehicles regardless of their concrete type. This is called polymorphism.
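In C++, that kind of polymorphism usually looks like the following. This is a minimal sketch; the class names and speeds are just for illustration:

```cpp
// A taxonomy: Car and Truck are both kinds of Vehicle.
struct Vehicle {
    virtual ~Vehicle() = default;
    virtual double topSpeed() const = 0;   // each kind answers differently
};

struct Car : Vehicle {
    double topSpeed() const override { return 200.0; }
};

struct Truck : Vehicle {
    double topSpeed() const override { return 120.0; }
};

// Works on any Vehicle: the actual topSpeed() that runs is picked at runtime.
double cruiseSpeed(const Vehicle& v) { return v.topSpeed() / 2.0; }
```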
However, all of this abstraction is not free! Creating polymorphism this way requires something called virtual dispatch (virtual function calls resolved at runtime), which prevents the compiler from inlining and otherwise automatically optimizing our code. The additional costs for this kind of abstraction are not trivial.
Thankfully there are other ways to organize our code that don't pay this kind of performance penalty.
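One such alternative is to keep the data as plain structs and write free functions over them, so the compiler sees the exact type at every call site and can inline freely. A minimal sketch (the names here are made up for illustration):

```cpp
#include <cstddef>

// Plain data, no vtable: just the state a gain stage needs.
struct GainState { float gain; };

// A free function that processes a whole buffer in one direct call.
// Nothing here is resolved at runtime, so the loop can be inlined
// and vectorized by the compiler.
void applyGain(const GainState& s, float* buf, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        buf[i] *= s.gain;
}
```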
intro to data oriented design: math is fast, data is slow
One such alternative to OOP is something called data oriented design. In this paradigm we start by understanding the data structures our code needs and how that data will be stored and processed by our algorithm. We then let that knowledge drive the rest of the design.
Here is the short version: math is fast, data is slow. The CPU is really good at doing math. It's not so good at moving data around. When the CPU does math, it has to move data into its registers first, and that data has to come from memory. If the data has to come all the way from RAM, the CPU will very likely spend more cycles moving it around than doing math on it. This is why modern computers have a hierarchy of places to cache data, typically called the L1, L2, and L3 caches. Each is a bit further away from the CPU and a bit larger in size. If the math you want to do requires data that's not in a cache, the CPU has to go fetch it from a slower level of the hierarchy (ultimately RAM) and stall until it arrives. That's a cache miss. This means if you minimize the size of your structures, more of your data fits in each cache line and you reduce the chance of a cache miss.
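A classic data oriented move that falls out of this is "struct of arrays": instead of interleaving every field of every object in memory, you pack each field into its own dense array, so a pass that only touches one field streams through contiguous memory and wastes no cache space on fields it doesn't need. A sketch, with the `Voice` fields invented for illustration:

```cpp
#include <cstddef>
#include <vector>

// Array-of-structs: each voice's fields are interleaved in memory,
// so a pass over just `gain` drags `phase` and `freq` through the cache too.
struct Voice { float phase; float freq; float gain; };

// Struct-of-arrays: each field is packed contiguously.
struct Voices {
    std::vector<float> phase, freq, gain;
};

// A pass that only needs gains reads one dense, cache-friendly array.
float totalGain(const Voices& v) {
    float sum = 0.0f;
    for (float g : v.gain) sum += g;
    return sum;
}
```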