.. post:: Oct 27, 2023
   :tags: programming
   :author: Philipp Beisel

Data Oriented Programming Practice
==================================================

*How* do we program? It is an activity we spend a lot of our lifetime on, so let's make it an enjoyable one!

As a happy programmer, I am a good programmer. I want to have an easy time programming. I want to produce source code that I like. I don't want to write code that will hinder me from proceeding some time in the future. Or, to put it another way, I don't want to encounter roadblocks I created myself earlier. You can call it a *hangover*: the situation where reworking the whole thing feels like an inhumane task, a total defeat. At the same time, proceeding in the same manner does not seem viable either; it causes pain. Usually the *solution* is some workaround or ugly hack that calls the whole architecture so far into question.

I want to refine my practice to avoid this situation in the future. I think a key aspect is to incorporate *iteration* and *refactoring* into the process naturally. Also, using or working with the code should not impose any behaviour on the caller, i.e., the source code should be compatible with other programming styles and practices.

Imagine working in an already existing codebase and adding a certain functionality. At first, let's forget all entry points and places the code should later be plugged into, and just open a scope:

.. code-block:: c++

    {
        //...
    }

and ask

1. What data do we operate on?
2. What are the core lines of code that solve the problem?

Then we write down these lines of code at an adequate level of quality, including some necessary input and output example data. Then, if that is not yet possible, we arrange things so that we can compile and run this code snippet in isolation!

Example: we want to print some intensity image to the console using ASCII characters.
So let the input data be a two-dimensional array of intensities (float) and the output data a two-dimensional array of ASCII characters:

.. code-block:: c++

    #include <algorithm>
    #include <array>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    int main()
    {
        {
            //input data
            size_t width = 50;
            size_t height = 20;
            std::vector<float> inputMem(width * height);

            //fill with some values
            for(size_t hI = 0; hI < height; hI++)
            {
                for (size_t wI = 0; wI < width; wI++)
                {
                    inputMem[hI * width + wI] =
                        std::sin(10.0f * 3.1415f * (float)wI / (float) width) *
                        std::sin(12.0f * 3.1415f * (float)hI / (float) height);
                }
            }
            float * input = inputMem.data();

            //output data: one extra character per row for the newline
            size_t outputSize = width * height + height;
            std::vector<char> outputMem(outputSize);
            char * output = outputMem.data();

            //map to ascii art
            const size_t mpN = 6;
            std::array<char, mpN> mapping = {' ','.',':','o','=','@'};
            for(size_t hI = 0; hI < height; hI++)
            {
                for(size_t wI = 0; wI < width; wI++)
                {
                    auto inputIdx = hI * width + wI;
                    auto outputIdx = hI * (width + 1) + wI;
                    //clamp to [0, 1], then scale and truncate to a mapping index
                    output[outputIdx] = mapping[static_cast<size_t>(
                        (
                            std::max(0.0f, std::min(1.0f, input[inputIdx]))
                            + 0.05f
                        )
                        * (mpN - 1)
                    )];
                }
                output[hI * (width + 1) + width] = '\n';
            }

            //print
            fwrite(output, sizeof(char), outputSize, stdout);
        }
        return 0;
        //...
    }

When run, this produces

::

     o==o      o==o      o==o      o==o      o==o
          .oo.      .oo.      .oo.      .oo.      .oo.
          .oo.      .oo.      .oo.      .oo.      .oo.
     o==o      o==o      o==o      o==o      o==o

          o==o      o==o      o==o      o==o      o==o
     .oo.      .oo.      .oo.      .oo.      .oo.
     .oo.      .oo.      .oo.      .oo.      .oo.
          o==o      o==o      o==o      o==o      o==o

     o==o      o==o      o==o      o==o      o==o
          .oo.      .oo.      .oo.      .oo.      .oo.
          .oo.      .oo.      .oo.      .oo.      .oo.
     o==o      o==o      o==o      o==o      o==o

          o==o      o==o      o==o      o==o      o==o
     .oo.      .oo.      .oo.      .oo.      .oo.
     .oo.      .oo.      .oo.      .oo.      .oo.
          o==o      o==o      o==o      o==o      o==o

Let's now refactor it to satisfy the style that I have found to suit me well. In fact, since I stopped thinking in an *object oriented* way and started to explore this approach (we *could* call it *data oriented* programming), I think I have become a more effective and also a happier programmer.

After refactoring it, actually using (i.e. calling) the code looks like this:

.. code-block:: c++

    ...
    int main()
    {
        {
            using namespace ASCIIMapping;
            Data data{};
            data.params.width = 50;
            data.params.height = 20;
            run(data);
        }
    }
    ...

The refactoring is very simple: we just put everything into a namespace that describes what the code is doing, in our case *ASCIIMapping*. There we create a struct named *Parameters*. Then we add a struct named *Data*, which has one member of type *Parameters* and, additionally, all the data that the code needs to run. Finally, we add a function called *run()* that takes a reference to a *Data* instance:

.. code-block:: c++

    #include <algorithm>
    #include <array>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    namespace ASCIIMapping
    {
        const size_t mpN = 6;

        struct Parameters
        {
            size_t width{};
            size_t height{};
            std::array<char, mpN> mapping = {' ','.',':','o','=','@'};
        };

        struct Data
        {
            Parameters params{};
            std::vector<float> inputMem{};
            std::vector<char> outputMem{};
        };

        void run(Data & on)
        {
            auto width = on.params.width;
            auto height = on.params.height;

            //input data
            on.inputMem.resize(width * height);

            //fill with some values
            for(size_t hI = 0; hI < height; hI++)
            {
                for (size_t wI = 0; wI < width; wI++)
                {
                    on.inputMem[hI * width + wI] =
                        std::sin(10.0f * 3.1415f * (float)wI / (float) width) *
                        std::sin(12.0f * 3.1415f * (float)hI / (float) height);
                }
            }
            float * input = on.inputMem.data();

            //output data: one extra character per row for the newline
            size_t outputSize = width * height + height;
            on.outputMem.resize(outputSize);
            char * output = on.outputMem.data();

            //map to ascii art
            for(size_t hI = 0; hI < height; hI++)
            {
                for(size_t wI = 0; wI < width; wI++)
                {
                    auto inputIdx = hI * width + wI;
                    auto outputIdx = hI * (width + 1) + wI;
                    //clamp to [0, 1], then scale and truncate to a mapping index
                    output[outputIdx] = on.params.mapping[static_cast<size_t>(
                        (
                            std::max(0.0f, std::min(1.0f, input[inputIdx]))
                            + 0.05f
                        )
                        * (mpN - 1)
                    )];
                }
                output[hI * (width + 1) + width] = '\n';
            }

            //print
            fwrite(output, sizeof(char), outputSize, stdout);
        }
    }

This approach might appear old-fashioned to some. I agree, but maybe sometimes the old way is the better way?

Why call it *data oriented*?
-----------------------------

Because it emphasizes the separation between functions and the data they operate on. In object oriented programming, an object would encapsulate its data and offer methods to manipulate or handle the data. I tried to use the object oriented approach for quite some time, and I apparently managed to mess it up almost every time. The separation of data and functions clears the mind and helps me focus on actually solving the problem, instead of debating why the name of this class is inappropriate, or which design pattern to use in that case.

Note on namespaces
-------------------

I have found namespaces to be very useful in this context: I put all the *meaning* of what the code does into the name of the namespace. This allows for having a struct simply called *Data* and one called *Parameters*. Namespaces allow us to set up the context. We can then use the created namespaces conveniently with the *using namespace* directive if we want to. But we can also be very specific and spell out the whole chain of namespaces to make sure the reader knows what we are talking about.
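
The two ways of referring to a namespace can be sketched as follows. This is only an illustration: the namespaces *Image::Downsample* and *Image::Crop*, their members, and the two helper functions are hypothetical names that are not part of the example above. Both namespaces contain a *Parameters* struct, a *Data* struct, and a *run()* function, and the namespace alone tells them apart.

.. code-block:: c++

    #include <cstddef>

    // Hypothetical namespaces, for illustration only.
    namespace Image
    {
        namespace Downsample
        {
            struct Parameters
            {
                std::size_t factor{2};
            };

            struct Data
            {
                Parameters params{};
                std::size_t inputWidth{};
                std::size_t outputWidth{};
            };

            // Divides the input width by the downsampling factor.
            void run(Data & on)
            {
                on.outputWidth = on.inputWidth / on.params.factor;
            }
        }

        namespace Crop
        {
            struct Parameters
            {
                std::size_t margin{1};
            };

            struct Data
            {
                Parameters params{};
                std::size_t inputWidth{};
                std::size_t outputWidth{};
            };

            // Removes the margin on both sides.
            // Assumes inputWidth >= 2 * margin.
            void run(Data & on)
            {
                on.outputWidth = on.inputWidth - 2 * on.params.margin;
            }
        }
    }

    // Convenient form: the using directive sets up the context,
    // so the short names Data and run() are enough.
    std::size_t downsampledWidth(std::size_t width)
    {
        using namespace Image::Downsample;
        Data data{};
        data.inputWidth = width;
        run(data);
        return data.outputWidth;
    }

    // Explicit form: the whole chain of namespaces is spelled out.
    std::size_t croppedWidth(std::size_t width)
    {
        Image::Crop::Data data{};
        data.inputWidth = width;
        Image::Crop::run(data);
        return data.outputWidth;
    }

With the default parameters, ``downsampledWidth(50)`` yields 25 and ``croppedWidth(50)`` yields 48. Keeping the *using namespace* directive inside a function body limits the short names to that scope, so both styles can live side by side in one file.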