JORDAN.YELLOZ.me

About 10 years ago as a second year undergraduate student, I had a data structures assignment that I didn’t successfully complete. The assignment was to implement a classical data structure, the AVL tree. The insertion algorithm was fairly straightforward at the time but removal was difficult and this was the part I could not complete successfully. Some publicly available implementations recalculate the balance factor of a subtree after every step of the removal operation on that tree which is not very good, but can technically preserve the time complexity requirements of an AVL tree. When that recalculation is performed, the implementation of the removal algorithm is quite easy since nobody needs to think about what happened to the tree at every step of every operation. With less cowardly implementations, a few rules for updating the balance factors can be determined by studying the effects of the removal of an item from the tree and any rotations performed in earlier steps of the procedure.

Anyway, since I didn’t complete that assignment I would come back to it every once in a while to try to finish it and I never achieved any success. Over the past few weeks, when not standing over deceased adolescent pinnipeds, I have used my time to start over completely to implement an AVL tree in C (with GLib). Eventually in problems like this where there is a small amount of cases, it can pay off to create a logic table and actually write out every single possibility with its outcome to find out how to turn specific cases into formulas. That was the tool I used this time (combined with a lot of reading) and I seem to have eventually succeeded in correctly implementing the AVL tree removal operation.

What’s most shameful of all this is that when re-implementing the data structure this time, I ran into the same problems as before. Looking back, that should be expected since I rarely ever used any non-trivial problem-solving skills over the past several years. Anyway, I ended up taking under 1000 lines of code for a complete implementation including some large comments trying to remind the future me how to remove the successor of a node from a tree and perform the re-balancing on any parent nodes of the successor which might need it. Additionally, for those who don’t know, the code is basically duplicated with the directionality swapped since it is a binary tree so it could reasonably be claimed that there’s only about 300-400 lines of actual code in there. It might be worth merging that code and using some math to flip the effects of the algorithm. It would probably also be good to have an exhaustive testcase for the removal operation since that would give any future implementors a fixed target to satisfy.

Posted on

Revisions: