What does a hard problem look like?
Learning to assess the difficulty of a problem in Computer Science.
Hello, Hypers!
There are two big fields of study in theoretical Computer Science: Computational Theory and Complexity Theory. The first one answers the question: Is this problem computable? In other words, can we create an algorithm that solves this problem? The second theory is all about the following question:
How hard is this problem?
Of course, I’m simplificating these fields of study but we’ll go deeper in the future. Today, I just want to gently introduce you to the amazing world of Complexity Theory. Do you want to know if your solution is the best possible one? Or maybe knowing if it is acceptable in your specific context? Let’s begin with the first step!
What do we need to look for?
When implementing an algorithm, we need to know what resources our algorithm consumes and the solution requirements. How long do we need to wait to get the solution? How much memory does the algorithm use? Time and memory are the resources we are mostly concerned about. But, depending on the context, other things might need our attention like security, fault tolerance, network usage, etc.
For now, we are going to focus on time only. However, most of the rules and methods we’ll discover will be valid for memory usage too. That being said, what are the factors that influence an algorithm's runtime?
On one hand, we have the hardware. Older CPUs perform fewer operations in the same amount of time. Also, some algorithms are heavily optimized for very specific hardware. For example, today we have powerful GPUs that perform many operations simultaneously, by taking advantage of parallelization. However, some algorithms can be parallelized, and others don’t. So, not all algorithms can take advantage of this hardware.
In Complexity Theory, we’ll never consider hardware (although in real life we should do it sometimes). In Computer Science, algorithms are not lines of code that run on a computer. They are a more abstract concept, like a mathematical function. We can (and we will) talk about algorithms without talking about computers. However, I’ll be using Python code to illustrate the examples since I don’t want to go deep into what an algorithm is.
Many people have wrote about this topic in Substack. If you want me to write about it in my own way, please let me know by replying to this post/email.
There is still another factor that influences the runtime of an algorithm: the input. For example, if we want to find the sum of all the elements in a list, the larger the list, the slower the algorithm.
The larger the list, the larger the number of times the line result += element
is executed.
The complexity of an algorithm is always relative to its input.
In conclusion, when assessing the complexity of an algorithm, we’ll focus on the algorithm itself (the operations it performs), and the input. “And nothing else matters…”
It’s Time, It’s Memory, It’s an Elemental Operation!
Let’s focus on our algorithm that sums the elements of a list. The question now is: how complex is it? Or, more specifically, how much time does finding the sum take? If you have been awake while reading this, you know the first answer: It depends on the input! In our specific case, it depends on the size of the list.
Ok, let’s say the length of the list is N. How to express the time consumed by the algorithm in terms of N? As we are talking about time, we could be tempted to run the algorithm many times, measure the runtime on each run, and try to find a rule that connects N with that runtime.
Well, those could be a good experiment in other contexts but it violates the first rule: forget about hardware! When running an algorithm, we are running it on a specific hardware and this affects the time we measure. We need to measure time without running the algorithm.
After thinking a bit about it, you’ll realize there is no way to connect time with N without running the algorithm in some way. And you are right! In the context of Computational and Complexity Theory, we don’t have a notion of time. We just deal with a bunch of mathematical operations. So, we need to measure time without measuring time!
But don’t be angry, there is a way! As I said, we only deal with some specific operations. We’ll call elemental operations to a few of them. For example, assignments like result = 0
, sum, subtraction, multiplication, and division of numbers, return statements, comparison, and all these very basic and seemingly atomic operations.
Indeed, these elemental operations are the building blocks of our algorithms. The complexity of an algorithm is then measured by how many of these blocks it needs. You can consider each elemental operation takes a unit of time to execute, so the amount of time the algorithm requires to run matches the number of elemental operations it performs. We don’t measure time, we measure elemental operations.
For example, let’s highlight the elemental operations in our sum algorithm:
Assuming the input list contains N elements, let’s count how many of these operations are performed:
result = 0
: Is executed once and it is independent of N.result += element
: Is executed N times since the for loop will do N iterations.return result
: Is executed once and it is independent of N.
In total, we have N + 2 elemental operations.
Being rigorous, inside the for-loop we assing the
element
variable and do more stuff behind scenes but it is not important. It doesn’t affect the result too much.
Now we have a number that depends on the input but that is independent of the hardware. This is a number that truly represents the complexity of this algorithm. No matter how powerful the CPUs Intel or any other company creates in the future, this number will always make sense.
A similar analysis can be applied to memory.
Conclusions
This was just a first step into the fascinating world of Computational Complexity. In future posts, I’ll be showing you other insights about how to compare algorithms and how to easily calculate runtime complexity.
I know this topic could be counter-intuitive at the beginning but it is quite interesting once you get it (not to mention it is incredibly important in Computer Science and Engineering). Any feedback about the explanations and the topics you would like me to write about is more than welcome. Also, don’t hesitate to ask any questions!
And that’s it! I hope you have enjoyed this issue. As always, thank you for another week.
See you next Tuesday!