This is why we don’t have 10GHz processors now, according to Intel
The 90s and early 2000s saw a thrilling race in terms of clock speeds, with successive generations of CPUs leaping forward into the gigahertz range.
But it’s 2018 now, and we still don’t have a 10GHz CPU. In fact, progress has slowed down significantly, and base clocks have actually decreased in some cases as more cores are added. Why? In a blog post on the company’s Developer Zone, Intel’s Victoria Zhislina dives into the details of CPU design and explains why clock speeds are plateauing.
Aside from issues like heat, Zhislina says there are other factors that limit frequency increases for x86 processors, which use a superscalar design and are what most of Intel’s products are based on.
A superscalar processor is one that can execute more than one instruction during a clock cycle by simultaneously assigning multiple instructions to different execution units on the same processor. Each instruction is also broken up into several different steps, each of which is executed sequentially (a technique known as pipelining).
The diagram below assumes that each instruction, and each step within the different instructions, requires the same amount of time to execute. During the t4 period, different steps of four instructions can be carried out.
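The idealized schedule described above can be sketched in a few lines of Python. This is only an illustration, not anything taken from Intel’s post: the function name `stage_occupancy` and the choice of four steps per instruction are assumptions made for the example.

```python
def stage_occupancy(num_instructions, num_stages, tick):
    """Return {stage: instruction} for an ideal pipeline at a given tick.

    Ticks, stages, and instructions are all numbered from 1. Instruction i
    enters stage 1 at tick i and advances exactly one stage per tick
    (assumption: every step takes the same amount of time).
    """
    occupancy = {}
    for instr in range(1, num_instructions + 1):
        stage = tick - instr + 1
        if 1 <= stage <= num_stages:
            occupancy[stage] = instr
    return occupancy


# Hypothetical setup: 4 instructions flowing through 4 steps.
for tick in range(1, 8):
    busy = stage_occupancy(num_instructions=4, num_stages=4, tick=tick)
    summary = ", ".join(f"step {s}: instr {i}" for s, i in sorted(busy.items()))
    print(f"t{tick}: {summary}")
```

Under these assumptions, t4 is the first tick at which all four steps are busy at once, each working on a different instruction.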
However, the reality is that different instructions can vary in execution time, as can the different steps of the same instruction. Ideally, the clock tick should be just long enough to fit the longest step, and that tick length in turn sets the processor’s frequency.
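To make the relationship between the longest step and the clock concrete, here is a rough sketch. The step delays are made-up numbers chosen for illustration, and `clock_from_stages` is a hypothetical helper, not something from Intel’s post.

```python
def clock_from_stages(stage_delays_ps):
    """Pick the clock tick that fits the slowest step.

    Returns (tick length in picoseconds, frequency in GHz).
    A tick of T picoseconds corresponds to 1000 / T GHz.
    """
    tick_ps = max(stage_delays_ps)
    frequency_ghz = 1000.0 / tick_ps
    return tick_ps, frequency_ghz


# Hypothetical step delays, in picoseconds.
delays = [300, 400, 500, 350]
tick_ps, freq_ghz = clock_from_stages(delays)
print(f"tick = {tick_ps} ps, frequency = {freq_ghz} GHz")

# Every step shorter than the tick simply waits for the next tick.
idle = [tick_ps - d for d in delays]
print("idle time per step (ps):", idle)
```

With these numbers the slowest step takes 500 ps, so the tick is 500 ps and the clock runs at 2 GHz, while the faster steps sit idle for part of each tick.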
One could argue that shorter clock ticks would let the shorter steps finish sooner, which in turn would raise the average speed. However, Zhislina says that this is not the case:
“Suppose that the longest step requires 500 ps (picosecond) for execution. This is the clock tick length when the computer frequency is 2 GHz. Then, we set a clock tick two times shorter, which would be 250 ps, and everything but the frequency remains the same. Now, what was identified as the longest step is executed during two clock ticks, which together takes 500 ps as well. Nothing is gained by making this change while designing such a change becomes much more complicated and heat emission increases.”
Even though the initial steps might finish sooner, the third step and every step after it will be held up. This is because the third execution unit will be free only every two clock ticks, instead of every clock tick.
In Zhislina’s words:
“While it is busy with the third step of one instruction, the same step of another instruction cannot be executed. So, our hypothetical processor that uses 250 ps clock ticks will work at the same speed as the 500 ps processor, though nominally its frequency is two times higher.”
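Zhislina’s example can be checked with a small simulation. The sketch below is a simplified model rather than a description of any real CPU: it assumes four steps with hypothetical delays of 250, 250, 500, and 250 ps, that each execution unit handles one instruction at a time, and that an instruction can wait between steps until the next unit is free.

```python
def completion_times_ps(stage_delays_ps, tick_ps, num_instructions):
    """Return the time (in ps) at which each instruction finishes."""
    # Each step needs a whole number of ticks, rounded up.
    ticks_per_stage = [-(-delay // tick_ps) for delay in stage_delays_ps]
    # Tick at which each execution unit next becomes free.
    stage_free_at = [0] * len(ticks_per_stage)
    finished = []
    for _ in range(num_instructions):
        t = 0
        for stage, ticks in enumerate(ticks_per_stage):
            start = max(t, stage_free_at[stage])  # wait for the unit to free up
            t = start + ticks
            stage_free_at[stage] = t
        finished.append(t * tick_ps)
    return finished


delays = [250, 250, 500, 250]  # hypothetical; the third step is the longest
slow = completion_times_ps(delays, tick_ps=500, num_instructions=3)
fast = completion_times_ps(delays, tick_ps=250, num_instructions=3)
print("500 ps tick:", slow)  # [2000, 2500, 3000]
print("250 ps tick:", fast)  # [1250, 1750, 2250]
```

In this model the first instruction does finish earlier with the 250 ps tick, but in both cases a new instruction completes only every 500 ps, so sustained throughput is identical despite the doubled nominal frequency.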
You could also raise the frequency by breaking up the longest step into shorter ones, but some steps depend on others, so this can’t be done without extensive changes to the processor architecture.
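As a rough illustration of what breaking up the longest step buys, the sketch below splits it into equal parts and recomputes the maximum frequency. The delays and both helper names are hypothetical, and the model optimistically assumes the step divides evenly, with no added overhead and no dependency problems, which is exactly the part that is hard in practice.

```python
def max_frequency_ghz(stage_delays_ps):
    """Highest clock (GHz) at which the slowest step still fits in one tick."""
    return 1000.0 / max(stage_delays_ps)


def split_longest(stage_delays_ps, parts=2):
    """Replace the longest step with `parts` equal, shorter steps.

    Assumption: the step splits evenly and the split adds no overhead.
    """
    longest = max(stage_delays_ps)
    i = stage_delays_ps.index(longest)
    return stage_delays_ps[:i] + [longest / parts] * parts + stage_delays_ps[i + 1:]


delays = [300, 400, 500, 350]  # hypothetical step delays in picoseconds
print(max_frequency_ghz(delays))     # 2.0
deeper = split_longest(delays)
print(deeper)                        # [300, 400, 250.0, 250.0, 350]
print(max_frequency_ghz(deeper))     # 2.5
```

Even in this best case, halving the 500 ps step lifts the clock only from 2 GHz to 2.5 GHz, because the 400 ps step immediately becomes the new bottleneck.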
Smaller manufacturing processes that reduce the physical size of components can help, as electrical signals then travel over shorter distances. The instruction steps are shortened uniformly as a result, and frequency can then be increased. However, it’s not so simple at a nanometer scale, and there are significant technological and physical limitations to overcome.
That said, efforts to shrink components further are ongoing, which accounts for the slow but continued increase in CPU speeds.
Source: Intel Developer Zone