Instruction issue degree: The central parameter in superscalar processing is how many instructions can be issued per cycle. A superscalar processor that issues k instructions per cycle is called a k-degree superscalar processor. To exploit the full parallelism of such a processor, k instructions must be executable in parallel.
For example, consider a 2-degree superscalar processor with a 4-stage instruction pipeline: instruction fetch (IF), decode instruction (DI), fetch the operands (FO), and execute the instruction (EI), as shown in the figure below. Two instructions are issued per cycle, so 6 instructions pass through the 4-stage pipeline in 6 clock cycles: 4 cycles for the first pair, then one further cycle for each of the two remaining pairs. Under ideal conditions, once the pipeline reaches steady state, two instructions complete per cycle.
Superscalar Processing of instruction cycle in 4-stage instruction pipeline
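The cycle count in this example can be checked with a small calculation. The following is a minimal sketch, assuming an ideal pipeline with no stalls and instructions issued in order in groups of k; the function names `total_cycles` and `schedule` are illustrative and do not come from the text.

```python
from math import ceil

# Stage order assumed: fetch, decode, operand fetch, execute.
STAGES = ["IF", "DI", "FO", "EI"]

def total_cycles(n_instructions, degree, n_stages=len(STAGES)):
    """Cycles needed to finish n instructions on an ideal k-degree pipeline."""
    groups = ceil(n_instructions / degree)  # issue groups of up to k instructions
    return n_stages + groups - 1            # first group fills the pipe, then one cycle per group

def schedule(n_instructions, degree):
    """Return {instruction number: {cycle: stage}} for the ideal, stall-free case."""
    table = {}
    for i in range(n_instructions):
        issue_cycle = i // degree + 1       # k consecutive instructions share one issue cycle
        table[i + 1] = {issue_cycle + s: name for s, name in enumerate(STAGES)}
    return table

if __name__ == "__main__":
    n, k = 6, 2
    cycles = total_cycles(n, k)
    for instr, row in schedule(n, k).items():
        cells = [row.get(c, "--") for c in range(1, cycles + 1)]
        print(f"I{instr}: " + " ".join(cells))
    print("total cycles:", cycles)
```

With n = 6 and k = 2 the formula gives 4 + 3 - 1 = 6 cycles, matching the figure. The same six instructions on a scalar pipeline (k = 1) would need 4 + 6 - 1 = 9 cycles.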
To implement superscalar processing, some additional hardware must be provided, as discussed below:
- The required data path width grows with the degree of superscalar processing. For example, if each instruction is 32 bits and the processor is 2-degree superscalar, then a 64-bit data path from the instruction memory and 2 instruction registers are required.
- Multiple execution units are also required to execute multiple instructions at once and to avoid resource conflicts.
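The fetch-side requirement in the first point is simple arithmetic and can be written down directly. This is only a sketch, assuming fixed-size instructions and one instruction register per issued instruction; `fetch_requirements` is a hypothetical helper name.

```python
def fetch_requirements(instruction_bits, degree):
    """Fetch hardware needed to supply `degree` fixed-size instructions per cycle."""
    return {
        # All k instructions must arrive from instruction memory in one cycle.
        "data_path_bits": instruction_bits * degree,
        # One register holds each fetched instruction.
        "instruction_registers": degree,
    }

if __name__ == "__main__":
    print(fetch_requirements(32, 2))
```

For 32-bit instructions and a 2-degree processor this gives a 64-bit data path and 2 instruction registers, as stated above.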
If sufficient hardware is not provided, data dependencies and resource conflicts have a greater effect in superscalar processing. The additional hardware provided is known as hardware (machine) parallelism; it ensures that resources are available in hardware to exploit parallelism. The alternative is to exploit the instruction-level parallelism inherent in the code. This is achieved by having an optimizing compiler transform the source code so that dependencies and resource conflicts are reduced in the resulting code.
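To illustrate how dependencies limit the issue rate and how reordering by a compiler can help, here is a minimal sketch. It assumes a hypothetical three-address instruction format, greedy in-order issue, and checks only read-after-write dependencies; write-after-write hazards, write-after-read hazards, and execution-unit conflicts are ignored. The names `Instr`, `raw_dependent`, and `issue_groups` are illustrative only.

```python
from collections import namedtuple

# Hypothetical three-address instruction: dest <- op(src1, src2)
Instr = namedtuple("Instr", "op dest src1 src2")

def raw_dependent(first, second):
    """True if `second` reads the register that `first` writes (read-after-write)."""
    return first.dest in (second.src1, second.src2)

def issue_groups(program, degree=2):
    """Greedy in-order issue: pack up to `degree` independent instructions per cycle."""
    groups, current = [], []
    for ins in program:
        conflict = any(raw_dependent(prev, ins) for prev in current)
        if conflict or len(current) == degree:
            groups.append(current)  # close the current issue group
            current = []
        current.append(ins)
    if current:
        groups.append(current)
    return groups

add = Instr("add", "r1", "r2", "r3")
mul = Instr("mul", "r4", "r1", "r5")  # reads r1, written by add
sub = Instr("sub", "r6", "r7", "r8")
orr = Instr("or", "r9", "r6", "r2")   # reads r6, written by sub

original = [add, mul, sub, orr]
reordered = [add, sub, mul, orr]      # sub is independent of add and mul, so it can move up

print(len(issue_groups(original)), "issue cycles before reordering")
print(len(issue_groups(reordered)), "issue cycles after reordering")
```

In the original order, `mul` cannot issue alongside `add` and `or` cannot issue alongside `sub`, so three issue cycles are needed. After moving `sub` up, both pairs are independent and the four instructions issue in two cycles, which is the kind of transformation an optimizing compiler aims for.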
Many popular commercial processors have been implemented with a superscalar architecture, such as the IBM RS/6000, MIPS R4000, DEC 21064, PowerPC, Pentium, etc.