Are SIMD instructions classified as RISC or CISC?

Myriam Morissette

Lvl 10
4y ago

Related questions

What is Flynn's taxonomy of parallel architecture?

Flynn's taxonomy is used to categorize computer architectures. It considers the number of concurrent instruction streams and the number of data streams that an architecture supports. The four combinations of instruction and data streams are: SISD (single instruction stream, single data stream), MISD (multiple instruction streams, single data stream), SIMD (single instruction stream, multiple data streams), and MIMD (multiple instruction streams, multiple data streams).
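As a rough illustration, the four classes can be derived from two counts. The sketch below is only a mnemonic aid; the function name flynn_class and its integer arguments are assumptions made for this example and are not part of the taxonomy itself.

```python
def flynn_class(instruction_streams: int, data_streams: int) -> str:
    """Return the Flynn class name for the given stream counts.

    Both arguments are hypothetical inputs chosen for this sketch:
    the number of concurrent instruction streams and data streams.
    """
    if instruction_streams < 1 or data_streams < 1:
        raise ValueError("stream counts must be at least 1")
    # "S" for a single stream, "M" for multiple streams.
    i = "S" if instruction_streams == 1 else "M"
    d = "S" if data_streams == 1 else "M"
    return f"{i}I{d}D"
```

For instance, flynn_class(1, 8) returns "SIMD" and flynn_class(4, 4) returns "MIMD".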


What is the difference between SIMD and MIMD?

SIMD (single instruction, multiple data) describes multiple processing elements that perform the same operation on multiple data points simultaneously. MIMD (multiple instruction, multiple data) describes a number of processors that function independently, each executing its own instruction stream on its own data.


What is the difference between an Intel Pentium 3 and an Intel Pentium 4?

Pentium 4 vs PowerPC 750/970

The IA-32 family and the Power/PowerPC family are two processor families that started off with two different design philosophies. The IA-32 ISA was originally based on the CISC (Complex Instruction Set Computing) philosophy, and PowerPC was based on RISC (Reduced Instruction Set Computing). In the early 1990s, RISC-based processors were more suitable for pipelining and produced better throughput for the following reasons:

• More program-visible registers than CISC machines like IA-32, so fewer memory accesses are needed for data.
• A relatively small instruction set with a fixed-length instruction encoding, which takes less effort to decode than IA-32 instructions, which can be of variable length.
• With a fixed instruction length and a carefully designed instruction cache line, RISC machines also have more predictable instruction cache performance than CISC processors.

But CISC processors had other advantages, such as the following:

• With complex instructions, a CISC machine might be able to do the work of a set of simple instructions by executing only one instruction in its ISA, whereas a traditional RISC machine would need a few instructions to do the same. This means that CISC processors would have better code density than their RISC counterparts.
• The additional addressing modes normally provided by CISC processors are not only more flexible; they also help reduce the number of instructions (at least in the user program) needed to perform an operation. This matters all the more because most CISC machines, like Intel's, are accumulator-based machines.

With improvements in semiconductor process technology, more transistors could be packed into a die of the same size. This enabled RISC machines to bring in the basic CISC idea of having instructions that perform more than one simple operation.
It also enabled traditional CISC processors to include some RISC characteristics. Some, like the Pentium Pro and its successors, have become RISC machines at the core but CISC machines at the ISA level. Chronologically, the design of the Pentium 4 is sandwiched between the designs of the PowerPC 750 (a close relative with some enhancements is used in the Power Mac G4) and the PowerPC 970 (Power Mac G5). Now, let us compare the processors behind the world's most popular desktops, namely the x86 PC and the Apple Power Mac.

As mentioned earlier, the microarchitecture of the Pentium 4 looks much like that of a RISC machine. But one of the most notable differences between the two processors in question is the depth of the pipeline. The Pentium 4 has a 20-stage pipeline, compared to essentially a 4-stage pipeline in the PowerPC 750. It is interesting to note that the latest PowerPC 970 has at least 15 stages in its integer pipeline, rising to 25 stages for its SIMD (AltiVec / Velocity Engine) pipeline. One of the main reasons the PowerPC 970 (a 64-bit processor) was designed this way is apparently to reach higher clock speeds and close the gap with the Pentium 4.

One of the significant advantages that RISC processors had over CISC processors was the ease of decoding instructions. This no longer holds with the Pentium 4 around and with PowerPC adding some complex instructions to its ISA. As noted before, in the NetBurst microarchitecture the instruction decoder is decoupled from the main pipeline through the use of the trace cache, which nullifies the advantage that RISC enjoyed. Interestingly, the PowerPC 970 dedicates the first 9 stages of its pipeline to fetch and decode, whereas the Pentium 4 uses only its first 4 stages for this. This is made possible by the trace cache in the Pentium 4. Also, the PowerPC 970 breaks complex instructions into instructions that can be executed in a single cycle.
This looks just like the Pentium Pro and its IA-32 successors dividing the instructions of their ISA into μ-ops for execution. We feel that for very high frequency designs such division of instructions, whether in a CISC-based or a RISC-based processor, is going to be inevitable in the future.

The PowerPC 750 also tries to access operands when issuing instructions or while they wait in the reservation stations. The Pentium 4, in contrast, accesses the register file for operand fetch only after dispatch. This increases the size of a reservation station entry in the PowerPC 750. It will be interesting to see this part of the implementation in the PowerPC 970, which can have as many as 215 instructions in flight compared to 126 in the Pentium 4.

PowerPC also introduced another interesting approach to branching. In the PowerPC 750, the branch instruction itself can carry a hint to the branch prediction unit specifying the most likely outcome (branch taken or not taken). It also has a special branch processing unit (BPU), unlike the Pentium 4. The BPU can resolve branches very early, which helps the pipeline recover from a misprediction faster. The PowerPC 970 extends this even further with three separate 16K branch prediction buffers, compared to the 4K branch prediction buffer in the Pentium 4.

PowerPC used, and still uses, individual reservation stations for each execution unit (EU). The Pentium 4 instead uses one queue for memory loads and stores and another queue for all other kinds of instructions. It should also be noted that the Pentium 4 takes longer to schedule and dispatch (5 stages) to the EUs.

PowerPC still enjoys a particular advantage over the Pentium 4. It has many user-visible registers, which can reduce memory accesses considerably because procedure or function local variables can mostly be kept in registers. Of course, the wider (64-bit) registers in the PowerPC 970 mean that structures of that size can now be stored in a single register.
The introduction by AMD of Hammer, which has 64-bit extensions to x86, would mean that we begin to have Intel-like machines with 64-bit registers. But we feel that Intel is not likely to make a 64-bit successor to the Pentium 4, and will rather promote an Itanium 2 derivative like Deerfield.

The Pentium 4 has a strict FIFO order of dispatch from a queue. This means that if the instruction at the front of the queue cannot be dispatched, the scheduler and dispatcher do not look further down that queue. But because loads and stores are kept in a separate queue, the operations that could cause the most latency (memory loads) can execute sooner. The PowerPC 750, on the other hand, issues by looking at only 2 entries in the queue, and the reservation stations hold the entries for each EU. This means that whereas PowerPC relies on executing instructions in parallel, the Pentium 4 relies on deep pipeline behaviour within the execution units. The rapid execution core in the Pentium 4 helps this cause as well.

As we can see, PowerPC not only attacked the performance problem with a deep pipeline matching the Pentium 4's, but also has more EUs than the Pentium 4. So it packs more power in the hardware. Something that PowerPC could add in the future to make full use of its resources is on-chip multithreading (SMT). The POWER5 processor is supposed to have that, so we can hope the successor of the PowerPC 970 will have it too.

Conclusion

The two microprocessor families described here really compete for the top spot in modern high-performance desktops and workstations. The latest 64-bit PowerPC and the Pentium 4 seem to be very close in terms of performance. The PowerPC 970 was designed not only to run at higher speeds but also for graphics and media applications, by providing a powerful AltiVec engine. Intel, in turn, aims to run all applications better with its Hyper-Threading Technology and by providing an optimal amount of hardware resources. But if Intel stays backward compatible with IA-32 for long, the performance of future designs might be constrained by this need for backward compatibility. So we might begin to see PowerPC processors in Apple Power Macs outperforming the Pentium 4 in the years to come.
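The idea of dividing a complex instruction into simpler operations, mentioned in the answer above, can be sketched with a toy model. This is a conceptual illustration only: the "op dst, src" text format, the temporary names tmp0 and tmp1, and the function crack are all hypothetical and do not correspond to the real decoders or micro-op formats of the Pentium 4 or the PowerPC 970.

```python
def crack(instruction: str) -> list[str]:
    """Split a toy 'op dst, src' instruction into register-only micro-ops.

    Operands written in square brackets (e.g. "[x]") stand for memory
    locations; everything else is treated as a register. The format is
    an assumption of this sketch, not a real instruction encoding.
    """
    op, operands = instruction.split(maxsplit=1)
    dst, src = [part.strip() for part in operands.split(",")]
    uops = []
    if src.startswith("["):
        # A memory source is first loaded into a temporary register.
        uops.append(f"load tmp0, {src}")
        src = "tmp0"
    if dst.startswith("["):
        # A memory destination becomes load, operate, store.
        uops.append(f"load tmp1, {dst}")
        uops.append(f"{op} tmp1, {src}")
        uops.append(f"store {dst}, tmp1")
    else:
        uops.append(f"{op} {dst}, {src}")
    return uops
```

In this model a register-to-register instruction such as "add r2, r1" stays a single operation, while "add [x], r1" is split into a load, an add, and a store.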


What has the author Matthew D Levin written?

Matthew D. Levin has written: 'Parallel algorithms for SIMD and MIMD computers'


What is different between SIMD multiprocessing and MIMD multiprocessing?

SIMD Defined

The SIMD architecture performs an identical action simultaneously on multiple data pieces. This single action can include retrieving, calculating or storing information. An example is retrieving a lot of different files at the same time. Processors with local memory containing different data execute the same instruction in a synchronized fashion, with inter-processor communication for shift allocation.

MIMD Defined

The MIMD architecture performs multiple different actions simultaneously on multiple data pieces. An example is the performance of various mathematical calculations such as addition and multiplication simultaneously in order to solve a complex math problem with many separate components. MIMD computing may or may not be synchronized and is becoming more common than SIMD computing.
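The contrast can be modelled in a few lines of Python. This is a conceptual sketch, not real parallel execution: each list element stands for the local data of one processing element, and the names simd_step and mimd_step are assumptions made for this example.

```python
import operator
from typing import Callable, Sequence


def simd_step(op: Callable, local_data: Sequence, operand) -> list:
    """SIMD model: every processing element applies the SAME operation
    to its own local data item in one synchronized step."""
    return [op(x, operand) for x in local_data]


def mimd_step(programs: Sequence[Callable], local_data: Sequence) -> list:
    """MIMD model: each processing element runs its OWN program on its
    own local data item, independently of the others."""
    if len(programs) != len(local_data):
        raise ValueError("need exactly one program per processing element")
    return [prog(x) for prog, x in zip(programs, local_data)]
```

For example, simd_step(operator.mul, [1, 2, 3, 4], 10) multiplies every element by 10, while mimd_step can increment the first element, double the second, and negate the third in the same step.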


Is a process that allows the CPU to receive a single instruction and then execute it on multiple pieces of data rather than receiving the same instruction each time each piece of data is received?

"SIMD, which stands for 'single instruction, multiple data,' is a process that allows the CPU to receive a single instruction and then execute it on multiple pieces of data rather than receiving the same instruction each time each piece of data is received."(Pg. 434, A+ Guide to Managing and Maintaining Your PC)


What are multi vector and SIMD computers?

SIMD (Single Instruction, Multiple Data) is a process that allows the CPU to execute a single instruction simultaneously on multiple pieces of data, rather than by repetitive looping.


What is array processor and what is the role of attached array processor?

An array processor is a processor that performs computations on large arrays of data. There are two types: (1) attached array processors and (2) SIMD array processors.


What does MMX stand for?

MMX is commonly expanded as MultiMedia eXtensions. It is a set of instructions added to Intel x86 microprocessors to speed up multimedia workloads such as image, audio, and video processing. MMX-enabled microprocessors employ the SIMD (Single Instruction, Multiple Data) technique of data processing: one instruction operates on several small integers packed into a single 64-bit register.
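To give a feel for packed-integer arithmetic, the sketch below emulates a wraparound byte add on four 8-bit lanes held in one 32-bit Python integer, so that a single addition updates all lanes at once. It is a pure-Python model of the idea, not the MMX instruction set (real MMX registers are 64 bits wide and hold eight such bytes); the helper names pack4, unpack4, and packed_add_bytes are assumptions of this example.

```python
HIGH_BITS = 0x80808080  # top bit of each 8-bit lane
LOW_BITS = 0x7F7F7F7F   # low seven bits of each 8-bit lane


def pack4(values):
    """Pack four integers in 0..255 into one 32-bit word, lane 0 lowest."""
    if len(values) != 4 or any(not 0 <= v <= 255 for v in values):
        raise ValueError("need exactly four values in 0..255")
    return sum(v << (8 * i) for i, v in enumerate(values))


def unpack4(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (8 * i)) & 0xFF for i in range(4)]


def packed_add_bytes(a, b):
    """Add four byte lanes at once, each wrapping modulo 256.

    The low seven bits of every lane are added with one ordinary
    addition; masking keeps carries from spilling into the next lane.
    The top bit of each lane is then fixed up with XOR.
    """
    low_sum = (a & LOW_BITS) + (b & LOW_BITS)
    return low_sum ^ ((a ^ b) & HIGH_BITS)
```

For example, unpack4(packed_add_bytes(pack4([1, 2, 250, 255]), pack4([1, 2, 10, 1]))) gives [2, 4, 4, 0]: the last two lanes wrap around without disturbing their neighbours.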


What does the acronym SSE stand for?

There are many different things that use the abbreviation SSE. In computing, it can stand for Streaming SIMD Extensions, server-sent events, or simple sharing extensions. It can also be used for supply-side economics and is also an energy company in the UK.


Static versus dynamic network in SIMD interconnection network?

Static: links between processors are passive, dedicated buses that cannot be reconfigured for direct connection to other processors.

Dynamic: links can be reconfigured by setting the network's active switching elements. Dynamic networks come in different classes, i.e. single-stage and multistage.


What is Shuffle Exchange Network?

A shuffle-exchange network is an interconnection network designed for connecting processors. The model works with two functions, shuffle and exchange, which act on the binary address (s_{n-1}, s_{n-2}, ..., s_0) of a processor:

shuffle(s_{n-1}, s_{n-2}, ..., s_0) = (s_{n-2}, ..., s_0, s_{n-1})
exchange(s_{n-1}, s_{n-2}, ..., s_0) = (s_{n-1}, s_{n-2}, ..., ~s_0)

For example, we can sum integers with a shuffle-exchange network using the algorithm below. Note that in the address notation above n is the number of address bits, whereas in the algorithm n is the number of processors.

summation (SIMD-PS)
begin
  for i = 1 to log n do
    for all p_j where 0 <= j < n do
      shuffle(a_j)
      b_j := a_j
      exchange(b_j)
      a_j := a_j + b_j
    end for
  end for
end

This model of interconnection network belongs to the SIMD (single instruction, multiple data) class of Flynn's taxonomy and is useful for parallel algorithms.
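The summation algorithm can be simulated sequentially in Python. The sketch below assumes the number of processors is a power of two and models each "for all" step as a loop over a list, where index j is processor j; the function names shuffle, exchange, and shuffle_exchange_sum are choices made for this example.

```python
def shuffle(j: int, bits: int) -> int:
    """Rotate the bits-wide address j left by one position."""
    top = (j >> (bits - 1)) & 1
    return ((j << 1) & ((1 << bits) - 1)) | top


def exchange(j: int) -> int:
    """Flip the least significant address bit."""
    return j ^ 1


def shuffle_exchange_sum(values):
    """Simulate the summation on len(values) processors.

    Returns the final a_j of every processor; after log2(n) rounds
    each of them holds the sum of all the inputs.
    """
    n = len(values)
    bits = n.bit_length() - 1
    if n == 0 or (1 << bits) != n:
        raise ValueError("number of processors must be a power of two")
    a = list(values)
    for _ in range(bits):
        # shuffle(a_j): processor j sends a_j to processor shuffle(j).
        shuffled = [0] * n
        for j in range(n):
            shuffled[shuffle(j, bits)] = a[j]
        # b_j := a_j, then exchange(b_j): b_j moves to processor j XOR 1.
        b = [0] * n
        for j in range(n):
            b[exchange(j)] = shuffled[j]
        # a_j := a_j + b_j on every processor.
        a = [shuffled[j] + b[j] for j in range(n)]
    return a
```

With eight processors holding 1 through 8, shuffle_exchange_sum([1, 2, 3, 4, 5, 6, 7, 8]) leaves the total 36 on every processor after three rounds.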