[Pc_Support] Re: [LeapList] CISC vs RISC
Bryan J. Smith
b.j.smith at ieee.org
Fri Jan 13 15:26:55 EST 2006
George Laiacona <glaiacona at aikencountysc.gov> wrote:
> Can anyone give a summary of CISC vs RISC processing power,
> clock rates, etc.,
Complex Instruction Set Computing (CISC) and Reduced
Instruction Set Computing (RISC) are rather abstract terms
that don't map cleanly onto real-world designs. Some RISC
designs have CISC features and some CISC designs have RISC
features.
Probably the most bloated CISC is IA-32 (x86), especially
with its sprawling set of extensions. Because of that
complexity, timing and scheduling are very, very difficult
to do. Intel hasn't designed a new IA-32[e] core since the
1995 Pentium Pro. AMD has not modified its IA-32/x86-64
core since the original 1999 design.
Probably the most anal RISC was AXP (Alpha), which only
worked on 32-bit or 64-bit data (and even lacked 8- or 16-bit
loads until the 21164A), and only added 5 extensions to avoid
holding up timing/pipes.
CISC is a 1970s concept.
RISC is a 1980s concept.
In the 1970s, physicists and engineers did little other than
design transistors and substrates. Computer scientists
designed combinational boolean logic (CBL) to do things,
including the first microprocessor, the Intel 4004. As such,
its instruction set was designed by and for programmers.
Hence CISC.
In the 1980s, physicists and engineers started to realize the
limitations of CISC at the digital gate level. Variable
instruction lengths and flexible operands led to massive
inefficiency in MEAGs/one-hot/selects and other control
logic. Berkeley RISC (the father of SPARC) and Stanford
MIPS showed that using a fixed instruction size (typically
32-bit), with instructions of a shorter fetch-decode-execute
timing on fixed operands (typically select registers, instead
of various registers/memory) could result in much higher
performance than CISC -- even though several instructions
were required where CISC needed only a few.
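
To make the fixed-size point concrete, here is a minimal
sketch in C of decoding one 32-bit MIPS R-type instruction
word. The field layout is the standard MIPS32 R-type format;
the names rtype_fields and decode_rtype are made up for
illustration and are not from any real toolchain.

```c
#include <stdint.h>

/* Fields of a MIPS32 R-type (register-to-register) instruction.
   The struct and function names are hypothetical. */
typedef struct {
    uint8_t opcode; /* bits 31..26 */
    uint8_t rs;     /* bits 25..21, first source register  */
    uint8_t rt;     /* bits 20..16, second source register */
    uint8_t rd;     /* bits 15..11, destination register   */
    uint8_t shamt;  /* bits 10..6,  shift amount           */
    uint8_t funct;  /* bits 5..0,   ALU function code      */
} rtype_fields;

/* Every field sits at a fixed bit position, so decoding is a
   handful of shifts and masks that never depends on having
   parsed earlier bytes first. */
static rtype_fields decode_rtype(uint32_t word)
{
    rtype_fields f;
    f.opcode = (uint8_t)((word >> 26) & 0x3F);
    f.rs     = (uint8_t)((word >> 21) & 0x1F);
    f.rt     = (uint8_t)((word >> 16) & 0x1F);
    f.rd     = (uint8_t)((word >> 11) & 0x1F);
    f.shamt  = (uint8_t)((word >> 6)  & 0x1F);
    f.funct  = (uint8_t)(word & 0x3F);
    return f;
}
```

For example, decode_rtype(0x012A4020) is "add $t0, $t1, $t2":
opcode 0, rs 9, rt 10, rd 8, shamt 0, funct 0x20. A
variable-length encoding like x86 has to work out where each
instruction ends before it can even begin decoding the next
one.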
Physicists and engineers also had an ulterior motive beyond
efficiency: they wanted RISC to cut the sheer design effort,
especially timing. The simpler the logic, the easier
it is to time and execute. That matters even more in today's
superscalar architectures, where instructions are not only
pipelined one after another, but multiple pipes are
executing their own pipelines at any time.
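
The "multiple pipes" idea can be sketched at the C level. The
two functions below compute the same sum; the second keeps two
independent accumulators, so a superscalar core has two
dependency chains it can, in principle, feed to separate
pipes. This is only an illustration under that assumption.
The function names are made up, and whether the second form
is actually faster depends on the compiler and the chip (an
optimizing compiler may well do this transformation itself).

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: each add must wait for the previous add. */
static uint64_t sum_one_chain(const uint32_t *v, size_t n)
{
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Two accumulators: s0 and s1 never depend on each other
   inside the loop, so the two adds are independent work. */
static uint64_t sum_two_chains(const uint32_t *v, size_t n)
{
    uint64_t s0 = 0, s1 = 0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += v[i];
        s1 += v[i + 1];
    }
    if (i < n)          /* odd element count: pick up the last one */
        s0 += v[i];
    return s0 + s1;
}
```

Both return the same value for any input, since unsigned
integer addition can be reordered freely.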
The key component required for the adoption of RISC was the
compiler. Computer scientists would not accept something
that required them to write 2-10x as many instructions by
hand as CISC did. But compilers were commonplace,
so physicists/engineers had a solution. In fact, I
_regularly_ have to remind people (especially people who
wrote assembler in the 80s) that it's virtually _impossible_
to "out-smart" the instructions from an optimizing C compiler
with direct assembly in today's superscalar CISC (let alone
RISC) designs.
There are newer, even "more RISC" designs like the Very Long
Instruction Word (VLIW), as in Transmeta's 128-bit
VLIW, which reduces fetch/decode even more and almost
completely removes any requirement for it. It makes IC
design even easier and more flexible, and took "binary
translation" to a whole new level beyond what the AXP RISC
could do.
Intel's IA-64 is sometimes referred to as a 128-bit VLIW, but
it's _nothing_ like Transmeta's design. It uses 41-bit RISC
instruction words, grouped at compile time into bundles of 3
instruction words (plus 5 bits), known as Explicitly Parallel
Instruction Computing (EPIC). The idea was that EPIC could keep
the superscalar pipes fuller than the typical 40-50% and
60-70% of CISC and RISC, respectively. The problem with EPIC
in the first two IA-64 Itanium revisions is that compile-time
optimizations are not enough, and Branch Predication
(executing both paths a branch could take and discarding the
unneeded result) was worse than a 95% (19 out of 20 times)
accurate Branch Predictor.
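
The bundle arithmetic is 3 x 41 + 5 = 128 bits. As a rough
sketch in C, assuming the documented IA-64 layout (5-bit
template in bits 0..4, then slots at bits 5..45, 46..86 and
87..127) and holding the bundle as two 64-bit halves, packing
and unpacking look like this. The type and function names are
invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* A 128-bit bundle as two 64-bit halves (hypothetical type). */
typedef struct {
    uint64_t lo; /* bits 0..63   */
    uint64_t hi; /* bits 64..127 */
} bundle128;

#define SLOT_MASK ((UINT64_C(1) << 41) - 1) /* 41 one-bits */

/* Pack a 5-bit template and three 41-bit slots into a bundle. */
static bundle128 bundle_pack(unsigned tmpl, uint64_t s0,
                             uint64_t s1, uint64_t s2)
{
    bundle128 b;
    s0 &= SLOT_MASK; s1 &= SLOT_MASK; s2 &= SLOT_MASK;
    /* Slot 1 straddles the halves: low 18 bits land in lo,
       the remaining 23 bits land at the bottom of hi. */
    b.lo = (uint64_t)(tmpl & 0x1F) | (s0 << 5) | (s1 << 46);
    b.hi = (s1 >> 18) | (s2 << 23);
    return b;
}

static unsigned bundle_template(bundle128 b)
{
    return (unsigned)(b.lo & 0x1F);
}

/* Extract slot 0, 1 or 2 as a 41-bit value. */
static uint64_t bundle_slot(bundle128 b, unsigned slot)
{
    assert(slot < 3);
    switch (slot) {
    case 0:  return (b.lo >> 5) & SLOT_MASK;
    case 1:  return ((b.lo >> 46) | (b.hi << 18)) & SLOT_MASK;
    default: return (b.hi >> 23) & SLOT_MASK;
    }
}
```

The template only tells the hardware which kind of execution
unit each slot needs and where parallel groups end. Choosing
what goes into the slots is the compiler's job, which is why
compile-time scheduling quality matters so much for EPIC.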
IA-64 Itanium3 is retrofitting more traditional RISC design
features, largely those obtained from AXP licensure/ownership. This
includes the same binary translation approach used by AXP and
Transmeta to run x86 (as the IA-64 hardware x86 emulation
support is slower than binary translation in software).
> and why this might matter to a *nix installation?
> (Bryan probably could.)
Software is software, including OSes.
C code is C code, and today virtually everything boils down
to C.
Whether the C compiles into CISC-, RISC- or EPIC-like
instructions matters little.
99.99% of programmers learn all they need to know about
optimizing software, system organization and "raw programmer
interfaces" from 3rd Generation Language (3GL) C-level code.
You don't want to touch assembly unless you want to spend 2+
years learning the ins-n-outs of how a _specific_
multi-pipelined, superscalar architecture works to write
efficient code for it. I can't stress that enough. Even *I*
can't and don't want to code in 2GL assembler other than for
a bootloader or other low-level code (possibly an occasional
bit of in-line assembler), as I can _not_ optimize and
schedule like today's C compilers.
Having an optimizing C compiler is like having the
engineers who designed the architecture write your
assembler code, building it to schedule and time separate
tasks fairly efficiently. Learning how to optimize in C --
especially for those using 4GL languages (Java, .NET, etc.)
-- will by far bring the greatest improvement in their code
writing.
That's especially true in the new arena of multiple cores
*AND* multiple threads over those cores, scheduled
_dynamically_ by the chip's overseeing logic for all cores.
God knows, just knowing the POSIX standards for writing
threaded C code goes a very, very long way towards
performance.
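
As a minimal sketch of what threaded POSIX C looks like, the
code below splits an array sum across a fixed number of
threads using pthread_create and pthread_join. The worker
count, the work_slice struct and the function names are
assumptions chosen for illustration. Real code would size the
worker pool to the machine.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_WORKERS 4 /* arbitrary choice for this sketch */

/* One thread's share of the array, plus its result. */
typedef struct {
    const uint32_t *data;
    size_t begin, end;   /* half-open range [begin, end) */
    uint64_t partial;
} work_slice;

/* Thread entry point: sum one slice into its own struct, so
   no locking is needed. */
static void *sum_slice(void *arg)
{
    work_slice *w = arg;
    uint64_t s = 0;
    for (size_t i = w->begin; i < w->end; i++)
        s += w->data[i];
    w->partial = s;
    return NULL;
}

/* Returns 0 on success, or the pthread_create error code. */
static int parallel_sum(const uint32_t *data, size_t n, uint64_t *out)
{
    pthread_t tid[NUM_WORKERS];
    work_slice w[NUM_WORKERS];
    size_t chunk = n / NUM_WORKERS;
    int started = 0, rc = 0;

    for (int t = 0; t < NUM_WORKERS; t++) {
        w[t].data = data;
        w[t].begin = (size_t)t * chunk;
        /* The last worker also takes the remainder. */
        w[t].end = (t == NUM_WORKERS - 1) ? n : w[t].begin + chunk;
        w[t].partial = 0;
        rc = pthread_create(&tid[t], NULL, sum_slice, &w[t]);
        if (rc != 0)
            break;
        started++;
    }

    uint64_t total = 0;
    for (int t = 0; t < started; t++) {
        pthread_join(tid[t], NULL); /* wait, then collect */
        total += w[t].partial;
    }
    if (rc != 0)
        return rc;
    *out = total;
    return 0;
}
```

Build with something like "cc -O2 -pthread". The OS and the
chip decide which core runs which thread. The C code only
says what work is independent.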
--
Bryan J. Smith Professional, Technical Annoyance b.j.smith at ieee.org http://thebs413.blogspot.com
----------------------------------------------------
*** Speed doesn't kill, difference in speed does ***