Core Microarchitecture
Please keep in mind that we are hardware experts, not software, so questions on virtualization, security and multi-threading optimizations are outside of our realm.
The three of us are located in Oregon, on the West coast of the USA, and will be responding to your questions and comments on a daily basis. If for some reason you have no questions for us, I'd be interested in your response to a couple of my own:
- What is more important, a processor having particular architecture features or a processor that has the best performance?
- How do you use information displayed by some hardware monitoring programs such as processor temperatures or voltages?
Related Stories
We plan on covering relevant issues around business clients with topics ranging from purchasing considerations, refresh, feature selection all the way through manageability, security, virtualization and everything in between. We'll talk about the good, the bad and even the ugly when it comes to business clients.
Our plan is to put out some topics that we hope will be of interest to the Slashdot community and provide insight into the technology elements and business processes that drive decisions. All three of us are based in the United States and will be posting and responding as frequently as we can. Once the week ends we'll be passing the torch to some of our coworkers who will drive discussion on next week's topic, mobility and wireless.
First, forget Dual Core. Or even Quad Core. Here's a PC Pro story headlined Intel finally demos 80-core processor. There has been talk about this ultra-mighty processor in the past, but this is its first public unveiling. No, it won't run Windows (or any known desktop Linux distro), but still: 6 GHz, 2 teraflop performance. That'd be nice to have around the shop purely for bragging rights, wouldn't it? Surely we can come up with *some* kind of "practical" use for the thing that'll get our bosses to spring for one, right? Worth a try....

Your questions
(Score:3, Interesting)
To answer your questions from my perspective (I'm, of course, no one special, just an average programmer):
* What is more important, a processor having particular architecture features or a processor that has the best performance?
We're long past the phase of coding large pieces of code in assembler, and have moved to higher-level programming platforms. As such, and with all the software around built for the existing architecture, it really benefits us more to improve the performance of the existing features than to constantly introduce new SIMD extensions and so on.
The need for new features arises when you can't squeeze more performance out of the existing architecture. Isn't this, after all, why NetBurst died: it's not as if the programmers worldwide suddenly woke up thinking "damn it, I hate serial programming, instead I want to deal with thread concurrency issues, deadlocks, race conditions and all that fun! Give us multicore!"
But... as multicore is the best way to scale up, then so be it. Same for SSE: it's NOT fun having to implement more and more branches in our code (does the CPU have SSE3? then SSE3 code, else if it has SSE2, SSE2 code, else if it has 3DNow!, then 3DNow! code, etc.) to take advantage of new CPUs.
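The branching described above can be sketched as a small dispatch table. This is only an illustration under stated assumptions: the feature labels ("sse3", "sse2", "3dnow") and the `dot_*` functions are hypothetical stand-ins, and the "tuned" variants simply call the portable fallback, since the point is the selection logic rather than real SIMD code.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Vector = Sequence[float]
DotFn = Callable[[Vector, Vector], float]


def dot_generic(a: Vector, b: Vector) -> float:
    """Portable fallback that any CPU can run."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical stand-ins for hand-tuned code paths. Real ones would use
# the corresponding instructions; here they only mark which path was chosen.
def dot_sse3(a: Vector, b: Vector) -> float:
    return dot_generic(a, b)


def dot_sse2(a: Vector, b: Vector) -> float:
    return dot_generic(a, b)


def dot_3dnow(a: Vector, b: Vector) -> float:
    return dot_generic(a, b)


# Ordered from most preferred to least preferred, mirroring the
# "SSE3, else SSE2, else 3DNow!" chain in the comment.
CANDIDATES: List[Tuple[str, DotFn]] = [
    ("sse3", dot_sse3),
    ("sse2", dot_sse2),
    ("3dnow", dot_3dnow),
]


def select_dot(cpu_features: Iterable[str]) -> DotFn:
    """Pick the first candidate whose feature label the CPU reports."""
    features = set(cpu_features)
    for flag, impl in CANDIDATES:
        if flag in features:
            return impl
    return dot_generic


if __name__ == "__main__":
    chosen = select_dot({"mmx", "sse2"})
    print(chosen.__name__, chosen([1.0, 2.0], [3.0, 4.0]))
```

Every new extension adds one more row to the table and one more code path to write and test, which is the maintenance cost the comment is complaining about.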
This is why Itanium was such a dangerous step, and ultimately failed. It may have had an incredible architecture, but for programmers worldwide, it meant dropping everything we had worked long years to build, porting more or less from scratch, and dealing with the quirks of a totally new CPU architecture.
* How do you use information displayed by some hardware monitoring programs such as processor temperatures or voltages?
Well, if they are really off target, I call tech support
Power consumption vs Specialization
(Score:2)(Last Journal: Saturday March 31, @02:05PM)
Big L2 vs each core having its own L2 + a Big L3
(Score:1, Redundant)
Also, why did you make the dies in the quad-core CPU use the same FSB to talk to each other that the CPUs need to use to get to RAM and the chipset? Why not have a bus that is used just for the dual-core to dual-core traffic?
alternative instruction sets
(Score:2)
it would sure shake things up.
Ah, yes, I am quite excited to see this article
(Score:1)
1. For the most part, speed. That is not to say SSE* is worthless, as it's not, but, as someone above posted, it makes the programmer's life more difficult having different code paths, etc. -- Personally I run a Core 2 Duo box @ 3.4GHz with a 15K SCSI drive -- going back to the stock 2.4GHz of the E6600 feels slow, and you can definitely tell when using the PC; at least I can. I am somewhat picky about that aspect of my PC builds, but with this setup I am quite pleased really.
2. Hardware monitoring beyond temps and fan speeds is only really useful when initially setting up a system and seeing how fast you can overclock it, at least in a desktop environment. BUT having as many details as possible, with high accuracy (like 0.01V or so; better would be nice), really helps in this area. (Now I am talking about all kinds of voltages here, not just Vcore, so that is probably more to do with the BIOS & motherboard manufacturers.)
So I know this is going to be a hot topic, but can you please explain as much as you can about the new CSI (?) interconnect system, which works similarly to HyperTransport and is supposed to debut on the chip revision after Penryn (2nd-gen 45nm)?
Anyway, I really think you guys did an amazing job with the Core 2 microarchitecture; it was better than I thought it was going to be. (Too bad the P4 didn't have the same story, but oh well.)
Progress
(Score:2, Insightful)(http://www.mithral.com/~beberg/)
ATI and nVidia are launching ~250 watt cards, which is causing people to pause, because if you're in the room with one you'll suffer from heat exhaustion. Intel isn't doing much better for the desktop, frankly.
These beasts have got to stop. Integrate a graphic core and some RAM, make it under 10W and call it a day.
Delivered performance is all that matters
(Score:2)(http://tom.womack.net/)
On the other hand, it's an interesting brain exercise to write code which fully exploits SIMD instructions, so from that point of view I'd quite like exotic instructions; then again, this is code I'm writing for myself rather than for distribution, and we still have to distribute code that runs on Athlon chips without SSE2.
Performance monitoring
(Score:2)(http://tom.womack.net/)
Temperature, again, is really only useful to improve my confidence that the people I bought my computer from assembled it correctly, and that the ventilation system hasn't broken; if the Intel boxed cooler isn't capable of keeping the processor it came with adequately cooled, the air-flow in the box is probably completely screwed up, and it's likely that I ought to cycle down and get a new fan to replace the one that has doubtless died. The program that came with my D850GB board in 2001, which displayed a window and sounded alarms when temperatures went out of spec, was ideal for that; no need for real-time displaying of the temperature.
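The "alarm only when something goes out of spec" behaviour described above can be sketched in a few lines. This is a rough illustration, not how any particular vendor tool works: the sensor names, readings and limits below are made-up example values, and how the readings are obtained from the hardware is deliberately left out.

```python
from typing import Dict, List, Tuple


def out_of_spec(readings: Dict[str, float],
                limits: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the sorted names of sensors whose reading is outside (low, high)."""
    alarms = []
    for name, value in readings.items():
        low, high = limits[name]
        if not (low <= value <= high):
            alarms.append(name)
    return sorted(alarms)


if __name__ == "__main__":
    # Hypothetical limits and readings, chosen only for the example.
    limits = {"cpu_temp_c": (0.0, 65.0), "vcore_v": (1.20, 1.35)}
    readings = {"cpu_temp_c": 71.0, "vcore_v": 1.30}
    for sensor in out_of_spec(readings, limits):
        print("ALARM:", sensor, "=", readings[sensor])
```

Nothing is shown while every sensor stays inside its limits, which matches the "no need for real-time displaying" preference.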
My thoughts
(Score:2)(Last Journal: Tuesday February 06, @10:13AM)
I was a die-hard AMD fan until the Core 2 CPUs came out, though the Pentium M/Core series did make me waver a bit beforehand. Good job.
That being said - Features on a CPU are only relevant in that they improve performance, and when being named, they provide a way to ascertain in what areas performance boosts are available.
With that, the features I'd like to see are
(1) Someone else mentioned alternative instruction set emulation, that would be nice. Especially with multi-core. It could be useful to dynamically switch one of the cores to "ARM", "SPARC" or maybe even "Itanium" mode for testing and debugging. At least as far as the opcodes go. I suspect the rest of the system would have to be emulated in software and the other CPUs would have to act as information gateways between the emulating CPUs due to system hardware design differences.
(2) Up to this point, AMD has been somewhat belatedly adding Intel instruction sets (SSE[x], MMX, etc.) to their chips. Intel has not been adding the 3DNow instruction sets to their chips. In comparisons, I've seen several apps where, if it didn't use 3DNow, Intel had the lead, but when 3DNow was added AMD pushed well ahead. So, I'd really like to see 3DNow on an Intel chip if they can get AMD to license it.
(3) Last but most definitely not least - a continuation of the efforts to reduce CPU heat production.
Why multicore instead of speed increase?
(Score:1)
I mean, where are the 7GHz processors? By now you'd think we wouldn't have an effective speed cap around the 3.5GHz mark [the fastest I can usually find them].
Seeing as the x86 instruction set is an emulation in prefetch, why not give us access to the ACTUAL instruction set? I'd gladly program that instead.
Why is there so much fuss from everyone about "low-power" desktop computers? It's not on a battery, it really does not matter if the computer takes 5 watts or 500 watts.
PowerPC features
(Score:2)
(This could be done in a backward compatible way. Agree with AMD that having two REX prefixes in an instruction is illegal, instead of just ignoring any REX prefix but the last. Then in a future processor two REX prefixes could be used to specify 32 registers of each kind instead of 16).
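To make the REX idea concrete, here is a small sketch of how a register number is assembled today and how a second prefix could, hypothetically, contribute one more bit. The existing part follows the x86-64 encoding (a REX byte is 0100WRXB, and REX.R extends the 3-bit ModRM.reg field to 4 bits). The `extra_rex` handling is purely an assumption made up for this illustration; no shipping processor decodes instructions this way.

```python
from typing import Dict, Optional


def is_rex(byte: int) -> bool:
    """A REX prefix is any byte in the range 0x40-0x4F (binary 0100WRXB)."""
    return (byte & 0xF0) == 0x40


def decode_rex(byte: int) -> Dict[str, int]:
    """Split a REX prefix into its W, R, X and B bits."""
    if not is_rex(byte):
        raise ValueError("not a REX prefix: 0x%02X" % byte)
    return {
        "W": (byte >> 3) & 1,  # 64-bit operand size
        "R": (byte >> 2) & 1,  # extends ModRM.reg
        "X": (byte >> 1) & 1,  # extends SIB.index
        "B": byte & 1,         # extends ModRM.rm or SIB.base
    }


def reg_index(modrm: int,
              rex: Optional[int] = None,
              extra_rex: Optional[int] = None) -> int:
    """Register number selected by the ModRM.reg field.

    Without a prefix: 3 bits (registers 0-7).
    With one REX prefix: REX.R supplies bit 3 (registers 0-15).
    With `extra_rex`: HYPOTHETICAL, its R bit supplies bit 4 (registers 0-31).
    """
    index = (modrm >> 3) & 0b111
    if rex is not None:
        index |= decode_rex(rex)["R"] << 3
    if extra_rex is not None:
        index |= decode_rex(extra_rex)["R"] << 4  # assumption, not real x86-64
    return index


if __name__ == "__main__":
    modrm = 0xC8  # binary 11 001 000: reg field = 1
    print(reg_index(modrm))              # no prefix
    print(reg_index(modrm, 0x4C))        # one REX prefix with R set
    print(reg_index(modrm, 0x4C, 0x44))  # hypothetical second prefix with R set
```

The proposal only works if today's processors reject a doubled prefix rather than silently ignoring the first one, which is why the comment asks for it to be declared illegal first.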
More PowerPC features
(Score:2)
and more PowerPC features...
(Score:2)
More instructions are generally *not* needed
(Score:1)(http://www.icu-project.org/)
Not all programs are compute intensive. A fair amount are bound to memory bandwidth and memory latency. Those new hybrid hard drives with flash memory look promising to reduce the bandwidth problems between the hard drive and RAM, but I haven't seen much innovation on improving the speed between RAM and the CPU. L1/L2 caches only do so much, and the CPU speeds seem to have far outpaced the RAM performance. More cores means more of a load on the memory.
I'm interested in what kind of memory bandwidth improvements are on the horizon. Is the memory controller going to be integrated on the next CPU, like AMD Athlon X2? Will it use XDR memory, like the Cell processor? What kind of memory improvements will we see in the future?
BTW a lot of people don't monitor their CPU voltage or temperature. The room thermostat, computer fans and heat sinks take care of the temperature, and the UPS takes care of the voltage. As long as the computer is doing its work, I can continue playing *cough* working on important things.
Memory mechanisms.
(Score:2)(http://www.sfu.ca/~rdickie)
My Answers
(Score:2)(Last Journal: Thursday August 07, @03:46AM)
* What is more important, a processor having particular architecture features or a processor that has the best performance?
I personally believe that performance only becomes a factor if I actually notice a slowdown because of it. I don't think I have with any modern processor. I think performance and architecture features would go hand-in-hand though; if you're adding features for reasons other than performance in specific situations, then I'd like to know why... you probably mean "general performance" as in ramping up the clock speed vs "specific performance" as in focusing on optimizing the architecture to get the most out of it. I'm most interested in balance; I want a processor to be smartly designed, to include all the architecture features that I want (mainly 64 bit and reasonably optimized for video encoding and graphics processing) while having a large cache and to do it all at the fastest clock speed possible for the process and the features. I don't want to tell you where that balance is, I want you to figure it out for me. That's your job
* How do you use information displayed by some hardware monitoring programs such as processor temperatures or voltages?
This is a harder question. I run AMD currently, as I've said before, and I don't feel I ever need to look at the temps or voltages. The X2s run fairly cool (just try to find a heatsink/fan for the AM2 socket... no one makes them!) so it's never something I've had to worry about. And I don't overclock; if I did it'd be very conservative.
So the answer is that I don't use that information. I look at it once to make sure it's under, oh, say 50 deg. C, and I never want to see it again. My criteria are that my processor a) keeps running stably, and b) doesn't catch fire. Anything outside that I don't even want to think about.
Thanks for your interest in the Slashdot community... keep on being honest with us
Question
(Score:1)(http://pg302.sourceforge.net/)
Well, it depends. It is quite obvious to me that there are two different camps now: low-consumption processors made for laptops and mobile devices, and multi-core processors for servers and gaming. The time has come when fragmentation of this market should happen again. In a way it already did happen with the wide use of the ARM architecture, but I expect further fragmentation of the desktop market in the near future. There is no need to have a dual-core processor for simple word processing, right? I expect that there will be more fragmentation here in the future. I think there is no need for adding more instruction sets to processors (even though that could be fun for programmers). I believe in cross-compiling at a higher level. That's why I stick with GCC, which supports a wide range of microprocessors. So if adding a new instruction set would hurt performance or power consumption, don't do it! Unfortunately, the desktop market is driven by the "FORCE" of Microsoft, which doesn't take cross-compiling very seriously and just creates demand for new performance boosts in certain areas, which eventually means new instruction sets and more backward-compatibility junk in the microprocessor.
How do you use information displayed by some hardware monitoring programs such as processor temperatures or voltages?
Programmers unfortunately don't want to do anything about this subject as long as their program works. I believe there will be some great movement in this area in connection with process importance mapping and frequency scaling, such as the nice value on Unix machines. Smart scheduling and CPU voltage regulation should happen in the OS in the not very distant future and should become some kind of standard. Not only do we need information on voltages; it is even more important to get current consumption, for a start on large CPUs. If I take this further, I think the OS should give the user the possibility to easily change the importance of a certain process, for example from the top of the window. The thing is, this is an OS task and not a regular programmer's task. The problem here, of course, is standardization.
Feature Request: more cache control
(Score:1)
end user
(Score:2)
This is a tough area.
(Score:2)(http://slashdot.org/ | Last Journal: Wednesday March 07, @03:14AM)
(IIRC, for very basic opcodes, a truly pure RISC chip is still generally around four times faster than a RISC/CISC hybrid, clock-cycle for clock-cycle.)
Parallelisms within the chip architecture - multi-core, SMP, or something else along those lines - have also been an area filled with frantic activity. The best you can scale SMP is about 16-way. The best anyone actually scaled an Inmos T400 (and still got performance benefits) was 1024-way. You'd need a 16-way SMP configuration of Intel's 80-core wafers to scale better, core-for-core.
Then, there's the issue of what you actually want in the CPU at all. Where a function can be provided better using processor-in-memory architectures, why bother optimizing the CPU implementation of that function? The whole reason for the hybrid design was to avoid over-optimizing the wrong stuff. If you extrapolate that to what can be offloaded, doesn't it make more sense to de-optimize the offloadable and use the real-estate gained to improve performance in areas that the CPU absolutely has to do well in?
I guess that last bit sums up my question for Intel's gurus and genius extraordinaires - offloading (whether with PIM or some other method) requires something to offload to. Those things don't generally exist because there's never been anything to offload onto them from. Other chicken-and-egg-and-trex problems exist, for much the same reason. How do you optimize your core microarchitecture from a technologically-correct standpoint, when the other technology required for you to do so may never exist until you have done so?
(All other optimization problems reduce to this, because all other optimization problems will have "better" technical solutions than the ones you can directly get to. Just ask de Bono.)
core naming convention
(Score:2)
At first I thought core meant "core" like number of processing cores. Then it seemed like 2 meant how many cores there were. But that's completely wrong on both counts. It's the Duo. But four cores makes Quad instead of Quattro? I guess Audi would have sued you. I suppose you'd better be careful when you get to 8 cores or you might step on a Spider-Man 2 character, sigh.
On top of all that weirdness, Sony uses nearly the same naming convention for their memory sticks.
I'm not offering a better idea, I really don't know. I just remember it being really clear that Pentium 1 through Pentium 4 offered improved performance and clock speeds with increasing number. Maybe Pentium 5 seemed silly to somebody in the marketing/branding department but it would have been less confusing. I liked the "m" suffix for mobile, and hey an "s" for server would have been nice.
Fair's fair: a certain three letter competitor's naming choices are not much better.
Cache Coherency in Dual Chip (8 Core) Core 2
(Score:2)(http://logicnazi.org/)
In particular does it cost any extra FSB bandwidth to maintain cache coherency or is this somehow accomplished by simply listening to the reads and writes from the other chip?
Thanks
VT-D Technology
(Score:1)
How about both
(Score:2)
There would have to be some massive jump in performance to switch back to big iron with fast/expensive CPUs.
On this. Are we likely to see any reconfigurable CPUs [wikipedia.org] from Intel in the near future? Wouldn't that be an architectural feature which improves performance?
New SSE4 instructions
(Score:1)(Last Journal: Monday April 09, @11:16AM)
Productivity applications et al
(Score:1)
80 cores, 6Ghz, Three TF's??
(Score:1)
Will it unlock the secrets of the universe?
Will it cure cancer?
Will it get an old fart a job?
Uhhh, will it make my mail server run faster, will it break absurdly strong encryption so the gummermint can spy on us better?
Will great aunt Tilly be able to remember how to get to her email account without calling her grandson?
Will the airlines run on time?
160 cores, 12Ghz, 6TF's, but gas is still $4.00 per gallon.
More cores = good -- How about more FSBs
(Score:1)
I see that Intel's near-future offerings will have 6MB caches.
Of course we can't have a chip with thousands of pins as we duplicate traditional parallel data and address buses. One obvious choice is to use PCI-Express technology (high-speed serial lines) to crunch those 64 data bus lines into maybe 16 or 32 lines. Another solution is to move at least part of the memory controller into the CPU core and do this multi-bus work within the CPU, again most likely using high-speed serial lines.
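As a rough sanity check on the "64 parallel lines versus 16 or 32 serial lines" idea, the peak numbers can be worked out directly. The figures used are illustrative assumptions: a 64-bit front-side bus at 1066 MT/s, and first-generation PCI Express signalling at 2.5 GT/s per lane with 8b/10b encoding. The function names are made up for this sketch, and only peak bandwidth in one direction is compared; latency and protocol overhead are ignored.

```python
def parallel_bus_mb_s(width_bits: int, mega_transfers_per_s: float) -> float:
    """Peak bandwidth of a parallel bus in MB/s (one transfer moves width_bits)."""
    return width_bits / 8 * mega_transfers_per_s


def serial_lanes_mb_s(lanes: int,
                      giga_transfers_per_s: float,
                      payload_bits: int = 8,
                      line_bits: int = 10) -> float:
    """Peak one-direction bandwidth of serial lanes in MB/s.

    Each lane carries one bit per transfer; with 8b/10b encoding only
    8 of every 10 bits on the wire are payload.
    """
    raw_mbit = lanes * giga_transfers_per_s * 1000
    return raw_mbit * payload_bits / line_bits / 8


if __name__ == "__main__":
    print("64-bit FSB @ 1066 MT/s:", parallel_bus_mb_s(64, 1066), "MB/s")
    for lanes in (16, 32):
        print(lanes, "lanes @ 2.5 GT/s:", serial_lanes_mb_s(lanes, 2.5), "MB/s")
```

Under these assumptions 32 lanes land close to the parallel bus and 16 lanes at roughly half of it. Note that a lane is not a single wire: each direction of a lane is a differential pair, so the pin savings are smaller than the lane count alone suggests.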
Only useful with compiler support
(Score:2)(http://minion.sourceforge.net/)
Re:Tell me, Mr. Anderson:
(Score:2)(http://dev.nul/ | Last Journal: Thursday April 22, @01:58PM)
I suppose you make half of a good point with regards to parallelization. You *can*; it's just tough to do and only addresses a subset of the potential computation problems out there. Current tools are lacking though, and I would love to see Intel step up to the plate with regards to SMP libraries/toolsets.
Re:Documentation first!
(Score:1)
Re:Three Areas
(Score:1)