How does a computer allocate memory in hardware? (pointers, stack, heap)

Yeah, Intel can dump a massive budget into anything and make it competitive. Why not make a multicore ARMv8 with a legacy Intel x86 core?
It's actually quite common for the reverse situation to exist, where there are RISC, VLIW, or other DSP cores on the same SOC as an x86, but primarily as various controllers and offload engines rather than peers.
Odds are, the ARM SOCs have more ARM or specialty coprocessors, but probably not a stripped-down x86 for offload.

The problem with having the cores as peers comes down to the different ways the systems are structured at the software and hardware level. They handle memory differently: cache coherence protocols, how they structure virtual memory, and how the different cores can view memory. If they ever operated on the same data or OS structures, serious problems would result.
There are also OS and system software differences that are not likely to take kindly to mixing, and hardware-level differences that require different interfaces to the rest of the SOC and platform.

AMD seems keen on a strategy of creating a platform that can host both ARM and x86, but has not indicated hosting both at the same time. Keeping them separate allows all the various discrepant interfaces and system requirements to be built out for each ISA on its own.
That doesn't rule out something like a single slice of silicon hosting an x86 section and an ARM section that operate as physically separate machines with unshared memory spaces, or that requires a reboot between modes so the two never run simultaneously. I dimly recall a 2-in-1 that tried something like this by having both an ARM and an x86 SOC present in the same product; there wasn't enough benefit to the awkward arrangement, and the demand needed to justify a dedicated silicon product would have to be much higher than for a PCB kludge.
 
Please stop with the personal attacks. All I am doing is asking questions for a better understanding of the subject.

You don't have to be insulting. We are all adults here.
It is impossible to answer questions that are themselves wrong. I gave you some examples; I'll give you another. How would you answer if I asked: how do you make a wooden chair fly? These are the kinds of questions you're asking.

I've made an honest attempt at answering your first batch of questions and tried to inform you about the subject matter. I'm not insulting you; I'm simply stating that your knowledge is lacking. Because it is. If you take this as insulting then there's nothing I can do to help you. You'll grow out of this stance, just like I did. And I also thought I was an adult back then. I wasn't. :)
 
@Dominik D:
You were a little harsh though, but right about the fact that when faced with such questions we either (1) don't waste time and ignore them altogether, or (2) try to give some help by providing names of books, because answering such questions is well beyond the time we allocate for forum browsing/interaction.

@Flux:
You have a lot to learn and books are the best way to do so because they'll cover a lot more ground than we have the time to.
 
@Dominik D:
You were a little harsh though...

Let me disagree: in forums directly related to my profession, such questions would actually have gotten an "RTFM" answer on the first shot.
The fact that someone asks questions does not mean he deserves an answer; a question should show some work/research behind it, some attempt at getting an answer by yourself. Sometimes, when in doubt, professionals use forums, but they don't ask these types of questions...
 
It's actually quite common for the reverse situation to exist, where there are RISC, VLIW, or other DSP cores on the same SOC as an x86, but primarily as various controllers and offload engines rather than peers.
Odds are, the ARM SOCs have more ARM or specialty coprocessors, but probably not a stripped-down x86 for offload.

The problem with having the cores as peers comes down to the different ways the systems are structured at the software and hardware level. They handle memory differently: cache coherence protocols, how they structure virtual memory, and how the different cores can view memory. If they ever operated on the same data or OS structures, serious problems would result.
There are also OS and system software differences that are not likely to take kindly to mixing, and hardware-level differences that require different interfaces to the rest of the SOC and platform.

AMD seems keen on a strategy of creating a platform that can host both ARM and x86, but has not indicated hosting both at the same time. Keeping them separate allows all the various discrepant interfaces and system requirements to be built out for each ISA on its own.
That doesn't rule out something like a single slice of silicon hosting an x86 section and an ARM section that operate as physically separate machines with unshared memory spaces, or that requires a reboot between modes so the two never run simultaneously. I dimly recall a 2-in-1 that tried something like this by having both an ARM and an x86 SOC present in the same product; there wasn't enough benefit to the awkward arrangement, and the demand needed to justify a dedicated silicon product would have to be much higher than for a PCB kludge.


Makes sense. The two architectures have vastly different interfaces (buses, MMU, etc.) and it would not be practical to have them both on the same die communicating with each other.

So AMD will just keep them separate and play the low-high squeeze game with Intel?

ARMv8 has 4 threads per core and x86 has 2. Could AMD create a new ARM core with SSE3-4 support?

A multicore 64-bit ARM with desktop-level floating-point performance would be pretty impressive.
 
ARMv8 has 4 threads per core and x86 has 2. Could AMD create a new ARM core with SSE3-4 support?
You have got to be kidding me. I know I am not a veteran here (nor do I work in IT), but it seems you need to hear more voices saying "RTFM". Or asking whether you may be a troll.
http://www.arm.com/products/processors/technologies/neon.php

Though I have to admit that some of the longer replies to the silly questions were really enlightening, making this thread very much worth keeping.
 
So AMD will just keep them separate and play the low-high squeeze game with Intel?
I don't know what AMD thinks it will do. Whatever game it can play depends on what it can actually offer, and AMD has promised roughly comparable performance for its ARM and x86 cores. If that level of performance doesn't rise above Intel's, then there is no higher ground to press down on Intel from.

ARMv8 has 4 threads per core and x86 has 2. Could AMD create a new ARM core with SSE3-4 support?
I don't know if you needed a line break between the two sentences.

I may need a source on the 4 threads for ARM, as most implementations don't go above 1; in any case, an ISA normally doesn't set a thread count.
x86 has gone up to 4 threads with Larrabee.

SSE is an x86 set of extensions, ARM is not keen on licensees messing with the actual ISA, and the two ISAs are very, very different. Selling a core that has x86 instructions as an ARM core is likely to get AMD sued if they insist that the core is not x86 for the purposes of Intel's royalties and the like.
 
I may need a source on the 4 threads for ARM, as most implementations don't go above 1; in any case, an ISA normally doesn't set a thread count.
...thanks for the laugh on a boring morning, that's funny... even if it looks a bit too much like English humor :D
 
I dimly recall some kind of 2-in-1 that tried something like this, by having both an ARM and x86 SOC present in the same product, there wasn't enough benefit to the awkward arrangement, and the necessary demand for a dedicated silicon product would need to be much higher than a PCB kludge.

Perhaps the best-known hybrid "product" with separate chips was an Amiga with a PowerPC accelerator card, where a (high-end) 68k CPU is still present, either on the original system or on the card.
Perhaps Amiga refugees STILL exist; I never used one more advanced than an Amiga 500 or 1000, but AFAIK the dual-CPU "Amiga" did run 68k and PPC applications simultaneously. An Amiga was very different from even Windows 95, though: no memory management (the original 68000 didn't include or even support an MMU), thus no memory protection whatsoever.
You've got pre-emptive multitasking, but any program just writes wherever the hell it wants, and if one misbehaves it may make everything else fail catastrophically.
Not really sure how it was on an asymmetric Amiga system; maybe there were two RAM pools, who knows. Anyway, these days even a single processor/core and/or OS with no memory protection would be deemed unfit for applications, except for microcontrollers.
 
Could AMD create a new ARM core with SSE3-4 support?
Quick answer: all AMD ARM cores already have functionality equivalent to SSE3-4. It's called NEON.

Long answer:

SSE is an x86 instruction set extension. It has standardized opcodes (binary numbers allocated for each instruction) that do not collide with the existing x86 opcode numbers. x86 is also a variable-length instruction set, meaning that each instruction can be encoded with a different number of bytes. ARM is a fixed-length instruction set (actually it has two instruction sizes, 32 bit and 16 bit). Most x86 opcodes could not be represented in the ARM instruction set and/or could not be decoded by ARM CPU decoders. Also, SSE opcode numbers would likely clash with some existing ARM instructions.
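The fixed- vs variable-length decoding point above can be sketched in C. This is a toy model, not a real decoder: `toy_len` is a hypothetical stand-in where the first byte simply states the instruction's length, whereas a real x86 decoder must inspect prefixes, the opcode, and ModRM/SIB bytes to learn it.

```c
#include <stddef.h>

/* Fixed-length (AArch64-style) decode: every instruction is 4 bytes, so
 * instruction boundaries are known up front and decoders can work on all
 * of them in parallel. */
size_t fixed_count(size_t buf_len) { return buf_len / 4; }

/* Toy stand-in for an x86-style length decoder: here the first byte of an
 * instruction simply states its total length. */
size_t toy_len(const unsigned char *p) { return (size_t)p[0]; }

/* Variable-length decode: each boundary depends on having decoded the
 * previous instruction, so the scan is inherently sequential. */
size_t variable_count(const unsigned char *buf, size_t buf_len) {
    size_t n = 0, off = 0;
    while (off < buf_len) {
        off += toy_len(buf + off);  /* must decode to find the next start */
        n++;
    }
    return n;
}

/* Demo: three toy instructions of lengths 2, 1, and 3 packed into 6 bytes. */
size_t variable_demo(void) {
    unsigned char buf[6] = {2, 0, 1, 3, 0, 0};
    return variable_count(buf, sizeof buf);
}
```

The asymmetry is the point: `fixed_count` needs no access to the bytes at all, while `variable_count` cannot find instruction N+1 without decoding instruction N first.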

The SSE instruction set is also designed around the x86 memory consistency model (http://en.wikipedia.org/wiki/Memory_ordering, http://www.realworldtech.com/arm64/5/), and the ARM memory model differs from it. Some instructions might need to behave differently than the SSE standard specifies, or might be cost-inefficient to implement.
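A minimal sketch of where the memory-model difference shows up, using C11 atomics (run single-threaded here for simplicity; in real use producer and consumer would be on different cores):

```c
#include <stdatomic.h>

static int payload;
static atomic_int ready;

/* Producer: write the data, then raise a flag with release semantics.
 * Under x86's strong (TSO) ordering the two stores stay ordered anyway,
 * so the release costs little extra; under ARM's weaker model the release
 * store is what emits the required barrier. Same C, different hardware
 * obligations. */
void publish(int value) {
    payload = value;                                        /* plain store */
    atomic_store_explicit(&ready, 1, memory_order_release); /* flag last */
}

/* Consumer: wait for the flag with acquire semantics, then read the data.
 * The acquire pairs with the release above, so the payload is guaranteed
 * visible on both memory models. */
int consume(void) {
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;  /* spin until published */
    return payload;
}
```

An SSE instruction whose specified behavior leans on x86's implicit ordering would need extra barriers to behave identically on ARM, which is the cost-inefficiency the post refers to.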

However, none of these problems actually matters, since ARM has its own SIMD instruction set, called NEON. NEON provides roughly the same functionality as SSE. In many ways NEON is actually better than SSE: it has some nice extra functionality, but it is also missing some small bits here and there.
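To make the equivalence concrete, here is a portable scalar model of the operation both ISAs do in one 128-bit instruction: NEON's `vaddq_f32` and SSE's `_mm_add_ps` each perform exactly this 4-lane single-precision add.

```c
/* Scalar model of a 128-bit, 4-lane single-precision vector add: what
 * vaddq_f32 (NEON) or _mm_add_ps (SSE) computes in one instruction. */
void vec4_add(const float *a, const float *b, float *out) {
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];   /* one lane per iteration */
}

/* Demo: {1,2,3,4} + {5,6,7,8}; returns the last lane. */
float vec4_add_demo(void) {
    float a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8}, o[4];
    vec4_add(a, b, o);
    return o[3];
}
```

The ISA-level differences are in naming, encoding, and corner cases, not in the core data-parallel operations.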

A better question would be: are 256-bit NEON instructions coming (matching the AVX1/2 vector length)? If you want to exceed Xeon Phi (or even Haswell) in float throughput (or throughput/watt), you need wider SIMD units. ARM is not good enough for these kinds of use cases until it matches Intel / IBM in vector processing performance.
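The vector-width gap can be put in numbers with a back-of-envelope peak model. The unit counts below are illustrative assumptions for the comparison, not vendor specifications:

```c
/* Back-of-envelope peak double-precision throughput per core:
 * FLOPs/cycle = (simd_bits / 64) lanes * fma_units * 2,
 * counting an FMA as a multiply plus an add. */
double peak_dp_gflops(double ghz, int simd_bits, int fma_units) {
    int lanes = simd_bits / 64;   /* 64-bit doubles per vector register */
    return ghz * lanes * fma_units * 2;
}
/* Assumed configs at 3 GHz:
 *   256-bit vectors, 2 FMA units: 3 * 4 * 2 * 2 = 48 GFLOP/s per core
 *   128-bit vectors, 1 FMA unit:  3 * 2 * 1 * 2 = 12 GFLOP/s per core
 * A 4x per-core gap from vector width and unit count alone. */
```

Until the vector side closes, clock speed and core count have to make up that multiple, which is hard within a power budget.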
 
A better question would be: are 256-bit NEON instructions coming (matching the AVX1/2 vector length)? If you want to exceed Xeon Phi (or even Haswell) in float throughput (or throughput/watt), you need wider SIMD units. ARM is not good enough for these kinds of use cases until it matches Intel / IBM in vector processing performance.
Stupid question, but in OOO cores is it strictly necessary to be "wide" to win on throughput, compared to cores that can just pair and execute multiple instructions?
The first would be a simpler execution pipe, but the latter would be more flexible; how those VLIW architectures with insane peak numbers and similar claims fared is now known and a historical anecdote. (Given all the cruft, I would expect x86 to always lose in terms of efficiency anyway.)
 
Perhaps the best-known hybrid "product" with separate chips was an Amiga with a PowerPC accelerator card, where a (high-end) 68k CPU is still present, either on the original system or on the card.
Perhaps Amiga refugees STILL exist; I never used one more advanced than an Amiga 500 or 1000, but AFAIK the dual-CPU "Amiga" did run 68k and PPC applications simultaneously. An Amiga was very different from even Windows 95, though: no memory management (the original 68000 didn't include or even support an MMU), thus no memory protection whatsoever.
You've got pre-emptive multitasking, but any program just writes wherever the hell it wants, and if one misbehaves it may make everything else fail catastrophically.
Not really sure how it was on an asymmetric Amiga system; maybe there were two RAM pools, who knows. Anyway, these days even a single processor/core and/or OS with no memory protection would be deemed unfit for applications, except for microcontrollers.


An Amiga with a PPC card still uses a single memory pool. The 68k CPU executes the 'old' code and the PPC CPU runs the RISC stuff, with both of them having access to the same memory pools (CHIP, FAST). The only thing bypassed is the original CPU sitting on the PCB of the Amiga 1200 (68EC020) or Amiga 500/600 (68000). In higher-end models you replace the whole CPU card to install the PPC accelerator.
It is fun to run CISC and RISC code at the same time :) It obviously is prone to crashes with badly written software, but if everything goes through the OS and nothing codes direct to the metal, then all is good!

From Wiki:
Kernel
Main article: Exec (Amiga)
Exec is the multi-tasking kernel of AmigaOS. Exec provides functionality for multi-tasking, memory allocation, interrupt handling and handling of dynamic shared libraries. It acts as a scheduler for tasks running on the system, providing pre-emptive multitasking with prioritized round-robin scheduling. Exec also provides access to other libraries and high-level inter-process communication via message passing. Other comparable microkernels have had performance problems because of the need to copy messages between address spaces. Since the Amiga has only one address space, Exec message passing is quite efficient.[2][3]
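The efficiency claim in that last sentence can be sketched in C. This is a toy model loosely named after Exec's PutMsg()/GetMsg(); real Exec ports use doubly linked lists and task signals, which are omitted here:

```c
#include <stddef.h>

/* Toy single-address-space message passing: because every task shares one
 * address space, a "message" is just a node linked onto the receiver's
 * port. The pointer changes hands; the payload bytes are never copied,
 * which is why Exec message passing avoids the copy cost that cross-
 * address-space microkernels pay. */
struct Message { struct Message *next; int payload; };
struct MsgPort { struct Message *head; };

void put_msg(struct MsgPort *port, struct Message *m) {
    m->next = port->head;
    port->head = m;                  /* hand over the pointer, not the data */
}

struct Message *get_msg(struct MsgPort *port) {
    struct Message *m = port->head;
    if (m) port->head = m->next;
    return m;                        /* NULL when the port is empty */
}

/* Demo: one task "sends" a message, the receiver gets the very same node. */
int msgport_demo(void) {
    struct MsgPort port = {NULL};
    struct Message msg = {NULL, 7};
    put_msg(&port, &msg);
    return get_msg(&port)->payload;
}
```

The flip side, as the posts above note, is that handing raw pointers between unprotected tasks is exactly what lets one misbehaving program corrupt everything else.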
 
Stupid question, but in OOO cores is it strictly necessary to be "wide" to win on throughput, compared to cores that can just pair and execute multiple instructions?

This isn't a stupid question at all! To a first order, they achieve the same effect: leverage existing ILP to increase performance.

SIMD execution is a huge, huge simplification for the HW implementation of parallel execution of instructions.

Consider that in the non-SIMD case, the CPU must first parse a second instruction and figure out all the dependencies for that instruction: where its operands are and when they'll commit (if applicable). It must do that in parallel with the first instruction, making sure that the apparent effects of execution are no different than if they were executed in sequence (which is not trivial in the presence of other features such as speculative execution, exceptions, MMU faults from implied accesses (consider x86 and its ability to have memory operands), etc.).

In the SIMD case, the compiler and/or application writer has simplified the problem down to just building a wider execution core (ignoring architecture-specific details like non-SIMD accesses to SIMD registers).
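The hardware-simplification argument above can be put in a toy cost model (a rough sketch, not a real issue-logic count):

```c
/* Toy bookkeeping model: to issue w scalar instructions in one cycle, the
 * front end must cross-check each instruction's source registers against
 * the destinations of every other instruction in the group, on the order
 * of w*(w-1) comparator pairs, growing roughly quadratically with width. */
int scalar_dep_checks(int w) { return w * (w - 1); }

/* A single w-lane SIMD instruction carries the same arithmetic but is one
 * instruction with one dependency check, regardless of lane count. */
int simd_dep_checks(int lanes) { (void)lanes; return 1; }
```

So quadrupling scalar issue width roughly multiplies the cross-check count by more than ten, while quadrupling SIMD width mostly just widens the datapath.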

Short answer: no, it's not necessary, but it sure makes life easier for processor designers!

(There are also other cost trade-offs to consider: energy per operation, effort involved in design and fabrication, etc.)

Hope this helps.
 
Stupid question, but in OOO cores is it strictly necessary to be "wide" to win on throughput, compared to cores that can just pair and execute multiple instructions?
You need to be wide (and have high floating-point throughput) to compete in high-performance scientific computing (http://www.top500.org/). For other markets (database/web servers, for example) you don't need wide SIMD; you need good performance/watt when running messy code.
 