3D XPoint Memory (Intel Optane)

I don't think you understand what memory-mapping a file does. On a SATA device, it allocates some RAM and copies the content of the device into that RAM when it is accessed. On any device that allows direct random access (such as NOR flash today), memory-mapping just attaches that memory at that point in your address space. When you memory-map a NOR (or, in the future, XPoint) page and access it, there is no translation of the access beyond the one that is always done for RAM; it's just a direct bus access into that device.
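
To make that concrete, here is a minimal POSIX mmap sketch (the file name and size are made up). The last line is just a plain load through a pointer; whether the kernel serves it by copying a page into RAM first (block device) or the access goes straight over the bus to the device (byte-addressable storage) is invisible to the program:

Code:
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    /* Could be a regular file on an SSD, or a byte-addressable device. */
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0) return 1;

    size_t len = 4096;  /* map a single page for the example */
    const unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return 1; }

    /* On a SATA SSD this load faults and copies the page into the page
       cache; on byte-addressable storage it can be a direct bus access. */
    printf("first byte: 0x%02x\n", p[0]);

    munmap((void *)p, len);
    close(fd);
    return 0;
}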

It's not done this way on current PCs and smartphones though.

As I said before, the only thing that the visible VFS layer of filesystems provides is a translation of human-readable paths into pointers into devices. All the other features of filesystems are completely orthogonal to this. I don't see this system ever going away; it's just too practical.

I'm not saying the current system should go away (as I said, we'll likely see the current system live on for external storage in the foreseeable future). I'm just saying that for some parts of the system, maybe we can use something different.
I'll use another example. Right now we have a mapping system inside an SSD for wear leveling (we'll still need that even with 3D XPoint memory). With some modifications we could make each file on the SSD look contiguous, and that would eliminate the need for inodes and bitmaps in a traditional file system, because in theory the controller is already doing that work.
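
Roughly the kind of indirection I mean; today the flash translation layer already keeps something like this per logical block, purely for wear leveling (a made-up sketch, not how any particular controller works):

Code:
#include <stdint.h>

/* The controller's logical-to-physical map: every host block address is
   remapped before it touches the flash, so data can be moved around for
   wear leveling without the host noticing. */
#define LBA_COUNT (1u << 20)

static uint32_t l2p[LBA_COUNT];          /* logical block -> physical block */

uint32_t physical_block(uint32_t lba) {
    return l2p[lba];                     /* consulted on every access */
}

void wear_level_move(uint32_t lba, uint32_t fresh_block) {
    l2p[lba] = fresh_block;              /* data relocated, host view unchanged */
}

/* The idea above: with a richer table, a whole file could be presented as
   one contiguous logical range even though its blocks are scattered, which
   is the job inodes and block bitmaps do in a traditional filesystem today. */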
 
I'm not saying the current system should go away (as I said, we'll likely see the current system live on for external storage in the foreseeable future). I'm just saying that for some parts of the system, maybe we can use something different.
But what are the benefits you expect out of using "something different"? What can't be achieved with memory mapped file I/O? And what limitations would an alternative introduce?

I'll use another example. Right now we have a mapping system inside an SSD for wear leveling (we'll still need that even with 3D XPoint memory). With some modifications we could make each file on the SSD look contiguous, and that would eliminate the need for inodes and bitmaps in a traditional file system, because in theory the controller is already doing that work.
I think the mapping for wear levelling is quite a bit simpler than what you need for this kind of total mapping. And how much is really gained by the controller doing that?
 
But what are the benefits you expect out of using "something different"? What can't be achieved with memory mapped file I/O? And what limitations would an alternative introduce?

Well, current filesystems are quite heavy. It takes a lot of time simply to open a file, and performance can degrade once a directory has too many files. Therefore, it's generally not a good idea to use a file as an "object." However, if there were permanent, fine-grained-accessible storage, it would be nice to be able to access objects on that storage the way one accesses objects in RAM. For example, storing a binary tree without the need to serialize it to a file.

Right now, we sometimes combine small, static files into one big file and memory-map it to reduce the time required when loading data. But memory-mapped files are a bit shaky on iOS when the file is too large (though that problem is mostly solved on 64-bit iOS). Also, the objects in the file need to be managed manually. It simply can't be the most efficient way of doing things, especially once you have technologies like 3D XPoint.
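
For what it's worth, "managing the objects manually" in a mapped file today usually ends up looking something like this offset-based layout (a made-up sketch, not any particular library):

Code:
#include <stddef.h>
#include <stdint.h>

/* A binary-tree node stored directly inside a memory-mapped file.  Child
   links are byte offsets from the start of the mapping rather than raw
   pointers, so the structure stays valid even if the file is mapped at a
   different base address the next time it is opened. */
typedef struct {
    int32_t  key;
    uint32_t left;     /* offset of left child, 0 = none */
    uint32_t right;    /* offset of right child, 0 = none */
} node_t;

/* Translate an offset back into a pointer for the current mapping. */
static inline node_t *node_at(void *base, uint32_t off) {
    return off ? (node_t *)((char *)base + off) : NULL;
}

With truly byte-addressable persistent storage, the hope is that this bookkeeping (and the serialize/deserialize step it replaces) becomes unnecessary.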
 
Well, current filesystems are quite heavy. It takes a lot of time simply to open a file, and performance can degrade once a directory has too many files. Therefore, it's generally not a good idea to use a file as an "object." However, if there were permanent, fine-grained-accessible storage, it would be nice to be able to access objects on that storage the way one accesses objects in RAM. For example, storing a binary tree without the need to serialize it to a file.
Individual memory allocations are quite heavy, relatively speaking, so allocating from pools is generally a good idea. What do you gain by mapping objects to individual files instead of one big file? (I'm presuming that a large virtual address space is an absolute necessity for memory mapping a large non-volatile storage device).

Could you clarify what you mean by "managed manually"?
 
Individual memory allocations are quite heavy, relatively speaking, so allocating from pools is generally a good idea. What do you gain by mapping objects to individual files instead of one big file? (I'm presuming that a large virtual address space is an absolute necessity for memory mapping a large non-volatile storage device).

Could you clarify what you mean by "managed manually"?

Allocating memory with a system call is expensive (though not as expensive as opening a file), but allocating memory from the heap is cheap.
And yes, you are right, it requires a large virtual address space to map large storage, so it's impractical with a 32-bit CPU (with PAE it's possible to support up to 64GB, but that's simply too cumbersome to use). As an example, under 32-bit iOS the useful size limit of a memory-mapped file is about 700MB, but sometimes it fails with just 200MB (probably due to address space fragmentation, I'm not sure).
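
The two allocation paths being compared, as a trivial sketch (sizes arbitrary):

Code:
#include <stdlib.h>
#include <sys/mman.h>

void allocation_paths(void) {
    /* Heap allocation: usually served from memory the allocator already
       owns, no kernel round trip, so it's cheap. */
    void *from_heap = malloc(64);

    /* System-call allocation: always a trip into the kernel, much more
       expensive per call, though still far cheaper than opening a file. */
    void *from_kernel = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (from_kernel != MAP_FAILED)
        munmap(from_kernel, 4096);
    free(from_heap);
}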

For static data, using one big file is actually quite practical. The problem is that you need your own code to manage the files inside it (that's what I mean by "managed manually"). Of course, that's not really a serious problem once you have the library ready. However, it becomes very difficult to manage if some of the files need to change (especially if a file's size needs to change). For smaller data it's easier to use something like SQLite or MongoDB (although that's still not ideal in some cases), but for larger data such as images it's much more difficult.

To use an example, suppose you need an image caching system for dynamically generated images in a 3D engine. If the images are all the same size, it's actually not that bad to use a large memory-mapped file to manage the image cache. However, if the images have different sizes, it's much more difficult. There are some tricks (e.g. if the number of size classes is limited), but it's still not ideal. On the other hand, if an object-based, direct-access file system were available, it would be possible to use the filesystem to manage the objects directly and map those images into GPU-accessible memory at the same time. It would be even nicer if the storage controller could help with its mapping function.
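
For the equal-size case, the whole trick is little more than fixed slots in one mapped file (a made-up sketch; slot size and count are arbitrary):

Code:
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* A memory-mapped cache file divided into fixed-size image slots. */
#define SLOT_SIZE  (256 * 1024)   /* e.g. one RGBA 256x256 image */
#define SLOT_COUNT 1024

static unsigned char *cache_base;

int cache_open(const char *path) {
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) return -1;
    if (ftruncate(fd, (off_t)SLOT_SIZE * SLOT_COUNT) != 0) { close(fd); return -1; }
    cache_base = mmap(NULL, (size_t)SLOT_SIZE * SLOT_COUNT,
                      PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after close */
    return cache_base == MAP_FAILED ? -1 : 0;
}

/* Store and fetch by slot index. */
void cache_put(int slot, const void *pixels) {
    memcpy(cache_base + (size_t)slot * SLOT_SIZE, pixels, SLOT_SIZE);
}
const void *cache_get(int slot) {
    return cache_base + (size_t)slot * SLOT_SIZE;
}

Once the images vary in size, this index arithmetic turns into a full allocator with free lists and fragmentation, which is the part an object-based filesystem (or the controller's mapping) could take over.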

I know that this might sound like a niche application, but while I have no hard data, I still believe there could be a lot of potential savings if we had a direct-access file system (with good permanent storage, of course), especially considering that current SSDs are already near main-memory performance (by "near" I mean within an order of magnitude ;) )
 
This goes back to some of the database platforms I've seen, which carry a legacy of having had to manage storage before Windows (edit: or other older OSes, for that matter), or which, as a performance optimization, allocate large files and/or build proprietary management for their files for the express purpose of stripping the OS out of the access path as much as possible.
I have seen cases where moving some of those functions in the other direction, bringing file system involvement back in, injects orders of magnitude more latency.

Some platforms would conceptually lend themselves well to a large non-volatile store because of their topologies, development philosophy, and software being structured for a workload dominated by disk access coupled with a desire to abstract this from the code. Some legacy types whose history gave them a limited exposure to massive quantities of volatile DRAM have Coelacanth-ed their way to seeing things almost coming back to the way they functioned, although I expect they themselves would not experience a renaissance from such a sea change.
 
Games also usually use package files, with one main goal being to reduce the cost of system file management. Opening a file is expensive on NTFS with all the access-rights checks and other machinery going on, so it's best to avoid that cost when you can. (Open at engine start, close at engine end.)
 
To use an example, suppose you need an image caching system for dynamically generated images in a 3D engine. If the images are all the same size, it's actually not that bad to use a large memory-mapped file to manage the image cache. However, if the images have different sizes, it's much more difficult. There are some tricks (e.g. if the number of size classes is limited), but it's still not ideal. On the other hand, if an object-based, direct-access file system were available, it would be possible to use the filesystem to manage the objects directly and map those images into GPU-accessible memory at the same time. It would be even nicer if the storage controller could help with its mapping function.

I know that this might sound like a niche application, but while I have no hard data, I still believe there could be a lot of potential savings if we had a direct-access file system (with good permanent storage, of course), especially considering that current SSDs are already near main-memory performance (by "near" I mean within an order of magnitude ;) )
So you do want to use a file system, but a light-weight one. Can you define what you mean by "direct-access file system"?

To be clear, I'd like to see efficient use of persistent storage too, but I think there are many concerns that are easy to overlook when extending the concept of "volatile memory allocation, but persistent". For example:
  • Persistent state of an application needs to be protected such that the application can access it again after a crash, but no unauthorised application or user can access it.
  • Running multiple instances of the same application with separate persistent state should be possible (without duplicating the application code).
  • You should be able to update/upgrade an application and continue from the saved persistent state (this means access should not be restricted via an application ID).
  • You should be able to back up persistent state to another storage device, and continue execution on another device (otherwise the backup is worthless).
  • If the two things really need to be separate, you should be able to use the same storage device for light-weight access and for full filesystem access (with user-owned files, permissions, shared access, versioning, etc.), ideally with dynamic allocation instead of static partition allocation.
 
So you do want to use a file system, but a light-weight one. Can you define what you mean by "direct-access file system"?

By "direct-access" I mean something that can be accessed simply with a semi-permanent memory address. Of course, the address is likely to be private to the application or even a specific instance of an application.

To be clear, I'd like to see efficient use of persistent storage too, but I think there are many concerns that are easy to overlook when extending the concept of "volatile memory allocation, but persistent". For example:
  • Persistent state of an application needs to be protected such that the application can access it again after a crash, but no unauthorised application or user can access it.
  • Running multiple instances of the same application with separate persistent state should be possible (without duplicating the application code).
  • You should be able to update/upgrade an application and continue from the saved persistent state (this means access should not be restricted via an application ID).
  • You should be able to back up persistent state to another storage device, and continue execution on another device (otherwise the backup is worthless).
  • If the two things really need to be separate, you should be able to use the same storage device for light-weight access and for full filesystem access (with user-owned files, permissions, shared access, versioning, etc.), ideally with dynamic allocation instead of static partition allocation.

Sure, I agree it's a hard problem, but I don't think it's impossible to solve. I think the sole reason we still don't have something like that is simply that we don't have the hardware yet. When the first iPhone came out, its storage wasn't fast enough, nor did its CPU have enough addressing capability, to do something like this. And now we already have the burden of compatibility to bear, but I think it's still not too late.

On a smartphone, I think it's simpler. Applications normally have only one instance, and they are all sandboxed, so we can allocate memory spaces separately for different applications. The same applies to the backup problem, but one thing that might be problematic is that in-memory data structures are likely to be dependent on the CPU architecture, so the backup is unlikely to transfer between different CPU architectures (not impossible, but things like endianness and alignment rules can make it very difficult).
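
A small illustration of the architecture problem (made-up struct):

Code:
#include <stdint.h>
#include <stdio.h>

/* Why dumping in-memory structures straight to persistent storage ties
   the data to one CPU/ABI: both the byte order of 'value' and the padding
   the compiler inserts after 'tag' can differ between architectures. */
struct record {
    uint8_t  tag;      /* typically followed by 3 bytes of padding */
    uint32_t value;    /* stored little- or big-endian depending on the CPU */
};

int main(void) {
    printf("sizeof(struct record) = %zu\n", sizeof(struct record)); /* often 8, not 5 */
    return 0;
}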
 
Forget smartphones. They don't even exploit the full potential of Flash storage, and they are too cost-sensitive to use 3D XPoint memory, which is likely to be significantly more expensive than Flash.

IMO, DBMS servers are the real target market. It will allow regular database features at in-memory database speeds (from tens of thousands of small transactions/core/second to tens of millions/core/second).

Cheers
 
So what about the manufacturability of this?
Does it look like a goer, or is it going to be another "consumer-grade parts coming some time Real Soon Now™" story, like the other similar techs?
 
Micron and Intel don't usually launch hot air, so my bet is that it will be a real product within ½-1 year.

Cheers
 
From the AT article: "The significance of the announcement isn't just the new memory technology, but that it's actually in production with volume shipments scheduled for next year."
 
Very exciting indeed.

I think it's insane that nobody outside Intel/Micron seems to know for sure how the thing really works.
Competitors must be seriously worried about this.
 
I can understand why this is coming to market first in SSD form, but I'm more excited by the possibility of non-volatile DRAM-like memory that the underlying technology should make possible.
 
NV storage at nearly the speeds of high-end DDR4 DIMMs? At this point, even with NVMe and PCI-E, we still don't have the right "storage stack" architecture to make full use of it. I have the strangest boner right now ;)

That's bitchin fast :D
 
No doubt it's interesting for the enterprise market, but as a home user I'm much more interested in SSDs reaching a per-GB price much closer to mechanical drives than in anything else.

Endurance has never really been an issue IMO. Maybe on the early SSDs, but even my pretty old 64GB M4 is still going strong. More speed is always welcome, but at this point I'd rather have a 1-2TB "slow" SSD around a 200 dollar price point than a 7x faster SSD that's going to cost me three times that or more.
 
I think I'll wait for that to become affordable before I change my SSD or HDD!
It should help video game loading a lot, and it might change the compression algorithms used too, if it becomes faster to load data than to decompress it...

Also interested in the DRAM version: no more need to dump RAM to disk to go into hibernation?
 
So, per https://forum.beyond3d.com/threads/intel-optane-ssds-10x-bigger-7x-faster-1000x-the-endurance.57231/, which references http://wccftech.com/intel-optane-technology-3d-xpoint-ssd-2016/, which in turn refers back to the AnandTech IDF live blog at http://www.anandtech.com/show/9539/intel-idf-2015-live-blog:
The brand name is Optane, with SSDs coming in 2016 and DIMMs planned.

Edit: It also occurred to me that if Intel is actually coming to market with this stuff soon and it lives up to the hype, then Samsung etc. will presumably be pushing their various spin-tech/non-volatile memory efforts to market not too far behind, rather than being stuck with NAND.
 