A new approach to storage architectures for the zettabyte age
So, first of all, what is a zettabyte? A zettabyte is a trillion gigabytes. That is an enormous amount of data, and the reason it's perhaps not a household name in the same way as a gigabyte or even a terabyte is that there has rarely been a commercial need to store this much information. But that is changing.
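For scale, in decimal (SI) units the relationship looks like this:

```latex
1\,\mathrm{ZB} = 10^{21}\,\text{bytes} = 10^{12}\,\mathrm{GB} = 10^{9}\,\mathrm{TB}
```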
The innovation, products, and requirements for this coming architectural shift will depend on several key things:
The first is the need to disaggregate compute, storage, and network so that each component can be used in the most efficient way. Disaggregation is the only way to deal with the volume, velocity, and variety of data that the zettabyte age will inevitably bring.
The second is that data infrastructure will need to be purpose-built. Businesses can no longer rely on general-purpose solutions; a single "good enough" solution cannot meet every need across the board. Organisations need to maximise efficiencies and focus on one purpose: delivering the right balance of performance, density, and cost in the zettabyte world.
The third is that there must be collaboration and intelligence among the different elements in the pipeline. Hardware and software need to work together, and designers must understand the full stack in order to maximise performance and functionality holistically.
Purpose-built SMR solutions
Getting input from the open source and Linux® communities on the core technologies of SMR (Shingled Magnetic Recording) will be important in finding solutions that can meet the data requirements of the next decade. SMR works by partially overlapping tracks on a disk, giving hardware providers roughly a 20% increase in capacity. The trade-off is that data must be written sequentially so that a new write does not alter an already-written, overlapped track.
For many hyperscalers, sequential writing is a good fit due to the write once/read many nature of large-scale workloads like video streaming. But the ramp-up to deploy SMR requires rearchitecting the host end of things—modifying the operating system to stage writes sequentially or even enabling the application to be aware of the sequential write model.
Rearchitecting can require some initial effort, but the density and cost benefits are substantial and demonstrate the advantages of purpose-built hardware combined with software that is aware of it.
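To make the host-side model concrete, here is a minimal sketch, not a production tool, of how a Linux host can inspect a host-managed drive's zones before staging writes sequentially. It uses the kernel's standard zoned block device report ioctl, BLKREPORTZONE from linux/blkzoned.h; the device path is supplied by the user and is purely illustrative.

```c
/* Minimal sketch: list the zones of a host-managed (SMR) drive on Linux.
 * The device path is an assumption passed on the command line. */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/blkzoned.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <zoned-block-device>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Ask the kernel for the first 16 zones, starting at sector 0. */
    unsigned int nr = 16;
    struct blk_zone_report *rep =
        calloc(1, sizeof(*rep) + nr * sizeof(struct blk_zone));
    rep->sector = 0;
    rep->nr_zones = nr;

    if (ioctl(fd, BLKREPORTZONE, rep) < 0) {
        perror("BLKREPORTZONE");
        return 1;
    }

    /* Sequential-write zones may only be written at the write pointer (wp),
     * which is exactly the constraint the host has to stage writes around. */
    for (unsigned int i = 0; i < rep->nr_zones; i++) {
        struct blk_zone *z = &rep->zones[i];
        printf("zone %2u: start=%llu len=%llu wp=%llu cond=0x%x\n", i,
               (unsigned long long)z->start, (unsigned long long)z->len,
               (unsigned long long)z->wp, z->cond);
    }

    free(rep);
    close(fd);
    return 0;
}
```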
Utilising zoned namespaces
It may sound strange to compare SMR hard disk drives (HDDs) with solid-state drives (SSDs) because, in many ways, these technologies are worlds apart. However, as we look to SSDs and NAND flash to be part of this disaggregated future, we're seeing a companion technology to SMR in the HDD space: Zoned Namespaces (ZNS).
NAND-based media can handle only a certain number of writes and, as a result, it has to be managed. The Flash Translation Layer (FTL) intelligently deals with everything from caching and performance to wear levelling. However, at the zettabyte scale, device-level management introduces indirection between the host and the actual media, and it impacts throughput, latency, and cost.
In an era where businesses want to control these elements and maximise efficiencies, we have to look at moving this management from the device level to the host—exactly how SMR is approached.
ZNS divides the flash media of an NVMe namespace into zones. Cloud providers can, for example, separate workloads or data types into different zones so that usage patterns are predictable among multiple users. More importantly, just as in the SMR construct, data is written to a zone sequentially. Suddenly, there's no longer a need for all of that device-level media management. The outcomes (a brief sketch of the resulting host-managed write flow follows this list):
• Additional savings due to a decreased need for over-provisioning of NAND media
• Better drive endurance by reducing write amplification
• Dramatically reduced latency
• Significantly improved throughput
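On the host, that write flow looks roughly like the sketch below, which again leans on Linux's zoned block device interface (the BLKREPORTZONE and BLKRESETZONE ioctls from linux/blkzoned.h). The device path, the 4 KiB write size, and the use of the kernel's 512-byte sector units are illustrative assumptions, and error handling is kept deliberately short.

```c
/* Minimal sketch: reset a zone, then write sequentially from its write
 * pointer, with no FTL managing placement in the middle. */
#define _GNU_SOURCE /* O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/blkzoned.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <zoned-block-device>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_RDWR | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Report the first zone to learn its start sector and length. */
    struct blk_zone_report *rep =
        calloc(1, sizeof(*rep) + sizeof(struct blk_zone));
    rep->sector = 0;
    rep->nr_zones = 1;
    if (ioctl(fd, BLKREPORTZONE, rep) < 0) {
        perror("BLKREPORTZONE");
        return 1;
    }
    struct blk_zone z = rep->zones[0];

    /* Resetting a zone rewinds its write pointer to the zone start:
     * the host-side equivalent of erasing it. */
    struct blk_zone_range range = { .sector = z.start, .nr_sectors = z.len };
    if (ioctl(fd, BLKRESETZONE, &range) < 0) {
        perror("BLKRESETZONE");
        return 1;
    }

    /* Write one 4 KiB block at the write pointer (the zone start after a
     * reset). Any further writes must continue sequentially from there. */
    void *buf = NULL;
    if (posix_memalign(&buf, 4096, 4096))
        return 1;
    memset(buf, 0xAB, 4096);
    if (pwrite(fd, buf, 4096, (off_t)z.start * 512) != 4096) {
        perror("pwrite");
        return 1;
    }

    free(buf);
    free(rep);
    close(fd);
    return 0;
}
```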
Zoned Storage: a unifying platform to support SMR and ZNS technologies
As companies prepare for increased data demands, initiatives like Zoned Storage, which works with the community to establish ZNS as an open standard that shares the same application programming interface (API) as SMR, are important. This allows end users to adopt a single interface that can communicate with the entire storage layer. As a result, data centre architects can make the transition to zettascale architectures more easily, because applications don't have to change regardless of the storage environment they choose. This will allow companies to reach a new balance between performance, latency, and cost using disaggregated, purpose-built, and intelligent architectures.
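One concrete expression of that single interface on Linux is that SMR HDDs and ZNS SSDs can both surface through the same zoned block device model. The short sketch below (the device name is an assumption) reads the standard sysfs attribute queue/zoned, which reports "none", "host-aware", or "host-managed", so software can key off the zone model rather than the media type.

```c
/* Minimal sketch: query a block device's zone model through sysfs. */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    const char *name = argc > 1 ? argv[1] : "sda"; /* e.g. "sda" or "nvme0n1" */
    char path[256];
    char model[32] = "unknown";

    snprintf(path, sizeof(path), "/sys/block/%s/queue/zoned", name);

    FILE *f = fopen(path, "r");
    if (!f) {
        perror(path);
        return 1;
    }
    if (!fgets(model, sizeof(model), f)) {
        fclose(f);
        return 1;
    }
    fclose(f);

    model[strcspn(model, "\n")] = '\0'; /* strip trailing newline */
    printf("%s zone model: %s\n", name, model);
    return 0;
}
```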
Forward-Looking Statements:
This article may contain forward-looking statements, including statements relating to expectations for Western Digital’s HDD products, the market for these products, and future capabilities and technologies for those products. These forward-looking statements are subject to risks and uncertainties that could cause actual results to differ materially from those expressed in the forward-looking statements, including development challenges or delays, supply chain and logistics issues, changes in markets, demand, global economic conditions and other risks and uncertainties listed in Western Digital Corporation’s most recent quarterly and annual reports filed with the Securities and Exchange Commission, to which your attention is directed. Readers are cautioned not to place undue reliance on these forward-looking statements and we undertake no obligation to update these forward-looking statements to reflect subsequent events or circumstances.