Philip Williams
on 5 December 2023
Here, there, everywhere – MicroCeph makes edge storage easy
Data is everywhere, not just in large centralised data centres, but in smaller outposts, like retail outlets, remote or branch offices, filming locations and even cars. Use cases for edge storage include local processing, content collaboration that cannot take place with high network latency, and ingest, where locally created content is captured, protected and then made available to other systems.
All of these scenarios have two things in common: the data generated is critical, and those locations typically have limited bandwidth.
Historically, some of these use cases would use a single-board computer, but over time the importance of the data has increased, and greater performance and redundancy are now required. You can meet these new demands with edge storage.
An edge storage solution allows for data to be securely stored and efficiently processed where it is created, without having to ship it to a centralised location for processing. This approach enables secure, performant processing irrespective of latency and bandwidth limitations.
In today’s blog, I’ll explore why you might want to use an edge storage solution, and demonstrate how you can get storage clusters up and running easily with MicroCeph.
Maintaining locality with edge storage
In some edge environments, such as a retail store, it is more efficient to carry out initial processing and analysis of sales and inventory data at that location, before sending the results to a central repository. This kind of perishable data is used as the input to other processes and can be generated at hundreds or thousands of locations.
Another use case is autonomous vehicles, which can generate up to 6TB of data per day. That data needs to be processed locally in real time, and its integrity must be maintained, as it is used for decision-making and the optimisation of autonomous driving algorithms.
These kinds of use cases require highly performant and redundant storage, something that just isn’t available in a legacy single-board computer system. A scalable storage solution like Ceph can help meet performance and redundancy goals. Using multiple nodes means that multiple copies of the data are stored, which mitigates data loss should a single node or disk fail. Ceph’s scale-out architecture means that performance as well as capacity can increase as nodes are added.
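To make the trade-off between redundancy and capacity concrete, here is a rough sizing sketch. It assumes a replicated pool that keeps three copies of each object (Ceph's default for replicated pools), and the node and disk counts are purely illustrative. It also ignores metadata overhead and the free-space headroom a real cluster needs, so treat the result as an upper bound rather than a sizing guide.

```shell
#!/usr/bin/env bash
# Rough usable capacity of a replicated cluster (illustrative figures only).
# usable = nodes * disks_per_node * disk_size / replica_count
usable_tb() {
  local nodes=$1 disks_per_node=$2 disk_tb=$3 replicas=$4
  echo $(( nodes * disks_per_node * disk_tb / replicas ))
}

# Hypothetical edge site: 3 nodes, 4 disks each, 4 TB per disk, 3 copies.
echo "Raw capacity:    $(( 3 * 4 * 4 )) TB"
echo "Usable capacity: $(usable_tb 3 4 4 3) TB"
```

Under these assumptions, adding a fourth identical node adds 16 TB of raw capacity but only about 5 TB of usable space, because every object is still stored three times.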
The distribution of data processing not only reduces demand on corporate networks and data centres, but also decreases the latency of getting results.
Securing content with edge storage
Content types such as graphics and video have limited scope for remote collaboration due to the size of the files, so the data needs to be temporarily stored locally. As an example, the raw footage for an average TV episode can consume 100TB.
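A quick back-of-the-envelope calculation shows why files of this size tend to stay local. The sketch below assumes decimal units (1 TB = 10^12 bytes), a link that is fully available for the transfer, and no protocol overhead. The link speeds are hypothetical examples, not figures from any particular location.

```shell
#!/usr/bin/env bash
# Whole days needed to move a data set over a network link (integer maths).
# Arguments: size in TB, link speed in Mbit/s.
transfer_days() {
  local size_tb=$1 link_mbit=$2
  local bits=$(( size_tb * 1000000000000 * 8 ))
  local seconds=$(( bits / (link_mbit * 1000000) ))
  echo $(( seconds / 86400 ))
}

echo "100 TB over 100 Mbit/s: about $(transfer_days 100 100) days"
echo "100 TB over 1 Gbit/s:   about $(transfer_days 100 1000) days"
```

Even on the faster of these two assumed links, shipping one episode's raw footage would take more than a week of uninterrupted transfer.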
Additionally, with remote filming locations there may be no network connectivity to securely ship content to a centralised data store, so it must be securely stored locally before being processed at a later time. Data loss in this scenario could be financially disastrous for the production company.
Full disk encryption allows a user to be confident that their data will not get exposed should a disk or node be lost. Strong client authorisation via CephX for all of the different storage protocols means that only approved parties can access data stored on the cluster.
Edge storage, simplified
An edge storage system must provide a streamlined and effortless deployment experience, with minimal ongoing maintenance overhead. This is because these systems may be located in remote or offline locations, where a storage expert may not be available. A standard Ceph cluster can be challenging to deploy and operate at the edge, but MicroCeph solves this complexity.
We designed MicroCeph from the ground up to provide reliable and resilient storage for non-experts, wherever they choose to deploy it. It is a highly scalable snap-based solution that can run on just three nodes to provide redundancy and high availability, but also scale to meet multi-petabyte needs. MicroCeph is containerised along with all of its dependencies and runs fully sandboxed from the underlying host to minimise security risks. Software updates are hassle-free and non-disruptive, respecting the operational requirements of a running Ceph cluster.
MicroCeph supports all data access protocols (block, file and object) and can be deployed with full disk encryption, so that you do not need to worry about hardware loss.
Making Ceph easy for all, in a snap
Try MicroCeph out. With a handful of commands and a few minutes, it’s possible to have a functional Ceph cluster up and running. To get started, install the MicroCeph snap with the following commands to create a single-node Ceph cluster with 3 loopback-based OSDs. This configuration is not recommended for production, but it is a great way to get hands-on with MicroCeph.
sudo snap install microceph --channel quincy/edge
Then bootstrap the cluster from the first node:
sudo microceph cluster bootstrap
Next, add some disks that will be used as OSDs:
sudo microceph disk add loop,4G,3
Verify cluster status:
microceph.ceph status
microceph.ceph osd status
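If you would rather review the whole sequence before touching a machine, the steps above can be collected into one small script. This is only a sketch: the `run`, `loop_spec` and `deploy_single_node` helpers and the `DRY_RUN` variable are hypothetical names introduced here, not part of MicroCeph. By default the script only prints the commands, and it executes them only when `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
set -eu

# Hypothetical helper: print each command, and execute it only if DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Build the loop-backed disk spec used above: loop,<size>,<count>
loop_spec() {
  echo "loop,$1,$2"
}

# The same commands as in the walkthrough, in the same order.
deploy_single_node() {
  run sudo snap install microceph --channel quincy/edge
  run sudo microceph cluster bootstrap
  run sudo microceph disk add "$(loop_spec 4G 3)"
  run microceph.ceph status
  run microceph.ceph osd status
}

deploy_single_node
```

Saved under a name of your choosing, the script prints the five commands when run as is. Running it with `DRY_RUN=0` in the environment executes them in order and stops at the first failure.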
Learn more
Find out more about MicroCeph on its snapcraft page.