Edge computing is growing in importance. The development of the Internet of Things, the potential of connected cars and the growth of industrial manufacturing will all demand greater compute power at the edge. By 2025, around 75 percent of data will be processed at the edge, according to a prediction by research firm Gartner, compared with just 10 percent in 2018.
However, that big shift will depend on how well companies can architect their edge deployments as a whole. To get edge right, IT teams will have to weigh a range of challenges, from the design of individual edge devices through to the wider business goals those devices have to meet. At the heart of all this is data.
Edge devices will create data, use it themselves, and pass it back to more central locations for storage and processing. While real-time requirements mean that connected cars will process data locally, other applications will rely on central analysis to create value. Managing this at scale will be a huge endeavour.
With so many distributed devices all creating data, the sheer scale here will make planning ahead essential. At the same time, it’s also worth looking in more detail at how the data your edge computing project creates will be used.
Edge, centre and in-between
For your edge computing implementation, understanding where your data creates value and where it will be used can help you design your architecture more efficiently. For example, storing data locally on each device means the device should be hardened against tampering and unauthorised removal of data. Most edge devices will be fairly simple units dedicated to a specific task, but even these need careful consideration around security and updates.
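One way to make locally stored data tamper-evident, sketched below with Python's standard library only, is to chain an HMAC through each stored record so that modifying or deleting an entry breaks verification. This is a minimal illustration, not a complete hardening strategy: the `DEVICE_KEY`, record layout and function names are all hypothetical, and a real device would also need secure key provisioning and encryption at rest.

```python
import hmac
import hashlib
import json

# Hypothetical per-device secret; in practice this would be provisioned
# into secure hardware, never hard-coded.
DEVICE_KEY = b"per-device-secret"

def append_record(log, payload):
    """Append a reading with an HMAC chained to the previous entry's MAC,
    so altering or removing any earlier record invalidates the chain."""
    prev_mac = log[-1]["mac"] if log else ""
    body = json.dumps(payload, sort_keys=True)
    mac = hmac.new(DEVICE_KEY, (prev_mac + body).encode(), hashlib.sha256).hexdigest()
    log.append({"payload": payload, "mac": mac})

def verify_log(log):
    """Recompute the chain from the start; any tampering surfaces as a mismatch."""
    prev_mac = ""
    for entry in log:
        body = json.dumps(entry["payload"], sort_keys=True)
        expected = hmac.new(DEVICE_KEY, (prev_mac + body).encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, entry["mac"]):
            return False
        prev_mac = entry["mac"]
    return True

log = []
append_record(log, {"ts": 1, "temp": 21.5})
append_record(log, {"ts": 2, "temp": 21.7})
assert verify_log(log)

log[0]["payload"]["temp"] = 99.9  # simulated tampering with a stored reading
assert not verify_log(log)
```

The chaining means an attacker who edits one record must re-sign every record after it, which requires the device key; detecting removal from the tail additionally needs the latest MAC to be reported off-device.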
Alternatively, you can design your approach so that devices don't store data and instead send it back to your central implementation. This should make planning around security easier, but it means thinking about how to scale out your data management approach to cope with thousands or millions of updates coming back in. Since edge devices typically generate time-series data (timestamped readings arriving continuously), a database optimised for time-series workloads will help you keep up with the volume.
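A common way to keep that central ingestion manageable is to batch readings on the device rather than sending each one individually, so the database sees fewer, larger writes. The sketch below is a hypothetical illustration of that pattern: `EdgeBuffer`, its field names and the in-memory `flushed_batches` list (a stand-in for a network send or database write) are all invented for this example.

```python
import time

class EdgeBuffer:
    """Sketch of device-side batching: accumulate time-series readings
    and flush them in bulk once a threshold is reached."""

    def __init__(self, flush_size=100):
        self.flush_size = flush_size
        self.pending = []
        self.flushed_batches = []  # stand-in for a bulk write to a central store

    def record(self, device_id, metric, value, ts=None):
        ts = ts if ts is not None else time.time()
        self.pending.append((device_id, metric, ts, value))
        if len(self.pending) >= self.flush_size:
            self.flush()

    def flush(self):
        """Ship whatever is pending as one batch; also called on shutdown."""
        if self.pending:
            self.flushed_batches.append(self.pending)
            self.pending = []

buf = EdgeBuffer(flush_size=3)
for i in range(7):
    buf.record("sensor-42", "temperature", 20.0 + i, ts=i)
buf.flush()  # final partial batch
# two full batches of 3 plus a final batch of 1
```

In a real deployment the flush would also trigger on a timer, and the batch would be written to the time-series store in a single request rather than kept in memory.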
However, there is a third consideration too. Latency between device and data centre can affect your results. For conventional applications, caching data closer to users helps remedy such problems, and you can take the same approach for edge computing. Just as you have distributed devices to support, you can adopt a distributed data approach as well: local data centres or clusters serve the sets of devices closest to them, while all data is also replicated back to the centre for wider analysis. This means looking closely at your data replication and management approach, but it improves performance and delivers better results locally.
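The tiered approach above can be sketched in a few lines: devices write to a nearby cluster for low-latency acknowledgement and local reads, while the cluster asynchronously ships everything it holds back to a central store for global analysis. This is a toy model, assuming invented names (`LocalCluster`, `replicate`) and a plain list standing in for the central database.

```python
class LocalCluster:
    """Hypothetical regional cluster: fast local writes/reads,
    with periodic replication back to a central store."""

    def __init__(self, region, central):
        self.region = region
        self.central = central          # shared central store (a list here)
        self.local_store = []
        self.replicated_upto = 0        # high-water mark of shipped records

    def write(self, record):
        self.local_store.append(record)  # acknowledged locally, low latency

    def read_local(self):
        return list(self.local_store)    # served near the devices

    def replicate(self):
        # Async/periodic push of anything not yet shipped to the centre.
        new = self.local_store[self.replicated_upto:]
        self.central.extend((self.region, r) for r in new)
        self.replicated_upto = len(self.local_store)

central = []
east = LocalCluster("east", central)
west = LocalCluster("west", central)
east.write({"device": "a", "v": 1})
west.write({"device": "b", "v": 2})
east.replicate()
west.replicate()
# central now holds records from both regions for wider analysis
```

The design choice this illustrates is that devices never wait on the round trip to the centre; replication happens behind the scenes, which is exactly the trade-off the distributed approach asks you to manage.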
Supporting large-scale edge deployments will mean revisiting your data strategy as more data is created and processed. To make the most of edge, you'll have to handle distributed data at scale, ideally across multiple locations. Simplifying this helps, particularly if you can avoid large extract-transform-load (ETL) operations by keeping data in one system rather than repeatedly moving and reshaping it between stores. A distributed database can therefore help deliver on the potential that edge computing offers organisations, whether they are supporting smart cities, digital twin projects or connected car programmes.
Written by Patrick Callaghan, Vice President Value Engineering, DataStax