28 Jun 2021

DeltaXML Ltd: Government Data, Where Change Matters

The pandemic has brought an explosion of data, and data now sits higher in the public consciousness than ever before. Raw data and statistical analysis appear in every news item, article, and conversation that tells the story of Covid. Citizens across the globe want transparent information about infection rates, cases, deaths and vaccinations from their governments and public institutions. Everyone demands this information quickly and accurately, but it is constantly changing, so how do we make sense of change in such vast and fast-moving data?

Many software tools are available for comparing datasets and can help analyse change in data, but few understand the data in the context of its structure. Analysing change and representing difference are sophisticated problems, and solving them well offers significant value. Data is commonly provided in one of two ways: as a regular release of the complete, large dataset, or as a delta, that is, a set of changes.

For example, when large daily data sets for infection rates, cases, deaths, and vaccinations are provided, data analysts must compare each day's data with their current model to determine what changed and update the relevant data. This processing can be time-consuming, resource-intensive and prone to statistical or processing errors. Comparing the current data set with the previous one in advance, using sophisticated context-aware data comparison, produces a considerably smaller file of changes that is accurate and updates your data model far more quickly.
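To make the idea concrete, here is a minimal sketch of a key-aware comparison between two daily snapshots. It is an illustration only, not DeltaXML's technology: the column names (area, date, cases, deaths), the sample figures, and the function names load_keyed and keyed_delta are all invented for this example. Rows are matched on their key columns rather than on their position in the file, so only the real changes appear in the result.

```python
import csv
import io

def load_keyed(csv_text, key_fields):
    """Index the rows of a CSV document by a tuple of key-column values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[k] for k in key_fields): row for row in reader}

def keyed_delta(previous, current):
    """Return the added, removed and modified rows between two keyed snapshots."""
    added = {k: current[k] for k in current.keys() - previous.keys()}
    removed = {k: previous[k] for k in previous.keys() - current.keys()}
    modified = {}
    for k in current.keys() & previous.keys():
        # Record (old, new) pairs only for the fields whose value changed.
        changes = {
            field: (previous[k].get(field), value)
            for field, value in current[k].items()
            if previous[k].get(field) != value
        }
        if changes:
            modified[k] = changes
    return {"added": added, "removed": removed, "modified": modified}

# Made-up sample data: yesterday's and today's complete data sets.
YESTERDAY = """area,date,cases,deaths
A,2021-06-26,100,1
B,2021-06-26,50,0
"""
TODAY = """area,date,cases,deaths
A,2021-06-26,104,1
B,2021-06-26,50,0
A,2021-06-27,90,2
"""

KEY = ("area", "date")  # assumed key columns for this example
delta = keyed_delta(load_keyed(YESTERDAY, KEY), load_keyed(TODAY, KEY))
```

In this sketch the delta holds one revised figure and one new row, while the unchanged row for area B does not appear at all. That is the sense in which a file of changes can be far smaller than the full data set.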

The Value of Change 

Publishing update and change information makes such good sense that we need to ask why it is not being done. The answer is probably that it is not as easy as it may appear. For example, there is no standard way to publish changes to a CSV (comma-separated values) file. There is a standard patch format for JSON, but it is deficient in many respects for this purpose. However, it can certainly be done, and at DeltaXML we have been doing this for many years with XML and more recently with JSON. Thought needs to be given to the nature of the data, its structure, and its key values; a delta format can then be defined. The great advantage of publishing changes is that they can be generated on demand between arbitrary versions of the original data, adding significant value to that data.
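As a rough illustration of what defining a delta format involves, the sketch below assumes records identified by a single key value and a fixed set of fields. The format itself (the "add", "remove" and "change" sections), the sample versions and the function names are hypothetical. They are not a standard and not the format DeltaXML uses. The sketch shows a delta being generated on demand between any two stored versions and then applied to reproduce the later one.

```python
import json

def make_delta(old, new):
    """Build a JSON-serialisable delta between two versions keyed by record id."""
    delta = {"add": {}, "remove": [], "change": {}}
    for key, record in new.items():
        if key not in old:
            delta["add"][key] = record
        elif record != old[key]:
            # Keep only the fields whose values differ (fixed field set assumed).
            delta["change"][key] = {
                f: v for f, v in record.items() if old[key].get(f) != v
            }
    delta["remove"] = sorted(k for k in old if k not in new)
    return delta

def apply_delta(old, delta):
    """Apply a delta to the older version to reconstruct the newer one."""
    new = {k: dict(v) for k, v in old.items() if k not in delta["remove"]}
    for key, fields in delta["change"].items():
        new[key].update(fields)
    for key, record in delta["add"].items():
        new[key] = dict(record)
    return new

# Made-up versions of a small data set, keyed by area code.
VERSIONS = {
    "v1": {"A": {"cases": 100, "deaths": 1}, "B": {"cases": 50, "deaths": 0}},
    "v2": {"A": {"cases": 104, "deaths": 1}, "B": {"cases": 50, "deaths": 0},
           "C": {"cases": 7, "deaths": 0}},
    "v3": {"A": {"cases": 110, "deaths": 2}, "C": {"cases": 9, "deaths": 0}},
}

def delta_between(a, b):
    """Generate a delta on demand between two arbitrary stored versions."""
    return make_delta(VERSIONS[a], VERSIONS[b])

d13 = delta_between("v1", "v3")
published = json.dumps(d13, sort_keys=True)  # the text a publisher could release
rebuilt = apply_delta(VERSIONS["v1"], json.loads(published))
```

A consumer holding v1 can apply the published delta to arrive at v3 without downloading the complete data set. A real format would also need to handle issues this sketch ignores, such as fields that are added or dropped, ordering, and composite keys.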

As a specific example of the value of changes, the International Organization for Standardization (ISO) now makes available the latest version of a standard with changes from a selected previous version marked up. Our technology provides the engine behind this, allowing changes to be marked up on demand from the XML source.

Across the world, Pandora’s box of transparent government data is truly open, accessible, and most importantly being used by citizens in every country. When change happens, it matters, and UK technology companies can be at the forefront of providing tools to understand that change. Now more than ever, we need to ensure data is not misunderstood, because social media can rapidly spread misrepresented data, damaging the credibility and authority of data sources such as governments.

Reference: UK Government data source for daily data sets for infection rates, cases, deaths, and vaccinations - https://coronavirus.data.gov.uk/ 

This blog was written by Robin La Fontaine, CEO and Founder of DeltaXML Ltd. Robin is known for helping companies manage change (find, merge, audit, publish) in JSON and XML documents and data.

