Teams at #SMRnov19: Join The National Archives in the future business

Guest Post | 3 min read

We asked John Sheridan, Digital Director at The National Archives, to tell us about the work they do keeping more than a thousand years of British history alive. Don’t miss them at #SMRnov19 this weekend! You’ll find them at stand 10.

What do you think of when someone mentions The National Archives? It comes as no surprise to learn that our collection is vast and unique, dating back more than 1,000 years. We have records touching on the history of almost every part of the world. It is also astonishingly varied, including paper and parchment, as well as digital records such as websites and databases. Notable highlights include England’s first national dataset (more commonly known as the Domesday Book), Shakespeare’s will and the confessions of Guy Fawkes. Our collection holds insights into the biggest events in history.

People think archives are about the past, but really we are in the future business. We keep the evidence of today for people tomorrow. The National Archives is a living archive, with a collection of government records that continues to grow. Last year we accessioned – that is, formally added – more than 55,000 new records. The contemporary record of government is now predominantly digital.

Becoming a digital archive has been a game-changer for The National Archives. Take email, for example: snippets of text in threaded discussions, with different participants, the conversations forking and sometimes re-merging. Compared to the letters and memos of yesteryear, it is far from obvious where the digital record might begin and end.

We are investigating how we might best use Artificial Intelligence to select which emails to keep and which to delete. Computers are now reasonably good at classification problems – say, distinguishing between personal email in a work account and business-related email. However, classifying emails in ways that rely on their context is a much tougher nut to crack, although the technology is advancing very quickly.
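To give a flavour of what a simple classification problem looks like in practice, here is a minimal sketch of a naive Bayes text classifier. It is purely illustrative and is not the approach The National Archives uses: the four training messages, the labels and the function names `train` and `classify` are all invented for this example, and a real system would need far more data and would still struggle with context.

```python
import math
from collections import Counter

# Hypothetical training data, invented for illustration only.
TRAINING = [
    ("personal", "lunch on friday with the family"),
    ("personal", "happy birthday see you at the party"),
    ("business", "please review the attached contract draft"),
    ("business", "minutes of the project board meeting"),
]

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for label, text in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab

def classify(text, model):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label, counts in word_counts.items():
        score = math.log(label_counts[label] / total_messages)
        denominator = sum(counts.values()) + len(vocab)  # add-one smoothing
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAINING)
print(classify("birthday lunch on friday", model))
print(classify("contract review meeting", model))
```

A classifier like this only looks at the words in a single message. It knows nothing about the thread a message belongs to or who is taking part, which is exactly the contextual information that makes the harder problem hard.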

Email is now a well-established and very mature technology. Meanwhile, we are evolving new ways of communicating with each other, of capturing information and of processing it. Every technology presents new challenges for the digital archivist, in terms of selection, context, preservation and access. How do we know which AI-based deep networks to keep, and what do we need to do to preserve them?

There is no long-term solution to digital preservation, nor are there reliable preservation software solutions that can be guaranteed to function even over the medium term. The digital archive’s risk landscape is complex and varied. For example, each type of storage medium (hard disc, tape etc.) has its own age distribution, which the archive needs to understand. Moreover, the impact of data corruption on the archive depends both on the characteristics of the file format of the records and on the information density of what is being stored. Compressed formats (zip, jpeg, mp3 etc.) may be far more susceptible to damage from a single bit flip. Conversely, if storage is not densely filled, a random bit flip may not have any effect at all.
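The difference between compressed and uncompressed data can be sketched in a few lines of Python. This is an illustrative experiment only, under some assumptions: the sample record is made up, the helper names are invented, and Python's `zlib` module stands in for compressed formats generally (it uses the same DEFLATE compression that zip files commonly use). The sketch flips one bit in a plain-text record, then tries every possible single-bit flip in the compressed copy and counts how many stop the record from being read back exactly.

```python
import zlib

def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)

def bytes_changed(original: bytes, damaged: bytes) -> int:
    """Count the byte positions where two equal-length values differ."""
    return sum(a != b for a, b in zip(original, damaged))

def survives_flip(compressed: bytes, bit_index: int, original: bytes) -> bool:
    """True if the compressed data still decompresses to the original."""
    try:
        return zlib.decompress(flip_bit(compressed, bit_index)) == original
    except zlib.error:
        return False

# A made-up plain-text record, used only for this demonstration.
record = b"".join(
    b"record %d: accessioned into the archive\n" % i for i in range(200)
)

# Uncompressed: the damage from one flipped bit stays inside one byte.
raw_damage = bytes_changed(record, flip_bit(record, 12345))

# Compressed: try each single-bit flip in turn and count the failures.
compressed = zlib.compress(record)
total_bits = len(compressed) * 8
lost = sum(not survives_flip(compressed, i, record) for i in range(total_bits))

print(f"raw: one flipped bit changed {raw_damage} of {len(record)} bytes")
print(f"zlib: {lost} of {total_bits} single-bit flips broke the round trip")
```

In the uncompressed record the rest of the text remains readable around the damaged character. In the compressed copy, a flip that breaks decompression can put the whole record out of reach unless another copy exists.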

The National Archives is developing new methods for measuring and managing the risks to the digital archive. We are currently developing Dynamic Bayesian Networks to model digital preservation risks. These are probabilistic networks of cause and effect. Crucially for the archive, they are also iterative, meaning that we can consider the impact of different preservation actions over time in our model. 
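As a rough idea of what such a model involves, here is a toy sketch in Python. It is not The National Archives' model: the two variables (the state of a storage medium and the state of a record), the two possible actions and every probability in it are invented for illustration. Each yearly time slice depends only on the previous slice and on the preservation action taken, which is the basic shape of a dynamic Bayesian network, and the sketch compares the chance of losing a record under two hypothetical ten-year plans.

```python
from itertools import product

# All probabilities below are invented for illustration only.
# P(medium is degraded next year | medium state now, action taken this year)
P_DEGRADED = {
    ("ok", "none"): 0.10,
    ("degraded", "none"): 1.00,
    ("ok", "migrate"): 0.02,
    ("degraded", "migrate"): 0.05,
}
# P(record is lost next year | record intact now, medium state now)
P_LOSS = {"ok": 0.001, "degraded": 0.05}

STATES = list(product(("ok", "degraded"), ("intact", "lost")))

def step(belief, action):
    """Advance the joint belief over (medium, record) by one time slice."""
    next_belief = {state: 0.0 for state in STATES}
    for (medium, record), p in belief.items():
        p_degraded = P_DEGRADED[(medium, action)]
        # A lost record stays lost; an intact one is at risk from the medium.
        p_lost = 1.0 if record == "lost" else P_LOSS[medium]
        for next_medium, pm in (("degraded", p_degraded), ("ok", 1 - p_degraded)):
            for next_record, pr in (("lost", p_lost), ("intact", 1 - p_lost)):
                next_belief[(next_medium, next_record)] += p * pm * pr
    return next_belief

def p_lost_after(plan):
    """Probability the record is lost after following a list of yearly actions."""
    belief = {state: 0.0 for state in STATES}
    belief[("ok", "intact")] = 1.0  # start with a healthy medium and record
    for action in plan:
        belief = step(belief, action)
    return sum(p for (_, record), p in belief.items() if record == "lost")

do_nothing = p_lost_after(["none"] * 10)
migrate_mid = p_lost_after(["none"] * 5 + ["migrate"] + ["none"] * 4)

print(f"P(lost after 10 years, no action):         {do_nothing:.4f}")
print(f"P(lost after 10 years, migrate in year 6): {migrate_mid:.4f}")
```

With these made-up numbers, the plan that migrates part-way through ends with a lower probability of loss. The value of a model like this lies in comparing the timing and type of interventions, rather than in any single figure it produces.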

Technology systems become obsolete at an extraordinary pace, making it a highly disrupted and disruptive environment. This makes the role of the archivist even more important in sustaining the archive and securing information. Our work involves intervening at the right time in the right way, to mitigate the risks to the digital records that we hold. We are at the forefront, internationally, in the creation of new approaches to modelling, understanding and responding to digital preservation risks over time.

The National Archives also operates some major digital services, including the websites nationalarchives.gov.uk and legislation.gov.uk.

Inclusive, disruptive, adapting and responding to change in a digital age. Imagine how exciting it would be to be part of making The National Archives!

Take a look at the roles The National Archives are bringing to the event here.