Pseudonymised Information and Data Masking Explained

The GDPR encourages organisations to apply two techniques to the personal data they hold: anonymisation and pseudonymisation. Although the two sound similar, they differ markedly in their implications and requirements.

What is Personal Data?

Before we examine these processes, it is worth first considering a key definition. ‘Personal data’ is, unsurprisingly, a core concept in the GDPR. It is defined as any information relating to an identified or identifiable person (the data subject), that is, any information that identifies an individual or could be used to identify one.

This identifiable information includes obvious categories such as name, age and address. The GDPR also defines ‘special categories of personal data’, such as data concerning health, ethnic origin or religious beliefs. These are considered particularly sensitive in nature and are thus subject to a higher level of protection.

Pseudonymisation vs. Anonymisation

Pseudonymisation takes a data set of personal information and either replaces or removes the data that can be used to identify an individual. The information can then no longer be attributed to a data subject without additional information, which is kept separately. This additional information must itself be subject to technical and organisational measures to ensure that the data remains unattributable to the data subject.
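As a minimal sketch of this idea, the Python below replaces a name with a pseudonym derived from a secret key. The key plays the role of the ‘additional information kept separately’. The key value, the `pseudonymise` function, the field names and the record are all made up for illustration, and a keyed hash is only one of several ways this could be done.

```python
import hashlib
import hmac

# Hypothetical secret: in practice this would be stored apart from the
# data set and protected by its own access controls.
SECRET_KEY = b"example-key-kept-in-a-separate-system"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable pseudonym derived from the identifier and the key."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # shortened only to keep this sketch readable


# Made-up record for illustration.
record = {"name": "Mathew Smith", "diagnosis": "asthma"}

# The name is replaced by a pseudonym; the rest of the record is kept.
pseudonymised = {
    "subject_id": pseudonymise(record["name"]),
    "diagnosis": record["diagnosis"],
}
```

Anyone holding only `pseudonymised` sees a subject identifier and a diagnosis. Someone who also holds the key can recompute the pseudonym for a known name and so link the record back to that person, which is why the key needs its own protection.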

Like pseudonymisation, anonymisation makes personal data no longer attributable to the data subject. Unlike pseudonymisation, however, it cannot be reversed. As a result, under the GDPR, properly anonymised data can no longer be categorised as personal data and is not subject to the same rules and regulations.

Techniques for Pseudonymisation

Many methods are used to pseudonymise information; some are reversible and some are not. The following methods serve different purposes, and each has its own strengths and weaknesses.

Scrambling is a technique that entails mixing and obfuscating letters. For example, the name ‘Mathew’ may, once scrambled, become ‘Teamhw’.
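Scrambling can be sketched in a few lines of Python. The `scramble` function here is a hypothetical helper that simply shuffles the characters of a value, which is one simple way of mixing letters rather than a standard algorithm.

```python
import random


def scramble(text: str, rng: random.Random) -> str:
    """Return the characters of `text` in a shuffled order."""
    chars = list(text)
    rng.shuffle(chars)
    return "".join(chars)


rng = random.Random()  # unseeded, so the result differs from run to run
scrambled = scramble("Mathew", rng)
```

The output always contains the same letters as the input, only reordered, so on short values such as names the original can often be guessed. A shuffle can also occasionally return the letters in their original order.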

Data Blurring, perhaps best exemplified by facial blurring on video footage, renders data unidentifiable by approximating values in a way that cannot be reversed.
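For tabular data, blurring usually means replacing an exact value with a coarser one. The sketch below assumes two hypothetical helpers, `blur_age` and `blur_amount`, and the band and step sizes are arbitrary choices for illustration.

```python
def blur_age(age: int, band: int = 10) -> str:
    """Replace an exact age with the band it falls into, e.g. 37 -> '30-39'."""
    lower = (age // band) * band
    return f"{lower}-{lower + band - 1}"


def blur_amount(amount: float, step: int = 1000) -> int:
    """Round an amount to the nearest multiple of `step`."""
    return int(round(amount / step) * step)


print(blur_age(37))          # 30-39
print(blur_amount(41250.0))  # 41000
```

Once only the band or the rounded figure is stored, the exact value cannot be recovered from it, which is what makes this approach irreversible.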

Masking is a technique of obfuscation that hides part of a value, so that the data can only be used for certain purposes whilst the information made available is kept to a minimum. This method is often employed when you are asked to verify phone or card numbers (e.g. XXX XXXX 5861).
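A masking routine of this kind might look like the following sketch. The `mask` function and its parameters are assumptions made for illustration, and the phone number is made up.

```python
def mask(value: str, visible: int = 4, mask_char: str = "X") -> str:
    """Hide every digit except the last `visible` ones, keeping separators."""
    total_digits = sum(ch.isdigit() for ch in value)
    to_mask = max(total_digits - visible, 0)  # how many leading digits to hide
    out = []
    for ch in value:
        if ch.isdigit() and to_mask > 0:
            out.append(mask_char)
            to_mask -= 1
        else:
            out.append(ch)  # spaces and the trailing digits pass through
    return "".join(out)


print(mask("079 4612 5861"))  # XXX XXXX 5861
```

Keeping the last few digits visible lets a person confirm that the number is theirs without the full value being displayed.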

Tokenisation substitutes sensitive data with a non-sensitive equivalent. A benign and randomly generated ‘token’ can then be used to access the original data. Bearing no relation to the original data, tokens can even be single use, thus increasing their level of security. Tokens also allow organisations to minimise their access to sensitive information, and thus their liability for it.
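The sketch below illustrates tokenisation with a hypothetical in-memory `TokenVault` class. In a real deployment the vault would be a separate, access-controlled system; the class, its method names and the dummy card number are assumptions made for illustration.

```python
import secrets


class TokenVault:
    """Hypothetical vault mapping random tokens to the original values."""

    def __init__(self) -> None:
        self._store = {}

    def tokenise(self, value: str) -> str:
        # The token is random, so it bears no relation to the value.
        token = secrets.token_urlsafe(16)
        self._store[token] = value
        return token

    def detokenise(self, token: str, single_use: bool = False) -> str:
        # A single-use token is removed from the vault once redeemed.
        if single_use:
            return self._store.pop(token)
        return self._store[token]


vault = TokenVault()
token = vault.tokenise("4111 1111 1111 1111")  # dummy card number
original = vault.detokenise(token)
```

Systems that only ever handle `token` never see the card number itself, which is how tokenisation reduces an organisation's exposure to the sensitive value.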

Encryption is a process which transforms data into an unintelligible form. It can be extremely difficult to reverse: without the correct ‘decryption key’ (which is kept separate from the encrypted data), even the most powerful computers on Earth would require thousands of years to ‘crack’ strong encryption methods.
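To show the role of the key, here is a deliberately simple toy: a one-time pad built only from the Python standard library. It is not one of the strong encryption methods referred to above, and production systems would normally rely on a vetted cryptographic library instead. The `encrypt` and `decrypt` helpers and the sample message are assumptions made for illustration.

```python
import secrets
from typing import Tuple


def encrypt(plaintext: bytes) -> Tuple[bytes, bytes]:
    """Toy one-time pad: XOR the data with a random key of the same length."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return ciphertext, key


def decrypt(ciphertext: bytes, key: bytes) -> bytes:
    """Reverse the XOR; this only works with the matching key."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))


message = b"Patient 1042: asthma"  # made-up example data
ciphertext, key = encrypt(message)

# The key would be stored separately from the ciphertext.
recovered = decrypt(ciphertext, key)
```

The ciphertext on its own is unintelligible, and the original is only recovered when the separately stored key is supplied.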

Depending on your purposes and the nature of the data you are handling, one or more of these methods of pseudonymisation may be recommended, or even necessary, under the GDPR. For instance, if you are handling any special categories of personal data, or data that could be considered particularly sensitive (e.g. medical records), your requirements under the law will differ from those that apply to less sensitive data, such as age group.