Incorrect information can very easily ruin your day.
Missing the bus because of an outdated timetable, missing out on love because of an incorrect phone number, missing the point of a joke because someone told the punchline wrong - all of these day-to-day frustrations could be avoided with accurate data.
However, having accurate data is even more vital when it comes to business because the stakes are often much, much higher. Corrupt data can cost your company its reputation, customers and revenue. Gartner reported that the average financial impact of poor data quality is $15 million per year in losses.
Source: Forbes
In its most simplified sense, data integrity is the practice of ensuring data remains accurate, valid and consistent throughout the entire data life cycle. To understand the concept fully, you need to know that data integrity has two definitions, depending on the context in which you approach it.
First, we have logical data integrity as a process that ensures that data is kept accurate and consistent. The primary purpose of this process is to stop data from becoming compromised and, essentially, useless.
There are three basic types of logical data integrity:
Referential integrity requires every foreign key value in a child table to match an existing primary key in the parent table, thus ensuring consistency between these tables.
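To make referential integrity more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and the insert_order helper are invented for illustration; the point is only that the database refuses an order that points to a customer who doesn't exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical parent and child tables.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")


def insert_order(order_id, customer_id):
    """Return True if the order was stored, False if it broke referential integrity."""
    try:
        conn.execute(
            "INSERT INTO orders (id, customer_id) VALUES (?, ?)",
            (order_id, customer_id),
        )
        return True
    except sqlite3.IntegrityError:
        # The foreign key does not match any primary key in customers.
        return False


accepted = insert_order(100, 1)    # customer 1 exists
rejected = insert_order(101, 999)  # customer 999 does not exist
```

Other databases enforce the same rule with their own syntax, so treat this as an illustration of the idea rather than a recommended schema.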
Second, we have physical data integrity as a state, the product of these processes: a data set that is accurate and valid. Here we are concerned with storing and fetching the data to ensure it is not corrupted by events such as power outages, natural disasters, corrosion, etc. For many businesses, the introduction of cloud storage has greatly reduced the threat posed by loss of physical data integrity.
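One common way to notice that stored data has been corrupted is to keep a checksum alongside it and compare on every read. The sketch below assumes SHA-256 from Python's standard hashlib module; the sample bytes and function names are made up for illustration.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 checksum of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, expected_checksum: str) -> bool:
    """True if the data still matches the checksum taken when it was stored."""
    return sha256_of(data) == expected_checksum


# Hypothetical record: the checksum is computed once, at write time.
original = b"order_id,amount\n100,25.00\n"
stored_checksum = sha256_of(original)

# Later, a single changed character is enough to fail the check.
corrupted = b"order_id,amount\n100,26.00\n"
```

A checksum only tells you that something changed, not what the correct value was, so it complements backups rather than replacing them.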
If data integrity processes are not followed, the result can, as we have already mentioned, be costly for businesses, researchers and anyone else attempting to make decisions based on that data.
Here are some scenarios and instances where data integrity can become compromised:
How can you ensure that your data is accurate and consistent when it's generated, duplicated, accessed, and moved around your enterprise at such a rapid rate?
The FDA (Food and Drug Administration) has outlined some principles for those in the pharmaceutical industry to ensure data integrity when recording on paper or electronically. Although initially intended for pharmaceuticals, they have become widely circulated and accepted as a standard across industries. The principles can be remembered by using the acronym ALCOA, which stands for Attributable, Legible, Contemporaneous, Original and Accurate.
Attributable: this principle concerns responsibility for data and the ability to trace any action to a single user. For data to be attributable, any person who performs a data action (recording, transforming or moving data) must be identifiable as the person who took that action.
Analysts must log every data action, including the name of the person, the computer ID, the date of the action, etc.
Legible: simply put, this principle aims to ensure that data can be read and understood by everyone who accesses it - whether it is recorded on paper or electronically.
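As a rough sketch of what an attributable log entry could look like, the snippet below captures who acted, on which machine and when. The AuditEntry class and record_action helper are hypothetical names; by default the sketch reads the current user and hostname from the operating system, which is an assumption that may not suit every environment.

```python
import getpass
import socket
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: an entry cannot be edited after creation
class AuditEntry:
    user: str
    computer_id: str
    action: str
    recorded_at: datetime


def record_action(action, user=None, computer_id=None):
    """Build a log entry, falling back to the OS user and hostname."""
    return AuditEntry(
        user=user or getpass.getuser(),
        computer_id=computer_id or socket.gethostname(),
        action=action,
        recorded_at=datetime.now(timezone.utc),  # timezone-aware timestamp
    )


# Illustrative values only.
entry = record_action("updated price", user="a.lovelace", computer_id="LAB-PC-07")
```

In a real system these entries would be written to durable storage; here they only exist in memory.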
Ensure that data is recorded in standard terms and values so that even when the data-recorder has left an organisation, the data remains valid and usable.
Contemporaneous: data should be recorded at the same time as the data activity or immediately afterwards. All data activities should be timestamped to ensure that analysts have a clear record of the date and time when they took place.
Back-dating or overwriting data activity logs is a threat to data integrity as it increases the likelihood of human error or data loss.
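One way to make back-dating or overwriting detectable is to chain log entries together with hashes, so that changing an earlier entry breaks every hash after it. This is a simplified sketch with invented function names and sample timestamps, not a complete tamper-proofing scheme: anyone able to rewrite the whole log could also recompute the hashes.

```python
import hashlib
import json


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> None:
    """Add a new entry to the end of the log; earlier entries are untouched."""
    prev_hash = log[-1]["hash"] if log else ""
    log.append({"payload": payload, "hash": entry_hash(prev_hash, payload)})


def verify_log(log: list) -> bool:
    """Recompute every hash in order; any edited entry makes this False."""
    prev_hash = ""
    for entry in log:
        if entry["hash"] != entry_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical activity log.
log = []
append_entry(log, {"at": "2024-03-01T09:00:00Z", "action": "sample weighed"})
append_entry(log, {"at": "2024-03-01T09:05:00Z", "action": "result recorded"})
```

If someone later changes the timestamp on the first entry, verify_log(log) returns False.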
Original: data must be recorded as raw or source data in the original location. In other words, when recording any new data or data activity, you must ensure that you not only record it immediately but that you enter it into the correct system.
If data is quickly recorded in paper notes and then either transferred onto official forms or electronic databases, corruptions and errors in the data entry can occur. Original data must always be maintained as the true copy to ensure a data audit trail can be maintained.
Accurate: recorded data must be free from errors and complete. However, we all know errors can occur, and in these circumstances, corrections must also be recorded.
When recording data on paper, this usually means drawing a line through the mistake and marking the correction. But what about when we are using electronic data (which most of us now are)? Automated audit trails and edit checks should be a feature of any database you use to ensure that incorrect alterations are flagged and corrected before resubmission.
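The electronic equivalent of drawing a line through a mistake is to keep the old value alongside the new one. The sketch below shows one possible shape for that, with hypothetical class names (AuditedValue, Correction) and a simple edit check that rejects a correction with no stated reason.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Correction:
    old_value: str
    new_value: str
    corrected_by: str
    reason: str
    corrected_at: datetime


@dataclass
class AuditedValue:
    value: str
    history: list = field(default_factory=list)

    def correct(self, new_value, corrected_by, reason):
        """Replace the value but keep the old one in the audit trail."""
        if not reason.strip():
            # Edit check: every correction must be explained.
            raise ValueError("a correction must state its reason")
        self.history.append(
            Correction(self.value, new_value, corrected_by, reason,
                       datetime.now(timezone.utc))
        )
        self.value = new_value


# Illustrative values only.
reading = AuditedValue("7.4")
reading.correct("7.8", corrected_by="j.doe", reason="transcription error")
```

After the correction, reading.value holds the new figure while reading.history still shows what was originally entered, by whom it was changed and why.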
Source: Laafon
As the ALCOA acronym was adopted by more industries and by large institutions such as the WHO (World Health Organization), it was expanded upon. As such, ALCOA is now often referred to as ALCOA+ and includes four further factors: Complete, Consistent, Enduring and Available.
Complete: the information recorded must be complete enough to recreate the event, test or analysis carried out. If not all information is documented or disclosed, it can undermine data integrity and reliability.
Consistent: all recording procedures must be carried out consistently, in the correct order and with all time-recording apparatus synchronised to ensure accurate recording.
Enduring: the material used to record the data must be maintained so that the records last. For example, if paper files are kept, they should also be backed up electronically; if electronic databases are held on physical storage, they should be backed up to cloud storage, etc.
Available: records should be accessible to the organisation for review, audit and analysis at any time during the data lifecycle.
As we discussed above, data integrity can be compromised when problems occur during the migration of data. In some forms of data integration, data is transferred and replicated between systems for communication and analytics - this is precisely when unwanted duplications or alterations can occur.
Steps must be taken to avoid the corruption of data during the integration process:
Attempting to integrate data without adequate data integrity protocols can lead to wasted resources and inaccurate business intelligence, meaning you’re basing essential decisions on bad data. Not ideal.
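One check that can follow a transfer is to compare the source and target records and report anything missing, duplicated or altered. The sketch below assumes each record is a dictionary with a unique "id" field; the compare_tables function and the sample rows are invented for illustration.

```python
from collections import Counter


def compare_tables(source_rows, target_rows, key="id"):
    """Report keys that went missing, were duplicated or were altered in transfer."""
    source = {row[key]: row for row in source_rows}
    target_counts = Counter(row[key] for row in target_rows)
    # If a key is duplicated in the target, the last copy is the one compared.
    target = {row[key]: row for row in target_rows}
    return {
        "missing": sorted(k for k in source if k not in target),
        "duplicated": sorted(k for k, n in target_counts.items() if n > 1),
        "altered": sorted(
            k for k in source if k in target and source[k] != target[k]
        ),
    }


# Hypothetical records before and after an integration run.
source_rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
    {"id": 3, "email": "c@example.com"},
]
target_rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "B@example.com"},
    {"id": 2, "email": "B@example.com"},
]

report = compare_tables(source_rows, target_rows)
```

For large data sets you would more likely compare row counts and checksums inside the database than load everything into memory, but the idea is the same.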
Summing up
Poor data integrity will undermine any effort you make to be a data-driven business. It seems like common sense, yet it’s estimated that only 3% of companies meet basic data quality standards.
Safeguarding data integrity can ensure:
If you follow the principles laid out in this blog, you will be on the right track to keep your data accurate, valid and consistent, so that you can feel secure in the decisions you make with it.