Managing Data Integrity for Finance

By: Jane Sarah Lat

Overview of this book

Data integrity management plays a critical role in the success and effectiveness of organizations trying to use financial and operational data to make business decisions. Unfortunately, there is a big gap between the analysis and management of finance data and the proper implementation of complex data systems across various organizations. The first part of this book covers the important concepts for data quality and data integrity relevant to finance, data, and tech professionals. The second part then focuses on having you use several data tools and platforms to manage and resolve data integrity issues on financial data. The last part of the book covers intermediate and advanced solutions, including managed cloud-based ledger databases, database locks, and artificial intelligence, to manage the integrity of financial data in systems and databases. After finishing this hands-on book, you will be able to solve various data integrity issues experienced by organizations globally.
Table of Contents (16 chapters)
Part 1: Foundational Concepts for Data Quality and Data Integrity for Finance
Part 2: Pragmatic Solutions to Manage Financial Data Quality and Data Integrity
Part 3: Modern Strategies to Manage the Data Integrity of Finance Systems

Debunking the myths and misconceptions surrounding finance data integrity management

There are several myths and misconceptions that can negatively influence the financial practices and processes of various departments in an organization. In this section, we will cover the different beliefs within an organization that can lead to data quality issues and noncompliance, which could in turn lead to major financial consequences. Once we are able to debunk these myths and misconceptions, we can establish more effective strategies and practices that ensure the integrity and reliability of financial data in our own organizations.

Myth 1 – only large financial organizations are concerned about data integrity

Data integrity issues affect organizations of varying sizes, from start-ups and small businesses to large organizations. As mentioned earlier in this chapter, poor financial data integrity management can result in serious financial consequences or regulatory offenses. For example, start-ups may end up deprioritizing financial data integrity management and avoid strict processes that could slow down progress. This could lead to inconsistencies in their financial reporting, budget forecasts, and internal audits, potentially resulting in significant long-term financial and reputational damage.

Though small businesses can have less complexity and less regulation compared to bigger organizations, there is still a need to maintain data integrity throughout the data lifecycle. This enables business owners and management to confidently rely on the data and make informed business decisions.

Myth 2 – only finance professionals should be concerned about data integrity

While finance professionals play a crucial role in data integrity management, this responsibility needs to be shared across the entire organization. This involves promoting a culture of quality within the organization, increasing data literacy through training, and fostering an environment of openness.

For one thing, software engineers need to be mindful of data integrity issues and risks when building financial applications. Junior software engineers may not be aware that adding 0.1 and 0.2 in languages such as Python or JavaScript, without first converting these floating-point values to a decimal type, yields 0.30000000000000004 instead of 0.3! Where did that extra 0.00000000000000004 come from?! To help solve this mystery, developers should be aware that a floating-point number (often referred to simply as a float) is stored in a binary format that cannot exactly represent most decimal fractions, such as 0.1. Using float instead of decimal when developing financial applications is a bad idea, as it introduces approximation errors and rounding errors that can accumulate across calculations.
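
The 0.1 + 0.2 behavior described above can be reproduced in a few lines. This is a minimal sketch using Python's standard decimal module; the variable names are illustrative only:

```python
from decimal import Decimal

# Binary floating point cannot store 0.1 or 0.2 exactly, so the sum is slightly off.
float_sum = 0.1 + 0.2
print(float_sum)         # 0.30000000000000004
print(float_sum == 0.3)  # False

# Building Decimal values from strings keeps the exact decimal digits.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum)                    # 0.3
print(decimal_sum == Decimal("0.3"))  # True

# Passing a float (not a string) to Decimal exposes the binary value
# that is actually stored, which is close to, but not exactly, 0.1.
stored_value = Decimal(0.1)
print(stored_value)
```

Note that the Decimal values are created from strings. Creating them from floats would carry the binary approximation into the decimal value.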

The impact on a financial calculation can be demonstrated with a simple code example using the Python programming language:

# Interest calculation using built-in floating-point numbers
principal = 1000
rate = 0.001 # 0.1%
time = 1/365 # 1 day out of 365
interest_float = principal * rate * time
interest_float

Running this block of code would result in 0.0027397260273972603, similar to what we have in Figure 1.9:

Figure 1.9 – Getting the results for interest_float

Let’s try performing the same calculation, but this time we’ll use the Decimal type from Python’s decimal module:

from decimal import Decimal
principal = Decimal('1000')
rate = Decimal('0.001') # 0.1%
time = Decimal('1') / Decimal('365')
interest_decimal = principal * rate * time
interest_decimal

This will result in Decimal('0.002739726027397260273972602740'), as can be seen in Figure 1.10:

Figure 1.10 – Getting the results for interest_decimal

Once we subtract the interest values (one stored as a floating-point number and the other stored in decimal form), we will get a difference similar to what we have in Figure 1.11:

Figure 1.11 – Getting the difference between the interest calculations
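
Since the figure itself is not reproduced here, the following is one possible way to compute the difference, not necessarily the approach used in the figure. Python raises a TypeError when a float and a Decimal are mixed directly in arithmetic, so this sketch first converts the float to the exact Decimal value of the stored binary number:

```python
from decimal import Decimal

# Recompute both interest values so that this snippet runs on its own.
interest_float = 1000 * 0.001 * (1 / 365)
interest_decimal = (
    Decimal("1000") * Decimal("0.001") * (Decimal("1") / Decimal("365"))
)

# interest_float - interest_decimal would raise a TypeError, so the float
# is converted to its exact Decimal equivalent before subtracting.
difference = Decimal(interest_float) - interest_decimal
print(difference)
```

The result is tiny but not zero, which confirms that the two calculations do not store the same number.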

This difference may seem small. However, what if the interest calculations are needed, for example, on overnight deposits worth hundreds of millions of dollars?
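
Small errors also add up when the same operation is repeated many times. The sketch below is an illustration under simple assumptions: the helper names total_as_float and total_as_decimal are hypothetical, and the scenario (posting the same 0.10 amount repeatedly to a running total) is invented for demonstration:

```python
from decimal import Decimal


def total_as_float(amount: float, count: int) -> float:
    """Add the same float amount `count` times to a running total."""
    total = 0.0
    for _ in range(count):
        total += amount
    return total


def total_as_decimal(amount: str, count: int) -> Decimal:
    """Add the same Decimal amount `count` times to a running total."""
    total = Decimal("0")
    step = Decimal(amount)
    for _ in range(count):
        total += step
    return total


# Ten postings of 0.10 should total exactly 1.00.
print(total_as_float(0.10, 10) == 1.0)                  # False
print(total_as_decimal("0.10", 10) == Decimal("1.00"))  # True
```

With floats, the running total already drifts away from the expected value after ten additions, while the Decimal total stays exact.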

Note

For more information about this topic, feel free to check the following link: https://en.wikipedia.org/wiki/Floating-point_arithmetic.

Myth 3 – only internal financial reporting systems are affected by data integrity issues

Any type of financial system with a database can be affected by data integrity issues. This includes banking systems, accounting software, investment management platforms, and even payroll systems.

Note that data integrity issues can negatively impact machine learning-powered financial systems as well. ML-powered financial applications make use of machine learning models, which learn from data to identify patterns, make decisions, or predict outcomes. What if the data used to train these models have data integrity issues? In such cases, these models may produce inaccurate or biased results, as they rely on the quality of the training data to make predictions. This could lead to significant challenges in application performance and even potentially yield harmful results, especially in finance. As the saying goes: garbage in, garbage out.

As more organizations around the world build ML-powered financial applications, taking care of the integrity of data and addressing the phenomenon called machine learning drift is critical. While we won’t discuss machine learning drift in detail, it’s important that we are aware that this drift leads to a decline in model accuracy, which impacts the effectiveness of a machine learning system. It is essential that we know that data integrity issues, such as inconsistencies, missing values, or biases, significantly contribute to this drift.
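
As a rough illustration of checking for the issues mentioned above before data reaches a model, the following sketch scans a few records for missing values, duplicate identifiers, and inconsistent formatting. The records, field names, and the find_issues helper are all hypothetical and not taken from any particular system:

```python
# Hypothetical transaction records; the field names are illustrative only.
records = [
    {"id": "T1", "amount": "120.50", "currency": "USD"},
    {"id": "T2", "amount": None, "currency": "USD"},     # missing value
    {"id": "T2", "amount": "75.00", "currency": "usd"},  # duplicate id, inconsistent code
]


def find_issues(rows):
    """Return (row id, description) pairs for basic data integrity problems."""
    issues = []
    seen_ids = set()
    for row in rows:
        row_id = row.get("id")
        # The same identifier appearing twice is treated as an inconsistency.
        if row_id in seen_ids:
            issues.append((row_id, "duplicate id"))
        seen_ids.add(row_id)
        # Every expected field must be present and non-empty.
        for field in ("id", "amount", "currency"):
            if row.get(field) in (None, ""):
                issues.append((row_id, f"missing {field}"))
        # Currency codes are assumed to be stored in upper case.
        currency = row.get("currency")
        if currency and currency != currency.upper():
            issues.append((row_id, "currency code not upper case"))
    return issues


for issue in find_issues(records):
    print(issue)
```

Checks like these do not address drift on their own, but they can keep obviously flawed records out of the training data.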

Myth 4 – processes that improve data integrity are expensive and difficult to implement

Contrary to popular belief, improving the quality and integrity of data used by organizations doesn’t have to be expensive. There are practical ways and processes that can be implemented using the most commonly used tools available, such as Microsoft Excel, Google Sheets, and Power BI. These tools offer functionalities such as data validation, conditional formatting, and pivot tables, which can be leveraged to maintain accurate and consistent data. In addition to this, integrating basic data checks and regular audits into routine processes can go a long way toward preserving the integrity of financial data. Training staff in effective data management and the use of these tools can also be done with minimal expense.

Note

Using various solutions and features for data integrity management will be covered further in Chapter 4, Understanding the Data Integrity Management Capabilities of Business Intelligence Tools.

Myth 5 – only electronic data is affected by data integrity issues

All types of data are affected by data integrity issues, whether stored digitally or on paper. It is important to be able to accurately store and retrieve data, whether from an electronic database or from hardcopy documents. One way of minimizing risks in the data collection process is anticipating and managing potential human errors ahead of time. This can be addressed by doing data validation checks, double-checking the work, having a standardized process, or enabling automation.
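
A data validation check for values encoded from paper documents could look like the sketch below. The validate_entry helper and its rules (an ISO-style date, and a positive amount with at most two decimal places) are assumptions made for illustration, not requirements stated in this chapter:

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation


def validate_entry(date_text: str, amount_text: str) -> list:
    """Return a list of problems found in one manually encoded entry."""
    errors = []

    # The date must be a real calendar date written as YYYY-MM-DD.
    try:
        datetime.strptime(date_text, "%Y-%m-%d")
    except ValueError:
        errors.append("date must use YYYY-MM-DD")

    # The amount must parse as a finite, positive decimal number
    # with no more than two decimal places.
    try:
        amount = Decimal(amount_text)
    except InvalidOperation:
        errors.append("amount is not a number")
        return errors

    if not amount.is_finite():
        errors.append("amount is not a number")
    elif amount <= 0:
        errors.append("amount must be positive")
    elif amount.as_tuple().exponent < -2:
        errors.append("amount has more than two decimal places")
    return errors


print(validate_entry("2024-01-15", "100.00"))  # []
print(validate_entry("2024-02-30", "12.345"))
```

Running the same check on every entry is one way to combine a standardized process with light automation.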

Note

While there are machine learning-powered tools to help automate the encoding process, it is crucial to remember that these tools also require regular monitoring and validation to ensure that they are functioning correctly and adapting to any changes in data formats or structures.

That’s pretty much it! At this point, we should have a better understanding of the myths and misconceptions and a deeper appreciation of the right mindset and approach toward ensuring the integrity of financial data.