Quick Summary
The Bank of Canada relies on securities-related data from various external sources to monitor financial markets and conduct related research. To make this information usable, staff invest considerable time manually “cleaning” the data they receive, that is, organizing and preparing it for use.
The objective of this challenge was to explore innovative approaches to data cleaning and to exceed existing limitations by identifying an automated solution.
As part of the Partnerships in Innovation and Technology (PIVOT) Program, the Bank of Canada collaborated with Bronson to develop a solution aimed at minimizing information duplication, automating the data cleaning process, and enhancing the accuracy of the results produced in-house.
Project Overview
Bronson set out to review, clean and align the financial security tombstone data provided to the Bank. This data underpins many of the Bank’s data analytics challenges.
Led by Phil Cormier and supported by Bronson’s President Martin McGarry, Bronson began by reviewing the data provided by the Bank’s Financial Markets Department (FMD). This data involved a specific use case that required matching organizational names across three different datasets. Using the Alteryx Fuzzy matching tool as the matching engine, workflows were created through Alteryx Designer to analyze, clean and standardize the data, which made linking the common fields across datasets easier.
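To illustrate the kind of standardization that makes organizational names easier to link, here is a minimal Python sketch. It is not the Alteryx workflow Bronson built; the function name `standardize_name` and the list of legal-form suffixes are assumptions chosen for illustration only.

```python
import re
import unicodedata

# Hypothetical list of legal-form suffixes to drop before comparing names.
LEGAL_SUFFIXES = {
    "inc", "incorporated", "corp", "corporation",
    "ltd", "limited", "plc", "llc",
}


def standardize_name(raw: str) -> str:
    """Return a simplified, comparable form of an organization name."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", raw)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase and spell out ampersands so "&" and "and" compare equal.
    text = text.lower().replace("&", " and ")
    # Replace anything that is not a letter, digit or space with a space.
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    # Drop legal-form suffixes and collapse repeated whitespace.
    tokens = [tok for tok in text.split() if tok not in LEGAL_SUFFIXES]
    return " ".join(tokens)
```

With this sketch, "Royal Bank of Canada, Inc." and "ROYAL BANK OF CANADA" both reduce to the same string, so an exact join can link them before any fuzzy matching is attempted.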
After Bank staff and Bronson Consulting reviewed and discussed the initial results, the focus shifted to generating outputs that could be used to enhance the accuracy of record matching. A second review showed the potential for a scalable, robust solution that could automate the Bank’s data cleaning methods.
The Challenge
The Bank partnered with Bronson to find a solution that would:
- Reduce duplication of information;
- Automate the cleaning of the data; and
- Improve the accuracy of results currently achieved in-house.
Our Solution and Impact
Using Alteryx, Bronson reviewed all the securities data and found that the task was not merely a matter of cleaning and matching records. Each dataset had to be analyzed individually, and a data schema engineered, so that the challenge could be worked through in a stepwise fashion. Because the source datasets themselves contained irregularities, it was necessary to run a data assay to identify clear anomalies that would otherwise render downstream processing moot.
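A data assay of this kind can be pictured as a simple profile of each dataset before any matching is attempted. The sketch below is an assumption about what such a check might look like in plain Python, not a description of Bronson’s workflow; the function name `assay`, the field names and the sample records are all hypothetical.

```python
from collections import Counter


def assay(records, key_field, name_field):
    """Summarize simple anomalies in one dataset (a list of dicts)."""
    keys = [rec.get(key_field) for rec in records]
    names = [rec.get(name_field) or "" for rec in records]
    key_counts = Counter(k for k in keys if k)
    return {
        "rows": len(records),
        # Records with no identifier cannot be joined reliably.
        "missing_key": sum(1 for k in keys if not k),
        # Identifiers that appear more than once.
        "duplicate_keys": sorted(k for k, n in key_counts.items() if n > 1),
        # Names that are empty or whitespace only.
        "blank_names": sum(1 for n in names if not n.strip()),
        # Non-blank names carrying leading or trailing whitespace.
        "untrimmed_names": sum(1 for n in names if n.strip() and n != n.strip()),
    }


# Made-up sample records, for illustration only.
sample = [
    {"isin": "CA0000000001", "issuer": "Acme Corp"},
    {"isin": "CA0000000001", "issuer": "ACME Corporation "},
    {"isin": "", "issuer": "Maple Holdings"},
    {"isin": "CA0000000002", "issuer": "  "},
]
report = assay(sample, key_field="isin", name_field="issuer")
```

Running a profile like this on each source separately shows which anomalies belong to which dataset, which is what allows the later steps to proceed in order.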
Once the irregularities in the source data were identified, Bronson set out to build a series of Alteryx workflows, all capable of providing consistent, automated and reproducible results. By employing the Alteryx Fuzzy matching tool, which incorporates the Jaro-Winkler similarity test, Bronson effectively compared datasets with varying sources and naming conventions.
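For readers unfamiliar with the Jaro-Winkler measure, the following is a minimal plain-Python sketch of the standard algorithm. It is not Alteryx’s internal implementation, and the `best_match` helper and its 0.9 threshold are assumptions added for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they sit within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1,
                 max_prefix: int = 4) -> float:
    """Jaro similarity boosted for a shared prefix of up to max_prefix chars."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def best_match(name, candidates, threshold=0.9):
    """Hypothetical helper: best-scoring candidate, or None below threshold."""
    scored = [(jaro_winkler(name, cand), cand) for cand in candidates]
    score, candidate = max(scored, default=(0.0, None))
    return (candidate, score) if score >= threshold else (None, score)
```

For example, `jaro_winkler("MARTHA", "MARHTA")` evaluates to roughly 0.961. The prefix boost rewards names that start the same way, which suits organization names whose differences tend to appear toward the end.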
Bronson was able to demonstrate the simplicity and value of Alteryx and the fuzzy matching toolsets embedded within it. Bronson left a roadmap for the Bank to both clean its securities tombstone data and continuously automate that process. Bronson aims to further its collaboration with the Bank to develop a strategic solution that surpasses the initial PIVOT challenge.
Learn more about the Alteryx platform.