Improving the classification of your transaction data with machine learning
How the Data Platform team at TrueLayer improved purchase transaction classification with machine learning.
- Alex Spanos: Lead Data Scientist, Data Platform
- Daniele Paliotta: Machine Learning Engineer Intern, Data Platform
Transaction classification at TrueLayer
TrueLayer’s Data API is a uniform, reliable and secure conduit through which applications can retrieve the banking data of their end users, including their financial transaction history.
On top of enabling the underlying connectivity to the raw transactions, TrueLayer provides additional information and insights into the transactions themselves. In this context, our classification service enriches financial transactions with information related to their purpose and the relevant counterparty.
The output of our classification service is embedded in the response of our Data API. For each purchase-related transaction (i.e. Credit/Debit Card payments and Direct Debits) the service tries to assign:
- a category and sub-category based on our taxonomy; and
- the merchant name.
The service is currently in beta, and in the Platform team we are actively working on enabling classification for other transaction types, such as Transfers and Standing Orders.
In September 2019, we shipped a new version of our classification service. This release represents a major milestone for TrueLayer: our first foray into the world of Machine Learning-enabled data products, and the first of many to come!
Nomenclature note: this process is frequently referred to as “categorisation” in our broader ecosystem; however, we prefer “classification”.
The joys of rules-based systems
The original service for classifying purchase-related transactions was rules-based: an expert system.
It relied on building classification rules from the transaction description through an offline human annotation process. But the transaction descriptions returned by providers can vary quite a bit, depending on payment-related conditions and on the provider itself. So, to make annotation more efficient, we grouped similar transactions together by creating a complex set of parsing rules that standardised the form of the transaction description across providers and payment conditions.
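To make the idea concrete, here is a minimal sketch of how parsing rules and an annotation table might fit together. It is an illustration under stated assumptions, not our production code: the regular expressions, the `PARSING_RULES` and `ANNOTATIONS` names, the example merchant and the category labels are all hypothetical.

```python
import re

# Hypothetical parsing rules: each (pattern, replacement) pair strips
# provider- or payment-specific noise from a raw transaction description.
PARSING_RULES = [
    (re.compile(r"^(CARD PAYMENT TO|POS|DD)\s+", re.IGNORECASE), ""),  # payment-type prefixes
    (re.compile(r"\b\d{2}[/-]\d{2}([/-]\d{2,4})?\b"), ""),             # embedded dates
    (re.compile(r"[*#]+\d*"), " "),                                    # reference markers such as *1234
    (re.compile(r"\s+"), " "),                                         # collapse repeated whitespace
]

# Hypothetical output of human annotation:
# standardised description -> (merchant, category, sub-category)
ANNOTATIONS = {
    "ACME COFFEE": ("Acme Coffee", "Food & Dining", "Coffee shops"),
}


def standardise(description: str) -> str:
    """Apply every parsing rule in order, then trim and upper-case."""
    text = description
    for pattern, replacement in PARSING_RULES:
        text = pattern.sub(replacement, text)
    return text.strip().upper()


def classify_with_rules(description: str):
    """Return the annotated classification, or None if no rule covers it."""
    return ANNOTATIONS.get(standardise(description))
```

With these toy rules, `"CARD PAYMENT TO Acme Coffee 12/09"` and `"pos acme coffee *1234"` both standardise to `"ACME COFFEE"`, so a single annotation covers both forms. Any description that does not reduce to an annotated form is simply left unclassified.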
Although quite basic, the rules-based service succeeded in classifying around 75% of the purchase-related transactions flowing daily through our Data API (its coverage). Not too bad 😏.
However, as day follows night, we ran into many of the problems that rules-based systems suffer from:
- Diminishing returns: ↘️ Human annotation stopped being a practical way to improve classification coverage, because the distribution of transaction frequencies by merchant is long-tailed: each newly annotated merchant covers fewer transactions than the last. Coverage plateaued at around 75%.
- Maintenance: ⚙️ As providers change description formats, the complex parsing rules stop working, and it becomes increasingly difficult to identify which specific rule is at fault. Creating more rules adds significant maintenance overhead.
- Generalisation: 📖 The parsing and classification rules were UK-specific. Very little knowledge can be transferred to enable classification in other markets; the rules would have to be built from scratch.
Infusion of machine learning
This textbook problem provided the basis for developing our first Machine Learning-based service.
At a high level, we used supervised learning to train transaction classification models that map information about a transaction to a category/sub-category combination.
Compared to the (already complex) rules-based system, which relied only on the description, supervised learning let us easily take advantage of additional transaction properties, such as amounts, timestamps and others.
We “fed” these properties (features), together with the categories/sub-categories assigned through the human annotation process (labels), to the modelling algorithms. The algorithms learned patterns mapping the former to the latter, which eventually enabled the models to generalise; namely, to accurately predict categories for previously unseen transactions. We will cover our Machine Learning model development workflow in more depth in a future blog post. Promise 🤞!
After some exploratory work, we also determined an optimal, model-dependent “minimum confidence” threshold that a prediction must reach before a classification is assigned, because model predictions are not always trustworthy!
Finally, we rolled out the new Machine Learning service as a “fallback”, invoked only when the rules-based classifier fails to assign a classification.
As a result, we measured a significant uplift in classification coverage for almost all providers (banks), and of around 10% globally.
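The fallback arrangement described above can be sketched in a few lines. This is a simplified illustration, not the actual service: `classify`, `toy_rules`, `toy_model` and the threshold value of 0.8 are all hypothetical, and the "model" here is a hard-coded stand-in rather than a trained classifier.

```python
from typing import Callable, Optional, Tuple

Classification = Tuple[str, str]  # (category, sub-category)


def classify(
    transaction: dict,
    rules: Callable[[dict], Optional[Classification]],
    model: Callable[[dict], Tuple[Classification, float]],
    min_confidence: float,
) -> Optional[Classification]:
    """Try the rules first; fall back to the model only if they fail."""
    result = rules(transaction)
    if result is not None:
        return result
    prediction, confidence = model(transaction)
    # Only trust the model when it clears the minimum confidence threshold.
    if confidence >= min_confidence:
        return prediction
    return None  # leave the transaction unclassified


# Toy stand-ins (assumptions for illustration only).
def toy_rules(transaction: dict) -> Optional[Classification]:
    known = {"ACME COFFEE": ("Food & Dining", "Coffee shops")}
    return known.get(transaction["description"].upper())


def toy_model(transaction: dict) -> Tuple[Classification, float]:
    if "coffee" in transaction["description"].lower():
        return ("Food & Dining", "Coffee shops"), 0.93
    return ("Shopping", "General"), 0.41


MIN_CONFIDENCE = 0.8  # hypothetical value; the real threshold is model-dependent
```

In this sketch, a transaction the rules already know never reaches the model, and a low-confidence prediction is discarded rather than returned. That trades some coverage for fewer wrong classifications.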
We also measured the accuracy of our classification service: ~90% at the category level and ~75% at the sub-category level.
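For readers who want to compute similar metrics on their own data, here is a rough sketch of coverage and accuracy over a labelled sample. It makes two assumptions the post does not spell out: accuracy is computed only over transactions that received a classification, and a sub-category counts as correct only when the category matches too. The function names and the toy data are made up for illustration.

```python
from typing import List, Optional, Tuple

Classification = Tuple[str, str]  # (category, sub-category)


def coverage(predictions: List[Optional[Classification]]) -> float:
    """Share of transactions that received any classification."""
    if not predictions:
        return 0.0
    classified = sum(1 for p in predictions if p is not None)
    return classified / len(predictions)


def accuracy(
    predictions: List[Optional[Classification]],
    labels: List[Classification],
    level: str = "category",
) -> float:
    """Share of classified transactions whose prediction matches the label."""
    pairs = [(p, t) for p, t in zip(predictions, labels) if p is not None]
    if not pairs:
        return 0.0
    if level == "category":
        correct = sum(1 for p, t in pairs if p[0] == t[0])
    elif level == "sub_category":
        correct = sum(1 for p, t in pairs if p == t)
    else:
        raise ValueError(f"unknown level: {level}")
    return correct / len(pairs)


# Made-up sample: four transactions, one left unclassified (None).
predictions = [("A", "a1"), ("A", "a2"), None, ("B", "b1")]
labels = [("A", "a1"), ("A", "a1"), ("B", "b1"), ("B", "b1")]
```

On this toy sample, three of the four transactions are classified. All three have the right category, but only two have the right sub-category, which mirrors the usual pattern of sub-category accuracy trailing category accuracy.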