TrueLayer launched its first payment product in 2019. We pioneered open banking payments, and over the past two years we have seen them gain traction across a wide spectrum of use cases.
As we gained more experience, we also got a good grip on their limitations:
Open banking payments are one way.
You can initiate a transfer from an end user bank account to a merchant bank account, but you can’t go the other way.
If we look at ecommerce, for example, refunds are a challenge: the merchant has to build and maintain additional infrastructure to provide this functionality to their users.
To reduce the burden on our customers, we built PayDirect — a new payment product that combines open banking with the fastest payment rails in the UK and Europe.
You can read more about the different use cases that PayDirect supports in our launch article — in this blog post we will instead take a look behind the scenes and delve deeper into the technology that powers our new product.
Closing the loop
How does PayDirect work? When a merchant onboards to PayDirect, we provision an e-money wallet for each of the currencies we support. Each wallet has an IBAN and can receive funds over all the payment rails supported for that currency.
Let’s look, for example, at the UK. Here, a merchant can:
collect funds from their end users via open banking payments
issue refunds or withdraw from their wallet programmatically with an instant payout using Faster Payments
Having a direct integration with Faster Payments in the UK brings additional benefits. For an incoming open banking top-up, for example, we can confirm the account owner's name, a key aspect of fulfilling anti-money laundering requirements in various industries.
It might not look like much is changing — Faster Payments is just another payment method, isn’t it?
Well, it actually is a seismic shift for us: we are holding funds on behalf of our customers.
With open banking payments we never “touch the money”: we act as an intermediary between our customers and banks, instructing the latter to perform actions on behalf of end users.
Managing funds is a different game. From an engineering perspective, the bar for the new system has to be much higher.
Correctness, auditability and resiliency were our top concerns, and we still had to get to market in a reasonable time frame.
The rest of the post will delve deeper into how each of those requirements shaped the final system.
Access to the rail
Let’s start from the integration with the payment rail — what does it look like?
We chose to integrate via a clearing house partner (indirect agency).
They take care of the collateral and settlement requirements with the Bank of England and we deal with their REST API instead of the underlying Faster Payments messages.
Our partner’s system is built around two key concepts: safeguarding accounts and virtual accounts.
Safeguarding accounts hold our customers’ funds — they are completely segregated from TrueLayer’s own funds. You can think of a safeguarding account as a conventional current bank account: it has an IBAN, a balance and a transaction history.
Virtual accounts, instead, are multiplexed over an underlying safeguarding account.
Each virtual account has an IBAN and a transaction history, but our partner does not keep a balance: a transaction on a virtual account will always be authorised as long as the underlying safeguarding account holds enough funds.
Each wallet offered by TrueLayer maps to a virtual account, so it falls on our system to make sure that
payout_amount <= available_balance for all payouts initiated by our customers.
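To make the gap concrete, here is a minimal sketch of the two checks side by side. It is illustrative only: the type names, the wallet identifiers and the use of pence as the unit are assumptions, not taken from the PayDirect codebase or from the partner's API.

```rust
use std::collections::HashMap;

/// The pooled account held with the clearing partner (amounts in pence).
/// Hypothetical type, for illustration only.
pub struct SafeguardingAccount {
    pub pooled_balance: u64,
}

impl SafeguardingAccount {
    /// The partner-side rule described above: only the pooled funds are checked.
    pub fn partner_would_authorise(&self, payout_amount: u64) -> bool {
        payout_amount <= self.pooled_balance
    }
}

/// Per-wallet available balances, which have to be tracked on our side.
#[derive(Default)]
pub struct WalletBalances {
    available: HashMap<String, u64>,
}

impl WalletBalances {
    pub fn set_available(&mut self, wallet_id: &str, amount: u64) {
        self.available.insert(wallet_id.to_string(), amount);
    }

    /// The invariant: payout_amount <= available_balance for that wallet.
    /// Unknown wallets can never pay out.
    pub fn can_pay_out(&self, wallet_id: &str, payout_amount: u64) -> bool {
        self.available
            .get(wallet_id)
            .map_or(false, |available| payout_amount <= *available)
    }
}
```

With two wallets holding £50 and £10 on a pooled balance of £60, a £30 payout from the second wallet would pass the pooled check but must be rejected by the per-wallet check.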
Authorising a payment
Proper handling of wallet balances is key to preventing overdrafts on our wallets.
For every wallet we hold two balances: an available balance and a current balance.
The current balance takes into account all transactions in a final state (either settled or failed), while the available balance reflects pending transactions too.
Let’s look at the lifecycle of a payout to better understand what those balances stand for.
Suppose you have a wallet that has been sitting idle for a while: both your available balance and your current balance are £50.
You then decide to perform a £25 payout — what happens?
We first book the payment: the available balance is reduced to £25 before submitting it to the scheme. This ensures that you have enough funds and prevents another payment from being booked without taking into account money that is in transit. If we did not have a booked status, you could get another £50 payment submitted and end up with a balance of -£25 if they both succeed!
After the payment is booked, we instruct our partner to fulfill it: the payment then moves to submitted — no change in balances.
If the payment succeeds, its status becomes settled and your current balance is updated to £25.
If the payment fails, its status becomes failed and your available balance is restored to £50.
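The walkthrough above can be sketched in a few lines of Rust. This is a simplified illustration, not the production implementation: the `Wallet` type, its method names and the choice of pence as the unit are all assumptions made for the example.

```rust
/// Balances for a single wallet, in pence (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Wallet {
    pub current: u64,
    pub available: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum BookingError {
    InsufficientFunds { requested: u64, available: u64 },
}

impl Wallet {
    pub fn new(balance: u64) -> Self {
        Wallet { current: balance, available: balance }
    }

    /// Book a payout: reserve the funds before anything reaches the scheme.
    pub fn book(&mut self, amount: u64) -> Result<(), BookingError> {
        if amount > self.available {
            return Err(BookingError::InsufficientFunds {
                requested: amount,
                available: self.available,
            });
        }
        self.available -= amount;
        Ok(())
    }

    /// The payout settled: the reserved funds leave the current balance.
    /// Callers must only settle an amount that was previously booked.
    pub fn settle(&mut self, amount: u64) {
        self.current -= amount;
    }

    /// The payout failed: the reservation is released.
    /// Callers must only fail an amount that was previously booked.
    pub fn fail(&mut self, amount: u64) {
        self.available += amount;
    }
}
```

Starting from `Wallet::new(5_000)`, booking 2,500 leaves the current balance at 5,000 and the available balance at 2,500. A second booking of 5,000 is then rejected, which is exactly the overdraft the booked status is there to prevent.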
The audit trail is the source of truth
A lot of things can go wrong in a payment system, sometimes seriously wrong.
To keep track of everything that is happening to our wallets we built a ledger.
The ledger is an event-sourced system: for each wallet we have an immutable append-only event log that keeps track of all actions concerning it.
That event log is both our audit trail and our source of truth: all snapshots of resources (e.g. current balance for a wallet or the current status for a payment) are built from the events. We just keep some materialised projections for speed.
To ensure that our invariants are not breached (e.g. no overdraft allowed) we rely on optimistic concurrency: each event is tagged with a monotonically increasing version number and a few database-level constraints guarantee data integrity.
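As a rough illustration of the idea, the sketch below keeps an in-memory, append-only log for one wallet and rejects any write based on a stale version. The event names and the `WalletLog` type are hypothetical. In a real system the version check would typically be enforced by the database (for example, a uniqueness constraint on wallet and version), which this sketch only imitates.

```rust
/// Events recorded for one wallet; amounts in pence (hypothetical names).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    Credited { amount: u64 },
    PayoutBooked { amount: u64 },
    PayoutSettled { amount: u64 },
    PayoutFailed { amount: u64 },
}

#[derive(Debug, PartialEq, Eq)]
pub enum AppendError {
    /// Someone else appended since the caller last read the log.
    VersionConflict { expected: u64, actual: u64 },
    /// The event would take the available balance below zero.
    Overdraft,
}

/// Append-only event log for a single wallet.
#[derive(Debug, Default)]
pub struct WalletLog {
    events: Vec<Event>,
}

impl WalletLog {
    /// The version is the number of events appended so far.
    pub fn version(&self) -> u64 {
        self.events.len() as u64
    }

    /// Available balance, rebuilt by folding over the events.
    pub fn available_balance(&self) -> u64 {
        self.events.iter().fold(0, |balance, event| match event {
            Event::Credited { amount } | Event::PayoutFailed { amount } => balance + amount,
            Event::PayoutBooked { amount } => balance - amount,
            Event::PayoutSettled { .. } => balance,
        })
    }

    /// Append only if the caller saw the latest version and no invariant breaks.
    pub fn append(&mut self, expected_version: u64, event: Event) -> Result<u64, AppendError> {
        let actual = self.version();
        if expected_version != actual {
            return Err(AppendError::VersionConflict { expected: expected_version, actual });
        }
        if let Event::PayoutBooked { amount } = &event {
            if *amount > self.available_balance() {
                return Err(AppendError::Overdraft);
            }
        }
        self.events.push(event);
        Ok(self.version())
    }
}
```

A writer that loses the race gets a `VersionConflict`, re-reads the log and decides again against the up-to-date balance, so two concurrent payouts can never both be booked against the same funds.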
No matter how much care you put into a piece of software, there will be defects: software is made by humans, after all. There will be defects in our components and in our partners’ products.
We have optimised for blast radius reduction and time to detection. We also continuously run a set of different processes that act as feedback loops for our system and operators:
Smoke tests, moving funds back-and-forth between test accounts, every minute
Micro-batching reconciliation, continuously checking that our view of our wallets and transactions matches what our partners see, scraping both ledgers
Roll-ups, periodically replaying all the events on our logs to make sure projections are coherent and correct
As we go forward we will certainly add more and refine the ones we have.
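To give a flavour of the roll-up check, here is a small sketch that replays a wallet's events and compares the result with a stored projection. The event and struct names are assumptions made for the example, and signed pence amounts are used so that a corrupted log surfaces as a mismatch rather than an arithmetic underflow.

```rust
/// Events for one wallet; amounts in pence (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LedgerEvent {
    Credited(i64),
    PayoutBooked(i64),
    PayoutSettled(i64),
    PayoutFailed(i64),
}

/// The materialised projection we want to verify.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Balances {
    pub current: i64,
    pub available: i64,
}

/// Returned when the stored projection disagrees with the event log.
#[derive(Debug, PartialEq, Eq)]
pub struct Mismatch {
    pub replayed: Balances,
    pub stored: Balances,
}

/// Rebuild both balances from scratch by replaying every event in order.
pub fn replay(events: &[LedgerEvent]) -> Balances {
    let mut balances = Balances::default();
    for event in events {
        match *event {
            LedgerEvent::Credited(amount) => {
                balances.current += amount;
                balances.available += amount;
            }
            LedgerEvent::PayoutBooked(amount) => balances.available -= amount,
            LedgerEvent::PayoutSettled(amount) => balances.current -= amount,
            LedgerEvent::PayoutFailed(amount) => balances.available += amount,
        }
    }
    balances
}

/// Compare the replayed balances with the stored projection.
pub fn roll_up(events: &[LedgerEvent], stored: Balances) -> Result<(), Mismatch> {
    let replayed = replay(events);
    if replayed == stored {
        Ok(())
    } else {
        Err(Mismatch { replayed, stored })
    }
}
```

A `Mismatch` would be a signal to alert an operator rather than to fix the projection silently, since the event log is the source of truth.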
Built with Rust
We used one innovation token on the programming language: all the microservices underpinning PayDirect are built with Rust.
Its type system was the key selling point: being able to enforce many of our domain constraints at compile-time boosted our confidence in the correctness of the final solution.
Rust’s enums, in particular, proved invaluable when working on the state machine of the payment lifecycle.
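As an illustration of what that can look like, the sketch below models the payout statuses described earlier (booked, submitted, settled, failed) as an enum with an explicit transition function. The type names are hypothetical, and allowing a booked payout to fail before submission is an assumption of this example.

```rust
/// Statuses of a payout, as described in the lifecycle above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PayoutStatus {
    Booked,
    Submitted,
    Settled,
    Failed,
}

/// Things that can happen to a payout (hypothetical names).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transition {
    Submit,
    Settle,
    Fail,
}

#[derive(Debug, PartialEq, Eq)]
pub struct InvalidTransition {
    pub from: PayoutStatus,
    pub transition: Transition,
}

impl PayoutStatus {
    /// Apply a transition, rejecting anything the lifecycle does not allow.
    pub fn apply(self, transition: Transition) -> Result<PayoutStatus, InvalidTransition> {
        use PayoutStatus::*;
        use Transition::*;
        match (self, transition) {
            (Booked, Submit) => Ok(Submitted),
            (Submitted, Settle) => Ok(Settled),
            // Assumption: a payout may fail before or after submission.
            (Booked, Fail) | (Submitted, Fail) => Ok(Failed),
            // Everything else, including any move out of a final state.
            (from, transition) => Err(InvalidTransition { from, transition }),
        }
    }

    /// Settled and failed are final: no further transitions are allowed.
    pub fn is_final(self) -> bool {
        matches!(self, Self::Settled | Self::Failed)
    }
}
```

Because `match` must be exhaustive, every pairing of status and transition has to be handled somewhere, and removing the catch-all arm would force each of them to be spelled out one by one.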
Adopting a new programming language is a journey — not one to be taken lightly.
Although we played with Rust previously, it was never in the hot path — we used it for CI tooling, small Kubernetes controllers, quality-of-life CLIs.
To build PayDirect we had to integrate Rust with the rest of the TrueLayer platform (e.g. telemetry), onboard and train engineers to use the language, and figure out how to assemble our “application toolkit” using the crates available in the ecosystem. We had a chance to contribute a few patches upstream along the way.
Keep an eye on our blog — we’re looking forward to giving an in-depth report on our Rust journey in a future post.
We’re just getting started
There is so much more we could say on this: how we secured the whole stack, our choice of an event-driven architecture, how we manage on-call and alerts — all topics for future blog posts.
For now, suffice it to say that this is just the beginning: we are scaling the system to handle growing volumes, adding more payment rails and building new products on top.