😵💫 Why billing systems are a nightmare for engineers
First-hand insights from Qonto, Algolia, Pleo, Scaleway, & Upflow
‘On my first day, I was told: “Payment will come later, shouldn't be hard, right?”
I was worried. We were not selling and delivering goods, but SSDs and CPU cores, petabytes and milliseconds, space and time. Instantly, by an API call. Fungible, at the smallest unit. On all continents. That was the vision.
After a week I felt like I was the only one really concerned about the long road ahead. In ambitious enterprise projects, complexity compounds quickly: multi-tenancy, multi-users, multi-roles, multi-currency, multi-tax codes, multi-everything.
These systems were no fun, some were ancient, and often 'spaghetti-like'.
What should have been a 1 year R&D project ended up taking 7 years of my professional life, in which I grew the billing team from 0 to 12 people.
So yes, if you have to ask me, billing is hard. Harder than you think. It's time to solve that once and for all.’
This is typical of the conversations we have with engineers every day. In this case, the words are those of Kevin, who was VP Engineering at Scaleway, one of the European leaders in cloud infrastructure.
After my latest post about my ‘Pricing Hack’, some of you asked me why billing is so complex. My co-founder Raffi took on the challenge of explaining why it’s still an unsolved problem for engineers.
We also gathered insights from other friends who went through the same painful journey, including Algolia, Segment, and Pleo. Don’t miss them! Passing the mic to Raffi.
If you’re thinking about automating billing, it means your company is getting traction. That’s good news!
You might then wonder: should we build it in-house? It does not look complex, and the logic seems specific to your business.
Also, you might want to preserve your precious margins and therefore avoid existing billing solutions like Stripe Billing or Chargebee that take a cut of your revenue. Honestly, who likes this rent-seeker approach?
Our team at Lago still has some painful memories of the internal billing system at Qonto, which we had to build and maintain. Why was it that painful? In this article, I will provide a high-level view of the technical challenges we faced while implementing hybrid pricing (based on both 'subscription' and 'usage'), and what we learned the hard way on this journey.
TL;DR: Billing is just 100x harder than you will ever think
'Let’s bill yearly as well, should be pretty straightforward', claims the Revenue team. Great! Everyone is excited to start working on it. Everyone, except the tech team. When you start building your internal billing system, it's hard to think of all the complexity that will pop up down the road, unless you’ve experienced it before.
It’s common to start a business with simple pricing. You define one or two price plans, and limit this pricing to a defined number of features. However, as the company grows, the pricing gets more and more complex, just like your entire codebase.
At Qonto, our first users could only onboard on a €9 plan. We quickly decided to add plans, and 'pay-as-you-go' features (such as ATM withdrawals, foreign currency payments, one shot capital deposit, etc...) to grow revenue.
Also, as Qonto is a 'neobank', we wanted to charge our customers directly in their wallet, through a ledger connected to our internal billing system. The team grew from a duo of full-time engineers building a billing system (which is already a considerable investment) to what is now a dedicated cross-functional team called 'pricing'.
This is not specific to Qonto, of course. Pleo, a Fintech unicorn from Denmark, faced similar hurdles:
'I’ve learned to appreciate that billing systems are hard to build, hard to design, and hard to get working for you if you deviate from 'the standard' even by a tiny bit.'
admits Arnon Shimoni, who leads Product Billing Infrastructure at Pleo.
This is not even specific to Fintechs. The Algolia team ended up creating a whole pricing department, now led by Djay, a pricing and monetization veteran from Twilio, VMware, and ServiceNow. They pivoted to a 'pay-as-you-go' pricing model based on the number of monthly API searches.
'It looks easy on paper — however, it’s a challenge to bring automation and transparency to a customer, so they can easily understand. There is a lot of behind-the-scenes work that goes into this, and it takes a lot of engineering and investment to do it the right way.'
says their CEO, Bernadette Nixon, in VentureBeat, and we could not agree more.
#1 - Dates
When implementing a billing system, dealing with dates is often the number one source of complexity. One way or another, all your subscriptions and charges deal with a number of days. Whether you make your customers pay weekly, monthly or yearly, you need to roll things over a period of time called the billing period.
Here is a non-exhaustive list of difficulties for engineers:
How to deal with leap years?
Do your subscriptions start at the beginning of the month or at the creation date of the customer?
How many days/months of trial do you offer?
Who decided February only holds 28 days? 🤔
Wait, bullet 1 is also important for February... 🤯
How to calculate a usage-based charge (price per seconds, hours, days...)?
Do I reset the consumption at each period, or do I stack it month over month? Year over year?
Do I apply a pro-rata based on the number of days consumed by my customer?
Although every decision is reversible, billing cycle questions are often the largest source of customer support tickets, and iterating on them is a highly complex and sensitive engineering project.
For instance, Qonto migrated the billing cycle start date from the 'anniversary' date, to the 'beginning of the month' date, and the approach was described here. It was not a trivial change.
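To make the date questions above concrete, here is a minimal sketch in Python, using only the standard library. The function names (`add_months`, `billing_period`, `prorate`) are made up for this illustration and do not come from any real billing system. It assumes 'anniversary' billing with half-open periods and a pro-rata based on calendar days.

```python
import calendar
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def add_months(anchor: date, months: int) -> date:
    """Shift `anchor` by `months`, clamping the day to the target month's length."""
    index = anchor.year * 12 + (anchor.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    # monthrange knows about leap years: February 2024 has 29 days.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(anchor.day, last_day))


def billing_period(anchor: date, n: int) -> tuple[date, date]:
    """Return the half-open period [start, end) number `n`, counted from the anchor."""
    return add_months(anchor, n), add_months(anchor, n + 1)


def prorate(amount: Decimal, start: date, end: date, used_from: date) -> Decimal:
    """Charge only the days between `used_from` and the end of [start, end)."""
    total_days = (end - start).days
    used_days = (end - max(start, used_from)).days
    share = amount * used_days / total_days
    return share.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

With an anchor of January 31, 2024, `billing_period(anchor, 1)` runs from February 29 to March 31, because every boundary is computed from the original anchor rather than from the previous period. The same €9 plan started on February 15 costs €4.66 in 2024 and €4.50 in 2023, which is exactly the kind of detail that ends up in a support ticket.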
#2 - Upgrades & downgrades
Then, you need to enable your customers to upgrade or downgrade their subscriptions. Moving from a plan A to a plan B seems pretty easy to implement, but it's not. Let's zoom in on potential edge cases you could face.
Downgrades
The user downgrades in the middle of a period. Do we block features right now or at the end of the current billing period?
The user has paid the plan in advance (for the next billing period)
The user has paid the plan in arrears (for what they have actually consumed)
The user downgrades from a yearly plan to a monthly plan
The user downgrades from a plan paid in advance to a plan paid in arrears (and vice-versa)
The user has a discount applied when downgrading
Upgrades
The user upgrades in the middle of a period. We probably need to give her access to the new features right now. Do we apply a pro-rata? Do we make her pay the pro-rata right now? At the end of the billing period?
The user upgrades from a monthly plan to a yearly plan. Do we apply a pro-rata? Do we make her pay the pro-rata right now? At the end of the billing period?
The user upgrades from a plan paid in advance to a plan paid in arrears (and vice-versa)
We did not have a ‘free trial’ period at the time at Qonto, but Arnon from Pleo describes the additional scenarios this creates here.
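Each question in the lists above is a policy decision that ends up hard-coded somewhere. As an illustration, here is a sketch of one possible policy, not the one Qonto used: both plans are paid in advance, an upgrade applies immediately and bills the price difference for the remaining days, and a downgrade only takes effect at the end of the period that was already paid. `PlanChange` and `change_plan` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class PlanChange:
    effective_on: date  # when the customer gains (or loses) features
    due_now: Decimal    # extra amount to invoice immediately


def change_plan(old_price: Decimal, new_price: Decimal,
                period_start: date, period_end: date, today: date) -> PlanChange:
    """Apply the assumed policy for plans paid in advance over [period_start, period_end)."""
    if not period_start <= today < period_end:
        raise ValueError("today must fall inside the current billing period")
    if new_price <= old_price:
        # Downgrade: the period is already paid, so features stay until it ends.
        return PlanChange(effective_on=period_end, due_now=Decimal("0.00"))
    # Upgrade: switch right away and bill the difference for the days left.
    total_days = (period_end - period_start).days
    days_left = (period_end - today).days
    due = (new_price - old_price) * days_left / total_days
    return PlanChange(today, due.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))
```

Moving from a €9 plan to a €29 plan on April 16, halfway through an April period, gives `due_now == Decimal("10.00")`. Yearly plans, plans paid in arrears and discounts each need their own branch, which is where the complexity comes from.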
#3 - Usage-based computations
Subscription-based billing is the first step when implementing a billing system. Each customer needs to be assigned to a plan so that you can charge the right amount at the right moment.
But for a growing number of companies, as was the case at Qonto, other charges come alongside this subscription.
These charges are based on what customers actually consume. This is what we call 'usage-based billing'. Most companies end up with hybrid pricing: a subscription charged per month and 'add-ons' or 'pay-as-you-go' charges on top of it.
These consumption-based charges are tough to track at scale, because they often come with aggregation rules that must be computed over a high volume of events.
Some examples:
Segment.com tracks the number of Monthly Tracked Users
This means that they need to COUNT the DISTINCT users each month, deduplicating them to get the number of unique users, and reset this value at the end of the billing period.
Algolia tracks the number of api_search per month
This means they need to SUM the number of monthly searches for a client and reset it at the beginning of each billable period.
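The two aggregations can be sketched in a few lines of Python. The event shape (`customer`, `month`, `code`, `user_id`, `units`) is an assumption made for this example, not the actual schema used by Segment or Algolia, and a real system would run this in a database or a stream processor rather than in memory.

```python
from collections import defaultdict


def count_distinct(events, code, field):
    """COUNT DISTINCT of `field` per (customer, month) for events with this code."""
    seen = defaultdict(set)
    for event in events:
        if event["code"] == code:
            seen[(event["customer"], event["month"])].add(event[field])
    return {key: len(values) for key, values in seen.items()}


def sum_units(events, code, field):
    """SUM of `field` per (customer, month) for events with this code."""
    totals = defaultdict(int)
    for event in events:
        if event["code"] == code:
            totals[(event["customer"], event["month"])] += event[field]
    return dict(totals)


# Hypothetical events; user "u1" is tracked twice in January.
events = [
    {"customer": "acme", "month": "2023-01", "code": "track", "user_id": "u1"},
    {"customer": "acme", "month": "2023-01", "code": "track", "user_id": "u1"},
    {"customer": "acme", "month": "2023-01", "code": "track", "user_id": "u2"},
    {"customer": "acme", "month": "2023-02", "code": "track", "user_id": "u1"},
    {"customer": "acme", "month": "2023-01", "code": "api_search", "units": 120},
    {"customer": "acme", "month": "2023-01", "code": "api_search", "units": 80},
]

tracked_users = count_distinct(events, "track", "user_id")
searches = sum_units(events, "api_search", "units")
```

Grouping by month is what resets the value for each billing period: `tracked_users` holds 2 for January and 1 for February, even though the same user appears in both months.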
It becomes even more complex when you start calculating a charge based on a timeframe. For instance, Snowflake charges the compute usage of a data warehouse per second.
This means that they multiply a rate, which depends on the size of the warehouse, by the number of seconds of compute time, and sum the result over the billing period.
Maybe an example we can all relate to is an energy company that charges $10 per kilowatt-hour (kWh) of electricity, for instance. The example below gives an overview of what needs to be modeled and automated by the billing system.
Hour 1: 10 kW used for 0.5 hour = 5 kWh (10 x 0.5)
Hour 2: 20 kW used for 1 hour = 20 kWh (20 x 1)
Hour 3: 0 kW used for 1 hour = 0 kWh (0 x 1)
Hour 4: 30 kW used for 0.5 hour = 15 kWh (30 x 0.5)
TOTAL = 40 kWh used x $10 ⇒ $400
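The same computation, written as a short Python sketch. It assumes each reading is a (power in kW, duration in hours) pair, and it uses `Decimal` because binary floating point is a poor fit for money. The names `readings`, `energy_kwh` and `PRICE_PER_KWH` are only for illustration.

```python
from decimal import Decimal

PRICE_PER_KWH = Decimal("10")  # dollars per kWh, as in the example

# (power in kW, duration in hours) for hours 1 to 4
readings = [
    (Decimal("10"), Decimal("0.5")),
    (Decimal("20"), Decimal("1")),
    (Decimal("0"), Decimal("1")),
    (Decimal("30"), Decimal("0.5")),
]


def energy_kwh(readings):
    """Energy is power multiplied by duration, summed over all readings."""
    return sum((power * hours for power, hours in readings), Decimal("0"))


total_kwh = energy_kwh(readings)          # 5 + 20 + 0 + 15 = 40 kWh
total_price = total_kwh * PRICE_PER_KWH   # 40 kWh at $10 = $400
```

In a real system the readings arrive as events at irregular intervals, so the billing system also has to work out each duration from consecutive timestamps before it can multiply anything.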
#4 - Idempotency done right
Working with companies’ revenue can be tough.
Billing mismatches sometimes happen. Charging a user twice for the same product is obviously bad for customer experience, but failing to charge when it’s needed hurts revenue. That’s partly why Finance and BI teams spend so much time on revenue recognition.
A 'pay-as-you-go' company's billing system processes a high volume of events. When an event needs to be replayed, this must happen without billing the user a second time.
Engineers call it 'idempotency': the ability to apply the same operation multiple times without changing the result beyond the first application.
It’s a simple design principle; however, maintaining it at all times is hard.
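A minimal in-memory sketch of the principle, assuming every usage event carries a unique `transaction_id` chosen by the sender. The class and field names are hypothetical.

```python
class UsageLedger:
    """Records usage events at most once, keyed by their transaction id."""

    def __init__(self):
        self._processed = set()  # transaction ids already applied
        self.total_units = 0

    def record(self, transaction_id: str, units: int) -> bool:
        """Apply the event once; return False if it was already processed."""
        if transaction_id in self._processed:
            return False  # replayed event: do not bill it again
        self._processed.add(transaction_id)
        self.total_units += units
        return True


ledger = UsageLedger()
ledger.record("evt_1", 5)
ledger.record("evt_1", 5)  # replay of the same event, ignored
ledger.record("evt_2", 3)
# ledger.total_units is 8, not 13
```

The hard part is keeping this guarantee across several servers, retries and crashes. There, the check and the write have to be a single atomic step, for example a unique constraint on the transaction id in the database, rather than a set held in memory.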
#5 - The case for a CCO - Cash Collection Officer
Cash collection is the process of collecting the money customers owe you. And the bête noire of cash collection is 'dunning': when payments fail to arrive, the merchant needs to persist and make repeated payment requests to their customers without damaging the relationship. These reminders are called 'dunnings'.
At Qonto, we called these 'waiting funds'. A client’s status is 'waiting funds' when they successfully went through the sign-up, the KYC and KYB process, yet their account balance is still 0.
For a neobank, the impact is twofold: you can’t charge your service fees (a monthly subscription), and your customer does not generate interchange revenue. (A simplistic explanation of interchange revenue: when you make a €100 payment with Qonto, or any card provider, Qonto earns €0.5-€1 of interchange revenue through the merchant’s fees.)
Therefore, your two main revenue streams are 'null', but you did pay to acquire, onboard, and KYC the user, and to produce and send them a card. We often half-joked about the need to hire a 'chief waiting funds officer': the financial impact of this is just as high as the problem is underestimated.
Every company has 'dunning' challenges. For engineers, on top of all the billing architecture, this means they need to design and build:
A 'retry logic' to ask for a new payment intent
An invoice reconciliation (if several months of charges are being recovered)
An app logic to block the access in case of payment failure
An emailing workflow to urge a user to proceed to the payment
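As a rough sketch, the retry, reminder and blocking pieces of this list can be driven by one small decision function. The schedule of 1, 3 and 7 days after the first failure is an arbitrary assumption, and `Action` and `dunning_action` are hypothetical names, not part of any payment provider's API.

```python
from datetime import date, timedelta
from enum import Enum

RETRY_OFFSETS_DAYS = (1, 3, 7)  # assumed schedule, counted from the first failure


class Action(Enum):
    WAIT = "wait"
    RETRY_AND_REMIND = "retry the payment and email a reminder"
    BLOCK_ACCESS = "block access"


def dunning_action(failed_on: date, today: date, attempts_made: int) -> Action:
    """Decide what to do today for an invoice whose payment failed on `failed_on`."""
    if attempts_made >= len(RETRY_OFFSETS_DAYS):
        # Every scheduled retry has been used: stop asking and block access.
        return Action.BLOCK_ACCESS
    next_retry = failed_on + timedelta(days=RETRY_OFFSETS_DAYS[attempts_made])
    return Action.RETRY_AND_REMIND if today >= next_retry else Action.WAIT
```

A daily job would call `dunning_action` for every unpaid invoice and act on the result. Invoice reconciliation is left out here, and it is usually the harder part once several months of charges are being recovered at once.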
Some SaaS companies are even on a mission to fight dunnings and have built full-fledged businesses around cash collection features. Upflow, for instance, is used by successful B2B scale-ups, including Front and Lattice, the leading HRtech.
‘Sending quality and personalized reminders took us a lot of time and, as Lattice was growing fast, it was essential for us to scale our cash collection processes. We use Upflow to personalize how we ask our customers for money, repeatedly, while keeping a good relationship. We now collect 99% of our invoices, effortlessly’,
says Jason Lopez, controller at Lattice.
#6 - The labyrinth of taxes and VAT
Taxes are challenging and depend on multiple dimensions.
What are the dimensions?
Applying tax to your customers depends on what you are selling, your home country and your customers’ home country. In the simplest cases, your tax decision tree should look like this:
Now, imagine that you sell different types of goods/services to different categories of clients in 100+ countries. If the logic looks complex on paper, the engineering needed to automate it is at least ten times more complex.
What do engineers need to do?
Engineers will need to build an entire tax logic within the application. This logic is hierarchical, based on both the customers and the products sold by your company.
Taxes on the general settings level. Your company will have a general tax rate that is applied by default in the app.
Taxes per customer. This general setting tax rate can be overridden by a specific tax applied for a customer. This per-customer tax rate depends on all the dimensions explained in the image above.
Taxes per feature. In some cases, tax rates can also be applied by feature. This is mostly the case for the banking industry. For instance, at Qonto, banking fees are not subject to taxes and non-banking fees have a 20% VAT rate for all customers. Engineers created a whole tax logic based on the feature being used by a customer.
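The three levels can be sketched as a simple lookup with a fixed precedence. The order used here (feature first, then customer, then the general default) is an assumption that matches the Qonto example, where banking fees are untaxed for all customers. The customer id and its 19% rate are made up for the illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

DEFAULT_RATE = Decimal("0.20")                  # general setting
CUSTOMER_RATES = {"cust_de": Decimal("0.19")}   # hypothetical per-customer override
FEATURE_RATES = {"banking_fee": Decimal("0")}   # untaxed feature, as in the Qonto example


def tax_rate(customer_id: str, feature: str) -> Decimal:
    """Resolve the rate: feature override, then customer override, then default."""
    # Membership tests are used on purpose: a rate of 0 is a valid override,
    # and a truthiness check would wrongly fall through to the default.
    if feature in FEATURE_RATES:
        return FEATURE_RATES[feature]
    if customer_id in CUSTOMER_RATES:
        return CUSTOMER_RATES[customer_id]
    return DEFAULT_RATE


def tax_amount(net: Decimal, customer_id: str, feature: str) -> Decimal:
    """Tax due on a net amount, rounded to the cent."""
    amount = net * tax_rate(customer_id, feature)
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

On a €100 charge, this returns €20.00 by default, €19.00 for `cust_de` and €0.00 for a banking fee, whoever the customer is. In practice the tables are not constants: they are data that changes with each country's rules, which is where most of the maintenance work goes.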
With billing, the devil is in the details. That’s why I always cringe when I see engineering teams build a home-made system, because they think it’s not 'that complex'.
If you’ve already tackled the topics listed above and think it’s a good investment of your engineering time, go ahead and build it in-house. Make sure to budget for the maintenance work that is always needed.
Another option is to rely on existing billing platforms, built by specialized teams. If you’re considering choosing one or switching, and you think I can help, please reach out!
To solve this problem at scale, we adopted a radical sharing approach. We’ve started building an Open-Source Alternative to Stripe Billing (and Chargebee, and all the equivalents).
Our API and architecture are open, so that you can embed, fork, and customize them as much as your pricing and internal processes need. As you’ve read, we experienced these pain points first-hand.
Request access or sign up for a live demo here, if you’re interested!