
5 more mistakes people make when building SaaS software

Carl W. Irving

As a follow-up to my comments on the InfoQ “QCon London: Mistakes People Make Building SaaS Software” post, I thought I would add my list of five additional common mistakes that I’ve seen SaaS product teams make.

Mismatched revenue vs. cost scaling

This is mostly a business failure, but one with significant engineering input: when a SaaS offering has a variable, usage-based component that implies non-trivial costs, forgetting to include those costs in the product pricing (or forgetting to add limits to prevent abuse – see the next mistake) can end up being catastrophic. Before you say “yeah, that’s just a business problem”, I will point out that engineering needs to discover and report these costs. The business is not going to magically know what costs money and what doesn’t.

As an example, I once worked with a team whose costs ended up dominated by their logging provider: the provider charged by log volume, and log volume was proportional to the data the system ingested and processed, but the product’s price scaled with a weakly correlated variable that tracked light usage, not real-world usage. By the time they realized the mistake, they had to absorb painful losses while working frantically to reduce their logging provider usage. The expensive things aren’t necessarily what you expected up front.
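
To make the failure mode concrete, here is a toy margin model in Python (all names and numbers are hypothetical, not taken from that team): revenue scales with seat count while logging cost scales with ingested data, so a heavy-usage customer can quietly go margin-negative.

    # Toy model: revenue scales with seats, cost scales with ingested data.
    # All prices and rates are made-up illustration values.
    PRICE_PER_SEAT = 50.0          # monthly revenue per seat ($)
    LOG_COST_PER_GB = 2.50         # what the logging provider charges ($/GB)
    LOG_GB_PER_INGESTED_GB = 0.2   # log volume generated per GB processed

    def monthly_margin(seats: int, ingested_gb: float) -> float:
        revenue = seats * PRICE_PER_SEAT
        logging_cost = ingested_gb * LOG_GB_PER_INGESTED_GB * LOG_COST_PER_GB
        return revenue - logging_cost

    # A light user: 10 seats, 100 GB ingested -> healthy margin.
    print(monthly_margin(10, 100))      # 450.0
    # Same revenue, 100x the ingestion -> deeply negative.
    print(monthly_margin(10, 10_000))   # -4500.0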

Not having limits

When I say “limits”, I mean limits on every part of the system – rates, volumes, backups, anything that can grow and costs money. Virtually everybody remembers to add an API rate limiter, but limiting every other variable aspect is just as important. Skipping those limits feels like technical debt that can be kicked down the road, but my experience is that in any kind of data-intensive system, some other component will end up driving cost in an unexpected way. Setting those limits (even if they are generous) and testing them is a really important part of making sure you aren’t surprised.
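
As a sketch of what “limits on everything” can look like (a minimal illustration; the resource names and numbers are hypothetical), the same quota check can guard any metered quantity, not just API calls:

    # Minimal per-tenant quota guard; works for any metered quantity
    # (storage bytes, backup copies, log lines, export jobs, ...).
    # Limits are deliberately explicit, even when set generously high.
    class QuotaExceeded(Exception):
        pass

    class QuotaGuard:
        def __init__(self, limits: dict[str, int]):
            self.limits = limits
            self.usage: dict[str, int] = {}

        def charge(self, resource: str, amount: int) -> None:
            used = self.usage.get(resource, 0) + amount
            if used > self.limits[resource]:
                raise QuotaExceeded(f"{resource}: {used} > {self.limits[resource]}")
            self.usage[resource] = used

    guard = QuotaGuard({"api_calls": 100_000,
                        "storage_bytes": 50 * 2**30,
                        "backup_copies": 30})
    guard.charge("backup_copies", 1)            # fine
    # guard.charge("storage_bytes", 60 * 2**30) # would raise QuotaExceeded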

This isn’t in contradiction with the previous mistake (mismatched revenue vs. cost scaling): even if revenue scales perfectly with usage, there is a point where customers will get sticker shock and rebel. Having a conversation about limits, and making sure customers are on board with the implications of their usage, is preferable to irritated customers refusing to pay.

Misjudging your organization’s willingness to iterate

This one cuts both ways, depending on the kind of organization. One key advantage of the SaaS model is that you can be lean – deliver incrementally and adapt as you learn. The real danger arises when you rely on this leanness inside an inflexible organization. If you can’t continuously choose between adding features and iterating on past implementations based on feedback (especially operational feedback), you become trapped in an increasingly unproductive cycle of piling features onto an inadequate base. The inadequacy can take many forms (quality issues, decisions that turned out to be wrong, etc.), but it ultimately presents as a lack of productivity that only gets worse over time.

The lesser form of this mistake is over-polishing features before making them available, which slows the rate at which you gather feedback and increases the waste when you do incorporate it. Unpalatable as it may seem, insisting on a perfect definition of “done” before anybody can see new work also gets projects cancelled for lack of visible progress. The trick is finding the level of iteration and leanness that the organization can tolerate.

Undershooting operational quality

The key word here is “operational” – most software development organizations have a reasonably clear set of quality acceptance criteria for software on its own; the problem starts once it is actually deployed. The ability to detect and resolve issues while the software is in operation tends to be an afterthought: the traditional definition of “done” is along the lines of “we tested it on the bench with no errors”, not “we exposed it to the whims of the customer base, figured out what they broke, and fixed it”. In the more extreme cases, you may be blind to the problems your customers are experiencing. That is not a good place to be. You want to know about problems as soon as possible, preferably fixing them before customers even have time to open a support ticket.
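
One way to avoid flying blind is a synthetic probe that continuously exercises a customer-facing path and alerts on failure. A minimal sketch, where the endpoint URL and the alert hook are placeholder assumptions:

    # Synthetic probe: exercise a customer-facing endpoint on a schedule
    # and raise an alert before a customer files a ticket.
    # The URL and alert transport are placeholders for illustration.
    import time
    import urllib.request

    HEALTH_URL = "https://api.example.com/v1/health"

    def alert(message: str) -> None:
        # Stand-in for your pager/chat integration.
        print(f"ALERT: {message}")

    def probe(url: str, timeout: float = 5.0) -> None:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                if resp.status != 200:
                    alert(f"{url} returned HTTP {resp.status}")
        except Exception as exc:
            alert(f"{url} unreachable: {exc}")

    while True:  # run forever as a tiny monitor
        probe(HEALTH_URL)
        time.sleep(60)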

If you have too many operational issues, or they take too long to resolve (corollary: if you don’t feed what you learn operationally back into the system itself), all parties involved will be unhappy. Customers will be mad and look for alternatives; the development team will be demoralized and checked out. The #1 job of a SaaS software development team is keeping the system running smoothly (that is what customers pay for, after all); writing new features is a distant #2. Only once you’ve gotten good at operations and operational quality will you have the time to do fun things like develop new features.

Organizations that split development and operations, thinking that this lets them keep doing the fun stuff while others deal with the fires, only defer and magnify the consequences of this mistake.

Not having a proper tenant decommissioning plan

It’s nice to think about tenant onboarding, keeping tenants running, and so on, but nothing lasts forever. Customers will eventually leave. If you can’t properly decommission their resources and destroy their data as agreed, you’ll be in a world of hurt: customers (especially in a B2B context) have real expectations of data destruction on termination (potentially delayed by an agreed retention period) and get really mad if they find that their data hasn’t been destroyed as expected. Similarly, if you keep paying for resources used by departed customers, those stragglers will eat into your profits until the end of time.
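
A decommissioning plan doesn’t need to be elaborate: mark the tenant, wait out the agreed retention window, then destroy data and release resources, keeping an auditable record. A sketch, with placeholder steps standing in for whatever your stack actually requires:

    # Sketch of a tenant decommissioning flow: wait out the contractual
    # retention window, then destroy data, release resources, and record
    # proof. The step bodies below are placeholders.
    from datetime import datetime, timedelta, timezone

    RETENTION = timedelta(days=30)   # contractually agreed retention window

    def destroy_data(tenant_id: str) -> None:
        """Wipe databases, object storage, search indexes, and backups."""

    def release_resources(tenant_id: str) -> None:
        """Stop paying: compute, DNS entries, certificates, licenses."""

    def record_destruction(tenant_id: str, when: datetime) -> None:
        """Keep auditable proof that destruction happened on schedule."""

    def decommission(tenant_id: str, terminated_at: datetime) -> None:
        now = datetime.now(timezone.utc)
        if now < terminated_at + RETENTION:
            return  # still inside the retention window; retry later
        destroy_data(tenant_id)
        release_resources(tenant_id)
        record_destruction(tenant_id, now)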

It’s easy to think of this as a problem for later. The mistake is when “later” becomes “never.”