October 4, 2023
Season 1, Episode 25
Jon & Jay sit down with Viktor Farcic, a seasoned veteran in the tech industry, to discuss the ever-changing world of platform engineering.
I was literally listening to the cloud exit story, basically your podcast sequence, and kind of tracking against it, which is super interesting and obviously relevant to this kind of podcast, and unusual, to be fair.
Yes, most people are going in the other direction still.
Yeah, still going the other direction. But I have, like, millions of different questions. It would be great just to hear about why you chose that move overall, and I guess what the motivator was, whether it was cost or skills. What was it, really?
Yeah. So for us it was several factors. I mean, for me personally, I had growing unease with the centralization of the internet in the hands of a handful of hyperscaler cloud providers. But that was sort of a philosophical, longer-term, do you know what, this isn’t great, I wish someone would do something. And then the trigger for us was cost. The trigger for us was realizing, do you know what, this budget is just obscene. And I thought for a while that this budget, the spend we had on cloud, was going to have an initial peak, we were going to figure it all out, and then we were going to heavily optimize it. And what we were going to end up with was a situation where things were faster, easier and cheaper.
That was really the promise of cloud as I received it from all the marketing and the hype. And what turned out to be true was that it is faster. If you need 100 machines in the next five minutes, nothing beats cloud. There is no other way you can get access to that much computing power that quickly. What turned out to be interesting was that we very rarely need 100 new machines within five minutes. So the premium you pay for that benefit or that advantage is very high when you don’t need it. So if you have a business that is in need of such rapid expansion in computing power, I can totally see the appeal. But what we were really banking on were the other two legs of the stool: that it was going to be easier and that it was going to be cheaper.
Now we ran in the cloud and have run in the cloud with applications since what, 2015 or 2016 or something. We spent a lot of time getting into the cloud in the first place. We had some applications that weren’t set up for that, and we converted them and we got them into the cloud and kept thinking ease is just around the corner, this is going to be easier, we’re going to be able to do more with a smaller team, or the same team can manage more applications. And it never happened. The cloud was never easier than operating things on premise. There were certain things that were different, like even thinking about hardware. It’s not even that it doesn’t exist in the cloud, because of course it does.
If your instance has a hardware issue, that instance will get shut down and you still have to migrate and so on. And sometimes it’s not even on your own timetable, because it’s magic, but the magic is a very thin veneer on top of the same physical reality. Hard drives occasionally go bad, network connections have issues, all these other things, right? But it never got easier, it never got more productive. We were never able to do more with less. And then there was the final leg of the stool.
It was supposed to be cheaper: that the hyperscalers would have such vast economies of scale that they could buy machines much cheaper than any of the rest of us, and that they could offer those machines at a premium to their cost that would still be below what we could buy them for. Also not true, and increasingly not true. With much of the progress in computing power, both on the CPU side and on the storage side with SSDs and NVMe, if anything, it’s been diverging. Moore’s Law continues to march forward. Every 18 months things get faster and cheaper. Yet a lot of cloud costs are marching the other way. Especially if you look at the managed services, which is really the appeal of it, right?
Like, that’s really where you’re supposed to think, this is where it’s getting easier and by consequence cheaper. We bought into that on AWS. We used OpenSearch and we used Aurora and RDS. They’re just ludicrously expensive at scale. Absolutely obscenely expensive at scale, and whatever ergonomic advantages they have, and there are some, are just absolutely dwarfed by the fact that the overall picture was not dramatically different in operating complexity and the cost was just absurd.
How did you work it out, though? I’m guessing you have a range of different services, because you’ve obviously got Basecamp and different services overall. So did you make a blanket decision for everything, or did you start to categorize and be like, these are the services that we should move, these services should stay for a while? And was it driven by cost or simplicity of exit? How did you break it all down? What were the logistics? Because it’s not impossible, obviously, because you’ve done it, but it’s not a simplistic exit either, because of the logistical elements of it all and the architectural dependencies and really getting into the weeds of how it all hangs together, and that’s not simple.
Yeah, I mean, what’s interesting embedded in your question is a refutation of a central premise of the cloud. Oh, this is all commodities, this is all migratable, this is all interchangeable. No. For most people, no, exactly. As you say, you end up getting wedded to a particular platform in a particular way. Makes it actually quite difficult to move, not just out of the cloud, but to another cloud provider, which is one of the reasons why the cloud economics are so poor, because the cloud providers know this. They know you’re not going to move. They know you can’t cross shop AWS versus GCP very easily. This is one of the reasons I think GCP is having such a hard time making the business work.
And one of the reasons AWS is absolutely killing it with fantastical margins of 30% to 40%, in a business that is used to single-digit margins for hardware, is lock-in. And that lock-in is very real, and we had to tackle that in a distinct way. And the way we started out thinking about it was, okay, we’re running Kubernetes on AWS. Maybe if we can just run Kubernetes on our own stuff, we can reuse some aspects of all the configuration we built up. First of all, not really true. Sort of, kind of, some things could be reused, but it’s not an extract-and-redeploy scenario. Just because you have a bunch of Helm charts doesn’t mean that you can just deploy them against the new target and everything’s just going to work with no mystery.
So we had that and we tried it first, the Kubernetes route. And I think if we had stopped there, we would have gone like, yeah, you know what? No, this is too much of a hassle, it’s too hard. We’re not going to be able to get out. We’re not going to be able to even reap the advantages either, because Kubernetes is really a beast if you want to run it yourself. Kubernetes can be relatively straightforward to use if someone else is running your control planes and everything else for you, which is obviously what the hyperscalers are doing. But it is different if you’re going to run the whole thing yourself and you’re going to be responsible for all the upgrades and everything else.
What our team did when they looked at that was go like, I don’t know, man, we are going to need some outside help, we’re going to need some consultants, maybe we need some auxiliary products, we need something else. And we went down a path, eventually a blind alley, with an enterprise provider where, yes, you could have done it, but the cost savings would have been swallowed up by consultant fees.
Did you look at other Kubernetes products like OpenShift? Really full-blown, platform-based services, like the OpenShifts of the world and things like that, like the Red Hats, or did you...
We looked at SUSE products, right? Harvester and Rancher. Yeah, yes, exactly. So those two products, again, it’s not that they don’t have anything speaking for them, it’s just that they are enterprise consultantware that few people are interested in, or have the stomach for, running themselves. Which is of course the whole business model of SUSE: oh, yeah, I mean, it’s open source, sort of, kind of, but you have to buy like a few million dollars’ worth of license fees and consulting services to actually make the whole thing work. And I went like, you know what? That’s not what we’re doing this for. We’re getting out of the cloud to control our own destiny. We’re getting out of the cloud to be able to use commodity hardware with open source infrastructure. This isn’t that. So we looked at that.
I looked at that and went, you know what? There’s got to be a simpler way. First of all, I’ve been running web services, SaaS services, for damn well 20 years. I remember the before times. And the before times had absolute issues. All our servers were pets, and they needed careful little caressing to be set up and kept up to date, and all the other sorts of things we’ve left behind in the modern era of containerization. And that’s all good. That’s all progress I would not want to revert. But at the same time, even in the period of the pets, things weren’t that complicated. So we can’t go from, okay, pets that are relatively manageable to deal with, to not pets, which is a paradigm shift and paradigm progress, but more complicated.
So that’s how we ended up developing Kamal, which is our deployment tool that is essentially a set of workflows on top of Docker. Like, we’re not developing any fundamental technology here. What our stack looks like now is we have KVM at the bottom, which slices up the big machines into a handful of VMs as needed. And then right on top of that, we put Docker. And then we deploy containers into Docker, and we use Kamal to do zero-downtime deploys and rolling restarts and all of these other fancy things. And it turned out Kamal was just not that complicated, is not that complicated. Using Docker more or less directly, not even Docker Swarm, not even Docker Compose, just basic Docker, and instrumenting that and giving it a bit of workflow and ergonomics around deployment, was plenty. And therein lay the premise as I saw it.
Of all the good things about the cloud, containerization is number one, virtual machines are perhaps number two, and those things kind of go together. Here’s a way to take these amazing chunks of hardware. By the way, this is the other thing: I think people who have been in the cloud for too long have lost complete touch with the physicality and the awesomeness of a 1U machine with 192 vCPUs in it, essentially, right? You can just get some wonderful hardware these days. And it is shockingly cheap when you contrast it to the rental rates from the hyperscalers. So I think it was the combination of that: the realization that hardware is really fast and really cheap at scale, and we could talk about when it’s cheap enough and all that stuff.
And then the amount of complexity you need to put on top can be way less. Potentially orders of magnitude less than what you have in your head when you think container orchestration or auto scaling or any of these other buzzwords that people have fallen in love with when it comes to the cloud. And once we did that, once we went through that process, we started with our regular criticality ladder, as we like to say. We took the least critical application we have, which is an application called Ta-da List, which was a free app we made back in 2005 and have not updated since 2006, I think, and used it as sort of the frontier guinea pig, because it still has, I don’t know, 1,000 users a week or something. Small number of users, they’re not paying us anything, low criticality if we screwed up.
So we used that to develop Kamal. And then after we had done it, we got something deployed in, I don’t know, a few weeks, and went like, holy, this is different. This is different than getting into the cloud. When we got into the cloud, progress was measured in months and years. When we were getting out of the cloud, progress was measured in days and weeks. And it did not take long until we had worked our way up that criticality ladder to the big stuff: Basecamp, HEY, some of the other major services that we run. And we ended up finishing all major moves. Seven applications were running in the cloud: certain versions of Basecamp, and some other legacy, or as I like to call them, heritage applications that we no longer sell but run for existing customers.
And then the big dog, HEY, was an application that was born cloud first, never lived on prem, had always lived in the cloud. And we moved it all out. Those seven applications, including developing Kamal, including figuring it all out, in six months. That’s pretty impressive.
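The zero-downtime deploys and rolling restarts described above, run host by host and lowest risk first, can be pictured with a toy simulation. This is only a sketch of the general technique, not Kamal’s actual implementation: `Host`, `rolling_deploy`, and the `health_check` callback are hypothetical names, and swapping `running_version` stands in for pulling an image and starting a new container with plain Docker.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Host:
    """A hypothetical server, identified by address, running one app version."""
    address: str
    running_version: str


def rolling_deploy(
    hosts: List[Host],
    new_version: str,
    health_check: Callable[[Host], bool],
) -> List[str]:
    """Upgrade hosts one at a time and return the addresses that were upgraded.

    Stops at the first host that fails its health check and restores that
    host's previous version, so the remaining hosts keep serving the old one.
    """
    upgraded: List[str] = []
    for host in hosts:
        previous = host.running_version
        # Stand-in for: pull the new image and start a container from it.
        host.running_version = new_version
        if not health_check(host):
            # Stand-in for: stop the new container, keep the old one serving.
            host.running_version = previous
            break
        upgraded.append(host.address)
    return upgraded


if __name__ == "__main__":
    fleet = [Host("10.0.0.1", "v1"), Host("10.0.0.2", "v1")]
    done = rolling_deploy(fleet, "v2", health_check=lambda host: True)
    print("upgraded:", done)
```

Because only one host changes at a time and a failed check halts the rollout, the rest of the fleet keeps serving traffic throughout, which is the property "zero downtime" refers to.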
Was this like an ambition for you economically? Or was it like sponsored by the CFO who was looking kind of like financially at just like a constant drill down of money? It’s like bittersweet. I kind of respect somehow the cloud economic structure because it’s very smart just.
From a business for the cloud providers, it’s amazing.
It’s amazingly clever. So you kind of respect it. And each time there’s a new level of abstraction, like containers, you see how they can economize each time. And so there are just different price points, and serverless functions, so even at a more granular level you see how they make money. So it’s layers upon layers over the compute. That’s quite smart, really, on the offerings, as in how they package it up and how they distribute it. So as a business model, amazing. But on the flip side, not everybody has the skills. For yourselves, though, I guess it’s because you’ve come from such a highly technical background. And also, like you were saying before, you were around before cloud was, so you’ve had to do all of this before.
So though you weren’t yourself born in the cloud, obviously that arrived later. Applications might have been, but how did that all come about? Was it operationally driven to save costs, and then you’ve kind of clocked it and been like, well, I do know how to do this if you don’t have the cloud, because I’ve done it before, and therefore this isn’t as hard, so let’s maybe come up with a plan? Or was it just you getting really frustrated, where you’re like, actually, I can see how ridiculous this is economically versus the technology, and we could just do something better?
I think it was, in many ways, all of the above, but what really turbocharged the whole enterprise was to realize just how much money we were spending on the cloud. Because I kept thinking, if we just optimize a little more, if we do this, if we do that, if we get a little smarter, if we do whatever the thing is, eventually it’ll make some sort of economic sense. And it just never did. I mean, we literally spent years trying to chase that ever-receding vantage point of sense, economic sense.
Wasn’t it the cloud, the cloud dream? Yeah.
And the cloud was constantly just out of reach in terms of the economic sense. And then we did the full accounting for 2022: we spent $3.2 million on the cloud. That’s everything, right? Including almost a million dollars on S3, which is the one piece of the cloud we have not yet moved out of, and the one I probably have the most affinity towards, where the economics are to some extent the best, if you want to put it that way. We store about eight petabytes, and storing eight petabytes is still actually a surprisingly difficult problem. That’s not that easy to do. You need a lot to be able to do that in a good way. So let’s put that aside.
But then that leaves the 2.3 million we were spending on the rest of the cloud budget, which was things like OpenSearch and RDS and EC2 and all these things, right? I started going through that, then I broke it down, like, what are we spending a week? And I just went like, this is absurd. On our weekly spend, I just compared it to, all right, so let’s say we’re going to buy one of these super powerful machines from Dell, 192 vCPUs, like terabytes of RAM and whatever, and I go like, it’s going to be like 20 grand. Oh, man, the payback on going from rental to buying is so short. I didn’t think it was going to be this way. And part of this is, I think, that after we went with the same cloud drum that everyone else went with.
We kind of didn’t check in on pricing for a few years, because I kept just assuming that the cloud was going to get cheaper at the same rate that the underlying hardware was going to get cheaper. And that’s, of course, just not what happened. So by the time we actually looked into how much computing power you can buy for, say, half a million dollars, which is about what we ended up spending on all the new machines we needed to get out of the cloud, it was just shocking. You could get so much hardware for half a million dollars. I mean, we literally got two pallets of hardware delivered. We have two different data centers that we store our stuff in, and there was a pallet delivered to each of them. And we added about, what was it, 4,000 vCPUs or something like that.
And it cost half a million dollars, which sounds like a lot of money, and it sort of is. But it’s nothing compared to the fact that we were spending 2.3 million a year on rental. And if we stop paying that rental bill, we have nothing. There’s nothing left, there’s no residual. Right. So the payback analysis was just, I almost thought we had missed something when we started doing the calculations. We’re like, no, something’s got to be wrong here. It can’t be that. It’s too absurd, it’s too out there for us to end up with these numbers. So I ended up doing some back-of-the-envelope calculation, and we ended up with a very conservative estimate that getting us out of the cloud was going to save us 1.5 million a year. Wow. And I think it’s going to be closer to two.
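The back-of-the-envelope payback analysis can be reproduced in a few lines. This is a minimal sketch using only the round figures quoted in the conversation (about half a million dollars of hardware, and estimated savings of 1.5 to 2 million dollars a year); the function name `payback_months` is an illustrative choice, and the model deliberately ignores anything not stated here, such as financing or hardware resale value.

```python
def payback_months(upfront_cost: float, annual_savings: float) -> float:
    """Months until cumulative savings cover a one-time purchase."""
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    return 12 * upfront_cost / annual_savings


# Round figures as quoted in the conversation (US dollars).
HARDWARE_COST = 500_000          # two pallets of servers
CONSERVATIVE_SAVINGS = 1_500_000  # "very conservative estimate" per year
OPTIMISTIC_SAVINGS = 2_000_000    # "closer to two" per year

if __name__ == "__main__":
    for label, savings in [
        ("conservative", CONSERVATIVE_SAVINGS),
        ("optimistic", OPTIMISTIC_SAVINGS),
    ]:
        months = payback_months(HARDWARE_COST, savings)
        print(f"{label}: hardware pays for itself in {months:.1f} months")
```

With the figures quoted, the hardware pays for itself in four months on the conservative estimate and three on the more optimistic one.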
So there was just something there, like, you know what, that’s real. I own the company together with Jason. The money that’s left over at the end of the year, I get to keep. So when it’s that visceral, when the dollar actually matters because you have a direct connection to it, I think it’s much easier to visualize the advantages of getting out of the cloud. If you work at some big enterprise and you go like, well, the cloud has better ergonomics, I don’t have to talk to IT, whatever, great. It’s not your money. Who cares if it costs ten times as much? That’s some sort of abstract enterprise billing thing, and the CFO has already bought, hook, line and sinker, the advantages of shifting from capex to opex, right? So you’re like, well, everyone wins.
Well, except the shareholders, except the people who actually care about the economic performance at the end of the day. Now, there’s some nuance to this, and I actually find that one of the best arguments for the cloud is to route around deep organizational dysfunction. If you are a large enterprise and you are simply incapable of provisioning new servers to your development teams, yeah, I could see it. Maybe you have to pay like a stupid fee for that, an incompetence fee or obstruction fee, so that, yeah, things have to cost ten times as much so you don’t have to deal with the people you work with. All right, interesting theory. I can sort of see it in a Dilbert kind of way. That was not our world, right?
I deal with competent, kind, and caring people who also don’t want to just flush money down the toilet. So that made it easier in our case. But I think what also made it easier was that it was easy. The cloud providers have successfully convinced everyone that running your own machine is this exotic, difficult thing that you need wizards of a past era to be able to pull off, as though we were building pyramids. And they go like, but we don’t know how to do it anymore. What are you talking about? Like literally five minutes ago, or at least five years ago, everyone ran on their own hardware. Everyone. It’s not like this knowledge is lost to the ages. And it’s also not like the cloud providers have any secret sauce.
If you look at something like AWS, there are like tens of thousands of people who run around with duct tape fixing servers, right? It’s not like they came up with some fundamental underlying breakthrough technology that puts them on a different plane of understanding in terms of the interface with physical machines that is enabling this. What you’re buying is partly a rental service, which has some advantages in some cases, and you’re buying or leasing a bunch of people who work there. Okay? That makes sense if you don’t need full computers, and this is a key distinction: if what you’re running can’t fill up a computer, you should rent. Maybe you have some issues with data security and so forth. GDPR, there are some reasonable arguments there for why you should run your own stuff, even if you don’t need a full box.
But let’s put that aside and just take the case where our company needs, like, dozens of machines. I don’t know what we have, 100 maybe or something, right? If you can fill 100 machines, it does not make sense to rent them for like years and years. Rental is great when you don’t need the whole thing all the time, or you don’t know whether you’re going to need this at all in like five minutes. That’s the case for most startups. Most startups have a speculative business model. Until that finds product-market fit, no, don’t go out and buy a bunch of stuff.
And also, as we talked about in the beginning, rent if you have these massive orders-of-magnitude spikes, which ironically perhaps was the original justification for the cloud: Amazon going like, hey, we need ten times as many servers as normal on Black Friday and whatever leading up to Christmas, and in March, like nine out of ten of our servers are just lying dormant. We should rent them out to other people who need those at the times we don’t. Awesome. Great. We run a SaaS business. I can predict our growth very accurately most of the time. Not all the time, but most of the time, such that having a lead time of a few weeks for getting a new load of servers is just not a big deal. You overprovision slightly, and that overprovisioning, by the way, is incredibly cheap.
It’s not like the cloud where you really want to optimize your budget, make sure you don’t leave a large Xx instance just running, because that’s going to clock you deep. So you do these things, you over provision, and then you realize, holy smokes, this is just a completely different universe of cost, and I get to keep all this money that we don’t spend on the cloud. Yeah, this is good.
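The rent-versus-buy rule of thumb above (rent when you cannot keep a machine busy, buy when you can) comes down to a break-even utilization. The sketch below is illustrative only: the $20,000 purchase price is the ballpark mentioned earlier in the conversation, while the five-year lifetime, the $3,000 yearly hosting cost, and the $4 hourly rental rate are hypothetical placeholders, not quoted prices from any provider.

```python
HOURS_PER_YEAR = 24 * 365


def break_even_utilization(
    purchase_price: float,
    lifetime_years: float,
    annual_hosting_cost: float,
    rental_hourly_rate: float,
) -> float:
    """Fraction of the year a rented machine must run to cost as much as owning.

    Above this fraction, owning is cheaper; below it, renting is cheaper.
    """
    owned_cost_per_year = purchase_price / lifetime_years + annual_hosting_cost
    rented_cost_full_year = rental_hourly_rate * HOURS_PER_YEAR
    return owned_cost_per_year / rented_cost_full_year


if __name__ == "__main__":
    # All inputs except the purchase price are hypothetical placeholders.
    fraction = break_even_utilization(
        purchase_price=20_000,      # ballpark from the conversation
        lifetime_years=5,           # assumed useful life
        annual_hosting_cost=3_000,  # assumed rack space, power, bandwidth
        rental_hourly_rate=4.0,     # assumed rate for a comparable instance
    )
    print(f"owning wins once the machine is busy {fraction:.0%} of the time")
```

A steady SaaS workload that keeps its machines busy around the clock sits far above any such threshold, whereas a startup with speculative or spiky demand may sit below it, which matches the advice to rent first.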
How did you structure it, King? Because I still think yeah, I think there’s a contrast of that for other things, like regionalization, obviously, if you’re going to go into other countries or like talent.
I don’t even think that’s true. I don’t even think that’s true. So again, it’s the same analysis as with everything else. And this is the key part of the equation that I think a lot of cloud people have missed, which is they have this conception that running your own servers is like it was in 2005, that you need physical access to the data center. Some people even believe that we’re building our own data centers. What are you talking about? There’s like hundreds, if not thousands of highly professionally run data centers where you can rent a cage, and that cage comes with power and bandwidth, right? First of all, you’re not dealing with the physicality of servers anymore. What you’re renting is like the rack rather than the computer. A rack. And then you don’t even interface with the rack. I’ve never seen our computers.
I don’t touch our computers. None of the people we have touch our computers. You buy some services from a white-glove outfit that works in that specific data center. You have Dell, or whoever you’re buying from, drop-ship in palleted servers. The white gloves take the boxes, open the boxes, plug in the machines, connect them to power and Ethernet, and boom, you see an IP address come online. That’s the same as the cloud. Now, it has a longer lead time. Yes. So again, if you need 100 servers in five minutes, this is not for that. Most businesses are not in that situation. But internationally, if we want to do the same thing in Berlin, it’s the same deal. You contact Dell. Can you ship us some servers? Yes, we can. And is there a white-glove service available? Of course there is.
I don’t have to fly to Berlin or Paris or wherever to get an international footprint set up. So I think there are some misconceptions around the differences between cloud and not cloud, which is why the cloud conversation often gets foggy, because what’s often the difference is buy versus rent. It’s much more an economic decision than it is a technical one. In a lot of cases, the technologies that underpin running your own hardware are like 90% the same as running in the cloud.
I totally agree. I love the fact that you’ve taken the cloud on, as it’s a bit of a juggernaut thing. I come from that background, obviously. I used to work for a managed hosting company, and we used to do managed services and colo. It’s still a challenge. I think that even though those skills were there, obviously, like all the networking, VLAN trunking, switches, certificates, automation, there’s a huge amount to it still. If you can hire those teams, which obviously you can, and the cost makes a lot more sense as an investment, in terms of investing in the talent rather than in the economics of the cloud, and you’re kind of tilting the money inwards rather than externally, then it’s definitely doable.
However, the talent: maybe for a company like yourselves at 37signals, finding people that want to work for you is probably much easier than maybe for
a lot of the other enterprises.
Maybe. I’m skeptical on that. And part of the reason I’m skeptical is that AWS is not a simple switch. AWS is a very complicated beast. To set that up well and to set it up securely is not something that just happens. It’s not something where you can go like, oh, you just graduated from a boot camp? Yeah, here you go, access to our AWS, set it all up. No, you need very highly qualified people if you’re doing this at scale. Again, not if what you’re doing is a startup that needs two VMs, whatever. If you’re doing this for a major enterprise that’s actually making real money and has lots of customers, you’re facing many of the same skill issues when you’re dealing with the cloud. And I think there’s almost a false sense of security.
Oh, we’re on the cloud, we don’t have to think about security. What are you talking about? Like, 90% of the security challenges lie in your specific software stack. Your application, your configuration, all this stuff that doesn’t magically go away just because you’re on the cloud. I will grant you that on the network side, there are some things that are to some degree slightly simpler, but you start getting into the gnarly bits of VPCs and whatever setups on AWS, and again, it is not a simple answer to get that configured well and correctly.
I guess what they’ve done, though, is the smart bit about it. And I’m totally not advocating that the cloud is worth the money either, right? So I’m kind of a bit on your side, because I think the prices are astronomically ridiculous, and unfair, really, I think, to a lot of the consumers. But what they’ve done well is they’ve abstracted it to a common interface. So I think the simplification of it being API driven and having a common interface to it was quite smart, because then you don’t need the divergences of different API types or different levels of complexity of different vendors, where you’re like, oh, this is a different vendor’s API over here, and now we’re moving over to these APIs over here. And obviously you can make it all
work still. But also, sort of, right? Sort of. So the convergence that’s happening has an asterisk, as when you posed the question earlier. If you want to move your sophisticated AWS installation to GCP, there’s not a common interface where that’s just like this. Right. I’m not taking away from the fundamental point that there is more stuff happening in software and configuration that to some degree is easy. Like, we use F5 load balancers, for example. Right, yeah. Do you know what? That is a dialect. And there is some specialization around that. I just tend to think that people vastly overestimate the complexity of that. And the cloud providers are very eager to sort of breathe life into some of those FUD myths that this is too hard, you’re not smart enough to figure this out. Come over here, and it’s all so simple, it’s all so easy.
Well, some people put F5s in the cloud as virtual appliances. Just a whole other level of bizarreness at that point. Yeah, exactly. So it’s even more strange when it gets to that level. What’s next on the list then, I guess, for you? Because you said you created Kamal, which was kind of a declarative way of provisioning around Docker, I guess, the fundamentals of Docker principles. But how does it work for the developers? So if you’re a developer, what are your responsibilities? Is everything abstracted away, so your responsibilities are very application centric? Do you have, like, a data team? Does the developer take on the data bit as well? How does it work?
Yeah, it can work across the whole ladder. So Kamal makes it really easy to take a machine that’s completely pure, a completely virgin setup that has no specialization. Right. We want to stay in the cattle world. We don’t want to go back to the world of pets. So we want machines that are interchangeable and can just disappear. I think that’s one of those key innovations that really got pushed with the cloud. That’s the whole containerization thing and so on. Kamal sees that, embraces that, and then goes, do you know what? If your app can get wrapped up as a Docker image, if you have a Dockerfile, you can deploy it with Kamal in a very easy way. You can do these rollouts.
You can get essentially the ergonomics of a Heroku or Render or Fly or any of these other services that all build on top of the cloud to make the cloud more ergonomic. Kamal is doing that for any type of server. So this is where you get the level of true portability you might be after. In my video on kamal-deploy.org, I go through deploying a basic application on two different clouds, all within 20 minutes. First we deploy it on DigitalOcean and then we deploy it on Hetzner. Well, the other way around, but it all comes down to Kamal just needing an IP address for a virtual machine or full machine that runs Linux. It’ll even install Docker for you if it’s not on the machine. Now, when it comes to data management, Kamal can treat your data management like any other type of VM.
It has this concept of accessories. You can set up a database to run as an accessory, but of course you still have to think about how persistence works. Do you use volumes? What do you use? So there’s some understanding that has to be present, but it’s far simpler. Now, in our case, we’re running eight racks across two data centers. Like, I don’t know, we have 20 live production apps. We have a whole team for that, right? So we have an ops team still. Now, the interesting thing was people thought, oh, you’re getting out of the cloud, then how much bigger is your team going to be? And our answer was like, it’s not going to get bigger at all.
That was half the reason why we’re getting out of the cloud, because the cloud offered no material advancement in productivity around how to operate these services. And this is exactly what we found. We’ve kept the same team size. Now they just deal more with the hardware we own than the hardware we rent. But Kamal is trying to just massively simplify the act of running your application either on the cloud, where most people still will start, regardless of what I say. Well, not even regardless of what I say. I say start in the cloud. If you are a new startup, don’t go out and spend like the first half million dollars of your seed round on a bunch of servers. Rent it in the cloud, for God’s sake, please.
But look into something like Kamal so that you have an exit strategy. As we’ve talked about, how did you get out of the cloud? Because most people would have horror stories about how to get out of it, because it’s actually quite difficult. Well, it doesn’t have to be. If you start with something like Kamal, actually moving to either bare metal or another cloud provider, if you find a better deal, is vastly, orders of magnitude simpler to do because of the setup. So I think this is just like broadening horizons a bit, trying to teach people that they can run their own stacks, that the stacks can be much simpler than they think they could be. I mean, if you’re an individual developer, for example, I think you have to be kind of crazy if you’re going to run Kubernetes by yourself.
That’s not within reach. Maybe there are some savants out there who are both total experts in Kubernetes on top of being excellent full stack developers. Not a lot of those around, right? There are more people around who can use the ergonomics of something like Heroku, which means that they can use something like Kamal and can start there and go like, all right, I’m really good at application development. I can know just enough to run things if I use a tool like Kamal, if I use some of these abstractions, and then I still have the freedom to be able to go wherever I want. That, I think, is the path I would recommend most people go down.
And the data services, though? Things like message queues, Kafka, or backend databases. I mean, databases, to be fair, are not too hard to run yourself, really, but there are some more complex things.
Avoid them like the freaking plague. There is no chance whatsoever you need them when you’re doing a startup. I shouldn’t say no chance, but this is colloquially speaking. So 99.99% of all startups, all they need is a MySQL database and perhaps a Redis instance, and that’ll take them to a billion dollars in revenue. And then once you get to a billion dollars and you reach Facebook scale or whatever, great, cool. Look into some of these sophisticated solutions. I mean, Kafka came out of LinkedIn, I think, once they reached the scale of absolute billions. Many of these sort of sophisticated data stores and setups, they’re all extractions from what you have to do when you are internet scale, when you’re web scale. There’s like ten companies in the world that are web scale.
Now, granted, some of these solutions still kind of sort of make sense for some companies below them, but you really got to be in the stratosphere already. You are long since past the zero to one stage. Like, startups just don’t need this level of complication. Now, that doesn’t mean startups aren’t trying to use them anyway. And I think this is actually a leading killer of startups. And it will be in this new era where money isn’t free, where money actually can earn interest risk free, and you have to beat that to provide a return that’s worth investing in, right? When you’re dealing with that, productivity matters again, and the productivity lost to you fumbling around with Kafka setups in the early days of a startup that doesn’t have a thousand engineers is just tragic.
You shouldn’t be using these tools, you should be using tools that you can actually understand and run yourself.
Yeah, that’s difficult though. I mean, it’s not difficult, it’s actually simple, but the market is such a noisy one and people learn from their peers and so they’ll hear things that people are doing and then they’ll think that’s just what they should do. So it’s kind of a challenge for people to reflect on the technology stack early and be like, just go for the most simplistic operationally. But most people don’t think like that. They go for the most interesting, potentially.
From an engineering perspective, CV-driven development is a major scourge on our industry. And not just CV-driven, although that’s part of it, but also just novelty-driven. Now, again, I am thankful actually that there are people who do that. Let them go off and waste all the VC money, let them go off and waste all the productivity. Then there’s room for some underdogs, some scrappy companies to actually be competitive because they’ll just not pick those kinds of tools. This is one of the reasons why 20 years in, I’m still so eager to work on Ruby on Rails, because Ruby on Rails is a stack built around these principles that single developers can actually build complete systems and understand them and launch them in a reasonable amount of time.
Even as a side project, they don’t need to raise $10 million in funding and hire a bunch of Silicon Valley refugees from the FAANG companies to be able to operate, right? You can actually do so much of what makes the world go round in terms of software eating the world with a dramatically simpler stack. And if anything, this is where I think the interest is and where it continues to go. SQLite is one of those technologies that’s been around for a very long time, that just keeps getting more and more interesting. The faster machines become, the more you can run on a single box, the longer you can push out the complications of even running a database server. Now, again, I like MySQL.
I’m not saying SQLite is right for everything, but that’s where, to me, the interesting part is. I find it far less interesting to ponder what LinkedIn needs to service a billion users. Yeah, figure it out when you’re worth a billion dollars. Those are just not relevant problems for you when you go zero to one. But this is where the attention is and this is where the limelight is. There’s far less attention or limelight on, hey, what’s the simplest thing that could possibly work? Can we make this happen with like one or two developers working part time? Yeah, you can. You’re choosing not to.