The Site Reliability Engineer (SRE) title has been around for a long time, but has resurged in popularity thanks to Google. Google defines SREs as software engineers who write code to solve systems/infra problems.
This conjures up the expectation that SREs are full-blown software engineers who also know networking and kernels. This is bollocks. Even for Google.
In fact, there are two SRE tracks at Google: software and systems. They work on the same teams on the same projects. The only difference is that software-track SREs don't have to re-interview if they want to become full SEs.
So yes, there are SREs at Google writing databases, but there are also SREs tying together provisioning workflows with Python scripts.
But this definition has caught on and suddenly everyone wants an SRE org. And they go about it in terrible ways.
Usually companies take their Ops org and rename it to SRE and change nothing else. Except their expectations.
Oh, you're SREs now! So you own reliability. Except you're given no authority to actually improve reliability. Also, you have no opportunities to write code for your job but we're going to expect candidates to write code to get hired.
And they ruin a perfectly good thing. Because most companies don't need SREs.
A lot of companies need Ops orgs and have good operations engineers or sysadmins fulfilling their duties.
Then you have companies taking the parallel track: DevOps Engineers. What a harmful title for this industry.
DevOps is something you *do*, not something you *are*. DevOps Engineer is a title created by someone who didn't know sysadmins can code.
But some places *do* need SREs. What kind of places are those?
I take the Site Reliability part pretty literally. You have a site and you want it to be reliable. But, to me, this site is a large SaaS platform.
You don't need SREs if your site serves information but does not take in and store large quantities of it. You need ops and sysadmins for that.
SREs in the wild are best at understanding, scaling, and stabilizing large data intake platforms.
Sometimes this means they write code. Sometimes this means they troubleshoot operating systems. Sometimes they handle networking. Sometimes they do science.
But most of all, SREs *communicate*. They have to communicate with the dev teams they support, and they have to think about the final user experience and build from the back end with it in mind.
SREs are highly sought after and there's a reason for that. A good SRE understands code, systems, *and* people.
Ironically, SREs have become the real "DevOps Engineers." We facilitate open engineering cultures by being able to speak everyone's language.
I'm not saying operations engineers or sysadmins are bad at communication. But, at many shops, it's not a required skillset.
So, stop taking your ops team that does a very good job of maintaining the VM cluster running your company site and making them SREs. You're doing everyone a disservice.
SRE is not the next step in Ops. SRE is a breed of engineering that incorporates ops at certain companies at certain scales.
(PS Google still has ops teams. If you replace your Ops org with an SRE one, you're not even doing it The Google Way)


International speaker. Tea drinker. Business goth. Opinions (and selfies) my own. She/her