Site Reliability Engineer

OmniTI is looking for a Site Reliability Engineer to join our team!

The OmniTI Ops team is a flexible and progressive group. We work closely with developers, DBAs, and client teams to help them manage availability and performance in the midst of constant change. We are not risk averse; instead, we strive to understand why things fail and the true impact of those failures so that we can empower others. Collaboration is a cornerstone, and we understand that being friendly and outgoing is key to making that work.

About The Job

The SRE role is highly technical and requires a thorough understanding of all components of a modern web application stack, including front-end, networking, and systems-level knowledge. In this role you will work with clients to design, build, and operate reliable and scalable services in the cloud, on our custom hosting platform, or in their data center. You are up to date on current cloud technologies and are as comfortable at the whiteboard as you are on the command line. You will also help support our internal infrastructure and teams, and provide systems consulting, open source product development, and data center infrastructure support for our customers.

No one knows it all, but these are the kinds of things we're looking for (and the types of technologies you'll get to play with):

  • Experience with cloud and virtualization technologies: AWS, VirtualBox, KVM, zones/containers, Vagrant, Docker
  • Excellent troubleshooting skills with the ability to dive deep into all aspects of the stack to identify and fix problems
  • Strong background in web server technologies such as Apache, HAProxy, nginx
  • Familiarity with caching technologies such as Apache Traffic Server or Varnish, and a good working knowledge of the issues involved in implementing web caching
  • Strong knowledge of IP networking protocols
  • Programming/scripting experience in Ruby, Python, bash, Perl and/or JavaScript
  • Experience with configuration management tools such as Chef, Puppet, or Ansible
  • Familiarity with version control systems such as Git or Subversion, from both an end-user and an administrator perspective
  • Exposure to dynamic tracing such as DTrace or Brendan Gregg's blog
  • In-depth understanding of Unix-oriented operating systems including illumos, Linux, Solaris 10+, or *BSD

You must be willing to share in an on-call rotation and work to eliminate sources of operational disruption. You won't just be working on our infrastructure; you'll also be expected to help our clients with broken, underperforming infrastructure, turning it into something that "just works". You'll get to work on hard problems and be proud of the solutions you build.

If you contribute to an open source project, have a blog, or are involved in technology in some other way, we would love to hear about it when you write to us!

At OmniTI we believe in diversity as a core asset. From the tools we use to the technologies we choose to the people we work with, diversity in approach has always led us to better successes. We take pride in the diversity of our staff and seek diversity in our applicants.

Staff Thoughts

This is a great place to be exposed to a wide variety of technologies and to be mentored by some of the brightest minds in the business. Knowledge is shared openly, and the amount is limited only by your ability to absorb it.

~ Eric Sproul, Systems Administrator.

Where else can you work with people from whose books you learned to program?

~ Leon Fayer, Vice President.