Site Reliability Engineer – Cloud Operations

Website encapsecurity AllClearID

Work with a strong, global engineering team to maintain and enhance world-class cloud-based applications with cutting-edge technologies on a fresh project with a great mission.

We are looking for a great site reliability engineer to maintain and continually improve our cloud-based applications.

About You:

  • Deep understanding of cloud platforms like AWS and GCP and how to leverage them for compute, storage, and managed services including, but not limited to, databases, managed Kubernetes, and content delivery networks.
  • Experienced with modern DevOps engineering practices and comfortable with diverse technical problem sets across the entire technology stack, including the virtualized hardware.
  • Possess a deep understanding of the Linux Operating System and are at home on the command line / terminal at your workstation.
  • Versed in Infrastructure as Code practices using technologies like Terraform, CloudFormation, etc.
  • Familiar with tools like Ansible, Puppet, and Chef, and with leveraging those tools for configuration automation.
  • Proficient in scripting and developing automation in Python and Bash, or similar programming languages.
  • Used to keeping everything you do in source control (Git) and automating (scripting) any task you have to do more than once.
  • Understand modern approaches to software security – and know what needs to be done to secure software systems and cloud-based infrastructure.
  • Equipped with a proactive security mindset and a solid understanding of information security and privacy principles.
  • Experienced in protecting modern, cloud-hosted operating environments using defense-in-depth strategies.
  • Comfortable operating in environments subject to regulatory, compliance, and risk-based security requirements.
  • Able to effectively troubleshoot issues across the entire technology stack (UI -> API -> Application -> Database), including the operating system and the underlying (virtual) hardware.
  • Enthusiastic about cutting-edge technologies and the fresh challenges that come with them.

And ideally you are:

  • Experienced using Kubernetes and related technologies (such as Docker) for application orchestration.
  • Excited about monitoring technologies (such as Prometheus and the TICK stack), the metrics they provide, and using the data to extract information about the performance characteristics and error modes of a cloud-based software stack.
  • Proficient as a developer, experienced writing code and solving problems in at least one mainstream programming language (such as Python, Java, Go, C#, etc.).
  • Experienced in developing and maintaining globally deployed applications, in multiple languages, with many, many users.
  • Very familiar with agile frameworks, such as Scrum and Kanban, and how to operate within these frameworks to continually deliver value.
  • Experienced with monorepo concepts and tools like Bazel or Pants.
  • Experienced developing and maintaining feature-rich applications using modern software frameworks such as Spring Boot, Flask, .NET, etc.
  • Take pride in the quality of your code, the work it takes to make great software, and the value delivered to the end-user.
  • Understand computer networking, and how it applies in cloud environments.
  • The type of person that gets excited when you merge a pull request you authored.
  • Hold a Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or another scientific or technical discipline.

So that you can:

  • Operate, monitor, and maintain high availability of software services for multiple products running in a multi-region cloud environment.
  • Work with the team to establish service level objectives and monitor to ensure the objectives are met.
  • Continually improve cloud operations automation and tooling to monitor and maintain enterprise cloud-based applications.
  • Troubleshoot infrastructure and application issues, and work with the engineering team to resolve issues.
  • Execute run-books for known cloud-operations tasks, and create new run-books for new situations or issues you encounter. Automate everything.
  • Collaborate with a great team to maintain, monitor, and improve amazing cloud-based applications that solve real-world problems for end users.
  • Facilitate blame-free root cause analysis meetings in the event of a production-systems incident so the team can learn from mistakes and improve our systems and run-books.
  • Participate in stress, security, and performance testing.
  • Participate in continually improving our security posture in line with best practices.

To apply for this job email your details to