source · wttj·req · jb_87e2926596·listed 2d ago
Senior Systems Engineer
Graphcore·Bristol, United Kingdom·Hybrid·Full-time
Sourced listing · wttj
No salary disclosed
Summary
Join Graphcore, a leading AI technology company, as a Senior Systems Engineer. In this hands-on technical role, you will work closely with various teams to develop and deploy cloud services on cutting-edge AI systems. You will be responsible for cloud integration, validation, performance benchmarking, optimization, and development of high-performance AI solutions. The position offers flexible working arrangements, private medical insurance, pension, income protection, and opportunities for learning and development.
Role
posted by company
- Bachelor's degree or equivalent practical experience in a relevant subject
- Strong, proven Linux system administration experience (Ubuntu, RHEL and variants)
- A solid hands-on understanding of the technologies underpinning cloud services (APIs, virtualisation of CPUs, IO, systems), virtual networks, block storage, resource management and monitoring
- An ability to work independently on critical infrastructure with minimal oversight, and with a focus on end-user availability
- Experience with IaC automation tools (Terraform/OpenTofu, Ansible, Packer)
- Good communication and presentation skills, and experience dealing with end-users of IT services
- Experience with managing production Kubernetes clusters and workloads with a continuous delivery tool such as ArgoCD
- Solid software engineering or IT experience with a proven track record of delivering technical output as an individual contributor
- Experience working within Agile and Scrum frameworks, including understanding of priorities, risks, issues, impacts and constraints
- Experience with a version control system (preferably Git) and using it to manage system configuration or automation
- Strong proven Linux scripting ability (bash, python, awk, sed)
- Experience with Continuous Integration or testing pipelines using GitLab, GitHub or similar
- Experience with OpenStack cloud platforms
- Experience with solutions for monitoring and observability, e.g. Grafana, Prometheus, OpenSearch/Elasticsearch, Loki
- Experience with High Performance Computing (HPC) environments using Slurm or similar batch workload solutions
- Programming experience with Python 3, including classes and inheritance
Key responsibilities
- Develop and deploy cloud infrastructure and services, working closely with cross-functional teams.
- Manage and operate Kubernetes-managed end-user services on private clouds, turning requirements into deployed services.
- Configure, test, and maintain AI hardware and systems using Continuous Deployment and Infrastructure-as-Code.