We are seeking an experienced High-Performance Computing Engineer who can plan, implement, and maintain advanced cyberinfrastructure solutions.
Requirements
- Design, deploy, configure, and administer medium-scale HPC clusters and associated storage systems.
- Monitor system health, performance metrics, and resource utilization to ensure optimal operation.
- Implement robust security protocols and perform regular maintenance including upgrades and patching.
- Troubleshoot complex hardware and software issues in a multi-user research environment.
- Manage job scheduling and workload optimization using tools such as Slurm.
- Administer parallel and distributed file systems (such as Ceph and IBM Spectrum Scale/GPFS) and related storage solutions.
- Design and implement innovative HPC solutions to address evolving research requirements.
- Create and maintain automation scripts and tools to streamline system administration.
- Optimize scientific applications and computational workflows for performance.
- Implement container technologies (Docker, Singularity) for reproducible research.
- Support GPU computing and accelerator technologies for specialized workloads.
- Define and track performance metrics to ensure efficient use of resources, both now and in the future.
- Partner closely with researchers to understand computational needs and translate them into technical solutions.
- Collaborate with network, security, and data center teams to ensure integrated operations.
- Build and maintain relationships with external vendors and technology partners.
- Participate in the HPC community to stay current with emerging technologies and best practices.
- Serve as a technical advisor on infrastructure planning and technology roadmaps.
- Develop comprehensive documentation for systems, policies, and procedures.
- Create user guides and training materials for researchers using HPC resources.
- Mentor junior staff and promote knowledge sharing across teams.
- Conduct workshops and training sessions on effective use of HPC resources.
Benefits
- Generous Paid Time Off
- 401(k) Matching
- Retirement Plan
- Health Insurance
- Dental Insurance
- Vision Insurance
- Life Insurance
- Disability Insurance
- Flexible Spending Account (FSA)
- Employee Assistance Program (EAP)