Solutions Architect
Job Description and Responsibilities
We are looking for a senior HPC solution architect to lead the architecture of TI's NPDE (New Product Development Execution) compute environment. This platform supports large-scale semiconductor design workloads, including EDA simulations, regression runs, and growing AI/ML use cases. You will design and evolve the HPC environment to support increasing demand while balancing performance, scalability, and cost. In this role, you will work closely with Storage, Network, Datacenter, EDA teams, and external vendors to translate business needs into practical, high-performing infrastructure solutions.
Key responsibilities will be as follows:
- Design and implement robust, scalable infrastructure solutions combining diverse processors (CPUs, GPUs)
- Integrate cutting-edge technologies to enhance computational power and capabilities
- Evaluate and select best-in-class hardware and software solutions, optimizing our infrastructure for peak performance, scalability, efficiency, reliability and cost
- Partner with global teams to establish and enforce architectural standards and best practices across HPC environments
- Profile the cluster as a whole and each individual node to identify bottlenecks, optimize workflows, and reduce cost of ownership
- Develop and deploy solutions to continuously optimize system performance and resource utilization across the HPC ecosystem
- Act as a trusted technical advisor to senior leadership, offering strategic insights on emerging HPC and hardware trends
- Evaluate success based on the HPC environment's utilization and end-user satisfaction
Expertise/Required skills:
- Bachelor's, Master's, or PhD degree in Computer Science, Electrical Engineering, or a related field
- 10+ years of overall experience, including at least 5 years in HPC architecture, system design, or a similar role within a large-scale compute environment (~100k cores)
- Deep expertise in the following:
- HPC system design and optimization
- Parallel computing
- Linux systems administration and good knowledge of enterprise storage solutions (Isilon, NetApp, object storage, Lustre-based systems)
- HPC management tools (e.g., Kubernetes, Docker, LSF, Slurm)
- High-performance processors and compute offload devices (e.g., GPUs)
- Low-latency network architecture, including high-speed interconnects (e.g., InfiniBand, Ethernet)
- Datacenter design and optimization
- Experience designing hybrid cloud or cloud-based HPC environments
- AI/ML frameworks and their integration into HPC systems
- Programming languages such as Python, Bash, C++, or similar
- Excellent problem-solving and analytical skills
- Ability to mentor and coach junior team members
- Exceptional communication skills, with the ability to translate complex technical concepts for non-technical audiences
- Proven ability to influence decision-making and align global teams to achieve a unified vision