From 5 days to 1 hour — The Magic of HPC

Sagar Chauhan
11 min read · Dec 10, 2022

It’s all about how Preemptible Virtual Machines on Google Cloud enabled researchers at Neurosim Lab to smoothly autoscale complex simulations of brain circuits, reducing computing time from five days to one hour.

Introduction —

High performance computing (HPC) is the processing of large data sets on machines with high-performance processors and high-speed network interconnects. It enables a wide range of applications, including weather prediction, climate modeling, drug discovery, and optimization. HPC systems are used in fields such as bioinformatics, aerospace engineering, and materials science research.

HPC allows computation of larger-scale problems —

HPC systems are used to solve all sorts of problems, from modeling the weather to designing nuclear power plants. They do this by breaking large problems into smaller pieces that can be solved simultaneously. For example, suppose an application needs to simulate a million atoms moving around on a grid over time to determine whether they collide with each other (a problem called molecular dynamics). Calculating all those interactions on a standard PC or laptop would take many years; an HPC system like the ones at Lawrence Livermore National Laboratory (LLNL) spreads the same work across thousands of processors at once.
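To make that divide-and-conquer idea concrete, here is a minimal sketch on a single machine (not LLNL’s actual code; the names and the toy 2D geometry are illustrative assumptions). The collision check is split into slices that worker processes evaluate simultaneously; an HPC cluster applies the same decomposition across thousands of nodes rather than a few local cores.

```python
# Minimal sketch: splitting a molecular-dynamics-style collision check
# across worker processes. Problem size and geometry are toy values.
from multiprocessing import Pool
import random

N = 2_000                 # toy size; real runs simulate millions of atoms
COLLISION_RADIUS = 0.001
random.seed(42)           # same positions in every worker process
atoms = [(random.random(), random.random()) for _ in range(N)]

def count_collisions(start: int, end: int) -> int:
    """Check one slice of atoms against all later atoms for near-contact."""
    hits = 0
    for i in range(start, end):
        xi, yi = atoms[i]
        for j in range(i + 1, N):
            xj, yj = atoms[j]
            if (xi - xj) ** 2 + (yi - yj) ** 2 < COLLISION_RADIUS ** 2:
                hits += 1
    return hits

if __name__ == "__main__":
    workers = 8
    step = N // workers
    slices = [(k * step, N if k == workers - 1 else (k + 1) * step)
              for k in range(workers)]
    with Pool(workers) as pool:
        total = sum(pool.starmap(count_collisions, slices))
    print(f"colliding pairs: {total}")
```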

Neurosim Lab is a high performance computing research lab specializing in brain simulation and modeling. Its mission is to advance neuroscience by developing advanced computational techniques for analyzing brain activity and mapping neural circuits. The lab uses cutting-edge technologies like GPU-based machine learning and OpenCL-based numerical linear algebra, with Google Cloud Platform (GCP) as its primary platform for working with data sets that are too large or too complex for any other solution.

Neurosim Lab is a neuroscience research lab that uses computational modeling to understand the human brain, and it has been using Google Cloud Platform to run its simulations for more than four years.

Why Google Cloud Platform?

Google Cloud Platform is a cost-effective and reliable platform for HPC. It provides a wide range of services, including compute, storage, and networking, that let you build HPC applications in the cloud.

Google Cloud Platform is easy to use:

  1. You can create your own VMs in minutes using the Google Compute Engine Console or APIs.
  2. You can connect them together with just a few clicks through the same console or API.
  3. You can scale up or down easily, with no downtime, by updating your VM’s configuration settings via the APIs or by running automated scripts in the Cloud Shell command-line interface (CLI). A code sketch follows this list.
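As a concrete illustration of step 1, here is a hedged sketch that creates a preemptible VM with the google-cloud-compute Python client. The project ID, zone, machine type, and instance name are placeholder assumptions, not values from Neurosim Lab’s setup.

```python
# Hedged sketch: creating a preemptible Compute Engine VM with the
# google-cloud-compute client (pip install google-cloud-compute).
# Project, zone, and names are illustrative placeholders.
from google.cloud import compute_v1

project_id = "my-hpc-project"   # placeholder project ID
zone = "us-central1-a"

instance = compute_v1.Instance()
instance.name = "sim-node-1"
instance.machine_type = f"zones/{zone}/machineTypes/e2-standard-16"

boot_disk = compute_v1.AttachedDisk()
boot_disk.boot = True
boot_disk.auto_delete = True
boot_disk.initialize_params = compute_v1.AttachedDiskInitializeParams(
    source_image="projects/debian-cloud/global/images/family/debian-12"
)
instance.disks = [boot_disk]

nic = compute_v1.NetworkInterface()
nic.network = "global/networks/default"
instance.network_interfaces = [nic]

# Preemptible scheduling is what makes the instance cheap but reclaimable.
instance.scheduling = compute_v1.Scheduling(preemptible=True)

client = compute_v1.InstancesClient()
operation = client.insert(project=project_id, zone=zone,
                          instance_resource=instance)
operation.result()  # block until the create operation finishes
print(f"created {instance.name}")
```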

Designing the solution

The Neurosim Lab is a neuroscience research lab that uses high performance computing (HPC) to simulate brain circuits. It needed a solution that was cost-effective and scalable, so it turned to a data-driven, cloud-based approach.

The Neurosim Lab’s on-premises HPC system uses InfiniBand network cards that have been optimized for HPC applications by vendors like Mellanox Technologies, AMD, and Intel. The designers also selected IBM Power Systems Server P495 hardware with IBM BladeCenter HS23 server blades to provide an integrated platform for simulation and data-analytics workloads. Because the platform exploits multiple hardware threads per core and multiple cores per processor, the lab can scale up its simulations as needed without a performance penalty at any stage, from concept generation through validation and verification to production deployment. The IBM Power Systems servers are built on the IBM POWER9 processor, one of the most powerful CPUs on the market today; a POWER9 chip offers up to 24 cores, each able to run as many as eight hardware threads.

Building our app with the Google Cloud Platform API

To build our app, we use the Google Cloud Platform APIs; the Google Compute Engine and Google Cloud Storage APIs in particular are essential.

The following are some of the services we’re going to use:

  • Google Cloud Platform (GCP) — Google’s suite of cloud services, managed through a web-based console that lets users manage their resources and applications in one place.
  • Google Compute Engine — This service lets you run virtual machines on Google’s infrastructure. Because the VMs are ordinary machines with no dependency on any specific host, your code executes in its own environment regardless of where it runs within GCP’s ecosystem.

The Google Cloud Platform is a great fit for Neurosim Lab. The next step will be to integrate Google Cloud Storage with Neurosim Lab, so that users can upload and retrieve data from their experiments. The API has already been used successfully by other researchers in the field of neural networks, so it is no surprise that it works well for this purpose too.
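A sketch of what that integration might look like with the google-cloud-storage client; the bucket name and object paths are hypothetical, not the lab’s actual layout.

```python
# Hedged sketch: uploading simulation output to Cloud Storage and pulling
# it back down later (pip install google-cloud-storage).
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("neurosim-experiments")   # hypothetical bucket

# Upload a result file from a finished run.
blob = bucket.blob("runs/run-0001/spikes.json")
blob.upload_from_filename("spikes.json")

# Retrieve it later from any machine with access to the bucket.
blob.download_to_filename("spikes_local.json")
```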

Neurosim Lab’s models could also simulate brain circuits on high-performance computing (HPC) machines such as those offered by AWS or Microsoft Azure, which would allow researchers to make predictions about how different parts of our brains work together during decision-making processes like “should I go left or right?”

Neurosim Lab seamlessly auto-scales its detailed simulations of brain circuits with Google Cloud’s Preemptible Virtual Machines —

Neurosim Lab seamlessly auto-scales its detailed simulations of brain circuits with Google Cloud’s Preemptible Virtual Machines. They use the following tools:

  • VMware Workstation for running stand-alone instances of the program
  • Google Compute Engine for provisioning and managing virtual machines
  • The Google Cloud Platform console for managing instances, networks, and load balancers
  • Google Kubernetes Engine (formerly Google Container Engine) for managing Docker containers.
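Because Preemptible VMs can be reclaimed at any time, a batch worker typically watches for the preemption signal so it can checkpoint and exit cleanly. Here is a hedged sketch that polls the Compute Engine metadata server’s instance/preempted key; the simulation-step and checkpoint functions are placeholder assumptions.

```python
# Hedged sketch: a batch worker on a Preemptible VM polling the Compute
# Engine metadata server so it can checkpoint before being reclaimed.
import time
import urllib.request

PREEMPTED_URL = ("http://metadata.google.internal/computeMetadata/v1/"
                 "instance/preempted")

def is_preempted() -> bool:
    req = urllib.request.Request(PREEMPTED_URL,
                                 headers={"Metadata-Flavor": "Google"})
    try:
        with urllib.request.urlopen(req, timeout=2) as resp:
            return resp.read().decode().strip() == "TRUE"
    except OSError:
        return False  # not on GCE, or metadata server unreachable

def run_simulation_step() -> None:
    time.sleep(1)  # placeholder for one unit of real simulation work

def checkpoint() -> None:
    pass  # placeholder: persist progress, e.g. to Cloud Storage

while not is_preempted():
    run_simulation_step()
    checkpoint()
print("preemption signaled; shutting down cleanly")
```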

Roy Sookhoo, the SUNY Downstate Medical Center’s chief information officer, was up against a problem in late 2017. He needed to update the technology infrastructure of SUNY Downstate’s five schools and its hospital to facilitate cutting-edge research for its professors and residents. First, he conducted some independent research by speaking with professors like Salvador Dura-Bernal at SUNY Downstate’s Neurosim Lab. The lab, directed by Bill Lytton, runs computationally intensive simulations of the brain’s cortical networks, and its researchers had been asking for additional processing capacity. Sookhoo wanted to test a move to the cloud that he could replicate throughout the organization, since additional big-data initiatives were in the works.

“As CIO, I want to make sure that I provide the resources that these researchers and scientists need to do their job,” Sookhoo states. “The equipment that we have is outdated, so rather than make an investment in new equipment, we thought it would be better for them to get to the cloud, where they can scale as they need to and have the processing power whenever they want. It’s a win-win for us to get on the cloud.”

Highly detailed simulations can help scientists learn more about how the brain works and make strides toward curing diseases like schizophrenia, Parkinson’s, and epilepsy. The Neurosim Lab team developed the most accurate model of mouse motor cortex microcircuits to date, using funding from the National Institutes of Health and the New York State Spinal Cord Injury Research Board. It covers a 0.1 mm³ region containing over 10,000 cells and close to 30 million synaptic connections. However, performing even one second of the simulation on a physical server took one hour on 50 cores, i.e., 50 core-hours per simulated second, so launching a thousand or more simulations of 1–2 seconds each with varied parameters meant a single batch could consume 50,000 core-hours.

The team used Google Cloud’s Preemptible VMs, a versatile, affordable way to execute batch operations on Google Compute Engine, to supplement supercomputer time funded by the National Science Foundation’s XSEDE program. It succeeded. “Google Cloud made an exponential difference for us,” Dura-Bernal says. “I can now process things that used to take three to four days in three to four hours. I can come up with a concept, test it out, and receive outcomes really quickly. And wow, the fact that I can execute the computation simultaneously on 50,000 processors, as opposed to my limit of 500 on site.”

Dura-Bernal’s models run on the NEURON simulation engine and NetPyNE, an open-source tool created by the Neurosim Lab to help researchers build their own models of biological neural networks. Using NetPyNE and Google Cloud VMs, researchers can replicate experimental data in scaled-up, controlled simulations. To manage their high-performance computing cluster, the team at SUNY Downstate also benefited from Google Cloud’s integration with Slurm, a well-known open-source program that automatically queues and efficiently distributes workloads. “One advantage of Google Cloud is that you can install and remove the HPC clusters easily and fully configure them,” Dura-Bernal says. He can spin up as many as 2,300 nodes with 16 cores each in 10 minutes, and the simulations can run on them for two hours before the nodes automatically shut down.
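For a flavor of what building a model with NetPyNE looks like, here is a minimal sketch in the spirit of NetPyNE’s introductory tutorial. It requires NEURON and NetPyNE installed, and the population sizes and parameters are illustrative; the lab’s cortical models are far more detailed.

```python
# Minimal NetPyNE-tutorial-style sketch: one excitatory population of
# Hodgkin-Huxley cells driven by noisy background input.
from netpyne import specs, sim

netParams = specs.NetParams()

# A small population of simple single-compartment cells.
netParams.popParams['E'] = {'cellType': 'PYR', 'numCells': 40}
netParams.cellParams['PYRrule'] = {
    'conds': {'cellType': 'PYR'},
    'secs': {'soma': {'geom': {'diam': 18.8, 'L': 18.8, 'Ra': 123.0},
                      'mechs': {'hh': {}}}}}

# An excitatory synapse model and background drive onto the population.
netParams.synMechParams['exc'] = {'mod': 'Exp2Syn',
                                  'tau1': 0.1, 'tau2': 5.0, 'e': 0}
netParams.stimSourceParams['bkg'] = {'type': 'NetStim',
                                     'rate': 10, 'noise': 0.5}
netParams.stimTargetParams['bkg->E'] = {'source': 'bkg',
                                        'conds': {'pop': 'E'},
                                        'weight': 0.01, 'delay': 5,
                                        'synMech': 'exc'}

simConfig = specs.SimConfig()
simConfig.duration = 1000  # ms of simulated time
simConfig.recordTraces = {'V_soma': {'sec': 'soma', 'loc': 0.5, 'var': 'v'}}
simConfig.analysis = {'plotRaster': True}

# Build the network, run NEURON underneath, and analyze the output.
sim.createSimulateAnalyze(netParams=netParams, simConfig=simConfig)
```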

“Running the models on Preemptible VM instances,” he adds, “is four times cheaper and allows us to try more hypotheses because we can run the tests faster.” With more powerful processing, Dura-Bernal hopes next to model a larger area of the brain with multiple interconnected regions.
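In practice, trying more hypotheses means sweeping a model’s parameters across many queued runs. Below is a hedged sketch of how such a sweep might be submitted through Slurm; the script name run_sim.py, the parameter names, and the resource requests are illustrative assumptions rather than the lab’s actual pipeline.

```python
# Hedged sketch: queueing a parameter sweep through Slurm's sbatch.
# Script name, parameters, and resource requests are illustrative.
import itertools
import subprocess

syn_weights = [0.5, 1.0, 1.5]
delays_ms = [1, 2, 5]

for w, d in itertools.product(syn_weights, delays_ms):
    subprocess.run(
        ["sbatch",
         "--ntasks=16",              # cores per simulation
         "--time=02:00:00",          # matches the two-hour runs above
         f"--job-name=sim_w{w}_d{d}",
         "--wrap", f"python run_sim.py --weight {w} --delay {d}"],
        check=True,
    )
```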

Researchers can create “biomimetic” implants to replace injured brain tissue by modeling the flow of information through the brain’s intricate circuitry. They can design prosthetic limbs that both move and feel by understanding the link between motor and sensory functions. And they can develop experimental drug and electrical-stimulation therapies by studying the neurodynamics of the brain’s chemical composition. With these incredibly accurate simulations, Dura-Bernal continues, “we can assess the impact of novel therapies first in simulation before administering them to actual patients.”

Benefits to IT professionals —

The team has already done a lot, and Lin Wang, Associate Director for User Computing at SUNY Downstate, is impressed: “Building this type of computing capacity in roughly three months is inconceivable with a physical structure. We lack the necessary infrastructure.” For a forthcoming genomics sequencing project, Sookhoo intends to use a hybrid system, but his long-term objective is to move entirely to Google Cloud: “I hope that all of the equipment at SUNY Downstate will be on Google Cloud and that we won’t need any equipment here. That is my aim,” he declares. “Researchers ought to be concerned with their own research, not with machines. The apparatus needs to function like a light switch. They switch it on, use it, then shut it off before leaving for home.”

“Google Cloud made an exponential difference for us. Processing that before took three to four days, I can now run in three to four hours.” — Dura-Bernal

A brief about SUNY Downstate Medical Center —

SUNY Downstate Medical Center is one of the nation’s leading urban medical centers. The Downstate campus includes a College of Medicine, a College of Nursing, a School of Health Professions, a School of Graduate Studies, a School of Public Health, and the University Hospital of Brooklyn.

The mission at Downstate is to provide access to excellence in education, patient care and research.

Downstate Medical Center of Brooklyn

SUNY Downstate Medical Center, the largest employer in Brooklyn, has deployed HPC cluster integration with Google Cloud Platform.

The research hospital with a mission to serve the healthcare needs of New York State, including its most vulnerable citizens, is now using Google Cloud solutions to advance the field of medicine and promote quality patient care.

Downstate Medical Center was founded in 1860 as the Long Island College Hospital. It has since evolved into a 705-bed medical center that serves as Brooklyn’s only academic medical center and its only level 1 trauma center. It also educates more healthcare professionals than any other institution in Brooklyn.

Researchers at Downstate develop new therapies and clinical trials for diseases such as cancer, diabetes and HIV/AIDS.

Downstate researchers needed a cloud solution that would optimize their high performance computing (HPC) workloads and offer storage solutions to house the vast amounts of data they generate.

This HPC cluster integration allows for a seamless migration of data from local servers to a cloud-based system, which can then analyze vast amounts of patient data at the same time.

Downstate’s infrastructure includes multiple computing centers that provide virtualized compute power and storage resources for scientific research projects conducted by faculty members across academic departments such as Chemistry & Biochemistry.

With Google Cloud technologies like Filestore High Scale volumes and Compute Engine VMs with preemptible pricing, Downstate was able to enhance their Slurm workflow scheduler to run tasks across compute nodes with burstable workloads.

Conclusion —

The hospital can now run HPC jobs across all its compute nodes, which helps it meet the demands of its research and clinical work.

High performance computing is not just about having the right tools; it is also about planning your workflow and using those tools effectively. Doing that takes experts who know what they are doing and can teach others how to work best.

There are many ways to choose a high performance computing company or consultant, but one thing is clear: you want someone who will listen to your needs and put their expertise toward building the solution that best suits those needs.
