High-Performance Computing (HPC) clusters for Image Analysis

This tutorial was written for the

Introduction to Phoenix

Phoenix is a Johns Hopkins Medicine on-premise GPU-enabled high-performance computing cluster that may be used free of charge for processing medical images. Processing can be done in several steps:

  1. Getting access to Phoenix and ADLS
  2. Copying data from ADLS to Phoenix
  3. Transferring a Singularity image into Phoenix
  4. Submitting a job
  5. Copying results from Phoenix into ADLS

This tutorial covers each of these steps one by one.

Getting access to Phoenix and ADLS

Open a web browser and enter the following URL: http://t.jh.edu/pmap-infrastructure

After logging in with your JHED ID and password, you should see a webpage like the one shown below.

[access1.png]

Enter your JHED ID and email, and select "Ask a Question" for the Support Type. In your question, ask for access to Phoenix and ADLS as in the figure below.

[access2.png]

Within a few days, you should be given access to both Phoenix and ADLS to continue with the next steps.

Copying data from ADLS to Phoenix

From this point forward, you must be on the JHU network. If you are not on the JHU network and have not connected to the VPN before, please see the instructions at this link: https://cds.johnshopkins.edu/vpn/

Using your favorite SSH client (e.g. PuTTY), open a connection to mrprhpch1.hosts.jhmi.edu and log in with your JHED ID and password. In the shell, enter the following lines:

cd /usr/local/admin/azcopy_linux_amd64_10.0.8
./azcopy login

At this point, you will be prompted to use a web browser to open the page https://microsoft.com/devicelogin and to enter a code to authenticate.

[copy1.png]

Follow the instructions in your web browser until you are authenticated. Once authenticated, you should also see a "Login succeeded." message in your SSH client.

From here you can copy data from ADLS into Phoenix using the azcopy cp command, followed by the source path and the destination path. For example, if you want to copy from the dev folder in ADLS into the hpcdata1 folder in Phoenix, you could enter the following command:

./azcopy cp "https://pmapprojects.dfs.core.windows.net/dev" "/hpcdata1/pmap-projects/dev/tmp" --recursive=true

When the transfer is completed, you will be presented with a summary of the transfer, including elapsed time, number of transfers completed, number of transfers failed, number of transfers skipped, total bytes transferred, and final job status.
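If a long transfer is interrupted, it does not have to be restarted from scratch. A minimal sketch using azcopy's job-management subcommands (available in azcopy v10; the job ID below is a placeholder for the ID shown in your transfer summary):

```shell
# List previous azcopy jobs and their final status
./azcopy jobs list

# Resume an interrupted job by its ID (replace <job-id> with your actual job ID)
./azcopy jobs resume <job-id>
```

The job ID is also printed at the start of each transfer, so you can note it down before leaving a large copy running.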

Transferring a singularity image into Phoenix

The Phoenix HPC environment uses Singularity (as opposed to Docker) to run algorithm containers. Creating Singularity images is beyond the scope of this tutorial; it is assumed that you already have a Singularity image of your algorithm, created separately from Phoenix. If you don't have your own Singularity image, one may also be created from a Docker image. For more information about Singularity, please see https://github.com/sylabs/singularity.
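For reference, converting a Docker image is typically a one-line build. A hedged sketch, assuming your image is published on a Docker registry; myuser/myalgorithm:latest is a hypothetical repository name you would replace with your own:

```shell
# Build a Singularity image file from a Docker Hub image
# (the docker:// URI tells Singularity to pull and convert the Docker layers)
singularity build myalgorithm.simg docker://myuser/myalgorithm:latest
```

Note that newer Singularity versions default to the .sif format, but you can still name the output .simg as above.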

Transferring a Singularity image file into Phoenix may be done by SCP, just like transferring any file onto a Linux platform. Using your favorite SCP client (e.g. WinSCP), open a connection to mrprhpch1.hosts.jhmi.edu and log in with your JHED ID and password. Then copy your Singularity image file into Phoenix as directed by your SCP client.

An example is shown below using WinSCP for transferring the iacl_pipeline_v1.1.0_withatlas-19771e6.simg Singularity image file.

[transfer1.png]

Submitting a job

The Phoenix HPC environment uses the cluster workload manager, Slurm, to schedule jobs (https://slurm.schedmd.com/overview.html). To submit a job on Phoenix, first create a shell script that will be processed on a node.

For example, below is a shell script named myjob.batch that was created to run the iacl_pipeline_v1.1.0_withatlas-19771e6.simg Singularity image file that was transferred in the previous step.

#!/bin/sh
#SBATCH --ntasks=1

# Make the singularity command available on the compute node
module load singularity

# Paths to the pipeline directory and the input data on Phoenix
IACL=/hpcdata1/pmap-projects/dev/imaging/IACL
DATA=/hpcdata1/pmap-projects/dev/imaging/data
PATIENT=20011
SCAN=01

cd $IACL
singularity exec iacl_pipeline_v1.1.0_withatlas-19771e6.simg \
    run_patient -d $DATA -p $PATIENT -s $SCAN \
    -a /opt/atlas/mni_0p8mm/mni_icbm_152_2009c_t1_0p8mm.nii.gz \
    -b /opt/atlas/mni_0p8mm/mni_icbm_152_2009c_t1_0p8mm_brainmask.nii.gz \
    -r /opt/atlas/mni_0p8mm/mni_icbm_152_2009c_t1_0p8mm_registration_mask.nii.gz \
    -m /opt/atlas/monstr/ -g /opt/atlas/s3dl_lesions/ -f /opt/atlas/nmm30_1mm/

Once the batch file is created, a job may be submitted using the sbatch command. For example, the following line runs the batch file above.

sbatch myjob.batch
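Since Phoenix is GPU-enabled, your job may also request a GPU. A hypothetical variant of the batch script above, assuming GPUs are exposed on Phoenix through Slurm's generic-resource (GRES) mechanism; the exact GRES name may differ, so check with your cluster administrator:

```shell
#!/bin/sh
#SBATCH --ntasks=1
#SBATCH --gres=gpu:1          # request one GPU (assumes a "gpu" GRES is configured)

module load singularity

# The --nv flag makes the host's NVIDIA drivers and libraries
# visible inside the container
singularity exec --nv iacl_pipeline_v1.1.0_withatlas-19771e6.simg \
    run_patient -d $DATA -p $PATIENT -s $SCAN
```

The run_patient arguments are abbreviated here; in practice you would pass the same atlas options as in myjob.batch.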

While the job is running, you can check its status using the squeue command. An output file (e.g. slurm-224.out for job 224) will also be created containing the output of your job. See below for an example.

[nkuo8@mrprhpch1cl IACL]$ tail slurm-224.out
190420-13:51:03,500 workflow INFO:
         [Node] Finished "pipeline.cortical_thickness_01.move_inner_values".
190420-13:51:03,501 workflow INFO:
         [Node] Setting-up "pipeline.skull_strip_01.monstr_01.move_icv_mask" in "/hpcdata1/pmap-projects/dev/imaging/IACL/data/20011/pipeline/skull_strip_01/monstr_01/move_icv_mask".
190420-13:51:03,612 workflow INFO:
         [Node] Running "move_icv_mask" ("iacl_pipeline.interfaces.io.MoveResultFile")
190420-13:51:03,671 workflow INFO:
         [Node] Finished "pipeline.skull_strip_01.monstr_01.move_icv_mask".
/opt/conda/envs/neuro_py35/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
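Beyond tailing the output file, a couple of standard Slurm commands are useful for monitoring. A brief sketch (squeue and scancel are core Slurm tools; job ID 224 matches the example above):

```shell
# Show only your own jobs in the queue ($USER is your JHED ID on Phoenix)
squeue -u $USER

# Cancel a running or queued job by its ID if needed
scancel 224
```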

Copying results from Phoenix into ADLS

The procedure for copying job results from Phoenix into ADLS is similar to copying data from ADLS into Phoenix, just with the source and destination reversed.

As in Step 2, run the following lines in the shell.

cd /usr/local/admin/azcopy_linux_amd64_10.0.8
./azcopy login

Authenticate according to the instructions if you are not already authenticated. From here you can copy data from Phoenix into ADLS using the azcopy cp command, followed by the source path and the destination path. For example, if you want to copy from the hpcdata1 folder in Phoenix into the dev folder in ADLS, you could enter the following command:

./azcopy cp "/hpcdata1/pmap-projects/dev/tmp" "https://pmapprojects.dfs.core.windows.net/dev" --recursive=true

Summary

In this tutorial, we covered the following steps to process medical images on Phoenix.

  1. Getting access to Phoenix and ADLS
  2. Copying data from ADLS to Phoenix
  3. Transferring a Singularity image into Phoenix
  4. Submitting a job
  5. Copying results from Phoenix into ADLS

Development continues to make the process more streamlined, but hopefully this tutorial is sufficient to get you started!