Accessing Your Data Using Python

Table of Contents

Setting Up the Container

The steps below will go over how to set up a container in crunchr. Make sure you are in SAFE before proceeding.

Step One: Accessing Crunchr

On the SAFE desktop, click on the "PMAP Tools" folder, and then click on "PMAP crunchr Analytics"

Step Two: Creating a Container

On the crunchr homepage, click on "Compute," and then click "Create Container." You can name your container whatever you would like. the Domain does not need to change. For "Compute Image," select "PMAP Jupyter Python 2.7" from the drop-down. then click "Create."

Step Three: Preparing to run the code

On the left-hand side you should see folders, like in the image below.

You have two options:

  1. You can click on "getting_started" and then open "Start_Here.ipynb" which will open the cookbook within crunchr. Then you can click on the link to this notebook and run the following code right within crunchr!

OR

  1. If you want to create your own notebook, go to Storage --> your JHED --> Persistent. Then, on the right, under "Notebook," select "Python" to launch a new Python notebook. You can run the following code in that new notebook to connect to your database and access your data.

Connecting to the Database

The following steps will show you what code to run in order to access your data. You can copy and paste the code from here to your own notebook, or run it directly in this notebook if you opened it in crunchr. Just make sure to make any necessary changes (database name, your jhed, etc).

Setting up the Connection

The following code establishes a connection with the CAMP_PMCoe_Projection database, for demonstration purposes. Make sure to confirm the name of your database and update that field accordingly.

Connecting to the Database

Basic SQL: List Your Table Names

Run the following code to retrieve a list of the table names in your database.

Basic SQL: Put Your Data into Dataframes

The following code will put a table from your database into a dataframe. In this example, we will put the encounters table into a dataframe we are calling df2. You can change the object name (df2 --> df3, etc) and the table name to get all your tables into dataframes.

Final Thoughts and Additional Resources

Please note that if you are proficient with SQL, you can put any SQL query in place of the queries we have above. This is helpful if you do not want all rows or all columns of a table.

There are many more resources within this cookbook which show you additional ways to work with SQL, and different things you can do with your data once you have it in dataframes. The link below is a particularly good follow-up to this guide.

Intro to SQL for Mining EMR Data