This textbook was written for the clinical research community at Johns Hopkins that uses the Precision Medicine Analytics Platform (PMAP). These notebooks are available in HTML form on the Precision Medicine portal and in computational form in the CAMP-share folder on Crunchr (Crunchr.pm.jh.edu).
"The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text." (Jupyter Open Source Project)
Jupyter is a web-based data science environment that includes several key functions.
File Browser
- A file explorer to collect, share, and manage the data and files you use for analysis.
Text Editor
- A web-based text editing application that provides an integrated development environment.
Notebook
- A standards-based JSON file format that combines documentation with computational code, creating an analytics work product that can be published along with research manuscripts to enable reproducible research. These notebooks can also be converted to HTML for web viewing.
Terminal
- A web-based terminal window giving users command-line access to the underlying operating system.
Kernel
- The kernel executes the code in the notebook in the specified programming language and creates the output displayed in the notebook.
Jupyter is used and taught across many industries to support data science communities.
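To make the notebook file format described above more concrete, here is a minimal sketch of what a notebook looks like as JSON. It uses only Python's standard `json` module and follows the top-level layout of the version 4 notebook format (`cells`, `metadata`, `nbformat`, `nbformat_minor`). The cell contents and the helper name `build_minimal_notebook` are illustrative assumptions, and a real notebook usually carries more metadata (such as kernel information) than is shown here.

```python
import json


def build_minimal_notebook():
    """Return a dict shaped like a version 4 notebook with two cells."""
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        "source": ["# Analysis notes\n", "Documentation lives in markdown cells."],
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": None,  # the cell has not been run yet
        "metadata": {},
        "outputs": [],            # the kernel's output would be stored here
        "source": ["print('hello world')"],
    }
    return {
        "cells": [markdown_cell, code_cell],
        "metadata": {},           # left empty in this sketch
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = build_minimal_notebook()
# A notebook file on disk is this structure serialized as JSON text.
notebook_text = json.dumps(notebook, indent=1)
print(notebook_text)
```

Because the file is plain JSON, it can be stored, compared, and published like any other text file, which is part of what makes notebooks suitable for reproducible research.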
The file browser is a web-based client. The directory path shows which directory you are in, and you can click it to move to a parent directory. The file list shows subfolders and files. Click a subfolder to go to that directory, or click a file to open it in the text editor in a new browser tab.
When you select a file, a row of buttons is revealed that lets you rename, duplicate, or delete it.
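The same file operations can also be scripted from a notebook cell. The sketch below is not how the file browser itself is implemented; it only mirrors the listing, rename, duplicate, and delete actions with Python's standard `pathlib` and `shutil` modules. All file and folder names are hypothetical, and the code works in a temporary directory so that no real files are touched.

```python
import shutil
import tempfile
from pathlib import Path


def list_directory(folder):
    """Return (subfolders, files) as sorted lists of names."""
    entries = list(Path(folder).iterdir())
    subfolders = sorted(p.name for p in entries if p.is_dir())
    files = sorted(p.name for p in entries if p.is_file())
    return subfolders, files


# Throwaway workspace with one subfolder and one file (hypothetical names).
workspace = Path(tempfile.mkdtemp())
(workspace / "data").mkdir()
original = workspace / "notes.txt"
original.write_text("draft\n")

# Rename the file.
renamed = workspace / "analysis_notes.txt"
original.rename(renamed)

# Duplicate it, then delete the duplicate again.
duplicate = workspace / "analysis_notes-Copy1.txt"
shutil.copy2(renamed, duplicate)
duplicate.unlink()

remaining = list_directory(workspace)
print(remaining)

# Clean up the temporary workspace.
shutil.rmtree(workspace)
```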
If a file is shown in green, it is currently active: it is likely open in another tab or was never shut down. To close an actively running file, go to the Running tab next to the Files tab.
To shut down a page, click the orange Shutdown button next to the page name.
On the right-hand side of the file browser there is a New button. When you click it, you will see options to create new notebooks based on kernels for Python 2, Python 3, and R.
You also have the ability to create a text file, launch a terminal view, and create a new folder.
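Since the New menu offers more than one Python kernel, it can be useful to confirm which interpreter a notebook is actually running. The following is a minimal sketch using only the standard `sys` and `platform` modules; the function name `describe_python_kernel` is a hypothetical helper, and the code is written so that it should run under either Python 2 or Python 3.

```python
import platform
import sys


def describe_python_kernel():
    """Return a one-line summary of the interpreter running this cell."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return "Python {} ({})".format(version, platform.python_implementation())


print(describe_python_kernel())
```

Running this in the first cell of a new notebook shows whether the notebook was created with the kernel you intended.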
The first thing you should do when you create a new notebook is name it. You can do this by clicking on the Untitled label in the top center.
Files are automatically saved periodically. You can force a save by pressing the save button.
Let's run our first command. Type
print('hello world')
into the grey cell block and run it.
Now let's create a new cell block using the + menu option. In this cell block, pull down the menu that says Code and convert the block into a markdown cell. Markdown is used for documentation.
You execute a markdown cell the same way as a code cell: press Ctrl+Enter or click the run button.
Note: To edit a markdown cell once it has been rendered, double-click the cell to switch from presentation mode to edit mode.
Let's create one more cell block as code, type in a ? and run that cell block. This special command pulls up the help documentation within the interface. You can use it to look up functions and libraries.
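The ? command is provided by the IPython kernel, so it is only available in Python notebooks. As a hedged companion, the sketch below shows plain Python equivalents that work in any Python interpreter: the built-in `help()` function and the standard `inspect` module. The helper `first_doc_line` and the example function `add` are hypothetical names introduced only for illustration.

```python
import inspect


def first_doc_line(obj):
    """Return the first line of an object's documentation, or '' if none."""
    doc = inspect.getdoc(obj)
    return doc.splitlines()[0] if doc else ""


def add(a, b):
    """Return the sum of a and b."""
    return a + b


# Look up the short description of our own function and of a built-in.
print(first_doc_line(add))
print(first_doc_line(len))

# help() prints the full documentation for an object.
help(add)
```

In a notebook with the IPython kernel, appending ? to a name (for example `add?`) shows similar information in the interface.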
There are great resources on Jupyter on the web. There are currently over 2.5 million Jupyter notebooks on the github.com source repository system. Here are some good places to check out.
Python Data Science Handbook
- An open source textbook, built with Jupyter notebooks, on learning Python for data science.
Jupyter Notebook Gallery
- A gallery of notebooks as well as textbooks that use Jupyter Notebooks.
Jupyter User Interface Documentation
Jupyter Notebook Cheat Sheet
- A useful quick guide to the UI from DataCamp.