Jupyter Crash Course

Getting Documentation

Jupyter uses the IPython kernel when you work with PyARPES. IPython offers many conveniences, such as quick access to a function's documentation (with ?) or its source code (with ??).

[2]:
import pprint

# Look at help information for the library function `pprint.pprint`
pprint.pprint?
Signature:
pprint.pprint(
    object,
    stream=None,
    indent=1,
    width=80,
    depth=None,
    *,
    compact=False,
    sort_dicts=True,
)
Docstring: Pretty-print a Python object to a stream [default is sys.stdout].
File:      c:\tools\miniconda3\envs\python3\lib\pprint.py
Type:      function

You can use this with library functions, PyARPES functions, or your own.

[3]:
from arpes.io import load_example_data

# Let's look at the full source code for `load_example_data`
load_example_data??
Signature: load_example_data(example_name='cut') -> xarray.core.dataset.Dataset
Source:
def load_example_data(example_name="cut") -> xr.Dataset:
    """Provides sample data for executable documentation."""
    if example_name not in DATA_EXAMPLES:
        warnings.warn(
            f"Could not find requested example_name: {example_name}. Please provide one of {list(DATA_EXAMPLES.keys())}"
        )

    location, example = DATA_EXAMPLES[example_name]
    file = Path(__file__).parent / "example_data" / example
    return load_data(file=file, location=location)
File:      c:\users\chsta\documents\github\arpes\arpes\io.py
Type:      function

You can also use the built-in help function to get a terser form of this information.

[4]:
help(pprint.pprint)
Help on function pprint in module pprint:

pprint(object, stream=None, indent=1, width=80, depth=None, *, compact=False, sort_dicts=True)
    Pretty-print a Python object to a stream [default is sys.stdout].
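
The ? and ?? shortcuts are IPython syntax, so they only work inside a notebook or an IPython shell. As a rough sketch of how to get similar information in plain Python, the standard-library inspect module can report a function's signature and docstring. The function scale_counts below is a hypothetical example written for illustration; it is not part of PyARPES.

```python
import inspect


def scale_counts(counts, factor=2.0):
    """Multiply every value in `counts` by `factor`."""
    return [c * factor for c in counts]


# Roughly what `scale_counts?` reports: the signature and the docstring.
signature = inspect.signature(scale_counts)
docstring = inspect.getdoc(scale_counts)

print(f"scale_counts{signature}")
print(docstring)
```

When a function's source is available, inspect.getsource plays a role similar to ??. Writing a docstring for your own helpers is what makes ?, ??, and help useful on them.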

Best Practices

Try to keep your analysis notebooks short and ensure it’s possible to run them top to bottom in a reproducible way.

One way to make sure this is possible is to use a module stored alongside your code which contains common routines or snippets used in a particular project or analysis.
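
As a minimal sketch of this layout, assume a file named analysis_helpers.py sits in the same folder as your notebooks. The file name, the function names, and the filename pattern are all hypothetical and chosen only for illustration.

```python
# analysis_helpers.py (hypothetical file stored next to the notebooks)
"""Routines shared by every notebook in this project."""


def normalize(values):
    """Scale a sequence of numbers so that its largest value is 1.0."""
    peak = max(values)
    if peak == 0:
        raise ValueError("cannot normalize: the largest value is zero")
    return [v / peak for v in values]


def figure_filename(notebook_name, label, extension="png"):
    """Build a figure filename that records which notebook produced it."""
    return f"{notebook_name}__{label}.{extension}"
```

Each notebook can then begin with from analysis_helpers import normalize, figure_filename rather than pasting the same cells again. A fix made in the module reaches a notebook the next time its kernel is restarted and the notebook is run from the top.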

In general, sharing code is better than sharing data between notebooks, because it prevents stale dependencies between parts of your analysis.

A given notebook should have limited scope, ideally generating a single figure or a few related figures. If your notebook is getting too long, split it up by using “Save As” to make a copy, then deleting from each copy the cells that belong in the other.

Maintaining Records of Results

It’s painful to go searching through your notebooks for records of your analysis products. It’s probably a good idea to store these out-of-band in a separate piece of software (PowerPoint, Notion, or pencil and paper) so that you can summarize and cross-reference them. Make sure to annotate which notebook each result came from, however!

Exercises

  1. Look at the Jupyter documentation (https://jupyter.org/documentation) to learn more about either JupyterLab or Jupyter Notebook, depending on which you use.

  2. Import numpy.random.random_sample, get documentation with ? and use it to determine how to make a 20 by 20 array of random numbers. What distribution are they drawn from?

  3. Plot the distribution of numbers from the previous cell using the ndarray method ravel and matplotlib.pyplot.hist. Use the documentation to determine how to call them. Hint: If arr is an array, you can get documentation for .ravel by using arr.ravel?.

  4. Make a new cell and run ?. What output do you get?