In [1]:
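import copy

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Needed by data_table() below to build the grid of scenario inputs
from sklearn.model_selection import ParameterGrid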

from whatif import Model

Excel "What if?" analysis with Python - Part 5: Documentation

Documentation is important.

“Code is more often read than written.”

— Guido van Rossum

In this introduction, we'll touch on three specific aspects of documentation:

  • comments and docstrings
  • the readme file
  • Sphinx and reStructuredText

The Real Python people have developed a very nice guide to documentation. You should read through it and use it as a reference as you are going through this notebook.

  • Documenting Python Code: A Complete Guide

Other good resources include:

  • The Hitchhiker's Guide to Python - Documentation
  • PEP 8 - Style Guide for Python Code and PEP 257 - Docstring Conventions
    • PEP stands for Python Enhancement Proposal. See https://www.python.org/dev/peps/.
    • These are great references for learning good Python coding style conventions.
    • They cover docstrings and comments as well as the code itself.

Comments and docstrings

At the very minimum, your code should be well commented and include appropriate docstrings. Let's start by looking back at an early version of our whatif.data_table function from the what_if_1_datatable.ipynb notebook.

In [2]:
def data_table(model, scenario_inputs, outputs):
    """Create n-inputs by m-outputs data table. 

    Parameters
    ----------
    model : object
        User defined model object 
    scenario_inputs : dict of str to sequence
        Keys are input variable names and values are sequence of values for this variable.
    outputs : list of str
        List of output variable names

    Returns
    -------
    results_df : pandas DataFrame
        Contains values of all outputs for every combination of scenario inputs
    """

    # Clone the model using deepcopy
    model_clone = copy.deepcopy(model)
    
    # Create parameter grid
    dt_param_grid = list(ParameterGrid(scenario_inputs))
    
    # Create the table as a list of dictionaries
    results = []

    # Loop over the scenarios
    for params in dt_param_grid:
        # Update the model clone with scenario specific values
        model_clone.update(params)
        # Create a result dictionary based on a copy of the scenario inputs
        result = copy.copy(params)
        # Loop over the list of requested outputs
        for output in outputs:
            # Compute the output.
            out_val = getattr(model_clone, output)()
            # Add the output to the result dictionary
            result[output] = out_val
        
        # Append the result dictionary to the results list
        results.append(result)

    # Convert the results list (of dictionaries) to a pandas DataFrame and return it
    results_df = pd.DataFrame(results)
    return results_df

A few things to note:

  • The block at the top within the triple quotes is a docstring in what is known as numpydoc format. It is pretty verbose but easy for humans to read. Learn more at https://numpydoc.readthedocs.io/en/latest/format.html. This type of block docstring is appropriate for documenting functions and classes.
  • In PyCharm, you can go into Settings | Tools | Python Integrated Tools | Docstring Format and set it to NumPy style. Then, when you type the triple quote after a function definition, PyCharm automatically generates a skeleton docstring in numpydoc format based on the function parameters. Very convenient.
  • Block comments start with a '#' followed by a space and sit on their own line. PEP 8 recommends limiting comment lines to 72 characters.
  • The code above is a little over-commented. This is intentional as it's part of a learning tutorial.
  • By including docstrings, we get to do this...
In [3]:
data_table?
Signature: data_table(model, scenario_inputs, outputs)
Docstring:
Create n-inputs by m-outputs data table. 

Parameters
----------
model : object
    User defined model object 
scenario_inputs : dict of str to sequence
    Keys are input variable names and values are sequence of values for this variable.
outputs : list of str
    List of output variable names

Returns
-------
results_df : pandas DataFrame
    Contains values of all outputs for every combination of scenario inputs
File:      /tmp/ipykernel_122501/2478273169.py
Type:      function
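The `?` syntax only works in IPython and Jupyter. As a minimal sketch of how the same docstring can be reached from plain Python, the example below uses a small stand-in function (`order_cost` here is a hypothetical standalone function, not part of whatif) and reads its docstring with the `__doc__` attribute and `inspect.getdoc`.

```python
import inspect


def order_cost(unit_cost, order_quantity):
    """Compute total order cost.

    Parameters
    ----------
    unit_cost : float
        Cost for each item ordered
    order_quantity : float
        Number of items ordered

    Returns
    -------
    float
        Total cost of the order
    """
    return unit_cost * order_quantity


# The raw docstring is stored on the function object itself
raw_doc = order_cost.__doc__

# inspect.getdoc() returns the same text with the indentation cleaned up
clean_doc = inspect.getdoc(order_cost)

# The first line is the short summary that many tools display on its own
summary = clean_doc.splitlines()[0]
print(summary)

# help(order_cost) would print the signature plus the cleaned docstring
```

This is one reason the first line of a docstring should be a self-contained summary: it is often the only part that gets shown.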

Now let's look at our BookstoreModel class with respect to comments and docstrings. Notice that:

  • the individual methods have short, concise docstrings - many are one line. This is ok if the meaning of the method and the way it was implemented are pretty straightforward.
  • the first line of a multi-line docstring should be a self-contained short description and be followed by a blank line.
In [4]:
class BookstoreModel(Model):
    """Bookstore model

    This example is based on the "Walton Bookstore" problem in *Business Analytics: Data Analysis and Decision Making* (Albright and Winston) in the chapter on Monte-Carlo simulation. Here's the basic problem (with a few modifications):

    * we have to place an order for a perishable product (e.g. a calendar),
    * there's a known unit cost for each one ordered,
    * we have a known selling price,
    * demand is uncertain but we can model it with some simple probability distribution,
    * for each unsold item, we can get a partial refund of our unit cost,
    * we need to select the order quantity for our one order for the year; orders can only be in multiples of 25.

    Attributes
    ----------
    unit_cost : float or array-like of float, optional
        Cost for each item ordered (default 7.50)
    selling_price : float or array-like of float, optional
        Selling price for each item (default 10.00)
    unit_refund : float or array-like of float, optional
        For each unsold item we receive a refund in this amount (default 2.50)
    order_quantity : float or array-like of float, optional
        Number of items ordered in the one time we get to order (default 200)
    demand : float or array-like of float, optional
        Number of items demanded by customers (default 193)
    """
    def __init__(self, unit_cost=7.50, selling_price=10.00, unit_refund=2.50,
                 order_quantity=200, demand=193):
        self.unit_cost = unit_cost
        self.selling_price = selling_price
        self.unit_refund = unit_refund
        self.order_quantity = order_quantity
        self.demand = demand

    def order_cost(self):
        """Compute total order cost"""
        return self.unit_cost * self.order_quantity

    def num_sold(self):
        """Compute number of items sold

        Assumes demand in excess of order quantity is lost.
        """
        return np.minimum(self.order_quantity, self.demand)

    def sales_revenue(self):
        """Compute total sales revenue based on number sold and selling price"""
        return self.num_sold() * self.selling_price

    def num_unsold(self):
        """Compute number of items ordered but not sold

        Nonzero only when demand is less than order quantity.
        """
        return np.maximum(0, self.order_quantity - self.demand)

    def refund_revenue(self):
        """Compute total refund revenue based on number unsold and unit refund"""
        return self.num_unsold() * self.unit_refund

    def total_revenue(self):
        """Compute total revenue from sales and refunds"""
        return self.sales_revenue() + self.refund_revenue()

    def profit(self):
        """Compute profit based on revenue and cost"""
        profit = self.sales_revenue() + self.refund_revenue() - self.order_cost()
        return profit
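To make the method docstrings concrete, here is a quick sanity check of the model's arithmetic using the default values listed in the class docstring. This is a standalone sketch in plain Python (built-in `min` and `max` in place of the NumPy calls, scalar inputs only) so it runs without the whatif package installed.

```python
# Default input values taken from the BookstoreModel docstring
unit_cost = 7.50
selling_price = 10.00
unit_refund = 2.50
order_quantity = 200
demand = 193

# Demand in excess of the order quantity is lost
num_sold = min(order_quantity, demand)

# Items ordered but not sold
num_unsold = max(0, order_quantity - demand)

sales_revenue = num_sold * selling_price
refund_revenue = num_unsold * unit_refund
order_cost = unit_cost * order_quantity

profit = sales_revenue + refund_revenue - order_cost
print(profit)  # 447.5
```

With an order of 200 and demand of 193, 193 items are sold and 7 are refunded, so profit is 1930 + 17.50 - 1500 = 447.50.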

The readme file

Every project should have a readme file at the very least. Usually it will contain a high level description of the project and instructions for obtaining the source code and installing it. It may also contain contact info, instructions on how to contribute, and licensing info, among other things. Write your readme file using markdown, as it will then be rendered automatically as html in your GitHub repo and serve as a type of "home page" for your repo. Here's a sample readme file from my whatif project:

whatif - Do Excel style what if? analysis in Python

The whatif package helps you build business analysis oriented models in Python that you might normally build in Excel. Specifically, whatif includes functions that are similar to Excel's Data Tables and Goal Seek for doing sensitivity analysis and "backsolving" (e.g. finding a breakeven point). It also includes functions for facilitating Monte-Carlo simulation using these models.

Related blog posts

  • Part 1: Models and Data Tables
  • Part 2: Goal Seek
  • Part 3: Monte-Carlo simulation

Features

The whatif package is new and quite small. It contains:

  • a base Model class that can be subclassed to create new models
  • functions for doing data tables (data_table) and goal seek (goal_seek) on a model
  • functions for doing Monte-Carlo simulation with a model (simulate)
  • some Jupyter notebook based example models

Installation

Clone the whatif project from GitHub:

git clone https://github.com/misken/whatif.git

and then you can install it locally by running the following from the project directory.

cd whatif
pip install .

Getting started

See the Getting started with whatif page in the docs.

License

The project is licensed under the MIT license.

Even if you write no other documentation, your project should have a readme file, be well commented and include docstrings.

Creating documentation with Sphinx and reStructuredText

Sphinx is a widely used tool for creating Python documentation (and other things) from plain text files written in something known as reStructuredText, or reST for short. You'll see that reST is similar to markdown but way more powerful.

It's easy to create a new documentation project using the sphinx-quickstart script described on the Getting Started page. For our cookiecutter-datascience-aap template, I've already run the quickstart script and the docs folder contains the base files for the documentation, the most important being conf.py, index.rst, and getting_started.rst. We'll discuss these shortly, but let's start by exploring a finished reST based site - one of my coursewebs.

Yes, all of my public coursewebs are written in reST and the html is generated from it by Sphinx. I've included my MIS 4460/5460 course website in the mis5460_w21 folder within the downloads folder. Let's go take a look.

Exploring a reST based site

A few things to note:

  • all of the pages have a .rst extension, indicating that they are written in reST. They are JUST PLAIN TEXT files.
  • Look at index.rst to see a table of contents directive.
  • There is some flexibility in how sectioning is done. See this section of the reST primer.
  • bold and italics are the same as they are in markdown.
  • hyperlinks are different than in markdown.
  • Sphinx uses the toctree along with section headings to automatically generate a table of contents and navigation. Very convenient.
  • Sphinx gets much of its power from something known as directives. Yes, it's easy to make mistakes related to spacing or blank lines or missing colons when using directives and I'm frequently referring to the reST documentation when things aren't working quite right.
    • a good example of the power of directives is the yellow warning block above in this document. If you double click it to get into edit mode, you'll see that it's just raw html. Markdown doesn't have an easy way to custom style something like this just by indicating that it's a note or a warning. In reST, we just do:
.. note:: This is a note admonition.
   This is the second line of the first paragraph.

   - The note contains all indented body elements
     following.
   - It includes this bullet list.

Generating documentation for whatif

Now let's go look at the docs folder for the whatif project that I included in the downloads folder. Actually, even though I've included my whatif folder, I'm going to clone it from GitHub to show how easy it is to clone a repo.

The URL is https://github.com/misken/whatif.git. Open a git bash shell and navigate to some folder into which you'll clone my whatif repo. Make sure you don't already have your whatif repo in the same folder.

git clone https://github.com/misken/whatif.git

We'll explore the various files and show how to generate html based documentation. A couple of important things to be alert to:

  • In order to be able to automatically generate documentation from our code's docstrings, we need to tell Sphinx to enable a few key extensions - sphinx.ext.napoleon and sphinx.ext.autodoc. We will do this in the conf.py file.
  • It still feels a little magical when we can turn text files and code into documentation by typing make html.
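As a rough sketch of what enabling those extensions looks like, a conf.py fragment might resemble the following. This is not the actual whatif configuration: the `sys.path` line assumes the package source sits one level above the docs folder, and the two `napoleon_*` settings are optional choices shown only because the docstrings in this notebook are in numpydoc format.

```python
# docs/conf.py (fragment) - an illustrative sketch, not the real whatif config
import os
import sys

# Assumption: the importable package lives one level above the docs folder,
# so autodoc can import the modules whose docstrings it reads
sys.path.insert(0, os.path.abspath(".."))

project = "whatif"

extensions = [
    "sphinx.ext.autodoc",   # pull documentation out of docstrings
    "sphinx.ext.napoleon",  # parse numpydoc (and Google) style docstrings
]

# Optional: only parse NumPy style docstrings
napoleon_numpy_docstring = True
napoleon_google_docstring = False
```

With settings along these lines, an autodoc directive such as `.. automodule::` with the `:members:` option in one of the .rst files asks Sphinx to include the docstrings of that module's functions and classes when `make html` is run.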

Some resources:

  • Documenting Python Code: A Complete Guide
  • The Hitchhiker's Guide to Python - Documentation
  • PEP 8 - Style Guide for Python Code
  • PEP 257 - Docstring Conventions
  • Sphinx
  • reStructuredText primer
  • autodoc extension - generate docs from docstrings in code
  • napoleon extension - converts numpydoc to reST
  • Read the Docs
  • More doc tips