Building Automated Charts in Your RunBooks: Visualize the TTR for Your Team
At unSkript, our goal is to build tooling that helps your CloudOps team reduce manual toil and makes it easier to automate the things that should be automated. In this post, we’ll address a common process that regularly sucks away time: reporting. We know we have to do it, but it can take a lot of time. Let’s automate as many of the reports as possible.
unSkript is built on top of Jupyter Notebooks. Using pandas dataframes and the Panel visualization library, we can easily manipulate our data, even in real time.
One common metric for SREs is the Time To Resolution (or Time To Recovery): TTR. Let’s figure out a way to build automated reports that help us track our TTR over time, and quickly build charts for our managers when they are requested.
To build our charts, we’ll be using issue data from Jira. If you don’t store your outage data in Jira, but use a database or other tool instead, you’ll have to switch the data collection Action from Jira to your tool of choice; the rest of the RunBook will continue to work. That’s the beauty of using Jupyter Notebooks: they are modular, and modifications are super easy to make.
Once we have the data from Jira, we’ll convert the data into a dataframe, and build an interactive chart using Panel. In this RunBook, we’ll build the same chart 2 different ways:
- Static Report: In our first example, we’ll import all of the issues for a large time period into the Notebook. This gives us huge flexibility in what we can report, but if there are a lot of issues, we may run into memory problems. If your data changes frequently, the “big table of data” can become stale and will need to be refreshed regularly.
- Dynamic Report: Every time we modify the graph, we pull a new dataset from Jira. This slows down graph creation, since the data is not already in memory, but it means that each data pull is much smaller and always up to date.
To estimate the TTR for an issue, we take the difference between issue creation time and the time the issue’s status was set as “Done.” The difference in these times can be used as a proxy for how long the issue was worked on – or the TTR.
Pulling the data
Our RunBook will have five parameters as input, and they will all be used to configure the Jira JQL (Jira Query Language) query:

- start_date: The start of the date range.
- end_date: The end of the date range. Issues are included if their status was set to ‘Done’ between these two dates.
- new_status: The status change that signifies completion: in our case, “Done.”
- issue_type: The type of issue to report on. We’ll use ‘Bug’ in this demo.
- jira_project: The Jira project we’ll pull the content from.
Let’s walk through the steps to generate a chart where we list every issue changed to done, graphed by the elapsed time.
Step 1: Create a JQL Query
Our first Action creates a function to generate the JQL code for the query:
def create_query(jira_project, issue_type, new_status, start, end):
    # Store the query in a global so later Actions in the RunBook can read it
    global jql_query
    jql_query = f"project = {jira_project} and issueType = {issue_type} and status changed to {new_status} during ('{start}','{end}')"

# start_date and end_date are the RunBook input parameters
create_query(jira_project, issue_type, new_status, start_date, end_date)
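To see what the generated query looks like, here is a minimal sketch of a variant that returns the string instead of setting a global, which can be easier to test in isolation. The name build_jql and the sample parameter values are assumptions for illustration, not part of the RunBook.

```python
def build_jql(jira_project, issue_type, new_status, start_date, end_date):
    """Return a JQL string for issues moved to new_status within the date range."""
    return (
        f"project = {jira_project} and issueType = {issue_type} "
        f"and status changed to {new_status} "
        f"during ('{start_date}','{end_date}')"
    )


# Hypothetical parameter values, for illustration only.
jql_query = build_jql("OPS", "Bug", "Done", "2022/10/01", "2022/10/31")
print(jql_query)
# project = OPS and issueType = Bug and status changed to Done during ('2022/10/01','2022/10/31')
```

Note that a status name containing spaces would need to be quoted inside the query.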
Step 2: Query Jira
In this step, we can use a pre-built unSkript Action, “Search Jira Issues with JQL.” Simply supply the variable jql_query as the required JQL search term. We also name the output issueList so that it can be referenced in later steps of the automation:
We will use this Action to pull the data from Jira. In our initial pull, we grab a huge timeframe: the “pull everything into memory” approach. But we can reuse this Action for smaller queries when we build the dynamic charts too!
When we run this action, issueList will hold all of the data for the bugs changed to Done in the chosen timeframe. Now, let’s take this big bucket of data, and turn it into a graph.
Step 3: Convert the output into a Dict
The data that comes back from Jira stores its values as strings, so the next step does our data manipulation: changing the timestamps from strings into datetimes, calculating the elapsed time from issue creation to completion, and then inserting the desired data into a dictionary. We use a function so we can reproduce this step later.
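A minimal sketch of such a conversion function is shown below. The layout of each issue (a key plus fields.created and fields.resolutiondate) and the timestamp format are assumptions modeled on Jira's REST issue JSON; the actual output of the unSkript Action may be shaped differently, and the function name issues_to_dict is hypothetical.

```python
from datetime import datetime

# Assumed Jira timestamp layout, e.g. "2022-10-04T08:15:00.000+0000".
JIRA_TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def issues_to_dict(issue_list):
    """Build {issue key: details} with parsed timestamps and the elapsed time."""
    issues = {}
    for issue in issue_list:
        fields = issue["fields"]
        create_time = datetime.strptime(fields["created"], JIRA_TIME_FORMAT)
        # Assumption: the resolution date marks the change to "Done".
        done_time = datetime.strptime(fields["resolutiondate"], JIRA_TIME_FORMAT)
        issues[issue["key"]] = {
            "create_time": create_time,
            "done_time": done_time,
            "elapsed_time": done_time - create_time,
        }
    return issues


# Hypothetical sample issue, for illustration only.
sample = [
    {
        "key": "OPS-1",
        "fields": {
            "created": "2022-10-04T08:15:00.000+0000",
            "resolutiondate": "2022-10-04T20:45:00.000+0000",
        },
    }
]
issue_dict = issues_to_dict(sample)
print(issue_dict["OPS-1"]["elapsed_time"])  # 12:30:00
```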
When we calculate the elapsed time, the result is a timedelta, which stores the difference as a count of days and a count of seconds. We keep that value, but also convert the elapsed time into hours by multiplying the day count by 24 and dividing the seconds count by 3600:
elapsed_time = done_time - create_time
elapsed_time_hours = elapsed_time.days * 24 + round(elapsed_time.seconds / 3600, 1)
The elapsed_time_hours seems to work better in our charts.
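As a worked example of the conversion, the sketch below uses two made-up timestamps (they are illustrative values, not data from the RunBook) and shows how the days and seconds parts combine into hours.

```python
from datetime import datetime

# Hypothetical creation and completion times for one issue.
create_time = datetime(2022, 10, 1, 9, 0)
done_time = datetime(2022, 10, 3, 15, 30)

elapsed_time = done_time - create_time
# A timedelta keeps whole days and the leftover seconds separately.
print(elapsed_time.days, elapsed_time.seconds)  # 2 23400

elapsed_time_hours = elapsed_time.days * 24 + round(elapsed_time.seconds / 3600, 1)
print(elapsed_time_hours)  # 54.5

# An alternative is total_seconds(), which folds days and seconds together.
print(round(elapsed_time.total_seconds() / 3600, 1))  # 54.5
```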
Step 4: Create the Chart
To create the chart, we’ll be using Panel. In this case, we’ll create two sliders to select the date range for closed tickets: the start date, and the length of time to observe. These values are used to create a second dataframe, from which we create a histogram of the issues:
To create this chart we have to do a number of things:
- Change the Dictionary of issues into a pandas dataframe.
- Create a couple of functions:
  - weekdf: This creates a 2nd dataframe, with new start and end times (configured by our chart), and places the elapsed_time_hours into buckets for a chart.
  - time_plot: Builds a new chart each time the parameters are changed in our chart.
- Generate the plot.
Here’s a video of the interactive plot in action:
Dynamic Data
Now that we have created the chart with a preloaded dataset, let’s recreate the chart, but with polling for fresh data on every re-render. This might be important to ensure that the data is fresh (when there is a lot of data coming in quickly), or to prevent overloading the RunBook with an initial query of 10,000 issues.
The code layout for the dynamic chart is similar to the previous one. Every time a slider is moved, the replot_the_data function is called. In the static chart, this called a function to sort the data into a new dataframe. This time, we’re going to pull the data, convert it, put it in a dataframe, and THEN sort the data for plotting.
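The control flow of the dynamic version might look like the sketch below. Only the name replot_the_data comes from the RunBook; fetch_issues is a hypothetical stand-in for the Jira search Action that returns canned data, the issue layout and project values are assumptions, and a plain sorted list stands in for the dataframe and chart to keep the example self-contained.

```python
from datetime import date, datetime, timedelta

# Assumed Jira timestamp layout.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def fetch_issues(jql_query):
    """Hypothetical stand-in for the Jira search Action; returns canned data."""
    return [
        {"key": "OPS-1", "fields": {"created": "2022-10-01T09:00:00.000+0000",
                                     "resolutiondate": "2022-10-03T15:30:00.000+0000"}},
        {"key": "OPS-2", "fields": {"created": "2022-10-05T06:00:00.000+0000",
                                     "resolutiondate": "2022-10-05T09:00:00.000+0000"}},
    ]


def elapsed_hours(fields):
    """Convert one issue's timestamps into elapsed hours."""
    created = datetime.strptime(fields["created"], TIME_FORMAT)
    done = datetime.strptime(fields["resolutiondate"], TIME_FORMAT)
    elapsed = done - created
    return elapsed.days * 24 + round(elapsed.seconds / 3600, 1)


def replot_the_data(start_date, days):
    """Pull, convert, and sort fresh data for one slider position."""
    end_date = start_date + timedelta(days=days)
    jql_query = (
        "project = OPS and issueType = Bug and status changed to Done "
        f"during ('{start_date:%Y/%m/%d}','{end_date:%Y/%m/%d}')"
    )
    issues = fetch_issues(jql_query)  # a fresh, narrow pull on every call
    return sorted(elapsed_hours(issue["fields"]) for issue in issues)


print(replot_the_data(date(2022, 10, 1), 7))  # [3.0, 54.5]
```

In the RunBook, a function like this would return a chart and be connected to the sliders, for example with Panel's pn.bind, so that every slider movement triggers a new query.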
It sounds like a lot, but the Jupyter Notebook takes care of this very quickly. Check out a demo of the chart in this video:
Conclusion
In this post, we have created interactive charts in our RunBook using the Panel module. This RunBook helps us to automate our reporting, allowing the team to quickly gather the time to resolution of all issues over a period of time. It also allows us to quickly compare different time periods to see if the time to resolution is improving for the team.
By automating tasks that occur regularly, we ensure that the results are always generated quickly, and that the results are created the same way – ensuring easy comparison across reports. By automating the data gathering and display, our sprint reports, quarterly updates and other general updates can be generated quickly and painlessly.
Interested in trying it out? Panel is available in our Open Source Docker Image! Install instructions can be found on our GitHub page (give us a star while you’re there!).