Colored Network Diagrams (Part 1)

Colored Network Diagrams (Part 1)

Executive Summary

This blog post offers a method to create colored network graphs (PERT charts) from project plans that are created with Microsoft® Project. Color schemes similar to the ones used in physical maps visualize those tasks which:

  • … have long durations
  • … have a high amount work
  • … have little slack
  • … have a high criticality

The blog post shall also encourage readers to think outside of the possibilities that established project management software offers and to define for yourself which tasks are critical or relevant for you and how this information shall be determined.

Preconditions

In order to use the approach described here, you should:

  • … have access to a Linux machine or account
  • … have a MySQL or MariaDB database server set up, running, and have access to it
  • … have the package graphviz installed
  • … have some basic knowledge of how to operate in a Linux environment and some basic understanding of shell scripts

Description and Usage

The base for the colored network graphs is a plan in Microsoft® Project which needs to be stored as XML file. The XML file is readable with a text editor (e.g., UltraEdit) and follows the syntax in [1]. The XML file is then parsed, the relevant information is captured and is stored in a database (Step 1). Then, selected information is retrieved from the database and written to a script in dot language which subsequently is transformed into a graph using the package graphviz (Step 2). Storing the data in a database rather than in variables allows us to use all the computational methods which the database server offers in order to select, aggregate, and combine data from the project plan. It also allows us to implement Step 2 in PHP so that graphs can be generated and displayed within a web site; this could be beneficial for a web-based project information system.

Step 0: Setting up the Database

At the beginning, a database needs to be set up in the database server. For this, you will probably need administrative access to the database server. The following SQL script sets up the database and grants access rights to the user gabriel. Of course, you need to adapt this to your needs on the system you work on.

# Delete existing databases
REVOKE ALL ON projects.* FROM 'gabriel';
DROP DATABASE projects;

# Create a new database
CREATE DATABASE projects;
GRANT ALL ON projects.* TO 'gabriel';
USE projects;

CREATE TABLE projects         (proj_uid               SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,\
                               proj_name              VARCHAR(80)) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE tasks            (proj_uid               SMALLINT UNSIGNED NOT NULL,\
                               task_uid               SMALLINT UNSIGNED NOT NULL,\
                               task_name              VARCHAR(150),\
                               is_summary_task        BOOLEAN           DEFAULT FALSE NOT NULL,\
                               is_milestone           BOOLEAN           DEFAULT FALSE NOT NULL,\
                               is_critical            BOOLEAN           DEFAULT FALSE NOT NULL,\
                               task_start             DATETIME          NOT NULL,\
                               task_finish            DATETIME          NOT NULL,\
                               task_duration          INT UNSIGNED      DEFAULT 0,\
                               task_work              INT UNSIGNED      DEFAULT 0,\
                               task_slack             INT               DEFAULT 0,\
                               rem_duration           INT UNSIGNED      DEFAULT 0,\
                               rem_work               INT UNSIGNED      DEFAULT 0,\
                               `%_complete`           SMALLINT UNSIGNED DEFAULT 0,\
                                                      PRIMARY KEY (proj_uid, task_uid)) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE task_links       (proj_uid               SMALLINT UNSIGNED NOT NULL,\
                               predecessor            SMALLINT UNSIGNED NOT NULL,\
                               successor              SMALLINT UNSIGNED NOT NULL) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE resources        (proj_uid               SMALLINT UNSIGNED NOT NULL,\
                               resource_uid           SMALLINT UNSIGNED NOT NULL,\
                               calendar_uid           SMALLINT UNSIGNED NOT NULL,\
                               resource_units         DECIMAL(3,2)      DEFAULT 1.00,\
                               resource_name          VARCHAR(100),\
                                                      PRIMARY KEY (proj_uid, resource_uid)) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE assignments      (proj_uid               SMALLINT UNSIGNED NOT NULL,\
                               task_uid               SMALLINT UNSIGNED NOT NULL,\
                               resource_uid           SMALLINT UNSIGNED NOT NULL,\
                               assignment_uid         SMALLINT UNSIGNED NOT NULL,\
                               assignment_start       DATETIME          NOT NULL,\
                               assignment_finish      DATETIME          NOT NULL,\
                               assignment_work        INT UNSIGNED      DEFAULT 0,\
                                                      PRIMARY KEY (proj_uid, task_uid, resource_uid)) ENGINE=MyISAM DEFAULT CHARSET=utf8;
CREATE TABLE assignment_data  (proj_uid               SMALLINT UNSIGNED NOT NULL,\
                               assignment_uid         SMALLINT UNSIGNED NOT NULL,\
                               assignment_start       DATETIME          NOT NULL,\
                               assignment_finish      DATETIME          NOT NULL,\
                               assignment_work        INT UNSIGNED      DEFAULT 0) ENGINE=MyISAM DEFAULT CHARSET=utf8;

This SQL script sets up a database with several tables. The database can hold several project plans, but they all must have distinctive names. This SQL script only catches a subset of all information which is part of a Microsoft® Project plan in XML format; it does not capture calendar data, for example, nor do the data types in the SQL script necessarily match the ones from the specification in [1]. But it is good enough to start with colored network graphs.

This only needs to be executed one time. Once the database has been set up properly, we only have to execute Step 1 and Step 2.

Step 1: Parsing the Project Plan

The first real step is to parse the Microsoft® Project plan in XML format; this is done with the bash script msp2mysql.sh. This program contains a finite-state machine (see graph below), and the transitions between the states are XML tags (opening ones and closing ones). This approach helps to keep the code base small and easily maintainable. Furthermore, the code base can be enlarged if more XML tags or additional details of the project plan shall be considered.

Finite-State Machine for parsing the Project Plan

I concede there are more elegant and faster ways to parse XML data, but I believe that msp2mysql.sh can be understood well and helps to foster an understanding of what needs to be done in Step 1.

There is one important limitation in the program that needs to be mentioned: The function iso2msp () transforms a duration in ISO 8601 format to a duration described by multiples of 6 s (6 s are an old Microsoft® Project “unit” of a time duration, Microsoft® Project would traditionally only calculate with 0.1 min of resolution). However, as you can see in [2], durations cannot only be measured in seconds, minutes, and hours, but also in days, months, etc. A duration of 1 month is a different amount of days, depending on the month we are looking at. This is something that has not been considered in this program; in fact, it can only transform durations given in hours, minutes and seconds to multiples of 6 s. Fortunately, this captures most durations used by Microsoft® Project, even durations that span over multiple days (which are mostly still given in hours in the Microsoft® Project XML file).

While msp2mysql.sh parses the Microsoft® Project plan in XML format, it typically runs through these steps:

  • It extracts the name of the project plan and checks if there is already a project plan with this name in the database. If that is the case, the program terminates. If that is not the case, the project name is stored in the table projects, and a new proj_uid is allocated.
  • It extracts details of the tasks and stored them in the tables tasks and task_links.
  • It extracts the resource names and stores their details in the table resources.
  • It extracts assignments (of resources to tasks) and stores them in the tables assignments and assignment_data.

msp2mysql.sh is called with the Microsoft® Project plan in XML format as argument, like this:

./msp2mysql.sh /home/tmp/Example_1.xml

Step 2: Generating a Graph

The second step is to extract meaningful data from the database and transform this into a colored network graph; this is done with the bash script mysql2dot.sh. mysql2dot.sh can create colored network graphs according to:

  • … the task duration
  • … the work allocated to a task
  • … the task slack
  • … the criticality of a task

mysql2dot.sh relies on a working installation of the graphviz package, and in fact, it is graphviz and its collection of tools that create the graph, while mysql2dot.sh creates a script for graphviz in the dot language. Let us look close to some aspects:

mysql2dot.sh uses an uneven scale of colors, and the smallest scale should be three or more colors. The constant NUM_COLSTEPS sets the number of colors, in our case 7 colors. The constant COLSCHEME defines the graphviz color scheme that shall be used, in our case this is “ylorrd7”. [3] shows some really beautiful color schemes. The ColorBrewer schemes are well suited for our kind of purpose (or for physical maps); however, the license conditions ([4]) must be observed.

mysql2dot.sh is invoked with the name of a directory as argument. The generated script(s) in dot language as well as the generated graph(s) will be stored in that directory. A typical call looks like this:

./mysql2dot.sh /home/tmp/

When started, mysql2dot.sh retrieves a list of all projects from the database and displays them to the user who then has to select a project or choose the option Exit. When a project plan has been selected, the user has to choose the graph that shall be generated. Subsequently, the program starts to generate a script in dot language and invokes the program dot (a tool of the graphviz suite) in order to compute the network graph.

Different project plans typically have different ranges of durations, work or slack for all of their tasks. mysql2dot.sh therefore takes into account the maximum duration, the maximum work, or the maximum slack it can find in the whole project. Initially, I used a linear scale to distribute the colors over the whole range from [0; max_value], but after looking at some real project plans, I personally found that too many tasks would then be colored in darker colors and I thought that this would rather distract a project manager from focusing on the few tasks that really need to be monitored closely. Trying out linear, square and square root approaches, I finally decided for the square root approach which uses the darker colors only when the slack is very small, the workload is very high, or the duration is very long. Consequently, the focus of the project manager is guided to the few tasks that really deserve special attention. However, this does not mean that this approach is the only and correct way to do it. Feel free to experience with different scaling methods to tune the outcome to your preferences.

Different Algorithms and their Effect on Color Scales

The network graph colored according to task criticality follows a different approach. Task criticality is derived from the ration of task slack and task duration. Tasks which have a low ratio of task slack / task duration are defined as more critical than tasks where this ratio is higher. I also experienced with the square root approach mentioned above, but that seemed to result in different criticalities depending on the maximum ratio of task slack / task duration which seemed unsatisfactory to me. Finally, I opted for a logarithmic approach which puts the ration task slack / task duration = 1,0 to the middle value of NUM_COLSTEPS. So in the current case, a ratio of task slack / task duration = 1,0 (means: task slack = task_duration) would result in a value of 4.0. If NUM_COLSTEPS was 9 rather than 7, the result would be 5.0. This can be reached by the formula:

where:

  • n is the number of tasks that are neither a milestone nor a summary task and that have a non-zero duration
  • j is the task for which we determine the level of shade
  • number_of_shades is the number of different shades you want to have in the graph; this should be an impair number and it should be 3 at least
  • shadej is the numerical value of the shade for task j
  • task_slackj is the task slack of task j
  • task_durationj is the duration of task j

Examples

Example 1: Sequential Task Chain

This example is a sequential task chain with 20 tasks; each task is +1d longer than the previous task.

Example 1: Sequential Task Chain

In this example, no resources have been assigned yet. As all tasks are in sequence, there is obviously no slack in any task, and all tasks are in the critical path. Parsing this project plan with msp2mysql.sh leads to:

Parsing the project plan from Example 1

We can already see that the shall script takes a long time for this easy project plan. For a productive environment, one would have to use C code rather than a shell script, but for educational purposes, the shell script is easier to change, understand and also to debug. Subsequently, we start creating the graphs with mysql2dot.sh, here is an example on how this looks like:

Creation of a graph with mysql2dot.sh

Let us look at the 4 graphs which can be generated:

Network Graph colored according to task duration (Example 1)

The result is what could be expected. Tasks with a shorter duration are colored in a lighter color, tasks with a longer duration consequently in a darker color. While the task duration increases in a linear fashion from left to right, the graph colors tasks with a longer duration more in the “dangerous” area, that is, in a darker color.

Network Graph colored according to task work (Example 1)

As no resource has been attributed, no work has been registered for any of the tasks. Therefore, all tasks are equally shown in a light color.

Network Graph colored according to task slack (Example 1)

As none of the tasks has any slack, obviously all tasks are in the critical path. Hence all of them are colored in dark color.

Network Graph colored according to task criticality (Example 1)

Similar to the graph above, all tasks are critical, and progress must be carefully monitored; hence, all tasks are colored in dark color.

Example 2: Parallel Task Chain

This example has the same tasks as Example 1, but all of them in parallel. The longest task therefore determines the overall project runtime.

Example 2: Parallel Task Chain

In this example, too, no resources have been assigned yet. Parsing this project plan with msp2mysql.sh leads to:

Parsing the project plan from Example 2

The 4 graphs from this example look very different from Example 1:

Network Graph colored according to task duration (Example 2)

The color of the tasks is the same as in Example 1; however, all tasks are in parallel.

Network Graph colored according to task work (Example 2)

The color of the tasks is the same as in Example 1 as we still have no work assigned to any of the tasks; however, all tasks are in parallel.

Network Graph colored according to task slack (Example 2)

As to slack, the picture is different. Only one task (task #20) does not have any slack, all other tasks have more or less slack, and task #01 has the most slack.

Network Graph colored according to task criticality (Example 2)

Here, we can see that both tasks #19 and tasks #20 are considered to be critical. More tasks receive a darker color as compared to the graph considering slack alone.

Example 3: Sequential Task Chain with Resource

This example has the same tasks as Example 1, but we assign a resource to each task (the same in this example, but this OK as all tasks are sequential, hence no overload of the resource). Consequently, work is assigned to each task, and we can expect changes in the network graph referring to task work.

Example 3: Sequential Task Chain with Resource

Parsing this project plan with msp2mysql.sh leads to:

Parsing the project plan from Example 3

We can see that now (different from Example 1), resources and assignments are captured, too. Let us look at the 4 graphs from this example:

Network Graph colored according to task duration (Example 3)

Nothing has changed compared to Example 1 as the durations are the same.

Network Graph colored according to task work (Example 3)

This graph now experiences the same coloring as the network graph according to task duration. That is understandable given the fact that we have a full-time resource working 100% of the time for the full task duration, hence work and duration have the same numeric values for each respective task.

Network Graph colored according to task slack (Example 3)

Nothing has changed compared to Example 1 as still none of the tasks has any slack.

Network Graph colored according to task criticality (Example 3)

Nothing has changed compared to Example 1 as still none of the tasks has any slack. Consequently, all of them are critical.

Example 4: Parallel Task Chain with Resources

This example has the same tasks as Example 2, but we assign a resource to each task. As all tasks are parallel, we assign a new resource to each task so that there is no resource overload.

Example 4: Parallel Task Chain with Resources

Parsing this project plan with msp2mysql.sh leads to:

Parsing the project plan from Example 4

We can see that now (different from Examples 2), resources and assignments are captured, too. Let us look at the 4 graphs from this example:

Network Graph colored according to task duration (Example 4)

Nothing has changed compared to Example 2 as the durations are the same.

Network Graph colored according to task work (Example 4)

This graph now experiences the same coloring as the network graph according to task duration. That is understandable given the fact that we have a full-time resource working 100% of the time for the full task duration, hence work and duration have the same numeric values for each respective task.

Network Graph colored according to task slack (Example 4)

Nothing has changed compared to Example 3; the slack of each task has remained the same.

Network Graph colored according to task criticality (Example 4)

Nothing has changed compared to Example 3; the criticality of the tasks remains the same.

Example 5: Sequential Blocks with Parallel Tasks and with Resources

This example is the most interesting one as it combines different aspects within one project plan. We have 5 milestones, and 4 parallel tasks before each milestone. All tasks have resources assigned.
Task Block 1 contains 4 tasks of different durations, and as each task has a 100% resource allocated, also of different work. Due to the different durations, the 4 tasks also have different slack, with the longest task having zero slack.
Task Block 2 is like Task Block 1, but the work is the same for each task; this has been achieved by an over-allocation of the respective resources for 3 out of the 4 tasks.
Task Block 3 consists of 4 parallel tasks with the same duration, however with a work allocation similar to Task Block 1. None of the tasks therefore has slack, but the work for each task is different.
Task Block 4 is like Task Block 1 in terms of task duration and task work. However, the tasks have been arranged in a way so that none of the tasks has slack.
Task Block 5 is like Task Block 4, but each task has (the same amount of) lag time between the task end and the subsequent milestone. None of the tasks has any slack.

Example 5: Sequential Blocks with Parallel Tasks and with Resources

Parsing this project plan with msp2mysql.sh leads to:

Parsing the project plan from Example 5

Let us look at the 4 graphs from this example:

Network Graph colored according to task duration (Example 5)

The colors in Task Block 1, 2, 4 and 5 are the same. Only in Task Block 3 where all tasks have the same (long) duration, all tasks have a dark color.

Network Graph colored according to task work (Example 5)

The colors in Task Block 1, 3, 4, and 5 are the same. Only in Task Block 2 where all tasks have the same amount of work (which is also the highest work a task has in this project plan), all tasks have a dark color.

Network Graph colored according to task slack (Example 5)

The colors in Task Block 1 and 2 are the same. The tasks in Task Blocks 3, 4, and 5 are all dark as none of these tasks has any slack.

Network Graph colored according to task criticality (Example 5)

The network graph according to task criticality is similar to the one according to slack, but we can see that in Task Blocks 1 and 2 tasks get a darker color “earlier” as in the network graph according to criticality.

Example 6: Distinguishing Slack and Criticality

So far, we have not seen much difference between the network graphs according to slack and the one according to criticality. In this example, we have two task blocks with 4 tasks each where the slack is the same in the corresponding tasks in each task block. However, the duration is different between both task blocks. Work per task is almost the same, I tried to match the work despite the longer duration of the tasks in Task Block 2 as good as possible.

Example 6: Distinguishing Slack and Criticality

Parsing this project plan with msp2mysql.sh leads to:

Parsing the project plan from Example 6

Let us look at the 4 graphs from this example:

Network Graph colored according to task duration (Example 6)

The duration of the tasks in Task Block 2 is +5d higher as in Task Block 1. Therefore, it is not surprising that the tasks in Task Block 2 are colored in darker colors.

Network Graph colored according to task work (Example 6)

As mentioned above, I tried to keep the amount of the work the same for corresponding tasks in Task Block 1 and 2 (which can also be seen in the numeric values for the work), and consequently, the colors of corresponding tasks match.

Network Graph colored according to task slack (Example 6)

The slack of corresponding tasks in Task Block 1 and 2 is the same as the numeric values how. And so are the colors.

Network Graph colored according to task criticality (Example 6)

Despite the same slack of corresponding tasks in Task Block 1 and 2, the algorithm rates the criticality of the tasks in Task Block 2 higher as in Task Block 1. This is because the computation of “criticality” is based on the ratio of slack and duration. Out of two tasks with the same slack, the one with a longer duration is rated more critical as relative deviations in the duration have a higher probability to consume the free slack.

Example 7: A real-world example

Let us now look at a real-world example, a simplified project plan of a development project consisting of mechanical parts, PCBAs and software, including some intermediate milestones. As it is the case in reality, some tasks have only a partial allocation of resources, that means, that the numerical values for the allocated work are less than for the allocated duration. This will be reflected in different shadings of colors in the network graphs task duration and task duration.

Example of a Development Project (without completed tasks)

The resulting graphs are:

Network Graph colored according to task duration (Example 7)
Network Graph colored according to task work (Example 7)
Network Graph colored according to task slack (Example 7)
Network Graph colored according to task criticality (Example 7)

As mentioned above already, we can see the differences in the shading of the graphs according to task duration and task work, the reason being an allocation of resources different to 100%. However, in contrast to an initial expectation, the different shadings occur in large parts of the graph rather than only on a few tasks. This is because the shadings depend on the maximum and the minimum values of task duration or task work, and these values are unique to each type of graph. Well visible is also the critical path in dark red color in the network graph colored according to task criticality.

Example 8: A real-world example with complete tasks

Once a project has been started, sooner or later, the first tasks will be completed, and in this example, we assume that we have run three weeks into the project (green vertical line), and all tasks have been executed according to schedule (this is an example from an ideal world 😀). Consequently, our plan looks like this:

Example of a Development Project (with completed tasks)

Tasks that have been completed 100% are colored in green color across all network graphs indicating that these tasks do not require any more attention in the remaining project:

Network Graph colored according to task duration (Example 8)
Network Graph colored according to task work (Example 8)
Network Graph colored according to task slack (Example 8)
Network Graph colored according to task criticality (Example 8)

The color shadings of the incomplete tasks remain as they were in Example 7. This is because the script mysql2dot.sh determines the step size of the shadings (and thus the resulting shadings) of the individual tasks based on the maximum and minimum value of task duration, task work, task slack, or task criticality for the whole project, independent whether some tasks have been completed or not.

Conclusion

The scripts create_project_db.sql, msp2mysql.sh and mysql2dot.sh and the examples provided above show how, using the powerful graphviz package, traditional project plans created with Microsoft® Project can be visualized in a set pf graphs that can help a project manager to focus on the right tasks. For the ease of understanding, the scripts run in bash so that everyone can easily modify, enlarge, or change the scripts according to the own demands. Users who want to deploy the scripts on productive systems should take speed of execution and cybersecurity (or the lack of both in the provided example scripts) into account.

Outlook

The provided scripts can be enlarged in their scope or improved in numerous ways, and I encourage everyone to tailor them to your own needs. Some examples for improvement are:

  • Improve the parsing of Microsoft® Project XML files by:
    • …using a real XML parser rather than bash commands
    • …enlarging the scope of the parser
  • Create additional graphs.
  • Consider tasks that have been completed partially.
  • Experiment with different definitions of criticality.
  • Explore resource usage and how to display it graphically (load, usage, key resource, etc.).
  • Improve the function iso2msp () which so far is only a very simplistic implementation and is not yet able to process values like “days”, “weeks”, “months”.

Sources

Files

  • create_project_db.sql sets up a database in MySQL or MariaDB. The script works with the user gabriel who assumes to have access to the database server without password; you might have to adapt the script to your environment and needs.
  • msp2mysql.sh reads a Microsoft® Project plan in XML, parses it and writes the data into the MariaDB database.
  • mysql2dot.sh reads from the MariaDB database and creates a script for dot, a tool of the graphviz suite.
  • PERT-Farbskalen.xlsx is a work sheet where different approaches for color distributions are tested and where you can play around to find the “right” algorithm for yourself.
  • Example_1.zip contains all files with respect to Example 1.
  • Example_2.zip contains all files with respect to Example 2.
  • Example_3.zip contains all files with respect to Example 3.
  • Example_4.zip contains all files with respect to Example 4.
  • Example_5.zip contains all files with respect to Example 5.
  • Example_6.zip contains all files with respect to Example 6.
  • Example_7.zip contains all files with respect to Example 7.
  • Example_8.zip contains all files with respect to Example 8.

Disclaimer

  • The program code and the examples are for demonstration purposes only.
  • The program shall not be used in production environments.
  • While the program code has been tested, it might still contain errors.
  • The program code has not been optimized for speed (It’s a bash script anyway, so do not expect miracles.).
  • The program code has not been written with cybersecurity aspects in mind.
  • Only a subset of all possibilities in [1] has been used, and the code does not claim to adhere completely to [1].

Posted on: 2021-12-29Gabriel Rüeck

One thought on “Colored Network Diagrams (Part 1)

Comments are closed.