Feature Release Planning

6 min readApr 23, 2021

Scheduling with Genetic Algorithms

The Problem

Feature release planning is a key activity for any type of enterprise that sells modular products or services, software products being one example. In order for a product to be viable in an existing category, it has to meet the minimum viable product (MVP) features clients in a targeted market segment expect. MVP features are also called points of parity (PoP). Companies can pursue a strategy to compete on price with basic PoP products in mature markets, or on features an quality in growth markets. In order to win market share, a product has to be distinct in its characteristics. Customers are enticed by unique features, better quality, better support services. These are also known as points of differentiation (PoD) — they make products unique and increase their market potential. A more in-depth discussion can be found on the BrandUNIQ blog.

When companies plan the development and release of a product, they have to identify the release sequence that is most beneficial to the company. Several methods are used to evaluate the benefits of bringing a product / product feature to the market, e.g. the return on investment, the break even duration, or the net present value. The net present value uses a set of cash flows (expenses are negative, revenue is positive) and the discount rate, which is the cost of borrowing money (e.g. interest rate).

Organizational Constraints

Product development initiatives need to be planned accounting for limited resources, per-determined budgets, and the need to mitigate risks. I found a good analysis of project constraints done by Greer.

At a team scale, the effort constraint is the metric to be used as input to the genetic algorithm. A generic daily average number is good enough for the purpose of this article; a daily effort bandwidth which takes into account vacations and holidays can be used to increase accuracy of the planning process.

In addition to effort constraints, dependency constraints between features are key to determining the optimum release plan. Both types of constraints are taken into account when scoring the fitness of the individuals in the population.

The Solution

The intention is to use a Genetic Algorithm to solve the scheduling problem. There are several articles on Medium that introduce Genetic Algorithms, such as “A gentle introduction to Genetic Algorithms”

For the implementation, I’ll use Python, which is the go-to language for data science and machine learning projects. In the future, I’ll discuss Evolutionary Algorithms frameworks, e.g. DEAP. For now, this is a DIY exercise. The libraries to familiarize with: numpy, pandas, matplotlib.

The code is available in GitHub.

Design

A genetic algorithm operates on populations, which are set of distinct individuals. Individuals are characterized by their chromosomes. Individuals can mutate or can have children through chromosome cross-over operations. While the release plan is the genetic algorithm ‘individual’, the features are its ‘chromosomes’.

Features

All plans have all the features under consideration. When created, the features of a plan are characterized by effort, cost, and revenue. The effort is measured in person days, while the cost and revenue are monetary values.

The variables of a feature set that the genetic algorithm modifies during the run are feature start dates and daily effort the team assigns to the feature. Derived values are the duration and the end date of the feature

Features are implemented as a class. The start and end attributes are integers, corresponding to offsets from the day the schedule starts.

The Release Plan

The individuality of the release plan is determined by the scheduled set of features, when each feature is assigned a start date and the amount of resources that will work on it. As cognitive overload increases when a resource is assigned to work on multiple tasks in parallel, I prefer to use an algorithm that assigns resources 100% to a task. Thus, when the algorithm assigns the daily effort to a feature, its values are integers in the range of [1, team size].

Once the features have been assigned a start date and the daily effort, derived metrics for the release plan can be computed, such as the compound daily effort, the compound daily costs and the net present value of the overall plan.

In code, the release plan has a dictionary of features, and plan-level attributes. When the first generation of plans is created, for each plan, random start and duration values are set. When performing this activity, there are two options:

Be a firm believer in the natural selection process and assign {start, duration} values independently for each feature and not take into account its dependencies on other features. OR
Nudge the algorithm a bit, and schedule features with no parents first, then follow through with their children, and make sure the dependencies are taken into account

If dependencies are taken into account when the first generation is created, the genetic algorithm converges faster. In code, both approaches are available, and an individual is created using one or the other method, with an equal probability of 50%.

Another key method of the Plan is the score method, which is used to evaluate the fitness of the individual. The method scores individuals according to their Net Present Value (using the numpy-financial library) when all the scheduling constraints are met. The score method sets the net present value to zero, if the plan does not meet the inter-dependencies between features, and lowers the net present value by 10% if the daily effort bandwidth is exceeded. This is a large granularity approach and its weight in the scoring process can be further refined.

The Genetic Algorithm

The last part of the puzzle is the implementation of the generic algorithm. The algorithm has methods to create the initial population, to rank the individuals of a given generation, to mutate individuals and to cross-over the DNA of two individuals. The algorithm uses a simple ranking process to determine which individuals are promoted to the next generation, which ones are mutated and promoted, and which ones create new off-springs.

Running the Algorithm

The quality of the results depend on how well the mutations and cross-over methods have been designed. If the best NPV is the one from the first generation and does not improve at all during the run, this means the mutations and cross-overs create individuals that are less fit then the randomly generated ones.

In order to improve the quality of the results, one has to ‘nudge’ the evolutionary methods to yield better fit individuals. For example, the initial mutation method was randomly assigning new start dates to features, and keeping the duration constant. The algorithm was not producing better fit individuals with newer generations. Changing the method to increase the daily effort by one while keeping the start date constant, and removing the idle time between features produces evolving individuals. Running the algorithm, one will notice that improvements may happen every 8 generations or so, this is why it is important to run the algorithm for a reasonable large number of generations

One has to run the algorithm several times, to make sure that the solution is not a local optimum. As an eager schedule is penalized, I don’t expect to have the daily effort exceed the team capacity.

Results

Input parameters for the algorithm are: the population size (500), the number of generations (500), the minimum number of generations before convergence is tested (25) and the team size (5). Internally the NPV is computed using 0.05 as the discount rate.

The scheduled features of the best fit individual and the team daily effort load are plotted using matplot lib.

The dependencies between features are respected, and there is no idle time in the schedules. The second run yields a better Net Present Value. In a more complex scenario, the implementation may have to be iteratively refined, but for the purpose of this post, the implementation has proven the potential of Genetic Algorithms to be used to solve the feature release planning problem.