Measuring Software Development Productivity

Measuring software development productivity is challenging in that there are no useful objective ways to measure it. Traditional approaches that strive for objectivity like counting lines of code, story points, or function points fall short in one way or another. In this post, I’ll review traditional approaches to software development productivity and discuss their shortcomings. Alternatives to these traditional methods will be discussed in a future post.

Motivation

If you can’t measure it, you can’t improve it.

Peter Drucker

If we can measure software development productivity we will know if changes we make to the people (e.g., individuals, roles, responsibilities), processes (e.g., scrum, kanban), or technology (e.g., language, IDE) are improving productivity or not. It would also be helpful to be able to compare the productivity of different software development teams and objectively measure the benefit (if any) of outsourcing.

Software Development Productivity Definition

What is software development productivity? In the business world, productivity is defined as the amount of useful output of a production process per unit of input (e.g., cost). Thus, software development productivity could be defined as the amount of quality software created divided by the resources expended to produce it. The denominator is fairly straightforward and could be measured as the cost of the software development effort (e.g., wages, benefits, tools, office and equipment costs). The numerator is the challenge.

Software development productivity is the amount of quality software created divided by the resources expended to produce it.

Note that quality must be taken into consideration when measuring the amount of software produced or the productivity measure will not be useful.

Quick Note on The Observer Effect

The Observer Effect is the idea that merely observing a phenomenon can change that phenomenon. Because there is additional overhead required to train the organization on productivity metrics and to measure and report on them, establishing a software development productivity measurement process could actually lower productivity. Measuring the productivity impact of a productivity measurement process is an interesting topic. Intuitively, the benefits of measuring productivity should outweigh the cost of the measurement process, but I think it is a good idea to be aware of the added cost.

What We Want in a Productivity Measure

Ideally, our productivity metric would have the following properties:

  • Development team can’t game it to look more productive
  • Objective rather than subjective for consistency
  • Reasonably easy to measure (not a lot of overhead)
  • An absolute measure that can be used to compare productivity of different teams/organizations

Hours Worked

It may be hard to believe that hours worked would be a measure of software development productivity, but it is frequently used.

Hours Based Productivity = (total hours the team works) / (cost of the team)

If you compare the productivity of two software development teams using this measure, and they work the same number of hours, you will conclude that the less expensive team is more productive (i.e., that you will get more useful software produced per dollar of investment). This is often used as justification for moving software development offshore where labor rates are cheaper or the driver for a policy to hire the cheapest developers in a local market.

This is also used in some organizations as justification for encouraging software developers to work more hours per week. Managers who use this productivity metric are focused on increasing the numerator (hours worked) and decreasing the denominator (cost).

The problem with this metric is that it assumes that every software developer produces the same amount of quality code per hour. This is far from the truth. Studies have found that there can be an order of magnitude (10x) difference in productivity between programmers and between teams. Alan Eustace (Google SVP) argued that a top-notch developer is worth three hundred average ones. Bill Gates said that a great writer of code is worth 10,000 times the price of an average writer. Robert C. Martin said that ninety percent of code is written by ten percent of the programmers.

“A great lathe operator commands several times the wage of an average lathe operator, but a great writer of software code is worth 10,000 times the price of an average software writer.”

Bill Gates

“90% of the code is written by 10% of the programmers.”

Robert C. Martin

There is also a myth that the more time a developer spends in her seat, the more productive she will be (the greater the hours-worked numerator will be and the more quality code she will produce). As Tom DeMarco and Timothy Lister pointed out in their book Peopleware, overtime often leads employees to take compensatory undertime whenever possible, through suboptimal work, leaving early, getting sick, and so on. In essence, overtime is like sprinting: it’s great for the final stretch, but if you sprint for too long, you won’t be able to finish the race. It gives you the illusion of higher productivity.

Source Lines of Code (SLOC) Completed

Another measure of software development productivity that is used is counting the amount of code that has been written and dividing that by the cost to write it.

SLOC Productivity = (number of lines of code) / (cost to write the code)

There are different ways to count lines of code, and a variety of code counting tools are available. The term “source code” refers to the human-readable source code before it is compiled. There are several problems with using SLOC as a measure of software development productivity.
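Even the count itself depends on convention. Here is a minimal sketch of one possible convention (counting non-blank lines that are not pure comments; the Python comment syntax handled here is an assumption about the code being measured), just to show how “physical” and “logical” counts can diverge for the same file.

def count_sloc(path):
    """Minimal sketch: count lines of code under one possible convention."""
    physical = 0   # every line in the file
    logical = 0    # non-blank lines that are not pure comments
    with open(path, encoding="utf-8") as f:
        for line in f:
            physical += 1
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                logical += 1
    return physical, logical

# Example usage: the two counts can differ substantially for the same file.
# physical, logical = count_sloc("my_module.py")
# print(physical, logical)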

The first issue is that not all SLOC takes the same amount of effort. For example, complex scientific code takes a lot longer to write than text boxes and radio buttons on a user interface. This has been addressed in software estimation tools like SEER-SEM and COCOMO by assigning a complexity factor to software products (e.g., an e-commerce website is less complex than an image processing system). But software cost estimation is not the same as measuring the productivity of a software development team. It is not practical to ask developers to assign a complexity measure to every software component they develop, and it would be difficult to normalize this between developers. But there is another, more serious problem with using SLOC as a productivity measure.

“Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”

Bill Gates

The issue is that we strongly prefer software solutions with fewer lines of code over those with more code. One software developer’s implementation may be 50 lines of code and another’s might be 500 lines of code for the same functionality. The shorter the software solution, the easier it is to maintain, and most of the cost of software is in maintenance. The shorter code may also be more performant (e.g., requiring fewer computing resources, providing lower latency and higher throughput). If we were to use quantity of SLOC as a productivity measure, we would conclude that a good programmer who writes concise, efficient code is less productive than a bad programmer who produces verbose code, which is clearly wrong.

Function Points

The basic idea of Function Points (FP) is to quantify business functionality provided by software. There are formal methods for FP estimation and even ISO standards that govern the methodology. Function points are somewhat obscure in that they are rarely used at commercial companies in the US (they seem more popular in other countries). Allan Albrecht, the inventor of function points, observed in his research that FPs were highly correlated to lines of code and thus they share the same issues using them as a software development productivity measure. A large criticism of FPs is that like SLOC counting, they don’t take into account the complexity of the software being written. They work better for business software but not so well for software that has more algorithmic complexity (e.g., a data science application).

FP Productivity = (function points completed) / (cost to write the code)

The bottom line is that if a software development team completes 1000 function points one month and 1100 function points the next, you can’t conclude that their productivity increased 10%. That is because function points don’t take into account complexity (i.e., the software they developed in the second month might have been a lot easier than the first month). There is also significant overhead to assigning function points and it would be difficult to find staff with function point experience in the US.

User Story Points

Story Points are used by agile development teams to estimate the amount of effort it will take to complete a user story. They are typically used to determine how many user stories can be planned into a sprint (a time-boxed development cycle). They are a very subjective measure that only has value within a team; comparison to other teams, departments, and organizations is not possible. A software development team can track its velocity (the number of story points completed each sprint) with a goal to improve it, but velocity can easily be gamed by the team. Since story points are completely subjective, the team can just estimate them higher and velocity will appear to increase. They are useful within the team for improving its own performance, but not as an external productivity measure.

SP Productivity = (story points completed) / (cost to write the code)

Use Case Points

Use Case Points (UCP) rely on the requirements for the system being written using use cases, which is part of the UML set of modeling techniques. The software size is calculated based on elements of the system use cases with factoring to account for technical and environmental considerations. The UCP for a project can then be used to calculate the estimated effort for a project. Thus UCP is only applicable when the documentation contains use cases (i.e., you have to write use cases for everything). UCP is also a highly subjective method, especially when it comes to establishing the Technical Complexity Factor and the Environmental Factor. Also, there is no standard way to write use cases.

UCP Productivity = (use case points completed) / (cost to write the code)

Subjective Measures of Productivity

As we have seen, useful, practical, comparative, objective measures of software development productivity simply do not exist. We’ve been searching for them for many decades to no avail. I believe the core reason for this is that software development is knowledge work and a complex creative endeavor. Vincent Van Gogh produced more than 2,000 artworks, consisting of around 900 paintings and 1,100 drawings and sketches. Frida Kahlo in her shorter lifetime produced approximately 200 paintings, drawings, and sketches. Which painter was more productive? Maybe one artist’s paintings took longer to create because they were more complex or required more careful thought or experimentation. How would one go about analyzing this to come up with a reasonable productivity metric?

In the absence of useful objective productivity measures, we must turn to subjective measures. I’ll discuss these approaches in a future post.

Organizational Bias in Software Architecture

Organizational structure can bias the software architecture and lead to sub-optimal solutions. Why not align the organization with the architecture?

The architecture of a system tends to mirror the structure of the organization that created it. This idea was introduced by Melvin Conway way back in 1967 and still holds true today. Organizational bias to the architecture can be mitigated through awareness and with flatter, more flexible organizational structures.

Conway cited an example where eight people were assigned to develop a COBOL and an ALGOL compiler. After some initial estimates of difficulty and time, five people were assigned to the COBOL job and three to the ALGOL job. The resulting COBOL compiler ran in five phases, the ALGOL compiler ran in three! That probably wasn’t an optimal design and was clearly aligned with the organizational structure.

I’ll discuss a few other scenarios where Conway’s Law has manifested: website design, API architecture, and microservices architecture.

Website Design

The most obvious modern-day manifestation of Conway’s Law is in website design. We’ve all seen corporate websites that mirror the organizational structure of the company rather than follow a user-optimized design. What typically happens is that each organization develops and contributes its own content to the website and these pieces are then assembled together.

Home page
    - Division A webpages
    - Division B webpages
    - Division C webpages
Boeing website organized by divisions

A website user may not care about how the company is organized. She just wants a great website user experience. Also, how easily is the website maintained when there is an organizational restructuring?

API Architecture

We also sometimes see organizational structure reflected in API design. Take this example presented by Matthew Reinbold. Suppose that Team A creates a Users API as follows.

/users/{userId}/preferences

"preferences" : {
    "language" : "EN-US",
    "avatar" : "my-avatar.gif",
    "default-page : "settings"
}

Clearly, the intention is to have all the user’s preferences accessible via the “preferences” resource. Now suppose that sometime later, Team B is given the responsibility to develop a new feature that allows users to customize the sort order for their search pages and they need a place to save the user’s sort preferences. Because no one on Team B knows anyone on Team A, they are separated by a few time zones, don’t have the ability to add a high priority item to Team A’s backlog, and are on a tight schedule, they decide to create a new API under “preferences” and call it “sort.” They don’t need to involve Team A and can get this done very quickly. So they come up with something like this.

/users/{userId}/preferences/sort

"sort" : {
    "order" : "ascending"
}

The problem here is that even though this single design transgression in and of itself doesn’t seem like a big deal, it can proliferate, and you can end up with a very chatty API like the following that is difficult for clients to consume.

/users/{userId}/preferences/ooo
/users/{userId}/preferences/manager
/users/{userId}/preferences/timezone
/users/{userId}/preferences/signature

Microservices Architecture

If software development is functionally organized as in UI (front end), Services (back end), and Database teams, there will be an architectural bias along these lines. This can lead to sub-optimal microservices architectures. Ideally, there would be cross-functional teams with each team responsible for one or more microservices. James Lewis and Martin Fowler published an article on this topic.

With a functional organization as described above, there will be a bias for each team to address features within their functional area rather than across functional areas. You may end up with business logic in each layer of the architecture. Your services may not be as encapsulated as you would like.

Decoupling Architecture from Organization

So what can we do to avoid the pitfalls of Conway’s Law? I think the first step is awareness. Just like we need to be aware of other biases like confirmation bias and survivorship bias, an awareness of organizational bias can help our teams to actively work against it. But the organizational structure itself is a large factor.

  1. The flatter the organizational structure, the less likely it will bias the architecture. A flatter organizational structure provides more of a blank sheet of paper than a set of constraints.
  2. The more flexible the organizational structure, the less likely it will bias the architecture. My view is that the organizational structure should mirror the architecture and not vice versa. Do the architecture first, then organize around it.
  3. The more communication within the organization, the less likely it will bias the architecture. If everyone knows what architecture decisions are made, there is a better chance that someone will speak up when Conway’s Law rears its ugly head, especially if awareness is raised throughout the organization.

Conclusion

Although Conway’s Law has been widely known in the software development community for many years, we still continue to see sub-optimal architectures biased by organizational structure. This organizational bias has manifested itself in many different scenarios, including website design, API architecture, and microservices architecture.

Organizational awareness is the first step to mitigate this and simple anecdotes can be included in group meetings and training sessions. But even better is to have a relatively flat organization with the flexibility to organize around the architecture of the product that is being developed.

In closing, I recently re-read Conway’s original paper, and found another somewhat humorous aphorism buried within. I’ll leave it here without further comment.

“Probably the greatest single common factor behind many poorly designed systems now in existence has been the availability of a design organization in need of work.”

Melvin Conway, 1967

Image credit Manu Cornet

Can AI Help Battle Coronavirus?

The AI community has been marshaling its resources in the fight against Coronavirus, with a focus on three areas: diagnosis, treatment, and prediction. The biggest challenge thus far has been the lack of data, partly caused by a dearth of diagnostic testing. In this post, I give some examples of how AI is being applied in these three areas. Unfortunately, I don’t think AI will have a huge impact on our response to the COVID-19 epidemic, but what we learn here will help us in the future.

Prediction

There has been much discussion about “flattening the curve” so we don’t overwhelm our healthcare resources. The graphs being shown are based on predictions of how the disease can spread under different scenarios. We would like to know how many COVID-19 cases to expect, when and where they are likely to occur, and their expected severity. We would also like early identification of novel outbreaks.

In 2008, Google launched a project to predict and monitor flu called Flu Trends. It was shut down after it overestimated the peak of the 2013 flu season by 140 percent. But other companies learned from this epic failure and have since developed better solutions. At the end of February 2020, Metabiota was able to predict the cumulative number of COVID-19 cases a week ahead of time to within 25% and also predict which countries would have the most cases.

Diagnosis

The most widely publicized AI success versus Coronavirus has been the development of Deep Learning models that can be used to analyze CT scans of lungs and distinguish COVID-19 pneumonia from other causes. Infervision and Alibaba have built models that demonstrate high accuracy. Here is a paper describing an approach by a Chinese team.

The issue here is that we would like an earlier diagnosis and not have to wait until there is pneumonia. Also, with the large number of cases, the capacity to perform CT scans could be exceeded.

Treatment

BioTech companies are using AI to identify already-approved drugs that can be re-purposed for Coronavirus and also to identify other molecules that could form the basis of an effective treatment.

Insilico is going after an enzyme, called 3C-like protease, that is critical for the coronavirus’s reproduction. They are using Generative Adversarial Networks (GAN) and other models in their drug discovery pipeline.

Conclusion

There have been great advances in AI technology this past decade, especially in the area of Deep Learning, that can be used for prediction, diagnosis, and treatment of infectious diseases. Our experience developing solutions for this current epidemic will help prepare us for the next one.

Featured photo of “Geek Machine” by Bob Mackie Copyright © 2020 Steve Kowalski

Removing Bias in AI Systems

We must ensure that our AI Systems are not biased. This can be an issue when building Deep Learning models from a biased training set.

Advances in AI technology have enabled a large number of successful applications, especially in the area of Deep Learning. But the issue of learned bias has raised its ugly head and must be addressed. The good news is that the AI research community has been working on this problem and interesting and effective solutions are being developed.

There are many different types of biases. Here are some examples.

  • Gender bias
  • Economic bias
  • Racial bias
  • Sexual orientation bias
  • Age bias

If we train our AI systems from biased data, these biases will be learned. For example, if we train a Deep Learning system on images of doctors and an overwhelming percentage of the images are of male doctors, the system is likely to learn that doctors are men. In their 2016 paper, “Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings,” Tolga Bolukbasi, et al. showed a disturbing level of gender bias in word embeddings trained from Google News articles. But they also proposed an effective way of removing this bias from the models that are learned. The basic idea is to change the embeddings of gender-neutral words by removing their gender associations. The same approach can be taken for other forms of bias.
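The core of that neutralizing step is just a vector projection: subtract from each gender-neutral word’s embedding its component along a learned “gender direction.” Here is a minimal sketch with tiny made-up vectors (not the authors’ code or data):

import numpy as np

def neutralize(word_vec, gender_dir):
    # Remove the component of the embedding that lies along the gender direction.
    g = gender_dir / np.linalg.norm(gender_dir)   # unit gender direction
    return word_vec - np.dot(word_vec, g) * g     # subtract the projection onto g

# Toy 2-D vectors for illustration only
gender_direction = np.array([1.0, 0.0])           # e.g., embedding("she") - embedding("he")
doctor = np.array([0.3, 0.8])                     # leans toward one gender

print(neutralize(doctor, gender_direction))       # [0.  0.8] -- gender component removed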

The fact that we have this problem to deal with in Machine Learning sheds light on the extensive amount of bias that exists in the human world, and unfortunately, it seems easier to fix the bias issue in AI systems than in humans.

“Debiasing humans is harder than debiasing AI systems.” – Olga Russakovsky, Princeton

Programming Language Constants – Quick Reference

This is a quick reference for how constants are declared and used in different programming languages.

Software developers writing programs in different languages are often challenged by the lack of consistency in how constants are declared and used (pun intended!). This is a quick reference that compares rules and behavior around constants in Java, Swift, Javascript (ES6), Rust, and C++. The motivation for this is the annoying difference between the let keyword in Javascript ES6 and Swift that drives me crazy (let declares a variable in ES6 and a constant in Swift).

For each language, here is the “constant” keyword, when “constantness” is enforced, and the rules that apply.

Javascript ES6: const (enforced at run time)
  1. const variables must be assigned a value when declared and cannot be reassigned.
  2. const does not define a constant value; it defines a constant reference to a value (you can change the properties of a const object).

Swift: let (enforced at compile time)
  1. After the value of a let constant is set, it cannot be changed.
  2. If it is initialized with a class object, the object itself can change, but the binding between the constant name and the object it refers to can’t.

C++: #define (compile-time warning if redefined)
  Programmers use the #define preprocessor macro to declare constants, but these can be modified by subsequent #define statements (with a compiler warning) or undefined with the #undef statement, so there is no real enforcement of “constantness.”

C++: const (enforced at compile time)
  1. A const variable can’t have its value changed.
  2. A pointer declared const can’t have its value changed, but the memory it points to can be changed.
  3. You can only call const methods on objects that are declared as const.

Java: const (not used)
  It is interesting that const is a reserved keyword in Java, but it is not used.

Java: final (enforced at compile time)
  1. Once a final variable is assigned, it can’t be changed, but it doesn’t need to be assigned at the point of declaration.
  2. If the final variable is an object, it will always point to the same object, but the properties of the object can change.

Python: no constant keyword
  There is no such thing as a constant in Python; everything can change.

Rust: const (enforced at compile time)
  1. Similar to Javascript ES6, constants are declared using the const keyword while variables are declared using the let keyword.
  2. Constants cannot have their value changed.
  3. Be aware that references to a constant may not all refer to the same memory location.

JavaScript ES6

ES6 introduced two new JavaScript keywords: const and let. Both have block scope. The difference is that const variables can’t vary – they have to be assigned a value when they are declared and can’t be reassigned after that.

const PI; // will give an error

const PI = 3.14;
PI = 3.14159; // will give an error

But it is really the reference that is the constant and not the value it refers to. So you can have a const object but still reassign its properties.

const myCar = {color: "red", make: "honda", miles: 15000};
myCar.miles = 15500; // this is OK

Swift

In contrast to Javascript ES6, the let keyword in Swift is used to declare a constant, not a variable.

let (firstNumber, secondNumber) = (5, 23)

When a let constant is declared in global scope, it needs to be initialized (like the declaration above). But if you declare it inside a function, it can be initialized at runtime as long as it gets a value before it is first used.

One thing to keep in mind is that if the constant is an object, the value it takes is a reference to the object that can’t be changed, but the properties of the object CAN change.

C++

Variables declared as const can’t have their values changed.

const int maxarray = 255;

Neither can const pointers, but you can still change the contents of the memory they point to.

char buffer[16];
char *mybuf = buffer, *yourbuf = buffer;

char *const aptr = mybuf; // const pointer to non-const char

*aptr = 'a'; // OK: the pointed-to memory can change

aptr = yourbuf; // Error C3892: the pointer itself can't be reassigned

Objects that are declared const can only have const member functions called.

const Date BirthDate( 1, 18, 1953 );

BirthDate.getMonth(); // Okay if getMonth() is a const function

BirthDate.setMonth( 4 ); // C2662 Error if setMonth is not a const

Note that because const is part of the type in C++, it can be cast away with const_cast and the value modified (although modifying an object that was originally defined const is undefined behavior).

Java

A final variable can only be assigned once.

final int i = 1;

i = i + 5; // error

But objects declared final can have their properties changed.

final StringBuffer sb = new StringBuffer("Hello");

sb.append(" Steve"); // this is OK

Rust

Rust uses the const and let keywords similar to the way Javascript ES6 uses them.

const N: i32 = 5; // can't be modified

The nuance with Rust is that constants have no fixed address in memory. This is because they’re effectively inlined into each place they’re used, analogous to (but not exactly the same as) #define in C++. References to the same constant are thus not necessarily guaranteed to refer to the same memory address.

Featured image of Rocks at Joshua Tree Copyright © 2020 Steve Kowalski

A Grammar for AI-based Parking Sign Understanding

A grammar for parking signs can be used in conjunction with image recognition and text recognition to reason when parking is permitted and for how long.

Street parking in Los Angeles can be a nightmare. Besides the lack of available spaces, signage can be complex and confusing. Often there are multiple signs posted on the same pole that must be read, understood, and reasoned over in order to avoid a citation or towing (the signs in the image at the top of this post are typical). Fortunately, parking sign understanding can be automated with Artificial Intelligence techniques such as image recognition, text recognition, and machine learning. This post describes one piece of a comprehensive solution – a grammar for parking signs – that can be used to generate parsers that apply rules to unstructured sign text to facilitate automated reasoning. I will also show how ANTLR can be used to generate a parser from the grammar. The grammar for LA parking signs and example photos can be found here on GitHub.

Grammar in this context is defined as the way that we expect words to be arranged on parking signs (for now, we ignore non-alphanumeric symbols such as a P with a line through it). A useful grammar notation commonly used in Computer Science is Extended Backus-Naur Form (EBNF).

No Parking for Street Sweeping

One common sign found in Los Angeles and the source of many citations is no parking when there is street sweeping. Several instances of this sign are shown below. Note the following variations:

  1. “NO PARKING” text versus the symbol P with a red circle and line through it
  2. The time range specified by “TO” versus a dash “-“
  3. “12NOON” instead of “12PM”
  4. “STREET SWEEPING” versus “STREET CLEANING”
Street Sweeping Sign

I wrote a program using Apple’s Vision Framework to process these images and extract text from them. The output for these four signs (clockwise from top left) is as follows (note that the output is always uppercase):

  1. NO PARKING 9AM TO 12 NOON MONDAY STREET CLEANING
  2. NO PARKING 8AM – 10AM TUESDAY STREET CLEANING
  3. 8AM TO 10 AM TUESDAY STREET SWEEPING
  4. 9AM TO 12NOON MONDAY STREET SWEEPING

EBNF Grammar

We would like a grammar that covers all of these variations, can be used to distinguish street sweeping signs from other signs, and allows us to understand the parking rules on this street. Taking a bottom-up approach, note that “STREET SWEEPING” and “STREET CLEANING” really mean the same thing, so we can create an EBNF grammar rule for them as follows.

streetSweeping : STREET ( SWEEPING | CLEANING ) ;

The vertical line between SWEEPING and CLEANING means that either of these tokens can match. So if a parser finds the text “STREET SWEEPING” or “STREET CLEANING,” this grammar rule would be executed and create a “streetSweeping” node in the parse tree with the matching tokens as child nodes.

The part of the Parser that matches input text to tokens is called the Lexer. Our grammar would need three lexical rules to support the “streetSweeping” rule as follows.

STREET : 'STREET' ;
SWEEPING : 'SWEEPING' ;
CLEANING : 'CLEANING' ;

The characters between the single quote marks are matched one for one with the input text from the sign and if there is a match, the Lexer creates a node in the parse tree corresponding to the word.

We apply the same approach to identifying the time range on the signs. Some signs specify the range in the form 8AM TO 10AM and others as 8 TO 10AM so we have different rules for “time” and just a plain integer.

timeRange : (time to time) | (INT to time) ;

Since the signs have three conventions for indicating time range (“TO” “THRU” and “-“), we can have a grammar rule for this as well, along with lexical rules to support it.

to : TO | THRU | DASH ;
TO : 'TO' ;
THRU : 'THRU' ;
DASH : '-' ;

For the hours in the time range, we need to handle AM/PM and also NOON and MIDNIGHT. We also need to be able to handle the case when there is a space between the hour number and the AM/PM/NOON/MIDNIGHT and when there is no space (the output of the text recognition software is inconsistent here).

time
  : INT (':' INT )? (am | pm )?
  | twelveNoon
  | twelveMidnight
  ;

twelveNoon : NOON | ('12' NOON) ;
twelveMidnight : MIDNIGHT | ('12' MIDNIGHT) ;

am : 'AM' | ('A.M.') ;
pm : 'PM' | ('P.M.') ;
NOON : 'NOON' ;
MIDNIGHT : 'MIDNIGHT' ;

INT : [0-9]+ ;

Finally, we need rules for the days of week and the words “NO” and “PARKING.” The last rule “WS” is to ignore whitespace.

day : MON | TUE | WED | THU | FRI | SAT | SUN ;
MON : 'MONDAY' | 'MON' ;
TUE : 'TUESDAY' | 'TUE' ;
WED : 'WEDNESDAY' | 'WED' ;
THU : 'THURSDAY' | 'THU' ;
FRI : 'FRIDAY' | 'FRI' ;
SAT : 'SATURDAY' | 'SAT' ;
SUN : 'SUNDAY' | 'SUN' ;

NO : 'NO' ;
PARKING : 'PARKING' ;

WS : [ \t\r\n]+ -> skip ;

The top-level EBNF grammar rule for street sweeping signs can be written as:

streetSweepingSign
    : NO?  PARKING?  timeRange  day  streetSweeping
    ;

The question marks after “NO” and “PARKING” mean that these are optional. That is all the grammar to support this one type of sign. Many of these rules can be reused for other signs (e.g., day, timeRange).

Generating a Parser from the Grammar

ANTLR is a powerful parser generator that operates on an input grammar file to generate all the source code files you need in a target language that can be incorporated into the rest of your application. It also provides command line tools to test your grammar against input text streams. Running the parser generated from the Street Sweeping Sign grammar against the four sign instances creates the following parse trees.

NO PARKING 9AM TO 12 NOON MONDAY STREET CLEANING

NO PARKING 8AM – 10AM TUESDAY STREET CLEANING.

8AM TO 10 AM TUESDAY STREET SWEEPING.

9AM TO 12NOON MONDAY STREET SWEEPING.
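As a concrete illustration, here is a minimal sketch of driving such a generated parser from Python. The grammar and file names (ParkingSign.g4 and the generated ParkingSignLexer/ParkingSignParser classes) are assumptions for the example, not necessarily the names used in the GitHub repo.

# Assumes the parser was generated with: antlr4 -Dlanguage=Python3 ParkingSign.g4
# and that the antlr4-python3-runtime package is installed.
from antlr4 import InputStream, CommonTokenStream
from ParkingSignLexer import ParkingSignLexer
from ParkingSignParser import ParkingSignParser

sign_text = "NO PARKING 8AM - 10AM TUESDAY STREET SWEEPING"

lexer = ParkingSignLexer(InputStream(sign_text))
parser = ParkingSignParser(CommonTokenStream(lexer))
tree = parser.streetSweepingSign()        # start at the top-level rule

print(tree.toStringTree(recog=parser))    # LISP-style text form of the parse tree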

Other Types of Parking Signs

There are other types of parking signs that can be found in Los Angeles. The grammar described here for street sweeping signs has been extended to cover all of the instances I’ve encountered (I haven’t driven on every street in LA yet!). This comprehensive parking sign grammar and parser can be used to identify multiple parking signs and reason about whether it is safe to park and for how long. That app is under development and could be the subject of a future post.

Javascript ES6 Demo – Elevator System

An interactive elevator control system and demo implemented in Javascript (ES6) is described. There are links to a demo and the source code.

When interviewing software developers, I like to give them a design problem to work out on a white board. My favorite has been an elevator control system. Everyone should know how elevators are supposed to work, so candidates shouldn’t get bogged down by lack of domain knowledge. Of course, there is no single correct answer and the purpose is really to gain insight into their thought process and how they would approach a problem. I often thought about what my design would look like, but I never followed through and implemented any of my ideas. So, I finally decided to take the plunge and code it to create a Javascript ES6 Demo.

Javascript ES6 Demo

You can see it running here and you can get the source code here on Github.

I decided to implement it in Javascript so it can run in a web browser without any backend. ECMAScript 6 is widely supported now, so I was able to take advantage of some more advanced Javascript language features. IntelliJ was my development environment. Also, the developer tools in Chrome were invaluable.

Architecture

I first considered a distributed architecture with independent controllers for each elevator that “bid” on elevator call requests like an auction (e.g., the elevator that can accept the request with the least cost would get it). The actor model would be a good fit for this approach. But I ended up deciding to implement a centralized control system as an initial solution. I may revisit the distributed architecture in a future implementation.

Metrics and Optimization

What exactly are we trying to optimize? The metrics I used are:

  1. Average Wait Time – the average time a person must wait before being picked up.
  2. Average Travel Time – the average time a person spends inside an elevator.
  3. Average Total Time – the sum of wait and travel time.

I also considered using elevator power consumption as a metric. Modern elevators generate power when they are traveling down which makes this very interesting. But I left this as future work.
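To make the three timing metrics concrete, here is a minimal sketch of how they can be computed from per-passenger timestamps. The data structure is an assumption for illustration (and in Python rather than the demo’s Javascript), not the demo’s actual code.

# Each record: time the call button was pressed, time the passenger was picked up,
# and time they arrived at their destination floor (all in seconds).
passengers = [
    {"requested": 0.0, "picked_up": 12.0, "arrived": 30.0},
    {"requested": 5.0, "picked_up": 14.0, "arrived": 41.0},
    {"requested": 9.0, "picked_up": 25.0, "arrived": 38.0},
]

wait_times = [p["picked_up"] - p["requested"] for p in passengers]
travel_times = [p["arrived"] - p["picked_up"] for p in passengers]
total_times = [w + t for w, t in zip(wait_times, travel_times)]

print("Average Wait Time:  ", sum(wait_times) / len(wait_times))
print("Average Travel Time:", sum(travel_times) / len(travel_times))
print("Average Total Time: ", sum(total_times) / len(total_times))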

Software complexity – Is it worth measuring?

Software is among mankind’s most complex creations. How do we measure software complexity and is there value in doing so?

Software is among mankind’s most complex creations. How do we measure software complexity and does it even make sense to do so? When I think of the most complex structures we have created, the arts (especially music), and … software come to mind. From an economic perspective, we consider software complexity to be a bad thing because the more complex the software, the more time and expense it takes to build and maintain. But sometimes it needs to be complex in order to solve complex problems. There can also be an aesthetic beauty to software that only developers seem equipped to appreciate. I recall once looking at code that controlled a spacecraft and being blown away not only by its complexity but also by how very well written it was.

Music is different in that its complexity seems to affect people in different ways. The high level of complexity in a Mozart symphony seems to contribute to its immortality. On the other hand, simple popular music tunes have a wide fanbase.

Complexity of Music


In his 1933 book Aesthetic Measure [1], preeminent American mathematician George David Birkhoff proposed a mathematical theory of aesthetics. In the course of writing the book, he spent a year studying art, music, and poetry of various cultures around the world. He developed a formula to measure the aesthetic quality of an art object (e.g., a work of music) as the ratio between its order and complexity. Since that time, researchers have built upon his work to come up with other ways of analyzing the complexity of music. Mandelbrot’s protégé Richard Voss, together with John Clarke, applied fractals to the mathematical analysis of music [2]. April Pease and her colleagues extended this work by searching for the presence of crucial musical events based on an analysis of volume and using this as a measure of complexity [3]. I find it interesting that in music complexity, the performance is measured, not the static sheet music (or electronic equivalent). Music played by a computer reading sheet music has been found to be less complex than a performance by accomplished musicians!

Software Complexity

The software profession has struggled with how to measure software complexity for decades. Thomas McCabe came up with the idea of using Cyclomatic Complexity to measure the number of logical paths through code [4]. But this has been shown to not be any better than just counting source lines of code (SLOC). Two methods currently in use are a set of six metrics proposed by Shyam R. Chidamber and C.F. Kemerer specifically designed for object-oriented code [5], and a different set of six metrics proposed by Maurice H. Halstead [6].
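To make McCabe’s idea concrete, here is a minimal sketch using the common shortcut of counting decision points and adding one (an illustration of the concept, not how any particular measurement tool works):

# This function has two decision points (the for loop and the if),
# so its cyclomatic complexity is 2 + 1 = 3 under the usual
# "decision points + 1" shortcut for single-entry, single-exit code.
def count_positives(values):
    positives = 0
    for v in values:      # decision point 1
        if v > 0:         # decision point 2
            positives += 1
    return positives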

Comparison of Software Complexity Metrics

Chidamber and Kemerer Metrics (note that these metrics are per class, so you would sum them up for all classes in the program):

  • WMC – weighted methods per class is the sum of the complexities of each class method, but since a complexity of 1 is used for each method, this is really just the number of methods in a class
  • CBO – coupling between object classes is the number of other classes to which a class is coupled (using or being used)
  • RFC – response for a class is the number of methods called by each class method, summed together
  • NOC – number of children is the count of all classes that inherit this class or a descendant of it
  • DIT – depth of inheritance tree is the maximum depth of the inheritance tree for this class
  • LCOM – lack of cohesion of methods measures the intersection of the attributes used in common by the class methods

Halstead Metrics:

  • Program vocabulary: n = n1 + n2, where n1 is the number of distinct operators and n2 is the number of distinct operands
  • Program length: N = N1 + N2, where N1 is the total number of operators and N2 is the total number of operands
  • Estimated program length: N̂ = n1 log2 n1 + n2 log2 n2
  • Volume: V = N x log2 n
  • Difficulty: D = (n1 / 2) x (N2 / n2)
  • Effort: E = D x V
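As a worked example of the Halstead formulas, here is a minimal sketch with made-up operator and operand counts (not measured from real code):

import math

n1, n2 = 10, 15   # distinct operators, distinct operands (made-up counts)
N1, N2 = 40, 60   # total operators, total operands (made-up counts)

n = n1 + n2                                       # program vocabulary
N = N1 + N2                                       # program length
N_hat = n1 * math.log2(n1) + n2 * math.log2(n2)   # estimated program length
V = N * math.log2(n)                              # volume
D = (n1 / 2) * (N2 / n2)                          # difficulty
E = D * V                                         # effort

print(f"vocabulary={n}, length={N}, estimated length={N_hat:.1f}")
print(f"volume={V:.1f}, difficulty={D:.1f}, effort={E:.1f}")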

Measuring Software Complexity


So why should we care about measuring software complexity?  Here are some claims, in many instances being made by companies that are selling complexity measurement products or consultants that will help you figure out how to use them.

Better estimates on software maintenance effort

I suppose if you had enough empirical data to somehow relate complexity to maintenance cost, this might be useful.  There certainly have been a lot of studies on this.  The problem is that there are other significant factors that affect software maintenance costs like

  • the number and type of new or changed user requirements concerning functional enhancements
  • the amount of adaptation that needs to be done to support a changing environment (e.g., Database, Operating System)
  • the amount of preventative maintenance that needs to be done to improve reliability or prevent future problems

Monitoring complexity so as to keep it lower, saving cost and reducing risk

So, do we have the Software Development Manager tell developers that their LCOM or DIT is too high and they need to fix it?  Really?  I suppose this could be an indicator that could be used to focus code reviews (i.e., spend more time reviewing code that has higher complexity) but I don’t see that there would be much value in doing this especially if your team is already doing effective code reviews with good coverage.

Using complexity as criteria for deciding to refactor or rewrite software

The suggestion here is that a software development manager would monitor the complexity across the codebase and, when a module gets above a certain threshold, a decision to refactor or rewrite would be considered. My experience is that the development team already knows which sections of code are the best candidates for refactoring based on the effort required to maintain them. I’d trust that measure much more than a complexity metric. Another consideration is how often complex code needs to be touched. I’ve worked in organizations where we had a large, complex legacy codebase that we didn’t touch; we just wrapped it with a façade or adapter.

My view on software complexity is that there is little if any value in measuring it outside of academia. There is a hospital sketch in Monty Python’s The Meaning of Life where doctors call for the operating room to be filled with the most expensive equipment in order to impress the administrators should they drop in for a visit. John Cleese specifically asks for staff to bring in “the machine that goes ‘Bing’.” A software complexity dashboard would seem to have equivalent utility.

[1] George D. Birkhoff, Aesthetic Measure, Harvard University Press, 1933.
[2] R. F. Voss and J. Clarke, “1/f Noise in Music and Speech,” Nature 258 (1975).
[3] April Pease, Korosh Mahmoodi, and Bruce J. West, “Complexity Measures of Music,” Chaos, Solitons and Fractals 108 (2018), 82–86.
[4] Thomas J. McCabe, “A Complexity Measure,” IEEE Transactions on Software Engineering SE-2, no. 4 (December 1976), 308–320.
[5] Shyam R. Chidamber and Chris F. Kemerer, “A Metrics Suite for Object Oriented Design,” IEEE Transactions on Software Engineering 20, no. 6 (June 1994), 476–493.
[6] Maurice H. Halstead, Elements of Software Science, Elsevier North-Holland, 1977. ISBN 0-444-00205-7.

Featured image of Barcelona Cathedral Copyright © 2019 Steve Kowalski

Software evolution – Software is never done … it is abandoned!

“Software is never done … it is abandoned.” Software evolution is something to be understood (Lehman’s Laws) and embraced

I’m not sure who should get credit for the aphorism “Software is never done … it is abandoned,” but it seems a corollary to Lehman’s laws of software evolution. Meir “Manny” Lehman worked at IBM’s research division from 1964 to 1972. Lehman’s studies of the software development lifecycle provided a foundation for his early recognition of the software evolution phenomenon. After IBM, he became Professor and Head of the Computing Department at Imperial College London and then Professor at Middlesex University. I’ll discuss three of his eight laws that resonate the most with me, and their implications.

Functional content must grow

The functional content of a software system must be continually increased to maintain user satisfaction over its lifetime

This is a good thing!  People like using your software!  They will find ways to use it that you hadn’t thought of. They will have wonderful ideas on how it can be more efficient and more comprehensive.  But if you don’t keep releasing new features and enhancements to keep up with their requests, they may become dissatisfied and move on to something else. 

Complexity must be managed

As a software system evolves, its complexity increases unless work is done to maintain or reduce it

This is in reference to increasing software entropy.  As new functionality is added, the software will eventually become more complex and more disorganized as it departs from its original design.  At some point, it may well be time for a redesign.  That in no way means that the original design was a failure, just that the system has evolved, which is a good thing!  This concept of software entropy is orthogonal to technical debt. Taking on technical debt may lower complexity when easier short-term solutions are selected over better longer-term solutions with higher complexity and longer implementation times.

Quality may appear to be declining

The quality of a software system will appear to be declining unless it is rigorously maintained and adapted to operational environment changes

The environment that our software operates in is likely to be ever-changing. There will be new platforms, new operating systems, new devices, new frameworks, new protocols, new databases, new APIs, and new resource constraints and unconstraints. “Adapt or perish, now as ever, is nature’s inexorable imperative.” – H.G. Wells

Software evolution is not the enemy; it is the consequence of a successful system.