Lines of Code

Our configuration management team sent me an unusual request. They had identified all the file types in our source code control system, and they wanted to know which of those types should be counted to arrive at a source lines of code (SLOC) count. The idea was to tally up the numbers and give a report to our client. Now this task alone was not difficult. Any developer worth his salt knows which files are source files and which are not. Actually, the CM guys should be able to figure this out themselves. Does anybody really think a bitmap (bmp) file contains any source code?

The real concern was why somebody wanted to count the lines of code. Such a metric is not evil in and of itself. But if you do not know what you are doing, you can be dangerous with such a metric at your disposal. Lines of code is a relatively unambiguous metric to compute. Interpreting it is the hard part. For example, you might refactor your code and arrive at a lower count. Somebody looking solely at the count might think that you have made negative progress toward some development goal. Or you might see that 20k SLOC took 20 months to develop, and assume that a new 1k SLOC change will therefore take just 1 month to complete. You get the picture.

I do not like to second-guess our CM team. They are pretty sharp technical guys. If they have a task to count up the lines of code, then I can support them. I rely on them often to help me out with Rational ClearCase. So I answered their questions regarding which source files should be counted. In the end they came up with a count of 215,000 lines of code in our two largest applications. This seemed to be the correct order of magnitude. A couple of years ago I did a line of code count using the UNIX wc command and came up with 270,000 lines of code. My count may have included the source code for some third-party tools we were using, so it may have been artificially high.
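
For anyone curious what such a count looks like in practice, here is a minimal Python sketch. It is not what the CM team or I actually ran. The extension list and the excluded directory names are made up for illustration, and it counts physical lines (blank lines and comments included), roughly what wc -l reports.

```python
import os

# Hypothetical choices for illustration; adjust to the project at hand.
SOURCE_EXTENSIONS = {".c", ".cpp", ".h", ".java", ".sql"}
EXCLUDED_DIRS = {"third_party", "vendor"}


def count_lines(root, extensions=SOURCE_EXTENSIONS, excluded_dirs=EXCLUDED_DIRS):
    """Return {extension: physical line count} for source files under root."""
    totals = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in excluded_dirs]
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext not in extensions:
                continue  # skips bitmaps and other non-source files
            # Read in binary mode so odd encodings cannot break the count.
            with open(os.path.join(dirpath, name), "rb") as handle:
                totals[ext] = totals.get(ext, 0) + sum(1 for _ in handle)
    return totals
```

Calling sum(count_lines("path/to/checkout").values()) gives a single total. Skipping the third-party directories is the step that would have kept a count like my old wc one from being inflated.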