In the previous post we discussed coding standards; in this one we will examine the equally important subject of module/unit tests, and in particular the built-in module testing framework that Mitopia® provides and which MitoSystems’ coding standards mandate. A distinction is generally made between ‘unit tests’ and ‘module tests’: the former is normally a development-phase activity performed by the developer on a per-function basis, while the latter is normally thought of as something that happens after initial development and may be repeated regularly during maintenance to ensure the module still performs as expected. Module tests also tend to exercise an integrated set of functions more than unit tests do. In either case, these tests usually involve running the target code within some kind of specialized testing harness which is quite distinct from the actual program the code is designed to be part of.
In Mitopia® we take the position that unit tests and module tests are one and the same, and that rather than being distinct from the target application, they are always part of it, and can in fact be invoked at any time from within the Mitopia® application itself. This is a fairly radical departure from normal approaches, and so perhaps we should start out by describing the reasoning behind this choice of approach.
The following are some key advantages of having module tests available in-place in the target application:

- The tests remain available at all times and can be invoked by anyone, at any time, from within the running application itself.
- The tests execute in the real target environment (persistent storage, GUI, threading, and all), so module testing doubles as integration testing.
- The test steps serve as living documentation of how package functions can be combined and invoked.
- The tests are never lost as developers leave or change; they remain a permanent defense against regressions introduced during maintenance.
The basic concept behind Mitopia’s module test approach is that every test step should convert the results of code execution to some form of human-readable string, and that by defining an ‘expected’ value for this string and comparing it to the actual ‘achieved’ value, all test steps can be reduced to a standardized form; the entire test step registration and execution process can then be formalized and controlled within Mitopia® so that all test steps remain available at all times. If the two strings match, the step passes; if they don’t match, the framework can print out the expected and achieved strings, and the tester can easily see where and why they differ. The first time someone unfamiliar with this approach hears this strategy, they invariably say: well … not everything can be converted to a string like that. What if it’s a comms thing, multi-threaded, causes errors, or perhaps is user interface stuff? How do you convert that to a string? My answer is that all of these kinds of things are module tested within Mitopia®, and we have found ways to convert all of them to meaningful strings. It is really very simple. So, given that premise, let’s go ahead and describe the basic structure of a module test and the APIs one uses to register and execute it:
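Before turning to the real code, a minimal sketch of the expected/achieved comparison that underpins every step may help; everything in it (names, limits, signatures) is an illustrative assumption rather than the actual Mitopia® API:

```c
#include <stdio.h>
#include <string.h>

#define kMaxTestString 4096            /* illustrative limit            */

typedef struct TestStep                /* one registered test step      */
{
    const char *name;                  /* human-readable step name      */
    void      (*stepFn)(char *achieved);  /* fills the achieved string  */
    const char *expected;              /* the expected result string    */
} TestStep;

static int runStep(const TestStep *step)   /* 1 on pass, 0 on fail      */
{
    char achieved[kMaxTestString] = "";

    step->stepFn(achieved);                      /* execute the step    */
    if ( !strcmp(step->expected, achieved) )     /* strings match?      */
    {
        printf(".");                             /* pass: a single '.'  */
        return 1;
    }
    printf("\nFAILED %s\nexpected: %s\nachieved: %s\n",
           step->name, step->expected, achieved);  /* fail: show both   */
    return 0;
}
```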
The screen shot to the left shows the final portion of the function popup within the development environment for the Lex.c package. Mitopia® coding standards dictate that every module have an initialization and a termination function, which must be named XX_Initialize() and XX_Terminate() where XX_ is the package prefix; they also mandate that these functions appear at the end of the source file. Part of the reasoning is that this allows the initialization function to ‘see’ and register the entirely ‘static’ module test, which the coding standards dictate should appear immediately above the initialization section as shown in the screen shot.
We can see from the screen shot that for the Lex package, the module test comprises 19 distinct test steps.
The module test itself is registered with the suite by calling DG_RegisterModule() as illustrated above. The function LX_CALLtest() provides the interface to the standard module test suite, and the second string parameter is the module name. This registration allows the module test to be run at any time from the built-in Administration window simply by choosing “Invoke a Module Test” and then picking the name of the module to be tested, as illustrated in the screen shot below.
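For concreteness, the registration call made from LX_Initialize() might look something like the sketch below; the exact DG_RegisterModule() signature is an assumption based on the description above:

```c
/* assumed prototypes -- the real declarations live in Mitopia(R) headers */
void DG_RegisterModule(void (*testFn)(void), const char *moduleName);
void LX_CALLtest(void);               /* the Lex test suite entry point   */

void LX_Initialize(void)
{
    /* ... normal package initialization ... */

    DG_RegisterModule(LX_CALLtest, "Lex");   /* make the Lex module test  */
}                                            /* available to the suite    */
```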
The screen shot to the right illustrates the code necessary to register the various test steps with the test suite so that it can call them. Note that LX_testInitialize() defines a global using Mitopia’s OC_MakeGlobal() function to reference the test context handle ‘cHdl‘. This makes the test context available to other code and threads throughout the system. In particular, should any test step cause an error report (in this thread or any other), the error logging facility will automatically insert the error details into the current ‘achieved’ string, because the module test suite registers a custom error handler for this purpose. Any test step that is not expecting the error will therefore fail.
Every package defines its own unique module test context type (in this case LX_testContext), and these always start with the defined Mitopia® type ET_TestContext, which is used by the module test suite to perform all its functions (see definitions below). In particular, this structure is used to track the registered test steps and to hold the expected and achieved strings for each step. Specific modules may add additional fields and structures to this generalized test context as required by their own module test code. In the case of the Lex test, it adds a lexical analyzer DB and a large buffer used internally by certain steps (see above).
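The actual definitions appear in the screen shots; the sketch below shows the layered layout being described. Only the embedded ET_TestContext, the ‘expString’ field (discussed shortly), and the Lex-specific additions come from the text; every other name and limit is an illustrative assumption:

```c
#define kMaxSteps      64                /* illustrative limits only      */
#define kMaxTestString 4096

typedef void (*TestStepFn)(void *cp);    /* signature of one test step    */
typedef void *ET_LexHdl;                 /* opaque lexical analyzer DB    */

typedef struct ET_TestContext   /* generic context used by the suite      */
{
    int        numSteps;        /* number of registered test steps        */
    TestStepFn steps[kMaxSteps];          /* the registered steps         */
    char       expString[kMaxTestString]; /* 'expected' for current step  */
    char       achString[kMaxTestString]; /* 'achieved' for current step  */
} ET_TestContext;

typedef struct LX_testContext   /* the Lex package's specialized context  */
{
    ET_TestContext c;           /* MUST come first: the suite sees only   */
                                /* this embedded generic portion          */
    ET_LexHdl      lexDB;       /* lexical analyzer DB (per the text)     */
    char          *buffer;      /* large scratch buffer for some steps    */
} LX_testContext;
```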
Given all these definitions, we can now look at some simple early test steps for the Lex package (remember, early steps tend to be like per-function unit tests):
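(The real steps are shown in the screen shot; the fragment below, building on the context sketch above, suggests the general shape of such a step. The LX_AddToken()/LX_Lex() calls are invented stand-ins for real Lex.c functions; only the ‘c.expString’ convention comes from the text.)

```c
#include <stdio.h>
#include <string.h>

int LX_AddToken(ET_LexHdl db, const char *text, int token);  /* assumed  */
int LX_Lex(ET_LexHdl db, const char *text);                  /* assumed  */

static void LX_testStep1(void *p)
{
    LX_testContext *cp = (LX_testContext *)p;

    strcpy(cp->c.expString, "foo=1 bar=2");         /* expected result    */

    LX_AddToken(cp->lexDB, "foo", 1);               /* exercise package   */
    LX_AddToken(cp->lexDB, "bar", 2);
    sprintf(cp->c.achString, "foo=%d bar=%d",       /* build 'achieved'   */
            LX_Lex(cp->lexDB, "foo"),
            LX_Lex(cp->lexDB, "bar"));
}   /* the framework compares expString/achString once the step returns   */
```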
As you can see, each step sets up the expected string by copying a constant into the ‘c.expString‘ field of the context; it then performs a sequence of actions using package functions in order to generate a matching ‘achieved’ string, thereby testing logical operation of the package. When the test is run, the framework invokes each step in sequence and compares expected and achieved strings when complete, outputting a single ‘.’ to the console for a successful step, or the mismatching test step and expected/achieved strings for a failure. It also monitors for memory leaks caused by the test. A successful run of the Lex module test therefore looks as follows:
Obviously, the test steps above are relatively simple. To illustrate more complex module-level tests, the screen shot below shows the first step in the Parse package test, which in fact tests creation and operation of a full C expression parser by comparing its results to those obtained from the equivalent C program. This single test step exercises a broad range of Parse.c functionality as well as correct operation of the underlying Lex.c package and many other features. In fact, module tests, if correctly targeted at the uppermost functions (which in turn require lower-level functions to operate correctly), can consist of a relatively small number of steps and still provide broad coverage.

Remember also, as the Parse example illustrates, that because Mitopia® code is so heavily layered on lower-level abstractions, module tests for dependent abstractions (in this case Parse) further confirm correct operation of lower-level abstractions (i.e., Lex) within the actual target environment. The full suite of registered module tests, if run (and passed), thus provides a fairly high degree of confidence that any given change made during maintenance did not break something else. Indeed, such tests in Mitopia® are also fundamentally a major part of integration testing, since they all occur within the real application, including of course access to persistent storage, GUI, and everything else. Some Mitopia® module tests further up the abstraction pyramid wipe the LOCAL server content (after a suitable warning to the user) and then mine and persist significant data sets, which they then use to test both client- and server-side operation.
As can be seen, for a relatively small one-time effort in creating and registering the test steps, the developer (and maintainer) using the Mitopia® module test suite gains a permanent way of verifying that all functionality in the module is working as intended, an easy way of stepping through code to understand it, and documentation of how package functions can be combined and invoked in order to accomplish complex things. Best of all, none of this is ever lost as developers leave or are replaced. Mitopia’s built-in module test suite thus represents a key pillar in making the code base resistant to ‘entropy’ and to bugs caused by ill-conceived maintenance actions. As we all know, most bugs are introduced during maintenance, and most of the cost of a piece of software is actually incurred hunting these bugs during maintenance (some estimates go as high as 90%).
Built-in module tests should therefore be a required feature, enforced by coding standards, and integrated into the actual product, in any software designed for longevity.
Much has been written on the subject of coding standards and conventions over the years, and they are often the subject of vigorous debate. In this post I want to give a brief overview of the MitoSystems (C language) coding standards and the philosophy behind them. First let us be clear: the purpose of coding standards is to improve the readability, reliability, and maintainability of the code base.
In any large project such as Mitopia®, wherein multiple people other than the original author must be able to rapidly understand and trace through code during a debugging session, the most important thing one can do to facilitate this is to maximize the degree to which all the code looks basically the same, and uses the same commenting, indenting, file organization, spacing, and underlying libraries and techniques. In a large code base, uniformity is critical to efficiency, reliability, and comprehension. Uppermost among these is comprehension, by the author and, more likely, by other code maintainers. It is during maintenance that most ‘entropy’ and bugs are introduced into code, through poor comprehension of the side effects of what appears locally to be a safe change.
For this reason, the arguments commonly put forward that coding standards inhibit creativity and need not be followed by star programmers (arguments most often put forward by those individuals themselves) merely reveal a lack of professionalism on the part of the proponent, and a failure to see the larger ‘picture’. I have posted before on the relative irrelevance of the programming language in achieving these goals, as I have on the relative irrelevance of the language metaphor, and the dangers of ‘COTS cobbling’ in this regard. There are no magic solutions to maintainability and robustness; in the end it is all up to the programmers and the degree of design and coding rigor they apply – from the bottom up, no shortcuts.
The truth is that for large, long-lived projects, maintainability (and thus project longevity, through resistance to entropy) is driven primarily by five things: (1) good requirements and initial design, (2) development of well-designed and layered generalized abstraction libraries organized as packages, (3) good coding standards, rigorously enforced, (4) pervasive (and maintained) module tests for all packages, and (5) other adaptivity techniques (see other posts on this site, e.g., here).
We have discussed (1) and (5) in previous posts and have illustrated (2) throughout this blog, so in this post I want to look at the important subject of coding standards (3) in more depth. We will look at Mitopia’s module test approach (4) in a future post.
We will use the simple function above to illustrate a number of MitoSystems’ basic coding standards:
Chief among these is the ENTER()/RETURN() formalism, which enforces a single exit point from every function. It does this by redefining the C reserved word ‘return‘ so that it cannot be used in the code directly, and so that if it is used twice in the same function, it will generate a compiler error/warning. Experience has shown that having multiple return statements dotted around a function is the #1 cause of leaks and other unintended side effects introduced by subsequent maintainers who have not fully understood all possible execution paths. By enforcing a single exit, all these debugging features become possible, and it also makes all functions look the same and perform their cleanup in a standard place. Moreover, notice the three definitions RETURN_ret, RETURN_void, and RETURN(res). In effect these enforce another standard, which is that the return value for ALL functions must be either ‘void‘ or ‘ret‘ (since RETURN(x) where ‘x‘ is anything else is undefined). Figuring out what a function is returning is a key time waster when stepping through code. Through these macros, this problem goes away: the function return value is always called ‘ret‘ (you can’t even return constants like true or false!). Again this standard is focused on the goal of rapid code understanding, and the ENTER()/RETURN() formalism is a huge part of enforcing this standardization in a way that programmers cannot simply work around.
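To make the mechanics concrete, here is one hedged way such a formalism could be implemented; the real Mitopia® macros are undoubtedly more elaborate (allocation tracking, tracing, and so on), and this sketch shows only the single-exit and ‘ret’-only enforcement described above:

```c
/* redefine 'return': expanding it plants a label, so any second return  */
/* statement in the same function is a duplicate-label compiler error    */
#define return          goto _singleExit; _singleExit: return

#define ENTER(fnName)   /* prologue: e.g., snapshot allocation state     */

#define RETURN_ret      return ret    /* value functions must return 'ret' */
#define RETURN_void     return        /* void functions                    */
#define RETURN(res)     RETURN_##res  /* RETURN(ret)/RETURN(void) resolve; */
                                      /* any other argument cannot compile */

static int addOne(int x)      /* every function declares 'ret' and has   */
{                             /* exactly one exit point                  */
    int ret = 0;

    ENTER(addOne);
    ret = x + 1;
    RETURN(ret);              /* the one and only return in the function */
}
```

With this scheme, a second RETURN (or a stray bare return) in the same function duplicates the ‘_singleExit’ label and the compiler refuses it, while RETURN(anythingElse) pastes to an undefined symbol and fails outright.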
The bottom line for all these basic standards and conventions is that they make the code easier to understand and test, and they completely automate the task of making documentation match code, as well as various other maintenance tasks that might otherwise be avoided by the lazy.
In addition to the basic MitoSystems coding standards described above, Mitopia® itself imposes a number of additional rules to further enhance reliability and programmer productivity within a large code base. The paragraphs below discuss some of these standards.
All code should utilize the standard abstractions provided by Mitopia® to manipulate data structure aggregations. These standard metaphors include the flat memory model, types and the ontology, string lists, lexical analyzers, and the database/persistent memory abstractions. In particular, the creation of any memory-resident structure that contains pointer links is strongly discouraged. The underlying abstractions are powerful enough to represent anything one might need and are highly optimized. By using them for all things, we make the operation of most areas of the code immediately familiar by analogy with other known code based on the same abstractions. Of course, they also minimize the code size and avoid introducing low-level bugs.
The code should be designed throughout to be as platform and architecture independent as possible (e.g., with respect to endian issues). This means overt declaration of the size of the variable implementation intended (as in the sequence int16, int32, int64). This runs somewhat contrary to C norms; however, experience has shown that programmers always have in their mind what size ‘int‘, ‘long‘, or ‘short‘ might be, even if not stated, with the result that code breaks badly over time through range and structure alignment changes mandated by compilers and the underlying processors. The only constant in this world is change, and change breaks all hidden assumptions. Better to be explicit in everything – including explicit structure padding if needed to guarantee universal alignment. The only exception to this rule is the C assumption that ‘long‘ is the same size as a pointer (though that may be either 32 or 64 bits depending on platform). If in doubt, use 64-bit values for integers and doubles for reals – with modern computers there is really no reason not to.
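As an illustration of the explicit-sizing and explicit-padding conventions (the int16/int32/int64 names come from the text; the structure itself is a made-up example):

```c
#include <stdint.h>

typedef int16_t int16;             /* explicitly 16-bit                  */
typedef int32_t int32;             /* explicitly 32-bit                  */
typedef int64_t int64;             /* explicitly 64-bit                  */

typedef struct DiskRecord          /* hypothetical persistent record     */
{
    int32   recordID;              /* bytes 0..3                         */
    int16   flags;                 /* bytes 4..5                         */
    int16   _padding;              /* bytes 6..7: explicit padding so    */
                                   /* the next field is 8-byte aligned   */
                                   /* on every platform                  */
    int64   timestamp;             /* bytes 8..15                        */
    double  value;                 /* bytes 16..23: 'double' for reals   */
} DiskRecord;                      /* 24 bytes everywhere                */
```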
Further to the platform-independence goal, all calls to the underlying OS toolbox (and C library for that matter) must be wrapped (and preceded by the prefix XC_, e.g., XC_NewPtr). These wrappers are all declared in the single header file ToolBoxMap.h, and the implementations (which simply call the toolbox routine) are all gathered into ToolBoxMap.c. This structure ensures that all toolbox/external calls emanate from ToolBoxMap.c, and the macros XC_fnName() declared in ToolBoxMap.h and used to invoke the wrapper functions can then be organized to go through a mapping table. This in turn allows any and all toolbox calls to be dynamically patched for debugging purposes, or replaced if one switches to an underlying OS that does not provide a given call. This allows calling code to be platform agnostic to the highest degree possible.
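A hedged sketch of how such a wrapping scheme can be arranged; the ToolBoxMap.h/ToolBoxMap.c split and the XC_NewPtr name come from the text, while the mapping-table details are assumptions:

```c
#include <stddef.h>
#include <stdlib.h>

/* --- ToolBoxMap.h ----------------------------------------------------- */
typedef void *(*XC_NewPtrFn)(size_t size);

typedef struct XC_MappingTable     /* one slot per wrapped call           */
{
    XC_NewPtrFn newPtr;            /* current implementation of NewPtr    */
    /* ... a slot for every other wrapped toolbox/library call ...        */
} XC_MappingTable;

extern XC_MappingTable gXC_Table;

#define XC_NewPtr(size)  (gXC_Table.newPtr)(size)   /* indirect call via  */
                                                    /* the mapping table  */

/* --- ToolBoxMap.c ----------------------------------------------------- */
static void *defaultNewPtr(size_t size) { return malloc(size); }

XC_MappingTable gXC_Table = { defaultNewPtr };

/* debugging code can patch gXC_Table.newPtr at run time to interpose     */
/* leak checking, logging, or a replacement on an OS lacking the call     */
```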
One key example of the use of this technique is to wrap all memory allocators and de-allocators and then substitute alternates that take advantage of the ENTER() and RETURN() formalism in order to implement leak checking and a variety of other critical debugging capabilities. In this way, all allocations can be fully analyzed by Mitopia® itself without requiring external debugging tools. This is essential in a heavily multi-threaded environment such as Mitopia®, where tracing memory ownership any other way becomes a real challenge. The end result is that you cannot find direct toolbox calls in any Mitopia® code other than the wrappers.
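A speculative sketch of the substitution idea: a debug allocator patched in via the mapping table above, with ENTER()/RETURN() bracketing a live-allocation count (all details assumed):

```c
#include <stdlib.h>

static long gLiveAllocations = 0;     /* live allocation count            */

static void *debugNewPtr(size_t size) /* patched into the mapping table   */
{
    gLiveAllocations++;
    return malloc(size);
}

static void debugDisposePtr(void *p)
{
    gLiveAllocations--;
    free(p);
}

/* ENTER() records gLiveAllocations on the way in; RETURN() compares on   */
/* the way out and reports a leak against the enclosing function if the   */
/* count has grown -- no external debugging tools required                */
```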
Packages should provide a complete suite of functions to manipulate the abstraction to which they relate, with all those functions (public and private) appearing within a single source file. Externally defined structure types should be avoided if at all possible. Instead, provide additional accessor functions to get at fields hidden within structures referenced via an abstract reference. Publishing structures is a sin, and inevitably leads to problems down the line with client code directly accessing a structure which may of course change later. Better to hide the data behind the package accessor functions. The entire Mitopia® code base publishes fewer than 100 structure types publicly (substantially less than one per Mitopia® abstraction package) despite having many times that number defined and used internally. On the other hand, there are literally thousands of public API calls grouped into packages. If a data structure must be published, use the ontology.
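A small sketch of the accessor pattern being advocated; all names here are hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* --- Widget.h: the public face -- no structure is published ---------- */
typedef void *ET_WidgetHdl;               /* opaque abstract reference    */

ET_WidgetHdl WG_NewWidget(const char *name);
const char  *WG_GetWidgetName(ET_WidgetHdl w);    /* accessor function    */

/* --- Widget.c: the real structure stays private to this file --------- */
typedef struct WG_Widget
{
    char name[64];                 /* free to change later without        */
    /* ... more hidden fields ... */ /* breaking any client code          */
} WG_Widget;

ET_WidgetHdl WG_NewWidget(const char *name)
{
    WG_Widget *w = (WG_Widget *)calloc(1, sizeof(WG_Widget));
    strncpy(w->name, name, sizeof(w->name) - 1);  /* would be XC_ calls   */
    return (ET_WidgetHdl)w;                       /* in Mitopia style     */
}

const char *WG_GetWidgetName(ET_WidgetHdl w)
{
    return ((WG_Widget *)w)->name;     /* only this file knows the layout */
}
```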
Mitopia® code tends to be very dense in terms of the number of functions called, compared to the kind of code one might find, say, in an open source project. The snippet below is typical:
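(That snippet is a screen shot; since it cannot be reproduced here, the fragment below, in which every name is invented, merely conveys the density being described, reusing the ENTER()/RETURN() sketch from earlier.)

```c
/* every name below is invented purely for illustration                  */
typedef void *ET_ServerHdl;

char *TM_GetTypeName(void *typeDB, int aType);     /* types package       */
void  SL_AppendString(void *list, const char *s);  /* string lists        */
void  DB_TouchRecord(void *db, int aType, int id); /* database layer      */
void  XC_DisposePtr(void *p);                      /* wrapped toolbox     */
void *SV_GetTypeDB(ET_ServerHdl s);
void *SV_GetHitList(ET_ServerHdl s);
void *SV_GetDatabase(ET_ServerHdl s);

static int SV_appendHit(ET_ServerHdl srv, int aType, int id)
{
    int   ret = 0;
    char *name;

    ENTER(SV_appendHit);
    name = TM_GetTypeName(SV_GetTypeDB(srv), aType);  /* types package    */
    SL_AppendString(SV_GetHitList(srv), name);        /* string lists     */
    DB_TouchRecord(SV_GetDatabase(srv), aType, id);   /* database layer   */
    XC_DisposePtr(name);                              /* wrapped toolbox  */
    ret = 1;
    RETURN(ret);
}
```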
As can be seen, the code utilizes abstraction functions from a variety of packages to do virtually everything; there is generally very little complex manipulation done locally. The function in the screen shot is part of the implementation of the MitoQuest database server and runs within the server; however, because even Mitopia’s database abstraction is built on the same fundamental underpinnings, the code looks identical to what a client-side function might be doing, and calls all the same kinds of packages. When combined with the other coding standards above, this tends to make all Mitopia® code look similar and instantly recognizable as to its purpose. This in turn makes all code easy to comprehend and maintain, in a way that I would argue is far more effective than the kinds of ‘placebo’ organizing metaphors offered by standard programming languages.