Friday 9 December 2011

Overloading the meaning of configuration settings is a smell

Is there anything wrong with this?

if (settings[“mysetting”] != string.Empty) 
{ 
    DoSomethingWith(settings[“mysetting”]);
}

The meaning of the configuration setting "mysetting” is overloaded. One of its possible values is special, and is used to turn some behaviour on, or off. It’s like overloading the return value of a counting method to handle error codes as well as real values; like returning –1 on failure to count, and the real count if successful.

Notionally, although the empty string is in the valid set of possible strings, it may not be in the possible set of valid values for this particular setting, in which case you might argue that it’s reasonable to take one of the invalid values and overload on it. However, that hides the fact that there’s a logical difference between turning a behaviour on, or off, and configuring that behaviour. Generally it leads to branching in the code, which makes it more complex.

That’s apart from the fact that the empty string might actually be a valid value one day.

Instead, it makes sense to think about why you need to turn the behaviour on and off, rather than just configuring it. We hit this problem recently when we wanted to add some configuration that allowed us to replace a specific string with another string[*]. Specifically we wanted to override the name of a callback numerical solver in a task definition. Originally it was using “NewtonRaphson”, and we wanted to use “SomeNAGMethod”.

I mulled over two approaches with a friend – first, hardcoding the new value and having a bool configurable flag to use it or not; and a single string configurable value for which string.Empty turns the behaviour off, and all other values actually configure the behaviour.

<add key=“FixSolverMethod” value=“false”/>
...
settings[“FixSolverMethod”] = “true”;  // to override

or

<add key=“SolverMethod” value=“”/> 
... 
settings[“SolverMethod”] = “SomeNAGMethod”; // to override

Both were tied to the idea that this was behaviour that needed to be turned off an on, and as a result implied the need for decision making at the point of use, as well as actual use of the value.

if(setting[“FixSolverMethod”]) FixSolverMethod(); 
    // which then uses a hardcoded method name

or

if(setting[“SolverMethod”] != string.Empty) 
    OverrideSolverMethod(setting[“SolverMethod”]);

I even toyed with the idea of configurable configuration items:

<add key=”SolverMethod” value=”SomeNAGMethod” enabled=”false”/>

but that way madness lay.

It felt wrong to me, and I couldn’t figure out why until I spotted this smell; that the meaning of the value was being overloaded. I realised then that I could simply extend the notion of what we were doing. Instead of turning the behaviour on or off, I could just turn leave it on and have the default value for the configurable setting set the same as the existing value.

// at initialization – gives us a default
<add key=(“SolverMethod”, value=“NewtonRaphson”/>  
...
// some time later, when we want to reconfigure. alternatively we could
// just set the new method name in the xml above ...
settings[“SolverMethod”] = “SomeNAGMethod”;  
...
// finally, just use it
UpdateSolverMethod(setting[“SolverMethod”]); 

In this case no branching is needed – the solver method name is always overwritten. At worst it’s a wasted operation, as it’s overwritten with the string that’s already there.

Once you get to that point you start thinking about the code more generally. Is this the right place to be doing this? If we’re having to fix this up in every case, can’t we do it where these definitions are created? That led us to explore parts of the codebase that we didn’t have direct experience of, and what did we find? That we had control over the tasks definitions directly – we were creating them, but in a part of the code neither of us knew about. We could keep the configuration setting we’d created, but apply it at source, logically and consistently, to configure the tasks instead of patching them up downstream.

Overloading configuration settings is a code smell. It’s worth thinking – can I set a default that matches the existing behaviour, and apply it all the time? And if I can, does that mean it’s better done somewhere else?

[*] Ah, the glamour of corporate development!

  

2 comments:

Chris Oldwood said...

An empty string is a funny beast, as I recently described in my post "Null String Reference vs Empty String Value" (http://chrisoldwood.blogspot.com/2011/11/null-string-reference-vs-empty-string.html)

When it comes to configuration data I generally equate it to NULL, meaning no setting value provided. Consequently I don't see this as "overloading", more as treating the value as nullable.

If the setting is an integer value, an empty string still has the same meaning (NULL) and therefore I believe the behaviour is consistent.

Of course that leaves the question of what NULL means. I'm not sure it can be much apart from what the internal "default" is.

Perhaps the real smell is that the configuration API returns empty strings instead of null string references for missing values. Would that make it less ambiguous?

Tim Barrass said...

I think if the intent was to say that there's no setting provided it'd make more sense to not provide a setting, rather than setting it to an empty string.

The behaviour you describe is certainly consistent across different types -- but I don't think you've made a case for it not overloading the meaning. Although you equate it to NULL it's not actually NULL -- you're glossing over the fact that there needs to be some interpretation of a specific string value to infer NULL.