Talk:Coding Standards

From Sirikata Wiki
Jump to navigation Jump to search

There is no mention of const-correctness in the standards page. (Sorry, there IS; I just missed it on first read.) Where it talks about the use of get/set prefixes, the example given is incomplete. It says to write,

int getAge();
void setAge( int Age );

It should say to write

int getAge() const;
void setAge( int anAge );

(Besides, someone might expect Age to be a float, or an unsigned, or a size_t, or a special class with years and months and days... There's probably something to be said for choosing a name that better reflects the underlying representation; --NOT talking about "iAge"; rather a name like "YearsOfAge", and its type to be a size_t, unless it can be negative.)

Const-correctness essentially boils down to making const anything that CAN be const. Sometimes the constness of functions is made difficult by intrinsic linkage, such as when objects have a pointer to next, and are part of an LRU cache that should update on reads. Well, such situations call for making that pointer to next mutable; --after all, it is not really a part of the class, philosophically speaking--, rather than compromise constness on the class interface.

Another subject not mentioned is command/query separation. This is an enormoulsy helpful discipline that makes for clean and readable code when adhered to religiously. Command/Query separation says that there are two types of functions that should never be mixed or combined in one: Queries return a value, ALWAYS; and as class member functions are always ...() const. Commands are NEVER const, and ALWAYS return void. And what people new to this concept ALWAYS ask is "what about returning a success or failure status?". Answer is NEVER do that. Failures should be "returned" by throw()- -ing an exception. Status returns have been depracated for years now.

Another subject not addressed is, in fact, exceptions, and the huge importance of marking functions that do NOT throw excetions simply with a comment

void f() // no-throw

Why? Because the big gottcha with exceptions is that you DON'T want them to happen inside a constructor; so by flagging non-throwing functions //no-throw you can, at a glance, tell if you can call it from a ctor.

Another subject not addressed is the use of "explicit" for single arg ctor's.

Another subject not addressed is avoiding like the plague the terrible practice of using special values with floats to convey additional state. This bad practice is a time-bomb, and a documentation nightmare. Someone modifying a piece of code using one such variable cannot be expected to look up the declaration of every variable to scan visually for comments about them; so a variable that can have special values meaning special things should be accompanied by comments about these special values and their meanings everywhere they are used!!! Just pack a bool with the float into a struct, like

struct fuel_capacity
{
   float litres;
   bool this_is_a_station__dogbreath__aint_got_no_fuel_tank;
};

rather than signify no fuel use by a -1. value. Or if you are really keen on the efficiency aspect, at least wrap that float into a class, like

class fuel_capacity
{
  float litres;
public:
  bool is_applicable() const { return litres >= 0.0f );
  float get_capacity() const
  {
    assert( is_applicable() );
    return litres;
  }
  ...
};

Another subject not addressed is variable and function names, and how to make them intuitive and clear. For example, boolean variables should always read like statements that can only be true or false. E.g,

bool light_is_on; bool tank_has_enough_fuel;

etc. A function that returns the state of a light could be similarly named, but with a verb at the front,

bool light::is_on() const;
bool tank::has_enough_fuel_for( trip const & trp ) const;

A function that turns a light on should be named like

void light::turn_on();

Or if a boolean variable controls the light, it could be named,

bool light_enable_switch;

There should NEVER be ambiguously named variables or functions like

light_on //BAD!!! Is this a switch that turns it on?, a status?, or what?

Least of all, something like

int light_on( double x, int y ); // HORROR

Another very good practice not mentioned is avoiding inheritance of concrete classes. All hierarchies trees should be abstract, having a dtor like,

virtual ~aClass() = 0;

with only the leafs being concrete. Adhering to this discipline avoids many dangers associated with automatically generated assignment operators and constructors that can lead to slicing or simply non-sensical assignment across inheritance levels. It also allows you to make concrete class destructors non-virtual, if you know that a concrete class will never be inherited. There'd be many exceptions to this rule, however; the idea is merely to think twice... make that three times.. before having one concrete class inherit another concrete class. Avoid whenever possible.

Also, avoiding virtual functions in the public interface of classes. The preferred method is to have public non-virtual, inline functions that call private virtual functions

class unit
{
public:
  void fly(){ _fly(); }
private:
  virtual void _fly();
};

Exceptionally, a virtual function could be protected:

class unit
{
public:
  void fly(){ _fly(); }
protected:
  virtual void _fly();
};

The semantic difference here is that if you make _fly() private, you are communicating to other programmers that derived class overrides of _fly() need not call nor know anything about the base class' _fly() function. In the latter case, you are communicating to programmers that derived class overrides of _fly() SHOULD call the base-class _fly() function; that is, that they should add to, or "decorate" the original function, rather than replace it. But if you make fly() a public virtual function, you're communicating to other programmers that you just don't know the language and its idioms very well ;-)

Separating "data" from "code" or "behavioral" classes is also a good idea. Classes representing passive data should feature a complete set of initializing constructors and assignment operators, as well as binary operators for comparisons and/or arithmetic if applicable, and explicit initializations to/from base classes; but no other functionality. Behavioral classes should often have a single data member being a pointer to a struct containing the "real" data members; --that is, they should often use the PIMPL idiom, and often should have their default assignment operator and constructor disabled by being declared private and left undefined. Functions in code classes access variables in the struct via the single pointer member. This practice avoids excessive header information exposure. Clients of a PIMPL only need information about its public functions, --not about its data members; and having data members in its main header implies that their type definitions must also be included, which calls for additional header files, and so on, which can often lead to vicious header inclusion circles.

Global space and non-class-function calls should be prefixed :: to make it clear that they are not calls to member functions. This would include functions from static and dynamic libraries. Thus, rather than a = floor( b ), write a = ::floor( b ).

Another very helpful standard to adopt is DBC (Design By Contract). DBC cannot be done fully in C++. Long story. But at least a subset can be achieved. Every class should have a private, inline function void _chk_invariants() that is defined as empty in release mode but has code in debug mode. The debug mode code checks the state of the class for coherence.

class myString
{
  char str[32];
  size_t len;
public:
  ...
private:
  ...
  void _chk_invariants()
  {
    assert( len < 32 );
    assert( len >= 0 );
    assert( ::strlen(str) == len );
  }
};

Then, all class functions can have calls to _chk_invariants(): at the starts of queries and at the ends of commands. Thus, queries make sure that nothing's wrong with the class' state before answering the query; and commands make sure that they leave the class in a coherent state after they execute. Additionally, every function should begin with assertions about its input parameters, to make sure the input they get is within specified range; that pointers are not null, etc. This falls far short of DBC as implemented in the Eiffel language; but it's a start. Note that input check asserts should NOT even look at private class or function variables; only input variables. Why? Because it would be an unfair contract to expect the caller to ensure anything about the state of the class, which it cannot see. Just for kicks, adding assertions before the ends of queries, ensuring that output is within specified range; that pointers returned are not null, etceteras, would be nice. In the case of commands, the call to _chk_invariants() is sufficient.

Avoid confusing code errors and debugging with run-time exceptions. These are two totally different paradigms and should not be mixed. Checking for possible consequences of code error is what debugging is all about; and the standard way to deal with code errors is by assert()'s. The difference between code errors and exceptions is that code errors are bad things that "should NOT happen", whereas run-time exceptions are bad things that one foresees WILL happen, such as the user having removed a CD that the app needed. The standard way to deal with run-time exceptions is to throw( new disk_not_found() ); then catch the exception at some convenient place up the stack, and deal with it. Avoid using exceptions for debugging; or asserts for run-time exceptions. There's admittedly a gray area, however: Code is never perfect; and sometimes it may be desirable to deal with a code error as an exception rather than to terminate an application; but this should be rarely done; only for types of bugs that are known but seemingly intractable; and should be documented as such. Also, error-logging tends to straddle both worlds; but, by the same token, it may be preferable to have separate files for debugging and exception logs. Just have __ASSERT__ and __THROW__ macros that direct messages to two separate files.