Friday, November 30, 2012

Static member functions and const-ness

A colleague of mine was a little miffed with C++ a few days ago. She was trying to create a class with a static member function that did a simple translation. Since it wasn't modifying any class data (static or member) and was doing a pure translation, she defined it as below:



#include <string>

#include <cstdint> //need c++0x

class SomeClassWithStaticAndPrivateData {
public:
  enum EnumInt {
    Unknown,
    Zero,
    One,
    Two,
    Three
  } ;
  //snip

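  // std::string::find returns npos (not 0) when there is no match, so compare explicitly
  #define wart_TRANSLATE_CASE(enumInt) if(str.find(#enumInt) != std::string::npos) return enumInt;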
  static EnumInt translateStringToEnumInt(const std::string& str) const {
     wart_TRANSLATE_CASE(Zero)
     wart_TRANSLATE_CASE(One)
     wart_TRANSLATE_CASE(Two)
     wart_TRANSLATE_CASE(Three)
     return Unknown;
  }
  #undef wart_TRANSLATE_CASE
private:
  static int32_t sInt;
  int32_t instanceInt;
};


Sure enough, the compiler gave the following error:


error: static member function ‘static SomeClassWithStaticAndPrivateData::EnumInt SomeClassWithStaticAndPrivateData::translateStringToEnumInt(const string&)’ cannot have cv-qualifier


Her question was: "Why is it wrong to indicate to the compiler that I'm defining a static function, and that it isn't going to change any static variables of the class at all?"

At first blush, her idea seems perfectly reasonable: you want the compiler to provide protection against any inadvertent modification of the class' data members. So why doesn't the C++ standard allow this? Is it just an oversight?

Well, I don't think so. I think the C++ standardizers, whatever might be their faults, did spend quite some time thinking about the specific capabilities and features of the language. I think they are battling against the notion of a false sense of security, while trying to maintain consistency.

What does it mean to have a const member function? And what does it mean to have a static member function?

Non-static data members are the norm. These are data members that are created for each instance of a class. Static members, on the other hand, belong to the class and not to any particular instance or object. They are created once and are shared by all object instances of that class.

Similarly, non-static member functions (methods) are the norm. These are methods that take an implicit pointer to the object they are associated with: the "this" pointer. Any instance data member accessed can be thought of as being accessed via a this->data_member dereference. Since such an access is the norm, the "this->" dereference operation may be omitted for notational convenience, but it is nonetheless present.

Using the class above, a hypothetical non-static member function foo can be looked upon as a function of the type:

void SomeClassWithStaticAndPrivateData::foo(SomeClassWithStaticAndPrivateData* this);
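The hidden argument can be made visible with std::mem_fn, which wraps a member function so that the object is passed explicitly. This is only a small sketch: Widget, get, get_short and call_with_explicit_object are made-up names for illustration, not part of the class above.

```cpp
#include <functional>

// Hypothetical stand-in for the class discussed in this post.
struct Widget {
    int value;

    int get() { return this->value; }  // "this->" spelled out
    int get_short() { return value; }  // same access, "this->" omitted
};

// Calls Widget::get with the object handed over as an explicit argument.
inline int call_with_explicit_object(Widget& w) {
    auto get = std::mem_fn(&Widget::get);  // wraps the member function
    return get(&w);                        // &w plays the role of "this"
}
```

Calling call_with_explicit_object(w) gives the same result as w.get(); the only difference is who writes down the object argument.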

A static member function, on the other hand, belongs to the class and is shared by every instance of the class, so it does NOT need to refer to (in other words, cannot implicitly refer to) a particular instance of the class. Consequently, it does not need a special / implicit "this" pointer.

A static member function is therefore very much like a plain vanilla "C" function in that there is no hidden pointer passed implicitly by the compiler. Its signature carries no hidden parameter. (The symbol is still name-mangled with its class scope, but that's a topic for another day when we examine nm and c++filt.) A public static member function can be approximated as a simple "C" function inside a namespace that just happens to have the name of the class it is defined in. Without an object to work on, it can directly access only the static data members and static methods of that class; non-static members are reachable only through an explicitly supplied object. Continuing with the example above:


namespace SomeClassWithStaticAndPrivateData {
 void static_foo() {
   sInt = 42; //ok to access static data members
   //instanceInt = 0; //ERROR: cannot access without a this*
 }
}
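One way to check the "plain function" claim is to look at the pointer types involved. The sketch below uses made-up names (Gadget, member_foo, static_foo, free_foo, call_through_plain_pointer) and assumes a C++11 compiler.

```cpp
#include <type_traits>

// Hypothetical class with one non-static and one static member function.
struct Gadget {
    int member_foo() { return 1; }
    static int static_foo() { return 42; }
};

int free_foo() { return 0; }

// The address of a static member function is an ordinary function pointer,
// exactly the same type as the address of a free function.
static_assert(std::is_same<decltype(&Gadget::static_foo), int (*)()>::value,
              "static member function: plain function pointer");
static_assert(std::is_same<decltype(&Gadget::static_foo), decltype(&free_foo)>::value,
              "same pointer type as a free function");

// The address of a non-static member function is a pointer-to-member,
// which cannot be called without supplying an object.
static_assert(std::is_same<decltype(&Gadget::member_foo), int (Gadget::*)()>::value,
              "non-static member function: pointer-to-member");

// Stores the static member function in a plain function pointer and calls it.
inline int call_through_plain_pointer() {
    int (*fp)() = &Gadget::static_foo;
    return fp();  // no object involved anywhere
}
```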


The const qualifier on a non-static member function is a promise that the method will not modify the object it is called on, that is, its non-static data members (mutable members aside). It says nothing about static data members, which a const method remains free to modify. This is achieved by passing "this" as a pointer to a const object of the class. Continuing with the example above, a const_foo function may be looked upon as:

void SomeClassWithStaticAndPrivateData::const_foo(const SomeClassWithStaticAndPrivateData* this);
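The type of "this" can be inspected directly with decltype. A minimal sketch, assuming C++11; Probe and its member functions are hypothetical names used only for this illustration.

```cpp
#include <type_traits>

// Hypothetical class used to inspect the type of "this".
struct Probe {
    int instanceInt = 0;

    // In a const member function, "this" is a pointer to a const object.
    bool this_is_pointer_to_const() const {
        return std::is_same<decltype(this), const Probe*>::value;
    }

    // In a non-const member function, "this" points to a modifiable object.
    bool this_is_pointer_to_non_const() {
        return std::is_same<decltype(this), Probe*>::value;
    }

    int read() const {
        // instanceInt = 1;  // would not compile: *this is const here
        return instanceInt;
    }
};
```

Both checks return true, which is the whole mechanism behind a const method: the qualifier lands on the object that "this" points to.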

Which brings us to the issue with the const qualifier of a static member function: since it doesn't take a this*, there is nothing to which the const qualifier can be applied.
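For completeness, here is a sketch of how the translation function compiles once the trailing const is dropped. Translator is a made-up class name, the macro is expanded by hand, and the substring match uses an explicit comparison against std::string::npos; treat it as one possible shape rather than the definitive fix.

```cpp
#include <string>

// Hypothetical trimmed-down version of the class from the start of the post.
class Translator {
public:
    enum EnumInt { Unknown, Zero, One, Two, Three };

    // No trailing const: a static member function has no "this" to qualify.
    // The const on the parameter is what protects the caller's string.
    static EnumInt translateStringToEnumInt(const std::string& str) {
        if (str.find("Zero") != std::string::npos) return Zero;
        if (str.find("One") != std::string::npos) return One;
        if (str.find("Two") != std::string::npos) return Two;
        if (str.find("Three") != std::string::npos) return Three;
        return Unknown;
    }
};
```

It is called on the class rather than on an object, for example Translator::translateStringToEnumInt("Two").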

More philosophically, if we did allow the const qualifier on a static member function, it would be difficult for the compiler to guarantee that a static member function declared const wouldn't actually modify any data member. Consider the contrived example below. static_const_foo represents the hypothetical abomination that is a static method with a const qualifier.


//defining global and static vars 
int32_t *p_global_Int = NULL;

//defining static data members
int32_t SomeClassWithStaticAndPrivateData::sInt = 0;

void SomeClassWithStaticAndPrivateData::foo() {
   if(NULL == p_global_Int) {
      p_global_Int = &sInt;
   }

   // do other processing

}

void SomeClassWithStaticAndPrivateData::static_const_foo() const {
   if(NULL != p_global_Int) {
      (*p_global_Int)++;
   }
   //try to do something else
}

In and of itself, the definition of static_const_foo() is completely legal. It is taking a global pointer, null checking it for bonus points, and then incrementing the non-null referenced integer. Perfectly valid.

Similarly, the definition of foo() is also completely legal. If the global pointer is not already filled, it innocently points it to the class' static data member. For additional horror points, it even points to private data, breaching any protection that the compiler might offer, but that's a separate story (for details, check out Scott Meyers' excellent book "Effective C++: 55 Specific Ways to Improve Your Programs and Designs", p. 126, Item 28, "Avoid returning handles (references, pointers or iterators) to object internals"). What it did is completely legal and well within its rights and the rules of C++.

Taken together though, they represent not only a moral sin, but an actual violation of the guarantees purportedly provided to the compiler: a function that declared itself as not changing any static data members exhibits the highest depravity by fondling its supposedly const private members. And there would be no easy way for the compiler to detect, let alone prevent this. Since this level of service cannot be guaranteed, it is NOT provided by the compiler. And since this level of service cannot be provided, it is best to explicitly disallow it so that the developers are aware that they are on their own in this regard.
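The interplay can be tried out if the hypothetical const qualifier is dropped so that the code compiles. In this sketch Leaky, static_foo and peek are names I made up; the point is only that static_foo never mentions sInt, yet changes it through the global alias.

```cpp
#include <cstdint>

// Global alias, as in the contrived example above.
std::int32_t* p_global_Int = nullptr;

// Hypothetical class with a private static data member.
class Leaky {
public:
    // Publishes the address of the private static member on first use.
    void foo() {
        if (p_global_Int == nullptr) {
            p_global_Int = &sInt;
        }
    }

    // Never names sInt, but modifies it once foo() has leaked its address.
    static void static_foo() {
        if (p_global_Int != nullptr) {
            ++(*p_global_Int);
        }
    }

    // Read-only accessor so the effect can be observed from outside.
    static std::int32_t peek() { return sInt; }

private:
    static std::int32_t sInt;
};

std::int32_t Leaky::sInt = 0;
```

Before foo() runs, static_foo() does nothing; afterwards, every call bumps the private static member, and nothing in static_foo's own body tells the compiler so.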
