This document defines the C++ coding style for Wildfire, Inc. It also tries to provide guidelines on how to use the various features found in the C++ language. The establishment of a common style will facilitate understanding and maintaining code developed by more than one programmer as well as making it easier for several people to cooperate in the development of the same program. In addition, following a common programming style will enable the construction of tools that incorporate knowledge of these standards to help in the programming task.
Using a consistent coding style throughout a particular module, package, or project is important because it allows people other than the author to easily understand and (hopefully) maintain the code. Most programming styles are somewhat arbitrary, and this one is no exception. In the places where there were choices to be made, we attempted to include the rationale for our decisions.
This document contains rationale for many of the choices made. Rationale will be presented with this paragraph style.
One more thing to keep in mind is that when modifying an existing source file, the modifications should be coded in the same style as the file being modified. A consistent style is important, even if it isn't the one you usually use.
However, there are many variations in style that do not interfere with achieving these goals. This style guide is intended to be the minimum reasonable set of rules that accomplish these ends. It does not attempt to answer all questions about where every character should go. We rely upon the good judgement of the programmer as much as possible.
This guide presents things in "programming order", that is, notes, rules, and guidelines about a particular programming construct are grouped together. In addition, the sections are in an order that approximates that used to write programs.
The section Miscellaneous on page 32 contains many useful tidbits of information that didn't fit well into any of the other sections.
Finally, there is a Bibliography and Reading List at the end of this document that contains quite a few titles. Many of the books there should be considered mandatory reading --- if nothing else, buy and read a copy of both the ARM [10] by Ellis & Stroustrup and Effective C++ [12] by Scott Meyers. Coplien's Advanced C++ Programming Styles and Idioms [13] is also highly recommended.
A good style guide can enhance the quality of the code that we write. This style guide tries to present a standard set of methods for achieving that end.
It is, however, the end itself that is important. Deviations from this standard style are acceptable if they enhance readability and code maintainability. Major deviations require an explanatory comment at each point of departure so that later readers will know that you didn't make a mistake, but are purposefully making a local variation for a good cause.
A good rule of thumb is that 10% of the cost of a project goes into writing code, while more than 50% is spent on maintaining it. Think about the trade-offs between ease-of-programming now vs. ease-of-maintenance for the next 5 to 10 years when you consider the rules presented here.
The C++ programming language differs substantially from the C programming language. In terms of usage, C is more like Pascal than it is like C++. This style guide differs from traditional C style guides in places where the "C mindset" is detrimental to the object-oriented outlook desired for C++ development.
Code should compile without errors or warnings. "Compile" in this sense applies to lint-like code analyzers, standard-validating compilers (ANSI-C++, POSIX, Style Guide Verification, etc.), and C++ compilers on all supported hardware/software platforms.
Try to pick filenames that are meaningful and understandable. File names are not limited to 14 characters. The following table shows the file naming conventions we will use:
---------------------------------------------
File Contents               Name
---------------------------------------------
C++ Source Code             filename.cc
C++ Header File             filename.hh
C Source Code               filename.c
C Header File               filename.h
Object Code                 filename.o
Archive Libraries           filename.a
Dynamic Shared Libraries    filename.so.<ver>
Shell Scripts               filename.sh
Yacc/C Source Code          filename.y
Yacc/C++ Source Code        filename.yy
Lex/C Source Code           filename.l
Lex/C++ Source Code         filename.ll
Directory Contents          README
Build rules for make        Makefile
---------------------------------------------
POSIX specifies a maximum of 14 characters for filenames, but in practice this limit is too restrictive: source control systems like RCS and SCCS use 2 characters; the IDL compiler generates names with suffixes appended, etc.
const char *s1 = "hello\n"
                 "world\n";         // s1 is exactly the same as s2,
const char *s2 = "hello\nworld\n";
The line length limit is related to the fact that many printers and terminals are limited to an 80 character line length. Source code that has longer lines will cause either line wrapping or truncation on these devices. Both of these behaviors result in code that is hard to read.
No #pragma directive should be used.
#pragma directives are, by definition, non-standard, and can cause unexpected behavior when compiled on other systems. On another system, a #pragma might even have the opposite meaning of the intended one.
In some cases #pragma is a necessary evil. Some compilers use #pragma directives to control template instantiations. In these rare cases the #pragma usage should be documented and, if possible, #ifdef directives should be used to ensure other compilers don't trip over the usage.
(See #error directive and 4.2 Conditional Compilation (#if and its ilk).)
Header files should be functionally organized, with declarations of separate subsystems placed in separate header files. For class definitions, header files should be treated as interface definition files.
#include "name" construct.
The required ordering in header files is as follows:
(1)
:
// Copyright 1992 by Wildfire Communications, Inc.
// remainder of Wildfire copyright notice
Don't place anything other than the copyright text in this comment; the whole comment will be replaced programmatically to update the copyright text.
An #ifndef that checks whether the header file has been previously included, and if it has, ignores the rest of the file. The name of the variable tested looks like _WF_FILE_HH, where "FILE_HH" is replaced by the header file name, using underscore for any character not legal in an identifier. Immediately after the test, the variable is defined.
#ifndef _WF_FILENAME_HH
#define _WF_FILENAME_HH
The $Header: /usr/local/cvsroot/sourcepole/sources/programming/cpp/Wildfire-C++Style.html,v 1.1 2000/10/05 12:27:59 pi Exp $ variable should be placed at the end of the block comment, or in a comment immediately following it:
// $Header: /usr/local/cvsroot/sourcepole/sources/programming/cpp/Wildfire-C++Style.html,v 1.1 2000/10/05 12:27:59 pi Exp $
Since implementations will change, code that places "implementation-required #includes" in clients could cause them to become tied to a particular implementation.
The following items are a suggested order. There will be many times where this ordering is inappropriate, and should be changed.
const declarations.
class, struct, and union declarations.
struct or union declarations.
typedef declarations.
class declarations.
The rest of these items should be found in this order at the end of the header file.
enum or const can be used to reduce the need for globals; if they are still required they should be either file-scope static or declared extern in a header file.
#endif need be followed by a comment describing the #ifdef head guard.
#ifndef/#endif multiple inclusion guard (see Template for C++ Implementation files on page 45). After that the order should be:
const class static member variables.)
static) variable definitions.
# in column 1. No indentation allowed for preprocessor directives.
#include <module/name> to get public header files from a standard place. The -I option of the compiler is the best way to handle the pseudo-public "package private" header files used when constructing libraries; it permits reorganizing the directory structure without altering source files.
Macros (#define)
Macros are almost never necessary in C++.
#define NAME value should never be used. Use a const or enum instead.
The debugger can deal with them symbolically, while it can't with a #define, and their scope is controlled and they only occupy a particular namespace, while #define symbols apply everywhere except inside strings.
Macros should be used to hide the ## or #param features of the preprocessor and encapsulate debugging aids such as assert(). (Code that uses these features should be rare.) If you find that you must use macros, they must be defined so that they can be used anywhere a statement can. That is, they cannot end in a semicolon. To accomplish this, multi-statement macros that cannot use the comma operator should use a do/while construct:
#define ADD(sys,val) do { \
        if (!known_##sys(val)) \
            add_##sys(val);\
    } while(0)
This allows ADD() to be used anywhere a statement can, even inside an if/else construct:
if (doAdd)
    ADD(name, "Corwin");
else
    somethingElse();
It is also robust in the face of a missing semicolon between the ADD() and the else.
This technique should not be used for paired begin/end macros. In other words, if you have macros that bracket an operation, do not put a do in the begin macro and its closing while in the end macro.
This makes any break or continue between the begin and end macro invocations relative to the hidden do/while loop, not any outer containing loop.
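As a minimal sketch of the do/while technique described above, the following compiles the ADD() macro against a small registry. The names wfNames, known_name(), add_name(), and WfAddIfRequested() are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical registry used only for this illustration.
static std::vector<std::string> wfNames;

static bool known_name(const std::string &val)
{
    for (std::size_t i = 0; i < wfNames.size(); i++) {
        if (wfNames[i] == val)
            return true;
    }
    return false;
}

static void add_name(const std::string &val)
{
    wfNames.push_back(val);
}

// Multi-statement macro wrapped in do/while(0); it does not end in a semicolon.
#define ADD(sys, val) do { \
        if (!known_##sys(val)) \
            add_##sys(val); \
    } while (0)

// Adds 'who' if requested, otherwise empties the registry.
// Returns the number of registered names afterwards.
static std::size_t WfAddIfRequested(bool do_add, const std::string &who)
{
    if (do_add)
        ADD(name, who);     // expands to a single statement
    else
        wfNames.clear();
    return wfNames.size();
}
```

Because the macro body is one do/while statement, the else still pairs with the outer if rather than with the if hidden inside the macro.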
Conditional Compilation (#if and its ilk)
In general, avoid using #ifdef. Modularize your code so that machine dependencies are isolated to different files and beware of hard coding assumptions into your implementation.
The # of all preprocessor commands must always be in column 1.
If you use #ifdef to select among a set of configuration options, you need to add a final #else clause containing a #error directive so that the compiler will generate an error message if none of the options has been defined:
#ifdef sun
#define USE_MOTIF
#define RPC_ONC
#elif defined(hpux)
#define USE_OPENLOOK
#define RPC_OSF
#else
#error unknown machine type
#endif
USE_STREAMS or USE_SOCKETS, not for predefined system names like sun, hpux, SYSV, etc., that you happen to "know" support one or the other form.
#define BEGIN {             // EXTREMELY BAD STYLE!!!
#define when break;case     // EXTREMELY BAD STYLE!!!
This makes the program unintelligible to all but the perpetrator. C++ is hard enough to read as it is.
#else, #elif, and #endif should have commented tags identifying the #if construct to which they are attached if there are several levels of ifdefs or more than a page worth of code is placed between the #ifdef and #endif.
#ifdef RPC_ONC
    doONCStuff();
#endif
#ifdef DEBUG
    if (!debug)         // #ifdef breaks standard braces rule
#endif
    {
        doSomeStuff();
        doMoreStuff();
    }
Identifier naming conventions make programs more understandable by making them easier to read. They also give information about the purpose of the identifier. Each subsystem should use the same conventions consistently. For example, if the variable offset holds an offset in bytes from the beginning of a file cache, the same name should not be used in the same subsystem to denote an offset in blocks from the beginning of the file.
We have made an explicit decision to not use Hungarian Notation.(2)
Well chosen names go a long way toward making a program self-documenting. What is an obvious abbreviation to you may be baffling to others, especially in other parts of the world. Abbreviations make it hard for others to remember the spelling of your functions and variables. They also obscure the meaning of the code that uses them.
foo and FOO. Having a type name and a variable differing only in case (such as String string;) is permitted, but discouraged.
Identifiers are either upper caps, mixed case, or lower case. If an identifier is upper caps, word separation in multi-word identifiers is done with an underscore (for example, RUN_QUICK). If an identifier is mixed case, it starts with a capital, and word separation is done with caps (for example, RunQuick). If an identifier is lower case, words are separated by underscore (for example, run_quick). Preprocessor identifiers and template parameters are upper case. The mixed case identifiers are global variables, function names, types (including class names), class data members, enum members. Local variables and class member functions are lower case.
Template parameter names act much like #define identifiers over the scope of the template. Making them upper case calls them out so they are readily identifiable in the body of the template.
An initial or trailing underscore should never be used in any user-program identifiers.(3)
Prefixes are given for identifiers with global scope (some packages may extend the prefixes for their identifiers):
----------------------------------------------------------------------
Prefix   Used for
----------------------------------------------------------------------
WF_      preprocessor
_WF_     hidden preprocessor (e.g., protecting symbols for header file)
Wf       Global scope (global variables, functions, type names).
wf       File-static scope
----------------------------------------------------------------------
File-static identifiers are the only exception: they are mixed case, but start with a lower-case prefix (for example, wfFileStaticVar).
The goal of this section is to provide guidance to minimize potential name clashes in C++ programs and libraries.
There are two solution strategies: (1) minimize the number of clashable names, or (2) choose clashable names that minimize the probability of a clash. Strategy (1) is preferable, but clashable names cannot be totally eliminated.
Clashable names include: external variable names, external function names, top-level class names, type names in public header files, class member names in public header files, etc. (Class member names are scoped by the class, but can clash in the scope of a derived class. Explicit scoping can be used to resolve these clashes.)
There are two kinds of name clash problem:
Solutions:
Exception: A top-level class name used only as a naming scope can consist entirely of a distinctive prefix.
WfRenderingContext   (a type name)
WfPrint()            (a function name)
WfSetTopView()       (a function name)
WfMasterIndex        (a variable name)
Wf::String           (a type name; the class name serves as prefix)
For components of the Wildfire program, prefixes begin with Wf.
Listed below are explicitly reserved names which should not be used in human-written code (it may be permissible for program generators to use some of these).
_[A-Z_][0-9A-Za-z_]*
_[a-z][0-9A-Za-z_]*
E[0-9A-Z][0-9A-Za-z]*     errno values
is[a-z][0-9A-Za-z]*       Character classification
to[a-z][0-9A-Za-z]*       Character manipulation
LC_[0-9A-Za-z_]*          Locale
SIG[_A-Z][0-9A-Za-z_]*    Signals
str[a-z][0-9A-Za-z_]*     String manipulation
mem[a-z][0-9A-Za-z_]*     Memory manipulation
wcs[a-z][0-9A-Za-z_]*     Wide character manipulation
Note that the first three namespaces are hard to avoid. In particular, many accessor methods naturally fall into the is* namespace, and error conditions map onto the E* namespace. Be aware of these conflicts and make sure that you are not redefining existing identifiers.
Blank lines and blank spaces improve readability by offsetting sections of code that are logically related. A blank line should always be used in the following circumstances:
#include section.
class, struct, and union declarations.
The guidelines for using spaces are:
; follows the keyword.
// no space between 'strcmp' and '(',
// but space between 'if' and '('
if (strcmp(input_value, "done") == 0)
    return 0;
This helps to distinguish keywords from procedure calls.
[ ] ( ) . ->
- All other binary operators must be separated from their operands by spaces. In other words, spaces should appear around assignment, arithmetic, relational, and logical operators, and they should not appear around . and ->.
for statement must be separated by spaces:
for (expr1; expr2; expr3) {
    ...;
}
String(sp)) as a clue to the reader.
start= (a < b ? a : b);
end= (a > b ? a : b);
Only four-space line indentation should be used. The exact construction of the indentation (spaces only or tabs and spaces) is left unspecified. However, you may not change the settings for hard tabs in order to accomplish this. Hard tabs must be set every 8 spaces.
If this rule were not followed, tabs could not be used since they would lack a well-defined meaning.
The rules for how to indent particular language constructs are described in Statements, § 10.
Occasionally an expression will not fit in the available space in a line; for example, a procedure call with many arguments, or a logical expression with many conditions. Such occurrences are especially likely when blocks are nested deeply or long identifiers are used.
if (LongLogicalTest1 ||
    LongLogicalTest2 ||
    LongLogicalTest3) {
    ...;
}
a = (long_identifier_term1 - long_identifier_term2) *
    long_identifier_term3;
If there were some correlation among the terms of the expression, it might also be written as:
if (ThisLongExpression < 0 ||
    ThisLongExpression > max_size ||
    ThisLongExpression == SomeOtherLongExpression) {
    ...;
}
Placing the line break after an operator alerts the reader that the expression is continued on the next line. If the break were to be done before the operator, the continuation is less obvious.
Note also that, since temporary variables are cheap (an optimizing compiler will generate similar code whether or not you use them), they can be an alternative to a complicated expression:
temp1 = LongLogicalTest1;
temp2 = LongLogicalTest2;
temp3 = LongLogicalTest3;
if (temp1 || temp2 || temp3) {
    ...;
}
Comments should be used to give an overview of the code and provide additional information that is not readily understandable from the code itself. Comments should only contain information that is germane to reading and understanding the program.
C++ style comments (//) are preferred over C style (/*...*/), though both are permitted.
//!! When we can, replace this code with a wombat -author
This gives maintainers some idea of whom to contact. It also allows one to easily grep the source looking for unfinished areas.
Block comments are used to describe a file's contents, a function's behavior, data structures, and algorithms.
main() should include a description of what the program does. The comments at the beginning of other files should just describe that file.
This would require anyone changing the text in the box to continually deal with keeping the right-hand line straight.
statements;
// another block comment
// made up of C++ style comments
statements;
/*
 * Here is a C-style block comment
 * that takes up multiple lines.
 */
statements;
Short comments may appear on a single line indented at least to the indentation level of the code that follows.
if (argc > 1) {
    // Get option from the command line.
    ...;
}
Very short comments may appear on the same line as the code they describe, but should be tabbed over far enough to separate them from the statements. Trailing comments are useful for documenting declarations.
if (a == 2)
    return WfTrue;          // special case
else
    return is_prime(a);     // works only for odd a
const or enum. (See Macros (#define) on page 5.) The enum data type is the preferred way to handle situations where a variable takes on only a discrete set of values because of the added type checking done by the compiler:
class Foo
{
  public:
    enum { Success = 0, Failure = -1 };
    ...;
};
if (foothing.foo_method("Argument") == Foo::Success)
    ...
const and initialized with compile-time expressions are themselves compile-time constants. Thus, they can be used as case labels and such.
0, 1, and -1, can often be used directly. For example if a for loop iterates over an array, then it is reasonable to code:
for (i = 0; i < size; i++) {
    // statements using array[i];
}
<wfbase.hh> defines the constants WfTrue and WfFalse, as well as the type WfBoolean, and ensures the constant NULL is available.
sizeof operator. For example, if an array's size is determined by its initializers, the proper construct for determining the number of elements it has is:
double factors[] = {
    0.1345, 123.23451, 0.0
};
const int num_factors = sizeof factors / sizeof factors[0];
sizeof operations should be applied to objects, not types. Parentheses are not allowed around the object specifier in a sizeof expression.
This means that if the type of an object changes, all the associated sizeof operations will continue to be correct. The parentheses are forbidden for data objects so that sizeof on types (where the compiler requires them) will be easy to see.
Use of const
Both ANSI C and C++ add a new modifier to declarations, const. This modifier declares that the specified object cannot be changed. The compiler can then optimize code, and also warn you if you do something that doesn't match the declaration.
The first example is a modifiable pointer to constant integers: foo can be changed, but what it points to cannot be. Use this form for function parameter lists when you accept a pointer to something that you do not intend to change (for example, strlen(const char *string)).
const int *foo;
foo = &some_constant_integer_variable;
Next is a constant pointer to a modifiable integer: the pointer cannot be changed (once initialized), but the integer it points to can be changed at will:
int *const foo = &some_integer_variable;
Finally, we have a constant pointer to a constant integer. Neither the pointer nor the integer it points to can be changed:
const int *const foo = &some_const_integer_variable;
Note that const objects can be assigned to non-const objects (thereby making a copy), and the modifiable copy can of course be changed. However, pointers to const objects cannot be assigned to pointers to non-const objects, although the converse is allowed. Both of these forms of assignments are legal:
(const int *) = (int *);
(int *) = (int *const);
But both of these forms are illegal:
(int *) = (const int *);    // illegal
(int *const) = (int *);     // illegal
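The legal and illegal forms above can be tried in a small function. This is only a sketch; WfSumThroughPointers() and its local variables are invented for the illustration, and the illegal lines are left commented out because they would not compile.

```cpp
// Exercises the pointer assignments discussed above.
int WfSumThroughPointers()
{
    int value = 3;
    int other = 4;

    int *plain = &value;
    const int *read_only = plain;   // legal: (const int *) = (int *)
    int *const fixed = &other;      // constant pointer, must be initialized
    plain = fixed;                  // legal: (int *) = (int *const)

    // plain = read_only;           // illegal: would discard const
    // fixed = &value;              // illegal: fixed cannot be changed

    *plain += 1;                    // modifies 'other' through a non-const pointer
    return *read_only + *fixed;     // value (3) plus other (now 5)
}
```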
When const is used in an argument list, it means that the argument will not be modified. This is especially useful when you want to pass an argument by reference, but you don't want the argument to be modified.
void block_move(const void *source, void *destination, size_t length);
Here we are explicitly stating that the source data will not be modified, but that the destination data will be modified. (Of course, if the length is 0, then the destination won't actually be modified.)
All of these rules apply to class objects as well:
class Foo
{
  public:
    void bar1() const;
    void bar2();
};
const Foo *foo_pointer;
foo_pointer->bar1();    // legal
foo_pointer->bar2();    // illegal
Inside a const member function like bar1(), the this pointer is of type (const Foo *const), so you really can't change the object.
However, there is a distinction between bit-wise const and logical const. A bit-wise const function truly does not modify any bits of data in the object. This is what the compiler enforces for a const member function. A logical const function modifies the bits, but not the externally visible state; for example, it may cache a value. To users of a class, it is logical, not bit-wise, const that is important. However, the compiler cannot know if a modification is logically const or not.
You get around this by casting away const, for example, by casting the pointer to be a (Foo *). This should only be done if you are absolutely sure that the function remains logically const after your operation, and must always be accompanied by an explanatory comment.
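The following sketch shows one way a logically const function might cache a value by casting away const. WfRectangle and its members are hypothetical, and const_cast is used here as the standard C++ spelling of the (Foo *) cast mentioned above. Two caveats apply: modifying an object that was itself defined const this way is undefined behavior, and standard C++ also offers the mutable keyword for such cache members.

```cpp
// Hypothetical class: area() is logically const but caches its result.
class WfRectangle
{
  public:
    WfRectangle(int width, int height)
    : Width(width), Height(height), CachedArea(0), CacheValid(false),
      Computations(0)
    {
    }

    int area() const;
    int computations() const { return Computations; }

  private:
    int Width;
    int Height;
    int CachedArea;     // cache, not part of the externally visible state
    bool CacheValid;
    int Computations;   // bookkeeping for this demonstration only
};

int WfRectangle::area() const
{
    if (!CacheValid) {
        // Casting away const: only the cache changes, so the function
        // remains logically const.
        WfRectangle *self = const_cast<WfRectangle *>(this);
        self->CachedArea = Width * Height;
        self->CacheValid = true;
        self->Computations++;
    }
    return CachedArea;
}
```

Calling area() twice through a const reference computes the product once and returns the cached value the second time.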
struct and union Declarations
A struct should only be used for grouping data; there should be no member functions. If you want member functions, you should be using a class. Hence, structs should be pretty rare.
struct or union name.
struct or union keyword.
struct or union should be indented one level.
struct Foo {
    int size;       // Measured in inches
    char *name;     // Label on icon
    ...;
};
Note that struct and enum tag names are valid types in C++, so the following common C idiom is obsoleted because foo can be used wherever you used to use Foo:
typedef struct foo {    /* Obsolete C idiom */
    ...;
} Foo;
enum tag and the opening brace should be on the same line as the enum keyword.
enum is the same as for a struct if it takes up multiple lines, or it contains explicit initializers. It also can be contained on one line as shown below.
,
').
class Color
{
  public:
    enum Component { Red, Green, Blue };
};
Color::Component foo = Color::Red;
const int Red = 0;          // Bad Form
const int Blue = 1;
const int Green = 2;
enum ColorComponent {       // Much Better
    Red,
    Blue,
    Green
};
enum ColorComponent {       // Explicit values can be given
    Red = 0x10,             // to each item as well...
    Blue = 0x20,
    Green = 0x40
};
This causes ColorComponent to become a distinct type that is type-checked by the compiler. Values of type ColorComponent will be automatically converted to int as needed, but an int cannot be changed to a ColorComponent without a cast.
switch statement on an enum variable that does not have all elements of the enum expressed as case labels. This situation usually indicates a logic error in the code.
enum, make the last element of the enum be a last field.
enum Color { Red, Blue, Green, LastColor = Green };
This trick should only be used when you need the number of elements, and will only work if none of the enumeration literals are explicitly assigned values.
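As an illustration of how the last-element trick might be used, the sketch below derives an element count and loops over the enumerators. The names wfColorCount and WfCountUsed() are assumptions made for this example.

```cpp
enum Color { Red, Blue, Green, LastColor = Green };

// Number of enumerators; valid only because no literal before LastColor
// is explicitly assigned a value.
const int wfColorCount = LastColor + 1;

// Hypothetical helper: counts how many entries of a per-color table are set.
int WfCountUsed(const bool used[wfColorCount])
{
    int total = 0;
    for (int c = Red; c <= LastColor; c++) {
        if (used[c])
            total++;
    }
    return total;
}
```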
public or protected. Member data must always be private.
public. private and protected inheritance is not allowed.
public or protected data members, right?), "Does my client (or subclass) really need to know this, or could I recast the interface to reveal less?"
const whenever possible (see Use of const on page 14).
class Declarations
class should be on a separate line in the same column as the class keyword.
This is to help users of vi, which has a simple "go to beginning of paragraph" command, and which recognizes such a line as a paragraph beginning. Thus, you can, in the middle of a long class declaration, go to the beginning of the class with a simple command. The usefulness of this feature was deemed to outweigh its inconsistency (also see section 9.2).
class keyword.
class are indented similarly to those of a struct (see struct and union Declarations on page 15).
public, protected, and private sections of a class should be present (if at all) in that order, indented 1/2 an indent level past that of the opening brace.
The ordering is "most public first" so people who only wish to use the class can stop reading when they reach protected/private.
public or protected data members; use private data with public or protected access methods instead.
class Foo: public Blah, private Bar
{
  public:
    Foo();                                  // be sure to use better
    ~Foo();
    int get_size(int phase_of_moon) const;  // comments than these.
    int set_size(int new_size);
    virtual int override_me() = 0;
  protected:
    static int hidden_get_size();
  private:
    int Size;                               // meaningful comment
    void utility_method();
};
Public and protected data members affect all derived classes and violate the basic object oriented philosophy of data hiding.
Constructors and destructors are used for initializing and destroying objects and for converting an object of one type to another type. There are lots of rules and exceptions to the use of constructors and destructors in C++, and programs that rely heavily on constructors being called implicitly are hard to understand and maintain. Be careful when using this feature of C++!
Be particularly careful when writing constructors that accept only one argument (or use default arguments that may allow a multi-argument constructor to be used as if it did) since such constructors specify a conversion from their argument type to the type of its class. Such constructors need not be called explicitly and can lead to unintended implicit uses of conversions. There are also other difficulties with constructors and destructors being called implicitly by the compiler when initializing references and when copying objects.
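A small sketch of the implicit conversion described above, using a hypothetical WfBuffer class and two invented functions. The one-argument constructor lets an int be passed where a WfBuffer is expected, without any visible constructor call. In standard C++, declaring such a constructor explicit would turn the surprising call into a compile-time error.

```cpp
// Hypothetical class with a one-argument constructor.
class WfBuffer
{
  public:
    WfBuffer(int size) : Size(size) { }     // also an int -> WfBuffer conversion
    int size() const { return Size; }

  private:
    int Size;
};

int WfBufferSize(const WfBuffer &buffer)
{
    return buffer.size();
}

// Surprising but legal: the int 42 is silently converted to a
// temporary WfBuffer before the call.
int WfSurprisingCall()
{
    return WfBufferSize(42);
}
```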
Things to do to avoid problems with constructors and destructors:
operator=, the compiler will perform a member-wise copy, which may not be the behavior expected. Note that initialization and assignment are generally very different operations.
This will cause the compiler to generate a compile-time error if a member-wise copy is attempted.
class Foo
{
  public:
    Foo();
    ~Foo();
    int get_size(int phase_of_moon) const;
  private:
    ...
};
: is on the same line as the closing parenthesis. Constructors that take multiple lines to declare have their : on the line following the last parameter, indented to the same level as the beginning of the constructor name.
BusStop::BusStop() : PeopleQueue(30), Port("Default")
{
    ...;
}
BusStop::BusStop(char *some_argument)
: PeopleQueue(30), Port(some_argument)
{
    ...;
}
static initialization. If you design a class that depends on some other facility in its constructor, be careful about order dependencies in static initialization. The order in which static constructors (that is, the constructors of objects with static storage class) get called is undefined. You cannot count on one object being initialized before another. Therefore, if you have such a dependency, you must either document that your class cannot be used for static objects, or you must use "lazy evaluation" to defer the dependency until later (see Item 47 in Effective C++ [12] for more details).
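One common form of the "lazy evaluation" mentioned above is to hide the shared facility behind a function that constructs it on first use. The sketch below assumes hypothetical names (WfRegistry, WfGetRegistry(), WfClient, wfFirstClient) and is only one way to defer the dependency.

```cpp
#include <string>

// Hypothetical facility that other static objects depend on.
class WfRegistry
{
  public:
    WfRegistry() : Count(0) { }
    int add(const std::string &) { return ++Count; }
    int count() const { return Count; }

  private:
    int Count;
};

// Lazy evaluation: the registry is constructed the first time it is
// requested, so it exists before any caller uses it.
WfRegistry &WfGetRegistry()
{
    static WfRegistry registry;
    return registry;
}

// A class whose constructor depends on the registry.
class WfClient
{
  public:
    WfClient(const std::string &name) : Id(WfGetRegistry().add(name)) { }
    int id() const { return Id; }

  private:
    int Id;
};

// Safe as a static object: its constructor triggers construction of
// the registry on demand instead of relying on initialization order.
static WfClient wfFirstClient("first");
```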
C++ automatically provides the following methods for your classes (unless you provide your own):
const and non-const), and
class Empty { };        // You write this ...
class Empty             // You really get this ...
{
  public:
    Empty() { }                             // constructor
    ~Empty() { }                            // destructor
    Empty(const Empty &rhs);                // copy constructor
    Empty &operator=(const Empty &rhs);     // assignment operator
    Empty *operator&();                     // address-of
    const Empty *operator&() const;         // operators
};
Every class writer must consider whether the default functions are correct for that class. If they are, a comment must be provided where the function would be declared so that a reader of the class knows that the issue was considered, not forgotten.
If a class has no valid meaning for these functions, you should declare an implementation in the private section of the class. Such a function should probably call abort(), throw an exception, or otherwise generate a visible runtime error.
This ensures that the compiler will not use the default implementations, that it will not allow users to invoke that function, and that if a member function uses it by accident, it may at least be caught at runtime.
It is a good idea to always define a constructor, copy constructor, and a destructor for every class you write, even if they don't do anything.
Overloading function names must only be done if the functions do essentially the same thing. If they do not, they must not share a name. Declarations of overloaded functions should be grouped together.
Deciding when to overload operators requires careful thought. Operator overloading does not simply create a short-hand for an operation --- it creates a set of expectations in the mind of the reader, and inherits precedence from the language.
+ on strings concatenates, << adds to a stream) or real algebra on the types (for example, a position class plus an offset gets a different position).
< without overloading > or >= will astonish the user in unhappy ways, as will overloading + and = but not +=. In particular, ->, . and [] should always be considered a set:
foo->member() // should be identical to (*foo).member() // which should also be identical to foo[0].member()Overloading
==
requires overloading
!=
, and vice versa.
If the expression (a != b)
is not equivalent to
!(a == b)
we have
unacceptably astonished the user.
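One way to keep the two operators from ever disagreeing is to define one in terms of the other. The sketch below assumes a hypothetical Point class and uses bool as the result type:

```cpp
// Hypothetical value class, for illustration only.
class Point
{
public:
    Point(int x, int y) : X(x), Y(y) { }

    friend bool operator==(const Point &left, const Point &right)
    {
        return left.X == right.X && left.Y == right.Y;
    }

    // Defined in terms of operator== so the two can never disagree.
    friend bool operator!=(const Point &left, const Point &right)
    {
        return !(left == right);
    }

private:
    int X;
    int Y;
};
```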
private section so that the compiler will report the error to anyone who assumes that the set is complete. However, this should be a flag for you to consider whether the operator overloading really is natural --- the strong presumption is that if you are not going to override all members in the set, then none of the members of the set should be overridden.
This allows the user of the class to determine if a cast is more readable than a member function invocation, for example, to avoid casts that look like they should be automatically done by the compiler, but are explicit to invoke the cast.
When a member of a class is declared protected, to clients of the class it is as if the member were private. Subclasses, however, can access the member as if it were declared private to them. This means that a subclass can access the member, but only as one of its own private fields. Specifically, it cannot access a protected field of its parent class via a pointer to the parent class, only via a pointer to itself (or a descendant).
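The access rule can be sketched as follows. Account and Savings are hypothetical classes invented for this example; the commented-out member shows the access that the compiler rejects:

```cpp
// Hypothetical classes, for illustration only.
class Account
{
public:
    explicit Account(int balance) : Balance(balance) { }

protected:
    int Balance;
};

class Savings : public Account
{
public:
    explicit Savings(int balance) : Account(balance) { }

    // OK: access through an object of the subclass's own type.
    int own_balance() const { return Balance; }

    // OK: "other" is a Savings, so its protected member is reachable.
    int sibling_balance(const Savings &other) const { return other.Balance; }

    // Would not compile: access through the parent type is not allowed.
    // int parent_balance(const Account &other) const { return other.Balance; }
};
```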
When using friends remember that private member access rights do not extend to subclasses of the friend class. Any method that depends on friend access to another class cannot be rewritten in a subclass.
class Base, below), the friend keyword denotes a class-global behavior change that is being applied to the friend class. As such, it is not governed by the class part designation (public, protected, or private) currently in force. Thus, the friend keyword should be indented to the same level as these class part names.
friend int operator==, section 7.12.2) it should be treated as a type modifier in the same sense that static, extern, and virtual are. That is, the word friend is lined up along with the other type specifiers one indent level from the level of the class itself.
friend is needed between classes, friend member functions are preferred to making the entire class a friend.
The use of friend class or method declarations is discouraged, since the use of friend breaks the separation between the interface and the implementation. The only non-discouraged uses are for binary operators and for cooperating classes that need private communication, such as container/iterator sets.
friend Classes
friend class declarations must come at the end of the class declaration.
If a friend class declaration is necessary and the friend class is intended to be subclassable, the friend class must be written so that its subclasses have the same access rights as the base class. To do this, any access depending on the friend declaration is encapsulated in a protected function:
    class Secret
    {
    private:
        int Data;
        int method();
        friend class Base;
    };

    class Base
    {
    protected:
        int secret_data(Secret *income_info);
        int secret_method(Secret *income_info);
    };

    int
    Base::secret_data(Secret *income_info)
    {
        return income_info->Data;
    }

    int
    Base::secret_method(Secret *income_info)
    {
        return income_info->method();
    }

Methods of the Secret class should not be accessed directly by methods of the friend class Base. Direct access makes it hard to cut-and-paste code from the base to a derived class:

    void
    Base::an_example(Secret *income_info)
    {
        int a = income_info->Data;          // BAD: Direct access is wrong
        int b = secret_data(income_info);   // GOOD: Use accessor functions!
    }
Binary operators, except assignment operators, must almost always be friend member functions.

    class String
    {
    public:
        String(const char *);
        friend int operator==(const String &string1, const String &string2);
        friend int operator!=(const String &string1, const String &string2)
            { return !(string1 == string2); }
    };

If operator== were a member function, the conversion operator would only allow (String == char *) but not (char * == String). This would be quite surprising to the user of the class. Making operator== a friend member function allows the conversion implied by the constructor to work on both sides of the operator.
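A compilable sketch of the same idea, assuming a hypothetical Name class standing in for String. Because operator== is a friend rather than a member, the implicit conversion from a character pointer applies to either operand:

```cpp
#include <cstring>

// Hypothetical stand-in for the String class above, for illustration only.
class Name
{
public:
    Name(const char *text) : Text(text) { }     // implicit conversion from char *

    friend bool operator==(const Name &left, const Name &right)
    {
        return std::strcmp(left.Text, right.Text) == 0;
    }

private:
    const char *Text;   // not owned; the caller keeps the characters alive
};

// Both argument orders compile because either side can be converted to Name.
inline bool matches_both_ways(const Name &name, const char *text)
{
    return (name == text) && (text == name);
}
```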
    template<class TYPE>
    class List
    {
    public:
        TYPE front();
        ...
    };

    template<class TYPE>
    TYPE
    List<TYPE>::front()
    {
        ...;
    }

    template<class TYPE, unsigned int SIZE>
    class Vector
    {
    private:
        TYPE Data[SIZE];
    };

Here, the type stored by the Vector template class is named TYPE because it is a general purpose parameter. The SIZE parameter, however, is specific since it ultimately determines the size of a Vector<TYPE> object; its name reflects this specific purpose.
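To show the two kinds of parameter in use, here is a small sketch. FixedVector is a hypothetical name, and the size() and at() accessors are assumptions added only to make the example complete:

```cpp
// A sketch only: FixedVector and its accessors are not part of the guide.
template<class TYPE, unsigned int SIZE>
class FixedVector
{
public:
    unsigned int size() const { return SIZE; }              // fixed at compile time
    TYPE &at(unsigned int index) { return Data[index]; }    // no bounds check

private:
    TYPE Data[SIZE];
};
```

A declaration such as `FixedVector<int, 4> samples;` reads naturally: the general parameter names the element type and the specific parameter states the capacity.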
Since C++ gives the programmer the freedom to place a variable definition wherever a statement can appear, they should be placed near their use. For efficiency, it may be desirable to invoke constructors only when necessary. Thus function code may define some local variables, do some parameter checking, and once the sanity checks have passed then define the class instances and invoke their constructors.
Where possible, you should initialize variables when they are defined.
    char *Foo[] = { "Hello", ", ", "World\n" };
    int max_string_length = BUFSIZE;
    String path("/usr/tmp/gogin");
This minimizes "used before initialization" bugs.
extern
Declarations
class ClassName;
Place them in header files to prevent inconsistent declarations in each source file that uses it.
* and & should be with the identifier, not the type. The following style is forbidden:

    int* ip;
    String& str;

This style, though currently popular, lies about syntax, since

    int* p1, p2;

implies p1 and p2 are both pointers, but one is not. Since we do not accept that only one variable should be declared on a line as a fixed rule, we cannot allow a style that lies about the meaning of multiple declarations on a line.
*
, &
, etc.).
(4)
int count = 0; char **pointer_to_string = &foo;
    int level = 0;      // indentation level
    int size = 0;       // size of symbol table
    int lines = 0;      // lines read from input

is preferred over:

    int level, size, lines;     // Not Recommended

The latter style is frequently used for declaring several temporary variables of primitive types such as int or char, or strongly matched variables, such as x, y pairs, where changing the type of one requires changing the type of all.
    long db, OpenDb();                  // Bad

    long db;                            // Better
    long OpenDb();                      // but still not recommended

    #include <admintools/database.hh>   // Best
    Database db;

You should use a header file that contains an external function declaration of OpenDb() instead of hard-coding its definition in your source file.
    void
    WfFunction()
    {
        static int boggle_count;    // Count of boggles in formyls

        if (condition) {
            int boggle_count;       // Bad --- this hides the above instance
        }
    }
=
on
the same line.
This style is purposefully analogous to the function declaration style. It may look strange to some at first, but in the context of a complete program, it lends itself to an overall pleasing appearance of the code.
Cat cats[] = { "Shamus", "Macka", "Tigger", "Xenephon", };
char *name = "Framus";
(
.
    SomeType *WfLibraryFunctionName(
        void *current_pointer,
        size_t desired_new_size
    );

However, if a function takes only a few parameters, the declaration can be strung onto one line (if it fits):

    int strcmp(const char *s1, const char *s2);

We usually use a one-line-per-declaration form for several reasons.
(1) It is easy to comment the individual parameters.
(2) It makes it easier to read when there are many parameters.
(3) It is easy to reorder the parameters, or to add one. The closing ); is on a line by itself to make it easier to add a new parameter at the end of the parameter list.
(4) It is designed to be visually similar to the other declaration statements.
(5) It works well with long identifier names.
However, with simple declarations the weight seems too great for the benefit.
    int getchar();

The ANSI C-compatible construct of (void) for a function with no parameters must only be used in header files designed to be included by both C and C++ (See Interaction with C on page 39.)
typedef type). The only exception is for operators and single-argument constructors where the meaning of the parameter is clear from that context.
This provides internal documentation that can help people remember what a parameter is supposed to represent. It also allows comments in the file to refer to the parameter by name.
const &.
var parameter) The alternative of passing pointers is not encouraged, but is not prohibited. See References vs. Pointers, § 11.5 for more details.
Small functions promote clarity and correctness.
int must be specified explicitly). If the function does not return a value then it should be given the return type void. If the value returned requires a long explanation, it should be given in the block comment above. The function name should be alone on a line beginning in column 1 (the class name is included on the same line as the function name if the function is a method of a class).

    char *
    WfString::cstr()
    {
        // ...
    }

    void
    WfFoo(int param1, int /* optional_param2 */)
    {
        // ...;
    }

    int
    SystemInformationObject::get_number_of_users(Name host_name, Time idle_time)
    {
        int some_variable;

        statements;
        ...;
    }
this variable in member functions to access members. In other words, you should never write this->Anything.
    argv++; argc--;                             // Multiple statements are bad
    if (err)
        fprintf(stderr, "error\n"), exit(1);    // Using `,' is worse

    argv++;                                     // The right way
    argc--;
    if (err) {
        fprintf(stderr, "error\n");
        exit(1);
    }
Compound statements are statements that contain lists of statements enclosed in braces.
class definition, or a new scope are the only occurrences of a left brace that should be alone on a line.

    {                           // New Block Scope
        int some_variable;
        statements;
    }
if/else or for statement, as in:

    if (condition) {            // braces required; following "if" is two lines
        if (other_condition)    // braces not required -- only one line follows
            statement;
    }

Braces are not required for control structures with single-line bodies, except for do/while loops, which always require enclosing braces. This single-line rule includes a full if/else/else/... statement:

    if (condition)
        single_thing();
    else if (other_condition)
        other_thing();
    else
        final_thing();

Note that this is a "single-line rule", not a "single statement rule". It applies to things that fit on a single line.
Single-statement bodies are too simple to be worth the weight of the extra curlies.
if/else Statements
An else clause is joined to any preceding close curly brace that is part of its if. See also Comparing against Zero on page 34.

    if (condition) {
        ...;
    }

    if (condition) {
        ...;
    } else {
        ...;
    }

    if (condition) {
        ...;
    } else if (condition) {
        ...;
    } else {
        ...;
    }
    for (initialization; condition; update) {
        ...;
    }

If the three parts of the control structure of a for statement do not fit on one line, they each should be placed on separate lines or broken out of the loop:

    for (longinitialization;
         longcondition;
         longupdate
    ) {
        ...;
    }

    longinitialization;         // Alternate form...
    for (; condition; update) {
        ...;
    }

When using the comma operator in the initialization or update clauses of a for statement, no more than two variables should be updated. If there are more, it is better to use separate statements outside the for loop (for the initialization clause), or at the end of the loop (for the update clause).
do Statements

    do {
        ...;
    } while (condition);

    while (condition) {
        ...;
    }
The infinite loop is written using a for loop:

    for (;;) {
        ...;
    }

This form is better than the functionally equivalent while (TRUE) or while (1) since they imply a test against TRUE (or 1), which is neither necessary nor meaningful (if TRUE ever is not true, then we are all in real trouble).
Loops that have no body must use the continue keyword to make it obvious that this was intentional.

    while (*string_pointer++ != '\0')
        continue;
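Placed in a complete function, the idiom might look like the sketch below. string_length is a hypothetical helper written only to show the loop in context; production code should call the library's strlen() instead:

```cpp
#include <cstddef>

// Counts the characters before the terminating '\0'.
// string_length is a hypothetical helper; real code should call strlen().
std::size_t string_length(const char *string_pointer)
{
    const char *start = string_pointer;

    while (*string_pointer++ != '\0')
        continue;               // empty body, made explicit

    // The pointer now sits one past the terminator.
    return static_cast<std::size_t>(string_pointer - start) - 1;
}
```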
case labels should be on lines separate from the statements they control.
case labels are indented to 1/2 an indent level beyond the level of the switch statement itself.
We use this indentation since the labels are conceptually part of the switch, but indenting by a full indent would mean that all code would be indented by two indent levels, which would be too much.
case label in a set of case labels, especially if the body code is large. (But don't put a blank line right after the switch keyword)
break in the switch is, strictly speaking, redundant, but it is required nonetheless.
This prevents a fall-through error if another case is added after the last one.
switch statement should rarely, if ever, be used (except for multiple case labels as shown in the example). If it is used otherwise, it must be commented with:

    // FALLTHROUGH

where the break would normally be expected.
This makes it clear to the reader that this fallthrough was intentional.
return statement should not be followed by a break.
switch statements that use members of an enum should not have a default case. This means that if you have such a switch, you must always have all members of the enum represented in explicit case labels, even if these only execute a break.
Some C++ compilers will warn you if such a switch is missing a member. This warning will call out situations where you add a member to an enum definition but forget to add a case for it in a given switch. This is usually an error.
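A sketch of an enum-keyed switch written this way, using a hypothetical Signal enum. Every member has its own case label and there is no default, so a compiler that offers the warning can report a member added later:

```cpp
// Hypothetical enum and function, for illustration only.
enum Signal { signal_red, signal_amber, signal_green };

const char *signal_name(Signal signal)
{
    switch (signal) {
      case signal_red:
        return "red";
      case signal_amber:
        return "amber";
      case signal_green:
        return "green";
    }
    return "unknown";   // reached only if an out-of-range value was cast to Signal
}
```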
switch statements keyed on non-enum values should have a default label if the code assumes that only certain values will arrive. Such a default label should make sure that the erroneous situation is called to someone's attention, such as by signalling an error or generating an error message.
    switch (pixel_color) {
      case Color::blue:
        ...;
        break;

      case Color::red:
        found_red_one = TRUE;
        // FALLTHROUGH
      case Color::purple:
        {
            int local_variable;
            ...;
            break;
        }

      default:
        // handles green, mauve, and pink colors...
        ...;
        break;
    }
This is to catch unexpected inputs in more graceful ways than failing unpredictably somewhere else in the code.
goto Statements
While not completely avoidable, use of goto is discouraged. In many cases, breaking a procedure into smaller pieces, or using a different language construct will enable elimination of a goto.
The main place where a goto can be usefully employed is to break out of several nested levels of switch, for, or while nesting when an error is detected. In future versions of C++ exceptions should be used.
    for (...) {
        for (...) {
            ...;
            if (disaster) {
                goto error;
            }
        }
    }
    return true;

    error:
        // clean up the mess
goto to branch to a label within a block:

    if (pool.is_empty()) {
        goto label;         // VERY WRONG
    }

    for (...) {
        Object obj;

    label:
    }

Branching into a block may bypass the constructor invocations and initializations for the automatic variables in the block. Some compilers treat this as an error, others blissfully ignore it.
goto is necessary, the accompanying label must be alone on a line starting in column 1.
return Statements
The expressions associated with return statements are not required to be enclosed in parentheses.
try/catch Statements
The proposed C++ syntax for exception handling is not to be used in shared code at this time (5). This section specifies the syntax which will eventually be used to support the feature, but should be avoided in near term code.
throw statements are not required to be enclosed in parentheses.

    try {
        statements;
    } catch (type) {
        statements;
    } catch (...) {         // This is the literal "..."
        statements;
    }
This can be used to support C programs being linked to C++ libraries without the use of a C++-aware linker. See Interaction with C on page 39.
<stdarg.h>) parameters. You should avoid doing this if at all possible. The classic example of this usage is:

    void printf(const char *, ...);
    if (BooleanExpression)
        return WfTrue;
    else
        return WfFalse;

with:

    return BooleanExpression;

Similarly,

    if (condition)          // Awkward
        return x;
    return y;

is usually clearer when written as:

    if (condition)          // Clear
        return x;
    else
        return y;

or

    return (condition ? x : y);
    if (x = y) {            // Confusing
        ...;
    }

it is hard to tell whether the programmer really meant assignment or the equality test.

    if ((x = y) != 0) {     // Understandable
        ...;
    }

or something similar if the assignment is needed within the if statement. There is a time and a place for embedded assignments. The ++ and -- operators count as assignments. So, for many purposes, do functions with side effects.
As an aside, many of today's compilers can produce faster and/or smaller code if you don't use embedded assignments. If you are using such convoluted code to "help the compiler optimize the program", you may be doing more harm than good.
memcpy() function. Not only does this waste your time, but it may prevent your program from taking advantage of any hardware specific assists or other means of improving performance of these routines. It also makes your code less readable, because the reader has to figure out whether you are doing something special in the re-implemented routines to justify their existence.
    --------------------------------------------------------------------------------
    Type                  Minimum Value     Maximum Value    Comments
    --------------------------------------------------------------------------------
    signed char           -128              127              They may hold more
    unsigned char         0                 255              They may hold more
    char                  0                 127              Can't assume signed or unsigned
    short, signed short   -32,768           32,767           Minimum 16 bits
    unsigned short        0                 65,535           Minimum 16 bits
    long, signed long     -2,147,483,648    2,147,483,647    Minimum 32 bits
    unsigned long         0                 4,294,967,295    Minimum 32 bits
    int, signed int       -32,768           32,767           Same as a short
    unsigned int          0                 65,535           Same as an unsigned short
    --------------------------------------------------------------------------------
char may be unsigned or signed. You can't assume either. Thus, only use (unmodified) char if you don't care about sign extension and can live with values in the range of 0-127.
int cannot be counted on to hold more than a short int. It is an appropriate type to use if a short would be big enough but you would like to use the processor's "natural" word size to improve efficiency (on some machines, a 32-bit operation is more efficient than a 16-bit operation because there is no need to do masking). If you need something larger than a short, you must specify a long --- an int won't do.
size_t, ptrdiff_t, sig_atomic_t where appropriate.
Comparisons against zero values must be driven by the type of the expression. This section shows the valid ways to compare for given types. Anything not permitted here is forbidden.
When maintaining code it is very useful to be able to tell what "units" a comparison is using. As an example, an equality test against the constant 0 implies that the variable being tested is an integral type; testing against an explicit NULL implies a pointer comparison, while an implied NULL implies a boolean relationship.
(See if/else Statements on page 28)
Choose variable names that make sense when used as a "condition". For example,

    if (pool.is_empty())

makes sense, while

    if (pool.state())

just confuses people. The generic forms for Boolean comparisons are

    if (boolean_variable)
    if (!boolean_variable)

Note that this is the only case where an implicit test is allowed; all other comparisons (int, char, pointers, etc.) must explicitly compare against a value of their type. A standalone variable should always imply a boolean value.
Never use the boolean negation operator ! with non-boolean expressions. In particular, never use it to test for a null pointer or to test for success of the strcmp() function:

    if (!strcmp(s1, s2))            // Bad
    if (strcmp(s1, s2) == 0)        // Good
    if (char_variable != '\0')
    while (*char_pointer != '\0')

    if (integer_variable == 0)
    if (integer_variable != 0)
if (floating_point_variable > 0.0)
Always exercise care when comparing floating point values. It is generally not a good idea to use equality or inequality type comparisons. Use relative comparisons, possibly bounded by a "fuzz" factor in cases where an equality-like functionality is required.
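A minimal sketch of a bounded comparison. The name nearly_equal and the choice of an absolute tolerance are assumptions made for this example; values of widely varying magnitude may need a relative tolerance instead:

```cpp
#include <cmath>

// Returns true when the two values differ by no more than "fuzz".
// nearly_equal and its absolute-tolerance policy are assumptions made for
// this sketch; a relative tolerance may suit widely scaled values better.
bool nearly_equal(double left, double right, double fuzz)
{
    return std::fabs(left - right) <= fuzz;
}
```

For instance, `nearly_equal(0.1 + 0.2, 0.3, 1e-9)` holds even though the two sides are not bit-for-bit equal in binary floating point.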
if (pointer_variable != NULL ) // Always use an explicit test vs. NULL
Implicit comparisons are not allowed:
if (pointer_variable) // WRONG
inline
inline functions in public interface definitions.
Since a client using your inlined interface actually compiles your code into their executable, you can never change this part of your implementation. And no one else can provide an alternate implementation.
+w option which will warn you of the case where things declared inline aren't inlined. When an inline function isn't inlined, it may be defined "file static" in every file that references it!
Within your implementation there may be places where you need to use inlines. Be aware that the use of inlines can easily make your (and other people's) code larger, which can overcome any efficiency gains. Here are some guidelines to help do it right.
is_equal(), which compares two objects for equality. It also has an inline definition for operator==, as a notational convenience. Since operator== just turns around and calls the is_equal() function, it may be OK for it to be inline and not virtual.
inline in the class, with the code presented immediately after the class declaration:

    class Dummy
    {
    public:
        inline int do_something();
    };

    inline int
    Dummy::do_something()
    {
        // ... several lines of code
    }
The advantages of using references over pointers include (from [25]):
The advantages of using pointers over references include:
Use references where you reasonably can --- that is, when assigning a name to an already existing singular object. Use pointers for any of the other N meanings that pointers have traditionally held.
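The distinction can be sketched as follows, using a hypothetical Counter type. The reference parameter names an object that must exist; the pointer parameter carries one of the other traditional meanings, here "possibly no object":

```cpp
#include <cstddef>

// Hypothetical type and functions, for illustration only.
struct Counter
{
    int Value;
};

// A reference names an object that must already exist.
void increment(Counter &counter)
{
    counter.Value++;
}

// A pointer can also mean "no object"; the caller may pass NULL.
int value_or_default(const Counter *counter, int default_value)
{
    if (counter == NULL)
        return default_value;
    else
        return counter->Value;
}
```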
The advantages of portable code are well known and little appreciated. This section gives some guidelines for writing portable code, where the definition of portable is a source file that can be compiled and executed on different machines with the only source change being the inclusion of (possibly different) header files.
int. Nor are all pointers always the same size, or freely interchangeable.
char to a pointer-to-int may result in an invalid address.
NULL pointer except test its value. In particular, code that assumes that dereferencing a NULL pointer yields '\0' (à la VAX/BSD) will generate memory faults on other machines (for example, Sparc). Further, never write a class that assumes that this may be validly NULL.
struct or class is laid out in memory, or that it can be written to a data file as is.
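One portable alternative to writing a structure to a file as is, sketched under stated assumptions: each field is stored byte by byte in a fixed order. The function names and the choice of most-significant-byte-first order are hypothetical, not part of this guide:

```cpp
// Stores the low 32 bits of "value" in buffer[0..3], most significant byte
// first, so the result does not depend on the machine's byte order or on
// structure padding. The function names are hypothetical.
void put_uint32(unsigned char *buffer, unsigned long value)
{
    buffer[0] = static_cast<unsigned char>((value >> 24) & 0xFF);
    buffer[1] = static_cast<unsigned char>((value >> 16) & 0xFF);
    buffer[2] = static_cast<unsigned char>((value >> 8) & 0xFF);
    buffer[3] = static_cast<unsigned char>(value & 0xFF);
}

// Reads back a value written by put_uint32().
unsigned long get_uint32(const unsigned char *buffer)
{
    return (static_cast<unsigned long>(buffer[0]) << 24)
         | (static_cast<unsigned long>(buffer[1]) << 16)
         | (static_cast<unsigned long>(buffer[2]) << 8)
         |  static_cast<unsigned long>(buffer[3]);
}
```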
char is sign-extended when used in expressions, which is not the case on some other machines. Code that assumes either that char is signed or unsigned is non-portable. It is best to completely avoid using char to hold numbers. Manipulation of characters as if they were numbers is often non-portable. Explicitly declare character variables as signed or unsigned in cases where it matters.
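A case where the signedness matters is using a character as an array index. The sketch below assumes a hypothetical count_bytes helper and converts each character to unsigned char explicitly before indexing:

```cpp
#include <cstddef>

// Counts how often each byte value occurs in "text".
// count_bytes is a hypothetical helper; "counts" must have 256 entries.
void count_bytes(const char *text, std::size_t length, unsigned int *counts)
{
    for (std::size_t i = 0; i < length; i++) {
        // Convert explicitly: a plain char holding 0xE9 may be negative on
        // machines where char is signed, which would index before the array.
        unsigned char byte = static_cast<unsigned char>(text[i]);
        counts[byte]++;
    }
}
```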
signed on some machines and unsigned on others. If you use bitfields in a way that is sensitive to this difference you must be explicit.
int, since it will get the most efficient (natural) unit for the current machine. Word size also affects shifts and masks. The statement

    x &= 0177770

will clear only the three rightmost bits of a 16 bit int on a PDP-11. On a VAX (with 32 bit ints) it will also clear the entire upper 16 bits. Use

    x &= ~07

instead, which works as expected on all machines. The operator | does not have these problems, nor do bit-fields.
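The complemented mask can be wrapped in a small function to see the effect at full width. clear_low_three_bits is a hypothetical helper for this sketch:

```cpp
// Clears the three low-order bits of "value" whatever the width of the type.
// clear_low_three_bits is a hypothetical helper for this sketch.
unsigned long clear_low_three_bits(unsigned long value)
{
    // ~07UL has every bit set except the low three, at the full width of
    // unsigned long, so no high-order bits are lost.
    return value & ~07UL;
}
```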
#include "somefile.hh" implies different search paths on different systems. On BSD derived systems it means
-I directory list
-I directory list
foo.cc, foo.hh, and the subdirectory obj. Also assume that foo.cc does an #include "foo.hh". The command

    CC -c -o obj/foo.o foo.cc

will work on both systems, but

    cd obj
    CC -c -o foo.o ../foo.cc

will fail to find foo.hh on many BSD based systems.
This is the list of the header files that ANSI-C (and thus C++) requires be provided by the language implementation. Use of any other "system" header file may not be portable.
Uses of these C header files are not required to be bracketed with the extern "C" { } construct.

    #include <stddef.h>
    #include <stdio.h>
    #include <stdarg.h>
    #include <stdlib.h>
    #include <locale.h>
    #include <ctype.h>
    #include <string.h>
    #include <time.h>
    #include <limits.h>
    #include <errno.h>
    #include <assert.h>
    #include <signal.h>
    #include <setjmp.h>
    #include <math.h>
    #include <float.h>
Header files that must be included by both C and C++ source have slightly different rules than do C++-only header files.
headerfile.h instead of C++'s headerfile.hh
- All C++ keywords must be avoided. The C++ keywords that are not in C are: asm, catch, class, delete, friend, inline, new, operator, private, protected, public, template, this, throw, try, virtual.
/* */, not C++'s //.
(void), not just ().
- Function prototypes must always be provided.
This project has no interest in any non-ANSI-C dialects of the C language.
The inclusion of every non-C++ header file must be surrounded by the extern "C" { } construct.
Note that the header files enumerated above (ANSI-C/C++ include files:, § 12.1) are considered C++ header files, and not subject to this rule.
    extern "C" {
    #include <somefile.h>
    #include <otherfile.h>
    }
Function calls that are intended to be called from C that take input-only struct arguments may wish to use pointers, since C does not have references. Such pointers must, of course, be declared const.
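A sketch of such an interface, seen from the C++ side. WfPoint and WfPointSum are hypothetical names chosen for this example; the structure is passed as a pointer to const so that a C caller can use it and the callee cannot modify it:

```cpp
#include <cstddef>

// Hypothetical C-compatible structure and function, for illustration only.
extern "C" {

struct WfPoint
{
    int x;
    int y;
};

// Callable from C: an input-only struct is passed as a pointer to const.
int WfPointSum(const struct WfPoint *point)
{
    if (point == NULL)
        return 0;
    else
        return point->x + point->y;
}

}
```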
In order to be able to export a C++ library to C users, you'd like to let the C users link to the library using the regular C/Unix linker.
The CC linker (also called the "C++ pre-linker" or "patch" or "munch") is the part of the C++ system that makes static constructors and static destructors work. These are the routines that get called if you have a global (or a local static) variable whose class has a constructor. The C++ pre-linker paws through your object files looking for the right pattern of mangled name that indicates constructors and/or destructors that need to be called for that file, puts them all together into an initialization routine named _main(), and links that synthesized _main() into your program. Cfront has inserted a call to _main() at the start of your main C++ program.
So, on SunOS 4.x, if your library has any global, file-static or local-static variables whose classes have constructors or destructors, you must use the CC command to link any application to that library.
SunOS 5.0 object files allow libraries to have a .init section that gets called to initialize the library. The constructors would be put into this section (and not _main()), thus avoiding the linking problems mentioned above.