STYLE(9) BSD Kernel Manual STYLE(9)
style - BSD style guide for (mostly C) source code
This manual page specifies the preferred style for C source files in the MirBSD source tree. It used to be called "Kernel source file style guide (KNF)", although it applied to userland code since almost forever. These guidelines should be followed for all new code. In general, code can be considered "new code" when it makes up about 50% or more of the file(s) involved. This is enough to break precedent in the existing code and use the current style guidelines.

	/*-
	 * Style guide for the MirOS Project's Coding Styles.
	 * Derived from the OpenBSD KNF (Kernel Normal Form).
	 * indent(1) does not reformat this comment.
	 */

	/**
	 * This is a documentation comment in doxygen format.
	 * clang -Wcomment checks them for correct syntax.
	 * Use only if you're honouring that standard.
	 */

	/*
	 * VERY important single-line comments look like this.
	 */

	/* almost all small single-line comments look like this */
	/* A few others are sentences, thus end with a full stop. */

	/*
	 * Multi-line comments look like this. Make them real sentences,
	 * in contrast to single-line comments. Fill them so they look
	 * like real paragraphs. indent(1) does reformat this comment.
	 * (XXX: does it? It says slash+star+newline isn't reformatted.)
	 * Encode files as ISO_646.irv:1991 7-bit character set ("ASCII")
	 * or UTF-8 with codepoints from the UCS BMP only. Use traditional
	 * German, Latin, British English (adapted for ICT), metric and SI
	 * units and prefixes preferring ISO/IEC 60027-2 binary praefices.
	 */

Kernel include files (i.e. <sys/*.h>) come first. You usually need either <sys/types.h> or <sys/param.h> but not both: BSD <sys/param.h> includes <sys/types.h>, which in turn includes <sys/cdefs.h>, and it's okay to depend on that. Also, add <sys/time.h> before the other system includes, but after <sys/types.h>. Put non-local includes in angle brackets, local includes in double quotes.

	#include <sys/types.h>
	#include <sys/time.h>
	#include <sys/ioctl.h>

Machine and device includes follow.
If it's a networked program, put the network include files next.

	#include <net/if.h>
	#include <net/if_dl.h>
	#include <net/route.h>
	#include <netinet/in.h>
	#include <protocols/rwhod.h>

Then there's an optional blank line, followed by the other files from /usr/include. The list of include files should be sorted by group and, if possible, within the group. That is, <sys/param.h> and <sys/time.h> first, then all other system includes, then machine and device includes, then network includes, then /usr/include files, then local includes.

	#include <stdio.h>

Global pathnames are defined in <paths.h>. Pathnames local to the program go in "pathnames.h" in the local directory.

	#include <paths.h>

Then there's a mandatory blank line, and the user include files.

	#include "pathnames.h"

The includes block is followed by another blank line to separate it from the file identification block. Add the CVS (or RCS) ID(s) of the file and, if taking over old source code, the SCCS ID and __COPYRIGHT as well; place another blank line after that. Listing RCS IDs as comments at the beginning of files is discouraged if it's possible to use these macros; copy them over instead (note this may introduce CVS branch merge conflicts). The macros in MirBSD are safe to be used more than once in the same file. This used to exclude most header files; some worked around this by defining a macro with their ID which would be expanded by the main source file, but if <sys/cdefs.h> can be relied on to have been included first or is included by the header file itself, there's no argument against using these macros in MirBSD. These four macros are defined in <sys/cdefs.h>:

	__COPYRIGHT("Copyright (c) 1989, 1993\n\
		The Regents [...] reserved.");
	__SCCSID("@(#)cat.c 8.2 (Berkeley) 4/27/95");
	__RCSID("$MirOS: src/bin/cat/cat.c,v [...] Exp $");
	__IDSTRING(someotherid, "bla");

All functions are prototyped somewhere. Function prototypes for private functions (i.e.
functions not used elsewhere) go at the top of the first source module. In userland, functions local to one source module should be declared 'static'. This should not be done in kernel land since it makes it almost impossible to use the kernel debugger.

Functions used from other parts of the kernel are prototyped in the relevant include file.

It is strongly recommended that include files do not include other header files. Document interdependencies in the relevant manual pages.

Functions that are used locally in more than one module go into a separate header file, e.g. "extern.h". This file is allowed to precede the otherwise sorted list of local include files. Do not use the same names as other well-known header files, such as those in /usr/include, those used by library dependencies, or those that are part of other applications and are often pulled in by the make(1) ".PATH" command.

Use of the __P macro has been deprecated. It is allowed in code imported from other sources but is not to be used in native MirBSD code.

Only write out prototypes with full variable names in manual pages; in all other files, prototypes for documented functions should not have variable names associated with the types (this is what manpages are for); i.e. use:

	void function(int);

not:

	void function(int a);

Lining up prototypes after type names is discouraged because it is hard to maintain; use a single space. If lining up (existing code), prototypes may have an extra space after a tabulator to enable function names to line up:

	static char	*function(int, const char *);
	static void	 usage(void);

Function or macro names and their argument list are never separated by whitespace, only keywords that are not function-like (GCC attributes, sizeof, etc.) use a single space.

Use __dead from <sys/cdefs.h> for functions that don't return:

	__dead void abort(void);
	void usage(int) __dead;

Use GCC attributes extensively to catch programming errors, e.g.
	/* one line per function attribute */
	wchar_t *xambsntowcs(const char *, size_t)
	    __attribute__((__nonnull__(1)))
	    __attribute__((__bounded__(__string__, 1, 2)));

	/* use of an argument mandated by function pointer API */
	static int x_del_char(int __unused);

	/* macro evaluating its argument twice */
	#define DOUBLE(x) __extension__({		\
		__typeof__(x) DOUBLE_x = (x);		\
							\
		(DOUBLE_x + DOUBLE_x);			\
	})

In header files, put function prototypes within matching pairs of __BEGIN_DECLS / __END_DECLS. This makes the header file usable from languages like C++.

Labels start at the second column, i.e. are prefixed with exactly one single space (ASCII 20 hex) character, no matter which block they are in.

Macros are capitalised and parenthesised and should avoid side-effects. If they are an inline expansion of a function, the function is defined all in lowercase; the macro has the same name all in uppercase. In rare cases, function-like macros that evaluate their arguments only once are allowed to be treated like real functions and use lowercase. If the macro needs more than a single line, use braces. Right-justify the backslashes, as the resulting definition is easier to read. Use a single space after the #define cpp(1) command and, except if lining up single-line macros with tabstops, after the macro name or closing parentheses. In contrast to functions and invocations, the list of parameters omits the space after a comma separating parameters in macro definitions. If the macro encapsulates a compound statement, enclose it in a "do" loop, so that it can safely be used in "if" statements, like shown below. Do not forget the CONSTCOND lint(1) command. Any final statement-terminating semicolon shall be supplied by the macro invocation rather than the macro, to make parsing easier for pretty-printers and editors.

	#define MACRO(x,y) do {				\
		variable = (x) + (y);			\
		(y) += 2;				\
	} while (/* CONSTCOND */ 0)

Enumeration values are all uppercase.
	enum enumtype { ONE, TWO } et;

When declaring variables in structures, generally declare them sorted by use, then by size (largest to smallest), then by alphabetical order. You may attempt to optimise for structure padding to avoid wasting space. The first category normally doesn't apply, but there are exceptions. Each one gets its own line. It is strongly recommended to not line them up either. Use single spaces, but line up the comments if desirable or, better, place them on their own lines just before the item they apply to.

Major structures should be declared at the top of the file in which they are used, or in separate header files if they are used in multiple source files. Use of the structures should be by separate declarations and should be "extern" if they are declared in a header file.

	struct foo {
		struct foo *next;	/* list of active foo */
		struct mumble amumble;	/* comment for mumble */
		int bar;
	};
	struct foo *foohead;		/* head of global foo list */

Use queue(3) and tree(3) macros rather than rolling your own lists whenever possible. Thus, the previous example would be better written:

	#include <sys/queue.h>

	struct foo {
		/* foo list queue glue */
		LIST_ENTRY(foo) link;
		/* comment for mumble */
		struct mumble amumble;
		int bar;
	};
	/* head of global foo list */
	LIST_HEAD(, foo) foohead;

Avoid using typedefs for structure types. This makes it impossible for applications to use pointers to such a structure opaquely, which is both possible and beneficial when using an ordinary struct tag. When convention requires a typedef, make its name match the struct tag without being exactly identical to it (an identical name makes the code unusable from C++). Avoid typedefs ending in "_t", except as specified in Standard C or by POSIX.

	/* make the structure name match the typedef */
	typedef struct _bar {
		int level;
	} BAR;

	/*
	 * All major routines should have a comment briefly describing
	 * what they do. The comment before the "main" routine should
	 * describe what the program does.
	 */
	int
	main(int argc, char *argv[])
	{
		int aflag, bflag, ch, num;
		const char *errstr;

For consistency, getopt(3) should be used to parse options. Options should be sorted in the manual page SYNOPSIS and DESCRIPTION, any usage() or similar function, the getopt(3) call and the switch statement, unless parts of the switch cascade. Elements in a switch statement that cascade should have a FALLTHROUGH comment. Numerical arguments should be checked for accuracy. Code that cannot be reached should have a NOTREACHED comment. The CONSTCOND, FALLTHROUGH, and NOTREACHED comments benefit lint.

		while ((ch = getopt(argc, argv, "abn:")) != -1)
			switch (ch) {		/* indent the switch */
			case 'a':		/* don't indent the case */
				aflag = 1;
				/* FALLTHROUGH */
			case 'b':
				bflag = 1;
				break;
			case 'n':
				num = strtonum(optarg, 0, INT_MAX, &errstr);
				if (errstr) {
					warnx("number is %s: %s", errstr, optarg);
					usage();
				}
				break;
			case '?':		/* redundant here but ok */
			default:
				usage();
				/* NOTREACHED */
			}
		argc -= optind;
		argv += optind;

Cast expressions and the value to be cast are never separated by a space; use parentheses around the latter if it's a compound expression. Use a space after keywords (if, while, for, return, switch) but not unary operators like sizeof, typeof, alignof, or function-like constructs like GCC attributes (see above). It is recommended to put parentheses around the return argument as well, although this is not a strict requirement, to accommodate languages such as C++ and PHP in which the result type differs when surrounded by parentheses. It helps when debugging (define return to a debug expression) though, except for void functions.

No braces are used for control statements with zero or only a single statement, unless that statement is more than a single line (in which case they are permitted), it contains a comment (in which case they are recommended), or it contains a label (in which case they are mandated). A separate project may choose to mandate braces for all cases.
Avoid empty bodies; otherwise, push the semicolon to the next line and indent it, and comment it (here, a same-line comment is permitted), or perhaps insert a redundant continue statement; never place the semicolon on the same line as the loop, and don't use empty braces except if absolutely required (e.g. the NXC compiler fails at loops with a sole semicolon as body), in which case they must not have any whitespace between the opening and closing brace and follow the loop statement on the same line, separated by one space.

	for (p = buf; *p != '\0'; ++p)
		;	/* nothing */		// traditional style
	for (p = buf; *p != '\0'; ++p)
		/* nothing */;			// less legible
	for (p = buf; *p != '\0'; ++p)
		continue;	/* empty */	// optional; add a comment!
	while (/* CONSTCOND */ 1)	/* or: for (;;) */
		stmt;
	while (/* CONSTCOND */ 1) {
		z = a + really + long + statement + that + needs +
		    two + lines + gets + indented + four + spaces +
		    on + the + second + and + subsequent + lines;
	}
	while (/* CONSTCOND */ 1) {
		if (cond)
			stmt;
	}
	if (cond) {
		/* comment */
	 somelabel:
		stmt;
	}

Parts of a for loop head may be left empty, although while loops are preferable especially in such cases. Don't put declarations inside blocks unless the routine is unusually complicated.

	for (; cnt < 15; cnt++) {
		stmt1;
		stmt2;
	}

Indentation is an 8 character tab. Second level indents are four spaces.

	while (cnt < 20)
		z = a + really + long + statement + that + needs +
		    two + lines + gets + indented + four + spaces +
		    on + the + second + and + subsequent + lines;

This is true for almost all languages but not HTML or XHTML (it is valid for XML, though xmlstarlet fo does not wrap at all). In assembly source code, labels always begin at the first column, and the comma separating operands is normally unspaced:

	/* ... */
	Lout:	xor	eax,eax
		dec	ebx
		jnz	1f
		dec	eax
	1:	ret

In assembly for processors that have it, do indent the branch delay slot, whether cancelled (,a on SPARC) or not (this rare example uses spaces):

	ENTRY(bcopy)
	        cmp     %o2, BCOPY_SMALL
	        bge,a   Lbcopy_fancy    /* if >= this many, go be fancy */
	         btst   7, %o0          /* (part of being fancy) */

	        /* not much to copy, just do one byte at a time */
	        deccc   %o2
	        /* ... */

Do not add whitespace at the end of a line, and only use tabs followed by spaces to form the indentation. Never use more spaces than a tab will produce and do not use spaces in front of tabs. vim users are required to put "let c_space_errors = 1" into their ~/.vimrc. Fixup with ^K] in jupp.

Closing and opening braces go on the same line as the else. Braces that aren't necessary may be left out, unless they cause a compiler warning.

	if (test)
		stmt;
	else if (bar != NULL) {
		stmt;
		stmt;
	} else
		stmt;

Avoid doing multiple assignments in one statement, like this:

	/* bad example */
	if (foo) {
		stmt;
		stmt;
	} else if (bad)
		*wp++ = QCHAR, *wp++ = c;
	else
		stmt;

	/* write this as */
	if (foo) {
		stmt;
		stmt;
	} else if (good) {
		*wp++ = QCHAR;
		*wp++ = c;
	} else
		stmt;

Do not use spaces after function names. Commas have a space after them. Do not use spaces after '(' or '[' or preceding ']' or ')' characters.

	if ((error = function(a1, a2)))
		exit(error);

Use positive error codes. Negative errors (except -1) are something only the Other OS does.

Unary operators don't require spaces; binary operators do. Don't use parentheses unless they're required for precedence, the statement is confusing without them, or the compiler generates a warning without them. Remember that other people may be confused more easily than you. Do YOU understand the following?

	a = b->c[0] + ~d == (e || f) || g && h ? i : j >> 1;
	k = !(l & FLAGS);

It's much better to break after an operator if you need to apply line breaks. This is especially true for shell scripts.
The above example could be rewritten as:

	a = (((b->c[0] + ~d) == (e || f)) || (g && h)) ? i : (j >> 1);

Lines ought to be no longer than 80 characters. Stick to 75 characters or fewer if possible, but in some cases it's ok to put a character in the 80th column. Descriptions should not be longer than 66 characters; eMails must not be longer than 72 characters per line. In object-oriented languages, it may be acceptable to use up to 100 characters per line.

Exits and returns should be 0 on success, and non-zero for errors.

		/*
		 * avoid obvious comments such as
		 * "Exit 0 on success."
		 */
		exit(0);
	}

The function type should be on a line by itself preceding the function. This eases searching for a function implementation:

	$ grep -r '^function' .

	static char *
	function(int a1, int a2, float fl, int a4)
	{

When declaring local variables in functions, declare them sorted by size (largest to smallest), then in alphabetical order; multiple ones per line are okay. If a line overflows, restate the type keyword. Declarations should follow ANSI X3.159-1989 ("ANSI C89").

Be careful not to obfuscate the code by initialising variables in the declarations. Use this feature only thoughtfully. DO NOT use function calls in initialisers!

		struct foo one, *two;
		double three;
		int *four, five;
		char *six, seven, eight, nine, ten, eleven, twelve;

		four = myfunction();

Do not declare functions inside other functions: such declarations have file scope regardless of the nesting of the declaration.

Note that indent(1) comes with a sample .indent.pro which understands most of these rules, starting from MirBSD #9. As of MirBSD #11, it will also be installed into the user home skeleton directory.

Use of the 'register' specifier is discouraged in new code. Optimising compilers such as GCC can generally do a better job of choosing which variables to place in registers to improve code performance.
Functions containing assembly code where explicit register placement is required for proper code generation in the absence of compiler optimisation are excepted from this rule.

When using longjmp() or vfork() in a program, the -Wextra or -Wall flag should be used to verify that the compiler does not generate warnings such as

	warning: variable `foo' might be clobbered by `longjmp' or `vfork'.

If any warnings of this type occur, apply the "volatile" type-qualifier to the variable in question. Failure to do so will result in improper code generation when optimisation is enabled. Note that for pointers, the location of "volatile" specifies whether the type-qualifier applies to the pointer or the thing being pointed to. A volatile pointer is declared with an extra space and "volatile" to the right of the "*". Example:

	char * volatile foo;

says that "foo" is volatile, but "*foo" is not. To make "*foo" (the thing being pointed to) volatile use the syntax

	volatile char *foo;

and if both the pointer and the thing pointed to are volatile use:

	volatile char * volatile foo;

"const" is also a type-qualifier and the same rules apply. Assume string literals are constant. Never make use of broken C APIs such as strchr(3) to "cast away" the "const" qualifiers. The description of a read-only hardware register might look something like:

	const volatile char *reg;

Global flags accessed inside signal handlers should be of type "volatile sig_atomic_t" if at all possible. This guarantees that the variable may be accessed as an atomic entity, even when a signal has been delivered. Global variables of other types (such as structures) are not guaranteed to have consistent values when accessed via a signal handler.

NULL is the preferred nil pointer constant. Never use 0 in place of NULL. Use NULL instead of (type *)0 or (type *)NULL in all cases except for (sentinel) arguments to variadic functions where the compiler does not know the type as some systems define NULL to 0 not "(void *)0UL".
Don't use '!' for tests unless it's a boolean (or an integral flag used in a boolean context, almost certainly with a bitwise AND operation, since otherwise using a bool would be correct), i.e. use

	if (p != NULL && *p == '\0')

not

	if (p && !*p)

Routines returning void * should not have their return values cast to any pointer type. Functions used as procedures should not have their return value explicitly cast to void, either. The exceptions are function-like macros like sigaddset(3), where failure to do so may result in compiler warnings about unused LHS in comma operations. You can assume that pointers to variables and function pointers share the same address space and have the same size as ptrdiff_t.

Use err(3) or warn(3), don't roll your own!

		if ((four = malloc(sizeof(struct foo))) == NULL)
			err(1, NULL);
		if ((six = (int *)overflow()) == NULL)
			errx(1, "Number overflowed.");
		return (eight);
	}

Old-style function declarations looked like this:

	static char *
	function(a1, a2, fl, a4)
		int a1, a2;	/* declare ints, too, don't default them */
		float fl;	/* beware double vs. float prototype differences */
		int a4;		/* list in order declared */
	{
		...
	}

You really ought to replace them with ANSI C function declarations. Long parameter lists are wrapped with a normal four space indent.

Variable numbers of arguments should look like this:

	#include <stdarg.h>

	void vaf(const char *fmt, ...)
	    __attribute__((__format__(__printf__, 1, 2)));

	void
	vaf(const char *fmt, ...)
	{
		va_list ap;

		va_start(ap, fmt);
		STUFF;
		va_end(ap);

		/* No return needed for void functions. */
	}

	static void
	usage(void)
	{

Usage statements should take the same form as the synopsis in mdoc(7) manual pages.
Options without operands come first, in case-insensitive (uppercase first) alphabetical order inside a single set of brackets, followed by options with operands in case-insensitive (uppercase first) alphabetical order, each in brackets, followed by required arguments in the order they are specified, followed by optional arguments in the order they are specified. A bar ('|') separates either-or options/arguments, and multiple options/arguments which are specified together are placed in a single set of brackets.

If numbers are used as options, they should be placed first, as shown in the example below. Note that the options list ordering should be purely alphabetical, except that the no-argument options are listed first.

	"usage: f [-12aDde] [-b barg] [-m marg] req1 req2 [opt1 [opt2]]\n"
	"usage: f [-a | -b] [-c [-de] [-n number]]\n"

The getprogname(3) function may be used instead of hard-coding the program's name.

		fprintf(stderr, "usage: %s [-ab]\n", getprogname());
		exit(1);
	}

New core kernel code should be reasonably compliant with the style guides. The guidelines for third-party maintained modules and device drivers are more relaxed but at a minimum should be internally consistent with their style; the current MirPorts Framework package tools are a bad example of style inconsistency (such as three different indentation styles: three spaces, four spaces and KNF one tab) and a good example of why it must be prevented.

Whenever possible, code should be run through at least one code checker (e.g. "gcc -Wall -Wextra -Wpointer-arith -Wbad-function-cast ...", "make __CRAZY=Yes", lint(1) or splint from the ports tree) and produce minimal warnings.

Write source code that compiles without any warnings, failures or malfunctions with gcc3 -Os -std=c99 -Wbounded -Wformat (or -std=c89), pcc -O and gcc4 -O2 -fwrapv -std=c99 -Wformat. Try to keep code working with gcc3 -std=c89 or at least both of -std=c99 and -std=gnu89.
When rolling your own bools, use German names:

	Wahr, Ja, Nee, isWahr()		("is" + lcase is reserved)

Otherwise the code might stop compiling at some point because a subsequent edition of ISO C turns the chosen English names into keywords, making your type and constant definitions UB. This also applies to other framework-ish inventions.

Note that documentation follows its own style guide, as documented in mdoc.samples(7). This however does not invalidate the hints given in this guide.

Shell scripts also follow this guide as far as applicable, except function declarations are all on one line; never use the "`" character. Use Korn syntax and reasonably recent mksh(1) extensions.

	function bla {
		(( x )) && foo=abcdefghijklmnopqrstuvwxyz$(fnord \
		    ABCDEFGHIJKLMNOPQRSTUVWXYZ)
		[[ $foo = @([a-z_]*([a-z0-9_])) ]] || exit 1
	}
/usr/share/misc/licence.template
	Licence preferred for new code.

~/.indent.pro
	should contain at least the following items:
	-c0 -ci4 -di0 -nbs -ncsp -nfc1 -nlp -nlpi -Tbool -Tint16_t -Tint32_t -Tint64_t -Tint8_t -Tintmax_t -Tintptr_t -Tmbstate_t -Toff_t -Tptrdiff_t -Tsize_t -Tssize_t -Ttime_t -Tuint16_t -Tuint32_t -Tuint64_t -Tuint8_t -Tuintmax_t -Tuintptr_t -Twchar_t -Twint_t
	and a bunch of others. Note that the -c0 option might be problematic for existing code and may be better left out.

mircvs://src/usr.bin/indent/.indent.pro
	contains a more complete list; even then, indent(1) provides only basic help in applying KNF.
indent(1), lint(1), err(3), queue(3), warn(3), mdoc.samples(7)
This manual page is largely based on the src/admin/style/style file from the BSD 4.4-Lite2 release, with updates to reflect the current practice and desire first of the OpenBSD project, then for the MirBSD source tree, including an improved Open Source licence. MirBSD #10-current February 19, 2022