Mercurius — C Coding Standards
This document defines how C is written in the Mercurius codebase.
It is:
- influenced by the principles of MISRA C
- focused on robustness, security, and long‑term maintainability
- designed for a network‑native, protocol‑driven system
All new C code in Mercurius MUST comply with this document.
The intent is that this document is machine‑checkable: an AI or static tool should be able to scan the codebase and flag violations.
1. Goals and Philosophy
1.1 Primary goals
Mercurius C code MUST be:
- Correct: no undefined behaviour, no data races, no protocol violations
- Predictable: behaviour does not depend on compiler, platform, or luck
- Maintainable: readable and understandable years after it was written
- Defensive: never trusts the network, the filesystem, or the caller
- Boring: no clever tricks, no surprising control flow, no magic macros
1.2 Design philosophy
- Clarity over cleverness: if intent is not obvious, rewrite it.
- Explicit over implicit: conversions, ownership, and lifetimes must be clear.
- Safety over convenience: avoid constructs that are easy to misuse.
- Small pieces, loosely coupled: modules should be comprehensible in isolation.
- Long‑term view: write code future‑you will still respect.
2. Language Subset
Mercurius uses a conservative subset of ISO C (C11 or later where available).
2.1 Forbidden features
The following are NOT PERMITTED:
- Variable Length Arrays (VLAs)
- Recursion (direct or indirect)
- setjmp/longjmp
- goto into a different scope
- Non‑standard compiler extensions (unless guarded and justified)
- Implicit function declarations
- K&R‑style function definitions
- Trigraphs and digraphs
2.2 Strongly discouraged features
These MAY be used only with clear justification in comments:
- Function‑like macros
- Unions (only for safe, well‑documented type punning)
- Bit‑fields (only when layout and semantics are fully understood)
3. Types and Integer Safety
3.1 Fixed‑width integers
All protocol‑visible and size‑sensitive integers MUST use fixed‑width types from <stdint.h>:
uint8_t, uint16_t, uint32_t, uint64_t
int8_t, int16_t, int32_t, int64_t
These types provide explicit width, predictable layout, and stable wire formats.
Use size_t for memory sizes, buffer lengths, and element counts.
Use ptrdiff_t for pointer differences.
Native integer types (int, long, long long) MAY be used only when width does not affect correctness and the value never crosses a module boundary.
Do NOT assume the size of any native integer type.
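To illustrate why fixed-width types matter for wire formats, the sketch below writes and reads a uint32_t one byte at a time, so the result does not depend on host endianness or on the width of int. The helper names and the choice of big-endian order are assumptions made for this example; they are not taken from the Mercurius protocol definition.

```c
#include <stdint.h>

/*
 * Hypothetical helper: store a 32-bit value in big-endian byte order.
 * Precondition: out points to at least 4 writable bytes.
 */
static inline void mws_example_put_u32_be(uint8_t out[4], uint32_t value)
{
    /* Each operand is masked to 8 bits, so the casts cannot truncate. */
    out[0] = (uint8_t)((value >> 24) & 0xFFu);
    out[1] = (uint8_t)((value >> 16) & 0xFFu);
    out[2] = (uint8_t)((value >> 8) & 0xFFu);
    out[3] = (uint8_t)(value & 0xFFu);
}

/*
 * Hypothetical helper: load a 32-bit big-endian value.
 * Precondition: in points to at least 4 readable bytes.
 */
static inline uint32_t mws_example_get_u32_be(const uint8_t in[4])
{
    /* Widen each byte to uint32_t before shifting, so no shift happens in a promoted int. */
    return ((uint32_t)in[0] << 24)
         | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)
         | (uint32_t)in[3];
}
```

Because every shift and mask is written out, the same bytes appear on the wire regardless of the compiler or platform.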
3.2 Signed vs. unsigned
Signedness must be explicit and intentional.
- Do NOT mix signed and unsigned values in arithmetic or comparisons without an explicit cast and a comment explaining why it is safe.
- Do NOT rely on signed integer wrap‑around; it is undefined behaviour.
- Unsigned wrap‑around is defined, but MUST NOT be used as a control mechanism unless clearly documented and justified.
- When converting between signed and unsigned types, document the invariant that makes the conversion safe.
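The rules above can be sketched as two small helpers: one converts a signed value to size_t only after proving it is in range, the other adds two sizes without relying on unsigned wrap-around. Both function names are hypothetical and exist only for this illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a signed 64-bit count to size_t.
 * Returns false, leaving *out untouched, if the value does not fit.
 */
static inline bool mws_example_i64_to_size(int64_t value, size_t *out)
{
    if (out == NULL)
    {
        errno = EINVAL;
        return false;
    }

    if (value < 0)
    {
        return false;
    }

    /* Invariant: value >= 0, so the cast to uint64_t preserves the value. */
    if ((uint64_t)value > (uint64_t)SIZE_MAX)
    {
        return false;
    }

    *out = (size_t)value; /* in range: 0 <= value <= SIZE_MAX */
    return true;
}

/* Hypothetical helper: add two sizes, rejecting a sum that would wrap around. */
static inline bool mws_example_size_add(size_t a, size_t b, size_t *sum)
{
    if (sum == NULL)
    {
        errno = EINVAL;
        return false;
    }

    /* SIZE_MAX - b cannot wrap, so this test is safe for all inputs. */
    if (a > (SIZE_MAX - b))
    {
        return false;
    }

    *sum = a + b;
    return true;
}
```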
3.3 Booleans
Use bool from <stdbool.h> for logical values.
- Do not use int as a boolean except where required by external APIs.
- Do not use integers as implicit truth values in new code.
- Boolean expressions MUST NOT rely on implicit conversions from pointers or integers.
3.4 Type casts
Casts MUST be avoided unless they are strictly necessary. Unnecessary casts hide bugs, silence useful compiler diagnostics, and make invariants harder to reason about.
When a cast is required:
- the cast MUST be explicit
- the cast MUST be correct for the value range involved
- the code MUST document the invariant that makes the cast safe
- the cast MUST NOT change signedness unless the value is provably in range
- pointer casts MUST only be used when converting between compatible representations or when interacting with external APIs that require them
Examples of acceptable casts:
/* Converting a validated 16‑bit length field to size_t. */
size_t len = (size_t)hdr->length; /* length ≤ MWS_MAX_PAYLOAD_BYTES */
/* Casting to uint32_t after checking the value fits. */
if (x <= UINT32_MAX)
{
    out = (uint32_t)x;
}
Examples of unacceptable casts:
/* Hides a signed/unsigned bug. */
if ((int)u < 0) …
/* Blindly truncates. */
uint16_t n = (uint16_t)big_number;
/* Pointer reinterpretation without justification. */
struct foo *f = (struct foo *)ptr;
Casts MUST NOT be used to silence warnings. If the compiler complains, fix the code or document the invariant that makes the cast safe.
If you lie to the compiler, it will get its revenge.
3.5 Variable initialisation
All variables MUST be initialised at the point of declaration.
- Do not rely on later assignments or control-flow analysis to ensure initialisation.
- Initialisation may be a zero-initialiser ({0}) or a clearly defined sentinel value.
- This rule exists to prevent use of indeterminate values and to keep the code robust under refactoring and compiler optimisation.
4. Naming and Structure
4.1 Files and modules
- One logical module per .c file.
- Each module SHOULD have a corresponding .h if it exposes an interface.
- Internal‑only headers live in src/ and are not installed.
4.2 Header guards
Every header MUST use a traditional include guard:
#ifndef MWS_FOO_H
#define MWS_FOO_H
/* declarations */
#endif // MWS_FOO_H
Do not use #pragma once.
4.3 Naming conventions
These are conventions, but consistency is expected. Every function name follows a big_little_littlest pattern:
big: the subsystem prefix:
- mwsc_: client (mwsc CLI and support library)
- mwsd_: server daemon
- mws_: shared protocol and transport layer
little: the object type or component:
- mwsc_control_, mwsc_session_, mwsd_session_, mwsd_broker_, mws_transport_
littlest: a verb describing the action:
- mwsc_control_run_handshake, mwsd_session_add_window, mwsd_broker_attach_session, mws_transport_destroy
No enforced _t or _e suffixes. Use names that read naturally.
5. Control Flow
5.1 Braces and layout
Braces are mandatory, even for single statements. This avoids ambiguity, prevents accidental logic changes during edits, and keeps the code visually consistent.
if (condition)
{
    do_something();
}
else
{
    do_something_else();
}
No single‑line if/else without braces.
The infamous goto fail bug in Apple’s TLS stack was caused by a duplicated goto fail; line after an un‑braced if statement, which made the jump unconditional. Always use braces.
5.1.1 Placement
Opening braces always go on their own line, aligned with the controlling statement:
while (true)
{
    do_work();
}
Closing braces align with the construct that opened them.
5.1.2 Empty blocks
Empty blocks must contain a placeholder comment to make the emptiness intentional:
while (something)
{
    /* this space intentionally left blank */
}
Never use {} on a single line.
Why this rule exists
- An empty block with no comment looks like a mistake.
- It’s easy to accidentally delete a line and leave behind a silent empty block.
- A placeholder comment makes the intent explicit.
- It improves readability when scanning quickly.
- It prevents future contributors from “fixing” the emptiness incorrectly.
5.1.3 Multi‑line conditions
If a condition spans multiple lines, operators go at the start of continuation lines:
if (very_long_expression
    && another_condition
    && something_else)
{
    handle_case();
}
This makes grouping unambiguous and prevents misreading.
5.1.4 Explicit parentheses in conditions
Parentheses may be used liberally to make logical grouping explicit, even when not required by C operator precedence:
if ((a && b)
    || (force && (mode == MODE_OVERRIDE)))
{
    do_work();
}
5.1.5 else if chains
if (a)
{
    …
}
else if (b)
{
    …
}
else
{
    …
}
5.1.6 Loops
All loops require braces, even for a single statement:
for (int i = 0; i < n; i++)
{
    process(i);
}
5.1.7 Indentation
Indentation is 4 spaces, never tabs.
Braces define the indentation level; nothing else does.
5.2 switch statements
- Every switch MUST have a default: case.
- Every case MUST end in break, return, or goto cleanup;.
- Fall‑through MUST be explicit and documented:
switch (state)
{
    case MWS_STATE_INIT:
    {
        init_stuff();
        /* fall through */
    }
    case MWS_STATE_RUNNING:
    {
        run_stuff();
        break;
    }
    default:
    {
        handle_unexpected_state(state);
        break;
    }
}
5.3 Loops
- Use do { ... } while (condition); when the body must run at least once.
- Use while (condition) { ... } when the body might not run at all.
- Use for loops only when the iteration variable and bounds are explicit (e.g. counting from 0 to N).
- Never use for (;;) to express an infinite loop. Infinite loops must use while (true) with a clear comment:
while (true)
{
    /* main event loop */
}
- Use break for loop termination sparingly; prefer expressing the exit condition in the loop header instead of “hidden” exits in the body:
/* Avoid termination buried inside an apparently infinite loop. */
for (;;)
{
    do_something();
    if (condition) /* <= do not do this */
        break;     /* <= */
    do_something_else();
    do_other_thing();
}
- Avoid continue unless it significantly improves clarity. Overuse of break and continue makes control flow harder to follow.
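As a counterpart to the loop to avoid, the sketch below keeps the whole exit condition in the loop header and uses neither break nor continue. The function name and its purpose (finding the first zero byte) are hypothetical, chosen only to give the loop something to do.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the index of the first zero byte in buf,
 * or len if there is none (or if buf is NULL).
 */
static inline size_t mws_example_find_zero(const uint8_t *buf, size_t len)
{
    size_t index = 0;
    bool found = false;

    if (buf == NULL)
    {
        errno = EINVAL;
        return len;
    }

    /* Both ways of leaving the loop are visible in the header. */
    while ((index < len) && (!found))
    {
        if (buf[index] == 0u)
        {
            found = true;
        }
        else
        {
            index++;
        }
    }

    return index;
}
```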
5.4 goto
goto is permitted only for error handling and cleanup within a single function:
int mws_thing_do(struct mws_thing *t)
{
    int rc = 0;
    resource_a *a = NULL;
    resource_b *b = NULL;
    a = acquire_a();
    if (a == NULL)
    {
        rc = -1;
        goto out;
    }
    b = acquire_b();
    if (b == NULL)
    {
        rc = -1;
        goto out_a;
    }
    /* normal work */
out_b:
    release_b(b);
out_a:
    release_a(a);
out:
    return rc;
}
No jumping into blocks, no spaghetti control flow.
5.5 Preprocessor conditionals
All conditional compilation blocks must label their closing #endif with a comment matching the opening directive:
#if defined(MWSD_DEBUG)
// ...
#endif // MWSD_DEBUG
Nested conditionals must also be labelled:
#if defined(MWSD_DEBUG)
// ...
# if defined(MWSD_TRACE)
// ...
# endif // MWSD_TRACE
#endif // MWSD_DEBUG
This prevents mis‑nested or unintended conditional blocks.
6. Memory and Resource Management
Mercurius is not an embedded system; dynamic memory is allowed, but it MUST be disciplined.
6.1 Ownership
Every dynamically allocated object MUST have a clear owner.
- The owner is responsible for freeing it.
- Ownership transfer MUST be documented in comments or function documentation.
- Functions MUST document whether they take ownership of pointers passed in.
Example:
/**
* Takes ownership of `msg`. Caller MUST NOT use `msg` after this call.
*/
void mws_queue_message(struct mws_queue *q, struct mws_message *msg);
6.2 Allocation and failure
- Every malloc/calloc/realloc MUST have its return value checked.
- On allocation failure, functions MUST return an error code and leave the system in a consistent state.
- No function may assume allocation “cannot fail”.
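A minimal sketch of these rules, assuming a hypothetical growable buffer type: the realloc result goes into a temporary first, so a failed allocation leaves the buffer exactly as it was. The struct and function names are illustrative only.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable byte buffer used only for this example. */
struct mws_example_buf
{
    uint8_t *data;
    size_t capacity;
};

/*
 * Hypothetical helper: grow buf to at least new_capacity bytes.
 * Returns 0 on success, -1 on failure. On failure buf is unchanged.
 */
static inline int mws_example_buf_reserve(struct mws_example_buf *buf, size_t new_capacity)
{
    uint8_t *grown = NULL;

    if (buf == NULL)
    {
        errno = EINVAL;
        return -1;
    }

    if (new_capacity <= buf->capacity)
    {
        return 0; /* already large enough */
    }

    /* Assigning straight to buf->data would leak the old block if realloc failed. */
    grown = realloc(buf->data, new_capacity);
    if (grown == NULL)
    {
        return -1;
    }

    buf->data = grown;
    buf->capacity = new_capacity;
    return 0;
}
```

A zero-initialised struct is a valid empty buffer here, because realloc with a NULL pointer behaves like malloc.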
6.3 No hidden allocations
Functions MUST NOT allocate memory behind the caller’s back unless:
- the behaviour is clearly documented, and
- the caller has a way to free the memory.
6.4 Lifetime and cleanup
- Use a single cleanup: (or equivalent) label per function for releasing resources.
- Ensure all acquired resources are released on all exit paths.
- Avoid long functions with complex lifetime graphs; refactor instead.
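One way to get a single cleanup label is to initialise every resource pointer to NULL and release them all unconditionally, relying on free(NULL) being a no-op. The sketch below assumes two hypothetical helpers that compare strings through temporary lower-case copies; the task is contrived and only the resource handling matters.

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap-allocated lower-case copy of text, or NULL. The caller frees it. */
static inline char *mws_example_lower_copy(const char *text)
{
    size_t len = strlen(text);
    char *copy = malloc(len + 1u);

    if (copy != NULL)
    {
        for (size_t i = 0; i < len; i++)
        {
            copy[i] = (char)tolower((unsigned char)text[i]);
        }
        copy[len] = '\0';
    }

    return copy;
}

/* Hypothetical helper: case-insensitive equality. Returns 0 on success, -1 on failure. */
static inline int mws_example_equal_ignore_case(const char *a, const char *b, bool *equal)
{
    int rc = -1;
    char *lower_a = NULL;
    char *lower_b = NULL;

    if ((a == NULL) || (b == NULL) || (equal == NULL))
    {
        errno = EINVAL;
        return -1;
    }

    lower_a = mws_example_lower_copy(a);
    if (lower_a == NULL)
    {
        goto cleanup;
    }

    lower_b = mws_example_lower_copy(b);
    if (lower_b == NULL)
    {
        goto cleanup;
    }

    *equal = (strcmp(lower_a, lower_b) == 0);
    rc = 0;

cleanup:
    free(lower_b); /* free(NULL) is a no-op, so one label covers every path */
    free(lower_a);
    return rc;
}
```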
6.5 Global state
- Global mutable state is strongly discouraged.
- If used, it MUST be documented, and access MUST be controlled.
- No hidden singletons.
6.6 Object-style structs and payloads
Many Mercurius subsystems use C structs in an object-like way (e.g. handshake contexts, AUTH lists, surface capabilities, session info). These MUST follow a clear init/use/destroy pattern.
- Non-trivial structs (those that own memory or other resources) MUST provide an explicit *_init and *_destroy API, or be documented as requiring zero-initialisation plus *_destroy.
- The function that calls *_init (or otherwise becomes the first owner) is responsible for eventually calling *_destroy. Ownership MUST NOT silently migrate between unrelated layers.
- Helper functions that operate on these structs (parsers, encoders, transport helpers) MUST treat them as borrowed objects. They MUST NOT free or reallocate internal buffers unless this is explicitly documented.
Example:
struct mws_surface_caps caps = {0};
if (!mws_surface_caps_from_message(&caps, msg))
{
    mws_surface_caps_destroy(&caps);
    return false;
}
/* use caps */
mws_surface_caps_destroy(&caps);
6.6.1 Message payload ownership
MwsMessage is a container; it does not by itself own the lifetime of its payload. Heap-backed payloads follow these rules:
- The layer that arranges allocation of msg->payload (typically a handshake step or higher-level API) is the owner. It is responsible for freeing the payload and resetting the fields when it is done.
- Codec and transport helpers (encode/decode, send/recv) MUST NOT take ownership of msg->payload. They MUST NOT free or reallocate it. They may only read from it (encode/send) or write into a caller-owned buffer (decode/recv).
- Functions that do take ownership of a MwsMessage or its payload MUST explicitly document this in their interface, as in the example in §6.1.
7. Macros and Constants
7.1 Macros
Prefer static inline functions over function‑like macros.
Function‑like macros are only allowed when:
- they cannot be expressed as a function without performance or type penalties, and
- they are simple, side‑effect‑free, and clearly documented.
Macros MUST NOT evaluate arguments multiple times.
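The double-evaluation hazard is easiest to see with a classic MAX macro, shown here only in a comment. The static inline replacement is a hypothetical example, not an existing Mercurius function.

```c
#include <stdint.h>

/*
 * Unsafe form, shown for contrast only:
 *
 *     #define MWS_EXAMPLE_MAX(a, b) (((a) > (b)) ? (a) : (b))
 *
 * MWS_EXAMPLE_MAX(i++, j) expands i++ twice, so i is incremented twice
 * whenever the first argument is the larger one.
 */

/* Hypothetical replacement: each argument is evaluated exactly once at the call site. */
static inline uint32_t mws_example_max_u32(uint32_t a, uint32_t b)
{
    return (a > b) ? a : b;
}
```

The function version also gives the compiler real parameter types to check, which the macro cannot.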
7.2 Constants
Use const variables or #define for constants:
#define MWS_MAX_CLIENTS 64
static const uint32_t MWS_DEFAULT_PORT = 4242u;
Magic numbers in code MUST be replaced with named constants.
8. Error Handling
8.1 Return codes
- Functions that can fail MUST return an explicit status (e.g. int, enum mws_status).
- 0 or a named success value indicates success; non‑zero or named error codes indicate failure.
- Do not overload return values with mixed “data or error” semantics unless clearly documented.
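A sketch of keeping status and data apart: the status is the return value and the parsed result travels through an out-parameter, so no value of the result doubles as an error marker. The enum and function below are hypothetical; the real enum mws_status, if one exists, may look different.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status type for this example. */
enum mws_example_status
{
    MWS_EXAMPLE_OK = 0,
    MWS_EXAMPLE_ERR_INVALID,
    MWS_EXAMPLE_ERR_RANGE
};

/* Hypothetical helper: parse a decimal port number. *port is written only on success. */
static inline enum mws_example_status mws_example_parse_port(const char *text, uint16_t *port)
{
    uint32_t value = 0;
    size_t i = 0;

    if ((text == NULL) || (port == NULL))
    {
        errno = EINVAL;
        return MWS_EXAMPLE_ERR_INVALID;
    }

    if (text[0] == '\0')
    {
        return MWS_EXAMPLE_ERR_INVALID; /* empty input is malformed */
    }

    while (text[i] != '\0')
    {
        if ((text[i] < '0') || (text[i] > '9'))
        {
            return MWS_EXAMPLE_ERR_INVALID;
        }

        /* value <= 65535 on entry, so value * 10 + 9 fits easily in uint32_t. */
        value = (value * 10u) + (uint32_t)(text[i] - '0');
        if (value > (uint32_t)UINT16_MAX)
        {
            return MWS_EXAMPLE_ERR_RANGE;
        }

        i++;
    }

    *port = (uint16_t)value; /* value <= UINT16_MAX was checked inside the loop */
    return MWS_EXAMPLE_OK;
}
```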
8.2 Logging
- Errors SHOULD be logged at the appropriate layer.
- Do not spam logs with transient or expected conditions.
- Do not log secrets or sensitive data.
8.3 Defensive checks
- All external inputs (network, files, environment) MUST be validated.
- All lengths, indices, and offsets MUST be bounds‑checked before use.
- All protocol fields MUST be validated before acting on them.
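A bounds check is only as good as its own arithmetic: testing offset + len <= size can wrap around and pass for hostile inputs. The hypothetical helper below sketches one way to write the check so that no intermediate value can wrap.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true if the range [offset, offset + len) lies
 * entirely inside a buffer of buf_size bytes.
 */
static inline bool mws_example_range_ok(size_t buf_size, size_t offset, size_t len)
{
    if (offset > buf_size)
    {
        return false;
    }

    /* offset <= buf_size, so the subtraction cannot wrap. */
    return len <= (buf_size - offset);
}
```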
8.4 Error Categories and Control Flow
Mercurius distinguishes between three categories of error conditions.
These categories determine how functions return, whether logging occurs,
and whether the fake‑exception cleanup path is used.
8.4.1 Benign no‑ops
Some operations are naturally idempotent. Examples include:
- destroying or freeing an already‑destroyed object
- unmapping an already‑unmapped window
- closing a NULL or invalid handle
These are not errors. Functions SHOULD:
- return immediately
- NOT log anything
- NOT set errno
- NOT enter the fake‑exception handler
This is normal behaviour and not considered a failure.
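A benign no-op might look like the destroy function below, which is safe to call on NULL and safe to call twice. The blob type is hypothetical and stands in for any object that owns a heap buffer.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical object that owns one heap buffer. */
struct mws_example_blob
{
    uint8_t *data;
    size_t size;
};

/*
 * Hypothetical destroy function. Destroying NULL or an already-destroyed
 * blob returns at once: nothing is logged and errno is left untouched.
 */
static inline void mws_example_blob_destroy(struct mws_example_blob *blob)
{
    if ((blob == NULL) || (blob->data == NULL))
    {
        return;
    }

    free(blob->data);
    blob->data = NULL; /* makes a second destroy call harmless */
    blob->size = 0;
}
```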
8.4.2 Caller‑side errors (invalid parameters)
Invalid parameters are the caller’s responsibility, not the subsystem’s.
Examples include:
- NULL pointers where a valid pointer is required
- invalid identifiers
- out‑of‑range values
- operations on uninitialised objects
These MUST:
- return early
- set errno = EINVAL
- NOT enter the fake‑exception handler
- NOT log an error (debug‑level logging is acceptable)
The onus is on the caller to correct the misuse.
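In code, a caller-side error amounts to a guard at the top of the function. The session type, the window limit, and the function name below are all hypothetical and serve only to show the shape of the check.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MWS_EXAMPLE_MAX_WINDOWS 8u /* hypothetical limit for this sketch */

/* Hypothetical session object. */
struct mws_example_session
{
    uint32_t window_count;
};

/*
 * Hypothetical setter. Invalid parameters return early with errno = EINVAL;
 * no error is logged and no cleanup path is entered.
 * Returns 0 on success, -1 on failure.
 */
static inline int mws_example_session_set_windows(struct mws_example_session *session, uint32_t count)
{
    if ((session == NULL) || (count > MWS_EXAMPLE_MAX_WINDOWS))
    {
        errno = EINVAL;
        return -1;
    }

    session->window_count = count;
    return 0;
}
```

The object is left unmodified on failure, so the caller can correct the argument and retry.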
8.4.3 Subsystem/runtime errors
These represent actual failures within the subsystem. Examples include:
- protocol decode failure
- transport failure
- compositor failure
- allocation failure
- unexpected message type
- truncated payload
- internal invariants violated
These MUST:
- enter the fake‑exception handler
- perform any necessary cleanup
- log an appropriate error
- return a failure code
- use a single exit point at the end of the function
Subsystem/runtime errors are the only errors that use the fake‑exception pattern.
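The sketch below shows one possible shape for a runtime-error path, assuming that the fake‑exception handler refers to the goto-based cleanup pattern of §5.4; this document does not define the term further. The frame layout (one type byte followed by payload), the function name, and the use of fprintf as a stand-in for the project logger are all assumptions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MWS_EXAMPLE_TYPE_DATA 1u /* hypothetical message type */

/*
 * Hypothetical decoder: copies the payload of a [type, payload...] frame into
 * a new heap buffer returned through out. The caller owns *out and must free() it.
 * Returns 0 on success, -1 on failure.
 */
static inline int mws_example_decode(const uint8_t *frame, size_t frame_len, uint8_t **out, size_t *out_len)
{
    int rc = -1;
    const char *reason = NULL;
    uint8_t *copy = NULL;

    /* Caller-side error (8.4.2): early return, not routed through the handler. */
    if ((frame == NULL) || (out == NULL) || (out_len == NULL))
    {
        errno = EINVAL;
        return -1;
    }

    /* Runtime errors from here on all leave through the handler below. */
    if (frame_len < 2u)
    {
        reason = "truncated payload";
        goto cleanup;
    }

    if (frame[0] != MWS_EXAMPLE_TYPE_DATA)
    {
        reason = "unexpected message type";
        goto cleanup;
    }

    copy = malloc(frame_len - 1u);
    if (copy == NULL)
    {
        reason = "allocation failure";
        goto cleanup;
    }

    memcpy(copy, &frame[1], frame_len - 1u);
    *out = copy;
    *out_len = frame_len - 1u;
    copy = NULL; /* ownership has moved to the caller */
    rc = 0;

cleanup:
    if (rc != 0)
    {
        fprintf(stderr, "mws_example_decode: %s\n", reason); /* stand-in for the project logger */
    }
    free(copy);
    return rc;
}
```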
9. Protocol Handling
Mercurius is a network‑native system. Protocol correctness is non‑negotiable.
9.1 Never trust the wire
- Treat all incoming data as hostile until proven otherwise.
- Validate message types, lengths, IDs, and flags.
- Reject malformed messages explicitly and safely.
9.2 Lengths and buffers
- Never read or write beyond the bounds of a buffer.
- Never trust a length field without checking it against the actual buffer size.
- Use parsing functions that take explicit lengths and validate them.
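As an illustration of checking a length field against the bytes actually received, the sketch below parses a hypothetical frame made of a 16-bit big-endian length followed by the payload. The layout and the function name are assumptions for this example, not the Mercurius wire format.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MWS_EXAMPLE_HEADER_BYTES 2u /* hypothetical: 16-bit big-endian payload length */

/*
 * Hypothetical parser: checks that buf (buf_len bytes actually received)
 * holds a complete frame and reports where its payload is.
 * Returns false if the header is truncated or the declared length overruns buf.
 */
static inline bool mws_example_frame_payload(const uint8_t *buf, size_t buf_len,
                                             const uint8_t **payload, size_t *payload_len)
{
    size_t declared = 0;

    if ((buf == NULL) || (payload == NULL) || (payload_len == NULL))
    {
        errno = EINVAL;
        return false;
    }

    if (buf_len < MWS_EXAMPLE_HEADER_BYTES)
    {
        return false; /* not even a full header */
    }

    declared = ((size_t)buf[0] << 8) | (size_t)buf[1];

    /* buf_len >= header size, so the subtraction cannot wrap. */
    if (declared > (buf_len - MWS_EXAMPLE_HEADER_BYTES))
    {
        return false; /* the length field claims more than was received */
    }

    *payload = &buf[MWS_EXAMPLE_HEADER_BYTES];
    *payload_len = declared;
    return true;
}
```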
9.3 SCTP and transport
- Follow the SCTP usage defined in the Mercurius protocol RFC.
- Do not assume ordering or delivery beyond what SCTP guarantees.
- Handle partial messages, retransmissions, and disconnects gracefully.
9.4 Protocol evolution
Any change to the protocol MUST be reflected in:
- protocol/ headers
- the protocol RFC in docs/
- any affected client and server code
10. Concurrency and I/O
10.1 Threads
- libmws MUST be usable in a single‑threaded context.
- If threads are introduced, their usage MUST be explicit and documented.
- No hidden background threads.
10.2 Event loops
- mwsd and mwsc are primarily event‑driven.
- Avoid blocking operations in the main event loop.
- Use non‑blocking I/O or bounded blocking with timeouts.
10.3 No busy waiting
- No spin loops.
- No while (!flag) {} without a sleep or event mechanism.
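One event mechanism that avoids spinning is POSIX poll() with a timeout, which puts the thread to sleep in the kernel until the descriptor is ready. The wrapper below is a hypothetical sketch; Mercurius may use a different readiness API.

```c
#include <errno.h>
#include <poll.h>

/*
 * Hypothetical helper: wait until fd has an event to handle or timeout_ms elapses.
 * Returns 1 if fd reported an event (data, hang-up, or error), 0 on timeout,
 * -1 on failure with errno set by poll().
 */
static inline int mws_example_wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ready = 0;

    do
    {
        /* Blocks in the kernel; a retry after EINTR restarts the full timeout. */
        ready = poll(&pfd, 1, timeout_ms);
    } while ((ready < 0) && (errno == EINTR));

    if (ready < 0)
    {
        return -1;
    }

    return (ready > 0) ? 1 : 0;
}
```

The int parameters follow the signature of poll() itself, which is the external-API exception allowed for native integer types.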
11. Formatting and Style
11.1 Indentation and whitespace
Indent with 4 spaces, no tabs.
One statement per line.
One declaration per line.
Use blank lines to separate logical blocks within functions.
Two blank lines MUST separate top‑level function definitions.
This improves readability, makes function boundaries visually distinct in the minimap/overview, and allows liberal use of single blank lines inside functions without losing structure.
11.2 Line length
- Aim for a maximum of 130 characters per line.
- Longer lines are allowed only when breaking them would harm clarity.
11.3 Includes
- Include only what you use.
- Order: standard headers, then system headers, then project headers.
- No unused includes.
11.4 File endings
- Every file MUST end with a single newline.
- No trailing whitespace.
12. Documentation
12.1 File headers
All public header and source files MUST begin with the standard Mercurius file header block. This block provides a clear description of the file’s purpose and authorship.
Example:
/***********************************************************************************
* @file mws_proto.h
* @brief Mercurius Wire Stream (MWS) protocol definitions.
*
* Defines the wire-level message structures, enumerations, and constants
* used by both client and compositor.
*
* Authors: Christopher Ross <chris@tebibyte.org>
* Another Contributor <name@example.com>
*
* COPYRIGHT (C) 2026 Christopher Ross. All rights reserved.
* See the LICENSE file for terms of use.
***********************************************************************************/
The Authors: field MUST follow the multi‑author list format:
* Authors: First Author <email>
* Second Author <email>
* ...
12.2 Function documentation
All non‑trivial, non‑static functions MUST have a brief Doxygen comment describing:
- what the function does
- what its parameters mean
- what it returns
- any ownership or lifetime rules
- any side‑effects or invariants it relies on
Example:
/**
* Initialise the Mercurius server context.
*
* @param ctx Pointer to an uninitialised context structure.
* @return 0 on success, non-zero error code on failure.
*
* On success, the caller is responsible for calling mwsd_ctx_destroy().
*/
int mwsd_ctx_init(struct mwsd_ctx *ctx);
12.3 Invariants and assumptions
Wherever code relies on invariants (e.g. “this list is sorted”, “this ID is unique”, “this buffer is always N‑aligned”), those invariants MUST be documented in comments near the code that enforces or depends on them.
Documentation MUST make implicit assumptions explicit.
12.4 Header and implementation documentation
Every public function MUST be documented in both its header (.h) and its
implementation (.c). These two documentation blocks serve different purposes,
but they MUST describe the same behaviour.
12.4.1 Header documentation
The header provides the public API contract:
- what the function does
- what its parameters mean
- what it returns
- any preconditions or invariants callers must satisfy
- any ownership or lifetime rules visible to callers
Header comments MUST be complete enough for a caller to use the function correctly without reading the implementation.
12.4.2 Implementation documentation
The implementation provides the operational description:
- the same first sentence as the header, verbatim
- additional detail about internal behaviour
- ordering constraints
- side‑effects
- interactions with other subsystems
- assumptions that matter to maintainers but are not part of the public API
The .c file MUST stand alone.
It MUST NOT say “see header” or rely on the header for behavioural description.
12.4.3 Consistency requirement
The header and implementation documentation MUST:
- describe the same behaviour
- use the same terminology
- be updated together when behaviour changes
If the header says “twiddle the knob” and the .c file says “push the button”,
the documentation is wrong.
12.4.4 Rationale
Developers read headers when using a function.
Developers read .c files when changing a function.
Both deserve complete, accurate documentation.
This rule prevents drift, eliminates ambiguity, and keeps the codebase maintainable over time.
12.5 Audience and intent of documentation
All comments and Doxygen blocks MUST be written for future maintainers, implementers, and reviewers. Documentation MUST describe:
- the protocol semantics
- the wire format
- the invariants the code relies on
- any assumptions required for correctness
Comments MUST NOT:
- reference historical changes, prior bugs, or internal discussions
- address specific individuals
- describe how the code used to work
- duplicate information already encoded in the code
Comments exist to explain what the code does and why it does it that way. How it does it is expressed by the code itself.
13. Testing and Tools
13.1 Compilers
Code MUST compile cleanly (no warnings) with:
- GCC
- Clang
Warnings are treated as errors in CI and during review.
13.2 Static analysis
The codebase SHOULD be regularly checked with:
- clang-tidy
- scan-build
- other static analysis tools as appropriate
Warnings from these tools MUST be taken seriously and either fixed or explicitly justified.
13.3 AI‑assisted linting
This document is written to be machine‑checkable.
It is expected that AI tools will be used to:
- scan for violations of these rules
- suggest safer alternatives
- highlight potential undefined behaviour
- enforce consistency in naming, structure, and error handling
AI suggestions are advisory, not authoritative. Human judgement prevails.
14. Code Review Expectations
Reviewers will check:
- adherence to this standard
- clarity and maintainability
- absence of undefined behaviour
- correct protocol handling
- safe memory and resource management
- appropriate error handling
- portability across supported platforms
“Works on my machine” is not sufficient.
15. Final Notes
Mercurius is written in C by choice, not by accident.
We write C in a way that is:
- explicit
- disciplined
- defensive
- boring in all the right ways
If you find yourself reaching for something clever, stop and ask:
“Will future‑me thank me for this, or swear at me for it?”
If the answer is the latter, rewrite it.
© Chris Ross — chris@tebibyte.org