
Making C Code Uglier

A group of C programmers arguing about which indentation style is better. Just kidding, it's "Anger" by Pieter van der Heyden, 1558

C++ is practical, yet sometimes scary. C is outright frightening. If someone writes code in C++, they must be smart. If someone writes code in C, they must be crazy (well, at least I am.)

But still, C—with its guts full of eldritch horrors—is the lingua franca of programming and the most portable assembly language.

C is readable enough to most programmers, because most mainstream languages are C progenies. Pointers and macros are loathsome, but they are rare enough (are they?) to ignore.

So how scary can C code get? Not for production use, but rather as an exercise in aesthetics. This post goes through a set of things that can convolute and obfuscate C code, from minute details to critical readability losses.

Note that some obvious things like

are not mentioned to leave space for the scarier ones.

Test Program

I'm going to use a slightly modified C version of the Trabb Pardo-Knuth (TPK) algorithm from Wikipedia because it's small enough while still showcasing most C constructs/features:

#include <math.h>
#include <stdio.h>
double f (double t)
{
    return sqrt(fabs(t)) + 5 * pow(t, 3);
}
void tpk (void)
{
    double y, a[11] = {0};
    for (int i = 0; i < 11; i++)
        scanf("%lf", &a[i]);
    for (int i = 10; i >= 0; i--)
        if ((y = f(a[i])) > 400)
            printf("%d TOO LARGE\n", i);
        else
            printf("%d %.16g\n", i, y);
}
int main (void)
{
    tpk();
    return 0;
}
Trabb Pardo Knuth algorithm implementation in C

Benign: Indentation and Bracket Placement Style

Two or four spaces? Eight? Or three, maybe? Or, God almighty, tabs? C code styles are numerous, and they have only one thing in common: they are all mutually incompatible and un-aesthetic. No matter which style one prefers, they're delusional and wrong, at least according to those exhorting another style.

I use Linux kernel style, which might make you scream at the 8-space-wide tabs. But I'm not surrendering it.

As an example, I'll use the Pico indentation style (four/five spaces) and bracket placement (before the first expression and after the last one). Plus added spaces mimicking the GLib style:

double
f (double t)
{ return
    sqrt (fabs (t)) + 5 * pow (t, 3); }
Sub-function of TPK algorithm re-indented in Pico style

Ugh, block scope and control flow are illegible now.

Confusing: Subscripts

A queer behavior of the standard array subscripts: the index and array parts can be swapped:

double a[11] = {0}, y;
for (int i = 0; i < 11; i++)
    scanf ("%lf", &i[a]);
Inner loop code with confusingly reversed array subscripts

This reversal is modest, but nonetheless galling.

An exercise to the reader: can you find the exact spot where the subscript is reversed?
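The swap works because the standard defines a[i] as *(a + i), and pointer-plus-integer addition commutes. Here is a minimal sketch of that equivalence; same_element and second_letter are hypothetical helper names, not part of the TPK program:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* a[i] means *(a + i), and i[a] means *(i + a): the same element.
   same_element (a made-up helper) checks that all three spellings
   resolve to one address. */
bool
same_element (const double *a, size_t i)
{
    assert (a != NULL);
    return &a[i] == &i[a] && &i[a] == a + i;
}

/* The same trick works on string literals: 1["TPK"] is 'P'. */
char
second_letter (void)
{
    return 1["TPK"];
}
```

Calling same_element on any in-bounds index of a double array should return true, whichever spelling you find less galling.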

Antiquated: K&R style

That's where the post gets shuddery. K&R style, or, as they call it, "I don't understand old C code".

double
f(t)
double t;
{ return
    sqrt (fabs (t)) + 5 * pow (t, 3); }
A function rewritten with types after the parameter list

This style

Luckily, C23 finally removes it, after more than thirty years of yielding to the horror and maintaining it in deprecated status.

Smart: Recursion

Reordering and refactoring functions is always fun. So how about turning all the for-loops into recursion? Recursion is cool, I've heard. So here's a recursive rendering of the number printing loop:

void
print_nums(a, i)
double *a;
int i;
{ if (i < 0)
         return;
     double y = f (i[a]);
     if (y > 400)
         printf ("%d TOO LARGE\n", i);
     else
         printf ("%d %.16g\n", i, y);
     print_nums (a, --i); }
Number printing loop refactored as a recursive function folded over an array

Five more code lines, lots of stack frames (unless you have tail call elimination), and overall less comprehensible control flow. Yay!
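To make the loop-to-recursion trade concrete outside of TPK, here is a small sketch with made-up names (sum_loop, sum_rec): the loop counter becomes a parameter, and the running state becomes an accumulator argument. The recursive call sits in tail position, so a compiler may reuse the stack frame, but C does not promise that.

```c
#include <assert.h>
#include <stddef.h>

/* Loop version: add a[n-1] down to a[0]. */
double
sum_loop (const double *a, int n)
{
    double total = 0;
    for (int i = n - 1; i >= 0; i--)
        total += a[i];
    return total;
}

/* Recursive version (hypothetical helper): i is the loop counter,
   total is the state the loop kept in a local variable. Start it
   with i = n - 1 and total = 0. */
double
sum_rec (const double *a, int i, double total)
{
    assert (a != NULL);
    if (i < 0)
        return total;
    /* Tail call: nothing is left to do after it returns. */
    return sum_rec (a, i - 1, total + a[i]);
}
```

Both functions walk the array in the same order, so sum_rec (a, n - 1, 0) should agree with sum_loop (a, n).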

Recursion is good, actually

Like some of the un-aesthetic and alienating changes this page lists, recursion might be useful. It can make your algorithms simple and powerful when done right. I often use recursion when writing Lisp. But I can relate to people seeing it as vile and perplexing.

Terse: Ternaries

This is my favorite: switching from if-else to ternaries. It's shorter, expression-only, and it makes code look more daunting. And there's a rumor that compilers increase the optimization level when they see ternaries. Likely out of regard for the programmer's bravery.

void
print_nums (a, i)
double *a;
int i;
{ double y;
     (i < 0) ? 0 :
               (y = f (i[a]),
                (y > 400 ? printf ("%d TOO LARGE\n", i) :
                           printf ("%d %.16g\n", i, y)),
                print_nums (a, --i)); }
Code with if statements replaced with ternaries

If only the comma operator allowed variable declarations (wink wink, C standard committee), this function might've had no double y in it either. But, for now, let this stateful statement stay there.

Ternaries are good, actually

I like ternary-formatted code because it forces side-effect-free algorithms where I want them. It's even more useful in other C-like languages because they have less restrictive blocks and more abstractions compatible with functional style.
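A tiny sketch of that point, using the 400 threshold from the TPK code. The enum and function names are made up for illustration. The if-else version is a statement with two exits; the ternary version is a single expression with nowhere to hide a side effect.

```c
#include <assert.h>

/* Hypothetical result type for the sketch. */
enum verdict { FITS, TOO_LARGE };

/* Statement style: two return paths. */
enum verdict
verdict_if (double y)
{
    if (y > 400)
        return TOO_LARGE;
    else
        return FITS;
}

/* Expression style: one return, one value. */
enum verdict
verdict_ternary (double y)
{
    return y > 400 ? TOO_LARGE : FITS;
}

/* Self-check: both spellings agree around the threshold. */
void
check_verdicts (void)
{
    assert (verdict_if (400) == verdict_ternary (400));
    assert (verdict_if (401) == verdict_ternary (401));
}
```

Whether the second form reads better is, of course, exactly the kind of taste this post is about.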

Unconventional: Delimiter-First Code

There are reasons one can use leading-delimiter style in SQL and Haskell. But in other languages...

void
print_nums (a, i)
double *a;
int i;
{ double y;
     (i < 0)
     ? 0
     : (y = f (i[a])
        , (y > 400
           ? printf ("%d TOO LARGE\n"
                     , i)
           : printf ("%d %.16g\n"
                     , i, y))
        , print_nums (a, --i)); }
Using leading commas and ternaries in function calls

I like how the ternaries become more pronounced and how it promotes a functional-ish style. But I bet your eyes are already hemorrhaging, so feel free to ignore my aesthetic preferences.

Awful: Alternative representations

That's the most horrifying one: C has alternative spellings for some characters that were not available in every character set when the first standard was written. There are two-character (digraphs) and three-character (trigraphs, removed in C23) encodings for [, ^, {, etc. Here's a table of transformations:

C char   Digraph   Trigraph
{        <%        ??<
}        %>        ??>
[        <:        ??(
]        :>        ??)
#        %:        ??=
\        (none)    ??/
^        (none)    ??'
|        (none)    ??!
~        (none)    ??-
All the digraphs and trigraphs with their decoding

And here's the code with encoded parts:

void
read_nums (a, i)
double *a;
int i;
<% if (i == 11)
    <% return; %>
    else
    <% scanf ("%lf", &i<:a:>);
        read_nums (a, ++i);%> %>
C code using digraphs

And that's just digraphs; trigraphs are even worse!
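For a side-by-side comparison in prototype style, here is a minimal sketch with hypothetical names (COUNT, sum_plain, sum_digraph). Digraphs are ordinary tokens, so this should compile without extra flags, unlike trigraphs, which GCC and Clang only honor with -trigraphs or a strict ISO mode.

```c
#include <assert.h>

%:define COUNT 3   /* %: is the digraph spelling of # */

/* Plain spelling. */
int
sum_plain (const int *a)
{
    int total = 0;
    for (int i = 0; i < COUNT; i++)
        total += a[i];
    return total;
}

/* Digraph spelling: <% %> are braces, <: :> are brackets. */
int
sum_digraph (const int *a)
<% int total = 0;
    assert (a != 0);
    for (int i = 0; i < COUNT; i++)
        total += a<:i:>;
    return total; %>
```

The two functions are token-for-token equivalent after translation, apart from the extra null check in the second one.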

Alternative representations are good, actually

There is a more useful side to alternative encodings. <iso646.h> provides spelled-out macros for the logical and bitwise operators, and they are far more readable than the symbolic operators:

C operator   iso646.h macro
&&           and
&=           and_eq
&            bitand
|            bitor
~            compl
!            not
!=           not_eq
||           or
|=           or_eq
^            xor
^=           xor_eq
Relatively unreadable C operators vs. respective iso646.h macros

Even though it's atypical, I'm tempted to use these in my projects.
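Here's a rough sketch of how that might look in practice. The function names (in_range, outside, toggle) are invented for the example; only the macros come from <iso646.h>.

```c
#include <assert.h>
#include <iso646.h>
#include <stdbool.h>

/* "and" expands to &&. */
bool
in_range (int x, int lo, int hi)
{
    assert (lo <= hi);
    return x >= lo and x <= hi;
}

/* "not" expands to !. */
bool
outside (int x, int lo, int hi)
{
    return not in_range (x, lo, hi);
}

/* "xor" expands to ^: flips the bits selected by mask. */
unsigned
toggle (unsigned flags, unsigned mask)
{
    return flags xor mask;
}
```

Since these are plain macros, they mix freely with the symbolic operators, so one can adopt them gradually.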

Wrapping Up

Here's the final code for the TPK algorithm. It compiles under Clang 13.0.1 on my x86_64-unknown-linux-gnu 😵 (the exact command is clang tpk.c -trigraphs -lm).

%:include <math.h>
??=include <stdio.h>
double
f(t)
double t;
??< return
    sqrt (fabs (t))
    + 5
    * pow (t
           , 3); %>
void
read_nums(a, i)
double *a;
int i;
<% if (i == 11)
    <% return; %>
    else
    ??< scanf ("%lf"
               , &i<:a??));
        read_nums (a
                   , ++i);%> %>
void
print_nums(a, i)
double *a;
int i;
<% double y;
     (i < 0)
     ? 0
     : (y = f (i??(a:>)
        , (y > 400
           ? printf ("%d TOO LARGE\n"
                     , i)
           : printf ("%d %.16g\n"
                     , i, y))
        , print_nums (a
                      , --i)); ??>
void tpk ()
??< double a<:11:> = ??<0??>
           , y;
    read_nums (a
               , 0);
    print_nums (a
                , 10);
    /* Absolutely unnecessary, but irritating. */
    return; %>
int main ()
<% tpk();
    return 0; ??>
Final code with all the ugly gotchas above applied

If you want some job security as a C or C++ programmer, you might use some of the things discussed above. But in any other scenario: you don't want to write code this way! Be kind to each other, even when y'all write chthonic C code.

Update: some commenters on Reddit mentioned IOCCC as an additional inspiration and further research direction. This post is by no means exhaustive, and you will likely find much more gory details if you explore IOCCC.

Another update: u/insanelygreat shared an absolutely horrendous piece of code and a set of macros that turn C into something BASIC-like. Here's a small piece of code from their comment, which you should read in full:

/* check for meta chars */
BEGIN
   REG BOOL slash; slash=0;
   WHILE !fngchar(*cs)
   DO IF *cs++==0
        THEN IF rflg ANDF slash THEN break; ELSE return(0) FI
        ELIF *cs=='/'
        THEN slash++;
        FI
   OD
END
Heavily macro-infused piece of C code that looks like shouting in Ada (citing u/insanelygreat)