Last December, Brenna did a bunch of Advent of Code, and on a number of problems I played mentor while she worked through them. My central tenet was that the hard part of programming is the part you do before you put your fingers on the keyboard. It’s really tempting to read through the problem and jump right into the code part and start laying out functions and stuff to solve it. It doesn’t work so well though.
The essence of good software design is effective abstraction. Abstracting things is not terribly difficult, but effective abstractions are rather trickier. As a programmer, you have to take human/business “stuff” and translate it into something so simple a computer processor can understand it. The bottom of that process is all taken care of for you by various compilers/assemblers/linkers/interpreters/etc., but that’s the uninteresting (and thus easily automated) part.
The technique that we repeatedly used was simple, but quite effective. It’s also totally obvious, so much so that it seems almost pointless:
Your first version should be English.
Start by writing prose in comments, and make it really high-level prose. I’m talking “the whole program is described in 10 lines” high-level. Once that’s done, it’s easy to pick out the nouns and verbs, which indicate where you probably need to make an abstraction. A noun represents a data structure of some sort. Declare the type somewhere, and reference it in your comment. A verb represents a function of some sort, and the object(s) will be the function’s parameter(s). Declare that function somewhere, and reference it in your comment.
Now you’ve got some types and functions, which are exactly the same as your program as a whole, so you approach the same way: write prose in comments for what they do. Find the nouns and verbs and repeat.
At some point you’ll get to functions which are either provided by your language runtime or libraries, or can be expressed directly in your language’s syntax. Once that happens, your comment can just be written as the equivalent line of actual code, at which point you’re done. And I don’t mean “done with the first draft”, I mean actually done as in “all the code is written.”
This is recursion, of course! Undoubtedly the scariest concept in programming, and one of the most useful.
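To make the process concrete, here is a minimal sketch in Go of where it might end up for a small made-up puzzle: total the wrapping paper needed for an order of boxes. The puzzle, the Box type, and every function name below are assumptions invented for illustration; none of them come from Brenna’s solutions. Each comment is the English that was written first, and the code under it is what that English turned into once it bottomed out in plain arithmetic.

```go
package main

import "fmt"

// The whole program, in English:
//   For each box in the order, work out how much paper the box needs,
//   and add it to a running total. Report the total.

// Box is the noun "box": its three dimensions (hypothetical type).
type Box struct {
	Length, Width, Height int
}

// PaperNeeded is the verb "work out how much paper the box needs".
// In English: the surface area of the box, plus the area of its
// smallest side as slack.
func (b Box) PaperNeeded() int {
	return b.SurfaceArea() + b.SmallestSideArea()
}

// SurfaceArea is "the surface area of the box": each of the three
// distinct sides appears twice.
func (b Box) SurfaceArea() int {
	return 2 * (b.Length*b.Width + b.Width*b.Height + b.Height*b.Length)
}

// SmallestSideArea is "the area of its smallest side".
func (b Box) SmallestSideArea() int {
	smallest := b.Length * b.Width
	for _, area := range []int{b.Width * b.Height, b.Height * b.Length} {
		if area < smallest {
			smallest = area
		}
	}
	return smallest
}

// TotalPaper is the verb "add up the paper for every box in the order".
func TotalPaper(order []Box) int {
	total := 0
	for _, box := range order {
		total += box.PaperNeeded()
	}
	return total
}

func main() {
	// Report the total paper for a sample order.
	order := []Box{{2, 3, 4}, {1, 1, 10}}
	fmt.Println(TotalPaper(order))
}
```

Notice that PaperNeeded was itself described in English first, which surfaced two more names (SurfaceArea and SmallestSideArea) before anything became arithmetic. That is the recursive step in action.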
Part of the reason this works so well is that it forces you to name everything. Naming stuff is a) really hard, and b) really, really useful. By having good names for stuff, it becomes possible to have natural-language conversations about the “stuff” in your program, at any level of abstraction. Even the best of programmers don’t think like computers, they think like humans, and humans use natural language to converse, even when they’re talking to themselves. By having good names in your programs, you make it much easier to talk about – and therefore think about – what is actually happening when they run, especially if there is more than one human involved.
As one specific example, consider this type I created when working on Day 6 of the Advent of Code:
type LightGrid [][]bool
This might not seem very useful: what is wrong with just passing [][]bool around everywhere? But consider the question “how many lights in the grid are turned on?” for a moment, and then look at these two effectively-equivalent function signatures and ask yourself which one is more obviously the way to answer that question:
func CountTurnedOn(LightGrid) int
func CountTurnedOn([][]bool) int
I hope you would agree with me that the former is much more obvious about what it’s doing, and this is an unusual case where “turned on” sort of makes sense for boolean values. If you’re familiar with Go, you’ll not be surprised that in my actual solution the first one was a method on the LightGrid type, not a standalone function.
By having the LightGrid type, we can talk about it with other humans (or ourselves) without having to understand that it’s a two-dimensional slice of boolean values. That’s irrelevant if you’re talking about the grid as a whole, and the names in your program should reflect that. I.e., the name provides encapsulation.
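For the curious, here is a minimal sketch of what the method form might look like. Only the LightGrid type and the CountTurnedOn name come from the discussion above; the method body and the sample grid are my assumptions here, not a copy of the actual solution.

```go
package main

import "fmt"

// LightGrid is a grid of lights; true means the light is turned on.
type LightGrid [][]bool

// CountTurnedOn answers "how many lights in the grid are turned on?"
// The body is an assumed implementation: walk every row and count
// the lights whose value is true.
func (g LightGrid) CountTurnedOn() int {
	count := 0
	for _, row := range g {
		for _, turnedOn := range row {
			if turnedOn {
				count++
			}
		}
	}
	return count
}

func main() {
	// A small sample grid with three lights turned on.
	grid := LightGrid{
		{true, false, true},
		{false, false, true},
	}
	fmt.Println(grid.CountTurnedOn())
}
```

The call site reads as grid.CountTurnedOn(), which is about as close to the English question as Go’s grammar allows, and nothing at that call site needs to know there is a [][]bool underneath.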
In Brenna’s specific case, a rather significant part of the problem was that she’s primarily developed using CFML in her career. CFML doesn’t provide a good way to create lots of data types to help with this problem. As a result, you have lots of signatures that use array, struct, or query in their types, and who knows what their semantics are. Which means you often have to go digging around in the implementation to deduce the semantics, which can be quite time consuming and completely breaks up the thought flow. Using CFCs and having good function names can help, but the function names in particular often end up being unwieldy, because they have to express argument type info as well as the verb in question, not to mention the extreme verbosity and runtime penalty of CFCs.
If you’ve done your job right, after you’re all done with your program, your final shipping version should still be pretty darn close to English. The grammatical rules mandated by your compiler/interpreter are probably rather different than English, and the punctuation rules certainly will be, but with a little effort it’s surprising how close you can get. Everyone who has to read the code later (especially Future You) will thank you.