Scientific conventional commits
If you’re writing code, you should be using conventional commits!
Conventional commits probably look familiar to you already. Stuff like:
feat: add button that makes fart sound
fix: make fart sound appropriate volume
docs: add documentation for how to use fart button
The idea is to use a standardized format for commit messages that includes a type, an optional scope, and a description.
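To make that format concrete, here is a minimal sketch of a header parser in Python. The regex and the `parse_header` name are my own illustration, not part of any official tooling, and the sketch only handles the first line of a commit message (no body, no footers).

```python
import re

# Assumed header shape: type, optional "(scope)", then ": " and a description.
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)"            # commit type, e.g. feat, fix, docs
    r"(?:\((?P<scope>[^()\s]+)\))?"  # optional scope in parentheses
    r": (?P<description>\S.*)$"      # colon, space, non-empty description
)


def parse_header(header):
    """Split a commit header into its parts, or return None if it doesn't fit."""
    match = HEADER_RE.match(header)
    if match is None:
        return None
    return {
        "type": match.group("type"),
        "scope": match.group("scope"),  # None when no scope is given
        "description": match.group("description"),
    }
```

Something like `parse_header("feat: add button that makes fart sound")` should give back the type `feat`, no scope, and the description.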
I’m a scientist! Most of the code that I write is… well, research code (derogatory).
But I still use conventional commits because they keep the version control history readable.
I have therefore extended the spec slightly to accommodate the specific needs of the Scientific Method. In particular, I have added the following commit types:
Hypothesis (hyp)
For version-controlling and “locking in” a hypothesis, pre-registering it in a way that is easily trackable and reproducible. For example:
hyp: the earth is actually round
…would be the commit message that I use to explicitly annotate that I have declared my hypotheses in this commit. This also makes it easy for others to review :)
In the vanilla Conventional Commits spec (CCS), you can use “scopes” to zoom in on specific areas of the codebase. For example, feat → feat(fart-api).
I’ll often use this space to either specify particular analyses (hyp(curvature)), or annotate the “null” vs “alternative” nature of the hypothesis (hyp(null) vs hyp(alt)).
In standard CCS, a hypothesis commit would often go in a feat.
Experiment (exp)
For version-controlling an experiment, which is a specific test of a hypothesis. For example:
exp: measure dist to horizon in datasets/horizon-views.csv
exp(newtons-g): compute gravitational constant using apples
These would best fit into feat in standard CCS.
Result (res)
When all you have done is run an analysis or experiment, and you want to commit the results of that analysis, you can use the res type. For example:
res: generate curvature.csv
These would most likely live under chore in standard CCS, since a res commit should not change any code, just add results.
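The "should not change any code" rule is easy to check mechanically. Below is a rough sketch, assuming you already have the list of paths touched by the commit (for example from `git diff --cached --name-only`). The `CODE_SUFFIXES` set and both function names are hypothetical; what counts as "code" will depend on your project.

```python
from pathlib import PurePosixPath

# Assumption: these suffixes count as "code" in this project. Adjust to taste.
CODE_SUFFIXES = {".py", ".r", ".jl", ".ipynb"}


def code_files_in(changed_paths):
    """Return the changed paths that look like code, judged by file suffix."""
    return [
        path for path in changed_paths
        if PurePosixPath(path).suffix.lower() in CODE_SUFFIXES
    ]


def is_valid_res_commit(header, changed_paths):
    """A res commit passes only if it touches no code files.

    Commits of any other type are not checked and always pass.
    """
    if not header.startswith(("res:", "res(")):
        return True
    return not code_files_in(changed_paths)
```

You could wire a check like this into a pre-commit hook, but the sketch itself only does the classification.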
Results often go hand in hand with:
Parameters (param)
If all you have changed between two commits is the parameters of an experiment, you can use the param type to track that. For example:
param: set curvature.csv to use 10km bins
This is especially useful for tracking the parameters of a particular experiment, and how they affect the results. Scopes work here too:
param: decrease learning rate due to stagnating loss
param(newtons-g): increase n_apples to 10K
Note that I generally like to notate “increase” / “decrease” instead of just “set” to make it clearer how the parameters are changing over time.
These too likely live under chore in standard CCS, since they should not change any code / logic, just parameters.
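Putting the four new types side by side with their closest standard equivalents, here is a small sketch that rewrites a scientific header into vanilla CCS. The mapping follows the "would fit into" notes above; the `SCI_TO_STANDARD` table and `to_standard` function are just illustrative names, and the mapping is a rough equivalence, not a rule.

```python
# Closest standard type for each scientific type, per the notes above.
SCI_TO_STANDARD = {
    "hyp": "feat",
    "exp": "feat",
    "res": "chore",
    "param": "chore",
}


def to_standard(header):
    """Rewrite a header's type to its standard equivalent, keeping any scope."""
    prefix, sep, description = header.partition(": ")
    if not sep:
        raise ValueError(f"not a conventional commit header: {header!r}")
    # Split "type(scope)" into "type", "(", "scope)"; scope parts are empty if absent.
    commit_type, paren, scope_rest = prefix.partition("(")
    standard = SCI_TO_STANDARD.get(commit_type, commit_type)
    return f"{standard}{paren}{scope_rest}: {description}"
```

This might be handy if a tool downstream (a changelog generator, say) only understands the standard types. Types that are not in the table pass through unchanged.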
Hopefully this is useful to other scientists who are trying to be more rigorous and reproducible in their code!