title: Making world-class docs takes effort
url: https://daniel.haxx.se/blog/2021/09/04/making-world-class-docs-takes-effort/
hash_url: f327a02cab
Here are six requirements that I have on a project for it to reach my gold approval for stellar docs. Then something about what I’ve done recently to further improve the docs for curl.
It needs to be next to the code so that authors and contributors can update/read the docs while working on the code or docs. Providing it in a separate repository or otherwise separated will undoubtedly lead to discrepancies sooner or later. Similar to how all wikis are always wrong.
Admit it. If you browse around you will realize that the best documented projects you find never provide that docs generated directly from code. Such generated docs can still provide value, but you will not reach gold level without more effort.
Separated docs encourage more writing and writing by people who are otherwise scared of touching the code or thinking that fixing spelling errors in the docs isn’t worth patching the code for. It also allows for other and better formatting on the docs.
Users always ask for (more) examples. You can never go wrong by providing examples. If your docs don’t have enough examples, you’re not doing the docs good enough yet.
Fewer things make me sigh more than when I have to dig up the source code and from that try to figure out exactly how an external and publicly provided API works. Yet this still happens regularly even for libraries that have been released and maintained for decades.
In one fairly recent instance I reported such an omission in a popular library. It took them over two years to add it.
Long-living libraries should also provide information about from which versions certain functionality or options exist or can be expected to work or not work.
Good docs means that we can find what we’re looking for and that the documentation flows and is easily read and understood. Ideally, even simple google searches for API details in your library should lead us to suitable entry points.
Preferably, the documentation should also be provided for proper off-line reading, meaning man pages or something similar that can be browsed when disconnected from the Internet.
The docs should be easy for contributors to help out with (independently from the code if desired). That also includes that they should be easy for contributors to build and render locally so that they can test and view their updates while working on them.
I want the documentation for curl and libcurl to be known, recognized and widely admitted to be world-class.
I want the curl documentation to be of a quality and content to make users not able to find competitors or similar projects with better docs.
Documentation in curl is not an after-thought. It is not a second-tier component. It is a crucial and important foundation that allows users to use, trust and rely on our products. We require that new changes or improved functionality are provided with the corresponding updated and accurate documentation.
We also try to verify and check the docs as much as possible with scanners , tests and tools.
I maintain that our documentation is as good as it is today a lot thanks to us very rigidly sticking to our guiding principles: compatibility and not breaking existing behavior. Documentation we wrote decades ago is still valid. It gives us plenty of time to keep refining and polishing the documentation of a feature that doesn’t change. No documentation was perfect already at the first attempt, but after numerous iterations and improvements chances are it is better. Time is on our side. And we are never done, documentation can always be improved.
A few days ago I talked documentation with someone and when doing so I thought about what guiding principles I think we should put on project documentation. What I’ve listed above basically.
In then dawned on me that the current man curl.1 man page is actually not featuring that many examples, in spite of it being 3535 lines long. I pondered a little bit on how best add some, and then dove in and extended our system that generates it from the hundreds of individual files that each describe a single command line option. They should of course all offer at least one example!
Having 242 command line options (as of right now, it will be more soon), that sudden idea that seemed simple enough turned into a quite gruesome work and I spent many hours walking over the options to make up examples. I also made sure that our build system now returns an error if there’s a command option without an example in the documentation! This way, we can be sure that also all future command line options will have examples in the man page.
This made the curl.1 man page grew with over 1200 lines!
A few years ago I did a manual effort and made sure most man pages for libcurl options include examples, but I never made that into a test or anything so there’s nothing that forces us to stick to this.
Having started this journey, I decided now was the time to add that requirement to the scripts. I extended test case 1173, which already scans all man pages to verify some basic syntax, to also check that man pages for options feature an EXAMPLE
section.
There are at this moment 374 stand-alone man pages for libcurl options. Only ten of them were detected to not feature good enough examples and it wasn’t very hard to fix this.
Having just extended the man page checker, another idea came up.
I made the script also check that each man page features the correct set of sections, in the required order! I’m a true believer in consistency and that using the same set in the same order will make the docs easier to read and find information in, and checking for these things will make sure that all future additions will be forced stick to the same.
The mandatory sections for libcurl option man pages are right now, in this order:
NAME SYNOPSIS DESCRIPTION PROTOCOLS EXAMPLE AVAILABILITY RETURN VALUE SEE ALSO
These man pages are allowed to have other sections as well, and they can be placed anywhere among the mandatory ones, but the eight section headers that has to be there has to be in that order.
While at it, I also extended the man page scanner to check that all references in all curl man pages to libcurl options are verified to actually refer to existing options, to find typos. Ironically this extra check turned out finding exactly no such typos in the current 463 man pages!
The outcome of the work I write about here will of course be merged asap and will be part of future releases and on the website.
We should keep thinking of more ways to improve the documentation and for more ways to verify and cross-reference things mentioned in the docs to increase its accuracy and detect typos.
If you find a problem, an inferior wording or just something you think we should improve in any curl documentation, file a bug or a PR! We also try to make this as easy as possible for users to do directly from the curl website by providing bug-reporting direct links from the documentation pages.
I added the sixth “rule” days after the initial post after Willy Tarreau’s feedback.