Whether starting a new project or trying to decipher OpenSource code that has near zero comments, Doxygen is a tool you need to know. It can shine a light in that deep dark tunnel known as “what you need to do.” Too few of us know how to use Doxygen. Too few of the “getting started” examples you find online are actually good enough to get you started.
Install doxygen and graphviz. I’m going to assume you are using Ubuntu because so many people do for development. Users of non-Debian-based distros will have to figure these package names out on their own.
sudo apt-get install graphviz graphviz-doc
sudo apt-get install doxygen doxygen-doc doxygen-gui
Always install the -doc package when you install a tool because you will eventually need it. I don’t use the GUI here, but you might want it. You need graphviz to make the pretty graphs I will rave about later in this post.
cd
mkdir -p linux_doxygen/config
cd linux_doxygen/config
doxygen --help
The last command will show you a few of the command line options and prove you have actually installed doxygen. Many of the examples you find online will use the -s command line option. This is a really bad idea for beginners, and in general it is a bad idea. The -s option strips the comments from the generated configuration file, and there are a plethora of configuration options. Some you can guess the meaning of by the name and some you might guess the proper values of, but not all.
doxygen -g j_config
The -g option says to generate a configuration file. There is a default config file name, Doxyfile, but you really shouldn’t use it because you are just begging for catastrophe. I called mine j_config.
Now I needed something you could all try so I went here and pulled down Juffed. It’s a nice little cross platform editor that suffers from OpenSource syndrome.
Damned few comments in the code!
Doxygen config tweaks
Here are the sections of j_config you need to tweak or verify:
# The PROJECT_NAME tag is a single word (or a sequence of words surrounded by
# double-quotes, unless you are using Doxywizard) that should identify the
# project for which the documentation is generated. This name is used in the
# title of most generated pages and in a few other places.
# The default value is: My Project.
PROJECT_NAME = "Juff1"
Yes, there is a default project name. Don’t use it. Default names, especially when it comes to documentation, will always burn you. They continually get overwritten. I named this “Juff1.”
# The TAB_SIZE tag can be used to set the number of spaces in a tab. Doxygen
# uses this value to replace tabs by spaces in code fragments.
# Minimum value: 1, maximum value: 16, default value: 4.
TAB_SIZE = 4
The default tab size is 4. This is a very sane value and one you should be using in your project. Anything less than 4 is not a sane value. If your project uses something else, make this value match your level of insanity.
# The RECURSIVE tag can be used to specify whether or not subdirectories should
# be searched for input files as well.
# The default value is: NO.
RECURSIVE = YES
I understand the default of NO here. Many people try to squeak by passing in only a few files. Other people want to run this over only a single portion of a massive project. Here’s a future clue. That never works. With today’s project layouts, the odds of all the header and source files being in the one directory you chose are slim. Odds of you getting all of the files you need listed for input are closer to none than slim . . . at least on the first try. Disk space is cheap, generate it all. If you think it takes too long kick it off before you leave at night. Just read up on Linux batch processing.
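Here is a minimal sketch of the overnight run. I am using nohup rather than the at/batch commands; that is my substitution, but the idea is the same: the job keeps going after you log out. The helper name start_overnight, the DOXYGEN variable, and the log file name are all made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: start a long doxygen run that survives logout.
# DOXYGEN lets you point at a different binary; it defaults to "doxygen".
start_overnight() {
    cfg=$1
    log=$2
    # nohup ignores the hangup signal; all output lands in the log file.
    nohup "${DOXYGEN:-doxygen}" "$cfg" > "$log" 2>&1 &
    echo "doxygen started in background, pid $!"
}

# Example: start_overnight j_config doxygen_run.log
```

Check the log in the morning before you trust the output.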
# The PAPER_TYPE tag can be used to set the paper type that is used by the
# printer.
# Possible values are: a4 (210 x 297 mm), letter (8.5 x 11 inches), legal (8.5 x
# 14 inches) and executive (7.25 x 10.5 inches).
# The default value is: a4.
# This tag requires that the tag GENERATE_LATEX is set to YES.
PAPER_TYPE = letter
This is one of those options where you might guess the proper value. Letter and a4 are pretty common values. Other values get more iffy. Per the last comment line, it only matters if you generate LaTeX output.
# The DOT_NUM_THREADS specifies the number of dot invocations doxygen is allowed
# to run in parallel. When set to 0 doxygen will base this on the number of
# processors available in the system. You can set it explicitly to a value
# larger than 0 to get control over the balance between CPU load and processing
# speed.
# Minimum value: 0, maximum value: 32, default value: 0.
# This tag requires that the tag HAVE_DOT is set to YES.
DOT_NUM_THREADS = 2
Unless you are running on an underpowered single-core processor, you will want to use at least 2. I would personally not step into the double digits or enter a value higher than the number of processor cores on your computer. That’s a personal choice. I like to leave something for the other processes running on the system, like checking for new mail.
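If you would rather compute that than guess, the rule of thumb can be sketched as a tiny function. It encodes my personal choice and nothing official: leave one core free, use at least 2 when the machine has them, and stay out of the double digits. pick_dot_threads is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: suggest a DOT_NUM_THREADS value for a given core count.
# Rule: cores - 1, at least 2, at most 9, and never more than the cores you have.
pick_dot_threads() {
    cores=$1
    threads=$((cores - 1))
    if [ "$threads" -lt 2 ]; then threads=2; fi
    if [ "$threads" -gt 9 ]; then threads=9; fi
    if [ "$threads" -gt "$cores" ]; then threads=$cores; fi
    echo "$threads"
}

# Example: pick_dot_threads "$(nproc)"
```

On Ubuntu, nproc (part of coreutils) reports the number of available cores.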
# If the CLASS_GRAPH tag is set to YES then doxygen will generate a graph for
# each documented class showing the direct and indirect inheritance relations.
# Setting this tag to YES will force the CLASS_DIAGRAMS tag to NO.
# The default value is: YES.
# This tag requires that the tag HAVE_DOT is set to YES.
CLASS_GRAPH = YES
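This is how you get some of the pretty-pretty graphs. Note the last comment line: the graph options only work when HAVE_DOT is set to YES, so verify that tag in j_config as well.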
# If the CALL_GRAPH tag is set to YES then doxygen will generate a call
# dependency graph for every global function or class method.
#
# Note that enabling this option will significantly increase the time of a run.
# So in most cases it will be better to enable call graphs for selected
# functions only using the \callgraph command. Disabling a call graph can be
# accomplished by means of the command \hidecallgraph.
# The default value is: NO.
# This tag requires that the tag HAVE_DOT is set to YES.
CALL_GRAPH = YES
This is how you get other pretty-pretty graphs. I always want the pretty-pretty graphs.
# The INPUT tag is used to specify the files and/or directories that contain
# documented source files. You may enter file names like myfile.cpp or
# directories like /usr/src/myproject. Separate the files or directories with
# spaces. See also FILE_PATTERNS and EXTENSION_MAPPING
# Note: If this tag is empty the current directory is searched.
INPUT = /home/roland/sf_projects/juffed-master
Here is where you tell Doxygen where to find your files or which files you want processed. Normally I never put a relative path in here, but feel free to kick the tires on that. There has to be some way to do it; otherwise storing your Doxygen config file in your project source control wouldn’t make much sense.
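One way to kick those tires, sketched under an assumption worth verifying on your version: Doxygen appears to resolve relative paths against the directory it was started from, which is the rule the config comments state for OUTPUT_DIRECTORY. If that holds, a config stored in source control can use relative paths as long as you always start doxygen from the config’s own directory. run_from_config_dir is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: run doxygen from the directory holding the config,
# so relative INPUT / OUTPUT_DIRECTORY values always resolve the same way.
run_from_config_dir() {
    cfg=$1
    dir=$(dirname "$cfg")
    file=$(basename "$cfg")
    # The subshell keeps the caller's working directory unchanged.
    ( cd "$dir" && doxygen "$file" )
}

# Example: run_from_config_dir ~/linux_doxygen/config/j_config
```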
# The OUTPUT_DIRECTORY tag is used to specify the (relative or absolute) path
# into which the generated documentation will be written. If a relative path is
# entered, it will be relative to the location where doxygen was started. If
# left blank the current directory will be used.
OUTPUT_DIRECTORY = /home/roland/linux_doxygen/rpts
This is where you tell Doxygen to dump its output. The directory does not have to exist, but you do have to have write permission to create it. Doxygen will handle creation for you.
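The tweaks so far can also be applied from a script instead of by hand, which is handy when you regenerate the config. Treat this as a sketch: set_tag and apply_tweaks are hypothetical helpers (not part of Doxygen), it assumes GNU sed as shipped on Ubuntu, and it only handles single-line entries. The values and paths are the ones used in this post. Setting HAVE_DOT is my addition, since the graph tags say they require it.

```shell
#!/bin/sh
# Sketch only: set_tag and apply_tweaks are hypothetical helpers, not part of
# Doxygen. Assumes GNU sed (for -i) and single-line "TAG = value" entries.

# set_tag TAG VALUE FILE: replace the value of one tag in place.
set_tag() {
    tag=$1
    value=$2
    file=$3
    # Generated configs pad the tag name with spaces before the "=".
    sed -i "s|^${tag}[[:space:]]*=.*|${tag} = ${value}|" "$file"
}

# apply_tweaks FILE: apply the values used in this post.
apply_tweaks() {
    cfg=$1
    set_tag PROJECT_NAME     '"Juff1"' "$cfg"
    set_tag RECURSIVE        YES "$cfg"
    set_tag PAPER_TYPE       letter "$cfg"
    set_tag HAVE_DOT         YES "$cfg"   # assumption: required by the graph tags
    set_tag DOT_NUM_THREADS  2 "$cfg"
    set_tag CLASS_GRAPH      YES "$cfg"
    set_tag CALL_GRAPH       YES "$cfg"
    set_tag INPUT            /home/roland/sf_projects/juffed-master "$cfg"
    set_tag OUTPUT_DIRECTORY /home/roland/linux_doxygen/rpts "$cfg"
}
```

Usage would be doxygen -g j_config followed by apply_tweaks j_config. Because the comments stay in the file, you can still read what each tag does afterwards.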
# The GENERATE_TODOLIST tag can be used to enable (YES) or disable (NO) the todo
# list. This list is created by putting \todo commands in the documentation.
# The default value is: YES.
GENERATE_TODOLIST = YES
I didn’t tweak this, I just wanted to point it out. There is an @todo comment tag you can put in your source code that Doxygen recognizes. This is handy for both managers and developers alike. Whenever you think you are done, generate the documentation and see if there are any entries left on the todo list. Of course this only works if you actually enter the comments into your source!
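For anyone who has never seen one, here is a sketch of what such comments look like, plus a quick grep to count outstanding entries without waiting for a full run. The file sample.h and the functions in it are made up for illustration, and count_todos is a hypothetical helper. Both the @todo and \todo spellings are shown.

```shell
#!/bin/sh
# Write a made-up header showing Doxygen comment blocks with todo entries.
demo_dir=$(mktemp -d)
cat > "$demo_dir/sample.h" <<'EOF'
/**
 * @brief Saves the current buffer to disk.
 * @todo Handle read-only files.
 */
bool save_buffer(const char *path);

/// \todo Remove once the new loader lands.
void legacy_load();
EOF

# Hypothetical helper: count lines carrying @todo or \todo under a directory.
count_todos() {
    grep -rE '[@\\]todo' "$1" | wc -l
}

count_todos "$demo_dir"
```

The grep is only a rough check. The generated todo list is the real thing because it links each entry back to the documented item.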
The output
doxygen j_config
I ran that from within the config directory I created earlier. If you run it from a directory other than where the config file is, don’t forget to include the path. This is the other reason you should never take the default file name when generating a config. You might not be where you think you are when you run doxygen. It would really suck to sit through a four hour generation only to find out you generated doc for the wrong project. Worse yet, not find out and wonder why nothing is better in your “current” doc.
Expect to see a lot of runtime warnings about stuff not being commented. This should be the first step of any peer review process. Run Doxygen and kick the code back if anything is missing its comments. Don’t wait until the end to write your doc.
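To make that review step mechanical, capture the warnings and kick the change back if any remain. This is a sketch under assumptions: Doxygen writes warnings to standard error in its default "file:line: warning: text" form, and check_warnings and the log name doxygen_warnings.log are names I made up.

```shell
#!/bin/sh
# Hypothetical gate: fail when a captured Doxygen log contains warnings.
# Assumes the default warning format, "file:line: warning: text".
check_warnings() {
    log=$1
    # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
    count=$(grep -c 'warning:' "$log" || true)
    echo "$count warning(s) in $log"
    [ "$count" -eq 0 ]
}

# Typical use (warnings go to stderr, so redirect it into the log):
#   doxygen j_config 2> doxygen_warnings.log
#   check_warnings doxygen_warnings.log || echo "kick it back"
```

The config file also has a WARN_LOGFILE tag if you would rather have Doxygen write the log itself.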
There are lots of different output formats you can enable. HTML and LaTeX are both enabled by default. The HTML is generated directly from the source and does not depend on the LaTeX output, so leave LaTeX enabled if you want a printable manual and turn it off if you only want to browse.
Now I open Firefox and put the following in the URL bar.
file:///home/roland/linux_doxygen/rpts/html/index.html
Our main page is blank because there aren’t any Doxygen comments in the code. On a proper project you will have all kinds of stuff here.
I’ve never liked how this diagram looks. It’s great for spotting base classes and identifying children that might have “unexpected features” if you change a base class, like add a method or member variable that exists in one or more of the children. For some reason I don’t think the heads on the arrows should point to the base. Maybe that is just me.
These hierarchy diagrams can go on forever in a project of significant size. You never want to try printing them out. When you are coming in cold though this can really help you figure things out. Assuming someone put just a tiny bit of thought into naming their classes, you should be able to get a general lay of the land.
I really prefer the index to the list.
On the far left of this image is the class list; forgot to take a clean shot for you. I prefer the index because I don’t have to scroll as much. Really just wanted to show you the Class Members sub-menu here. You can tunnel down into these objects.
You can just tunnel all the way down. Please note the small menu in the upper right to navigate to slots, member functions, etc. You can see even more when you scroll down.
If you find yourself thinking “Gee, this looks a lot like Qt documentation” that would be because they used to (and probably still do) use this to generate documentation.
This is “Hello World!”
What I’ve shown you here is the equivalent of a “Hello World!” program. Not only are there many more output types, you can have many different style sheets to control how output looks, colors used, etc. I’m just trying to show you that when someone doesn’t bother to put comments in their code, you don’t have to go blind trying to read their code.
One of the first things I like to do when thrust into a new existing project where nobody documented squat is introduce Doxygen, then start putting Doxygen formatted comments in the code. Managers usually don’t want to pay a consultant for that unless the code base has gotten to the “Gee, nobody can figure it out or remember what they did” stage.
“Read the code” may be the mantra hurled at you by 12 year old boys on StackExchange, but it is never the correct answer for an established code base of any real size. You need a tool to generate a picture so you can figure out which of the 10,000+ modules you need to look at first.
You can find the full Doxygen manual here.
A list of the “commands” can be found here. “Commands” is Doxygen’s name for what I call special comment tags. Note that \ and @ get used for the same thing. You just need to choose a standard for your project/shop and be consistent.