incava.org

about

past

DoctorJ began as a few Perl scripts two or three years ago, written at a time when I often encountered code like the following:

/**
 * Method: foo
 *
 * @return int an integer
 * @param String the string to be foo'ed
 * @param Object the object to foo.
 * @throw Exception when there are problems
 */
public int foo(Object a, String b)

The original idea was to validate Javadoc comments against the code itself. Originally called DocJ, it outgrew the attempt to parse Java with regular expressions, so it was modified to use a recursive descent parsing module (Parse::RecDescent). This module is powerful and elegant, but when the parser was finally implemented, a single Java input file took 7 minutes just to parse, before any analysis was performed.
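The idea of checking a Javadoc comment against its method can be sketched in a few lines of Java. This is only an illustration, not DoctorJ's actual implementation: the class and method names (JavadocParamCheck, documentedParams, check) are hypothetical, the comment is scanned with regular expressions rather than a real parser, and the declared parameter names are passed in by hand instead of being read from an AST.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: compares @param tags in a comment with declared parameter names.
class JavadocParamCheck {
    // Captures the first word after each @param tag.
    private static final Pattern PARAM_TAG = Pattern.compile("@param\\s+(\\S+)");

    // Matches "@throw" as a whole word, so the valid tag "@throws" is not matched.
    private static final Pattern THROW_TAG = Pattern.compile("@throw\\b");

    /** Returns the first word following each @param tag, in order of appearance. */
    public static List<String> documentedParams(String comment) {
        List<String> names = new ArrayList<>();
        Matcher m = PARAM_TAG.matcher(comment);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    /** Compares documented names with declared names, position by position. */
    public static List<String> check(String comment, List<String> declared) {
        List<String> problems = new ArrayList<>();
        List<String> documented = documentedParams(comment);
        int n = Math.max(documented.size(), declared.size());
        for (int i = 0; i < n; i++) {
            if (i >= documented.size()) {
                problems.add("parameter not documented: " + declared.get(i));
            } else if (i >= declared.size()) {
                problems.add("documented parameter not in code: " + documented.get(i));
            } else if (!documented.get(i).equals(declared.get(i))) {
                problems.add("@param " + documented.get(i)
                        + " does not match parameter " + declared.get(i));
            }
        }
        if (THROW_TAG.matcher(comment).find()) {
            problems.add("@throw is not a Javadoc tag; use @throws");
        }
        return problems;
    }

    public static void main(String[] args) {
        // The comment from the example above, for: public int foo(Object a, String b)
        String comment = "/**\n"
                + " * Method: foo\n"
                + " *\n"
                + " * @return int an integer\n"
                + " * @param String the string to be foo'ed\n"
                + " * @param Object the object to foo.\n"
                + " * @throw Exception when there are problems\n"
                + " */";
        for (String problem : check(comment, Arrays.asList("a", "b"))) {
            System.out.println(problem);
        }
    }
}
```

Run against the comment shown earlier, this sketch flags both @param tags (each names a type rather than the parameter) and the misspelled @throw tag. It does not check tag order or the @return description, which a fuller analysis would also cover.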

Java was an obvious candidate for the implementation language, but early trials with JavaCC and SableCC also required far too much processing time and memory. C++ was finally chosen for its speed and limited overhead. A scanner and an LALR(1) grammar were written, borrowing heavily from code in JLint, CUP, and Jikes, among others. From there, an abstract syntax tree was created, with the grammar repeatedly and heavily modified to make the AST more "flat".

Along the way, DocJ no longer seemed a sufficiently descriptive name, given that the application was destined to diagnose a variety of coding and documentation problems. DocJ morphed into DoctorJ, echoing the nickname of Julius "Dr. J" Erving, a legendary basketball player of the 1970s and 1980s, though not nearly closely enough to warrant a trademark infringement lawsuit.

Over time, rewriting DoctorJ in Java became more appealing as processors became faster and memory cheaper. Writing it in C++ was also becoming onerous, especially as my free time became more limited. So in February 2004 I started rewriting it using JavaCC, and by late March a rough working version was ready. Writing many unit tests (using JUnit) also accelerated the process; approximately 270 tests exist in 30 test cases.

architecture

DoctorJ consists of the following packages:

  • org.incava.util. Basic utilities.

  • org.incava.log. Contains the logging module, which is actually used more for debugging output than for logging.

  • org.incava.lang. Extensions to java.lang classes.

  • org.incava.io. Extensions to java.io classes.

  • org.incava.text. Spell-checking classes.

  • org.incava.jagol. Module for option and configuration file processing.

  • org.incava.java. The Java and Javadoc parsers and abstract syntax tree classes.

  • org.incava.analysis. Module for reporting, violations, and rules.

  • org.incava.doctorj. Contains the documentation analyzers and DoctorJ application.

resources

My primary development environment is:

future

I have not begun working on a Java version of the syntax analyzer. I recommend PMD, which contains many of the same rules.

The statistics analyzer may become a standalone application, depending on user interest and feedback.
