Sunday, October 8, 2017

Udacity CS259: Software Debugging Course Notes

I recently finished going through Udacity's CS259: Software Debugging class.

One insight from the beginning of the course was that we should treat bugs as mysteries to be solved, which we can explore systematically using the scientific method.

To help us reason about the state of the program, the instructor, Andreas Zeller, uses some interesting terms:
  • defect: The teacher prefers the term defect rather than bug; bug suggests something external, whereas defect suggests that the program was constructed incorrectly.
  • infection: An infection is an incorrect program state that arises during the execution of a program.
  • infection origin: The first point of infection.
  • failure: The failure is the incorrect output that we actually observe on the surface when the program is executed.
A defect causes infection, which then spreads and causes failure. Debugging is working backward from the failure to the infection origin to the defect itself.
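
To make the three terms concrete, here is a small sketch of my own (the `average` function is a made-up example, not something from the course) in which a single defect infects the program state and then surfaces as a failure:

```python
def average(values):
    total = 0
    # Defect: the loop starts at index 1, so the first element is skipped.
    # (The correct loop would be range(len(values)).)
    for i in range(1, len(values)):
        total += values[i]
    # Infection: `total` is now wrong whenever values[0] != 0.
    return total / len(values)


# Failure: the observed output is about 2.67, but the expected average is 4.0.
print(average([4, 2, 6]))

# The same defect causes no infection here, because the skipped element is 0,
# so this run happens to produce the correct answer.
print(average([0, 2, 4]))
```

Note that the second call passes even though the defect is still there, which is part of why a failure has to be reproduced before it can be traced back.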

7 Stages of Debugging

This course gives the following description of the process of debugging after discovering a failure.

  1. Track the problem using a bug tracker. Bug trackers can yield useful statistics.
  2. Reproduce the problem. The failure may occur only in a certain environment.
  3. Automate and simplify to make a test case that reproduces the problem.
  4. Find possible infection origins, that is, the places where the program state could first have become incorrect.
  5. Focus on the most likely origins, guided by code smells, past problems, etc.
  6. Isolate the infection chain. Use the scientific method. Set up experiments.
  7. Correct the defect. Then verify and follow up.

"Automating the Boring Tasks"

Usually, when it comes to debugging, the most basic tools that we have are print debugging and step debuggers.

But one of the main messages of this class was that we can use programs to help us reason about programs: aspects of debugging can be automated. For example:

  • Finding the minimal input that causes a particular output (e.g. delta debugging)
  • Finding differences between a failing and passing run of a program
  • Finding lines in a project that are correlated with failure
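
As an illustration of the first item, here is a minimal sketch of delta debugging in Python. It is a simplified variant that only tries removing one chunk at a time (testing the "complement"), and the names `ddmin`, `fails`, and `fails_on_bold_tag` are my own choices for this example rather than a definitive implementation:

```python
def ddmin(failing_input, fails):
    """Shrink failing_input while fails(...) stays True (complement-only ddmin)."""
    assert fails(failing_input), "ddmin needs an input that fails to begin with"
    s = failing_input
    n = 2  # number of chunks the input is currently split into
    while len(s) >= 2:
        chunk = len(s) // n
        reduced = False
        for start in range(0, len(s), chunk):
            # Remove one chunk and keep the rest (the "complement").
            complement = s[:start] + s[start + chunk:]
            if fails(complement):
                s = complement
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n == len(s):
                break  # every single-character removal passes, so we are done
            n = min(n * 2, len(s))  # retry with finer-grained chunks
    return s


# Hypothetical failure condition: the program under test breaks on a <b> tag.
def fails_on_bold_tag(text):
    return "<b>" in text


print(ddmin("foo <b>bar</b> baz", fails_on_bold_tag))  # prints <b>
```

The result is minimal in the sense that removing any single character makes the failure disappear. In practice `fails` would run the real program on the candidate input and report whether the original failure still shows up.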

Possible project inspiration

Zeller studied the Mozilla codebase by mining the issues in the Bugzilla issue tracker, together with the related patches and the files they touch. The same could be done for most large software projects by mining their issue tracker data. As a simple way to start:

  • List all fixed bugs of some type.
  • List commits associated with those bugs.
  • List files touched by those commits.
  • This should yield a listing of bug-prone files.
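
The steps above can be sketched in a few lines of Python. This is only a rough outline under assumptions of my own: the commit records are plain dictionaries with made-up `sha`, `bug_id`, and `files` fields, whereas a real project would have to extract them from its issue tracker and version control history.

```python
from collections import Counter


def bug_prone_files(fixed_bug_ids, commits):
    """Count how many bug-fix commits touched each file, most-touched first."""
    fixed = set(fixed_bug_ids)
    counts = Counter()
    for commit in commits:
        # Only commits linked to one of the fixed bugs are counted.
        if commit["bug_id"] in fixed:
            # set() so a file listed twice in one commit counts once.
            counts.update(set(commit["files"]))
    return counts.most_common()


# Hypothetical sample data, for illustration only.
commits = [
    {"sha": "a1", "bug_id": 101, "files": ["parser.c", "lexer.c"]},
    {"sha": "b2", "bug_id": 102, "files": ["parser.c"]},
    {"sha": "c3", "bug_id": None, "files": ["README"]},
    {"sha": "d4", "bug_id": 103, "files": ["net.c"]},
]
ranking = bug_prone_files([101, 102], commits)
print(ranking[0])  # ('parser.c', 2)
```

Raw counts favor large, frequently edited files, so a next step might be to normalize by file size or by total commit count.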
