Causeway: A message-oriented distributed debugger
Causeway, an open source distributed debugger written in E, lets you browse the causal graph of events in a distributed computation.
Causeway provides a post-mortem view, gathered from trace files written by the processes you wish to debug. The process-order view shows the full order of events recorded by each process. This gives a "follow the process" view common to conventional distributed debuggers. The message-order view offers an alternative "follow the conversation" outline, in which each event expands to show the events it causes.
Causeway presents several different views of the causal relations. The views are coordinated: selecting an item in one view selects the corresponding items in the other views.
- Process-order View (top-left pane) This view lists events in chronological order, organized by vat. It's a tabbed view with one tab per vat. An entry in the process-order view is a 2-level subtree. The parent item represents an event; each nested item represents an eventual send that occurred during the parent event. In the screenshot, the selected item is the currently selected event in the message-order view. Synchronized selection between the process-order and message-order views is especially useful since, taken together, they convey the equivalent of spacetime diagrams.
- Message-order View (top-right pane) This view shows the order in which events caused other events by sending messages. This message order is reflected in the outline structure: nested events were caused by their parent event. An event with multiple causes is a joining event; it appears directly under each of its causes and is marked with a right-arrow icon. Each tree item represents a message target and is identified by a vat name and turn. Its descriptive label depends on the information available in the trace record for the event and is one of the following:
  - The "text" field string. This field is required for Comment records and optional for Sent, SentIf, and Resolved records.
  - A single line of source code from the source file specified in the top stack entry.
  - The source file name and function name specified in the top stack entry.
- Stack Explorer (bottom-left pane) As with sequential debugging, the question is often: How did we get here -- what chain of activations led to the current event? Causeway's stack explorer answers this question by looking back in time and presenting both the eventual sends and the immediate calls that led to the current event. An entry in the stack explorer is a 2-level subtree. The parent item represents an event; its nested items represent the stack trace captured for that event. In the screenshot, the top entry is the currently selected event in the message-order view. Subsequent entries are built by following the message graph back in time to sending events. An event with multiple causes has multiple paths back, but only the last cause is followed. Being the last in chronological order, it is expected to be the most useful for following the interesting causality.
- Source View (bottom-right pane) This view shows the source code for the currently selected item in the stack explorer and indicates the corresponding source span.
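The relationships among these views can be illustrated with a small sketch. It is not Causeway's implementation and does not read Causeway's trace format: the record layout, the vat names (vatA, vatB, vatC), the send times, and all function names are assumptions made for illustration. It assumes each send record names a causing event, a caused event, and a send time, and that turn numbers within a vat increase chronologically.

```python
from collections import defaultdict

# Hypothetical send records, NOT Causeway's trace format:
# (causing event, caused event, send time); an event is (vat name, turn).
SENDS = [
    (("vatA", 1), ("vatB", 1), 10),
    (("vatA", 1), ("vatC", 1), 11),
    (("vatB", 1), ("vatA", 2), 20),
    (("vatC", 1), ("vatA", 2), 25),  # ("vatA", 2) has two causes: a joining event
]

def index_sends(sends):
    """Map each event to the events it caused and to its (time, cause) pairs."""
    effects, causes = defaultdict(list), defaultdict(list)
    for cause, effect, sent_at in sends:
        effects[cause].append(effect)
        causes[effect].append((sent_at, cause))
    return effects, causes

def process_order(sends):
    """Per vat: turns in order, each with the sends made during that turn."""
    by_vat = defaultdict(dict)
    for cause, effect, _ in sends:
        by_vat[cause[0]].setdefault(cause[1], []).append(effect)
        by_vat[effect[0]].setdefault(effect[1], [])
    return {vat: sorted(turns.items()) for vat, turns in by_vat.items()}

def message_order(event, effects, causes, depth=0):
    """Outline lines: each event is followed, indented, by the events it caused."""
    marker = "-> " if len(causes.get(event, [])) > 1 else ""  # joining event
    lines = ["  " * depth + marker + f"{event[0]}, turn {event[1]}"]
    for effect in effects.get(event, []):
        lines.extend(message_order(effect, effects, causes, depth + 1))
    return lines

def stack_walk(event, causes):
    """Follow the message graph back in time, taking only the last cause."""
    path = [event]
    while causes.get(event):
        _, event = max(causes[event], key=lambda pair: pair[0])
        path.append(event)
    return path

effects, causes = index_sends(SENDS)
print(process_order(SENDS)["vatA"])
print("\n".join(message_order(("vatA", 1), effects, causes)))
print(stack_walk(("vatA", 2), causes))
```

In this sketch the joining event ("vatA", 2) is listed under both of its causes in the outline, while the backward walk from it follows only the later send, the one from ("vatC", 1).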
File>>Export... translates Causeway's message graph (a DAG) to the DOT format and writes the resulting dot file to local disk. The dot file is a human-readable text file that specifies the graph in the DOT language. Graphviz must be downloaded and installed to see the graph visualization. The graph below was generated for the Waterken application described above.
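To give a sense of what a DOT description of a message graph looks like, here is a minimal sketch that writes a small DAG as DOT text. It is not the output of Causeway's exporter: the event names, the node-naming scheme, and the to_dot function are hypothetical, and only the basic DOT syntax (a digraph with "a" -> "b" edges) is assumed.

```python
# Hypothetical message graph: each pair is (causing event, caused event),
# where an event is (vat name, turn). Not taken from a real Causeway trace.
EDGES = [
    (("vatA", 1), ("vatB", 1)),
    (("vatA", 1), ("vatC", 1)),
    (("vatB", 1), ("vatA", 2)),
    (("vatC", 1), ("vatA", 2)),
]

def to_dot(edges, graph_name="messages"):
    """Render the edges as a DOT digraph, one quoted node ID per event."""
    def node_id(event):
        vat, turn = event
        return f'"{vat}:{turn}"'

    lines = [f"digraph {graph_name} {{"]
    for cause, effect in edges:
        lines.append(f"    {node_id(cause)} -> {node_id(effect)};")
    lines.append("}")
    return "\n".join(lines) + "\n"

dot_text = to_dot(EDGES)
print(dot_text)
```

If this text is saved to a file, for example messages.dot (a name chosen here for illustration), Graphviz can render it with a command such as dot -Tpng messages.dot -o messages.png.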
Our current development effort is to generalize Causeway to support asynchronous message-passing programs running on event-loop-based platforms in general, not just E. Our initial focus has been on the Waterken server.