Causeway

From Erights

Revision as of 22:50, 23 May 2010 by Tstanley (Talk)
Jump to: navigation, search

Contents

Causeway: A message-oriented distributed debugger

Causeway is an open source postmortem distributed debugger for examining the behavior of distributed programs built as communicating event loops. Its message-oriented approach follows the flow of messages across process and machine boundaries.

Getting Started

The simplest way to get started is to launch Causeway from a command line shell and then open one of the examples.

$ cd e/src
$ rune -De.home=esrc esrc/scripts/causeway.e-swt


From the Welcome view select an example program from the Help menu.


Image:Welcome-help.png


Optionally, the sources and trace logs can be specified on the command line.

$ rune -De.home=esrc -Dsrc=<srcRootDir> esrc/scripts/causeway.e-swt <logs>

Java's default memory settings are sufficient for the examples but larger programs need more stack and heap space. Use the -Xss (stack) and -Xmx (heap) options to increase Java default memory sizes. Follow the amount with m for Mb or k for Kb. Notice the format does not follow the name=value convention. The J option tells rune to pass the option to Java.

-J-Xss128m -J-Xmx128m

Setting Causeway's debug flag enables a Debug view. As events are selected in the viewer, the Debug view shows the corresponding trace record in the log file.

-Dcauseway_debug=true


Image:debug-view.png


Waterken Example (Ajax-style)

This Java program ran on the Waterken server instrumented to generate Causeway's trace log format. The program is a distributed implementation of a procedure for handling new purchase orders.

Before an order is placed, certain conditions must be met: the item is in stock and available, the customer’s account is in good standing, and the delivery options are up to date. An object residing in the “buyer” process (or vat) has remote references to objects residing in the “product” and “accounts” processes. The buyer queries the remote objects with asynchronous message sends. This example uses Ajax-style continuation-passing: a request carries a callback argument to which a response should be sent.

The screenshot below shows the principal views from Causeway's postmortem display for this example.

Causeway Viewer

Causeway Viewer


  • Search view (leftmost): This view lists bookmarks and results of search commands.
  • Process-order view: This view lists events in chronological order, organized by vat. For example, clicking the "buyer" tab shows all events logged by the buyer vat. The events are ordered by turn number and within each turn, anchor number.
  • Message-order view: This view shows the order in which events caused other events by sending messages. Message order is reflected in the outline structure: nested events were caused by the parent event. Causeway assigns each vat a color so we can see when message flow crosses vat boundaries.
  • Stack Explorer and Source view: These views are familiar from sequential debugging.

Individual tree items represent events and their descriptive labels depend on the information available in the trace record for the event. Causeway labels the tree items according to the following priority.

  1. The "text" field string. This field is required for Comment records. It is optional for Sent, SentIf, and Resolved records.
  2. If the trace record has at least one stack trace with a source and span, a single line of source code from the source file specified in the top stack entry.
  3. If there's no span, source file name and function name specified in the top stack entry.
  4. If nothing else, a Causeway comment.


Menu Commands

File >> Set Source Root... point at the root directory of the sources.

For example, for the trace record pathname shown below,

"trace" : {
  "calls" : [ {
      "name" : "AsyncAnd.run",
      "source" : "org/waterken/purchase_ajax/AsyncAnd.java"
    } ]
}

set the source root to the underlined prefix.

~/Desktop/waterken/example/src/org/waterken/purchase_ajax/AsyncAnd.java

(Limitation: Cannot set multiple source roots.)


File >> Open select log files to open.

(Limitation: Cannot select a folder. Must go into the folder and select the multiple files.)


File >> Export... translates Causeway's message graph (DAG) to the GraphViz DOT format and writes the dot file to a local disk. The dot file is a human-readable text file. It specifies a graph using the DOT language. GraphViz must be downloaded and installed to see the graph visualization. The graph below was generated for the Waterken example described above.


Causeway's DAG as GraphViz graph


Search >> Find Lost Messages reports sent messages that were not received. The message graph is searched for Sent or SentIf events having no corresponding Got event. Of course, the message may actually have been received, but the event was not logged.


Tools >> Set Filter Options... presents all source files seen during parsing of the trace logs. Individual files can be filtered out. (Limitation: These settings are not persistent across launches.)


Filter options dialog


Context Menu Commands

Bookmark bookmarks the currently selected event.


Bookmark menu item


Find Multiples finds the multiple causes of a joining event.


Find multiples menu item


Search Stacks


Search stacks menu item


(Limitation: Not all source lines ... and currently, there is no visual indication...)



Causeway's Trace Log Format

Causeway supports the JSON trace log format defined by Tyler Close.

Trace log event types


When causality tracing is on the events are logged as follows.

Event Record type
An eventual send to an object Sent
A message delivery, starting a new turn Got
Registration of a when-block, to execute when a promise resolves SentIf
Returned result from a remote object, for local promise resolution Returned
A state-based event contributes to promise resolution Progressed
A promise resolves Fulfilled or Rejected
Programmer logs a comment Comment


Logging Ajax-style Messaging in Waterken

Consider the sequence of events shown below.

Ajax-style message send

  • a remote inventory object is queried for the availability of a part
  • the inventory object reports true to teller, a callback object


The eventual send to the inventory object has two log entries: a Sent and its corresponding Got.

    "class" : [ "org.ref_send.log.Sent", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 3,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 3
      }
    },
    "message" : "lqhjpwbbeemozk-2-0",
    "timestamp" : 1274238401772,
    "trace" : {
      "calls" : [ {
          "name" : "Main.Buy.fulfill",
          "source" : "org/waterken/purchase_ajax/Main.java",
          "span" : [ [ 68 ] ]
        }, {
          "name" : "Main.Buy.fulfill",
          "source" : "org/waterken/purchase_ajax/Main.java",
          "span" : [ [ 48 ] ]
        } ]
    }

The anchor uniquely identifies the origin of this message send as the 3rd messaging event from the buyer vat, turn 3.

The message is a generated string which uniquely identifies a message.

The trace is the stack capture at the point of the message send.

Note: The loop field identifies the vat by URI. By convention, Causeway picks up the part following "/-/", in this case buyer, for a short display name.

Note: The timestamp field is optional. Currently, it is ignored by Causeway, so it's not shown in the remaining trace records.

    "class" : [ "org.ref_send.log.Got", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 1,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/product/",
        "number" : 2
      }
    },
    "message" : "lqhjpwbbeemozk-2-0",
    "trace" : {
      "calls" : [ {
          "name" : "InventoryMaker.InventoryX.partInStock",
          "source" : "org/waterken/purchase_ajax/InventoryMaker.java"
        } ]
    }

The corresponding Got record matches on message. The message delivery in the product vat starts a new turn, turn 2. Being at the top of a new turn, there is limited stack capture and getting a source span through Java reflection, is not practical.

Reporting true to teller has two log entries: a Sent and its corresponding Got.

    "class" : [ "org.ref_send.log.Sent", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 2,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/product/",
        "number" : 2
      }
    },
    "message" : "ewrigzpctikrhk-1-0",
    "trace" : {
      "calls" : [ {
          "name" : "InventoryMaker.InventoryX.partInStock",
          "source" : "org/waterken/purchase_ajax/InventoryMaker.java",
          "span" : [ [ 19 ] ]
        } ]
    }

The anchor uniquely identifies the origin of this message send as the 2nd messaging event from the product vat, turn 2.

    "class" : [ "org.ref_send.log.Got", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 1,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 10
      }
    },
    "message" : "ewrigzpctikrhk-1-0",
    "trace" : {
      "calls" : [ {
          "name" : "AsyncAnd.run",
          "source" : "org/waterken/purchase_ajax/AsyncAnd.java"
        } ]
    }

The corresponding Got record matches on message. The message delivery in the buyer vat starts a new turn, turn 10.

Logging Promise-based Messaging in Waterken

Consider the sequence of events shown below.

Promise-based message send


Note: For the purpose of describing the trace records the screenshot shows events that actually will be filtered by Causeway and do not appear in the message-order view.


   "class" : [ "org.ref_send.log.Sent", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 1,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 2
      }
    },
    "message" : "3t4ubiwssjzuy2-2-0",
    "trace" : {
      "calls" : [ {
          "name" : "Buyer.Buy.fulfill",
          "source" : "org/waterken/purchase_promise/Buyer.java",
          "span" : [ [ 39 ] ]
        }, {
          "name" : "Buyer.Buy.fulfill",
          "source" : "org/waterken/purchase_promise/Buyer.java",
          "span" : [ [ 24 ] ]
        } ]
    }
    "class" : [ "org.ref_send.log.Got", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 0,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/product/",
        "number" : 1
      }
    },
    "message" : "3t4ubiwssjzuy2-2-0",
    "trace" : {
      "calls" : [ {
          "name" : "InventoryMaker.InventoryX.partInStock",
          "source" : "org/waterken/purchase_promise/InventoryMaker.java"
        } ]
    }
    "class" : [ "org.ref_send.log.Returned", "org.ref_send.log.Sent", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 1,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/product/",
        "number" : 1
      }
    },
    "message" : "3t4ubiwssjzuy2-2-0-return"
    "class" : [ "org.ref_send.log.Got", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 0,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 5
      }
    },
    "message" : "3t4ubiwssjzuy2-2-0-return"
    "class" : [ "org.ref_send.log.Fulfilled", "org.ref_send.log.Resolved", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 1,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 5
      }
    },
    "condition" : "http://localhost:8080/-/buyer/#p3"
    "class" : [ "org.ref_send.log.Got", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 0,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 8
      }
    },
    "message" : "http://localhost:8080/-/buyer/#w3",
    "trace" : {
      "calls" : [ {
          "name" : "AsyncAnd.DoAnswer.fulfill",
          "source" : "org/waterken/purchase_promise/AsyncAnd.java"
        } ]
    }
    "class" : [ "org.ref_send.log.Progressed", "org.ref_send.log.Resolved", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 1,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 8
      }
    },
    "condition" : "http://localhost:8080/-/buyer/#p6",
    "trace" : {
      "calls" : [ {
          "name" : "AsyncAnd.DoAnswer.fulfill",
          "source" : "org/waterken/purchase_promise/AsyncAnd.java",
          "span" : [ [ 34 ] ]
        }, {
          "name" : "AsyncAnd.DoAnswer.fulfill",
          "source" : "org/waterken/purchase_promise/AsyncAnd.java",
          "span" : [ [ 16 ] ]
        } ]
    }
    "class" : [ "org.ref_send.log.Got", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 0,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 14
      }
    },
    "message" : "http://localhost:8080/-/buyer/#w6",
    "trace" : {
      "calls" : [ {
          "name" : "Buyer.CheckAnswers.fulfill",
          "source" : "org/waterken/purchase_promise/Buyer.java"
        } ]
    }

The registration of when-blocks, logged as SentIf records, are filtered from the message-order view, as they don't contribute to the understanding of message flow. However, they do appear in process-order, as shown below.


Buyer events in process order


    "class" : [ "org.ref_send.log.SentIf", "org.ref_send.log.Sent", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 7,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 2
      }
    },
    "condition" : "http://localhost:8080/-/buyer/#p6",
    "message" : "http://localhost:8080/-/buyer/#w6",
    "trace" : {
      "calls" : [ {
          "name" : "Buyer.Buy.fulfill",
          "source" : "org/waterken/purchase_promise/Buyer.java",
          "span" : [ [ 46 ] ]
        }, {
          "name" : "Buyer.Buy.fulfill",
          "source" : "org/waterken/purchase_promise/Buyer.java",
          "span" : [ [ 24 ] ]
        } ]
    }
    "class" : [ "org.ref_send.log.SentIf", "org.ref_send.log.Sent", "org.ref_send.log.Event" ],
    "anchor" : {
      "number" : 4,
      "turn" : {
        "loop" : "http://localhost:8080/-/buyer/",
        "number" : 2
      }
    },
    "condition" : "http://localhost:8080/-/buyer/#p3",
    "message" : "http://localhost:8080/-/buyer/#w3",
    "trace" : {
      "calls" : [ {
          "name" : "AsyncAnd.run",
          "source" : "org/waterken/purchase_promise/AsyncAnd.java",
          "span" : [ [ 60 ] ]
        }, {
          "name" : "Buyer.Buy.fulfill",
          "source" : "org/waterken/purchase_promise/Buyer.java",
          "span" : [ [ 43 ] ]
        }, {
          "name" : "Buyer.Buy.fulfill",
          "source" : "org/waterken/purchase_promise/Buyer.java",
          "span" : [ [ 24 ] ]
        } ]
    }


Performance Issues in Waterken

Due to the expense of stack capture in Java, tracing in Waterken incurs roughly, an order of magnitude performance penalty. If tracing is off, there is no penalty.

Waterken guarantees message order delivery and if a connection is dropped, there's enough information to know about partial success. For example, if 2 messages (msg1, msg2) are sent from vat A to vat B, they are guaranteed to be processed in the order sent. If the connection is dropped after msg1 is successfully sent, when the connection is re-established, it is known that only msg2 must be resent.

The identifiers used to support these guarantees are also used for tracing. The advantage of these multi-purpose identifiers is there is no overhead when tracing is off (i.e., unique message identifiers, just for tracing, are not sent out over the wire.)

Note: Resending a message after a connection is re-established can result in 2 identical Sent events being logged. Causeway notices when the event records are identical and ignores the duplicate.


See Also

Our current development effort is to generalize Causeway to support asynchronous message-passing programs running on event loop-based platforms in general, not just E. Our initial focus has been on the Waterken server.

HP Labs Technical Report presents our experience with the Waterken web server which we have instrumented to generate Causeway's language-neutral trace log format.

Screencast presents a brief demonstration of Causeway, using the example from the HP Tech Report.

Debugging a Waterken application explains how to configure the Waterken server to emit the JSON debugging records understood by Causeway.

Debugging AmbientTalk using Causeway explains how to emit Causeway debugging records, in order to use Causeway to debug AmbientTalk applications.

Personal tools
more tools