7. Reporters and Repositories

Inca reporters are executable programs and scripts, generally small, that test and report the health and characteristics of a system. The figure below illustrates a typical Inca installation where reporters are retrieved from a repository and sent to Reporter Managers on Grid resources by the Agent. The Reporter Managers then execute the reporters based on series configuration from the Agent and send the XML reports to the Depot for storage.

7.1. Executing Reporters

Because they are executables, Inca reporters are independent of the rest of the Inca system. Reporters can be executed manually from a command line or automatically as part of an Inca installation. Incorporating your own reporters into a running Inca installation requires only writing the reporters (Section 7.2), including them in a repository (Section 7.3), and configuring the repository's series using incat (Section 5). Most developers will execute reporters from the command line before adding them to their Inca installation. After installing Inca, you can try executing some of the reporters that come with the distribution from the command line:
% cd $INCA_DIST/Inca-Reporter-*/bin
% setenv PERL5LIB ../lib
% ./cluster.compiler.gcc.version 
You should now see XML output like that in Section 7.2.2.

All Inca reporters must support the command line arguments listed in the table below. In addition, a reporter may support additional command line arguments specific to that reporter's task.

Table 5. Reporter Command Line Arguments

Argument

Valid Values

Default Value

Description

-help

yes|no

no

Do not run the reporter; instead, print information on running it. If the value of the verbose argument is 0, this information will be readable text; otherwise, it will be reporter XML.

-log

0-5|debug|error|info|system|warn

0

Include reporter log messages in the reporter output. The named argument values indicate specific types of log messages that should be included. 0 indicates no log messages should be included; the other numeric values indicate error, warn, system, info, and debug messages, cumulatively. For example, --log=2 indicates both error and warn messages should be included, while --log=4 includes error, warn, system, and info messages.

-verbose

0-2

1

Determine what the reporter prints. A verbose level of 0 indicates that the reporter prints only 'completed' or 'failed', depending on the outcome of its testing. Verbose level 1 produces XML that reports the testing result, and verbose level 2 adds additional tags to this XML to give instructions on running the reporter.

-version

yes|no

no

Do not run the reporter; instead, print its version number.

Executing a reporter using different arguments:
% ./cluster.compiler.gcc.version -log=3 
% ./cluster.compiler.gcc.version -help=yes -verbose=0
% ./cluster.compiler.gcc.version -version=yes

7.2. Writing Reporters

Reporters can be written in any language as long as they output XML according to the schema described in Section 7.2.1. New reporter developers may choose to write reporters in Perl since the Inca distribution includes sample Perl reporters and API modules (Section 7.2.3) for printing XML according to our schema.

NOTE: Because some databases won't allow queries greater than 4kB, the XML portion of the report for logging/debugging and the error message must each be smaller than 4kB in size. The body XML can be 12kB in size because it is queried in three parts. If the report XML is greater than its limit, the depot truncates the oversized section from the beginning until it is the right size.

7.2.1. Reporter XML Schema

In order to promote interoperability between reporters, we define a specification for how reporter output should be formatted. Given the wide acceptance and availability of tools for XML, the specification requires that reporter output should be formatted using XML. Furthermore, we specify a basic schema that the XML should follow so that we can handle the output in a general manner. The goal of this schema is to be flexible enough to express a wide variety of data.

Our approach is to require a number of XML fields which provide metadata about the output and define one of the fields, body, to be abstract. The body field is a placeholder for the formatted output and can be replaced by any XML substitution group thereby allowing this schema to accommodate a large variety of output. In other words, the basic schema is like an abstract class and the substitution groups provide for subclassing.

The reporter schema is visualized in Figure 18.

Figure 18. Inca Reporter Schema

7.2.2. Reporter XML Output

Here is the output from the successful run of a typical Inca reporter. The content and meaning of the XML tags is described below.

Figure 19. Example of Reporter Output

<?xml version='1.0'?>
<inca:report xmlns:inca='http://inca.sdsc.edu/dataModel/report_2.1'>
  <gmt>2006-11-17T17:35:40Z</gmt>
  <hostname>jhayes-Computer.local</hostname>
  <name>cluster.compiler.gcc.version</name>
  <version>2</version>
  <workingDir>/Users/jhayes/Inca/subversion/inca/trunk/devel/reporters/bin</workingDir>
  <reporterPath>cluster.compiler.gcc.version</reporterPath>
  <args>
    <arg>
      <name>help</name>
      <value>no</value>
    </arg>
    <arg>
      <name>log</name>
      <value>0</value>
    </arg>
    <arg>
      <name>verbose</name>
      <value>1</value>
    </arg>
    <arg>
      <name>version</name>
      <value>no</value>
    </arg>
  </args>
  <body>
    <package>
      <ID>gcc</ID>
      <version>3.3</version>
    </package>
  </body>
  <exitStatus>
    <completed>true</completed>
  </exitStatus>
</inca:report>

As shown in Figure 19, reporter output begins with an XML preamble and is surrounded by <report> tags. A prefix with a tag name that references http://inca.sdsc.edu/dataModel/report_2.1, which is the namespace that defines the report schema, can also be used.

The following tags are defined within a <report>:

gmt

the time the reporter ran (ISO 8601 format)

hostname

host where reporter ran

name

reporter name

version

reporter version number

workingDir

directory where reporter execution begins

reporterPath

local path to reporter file

args

args must contain an arg name/value entry for every argument the reporter supports, including those for which the reporter supplies a default value (help, log, verbose, version)

log

OPTIONAL TAG (not shown in Figure 19 report). Log entries produced by the reporter. This tag contains one or more <debug>, <error>, <info>, <system>, and/or <warn> tags, each of which gives the text of the message and the time it was produced. Here is a typical example of a log section:
  <log>
    <system>
      <gmt>2006-11-17T18:28:10Z</gmt>
      <message>grid-proxy-info 2&gt;&amp;1</message>
    </system>
    <debug>
      <gmt>2006-11-17T18:28:10Z</gmt>
      <message>Checking for grid proxy: Result of command "grid-proxy-info":
         sh: line 1: grid-proxy-info: command not found
      </message>
    </debug>
    <error>
      <gmt>2006-11-17T18:28:10Z</gmt>
      <message>ERROR: Valid proxy needed for file transfer.</message>
    </error>
  </log>

body

The body tag contains the results of the reporter testing. The only requirement for the contents of this tag is that they must be well-formed XML--tags balanced and no extraneous <, >, and & characters. Figure 19 shows the conventional body for version reporters.

exitStatus

Includes the boolean <completed> tag, indicating whether or not the reporter successfully completed its testing, and the optional <errorMessage> tag, which contains a string indicating why the reporter failed to complete.

help

OPTIONAL TAG (not shown in Figure 19 report). The help tag describes the reporter and how to run it. Contents include the reporter name, version, description, and url, detailed descriptions of each argument, and an optional list of dependencies that the reporter has on other packages. For example, here is the <help> section for the gcc version reporter.
  <help>
    <ID>help</ID>
    <name>cluster.compiler.gcc.version</name>
    <version>2</version>
    <description>Reports the version of gcc</description>
    <url>http://gcc.gnu.org</url>
    <argDescription>
      <ID>help</ID>
      <accepted>no|yes</accepted>
      <description>display usage information (no|yes)</description>
      <default>no</default>
    </argDescription>
    <argDescription>
      <ID>log</ID>
      <accepted>[012345]|debug|error|info|system|warn</accepted>
      <description>log message types included in report</description>
      <default>0</default>
    </argDescription>
    <argDescription>
      <ID>verbose</ID>
      <accepted>[012]</accepted>
      <description>verbosity level (0|1|2)</description>
      <default>1</default>
    </argDescription>
    <argDescription>
      <ID>version</ID>
      <accepted>no|yes</accepted>
      <description>show reporter version (no|yes)</description>
      <default>no</default>
    </argDescription>
    <dependency>
      <ID>Inca::Reporter</ID>
    </dependency>
    <dependency>
      <ID>Inca::Reporter::Version</ID>
    </dependency>
  </help>

7.2.3. Reporter APIs

The Inca release includes a set of Perl modules that make it easier to develop reporters that produce output as shown in Section 7.2.2 and conform to the schema described in Section 7.2.1. The following are a list of modules and their purpose (click on module names for manpages):

  1. Inca::Reporter

    This module is the general reporter API and is the base class for all types of reporters. Inca::Reporter automates determination of hostname, gmt, reporter name, etc., handles command-line parsing, provides an interface for log messages, and handles XML generation.

  2. Inca::Reporter::GlobusUnit

    The Inca::Reporter::GlobusUnit module is used for Globus unit tests. This package provides methods for running Globus jobs.

  3. Inca::Reporter::GridProxy

    The Inca::Reporter::GridProxy package is a pseudo-module indicating that a reporter requires a proxy credential in order to execute. The following is an example of a reporter that requires a proxy:
    #!/usr/bin/env perl
    
    use strict;
    use warnings;
    use Inca::Reporter::SimpleUnit;
    
    my $reporter = new Inca::Reporter::SimpleUnit(
      name => 'grid.middleware.globus.unit.proxy',
      version => 2,
      description => 'Verifies that user has valid proxy',
      url => 'http://www.globus.org/security/proxy.html',
      unit_name => 'validproxy'
    );
    $reporter->addDependency( "Inca::Reporter::GridProxy" );
    $reporter->processArgv(@ARGV);
    
    # check to see if proxy has enough time left
    $reporter->log( 'info', "X509_USER_PROXY=$ENV{X509_USER_PROXY}" );
    my $output = $reporter->loggedCommand('grid-proxy-info -exists -hours 4 2>&1');
    if( $? != 0 ) {
      $reporter->unitFailure("grid-proxy-info failed: $! $output");
    } else {
      $reporter->unitSuccess();
    }
    $reporter->print();

  4. Inca::Reporter::Performance

    The Inca::Reporter::Performance module is used to gather system performance metrics. This package defines a common <body> schema for system/software performance metric reporters and produces a collection of benchmarks, each a set of parameters (name/value) and statistics (name/value/units). For a single benchmark, the Inca::Test::Reporter::Performance::Benchmark module is also available.

  5. Inca::Reporter::SimpleUnit

    The Inca::Reporter::SimpleUnit module is used for software unit tests. This package defines a common <body> schema for unit test reporters and provides methods for recording results of unit tests.

  6. Inca::Reporter::Usage

    The Inca::Reporter::Usage module is used for creating simple usage reports.

  7. Inca::Reporter::Version

    The Inca::Reporter::Version module is used for reporting software versions. This package defines a common <body> schema for version reporters, offers support for subpackage versions, and provides convenience methods for common ways of determining version.

The following is the Perl code for a reporter that produces output like Figure 19. This reporter uses the Inca::Reporter::Version module to determine the version of gcc. Examples of reporters that use the other Perl modules are located in $INCA_DIST/Inca-Reporter-*/bin.
#!/usr/bin/env perl

use strict;
use warnings;
use Inca::Reporter::Version;

my $reporter = new Inca::Reporter::Version(
  name => 'cluster.compiler.gcc.version',
  version => 2,
  description => 'Reports the version of gcc',
  url => 'http://gcc.gnu.org',
  package_name => 'gcc'
);
$reporter->processArgv(@ARGV);

$reporter->setVersionByExecutable('gcc -dumpversion');
$reporter->print();

7.3. Reporter Repositories

The Inca system retrieves reporters from external collections called repositories. A reporter repository is simply a file directory, accessed via a file: or http: URL, that contains a catalog file named Packages.gz. This gzipped file includes a sequence of name:value attribute pairs for every reporter and support package in the repository; blank lines separate the attributes for different reporters. For example, here is a portion of the Packages.gz file for the Inca standard reporter repository.

arguments: help no|yes no;log [012345]|debug|error|info|system|warn 0;verbose [0
12] 1;version no|yes no
dependencies: Inca::Reporter;Inca::Reporter::Version
description: Reports the version of tgusage
file: cluster.accounting.tgusage.version
name: cluster.accounting.tgusage.version
url: http://www.teragrid.org
version: 2

arguments: help no|yes no;log [012345]|debug|error|info|system|warn 0;verbose [0
12] 1;version no|yes no
dependencies: Inca::Reporter;Inca::Reporter::SimpleUnit
description: ant hello world test
file: cluster.admin.ant.unit
name: cluster.admin.ant.unit
version: 3

arguments: help no|yes no;log [012345]|debug|error|info|system|warn 0;verbose [0
12] 1;version no|yes no
dependencies: Inca::Reporter;Inca::Reporter::Version
description: Reports the version of Apache Ant
file: cluster.admin.ant.version
name: cluster.admin.ant.version
version: 2

Of the attributes shown, only file and name are required. The file attribute gives the relative path to the reporter file, and the name attribute specifies the unique package name of the reporter. If the reporter requires support packages to execute, it should include a dependencies attribute with a semicolon-separated list of package names. The incat administration tool uses the Packages.gz file's arguments and description attributes as part of its series edit dialog. The value of the arguments attribute is a semicolon-separated list giving the name, value pattern, and default value, if any, for each supported command-line argument.

To create a local repository for your own reporters, you only need to collect them into a directory and create a Packages.gz in that directory. The default Inca installation has a Packages.gz file in $INCA_DIST/Inca-Reporter-* that can be added in incat. Inca also supplies a web accessible repository that can be added in incat as "http://inca.sdsc.edu/repository/latest/".

The Inca distribution includes a perl script, incpack, that can create Packages.gz for you. Simply run incpack with a list of reporters that you want to include in Packages.gz, e.g.,

% perl incpack jade.version f77.unit vim.version

incpack runs each of the listed reporters with --help=yes --verbose=1 to extract a standard set of attributes. If your reporters use the Inca reporter perl modules, you might need to run incpack with a -I switch to specify the location of the Inca perl library, like this.

% perl incpack -I ${INCA_DIST}/lib jade.version f77.unit vim.version

7.4. Updating Reporter Repositories

The Inca agent will detect changes to your reporter repository and automatically send changes to the appropriate reporter managers if you:

  1. update the reporter version number (ie. change a line like "version => 1" to "version => 2" in the body of the reporter)

  2. make sure the reporter permissions are set so the agent can fetch the reporter (755 is the standard reporter permission)

  3. update your Packages.gz file using incpack. The command will be something like:
    % cd $INCA_DIST/Inca-Reporter-*; ./sbin/incpack -I lib bin/<reportername>

  4. wait for the agent to deploy the new reporter automatically (it looks for new reporters every four hours by default),

    *OR*

    restart the agent,

    *OR*

    Connect to the agent in incat, select the Repositories tab, then press the Refresh button under the repository panel.

If the revised reporter still isn't deployed, look for any errors in the $INCA_DIST/var/agent.log that indicate the agent was unable to fetch the reporter or skipped over updating it. Make sure there is an active series that uses the reporter with "use latest version" checked on the resource your intend it to run on incat. Look for $INCA_DIST/var/repository/repository.xml entries for the reporter with "<latestVersion>false</latestVersion>" (should be "<latestVersion>true</latestVersion>" to get the updated reporters).