8. Reporters and Repositories

Inca reporters are executable programs and scripts, generally small, that test and report the health and characteristics of a system. The figure below illustrates a typical Inca installation where reporters are retrieved from a repository and sent to Reporter Managers on Grid resources by the Agent. The Reporter Managers then execute the reporters based on series configuration from the Agent and send the XML reports to the Depot for storage.

8.1. Executing Reporters

Because they are executables, Inca reporters are independent of the rest of the Inca system. Reporters can be executed manually from a command line or automatically as part of an Inca installation. Incorporating your own reporters into a running Inca installation requires only writing the reporters (Section 8.2), including them in a repository (Section 8.3), and configuring the repository's series using incat (Section 5). Most developers will execute reporters from the command line before adding them to their Inca installation. After installing Inca, you can try executing some of the reporters that come with the distribution from the command line:

% cd $INCA_DIST/Inca-Reporter-*/bin
% setenv PERL5LIB ../lib/perl
% setenv PYTHONPATH ../lib/python
% ./cluster.compiler.gcc.version 
You should now see XML output like that in Section 8.2.2.

All Inca reporters must support the command line arguments listed in the table below. In addition, a reporter may support additional command line arguments specific to that reporter's task.

Table 7. Reporter Command Line Arguments

Argument: -help
Valid Values: yes|no
Default Value: no
Description: Do not run the reporter; instead, print information on running it. If the value of the verbose argument is 0, this information will be readable text; otherwise, it will be reporter XML.

Argument: -log
Valid Values: 0-5|debug|error|info|system|warn
Default Value: 0
Description: Include reporter log messages in the reporter output. The named argument values indicate specific types of log messages that should be included. 0 indicates no log messages should be included; the other numeric values indicate error, warn, system, info, and debug messages, cumulatively. For example, --log=2 indicates both error and warn messages should be included, while --log=4 includes error, warn, system, and info messages.

Argument: -verbose
Valid Values: 0-2
Default Value: 1
Description: Determine what the reporter prints. A verbose level of 0 indicates that the reporter prints only 'completed' or 'failed', depending on the outcome of its testing. Verbose level 1 produces XML that reports the testing result, and verbose level 2 adds additional tags to this XML to give instructions on running the reporter.

Argument: -version
Valid Values: yes|no
Default Value: no
Description: Do not run the reporter; instead, print its version number.

Executing a reporter using different arguments:

% ./cluster.compiler.gcc.version -log=3 
% ./cluster.compiler.gcc.version -help=yes -verbose=0
% ./cluster.compiler.gcc.version -version=yes
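The cumulative meaning of the numeric -log values can be sketched in a few lines of Python. This is an illustration only, not code from the Inca distribution; the function name log_types is hypothetical, and the sketch assumes that a named value such as "warn" selects only that one message type.

```python
# Message types in the order Table 7 lists them for levels 1 through 5.
LOG_TYPES = ["error", "warn", "system", "info", "debug"]


def log_types(value):
    """Return the message types selected by a -log argument value."""
    if value in LOG_TYPES:
        # Assumption: a named value selects that single message type.
        return [value]
    level = int(value)
    if not 0 <= level <= len(LOG_TYPES):
        raise ValueError("log value must be 0-5 or a message type name")
    # Numeric levels are cumulative: level n selects the first n types.
    return LOG_TYPES[:level]
```

With this mapping, a value of 2 yields error and warn, and a value of 4 yields error, warn, system, and info, matching the examples in Table 7.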

8.2. Writing Reporters

Reporters can be written in any language as long as they output XML according to the schema described in Section 8.2.1. New reporter developers may choose to write reporters in Perl or Python since the Inca distribution includes sample reporters and API modules in those languages (Section 8.2.3) for printing XML according to our schema.
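As a rough illustration of writing a reporter without the Inca APIs, the following Python sketch assembles report XML using only the standard library. The tag names and their order are copied from the example output in Section 8.2.2; the function build_report and its parameters are hypothetical, the placement of errorMessage after completed is an assumption, and the result has not been validated against the schema. The caller is expected to pass already well-formed XML for the body and one (name, value) pair for every argument the reporter supports.

```python
import os
import socket
import sys
import time
from xml.sax.saxutils import escape

NAMESPACE = "http://inca.sdsc.edu/dataModel/report_2.1"


def build_report(name, version, body_xml, completed, args, error_message=None):
    """Return report XML laid out like the example in Section 8.2.2."""
    lines = [
        "<?xml version='1.0'?>",
        "<inca:report xmlns:inca='%s'>" % NAMESPACE,
        "  <gmt>%s</gmt>" % time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "  <hostname>%s</hostname>" % escape(socket.gethostname()),
        "  <name>%s</name>" % escape(name),
        "  <version>%s</version>" % escape(str(version)),
        "  <workingDir>%s</workingDir>" % escape(os.getcwd()),
        "  <reporterPath>%s</reporterPath>" % escape(sys.argv[0]),
        "  <args>",
    ]
    for arg_name, arg_value in args:
        lines += [
            "    <arg>",
            "      <name>%s</name>" % escape(arg_name),
            "      <value>%s</value>" % escape(arg_value),
            "    </arg>",
        ]
    lines += [
        "  </args>",
        # body_xml is inserted as-is, so it must already be well-formed XML.
        "  <body>%s</body>" % body_xml,
        "  <exitStatus>",
        "    <completed>%s</completed>" % ("true" if completed else "false"),
    ]
    if error_message is not None:
        # Escape <, >, and & so the message cannot break the XML.
        lines.append("    <errorMessage>%s</errorMessage>" % escape(error_message))
    lines += ["  </exitStatus>", "</inca:report>"]
    return "\n".join(lines)
```

In practice the Perl and Python API modules handle these details, which is the main reason to prefer them over hand-built XML.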

NOTE: Because databases impose a 4000 character limit on text fields, the XML portion of the report for logging/debugging and the error message must each be smaller than 4000 characters. The body XML can be 12000 characters because it is stored in three parts. If the report XML is greater than its limit, the depot truncates the oversized section from the beginning until it is the right size.

Table 8. Report Character Limits

Report XML Section      Character limit
log                     4000
error (exit) message    4000
body                    12000
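The truncation rule can be pictured with the short Python sketch below. It is not the depot's code. It assumes one reading of "from the beginning": characters are dropped from the start of an oversized section so that the end of the text survives. The names LIMITS and truncate_section are hypothetical.

```python
# Character limits from Table 8.
LIMITS = {"log": 4000, "error": 4000, "body": 12000}


def truncate_section(section, text):
    """Drop leading characters until text fits the limit for its section."""
    limit = LIMITS[section]
    if len(text) <= limit:
        return text
    # Assumption: the oldest (leading) characters are discarded first.
    return text[len(text) - limit:]
```

Under this reading, a 4005-character error message would lose its first five characters, so reporters that emit long messages may want to put the most important details last.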

8.2.1. Reporter XML Schema

In order to promote interoperability between reporters, we define a specification for how reporter output should be formatted. Given the wide acceptance and availability of tools for XML, the specification requires that reporter output should be formatted using XML. Furthermore, we specify a basic schema that the XML should follow so that we can handle the output in a general manner. The goal of this schema is to be flexible enough to express a wide variety of data.

Our approach is to require a number of XML fields which provide metadata about the output and define one of the fields, body, to be abstract. The body field is a placeholder for the formatted output and can be replaced by any XML substitution group thereby allowing this schema to accommodate a large variety of output. In other words, the basic schema is like an abstract class and the substitution groups provide for subclassing.

The reporter schema is visualized in Figure 23.

Figure 23. Inca Reporter Schema

8.2.2. Reporter XML Output

Here is the output from the successful run of a typical Inca reporter. The content and meaning of the XML tags are described below.

Figure 24. Example of Reporter Output

<?xml version='1.0'?>
<inca:report xmlns:inca='http://inca.sdsc.edu/dataModel/report_2.1'>
  <gmt>2006-11-17T17:35:40Z</gmt>
  <hostname>jhayes-Computer.local</hostname>
  <name>cluster.compiler.gcc.version</name>
  <version>2</version>
  <workingDir>/Users/jhayes/Inca/subversion/inca/trunk/devel/reporters/bin</workingDir>
  <reporterPath>cluster.compiler.gcc.version</reporterPath>
  <args>
    <arg>
      <name>help</name>
      <value>no</value>
    </arg>
    <arg>
      <name>log</name>
      <value>0</value>
    </arg>
    <arg>
      <name>verbose</name>
      <value>1</value>
    </arg>
    <arg>
      <name>version</name>
      <value>no</value>
    </arg>
  </args>
  <body>
    <package>
      <ID>gcc</ID>
      <version>3.3</version>
    </package>
  </body>
  <exitStatus>
    <completed>true</completed>
  </exitStatus>
</inca:report>

As shown in Figure 24, reporter output begins with an XML preamble and is enclosed in <report> tags. The tag name may also carry a prefix (inca: in Figure 24) that references http://inca.sdsc.edu/dataModel/report_2.1, the namespace that defines the report schema.

The following tags are defined within a <report>:

gmt

the time the reporter ran (ISO 8601 format)

hostname

host where reporter ran

name

reporter name

version

reporter version number

workingDir

directory where reporter execution begins

reporterPath

local path to reporter file

args

args must contain an arg name/value entry for every argument the reporter supports, including those for which the reporter supplies a default value (help, log, verbose, version)

log

OPTIONAL TAG (not shown in Figure 24 report). Log entries produced by the reporter. This tag contains one or more <debug>, <error>, <info>, <system>, and/or <warn> tags, each of which gives the text of the message and the time it was produced. Here is a typical example of a log section:

  <log>
    <system>
      <gmt>2006-11-17T18:28:10Z</gmt>
      <message>grid-proxy-info 2&gt;&amp;1</message>
    </system>
    <debug>
      <gmt>2006-11-17T18:28:10Z</gmt>
      <message>Checking for grid proxy: Result of command "grid-proxy-info":
         sh: line 1: grid-proxy-info: command not found
      </message>
    </debug>
    <error>
      <gmt>2006-11-17T18:28:10Z</gmt>
      <message>ERROR: Valid proxy needed for file transfer.</message>
    </error>
  </log>

body

The body tag contains the results of the reporter testing. The only requirement for the contents of this tag is that they must be well-formed XML--tags balanced and no extraneous <, >, and & characters. Figure 24 shows the conventional body for version reporters.

exitStatus

Includes the boolean <completed> tag, indicating whether or not the reporter successfully completed its testing, and the optional <errorMessage> tag, which contains a string indicating why the reporter failed to complete.

help

OPTIONAL TAG (not shown in Figure 24 report). The help tag describes the reporter and how to run it. Contents include the reporter name, version, description, and url, detailed descriptions of each argument, and an optional list of dependencies that the reporter has on other packages. For example, here is the <help> section for the gcc version reporter.

  <help>
    <ID>help</ID>
    <name>cluster.compiler.gcc.version</name>
    <version>2</version>
    <description>Reports the version of gcc</description>
    <url>http://gcc.gnu.org</url>
    <argDescription>
      <ID>help</ID>
      <accepted>no|yes</accepted>
      <description>display usage information (no|yes)</description>
      <default>no</default>
    </argDescription>
    <argDescription>
      <ID>log</ID>
      <accepted>[012345]|debug|error|info|system|warn</accepted>
      <description>log message types included in report</description>
      <default>0</default>
    </argDescription>
    <argDescription>
      <ID>verbose</ID>
      <accepted>[012]</accepted>
      <description>verbosity level (0|1|2)</description>
      <default>1</default>
    </argDescription>
    <argDescription>
      <ID>version</ID>
      <accepted>no|yes</accepted>
      <description>show reporter version (no|yes)</description>
      <default>no</default>
    </argDescription>
    <dependency>
      <ID>Inca::Reporter</ID>
    </dependency>
    <dependency>
      <ID>Inca::Reporter::Version</ID>
    </dependency>
  </help>
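Because reporter output is plain XML, any XML parser can read the tags described in this section. The following Python sketch uses the standard xml.etree.ElementTree module to pull a few fields out of a report shaped like Figure 24. The function name summarize_report is hypothetical and the sketch is not part of the Inca distribution.

```python
import xml.etree.ElementTree as ET


def summarize_report(report_xml):
    """Extract a few fields from reporter XML laid out as in Figure 24."""
    # The XML preamble must be the first thing in the string, so strip
    # any surrounding whitespace before parsing.
    root = ET.fromstring(report_xml.strip())
    args = {}
    for arg in root.findall("args/arg"):
        args[arg.findtext("name")] = arg.findtext("value")
    return {
        "name": root.findtext("name"),
        "version": root.findtext("version"),
        "gmt": root.findtext("gmt"),
        "args": args,
        # <completed> holds the text "true" or "false".
        "completed": root.findtext("exitStatus/completed") == "true",
        # None when the optional <errorMessage> tag is absent.
        "error": root.findtext("exitStatus/errorMessage"),
    }
```

Only the enclosing report tag carries the namespace prefix in Figure 24, so the child tags can be looked up by their plain names.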

8.2.3. Reporter APIs

The Inca release includes a set of Perl modules and a Python package that make it easier to develop reporters that produce output as shown in Section 8.2.2 and conform to the schema described in Section 8.2.1. The following is a list of the modules and their purposes. Where two names are given, the first is the Perl module and the second is its Python equivalent:

  1. Inca::Reporter

    inca.Reporter

    This module is the general reporter API and is the base class for all types of reporters. It automates determination of hostname, gmt, reporter name, etc., handles command-line parsing, provides an interface for log messages, and handles XML generation.

  2. Inca::Reporter::GlobusUnit

    inca.GlobusUnitReporter

    This module is used for Globus unit tests. It provides methods for running Globus jobs.

  3. Inca::Reporter::GridProxy

    The Inca::Reporter::GridProxy package is a pseudo-module indicating that a reporter requires a proxy credential in order to execute. The following is an example of a Perl reporter that requires a proxy. Python reporters should use the equivalent, reporter.addDependency('inca.GridProxyReporter').

    #!/usr/bin/env perl
    
    use strict;
    use warnings;
    use Inca::Reporter::SimpleUnit;
    
    my $reporter = new Inca::Reporter::SimpleUnit(
      name => 'grid.middleware.globus.unit.proxy',
      version => 2,
      description => 'Verifies that user has valid proxy',
      url => 'http://www.globus.org/security/proxy.html',
      unit_name => 'validproxy'
    );
    $reporter->addDependency( "Inca::Reporter::GridProxy" );
    $reporter->processArgv(@ARGV);
    
    # check to see if proxy has enough time left
    $reporter->log( 'info', "X509_USER_PROXY=$ENV{X509_USER_PROXY}" );
    my $output = $reporter->loggedCommand('grid-proxy-info -exists -hours 4 2>&1');
    if( $? != 0 ) {
      $reporter->unitFailure("grid-proxy-info failed: $! $output");
    } else {
      $reporter->unitSuccess();
    }
    $reporter->print();

  4. Inca::Reporter::Performance

    inca.PerformanceReporter

    This module is used to gather system performance metrics. It defines a common <body> schema for system/software performance metric reporters and produces a collection of benchmarks, each a set of parameters (name/value) and statistics (name/value/units). A dependent Benchmark class is used to define individual benchmarks.

  5. Inca::Reporter::SimpleUnit

    inca.SimpleUnitReporter

    This module is used for software unit tests. It defines a common <body> schema for unit test reporters and provides methods for recording results of unit tests.

  6. Inca::Reporter::Usage

    inca.UsageReporter

    This module is used for creating simple usage reports.

  7. Inca::Reporter::Version

    inca.VersionReporter

    This module is used for reporting software versions. It defines a common <body> schema for version reporters, offers support for subpackage versions, and provides convenience methods for common ways of determining version.

The following is the Perl code for a reporter that produces output like Figure 24. This reporter uses the Inca::Reporter::Version module to determine the version of gcc. Examples of reporters that use the other modules are located in $INCA_DIST/Inca-Reporter-*/bin.

#!/usr/bin/env perl

use strict;
use warnings;
use Inca::Reporter::Version;

my $reporter = new Inca::Reporter::Version(
  name => 'cluster.compiler.gcc.version',
  version => 2,
  description => 'Reports the version of gcc',
  url => 'http://gcc.gnu.org',
  package_name => 'gcc'
);
$reporter->processArgv(@ARGV);

$reporter->setVersionByExecutable('gcc -dumpversion');
$reporter->print();

8.2.4. Reporter Writing Tips

In general, using the reporter APIs described in Section 8.2.3 will help to produce the most efficient reporter code. There are additional considerations when writing reporters that:

  • create temporary files and directories

  • cd to directories with variable names

  • include variable information like timestamps or PIDs in the exit error message

For reporters that create temporary files or directories, the APIs offer a function called "tempFile" that registers them for removal. If the tempFile function is used, then additional code to remove temporary files or directories (e.g., unlink or rm) is not required. The reporter below uses the tempFile function to remove the temporary $scratchDir it creates.

It's best practice never to use a PID or other variable information as a reporter argument value, and never to include PIDs or timestamps in error messages. Reporters that incorporate this sort of information will create a new report in the Inca database each time the reporter runs, which may slow query response. If a reporter error message may contain variable information, write a function that replaces the variable parts with fixed text to normalize the error (like the "failClean" function in the reporter below).

#!/usr/bin/env perl

use strict;
use warnings;
use Inca::Reporter::SimpleUnit;
use Date::Parse;
use Cwd;

my $reporter = new Inca::Reporter::SimpleUnit(
  name => 'security.ca.unit',
  version => 9,
  description => 'Checks whether the CA certificates or CRLs have expired',
  unit_name => 'caCertNCrlExpire'
);

 ...

my $scratchDir = "/tmp/security.ca.unit.$$";
if ( ! mkdir($scratchDir) ) {
  failClean("Cannot mkdir scratch dir $scratchDir"); 
}
$reporter->tempFile( $scratchDir );
if ( ! chdir($scratchDir) ) {
  failClean("Cannot change to scratch dir $scratchDir"); 
}

 ...

if ($err ne ""){
  failClean($err);
} else {
  $reporter->unitSuccess();
  $reporter->print();
}

sub failClean {
  my $err = shift;
  $err =~ s/--\d{2}:\d{2}:\d{2}--/--xx:xx:xx--/g;
  $err =~ s/$$/PID/g;
  $reporter->failPrintAndExit($err);
}
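For reporters written in Python, the same normalization can be sketched with the standard re module. This mirrors the two substitutions in failClean above and is only an illustration; the function name normalize_error is hypothetical, and a real reporter would pass the result to its own failure-reporting call.

```python
import os
import re


def normalize_error(message, pid=None):
    """Replace run-specific details in an error message with fixed text."""
    if pid is None:
        pid = os.getpid()
    # Timestamps of the form --HH:MM:SS-- become a constant placeholder.
    message = re.sub(r"--\d{2}:\d{2}:\d{2}--", "--xx:xx:xx--", message)
    # Every occurrence of the process ID becomes the literal text PID.
    return message.replace(str(pid), "PID")
```

The timestamp is replaced first, as in the Perl version, so that a short PID cannot alter the digits of a timestamp before the pattern has a chance to match it.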

8.3. Reporter Repositories

The Inca system retrieves reporters from external collections called repositories. A reporter repository is simply a file directory, accessed via a file: or http: URL, that contains a catalog file named Packages.gz. This gzipped file includes a sequence of name:value attribute pairs for every reporter and support package in the repository; blank lines separate the attributes for different reporters. For example, here is a portion of the Packages.gz file for the Inca standard reporter repository.

arguments: help no|yes no;log [012345]|debug|error|info|system|warn 0;verbose [012] 1;version no|yes no
dependencies: Inca::Reporter;Inca::Reporter::Version
description: Reports the version of tgusage
file: cluster.accounting.tgusage.version
name: cluster.accounting.tgusage.version
url: http://www.teragrid.org
version: 2

arguments: help no|yes no;log [012345]|debug|error|info|system|warn 0;verbose [012] 1;version no|yes no
dependencies: Inca::Reporter;Inca::Reporter::SimpleUnit
description: ant hello world test
file: cluster.admin.ant.unit
name: cluster.admin.ant.unit
version: 3

arguments: help no|yes no;log [012345]|debug|error|info|system|warn 0;verbose [012] 1;version no|yes no
dependencies: Inca::Reporter;Inca::Reporter::Version
description: Reports the version of Apache Ant
file: cluster.admin.ant.version
name: cluster.admin.ant.version
version: 2

Of the attributes shown, only file and name are required. The file attribute gives the relative path to the reporter file, and the name attribute specifies the unique package name of the reporter. If the reporter requires support packages to execute, it should include a dependencies attribute with a semicolon-separated list of package names. For more information about reporter package dependencies see Section 8.3.1. The incat administration tool uses the Packages.gz file's arguments and description attributes as part of its series edit dialog. The value of the arguments attribute is a semicolon-separated list giving the name, value pattern, and default value, if any, for each supported command-line argument.
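The catalog format described above is simple enough to read with a few lines of Python. The sketch below is an illustration, not part of Inca. It assumes that each attribute occupies a single line of the form name: value, that blank lines separate packages, and that value patterns in the arguments attribute contain no spaces. The function names are hypothetical.

```python
import gzip


def parse_catalog(text):
    """Split catalog text into one attribute dict per package."""
    packages, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the attributes of the current package.
            if current:
                packages.append(current)
                current = {}
            continue
        # Split on the first colon only; values such as URLs contain colons.
        name, _, value = line.partition(":")
        current[name.strip()] = value.strip()
    if current:
        packages.append(current)
    return packages


def parse_arguments(value):
    """Split an arguments attribute into (name, pattern, default) tuples."""
    result = []
    for entry in value.split(";"):
        fields = entry.split()
        if not fields:
            continue
        pattern = fields[1] if len(fields) > 1 else None
        default = fields[2] if len(fields) > 2 else None
        result.append((fields[0], pattern, default))
    return result


def read_catalog(path):
    """Read and parse a local Packages.gz file."""
    with gzip.open(path, "rt") as handle:
        return parse_catalog(handle.read())
```

For example, calling read_catalog on a local Packages.gz returns a list of dictionaries, and parse_arguments applied to one entry's arguments value gives the name, accepted pattern, and default for each command-line argument.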

To create a local repository for your own reporters, you only need to collect them into a directory and create a Packages.gz in that directory. The default Inca installation has a Packages.gz file in $INCA_DIST/Inca-Reporter-* that can be added in incat. Inca also supplies a web accessible repository that can be added in incat as "http://inca.sdsc.edu/repository/latest/".

The Inca distribution includes a Perl script, incpack, that can create Packages.gz for you. Simply run incpack with a list of reporters that you want to include in Packages.gz, e.g.,

% perl incpack jade.version f77.unit vim.version

incpack runs each of the listed reporters with --help=yes --verbose=1 to extract a standard set of attributes. If your reporters use the Inca reporter APIs, you might need to run incpack with -I switches to specify the location of the Inca libraries, like this:

% perl incpack -I ${INCA_DIST}/lib/perl -I ${INCA_DIST}/lib/python jade.version f77.unit vim.version


8.3.1. Reporter Package Dependencies

Some reporters may require a CPAN Perl module, C library, compiled executable, or some other tar.gz packaged dependency. Reporters can use packaged dependencies if the dependencies are 1) bundled into a tar.gz file, 2) added using incpack to the reporter repository, and 3) noted as a dependency in the reporters.

For example, the cluster.math.blas.unit.level1 reporter wraps the Level 1 BLAS Test Suite available from the Basic Linear Algebra Subprograms (BLAS) website. To add the BLAS Test Suite as a dependency to the cluster.math.blas.unit.level1 reporter, use the following steps:

  1. Package the Level 1 BLAS Test Suite files (Fortran code) into a tar.gz called blasTestSuite.tar.gz along with a Makefile and configure script. The blasTestSuite.tar.gz file contains:

    Makefile.in     cblat2d         configure       dblat2.f        dblat3d         sblat2d         zblat1.f        zblat3.f
    cblat1.f        cblat3.f        configure.in    dblat2d         sblat1.f        sblat3.f        zblat2.f        zblat3d
    cblat2.f        cblat3d         dblat1.f        dblat3.f        sblat2.f        sblat3d         zblat2d

  2. Update the reporter repository with the package dependency using an .attrib file. An .attrib file contains information about the dependency such as its name, version number, description, a descriptive url, and its own dependencies. The .attrib file must be named after the tar.gz file, with .attrib appended. For example, the BLAS Test Suite's .attrib would be named blasTestSuite.tar.gz.attrib and contain:

    name: blasTestSuite
    version: 1.0
    description: Test programs for the BLAS library
    url: http://inca.sdsc.edu
    dependencies:

    Both the blasTestSuite.tar.gz and blasTestSuite.tar.gz.attrib files are then placed in the share directory; they can also be placed anywhere else inside the repository directory. Then the dependency is added to the repository using incpack (see Section 8.3.2 for more about repository updates):

    % sbin/incpack share/blasTestSuite.tar.gz
    Note: Appending to existing Packages.gz file
    share/blasTestSuite.tar.gz

  3. Include the following line in reporters that use the blasTestSuite dependency before $reporter->processArgv is called. Use the name specified in the .attrib file:

    $reporter->addDependency('blasTestSuite');

    Then add the reporter to the repository using incpack:

    % sbin/incpack -I lib/perl -I lib/python bin/cluster.math.blas.unit.level1
    Note: Appending to existing Packages.gz file
    bin/cluster.math.blas.unit.level1

After unzipping and untarring the package file, the reporter manager builds the package in one of several ways, depending on the contents of the package directory. If a configure script is present, the reporter manager executes these commands to build the package:

% ./configure --prefix=$RM_INSTALL_DIR/var/reporter-packages
% [g]make
% [g]make install

Otherwise, if a [Mm]akefile is present, then the reporter manager executes these commands:

% [g]make INSTALL_DIR=$RM_INSTALL_DIR/var/reporter-packages
% [g]make INSTALL_DIR=$RM_INSTALL_DIR/var/reporter-packages

Otherwise, if a Makefile.PL file (i.e., Perl package) is found, then the following is executed:

% perl -I$RM_INSTALL_DIR/var/reporter-packages/lib/perl  Makefile.PL \
  PREFIX=$RM_INSTALL_DIR/var/reporter-packages/lib/perl \
  LIB=$RM_INSTALL_DIR/var/reporter-packages/lib/perl \
  INSTALLDIRS=perl \
  INSTALLSCRIPT=$RM_INSTALL_DIR/var/reporter-packages/bin \
  INSTALLMAN1DIR=$RM_INSTALL_DIR/var/reporter-packages/man/man1 \
  INSTALLMAN3DIR=$RM_INSTALL_DIR/var/reporter-packages/man/man3
% [g]make
% [g]make install

Otherwise, if a Build.PL file is found, then the following commands are executed:

% perl -I$RM_INSTALL_DIR/var/reporter-packages/lib/perl Build.PL \
  --install_path lib=$RM_INSTALL_DIR/var/reporter-packages/lib/perl \
  --install_path libdoc=$RM_INSTALL_DIR/var/reporter-packages/man/man3 \
  --install_path bindoc=$RM_INSTALL_DIR/var/reporter-packages/man/man1 \
  --install_path bin=$RM_INSTALL_DIR/var/reporter-packages/bin \
  --install_path script=$RM_INSTALL_DIR/var/reporter-packages/bin \
  --install_path arch=$RM_INSTALL_DIR/var/reporter-packages/lib/perl
% perl -I$RM_INSTALL_DIR/var/reporter-packages/lib/perl Build
% perl -I$RM_INSTALL_DIR/var/reporter-packages/lib/perl Build install

If none of the files listed above are present, the reporter manager assumes that no build step is needed for the package.
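The order in which the reporter manager checks for build files can be summarized in a short Python sketch. It restates the rules above and is not the reporter manager's actual code. The function name choose_build and the labels it returns are hypothetical.

```python
def choose_build(filenames):
    """Return which build method applies to a package directory listing."""
    names = set(filenames)
    # The checks run in the order given in the text; the first match wins.
    if "configure" in names:
        return "configure"
    if "Makefile" in names or "makefile" in names:
        return "make"
    if "Makefile.PL" in names:
        return "Makefile.PL"
    if "Build.PL" in names:
        return "Build.PL"
    # No recognized build file: the package needs no build step.
    return None
```

For the blasTestSuite package shown earlier, which ships a configure script alongside Makefile.in, this selection yields the configure method.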

The reporter manager sets the INSTALL_DIR environment variable before running a reporter. Reporters that depend on other packages can use this variable to locate the package files--libraries in $INSTALL_DIR/lib, binaries in $INSTALL_DIR/bin, etc.
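A Python reporter could locate its package files from this variable as in the following sketch. It uses only the standard os module; the function name package_dirs is hypothetical, and returning None when INSTALL_DIR is unset (for example, when the reporter is run by hand from the command line) is a design choice of this sketch rather than Inca behavior.

```python
import os


def package_dirs(environ=None):
    """Return the lib and bin directories under INSTALL_DIR, if it is set."""
    environ = os.environ if environ is None else environ
    install_dir = environ.get("INSTALL_DIR")
    if install_dir is None:
        # Not running under a reporter manager, or no packages installed.
        return None
    return {
        "lib": os.path.join(install_dir, "lib"),
        "bin": os.path.join(install_dir, "bin"),
    }
```

Accepting the environment as a parameter keeps the function easy to exercise outside a reporter manager.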

8.3.2. Updating Reporter Repositories

The Inca agent will detect changes to your reporter repository and automatically send changes to the appropriate reporter managers if you:

  1. update the reporter version number (i.e., change a line like "version => 1" to "version => 2" in the body of the reporter)

  2. make sure the reporter permissions are set so the agent can fetch the reporter (755 is the standard reporter permission)

  3. update your Packages.gz file using incpack. The command will be something like:

    % cd $INCA_DIST/Inca-Reporter-*; ./sbin/incpack -I lib bin/<reportername>

  4. wait for the agent to deploy the new reporter automatically (it looks for new reporters every four hours by default),

    *OR*

    restart the agent,

    *OR*

    connect to the agent in incat, select the Repositories tab, then press the Refresh button under the repository panel.

If the revised reporter still isn't deployed, look for errors in $INCA_DIST/var/agent.log indicating that the agent was unable to fetch the reporter or skipped updating it. In incat, make sure there is an active series that uses the reporter, with "use latest version" checked, on the resource you intend it to run on. Also look in $INCA_DIST/var/repository/repository.xml for entries for the reporter with "<latestVersion>false</latestVersion>"; the value should be "<latestVersion>true</latestVersion>" for updated reporters to be picked up.