PMDF Statistics
4.5 Specifying the log file
You may specify a list of files as input, separating the file names by 
commas. Wildcard specifications may also be used. PMDF-Stats will open 
each file sequentially, and list the files as they are processed, 
producing a combined report. Take care not to cause a file to be
processed twice. For example, if you have a file JUNE.LOG, then the
command

    $ Pstat *.LOG,JUNE.* /report=MY-REPORT.TXT

will process that file twice. PMDF-Stats will list the name of each log
file as it processes it.
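The double-processing pitfall can be checked before running a report. The following is a minimal Python sketch, not part of PMDF-Stats: the helper name expand_specs and the sample file names are hypothetical, and it assumes that "*" matches any run of characters and "%" a single character, as in VMS wildcards.

```python
from fnmatch import fnmatchcase

def expand_specs(specs, available):
    """Expand a comma-separated list of wildcard specs against known file names.

    Returns (files, duplicates): files in first-seen order, plus any name
    that a later spec matched again.
    """
    files, duplicates = [], []
    for spec in specs.split(","):
        # fnmatch uses "?" for a single character; "%" is assumed to mean the same.
        pattern = spec.strip().upper().replace("%", "?")
        for name in available:
            if fnmatchcase(name.upper(), pattern):
                if name in files:
                    duplicates.append(name)
                else:
                    files.append(name)
    return files, duplicates

# Hypothetical directory contents, for illustration only.
available = ["JUNE.LOG", "JULY.LOG", "JUNE.TXT"]
files, duplicates = expand_specs("*.LOG,JUNE.*", available)
print(files)       # ['JUNE.LOG', 'JULY.LOG', 'JUNE.TXT']
print(duplicates)  # ['JUNE.LOG']
```

Any name reported in duplicates would be read twice by a command using the same file list.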
4.6 Specifying alternate group files
You can specify an alternate group file using the /GROUP=file 
qualifier. By default, PMDF-Stats will use the logical PSTAT_GROUP_FILE
if the logical is defined, and the file
PMDF_ROOT:[STATS]DOMAIN_GROUPS.TXT if it is not. For example, you
might wish to maintain different group files for analyzing internal and 
external traffic.
4.7 Local Channel addresses
PMDF will not by default write full domain addresses for messages 
corresponding to the local channel, but only logs the username in the D 
record. This can cause matching problems for PMDF-Stats. There are 
three ways to solve this.
4.7.1 The PMDF LOG_LOCAL option
You can override PMDF's default behavior by ensuring that the LOG_LOCAL 
option in the PMDF_OPTION_FILE is set to 1.
Note that changing this option has no effect on MAIL.LOG files that
have already been produced by PMDF.
4.7.2 Using /LOCAL qualifier
The /LOCAL qualifier specifies a domain name which will be appended to 
any domainless address associated with the local channel. You may 
either specify an explicit domain address (e.g. /LOCAL=vms.eurokom.ie) 
or omit the value, in which case PMDF-Stats will use the local channel 
name of the current PMDF system. For this default to work, PMDF
must be installed on your system, and all the logicals defined, as 
PMDF-Stats must map the PMDF shareable image file to get the local 
channel name.
4.7.3 Using $LOCAL directive
You can alternatively include the domain name $LOCAL in the Group file 
for one of the groups. This special domain will match any domainless 
address, and associate it with the group for which it is defined. Thus
you would place it in the group in which you would normally expect the
local domain to be found.
4.7.4 Differences
Using the first alternative ensures that the problem never arises. The 
main difference between the other two approaches is seen when log file 
processing is being performed. Using /LOCAL in conjunction with /OUTPUT 
will cause PMDF-Stats to append the domain name in output MAIL.LOG 
files. This can be useful for standardizing MAIL.LOG files for 
centralized processing.
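The effect of /LOCAL on a domainless address can be pictured with a short Python sketch. This illustrates the idea only and is not the PMDF-Stats implementation: the function name qualify and the sample usernames are hypothetical, and the presence of "@" is assumed to be what distinguishes a full address from a bare username.

```python
def qualify(address, local_domain):
    """Append local_domain to an address that has no domain part."""
    if "@" in address:
        return address  # already a full domain address, leave unchanged
    return f"{address}@{local_domain}"

# Hypothetical usernames; the domain is the one used in the /LOCAL example.
print(qualify("smith", "vms.eurokom.ie"))               # smith@vms.eurokom.ie
print(qualify("jones@ccvax.ucd.ie", "vms.eurokom.ie"))  # jones@ccvax.ucd.ie
```

With $LOCAL, by contrast, the address is left as it is and only the group lookup treats the missing domain specially.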
Chapter 5
Filtering and processing MAIL.LOG files
5.1 Filtering
You can instruct PMDF-Stats to consider a subset of the MAIL.LOG 
records in the input file(s). This is done by using one or more of the 
selection qualifiers in the command to activate the program. PMDF-Stats 
allows you to select records which match any of the fields in the 
MAIL.LOG record. If more than one such qualifier is used, only records 
which satisfy all the criteria are considered when compiling the report.
5.2 Partial matching
Most of the qualifiers allow partial matching using the "%" and "*" 
wildcard characters. Thus /INCOMING=BIT_* will match any record with an 
incoming channel starting with "bit_".
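Partial matching of this kind can be sketched in Python by translating the pattern into a regular expression. This is an illustrative sketch rather than the matching code used by PMDF-Stats. It assumes that "*" matches any run of characters (including none), that "%" matches exactly one character, and that matching ignores case, as the BIT_* example suggests. The function names and the sample channel names are hypothetical.

```python
import re

def wildcard_to_regex(pattern):
    """Translate a pattern using * and % into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any run of characters, possibly empty
        elif ch == "%":
            parts.append(".")    # exactly one character (assumption)
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)

def matches(pattern, value):
    """True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None

print(matches("BIT_*", "bit_gateway"))  # True
print(matches("BIT_*", "tcp_local"))    # False
print(matches("BIT_%", "bit_xy"))       # False, % stands for one character only
```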
5.3 Processing Log files
PMDF-Stats also provides three additional qualifiers for generating 
MAIL.LOG format files. These are /OUTPUT, /REJECT and /ILLEGAL, and all 
take a single filename as a value. If /OUTPUT is specified, any records 
which pass the selection criteria are output to the file specified. If 
/REJECT is specified, any records which do not match the selection 
criteria are output to the file specified. If /ILLEGAL is specified, 
any records which fail the parsing are output to the specified file. 
This also provides a useful way to investigate any anomalies which
might appear in traffic reports. For example, suppose the report
indicates a high amount of traffic from group INTNET to group SALES.
You can then extract the relevant MAIL.LOG records using

    $ Pstat MAIL.LOG /from=INTNET /to=SALES /output=SALES.LOG

which then allows you to look at the actual records leading to the
anomalous figure, or subject them to further analysis using a different
group file.
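The way records are divided between the /OUTPUT, /REJECT and /ILLEGAL files can be summarised with a small Python sketch. It is a model of the behaviour described above, not PMDF-Stats code. The callables parse and selected are hypothetical stand-ins for the real parser and selection qualifiers, and the two-field lines in the demonstration are a toy layout, not the MAIL.LOG format.

```python
def route(lines, parse, selected):
    """Split lines into (output, reject, illegal) lists.

    parse(line) returns a record, or None if the line cannot be parsed.
    selected(record) returns True if the record passes the selection criteria.
    """
    output, reject, illegal = [], [], []
    for line in lines:
        record = parse(line)
        if record is None:
            illegal.append(line)   # would go to the /ILLEGAL file
        elif selected(record):
            output.append(line)    # would go to the /OUTPUT file
        else:
            reject.append(line)    # would go to the /REJECT file
    return output, reject, illegal

# Toy data: "from-group to-group" pairs, for illustration only.
lines = ["INTNET SALES", "INTNET ADMIN", "garbage"]

def parse(line):
    fields = line.split()
    return tuple(fields) if len(fields) == 2 else None

def selected(record):
    return record == ("INTNET", "SALES")

output, reject, illegal = route(lines, parse, selected)
print(output, reject, illegal)
```

Every input line ends up in exactly one of the three lists, so the three files together account for the whole input.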
5.4 Illegal Records
PMDF-Stats assumes that all records in the MAIL.LOG file contain the 
following fields:
  -  Date and Time.
  -  Source channel.
  -  Destination channel (may be blank).
  -  Code (up to three characters, of which only the first is
     considered).
Any record which does not conform to this syntax is treated as illegal,
and will not contribute to the statistics. If the /ILLEGAL qualifier is
specified, the record is copied to the illegal records file; otherwise
it is discarded. A warning message will be issued for each illegal
record encountered.
5.4.1 Nonstandard Records
If the record has a code starting with "E" or "D" and contains the 
following fields after the Code field, then it is considered a 
"standard" record:
  -  Message volume.
  -  Source address (may be blank).
  -  Destination address.
If not, the record is considered nonstandard. Nonstandard
records do not contribute to the statistics, but may be filtered using
channel, date or code specifiers. They are not considered illegal
records. Several channels such as PMDF-Fax and MAILSERV generate such
records.
5.5 Filtering on localpart (username)
PMDF-Stats will only consider the domain part of an address in 
allocating addresses to groups, but you can use any part of the address 
in the filtering qualifiers, e.g. /SENDER=H235_*@ccvax.ucd.ie will 
match any username starting with "H235_" on the machine "ccvax.ucd.ie".
5.6 Filtering Qualifiers
The following describes the various qualifiers that can be used to 
filter records.
5.6.1 Enqueued or Dequeued
Normally PMDF-Stats only considers dequeued records (i.e. those with a
code field of "D"). This is because:
  -  Most mail entries create two records, an "E" record when the
     message is placed on the channel, and a "D" record when it is
     processed. You don't want to count a mail message twice.
  -  The "D" record indicates the message was processed. Messages may
     generate "E" records and subsequently time out.
  -  Some channels (e.g. Fax) generate records which give channel
     specific information (e.g. telephone number and call duration),
     rather than standard domain pairs.
In some cases, you might wish to use "E" records instead. For example,
if you wish to examine traffic passing between two specific PMDF
channels (as opposed to using sender and recipient addresses), then you
need to use "E" records, as only these will have both incoming and
outgoing channels specified. You can also select other codes used by
channels such as MAILSERV or Fax. Use the /CODE qualifier to select
records matching a specific code; the default is /CODE=D. You may also
use /CODE=ANY to allow all code types to be considered.
5.6.2 Selecting on Date
You may use the /BEFORE and /SINCE qualifiers to select according to 
the date and time of the message. These qualifiers restrict 
consideration to records whose date fields are earlier or later 
respectively than the date specified. Specify the date and time in VMS 
format, or use keywords such as TODAY or YESTERDAY. You may not use
wildcards with these qualifiers.
5.6.3 Channel specifiers
You may use the /INCOMING_CHANNEL and /OUTGOING_CHANNEL qualifiers to 
select records according to the PMDF channels. Wildcard characters may 
be used with either of these qualifiers. Note that the incoming channel 
is the first channel in the log line, and the outgoing channel is the 
second. Since D records only contain one channel name, you should use 
the /INCOMING_CHANNEL qualifier to select D records according to 
channel.
5.6.4 Selecting on addresses
You may use the /SENDER, /RECEIVER or /ADDRESS qualifiers to select 
records based on the envelope addresses in the log record. Wildcards 
may be used in the qualifier values. The /SENDER and /RECEIVER
qualifiers refer to the envelope from and to addresses respectively.
The /ADDRESS qualifier is used when you want to match an address in
either envelope field (note that you cannot specify /ADDRESS together
with either of the other two). If you do not specify a localpart,
any localpart will be matched (an implicit "*@" is prepended to such
values). Do not confuse the /SENDER and /RECEIVER qualifiers with the
/FROM and /TO qualifiers, which refer to groups.
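The address selection rules can be modelled in Python as follows. This is a sketch under stated assumptions, not the PMDF-Stats implementation: all function names are hypothetical, "%" is assumed to match a single character, matching is assumed to ignore case, and a value is treated as lacking a localpart when it contains no "@".

```python
import re

def wild_match(pattern, value):
    """Match value against a pattern where * is any run and % is one character."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "%" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, re.IGNORECASE) is not None

def with_localpart(pattern):
    """Prepend the implicit "*@" when the value names a domain only."""
    return pattern if "@" in pattern else "*@" + pattern

def address_selected(sender, receiver, sender_pat=None, receiver_pat=None,
                     address_pat=None):
    """Apply /SENDER, /RECEIVER or /ADDRESS style selection to one record."""
    if address_pat is not None and (sender_pat is not None or receiver_pat is not None):
        raise ValueError("/ADDRESS cannot be combined with /SENDER or /RECEIVER")
    if address_pat is not None:
        # /ADDRESS matches if either envelope address matches.
        pat = with_localpart(address_pat)
        return wild_match(pat, sender) or wild_match(pat, receiver)
    if sender_pat is not None and not wild_match(with_localpart(sender_pat), sender):
        return False
    if receiver_pat is not None and not wild_match(with_localpart(receiver_pat), receiver):
        return False
    return True

# Hypothetical addresses, for illustration only.
print(with_localpart("ccvax.ucd.ie"))  # *@ccvax.ucd.ie
print(address_selected("H235_01@ccvax.ucd.ie", "smith@vms.eurokom.ie",
                       sender_pat="H235_*@ccvax.ucd.ie"))  # True
```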
5.6.5 Selecting on message size
You may use the /MIN_VOLUME and /MAX_VOLUME qualifiers to select 
records based on the block size of the message. You should ensure that 
the blocksize you use agrees with the blocksize in effect when the log 
file was created. PMDF-Stats uses the /BLOCK_SIZE qualifier to set 
this, whereas PMDF uses the BLOCK_SIZE option in the file 
PMDF_OPTION_FILE. Both default to 1024. Records are selected according
to their size relative to the specified limit. You may not use wildcard
characters with these qualifiers.
5.6.6 Selecting on group
You may use the /FROM and /TO qualifiers to select records according to 
the group they are assigned to using the rules in the Domain Group file 
(note that Short Group Names are used). Do not confuse these qualifiers 
with the /SENDER and /RECEIVER qualifiers which refer to the actual 
addresses of the sender and recipient. You may specify wildcard 
characters with these qualifiers, but an error message will be issued 
if the wildcard does not match any of the defined groups.
5.7 Ignoring a Channel
When processing statistics, it is desirable to ignore the records
associated with intermediate channels such as "directory", "conversion"
and "name_router". Failure to do this will result in the same messages
being counted more than once, as each message causes multiple "D"
records to be produced by PMDF. You can use the /SKIP_CHANNEL qualifier
to specify one or more channels that are to be skipped as intermediate
channels. Such records will not contribute to the statistics, apart
from the message transit times (the transit time will be calculated as
the difference in time between the first "E" record for any channel and
the first "D" record for any channel not listed in the /SKIP_CHANNEL
list).
Chapter 6
Producing a Summary Report
6.1 Purpose of a Summary Report
A summary report will display the total number of messages (and the 
total volume of messages) going into and out of each group. It does not 
give the same amount of detail as the matrix report, but it gives a 
readable summary of group totals. It also has the option to display per 
mailbox statistics.
6.2 Header Information
The following information is provided at the top of a Summary Report:
  -  Title (specified by the /TITLE qualifier).
  -  Dates and times of the earliest and latest record processed.
  -  Header titles for the fields described below.
6.3 Information on each group
The rest of the summary report consists of a one line record for each 
group. The information provided for each group is as follows:
  -  Short name of the group.
  -  Total number of messages sent from addresses in this group.
  -  Total number of messages received by addresses in this group.
  -  Overall message total (total of previous two fields).
  -  Volume of messages sent from addresses in this group in Megabytes.
  -  Volume of messages received by addresses in this group in
     Megabytes.
  -  Overall volume total (total of previous two fields).
  -  Number of unique localparts sending mail but not receiving mail.
  -  Number of unique localparts receiving mail but not sending mail.
  -  Number of unique localparts that either sent mail, received mail
     or both.
6.3.1 Format
The number of messages is output in Integer format. The volume is 
output as a floating point number in units of Megabytes. Where either
number is too large for the output field, the format is changed to
exponential (or scientific) format, e.g. 1.234E05.
6.4 Mailbox statistics
PMDF-Stats will only include mailbox statistics if the /MAILBOX 
qualifier is specified (otherwise the mailbox fields above will be 
zero). Mailbox processing does add to the amount of CPU time and memory 
used by PMDF-Stats. Note that groups that do not have the 
mailbox flag set will not be analyzed for mailbox 
statistics, and the field values for those groups will be zero.
6.4.1 Users that Send and Receive
The only figure not included explicitly is the number of users that
have both sent and received mail, but this can be calculated from the
figures given:

    senders_and_receivers = senders_or_receivers - (senders_only + receivers_only)

i.e. subtract the sum of the first two numbers from the third.
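A short worked example in Python shows why the subtraction gives the right answer. The mailbox names are invented for illustration. The three reported figures are derived here from two sets so that the result can be checked against the set intersection.

```python
# Hypothetical localparts seen as senders and as receivers in one group.
senders = {"ann", "bob", "cat"}
receivers = {"bob", "cat", "dan", "eve"}

# The three mailbox figures given in the Summary Report.
senders_only = len(senders - receivers)          # sent but never received
receivers_only = len(receivers - senders)        # received but never sent
senders_or_receivers = len(senders | receivers)  # sent, received, or both

# The figure that is not reported, recovered by subtraction.
senders_and_receivers = senders_or_receivers - (senders_only + receivers_only)

print(senders_only, receivers_only, senders_or_receivers)  # 1 2 5
print(senders_and_receivers)                               # 2
```

Here "bob" and "cat" are the two mailboxes that both sent and received mail.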
6.5 Overall Total
The overall total of mail sent and received (and volumes) is included 
at the end of the Summary Report.
6.6 Site Reports
PMDF-Stats can optionally produce individual summaries for each group, 
each in a file of its own. This might be useful if you wanted to 
provide reports to each person responsible for a group as to the 
activity of that group (without disclosing other information). If you 
specify the /SITE qualifier, then PMDF-Stats, in addition to the 
Summary Report, will generate a site report for each group. Note that 
the /SITE qualifier is ignored if the /SUMMARY qualifier is not present.
6.6.1 File Names
The name of the site report is SITE-groupname.SUM, where "groupname" is 
the short name of the group. Any characters in the short name other 
than valid characters in an ODS-2 filename will be omitted.
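The file naming rule can be sketched in Python. This is an illustration only: the function name is hypothetical, the valid ODS-2 characters are assumed to be letters, digits, "$", "-" and "_", and the sketch preserves the case of the short name, which PMDF-Stats may or may not do.

```python
import string

# Characters assumed to be valid in an ODS-2 file name.
VALID_CHARS = set(string.ascii_letters + string.digits + "$-_")

def site_report_name(short_name):
    """Build SITE-groupname.SUM, omitting characters not valid in ODS-2."""
    cleaned = "".join(ch for ch in short_name if ch in VALID_CHARS)
    return f"SITE-{cleaned}.SUM"

print(site_report_name("SALES"))      # SITE-SALES.SUM
print(site_report_name("R&D/Sales"))  # SITE-RDSales.SUM
```

Since characters are omitted rather than replaced, two short names that differ only in omitted characters would map to the same file name.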
6.6.2 Summary Information
The following information is written to the site report.
6.6.2.1 Headings
The title string (specified by /TITLE) starts the report, followed by 
the dates and times of the earliest and latest record processed. This
is followed by a section delimiter (a line of dashes).
6.6.2.2 Overall Message Totals
The total number and volume of messages sent into and out of the group
are presented in the next section. Each piece of information is
presented on a separate line with introductory text.
6.6.2.3 Fax Traffic
If a Fax report was requested (/FAX is specified) then the next section 
will detail the number of pages and transmission times taken for faxes 
generated by addresses within this group to each of the Fax groups. 
This is followed by an overall fax total for the site in question. For 
more details on fax reports, see the relevant chapter.
6.6.2.4 Service Traffic
If a Service Report was requested (/SERVICE is specified) then the next 
section will detail the number and volume of messages sent by addresses 
in the site group to each mail group designated as a service group. 
This is followed by an overall service total for the site in question. 
For more details on service reports, see the relevant chapter.
6.6.2.5 Transit Statistics
If transit statistics were requested (/TRANSIT is specified) then the 
next section will show the average, minimum and maximum times taken for 
messages processed to this group. For further information on transit 
reports, see the relevant chapter.
6.6.2.6 Detailed mailbox reports
If the /MAILBOX qualifier was specified, the site report will then
contain a list of all unique mailboxes (localparts) in the group,
together with the total number and volume of messages sent and received
by each mailbox. This allows a site administrator to identify the
chief contributors to the group's usage.
Chapter 7
Other Report Types
7.1 Overview
In addition to the matrix and summary reports, PMDF-Stats can also 
produce the following report types:
  -  Service Reports.
  -  Fax Reports.
  -  Transit Statistics.
Each of these reports is written to a separate file, but the
information can also be included (on a per group basis) in the site
reports (see the chapter on Summary Reports for further details). Each
report type is described more fully below.
7.2 Service Reports
A service group is a group containing addresses of gateways that may
be used by users in many groups. Examples of these are X400, Internet
or application gateways. Since many such service gateways generate per 
message costs, it may be desirable to know the usage of these services 
on a per group basis. Of course, the matrix report will show this too, 
but often a summary format is desired. This is achieved by marking such 
service groups with the service keyword. By specifying 
the /SERVICE qualifier with a filename argument, you can have 
PMDF-Stats list the number and volume of messages sent to each service 
group on a per group basis. In addition, if the /SITE qualifier is 
specified, this information is also written to the site report for each 
group.
7.3 Fax Reports
Since faxes often involve per message charges, it is usually desirable 
to produce information on fax usage for each group for charging 
purposes. PMDF-Stats achieves this by the use of a Fax Report. When the 
/FAX qualifier is specified, PMDF-Stats will write to the filename
argument a report consisting of the following information from each
user group to each fax group:
  -  Total number of faxes sent.
  -  Total number of pages sent.
  -  Total elapsed phone time.
7.3.1 Defining Fax Groups
You define Fax groups the same way that you define user groups: within 
the Group Definition File. Unlike user groups whose elements consist of 
partial or complete domain names, fax groups consist of partial or full 
telephone numbers. You can define multiple fax groups based on 
different call charges. A fax group is indicated by the presence of the
fax flag keyword. For example:

    Notoll (Toll Free) [fax]: 1800
    Local (Local Fax) [fax]: 1, 2, 3, 4, 5, 6, 7, 8, 9
    Trunk (Long Distance) [fax]: 0
    Intnl (International) [fax]: 00

A telephone number appearing in the PMDF-Fax "S" record will be matched
against the above groups for a best match. Any starting with 00 will be
considered in the International group, any starting with 0 in the Long 
Distance group, any starting with 1800 in the Toll Free group, and 
anything else in the local group.
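The best-match rule can be sketched in Python as a longest-prefix search over the example groups above. Interpreting "best match" as "longest matching prefix" is an assumption that agrees with the outcomes just described. The function name and the sample telephone numbers are hypothetical.

```python
# The example fax groups, keyed by short name, with their number prefixes.
FAX_GROUPS = {
    "Notoll": ["1800"],
    "Local": ["1", "2", "3", "4", "5", "6", "7", "8", "9"],
    "Trunk": ["0"],
    "Intnl": ["00"],
}

def fax_group(number, groups=FAX_GROUPS):
    """Return the group whose prefix is the longest match for number."""
    best_name, best_len = None, -1
    for name, prefixes in groups.items():
        for prefix in prefixes:
            if number.startswith(prefix) and len(prefix) > best_len:
                best_name, best_len = name, len(prefix)
    return best_name

# Invented numbers, for illustration only.
print(fax_group("0035312345"))  # Intnl
print(fax_group("0211234"))     # Trunk
print(fax_group("1800555"))     # Notoll
print(fax_group("5551234"))     # Local
```

Note that 1800 also matches the Local prefix 1, and 00 also matches the Trunk prefix 0. The longer prefix is what places those numbers in the Toll Free and International groups.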
Note: The Fax Report facility processes records in the MAIL.LOG
produced by the PMDF-Fax layered product. This processing does not
apply to any other third party fax gateway.
7.4 Transit Reports
Transit Reports give an idea of how long it is taking to process
messages for each group. This is done by taking the difference between
the time stamps of the "E" and "D" records for messages to each group.
This effectively measures the performance of the channel that finally
delivers the message, so such statistics are only meaningful on a
destination group basis.
7.4.1 Producing a Transit Report
To produce a transit report, use the /TRANSIT qualifier with a filename 
argument. PMDF-Stats will produce transit statistics for each group 
into that file. The transit statistics consist of the shortest, longest 
and average time difference between corresponding "E" and "D" records. 
In addition, if site reports are also requested, this information for 
each group is written to the corresponding site report.
Note: The Transit Statistics code uses the Message-ID of each message
to match up "E" and "D" records. This item is not included in the
MAIL.LOG record by default. If you wish to analyze transit times, you
must set the LOG_MESSAGE_ID option in the PMDF_OPTION_FILE to 1. Note
that there is no way to process records for transit statistics that
were created prior to your setting this option.
7.4.2 Using the /SKIP qualifier
If you have messages that are crossing an intermediate channel (e.g. 
the conversion channel for virus filtering), you might want the transit 
statistics to include the time taken during this intermediate step, as 
well as the final channel. You should therefore include such
intermediate channels in the Skip List (specify the channels as an
argument to the /SKIP_CHANNEL command qualifier). You should also
ensure that the logging keyword is included on the intermediate channel
as well as the final channel. The transit time for a given message is
the delay between the "E" record written when it is enqueued to the
intermediate channel and the "D" record of the final channel (the "D"
record from the intermediate channel and the "E" record passing the
message between the intermediate channel and the final channel are
skipped over).
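The effect of the Skip List on transit times can be modelled with a short Python sketch. It is an illustration of the rule described above, not the PMDF-Stats code. The record layout (time in seconds, code, channel, Message-ID) is a hypothetical simplification of a MAIL.LOG record, the function names are invented, and per-group breakdown is left out.

```python
def transit_times(records, skip_channels=()):
    """Return {message_id: seconds} from the first "E" to the first counted "D".

    records are (time, code, channel, message_id) tuples. "D" records on a
    channel named in skip_channels are ignored.
    """
    skip = {name.lower() for name in skip_channels}
    first_e, transit = {}, {}
    for when, code, channel, msg_id in sorted(records):
        if code == "E":
            first_e.setdefault(msg_id, when)   # keep the earliest "E" only
        elif code == "D" and channel.lower() not in skip:
            if msg_id in first_e and msg_id not in transit:
                transit[msg_id] = when - first_e[msg_id]
    return transit

def summarise(times):
    """Return (shortest, longest, average) of a collection of times."""
    values = list(times)
    return min(values), max(values), sum(values) / len(values)

# Invented records: message m1 crosses the conversion channel first.
records = [
    (0, "E", "conversion", "m1"),
    (5, "D", "conversion", "m1"),
    (5, "E", "tcp_local", "m1"),
    (12, "D", "tcp_local", "m1"),
    (20, "E", "tcp_local", "m2"),
    (24, "D", "tcp_local", "m2"),
]

with_skip = transit_times(records, skip_channels=["conversion"])
without_skip = transit_times(records)
print(with_skip)     # {'m1': 12, 'm2': 4}
print(without_skip)  # {'m1': 5, 'm2': 4}
print(summarise(with_skip.values()))  # (4, 12, 8.0)
```

With the conversion channel in the Skip List, the time for m1 covers both steps. Without it, the measurement stops at the intermediate "D" record.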