SCANSCI High-Level Documentation
Version 2.6.2
27 October 2000
Purpose
SCANSCI is just one small part of the 2MASS pipeline process. It is intended
to gather information compiled by other subsystems, perform quality analysis
on the nightly data acquired, and then print an HTML report summarizing
the scan-by-scan QA performed.
High-Level Organization
The purpose outlined above suggests an overall design split into three
clear, distinct parts: Input, Q/A, Report. Accordingly, the program is
organized into these three areas, and (with few exceptions) no overlap
occurs between these functional zones.
Although SCANSCI is written entirely in C, it is designed in an object-oriented
manner (to the extent that the problem domain and language choice allow).
For the most part, the code is not organized by function, but rather by
data types (grouped with the functions which operate on them). This organization
breaks down somewhat in the input and report routines (I/O is always messy!)
because the input routines are strongly influenced by the files to be read,
while the report routines are strongly influenced by the HTML format and
other concerns which may cross data boundaries (for example, links to external
plots).
The error-handling strategy is this: if a nonfatal error occurs
(such as an input file being in an unexpected format), we simply
note the error and continue operation, printing the error message in the
final report file. If a fatal error occurs (out of memory, or unable to
open the final report file), we print an error message to stderr and
abort. (Currently, however, all error messages are just printed to stdout.)
File-Level Organization
SCANSCI consists of fourteen header files (with the suffix .h) and sixteen
source files (with the suffix .c), plus this HTML documentation and a Makefile.
A project file for use with the Metrowerks CodeWarrior Integrated Development
Environment (IDE) is also available upon request. Header files are used
primarily for comments (explaining the
purposes of individual functions and data structures) and function and
data declarations. Source files are used primarily for implementing these
declarations, but occasionally also include their own declarations (for
information private to the implementation).
In particular, the main structure is the scanlist, which consists of
individual substructures (scan), each of which is in turn made up of substructures
(tracking, focus, focus_quality, etc.). Each of these substructures has
its own header and source files, typically containing three functions:
one for initializing a new structure, one for performing Q/A on it, and
one for printing the structure to stdout (essentially a debugging feature).
Functions for reading these structures from file or writing them out to
a final report tend to have overlap with one another, and consequently
are placed elsewhere.
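The three-function pattern described above might look roughly like the following. This is only a sketch: the type name focus_sketch, its fields, the qa_grade enum, and the thresholds are all assumptions made for illustration, not the actual contents of focus.h or globals.h.

```c
#include <stdio.h>

/* Hypothetical Q/A grades; the real names live in globals.h/globals.c. */
typedef enum { QA_GREEN, QA_YELLOW, QA_RED } qa_grade;

/* Hypothetical substructure: raw data plus its Q/A result. */
typedef struct {
    double   max_seetracker;  /* illustrative data field */
    qa_grade seeing_grade;    /* filled in by the Q/A function */
} focus_sketch;

/* 1. Initialize a new structure to a known state. */
void init_focus_sketch(focus_sketch *f)
{
    f->max_seetracker = 0.0;
    f->seeing_grade = QA_GREEN;
}

/* 2. Perform Q/A on the data (thresholds are illustrative only). */
void qualify_focus_sketch(focus_sketch *f)
{
    if (f->max_seetracker >= 2.0)
        f->seeing_grade = QA_RED;
    else if (f->max_seetracker >= 1.0)
        f->seeing_grade = QA_YELLOW;
    else
        f->seeing_grade = QA_GREEN;
}

/* 3. Print the structure to stdout (a debugging aid). */
void print_focus_sketch(const focus_sketch *f)
{
    printf("max_seetracker=%.1f grade=%d\n",
           f->max_seetracker, (int)f->seeing_grade);
}
```

Keeping the data and its three functions together in one header/source pair is what lets a change to one kind of data stay local to one pair of files.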
The header and source files are listed below, along with an overview
of their contents:
lines of code | filename | purpose and contents
102 | errors.h | Global error-handling facilities. Contains two custom types (my_error and error_report) for error reporting, plus function declarations.
142 | errors.c | Global error-handling facilities. Contains implementations of the functions declared in errors.h, along with custom functions for handling certain fatal errors.
76 | debug.h | Global debugging facilities. Includes support for Design-By-Contract (preconditions, etc.). All debugging tools are routed through the global error-handling facilities. Debugging checks can be turned off by defining the macro NDEBUG (either in the source, or through a compiler option such as gcc's -D flag).
77 | globals.h | Constants and macros used throughout the program.
15 | globals.c | Necessary only to give a definition for the Q/A names (green, yellow, etc.). (Could be done in any source file.)
148 | scan.h | Definition of the overarching structures holding the individual scan data (and Q/A results) and the entire scan list. This file is needed by almost every part of scansci, but is not so primitive as to be a part of globals.h. It also includes declarations of the three primary functions (read_scanlist, qualify_scanlist, and summary_report) as well as functions for operating on individual scans or the scanlist as a whole.
102 | scan.c | Implementations of the functions operating on individual scans or the scanlist as a whole.
68 | tracking.h | Declaration of tracking data and Q/A results, plus functions for working with them.
86 | tracking.c | Implementation of the functions declared in tracking.h.
74 | focus.h | Declaration of background/seeing/focus data and Q/A results, plus functions for working with them.
128 | focus.c | Implementation of the functions declared in focus.h.
84 | astrometric.h | Declaration of astrometric data and Q/A results, plus functions for working with them.
161 | astrometric.c | Implementation of the functions declared in astrometric.h.
50 | photometric.h | Declaration of photometric data and Q/A results, plus functions for working with them.
65 | photometric.c | Implementation of the functions declared in photometric.h.
86 | galaxy.h | Declaration of galaxy data and Q/A results, plus functions for working with them.
86 | galaxy.c | Implementation of the functions declared in galaxy.h.
51 | quad_jump.h | Declaration of quadrant jump/planet hazard data and Q/A results, plus functions for working with them.
71 | quad_jump.c | Implementation of the functions declared in quad_jump.h.
48 | file_util.h | Functions for making working with files a little more sane.
56 | file_util.c | Implementation of the functions declared in file_util.h.
86 | input.h | Declaration of functions used to read in all the data, along with the relevant directory names.
1191 | input.c | Implementations of all of these functions and their helpers. Big and messy, but file I/O in C always is.
67 | qualify.h | Convenient macros used by all the QA routines.
107 | qualify.c | Implementation of the main QA routine (which calls all the others) and grading routine (which grades each scan). These functions are declared in scan.h.
43 | report.c | Implementation of the main reporting routine (declared in scan.h).
43 | debug_report.c | Primarily useful for debugging purposes. Uses the print functions defined in each substructure's .c file to print an overall summary to standard out.
139 | html.h | Declarations of useful macros when working with HTML, along with constants defining the URLs used in the resulting pages (especially links to images or plots) and function declarations for the HTML generators.
721 | html.c | Implementation of the functions declared in html.h. Big and messy, like input.c.
57 | main.c | Contains the main() routine, which extracts whatever command-line parameters are passed to SCANSCI and then drives the entire process.
4230 total lines of code
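As an illustration of the Design-By-Contract support listed for debug.h, a precondition check that compiles away under NDEBUG could be sketched as follows. The macro name REQUIRE, the hook report_contract_failure, and the counter contract_failures are hypothetical names chosen for this example; the actual macros in debug.h may be named and routed differently.

```c
#include <stdio.h>

/* Hypothetical stand-in for the global error-handling facilities. */
int contract_failures = 0;

void report_contract_failure(const char *expr, const char *file, int line)
{
    contract_failures++;
    fprintf(stdout, "%s:%d: precondition failed: %s\n", file, line, expr);
}

/* REQUIRE is a hypothetical name. Defining NDEBUG removes the check. */
#ifdef NDEBUG
#define REQUIRE(cond) ((void)0)
#else
#define REQUIRE(cond) \
    ((cond) ? (void)0 : report_contract_failure(#cond, __FILE__, __LINE__))
#endif

/* Example routine with a precondition on its second argument. */
double safe_ratio(double num, double den)
{
    REQUIRE(den != 0.0);
    return (den != 0.0) ? num / den : 0.0;
}
```

With gcc, passing -DNDEBUG on the command line has the same effect as writing #define NDEBUG at the top of the source.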
Error Handling
To the extent possible, I trap all errors encountered and store them in
a global list (error_list, defined in the files errors.h and errors.c).
Fatal errors (such as running out of memory, or encountering a programming
error, or being run with the wrong number of command-line parameters) will
cause program execution to stop immediately and a message to be printed
to stderr (or possibly stdout). Otherwise (if, for example, an input file
could not be opened, or a file was in an unexpected format) then execution
continues as best it can.
Once the program is ready to generate the HTML report, a summary of
the errors encountered is listed at the top of the report in an "Error
Summary" table. You can suppress the generation of this table by recompiling
scansci with the OMIT_ERRORS flag defined (either in html.h, or as a -D
parameter to the gcc compiler).
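The scheme described in this section can be sketched as below. The names error_list_sketch, g_errors, note_error, fatal_error, and write_error_summary are assumptions made for illustration and are not the types or functions declared in errors.h; the HTML emitted is likewise only a placeholder for the real "Error Summary" table.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ERRORS 64
#define MAX_MSG    128

/* Hypothetical stand-in for the global error list. */
typedef struct {
    char messages[MAX_ERRORS][MAX_MSG];
    int  count;
} error_list_sketch;

error_list_sketch g_errors;  /* zero-initialized at program start */

/* Nonfatal: remember the message and let the caller carry on. */
void note_error(const char *msg)
{
    if (g_errors.count < MAX_ERRORS) {
        strncpy(g_errors.messages[g_errors.count], msg, MAX_MSG - 1);
        g_errors.messages[g_errors.count][MAX_MSG - 1] = '\0';
        g_errors.count++;
    }
}

/* Fatal: print to stderr and stop immediately. */
void fatal_error(const char *msg)
{
    fprintf(stderr, "scansci: fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* Write the summary table unless OMIT_ERRORS is defined.
   Messages are written as-is (no HTML escaping in this sketch).
   Returns the number of rows written. */
int write_error_summary(FILE *out)
{
#ifdef OMIT_ERRORS
    (void)out;
    return 0;
#else
    int i;
    fprintf(out, "<table><caption>Error Summary</caption>\n");
    for (i = 0; i < g_errors.count; i++)
        fprintf(out, "<tr><td>%s</td></tr>\n", g_errors.messages[i]);
    fprintf(out, "</table>\n");
    return g_errors.count;
#endif
}
```

In this sketch, compiling with gcc -DOMIT_ERRORS makes write_error_summary emit nothing, which mirrors the recompilation switch described above.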
Anticipated Changes
A roadmap broadly covering changes that may occur and their effects on
the source code may help those who are left with the job of maintaining
this code. Following is a list of anticipated changes (in both the requirements
and implementation) and the parts of the code which must change to meet
them:
Input files change
-
Files move to different locations in the file system, or change names.
In this case, the files input.h and input.c will be the only ones affected.
Filenames are mostly hard-coded in input.c, but are only used in one place
each (namely, the reading functions read_*). Directory names are defined
as constants in input.h.
-
New input files must be read.
You will need to create a new reading function (read_whatever) and make
sure it is called from the function read_scanlist at the top of input.c.
Your function should be declared in input.h, along with the other reading
functions. Most likely, you can emulate one of these existing functions
(choosing either a function such as read_psf_see, which reads one file
containing one entry for every scan, or a function such as read_bmg which
reads one file for each scan). These functions in turn call helper functions
(_read_whatever) which do the actual work. So you would probably also emulate
one of these helper functions (e.g., _read_psf_see or _read_bmg). The main
thing you would need to change would be the format string describing the
data being read, and of course which data fields in the scan structure
(or its substructures) are affected.
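A reader/helper pair of the kind described above might be sketched like this. The record scan_sketch, its fields, and the two-column file layout are assumptions for illustration; the real helpers in input.c take whatever arguments the scan structures in scan.h require. The leading underscore on the helper follows the naming used in this document.

```c
#include <stdio.h>

#define MAX_SCANS 8

/* Hypothetical, pared-down scan record; the real one is in scan.h. */
typedef struct {
    int    scan_number;
    double seeing;       /* illustrative field filled in by the reader */
} scan_sketch;

/* Helper: does the actual parsing, one "scan_number seeing" pair per
   entry. Returns the number of entries stored. */
int _read_whatever(FILE *in, scan_sketch *scans, int max_scans)
{
    int n = 0;
    /* The format string is the part most likely to need editing. */
    while (n < max_scans &&
           fscanf(in, "%d %lf", &scans[n].scan_number, &scans[n].seeing) == 2)
        n++;
    return n;
}

/* Reader: opens the file and delegates to the helper. Returns -1 if the
   file cannot be opened, so the caller can log a nonfatal error. */
int read_whatever(const char *path, scan_sketch *scans, int max_scans)
{
    FILE *in = fopen(path, "r");
    int n;
    if (in == NULL)
        return -1;
    n = _read_whatever(in, scans, max_scans);
    fclose(in);
    return n;
}
```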
-
An existing file format changes, or new or different data must be read
from files which are already being read.
As above, you will need to modify the format string used in the reading
helper function. Any good C reference will explain the use of the library
function fscanf in detail. Additional data being read may also necessitate
changes in the data structures themselves, their Q/A routines, their initialization
routines (e.g., in tracking.h and tracking.c) and possibly also the report
format (as in html.h, html.c).
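When a file format changes, the edit is usually confined to the format string. The sketch below parses one line of a hypothetical three-column layout and skips the middle column; the function name parse_line_sketch and the layout are assumptions, not taken from input.c. It uses sscanf on a line already in memory, but the same format string works with fscanf.

```c
#include <stdio.h>

/* Parse one line of a hypothetical three-column file:
       <scan number> <label to ignore> <value>
   "%*s" reads a field as characters and discards it, so the contents of
   the ignored column cannot break the read.
   Returns 1 on success, 0 if the line is not in the expected format. */
int parse_line_sketch(const char *line, int *scan_number, double *value)
{
    /* sscanf returns the number of fields assigned; suppressed fields
       are not counted, so two assignments means success. */
    return sscanf(line, "%d %*s %lf", scan_number, value) == 2;
}
```

Checking the return value against the expected field count is what allows an unexpected format to be reported as a nonfatal error instead of silently storing garbage.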
Q/A process changes
-
New data is introduced, or new Q/A checks performed.
If, for example, the data structure affected is focus_data, then you would
edit the files focus.h and focus.c to introduce these changes. The macros
defined in qualify.h may be of help (see focus.c for examples of use).
-
Existing Q/A checks change.
Depending on the change, this might be as simple as just changing a value,
or using a different macro (e.g., CHECK_UPPER instead of CHECK_LOWER).
Again, see the individual substructure files (such as focus.c) for details.
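To give a feel for what switching between CHECK_UPPER and CHECK_LOWER means, here is a sketch with assumed macro signatures. The real macros in qualify.h may take different arguments and record their results differently; the qa_grade enum is also hypothetical. The example limits are the jump-counter thresholds quoted in the version 1.2 entry of the Version History (green below 3, yellow below 10, red otherwise).

```c
/* Hypothetical Q/A grades; the real names are defined in globals.c. */
typedef enum { QA_GREEN = 0, QA_YELLOW = 1, QA_RED = 2 } qa_grade;

/* Assumed signatures, for illustration only. */

/* Grade worsens as the value rises to the yellow, then the red, limit. */
#define CHECK_UPPER(value, yellow, red) \
    ((value) >= (red) ? QA_RED : (value) >= (yellow) ? QA_YELLOW : QA_GREEN)

/* Grade worsens as the value falls to the yellow, then the red, limit. */
#define CHECK_LOWER(value, yellow, red) \
    ((value) <= (red) ? QA_RED : (value) <= (yellow) ? QA_YELLOW : QA_GREEN)

/* Jump counters: 0 <= green < 3 <= yellow < 10 <= red. */
qa_grade grade_jump_count(int jumps)
{
    return CHECK_UPPER(jumps, 3, 10);
}
```

Under these assumptions, changing an existing check amounts to editing the two limits or swapping one macro for the other.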
Report generation changes
Changes in the HTML report format are made in the files html.h and html.c.
Depending on the kind of change to be made, this could be as simple as
editing one of the URLs defined in html.h, or as complicated as changing
every implemented routine in html.c (if, for example, every table is to
be laid out vertically instead of horizontally).
To choose whether color is applied to the text, to the table entry
backgrounds, or to neither, change the definitions of USE_TEXT_COLOR
or USE_ENTRY_COLOR in html.h. The actual color codes used can also be found
there.
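A compile-time switch of this kind could work along the following lines. This sketch assumes USE_TEXT_COLOR and USE_ENTRY_COLOR are 0/1 macros, which may not match their actual definitions in html.h; the function format_cell and the HTML it emits are hypothetical.

```c
#include <stdio.h>

/* Assumed to be 0/1 switches; the real definitions in html.h may differ. */
#define USE_TEXT_COLOR  0
#define USE_ENTRY_COLOR 1

/* Writes one table cell into buf, coloring either the cell background or
   the text depending on the switches above. Returns snprintf's result. */
int format_cell(char *buf, size_t size, const char *value, const char *color)
{
#if USE_ENTRY_COLOR
    return snprintf(buf, size, "<td bgcolor=\"%s\">%s</td>", color, value);
#elif USE_TEXT_COLOR
    return snprintf(buf, size, "<td><font color=\"%s\">%s</font></td>",
                    color, value);
#else
    (void)color;
    return snprintf(buf, size, "<td>%s</td>", value);
#endif
}
```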
The output routines in html.c are complex and somewhat intertwined;
I recommend you examine them thoroughly before embarking on significant
changes to them.
Additional Notes
SCANSCI was developed entirely on a Power Macintosh G3 (266 MHz) minitower,
using the Metrowerks CodeWarrior development environment, in about 54 hours
(including time spent in meetings, designing, writing documentation, testing,
and debugging). It compiles and executes to completion under both MacOS
and Solaris.
Thanks to Davy Kirkpatrick and Linda Fullmer for their extensive help
and direction during the development phase, and to Roc Cutri for the introduction
to/overview of 2MASS. Thanks also to Dave Van Buren for loaning me to 2MASS
for a week, and providing me with such a great work environment.
As of August 1998, maintenance of scansci has passed on to Robert Hurt,
who sadly is developing entirely under Unix and not his beloved Mac.
Version History
2.6.2- 27 October 2000
-
Dropped overlaps qa factor calculation for 6x scans since statistics for
them are not compiled in .cumtab file. This means overlaps downgrades will
need to be done by hand.
-
6x QA review templates are no longer deleted (may need these for full reviews).
2.6.1- 12 October 2000
-
Dropped "deletable" lines from the .qua templates; such files are now "hands-off"
and should be correct based only on the QA review and .qagrade files.
-
Now the run_scansci script will archive a copy of the .qagrade file in
/home/davy/QAgrade/ so it will survive reprocessing runs. The scansci.csh
script now checks for the archive of this file and will use it if present.
2.6- 8 August 2000
-
Added Bla0 override in .qagrade file to allow flagging of H electronics
glitches. This represents an incompatibility with previous versions of
this file.
-
The contents of Gene's *.badfsig files are appended to the end of the review
templates (used to help identify possible electronics glitches).
2.5.1- 10 May 2000
-
The backgrounds used in 6x PSP calculations are divided by 6 to produce
values more comparable to those for 1x scans. Note that the sensitivity
cuts are still not "correct" since the background RMS should be down a
factor of sqrt(6) as well, and this is not taken into account.
2.5- 10 May 2000
-
Added compatibility for 6x LSC and LCA scans; when found in the .lgo file
they are treated as SUR scans and appear in summary pages.
-
Sensitivity quality factors are incorrect for long integration scans
-
Photometric overlaps are incorrectly reported to be quality 0; this is
a logic bug caused by having no lines in the .cumtab file I think (I've
kludged a fix by forcing phot quality to 1.0 in a temporary .qagrade file)
-
At present, .cnoise files are not made for these scans so that info is
blank
-
New scansci6.csh script renames html files to scan6_*.html, deletes bogus
.tmpl and .sit files
-
Added seeing plot prompt in QAreview.tmpl file
-
Added intermediate airglow summary to QAreview.tmpl file
-
No longer writes scans with tile # = -9 to .sit file (this excludes scans
that failed processing)
2.4.3- 17 April 2000
-
Tweaked North H sensitivities so they never give quality < 0.3
-
Kludged Southern tile margins for N hemisphere tiles in the tracking summary
2.4.2- 4 April 2000
2.4.1- 22 March 2000
-
Implemented correct sensitivity scoring for northern H array
-
Implemented new grading algorithm based on the minimum of the quality factors
rather than their product
-
Changed URL references from irsatest to spacemouse to deal with IRSA web
server changes
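The difference between the two grading rules mentioned in the 2.4.1 entry can be sketched as below. The function names are hypothetical, the quality factors are assumed to lie between 0 and 1, and the step that turns the overall quality into a final grade is not shown because it is not described here.

```c
#include <stddef.h>

/* Earlier scheme: overall quality is the product of the factors. */
double overall_by_product(const double *factors, size_t n)
{
    double q = 1.0;
    size_t i;
    for (i = 0; i < n; i++)
        q *= factors[i];
    return q;
}

/* Newer scheme: overall quality is the worst (minimum) single factor.
   Assumes factors lie in [0, 1]; returns 1.0 for an empty list. */
double overall_by_minimum(const double *factors, size_t n)
{
    double q = 1.0;
    size_t i;
    for (i = 0; i < n; i++)
        if (factors[i] < q)
            q = factors[i];
    return q;
}
```

For factors of 1.0, 0.5, and 0.5, the product is 0.25 while the minimum is 0.5, so two moderate problems no longer compound into a much lower score.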
2.3- 4 February 2000
-
Implemented correct sensitivity scoring for southern arrays
-
Added new links to flats diagnostic images on the main index page
2.2- 3 January 2000 (R. Hurt).
-
Extensive revisions made to incorporate new grading rules:
-
Untracked seeing runs > 900" now result in quality 1 (previously quality
2 for > 1400")
-
J average shapes > 1.25 or J max seeing shapes > 1.3 result in quality
1
-
H cnoise(4) > 4.5 for log(density) < 4.2 result in quality 1 (airglow)
-
QAreview template now employs 5 quality factors to cover photometry, sensitivity,
seeing, untracked seeing, and airglow.
-
Airglow diagnostics added to Background section
-
Untracked seeing diagnostics revised to include seeing info
-
New .qagrade file created in datequal directory to allow reviewer overrides
to be entered once and automatically placed into all relevant QA tables
(including QAreview).
-
Added time/date stamp to web pages & .QAreview templates
2.1 (PARC)- 4 November, 1999 (R. Hurt).
-
NOT a production processing release; this intermediate version was designed
to be run on the PARC directories to generate the 1st generation of the
Scan Information Table (SIT).
-
Two summary table inputs were added for SIT generation to help recover
prior QA scores: these are the QAtrans and QUA files. These inputs will
be commented out for production use.
-
Numerous new columns added to cover requested SIT info and diagnostics
for new grading schemes.
2.0- 4 November, 1999 (R. Hurt).
-
Changed output selection for scan summary and grade summaries in template
.qareview files to use all non-cal scans instead of sci scans; sometimes the
.pfqqas file is screwed up and scans are not labelled correctly. With this
change, mislabelled scans no longer disappear from the reports.
-
Implemented new sensitivity scoring for both hemispheres.
-
Now scansci adjusts the scoring based on which hemisphere (and for the
north, which date) is being processed.
-
New sensitivity cut-offs are used for northern H and southern H & K
data.
-
Seeing shapes > 1.25 in any band now result in a downgrade to quality 0.
-
NOTE - This version was backed out of by request from UMASS as the newly
implemented grading rules were reevaluated.
1.9- 17 September, 1999 (R. Hurt).
-
Implemented new refinements to the grading scheme for untracked seeing
issues.
-
Now reads seetrack file for linear extent of untracked (seetracker score
> 1) seeing and reports these in the Background/Seeing/Focus page.
-
Scans with >= 1400" of linear extent are now downgraded to quality 2 (similar
to before).
-
In the *.qua template file the fct5 column now contains the max seetracker
score, truncated to the nearest tenth and the fct6 column contains an untracked
flag, 1.0 for OK scans, 0.0 for untracked seeing. The flag defaults to
0.0 for scores >= 2.0, but these may be overridden by the reviewer.
1.8- 31 August, 1999 (R. Hurt).
-
Implemented new grading scheme for untracked seeing issues.
-
Grades for untracked seeing scans are handled the same way as all other
scans without a ceiling value of 2.
-
In the QA review template file all scans with seetracker scores > 2 and
between 1 & 2 are summarized for inspection.
-
In the *.qua template file the fct5 column now contains the max seetracker
score, truncated to the nearest tenth and the fct6 column contains an untracked
flag, 1.0 for OK scans, 0.0 for untracked seeing. The flag defaults to
0.0 for scores >= 2.0, but these may be overridden by the reviewer.
-
Version 1.8.1 corrects a minor truncation problem with the seetracker max
score.
1.7- 17 March, 1999 (R. Hurt).
-
Fixed mistake in airglow grading factor (it now depends only on the H background-subtracted
sigmas, not the max of H & Ks). Also renamed this to be the "Extended
Photometric Quality Factor" in QA template to be more descriptive of its
function.
-
Added two new qualitative H airglow factors to scan summary page: Banding
& Jumps. These diagnostics are abstracted from the H banding and jump
counters and should give an approximate indication of extended source reliability
since they are driven up by structures seen in the coadds. They are here
for QA reviewer evaluation purposes only; more analysis will need to be
done to relate these to actual extended source reliabilities.
-
Fixed output in qDATEHEMI.qua.tmpl file for cases with missing scans (was
missing a linebreak).
1.6 - 10 February, 1999 (R. Hurt).
-
Implemented grading of airglow based on background-subtracted (by GALWORKS)
coadd sigmas. The formula used (from Tom Chester) is: qf(airglow) = 3.36
- (9/4)sigma. Scores appear in scan and galaxy summary pages as well as
QA review templates.
-
Note that this score is not a multiplicative factor used in grading, only
an independent measure of extended source reliability
-
This score becomes invalid in high density regions because of point source
confusion (source densities ~> 3.5)
-
Template qDATEHEMI.qua.tmpl file is now created in datequal directory in
which grades and quality factors have been filled in.
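The airglow formula quoted in the 1.6 entry translates directly into code. In the sketch below, the function names are hypothetical, no clamping is applied because none is described, and the validity cutoff of 3.5 is taken from the approximate figure quoted above, in whatever density units that entry uses.

```c
/* Airglow quality factor: qf(airglow) = 3.36 - (9/4) * sigma, where sigma
   is the background-subtracted coadd sigma. */
double airglow_quality_factor(double sigma)
{
    return 3.36 - (9.0 / 4.0) * sigma;
}

/* The score is described as unreliable at source densities above roughly
   3.5; the exact cutoff and units are assumptions in this sketch. */
int airglow_score_is_valid(double source_density)
{
    return source_density <= 3.5;
}
```

For example, a sigma of 1.0 gives 3.36 - 2.25 = 1.11.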
1.5 - 14 January, 1999 (R. Hurt).
-
Readout noise/banding summaries now read and displayed in Jump Counters.
-
New overlaps range summary read from *.cumtable and summarized in Photometricity;
also proper grades are computed for the photometric scores based on overlaps
scatter and included in both Scan Summary and in the QA template file.
-
New, corrected threshold is set for bad "2nd moment"/psf elongation at
major/minor < 0.81.
-
Fixed minor bug in which the high-SNR RA deltas had lost their signs (always
came out positive).
-
Seetracker scores are now flagged yellow for scores between 1.0 & 2.0
(helps track marginal seeing).
-
#Tycho stars and % coverage of Tycho stars now read and displayed in Astrometry
and is automatically summarized in QA template file.
-
Color thresholds in Galaxy detections updated to more useful values.
-
Order of cal/survey scans reversed in scan summary (survey scans now come
first).
-
Legibility of most tables improved by adding additional row breaks between
cal/survey blocks.
-
The QA review template was tweaked slightly for better legibility and ease
of use.
-
Index page revamped with several new features:
-
Links are more space efficient in 2-column format
-
Link added to log file (.lgo).
-
Link added to sigma vs. time CALMON plot
-
Link added to zero-point plot for alternative calibration method (if the
default method was aperture, then there will also be a *.zro_psf.ps file
while a *.zro_ap.ps file will be present if the method used was psf; note
that only one of the two links to these extra plots will work at any given
time).
-
The Index page has links pointing to it on all of the other html pages.
-
Approximate time: 20 hours
1.4- 30 September, 1998 (R. Hurt); numerous minor improvements.
1.3 - 4 September, 1998 (R. Hurt).
-
Tables have been split up into different html documents with a master index
page.
-
New html naming convention adopted to work with new QA web page naming
conventions. QA documents for a given night will now be served out of a
directory named qyymmddh. All QA files will be copied to this directory.
The master index page is index.html and the various tables follow the form
scan_xxxx.html
-
The Scan Summary table has been reenabled and expanded in functionality.
Now reported are:
-
Maximum possible grade (because of untracked seeing and bad 2nd image moment)
-
Photometric quality factor (currently just 0.0 or 1.0)
-
Sensitivity quality factor
-
Overall quality factor (product of prev. 2 for science scans, just photometric
factor for cals)
-
Final grade of scan
-
Template QA form is now output.
-
Approximate time: 12 hours
1.2 - 2 September, 1998 (R. Hurt).
-
Bkg/See/Focus: Updated the bkg vs. seeing column to use the current formula
(shape * bkg^0.29) and colors reflecting the quality grades.
-
Added new column including the corresponding QA sensitivity scoring factor
based on the current scoring rules for the bkg. vs. seeing diagnostic.
-
Rearranged column order in Astrometric Summary to move more useful info
to the beginning.
-
Fixed broken links to graphics icons.
-
Connected links to diagnostic files at bottom of tables.
-
Added small spaces between some column blocks for clarity.
-
Commented out scan summary table (until it is upgraded with more useful
scoring info).
-
Approximate time: 24 hours
-
Changed color codings for the following:
-
Jump counters: 0 <= green < 3 <= yellow < 10 <= red
-
Bkg/See/Focus 2nd Moment ratio: red < 0.77 or > 1.17, 0.87 < green
< 1.07
1.1 - 17 March, 1998.
-
Much improved error handling, for easier debugging. Errors are now logged
to the report file (except for fatal errors, such as out of memory or inability
to open the report file, which are all printed to stdout before termination).
-
Added a column to every table in the report, indicating the scan type.
-
Fixed a stupid bug (typo- good catch, Davy!) in the QA of astrom. disp.
-
Fixed bug reading seetrack (using a variable before initializing it - Doh!).
-
Fixed bug where if the report file could not be opened, we might seg fault.
-
Modified _read_qapm to extract the multipliers from the header, if any
are present.
-
Added a new file (../log/YYMMDDH.lgo) and reading functions to go along
with it; use the data from there to adjust the tracking XSDifEq values.
-
Added the version history section to this documentation and updated it.
-
Modified input routines so that any data skipped is read as characters
(rather than formatted data). This should produce a slight speed improvement,
and make the routines more robust against unexpected formats (such as blank
fields).
-
Fixed bug in _read_cal which was reading to EOL twice.
-
Some bug-fixes to the bug-fixes.
-
Second release. (13 hours effort)
1.0 - 6 March, 1998.
-
First release. (41 hours effort)