NED Spectral Archive - data loading steps

OPTICAL SPECTROSCOPY OF LUMINOUS INFRARED GALAXIES. I. NUCLEAR DATA
D.-C. KIM, D. B. SANDERS, S. VEILLEUX, J. M. MAZZARELLA, AND B. T. SOIFER (1995ApJS...98..129K)

Data loading steps

Associated literature:
refcode 1995ApJS...98..129K
Read associated literature to learn about the nature of the dataset:
- tabulated/common information:
  galaxy name listed in Table 1A & 1B
  exptime listed in Table 1A & 1B
  pa listed in Table 1A & 1B
  pos_origin published-value
  pos_units sexagesimal
  pos_system equatorial
  pos_rframe FK4?
  pos_equinox B1950
  flux_unit erg s^-1 cm^-2 A^-1
  bandpass optical
  obs_mode long-slit
  detector CCD
- telescopes used and instrument setup:
  telescope Palomar 5m - always two (blue & red ends of spectrum) accosiated fits files per object
  instrument double spectrograph (Oke & Gunn, 1982, PASP, 94, 586)
  disperser N/A
  spect_res 8A (blue) & 10A (red)
  seeing 1.0 - 1.5 arcsec
  aperture 2 arcsec
  
  telescope University of Hawaii 2.2m - always accosiated with single fits file per object
  instrument Faint Object Spectrograph
  disperser N/A
  spect_res 7A
  seeing 1.0 arcsec (or better)
  aperture 1.5 arcsec
- region sampled by spectral observation, and the region actually extracted during data reduction:
  obs_region nucleus
  ex_reg_leng (extraction length) varied according with redshift:
  ~2 kpc for cz < 20,000 km/s and ~4 kpc for higher cz objects
  (Ho=75km/s/Mpc & qo=0.5)
- calibration:
  flux calibration YES
  absolute calibration some - spectra obtained under photometric conditions are flagged with an asterisk (*) in Table 2 and 3
  sky subtraction YES
Received electronic versions of tables from Kim:
Table 1A (tb1a.txt) & 1B(tb1b.txt): not the final version published - names and positions are different to those published
Table 2A & 2B (tbl2.txt): not the final version published
Table 3A & 3B (tbl3.tex): added "*" to first entry (NGC 23) after communication with Kim.
Using data listed in step 2, build metadata table - for this dataset, Kim has actually done most of the hard work,
listed in "in.tbl". There are a few corrections to this table:
(i) obs_region should be "nucleus" not "Whole_Galaxy"
(ii) given_name - phases "blue"/"red"/"blue+red" removed
(iii) lambda_s & lambda_e - only had 4s.f. - re-read from fitsheader to 6 s.f. (see below - readFITShead.pl)
(iv) cra2000 & cdec2000 - had too many d.p. in arcsec - re-calculated to 2d.p. with "/Users/bchan/bin/d2t.pl"
(v) n_pix - incorrectly set to 800 for both telescopes - re-read from fitsheader (see below - readFITShead.pl)
Look at fits header to identify any required parameter available - in this case there are two types form the two telescopes,
use "/Users/bchan/bin/fitshead", e.g.:
```
	:~> fitshead filename.fits
```
It turns out that the FITS header keys used by the two telescopes are the same:
```
	CRVAL1    = lambda_s
	CDELT1    = dlambda
	AIRMASS   = airmass
	EXPOSURE  = exptime
	NAXIS     = naxis
	NAXIS1    = n_pix
```
Using the script "readFITShead.pl":
look at documentation:
```
	:~> perl readFITShead.pl -h
```
run it first time to generate required data file - INPUT_fitsheader_identifier.dat:
```
	:~> perl readFITShead.pl
```
edit data file - INPUT_fitsheader_identifier.dat (resluts of step 4) and re-run script:
```
	:~> perl readFITShead.pl *.fits
```
output file = OUTPUT_identified_fitsheaders.dat
Using EXCEL, created "metadata_table.xls" and merged "in.tbl" & "OUTPUT_identified_fitsheaders.dat", corrected problems listed in step 3. The columns that still need to be filled are:
(i) seeing: Palomar 200in = 1.5; MKO 2.2m = 1
(ii) instrument: Palomar 200in = Double Spectrograph; MKO 2.2m = Faint Object Spectrograph
(iii) ex_reg_leng: Need original cz measurement: (see step 2)
(vi) flag_abs_cal: Need electronic version of table 2 or 3
Further issues with Kims original metadata table:
(i) given_name: some are different to those published - use names in (tbl3.tex): created tbl3.name - tab-separated with names - entry 140 changed from Mrk 331 to MCG+03-60-036 (NOTE: in the publication, table 1A - last entry - had UGC 12812 while tables 2A & 3A used MCG+03-60-036).
(ii) cra1950 & cdec1950 - are actually NED positions, not those published.

Matching metadata table names to those from table 3:

	:~> perl ~/bin/FINDdiff.pl -keepall -m -col=1,2 -sep="\t" metadata_table.names tbl3.name > LOG

Names with 'problems' will be reported in the LOG file, they involve: (1) minor name-format differences; (2) strange and unexpected differences; and (3) What the??? (typographical error?)

		tbl3.name            metadata_table.names
	(1)	IC5135               NGC7130
		MCG+0029023          MCG+00-29-023
		MCG+0142088          MCG+01-42-088
		MCG+0204025          MCG+02-04-025
		MCG+0310045          MCG+03-10-045
		MCG+0818012          MCG+08-18-012
		MCG-0201051N         MCG-02-01-051N
		MCG-0201051S         MCG-02-01-051S
		MCG-0233098E         MCG-02-33-098E
		MCG-0233098W         MCG-02-33-098W
		MCG-0304014          MCG-03-04-014
		NGC1143/44           NGC1143/4
		NGC6090NE            N6090NE
		NGC6090SW            N6090SW
		NGC7771Main          NGC7771
	(2)	IR10565+2448Main     IR10565+2448W
		IR10565+2448SE       IR10565+2448NE
		NGC3508N             NGC3508E
		NGC3508S             NGC3508W
		NGC5257              NGC5257/8E
		NGC5258              NGC5257/8W
		UGC9913E             UGC9913N
		UGC9913W             UGC9913S
		Zw448.020Main        Zw448.020SE
	(3)	IR04335-2524 (14?)   ESO484-G036

In conclusion - I need an alternate source to varify these discripancies.

* Kay has supplied OCR of table 1A and 1B - Kay_table_1A_1B.txt
* After fixing a few minor problems, generated EXCEL file: Kay_table_1A_1B.xls
* Since the order of objects within Tables 1, 2, and 3 are the same, I merged Kim's tbl3.name "in-order" to over come most of the confusions listed in step 9, and fixed a couple of misplaced error and a bunch of OCR error by hand..
```
	OCR error examples:
	NOC 1204      -> NGC 1204
	ESO 550-1G025 -> ESO 550-IG025
	NGC 3508 F    -> E
	IR 152503609  -> IR 15250+3609
	ESO 343-10013 -> ESO 343-IG013
	ESO 602-0025  -> ESO 602-G025
	ESO 534-0009  -> ESO 534-G009
	IR 093032736  -> IR 09303+2736
	IR 093392835  -> IR 09339+2835
	IR 35483+4227 -> IR 15483+4227
	IR 35519+3537 -> IR 15519+3537
	ESO 593-10008 -> ESO 593-IG008
	IR 22279-1312 -> IR 22279-1112
```
* It's obvious that this ORC process still require a fair bit of human intervention and most of these error are difficult to spot!
* After fixing these problems, a tab-separated file is created: Kay_table_1A_1B.tab
Try step 9 again - matching metadata table names to those from Kay_table_1A_1B.tab:
```
	:~> perl ~/bin/FINDdiff.pl -keepall -m -col=1,1 -sep="\t" metadata_table.names Kay_table_1A_1B.tab > LOG2
```
Again the problematic names will be reported in the LOG2 file, this time only a few simple ones:
```
	Kay_table_1A_1B.tab            metadata_table.names
	NGC 5953/4                     NGC 5953
	NGC 6090 NE                    N6090 NE
	NGC 6090 SW                    N6090 SW
```
The output file is named: metadata_table.names_Kay_table_1A_1B.tab.common, once the above 'problem lines' are added back in (by hand), this list can be readily merged back into the EXCEL file: metadata_table.xls

Marion has provided the 'cat-prep' file, to re-cover the link between NED-objects and Kim's-objects: Kim_catprep_file.txt
* First, a few lines in this file were removed:
--> These two lines are not link to Kim's paper as the refcode is not persent:

	0_E_|G     |145447.8   +244858.   |                                |G_UGC 09618 NED01                                                                                                                                                                       
	E -

--> These ones are used for rearranging NED objects in database:

	0___+ZAP   +000000.0000+000000.000+O_NONEXISTENT OBJECT 774        |I_IRAS F09338+3133              |                                |                                -     -           -        -                    |1111...............| 0.0   0.0    0  
	0___|ZAP   |000000.0000+000000.000-I_IRAS  09338+3133               O_NONEXISTENT OBJECT 774        -I_IRAS F09338+3133                                                                                                                                     
	0_R_|ZAP   |000000.0000+000000.000+I_IRAS  09338+3133               O_CG 0024                       +I_IRAS F09338+3133                                                                                                                                     
	R + 6930 IRAS  09338+3133              
	R + 6929 IRAS F09338+3133

* With nedit, I sorted the catprep file (a few example lines shown below) and extracted two name columns (shown in bold below) and check for the correct refcode (column under-lined below).

	0_Z_|G     |000833.4   -122310.   |                                |  MRK 0938                        |G_NGC 0034                          |                                |     |           |   5860 |                    |1995ApJS...98..129K|                 
	0_Z_|G     |001618.0   -103914.   |                                |  MCG -02-01-051                |G_MCG -02-01-051 S              |                                |     |           |   8170 |                    |1995ApJS...98..129K|                 
	0_Z_|G     |001618.0   -103914.   |                                |  MCG -02-01-052                |G_MCG -02-01-051 N              |                                |     |           |   8080 |                    |1995ApJS...98..129K|

* In the two name columns, the names with "G_" prefix (mostly on the right column) are those object names used in the publication, while the column on the left are mostly the identified NED-names. There are however a few cases where the published names were added into NED directly (hence only the left-column is filled - with "G_" prefix names), a few examples are shown below.

	0+S_|G     |191132.3   -212422.   |                                |G_ESO 593-IG 008 N              |                                |                                |     |           |[0.0434]|                    |1995ApJS...98..129K|  18.0 6.0    83 
	0+S_|G     |191132.3   -212422.   |                                |G_ESO 593-IG 008 S              |                                |                                |     |           |[0.0437]|                    |1995ApJS...98..129K|  18.0 6.0    83 
	0+S_|G     |205505.3   +165603.   |                                |G_CGCG 448-020 NW             |                                |                                |     |           |[0.0359]|                    |1995ApJS...98..129K|  35.0 7.0    70 
	0+S_|G     |205505.3   +165603.   |                                |G_CGCG 448-020 SE              |                                |                                |     |           |[0.0355]|                    |1995ApJS...98..129K|  35.0 7.0    70

* The "G_" prefix and the " ID" postfix (see examples below) in the published names were removed, and the two name columns are put into an EXCEL file Kim_catprep_file.xls

	0_S_+G     |092054.8   +394327.   |                                |  IRAS F09209+3943              +G_IRAS  09209+3943 ID           |                                -     -           +[0.0923]-                    |1995ApJS...98..129K|                 
	0_S_+G     |092151.9   +342838.   |                                |  IRAS F09218+3428              +G_IRAS  09218+3428 ID           |                                -     -           +[0.0677]-                    |1995ApJS...98..129K|                 
	0_S_+G     |092431.6   +351728.   |                                |  IRAS F09245+3517              +G_IRAS  09245+3517 ID           |                                -     -           +[0.1391]-                    |1995ApJS...98..129K|

* This list cannot be matched to Kay_table_1A_1B.tab yet as the published names in the "catprep" file are in NED format (losely)... I now use the code nedname to attemp an unification...

(a) in Kim_catprep_file.xls: the column published-names was pass to nedname to generate the column published-names-formated which was used to create the column complete_published-names-formated. Then the file Kim_catprep_file.tab was created...
(b) similarly in Kay_table_1A_1B.xls: "conditioned" column-final-name --> nedname --> column-NED-formated --> column-complete_final-name_formated --> updated file-Kay_table_1A_1B.tab. The only extra step was the conditioning of "final-name" which included only replacing all "IR "-prefix to "IRAS ".

Match up the two lists Kim_catprep_file.tab and Kay_table_1A_1B.tab:

	:~> perl ~/bin/FINDdiff.pl -keepall -m -col=2,2 -sep="\t" Kay_table_1A_1B.tab Kim_catprep_file.tab > LOG3

A few simple mismatch names are reported in the LOG3 file:

	Kay_table_1A_1B.tab            Kim_catprep_file.tab
	MCG+03-60-036                  UGC12812 (we decided to use the name MCG+03-60-036 for this object)
	UGC6436NW                      UGC06436NW   
	UGC6436SE                      UGC06436SE   
	UGC4881SW                      UGC04881SW   
	UGC4881NE                      UGC04881NE   
	UGC8335NW                      UGC08335SE   
	UGC8335SE                      UGC08335NW   
	Zw448.020NW                    CGCG448-020NW
	Zw448.020SE                    CGCG448-020SE
	NGC0985                        NGC0958 (error in Kim's paper - see Marion's email, comment added in table)

The output file is named: Kay_table_1A_1B.tab_Kim_catprep_file.tab.common, the above 'problem lines' are added back in (by hand).

Finally, match up the two lists metadata_table.names and Kay_table_1A_1B.tab_Kim_catprep_file.tab.common:
```
	:~> perl ~/bin/FINDdiff.pl -keepall -m -col=1,1 -sep="\t" metadata_table.names Kay_table_1A_1B.tab_Kim_catprep_file.tab.common > LOG4
```
Checking LOG4 shows no problems at all in this match as I have already edited out the problems found in step (11) before matching. The result list (renamed metadata_Kay_table__Kim_catprep.common) can be directly merged back into metadata_table.xls
```
	:~> perl ~/bin/NEDcoordtrans.pl metadata_table.pub_pos > LOG_NEDcoordtrans
```

From Kim, we have were reminded that ex_reg_leng can be obtained from the fitsheader parameters: APLOW \& APHIGH (the range of pixel values used for the extracted spectra). So, we used readFITShead.pl to extract these values, and merged into the EXCEL file: metadata_table.xls
To convert the ex_reg_leng length from pixels to arcsec, we needed know the pixel scales. The following were found from the web, and with help from Kim and Joe, we decided the pixel scales used in this paper. NOTE: the final adopted pixel-scales (MKO 2.2m = 0.65 arcsec/pixel; Palomar Red = 0.58 arcsec/pixel; Palomar Blue = 0.78 arcsec/pixel) for the Palomar is different to thos listed in the tables below as the instrument has changed many times over the years.

	http://www.astro.caltech.edu/palomar/200inch/dbl_spec/dbl_spec.html
	Length: 128 arcsec
	Detectors:
	Red:            0.468 arcsec/pixel; CCD 21; 1024 x 1024; 24� pixels
	Blue:           0.390 arcsec/pixel; CCD 20; 2688 x 512 (+100 pixels x-axis overscan, 2788 total) ; 15� pixels
	Alternate Blue: 0.624 arcsec/pixel; CCD 14; 1024 x 1024; 24� pixels

	http://pdssbn.astro.umd.edu/data1/IHW/hal_1002/document/manuals/uhuser.asc
            	 Table 7.8: Faint Object Spectrograph Characteristics             
	 ________________________________________________________________________     
	 Collimators           400mm          blue, visual and red (marked)           
	 ________________________________________________________________________     
	 Cameras       Z135    135mm       Zeiss f/2.0 5975158 (broad: 400-1000 nm)   
	 ________________________________________________________________________     
	 Slits         160mum x 8mm           1.5arcsec x 74arcsec                       
	 ________________________________________________________________________     
	 Beam Size            40mm                                                    
	 Field Size at Slit   26mm                   4 arcmin                         
	 Image Scale                                                                  
	   at detector       135mm      27.2 arcsec mm (1);   0.65 arcsec/24 micron   
	 ________________________________________________________________________

Finally, after all the metadata are collected, the content of the EXCEL file: metadata_table.xls is copied into a text file (named metadata_table.fin), used as the input to the loader script generate_spectraDB_loader.pl:
```
	perl generate_spectraDB_loader.pl metadata_table.fin
```
This script also generate the input file for the script which makes the previews:
```
	perl generate_spectraDB_preview.pl
```

Update 17 May 2006: Both generate_NED_ASCII.pl and generate_VOtable.pl have been updated to generate homoginized values.

	:~> perl ~/research/NEDsp/Codes/generate_NED_ASCII.pl -col=1,2 -units=A,erg_cm2_s_A DATA/*.txt
	:~> perl ~/research/NEDsp/Codes/generate_VOtable.pl DATA/*_NED.txt

Back

OPTICAL SPECTROSCOPY OF LUMINOUS INFRARED GALAXIES. I. NUCLEAR DATA D.-C. KIM, D. B. SANDERS, S. VEILLEUX, J. M. MAZZARELLA, AND B. T. SOIFER (1995ApJS...98..129K)

Data loading steps

OPTICAL SPECTROSCOPY OF LUMINOUS INFRARED GALAXIES. I. NUCLEAR DATA
D.-C. KIM, D. B. SANDERS, S. VEILLEUX, J. M. MAZZARELLA, AND B. T. SOIFER (1995ApJS...98..129K)