Collection of material for FITS Technical Group

Assembled by Lucio Chiappetti (INAF IASF Milano, IAUFWG chair, TG member)
All material should be available on public sources

List of perceived shortcomings

An attempt should be made to extract this list in a homogeneous format from the documents listed above (and to assign a priority order in which the items should be handled).
Topics are listed in arbitrary order: they follow the order of occurrence in the Thomas et al. ADASS poster and then in the other sources listed above (in the order given there), with similar topics grouped together.

For a detailed explanation of each item, see the original wording in the references given above, each marked by a mnemonic [Xx] and a precise pointer to the section (s.n.m), paragraph (p.n), line (l.n) or item (#n).

Priorities and comments are marked by the initials of the proposer (within braces). Comments covering several items are indicated by a simple catch phrase and linked at the end.

Priorities (from top priority 1 to lowest 999) are colour coded as
1 red for items to be taken seriously as soon as possible
10 orange for items to be taken seriously or partially seriously but of lower importance
50 yellow for items intermediate between the orange and green classes
90 green for items to be ignored since they are beyond the scope of FITS
500 gray for items which are neutral and could be deferred sine die
999 black for items rejected

PS: my original inclination would be to attribute "green" priorities ("beyond the scope of FITS") to many items, but, recognizing some merit in the arguments in favour of them, I moved them to "yellow".

The last column groups items by a coarse topic classification:
HDR header and header keyword related
DAT (binary) data representation related
CON specific conventions
SEM semantics

Item Short id Ref. Description Priority Opinions Class
0 Slowness [Th1]s.1 p.2 [Py1] FITS standard evolved slowly {LC}900 {LC} true but did VO converge faster? (despite the amount of effort and money thrown at it) ???
0.0 Obsolescence [Py1]#1 Current s/w out of date (... e.g X11) (sic!) {LC}999 {LC} nobody forbids anybody to write "newer" FITS s/w ???
1 Data models [Th1]s.2.2 p.1 [Th2]s.2.1. s.2.2 No standardized data model association {LC}90 {LC} VO? discipline? data model! SEM
1A Errors & data quality [Th1]s.2.2 p.2 [Th2]s.2.2.1 [Th2]s.2.2.4 No standard models for errors and data quality {LC}80 {LC} discipline? Moreover proposals like those in [Th2] are semantics and do not concern the format (both errors and data quality). Assignment of a standard pre-retrieval data quality is an ambitious task, for the archivers (VO?). Argument is serious but difficult. SEM
1B Provenance [Th1]s.2.2 p.2 [Th2]s.2.2.3 no machine-readable general HISTORY {LC}70 {LC} VO? discipline? Personally I am not convinced that documentation kwds should not be primarily human-readable, and that a rigid machine-readable syntax (blocking data analysis on irrelevant data) would be a bonus (more a nuisance!) Requires either another convention or new standard? propose one! SEM
1B1 Data provenance [Th2]s.2.2.3 p.3 no traceability of which data files contributed {LC}20 {LC} a PARENT kwd? or family thereof? store command line? (if any!) All approaches seen in mission-specific contexts HDR
1C WCS [Th1]s.2.2 p.3 [Th2]s.2.2.2 WCS complex, incomplete, inflexible {LC}19 {LC} WCS is open: please supply details and proposals
link with 4A, limitations of short kwd names on complex WCS conventions
HDR
1D Units [Th2]s.2.2.5 Standardization of (new) data units insufficient {LC}100 VO task? but they think even IVOA is insufficient! However VOunits work looks pretty sensible, and FITS should align more than diverge! CON
2 Network [Th1]s.2.4 end [Th2]s.2.4.1 Streaming indeterminate size unsupported {LC}30 {LC} is it really a problem? it wasn't for tapes! it won't be for URLs with a Content-Length! Otherwise just use staging files! DAT
3 Large distributed datasets [Th1]2.2.4 beg [Th2]s.2.4.2 TB datasets across multiple file systems. Grouping convention insufficient {LC}81 {LC} Requests in [Th2] are overloading FITS with something that lies outside it! See data organizers or use of external databases (see 3B). Would delegating part of the HDU/s to an external URI be wise? DAT
3B FITS and database tables LC motu proprio Devise a convention to map FITS tables from/to database tables {LC}25 CON
3C FITS tables with >999 columns [MT] Devise a way to handle broad tables (joins from database tables) {LC}24 Allowing TFIELDS>999 will require long kwd names (see 4A) and may require a WCS-II with long KWD names DAT
3D Mapping FITS tables to VOTables [Py4]#2-3 points raised in [Py4] are astropy specific {LC}35 However they make reference to a topcat convention to add XML VOTable info onto FITS files, to be examined (but it is not registered, see this astropy comment discipline? )
requirement for a standard "column description kwd" in this astropy message
CON
3E variety of internal representations [Py3]#3 must support a variety of different byte-level data types ... of all byte sizes {LC}100 {LC} FITS already has some, more or less enough (see 5B though), and an established mechanism to handle them in images and bintables. As a convinced Ockhamist, I think data types shall not be multiplied (and used) beyond necessity (e.g. unsigned are not really necessary); compare Java "primitive data types" vs objects DAT
3F (efficient) random access [Py3]#6 should support reading and writing to specific subsets of the data without requiring the entire file to be read into memory {LC}110 {LC} Essentially this asks for random access, which we have. Contrasts with item 2; compare also item 4I. DAT
3G preview thumbnails [Py3]#8 should support thumbnail-style lower resolution data (... associated with the main data) for quick view purposes {LC}150 {LC}Not a priority, viewers are usually fast enough, anyhow could be handled by a (foreign ?) extension CON
4A1 8-char kwd name [Py1]#3 [Th1]s.2.3 l.2 [Th2]s.2.3.1 p.3-4 [BP5P]#1 8-char kwd name too short {LC}1 work in progress HDR
4A2 No namespaces [Th1]s.2.3 l.6 Lack of namespaces {LC}31 {LC} invent a convention! CON
4A3 68-char kwd limit [Py1]#3 [Th1]s.2.3 l.2-3 [Th2]s.2.3.1 p.5 [BP5P]#2 [Py4]#1 Kwd values too short, HIERARCH, CONTINUE insufficient {LC}2 work in progress HDR
4B 2880-byte blocks [Th1]s.2.3 l.8 [Th2]s.2.3.1 p.6 2880-byte blocks are an excessive overhead for tiny datasets {LC}200 {LC} either live with it and do all-FITS (XMM CCF approach) or use a tiny self-documenting ASCII file for tiny datasets
and don't tell me that XML is more efficient!
DAT
4C Real time writing [Th1]s.2.3 l.9 2880-byte blocks limit real time writing {LC}31 See also 2 DAT
4D Unextendable header [BP5P]#5 [LC] Header located at the front unextendable without extensive rewriting {LC}6 linked with other "kwd" items HDR
4E No array kwds [Th2]s.2.3.1 p.2 [LC] Better convention for lists, sets, arrays of kwds {LC}5 {LC} true! HDR
4F Data association [Th2]s.2.3.2 awkward data association among HDUs in a MEF {LC}40 {LC}This is a task for a data organizer; devise convention for index files? CON
4G Data endianness [Th2]s.2.3.3 No support for little-endian byte order {LC}999 {LC disguised as Ockham} Absolutely not! Look at Java! or import and work in native format DAT
4H Variable length rows [Py2] [Py1]#2 [Py3]#10 Request to disallow variable length rows in BINTABLEs {LC}999 {LC} Nobody is obliged to use them. I agree to use sparingly. A normalizing utility could be provided. DAT
4I Table storage inefficient [Py3]#10 Request to store tables by columns in consecutive bytes {LC}90 {LC} we could live with storage by row as we did so far DAT
4I Kwd typing [LC] Header kwd not strongly typed {LC}21 HDR
4J NaN in kwd values [EB] Allow IEEE NaN and Inf in kwd header values {LC}26 discussed in the past on FITSBITS, could be encoded as strings like 'NAN', also because of 4I HDR
4K Metadata support [Py3]#9 metadata should be either stored in or easily exported to a more commonly used format (i.e. XML?) {LC}80 {LC} don't see great advantages in XML (see also a nice reading), but mapping could be handled by external utilities. BTW what became of this ADASS 2001 proposal? HDR
5A No versioning [Py4]#7 [Th1]s.2.1 p.1 [Th2] 2.1.1 [BP5P]#4 No way to tell which features or convention supported {LC}18 {LC} para 4 of [Th2] is either B.S. or an illusion, legacy s/w shall deal with legacy data and ignore newer data More serious arguments about convention registry in para 5 of [Th2] CON
5A1 Informal variants [Th2]s.2.1.4 Too many informal variants are in use {LC}110 {LC}it is a lost cause (discipline) What they describe for "non-compliant VOTables" is not by chance! Build a system that even a fool can use and only a fool will want to use it! Do we want full portability or are we happy with plain interoperability? CON
5B No Unicode [Th1]s.2.1 p.2 [Th2]s.2.1.3 [BP5P]#3 7-bit ASCII excessively limited {LC}15 [BP5P]#3 just adds dot,dollar and ASCII lowercase
{LC} worth tackling in conjunction with 4A,4F
HDR
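The size limits behind items 3C, 4A1, 4A3 and 4B can be checked with a little arithmetic. The following is only an illustrative sketch: the constants (80-byte cards, 8-character names, 2880-byte blocks, TFIELDS up to 999) are those of the FITS standard, while the function names are hypothetical and invented for this example.

```python
# Illustrative arithmetic only; function names are hypothetical.
CARD_LEN = 80      # bytes per header card
NAME_LEN = 8       # keyword name field (columns 1-8)
BLOCK_LEN = 2880   # FITS block size, i.e. 36 cards
MAX_TFIELDS = 999  # current upper limit on the number of table columns


def max_string_value_len() -> int:
    """Longest string value on one card: columns 11-80 minus the two quotes."""
    value_field = CARD_LEN - NAME_LEN - len("= ")  # 70 characters
    return value_field - 2                         # 68 characters


def padded_size(nbytes: int) -> int:
    """Round a header or data unit up to a whole number of blocks."""
    return -(-nbytes // BLOCK_LEN) * BLOCK_LEN


def file_size(payload_bytes: int, header_cards: int) -> int:
    """Size of a single-HDU file holding payload_bytes of data."""
    return padded_size(header_cards * CARD_LEN) + padded_size(payload_bytes)


def indexed_name_fits(root: str, index: int) -> bool:
    """True if an indexed keyword such as TTYPEn fits the 8-char name field."""
    return len(f"{root}{index}") <= NAME_LEN
```

For instance, file_size(100, 10) shows that a 100-byte dataset with a 10-card header still occupies two full blocks, and indexed_name_fits("TTYPE", 1000) shows why columns beyond 999 do not fit the present keyword naming scheme.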
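As a possible illustration of item 4J, the sketch below encodes non-finite keyword values as quoted strings. Only the spelling 'NAN' is suggested in the table; the spellings '+INF' and '-INF' and both function names are assumptions made for this example, not a registered convention.

```python
import math

# Hypothetical spellings: only 'NAN' appears in the table above;
# '+INF' and '-INF' are assumptions made for this sketch.
_SPECIAL = {"NAN": math.nan, "+INF": math.inf, "-INF": -math.inf}


def encode_value(x: float) -> str:
    """Return value-field text for a float, quoting non-finite values."""
    if math.isnan(x):
        return "'NAN'"
    if math.isinf(x):
        return "'+INF'" if x > 0 else "'-INF'"
    # Upper-case the exponent letter (e.g. 1e-10 becomes 1E-10).
    return repr(x).upper()


def decode_value(text: str) -> float:
    """Inverse of encode_value for a keyword known to be numeric."""
    stripped = text.strip()
    if stripped.startswith("'") and stripped.endswith("'"):
        # Unknown quoted strings raise KeyError rather than being guessed.
        return _SPECIAL[stripped.strip("'").strip().upper()]
    return float(stripped)
```

Note that the reader must already know the keyword is meant to be numeric: on the card itself the value is just a string, which is where the lack of keyword typing comes in.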

Detailed (personal) comments

sax.iasf-milano.inaf.it/~lucio/FITS/NewTG/ :: original creation 2016 Sep 30 15:48:09 CEST :: last edit 2016 Sep 30 15:48:09 CEST