International Federation of Digital Seismograph Networks

Thread: Availability Service Specification - next steps

Started: 2019-02-25 22:46:44
Last activity: 2019-03-29 20:35:59
Tim Ahern
2019-02-25 22:46:44
Thanks to all of the FDSN members that responded to the previous discussion related to the Availability Service. All 18 members that responded were in favor of the Availability Service concept.

As promised, here is the Draft Web Service Specification for the fdsnws-availability service in a format similar to existing FDSN web services.
The delay in getting this to you was mine, sorry.

Can you please respond with comments to the entire list no later than March 19, 2019 which is 3 weeks from today.

Thanks in advance



Tim Ahern

Chair
FDSN WG III



  • Marcelo Belentani de Bianchi
    2019-03-01 00:27:25
    Dear Tim and other Colleagues,

    Thank you for the document. My suggestions are:

    (1) I don't think that the values 'lastestupdate', 'timespancount' and
    'restriction' should be optional. I would make them fixed. In this way the
    'show' parameter would default to 'None' and have no options in this version.

    (2) The name of the 'show' parameter suggests that it controls the columns.
    Maybe rename it to 'extracolumns' to make it clear that we are talking about
    something optional that would really change the amount of information in
    the output.

    (3) I would rename 'mergetimespans' to 'mergeoverlaps'.

    (4) I suggest adding a parameter, let's say 'mergegaptolerance', that would
    default to 0.0 (no merge) but could be set to, e.g., 5, and would then merge
    gaps smaller than 5 s. This is useful for plots and other analyses of
    complete station operation times.

    (5) On the 'orderby' parameter, I guess that there is a mistake with
    'lastestupdate' and 'latestupdate_desc'. Also, I found the options too long.

    (6) The parameter 'limit' is only useful for building interfaces to browse
    timespans, and for that case an 'offset' parameter is missing. Remember that
    it is always required to give start and end. I would add the 'offset'
    parameter for the sake of consistency. Alternatively, remove both.

    (7) Considering temporary networks, it should be clear what happens when you
    request timespans in an interval in which the network code was re-used in
    two different experiments. In this case, I would inevitably generate a GAP
    between experiments, and in the JSON I would expect a different 'datasource'
    object even though network, station, location and channel match. I.e., this
    behavior should be explicit in the document.

    This problem can also generate some other side effects. What is unique is
    the pair of Network Code + Network Start Code. Maybe use the extracolumns
    to request 'network start code' from the inventory.
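
    A minimal sketch of how the suggested gap tolerance could behave, assuming
    the input spans do not overlap (for example after overlaps have already
    been merged). The function name and argument names are hypothetical and
    are not taken from the draft specification.

```python
from datetime import datetime, timedelta

def merge_timespans(spans, tolerance_s=0.0):
    """Merge (start, end) spans separated by a gap shorter than tolerance_s.

    Hypothetical helper: with the default of 0.0 no gap is ever shorter than
    the tolerance, so non-overlapping input is returned unchanged.
    """
    tolerance = timedelta(seconds=tolerance_s)
    merged = []
    for start, end in sorted(spans):
        if merged and start - merged[-1][1] < tolerance:
            # Gap is smaller than the tolerance: extend the previous span.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

t0 = datetime(2019, 1, 1)
spans = [
    (t0, t0 + timedelta(seconds=60)),
    (t0 + timedelta(seconds=63), t0 + timedelta(seconds=120)),   # 3 s gap
    (t0 + timedelta(seconds=200), t0 + timedelta(seconds=300)),  # 80 s gap
]
for start, end in merge_timespans(spans, tolerance_s=5.0):
    print(start.isoformat(), end.isoformat())
```

    With a tolerance of 5 s the first two spans are joined and the third stays
    separate; with the default of 0.0 all three spans are kept.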

    With my kind regards,

    Marcelo Bianchi
    --
    Centro de Sismologia / IAG / USP
    http://www.moho.iag.usp.br/ ~ http://www.iag.usp.br/geofisica
    Rua do Matão, 1226, office D-211
    +55 (11) 9820-10-930 ~ +55 (11) 3091-4743




  • Catherine Pequegnat
    2019-03-14 11:26:37
    Dear all,

    Please find below some comments by the RESIF team:

    (1) We suggest unifying method naming between web services. We could keep only these methods:
    - query
    - queryauth
    - version
    - application.wadl
    and use boolean parameters to replace the other methods. For instance:
    '...query?extent=true...' to replace the extent method.

    (2) The 'limit' parameter should be renamed to 'rowlimit', which is more self-explanatory.
    (3) Rename the header fields for overall homogeneity with other web services such as the station web service, that is:
    #Network Station Location Channel Quality SampleRate Earliest Latest Updated TimeSpans
    (4) We wonder whether the main implementation version number should correspond to the specification version number.
    (5) With regard to temporary networks, we should add an option 'networkstartyear' to understand observed gaps and avoid misleading merging.
    (6) We also think (cf. Marcelo's reply) that 'lastestupdate', 'timespancount' and 'restriction' should not be optional.
    (7) We also think (cf. Marcelo's reply) that 'mergetimespans' should be renamed to 'mergeoverlaps'.
    (8) We suggest adding a parameter 'mergetolerance' with 0.0 as default value (no merge).
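
    To illustrate the temporary network point in (5), here is a small sketch
    that keys timespans by network code plus network start year, so that spans
    from two experiments sharing a code are never merged together. The
    dictionary keys, the network code 'XX' and the station names are
    hypothetical and only serve the example.

```python
from collections import defaultdict

def group_by_experiment(rows):
    """Group (start, end) pairs by (network code, network start year).

    rows: iterable of dicts with the hypothetical keys 'network',
    'networkstartyear', 'start' and 'end'.
    """
    groups = defaultdict(list)
    for row in rows:
        key = (row["network"], row["networkstartyear"])
        groups[key].append((row["start"], row["end"]))
    return dict(groups)

rows = [
    {"network": "XX", "networkstartyear": 2005, "station": "AAA",
     "start": "2005-03-01T00:00:00", "end": "2006-02-01T00:00:00"},
    {"network": "XX", "networkstartyear": 2012, "station": "AAA",
     "start": "2012-06-01T00:00:00", "end": "2013-01-15T00:00:00"},
]
for key, spans in group_by_experiment(rows).items():
    print(key, spans)
```

    Any merging would then be applied per group, which keeps the gap between
    the two experiments visible.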

    More generally (although this goes beyond the strict scope of the availability web service specification), we wonder whether it would be wise for the text output of the station and availability web services to correspond exactly to the format of the files passed as argument when using the POST method.
    Thus: station output ==> availability input, and availability output ==> dataselect input.
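
    As long as the two formats differ, a client needs a small conversion step.
    The sketch below assumes whitespace-separated text output using the header
    proposed in (3), with empty location codes already written as '--', and
    produces lines of the form 'NET STA LOC CHA START END' as used in
    dataselect POST bodies. The function name and the sample row are
    hypothetical.

```python
def availability_to_post_lines(text):
    """Convert availability-style text rows into dataselect-style POST lines."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The first line is the header, e.g. "#Network Station Location ..."
    header = lines[0].lstrip("#").split()
    wanted = ("Network", "Station", "Location", "Channel", "Earliest", "Latest")
    post_lines = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        post_lines.append(" ".join(row[name] for name in wanted))
    return post_lines

sample = """\
#Network Station Location Channel Quality SampleRate Earliest Latest Updated TimeSpans
XX ABC -- HHZ M 100.0 2019-01-01T00:00:00 2019-01-02T00:00:00 2019-01-03T00:00:00 1
"""
print("\n".join(availability_to_post_lines(sample)))
```

    If the output format matched the POST format exactly, as suggested above,
    this step would not be needed.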

    Best regards,
    Catherine Péquegnat
    RESIF-DC
    ✆ +33 4 76 63 51 37 ou +33 4 76 63 52 48
    🏢 Isterre, bureau 035
    22 rue de la Piscine
    38400 Grenoble Cedex

  • Philipp Kästli
    2019-03-29 20:35:59
    Dear colleagues,

    As I recently learned that there will probably not be a joint comment from EIDA, I am adding my personal (Swiss Seismological Service ;-) ) opinions on the Availability service specification here, hoping that they might still be useful...


    a) general considerations
    ====================
    The information on data availability is fully covered by, but differently structured in, more generic waveform metadata services like MUSTANG (IRIS) or WFCatalog (EIDA). The concepts are different: these services also return the quality information itself, rather than just providing time windows consistent with selection criteria. While the waveform metadata services can serve many purposes, they are definitely adequate to discover *waveform data adequate for my purpose*. Discovering *waveform data available at all* is just a specific sub-purpose of what can already be done with the metadata services. Rather than introducing yet another service with its specific peculiarities in request and response, I would recommend homogenizing and popularizing the waveform metadata services. If requested by the users, they (hopefully, in the future: it) can be extended by an option to return availability intervals only.

    With this, Availability as a specific service could be avoided entirely.




    Having said that, I have still some considerations on the technicalities:

    b) comments/suggestions on the query parameters for the timespan method:
    ======================================================================

    quality:
    a) This parameter is poorly defined. If the implementation is data center specific, then the user cannot learn from the service specification what the service response actually means.
    b) It overlaps in scope with existing waveform metadata services. Having different services doing similar (or: the same?) things increases load and noise in service implementations as well as in client implementations, and raises constant questions about consistency.
    Recommendation: remove this parameter from the Availability service.

    merge:
    By definition of the response format, the quality or samplerate values in the output must be wrong for parts of a merged interval. Thus, if merge=true, these values cannot be trusted any more.
    Recommendation: remove this parameter from the specification. Do not merge adjacent time intervals that differ in any parameter included in the response format.
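
    The recommended behaviour can be sketched as follows, assuming each
    interval carries 'quality' and 'samplerate' fields and times are plain
    numbers (seconds). The field names and the function name are hypothetical.

```python
def merge_compatible(spans):
    """Merge adjacent or overlapping spans only if quality and samplerate match."""
    merged = []
    for span in sorted(spans, key=lambda s: s["start"]):
        prev = merged[-1] if merged else None
        if (prev is not None
                and span["start"] <= prev["end"]
                and span["quality"] == prev["quality"]
                and span["samplerate"] == prev["samplerate"]):
            # Same properties and no gap: extend the previous interval.
            prev["end"] = max(prev["end"], span["end"])
        else:
            # Different properties or a real gap: keep as a separate interval.
            merged.append(dict(span))
    return merged

spans = [
    {"start": 0, "end": 10, "quality": "M", "samplerate": 100.0},
    {"start": 10, "end": 20, "quality": "M", "samplerate": 100.0},
    {"start": 20, "end": 30, "quality": "M", "samplerate": 50.0},
]
for span in merge_compatible(spans):
    print(span)
```

    The first two intervals are joined, while the third stays separate because
    its sample rate differs, so every reported value remains valid for the
    whole interval.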

    orderby parameter:
    The order item "timespan-range" is not entirely clear to me. I would suggest defining it as the start time of a timespan (rather than, e.g., median time or duration).

    includerestricted:
    What is the intention? To report, or not report, periods for which the *waveforms* are restricted (while the service timespan vs. timespanauth would distinguish between restricted vs. open *availability information*)?
    Having this parameter on both services would then imply that
    - for some periods, both waveforms and availability information could be restricted (availability data available on timespanauth, but only when includerestricted=true)
    - for other periods, the availability information is open (reported by timespan rather than timespanauth), but the waveforms are closed
    - and for a third kind of period, the availability information may be closed (-> timespanauth), but the waveforms open?
    OK, this could be considered a feature, but it is a complicated feature, and I doubt it is required. Further, in the third case, as there is no login to timespan and thus the user is unknown, the service has no means to tell whether there is waveform data that is generally restricted but available for this user.
    I recommend removing the parameter from the specification and assuming:
    method timespan: do not report intervals of restricted waveforms
    method timespanauth: also report intervals of restricted waveforms accessible to the current user.
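
    A minimal sketch of this simplified rule, assuming each interval knows
    whether its waveforms are restricted and which users may access them. The
    field names 'restricted' and 'allowed_users' and the user names are
    hypothetical; user=None stands for the anonymous timespan method.

```python
def visible_spans(spans, user=None):
    """Return the intervals to report for an anonymous or authenticated user."""
    result = []
    for span in spans:
        if not span["restricted"]:
            # Open waveforms are always reported.
            result.append(span)
        elif user is not None and user in span["allowed_users"]:
            # Restricted waveforms are reported only to users with access.
            result.append(span)
    return result

spans = [
    {"id": 1, "restricted": False, "allowed_users": set()},
    {"id": 2, "restricted": True, "allowed_users": {"alice"}},
]
print([s["id"] for s in visible_spans(spans)])                # timespan
print([s["id"] for s in visible_spans(spans, user="alice")])  # timespanauth
```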

    mergetimespans:
    a) Overlapping timespans but different data: this should actually not happen (unless the data differ by sampling rate or similar properties, see the discussion of the "merge" parameter). In such a case, mergetimespans=no still would not allow the user to take adequate action, e.g. decide which version (s)he wants to have (as the differing properties do not translate into potential selection parameters for waveform requests).
    b) Overlapping timespans with identical data, as well as adjacent data, should be reported as one coherent interval in any case.
    Thus, I would recommend removing this request parameter from the specification.

    show:
    Formally: rather than show=latestupdate, I would say showlatestupdate=true/false, if latestupdate is the only potential value for "show".

    I guess this is a simple attempt at something like versioning. However,
    - it is not defined for what reason this date would be modified (technical, such as file date; content-wise modification of the waveform, such as gap filling; or defined reprocessing, such as offset removal),
    - the user has no access to metadata describing the nature of the modification, and
    - the user has no possibility to return to an earlier version of the data.
    With these limitations, I do not think that this feature is useful. I would drop this parameter.

    mingap=[numeric, seconds, default 0]
    I suggest adding this parameter to define the minimum gap that is actually reported (meaning: gaps shorter than mingap would not split an interval in two).
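
    Seen from the gap side, the proposed parameter could work as in the sketch
    below, assuming sorted, non-overlapping intervals with times given as plain
    seconds. The function name is hypothetical.

```python
def reported_gaps(spans, mingap=0.0):
    """Return (gap_start, gap_end) pairs whose duration is at least mingap."""
    spans = sorted(spans)
    gaps = []
    for (_, prev_end), (next_start, _) in zip(spans, spans[1:]):
        duration = next_start - prev_end
        # Ignore touching or overlapping spans and gaps shorter than mingap.
        if duration > 0 and duration >= mingap:
            gaps.append((prev_end, next_start))
    return gaps

spans = [(0, 60), (62, 120), (300, 400)]
print(reported_gaps(spans))            # every gap is reported
print(reported_gaps(spans, mingap=5))  # the 2 s gap no longer splits an interval
```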


    Comments on the extent method:
    ==============================

    I don't see this method as generally useful. Its information content is too limited a description of the data: e.g., if you ask for the availability of data of stream XXX.hhz from January 2000 to December 2010, and you learn that data is available between 2003 and 2009 in 20 intervals, you actually don't know whether this is a triggered station which has recorded 20 events / 20 x 5 minutes in 6 years, or whether it recorded 6 full years of waveforms with 20 minor gaps of a few minutes. There is little added value compared to the start/end of a station epoch. (The mingap parameter added to the timespan method would allow more flexibility in retrieving purpose-specific information on data availability or gappiness.)
    I recommend dropping the extent method entirely.
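
    The ambiguity can be made concrete with a small worked example. It assumes
    an extent-style summary of (earliest, latest, number of intervals) and
    builds two synthetic data sets over six years, with times in seconds. All
    names and numbers are illustrative and not taken from the draft.

```python
def extent_summary(spans):
    """Return (earliest, latest, interval count) for a list of (start, end)."""
    return (min(s for s, _ in spans), max(e for _, e in spans), len(spans))

def coverage_fraction(spans):
    """Fraction of the extent that is actually covered by data."""
    earliest, latest, _ = extent_summary(spans)
    return sum(e - s for s, e in spans) / (latest - earliest)

TOTAL = 6 * 365 * 86400   # six years in seconds
STEP = TOTAL // 20        # spacing between interval starts

# Triggered station: 20 windows of 5 minutes each.
triggered = [(i * STEP, i * STEP + 300) for i in range(19)]
triggered.append((TOTAL - 300, TOTAL))

# Continuous station: 20 intervals separated by 19 gaps of 5 minutes.
continuous = [(i * STEP + (300 if i else 0), (i + 1) * STEP) for i in range(20)]

print(extent_summary(triggered) == extent_summary(continuous))
print(coverage_fraction(triggered), coverage_fraction(continuous))
```

    Both data sets produce the same summary, although one covers a tiny
    fraction of the extent and the other covers almost all of it.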


    Kind regards,

    Philipp Kästli / SED


    • John Clinton
      2019-03-29 16:42:42
      Dear Colleagues,

      We would like to clarify a comment just sent by Philipp Kästli. We recognise that the WG has already voted in favour of having the Availability Webservice and the current discussion is focusing on the Service Specifications. Hence please disregard the first comment on the general considerations.

      John, Philipp