Thread: Template for making proposed change to the miniSEED DEADLINE FOR COMMENTS MAY 31, 2016

Started: 2016-05-10 17:04:06
Last activity: 2016-07-08 14:55:04

Comments Must Be Submitted to WG II list (fdsn-wg2-data<at> by May 31, 2016

I apologize for the delay in getting this information out to everyone.

I am attaching the following items
1) The agenda for our meeting in Vienna
2) The next generation of miniSEED document - this is the straw man and is currently Version 2016-3-30
3) The rationale for the recommended changes
4) The process proposed at the EGU meeting, which is the process we will follow
5) The Excel template in which you must provide your comments and suggestions
(Note we have added a rationale for each of your comments in this version of the spreadsheet)

To remind you of the process
A. All feedback on this version of the straw man (2016-3-30) must be sent to the FDSN WGII list within 3 weeks
B. An editorial board (Angelo Strollo, Chad Trabant, Reinoud Sleeman, Tim Ahern) will review the submissions and produce a new version of the strawman
C. The new straw man version will be posted again to FDSN WG II.
D. Steps A-B-C repeat as the straw man evolves or until no new comments are received. A maximum of 4 iterations is anticipated.

So please provide your comments in the attached Excel spreadsheet and return to FDSN WG II and also to all recipients of this email. I encourage everyone to make sure they are members of WGII for future comments and contact for the FDSN.

Again my apologies for the delay.

Tim Ahern

Director of Data Services

  • Hi Tim

    I was interested in making comments, but am having trouble with the
    standardization of the input.

    First, because the strawman is a PDF, copying from it and then pasting
    into anything loses all separations between words. For example pasting
    results in things like this:
    Locationidentifier. Usedtoidentifyagroupingofchannels,forexamplefromaspecific

    Secondly, and perhaps this is just because I do not use Excel, but I
    am unable to paste (or input) multiline text into your template and
    have the text stay in a single field, i.e. the new line causes following
    lines to fill down the column. Unfortunately I am too old to learn new
    skills, so Excel remains a mystery to me. :(

    I agree that having comments show both existing and proposed language
    is a really good idea, but the mechanics feel a bit limiting. It might
    be helpful to resend the strawman as plain text for ease of copy
    paste? And would it be possible to allow submitting comments in plain
    text so long as they follow the structure of the template, ie
    something like this?


    Commenting on document version #:

    Do something big or small

    Type of Action:

    Current Wording from document:
    Location identifier. Used to identify a grouping of channels, for
    example from a specific

    New Wording:
    Location identifier. Used to identify a big or small grouping of
    channels, for example from a specific

    Big and small is good.

    Philip Crotwell
    Univ. of South Carolina

    Date of Comment:


    • Hi all,

      To help with the cut-and-paste issue that Philip raises, attached is a plain text version of the straw man document, version 2016-3-30.

      Regarding the entry of long lines of text, the cells are set to wrap lines so you should be able to continue typing as needed. To insert newlines to break paragraphs in the spreadsheet cells, either use Alt+Enter (Windows), Cmd+Option+Enter (Mac) or whatever your platform needs; alternatively you can insert ^P and the editors will understand it to be a paragraph break.


      Next generation miniSEED
      Version 2016-3-30

      Background and context
      Adopted by the FDSN in 1987, the SEED format has become and still serves as the canonical format for passive source seismic (and other) data. Data exchange, especially to end users, is commonly formatted as “full” SEED, which contains both the time series and complete metadata. For continuous data collection and archiving it is common to split the time series from the metadata. Extensions to the SEED format were adopted in 1992 to define miniSEED, the time series portion of SEED, which can be decoded independently of the supporting metadata.

      Many FDSN members recognize that the current two-character network code needs to expand. Such an expansion requires changes in both the metadata and time series components of the SEED format. With the adoption of StationXML by the FDSN, the metadata component is easily adjusted due to the extensibility of XML. The time series component, miniSEED, is a fixed-length field format and expanding the network code would render the format incompatible with the current release. Such a small, but disruptive change affords the opportunity to consider other changes to the format, allowing the FDSN to address historical issues and create a new foundation for current and future use.

      miniSEED 3, important changes
      * Expand the network code field, in coordination with equivalent StationXML changes.
        * Recommendation: 6 characters.
        * Suggested convention for temporary deployments: “xxxxYY”, where ‘x’ are unique identifiers and YY
          are the last two digits of the start year of the deployment, e.g. 16 for 2016. Temporary network codes
          will still begin with the letters X, Y, Z, or a numeral from 0-9.
      * Add a miniseed format version.
      * Add a data version.
      * Move most blockette 100, 1000 & 1001 field information (actual sample rate, byte order, record length, encoding,
      microseconds) into the fixed section of the data header.
      * Simplify the record start time encoding and include microsecond resolution.
      * Combine the 3 bit-flag fields in fixed section data header to a single byte, dropping rarely used flags.
      * Eliminate timing correction field, timing corrections must be applied to the time stamp.
      * Document forward compatibility mapping, how to convert miniSEED 2.4 to version 3.
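As a small illustration of the suggested temporary-deployment convention, the sketch below builds a six-character code of the form xxxxYY. The function name and the assumption that the four 'x' characters are uppercase letters or digits are illustrative only; the straw man does not define the allowed character set.

```python
import re

def temp_network_code(identifier: str, start_year: int) -> str:
    """Build a hypothetical 'xxxxYY' temporary network code."""
    # Assumption (not in the straw man): 'x' is an uppercase ASCII letter or digit.
    if not re.fullmatch(r"[A-Z0-9]{4}", identifier):
        raise ValueError("identifier must be 4 uppercase letters or digits")
    # From the straw man: temporary codes begin with X, Y, Z or a numeral 0-9.
    if identifier[0] not in "XYZ0123456789":
        raise ValueError("temporary codes must start with X, Y, Z or 0-9")
    # YY is the last two digits of the deployment start year.
    return f"{identifier}{start_year % 100:02d}"

print(temp_network_code("XA12", 2016))  # XA1216
```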

      miniSEED 3, changes for consideration
      * General compression encodings for fundamental sample types and opaque data
      Encoding 50: 32-bit integers, general compressor (e.g. Brotli)
      Encoding 51: 32-bit IEEE floats, general compressor (e.g. Brotli)
      Encoding 52: 64-bit IEEE floats (doubles), general compressor (e.g. Brotli)
      Encoding 100: Opaque data
      * Add CRC field for validating integrity of data payload.
      * Expand channel codes, identify more instrument types and potential combination with location.
      * Expand location identifier and disallow empty values (synonymous with all other series identifiers).
      * Fixed-point sample encoding; would need to determine a representation due to lack of standard.
      * No SEED 2.x blockettes allowed, instead allow opaque headers for arbitrary information.
      * Eliminate fixed-length field for sequence numbers. Alternatives: transport protocol or opaque headers.
      * Eliminate arbitrary % timing quality field, timing quality related bit flags remain. Further timing
      qualifiers can be in an opaque header or separate channel if needed.

      Below is a straw man miniSEED Fixed Section Data Header incorporating most of the concepts above.
      Considerations and adjustments for byte alignment should be made after the fields have been settled.

      Straw man miniSEED 3 Fixed Section Data Header

      The data record starts at the first byte. The next two bytes are ‘MS’ to indicate the format, followed by a single binary digit indicating the format version. The fixed section of the header may be followed by optional, opaque header values. The total length of the record is the length of the fixed section, plus the length of any opaque headers, plus the length of the data payload. No padding is allowed before, after or between any of the sections.

      Note  Field name                            Type  Length  Offset  Mask/Flags
      1     Record indicator (‘MS’)               A     2       -
      2     miniSEED version (3)                  B     1       -
      3     Network code                          A     ?       -       [UN]
      4     Station code                          A     5       -       [UN]
      5     Location identifier                   A     ?       -       [UN]
      6     Channel codes                         A     ?       -       [UN]
      7     Quality indicator                     A     1       -       [UN]
      8     Data version                          B     1       -
      9     Record length                         B     4       -
      10    Record start time                     B     8       -
      11    Number of samples                     B     4       -
      12    Sample rate                           B     4       -
      13    CRC-32 of data                        B     4       -
      14    Offset to data                        B     2       -
      15    Flags                                 B     1       -
      16    Sample encoding format                B     1       -
      17    Number of opaque headers that follow  B     1       -
      18    Opaque header fields                  V     V       -

      Notes for fields, all fields are mandatory:

      1 Data record indicator - “MS”.

      2 UBYTE: miniSEED header version. Set to 3 for this version.

      3 Network code. A code that uniquely identifies the network operator responsible for the data. This identifier is assigned by the FDSN. Left justify and pad with spaces (ASCII 32). Cannot be empty.

      4 Station code (see Appendix G). Left justify and pad with spaces (ASCII 32). Cannot be empty.

      5 Location identifier. Used to identify a grouping of channels, for example from a specific sensor. Left justify and pad with spaces (ASCII 32). Cannot be empty.

      6 Channel codes (see Appendix A). Cannot be empty.

      7 Quality indicator. Defined values: D (unknown), R (Raw), Q (Quality controlled), M (merged/modified).

      8 UBYTE: Data version. Start with version 1 and increase for later versions.

      9 ULONG: The record length in bytes.

      10 LONGLONG (64-bit signed integer): Start time of record, time of the first data sample. As a representation of UTC, this value is encoded as the number of microseconds since midnight 1 January 1970 UTC not including leap seconds. This is a microsecond version of Unix/POSIX time as defined by IEEE Std 1003.1, 2013 Edition (POSIX.1-2008). The mapping between separate components of a UTC time (seconds, minutes, hours, etc.) and this representation is documented in Section 4.15 of IEEE Std 1003.1, 2013 Edition, which is then scaled by 1E6 and microseconds are added to result in this representation. This time scale is continuous except for the occurrence of leap seconds, whether this value is a leap second or not is defined by bit 2 of the Flags field. When calculating time within a record, bits 2 and 3 of the Flags field should also be consulted to determine if leap seconds occurred during the record.

      11 ULONG: Number of data samples in record.

      12 FLOAT: Sample rate encoded in IEEE-754 floating point format. When the value is positive it represents the rate in samples per second, when it is negative it represents the sample period in seconds. Writers should use the negative value sample period notation for rates less than 1 samples per second to retain resolution. Set to 0.0 if no time series data is included or data is opaque.

      13 ULONG: CRC-32 value of data as defined and used in RFC 1952 (GZIP format). For non-opaque data this is the CRC value of the decoded data payload. For opaque data it is the CRC of the raw payload. If no data payload or a CRC is not possible, set this value to 0.

      14 UWORD: Offset in bytes, relative to the beginning of the record, to the beginning of encoded data. If no data payload, set this value to 0.

      15 UBYTE: Flags:
      [Bit 0] - Byte order. Set this bit to 0 to indicate least significant byte first (little endian) order and 1 to indicate most significant byte first (big endian) order. This indicates the byte order of binary header and data samples values.
      [Bit 1] - The start time occurred during a leap second.
      [Bit 2] - A positive leap second occurred during this record.
      (same as SEED 2.4 FSDH, field 12, bit 4)
      [Bit 3] - A negative leap second occurred during this record.
      (same as SEED 2.4 FSDH, field 12, bit 5)
      [Bit 4] - Time tag is questionable. (same as SEED 2.4 FSDH, field 14, bit 7)
      [Bit 5] - Clock locked. (same as SEED 2.4 FSDH, field 13, bit 5)

      16 UBYTE: A code indicating the encoding format. (same as SEED 2.4 Blockette 1000 field 3, with addition of encodings 50, 51, 52 and 100 described above)

      17 UBYTE: Total number of opaque header fields that follow the fixed section.

      18 VAR: Opaque data header fields. Each opaque header field is a variable length string, terminated by the character ‘~’ (ASCII 126). Each header may contain any data except for the terminating character. It is strongly recommended that opaque headers contain printable text. Example header values (with terminators), for illustration only, no implied usage pattern:
      “GPS~”, “TYPE=GPS~”, “FORMAT=BINEX~”, “SEQUENCE=12345~”, “FILENAME=data.bin~”,
      “FRAGMENT=15/238~”, “TIMEQUALITY=98%~”
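To make the opaque header rule in note 18 concrete, here is a minimal sketch of how a reader might split the '~'-terminated fields, given the count from field 17. The function name and the sample bytes are illustrative assumptions, not part of the straw man.

```python
def split_opaque_headers(raw: bytes, count: int):
    """Return `count` '~'-terminated header fields and the number of bytes consumed."""
    headers = []
    pos = 0
    for _ in range(count):
        # bytes.index raises ValueError if a terminator is missing.
        end = raw.index(b"~", pos)
        headers.append(raw[pos:end])
        pos = end + 1
    return headers, pos

# Three example headers followed by two bytes standing in for the data payload.
raw = b"TYPE=GPS~FORMAT=BINEX~SEQUENCE=12345~" + b"\x00\x01"
headers, used = split_opaque_headers(raw, 3)
print(headers, used)
```

Because the count is known up front, the reader stops at the last terminator and never scans into the data payload, which may itself contain the byte 126.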

      Tim Ahern

      Director of Data Services

      IRIS DMC
      • Hi all,

        To aid in the submission of proposed changes, attached is a fillable PDF that can be used as an alternative to the Excel spreadsheet. For each change proposal, copy the file, add a sensible tag to the file name and fill in the boxes.

        Submissions may be submitted in either the Excel sheet or this PDF form.


  • Hi Tim, all,

    thanks for your email and the documents concerning the IRIS proposal for changes in the
    miniSEED format that was discussed at the EGU 2016 in Vienna.

    The main motivation for the proposal for miniSEED3 (mS3) comes from the need to expand
    the current two-letter network code, simply because we are running out of available (free)
    combinations. The proposed solution in mS3 is to expand the network code to more (6, or 8)
    characters (in particular to be prepared for improved identification of temporary networks).
    Then, since such a small change would be disruptive, why not consider including other changes
    to the format as identified over the last decades?

    In my opinion, however, before entering the next step in discussing the contents of this
    proposal we must answer the question of whether the FDSN supports a disruption in the format,
    with all its implications for acquisition, operations, software and services, or whether we
    prefer a simple solution (if possible) with limited impact. This discussion did not take place
    at the EGU, or before through the mailing list, but it is extremely important to get feedback
    from WGII on this issue before the next round in the discussion on the proposal can take place.

    The question is whether we really need to change the current miniSEED format to accommodate
    the required expansion of the network code, or whether we can find a solution within the
    existing SEED format. A possible solution is to use the reserved byte in the fixed section of
    the data header and define this as the third character in the network code. When this field is
    empty the network code has 2 characters, as it always has. This would be a very simple and
    pragmatic solution, the price being that all the other issues we would possibly like to have
    cleared up remain alive. Both solutions will mark the end of dataless SEED anyway, as a 2+
    character network code will not fit the Station Blockette (50) and StationXML can be the only
    format for stations with 2+ character network codes.

    The purpose of this mail is to invite the WGII to provide feedback on the above question first,
    before the proposed process (with feedback on the straw man) can/may start. I think it is
    important to have a broad agreement within the FDSN on which step to take in the evolution of
    miniSEED, as it may have a major impact for many of us.

    Looking forward to any feedback (before end of May).

    Best regards,
    Reinoud Sleeman
    Chair FDSN WGII

    • Hi

      I would recommend caution with this idea. On the surface it appears a
      simple solution of limited impact, but I fear it would be a source of
      bugs and mistaken attribution for a long time to come. Consider a
      "new" miniseed file with network code ABC that is loaded by software
      that assumes it is an "old" style file. The data would appear to be
      from network BC and because there is no notion of a format version of
      miniseed in the header, there is no way for this older software to
      notice that something is wrong. At least with a totally new file
      format (and I would argue miniseed3 is a new file format, not a simple
      revision), there is no expectation that older systems will
      successfully read the files, and if they try, bad things will happen
      very quickly and very noticeably. With a minor change as you propose,
      there would be this expectation of still being able to use older code
      without change. And for the most part it is true, older systems would
      work, most of the time, except when they don't....and then they would
      fail in subtle ways. And therein lies the problem. In the short term
      it would be a lower level of pain, but that pain would drag on for
      decades. I would much prefer a short term disruption.
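The failure mode described above can be sketched in a few lines. The layout follows the SEED 2.4 fixed header (network code in bytes 18-19, reserved byte at offset 7); storing the leading 'A' of "ABC" in the reserved byte is an assumption made here to match the example, and the helper names are hypothetical.

```python
def build_fsdh(network: str, reserved: bytes = b" ") -> bytes:
    """Build a mostly blank 48-byte SEED 2.4 fixed header with a few fields set."""
    header = bytearray(b" " * 20 + b"\x00" * 28)
    header[0:6] = b"000001"   # sequence number
    header[6:7] = b"D"        # data quality indicator
    header[7:8] = reserved    # reserved byte (hypothetically an extra network character)
    header[8:13] = b"WLF  "   # station code
    header[15:18] = b"BHZ"    # channel code
    header[18:20] = network.encode("ascii")  # two-character network code
    return bytes(header)

def legacy_network(header: bytes) -> str:
    """What an unmodified SEED 2.4 reader sees: only bytes 18-19."""
    return header[18:20].decode("ascii")

record = build_fsdh("BC", reserved=b"A")  # intended network: "ABC"
print(legacy_network(record))             # BC
```

The old reader returns "BC" for both the extended record and a genuine "BC" record, and nothing in the header tells it that anything is wrong.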

      Use of extra bytes in the header is fine in the case where older
      systems can more or less safely ignore the new information, such as in
      the data quality indicator. But I feel that the network code is just
      too important for it to be interpreted wrongly.

      My $0.02

      • Hi Reinoud and others,

        Philip makes a very good point: most current miniSEED readers would not recognize any change and would read the incorrect network code, leading to network identification confusion.

        While a change to 3 character network codes would be easier for miniSEED readers to adapt to (compared to a much bigger change), even that change would require schema and code modifications in lots of systems. Put another way, it sounds simple, but a 3 character network code will trigger significant disruption in equipment and data handling systems (for very limited gain).

        Furthermore, I think 3 characters is insufficient to identify temporary networks. Currently temporary networks cannot be unambiguously identified by their code alone, a start year (at minimum) is required to remove ambiguity. Now is our chance to address this very common wrinkle in network identification.


        My $0.02

  • I received a request to extend the period to comment on the miniSeed straw man to June 6 and I granted this extension. Please have your comments in no later than June 6, a one week extension of the deadline.


    Tim Ahern

    Director of Data Services

    1408 NE 45th Street #201
    Seattle, WA 98105

    (206)547-0393 x118
    (206) 547-1093 FAX

    • Tim,

      I found that I was unable to enter and properly save all of the text
      for my comments in the PDF file provided by Chad for comments on the
      MiniSEED proposal.

      Therefore, I am submitting my comments in the attached text files.

      - Doug N

      Doug Neuhauser                  University of California, Berkeley
      doug<at>                        Berkeley Seismological Laboratory
      Office: 510-642-0931            215 McCone Hall # 4760
      Fax:    510-643-5811            Berkeley, CA 94720-4760
      Remote: 530-752-5615 (Wed,Fri)
      Next Generation miniSEED Change Proposal Document version 2016-5-12

      Change Description:

      Change timestamp from 8 byte longlong with required leapsecond flag to 12 byte
      MSEED3 Time Structure which can represent all timestamps to microsecond
      resolution with proper numeric values.

      Type of change: Modification

      Current wording from document:

      Record Start Time B 8

      10 LONGLONG (64-bit signed integer): Start time of record, time of the first
      data sample. As a representation of UTC, this value is encoded as the number
      of microseconds since midnight 1 January 1970 UTC not including leap seconds.
      This is a microsecond version of Unix/POSIX time as defined by IEEE Std
      1003.1, 2013 Edition (POSIX.1-2008). The mapping between separate components
      of a UTC time (seconds, minutes, hours, etc.) and this representation is
      documented in Section 4.15 of IEEE Std 1003.1, 2013 Edition, which is then
      scaled by 1E6 and microseconds are added to result in this representation.
      This time scale is continuous except for the occurrence of leap seconds,
      whether this value is a leap second or not is defined by bit 2 of the Flags
      field. When calculating time within a record, bits 2 and 3 of the Flags field
      should also be consulted to determine if leap seconds occurred during
      the record.

      [Bit 1] - The start time occurred during a leap second.

      Proposed new wording:

      Record Start Time:
      MSEED3 Time Structure (12 bytes)
      Year          B  2   Range: -32768 to 32767
      Day-of-Year   B  2   Range: 1-366
      Hour          B  1   Range: 0-23
      Minute        B  1   Range: 0-59
      Second        B  1   Range: 0-60 (including leap second)
      Unused        B  1   For alignment purposes
      Microsecond   B  4   Range: 0-999999
      10 MSEED3 Time Structure (12 bytes): Start time of record, time of the first
      data sample, as a representation of UTC with microsecond resolution.


      The current MSEED 2.4 time structure can be easily extended by 2 bytes
      to provide microseconds (0-999999). This provides a continuous time
      scale to the microsecond resolution, and does not suffer from the
      POSIX IEEE Std 1003.1, 2013 Edition timestamp which does not allow for
      leap seconds.
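For readers who want to see the proposed 12-byte structure in concrete terms, the following sketch packs and unpacks it with Python's struct module. Big-endian byte order and the helper names are assumptions for illustration; the proposal itself does not fix a byte order.

```python
import struct

# Big-endian, no implicit padding: int16 year, uint16 day-of-year,
# uint8 hour, minute, second, one unused pad byte, uint32 microsecond.
MSEED3_TIME = struct.Struct(">hHBBBxI")

def pack_time(year, doy, hour, minute, second, microsecond):
    """Pack the fields into 12 bytes after checking the proposed ranges."""
    if not (1 <= doy <= 366 and 0 <= hour <= 23 and 0 <= minute <= 59
            and 0 <= second <= 60 and 0 <= microsecond <= 999999):
        raise ValueError("field out of range")
    return MSEED3_TIME.pack(year, doy, hour, minute, second, microsecond)

def unpack_time(buf):
    """Return (year, doy, hour, minute, second, microsecond)."""
    return MSEED3_TIME.unpack(buf)

# Day 181 of 2015 is 30 June; second 60 is the leap second at the end of that day.
packed = pack_time(2015, 181, 23, 59, 60, 0)
print(len(packed), unpack_time(packed))
```

The leap second is stored directly as second 60, so no auxiliary flag is needed to recover it.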

      The proposed POSIX-style longlong int timestamp cannot represent a
      leapsecond, so the proposed standard requires an additional flag to
      indicate that the timestamp is actually during a leapsecond. In
      addition, presumably the current optional flags for "record contains a
      positive leapsecond" or "record contains a negative leapsecond" would
      now be required flags rather than advisory flags since the timestamp

      Proposing a timestamp that appears to represent time as a continuum
      where time computation can be performed with simple integer arithmetic
      will encourage users and programs to ignore leapseconds and therefore
      have timing errors of 1 second when working around a leap second.
      Given that MSEED is an archive format, we should NOT be promoting a
      time representation that is "apparently" continuous but is actually
      not and does not provide an adequate representation of time without
      the use of auxiliary bits. The proposed time representation also has
      no way to represent 2 consecutive leapseconds should that ever happen.
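The ambiguity can be demonstrated with the standard library. This is a minimal sketch of the straw man's microsecond POSIX encoding; the function name is hypothetical, and calendar.timegm is used because it applies the same leap-second-free arithmetic as POSIX time.

```python
import calendar

def posix_microseconds(year, month, day, hour, minute, second, microsecond=0):
    """Microseconds since 1970-01-01T00:00:00 UTC, not counting leap seconds."""
    seconds = calendar.timegm((year, month, day, hour, minute, second))
    return seconds * 1_000_000 + microsecond

leap = posix_microseconds(2015, 6, 30, 23, 59, 60)  # 23:59:60, a real leap second
after = posix_microseconds(2015, 7, 1, 0, 0, 0)     # the second that follows it
print(leap == after)  # True
```

Both instants map to the same integer, which is why the straw man needs a separate flag to tell them apart.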

      I cannot support the currently proposed timestamp of longlong int
      that requires up to 3 leapsecond-related flags.

      Douglas Neuhauser,
      UC Berkeley Seismological Laboratory and
      Northern California Earthquake Data Center

      Date of comment:
      Next Generation miniSEED Change Proposal Document version 2016-5-12

      Change Description:

      The CRC should represent the encoded data in the MSEED record rather than
      the decoded data.

      Type of change: Modification

      Current wording from document:

      13 ULONG: CRC-32 value of data as defined and used in RFC 1952 (GZIP
      format). For non-opaque data this is the CRC value of the decoded
      data payload. For opaque data it is the CRC of the raw payload. If
      no data payload or a CRC is not possible, set this value to 0.

      Proposed new wording:

      13 ULONG: CRC-32 value of data as defined and used in RFC 1952 (GZIP
      format). For non-opaque data this is the CRC value of the encoded
      data payload. For opaque data it is the CRC of the raw payload. If
      no data payload or a CRC is not possible, set this value to 0.


      1. It is also desirable to be able to verify the integrity of the
      CRC for a MSEED record without having to decode the data.

      2. The computation of a CRC of the encoded data in the MSEED record is well
      defined, but the computation of a CRC on the decoded data is not.
      For example, decoded STEIM1 or STEIM2 data would have a different
      byte order on little endian and big-endian systems,
      and therefore would have a different CRC.
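A short sketch of point 2: the same decoded samples, serialized in the two byte orders, give different CRC-32 values. zlib.crc32 computes the CRC-32 used by RFC 1952; the function name and the sample value are illustrative assumptions.

```python
import struct
import zlib

def crc_of_decoded(samples, byte_order):
    """CRC-32 of 32-bit integer samples serialized with struct byte order '<' or '>'."""
    raw = struct.pack(f"{byte_order}{len(samples)}i", *samples)
    return zlib.crc32(raw) & 0xFFFFFFFF

samples = [1]
little = crc_of_decoded(samples, "<")  # bytes 01 00 00 00
big = crc_of_decoded(samples, ">")     # bytes 00 00 00 01
print(f"{little:08x} {big:08x}")
```

A CRC over the encoded payload avoids the question entirely, because the bytes in the record are the same on every platform.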

      Douglas Neuhauser,
      UC Berkeley Seismological Laboratory and
      Northern California Earthquake Data Center

      Date of comment:

    • Dear Tim (and WGII members),

      as already discussed in individual messages, we (ORFEUS/EIDA) have spent
      the last 4 weeks collecting comments within our EIDA group (11
      European federated data centers) on the mseed3 proposal. In particular
      we collected a number of comments on some of the specific points in the
      straw man (i.e. #1 Expansion of the net code, #10 CRC field, #12
      location identifier, #17 variable record lengths, etc); in parallel we
      have been discussing, more on the managerial side, the implications
      that this important change will have on the operations of our data
      centers once it is approved.

      Although we may all agree that the proposed changes are needed we should
      also recognize that this is a major change for the seismological
      community. In particular, for data centers that should actively engage
      in this endeavor, the RFC process appears too fast. Having said that, we
      would kindly ask you to allow comments on this first iteration until the
      end of June. This will allow us to harmonize the technical comments we
      are collecting internally in EIDA as well as discuss further on the
      eventual implementation timeline and resources involved.
      We collected the comments, initiated the discussion at the Management
      meeting last week and we have fixed a dedicated internal technical
      discussion for next week.

      As mentioned above, considering that this is an important change for all
      FDSN data centers, we think that keeping the possibility to post
      comments for this first iteration until the end of June should not be a
      problem. This will hopefully allow a lively discussion on the mailing
      list as well as within the editorial board to ensure that different
      opinions are captured and discussed before moving to the next iteration.

      We apologize for this late request for a deadline extension, in particular
      to colleagues who submitted their comments on time according to the
      initial deadline.

      With Kind Regards,
      Angelo Strollo (on behalf of the ORFEUS/EIDA data centers)

      • Dear Tim, dear WGII Chair and dear WGII Colleagues,

        as mentioned in our previous message we have been collecting internally
        within the EIDA Management Board a number of comments on the mseed3
        proposal, and discussed them carefully within the last weeks through
        many meetings and intense internal e-mail exchange.

        From our point of view this is a major change in seismology that will
        have a notable impact on the data centre operations as well as on the
        users for the next decade. Although challenging, probably also
        premature, we tried to think about the effects of an implementation of
        the proposed changes on the data centre side and related costs as well
        as guessing the impact on our user community.

        After a long discussion at both technical and managerial level among the
        11 federated EIDA data centres in Europe we have come to the conclusion that
        the proposed change is too “expensive” for data centres as well as for
        users with respect to the real benefit that may be derived.

        Aside from the pressing issue of the network code limitation, we
        consider that the other proposed changes are ‘nice to have’ from the
        side of data managers but on balance they are not substantial additions
        that warrant the effort to make a major change to the existing
        standard. There are indeed other avenues we would like to explore. As
        an example, the European community is currently demanding more
        interoperability with communities beyond seismology, to allow
        interdisciplinary and integrated research, as well as better and
        more modular data models; we would like the next generation of seismic
        data format to be more widely adoptable.

        We all agree that a new format for seismological data is needed in the
        long term, but overall the main problem we have is with the speed
        currently proposed for this standardization process, as pointed out also
        by the WGII chair in one of the initial comments on this mailing list
        already in May. Although we appreciate the initiative and the initial
        straw-man, we think that the proposed changes are significant enough to
        require technological modifications in user software, data centre
        practice, and station-side instrumentation software, but will not
        substantially future-proof us from further change over the next decade.
        As incremental changes damage credibility and the chances of community
        uptake, we would really like to have enough time to explore carefully
        additional proposals, which cannot be well thought out and tested
        within the proposed time line.

        Therefore, in light of what is written above we would like to propose a
        different approach towards a new data format that goes through the
        following 4 steps:

        1 - Define an interim solution (SEED 2.4+?) allowing additional network
        codes using an extra blockette for extended network codes, that is
        backward compatible and can be used immediately.
        2 - Restart an FDSN-wide process gathering ideas for streaming and
        archive data formats, culminating in a dedicated meeting in late 2016.
        3 - Distribute proposal(s) to the FDSN members before the Kobe meeting.
        4 - Prepare a preliminary implementation plan for approval and adoption
        by FDSN after the Kobe meeting possibly in 2018.

        In summary, our proposal is to solve the immediate problem of network
        code allocation in a pragmatic way without adding compatibility issues,
        and in parallel start to work jointly on a well-thought-out and future-proof
        solution that will bring us towards a new data format for seismology.

        More details on our plan are appended to this email.

        We apologize for the long e-mail and for not having used the template
        for comments.

        Looking forward to hearing from you,

        The ORFEUS/EIDA* data centres




        1 - Use blockette 1002 to solve the immediate needs of additional
        network codes.
        We would propose to add another blockette 1002 to extend the network
        code as follows.

        A typical mini-SEED record looks schematically like this:

        GE_WLF__BHZ, 728437, D
        start time: 2016,180,00:00:16.599998
        number of samples: 428
        sample rate factor: 20 (20 samples per second)
        sample rate multiplier: 1
        activity flags: [00000000] 8 bits
        I/O and clock flags: [00000100] 8 bits
        [Bit 5] Clock locked
        data quality flags: [00000000] 8 bits
        number of blockettes: 2
        time correction: 0
        data offset: 64
        first blockette offset: 48
        BLOCKETTE 1000: (Data Only SEED)
        next blockette: 56
        encoding: STEIM 2 Compression (val:11)
        byte order: Big endian (val:1)
        record length: 512 (val:9)
        reserved byte: 0
        BLOCKETTE 1001: (Data Extension)
        next blockette: 0
        timing quality: 100%
        micro second: 98
        reserved byte: 70
        frame count: 7

        There is a 48-byte fixed header and a linked list of blockettes. The
        header contains pointers to the beginning of data (64) and to the first
        blockette. Each blockette has a pointer to the next blockette, which is
        0 if no more blockettes follow.
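
        The linked-list layout described above can be sketched in a few lines
        of Python. This is only an illustration, not a reference reader: it
        assumes a big-endian record (as in the dump above), and the helper
        names walk_blockettes and build_example_record are made up for this
        example. The synthetic record mimics the header values of the dump
        and carries no real waveform data.

```python
import struct


def walk_blockettes(record: bytes):
    """Follow the blockette chain of one big-endian miniSEED 2.x record.

    Returns (data_offset, [(offset, blockette_type), ...]).
    """
    # Bytes 44-45: beginning of data; bytes 46-47: first blockette offset.
    data_offset, offset = struct.unpack_from(">HH", record, 44)
    chain = []
    visited = set()
    while offset != 0:  # a next-blockette offset of 0 ends the chain
        if offset in visited or offset + 4 > len(record):
            raise ValueError(f"corrupt blockette chain at offset {offset}")
        visited.add(offset)
        # Every blockette starts with its type and the next blockette offset.
        blockette_type, next_offset = struct.unpack_from(">HH", record, offset)
        chain.append((offset, blockette_type))
        offset = next_offset
    return data_offset, chain


def build_example_record() -> bytes:
    """Build a synthetic 512-byte record resembling the dump (data left zeroed)."""
    rec = bytearray(512)
    rec[0:6] = b"728437"   # sequence number
    rec[6:8] = b"D "       # quality indicator + reserved byte
    rec[8:13] = b"WLF  "   # station
    rec[13:15] = b"  "     # location
    rec[15:18] = b"BHZ"    # channel
    rec[18:20] = b"GE"     # network
    # Start time: year, day of year, hour, minute, second, unused, 0.0001 s ticks
    struct.pack_into(">HHBBBBH", rec, 20, 2016, 180, 0, 0, 16, 0, 5999)
    # Samples, rate factor, rate multiplier, three flag bytes, blockette count,
    # time correction, data offset, first blockette offset
    struct.pack_into(">HhhBBBBiHH", rec, 30, 428, 20, 1, 0, 0x20, 0, 2, 0, 64, 48)
    # Blockette 1000: next=56, encoding 11 (Steim 2), big endian, 2**9 bytes
    struct.pack_into(">HHBBBB", rec, 48, 1000, 56, 11, 1, 9, 0)
    # Blockette 1001: next=0, timing quality, microseconds, reserved, frame count
    struct.pack_into(">HHBBBB", rec, 56, 1001, 0, 100, 98, 70, 7)
    return bytes(rec)


data_offset, chain = walk_blockettes(build_example_record())
print(data_offset, chain)
```

        The two 8-byte blockettes occupy bytes 48-63, so the header area is
        completely full before the data begins at byte 64.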

        Suppose we add another blockette (1002) and use it to extend the network
        code, for example “GEMMA” instead of “GE”. Unfortunately there is no free
        space in the header area (the two existing blockettes end exactly at
        byte 64), so we have to steal 64 bytes (1 frame) from data.

        Now a record would look like this (where “99” is a valid reserved
        network code that will be used as an indicator of extended network code
        if blockette 1002 is not supported by the reader):

        99_WLF__BHZ, 728437, D
        start time: 2016,180,00:00:16.599998
        number of samples: 367 <= 14 % less data
        sample rate factor: 20 (20 samples per second)
        sample rate multiplier: 1
        activity flags: [00000000] 8 bits
        I/O and clock flags: [00000100] 8 bits
        [Bit 5] Clock locked
        data quality flags: [00000000] 8 bits
        number of blockettes: 3 <= one blockette added
        time correction: 0
        data offset: 128 <= data offset moved by 64 bytes
        first blockette offset: 48
        BLOCKETTE 1000: (Data Only SEED)
        next blockette: 56
        encoding: STEIM 2 Compression (val:11)
        byte order: Big endian (val:1)
        record length: 512 (val:9)
        reserved byte: 0
        BLOCKETTE 1001: (Data Extension)
        next blockette: 64 <= now pointing to the new blockette
        timing quality: 100%
        micro second: 98
        reserved byte: 70
        frame count: 6 <= one frame less for data
        BLOCKETTE 1002: (Data Extension 2)
        next blockette: 0
        extended network code: GEMMA

        This approach will ensure 100% backwards compatibility without breaking
        any existing miniseed readers that implement SEED 2.4 correctly. Old
        readers would simply ignore the blockette. Although this removes 64
        bytes from data, we should keep in mind that some additional header
        space will be required anyhow with the current IRIS mseed3 proposal.
        That proposal is not finalized, and there is no guarantee that the
        mseed3 header will fit in 64 bytes.
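
        To illustrate the compatibility argument, here is a rough sketch of how
        an old and a new reader might resolve the network code. The email does
        not define the byte layout of blockette 1002, so the layout used here
        (type, next offset, then an 8-character space-padded ASCII code) is
        purely an assumption, as are the function names. Big-endian byte order
        is assumed as in the dump above.

```python
import struct

EXT_NET_BLOCKETTE = 1002  # blockette number proposed in the email
EXT_NET_FIELD_LEN = 8     # ASSUMPTION: field width is not defined in the proposal


def legacy_network_code(record: bytes) -> str:
    """What a SEED 2.4 reader sees: the 2-character code in the fixed header."""
    return record[18:20].decode("ascii").strip()


def network_code(record: bytes) -> str:
    """Prefer the extended code from blockette 1002 (hypothetical layout)."""
    (offset,) = struct.unpack_from(">H", record, 46)  # first blockette offset
    hops = 0
    while offset != 0 and hops < 16:  # hop limit guards against looping chains
        blockette_type, next_offset = struct.unpack_from(">HH", record, offset)
        if blockette_type == EXT_NET_BLOCKETTE:
            raw = record[offset + 4 : offset + 4 + EXT_NET_FIELD_LEN]
            return raw.rstrip(b" \x00").decode("ascii")
        offset = next_offset
        hops += 1
    return legacy_network_code(record)  # no blockette 1002: plain 2.4 record


def build_record(extended: bool) -> bytes:
    """Synthetic 512-byte record; only the fields used above are filled in."""
    rec = bytearray(512)
    rec[18:20] = b"99" if extended else b"GE"
    rec[39] = 3 if extended else 2  # number of blockettes
    struct.pack_into(">HH", rec, 44, 128 if extended else 64, 48)
    struct.pack_into(">HHBBBB", rec, 48, 1000, 56, 11, 1, 9, 0)
    struct.pack_into(">HHBBBB", rec, 56, 1001, 64 if extended else 0,
                     100, 98, 70, 6 if extended else 7)
    if extended:
        struct.pack_into(">HH8s", rec, 64, EXT_NET_BLOCKETTE, 0, b"GEMMA   ")
    return bytes(rec)


old_style = build_record(extended=False)
new_style = build_record(extended=True)
print(legacy_network_code(new_style), network_code(new_style), network_code(old_style))
```

        A reader that does not know blockette 1002 still parses the record and
        reports the reserved code “99”, which signals that an extended code is
        present.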

        This is a “light” change that will allow the FDSN to solve the issue of
        the network codes for the moment and give us the possibility to start an
        extended discussion on the new data format.

        Storage growth would be about 16 % (instead of 100 TB, we would need
        116 TB), assuming 512-byte record size. On the positive side, real-time
        latency would decrease by a similar amount, since each record fills
        about 14 % sooner.
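
        The figures above follow from the payload sizes in the two example
        records. A small back-of-the-envelope check, assuming 512-byte records
        and the data offsets 64 and 128 from the dumps (the function names are
        only for illustration):

```python
RECORD_LEN = 512  # bytes, as assumed in the email


def payload_bytes(data_offset: int, record_len: int = RECORD_LEN) -> int:
    """Bytes left for compressed data after the header and blockettes."""
    return record_len - data_offset


def storage_growth(old_offset: int, new_offset: int) -> float:
    """Relative increase in archive size for the same amount of data."""
    return payload_bytes(old_offset) / payload_bytes(new_offset) - 1.0


def payload_loss(old_offset: int, new_offset: int) -> float:
    """Fraction of data capacity lost per record."""
    return 1.0 - payload_bytes(new_offset) / payload_bytes(old_offset)


growth = storage_growth(64, 128)  # 448 / 384 - 1
loss = payload_loss(64, 128)      # 1 - 384 / 448
sample_loss = 1.0 - 367 / 428     # sample counts from the two example dumps

print(f"storage growth: {growth:.1%}")
print(f"payload lost per record: {loss:.1%}")
print(f"fewer samples in the example record: {sample_loss:.1%}")
```

        The growth in storage (one sixth) and the loss per record (one seventh)
        are two views of the same 64 bytes, which is why the email quotes both
        16 % and 14 %.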

        2 - Having removed the time pressure, start a process where FDSN members
        are encouraged to propose future-proof ideas that will address mseed
        format shortcomings, focusing not only on offline storage but also
        on real-time, low-latency streaming (if indeed they should remain
        coupled). We would also prefer a tighter coupling between the new
        waveform format and the current metadata standard, StationXML. We
        propose to kick off this process at a meeting that can be organized
        at AGU in December or, preferably, at a dedicated meeting that
        we can host in Europe between September and November this year. During
        our discussion in Europe on the new data format we are considering
        something similar to video streaming formats and/or OGC standards
        ( We are ready to commit some
        resources on this project in order to get an initial proposal ready to
        be discussed in a dedicated meeting later this year. This approach will
        have the following advantages:

        - no introduction of a new, incompatible (likely short-lived) format
        without providing major technical progress;
        - follow a long-term strategy capable of covering future needs;
        - better coordination and separation of data and metadata, e.g. remove
        information from the data header that is already described in
        StationXML, making the data format simpler;
        - address shortcomings with real-time data streaming;
        - gain time and freedom to think more broadly, including the adoption of
        generic data standards with a scope beyond just seismology.

        3 - Send the proposal(s) to the FDSN members before the IASPEI/Kobe
        meeting for comments and discuss the proposal(s), timeline for
        implementation and implications at the meeting and afterwards if
        necessary. Of course, in order to demonstrate feasibility, some
        prototypes should be designed, ideally before the Kobe
        meeting. Feedback from end users and instrument manufacturers
        should be actively sought by advertising through FDSN-level
        presentations at AGU and EGU in advance of Kobe.

        4 - Prepare a preliminary implementation plan and define the
        deadlines for approval and adoption by the FDSN accordingly; these
        should fall after the Kobe meeting, possibly in 2018.


        With Kind Regards,
        Angelo Strollo (on behalf of the ORFEUS/EIDA data centers)

        Tim Ahern

        Director of Data Services

        IRIS DMC
  • Dear Tim & Chad

    You guys do great work. We want to keep supplying the best and most complete content we can to make that possible.
    Our general position is that we make a considerable effort to record what happens, as it happens, and we
    think this information should be kept together in a form that is documented and published so that the data can be fully
    interpreted long after we're gone. The data format should be as simple as it needs to be, and not simpler.

    One of the more frequent support questions we have dealt with over the years is related to what customers see as
    unexplained effects when they review data in display tools that do not interpret time quality. When you review data
    including the status of timing quality, the answer becomes immediately obvious if the time quality is bad. Few understand
    this subtlety, and many insist on discarding the time quality information. It's an unfortunate, and unnecessary, mistake to do so.
    Recording less information packaged with the data would not be an advance for miniseed 3.

    We’ve tried to capture all of our thoughts in the attachment where they differ from the strawman.
    Attached please find our document as advised. There are 13 tabs including the Instructions and Example tabs.

    Kind regards,

    Dr. Edelvays Spassov
    Sales Manager
    Kinemetrics, Inc
    222 Vista Avenue
    Pasadena, CA 91107
    Phone: 626-795-2220
    Fax: 626-795-0868

    Tim Ahern

    Director of Data Services

