Data File Format
- File Size
- Definition: For a given dataset, the comparative size (in bytes) of the transport package.
- Compression
- Definition: Does the transport format support compression and decompression of encapsulated data using standard open compression formats?
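A minimal sketch of the round trip this requirement implies, assuming gzip as the open compression format. The function names and the sample payload are hypothetical and chosen only for illustration; comparing the two byte counts also gives a rough feel for the File Size criterion.

```python
import gzip

def pack(payload: bytes) -> bytes:
    """Compress an encapsulated payload with gzip (an open, documented format)."""
    return gzip.compress(payload)

def unpack(package: bytes) -> bytes:
    """Recover the original payload bytes from a compressed package."""
    return gzip.decompress(package)

# Hypothetical tabular payload; repetitive content like this compresses well.
payload = b"USUBJID,VISIT,RESULT\n" + b"SUBJ-001,WEEK 1,5.4\n" * 200
package = pack(payload)

print(len(payload), "bytes raw;", len(package), "bytes compressed")
```

The receiver must get back exactly the bytes that were sent, so `unpack(pack(x)) == x` is the property worth testing for any candidate format.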
- Encryption
- Definition: Does the transport format support the encryption of encapsulated data using industry standard algorithms (including PKI)?
- Digital Signature
- Definition: The transport format will support the application of one or more digital signatures to an encapsulated dataset.
- Data integrity
- Definition: The transport format will support a hash or checksum function to mitigate unexpected data changes
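To illustrate the data integrity requirement, here is a minimal sketch assuming SHA-256 as the hash function; the requirement itself does not name an algorithm, and the function names are hypothetical. The sender computes a digest over the payload bytes and the receiver recomputes it on arrival.

```python
import hashlib

def checksum(payload: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()

def verify(payload: bytes, expected: str) -> bool:
    """Return True if the payload still matches the digest computed by the sender."""
    return checksum(payload) == expected

# Hypothetical payload used only for illustration.
sent = b"SUBJ-001,WEEK 1,5.4\n"
digest = checksum(sent)

print(verify(sent, digest))            # unchanged payload
print(verify(sent + b"x", digest))     # payload altered in transit
```

A plain hash like this detects unexpected changes but says nothing about who produced the file; that is what the separate Digital Signature requirement covers.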
- Schema driven
- Definition: The transport format should support a schema to ensure that a data transport file will be well-formed and valid.
- Well-defined Metadata
- Definition: The transport format will support a set of well-defined metadata tags that allow effective communication of encapsulated data between sender and receiver.
- Examples: Encryption used, record number, subject UUID, etc.
- A use case for this would be partitioning a study's datasets into subject-level transfers with enough metadata to reconstitute the original study
- Sending partial datasets for a subject
- Incremental or cumulative data transfers
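The partitioning use case above can be sketched as follows. This is an illustration under stated assumptions, not a proposed format: the metadata keys (`study_id`, `subject`, `part`, `total_parts`, `record_count`) and both function names are hypothetical.

```python
from collections import defaultdict

def partition_by_subject(study_id, records):
    """Split study records into one package per subject, each carrying metadata."""
    by_subject = defaultdict(list)
    for rec in records:
        by_subject[rec["subject"]].append(rec)
    total = len(by_subject)
    return [
        {
            "metadata": {
                "study_id": study_id,
                "subject": subject,
                "part": i,
                "total_parts": total,
                "record_count": len(rows),
            },
            "records": rows,
        }
        for i, (subject, rows) in enumerate(sorted(by_subject.items()), start=1)
    ]

def reconstitute(packages):
    """Rebuild the study dataset, refusing to proceed if a package is missing."""
    if not packages:
        return []
    total = packages[0]["metadata"]["total_parts"]
    parts = {p["metadata"]["part"] for p in packages}
    if parts != set(range(1, total + 1)):
        raise ValueError("missing or unexpected packages")
    records = []
    for p in sorted(packages, key=lambda p: p["metadata"]["part"]):
        if len(p["records"]) != p["metadata"]["record_count"]:
            raise ValueError("record count mismatch")
        records.extend(p["records"])
    return records

# Hypothetical sample records.
records = [
    {"subject": "S2", "visit": 1, "value": 7.1},
    {"subject": "S1", "visit": 1, "value": 5.4},
    {"subject": "S1", "visit": 2, "value": 5.9},
]
packages = partition_by_subject("STUDY-X", records)
```

The point of the sketch is that the receiver can tell from the metadata alone whether it holds the complete study, which is what makes partial and incremental transfers safe to reassemble.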
- Wide Payload Support
- Definition: The transport format may support the transfer of a wide range of well-defined payloads beyond the data currently described well by tabular data structures.
- Examples:
- Image data, DICOM data, WAVEform, RDF [Ask Armando]
- Protocol (electronic)
- Statistical Analysis Plan (electronic)
- Relationship Data
- Definition: The transport format will support meaningful relationships between data.
- Example: Replace RELREC with metadata-laden links that express the relationship between clinical observations and histopathology findings.
- Partial Data Transfers
- Definition: The transfer format should support the transmission of subsets of data in a meaningful fashion.
- Examples: (this should be linked with the well-defined metadata)
- Transmitting data on a subject level
- Transmitting all data for a given time period across multiple subjects on request
- Transmitting incremental datasets
- Must be an Open Standard
- Definition: The full transport format specification is freely available, well documented, and may be used freely without a license. All supporting materials (e.g. schemas, documents) will be available without cost.
- Should support multibyte character encodings
- Definition: The transport format preserves the fidelity of captured source data in transmission without requiring translation or transcoding. The encoding of a transport file should be declared by the format. Restrict support to UTF-8 encoding.
- Example:
- Should support submissions in Kanji for Japanese studies
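A minimal sketch of what lossless multibyte transport could look like, assuming a JSON payload serialized as UTF-8. The `encoding` field and the record layout are hypothetical; the sample term 頭痛 is the Japanese word for headache.

```python
import json

# Hypothetical record containing a Japanese verbatim term.
record = {"subject": "S1", "term": "頭痛"}

# ensure_ascii=False keeps the characters as-is instead of \uXXXX escapes;
# the declared encoding travels with the data.
package = json.dumps(
    {"encoding": "utf-8", "records": [record]},
    ensure_ascii=False,
).encode("utf-8")

# The receiver decodes using the same encoding; no transcoding step is involved.
received = json.loads(package.decode("utf-8"))
print(received["records"][0]["term"])
```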
- Audit records
- Definition: The transport format should support the transport of audit data/metadata.
- Example:
- The CRF-level audit trail should be able to be transported as part of an end-to-end submission
- Something similar to the capability present in the ODM
- Traceability and Provenance
- Definition: The transport format should support the transport of traceability data and metadata to establish data provenance.
- Example:
- For a given data value in a submission analysis dataset, it will be possible to trace back to the original source of the data.
- Transmit data and metadata
- Definition: It will be possible to transfer both data and metadata in the same transport file.
- Example:
- In a given data transfer, incorporate both the data and the metadata, and link each data element to its corresponding metadata
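One way to picture a single file that carries both data and metadata is sketched below. The structure, the `item_ref` key, and the item identifiers are all hypothetical; the sketch only shows data elements pointing at metadata definitions in the same package and a check that every link resolves.

```python
import json

# Hypothetical transport file: metadata definitions and data in one package.
transport = {
    "metadata": {
        "items": {
            "IT.SYSBP": {"label": "Systolic Blood Pressure", "type": "integer", "unit": "mmHg"},
            "IT.DIABP": {"label": "Diastolic Blood Pressure", "type": "integer", "unit": "mmHg"},
        }
    },
    "data": [
        {"subject": "S1", "item_ref": "IT.SYSBP", "value": 120},
        {"subject": "S1", "item_ref": "IT.DIABP", "value": 80},
    ],
}

def resolve(transport, datum):
    """Follow the link from a data element to its metadata definition."""
    return transport["metadata"]["items"][datum["item_ref"]]

def unresolved_refs(transport):
    """List item references in the data that have no metadata definition."""
    items = transport["metadata"]["items"]
    return [d["item_ref"] for d in transport["data"] if d["item_ref"] not in items]

# Serializing and parsing again leaves the package unchanged.
roundtrip = json.loads(json.dumps(transport))
print(resolve(roundtrip, roundtrip["data"][0])["label"])
```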