Quick Summary

Bronson assessed and mapped 22 distinct scientific data storage systems used across a large Canadian research organization, evaluating each against five data lifecycle storage types.

The engagement produced a storage classification framework defining five storage types, from portable collection storage through to long-term archival, each with explicit definitions, technology options, and storage volume requirements.

A comparative assessment matrix evaluated all 22 systems across dimensions including suitability by storage type, security level, cost profile, future outlook, and regional accessibility.

A decision flowchart was developed to guide research teams through storage selection decisions based on data type, lifecycle stage, and sharing requirements.

The engagement addressed storage needs spanning field collection to public open data publication, covering the full scientific data lifecycle.

Project Overview

A large Canadian research organization with scientific programs spanning forestry, geomatics, genomics, climate science, and related disciplines manages a complex and fragmented scientific data storage environment. Research teams across the organization use a diverse mix of local, networked, cloud-based, and high-performance computing storage systems, many of which have evolved organically to meet immediate needs rather than as part of a coherent organizational data strategy. This fragmentation creates challenges for data findability, accessibility, interoperability, and long-term reuse, particularly as the organization moves toward adopting FAIR data principles across its scientific programs.

The organization engaged Bronson to develop a structured classification framework for scientific data storage, mapping the full landscape of available storage systems to the five key stages of the scientific data lifecycle: portable collection storage, robust and scratch processing storage, sharing and collaboration storage, open data and publication storage, and long-term archival storage. The engagement required both a comprehensive inventory of available systems and a practical assessment of how each system performs across these storage categories, giving research teams and data managers a clear, evidence-based reference for storage decision-making.

The work produced three coordinated deliverables: a storage type framework defining each lifecycle stage, a comparative assessment matrix covering all 22 storage systems, and a decision flowchart for guiding teams through storage selection.

The Challenge

Mapping a fragmented, multi-system scientific storage environment to a coherent classification framework required navigating significant technical, organizational, and policy complexity.

  • Diverse and overlapping storage landscape. The organization’s storage environment spans locally managed network-attached storage, shared supercomputing infrastructure, approved cloud platforms, external collaboration networks, open data portals, portable field devices, and specialized systems for specific research domains such as genomics. No single framework existed to describe how these systems relate to each other or to the stages of scientific data work.
  • Highly variable system characteristics. The 22 systems assessed differ substantially across security classification, cost structure, management model, regional accessibility, and future outlook. Some systems are stable long-term infrastructure; others are funded through time-limited programs or are actively being phased out. Producing a framework that accurately captured this variation required careful system-by-system analysis rather than broad generalizations.
  • Bandwidth and regional access constraints. Research teams at regional science centers face persistent bandwidth limitations that affect the usability of centrally managed systems, including high-performance computing infrastructure. Any framework that ignored these constraints would be of limited practical value to the scientists who need it most.
  • Security and classification requirements. Different storage systems support different data security levels, from unclassified only through to Protected classification. Scientific datasets vary in sensitivity, and the framework needed to reflect these constraints clearly to prevent inappropriate storage decisions.
  • Supporting field-to-publication data flows. Scientific data in this organization originates in the field (from portable devices and laptops), moves through processing and analysis environments, is selectively shared with external collaborators, and ultimately may be published to open data platforms or archived for future use. The framework needed to support this full lifecycle rather than addressing only one or two stages in isolation.
  • Communicating complexity to non-specialist audiences. The outputs needed to be usable by research scientists, data managers, and program leads without deep technical expertise in storage infrastructure, requiring plain-language definitions, clear visual tools, and a practical decision framework rather than a purely technical inventory.

The organization needed a framework that was both analytically rigorous and practically accessible, capable of guiding real storage decisions across a diverse scientific community.

Our Solution

Bronson structured the engagement around four coordinated workstreams: defining the storage classification framework, conducting the comparative system assessment, developing the decision flowchart, and visualizing the storage landscape.

1. Storage Type Framework Development

Bronson defined five distinct scientific data storage types corresponding to the stages of the scientific data lifecycle. Each type was defined with a plain-language description, storage volume requirements, and a set of technical and operational characteristics that distinguish it from adjacent types. The five types are:

  • Portable and Collection Storage – Temporary field storage and raw data lakes for initial transfer into research labs.
  • Robust and Scratch Storage – Processing and analysis space for completed datasets, including temporary scratch space near HPC resources.
  • Sharing and Collaboration Storage – Controlled external sharing with partners and collaborators, with permission management and retention limits.
  • Open Data and Publication Storage – Public-facing storage for final scientific products including datasets, publications, and spatial data.
  • Archival Storage – Long-term storage of legacy scientific datasets with metadata, inventorying, cataloguing, and FAIR-aligned management features.

2. Comparative System Assessment

Bronson assessed all 22 storage systems against the five storage types, producing a structured matrix that rates each system’s suitability for each storage category (optimal fit, moderate fit, suboptimal fit, or requiring further investigation). For each system, the assessment documented its purpose and technical characteristics, the managing organization, future outlook, regional accessibility constraints, estimated cost profile, and data security classification level. The systems assessed include network drives, the General Purpose Science Cluster (GPSC) and its collaboration variant (GPSCC), approved cloud platforms, network-attached storage (NAS and Isilon), CANARIE and the science network, the Genomics Research and Development Initiative network, external storage devices, local workstations, the Open Data Portal, a national geospatial data portal, FTP services, Datahub, GitHub, cloud collaboration tools, object storage variants, and the Digital Research Alliance of Canada.
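One way to picture the matrix is as a list of records, one per system, each carrying the documented attributes plus a suitability rating per storage type. The Python sketch below is illustrative only: the class and function names are assumptions, and the two rows are placeholders with invented values, not ratings from the actual assessment.

```python
from dataclasses import dataclass, field
from enum import Enum


class StorageType(Enum):
    PORTABLE_COLLECTION = "Portable and Collection Storage"
    ROBUST_SCRATCH = "Robust and Scratch Storage"
    SHARING_COLLABORATION = "Sharing and Collaboration Storage"
    OPEN_DATA_PUBLICATION = "Open Data and Publication Storage"
    ARCHIVAL = "Archival Storage"


class Fit(Enum):
    OPTIMAL = "optimal fit"
    MODERATE = "moderate fit"
    SUBOPTIMAL = "suboptimal fit"
    INVESTIGATE = "requires further investigation"


@dataclass
class SystemAssessment:
    """One row of the matrix (field names are hypothetical)."""
    name: str
    managing_organization: str
    future_outlook: str
    regional_accessibility: str
    cost_profile: str
    security_level: str
    suitability: dict = field(default_factory=dict)  # StorageType -> Fit


def systems_with_fit(matrix, storage_type, fit=Fit.OPTIMAL):
    """Return the names of systems rated `fit` for `storage_type`."""
    return [row.name for row in matrix if row.suitability.get(storage_type) is fit]


# Placeholder rows: values are invented for illustration only.
matrix = [
    SystemAssessment(
        name="Example System A",
        managing_organization="(placeholder)",
        future_outlook="stable",
        regional_accessibility="limited by bandwidth",
        cost_profile="low",
        security_level="unclassified only",
        suitability={
            StorageType.ROBUST_SCRATCH: Fit.OPTIMAL,
            StorageType.ARCHIVAL: Fit.SUBOPTIMAL,
        },
    ),
    SystemAssessment(
        name="Example System B",
        managing_organization="(placeholder)",
        future_outlook="time-limited funding",
        regional_accessibility="good",
        cost_profile="moderate",
        security_level="Protected",
        suitability={
            StorageType.ARCHIVAL: Fit.OPTIMAL,
            StorageType.ROBUST_SCRATCH: Fit.INVESTIGATE,
        },
    ),
]

print(systems_with_fit(matrix, StorageType.ARCHIVAL))  # ['Example System B']
```

Keeping the ratings in a per-system mapping makes it straightforward to query by storage type, or to filter on attributes such as security level, without restructuring the data.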

3. Decision Flowchart Development

Bronson developed a visual decision flowchart to guide research teams and data managers through storage selection decisions. The flowchart maps data type and lifecycle stage to the appropriate storage category, and from there to the specific systems best suited to that category. The flowchart is structured to be usable without technical expertise, providing clear branching logic from data collection through to archival or open data publication.
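The two-step logic described above, from lifecycle stage and sharing requirements to a storage category and then to candidate systems, can be sketched in a few lines of Python. The branching shown here is a simplified assumption rather than the flowchart's actual decision points, and the stage labels, function names, and system names are hypothetical placeholders.

```python
# Candidate systems per storage category (placeholder names, not real ratings).
SYSTEMS_BY_TYPE = {
    "Portable and Collection Storage": ["Example System A"],
    "Robust and Scratch Storage": ["Example System B"],
    "Sharing and Collaboration Storage": ["Example System C"],
    "Open Data and Publication Storage": ["Example System D"],
    "Archival Storage": ["Example System E"],
}


def select_storage_type(stage, share_externally=False, publish_openly=False):
    """Map a lifecycle stage and sharing needs to a storage category.

    `stage` is one of "collection", "processing", or "finished"
    (hypothetical labels chosen for this sketch).
    """
    if stage == "collection":
        return "Portable and Collection Storage"
    if stage == "processing":
        return "Robust and Scratch Storage"
    if stage == "finished":
        if publish_openly:
            return "Open Data and Publication Storage"
        if share_externally:
            return "Sharing and Collaboration Storage"
        return "Archival Storage"
    raise ValueError(f"unknown lifecycle stage: {stage!r}")


def recommend(stage, share_externally=False, publish_openly=False):
    """Return the storage category and its candidate systems."""
    storage_type = select_storage_type(stage, share_externally, publish_openly)
    return storage_type, SYSTEMS_BY_TYPE[storage_type]


print(recommend("finished", publish_openly=True))
```

In practice the candidate list for each category would come from the assessment matrix, so the flowchart and the matrix stay consistent as systems are added or retired.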

4. Storage Landscape Visualization

Bronson produced a diagrammatic overview of the full storage landscape, presenting the five storage types as a coherent progression from collection through to archival and open data, with specific system options mapped to each stage. This visual was designed to serve as a quick-reference orientation tool for scientists and data managers encountering the framework for the first time.

Key Deliverables

  • Scientific Data Storage Type Framework – A structured definition of five scientific data storage types (portable/collection, robust/scratch, sharing/collaboration, open data/publication, and archival), each with plain-language definitions, storage volume requirements, operational characteristics, and illustrative technology examples.
  • 22-System Comparative Assessment Matrix – A comprehensive matrix evaluating all 22 identified storage systems across suitability by storage type, system summary, managing organization, future outlook, regional accessibility, cost profile, and data security classification level.
  • Storage Decision Flowchart – A visual decision support tool guiding research teams from data type and lifecycle stage to appropriate storage system selection, structured for use by non-specialist audiences.
  • System-by-System Suitability Assessments – Detailed narrative assessments for each of the 22 storage systems across all five storage categories, documenting strengths, limitations, cost considerations, and constraints relevant to scientific data use cases.

The Impact

Bronson’s framework gave the organization’s research community its first clear, structured, and evidence-based reference for scientific data storage decisions.

  • The five-type storage classification gave research teams and data managers a shared vocabulary for describing data storage needs across the scientific lifecycle, replacing ad hoc system selection with a principled framework aligned to organizational data management objectives.
  • The 22-system comparative matrix consolidated dispersed and often informal knowledge about the organization’s storage landscape into a single authoritative reference, surfacing gaps, constraints, and risks (such as systems with uncertain future outlooks or unclassified-only security levels) that had not been systematically documented.
  • The decision flowchart made the framework immediately actionable for scientists and program leads without technical backgrounds, providing a practical tool for storage selection at the point of need rather than requiring consultation with data specialists.
  • The assessment explicitly documented regional accessibility constraints, giving program managers and infrastructure planners evidence to support investment decisions related to bandwidth and connectivity for regional science centers.
  • The framework positioned the organization to make consistent, defensible storage investment decisions as its data infrastructure continues to evolve, with a clear baseline against which future system changes and additions can be assessed.

The engagement reflects Bronson’s ability to bring analytical structure to complex, multi-system technology environments and translate that structure into practical tools that research organizations can act on. By grounding the framework in the full scientific data lifecycle and aligning it explicitly with FAIR principles, Bronson delivered not just an inventory but a durable decision-making asset for the organization’s scientific data management program.
