Data-MINX


Poster by: Peter Turner, University of Sydney, Paul Coddington, Australian Research Collaboration Service, Nick Hauser, ANSTO, Allan Jones, AMMRF and University of Sydney & Richard Farnsworth, Australian Synchrotron Facility.

The data deluge, the growing need for repository services, and the increasingly diverse instrument portfolio held by scientific researchers pose a significant challenge for the provision of research-enabling e-Infrastructure. The Data-MINX project is an NCRIS-NeAT supported project to provide Single-Sign-On data access, management and repository services for the NCRIS 5.3 capabilities of microscopy, neutron and X-ray science. Data-MINX aims to provide a capability to flexibly search, access and transfer data across the set of research facilities and repositories used by a given researcher. The Data-MINX e-Infrastructure is based in part on a system developed over several years by the UK Science and Technology Facilities Council (STFC). A uniform model is applied across the NCRIS 5.3 capabilities, with a central component being a metadata-schema-based Information Catalogue (ICAT) that holds metadata records for the research activity at each facility. An ICAT API exposes its methods via Web services to facilitate the development and use of customised stand-alone applications or Web browser clients. The scientific instrument data may be held in conventional or distributed file systems (e.g. SRB), with browser services allowing a user to locate data and transfer it from one location to another. Metadata is harvested as close to the source as is practical, and services allow metadata searching across multiple instances of the ICAT, effectively federating the ICAT instances at different facilities. The ICAT is defined with respect to the Scientific Metadata Model, developed by the STFC, and may be mapped onto any SQL database system. Instrument data may be held on the national data fabric being developed by ANDS and ARCS, and ARCS services will provide collaboration tools for researchers. Services are to be tailored according to the varying data policies of the research facilities and the requirements of the research communities.
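
The federated metadata search described above can be illustrated with a minimal Python sketch. It is not the ICAT API: the Record and Catalogue classes, their field names, the federated_search function, and the sample facilities and file paths are all hypothetical placeholders. It assumes only that each facility runs its own catalogue and that a client sends the same query to every instance and merges the results.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    """One metadata record. Field names are illustrative, not the ICAT schema."""
    facility: str
    investigation: str
    instrument: str
    data_location: str


@dataclass
class Catalogue:
    """Hypothetical stand-in for the metadata catalogue run by one facility."""
    facility: str
    records: list = field(default_factory=list)

    def search(self, keyword):
        """Return records whose investigation or instrument mentions keyword."""
        keyword = keyword.lower()
        return [
            r for r in self.records
            if keyword in r.investigation.lower()
            or keyword in r.instrument.lower()
        ]


def federated_search(catalogues, keyword):
    """Send the same query to every catalogue and merge the hits."""
    hits = []
    for catalogue in catalogues:
        hits.extend(catalogue.search(keyword))
    return hits


# Placeholder data: two imaginary facilities, each with its own catalogue.
catalogues = [
    Catalogue("facility-a", [
        Record("facility-a", "Protein crystal study",
               "X-ray diffractometer", "/data/a/001"),
        Record("facility-a", "Alloy fatigue survey",
               "X-ray diffractometer", "/data/a/002"),
    ]),
    Catalogue("facility-b", [
        Record("facility-b", "Protein membrane imaging",
               "electron microscope", "/data/b/001"),
    ]),
]

# One query returns matching records from both facilities, each carrying the
# location of the underlying instrument data.
for hit in federated_search(catalogues, "protein"):
    print(hit.facility, hit.investigation, hit.data_location)
```

In this sketch the catalogues hold only metadata and a pointer to where the instrument data lives, so a transfer service could act on data_location without the search layer ever touching the data files themselves.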