Stephen McMahon: The ARCS Data Fabric


View the presentation slides PDF (555 Kb)

Abstract

The Australian Research Collaboration Service (ARCS) Data Fabric is a data storage and collaboration service with nodes throughout the country. Researchers and other end users can easily and transparently obtain an initial quantity of storage without having to invest financially or technically. Access is currently possible through either command-line tools or cross-platform GUI applications; web interfaces are under development. Researchers who require more or different storage than the free offering will be able to negotiate additional storage or storage features, which will be made available to them as part of the data fabric. Data can be replicated to other sites as a permanent backup, shared with authorised collaborators in other states or countries, or cached for a computation workflow. Data stored in the data fabric can also be used as part of a workflow with other ARCS services such as compute job submission.

The intention is that some researchers will find such a service meets their immediate needs while others find it whets their appetite sufficiently for them to engage ARCS to deploy or develop further services for their specific use. From an efficiency point of view, ARCS would like end user groups to be able to use "out of the box" services but fully recognises that such an approach is unlikely to meet everyone's needs. The data fabric is implemented as a set of production services which undergo continual testing and monitoring to provide high availability to end users.

New features are continuously being planned, developed and implemented. One such feature is managed file transfers, which ensure that the use of filesystem, network interface and network resources is optimised when data is transferred across the data fabric.

This session will provide an overview of the ARCS data fabric, its present and future features, and some examples of how researchers from a variety of disciplines are using it.

About the speaker

Stephen McMahon is a member of staff at the ANU Supercomputer Facility, where he provides data services for large data projects. He is also the assistant manager of the ARCS data services team. He has been working with data and grid activities for several years. Stephen was integral to the policy and development of data services under the APAC Grid program. He helped run an SRB workshop in Tsukuba, Japan several years ago and has attended numerous other international SRB workshops. Stephen has also worked in scientific software development and installation during his time at the ANU Supercomputer Facility. He holds a Masters in Aerospace Engineering from UNSW, ADFA, in the field of computational fluid dynamics, and a Bachelor of Science with honours in physics from ANU.