Records Management as-a-service (RMaaS)

Client: Department of Finance
Domain: Whole-of-government (WoG) shared services, machine learning (ML)
Period: Three months
Role: Prime Contractor

A whole-of-government records management platform that leveraged machine learning and microservices to automate classification at massive scale.

Challenge

The Department of Finance identified records management as a target for a single whole-of-government solution. Commonwealth agencies register approximately 230 million new records per year and hold about 1 petabyte of digital records. Research across 36 agencies found an 80% dissatisfaction rate with existing records management solutions; the root cause was that records management practices imposed a significant burden on users. As a result, business was less efficient and a large number of records were never entered. In addition, manual disposal processes could not keep up with the volume of digital records, so unwanted data was accumulating rapidly.

Solution

GoSource proposed using cloud-based machine learning and big data platforms to automate records management activities. The team developed a records lifecycle microservice to create, version, index and auto-classify records in accordance with National Archives records management standards and government information security standards.
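The lifecycle described above (create, version, index, auto-classify) can be illustrated with a minimal in-memory sketch. This is not the RMaaS implementation: the `RecordStore` class, its method names, and the keyword rules that stand in for the machine-learning classifier are all hypothetical, and a real service would persist to object storage and a search cluster rather than Python dictionaries.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RecordVersion:
    number: int
    content: str
    checksum: str  # SHA-256 of the content, useful for integrity checks


@dataclass
class Record:
    record_id: str
    title: str
    versions: list = field(default_factory=list)
    classification: str = "unclassified"


class RecordStore:
    """Hypothetical in-memory stand-in for a records lifecycle service."""

    # Assumption: simple keyword rules stand in for a trained classifier.
    RULES = {"invoice": "financial", "contract": "procurement", "leave": "personnel"}

    def __init__(self):
        self.records = {}
        self.index = defaultdict(set)  # token -> ids of records containing it

    def create(self, record_id, title, content):
        record = Record(record_id, title)
        self.records[record_id] = record
        self.add_version(record_id, content)
        return record

    def add_version(self, record_id, content):
        # Versions are appended, never overwritten, so history is retained.
        record = self.records[record_id]
        checksum = hashlib.sha256(content.encode("utf-8")).hexdigest()
        version = RecordVersion(len(record.versions) + 1, content, checksum)
        record.versions.append(version)
        self._index(record, content)
        record.classification = self._classify(record.title + " " + content)
        return version

    def _index(self, record, content):
        # Tokens from earlier versions stay indexed in this sketch.
        for token in (record.title + " " + content).lower().split():
            self.index[token].add(record.record_id)

    def _classify(self, text):
        lowered = text.lower()
        for keyword, label in self.RULES.items():
            if keyword in lowered:
                return label
        return "unclassified"

    def search(self, keyword):
        return sorted(self.index.get(keyword.lower(), set()))
```

In this sketch, calling `create` and then `add_version` on the same record keeps both versions and re-runs indexing and classification each time, so the user never classifies anything by hand.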

To validate the prototype, the team ingested over 500 million sample records and verified sub-second keyword search across the full holdings. GoSource built connectors for email, Windows file systems, and cloud-hosted platforms such as Twitter to prove that any business system could integrate with the records microservices.
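One way to picture the connector approach is a small common interface that every source system implements, so the ingestion side never needs to know where a record came from. The sketch below is an assumption about how such connectors could be shaped, not GoSource's code: the `Connector`, `EmailConnector`, `FileSystemConnector` and `ingest` names are hypothetical, and the file connector only reads `.txt` files to keep the example short.

```python
from abc import ABC, abstractmethod
from email import message_from_string
from pathlib import Path


class Connector(ABC):
    """Hypothetical common interface for any source business system."""

    @abstractmethod
    def fetch(self):
        """Yield dicts with 'source', 'title' and 'body' keys."""


class EmailConnector(Connector):
    def __init__(self, raw_messages):
        self.raw_messages = raw_messages  # RFC 5322 message strings

    def fetch(self):
        for raw in self.raw_messages:
            msg = message_from_string(raw)
            yield {
                "source": "email",
                "title": msg["Subject"] or "",
                "body": msg.get_payload(),
            }


class FileSystemConnector(Connector):
    def __init__(self, root):
        self.root = Path(root)

    def fetch(self):
        # Assumption: plain-text files only, walked in a stable order.
        for path in sorted(self.root.rglob("*.txt")):
            yield {
                "source": "file",
                "title": path.name,
                "body": path.read_text(encoding="utf-8"),
            }


def ingest(connectors, sink):
    """Push every item from every connector into sink; return the count."""
    count = 0
    for connector in connectors:
        for item in connector.fetch():
            sink(item)
            count += 1
    return count
```

Here `sink` is any callable that accepts a record dict, for example a function that posts to a records microservice. Adding a new source system then means writing one more `Connector` subclass and nothing else.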

The solution architecture and operating model for RMaaS included discovery and re-use of information assets through big-data search indexes and linked-data ontologies. System enhancements included unified search functionality across multiple record sets, AI-driven record clustering and classification, and merging and reappropriation of records between organisational units. The microservices suite collectively provides all the essential functions of a system of record as defined by the Records Management Act.
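Unified search across multiple record sets can be thought of as a fan-out and merge: the same query runs against each record set, and the hits are combined into one ranked list that still shows where each record lives. The following sketch assumes each record set is a plain dictionary of record ID to text and scores hits by term frequency. Both function names are hypothetical, and a production system would delegate the per-set search to a search engine rather than scanning text.

```python
def search_record_set(records, query):
    """Return (record_id, score) pairs for one record set.

    The score is the number of times any query term occurs in the text.
    """
    terms = query.lower().split()
    hits = []
    for record_id, text in records.items():
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score:
            hits.append((record_id, score))
    return hits


def unified_search(record_sets, query, limit=10):
    """Query every record set and merge the hits into one ranked list."""
    merged = []
    for set_name, records in record_sets.items():
        for record_id, score in search_record_set(records, query):
            merged.append(
                {"record_set": set_name, "record_id": record_id, "score": score}
            )
    # Highest score first; ties broken by set name and record ID for stability.
    merged.sort(key=lambda h: (-h["score"], h["record_set"], h["record_id"]))
    return merged[:limit]
```

Keeping the `record_set` name on every hit lets a caller see which organisational unit holds each result.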

Outcomes

  • Proven scalability: Prototype successfully ingested 500 million+ records and demonstrated sub-second search across the full holdings
  • Whole-of-government viability: Proved the feasibility of a single platform capable of ingesting multiple petabytes of new records each year while remaining highly available and secure
  • Reduced user burden: Future-state design ensures users do not need to think like record managers; records management happens in the background while they use normal systems
  • Interoperability: As a true microservice architecture, RMaaS demonstrated that value-adding services can be integrated with core record management functionality
  • Worst-case validation: Team set up sustained worst-case workload scenarios to prove the solution’s ability to scale

Technologies & Methods

  • Python / Django (web application framework)
  • Python / Flask (lightweight microservices)
  • Amazon S3 (object storage)
  • Amazon ECS (container orchestration)
  • Amazon Elasticsearch Service (search and indexing)
  • Amazon API Gateway (API management)
  • Machine Learning (auto-classification of records)
  • Linked-data ontologies (information asset discovery)
  • Microservices architecture
  • National Archives records management standards compliance

Team Size

Not specified (small agile team)

“GoSource’s rapid prototyping helped us to prove the technical feasibility of our records management transformation program.”

Robyn McLeod, Director, Whole of Government Solutions, Department of Finance
