CIDS Specialties
The Curtin Institute for Data Science (CIDS) provides end-to-end support for projects in eight specialties:
Big Data Analytics
Big data refers to datasets so large or complex that traditional data analysis techniques are unsuitable and new ways of processing the data are required. The big data analytics performed by the CIDS are crucial for next-generation radio telescopes such as the Murchison Widefield Array and the Australian component of the Square Kilometre Array.
Machine Learning and Artificial Intelligence
The CIDS develops machine learning and artificial intelligence methods and models suited to the available data, with the aim of achieving the most accurate predictions. The CIDS has experience in machine learning across a number of fields, such as computer vision, natural language processing, and time series forecasting. The CIDS can develop prediction models using both supervised learning methods (for classification and regression) and unsupervised learning methods (such as clustering and density estimation).
High-Performance Computing
High-performance computing (HPC) uses supercomputers and clusters to solve complicated problems. HPC involves parallelising, containerising, and automating workloads to manage very large numbers of operations on large amounts of data. The CIDS frequently works with partners either to deploy solutions on the Pawsey and NCI supercomputers or to operationalise data pipelines using commercial solutions such as AWS and Google Cloud.
Research Software Development and Data Engineering
Research software development is an emerging field in which software engineers support scientific research through code development and optimisation. Data engineering is the building of the automated processes and systems required to collect and analyse large, complex datasets. For example, the CIDS has developed and optimised code to modernise existing workflows so that they make better use of HPC resources and can be applied to a wider range of data.
Project Scoping and Life Cycle Management
The CIDS’s dedicated Project Management Office supports and manages projects end-to-end. The CIDS carries out early scoping to determine project activities, resourcing requirements (time, skills, experience), and overall scope, including deliverables and costings. Projects are tracked and managed using appropriate processes and purpose-designed tools.
Modelling and Optimisation
The CIDS develops data and statistical models using theoretical, semi-empirical, and empirical approaches. The CIDS can work with domain experts to develop physics-based models as well as model inversion methods that use stochastic or deterministic optimisation.
Visualisation and Simulation
Simulation uses computers to recreate phenomena based on current knowledge. The CIDS generates such simulations, and the datasets subsequently extracted from them, which are often vast and complex, help researchers in disparate fields such as the space industry, medicine, and mining.
Education
The CIDS aims to build computational skills across the university community by integrating relevant units into undergraduate curricula in all faculties. The CIDS will also supervise the data science projects of interns at the Hub for Immersive Visualisation and eResearch and at Innovation Central Perth.