Big Data and Digital Technology Workshop
from Monday, 16 September 2019 (10:00) to Friday, 20 September 2019 (16:15)
Monday, 16 September 2019
10:00
Registration
10:00 - 10:30
10:30
Coffee Break
10:30 - 11:00
11:00
Opening ceremony Workshop aims
11:00 - 11:30
11:30
Summary of previous workshops
11:30 - 12:00
12:00
Ice-breaker activity
12:00 - 12:45
12:45
Lunch + Group Photo
12:45 - 13:45
13:45
Data science: a tool for social innovation
-
Nadeem Oozeer
(South African Radio Astronomy Observatory (SARAO))
13:45 - 14:15
The world as we know it is changing at an exponential rate, placing new demands on countries to keep up and adapt in innovative ways. Developing countries are increasingly being left behind, and their access to new skills and capabilities is ever more inadequate. The coming Square Kilometre Array (SKA) telescope presents the African continent with an opportunity to participate in science with the global community through astronomy and Big Data. African countries are now growing and joining the ranks of other emerging economies, but there is a need to harmonize science with society for sustainable impact. The challenge at such a delicate stage of development is universal and equal access to opportunity and knowledge. In this talk I will present how data science is being used in the SKA Africa partner countries to bring together experts in Big Data, machine learning, astronomy and the social sciences to explore how we can all work together for social uplift.
14:15
Developments in the Southern African Development Community (SADC) - Cyberinfrastructure Framework to Support Regional Collaborations on Big Data projects
-
Tshiamo Motshegwa
(University of Botswana)
14:15 - 14:45
This talk shares experiences from the development and implementation of the Southern African Development Community (SADC) Cyberinfrastructure Framework, which aims to support multi-sectoral and multidisciplinary regional collaborations on big data projects of regional impact. The Framework envisaged a regional commons of high-performance computing platforms over a fabric of national education and research networks to host multi-sectoral data repositories, facilitated by robust regional policies and utilised by trained users working on collaborative regional R&D projects. The Framework defines several pillars for implementation, ranging from harmonised regional policy through to infrastructure deployment, R&D, resource mobilisation, strategic partnerships and human capital development. The R&D pillar includes regional projects, e.g. in weather and climate, and potential projects in geoscience, bioinformatics and spatial drone image data, all aimed at exercising the cyberinfrastructure (CI). The cyberinfrastructure also seeks to promote citizen participation in science and to promote education and innovation value chains.
14:45
Getting ready for the AVN in Madagascar: Big Data challenges & opportunities
-
Zara Randriamanakoto
(South African Astronomical Observatory)
14:45 - 15:15
Madagascar is one of South Africa's SKA partner countries in the AVN project. Located 50 km from Antananarivo, the city of Arivonimamo hosts the old telecommunications dish that is to be converted into a radio telescope as part of this pan-African project. Despite the progress being made and the various opportunities and support from the AVN and DARA communities, there are still outstanding issues that need to be addressed well before the dish is successfully converted, such as the local ability to handle and analyse the huge quantities of data collected.
15:15
Coffee Break
15:15 - 15:45
15:45
IDIA from big data to big ideas
-
Jasper Horrell
(IDIA)
15:45 - 16:15
16:15
Discussion 1: Local communities and computing in developing countries
-
Rene Breton
16:15 - 17:00
Panel members: Nadeem Oozeer, Tshiamo Motshegwa, Zara Randriamanakoto
Tuesday, 17 September 2019
09:30
Cardiac Remodelling Prediction using Deep Learning
-
Dayang NurFatimah
(University of Malaysia, Sarawak (UNIMAS))
09:30 - 10:00
Left ventricular (LV) remodeling involves changes in ventricular size, shape and function. Analysis and prediction of LV remodeling are therefore important for improving patient survival, emergency medical response and treatment strategies after acute myocardial infarction (AMI). As cardiac magnetic resonance imaging (MRI) is increasingly used in the clinical assessment of cardiovascular diseases, the overwhelming size of a typical cardiac MRI dataset poses a significant challenge for time-efficient image quantification and interpretation by the cardiac expert. Furthermore, there are few studies on the classification and prediction of LV remodeling using multiple cardiac MRI-derived data (oedema, infarct size and microvascular obstruction at onset) together with textual data. It is therefore critically important to propose an approach for mining meaningful patterns from the multimodal data (image and text) in Sarawak Big Heart Data. We used a deep learning approach to predict LV remodeling by analysing images acquired from cardiac MRI, which can assist cardiac experts and enable faster decision making in clinical cardiac health management and the prevention of cardiovascular mortality. This approach will then be realised in a proof of concept and compared with the existing manual image analysis by cardiac experts in terms of sensitivity, specificity, and positive and negative predictive value. In line with the Malaysia National Key Economic Area (NKEA), this research addresses a focus area in Strategic Thrust 2 – Improving Well Being for All in the Eleventh Malaysia Plan and will contribute to the coding of AMI, clinical management and treatment strategies for improved system delivery and better health outcomes.
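For reference, the four comparison metrics named in the abstract have standard definitions in terms of the counts of true positives, false positives, true negatives and false negatives. The symbols TP, FP, TN and FN are introduced here for illustration only and do not come from the talk.

```latex
\begin{align}
\text{Sensitivity} &= \frac{TP}{TP + FN}, &
\text{Specificity} &= \frac{TN}{TN + FP}, \\
\text{PPV} &= \frac{TP}{TP + FP}, &
\text{NPV} &= \frac{TN}{TN + FN}.
\end{align}
```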
10:00
Interpretable models for predicting early mortality in patients with coronary artery disease
-
Demetrio Fabián García Nocetti
(UNAM)
10:00 - 10:30
This work concerns the construction of machine learning models for early prediction (during the first 24 hours of admission) of hospital mortality in patients with coronary artery disease, using clinical notes and structured clinical data from the electronic health record (EHR). The aim is to identify suitable models to predict early mortality and recognize risk factors. For unstructured EHR data, n-gram models are explored to extract features from the clinical notes. For structured data, different machine learning algorithms are evaluated and combined with different kinds of information to identify risk factors. We also validate model performance and compare it against reference scores. This study focuses on the first 24 hours of admission because during this time it is possible to determine whether or not invasive procedures are needed, avoiding irreversible damage or sudden death.
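As a rough illustration of what n-gram feature extraction from free text can look like, here is a minimal sketch. It assumes simple whitespace tokenisation and word-level n-grams; the function names and the example note are hypothetical and are not taken from the talk, whose actual pipeline is not described in the abstract.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word-level n-grams of a text, using naive whitespace tokenisation."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=2):
    """Count all n-grams for n = 1..max_n (a simple bag-of-n-grams representation)."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts


if __name__ == "__main__":
    # Made-up example note, for illustration only.
    note = "Chest pain on admission"
    for ngram, count in sorted(ngram_features(note).items()):
        print(ngram, count)
```

Counts like these could then be fed to any standard classifier; which classifier the authors used is not stated in the abstract.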
10:30
Coffee Break
10:30 - 11:00
11:00
Astrostatistics: Big Questions in the age of Big Data
-
German Chaparro
(ECCI)
11:00 - 11:30
Astronomy has been one of the pioneering sciences in handling large volumes of data, which has historically meant that the bond between astronomy and statistics has been very strong. However, with the advent of modern astrophysics in the 20th century, a significant estrangement between astronomy and statistics arose. Only recently has the bond begun to mend and grow strong again, thanks to astronomers acknowledging the limitations of "tried and true" statistical methods and becoming more willing to learn and apply modern statistical methods in their research. Incidentally, this has created an open science environment in which these methods have become available to anyone, lowering the barriers to entry for many astronomers in the Global South. In particular, I will describe how this has impacted my own research, using a few examples from recent years.
11:30
Computational Fluid Dynamics as a digital technology practice implementation
-
Mauricio Suarez
(National Astronomical Observatory, Colombia)
11:30 - 12:00
The talk focuses on Computational Fluid Dynamics as a digital technology practice implementation. These kinds of techniques are useful not only in industry but also in many scientific research areas, such as astronomy.
12:00
Discussion 2: Medicine and modern techniques
-
Mark Thompson
(University of Hertfordshire)
12:00 - 12:45
Panel members: Dayang Iskandar, Fabian Garcia Nocetti, Germán Chaparro, Mauricio Suarez
12:45
Lunch
12:45 - 13:45
13:45
Astronomy and Airlines: Expanding our horizons with Newton and GCRF
-
James Mullaney
(University of Sheffield)
13:45 - 14:15
Since 2017, my Thai collaborators (NARIT, MFU) and I have worked together on various Newton/GCRF-funded projects. The first of these projects focussed on using data from GOTO - a new wide-field, high-cadence optical astronomical survey - to expose Thai students and researchers to vast amounts of digital data. Initial projects focused on the development of Machine Learning algorithms to analyse GOTO data, but our work has since expanded to include data management and front-end development. As well as astronomy-related research, however, our team has also been using our expertise in data analysis and management to help businesses and organisations across Thailand via two GCRF-funded projects. In these projects, students work closely with businesses across a diverse range of sectors - from supermarkets and hotels to airlines and cloud-computing providers - to help solve their data-related problems while also acquiring technical skills that are truly relevant for Thailand's growing technical economy. In this presentation, I will discuss the work we have done so far across all our projects, and outline our plans for the future of our collaboration.
14:15
Recommender Systems - Mobile Phones
-
Sebastian Laborde
(ANTEL)
14:15 - 14:45
Recommender systems are a large part of machine learning research. In the telecom industry, Antel (a Uruguayan telecom company) has developed a machine learning recommender system model that provides mobile phone recommendations for telecom retail.
14:45
BIOS, a case study of a public-private partnership for the promotion of biotechnology and data science in Colombia, Latin America
-
Eduardo Gomez Restrepo
(Centre for Bioinformatics and Computational Biology of Colombia (BIOS))
14:45 - 15:15
In 2007, the national government and public and private entities determined that it was necessary to take advantage of Colombian biodiversity through a high-level national research centre that would develop and harness its great potential in the fields of bioinformatics, computational biology and data science. This idea supported the national government's objectives for innovation and economic development, and the centre's creation began in 2008 thanks to a cooperation agreement within a public-private alliance. I will also talk about one of our main projects: ORIGEN, a national precision medicine strategy for the genetic study of the Colombian population using artificial intelligence tools.
15:15
Coffee Break
15:15 - 15:45
15:45
Big Data Service and Network Infrastructure
-
Muhammad Taufik
(Indosat Ooredoo)
15:45 - 16:15
Big Data and digital technology mark a new era in the world. In this era, many activities in our lives are strongly related to one another, whether in government, industry, business or education. In this talk I will discuss big data and digital services as a major new business opportunity in Indonesia, and look at network infrastructure as the main carrier of these services.
16:15
Discussion 3: Machine learning and industry
-
Utane Sawangwit
(NARIT)
16:15 - 17:00
Panel members: James Mullaney, Sebastián Laborde, Eduardo Gómez Restrepo, Muhammad Taufik
Wednesday, 18 September 2019
09:00
Visit to the Thai National Observatory
09:00 - 12:45
12:45
Lunch
12:45 - 13:45
13:45
Cultural visit to the Wat Phrathat Doi Suthep temple
13:45 - 17:00
17:00
Conference banquet
17:00 - 20:00
Thursday, 19 September 2019
09:30
Developing Big Data Analytics capacity in SKA Partner countries
-
Happy Sithole
(Council for Scientific & Industrial Research (CSIR) - NICIS)
09:30 - 10:00
Developing Big Data Analytics capacity in SKA African Partner Countries involves the roll-out of infrastructure to help build the expertise in the different countries. At the core of the infrastructure are the skills required to perform data analysis, such as scientific programming and machine learning algorithms. Hence this talk will focus on these developments in the 8 partner countries in Africa.
10:00
Big Data, HPC & Astronomical Numerical Simulations
-
Alfredo Santillan Gonzalez
(UNAM)
10:00 - 10:30
The 21st century may well be the era of Big Data. Huge databases are generated by scientific projects, industry and society in general. In astronomy, both ground-based (SKA, ALMA, SDSS, LSST, etc.) and space telescopes (HST, Chandra, Spitzer, etc.), as well as astronomical numerical simulations run on supercomputers (Millennium Simulation project, IllustrisTNG project, etc.), generate databases of tens or hundreds of petabytes. To manipulate these databases, different data mining algorithms and specialized data analysis software are used. For numerical simulations, data can be handled in two ways: either the database is generated first and then analyzed, or the analysis is done in situ, i.e. the data are analyzed as they are generated. In this presentation I will discuss both possibilities for data analysis.
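The difference between the two approaches can be sketched in a few lines. This is a toy illustration under stated assumptions, not the speaker's workflow: `simulate_steps` is a hypothetical stand-in for a simulation that emits one block of values per time step, and the "analysis" is just a mean and variance. The post-hoc version stores everything before analysing it, while the in-situ version updates running statistics (Welford's method) and never keeps the full data set in memory.

```python
import math
import random


def simulate_steps(n_steps, cells_per_step, seed=0):
    """Hypothetical stand-in for a simulation: yields one list of values per step."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        yield [rng.gauss(0.0, 1.0) for _ in range(cells_per_step)]


def post_hoc_stats(n_steps, cells_per_step):
    """Approach 1: store every output first, then analyse the full data set."""
    stored = [v for step in simulate_steps(n_steps, cells_per_step) for v in step]
    mean = sum(stored) / len(stored)
    var = sum((v - mean) ** 2 for v in stored) / len(stored)
    return mean, var


def in_situ_stats(n_steps, cells_per_step):
    """Approach 2: update running statistics as each step is produced."""
    count, mean, m2 = 0, 0.0, 0.0
    for step in simulate_steps(n_steps, cells_per_step):
        for v in step:
            count += 1
            delta = v - mean
            mean += delta / count
            m2 += delta * (v - mean)  # Welford's update of the sum of squares
    return mean, m2 / count


if __name__ == "__main__":
    a = post_hoc_stats(100, 50)
    b = in_situ_stats(100, 50)
    # Both approaches should agree up to floating-point rounding.
    print(math.isclose(a[0], b[0], abs_tol=1e-12), math.isclose(a[1], b[1], rel_tol=1e-9))
```

The trade-off is that the in-situ version needs only constant memory but can compute only the quantities chosen in advance, whereas the stored data set can be re-analysed later.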
10:30
Coffee Break
10:30 - 11:00
11:00
Monitoring of an invasive weed from space
-
Rene Breton
(University of Manchester)
11:00 - 11:30
In this talk I will present our work on monitoring an invasive weed called Parthenium, which spreads very quickly across Pakistan and can destroy a significant fraction of crops if not actively controlled. Our research looks into implementing a new approach that enables large-scale monitoring using satellite imaging and draws on skills borrowed from astrophysics. We use extensive ground truth data to train machine learning algorithms to recognise the subtle signature of the plant as viewed from space at low resolution. Our work, still in progress, shows promising results, demonstrating that we might indeed be able to track this weed from space.
11:30
Do we need to invoke Machine Learning or Artificial Intelligence to add value to big data?
-
Nadeem Oozeer
(South African Radio Astronomy Observatory (SARAO))
11:30 - 12:00
Whenever we mention Big Data, we hear Machine Learning (ML) and/or Artificial Intelligence (AI). Is this required? Radio Frequency Interference (RFI) affects astronomical data measured by radio telescopes such as MeerKAT. RFI is known to be dynamic in various respects. There are often claims that the RFI environment around a radio facility's site is changing. How can we quantify this from the telescope's perspective? Once given their science dataset, astronomers remove the RFI and hardly ever give feedback to the Operations, Commissioning and RFI teams about the quality of their data, unless it is really bad. Furthermore, how confident can one be when removing potentially “affected” data points? In this talk I will show how we address the above three questions from a data science perspective.
12:00
Discussion 4: HPC & cloud computing, Agri-food/local communities
-
Premana Premadi
(Bandung Institute of Technology)
12:00 - 12:45
Panel members: Happy Sithole, Alfredo J. Santillan, Rene Breton, Nadeem Oozeer
12:45
Lunch
12:45 - 13:45
13:45
China SKA Regional Centre Prototyping
-
Shaoguang Guo
(Shanghai Astronomical Observatory)
13:45 - 14:15
China has been taking part in SKA pre-construction since the 1990s; 21CMA and FAST both have some relationship with the SKA. In past years, we have been involved in many packages, including DISH, CSP and SDP. We have been heavily involved in the construction of the SKA Regional Centre (SRC) since 2015. In this presentation, I will look at the progress of China SKA Regional Centre prototyping, including the building, hardware and network, as well as some initial results obtained with this platform and the future plans for the SRC.
14:15
Ongoing Astronomical Projects in Algeria and Big Data Issues
-
Nassim Seghouani
(Centre of Research in Astronomy, Astrophysics & Geophysics)
14:15 - 14:45
This talk will cover some of the most important astronomical projects in Algeria. I will speak mainly about the ongoing Aures Observatory project and the farm of telescopes for transient monitoring (GWs, GRBs, etc.), the scheduler program based on genetic algorithms, site testing and radio astronomy, with an emphasis on Big Data requirements and needs (such as infrastructure and HPC).
14:45
Bosscha Observatory Exoplanet Research
-
Yusuf Muhammad
(Bandung Institute of Technology)
14:45 - 15:15
The exoplanet research program at Bosscha Observatory was initiated in 2006 using the transit method. However, only since 2015 has the program been carried out more systematically and automatically. With the operation of the new telescope in 2014 and the implementation of the observation automation system, we began a new journey in exoplanet detection research. After spending the first few years learning, we can now concentrate on developing data servers and computing. The program now carries out transit photometry observations to produce a database of light curves in which exoplanets will be identified. The MCMC fitting method is then used to derive physical parameters. Other exoplanet detection research is conducted using surveys of stellar clusters. With the same observational data, we are learning to apply machine learning to classify variable stars.
15:15
Coffee Break
15:15 - 15:45
15:45
Developing network of astronomical facilities in Indonesia and Asia-Pacific Region: Instruments, Data, and Knowledge Sharing
-
Premana Premadi
(Bandung Institute of Technology)
15:45 - 16:15
The development of astronomy in South East Asian countries is gaining momentum, mostly due to successful human capacity building and increased research funding. However, contributing to the advancement of science at the forefront, in particular in astrophysics, can be prohibitively costly for most developing countries. It is therefore imperative for astronomical institutions in countries of the same region to construct collaborative strategies that consider, among other aspects, scientific goals, human resources, hardware and software facilities, and costs and benefits.
16:15
Discussion 5: SKA and new facilities
-
Siraprapa Sanpa-arsa
(NARIT)
16:15 - 17:00
Panel members: Shaoguang Guo, Nassim Seghouani, Muhammad Yusuf
Friday, 20 September 2019
09:30
Big data and big images in galactic astronomy
-
Mark Thompson
(University of Hertfordshire)
09:30 - 10:00
TBC
10:00
Computational Fluid Dynamics: A first approach to aerodynamics
-
Mauricio Suarez
(National Astronomical Observatory, Colombia)
10:00 - 10:30
This talk deals with the problem of external flow around an obstacle in two dimensions. The stream-vorticity formulation will be used to understand the behaviour of the fluid after it passes the obstacle. Although the obstacle's geometry will be a fundamental one, it can be changed to more complex geometries.
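For readers unfamiliar with it, the stream function-vorticity formulation for two-dimensional incompressible viscous flow is usually written as below. The notation is the common textbook one and is an assumption here, since the abstract does not give the equations: ψ is the stream function, ω the vorticity, (u, v) the velocity components and ν the kinematic viscosity.

```latex
\begin{align}
u &= \frac{\partial \psi}{\partial y}, \qquad
v = -\frac{\partial \psi}{\partial x}, \qquad
\omega = \frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}, \\
\nabla^{2} \psi &= -\omega, \\
\frac{\partial \omega}{\partial t}
  + u \frac{\partial \omega}{\partial x}
  + v \frac{\partial \omega}{\partial y}
  &= \nu \nabla^{2} \omega .
\end{align}
```

Defining the velocity through ψ satisfies the incompressibility condition automatically, and the pressure does not appear explicitly, which is one reason this form is a common starting point for 2D flow around simple obstacles.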
10:30
Coffee Break
10:30 - 11:00
11:00
The implementation of big data architecture for radio astronomy and machine learning in a balloon to characterize the atmosphere in Ecuador
-
Darwin Mena
(National Polytechnic School)
11:00 - 11:30
TBC
11:30
Radioastronomy and Space Science: from a big data playground to societal spin-offs
-
Domingos Barbosa
(Instituto de Telecomunicações)
11:30 - 12:00
Large sensor-based science infrastructures for radio astronomy will be among the most intensive data-driven projects in the world, facing very demanding computation, storage and management requirements and, above all, power demands. The wide geographical distribution of the radio astronomical sensors and their associated processing requirements, in the form of tailored High Performance Computing (HPC) facilities and distributed cloud environments, require a greener approach to the Information and Communications Technologies (ICT) adopted for data processing, to enable operational compliance with strict power budgets. In addition, this convergence of Big Data technologies with parallel computing opens new digital avenues that may enhance Internet of Things (IoT) applications in smart farming, smart tourism and the ubiquitous smart cities frameworks. Here we outline major characteristics and innovative scenarios that may share a common digital background and have potentially high societal impact.
12:00
Discussion 6: Technologies
-
Rene Breton
12:00 - 12:45
Panel members: Mark Thompson, Mauricio Suarez, Darwin Mena, Domingos Barbosa
12:45
Lunch Break
12:45 - 13:45
13:45
Brainstorming Session
13:45 - 14:45
14:45
Workshop Summary & Concluding Remarks
14:45 - 15:15
15:15
Coffee Break
15:15 - 15:45