Big Data and Digital Technology Workshop

Asia/Bangkok
Kantary Hills Hotel, Chiang Mai

44, 44/1-4 Nimmanhaemin Road, Soi 12, Suthep, Muang, Chiang Mai 50200, Thailand
Rene Breton, Siraprapa Sanpa-arsa (NARIT), Utane Sawangwit (NARIT)
Description

This workshop will discuss how the techniques and technologies being used and developed in radio astronomy to handle the big data aspects of the field can be applied to other fields in industry and commerce to generate economic development in low- and middle-income countries (LMICs). It will be hosted at NARIT in Thailand, where researchers are already involved in several big data initiatives. Experts will be invited from the UK, Australia, South Africa and Colombia, among others. This workshop will provide an opportunity to bring together these separate projects, share the experiences so far and discuss new directions for collaboration across continents. It will also be an opportunity for countries that are just beginning to develop radio astronomy to be exposed to the development possibilities in this sphere.

This is one of the four DRAGN (Development through Radio Astronomy Global Network) workshops being organized during 2018-2019, funded by the UK Research and Innovation (UKRI) Global Challenges Research Fund (GCRF).

About DRAGN (https://dragn.info)

Four international workshops will be held throughout 2019 to build partnerships between the various projects in development through radio astronomy that have already been established via the Newton Fund in Africa, Latin America and South East Asia. This collaboration will form a global network of expertise in the mobilization of radio astronomy for economic development. The network will establish and build ‘south-south’ connections that will help the sharing of experiences and lessons learned in how to translate the high-tech skills of radio astronomy into local job creation and entrepreneurship. Other low- and middle-income countries (LMICs) in the three continental regions not currently involved will be encouraged to join the established activities and build strong regional collaborations around this theme. The workshops are fully funded by an award from the UK’s Global Challenges Research Fund, so the costs of all participants, including travel, accommodation and subsistence, will be covered by the project. About 40 participants, including expert speakers, are expected at each event, and attendance will mostly be by invitation. The UK institutions that are organising these events along with the workshop host institutions below are the University of Leeds, University of Hertfordshire, University of Manchester, University of Oxford, University of Bristol and University of Central Lancashire. A summary of the four workshops follows.

    • 10:00 10:30
      Registration 30m
    • 10:30 11:00
      Coffee Break 30m
    • 11:00 11:30
      Introduction: Opening ceremony and workshop aims
    • 11:30 12:00
      Summary of previous workshops 30m
    • 12:00 12:45
      Ice-breaker activity 45m
    • 12:45 13:45
      Lunch + Group Photo 1h
    • 13:45 14:15
      Data science: a tool for social innovation 30m

      The world as we know it is changing at an exponential rate. The new world has placed a new demand on countries to keep up and adapt in innovative ways. Developing countries are increasingly being left behind, and access to new skills and capabilities is ever more inadequate. The African continent is being presented with an opportunity for scientific participation with the global community via astronomy and big data with the coming Square Kilometre Array (SKA) telescope. African countries are now growing and joining the ranks of other emerging economies, but there is a need to harmonize science into society for sustainable impact. The challenge at such a delicate stage of development is universal and equal access to opportunity and knowledge. In this talk I will present how data science is being used in the SKA Africa partner countries to bring together experts in the fields of big data, machine learning, astronomy and social sciences to explore how we can all work together for social uplift.

      Speaker: Dr Nadeem Oozeer (South African Radio Astronomy Observatory (SARAO))
    • 14:15 14:45
      Developments in the Southern African Development Community (SADC) - Cyberinfrastructure Framework to Support Regional Collaborations on Big Data projects 30m

      This talk shares experiences around the development and implementation of the Southern African Development Community (SADC) Cyber-infrastructure Framework, aimed at supporting multi-sectoral and multidisciplinary regional collaborations around big data projects of regional impact. The Cyber-infrastructure Framework envisages a regional commons of high performance computing platforms over a fabric of national education and research networks to host multi-sectoral data repositories, facilitated by robust regional policies and utilised by trained users working on collaborative regional R&D projects. The Framework defines several pillars for implementation, ranging from harmonised regional policy through to infrastructure deployment, R&D, resource mobilisation, strategic partnerships, and human capital development. In the R&D pillar there are some regional projects, e.g. in weather and climate, and potential projects in geoscience, bioinformatics and spatial drone image data, all aimed at exercising the cyber-infrastructure (CI). The cyber-infrastructure also seeks to promote participation of citizens in science and to promote education and innovation value chains.

      Speaker: Dr Tshiamo Motshegwa (University of Botswana)
    • 14:45 15:15
      Getting ready for the AVN in Madagascar: Big Data challenges & opportunities 30m

      Madagascar is one of the South African SKA partner countries on the AVN project. Located 50 km away from Antananarivo, the city of Arivonimamo hosts the old telecom dish to be converted into a radio telescope as part of the pan-African project. Despite the progress being made and the various opportunities and support from the AVN and DARA community, there are still outstanding issues that need to be addressed, such as building the local ability to handle and analyse huge quantities of data well before the dish is successfully converted.

      Speaker: Dr Zara Randriamanakoto (South African Astronomical Observatory)
    • 15:15 15:45
      Coffee Break 30m
    • 15:45 16:15
      IDIA from big data to big ideas 30m
      Speaker: Dr Jasper Horrell (IDIA)
    • 16:15 17:00
      Discussion 1: Local communities and computing in developing countries 45m

      Panel members:
      Nadeem Oozeer
      Tshiamo Motshegwa
      Zara Randriamanakoto

      Speaker: Dr Rene Breton
    • 09:30 10:00
      Cardiac Remodelling Prediction using Deep Learning 30m

      Left Ventricular (LV) remodeling involves changes in the ventricular size, shape and function. Hence, analysis and prediction of LV remodeling is important to improve patient survival, emergency medical response and treatment strategies after acute myocardial infarction (AMI). As cardiac magnetic resonance imaging (MRI) is increasingly used in clinical assessment of cardiovascular diseases, the overwhelming size of a typical cardiac MRI image dataset poses a significant challenge for time-efficient image quantification and interpretation by the cardiac expert. Furthermore, few studies address classification and prediction of LV remodeling using multiple cardiac MRI-derived data (oedema, infarct size and microvascular obstruction at onset) and textual data. Therefore, it is critically important to propose an approach for mining meaningful patterns from multimodal data (image and text) from Sarawak Big Heart Data. We used a Deep Learning approach for the prediction of LV remodeling that can assist cardiac experts by analysing the images acquired from cardiac MRIs, for faster decision making in clinical cardiac health management and prevention of cardiovascular mortality. This approach will then be realised in a proof of concept and compared with the existing manual image analysis by cardiac experts in terms of sensitivity, specificity, and positive and negative predictive value. In line with the Malaysia National Key Economic Area (NKEA), this research will address a focus area in Strategic Thrust 2 – Improving Well Being for All in the Eleventh Malaysia Plan and contribute to the coding of AMI, clinical management and treatment strategies for the improvement of system delivery for better health outcomes.

      Speaker: Dr Dayang NurFatimah (University of Malaysia, Sarawak (UNIMAS))
    • 10:00 10:30
      Interpretable models for predicting early mortality in patients with coronary artery disease 30m

      This work concerns the construction of models using machine learning algorithms for early prediction (during the first 24 hours of admission) of hospital mortality in patients with coronary artery disease, through the use of clinical notes and structured clinical data (electronic health record, EHR). The aim is to effectively identify suitable models to predict early mortality and recognize risk factors. For unstructured EHR data, n-gram models to extract features from the clinical notes are explored. For structured data, different machine learning algorithms are evaluated and combined with different kinds of information to identify risk factors. We also validate the model performance and compare it against reference scores. This study focuses on the analysis of the first 24 hours of admission because during this time it is possible to determine whether or not invasive procedures are needed, avoiding irreversible damage or sudden death.

      Speaker: Dr Demetrio Fabián García Nocetti (UNAM)
    • 10:30 11:00
      Coffee Break 30m
    • 11:00 11:30
      Astrostatistics: Big Questions in the age of Big Data 30m

      Astronomy has been one of the pioneering sciences in handling large volumes of data, which has historically meant that the bond between astronomy and statistics has been very strong. However, due to the advent of modern astrophysics in the 20th century, a significant estrangement between astronomy and statistics arose. Only recently the bond has begun to mend and to grow strong again thanks to astronomers acknowledging the limitations of "tried and true" statistical methods and becoming more willing to learn and apply modern statistical methods to their research. Incidentally, this has created an open science environment in which these methods have become available for anyone, lowering the barriers to entry for many astronomers in the Global South. In particular, I will mention how this has impacted my own research using a few examples from recent years.

      Speaker: Dr German Chaparro (ECCI)
    • 11:30 12:00
      Computational Fluid Dynamics as a digital technology practice implementation 30m

      The talk focuses on Computational Fluid Dynamics as a digital technology practice implementation. These kinds of techniques are useful not just in industry but also in many areas of scientific research, such as astronomy.

      Speaker: Mr Mauricio Suarez (National Astronomical Observatory, Colombia)
    • 12:00 12:45
      Discussion 2: Medicine and modern techniques 45m

      Panel Members:
      Dayang Iskandar
      Fabian Garcia Nocetti
      Germán Chaparro
      Mauricio Suarez

      Speaker: Prof. Mark Thompson (University of Hertfordshire)
    • 12:45 13:45
      Lunch 1h
    • 13:45 14:15
      Astronomy and Airlines: Expanding our horizons with Newton and GCRF 30m

      Since 2017, my Thai collaborators (NARIT, MFU) and I have worked together on various Newton/GCRF-funded projects. The first of these projects focussed on using data from GOTO - a new wide-field, high-cadence optical astronomical survey - to expose Thai students and researchers to vast amounts of digital data. Initial projects focused on the development of Machine Learning algorithms to analyse GOTO data, but our work has since expanded to include data management and front-end development. As well as astronomy-related research, however, our team has also been using our expertise in data analysis and management to help businesses and organisations across Thailand via two GCRF-funded projects. In these projects, students work closely with businesses across a diverse range of sectors - from supermarkets and hotels to airlines and cloud-computing providers - to help solve their data-related problems while also acquiring technical skills that are truly relevant for Thailand's growing technical economy. In this presentation, I will discuss the work we have done so far across all our projects, and outline our plans for the future of our collaboration.

      Speaker: Dr James Mullaney (University of Sheffield)
    • 14:15 14:45
      Recommender Systems - Mobile Phones 30m

      Recommender systems are a large part of machine learning research. In the telecom industry, Antel (a Uruguayan telecom company) has developed a machine learning recommender system model that recommends mobile phones for telecom retail.

      Speaker: Sebastian Laborde (ANTEL)
    • 14:45 15:15
      BIOS, a case study of a public-private partnership for the promotion of biotechnology and data science in Colombia, Latin America 30m

      In 2007, the national government and public and private entities determined that it was necessary to take advantage of Colombian biodiversity through a high-level research center in the country that would develop and harness its great potential in the fields of Bioinformatics, Computational Biology and Data Science. This idea supported the national government's objectives in innovation and economic development, so its creation began in 2008, thanks to a cooperation agreement within a public-private alliance. Additionally, I want to talk about one of our main projects: ORIGEN, a national precision medicine strategy for the genetic study of the Colombian population through the use of artificial intelligence tools.

      Speaker: Eduardo Gomez Restrepo (Centre for Bioinformatics and Computational Biology of Colombia (BIOS))
    • 15:15 15:45
      Coffee Break 30m
    • 15:45 16:15
      Big Data Service and Network Infrastructure 30m

      Big data and digital technology mark a new era in the world, one in which many activities in our lives are closely related to each other, whether in the government, industry, business, or education sectors. In this talk I would like to talk about big data and digital services as a new big business opportunity in Indonesia and look at the network infrastructure as the backbone that carries these services.

      Speaker: Muhammad Taufik (Indosat Ooredoo)
    • 16:15 17:00
      Discussion 3: Machine learning and industry 45m

      Panel Members:
      James Mullaney
      Sebastián Laborde
      Eduardo Gómez Restrepo
      Muhammad Taufik

      Speaker: Utane Sawangwit (NARIT)
    • 09:00 12:45
      Visit to the Thai National Observatory 3h 45m
    • 12:45 13:45
      Lunch 1h
    • 13:45 17:00
      Cultural visit to the Wat Phrathat Doi Suthep temple 3h 15m
    • 17:00 20:00
      Conference banquet 3h
    • 09:30 10:00
      Developing Big Data Analytics capacity in SKA Partner countries 30m

      Developing Big Data Analytics capacity in SKA African Partner Countries involves the roll-out of infrastructure to help build expertise in the different countries. At the core of the infrastructure are the skills required to perform data analysis, such as scientific programming and machine learning algorithms. Hence this talk will focus on these developments in the eight partner countries in Africa.

      Speaker: Dr Happy Sithole (Council for Scientific & Industrial Research (CSIR) - NICIS)
    • 10:00 10:30
      Big Data, HPC & Astronomical Numerical Simulations 30m

      The 21st century may be regarded as the era of Big Data. Huge databases are generated by different scientific projects, industry and society in general. In astronomy in particular, both ground-based (SKA, ALMA, SDSS, LSST, etc.) and space telescopes (HST, Chandra, Spitzer, etc.), as well as astronomical numerical simulations run on supercomputers (Millennium Simulation project, IllustrisTNG project, etc.), generate databases of tens or hundreds of petabytes. To manipulate these databases, different data mining algorithms and specialized software for data analysis are used. In the case of numerical simulations, data can be handled in two ways: either the database is generated first and analyzed afterwards, or in-situ data analysis is done, i.e., data is analyzed as it is generated. In this presentation I will discuss the two possibilities for data analysis.

      Speaker: Dr Alfredo Santillan Gonzalez (UNAM)
    • 10:30 11:00
      Coffee Break 30m
    • 11:00 11:30
      Monitoring of an invasive weed from space 30m

      In this talk I will present our work on the monitoring of an invasive weed called Parthenium, which spreads very quickly across Pakistan and can destroy a significant fraction of the crops if not actively controlled. Our research looks into implementing a new approach which relies on enabling large-scale monitoring using satellite imaging and taps into skills borrowed from astrophysics. We use extensive ground truth data in order to train machine learning algorithms to recognise the subtle signature of the plant as viewed from space at low resolution. Our work, still in progress, shows promising results demonstrating that we might indeed be able to track this weed from space.

      Speaker: Dr Rene Breton (University of Manchester)
    • 11:30 12:00
      Do we need to invoke Machine Learning or Artificial Intelligence to add value to big data? 30m

      Whenever we mention Big Data, we hear Machine Learning (ML) and/or Artificial Intelligence (AI). Is this required? Radio Frequency Interference (RFI) affects astronomical data measured by radio telescopes such as MeerKAT. RFI is known to be dynamic in various aspects. There are often claims that the RFI environment around a radio facility's site is changing. How can we quantify this from the telescope perspective? Once given their science dataset, astronomers remove the RFI and hardly ever give feedback to the Operations, Commissioning and RFI team about the quality of their data, unless it is really bad. Furthermore, how confident can one be when removing some potentially “affected” data points? In this talk I will show how we address the above three questions from a data science perspective.

      Speaker: Dr Nadeem Oozeer (South African Radio Astronomy Observatory (SARAO))
    • 12:00 12:45
      Discussion 4: HPC & cloud computing, Agri-food/local communities 45m

      Panel members:
      Happy Sithole
      Alfredo J. Santillan
      Rene Breton
      Nadeem Oozeer

      Speaker: Dr Premana Premadi (Bandung Institute of Technology)
    • 12:45 13:45
      Lunch 1h
    • 13:45 14:15
      China SKA Regional Centre Prototyping 30m

      China has been taking part in SKA pre-construction since the 1990s; 21CMA and FAST both have some relationship with SKA. In past years, we have been involved in many work packages, including DISH, CSP and SDP. Since 2015 we have been heavily involved in SKA Regional Centre construction. In this presentation, I will look at the progress of China SKA Regional Centre prototyping, including the building, hardware and network, as well as some initial results using this platform and the future plans for the SRC.

      Speaker: Mr Shaoguang Guo (Shanghai Astronomical Observatory)
    • 14:15 14:45
      Ongoing Astronomical Projects in Algeria and Big Data Issues 30m

      This talk will be about some of the most important astronomical projects in Algeria. I will speak mainly about the ongoing project of the Aures Observatory and the farm of telescopes for transient monitoring (GWs, GRBs, etc.), the scheduler program using genetic algorithms, site testing and radio astronomy, with an emphasis on Big Data requirements and needs (such as infrastructure and HPC).

      Speaker: Dr Nassim Seghouani (Centre of Research in Astronomy, Astrophysics & Geophysics)
    • 14:45 15:15
      Bosscha Observatory Exoplanet Research 30m

      The exoplanet research program was initiated at Bosscha Observatory in 2006 using the transit method. However, it is only since 2015 that the research program has been carried out more systematically and automatically. With the operation of the new telescope in 2014 and the implementation of the observation automation system, we have begun a new journey in exoplanet detection research. After spending the first few years learning, we can now concentrate on developing data servers and computing. The research now carries out transit photometry observations to produce a database of light curves that will be searched for exoplanets. The MCMC fitting method is then used to derive physical parameters. Other exoplanet detection research is conducted using surveys of stellar clusters. With the same observational data, we are learning to apply machine learning to classify variable stars.

      Speaker: Dr Yusuf Muhammad (Bandung Institute of Technology)
    • 15:15 15:45
      Coffee Break 30m
    • 15:45 16:15
      Developing network of astronomical facilities in Indonesia and Asia-Pacific Region: Instruments, Data, and Knowledge Sharing 30m

      Development of astronomy in South East Asian countries is gaining momentum, mostly due to the success of human capacity building and an increase in research funding. However, contributing to the advancement of science at the forefront, in particular in astrophysics, can be prohibitively costly for most developing countries. It is therefore imperative to construct collaborative strategies among astronomical institutions in countries in the same region that take into account scientific goals, human resources, hardware and software facilities, and costs and benefits, among other aspects.

      Speaker: Dr Premana Premadi (Bandung Institute of Technology)
    • 16:15 17:00
      Discussion 5: SKA and new facilities 45m

      Panel members:
      Shaoguang Guo
      Nassim Seghouani
      Muhammad Yusuf

      Speaker: Siraprapa Sanpa-arsa (NARIT)
    • 09:30 10:00
      Big data and big images in galactic astronomy 30m

      TBC

      Speaker: Prof. Mark Thompson (University of Hertfordshire)
    • 10:00 10:30
      Computational Fluid Dynamics: A first approach to aerodynamics 30m

      In this talk the problem of an external fluid flowing around an obstacle in two dimensions will be addressed. The stream function-vorticity formulation will be used in order to understand the behaviour of the fluid after passing such an obstacle. Although the geometry of the obstacle will be a basic one, it can be changed to more complex shapes.

      Speaker: Mr Mauricio Suarez (National Astronomical Observatory, Colombia)
    • 10:30 11:00
      Coffee Break 30m
    • 11:00 11:30
      The implementation of big data architecture for radio astronomy and machine learning in a balloon to characterize the atmosphere in Ecuador 30m

      TBC

      Speaker: Darwin Mena (National Polytechnic School)
    • 11:30 12:00
      Radioastronomy and Space Science: from a big data playground to societal spin-offs 30m

      Large sensor-based science infrastructures for radio astronomy will be among the most data-intensive projects in the world, facing very demanding computation, storage, management and, above all, power requirements. The wide geographical distribution of the radio astronomical sensors and their associated processing requirements, in the form of tailored High Performance Computing (HPC) facilities and distributed cloud environments, require a greener approach to the Information and Communications Technologies (ICT) adopted for the data processing, to enable operational compliance with strict power budgets. In addition, this convergence of big data technologies with parallel computing opens new digital avenues that may enhance Internet of Things (IoT) applications in smart farming, smart tourism and the ubiquitous smart cities frameworks. Here we outline major characteristics and innovative scenarios that may share a common digital background and have potentially high societal impact.

      Speaker: Dr Domingos Barbosa (Instituto de Telecomunicações)
    • 12:00 12:45
      Discussion 6: Technologies 45m

      Panel members:
      Mark Thompson
      Mauricio Suarez
      Darwin Mena
      Domingos Barbosa

      Speaker: Dr Rene Breton
    • 12:45 13:45
      Lunch Break 1h
    • 13:45 14:45
      Brainstorming Session
      Convener: Dr Rene Breton
    • 14:45 15:15
      Workshop Summary & Concluding Remarks
    • 15:15 15:45
      Coffee Break 30m