Datasets for the rest of us, via

  • Originally a data repository of publicly-available datasets for ‘power users’
  • Revamped site seeks to make data relevant and understandable to laymen as well

WITH all the talk of data and analytics, the Infocomm Development Authority of Singapore (IDA) launched a portal in June 2011 as a repository for publicly-available datasets, originally aimed at so-called ‘power users.’
Now the agency wants to go one step further and make data relevant and understandable to the rest of us.
The original idea was to give power users the chance to play around with data, according to Lin Zhaowei, Government Digital Services consultant at the IDA. It was the Government’s one-stop portal for publicly-available datasets from 70 public agencies.
“It was mainly … pitched at power users of data – you had to download the data files first, then put them into a programme like Excel before you could even see what the data showed,” he tells Digital News Asia (DNA) via email.
But the initial reception was muted, to say the least.
“Even if you were a ‘data person,’ you might have never heard of the site,” Lin admits.
But the IDA also wanted members of the general public to understand and use the 8,700 datasets that were then available, so work on a site revamp began in late 2014, with the public beta officially launched last October.
The new site has over 500 datasets – the older datasets were cleaned up and the inadequate ones cleared out – a number that will grow as more become available. [The previous two paragraphs were amended for greater accuracy.]
“We had one key mission: Help the public understand and use public data,” says Lin.
“So we introduced new features such as interactive charts, dashboards, a blog section for data narratives, and a new developers’ portal that provides high-frequency datasets,” he adds.
To do all this, it was necessary to define standards for ease of use. These included standards on how data tables are structured, as well as for “metadata so users know what’s in the dataset,” says Lin.
“These standards are important because people can use our datasets without spending hours cleaning the data themselves,” he adds.
Because the datasets are now visual, users do not need to plug the data into a program. Almost all datasets on the new site are represented in charts, according to Lin.
“You don’t even need to download the data to see the general trend of the numbers, and all datasets are accessible via an API (application programming interface), which makes it really easy for developers to extract the latest data in a consistent format,” he says.
Users warm up to heat maps, and more

All these efforts have paid off. Since the start of the year, the revamped site has had more than 190,000 unique visitors and one million pageviews, according to the IDA’s Government Digital Services associate consultant Loh Li Wei.
“Reception has been very positive, and we are confident that more people will use the site as we continue to add more datasets and introduce more features,” he tells DNA via email.
The datasets have spawned interesting uses from members of the public, with people building real-time visualisations through the API.
“Recently, a Year One computer science student from NUS (the National University of Singapore) created a heat map visualising all the available taxis in Singapore,” says Loh.
“It was very popular online, with many media outlets sharing the work.
“Just weeks before that, another developer created a web app for people to check where taxi stands and available taxis are.
“It’s really interesting because you can use geolocation on your phone to check the situation where you are.
“Both apps were created using the real-time taxi availability dataset from our developers’ portal that was launched in April,” he adds.
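For readers curious how a heat map like this comes together, the core step is simply counting how many taxis fall into each cell of a grid laid over the map. What follows is a minimal sketch of that idea, not the student's actual code: the function name `grid_counts`, the cell size, and the sample coordinates are all hypothetical, and the positions are made up rather than drawn from the real-time dataset.

```python
from collections import Counter
import math


def grid_counts(points, cell_deg=0.01):
    """Count points per grid cell; each cell is cell_deg degrees on a side."""
    counts = Counter()
    for lon, lat in points:
        # Map each coordinate to the integer index of the cell containing it.
        cell = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
        counts[cell] += 1
    return counts


# Hypothetical sample positions (longitude, latitude), not real taxi data.
sample = [
    (103.851, 1.291),
    (103.852, 1.292),
    (103.858, 1.299),
    (103.901, 1.351),
]

counts = grid_counts(sample)
# The busiest cell is the "hottest" spot on the map.
hottest_cell, hottest_count = counts.most_common(1)[0]
```

A real visualisation would feed live positions into the same counting step and colour each cell by its count.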
The developer of that web app, Lim Chee Aun, rates the current datasets an impressive “8 out of 10” because of the “vast availability of different types of data.”
“I personally would love to see more transportation-related datasets as I find them more useful and relevant to my everyday use cases.
“So far it’s been a pretty good experience for me to use some of the APIs – of course, I would love to see the data be more accurate and in real-time.
“There’s quite a lot of historical datasets in [the portal], which is great for analysis and visualisations, but I’m more focused on making them useful for people, right now, right at the moment,” he adds.
Lim sees more opportunities for developers to consume these datasets.
“As a developer, nothing frustrates me more than the lack of datasets that stops me from executing on my ideas and helping the public,” he says.
This was all part of the plan for exposing the general public to data, according to IDA’s Lin.
“There are two general groups of data consumers: The general public, laymen who don’t really know what data is about; and power users who are looking for data to do analysis and app development,” he says.
“By making data relevant and understandable, citizens are being exposed to data that they may not know was available in the first place.
“Through our blog, we also hope to write articles that are of interest to the general public, such as our piece on 4G mobile broadband speeds across the island.
“We hope that as we provide more data to the public, creative developers can come up with innovative ways to turn that data into useful apps or services for people,” he adds.
Smart Nation and beyond

With Singapore’s Smart Nation initiative ramping up, the portal has its work cut out for it, especially as data has been identified as essential.
“At the Founders Forum in April 2015, Prime Minister Lee Hsien Loong mentioned that ageing, mobility and data sharing would top Smart Nation priorities,” says the IDA’s Loh.
“To support these goals, [the portal] will continue working with agencies to make more datasets available to the public, and explore new ways to deliver data to Singaporeans.
“With these initiatives, [the portal] can help create a rich ecosystem where data can be shared to unlock value and innovation for all Singaporeans,” he declares.
However, some work still needs to be done in migrating older datasets, as well as in boosting engagement, admits Lin.
“Currently, our immediate priority is to migrate datasets over from the old site onto the new site,” he says. “There are over 70 agencies to work with, so this is taking some time.
“In the coming months, we will be creating more data-driven content for our blog, and will also be ramping up our engagement with the public via our social media channels.
“We are starting to look into creating chatbots for popular messaging services such as Facebook Messenger so that it would be easier to interact with our data,” he adds.
Meanwhile, more datasets are being identified for public release, according to Loh.
“We have been working with the NEA (National Environment Agency) to make more high-frequency environment data such as wind speed and direction available on our developers’ portal.
“This will be available to the public within the next few months.
“At the same time, we are working with various agencies to identify more datasets that can be shared with the public,” he adds.