Finding the Right Tools — How We Rebuilt Iran Open Data On CKAN

November 9, 2017

In November 2016, the Iran Open Data platform was launched: an open data platform developed by civil society that gathers, cleans, and structures datasets published by the Iranian government, and makes them freely available to citizens, journalists, and civil society stakeholders.

After a year of intense activity, the team is gearing up to launch an all-new version of the Iran Open Data platform at the end of November, with a whole host of additional functionalities to make it even more useful for journalists and civil society groups.

So why did the platform need an overhaul? Honestly, the first version was far from perfect, and caused more than its share of operational headaches for our team. In this post we’ll share some of the challenges the team faced when running the old Iran Open Data platform, and highlight the advantages we see in building something on CKAN (the Comprehensive Knowledge Archive Network) instead.

Why did Iran Open Data need an overhaul?

From the very beginning, Iran Open Data put the issue of transparency at its core. As a consequence, the original platform was built using Jekyll, with the website’s source code openly available on GitHub, the data hosted on Amazon Web Services, and the site itself served via GitHub Pages.

This meant that everything about Iran Open Data was publicly available and accessible, but over time this configuration did end up causing us a number of practical problems.

Most crucially, the Jekyll/GitHub setup was not scalable at all. Although the team has been able to upload datasets in bulk, additional notes need to be added manually to each dataset, which is an intensely time-consuming process. Uploads are also very slow, currently sitting at a painful 30 minutes each.

The first iteration of Iran Open Data wasn’t entirely user-friendly, either. Datasets cannot be previewed before downloading the .csv in question, and there is no search functionality to help users sift through the 380+ datasets on offer. With all these issues in mind, a large-scale overhaul of the platform was needed.

That’s why we decided to take another look at what other open data platforms were doing. A number of governments around the world, from Asia-Pacific to Europe, Africa, and the United States, are now moving towards using CKAN as the main portal for sharing datasets publicly. Given this trend, we thought it would be worthwhile to review the platform’s functionalities, and to figure out the value of CKAN for a civil society-led initiative such as Iran Open Data.

The new-look Iran Open Data homepage

The Advantages of CKAN for Iran Open Data

Visualisations are cool!

When it comes to usability, being able to visualise imported .csv files within CKAN gives users more options to help them understand the available datasets. The ‘resource view’ functionalities of CKAN allow users to quickly understand datasets’ core trends through a range of simple graph types.

It’s low maintenance.

For organisations with limited budgets that want to build simple data storage into a portal, CKAN is the simplest tool to install at low cost. Additionally, handy CKAN extensions allow users to upload datasets from other sources with minimal hassle.

It speaks the right language.

When it comes to compatibility, smooth language localisation features are needed to make the portal fully accessible to users from different parts of the world. CKAN has great extensions that can support multiple-language datasets (more on that below!), making it a great fit for Iran Open Data’s Persian and English-language datasets.

It keeps things light.

With platforms like CKAN, there are always valid concerns around accessibility in areas with low levels of internet connectivity. In the Iranian context, internet speeds can vary dramatically across the country, depending on local infrastructure. However, the current version of CKAN makes it possible to run a ‘lite’ version of the platform to keep it accessible to users with lousy connectivity.

On IOD 2.0 you can now preview datasets without having to download and open each one in Excel!

A Few Exciting CKAN Extensions We’re Using

  • Fluent — A neat extension to store and return multilingual fields in CKAN datasets, resources, organizations and groups. We couldn’t keep IOD so seamlessly bilingual without it!
  • Pages — An extension for building custom pages. We used it for building IOD’s new ‘About Us’ and ‘Use Cases’ pages.
  • Scheming — This is the extension we’re using for our custom-built metadata schema.
  • PDFview — We want the possibility to upload PDFs in the future, and this extension will render any PDF inside IOD.
  • Custom Theme — Not an extension per se, but CKAN lets you create your own theme instead of changing the core files.
  • Bulk Import — Viderum, an organisation of open data superheroes, created a special script for us that lets us import datasets in bulk from a spreadsheet. This will save our team a lot of valuable time.
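
To give a feel for how a spreadsheet-driven bulk import and Fluent’s multilingual fields could fit together, here is a minimal Python sketch. It is not Viderum’s script, which isn’t reproduced in this post. The column names (`slug`, `organization`, `title_en` and so on), the sample row, and both helper functions are hypothetical. It assumes the `title_translated` / `notes_translated` field naming commonly used with ckanext-fluent, and it only builds dictionaries shaped like payloads for CKAN’s `package_create` API action, without sending anything to a server.

```python
import csv
import io

# Hypothetical spreadsheet export: one row per dataset, with English and
# Persian columns side by side. Column names are assumptions for this sketch.
SAMPLE = """slug,organization,title_en,title_fa,notes_en,notes_fa
sample-dataset,sample-org,Sample dataset,مجموعه داده نمونه,An illustrative row,یک ردیف نمونه
"""


def row_to_package(row):
    """Turn one spreadsheet row into a dict shaped like a package_create payload.

    The *_translated keys follow the naming commonly used with ckanext-fluent;
    the real schema depends on how the portal's metadata schema is configured.
    """
    return {
        "name": row["slug"],
        "owner_org": row["organization"],
        "title_translated": {"en": row["title_en"], "fa": row["title_fa"]},
        "notes_translated": {"en": row["notes_en"], "fa": row["notes_fa"]},
    }


def load_packages(csv_text):
    """Parse CSV text and return one payload dict per data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_package(row) for row in reader]


packages = load_packages(SAMPLE)
```

In a real import, each dictionary in `packages` would then be sent to the portal’s `package_create` endpoint with an API key. The appeal of this shape is that both language versions of a title or description travel together in a single record, so nobody has to add them by hand dataset by dataset.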

As a side note, we’re getting ongoing support from Viderum in the development, styling and maintenance of the portal, plus plenty of help building custom functionalities for Iran Open Data. They’re great!

Stand By for Launch — Iran Open Data v2.0 is Coming!

That’s it for now. The new-and-improved Iran Open Data will be landing in late November 2017, so stay on the lookout for further updates. We’re very keen to get your thoughts and feedback on the new platform once it rolls out, so don’t be shy if there are new things you’re not so keen on, or things that you absolutely adore! (Hopefully there’ll be more of the latter.)

Until then, leave us your questions and comments about the platform redesign, and get in touch at [email protected] if you’d like to chat some more about our work.

See you again soon!