This paper proposes using synthetic training data generated by large language models
to improve machine-learning classifiers for the Sustainable Development Goals (SDGs). It shows that supplementing existing training data with
synthetic data produced by ChatGPT improves the performance of the SDGClassy classifier.
Synthetic data is especially useful for building SDG classifiers because properly labeled data is scarce
and the SDGs are complex and interconnected. It thus enables
more effective machine-learning applications in this context.
Working Papers
Economic Analysis and Policy
Economic Analysis and Policy, Sustainable Development
Does digitalization reduce corruption? What are the benefits of data-driven digital government innovations to strengthen public integrity and advance the Sustainable Development Goals? While the correlation between digitalization and corruption is well established, there is less actionable evidence on the effects of specific digitalization reforms on different types of corruption and the policy channels through which they operate. This paper unbundles the integrity dividends of digital reforms that the pandemic has accelerated. It analyses the rise of integrity-tech and integrity analytics in the anticorruption space, deployed by data-savvy integrity institutions. It also assesses the…