
Working Papers

Economic Analysis and Policy

This paper proposes using synthetic training data generated by large language models to improve machine-learning classifiers for the Sustainable Development Goals (SDGs). It shows that supplementing existing training data with synthetic data produced by ChatGPT improves the performance of the SDGClassy classifier. Synthetic data is especially useful for building SDG classifiers because properly labeled data is scarce and the SDGs are complex and interconnected. Synthetic data thus enables more effective machine-learning applications in this context.

Economic Analysis and Policy

We introduce two datasets, the Global Consumption Dataset (GCD) and the Global Income Dataset (GID), which provide an unprecedented portrait of the consumption and income of persons over time, within and across countries, around the world. The benchmark version of the datasets presents estimates, in PPP units, of monthly real consumption and income for every decile of the population (a ‘consumption/income profile’) for 133 countries over more than half a century (1960-2012). We describe the construction of the datasets and demonstrate some possible uses by presenting preliminary results concerning the consumption distribution, poverty and inequality for the world and specific country…