Reproducing results
Reproducing data
After obtaining the necessary permissions for data access (particularly GEE, after which you will need to run
earthengine authenticate on the machine you are using to run the package), navigate to the data/ subpackage and run
bash make_countries.sh to produce the data for all countries considered.
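As a rough sketch, the first run can be wrapped in a small guard that checks the earthengine command-line tool is available before starting. The function name first_run and its repository-root argument are illustrative and not part of the package.

```shell
# Hypothetical helper: first_run is not part of the package.
# Usage: first_run /path/to/repo
first_run() {
    repo_root="${1:-.}"

    # The earthengine CLI is needed for the authentication step.
    if ! command -v earthengine >/dev/null 2>&1; then
        echo "earthengine CLI not found on PATH" >&2
        return 1
    fi

    # Authenticate this machine against your GEE account.
    earthengine authenticate || return 1

    # Run the data script from inside the data/ subpackage.
    (cd "$repo_root/data" && bash make_countries.sh)
}
```

The subshell around cd keeps your own working directory unchanged once the script finishes.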
The initial run will fail: for now, you need to go to the Google Drive of the authenticated account and manually place
all data in an external/gee directory within the top-level data/ directory. This will be fixed in the future.
Within your Drive, the GEE data for each country should be placed in a gee/{country} directory for better organisation.
On download, however, the data for all countries should currently be placed in a single directory; this may change in
future versions.
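The manual copy described above could be sketched as follows, assuming you have already downloaded the Drive folder to some local directory. The helper name stage_gee_exports and both of its arguments are hypothetical; only the data/external/gee target comes from the instructions above.

```shell
# Hypothetical helper: stage_gee_exports is not part of the package.
# Usage: stage_gee_exports /path/to/downloaded/gee /path/to/repo
stage_gee_exports() {
    src="$1"
    dest="$2/data/external/gee"

    if [ ! -d "$src" ]; then
        echo "source directory not found: $src" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1

    # Flatten any per-country subdirectories into the single target directory.
    # Note: files sharing a name across countries would overwrite each other.
    find "$src" -type f -exec cp {} "$dest/" \;
}
```

Flattening matches the current expectation that all countries share one local directory. If exported file names are not unique per country, this sketch would need adjusting.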
It may take some time for the GEE data to appear in your Drive - you can check the status of
tasks here.
Once all data is available locally, run bash make_countries.sh one more time, and on completion you should have all
data necessary!
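Before that second run, a quick check that the local GEE directory is actually populated can save a wasted attempt. This is a minimal sketch; gee_data_present is a hypothetical helper, not something the package provides.

```shell
# Hypothetical helper: gee_data_present is not part of the package.
# Succeeds only if the directory exists and contains at least one file.
gee_data_present() {
    dir="$1"
    [ -d "$dir" ] && [ -n "$(find "$dir" -type f | head -n 1)" ]
}

# Example, run from inside the data/ subpackage (left commented out):
# gee_data_present external/gee && bash make_countries.sh
```

The check only confirms that some file is present, not that every country's export has finished downloading.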
Reproducing models
Once the datasets have been created, you can train models with the same pipeline we used, for the same countries:
navigate to the models/ subpackage and run bash train_countries.sh.
Reproducing predictions
Finally, to use these models to make all the predictions, run bash predict_countries.sh. By default these predictions
will be saved in the predictions/ directory within the top-level data/ directory of the repo.
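Taken together, the three stages can be chained from the repository root. The sketch below uses a hypothetical run_step helper; it also assumes that predict_countries.sh sits in models/ alongside the training script, which the instructions above do not state explicitly.

```shell
# Hypothetical helper: run_step is not part of the package.
# Runs a script from inside a given directory, without changing
# the caller's working directory.
run_step() {
    step_dir="$1"
    script="$2"

    if [ ! -f "$step_dir/$script" ]; then
        echo "missing script: $step_dir/$script" >&2
        return 1
    fi

    (cd "$step_dir" && bash "$script")
}

# Example sequence from the repository root (left commented out):
# run_step data make_countries.sh &&
# run_step models train_countries.sh &&
# run_step models predict_countries.sh   # assumption: script lives in models/
```

Chaining with && stops the sequence at the first failing stage, so training never starts on incomplete data.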
Additional steps
Prediction intervals are not automatically generated as part of the pipeline. They will be incorporated soon, but in
the meantime you may follow the steps in ./notebooks/prediction_intervals.ipynb if desired.