
Overview
Building footprints are useful for a range of important applications, from population estimation, urban planning and humanitarian response, to environmental and climate science. This large-scale open dataset contains the outlines of buildings derived from high-resolution satellite imagery in order to support these types of uses. The project is based in Google's Ghana office, focusing on the continent of Africa and the Global South at large.
Dataset description
The dataset contains 1.8 billion building detections across an inference area of 58M km² covering Africa, South Asia, South-East Asia, Latin America and the Caribbean. The current dataset is in its 3rd version.
For each building in this dataset we include the polygon describing its footprint on the ground, a confidence score indicating how sure we are that this is a building, and a Plus Code corresponding to the centre of the building. There is no information about the type of building, its street address, or any details other than its geometry.

Uses of the data
Potential use cases of the data include:
- Population mapping: Building footprints are a key ingredient for estimating population density. In areas of rapid change, or where census information is out of date, population estimates are vital for many kinds of planning and statistics.
- Humanitarian response: To plan the response to a flood, drought, or other natural disaster, it is useful to assess the number of buildings or households affected. This is also useful for disaster risk reduction, e.g. to estimate the number of buildings in a particular hazard area.
- Environmental science: Knowledge of settlement density is useful for understanding human impact on the natural environment. For example, it helps with estimating energy needs and carbon emissions in a certain area, or pressure on protected areas and wildlife due to urbanisation.
- Addressing systems: In many areas buildings do not have formal addresses, which can make it difficult for people to access social benefits and economic opportunities. Building footprint data can help with the rollout of digital addressing systems such as Plus Codes.
- Vaccination planning: Knowing the density of population and settlements helps to anticipate demand for vaccines and the best locations for facilities. This data is also useful for precision epidemiology, as well as eradication efforts such as mosquito net distribution.
- Statistical indicators: Buildings data can be used to help calculate statistical indicators for national planning, such as the number of houses in the catchment areas of schools and health centres, mean travel distances to the nearest hospital, or forecasts of demand for transportation systems.
Explore the Open Buildings data
Map legend (confidence bands):
- confidence >= 0.75
- 0.7 <= confidence < 0.75
- 0.65 <= confidence < 0.7
Data format
The dataset consists of 3 parts: building polygons, building points and score thresholds.
Building polygons and points
Building polygons and points are stored in spatially sharded CSVs with one CSV per S2 cell level 4. Each row in the CSV represents one building polygon or point and has the following columns:
- latitude: latitude of the building polygon centroid,
- longitude: longitude of the building polygon centroid,
- area_in_meters: area in square meters of the polygon,
- confidence: confidence score in the range [0.65, 1.0] assigned by the model,
- geometry: the building polygon in the WKT format (POLYGON or MULTIPOLYGON). This column is present only in the polygons data,
- full_plus_code: the full Plus Code at the building polygon centroid.
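As a minimal sketch of how rows with these columns might be read, the Python snippet below parses a small in-memory CSV with the standard library, keeps the rows at or above a chosen confidence, and sums their area. The sample rows, the Plus Codes, the header order and the 0.75 cut-off are all made up for illustration; real shards are far larger.

```python
import csv
import io

# Made-up sample rows using the polygon column names listed above.
# Values (including the Plus Codes) are illustrative only, not real detections.
SAMPLE_CSV = """latitude,longitude,area_in_meters,confidence,geometry,full_plus_code
5.6037,-0.1870,120.5,0.91,"POLYGON((-0.1871 5.6036, -0.1869 5.6036, -0.1869 5.6038, -0.1871 5.6038, -0.1871 5.6036))",6CQXJR37+FX
5.6040,-0.1865,48.0,0.68,"POLYGON((-0.1866 5.6039, -0.1864 5.6039, -0.1864 5.6041, -0.1866 5.6041, -0.1866 5.6039))",6CQXJR37+HC
5.6051,-0.1859,75.25,0.80,"POLYGON((-0.1860 5.6050, -0.1858 5.6050, -0.1858 5.6052, -0.1860 5.6052, -0.1860 5.6050))",6CQXJR47+2J
"""


def load_buildings(text, min_confidence=0.75):
    """Return parsed rows whose confidence is at least min_confidence."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        confidence = float(row["confidence"])
        if confidence >= min_confidence:
            rows.append({
                "latitude": float(row["latitude"]),
                "longitude": float(row["longitude"]),
                "area_in_meters": float(row["area_in_meters"]),
                "confidence": confidence,
                "geometry": row["geometry"],  # WKT string, left unparsed here
                "full_plus_code": row["full_plus_code"],
            })
    return rows


buildings = load_buildings(SAMPLE_CSV)
total_area = sum(b["area_in_meters"] for b in buildings)
print(len(buildings), total_area)
```

The geometry column is kept as a raw WKT string; turning it into a geometry object would need a WKT-capable library, which is outside this sketch.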
Score thresholds
The estimated score thresholds are stored as one CSV. Each row in the CSV represents one S2 cell level 4 bucket and has the following columns:
- s2_token: S2 cell token of the bucket,
- geometry: geometry in the WKT format of the S2 cell bucket,
- confidence_threshold_80%_precision, confidence_threshold_85%_precision, confidence_threshold_90%_precision: estimated confidence score threshold to get specific precision for building polygons in this S2 cell bucket,
- building_count_80%_precision, building_count_85%_precision, building_count_90%_precision: number of building polygons in this S2 cell bucket with confidence score greater than or equal to the score threshold needed to get the specific precision,
- building_count: number of building polygons in this S2 cell bucket,
- num_samples: number of samples used for generating the score threshold. This feature exists from v2 onwards.
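The thresholds file can be used to pick a per-cell confidence cut-off for a target precision. The following Python sketch looks up the threshold for one S2 bucket and applies it to a list of confidence scores. The tokens and numbers are invented for illustration, the sample omits several of the columns listed above for brevity, and the helper names `threshold_for` and `filter_by_threshold` are hypothetical.

```python
import csv
import io

# Made-up thresholds file with two S2 buckets; tokens and values are illustrative.
THRESHOLDS_CSV = """s2_token,confidence_threshold_80%_precision,confidence_threshold_85%_precision,confidence_threshold_90%_precision,building_count
0fd,0.68,0.72,0.79,1000
1a1,0.70,0.75,0.83,500
"""


def threshold_for(text, s2_token, precision=90):
    """Look up the confidence threshold for one S2 bucket and precision level."""
    column = f"confidence_threshold_{precision}%_precision"
    for row in csv.DictReader(io.StringIO(text)):
        if row["s2_token"] == s2_token:
            return float(row[column])
    raise KeyError(f"no bucket with token {s2_token!r}")


def filter_by_threshold(confidences, threshold):
    """Keep confidence scores greater than or equal to the threshold."""
    return [c for c in confidences if c >= threshold]


threshold = threshold_for(THRESHOLDS_CSV, "0fd", precision=90)
kept = filter_by_threshold([0.66, 0.79, 0.85, 0.78], threshold)
print(threshold, kept)
```

The comparison uses "greater than or equal to", matching the definition of the building_count columns above.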
Download
The polygon data (178 GB total) is split into CSV files, one per level 4 S2 cell, each up to 7.8 GB in size. Similarly, the points data (48 GB total) is split into files of up to 2.1 GB each. Select a download method below.
Download from the map
To manually download polygons data for a specific cell, click on the map below.
Download polygons or points data for a specific country or region
This Colab notebook shows how data can be downloaded for a specific country or region.
Download all data
Download all polygons using gsutil (178 GB total):
gsutil cp -R gs://open-buildings-data/v3/polygons_s2_level_4_gzip <path_to_local_directory>
Download all points (48 GB total):
gsutil cp -R gs://open-buildings-data/v3/points_s2_level_4_gzip <path_to_local_directory>
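Because individual files can run to several gigabytes, it may be preferable to stream them row by row rather than load them whole. The Python sketch below assumes the downloaded files are gzip-compressed CSVs with a header row (suggested by the `_gzip` directory names, but not confirmed here). The file name `example_buildings.csv.gz` and the sample rows are hypothetical stand-ins, written to a temporary directory so the example runs without any download.

```python
import csv
import gzip
import os
import tempfile


def count_confident_buildings(path, min_confidence=0.75):
    """Stream a gzip-compressed CSV and count rows at or above min_confidence."""
    count = 0
    # Text mode with newline="" lets the csv module handle line endings itself.
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        for row in csv.DictReader(handle):
            if float(row["confidence"]) >= min_confidence:
                count += 1
    return count


# Build a tiny stand-in shard; the name and contents are made up.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "example_buildings.csv.gz")
with gzip.open(sample_path, mode="wt", encoding="utf-8", newline="") as handle:
    writer = csv.writer(handle)
    writer.writerow(["latitude", "longitude", "area_in_meters", "confidence", "full_plus_code"])
    writer.writerow([5.6037, -0.1870, 120.5, 0.91, "6CQXJR37+FX"])
    writer.writerow([5.6040, -0.1865, 48.0, 0.68, "6CQXJR37+HC"])

print(count_confident_buildings(sample_path))
```

Only one row is held in memory at a time, so the same function should work unchanged on a multi-gigabyte file.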
Download metadata
Metadata files can be downloaded as follows:
- Tile geometry and URLs in geojson format.
- Score thresholds CSV file.
Download score thresholds using gsutil:
gsutil cp gs://open-buildings-data/v3/score_thresholds_s2_level_4.csv <path_to_local_directory>
Version history
v1: inference carried out during April 2021 on imagery covering 19.4M km² of Africa.
v2: inference carried out during August 2022 on imagery covering 39.1M km² of Africa, South and South-East Asia.
v3: inference carried out during May 2023 on imagery covering 58M km² of Africa, South and South-East Asia, Latin America and the Caribbean. See FAQ for comparison of versions.
FAQ
TO THE FULLEST EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT WILL ANY OF THE LICENSORS OR ANY THIRD PARTY THAT PUBLISHES ANY LICENSED MATERIAL BE LIABLE TO YOU ON ANY LEGAL THEORY FOR ANY INCIDENTAL, DIRECT, INDIRECT, PUNITIVE, ACTUAL, SPECIAL, EXEMPLARY, OR OTHER DAMAGES, INCLUDING WITHOUT LIMITATION, LOSS OF REVENUE OR INCOME, LOST PROFITS, PAIN AND SUFFERING, EMOTIONAL DISTRESS, COST OF SUBSTITUTE GOODS OR SERVICES, OR SIMILAR DAMAGES SUFFERED OR INCURRED BY YOU OR ANY THIRD PARTY THAT ARISE IN CONNECTION WITH SUCH MATERIALS (OR THE TERMINATION THEREOF FOR ANY REASON), EVEN IF ANY OF THE LICENSORS OR ANY THIRD PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. TO THE FULLEST EXTENT PERMITTED BY APPLICABLE LAW, ANY OF THE LICENSORS OR ANY THIRD PARTY IS NOT RESPONSIBLE OR LIABLE WHATSOEVER IN ANY MANNER FOR ANY CONTENT POSTED ON OR AVAILABLE THROUGH THE RELEVANT MATERIALS (INCLUDING CLAIMS OF INFRINGEMENT RELATING TO THAT CONTENT), FOR YOUR USE OF THE MATERIALS.