96 image chips
How to export thousands of image chips from Earth Engine in a few minutes¶
This source code of this notebook was adopted from the Medium post - Fast(er) Downloads by Noel Gorelick. Credits to Noel.
Due to the limitation of the multiprocessing package, the functionality of this notebook can only be run in the top-level. Therefore, it could not be implemented as a function under geemap.
Install packages¶
Uncomment the following line to install the required packages.
# !pip install geemap retry
Import libraries¶
import ee
import geemap
import logging
import multiprocessing
import os
import requests
import shutil
from retry import retry
Initialize GEE to use the high-volume endpoint¶
ee.Initialize(opt_url="https://earthengine-highvolume.googleapis.com")
Create an interactive map¶
Map = geemap.Map()
Map
Define the Region of Interest (ROI)¶
You can use the drawing tools on the map to draw an ROI, then you can use Map.user_roi
to retrieve the geometry. Alternatively, you can define the ROI as an ee.Geometry as shown below.
# region = Map.user_roi
region = ee.Geometry.Polygon(
[
[
[-122.513695, 37.707998],
[-122.513695, 37.804359],
[-122.371902, 37.804359],
[-122.371902, 37.707998],
[-122.513695, 37.707998],
]
],
None,
False,
)
Define the image source¶
Using the 1-m NAIP imagery.
image = (
ee.ImageCollection("USDA/NAIP/DOQQ")
.filterBounds(region)
.filterDate("2020", "2021")
.mosaic()
.clip(region)
.select("N", "R", "G")
)
Using the 10-m Sentinel-2 imagery.
# image = ee.ImageCollection('COPERNICUS/S2_SR') \
# .filterBounds(region) \
# .filterDate('2021', '2022') \
# .select('B8', 'B4', 'B3') \
# .median() \
# .visualize(min=0, max=4000) \
# .clip(region)
Set parameters¶
If you want the exported images to have coordinate system, change format
to GEO_TIFF
. Otherwise, you can use png
or jpg
formats.
params = {
"count": 100, # How many image chips to export
"buffer": 127, # The buffer distance (m) around each point
"scale": 100, # The scale to do stratified sampling
"seed": 1, # A randomization seed to use for subsampling.
"dimensions": "256x256", # The dimension of each image chip
"format": "png", # The output image format, can be png, jpg, ZIPPED_GEO_TIFF, GEO_TIFF, NPY
"prefix": "tile_", # The filename prefix
"processes": 25, # How many processes to used for parallel processing
"out_dir": ".", # The output directory. Default to the current working directly
}
Add layers to map¶
Map.addLayer(image, {}, "Image")
Map.addLayer(region, {}, "ROI", False)
Map.setCenter(-122.4415, 37.7555, 12)
Map
Generate a list of work items¶
In the example, we are going to generate 1000 points using the stratified random sampling, which requires a classBand
. It is the name of the band containing the classes to use for stratification. If unspecified, the first band of the input image is used. Therefore, we have toADD a new band with a constant value (e.g., 1) to the image. The result of the getRequests()
function returns a list of dictionaries containing points.
def getRequests():
img = ee.Image(1).rename("Class").addBands(image)
points = img.stratifiedSample(
numPoints=params["count"],
region=region,
scale=params["scale"],
seed=params["seed"],
geometries=True,
)
Map.data = points
return points.aggregate_array(".geo").getInfo()
Create a function for downloading image¶
The getResult()
function then takes one of those points and generates an image centered on that location, which is then downloaded as a PNG and saved to a file. This function uses image.getThumbURL()
to select the pixels, however you could also use image.getDownloadURL()
if you wanted the output to be in GeoTIFF or NumPy format (source).
@retry(tries=10, delay=1, backoff=2)
def getResult(index, point):
point = ee.Geometry.Point(point["coordinates"])
region = point.buffer(params["buffer"]).bounds()
if params["format"] in ["png", "jpg"]:
url = image.getThumbURL(
{
"region": region,
"dimensions": params["dimensions"],
"format": params["format"],
}
)
else:
url = image.getDownloadURL(
{
"region": region,
"dimensions": params["dimensions"],
"format": params["format"],
}
)
if params["format"] == "GEO_TIFF":
ext = "tif"
else:
ext = params["format"]
r = requests.get(url, stream=True)
if r.status_code != 200:
r.raise_for_status()
out_dir = os.path.abspath(params["out_dir"])
basename = str(index).zfill(len(str(params["count"])))
filename = f"{out_dir}/{params['prefix']}{basename}.{ext}"
with open(filename, "wb") as out_file:
shutil.copyfileobj(r.raw, out_file)
print("Done: ", basename)
Download images¶
%%time
logging.basicConfig()
items = getRequests()
pool = multiprocessing.Pool(params["processes"])
pool.starmap(getResult, enumerate(items))
pool.close()
Retrieve sample points¶
Map.addLayer(Map.data, {}, "Sample points")
Map