In [1]:

# import geemap
import geemap as geemap
import ee
import numpy as np
import pandas as pd

from egis.utils import (
    vis_params_dict, default_vis_params, get_known_roi, extract_single_img, 
    extract_img_collection_metadata, extract_img_collection_properties,
    coords_to_polygon, collection_len
)

In [2]:

ee.Initialize()

Description: understanding Landsat satellite data, frequency of images and ways to visualize without clouds.

Landsat coverage¶

The satellite is circulating around the globe and taking remote sensing images. Thereby it will take consecutive snapshots from the earth as it progresses. Individual areas will be revisited regularly roughly every 2 weeks.

In [3]:

roi_coords = get_known_roi('grand_canyon')
roi_region_large = coords_to_polygon(get_known_roi('us_west_large'))
roi_region_medium = coords_to_polygon(get_known_roi('grand_canyon_rectangle'))

In [4]:

vis_params_l8 = default_vis_params('landsat')

In [5]:

dataset = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR') \
                  .filterDate('2016-10-01', '2016-12-31') \
                  .filterBounds(roi_region_large)

The orbit of the satellite is reflected in the image tiles: vertical image neighbours are aligned more than horizontal neighbours, because there is much less time lag between vertically consecutive images.

In [ ]:

Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 5)
Map.addLayer(dataset, vis_params_l8)
Map

$image.png$

Due to the time lag, horizontal neighbours can have very different weather conditions.

In [ ]:

Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 8)
Map.addLayer(dataset, vis_params_l8)
Map

Image metadata¶

To get a better feeling about the volume of images produced by Landsat we can pull a collection of images and extract metadata from it.

In [8]:

dataset = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR') \
                  .filterDate('2016-01-01', '2016-12-31') \
                  .filterBounds(roi_region_medium)

Given those time and spatial filters the collection still has rather many images:

In [9]:

collection_len(dataset)

Out[9]:

In [10]:

this_img = extract_single_img(dataset, 0)
this_img_props_dict = geemap.image_props(this_img).getInfo()
img_props = pd.DataFrame.from_dict(this_img_props_dict, orient='index')
img_props

Out[10]:

	0
CLOUD_COVER	49.26
CLOUD_COVER_LAND	49.26
EARTH_SUN_DISTANCE	0.987519
ESPA_VERSION	2_23_0_1a
GEOMETRIC_RMSE_MODEL	9.104
GEOMETRIC_RMSE_MODEL_X	7.242
GEOMETRIC_RMSE_MODEL_Y	5.517
IMAGE_DATE	2016-02-14
IMAGE_QUALITY_OLI	9
IMAGE_QUALITY_TIRS	9
LANDSAT_ID	LC08_L1TP_034034_20160214_20170224_01_T1
LEVEL1_PRODUCTION_DATE	1487928420000
NOMINAL_SCALE	30
PIXEL_QA_VERSION	generate_pixel_qa_1.6.0
SATELLITE	LANDSAT_8
SENSING_TIME	2016-02-14T17:44:12.9346930Z
SOLAR_AZIMUTH_ANGLE	151.02919
SOLAR_ZENITH_ANGLE	55.366402
SR_APP_VERSION	LaSRC_1.3.0
WRS_PATH	34
WRS_ROW	34
system:asset_size	697.355882 MB
system:band_names	[B1, B2, B3, B4, B5, B6, B7, B10, B11, sr_aero...
system:id	LANDSAT/LC08/C01/T1_SR/LC08_034034_20160214
system:index	LC08_034034_20160214
system:time_end	2016-02-14 17:44:12
system:time_start	2016-02-14 17:44:12
system:version	1522753341537768

In [11]:

img_props.loc['system:band_names'].squeeze()

Out[11]:

['B1',
 'B2',
 'B3',
 'B4',
 'B5',
 'B6',
 'B7',
 'B10',
 'B11',
 'sr_aerosol',
 'pixel_qa',
 'radsat_qa']

We can extract the most important properties for all images:

In [12]:

list_of_props = ['CLOUD_COVER', 'CLOUD_COVER_LAND', 'WRS_PATH', 'WRS_ROW']
all_img_dates = geemap.image_dates(dataset).getInfo()
metadata = extract_img_collection_metadata(dataset, list_of_props)
metadata['date'] = all_img_dates
metadata.head(5)

Out[12]:

	CLOUD_COVER	CLOUD_COVER_LAND	WRS_PATH	WRS_ROW	date
0	49.26	49.26	34.0	34.0	2016-02-14
1	9.31	9.31	34.0	34.0	2016-03-01
2	2.47	2.47	34.0	34.0	2016-03-17
3	2.99	2.99	34.0	34.0	2016-04-02
4	6.76	6.76	34.0	34.0	2016-05-04

From this we immediately can see that we get multiple images for multiple tiles of the Worldwide reference system:

In [13]:

metadata.loc[:, ['WRS_PATH', 'WRS_ROW']].drop_duplicates().head(8)

Out[13]:

	WRS_PATH	WRS_ROW
0	34.0	34.0
20	34.0	35.0
42	34.0	36.0
65	34.0	37.0
87	35.0	33.0
110	35.0	34.0
133	35.0	35.0
156	35.0	36.0

A given tile is revisited roughly every 2 weeks:

In [14]:

single_tile_over_time = metadata.query('WRS_PATH == 39 and WRS_ROW == 37')
single_tile_over_time

Out[14]:

	CLOUD_COVER	CLOUD_COVER_LAND	WRS_PATH	WRS_ROW	date
633	74.01	74.01	39.0	37.0	2016-01-16
634	11.01	11.01	39.0	37.0	2016-02-01
635	2.33	2.33	39.0	37.0	2016-02-17
636	10.90	10.90	39.0	37.0	2016-03-04
637	2.41	2.41	39.0	37.0	2016-03-20
638	87.75	87.75	39.0	37.0	2016-04-05
639	5.50	5.50	39.0	37.0	2016-04-21
640	27.35	27.35	39.0	37.0	2016-05-07
641	0.17	0.17	39.0	37.0	2016-05-23
642	8.91	8.91	39.0	37.0	2016-06-08
643	0.00	0.00	39.0	37.0	2016-06-24
644	0.00	0.00	39.0	37.0	2016-07-10
645	1.28	1.28	39.0	37.0	2016-07-26
646	4.75	4.75	39.0	37.0	2016-08-11
647	2.08	2.08	39.0	37.0	2016-08-27
648	0.02	0.02	39.0	37.0	2016-09-12
649	3.34	3.34	39.0	37.0	2016-09-28
650	0.77	0.77	39.0	37.0	2016-10-14
651	79.11	79.11	39.0	37.0	2016-10-30
652	6.53	6.53	39.0	37.0	2016-11-15
653	40.18	40.18	39.0	37.0	2016-12-01
654	0.30	0.30	39.0	37.0	2016-12-17

In [15]:

single_tile_over_time.shape

Out[15]:

(22, 5)

Depending on the time of the year, any given tile might look pretty different for different visits of the satellite:

In [ ]:

scene = extract_single_img(dataset, 643)

Map = geemap.Map()
Map.centerObject(scene, 8)
Map.addLayer(scene, vis_params_l8, 'default RGB')

Map

In particular with weather phenomena like clouds:

In [ ]:

scene = extract_single_img(dataset, 633)

Map = geemap.Map()
Map.centerObject(scene, 8)
Map.addLayer(scene, vis_params_l8, 'default RGB')

Map

$image.png$

Mosaic visualization¶

Clouds are an obvious impediment for any visual inspection. So we can try to improve the visualization of a mosaic by only taking images with rather low ratio of clouds.

Without any special consideration of clouds, this is what we get:

In [ ]:

Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(dataset, vis_params_l8)
Map

An easy way to improve things is by taking median pixel values, not only the last observation per pixel:

In [ ]:

Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(dataset.median(), vis_params_l8)
Map

If we sort with regards to cloud coverage, we can easily visualize both extremes for cloud coverage.

In [20]:

cloud_reduced_images = dataset.sort('CLOUD_COVER', opt_ascending=True)

In [ ]:

Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(cloud_reduced_images, vis_params_l8)
Map

In [22]:

cloud_reduced_images = dataset.sort('CLOUD_COVER', opt_ascending=False)

In [ ]:

Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(cloud_reduced_images, vis_params_l8)
Map

Although this is already a huge improvement of the visualization, it still comes with the problem that individual images might be taken from totally different points in time. This might lead to neighbouring images having totally different vegetation states (summer vs winter) or even having snow vs no snow.

Masking¶

Masking is a way to create a boolean flag on a pixel level. So for example, for each pixel in an image we could identify whether in this pixel we either see a cloud or a shadow of a cloud. Such a mask can then be used to only show those parts of an image that fulfill a certain level of quality. In the following example we only will show only pixels that are not covered with clouds.

In [ ]:

scene = extract_single_img(dataset, 633)

Map = geemap.Map()
Map.centerObject(scene, 8)
Map.addLayer(scene, vis_params_l8, 'default RGB')

Map

In [25]:

def maskL8sr(image):
    # From: https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C01_T1_SR
    
    # Bits 3 and 5 are cloud shadow and cloud, respectively.
    cloudShadowBitMask = (1 << 3)
    cloudsBitMask = (1 << 5)
    
    # Get the pixel QA band.
    qa = image.select('pixel_qa')
    
    # Both flags should be set to zero, indicating clear conditions.
    mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0) \
                 .And(qa.bitwiseAnd(cloudsBitMask).eq(0))
    
    return image.updateMask(mask)

In [ ]:

scene = maskL8sr(extract_single_img(dataset, 633))

Map = geemap.Map()
Map.centerObject(scene, 8)
Map.addLayer(scene, vis_params_l8, 'default RGB')

Map

Now we can also combine masking techniques with median computations: first we filter clouds on a per pixel basis and take the median values of the remaining pixels.

In [ ]:

Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(dataset.map(maskL8sr).median(), vis_params_l8)
Map