In [1]:
# import geemap
import geemap as geemap
import ee
import numpy as np
import pandas as pd

from egis.utils import (
    vis_params_dict, default_vis_params, get_known_roi, extract_single_img, 
    extract_img_collection_metadata, extract_img_collection_properties,
    coords_to_polygon, collection_len
)
In [2]:
ee.Initialize()

Description: understanding Landsat satellite data, frequency of images and ways to visualize without clouds.

Landsat coverage

The satellite is circulating around the globe and taking remote sensing images. Thereby it will take consecutive snapshots from the earth as it progresses. Individual areas will be revisited regularly roughly every 2 weeks.

In [3]:
roi_coords = get_known_roi('grand_canyon')
roi_region_large = coords_to_polygon(get_known_roi('us_west_large'))
roi_region_medium = coords_to_polygon(get_known_roi('grand_canyon_rectangle'))
In [4]:
vis_params_l8 = default_vis_params('landsat')
In [5]:
dataset = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR') \
                  .filterDate('2016-10-01', '2016-12-31') \
                  .filterBounds(roi_region_large)

The orbit of the satellite is reflected in the image tiles: vertical image neighbours are aligned more than horizontal neighbours, because there is much less time lag between vertically consecutive images.

In [ ]:
Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 5)
Map.addLayer(dataset, vis_params_l8)
Map

image.png

Due to the time lag, horizontal neighbours can have very different weather conditions.

In [ ]:
Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 8)
Map.addLayer(dataset, vis_params_l8)
Map

image.png

Image metadata

To get a better feeling about the volume of images produced by Landsat we can pull a collection of images and extract metadata from it.

In [8]:
dataset = ee.ImageCollection('LANDSAT/LC08/C01/T1_SR') \
                  .filterDate('2016-01-01', '2016-12-31') \
                  .filterBounds(roi_region_medium)

Given those time and spatial filters the collection still has rather many images:

In [9]:
collection_len(dataset)
Out[9]:
893

Any single images comes with a lot of metadata attached:

In [10]:
this_img = extract_single_img(dataset, 0)
this_img_props_dict = geemap.image_props(this_img).getInfo()
img_props = pd.DataFrame.from_dict(this_img_props_dict, orient='index')
img_props
Out[10]:
0
CLOUD_COVER 49.26
CLOUD_COVER_LAND 49.26
EARTH_SUN_DISTANCE 0.987519
ESPA_VERSION 2_23_0_1a
GEOMETRIC_RMSE_MODEL 9.104
GEOMETRIC_RMSE_MODEL_X 7.242
GEOMETRIC_RMSE_MODEL_Y 5.517
IMAGE_DATE 2016-02-14
IMAGE_QUALITY_OLI 9
IMAGE_QUALITY_TIRS 9
LANDSAT_ID LC08_L1TP_034034_20160214_20170224_01_T1
LEVEL1_PRODUCTION_DATE 1487928420000
NOMINAL_SCALE 30
PIXEL_QA_VERSION generate_pixel_qa_1.6.0
SATELLITE LANDSAT_8
SENSING_TIME 2016-02-14T17:44:12.9346930Z
SOLAR_AZIMUTH_ANGLE 151.02919
SOLAR_ZENITH_ANGLE 55.366402
SR_APP_VERSION LaSRC_1.3.0
WRS_PATH 34
WRS_ROW 34
system:asset_size 697.355882 MB
system:band_names [B1, B2, B3, B4, B5, B6, B7, B10, B11, sr_aero...
system:id LANDSAT/LC08/C01/T1_SR/LC08_034034_20160214
system:index LC08_034034_20160214
system:time_end 2016-02-14 17:44:12
system:time_start 2016-02-14 17:44:12
system:version 1522753341537768
In [11]:
img_props.loc['system:band_names'].squeeze()
Out[11]:
['B1',
 'B2',
 'B3',
 'B4',
 'B5',
 'B6',
 'B7',
 'B10',
 'B11',
 'sr_aerosol',
 'pixel_qa',
 'radsat_qa']

We can extract the most important properties for all images:

In [12]:
list_of_props = ['CLOUD_COVER', 'CLOUD_COVER_LAND', 'WRS_PATH', 'WRS_ROW']
all_img_dates = geemap.image_dates(dataset).getInfo()
metadata = extract_img_collection_metadata(dataset, list_of_props)
metadata['date'] = all_img_dates
metadata.head(5)
Out[12]:
CLOUD_COVER CLOUD_COVER_LAND WRS_PATH WRS_ROW date
0 49.26 49.26 34.0 34.0 2016-02-14
1 9.31 9.31 34.0 34.0 2016-03-01
2 2.47 2.47 34.0 34.0 2016-03-17
3 2.99 2.99 34.0 34.0 2016-04-02
4 6.76 6.76 34.0 34.0 2016-05-04

From this we immediately can see that we get multiple images for multiple tiles of the Worldwide reference system:

In [13]:
metadata.loc[:, ['WRS_PATH', 'WRS_ROW']].drop_duplicates().head(8)
Out[13]:
WRS_PATH WRS_ROW
0 34.0 34.0
20 34.0 35.0
42 34.0 36.0
65 34.0 37.0
87 35.0 33.0
110 35.0 34.0
133 35.0 35.0
156 35.0 36.0

A given tile is revisited roughly every 2 weeks:

In [14]:
single_tile_over_time = metadata.query('WRS_PATH == 39 and WRS_ROW == 37')
single_tile_over_time
Out[14]:
CLOUD_COVER CLOUD_COVER_LAND WRS_PATH WRS_ROW date
633 74.01 74.01 39.0 37.0 2016-01-16
634 11.01 11.01 39.0 37.0 2016-02-01
635 2.33 2.33 39.0 37.0 2016-02-17
636 10.90 10.90 39.0 37.0 2016-03-04
637 2.41 2.41 39.0 37.0 2016-03-20
638 87.75 87.75 39.0 37.0 2016-04-05
639 5.50 5.50 39.0 37.0 2016-04-21
640 27.35 27.35 39.0 37.0 2016-05-07
641 0.17 0.17 39.0 37.0 2016-05-23
642 8.91 8.91 39.0 37.0 2016-06-08
643 0.00 0.00 39.0 37.0 2016-06-24
644 0.00 0.00 39.0 37.0 2016-07-10
645 1.28 1.28 39.0 37.0 2016-07-26
646 4.75 4.75 39.0 37.0 2016-08-11
647 2.08 2.08 39.0 37.0 2016-08-27
648 0.02 0.02 39.0 37.0 2016-09-12
649 3.34 3.34 39.0 37.0 2016-09-28
650 0.77 0.77 39.0 37.0 2016-10-14
651 79.11 79.11 39.0 37.0 2016-10-30
652 6.53 6.53 39.0 37.0 2016-11-15
653 40.18 40.18 39.0 37.0 2016-12-01
654 0.30 0.30 39.0 37.0 2016-12-17
In [15]:
single_tile_over_time.shape
Out[15]:
(22, 5)

Depending on the time of the year, any given tile might look pretty different for different visits of the satellite:

In [ ]:
scene = extract_single_img(dataset, 643)

Map = geemap.Map()
Map.centerObject(scene, 8)
Map.addLayer(scene, vis_params_l8, 'default RGB')

Map

image.png

In particular with weather phenomena like clouds:

In [ ]:
scene = extract_single_img(dataset, 633)

Map = geemap.Map()
Map.centerObject(scene, 8)
Map.addLayer(scene, vis_params_l8, 'default RGB')

Map

image.png

Mosaic visualization

Clouds are an obvious impediment for any visual inspection. So we can try to improve the visualization of a mosaic by only taking images with rather low ratio of clouds.

Without any special consideration of clouds, this is what we get:

In [ ]:
Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(dataset, vis_params_l8)
Map

image.png

An easy way to improve things is by taking median pixel values, not only the last observation per pixel:

In [ ]:
Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(dataset.median(), vis_params_l8)
Map

image.png

If we sort with regards to cloud coverage, we can easily visualize both extremes for cloud coverage.

In [20]:
cloud_reduced_images = dataset.sort('CLOUD_COVER', opt_ascending=True)
In [ ]:
Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(cloud_reduced_images, vis_params_l8)
Map

image.png

In [22]:
cloud_reduced_images = dataset.sort('CLOUD_COVER', opt_ascending=False)
In [ ]:
Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(cloud_reduced_images, vis_params_l8)
Map

image.png

Although this is already a huge improvement of the visualization, it still comes with the problem that individual images might be taken from totally different points in time. This might lead to neighbouring images having totally different vegetation states (summer vs winter) or even having snow vs no snow.

Masking

Masking is a way to create a boolean flag on a pixel level. So for example, for each pixel in an image we could identify whether in this pixel we either see a cloud or a shadow of a cloud. Such a mask can then be used to only show those parts of an image that fulfill a certain level of quality. In the following example we only will show only pixels that are not covered with clouds.

In [ ]:
scene = extract_single_img(dataset, 633)

Map = geemap.Map()
Map.centerObject(scene, 8)
Map.addLayer(scene, vis_params_l8, 'default RGB')

Map

image.png

In [25]:
def maskL8sr(image):
    # From: https://developers.google.com/earth-engine/datasets/catalog/LANDSAT_LC08_C01_T1_SR
    
    # Bits 3 and 5 are cloud shadow and cloud, respectively.
    cloudShadowBitMask = (1 << 3)
    cloudsBitMask = (1 << 5)
    
    # Get the pixel QA band.
    qa = image.select('pixel_qa')
    
    # Both flags should be set to zero, indicating clear conditions.
    mask = qa.bitwiseAnd(cloudShadowBitMask).eq(0) \
                 .And(qa.bitwiseAnd(cloudsBitMask).eq(0))
    
    return image.updateMask(mask)
In [ ]:
scene = maskL8sr(extract_single_img(dataset, 633))

Map = geemap.Map()
Map.centerObject(scene, 8)
Map.addLayer(scene, vis_params_l8, 'default RGB')

Map

image.png

Now we can also combine masking techniques with median computations: first we filter clouds on a per pixel basis and take the median values of the remaining pixels.

In [ ]:
Map = geemap.Map()
Map.setCenter(roi_coords[0], roi_coords[1], 6)
Map.addLayer(dataset.map(maskL8sr).median(), vis_params_l8)
Map

image.png

The source of the notebook can be found here