remote_sensing module¶

`HLS` ¶

A class to handle Harmonized Landsat Sentinel (HLS) satellite data.

This class provides functionality to search, retrieve, and process HLS data, which combines data from Landsat and Sentinel-2 satellites. It supports various data operations including masking, scaling, and metadata retrieval.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.GeoDataFrame or tuple or Shapely Geometry`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	required
`start_date`	`str`	The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.	`'2014-01-01'`
`end_date`	`str`	The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.	`'2025-09-12'`
`bands`	`list`	The bands to be used. Default is all bands.	`None`
`resolution`	`str`	The resolution of the data. Defaults to native resolution.	`None`
`crs`	`str`	The coordinate reference system. Default is 'utm'.	`'utm'`
`remove_nodata`	`bool`	Whether to remove no data values. Default is True.	`True`
`scale_data`	`bool`	Whether to scale the data. Default is True.	`True`
`add_metadata`	`bool`	Whether to add metadata to the data. Default is True.	`True`
`add_platform`	`bool`	Whether to add platform information to the data. Default is True.	`True`
`groupby`	`str`	The groupby parameter for the data. Default is "solar_day".	`'solar_day'`

Attributes:

Name	Type	Description
`data`	`xarray.Dataset`	The loaded HLS data.
`metadata`	`geopandas.GeoDataFrame`	Metadata for the retrieved HLS scenes.
`rgb`	`xarray.DataArray`	RGB composite of the HLS data.
`ndvi`	`xarray.DataArray`	Normalized Difference Vegetation Index (NDVI) calculated from the data.
`Methods`	`None`
`-------`	`None`
`search_data()`	`None`	Searches for HLS data based on the specified parameters.
`get_data()`	`None`	Retrieves the HLS data based on the search results.
`get_metadata()`	`None`	Retrieves metadata for the HLS scenes.
`get_combined_metadata()`	`None`	Retrieves and combines metadata for both Landsat and Sentinel-2 scenes.
`remove_nodata_inplace()`	`None`	Removes no data values from the data.
`mask_data()`	`None`	Masks the data based on the Fmask quality layer.
`scale_data_inplace()`	`None`	Scales the data to reflectance values.
`add_platform_inplace()`	`None`	Adds platform information to the data as coordinates.
`get_rgb()`	`None`	Retrieves the RGB composite of the data.
`get_ndvi()`	`None`	Calculates the Normalized Difference Vegetation Index (NDVI).

Source code in easysnowdata/remote_sensing.py

class HLS:
    """
    A class to handle Harmonized Landsat Sentinel (HLS) satellite data.

    This class provides functionality to search, retrieve, and process HLS data, which combines
    data from Landsat and Sentinel-2 satellites. It supports various data operations including
    masking, scaling, and metadata retrieval.

    Parameters
    ----------
    bbox_input : geopandas.GeoDataFrame or tuple or Shapely Geometry
        GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
    start_date : str, optional
        The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.
    end_date : str, optional
        The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
    bands : list, optional
        The bands to be used. Default is all bands.
    resolution : str, optional
        The resolution of the data. Defaults to native resolution.
    crs : str, optional
        The coordinate reference system. Default is 'utm'.
    remove_nodata : bool, optional
        Whether to remove no data values. Default is True.
    scale_data : bool, optional
        Whether to scale the data. Default is True.
    add_metadata : bool, optional
        Whether to add metadata to the data. Default is True.
    add_platform : bool, optional
        Whether to add platform information to the data. Default is True.
    groupby : str, optional
        The groupby parameter for the data. Default is "solar_day".

    Attributes
    ----------
    data : xarray.Dataset
        The loaded HLS data.
    metadata : geopandas.GeoDataFrame
        Metadata for the retrieved HLS scenes.
    rgb : xarray.DataArray
        RGB composite of the HLS data.
    ndvi : xarray.DataArray
        Normalized Difference Vegetation Index (NDVI) calculated from the data.

    Methods
    -------
    search_data()
        Searches for HLS data based on the specified parameters.
    get_data()
        Retrieves the HLS data based on the search results.
    get_metadata()
        Retrieves metadata for the HLS scenes.
    get_combined_metadata()
        Retrieves and combines metadata for both Landsat and Sentinel-2 scenes.
    remove_nodata_inplace()
        Removes no data values from the data.
    mask_data()
        Masks the data based on the Fmask quality layer.
    scale_data_inplace()
        Scales the data to reflectance values.
    add_platform_inplace()
        Adds platform information to the data as coordinates.
    get_rgb()
        Retrieves the RGB composite of the data.
    get_ndvi()
        Calculates the Normalized Difference Vegetation Index (NDVI).
    """

    # https://lpdaac.usgs.gov/documents/1698/HLS_User_Guide_V2.pdf
    # https://lpdaac.usgs.gov/documents/842/HLS_Tutorial.html

    def __init__(
        self,
        bbox_input,
        start_date="2014-01-01",
        end_date=datetime.datetime.now().strftime("%Y-%m-%d"),
        bands=None,
        resolution=None,
        crs="utm",
        remove_nodata=True,
        scale_data=True,
        add_metadata=True,
        add_platform=True,
        groupby="solar_day",
    ):  #'ProducerGranuleId'
        """
        The constructor for the HLS class.

        Parameters:
            bbox_input (geopandas.GeoDataFrame or tuple or Shapely Geometry): GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
            start_date (str): The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.
            end_date (str): The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
            bands (list): The bands to be used. Default is all bands. Must include SCL for data masking. Each band should be a string like 'B01', 'B02', etc.
            resolution (str): The resolution of the data. Defaults to native resolution, 10m.
            crs (str): The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.
            groupby (str): The groupby parameter for the data. Default is "solar_day".
        """
        # Initialize the attributes
        self.bbox_input = bbox_input
        self.start_date = start_date
        self.end_date = end_date
        self.bands = bands
        self.resolution = resolution
        self.crs = crs
        self.remove_nodata = remove_nodata
        self.scale_data = scale_data
        self.add_metadata = add_metadata
        self.add_platform = add_platform
        self.groupby = groupby

        self.bbox_gdf = convert_bbox_to_geodataframe(self.bbox_input)

        if self.crs == None:
            self.crs = self.bbox_gdf.estimate_utm_crs()

        # Define the band information
        self.band_info = { # https://github.com/stac-extensions/eo#common-band-names
            "coastal": {
                "landsat_band": "B01",
                "sentinel_band": "B01",
                "description": "430-450 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "blue": {
                "landsat_band": "B02",
                "sentinel_band": "B02",
                "description": "450-510 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "green": {
                "landsat_band": "B03",
                "sentinel_band": "B03",
                "description": "530-590 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "red": {
                "landsat_band": "B04",
                "sentinel_band": "B04",
                "description": "640-670 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "rededge071": {
                "landsat_band": "-",
                "sentinel_band": "B05",
                "description": "690-710 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "rededge075": {
                "landsat_band": "-",
                "sentinel_band": "B06",
                "description": "730-750 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "rededge078": {
                "landsat_band": "-",
                "sentinel_band": "B07",
                "description": "770-790 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "nir": {
                "landsat_band": "-",
                "sentinel_band": "B08",
                "description": "780-880 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "nir08": {
                "landsat_band": "B05",
                "sentinel_band": "B8A",
                "description": "850-880 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "swir16": {
                "landsat_band": "B06",
                "sentinel_band": "B11",
                "description": "1570-1650 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "swir22": {
                "landsat_band": "B07",
                "sentinel_band": "B12",
                "description": "2110-2290 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "water vapor": {
                "landsat_band": "-",
                "sentinel_band": "B09",
                "description": "930-950 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "cirrus": {
                "landsat_band": "B09",
                "sentinel_band": "B10",
                "description": "1360-1380 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "lwir11": {
                "landsat_band": "B10",
                "sentinel_band": "-",
                "description": "10600-11190 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "lwir12": {
                "landsat_band": "B11",
                "sentinel_band": "-",
                "description": "11500-12510 nm",
                "data_type": "int16",
                "nodata": "-9999",
                "scale": "0.0001",
            },
            "Fmask": {
                "landsat_band": "Fmask",
                "sentinel_band": "Fmask",
                "description": "quality bits",
                "data_type": "uint8",
                "nodata": "255",
                "scale": "1",
            },
            "SZA": {
                "landsat_band": "SZA",
                "sentinel_band": "SZA",
                "description": "Sun zenith degrees",
                "data_type": "uint16",
                "nodata": "40000",
                "scale": "0.01",
            },
            "SAA": {
                "landsat_band": "SAA",
                "sentinel_band": "SAA",
                "description": "Sun azimuth degrees",
                "data_type": "uint16",
                "nodata": "40000",
                "scale": "0.01",
            },
            "VZA": {
                "landsat_band": "VZA",
                "sentinel_band": "VZA",
                "description": "View zenith degrees",
                "data_type": "uint16",
                "nodata": "40000",
                "scale": "0.01",
            },
            "VAA": {
                "landsat_band": "VAA",
                "sentinel_band": "VAA",
                "description": "View azimuth degrees",
                "data_type": "uint16",
                "nodata": "40000",
                "scale": "0.01",
            },
        }

        self.Fmask_mask_info = {
            0: {"name": "Cirrus", "bit number": "0"},
            1: {"name": "Cloud", "bit number": "1"},
            2: {"name": "Adjacent to cloud / shadow", "bit number": "2"},
            3: {"name": "Cloud shadows", "bit number": "3"},
            4: {"name": "Snow / ice", "bit number": "4"},
            5: {"name": "Water", "bit number": "5"},
            6: {
                "name": "Aerosol level (00:climatology aersol,01:low aerosol,10:moderate aerosol, 11:high aerosol)",
                "bit number": "6-7",
            },
        }

        # Initialize the data attributes
        self.search = None
        self.data = None
        self.metadata = None

        self.rgb = None
        self.ndvi = None
        self.ndsi = None
        self.ndwi = None
        self.evi = None
        self.ndbi = None

        self.search_data()
        self.get_data()
        if self.remove_nodata:
            self.remove_nodata_inplace()
        if self.scale_data:
            self.scale_data_inplace()
        if self.add_metadata:
            self.get_combined_metadata()
        if self.add_platform:
            self.add_platform_inplace()

    def search_data(self):
        """
        The method to search the data.
        """

        catalog = pystac_client.Client.open(
            "https://cmr.earthdata.nasa.gov/stac/LPCLOUD"
        )

        # Search for items within the specified bbox and date range
        landsat_search = catalog.search(
            collections=["HLSL30_2.0"],
            bbox=self.bbox_gdf.total_bounds,
            datetime=(self.start_date, self.end_date),
        )
        sentinel_search = catalog.search(
            collections=["HLSS30_2.0"],
            bbox=self.bbox_gdf.total_bounds,
            datetime=(self.start_date, self.end_date),
        )

        self.search_landsat = landsat_search
        self.search_sentinel = sentinel_search
        print(
            f"Data searched. Access the returned seach with the .search_landsat or .search_sentinel attribute."
        )

    def get_data(self):
        """
        The method to get the data.
        """
        # Prepare the parameters for odc.stac.load
        load_params_landsat = {
            "items": self.search_landsat.item_collection(),
            "bbox": self.bbox_gdf.total_bounds,
            "chunks": {"time": 1, "x": 512, "y": 512},
            "crs": self.crs,  # maybe put 'utm'?
            "groupby": self.groupby,
            "fail_on_error": False,
            "stac_cfg": get_stac_cfg(sensor="HLSL30_2.0"),
        }
        if self.bands:
            load_params_landsat["bands"] = self.bands
        else:
            load_params_landsat["bands"] = [
                band
                for band, info in self.band_info.items()
                if info["landsat_band"] != "-"
            ]
        if self.resolution:
            load_params_landsat["resolution"] = self.resolution
        else:
            load_params_landsat["resolution"] = 30

        L30_ds = odc.stac.load(**load_params_landsat)

        load_params_sentinel = {
            "items": self.search_sentinel.item_collection(),
            "bbox": self.bbox_gdf.total_bounds,
            "chunks": {"time": 1, "x": 512, "y": 512},
            "crs": self.crs,
            "groupby": self.groupby,
            "fail_on_error": False,
            "stac_cfg": get_stac_cfg(sensor="HLSS30_2.0"),
        }
        if self.bands:
            load_params_sentinel["bands"] = self.bands
        else:
            load_params_sentinel["bands"] = [
                band
                for band, info in self.band_info.items()
                if info["sentinel_band"] != "-"
            ]
        if self.resolution:
            load_params_sentinel["resolution"] = self.resolution
        else:
            load_params_sentinel["resolution"] = 30

        S30_ds = odc.stac.load(**load_params_sentinel)

        # Load the data lazily using odc.stac
        self.data = xr.concat((L30_ds, S30_ds), dim="time", fill_value=-9999).sortby(
            "time"
        )

        self.data.attrs["band_info"] = self.band_info
        # self.data.attrs['scl_class_info'] = self.scl_class_info

        # if 'scl' in self.data.variables:
        #    self.data.scl.attrs['scl_class_info'] = self.scl_class_info

        print(
            f"Data retrieved. Access with the .data attribute. Data CRS: {self.bbox_gdf.estimate_utm_crs().name}."
        )

    def remove_nodata_inplace(self):
        """
        The method to remove no data values from the data.
        """
        data_removed=False
        for band in self.data.data_vars:
            nodata_value = self.data[band].attrs.get("nodata")
            if nodata_value is not None:
                self.data[band] = self.data[band].where(self.data[band] != nodata_value)
                data_removed=True
        if data_removed:
            print(f"Nodata values removed from the data. In doing so, all bands converted to float32. To turn this behavior off, set remove_nodata=False.")
        else:
            print(f"Tried to remove nodata values and set them to nans, but no nodata values found in the data.")


    def mask_data(
        self,
        remove_cirrus=True,
        remove_cloud=True,
        remove_adj_to_cloud=True,
        remove_cloud_shadows=True,
        remove_snow_ice=False,
        remove_water=False,
        remove_aerosol_low=False,
        remove_aerosol_moderate=False,
        remove_aerosol_high=False,
        remove_aerosol_climatology=False,
    ):
        """
        The method to mask the data using Fmask.

        Parameters:
            remove_cirrus (bool): Whether to remove cirrus pixels.
            remove_cloud (bool): Whether to remove cloud pixels.
            remove_adj_to_cloud (bool): Whether to remove pixels adjacent to clouds.
            remove_cloud_shadows (bool): Whether to remove cloud shadow pixels.
            remove_snow_ice (bool): Whether to remove snow and ice pixels.
            remove_water (bool): Whether to remove water pixels.
            remove_aerosol_low (bool): Whether to remove low aerosol pixels.
            remove_aerosol_moderate (bool): Whether to remove moderate aerosol pixels.
            remove_aerosol_high (bool): Whether to remove high aerosol pixels.
            remove_aerosol_climatology (bool): Whether to remove climatology aerosol pixels.


        """

        # Get value of QC bit based on location
        def get_qc_bit(ar, bit):
            # taken from Helen's fantastic repo https://github.com/UW-GDA/mekong-water-quality/blob/main/02_pull_hls.ipynb
            return (ar // (2**bit)) - ((ar // (2**bit)) // 2 * 2)

        mask = xr.DataArray()
        # Mask the data based on the Fmask values
        mask_list = []
        if remove_cirrus:
            mask_list.append(0)
        if remove_cloud:
            mask_list.append(1)
        if remove_adj_to_cloud:
            mask_list.append(2)
        if remove_cloud_shadows:
            mask_list.append(3)
        if remove_snow_ice:
            mask_list.append(4)
        if remove_water:
            mask_list.append(5)
        if (
            remove_aerosol_climatology
            | remove_aerosol_low
            | remove_aerosol_moderate
            | remove_aerosol_high
        ):
            mask_list.append(6)
            aerosol_mask = (
                get_qc_bit(self.data["Fmask"], 6)
                .astype(str)
                .str.cat(get_qc_bit(self.data["Fmask"], 7).astype(str))
            )
            if remove_aerosol_climatology:
                mask = xr.concat([mask, aerosol_mask == "00"], dim="masks")
            if remove_aerosol_low:
                mask = xr.concat([mask, aerosol_mask == "01"], dim="masks")
            if remove_aerosol_moderate:
                mask = xr.concat([mask, aerosol_mask == "10"], dim="masks")
            if remove_aerosol_high:
                mask = xr.concat([mask, aerosol_mask == "11"], dim="masks")

        for val in mask_list:
            if val != 6:
                mask = xr.concat(
                    [mask, get_qc_bit(self.data["Fmask"], val)], dim="masks"
                )

        mask = mask.sum(dim="masks")
        self.data = self.data.where(mask == 0)

        print(
            f"WARNING: The cloud masking is pretty bad over snow and ice. Use with caution."
        )
        print(f"Data masked. Using Fmask, removed pixels classified as:")
        for val in mask_list:
            print(self.Fmask_mask_info[val]["name"])

    def get_metadata(self, item_collection):

        HLS_metadata = gpd.GeoDataFrame.from_features(
            item_collection.to_dict(transform_hrefs=True), "EPSG:4326"
        )
        HLS_metadata = HLS_metadata.drop(
            columns=["start_datetime", "end_datetime"], inplace=False
        )
        HLS_metadata["datetime"] = pd.to_datetime(HLS_metadata["datetime"], utc=True)

        series_list = []
        for item in item_collection:
            url = item.assets["metadata"].href
            series = HLS_xml_url_to_metadata_df(url)
            series_list.append(series)

        extra_attributes = pd.DataFrame(series_list)
        extra_attributes["Temporal"] = pd.to_datetime(extra_attributes["Temporal"])
        extra_attributes["Platform"] = extra_attributes["Platform"].str.title()

        metadata_gdf = gpd.GeoDataFrame(
            pd.merge_asof(
                HLS_metadata,
                extra_attributes,
                left_on="datetime",
                right_on="Temporal",
                direction="nearest",
                tolerance=pd.Timedelta("100ms"),
            )
        ).drop(columns="Temporal")
        metadata_gdf = metadata_gdf[
            [
                "datetime",
                "ProducerGranuleId",
                "Platform",
                "eo:cloud_cover",
                "AssociatedBrowseImageUrls",
                "geometry",
            ]
        ]

        return metadata_gdf

    def get_combined_metadata(self):
        L30_metadata = self.get_metadata(self.search_landsat.item_collection())
        S30_metadata = self.get_metadata(self.search_sentinel.item_collection())
        combined_metadata_gdf = (
            pd.concat([L30_metadata, S30_metadata])
            .sort_values("datetime")
            .reset_index(drop=True)
        )

        self.metadata = combined_metadata_gdf
        print(
            f"Metadata retrieved. Access with the .metadata attribute. To turn this behavior off, set add_metadata=False."
        )

    def add_platform_inplace(self):
        temp_grouped_metadata = self.metadata
        temp_grouped_metadata["cluster"] = (
            temp_grouped_metadata["datetime"].diff().dt.total_seconds().gt(60).cumsum()
        )

        grouped_metadata = pd.DataFrame()
        grouped_metadata["datetime"] = temp_grouped_metadata.groupby("cluster")[
            "datetime"
        ].apply(np.mean)
        grouped_metadata["Platforms"] = temp_grouped_metadata.groupby("cluster")[
            "Platform"
        ].apply(np.unique)
        grouped_metadata["eo:cloud_cover_avg"] = (
            temp_grouped_metadata.groupby("cluster")["eo:cloud_cover"]
            .apply(np.mean)
            .astype(int)
        )
        grouped_metadata["BrowseUrls"] = self.metadata.groupby("cluster")[
            "AssociatedBrowseImageUrls"
        ].apply(list)
        grouped_metadata["geometry"] = (
            temp_grouped_metadata.groupby("cluster")["geometry"]
            .apply(list)
            .apply(shapely.geometry.MultiPolygon)
        )
        grouped_metadata_gdf = gpd.GeoDataFrame(grouped_metadata).sort_values(
            "datetime"
        )

        self.data = self.data.assign_coords(
            {
                "platform": (
                    "time",
                    [item[0] for item in grouped_metadata_gdf["Platforms"].values],
                )
            }
        )
        self.data = self.data.assign_coords(
            {
                "eo:cloud_cover_avg": (
                    "time",
                    grouped_metadata_gdf["eo:cloud_cover_avg"].values,
                )
            }
        )
        self.data = self.data.assign_coords(
            {
                "AssociatedBrowseImageUrls": (
                    "time",
                    grouped_metadata_gdf["BrowseUrls"].values,
                )
            }
        )
        self.data = self.data.assign_coords(
            {"geometry": ("time", grouped_metadata_gdf["geometry"].values)}
        )

        print(
            f"Platform, geometry, cloud cover, browse URLs added to data as coordinates. Access with the .data attribute. To turn this behavior off, set add_platform=False."
        )

    def scale_data_inplace(self):
        """
        The method to scale the data.
        """

        # Define a function to scale a data variable
        def scale_var(x):
            band = x.name
            if band in self.data.band_info:
                scale_factor_dict = self.data.band_info[band]
                # Extract the actual scale factor from the dictionary and convert it to a float
                scale_factor = float(scale_factor_dict["scale"])
                return x * scale_factor
            else:
                return x

        # Apply the function to each data variable in the Dataset
        self.data = self.data.apply(scale_var, keep_attrs=True)
        print(
            f"Data scaled to reflectance. Access with the .data attribute. To turn this behavior off, set scale_data=False."
        )

    # def get_rgb(self):
    #     """
    #     The method to get the RGB data.

    #     Returns:
    #         xarray.DataArray: The RGB data.
    #     """
    #     # Convert the red, green, and blue bands to an RGB DataArray
    #     rgb_da = self.data[["red", "green", "blue"]].to_dataarray(dim="band")
    #     self.rgb = rgb_da

    #     print(f"RGB data retrieved. Access with the .rgb attribute.")


    def get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None}):
        """
        Retrieve RGB data with optional percentile-based contrast stretching and CLAHE enhancement.

        This method calculates and stores three versions of RGB data: raw, percentile-stretched, and CLAHE-enhanced.

        Parameters
        ----------
        percentile_kwargs : dict, optional
            Parameters for percentile-based contrast stretching. Keys are:
            - 'lower': Lower percentile for contrast stretching (default: 2)
            - 'upper': Upper percentile for contrast stretching (default: 98)
        clahe_kwargs : dict, optional
            Parameters for CLAHE enhancement. Keys are:
            - 'clip_limit': Clipping limit for CLAHE (default: 0.03)
            - 'nbins': Number of bins for CLAHE histogram (default: 256)
            - 'kernel_size': Size of kernel for CLAHE (default: None)

        Returns
        -------
        None
            The method stores results in instance attributes.

        Notes
        -----
        Results are stored in the following attributes:
        - .rgb: Raw RGB data
        - .rgb_percentile: Percentile-stretched RGB data
        - .rgb_clahe: CLAHE-enhanced RGB data
        """

        rgba_da = self.data.odc.to_rgba(bands=('red','green','blue'),vmin=-0.30, vmax=1.35)
        self.rgba = rgba_da

        rgb_da = rgba_da.isel(band=slice(0, 3)) # .where(self.data.scl>=0, other=255) if we want to make no data white
        self.rgb = rgb_da

        self.rgb_percentile = self.get_rgb_percentile(**percentile_kwargs)
        self.rgb_clahe = self.get_rgb_clahe(**clahe_kwargs)

        print(f"RGB data retrieved.\nAccess with the following attributes:\n.rgb for raw RGB,\n.rgba for RGBA,\n.rgb_percentile for percentile RGB,\n.rgb_clahe for CLAHE RGB.\nYou can pass in percentile_kwargs and clahe_kwargs to adjust RGB calculations, check documentation for options.")

    def get_rgb_percentile(self, **percentile_kwargs):
        """
        Apply percentile-based contrast stretching to the RGB bands of the Sentinel-2 data.

        This function creates a new DataArray with the contrast-stretched RGB bands.

        Parameters
        ----------
        **kwargs : dict
            Keyword arguments for percentile calculation. Supported keys:
            - 'lower': Lower percentile for contrast stretching (default: 2)
            - 'upper': Upper percentile for contrast stretching (default: 98)

        Returns
        -------
        xarray.DataArray
            RGB data with percentile-based contrast stretching applied.

        Notes
        -----
        The function clips values to the range [0, 1] and masks areas where SCL < 0.
        """
        lower_percentile = percentile_kwargs.get('lower', 2)
        upper_percentile = percentile_kwargs.get('upper', 98)

        def stretch_percentile(da):
            p_low, p_high = np.nanpercentile(da.values, [lower_percentile, upper_percentile])
            return (da - p_low) / (p_high - p_low)

        rgb_da = self.rgb.where(self.rgba.isel(band=-1)==255)

        template = xr.zeros_like(rgb_da)
        rgb_percentile_da = xr.map_blocks(stretch_percentile, rgb_da, template=template)
        rgb_percentile_da = rgb_percentile_da.clip(0, 1)#.where(self.data.scl>=0)

        return rgb_percentile_da

    def get_rgb_clahe(self, **kwargs):
        """
        Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to RGB bands.

        This function creates a new DataArray with CLAHE applied to the RGB bands.

        Parameters
        ----------
        **kwargs : dict
            Keyword arguments for CLAHE. Supported keys:
            - 'clip_limit': Clipping limit for CLAHE (default: 0.03)
            - 'nbins': Number of bins for CLAHE histogram (default: 256)
            - 'kernel_size': Size of kernel for CLAHE (default: None)

        Returns
        -------
        xarray.DataArray
            RGB data with CLAHE enhancement applied.

        Notes
        -----
        The function applies CLAHE to each band separately and masks areas where SCL < 0.
        https://scikit-image.org/docs/stable/api/skimage.exposure.html#skimage.exposure.equalize_adapthist
        """

        # Custom wrapper to preserve xarray metadata
        def equalize_adapthist_da(da, **kwargs):
            # Apply the CLAHE function from skimage
            result = skimage.exposure.equalize_adapthist(da.values, **kwargs)
            #new_coords = {k: v for k, v in da.coords.items() if k != 'band' or len(v) == 3}

            # Convert the result back to a DataArray, preserving the original metadata
            return xr.DataArray(result, dims=da.dims, coords=da.coords, attrs=da.attrs)

        rgb_da = self.rgb

        #template = rgb_da.copy(data=np.empty_like(rgb_da).data)
        template = xr.zeros_like(rgb_da)
        rgb_clahe_da = xr.map_blocks(equalize_adapthist_da, rgb_da, template=template, kwargs=kwargs)
        rgb_clahe_da = rgb_clahe_da.where(self.rgba.isel(band=-1)==255)#.where(self.data.scl>=0)

        return rgb_clahe_da

    # Indicies

    def get_ndvi(self):
        """
        The method to get the NDVI data.

        Returns:
            ndvi_da (xarray.DataArray): The NDVI data.
        """
        red = self.data.red
        nir = self.data.nir
        ndvi_da = (nir - red) / (nir + red)

        self.ndvi = ndvi_da

        print(f"NDVI data calculated. Access with the .ndvi attribute.")

`init(self, bbox_input, start_date='2014-01-01', end_date='2025-09-12', bands=None, resolution=None, crs='utm', remove_nodata=True, scale_data=True, add_metadata=True, add_platform=True, groupby='solar_day')` `special` ¶

The constructor for the HLS class.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.GeoDataFrame or tuple or Shapely Geometry`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	required
`start_date`	`str`	The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.	`'2014-01-01'`
`end_date`	`str`	The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.	`'2025-09-12'`
`bands`	`list`	The bands to be used. Default is all bands. Must include SCL for data masking. Each band should be a string like 'B01', 'B02', etc.	`None`
`resolution`	`str`	The resolution of the data. Defaults to native resolution, 10m.	`None`
`crs`	`str`	The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.	`'utm'`
`groupby`	`str`	The groupby parameter for the data. Default is "solar_day".	`'solar_day'`

Source code in easysnowdata/remote_sensing.py

def __init__(
    self,
    bbox_input,
    start_date="2014-01-01",
    end_date=datetime.datetime.now().strftime("%Y-%m-%d"),
    bands=None,
    resolution=None,
    crs="utm",
    remove_nodata=True,
    scale_data=True,
    add_metadata=True,
    add_platform=True,
    groupby="solar_day",
):  #'ProducerGranuleId'
    """
    The constructor for the HLS class.

    Parameters:
        bbox_input (geopandas.GeoDataFrame or tuple or Shapely Geometry): GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
        start_date (str): The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.
        end_date (str): The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
        bands (list): The bands to be used. Default is all bands. Must include SCL for data masking. Each band should be a string like 'B01', 'B02', etc.
        resolution (str): The resolution of the data. Defaults to native resolution, 10m.
        crs (str): The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.
        groupby (str): The groupby parameter for the data. Default is "solar_day".
    """
    # Initialize the attributes
    self.bbox_input = bbox_input
    self.start_date = start_date
    self.end_date = end_date
    self.bands = bands
    self.resolution = resolution
    self.crs = crs
    self.remove_nodata = remove_nodata
    self.scale_data = scale_data
    self.add_metadata = add_metadata
    self.add_platform = add_platform
    self.groupby = groupby

    self.bbox_gdf = convert_bbox_to_geodataframe(self.bbox_input)

    if self.crs == None:
        self.crs = self.bbox_gdf.estimate_utm_crs()

    # Define the band information
    self.band_info = { # https://github.com/stac-extensions/eo#common-band-names
        "coastal": {
            "landsat_band": "B01",
            "sentinel_band": "B01",
            "description": "430-450 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "blue": {
            "landsat_band": "B02",
            "sentinel_band": "B02",
            "description": "450-510 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "green": {
            "landsat_band": "B03",
            "sentinel_band": "B03",
            "description": "530-590 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "red": {
            "landsat_band": "B04",
            "sentinel_band": "B04",
            "description": "640-670 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "rededge071": {
            "landsat_band": "-",
            "sentinel_band": "B05",
            "description": "690-710 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "rededge075": {
            "landsat_band": "-",
            "sentinel_band": "B06",
            "description": "730-750 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "rededge078": {
            "landsat_band": "-",
            "sentinel_band": "B07",
            "description": "770-790 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "nir": {
            "landsat_band": "-",
            "sentinel_band": "B08",
            "description": "780-880 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "nir08": {
            "landsat_band": "B05",
            "sentinel_band": "B8A",
            "description": "850-880 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "swir16": {
            "landsat_band": "B06",
            "sentinel_band": "B11",
            "description": "1570-1650 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "swir22": {
            "landsat_band": "B07",
            "sentinel_band": "B12",
            "description": "2110-2290 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "water vapor": {
            "landsat_band": "-",
            "sentinel_band": "B09",
            "description": "930-950 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "cirrus": {
            "landsat_band": "B09",
            "sentinel_band": "B10",
            "description": "1360-1380 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "lwir11": {
            "landsat_band": "B10",
            "sentinel_band": "-",
            "description": "10600-11190 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "lwir12": {
            "landsat_band": "B11",
            "sentinel_band": "-",
            "description": "11500-12510 nm",
            "data_type": "int16",
            "nodata": "-9999",
            "scale": "0.0001",
        },
        "Fmask": {
            "landsat_band": "Fmask",
            "sentinel_band": "Fmask",
            "description": "quality bits",
            "data_type": "uint8",
            "nodata": "255",
            "scale": "1",
        },
        "SZA": {
            "landsat_band": "SZA",
            "sentinel_band": "SZA",
            "description": "Sun zenith degrees",
            "data_type": "uint16",
            "nodata": "40000",
            "scale": "0.01",
        },
        "SAA": {
            "landsat_band": "SAA",
            "sentinel_band": "SAA",
            "description": "Sun azimuth degrees",
            "data_type": "uint16",
            "nodata": "40000",
            "scale": "0.01",
        },
        "VZA": {
            "landsat_band": "VZA",
            "sentinel_band": "VZA",
            "description": "View zenith degrees",
            "data_type": "uint16",
            "nodata": "40000",
            "scale": "0.01",
        },
        "VAA": {
            "landsat_band": "VAA",
            "sentinel_band": "VAA",
            "description": "View azimuth degrees",
            "data_type": "uint16",
            "nodata": "40000",
            "scale": "0.01",
        },
    }

    self.Fmask_mask_info = {
        0: {"name": "Cirrus", "bit number": "0"},
        1: {"name": "Cloud", "bit number": "1"},
        2: {"name": "Adjacent to cloud / shadow", "bit number": "2"},
        3: {"name": "Cloud shadows", "bit number": "3"},
        4: {"name": "Snow / ice", "bit number": "4"},
        5: {"name": "Water", "bit number": "5"},
        6: {
            "name": "Aerosol level (00:climatology aersol,01:low aerosol,10:moderate aerosol, 11:high aerosol)",
            "bit number": "6-7",
        },
    }

    # Initialize the data attributes
    self.search = None
    self.data = None
    self.metadata = None

    self.rgb = None
    self.ndvi = None
    self.ndsi = None
    self.ndwi = None
    self.evi = None
    self.ndbi = None

    self.search_data()
    self.get_data()
    if self.remove_nodata:
        self.remove_nodata_inplace()
    if self.scale_data:
        self.scale_data_inplace()
    if self.add_metadata:
        self.get_combined_metadata()
    if self.add_platform:
        self.add_platform_inplace()

`get_data(self)` ¶

The method to get the data.

Source code in easysnowdata/remote_sensing.py

def get_data(self):
    """
    The method to get the data.
    """
    # Prepare the parameters for odc.stac.load
    load_params_landsat = {
        "items": self.search_landsat.item_collection(),
        "bbox": self.bbox_gdf.total_bounds,
        "chunks": {"time": 1, "x": 512, "y": 512},
        "crs": self.crs,  # maybe put 'utm'?
        "groupby": self.groupby,
        "fail_on_error": False,
        "stac_cfg": get_stac_cfg(sensor="HLSL30_2.0"),
    }
    if self.bands:
        load_params_landsat["bands"] = self.bands
    else:
        load_params_landsat["bands"] = [
            band
            for band, info in self.band_info.items()
            if info["landsat_band"] != "-"
        ]
    if self.resolution:
        load_params_landsat["resolution"] = self.resolution
    else:
        load_params_landsat["resolution"] = 30

    L30_ds = odc.stac.load(**load_params_landsat)

    load_params_sentinel = {
        "items": self.search_sentinel.item_collection(),
        "bbox": self.bbox_gdf.total_bounds,
        "chunks": {"time": 1, "x": 512, "y": 512},
        "crs": self.crs,
        "groupby": self.groupby,
        "fail_on_error": False,
        "stac_cfg": get_stac_cfg(sensor="HLSS30_2.0"),
    }
    if self.bands:
        load_params_sentinel["bands"] = self.bands
    else:
        load_params_sentinel["bands"] = [
            band
            for band, info in self.band_info.items()
            if info["sentinel_band"] != "-"
        ]
    if self.resolution:
        load_params_sentinel["resolution"] = self.resolution
    else:
        load_params_sentinel["resolution"] = 30

    S30_ds = odc.stac.load(**load_params_sentinel)

    # Load the data lazily using odc.stac
    self.data = xr.concat((L30_ds, S30_ds), dim="time", fill_value=-9999).sortby(
        "time"
    )

    self.data.attrs["band_info"] = self.band_info
    # self.data.attrs['scl_class_info'] = self.scl_class_info

    # if 'scl' in self.data.variables:
    #    self.data.scl.attrs['scl_class_info'] = self.scl_class_info

    print(
        f"Data retrieved. Access with the .data attribute. Data CRS: {self.bbox_gdf.estimate_utm_crs().name}."
    )

`get_ndvi(self)` ¶

The method to get the NDVI data.

Returns:

Type	Description
	ndvi_da (xarray.DataArray): The NDVI data.

Source code in easysnowdata/remote_sensing.py

def get_ndvi(self):
    """
    The method to get the NDVI data.

    Returns:
        ndvi_da (xarray.DataArray): The NDVI data.
    """
    red = self.data.red
    nir = self.data.nir
    ndvi_da = (nir - red) / (nir + red)

    self.ndvi = ndvi_da

    print(f"NDVI data calculated. Access with the .ndvi attribute.")

`get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None})` ¶

Retrieve RGB data with optional percentile-based contrast stretching and CLAHE enhancement.

This method calculates and stores three versions of RGB data: raw, percentile-stretched, and CLAHE-enhanced.

Parameters:

Name	Type	Description	Default
`percentile_kwargs`	`dict`	Parameters for percentile-based contrast stretching. Keys are: - 'lower': Lower percentile for contrast stretching (default: 2) - 'upper': Upper percentile for contrast stretching (default: 98)	`{'lower': 2, 'upper': 98}`
`clahe_kwargs`	`dict`	Parameters for CLAHE enhancement. Keys are: - 'clip_limit': Clipping limit for CLAHE (default: 0.03) - 'nbins': Number of bins for CLAHE histogram (default: 256) - 'kernel_size': Size of kernel for CLAHE (default: None)	`{'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None}`

Returns:

Type	Description
`None`	The method stores results in instance attributes.

Source code in easysnowdata/remote_sensing.py

def get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None}):
    """
    Retrieve RGB data with optional percentile-based contrast stretching and CLAHE enhancement.

    This method calculates and stores three versions of RGB data: raw, percentile-stretched, and CLAHE-enhanced.

    Parameters
    ----------
    percentile_kwargs : dict, optional
        Parameters for percentile-based contrast stretching. Keys are:
        - 'lower': Lower percentile for contrast stretching (default: 2)
        - 'upper': Upper percentile for contrast stretching (default: 98)
    clahe_kwargs : dict, optional
        Parameters for CLAHE enhancement. Keys are:
        - 'clip_limit': Clipping limit for CLAHE (default: 0.03)
        - 'nbins': Number of bins for CLAHE histogram (default: 256)
        - 'kernel_size': Size of kernel for CLAHE (default: None)

    Returns
    -------
    None
        The method stores results in instance attributes.

    Notes
    -----
    Results are stored in the following attributes:
    - .rgb: Raw RGB data
    - .rgb_percentile: Percentile-stretched RGB data
    - .rgb_clahe: CLAHE-enhanced RGB data
    """

    rgba_da = self.data.odc.to_rgba(bands=('red','green','blue'),vmin=-0.30, vmax=1.35)
    self.rgba = rgba_da

    rgb_da = rgba_da.isel(band=slice(0, 3)) # .where(self.data.scl>=0, other=255) if we want to make no data white
    self.rgb = rgb_da

    self.rgb_percentile = self.get_rgb_percentile(**percentile_kwargs)
    self.rgb_clahe = self.get_rgb_clahe(**clahe_kwargs)

    print(f"RGB data retrieved.\nAccess with the following attributes:\n.rgb for raw RGB,\n.rgba for RGBA,\n.rgb_percentile for percentile RGB,\n.rgb_clahe for CLAHE RGB.\nYou can pass in percentile_kwargs and clahe_kwargs to adjust RGB calculations, check documentation for options.")

`get_rgb_clahe(self, **kwargs)` ¶

Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to RGB bands.

This function creates a new DataArray with CLAHE applied to the RGB bands.

Parameters:

Name	Type	Description	Default
`**kwargs`	`dict`	Keyword arguments for CLAHE. Supported keys: - 'clip_limit': Clipping limit for CLAHE (default: 0.03) - 'nbins': Number of bins for CLAHE histogram (default: 256) - 'kernel_size': Size of kernel for CLAHE (default: None)	`{}`

Returns:

Type	Description
`xarray.DataArray`	RGB data with CLAHE enhancement applied.

Source code in easysnowdata/remote_sensing.py

def get_rgb_clahe(self, **kwargs):
    """
    Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to RGB bands.

    This function creates a new DataArray with CLAHE applied to the RGB bands.

    Parameters
    ----------
    **kwargs : dict
        Keyword arguments for CLAHE. Supported keys:
        - 'clip_limit': Clipping limit for CLAHE (default: 0.03)
        - 'nbins': Number of bins for CLAHE histogram (default: 256)
        - 'kernel_size': Size of kernel for CLAHE (default: None)

    Returns
    -------
    xarray.DataArray
        RGB data with CLAHE enhancement applied.

    Notes
    -----
    The function applies CLAHE to each band separately and masks areas where SCL < 0.
    https://scikit-image.org/docs/stable/api/skimage.exposure.html#skimage.exposure.equalize_adapthist
    """

    # Custom wrapper to preserve xarray metadata
    def equalize_adapthist_da(da, **kwargs):
        # Apply the CLAHE function from skimage
        result = skimage.exposure.equalize_adapthist(da.values, **kwargs)
        #new_coords = {k: v for k, v in da.coords.items() if k != 'band' or len(v) == 3}

        # Convert the result back to a DataArray, preserving the original metadata
        return xr.DataArray(result, dims=da.dims, coords=da.coords, attrs=da.attrs)

    rgb_da = self.rgb

    #template = rgb_da.copy(data=np.empty_like(rgb_da).data)
    template = xr.zeros_like(rgb_da)
    rgb_clahe_da = xr.map_blocks(equalize_adapthist_da, rgb_da, template=template, kwargs=kwargs)
    rgb_clahe_da = rgb_clahe_da.where(self.rgba.isel(band=-1)==255)#.where(self.data.scl>=0)

    return rgb_clahe_da

`get_rgb_percentile(self, **percentile_kwargs)` ¶

Apply percentile-based contrast stretching to the RGB bands of the Sentinel-2 data.

This function creates a new DataArray with the contrast-stretched RGB bands.

Parameters:

Name	Type	Description	Default
`**kwargs`	`dict`	Keyword arguments for percentile calculation. Supported keys: - 'lower': Lower percentile for contrast stretching (default: 2) - 'upper': Upper percentile for contrast stretching (default: 98)	required

Returns:

Type	Description
`xarray.DataArray`	RGB data with percentile-based contrast stretching applied.

Source code in easysnowdata/remote_sensing.py

def get_rgb_percentile(self, **percentile_kwargs):
    """
    Apply percentile-based contrast stretching to the RGB bands of the Sentinel-2 data.

    This function creates a new DataArray with the contrast-stretched RGB bands.

    Parameters
    ----------
    **kwargs : dict
        Keyword arguments for percentile calculation. Supported keys:
        - 'lower': Lower percentile for contrast stretching (default: 2)
        - 'upper': Upper percentile for contrast stretching (default: 98)

    Returns
    -------
    xarray.DataArray
        RGB data with percentile-based contrast stretching applied.

    Notes
    -----
    The function clips values to the range [0, 1] and masks areas where SCL < 0.
    """
    lower_percentile = percentile_kwargs.get('lower', 2)
    upper_percentile = percentile_kwargs.get('upper', 98)

    def stretch_percentile(da):
        p_low, p_high = np.nanpercentile(da.values, [lower_percentile, upper_percentile])
        return (da - p_low) / (p_high - p_low)

    rgb_da = self.rgb.where(self.rgba.isel(band=-1)==255)

    template = xr.zeros_like(rgb_da)
    rgb_percentile_da = xr.map_blocks(stretch_percentile, rgb_da, template=template)
    rgb_percentile_da = rgb_percentile_da.clip(0, 1)#.where(self.data.scl>=0)

    return rgb_percentile_da

`mask_data(self, remove_cirrus=True, remove_cloud=True, remove_adj_to_cloud=True, remove_cloud_shadows=True, remove_snow_ice=False, remove_water=False, remove_aerosol_low=False, remove_aerosol_moderate=False, remove_aerosol_high=False, remove_aerosol_climatology=False)` ¶

The method to mask the data using Fmask.

Parameters:

Name	Type	Description	Default
`remove_cirrus`	`bool`	Whether to remove cirrus pixels.	`True`
`remove_cloud`	`bool`	Whether to remove cloud pixels.	`True`
`remove_adj_to_cloud`	`bool`	Whether to remove pixels adjacent to clouds.	`True`
`remove_cloud_shadows`	`bool`	Whether to remove cloud shadow pixels.	`True`
`remove_snow_ice`	`bool`	Whether to remove snow and ice pixels.	`False`
`remove_water`	`bool`	Whether to remove water pixels.	`False`
`remove_aerosol_low`	`bool`	Whether to remove low aerosol pixels.	`False`
`remove_aerosol_moderate`	`bool`	Whether to remove moderate aerosol pixels.	`False`
`remove_aerosol_high`	`bool`	Whether to remove high aerosol pixels.	`False`
`remove_aerosol_climatology`	`bool`	Whether to remove climatology aerosol pixels.	`False`

Source code in easysnowdata/remote_sensing.py

def mask_data(
    self,
    remove_cirrus=True,
    remove_cloud=True,
    remove_adj_to_cloud=True,
    remove_cloud_shadows=True,
    remove_snow_ice=False,
    remove_water=False,
    remove_aerosol_low=False,
    remove_aerosol_moderate=False,
    remove_aerosol_high=False,
    remove_aerosol_climatology=False,
):
    """
    The method to mask the data using Fmask.

    Parameters:
        remove_cirrus (bool): Whether to remove cirrus pixels.
        remove_cloud (bool): Whether to remove cloud pixels.
        remove_adj_to_cloud (bool): Whether to remove pixels adjacent to clouds.
        remove_cloud_shadows (bool): Whether to remove cloud shadow pixels.
        remove_snow_ice (bool): Whether to remove snow and ice pixels.
        remove_water (bool): Whether to remove water pixels.
        remove_aerosol_low (bool): Whether to remove low aerosol pixels.
        remove_aerosol_moderate (bool): Whether to remove moderate aerosol pixels.
        remove_aerosol_high (bool): Whether to remove high aerosol pixels.
        remove_aerosol_climatology (bool): Whether to remove climatology aerosol pixels.


    """

    # Get value of QC bit based on location
    def get_qc_bit(ar, bit):
        # taken from Helen's fantastic repo https://github.com/UW-GDA/mekong-water-quality/blob/main/02_pull_hls.ipynb
        return (ar // (2**bit)) - ((ar // (2**bit)) // 2 * 2)

    mask = xr.DataArray()
    # Mask the data based on the Fmask values
    mask_list = []
    if remove_cirrus:
        mask_list.append(0)
    if remove_cloud:
        mask_list.append(1)
    if remove_adj_to_cloud:
        mask_list.append(2)
    if remove_cloud_shadows:
        mask_list.append(3)
    if remove_snow_ice:
        mask_list.append(4)
    if remove_water:
        mask_list.append(5)
    if (
        remove_aerosol_climatology
        | remove_aerosol_low
        | remove_aerosol_moderate
        | remove_aerosol_high
    ):
        mask_list.append(6)
        aerosol_mask = (
            get_qc_bit(self.data["Fmask"], 6)
            .astype(str)
            .str.cat(get_qc_bit(self.data["Fmask"], 7).astype(str))
        )
        if remove_aerosol_climatology:
            mask = xr.concat([mask, aerosol_mask == "00"], dim="masks")
        if remove_aerosol_low:
            mask = xr.concat([mask, aerosol_mask == "01"], dim="masks")
        if remove_aerosol_moderate:
            mask = xr.concat([mask, aerosol_mask == "10"], dim="masks")
        if remove_aerosol_high:
            mask = xr.concat([mask, aerosol_mask == "11"], dim="masks")

    for val in mask_list:
        if val != 6:
            mask = xr.concat(
                [mask, get_qc_bit(self.data["Fmask"], val)], dim="masks"
            )

    mask = mask.sum(dim="masks")
    self.data = self.data.where(mask == 0)

    print(
        f"WARNING: The cloud masking is pretty bad over snow and ice. Use with caution."
    )
    print(f"Data masked. Using Fmask, removed pixels classified as:")
    for val in mask_list:
        print(self.Fmask_mask_info[val]["name"])

`remove_nodata_inplace(self)` ¶

The method to remove no data values from the data.

Source code in easysnowdata/remote_sensing.py

def remove_nodata_inplace(self):
    """
    The method to remove no data values from the data.
    """
    data_removed=False
    for band in self.data.data_vars:
        nodata_value = self.data[band].attrs.get("nodata")
        if nodata_value is not None:
            self.data[band] = self.data[band].where(self.data[band] != nodata_value)
            data_removed=True
    if data_removed:
        print(f"Nodata values removed from the data. In doing so, all bands converted to float32. To turn this behavior off, set remove_nodata=False.")
    else:
        print(f"Tried to remove nodata values and set them to nans, but no nodata values found in the data.")

`scale_data_inplace(self)` ¶

The method to scale the data.

Source code in easysnowdata/remote_sensing.py

def scale_data_inplace(self):
    """
    The method to scale the data.
    """

    # Define a function to scale a data variable
    def scale_var(x):
        band = x.name
        if band in self.data.band_info:
            scale_factor_dict = self.data.band_info[band]
            # Extract the actual scale factor from the dictionary and convert it to a float
            scale_factor = float(scale_factor_dict["scale"])
            return x * scale_factor
        else:
            return x

    # Apply the function to each data variable in the Dataset
    self.data = self.data.apply(scale_var, keep_attrs=True)
    print(
        f"Data scaled to reflectance. Access with the .data attribute. To turn this behavior off, set scale_data=False."
    )

`search_data(self)` ¶

The method to search the data.

Source code in easysnowdata/remote_sensing.py

def search_data(self):
    """
    The method to search the data.
    """

    catalog = pystac_client.Client.open(
        "https://cmr.earthdata.nasa.gov/stac/LPCLOUD"
    )

    # Search for items within the specified bbox and date range
    landsat_search = catalog.search(
        collections=["HLSL30_2.0"],
        bbox=self.bbox_gdf.total_bounds,
        datetime=(self.start_date, self.end_date),
    )
    sentinel_search = catalog.search(
        collections=["HLSS30_2.0"],
        bbox=self.bbox_gdf.total_bounds,
        datetime=(self.start_date, self.end_date),
    )

    self.search_landsat = landsat_search
    self.search_sentinel = sentinel_search
    print(
        f"Data searched. Access the returned seach with the .search_landsat or .search_sentinel attribute."
    )

`MODIS_snow` ¶

A class to handle MODIS snow data.

This class provides functionality to search, retrieve, and process MODIS snow cover data. It supports various MODIS snow products and allows for spatial and temporal subsetting.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.GeoDataFrame or tuple or Shapely Geometry`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	`None`
`clip_to_bbox`	`bool`	Whether to clip the data to the bounding box. Default is True.	`True`
`start_date`	`str`	The start date for the data in the format 'YYYY-MM-DD'. Default is '2000-01-01'.	`'2000-01-01'`
`end_date`	`str`	The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.	`'2025-09-12'`
`data_product`	`str`	The MODIS data product to retrieve. Can choose between 'MOD10A1F', 'MOD10A1', or 'MOD10A2'. Default is 'MOD10A2'.	`'MOD10A2'`
`bands`	`list`	The bands to be used. Default is all bands.	`None`
`resolution`	`str`	The resolution of the data. Defaults to native resolution.	`None`
`crs`	`str`	The coordinate reference system. Default is None.	`None`
`vertical_tile`	`int`	The vertical tile number for MODIS data. Default is None.	`None`
`horizontal_tile`	`int`	The horizontal tile number for MODIS data. Default is None.	`None`
`mute`	`bool`	Whether to mute print outputs. Default is False.	`False`

Attributes:

Name	Type	Description
`data`	`xarray.Dataset`	The loaded MODIS snow data.
`binary_snow`	`xarray.DataArray`	Binary snow cover map derived from the data (only for MOD10A2 product).
`Methods`	`None`
`-------`	`None`
`search_data()`	`None`	Searches for MODIS snow data based on the specified parameters.
`get_data()`	`None`	Retrieves the MODIS snow data based on the search results.
`get_binary_snow()`	`None`	Calculates a binary snow cover map from the data (only for MOD10A2 product).

Source code in easysnowdata/remote_sensing.py

class MODIS_snow:
    """
    A class to handle MODIS snow data.

    This class provides functionality to search, retrieve, and process MODIS snow cover data.
    It supports various MODIS snow products and allows for spatial and temporal subsetting.

    Parameters
    ----------
    bbox_input : geopandas.GeoDataFrame or tuple or Shapely Geometry, optional
        GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
    clip_to_bbox : bool, optional
        Whether to clip the data to the bounding box. Default is True.
    start_date : str, optional
        The start date for the data in the format 'YYYY-MM-DD'. Default is '2000-01-01'.
    end_date : str, optional
        The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
    data_product : str, optional
        The MODIS data product to retrieve. Can choose between 'MOD10A1F', 'MOD10A1', or 'MOD10A2'. Default is 'MOD10A2'.
    bands : list, optional
        The bands to be used. Default is all bands.
    resolution : str, optional
        The resolution of the data. Defaults to native resolution.
    crs : str, optional
        The coordinate reference system. Default is None.
    vertical_tile : int, optional
        The vertical tile number for MODIS data. Default is None.
    horizontal_tile : int, optional
        The horizontal tile number for MODIS data. Default is None.
    mute : bool, optional
        Whether to mute print outputs. Default is False.

    Attributes
    ----------
    data : xarray.Dataset
        The loaded MODIS snow data.
    binary_snow : xarray.DataArray
        Binary snow cover map derived from the data (only for MOD10A2 product).

    Methods
    -------
    search_data()
        Searches for MODIS snow data based on the specified parameters.
    get_data()
        Retrieves the MODIS snow data based on the search results.
    get_binary_snow()
        Calculates a binary snow cover map from the data (only for MOD10A2 product).

    Notes
    -----
    Available data products:
    MOD10A1: Daily snow cover, 500m resolution
    MOD10A2: 8-day maximum snow cover, 500m resolution
    MOD10A1F: Daily cloud-free snow cover (gap-filled), 500m resolution

    Data citations:
    MOD10A1F: Hall, D. K. and G. A. Riggs. (2020). MODIS/Terra CGF Snow Cover Daily L3 Global 500m SIN Grid, Version 61 [Data Set]. Boulder, Colorado USA. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://doi.org/10.5067/MODIS/MOD10A1F.061. Date Accessed 03-19-2024.
    MOD10A1: Hall, D. K. and G. A. Riggs. (2021). MODIS/Terra Snow Cover Daily L3 Global 500m SIN Grid, Version 61 [Data Set]. Boulder, Colorado USA. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://doi.org/10.5067/MODIS/MOD10A1.061. Date Accessed 03-28-2024.
    MOD10A2: Hall, D. K. and G. A. Riggs. (2021). MODIS/Terra Snow Cover 8-Day L3 Global 500m SIN Grid, Version 61 [Data Set]. Boulder, Colorado USA. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://doi.org/10.5067/MODIS/MOD10A2.061. Date Accessed 03-28-2024.
    """

    def __init__(
        self,
        bbox_input=None,
        clip_to_bbox=True,
        start_date="2000-01-01",
        end_date=today,
        data_product="MOD10A2",
        bands=None,
        resolution=None,
        crs=None,
        vertical_tile=None,
        horizontal_tile=None,
        mute=False,
    ):

        if mute:
            blockPrint()

        self.bbox_input = bbox_input
        self.bbox_gdf = convert_bbox_to_geodataframe(bbox_input)
        self.clip_to_bbox = clip_to_bbox
        self.start_date = start_date
        self.end_date = end_date
        self.data_product = data_product
        self.bands = bands
        self.resolution = resolution
        self.crs = crs
        self.vertical_tile = vertical_tile
        self.horizontal_tile = horizontal_tile


        self.search_data()
        self.get_data()

        if mute:
            enablePrint()

    def search_data(self):

        if self.data_product == "MOD10A1" or self.data_product == "MOD10A2":
            catalog = pystac_client.Client.open(
                "https://planetarycomputer.microsoft.com/api/stac/v1",
                modifier=planetary_computer.sign_inplace,
            )

            if self.bbox_input is not None:
                search = catalog.search(
                    collections=[f"modis-{self.data_product[3:]}-061"],
                    bbox=self.bbox_gdf.total_bounds,
                    datetime=(self.start_date, self.end_date),
                )

            else:
                search = catalog.search(
                    collections=[f"modis-{self.data_product[3:]}-061"],
                    datetime=(self.start_date, self.end_date),
                    query={
                        "modis:vertical-tile": {"eq": self.vertical_tile},
                        "modis:horizontal-tile": {"eq": self.horizontal_tile},
                    },
                )

        elif self.data_product == "MOD10A1F":
            search = earthaccess.search_data(
                short_name="MOD10A1F",
                cloud_hosted=False,
                bounding_box=tuple(self.bbox_gdf.total_bounds),
                temporal=(self.start_date, self.end_date),
            )

        else:
            raise ValueError(
                "Data product not recognized. Please choose 'MOD10A1', 'MOD10A2', or 'MOD10A1F'."
            )

        self.search = search

    def get_data(self):

        if self.data_product == "MOD10A1" or self.data_product == "MOD10A2":

            load_params = {
                "items": self.search.item_collection(),
                "chunks": {"time": 1, "x": 512, "y": 512},
            }
            if self.clip_to_bbox:
                load_params["bbox"] = self.bbox_gdf.total_bounds
            if self.bands:
                load_params["bands"] = self.bands
            if self.crs:
                load_params["crs"] = self.crs
            if self.resolution:
                load_params["resolution"] = self.resolution

            modis_snow = odc.stac.load(**load_params)

        elif self.data_product == "MOD10A1F":
            # files = earthaccess.open(results) # doesn't seem to work for .hdf files...
            # https://github.com/nsidc/earthaccess/blob/main/docs/tutorials/file-access.ipynb
            # https://github.com/nsidc/earthaccess/tree/main
            # https://earthaccess.readthedocs.io/en/latest/tutorials/emit-earthaccess/
            # https://nbviewer.org/urls/gist.githubusercontent.com/scottyhq/790bf19c7811b5c6243ce37aae252ca1/raw/e2632e928647fd91c797e4a23116d2ac3ff62372/0-load-hdf5.ipynb
            # https://docs.dask.org/en/latest/array-creation.html#concatenation-and-stacking
            # https://matthewrocklin.com/blog/work/2018/02/06/hdf-in-the-cloud

            # guess we'll download instead
            temp_download_fp = "/tmp/local_folder"  # do these auto delete, or should i delete when opened explicitly? shutil.rmtree(temp_download_fp)

            files = earthaccess.download(
                self.search, temp_download_fp
            )  # can i suppress the print output? https://earthaccess.readthedocs.io/en/latest/user-reference/api/api/


            if self.clip_to_bbox:
                modis_snow = xr.concat(
                    [
                        rxr.open_rasterio(
                            file, variable="CGF_NDSI_Snow_Cover", chunks={}
                        )["CGF_NDSI_Snow_Cover"]
                        .squeeze()
                        .rio.clip_box(*self.bbox_gdf.total_bounds,crs=self.bbox_gdf.crs)
                        .assign_coords(
                            time=pd.to_datetime(
                                rxr.open_rasterio(
                                    file, variable="CGF_NDSI_Snow_Cover", chunks={}
                                )
                                .squeeze()
                                .attrs["RANGEBEGINNINGDATE"]
                            )
                        )
                        .drop_vars("band")
                        for file in files
                    ],
                    dim="time",
                )

            else:
                modis_snow = xr.concat(
                    [
                        rxr.open_rasterio(
                            file, variable="CGF_NDSI_Snow_Cover", chunks={}
                        )["CGF_NDSI_Snow_Cover"]
                        .squeeze()
                        .assign_coords(
                            time=pd.to_datetime(
                                rxr.open_rasterio(
                                    file, variable="CGF_NDSI_Snow_Cover", chunks={}
                                )
                                .squeeze()
                                .attrs["RANGEBEGINNINGDATE"]
                            )
                        )
                        .drop_vars("band")
                        for file in files
                    ],
                    dim="time",
                )

        else:
            raise ValueError(
                "Data product not recognized. Please choose 'MOD10A1', 'MOD10A2', or 'MOD10A1F'."
            )

        self.data = modis_snow

        if self.data_product == "MOD10A2":
            self.data.attrs["class_info"] = {
                0: {"name": "missing data", "color": "#006400"},
                1: {"name": "no decision", "color": "#FFBB22"},
                11: {"name": "night", "color": "#FFFF4C"},
                25: {"name": "no snow", "color": "#F096FF"},
                37: {"name": "lake", "color": "#FA0000"},
                39: {"name": "ocean / sparse vegetation", "color": "#B4B4B4"},
                50: {"name": "cloud", "color": "#F0F0F0"},
                100: {"name": "lake ice", "color": "#0064C8"},
                200: {"name": "snow", "color": "#0096A0"},
                254: {"name": "detector saturated", "color": "#00CF75"},
                255: {"name": "fill", "color": "#FAE6A0"},
            }

        print("Data retrieved. Access with the .data attribute.")

    def get_binary_snow(self):

        if self.data_product == "MOD10A2":
            self.binary_snow = xr.where(self.data["Maximum_Snow_Extent"] == 200, 1, 0).rio.write_crs(self.data.rio.crs)
            print("Binary snow map calculated. Access with the .binary_snow attribute.")
        else:
            print("This method is only available for the MOD10A2 product.")

`Sentinel1` ¶

A class to handle Sentinel-1 RTC satellite data.

This class provides functionality to search, retrieve, and process Sentinel-1 Radiometric Terrain Corrected (RTC) data. It supports various data operations including border noise removal and unit conversion.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.GeoDataFrame or tuple or Shapely Geometry`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	required
`start_date`	`str`	The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.	`'2014-01-01'`
`end_date`	`str`	The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.	`'2025-09-12'`
`catalog_choice`	`str`	The catalog choice for the data. Default is 'planetarycomputer'.	`'planetarycomputer'`
`bands`	`list`	The bands to be used. Default is all bands.	`None`
`units`	`str`	The units of the data. Can be 'dB' or 'linear power'. Default is 'dB'.	`'dB'`
`resolution`	`str`	The resolution of the data. Defaults to native resolution.	`None`
`crs`	`str`	The coordinate reference system. Default is None.	`None`
`groupby`	`str`	The groupby parameter for the data. Default is "sat:absolute_orbit".	`'sat:absolute_orbit'`
`chunks`	`dict`	The chunk size for dask arrays. Default is {}.	`{}`
`remove_border_noise`	`bool`	Whether to remove border noise from the data. Default is True.	`True`

Attributes:

Name	Type	Description
`data`	`xarray.Dataset`	The loaded Sentinel-1 data.
`metadata`	`geopandas.GeoDataFrame`	Metadata for the retrieved Sentinel-1 scenes.
`local_incidence_angle_data`	`xarray.DataArray`	Local incidence angle values calculated from Sentinel-1 data and Copernicus DEM. functionality requires google earth engine initialization, and is based on https://gis.stackexchange.com/a/352658
`Methods`	`None`
`-------`	`None`
`search_data()`	`None`	Searches for Sentinel-1 data based on the specified parameters.
`get_data()`	`None`	Retrieves the Sentinel-1 data based on the search results.
`get_metadata()`	`None`	Retrieves metadata for the Sentinel-1 scenes.
`remove_border_noise()`	`None`	Removes border noise from the data.
`linear_to_db()`	`None`	Converts linear power units to decibels (dB).
`db_to_linear()`	`None`	Converts decibels (dB) to linear power units.
`add_orbit_info()`	`None`	Adds orbit information to the data as coordinates.
`get_local_incidence_angle()`	`None`	Calculates and retrieves the local incidence angle for the area of interest.

Source code in easysnowdata/remote_sensing.py

class Sentinel1:
    """
    A class to handle Sentinel-1 RTC satellite data.

    This class provides functionality to search, retrieve, and process Sentinel-1 Radiometric Terrain Corrected (RTC) data.
    It supports various data operations including border noise removal and unit conversion.

    Parameters
    ----------
    bbox_input : geopandas.GeoDataFrame or tuple or Shapely Geometry
        GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
    start_date : str, optional
        The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.
    end_date : str, optional
        The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
    catalog_choice : str, optional
        The catalog choice for the data. Default is 'planetarycomputer'.
    bands : list, optional
        The bands to be used. Default is all bands.
    units : str, optional
        The units of the data. Can be 'dB' or 'linear power'. Default is 'dB'.
    resolution : str, optional
        The resolution of the data. Defaults to native resolution.
    crs : str, optional
        The coordinate reference system. Default is None.
    groupby : str, optional
        The groupby parameter for the data. Default is "sat:absolute_orbit".
    chunks : dict, optional
        The chunk size for dask arrays. Default is {}.
    remove_border_noise : bool, optional
        Whether to remove border noise from the data. Default is True.

    Attributes
    ----------
    data : xarray.Dataset
        The loaded Sentinel-1 data.
    metadata : geopandas.GeoDataFrame
        Metadata for the retrieved Sentinel-1 scenes.
    local_incidence_angle_data : xarray.DataArray
        Local incidence angle values calculated from Sentinel-1 data and Copernicus DEM.
        functionality requires google earth engine initialization, and is based on https://gis.stackexchange.com/a/352658

    Methods
    -------
    search_data()
        Searches for Sentinel-1 data based on the specified parameters.
    get_data()
        Retrieves the Sentinel-1 data based on the search results.
    get_metadata()
        Retrieves metadata for the Sentinel-1 scenes.
    remove_border_noise()
        Removes border noise from the data.
    linear_to_db()
        Converts linear power units to decibels (dB).
    db_to_linear()
        Converts decibels (dB) to linear power units.
    add_orbit_info()
        Adds orbit information to the data as coordinates.
    get_local_incidence_angle()
        Calculates and retrieves the local incidence angle for the area of interest.
    """

    def __init__(
        self,
        bbox_input,
        start_date="2014-01-01",
        end_date=today,
        catalog_choice="planetarycomputer",
        bands=None,
        units='dB', # linear power or dB
        resolution=None,
        crs=None,
        groupby="sat:absolute_orbit",
        chunks={}, # {"x": 512, "y": 512} or # {"x": 512, "y": 512, "time": -1}
        remove_border_noise=True,
    ):
        """
        The constructor for the Sentinel1 class.

        Parameters:
            bbox_input (geopandas.GeoDataFrame or tuple or shapely.Geometry): GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
            start_date (str): The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.
            end_date (str): The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
            catalog_choice (str): The catalog choice for the data. Can choose between 'planetarycomputer' and <unimplemented>, default is 'planetarycomputer'.
            bands (list): The bands to be used. Default is all bands.
            resolution (str): The resolution of the data. Defaults to native resolution, 10m.
            crs (str): The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.
            groupby (str): The groupby parameter for the data. Default is "sat:absolute_orbit".
        """
        # Initialize the attributes
        self.bbox_input = bbox_input
        self.start_date = start_date
        self.end_date = end_date
        self.catalog_choice = catalog_choice
        self.bands = bands
        self.resolution = resolution
        self.crs = crs
        self.chunks = chunks
        self.groupby = groupby
        self.remove_border_noise = remove_border_noise

        #if not self.geobox:
        self.bbox_gdf = convert_bbox_to_geodataframe(self.bbox_input)

        if self.crs is None:
            self.crs = self.bbox_gdf.estimate_utm_crs()

        # if resolution == None:
        #     self.resolution = 10

        self.search = None
        self.data = None
        self.metadata = None
        self._local_incidence_angle_data = None

        self.search_data()
        self.get_data()
        self.get_metadata()
        if self.remove_border_noise:
            self.remove_bad_scenes_and_border_noise()
        self.add_orbit_info()
        if units == 'dB':
            self.linear_to_db()
        else:
            print('Units remain in linear power. Convert to dB using the .linear_to_db() method.')

    def search_data(self):
        """
        The method to search the data.
        """

        # Choose the catalog URL based on catalog_choice
        if self.catalog_choice == "planetarycomputer":
            catalog_url = "https://planetarycomputer.microsoft.com/api/stac/v1"
            catalog = pystac_client.Client.open(
                catalog_url, modifier=planetary_computer.sign_inplace
            )
        # elif self.catalog_choice == "aws":
        #     catalog_url = indigo
        #     catalog = pystac_client.Client.open(catalog_url)
        else:
            raise ValueError(
                "Invalid catalog_choice. Choose either 'planetarycomputer' or <unimplemented>."
            )

        # Search for items within the specified bbox and date range
        search = catalog.search(
            collections=["sentinel-1-rtc"],
            bbox=self.bbox_gdf.total_bounds,
            datetime=(self.start_date, self.end_date),
        )
        # elif self.geobox:
        #     search = catalog.search(
        #         collections=["sentinel-1-rtc"],
        #         bbox=np.array(self.geobox.extent.boundingbox.to_crs('epsg:4326')),
        #         datetime=(self.start_date, self.end_date),
        #     )

        self.search = search
        print(f"Data searched. Access the returned seach with the .search attribute.")

    def get_data(self):
        """
        The method to get the data.
        """
        # Prepare the parameters for odc.stac.load
        load_params = {
            "items": self.search.items(),
            "nodata": -32768,
            "chunks": self.chunks,
            "groupby": self.groupby,
        }
        if self.bands:
            load_params["bands"] = self.bands
        load_params["crs"] = self.crs
        load_params["bbox"] = self.bbox_gdf.total_bounds
        load_params["resolution"] = self.resolution

        # Load the data lazily using odc.stac
        self.data = odc.stac.load(**load_params).sortby(
            "time"
        )  # sorting by time because of known issue in s1 mpc stac catalog
        self.data.attrs["units"] = "linear power"
        print(
            f"Data retrieved. Access with the .data attribute. Data CRS: {self.bbox_gdf.estimate_utm_crs().name}."
        )

    def get_metadata(self):
        """
        The method to get the metadata.
        """
        stac_json = self.search.item_collection_as_dict()
        metadata_gdf = gpd.GeoDataFrame.from_features(stac_json, "epsg:4326")

        self.metadata = metadata_gdf
        print(f"Metadata retrieved. Access with the .metadata attribute.")

    # def remove_border_noise(self,threshold=0.001):
    #     """
    #     The method to remove border noise from the data.
    #     https://forum.step.esa.int/t/grd-border-noise-and-thermal-noise-removal-are-not-working-anymore-since-march-13-2018/9332
    #     https://www.mdpi.com/2072-4292/8/4/348
    #     https://forum.step.esa.int/t/nan-appears-at-the-edge-of-the-scene-after-applying-border-noise-removal-sentinel-1-grd/40627/2
    #     https://sentiwiki.copernicus.eu/__attachments/1673968/OI-MPC-OTH-MPC-0243%20-%20Sentinel-1%20masking%20no%20value%20pixels%20grd%20products%20note%202023%20-%202.2.pdf?inst-v=534578f3-fc04-48e9-bd69-3a45a681fe67#page=12.58
    #     https://ieeexplore.ieee.org/document/8255846
    #     https://www.mdpi.com/2504-3900/2/7/330
    #     """
    #     self.data.loc[dict(time=slice('2014-01-01','2018-03-14'))] = self.data.sel(time=slice('2014-01-01','2018-03-14')).where(lambda x: x > threshold)
    #     print(f"Border noise removed from the data.")

    def remove_bad_scenes_and_border_noise(self, threshold=0.001):
        cutoff_date = np.datetime64('2018-03-14')

        original_crs = self.data.rio.crs

        result = xr.where(
            self.data.time < cutoff_date,
            self.data.where(self.data > threshold),
            self.data.where(self.data > 0)
        )

        result.rio.write_crs(original_crs, inplace=True)

        self.data = result
        print(f"Falsely low scenes and border noise removed from the data.")

    def linear_to_db(self):
        """
        The method to convert the linear power data to dB.
        """
        self.data = 10 * np.log10(self.data)
        self.data.attrs["units"] = "dB"
        print(
            f"Linear power units converted to dB. Convert back to linear power units using the .db_to_linear() method."
        )

    def db_to_linear(self):
        """
        The method to convert the dB data to linear power.
        """
        self.data = 10 ** (self.data / 10)
        self.data.attrs["units"] = "linear power"
        print(
            f"dB converted to linear power units. Convert back to dB using the .linear_to_db() method."
        )

    def add_orbit_info(self):
        """
        The method to add the relative orbit number to the data.
        """
        metadata_groupby_gdf = (
            self.metadata.groupby([f"{self.groupby}"]).first().sort_values("datetime")
        )
        self.data = self.data.assign_coords(
            {"sat:orbit_state": ("time", metadata_groupby_gdf["sat:orbit_state"])}
        )
        self.data = self.data.assign_coords(
            {
                "sat:relative_orbit": (
                    "time",
                    metadata_groupby_gdf["sat:relative_orbit"].astype("int16"),
                )
            }
        )
        print(
            f"Added relative orbit number and orbit state as coordinates to the data."
        )

    @property
    def local_incidence_angle_data(self):
        """
        Property to access local incidence angle data.

        Returns
        -------
        xarray.DataArray
            DataArray containing local incidence angle values aligned to the same grid as the primary data.

        Notes
        -----
        On first access, this property calculates local incidence angles using Sentinel-1 data and 
        Copernicus 30m DEM. Results are cached for subsequent accesses.
        """
        if self._local_incidence_angle_data is None:
            self.get_local_incidence_angle()
        return self._local_incidence_angle_data

    def get_local_incidence_angle(self, resolution=None, initialize_ee=True):
        """
        Calculate local incidence angle for Sentinel-1 data within a bounding box.

        Parameters
        ----------
        resolution : int or float, optional
            Desired resolution in meters for the calculation. Defaults to self.resolution if set, or 30m.
        initialize_ee : bool, optional
            Whether to initialize Earth Engine. Default is True.

        Returns
        -------
        xarray.DataArray
            DataArray containing local incidence angle values with dimensions (sat:relative_orbit, y, x),
            aligned to the same grid as the primary data.

        Notes
        -----
        This method calculates the local incidence angle using the Copernicus 30m DEM and
        stores the results in the local_incidence_angle_data attribute.
        """
        import math
        import xee
        from collections import Counter

        # Use object's resolution if none provided
        calc_resolution = resolution or self.resolution or 30

        # Initialize Earth Engine with high-volume endpoint
        if initialize_ee:
            ee.Initialize(opt_url='https://earthengine-highvolume.googleapis.com')
        else:
            print("Initialization turned off. If you haven't already, please sign in to Google Earth Engine by running:\n\nimport ee\nee.Authenticate()\nee.Initialize()\n\n")

        # Convert bbox to Earth Engine geometry
        bbox = tuple(self.bbox_gdf.total_bounds)
        ee_bbox = ee.Geometry.Rectangle(bbox)

        # Filter Sentinel-1 collection
        collection = ee.ImageCollection('COPERNICUS/S1_GRD') \
            .filterBounds(ee_bbox) \
            .filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV')) \
            .filter(ee.Filter.eq('instrumentMode', 'IW'))

        # Get distinct orbit numbers
        distinct_orbits = collection.aggregate_array('relativeOrbitNumber_start').distinct()
        orbit_list = distinct_orbits.getInfo()

        if not orbit_list:
            raise ValueError("No Sentinel-1 data found for the specified bounding box.")

        print(f"Found {len(orbit_list)} unique relative orbits: {orbit_list}")

        # Find the most common projection among the orbits
        orbit_projections = {}
        print("Analyzing orbit projections to find the most common one...")

        for orbit in orbit_list:
            orbit_image = collection \
                .filter(ee.Filter.eq('relativeOrbitNumber_start', orbit)) \
                .first()

            if orbit_image:
                # Get projection info
                proj_info = orbit_image.select(0).projection().getInfo()
                crs = proj_info['crs']
                orbit_projections[orbit] = {
                    'crs': crs,
                    'transform': proj_info['transform']
                }
                print(f"Orbit {orbit} uses {crs}")

        # Count CRS frequencies
        crs_counts = Counter([info['crs'] for info in orbit_projections.values()])
        most_common_crs = crs_counts.most_common(1)[0][0]

        print(f"Most common CRS: {most_common_crs} (used by {crs_counts[most_common_crs]} of {len(orbit_list)} orbits)")

        # Function to calculate local incidence angle using Copernicus 30m DEM
        def calculate_local_incidence_angle(image):
            img_geom = image.geometry()

            # Use Copernicus 30m DEM with proper reprojection
            dem_collection = ee.ImageCollection('COPERNICUS/DEM/GLO30')
            dem = dem_collection.select('DEM').mosaic().clip(img_geom)

            # Reproject DEM to the most common CRS with the specified resolution
            projection = ee.Projection(most_common_crs).atScale(calc_resolution)
            dem = dem.reproject(projection)

            # 2.1.1 Radar geometry 
            theta_i = image.select('angle')
            phi_i = ee.Terrain.aspect(theta_i) \
                .reduceRegion(ee.Reducer.mean(), theta_i.get('system:footprint'), 1000) \
                .get('aspect')

            # 2.1.2 Terrain geometry
            alpha_s = ee.Terrain.slope(dem).select('slope')
            phi_s = ee.Terrain.aspect(dem).select('aspect')

            # 2.1.3 Model geometry
            # reduce to 3 angle
            phi_r = ee.Image.constant(phi_i).subtract(phi_s)

            # convert all to radians
            phi_rRad = phi_r.multiply(math.pi / 180)
            alpha_sRad = alpha_s.multiply(math.pi / 180)
            theta_iRad = theta_i.multiply(math.pi / 180)
            ninetyRad = ee.Image.constant(90).multiply(math.pi / 180)

            # slope steepness in range (eq. 2)
            alpha_r = (alpha_sRad.tan().multiply(phi_rRad.cos())).atan()

            # slope steepness in azimuth (eq 3)
            alpha_az = (alpha_sRad.tan().multiply(phi_rRad.sin())).atan()

            # local incidence angle (eq. 4)
            cos_theta_lia = (alpha_az.cos().multiply((theta_iRad.subtract(alpha_r)).cos()))

            # Ensure valid range for acos
            cos_theta_lia = cos_theta_lia.clamp(-1, 1)

            theta_lia = cos_theta_lia.acos()
            theta_liaDeg = theta_lia.multiply(180 / math.pi)

            return image.addBands(theta_liaDeg.rename('local_incidence_angle'))

        # Create list to store DataArrays for each orbit
        orbit_arrays = []

        # Create standard projection based on the most common CRS
        standard_projection = ee.Projection(most_common_crs).atScale(calc_resolution)

        # Process each orbit
        for orbit in orbit_list:
            # Get images for this orbit
            orbit_images = collection \
                .filter(ee.Filter.eq('relativeOrbitNumber_start', orbit)) \
                .sort('system:time_start', True) \
                .limit(3)

            if orbit_images.size().getInfo() > 0:
                # Calculate LIA for each image
                lia_images = orbit_images.map(calculate_local_incidence_angle)

                # Calculate median LIA
                median_lia = lia_images.select('local_incidence_angle').median()

                # Set properties on the median image
                timestamp = orbit_images.first().get('system:time_start')
                median_lia = median_lia.set({
                    'relativeOrbitNumber_start': orbit,
                    'system:time_start': timestamp
                })

                # Create a single-image collection for xee
                orbit_collection = ee.ImageCollection([median_lia])

                # Use xee to convert to xarray
                try:
                    ds = xr.open_dataset(
                        orbit_collection, 
                        engine='ee',
                        geometry=bbox,
                        projection=standard_projection,
                        chunks={},
                    )

                    # Extract the DataArray
                    da = ds['local_incidence_angle']

                    # Remove the time dimension if present
                    if 'time' in da.dims:
                        da = da.isel(time=0, drop=True)

                    # Add orbit as a coordinate
                    da = da.assign_coords({'sat:relative_orbit': orbit})

                    # Check for NaN values
                    nan_percentage = np.isnan(da.values).mean() * 100
                    print(f"Orbit {orbit} - Shape: {da.shape}, NaN percentage: {nan_percentage:.1f}%")

                    if nan_percentage < 100:  # Only keep arrays with some valid data
                        # Store in list
                        orbit_arrays.append(da)
                        print(f"Successfully processed orbit {orbit}")
                    else:
                        print(f"Skipping orbit {orbit} - all values are NaN")

                except Exception as e:
                    print(f"Error processing orbit {orbit}: {e}")

        if orbit_arrays:
            # Ensure all arrays have the same shape before concatenating
            shapes = [da.shape for da in orbit_arrays]
            if len(set(shapes)) > 1:
                print(f"Warning: Arrays have different shapes: {shapes}")

                # Take the shape with the most non-NaN values as template
                best_da_idx = np.argmax([~np.isnan(da.values).sum() for da in orbit_arrays])
                template_da = orbit_arrays[best_da_idx]

                for i in range(len(orbit_arrays)):
                    if i != best_da_idx and orbit_arrays[i].shape != template_da.shape:
                        orbit_num = orbit_arrays[i].sat_relative_orbit.values[0]
                        print(f"Resampling orbit {orbit_num} to match template shape {template_da.shape}")
                        orbit_arrays[i] = orbit_arrays[i].reindex_like(template_da)

            # Combine all orbits into a single DataArray
            lia_da = xr.concat(orbit_arrays, dim='sat:relative_orbit')

            # Add attributes
            lia_da.attrs.update({
                'long_name': 'Sentinel-1 Local Incidence Angle',
                'units': 'degrees',
                'description': 'Local incidence angle calculated from Sentinel-1 data and Copernicus 30m DEM',
                'source': 'Sentinel-1 GRD'
            })

            lia_da = lia_da.transpose("sat:relative_orbit",'Y','X').rename({'X':'x', 'Y':'y'}).rio.set_spatial_dims(x_dim='x', y_dim='y')
            lia_da = lia_da.sortby('sat:relative_orbit')
            # should be in range from 0 to 90
            #lia_da = lia_da.where(lambda x: (x >= 0) & (x <= 90))

            # Reproject to match data grid exactly using bilinear interpolation
            if self.data is not None:
                # Get reference grid from first data variable
                ref_da = self.data[list(self.data.data_vars)[0]].isel(time=0)

                # Ensure lia_da has CRS information
                if not lia_da.rio.crs and ref_da.rio.crs:
                    lia_da.rio.write_crs(ref_da.rio.crs, inplace=True)

                # Reproject to match data grid using bilinear interpolation, careful with nodata
                lia_da = lia_da.rio.reproject_match(
                    ref_da,
                    resampling=rio.warp.Resampling.bilinear,
                    nodata=np.nan,
                )

                print("Local incidence angle data reprojected to match main data grid using bilinear resampling.")

            self._local_incidence_angle_data = lia_da
            print("Local incidence angle calculation complete. Access via the .local_incidence_angle_data attribute.")

            #return self._local_incidence_angle_data
        else:
            raise ValueError("No valid Sentinel-1 data found for the specified bounding box.")

`local_incidence_angle_data` `property` `readonly` ¶

Property to access local incidence angle data.

Returns:

Type	Description
`xarray.DataArray`	DataArray containing local incidence angle values aligned to the same grid as the primary data.

`init(self, bbox_input, start_date='2014-01-01', end_date='2025-09-12', catalog_choice='planetarycomputer', bands=None, units='dB', resolution=None, crs=None, groupby='sat:absolute_orbit', chunks={}, remove_border_noise=True)` `special` ¶

The constructor for the Sentinel1 class.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.GeoDataFrame or tuple or shapely.Geometry`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	required
`start_date`	`str`	The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.	`'2014-01-01'`
`end_date`	`str`	The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.	`'2025-09-12'`
`catalog_choice`	`str`	The catalog choice for the data. Can choose between 'planetarycomputer' and , default is 'planetarycomputer'.	`'planetarycomputer'`
`bands`	`list`	The bands to be used. Default is all bands.	`None`
`resolution`	`str`	The resolution of the data. Defaults to native resolution, 10m.	`None`
`crs`	`str`	The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.	`None`
`groupby`	`str`	The groupby parameter for the data. Default is "sat:absolute_orbit".	`'sat:absolute_orbit'`

Source code in easysnowdata/remote_sensing.py

def __init__(
    self,
    bbox_input,
    start_date="2014-01-01",
    end_date=today,
    catalog_choice="planetarycomputer",
    bands=None,
    units='dB', # linear power or dB
    resolution=None,
    crs=None,
    groupby="sat:absolute_orbit",
    chunks={}, # {"x": 512, "y": 512} or # {"x": 512, "y": 512, "time": -1}
    remove_border_noise=True,
):
    """
    The constructor for the Sentinel1 class.

    Parameters:
        bbox_input (geopandas.GeoDataFrame or tuple or shapely.Geometry): GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
        start_date (str): The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.
        end_date (str): The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
        catalog_choice (str): The catalog choice for the data. Can choose between 'planetarycomputer' and <unimplemented>, default is 'planetarycomputer'.
        bands (list): The bands to be used. Default is all bands.
        resolution (str): The resolution of the data. Defaults to native resolution, 10m.
        crs (str): The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.
        groupby (str): The groupby parameter for the data. Default is "sat:absolute_orbit".
    """
    # Initialize the attributes
    self.bbox_input = bbox_input
    self.start_date = start_date
    self.end_date = end_date
    self.catalog_choice = catalog_choice
    self.bands = bands
    self.resolution = resolution
    self.crs = crs
    self.chunks = chunks
    self.groupby = groupby
    self.remove_border_noise = remove_border_noise

    #if not self.geobox:
    self.bbox_gdf = convert_bbox_to_geodataframe(self.bbox_input)

    if self.crs is None:
        self.crs = self.bbox_gdf.estimate_utm_crs()

    # if resolution == None:
    #     self.resolution = 10

    self.search = None
    self.data = None
    self.metadata = None
    self._local_incidence_angle_data = None

    self.search_data()
    self.get_data()
    self.get_metadata()
    if self.remove_border_noise:
        self.remove_bad_scenes_and_border_noise()
    self.add_orbit_info()
    if units == 'dB':
        self.linear_to_db()
    else:
        print('Units remain in linear power. Convert to dB using the .linear_to_db() method.')

`add_orbit_info(self)` ¶

The method to add the relative orbit number to the data.

Source code in easysnowdata/remote_sensing.py

def add_orbit_info(self):
    """
    The method to add the relative orbit number to the data.
    """
    metadata_groupby_gdf = (
        self.metadata.groupby([f"{self.groupby}"]).first().sort_values("datetime")
    )
    self.data = self.data.assign_coords(
        {"sat:orbit_state": ("time", metadata_groupby_gdf["sat:orbit_state"])}
    )
    self.data = self.data.assign_coords(
        {
            "sat:relative_orbit": (
                "time",
                metadata_groupby_gdf["sat:relative_orbit"].astype("int16"),
            )
        }
    )
    print(
        f"Added relative orbit number and orbit state as coordinates to the data."
    )

`db_to_linear(self)` ¶

The method to convert the dB data to linear power.

Source code in easysnowdata/remote_sensing.py

def db_to_linear(self):
    """
    The method to convert the dB data to linear power.
    """
    self.data = 10 ** (self.data / 10)
    self.data.attrs["units"] = "linear power"
    print(
        f"dB converted to linear power units. Convert back to dB using the .linear_to_db() method."
    )

`get_data(self)` ¶

The method to get the data.

Source code in easysnowdata/remote_sensing.py

def get_data(self):
    """
    The method to get the data.
    """
    # Prepare the parameters for odc.stac.load
    load_params = {
        "items": self.search.items(),
        "nodata": -32768,
        "chunks": self.chunks,
        "groupby": self.groupby,
    }
    if self.bands:
        load_params["bands"] = self.bands
    load_params["crs"] = self.crs
    load_params["bbox"] = self.bbox_gdf.total_bounds
    load_params["resolution"] = self.resolution

    # Load the data lazily using odc.stac
    self.data = odc.stac.load(**load_params).sortby(
        "time"
    )  # sorting by time because of known issue in s1 mpc stac catalog
    self.data.attrs["units"] = "linear power"
    print(
        f"Data retrieved. Access with the .data attribute. Data CRS: {self.bbox_gdf.estimate_utm_crs().name}."
    )

`get_local_incidence_angle(self, resolution=None, initialize_ee=True)` ¶

Calculate local incidence angle for Sentinel-1 data within a bounding box.

Parameters:

Name	Type	Description	Default
`resolution`	`int or float`	Desired resolution in meters for the calculation. Defaults to self.resolution if set, or 30m.	`None`
`initialize_ee`	`bool`	Whether to initialize Earth Engine. Default is True.	`True`

Returns:

Type	Description
`xarray.DataArray`	DataArray containing local incidence angle values with dimensions (sat:relative_orbit, y, x), aligned to the same grid as the primary data.

Source code in easysnowdata/remote_sensing.py

def get_local_incidence_angle(self, resolution=None, initialize_ee=True):
    """
    Calculate local incidence angle for Sentinel-1 data within a bounding box.

    Parameters
    ----------
    resolution : int or float, optional
        Desired resolution in meters for the calculation. Defaults to self.resolution if set, or 30m.
    initialize_ee : bool, optional
        Whether to initialize Earth Engine. Default is True.

    Returns
    -------
    xarray.DataArray
        DataArray containing local incidence angle values with dimensions (sat:relative_orbit, y, x),
        aligned to the same grid as the primary data.

    Notes
    -----
    This method calculates the local incidence angle using the Copernicus 30m DEM and
    stores the results in the local_incidence_angle_data attribute.
    """
    import math
    import xee
    from collections import Counter

    # Use object's resolution if none provided
    calc_resolution = resolution or self.resolution or 30

    # Initialize Earth Engine with high-volume endpoint
    if initialize_ee:
        ee.Initialize(opt_url='https://earthengine-highvolume.googleapis.com')
    else:
        print("Initialization turned off. If you haven't already, please sign in to Google Earth Engine by running:\n\nimport ee\nee.Authenticate()\nee.Initialize()\n\n")

    # Convert bbox to Earth Engine geometry
    bbox = tuple(self.bbox_gdf.total_bounds)
    ee_bbox = ee.Geometry.Rectangle(bbox)

    # Filter Sentinel-1 collection
    collection = ee.ImageCollection('COPERNICUS/S1_GRD') \
        .filterBounds(ee_bbox) \
        .filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV')) \
        .filter(ee.Filter.eq('instrumentMode', 'IW'))

    # Get distinct orbit numbers
    distinct_orbits = collection.aggregate_array('relativeOrbitNumber_start').distinct()
    orbit_list = distinct_orbits.getInfo()

    if not orbit_list:
        raise ValueError("No Sentinel-1 data found for the specified bounding box.")

    print(f"Found {len(orbit_list)} unique relative orbits: {orbit_list}")

    # Find the most common projection among the orbits
    orbit_projections = {}
    print("Analyzing orbit projections to find the most common one...")

    for orbit in orbit_list:
        orbit_image = collection \
            .filter(ee.Filter.eq('relativeOrbitNumber_start', orbit)) \
            .first()

        if orbit_image:
            # Get projection info
            proj_info = orbit_image.select(0).projection().getInfo()
            crs = proj_info['crs']
            orbit_projections[orbit] = {
                'crs': crs,
                'transform': proj_info['transform']
            }
            print(f"Orbit {orbit} uses {crs}")

    # Count CRS frequencies
    crs_counts = Counter([info['crs'] for info in orbit_projections.values()])
    most_common_crs = crs_counts.most_common(1)[0][0]

    print(f"Most common CRS: {most_common_crs} (used by {crs_counts[most_common_crs]} of {len(orbit_list)} orbits)")

    # Function to calculate local incidence angle using Copernicus 30m DEM
    def calculate_local_incidence_angle(image):
        img_geom = image.geometry()

        # Use Copernicus 30m DEM with proper reprojection
        dem_collection = ee.ImageCollection('COPERNICUS/DEM/GLO30')
        dem = dem_collection.select('DEM').mosaic().clip(img_geom)

        # Reproject DEM to the most common CRS with the specified resolution
        projection = ee.Projection(most_common_crs).atScale(calc_resolution)
        dem = dem.reproject(projection)

        # 2.1.1 Radar geometry 
        theta_i = image.select('angle')
        phi_i = ee.Terrain.aspect(theta_i) \
            .reduceRegion(ee.Reducer.mean(), theta_i.get('system:footprint'), 1000) \
            .get('aspect')

        # 2.1.2 Terrain geometry
        alpha_s = ee.Terrain.slope(dem).select('slope')
        phi_s = ee.Terrain.aspect(dem).select('aspect')

        # 2.1.3 Model geometry
        # reduce to 3 angle
        phi_r = ee.Image.constant(phi_i).subtract(phi_s)

        # convert all to radians
        phi_rRad = phi_r.multiply(math.pi / 180)
        alpha_sRad = alpha_s.multiply(math.pi / 180)
        theta_iRad = theta_i.multiply(math.pi / 180)
        ninetyRad = ee.Image.constant(90).multiply(math.pi / 180)

        # slope steepness in range (eq. 2)
        alpha_r = (alpha_sRad.tan().multiply(phi_rRad.cos())).atan()

        # slope steepness in azimuth (eq 3)
        alpha_az = (alpha_sRad.tan().multiply(phi_rRad.sin())).atan()

        # local incidence angle (eq. 4)
        cos_theta_lia = (alpha_az.cos().multiply((theta_iRad.subtract(alpha_r)).cos()))

        # Ensure valid range for acos
        cos_theta_lia = cos_theta_lia.clamp(-1, 1)

        theta_lia = cos_theta_lia.acos()
        theta_liaDeg = theta_lia.multiply(180 / math.pi)

        return image.addBands(theta_liaDeg.rename('local_incidence_angle'))

    # Create list to store DataArrays for each orbit
    orbit_arrays = []

    # Create standard projection based on the most common CRS
    standard_projection = ee.Projection(most_common_crs).atScale(calc_resolution)

    # Process each orbit
    for orbit in orbit_list:
        # Get images for this orbit
        orbit_images = collection \
            .filter(ee.Filter.eq('relativeOrbitNumber_start', orbit)) \
            .sort('system:time_start', True) \
            .limit(3)

        if orbit_images.size().getInfo() > 0:
            # Calculate LIA for each image
            lia_images = orbit_images.map(calculate_local_incidence_angle)

            # Calculate median LIA
            median_lia = lia_images.select('local_incidence_angle').median()

            # Set properties on the median image
            timestamp = orbit_images.first().get('system:time_start')
            median_lia = median_lia.set({
                'relativeOrbitNumber_start': orbit,
                'system:time_start': timestamp
            })

            # Create a single-image collection for xee
            orbit_collection = ee.ImageCollection([median_lia])

            # Use xee to convert to xarray
            try:
                ds = xr.open_dataset(
                    orbit_collection, 
                    engine='ee',
                    geometry=bbox,
                    projection=standard_projection,
                    chunks={},
                )

                # Extract the DataArray
                da = ds['local_incidence_angle']

                # Remove the time dimension if present
                if 'time' in da.dims:
                    da = da.isel(time=0, drop=True)

                # Add orbit as a coordinate
                da = da.assign_coords({'sat:relative_orbit': orbit})

                # Check for NaN values
                nan_percentage = np.isnan(da.values).mean() * 100
                print(f"Orbit {orbit} - Shape: {da.shape}, NaN percentage: {nan_percentage:.1f}%")

                if nan_percentage < 100:  # Only keep arrays with some valid data
                    # Store in list
                    orbit_arrays.append(da)
                    print(f"Successfully processed orbit {orbit}")
                else:
                    print(f"Skipping orbit {orbit} - all values are NaN")

            except Exception as e:
                print(f"Error processing orbit {orbit}: {e}")

    if orbit_arrays:
        # Ensure all arrays have the same shape before concatenating
        shapes = [da.shape for da in orbit_arrays]
        if len(set(shapes)) > 1:
            print(f"Warning: Arrays have different shapes: {shapes}")

            # Take the shape with the most non-NaN values as template
            best_da_idx = np.argmax([~np.isnan(da.values).sum() for da in orbit_arrays])
            template_da = orbit_arrays[best_da_idx]

            for i in range(len(orbit_arrays)):
                if i != best_da_idx and orbit_arrays[i].shape != template_da.shape:
                    orbit_num = orbit_arrays[i].sat_relative_orbit.values[0]
                    print(f"Resampling orbit {orbit_num} to match template shape {template_da.shape}")
                    orbit_arrays[i] = orbit_arrays[i].reindex_like(template_da)

        # Combine all orbits into a single DataArray
        lia_da = xr.concat(orbit_arrays, dim='sat:relative_orbit')

        # Add attributes
        lia_da.attrs.update({
            'long_name': 'Sentinel-1 Local Incidence Angle',
            'units': 'degrees',
            'description': 'Local incidence angle calculated from Sentinel-1 data and Copernicus 30m DEM',
            'source': 'Sentinel-1 GRD'
        })

        lia_da = lia_da.transpose("sat:relative_orbit",'Y','X').rename({'X':'x', 'Y':'y'}).rio.set_spatial_dims(x_dim='x', y_dim='y')
        lia_da = lia_da.sortby('sat:relative_orbit')
        # should be in range from 0 to 90
        #lia_da = lia_da.where(lambda x: (x >= 0) & (x <= 90))

        # Reproject to match data grid exactly using bilinear interpolation
        if self.data is not None:
            # Get reference grid from first data variable
            ref_da = self.data[list(self.data.data_vars)[0]].isel(time=0)

            # Ensure lia_da has CRS information
            if not lia_da.rio.crs and ref_da.rio.crs:
                lia_da.rio.write_crs(ref_da.rio.crs, inplace=True)

            # Reproject to match data grid using bilinear interpolation, careful with nodata
            lia_da = lia_da.rio.reproject_match(
                ref_da,
                resampling=rio.warp.Resampling.bilinear,
                nodata=np.nan,
            )

            print("Local incidence angle data reprojected to match main data grid using bilinear resampling.")

        self._local_incidence_angle_data = lia_da
        print("Local incidence angle calculation complete. Access via the .local_incidence_angle_data attribute.")

        #return self._local_incidence_angle_data
    else:
        raise ValueError("No valid Sentinel-1 data found for the specified bounding box.")

`get_metadata(self)` ¶

The method to get the metadata.

Source code in easysnowdata/remote_sensing.py

def get_metadata(self):
    """
    The method to get the metadata.
    """
    stac_json = self.search.item_collection_as_dict()
    metadata_gdf = gpd.GeoDataFrame.from_features(stac_json, "epsg:4326")

    self.metadata = metadata_gdf
    print(f"Metadata retrieved. Access with the .metadata attribute.")

`linear_to_db(self)` ¶

The method to convert the linear power data to dB.

Source code in easysnowdata/remote_sensing.py

def linear_to_db(self):
    """
    The method to convert the linear power data to dB.
    """
    self.data = 10 * np.log10(self.data)
    self.data.attrs["units"] = "dB"
    print(
        f"Linear power units converted to dB. Convert back to linear power units using the .db_to_linear() method."
    )

`search_data(self)` ¶

The method to search the data.

Source code in easysnowdata/remote_sensing.py

def search_data(self):
    """
    The method to search the data.
    """

    # Choose the catalog URL based on catalog_choice
    if self.catalog_choice == "planetarycomputer":
        catalog_url = "https://planetarycomputer.microsoft.com/api/stac/v1"
        catalog = pystac_client.Client.open(
            catalog_url, modifier=planetary_computer.sign_inplace
        )
    # elif self.catalog_choice == "aws":
    #     catalog_url = indigo
    #     catalog = pystac_client.Client.open(catalog_url)
    else:
        raise ValueError(
            "Invalid catalog_choice. Choose either 'planetarycomputer' or <unimplemented>."
        )

    # Search for items within the specified bbox and date range
    search = catalog.search(
        collections=["sentinel-1-rtc"],
        bbox=self.bbox_gdf.total_bounds,
        datetime=(self.start_date, self.end_date),
    )
    # elif self.geobox:
    #     search = catalog.search(
    #         collections=["sentinel-1-rtc"],
    #         bbox=np.array(self.geobox.extent.boundingbox.to_crs('epsg:4326')),
    #         datetime=(self.start_date, self.end_date),
    #     )

    self.search = search
    print(f"Data searched. Access the returned seach with the .search attribute.")

`Sentinel2` ¶

A class to handle Sentinel-2 satellite data.

This class provides functionality to search, retrieve, and process Sentinel-2 satellite imagery. It supports various data operations including masking, scaling, and calculation of spectral indices.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.GeoDataFrame or tuple or Shapely Geometry`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	required
`start_date`	`str`	The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.	`'2014-01-01'`
`end_date`	`str`	The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.	`'2025-09-12'`
`catalog_choice`	`str`	The catalog choice for the data. Can choose between 'planetarycomputer' and 'earthsearch'. Default is 'planetarycomputer'.	`'planetarycomputer'`
`bands`	`list`	The bands to be used. Default is all bands. Must include SCL for data masking.	`None`
`resolution`	`str`	The resolution of the data. Defaults to native resolution, 10m.	`None`
`crs`	`str`	The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.	`None`
`remove_nodata`	`bool`	Whether to remove no data values. Default is True.	`True`
`harmonize_to_old`	`bool`	Whether to harmonize new data to the old baseline. Default is True.	`None`
`scale_data`	`bool`	Whether to scale the data. Default is True.	`True`
`groupby`	`str`	The groupby parameter for the data. Default is "solar_day".	`'solar_day'`

Attributes:

Name	Type	Description
`data`	`xarray.Dataset`	The loaded Sentinel-2 data.
`metadata`	`geopandas.GeoDataFrame`	Metadata for the retrieved Sentinel-2 scenes.
`rgb`	`xarray.DataArray`	RGB composite of the Sentinel-2 data.
`ndvi`	`xarray.DataArray`	Normalized Difference Vegetation Index (NDVI) calculated from the data.
`ndsi`	`xarray.DataArray`	Normalized Difference Snow Index (NDSI) calculated from the data.
`ndwi`	`xarray.DataArray`	Normalized Difference Water Index (NDWI) calculated from the data.
`evi`	`xarray.DataArray`	Enhanced Vegetation Index (EVI) calculated from the data.
`ndbi`	`xarray.DataArray`	Normalized Difference Built-up Index (NDBI) calculated from the data.
`Methods`	`None`
`-------`	`None`
`search_data()`	`None`	Searches for Sentinel-2 data based on the specified parameters.
`get_data()`	`None`	Retrieves the Sentinel-2 data based on the search results.
`get_metadata()`	`None`	Retrieves metadata for the Sentinel-2 scenes.
`remove_nodata_inplace()`	`None`	Removes no data values from the data.
`mask_data()`	`None`	Masks the data based on the Scene Classification Layer (SCL).
`harmonize_to_old_inplace()`	`None`	Harmonizes new Sentinel-2 data to the old baseline.
`scale_data_inplace()`	`None`	Scales the data to reflectance values.
`get_rgb()`	`None`	Retrieves the RGB composite of the data.
`get_ndvi()`	`None`	Calculates the Normalized Difference Vegetation Index (NDVI).
`get_ndsi()`	`None`	Calculates the Normalized Difference Snow Index (NDSI).
`get_ndwi()`	`None`	Calculates the Normalized Difference Water Index (NDWI).
`get_evi()`	`None`	Calculates the Enhanced Vegetation Index (EVI).
`get_ndbi()`	`None`	Calculates the Normalized Difference Built-up Index (NDBI).

Source code in easysnowdata/remote_sensing.py

class Sentinel2:
    """
    A class to handle Sentinel-2 satellite data.

    This class provides functionality to search, retrieve, and process Sentinel-2 satellite imagery.
    It supports various data operations including masking, scaling, and calculation of spectral indices.

    Parameters
    ----------
    bbox_input : geopandas.GeoDataFrame or tuple or Shapely Geometry
        GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
    start_date : str, optional
        The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.
    end_date : str, optional
        The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
    catalog_choice : str, optional
        The catalog choice for the data. Can choose between 'planetarycomputer' and 'earthsearch'. Default is 'planetarycomputer'.
    bands : list, optional
        The bands to be used. Default is all bands. Must include SCL for data masking.
    resolution : str, optional
        The resolution of the data. Defaults to native resolution, 10m.
    crs : str, optional
        The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.
    remove_nodata : bool, optional
        Whether to remove no data values. Default is True.
    harmonize_to_old : bool, optional
        Whether to harmonize new data to the old baseline. Default is True.
    scale_data : bool, optional
        Whether to scale the data. Default is True.
    groupby : str, optional
        The groupby parameter for the data. Default is "solar_day".

    Attributes
    ----------
    data : xarray.Dataset
        The loaded Sentinel-2 data.
    metadata : geopandas.GeoDataFrame
        Metadata for the retrieved Sentinel-2 scenes.
    rgb : xarray.DataArray
        RGB composite of the Sentinel-2 data.
    ndvi : xarray.DataArray
        Normalized Difference Vegetation Index (NDVI) calculated from the data.
    ndsi : xarray.DataArray
        Normalized Difference Snow Index (NDSI) calculated from the data.
    ndwi : xarray.DataArray
        Normalized Difference Water Index (NDWI) calculated from the data.
    evi : xarray.DataArray
        Enhanced Vegetation Index (EVI) calculated from the data.
    ndbi : xarray.DataArray
        Normalized Difference Built-up Index (NDBI) calculated from the data.

    Methods
    -------
    search_data()
        Searches for Sentinel-2 data based on the specified parameters.
    get_data()
        Retrieves the Sentinel-2 data based on the search results.
    get_metadata()
        Retrieves metadata for the Sentinel-2 scenes.
    remove_nodata_inplace()
        Removes no data values from the data.
    mask_data()
        Masks the data based on the Scene Classification Layer (SCL).
    harmonize_to_old_inplace()
        Harmonizes new Sentinel-2 data to the old baseline.
    scale_data_inplace()
        Scales the data to reflectance values.
    get_rgb()
        Retrieves the RGB composite of the data.
    get_ndvi()
        Calculates the Normalized Difference Vegetation Index (NDVI).
    get_ndsi()
        Calculates the Normalized Difference Snow Index (NDSI).
    get_ndwi()
        Calculates the Normalized Difference Water Index (NDWI).
    get_evi()
        Calculates the Enhanced Vegetation Index (EVI).
    get_ndbi()
        Calculates the Normalized Difference Built-up Index (NDBI).
    """

    def __init__(
        self,
        bbox_input,
        start_date="2014-01-01",
        end_date=today,
        catalog_choice="planetarycomputer",
        collection= "sentinel-2-l2a", # could also choose "sentinel-2-c1-l2a" once published to https://github.com/Element84/earth-search
        bands=None,
        resolution=None,
        crs=None,
        remove_nodata=True,
        harmonize_to_old=None,
        scale_data=True,
        groupby="solar_day",
    ):
        """
        The constructor for the Sentinel2 class.

        Parameters:
            bbox_input (geopandas.GeoDataFrame or tuple or Shapely Geometry): GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
            start_date (str): The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.
            end_date (str): The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
            catalog_choice (str): The catalog choice for the data. Can choose between 'planetarycomputer' and 'earthsearch', default is 'planetarycomputer'.
            bands (list): The bands to be used. Default is all bands. Must include SCL for data masking. Each band should be a string like 'B01', 'B02', etc.
            resolution (str): The resolution of the data. Defaults to native resolution, 10m.
            crs (str): The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.
            groupby (str): The groupby parameter for the data. Default is "solar_day".
        """
        # Initialize the attributes
        self.bbox_input = bbox_input
        self.start_date = start_date
        self.end_date = end_date
        self.catalog_choice = catalog_choice
        self.collection = collection
        self.bands = bands
        self.resolution = resolution
        self.crs = crs
        self.remove_nodata = remove_nodata
        self.harmonize_to_old = harmonize_to_old
        self.scale_data = scale_data
        self.groupby = groupby

        self.bbox_gdf = convert_bbox_to_geodataframe(self.bbox_input)

        if self.crs == None:
            self.crs = self.bbox_gdf.estimate_utm_crs()

        # Define the band information
        self.band_info = {
            "B01": {
                "name": "coastal",
                "description": "Coastal aerosol, 442.7 nm (S2A), 442.3 nm (S2B)",
                "resolution": "60m",
                "scale": "0.0001",
            },
            "B02": {
                "name": "blue",
                "description": "Blue, 492.4 nm (S2A), 492.1 nm (S2B)",
                "resolution": "10m",
                "scale": "0.0001",
            },
            "B03": {
                "name": "green",
                "description": "Green, 559.8 nm (S2A), 559.0 nm (S2B)",
                "resolution": "10m",
                "scale": "0.0001",
            },
            "B04": {
                "name": "red",
                "description": "Red, 664.6 nm (S2A), 665.0 nm (S2B)",
                "resolution": "10m",
                "scale": "0.0001",
            },
            "B05": {
                "name": "rededge",
                "description": "Vegetation red edge, 704.1 nm (S2A), 703.8 nm (S2B)",
                "resolution": "20m",
                "scale": "0.0001",
            },
            "B06": {
                "name": "rededge2",
                "description": "Vegetation red edge, 740.5 nm (S2A), 739.1 nm (S2B)",
                "resolution": "20m",
                "scale": "0.0001",
            },
            "B07": {
                "name": "rededge3",
                "description": "Vegetation red edge, 782.8 nm (S2A), 779.7 nm (S2B)",
                "resolution": "20m",
                "scale": "0.0001",
            },
            "B08": {
                "name": "nir",
                "description": "NIR, 832.8 nm (S2A), 833.0 nm (S2B)",
                "resolution": "10m",
                "scale": "0.0001",
            },
            "B8A": {
                "name": "nir08",
                "description": "Narrow NIR, 864.7 nm (S2A), 864.0 nm (S2B)",
                "resolution": "20m",
                "scale": "0.0001",
            },
            "B09": {
                "name": "nir09",
                "description": "Water vapour, 945.1 nm (S2A), 943.2 nm (S2B)",
                "resolution": "60m",
                "scale": "0.0001",
            },
            "B11": {
                "name": "swir16",
                "description": "SWIR, 1613.7 nm (S2A), 1610.4 nm (S2B)",
                "resolution": "20m",
                "scale": "0.0001",
            },
            "B12": {
                "name": "swir22",
                "description": "SWIR, 2202.4 nm (S2A), 2185.7 nm (S2B)",
                "resolution": "20m",
                "scale": "0.0001",
            },
            "AOT": {
                "name": "aot",
                "description": "Aerosol Optical Thickness map, based on Sen2Cor processor",
                "resolution": "10m",
                "scale": "1",
            },
            "SCL": {
                "name": "scl",
                "description": "Scene classification data, based on Sen2Cor processor",
                "resolution": "20m",
                "scale": "1",
            },
            "WVP": {
                "name": "wvp",
                "description": "Water Vapour map",
                "resolution": "10m",
                "scale": "1",
            },
            "visual": {
                "name": "visual",
                "description": "True color image",
                "resolution": "10m",
                "scale": "0.0001",
            },
        }

        # Define the scene classification information, classes here: https://custom-scripts.sentinel-hub.com/custom-scripts/sentinel-2/scene-classification/
        self.scl_class_info = {
            0: {"name": "No Data (Missing data)", "color": "#000000"},
            1: {"name": "Saturated or defective pixel", "color": "#ff0000"},
            2: {
                "name": "Topographic casted shadows",
                "color": "#2f2f2f",
            },  # (called 'Dark features/Shadows' for data before 2022-01-25)
            3: {"name": "Cloud shadows", "color": "#643200"},
            4: {"name": "Vegetation", "color": "#00a000"},
            5: {"name": "Not-vegetated", "color": "#ffe65a"},
            6: {"name": "Water", "color": "#0000ff"},
            7: {"name": "Unclassified", "color": "#808080"},
            8: {"name": "Cloud medium probability", "color": "#c0c0c0"},
            9: {"name": "Cloud high probability", "color": "#ffffff"},
            10: {"name": "Thin cirrus", "color": "#64c8ff"},
            11: {"name": "Snow or ice", "color": "#ff96ff"},
        }

        self.scl_cmap = plt.cm.colors.ListedColormap(
            [info["color"] for info in self.scl_class_info.values()]
        )

        # Initialize the data attributes
        self.search = None
        self.data = None
        self.metadata = None

        self.rgb = None
        self.rgba = None
        self.rgb_clahe = None
        self.rgb_percentile = None
        self.ndvi = None
        self.ndsi = None
        self.ndwi = None
        self.evi = None
        self.ndbi = None

        self.search_data()
        self.get_data()

        if self.remove_nodata:
            self.remove_nodata_inplace()

        if self.harmonize_to_old is None:
            if self.catalog_choice == "planetarycomputer":
                self.harmonize_to_old = True
            else:
                if self.collection == "sentinel-2-c1-l2a":
                    self.harmonize_to_old = True
                elif self.collection == "sentinel-2-l2a":
                    self.harmonize_to_old = False
                    print(f"Since {self.collection} on {self.catalog_choice} is used, harmonization step is not needed.")
                else:
                    raise ValueError(f"Unknown collection: {self.collection}")

        if self.harmonize_to_old:
            self.harmonize_to_old_inplace()

        if self.scale_data:
            self.scale_data_inplace()

        self.get_metadata()



        # Add the plot_scl method as an attribute to the SCL data variable
        if 'scl' in self.data.data_vars:
            self.data.scl.attrs['example_plot'] = self.plot_scl
            self.data.scl.attrs['class_info'] = self.scl_class_info
            self.data.scl.attrs['cmap'] = self.scl_cmap

    def plot_scl(self, scl_data, ax=None, figsize=None, col_wrap=5, legend_kwargs=None):

        if figsize is None:
            figsize = (8, 10) if scl_data.time.size == 1 else (12, 7)

        class_values = sorted(list(self.scl_class_info.keys()))
        bounds = [(class_values[i] + class_values[i + 1]) / 2 for i in range(len(class_values) - 1)]
        bounds = [class_values[0] - 0.5] + bounds + [class_values[-1] + 0.5]
        norm = matplotlib.colors.BoundaryNorm(bounds, self.scl_cmap.N)

        if scl_data.time.size == 1:
            # Single image plot
            if ax is None:
                f, ax = plt.subplots(figsize=figsize)
            else:
                f = ax.get_figure()

            im = scl_data.plot.imshow(ax=ax, cmap=self.scl_cmap, norm=norm, add_colorbar=False)
            ax.set_aspect("equal")

            local_time = pd.to_datetime(scl_data.time.values).tz_localize('UTC').tz_convert('America/Los_Angeles')
            ax.set_title(f"Sentinel-2 Scene Classification Layer (SCL)\n{local_time.strftime('%B %d, %Y')}\n{local_time.strftime('%I:%M%p')}")

        else:
            # Multiple images plot
            f = scl_data.plot.imshow(
                col='time',
                col_wrap=col_wrap,
                cmap=self.scl_cmap,
                norm=matplotlib.colors.BoundaryNorm(bounds, self.scl_cmap.N),
                add_colorbar=False,
                #figsize=figsize
            )

            for ax, time in zip(f.axs.flat, scl_data.time.values):
                local_time = pd.to_datetime(time).tz_localize('UTC').tz_convert('America/Los_Angeles')
                ax.set_title(f'{local_time.strftime("%B %d, %Y")}\n{local_time.strftime("%I:%M%p")}')
                ax.axis('off')
                ax.set_aspect('equal')

            f.fig.suptitle('Sentinel-2 SCL time series', fontsize=16, y=1.02)

        # Add legend
        legend_handles = []
        class_names = []
        for class_value, class_info in self.scl_class_info.items():
            legend_handles.append(
                plt.Rectangle((0, 0), 1, 1, facecolor=class_info["color"], edgecolor="black")
            )
            class_names.append(class_info["name"])

        legend_kwargs = legend_kwargs or {}

        if scl_data.time.size == 1:
            default_legend_kwargs = {
                "bbox_to_anchor": (0.5, -0.1),
                "loc": "upper center",
                "ncol": len(class_names) // 4,
                "frameon": False,
                "handlelength": 3.5,
                "handleheight": 5,
            }
        else:
            default_legend_kwargs = {
                "bbox_to_anchor": (0.5, -0.1),
                "loc": "upper center",
                "ncol": len(class_names) // 4,
                "frameon": False,
                "handlelength": 5,
                "handleheight": 6,
                "fontsize": 16,
            }

        legend_kwargs = {**default_legend_kwargs, **legend_kwargs}

        if scl_data.time.size == 1:
            ax.legend(legend_handles, class_names, **legend_kwargs)
            f.tight_layout(pad=1.5, w_pad=1.5, h_pad=1.5)
            f.dpi = 300
        else:
            f.fig.legend(legend_handles, class_names, **legend_kwargs)
            f.fig.tight_layout(pad=1.5, w_pad=1.5, h_pad=1.5)
            f.fig.dpi = 300

        return f, ax if scl_data.time.size == 1 else f

    def search_data(self):
        """
        The method to search the data.
        """

        # Choose the catalog URL based on catalog_choice
        if self.catalog_choice == "planetarycomputer":
            catalog_url = "https://planetarycomputer.microsoft.com/api/stac/v1"
            catalog = pystac_client.Client.open(
                catalog_url, modifier=planetary_computer.sign_inplace
            )
        elif self.catalog_choice == "earthsearch":
            os.environ["AWS_REGION"] = "us-west-2"
            os.environ["GDAL_DISABLE_READDIR_ON_OPEN"] = "EMPTY_DIR"
            os.environ["AWS_NO_SIGN_REQUEST"] = "YES"
            catalog_url = "https://earth-search.aws.element84.com/v1"
            catalog = pystac_client.Client.open(catalog_url)
        else:
            raise ValueError(
                "Invalid catalog_choice. Choose either 'planetarycomputer' or 'earthsearch'."
            )

        # Search for items within the specified bbox and date range
        search = catalog.search(
            collections=[self.collection],
            bbox=self.bbox_gdf.total_bounds,
            datetime=(self.start_date, self.end_date),
        )
        self.search = search
        print(f"Data searched. Access the returned seach with the .search attribute.")

    def get_data(self):
        """
        The method to get the data.
        """
        # Prepare the parameters for odc.stac.load
        load_params = {
            "items": self.search.items(),
            "bbox": self.bbox_gdf.total_bounds,
            "nodata": 0,
            "chunks": {},
            "crs": self.crs,
            "groupby": self.groupby,
            "stac_cfg": get_stac_cfg(sensor="sentinel-2-l2a"),
        }
        if self.bands:
            load_params["bands"] = self.bands
        else:
            load_params["bands"] = [info["name"] for info in self.band_info.values()]
        if self.resolution:
            load_params["resolution"] = self.resolution

        # Load the data lazily using odc.stac
        self.data = odc.stac.load(**load_params)

        self.data.attrs["band_info"] = self.band_info
        self.data.attrs["scl_class_info"] = self.scl_class_info

        if "scl" in self.data.variables:
            self.data.scl.attrs["scl_class_info"] = self.scl_class_info

        print(
            f"Data retrieved. Access with the .data attribute. Data CRS: {self.bbox_gdf.estimate_utm_crs().name}."
        )

    def get_metadata(self):
        """
        The method to get the metadata.
        """

        stac_json = self.search.item_collection_as_dict()
        metadata_gdf = gpd.GeoDataFrame.from_features(stac_json, "epsg:4326")
        if self.catalog_choice == "earthsearch":
            metadata_gdf["s2:mgrs_tile"] = (
                metadata_gdf["mgrs:utm_zone"].apply(lambda x: f"{x:02d}")
                + metadata_gdf["mgrs:latitude_band"]
                + metadata_gdf["mgrs:grid_square"]
            )

        self.metadata = metadata_gdf
        print(f"Metadata retrieved. Access with the .metadata attribute.")

    def remove_nodata_inplace(self):
        """
        The method to remove no data values from the data.
        """
        data_removed = False
        for band in self.data.data_vars:
            nodata_value = None
            nodata_value = self.data[band].attrs.get("nodata")
            if nodata_value is not None:
                #print(f"Removing nodata {nodata_value} values for band {band}...")
                self.data[band] = self.data[band].where(self.data[band] != nodata_value)
                data_removed = True
        if data_removed:
            print(f"Nodata values removed from the data. In doing so, all bands converted to float32. To turn this behavior off, set remove_nodata=False.")
        else:
            print(f"Tried to remove nodata values and set them to nans, but no nodata values found in the data.")

    def mask_data(
        self,
        remove_nodata=True,
        remove_saturated_defective=True,
        remove_topo_shadows=True,
        remove_cloud_shadows=True,
        remove_vegetation=False,
        remove_not_vegetated=False,
        remove_water=False,
        remove_unclassified=False,
        remove_medium_prob_clouds=True,
        remove_high_prob_clouds=True,
        remove_thin_cirrus_clouds=True,
        remove_snow_ice=False,
    ):
        """
        The method to mask the data.

        Parameters:
            remove_nodata (bool): Whether to remove no data pixels.
            remove_saturated_defective (bool): Whether to remove saturated or defective pixels.
            remove_topo_shadows (bool): Whether to remove topographic shadow pixels.
            remove_cloud_shadows (bool): Whether to remove cloud shadow pixels.
            remove_vegetation (bool): Whether to remove vegetation pixels.
            remove_not_vegetated (bool): Whether to remove not vegetated pixels.
            remove_water (bool): Whether to remove water pixels.
            remove_unclassified (bool): Whether to remove unclassified pixels.
            remove_medium_prob_clouds (bool): Whether to remove medium probability cloud pixels.
            remove_high_prob_clouds (bool): Whether to remove high probability cloud pixels.
            remove_thin_cirrus_clouds (bool): Whether to remove thin cirrus cloud pixels.
            remove_snow_ice (bool): Whether to remove snow or ice pixels.
        """

        # Mask the data based on the Scene Classification (SCL) band (see definitions above)
        mask_list = []
        if remove_nodata:
            mask_list.append(0)
        if remove_saturated_defective:
            mask_list.append(1)
        if remove_topo_shadows:
            mask_list.append(2)
        if remove_cloud_shadows:
            mask_list.append(3)
        if remove_vegetation:
            mask_list.append(4)
        if remove_not_vegetated:
            mask_list.append(5)
        if remove_water:
            mask_list.append(6)
        if remove_unclassified:
            mask_list.append(7)
        if remove_medium_prob_clouds:
            mask_list.append(8)
        if remove_high_prob_clouds:
            mask_list.append(9)
        if remove_thin_cirrus_clouds:
            mask_list.append(10)
        if remove_snow_ice:
            mask_list.append(11)

        print(f"Removed pixels with the following scene classification values:")
        for val in mask_list:
            print(self.scl_class_info[val]["name"])

        scl = self.data.scl
        mask = scl.where(scl.isin(mask_list) == False, 0)
        self.data = self.data.where(mask != 0)

    def harmonize_to_old_inplace(self):
        """
        Harmonize new Sentinel-2 data to the old baseline.
        Adapted from: https://planetarycomputer.microsoft.com/dataset/sentinel-2-l2a#Baseline-Change
        Returns
        -------
        harmonized: xarray.Dataset
            A Dataset with all values harmonized to the old
            processing baseline.
        """
        cutoff = datetime.datetime(2022, 1, 25)
        offset = 1000
        bands = [
            "B01",
            "B02",
            "B03",
            "B04",
            "B05",
            "B06",
            "B07",
            "B08",
            "B8A",
            "B09",
            "B11",
            "B12",
        ]
        bands = [self.data.band_info[band]["name"] for band in bands]
        old = self.data.sel(time=slice(None, cutoff))

        to_process = list(set(bands) & set(self.data.data_vars))
        new = self.data.sel(time=slice(cutoff, None))

        for band in to_process:
            if band in new.data_vars:
                new[band] = new[band].clip(offset) - offset

        self.data = xr.concat([old, new], dim="time")

        print(
            f"Data acquired after January 25th, 2022 harmonized to old baseline. To override this behavior, set harmonize_to_old=False."
        )

    def scale_data_inplace(self):
        """
        The method to scale the data.
        """
        for band in self.data.data_vars:
            scale_factor = self.data[band].attrs.get("scale")

            if scale_factor is None:
                scale_factor = next((info['scale'] for name, info in self.band_info.items() if info['name'] == band), None)

            scale_factor = int(scale_factor) if scale_factor == '1' else float(scale_factor)
            self.data[band] = self.data[band] * scale_factor

        print(
            f"Data scaled to float reflectance. To turn this behavior off, set scale_data=False."
        )

    def get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None}):
        """
        Retrieve RGB data with optional percentile-based contrast stretching and CLAHE enhancement.

        This method calculates and stores three versions of RGB data: raw, percentile-stretched, and CLAHE-enhanced.

        Parameters
        ----------
        percentile_kwargs : dict, optional
            Parameters for percentile-based contrast stretching. Keys are:
            - 'lower': Lower percentile for contrast stretching (default: 2)
            - 'upper': Upper percentile for contrast stretching (default: 98)
        clahe_kwargs : dict, optional
            Parameters for CLAHE enhancement. Keys are:
            - 'clip_limit': Clipping limit for CLAHE (default: 0.03)
            - 'nbins': Number of bins for CLAHE histogram (default: 256)
            - 'kernel_size': Size of kernel for CLAHE (default: None)

        Returns
        -------
        None
            The method stores results in instance attributes.

        Notes
        -----
        Results are stored in the following attributes:
        - .rgb: Raw RGB data
        - .rgb_percentile: Percentile-stretched RGB data
        - .rgb_clahe: CLAHE-enhanced RGB data
        """

        rgba_da = self.data.odc.to_rgba(bands=('red','green','blue'),vmin=0, vmax=1.7)
        self.rgba = rgba_da

        rgb_da = rgba_da.isel(band=slice(0, 3))  #.where(self.data.scl>=0, other=255) if we want to make no data white
        self.rgb = rgb_da

        self.rgb_percentile = self.get_rgb_percentile(**percentile_kwargs)
        self.rgb_clahe = self.get_rgb_clahe(**clahe_kwargs)

        print(f"RGB data retrieved.\nAccess with the following attributes:\n.rgb for raw RGB,\n.rgba for RGBA,\n.rgb_percentile for percentile RGB,\n.rgb_clahe for CLAHE RGB.\nYou can pass in percentile_kwargs and clahe_kwargs to adjust RGB calculations, check documentation for options.")

    def get_rgb_percentile(self, **percentile_kwargs):
        """
        Apply percentile-based contrast stretching to the RGB bands of the Sentinel-2 data.

        This function creates a new DataArray with the contrast-stretched RGB bands.

        Parameters
        ----------
        **kwargs : dict
            Keyword arguments for percentile calculation. Supported keys:
            - 'lower': Lower percentile for contrast stretching (default: 2)
            - 'upper': Upper percentile for contrast stretching (default: 98)

        Returns
        -------
        xarray.DataArray
            RGB data with percentile-based contrast stretching applied.

        Notes
        -----
        The function clips values to the range [0, 1] and masks areas where SCL < 0.
        """
        lower_percentile = percentile_kwargs.get('lower', 2)
        upper_percentile = percentile_kwargs.get('upper', 98)

        def stretch_percentile(da):
            p_low, p_high = np.nanpercentile(da.values, [lower_percentile, upper_percentile])
            return (da - p_low) / (p_high - p_low)

        rgb_da = self.rgb

        template = xr.zeros_like(rgb_da)
        rgb_percentile_da = xr.map_blocks(stretch_percentile, rgb_da, template=template)
        rgb_percentile_da = rgb_percentile_da.clip(0, 1).where(self.data.scl>=0)

        return rgb_percentile_da

    def get_rgb_clahe(self, **kwargs):
        """
        Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to RGB bands.

        This function creates a new DataArray with CLAHE applied to the RGB bands.

        Parameters
        ----------
        **kwargs : dict
            Keyword arguments for CLAHE. Supported keys:
            - 'clip_limit': Clipping limit for CLAHE (default: 0.03)
            - 'nbins': Number of bins for CLAHE histogram (default: 256)
            - 'kernel_size': Size of kernel for CLAHE (default: None)

        Returns
        -------
        xarray.DataArray
            RGB data with CLAHE enhancement applied.

        Notes
        -----
        The function applies CLAHE to each band separately and masks areas where SCL < 0.
        https://scikit-image.org/docs/stable/api/skimage.exposure.html#skimage.exposure.equalize_adapthist
        """

        # Custom wrapper to preserve xarray metadata
        def equalize_adapthist_da(da, **kwargs):
            # Apply the CLAHE function from skimage
            result = skimage.exposure.equalize_adapthist(da.values, **kwargs)
            #new_coords = {k: v for k, v in da.coords.items() if k != 'band' or len(v) == 3}

            # Convert the result back to a DataArray, preserving the original metadata
            return xr.DataArray(result, dims=da.dims, coords=da.coords, attrs=da.attrs)

        rgb_da = self.rgb

        #template = rgb_da.copy(data=np.empty_like(rgb_da).data)
        template = xr.zeros_like(rgb_da)
        rgb_clahe_da = xr.map_blocks(equalize_adapthist_da, rgb_da, template=template, kwargs=kwargs)
        rgb_clahe_da = rgb_clahe_da.where(self.data.scl>=0)

        return rgb_clahe_da

    # Indicies
    # find indicies Sentinel-2 indicies here: https://www.indexdatabase.de/db/is.php?sensor_id=96 and https://custom-scripts.sentinel-hub.com/custom-scripts/sentinel/sentinel-2/

    def get_ndvi(self):
        """
        The method to get the NDVI data.
        S2 NDVI definition: (B08 - B04) / (B08 + B04) [https://www.indexdatabase.de/db/i-single.php?id=58]

        Returns:
            ndvi_da (xarray.DataArray): The NDVI data.
        """
        red = self.data.red
        nir = self.data.nir
        ndvi_da = (nir - red) / (nir + red)

        self.ndvi = ndvi_da

        print(f"NDVI data calculated. Access with the .ndvi attribute.")

    def get_ndsi(self):
        """
        The method to get the NDSI data.
        S2 NDSI definition: (B03 - B11) / (B03 + B11)

        Returns:
            ndsi_da (xarray.DataArray): The NDSI data.
        """
        green = self.data.green
        swir16 = self.data.swir16
        ndsi_da = (green - swir16) / (green + swir16)

        self.ndsi = ndsi_da

        print(f"NDSI data calculated. Access with the .ndsi attribute.")

    def get_ndwi(self):
        """
        The method to get the NDWI data.
        S2 NDWI definition: (B03 - B08) / (B03 + B08)

        Returns:
            ndwi_da (xarray.DataArray): The NDWI data.
        """
        green = self.data.green
        nir = self.data.nir
        ndwi_da = (green - nir) / (green + nir)

        self.ndwi = ndwi_da

        print(f"NDWI data calculated. Access with the .ndwi attribute.")

    def get_evi(self):
        """
        The method to get the EVI data.
        S2 EVI definition: 2.5 * (B08 - B04) / (B08 + 6 * B04 - 7.5 * B02 + 1) [https://www.indexdatabase.de/db/si-single.php?sensor_id=96&rsindex_id=16]

        Returns:
            xarray.DataArray: The EVI data.
        """
        red = self.data.red
        nir = self.data.nir
        blue = self.data.blue

        evi_da = 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)

        self.evi = evi_da

        print(f"EVI data calculated. Access with the .evi attribute.")

    def get_ndbi(self):
        """
        The method to get the NDBI data.
        S2 NDBI definition: (B08 - B12) / (B08 + B12)

        Returns:
            xarray.DataArray: The NDBI data.
        """
        nir = self.data.nir
        swir22 = self.data.swir22
        ndbi_da = (nir - swir22) / (nir + swir22)

        self.ndbi = ndbi_da

        print(f"NDBI data calculated. Access with the .ndbi attribute.")

`init(self, bbox_input, start_date='2014-01-01', end_date='2025-09-12', catalog_choice='planetarycomputer', collection='sentinel-2-l2a', bands=None, resolution=None, crs=None, remove_nodata=True, harmonize_to_old=None, scale_data=True, groupby='solar_day')` `special` ¶

The constructor for the Sentinel2 class.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.GeoDataFrame or tuple or Shapely Geometry`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	required
`start_date`	`str`	The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.	`'2014-01-01'`
`end_date`	`str`	The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.	`'2025-09-12'`
`catalog_choice`	`str`	The catalog choice for the data. Can choose between 'planetarycomputer' and 'earthsearch', default is 'planetarycomputer'.	`'planetarycomputer'`
`bands`	`list`	The bands to be used. Default is all bands. Must include SCL for data masking. Each band should be a string like 'B01', 'B02', etc.	`None`
`resolution`	`str`	The resolution of the data. Defaults to native resolution, 10m.	`None`
`crs`	`str`	The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.	`None`
`groupby`	`str`	The groupby parameter for the data. Default is "solar_day".	`'solar_day'`

Source code in easysnowdata/remote_sensing.py

def __init__(
    self,
    bbox_input,
    start_date="2014-01-01",
    end_date=today,
    catalog_choice="planetarycomputer",
    collection= "sentinel-2-l2a", # could also choose "sentinel-2-c1-l2a" once published to https://github.com/Element84/earth-search
    bands=None,
    resolution=None,
    crs=None,
    remove_nodata=True,
    harmonize_to_old=None,
    scale_data=True,
    groupby="solar_day",
):
    """
    The constructor for the Sentinel2 class.

    Parameters:
        bbox_input (geopandas.GeoDataFrame or tuple or Shapely Geometry): GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
        start_date (str): The start date for the data in the format 'YYYY-MM-DD'. Default is '2014-01-01'.
        end_date (str): The end date for the data in the format 'YYYY-MM-DD'. Default is today's date.
        catalog_choice (str): The catalog choice for the data. Can choose between 'planetarycomputer' and 'earthsearch', default is 'planetarycomputer'.
        bands (list): The bands to be used. Default is all bands. Must include SCL for data masking. Each band should be a string like 'B01', 'B02', etc.
        resolution (str): The resolution of the data. Defaults to native resolution, 10m.
        crs (str): The coordinate reference system. This should be a string like 'EPSG:4326'. Default CRS is UTM zone estimated from bounding box.
        groupby (str): The groupby parameter for the data. Default is "solar_day".
    """
    # Initialize the attributes
    self.bbox_input = bbox_input
    self.start_date = start_date
    self.end_date = end_date
    self.catalog_choice = catalog_choice
    self.collection = collection
    self.bands = bands
    self.resolution = resolution
    self.crs = crs
    self.remove_nodata = remove_nodata
    self.harmonize_to_old = harmonize_to_old
    self.scale_data = scale_data
    self.groupby = groupby

    self.bbox_gdf = convert_bbox_to_geodataframe(self.bbox_input)

    if self.crs == None:
        self.crs = self.bbox_gdf.estimate_utm_crs()

    # Define the band information
    self.band_info = {
        "B01": {
            "name": "coastal",
            "description": "Coastal aerosol, 442.7 nm (S2A), 442.3 nm (S2B)",
            "resolution": "60m",
            "scale": "0.0001",
        },
        "B02": {
            "name": "blue",
            "description": "Blue, 492.4 nm (S2A), 492.1 nm (S2B)",
            "resolution": "10m",
            "scale": "0.0001",
        },
        "B03": {
            "name": "green",
            "description": "Green, 559.8 nm (S2A), 559.0 nm (S2B)",
            "resolution": "10m",
            "scale": "0.0001",
        },
        "B04": {
            "name": "red",
            "description": "Red, 664.6 nm (S2A), 665.0 nm (S2B)",
            "resolution": "10m",
            "scale": "0.0001",
        },
        "B05": {
            "name": "rededge",
            "description": "Vegetation red edge, 704.1 nm (S2A), 703.8 nm (S2B)",
            "resolution": "20m",
            "scale": "0.0001",
        },
        "B06": {
            "name": "rededge2",
            "description": "Vegetation red edge, 740.5 nm (S2A), 739.1 nm (S2B)",
            "resolution": "20m",
            "scale": "0.0001",
        },
        "B07": {
            "name": "rededge3",
            "description": "Vegetation red edge, 782.8 nm (S2A), 779.7 nm (S2B)",
            "resolution": "20m",
            "scale": "0.0001",
        },
        "B08": {
            "name": "nir",
            "description": "NIR, 832.8 nm (S2A), 833.0 nm (S2B)",
            "resolution": "10m",
            "scale": "0.0001",
        },
        "B8A": {
            "name": "nir08",
            "description": "Narrow NIR, 864.7 nm (S2A), 864.0 nm (S2B)",
            "resolution": "20m",
            "scale": "0.0001",
        },
        "B09": {
            "name": "nir09",
            "description": "Water vapour, 945.1 nm (S2A), 943.2 nm (S2B)",
            "resolution": "60m",
            "scale": "0.0001",
        },
        "B11": {
            "name": "swir16",
            "description": "SWIR, 1613.7 nm (S2A), 1610.4 nm (S2B)",
            "resolution": "20m",
            "scale": "0.0001",
        },
        "B12": {
            "name": "swir22",
            "description": "SWIR, 2202.4 nm (S2A), 2185.7 nm (S2B)",
            "resolution": "20m",
            "scale": "0.0001",
        },
        "AOT": {
            "name": "aot",
            "description": "Aerosol Optical Thickness map, based on Sen2Cor processor",
            "resolution": "10m",
            "scale": "1",
        },
        "SCL": {
            "name": "scl",
            "description": "Scene classification data, based on Sen2Cor processor",
            "resolution": "20m",
            "scale": "1",
        },
        "WVP": {
            "name": "wvp",
            "description": "Water Vapour map",
            "resolution": "10m",
            "scale": "1",
        },
        "visual": {
            "name": "visual",
            "description": "True color image",
            "resolution": "10m",
            "scale": "0.0001",
        },
    }

    # Define the scene classification information, classes here: https://custom-scripts.sentinel-hub.com/custom-scripts/sentinel-2/scene-classification/
    self.scl_class_info = {
        0: {"name": "No Data (Missing data)", "color": "#000000"},
        1: {"name": "Saturated or defective pixel", "color": "#ff0000"},
        2: {
            "name": "Topographic casted shadows",
            "color": "#2f2f2f",
        },  # (called 'Dark features/Shadows' for data before 2022-01-25)
        3: {"name": "Cloud shadows", "color": "#643200"},
        4: {"name": "Vegetation", "color": "#00a000"},
        5: {"name": "Not-vegetated", "color": "#ffe65a"},
        6: {"name": "Water", "color": "#0000ff"},
        7: {"name": "Unclassified", "color": "#808080"},
        8: {"name": "Cloud medium probability", "color": "#c0c0c0"},
        9: {"name": "Cloud high probability", "color": "#ffffff"},
        10: {"name": "Thin cirrus", "color": "#64c8ff"},
        11: {"name": "Snow or ice", "color": "#ff96ff"},
    }

    self.scl_cmap = plt.cm.colors.ListedColormap(
        [info["color"] for info in self.scl_class_info.values()]
    )

    # Initialize the data attributes
    self.search = None
    self.data = None
    self.metadata = None

    self.rgb = None
    self.rgba = None
    self.rgb_clahe = None
    self.rgb_percentile = None
    self.ndvi = None
    self.ndsi = None
    self.ndwi = None
    self.evi = None
    self.ndbi = None

    self.search_data()
    self.get_data()

    if self.remove_nodata:
        self.remove_nodata_inplace()

    if self.harmonize_to_old is None:
        if self.catalog_choice == "planetarycomputer":
            self.harmonize_to_old = True
        else:
            if self.collection == "sentinel-2-c1-l2a":
                self.harmonize_to_old = True
            elif self.collection == "sentinel-2-l2a":
                self.harmonize_to_old = False
                print(f"Since {self.collection} on {self.catalog_choice} is used, harmonization step is not needed.")
            else:
                raise ValueError(f"Unknown collection: {self.collection}")

    if self.harmonize_to_old:
        self.harmonize_to_old_inplace()

    if self.scale_data:
        self.scale_data_inplace()

    self.get_metadata()



    # Add the plot_scl method as an attribute to the SCL data variable
    if 'scl' in self.data.data_vars:
        self.data.scl.attrs['example_plot'] = self.plot_scl
        self.data.scl.attrs['class_info'] = self.scl_class_info
        self.data.scl.attrs['cmap'] = self.scl_cmap

`get_data(self)` ¶

The method to get the data.

Source code in easysnowdata/remote_sensing.py

def get_data(self):
    """
    The method to get the data.
    """
    # Prepare the parameters for odc.stac.load
    load_params = {
        "items": self.search.items(),
        "bbox": self.bbox_gdf.total_bounds,
        "nodata": 0,
        "chunks": {},
        "crs": self.crs,
        "groupby": self.groupby,
        "stac_cfg": get_stac_cfg(sensor="sentinel-2-l2a"),
    }
    if self.bands:
        load_params["bands"] = self.bands
    else:
        load_params["bands"] = [info["name"] for info in self.band_info.values()]
    if self.resolution:
        load_params["resolution"] = self.resolution

    # Load the data lazily using odc.stac
    self.data = odc.stac.load(**load_params)

    self.data.attrs["band_info"] = self.band_info
    self.data.attrs["scl_class_info"] = self.scl_class_info

    if "scl" in self.data.variables:
        self.data.scl.attrs["scl_class_info"] = self.scl_class_info

    print(
        f"Data retrieved. Access with the .data attribute. Data CRS: {self.bbox_gdf.estimate_utm_crs().name}."
    )

`get_evi(self)` ¶

The method to get the EVI data.

S2 EVI definition: 2.5 * (B08 - B04) / (B08 + 6 * B04 - 7.5 * B02 + 1) [https://www.indexdatabase.de/db/si-single.php?sensor_id=96&rsindex_id=16]

Returns:

Type	Description
`xarray.DataArray`	The EVI data.

Source code in easysnowdata/remote_sensing.py

def get_evi(self):
    """
    The method to get the EVI data.
    S2 EVI definition: 2.5 * (B08 - B04) / (B08 + 6 * B04 - 7.5 * B02 + 1) [https://www.indexdatabase.de/db/si-single.php?sensor_id=96&rsindex_id=16]

    Returns:
        xarray.DataArray: The EVI data.
    """
    red = self.data.red
    nir = self.data.nir
    blue = self.data.blue

    evi_da = 2.5 * (nir - red) / (nir + 6 * red - 7.5 * blue + 1)

    self.evi = evi_da

    print(f"EVI data calculated. Access with the .evi attribute.")

`get_metadata(self)` ¶

The method to get the metadata.

Source code in easysnowdata/remote_sensing.py

def get_metadata(self):
    """
    The method to get the metadata.
    """

    stac_json = self.search.item_collection_as_dict()
    metadata_gdf = gpd.GeoDataFrame.from_features(stac_json, "epsg:4326")
    if self.catalog_choice == "earthsearch":
        metadata_gdf["s2:mgrs_tile"] = (
            metadata_gdf["mgrs:utm_zone"].apply(lambda x: f"{x:02d}")
            + metadata_gdf["mgrs:latitude_band"]
            + metadata_gdf["mgrs:grid_square"]
        )

    self.metadata = metadata_gdf
    print(f"Metadata retrieved. Access with the .metadata attribute.")

`get_ndbi(self)` ¶

The method to get the NDBI data.

S2 NDBI definition: (B08 - B12) / (B08 + B12)

Returns:

Type	Description
`xarray.DataArray`	The NDBI data.

Source code in easysnowdata/remote_sensing.py

def get_ndbi(self):
    """
    The method to get the NDBI data.
    S2 NDBI definition: (B08 - B12) / (B08 + B12)

    Returns:
        xarray.DataArray: The NDBI data.
    """
    nir = self.data.nir
    swir22 = self.data.swir22
    ndbi_da = (nir - swir22) / (nir + swir22)

    self.ndbi = ndbi_da

    print(f"NDBI data calculated. Access with the .ndbi attribute.")

`get_ndsi(self)` ¶

The method to get the NDSI data.

S2 NDSI definition: (B03 - B11) / (B03 + B11)

Returns:

Type	Description
	ndsi_da (xarray.DataArray): The NDSI data.

Source code in easysnowdata/remote_sensing.py

def get_ndsi(self):
    """
    The method to get the NDSI data.
    S2 NDSI definition: (B03 - B11) / (B03 + B11)

    Returns:
        ndsi_da (xarray.DataArray): The NDSI data.
    """
    green = self.data.green
    swir16 = self.data.swir16
    ndsi_da = (green - swir16) / (green + swir16)

    self.ndsi = ndsi_da

    print(f"NDSI data calculated. Access with the .ndsi attribute.")

`get_ndvi(self)` ¶

The method to get the NDVI data.

S2 NDVI definition: (B08 - B04) / (B08 + B04) [https://www.indexdatabase.de/db/i-single.php?id=58]

Returns:

Type	Description
	ndvi_da (xarray.DataArray): The NDVI data.

Source code in easysnowdata/remote_sensing.py

def get_ndvi(self):
    """
    The method to get the NDVI data.
    S2 NDVI definition: (B08 - B04) / (B08 + B04) [https://www.indexdatabase.de/db/i-single.php?id=58]

    Returns:
        ndvi_da (xarray.DataArray): The NDVI data.
    """
    red = self.data.red
    nir = self.data.nir
    ndvi_da = (nir - red) / (nir + red)

    self.ndvi = ndvi_da

    print(f"NDVI data calculated. Access with the .ndvi attribute.")

`get_ndwi(self)` ¶

The method to get the NDWI data.

S2 NDWI definition: (B03 - B08) / (B03 + B08)

Returns:

Type	Description
	ndwi_da (xarray.DataArray): The NDWI data.

Source code in easysnowdata/remote_sensing.py

def get_ndwi(self):
    """
    The method to get the NDWI data.
    S2 NDWI definition: (B03 - B08) / (B03 + B08)

    Returns:
        ndwi_da (xarray.DataArray): The NDWI data.
    """
    green = self.data.green
    nir = self.data.nir
    ndwi_da = (green - nir) / (green + nir)

    self.ndwi = ndwi_da

    print(f"NDWI data calculated. Access with the .ndwi attribute.")

`get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None})` ¶

Retrieve RGB data with optional percentile-based contrast stretching and CLAHE enhancement.

This method calculates and stores three versions of RGB data: raw, percentile-stretched, and CLAHE-enhanced.

Parameters:

Name	Type	Description	Default
`percentile_kwargs`	`dict`	Parameters for percentile-based contrast stretching. Keys are: - 'lower': Lower percentile for contrast stretching (default: 2) - 'upper': Upper percentile for contrast stretching (default: 98)	`{'lower': 2, 'upper': 98}`
`clahe_kwargs`	`dict`	Parameters for CLAHE enhancement. Keys are: - 'clip_limit': Clipping limit for CLAHE (default: 0.03) - 'nbins': Number of bins for CLAHE histogram (default: 256) - 'kernel_size': Size of kernel for CLAHE (default: None)	`{'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None}`

Returns:

Type	Description
`None`	The method stores results in instance attributes.

Source code in easysnowdata/remote_sensing.py

def get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None}):
    """
    Retrieve RGB data with optional percentile-based contrast stretching and CLAHE enhancement.

    This method calculates and stores three versions of RGB data: raw, percentile-stretched, and CLAHE-enhanced.

    Parameters
    ----------
    percentile_kwargs : dict, optional
        Parameters for percentile-based contrast stretching. Keys are:
        - 'lower': Lower percentile for contrast stretching (default: 2)
        - 'upper': Upper percentile for contrast stretching (default: 98)
    clahe_kwargs : dict, optional
        Parameters for CLAHE enhancement. Keys are:
        - 'clip_limit': Clipping limit for CLAHE (default: 0.03)
        - 'nbins': Number of bins for CLAHE histogram (default: 256)
        - 'kernel_size': Size of kernel for CLAHE (default: None)

    Returns
    -------
    None
        The method stores results in instance attributes.

    Notes
    -----
    Results are stored in the following attributes:
    - .rgb: Raw RGB data
    - .rgb_percentile: Percentile-stretched RGB data
    - .rgb_clahe: CLAHE-enhanced RGB data
    """

    rgba_da = self.data.odc.to_rgba(bands=('red','green','blue'),vmin=0, vmax=1.7)
    self.rgba = rgba_da

    rgb_da = rgba_da.isel(band=slice(0, 3))  #.where(self.data.scl>=0, other=255) if we want to make no data white
    self.rgb = rgb_da

    self.rgb_percentile = self.get_rgb_percentile(**percentile_kwargs)
    self.rgb_clahe = self.get_rgb_clahe(**clahe_kwargs)

    print(f"RGB data retrieved.\nAccess with the following attributes:\n.rgb for raw RGB,\n.rgba for RGBA,\n.rgb_percentile for percentile RGB,\n.rgb_clahe for CLAHE RGB.\nYou can pass in percentile_kwargs and clahe_kwargs to adjust RGB calculations, check documentation for options.")

`get_rgb_clahe(self, **kwargs)` ¶

Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to RGB bands.

This function creates a new DataArray with CLAHE applied to the RGB bands.

Parameters:

Name	Type	Description	Default
`**kwargs`	`dict`	Keyword arguments for CLAHE. Supported keys: - 'clip_limit': Clipping limit for CLAHE (default: 0.03) - 'nbins': Number of bins for CLAHE histogram (default: 256) - 'kernel_size': Size of kernel for CLAHE (default: None)	`{}`

Returns:

Type	Description
`xarray.DataArray`	RGB data with CLAHE enhancement applied.

Source code in easysnowdata/remote_sensing.py

def get_rgb_clahe(self, **kwargs):
    """
    Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) to RGB bands.

    This function creates a new DataArray with CLAHE applied to the RGB bands.

    Parameters
    ----------
    **kwargs : dict
        Keyword arguments for CLAHE. Supported keys:
        - 'clip_limit': Clipping limit for CLAHE (default: 0.03)
        - 'nbins': Number of bins for CLAHE histogram (default: 256)
        - 'kernel_size': Size of kernel for CLAHE (default: None)

    Returns
    -------
    xarray.DataArray
        RGB data with CLAHE enhancement applied.

    Notes
    -----
    The function applies CLAHE to each band separately and masks areas where SCL < 0.
    https://scikit-image.org/docs/stable/api/skimage.exposure.html#skimage.exposure.equalize_adapthist
    """

    # Custom wrapper to preserve xarray metadata
    def equalize_adapthist_da(da, **kwargs):
        # Apply the CLAHE function from skimage
        result = skimage.exposure.equalize_adapthist(da.values, **kwargs)
        #new_coords = {k: v for k, v in da.coords.items() if k != 'band' or len(v) == 3}

        # Convert the result back to a DataArray, preserving the original metadata
        return xr.DataArray(result, dims=da.dims, coords=da.coords, attrs=da.attrs)

    rgb_da = self.rgb

    #template = rgb_da.copy(data=np.empty_like(rgb_da).data)
    template = xr.zeros_like(rgb_da)
    rgb_clahe_da = xr.map_blocks(equalize_adapthist_da, rgb_da, template=template, kwargs=kwargs)
    rgb_clahe_da = rgb_clahe_da.where(self.data.scl>=0)

    return rgb_clahe_da

`get_rgb_percentile(self, **percentile_kwargs)` ¶

Apply percentile-based contrast stretching to the RGB bands of the Sentinel-2 data.

This function creates a new DataArray with the contrast-stretched RGB bands.

Parameters:

Name	Type	Description	Default
`**kwargs`	`dict`	Keyword arguments for percentile calculation. Supported keys: - 'lower': Lower percentile for contrast stretching (default: 2) - 'upper': Upper percentile for contrast stretching (default: 98)	required

Returns:

Type	Description
`xarray.DataArray`	RGB data with percentile-based contrast stretching applied.

Source code in easysnowdata/remote_sensing.py

def get_rgb_percentile(self, **percentile_kwargs):
    """
    Apply percentile-based contrast stretching to the RGB bands of the Sentinel-2 data.

    This function creates a new DataArray with the contrast-stretched RGB bands.

    Parameters
    ----------
    **kwargs : dict
        Keyword arguments for percentile calculation. Supported keys:
        - 'lower': Lower percentile for contrast stretching (default: 2)
        - 'upper': Upper percentile for contrast stretching (default: 98)

    Returns
    -------
    xarray.DataArray
        RGB data with percentile-based contrast stretching applied.

    Notes
    -----
    The function clips values to the range [0, 1] and masks areas where SCL < 0.
    """
    lower_percentile = percentile_kwargs.get('lower', 2)
    upper_percentile = percentile_kwargs.get('upper', 98)

    def stretch_percentile(da):
        p_low, p_high = np.nanpercentile(da.values, [lower_percentile, upper_percentile])
        return (da - p_low) / (p_high - p_low)

    rgb_da = self.rgb

    template = xr.zeros_like(rgb_da)
    rgb_percentile_da = xr.map_blocks(stretch_percentile, rgb_da, template=template)
    rgb_percentile_da = rgb_percentile_da.clip(0, 1).where(self.data.scl>=0)

    return rgb_percentile_da

`harmonize_to_old_inplace(self)` ¶

Harmonize new Sentinel-2 data to the old baseline.

Adapted from: https://planetarycomputer.microsoft.com/dataset/sentinel-2-l2a#Baseline-Change

Returns:

Type	Description
`xarray.Dataset`	A Dataset with all values harmonized to the old processing baseline.

Source code in easysnowdata/remote_sensing.py

def harmonize_to_old_inplace(self):
    """
    Harmonize new Sentinel-2 data to the old baseline.
    Adapted from: https://planetarycomputer.microsoft.com/dataset/sentinel-2-l2a#Baseline-Change
    Returns
    -------
    harmonized: xarray.Dataset
        A Dataset with all values harmonized to the old
        processing baseline.
    """
    cutoff = datetime.datetime(2022, 1, 25)
    offset = 1000
    bands = [
        "B01",
        "B02",
        "B03",
        "B04",
        "B05",
        "B06",
        "B07",
        "B08",
        "B8A",
        "B09",
        "B11",
        "B12",
    ]
    bands = [self.data.band_info[band]["name"] for band in bands]
    old = self.data.sel(time=slice(None, cutoff))

    to_process = list(set(bands) & set(self.data.data_vars))
    new = self.data.sel(time=slice(cutoff, None))

    for band in to_process:
        if band in new.data_vars:
            new[band] = new[band].clip(offset) - offset

    self.data = xr.concat([old, new], dim="time")

    print(
        f"Data acquired after January 25th, 2022 harmonized to old baseline. To override this behavior, set harmonize_to_old=False."
    )

`mask_data(self, remove_nodata=True, remove_saturated_defective=True, remove_topo_shadows=True, remove_cloud_shadows=True, remove_vegetation=False, remove_not_vegetated=False, remove_water=False, remove_unclassified=False, remove_medium_prob_clouds=True, remove_high_prob_clouds=True, remove_thin_cirrus_clouds=True, remove_snow_ice=False)` ¶

The method to mask the data.

Parameters:

Name	Type	Description	Default
`remove_nodata`	`bool`	Whether to remove no data pixels.	`True`
`remove_saturated_defective`	`bool`	Whether to remove saturated or defective pixels.	`True`
`remove_topo_shadows`	`bool`	Whether to remove topographic shadow pixels.	`True`
`remove_cloud_shadows`	`bool`	Whether to remove cloud shadow pixels.	`True`
`remove_vegetation`	`bool`	Whether to remove vegetation pixels.	`False`
`remove_not_vegetated`	`bool`	Whether to remove not vegetated pixels.	`False`
`remove_water`	`bool`	Whether to remove water pixels.	`False`
`remove_unclassified`	`bool`	Whether to remove unclassified pixels.	`False`
`remove_medium_prob_clouds`	`bool`	Whether to remove medium probability cloud pixels.	`True`
`remove_high_prob_clouds`	`bool`	Whether to remove high probability cloud pixels.	`True`
`remove_thin_cirrus_clouds`	`bool`	Whether to remove thin cirrus cloud pixels.	`True`
`remove_snow_ice`	`bool`	Whether to remove snow or ice pixels.	`False`

Source code in easysnowdata/remote_sensing.py

def mask_data(
    self,
    remove_nodata=True,
    remove_saturated_defective=True,
    remove_topo_shadows=True,
    remove_cloud_shadows=True,
    remove_vegetation=False,
    remove_not_vegetated=False,
    remove_water=False,
    remove_unclassified=False,
    remove_medium_prob_clouds=True,
    remove_high_prob_clouds=True,
    remove_thin_cirrus_clouds=True,
    remove_snow_ice=False,
):
    """
    The method to mask the data.

    Parameters:
        remove_nodata (bool): Whether to remove no data pixels.
        remove_saturated_defective (bool): Whether to remove saturated or defective pixels.
        remove_topo_shadows (bool): Whether to remove topographic shadow pixels.
        remove_cloud_shadows (bool): Whether to remove cloud shadow pixels.
        remove_vegetation (bool): Whether to remove vegetation pixels.
        remove_not_vegetated (bool): Whether to remove not vegetated pixels.
        remove_water (bool): Whether to remove water pixels.
        remove_unclassified (bool): Whether to remove unclassified pixels.
        remove_medium_prob_clouds (bool): Whether to remove medium probability cloud pixels.
        remove_high_prob_clouds (bool): Whether to remove high probability cloud pixels.
        remove_thin_cirrus_clouds (bool): Whether to remove thin cirrus cloud pixels.
        remove_snow_ice (bool): Whether to remove snow or ice pixels.
    """

    # Mask the data based on the Scene Classification (SCL) band (see definitions above)
    mask_list = []
    if remove_nodata:
        mask_list.append(0)
    if remove_saturated_defective:
        mask_list.append(1)
    if remove_topo_shadows:
        mask_list.append(2)
    if remove_cloud_shadows:
        mask_list.append(3)
    if remove_vegetation:
        mask_list.append(4)
    if remove_not_vegetated:
        mask_list.append(5)
    if remove_water:
        mask_list.append(6)
    if remove_unclassified:
        mask_list.append(7)
    if remove_medium_prob_clouds:
        mask_list.append(8)
    if remove_high_prob_clouds:
        mask_list.append(9)
    if remove_thin_cirrus_clouds:
        mask_list.append(10)
    if remove_snow_ice:
        mask_list.append(11)

    print(f"Removed pixels with the following scene classification values:")
    for val in mask_list:
        print(self.scl_class_info[val]["name"])

    scl = self.data.scl
    mask = scl.where(scl.isin(mask_list) == False, 0)
    self.data = self.data.where(mask != 0)

`remove_nodata_inplace(self)` ¶

The method to remove no data values from the data.

Source code in easysnowdata/remote_sensing.py

def remove_nodata_inplace(self):
    """
    The method to remove no data values from the data.
    """
    data_removed = False
    for band in self.data.data_vars:
        nodata_value = None
        nodata_value = self.data[band].attrs.get("nodata")
        if nodata_value is not None:
            #print(f"Removing nodata {nodata_value} values for band {band}...")
            self.data[band] = self.data[band].where(self.data[band] != nodata_value)
            data_removed = True
    if data_removed:
        print(f"Nodata values removed from the data. In doing so, all bands converted to float32. To turn this behavior off, set remove_nodata=False.")
    else:
        print(f"Tried to remove nodata values and set them to nans, but no nodata values found in the data.")

`scale_data_inplace(self)` ¶

The method to scale the data.

Source code in easysnowdata/remote_sensing.py

def scale_data_inplace(self):
    """
    The method to scale the data.
    """
    for band in self.data.data_vars:
        scale_factor = self.data[band].attrs.get("scale")

        if scale_factor is None:
            scale_factor = next((info['scale'] for name, info in self.band_info.items() if info['name'] == band), None)

        scale_factor = int(scale_factor) if scale_factor == '1' else float(scale_factor)
        self.data[band] = self.data[band] * scale_factor

    print(
        f"Data scaled to float reflectance. To turn this behavior off, set scale_data=False."
    )

`search_data(self)` ¶

The method to search the data.

Source code in easysnowdata/remote_sensing.py

def search_data(self):
    """
    The method to search the data.
    """

    # Choose the catalog URL based on catalog_choice
    if self.catalog_choice == "planetarycomputer":
        catalog_url = "https://planetarycomputer.microsoft.com/api/stac/v1"
        catalog = pystac_client.Client.open(
            catalog_url, modifier=planetary_computer.sign_inplace
        )
    elif self.catalog_choice == "earthsearch":
        os.environ["AWS_REGION"] = "us-west-2"
        os.environ["GDAL_DISABLE_READDIR_ON_OPEN"] = "EMPTY_DIR"
        os.environ["AWS_NO_SIGN_REQUEST"] = "YES"
        catalog_url = "https://earth-search.aws.element84.com/v1"
        catalog = pystac_client.Client.open(catalog_url)
    else:
        raise ValueError(
            "Invalid catalog_choice. Choose either 'planetarycomputer' or 'earthsearch'."
        )

    # Search for items within the specified bbox and date range
    search = catalog.search(
        collections=[self.collection],
        bbox=self.bbox_gdf.total_bounds,
        datetime=(self.start_date, self.end_date),
    )
    self.search = search
    print(f"Data searched. Access the returned seach with the .search attribute.")

`authenticate_all()` ¶

Authenticates with all potential data providers.

This function authenticates with NASA EarthData, Planetary Computer, and Earth Engine. It prints the authentication status for each provider.

Source code in easysnowdata/remote_sensing.py

def authenticate_all():
    """
    Authenticates with all potential data providers.

    This function authenticates with NASA EarthData, Planetary Computer, and Earth Engine.
    It prints the authentication status for each provider.
    """
    print("Authenticating for all potential data providers...")

    print("Authenticating for NASA EarthData...")
    earthaccess.login(persist=True)
    # print("Authenticating for Planetary Computer...")
    # planetary_computer.set_subscription_key()
    print("Authenticating for Earth Engine...")
    ee.Authenticate()

`get_esa_worldcover(bbox_input=None, version='v200', mask_nodata=False)` ¶

Fetches 10m ESA WorldCover global land cover data (2020 v100 or 2021 v200) for a given bounding box.

Description: The discrete classification maps provide 11 classes defined using the Land Cover Classification System (LCCS) developed by the United Nations (UN) Food and Agriculture Organization (FAO).

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.geodataframe.GeoDataFrame \| tuple \| shapely.geometry.base.BaseGeometry \| None`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	`None`
`version`	`str`	Version of the WorldCover data. The two versions are v100 (2020) and v200 (2021). Default is 'v200'.	`'v200'`
`mask_nodata`	`bool`	Whether to mask no data values. Default is False. If False: (dtype=uint8, rio.nodata=0, rio.encoded_nodata=None) If True: (dtype=float32, rio.nodata=nan, rio.encoded_nodata=0)	`False`

Returns:

Type	Description
`DataArray`	WorldCover DataArray with class information in attributes.

Source code in easysnowdata/remote_sensing.py

def get_esa_worldcover(
    bbox_input: gpd.GeoDataFrame | tuple | shapely.geometry.base.BaseGeometry | None = None,
    version: str = "v200", mask_nodata: bool = False,
) -> xr.DataArray:
    """
    Fetches 10m ESA WorldCover global land cover data (2020 v100 or 2021 v200) for a given bounding box.

    Description:
    The discrete classification maps provide 11 classes defined using the Land Cover Classification System (LCCS)
    developed by the United Nations (UN) Food and Agriculture Organization (FAO).

    Parameters
    ----------
    bbox_input : geopandas.GeoDataFrame or tuple or Shapely Geometry
        GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
    version : str, optional
        Version of the WorldCover data. The two versions are v100 (2020) and v200 (2021). Default is 'v200'.
    mask_nodata : bool, optional
        Whether to mask no data values. Default is False.
        If False: (dtype=uint8, rio.nodata=0, rio.encoded_nodata=None)
        If True: (dtype=float32, rio.nodata=nan, rio.encoded_nodata=0)

    Returns
    -------
    xarray.DataArray
        WorldCover DataArray with class information in attributes.

    Examples
    --------
    >>> import geopandas as gpd
    >>> import easysnowdata
    >>> 
    >>> # Define a bounding box for Mount Rainier
    >>> bbox = (-121.94, 46.72, -121.54, 46.99)
    >>> 
    >>> # Fetch WorldCover data for the area
    >>> worldcover_da = easysnowdata.remote_sensing.get_esa_worldcover(bbox)
    >>> 
    >>> # Plot the data using the example plot function
    >>> f, ax = worldcover_da.attrs['example_plot'](worldcover_da)

    Notes
    -----
    Data citation:
    Zanaga, D., Van De Kerchove, R., De Keersmaecker, W., Souverijns, N., Brockmann, C., Quast, R., Wevers, J., Grosu, A.,
    Paccini, A., Vergnaud, S., Cartus, O., Santoro, M., Fritz, S., Georgieva, I., Lesiv, M., Carter, S., Herold, M., Li, Linlin,
    Tsendbazar, N.E., Ramoino, F., Arino, O. (2021). ESA WorldCover 10 m 2020 v100. doi:10.5281/zenodo.5571936.
    """

    def get_class_info():
        classes = {
            10: {"name": "Tree cover", "color": "#006400"},
            20: {"name": "Shrubland", "color": "#FFBB22"},
            30: {"name": "Grassland", "color": "#FFFF4C"},
            40: {"name": "Cropland", "color": "#F096FF"},
            50: {"name": "Built-up", "color": "#FA0000"},
            60: {"name": "Bare / sparse vegetation", "color": "#B4B4B4"},
            70: {"name": "Snow and ice", "color": "#F0F0F0"},
            80: {"name": "Permanent water bodies", "color": "#0064C8"},
            90: {"name": "Herbaceous wetland", "color": "#0096A0"},
            95: {"name": "Mangroves", "color": "#00CF75"},
            100: {"name": "Moss and lichen", "color": "#FAE6A0"},
        }
        return classes

    def get_class_cmap(classes):
        cmap = plt.cm.colors.ListedColormap(
            [classes[key]["color"] for key in classes.keys()]
        )
        return cmap

    def plot_classes(self, ax=None, figsize=(8, 10), legend_kwargs=None):
        if ax is None:
            f, ax = plt.subplots(figsize=figsize)
        else:
            f = ax.get_figure()

        class_values = sorted(list(self.attrs["class_info"].keys()))
        bounds = [
            (class_values[i] + class_values[i + 1]) / 2 for i in range(len(class_values) - 1)
        ]
        bounds = [class_values[0] - 0.5] + bounds + [class_values[-1] + 0.5]
        norm = matplotlib.colors.BoundaryNorm(bounds, self.attrs["cmap"].N)

        im = self.plot.imshow(ax=ax, cmap=self.attrs["cmap"], norm=norm, add_colorbar=False)
        #ax.set_aspect("equal")

        legend_handles = []
        class_names = []
        for class_value, class_info in self.attrs["class_info"].items():
            legend_handles.append(
                plt.Rectangle((0, 0), 1, 1, facecolor=class_info["color"], edgecolor="black")
            )
            class_names.append(class_info["name"])

        legend_kwargs = legend_kwargs or {}
        default_legend_kwargs = {
            "bbox_to_anchor": (0.5, -0.1),
            "loc": "upper center",
            "ncol": len(class_names) // 3,
            "frameon": False,
            "handlelength": 3.5,
            "handleheight": 5,
        }
        legend_kwargs = {**default_legend_kwargs, **legend_kwargs}

        ax.legend(legend_handles, class_names, **legend_kwargs)

        ax.set_xlabel("Longitude")
        ax.set_ylabel("Latitude")
        ax.set_title(f"ESA WorldCover\n{version} ({version_year})")
        f.tight_layout(pad=5.5, w_pad=5.5, h_pad=1.5)
        f.dpi = 300

        return f, ax

    # Convert the input to a GeoDataFrame if it's not already one
    bbox_gdf = convert_bbox_to_geodataframe(bbox_input)

    if version == "v100":
        version_year = "2020"
    elif version == "v200":
        version_year = "2021"
    else:
        raise ValueError("Incorrect version number. Please provide 'v100' or 'v200'.")

    catalog = pystac_client.Client.open(
        "https://planetarycomputer.microsoft.com/api/stac/v1",
        modifier=planetary_computer.sign_inplace,
    )
    search = catalog.search(collections=["esa-worldcover"], bbox=bbox_gdf.total_bounds)
    worldcover_da = (
        odc.stac.load(
            search.items(), bbox=bbox_gdf.total_bounds, bands="map", chunks={}
        )["map"]
        .sel(time=version_year)
        .squeeze()
    )

    if mask_nodata:
        worldcover_da = worldcover_da.where(worldcover_da>0)
        worldcover_da.rio.write_nodata(0, encoded=True, inplace=True)

    worldcover_da.attrs["class_info"] = get_class_info()
    worldcover_da.attrs["cmap"] = get_class_cmap(worldcover_da.attrs["class_info"])
    worldcover_da.attrs['data_citation'] = "Zanaga, D., Van De Kerchove, R., De Keersmaecker, W., Souverijns, N., Brockmann, C., Quast, R., Wevers, J., Grosu, A., Paccini, A., Vergnaud, S., Cartus, O., Santoro, M., Fritz, S., Georgieva, I., Lesiv, M., Carter, S., Herold, M., Li, Linlin, Tsendbazar, N.E., Ramoino, F., Arino, O. (2021). ESA WorldCover 10 m 2020 v100. doi:10.5281/zenodo.5571936."

    worldcover_da.attrs['example_plot'] = plot_classes

    return worldcover_da

`get_forest_cover_fraction(bbox_input=None, mask_nodata=False)` ¶

Fetches ~100m forest cover fraction data for a given bounding box.

Description: The data is obtained from the Copernicus Global Land Service: Land Cover 100m: collection 3: epoch 2019: Globe dataset. The specific layer used is the Tree-CoverFraction-layer, which provides the fractional cover (%) for the forest class.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.geodataframe.GeoDataFrame \| tuple \| shapely.geometry.base.BaseGeometry \| None`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	`None`
`mask_nodata`	`bool`	Whether to mask no data values. Default is False. If False: (dtype=uint8, rio.nodata=255, rio.encoded_nodata=None) If True: (dtype=float32, rio.nodata=nan, rio.encoded_nodata=255)	`False`

Returns:

Type	Description
`DataArray`	Forest cover fraction DataArray.

Source code in easysnowdata/remote_sensing.py

def get_forest_cover_fraction(bbox_input: gpd.GeoDataFrame | tuple | shapely.geometry.base.BaseGeometry | None = None, mask_nodata: bool = False,
) -> xr.DataArray:
    """
    Fetches ~100m forest cover fraction data for a given bounding box.

    Description:
    The data is obtained from the Copernicus Global Land Service: Land Cover 100m: collection 3: epoch 2019: Globe dataset.
    The specific layer used is the Tree-CoverFraction-layer, which provides the fractional cover (%) for the forest class.

    Parameters
    ----------
    bbox_input : geopandas.GeoDataFrame or tuple or shapely.Geometry
        GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
    mask_nodata : bool, optional
        Whether to mask no data values. Default is False.
        If False: (dtype=uint8, rio.nodata=255, rio.encoded_nodata=None)
        If True: (dtype=float32, rio.nodata=nan, rio.encoded_nodata=255)

    Returns
    -------
    xarray.DataArray
        Forest cover fraction DataArray.

    Examples
    --------
    >>> import geopandas as gpd
    >>> from easysnowdata import remote_sensing
    >>> 
    >>> # Define a bounding box for an area of interest
    >>> bbox = (-122.5, 47.0, -121.5, 48.0)
    >>> 
    >>> # Fetch forest cover fraction data
    >>> forest_cover = remote_sensing.get_forest_cover_fraction(bbox)
    >>> 
    >>> # Plot the data using the example plot function
    >>> f, ax = forest_cover.attrs['example_plot'](forest_cover)

    Notes
    -----
    Data citation:
    Marcel Buchhorn, Bruno Smets, Luc Bertels, Bert De Roo, Myroslava Lesiv, Nandin-Erdene Tsendbazar, Martin Herold, & Steffen Fritz. (2020).
    Copernicus Global Land Service: Land Cover 100m: collection 3: epoch 2019: Globe (V3.0.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.3939050
    """

    def plot_forest_cover(self, ax=None, figsize=(8, 10), legend_kwargs=None):
        if ax is None:
            f, ax = plt.subplots(figsize=figsize)
        else:
            f = ax.get_figure()

        cmap = matplotlib.colormaps.get_cmap('Greens').copy()
        cmap.set_over('white')  # Set values over 100 (i.e., 255) to white

        im = self.plot.imshow(ax=ax, cmap=cmap, vmin=0, vmax=100, add_colorbar=False)

        cbar = plt.colorbar(im, ax=ax, extend='max')
        cbar.set_label('Forest Cover Fraction (%)')

        ax.set_xlabel("Longitude")
        ax.set_ylabel("Latitude")
        ax.set_title("Copernicus Global Land Service Forest Cover Fraction\nLand Cover 100m: collection 3: epoch 2019")
        f.tight_layout(pad=1.5, w_pad=1.5, h_pad=1.5)
        f.dpi = 300

        return f, ax

    # Convert the input to a GeoDataFrame if it's not already one
    bbox_gdf = convert_bbox_to_geodataframe(bbox_input)

    fcf_da = rxr.open_rasterio(
        "https://zenodo.org/record/3939050/files/PROBAV_LC100_global_v3.0.1_2019-nrt_Tree-CoverFraction-layer_EPSG-4326.tif",
        chunks=True,
        mask_and_scale=mask_nodata,
    )

    fcf_da = fcf_da.rio.clip_box(*bbox_gdf.total_bounds,crs=bbox_gdf.crs).squeeze()



    fcf_da.attrs['example_plot'] = plot_forest_cover
    fcf_da.attrs['data_citation'] = "Marcel Buchhorn, Bruno Smets, Luc Bertels, Bert De Roo, Myroslava Lesiv, Nandin-Erdene Tsendbazar, Martin Herold, & Steffen Fritz. (2020). Copernicus Global Land Service: Land Cover 100m: collection 3: epoch 2019: Globe (V3.0.1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.3939050"

    return fcf_da

`get_nlcd_landcover(bbox_input=None, layer='landcover', initialize_ee=True)` ¶

Fetches National Land Cover Database (NLCD) data for a given bounding box.

Description: The National Land Cover Database (NLCD) provides nationwide data on land cover and land cover change at a 30m resolution. The dataset includes various layers such as land cover classification, impervious surfaces, and urban intensity. Projection is an albers equal area conic projection.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.geodataframe.GeoDataFrame \| tuple \| shapely.geometry.base.BaseGeometry \| None`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	`None`
`layer`	`str`	The NLCD layer to retrieve. Options are: - 'landcover' - 'impervious' - 'impervious_descriptor' - 'science_products_land_cover_change_count' - 'science_products_land_cover_change_first_disturbance_date' - 'science_products_land_cover_change_index' - 'science_products_land_cover_science_product' - 'science_products_forest_disturbance_date' Default is 'landcover'.	`'landcover'`
`initialize_ee`	`bool`	Whether to initialize Earth Engine. Default is True.	`True`

Returns:

Type	Description
`DataArray`	NLCD DataArray for the specified region and layer.

Source code in easysnowdata/remote_sensing.py

def get_nlcd_landcover(bbox_input: gpd.GeoDataFrame | tuple | shapely.geometry.base.BaseGeometry | None = None, 
             layer: str = 'landcover',
             initialize_ee: bool = True) -> xr.DataArray:
    """
    Fetches National Land Cover Database (NLCD) data for a given bounding box.

    Description:
    The National Land Cover Database (NLCD) provides nationwide data on land cover and land cover change 
    at a 30m resolution. The dataset includes various layers such as land cover classification, 
    impervious surfaces, and urban intensity. Projection is an albers equal area conic projection.

    Parameters
    ----------
    bbox_input : geopandas.GeoDataFrame or tuple or shapely.Geometry
        GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
    layer : str, optional
        The NLCD layer to retrieve. Options are:
        - 'landcover'
        - 'impervious'
        - 'impervious_descriptor'
        - 'science_products_land_cover_change_count'
        - 'science_products_land_cover_change_first_disturbance_date'
        - 'science_products_land_cover_change_index'
        - 'science_products_land_cover_science_product'
        - 'science_products_forest_disturbance_date'
        Default is 'landcover'.
    initialize_ee : bool, optional
        Whether to initialize Earth Engine. Default is True.

    Returns
    -------
    xarray.DataArray
        NLCD DataArray for the specified region and layer.

    Examples
    --------
    >>> import geopandas as gpd
    >>> import easysnowdata
    >>> 
    >>> # Define a bounding box for an area of interest
    >>> bbox = (-122.5, 47.0, -121.5, 48.0)
    >>> 
    >>> # Fetch NLCD land cover data
    >>> nlcd_landcover_da = easysnowdata.remote_sensing.get_nlcd_landcover(bbox, layer='landcover')
    >>> 
    >>> # Plot the data
    >>> nlcd_landcover_da.attrs['example_plot'](nlcd_landcover_da)


    Notes
    -----
    - NLCD data is only available for the contiguous United States
    - The latest version (2021) includes data from 2001-2021
    - Resolution is 30 meters

    Data citation:
    Dewitz, J., 2023, National Land Cover Database (NLCD) 2021 Products: U.S. Geological Survey data release, doi:10.5066/P9JZ7AO3
    """
    # Initialize Earth Engine with high-volume endpoint
    if initialize_ee:
        ee.Initialize(opt_url='https://earthengine-highvolume.googleapis.com')
    else:
        print("Initialization turned off. If you haven't already, please sign in to Google Earth Engine by running:\n\nimport ee\nee.Authenticate()\nee.Initialize()\n\n")

    # Convert the input to a GeoDataFrame if it's not already one
    bbox_gdf = convert_bbox_to_geodataframe(bbox_input)

    image_collection = ee.ImageCollection('USGS/NLCD_RELEASES/2021_REL/NLCD')
    image = image_collection.first()

    projection = image.select(0).projection()

    ds = xr.open_dataset(
        image_collection, 
        engine='ee', 
        geometry=tuple(bbox_gdf.total_bounds), 
        projection=projection,
        chunks={},
    ).squeeze().transpose().rename({'X':'x','Y':'y'}).rio.set_spatial_dims(x_dim='x', y_dim='y').astype('uint8')

    nlcd_da = ds[layer]

    # would be nice for them to come in as ints
    # https://github.com/google/Xee/issues/86
    # https://github.com/google/Xee/issues/146


    def get_class_info():
        info = image.getInfo()['properties']

        if layer == 'landcover':
            return {
                value: {
                    "name": name.split(':')[0],
                    "color": f"#{palette}"
                } for value, name, palette in zip(
                    info['landcover_class_values'],
                    info['landcover_class_names'],
                    info['landcover_class_palette']
                )
            }
        elif layer == 'impervious':
            return None  
        elif layer == 'impervious_descriptor':
            return {
                value: {
                    "name": name.split('.')[0],
                    "color": f"#{palette}"
                } for value, name, palette in zip(
                    info['impervious_descriptor_class_values'],
                    info['impervious_descriptor_class_names'],
                    info['impervious_descriptor_class_palette']
                )
            }
        elif layer.startswith('science_products'):
            return {
                value: {
                    "name": name,
                    "color": f"#{palette}"
                } for value, name, palette in zip(
                    info[f'{layer}_class_values'],
                    info[f'{layer}_class_names'],
                    info[f'{layer}_class_palette']
                )
            }

    def get_class_cmap(classes):
        if classes is None: 
            return plt.cm.YlOrRd
        return plt.cm.colors.ListedColormap([classes[key]["color"] for key in classes.keys()])

    def plot_classes(self, ax=None, figsize=(8, 10), legend_kwargs=None):
        if ax is None:
            f, ax = plt.subplots(figsize=figsize)
        else:
            f = ax.get_figure()

        if self.name != 'impervious':
            class_values = sorted(list(self.attrs["class_info"].keys()))
            bounds = [(class_values[i] + class_values[i + 1]) / 2 for i in range(len(class_values) - 1)]
            bounds = [class_values[0] - 0.5] + bounds + [class_values[-1] + 0.5]
            norm = matplotlib.colors.BoundaryNorm(bounds, self.attrs["cmap"].N)

            im = self.plot.imshow(ax=ax, cmap=self.attrs["cmap"], norm=norm, add_colorbar=False)

            legend_handles = []
            class_names = []
            for class_value, class_info in self.attrs["class_info"].items():
                legend_handles.append(
                    plt.Rectangle((0, 0), 1, 1, facecolor=class_info["color"], edgecolor="black")
                )
                class_names.append(class_info["name"])

            legend_kwargs = legend_kwargs or {}
            default_legend_kwargs = {
                "bbox_to_anchor": (0.5, -0.1),
                "loc": "upper center",
                "ncols": 4,
                "frameon": False,
                "handlelength": 3.5,
                "handleheight": 5,
            }
            legend_kwargs = {**default_legend_kwargs, **legend_kwargs}
            ax.legend(legend_handles, class_names, **legend_kwargs)

        else: 
            im = self.plot.imshow(ax=ax, cmap=self.attrs["cmap"], add_colorbar=False)
            f.colorbar(im, ax=ax, label='Percent impervious surface [%]')

        ax.set_xlabel("x")
        ax.set_ylabel("y")
        #ax.axis('equal')
        ax.set_title(f"NLCD {self.name.title()} (2021)")
        #f.tight_layout(pad=0, w_pad=0, h_pad=0)
        f.dpi = 300

        return f, ax

    class_info = get_class_info()
    nlcd_da.attrs["class_info"] = class_info
    nlcd_da.attrs["cmap"] = get_class_cmap(class_info)
    nlcd_da.attrs["example_plot"] = plot_classes

    nlcd_da.attrs['data_citation'] = "Dewitz, J., 2023, National Land Cover Database (NLCD) 2021 Products: U.S. Geological Survey data release, doi:10.5066/P9JZ7AO3"


    return nlcd_da

`get_seasonal_mountain_snow_mask(bbox_input=None, data_product='mountain_snow', mask_nodata=False)` ¶

Fetches ~1km static global seasonal (mountain snow / snow) mask for a given bounding box.

Description: Seasonal Mountain Snow (SMS) mask derived from MODIS MOD10A2 snow cover extent and GTOPO30 digital elevation model produced at 30 arcsecond spatial resolution.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.geodataframe.GeoDataFrame \| tuple \| shapely.geometry.base.BaseGeometry \| None`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	`None`
`data_product`	`str`	Data product to fetch. Choose from 'snow' or 'mountain_snow'. Default is 'mountain_snow'.	`'mountain_snow'`
`mask_nodata`	`bool`	Whether to mask no data values. Default is False. If False: (dtype=uint8, rio.nodata=255, rio.encoded_nodata=None) If True: (dtype=float32, rio.nodata=nan, rio.encoded_nodata=255)	`False`

Returns:

Type	Description
`DataArray`	Mountain snow DataArray with class information in attributes.

Source code in easysnowdata/remote_sensing.py

def get_seasonal_mountain_snow_mask(
    bbox_input: gpd.GeoDataFrame | tuple | shapely.geometry.base.BaseGeometry | None = None, data_product: str = "mountain_snow", mask_nodata: bool = False,
) -> xr.DataArray:
    """
    Fetches ~1km static global seasonal (mountain snow / snow) mask for a given bounding box.

    Description:
    Seasonal Mountain Snow (SMS) mask derived from MODIS MOD10A2 snow cover extent and GTOPO30 digital elevation model
    produced at 30 arcsecond spatial resolution.

    Parameters
    ----------
    bbox_input : geopandas.GeoDataFrame or tuple or shapely.Geometry
        GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
    data_product : str, optional
        Data product to fetch. Choose from 'snow' or 'mountain_snow'. Default is 'mountain_snow'.
    mask_nodata : bool, optional
        Whether to mask no data values. Default is False.
        If False: (dtype=uint8, rio.nodata=255, rio.encoded_nodata=None)
        If True: (dtype=float32, rio.nodata=nan, rio.encoded_nodata=255)

    Returns
    -------
    xarray.DataArray
        Mountain snow DataArray with class information in attributes.

    Examples
    --------
    >>> import geopandas as gpd
    >>> import easysnowdata
    >>> 
    >>> # Define a bounding box for a mountainous area
    >>> bbox = (-106.0, 39.0, -105.0, 40.0)
    >>> 
    >>> # Fetch mountain snow mask data
    >>> mountain_snow_da = easysnowdata.remote_sensing.get_seasonal_mountain_snow_mask(bbox)
    >>> 
    >>> # Plot the data using the example plot function
    >>> f, ax = mountain_snow_da.attrs['example_plot'](mountain_snow_da)

    Notes
    -----
    Data citation:
    Wrzesien, M., Pavelsky, T., Durand, M., Lundquist, J., & Dozier, J. (2019).
    Global Seasonal Mountain Snow Mask from MODIS MOD10A2 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.2626737
    """

    def get_class_info(data_product):
        if data_product == "snow":
            classes = {
                0: {"name": "Little-to-no snow", "color": "#030303"},
                1: {"name": "Indeterminate due to clouds", "color": "#755F4A"},
                2: {"name": "Ephemeral snow", "color": "#792B8E"},
                3: {"name": "Seasonal snow", "color": "#679ACF"},
                255: {"name": "Fill", "color": "#ffffff"},
            }
        elif data_product == "mountain_snow":
            classes = {
                0: {"name": "Mountains with little-to-no snow", "color": "#030303"},
                1: {"name": "Indeterminate due to clouds", "color": "#755F4A"},
                2: {"name": "Mountains with ephemeral snow", "color": "#792B8E"},
                3: {"name": "Mountains with seasonal snow", "color": "#679ACF"},
                255: {"name": "Fill", "color": "#ffffff"},
            }
        else:
            raise ValueError('Invalid data_product. Choose from "snow" or "mountain_snow".')
        return classes

    def get_class_cmap(classes):
        cmap = plt.cm.colors.ListedColormap(
            [classes[key]["color"] for key in classes.keys()]
        )
        return cmap

    def plot_classes(self, ax=None, figsize=(8, 10), legend_kwargs=None):
        if ax is None:
            f, ax = plt.subplots(figsize=figsize)
        else:
            f = ax.get_figure()

        class_values = sorted(list(self.attrs["class_info"].keys()))
        bounds = [
            (class_values[i] + class_values[i + 1]) / 2 for i in range(len(class_values) - 1)
        ]
        bounds = [class_values[0] - 0.5] + bounds + [class_values[-1] + 0.5]
        norm = matplotlib.colors.BoundaryNorm(bounds, self.attrs["cmap"].N)

        im = self.plot.imshow(ax=ax, cmap=self.attrs["cmap"], norm=norm, add_colorbar=False)
        #ax.set_aspect("equal")

        legend_handles = []
        class_names = []
        for class_value, class_info in self.attrs["class_info"].items():
            legend_handles.append(
                plt.Rectangle((0, 0), 1, 1, facecolor=class_info["color"], edgecolor="black")
            )
            class_names.append(class_info["name"])

        legend_kwargs = legend_kwargs or {}
        default_legend_kwargs = {
            "bbox_to_anchor": (0.5, -0.1),
            "loc": "upper center",
            "ncol": len(class_names) // 2,
            "frameon": False,
            "handlelength": 3.5,
            "handleheight": 5,
        }
        legend_kwargs = {**default_legend_kwargs, **legend_kwargs}

        ax.legend(legend_handles, class_names, **legend_kwargs)

        ax.set_xlabel("Longitude")
        ax.set_ylabel("Latitude")
        ax.set_title(f"Global seasonal {'mountain ' if data_product == 'mountain_snow' else ''}snow mask\nfrom Wrzesien et al 2019")
        f.tight_layout(pad=5.5, w_pad=5.5, h_pad=1.5)
        f.dpi = 300

        return f, ax

    print(f'This function takes a moment, getting zipped file from zenodo...')
    # Convert the input to a GeoDataFrame if it's not already one
    bbox_gdf = convert_bbox_to_geodataframe(bbox_input)

    url = f"zip+https://zenodo.org/records/2626737/files/MODIS_{'mtnsnow' if data_product == 'mountain_snow' else 'snow'}_classes.zip!/MODIS_{'mtnsnow' if data_product == 'mountain_snow' else 'snow'}_classes.tif"

    mountain_snow_da = rxr.open_rasterio(
        url,
        chunks=True,
        mask_and_scale=mask_nodata,
    ).rio.clip_box(*bbox_gdf.total_bounds, crs=bbox_gdf.crs).squeeze()

    # looks like the creators accidently set no data to 256 and 265 instead of 255, therefore unmasked the data is of type uint32 :(
    # attempt to fix this by setting all invalid values to 255, then converting types
    mask = mountain_snow_da > 3
    mountain_snow_da = mountain_snow_da.where(~mask, 255)

    if mask_nodata:
        mountain_snow_da = mountain_snow_da.astype("float32").rio.write_nodata(255, encoded=True)
    else:
        mountain_snow_da = mountain_snow_da.astype("uint8").rio.set_nodata(255)


    mountain_snow_da.attrs["class_info"] = get_class_info(data_product)
    mountain_snow_da.attrs["cmap"] = get_class_cmap(mountain_snow_da.attrs["class_info"])
    mountain_snow_da.attrs['data_citation'] = "Wrzesien, M., Pavelsky, T., Durand, M., Lundquist, J., & Dozier, J. (2019). Global Seasonal Mountain Snow Mask from MODIS MOD10A2 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.2626737"

    mountain_snow_da.attrs['example_plot'] = plot_classes

    return mountain_snow_da

`get_seasonal_snow_classification(bbox_input=None, mask_nodata=False)` ¶

Fetches 10arcsec (~300m) Sturm & Liston 2021 seasonal snow classification data for a given bounding box.

Description: This dataset consists of global, seasonal snow classifications determined from air temperature, precipitation, and wind speed climatologies. This is the 10 arcsec (~300m) product in EPSG:4326.

Parameters:

Name	Type	Description	Default
`bbox_input`	`geopandas.geodataframe.GeoDataFrame \| tuple \| shapely.geometry.base.BaseGeometry \| None`	GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.	`None`
`mask_nodata`	`bool`	Whether to mask no data values. Default is False. If False: (dtype=uint8, rio.nodata=9, rio.encoded_nodata=None) If True: (dtype=float32, rio.nodata=nan, rio.encoded_nodata=9)	`False`

Returns:

Type	Description
`DataArray`	Seasonal snow class DataArray with class information in attributes.

Source code in easysnowdata/remote_sensing.py

def get_seasonal_snow_classification(bbox_input: gpd.GeoDataFrame | tuple | shapely.geometry.base.BaseGeometry | None = None, mask_nodata: bool = False,
) -> xr.DataArray:
    """
    Fetches 10arcsec (~300m) Sturm & Liston 2021 seasonal snow classification data for a given bounding box.

    Description:
    This dataset consists of global, seasonal snow classifications determined from air temperature,
    precipitation, and wind speed climatologies. This is the 10 arcsec (~300m) product in EPSG:4326.

    Parameters
    ----------
    bbox_input : geopandas.GeoDataFrame or tuple or Shapely Geometry
        GeoDataFrame containing the bounding box, or a tuple of (xmin, ymin, xmax, ymax), or a Shapely geometry.
    mask_nodata : bool, optional
        Whether to mask no data values. Default is False.
        If False: (dtype=uint8, rio.nodata=9, rio.encoded_nodata=None)
        If True: (dtype=float32, rio.nodata=nan, rio.encoded_nodata=9)

    Returns
    -------
    xarray.DataArray
        Seasonal snow class DataArray with class information in attributes.

    Examples
    --------
    >>> import geopandas as gpd
    >>> import easysnowdata
    >>> 
    >>> # Define a bounding box for an area of interest
    >>> bbox = (-120.0, 40.0, -118.0, 42.0)
    >>> 
    >>> # Fetch seasonal snow classification data
    >>> snow_classification_da = easysnowdata.remote_sensing.get_seasonal_snow_classification(bbox)
    >>> 
    >>> # Plot the data using the example plot function
    >>> f,ax = snow_classification_da.attrs['example_plot'](snow_classification_da)

    Notes
    -----
    Data citation:
    Liston, G. E. and M. Sturm. (2021). Global Seasonal-Snow Classification, Version 1 [Data Set].
    Boulder, Colorado USA. National Snow and Ice Data Center. https://doi.org/10.5067/99FTCYYYLAQ0. Date Accessed 03-06-2024.
    """

    def get_class_info():
        classes = {
            1: {"name": "Tundra", "color": "#a100c8"},
            2: {"name": "Boreal Forest", "color": "#00a0fe"},
            3: {"name": "Maritime", "color": "#fe0000"},
            4: {"name": "Ephemeral (includes no snow)", "color": "#e7dc32"},
            5: {"name": "Prairie", "color": "#f08328"},
            6: {"name": "Montane Forest", "color": "#00dc00"},
            7: {"name": "Ice (glaciers and ice sheets)", "color": "#aaaaaa"},
            8: {"name": "Ocean", "color": "#0000ff"},
            9: {"name": "Fill", "color": "#ffffff"},
        }
        return classes

    def get_class_cmap(classes):
        cmap = plt.cm.colors.ListedColormap(
            [classes[key]["color"] for key in classes.keys()]
        )
        return cmap

    def plot_classes(self, ax=None, figsize=(8, 10), legend_kwargs=None):
        if ax is None:
            f, ax = plt.subplots(figsize=figsize)
        else:
            f = ax.get_figure()

        class_values = sorted(list(self.attrs["class_info"].keys()))
        bounds = [
            (class_values[i] + class_values[i + 1]) / 2 for i in range(len(class_values) - 1)
        ]
        bounds = [class_values[0] - 0.5] + bounds + [class_values[-1] + 0.5]
        norm = matplotlib.colors.BoundaryNorm(bounds, self.attrs["cmap"].N)

        im = self.plot.imshow(ax=ax, cmap=self.attrs["cmap"], norm=norm, add_colorbar=False)
        #ax.set_aspect("equal")


        legend_handles = []
        class_names = []
        for class_value, class_info in self.attrs["class_info"].items():
            legend_handles.append(
                plt.Rectangle((0, 0), 1, 1, facecolor=class_info["color"], edgecolor="black")
            )
            class_names.append(class_info["name"])

        legend_kwargs = legend_kwargs or {}
        default_legend_kwargs = {
            "bbox_to_anchor": (0.5, -0.1),
            "loc": "upper center",
            "ncol": len(class_names) // 3,
            "frameon": False,
            "handlelength": 3.5,
            "handleheight": 5,
        }
        legend_kwargs = {**default_legend_kwargs, **legend_kwargs}

        ax.legend(legend_handles, class_names, **legend_kwargs)

        ax.set_xlabel("Longitude")
        ax.set_ylabel("Latitude")
        ax.set_title("Seasonal snow classification\nfrom Sturm & Liston 2021")
        f.tight_layout(pad=1.5, w_pad=1.5, h_pad=1.5)
        f.dpi = 300

        return f, ax

    # Convert the input to a GeoDataFrame if it's not already one
    bbox_gdf = convert_bbox_to_geodataframe(bbox_input)

    snow_classification_da = rxr.open_rasterio(
        "https://snowmelt.blob.core.windows.net/snowmelt/eric/snow_classification/SnowClass_GL_300m_10.0arcsec_2021_v01.0.tif",
        chunks=True,
        mask_and_scale=mask_nodata,
    )
    snow_classification_da = (
        snow_classification_da.rio.clip_box(*bbox_gdf.total_bounds,crs=bbox_gdf.crs).squeeze()
    )

    if mask_nodata:
        snow_classification_da.rio.write_nodata(9, encoded=True, inplace=True)
    else:
        snow_classification_da.rio.set_nodata(9, inplace=True)

    snow_classification_da.attrs["class_info"] = get_class_info()
    snow_classification_da.attrs["cmap"] = get_class_cmap(snow_classification_da.attrs["class_info"])
    snow_classification_da.attrs['data_citation'] = "Liston, G. E. and M. Sturm. (2021). Global Seasonal-Snow Classification, Version 1 [Data Set]. Boulder, Colorado USA. National Snow and Ice Data Center. https://doi.org/10.5067/99FTCYYYLAQ0. Date Accessed 03-06-2024."

    snow_classification_da.attrs['example_plot'] = plot_classes

    return snow_classification_da

remote_sensing module¶

HLS ¶

__init__(self, bbox_input, start_date='2014-01-01', end_date='2025-09-12', bands=None, resolution=None, crs='utm', remove_nodata=True, scale_data=True, add_metadata=True, add_platform=True, groupby='solar_day') special ¶

get_data(self) ¶

get_ndvi(self) ¶

get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None}) ¶

get_rgb_clahe(self, **kwargs) ¶

get_rgb_percentile(self, **percentile_kwargs) ¶

mask_data(self, remove_cirrus=True, remove_cloud=True, remove_adj_to_cloud=True, remove_cloud_shadows=True, remove_snow_ice=False, remove_water=False, remove_aerosol_low=False, remove_aerosol_moderate=False, remove_aerosol_high=False, remove_aerosol_climatology=False) ¶

remove_nodata_inplace(self) ¶

scale_data_inplace(self) ¶

search_data(self) ¶

MODIS_snow ¶

Sentinel1 ¶

local_incidence_angle_data property readonly ¶

__init__(self, bbox_input, start_date='2014-01-01', end_date='2025-09-12', catalog_choice='planetarycomputer', bands=None, units='dB', resolution=None, crs=None, groupby='sat:absolute_orbit', chunks={}, remove_border_noise=True) special ¶

add_orbit_info(self) ¶

db_to_linear(self) ¶

get_data(self) ¶

get_local_incidence_angle(self, resolution=None, initialize_ee=True) ¶

get_metadata(self) ¶

linear_to_db(self) ¶

search_data(self) ¶

Sentinel2 ¶

__init__(self, bbox_input, start_date='2014-01-01', end_date='2025-09-12', catalog_choice='planetarycomputer', collection='sentinel-2-l2a', bands=None, resolution=None, crs=None, remove_nodata=True, harmonize_to_old=None, scale_data=True, groupby='solar_day') special ¶

get_data(self) ¶

get_evi(self) ¶

get_metadata(self) ¶

get_ndbi(self) ¶

get_ndsi(self) ¶

get_ndvi(self) ¶

get_ndwi(self) ¶

get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None}) ¶

get_rgb_clahe(self, **kwargs) ¶

get_rgb_percentile(self, **percentile_kwargs) ¶

harmonize_to_old_inplace(self) ¶

remove_nodata_inplace(self) ¶

scale_data_inplace(self) ¶

search_data(self) ¶

authenticate_all() ¶

get_esa_worldcover(bbox_input=None, version='v200', mask_nodata=False) ¶

get_forest_cover_fraction(bbox_input=None, mask_nodata=False) ¶

get_nlcd_landcover(bbox_input=None, layer='landcover', initialize_ee=True) ¶

get_seasonal_mountain_snow_mask(bbox_input=None, data_product='mountain_snow', mask_nodata=False) ¶

get_seasonal_snow_classification(bbox_input=None, mask_nodata=False) ¶

`HLS` ¶

`init(self, bbox_input, start_date='2014-01-01', end_date='2025-09-12', bands=None, resolution=None, crs='utm', remove_nodata=True, scale_data=True, add_metadata=True, add_platform=True, groupby='solar_day')` `special` ¶

`get_data(self)` ¶

`get_ndvi(self)` ¶

`get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None})` ¶

`get_rgb_clahe(self, **kwargs)` ¶

`get_rgb_percentile(self, **percentile_kwargs)` ¶

`mask_data(self, remove_cirrus=True, remove_cloud=True, remove_adj_to_cloud=True, remove_cloud_shadows=True, remove_snow_ice=False, remove_water=False, remove_aerosol_low=False, remove_aerosol_moderate=False, remove_aerosol_high=False, remove_aerosol_climatology=False)` ¶

`remove_nodata_inplace(self)` ¶

`scale_data_inplace(self)` ¶

`search_data(self)` ¶

`MODIS_snow` ¶

`Sentinel1` ¶

`local_incidence_angle_data` `property` `readonly` ¶

`init(self, bbox_input, start_date='2014-01-01', end_date='2025-09-12', catalog_choice='planetarycomputer', bands=None, units='dB', resolution=None, crs=None, groupby='sat:absolute_orbit', chunks={}, remove_border_noise=True)` `special` ¶

`add_orbit_info(self)` ¶

`db_to_linear(self)` ¶

`get_data(self)` ¶

`get_local_incidence_angle(self, resolution=None, initialize_ee=True)` ¶

`get_metadata(self)` ¶

`linear_to_db(self)` ¶

`search_data(self)` ¶

`Sentinel2` ¶

`init(self, bbox_input, start_date='2014-01-01', end_date='2025-09-12', catalog_choice='planetarycomputer', collection='sentinel-2-l2a', bands=None, resolution=None, crs=None, remove_nodata=True, harmonize_to_old=None, scale_data=True, groupby='solar_day')` `special` ¶

`get_data(self)` ¶

`get_evi(self)` ¶

`get_metadata(self)` ¶

`get_ndbi(self)` ¶

`get_ndsi(self)` ¶

`get_ndvi(self)` ¶

`get_ndwi(self)` ¶

`get_rgb(self, percentile_kwargs={'lower': 2, 'upper': 98}, clahe_kwargs={'clip_limit': 0.03, 'nbins': 256, 'kernel_size': None})` ¶

`get_rgb_clahe(self, **kwargs)` ¶

`get_rgb_percentile(self, **percentile_kwargs)` ¶

`harmonize_to_old_inplace(self)` ¶

`remove_nodata_inplace(self)` ¶

`scale_data_inplace(self)` ¶

`search_data(self)` ¶

`authenticate_all()` ¶

`get_esa_worldcover(bbox_input=None, version='v200', mask_nodata=False)` ¶

`get_forest_cover_fraction(bbox_input=None, mask_nodata=False)` ¶

`get_nlcd_landcover(bbox_input=None, layer='landcover', initialize_ee=True)` ¶

`get_seasonal_mountain_snow_mask(bbox_input=None, data_product='mountain_snow', mask_nodata=False)` ¶

`get_seasonal_snow_classification(bbox_input=None, mask_nodata=False)` ¶